File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/06/w06-0501_evalu.xml

Size: 7,919 bytes

Last Modified: 2025-10-06 13:59:49

<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0501">
  <Title>Enriching a formal ontology with a thesaurus: an application in the cultural heritage domain</Title>
  <Section position="8" start_page="8" end_page="10" type="evalu">
    <SectionTitle>
4 Evaluation
</SectionTitle>
    <Paragraph position="0"> Since the CIDOC-CRM model formalizes a large number of fine-grained properties (precisely, 141), we selected a subset of properties for our experiments (reported in Table 2). We wrote a relation checker for each property in the Table. By applying the checkers in cascade to a glos G, a set of annotations is produced. The folowing is an example of an anotated glos for the term &amp;quot;vedute&amp;quot;: Refers to detailed, largely factual topographical views, especialy &lt;has-time-span&gt;18th-century&lt;/has-time-span&gt; Italian paintings, drawings, or prints of cities. The first vedute probably were &lt;caried-out-by&gt;painted by northern European artists&lt;/cariedout-by&gt; who worked &lt;has former-or-curent-location&gt;in Italy&lt;/has former-or-curent-location&gt;&lt;has-time-span&gt;in the 16th century&lt;/has-time-span&gt;. The term refers ore generaly to any painting, drawing or print &lt;depicts&gt;representing a landscape or town view&lt;/depicts&gt; that is largely topographical in conception. Figure 3 shows a more comprehensive graph representation of the outcome for the concepts vedute#1 and maesta#1 (see the glos in Section 2.2).</Paragraph>
    <Paragraph position="1"> To evaluate the methodology described in Section 3 we considered 814 gloses from the Visual Works sub-tree of the AT thesaurus, containing a total of 27,925 words. The authors wrote the relation checkers by tuning them on a subset of 12 gloses, and tested their generality on the remaining 692. The test set was manually tagged with the subset of the CIDOC-CRM properties shown in Table 2 by two annotators with adjudication (requiring a careful comparison of the two sets of annotations).</Paragraph>
    <Paragraph position="2"> We performed two experiments: in the first, we evaluated the glos anotation task, in the second the property instance extraction task, i.e. the ability to identify the appropriate domain and range of a property instance. In the case of the glos annotation task, for evaluating each piece of information we adopted the measures of &amp;quot;labeled&amp;quot; precision and recal. These measures are commonly used to evaluate parse trees obtained by a parser (Charniak, 197) and allow the rewarding of god partial results. Given a property R, labeled precision is the number of words annotated correctly with R over the number of words annotated automatically with R, while labeled recal is the number of words annotated correctly with R over the total number of words manually annotated with R.</Paragraph>
    <Paragraph position="3"> Table 3 shows the results obtained by applying the checkers to tag the test set (containing a total number of 1,328 distinct annotations and 5,965 annotated words). Note that here we are evaluating the ability of the system to assign the correct tag to every word in a glos fragment f, according to the appropriate relation checker. We chose to evaluate the tag assigned to single words rather than to a whole phrase, because each misalignment would count as a mistake even if the most part of a phrase was tagged correctly by the automatic annotator.</Paragraph>
    <Paragraph position="4"> The second experiment consisted in the evaluation of the property instances extracted.</Paragraph>
    <Paragraph position="5"> Starting from 1,328 manually annotated fragments of 692 gloses, the checkers extracted an overall number of 1,101 property instances. We randomly selected a subset of 160 gloses for evaluation, from which we manually extracted 34 property instances.</Paragraph>
    <Paragraph position="6"> Two aspects of the property instance extraction task had to be assessed: the extraction of the appropriate range words in a glos, for a given property instance the precision and recall in the extraction of the appropriate concepts for both domain and range of the property instance.</Paragraph>
    <Paragraph position="7"> An overall number of 23 property instances were automatically colected by the checkers, out of which 203 were correct with respect to the first assessment (87.12% precision (203/23), 59.01% recall (203/34)).</Paragraph>
    <Paragraph position="8"> In the second evaluation, for each property</Paragraph>
    <Paragraph position="10"> ) we assessed the semantic correctness of both the concepts C</Paragraph>
    <Paragraph position="12"> The apropriateness of the concept C</Paragraph>
    <Paragraph position="14"> for the domain must be evaluated, since, even if a term t satisfies the semantic constraints of the domain for a property R, stil it can be the case that a fragment f in G does not refer to t, like in the folowing example: pastels (visual works) -- Works of art, typicaly on a paper or velum suport, to which designs are aplied using crayons made of ground pigment held together with a binder, typicaly oil or water and gum.</Paragraph>
    <Paragraph position="15">  numbers are omited for clarity).</Paragraph>
    <Paragraph position="16"> In this example, ground pigment refers to crayons (not to pastels).</Paragraph>
    <Paragraph position="17"> The evaluation of the semantic correctness of the domain and range of the property instances extracted led to the final figures of 81.1% (189/23) precision and 54.94% (189/34) recall, due to 9 errors in the choice of C t as a domain for an instance R(C</Paragraph>
    <Paragraph position="19"> ) and 5 errors in the semantic disambiguation of range words w not appearing in AT, but encoded in WordNet (as described in the last part of Section 3). A final experiment was performed to evaluate the generality of the approach presented in this paper.</Paragraph>
    <Paragraph position="20"> As already remarked, the same procedure used for annotating the gloses of a thesaurus can be used to annotate web documents. Our objective in this third experiment was to: Evaluate the ability of the system to annotate fragments of web documents with CIDOC relations Evaluate the domain dependency of the relation checkers, by letting the system annotate documents not in the cultural heritage domain.</Paragraph>
    <Paragraph position="21"> We then selected 5 documents at random from an historical archive and an artist's biographies archive  including about 6,00 words in total, about 5,00 of which in the historical domain. We then ran the automatic annotation procedure on these documents and we evaluated the result, using the same criteria as in Table 3.</Paragraph>
    <Paragraph position="22">  instance in the analysed documents. It is remarkable that, especially for the less domain-dependent properties, the precision and recall of the algorithm is stil high, thus showing the generality of the method. Notice that the historical documents influenced the result much more than the artist biographies, because of their dimension.</Paragraph>
    <Paragraph position="23"> In Table 4 the recall of P14 (caried out by) is omited. This is motivated by the fact that this property, in a generic domain, corresponds to the agent relation (&amp;quot;an active animate entity that voluntarily initiates an action&amp;quot;  ), while in the cultural heritage domain it has a more narrow interpretation (an example of this relation in the CIDOC handbok is: &amp;quot;the painting of the Sistine Chapel (E7) was caried out by Michelangelo Buonaroti (E21) in the role of master craftsman (E5)&amp;quot;). However, the domain and range restrictions for P14 correspond to an agent relation, therefore, in a generic domain, one should annotate as &amp;quot;carried out by&amp;quot; almost any verb phrase with the subject (including pronouns and anaphoric references) in the class Human.</Paragraph>
  </Section>
class="xml-element"></Paper>