File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/05/w05-1004_intro.xml
Size: 8,722 bytes
Last Modified: 2025-10-06 14:03:11
<?xml version="1.0" standalone="yes"?> <Paper uid="W05-1004"> <Title>Automatically Learning Qualia Structures from the Web</Title> <Section position="5" start_page="30" end_page="33" type="intro"> <SectionTitle> 4 Evaluation </SectionTitle> <Paragraph position="0"> We evaluate our approach on the lexical elements knife, beer and book, which are also discussed in (Johnston and Busa, 1996) or (Pustejovsky, 1991), as well as on computer, on an abstract noun, i.e. conversation, and on two very specific multi-term words, i.e. natural language processing and data mining. We give the automatically learned weighted Qualia Structures for these entries in Figures 3, 4, 5 and 6. The evaluation of our approach consists, on the one hand, of a discussion of the weighted qualia structures, in particular comparing them to the ideal structures from the literature. On the other hand, we also asked a student at our institute to assign credits to each of the qualia elements from 0 (incorrect) to 3 (totally correct), with 1 credit meaning 'not totally wrong' and 2 meaning 'still acceptable'.</Paragraph> <Section position="1" start_page="31" end_page="31" type="sub_section"> <SectionTitle> 4.1 Quantitative Evaluation </SectionTitle> <Paragraph position="0"> The distribution of credits for each qualia role and term is given in Table 4. It can be seen that, with three exceptions (the Formal role of beer, the Agentive role of book and the Constitutive role of beer), '3' is the mark assigned in most cases to the automatically learned qualia elements. Further, for almost every query term and qualia role, at least 50% of the automatically learned qualia elements have a mark of '2' or '3' - the only exceptions being the Formal role of beer with 45.45%, the Agentive role of book with 33.33% and the Constitutive role of beer with 28.57%. In general this shows that the automatically learned qualia roles are indeed reasonable. 
Considering the average over all the terms ('All' in the table), we observe that the qualia role which is recognized most reliably is the Telic one with 73.15% assignments of credit '3' and 75.93% of credits '2' or '3', followed by the Agentive role with 71.43% assignments of credit '3'. The results for the Formal and Constitutive roles are still reasonable, with 62.09% assignments of credit '3' and 66.01% assignments of credits '2' or '3' for the Formal role, and 61.61% and 64.61% respectively for the Constitutive role. The worst results are achieved for the Constitutive role, due to the fact that 26.26% of its qualia elements are regarded as totally wrong. Table 5 supports the above claims and shows the average credits assigned by the human evaluator per query term and role. It shows again that the roles with the best results are the Agentive and Telic roles, while the Formal and Constitutive roles are not identified as accurately. This is certainly due to the fact that the patterns for the Telic role are much less ambiguous than the ones for the Formal and Constitutive roles. Finally, we also discuss the correlation between the credits assigned and the Jaccard Coefficient. Figure 2 shows this correlation.</Paragraph> <Paragraph position="1"> While for the Formal role the correlation is as expected, i.e. the higher the credit assigned, the higher the Jaccard Coefficient, for the Constitutive and Telic roles this correlation is unfortunately less clear, which makes the task of finding a cut-off threshold more difficult.</Paragraph> </Section> <Section position="2" start_page="31" end_page="33" type="sub_section"> <SectionTitle> 4.2 Qualitative Evaluation &amp; Discussion </SectionTitle> <Paragraph position="0"> In this section we provide a more subjective evaluation of the automatically learned qualia structures by comparing them to ideal qualia structures discussed in the literature wherever possible. 
In particular, we discuss in more detail the qualia structures for book, knife and beer and leave the detailed assessment of the qualia structures for computer, natural language processing, data mining and conversation to the interested reader.</Paragraph> <Paragraph position="1"> For book, the first four candidates of the Formal role, i.e. product, item, publication and document, are very appropriate, but allude to the physical object meaning of book as opposed to the meaning in the sense of information container (compare (Pustejovsky, 1991)). As candidates for the Agentive role we have make, write and create, which are appropriate, write being the ideal filler of the Agentive role according to (Pustejovsky, 1991). For the Constitutive role of book we get - besides the pronoun it at the first position, which could be easily filtered out - sign (2nd position), letter (3rd position) and page (6th position), which are quite appropriate. The top four candidates for the Telic role are give, select, read and purchase. It seems that give emphasizes the role of a book as a gift, read refers to the most obvious purpose of a book as specified in the ideal qualia structures of (Pustejovsky, 1991) as well as (Johnston and Busa, 1996), and purchase denotes the more general purpose of a book, i.e.</Paragraph> <Paragraph position="2"> to be bought.</Paragraph> <Paragraph position="3"> The first element of the Formal role of knife unfortunately denotes the material it is typically made of, i.e. steel, but the next 5 elements are definitely appropriate: weapon, item, kitchenware, object and instrument. The ideal element artifact tool (compare (Johnston and Busa, 1996)) can be found at the 10th position. 
The results are interesting in that, on the one hand, the most prominent meaning of knife according to the web is the one of a weapon.</Paragraph> <Paragraph position="4"> On the other hand, our results are more specific, classifying a knife as kitchenware instead of merely as an artifact tool. Very interesting are the specific and accurate results at the end of the list. The reason why they appear at the end is that the Jaccard Coefficient ranks them lower because they are more specific and thus appear less frequently. This shows that using some other measure less sensitive to frequency could yield more accurate results.</Paragraph> <Paragraph position="5"> The fillers of the Agentive role, produce, make and create, all seem appropriate, whereby make corresponds exactly to the ideal filler for the Agentive role as mentioned in (Johnston and Busa, 1996). The results for the Constitutive role contain not only parts but also materials a knife is made of and thus contain more information than the typical qualia structures assumed in the literature. The best results are (in this order) blade, metal, steel, wood and handle at the 6th position. In fact, in the ideal qualia structure in (Johnston and Busa, 1996) blade and handle are mentioned as fillers of the Constitutive role, while there are no elements describing the materials of which a knife is made. Finally, the top four candidates for the Telic role are kill, slit, cut and slice, whereby cut corresponds to the ideal filler of the qualia structure for knife as mentioned in (Johnston and Busa, 1996).</Paragraph> <Paragraph position="6"> Considering the qualia structure for beer, it is surprising that no purpose has been found. 
The reason is that currently no results are returned by Google for the clue a beer is used to, and the four snippets returned for the purpose of a beer contain expressions of the form the purpose of a beer is to drink it, which is not matched by our patterns because the word it is a pronoun and is therefore not matched by our NP pattern (unless it is matched by error, as in the Qualia Structure for book in Figure 4). Considering the results for the Formal role, the elements drink (1st), alcohol (2nd) and beverage (4th) are much more specific than liquid as given in (Pustejovsky, 1991), while thing at the 3rd position is certainly too general. Furthermore, according to the automatically learned qualia structure, beer is made of rice, malt and hop, which are perfectly reasonable results. Very interesting are the results concoction and libation for the Formal role of beer, which unfortunately were rated low by our evaluator (compare Figure 3).</Paragraph> <Paragraph position="7"> Overall, the discussion has shown that the results produced by our method are reasonable when compared to the qualia structures from the literature. In some cases our method produces additional qualia candidates, such as the ones describing the material a knife is typically made of. In other cases it discovers more specific candidates, such as weapon or kitchenware as elements of the Formal role for knife instead of the general term artifact tool.</Paragraph> </Section> </Section> </Paper>