File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/04/c04-1174_evalu.xml
Size: 6,129 bytes
Last Modified: 2025-10-06 13:59:09
<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1174"> <Title>Automatic Construction of Nominal Case Frames and its Application to Indirect Anaphora Resolution</Title> <Section position="7" start_page="0" end_page="0" type="evalu"> <SectionTitle> 6 Experiments </SectionTitle> <Paragraph position="0"> We evaluated the automatically constructed nominal case frames, and conducted an indirect anaphora resolution experiment.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.1 Evaluation of case frames </SectionTitle> <Paragraph position="0"> We constructed nominal case frames from 25 years of newspaper articles (12 years of the Mainichi newspaper and 13 years of the Nihonkeizai newspaper). These newspaper corpora consist of about Nh&quot; noun phrases were extracted from them. The resulting case frames cover 17,000 nouns; the average number of case frames per noun is 1.06, and the average number of case slots per case frame is 1.09.</Paragraph> <Paragraph position="1"> We randomly selected 100 nouns that occur more than 10,000 times in the corpora, and created gold standard case frames by hand. For each test noun, possible case frames were considered, and for each case frame, obligatory case slots were assigned manually. As a result, 68 case frames were created for 65 test nouns, and the remaining 35 test nouns have no case frames.</Paragraph> <Paragraph position="2"> We evaluated the automatically constructed case frames for these test nouns against the gold standard case frames. A case frame that has the same case slots as the gold standard is judged as correct. The evaluation result is shown in Table 5: the system output 70 case frames, of which 58 were judged as correct.</Paragraph> <Paragraph position="3"> Recall was lowered by the highly restrictive conditions used in the example collection. For instance, maker does not have an obligatory case slot for its products. 
This is because maker is usually used in the form of a compound noun phrase, &quot;products maker&quot;, and there are few occurrences of &quot;products no maker&quot;. To address this problem, not only &quot;Nm no Nh&quot; but also &quot;Nm Nh&quot; (compound noun phrase) and &quot;Nm ni-kansuru 'in terms of' Nh&quot; should be collected.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.2 Experimental results of indirect anaphora resolution </SectionTitle> <Paragraph position="0"> We conducted a preliminary experiment with our indirect anaphora resolution system using the &quot;Relevance-tagged corpus&quot; (Kawahara et al., 2002). This corpus consists of Japanese newspaper articles, and has relevance tags, including antecedents of indirect anaphors.</Paragraph> <Paragraph position="1"> We prepared a small test corpus that consists of 10 randomly selected articles. The test corpus contains 217 nouns. Of these, 106 nouns are indirect anaphors, and they have 108 antecedents, because two nouns have two antecedents each. 49 antecedents directly depend on their anaphors, and 59 do not. For 91 of the 108 antecedents, a case frame of the anaphor includes the antecedent itself or a similar word (one whose similarity exceeds the threshold of 0.95). Accordingly, the upper bound of the recall of our case-frame-based anaphora resolution is 84.3% (91/108).</Paragraph> <Paragraph position="2"> We ran the system on the test corpus, and compared the system output with the corpus annotation. Table 6 shows the experimental results. In this table, &quot;w dep.&quot; (with dependency) is the evaluation of the antecedents that directly depend on their anaphors. 
&quot;w/o dep.&quot; (without dependency) is the evaluation of the antecedents that do not directly depend on their anaphors.</Paragraph> <Paragraph position="3"> Although the analysis of &quot;w dep.&quot; is intrinsically easier than that of &quot;w/o dep.&quot;, the recall of &quot;w dep.&quot; was not much higher than that of &quot;w/o dep.&quot;. The low recall score of &quot;w dep.&quot; was caused by the nonexistence of case frames which include the antecedent itself or a similar word. The antecedents that directly depend on their anaphors were often part of compound noun phrases, such as &quot;products maker&quot;, which are not covered by our example collection.</Paragraph> <Paragraph position="4"> Major errors in the analyses of the antecedents that do not directly depend on their anaphors were caused by the following reasons.</Paragraph> <Paragraph position="5"> Specific/generic usages of nouns Some erroneous system outputs were caused by nouns that have both specific and generic usages. (ph sold the stock of the subsidiary.) In this case, kogaisya 'subsidiary' is obligatory information for kabushiki 'stock', which is used specifically. kogaisya matches the [kaisya 'company'] case slot in Table 4.</Paragraph> <Paragraph position="6"> However, kabushiki 'stock' in the following example is used generically, and does not need specific company information.</Paragraph> <Paragraph position="7"> (ph become the rise factor of the stock prices.) Since the current system cannot judge whether a noun is used generically or specifically, an antecedent which corresponds to [kaisha 'company'] is incorrectly estimated. Beyond selectional restriction of case frames Selectional restriction based on the case frames usually worked well, but failed to distinguish between candidates that both belong to Human or Organization.</Paragraph> <Paragraph position="8"> In this example, daitouryou 'president' requires an obligatory case kuni 'nation'. 
The system estimates its antecedent as Russia, though the correct answer is bei 'America'. This is because Russia is closer than beikoku. This problem is somewhat related to world knowledge, but if the system can carefully exploit the context, it might be able to find the correct answer from &quot;Bush bei seiken&quot; 'Bush American administration'.</Paragraph> </Section> </Section> </Paper>