<?xml version="1.0" standalone="yes"?> <Paper uid="C00-1020"> <Title>A Client/Server Architecture for Word Sense Disambiguation</Title> <Section position="4" start_page="136" end_page="136" type="evalu"> <SectionTitle> 3 Evaluation </SectionTitle> <Paragraph position="0"> We evaluated the system for English on the 34 words used in the SENSEVAL competition (Kilgarriff 98; Kilgarriff 99), as well as on the SENSEVAL corpus (HECTOR). This provided a test set of around 8500 sentences. The SENSEVAL words are all polysemous, which means that the results given below reflect real polysemy.</Paragraph> <Paragraph position="1"> We use the SENSEVAL test set for this in vitro evaluation in order to give us a means of comparison, especially with the results obtained in this competition with GINGER II (Dini et al. 99). Still, it is important to keep in mind that this comparison is difficult since the dictionaries used are different. We used the OHFD bilingual dictionary, while in SENSEVAL the Oxford monolingual dictionary from HECTOR was used.</Paragraph> <Paragraph position="2"> The evaluation given below is performed if and only if the semantic disambiguator has found a matching rule, which means that the results focus only on our methodology: recall and precision would have been better if we had evaluated all outputs (even when the result is just the first meaning corresponding to the syntactic part of speech of the word in the sentence) because the OHFD gives by default the most frequent meaning of a word.</Paragraph> <Paragraph position="3"> The results obtained with the system are given in the following table: The numbers show that the recall is equivalent to the one we obtained with GINGER II (37.6%) in SENSEVAL (this just means that the content of the dictionaries is about the same) but precision is dramatically improved (46% for GINGER II versus 79.5% with this system). 
The increase in precision is due to the fact that we used more fine-grained dictionary information.</Paragraph> <Paragraph position="4"> Moreover, the evaluation shows that the distribution of the precision results follows the preference strategy employed to select rules: collocate rules are more precise than example rules, compound or idiom rules are themselves more precise than usage examples, etc.</Paragraph> <Paragraph position="5"> Another evaluation of smaller coverage has been performed on &quot;all polysemous words&quot; of about 400 sentences extracted from the Times newspaper, and shows similar results according to part of speech distribution.</Paragraph> <Paragraph position="6"> These results confirm that dictionary information is very reliable for semantic disambiguation tasks.</Paragraph> </Section> </Paper>