File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/98/w98-0709_evalu.xml
Size: 3,517 bytes
Last Modified: 2025-10-06 14:00:35
<?xml version="1.0" standalone="yes"?> <Paper uid="W98-0709"> <Title>I I I I I I I I I I I I I I I</Title> <Section position="6" start_page="68" end_page="68" type="evalu"> <SectionTitle> 5 Extending and Filling Gaps. </SectionTitle> <Paragraph position="0"> Up to now we have described one methodology to connect the words of a language to a WN skeleton, and another methodology to build taxonomies.</Paragraph> <Paragraph position="1"> Apart from the precision threshold criterion, the words finally connected in the first process do not follow any other criterion: they are neither the most important concepts nor the topmost or lowermost ones in the hierarchy; the connections are scattered all over the skeleton.</Paragraph> <Paragraph position="2"> The final set of words connected to the skeleton is random, and we have no control over it.</Paragraph> <Paragraph position="4"> Furthermore, we also find relevant words that are not connected to the hierarchy.</Paragraph> <Paragraph position="5"> We are currently developing a methodology that tries to fill the gaps by merging the automatically extracted taxonomy with the sparse skeleton. So far we have studied only very simple and short structures.</Paragraph> <Paragraph position="6"> We thus have two hierarchies to compare, and two ways of connecting them: the already extracted connections (A) between Spanish words and synsets, and the translations (B) given by the bilingual dictionaries (not disambiguated). We then looked for the following simple configurations, in which the Spanish words are connected to each other via the automatically extracted taxonomy, and the English words via WN. The English words can be connected to the Spanish words via A or via B, or they can be unconnected. This yields eight configurations. So far we have evaluated three of these classes: * class 1: connections via A above and below * class 2: connections via A above and B below * class 4: connections via B above and A below Below there is a table showing volumes. 
The experiment was carried out on four file senses which, in our opinion, would differ in their behaviour: food and artifact, which are classified very similarly in Spanish and in English, and mental process and communication, which are not so clear.</Paragraph> <Paragraph position="7"> Of these volumes, some connections had already been extracted with the previous methods, but some are newly produced. Some of the already existing connections were incorrect, and led to incorrect deductions. Of the newly added connections, a sample has been evaluated, giving the results. After studying all the cases, we can then decide to accept the connections above a threshold, or we can try to combine them to extract more precise ones. For example, some promising combinations could be:</Paragraph> <Paragraph position="9"> which are the combinations of classes 2 and 4 in (5), and combinations of two new classes in (6).</Paragraph> <Paragraph position="10"> Furthermore, we are planning to apply an iterative bootstrapping method that takes advantage of the links with higher confidence scores gathered in previous steps (acting as anchors) to spread evidence where no connections have been found.</Paragraph> <Paragraph position="11"> We are also considering the possibility of not only filling gaps in the middle levels of the hierarchy, but also extending the LgWN by adding subtaxonomies to bottom synsets of WN, trying to cover semantic fields specific to Lg that are not covered by the original WN.</Paragraph> </Section> </Paper>