<?xml version="1.0" standalone="yes"?> <Paper uid="W02-1403"> <Title>Lexically-Based Terminology Structuring: Some Inherent Limits</Title> <Section position="7" start_page="0" end_page="0" type="concl"> <SectionTitle> 5 Synthesis </SectionTitle> <Paragraph position="0"> In this paper, we presented a human analysis of automatically induced, lexically based term relations that were not found in the terminology from which the terms were obtained (the MeSH thesaurus). The lexical method considers a term a0 to be a probable parent of a term a1 iff all the words of a0 occur in a1.</Paragraph> <Paragraph position="1"> This inclusion test is aided by morphological normalization. Morphological normalization was found to be useful not only for identifying the already existing relations (section 3.2), but also for the 'new' relations. This confirms previous work by Jacquemin and Tzoukermann (1999).</Paragraph> <Paragraph position="2"> The observed cases of syntactic ambiguity suggest that morphosyntactic tagging could be useful. Methods specifically designed to detect syntactic and morphosyntactic term variants (Bourigault, 1994; Jacquemin and Tzoukermann, 1999) might then be more efficient and less error-prone. Note, however, that this may not be an easy task, since most MeSH terms are not syntactically well-formed (few determiners and prepositions, inverted heads) and contain rare, technical words that are likely to be absent from most electronic lexicons.</Paragraph> <Paragraph position="3"> Spurious relations may come from several sources. A few cases are due to overly aggressive morphological normalization; errors in term names (translation errors) were also uncovered. We distinguished between the 'head' and 'expansion' positions of the 'parent' term in its 'child'.
One would expect relations in which the parent is in head position to be correct; however, this is not always the case.</Paragraph> <Paragraph position="4"> The putative head of a term is sometimes misidentified because of thesaurus-specific constructs (the 'comma' form) and chemical constructs (quinone reductases are a kind of reductase), both of which display head inversion, and because of enumerations. A further case is that of a term whose actual syntactic head does not stand in an is-a relation with it (the 'plaster cat'). Furthermore, the head word may not have a stable meaning: it may be syntactically ambiguous (cilie), polysemous (investissement) or underspecified (acne).</Paragraph> <Paragraph position="5"> The remaining 'head' cases reveal specific modeling options, or 'ontological commitments', of the terminology designers: the induced relations might be considered semantically valid, but were discarded in the MeSH because of overall structuring choices. These choices cannot be predicted with the lexical methods used here, and seem to be the most resistant to attempts at automatic derivation. They also show that what is correct is not necessarily useful for a given terminology.</Paragraph> <Paragraph position="6"> The 'expansion' cases may be useful for proposing relations other than is-a: we presented partitive relations, but left the classification of the remaining ones to future work. The UMLS semantic network relations (NLM, 2001b) might be a relevant direction to explore for representing such links.</Paragraph> </Section> </Paper>