<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1807"> <Title>Detecting semantic relations between terms in definitions</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> A terminology is an artifact structuring terms according to some semantic relations. Grabar and Hamon (2004) present the different semantic relations likely to be found in terminologies. These can be divided into lexical (synonymy), vertical (hypernymy, meronymy) and transversal relations (domain-specific relations). A study of definition typologies, such as that of Auger (1997), shows that these different relations are also present in definitions. We can then hypothesise that mining definitions along with the detection of their inherent semantic relation can help to organise terms according to the relations used in structured terminologies. We focus in this paper on the detection of terms related by hypernymy and synonymy in definitions.</Paragraph> <Paragraph position="1"> The automatic detection of definitions can draw on several types of existing work. First, we can consider the studies describing what a definition is and, more particularly, what definitions look like in corpora. In this respect, we can cite the work of Trimble (1985), Flowerdew (1992), Sager (2001) and Meyer (2001). A second type of relevant work concerns typologies of definitions: Martin (1983), Chukwu and Thoiron (1989) and Auger (1997), amongst others, provide, in their classifications of definitions, linguistic clues for finding defining statements in corpora. We propose to integrate the typologies that we mention in section 2.2, along with the linguistic clues they give: the definition markers. Finally, some works have already focused on mining definitions from corpora, including Cartier (1997), Pearson (1996), Rebeyrolle (2000) and Muresan and Klavans (2002), mostly through the use of lexical definition markers.
These works provide us with methodological guidelines and another set of lexical markers for our own experiment.</Paragraph> <Paragraph position="2"> Like Pearson (1996) and Rebeyrolle (2000), we base our method on lexico-syntactic patterns, so that we can build on the work on the French language by Rebeyrolle (2000). We extended her work in two respects: an analysis of the parenthesis as a low-level linguistic clue for definitions, and the concomitant extraction of the semantic relation involved in a &quot;defining expression&quot;, along with the extraction of the definition itself. Previous works have, for instance, mined definitions to find terms specific to a particular domain of knowledge (Chukwu and Thoiron, 1989) and to describe their meaning (Rebeyrolle, 2000); we focus on the detection of the semantic relations between the main terms of a definition in order to help a terminologist build a structured terminology following these relations.</Paragraph> <Paragraph position="3"> We implemented an interface to visualise the extracted definitions and semantic relations. We tuned markers and patterns for extracting definitions and semantic relations on a first corpus about anthropology; we then tested the validity of these markers and patterns on another corpus focused on dietetics. The purpose of this test was, on the one hand, to observe whether definitions were still correctly extracted on the basis of patterns trained on a corpus differing in the domain of knowledge and in the genre of documents involved, and, on the other hand, to detect whether the semantic relation associated with each pattern was the same as the one observed in the first corpus. <!-- CompuTerm 2004 - 3rd International Workshop on Computational Terminology 55 --> The markers and patterns proved comparable to the other experiments mentioned in terms of definition extraction: precision ranged from 61% to 66%.
The semantic relation associated with each pattern obtained different scores depending on the marker. In most cases, however, one main semantic relation is associated with a pattern within a single domain, even though a few patterns convey the same relation across our two corpora.</Paragraph> <Paragraph position="4"> The remainder of this paper is organised as follows: we first present previous work (section 2), describe our method and experiment (section 3), then present and discuss results (section 4) and conclude with directions for future work (section 5).</Paragraph> </Section> </Paper>