<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-2001">
  <Title>Determining the Specificity of Terms using Compositional and Contextual Information</Title>
  <Section position="3" start_page="0" end_page="2" type="intro">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> Terminology management is primarily concerned with terms, i.e., the words assigned to concepts used in domain-related texts. A term is a meaningful unit that represents a specific concept within a domain (Wright, 1997).</Paragraph>
    <Paragraph position="1"> The specificity of a term represents the quantity of domain-specific information contained in the term. If a term carries a large quantity of domain-specific information, its specificity value is large; otherwise, its specificity value is small. The specificity of a term X is quantified as a positive real number, as in equation (1).</Paragraph>
    <Paragraph position="3"> Specificity is an important necessary condition in a term hierarchy, i.e., if a term X1 is an ancestor of a term X2, then the specificity of X1 is less than the specificity of X2. Specificity can therefore be applied to the automatic construction and evaluation of term hierarchies. When domain-specific concepts are represented as terms, the terms fall into two categories based on the composition of their unit words. In the first category, new terms are created by adding modifiers to existing terms. For example, "insulin-dependent diabetes mellitus" was created by adding the modifier "insulin-dependent" to its hypernym "diabetes mellitus", as in Table 1. In English, specific-level terms are very commonly compounds of a generic-level term and some modifier (Croft, 2004). In this case, compositional information is important for determining their meaning. In the second category, new terms are created independently of existing terms. For example, "wolfram syndrome" is semantically related to its ancestor terms, as in Table 1, but it shares no common words with them.</Paragraph>
    <Paragraph position="4"> In this case, contextual information is used to discriminate the features of the terms.</Paragraph>
    <Paragraph position="5"> [Table 1 caption fragment: ... tree. Node numbers represent the hierarchical structure of terms.] Contextual information has mainly been used to represent the characteristics of terms. Caraballo (1999A), Grefenstette (1994), Hearst (1992), Pereira (1993), and Sanderson (1999) used contextual information to find hyponymy relations between terms. Caraballo (1999B) also used contextual information to determine the specificity of nouns. In contrast, compositional information of terms has not been commonly discussed. [Footnote: MeSH is available at http://www.nlm.nih.gov/mesh. MeSH 2003 was used in this research.]</Paragraph>
    <Paragraph position="6"> We propose new methods for measuring specificity based on both compositional and contextual information. The methods are formulated as information-theory-like measures. Because the methods do not use domain-specific information, they are easily adapted to terms of other domains. The rest of this paper is organized as follows: compositional and contextual information is discussed in section 2, the information-theory-like measures are described in section 3, experiments and evaluation are discussed in section 4, and conclusions are drawn in section 5.</Paragraph>
  </Section>
</Paper>