<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2064">
  <Title>Interpreting Semantic Relations in Noun Compounds via Verb Semantics</Title>
  <Section position="4" start_page="0" end_page="491" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The interpretation of noun compounds (hereafter, NCs) such as apple pie or family car is a well-established sub-task of language understanding.</Paragraph>
    <Paragraph position="1"> Conventionally, the NC interpretation task is defined in terms of unearthing the underspecified semantic relation between the head noun and modifier(s), e.g. pie and apple respectively in the case of apple pie.</Paragraph>
    <Paragraph position="2"> NC interpretation has been studied in the context of applications including question-answering and machine translation (Moldovan et al., 2004; Cao and Li, 2002; Baldwin and Tanaka, 2004; Lauer, 1995).</Paragraph>
    <Paragraph position="3"> Recent work on the automatic/semi-automatic interpretation of NCs (e.g., Lapata (2002), Rosario and Marti (2001), Moldovan et al. (2004) and Kim and Baldwin (2005)) has made assumptions about the scope of semantic relations or restricted the domain of interpretation. This makes it difficult to gauge the general-purpose utility of the different methods. Our method avoids any such assumptions while outperforming previous methods.</Paragraph>
    <Paragraph position="4"> In seminal work on NC interpretation, Finin (1980) tried to interpret NCs based on hand-coded rules. Vanderwende (1994) attempted the automatic interpretation of NCs using hand-written rules, with the obvious cost of manual intervention. Fan et al. (2003) estimated the knowledge required to interpret NCs and claimed that performance was closely tied to the volume of data acquired. In more recent work, Barker and Szpakowicz (1998) used a semi-automatic method for NC interpretation in a fixed domain. Lapata (2002) developed a fully automatic method but focused on nominalizations, a proper subclass of NCs (see footnote 1). Rosario and Marti (2001) classified the nouns in medical texts by tagging hierarchical information using neural networks. Moldovan et al. (2004) used the word senses of nouns based on the domain or range of interpretation of an NC, leading to questions of scalability and portability to novel domains/NC types. Kim and Baldwin (2005) proposed a simplistic general-purpose method based on the lexical similarity of unseen NCs with training instances.</Paragraph>
    <Paragraph position="5"> The aim of this paper is to develop an automatic method for interpreting NCs based on semantic relations. We interpret semantic relations relative to a fixed set of constructions involving the modifier and head noun and a set of seed verbs for each semantic relation: e.g. (the) family owns (a) car is taken as evidence for family car being an instance of the POSSESSOR relation. We then attempt to map all instances of the modifier and head noun as the heads of NPs in a transitive sentential context onto our set of constructions via lexical similarity over the verb, to arrive at an interpretation: e.g. we would hope to predict that possess is sufficiently similar to own that (the) family possesses (a) car would be recognised as supporting evidence for the POSSESSOR relation. [Footnote 1: With nominalizations, the head noun is deverbal, and in the case of Lapata (2002), nominalizations are assumed to be interpretable as the modifier being either the subject (e.g. child behavior) or object (e.g. car lover) of the base verb of the head noun.]</Paragraph>
    <Paragraph position="6"> We use a supervised classifier to combine the evidence contributed by individual sentential contexts of a given modifier-head noun combination, and arrive at a final interpretation for a given NC.</Paragraph>
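The mapping from attested verbs to seed verbs described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the seed-verb lists, the PRODUCT placeholder relation, the SIMILAR table (a stand-in for a lexical-similarity resource) and the function names are hypothetical, verbs are assumed to be already lemmatized, and a plain majority vote is swapped in for the supervised classifier used in this paper.

```python
from collections import Counter

# Hypothetical seed verbs per semantic relation (illustrative only; the
# actual seed-verb lists are not given in this section). PRODUCT is a
# placeholder second relation.
SEED_VERBS = {
    "POSSESSOR": {"own", "have"},
    "PRODUCT": {"make", "produce"},
}

# Hypothetical similarity table standing in for a lexical resource such as
# a thesaurus: each attested verb lists the verbs it is considered close to.
SIMILAR = {
    "possess": {"own", "have"},
    "manufacture": {"make", "produce"},
}


def map_to_seed(verb):
    """Return (relation, seed_verb) for a lemmatized verb, or None if unmapped."""
    # Case 1: the attested verb is itself a seed verb.
    for relation, seeds in sorted(SEED_VERBS.items()):
        if verb in seeds:
            return relation, verb
    # Case 2: the attested verb is similar to at least one seed verb.
    for relation, seeds in sorted(SEED_VERBS.items()):
        shared = SIMILAR.get(verb, set()).intersection(seeds)
        if shared:
            return relation, sorted(shared)[0]
    return None


def interpret(contexts):
    """Majority vote over (subject, verb, object) contexts for one NC.

    A simple vote replaces the supervised classifier to keep the sketch short.
    """
    votes = Counter()
    for _subject, verb, _object in contexts:
        mapped = map_to_seed(verb)
        if mapped is not None:
            votes[mapped[0]] += 1
    if not votes:
        return None  # no usable evidence, so no classification
    return votes.most_common(1)[0][0]


contexts = [
    ("family", "possess", "car"),
    ("family", "own", "car"),
    ("family", "wash", "car"),  # unmapped verb, contributes no vote
]
print(interpret(contexts))
```

In this toy setting, an attested verb that is neither a seed verb nor similar to one contributes no evidence, which mirrors the concern that a mapping failure can leave an NC without any classification.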
    <Paragraph position="7"> Mapping the actual verbs in sentences to appropriate seed verbs is obviously crucial to the success of our method. This is particularly important as there is no guarantee that we will find large numbers of modifier-head noun pairings in the sorts of sentential contexts required by our method, nor that we will find attested instances based on the seed verbs. Thus an error in mapping an attested verb to the seed verbs could result in a wrong interpretation or no classification at all. In this paper, we experiment with the use of WordNet (Fellbaum, 1998) and word clusters (based on Moby's Thesaurus) in mapping attested verbs to the seed verbs. We also make use of CoreLex in dealing with the semantic relation TIME and the RASP parser (Briscoe and Carroll, 2002) to determine the dependency structure of corpus data.</Paragraph>
    <Paragraph position="8"> The data source for our set of NCs is binary NCs (i.e. NCs with a single modifier) from the Wall Street Journal component of the Penn Treebank. We deliberately choose to ignore NCs with multiple modifiers on the grounds that: (a) 88.4% of NC types in the Wall Street Journal component of the Penn Treebank and 90.6% of NC types in the British National Corpus are binary; and (b) we expect to be able to interpret NCs with multiple modifiers by decomposing them into binary NCs.</Paragraph>
    <Paragraph position="9"> Another simplifying assumption we make is to remove NCs incorporating proper nouns since: (a) the lexical resources we employ in this research do not contain them in large numbers; and (b) there is some doubt as to whether the set of semantic relations required to interpret NCs incorporating proper nouns is the same as that for common nouns.</Paragraph>
    <Paragraph position="10"> The paper is structured as follows. Section 2 takes a brief look at the semantics of NCs and the basic idea behind the work. Section 3 details the set of NC semantic relations that is used in our research, Section 4 presents an extended discussion of our approach, Section 5 briefly explains the tools we use, Section 6.1 describes how we gather and process the data, Section 6.2 explains how we map the verbs to seed verbs, and Section 7 and Section 8 present the results and analysis of our approach. Finally, we conclude our work in Section 9.</Paragraph>
  </Section>
</Paper>