<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1133">
  <Title>Automated Induction of Sense in Context</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> This paper describes a new model for the acquisition and exploitation of selectional preferences for predicates from natural language corpora. Our goal is to apply this model in order to construct a dictionary of normal selection contexts for natural language; that is, a computational lexical database of rich selectional contexts, associated with procedures for assigning interpretations on a probabilistic basis to less normal contexts. Such a semi-automatically developed resource promises to have applications in a number of NLP tasks, including word-sense disambiguation and selectional preference acquisition, as well as anaphora resolution and inference in specialized domains. We apply this methodology to a selected set of verbs, including a subset of the verbs in the Senseval 3 word sense discrimination task, and report our initial results.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.1 Selectional Preference Acquisition: Current State of the Art
</SectionTitle>
      <Paragraph position="0"> Predicate subcategorization information constitutes an essential part of the computational lexicon entry. In recent years, a number of approaches have been proposed for dealing computationally with selectional preference acquisition (Resnik (1996); Briscoe and Carroll (1997); McCarthy (1997); Rooth et al.</Paragraph>
      <Paragraph position="1"> (1999); Abney and Light (1999); Ciaramita and Johnson (2000); Korhonen (2002)).</Paragraph>
      <Paragraph position="2"> The best algorithms currently available for the acquisition of selectional preferences for predicates are induction algorithms modeling selectional behavior as a distribution over words (cf. Abney and Light (1999)). Semantic classes assigned to predicate arguments in subcategorization frames are either derived automatically through statistical clustering techniques (Rooth et al. (1999), Light and Greiff (2002)) or assigned using hand-constructed lexical taxonomies such as the WordNet hierarchy or LDOCE semantic classes. Overwhelmingly, WordNet is chosen as the default resource for dealing with the sparse data problem (Resnik (1996); Abney and Light (1999); Ciaramita and Johnson (2000); Agirre and Martinez (2001); Clark and Weir (2001); Carroll and McCarthy (2000); Korhonen and Preiss (2003)).</Paragraph>
      <Paragraph position="3"> Much of the work on inducing selectional preferences for verbs from corpora deals with predicates indiscriminately, assuming no differentiation between predicate senses (Resnik (1996); Abney and Light (1999); Ciaramita and Johnson (2000); Rooth et al.</Paragraph>
      <Paragraph position="4"> (1999)). Those approaches that do distinguish between predicate senses or complementation patterns in the acquisition of selectional constraints (Korhonen (2002); Korhonen and Preiss (2003)) do not use corpus analysis for verb sense classification.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.2 Word Sense Disambiguation: Current
</SectionTitle>
      <Paragraph position="0"> State of the Art Previous computational concerns for economy of grammatical representation have given way to models of language that not only exploit generative grammatical resources but also have access to large lists of contexts of linguistic items (words), to which new structures can be compared in new usages.</Paragraph>
      <Paragraph position="1"> However, following the work of Yarowsky (1992), Yarowsky (1995), many supervised WSD systems use minimal information about syntactic structures, for the most part restricting the notion of context to topical and local features. Topical features track open-class words that appear within a certain window around a target word, and local features track small N-grams associated with the target word. Disambiguation therefore relies on word co-occurrence statistics, rather than on structural similarities. That remains the case for most systems that participated in Senseval-2 (Preiss and Yarowsky (2001)). Some recent work (Stetina et al. (1998); Agirre et al. (2002); Yamashita et al. (2003)) attempts to change this situation and presents a directed effort to investigate the impact of using syntactic features for WSD learning algorithms. Agirre et al. (2002) and Yamashita et al. (2003) report resulting improvement in precision.</Paragraph>
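The two feature families described above can be sketched concretely. This is an illustrative reconstruction, not the feature extractor of any cited system: the stopword list, window size, and bigram shape are assumptions made for the example.

```python
# Toy stopword list standing in for a closed-class filter.
STOPWORDS = {"the", "a", "of", "in", "to", "and"}

def topical_features(tokens, i, window=3):
    """Open-class words within +/- window tokens of the target at index i."""
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    return {t for j, t in enumerate(tokens[lo:hi], lo)
            if j != i and t not in STOPWORDS}

def local_features(tokens, i):
    """Small n-gram features: the target's immediate neighbours."""
    feats = set()
    if i > 0:
        feats.add(("prev", tokens[i - 1]))
    if i + 1 < len(tokens):
        feats.add(("next", tokens[i + 1]))
    return feats

tokens = "the bank raised the interest rate".split()
```

Note that neither feature family inspects syntactic structure: "interest" counts as context for "bank" purely by proximity, which is exactly the limitation the syntax-oriented work cited above sets out to address.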
      <Paragraph position="2"> Stevenson and Wilks (2001) propose a somewhat related technique to handle WSD, based on integrating LDOCE classes with simulated annealing.</Paragraph>
      <Paragraph position="3"> Although space does not permit discussion here, initial comparisons suggest that our selection contexts could incorporate similar knowledge resources; it is not clear what role model bias plays in associating patterns with senses, however.</Paragraph>
      <Paragraph position="4"> In this paper we modify the notion of word sense, and at the same time revise the manner in which senses are encoded. The notion of word sense that has been generally adopted in the literature is an artifact of several factors in the status quo, notably the availability of lexical resources such as machine-readable dictionaries, in which fine sense distinctions are not supported by criteria for selecting one sense rather than another, and WordNet, where synset groupings are taken as defining word sense distinctions. Thus, for instance, Senseval-2 WSD tasks required disambiguation using WordNet senses (see, e.g., discussion in Palmer et al. (2004)). The feature sets used in the supervised WSD algorithms at best use only minimal information about the typing of arguments. The approach we adopt, Corpus Pattern Analysis (CPA) (Pustejovsky and Hanks (2001)), incorporates semantic features of the arguments of the target word. Semantic features are expressed in terms of a restricted set of shallow types, chosen for their prevalence in selection context patterns. This type system is extended with predicate-based noun clustering, in the bootstrapping process described below.</Paragraph>
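One way to picture a selection context pattern of the kind just described: a verb paired with shallow semantic types for its argument slots, matched against a clause whose argument heads are typed by a lexicon. The type names, lexicon entries, and slot labels here are hypothetical, invented only to illustrate the shape of the idea; they are not CPA's actual inventory.

```python
# Toy shallow-type lexicon; real systems would back this with clustering
# or a taxonomy, as discussed above.
TYPE_LEXICON = {
    "soldier": "Human",
    "life": "Abstract",
    "glance": "Event",
}

def matches(pattern, clause):
    """True if the clause's verb and typed argument slots fit the pattern."""
    if pattern["verb"] != clause["verb"]:
        return False
    return all(TYPE_LEXICON.get(clause.get(slot)) == wanted
               for slot, wanted in pattern["slots"].items())

risk_pattern = {"verb": "risk", "slots": {"subj": "Human", "obj": "Abstract"}}
clause = {"verb": "risk", "subj": "soldier", "obj": "life"}
```

The point of the shallow types is visible even in this sketch: "soldier risked his life" fits the pattern, while "soldier risked a glance" fails on the object type and would need a different pattern, i.e. a different component of the verb's meaning potential.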
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.3 Related Resources: FrameNet
</SectionTitle>
      <Paragraph position="0"> It is necessary to say a few words about the differences between CPA and FrameNet. The CPA approach has its origins in the analysis of large corpora for lexicographic purposes (e.g. Cobuild (Sinclair et al., 1987)) and in systemic-functional grammar, in particular in Halliday's notion of &amp;quot;lexis as a linguistic level&amp;quot; (Halliday, 1966) and Sinclair's empirical approach to collocational analysis (Sinclair, 1991). FrameNet (freely available online in a beautifully designed database at http://www.icsi.berkeley.edu/~framenet/) is an attempt to implement Fillmore's 1975 proposal that, instead of seeking to satisfy a set of necessary and sufficient conditions, the meanings of words in text should be analyzed by calculating resemblance to a prototype (Fillmore, 1975).</Paragraph>
      <Paragraph position="1"> CPA (Hanks, 2004) is concerned with establishing prototypical norms of usage for individual words. It is possible (and certainly desirable) that CPA norms will be mappable onto FrameNet's semantic frames (for which see the whole issue of the International Journal of Lexicography for September 2003, in particular Atkins et al. (2003a), Atkins et al. (2003b), Fillmore et al. (2003a), Baker et al. (2003), Fillmore et al. (2003b)). In frame semantics, the relationship between semantics and syntactic realization is often at a comparatively deep level, i.e. in many sentences there are elements that are potentially present but not actually expressed. For example, in the sentence &amp;quot;he risked his life&amp;quot;, two semantic roles are expressed (the risker and the valued object &amp;quot;his life&amp;quot; that is put at risk). But at least three other roles are subliminally present although not expressed: the possible bad outcome (&amp;quot;he risked his death&amp;quot;), the beneficiary or goal (&amp;quot;he risked his life for her/for a few dollars&amp;quot;), and the means (&amp;quot;he risked a backward glance&amp;quot;).</Paragraph>
      <Paragraph position="2"> CPA, on the other hand, is shallower and more practical: the objective is to identify, in relation to a given target word, the overt textual clues that activate one or more components of its meaning potential. There is also a methodological difference: whereas FrameNet research proceeds frame by frame, CPA proceeds word by word. This means that when a word has been analysed in CPA the patterns are immediately available for disambiguation. FrameNet will be usable for disambiguation only when all frames have been completely analysed.</Paragraph>
      <Paragraph position="3"> Even then, FrameNet's methodology, which requires the researchers to think up all possible members of a Frame a priori, means that important senses of words that have been partly analysed are missing and may continue to be missing for years to come.</Paragraph>
      <Paragraph position="4"> There is no attempt in FrameNet to identify the senses of each word systematically and contrastively.</Paragraph>
      <Paragraph position="5"> In its present form, at least, FrameNet has at least as many gaps as senses. For example, at the time of writing toast is shown as part of the Apply Heat frame but not the Celebrate frame. It is not clear how or whether the gaps are to be filled systematically. We do not even know whether there is (or is going to be) a Celebrate frame and, if so, what it will be called. What is needed is a principled fix: a decision to proceed from evidence, not frames. This is ruled out by FrameNet for principled reasons: the unit of analysis for FrameNet is the frame, not the word.</Paragraph>
    </Section>
  </Section>
</Paper>