<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0410">
  <Title>Semi-supervised Verb Class Discovery Using Noisy Features</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Computational linguists face a lexical acquisition bottleneck, as vast amounts of knowledge about individual words are required for language technologies. Learning the argument structure properties of verbs--the semantic roles they assign and their mapping to syntactic positions--is both particularly important and difficult. A number of supervised learning approaches have extracted such informationabout verbs from corpora, including their argument roles (Gildea and Jurafsky, 2002), selectional preferences (Resnik, 1996), and lexical semantic classification (i.e., grouping verbs according to their argument structure properties) (Dorr and Jones, 1996; Lapata and Brew, 1999; Merlo and Stevenson, 2001; Joanis and Stevenson, 2003). Unsupervised or semi-supervised approaches have been successful as well, but have tended to be more restrictive, in relying on human filtering of the results (Riloff and Schmelzenbach, 1998), on the handselection of features (Stevenson and Merlo, 1999), or on the use of an extensive grammar (Schulte im Walde and Brew, 2002).</Paragraph>
    <Paragraph position="1"> We focus here on extending the applicability of unsupervised methods, as in (Schulte im Walde and Brew, 2002; Stevenson and Merlo, 1999), to the lexical semantic classification of verbs. Such classes group together verbs that share both a common semantics (such as transfer of possession or change of state), and a set of syntactic frames for expressing the arguments of the verb (Levin, 1993; FrameNet, 2003). As such, they serve as a means for organizing complex knowledge about verbs in a computational lexicon (Kipper et al., 2000). However, creating a verb classification is highly resource intensive, in terms of both required time and linguistic expertise. Development of minimally supervised methods is of particular importance if we are to automatically classify verbs for languages other than English, where substantial amounts of labelled data are not available for training classifiers. It is also necessary to consider the probable lack of sophisticated grammars or text processing tools for extracting accurate features.</Paragraph>
    <Paragraph position="2"> We have previously shown that a broad set of 220 noisy features performs well in supervised verb classification (Joanis and Stevenson, 2003). In contrast to Merlo and Stevenson (2001), we confirmed that a set of general features can be successfully used, without the need for manually determining the relevant features for distinguishing particular classes (cf. Dorr and Jones, 1996; Schulte im Walde and Brew, 2002). On the other hand, in contrast to Schulte im Walde and Brew (2002), we demonstrated that accurate subcategorizationstatistics are unnecessary (see also Sarkar and Tripasai, 2002).</Paragraph>
    <Paragraph position="3"> By avoiding the dependence on precise feature extraction, our approach should be more portable to new languages. However, a general feature space means that most features will be irrelevant to any given verb discrimination task. In an unsupervised (clustering) scenario of verb class discovery, can we maintain the benefit of only needing noisy features, without the generality of the feature space leading to &amp;quot;the curse of dimensionality&amp;quot;? In supervised experiments, the learner uses class labels during the training stage to determine which features are relevant to the task at hand. In the unsupervised setting, the large number of potentially irrelevant features becomes a serious problem, since those features may mislead the learner.</Paragraph>
    <Paragraph position="4"> Thus, the problem of dimensionality reduction is a key issue to be addressed in verb class discovery. In this paper, we report results on several feature selection approaches to the problem: manual selection (based on linguistic knowledge), unsupervised selection (based on an entropy measure among the features, Dash et al., 1997), and a semi-supervised approach (in which seed verbs are used to train a supervised learner, from which we extract the useful features). Although our motivation is verb class discovery, we perform our experiments on English, for which we have an accepted classification to serve as a gold standard (Levin, 1993). To preview our results, we find that, overall, the semi-supervised method not only outperforms the entire feature space, but also the manually selected subset of features. The unsupervised feature selection method, on the other hand, was not usable for our data.</Paragraph>
    <Paragraph position="5"> In the remainder of the paper, we first briefly review our feature space and present our experimental classes and verbs. We then describe our clustering methodology, the measures we use to evaluate a clustering, and our experimental results. We conclude with a discussion of related work, our contributions, and future directions.</Paragraph>
  </Section>
class="xml-element"></Paper>