<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1065">
  <Title>An Expert Lexicon Approach to Identifying English Phrasal Verbs</Title>
  <Section position="2" start_page="0" end_page="1" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Any natural language processing (NLP) system needs to address the issue of handling multiword expressions, including phrasal verbs (PVs) [Sag et al. 2002; Breidt et al. 1996]. This paper presents a proven approach to identifying English PVs based on pattern matching using a formalism called Expert Lexicon.</Paragraph>
    <Paragraph position="1"> Phrasal Verbs are an important feature of the English language since they form about one third of the English verb vocabulary.</Paragraph>
    <Paragraph position="2"> In the verb vocabulary of our system, which is based on machine-readable dictionaries and two phrasal verb dictionaries, phrasal verb entries constitute 33.8% of the entries.</Paragraph>
    <Paragraph position="3"> Recognizing PVs is an important condition for English parsing. Like single-word verbs, each PV has its own lexical features, including subcategorization features that determine its structural patterns [Fraser 1976; Bolinger 1971; Pelli 1976; Shaked 1994]. For example, look for has syntactic subcategorization and semantic features similar to those of search, and carry...on shares lexical features with continue. Such lexical features can be represented in the PV lexicon in the same way as those for single-word verbs, but a parser can only use them once the PV has been identified.</Paragraph>
    <Paragraph position="4"> Problems like PVs are regarded as 'a pain in the neck for NLP' [Sag et al. 2002]. A proper solution to this problem requires tighter interaction between syntax and lexicon than traditionally available [Breidt et al. 1994].</Paragraph>
    <Paragraph position="5"> Simple lexical lookup leads to severe degradation in both precision and recall, as our benchmarks show (Section 4). The recall problem is mainly due to separable PVs such as turn...off, which allow syntactic units to be inserted inside the PV compound, e.g., turn it off, turn the radio off. The precision problem is caused by the ambiguous function of the particle. For example, a simple lexical lookup will mistag looked for as a phrasal verb in sentences such as He looked for quite a while but saw nothing.</Paragraph>
    <Paragraph position="6"> In short, the traditional NLP framework that separates the lexicon module from a parser makes it difficult to handle this problem properly. This paper presents an expert lexicon approach that integrates the lexical module with contextual checking based on shallow parsing results.</Paragraph>
    <Paragraph position="7"> Extensive blind benchmarking shows that this approach is very effective for identifying phrasal verbs, achieving a combined precision/recall F-score of about 96%.</Paragraph>
    <Paragraph position="8"> The remainder of this paper is structured as follows.</Paragraph>
    <Paragraph position="9"> Section 2 presents the problem and defines the task. Section 3 presents the Expert Lexicon formalism and illustrates the use of this formalism in solving this problem. Section 4 shows the benchmarking and analysis, followed by conclusions in Section 5.</Paragraph>
  </Section>
</Paper>