<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-2036">
  <Title>Left-To-Right Parsing and Bilexical Context-Free Grammars</Title>
  <Section position="2" start_page="0" end_page="272" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Traditionally, algorithms for natural language parsing process the input string strictly from left to right.</Paragraph>
    <Paragraph position="1"> In contrast, several algorithms have been proposed in the literature that process the input in a bidirectional fashion; see (van Noord, 1997; Satta and Stock, 1994) and references therein. The relative parsing efficiency of left-to-right and bidirectional methods has long been debated. On the basis of experimental results, it has been argued that the choice of the most favourable strategy should depend on the grammar at hand. For grammar formalisms based upon context-free grammars whose rules strongly depend on lexical information, van Noord (1997) shows that bidirectional strategies are more efficient than left-to-right strategies. This is because bidirectional strategies are most effective in reducing the parsing search space, by activating as early as possible the maximum number of lexical constraints available in the grammar.</Paragraph>
    <Paragraph position="2"> In this paper we present mathematical arguments in support of the above empirically motivated thesis. We investigate a class of lexicalized grammars that, in their probabilistic versions, have been widely adopted as language models in state-of-the-art real-world parsers. The size of these grammars usually grows with the square of the size of the working lexicon, and thus can be very large. In these cases, the primary goal in the design of a parsing algorithm is to achieve asymptotic time performance sublinear in the size of the working grammar and independent of the size of the lexicon. These desiderata are met by existing bidirectional algorithms (Alshawi, 1996; Eisner, 1997; Eisner and Satta, 1999). In contrast, we show the following two main results for the asymptotic time performance of left-to-right algorithms satisfying the so-called correct-prefix property. * In case off-line compilation of the working grammar is not allowed, left-to-right parsing cannot be realised within time bounds independent of the size of the lexicon.</Paragraph>
    <Paragraph position="3"> * In case polynomial-time, off-line compilation of the working grammar is allowed, left-to-right parsing cannot be realised in polynomial time, and independently of the size of the lexicon, unless a strong conjecture based on complexity results for the representation of regular languages is falsified.</Paragraph>
    <Paragraph position="4"> The first result implies that the well-known Earley algorithm and related standard parsing techniques that do not require grammar precompilation cannot be directly extended to process the above-mentioned grammars (resp. language models) within an acceptable time bound. The second result provides evidence that well-known parsing techniques such as left-corner parsing, which require polynomial-time preprocessing of the grammar, also cannot be directly extended to process these formalisms within an acceptable time bound.</Paragraph>
    <Paragraph position="5"> The grammar formalisms we investigate are based upon context-free grammars and are called bilexical context-free grammars. Bilexical context-free grammars have been presented in (Eisner and Satta, 1999) as an abstraction of language models that have been adopted in several recent real-world parsers, improving state-of-the-art parsing accuracy (Alshawi, 1996; Eisner, 1996; Charniak, 1997; Collins, 1997). Our results directly transfer to all these language models. In a bilexical context-free grammar, possible arguments of a word are always specified along with possible head words for those arguments.</Paragraph>
    <Paragraph position="6"> Therefore a bilexical grammar requires the grammar writer to make stipulations about the compatibility of particular pairs of words in particular roles, something that was not necessarily true of general context-free grammars.</Paragraph>
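As a rough illustration of why such grammars grow with the square of the lexicon, the following Python sketch enumerates binary rules in which each rule names both the head word of the parent and the head word of its argument. The rule shape, the class name BilexicalRule, and the helper all_binary_rules are assumptions made for this example only; the formal definition is the one given in Section 2. The sketch takes the worst case in which every pair of words is compatible.

```python
from itertools import product
from typing import NamedTuple


class BilexicalRule(NamedTuple):
    """Binary rule  parent[head] -> left[left_head] right[right_head].

    One child inherits the parent's head word; the other child is an
    argument whose own head word is also named by the rule.
    (Hypothetical representation, for illustration only.)
    """
    parent: str
    head: str
    left: str
    left_head: str
    right: str
    right_head: str


def all_binary_rules(nonterminals, lexicon):
    """Enumerate every rule A[a] -> B[b] C[a], with the head on the right child.

    This is the worst case in which every word pair is compatible; a real
    grammar would keep only the pairs the grammar writer stipulates.
    """
    return [
        BilexicalRule(a_nt, a, b_nt, b, c_nt, a)
        for a_nt, b_nt, c_nt in product(nonterminals, repeat=3)
        for a, b in product(lexicon, repeat=2)
    ]


nonterminals = ["NP", "VP"]
for size in (2, 4, 8):
    lexicon = [f"w{i}" for i in range(size)]
    rules = all_binary_rules(nonterminals, lexicon)
    # Doubling the lexicon quadruples the number of rules.
    print(size, len(rules))
```

With the nonterminal set held fixed, the rule count in this sketch is proportional to the number of ordered word pairs, which is the quadratic growth in lexicon size referred to above.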
    <Paragraph position="7"> The remainder of this paper is organized as follows. We introduce bilexical context-free grammars in Section 2, and discuss parsing with the correct-prefix property in Section 3. Our results for parsing with on-line and off-line grammar compilation are presented in Sections 4 and 5, respectively. To complete the presentation, Appendix A shows that left-to-right parsing in time independent of the size of the lexicon is indeed possible when an off-line compilation of the working grammar is allowed that has an exponential time complexity.</Paragraph>
  </Section>
</Paper>