<?xml version="1.0" standalone="yes"?>
<Paper uid="P90-1035">
  <Title>DETERMINISTIC LEFT TO RIGHT PARSING OF TREE ADJOINING LANGUAGES*</Title>
  <Section position="4" start_page="0" end_page="276" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> LR(k) parsers for Context Free Grammars (Knuth, 1965) consist of a finite state control (constructed from a given CFG) that deterministically drives a push down stack, using k lookahead symbols, while scanning the input from left to right. It has been shown that they recognize exactly the set of languages recognized by deterministic push down automata. LR(k) parsers for CFGs have proven useful for compilers and, more recently, for natural language processing. For natural language processing, although LR(k) parsers are not powerful enough, conflicts between multiple choices are solved by pseudo-parallelism (Lang, 1974; Tomita, 1987). This gives rise to a class of powerful yet efficient parsers for natural languages. It is in this context that we study deterministic (LR(k)-style) parsing of Tree Adjoining Grammars (TAGs).</Paragraph>
    <Paragraph position="1"> *The first author is partially supported by DARPA grant N0014-85-K0018, ARO grant DAAL03-89-C-003iPRI NSF grant-IRIS4-10413 A02. We are extremely grateful to Bernard Lang and David Weir for their valuable suggestions.</Paragraph>
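    <Paragraph> To make the LR control described at the start of this section concrete, the following is a minimal sketch of a finite state control deterministically driving a push down stack. It is not taken from the paper and covers only the context-free case: the toy grammar (S -> ( S ) | x), the state numbering, the hand-built tables, and the name lr_recognize are all assumptions chosen for illustration.

```python
# Hypothetical toy CFG, chosen only for illustration (not from the paper):
#   rule 1: S -> ( S )
#   rule 2: S -> x
# RULES maps a rule number to (left-hand side, length of right-hand side).
RULES = {1: ("S", 3), 2: ("S", 1)}

# Hand-built SLR(1) tables for the grammar above; "$" marks end of input.
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}


def lr_recognize(tokens):
    """Return True if the token sequence is derivable from S."""
    stack = [0]                      # push down stack of control states
    tokens = list(tokens) + ["$"]    # append the end-of-input marker
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:            # empty table cell: reject
            return False
        kind, arg = entry
        if kind == "shift":
            stack.append(arg)        # push the new state, consume one token
            pos += 1
        elif kind == "reduce":
            lhs, rhs_len = RULES[arg]
            del stack[len(stack) - rhs_len:]      # pop one state per RHS symbol
            stack.append(GOTO[(stack[-1], lhs)])  # then follow the goto on LHS
        else:                        # accept only on the appended marker
            return pos == len(tokens) - 1
```

    Each step consults only the state on top of the stack and one lookahead symbol, so the control never backtracks. The parsers studied in this paper replace the plain push down stack with a more powerful storage structure, which this sketch does not attempt to model.
    </Paragraph>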
    <Paragraph position="2"> The set of Tree Adjoining Languages is a strict superset of the set of Context Free Languages (CFLs).</Paragraph>
    <Paragraph position="3"> For example, the cross serial dependency construction in Dutch can be generated by a TAG.¹ Walters (1970), Révész (1971), and Turnbull and Lee (1979) investigated deterministic parsing of the class of context-sensitive languages. However, they used Turing machines, which recognize languages much more powerful than Tree Adjoining Languages. So far, no deterministic bottom-up parser has been proposed for any member of the class of so-called &quot;mildly context sensitive&quot; formalisms (Joshi, 1985), in which Tree Adjoining Grammars fall.² Since the set of Tree Adjoining Languages (TALs) is a strict superset of the set of Context Free Languages, defining LR-type parsers for TAGs requires a more powerful configuration than a finite state automaton driving a push down stack. We investigate the design of deterministic left to right bottom-up parsers for TAGs in which a finite state control drives the moves of a Bottom-up Embedded Push Down Stack. The class of corresponding non-deterministic automata recognizes exactly the set of TALs.</Paragraph>
    <Paragraph position="4"> We focus our attention on showing how a bottom-up embedded pushdown automaton is deterministically driven given a parsing table. To illustrate the building of a parsing table, we consider the simplest case, i.e., the building of LR(0) items and the corresponding LR(0) parsing table for a given TAG.</Paragraph>
    <Paragraph position="5"> ... the same subclass of context-sensitive languages) fall in the class of the so-called &quot;mildly context sensitive&quot; formalisms. The Embedded Push Down Automaton recognizes exactly this set of languages (Vijay-Shanker, 1987).</Paragraph>
    <Paragraph position="6"> An example for a TAG generating a context-sensitive language is given in Figure 5. Finally, we consider the construction of SLR(1) parsing tables.</Paragraph>
    <Paragraph position="7"> We assume that the reader is familiar with TAGs. We refer the reader to Joshi (1987) for an introduction to TAGs. We will assume that the trees can be combined by adjunction only.</Paragraph>
  </Section>
</Paper>