<?xml version="1.0" standalone="yes"?>
<Paper uid="J93-4002">
  <Title>Parsing Some Constrained Grammar Formalisms</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> This paper presents a scheme to extend known recognition algorithms for Context-Free Grammars (CFG) in order to obtain recognition algorithms for a class of grammatical formalisms that generate a strict superset of the set of languages generated by CFG.</Paragraph>
    <Paragraph position="1"> In particular, we use this scheme to give recognition algorithms for Linear Indexed Grammars (LIG), Tree Adjoining Grammars (TAG), and a version of Combinatory Categorial Grammars (CCG). These formalisms belong to the class of mildly context-sensitive grammar formalisms identified by Joshi (1985) on the basis of some properties of their generative capacity. The parsing strategy that we propose can be applied to the formalisms listed as well as others that have similar characteristics (as outlined below) in their derivational process. Some of the main ideas underlying our scheme have been influenced by the observations that can be made about the constructions used in the proofs of the equivalence of these formalisms and Head Grammars (HG) (Vijay-Shanker 1987; Weir 1988; Vijay-Shanker and Weir 1993).</Paragraph>
    <Paragraph position="2"> There are similarities between the TAG and HG derivation processes and that of Context-Free Grammars (CFG). This is reflected in common features of the parsing algorithms for HG (Pollard 1984) and TAG (Vijay-Shanker and Joshi 1985) and the CKY algorithm for CFG (Kasami 1965; Younger 1967). In particular, what can happen at each step in a derivation can depend only on which of a finite set of &quot;states&quot; the derivation is in (for CFG these states can be considered to be the nonterminal symbols).</Paragraph>
    <Paragraph position="3"> This property, which we refer to as the context-freeness property, is important because it allows one to keep only a limited amount of context during the recognition process, which results in polynomial-time algorithms. In the recognition algorithms mentioned above for CFG, HG, and TAG, this is reflected in the fact that the recognizer can encode intermediate stages of the derivation with a bounded number of states. An array is used whose entries are associated with a given component of the input. In the case of the CKY algorithm, the presence of a particular nonterminal in an array entry is used to encode the fact that the nonterminal derives the associated substring of the input. [* Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716. E-mail: vijay@udel.edu. School of Cognitive and Computing Sciences, University of Sussex, Brighton BN1 9QH, U.K. E-mail: davidw@cogs.susx.ac.uk. (c) 1994 Association for Computational Linguistics. Computational Linguistics Volume 19, Number 4.]</Paragraph>
    <Paragraph position="4"> The context-freeness of CFG has the consequence that there is no need to encode the way, or ways, in which a nonterminal came to be placed in an array entry.</Paragraph>
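    The array-based encoding described above can be illustrated with a minimal sketch of a CKY recognizer for a CFG in Chomsky normal form. This is an illustration only, not the algorithm developed in this paper: the function name cky_recognize, the dictionary-based rule encoding, and the toy grammar UNARY and BINARY are all assumptions introduced here. Each array entry holds only the set of nonterminals that derive the associated substring, with no record of how each nonterminal came to be placed there.

```python
from itertools import product


def cky_recognize(words, unary_rules, binary_rules, start="S"):
    """Return True if `words` is derivable from `start` (grammar in CNF).

    unary_rules:  dict mapping a terminal to the set of nonterminals
                  that rewrite directly to that terminal.
    binary_rules: dict mapping a pair (B, C) to the set of nonterminals
                  A that have a rule rewriting A as B C.
    """
    n = len(words)
    if n == 0:
        return False  # this sketch does not handle the empty string
    # table[i][j] holds the nonterminals that derive words[i:j]. Only the
    # nonterminal is stored, never the way it was derived.
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(unary_rules.get(word, ()))
    for span in range(2, n + 1):           # length of the substring
        for i in range(n - span + 1):      # start position
            j = i + span                   # end position (exclusive)
            for k in range(i + 1, j):      # split point
                for left, right in product(table[i][k], table[k][j]):
                    table[i][j] |= binary_rules.get((left, right), set())
    return start in table[0][n]


# Hypothetical toy grammar for strings of n a's followed by n b's (n at
# least 1): S rewrites to A B or A X, X to S B, A to "a", B to "b".
UNARY = {"a": {"A"}, "b": {"B"}}
BINARY = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}

print(cky_recognize("a a b b".split(), UNARY, BINARY))
print(cky_recognize("a b b".split(), UNARY, BINARY))
```

    Because each entry is a subset of a fixed, finite set of nonterminals, the size of the table is bounded by a polynomial in the length of the input, which is where the polynomial time bound comes from.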
    <Paragraph position="5"> In this respect, the derivation processes of CCG and LIG would appear to differ from that of CFG. In these systems, unbounded stack-like structures take over the role played by nonterminals in controlling derivation choices. This would seem to suggest that the context-freeness property of CFG, HG, and TAG derivations no longer holds.</Paragraph>
    <Paragraph position="6"> Unbounded stacks can encode an unbounded number of earlier derivation choices. In fact, while the path sets of CFG, HG, and TAG derivation trees are regular languages, the path sets of CCG and LIG are context-free languages. With respect to recognition algorithms, this suggests that the array (whose entries contain nonterminals in the case of CFG) would need to contain complete encodings of unbounded stacks, giving an exponential-time algorithm.</Paragraph>
    <Paragraph position="7"> However, in LIG and CCG, the use of stacks to control derivations is limited in that different branches of a derivation cannot share stacks. Thus, despite the above observations, the context-freeness property does in fact hold. A detailed explanation of why this is so will be presented below. We propose a method to extend the CKY algorithm to handle the limited use of stacks found in CCG and LIG. We have chosen to adapt the CKY algorithm since it is the simplest form of bottom-up parsing. A similar approach using Earley's algorithm is also possible, although it is not considered here. Since the use of stacks is most explicit in the LIG formalism, we describe our approach in detail by developing a recognition algorithm for LIG (Sections 2 and 3). We then show how the general approach suggested in the parser for LIG can be tailored to CCG (in Section 4). In the above discussion, TAG has been grouped with HG. However, TAG can also be viewed as making use of stacks in the same way as LIG and CCG. In Section 5 we show how the LIG algorithm presented in Section 3 can be adapted for TAG.</Paragraph>
  </Section>
</Paper>