<?xml version="1.0" standalone="yes"?>
<Paper uid="J80-2002">
  <Title>A Parsing Algorithm That Extends Phrases</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> To analyze a sentence of a natural language, a computer program recognizes the phrases within the sentence, builds data structures, such as conceptual representations, for each of them and combines those structures into one that corresponds to the entire sentence. The algorithm which recognizes the phrases and invokes the structure-building procedures is the parsing algorithm implemented by the program. The parsing algorithm is combined with a set of procedures for deciding between alternative actions and for building the data structures. Since it is organized around phrases, it is primarily concerned with syntax, while the procedures it calls deal with non-syntactic parts of the analysis. When the program is run, there may be a complex interplay between the code segments that handle syntax and those that handle semantics and pragmatics, but the program organization can still be abstracted into a (syntactic) parsing algorithm and a set of procedures that are called to augment that algorithm.</Paragraph>
    <Paragraph position="1"> By taking the view that the parsing algorithm recognizes the phrases in a sentence, that is, the components of its surface structure and how they can be decomposed, it suffices to specify the syntax of a natural language, at least approximately, with a context-free phrase structure grammar, the rules of which serve as phrase decomposition rules. Although linguists have developed more elaborate grammars for this purpose, most computer programs for sentence analysis, e.g., Heidorn (1972), Winograd (1972) and Woods (1970), specify the syntax with such a grammar, or something equivalent, and then augment that grammar with procedures and data structures to handle the non-context-free components of the language. The notion of parsing algorithm is therefore restricted in this paper to an algorithm that recognizes phrases in accordance with a context-free phrase structure grammar.</Paragraph>
    <Paragraph position="2"> Since the parsing algorithm of a sentence analysis program determines when data structures get combined, it seems reasonable to expect that the actions of the parser should reflect the actions on the data structures. In particular, the combination of phrases into larger phrases can be expected to coincide with the combination of corresponding data structures into larger data structures. This happens naturally when the computer program is such that it calls the procedures for combining data structures at the same time the parsing algorithm indicates that the corresponding phrases should be combined.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.1 Other Parsing Algorithms
</SectionTitle>
      <Paragraph position="0"> According to one classification of parsing algorithms (Aho and Ullman 1972), most analysis programs are based on algorithms that are either &quot;top-down&quot;, &quot;bottom-up&quot; or &quot;left-corner&quot;, though according to a recent study by Grishman (1975), the top-down and bottom-up approaches are dominant.</Paragraph>
      <Paragraph position="1"> The principle of top-down parsing is that the rules of the controlling grammar are used to generate a sentence that matches the one being analyzed. A serious problem with this approach, if computed phrases are supposed to correspond to natural phrases in a sentence, is that the parser cannot handle left-branching phrases. But such phrases occur in English, Japanese, Turkish and other natural languages (Chomsky 1965, Kimball 1973, Lyons 1970). Copyright 1980 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the Journal reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission.</Paragraph>
      <Paragraph position="2"> The principle of bottom-up parsing is that a sequence of phrases whose types match the right-hand side of a grammar rule is reduced to a phrase of the type on the left-hand side of the rule. None of the matching is done until all the phrases are present; this can be ensured by matching the phrase types in the right-to-left order shown in the grammar rule.</Paragraph>
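      <Paragraph> The reduction step described above can be sketched as a small shift-reduce recognizer. This is only an illustration, not code from the paper: the toy grammar, the names RULES, matches_suffix and shift_reduce, and the greedy strategy with no backtracking are all assumptions made for the example.

```python
# Toy grammar (an assumption for illustration, not taken from the paper).
# Each rule maps a right-hand side (tuple of phrase types) to its left-hand side.
RULES = {
    ("Det", "N"): "NP",
    ("V", "NP"): "VP",
    ("NP", "VP"): "S",
}

def matches_suffix(stack, rhs):
    """Compare rhs against the top of the stack, right-most symbol first."""
    if len(rhs) > len(stack):
        return False
    for offset in range(1, len(rhs) + 1):      # right to left
        if stack[-offset] != rhs[-offset]:
            return False
    return True

def shift_reduce(tags):
    """Greedy bottom-up recognizer: shift a tag, then reduce while a rule applies."""
    stack = []
    trace = []                                  # records each (rhs, lhs) reduction
    for tag in tags:
        stack.append(tag)                       # shift
        reduced = True
        while reduced:
            reduced = False
            for rhs, lhs in RULES.items():
                if matches_suffix(stack, rhs):
                    del stack[len(stack) - len(rhs):]
                    stack.append(lhs)           # reduce rhs to lhs
                    trace.append((rhs, lhs))
                    reduced = True
                    break
    return stack, trace

# Part-of-speech tags for a sentence such as "the dog chased the cat".
stack, trace = shift_reduce(["Det", "N", "V", "Det", "N"])
```

In this run the subject NP is reduced early but then waits on the stack, untouched, until the whole VP has been built; only the final reduction combines the two.</Paragraph>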
      <Paragraph position="3"> The difficulty with this approach is that the analysis of the first part of a sentence has no influence on the analysis of later parts until the results of the analyses are finally combined. Efforts to overcome this difficulty lead naturally to the third approach, left-corner parsing.</Paragraph>
      <Paragraph position="4"> Left-corner parsing, like bottom-up parsing, reduces phrases whose types match the right-hand side of a grammar rule to a phrase of the type on the left-hand side of the rule; the difference is that the types listed in the right-hand side of the rule are matched from left to right for left-corner parsing instead of from right to left. This technique gets its name from the fact that the first phrase found corresponds to the left-most symbol of the right-hand side of the grammar rule, and this symbol has been called the left corner of the rule. (When a grammar rule is drawn graphically with its left-hand side as the parent node and the symbols of the right-hand side as the daughters, forming a triangle, the left-most symbol is the left corner of the triangle.) Once the left-corner phrase has been found, the grammar rule can be used to predict what kind of phrase will come next. This is how the analysis of the first part of a sentence can influence the analysis of later parts.</Paragraph>
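      <Paragraph> The prediction step of left-corner parsing can be illustrated with a short sketch. It shows only how a completed left-corner phrase selects rules and yields the symbols expected next; it is not a full parser. The toy grammar and the names GRAMMAR and predict_from_left_corner are assumptions introduced for the example.

```python
# Toy grammar (an assumption for illustration): (left-hand side, right-hand side) pairs.
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
]

def predict_from_left_corner(found):
    """Return (lhs, remaining symbols) for every rule whose left corner is `found`."""
    predictions = []
    for lhs, rhs in GRAMMAR:
        if rhs[0] == found:                     # `found` is the rule's left corner
            predictions.append((lhs, rhs[1:]))  # symbols still expected, left to right
    return predictions

# After a determiner has been recognized, an N is predicted to complete an NP.
after_det = predict_from_left_corner("Det")
# After an NP has been recognized, a VP is predicted to complete an S.
after_np = predict_from_left_corner("NP")
```

With this grammar, after_det is [("NP", ("N",))] and after_np is [("S", ("VP",))], so the phrase already found constrains what the parser looks for in the rest of the sentence.</Paragraph>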
      <Paragraph position="5"> Most programs based on augmented transition networks employ a top-down parser to which registers and structure building routines have been added, e.g., Kaplan (1972) and Wanner and Maratsos (1978). It is important to note, however, that the concept of augmented transition networks is a particular way to represent linguistic knowledge; it does not require that the program using the networks operate in top-down fashion. In an early paper by Woods (1970), alternative algorithms that can be used with augmented transition networks are discussed, including the bottom-up and Earley algorithms. The procedure-based programs of Winograd (1972) and Novak (1976) are basically top-down parsers, too. The NLP program of Heidorn (1972) employs an augmented phrase structure grammar to combine phrases in a basically bottom-up fashion.</Paragraph>
      <Paragraph position="6"> Likewise, PARRY, the program written by Colby (Parkison, Colby and Faught 1977), uses a kind of bottom-up method to drive a computer model of a paranoid.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.2 A New Parsing Algorithm
</SectionTitle>
      <Paragraph position="0"> This paper presents a parsing algorithm that allows data structures to be combined as soon as possible. The algorithm permits a structure A to be combined with a structure B to form a structure C, and then to enlarge B to form a new structure B'. This new structure is to be formed in such a way that C is now composed of A and B' instead of A and B. The algorithm permits these actions on data structures because it permits similar actions on phrases, namely, phrases are combined with other phrases and afterward are extended to encompass more words in the sentence being analyzed. This behavior of combining phrases before all of their components have been found is called closure by Kimball (1973). It is desirable because it permits the corresponding data structures to be combined and to influence the construction of other data structures sooner than they otherwise could.</Paragraph>
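      <Paragraph> The effect on data structures described above (combine A and B into C, then enlarge B to B' so that C is composed of A and B') can be pictured with a minimal tree sketch. This is not the algorithm presented in the paper, only an illustration of the end result. The Phrase class, the extend function, and the choice to build B' by wrapping B together with the new material are all assumptions.

```python
class Phrase:
    """A phrase node: a type label plus child phrases or words."""

    def __init__(self, label, children):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            if isinstance(child, Phrase):
                child.parent = self

    def words(self):
        """Flatten the phrase back into its words, left to right."""
        out = []
        for child in self.children:
            out.extend(child.words() if isinstance(child, Phrase) else [child])
        return out

def extend(b, extra):
    """Build B' from B plus `extra` and put B' where B was in B's parent."""
    parent = b.parent                           # remember C before B is re-parented
    b_prime = Phrase(b.label, [b, extra])       # B' wraps B and the new material
    if parent is not None:
        index = next(i for i, c in enumerate(parent.children) if c is b)
        parent.children[index] = b_prime        # C is now composed of A and B'
        b_prime.parent = parent
    return b_prime

a = Phrase("V", ["saw"])                        # structure A
b = Phrase("NP", ["the", "dog"])                # structure B
c = Phrase("VP", [a, b])                        # C combines A and B early
pp = Phrase("PP", ["in", "the", "park"])        # words found later in the sentence
b_prime = extend(b, pp)                         # B is extended to B'
```

After the call, c still has a as its first component, but its second component is b_prime, and c.words() covers "saw the dog in the park" without C having been rebuilt.</Paragraph>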
      <Paragraph position="1"> In the next section of this paper the desired behavior for combining phrases is discussed in more detail to show the two kinds of actions that are required. Then the algorithm itself is explained and its operation illustrated by examples, along with some details of an experimental implementation. Finally, this algorithm is compared to those used in the sentence analysis programs of Marcus and Riesbeck, and some concluding remarks are made.</Paragraph>
    </Section>
  </Section>
</Paper>