<?xml version="1.0" standalone="yes"?>
<Paper uid="J92-4001">
  <Title>The Acquisition and Use of Context-Dependent Grammars for English</Title>
  <Section position="4" start_page="394" end_page="398" type="intro">
    <SectionTitle>
2 Described in Section 7.3.
</SectionTitle>
    <Paragraph position="0"/>
    <Paragraph position="2"> [Figure: An example of windowed context.] The rest of the context is invisible to the system; the next operation is decided solely by the windowed context. It can be observed that the last state in the analysis leaves the single symbol SNT, the designated root symbol, on the stack along with an empty input string, successfully completing the parse. This is the CDG form of rule used in the phrase structure analysis.</Paragraph>
    <Section position="1" start_page="395" end_page="396" type="sub_section">
      <SectionTitle>
2.2 Algorithm for the Shift/Reduce Parser
</SectionTitle>
      <Paragraph position="0"> The parser accepts a string of syntactic word classes as its input and forms a ten-symbol vector, five symbols each from the stack and the input string. It looks up this vector as the left half of a production in the grammar and interprets the right half of the production as an instruction to modify the stack and input sequences to construct the next state of the parse. To accomplish these tasks, it maintains two stacks, one for the input string and one for the syntactic constituents. These stacks may be arbitrarily large.</Paragraph>
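      The ten-symbol windowed context described above can be sketched as follows. The function names mirror Top_five and First_five from Figure 3; the padding direction and the symbol used for blanks are assumptions, not details from the paper.

```python
# Hypothetical sketch of the ten-symbol windowed context.
# The blank symbol "b" and the padding direction are assumptions.

BLANK = "b"

def top_five(stack):
    """Top five stack elements (top of stack last), blank-padded on the left."""
    window = stack[-5:]
    return [BLANK] * (5 - len(window)) + window

def first_five(tokens):
    """First five input symbols, blank-padded on the right."""
    window = tokens[:5]
    return window + [BLANK] * (5 - len(window))

def windowed_context(stack, tokens):
    # the ten-symbol vector the parser looks up as the left half of a rule
    return top_five(stack) + first_five(tokens)
```

With a one-element stack and a short input, `windowed_context(["np"], ["v", "det", "adj", "n"])` yields a ten-symbol vector padded with blanks on both ends of the window.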
      <Paragraph position="1"> An algorithm for the parser is described in Figure 3. The most important part of this algorithm is to find an applicable CDG rule from the grammar. Finding such a rule is based on the current windowed context. If there is a rule whose left side exactly matches the current windowed context, that rule will be applied. However, realistically, it is often the case that there is no exact match with any rule. Therefore, it is necessary to find a rule that best matches the current context.</Paragraph>
      <Paragraph position="2"> [Running header: Robert F. Simmons and Yeong-Ho Yu, Context-Dependent Grammars for English] CD-SR-Parser(Input, Cdg). Input is a string of syntactic classes for the given sentence.</Paragraph>
      <Paragraph position="3"> Cdg is the given CDG grammar rules.</Paragraph>
      <Paragraph position="5"> The functions Top_five and First_five return the lists of the top (or first) five elements of the Stack and the Input, respectively. If there are not enough elements, these procedures pad with blanks. The function Append concatenates two lists into one. Consult_CDG consults the given CDG rules to find the next operation to take; the details of this function are the subject of the next section. Push and Pop add or delete one element to/from a stack, while First and Second return the first or second element of a list, respectively. Rest returns the given list minus the first element.</Paragraph>
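      The control loop of Figure 3 can be sketched as below. The helper `window` stands in for the Top_five/First_five/Append combination, and `consult_cdg` stands in for Consult_CDG; the operation vocabulary (`("shift",)` and `("reduce", label)`) and the assumption that a reduce combines the top two stack symbols are mine, not the paper's.

```python
# A minimal sketch of the CD-SR-Parser loop of Figure 3.
# consult_cdg must return ("shift",), ("reduce", label), or None;
# this operation vocabulary is an assumption.

def window(stack, tokens, blank="b"):
    # ten-symbol vector: top five of the stack, then first five of the input
    padded_stack = [blank] * 5 + stack
    padded_input = tokens + [blank] * 5
    return tuple(padded_stack[-5:] + padded_input[:5])

def cd_sr_parser(input_classes, consult_cdg, root="SNT"):
    stack, tokens = [], list(input_classes)
    while True:
        op = consult_cdg(window(stack, tokens))
        if op is None:
            return False                  # no applicable rule: parse fails
        if op[0] == "shift":
            stack.append(tokens.pop(0))   # move next input symbol to the stack
        else:
            stack[-2:] = [op[1]]          # reduce: assumed to combine top two
        if stack == [root] and not tokens:
            return True                   # root alone, input empty: success
```

A parse succeeds exactly when the designated root symbol SNT stands alone on the stack with the input exhausted, matching the stopping condition described in Section 2.1.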
      <Paragraph position="6"> Figure 3 Context-sensitive shift/reduce parser.</Paragraph>
    </Section>
    <Section position="2" start_page="396" end_page="398" type="sub_section">
      <SectionTitle>
2.3 Consulting the CDG Rules
</SectionTitle>
      <Paragraph position="0"> There are two related issues in consulting the CDG rules. One is the computational representation of CDG rules, and the other is the method for selecting an applicable rule.</Paragraph>
      <Paragraph position="1"> In the traditional CFG paradigms, a CFG rule is applicable if the left-hand side of the rule exactly matches the top elements of the stack. However, in our CDG paradigm, a perfect match between the left side of a CDG rule and the current state cannot be assured, and in most cases, a partial match must suffice for the rule to be applied. Since many rules may partially match the current context, the best matching rule should be selected.</Paragraph>
      <Paragraph position="2"> One way to do this is to use a neural network. Through the back-propagation algorithm (Rumelhart, Hinton, and Williams 1986), a feed-forward network can be trained to memorize the CDG rules. After successful training, the network can be used to retrieve the best matching rule. However, this approach based on a neural network usually takes considerable training time. For instance, in our previous experiment (Simmons and Yu 1990), training a network for about 2,000 CDG rules took several days of computation. Therefore, this approach has an intrinsic problem for scaling up, at least on the present generation of neural net simulation software.</Paragraph>
      <Paragraph position="3"> Another method is based on a hash table in which every CDG rule is stored according to its top two elements of the stack--the fourth and fifth elements of the left half of the rule. Given the current windowed context, the top two elements of the stack are used to retrieve all the relevant rules from the hash table.</Paragraph>
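      The hash-table organization just described can be sketched as follows. The rule representation as a `(left_half, operation)` pair is an assumption; what the text fixes is the key, namely the fourth and fifth symbols of the left half (the top two stack elements).

```python
# Sketch of the hash-table method: file every CDG rule under the pair
# formed by the 4th and 5th symbols of its left half (the top two stack
# elements), then retrieve candidates by the same pair from the context.
from collections import defaultdict

def index_rules(rules):
    """rules: iterable of (left_half, operation) pairs; left_half has 10 symbols."""
    table = defaultdict(list)
    for left, op in rules:
        table[(left[3], left[4])].append((left, op))  # 4th and 5th symbols
    return table

def candidate_rules(table, context):
    # all rules whose indexed pair matches the current windowed context
    return table.get((context[3], context[4]), [])
```

Only the rules in the retrieved subgroup need to be scored, which is what keeps lookup manageable even with thousands of rules.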
      <Paragraph position="4"> [Running header: Computational Linguistics, Volume 18, Number 4] We use no more than 64 word and phrase class symbols, so there can be no more than 4,096 possible pairs. The effect is to divide the large number of rules into no more than 4,096 subgroups, each of which holds a manageable subset. In fact, with 16,275 rules we discovered that we have only 823 pairs, and the average number of rules per subgroup is 19.8; however, for frequently occurring pairs the number of rules in the subgroups can be much larger. The problem is to determine what scoring formula should be used to find the rule that best matches a parsing context.</Paragraph>
      <Paragraph position="5"> Sejnowski and Rosenberg (1988) analyzed the weight matrix that resulted from training NETtalk and discovered a triangular function with the apex centered at the character in the window and the weights falling off in proportion to distance from that character. We decided that the best matching rule in our system would follow a similar pattern, with maximum weights for the top two elements on the stack and weights decreasing in both directions with distance from those positions. The scoring function we use is developed as follows: Let R be the set of vectors {R1, R2, ..., Rn}, where Ri is the vector [r1, r2, ..., r10]. Let C be the vector [c1, c2, ..., c10]. Let μ(ci, ri) be a matching function whose value is 1 if ci = ri, and 0 otherwise. R is the entire set of rules, Ri is (the left half of) a particular rule, and C is the parse context.</Paragraph>
      <Paragraph position="6"> Then R' is the subset of R such that if Ri is in R', then μ(ri4, c4) * μ(ri5, c5) = 1. Accessing the hash table with the top two elements of the stack, c4 and c5, produces the set R'.</Paragraph>
      <Paragraph position="7"> We can now define the scoring function for each Ri in R'.</Paragraph>
      <Paragraph position="9"> The first summation scores the matches between the stack elements of the rule and the current context, and the second summation scores the matches between the elements of the input string. If an item of the rule matches the corresponding item of the context, the total score is increased by the weight assigned to that position. The maximum score for a perfect match is 21 according to the above formula.</Paragraph>
      <Paragraph position="10"> From several experiments varying the length of the vector and the weights, particularly those assigned to blanks, we determined that this formula gave the best performance among those tested. More importantly, it has worked well in the current phrase structure and case analysis experiments.</Paragraph>
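      The scoring formula itself did not survive the conversion of this article, so the sketch below uses assumed position weights: 1 through 5 over the five stack positions (apex at the top two) and 3, 2, 1, 0, 0 over the input positions. These particular values are only an assumption chosen to be triangular around the top of the stack and to total 21 for a perfect match, as the text states; the paper's actual weights may differ.

```python
# Assumed position weights: apex at the top-of-stack positions (4-5 of the
# ten-symbol vector), falling off with distance, summing to 21.
# These specific values are an assumption, not the paper's.
WEIGHTS = [1, 2, 3, 4, 5, 3, 2, 1, 0, 0]

def score(rule_left, context):
    """Weighted count of positions where the rule's left half matches the context."""
    return sum(w for w, r, c in zip(WEIGHTS, rule_left, context) if r == c)

def best_rule(candidates, context):
    """Pick the highest-scoring rule among the hash-retrieved candidates."""
    # candidates: (left_half, operation) pairs sharing the context's top-two key
    return max(candidates, key=lambda rule: score(rule[0], context))
```

Because every candidate already matches on the two hashed positions, the competition is decided by the remaining eight positions, weighted by distance from the top of the stack.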
      <Paragraph position="11"> It was a surprise to us that, using context-sensitive productions, an elementary deterministic parsing algorithm proved adequate to provide 99% correct, unambiguous analyses for the entire text studied.</Paragraph>
      <Paragraph position="12"> 3. Grammar Acquisition for CDG Constructing an augmented phrase structure grammar of whatever type -- unification, GPSG, or ATN -- is a painful process usually involving a well-trained linguistic team of several people. These types of grammar require that a CFG recognition rule such as np vp -> snt be supported by such additional information as the fact that the np and vp agree in number, that the np is characterized by particular features such as count, animate, etc., and that the vp can or cannot accept certain types of complements. The additional features make the rules exceedingly complex and difficult to prepare and debug. College students can easily be taught to make a phrase structure tree to represent a sentence, but it requires considerable linguistic training to deal successfully with a feature grammar.</Paragraph>
      <Paragraph position="13"> We have seen in the preceding section that a CDG is derived from recording the successive states of the parses of sentences. Thus it was natural for us to develop an interactive acquisition system that would assist a linguist (or a student) in constructing such parses to produce large sets of example CDG rules easily. The system continued to evolve as a consequence of our use until we had included capabilities to:
* read in text and data files
* compile dictionary and grammar tables from completed text files
* select a sentence to continue processing or revise
* look up words in a dictionary to suggest the syntactic class for the word in context when assigning syntactic classes to the words in a sentence
* compare each state of the parse with rules in the current grammar to predict the shift/reduce operation. A carriage return signals that the user accepts the prompt, or typing in the desired operation overrides it.</Paragraph>
      <Paragraph position="14"> * compute and display the parse tree from the local grammar after completion of each sentence, or from the global total grammar at any time
* provide backing up and editing capability to correct errors
* print help messages and guide the user
* compile dictionary and grammar entries at the completion of each sentence, ensuring no duplicate entries
* save completed or partially completed grammar files.</Paragraph>
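      The core of the acquisition loop described in the capabilities above can be sketched as follows. Everything here is a hypothetical reconstruction: the grammar is modeled as a plain mapping from windowed contexts to operations, and `confirm` stands in for the prompt-and-override interaction (a carriage return accepting the system's guess, or the user typing the desired operation).

```python
# Hypothetical sketch of the acquisition loop: at each parse state the
# current grammar's guess is offered as a prompt, the linguist accepts or
# overrides it, and the confirmed (context, operation) pair is recorded
# with duplicates suppressed.

def window(stack, tokens, blank="b"):
    # ten-symbol vector: top five of the stack, then first five of the input
    return tuple(([blank] * 5 + stack)[-5:] + (tokens + [blank] * 5)[:5])

def acquire(sentence_classes, grammar, confirm, root="SNT"):
    """grammar: dict context -> operation; confirm(context, guess) -> operation."""
    stack, tokens = [], list(sentence_classes)
    while not (stack == [root] and not tokens):
        context = window(stack, tokens)
        guess = grammar.get(context)       # system's prompt from current grammar
        op = confirm(context, guess)       # user accepts or overrides the prompt
        grammar.setdefault(context, op)    # record the rule; no duplicate entries
        if op[0] == "shift":
            stack.append(tokens.pop(0))
        else:
            stack[-2:] = [op[1]]           # reduce assumed to combine top two
    return grammar
```

As the grammar grows, `grammar.get(context)` hits more often, so the prompts become more accurate and the user's role shifts toward supervision, matching the acceleration the text describes.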
      <Paragraph position="15">  The resulting tool, GRAMAQ, enables a linguist to construct a context-sensitive grammar for a text corpus at the rate of several sentences per hour. Thousands of rules are accumulated with only weeks of effort in contrast to the years required for a comparable system of augmented CFG rules. About ten weeks of effort were required to produce the 16,275 rules on which this study is based. Since GRAMAQ's prompts become more accurate as the dictionary and grammar grow in size, there is a positive acceleration in the speed of grammar accumulation and the linguist's task gradually converges to one of alert supervision of the system's prompts.</Paragraph>
      <Paragraph position="16"> A slightly different version of GRAMAQ is Caseaq, which uses operations that create case constituents to accumulate a context-sensitive grammar that transforms</Paragraph>
    </Section>
  </Section>
</Paper>