<?xml version="1.0" standalone="yes"?>
<Paper uid="J81-2002">
  <Title>Relaxation Techniques for Parsing Grammatically Ill-Formed Input in Natural Language Understanding Systems</Title>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
(PAT ((PERSON TRANS-ACT OBJ)) T ... )
</SectionTitle>
    <Paragraph position="0"> It could further be modified to permit optional constituents, as in</Paragraph>
    <Paragraph position="2"> Pattern matching proceeds by matching each arc in the pattern against the input string, but is affected by the chosen &amp;quot;mode&amp;quot; of matching. Since the individual component arcs are, in a sense, complex patterns, the ATN interpreter can be considered part of the matching algorithm as well. In arcs within patterns, explicit transfer to a new state is ignored and the next arc attempted on success is the one following in the pattern. As in the example above, an arc in a pattern prefaced by &amp;quot;&gt;&amp;quot; can be considered optional, if the OPTIONAL mode has been selected to activate this feature. When this is done, the matching algorithm still attempts to match optional arcs, but may ignore them. In the example of a pattern arc using the optional feature above, the pattern component would be able to match any of the fragments *Mary the car, *Drove the car, and *The car, as well as the original sentence. We will return to this example in our discussion on conjunction. [American Journal of Computational Linguistics, Volume 7, Number 2, April-June 1981, p. 103. Stan C. Kwasny and Norman K. Sondheimer, Relaxation Techniques for Parsing Ill-Formed Input.]</Paragraph>
    <Paragraph position="3"> The manipulation of registers within optional arcs poses some interesting problems. With the optionality feature, not every arc will be executed in a pattern.</Paragraph>
    <Paragraph position="4"> To complicate matters, registers previously set may be used within a test or they may hold elements of the final structure. Tests involving undefined registers are defaulted to allow traversal of these arcs, while structure-building actions involving undefined registers mark the missing components as undefined.</Paragraph>
    <Paragraph position="5"> In addition, there are two other modes of matching.</Paragraph>
    <Paragraph position="6"> A pattern unanchoring capability is activated by specifying the mode UNANCHOR. In this mode, patterns are permitted to skip words prior to matching. Specifying the SKIP mode results in words being ignored between matches of the arcs within a pattern. This is a generalization of the UNANCHOR mode.</Paragraph>
    <Paragraph position="7"> As with all forms of relaxation, pattern matching results in deviance notes. For patterns, these notes contain information necessary to determine how matching succeeded. Sufficient information is contained in the deviance notes to recover skipped words and skipped arcs, if desired.</Paragraph>
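    <Paragraph> The matching modes described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: arcs are reduced to functions that return the next input position or None, the matcher is greedy and does no backtracking, and the names word_arc, obj_arc, match_pattern, and the toy vocabulary are assumptions made for illustration. Which arcs carry the optional marker is also assumed, since the original display is not reproduced in this text. Deviance notes are modeled as a record of skipped arcs and skipped words.</Paragraph>
    <Paragraph><![CDATA[
```python
def word_arc(words):
    """Build a toy arc that accepts one token drawn from `words`."""
    def arc(tokens, pos):
        if pos < len(tokens) and tokens[pos].lower() in words:
            return pos + 1
        return None
    return arc


def obj_arc(tokens, pos):
    """Toy OBJ arc: a determiner followed by a noun."""
    if (pos + 1 < len(tokens)
            and tokens[pos].lower() in {"the", "a"}
            and tokens[pos + 1].lower() in {"car", "truck"}):
        return pos + 2
    return None


# (name, arc, optional) triples; the optional flags are an assumption.
PATTERN = [
    ("PERSON", word_arc({"mary", "john"}), True),
    ("TRANS-ACT", word_arc({"drove"}), True),
    ("OBJ", obj_arc, False),
]


def match_pattern(pattern, tokens, modes=frozenset()):
    """Greedy matcher; returns (end_position, deviance_notes) or None."""
    pos = 0
    anchored = False  # becomes True once any arc has matched
    notes = {"skipped_arcs": [], "skipped_words": []}
    for name, arc, optional in pattern:
        end = arc(tokens, pos)
        # UNANCHOR allows skipping words only before the first match;
        # SKIP allows it before every arc.
        may_skip = "SKIP" in modes or ("UNANCHOR" in modes and not anchored)
        if end is None and may_skip:
            for start in range(pos + 1, len(tokens)):
                end = arc(tokens, start)
                if end is not None:
                    notes["skipped_words"].extend(tokens[pos:start])
                    break
        if end is None:
            if optional and "OPTIONAL" in modes:
                notes["skipped_arcs"].append(name)  # attempted, then ignored
                continue
            return None
        pos = end
        anchored = True
    return pos, notes
```
]]></Paragraph>
    <Paragraph> Under these assumptions, calling match_pattern(PATTERN, ['Mary', 'the', 'car'], {'OPTIONAL'}) succeeds and notes TRANS-ACT as a skipped arc, whereas the same input with no mode selected is rejected.</Paragraph>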
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.4 Patterns, Ellipsis, and Extraneous Forms
</SectionTitle>
      <Paragraph position="0"> The pattern arc is the primary mechanism for handling ellipsis and extraneous forms. A pattern arc can be seen as capturing a single path through a network.</Paragraph>
      <Paragraph position="1"> The matcher gives some freedom in how that path relates to a string. The appropriate parsing path through a network relates to an elliptical sentence or one with extra words in the same way. With contextual ellipsis, the relationship is that some of the arcs on the correct path are not satisfied. In pattern arcs, these will be represented by arcs marked as optional. Dialogue context will provide the defaults for the missing components. With pattern arcs, the deviance notes will show what was left out and the other components in the NLU system will be responsible for supplying the values.</Paragraph>
      <Paragraph position="2"> As an example, consider the following question and its possible elliptical responses: Who drove a car? *Mary.</Paragraph>
      <Paragraph position="3"> *Mary did.</Paragraph>
      <Paragraph position="4"> *Mary her Ford.</Paragraph>
      <Paragraph position="5"> The form of the basic expected response could be shown by the following pattern, where AUX refers to an arc that identifies auxiliary verbs:</Paragraph>
      <Paragraph position="7"> This would serve to identify the three elliptical responses as well as others. It would also accept some unlikely inputs such as: *Mary could a car.</Paragraph>
      <Paragraph position="8"> *Mary the Empire State Building.</Paragraph>
      <Paragraph position="9"> These would have to be filtered by later processing by other components through the use of the deviance notes. A pattern for the similar question: Mary drove what? could be as follows:</Paragraph>
      <Paragraph position="11"> The source of patterns for contextual ellipsis is important. In the LIFER system discussed in Hendrix, Sacerdoti, Sagalowicz, and Slocum (1978), the previous user input can be seen as a pattern for elliptical processing of the current input. The automatic pattern generator introduced in the next section, along with an expectation mechanism, will capture this level of processing. But, with the ability to construct arbitrary patterns and to add them to the grammar from other components of the NLU system, our approach can accomplish much more. For example, a question generation routine could add an expectation of a yes/no answer in front of a transformed rephrasing of a question, as in Did Mary drive a car? *Yes, she did.</Paragraph>
      <Paragraph position="12"> *No, she did not.</Paragraph>
      <Paragraph position="13"> The appropriate patterns might be the following, where YES, NO, and NOT refer to arcs that accept these words:</Paragraph>
      <Paragraph position="15"> Patterns for telegraphic ellipsis have to be added to the grammar manually. One typical usage would be in accepting terse questions where the actions default to a standard operation in the system. For example, in a database situation, a reference to a set of objects could be understood as a command to retrieve them, as with the &amp;quot;profit margin&amp;quot; query given earlier. A pattern here might be the following, where IMP-ACT is an arc that identifies the typical action, and SET identifies a set description:</Paragraph>
      <Paragraph position="17"> Generally, patterns of usage must be identified, say in a study like that of Malhotra (1975), so that appropriate patterns can be constructed. Patterns for extraneous forms must also be added in advance. These would use either the UNANCHOR option in order to skip false starts, or use dynamically-produced patterns to catch repetitions for emphasis. In general, only a limited number of these patterns should be required.</Paragraph>
      <Paragraph position="18"> The value of the pattern mechanism here, especially in the case of telegraphic ellipsis, is in connecting the ungrammatical forms to the grammatical ones.</Paragraph>
      <Paragraph position="19"> The problem of extraneous words can be attacked by two of the &lt;mode&gt; settings in pattern arcs. The UNANCHOR mode applies to restarts, as in the earlier example, through the use of a simple pattern arc:</Paragraph>
      <Paragraph position="20"> As the expectation-based parsing efforts clearly show, syntactic elements, especially words, contain important clues on processing (Riesbeck and Schank, 1976). Indeed, we also have found it useful to make the ATN mechanism more &amp;quot;active&amp;quot; by allowing it to produce new arcs based on such clues. To achieve this, the CAT, MEM, TST, and WRD arcs have been generalized and four new &amp;quot;macro&amp;quot; arcs, known as CAT*, MEM*, TST*, and WRD*, have been added to the ATN formalism. These are similar in every way to their counterparts, except that as a final action, instead of indicating the state to which the traversal leads, a new arc is constructed dynamically and immediately executed. The difference in the form that the new arc takes is seen in the following comparison:</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
(PAT (((PUSH S/ ... )) UNANCHOR) ... )
</SectionTitle>
    <Paragraph position="0"> where the S/ network is used for parsing sentences.</Paragraph>
    <Paragraph position="1"> Extraneous words in the interior of sentences can be accommodated by the SKIP mode. These methods are similar to those described by Hayes and Mouradian (1980).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.5 Dynamic Generation of Patterns
</SectionTitle>
      <Paragraph position="0"> Patterns need to be generated dynamically in order to make use of the patterns of language occurring in the sentence. In this way, previous processing can be made to contribute more directly to later processing.</Paragraph>
      <Paragraph position="1"> An automatic pattern generation mechanism has been implemented using the trace of the current execution path to produce a pattern. This is invoked by using the symbol &amp;quot;&gt;&amp;quot; in place of the pattern name. Patterns produced in this fashion contain only those arcs traversed at the current level of recursion in the network, although this could easily be generalized in a way which would permit PUSH arcs to be automatically replaced by their subnetwork paths. Each arc in an automatic pattern of this type is marked as optional; however, additional control of the optionality feature of the pattern matcher can be exercised through judicious use of the test component of the arc. Patterns can also be constructed dynamically in precisely the same way grammatical structures are built by using BUILDQ and other structure-building functions.</Paragraph>
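      <Paragraph> The automatic generation step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the trace is assumed to be a list of records giving the recursion level and the arc name, and TraceEntry, PatternArc, pattern_from_trace, and the DET and NOUN arcs are hypothetical names. The sketch keeps only arcs traversed at the current level and marks each one optional.</Paragraph>
      <Paragraph><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    level: int  # recursion depth at which the arc was traversed
    arc: str    # name of the traversed arc


@dataclass(frozen=True)
class PatternArc:
    arc: str
    optional: bool


def pattern_from_trace(trace, current_level):
    """Keep arcs from the current recursion level, all marked optional."""
    return [
        PatternArc(entry.arc, optional=True)
        for entry in trace
        if entry.level == current_level
    ]


# Hypothetical trace: OBJ is a PUSH whose subnetwork used DET and NOUN.
TRACE = [
    TraceEntry(0, "PERSON"),
    TraceEntry(0, "TRANS-ACT"),
    TraceEntry(0, "OBJ"),
    TraceEntry(1, "DET"),
    TraceEntry(1, "NOUN"),
]

TOP_LEVEL_PATTERN = pattern_from_trace(TRACE, current_level=0)
```
]]></Paragraph>
      <Paragraph> Here TOP_LEVEL_PATTERN holds PERSON, TRANS-ACT, and OBJ, each optional. The subnetwork arcs are left out, which corresponds to the unexpanded PUSH arcs mentioned above.</Paragraph>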
      <Paragraph position="2"> Pattern arcs enter the grammar in two ways. They are manually written into the grammar in those cases where the ungrammaticalities are common and they are added to the grammar automatically in those cases where the ungrammaticality is dependent on context.</Paragraph>
      <Paragraph position="3"> Those that are produced dynamically enter the grammar through one of two devices. They may be constructed as needed by special macro arcs or they may be constructed for future use through an expectation mechanism. The standard and macro forms of the CAT arc are (CAT &lt;cat&gt; &lt;test&gt; &lt;act&gt;* &lt;term&gt;) and (CAT* &lt;cat&gt; &lt;test&gt; &lt;act&gt;* &lt;creat act&gt;). In this example, &lt;creat act&gt; is used to define the dynamic arc through the use of BUILDQ and similar structure-building functions, while &lt;term&gt; represents any terminal action (usually a TO action). Arcs computed by macro arcs can be of any type permitted by the ATN, but one of the most useful arcs to compute in this manner is the pattern arc discussed above. This allows a prescribed path of arcs to be attempted in processing the sentence.</Paragraph>
      <Paragraph position="4"> An example of much of what was just described is the following macro arc:</Paragraph>
      <Paragraph position="6"> This WRD* macro arc attempts to find the word AND at the current position of processing the input. If successful, it computes a pattern arc with the pattern left to be automatically generated upon execution.</Paragraph>
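      <Paragraph> The behavior of such a macro arc can be sketched in Python. The sketch is hedged throughout: wrd_star, pattern_arc_from, parse_conjoined, and the toy arcs are hypothetical names, arcs are plain functions returning the next input position or None, and the top-level path PERSON, TRANS-ACT, OBJ is simply assumed. On finding the word, the macro arc builds a new arc from the trace and executes it at once, treating every arc in the generated pattern as optional.</Paragraph>
      <Paragraph><![CDATA[
```python
def word_arc(words):
    """Build a toy arc that accepts one token drawn from `words`."""
    def arc(tokens, pos):
        if pos < len(tokens) and tokens[pos].lower() in words:
            return pos + 1
        return None
    return arc


def obj_arc(tokens, pos):
    """Toy OBJ arc: 'the' followed by a noun."""
    if (pos + 1 < len(tokens)
            and tokens[pos].lower() == "the"
            and tokens[pos + 1].lower() in {"car", "truck"}):
        return pos + 2
    return None


def pattern_arc_from(trace):
    """Create a pattern arc from the arcs already traversed (all optional)."""
    def pattern_arc(tokens, pos):
        matched = False
        for arc in trace:
            end = arc(tokens, pos)
            if end is not None:  # an arc that fails is simply skipped
                pos, matched = end, True
        return pos if matched else None  # require at least one match
    return pattern_arc


def wrd_star(word, create_arc):
    """WRD*-style macro arc: test for `word`, then build and run a new arc."""
    def macro_arc(tokens, pos, trace):
        if pos >= len(tokens) or tokens[pos].lower() != word:
            return None
        new_arc = create_arc(trace)      # constructed dynamically
        return new_arc(tokens, pos + 1)  # and executed immediately
    return macro_arc


PERSON = word_arc({"mary", "john"})
TRANS_ACT = word_arc({"drove"})
AND_MACRO = wrd_star("and", pattern_arc_from)


def parse_conjoined(tokens):
    """Follow the assumed top-level path, then try the macro arc."""
    pos, trace = 0, []
    for arc in (PERSON, TRANS_ACT, obj_arc):
        pos = arc(tokens, pos)
        if pos is None:
            return None
        trace.append(arc)
    return AND_MACRO(tokens, pos, trace)
```
]]></Paragraph>
      <Paragraph> With these toy arcs, parse_conjoined consumes all nine words of Mary drove the car and John drove the truck, and it also accepts the reduced form Mary drove the car and the truck by skipping the PERSON and TRANS-ACT arcs of the generated pattern.</Paragraph>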
      <Paragraph position="7"> While the macro arc forces immediate execution of an arc, arcs may also be computed and temporarily added to the grammar for later execution through the expectation mechanism. Expectations are performed as actions within arcs (analogous to the HOLD action for parsing structures) or as actions elsewhere in the NLU system, e.g., during generation, when particular types of responses can be foreseen. Two forms are allowed: (EXPECT &lt;creat act&gt; &lt;state&gt;) and (EXPECT &lt;creat act&gt;). In the first case, the arc created is bound to a state as specified. When later processing leads to that state, the expected arc will be attempted as one alternative at that state. In the second case, where no state is specified, the effect is to attempt the arc at every state visited during the parse.</Paragraph>
      <Paragraph position="8"> The range of an expectation produced during parsing is ordinarily limited to a single sentence, with the arc disappearing after it has been used; however, the start state, S/, is reserved for expectations intended to be active at the beginning of the next sentence. These will disappear in turn at the end of processing for that sentence.</Paragraph>
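      <Paragraph> The scoping rules for expectations can be summarized in a small Python sketch. It is an interpretation of the description above rather than the authors' implementation: the Expectations class and its method names are hypothetical, arcs are represented by plain strings, and the state name VP/ and the arc names are invented for the example.</Paragraph>
      <Paragraph><![CDATA[
```python
class Expectations:
    """Toy registry of expected arcs with sentence-level lifetimes."""

    START_STATE = "S/"

    def __init__(self):
        self.by_state = {}       # state -> arcs expected at that state
        self.anywhere = []       # arcs to attempt at every state
        self.next_sentence = []  # arcs held for the start of the next sentence

    def expect(self, arc, state=None):
        """Model (EXPECT <creat act> <state>) and (EXPECT <creat act>)."""
        if state is None:
            self.anywhere.append(arc)
        elif state == self.START_STATE:
            self.next_sentence.append(arc)  # becomes active next sentence
        else:
            self.by_state.setdefault(state, []).append(arc)

    def arcs_at(self, state):
        """Expected arcs to attempt as alternatives at `state`."""
        return self.by_state.get(state, []) + self.anywhere

    def used(self, arc):
        """An expected arc disappears once it has been used."""
        for arcs in list(self.by_state.values()) + [self.anywhere]:
            if arc in arcs:
                arcs.remove(arc)

    def end_sentence(self):
        """Drop this sentence's expectations; activate those held for S/."""
        self.by_state = {self.START_STATE: self.next_sentence}
        self.anywhere = []
        self.next_sentence = []
```
]]></Paragraph>
      <Paragraph> In this sketch an arc expected at S/ during one sentence is first offered by arcs_at at the start of the following sentence and is discarded by the next call to end_sentence, which follows the lifetime described above.</Paragraph>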
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.6 Conjunction and Macro Arcs
</SectionTitle>
      <Paragraph position="0"> Besides their utility in handling cases of ellipsis, pattern arcs are also the primary mechanism for handling conjunction. As mentioned earlier, the rationale for this is the connection between conjunction and ellipsis. Whenever a conjunction is seen, a pattern is developed from the already identified elements and matched against the remaining segments of input.</Paragraph>
      <Paragraph position="1"> All of the forms of conjunction described above are treated through a globally defined set of &amp;quot;conjunction arcs&amp;quot;, which are active at every state of the grammar. Some restricted cases, such as &amp;quot;and&amp;quot; following &amp;quot;between&amp;quot;, may have the conjunction built into the grammar. These forms are not subject to the conjunction arcs. In general, the conjunction arcs are made up of macro arcs which compute pattern arcs. The automatic pattern mechanism is heavily used.</Paragraph>
      <Paragraph position="2"> Returning to the earlier example of pattern arcs, it can be shown how these conjunction arcs work. Consider what takes place when the macro arc previously described is activated during the processing of the sentence: Mary drove the car and John drove the truck.</Paragraph>
      <Paragraph position="3"> Processing proceeds until the conjunction &amp;quot;and&amp;quot; is encountered. For the purpose of this example, it is assumed that the top level path of arcs followed during processing is simply the pattern discussed earlier:</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
(PERSON TRANS-ACT OBJ)
</SectionTitle>
    <Paragraph position="0"> The action of the macro arc is, therefore, to (essentially) produce the arc:</Paragraph>
  </Section>
</Paper>