<?xml version="1.0" standalone="yes"?>
<Paper uid="J83-3003">
  <Title>Meta-rules as a Basis for Processing Ill-Formed Input</Title>
  <Section position="4" start_page="0" end_page="0" type="intro">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> 3 The purpose of these meta-rules is therefore quite distinct from that of Gawron (1982).</Paragraph>
    <Paragraph position="1"> formed structures through the modification of the violated normal rules should be employed. These meta-rules correspond to types of errors. 3 The rest of the paper argues for this rule-based approach. Section 2 characterizes both the types of ill-formed input, and the types of possible approaches to them, including our proposal. Section 3 gives examples of meta-rules for processing ill-formed input. Section 4 describes how some heuristics developed by others fit within our paradigm. An implementation is sketched in Section 5. Section 6 discusses limitations of the proposal. Sections 7 and 8 present directions for future work and conclusions.</Paragraph>
    <Paragraph position="2"> 2. Approaches to Ill-formedness This section introduces the problem of interpreting ill-formed input. First, we briefly discuss the types of ill-formed input. Then we consider the range of approaches that have been tried for allowing for such input.</Paragraph>
    <Paragraph position="3"> Ill-formedness phenomena can be divided into two sets. The first defines what we call absolute ill-formedness. An utterance is absolutely ill-formed if the typical listener considers it ill-formed. The definition unfortunately appeals to subjective evaluations; these are known to differ widely (Ross 1979). Still, it seems to include the majority of typical cases and to exclude the majority of types of good English sentences. The second set defines relative ill-formedness. This is ill-formedness with respect to the normal processing rules of the formal computing system, including the natural language interface and the underlying application system. The set of ill-formed inputs for an interface can be defined as the union of these two sets for that interface.</Paragraph>
    <Paragraph position="4"> The set of ill-formed input captured by these definitions can also be seen through the four typical phases of interpretation in natural language interfaces: lexical, syntactic, semantic, and pragmatic processing. In lexical processing, absolute ill-formedness can come from misspelling, mistyping, and mispronunciation; relative ill-formedness can arise from unknown words.</Paragraph>
    <Paragraph position="5"> In syntactic processing, absolute ill-formedness is seen in faulty subject-verb agreement, word order errors, omitted words, run-on sentences, etc.; relative ill-formedness is seen in grammatical combinations of words that exceed the interface's grammar.</Paragraph>
    <Paragraph position="6"> Semantic processing can be defined as the interpretation of the input in isolation. Knowledge of the task domain can be applied, but the context of input with respect to previous interactions and the state of the underlying computing system are only considered in pragmatic processing.</Paragraph>
    <Paragraph position="7"> Absolute ill-formedness in semantics includes omitting needed information and violating selectional restrictions. Absolute ill-formedness in pragmatics includes breaking the rules of conversation, as when answering a question with a question, having presuppositions of the speaker fail, and failing to make clear an anaphoric reference. Relative ill-formedness in both cases includes "overshoot", requesting capabilities or information not covered by the system in its current state, and parenthetical expressions incomprehensible to the system. (American Journal of Computational Linguistics, Volume 9, Numbers 3-4, July-December 1983, p. 162. Ralph M. Weischedel and Norman K. Sondheimer, Meta-rules as a Basis for Processing Ill-Formed Input.)</Paragraph>
    <Paragraph position="8"> 2.1. Four alternative approaches to ill-formedness There are at least five approaches one can take to ill-formedness. This section outlines the four alternatives to the approach we have formulated; our approach is covered in Section 2.2. In describing the five approaches, we use the following informal notation. SYSTEM[s] refers to a system designed to process a set of sentences s. WELL-FORMED is a set of well-formed utterances; ILL-FORMED is a set of ill-formed utterances. Naturally, an approach that covers the broadest range of linguistic behaviour should be preferred.</Paragraph>
    <Paragraph position="9"> One alternative is to treat the processing of ill-formed and well-formed inputs identically, by ignoring constraints. That is, one designs SYSTEM[WELL-FORMED ∪ ILL-FORMED]. Schank et al. (1980) and Waltz (1978) have taken this approach toward grammatical constraints. CASPAR (Hayes and Carbonell 1981) exhibits this approach for grammatical constraints as well. Since there is much redundancy in language, the practice of not using certain constraints will often work. However, this will fail on many utterances, since it ignores rules that not only constrain search but also eliminate unintended interpretations.</Paragraph>
    <Paragraph position="10"> One can see this by considering subject-verb agreement, a grammatical constraint that people sometimes violate and that is often left out of natural language systems. Though other constraints, such as semantic (selection) restrictions between a verb and its subject, often indicate the intended interpretation, it is easy to think of examples where subject-verb agreement is crucial to understanding. In examples (1) and (2) below, subject-verb agreement determines whether the company or the assets were purchased. (1) List the assets of the company that was purchased by XYZ Corp.</Paragraph>
    <Paragraph position="11"> (2) List the assets of the company that were purchased by XYZ Corp.</Paragraph>
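    <Paragraph> As a minimal sketch of why the agreement constraint matters, the Python fragment below selects the antecedent of the relative clause in examples (1) and (2) by matching the number of the verb against each candidate noun. The toy lexicon and the function name antecedents are illustrative assumptions and do not come from any of the systems cited here.

```python
# Hypothetical toy lexicon: grammatical number of each candidate head noun.
NOUN_NUMBER = {"assets": "plural", "company": "singular"}
# Grammatical number signalled by the verb of the relative clause.
VERB_NUMBER = {"was": "singular", "were": "plural"}


def antecedents(candidates, verb, use_agreement=True):
    """Return the nouns that may head the clause 'that VERB purchased by ...'."""
    if not use_agreement:
        # Ignoring the constraint leaves every candidate available.
        return list(candidates)
    return [n for n in candidates if NOUN_NUMBER[n] == VERB_NUMBER[verb]]


candidates = ["assets", "company"]
print(antecedents(candidates, "was"))         # example (1): ['company']
print(antecedents(candidates, "were"))        # example (2): ['assets']
print(antecedents(candidates, "was", False))  # constraint ignored: both remain
```

    With the constraint switched off, both nouns survive and the query stays ambiguous, which is the failure mode attributed above to the first approach.</Paragraph>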
    <Paragraph position="12"> A second approach is to build systems for well-formed input and for ill-formed input together; that is, one designs SYSTEM[ILL-FORMED] merged with SYSTEM[WELL-FORMED]. Unlike the first approach, well-formedness constraints are employed on well-formed input. LUNAR (Woods et al. 1972), an early English interface to a question-answering system, and SOPHIE (Burton and Brown 1977), an intelligent tutoring system with an English interface, both used this approach. The problem with this approach is that it does not reflect the fact that constraints indicate preferences in interpretation. For instance, though example (3) below has two legitimate syntactic interpretations, the one that violates our model of the world is rejected, causing us to reject the "garden path" interpretation. (3) I saw the Statue of Liberty flying to New York.</Paragraph>
    <Paragraph position="13"> As another example, the two pronouns in "He shot him" are normally considered to refer to different people; the alternative that the speaker meant "He shot himself" does not arise unless there are strong expectations ahead of time that that is the correct proposition.</Paragraph>
    <Paragraph position="14"> A third approach is to build two systems, but to use SYSTEM[ILL-FORMED] only if SYSTEM[WELL-FORMED] finds no interpretation. A commercially available English interface to data bases (Harris 1977) has taken this approach. The EPISTLE project (Jensen and Heidorn 1983, Miller et al. 1981) employs this alternative for grammatical violations. DYPAR (Carbonell et al. 1983) has taken this approach in an interface to an expert system. Kaplan (1978) developed a strategy to give more useful responses when a data base query yields a negative response, for example, when no entity satisfies the desired conditions. Chang (1978) created a heuristic for inferring missing joins in incomplete queries to relational data bases.</Paragraph>
    <Paragraph position="15"> The defect in this model is that there is no means of relating strategies for processing ill-formedness explicitly to the strategies for processing well-formedness. We argue here that one can explicitly relate the two classes of strategies.</Paragraph>
    <Paragraph position="16"> A fourth approach is to build only one system, SYSTEM[WELL-FORMED], and to employ a metric to measure how far a postulated interpretation is from satisfying all well-formedness constraints. Charniak (1981) has advocated this for grammatical processing; Wilks (1975) has made this the basis of semantic processing during the interpretation phase. Of course, the notion of weighing alternatives and using metrics has been used for phenomena other than ill-formedness, such as parsing (Robinson 1982) and speech understanding (Walker 1978, Woods et al. 1976).</Paragraph>
    <Paragraph position="17"> Clearly, ranking alternative interpretations is necessary. However, if one relies solely on a metric and SYSTEM[WELL-FORMED], then an account of the fact that the ill-formedness often has specific implications is still needed. In example (4), the selection restriction that "like" requires animate agents is violated; a reasonable inference is that the speaker somewhat personifies the computer in question.</Paragraph>
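    <Paragraph> One way to picture such a metric is as a weighted count of the constraints an interpretation violates, with candidate interpretations ranked by that score. The sketch below assumes this simple additive form; the constraint names, the penalty values, and the functions deviance and rank are invented for illustration and are not taken from Charniak (1981), Wilks (1975), or any other cited work.

```python
# Hypothetical penalty weights for violated well-formedness constraints.
PENALTY = {"subject_verb_agreement": 1.0, "selection_restriction": 2.0}


def deviance(violations):
    """Sum the penalties of the constraints an interpretation violates."""
    return sum(PENALTY[v] for v in violations)


def rank(interpretations):
    """Order (label, violations) pairs from least to most deviant."""
    return sorted(interpretations, key=lambda item: deviance(item[1]))


readings = [
    ("reading A", ["selection_restriction"]),
    ("reading B", []),
    ("reading C", ["subject_verb_agreement", "selection_restriction"]),
]
best_label, best_violations = rank(readings)[0]
print(best_label)  # reading B, the only candidate with no violations
```

    The score orders the candidates, but once the violations are collapsed into a single number the ranking no longer says which constraint failed.</Paragraph>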
    <Paragraph position="18"> (4) My home computer doesn't like to run BASIC.</Paragraph>
    <Paragraph position="19"> Nor does a metric reflect the fact that there are clear patterns of error, such as those that have been reported in linguistic studies (Fromkin 1973) and in application studies (Thompson 1980, Eastman and McLean 1981).</Paragraph>
    <Paragraph position="20"> Table I summarizes these four approaches.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2. Our approach
</SectionTitle>
      <Paragraph position="0"> Based on previous work, both our own and that of others, we propose a framework employing meta-rules to relate the processing of ill-formed input to well-formedness rules. This framework may be stated as follows:
        1. Process the input using SYSTEM[WELL-FORMED].
        2. If no interpretation is found by SYSTEM[WELL-FORMED], apply a meta-rule to the well-formedness rules, based on a ranking of the alternatives, in order to
           a) diagnose the problem, that is, the rule that is violated and how it is violated,
           b) relax the rule,
           c) add a "deviance note" to the interpretation recording the violation,
           d) resume processing via the well-formedness rules, if possible.</Paragraph>
      <Paragraph position="1"> 3. Repeat step 2 as necessary.</Paragraph>
      <Paragraph position="2"> Each meta-rule should correspond to a pattern of ill-formedness and should account for utterances corresponding to only that pattern. SYSTEM[ILL-FORMED] is therefore implicit in the meta-rules.</Paragraph>
      <Paragraph position="3"> This framework has advantages lacking in one or more of the other approaches. Well-formedness constraints, whether syntactic, semantic, or pragmatic, are employed to eliminate unintended interpretations. Well-formed interpretations are always preferred. Ill-formedness processing is explicitly related to the well-formedness rules. Only the constraint that seems to be violated is relaxed; all other well-formedness constraints are still effective. Furthermore, the deviance notes record the aspect that deviates from well-formedness, thus allowing pragmatic inferences by later processing.</Paragraph>
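      <Paragraph> The three steps of the framework can be read as a control loop. The following Python sketch is one possible rendering under strong simplifying assumptions: a well-formedness rule is modelled as a named predicate over a dictionary describing the utterance, a meta-rule simply supplies a relaxed predicate for the rule of the same name, and the ranking of alternative meta-rules is left out. All names (Interpretation, first_violation, interpret, RULES, META_RULES) are hypothetical and are not the formalism or implementation described in this paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Rule = Callable[[dict], bool]


@dataclass
class Interpretation:
    reading: str
    deviance_notes: List[str] = field(default_factory=list)


def first_violation(utterance: dict, rules: Dict[str, Rule]) -> Optional[str]:
    """Return the name of the first violated rule, or None if all rules hold."""
    for name, holds in rules.items():
        if not holds(utterance):
            return name
    return None


def interpret(utterance, rules, meta_rules, max_relaxations=3):
    """Steps 1-3: process, relax one violated rule at a time, and retry."""
    active = dict(rules)  # copy, so the normal rules themselves stay intact
    notes = []
    for _ in range(max_relaxations + 1):
        violated = first_violation(utterance, active)  # step 1 and step 2d
        if violated is None:
            return Interpretation(utterance["text"], notes)
        if violated not in meta_rules:  # no meta-rule covers this pattern
            return None
        active[violated] = meta_rules[violated]  # steps 2a and 2b
        notes.append("relaxed " + violated)      # step 2c: deviance note
    return None


# Hypothetical normal rules and a single meta-rule that relaxes agreement.
RULES = {
    "has_verb": lambda u: u["has_verb"],
    "agreement": lambda u: u["subject_number"] == u["verb_number"],
}
META_RULES = {"agreement": lambda u: True}

bad = {"text": "the companies was purchased", "has_verb": True,
       "subject_number": "plural", "verb_number": "singular"}
result = interpret(bad, RULES, META_RULES)
print(result.deviance_notes)
```

      Only the rule named by the meta-rule is replaced, so every other constraint remains in force on the retry, and an input that violates a rule with no matching meta-rule yields no interpretation at all.</Paragraph>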
      <Paragraph position="4"> In the next two sections, we propose a handful of primitives for syntactic and semantic problems and also propose a formalism for writing meta-rules. As supporting evidence, we state meta-rules for a number of problems and describe approaches for several others. These phenomena include the following: failed grammatical tests, word confusions, spelling errors, unknown words, restarted sentences, resumptive pronouns and noun phrases, contextual ellipses, selection restriction violation, metonymy, personification, and presupposition failure.</Paragraph>
    </Section>
  </Section>
</Paper>