<?xml version="1.0" standalone="yes"?>
<Paper uid="C80-1008">
  <Title>A RULE-BASED APPROACH TO ILL-FORMED INPUT</Title>
  <Section position="1" start_page="0" end_page="46" type="abstr">
    <SectionTitle>
SUMMARY
</SectionTitle>
    <Paragraph position="0"> Though natural language understanding systems have improved markedly in recent years, they have only begun to consider a major problem of truly natural input: ill-formedness. Quite often natural language input is ill-formed in the sense of being misspelled, ungrammatical, or not entirely meaningful. A requirement for any successful natural language interface must be that the system either intelligently guesses at a user's intent, requests direct clarification, or at the very least, accurately identifies the ill-formedness. This paper presents a proposal for the proper treatment of ill-formed input.</Paragraph>
    <Paragraph position="1"> Our conjecture is that ill-formedness should be treated as rule-based. Violation of the rules of normal processing should be used to signal ill-formedness. Meta-rules modifying the rules of normal processing should be used for error identification and recovery. These meta-rules correspond to types of errors. Evidence for this conjecture is presented as well as some open questions.</Paragraph>
    <Paragraph position="2"> I. Introduction Natural Language interfaces have improved markedly in recent years and have even begun to enter the commercial marketplace, e.g., the ROBOT system of Artificial Intelligence Corporation (Harris, 1978). These systems promise to make major improvements in the ease-of-use of data base management and other computer systems. However, they have only begun to consider the problems of truly natural input. The emphasis has been, and continues to be, on the understanding of well-formed inputs. True natural language input is often ill-formed in the absolute sense of being filled with misspellings, mistypings, mispunctuations, tense and number errors, word order problems, run-on sentences, sentence fragments, extraneous forms, meaningless sentences, impossible requests, etc. In addition, natural input is ill-formed in the relative sense of containing requests that are beyond the limits of either the computer system or the natural language interface. The frequent occurrence of these phenomena has been pointed out by both friend and foe of natural language interfaces; see, for example, Malhotra (1975), Montgomery (1972), and Shneiderman (1978).</Paragraph>
    <Paragraph position="3"> Most systems deal with a few of these types of ill-formedness. Experience (Harris, 1977b and Hendrix, et al., 1978) has shown that users can adapt to the limitations of the system's well-formed, anticipated input. Yet, we feel that presuming on such user adaptation eliminates one of the most powerful motivations for English input: namely, enabling infrequent users to access their data without an intermediary person and without extensive practice. Even for the person who frequently uses such a system, if it cannot explain why it misunderstands an input, the system will be exasperating at times.</Paragraph>
    <Paragraph position="4"> Therefore, we totally agree with Wilks (1976) in his statement that &quot;Understanding requires, at the very least, ... some attempt to interpret, rather than merely reject, what seems to be ill-formed utterances.&quot; A requirement for any natural language interface must be that, when faced with ill-formed input, the system either intelligently guesses at a user's intent, requests direct clarification, or at the very least, accurately identifies the ill-formedness. Researchers including ourselves have worked on various aspects of ill-formedness.</Paragraph>
    <Paragraph position="5"> Out of our work, and that of others, we have produced a conjecture on the treatment of ill-formed input to natural language interfaces.</Paragraph>
    <Paragraph position="6"> That conjecture is in essence that ill-formedness should be treated as &quot;rule-based&quot;. First, natural language interfaces should process all input as presumably well-formed until the rules of normal processing are violated. At that point, error handling procedures based on meta-rules relating ill-formed input to well-formed structures through the modification of the violated normal rules should be employed.</Paragraph>
    <Paragraph position="7"> These meta-rules correspond to types of errors.</Paragraph>
    <Paragraph position="8"> The rest of the paper argues for this rule-based approach. Section 2 characterizes both the types of ill-formed input, and the types of possible approaches to them. Section 3 explains our proposal. Section 4 motivates the proposal through analysis of its effect on the development and operation of natural language interfaces, the use of evidence from other disciplines that consider ill-formedness in natural and artificial languages, and, most importantly, evidence from work on natural language understanding systems. Section 5 discusses some open problems in light of the proposal.</Paragraph>
  </Section>
</Paper>