<?xml version="1.0" standalone="yes"?>
<Paper uid="A94-1018">
  <Title>Yet Another Chart-Based Technique for Parsing Ill-Formed Input</Title>
  <Section position="2" start_page="0" end_page="107" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> It is important that natural language interface systems be able to compose the globally most plausible explanation when a given input cannot be parsed syntactically. This would be useful for handling erroneous inputs from the user and for offsetting grammar and lexicon insufficiency. Such a capability could also be applied to the ungrammatical sentences and sentence fragments that frequently appear in spoken dialogs (Bear, Dowding and Shriberg, 1992). Several efforts have been made to achieve this objective (for example, Lang, 1988; Saito and Tomita, 1988). One major decision to be made in designing this capability is whether knowledge other than purely syntactic knowledge is to be used. Knowledge other than syntactic knowledge includes grammar-specific recovery rules such as meta-rules (Weischedel and Sondheimer, 1983), semantic or pragmatic knowledge, which may depend on a particular domain (Carbonell and Hayes, 1983), or the characteristics of the ill-formed utterances observed in human discourse (Hindle, 1983).</Paragraph>
    <Paragraph position="1"> Although it is obvious that utilizing such knowledge allows us to devise more powerful strategies, we should first determine the effectiveness of using only syntactic knowledge. Moreover, the result can be applied widely, since syntactic knowledge is the basis of most strategies.</Paragraph>
    <Paragraph position="2"> One significant advance in the use of syntactic knowledge was the technique proposed by Mellish (1989). It can handle not only unknown or misspelled words, but also omitted words and extraneous words in sentences. It can deal with such problems and develop plausible explanations quickly, since it utilizes the full syntactic context by using an active chart parser (Kay, 1980; Gazdar and Mellish, 1989). One problem with his technique is that its performance depends heavily on how the search heuristic, which is implemented as a score calculated from six parameters, is set. The heuristic complicates the algorithm significantly. This must be one of the reasons why the performance of the method, as Mellish himself noted, dropped dramatically when the input contained multiple errors.</Paragraph>
    <Paragraph position="3"> This paper proposes a new technique for parsing inputs that contain simple kinds of ill-formedness. This generalized parsing strategy is, like Mellish's, based on an active chart parser, and so shares the many advantages of Mellish's technique. It is based purely on syntactic knowledge, it is independent of any particular grammar, and it does not slow down the original parsing operation if there is no ill-formedness. However, unlike Mellish's technique, it does not employ any complicated heuristic parameters.</Paragraph>
    <Paragraph position="4"> There are two key points. First, instead of using a unified or interleaved process for finding errors and correcting them, we separate the initial error detection stage from the other stages and adopt a version of bi-directional parsing, which has itself been pointed out to be a useful strategy for fragment parsing (Satta and Stock, 1989). This effectively prunes the search space and allows the new technique to take full account of the right-side context. Second, it employs normal top-down parsing, in which each parsing state reflects the global context, instead of top-down chart parsing. This enables the technique to determine the global plausibility of candidates easily. The results of preliminary experiments are encouraging. The proposed strategy could enumerate all possible minimal-penalty solutions in just four times the time taken to parse the correct sentences. That is, it is almost twice as fast as Mellish's strategy.</Paragraph>
  </Section>
</Paper>