File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/95/p95-1022_intro.xml

Size: 3,906 bytes

Last Modified: 2025-10-06 14:05:52

<?xml version="1.0" standalone="yes"?>
<Paper uid="P95-1022">
  <Title>The intersection of Finite State Automata and Definite Clause Grammars</Title>
  <Section position="2" start_page="0" end_page="159" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In this paper we are concerned with the syntactic analysis phase of a natural language understanding system. Ordinarily, the input of such a system is a sequence of words. However, following Bernard Lang we argue that it might be fruitful to take the input more generally as a finite state automaton (FSA) to model cases in which we are uncertain about the actual input. Parsing uncertain input might be necessary in case of ill-formed textual input, or in case of speech input.</Paragraph>
    <Paragraph position="1"> For example, if a natural language understanding system is interfaced with a speech recognition component, chances are that this co~t is uncertain about the actual string of words that has been uttered, and thus produces a word lattice of the most promising hypotheses, rather than a single sequence of words. FSA of course generalizes such word lattices.</Paragraph>
    <Paragraph position="2"> As another example, certain techniques to deal with ill-formed input can be characterized as finite state transducers (Lang, 1989); the composition of an input string with such a finite state transducer results in a FSA that can then be input for syntactic parsing. Such an approach allows for the treatment of missing, extraneous, interchanged or misused words (Teitelbaum, 1973; Saito and Tomita, 1988; Nederhof and Bertsch, 1994).</Paragraph>
    <Paragraph position="3"> Such techniques might be of use both in the case of written and spoken language input. In the latter case another possible application concerns the treatment of phenomena such as repairs (Carter, 1994). Note that we allow the input to be a full FSA (possibly including cycles, etc.) since some of the above-mentioned techniques indeed result in cycles. Whereas an ordinary word-graph always defines a finite language, a FSA of course can easily define an infinite number of sentences. Cycles might emerge to treat unknown sequences of words, i.e.</Paragraph>
    <Paragraph position="4"> sentences with unknown parts of unknown lengths (Lang, 1988).</Paragraph>
    <Paragraph position="5"> As suggested by an ACL reviewer, one could also try to model haplology phenomena (such as the's in English sentences like 'The chef at Joe's hat', where 'Joe's&amp;quot; is the name of a restaurant) using a finite state transducer. In a straightforward approach this would also lead to a finite-state automaton with cycles.</Paragraph>
    <Paragraph position="6"> It can be shown that the computation of the intersection of a FSA and a CFG requires only a rain- null imal generalization of existing parsing algorithms.</Paragraph>
    <Paragraph position="7"> We simply replace the usual string positions with the names of the states in the FSA. It is also straight-forward to show that the complexity of this process is cubic in the number of states of the FSA (in the case of ordinary parsing the number of states equals n + 1) (Lang, 1974; Billot and Lang, 1989) (assuming the right-hand-sides of grammar rules have at most two categories).</Paragraph>
    <Paragraph position="8"> In this paper we investigate whether the same techniques can be applied in case the grammar is a constraint-based grammar rather than a CFG. For specificity we will take the grammar to be a Definite Clause Grammar (DCG) (Pereira and Warren, 1980).</Paragraph>
    <Paragraph position="9"> A DCG is a simple example of a family of constraint-based grammar formalisms that are widely used in natural language analysis (and generation). The main findings of this paper can be extended to other members of that family of constraint-based grammar formalisms.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML