<?xml version="1.0" standalone="yes"?> <Paper uid="P97-1060"> <Title>Representing Constraints with Automata</Title> <Section position="2" start_page="0" end_page="468" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> In recent years there has been a continuing interest in computational linguistics in both model-theoretic syntax and finite-state techniques. In this paper we attempt to bridge the gap between the two by exploiting an old result in logic, that the weak monadic second-order (MSO) theory of two successor functions (WS2S) is decidable (Thatcher and Wright 1968, Doner 1970). A &quot;weak&quot; second-order theory is one in which the set variables are allowed to range only over finite sets. There is a more powerful result available: it has been shown (Rabin 1969) that the strong monadic second-order theory (variables may range over infinite sets) of even countably many successor functions is decidable. However, in our linguistic applications we only need to quantify over finite sets, so the weaker theory is enough, and the techniques are correspondingly simpler.1 The decidability proof works by showing a correspondence between formulas in the language of WS2S and tree automata, developed in such a way that the formula is satisfiable iff the set of trees accepted by the corresponding automaton is nonempty. While these results were well known, the (rather surprising) suitability of this formalism as a constraint language for Principles and Parameters (P&amp;P) based linguistic theories has only recently been shown by Rogers (1994).</Paragraph> <Paragraph position="1"> It should be pointed out immediately that the translation from formulas to automata, while effective, is just about as complex as it is possible to be. In the worst case, the number of states can be given as a function of the number of variables in the input formula with a stack of exponents as tall as the number of quantifier alternations in the formula. 
However, a growing body of work on the application of these techniques in computer science (Basin and Klarlund 1995, Kelb et al. 1997), motivated by the success of the MONA decision procedure (Henriksen et al. 1995),2 suggests that in practical cases the extreme explosiveness of this technique can be effectively controlled. It is one of our goals to show that this is the case in linguistic applications as well. The decidability proof for WS2S is inductive on the structure of MSO formulas. Therefore we can choose our particular tree description language rather freely, knowing (a) that the resulting logic will be decidable and (b) that the translation to automata will go through as long as the atomic formulas of the language represent relations which can be translated (by hand if necessary) to tree automata.</Paragraph> <Paragraph position="2"> 1 All of these are generalizations to trees of results on strings and the monadic second-order theory of one successor function originally due to Büchi (1960). The applications we mention here could be adapted to strings with finite-state automata replacing tree automata. In general, all the techniques which apply to tree automata are straightforward generalizations of techniques for FSAs.</Paragraph> <Paragraph position="3"> 2 The current version of the MONA tool works only on the MSO logic of strings. There is work in progress at the University of Aarhus to extend MONA to &quot;MONA++&quot;, for trees (Biehl et al. 1996).</Paragraph> <Paragraph position="4"> We will see how this is done in the next section, but the point can be appreciated immediately. For example, Niehren and Podelski (1992) and Ayari et al. (1997) have investigated the usefulness of these techniques in dealing with feature trees which unfold feature structures; there the attributes of an attribute-value term are translated to distinct successor functions. 
On the other hand, Rogers (1996) has developed a language rich in long-distance relations (dominance and precedence) which is more appropriate for work in Government-Binding (GB) theory. Compact automata can be easily constructed to represent dominance and precedence relations.</Paragraph> <Paragraph position="5"> One can imagine other possibilities as well: as we will see, the automaton for Kayne-style asymmetric, precedence-restricted c-command (Kayne 1994) is also very compact, and makes a suitable primitive for a description language along the lines developed by Frank and Vijay-Shanker (1995).</Paragraph> <Paragraph position="6"> The paper is organized as follows. First we present some of the mathematical background, then we discuss (naive) uses of the techniques, followed by the presentation of a constraint logic programming-based extension of MSO logic to avoid some of the problems of the naive approach, concluding with a discussion of its strengths and weaknesses.</Paragraph> </Section> </Paper>