<?xml version="1.0" standalone="yes"?> <Paper uid="E95-1013"> <Title>Literal Movement Grammars</Title> <Section position="3" start_page="90" end_page="93" type="intro"> <SectionTitle> 2 Definition and Examples </SectionTitle> <Paragraph position="0"> There is evidence suggesting that humans typically process movement by first locating displaced information (the filler), and then finding its logical location (the trace), where that information is substituted. It also seems that, by and large, displaced information appears earlier than (or to the left of) its logical position, as in all examples given in the previous section. The typical unification-based approach to such movement is to structurally analyse the displaced constituent, and to use this analysed information in the treatment of the rest of the sentence. This method is called gap-threading; see (Alshawi, 1992).</Paragraph> <Paragraph position="1"> If we bear in mind that a filler is usually found to the left of the corresponding trace, it is worth developing a way of deferring the treatment of syntactic data. For example sentence 1, this means that upon finding the displaced constituent which book, we will not evaluate that constituent, but rather remember, during the treatment of the remaining part of the sentence, that this data is still to be fitted into a logical place.</Paragraph> <Paragraph position="2"> This is not a new idea. A number of non-concatenative grammar formalisms have been put forward, such as head-wrapping grammars (HG) (Pollard, 1984), extraposition grammars (XG) (Pereira, 1981), and tree adjoining grammars (TAG) (Kroch and Joshi, 1986). A discussion of these formalisms as alternatives to the LMG formalism is given in section 4.</Paragraph> <Paragraph position="3"> Lessons in parsing by hand in high school (e.g.
in English or Latin classes) informally illustrate the purpose of literal movement grammars: as opposed to the traditional linguistic point of view that there is only one head which dominates a phrase, constituents of a sentence have several key components. A verb phrase, for example, has not only its finite verb, but also one or more objects. It is precisely these key components that can be subject to movement. Now when such a key component is found outside the constituent it belongs to, the LMG formalism implements a simple mechanism to pass the component down the derivation tree, to where it is picked up by the constituent that contains its trace.</Paragraph> <Paragraph position="4"> It is best to think of LMGs as a predicate version of the (propositional) paradigm of context-free grammars, in that nonterminals can have arguments. If we call the general class of such grammars predicate grammars, the distinguishing feature of LMG with respect to other predicate grammar formalisms, such as indexed grammars (Weir, 1988; Aho, 1968), is the ability to bind or quantify variables in the right-hand side of a phrase structure rule.</Paragraph> <Paragraph position="5"> 2.1 Definition We fix disjoint sets N, T, V of nonterminal symbols, terminal symbols and variables. We will write A, B, C, ... to denote nonterminal symbols, a, b, c, ... to denote terminal symbols, and x, y, z for variables. A sequence a = a₁a₂⋯aₙ ∈ T* is called a (terminal) word or string. We will use the boldface symbols a, b, c for terminal words.</Paragraph> <Paragraph position="7"> 2.2 Definition (term) A term is a sequence of terminal symbols and variables; if a term consists of variables only, we call it a vector and usually write x.</Paragraph> <Paragraph position="8"> 2.3 Definition (similarity type) A (partial) function μ mapping N to the natural numbers is called a similarity type.</Paragraph> <Paragraph position="9"> 2.4 Definition (predicate) Let μ be a similarity type, A ∈ N and n = μ(A), and for 1 ≤ i ≤ n, let tᵢ be a term.
Then a predicate φ of type μ is a terminal a (a terminal predicate) or a syntactical unit of the form A(t₁, t₂, ..., tₙ), called a nonterminal predicate. If all tᵢ = xᵢ are vectors, we say that φ = A(x₁, x₂, ..., xₙ) is a pattern.</Paragraph> <Paragraph position="10"> Informally, we think of the arguments of a nonterminal as terminal words. A predicate A(x) then stands for a constituent A from which certain information with terminal yield x has been extraposed (i.e. found outside the constituent), and must hence be left out of the A constituent itself.</Paragraph> <Paragraph position="11"> 2.5 Definition (item) Let μ be a similarity type, φ a predicate of type μ, and t a term. Then an item of type μ is a syntactical unit of one of the following forms: 1. φ (a nonterminal or terminal predicate) 2. x:φ (a quantifier item) 3. φ/t (a slash item) We will use Φ, Ψ to denote items, and α, β, γ to denote sequences of items. (Footnote: indexed grammars are a weak form of monadic predicate grammar, as a nonterminal can have at most one argument.)</Paragraph> <Paragraph position="12"> 2.6 Definition Let μ be a similarity type. A rewrite rule R of type μ is a syntactical unit φ → Φ₁ Φ₂ ⋯ Φₙ where φ is a pattern of type μ, and for 1 ≤ i ≤ n, Φᵢ is an item of type μ.</Paragraph> <Paragraph position="13"> A literal movement grammar is a triple (μ, S, P) where μ is a similarity type, S ∈ N, μ(S) = 0 and P is a set of rewrite rules of type μ.</Paragraph> <Paragraph position="14"> Items on the right hand side of a rule can either refer to variables, as in the following rule: A(x, yz) → B()/x a/y C(z) or bind new variables, as the first two items in A() → x:B() y:C(x) D(y).</Paragraph> <Paragraph position="15"> A slash item such as B()/x means that x should be used instead of the actual &quot;input&quot; to recognize the nonterminal predicate B(). That is, the terminal word x should be recognized as B(), and the item B()/x itself will recognize the empty string.
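To make the three item types of Definition 2.5 concrete, here is a minimal backtracking recognizer (a sketch, not from the paper) for a small illustrative LMG of my own construction that accepts the non-context-free language aⁿbⁿcⁿ. The grammar is: S() → x:A() B(x); A() → a A(); A() → ε; B(xy) → a/x b B(y) c; B(ε) → ε. The quantifier item x:A() binds the a-block, the nonterminal predicate B(x) passes it down, and the slash item a/x checks one stored a per b while consuming no input.

```python
# A minimal backtracking recognizer for one particular literal movement
# grammar (an illustration of mine, not one of the paper's examples):
#
#   S()   -> x:A() B(x)        quantifier item binds the a-block to x
#   A()   -> a A() | eps       A recognizes a run of a's
#   B(xy) -> a/x b B(y) c      slash item a/x: x must be "a"; consumes no input
#   B(eps)-> eps
#
# It accepts exactly the non-context-free language a^n b^n c^n.

def parse_A(s, i):
    """Yield every end position after recognizing A() (a run of a's) at i."""
    yield i                                   # A() -> eps
    if i < len(s) and s[i] == 'a':
        yield from parse_A(s, i + 1)          # A() -> a A()

def parse_B(arg, s, i):
    """Yield end positions after recognizing B(arg) from position i."""
    if arg == '':
        yield i                               # B(eps) -> eps
        return
    # B(xy) -> a/x b B(y) c : split arg into x = first symbol, y = rest
    x, y = arg[0], arg[1:]
    if x == 'a':                              # slash item a/x checks x == "a"
        if i < len(s) and s[i] == 'b':        # consume one b
            for j in parse_B(y, s, i + 1):    # recurse on remainder y
                if j < len(s) and s[j] == 'c':  # consume one c
                    yield j + 1

def accepts(s):
    """S() -> x:A() B(x): try every a-prefix as the quantified value x."""
    for mid in parse_A(s, 0):
        x = s[:mid]                           # x = terminal yield of A()
        for end in parse_B(x, s, mid):
            if end == len(s):
                return True
    return False
```

Note how the argument of B shrinks by one a per recursion step, so the b's and c's are forced to match the a's in number; this is the argument-passing mechanism that a context-free grammar lacks.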
A quantifier item x:B() means that a constituent B() is recognized from the input, and that the variable x, when used elsewhere in the rule, will stand for the part of the input so recognized. 2.7 Definition (rewrite semantics) Let R = A(x₁, ..., xₙ) → Φ₁ Φ₂ ⋯ Φₘ be a rewrite rule; then an instantiation of R is the syntactical entity obtained by substituting, for each i and for each variable x ∈ xᵢ, a terminal word aₓ.</Paragraph> <Paragraph position="16"> A grammar G derives the string a iff S() ⇒ a, where ⇒ is a relation between predicates and sequences of items defined inductively by axioms and derivation rules. In the derivation tree of the example grammar for aⁿbⁿcⁿ, the quantified information (the two a symbols in this case) is 'moved back down' into the tree, until it gets 'consumed' by a slash item. The tree also shows how we can extract a context-free 'deep structure' for further analysis by, for example, formal specification tools: if we transform the tree, as shown in figure 3, by removing quantified (extraposed) data, and abstracting away from the parameters, we see that the grammar, in a sense, works by transforming the language aⁿbⁿcⁿ to the context-free language (ab)ⁿcⁿ. Figure 4 shows how we can derive a context-free 'backbone grammar' from the original grammar.</Paragraph> <Paragraph position="17"> 2.9 Example (cross-serial dependencies in Dutch) The following LMG captures precisely the three basic types of extraposition defined in section 1.3: the three Dutch verb orders, topicalization and cross-serial verb-object dependencies.</Paragraph> <Paragraph position="19"> A sentence S' has one argument, which is used, if nonempty, to fill a noun phrase trace. A VP has two arguments: the first is used to fill verb traces, the second is treated as a list of noun phrases to which more noun phrases can be appended.
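The backbone-grammar derivation mentioned above can be sketched programmatically. The reading below is an assumption on my part: predicate arguments are erased, a quantifier item x:B() keeps its predicate B (it still consumes input), and a slash item B()/x disappears, since it recognizes the empty string. The rule encoding and the aⁿbⁿcⁿ grammar are my own illustrations, not the paper's notation.

```python
# Sketch: derive a context-free "backbone grammar" from LMG rules, under
# one plausible reading (my assumption, not the paper's construction):
#   - erase all predicate arguments:   A(t1..tn)  ->  A
#   - keep the predicate of a quantifier item:   x:B()  ->  B
#   - drop slash items (they recognize the empty string):   B()/x  ->  (eps)
# Encoding: rule = (lhs, [items]); item = ('t', sym) | ('nt', name, args)
#           | ('q', var, item) | ('slash', item, term).

def backbone_item(item):
    """Map one LMG item to a context-free symbol, or None if it yields eps."""
    kind = item[0]
    if kind == 't':                    # terminal symbol: kept as-is
        return item[1]
    if kind == 'nt':                   # A(t1..tn) -> A
        return item[1]
    if kind == 'q':                    # x:B() -> B (still consumes input)
        return backbone_item(item[2])
    if kind == 'slash':                # B()/x recognizes eps -> dropped
        return None
    raise ValueError('unknown item kind: %r' % (kind,))

def backbone(rules):
    """Turn [(lhs, [items]), ...] into context-free rules (lhs, [symbols])."""
    cf = []
    for lhs, items in rules:
        rhs = [s for s in (backbone_item(it) for it in items) if s is not None]
        cf.append((lhs, rhs))
    return cf

# A small illustrative LMG for a^n b^n c^n (my own construction):
lmg = [
    ('S', [('q', 'x', ('nt', 'A', [])), ('nt', 'B', ['x'])]),
    ('A', [('t', 'a'), ('nt', 'A', [])]),
    ('A', []),
    ('B', [('slash', ('t', 'a'), 'x'), ('t', 'b'),
           ('nt', 'B', ['y']), ('t', 'c')]),
    ('B', []),
]
```

For this grammar, `backbone(lmg)` yields the context-free rules S → A B, A → a A | ε, B → b B c | ε: the counting constraint carried by the arguments is lost, which is exactly what "abstracting away from the parameters" means here.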
A V' is similar to a VP except that it uses the list of noun phrases in its second argument to fill noun phrase traces rather than adding to it.</Paragraph> <Paragraph position="20"> Figure 5 shows how this grammar accepts the sentence Marie zag Fred Anne kussen.</Paragraph> <Paragraph position="21"> We see that it is analyzed as Marie zagᵢ Fredⱼ Anneₖ [V' eᵢ eⱼ [V' kussen eₖ]] which, as anticipated in section 1.3, has precisely the basic, context-free underlying structure of the corresponding English sentence Mary saw Fred kiss Anne, indicated in figure 5 by terminal words in bold face. Note that arbitrary verbs are recognized by a quantifier item v:V, and only when, further down the tree, a trace is filled with such a verb in items such as VR/v, its subcategorization types VI, VT and VR start playing a role.</Paragraph> </Section> </Paper>