<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-2122">
  <Title>An Earley-type recognizer for dependency grammar</Title>
  <Section position="3" start_page="723" end_page="726" type="metho">
    <SectionTitle>
2 A dependency formalism
</SectionTitle>
    <Paragraph position="0"> In this section we introduce a dependency formalism.</Paragraph>
    <Paragraph position="1"> We express the dependency relations in terms of rules that are very similar to their constituency counterparts, i.e. context-free grammars. The formalism has been adapted from (Gaifman 1965). Less constrained dependency formalisms exist in the literature (Mel'cuk 1988) (Fraser, Hudson 1992), but no mathematical studies of their expressive power exist.</Paragraph>
    <Paragraph position="2"> A dependency grammar is a quintuple &lt;S, C, W, L, T&gt;, where W is a finite set of symbols (vocabulary of words of a natural language), C is a set of syntactic categories (preterminals, in constituency terms), S is a non-empty set of root categories (S ⊆ C), L is a set of category assignment rules of the form X: x, where X ∈ C, x ∈ W, and</Paragraph>
    <Paragraph position="4"> governor, and Y1, ..., Ym are the dependents of X in the given order (X is in the # position).</Paragraph>
    <Paragraph position="5"> T is a set of dependency rules of the form X(Y1 Y2 ... Yi-1 # Yi+1 ... Ym), where X ∈ C, Y1, ..., Ym ∈ C, and # is a special symbol that does not belong to C (see fig. 2).</Paragraph>
    <Paragraph position="6"> The modifier symbols Yj can take the form Yj*: as usual, this means that an indefinite number of Yj's (zero or more) may appear in an application of the rule 1 . In the sample grammar below, this extension allows for several prepositional modifiers under a single verbal or nominal head without introducing intermediate symbols; the predicate-arguments structure is immediately represented by a one-level (flat) dependency structure.</Paragraph>
    <Paragraph position="7"> Let x = a1 a2 ... ap ∈ W* be a sentence. A dependency tree of x is a tree such that: 1) the nodes are the symbols ai ∈ W (1 ≤ i ≤ p); 2) a node ak,j has left daughters ak,1, ..., ak,j-1 occurring in this order and right daughters ak,j+1, ..., ak,q in this order if and only if there exist the rules Ak,1: ak,1, ..., Ak,j: ak,j, ..., Ak,q: ak,q in L and the rule Ak,j(Ak,1 ... Ak,j-1 # Ak,j+1 ... Ak,q) in T. We say that ak,1, ..., ak,j-1, ak,j+1, ..., ak,q directly depend on ak,j, or equivalently that ak,j directly governs ak,1, ..., ak,j-1, ak,j+1, ..., ak,q; ak,j and ak,h (h = 1, ..., j-1, j+1, ..., q) are said to be in a dependency relation, where ak,j is the head and ak,h is the modifier. If there exists a sequence of nodes ai, ai+1, ..., aj-1, aj such that ak directly depends on ak-1 for each k such that i+1 ≤ k ≤ j, then we say that ai depends on aj; 3) it satisfies the condition of projectivity with respect to the order in x, that is, if ai depends directly on aj and ak intervenes between them (i &lt; k &lt; j or j &lt; k &lt; i), then either ak depends on ai or ak depends on aj (see fig. 3); 4) the root is a unique symbol as such that As: as ∈ L and As ∈ S.</Paragraph>
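    <Paragraph> The projectivity condition can be checked directly on a head-annotated sentence. The following sketch is our own illustration, not part of the paper's machinery: it encodes a dependency tree as an array of governor positions and tests the condition stated in point 3 above.

```python
def is_projective(heads):
    """Projectivity check. `heads` maps each 0-based position to the
    position of its governor; the root has head -1. For every arc, each
    word lying between a word and its governor must depend (directly or
    transitively) on one of the two endpoints."""
    def dominates(a, b):
        # True if node a transitively governs node b.
        while b != -1:
            b = heads[b]
            if b == a:
                return True
        return False

    for d, h in enumerate(heads):
        if h == -1:
            continue  # the root has no governor
        lo, hi = min(d, h), max(d, h)
        for k in range(lo + 1, hi):
            if not (dominates(h, k) or dominates(d, k)):
                return False
    return True
```

A tree with an arc crossing over a word dominated by neither endpoint is rejected; such structures are exactly those the formalism excludes, which keeps its power context-free.
</Paragraph>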
    <Paragraph position="8"> The condition of projectivity limits the expressive power of the formalism to be equivalent to context-free power. Intuitively, this principle states</Paragraph>
    <Paragraph position="10"> respect to Gaifman: however, it is not uncommon to allow the symbols on the right-hand side of a rule to be regular expressions, in order to augment the perspicuity of the syntactic representation but not the expressive power of the grammar (a similar extension appears in the context-free part of the LFG formalism (Kaplan, Bresnan 1982)).</Paragraph>
    <Paragraph position="11">  that a dependent is never separated from its governor by anything other than another dependent, together with its subtree, or by a dependent of its own.</Paragraph>
    <Paragraph position="12"> As an example, consider the grammar G1 = &lt;{V}, {V, N, P, A, D}, {I, saw, a, tall, old, man, in, the, park, with, telescope}, {N: I, V: saw, D: a, A: tall, A: old, N: man, P: in, D: the, N: park, P: with, N: telescope}, T1&gt;, where T1 is the following set of dependency rules:  1. V(N # P*); 2. V(N # N P*); 3. N(A* # P*); 4. N(D A* # P*); 5. P(# N); 6. A(#); 7. D(#).</Paragraph>
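    <Paragraph> For concreteness, G1 can be written down as a plain data structure. The sketch below is our own encoding (the dict layout and field names are illustrative assumptions): "#" marks the governor's position in a rule and a trailing "*" marks a repeatable modifier, following the notation of the rules above.

```python
# Our own encoding of the example grammar G1; the representation
# (dict layout, rule-list format) is an illustrative assumption.
G1 = {
    "roots": {"V"},
    "categories": {"V", "N", "P", "A", "D"},
    # category assignment rules L (a word may have several categories)
    "lexicon": {
        "I": {"N"}, "saw": {"V"}, "a": {"D"}, "tall": {"A"},
        "old": {"A"}, "man": {"N"}, "in": {"P"}, "the": {"D"},
        "park": {"N"}, "with": {"P"}, "telescope": {"N"},
    },
    # dependency rules T: "#" is the governor's own position,
    # "X*" means zero or more X dependents
    "rules": {
        "V": [["N", "#", "P*"], ["N", "#", "N", "P*"]],
        "N": [["A*", "#", "P*"], ["D", "A*", "#", "P*"]],
        "P": [["#", "N"]],
        "A": [["#"]],
        "D": [["#"]],
    },
}
```

The starred P* slots are what let several prepositional phrases hang flat under a single verbal or nominal head, as discussed above.
</Paragraph>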
    <Paragraph position="13"> For instance, the two rules for the root category V(erb) specify that a verb (V) can dominate one or two nouns and some prepositions (*).</Paragraph>
    <Paragraph position="14"> 3 Recognition with a dependency grammar  The recognizer is an improved Earley-type algorithm, where the predictive component has been compiled into a set of parse tables. We use two primary actions: predict, which corresponds to the top-down guessing of a category, and scan, which corresponds to the scanning of the current input word. In subsection 3.1 we describe the data structures and the algorithms for translating the dependency rules into the parse tables: the dependency rules for a category are first translated into a transition graph, and the transition graph is then mapped onto a parse table. In subsection 3.2 we present the Earley-type recognizer, which equals the most efficient recognizers for context-free grammars.</Paragraph>
    <Paragraph position="15"> 3.1 Transition graphs and parse tables A transition graph is a pair (V, E), where V is a set of vertices called states, and E is a set of directed edges labelled with a syntactic category or the symbol #. Given a grammar G = &lt;S, C, W, L, T&gt;, a state of the transition graph for a category Cat ∈ C is a set of dotted strings of the form &amp;quot;.β&amp;quot;, where Cat(αβ) ∈ T and α, β ∈ (C ∪ {#})*; an edge is a triple &lt;si, sj, Y&gt;, where si, sj ∈ V and Y ∈ C ∪ {#}. A state that contains the dotted string &amp;quot;.&amp;quot; is called final; a final state signals that the recognition of one or more dependency rules has been completed. The following algorithm constructs the transition graph for the</Paragraph>
    <Paragraph position="17"> each category until all states in V are marked;</Paragraph>
    <Paragraph position="19"> take a non-marked dotted string ds from set-of-strings; mark ds; if ds has the form &amp;quot;.Yβ&amp;quot; and Y is starred then set-of-strings := set-of-strings ∪ {&amp;quot;.β&amp;quot;} until all dotted strings in set-of-strings are marked; star := set-of-strings.</Paragraph>
    <Paragraph position="20"> The initial set of states consists of a single state s0, which contains all the possible strings &amp;quot;.α&amp;quot; such that Cat(α) is a dependency rule. Each string is prefixed with a dot. The marked states are the states that were expanded in a previous step. The expansion of a state s takes into account each symbol Y that immediately follows a dot (Y ∈ C ∪ {#}). Y is a possible continuation to a new state s', which contains the dotted string &amp;quot;.β&amp;quot;, where &amp;quot;.Yβ&amp;quot; is a dotted string in s. s' is added to the set of states, and a new edge from s to s' labelled with Y is added to the set of edges. A dotted string of the form &amp;quot;.Y*β&amp;quot; is treated as a pair of dotted strings {&amp;quot;.Y*β&amp;quot;, &amp;quot;.β&amp;quot;}, so as to allow a number of iterations (one or more Y's follow) or no iteration (the first symbol in β follows) in the next step. The function &amp;quot;star&amp;quot; takes these cases into account; the repeat loop accounts for the case in which the first symbol of β is starred too.</Paragraph>
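    <Paragraph> The construction just described can be sketched in a few lines. This is our own reading of the algorithm: dotted strings are represented as tuples of the symbols still to be read (the dot is implicit at the front), and states as star-closed frozensets of such tuples.

```python
def star_closure(strings):
    """The "star" function: if the symbol after the dot is starred, the
    dotted string with that symbol dropped is also in the state (the
    loop handles the case where the next symbol is starred too)."""
    out = set(strings)
    work = list(out)
    while work:
        ds = work.pop()
        if ds and ds[0].endswith("*") and ds[1:] not in out:
            out.add(ds[1:])
            work.append(ds[1:])
    return frozenset(out)

def transition_graph(rhs_list):
    """Build the transition graph for one category from the right-hand
    sides of its dependency rules. Returns (initial state, edges), with
    edges as a dict from (state, symbol) to the successor state.
    A state containing the empty tuple () is final."""
    start = star_closure(tuple(rhs) for rhs in rhs_list)
    edges, work, seen = {}, [start], {start}
    while work:
        state = work.pop()
        for y in {ds[0].rstrip("*") for ds in state if ds}:
            nxt = set()
            for ds in state:
                if ds and ds[0].rstrip("*") == y:
                    # a starred symbol may repeat, so it stays in place
                    nxt.add(ds if ds[0].endswith("*") else ds[1:])
            nxt = star_closure(nxt)
            edges[(state, y)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return start, edges
```

For the noun rules of G1, for instance, the initial state already contains the string with the optional A* skipped, so a determinerless, adjectiveless noun can be consumed immediately via the # edge.
</Paragraph>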
    <Paragraph position="21"> The transition graphs obtained for the five categories of G1 are in fig. 4. Conventionally, we indicate the non-final states as h and the final states as $k, where h and k are integers.</Paragraph>
    <Paragraph position="22"> The total number of states of all the transition graphs for a grammar G is at most O(|G|), where |G| is the sum of the lengths of the dependency rules. The length of a dependency rule Cat(α) is the length of α. Starting from the transition graph for a category Cat, we can build the parse table for Cat, i.e. PTCat.</Paragraph>
    <Paragraph position="23"> PTCat is an array h x k, where h is the number of states of the transition graph and k is the number of syntactic categories in C. Each row is identified by a pair &lt;Cat, State&gt;, where State is the label of a state of the corresponding transition graph; each column is associated with a syntactic category. In order to improve the top-down algorithm we introduce the concept of the &amp;quot;first&amp;quot; of a category. The first of the category Cat is the set of categories that can appear as the leftmost node of a subtree headed by Cat. The first of a category X is computed by a simple procedure that we omit here. The function parse_table computes the parse tables of the various categories. E(t-graphCat) returns the set of the edges of the graph t-graphCat.</Paragraph>
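    <Paragraph> The omitted "first" computation is a standard fixpoint. A sketch under our own representation assumptions (rules as lists of symbol strings, "#" for the head position, "X*" for a starred modifier) might read:

```python
def first_sets(rules):
    """Compute first(X) for every category X: the set of categories that
    can label the leftmost word of a subtree headed by X. Starred (hence
    optional) left modifiers may be skipped; reaching "#" means X itself
    can be the leftmost word."""
    first = {x: set() for x in rules}
    changed = True
    while changed:
        changed = False
        for x, rhss in rules.items():
            for rhs in rhss:
                for sym in rhs:
                    add = {x} if sym == "#" else first[sym.rstrip("*")]
                    if not add.issubset(first[x]):
                        first[x] |= add
                        changed = True
                    if not sym.endswith("*"):
                        break  # a non-optional symbol ends the left corner
    return first
```

On G1 this reproduces the sets reported in the text: first(V) = first(N) = {N, A, D}, while P, A and D each have only themselves as first.
</Paragraph>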
    <Paragraph position="24"> The contents of the entries in the parse tables are sets (possibly empty) of predict and scan. The initialization step consists in setting all entries of the table to the empty set.</Paragraph>
    <Paragraph position="25">  although this does not happen for our simple grammar G1.</Paragraph>
    <Paragraph position="26"> 3.2 A dependency recognizer The dependency recognizer uses the same data structures as Earley's recognizer (Earley 1970), but improves on the performance of that algorithm thanks to the precompilation of the predictive component into the parse tables.</Paragraph>
    <Paragraph position="27"> In order to recognize a sentence of n words, n+1 sets Si of items are built. An item is a quadruple &lt;Category, State, Position, Depcat&gt;, where the first two elements (Category and State) correspond to a row of the parse table PTCategory, the third element (Position) gives the index i of the set Si where the recognition of a substructure began, and the fourth one (Depcat) is used to request the completion of a substructure headed by Depcat. parse-table(Cat, t-graphCat):</Paragraph>
    <Paragraph position="29"> The parse tables for the grammar G1 are reported in fig. 5. The firsts are: first(V) = first(N) = {N, A, D}; first(P) = {P}; first(A) = {A}; first(D) = {D}. Note that an entry of a table can contain more than one action,  before continuing in the recognition of the larger structure headed by Category (Depcat = &amp;quot;_&amp;quot; means that the item is not waiting for any completion).</Paragraph>
    <Paragraph position="30"> Sentence: w0 w1 ... wn-1. initialization: for each root category V do INSERT &lt;V, 0, 0, _&gt; into S0 end for. body: for i from 0 to n do for each item P = &lt;Cat, State, j, _&gt; in Si do completer: if final(State) then for each item &lt;Cat', k, j', Cat&gt; in Sj do INSERT &lt;Cat', k, j', _&gt; into Si; predictor: if &lt;predict(Cat'), State'&gt; ∈ PTCat(&lt;Cat, State&gt; x</Paragraph>
    <Paragraph position="32"> end for; termination: if &lt;V, $k, 0, _&gt; is in Sn then accept else reject. The external loop of the algorithm cycles on the sets Si (0 ≤ i ≤ n); the inner loop cycles on the items of the set Si of the form &lt;Cat, State, j, _&gt;. At each step of the inner loop, the action(s) given by the entry &lt;Cat, State&gt; x Inputcat in the parse table PTCat is (are) executed (where Inputcat is one of the categories of the current word). As in Earley's parser there are three phases: completer, predictor and scanner.</Paragraph>
    <Paragraph position="33"> completer: When an item is in a final state (of the form $h), the algorithm looks for the items that represent the beginning of the input portion just analyzed: they are the four-element items contained in the set referred to by j. These items are inserted into Si after their fourth element has been set to null (_). predictor: &amp;quot;&lt;predict(Cat'), State'&gt;&amp;quot; corresponds to a prediction of the category Cat' as a modifier of the category Cat and to the transition to State', in case a substructure headed by Cat' is actually found. This is modeled by introducing two new items in the set: a) &lt;Cat', 0, i, _&gt;, which represents the initial state of the transition graph of the category Cat', which will span a portion of the input starting at i. In Earley's terms, this item corresponds to all the dotted rules of the form Cat'(. α).</Paragraph>
    <Paragraph position="34"> b) &lt;Cat, State', j, Cat'&gt;, which represents the arc of the transition graph of the category Cat entering the state State' and labelled Cat'. In Earley's terms, this item corresponds to a dotted rule of the form Cat(α . Cat' β). The items including a non-null Depcat are just passive receptors, waiting to be reactivated later when (and if) the recognition of the hypothesized substructure has been successfully completed.</Paragraph>
    <Paragraph position="35"> scanner: &amp;quot;&lt;scan, State'&gt;&amp;quot; results in inserting a new item &lt;Cat, State', j, _&gt; into the set Si+1.</Paragraph>
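    <Paragraph> The three phases can be assembled into a compact, self-contained sketch. This is our own simplification, not the paper's table-driven algorithm: transitions are computed on the fly from dotted-rule suffixes instead of being read from precompiled parse tables, and the first-set filtering of the predictor is omitted, so more items are built while the same language is accepted (under our representation assumptions: rules as lists of symbol strings, "#" for the head, "X*" for a repeatable dependent).

```python
def recognize(words, rules, lexicon, roots):
    """Simplified Earley-type dependency recognizer. Items are tuples
    (cat, state, pos, depcat); a state is a frozenset of the symbol
    strings still to be read. No parse tables: successor states are
    computed directly, and predictions are not filtered by firsts."""
    def close(strs):
        # star closure: an optional (starred) symbol may be skipped
        out = set(strs)
        work = list(out)
        while work:
            ds = work.pop()
            if ds and ds[0].endswith("*") and ds[1:] not in out:
                out.add(ds[1:])
                work.append(ds[1:])
        return frozenset(out)

    def step(state, y):
        # successor state after reading y; a starred symbol may repeat
        nxt = set()
        for ds in state:
            if ds and ds[0].rstrip("*") == y:
                nxt.add(ds if ds[0].endswith("*") else ds[1:])
        return close(nxt)

    start = {c: close(tuple(r) for r in rhss) for c, rhss in rules.items()}
    n = len(words)
    S = [set() for _ in range(n + 1)]
    for v in roots:                       # initialization
        S[0].add((v, start[v], 0, None))
    for i in range(n + 1):
        cats = lexicon[words[i]] if i != n else set()
        changed = True
        while changed:                    # fixpoint over the item sets
            sizes = [len(s) for s in S]
            for (cat, state, j, dep) in list(S[i]):
                if dep is not None:
                    continue              # passive item, awaits completion
                if () in state:           # completer
                    for (c2, s2, j2, d2) in list(S[j]):
                        if d2 == cat:
                            S[i].add((c2, s2, j2, None))
                for y in {ds[0].rstrip("*") for ds in state if ds}:
                    if y == "#":          # scanner
                        if cat in cats:
                            S[i + 1].add((cat, step(state, "#"), j, None))
                    else:                 # predictor
                        S[i].add((y, start[y], i, None))
                        S[i].add((cat, step(state, y), j, y))
            changed = sizes != [len(s) for s in S]
    return any(c in roots and () in st and j == 0
               for (c, st, j, d) in S[n] if d is None)
```

Run on G1, the example sentence of the trace below is accepted, while an ill-formed permutation such as "saw I" is rejected.
</Paragraph>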
    <Paragraph position="36"> Let us trace the recognition of the sentence &amp;quot;I saw a tall old man in the park with a telescope&amp;quot;. The first set S0 (fig. 6) includes three items: the first one, &lt;V, 0, 0, _&gt;, is produced by the initialization; the next two, &lt;V, 1, 0, N&gt; and &lt;N, 0, 0, _&gt;, are produced by the predictor (an N-headed subtree beginning at position 0 must be recognized and, in case such a recognition occurs, the governing V can pass to state 1).</Paragraph>
    <Paragraph position="37"> In S1 the first item, &lt;N, $2, 0, _&gt;, is produced by the scanner: it is the result of advancing on the input string according to the item &lt;N, 0, 0, _&gt; in S0 with an input noun &amp;quot;I&amp;quot; (the entry &lt;N, 0&gt; x N in the parse table PTN contains &lt;scan, $2&gt;). The next item, &lt;V, 1, 0, _&gt;, is produced by applying the completer to the item &lt;V, 1, 0, N&gt; in S0.</Paragraph>
    <Paragraph position="38"> S2 contains the item &lt;V, $2, 0, _&gt;, obtained by the scanner, which advances on the verb &amp;quot;saw&amp;quot;. The other four items are the result of a double application of the predictor, which, in a sense, builds a &amp;quot;chain&amp;quot; that consists of a noun governed by the root verb and a determiner governed by that noun; this is the only way, according to the grammar, to accommodate an incoming determiner while a verb is under analysis.</Paragraph>
    <Paragraph position="39"> The subsequent steps can easily be traced by the reader. The input sentence is accepted because of the appearance in the last set of the item &lt;V, $3, 0, _&gt;, encoding that a structure headed by a verb (i.e. a root category), ending in a final state ($3), and covering all the words from the beginning of the sentence has been successfully recognized.</Paragraph>
    <Paragraph position="40"> The space complexity of the recognizer is O(|G| n²). Each item is a quadruple &lt;Cat, State, Position, Depcat&gt;: Depcat is a constant of the grammar; the pairs of Cat and State are bounded by O(|G|); (fig. 6 here: the sets of items S0 through S12 built during the recognition of the example sentence) Position is bounded by O(n). The number of such quadruples in a set of items is bounded by O(|G| n), and there are n+1 sets of items.</Paragraph>
    <Paragraph position="41"> The time complexity of the recognizer is O(|G|² n³). The scanner and predictor phases execute at most O(|G|) actions per item; the items are at most O(|G| n²), so the cost of these two phases over the whole algorithm is O(|G|² n²). The completer phase executes at most one action per pair of items. The variables of such a pair of items are the two states (O(|G|²)), the two sets that contain them (O(n²)), and the two positions (O(n²)). But the pairs considered are not all the possible pairs: one of the set indices coincides with one of the positions, so the complexity of the completer is O(|G|² n³). The completer phase prevails over the other two, and the total complexity of the algorithm is O(|G|² n³).</Paragraph>
    <Paragraph position="42"> Even though the O-analysis is equivalent to Earley's, the precompilation into the parse tables saves much of the computation time needed by the predictor: all the possible predictions are precomputed in the transition to a new state. A similar device is presented in (Schabes 1990) for context-free grammars.</Paragraph>
  </Section>
</Paper>