<?xml version="1.0" standalone="yes"?>
<Paper uid="P85-1011">
  <Title>SOME COMPUTATIONAL PROPERTIES OF TREE ADJOINING GRAMMARS*</Title>
  <Section position="4" start_page="82" end_page="85" type="metho">
    <SectionTitle>
2. TREE ADJOINING GRAMMARS--TAG's
</SectionTitle>
    <Paragraph position="0"> We now introduce tree adjoining grammars (TAG's). TAG's are more powerful than CFG's, both weakly and strongly (footnote 1). TAG's were first introduced in [Joshi, Levy, and Takahashi, 1975] and [Joshi, 1983]. We include their description in this section to make the paper self-contained.</Paragraph>
    <Paragraph position="1"> We can define a tree adjoining grammar as follows. A tree adjoining grammar G is a pair (I, A) where I is a set of initial trees, and A is a set of auxiliary trees.</Paragraph>
    <Paragraph position="2"> A tree $\alpha$ is an initial tree if it is of the form</Paragraph>
    <Paragraph position="4"> That is, the root node of $\alpha$ is labelled S and the frontier nodes are all terminal symbols. The internal nodes are all non-terminals.</Paragraph>
    <Paragraph position="5"> A tree $\beta$ is an auxiliary tree if it is of the form</Paragraph>
    <Paragraph position="7"> That is, the root node of $\beta$ is labelled with a non-terminal X and the frontier nodes are all labelled with terminal symbols except one which is labelled X. The node labelled by X on the frontier will be called the foot node of $\beta$. The frontiers of initial trees belong to $\Sigma^*$, whereas the frontiers of the auxiliary trees belong to $\Sigma^* N \Sigma^+ \cup \Sigma^+ N \Sigma^*$.</Paragraph>
    <Paragraph position="8"> We will now define a composition operation called adjoining (or adjunction), which composes an auxiliary tree $\beta$ with a tree $\gamma$. Let $\gamma$ be a tree with a node n labelled X and let $\beta$ be an auxiliary tree with the root labelled with the same symbol X. (Note that $\beta$ must have, by definition, a node (and only one) labelled X on the frontier.) Adjoining can now be defined as follows. If $\beta$ is adjoined to $\gamma$ at the node n then the resulting tree $\gamma'$ is as shown in Fig. 2.1 below. [Footnote 1: Grammars $G_1$ and $G_2$ are weakly equivalent if the string language of $G_1$, $L(G_1)$, equals the string language of $G_2$, $L(G_2)$. $G_1$ and $G_2$ are strongly equivalent if they are weakly equivalent and, for each w in $L(G_1) = L(G_2)$, both $G_1$ and $G_2$ assign the same structural description to w. A grammar G is weakly adequate for a (string) language L if $L(G) = L$. G is strongly adequate for L if $L(G) = L$ and for each w in L, G assigns an "appropriate" structural description to w. The notion of strong adequacy is undoubtedly not precise because it depends on the notion of appropriate structural descriptions.]</Paragraph>
    <Paragraph position="10"> The tree t dominated by X in $\gamma$ is excised, $\beta$ is inserted at the node n in $\gamma$ and the tree t is attached to the foot node (labelled X) of $\beta$, i.e., $\beta$ is inserted or adjoined to the node n in $\gamma$, pushing t downwards. Note that adjoining is not a substitution operation.</Paragraph>
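    <Paragraph> The adjoining operation described above can be sketched in code. The following is a minimal illustration only, not the authors' implementation: the names Node, find_foot, adjoin and frontier are hypothetical, a node is addressed by a path of child indices, and the toy grammar in the usage note (an initial tree S over e and an auxiliary tree S over a, S, b) is an assumption chosen for brevity rather than one of the paper's examples.

```python
from dataclasses import dataclass, field
from typing import List, Tuple
import copy


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    foot: bool = False  # True only for the foot node of an auxiliary tree


def find_foot(tree: Node) -> Node:
    """Return the foot node of an auxiliary tree (raises if there is none)."""
    if tree.foot:
        return tree
    for child in tree.children:
        try:
            return find_foot(child)
        except ValueError:
            continue
    raise ValueError("no foot node found")


def adjoin(gamma: Node, path: Tuple[int, ...], beta: Node) -> Node:
    """Adjoin auxiliary tree beta at the node of gamma reached by path.

    Both inputs are copied, so the original trees are left unchanged.
    """
    gamma, beta = copy.deepcopy(gamma), copy.deepcopy(beta)
    parent, target = None, gamma
    for index in path:
        parent, target = target, target.children[index]
    if target.label != beta.label:
        raise ValueError("root label of beta must match the label at n")
    foot = find_foot(beta)
    # The excised subtree t is re-attached below the foot node of beta.
    foot.children, foot.foot = target.children, target.foot
    if parent is None:
        return beta  # adjoining at the root: beta becomes the new root
    parent.children[path[-1]] = beta
    return gamma


def frontier(tree: Node) -> str:
    """Concatenate leaf labels from left to right."""
    if not tree.children:
        return tree.label
    return "".join(frontier(child) for child in tree.children)
```

With alpha = Node("S", [Node("e")]) and beta = Node("S", [Node("a"), Node("S", foot=True), Node("b")]), adjoining beta at the root of alpha and then again at the inner S node (path (1,)) pushes the subtree over e one level further down each time.</Paragraph>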
    <Paragraph position="11"> We will now define T(G): The set of all trees derived in G starting from initial trees in I. This set will be called the tree set of G.</Paragraph>
    <Paragraph position="12"> L(G): The set of all terminal strings which appear in the frontier of the trees in T(G). This set will be called the string language (or language) of G. If L is the string language of a TAG G then we say that L is a Tree-Adjoining Language (TAL). The relationship between TAG's, context-free grammars, and the corresponding string languages can be summarized as follows ([Joshi, Levy, and Takahashi, 1975], [Joshi, 1983]).</Paragraph>
    <Paragraph position="13"> Theorem 2.1: For every context-free grammar, G', there is an equivalent TAG, G, both weakly and strongly.</Paragraph>
    <Paragraph position="14"> Theorem 2.2: For every TAG, G, we have the following situations: a. L(G) is context-free and there is a context-free grammar G' that is strongly (and therefore weakly) equivalent to G.</Paragraph>
    <Paragraph position="15"> b. L(G) is context-free and there is no context-free grammar G' that is strongly equivalent to G. Of course, there must be a context-free grammar that is weakly equivalent to G.</Paragraph>
    <Paragraph position="16"> c. L(G) is strictly context-sensitive. Obviously in this case, there is no context-free grammar that is weakly equivalent to G.</Paragraph>
    <Paragraph position="18"> Parts (a) and (c) of Theorem 2.2 appear in [Joshi, Levy, and Takahashi, 1975]. Part (b) is implicit in that paper, but it is important to state it explicitly as we have done here because of its linguistic significance. Example 2.1 illustrates part (a). We will now illustrate parts (b) and (c).</Paragraph>
    <Paragraph position="19"> Example 2.2: Let G = (I, A) where</Paragraph>
    <Paragraph position="21"/>
    <Paragraph position="23"> [Figure: derivation in G. One auxiliary tree is adjoined at S as indicated in $\gamma_0$; another is adjoined at T as indicated in the next tree of the derivation.]</Paragraph>
    <Paragraph position="24"> Clearly, L(G), the string language of G, is $L = \{ a^n e b^n \mid n \ge 0 \}$, which is a context-free language. Thus, there must exist a context-free grammar, G', which is at least weakly equivalent to G. It can be shown, however, that there is no context-free grammar G' which is strongly equivalent to G, i.e., with T(G) = T(G'). This follows from the fact that the set T(G) (the tree set of G) is non-recognizable, i.e., there is no finite state bottom-up tree automaton that can recognize precisely T(G). Thus a TAG may generate a context-free language, yet assign structural descriptions to the strings that cannot be assigned by any context-free grammar.</Paragraph>
    <Paragraph position="26"> The precise definition of L(G) is as follows: $L(G) = L_1 = \{ w e c^n \mid n \ge 0$, w is a string of a's and b's such that (1) the number of a's = the number of b's = n, and (2) for any initial substring of w, the number of a's $\ge$ the number of b's $\}$. $L_1$ is a strictly context-sensitive language (i.e., a context-sensitive language that is not context-free). This can be shown as follows. Intersecting $L_1$ with the regular language $a^* b^* e c^*$ results in the language $L_2 = \{ a^n b^n e c^n \mid n \ge 0 \} = L_1 \cap a^* b^* e c^*$. $L_2$ is a well-known strictly context-sensitive language. The result of intersecting a context-free language with a regular language is always a context-free language; hence, $L_1$ is not a context-free language. It is thus a strictly context-sensitive language. Example 2.3 thus illustrates part (c) of Theorem 2.2.</Paragraph>
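    <Paragraph> The two languages used in this argument can be made concrete with a pair of membership checks. This is an illustrative sketch under the reading that the first language consists of strings w e c^n (w over a and b, with n a's and n b's, and no prefix of w containing more b's than a's) and that the second is its intersection with a* b* e c*. The function names in_L1 and in_L2 are hypothetical.

```python
def in_L1(s: str) -> bool:
    """Check s = w e c^n with #a(w) = #b(w) = n and #a >= #b in every prefix of w."""
    if s.count("e") != 1:
        return False
    w, tail = s.split("e")
    if set(w) - {"a", "b"} or set(tail) - {"c"}:
        return False
    n = len(tail)
    balance = 0  # (#a - #b) over the prefix of w read so far
    for ch in w:
        balance += 1 if ch == "a" else -1
        if 0 > balance:
            return False  # some prefix has more b's than a's
    return balance == 0 and w.count("a") == n


def in_L2(s: str) -> bool:
    """Check s = a^n b^n e c^n, i.e. the first language restricted to a* b* e c*."""
    n = s.count("c")
    return s == "a" * n + "b" * n + "e" + "c" * n
```

For instance, ababecc satisfies in_L1 but not in_L2, because its a's and b's are mixed, whereas aabbecc satisfies both.</Paragraph>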
    <Paragraph position="27"> TAG's have more power than CFG's. However, the extra power is quite limited. The language $L_1$ has equal numbers of a's, b's and c's; however, the a's and b's are mixed in a certain way. The language $L_2$ is similar to $L_1$, except that the a's come before all the b's. TAG's as defined so far are not powerful enough to generate $L_2$. This can be seen as follows. Clearly, for any TAG for $L_2$, each initial tree must contain equal numbers of a's, b's and c's (including zero), and each auxiliary tree must also contain equal numbers of a's, b's and c's. Further, in each case the a's must precede the b's. Then it is easy to see from the grammar of Example 2.3 that it will not be possible to avoid getting the a's and b's mixed. However, $L_2$ can be generated by a TAG with local constraints (see Section 2.1). The so-called copy language</Paragraph>
    <Paragraph position="28"> $L = \{ w e w \mid w \in \{a, b\}^* \}$ also cannot be generated by a TAG; however, again, it can be generated by a TAG with local constraints. It is thus clear that TAG's can generate more than context-free languages. It can be shown that TAG's cannot generate all context-sensitive languages [Joshi, 1984].</Paragraph>
    <Paragraph position="29"> Although TAG's are more powerful than CFG's, this extra power is highly constrained and apparently it is just the right kind for characterizing certain structural descriptions. TAG's share almost all the formal properties of CFG's (more precisely, the corresponding classes of languages), as we shall see in Section 4 of this paper and in [Vijay-Shankar and Joshi, 1985]. In addition, the string languages of TAG's can also be parsed in polynomial time, in particular in $O(n^6)$. The parsing algorithm is described in detail in Section 3.</Paragraph>
    <Paragraph position="30"> 2.1. TAG's with Local Constraints on Adjoining The adjoining operation as defined in Section 2.1 is "context-free". An auxiliary tree, say,</Paragraph>
    <Paragraph position="32"> is adjoinable to a tree t at a node, say, n, if the label of that node is X. Adjoining does not depend on the context (tree context) around the node n. In this sense, adjoining is context-free.</Paragraph>
    <Paragraph position="33"> In [Joshi, 1983], local constraints on adjoining similar to those investigated by [Joshi and Levy, 1977] were considered. These are a generalization of the context-sensitive constraints studied by [Peters and Ritchie, 1969]. It was soon recognized, however, that the full power of these constraints was never fully utilized, both in the linguistic context as well as in the "formal languages" of TAG's.</Paragraph>
    <Paragraph position="34"> The so-called proper analysis contexts and domination contexts (as defined in [Joshi and Levy, 1977]) as used in [Joshi, 1983] always turned out to be such that the context elements were always in a specific elementary tree, i.e., they were further localized by being in the same elementary tree. Based on this observation and a suggestion in [Joshi, Levy and Takahashi, 1975], we will describe a new way of introducing local constraints. This approach not only captures the insight stated above, but it is truly in the spirit of TAG's. The earlier approach was not so, although it was certainly adequate for the investigation in [Joshi, 1983]. A precise characterization of that approach still remains an open problem.</Paragraph>
    <Paragraph position="35"> G = (I, A) is a TAG with local constraints if for each elementary tree $t \in I \cup A$, and for each node, n, in t, we specify the set C of auxiliary trees that can be adjoined at the node n. Note that if there is no constraint then all auxiliary trees are adjoinable at n (of course, only those whose root has the same label as the label of the node n). Thus, in general, C is a subset of the set of all the auxiliary trees adjoinable at n.</Paragraph>
    <Paragraph position="36"> We will adopt the following conventions.</Paragraph>
    <Paragraph position="37"> 1. Since, by definition, no auxiliary trees are adjoinable to a node labelled by a terminal symbol, no constraint has to be stated for a node labelled by a terminal.</Paragraph>
    <Paragraph position="38"> 2. If there is no constraint, i.e., all auxiliary trees (with the appropriate root label) are adjoinable at a node, say, n, then we will not state this explicitly.</Paragraph>
    <Paragraph position="39"> 3. If no auxiliary trees are adjoinable at a node n, then we will write the constraint as ($\phi$), where $\phi$ denotes the null set.</Paragraph>
    <Paragraph position="40"> We will also allow for the possibility that for a node at least one adjoining is obligatory, of course, from the set of all possible auxiliary trees adjoinable at that node.</Paragraph>
    <Paragraph position="41"> Hence, a TAG with local constraints is defined as follows. G = (I, A) is a TAG with local constraints if for each node, n, in each tree t, we specify one (and only one) of the following constraints.</Paragraph>
    <Paragraph position="42"> 1. Selective Adjoining (SA): Only a specified subset of the set of all auxiliary trees is adjoinable at n. SA is written as (C), where C is a subset of the set of all auxiliary trees adjoinable at n.</Paragraph>
    <Paragraph position="43"> If C equals the set of all auxiliary trees adjoinable at n, then we do not explicitly state this at the node n.</Paragraph>
    <Paragraph position="44"> 2. Null Adjoining (NA): No auxiliary tree is adjoinable at the node n. NA will be written as ($\phi$).</Paragraph>
    <Paragraph position="45"> 3. Obligatory Adjoining (OA): At least one (out of all the auxiliary trees adjoinable at n) must be adjoined at n.</Paragraph>
    <Paragraph position="46"> OA is written as (OA), or as O(C) where C is a subset of the set of all auxiliary trees adjoinable at n.</Paragraph>
    <Paragraph position="48"> In $\alpha_1$ no auxiliary trees can be adjoined to the root node. Only $\beta_1$ is adjoinable to the left S node at depth 1 and only $\beta_2$ is adjoinable to the right S node at depth 1. In $\beta_1$ only $\beta_1$ is adjoinable at the root node and no auxiliary trees are adjoinable at the foot node. Similarly for $\beta_2$.</Paragraph>
    <Paragraph position="49"> We must now modify our definition of adjoining to take care of the local constraints. Given a tree $\gamma$ with a node, say, n, labelled A and given an auxiliary tree, say, $\beta$, with the root node labelled A, we define adjoining as follows. $\beta$ is adjoinable to $\gamma$ at the node n if $\beta \in C$, where C is the constraint associated with the node n in $\gamma$. The result of adjoining $\beta$ to $\gamma$ will be as defined earlier, except that the constraint C associated with n will be replaced by C', the constraint associated with the root node of $\beta$, and by C'', the constraint associated with the foot node of $\beta$. Thus, given</Paragraph>
    <Paragraph position="51"> We also adopt the convention that any derived tree with a node which has an OA constraint associated with it will not be included in the tree set associated with a TAG, G. The string language L of G is then defined as the set of all terminal strings of all trees derived in G (starting with initial trees) which have no OA constraints left in them.</Paragraph>
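    <Paragraph> One possible way to represent the three kinds of local constraint is sketched below. It is an assumption-laden illustration, not part of the original formalism: auxiliary trees are referred to by hypothetical names such as "beta1", the root-label check is assumed to happen elsewhere, and the names Constraint, adjoinable and is_complete are invented for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Constraint:
    # allowed is None when no constraint is stated (any auxiliary tree with a
    # matching root label may be adjoined); an empty set encodes NA.
    allowed: Optional[FrozenSet[str]] = None
    # obligatory=True encodes OA, written (OA) or O(C) in the text.
    obligatory: bool = False


def adjoinable(constraint: Constraint, tree_name: str) -> bool:
    """True if the named auxiliary tree passes the constraint at a node."""
    return constraint.allowed is None or tree_name in constraint.allowed


def is_complete(constraints: Iterable[Constraint]) -> bool:
    """A derived tree counts only if no OA constraint is left on any node."""
    return not any(c.obligatory for c in constraints)
```

In this encoding, adjoining replaces the constraint at the target node by the constraints stored on the root and foot of the auxiliary tree, so an OA constraint disappears from the derived tree exactly when an adjunction takes place at that node (unless the auxiliary tree itself carries one).</Paragraph>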
    <Paragraph position="52"> Example 2.5: Let G = (I, A) be a TAG with local constraints where</Paragraph>
    <Paragraph position="54"> It is easy to see that G generates the string language</Paragraph>
    <Paragraph position="56"> Other languages such as $L' = \{ a^{n^2} \mid n \ge 1 \}$ and $L'' = \{ a^{2^n} \mid n \ge 1 \}$ also cannot be generated by TAG's. This is because the strings of a TAL grow linearly (for a detailed definition of the property called the "constant growth" property, see [Joshi, 1983]).</Paragraph>
    <Paragraph position="57"> For those familiar with [Joshi, 1983], it is worth pointing out that the SA constraint is only abbreviating, i.e., it does not affect the power of TAG's. The NA and OA constraints, however, do affect the power of TAG's. This way of looking at local constraints has not only greatly simplified their statement, but it has also allowed us to capture the insight that the "locality" of the constraint is statable in terms of the elementary trees themselves! 2.2. Simple Linguistic Examples We now give a couple of linguistic examples. Readers may refer to [Kroch and Joshi, 1985] for details.</Paragraph>
    <Paragraph position="58"> 1. Starting with $\gamma_1 = \alpha_1$, which is an initial tree, and then adjoining $\beta_1$ (with appropriate lexical insertions) at the indicated node in $\alpha_1$, we obtain $\gamma_2$.</Paragraph>
    <Paragraph position="60"> [Figure label: "The girl who met Bill is a senior".] 2. Starting with the initial tree $\gamma_1 = \alpha_2$ and adjoining $\beta_2$ at the indicated node in $\alpha_2$, we obtain $\gamma_2$.</Paragraph>
    <Paragraph position="62"> [Figure labels: "John persuaded Bill to invite Mary"; "John persuaded Bill S".] Note that the initial tree $\alpha_2$ is not a matrix sentence. In order for it to become a matrix sentence, it must undergo an adjunction at its root node, for example, by the auxiliary tree $\beta_2$ as shown above. Thus, for $\alpha_2$ we will specify a local constraint O($\beta_2$) for the root node, indicating that $\alpha_2$ is required to undergo an adjunction at the root node by the auxiliary tree $\beta_2$. In a fuller grammar there will be, of course, some alternatives in the scope of O().</Paragraph>
  </Section>
  <Section position="5" start_page="85" end_page="87" type="metho">
    <SectionTitle>
3. PARSING TREE-ADJOINING
LANGUAGES
</SectionTitle>
    <Paragraph position="0"> 3.1. Definitions We will give a few additional definitions. These are not necessary for defining derivations in a TAG as defined in Section 2. However, they are introduced to help explain the parsing algorithm and the proofs for some of the closure properties of TAL's. DEFINITION 3.1 Let $\gamma$, $\gamma'$ be two trees. We say $\gamma \vdash \gamma'$ if in $\gamma$ we adjoin an auxiliary tree to obtain $\gamma'$.</Paragraph>
    <Paragraph position="1"> $\vdash^*$ is the reflexive, transitive closure of $\vdash$.</Paragraph>
    <Paragraph position="2"> DEFINITION 3.2 $\gamma'$ is called a derived tree if $\gamma \vdash^* \gamma'$ for some elementary tree $\gamma$. We then say $\gamma' \in D(\gamma)$.</Paragraph>
    <Paragraph position="3"> The frontier of any derived tree $\gamma$ belongs either to $\Sigma^* N \Sigma^+ \cup \Sigma^+ N \Sigma^*$ if $\gamma \in D(\beta)$ for some auxiliary tree $\beta$, or to $\Sigma^*$ if $\gamma \in D(\alpha)$ for some initial tree $\alpha$. Note that if $\gamma \in D(\alpha)$ for some initial tree $\alpha$, then $\gamma$ is also a sentential tree.</Paragraph>
    <Paragraph position="4"> If $\beta$ is an auxiliary tree, $\gamma \in D(\beta)$ and the frontier of $\gamma$ is $w_1 X w_2$ (X is a nonterminal, $w_1, w_2 \in \Sigma^*$), then the leaf node having this non-terminal symbol X at the frontier is called the foot of $\gamma$. Sometimes we will loosely use the phrase "adjoining with a derived tree" $\gamma \in D(\beta)$ for some auxiliary tree $\beta$. What we mean is the following: suppose we adjoin $\beta$ at some node and then adjoin within $\beta$ and so on; we can instead derive the desired derived tree in $D(\beta)$ using the same adjoining sequence, and use this resulting tree to "adjoin" at the original node.</Paragraph>
    <Section position="1" start_page="85" end_page="87" type="sub_section">
      <SectionTitle>
3.2. The Parsing Algorithm
</SectionTitle>
      <Paragraph position="0"> The algorithm we present here to parse Tree-Adjoining Languages (TALs) is a modification of the CYK algorithm (which is described in detail in [Aho and Ullman, 1973]), which uses a dynamic programming technique to parse CFL's. For the sake of making our description of the parsing algorithm simpler, we shall present the algorithm for parsing without considering local constraints. We will later show how to handle local constraints.</Paragraph>
      <Paragraph position="1"> We shall assume that any node in the elementary trees in the grammar has at most two children. This assumption can be made without any loss of generality, because it can be easily shown that for any TAG G there is an equivalent TAG $G_1$ such that any node in any elementary tree in $G_1$ has at most two children. A similar assumption is made in the CYK algorithm. We use the terms ancestor and descendant throughout the paper as transitive and reflexive relations; for example, the foot node may be called the ancestor of the foot node.</Paragraph>
      <Paragraph position="2"> The algorithm works as follows. Let $a_1 \ldots a_n$ be the input to be parsed. We use a four-dimensional array A; each element of the array contains a subset of the nodes of derived trees. We say a node X of a derived tree $\gamma$ belongs to A[i,j,k,l] if X dominates a sub-tree of $\gamma$ whose frontier is given by either $a_{i+1} \ldots a_j \; Y \; a_{k+1} \ldots a_l$ (where the foot node of $\gamma$ is labelled by Y) or $a_{i+1} \ldots a_l$ (i.e., j = k; this corresponds to the case when $\gamma$ is a sentential tree). The indices (i,j,k,l) refer to the positions between the input symbols and range over 0 through n. If i = 5, say, then it refers to the gap between $a_5$ and $a_6$.</Paragraph>
      <Paragraph position="3"> Initially, we fill A[i,i+1,i+1,i+1] with those nodes in the frontier of the elementary trees whose label is the same as the input $a_{i+1}$, for $0 \le i \le n-1$. The foot nodes of auxiliary trees will belong to all A[i,i,j,j] such that $i \le j$.</Paragraph>
      <Paragraph position="4"> We are now in a position to fill in all the elements of the array A. There are five cases to be considered.</Paragraph>
      <Paragraph position="5"> Case 1. We know that if a node X in a derived tree is the ancestor of the foot node, and node Y is its right sibling, such that $X \in$ A[i,j,k,l] and $Y \in$ A[l,m,m,n], then their parent, say, Z should belong to A[i,j,k,n], see Fig 3.1a.</Paragraph>
      <Paragraph position="6"> Case 2. If the right sibling Y is the ancestor of the foot node such that it belongs to A[l,m,n,p] and its left sibling X belongs to A[i,j,j,l], then we know that the parent Z of X and Y belongs to A[i,m,n,p], see Fig 3.1b. Case 3. If neither X nor its right sibling Y is the ancestor of the foot node (or there is no foot node), then if $X \in$ A[i,j,j,l] and $Y \in$ A[l,m,m,n], their parent Z belongs to A[i,j,j,n].</Paragraph>
      <Paragraph position="7"> Case 4. If a node Z has only one child X, and if $X \in$ A[i,j,k,l], then obviously $Z \in$ A[i,j,k,l].</Paragraph>
      <Paragraph position="8"> Case 5. If a node $X \in$ A[i,j,k,l], and the root Y of a derived tree $\gamma$ having the same label as that of X belongs to A[m,i,l,n], then adjoining $\gamma$ at X makes the resulting node to be in A[m,j,k,n], see Fig</Paragraph>
      <Paragraph position="10"> Although we have stated that the elements of the array contain a subset of the nodes of derived trees, what really goes in there are the addresses of nodes in the elementary trees. Thus the size of any set is bounded by a constant, determined by the grammar. It is hoped that the presentation of the algorithm below will make it clear why we do so.</Paragraph>
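      <Paragraph> The index bookkeeping in Cases 1, 2, 3 and 5 can be written as small functions on index quadruples. This is only a sketch of how the spans combine, under the reading of the cases given above; it ignores node labels and the grammar entirely, and the names Span, case1, case2, case3 and case5 are hypothetical. Each function returns the quadruple of the resulting node, or None when the two spans do not fit together.

```python
from typing import Optional, Tuple

Span = Tuple[int, int, int, int]  # (i, j, k, l): positions between input symbols


def case1(x: Span, y: Span) -> Optional[Span]:
    """Left child X dominates the foot; right sibling Y has no foot gap."""
    (i, j, k, l), (l2, m, m2, n) = x, y
    if l == l2 and m == m2:
        return (i, j, k, n)
    return None


def case2(x: Span, y: Span) -> Optional[Span]:
    """Right child Y dominates the foot; left sibling X has no foot gap."""
    (i, j, j2, l), (l2, m, n, p) = x, y
    if j == j2 and l == l2:
        return (i, m, n, p)
    return None


def case3(x: Span, y: Span) -> Optional[Span]:
    """Neither child dominates a foot node."""
    (i, j, j2, l), (l2, m, m2, n) = x, y
    if j == j2 and m == m2 and l == l2:
        return (i, j, j, n)
    return None


def case5(x: Span, y: Span) -> Optional[Span]:
    """Adjoin a derived auxiliary tree whose root has span y at a node with span x."""
    (i, j, k, l), (m, i2, l2, n) = x, y
    if i == i2 and l == l2:
        return (m, j, k, n)
    return None
```

Case 4 needs no function of its own here, since the single child and its parent share the same quadruple.</Paragraph>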
      <Paragraph position="11"> 3.3. The Algorithm The complete algorithm is given below. [...] (a) Case 1 corresponds to the case where the left sibling is the ancestor of the foot node. The parent is put in A[i,j,k,l] if the left sibling is in A[i,j,k,m] and the right sibling is in A[m,p,p,l], where $k \le m &lt; l$, $m \le p$, $p \le l$. Therefore Case 1 is written as
        for m = k to l-1 step 1 do
          for p = m to l step 1 do
            if there is a left sibling in A[i,j,k,m] and a right sibling in A[m,p,p,l]
            satisfying the appropriate restrictions, then put their parent in A[i,j,k,l].</Paragraph>
      <Paragraph position="12"> (b) Case 2 corresponds to the case where the right sibling is the ancestor of the foot node. If the left sibling is in A[i,m,m,p] and the right sibling is in A[p,j,k,l], with $i \le m &lt; p$ and $p \le j$, then we put their parent in A[i,j,k,l]. This may be written as
        for m = i to j-1 step 1 do
          for p = m+1 to j step 1 do
            for all left siblings in A[i,m,m,p] and right siblings in A[p,j,k,l]
            satisfying the appropriate restrictions, put their parent in A[i,j,k,l].
        (c) Case 3 corresponds to the case where neither child is an ancestor of the foot node. If the left sibling is in A[i,j,j,m] and the right sibling is in A[m,p,p,l], then we can put the parent in A[i,j,j,l] if it is the case that ($i &lt; j \le m$ or $i \le j &lt; m$) and ($m &lt; p \le l$ or $m \le p &lt; l$). This may be written as
        for m = j to l-1 step 1 do
          for p = m to l step 1 do
            for all left siblings in A[i,j,j,m] and right siblings in A[m,p,p,l]
            satisfying the appropriate restrictions, put their parent in A[i,j,j,l].
        (d) Case 4 corresponds to the case where a node Y has only one child X. If X is in A[i,j,k,l], then put Y in A[i,j,k,l]. Repeat Case 4 again if Y has no siblings.
        (e) Case 5 corresponds to adjoining. If X is a node in A[m,j,k,p] and Y is the root of an auxiliary tree with the same symbol as that of X, such that Y is in A[i,m,p,l], where ($i \le m \le p &lt; l$ or $i &lt; m \le p \le l$) and ($m &lt; j \le k \le p$ or $m \le j \le k &lt; p$), then X is put in A[i,j,k,l]. This may be written as
        for m = i to j step 1 do
          for p = m to l step 1 do
            if a node X is in A[m,j,k,p] and the root of an auxiliary tree is in A[i,m,p,l],
            then put X in A[i,j,k,l].</Paragraph>
    </Section>
    <Section position="2" start_page="87" end_page="87" type="sub_section">
      <SectionTitle>
3.4. Complexity of the Algorithm
</SectionTitle>
      <Paragraph position="0"> It is obvious that steps 10 through 15 (cases a-e) are completed in $O(n^2)$, because the different cases have at most two nested for loop statements, the iterating variables taking values in the range 0 through n. They are repeated at most $O(n^4)$ times, because of the four loop statements in steps 6 through 9. The initialization phase (steps 1 through 5) has a time complexity of $O(n + n^2) = O(n^2)$.</Paragraph>
      <Paragraph position="1"> Step 15 is completed in $O(n)$. Therefore, the time complexity of the parsing algorithm is $O(n^6)$.</Paragraph>
      <Paragraph position="2"> 3.5. Correctness of the Algorithm The main issue in proving the algorithm correct is to show that while computing the contents of an element of the array A, we must have already determined the contents of other elements of the array needed to correctly complete this entry. We can show this inductively by considering each case individually. We give an informal argument below.</Paragraph>
      <Paragraph position="3"> Case 1: We need to know the contents of A[i,j,k,m] and A[m,p,p,l], where $m &lt; l$ and $i &lt; m$, when we are trying to compute the contents of A[i,j,k,l]. Since l is the variable iterated in the outermost loop (step 6), we can assume (by induction hypothesis) that for all $m &lt; l$ and for all p, q, r, the contents of A[p,q,r,m] are already computed. Hence, the contents of A[i,j,k,m] are known. Similarly, for all $m > i$, and for all p, q, and $r \le l$, A[m,p,q,r] would have been computed. Thus, A[m,p,p,l] would also have been computed.</Paragraph>
      <Paragraph position="4"> Case 2: By a similar reasoning, the contents of A[i,m,m,p] and A[p,j,k,l] are known since $p &lt; l$ and $p > i$.</Paragraph>
      <Paragraph position="5"> Case 3: When we are trying to compute the contents of some A[i,j,j,l], we need to know the nodes in A[i,j,j,p] and A[p,q,q,l]. Note $j > i$ or $j &lt; l$. Hence, we know that the contents of A[i,j,j,p] and A[p,q,q,l] would have been computed already.</Paragraph>
      <Paragraph position="6"> Case 5: The contents of A[i,m,p,l] and A[m,j,k,p] must be known in order to compute A[i,j,k,l], where ($i \le m \le p &lt; l$ or $i &lt; m \le p \le l$) and ($m \le j \le k &lt; p$ or $m &lt; j \le k \le p$). Since either $m > i$ or $p &lt; l$, the contents of A[m,j,k,p] will be known.</Paragraph>
      <Paragraph position="7"> Similarly, since either $m &lt; j$ or $k &lt; p$, the contents of A[i,m,p,l] would have been computed.</Paragraph>
      <Paragraph position="8"> 3.6. Parsing with Local Constraints So far, we have assumed that the given grammar has no local constraints. If the grammar has local constraints, it is easy to modify the above algorithm to take care of them. Note that in Case 5, if an adjunction occurs at a node X, we add X again to the element of the array we are computing. This seems to be in contrast with our definition of how to associate local constraints with the nodes in a sentential tree. We should have added the root of the auxiliary tree instead to the element of the array being computed, since, so far as the local constraints are concerned, this node decides the local constraints at this node in the derived tree. However, this scheme cannot be adopted in our algorithm for obvious reasons. We let pairs of the form (X,C) belong to elements of the array, where X is as before and C represents the local constraints to be associated with this node.</Paragraph>
      <Paragraph position="9"> We then alter the algorithm as follows. If (X,$C_1$) refers to a node at which we attempt to adjoin with an auxiliary tree (whose root is denoted by (Y,$C_2$)), then adjunction would be determined by $C_1$. If adjunction is allowed, then we can add (X,$C_2$) in the corresponding element of the array. In cases 1 through 4, we do not attempt to add a new element if any one of the children has an obligatory constraint.</Paragraph>
      <Paragraph position="10"> Once it has been determined that the given string belongs to the language, we can find the parse in a way similar to the scheme adopted in the CYK algorithm. To make this process simpler and more efficient, we can use pointers from the new element added to the elements which caused it to be put there. For example, consider Case 1 of the algorithm (step 10). If we add a node Z to A[i,j,k,l] because of the presence of its children X and Y in A[i,j,k,m] and A[m,p,p,l] respectively, then we add pointers from this node Z in A[i,j,k,l] to the nodes X, Y in A[i,j,k,m] and A[m,p,p,l]. Once this has been done, the parse can be found by traversing the tree formed by these pointers.</Paragraph>
      <Paragraph position="11"> A parser based on the techniques described above is currently being implemented and will be reported at the time of presentation.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="87" end_page="90" type="metho">
    <SectionTitle>
4. CLOSURE PROPERTIES OF TAG's
</SectionTitle>
    <Paragraph position="0"> In this section, we present some closure results for TALs. We now informally sketch the proofs for the closure properties.</Paragraph>
    <Paragraph position="1"> Interested readers may refer to [Vijay-Shankar and Joshi, 1985] for the complete proofs.</Paragraph>
    <Section position="1" start_page="87" end_page="88" type="sub_section">
      <SectionTitle>
4.1. Closure under Union
</SectionTitle>
      <Paragraph position="0"> Let $G_1$ and $G_2$ be two TAGs generating $L_1$ and $L_2$ respectively.</Paragraph>
      <Paragraph position="1"> We can construct a TAG G such that $L(G) = L_1 \cup L_2$. Let $G_1 = (I_1, A_1, N_1, S)$ and $G_2 = (I_2, A_2, N_2, S)$. Without loss of generality, we may assume that $N_1 \cap N_2 = \phi$. Let $G = (I_1 \cup I_2, A_1 \cup A_2, N_1 \cup N_2, S)$. We claim that $L(G) = L_1 \cup L_2$. Let $x \in L_1 \cup L_2$. Then $x \in L_1$ or $x \in L_2$. If $x \in L_1$, then it must be possible to generate the string x in G, since $I_1$, $A_1$ are in G. Hence $x \in L(G)$. Similarly, if $x \in L_2$, we can show that $x \in L(G)$.</Paragraph>
      <Paragraph position="2"> Hence $L_1 \cup L_2 \subseteq L(G)$. If $x \in L(G)$, then x is derived using either only $I_1$, $A_1$ or only $I_2$, $A_2$, since $N_1 \cap N_2 = \phi$. Hence, $x \in L_1$ or $x \in L_2$. Thus, $L(G) \subseteq L_1 \cup L_2$. Therefore, $L(G) = L_1 \cup L_2$.</Paragraph>
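      <Paragraph> The union construction is a component-wise union of the two grammars, which can be sketched as follows. This is an illustration under assumptions: elementary trees are represented only by hypothetical names, the type TAG and the function union are invented for the example, and the shared start symbol S is treated as exempt from the disjointness requirement on the nonterminal sets.

```python
from typing import FrozenSet, NamedTuple


class TAG(NamedTuple):
    initial: FrozenSet[str]       # names of the initial trees (I)
    auxiliary: FrozenSet[str]     # names of the auxiliary trees (A)
    nonterminals: FrozenSet[str]  # N
    start: str                    # S


def union(g1: TAG, g2: TAG) -> TAG:
    """Build G = (I1 u I2, A1 u A2, N1 u N2, S) from two TAGs sharing S."""
    if g1.start != g2.start:
        raise ValueError("both grammars are assumed to share the start symbol S")
    # Assumption: S itself is allowed to occur in both nonterminal sets.
    shared = g1.nonterminals.intersection(g2.nonterminals) - {g1.start}
    if shared:
        raise ValueError("rename nonterminals so that the two sets are disjoint")
    return TAG(
        g1.initial | g2.initial,
        g1.auxiliary | g2.auxiliary,
        g1.nonterminals | g2.nonterminals,
        g1.start,
    )
```

The disjointness check mirrors the role that assumption plays in the proof: it is what prevents an auxiliary tree of one grammar from being adjoined into a tree of the other.</Paragraph>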
    </Section>
    <Section position="2" start_page="88" end_page="88" type="sub_section">
      <SectionTitle>
4.2. Closure under Concatenation
</SectionTitle>
      <Paragraph position="0"> Let $G_1 = (I_1, A_1, N_1, S_1)$ and $G_2 = (I_2, A_2, N_2, S_2)$ be two TAGs generating $L_1$, $L_2$ respectively, such that $N_1 \cap N_2 = \phi$. We can construct a TAG G = (I, A, N, S) such that $L(G) = L_1 \cdot L_2$. We choose S such that S is not in $N_1 \cup N_2$. We let $N = N_1 \cup N_2 \cup \{S\}$ and $A = A_1 \cup A_2$. For all $t_1 \in I_1$, $t_2 \in I_2$, we add $t_{12}$ to I, as shown in Fig 4.2.1. Therefore, $I = \{ t_{12} \mid t_1 \in I_1, t_2 \in I_2 \}$, where the nodes in the subtrees $t_1$ and $t_2$ of the tree $t_{12}$ have the same constraints associated with them as in the original grammars $G_1$ and $G_2$. It is easy to show that $L(G) = L_1 \cdot L_2$, once we note that there are no auxiliary trees in G rooted with the symbol S, and that $N_1 \cap N_2 = \phi$.</Paragraph>
      <Paragraph position="2"> Figure 4.2.1</Paragraph>
    </Section>
    <Section position="3" start_page="88" end_page="90" type="sub_section">
      <SectionTitle>
4.3. Cloeuru under Kle~ne gt.m~
</SectionTitle>
      <Paragraph position="0"> Let G t =, (iI,At,NI,S1) be a TAG generating L t. We can show that we can construct a TAG G such that L(G) -. Lt*. Let S be a symbol not in N t, and let N m N I U {S}. We let the set \[ of initial trees of G be (re} . where t e is the tree shown in Fig 4.3~. The set o( auxiliary tree, A is defined u A = {t~A / t t C/ It} UAt.</Paragraph>
      <Paragraph position="1"> The tree tlA is u shown in Fig 4.3b, with the coustraintm on the root of each tlA being the null adjoining constraint, an constraint~ on the foot, and the constraints on the nodes of the snbtreee t t of the tre~ ttA being the same sm thee for the corresponding nodes in the inithd tree t t of G I.</Paragraph>
      <Paragraph position="2"> To see why L(G) = L_1*, consider x ∈ L(G). Obviously, the tree derived (whose frontier is given by x) must be of the form shown in Fig 4.3c, where each t_i' is a sentential tree in G_1 such that t_i' ∈ D(t_i), for an initial tree t_i in G_1. Thus, L(G) ⊆ L_1*.</Paragraph>
      <Paragraph position="3"> On the other hand, if x ∈ L_1*, then x = w_1...w_n, with w_i ∈ L_1 for 1 ≤ i ≤ n. Let each w_i then be the frontier of the sentential tree t_i' of G_1 such that t_i' ∈ D(t_i), t_i ∈ I_1. Obviously, we can derive the tree T, using the initial tree t_e and a sequence of adjoining operations using the auxiliary trees t_iA for 1 ≤ i ≤ n. From T we can obviously obtain the tree T', the same as given by Fig 4.3c, using only the auxiliary trees in A_1. The frontier of T' is obviously w_1...w_n. Hence, x ∈ L(G). Therefore, L_1* ⊆ L(G). Thus L(G) = L_1*.</Paragraph>
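      <Paragraph>The derivation of T can be illustrated with a small Python sketch. Figures 4.3a to 4.3c are not reproduced in this text, so the shapes used here are assumptions: t_e is taken to be a root S over an empty leaf, and each t_iA is taken to be a root S whose left child is the foot node (written S*) and whose right child is the initial tree t_i. The names plug_foot, adjoin, make_aux and derive are hypothetical helpers, and adjoining inside the subtrees t_i is not modelled.

```python
def frontier(tree):
    """Concatenate leaf labels left to right; the empty leaf contributes nothing."""
    label, children = tree
    if not children:
        return label
    return "".join(frontier(child) for child in children)

def plug_foot(aux, subtree):
    """Return aux with its foot leaf (label ending in '*') replaced by subtree."""
    label, children = aux
    if not children:
        return subtree if label.endswith("*") else aux
    return (label, tuple(plug_foot(child, subtree) for child in children))

def adjoin(tree, path, aux):
    """Adjoin aux at the node of tree reached by the child indices in path."""
    if not path:
        return plug_foot(aux, tree)
    label, children = tree
    i = path[0]
    new_child = adjoin(children[i], path[1:], aux)
    return (label, children[:i] + (new_child,) + children[i + 1:])

T_E = ("S", (("", ()),))               # assumed shape of t_e

def make_aux(t):
    """Assumed shape of t_iA: root S, foot S* on the left, t_i on the right."""
    return ("S", (("S*", ()), t))

def derive(initial_trees):
    """Build T for w_1...w_n, adjoining below each root so roots stay untouched."""
    tree = T_E
    for depth, t in enumerate(reversed(initial_trees)):
        tree = adjoin(tree, (0,) * depth, make_aux(t))
    return tree

# Hypothetical initial trees of G_1 with frontiers "a" and "bb".
t_a = ("S1", (("a", ()),))
t_b = ("S1", (("b", ()), ("b", ())))
print(frontier(derive([t_a, t_b, t_a])))
```

Every adjunction after the first takes place at the node that came from a foot, never at a root of some t_iA, which is consistent with the null adjoining constraint placed on those roots.</Paragraph>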
      <Paragraph position="4">  Let L_T be a TAL and L_R be a regular language. Let G be a TAG generating L_T and M = (Q, Σ, δ, q_0, Q_F) be a finite state automaton recognizing L_R. We can construct a grammar G_1 and will show that L(G_1) = L_T ∩ L_R.</Paragraph>
      <Paragraph position="5"> Let α be an elementary tree in G. We shall associate with each node a quadruple (q_1,q_2,q_3,q_4) where q_1,q_2,q_3,q_4 ∈ Q. Let (q_1,q_2,q_3,q_4) be associated with a node X in α. Let us assume that α is an auxiliary tree, and that X is an ancestor of the foot node of α, and hence the ancestor of the foot node of any derived tree γ in D(α). Let Y be the label of the root and foot nodes of α. Let the frontier of γ (γ in D(α)) be w_1 w_2 Y w_3 w_4, and let the frontier of the subtree of γ rooted at Z, which corresponds to the node X in α, be w_2 Y w_3. The idea of associating (q_1,q_2,q_3,q_4) with X is that it must be the case that δ*(q_1, w_2) = q_2 and δ*(q_3, w_3) = q_4. When γ becomes a part of the sentential tree γ' whose frontier is given by u w_1 w_2 v w_3 w_4 w, then it must be the case that δ*(q_2, v) = q_3. Following this reasoning, we must make q_2 = q_3 if Z is not the ancestor of the foot node of γ, or if γ is in D(α) for some initial tree α in G.</Paragraph>
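      <Paragraph>The intended reading of a quadruple can be checked mechanically. The sketch below is an illustration under stated assumptions: the automaton is given as a dictionary delta from (state, symbol) pairs to states, the names delta_star and quadruple_holds are hypothetical, and the toy automaton (tracking the parity of the number of a's) is not taken from the paper.

```python
def delta_star(delta, state, word):
    """Extended transition function: run the automaton over word from state."""
    for symbol in word:
        state = delta[(state, symbol)]
    return state

def quadruple_holds(delta, quad, left, under_foot, right):
    """Check the three conditions tied to (q1, q2, q3, q4) for a node whose
    subtree frontier is left + Y + right, with under_foot derived below Y."""
    q1, q2, q3, q4 = quad
    return (delta_star(delta, q1, left) == q2
            and delta_star(delta, q2, under_foot) == q3
            and delta_star(delta, q3, right) == q4)

# Hypothetical DFA over {a, b}: the state records the parity of the a's read.
DELTA = {("even", "a"): "odd", ("odd", "a"): "even",
         ("even", "b"): "even", ("odd", "b"): "odd"}

print(quadruple_holds(DELTA, ("even", "odd", "odd", "even"), "ab", "bb", "a"))
print(quadruple_holds(DELTA, ("even", "even", "even", "even"), "ab", "bb", "a"))
```

The first call satisfies all three conditions, while the second fails already on the left part of the frontier, since reading ab from the state even leads to odd.</Paragraph>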
      <Paragraph position="6"> We have assumed here, as in the case of the parsing algorithm presented earlier, that any node in any elementary tree has at most two children.</Paragraph>
      <Paragraph position="7"> From G we can obtain G_1 as follows. For each initial tree α, associate with the root the quadruple (q_0, q, q, q_f) where q_0 is the initial state of the finite state automaton M, and q_f ∈ Q_F. For each auxiliary tree β of G, associate with the root the quadruple (q_1,q_2,q_3,q_4), where q, q_1, q_2, q_3, q_4 are some variables which will later be given values from Q. Let X be some node in some elementary tree α. Let (q_1,q_2,q_3,q_4) be associated with X. Then, we have to consider the following cases. Case 1: X has two children Y and Z. The left child Y is the ancestor of the foot node of α. Then associate with Y the quadruple (p, q_2, q_3, q), and (q, r, r, s) with Z, and associate with X the constraint that only those trees whose root has the quadruple (q_1, p, s, q_4), among those which were allowed in the original grammar, may be adjoined at this node. If q_1 ≠ p or q_4 ≠ s, then the constraint associated with X must be made obligatory. If in the original grammar X had an obligatory constraint associated with it then we retain the obligatory constraint regardless of the relationship between q_1 and p, and q_4 and s. If the constraint associated with X is a null adjoining constraint, we associate (q_1, q_2, q_3, q) and (q, r, r, q_4) with Y and Z respectively, and associate the null adjoining constraint with X. If the label of Z is a, where a ∈ Σ, then we choose s and q such that δ(q, a) = s. In the null adjoining constraint case, q is chosen such that δ(q, a) = q_4.</Paragraph>
      <Paragraph position="8"> Case 2: Let Z (the right child) be the ancestor of the foot node of the tree α. Then we shall associate (p,q,q,r) and (r,q_2,q_3,s) with Y and Z. The constraint associated with X shall be that only those trees, among those which were allowed in the original grammar, may be adjoined provided their root has the quadruple (q_1,p,s,q_4) associated with it. If q_1 ≠ p or q_4 ≠ s then we make the constraint obligatory. If the original grammar had an obligatory constraint we will retain the obligatory constraint. A null constraint in the original grammar will force us to use the null constraint and not consider the cases where it is not the case that q_1 = p and q_4 = s. If the label of Y is a terminal 'a' then we choose r such that δ*(p,a) = r. If the constraint at X is a null adjoining constraint, then δ*(q_1,a) = r.</Paragraph>
      <Paragraph position="9"> Case 3: This corresponds to the case where neither the left child Y nor the right child Z of the node X is the ancestor of the foot node of α, or where α is an initial tree. Then q_2 = q_3 = q. We will associate with Y and Z the quadruples (p,r,r,q) and (q,s,s,t) respectively. The constraints are assigned as before; in this case they are dictated by the quadruple (q_1,p,t,q_4). If it is not the case that q_1 = p and q_4 = t, then it becomes an OA constraint. The OA and NA constraints at X are treated similarly to the previous cases, and so is the case where either Y or Z is labelled by a terminal symbol.</Paragraph>
      <Paragraph position="10"> Case 4: If (q_1,q_2,q_3,q_4) is associated with a node X which has only one child Y, then we can deal with the various cases as follows. We will associate with Y the quadruple (p,q_2,q_3,t), and with X the constraint that the root of any tree adjoined at X should have the quadruple (q_1,p,t,q_4) associated with it, among the trees which were allowed in the original grammar. The cases where the original grammar had a null or obligatory constraint associated with this node, or where Y is labelled with a terminal symbol, are treated similarly to how we dealt with them in the previous cases.</Paragraph>
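      <Paragraph>The bookkeeping in Case 1 can be sketched as a small function. This is an illustration only: the function name case1_quadruples and the dictionary it returns are hypothetical, states are written as integers for brevity, and an obligatory constraint inherited from the original grammar or a terminal label on Z is not modelled.

```python
def case1_quadruples(quad, p, q, r, s, null_adjoining=False):
    """Quadruples for Case 1: X has children Y and Z, and Y dominates the foot."""
    q1, q2, q3, q4 = quad
    if null_adjoining:
        # Nothing may be adjoined at X, so the outer states are forced.
        p, s = q1, q4
    return {
        "Y": (p, q2, q3, q),
        "Z": (q, r, r, s),
        # Root quadruple required of any tree adjoined at X (None under NA).
        "root_needed": None if null_adjoining else (q1, p, s, q4),
        # Adjoining becomes obligatory when the outer states do not match.
        "obligatory": (q1 != p) or (q4 != s),
    }

info = case1_quadruples((0, 1, 2, 3), p=7, q=4, r=5, s=3)
print(info["Y"], info["Z"], info["root_needed"], info["obligatory"])

na_info = case1_quadruples((0, 1, 2, 3), p=7, q=4, r=5, s=3, null_adjoining=True)
print(na_info["Y"], na_info["Z"], na_info["obligatory"])
```

In the first call p differs from q_1, so some tree must be adjoined at X to bridge the two states; in the second call the null adjoining constraint forces p = q_1 and s = q_4 and no adjunction is needed.</Paragraph>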
      <Paragraph position="11"> Once this has been done, let q_1,...,q_m be the independent variables for this elementary tree α. We then produce as many copies of α as are needed for q_1,...,q_m to take all possible values from Q. The only difference among the various copies of α so produced will be the constraints associated with the nodes in the trees. Repeat the process for all the elementary trees in G. Once this has been done and each tree is given a unique name, we can write the constraints in terms of these names. We will now show why L(G_1) = L_T ∩ L_R.</Paragraph>
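      <Paragraph>The copying step amounts to enumerating every assignment of states to the independent variables, so a tree with m variables yields |Q|^m copies. A minimal Python sketch, in which the function name instantiate and the two-state set Q are assumptions for illustration:

```python
from itertools import product

def instantiate(variables, states):
    """Yield one assignment (a dict) per way of giving each variable a state."""
    for values in product(states, repeat=len(variables)):
        yield dict(zip(variables, values))

Q = ["s0", "s1"]                       # hypothetical state set
copies = list(instantiate(["q1", "q2", "q3"], Q))
print(len(copies))                     # one copy of the tree per assignment
```

With two states and three independent variables this produces eight assignments, one for each copy of the elementary tree.</Paragraph>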
      <Paragraph position="12"> Let w ∈ L(G_1). Then there is a sequence of adjoining operations starting with an initial tree α' to derive w. Obviously, w ∈ L_T also, since corresponding to each tree used in deriving w there is a corresponding tree in G, which differs only in the constraints associated with its nodes. Note, however, that the constraints associated with the nodes in trees in G_1 are just a restriction of the corresponding ones in G, or an obligatory constraint where there was none in G. Now, assume (by induction hypothesis) that if after n adjoining operations we can derive γ' ∈ D(α'), then there is a corresponding tree γ ∈ D(α) in G, which has the same tree structure as γ' but differs only in the constraints associated with the corresponding nodes. Then if we adjoin at some node in γ' to obtain γ_1', we can adjoin in γ to obtain γ_1 (corresponding to γ_1'). Therefore, if w can be derived in G_1, then it can definitely be derived in G.</Paragraph>
      <Paragraph position="13"> If we can also show that L(G_1) ⊆ L_R, then we can conclude that L(G_1) ⊆ L_T ∩ L_R. We can use induction to prove this. The induction hypothesis is that if all derived trees obtained after k ≤ n adjoining operations have the property P, then so will the derived trees after n + 1 adjoinings, where P is defined as follows. If a node X in a derived tree γ has the foot node of the tree β to which X belongs, labelled Y, as a descendant, such that w_1 Y w_2 is the frontier of the subtree of β rooted at X, then if (q_1,q_2,q_3,q_4) had been associated with X, δ*(q_1,w_1) = q_2 and δ*(q_3,w_2) = q_4, and if w is the frontier of the subtree under the foot node of β in γ then δ*(q_2,w) = q_3. If X is not the ancestor of the foot node of β then the frontier of the subtree of β below X is of the form w_1 w_2.</Paragraph>
      <Paragraph position="14"> Suppose X has associated with it (q_1,q,q,q_2); then δ*(q_1,w_1) = q and δ*(q,w_2) = q_2.</Paragraph>
      <Paragraph position="15"> Actually, what we mean by an adjoining operation is not necessarily just one adjoining operation but the minimum number needed so that no obligatory constraints are associated with any nodes in the derived trees. Similarly, the base case need not consider only elementary trees, but the smallest (in terms of the number of adjoining operations) tree starting with elementary trees which has no obligatory constraint associated with any of its nodes. The base case can be seen easily by considering the way the grammar was built (it can be shown formally by induction on the height of the tree). The inductive step is obvious. Note that the derived tree we are going to use for adjoining will have the property P, and so will the tree at which we adjoin; the former because of the way we designed the grammar and assigned constraints, and the latter because of the induction hypothesis. Thus so will the new derived tree. Once we have proved this, all we have to do to show that L(G_1) ⊆ L_R is to consider those derived trees which are sentential trees and observe that the roots of these trees obey property P.</Paragraph>
      <Paragraph position="16"> Now, if a string x ∈ L_T ∩ L_R, we can show that x ∈ L(G_1). To do that, we make use of the following claim.</Paragraph>
      <Paragraph position="17"> Let β be an auxiliary tree in G with root labelled Y and let γ ∈ D(β). We claim that there is a β' in G_1 with the same structure as β, such that there is a γ' in D(β') where γ' has the same structure as γ, and such that there is no OA constraint in γ'. Let X be a node in β_1 which was used in deriving γ. Then there is a node X' in γ' such that X' belongs to the auxiliary tree β_1' (with the same structure as β_1). There are several cases to consider. Case 1: X is the ancestor of the foot node of β_1, such that the frontier of the subtree of β_1 rooted at X is w_3 Y w_4 and the frontier of the subtree of γ rooted at X is w_3 w_1 Z w_2 w_4. Let δ*(q_1,w_3) = q, δ*(q,w_1) = q_2, δ*(q_3,w_2) = r, and δ*(r,w_4) = q_4. Then X' will have (q_1,q,r,q_4) associated with it, and there will be no OA constraint in γ'. Case 2: X is the ancestor of the foot node of β_1 and the frontier of the subtree of β_1 rooted at X is w_3 Y w_4. Let the frontier of the subtree of γ rooted at X be w_3 w_1 w_2 w_4. Then we claim that X' in γ' will have associated with it the quadruple (q_1,q,r,q_4), if δ*(q_1,w_3) = q, δ*(q,w_1) = p, δ*(p,w_2) = r, and δ*(r,w_4) = q_4. Case 3: Let the frontier of the subtree of β_1 (and also γ) rooted at X be w_1 w_2. Let δ*(q,w_1) = p and δ*(p,w_2) = r. Then X' will have associated with it the quadruple (q,p,p,r).</Paragraph>
      <Paragraph position="18"> We shall prove our claim by induction on the number of adjoining operations used to derive γ. The base case (where γ = β) is obvious from the way the grammar G_1 was built. We shall now assume that all derived trees γ which have been derived from β using k or fewer adjoining operations have the property required in our claim. Let γ be a tree derived from β after k adjunctions. By our inductive hypothesis we may assume the existence of the corresponding derived tree γ' ∈ D(β') derived in G_1. Let X be a node in γ as shown in Fig. 4.4.1. Then the node X' in γ' corresponding to X will have associated with it the quadruple (q_1',q_2',q_3',q_4'). Note we are assuming here that the left child Y' of X' is the ancestor of the foot node of β'. The quadruples (q_1',q_2',q_3',p) and (p,p_1,p_1,q_4') will be associated with Y' and Z' (by the induction hypothesis). Let γ_1 be derived from γ by adjoining β_1 at X as in Fig. 4.4.2. We have to show the existence of β_1' in G_1 such that the root of this auxiliary tree has associated with it the quadruple (q,q_1',q_4',r). The existence of the tree follows from the induction hypothesis (k = 0). We have also got to show that there exists γ_1'' with the same structure as γ but one that allows β_1' to be adjoined at the required node. But this should be so, since from the way we obtained the trees in G_1, there will exist γ_1'' such that X_1' has the quadruple (q,q_2',q_3',r) and the constraints at X_1' are dictated by the quadruple (q,q_1',q_4',r), but such that the two children of X_1' will have the same quadruples as in γ'. We can now adjoin β_1' in γ_1'' to obtain γ_1'. It can be shown that γ_1' has the required property to establish our claim.</Paragraph>
      <Paragraph position="20"> Firstly, any node below the foot of β_1' in γ_1' will satisfy our requirements, as they are the same as the corresponding nodes in γ_1''.</Paragraph>
      <Paragraph position="21"> Since β_1' satisfies the requirement, it is simple to observe that the nodes in β_1' will do so even after the adjunction of β_1' in γ_1''. However, because the quadruples associated with X_1' are different, the quadruples of the nodes above X_1' must reflect this change. It is easy to check the existence of an auxiliary tree such that the nodes above X_1' satisfy the requirements as stated above. It can also be argued, on the basis of the design of grammar G_1, that there exist trees which allow this new auxiliary tree to be adjoined at the appropriate place.</Paragraph>
      <Paragraph position="22"> This then allows us to conclude that there exists a derived tree for each derived tree belonging to D(β) as in our claim. The next step is to extend our claim to take into account all derived trees (i.e., including the sentential trees). This can be done in a manner similar to our treatment of derived trees belonging to D(β) for some auxiliary tree β as above. Of course, we have to consider only the case where the finite state automaton starts from the initial state q_0 and reaches some final state q_f on the input which is the frontier of some sentential tree in G. This then allows us to conclude that L_T ∩ L_R ⊆ L(G_1). Hence, L(G_1) = L_T ∩ L_R.</Paragraph>
    </Section>
  </Section>
</Paper>