File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/91/e91-1010_metho.xml
Size: 19,027 bytes
Last Modified: 2025-10-06 14:12:37
<?xml version="1.0" standalone="yes"?> <Paper uid="E91-1010"> <Title>AN INDEXING TECHNIQUE FOR IMPLEMENTING COMMAND RELATIONS</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 0. INTRODUCTION </SectionTitle> <Paragraph position="0"> Barker and Pullum (1990) have given a general definition of command relations. Their definition covers most cases of command relations that have been presented in the linguistic literature. I will present here an indexing technique for syntax trees which allows us to implement a check for all command relations which fulfill the definition from Barker/Pullum (1990). The indexing technique can be implemented in a simple and efficient way without any special requirements for the formalism used. Hence, the indexing technique has a wide spectrum of applications for testing command relations in syntactic analysis. Furthermore, this method can also be used for command tests in semantics, i.e. to test for any two semantic representations whether the corresponding nodes of the syntax tree are in a command relation. The usefulness and necessity of a command test in semantics have been demonstrated in Latecki/Pinkal (1990).</Paragraph> <Paragraph position="1"> The general idea of the indexing technique is the following: while a syntax tree is being built, special indices are assigned to the nodes of this tree.</Paragraph> <Paragraph position="2"> Afterwards we can check whether a command relation between two nodes of this tree holds by merely examining simple set-theoretical relations of the corresponding index sets.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. 
A GENERAL DEFINITION FOR COMMAND RELATIONS </SectionTitle> <Paragraph position="0"> The general command definition from Barker/Pullum (1990) can be informally stated in the following way:</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.1 DEFINITION (all). α P-commands β iff </SectionTitle> <Paragraph position="0"> every node with a property P that properly dominates α also dominates β.</Paragraph> <Paragraph position="1"> In this section I will show that this definition is equivalent to the following definition (minimum). 1.2 DEFINITION (minimum). α P-commands β iff the first node with a property P that properly dominates α also dominates β.</Paragraph> <Paragraph position="2"> In this definition, the first node that dominates α means the node most immediately dominating α, as the term is usually used in linguistics. Below I will specify both of these definitions formally.</Paragraph> <Paragraph position="3"> The main difference between these two definitions is that in the first we must check every node with a property P that properly dominates α, while in the second it is enough to check only one node, namely the first node with the property P that properly dominates α.</Paragraph> <Paragraph position="4"> Since only one node has to be checked, command tests based on Definition 1.2 are considerably more efficient in computational applications.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> </SectionTitle> <Paragraph position="0"> Both versions (all) and (minimum) are used as command definitions in the linguistic literature, so their equivalence also has linguistic consequences. 
For example, Definition 1.3 of MAX-command from Barker/Pullum (1990) (which I formulate following Sells' definition of c-command, Sells (1987)) is equivalent to Definition 1.4, which has been proposed in Aoun/Sportiche (1982).</Paragraph> <Paragraph position="1"> 1.3 DEFINITION. α MAX-commands β iff every maximal projection properly dominating α dominates β.</Paragraph> <Paragraph position="2"> 1.4 DEFINITION. α MAX-commands β iff the first maximal projection properly dominating α dominates β.</Paragraph> <Paragraph position="3"> These definitions are special cases of definitions</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.1 and 1.2 for the property of being a set of </SectionTitle> <Paragraph position="0"> maximal projections.</Paragraph> <Paragraph position="1"> Before I formulate the general command definition in a formal way, I will quote some other definitions from Barker/Pullum (1990).</Paragraph> <Paragraph position="2"> 1.5 DEFINITION. A relation R on a set N is reflexive iff aRa for all a in N; irreflexive iff ¬aRa for all a in N; symmetric iff aRb implies bRa; asymmetric iff aRb implies ¬bRa; antisymmetric iff aRb and bRa implies a=b; transitive iff aRb and bRc implies aRc.</Paragraph> <Paragraph position="3"> A relation R on a set N is called a linear order if it is reflexive, antisymmetric, transitive and has the following property (comparability): for every a and b in N, either aRb or bRa.</Paragraph> <Paragraph position="4"> The following definition of a tree stems from Wall (1972).</Paragraph> <Paragraph position="5"> 1.6 DEFINITION. 
A tree is a 5-tuple T=⟨N,L,≥D,&lt;P,LABEL⟩, where N is a finite nonempty set, the nodes of T, L is a finite set, the labels of T, ≥D is a reflexive, antisymmetric relation on N, the dominance relation of T, &lt;P is an irreflexive, asymmetric, transitive relation on N, the precedence relation of T, and LABEL is a total function from N into L, the labeling function of T, such that for all a, b, c and d from N and some unique r in N (the root node of T), the following hold: [...] form ⟨a,a⟩ removed.</Paragraph> <Paragraph position="6"> 1.7 DEFINITION. If αMβ we say that α is the mother of β, or α immediately dominates β, where αMβ iff α>Dβ ∧ ¬∃x [α>Dx>Dβ]. 1.8 DEFINITION. A property P on a set of nodes N is a subset of N. If a node α satisfies P, I will write α∈P or P(α).</Paragraph> <Paragraph position="7"> 1.9 DEFINITION. The set of upper bounds for α with respect to a property P, written UB(α,P), is given by UB(α,P) = {β∈N: β>Dα ∧ P(β)}.</Paragraph> <Paragraph position="8"> Thus β is an upper bound for α if and only if it properly dominates α and satisfies P. 1.10 DEFINITION. Let X be any nonempty subset of a set of nodes N of a tree T. We will call an element α the smallest element of X and denote it as minX if α∈X and, for every node x, x∈X → x≥Dα. If X is an empty set, then minX = the root node of T.</Paragraph> <Paragraph position="9"> A set X is said to be well-ordered by ≥D if the relation ≥D is a linear order on X and every nonempty subset of X has a smallest element. For example, the set of positive integers with the usual ordering relation ≥ is well-ordered.</Paragraph> <Paragraph position="10"> Now I can formally specify the meaning of the expression &quot;the first node with a property P that properly dominates α&quot; from Definition 1.2; it denotes the smallest element of the set UB(α,P), minUB(α,P). 
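As a hedged illustration that is not part of the original formalism, the sets UB(α,P) and minUB(α,P), together with the two informal command definitions 1.1 and 1.2, might be sketched in Python as follows. The child-to-mother dictionary used to encode the tree, the toy tree itself and all function names are assumptions made for this sketch.

```python
# Sketch only: the child-to-mother encoding and all names are assumptions.
from typing import Dict, List, Optional, Set

Tree = Dict[str, Optional[str]]  # node -> mother; the root maps to None


def proper_ancestors(tree: Tree, node: str) -> List[str]:
    """Nodes that properly dominate `node`, nearest first."""
    found = []
    mother = tree[node]
    while mother is not None:
        found.append(mother)
        mother = tree[mother]
    return found


def dominates(tree: Tree, x: str, y: str) -> bool:
    """Reflexive dominance: x is y itself or a proper ancestor of y."""
    return x == y or x in proper_ancestors(tree, y)


def upper_bounds(tree: Tree, node: str, prop: Set[str]) -> List[str]:
    """UB(node, P): proper ancestors of `node` that satisfy P, nearest first."""
    return [x for x in proper_ancestors(tree, node) if x in prop]


def min_ub(tree: Tree, node: str, prop: Set[str]) -> str:
    """minUB(node, P): the nearest upper bound, or the root if UB is empty."""
    bounds = upper_bounds(tree, node, prop)
    if bounds:
        return bounds[0]
    return next(n for n, mother in tree.items() if mother is None)


def commands_all(tree: Tree, a: str, b: str, prop: Set[str]) -> bool:
    """Definition 1.1: every upper bound of a dominates b."""
    return all(dominates(tree, x, b) for x in upper_bounds(tree, a, prop))


def commands_min(tree: Tree, a: str, b: str, prop: Set[str]) -> bool:
    """Definition 1.2: the first upper bound of a dominates b."""
    return dominates(tree, min_ub(tree, a, prop), b)


# Toy tree (hypothetical): S dominates NP and VP; VP dominates V and PP.
toy = {"S": None, "NP": "S", "VP": "S", "V": "VP", "PP": "VP"}
maximal = {"NP", "VP", "PP"}
print(min_ub(toy, "V", maximal))               # VP
print(min_ub(toy, "NP", maximal))              # S (UB is empty)
print(commands_min(toy, "V", "PP", maximal))   # True
print(commands_min(toy, "PP", "NP", maximal))  # False
```

On this toy tree the two predicates agree for every pair of nodes, which is what the claimed equivalence of definitions 1.1 and 1.2 would lead one to expect.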
First I will show that this element always exists.</Paragraph> <Paragraph position="11"> In set theory, it is a well-known fact that in any tree, the set of nodes that dominate a given node is well-ordered by the dominance relation (see Kuratowski/Mostowski (1976), for example). To be precise, for a given node α of a tree T, the set UB(α) = {x∈T: x>Dα} is well-ordered. Hence, the set UB(α,P) = UB(α) ∩ P, if it is nonempty, has a smallest element, which we denote minUB(α,P).</Paragraph> <Paragraph position="12"> At this point we are ready to formally state command definitions 1.1 and 1.2.</Paragraph> <Paragraph position="13"> 1.11 DEFINITION (all). α P-commands β iff ∀x (x∈UB(α,P) → x≥Dβ).</Paragraph> <Paragraph position="14"> 1.12 DEFINITION (minimum). α P-commands β iff minUB(α,P) ≥D β. We say that P generates the P-command relation. For example, we obtain the MAX-command relation (1.3) as a special case of Definition 1.11 if we take the set {α∈N: LABEL(α)∈MAX} as a property P, where MAX is any set of maximal projections.</Paragraph> <Paragraph position="15"> Definition 1.11 is the general command definition from Barker/Pullum (1990).</Paragraph> <Paragraph position="16"> 1.13 THEOREM. Definitions 1.11 (all) and 1.12 (minimum) are equivalent.</Paragraph> <Paragraph position="17"> Proof. If a pair ⟨α,β⟩ fulfills the definition (all), then it also fulfills the definition (minimum), because minUB(α,P) ∈ UB(α,P) if UB(α,P) ≠ ∅. If UB(α,P) = ∅, then minUB(α,P) = the root node of T, so the condition minUB(α,P) ≥D β is also fulfilled. Conversely, let a pair ⟨α,β⟩ fulfill the definition (minimum). This means that minUB(α,P) ≥D β. We must show that ∀x (x∈UB(α,P) → x≥Dβ). If UB(α,P) is the empty set, then the claim is trivially fulfilled. If UB(α,P) is not empty, then let x be any element from UB(α,P). Then x ≥D minUB(α,P).</Paragraph> <Paragraph position="18"> Since ≥D is a linear order on UB(α,P), it is transitive. 
Hence, x ≥D minUB(α,P) and minUB(α,P) ≥D β imply x ≥D β.</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. AN INDEXING TECHNIQUE </SectionTitle> <Paragraph position="0"> I will now present an indexing mechanism which allows us to check any command relation in the sense of Definition 1.11 in a simple and straightforward way.</Paragraph> <Paragraph position="1"> Let P be any property of nodes of a given syntax tree. The idea is the following: while the syntax tree is being built, special indices are assigned to every node of this tree.</Paragraph> <Paragraph position="2"> Generally, every node inherits indices from its mother.</Paragraph> <Paragraph position="3"> Specifically, if P(α) holds for a node α, then a unique new index is put into the index set of α, and the new index set of α obtained in this way is inherited further. This process is formally described in the following definition of the functions indp and fp. Letting T be any syntax tree, we define functions indp and fp from all nodes of T into finite subsets of ℕ (the positive integers), whereby we can take finite subsets of any index set as the image of indp and fp.</Paragraph> <Paragraph position="4"> 2.1 DEFINITION. Let P be any property. The function indp: N → F(ℕ) = {σ⊆ℕ: σ is a finite subset of ℕ} is defined recursively as follows: 1° indp(root(T)) = {1}, where root(T) denotes the root node.</Paragraph> <Paragraph position="5"> 2° If α immediately dominates β, then indp(β) = indp(α) ∪ fp(β),</Paragraph> <Paragraph position="7"> where fp is a function fp: N → F(ℕ) which fulfills the following conditions: 1' If α∉P, then fp(α) = ∅. 2' If α∈P, then fp(α) = {x}, for some unique index x∈ℕ (x ∉ ∪{fp(γ): γ∈N and γ≠α}). The procedural aspect of this definition can be described as follows. 
First, the function fp assigns a set with a unique index to every node from P, and the empty set to every node which does not belong to P. Then, for every node, the set it has been assigned by the function fp is added to the indices it inherits from its mother. The result is the value of the function indp.</Paragraph> <Paragraph position="8"> Based on this description, it is easy to note the following facts.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 FACT. </SectionTitle> <Paragraph position="0"> If γ>Dβ, then indp(γ) ⊆ indp(β) - fp(β).</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 FACT. </SectionTitle> <Paragraph position="0"> If γ∈P, then γ≥Dβ iff fp(γ) ⊆ indp(β).</Paragraph> <Paragraph position="1"> Now I present the main theorem of this paper, which gives a basis for efficient and simple implementations of P-command relations. Due to this theorem, we can check whether any P-command relation holds between two nodes by merely examining the subset relationship of the corresponding index sets.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.4 THEOREM. Node α P-commands node β </SectionTitle> <Paragraph position="0"> iff indp(α) - fp(α) ⊆ indp(β).</Paragraph> <Paragraph position="1"> The proof, which makes use of equivalence Theorem 1.13, will be given at the end of this section.</Paragraph> <Paragraph position="2"> To illustrate the P-command check based on the theorem above, I give an example for a MAX-command relation (1.3), which we obtain from general command definition 1.11 if we take the set {α∈N: LABEL(α)∈MAX} as a property P, where MAX = {NP, VP, AP, PP, S-bar} is a set of maximal projections (Sells 1987).</Paragraph> <Paragraph position="3"> Let us consider the syntax analysis for sentence (2.5). 
In tree (2.6), the upper set of indices at every node corresponds to the value of the function fp for this node and the lower set of indices corresponds to the value of the function indp for this node. (2.5) A friend of his saw every man with a telescope.</Paragraph> <Paragraph position="4"> We can see, for example, that the verb &quot;saw&quot; MAX-commands the prepositional phrase &quot;with a telescope&quot;, by verifying that indp(&quot;saw&quot;) - fp(&quot;saw&quot;) = {1,5} ⊆ {1,5,7} = indp(&quot;with a telescope&quot;), or that &quot;every man&quot; does not MAX-command &quot;his&quot;, by verifying that indp(&quot;every man&quot;) - fp(&quot;every man&quot;) = {1,5} is not a subset of indp(&quot;his&quot;).</Paragraph> <Paragraph position="6"> (2.6) [Syntax tree for sentence (2.5), annotated with the fp and indp index sets; figure not reproduced.] To do P-command tests in semantics, we merely need to extend the functions fp and indp to the semantic representations of every node. We can do this in the following way: If α' is a semantic representation of a node α, then fp(α') = fp(α) and indp(α') = indp(α).</Paragraph> <Paragraph position="7"> Now we can check, for two given semantic representations α', β', whether a P-command relation holds between the two corresponding nodes α, β, by examining the condition from Theorem 2.4 for α', β': indp(α') - fp(α') ⊆ indp(β'). (For more details see Latecki/Pinkal (1990).) An important advantage of the indexing technique is that its applicability for checking command relations in semantics does not depend on an isomorphism between syntactic and semantic structure, since the necessary syntactic information is encoded in indices. Therefore, this information can be moved to any required position in the semantic structure together with the representation of a given node.</Paragraph> <Paragraph position="8"> Definition 1.11 does not cover all cases of command relations which have been presented in the linguistic literature, but there are only a few exceptions. 
One is the relation that is called c-command in Reinhart (1976; 1981, p.612; 1983, p.23):</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.7 DEFINITION. Node α c(onstituent)- </SectionTitle> <Paragraph position="0"> commands node β iff the branching node x1 most immediately dominating α either dominates β or is immediately dominated by a node x2 which dominates β, and x2 is of the same category type as x1. A node γ is a branching node iff there exist two different nodes x, y such that γMx ∧ γMy.</Paragraph> <Paragraph position="1"> As T. Reinhart wrote, the intention of this definition is to capture c-command relations in cases of S-bar over S or VP over VP. Hence, we can say (for [...]</Paragraph> <Paragraph position="3"> This c-command definition allows the minimal upper bound to be replaced by another node, one node closer to the root, so this relation cannot be generated by any property, since this property would then have to depend on the node α. However, the condition of Definition 2.7 can be generated by a relation. In order to use a given relation R as generator, it is enough to replace the set of upper bounds UB(α,P) by the set UB(α,R) = {β∈N: β>Dα ∧ (αRβ)} in general command definition 1.11. For a detailed discussion see Barker/Pullum (1990).</Paragraph> <Paragraph position="4"> With the example of Reinhart's c-command relation, I want to show that it is also possible to treat relational command definitions with the indexing technique. Here I do not want to consider the treatment of relational command definitions with the indexing technique in the general case, because it would lead to a formal mathematical discussion without linguistic connections.</Paragraph> <Paragraph position="5"> To specify a test for Reinhart's c-command, we merely need to modify part 2° of the definition of the function indp in 2.1. 
The definition of the function fp together with the basic test condition given in Theorem 2.4 will be left unchanged. As a property P we take the set of branching nodes.</Paragraph> <Paragraph position="6"> The new part 2° of Definition 2.1 is formulated in the following way: 2°&quot; If α immediately dominates β and β is of the same category as α, then indp(β) = indp(α).</Paragraph> <Paragraph position="7"> If α immediately dominates β and β is not of the same category as α, then indp(β) = indp(α) ∪ fp(β). The idea of this modification is that if a node β is of the same category as a node α, then β only inherits the indices from α. So, in this case, the new index from the set fp(β) does not influence the value of the function indp on β. I illustrate the indexing check for c-command definition 2.7 with the syntax analysis for the following example sentence from Reinhart (1983).</Paragraph> <Paragraph position="8"> (2.8) Lola found the book in the library.</Paragraph> <Paragraph position="9"> In tree (2.9), the upper set of indices at every node corresponds to the value of the function fp at this node and the lower set of indices corresponds to the value of the function indp at this node.</Paragraph> <Paragraph position="10"> We can see, for example, that the subject of S, &quot;Lola&quot;, c-commands the COMP in S-bar, by verifying that indp(&quot;Lola&quot;) - fp(&quot;Lola&quot;) = {1} ⊆ {1} = indp(COMP), or that the object, &quot;the book&quot;, c-commands the NP in PP, &quot;the library&quot;, by verifying that indp(&quot;the book&quot;) - fp(&quot;the book&quot;) = {1,4} ⊆ {1,4,7,8} = indp(&quot;the library&quot;).</Paragraph> <Paragraph position="11"> [...] node x between α and γ. Due to Definition 1.12 (minimum), the node γ also dominates β, hence indp(γ) ⊆ indp(β). 
So, we have the inclusion indp(α) - fp(α) ⊆ indp(β).</Paragraph> <Paragraph position="12"> &quot;⇐&quot; Now let indp(α) - fp(α) ⊆ indp(β) for any two nodes α, β of some tree T, and let γ be any node from P that properly dominates α. Then indp(γ) ⊆ indp(α) - fp(α), since γ properly dominates α (2.2).</Paragraph> <Paragraph position="13"> From the transitivity of the inclusion relation, indp(γ) ⊆ indp(β). This implies that fp(γ) ⊆ indp(β). Due to Fact 2.3, the needed relation &quot;γ≥Dβ&quot; holds, so α P-commands β.</Paragraph> </Section> </Section> </Paper>
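The indexing scheme of Definition 2.1 and the subset test of Theorem 2.4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the mother-to-daughters dictionary, the function names, the toy tree and the choice fp(root) = {1} for a root that satisfies P are all assumptions of the sketch.

```python
# Sketch only: tree encoding, names and the toy tree are assumptions.
from itertools import count
from typing import Dict, FrozenSet, List, Set, Tuple

Children = Dict[str, List[str]]  # mother -> daughters
Indices = Dict[str, FrozenSet[int]]


def assign_indices(children: Children, root: str, prop: Set[str]) -> Tuple[Indices, Indices]:
    """Return (fp, indp) for all nodes, computed top-down from the root."""
    fresh = count(2)  # index 1 is kept for the root
    # Assumption: a root that satisfies P gets fp(root) = {1}.
    fp: Indices = {root: frozenset({1}) if root in prop else frozenset()}
    indp: Indices = {root: frozenset({1})}
    pending = [root]
    while pending:
        mother = pending.pop()
        for node in children.get(mother, []):
            # Nodes in P receive one fresh index; all other nodes receive none.
            fp[node] = frozenset({next(fresh)}) if node in prop else frozenset()
            # Every node inherits its mother's indices and adds its own.
            indp[node] = indp[mother] | fp[node]
            pending.append(node)
    return fp, indp


def p_commands(fp: Indices, indp: Indices, a: str, b: str) -> bool:
    """Subset test of Theorem 2.4: indp(a) - fp(a) is a subset of indp(b)."""
    return (indp[a] - fp[a]).issubset(indp[b])


# Toy tree (hypothetical): S over NP1 and VP; VP over V, NP2 and PP.
children = {"S": ["NP1", "VP"], "VP": ["V", "NP2", "PP"]}
maximal = {"NP1", "VP", "NP2", "PP"}
fp, indp = assign_indices(children, "S", maximal)
print(p_commands(fp, indp, "V", "PP"))     # True
print(p_commands(fp, indp, "NP2", "NP1"))  # False
```

Once the indices are assigned, each command test is a single set-inclusion check and no longer needs to walk the tree, which is the practical point of the technique.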