<?xml version="1.0" standalone="yes"?> <Paper uid="E06-3007"> <Title>Lexicalising Word Order Constraints for Implemented Linearisation Grammar</Title> <Section position="3" start_page="24" end_page="28" type="metho"> <SectionTitle> 2 Lexicalising Word Order Constraints </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="24" end_page="24" type="sub_section"> <SectionTitle> 2.1 Overview </SectionTitle> <Paragraph position="0"> Our theoretical goal is, in a nutshell, to achieve what Reape does, namely handling discontinuity and linear precedence, in a simpler, more lexicalist manner. My central proposal consists in incorporating a Word Order Constraint (WOC) feature into lexical heads, rather than positing an additional tier for linearisation. Some new subfeatures will also be introduced.</Paragraph> <Paragraph position="1"> The value of the WOC feature is a set of word-order related constraints. It may contain any relational constraint the grammar writer wants, provided it can be formalised, but for the current proposal I include two subfeatures, ADJ (adjacency) and LP, both of which, being binary relations, are represented as sets of ordered pairs whose members must be either the head itself or its sisters. Figure 2 illustrates what such a feature structure looks like for the English verb provide, as in provide him with a book.</Paragraph> <Paragraph position="2"> We will discuss the new PHON subfeatures in the next section - for now it suffices to take them to constitute the standard PHON list, so let us focus on WOC here. The WOC feature of this verb says that, for its projection (VP), three constraints have to be observed. Firstly, the ADJ subfeature says that the indirect object NP has to be adjacent to the verb ('provide yesterday him with a book' is not allowed). Secondly, the first two elements of the LP value encode a head-initial constraint for English VPs, namely that the head verb has to precede its complements. Lastly, the last pair in the same set says that the indirect object must precede the with-PP ('provide with a book him' is not allowed). Notice that this specification leaves room for some discontinuity, as there is no ADJ requirement between the indirect object NP and the with-PP. Hence, provide him yesterday with a book is allowed.</Paragraph> <Paragraph position="3"> The key idea here is that since the complements of a lexical head are available in its COMPS feature, it should be possible to state the relative linear order which holds between the head and a complement, as well as between complements, inside the feature structure of the head.</Paragraph>
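As a concrete (and deliberately simplified) illustration of this idea, the entry for provide sketched in Figure 2 might be coded along the following lines in Prolog, the implementation language mentioned later in Section 3.2. The term names used here (sign/4, cat/1, adj/2, lp/2, hd) are illustrative assumptions, not the ProFIT encoding actually used in the implementation.

% Illustrative sketch only: a WOC-bearing lexical entry for 'provide'.
% adj(X,Y) reads 'X must be adjacent to Y'; lp(X,Y) reads 'X must precede Y';
% 'hd' stands for the head itself. The argument signs selected in ARGS are
% shared with the WOC pairs by structure sharing (Prolog variables).
lex(provide,
    sign(phon([provide]),
         cat(verb),
         args([NP, PP]),
         woc([ adj(hd, NP),   % the object NP must be adjacent to the verb
               lp(hd, NP),    % head-initial: the verb precedes the NP ...
               lp(hd, PP),    % ... and the with-PP
               lp(NP, PP)     % the NP must precede the with-PP
             ]))) :-
    NP = sign(phon(_), cat(noun), args([]), woc([])),
    PP = sign(phon(_), cat(prep(with)), args([]), woc([])).

Read back, any projection of such a head must observe these pairs: provide yesterday him with a book falls foul of the adj/2 pair, provide with a book him of the final lp/2 pair, while provide him yesterday with a book violates nothing, exactly as described above.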
<Paragraph position="4"> Admittedly, word order would naturally be considered to reside in a phrase, i.e. a string of words. It might be argued, on the ground that a head's COMPS feature consists only of the categories it selects for, to the exclusion of the PHON feature, that with this architecture one would inevitably encounter the 'accessibility' problem discussed in Section 1.2: in order to ensure the enforceability of word order constraints, access must be secured to the values of the internal features, including the PHON values. However, this problem can be overcome, as we will see, if due arrangements are in place.</Paragraph> <Paragraph position="6"> The main benefit of this mechanism is that it paves the way to an entirely lexicon-based rule specification, so that, on the one hand, duplication of information between lexical specification and phrase structure rules can be reduced and, on the other, a wide variety of lexical properties can be flexibly handled. If the word order constraints, which have been regarded as the bastion of rule-based grammars, are shown to be lexically handled, it is one significant step further towards a fully lexicalist grammar.</Paragraph> </Section> <Section position="2" start_page="24" end_page="26" type="sub_section"> <SectionTitle> 2.2 New Head-Argument Schema </SectionTitle> <Paragraph position="0"> What is crucial for this WOC-incorporated grammar is how the word order constraints stated in WOC are passed on and enforced in the projection. I attempt to formalise this in the form of a Head-Argument Schema, obtained by modifying the Head-Complement Schema of Pollard and Sag (1994).</Paragraph> <Paragraph position="1"> There are two key revisions: an enriched PHON feature that contains word order constraints, and the percolation of these constraints, which emanate from the WOC feature of the head.</Paragraph> <Paragraph position="2"> The revised Schema is shown in Figure 3. For simplicity only the LP subfeature is dealt with, since the ADJ subfeature works in exactly the same way. The set notation attached underneath states the restriction on the value of WOC, namely that all the signs that appear in the constraint pairs must be 'relevant', i.e. must also appear as daughters (included in 'DtrSet', the set of the head daughter and non-head daughters). Naturally, the two members of a pair also cannot be the same sign (x ≠ y).</Paragraph> <Paragraph position="3"> Let me discuss some auxiliary modifications first. Firstly, we change the feature name from COMPS to ARGS because we assume a non-configurational flat structure, as is commonly the case with linearisation grammar. Another change I propose is to make ARGS a list of underspecified signs instead of SYNSEMs as standardly assumed (Pollard and Sag, 1994). In fact, this is a position taken in an older version of HPSG (Pollard and Sag, 1987) but rejected on the ground of the locality of subcategorisation. The main reason for this reversal is to facilitate the 'accessibility' we discussed earlier: as unification and percolation of the PHON information are involved in the Schema, it is much more straightforward to formulate it with signs. Though the change may not be quite defensible solely on this ground (another potential problem is cyclicity, since the sign-valued ARGS feature contains the WOC feature, which could contain the head itself; this has to be fixed for systems that do not allow cyclicity), there is reason to leave the locality principle as an option for the languages in which it holds rather than hardwire it into the Schema, since some authors raise doubts about the universal applicability of the locality principle, e.g. Meurers (1999).</Paragraph> <Paragraph position="4"> Turning to a more substantial modification, our new PHON feature consists of two subfeatures, CONSTITUENTS (or CONSTITS) and CONSTRAINTS (or CONSTRS). The former encodes the set comprising the phonologies of the words of which the string consists. Put simply, it is the unordered version of the standard PHON list.
The CONSTRAINTS feature represents the concatenative constraints applicable to the string. Thus, the PHON feature overall represents the legitimate word order patterns in an underspecified way, i.e. any of the possible string combinations that obey the constraints.</Paragraph> <Paragraph position="6"> Let me illustrate with a VP example, say one consisting of meet, often and Tom, for which we assume that the following word order patterns are acceptable: <meet, Tom, often>, <often, meet, Tom>; but not the following: <meet, often, Tom>, <Tom, often, meet>, <Tom, meet, often>, <often, Tom, meet>.</Paragraph> <Paragraph position="7"> This situation can be captured by the following feature specification for PHON, which encodes any of the acceptable strings above in an underspecified way.</Paragraph> <Paragraph position="8"> The key point is that now the computation of word order can be done on the basis of the information inside the PHON feature, though indeed the CONSTR values have to come from outside - the word order crucially depends on SYNSEM-related values of the daughter signs.</Paragraph> <Paragraph position="9"> Let us now go back to the Schema in Figure 3 and see how to determine the CONSTR values that enter the PHON feature. This is achieved by looking up the WOC constraints in the head (let us call this Step 1) and pushing the relevant constraints into the PHON feature of its mother, according to the type of the constraints (Step 2).</Paragraph> <Paragraph position="10"> For readability Figure 3 only states explicitly a special case - one where a single LP constraint holds of two of the arguments - but the reader is asked to interpret ai and aj in the head daughter's WOC|LP as representing any two signs chosen from the 'DTRS' list (including the head, hd; for full generality of the number of ARGS elements, which should be taken to be any number including zero, the recursive definition detailed in Richter and Sailer (1995) can be adopted). The structure sharing of ai and aj between WOC|LP and ARGS indicates that the LP constraint applies to these two arguments in this order, i.e. ai ≺ aj. Thus, through unification, it is determined which constraints apply to which pairs of daughter signs inside the head. This corresponds to Step 1.</Paragraph> <Paragraph position="11"> Now, only for these WOC-applicable daughter signs, the PHON|CONSTITS values are paired up for each constraint (in this case <pai, paj>) and pushed into the mother's PHON|CONSTRS feature. This corresponds to Step 2.</Paragraph> <Paragraph position="12"> Notice also that the CONSTRAINTS subfeature is cumulatively inherited. All the non-head daughters' CONSTR values (ca1,...,can) - the word order constraints applicable to each of these daughters - are also passed up, effectively collecting all the CONSTR values of the mother's daughters and descendants. This means that the information concerning word order, as tied to particular string pairs, is never lost and is passed up all the way through. Thus the WOC constraints can be enforced at any point where both members of the string pair in question are instantiated.</Paragraph> </Section>
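The role of such an underspecified PHON value can be mimicked in a few lines of Prolog. The following is a sketch under stated assumptions rather than the paper's own feature specification: in particular, the constraint set lp(meet, tom) plus adj(meet, tom) is our reconstruction of one constraint set that yields exactly the acceptability pattern listed for the meet/often/Tom VP above.

% Sketch only: an underspecified PHON value as a constituent set plus a
% constraint set. lp(A,B) reads 'A precedes B'; adj(A,B) reads 'A and B are
% adjacent'. An order is licensed iff it is a permutation of the constituents
% that satisfies every constraint.
licit_order(Order, Constits, Constrs) :-
    permutation(Constits, Order),
    forall(member(C, Constrs), satisfied(C, Order)).

satisfied(lp(A, B), Order) :-                 % A occurs somewhere before B
    nth0(I, Order, A), nth0(J, Order, B), I < J.
satisfied(adj(A, B), Order) :-                % A and B occupy neighbouring slots
    nth0(I, Order, A), nth0(J, Order, B), abs(I - J) =:= 1.

% With the reconstructed constraint set for the meet/often/Tom VP:
% ?- licit_order([often, meet, tom], [meet, often, tom],
%                [lp(meet, tom), adj(meet, tom)]).
% true.
% ?- licit_order([meet, often, tom], [meet, often, tom],
%                [lp(meet, tom), adj(meet, tom)]).
% false.

Only <meet, Tom, often> and <often, meet, Tom> survive both constraints, matching the pattern assumed above.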
<Section position="3" start_page="26" end_page="28" type="sub_section"> <SectionTitle> 2.3 A Worked Example </SectionTitle> <Paragraph position="0"> Let us now go through an example of applying the Schema, again with the German subordinate clause das Buch der Fritz dem Frank zu lesen erlaubt (and its other acceptable variants). Our goal is to enforce the ADJ and LP constraints in a flexible enough way, allowing acceptable sequences such as those we saw in Section 1.2.1, while blocking the constraint-violating instances.</Paragraph> <Paragraph position="1"> The instantiated Schema is shown in Figure 4.</Paragraph> <Paragraph position="2"> Let us start at a rather deeply embedded level, with the embedded verb zu-lesen, marked v2, found inside vp (the last and largest NHD-DTR) as its HD-DTR, which I assume to be a single lexical item for simplicity. This is one of the lexical heads from which the WOC constraints emanate. In this item's WOC we find a general LP constraint for zu-Infinitiv VPs, COMPS ≺ V, namely np3 ≺ v2. The PHON|CONSTITS values of these signs are then searched for and found in the daughters, namely pnp3 and pv2. These values are paired up and passed into the CONSTRS|LP value of the mother VP. Notice also that the NHD-DTRs' CONSTR|LP values, in this case only lpnp3 ({das} ≺ {Buch}), are unioned into this value as well, constituting lpvp: we are here witnessing the cumulative inheritance of constraints explained earlier. Turning now to the percolation of the ADJ subfeature: no ADJ requirement is found between das Buch and zu-lesen (v2's WOC|ADJ is empty), though ADJ is required one node below, between das and Buch (np3's PHON|CONSTR|ADJ). Thus no new ADJ pair is added to the mother VP's PHON|CONSTR feature.</Paragraph> <Paragraph position="4"> Exactly the same process is repeated for the projection of erlauben (v1), whose WOC again contains only LP requirements. With the PHON|CONSTITS values of the relevant signs found and paired up ({Fritz,der} ≺ {erlaubt} and {Frank,dem} ≺ {erlaubt}), they are pushed into the mother's PHON|CONSTRS|LP value, which is also unioned with the PHON|CONSTRS values of the NHD-DTRS. Notice this time that there is no LP requirement between the zu-Infinitiv VP, das Buch zu-lesen, and the higher verb, erlaubt. This is intended to allow for extraposition. (The lack of this LP requirement also admits some marginally acceptable instances, such as der Fritz dem Frank das Buch erlaubt zu lesen, which many consider ungrammatical. These instances can be blocked, however, by introducing more complex WOCs; see Sato (forthcoming a).) The eventual effect of the cumulative constraint inheritance can be seen more clearly in the sub-AVM underneath, which shows the PHON part of the whole feature structure with its values instantiated. After a succession of applications of the Head-Argument Schema, we now have a pool of WOCs sufficient to block unwanted word order patterns while endorsing legitimate ones. The representation of the PHON feature being underspecified, it corresponds to any of the appropriately constrained order patterns. For instance, der Fritz dem Frank zu lesen das Buch erlaubt would be ruled out by a violation of the last LP constraint, der Fritz erlaubt dem Frank das Buch zu lesen by the second, and so on.</Paragraph>
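The two steps just walked through can be summarised in a small Prolog sketch, again under assumed, illustrative term names rather than the actual encoding: daughters are written as Id-phon(Constits, Constrs) pairs and the head's WOC as lp/2 and adj/2 terms over daughter identifiers. The mother's CONSTITS is the union of the daughters' constituent sets, each WOC pair is translated into a pair of the corresponding CONSTITS values (Steps 1 and 2), and the daughters' own CONSTRS values are inherited cumulatively.

% Sketch only: assembling the mother's PHON from the head's WOC and the
% daughters' PHON values.
mother_phon(HeadWOC, Daughters, phon(MConstits, MConstrs)) :-
    % Mother's CONSTITS: union of the daughters' constituent sets.
    findall(W, ( member(_-phon(Cs, _), Daughters), member(W, Cs) ), MConstits),
    % Steps 1 and 2: each WOC pair over daughter identifiers becomes a pair
    % of the corresponding daughters' CONSTITS values.
    findall(NewC,
            ( member(WC, HeadWOC), WC =.. [Type, IdA, IdB],
              member(IdA-phon(CsA, _), Daughters),
              member(IdB-phon(CsB, _), Daughters),
              NewC =.. [Type, CsA, CsB] ),
            NewConstrs),
    % Cumulative inheritance: the daughters' own CONSTRS values are kept.
    findall(C, ( member(_-phon(_, Ds), Daughters), member(C, Ds) ), Inherited),
    append(NewConstrs, Inherited, MConstrs).

% The zu-lesen projection discussed above, with np3 = das Buch:
% ?- mother_phon([lp(np3, v2)],
%                [ v2-phon([zu, lesen], []),
%                  np3-phon([das, buch], [lp([das], [buch]), adj([das], [buch])]) ],
%                PHON).
% PHON = phon([zu, lesen, das, buch],
%             [lp([das, buch], [zu, lesen]), lp([das], [buch]), adj([das], [buch])]).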
<Paragraph position="5"> The reader might be led to think, because of the monotonic inheritance of constraints, that WOC compliance cannot be checked until the stage of the final projection. While this is generally true for freer word order languages when one considers various processing scenarios such as bottom-up generation, in parsing one can conduct the WOC check immediately after the relevant categories have been instantiated - a fact we exploit in our implementation, as we will now see.</Paragraph> </Section> </Section> <Section position="4" start_page="28" end_page="29" type="metho"> <SectionTitle> 3 Constrained Free Word Order Parsing </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="28" end_page="29" type="sub_section"> <SectionTitle> 3.1 Algorithm </SectionTitle> <Paragraph position="0"> In this section we briefly overview our parsing algorithm, which works with the lexicalised linearisation grammar outlined above. It expands on two existing ideas: bitmasks for non-CFG parsing and dynamic constraint application.</Paragraph> <Paragraph position="1"> Bitmasks are used to indicate the positions of parsed words, wherever they have been found.</Paragraph> <Paragraph position="2"> Reape (1991) presents a non-CFG tabular parsing algorithm using them, for a 'permutation complete' language, which accepts all the permutations and discontinuous realisations of words. To take as an example a simple English NP comprising the, thick and book, this parser accepts not only their 3! permutations but discontinuous realisations thereof in a longer string, such as [book, -, the, -, thick] ('-' indicates the positions of constituents from other phrases).</Paragraph> <Paragraph position="3"> Clearly, the problem here is overgeneration and (in)efficiency. In this form the worst-case complexity is exponential (O(n! * 2^n), where n is the length of the string). In response, Daniels and Meurers (2004) propose to restrict the search space during the parse with two additional bitmasks, a positive and a negative mask, which encode the bits that must and must not be occupied, respectively, based on what has been found thus far and the relevant word order constraints. For example, given the constraints that Det precedes Nom and that Det must be adjacent to Nom, and supposing the parser has found Det (the) in the third position of a five-word string like the one above, the negative mask [x, x, the, -, -] is created, where x indicates the positions that cannot be occupied by Nom, as well as the positive mask [-, -, the, *, -], where * indicates the position that must be occupied by Nom. In this way the parser can be stopped from searching positions that the categories yet to be found cannot occupy, or forced to search only the positions they have to occupy.</Paragraph> <Paragraph position="4"> An important remaining job is how to state the constraints themselves in a grammar that works with this architecture, and Daniels and Meurers' answer is a rather traditional one: they state them in phrase structure rules as LP attachments. They modify HPSG rather extensively in a way similar to GPSG, in what they call 'Generalised ID/LP Grammar'. However, as we have been arguing, this is not an inevitable move. It is possible to keep the general contour of standard HPSG largely intact.</Paragraph>
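As a toy reconstruction of the mask idea just described (an assumed representation only, not Daniels and Meurers' actual code), the two masks for this Det/Nom case can be computed as follows, with a mask written simply as a list containing one cell per string position:

% Toy sketch: compute negative and positive masks for the Det/Nom example,
% given that Det ('the') has been found at 0-based position DetPos in a
% string of length N, with Det preceding Nom (LP) and Det adjacent to Nom (ADJ).
masks(N, DetPos, Negative, Positive) :-
    Last is N - 1,
    numlist(0, Last, Positions),
    Next is DetPos + 1,
    findall(Cell,
            ( member(P, Positions),
              (  P < DetPos   -> Cell = x       % Nom may not occur before Det
              ;  P =:= DetPos -> Cell = the     % Det itself sits here
              ;  Cell = '-' ) ),
            Negative),
    findall(Cell,
            ( member(P, Positions),
              (  P =:= DetPos -> Cell = the
              ;  P =:= Next   -> Cell = '*'     % adjacency: Nom must start here
              ;  Cell = '-' ) ),
            Positive).

% ?- masks(5, 2, Neg, Pos).
% Neg = [x, x, the, -, -],
% Pos = [-, -, the, *, -].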
<Paragraph position="5"> The way our parser interacts with the grammar is fundamentally different: we take full advantage of the information that now resides in the lexical heads. Firstly, rules are dynamically generated from the subcategorisation information (the ARGS feature) in the head. Secondly, the constraints are picked up from the WOC feature when lexical heads are encountered and are carried in edges, eliminating the need for positive/negative masks. When an active edge is about to incorporate the next category, these constraints are checked and enforced, thereby limiting the search space.</Paragraph> <Paragraph position="6"> After the lexicon lookup, the parser generates rules from the lexical head it has found and forms lexical edges. It is also at this stage that the WOC is picked up and pushed into the edge, along with the generated rule: <Mum -> Hd-Dtr * Nhd1 Nhd2 ... Nhdn; WOCs> where WOCs is the set of ADJ and LP constraints picked up, if any, and '*' marks the dot. This edge now tries to find the rest, the non-head daughters. The following is the representation of an edge at the stage where some non-head daughter, Dtri in this representation, has been parsed and Dtrj is to be searched for.</Paragraph> <Paragraph position="7"> <Mum -> Dtr1 Dtr2 ... Dtri * Dtrj ... Dtrn; WOCs> When Dtrj is found, the parser does not immediately move the dot. At this point the WOC compliance check with the relevant WOC constraints - the one(s) involving Dtri and Dtrj - is conducted on these two daughters. The compliance check is a simple list operation: it picks up the bitmasks of the two daughters in question and checks whether the occupied positions of one daughter precede, or are adjacent to, those of the other.</Paragraph> <Paragraph position="8"> The failure of this check prevents the dot move from taking place. Thus, edges that violate the word order constraints are never created, thereby preventing wasteful search. This is the same behaviour as in Daniels and Meurers' parser, and therefore the efficiency in terms of the number of edges is identical. The main difference is that we use the information inside the feature structure directly, without intermediaries like positive/negative masks.</Paragraph>
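The compliance check just described might be rendered roughly as follows, writing a daughter's coverage bitmask as a 0/1 list over string positions. This is a sketch under assumed representations, not the ProFIT-based code of the actual implementation:

% Sketch only: coverage bitmasks as 0/1 lists over string positions.
occupied(Mask, Positions) :-
    findall(P, nth0(P, Mask, 1), Positions).

% LP compliance: every position covered by the first daughter precedes
% every position covered by the second.
lp_ok(MaskA, MaskB) :-
    occupied(MaskA, PA), occupied(MaskB, PB),
    max_list(PA, MaxA), min_list(PB, MinB),
    MaxA < MinB.

% ADJ compliance: the two coverages together form one contiguous block.
adj_ok(MaskA, MaskB) :-
    occupied(MaskA, PA), occupied(MaskB, PB),
    append(PA, PB, Ps), msort(Ps, Sorted),
    min_list(Sorted, Min), max_list(Sorted, Max),
    length(Sorted, Len), Max - Min =:= Len - 1.

% A discontinuously realised daughter covering positions 0 and 2 still
% precedes one covering positions 3 and 4, but is not adjacent to it:
% ?- lp_ok([1,0,1,0,0], [0,0,0,1,1]).    % true
% ?- adj_ok([1,0,1,0,0], [0,0,0,1,1]).   % false (position 1 intervenes)

Failure of such a check is what blocks the dot move described above.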
</Section> <Section position="2" start_page="29" end_page="29" type="sub_section"> <SectionTitle> 3.2 Implementation </SectionTitle> <Paragraph position="0"> I have implemented the algorithm in Prolog and coded the HPSG feature structures in the way described using ProFIT (Erbach, 1995). The parser is a head-corner, bottom-up chart parser, roughly based on Gazdar and Mellish (1989). The main modification consists in introducing bitmasks and the word order checking procedure described above.</Paragraph> <Paragraph position="1"> I have created small grammars for Japanese and German and run them through the parser, to confirm that linearisation-heavy constructions such as the object control construction can be successfully parsed, with the WOC constraints enforced.</Paragraph> </Section> </Section> <Section position="5" start_page="29" end_page="29" type="metho"> <SectionTitle> 4 Future Tasks </SectionTitle> <Paragraph position="0"> What we have seen is an outline of my initial proposal, and there are numerous tasks yet to be tackled. First of all, now that the constraints are written into individual lexical items, we need appropriate typing in terms of word order constraints, in order to be able to state succinctly such general constraints as the head-final/head-initial constraint. In other words, it is crucial to devise an appropriate type hierarchy.</Paragraph> <Paragraph position="1"> Another potential problem concerns the generality of our theoretical framework. I have focused on the Head-Argument structure in this paper, but if the present theory is to be of general use, non-argument constructions, such as the Head-Modifier structure, must also be accounted for.</Paragraph> <Paragraph position="2"> Also, the cases where the head of a phrase is itself a phrase may pose a challenge, if such a phrasal head were to determine the word order of its projection. Since it is desirable for computational transparency not to use emergent constraints, I will attempt to have all the word order constraints ultimately propagated and monotonically inherited from the lexical level. Though some word order constraints may turn out to have to be written into the phrasal head directly, I am confident that the majority, if not all, of the constraints can be stated in the lexicon. These issues are tackled in a separate paper (Sato, forthcoming a).</Paragraph> <Paragraph position="3"> In terms of efficiency, more study is required to identify the exact complexity of my algorithm. Also, with a view to using it in a practical system, an evaluation of its efficiency on an actual machine will be crucial.</Paragraph> </Section> </Paper>