<?xml version="1.0" standalone="yes"?>
<Paper uid="P92-1043">
  <Title>A Linguistic and Computational Analysis of the German "Third Construction"</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 The Linguistic Data
</SectionTitle>
    <Paragraph position="0"> For German, most transformational linguistic theories, such as GB, posit center-embedding as the underlying word order of sentences with embedded clauses: Weil ich [das Fahrrad zu reparieren] versprochen habe (gloss: because I the bike (acc) to repair promised have; 'Because I promised to repair the bike'). However, far more common is a construction in which the entire subordinate clause is extraposed: Weil ich t_i versprochen habe, [das Fahrrad zu reparieren]_i. In addition, a further construction is possible, which has been called the "third construction", in which only the embedded verb, but not its nominal argument, has been extraposed: Weil ich das Fahrrad t_i versprochen habe [zu reparieren]_i. A similar construction can also be observed if there are two levels of embedding. In this case, the number of possible word orders increases from 3 to 30, 6 of which are shown in Figure 1. Of the 30 sentences, 7 are clearly ungrammatical (marked "*"), and 3 are extremely marginal, but not "flat out" ungrammatical (marked "?*"). The remaining 20 are acceptable to a greater or lesser degree (marked "ok" or "?"). No attempt has been made in the linguistic or computational literature to account for this full range of data.</Paragraph>
  </Section>
  <Section position="2" start_page="0" end_page="297" type="metho">
    <SectionTitle>
2 A Linguistic TAG Analysis
</SectionTitle>
    <Paragraph position="0"> Following [den Besten and Rutten 1989], [Santorini and Kroch 1990] argue that the third construction, rather than being a morphological effect of clause union, is in fact a syntactic phenomenon. The construction derives from two independently motivated syntactic operations, scrambling and (remnant) extraposition. In this work, I have implemented this suggestion in a variant of multi-component TAG (MC-TAG, [Weir 1988]) defined in [Lee 1991], which I will call SI-TAG. In SI-TAG, as in MC-TAG, the elementary structures are sets of trees, which can be initial or auxiliary trees. In contrast to regular MC-TAG, in SI-TAG the trees can also be adjoined into trees from the same set (set-internal adjunction). Furthermore, the trees can be annotated with dominance constraints (or "links"), which hold between foot nodes of auxiliary trees and nodes of other trees. These constraints must be met when the tree set is adjoined.</Paragraph>
    <Paragraph position="1"> *This work was supported by the following grants: ARO DAAL 03-89-C-0031; DARPA N00014-90-J-1863; NSF IRI 9016592; and Ben Franklin 91S.3078C-1. I would like to thank Bob Frank and Aravind Joshi for fruitful discussions relating to this paper.</Paragraph>
    <Paragraph position="2"> The following SI-TAG accounts for the German data. We have 5 elementary sets: for the two verbs that subcategorize for clauses, versuchen 'to try' and versprechen 'to promise', there are two sets each, representing the center-embedded and extraposed versions. For reparieren 'to repair', there is only one set. Sample sets can be found in Figure 2. The dominance links are shown by dotted lines.</Paragraph>
    <Paragraph position="3"> [Figure 2: Sample tree sets, including versuchen 'to try' with extraposed subordinate clause]</Paragraph>
    <Paragraph position="5"> This analysis rules out those sentences that are ungrammatical, since the dominance constraints would be circular and could not be satisfied. Derivations are possible for the sentences that are acceptable. However, the analysis also provides derivations for the three sentences that are extremely marginal, but not ungrammatical. Since these sentences can be derived by a sequence of 3 licit steps, the combination of any two of which is also licit, a syntactic analysis cannot insightfully rule them out. Instead, I would like to explore a processing-based analysis. A processing account holds two promises: first, it should account for the differences in degree among the acceptable sentences; second, it should rule out the extremely marginal sentences.</Paragraph>
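    <Paragraph> The circularity argument can be illustrated with a small sketch. It assumes that the dominance links used in a derivation can be listed as pairs (upper, lower), read as "upper must strictly dominate lower"; the function name and the node labels are hypothetical and are not taken from the SI-TAG definition. Under that assumption, a set of links is unsatisfiable whenever the pairs form a cycle, which a depth-first search can detect:

```python
from collections import defaultdict


def has_circular_dominance(links):
    """Return True if the dominance links cannot all hold at once.

    `links` is an iterable of (upper, lower) pairs, each read as
    "upper must strictly dominate lower" (an assumed encoding).
    """
    graph = defaultdict(list)
    for upper, lower in links:
        graph[upper].append(lower)

    unseen, active, finished = 0, 1, 2
    state = defaultdict(int)  # every node starts as `unseen`

    def visit(node):
        state[node] = active
        for child in graph[node]:
            if state[child] == active:
                return True  # reached a node on the current path: a cycle
            if state[child] == unseen and visit(child):
                return True
        state[node] = finished
        return False

    # Iterate over a snapshot, since visit() may add leaf nodes to `graph`.
    return any(state[n] == unseen and visit(n) for n in list(graph))


# Hypothetical node labels, for illustration only.
circular = has_circular_dominance([("foot_a", "node_b"), ("node_b", "foot_a")])
chain = has_circular_dominance([("foot_a", "node_b"), ("node_b", "node_c")])
```

Two links that require each of two nodes to dominate the other cannot both hold, so `circular` is true, whereas the chain of links is satisfiable and `chain` is false.
    </Paragraph>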
    <Paragraph position="6">
Weil ich das Fahrrad zu reparieren zu versuchen versprochen habe (ok)
Weil ich das Fahrrad zu versuchen zu reparieren versprochen habe (?)
Weil ich versprochen habe, zu versuchen, das Fahrrad zu reparieren (ok)
Weil ich zu versuchen versprochen habe, das Fahrrad zu reparieren (?)
Weil ich das Fahrrad zu versuchen versprochen habe zu reparieren (?*)
Weil zu versuchen ich das Fahrrad versprochen habe zu reparieren (*)
    </Paragraph>
  </Section>
  <Section position="3" start_page="297" end_page="298" type="metho">
    <SectionTitle>
3 A Processing Account Based on
Bottom-Up EPDAs
</SectionTitle>
    <Paragraph position="0"> [Joshi 1990] proposes to model human sentence processing with an Embedded Pushdown Automaton (EPDA), the automaton that recognizes tree adjoining languages. He defines the Principle of Partial Interpretation (PPI), which stipulates that structures are popped from the EPDA only when they form a properly integrated predicate-argument structure. Furthermore, it requires that a structure be popped only when it is either the root clause or the immediately embedded clause of the previously popped structure.</Paragraph>
    <Paragraph position="1"> Before extending this approach to the extraposition cases, I will recast it in terms of a closely related automaton, the Bottom-up EPDA (BEPDA; see footnote 1). The BEPDA consists of a finite-state control and a stack of stacks. There are two types of moves: either an input symbol is read and pushed onto a new stack on top of the stack of stacks, or a fixed number of stacks below and above a designated stack are removed from the stack of stacks and a new symbol is pushed onto the designated stack, which is now the top stack (an "unwrap" move). The operation of this automaton will be illustrated on the German center-embedded sentence N1N2N3V3V2V1 (see footnote 2). The moves of the BEPDA are shown in Table 3. The three nouns are read in, and each is pushed onto a new stack on top of the stack of stacks (steps 1-3).</Paragraph>
    <Paragraph position="2"> When V3 is read, it is combined with its nominal argument and replaces it on the top stack (step 4). The PPI prevents V3*° from being popped from the automaton, since V3*° is not the root clause and V2 has not yet been popped. V2 is then read and pushed onto a new stack (step 5a). In the next move (5b), N2, V3*° and V2 (i.e., V2 and its nominal and clausal complements) are unwrapped, and the combined V2*° is placed on top of the new top stack (the one formerly containing V3*°). A similar move happens in steps 6a and 6b. Now, V1*° can be popped from the automaton in accordance with the PPI. (Recall that V1*° contains its clausal argument, V2*°, which in turn contains its clausal argument, V3*°, so that at this point all input has been processed.) In summary, the machine operates as follows: it creates a new top stack for each input it reads, and unwraps whenever and as soon as this is possible.</Paragraph>
    <Paragraph position="3"> Footnote 1: I am indebted to Yves Schabes for suggesting the use of the BEPDA.</Paragraph>
    <Paragraph position="4"> Footnote 2: I will abbreviate the lexemes so that, for example, sentence (i) will be represented as N1N3V3V2V1. As in [Joshi 1990], an asterisk (e.g., V1*) denotes a verb not lacking any overt nominal complements. In extension to this notation, a circle (e.g., V1°) denotes a verb not lacking any clausal complements.</Paragraph>
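    <Paragraph> The sequence of moves just described can be traced with a small simulation. This is a minimal sketch under simplifying assumptions, not the automaton of [Joshi 1990]: it covers only the center-embedded pattern in which every verb has one nominal argument, it hard-codes the two unwrap shapes used in the walk-through, and it writes the completed-verb marking as the ASCII suffix "*o" (asterisk plus circle). The function name and the token encoding are hypothetical.

```python
def run_bepda(tokens, depth):
    """Trace BEPDA-style moves for a string such as N1 N2 N3 V3 V2 V1.

    Returns (trace, popped). `trace` lists (move, stack-of-stacks) pairs;
    `popped` is the pop order, or None if this sketch cannot finish.
    """
    stacks = []  # the stack of stacks; the last inner list is the top stack
    trace = []

    def log(move):
        trace.append((move, [list(s) for s in stacks]))

    for tok in tokens:
        kind, k = tok[0], int(tok[1:])
        done = f"V{k}*o"  # verb k with all of its complements
        if kind == "V" and k == depth and stacks and stacks[-1] == [f"N{k}"]:
            # Most deeply embedded verb: combine with its noun,
            # replacing the noun on the top stack (step 4 in the text).
            stacks[-1] = [done]
            log(f"read {tok} and combine with N{k}")
            continue
        stacks.append([tok])  # read move: push onto a new top stack
        log(f"read {tok}")
        if (kind == "V" and len(stacks) >= 3
                and stacks[-3] == [f"N{k}"]
                and stacks[-2][-1] == f"V{k + 1}*o"):
            # Unwrap: remove the noun's stack below and the verb's stack
            # above the designated stack, then push the combined symbol.
            designated = stacks[-2]
            stacks[-3:] = [designated + [done]]
            log(f"unwrap {tok}")

    # Pop phase: the root clause first, then each immediately embedded clause.
    expected = [f"V{k}*o" for k in range(depth, 0, -1)]
    if stacks != [expected]:
        return trace, None
    return trace, stacks[0][::-1]


trace, popped = run_bepda(["N1", "N2", "N3", "V3", "V2", "V1"], depth=3)
```

On the center-embedded input the trace has eight entries, matching steps 1-3, 4, 5a, 5b, 6a and 6b, and the structures are popped root clause first. Any string outside the pattern this sketch hard-codes simply yields None; that outcome says nothing about the full automaton.
    </Paragraph>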
    <Paragraph position="5"> Using a BEPDA rather than an EPDA has two advantages: first, the data-driven bottom-up automaton represents a more intuitive model of human sentence processing than the top-down automaton; second, the grammar that corresponds to the BEPDA analysis is the TAG grammar proposed independently on linguistic grounds, as shown in Figure 4. The unwrap in move 5a/b corresponds to the adjunction of tree β2 to tree α3 at the root node of α3 (shown by the arrow), and the unwrap in move 6a/b to the [...]. Let us consider how the BEPDA account can be extended to the extraposition cases, such as sentence (xxiii), N1V2V1N3V3. If we simply use the BEPDA for center-embedding described above, we get the sequence of moves in Figure 5. In move 3a, we can unwrap the nominal argument and verb of the matrix clause, which is popped in move 3b in accordance with the PPI. In move 3c, the clause of V2* can also be popped. Then, the remaining noun and verb are simply read and popped.</Paragraph>
    <Paragraph position="6"> If we use any of the metrics proposed in [Joshi 1990] (such as the sum, over all input elements, of the number of moves for which each element is stored in the stack), we predict that sentence (xxiii) is easier to process than sentence (i), which appears to be correct. It is easy to see how this analysis extends to sentence (xvi). Its processing would be predicted to be the easiest possible, and in fact it is the word order by far preferred by German speakers.</Paragraph>
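    <Paragraph> The storage metric mentioned above can be made concrete with a short sketch. The snapshots below are assumptions: they are one reading of the move descriptions in the text, listing which input elements are still held by the automaton after each move, and they are not transcribed from the figures of the paper. The sketch further assumes that popping a clause removes its verb together with that verb's own nominal arguments. The function and variable names are hypothetical.

```python
def storage_cost(snapshots):
    """Sum, over all moves, of the number of input elements still stored."""
    return sum(len(stored) for stored in snapshots)


# Assumed contents after each move for sentence (i), N1 N3 V3 V2 V1.
center_embedded = [
    ["N1"],                          # read N1
    ["N1", "N3"],                    # read N3
    ["N1", "N3", "V3"],              # read V3, combine with N3
    ["N1", "N3", "V3", "V2"],        # read V2
    ["N1", "N3", "V3", "V2"],        # unwrap V2 around the V3 clause
    ["N1", "N3", "V3", "V2", "V1"],  # read V1
    ["N1", "N3", "V3", "V2", "V1"],  # unwrap N1 and V1 around the V2 clause
    ["N3", "V3", "V2"],              # pop the matrix clause
    ["N3", "V3"],                    # pop the V2 clause
    [],                              # pop the V3 clause
]

# Assumed contents after each move for sentence (xxiii), N1 V2 V1 N3 V3.
extraposed = [
    ["N1"],              # read N1
    ["N1", "V2"],        # read V2
    ["N1", "V2", "V1"],  # read V1
    ["N1", "V2", "V1"],  # move 3a: unwrap the matrix clause
    ["V2"],              # move 3b: pop the matrix clause
    [],                  # move 3c: pop the V2 clause
    ["N3"],              # read N3
    ["N3", "V3"],        # read V3, combine with N3
    [],                  # pop the V3 clause
]

cost_i = storage_cost(center_embedded)
cost_xxiii = storage_cost(extraposed)
```

Under these assumed traces the extraposed order receives the lower cost, which is the direction of the prediction stated in the text; the absolute values depend entirely on how moves are counted.
    </Paragraph>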
    <Paragraph position="7"> Now let us turn to the third construction cases. If we assume the PPI, the only way for a simple TAG to derive the relevant word orders (e.g., N1N2V1V2) is by an analysis corresponding to verb raising as employed in Dutch. In Section 2, I mentioned linguistic evidence against a verb-raising analysis for German. Processing considerations also speak against this approach: we would have to postulate that German speakers can either use the German center-embedding strategy, or the Dutch verb-raising strategy. This would mean that German speakers should be as good at cross-serial dependencies as at center-embedding.</Paragraph>
    <Paragraph position="8"> However, in German at levels of embedding beyond 2, the center-embedding construction is clearly preferred. We are left with the conclusion that we must go beyond simple TAGs, as was in fact proposed in Section 2. Therefore, a simple BEPDA will not handle such cases either, and we will need an extension of the automaton. This extension will be explained by way of an example, sentence (iv).</Paragraph>
    <Paragraph position="9"> N1, N3, V2 and V3 are read in and placed on new top stacks (moves 1-4a). (Popping V2* would violate the PPI.) Now we unwrap V2* and combine it with V3°. This yields V2°: while formerly V2* did not lack any nominal arguments (since it has none of its own), V2° now has its clausal complement, but it is lacking a nominal complement (namely V3's; see footnote 4). The reason why N3 and V3 cannot be unwrapped around V2 is that V3 does not subcategorize for a clausal complement. We then unwrap N3 around V2° and get V2*° in step 4c. We can then unwrap and pop the matrix clause, and then pop V2*° in the usual manner.</Paragraph>
    <Paragraph position="10"> The grammar corresponding to the BEPDA of Figure 6 is shown in Figure 7 (the arrows again show the sequence of adjunctions): we see that the deferred incorporation of N3 corresponds to the use of a tree set for the clause of V3.</Paragraph>
    <Paragraph position="11"> Finally, let us consider the extremely marginal sentence (xxv), N1N3V2V1V3. Here, the automaton as defined so far would simply read in the input elements and push them on separate stacks. At no point can a clause be unwrapped (because both verb/noun pairs are too far apart), and the extension proposed to handle the third construction, the deferred incorporation of nominal arguments, cannot apply, either. The automaton rejects the string, as desired.</Paragraph>
    <Paragraph position="12"> Footnote 4: This operation can be likened to the operation of function composition in a categorial framework.</Paragraph>
  </Section>
</Paper>