<?xml version="1.0" standalone="yes"?> <Paper uid="C90-3027"> <Title>A Constraint-Based Approach to Linguistic Performance*</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 A Common Process Model </SectionTitle> <Paragraph position="0"> Among the partial structures hypothesized during comprehension or production of a sentence, we pay attention to the maximal structures, that is, the structures for which no larger structure exists. Here we say one structure is larger than another when the former includes the latter. For example, \[S \[NP Tom\] \[VP sleeps\]\] is larger than \[S \[NP Tom\] VP\]. Sentence processing, whether comprehension or production, is regarded as parallel construction of several maximal structures.</Paragraph> <Paragraph position="1"> Thus sentence processing as a whole is characterized by specifying what a maximal structure is.</Paragraph> <Paragraph position="2"> We assume the grammatical structure of a sentence to be a binary tree. Here we identify a word with its grammatical category, so that a local structure, such as \[NP Tom\], is regarded as one node rather than a partial tree consisting of two distinct nodes.</Paragraph> <Paragraph position="3"> We assume binary trees only for expository simplicity. Our account can be generalized straightforwardly to allow n-ary trees. Further, the essence of our discussion below is neutral between the constituency-based approaches and the dependency-based approaches. 
Here we employ a representation scheme of the former type, without committing ourselves to the constituency-based framework.</Paragraph> <Paragraph position="4"> From the general speculation below, it follows that a maximal structure should be the left-hand half of (5).</Paragraph> <Paragraph position="5"> (5) (tree diagram rooted at S; figure not reproduced) This maximal structure consists of the path from S to A and the part to the left of this path, except for B_{i-1} and the nodes between B_{i-1} and A_i (those on the slanted dotted lines) for 1 ≤ i ≤ d+1; A_i and the nodes between A_i and B_i are included in the maximal structure. Here B_0 and A_{d+1} stand for S and A, respectively. A_i is a leftmost descendant (not necessarily the left daughter) of B_{i-1}, or they are identical, for 1 ≤ i ≤ d+1. B_i is a rightmost descendant (not necessarily the right daughter) of A_i for 1 ≤ i ≤ d. Thus our model is similar to a left-corner parser \[1\], though our discussion is not restricted to parsing.</Paragraph> <Paragraph position="6"> This characterization of a maximal structure is obtained as follows. First note that a maximal structure involves n words and n - 1 nonterminal nodes, for some natural number n. In the maximal structure in (5), the connected substructure containing A_i (1 ≤ i ≤ d) contains as many nonterminal nodes as words, so that the maximal structure as a whole also contains as many nonterminal nodes as words, except for the word A. Note further that the entire sentence structure, being a binary tree, also involves one fewer nonterminal node than words.</Paragraph> <Paragraph position="7"> Accordingly, postulating n - 1 nonterminal nodes versus n words in a maximal structure amounts to postulating that the words and the nonterminal nodes are processed at approximately constant speed relative to each other.¹ The number of words is a measure of lexical information, and the number of nonterminal nodes is a measure of syntactic and semantic information, among others. 
Hence if all the types of linguistic information (lexical, syntactic, semantic, etc.) are processed at approximately the same relative speed, then a maximal process should include nearly as many words as nonterminal nodes.</Paragraph> <Paragraph position="8"> This premise is justified, because if different types of information were processed at different speeds, then there would arise an imbalance of information distribution across the corresponding domains of information. Such an imbalance should invoke information flow from the domains with higher density to the domains with lower density of information distribution when, as in the case of language, those domains of information are tightly related to each other. That is, information flow eliminates such imbalance, resulting in approximately the same speed of processing across different but closely related domains of information.</Paragraph> <Paragraph position="9"> Footnote 1: The rate of n words versus n - 1 nonterminals does not precisely represent the constant relative speed, but the discrepancy here is the least possible and thus acceptable as an approximation.</Paragraph> <Paragraph position="10"> Now that we have worked out how many nodes a maximal structure includes, what is left is which nodes it includes. Let us refer to A in (5) as the current active word and the path from the root node S to the current active word as the current active path. It is natural to consider that a maximal structure includes the nodes to the left of the current active path, because all the words they dominate have already been processed. 
Thus we come up with the above formulation of a maximal structure, if we notice that the nodes on the solid-line part (including A_i) of the current active path in (5) are adjacent to nodes to the left of the current active path, whereas the other nodes on the current active path (those on the dotted lines, including B_i) are not, except for the mother of A, which will be processed at the next moment.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Immediate Processing </SectionTitle> <Paragraph position="0"> According to this model, any word should be immediately processed, particularly in parsing, in the sense that a corresponding amount of syntactic and semantic structure is tailored with little delay. The intrasentential status of a word is hence identified as soon as it is encountered. This contrasts with the determinist accounts which assume lookahead to deal with local ambiguity.</Paragraph> <Paragraph position="1"> Empirical evidence supports our position. In Marslen-Wilson's \[13\] experiment, for instance, the subjects were asked to listen to a tape-recorded utterance and to say aloud what they heard with the shortest possible delay. Some subjects performed this task with a lag of only about one syllable, and yet their errors reflected both syntactic and semantic context. For example, one such subject said He had heard that the Brigade ... upon listening to He had heard at the Brigade .... Such a phenomenon cannot be accounted for in terms of the determinist accounts with fixed parsing procedures. In our model, it is explained by just assuming that only the most active maximal structure tailored by the subject survives the experimental situation. 
</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Transient Memory Load </SectionTitle> <Paragraph position="0"> By transient memory load (TML) we refer to the amount of linguistic information temporarily stored in short-term memory (STM). The measures of TML during sentence processing proposed so far include the depth of center embedding (CE) \[5\] and that of self embedding (SE) \[15\]. A syntactic constituent α is center-embedded in another syntactic constituent β when β = γαδ for some non-null strings γ and δ. We further say that α is self-embedded in β when they are of the same sort of category, say NP.</Paragraph> <Paragraph position="1"> However, neither CE nor SE can explain why (6) is much easier to understand than (7).</Paragraph> <Paragraph position="2"> (6) Tom knows the story that a man who lived in Helsinki and his wife were poor but they were happy.</Paragraph> <Paragraph position="3"> (7) Tom knows that the story on the fact that the rumor that Mary killed John was false is funny. Note that these sentences are of about the same length: the former consists of 20 words and the latter of 19 words. Almost all my informants (including both native and non-native speakers of English) reported that (6) is easier to understand than (7). 
Those who felt otherwise ascribed the difficulty of (6) to the ambiguity concerning the overall structure of the complement clause after that.</Paragraph> <Paragraph position="4"> The approach based on CE fails to account for this difference, because the maximum CE depth of (6) and that of (7) are both 3, as is shown below.</Paragraph> <Paragraph position="5"> (8) \[0 Tom knows the story that \[1 a man \[2 who \[3 lived\] in Helsinki\] and his wife were poor\] but they were happy\] (9) \[0 Tom knows that \[1 the story on the fact that \[2 the rumor that Mary \[3 killed\] John\] was false\] is funny\] The maximum SE depth cannot distinguish these sentences either: (10) Tom knows \[NP0 the story that \[NP1 a man who lived in \[NP2 Helsinki\] and his wife\] were poor but they were happy\] (11) Tom knows that \[NP0 the story on the fact that \[NP1 the rumor that \[NP2 Mary\] killed John\] was false\] is funny.</Paragraph> <Paragraph position="6"> Our model provides a TML measure which accounts for the contrast in question. In order to connect a maximal structure with the rest of the sentence in a grammatical manner, one must remember only the information contained in the categories on the border between the maximal structure and the remaining context; i.e., categories A_i, the mother of B_i (1 ≤ i ≤ d), and A in (5). Thus the value of d in (5) could serve as a TML measure. As is illustrated in (12) and (13), in fact, the maximum of d is 2 and 3 for (6) and (7), respectively, explaining why (6) is easier. In (12) and (13), enclosed in boxes are the nodes corresponding to A_i, B_i (1 ≤ i ≤ d) and A when d is the maximum; i.e., 2 in the former and 3 in the latter.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 6 Language Acquisition </SectionTitle> <Paragraph position="0"> The Dutch language exhibits a type of cross-serial dependency (CSD) in subordinate clauses: (14) ... dat Wolf de kinderen Marie zag helpen zwemmen (gloss: ... that Wolf the children Marie see-PAST help-INF swim-INF) '... that Wolf saw the children help Marie swim' 
Our theory predicts that children learning Dutch come to recognize the CSD constructions as having the following structure, which coincides with the structure figured out by Bresnan et al. \[4\]² based on an analysis of adult language.</Paragraph> <Paragraph position="1"> (15) (tree diagram not reproduced) (m > m). Note that NP_1, ..., NP_m and V_0, ..., V_n constitute right-branching structures dominated by X_0 and Z_0, respectively.</Paragraph> <Paragraph position="2"> Let us look at how a child comes to regard a simple CSD construction (16) as (17), which is an instance of (15) for m = n = 1.</Paragraph> <Paragraph position="3"> (16) ... dat Wolf Marie zag zwemmen (gloss: ... that Wolf Marie see-PAST swim-INF) '... that Wolf saw Marie swim' According to our model, the relevant part of the most active maximal structure would look like the following when zag has just been acknowledged, provided that the child has already acquired the standard structure of a subordinate clause, in which the finite verb appears at the end.</Paragraph> <Paragraph position="4"> Footnote 2: (15) is slightly different from the structure proposed by Bresnan et al., because we regard a sentence structure as a binary tree whereas their proposal involves ternary branching obtained by equating VP and X_0 in (15). This difference is irrelevant to the essence of the following discussion.</Paragraph> <Paragraph position="5"> VP_0, VP_1, Z_0 and V_0 correspond to B_{d-1}, A_d, B_d and A in (5), respectively (so that VP_0 and Z_0 are not included in the maximal structure here). When zwemmen is encountered, category \[V1 zwemmen\] must be inserted either between VP_0 and VP_1 or between Z_0 and V_0. In the alleged subordinate clause construction, Z_0 (which might be identical to V_0) has direct access to \[NP1 Marie\], which is the object of zag, the alleged head of Z_0. 
On the other hand, VP_1 lacks such access, because the relationship between Marie and zag is established not through but under VP_1. It is hence preferable that \[V1 zwemmen\] attaches beneath Z_0, if the child has already perceived extralinguistically the situation being described, in which Marie is swimming. Now the most active maximal structure should look like this (Z_0 and Z_1 are excluded from this maximal structure if they are distinct from V_0 and V_1). Note that this reasoning essentially relies on our formulation of a maximal process. If a bottom-up model were assumed instead, for instance, there would be no immediate reason to exclude a structure, say, as follows. The above discussion can be extended to cover more complex cases (where m > 1 in (15)) in a rather straightforward manner, as is discussed by Hasida \[6\]. The structure under X_0 is tailored as a natural extension of the way an ordinary subordinate clause is processed, then V_0 is inserted beneath VP, following the ordinary structure of a subordinate clause together with the semantic information about the situation described, and V_i attaches near V_{i-1} for 1 ≤ i ≤ n due to the semantic information again. The structure under Z_0 must be right-branching so that V_0 is the head of VP.</Paragraph> <Paragraph position="6"> Also by reference to the current model, Hasida \[7\] further gives an account of the unacceptability of some unbounded dependency constructions in English which is hard to explain in static terms of linguistics.</Paragraph> </Section> </Paper>