<?xml version="1.0" standalone="yes"?> <Paper uid="P90-1006"> <Title>MEMORY CAPACITY AND SENTENCE PROCESSING</Title> <Section position="4" start_page="39" end_page="40" type="metho"> <SectionTitle> 2 THE UNDERLYING PARSER </SectionTitle> <Paragraph position="0"> The parser to which the memory limitation constraints apply must construct representations in such a way that incomplete input is associated with some structure. Furthermore, the parsing algorithm must, in principle, allow more than one structure for an input string, so that the general constraints described in the previous section may apply to restrict the possibilities.</Paragraph> <Paragraph position="1"> The parsing model that I will assume is an extension of the model described in Clark and Gibson (1988). When a word is input, representations for each of its lexical entries are built and placed in the buffer, a one-cell data structure that holds a set of tree structures. The parsing model contains a second data structure, the stack-set, which contains a set of stacks of buffer cells.</Paragraph> <Paragraph position="2"> The parser builds trees in parallel based on possible attachments made between the buffer and the top of each stack in the stack-set. The buffer and stack-set are formally defined in (3) and (4).</Paragraph> <Paragraph position="3"> (3) A buffer cell is a set of structures $\{ S_1, \ldots, S_n \}$, where each $S_i$ represents the same segment of the input string. 
The buffer contains one buffer cell.</Paragraph> <Paragraph position="4"> (4) The stack-set is a set of stacks of buffer cells, where each stack represents the same segment of the input string:</Paragraph> <Paragraph position="6"> where: $p$ is the number of stacks; $m_i$ is the number of buffer cells in stack $i$; and $n_{ij}$ is the number of tree structures in the $j$th buffer cell of stack $i$.</Paragraph> <Paragraph position="7"> The motivation for these data structures is given by the desire for a completely unconstrained parsing algorithm upon which constraints may be placed: this algorithm should allow all possible parser operations to occur at each parse state. There are exactly two parser operations: attaching a node to another node and pushing a buffer cell onto a stack. In order to allow both of these operations to be performed in parallel, it is necessary to have the given data structures: the buffer and the stack-set. For example, consider a parser state in which the buffer is non-empty and the stack-set contains only a single-cell stack:</Paragraph> <Paragraph position="9"> Suppose that attachments are possible between the buffer and the single stack cell. The structures that result from these attachments will take up a single stack cell. Let us call these resultant structures $A_1, A_2, \ldots, A_k$. If all possible operations are to take place at this parser state, then the contents of the current buffer must also be pushed on top of the current stack. 
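The buffer, the stack-set, and the two parser operations described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the parser's actual implementation: trees are stood in for by plain strings, and the names `step` and `toy_attach` are hypothetical, as is the rule that `toy_attach` always succeeds in exactly one way.

```python
from typing import Callable, FrozenSet, Iterable, Set, Tuple

Tree = str                   # assumption: a tree is represented by a bracketed string
Cell = FrozenSet[Tree]       # buffer cell: structures spanning the same input segment
Stack = Tuple[Cell, ...]     # stack of buffer cells; the last item is the top


def step(stack_set: Set[Stack], buffer: Cell,
         attach: Callable[[Tree, Tree], Iterable[Tree]]) -> Set[Stack]:
    """Apply both parser operations to every stack, keeping all results."""
    result: Set[Stack] = set()
    for stack in stack_set:
        # Operation 1: push the buffer cell on top of the stack.
        result.add(stack + (buffer,))
        # Operation 2: attach buffer structures to structures in the top cell.
        if stack:
            attached = frozenset(
                tree
                for top in stack[-1]
                for item in buffer
                for tree in attach(top, item)
            )
            if attached:
                # The resulting attachments occupy a single stack cell.
                result.add(stack[:-1] + (attached,))
    return result


def toy_attach(top: Tree, item: Tree) -> Iterable[Tree]:
    """Hypothetical attachment rule: every pair attaches in exactly one way."""
    return [f"[{top} {item}]"]


stack_set = {(frozenset({"S1", "S2"}),)}     # one single-cell stack
buffer = frozenset({"B1"})
new_stack_set = step(stack_set, buffer, toy_attach)
stack_depths = sorted(len(stack) for stack in new_stack_set)
```

With these toy inputs, `new_stack_set` holds one stack of depth 1 (the attachments) and one stack of depth 2 (the pushed buffer).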
Thus two stacks result, both representing the same segment of the input string.</Paragraph> <Paragraph position="11"> Since these two stacks break up the same segment of the input string in different ways, the stack-set data structure is necessary.</Paragraph> </Section> <Section position="5" start_page="40" end_page="41" type="metho"> <SectionTitle> 3 TWO SYNTACTIC PROPERTIES DERIVABLE FROM THE θ-CRITERION </SectionTitle> <Paragraph position="0"> Following early work in linguistic theory, I distinguish two kinds of categories: functional categories and thematic or content categories (see, for example, Fukui and Speas (1986) and Abney (1987) and the references cited in each). Thematic categories include nouns, verbs, adjectives and prepositions; functional categories include determiners, complementizers, and inflection markers. There are a number of properties that distinguish functional elements from thematic elements, the most crucial being that functional elements mark grammatical or relational features while thematic elements pick out a class of objects or events. I will assume as a working hypothesis that only those syntactic properties that have to do with the thematic elements of an utterance are relevant to preferences and overload in processing. One principle of syntax that is directly involved with the thematic content of an utterance in Government-Binding theory is the θ-Criterion: (7) Each argument bears one and only one θ-role (thematic role) and each θ-role is assigned to one and only one argument (Chomsky 1981:36).</Paragraph> <Paragraph position="1"> I hypothesize that the human parser attempts to locally satisfy the θ-Criterion whenever possible. Thus given a thematic role, the parser prefers to assign that role, and given a thematic element, the parser prefers to assign a role to that element. 
These assumptions are made explicit as the following properties: (8) The Property of Thematic Reception (PTR): Associate a load of $x_{TR}$ PLUs of short term memory to each thematic element that is in a position that can receive a thematic role in some co-existing structure, but whose θ-assigner is not unambiguously identifiable in the structure in question.</Paragraph> <Paragraph position="2"> (9) The Property of Thematic Assignment (PTA): Associate a load of $x_{TA}$ PLUs of short term memory to each thematic role that is not assigned to a node containing a thematic element.</Paragraph> <Paragraph position="3"> Note that the Properties of Thematic Assignment and Reception are stated in terms of thematic elements.</Paragraph> <Paragraph position="4"> Thus the Property of Thematic Reception does not apply to functional categories, whether or not they are in positions that receive thematic roles. Similarly, if a thematic role is assigned to a functional category, the Property of Thematic Assignment treats that role as unassigned until there is a thematic element inside this constituent.</Paragraph> </Section> <Section position="6" start_page="41" end_page="43" type="metho"> <SectionTitle> 4 AMBIGUITY AND THE PROPERTIES OF THEMATIC ASSIGNMENT AND RECEPTION </SectionTitle> <Paragraph position="0"> Consider sentence (10) with respect to the Properties of Thematic Assignment and Reception: (10) John expected Mary to like Fred.</Paragraph> <Paragraph position="1"> The verb expect is ambiguous: either it takes an NP complement as in the sentence John expected Mary or it takes an IP complement as in (10) (footnote 4). Consider the state of the parse of (10) after the word Mary has been processed: (11) a. [IP [NP John ] [VP expected [NP Mary ]]] b. [IP [NP John ] [VP expected [IP [NP Mary ] ]]] In (11a), the NP Mary is attached as the NP complement of expected. 
In this representation there is no load associated with either of the Properties of Thematic Assignment or Reception since no thematic elements need thematic roles and no thematic roles are left unassigned. In (11b), the NP Mary is the specifier of a hypothesized IP node which is attached as the complement of the other reading of expected (footnote 5). This representation is associated with at least $x_{TR}$ PLUs since the NP Mary is in a position that can be associated with a thematic role, the subject position, but its θ-assigner is not yet identifiable. No load is associated with the Property of Thematic Assignment, however, since both thematic roles of the verb expected are assigned to nodes that contain thematic elements.</Paragraph> <Paragraph position="2"> Footnote 4: Following current notation in GB Theory, IP (Inflection Phrase) = S and CP (Complementizer Phrase) = S' (Chomsky 1986). Footnote 5: I assume some form of hypothesis-driven node projection so that noun phrases are projected to those categories that they may specify. Motivation for this kind of projection algorithm is given by the processing of Dutch (Frazier 1987) and the processing of certain English noun phrase constructions (Gibson 1989).</Paragraph> <Paragraph position="3"> Since there is no difficulty in processing sentence (10), the load difference between these two structures cannot be greater than $P$ PLUs, the preference factor in inequality (2). Thus the inequality in (12) is obtained: (12) $x_{TR} \leq P$ Since the load difference between the two structures is not sufficient to cause a strong preference, both structures are maintained. Note that this is an important difference between the theory presented here and the theories presented in Frazier and Fodor (1978), Frazier (1979) and Pritchett (1988). In each of these theories, only one representation can be maintained, so that either (11a) or (11b) would be preferred. 
In order to account for the lack of difficulty in parsing (10), Frazier and Pritchett both assume that reanalysis in certain situations is not expensive. No such stipulation is necessary in the framework given here: it is simply assumed that all reanalysis is expensive (footnote 6). Consider now sentence (13) with respect to the Properties of Thematic Assignment and Reception: (13) John expected her mother to like Fred.</Paragraph> <Paragraph position="4"> Consider the state of the parse of (13) after the word her has been processed. In one representation the NP her will be attached as the NP complement of expected: (14) [IP [NP John ] [VP expected [NP her ]]] In this representation there is no load associated with either of the Properties of Thematic Assignment or Reception since no thematic elements need thematic roles and no thematic roles are left unassigned. In another representation the NP her is the specifier of a hypothesized NP which is pushed onto a substack containing the other reading of the verb expected: (15) { { [IP [NP John ] [VP expected [IP e ]]] } { [NP [NP her ] ] } } This representation is associated with at least $x_{TA}$ PLUs since the verb expected has a thematic role to assign. However, no load is associated with the genitive NP specifier her since its θ-assigner, although not yet present, is unambiguously identified as the head of the NP to follow (Chomsky 1986a) (footnote 7). Thus the total load associated with (15) is $x_{TA}$ PLUs. Since there is no difficulty in processing sentence (13), the load difference [Footnote 6: See Section 4.1 for a brief comparison between the model proposed here and serial models such as those proposed by Frazier and Fodor (1978) and Pritchett (1988).]</Paragraph> <Paragraph position="5"> Footnote 7: Note that specifiers do not always receive their thematic roles from the categories which they specify. For example, a non-genitive noun phrase may specify any major category. In particular, it may specify an IP or a CP. 
But the specifier of these categories may receive its thematic role through chain formation from a distant θ-assigner, as in (16): (16) John appears to like beans.</Paragraph> <Paragraph position="6"> Note that there is no NP that corresponds to (16) (Chomsky [...] between these two structures cannot be greater than $P$ PLUs. Thus the second inequality, (18), is obtained: (18) $x_{TA} \leq P$ Now consider (19) (footnote 8): (19) # I put the candy on the table in my mouth.</Paragraph> <Paragraph position="7"> This sentence becomes ambiguous when the preposition on is read. This preposition may attach as an argument of the verb put or as a modifier of the NP the candy: (20) a. I [VP [V' [V put ] [NP the candy ] [PP on ] ]] b. I [VP [V' [V put ] [NP the candy [PP on ] ] ]] At this point the argument attachment is strongly preferred. However, this attachment turns out to be incompatible with the rest of the sentence. When the word mouth is encountered, no pragmatically coherent structure can be built, since tables are not normally found in mouths. Thus a garden-path effect results.</Paragraph> <Paragraph position="8"> Consider the parse state depicted in (20) with respect to the Properties of Thematic Assignment and Reception.</Paragraph> <Paragraph position="9"> The load associated with the structure resulting from argument attachment is $x_{TA}$ PLUs since, although the θ-grid belonging to the verb put is filled, the thematic role assigned by the preposition on remains unassigned. On the other hand, the load associated with the modifier attachment is $2 x_{TA} + x_{TR}$ PLUs since 1) both the verb put and the preposition on have thematic roles that need to be assigned and 2) the PP headed by on receives a thematic role in the argument attachment structure, while it receives no such role in the structure under consideration. 
Thus the difference between the loads associated with the two structures is $x_{TA} + x_{TR}$ PLUs.</Paragraph> <Paragraph position="10"> Since the argument attachment structure is strongly preferred over the other structure, I hypothesize that this load is greater than $P$ PLUs: (21) $x_{TA} + x_{TR} > P$</Paragraph> <Paragraph position="12"> Now consider the well-known garden-path sentence in (22): (22) # The horse raced past the barn fell. The structure for the input the horse raced is ambiguous between at least the two structures in (23): (23) a. [IP [NP the horse ] [VP raced ]] b. [IP [NP the [N' [N' horse_i ] [CP O_i raced ] ]] ] Structure (23a) has no load associated with it due to either the PTA or the PTR. Crucially, note that the verb raced has an intransitive reading so that no load is required via the Property of Thematic Assignment.</Paragraph> <Paragraph position="13"> On the other hand, structure (23b) requires a load of $2 x_{TR}$ PLUs since 1) the noun phrase the horse is in a position that can receive a thematic role, but currently does not and 2) the operator $O_i$ is in a position that may be associated with a thematic role, but is not yet associated with one (footnote 9). [Footnote 8: I will prefix sentences that are difficult to parse because of memory limitations with the symbol &quot;#&quot;. Hence sentences that are unacceptable due to processing overload will be prefixed with &quot;#&quot;, as will garden-path sentences.] Thus the difference between the processing loads of structures (23a) and (23b) is $2 x_{TR}$ PLUs. 
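The load arithmetic used in these comparisons can be sketched as a small Python function. The names `load` and `survivors`, and the sample values chosen for the PTA load, the PTR load and P, are illustrative assumptions only; the pruning rule (discard a structure whose load exceeds that of the cheapest structure by more than P) is one reading of the preference factor referred to in the text, not a statement of the model's implementation.

```python
from fractions import Fraction


def load(unassigned_roles: int, roleless_elements: int,
         x_ta: Fraction, x_tr: Fraction) -> Fraction:
    """Total load in PLUs: PTA cost per unassigned role plus PTR cost per
    thematic element whose role assigner is not yet identifiable."""
    return unassigned_roles * x_ta + roleless_elements * x_tr


def survivors(loads: dict, p: Fraction) -> list:
    """Keep structures whose load exceeds the minimum by at most p."""
    best = min(loads.values())
    return sorted(name for name, value in loads.items() if p >= value - best)


# Assumed sample values, for illustration only.
X_TA, X_TR, P = Fraction(3, 4), Fraction(3, 4), Fraction(1)

# "The horse raced": (23a) carries no load, (23b) has two role-less elements.
horse = {"23a": load(0, 0, X_TA, X_TR), "23b": load(0, 2, X_TA, X_TR)}
# "John expected Mary": (11a) carries no load, (11b) has one role-less element.
expect = {"11a": load(0, 0, X_TA, X_TR), "11b": load(0, 1, X_TA, X_TR)}

kept_horse = survivors(horse, P)     # only the cheaper structure remains
kept_expect = survivors(expect, P)   # both structures are maintained
```

Under these assumed values, only (23a) survives for the horse raced, while both (11a) and (11b) are kept for John expected Mary.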
Since this sentence is a strong garden-path sentence, it is hypothesized that a load difference of $2 x_{TR}$ PLUs is greater than the allowable limit, $P$ PLUs: (24) $2 x_{TR} > P$ A surprising effect occurs when a verb which optionally subcategorizes for a direct object, like race, is replaced by a verb which obligatorily subcategorizes for a direct object, like find: (25) The bird found in the room was dead.</Paragraph> <Paragraph position="14"> Although the structures and local ambiguities in (25) and (22) are similar, (22) causes a garden-path effect while, surprisingly, (25) does not. To determine why (25) is not a garden-path sentence we need to examine the local ambiguity when the word found is read: (26) a. [IP [NP the bird ] [VP [V' [V found ] [NP ] ]]] b. [IP [NP the [N' [N' bird_i ] [CP O_i found ] ]] ] The crucial difference between the verb found and the verb raced is that found requires a direct object, while raced does not. Since the θ-grid of the verb found is not filled in structure (26a), this representation is associated with $x_{TA}$ PLUs of memory load. Like structure (23b), structure (26b) requires $2 x_{TR}$ PLUs.</Paragraph> <Paragraph position="15"> Thus the difference between the processing loads of structures (26a) and (26b) is $2 x_{TR} - x_{TA}$ PLUs. Since no garden-path effect results in (25), I hypothesize that this load is less than or equal to $P$ PLUs: (27) $2 x_{TR} - x_{TA} \leq P$ Furthermore, these results correctly predict that sentence (28) is not a garden-path sentence either: (28) The bird found in the room enough debris to build a nest.</Paragraph> <Paragraph position="16"> Hence we have the following system of inequalities: (29) a. $x_{TR} \leq P$ b. $x_{TA} \leq P$ c. $x_{TA} + x_{TR} > P$ d. $2 x_{TR} > P$ e. $2 x_{TR} - x_{TA} \leq P$ This system of inequalities is consistent. Thus it identifies a particular solution space. This solution space is depicted by the shaded region in Figure 1. 
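The consistency of the system in (29) can be checked mechanically. The sketch below, in Python, tests candidate values against the five inequalities, reading (29a), (29b) and (29e) as non-strict in line with the wording "cannot be greater than" and "less than or equal to" used when they were derived. The function name `satisfies`, the normalization P = 1, and the grid of tenths are assumptions made purely for illustration.

```python
from fractions import Fraction


def satisfies(x_ta: Fraction, x_tr: Fraction, p: Fraction) -> bool:
    """Return True if the loads meet all five inequalities in (29)."""
    return all([
        p >= x_tr,               # (29a)
        p >= x_ta,               # (29b)
        x_ta + x_tr > p,         # (29c)
        2 * x_tr > p,            # (29d)
        p >= 2 * x_tr - x_ta,    # (29e)
    ])


P = Fraction(1)                                  # assumed normalization
grid = [Fraction(i, 10) for i in range(11)]      # 0, 1/10, ..., 1
solutions = [(x_ta, x_tr) for x_ta in grid for x_tr in grid
             if satisfies(x_ta, x_tr, P)]
```

A non-empty `solutions` list shows that the sampled grid intersects the solution space; it does not delimit the region exactly.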
Note that, pretheoretically, there is no reason for this system of inequalities to be consistent. It could have been that the parser state of one of the example sentences forced an inequality that contradicted some previously obtained inequality. This situation would have had one of three implications: the properties being considered might be incorrect; the properties being considered might be incomplete; or the whole approach might be incorrect.</Paragraph> <Paragraph position="18"> Since this situation has not yet been observed, the results mutually support one another.</Paragraph> <Section position="1" start_page="43" end_page="43" type="sub_section"> <SectionTitle> 4.1 A COMPARISON WITH SERIAL MODELS </SectionTitle> <Paragraph position="0"> Because serial models of parsing can maintain at most one representation for any input string, they have difficulty explaining the lack of garden-path effects in sentences like (10) and (25): (10) John expected Mary to like Fred.</Paragraph> <Paragraph position="1"> (25) The bird found in the room was dead.</Paragraph> <Paragraph position="2"> As a result of this difficulty Pritchett (1988) proposes the Theta Reanalysis Constraint (footnote 10): (30) Theta Reanalysis Constraint (TRC): Syntactic reanalysis which interprets a θ-marked constituent as outside its current θ-Domain and as within an existing θ-Domain of which it is not a member is costly. (31) θ-Domain: α is in the γ θ-Domain of β iff α receives the γ θ-role from β or α is dominated by a constituent that receives the γ θ-role from β. As a result of the Theta Reanalysis Constraint, the necessary reanalysis in each of (10) and (25) is not expensive, so that no garden-path effect is predicted. Furthermore, the reanalysis in sentences like (22) and (19) violates the TRC, so that the garden-path effects are predicted.</Paragraph> <Paragraph position="3"> However, there are a number of empirical problems with Pritchett's theory. 
[Footnote 10: Frazier and Rayner (1982) make a similar stipulation to account for problems with the theory of Frazier and Fodor (1978). However, their account fails to explain the lack of garden-path effect in (25). See Pritchett (1988) for a description of further problems with their analysis.] First of all, it turns out that the Theta Reanalysis Constraint as defined in (30) incorrectly predicts that the sentences in (32) do not induce garden-path effects: (32) a. # The horse raced past the barn was failing. b. # The dog walked to the park seemed small.</Paragraph> <Paragraph position="4"> c. # The boat floated down the river was a canoe.</Paragraph> <Paragraph position="5"> For example, consider (32a). When the auxiliary verb was is encountered, reanalysis is forced. However, the auxiliary verb was does not have a thematic role to assign to its subject, the horse, so the TRC is not violated. Thus Pritchett's theory incorrectly predicts that these sentences do not cause garden-path effects. Other kinds of local ambiguity that do not give the human parser difficulty also pose a challenge to serial parsers. Marcus (1980) gives the sentences in (33) as evidence that any deterministic parser must be able to look ahead in the input string (footnote 11): (33) a. Have the boys taken the exam today? b. Have the boys take the exam today.</Paragraph> <Paragraph position="6"> Any serial parser must be able to account for the lack of difficulty with either of the sentences in (33). It turns out that the Theta Reanalysis Constraint does not help in cases like these: no matter which analysis is pursued first, reanalysis will violate the TRC.</Paragraph> </Section> </Section> <Section position="7" start_page="43" end_page="44" type="metho"> <SectionTitle> 4.2 EMPIRICAL SUPPORT: FURTHER GARDEN-PATH EFFECTS </SectionTitle> <Paragraph position="0"> Given the Properties of Thematic Assignment and Reception and their associated loads, we may now explain many more garden-path effects. 
Consider (34): (34) # The Russian women loved died.</Paragraph> <Paragraph position="1"> Up until the last word, this sentence is ambiguous between two readings: one where loved is the matrix verb; and the other where loved heads a relative clause modifier of the noun Russian. The strong preference for the matrix verb interpretation of the word loved can be easily explained if we examine the possible structures upon reading the word women: (35) a. [IP [NP the Russian women ] ] b. [IP [NP the [N' [N' Russian_i ] [CP [NP O_i ] [IP [NP women ] ]] ]] ] Structure (35a) requires $x_{TR}$ PLUs since the NP the Russian women needs but currently lacks a thematic role. Structure (35b), on the other hand, requires at least $3 x_{TR}$ PLUs since 1) two noun phrases, the Russian and women, need but currently lack thematic roles; and 2) the operator in the specifier position of the modifying Comp phrase can be associated with a thematic role, but currently is not linked to one. Since the difference between these loads is $2 x_{TR}$ PLUs, which by (24) is greater than $P$, a garden-path effect results.</Paragraph> <Paragraph position="2"> Consider now (36): (36) # John told the man that Mary kissed that Bill saw Phil.</Paragraph> <Paragraph position="3"> Footnote 11: Note that the model that I am proposing here is a parallel model, and therefore is nondeterministic.</Paragraph> <Paragraph position="4"> When parsing sentence (36), people will initially analyze the CP that Mary kissed unambiguously as an argument of the verb told. It turns out that this hypothesis is incompatible with the rest of the sentence, so that a garden-path effect results. In order to see how the garden-path effect comes about, consider the parse state which occurs after the word Mary is read: (37) a. [IP [NP John ] [VP [V' [V told ] [NP the man ] [CP that [IP [NP Mary ] ]] ]]] b. 
[IP [NP John ] [VP [V' [V told ] [NP the [N' [N' man_i ] [CP [NP O_i ] that [IP [NP Mary ] ]] ]] ]]] Structure (37a) requires no load by the PTA since the θ-grid of the only θ-assigner is filled with structures that each contain thematic elements. However, the noun phrase Mary requires $x_{TR}$ PLUs by the Property of Thematic Reception since this NP is in a thematic position but does not yet receive a thematic role. Thus the total load associated with structure (37a) is $x_{TR}$ PLUs. Structure (37b), on the other hand, requires a load of $x_{TA} + 2 x_{TR}$ PLUs since 1) the thematic role PROPOSITION is not yet assigned by the verb told; 2) the operator in the specifier position of the CP headed by that is not linked to a thematic role; and 3) the NP Mary is in a thematic position but does not yet receive a thematic role. Thus the load difference is $x_{TA} + x_{TR}$ PLUs, enough for the more expensive structure to be dropped. Thus only structure (37a) is maintained and a garden-path effect eventually results, since this structure is not compatible with the entire sentence. Hence the Properties of Thematic Assignment and Reception make the correct predictions with respect to (36).</Paragraph> <Paragraph position="5"> Consider the garden-path sentence in (38): (38) # John gave the boy the dog bit a dollar.</Paragraph> <Paragraph position="6"> This sentence causes a garden-path effect since the noun phrase the dog is initially analyzed as the direct object of the verb gave rather than as the subject of a relative clause modifier of the NP the boy. This garden-path effect can be explained in the same way as previous examples. Consider the state of the parse after the NP the dog has been processed: (39) a. [IP [NP John ] [VP [V' [V gave ] [NP the boy ] [NP the dog ] ]]] b. 
[IP [NP John ] [VP [V' [V gave ] [NP the [N' [N' boy_i ] [CP [NP O_i ] [IP [NP the dog ] ]] ]] [NP ] ]]] While structure (39a) requires no load at this stage, structure (39b) requires $2 x_{TR} + x_{TA}$ PLUs since 1) one thematic role has not yet been assigned by the verb gave; 2) the operator in the specifier position of the CP modifying boy is not linked to a thematic role; and 3) the NP the dog is in a thematic position but does not yet receive a thematic role. Thus structure (39a) is strongly preferred and a garden-path effect results.</Paragraph> <Paragraph position="7"> The garden-path effect in (40) can also be easily explained in this framework: (40) # The editor authors the newspaper hired liked laughed.</Paragraph> <Paragraph position="8"> Consider the state of the parse of (40) after the word authors has been read: (41) a. [IP [NP the editor ] [VP [V' [V authors ] [NP ] ]]] b. [IP [NP the [N' [N' editor_i ] [CP [NP O_i ] [IP [NP authors ] ]] ]] ] The word authors is ambiguous between nominal and verbal interpretations. The structure including the verbal reading is associated with $x_{TA}$ PLUs since the θ-grid for the verb authors includes an unassigned role. Structure (41b), on the other hand, includes three noun phrases, each of which is in a position that may be linked to a thematic role but currently is not linked to any θ-role. Thus the load associated with structure (41b) is $3 x_{TR}$ PLUs. 
Since the difference between the loads associated with structures (41b) and (41a) is so high ($3 x_{TR} - x_{TA}$ PLUs), only the inexpensive structure, structure (41a), is maintained.</Paragraph> </Section> <Section position="8" start_page="44" end_page="44" type="metho"> <SectionTitle> 5 PROCESSING OVERLOAD </SectionTitle> <Paragraph position="0"> The Properties of Thematic Assignment and Reception also give a plausible account of the unacceptability of sentences with an abundance of center-embedding.</Paragraph> <Paragraph position="1"> Recall that I assume that a sentence is unacceptable because of short term memory overload if the combined memory load associated with properties of the structures built at some stage of the parse of the sentence is greater than the allowable processing load $K$.</Paragraph> <Paragraph position="2"> Consider (42): (42) # The man that the woman that the dog bit likes eats fish.</Paragraph> <Paragraph position="3"> Once the noun phrase the dog has been input, the structure for the partial sentence is as follows: (43) [IP [NP the [N' [N' man_i ] [CP [NP O_i ] that [IP [NP the [N' [N' woman_j ] [CP [NP O_j ] that [IP [NP the dog ] ]] ]] ]] ]] ] In this representation there are three lexical noun phrases that need thematic roles but lack them. Furthermore, there are two non-lexical NPs, operators, that are in positions that may prospectively be linked to thematic roles. Thus, under my assumptions, the load associated with this representation is at least $5 x_{TR}$ PLUs. I assume that these properties are responsible for the unacceptability of this sentence, resulting in the inequality in (44): (44) $5 x_{TR} > K$ Note that sentences with only one relative clause modifying the subject are acceptable, as is exemplified in (45): (45) The man that the woman likes eats fish.</Paragraph> <Paragraph position="4"> Since (45) is acceptable, its load does not exceed the maximum at any stage of its processing. 
After processing the noun phrase the woman in (45), there are three noun phrases that currently lack θ-roles but may be linked to θ-roles as future input appears. Thus we arrive at the inequality in (46): (46) $3 x_{TR} \leq K$ Thus I assume that the maximum processing load that people can handle lies at or above $3 x_{TR}$ PLUs but below $5 x_{TR}$ PLUs. Although these data are only suggestive, they make the right kinds of predictions. Future research should establish the boundary between acceptability and unacceptability more precisely.</Paragraph> </Section> </Paper>