File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/00/j00-3002_metho.xml
Size: 17,565 bytes
Last Modified: 2025-10-06 14:07:14
<?xml version="1.0" standalone="yes"?> <Paper uid="J00-3002"> <Title>Incremental Processing and Acceptability</Title> <Section position="5" start_page="320" end_page="320" type="metho"> <SectionTitle> A, A\B, B\C ⇒ C; A\B, B\C ⇒ A\C (\R) </SectionTitle> <Paragraph position="0"> Every rule except Cut, where the Cut formula A does not appear in the conclusion, has exactly one fewer connective occurrence in its premises than in its conclusion. Lambek (1958) proved Cut elimination, i.e., that every proof has a Cut-free counterpart; hence a decision procedure for theoremhood is given by backward-chaining proof search in the Cut-free fragment. The nonatomic instances of the id axiom are derivable from atomic instances by the rules for the connectives. But even in the Cut-free atomic-id calculus there is spurious ambiguity: equivalent derivations that differ only in irrelevant rule ordering. For example, composition as above has the following alternative derivation:</Paragraph> </Section> <Section position="6" start_page="320" end_page="322" type="metho"> <SectionTitle> (15) A ⇒ A, B ⇒ B (\L): A, A\B ⇒ B; with C ⇒ C (\L): A, A\B, B\C ⇒ C; (\R): A\B, B\C ⇒ A\C </SectionTitle> <Paragraph position="0"> One approach to this problem consists in defining, within the Cut-free atomic-id space, normal form derivations in which the succession of rule application is regulated (König 1989, Hepple 1990, Hendriks 1993). 
Each sequent has a distinguished category formula (underlined) on which rule applications are keyed:</Paragraph> <Paragraph position="2"> In the regulated calculus there is no spurious ambiguity, and provided there is no explicit or implicit antecedent product, i.e., provided ·L is not needed, Γ ⇒ A is a theorem of the Lambek calculus iff Γ ⇒ A is a theorem of the regulated calculus.</Paragraph> <Paragraph position="3"> However, apart from the issue regarding ·L, there is a general cause for dissatisfaction with this approach: it assumes the initial presence of the entire sequent to be proved, i.e., it is in principle nonincremental. On the other hand, allowing incrementality on the basis of Cut would reinstate with a vengeance the problem of spurious ambiguity, for then what are to be the Cut formulas? Consequently, the sequent approach is ill-equipped to address the basic asymmetry of language (the asymmetry of its processing in time) and has never been put forward in a model of the kind of processing phenomena cited in the introduction.</Paragraph> <Paragraph position="4"> Morrill Incremental Processing and Acceptability An alternative formulation (Ades and Steedman 1982, Steedman 1997), which from its inception has emphasized a capacity to produce left-branching, and therefore incrementally processable, analyses, comprises combinatory schemata such as the following (together with a Cut rule, feeding one rule application into another): (17) a. A, A\B ⇒ B; B/A, A ⇒ B b. A ⇒ (B/A)\B; A ⇒ B/(A\B) c. A\B, B\C ⇒ A\C; C/B, B/A ⇒ C/A By a result of Zielonka (1981), the Lambek calculus is not axiomatizable by any finite set of combinatory schemata, so no such combinatory presentation can constitute the logic of concatenation in the sense of Lambek calculus. Combinatory categorial grammar does not concern itself with the capture of all (or only) the concatenatively valid combinatory schemata, but rather with incrementality, for example, on a shift-reduce design. 
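As a rough illustration of a shift-reduce design over schemata (17a) and (17c), the following Python sketch may help. The class names Under and Over, the functions combine and shift_reduce, and the greedy reduction policy are assumptions made for this example and are not taken from any of the cited systems; type raising (17b) is left out, so the sketch is far from a complete parser.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Under:
    """A\\B: looks for an A to its left and yields a B (hypothetical name)."""
    arg: "Cat"
    res: "Cat"


@dataclass(frozen=True)
class Over:
    """B/A: looks for an A to its right and yields a B (hypothetical name)."""
    res: "Cat"
    arg: "Cat"


# Atomic categories are plain strings such as "N" or "S".
Cat = Union[str, Under, Over]


def combine(x: Cat, y: Cat) -> Optional[Cat]:
    """Try schemata (17a) and (17c) on two adjacent categories x, y."""
    if isinstance(y, Under) and y.arg == x:      # (17a)  A, A\B => B
        return y.res
    if isinstance(x, Over) and x.arg == y:       # (17a)  B/A, A => B
        return x.res
    if isinstance(x, Under) and isinstance(y, Under) and x.res == y.arg:
        return Under(x.arg, y.res)               # (17c)  A\B, B\C => A\C
    if isinstance(x, Over) and isinstance(y, Over) and x.arg == y.res:
        return Over(x.res, y.arg)                # (17c)  C/B, B/A => C/A
    return None


def shift_reduce(cats: List[Cat]) -> List[Cat]:
    """Shift one category at a time, reducing greedily at the stack top."""
    stack: List[Cat] = []
    for cat in cats:
        stack.append(cat)
        while len(stack) >= 2:
            merged = combine(stack[-2], stack[-1])
            if merged is None:
                break
            stack[-2:] = [merged]
    return stack
```

With the assumed categories N/CN, CN, N\S (a hypothetical assignment for a determiner, a noun, and an intransitive verb), shift_reduce([Over("N", "CN"), "CN", Under("N", "S")]) reduces word by word to a single S, which is the left-branching behaviour the text describes.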
An approach (also based on regulation of the succession of rule application) to the associated problem of spurious ambiguity is given in Hepple and Morrill (1989) but again, to our knowledge, there is no predictive relation between incremental combinatory processing and the kind of processing phenomena cited in the introduction.</Paragraph> </Section> <Section position="7" start_page="322" end_page="323" type="metho"> <SectionTitle> 3. Proof Nets </SectionTitle> <Paragraph position="0"> Lambek categorial derivations are often presented in the style of natural deduction or sequent calculus. Here we are concerned with categorial proof nets (Roorda 1991) as the fundamental structures of proof in categorial logic, in the same sense that linear proof nets were originally introduced by Girard (1987) as the fundamental structures of proof in linear logic. (Cut-free) proof nets exhibit no spurious ambiguity and play the role in categorial grammar that parse trees play in phrase structure grammar. Surveys and articles on the topic include Lamarche and Retoré (1996), de Groote and Retoré (1996), and Morrill (1999). Still, at the risk of proceeding at a slightly slower pace, we aim to include here enough details to make the present paper self-contained.</Paragraph> <Paragraph position="1"> A polar category formula is a Lambek categorial type labeled with input (•) or output (◦) polarity. A polar category formula tree is a binary ordered tree in which the leaves are labeled with polar atoms (literals) and each local tree is one of the following (logical) links:</Paragraph> <Paragraph position="2"> (18) a. Input: mother A\B•, daughters A◦ B•, ii-link. Output: mother A\B◦, daughters B◦ A•, i-link.</Paragraph> <Paragraph position="3"> b. Input: mother B/A•, daughters B• A◦, ii-link. Output: mother B/A◦, daughters A• B◦, i-link.</Paragraph> <Paragraph position="4"> c. Input: mother A·B•, daughters A• B•, i-link. Output: mother A·B◦, daughters B◦ A◦, ii-link. 
Without polarities, a formula tree is a kind of formation tree of the formula at its root: daughters are labeled with the immediate subformulas of their mothers. The polarities indicate sequent sidedness, input for antecedent and output for succedent; the polarity propagation follows the sidedness of subformulas in the sequent rules: in the antecedent (input) rule for A\B the subformula A goes in a succedent (output) and the subformula B goes in an antecedent (input); in the succedent (output) rule for A\B the subformula A goes in an antecedent (input) and the subformula B goes in a succedent (output); etc. The labels i and ii indicate whether the corresponding sequent rule is unary or binary. Note that in the output links the order of the subformulas is switched; this corresponds to a cyclic reading of sequents: the succedent type is adjacent to the first antecedent type. [Computational Linguistics Volume 26, Number 3]</Paragraph> <Paragraph position="5"> A proof frame is a finite sequence of polar category formula trees, exactly one of which has a root of output polarity (corresponding to the unique succedent of sequents).</Paragraph> <Paragraph position="6"> An axiom linking on a set of literal-labeled leaves is a partitioning of the set into pairs of complementary leaves that is planar in its ordering, i.e., there are no two pairs {L1, L3}, {L2, L4} such that L1 &lt; L2 &lt; L3 &lt; L4. Geometrically, planarity means that where the leaves are ordered on a line, paired leaves can be connected in the half plane without crossing. Axiom links correspond to id instances in a sequent proof.</Paragraph> <Paragraph position="7"> A proof structure is a proof frame together with an axiom linking on its leaves. A proof net is a proof structure in which every elementary (i.e., visiting vertices at most once) cycle crosses the edges of some i-link. Geometrically, an elementary cycle is the perimeter of a face or cluster of faces in a planar proof structure. 
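The planarity condition on axiom linkings can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: leaves are represented by their integer positions on the line, a linking is a list of position pairs with no position shared between pairs, and the name is_planar is hypothetical. It tests planarity only; it does not test that paired leaves are complementary, nor the cycle condition that makes a proof structure a proof net.

```python
from itertools import combinations
from typing import Iterable, Tuple

# An axiom link is a pair of leaf positions (assumed representation).
Link = Tuple[int, int]


def is_planar(links: Iterable[Link]) -> bool:
    """Return True when no two links cross over the ordered leaves."""
    spans = [tuple(sorted(link)) for link in links]
    for (a, b), (c, d) in combinations(spans, 2):
        # Two links cross exactly when one endpoint of the second link
        # lies strictly between the endpoints of the first and the
        # other endpoint does not.
        c_inside = c in range(a + 1, b)
        d_inside = d in range(a + 1, b)
        if c_inside != d_inside:
            return False
    return True
```

Under these assumptions, the nested linking [(0, 3), (1, 2)] is accepted, while [(0, 2), (1, 3)] is rejected because its two links would have to cross in the half plane.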
There is a proof net with roots A◦, A1•, ..., An• iff A1, ..., An ⇒ A is a valid sequent.</Paragraph> </Section> <Section position="8" start_page="323" end_page="329" type="metho"> <SectionTitle> 4. Incremental Processing Load and Acceptability </SectionTitle> <Paragraph position="0"> Let us assume the following lexical assignments: [...] [Footnote 2, start missing] condition, which is an involved mathematical result. Danos and Regnier (1989) express it in terms of acyclicity and connectivity of certain subgraphs. Intuitively, acyclicity assures that the subformulas of ii-links (binary rules) occur in different subproofs, whereas connectivity assures that the subformulas of i-links (unary rules) occur in the same subproofs (attributed to Jean Gallier by Philippe de Groote, p.c.). However, the single-succedent (intuitionistic) nature of Cut-free categorial proofs in fact renders the connectivity requirement redundant, hence we have just an acyclicity test. [...] net construction. In the first case, we suppose that one initially expects some target category, perhaps (though not necessarily) S. This 'principle of expectation' seems a reasonable or obvious principle of communication; as we shall see, it turns out to be technically critical. After perception of the word the there is the following partial proof net (for simplicity we omit features, included in lexical entries, from proof nets). Now there are only two unmatched valencies. 
After raced we have, on the correct analysis, the following:</Paragraph> <Paragraph position="2"> [Partial proof net after: the horse raced] Note that linking the Ns is possible, but we are interested in the history of the correct analysis, and in that, the verb valencies are matched by the adverb that follows (henceforth we indicate only the principal connective of a mother node):</Paragraph> <Paragraph position="4"> [Partial proof net after: the horse raced past] Observe that a cycle is created, but as required it crosses the edges of an i-link. At the penultimate step we have:</Paragraph> <Paragraph position="6"> [Partial proof net after: the horse raced past the] The final proof net analysis is given in Figure 1. The semantics associated with a categorial proof net, i.e., the proof as a lambda term (intuitionistic natural deduction proof, under the Curry-Howard correspondence), is extracted by associating a distinct index with each output division node and traveling as follows, starting by going up at the unique output root (de Groote and Retoré 1996): (25) * traveling up at the mother of an output division link, perform the lambda abstraction with respect to the associated index of the result of traveling up at the daughter of output polarity; * traveling up at the mother of an output product link, form the ordered pair of the result of traveling up at the right daughter (first component) and the left daughter (second component);</Paragraph> <Paragraph position="8"> * [...] perform the functional application of the result of traveling down at the mother to the result of traveling up at the other (output) daughter; * traveling down at the left (respectively, right) daughter of an input product link, take the first (respectively, second) projection of the result of traveling down at the mother; * traveling down at the (input) daughter of an output division link, return the associated index; * traveling down at a root, return the associated lexical semantics. 
Thus for our example we obtain (26a), which is logically equivalent to (26b). (26) a. (λxλyλz(past x (y z)) (the barn) λ1(race 1) (the horse)) b. (past (the barn) (race (the horse))) The analysis of (1b) is less straightforward. Whereas in (1a) raced expresses a one-place predication (&quot;go quickly&quot;), in (1b) it expresses a two-place predication (there was some agent racing the horse); horse is modified by an agentless passive participle, but the adverbial past the barn is modifying race. Within the confines of the Lambek calculus, the characterization we offer assumes the lexical assignment to the passive participle given in the following: [Footnote 4] In general, grammar requires the expressivity of more powerful categorial logics than just Lambek calculus; however, so far as we are aware, the characterizations we offer within the Lambek calculus bear the same properties with regard to our processing considerations as their more sophisticated categorial logic refinements, because the latter are principally concerned with generalizations of word order, whereas the semantic dependencies on which our complexity metric depends remain the same.</Paragraph> <Paragraph position="10"> past the barn fell (27) fell - fall</Paragraph> <Paragraph position="12"> Here raced is classified as the product of an untensed transitive verbal type, which can be modified by the adverbial past the barn by composition, and an adnominalizer of this transitive verbal type. According to this, (1b) has the proof net analysis given in (28) a. (fall (the (π1(λxλyλz[(y z) ∧ ∃w(x z w)], race2) λ29λ30(λuλvλw(past u (v w)) (the barn) λ41(π2(λpλsλt[(s t) ∧ ∃q(p t q)], race2) 29 41) 30) horse))) b. 
(fall (the λ8[(horse 8) ∧ ∃7(past (the barn) (race2 8 7))])) Let us assign to each proof net analysis a complexity profile that indicates, before and after each word, the number of unmatched literals, i.e., unresolved valencies or dependencies, under process at that point. This is a measure of the course of memory load in optimal incremental processing. We are not concerned here with resolution of lexical ambiguity or serial backtracking: we suppose sufficient resources that the nondeterminism of selecting lexical entries and considering them in parallel is not the critical burden. Rather, the question is: which among parallel competing analyses places the least load on memory? Since entropy degrades the structure of memory, it requires more energy to pursue an analysis that is high cost in memory than to pursue one that is low cost. From these simple economic considerations we derive our main claim: (29) Principle of Acceptability: Acceptability is inversely proportional to the sum in time of the memory load of unresolved valencies.</Paragraph> <Paragraph position="13"> If other factors are constant, the principle makes a quantitative prediction. We can distinguish two cases: synonymy and ambiguity. In the case of synonymy, semantics is constant. It is then predicted that amongst synonymous forms of expression, the lower the complexity curve, the higher the preference for the form of expression. In the case of ambiguity, prosodics is constant. It is then predicted that amongst the readings of an ambiguous expression, the lower the complexity curve, the more dominant the reading. [Figure 3 Proof net analysis of (4b) the dog that chased the cat that saw the rat barked.] The complexity profile is easily read off a completed proof net: the complexity between two words is the number of axiom links bridging rightwards (forwards in time) at that point. Thus for (1a) and (1b) analyzed in Figures 1 and 2, the complexity profiles are as follows: a. 
the horse raced past the barn b. the horse raced past the barn fell We see that after the first two words the complexity of the locally ambiguous initial segment of (1b) is consistently higher than that of its garden path (1a). The areas of the a and b curves are 12 and 22 respectively, predicting that in (1b) the less costly but incorrect analysis could be salient, as indeed it is.</Paragraph> <Paragraph position="14"> Johnson (1998) considers center embedding for subject and object relativization from a similar point of view. We assume here the relative pronoun lexical assignments shown in (31).</Paragraph> <Paragraph position="16"> The proof net analysis of sentence (4b) is shown in Figure 3, and that of sentence (5b) is shown in Figure 4. Let us compare the complexities:</Paragraph> <Paragraph position="18"> [Figure 4 Proof net analysis of (5b) the cheese that the rat that the cat saw ate stank.] The profile of (5b) is higher; indeed it rises above 8, thus reaching what is usually taken to be the limit of short-term memory. Johnson attributes the increasing ill-formedness of center-embedded constructions to the number of incomplete dependencies at the &quot;maximal cut&quot; of a proof net. This almost corresponds to the maximum height of a complexity profile here, except that Johnson includes no target category, whereas we will argue in relation to quantifier scope preference that this is critical. However, we also differ from Johnson in attributing relative acceptability to the area under the complexity curve, not only its maximal height. This is because we believe acceptability is to be explained in terms of the energy required to maintain processes in memory over time, and not just in terms of peak memory load. Finally, it happens that our proposal solves a problem encountered by Johnson. Gibson and Thomas (1996) observe that (33a) is easier to comprehend than (33b).</Paragraph> <Paragraph position="19"> (33) a. 
The chance that the nurse who the doctor supervised lost the reports bothered the intern.</Paragraph> <Paragraph position="20"> b. ?The intern who the chance that the doctor lost the reports bothered supervised the nurse.</Paragraph> <Paragraph position="21"> Johnson notes that his proposal does not capture this difference since both sentences have the same size maximal cut. Under our account, on the other hand, it is the complexity curves as a whole that account for acceptability. In these sentences, although the height is the same, the complexity curves are not: the area of (33a) is less than that of (33b). Thus, whereas Johnson must look to other factors to explain this difference, our account makes the correct prediction.</Paragraph> <Paragraph position="23"> [Figure 5 Proof net analyses for (34) Joe said that Martha believed that Ingrid fell today, with lowest (top), middle (center), and highest (bottom) attachment of the adverb today.]</Paragraph> </Section> </Paper>
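The area comparison used in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used to obtain the reported figures: units are numbered left to right with the expected target category as unit 0, each axiom link is given as the pair of units whose literals it joins, and the function name complexity_profile and the example links are hypothetical.

```python
from typing import List, Sequence, Tuple


def complexity_profile(n_units: int,
                       links: Sequence[Tuple[int, int]]) -> List[int]:
    """Count, at each boundary between adjacent units, the links over it."""
    profile = []
    for boundary in range(1, n_units):
        # A link joining units i and j bridges the boundary that sits
        # just before unit `boundary` when one end lies to its left and
        # the other end lies at or to the right of that unit.
        crossing = sum(
            1 for i, j in links
            if boundary in range(min(i, j) + 1, max(i, j) + 1)
        )
        profile.append(crossing)
    return profile


# Hypothetical example: target S (unit 0), then "the" N/CN (unit 1),
# "horse" CN (unit 2), "raced" N\S (unit 3); these assignments are
# assumptions for illustration only.
example_links = [(0, 3), (1, 3), (1, 2)]
example_profile = complexity_profile(4, example_links)
example_area = sum(example_profile)
```

Here example_profile lists the load after the target category, after the first word, and after the second word; its sum plays the role of the area under the complexity curve, and its maximum plays the role of the peak height discussed in connection with Johnson's maximal cut.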