<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1637">
  <Title>Priming Effects in Combinatory Categorial Grammar</Title>
  <Section position="5" start_page="309" end_page="310" type="metho">
    <SectionTitle>
3 Predictions
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="309" end_page="310" type="sub_section">
      <SectionTitle>
3.1 Priming Effects
</SectionTitle>
      <Paragraph position="0"> We expect priming effects to apply to CCG categories, which describe the type of a constituent including the arguments it expects. Under our assumption that priming manifests itself as a tendency for repetition, repetition probability should be higher for short distances from a prime (see Section 5.2 for details).</Paragraph>
    </Section>
    <Section position="2" start_page="310" end_page="310" type="sub_section">
      <SectionTitle>
3.2 Terminal and Non-terminal Categories
</SectionTitle>
      <Paragraph position="0"> In categorial grammar, lexical categories specify the subcategorization behavior of their heads, capturing local and non-local arguments, and a small set of rule schemata defines how constituents can be combined.</Paragraph>
      <Paragraph position="1"> Phrasal constituents may have the same categories as lexical items. For example, the verb saw might have the (lexical) category (S\NP)/NP, which allows it to combine with an NP to the right. The resulting constituent for saw Johanna would be of category S\NP - a constituent which expects an NP (the subject) to its left, and also the lexical category of an intransitive verb. Similarly, the constituent consisting of a ditransitive verb and its object, gives the money, has the same category as saw. Under the assumption that priming occurs for these categories, we proceed to test a hypothesis that follows from the fact that categories merely encode unsatisfied subcategorized arguments.</Paragraph>
      <Paragraph position="2"> Given that a transitive verb has the same category as the constituent formed by a ditransitive verb and its direct object, we would expect that both categories can prime each other, if they are cognitive units. More generally, we would expect that lexical (terminal) and phrasal (non-terminal) categories of the same syntactic type may prime each other. The interaction of such conditions with the priming effect can be quantified in the statistical model.</Paragraph>
    </Section>
    <Section position="3" start_page="310" end_page="310" type="sub_section">
      <SectionTitle>
3.3 Incrementality of Analyses
</SectionTitle>
      <Paragraph position="0"> Type-raising and composition allow derivations that are mostly left-branching, or incremental.</Paragraph>
      <Paragraph position="1"> Adopting a left-to-right processing order for a sentence is important, if the syntactic theory is to make psycholinguistically viable predictions (Niv, 1994; Steedman, 2000).</Paragraph>
      <Paragraph position="2"> Pickering et al. (2002) present priming experiments that suggest that, in production, structural dominance and linearization do not take place in different stages. Their argument involves verbal phrases with a shifted prepositional object such as showed to the mechanic a torn overall. At a dominance-only level, such phrases are equivalent to non-shifted prepositional constructions (showed a torn overall to the mechanic), but the two variants may be differentiated at a linearization stage. Shifted primes do not prime prepositional objects in their canonical position, thus priming must occur at a linearized level, and a separate dominance level seems unlikely (unless priming is selective).</Paragraph>
      <Paragraph position="3"> CCG is compatible with one-stage formulations of syntax, as no transformation is assumed and categories encode linearization together with subcategorization. null CCG assumes that the processor may produce syntactically different, but semantically equivalent derivations.2 So, while neither the incremental analysis we generate, nor the normal-form, represent one single correct derivation, they are two extremes of a 'spectrum' of derivations. We hypothesize that priming effects predicted on the basis of incremental CCG analyses will be as strong than those predicted on the basis of their normal-form equivalents.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="310" end_page="311" type="metho">
    <SectionTitle>
4 Corpus Data
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="310" end_page="310" type="sub_section">
      <SectionTitle>
4.1 The Switchboard Corpus
</SectionTitle>
      <Paragraph position="0"> The Switchboard (Marcus et al., 1994) corpus contains transcriptions of spoken, spontaneous conversation annotated with phrase-structure trees.</Paragraph>
      <Paragraph position="1"> Dialogues were recorded over the telephone among randomly paired North American speakers, who were just given a general topic to talk about. 80,000 utterances of the corpus have been annotated with syntactic structure. This portion, included in the Penn Treebank, has been time-aligned (per word) in the Paraphrase project (Carletta et al., 2004).</Paragraph>
      <Paragraph position="2"> Using the same regression technique as employed here, Reitter et al. (2006b) found a marked structural priming effect for Penn-Treebank style phrase structure rules in Switchboard.</Paragraph>
    </Section>
    <Section position="2" start_page="310" end_page="311" type="sub_section">
      <SectionTitle>
4.2 Disfluencies
</SectionTitle>
      <Paragraph position="0"> Speech is often disfluent, and speech repairs are known to repeat large portions of the preceding context (Johnson and Charniak, 2004). The original Switchboard transcripts contains these disfluencies (marked up as EDITED):</Paragraph>
      <Paragraph position="2"/>
      <Paragraph position="4"> It is unclear to what extent these repetitions are due to priming rather than simple correction. In disfluent utterances, we therefore eliminate reparanda and only keep repairs (the portions marked with &gt;...&lt; are removed). Hesitations (uh, etc.), and utterances with unfinished constituents are also ignored.</Paragraph>
    </Section>
    <Section position="3" start_page="311" end_page="311" type="sub_section">
      <SectionTitle>
4.3 Translating Switchboard to CCG
</SectionTitle>
      <Paragraph position="0"> Since the Switchboard annotation is almost identical to the one of the Penn Treebank, we use a similar translation algorithm to Hockenmaier and Steedman (2005). We identify heads, arguments and adjuncts, binarize the trees, and assign categories in a recursive top-down fashion. Nonlocal dependencies that arise through wh-movement and right node raising (*T* and *RNR* traces) are captured in the resulting derivation. Figure 1 (left) shows the rightmost normal form CCG derivation we obtain for the above tree. We then transform this normal form derivation into the most incremental (i.e., left-branching) derivation possible, as shown in Figure 1 (right).</Paragraph>
      <Paragraph position="1"> This transformation is done by a top-down recursive procedure, which changes each tree of depth two into an equivalent left-branching analysis if the combinatory rules allow it. This procedure is run until no further transformation can be executed. The lexical categories of both derivations are identical.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="311" end_page="312" type="metho">
    <SectionTitle>
5 Statistical Analysis
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="311" end_page="311" type="sub_section">
      <SectionTitle>
5.1 Priming of Categories
</SectionTitle>
      <Paragraph position="0"> CCG assumes a minimal set of combinatory rule schemata. Much more than in those rules, syntactic decisions are evident from the categories that occur in the derivation.</Paragraph>
      <Paragraph position="1"> Given the categories for each utterance, we can identify their repeated use. A certain amount of repetition will obviously be coincidental. But structural priming predicts that a target category will occur more frequently closer to a potential prime of the same category. Therefore, we can correlate the probability of repetition with the distance between prime and target. Generalized Linear Mixed Effects Models (GLMMs, see next section) allow us to evaluate and quantify this correlation. null Every syntactic category is counted as a potential prime and (almost always) as a target for priming. Because interlocutors tend to stick to a topic during a conversation for some time, we exclude cases of syntactic repetition that are a results of the repetition of a whole phrase.</Paragraph>
      <Paragraph position="2"> Previous work points out that priming is sensitive to frequency (Scheepers (2003) for high/low relative clause attachments, (Reitter et al., 2006a) for phrase structure rules). Highly frequent items do not receive (as much) priming. We include the logarithm of the raw frequency of the syntactic category in Switchboard (LNFREQ) to approximate the effect that frequency has on accessibility of the category.</Paragraph>
    </Section>
    <Section position="2" start_page="311" end_page="312" type="sub_section">
      <SectionTitle>
5.2 Generalized Linear Mixed Effects
Regression
</SectionTitle>
      <Paragraph position="0"> We use generalized linear mixed effects regression models (GLMM, Venables and Ripley (2002)) to predict a response for a number of given categorial ('factor') or continuous ('predictor') explanatory variables (features). Our data is made up of instances of repetition examples and non-repetition examples from the corpus. For each target instance of a syntactic category c occurring in a derivation and spanning a constituent that begins at time t, we look back for possible instances of constituents with the same category (the prime) in a time frame of [t [?]d [?]0.5;t [?]d + 0.5] seconds. If such instances can be found, we have a positive example of repetition. Otherwise, c is included as a data point with a negative outcome.</Paragraph>
      <Paragraph position="1"> We do so for a range of different distances d, commonly 1 [?] d [?] 15 seconds.3 For each data point, we include the logarithm of the distance d between priming period and target as an explanatory variable LNDIST. (See Reitter et al. (2006b) for a worked example.) In order to eliminate cases of lexical repetition of a phrase, e.g., names or lexicalized noun 3This approach uses a number of data points per target, looking backwards for primes. The opposite way - looking forwards for targets - would make similar predictions.  that's what I really like from Switchboard.</Paragraph>
      <Paragraph position="2"> phrases, which we consider topic-dependent or instances of lexical priming, we only collect syntactic repetitions with at least one differing word. Without syntactic priming, we would assume that there is no correlation between the probability that a data point is positive (repetition occurs) and distance d. With priming, we would expect that the probability is inversely proportional to d.</Paragraph>
      <Paragraph position="3"> Our model uses lnd as predictor LNDIST, since memory effects usually decay exponentially.</Paragraph>
      <Paragraph position="4"> The regression model fitted is then simply a choice of coefficients bi, among them one for each explanatory variable i. bi expresses the contribution of i to the probability of the outcome event, that is, in our case, successful priming. The coefficient of interest is the one for the time correlation, i.e. blnDist. It specifies the strength of decay of repetition probability over time. If no other variables are present, a model estimates the repetition probability for a data point i as</Paragraph>
      <Paragraph position="6"> Priming is present if the estimated parameter is negative, i.e. the repetition probability decreases with increasing distance between prime and target.</Paragraph>
      <Paragraph position="7"> Other explanatory variables, such as ROLE, which indicates whether priming occurs within a speaker (production-production priming, PP) or in between speakers (comprehension-production priming, CP), receive an interaction coefficient that adds linearly to blnDist. Additional interaction variables are included depending on the experimental question.4 4Lastly, we identify the target utterance in a random factor in our model, grouping the several measurements (15 for the different distances from each target) as repeated measurements, since they depend on the same target category occurrence and are partially inter-dependent.</Paragraph>
      <Paragraph position="8"> From the data produced, we include all cases of reptition and a an equal number of randomly sampled non-repetition cases.5</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="312" end_page="314" type="metho">
    <SectionTitle>
6 Experiments
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="312" end_page="313" type="sub_section">
      <SectionTitle>
6.1 Experiment 1: Priming in Incremental
and Normal-form Derivations
</SectionTitle>
      <Paragraph position="0"> Hypothesis CCG assumes a multiplicity of semantically equivalent derivations with different syntactic constituent structures. Here, we investigate whether two of these, the normal-form and the most incremental derivation, differ in the strength with which syntactic priming occurs.</Paragraph>
      <Paragraph position="1"> Method A joint model was built containing repetition data from both types of derivations. Since we are only interested in cases where the two derivations differ, we excluded all constituents where a string of words was analyzed as a constituent in both derivations. This produced a data set where the two derivations could be contrasted.</Paragraph>
      <Paragraph position="2"> A factor DERIVATION in the model indicates whether the repetition occurred in a normal-form (NF) or an incremental derivation (INC).</Paragraph>
      <Paragraph position="3"> Results Significant and substantial priming is present in both types of derivations, for both PP and CP priming. There is no significant difference in priming strength between normal-form and incremental derivations (blnDist:NF = 0.008, p = 0.95). The logarithm of the raw category frequency is negatively correlated with the priming strength (blnDist:lnFreq = 0.151, p &lt; 0.0001. Note that a negative coefficient for LNDIST indicates  or normal-form derivations. Error bars show (nonsimultaneous) 95% confidence intervals.</Paragraph>
      <Paragraph position="4"> decay. The lower this coefficient, the more decay, hence priming).</Paragraph>
      <Paragraph position="5"> If there was no priming of categories for incrementally formed constituents, we would expect to see a large effect of DERIVATION. In the contrary, we see no effect at a high p, where the that the regression method used is demonstrably powerful enough to detect even small changes in the priming effect. We conclude that there is no detectable difference in priming between the two derivation types. In Fig. 2, we give the estimated priming effect sizes for the four conditions.6 The result is compatible with CCG's separation of derivation structure and the type of the result of derivation. It is not the derivation structure that primes, but rather the type of the result. It is also compatible with the possibility of a non-traditional constituent structure (such as the incremental analysis), even though it is clear that neither incremental nor normal-form derivations necessarily represent the ideal analysis.</Paragraph>
      <Paragraph position="6"> The category sets occurring in both derivation variants was largely disjunct, making testing for actual overlap between different derivations impossible. null</Paragraph>
    </Section>
    <Section position="2" start_page="313" end_page="314" type="sub_section">
      <SectionTitle>
6.2 Experiment 2: Priming between Lexical
and Phrasal Categories
</SectionTitle>
      <Paragraph position="0"> code unsatisfied subcategorization constraints, constituents which are very different from a traditional linguistic perspective can receive the same category. This is, perhaps, most evident in phrasal 6Note that Figures 2 and 3 stem from nested models that estimate the effect of LNDIST within the four/eight conditions. Confidence intervals will be larger, as fewer data-points are available than when the overall effect of a single factor is compared.</Paragraph>
      <Paragraph position="1">  for combinations of comprehension-production or production-production priming and lexical or phrasal primes and targets, e.g. the third bar denotes the decay in repetition probability of a phrasal category as prime and a lexical one as target, where prime and target occurred in utterances by the same speaker. Error bars show (nonsimultaneous) 95% confidence intervals.</Paragraph>
      <Paragraph position="2"> and lexical categories (where, e.g., an intransitive verb is indistinguishable from a verb phrase).</Paragraph>
      <Paragraph position="3"> Bock and Loebell (1990)'s experiments suggest that priming effects are independent of the subcategorization frame. There, an active voice sentence primed a passive voice one with the same phrase structure, but a different subcategorization. If we find priming from lexical to phrasal categories, then our model demonstrates priming of subcategorization frames.</Paragraph>
      <Paragraph position="4"> From a processing point of view, phrasal categories are distinct from lexical ones. Lexical categories are bound to the lemma and thereby linked to the lexicon, while phrasal categories are the result of a structural composition or decomposition process. The latter ones represent temporary states, encoding the syntactic process.</Paragraph>
      <Paragraph position="5"> Here, we test whether lexical and phrasal categories can prime each other, and if so, contrast the strength of these priming effects.</Paragraph>
      <Paragraph position="6"> Method We built a model which allowed lexical and phrasal categories to prime each other.</Paragraph>
      <Paragraph position="7"> A factor, STRUCTURAL LEVEL was introduced  to distinguish the four cases: priming in between phrasal categories and in between lexical ones, from lexical ones to phrasal ones and from phrasal ones to lexical ones.</Paragraph>
      <Paragraph position="8"> Recall that each data point encodes a possibility to repeat a CCG category, referring to a particular instance of a target category at time t and a time span of duration of one second [t[?]d[?]0.5,t[?]d + 0.5] in which a priming instance of the same category could occur. If it occurred at least once, the data point was counted as a possible example of priming (response variable: true), otherwise it was included as a counter-example (response variable: false). For the target category, its type (lexical or phrasal) was clear. For the category of the prime, we included two data points, one for each type, with a response indicating whether a prime of the category of such a type occurred in the time window. We built separate models for incremental and normal form derivations. Models were fitted to a balanced subset, including all repetitions and a randomly sampled subset of non-repetitions.</Paragraph>
      <Paragraph position="9"> Results Both the normal-form and the incremental model show qualitatively the same results. STRUCTURALLEVEL has a significant influence on priming strength (LN DIST) for the cases where a lexical item serves as prime (e.g., normal-form PP: blnDist:lex[?]lex = 0.261, p &lt; 0.0001; blnDist:lex[?]phr = 0.166, p &lt; 0.0001; blnDist:phr[?]lex = 0.056, p &lt; 0.05; as compared to the baseline phr[?]phr. N.B. higher values denote less decay &amp; priming). Phrasal categories prime other phrasal and lexical categories, but there is a lower priming effect to be seen from lexical categories. Figure 3 presents the resulting effect sizes. Albeit significant, we assume the effect of prime type is attributable to processing differences rather than the strong difference that would indicate that there is no priming of, e.g., lexical subcategorization frames. As the analysis of effect sizes shows, we can see priming from and in between both lexical and phrasal categories.</Paragraph>
      <Paragraph position="10"> Additionally, there is no evidence suggesting that, once frequency is taken into account, syntactic processes happening high up in derivation trees show more priming (see Scheepers 2003).</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="314" end_page="314" type="metho">
    <SectionTitle>
7 Discussion
</SectionTitle>
    <Paragraph position="0"> We can confirm the syntactic priming effect for CCG categories. Priming occurs in incremental as well as in normal-form CCG derivations, and at different syntactic levels in those derivations: we demonstrated that priming effects persists across syntactic stages, from the lowest one (lexical categories) up to higher ones (phrasal categories). This is what CCG predicts if priming of categories is assumed.</Paragraph>
    <Paragraph position="1"> Linguistic data is inherently noisy. Annotations contain errors, and conversions such as the one to CCG may add further error. However, since noise is distributed across the corpus, it is unlikely to affect priming effect strength or its interaction with the factors we used: priming, in this study, is defined as decay of repetition probability. We see the lack of control in the collection of a corpus like Switchboard not only as a challenge, but also as an advantage: it means that realistic data is present in the corpus, allowing us to conduct a controlled experiments to validate a claim about a specific theory of competence grammar.</Paragraph>
    <Paragraph position="2"> The fact that CCG categories prime could be explained in a model that includes a basic form of subcategorization. All categories, if lexical or phrasal, contain a subcategorization frame, with only those categories present that have yet to be satisfied. Our CCG based models make predictions for experimental studies, e.g., that specific heads with open subcategorization slots (such as transitive verbs) will be primed by phrases that require the same kinds of arguments (such as verbal phrases with a ditransitive verb and an argument).</Paragraph>
    <Paragraph position="3"> The models presented take the frequency of the syntactic category into account, reducing noise, especially in the conditions with lower numbers of (positive) reptition examples (e.g., CP and incremental derivations in Experiment 1). Whether there are significant qualitative and quantitative differences of PP and CP priming with respect to choice of derivation type - which would point out processing differences in comprehension vs. production priming - will be a matter of future work.</Paragraph>
    <Paragraph position="4"> At this point, we do not explicitly discriminate different syntactic frameworks. Comparing priming effects in a corpus annotated in parallel according to different theories will be a matter of future work.</Paragraph>
  </Section>
class="xml-element"></Paper>