<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-2031">
  <Title>Computational Modelling of Structural Priming in Dialogue</Title>
  <Section position="3" start_page="121" end_page="122" type="intro">
    <SectionTitle>
2 Method
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="121" end_page="121" type="sub_section">
      <SectionTitle>
2.1 Dialogue types
</SectionTitle>
      <Paragraph position="0"> We examined two corpora. Switchboard contains 80,000 utterances of spontaneous spoken conversations over the telephone among randomly paired North American speakers, syntactically annotated with phrase-structure grammar (Marcus et al., 1994). The HCRC Map Task corpus comprises more than 110 dialogues with a total of 20,400 utterances (Anderson et al., 1991). Like Switchboard, HCRC Map Task is a corpus of spoken, two-person dialogue in English. However, Map Task contains task-oriented dialogue: interlocutors work together to achieve a task as quickly and efficiently as possible. Subjects were asked to give each other directions with the help of a map. The interlocutors are in the same room, but have separate, slightly different maps and are unable to see each other's maps.</Paragraph>
    </Section>
    <Section position="2" start_page="121" end_page="121" type="sub_section">
      <SectionTitle>
2.2 Syntactic repetitions
</SectionTitle>
      <Paragraph position="0"> Both corpora are annotated with phrase structure trees. Each tree was converted into the set of phrase structure productions that license it. This allows us to identify the repeated use of rules. Structural priming would predict that a rule (target) occurs more often shortly after a potential prime of the same rule than long afterwards - any repetition at great distance is seen as coincidental. Therefore, we can correlate the probability of repetition with the elapsed time (DIST) between prime and target.</Paragraph>
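The conversion from a tree to its productions can be illustrated with a minimal Python sketch. The nested-tuple tree encoding, the toy sentence, and the function name `productions` are assumptions made for illustration only; they are not the annotation format of either corpus, nor the authors' code.

```python
from collections import Counter

# Hypothetical tree encoding: (label, child, child, ...); a leaf word is a plain string.
def productions(tree):
    """Return the phrase structure productions that license `tree` as (lhs, rhs) pairs."""
    label, children = tree[0], tree[1:]
    # The right-hand side lists the child labels; words stand for themselves.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rules = [(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            rules.extend(productions(child))
    return rules

# Toy example tree (invented for illustration, not taken from the corpora).
toy = ("S",
       ("NP", ("PRP", "we")),
       ("VP", ("VBD", "examined"), ("NP", ("CD", "two"), ("NNS", "corpora"))))

# Counting rule instances makes repeated uses of the same rule visible.
rule_counts = Counter(productions(toy))
```

In this sketch lexical rules such as PRP -> we and unary rules such as NP -> PRP are still present; a fuller pipeline would filter them out, since the method excludes both.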
      <Paragraph position="1"> We considered every pair of identical syntactic rules up to a predefined maximal distance to be a potential case of priming-enhanced production. If we consider priming at distances 1...n, each rule instance produces up to n data points. Our binary response variable indicates whether there is a prime for the target between n − 0.5 and n + 0.5 seconds before the target. As a prime, we count any invocation of the same rule. Syntactic repetitions resulting from lexical repetition and repetitions of unary rules are excluded. We looked for repetitions within windows (DIST) of n = 15 seconds (Section 3.1).</Paragraph>
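The construction of the data points can be sketched as follows. This is a simplified illustration under stated assumptions: the input format (time-stamped rule occurrences), the function name `priming_data_points`, the inclusive window boundaries, and the toy events are all hypothetical, and the sketch ignores speaker roles and the exclusion of lexical and unary repetitions.

```python
# Hypothetical input: (time_in_seconds, rule) pairs in temporal order.
def priming_data_points(rule_events, max_dist=15):
    """For each target rule instance, emit (target_index, dist, repeated) for dist = 1..max_dist.

    repeated is 1 if the same rule occurred between dist - 0.5 and
    dist + 0.5 seconds before the target (boundaries inclusive, an assumption), else 0.
    """
    rows = []
    for i, (t_target, rule) in enumerate(rule_events):
        # Times of earlier occurrences of the same rule (potential primes).
        earlier = [t for t, r in rule_events[:i] if r == rule]
        for dist in range(1, max_dist + 1):
            hit = any(0.5 >= abs((t_target - t) - dist) for t in earlier)
            rows.append((i, dist, int(hit)))
    return rows

# Toy events, invented for illustration.
events = [(0.0, "NP -> DT NN"), (2.1, "VP -> VBD NP"), (3.2, "NP -> DT NN")]
rows = priming_data_points(events)
```

Each rule instance contributes one row per distance, so the three toy events yield 45 rows; only the third event has a prime, 3.2 seconds earlier, which falls into the window for distance 3.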
      <Paragraph position="2"> Without priming, one would expect a constant probability of syntactic repetition, no matter the distance between prime and target. The analysis tries to reject this null hypothesis and to show a correlation of the effect size with the type of corpus used. We expect that the syntactic priming effect found experimentally translates to more repetitions at shorter distances, since priming effects usually decay rapidly (Branigan et al., 1999).</Paragraph>
      <Paragraph position="3"> The target utterance is included as a random factor in our model, grouping all 15 measurements of all rules of an utterance as repeated measurements, since they depend on the same target rule occurrence or at least on other rules in the utterance, and are thus partially inter-dependent.</Paragraph>
      <Paragraph position="4"> We distinguish production-production priming within a speaker (PP) and comprehension-production priming between speakers (CP), encoded in the factor ROLE.</Paragraph>
      <Paragraph position="5"> Models were estimated on joint data sets derived from both corpora, with a factor SOURCE included to discriminate the two dialogue types.</Paragraph>
      <Paragraph position="6"> Additionally, we built a model estimating the effect of the raw frequency of a particular syntactic rule on the priming effect (FREQ). This is of particular interest for priming in applications, where a statistical model will, all other things being equal, prefer the more frequent linguistic choice; recall for competing low-frequency rules will be low.</Paragraph>
    </Section>
    <Section position="3" start_page="121" end_page="122" type="sub_section">
      <SectionTitle>
2.3 Generalized Linear Mixed Effects Regression
</SectionTitle>
      <Paragraph position="0"> In this study, we built generalized linear mixed effects regression models (GLMM). In all cases, a rule instance target is counted as a repetition at distance d iff there is an utterance prime which contains the same rule, and prime and target are d units apart.</Paragraph>
      <Paragraph position="1"> GLMMs with a logit-link function are a form of logistic regression. Regression allows us not only to show that priming exists, but also to predict the decline of repetition probability with increasing distance between prime and target, depending on other variables. If we see priming as a form of pre-activation of syntactic nodes, the regression indicates the decay rate of this pre-activation. Our method quantifies priming and correlates the effect with secondary factors. [Footnote 2: We trained our models using Penalized Quasi-Likelihood (Venables and Ripley, 2002). We will not generally give classical R² figures, as this metric is not appropriate to such GLMMs. The experiments below were conducted on a sample of 250,000 [...]] [Figure caption, beginning missing: [...] Task, for within-speaker (PP) and between-speaker (CP) priming. Right: Fitted model for the development of repetition probability (y axis) over time (x axis, in seconds). Here, decay (slope) is the relevant factor for priming strength, as shown on the left. These are derived from models without FREQ.]</Paragraph>
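For orientation, a logit-link mixed model of this kind can be written in the following general form. This is a sketch under assumptions, not the authors' exact specification: the index notation, the use of a random intercept per target utterance, and the untransformed DIST term are illustrative choices that this section does not spell out.

```latex
\[
  \operatorname{logit} P(\text{repetition}_{ij} = 1)
  = \beta_0 + \beta_1 \,\mathrm{DIST}_{ij} + u_j,
  \qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
\]
```

Here i indexes the measurements belonging to target utterance j, and u_j is the random effect of that utterance. Under this reading, a negative slope for DIST corresponds to a repetition probability that decays with distance, and the factors ROLE, SOURCE and FREQ would enter as further fixed effects, typically in interaction with DIST.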
    </Section>
  </Section>
</Paper>