<?xml version="1.0" standalone="yes"?>
<Paper uid="C90-2044">
  <Title>Disambiguating Cue Phrases in Text and Speech</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Previous Studies
</SectionTitle>
    <Paragraph position="0"> The important role that cue phrases play in understanding and generating discourse has been well documented in the computational linguistics literature.</Paragraph>
    <Paragraph position="1"> For example, by indicating the presence of a structural boundary or a relationship between parts of a discourse, cue phrases can assist in the resolution of anaphora [5, 4, 17] and in the identification of rhetorical relations [10, 12, 17]. Cue phrases have also been used to reduce the complexity of discourse processing and to increase textual coherence [3, 11, 21].</Paragraph>
    <Paragraph position="2"> In Example (1) 1, interpretation of the anaphor 'it' as (correctly) co-indexed with THE SYSTEM is facilitated by the presence of the cue phrases 'say' and 'then', marking potential antecedents in '... as an</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
EXPERT DATABASE for AN EXPERT SYSTEM ...' as
</SectionTitle>
    <Paragraph position="0"> structurally unavailable. 2 (1) &quot;If THE SYSTEM attempts to hold rules, say as</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
AN EXPERT DATABASE for AN EXPERT SYSTEM,
</SectionTitle>
    <Paragraph position="0"> then we expect it not only to hold the rules but to in fact apply them for us in appropriate situations.&quot; Previous attempts to define the set of cue phrases have typically been extensional, 3 with such lists of cue phrases then further classified as to their discourse function. For example, Cohen [3] uses a taxonomy of connectives based on that of Quirk [16] to associate with each class of cue phrases a semantic function with respect to a model of argument understanding. Grosz and Sidner [4] classify cue phrases based on changes to the attentional stack and intentional structure found in their theory of discourse. Schiffrin [18] classifies cue phrases into groups based on their sentential usage (e.g. conjunctive, adverbial, and clausal markers), while Reichman [17] and Hobbs [10] associate groups of cue phrases with the rhetorical relationships they signal. Finally, Zukerman [21] presents a taxonomy of cue phrases based on three functions relevant to her work in language generation: knowledge organization, knowledge acquisition, and affect maintenance.</Paragraph>
    <Paragraph position="1"> Once a cue phrase has been identified, however, it is not always clear whether to interpret it as a discourse marker or not [6, 4, 8, 18]. The texts in Example (2) are potentially ambiguous between a temporal reading of 'now' and a discourse interpretation: (2) a. &quot;Now in AI our approach is to look at a knowledge base as a set of symbolic items that represent something.&quot; b. &quot;Now some of you may suspect from the title of this talk that this word is coming to you from Krypton or some other possible world.&quot; On the temporal reading, (2a), for example, would convey that 'at this moment the AI approach to knowledge bases has changed'; on the discourse reading, 'now' simply initiates the topic of 'the AI approach to knowledge bases'.</Paragraph>
    <Paragraph position="2"> It has been suggested that this difference between discourse and sentential use may be intonationally disambiguable. Halliday and Hassan [6] claim that, in general, items used COHESIVELY -- i.e., to relate one part of a text to another [6, p. 4] -- tend to be intonationally non-prominent (to be unaccented and reduced) unless they are &quot;definitely contrastive&quot;. Non-cohesive uses, on the other hand, are indicated by non-reduced, accented forms [6, p. 268]. Halliday and Hassan particularly note that intonation disambiguates in this way between cohesive (discourse) and non-cohesive (sentential) uses of classes of items we term cue phrases, such as conjunctions and adverbials. Empirical studies to date have tended to bear out their observations. Studies of portions of the London-Lund corpus such as [1] have provided intonational profiles of word classes including DISCOURSE ITEMS, conjunctions and adverbials which are at least compatible with these views. Footnote 3: An exception to this is found in the socio-linguistic work of Schiffrin [18].</Paragraph>
    <Paragraph position="3"> However, the notion of 'discourse item' used appears much more restrictive than the notion of 'cue phrase', 4 so it is difficult to make comparative use of these results.</Paragraph>
    <Paragraph position="4"> In an earlier study [8], we examined the use of various intonational, syntactic, and orthographic features to distinguish between discourse and sentential readings of a single cue phrase ('now'). 5 While features such as tense, structural configuration, surface order, and orthographic indicators were sometimes useful, we found that intonational features provided the only significant correlation with discourse/sentential status. All of the tokens in our sample were disambiguable in terms of intonational phrasing and type of pitch accent. 6 In our study of 'now', we found that discourse uses were either uttered as a single intermediate phrase (or in a phrase containing only cue phrases) (Discourse Type A), or uttered at the beginning of a longer intermediate phrase (or preceded only by other cue phrases in the phrase) and with a L* pitch accent or without a pitch accent (Discourse Type B).</Paragraph>
    <Paragraph position="5"> Cue phrases judged to be of Sentential Type were never uttered as a single phrase; if first in intermediate phrase they were nearly always uttered with a H* or complex pitch accent (Sentential Type A); if not first in phrase they could bear any type of pitch accent or be deaccented (Sentential Type B). These results are summarized in Figure 1.</Paragraph>
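    <Paragraph> The four profiles described above can be read as a small decision procedure. The following is a minimal sketch under that reading, not the authors' implementation; the function name, argument names, and the use of None for a deaccented token are assumptions introduced here for illustration.
```python
from typing import Optional


def classify_cue_phrase(alone_in_phrase: bool,
                        first_in_phrase: bool,
                        accent: Optional[str]) -> str:
    """Map prosodic features of a cue phrase token to a model type.

    accent is None for a deaccented token, 'H*' or 'L*' for a simple
    accent, or a complex accent such as 'L*+H' (hypothetical encoding).
    """
    if alone_in_phrase:
        # The token is a whole intermediate phrase (possibly with other
        # cue phrases); any accent type is allowed.
        return "Discourse Type A"
    if first_in_phrase:
        # First in a larger phrase: the accent type decides.
        if accent is None or accent == "L*":
            return "Discourse Type B"
        return "Sentential Type A"
    # Non-initial in a larger phrase, with any accent or none.
    return "Sentential Type B"


print(classify_cue_phrase(False, True, "L*"))
```
 Note that the sketch treats the sentential tendency ("nearly always" a H* or complex accent) as categorical, which is a simplification of the reported findings.</Paragraph>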
    <Paragraph position="6"> Based on these findings, we proposed that listeners use prosodic information to disambiguate discourse from sentential uses of cue phrases. To investigate this possibility further, we conducted another multi-speaker study of discourse and sentential uses of the cue phrase 'well'. Our findings were almost identical to results for the earlier study; briefly, of the 52 instances of 'well' we examined, all but one token fit the model depicted in Figure 1. Footnote 4: For example, in the 48 minute text Altenberg examines, he finds only 23 discourse items, or about 17% of what our study of a similar corpus (described below) would have predicted. Footnote 5: Our corpus consisted of recordings of four days of the radio call-in program &quot;The Harry Gross Show: Speaking of Your Money,&quot; recorded during the week of 1 February 1982 [15]. The four shows provided approximately ten hours of conversation between expert(s) and callers.</Paragraph>
    <Paragraph position="7"> Footnote 6: For the earlier study as well as the current one, we assume Pierrehumbert's [13] system of phonological description. In this system, intonational contours are described as sequences of low (L) and high (H) tones in the FUNDAMENTAL FREQUENCY (F0) CONTOUR.</Paragraph>
    <Paragraph position="8"> Pitch accents, peaks or valleys in the F0 contour that fall on the stressed syllables of lexical items, signify intonational prominence. A pitch accent consists either of a single tone or an ordered pair of tones, such as L*+H. The tone aligned with the stressed syllable is indicated by a star *; thus, in an L*+H accent, the low tone L* is aligned with the stressed syllable. There are six pitch accents in English: two simple tones -- H* and L* -- and four complex ones -- L*+H, L+H*, H*+L, and H+L*. A well-formed intermediate phrase consists of one or more pitch accents, and a simple high H or low L tone that represents the phrase accent. The phrase accent controls the pitch between the last pitch accent of the current intermediate phrase and the beginning of the next -- or the end of the utterance. Intonational phrases are larger phonological units, composed of one or more intermediate phrases, plus a boundary tone which may also be H or L. The occurrence of phrase accents and boundary tones, together with other phrase-final characteristics such as pauses and syllable lengthening, enable us to identify intermediate and intonational phrases.</Paragraph>
    <Paragraph position="9"> [Figure 1. Discourse uses: alone in phrase; or initial in larger phrase, deaccented or L* accent. Sentential uses: initial in larger phrase, H* or complex accent; or non-initial in larger phrase.]</Paragraph>
    <Paragraph position="10"> To see whether these findings could be extended to cue phrases in general, we began a third study -- of all cue phrases produced by a single speaker during 75 minutes of recorded speech. The remainder of this paper describes our first results from this study.</Paragraph>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 The Data
</SectionTitle>
    <Paragraph position="0"> To test whether our prosodic model of discourse and sentential uses of 'now' and 'well' extended to cue phrases in general, we examined intonational characteristics of all single-word cue phrases 7 used in a keynote address given by Ronald Brachman at the First International Conference on Expert Database Systems in 1986. The address provides approximately 75 minutes of speech from a single speaker. For our first sample, we examined the 211 cue phrases uttered during the first 17 minutes of the address. Our tokens had the following distribution: 8 actually (6), also (2), although (1), and (68), basically (1), because (2), but (12), finally (1), first (1), further (4), however (2), like (11), look (11), next (4), now (26), ok (1), or (19), say (12), second (1), see (5), since (1), so (9), then (3), therefore (1), well (7).</Paragraph>
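    <Paragraph> As a quick consistency check on the distribution above, the per-word counts can be tallied against the stated sample size. This is only an illustrative sketch; the dictionary name is introduced here, and the ninth entry is read as 'first'.
```python
# Token counts as listed in the distribution above.
TOKEN_COUNTS = {
    "actually": 6, "also": 2, "although": 1, "and": 68, "basically": 1,
    "because": 2, "but": 12, "finally": 1, "first": 1, "further": 4,
    "however": 2, "like": 11, "look": 11, "next": 4, "now": 26,
    "ok": 1, "or": 19, "say": 12, "second": 1, "see": 5,
    "since": 1, "so": 9, "then": 3, "therefore": 1, "well": 7,
}

total_tokens = sum(TOKEN_COUNTS.values())

# Coordinate conjunctions, which the paper later treats separately.
conjunction_tokens = sum(TOKEN_COUNTS[w] for w in ("and", "or", "but"))

print(total_tokens, conjunction_tokens)
```
 The tally reproduces the 211 tokens of the sample and shows that 99 of them are coordinate conjunctions.</Paragraph>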
    <Paragraph position="1"> To determine the classification of each token (DISCOURSE, SENTENTIAL, or AMBIGUOUS), the authors separately judged each token by listening to the taped address while marking a transcription. 9 Footnote 7: We examined only single-word cue phrases in this study since our current prosodic model applies only to such items. In future work we plan to develop additional models for discourse and sentential uses of multi-word cue phrases, e.g. 'that reminds me', 'first of all', 'speaking of' and so on.</Paragraph>
    <Paragraph position="2"> Footnote 8: Our set of cue phrases was derived from extensional definitions provided by ourselves and others [3, 4, 17, 18, 21]. The following lexical items, although also cue phrases, are not present in the portion of the address examined to date: 'alright', 'alternatively', 'anyway', 'boy', 'conversely', 'except', 'fine', 'furthermore', 'incidentally', 'indeed', 'listen', 'moreover', 'nah', 'nevertheless', 'no', 'oh', 'right', 'why', 'yeah', 'yes'.</Paragraph>
    <Paragraph position="3"> In comparing our judgments, we were interested in areas of disagreement as well as agreement. The set of tokens whose classification as to discourse or sentential use we agreed upon provide a testbed for our continuing investigation of the intonational disambiguation of cue phrases. The set of tokens we found difficult to classify (i.e. those tokens we both found ambiguous or those whose classification we disagreed upon), provide insight into possible intonational correlates of discourse/sentential ambiguity. Table 1 presents the distribution of our judgments, where 'classifiable' represents those tokens whose classification we agreed upon and 'unclassifiable' represents those we both found ambiguous or disagreed upon. Footnote 9: ... by a member of the text processing pool at AT&amp;T Bell Laboratories. We found that 20 cue phrases had been omitted by the transcriber: 'and', 'now', 'ok', 'so', and 'well'. Significantly, all but two of these were termed 'discourse' uses by both judges.</Paragraph>
    <Paragraph position="4">  Of the 211 tokens in this initial sample, we found only 133 cue phrases (63.03%) to be unambiguously discourse or sentential. When we looked more closely at the 'unclassifiable' cases, we found that fully 73.08% were coordinate conjunctions (and, or, and but). In fact, when we compare percent classifiable for conjunctions with other cue phrases, we find that, while only 42.42% of conjunctions were found to be classifiable, fully 81.25% of non-conjunctions were classified. Thus, the case of coordinate conjunction appears to explain a large portion of our difficulty in agreeing upon a classification.</Paragraph>
    <Paragraph position="5"> Once we had made these judgments, we analyzed the tokens for their prosodic and syntactic features as well as their orthographic context, much as we had done with tokens for the earlier two studies. 10 We noted whether each token was accented or not and, if accented, we noted the type of accent employed.</Paragraph>
    <Paragraph position="6"> We also identified the composition of the intermediate phrase containing each token as to whether the token constituted a separate phrase (possibly with other cue phrases) or not.</Paragraph>
    <Paragraph position="7"> Footnote 10: ... Talkin's Waves speech analysis software [20] in our prosodic analysis.</Paragraph>
    <Paragraph position="8"> And we noted each token's position within its intermediate phrase -- first (including tokens preceded only by other cue phrases) or not. We also noted syntactic characteristics of each item, including part of speech and its immediately dominating constituent. 11 Finally, we noted orthographic indicators in the transcript which might provide disambiguation, such as immediately preceding and succeeding punctuation and paragraph boundaries. In both the syntactic and orthographic analyses we were particularly interested in discovering how well non-prosodic features which might be obtained automatically from a text would do in differentiating discourse from sentential uses.</Paragraph>
  </Section>
  <Section position="8" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 The Single-Speaker/Multi-
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Cue Phrase Study
</SectionTitle>
      <Paragraph position="0"> Our findings from the classified data (133 tokens) in this pilot study confirmed our model of prosodic distinction between discourse and sentential uses of cue phrases. The distribution of these judgments with respect to the prosodic model of discourse and sentential cue phrases depicted in Figure 1 is shown in Table 2. The model comprises intonational profiles for discourse uses and for sentential uses: Discourse Type A, in which a cue phrase constitutes an entire intermediate phrase (or is in a phrase containing only other cue phrases) and may have any type of pitch accent; Discourse Type B, in which a cue phrase occurs at the beginning of a larger intermediate phrase (or is preceded only by other cue phrases) and bears a L* pitch accent or is deaccented; Sentential Type A, in which the cue phrase occurs at the beginning of a larger phrase and bears a H* or complex pitch accent; and Sentential Type B, in which the cue phrase occurs in non-initial position in a larger phrase. Now note in Table 2 that the ratio of discourse to sentential usage was about 1:2. Of the 44 tokens judged to represent discourse use and fitting our prosodic model, one third were of Discourse Type A and two-thirds of Discourse Type B.</Paragraph>
      <Paragraph position="1"> Footnote 11: We used Hindle's parser Fidditch [7] to obtain constituent structure and Fidditch and Church's part-of-speech program [2] for part of speech assignment.</Paragraph>
      <Paragraph position="2"> While overall results are quite significant, the 17 items judged sentential which nonetheless fit the discourse prosodic model must be explained. Of these 17, 14 (representing two thirds of the total error) are conjuncts (11 'and's and 3 'or's) which fit the type (b) discourse prosodic model. While all are thus first in intermediate phrase -- and, in fact, in intonational phrase -- none are utterance-initial. Both judges found such items relatively difficult to distinguish between discourse and sentential use. 12 In (3), for example, while the first 'and' seems clearly sentential, the second seems much more problematic.</Paragraph>
      <Paragraph position="3"> (3) &quot;But instead actually we are bringing some thoughts on expert databases from a place that is even stranger and further away and that of course is the magical world of artificial intelligence.&quot; The difficulty in such cases appears confined to instances of sentential coordination where the conjunct is not utterance initial. Table 3 shows how judgments were distributed with respect to our prosodic model when coordinate conjunctions are removed from the sample. Our model thus predicts 93.4% of non-conjunct cue phrase distinctions, as opposed to the 84.2% success rate shown in Table 2.</Paragraph>
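      <Paragraph> The two success rates can be cross-checked against the error counts given in the text. This is a hedged worked example: the figures 133, 17, and 14 come from the text, the total of 21 errors follows from the statement that 14 is two thirds of the total error, and the non-conjunct counts (91 classified, 85 correct) are inferences from the reported percentages, since Tables 2 and 3 are not reproduced here.
```python
classified = 133                 # stated: tokens both judges classified
conjunct_errors = 14             # stated: two thirds of the total error

# 14 is two thirds of the total error, so the total is 14 * 3 / 2.
total_errors = conjunct_errors * 3 // 2
correct = classified - total_errors
accuracy_all = round(100 * correct / classified, 1)

# ASSUMPTION: counts inferred from the reported 81.25% and 93.4%.
non_conjunct_classified = 91
non_conjunct_correct = 85
accuracy_non_conjunct = round(
    100 * non_conjunct_correct / non_conjunct_classified, 1)

print(total_errors, accuracy_all, accuracy_non_conjunct)
```
 Both derived rates agree with the 84.2% and 93.4% quoted above.</Paragraph>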
      <Paragraph position="4"> Our prosodic model itself can of course be decomposed to examine the contributions of individual features to discourse/sentential judgments. Table 4 shows the distribution of judgments by all possible feature complexes for all tokens. 13 This distribution reveals that there is considerable agreement when cue phrases appear alone in their intermediate phrase (OF*, corresponding to Discourse type A in Figure 1); such items are most frequently judged to be discourse uses. There is even more agreement when cue phrases appear in non-initial position in a larger intermediate phrase (NONF* -- Sentential type B in Figure 1). Footnote 12: See Section 3. Of the 99 conjuncts in this study, both judges agreed on a discourse/sentential distinction only 42.4% of the time, compared to 78.6% agreement on non-conjuncts. Conjunct tokens represented two-thirds of all tokens the judges disagreed on, and 68.9% of tokens at least one judge was unable to assign.</Paragraph>
      <Paragraph position="5"> Footnote 13: Feature complexes are coded as follows: initial 'O' or 'NO': consists of a single intermediate phrase or not; medial 'F' or 'NF': appears first in intermediate phrase or not; final 'D', 'H', 'L', or 'C': deaccented, or bears a H*, L* or complex pitch accent. Note that four cells (ONFD, ONFH, ONFL, and ONFC) are empty, since all items alone in their intermediate phrase must perforce come first in it.</Paragraph>
      <Paragraph position="6"> However, tokens which fit Discourse type B in Figure 1 (first in a larger phrase and deaccented (NOFD) or with a L* (NOFL)) appear more problematic: of the former, there was disagreement on fully two thirds. 14 While there is more agreement that tokens characterized as NOFH (first in a larger phrase with a H* accent) or NOFC (same with a complex pitch accent) -- Sentential type A in Figure 1 -- are sentential, this agreement is certainly less striking than in the case of tokens characterized as NONF* (non-initial in a larger phrase with any type of pitch accent -- Sentential type B). Since Discourse type B and Sentential type A differ only in 'type of pitch accent', we might conclude that the pitch accent feature is not as powerful a discriminator as the phrasal features 'alone in intermediate phrase' or 'first in phrase'.</Paragraph>
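      <Paragraph> The feature-complex codes of footnote 13 can be enumerated mechanically and mapped onto the four model types. The following is a minimal sketch under the coding described there; the dictionary and function names are introduced here for illustration and are not from the original study.
```python
from itertools import product

# Code components, following the coding scheme in footnote 13.
PHRASE = {"O": "alone in intermediate phrase", "NO": "in a larger phrase"}
POSITION = {"F": "first in phrase", "NF": "not first in phrase"}
ACCENT = {"D": "deaccented", "H": "H* accent",
          "L": "L* accent", "C": "complex accent"}


def model_type(phrase, position, accent):
    """Return the model type for a feature complex, or None if impossible."""
    if phrase == "O" and position == "NF":
        return None  # an item alone in its phrase is necessarily first in it
    if phrase == "O":
        return "Discourse Type A"
    if position == "NF":
        return "Sentential Type B"
    if accent in ("D", "L"):
        return "Discourse Type B"
    return "Sentential Type A"


# Build all 16 cells, e.g. 'NOFD' or 'NONFH'.
cells = {p + f + a: model_type(p, f, a)
         for p, f, a in product(PHRASE, POSITION, ACCENT)}
empty_cells = sorted(code for code, kind in cells.items() if kind is None)

print(empty_cells)
```
 The enumeration yields 16 cells, of which exactly the four ONF* cells are empty, matching the note in footnote 13.</Paragraph>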
      <Paragraph position="7"> As in our previous study, we also examined potential non-prosodic distinctions between discourse and sentential uses. Of the orthographic and syntactic features we examined, we found presence or absence of preceding punctuation and part-of-speech to be most successful in distinguishing discourse from sentential uses. For the 113 tokens on which both judges agreed as to discourse or sentential status, 15 orthography distinguishes between discourse and sentential use in 101 (89.4%) of cases. Specifically, 21 of 30 discourse uses are preceded by punctuation and only 3 of 83 sentential items.</Paragraph>
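      <Paragraph> The 89.4% figure follows from treating preceding punctuation as a predictor of discourse use. The sketch below works through that arithmetic with the counts given above; the decision rule as written (punctuation implies discourse, otherwise sentential) is our reading of the text, and the variable names are illustrative.
```python
discourse_total = 30
discourse_with_punct = 21      # discourse uses preceded by punctuation
sentential_total = 83
sentential_with_punct = 3      # sentential uses preceded by punctuation

# Rule (assumed reading): preceding punctuation predicts a discourse use,
# and its absence predicts a sentential use.
correct_discourse = discourse_with_punct
correct_sentential = sentential_total - sentential_with_punct

total = discourse_total + sentential_total
correct = correct_discourse + correct_sentential
accuracy = round(100 * correct / total, 1)

print(correct, total, accuracy)
```
 This gives 101 correct predictions out of 113 tokens, in line with the reported rate.</Paragraph>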
      <Paragraph position="8"> We also found that part-of-speech distinguishes discourse from sentential use, although less successfully than orthography. If we simply predict discourse or sentential use by the assignment most frequently associated with a given part-of-speech, both Church's part-of-speech algorithm and Hindle's Fidditch predict discourse or sentential use in approximately 75% of cases where both judges agreed on discourse/sentential assignment.</Paragraph>
      <Paragraph position="9"> Footnote 14: And note that 91.3% of items in these two cells are conjuncts. Footnote 15: This figure excludes those items which the transcriber omitted.</Paragraph>
      <Paragraph position="10"> For example, we assume that since the majority of conjunctions and verbs are judged sentential that these parts-of-speech are predictors of sentential status, and since most adverbials are associated with discourse uses, these are predictors of discourse status, and so on. While part-of-speech thus might seem less useful than orthographic distinctions for our corpus, the fact that it is not subject to transcriber idiosyncrasy might make it a more reliable predictor than orthographic indicators in the general case. Too, for text-to-speech applications, in which one would like to infer discourse or sentential use in order to employ the appropriate intonational features when synthesizing the item in question, these text-based results are encouraging.</Paragraph>
    </Section>
  </Section>
</Paper>