<?xml version="1.0" standalone="yes"?> <Paper uid="J90-3003"> <Title>A COMPUTATIONAL GRAMMAR OF DISCOURSE-NEUTRAL PROSODIC PHRASING IN ENGLISH</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 AN ANALYSIS OF PHRASING 2.1 BACKGROUND: THE FACTORS THAT CONTRIBUTE TO PROSODIC PHRASING 2.1.1 SYNTAX AND PROSODIC PHRASING </SectionTitle> <Paragraph position="0"> What is the exact contribution of syntax to sound? There are well-known local syntactic phenomena that affect both phonetic segment quality and the stress pattern of the phrase. A difference in syntactic category affects phonetic quality in examples 1-3 below: 1. a. They live in Canada. (live = verb) b. He ate live lobster. (live = adjective) 2. a. Can you estimate the damage? (estimate = verb) b. Give us an estimate. (estimate = noun) 3. a. As the water grew colder, their hands grew number. (number = adjective) b. Do you have his phone number? (number = noun) The type of object that a verb takes correlates with the pronunciation of the verb in 4 and 5: 4. He resided in Holland. (reside pronounced [rizaid]) 5. He resided the house with aluminum. (reside pronounced [risaid]) Syntactic category information also influences word prosody, as in 6 and 7, where knowledge of category membership is necessary to determine the correct stress pattern. 6. a. Both content and style are important.</Paragraph> <Paragraph position="1"> (content = noun) b. They are content to remain here.</Paragraph> <Paragraph position="2"> (content = adjective) 7. a. This ticket is invalid. (invalid = adjective) b. He is an invalid. (invalid = noun) Syntactic category may also affect phrasal stress. For example, the sequence power units has stress on units in the verb-noun sequence in 8, but it has stress on power in the noun-noun sequence in 9: 8. If house current fails, power units from battery. 9. 
The power units failed.</Paragraph> <Paragraph position="3"> Finally, syntactic gaps affect segment quality. For example, the vowel in the preposition to is normally weak, as in 10. But if a gap like the one that is associated with the question word who in 11 follows the preposition, the vowel of to is strong.</Paragraph> <Paragraph position="4"> 10. We spoke to John. (to pronounced /tə/) 11. Who did you speak to? (to pronounced /tu/) When it comes to sentence-level prosody, especially phrasing, it is often true, as we will see below, that a sequence of words dominated by the same syntactic node coheres more closely than a sequence of words dominated by two different nodes. This observation has led some researchers, e.g., Cooper and Paccia-Cooper (1980), to claim a direct mapping between the syntactic phrase and the prosodic phrase. However, this claim is controversial because of the misalignments that occur between the two levels of phrasing. For example, in considering the connection between syntax and phrasing, the linguistic literature most often refers to examples of embedded sentences. Sentences like 12, from Chomsky (1965), are frequently cited. (Square brackets mark off the NP constituents that contain embedded sentences.) 12. This is [NP the cat that caught [NP the rat that stole [NP the cheese]]] In such cases, the syntactic constituency indicated by bracketing is not in alignment with the prosodic phrasing. Instead, 12 has the prosodic phrasing in 13a. The phrasing in 13b, which most closely matches constituency, is strange at best. (In these and other examples, the most prominent prosodic boundaries are marked by vertical bars.) 13. a. This is the cat || that caught the rat || that stole the cheese. b. 
??This is || the cat that caught || the rat that stole || the cheese.</Paragraph> <Paragraph position="5"> To account for such mismatches, &quot;readjustment rules&quot; that change constituent structure by adjoining each embedded sentence to the node dominating it have been posited. The result is a flattened structure that more accurately reflects the prosodic phrasing. In Chomsky and Halle (1968), this flattening process is not part of the grammar. Rather, it is viewed as &quot;... a performance factor, related to the difficulty of producing right branching structures such as [12]&quot; (p. 372). Thus phrasing, in their approach, is only indirectly related to syntax, since readjustment is done by special rules outside the grammar proper.</Paragraph> <Paragraph position="6"> Langendoen (1975) proposes readjustment rules similar to those of Chomsky and Halle, but he claims that the readjustment of structure is part of the grammar, not part of the performance model. He thus makes explicit what is often a tacit assumption in both the linguistic and psycho-linguistic literature2--that there is a direct connection between syntactic constituency and prosodic phrasing, with apparent misalignments readjusted before syntax interfaces with prosodic phonology.</Paragraph> <Paragraph position="7"> Langendoen's proposal works well for sentences such as 12 because it predicts that important prosodic phrase boundaries will coincide with sentence boundaries. But this does not always fit the prosodic facts--sentences that lack overt complementizers or relative pronouns often resist the insertion of a break to set them off. For example, when applied to They believe California sales are still off 75 percent, readjustment rules cause the embedded sentence to be set off prosodically, as in 14a. 
This seems quite unnatural compared with 14b (an observed example), where a boundary has been inserted not before the sentence, but after the embedded subject.</Paragraph> <Paragraph position="8"> 156 Computational Linguistics Volume 16, Number 3, September 1990 J. Bachenko and E. Fitzpatrick Discourse-Neutral Prosodic Phrasing in English 14. a. ??They believe || California sales are still off 75%. b. They believe California sales || are still off 75%. Similarly, flattening the relative clause it saw in It was ready to bite the first person it saw has the questionable effect of inserting a prosodic phrase boundary before the relative clause, as in 15a. But in our data, this sentence actually has the phrasing in 15b, where the relative clause is not set off. (The relative clause in these examples is italicized.) 15. a. ??It was ready to bite the first person || it saw. b. It was ready to bite || the first person it saw.</Paragraph> <Paragraph position="9"> Moreover, there are certain distinctions among clause types, for example the difference between restrictive and appositive relatives, that are captured only by the presence or absence of a separate prosodic phrase for the clause. However, Langendoen's claim that embedded clauses are flattened would nullify this difference. The flattening and consequent setting off of restrictive relatives would render sentences such as 16a unintelligible because the associated appositive reading forces a contradiction between who came from Plymouth and who came from Falmouth. We believe the only intelligible version of this sentence is 16b, where there has been no readjustment of the come from clauses.</Paragraph> <Paragraph position="10"> 16. a. ??The pilgrims || who came from Plymouth || were a lusty bunch || while the pilgrims || who came from Falmouth || were not.</Paragraph> <Paragraph position="11"> b. 
The pilgrims who came from Plymouth || were a lusty bunch || while the pilgrims who came from Falmouth || were not.</Paragraph> <Paragraph position="12"> In sum, the contribution of syntax to sound is borne out by several phenomena. Even at the level of prosodic phrasing, syntactic constituents often cohere. Where misalignments between the syntactic and the prosodic phrasing occur, however, the notion of readjusting the syntax to fit the prosody is problematic and, we believe, compares unfavorably with an approach that views the semantic and phonological components as contributing to prosodic phrase boundary determination.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2.1.2 SEMANTICS AND PROSODIC PHRASING </SectionTitle> <Paragraph position="0"> The syntax/prosody misalignment may be viewed as resulting in part from semantic considerations. Both predicate-argument relations and discourse factors have been examined for their possible input to prosodic phrasing.</Paragraph> <Paragraph position="1"> Crystal (1969) claims that prosodic phrase boundaries will co-occur with grammatical functions such as subject, predicate, modifier, and adjunct. Selkirk (1984) and Nespor and Vogel (1986) take a similar approach, but within a different theoretical framework. Previous versions of our work, as described in Bachenko et al. (1986), also assume that phrasing is dependent on predicate-argument structure. The problem here is that the phrasing in observed data often ignores the argument status of constituents. In 17a-f, for example, the phrasing makes no distinction between arguments and adjuncts. All of the sentences have the same X(VY) pattern even though Y is a complement in the first case (the first serious attempt) and an adjunct in the others. (The complement in 17a and the adjuncts in 17b-f are italicized.) 17. a. A British expedition || launched the first serious attempt.</Paragraph> <Paragraph position="2"> b. 
A single bright light || shone out from the darkness. c. There were several little changes || carried out about that time.</Paragraph> <Paragraph position="3"> d. Were there any gypsies || camping in the plantation ....</Paragraph> <Paragraph position="4"> e. ... like the claws of a crab || thrown out on each side. f. Two years || have passed since then.</Paragraph> <Paragraph position="5"> The relation between discourse and prosodic phrasing has been examined in some detail by Bing (1985), who argues that each noun phrase in an utterance constitutes a separate prosodic phrase unless it is destressed because of reference to previous discourse. Bing also observes that constituents that refer to items newly introduced into a discourse tend to be longer. This may be the reason that word count and syllable count play a prominent role in prosodic phrasing (see Section 2.1.3). To our knowledge, no work has explicitly explored the relation between the length of a constituent and its status in the discourse. Hirschberg and Litman (1987) and Litman and Hirschberg (1990) also examine the relation between discourse and prosodic phrasing. Their work succeeds in distinguishing the use of items like now, so, and well as discourse cues from their denotative lexical use on the basis of a complex combination of pitch accent type and phrasing.</Paragraph> <Paragraph position="6"> The Hirschberg and Litman studies identify a specific discourse distinction that relates to phrasing. These studies are not intended to give a picture of the extent to which discourse relates to phrasing. On the other hand, Bing's work gives a broader picture of the relation between discourse and phrasing, but it deals only with noun phrases. Thus both of these efforts leave open the question as to whether discourse features completely determine prosodic phrasing or are a complement to some more basic set of determinants, syntactic and/or phonological. 
In other words, when prosodic features that reflect facts of the discourse are removed, is there a residual, neutral phrasing? Our work on the prosodic phrase status of clause-final prepositional phrases, which we discuss below, suggests the existence of a discourse-neutral phrasing that depends on syntactic constituency mediated by string adjacency and length of a potential prosodic phrase. 3 Such phrasing provides us with a typical phrasing pattern analogous to the typical phrasal stress patterns examined in Liberman and Prince (1977), which &quot;are often overwhelmed by the chiaroscuro of highlight and background in discourse, but retain the status of null-hypothesis patterns that emerge when there is no good reason to take some other option&quot; (p. 251). This approach to prosodic phrase boundary determination brings us closer to a framework in which phonological, syntactic, and discourse features all contribute to prosodic phrasing.</Paragraph> <Paragraph position="7"> The possibility of a discourse-neutral prosodic phrasing is also of import to the prosodic quality of synthetic speech, since it allows us to &quot;get by&quot; without a complete description of the discourse features of a given text, many of which have yet to be characterized. Interestingly, in the data we examined we found only 14 percent of the phrases to be discourse-determined.</Paragraph> <Paragraph position="8"> The identification of a preferred phrasing that is independent of discourse also aids us in identifying and characterizing the discourse features that impinge on prosodic phrasing. Several well-known discourse phenomena--coreference, contrast, and parallelism--affected the phrasing of the clause-final prepositional phrases in our corpus. 
We are left with three or four unexplained cases that are suggestive of a discourse explanation.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 2.1.3 PHONOLOGICAL LENGTH AND PROSODIC PHRASING </SectionTitle> <Paragraph position="0"> The psycholinguistic studies of Martin (1970), Allen (1975), Hitlinger et al. (1976), Grosjean et al. (1979), Dommergues and Grosjean (1983), and Gee and Grosjean (1983), responding to the idea of readjusted syntax as the source of prosodic phrasing, show that grammatical structure, even if readjusted, is not in itself a reliable predictor of prosodic phrasing: mismatches between syntax and prosody occur often and systematically, and can be related to specific nonsyntactic factors such as length and word frequency.</Paragraph> <Paragraph position="1"> For example, although prosodic boundaries between subject and verb do occur, there also exist prosodic patterns in which the boundary comes between the verb and object, i.e., the data reveal both X(VY) and (XV)Y groupings.</Paragraph> <Paragraph position="2"> Grosjean et al. (1979) claim that such mismatches are due for the most part to constituent length, which interacts with grammatical structure and, in some cases, overrides it.</Paragraph> <Paragraph position="3"> Thus syntactic and prosodic structure match when the major constituents of a sentence are roughly equal in length; for example, the main prosodic phrase break corresponds to the subject-predicate boundary in Waiters who remember well || serve orders correctly. Discrepancies in length throw constituents off balance, and so prosodic phrasing will cross constituent boundaries in order to give the phrases similar lengths; this is the case in Chickens were eating || the remaining green vegetables, where the subject-predicate boundary finds no prosodic correspondent. 
4 The most explicit version of this approach is the analysis presented in Gee and Grosjean (1983) (henceforth G&G).</Paragraph> <Paragraph position="4"> Drawing on the psycholinguistic studies mentioned above and on aspects of the grammar of prosody outlined in Selkirk (1984), G&G propose an algorithm for mapping syntactic structure onto a hierarchical representation of phrasing; the rules they present accomplish this by integrating syntactic information (e.g. constituent structure, left-to-right ordering) with information about constituent length. We have found that their rules, which are described in detail, provide a productive model for investigations of phrasing, and in what follows we shall frequently refer to their analysis. But, as we will show, G&G fall short of providing a comprehensive theory. Their rules are too limited and their syntax too underspecified to achieve moderate coverage for an unrestricted collection of sentences or to provide an adequate description for implementation. 5</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 CURRENT ANALYSIS </SectionTitle> <Paragraph position="0"> Our goal has been to develop a theory of syntax/prosody relations that we could test in an experimental text-to-speech system. We approached the problem with the assumption that there is a level of prosodic phrase determination that does not include discourse factors, and that aiming for this level would yield an appropriate phrasing for a sentence. Both the output of the system and our preliminary findings, which show that discourse factors influence just a small part of the phrasings that follow a verb, indicate that this approach is feasible.</Paragraph> <Paragraph position="1"> The analysis that we arrived at takes G&G and, to some extent, Selkirk (1984) as its starting point. Hence we are assuming that there is no necessary match between syntactic structure and prosodic phrasing. 
Prosody rules refer to syntactic structure, but they are not obliged to preserve it; independent principles of prosodic well-formedness, in particular length calculations, may create entirely different structures that appear at odds with the syntax. Here we shall describe the main features of our analysis and then go on to a description of the implementation.</Paragraph> <Paragraph position="2"> Our prosody rules are intended to account for two aspects of phrasing: boundary location and boundary salience. In 18, for example, the rules need to stipulate that a phrase boundary comes between the subject and predicate.</Paragraph> <Paragraph position="3"> 18. The light among the trees || was extinguished.</Paragraph> <Paragraph position="4"> But when there is more than one important phrase boundary in a sentence, the rules will also specify a relative salience, or perceptibility, for each boundary. Thus in the observed sentence 19, where an adjunct has been prefixed to the sentence, the boundary between subject and predicate diminishes in deference to the stronger boundary between adjunct and core sentence. A single vertical bar marks the diminished boundary.</Paragraph> <Paragraph position="5"> 19. About nine o'clock || the light among the trees | was extinguished.</Paragraph> <Paragraph position="6"> After deciding which boundaries will be diminished and which highlighted, the rules assign each boundary an acoustic value that reflects its relative strength. Our current
system uses three values made available by the synthesizer.</Paragraph> <Paragraph position="7"> A pause and its concomitant prosodic effects mark the strongest boundaries, a pitch change signifies intermediate boundaries, and the weakest boundaries are assigned a phrase accent.</Paragraph> <Paragraph position="8"> Our work so far has focused solely on the issues of location and relative salience; the rules for associating the different boundaries with specific intonation contours and acoustic values are still quite rudimentary. Consequently our discussion will center on the location and salience rules, and we will mention the third rule class only in passing.</Paragraph> <Paragraph position="9"> The location rules identify possible boundary sites by first deriving phonological words from the lexical items in a parse tree, and then grouping the phonological words into larger phonological phrases. The boundaries that separate phonological phrases are the candidates for prosodic phrase boundaries, since the prosodic phrases of speech consist of one or more phonological phrases.</Paragraph> <Paragraph position="10"> Our rules for phonological word formation are adopted, for the most part, from G&G, Grosjean and Gee (1987), and the account of monosyllabic destressing in Selkirk (1984). Thus in our analysis, rules of phonological word formation apply to the non-null terminal nodes in a syntax tree. If the terminal is a content word, i.e. noun, verb, adjective, or adverb, then this terminal may have the status of a phonological word on its own. Otherwise the word combines with one or more orthographically distinct words to form a single phonological word that has no internal word or phrase boundaries. This is accomplished by adjoining a word to its left or right neighbor depending on its lexical category and its position in the tree. Function words, e.g. 
auxiliary verbs, articles, prepositions, pronouns, and conjunctions, are all eligible for adjunction in certain syntactic contexts. Content words, copular verbs, demonstratives, quantifiers and elements in the complementizer node can serve as hosts for the adjoined material or stand alone.</Paragraph> <Paragraph position="11"> Figure 1 illustrates the effects of phonological word formation; the &quot;+&quot; indicates that adjunction has taken place.</Paragraph> <Paragraph position="12"> Article adjunction, for example, attaches a, an, and the to a following word, so that a sudden and the trees in Figure 1 each becomes a single phonological word in which the article acts as an unstressed syllable. The rule of preposition adjunction, which has applied twice in Figure 1, attaches a preposition to the material on its right only if it is the head of a PP, otherwise the preposition attaches leftward. Thus in 20a, where the preposition is a syntactic head, in + the + dimly forms a single phonological word after article and preposition adjunction. The phrase boundary in this case will precede the preposition. But in 20b, where there is no PP (rather, the preposition is a sister of the verb in the syntax tree), filled + in is a phonological word. Hence the boundary will follow the preposition.</Paragraph> <Paragraph position="13"> 20. a. Holmes waited | in + the + dimly lit room.</Paragraph> <Paragraph position="14"> b. ... and filled + in | a few of the gaps. 6 Rules of phonological phrase formation now build the next level of prosodic constituents. 7 Each phonological phrase consists of a syntactic head and the material that intervenes between it and a preceding head (usually, the pre-head modifiers, e.g. pre-nominal adjectives, pre-verbal adverbs). Following Selkirk (1984), we have limited the eligible head categories to noun, verb, adjective, and adverb (although adjectives and adverbs do not count if they directly precede and modify another head). 
Examples 21a-b illustrate the results of phonological phrase formation. In each case, the phonological phrase is created by a left-to-right process that collects material up to and including the head of a syntactic constituent. Every phonological phrase boundary thus marks a syntactic head as well as the site of a possible prosodic boundary in speech. (The sequences with + are words formed by adjunction; | stands for a phonological phrase boundary.) 21. a. A + British expedition | launched | the + first serious attempt.</Paragraph> <Paragraph position="15"> b. We + saw | a + sudden light | spring + up | among + the + trees.</Paragraph> <Paragraph position="16"> Which boundaries become the prominent ones is determined by the salience rules described below.</Paragraph> <Paragraph position="17"> The elements of phonological phrases cohere strongly in speech--they cannot be separated into smaller phrases without a dramatic effect on the semantic content of the sentence. In 22a, for instance, the italicized phrase must be treated as a minimal element with respect to phrasing; the phrasing of 22b, where an important break comes before the syntactic head, is rare in our data.</Paragraph> <Paragraph position="18"> 22. a. I shall never forget | that April morning.</Paragraph> <Paragraph position="19"> b. ??I shall never forget that | April morning.</Paragraph> <Paragraph position="20"> Example 23a, with a phonological phrase boundary before the preposition of, follows the pattern that is found in nearly all of our data. The observed pattern in 23b, where the boundary follows the of, creates a noticeably theatrical effect with emphatic stress on strange.</Paragraph> <Paragraph position="21"> 23. a. He's a collector | of strange animals.</Paragraph> <Paragraph position="22"> b. 
He's a collector of | strange animals.</Paragraph> <Paragraph position="23"> The rules for salience apply to a combination of phonological phrases and syntactic constituents. Their input is a structure like that in Figure 2, where syntactic constituents may contain one or more phonological phrases. The NP in Figure 2 consists of a single phonological phrase, and the top-level PP contains two. The absence of VP in this figure will be explained below.</Paragraph> <Paragraph position="24"> When they apply, the salience rules merge phonological phrases to create larger prosodic phrases, which are also merged into a final phrase hierarchy. Boundaries between the phrases are thus diminished or emphasized, finally giving the impression of a balanced, rhythmic pattern whose components appear to be equal in length.</Paragraph> <Paragraph position="26"> Figure 2 Input to Salience Rules (phph = phonological phrase). Example sentence: We locked our doors for fear of the cheetah.</Paragraph> <Paragraph position="27"> The salience rules apply on the basis of (i) adjacency to a verb, (ii) length, and (iii) constituent type.</Paragraph> <Paragraph position="28"> All three factors interact in the initial balancing of material around a verb. In this process, the verb groups to the left to form a (XV)Y pattern or it groups to the right to produce a X(VY) pattern. Our analysis generally follows G&G, who propose the following rule, where, in their formulation, X is a prosodic constituent or null, V is a verb, Y is a nonsentential complement, and C refers to phonological word count.</Paragraph> <Paragraph position="29"> 24. Verb Balancing Rule: if C(X) + C(V) is less than C(Y), then (XV)Y; otherwise X(VY).</Paragraph> <Paragraph position="31"> The rule works through a sentence from left to right. It says that if combining the verb with the constituent to its left yields a word count less than that of the complement, the verb forms a prosodic phrase with its left neighbor. Consequently, the phrase boundary following the verb is strengthened. 
For all other cases, the verb groups to the right so that the boundary preceding the verb becomes reinforced (G&G 442). In 24, X and Y contain prosodic constituents, either phonological phrases or prosodic phrases formed by other salience rules (e.g. the constituent rules that build NP and PP into separate prosodic phrases; see G&G, p. 441). Word count (C) is determined solely by the number of phonological words. 8 Example 25a gives the phonological phrasing (indicated by |) for This little incident gives a new zest to our investigation. Applied to this string, the verb rule will group the verb to the right and derive the X(VY) pattern since This little incident plus gives adds up to four words, while a new zest is only two words; the to-phrase is not adjacent to the verb and so is not considered by the rule.</Paragraph> <Paragraph position="32"> The final phrasing is given in 25b, where the verb rule accounts for the break after the subject, and length rules that are discussed below account for the second boundary.</Paragraph> <Paragraph position="33"> 25. a. This little incident | gives | a + new zest | to + our + investigation.</Paragraph> <Paragraph position="34"> b. This little incident || gives a + new zest | to + our + investigation.</Paragraph> <Paragraph position="35"> Similarly in 26, Holmes' eyes traveled adds up to three, but the post-verb conjoined phrase round and round only counts for two. Hence the verb groups to the right.</Paragraph> <Paragraph position="36"> 26. Holmes' eyes || traveled round and + round.</Paragraph> <Paragraph position="37"> The sentences in 27 follow the (XV)Y option. In 27a, as + the + lamp plus was + lit add up to two words versus the three words in + one of + the + sitting rooms. In 27b, Chickens plus were + eating also add up to two words, while the + remaining green vegetables adds up to three. 
In 27c, and + his + eyes plus were + fixed add up to two words, while in + a + dreadful rigid stare adds up to three.</Paragraph> <Paragraph position="38"> 27. a. as + the + lamp was + lit || in + one of + the + sitting rooms...</Paragraph> <Paragraph position="39"> b. Chickens were + eating || the + remaining green vegetables.</Paragraph> <Paragraph position="40"> c. and + his + eyes were + fixed || in + a + dreadful rigid stare.</Paragraph> <Paragraph position="41"> In their formulation of the verb rule, G&G impose two conditions on its application. First, the rule may examine only arguments of the verb, i.e., complements are candidates for Y in 24, but modifiers and adjuncts are not. Hence the rule must have access to VP constituency since verb complements, in the generative grammar framework assumed by G&G, are represented as sisters of V in VP, while modifiers and adjuncts are outside of the VP. Second, the rule cannot cross S boundaries--embedded clauses form separate prosodic units in G&G's analysis.</Paragraph> <Paragraph position="42"> Our studies indicate that these conditions are too strong: balancing around a verb often crosses both VP and S boundaries in our taped data. For example, in 28a-b, where the verb and its complement occur in a single prosodic phrase, the phrasing may appear to be influenced by the presence of a VP (the complement is italicized).</Paragraph> <Paragraph position="43"> 28. a. A + British expedition || launched the + first serious attempt.</Paragraph> <Paragraph position="44"> b. The + 48 channel module || can + have only two di-groups.</Paragraph> <Paragraph position="45"> Yet the verb also forms a single prosodic phrase with sentence adjuncts. This is the case in 29a-c, where the verb and adjunct are separated by an important boundary in the syntax (VP), but not in the prosody. 
In these sentences, while a secondary break may set off the adjunct, the main prosodic phrase boundary comes before a verb + adjunct sequence (the adjunct is italicized).</Paragraph> <Paragraph position="46"> 29. a. Seven of + our + porters || were + killed in + the + fall. b. a + crack || opened in + the + snow.</Paragraph> <Paragraph position="47"> c. the + elements of + personal interest || must + be + introduced at + all costs If the verb-balancing rule is restricted to subcategorized complements of the verb, as G&G assume, then the phrasing in 29a-c has no explanation since, with the restriction, the main boundary in these sentences has to come between the verb and adjunct, a prediction that contradicts the observed pattern and sounds strange at best. Sentences such as those in 29a-c suggest that Y in the verb rule of 24 should not be limited to material within VP but should include anything to the right of V. Hence we are assuming that the key to phrasing in 28a-b and 29a-c is the adjacency relationship between a verb and the constituent on its right, not verb phrase structure, or, equivalently, the complement versus noncomplement status of a constituent. In particular, phrasing around the verb depends on the relative length of constituents that are adjacent to a verb and, as we will observe, the presence of specific verb-adjacent items (e.g. phrasal and).</Paragraph> <Paragraph position="48"> The phrasing of embedded sentences follows a similar course. Prominent phrase boundaries often co-occur with clause boundaries, e.g. Customers are asking retailers || whether their watermelons come from California, but we believe that this is only apparently due to clausal constituency. In our data, components of S can form a prosodic phrase with nearly any adjacent material, regardless of where the S boundary falls. For example, 30a-c has the most prominent break before the matrix verb, not before the embedded S (which is italicized). 
The sentences of 31 have the most prominent prosodic break within the embedded sentence.</Paragraph> <Paragraph position="49"> 30. a. Even my + fiance || believes it's only my + imagination. b. Did + Dr. Roylott || continue to + practice? c. A + terrible change || began to + come + over our + stepfather.</Paragraph> <Paragraph position="50"> 31. a. They + believe California sales || are + still off 75 percent.</Paragraph> <Paragraph position="51"> b. I've + heard that + the + crocuses || promise very well.</Paragraph> <Paragraph position="52"> c. Experience has + proved that + savages || are the + tyrants of + the + female sex.</Paragraph> <Paragraph position="53"> d. I + seem to + see dimly || what you're driving at. Here again it appears that G&G's formulation of verb balancing, which prohibits the rule from crossing sentence boundaries, is too strict. Requiring the prosody rules to preserve the constituent status of embedded clauses predicts that when the verb is followed by an embedded clause, as in 30a-c and 31a-d, balancing is superseded by rules that work on the internal structure of the embedded S; the result in most cases is a prominent prosodic boundary before the embedded clause. According to this approach, the sentence Did Dr. Roylott continue to practice? would have the odd phrasing Did Dr. Roylott continue || to practice? instead of the observed phrasing of 30b.</Paragraph> <Paragraph position="54"> The phrasing patterns of 30a-c and 31a-d are easy to explain if we assume that prosody rules ignore clausal constituency so that verb balancing applies across S boundaries. Evidence from sentence adjuncts, e.g. the purposive in We went out to the Himalaya to climb..., and the relative clause in It was ready to bite || the first person it saw, leads us to believe that prosody rules ignore these clausal constituents as well. 
Our analysis thus adopts the basic mechanism G&G propose with their verb rule, but without the conditions on verb complements and clausal constituency. While NP, PP, and AdjP constituents &quot;count&quot; in our prosody rules, VP and S constituents do not. Consequently, eligible material to the right of V in the verb rule 24 may be a complement, a modifier, an adjunct, or the initial constituent of an embedded sentence. The exact treatment of the left-adjacent constituent is still a topic for investigation, and there is currently no requirement that the material to the left of a verb be a subject.</Paragraph> <Paragraph position="55"> G&G intend the balancing rule to make the verb a prosodic center by grouping constituents in such a way as to create, in most cases, two phrases of approximately equal length, with the verb as a left or right edge. During subsequent processing, this balancing effect is usually lost since neither length nor adjacency to a verb plays any further role in G&G's analysis. After verb balancing, the remaining constituents are bundled from left to right into a left-branching binary tree like that in Figure 3. As we will discuss below, the higher the constituent is in the prosody tree, the more prominent will be the boundaries that set it off in speech. Hence, when constituents are bundled, as in Figure 3, material that comes at the end of a sentence will usually be set off by the largest breaks. This is what happens in 32a-b, where a strong boundary before the final (italicized) constituent seems desirable.</Paragraph> <Paragraph position="56"> 32. a. ... walk away from your helper || approximately 50 feet.</Paragraph> <Paragraph position="57"> b. 
I suddenly heard in the silence of the night || that same low whistle.</Paragraph> <Paragraph position="58"> But the application of bundling immediately after the verb rule often leads to problems. The salience of a boundary that occurs toward the end of a sentence tends to be overestimated when the final constituent is relatively short. When this happens, the final constituent may be set off unnaturally from the rest of the sentence. In 33a-c, for example, a strong boundary before the italicized constituent is inappropriate. 9 33. a. ??The speaker pronounced the names of the characters || on the left.</Paragraph> <Paragraph position="59"> b. ??We locked our door for fear || of the cheetah.</Paragraph> <Paragraph position="60"> c. ??She had caught an early morning train || from London.</Paragraph> <Paragraph position="61"> We believe that the problems raised by final bundling can largely be avoided by extending the effects of length and verb adjacency beyond the verb balancing rule. In our analysis, this is accomplished by an adjacency rule and two length rules that, in effect, sustain verb centering and determine the prosodic weight of constituents not adjacent to a verb. (Figure 3: Final Bundling, adapted from G&G, p. 443; example sentence: I gave the book to John for Christmas.)</Paragraph> <Paragraph position="62"> The adjacency rule in 34 applies after the balancing rule of 24 and groups the &quot;unclaimed&quot; verb-adjacent constituent with the phrase that was formed by 24 (this will be either (VY) or (XV)). X in 34a and Y in 34b are prosodic constituents.</Paragraph> <Paragraph position="63"> 34. Verb Adjacency Rule a. \[... X (VY) ...\] → \[... (X (VY)) ...\] b. \[... (XV) Y ...\] → \[... ((XV) Y) ...\] where ... is a phonological phrase or null and X may or may not be a subject.</Paragraph> <Paragraph position="64"> Case 34a says that if (VY) is a prosodic constituent, then create a new prosodic constituent composed of (VY) and adjacent material to the left. 
Likewise 34b creates a new prosodic constituent by combining (XV) with adjacent material on the right. What results is a prosodic verb phrase--a cluster of two prosodic phrases with the verb in the middle as a left or right edge. In 35a-b, for example, constituents abutting the verb have been worked into a single prosodic verb phrase by (i) the verb balancing rule, which groups the verb rightward with the complement (giving ... X(VY) ...), and (ii) the adjacency rule in 34a, which generates a larger phrase containing the constituent on the left (... (X(VY)) ...). The prosodic verb phrase is italicized; its internal boundary is marked by a single vertical bar.</Paragraph> <Paragraph position="65"> 35. a. Everest | is an + enormous pyramid || with + three wide faces and + three ridges.</Paragraph> <Paragraph position="66"> b. Mrs. Welles | wrote a + weekly sports column || for + the + Christian Science Monitor.</Paragraph> <Paragraph position="67"> Constituents that are not adjacent to the verb form the periphery of a prosodic verb phrase. In 35a, the periphery consists of with three wide faces and and three ridges; in 35b it consists of for the Christian Science Monitor. While our understanding of phrasing of the periphery of a sentence is far from satisfactory, we have been conducting a study that so far suggests that, at least with respect to clause-final prepositional phrases, length of a peripheral constituent is an important determinant of its prosodic prominence. The phrases considered all occur to the right of the verb with some material intervening between the verb and a prepositional phrase, PP. The intervening material varies from a full phrase to a single word that functions as a syntactic head to which the PP under consideration is a complement. In a test of 129 clause-final prepositional phrases, we noted the pattern shown in Figure 4. 
The clear bifurcation of the numbers in this test suggests that length of the PP is determining the degree to which final PPs are set off. Many of the 18 apparent counterexamples to this claim also indicate that length establishes a discourse-neutral phrasing for sentence-final PPs that is contravened only by predictable syntactic and discourse factors. For example, the of-PP in the partitive construction of 36a is not in a separate prosodic phrase, even though it contains two stress feet (west and wing). This contrasts with 36b, where the of-PP is not a partitive.</Paragraph> <Paragraph position="68"> 36. a. ... blue smoke | curling + up from + the + chimneys || showed that part of + the + west wing...</Paragraph> <Paragraph position="69"> b. A + single bright light || shone + out from + the + darkness || of + the + west wing.</Paragraph> <Paragraph position="70"> Similarly, while the italicized PP in 37a follows the rule of being greater than one phonological word and therefore set off, a similar PP in 37b is not set off because the repetition of room causes this item to be destressed and consequently to merge with the relevant PP.</Paragraph> <Paragraph position="71"> 37. a. Holmes passed at + once || into + the + room || in + which Helen Stoner || was now sleeping.</Paragraph> <Paragraph position="72"> b. so + that I've + had to + move out + of + my + own room || into + the + room next door || the + room in + which my + sister died.</Paragraph> <Paragraph position="73"> In the implementation, the length rules extend these results to include post-verb peripheral constituents other than PPs; specifically, interjections and adverbials. Currently, the length rules perform two operations on post-verb peripheral material, depending on word count. In one case, if a phrase, P1, consists of a single phonological word, it is adjoined to the most recently created phrase, P0 (usually a prosodic verb phrase). 
The result is a new phrase, P2, whose boundary salience is equal to that of P0. This is illustrated in Figure 5, where i is a number representing salience (Section 3 shows how salience indices are derived). In the second case, a longer phrase, P1, will be bundled with its preceding material, P0, in order to form a new phrase, P2, whose salience is the sum of the salience value of P0 and the value of P1 (= word count + 1). In Figure 6, for example, the 'long' PP is set off by a relatively large index. Peripheral material at the beginning of a sentence is currently picked up by left-to-right bundling without regard to constituent type or length. The correct treatment of sentence-initial peripheral material remains a topic of investigation. In some cases, phrasing is influenced by the lexical content of a constituent. Phrasal and, i.e. the left corner of a NP, PP, AdjP, or AdvP conjunct, always starts a new phrase, as in Next to it || he placed a box of matches || and a candle. 11 (Figure 5: Adjunction of Short Peripheral Phrase, i = boundary salience; example sentence: She had caught an early morning train from London.)</Paragraph> <Paragraph position="74"> In our analysis, this fact is captured by a global constraint on the prosody rules: no boundary location or salience rule may apply to a constituent whose left corner is a phrasal conjunct. Hence when phrasal and is adjacent to a verb, the prosodic left corner constraint will block the rules that form prosodic verb phrases. In 38 the verb rule is prevented from merging was extinguished with and all, as it would do if and were treated as any other function word.</Paragraph> <Paragraph position="75"> 38. The light among the trees was extinguished and all was dark.</Paragraph> <Paragraph position="76"> Although the constraint should probably include other material, such as the subordinate conjunctions (e.g. 
because, while, although), our current analysis acknowledges only phrasal and, or, and nor, requiring that they always introduce a separate phrase. In our implementation of the prosody rules, we have extended the constraint to punctuation. (Figure 6: Long Prepositional Phrase Attachment, i = boundary salience; example sentence: Everest is an enormous pyramid with three wide faces.)</Paragraph> <Paragraph position="77"> For example, the comma in After Robert ate, his cat Freddy took a nap is first adjoined to the constituent on its right, to give After Robert ate, + his cat Freddy took a nap, and then, like phrasal and, obligatorily starts a new phrase, preventing the prosody rules from merging ate and his cat.</Paragraph> <Paragraph position="78"> In this section, we have described an analysis of prosodic phrasing that incorporates two classes of rules. Boundary location rules form phonological words from the terminal elements of syntactic structure and build phonological phrases (the first tier of prosodic constituency), using information about syntactic heads. Boundary salience rules assign a relative strength, or perceptibility, to each phrase boundary according to syntactic labeling, length, and adjacency; they ignore verb phrase and clausal constituency and predicate-argument relations. The primary salience rules are, in order, syntactic constituency, which converts NP, PP, and AdjP constituents into prosodic phrases (see G&G, p. 441); verb balancing and verb adjacency, which derive prosodic verb phrases; length rules, which apply to the material on the right of a prosodic verb phrase; and final bundling, which collects phrases built up by the previous rules into a binary tree (see Figure 3). Location and salience rules are both subject to a left corner constraint on their application. 
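The left corner constraint summarized here can be sketched as a simple guard that every location or salience rule consults before it applies. This is a minimal illustration only, not the authors' implementation: the names `BLOCKING_LEFT_CORNERS` and `rule_may_apply` are hypothetical, constituents are assumed to be plain lists of tokens, and the parser is assumed to have already decided that a conjunction is phrasal.

```python
# Hypothetical sketch of the left corner constraint (names are illustrative).
# Phrasal and, or, nor come from the text; the comma reflects the extension
# to punctuation described for the implementation.
BLOCKING_LEFT_CORNERS = {"and", "or", "nor", ","}


def rule_may_apply(constituent):
    """Return False when the constituent's left corner blocks the rules.

    `constituent` is a non-empty list of tokens; its left corner is taken
    to be the first token. Assumes the conjunction was already identified
    as phrasal by the parser.
    """
    left_corner = constituent[0].lower()
    return left_corner not in BLOCKING_LEFT_CORNERS


# Example 38: the verb rule may not merge "was extinguished" with "and all".
print(rule_may_apply(["and", "all"]))
# An ordinary complement is not blocked.
print(rule_may_apply(["the", "first", "serious", "attempt"]))
```

A guard of this kind keeps the constraint global: the individual rules need no special cases for conjunctions or commas.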
In the following section, we discuss how the rules work in an experimental text-to-speech system.</Paragraph> </Section> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 AN EXPERIMENTAL TEXT-TO-SPEECH SYSTEM </SectionTitle> <Paragraph position="0"> We have built an experimental text-to-speech system that uses our analysis of prosody to generate phrase boundaries for the Olive-Liberman synthesizer (Olive and Liberman 1985). Two concerns motivated our implementation. First, we hoped the system would provide us with a research tool for testing our ideas about syntax and phrasing against a large unrestricted collection of sentences. Second, we wished to investigate how well our approach would work for determining prosodic phrasing in a text-to-speech synthesizer.</Paragraph> <Paragraph position="1"> Existing text-to-speech systems perform well on word pronunciation and short sentences, 12 but when it comes to long sentences and paragraphs, synthetic speech tends to be difficult to listen to and understand. Many investigators (e.g. Allen 1976; Elowitz et al. 1976; Luce et al. 1983; Cahn 1988) have suggested that the poor prosody of synthetic speech, in comparison with natural speech, is the primary factor leading to difficulties in the comprehension of fluent synthetic speech. 
And while researchers in text-to-speech synthesis have adopted a variety of approaches to prosodic phrase generation--from the simple punctuation-based rules and function word listings of existing commercial systems to the sophisticated prosodic heuristics described in Emorine and Martin (1988) and O'Shaughnessy (1989)--the generation of appropriate prosodic phrasing in unrestricted text has remained a problem.</Paragraph> <Paragraph position="2"> As we will show, our results so far indicate that our experimental system, which assigns a discourse-neutral prosodic phrasing on the level of sentences, provides a significant improvement in the quality of synthesized speech.</Paragraph> <Paragraph position="3"> We believe that one reason for the improvement has to do with the increased pitch range that our system uses. Text-to-speech systems that lack sentence-level phrasing must take a conservative approach to pitch settings in order to avoid misleading and inappropriate pitch modulations. Correct phrase identification makes it possible to adopt an expanded pitch range that greatly enhances the naturalness of the final speech. In constructing the system, we focused on two core questions: (i) what kind of parser is needed for the prosody rules? and (ii) how should prosodic phrasing, i.e. boundary location and strength, be represented?</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 PARSING FOR PROSODY </SectionTitle> <Paragraph position="0"> The rules for discourse-neutral phrasing that we propose need examine only a subset of the syntactic information that most parsers provide. That is, the rules require access to lexical category, syntactic heads, NP/PP/AdjP/AdvP constituency, and left-to-right word order, but not to clausal constituency, predicate-argument relations, or modifier attachment. We believe the rules must also recognize the trace of wh-movement (e.g. 
the trace that precedes the phrase break in The slope on which we were standing \[trace\] started to move), although other null terminals such as the trace of passivization are ignored. 13 At the outset of our project, we had available to us a moderate coverage deterministic parser--Fidditch 14--that we adapted to the syntactic requirements for prosodic phrasing. This modified &quot;speech parser&quot; produces parse trees like that in Figure 7. The tree represents syntactic information necessary for phrasing, but omits nodes (S, VP) and</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 PROSODIC PHRASE REPRESENTATION </SectionTitle> <Paragraph position="0"> Following G&G, we require that the prosody rules build a binary tree whose terminals are phonological words and whose node labels are indices that mark boundary salience.</Paragraph> <Paragraph position="1"> An alternative representation based on Liberman and Prince (1977) is presented in Selkirk (1984), which contends that prosody, including prosodic phrasing, is more properly represented as a grid instead of a tree. Although a grid may be more descriptively suitable for some aspects of prosody (for example, Sproat and Liberman (1987) use the grid representation for their implementation of stress assignment in compound nominals), we are not aware of any evidence for or against a grid representation of discourse-neutral phrasing.</Paragraph> <Paragraph position="2"> Figure 8 shows the phonological phrase tree that is built from the syntactic structure of Figure 7. The rules for building this tree apply from left to right, following the analysis we described in the preceding section. Figures 9-11 show the prosodic phrase derivation. Numbered nodes refer to salience values, with higher numbers indicating greater salience. The index is assigned according to phonological word count, with one point added for the node itself. 
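The index arithmetic, combined with the two length-rule cases from the preceding section, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, and only the attachment of a post-verb peripheral phrase P1 to the preceding phrase P0 is modeled. The numbers in the usage lines are the ones the text gives for the Figure 11 derivation (a top node of 11 with the two-word item for + several days, 8 had the item been a single phonological word).

```python
# Hypothetical sketch of the salience arithmetic for peripheral attachment.

def phrase_value(word_count):
    """Value of a phrase: its phonological word count plus one for the node."""
    return word_count + 1


def attach_peripheral(p0_salience, p1_word_count):
    """Salience of the new phrase P2 built from P0 and the peripheral P1.

    A one-word P1 is adjoined, so P2 keeps the salience of P0.
    A longer P1 is bundled, so P2 gets P0's salience plus the value of P1.
    """
    if p1_word_count == 1:
        return p0_salience
    return p0_salience + phrase_value(p1_word_count)


# Two-word peripheral item attached to a phrase of salience 8.
print(attach_peripheral(8, 2))  # 8 + (2 + 1)
# One-word peripheral item: the salience of P0 is kept.
print(attach_peripheral(8, 1))
```

Keeping the two cases in one function makes the asymmetry explicit: short items never raise the index, while longer ones raise it by their own weight.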
Figure 11 is the final prosodic phrase tree; in the notation we have been using, the phrasing represented by Figure 11 is He told me | last night || he was coming to London || for several days.</Paragraph> <Paragraph position="3"> Figure 9 shows the effect of two applications of verb balancing. Applying from left to right, the rule first looks at the phonological verb he + told + me. Since the material to the left of the verb is null, the rule must group this verb with the constituent on its right to form the node labeled (r). On its second application, the rule balances the prosodic phrase it has just formed against the single phonological Verb adjacency is now triggered by the contiguity of (r) and the verb he + was + coming. In Figure 10, this rule has formed the prosodic verb phrase (r).</Paragraph> <Paragraph position="4"> It remains to collect the &quot;long&quot; peripheral constituent for + several days. In Figure 11, the length rule has built the final node of the tree; because the peripheral item consists of two phonological words, the value of the top node is affected by the word count of the peripheral item. If the peripheral item had consisted of a single phonological word, the value of the top node would have been 8 instead of 11.</Paragraph> <Paragraph position="5"> Finally, each node index is converted into one of three acoustic values. High indices are marked as a minor phrase boundary; mid-range indices are signified with a downstep on the first phonological word following the boundary (Pierrehumbert 1980), and those in the lowest range receive a phrase accent. The mapping onto three different values reflects a simple subjective choice. 
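The three-way conversion can be pictured as a pair of cut-offs on the node index. The sketch below is purely illustrative: the text does not report where the ranges begin or end, so the thresholds `high` and `mid` are invented placeholder values, and the function name is hypothetical.

```python
# Hypothetical sketch of the index-to-acoustic-value mapping.
# The cut-offs are NOT from the paper; they are placeholders for illustration.

def acoustic_value(index, high=8, mid=4):
    """Map a salience index onto one of the three acoustic treatments."""
    if index >= high:
        return "minor phrase boundary"
    if index >= mid:
        return "downstep on next phonological word"
    return "phrase accent"


for i in (11, 5, 2):
    print(i, acoustic_value(i))
```

Exposing the cut-offs as parameters reflects the remark that the choice is subjective: a finer tuning would only need more thresholds or a continuous function in their place.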
A more complete analysis would consider finer acoustic &quot;tunings&quot; for the indices.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 EVALUATING THE PHRASING SYSTEM </SectionTitle> <Paragraph position="0"> In testing a prosodic phrase system, it is necessary to be clear about the goals of the work. At the level of basic inquiry, such a system should aim to correlate well with human speech production, assuming that we are able to abstract away from speaker variability. However, in working under the assumption that discourse plays a role, however limited, in prosodic phrase determination, the comparison of synthetic to human phrasing becomes less meaningful since factors unknown to our system influence the placement of phrase boundaries.</Paragraph> <Paragraph position="1"> At the level of system building, a prosodic phrasing system should aim to make a text-to-speech system easier to understand. In general, adding phrasing to a text-to-speech system will enhance synthetic speech both by breaking a run-on delivery into more easily processed chunks of speech and by allowing the pitch range to be increased, enhancing naturalness. The primary requirement for making synthetic speech easier to understand, however, is that the phrasing system should first do no harm, i.e., it should avoid misleading phrasings and phrasings like 38 that are difficult for a listener to process.</Paragraph> <Paragraph position="2"> 38. ??This is || the cat that caught || the rat that stole || the cheese.</Paragraph> <Paragraph position="3"> To address the issue of avoiding undesirable phrasings, a system like ours must be run against a set of utterances of disparate lengths and syntactic types to see where it breaks. 
To do this, we accumulated a set of syntactically varied test sentences, had them read by two speakers, marked the prosodic phrase boundaries as we had done with the Holmes and Everest data, and ran the sentences through our prosodic phrase system for comparison with the human speakers' phrasing. Phrasing in the speech of the two human speakers was almost identical; where the speakers failed to match, we depended on the speaker with the more detailed phrasing to obtain a single transcript for comparison. The sentences with the phrase markings produced by the system (S) and by the human speakers (H) are given in Appendix A.</Paragraph> <Paragraph position="4"> In addition, we used the sentences of Grosjean et al. (1979) to test our system; these were the same sentences used by G&G. They were produced by six subjects speaking each sentence at five different speaking rates. We chose these data both because they form an established, though small, corpus and because their phrase markings, derived by oscillographic tracings of pause duration, provide a check against the possibility of error in human judgments of pause location and salience. A comparison of these data with the output of our system is given in Appendix B.</Paragraph> <Paragraph position="5"> In comparing human and synthetic prosodic phrasing with an eye on synthetic phrasings that are either misleading or unprocessable, errors in the assignment of primary phrase boundaries are the most egregious. In the data in Appendix A, the human speakers produced 31 intrasentential primary phrase boundaries, of which our system matched 16 with its equivalent of a primary phrase boundary. The other primary phrase boundaries in the human productions were matched in 9 cases with a secondary phrase boundary, and in 5 cases with a tertiary boundary.</Paragraph> <Paragraph position="6"> The system thus missed 1 primary boundary, the one after books in 17. 
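The tally just given can be checked with a short calculation. The counts below are the ones reported in the text; the variable names are illustrative, and counting a secondary match as an approximate match follows the assumption stated in this section that the primary/secondary difference is minimal while a tertiary boundary is not equivalent to a primary one.

```python
# Counts reported for the Appendix A comparison of primary boundaries.
primary_total = 31
matched_as = {"primary": 16, "secondary": 9, "tertiary": 5}

# Boundaries with no corresponding system boundary at all.
missed = primary_total - sum(matched_as.values())

# Exact or approximate matches: primary or secondary in the system output.
approx_matched = matched_as["primary"] + matched_as["secondary"]
rate = approx_matched / primary_total

print(missed)                 # number of primary boundaries missed outright
print(approx_matched)         # exact plus approximate matches
print(round(100 * rate, 1))   # percentage of the 31 primary boundaries
```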
In the absence of research on the relative significance of different boundary types, we are assuming that, with respect to the comprehensibility and acceptability of synthetic speech, the difference between a primary and secondary phrase boundary is minimal. On the other hand, the tertiary boundary produced by our system is almost imperceptible and cannot be considered equivalent to the primary boundary. The system thus matched, either exactly or approximately, 80% (25 of 31) of the primary boundaries.</Paragraph> <Paragraph position="7"> In looking at the primary phrase boundaries that the system failed to duplicate, we want to be sure that we avoid difficult phrasings like that of 38 rather than match any particular phrasing exactly. In sentences 9, 12, 17, and 21, where our system produced a tertiary boundary at the location of the human speakers' primary phrase boundary, comprehensibility has not been diminished. The listener is misled only in sentence 17, where our system generated a boundary at a location different from that of the human speakers. In the system's phrasing of 17, the boundary before books corresponds to an interpretation of the utterance in which the books are written by her uncle, which, while it is a possible phrasing, presumes a specific discourse setting.</Paragraph> <Paragraph position="8"> A comparison of the secondary phrase boundaries in Appendix A shows similar correspondences. The human speakers produced 26 secondary phrase boundaries. The system matched 11 exactly and 12 with a primary or tertiary boundary. 
It thus failed to match 3 secondary boundaries, in sentences 9, 18, and 20, but only in sentence 18 is the absence of the boundary misleading.</Paragraph> <Paragraph position="9"> To sum up, there are 2 boundaries out of the 57 produced by the human speakers whose absence in the system's output is truly problematic. Evaluation of the prosodic phrase system, however, must also consider cases of overgeneration. It is significant here that in the 21 sentences of Appendix A, the system never generated a primary phrase boundary that did not correspond to a boundary in the human productions, although the system overgenerated secondary phrase boundaries at 7 locations. (We ignore the overgeneration of tertiary boundaries as perceptually insignificant.) Three of these secondary boundaries are problematic. Those in sentences 17 and 18 were discussed above as resulting from the misplacement of a boundary. The other unwanted secondary boundary occurs after was in sentence 10 and results in an utterance that, like 38, is difficult to process.</Paragraph> <Paragraph position="10"> A comparison of our system's output with G&G's productions, given in Appendix B, shows similar results. Of the 14 primary phrase boundaries in G&G's sentences, the system matched 12 exactly. There were 2 primary boundaries, however, that the system missed completely. Both of these errors were due to the presence of a sentential subject. For sentence 6, no parse was produced, but otherwise the inability of our system to generate the correct result here stems from the fact that the system discards the syntactic sentence node in the derivation of the prosodic phrasing. 
15 With respect to overgeneration of phrase boundaries, the system overgenerated seriously only once, in sentence 4, where the subject is sentential.</Paragraph> <Paragraph position="11"> We find these results encouraging; with respect to matching, there were only four significant problems in the two corpora, and overgeneration of a primary boundary occurred only once. The suitability of our system for speech applications will depend on future tests to determine whether listeners prefer &quot;prosodized speech&quot; that is imperfect, i.e., speech that will have some phrasing errors, to the relatively &quot;flat&quot; speech of systems that lack our phrasing rules. The weak link in our current system is the parser: most problems with phrasing arise from parsing errors; in particular, incorrect part of speech assignment, incorrect analysis of pre-head modifiers, and failure to recognize idiomatic or semi-idiomatic expressions. Problems with the prosody rules come mainly from phenomena that we have not adequately studied, e.g. the proper treatment of material on the left periphery of a prosodic verb phrase and the status of complements to nouns and adjectives.</Paragraph> </Section> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 CONCLUSIONS </SectionTitle> <Paragraph position="0"> We have discussed the notion of discourse neutral prosodic phrasing in English and presented an analysis that characterizes this phrasing in terms of constituency, adjacency, and length. In our analysis, the contribution of syntax to discourse-neutral phrasing consists of lexical categorization; NP, PP, and AdjP constituency; and syntactic head identification. Length is an independent phonological factor. Because they refer to both syntactic and phonological information, phrasing rules are free to generate prosodic structures that may or may not resemble syntactic structures. 
Hence, in speech, it is possible but not expected that phrase boundaries will co-occur with major syntactic boundaries. Our results suggest that, in an implemented system, the parsing requirements for speech systems are quite different from those for systems providing information retrieval, machine translation, or text generation. In particular, there seems to be no need for a parser to identify VP and S constituents, nor to specify predicate-argument relations.</Paragraph> <Paragraph position="1"> The distribution of the phrasing of clause-final PPs given in Figure 4 may indicate the extent of the relation between the discourse-neutral phrasing and the phrasing imposed by discourse. We assume that discourse phrasing may shift neutral boundaries in order to reflect, for example, emphasis, contrast, parallelism, coreference, and the particular structure of the discourse. The exact connection between the level of phrasing we describe and discourse-dependent phrasing is a question for future research. We need to know what aspects of the discourse are relevant for phrasing, as well as how the discourse information and the phrasing specifications should be related. For example, it is not clear whether the discourse-neutral phrasing represents a set of pre-determined values that are reset when necessary by discourse features or whether this is a true default situation in which the discourse-neutral phrasing is inserted when discourse phrasing is underspecified. 
Any contribution in this area will greatly enhance our understanding of the relation between the various components of the grammar: syntactic, semantic, and phonological.</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> ACKNOWLEDGMENTS </SectionTitle> <Paragraph position="0"> We are grateful to Francois Grosjean and Terry Langendoen for their comments on an earlier version of this paper, and to Jack Lacy for his invaluable assistance with the implementation and testing of the system.</Paragraph> <Paragraph position="1"> Special thanks go to two anonymous Computational Linguistics reviewers whose comments have greatly helped us improve the original manuscript.</Paragraph> <Paragraph position="2"> None of those mentioned is responsible for any shortcomings of the work described here.</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> APPENDIX A </SectionTitle> <Paragraph position="0"> Human (H) and system (S) productions of syntactically varied sentences. Primary boundaries are marked by &quot;||&quot;, secondary by &quot;|&quot;, tertiary by &quot;!&quot;.</Paragraph> <Paragraph position="1"> 1. H: The name | of the character || is not pronounced. S: The name ! of the character | is not pronounced.</Paragraph> <Paragraph position="2"> 2. H: The left-hand power unit ! on each shelf | in the forty-eight channel module || operates ! the echo cancellers. S: The left-hand power unit | on each shelf | in the forty-eight channel module || operates | the echo cancellers. 3. H: Phoneme characters || give more control || over the particular sounds || that are generated.</Paragraph> <Paragraph position="3"> S: Phoneme characters | give ! more control | over the particular sounds || that are generated.</Paragraph> <Paragraph position="4"> 4. H: The connection must be determined || for the left-hand power unit || on each shelf S: The connection ! 
must be determined I for the left-hand power unit II on each shelf 5. H: I need a man II to fix the sink.</Paragraph> <Paragraph position="5"> S: I need a man i to fix ! the sink.</Paragraph> <Paragraph position="6"> 6. H: The techniques I that we had implemented \]1 were tested I on a larger computer.</Paragraph> <Paragraph position="7"> S: The techniques I that we had implemented II were tested I on a larger ! computer.</Paragraph> <Paragraph position="8"> 7. H: Everyone lwho had participated l in the attempt I1 was considerably affected.</Paragraph> <Paragraph position="9"> S: Everyone I who had participated ! in the attempt II was considerably affected.</Paragraph> <Paragraph position="10"> 8. H: The method I by which one converts a word I into phonemes \]1 is provided I in chapter seven.</Paragraph> <Paragraph position="11"> S: The method I by which one converts ! a word i into phonemes II is provided ! in chapter seven.</Paragraph> <Paragraph position="12"> 9. |-I: In these instances II it may be desirable i to use phonemic characters 11 each time I that it appears II on the input text.</Paragraph> <Paragraph position="13"> S: In these \[ instances I it may be ! desirable 1\[ to use ! phonemic characters II each time that it appears ! on the input text.</Paragraph> <Paragraph position="14"> 168 Computational Linguistics Volume 16, Number 3, September 1990 J. Bachenko and E. Fitzpatrick Discourse-Neutral Prosodic Phrasing in English 10. H: The thrust I was now from the south It which Mallory had deemed impossible.</Paragraph> <Paragraph position="15"> S: The thrust I was I now !from the south \[I which Mallory I had deemed ! impossible.</Paragraph> <Paragraph position="16"> 11. H: The destruction I of the good name I of his father II bothered him.</Paragraph> <Paragraph position="17"> S: The destruction I of the good name ! of his father II bothered him.</Paragraph> <Paragraph position="18"> 12. 
H: Every event I of that dreadful time II is seared II into my memory.</Paragraph> <Paragraph position="19"> S: Every ! event I of that ! dreadful time \[I is seared ! into my memory.</Paragraph> <Paragraph position="20"> 13. H: Everest was discovered II during a survey of India II in 1852.</Paragraph> <Paragraph position="21"> S: Everest was discovered II during a survey ! of India \[ in 1852.</Paragraph> <Paragraph position="22"> 14. H: He told the director II to give the names I of the characters II to Ivan.</Paragraph> <Paragraph position="23"> S: He told ! the director II to give I the names ! of the characters lto Ivan.</Paragraph> <Paragraph position="24"> 15. H: It may be impossible II to give that machine II the proper workout.</Paragraph> <Paragraph position="25"> S: It may be ! impossible II to give that machine I the proper workout.</Paragraph> <Paragraph position="26"> 16. H: Eventually I he will realize II that his cigars are bothering II the other passengers.</Paragraph> <Paragraph position="27"> S: Eventually II he will realize ! that his cigars I are bothering ! the other passengers.</Paragraph> <Paragraph position="28"> 17. H: She was given \[ more difficult books II by her uncle. S: She was given ! more difficult I books by her uncle. 18. H: You could easily I break that vase II if you aren't careful.</Paragraph> <Paragraph position="29"> S: You could easily break I that ! vase II if you aren't careful.</Paragraph> <Paragraph position="30"> 19. H: The president I asked the group II what they were capable I of doing.</Paragraph> <Paragraph position="31"> S: The president I asked ! the group II what they were I capable ! of doing.</Paragraph> <Paragraph position="32"> 20. H: What book I on the subject \[\[ would you recommend I to the group? S: What book ! on the subject II would you recommend to the group? 21. H: I can usually read I a lot faster II than Roger. S: I can usually read I a lot faster ! 
than Roger.</Paragraph> </Section> </Paper>