File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/00/w00-1421_metho.xml
Size: 21,454 bytes
Last Modified: 2025-10-06 14:07:26
<?xml version="1.0" standalone="yes"?> <Paper uid="W00-1421"> <Title>Planning Word-order Dependent Focus Assignments*</Title> <Section position="4" start_page="0" end_page="157" type="metho"> <SectionTitle> 2 The interplay of focus and word order </SectionTitle> <Paragraph position="0"> The pragmatic function of the FBS is to indicate to the listener of an utterance that a certain part of that utterance has been put into the foreground. The semantic information of this foregrounded part has either been selected from a set of alternative beliefs ascribed to the listener, or it is a revision of certain beliefs (in case of contrastive focus), or the focused phrase expresses 'new' information the listener does not know or is not able to infer from his beliefs \[Halliday, 1967\]. 1 In all three cases the focus domain - the syntactic realization of a focus - contains the so-called focus exponent, i.e. the bearer of the focal accent, which we identify with the obligatory nucleus accent. The existence of this accent indicates to the listener that one part of the message conveys one of these three functions. In addition to the nucleus accent, optional prenucleus accents can exist as well, which in general do not have a discourse function.</Paragraph> <Paragraph position="1"> 1 This list of apparently diverse functions shows that there are possibly several phenomena which have been labeled as 'focus' within the last 70 years or so. There is an ongoing discussion in the linguistic community whether these three functions can be traced back to one common principle (cf. \[Schwarzschild, 1999\]).</Paragraph> <Paragraph position="2"> but a prosodic function which goes back to diverse planning phenomena.</Paragraph> <Paragraph position="3"> Three examples shall demonstrate the interplay of word order with accent placement. Example (3) is from our speech corpus of retellings of a trick film we analyzed to obtain rules for accent placement. 
The other examples are variations of (3), showing that different word order and accent placement correspond to different focus domains. For reasons of simplicity we abstract from specific pitch accents in this paper. Accent bearers are given in capitals. Furthermore, the examples do not exhibit prenucleus accents. However, our rules for accent placement account for these accents as well. The reason is that prenucleus accents are determinable if the bearer of the nucleus accent is known.</Paragraph> <Paragraph position="4"> (1) er fällt in die STEINebene runter (gloss: he falls in the stone plateau down) 'he is falling down to the stone plateau'; (2) in die STEINebene fällt er runter; (3) er fällt RUNter in die STEINebene. Semanticians pointed out that the key concept for word order and its consequences for accent placement is contextual boundedness (see, e.g., \[Jackendoff, 1972; Rooth, 1992\]). However, their method of simulating the different contexts by the questions a sentence is able to answer tells us only something about the number of possible foci. For example, sentence (1) is able to answer five possible questions, depending on which constituent provides the answer (the contexts range from Which specific plateau is he falling down to?, with focus on the compound only, to What's up?, focusing the whole sentence). Example (2), however, with the locative PP in sentence-initial position but identical accent placement, is only able to answer three questions. Hence, (2) is contextually more bounded than (1). Example (3), exhibiting an extraposed unit, clearly demonstrates the need for an FBS-related word order. Extrapositions are the linguistic means in German to separate sense units.</Paragraph> <Paragraph position="5"> The extraposition is used to mark two informational units: first, that the person is falling down, and second, that the resulting place is the stone plateau. 
Since informational units coincide with prosodic phrases, each phrase contains one nucleus accent, so that two separate focus domains exist.</Paragraph> <Paragraph position="6"> From an NLG perspective, explaining word order and accent placement by the possibility of answering context questions points in the wrong direction. Neither should we generate isolated sentences nor are we interested in focus ambiguities. Rather, we have to determine a certain word order with a twofold purpose: first, it must be able to express a planned focus, and second, it should guarantee coherence of the text.</Paragraph> <Paragraph position="7"> To our knowledge, the problem of how word order and focus domain determination interact has not been addressed in NLG research yet. The SYNPHONICS formulator \[Abb et al., 1995\], which is able to generate single German sentences with FBSs, does not take into account the interplay between word order and accent placement. Instead, word order is determined by incremental syntactic construction; situative factors have not been addressed in this system. The SPEAK! system \[Teich et al., 1997\] also does not account for the interplay of word order with accent placement. However, this system cannot be directly compared with our approach, since the coverage of phonological phenomena is completely different: we are interested in FBSs in monologues, whereas the SPEAK! system primarily accounts for the role of a dialogue history in the assignment of various intonation patterns.</Paragraph> <Paragraph position="8"> Generally, the realized word order of an utterance is the result of its embedding into the situative context, which finds expression in the use of linear precedence (LP) rules for word order determination during surface realization. The idea is that constituents are ordered with respect to preferential properties expressed by these LP-rules. 
From an NLG perspective the question is, then, where the information comes from that allows us to make use of these LP-rules. In our approach we derive the information necessary for the use of LP-rules from a discourse model that relates various aspects of a discourse to one another. Since we are generating monologues, only the utterances previously produced by the program require consideration.</Paragraph> <Paragraph position="9"> The generation of monologues with appropriate word order and focus/background structures comprises five major tasks: 1. The information to be conveyed must be selected and linearized by a content planner.</Paragraph> <Paragraph position="10"> 2. During sentence planning: (a) foci must be determined, and (b) conditions for word order realization must be given.</Paragraph> <Paragraph position="11"> 3. During surface realization: (a) the foci must be mapped onto focus domains while the sentences with their respective word order are formulated, and (b) the bearers of (pre)nucleus accents within each focus domain must be determined.</Paragraph> <Paragraph position="12"> Since this paper addresses sentence planning, we focus on tasks (2a) and (2b) only. We leave aside content planning (task 1) because the linearization problem does not affect FBS determination. The content planner provides the respective propositions, which are extended during sentence planning by pragmatic information for realizing the FBS. The result of sentence planning functions as input for a competition-based formulator. 
In order to demonstrate how the formulator is able to realize FBSs by means of grammatical competition, we will also outline the determination of focus domains, word order, and accent bearers in focus domains (tasks 3a and 3b).</Paragraph> </Section> <Section position="5" start_page="157" end_page="157" type="metho"> <SectionTitle> 3 Architecture of FOGS </SectionTitle> <Paragraph position="0"> The five tasks mentioned above are realized in our NLG system FOGS. 2 Currently the system generates brief retellings of a trick film, with each sentence having a contextually appropriate word order and focus-relevant prosody, with the context provided by the discourse model. Figure (1) shows the architecture of the system. Sentence planning takes into consideration the current state of a discourse model. When constructing the input for the formulator, the discourse model is continuously updated so that the word order of the currently planned sentence is coherent with the word order of the preceding sentence. Word-order-relevant information is encoded by discourse relational features of discourse referents. The HPSG-based formulator realizing the sentences uses weighted LP-rules for word order determination that take into account the discourse-relational features in the semantic input. Bearers of (pre)nucleus accents within focus domains are determined by a focus principle.</Paragraph> </Section> <Section position="6" start_page="157" end_page="158" type="metho"> <SectionTitle> 4 Sentence planning in FOGS </SectionTitle> <Paragraph position="0"> The planning operators creating the input for the formulator cause the transition to new states of the discourse model. The initial state of the discourse model is characterized by the lack of any information on the events to be conveyed. Correspondingly, in the goal state all events are represented.</Paragraph> <Paragraph position="1"> Our discourse model is a knowledge store consisting of two major registers. 
It consists of a Discourse Representation Structure (DRS, cf. \[Kamp and Reyle, 1993\]) (R, K) with sets of mutually known discourse referents R and DRS-conditions K, and a set Ref of referential movements assigned to the discourse referents. Referential movements determine how discourse referents are passed on from one sentence to the next one. R is a pair (RA, RN) consisting of referents of the directly preceding utterance and referents of all other previous utterances. Since referential movements are typically linked with identifiability conditions for discourse referents, the latter can be derived from the former. New referents are declared as being unidentifiable for the listener, while re-established ones should typically be identifiable by a definite description.</Paragraph> <Paragraph position="2"> 2 Not to be confused with FOG, a system that generates weather forecasts \[Goldberg et al., 1994\]. FOGS is the acronym for 'focus generation system'.</Paragraph> <Paragraph position="3"> (Figure 1: architecture of the system; recoverable labels: discourse relational features, refined plan, focus/ground.) Maintained referents are usually anaphorically identifiable. Furthermore, alternative sets Alt are determined by sortal restrictions. Discourse referents function as alternatives if they are stored in the discourse model in R and are instances of the same superordinated concept. Analogously, concepts are alternatives if they are stored in the discourse model in K and possess the same directly superordinated concept. During planning the discourse model is continuously updated.</Paragraph> <Paragraph position="4"> Updating comprises the insertion of new discourse referents into RA, shifting referents from RA to RN, and, in case of referential re-establishment, shifting referents from RN to RA. 
Furthermore, new DRS conditions are introduced into K, and the referential movement conditions are updated, resulting also in new alternative sets and identifiability conditions Id.</Paragraph> <Paragraph position="5"> We use a hierarchical planner \[Sacerdoti, 1974\].</Paragraph> <Paragraph position="6"> The content planner provides the abstract plan.</Paragraph> <Paragraph position="7"> Plan refinement during sentence planning consists of the proposition-wise introduction of operators for the discourse relational features and focus/background determination. The result of applying the operators to the single propositions functions as input to the formulator.</Paragraph> <Section position="1" start_page="158" end_page="158" type="sub_section"> <SectionTitle> 4.1 Discourse-relational features </SectionTitle> <Paragraph position="0"> Extending the propositions by discourse relational features makes intensive use of the discourse model.</Paragraph> <Paragraph position="1"> Three discourse relational factors influencing word order are realized as plan operators: topic assignment, referential movement, and identifiability of discourse referents by the listener.</Paragraph> <Paragraph position="2"> Topic assignment: Topics establish an aboutness-relation between a familiar discourse referent and the sentential predication. We adopt the conditions for topic assignment proposed in \[Klabunde and Jansche, 1998\]. Topic candidates must be identifiable discourse referents and they should be as high on a so-called topic acceptance scale as possible. According to such a scale, referents that are currently lit up constitute the best topic candidates. In our approach, these are referents from the intersection of RA and the referents of the current event proposition to be realized. 
The topic acceptance scale is mirrored in the successive application of operators for topic assignment. For example, if several referents exist as candidates, a discourse referent will be chosen that is marked as anaphorically identifiable and referentially maintained.</Paragraph> <Paragraph position="3"> Referential movement: Referential movement comprises the picking up of discourse referents from previously uttered information and the introduction of new referents, respectively. If referents from the directly preceding utterance are picked up, these referents are maintained. Referents from all other previous utterances are re-established. Referential movement influences word order because maintained referents are usually realized before re-established ones, and re-established ones precede new referents, as indicated by the following LP-rule: refMaintained &lt; refReEstablished &lt; new. Identifiability: With respect to identifiability of discourse referents, we distinguish between anaphoric identifiability, identifiability by a definite description, and referents that are non-identifiable for the listener. Identifiability influences word order as well because anaphorically identifiable referents are usually realized before definites, and those precede non-identifiable referents: anaphId &lt; definiteId &lt; nonId.</Paragraph> <Paragraph position="4"> 4.2 Focus and background determination. We already pointed out that focusing a semantic representation is based on one of three functions: the selection of beliefs from a set of alternatives, contrasting a belief with a different one, and indicating new information. These three functions have also been verified in our corpus of story tellings. Each of these functions has been treated separately in various systems (see, e.g., \[Prevost and Steedman, 1993\] for contrastive focus in a concept-to-speech system, \[Theune et al., 1997\] for new information in a data-to-speech system, and \[Blok and Eberle, 1999\] for alternative semantics in machine translation), but a single and comprehensive approach has not been proposed yet. 
However, structure and content of our discourse model allow us to determine FBSs by means of planning operators as well. Different preconditions for the focus determining operator result in the successive check whether one of these three functions is satisfied. First it is checked whether the proposition to be conveyed contains any information that is new for the listener. New information is what is not stored in the DRS K of the discourse model. 3 If these preconditions are not satisfied, it is checked whether parts of the proposition belong to alternatives presumed by the listener. Only if these preconditions fail as well is a contrasting focus realizable.</Paragraph> <Paragraph position="5"> Contrasting focus is realized if some property in K of an activated discourse referent in RA contradicts a property in the semantic input under consideration, provided the same sortal restriction holds as for the alternatives.</Paragraph> </Section> </Section> <Section position="7" start_page="158" end_page="159" type="metho"> <SectionTitle> 5 Surface realization as grammatical competition </SectionTitle> <Paragraph position="0"> The resulting input for grammatical competition is a blend of semantic and pragmatic information. For example, the input for realizing example (1) is as follows: focus: \[fallingDown(e,m), into(e,s), stonePlateau(s), definiteId(s), refReEstablishment(s)\] ground: \[man(m), anaphId(m), refMaintenance(m), topic(m)\] The constants m, s, and e are referents for a specific man, stone plateau, and the event of falling down. The values of the features focus and ground represent the focused part of the proposition and the background, respectively. 
While the realization of the focus domain is the task of converting the complete focus into one phrase, word order is determined by LP-rules that pick up the pragmatically motivated literals on topichood, identifiability, and referential movement.</Paragraph> <Paragraph position="1"> 3 This implies that we are ignoring any inferential capabilities in the current system.</Paragraph> <Paragraph position="2"> As already mentioned, the notion of grammatical competition is necessary to account for the interaction of syntactic and phonological constraints on focus/background structures. The idea of using a competition model to explain word order variations in German is not new (cf. \[Steinberger, 1994; Uszkoreit, 1987\]). The advantage of grammatical competition compared to a non-competitive use of precedence rules (as in standard HPSG) is its flexibility. A competition model allows us to take syntactic as well as semantic and pragmatic preferences into consideration, and to determine the acceptability of a sentence with respect to the situative context. The usual approach is to formulate preference rules which have a certain impact on the naturalness of constituent orders. Some of these preference rules are stronger than others. The number of preference rules which are satisfied or violated, in combination with the relative importance of the different factors, is responsible for the varying degree of naturalness of word order variations. Analogously to this idea, we use weighted LP-rules as well, which are based on the planned discourse-relational features.</Paragraph> <Paragraph position="3"> Focus domains are realized by means of a focus principle. Applying the focus principle results in the projection of a focus feature to the dominating node. Together with the standard HPSG-principles the focus principle confines the successive application of the head-complement, head-filler, and head-adjunct schemata to two lemmas in order to build up phrases and sentences. 
The focus principle constrains the placement of prenucleus and nucleus accents in view of the syntactic status of the phrasal signs. It is based on the following empirically validated regularities with respect to the placement of the nucleus and prenucleus accents: 1. in phrases with a head-daughter and adjunct-daughter the focus exponent is in the head-daughter and a prenucleus accent is in the adjunct-daughter.</Paragraph> <Paragraph position="4"> 2. for phrases with a head-daughter and complement-daughter the following holds: (a) if the head-daughter is a verbal projection, the focus exponent is in the head-daughter and a prenucleus accent is in the complement-daughter.</Paragraph> <Paragraph position="5"> (b) else the accents are in the complement-daughter. The regularities underlying the nucleus and prenucleus accent placement have been formulated on the basis of an analysis of a story telling corpus. The tellings have been analyzed w.r.t. the position of pitch accents and their indication of possible focus domains. Two results of this analysis shall be mentioned here: First, the analysis showed that the overwhelming majority of focus domain determinations can be explained by syntax-based projection rules (see, e.g., \[Günther, 1999; Ladd, 1996\] for some proposals) underlying our focus principle. Second, given the three basic pragmatic functions of FBSs, primarily information that was new to the listener has been accented. Contrastiveness was confined to focal accents on certain closed-class items such as determiners. 4 While focus domains are realized by a syntactic principle, word order is realized by means of weighted LP-rules. Since especially the LP-rule topic &lt; focus requires information on focused constituents, focus determination must be completed before word order is realized. 
We introduced the necessary LP-rules in section 4.1.</Paragraph> <Paragraph position="6"> Based on these LP-rules, word order is determined by means of the operation of domain union proposed in \[Reape, 1994\]. If the head or the daughter is a verbal projection, the domain of the phrase is obtained by domain union. Verbal projections are of interest for word order realization because only in this case are the LP-rules evaluated. Otherwise the domains are combined according to the directionality feature DIR of the head and a MOD-DIR feature of an adjunct. The former determines the order of head and complement, while the latter is responsible for the order of adjuncts and their modified element. Since in this case no LP-rules have to be evaluated, word order determination is a trivial task.</Paragraph> </Section> </Paper>