<?xml version="1.0" standalone="yes"?>
<Paper uid="J04-4001">
  <Title>Experiments using stochastic search for</Title>
  <Section position="3" start_page="0" end_page="403" type="metho">
    <SectionTitle>
+ Information Technology Research Institute, University of Brighton, Brighton BN2 4GJ, U. K. E-mail:
</SectionTitle>
    <Paragraph position="0"> Submission received: 17 October 2002; Revised submission received: 22 May 2004; Accepted for publication: 6 August 2004  Computational Linguistics Volume 30, Number 4 in particular pronouns. These are issues which the well-known centering theory (CT) of Grosz, Joshi, and Weinstein (1995; henceforth GJW) is concerned with. Previous algorithms for pronominalization such as those of McCoy and Strube (1999), Henschel, Cheng, and Poesio (2000), and Callaway and Lester (2002) have addressed the task of deciding whether to realize an entity as a pronoun on the basis of given factors such as its syntactic role and discourse history within a given text structure; what is essentially novel in our approach is that we treat referential coherence as a planning problem, on the assumption that obtaining a favorable ordering of clauses, and of arguments within clauses, is likely to increase opportunities for nonambiguous pronoun use. Centering theory provides the basis for such an integrated approach.</Paragraph>
    <Paragraph position="1"> Of course coherence of a text depends on the realization of rhetorical relations (Mann and Thompson 1987) as well as referential continuity, and the latter is to an extent a byproduct of the former, as clauses that are rhetorically related also tend to mention the same entities. However, even when a set of facts is arranged in a hierarchical RST structure, there are still many possible linear orderings with noticeable differences in referential coherence. This article concentrates on the influence of referential continuity on overall coherence and describes a method for applying CT to problems in text planning and pronominalization in order to improve the fluency and readability of generated texts. This method is applicable in principle to any system which produces hierarchically structured text plans using a theory of coherence relations, with the following additional assumptions: * There is a one-to-one correspondence between predicates and verbs, so that the options for syntactic realization can be predicted from the argument structure of predicates. Such &quot;shallow&quot; lexicalization appears to be standard in applied NLG systems (Cahill 1999).</Paragraph>
    <Paragraph position="2"> * Pronominalization is deferred until grammatical relations and word order have been determined.</Paragraph>
    <Paragraph position="3"> Our exposition will refer to an implemented document generation system, Iconoclast, which uses the technique of constraint satisfaction (van Hentenryck 1989; Power 2000; Power, Scott, and Bouayad-Agha 2003) with CT principles implemented among a set of soft constraints. The Iconoclast system allows the user to specify content and rhetorical structure through an interactive knowledge-base editor and supports fine-grained control over stylistic and layout features. The user-determined rhetorical structure is transformed into a text structure or a set of candidate text structures which respect various text formation rules encoded as hard constraints. Not all of the resulting text structures will give rise to stylistically acceptable documents, and of those which may be judged acceptable, some will be noticeably preferable to others. The text-structuring phase is followed by an evaluation of the candidate structures in which they are ranked according to a set of preferences encoded as soft constraints. Centering preferences are weighted along with other stylistic constraints to fix the preferred final ordering both of propositions in the text and of arguments within a clause.</Paragraph>
    <Paragraph position="4"> It is not our primary aim in this short article to provide an empirical assessment of the claims of CT, for which we refer the reader to the relevant papers, such as  Kibble and Power Optimizing Referential Coherence those collected in Walker, Joshi, and Prince (1998a) as well as Poesio et al. (2002) and other works cited there. We report elsewhere (Kibble and Power 2004) on two ongoing empirical studies: A paired-comparison study of judgments by naive subjects indicates that centering constraints make an appreciable difference to the acceptability of texts, and a corpus study using what we believe to be a novel technique involving perturbations provides clear evidence of preferences between the different constraints. One of the strengths of our framework is that it can be used as a research tool for the evaluation of variants of CT, as different realizations of an input sequence can be generated by varying control parameters, and one can very quickly see the results of alternative choices.</Paragraph>
    <Section position="1" start_page="403" end_page="403" type="sub_section">
      <SectionTitle>
1.1 Related Work
</SectionTitle>
      <Paragraph position="0"> Other researchers have applied CT to generation, though to our knowledge none have applied it to text planning, sentence planning, and pronominalization in the integrated way that we present in this article. This general approach is anticipated by McKeown's (1985) text-planning system, in which referential coherence is taken to be one of the factors determining fluency, though McKeown's work predates RST and centering.</Paragraph>
      <Paragraph position="1"> Mittal et al. (1998) apply what we term salience to sentence planning, with the goal of realizing the Cb as subject, though the text planner does not have a goal of attempting to maintain the same Cb. We regard Cheng's (2000) work on the interaction of centering preferences and aggregation in text planning as complementary to our enterprise.</Paragraph>
      <Paragraph position="2"> Karamanis (2001), Kibble (2001), and Beaver (2004) have argued for a ranking of the centering principles as opposed to weighting, and indeed Beaver provides a unified formulation of the centering rules and constraints as a ranked set of OT constraints.</Paragraph>
      <Paragraph position="3"> However, we believe that such a ranking stands in need of empirical justification, and Beaver's data actually provide little evidence for strict ranking as opposed to weighting of constraints (see Kibble 2003). Constraint satisfaction search was applied by Marcu (1996, 1997) to the far harder task of constructing RST trees given a set of facts and a repertoire of rhetorical relations; Mellish et al. (1998) argue that this approach may not scale up to the generation of larger texts and propose an alternative using stochastic search. We address the issue of computational complexity in section 4; however we do not face the same problems as Marcu, since the task for our text planner is to convert a given RST tree into a (possibly singleton) set of text structures rather than to build the RST tree from scratch.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="403" end_page="405" type="metho">
    <SectionTitle>
2. Centering Parameters
</SectionTitle>
    <Paragraph position="0"> We assume some familiarity with the basic concepts of CT. In this section we briefly and informally summarize the main assumptions of the theory and explain how we have interpreted and applied these assumptions: 1. For each utterance in a discourse there is said to be at most one entity that is the center of attention or center (Constraint 1). The center in an utterance $U_n$ is the most highly ranked entity realized in $U_{n-1}$ which is also realized in $U_n$ (Constraint 3). This is also referred to as the backward-looking center or Cb. (The set of entities mentioned in an utterance $U_n$ is defined by Constraint 2 as the set of forward-looking centers or Cfs.) It is not entirely clear whether Constraint 1 is to be taken as an empirical claim or as a stipulation that some entity must be designated as Cb, if necessary by constructing an indirect anaphoric link.</Paragraph>
    <Paragraph position="1"> 2. There is a preference for consecutive utterances within a discourse segment to keep the same entity as the center and for the center to be realized as the highest-ranked entity or preferred center (Cp). Kibble (1999) dubbed these principles cohesion and salience, respectively. Combinations of these preferences provide the familiar canonical set of transitions shown in Table 1, ranked in the stipulated order of preference first set out as Rule 2 by Brennan, Friedman, and Pollard (1987) and adopted by Walker, Joshi, and Prince (1998b).</Paragraph>
    <Paragraph position="2"> 3. The center is the entity which is most likely to be pronominalized: GJW's Rule 1 in its weakest form states that if any entity is referred to by a pronoun, the Cb must be. As Poesio et al. (2002) point out, CT can be viewed as a &quot;parametric&quot; theory in that key notions such as utterance and previous utterance, realization of entities, and ranking are not given precise definitions by GJW, and subsequent applied studies have had to begin by fixing particular instantiations of these notions.</Paragraph>
    <Section position="1" start_page="404" end_page="404" type="sub_section">
      <SectionTitle>
2.1 Ranking
</SectionTitle>
      <Paragraph position="0"> Since Brennan, Friedman, and Pollard (1987), a ranking in terms of grammatical roles (or obliqueness) has become standard; for example: subject &gt; direct object &gt; indirect object &gt; others.</Paragraph>
      <Paragraph position="1"> We have simplified matters somewhat for the purposes of this implementation. First, we assume that syntactic realization serves only to distinguish the Cp from all other referents, which are ranked on the same level: Thus effectively subject &gt; others. Secondly, we assume that the system already knows, from the argument structure of the proposition, which entities can occur in subject position: Thus in realizing a proposition ban(fda, elixir), both arguments are potential Cps because active and passive realizations are both allowed; for contain(elixir, gestodene), only elixir is a potential Cp because we disallow Gestodene is contained by Elixir.</Paragraph>
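      <Paragraph position="2"><![CDATA[ The second simplification can be illustrated with a minimal Python sketch. The table ALLOWS_PASSIVE and the function potential_cps are hypothetical names introduced here for illustration and are not taken from Iconoclast; the sketch assumes two-place predicates and a lexicon that records only whether a passive realization is allowed.

```python
# Hypothetical lexicon: does each predicate allow a passive realization?
# (Illustrative assumption; only 'ban' and 'contain' come from the text.)
ALLOWS_PASSIVE = {"ban": True, "contain": False}


def potential_cps(predicate, args):
    """Return the arguments that could be realized as subject (the Cp).

    Assumes two-place predicates whose first argument is the active
    subject and whose second argument is the passive subject.
    """
    first, second = args
    candidates = [first]  # active voice is always available
    if ALLOWS_PASSIVE.get(predicate, False):
        candidates.append(second)  # passive promotes the second argument
    return candidates


print(potential_cps("ban", ("fda", "elixir")))            # ['fda', 'elixir']
print(potential_cps("contain", ("elixir", "gestodene")))  # ['elixir']
```

Under these assumptions both arguments of ban are potential Cps, while contain offers only elixir, matching the two cases described above. ]]></Paragraph>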
    </Section>
    <Section position="2" start_page="404" end_page="405" type="sub_section">
      <SectionTitle>
2.2 Realization
</SectionTitle>
      <Paragraph position="0"> GJW's original formulation distinguished between &quot;direct&quot; realization, or coreference, and &quot;indirect&quot; realization, which corresponds to bridging reference. As an example, in (1a) the terms cold sores and viral skin disorders are not strictly coreferential and so do not count as direct realizations of the same entity, but if we allow indirect realization, then there is the potential for one of these to be identified as Cb, in a sequence such as Elixir is used to treat cold sores. Viral skin disorders are relieved by aliprosan. Again, we keep things simple at this stage by treating nominal expressions as realizations of the same entity only if they strictly corefer. As Poesio et al. (2002) observe, under this interpretation of realization, a number of utterances will lack an identifiable Cb, so we have to allow for a &quot;no-Cb&quot; transition in addition to the canonical transitions listed in Table 1.</Paragraph>
    </Section>
    <Section position="3" start_page="405" end_page="405" type="sub_section">
      <SectionTitle>
2.3 Utterance and Previous Utterance
</SectionTitle>
      <Paragraph position="0"> Two different approaches to the realization of &quot;utterance&quot; have become associated with the work of Kameyama (1998) and Suri, McCoy, and DeCristoforo (1999). To simplify somewhat: Kameyama argued that the local focus is updated in a linear manner by tensed clauses rather than by sentences, while Suri, McCoy, and DeCristoforo present evidence that the subject of the main clause in a complex sentence is likely to be the preferred antecedent for a subject pronoun in an immediately following sentence, winning out over candidates in an intervening subordinate clause, as in example (2):</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="405" end_page="407" type="metho">
    <SectionTitle>
2. Dodge
</SectionTitle>
    <Paragraph position="0"/>
    <Paragraph position="2"> wasn't cooperating.</Paragraph>
    <Paragraph position="3"> Then he$_j$ took all the money and ran / #he$_i$ started screaming for help. In fact we would argue that Suri, McCoy, and DeCristoforo's analysis does not establish whether the accessibility effects are due to the syntactic or the rhetorical structure of utterances. The examples they present all involve sentences of the form Sx because Sy corresponding to the rhetorical pattern nucleus-connective-satellite. Their results are therefore consistent with the hypothesis that the nucleus of a preceding segment is more accessible than the satellite. We allow the user of our system to choose between two strategies: a linear, Kameyama-style approach or a hierarchical approach in which the utterance is effectively identified with a rhetorical span. Our approach is more general than that of Suri, McCoy, and DeCristoforo as it covers cases in which the components of a complex rhetorical span are realized in different sentences. Veins theory (Cristea, Ide, and Romary 1998) provides a possible formalization of the intuition that some earlier propositions become inaccessible as a rhetorical boundary is crossed. The theory could be applied to centering in various ways; we have implemented perhaps the simplest approach, in which centering transitions are assessed in relation to the nearest accessible predecessor. In many cases the linear and hierarchical definitions give the same result, but sometimes they diverge, as in the following schematic example: 3. ban(fda, elixir) since contain(elixir, gestodene).</Paragraph>
    <Paragraph position="4">  However, approve(fda, elixirplus).</Paragraph>
    <Paragraph position="5"> Following Veins Theory, the predecessor of approve(fda, elixirplus) is ban(fda, elixir); its linear predecessor contain(elixir, gestodene) (an embedded satellite) is inaccessible. This makes a considerable difference: Under a hierarchical approach, fda can be the Cb of the final proposition; under a linear approach, this proposition has no Cb.</Paragraph>
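    <Paragraph position="6"><![CDATA[ The difference between the two strategies in example 3 can be reproduced with a short Python sketch. This is an illustration under stated assumptions rather than the implemented method: the names TEXT, predecessor, and possible_cbs are hypothetical, and a single boolean flag marking embedded satellites stands in for the full accessibility computation of Veins Theory.

```python
# Each proposition: (predicate, arguments, is_embedded_satellite).
# The boolean flag is a simplifying assumption standing in for the
# accessibility computation of Veins Theory.
TEXT = [
    ("ban", ("fda", "elixir"), False),
    ("contain", ("elixir", "gestodene"), True),
    ("approve", ("fda", "elixirplus"), False),
]


def predecessor(index, strategy):
    """Return the index of the predecessor utterance, or None."""
    for i in range(index - 1, -1, -1):
        # Linear: take the immediately preceding proposition.
        # Hierarchical: skip propositions flagged as embedded satellites.
        if strategy == "linear" or not TEXT[i][2]:
            return i
    return None


def possible_cbs(index, strategy):
    """Entities shared with the predecessor, i.e. candidate Cbs."""
    prev = predecessor(index, strategy)
    if prev is None:
        return set()
    return set(TEXT[index][1]) & set(TEXT[prev][1])


print(possible_cbs(2, "hierarchical"))  # {'fda'}
print(possible_cbs(2, "linear"))        # set()
```

With the hierarchical setting the final proposition shares fda with ban(fda, elixir); with the linear setting it shares nothing with contain(elixir, gestodene) and so has no Cb. ]]></Paragraph>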
    <Section position="1" start_page="405" end_page="406" type="sub_section">
      <SectionTitle>
2.4 Transitions versus Constraints
</SectionTitle>
      <Paragraph position="0"> Kibble (1999, 2001) argued for a decomposition of the canonical transition types into the principles of cohesion and salience, partly on the architectural grounds that this makes it easier to apply CT to the generation task, and partly on the empirical grounds that the preference ordering assumed by GJW is not strongly supported by corpus evidence and that transitions are better seen as epiphenomenal, emerging in a partial ordering from the interaction of more fundamental constraints. We follow this general approach, including among the constraints the principle of continuity: Each utterance should have at least one referent in common with the preceding utterance, which is effectively a restatement of GJW's Constraint 1. If we assign a weight of 1 each to cohesion and salience and 2 to continuity, we obtain a partial ordering over the canonical transitions as follows: 0:Continue &gt; 1:{Retain | Smooth Shift} &gt; 2:{Rough Shift | No Cb}. Any relative weighting or ranking of cohesion over salience would need to be motivated by evidence that Retain is preferred over Smooth Shift, and we are not aware of any conclusive evidence of this in the literature (see Kibble [1999] for further discussion). This approach also means that Strube and Hahn's (1999) principle of cheapness can be naturally incorporated as an additional constraint: This is a requirement that $Cp(U_{n-1}) = Cb(U_n)$.</Paragraph>
      <Paragraph position="2"> The principle of cheapness effectively cashes out the informal definition of the Cp as &quot;represent[ing] a prediction about the Cb of the following utterance&quot; (Walker, Joshi, and Prince, 1998b, page 3). In classic variants of centering theory, this happens only indirectly as a result of transition preferences, and only following a Continue or Smooth Shift, since the Cp is also the Cb and Rule 2 predicts that the preferred transition will maintain the same Cb. However, the prediction is not entailed by the theory following a Retain, Rough Shift, or no-Cb transition or indeed for the first sentence in a discourse, when there is effectively no prediction concerning the Cp. Strube and Hahn claim that the cheapness principle is motivated by the existence of Retain-Shift patterns, which are evidently a common means of introducing a new topic (see also Brennan, Friedman, and Pollard 1987 [henceforth BFP]). To summarize, our system incorporates the following constraints:  The original version of GJW's Rule 2 specified that sequences of Continue transitions are preferred over sequences of Retains, and so on; in BFP's implementation, however, transitions are evaluated incrementally and the preference applies to individual transitions such as Continue versus Retain rather than to sequences. Strube and Hahn (1999) take an intermediate position: In their formulation, pairs of transitions</Paragraph>
      <Paragraph position="4"> ). Strube and Hahn intended the preference for cheap transition pairs to replace GJW's Rule 2 in toto, which seems a rather weak requirement. On the other hand the original GJW formulation is difficult to verify, since as Poesio et al. (2002, page 66) found, sequences of multiple occurrences of the same transition type turn out to be relatively rare. Our position is a little more complex, as we do not directly aim to generate particular transitions or sequences of transitions but to minimize violations of the constraints continuity, cohesion, salience, and cheapness. Violations are computed on individual nodes and summed for each candidate text structure, so we may expect that the candidate with the fewest violations will have a preponderance of the preferred transitions. The system is certainly more slanted toward global optimization than BFP's incremental model but may be said to achieve this in a more natural way than a strategy of trying to produce uniform sequences of transitions.</Paragraph>
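      <Paragraph position="5"><![CDATA[ The weighting scheme described in this section (1 each for cohesion and salience, 2 for continuity) can be sketched in Python as follows. This is a minimal illustration, not the Iconoclast implementation: the function names are hypothetical, cheapness is left out because no weight for it is given here, and the sketch assumes that an utterance without a Cb is charged for continuity only, which is what yields the cost of 2 stated for No Cb.

```python
# Weights from the text: 1 each for cohesion and salience, 2 for continuity.
WEIGHTS = {"continuity": 2, "cohesion": 1, "salience": 1}


def violations(cb_prev, cb, cp):
    """List the constraints violated by one utterance.

    cb_prev: Cb of the previous utterance; cb, cp: Cb and Cp of this one.
    Assumption of this sketch: when there is no Cb, only continuity is
    charged, which reproduces the cost of 2 given for No Cb.
    """
    if cb is None:
        return ["continuity"]
    violated = []
    if cb != cb_prev:
        violated.append("cohesion")
    if cb != cp:
        violated.append("salience")
    return violated


def cost(cb_prev, cb, cp):
    """Weighted sum of the violations for one utterance."""
    return sum(WEIGHTS[name] for name in violations(cb_prev, cb, cp))


def total_cost(utterances):
    """Sum costs over (Cb, Cp) pairs of a candidate text structure.

    The first utterance has no predecessor and is not charged.
    """
    total = 0
    for (cb_prev, _), (cb, cp) in zip(utterances, utterances[1:]):
        total += cost(cb_prev, cb, cp)
    return total


examples = {
    "Continue": ("a", "a", "a"),
    "Retain": ("a", "a", "b"),
    "Smooth Shift": ("a", "b", "b"),
    "Rough Shift": ("a", "b", "c"),
    "No Cb": ("a", None, "b"),
}
for name, triple in examples.items():
    print(name, cost(*triple))
```

Under these assumptions the printed costs are 0 for Continue, 1 for Retain and Smooth Shift, and 2 for Rough Shift and No Cb, so the transition ordering falls out of the constraint weights rather than being stipulated. ]]></Paragraph>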
    </Section>
    <Section position="2" start_page="406" end_page="407" type="sub_section">
      <SectionTitle>
2.6 Pronominalization
</SectionTitle>
      <Paragraph position="0"> GJW's Rule 1 is rather weak as a guide to pronominalization decisions in general, as it only mentions the Cb and gives little guidance on when or whether to pronominalize non-Cbs. An important consideration for NLG is to minimize the possibility of ambiguity, and so we adopt a cautious strategy: The user can choose between invariably pronominalizing the Cb or using a fairly simple algorithm based on parallelism of grammatical roles. A possible future development is to supplement our CT-based text planner with a more sophisticated pronominalization algorithm as proposed by Henschel, Cheng, and Poesio (2000) or Callaway and Lester (2002).</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="407" end_page="409" type="metho">
    <SectionTitle>
3. Generation Issues
</SectionTitle>
    <Paragraph position="0"> CT has developed primarily in the context of natural language interpretation, focussing on anaphora resolution (see, e.g., Brennan, Friedman, and Pollard 1987). As stated above, the novel contribution of this article is an integrated treatment of pronominalization and planning, aiming to determine whether the principles underlying the constraints and rules of the theory can be &quot;turned round&quot; and used as planning operators for generating coherent text. We have assumed some familiarity in the foregoing with terms such as text planning and sentence planning. These are among the distinct tasks identified in Reiter's &quot;consensus architecture&quot; for natural language generation (Reiter 1994): * Text planning/content determination: deciding the content of a message and organizing the component propositions into a text structure (typically a tree) * Sentence planning: aggregating propositions into clausal units and choosing lexical items corresponding to concepts in the knowledge base; this is the level at which the order of arguments and choice of referring expressions will be determined * Linguistic realization: surface details such as agreement and orthography. Reiter observed that these functions can often be identified with discrete modules in applied NLG systems and that a de facto standard had emerged in which these modules are organized in a pipeline such that data flows only in one direction and only between consecutive modules.</Paragraph>
    <Paragraph position="1"> Breaking down the generation task in this way makes it evident that there are various ways the distinct principles of CT can be incorporated. Continuity and cohesion naturally come under text planning: respectively, ordering a sequence of utterances to ensure that each has a backward-looking center and maintaining the same entity as the center within constraints on ordering determined by discourse relations. Salience and cheapness, on the other hand, would come under sentence planning, since in each case a particular entity is to be realized as subject. However, we encounter an apparent paradox in that identifying the center itself depends on grammatical salience as determined by the sentence planner: for example, choice of active or passive voice.</Paragraph>
    <Paragraph position="2"> Consequently, the text planner appears to rely on decisions made at the sentence-planning level, which is incompatible with the fact that &quot;pipelined systems cannot perform general search over a decision space which includes decisions made in more than one module&quot; (Reiter 2000, page 252).</Paragraph>
    <Paragraph position="3"> We can envisage three possibilities for incorporating CT into a generation architecture: 1. &quot;Incremental&quot; sentence-by-sentence generation, in which the syntactic structure of $U_n$ is determined before the semantic content of $U_{n+1}$ is planned. That is, the text planner would plan the content of $U_{n+1}$ by aiming to realize a proposition in the knowledge base which mentions an entity which is salient in $U_n$</Paragraph>
    <Paragraph position="5"> Figure 1: Rhetorical structure.</Paragraph>
    <Paragraph position="6"> of any system which performs all stages of generation in a sentence-by-sentence way, and in any case this type of architecture would not allow for global planning over multisentence sequences, which we take to be essential for a faithful implementation  of centering.</Paragraph>
    <Paragraph position="7"> 2. A pipelined system in which the &quot;topic&quot; or &quot;theme&quot; of a sentence is designated independently as part of the semantic input and centering rules reflect the information structure of a discourse. Prince (1999) notes that definitions of topic in the literature do not provide objective tests for topichood and proposes that the topic should be identified with the center of attention as defined by CT; however, what would be needed here would be a more fundamental definition that would account for a particular entity's being chosen to be the center of attention in the first place.</Paragraph>
    <Paragraph position="8"> 3. The solution we adopt is to treat the task of identifying Cbs and Cps as an optimization problem. We assume that certain options for syntactic realization can be predicted on the basis of the argument structure of predicates, which means that centering constructs can be calculated as part of text planning before syntactic realization takes place, so that the paradox noted above is resolved. Pronominalization decisions are deferred until a point at which grammatical relations and word order have been fixed.</Paragraph>
    <Paragraph position="9"> 4. Generation as Constraint Satisfaction In this section we give an overview of our text-planning component in order to set the implementation of CT in context. The methodology is more fully described by Power, Scott, and Bouayad-Agha (2003).</Paragraph>
    <Paragraph position="10"> The text planner was developed within Iconoclast, a project that investigated applications of constraint-based reasoning in natural language generation using as subject matter the domain of medical information leaflets. Following Scott and de Souza (1990), we represent rhetorical structure by graphs like Figure 1, in which nonterminal nodes represent RST relations, terminal nodes represent propositions, and linear order is unspecified. The task of the text planner is to realize the rhetorical structure as a text structure in which propositions are ordered, assigned to textual units (e.g., sentences, paragraphs, vertical lists), and linked where appropriate by discourse connectives (e.g., since, however). The boundary between text and sentence planning is drawn at the realization of elementary propositions rather than at the generation of individual sentences. If a rhetorical subtree is realized as a complex sentence, the effect is that &quot;text planning&quot; trespasses into the higher-level syntax of the sentence, leaving only the elementary propositions to be realized by &quot;sentence planning.&quot; Even for a simple rhetorical input like Figure 1, many reasonable text structures can be generated. Since there are two nucleus-satellite relations, the elementary propositions can be ordered in four ways. Several discourse connectives can be employed to realize each rhetorical relation (e.g., concession can be realized by although, but, and however). At one extreme, the text can be spread out over several paragraphs, while at the other extreme, it can be squeezed into a single sentence. With fairly restrictive constraint settings, the system generates 24 text structure patterns for Figure 1, including the following (shown schematically): A. Since contain(elixir, gestodene), ban(fda, elixir).</Paragraph>
    <Paragraph position="11"> However, approve(fda, elixirplus).</Paragraph>
    <Paragraph position="12"> B. approve(fda, elixirplus), although since contain(elixir, gestodene), ban(fda, elixir).</Paragraph>
    <Paragraph position="13"> The final output texts will depend on how the propositions are realized syntactically; among other things, this will depend on centering choices within each proposition. In outline, the procedure that we propose is as follows:  1. Enumerate all text structures that are acceptable realizations of the rhetorical structure.</Paragraph>
    <Paragraph position="14"> 2. For each text structure, enumerate all permissible choices for the Cb and Cp of each proposition.</Paragraph>
    <Paragraph position="15"> 3. Evaluate the solutions, taking account of referential coherence among  other considerations, and choose the best.</Paragraph>
    <Paragraph position="16"> For the example in Figure 1, centers can be assigned in four ways for each text structure pattern, making a total of 96 solutions.</Paragraph>
    <Paragraph position="17"> As will probably be obvious, such a procedure could not be applied for rhetorical structures with many propositions. For examples of this kind, based on the relations cause and concession (each of which can be marked by several different connectives), we find that the total number of text structures is approximately $5^{N-1}$ for N propositions.</Paragraph>
    <Paragraph position="18"> Hence with N = 5, we would expect around 600 text structures; with perhaps five to ten ways of assigning centers to each text structure, the total number of solutions would approximate to 5,000. Global optimization of the solution therefore becomes impracticable for texts longer than about five propositions; we address this problem by a technique of partial optimization in which a high-level planner fixes the large-scale structure of the text, thus defining a set of local planning problems, each small enough to be tackled by the methods described here.</Paragraph>
    <Paragraph position="19"> Stage 1 of the planning procedure is described in more detail by Power, Scott, and Bouayad-Agha (2003). A brief summary follows, after which we focus on stages 2 and 3, in which the text planner enumerates the possible assignments of centers and evaluates which is the best.</Paragraph>
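    <Paragraph position="20"><![CDATA[ The growth estimate given earlier in this section (roughly 5 to the power N-1 text structures for N propositions, multiplied by the number of center assignments per structure) can be checked with a short calculation. The function name estimated_solutions is hypothetical, and the value 8 is an illustrative choice within the stated range of five to ten center assignments.

```python
def estimated_solutions(n_props, center_assignments_per_structure):
    """Rough size of the search space, following the estimate in the text.

    Assumes about 5 ** (N - 1) text structures for N propositions.
    """
    text_structures = 5 ** (n_props - 1)
    return text_structures, text_structures * center_assignments_per_structure


print(estimated_solutions(5, 8))  # (625, 5000)
```

For N = 5 this gives 625 text structures and 5,000 solutions, in line with the figures of around 600 and approximately 5,000 quoted above; each additional proposition multiplies the count by about five, which is why global optimization is restricted to small local planning problems. ]]></Paragraph>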
  </Section>
  <Section position="7" start_page="409" end_page="413" type="metho">
    <SectionTitle>
3 See Power, Scott, and Bouayad-Agha (2003) for detailed motivation of this concept of text structure as a
</SectionTitle>
    <Paragraph position="0"> level of representation distinct from both rhetorical structure and syntactic structure.</Paragraph>
    <Section position="1" start_page="410" end_page="410" type="sub_section">
      <SectionTitle>
4.1 Generating and Evaluating Text Structures
</SectionTitle>
      <Paragraph position="0"> A text structure is defined in Iconoclast as an ordered tree in which each node has a feature named text-level. Values of text-level are represented by integers in the range $0 \ldots L_{max}$; these may be interpreted in various ways, but we will assume here  levels, so that sections are composed of paragraphs, paragraphs of text sentences, and so forth. An example of an ill-formed structure would be one in which a text sentence contained a paragraph; such a structure can occur only when the paragraph is indented--a possibility we are excluding here. As well as being a well-formed text structure (TS), a candidate solution must realize a rhetorical structure (RS) &quot;correctly,&quot; in a sense that we need to make precise. Roughly, a correct solution should satisfy three conditions: 1. The terminal nodes of the TS should express all the elementary propositions in the RS; they may also contain discourse connectives expressing rhetorical relations in the RS, although for some relations discourse connectives are optional.</Paragraph>
      <Paragraph position="1"> 2. The TS must respect rules of syntax when it combines propositions and discourse connectives within a text clause; for instance, a conjunction such as but linking two text phrases must be coordinated with the second one.</Paragraph>
      <Paragraph position="2"> 3. The TS must be structurally compatible with the RS.</Paragraph>
      <Paragraph position="3">  The first two conditions are straightforward, but what is meant by &quot;structural compatibility&quot;? We suggest the crucial criterion for such compatibility should be as follows: Any grouping of the elementary propositions in the TS must also occur in the RS. In other words, the text structurer is allowed to eliminate groupings, but not to add any. More formally:  ) can be realized by a paragraph of three sentences, one for each proposition, even though this TS contains no node dominating the propositions p</Paragraph>
      <Paragraph position="5"> that are grouped by R  . However, when this happens, the propositions grouped together in the RS must remain consecutive in the TS; solutions in which p  comes in between p  
      Table 2 (constraint | type | description):
      parent, they must have different values of order.
      Connective | Hard | Governs choice of discourse connective.
      Rhetorical grouping | Soft | Failure to express a rhetorical grouping can be treated as a defect.</Paragraph>
      <Paragraph position="6"> Oversimple paragraph | Soft | A paragraph containing only one text sentence can be treated as a defect.</Paragraph>
      <Paragraph position="7"> Centering | Soft | Constraints derived from centering theory.
      Our procedure for generating candidate solutions is based on a technique for formulating text structuring as a constraint satisfaction problem (CSP) (van Hentenryck, 1989), using the Eclipse logic programming environment.</Paragraph>
      <Paragraph position="9"> In general, a CSP is characterized by the following elements: * a set of variables V_1...V_n * for each variable V_i, a finite domain D_i of possible values * a set of constraints on the values of the variables (for integer domains these often use &quot;greater than&quot; and &quot;less than&quot;; other domains usually rely on &quot;equal&quot; or &quot;unequal&quot;). A solution assigns to each variable V_i one of the values in its domain D_i while respecting all constraints.</Paragraph>
      <Paragraph position="11"> For instance, each node of the rhetorical structure is annotated with a text-level variable with the domain 0...L_max and an order variable with the domain 1...N, where N is the number of sisters. Depending on the constraints, there may be multiple solutions, or there may be no solution at all. We distinguish between hard constraints, which are applied during the enumeration phase, determining which candidate structures will be considered, and soft constraints, which apply during an evaluation phase in which the enumerated solutions are ordered from best to worst. Some examples of hard and soft constraints are shown in Table 2.</Paragraph>
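      <Paragraph> The enumeration phase can be sketched as follows. This is a brute-force illustration in Python, not the constraint logic programming used in the system described here; the function solve, the variable names, and the three-sister example are assumptions. It shows only the hard constraint that sister nodes must take different values of order.

```python
from itertools import product


def solve(domains, constraints):
    """Enumerate every assignment that satisfies all hard constraints.

    domains: dict mapping a variable name to an iterable of possible values.
    constraints: list of predicates over a complete assignment (a dict).
    """
    names = list(domains)
    solutions = []
    for values in product(*(domains[name] for name in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints):
            solutions.append(assignment)
    return solutions


# Three sister nodes, each with an order variable in the domain 1...N (N = 3).
sisters = ["n1", "n2", "n3"]
domains = {f"order_{s}": range(1, len(sisters) + 1) for s in sisters}


def distinct_order(assignment):
    """Hard constraint: sisters must have different values of order."""
    values = list(assignment.values())
    return len(set(values)) == len(values)


solutions = solve(domains, [distinct_order])
```

With three sisters the 27 raw assignments reduce to the 6 permutations. Soft constraints would then be applied to this list to order the surviving candidates from best to worst.</Paragraph>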
    </Section>
    <Section position="2" start_page="410" end_page="412" type="sub_section">
      <SectionTitle>
4.2 Choosing Centers
</SectionTitle>
      <Paragraph position="0"> Given a text structure, we enumerate all permissible centering assignments as follows:  approve(fda, elixir-plus)[fda fda, elixir-plus] 3. Compute all combinations from SCb and SCp that respect the fundamental centering constraint that Cb(U_n) should be the most salient candidate in U_{n-1}.</Paragraph>
      <Paragraph position="1"> As stated earlier, two criteria for determining the predecessor have been implemented; the user can select one or the other criterion, thus using the NLG system to test different approaches. Following a linear criterion, the predecessor is simply the proposition that precedes the current proposition in the text, regardless of structural considerations. Following a hierarchical criterion, the predecessor is the most accessible previous proposition, in the sense defined by Veins Theory (Cristea, Ide, and Romary, 1998). For now we assume the criterion is linear.</Paragraph>
      <Paragraph position="2"> SCb(U_n) (the potential Cbs of proposition U_n) is given by the intersection between Cf(U_n) and Cf(U_{n-1}), that is, all the referents they have in common.</Paragraph>
      <Paragraph position="4"> The potential Cps are those referents in the current proposition that can be realized as most salient. In principle this should depend on the linguistic resources available to the generator; in practice the system uses a simpler rule based on argument types within the proposition.</Paragraph>
      <Paragraph position="5"> Table 3 shows the potential Cbs and Cps for the proposition sequence in solution A presented at the beginning of this section. As stated earlier, our treatment of salience here simplifies in two ways: We assume that syntactic realization serves only to distinguish the Cp from all other referents and that the system already knows, from the argument structure of the proposition, which entities can occur in subject position. With these simplifications, the enumeration of centering assignments is straightforward; in the above example, four combinations are possible, since there are two choices each for</Paragraph>
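      <Paragraph> The enumeration of centering assignments under the linear criterion can be sketched as below. The sketch is illustrative only: the predicate names contain and ban, the SCp lists for the first two propositions, and all function names are assumptions (only approve(fda, elixir-plus) with its candidate lists survives in the text above). The fundamental constraint is rendered under the stated simplification that realization distinguishes only the Cp from all other referents.

```python
from itertools import product

# Each proposition: (label, referents Cf, potential Cps SCp).
# Labels and SCp lists for the first two entries are assumed for illustration.
propositions = [
    ("contain(elixir, gestodene)", ["elixir", "gestodene"], ["elixir"]),
    ("ban(fda, elixir)", ["fda", "elixir"], ["fda", "elixir"]),
    ("approve(fda, elixir-plus)", ["fda", "elixir-plus"], ["fda", "elixir-plus"]),
]


def potential_cbs(current_cf, previous_cf):
    """SCb(U_n): referents shared with the linear predecessor."""
    return [r for r in current_cf if r in previous_cf]


def permitted_cbs(scb, previous_cp):
    """Cbs allowed by the constraint that Cb(U_n) is the most salient
    candidate in U_{n-1}: the predecessor's Cp if shared, else any shared
    referent (all non-Cp referents are treated as equally salient)."""
    if not scb:
        return [None]            # no Cb at all
    if previous_cp in scb:
        return [previous_cp]
    return scb


def centering_assignments(props):
    """Enumerate every permitted sequence of (Cb, Cp) pairs."""
    results = []
    for cps in product(*(scp for _, _, scp in props)):
        partial = [[]]
        for n, cp in enumerate(cps):
            if n == 0:
                cbs = [None]     # the first proposition cannot have a Cb
            else:
                scb = potential_cbs(props[n][1], props[n - 1][1])
                cbs = permitted_cbs(scb, cps[n - 1])
            partial = [seq + [(cb, cp)] for seq in partial for cb in cbs]
        results.extend(partial)
    return results


assignments = centering_assignments(propositions)
```

For this sequence the sketch yields four assignments, matching the count given in the text: the Cb of each later proposition is forced, and the Cp of the second and third propositions can each be chosen in two ways.</Paragraph>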
    </Section>
    <Section position="3" start_page="412" end_page="413" type="sub_section">
      <SectionTitle>
4.3 Evaluating Solutions
</SectionTitle>
      <Paragraph position="0"> The system evaluates candidate solutions by applying a battery of tests to each node of the text plan. Each test identifies whether the node suffers from a particular defect. For instance, one stylistic defect (at least for the rhetorical relations occurring in Figure 1) is that of placing nucleus before satellite; in general, the text reads better if important material is placed at the end. For each type of defect, we specify a weight indicating its importance: In evaluating continuity of reference, for example, the defect &quot;no Cb&quot; is regarded as more significant than other defects. Other violations are recorded only in the case in which a Cb is present, so if all violations were weighted equally, this could result in a &quot;no-Cb&quot; transition being treated as less serious than an &quot;expensive&quot; Smooth Shift, for example (violating cheapness and cohesion). Summing the weighted costs for all defects, we obtain a total cost for the solution; our aim is to find the solution with the lowest total cost.</Paragraph>
      <Paragraph position="1"> Regarding centering, the tests currently applied are as follows:  ). This defect is assessed only on propositions that have a backward-looking center.</Paragraph>
      <Paragraph position="2"> Continuity violation: This defect is recorded for any proposition with no Cb, except the first proposition in the sequence (which by definition cannot have a Cb).</Paragraph>
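      <Paragraph> The centering tests can be sketched as a function over a sequence of (Cb, Cp) pairs. Only the continuity test is fully legible in the text above; the salience, cohesion, and cheapness conditions used here are the usual centering-theory formulations and are assumptions of this sketch, as are the function name and the short labels.

```python
def violations(sequence):
    """Return, for each proposition, the list of centering defects it incurs.

    sequence: list of (Cb, Cp) pairs, with Cb = None when there is no Cb.
    Assumed conditions (not recoverable from the text above):
      sal  - Cb(U_n) differs from Cp(U_n)
      coh  - Cb(U_n) differs from Cb(U_{n-1}), when the latter exists
      ch   - Cb(U_n) differs from Cp(U_{n-1})
      cont - U_n has no Cb (not assessed on the first proposition)
    """
    report = []
    for n, (cb, cp) in enumerate(sequence):
        defects = []
        if n > 0:
            prev_cb, prev_cp = sequence[n - 1]
            if cb is None:
                defects.append("cont")
            else:
                # The remaining tests apply only when a Cb is present.
                if cb != cp:
                    defects.append("sal")
                if prev_cb is not None and cb != prev_cb:
                    defects.append("coh")
                if cb != prev_cp:
                    defects.append("ch")
        report.append(defects)
    return report


# Two illustrative sequences for a three-proposition text.
smooth = [(None, "elixir"), ("elixir", "elixir"), ("fda", "fda")]
broken = [(None, "fda"), (None, "elixir"), ("elixir", "fda")]
```

Under these assumed conditions, the first sequence incurs cohesion and cheapness defects on its last proposition only, while the second incurs a continuity defect on its second proposition and a salience defect on its third.</Paragraph>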
      <Paragraph position="3"> Relative weightings for these defects can be chosen by the user; for the current examples we have chosen a neutral scheme with a weight of 3 for continuity violations and 1 each for the others, so that a no-Cb transition is ranked as equally bad as an &quot;expensive&quot; Rough Shift.</Paragraph>
      <Paragraph position="4">  Applied to the four solutions to text structures A and B presented in this section, these definitions yield costs shown in Table 4. According to our metric, solutions A1 and A2 should be preferred because they incur less cost than any others, with B3 and B4 the least preferred.</Paragraph>
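      <Paragraph> The weighted sum can be sketched as follows, using the weights stated above and the violation labels recorded for each solution in Table 4. The names WEIGHTS, total_cost, and table4 are assumptions of the sketch; solution A1 is left out because its rows are incomplete in this copy of the table.

```python
# Neutral scheme from the text: 3 for continuity, 1 for each other defect.
WEIGHTS = {"cont": 3, "sal": 1, "coh": 1, "ch": 1}


def total_cost(defects_per_proposition, weights=WEIGHTS):
    """Sum the weighted cost of every defect recorded for a solution."""
    return sum(
        weights[defect]
        for defects in defects_per_proposition
        for defect in defects
    )


# Violations per proposition, as listed for each solution in Table 4.
table4 = {
    "A2": [[], [], ["coh", "ch"]],
    "A3": [[], ["sal"], ["sal", "coh"]],
    "A4": [[], [], ["sal", "coh", "ch"]],
    "B1": [[], ["cont"], []],
    "B2": [[], ["cont"], []],
    "B3": [[], ["cont"], ["sal"]],
    "B4": [[], ["cont"], ["sal"]],
}

costs = {name: total_cost(rows) for name, rows in table4.items()}
ranking = sorted(costs, key=costs.get)   # lowest total cost first
```

The computed totals agree with the cost column of Table 4 for these seven solutions, with A2 cheapest and B3 and B4 most expensive. Changing the entries of WEIGHTS is enough to try a different weighting scheme.</Paragraph>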
      <Paragraph position="5"> Although this article focuses on centering issues, it is important to remember that other aspects of text quality are evaluated at the same time: The aim is to compute a global measure so that disadvantages in one factor can be weighed against advantages in another. For instance, text pattern B is bound to yield poor continuity of reference because it orders the propositions so that U  have no referents in common.</Paragraph>
      <Paragraph position="6"> Text pattern A avoids this defect, but this does not automatically mean that A is better than B; there may be other reasons, unconnected with centering, for preferring B to A. The constraints which have an effect on clause ordering include: Satellite before nucleus: For nucleus-satellite relations, place the satellite before the nucleus.</Paragraph>
      <Paragraph position="7"> Right-branching structure: If an elementary proposition is coordinated with a complex rhetorical structure, place the elementary proposition first.</Paragraph>
      <Paragraph position="8"> Centering constraints: Penalize orderings which violate centering preferences. Text pattern B is favored by &quot;right-branching structure,&quot; but in this case the centering constraints will &quot;conspire&quot; with &quot;satellite before nucleus&quot; to favor pattern A overall.</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="413" end_page="413" type="metho">
    <SectionTitle>
5. Conclusion
</SectionTitle>
    <Paragraph position="0"> We have described a technique for generating texts which will be coherent according to a reasonably faithful interpretation of centering theory. NLG systems need some principled means of deciding on the preferred orderings of clauses and of arguments within clauses, and CT appears a good candidate to provide a basis for these decisions, in tandem with other stylistic considerations. We have reported on a particular implementation in the Iconoclast document generation system, but the technique can be applied to other NLG systems that perform hierarchical text structuring based on a theory of coherence relations (with additional assumptions as detailed in Section 1): * For systems which generate a single text plan, CT can determine the most coherent ordering of arguments within clauses.</Paragraph>
  </Section>
  <Section position="9" start_page="413" end_page="415" type="metho">
    <SectionTitle>
5 See Kibble and Power (2004) for initial results of empirical research on constraint weightings.
</SectionTitle>
    <Paragraph position="0"> Table 4 (solution | cost | proposition | Cb | Cp | violations; &quot;-&quot; marks a proposition with no Cb):
      A1 | | However, it approves Elixir+. | fda | fda | coh
      A2 | 2 | Since Elixir contains gestodene | - | elixir | none
      A2 | 2 | it is banned by the FDA. | elixir | elixir | none
      A2 | 2 | However, the FDA approves Elixir+. | fda | fda | coh, ch
      A3 | 3 | Since Elixir contains gestodene | - | elixir | none
      A3 | 3 | the FDA bans Elixir. However, | elixir | fda | sal
      A3 | 3 | Elixir+ is approved by the FDA. | fda | elixir+ | sal, coh
      A4 | 3 | Since Elixir contains gestodene | - | elixir | none
      A4 | 3 | it is banned by the FDA. However, | elixir | elixir | none
      A4 | 3 | Elixir+ is approved by the FDA. | fda | elixir+ | sal, coh, ch
      B1 | 3 | The FDA approves Elixir+ although | - | fda | none
      B1 | 3 | since Elixir contains gestodene | - | elixir | cont
      B1 | 3 | it is banned by the FDA. | elixir | elixir | none
      B2 | 3 | Elixir+ is approved by the FDA | - | elixir+ | none
      B2 | 3 | although since Elixir contains gestodene | - | elixir | cont
      B2 | 3 | it is banned by the FDA. | elixir | elixir | none
      B3 | 4 | The FDA approves Elixir+ although | - | fda | none
      B3 | 4 | since Elixir contains gestodene | - | elixir | cont
      B3 | 4 | the FDA bans Elixir. | elixir | fda | sal
      B4 | 4 | Elixir+ is approved by the FDA | - | elixir+ | none
      B4 | 4 | although since Elixir contains gestodene | - | elixir | cont
      B4 | 4 | the FDA bans Elixir. | fda | elixir | sal
      Note: ch = cheapness, coh = cohesion, cont = continuity, sal = salience.
      * For systems which generate multiple text plans, CT can be used to evaluate the different plans as well as to determine the optimal realization of any particular plan.</Paragraph>
    <Paragraph position="1"> We have carried out empirical studies that provide clear evidence that centering features make a difference to the acceptability of texts and demonstrate one way to determine weightings (Kibble and Power 2004). It may turn out that different weightings are appropriate for different text genres or for speech as opposed to &quot;written&quot; text. Our framework will facilitate detailed research into evaluation metrics and will therefore provide a productive research tool in addition to the immediate practical benefit of improving the fluency and readability of generated texts.</Paragraph>
  </Section>
</Paper>