<?xml version="1.0" standalone="yes"?> <Paper uid="W00-1431"> <Title>The.,CLEF= semi-~recursive generation algori:thm</Title> <Section position="4" start_page="232" end_page="235" type="intro"> <SectionTitle> 3 .The ,generation:algorit hm </SectionTitle> <Paragraph position="0"> Tactical generation algorithms use various strategies which are all input-dependent.</Paragraph> <Paragraph position="1"> Moreover, defining the bounds between the &quot;how to say&quot; and the &quot;what to say&quot; is still an open question. RAGS (rags, 1999) proposes a standard architecture for the data, but leaves the ~Eeature; or on.the_linear, order, of the elements, .or &quot; ... on the realisation of the arguments, etc.</Paragraph> <Paragraph position="2"> A list of constraints (initially empty) is associated to each lexical base, and the generation process adds new constraints while parsing the input structure. There are two types of constraints additions : processing details underspecified. (r) implicit constraints. Such constraints are .............. However two main approaehes,carrbe,notieed -'. .... ,._.., .......... added;by~p~en~LB~that~added:~a:~ons/x~nt,~. ~ ~,. the recursive one and the deductive one. The due to its own internal lexical choice. For recursive approach is basically a depth-first backtracking search (for example (NicoIov, 1998)), while the deductive one uses inference mechanisms such as those used in expert systems or specialized languages such as PROLOG (Panaget, 1997). As deductive systems are often used as opaque ways of resolving problems, we will focus on the recursive algorithm, that can easily be used as a base for the customizing of algorithms.</Paragraph> <Section position="1" start_page="232" end_page="232" type="sub_section"> <SectionTitle> 3.1 The input structure </SectionTitle> <Paragraph position="0"> The input of the CLEF generation system is a hierarchical representation (i.e. a tree structure) of the conceptual structure. Therefore, a crucial choice is done before the proper linguistic generation : selecting the theme and the rheme of the utterance ~.</Paragraph> <Paragraph position="1"> The main advantage from a technical point of view is the processing linearization : such a structure is not ambiguous regarding to the mapping between the elements of the input structures and the elements of the grammar. The input structure is therefore always considered as a single tree, and the text generation algorithm is basically a tree walk on this structure, with a lexical choice processing for each node.</Paragraph> </Section> <Section position="2" start_page="232" end_page="233" type="sub_section"> <SectionTitle> 3.2 Lexical choice constraints </SectionTitle> <Paragraph position="0"> The lexical choices made by the lexical bases are modified by constraints that are related to different aspects of the selection : either on the T-This choice is clearly arbitrary, because it is equally relevant to the &quot;what to say&quot; and the &quot;how to say it&quot;. Such a choice for CLEF was mainly guided by technical considerations.</Paragraph> <Paragraph position="1"> example, the lexical selection of&quot;S 1 before $2&quot; (for a succession concept) imposes grammatical constraints on the argument &quot;after&quot; (related to $2) so that the selection of the argument be grammatically compatible (i.e. add a constraint that imposes the use of an infinitive sentence).</Paragraph> <Paragraph position="2"> explicit contraints. 
</Section>
<Section position="2" start_page="232" end_page="233" type="sub_section">
<SectionTitle> 3.2 Lexical choice constraints </SectionTitle>
<Paragraph position="0"> The lexical choices made by the lexical bases are modified by constraints that are related to different aspects of the selection: either to the features, to the linear order of the elements, or to the realisation of the arguments, etc.</Paragraph>
<Paragraph position="1"> A list of constraints (initially empty) is associated with each lexical base, and the generation process adds new constraints while parsing the input structure. There are two types of constraint additions:
- implicit constraints. Such constraints are added by a parent LB, due to its own internal lexical choice. For example, the lexical selection of &quot;S1 before S2&quot; (for a succession concept) imposes grammatical constraints on the argument &quot;after&quot; (related to S2), so that the selection of the argument is grammatically compatible (i.e. it adds a constraint that imposes the use of an infinitive sentence).
- explicit constraints. Those constraints are added by stylistic rules that bear on the lexical choice in order to avoid poor style, or to prevent dead ends in the generation process. For example, the parallelism rule (Meunier, 1997) should impose that two verbal predicates use the same syntactic function for an argument they have in common.</Paragraph>
<Paragraph position="2"> Every constraint addition is associated with a position in the input structure walk, so that it can be removed whenever backtracking occurs. We will also discuss how backtracking can be partly avoided by taking into account some properties of the algorithm and by using a minimal constraint propagation technique.</Paragraph>
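<Paragraph position="3"> The following fragment is a minimal sketch of this bookkeeping, under our own simplifying assumptions rather than CLEF's actual data structures: each constraint is stored together with the position in the walk at which it was added, so that backtracking to an earlier position retracts it. The Constraint and LexicalBase classes and the infinitive test are illustrative placeholders, not part of the system described here.

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Constraint:
    description: str
    added_at: int                        # position in the input-structure walk
    test: Callable[[str], bool]          # does a candidate satisfy the constraint?


@dataclass
class LexicalBase:
    concept: str
    constraints: List[Constraint] = field(default_factory=list)  # initially empty

    def add_constraint(self, description: str, position: int,
                       test: Callable[[str], bool]) -> None:
        """Record an implicit or explicit constraint added while parsing the input."""
        self.constraints.append(Constraint(description, position, test))

    def retract_after(self, position: int) -> None:
        """Backtracking: drop every constraint added at a later position."""
        self.constraints = [c for c in self.constraints if position >= c.added_at]

    def admissible(self, candidate: str) -> bool:
        """A candidate lexicalisation must satisfy every current constraint."""
        return all(c.test(candidate) for c in self.constraints)


# Example: the parent's choice of "S1 before S2" adds an implicit constraint on
# the argument related to S2 (a purely illustrative string test, not real grammar).
s2 = LexicalBase("S2-argument")
s2.add_constraint("use an infinitive sentence", position=3,
                  test=lambda cand: "infinitive" in cand)
assert s2.admissible("infinitive clause")
s2.retract_after(2)                      # backtrack past position 3
assert s2.admissible("finite clause")    # the constraint has been removed
</Paragraph>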
</Section>
<Section position="3" start_page="233" end_page="233" type="sub_section">
<SectionTitle> 3.3 The semi-recursive algorithm </SectionTitle>
<Paragraph position="0"> (Danlos, 1996) emphasizes the problems tied to the use of a recursive depth-first algorithm in the area of text generation. More specifically, she discusses the impossibility of preventing poor stylistic choices, even when they can easily be predicted. In fact, the problem lies in the fact that stylistic or grammatical rules use information that is computed later in the generation stage by the recursive algorithm.</Paragraph>
<Paragraph position="1"> Thus, in the examples given by (Danlos, 1996) (see figure 3), the two 2nd-order relation choices are obviously linked to each other. Nevertheless, the computation of the selection of R1 is not done until other selections are done (at least E2, in this example). In this way, if no lexical selection satisfies the syntactic or stylistic constraints, the generation process will backtrack over the whole array of previous selections.</Paragraph>
<Paragraph position="2"> [Figure 3 (caption partly recoverable): E2, E11, and E12 are 1st-order relations; E0 and E1 are two global schemas.]</Paragraph>
<Paragraph position="3"> Some techniques can be used to partially make up for the problem, for example memoization ((Nicolov, 1998), (Becker, 1998)), but they do not solve it. The fact is that depth-first recursive approaches are not adapted to text generation, where lexical choices must be made in a global, holistic perspective (Danlos, 1998) and (Busemann, 1993).</Paragraph>
<Paragraph position="4"> In this perspective, (Danlos, 1996) proposes a different algorithm, called the &quot;semi-recursive&quot; algorithm, in that it remedies the main drawbacks of the recursive algorithm. It is characterized by the following features (a code sketch is given at the end of this subsection):
- The lexical choices of the different levels of relations are carried out in parallel.
- The combinations of the trees and the stylistic choices are carried out separately for each level of concept. Thus, the consistency of all the lexical choices for a particular level is ensured, and each level of concept is considered globally.
- The compatibility tests between the selections (i.e. the three levels of concepts) are carried out. If the combination is valid, it is accepted; otherwise new selections are made until the compatibility tests succeed.</Paragraph>
<Paragraph position="5"> This approach is particularly relevant, as consistency is not ensured merely for the array of previous lexical choices (which is not enough, as we discussed), but for the whole set of lexical choices on the same level. This provides a realistic implementation of the global approach.</Paragraph>
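<Paragraph position="6"> As announced above, the following fragment is a minimal sketch of this level-by-level scheme, written under our own simplifying assumptions rather than following (Danlos, 1996) in detail: selections are made independently for each conceptual level, consistency is checked over the whole set of choices of a level, and the combinations of levels are then submitted to a compatibility test. The helpers candidate_sets, level_consistent and levels_compatible are hypothetical placeholders.

from itertools import product
from typing import Dict, List, Optional

Level = List[str]                        # the concepts belonging to one level


def candidate_sets(concept: str) -> List[str]:
    """Hypothetical lexical-base lookup: candidate lexicalisations for a concept."""
    return [concept + "/a", concept + "/b"]


def level_consistent(choices: Dict[str, str]) -> bool:
    """Stylistic consistency over all the choices of a single level (placeholder)."""
    return True


def levels_compatible(selection: Dict[str, str]) -> bool:
    """Compatibility test between the three levels of concepts (placeholder)."""
    return True


def select_level(concepts: Level) -> List[Dict[str, str]]:
    """Every globally consistent assignment of lexicalisations for one level."""
    assignments = []
    for combo in product(*(candidate_sets(c) for c in concepts)):
        choices = dict(zip(concepts, combo))
        if level_consistent(choices):
            assignments.append(choices)
    return assignments


def semi_recursive(levels: Dict[str, Level]) -> Optional[Dict[str, str]]:
    """Accept the first combination of per-level selections that passes the test."""
    per_level = [select_level(concepts) for concepts in levels.values()]
    for combo in product(*per_level):
        selection = {}
        for level_choices in combo:
            selection.update(level_choices)
        if levels_compatible(selection):
            return selection                 # valid combination: accepted
    return None                              # no compatible combination was found


# Illustrative input (concept names are made up):
# semi_recursive({"2nd-order relations": ["R1"],
#                 "1st-order relations": ["S1", "S2"],
#                 "entities": ["E1", "E2"]})
</Paragraph>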
</Section>
<Section position="4" start_page="233" end_page="235" type="sub_section">
<SectionTitle> 3.4 The CLEF algorithm </SectionTitle>
<Paragraph position="0"> The CLEF algorithm is a variant of the semi-recursive algorithm. In fact, the main idea of the semi-recursive algorithm is the separated processing of the different levels (entities, 1st-order relations, 2nd-order relations). One problem remains: although the context is taken into account, it is only used within the same level of concepts. Thus, the 2nd-order relations and the 1st-order relations remain independent from each other, and in case of failure of the compatibility test, incompatible selections must be computed again. This is due to the fact that choices are carried out in parallel. In order to solve this problem easily, computation should be carried out sequentially and the different levels should be computed in a predefined order. In this case, the question arises: in which order should the different conceptual levels be computed?</Paragraph>
<Paragraph position="1"> Several pieces of evidence indicate that higher-level elements should be selected first, then the lower levels (i.e. the 2nd-order relations first, then the 1st-order relations, and then the entities). In fact, from the rhetorical point of view, the higher-level elements (in GTAG, the 2nd-order relations) determine the argumentative structure of the text, thus providing stylistic consistency over the whole generated text. Were they not selected first, they would be constrained by the lexical choices of the other types of concepts. In other words, they would yield to constraints other than purely stylistic ones, which is not suitable for elements whose first criterion of choice is, precisely, stylistic. Moreover, it seems that in numerous cases it is preferable to select the simpler elements according to the more complex ones. This corresponds to the approach developed by (Rastier et al., 1994), which shows that an element is only relevant in its surrounding context. Such an approach is relevant in our framework, since a particular lexical selection can only be made with full knowledge of the facts if its context is known. By context, we mean the conceptual-semantic context (e.g. a reference to an entity that already exists in the discourse), the lexical context (e.g. some lexical selection that has already been used for an entity), and the syntactic one (e.g. a previous lexical selection imposes some syntactic constraint). Much essential information, for example to decide whether a noun phrase must be pronominalized or not, or whether a verb can be elided or not, is available only if the surrounding context exists and is known.</Paragraph>
<Paragraph position="2"> The principle of &quot;the determination of the local context by the global one&quot; (called the &quot;hermeneutics principle&quot; in (Rastier et al., 1994)) can therefore be applied only if the global context is computed first, and then the local one, according to the global context. In order for the generation process to comply with this principle, elements should be computed in the following order: 2nd-order relations first, then 1st-order relations, and then entities.</Paragraph>
<Paragraph position="3"> Proceeding otherwise would be inconsistent: it is not possible to determine the lexical-syntactic selection of an entity without knowing whether it is bound to a noun or a verb. The two possibilities are not necessarily available for a given concept, and carrying on without this piece of information could be considered a last resort.</Paragraph>
<Paragraph position="4"> Besides the surrounding context, the local context is also necessary, as shown by the perspective notion which can be found in (Busemann, 1993) and is also supported by (Rastier et al., 1994). It is therefore necessary to know the dependents (the children in the input tree structure) in order to make a lexical-syntactic choice.</Paragraph>
<Paragraph position="5"> These elements were crucial for the design of the CLEF generation algorithm, which we describe hereafter.</Paragraph>
<Paragraph position="6"> The CLEF generation algorithm considers the three conceptual levels one by one, carrying out the lexical selection first on the 2nd-order relations, then on the 1st-order relations, and finally on the entities.</Paragraph>
</Section>
</Section>
</Paper>