File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/00/j00-2001_intro.xml
Size: 8,557 bytes
Last Modified: 2025-10-06 14:00:51
<?xml version="1.0" standalone="yes"?> <Paper uid="J00-2001"> <Title>Integrating Text Planning and Linguistic Choice Without Abandoning Modularity: The IGEN Generator</Title> <Section position="4" start_page="108" end_page="110" type="intro"> <SectionTitle> Rubinoff The IGEN Generator </SectionTitle> <Paragraph position="0"> though, presumes that the work done by each module can be done independently.</Paragraph> <Paragraph position="1"> Separating generation into text planning and linguistic components thus implicitly assumes that text planning can be done without knowledge of the language being used and, conversely, that linguistic choices can be made without text planning knowledge.</Paragraph> <Paragraph position="2"> Unfortunately, this is not always true; both structural and lexical choices sometimes depend on interactions between the two parts of the generator. Thus generation must either compromise the modularity between the components or give up the ability to handle these cases properly.</Paragraph> <Section position="1" start_page="108" end_page="110" type="sub_section"> <SectionTitle> 2.1 Interactions between the Modules </SectionTitle> <Paragraph position="0"> The modular approach to generation assumes that the linguistic component's decisions never matter to the planner (or whichever component(s) organize the information to be expressed). This is not the case, though, as can be seen from the alternations in (1)-(3): (1) a. John killed him with a gun.</Paragraph> <Paragraph position="1"> b. John shot him dead.</Paragraph> <Paragraph position="2"> (2) a. John infected him with a virus.</Paragraph> <Paragraph position="3"> b. *John virused him sick.</Paragraph> <Paragraph position="4"> (3) a. *John homed him with an order.</Paragraph> <Paragraph position="5"> b. 
John ordered him home.</Paragraph> <Paragraph position="6"> The sentences (1a) and (1b) express essentially the same information, so if the generator is attempting to express this information, it must choose between them at some point. In a modular generator, though, there is no point at which the decision can be made. The planner can't make this choice, because the availability of the choice depends on the particular linguistic resources of English. This can be seen by comparison with (2) and (3), in which only one alternative is available. In fact, a different alternative is available in each case. Since the planner doesn't know which alternative(s) is/are available, it can't choose between them; the linguistic component must make the choice.</Paragraph> <Paragraph position="7"> On the other hand, the decision has to be made by the planner, since it can depend on and/or affect the goals the generator is trying to achieve. The choice between (1a) and (1b) should depend (in part) on what the generator is primarily trying to talk about. (1a) is more appropriate if the generator is going to continue talking about the gun, whereas (1b) is more appropriate if the main concern is the ramifications of the victim's death. Since the planner is the component that deals with this information, it must choose between the alternatives.</Paragraph> <Paragraph position="8"> Also, the choice between (1a) and (1b) determines what information can be easily omitted; cutting off the end of the sentence leaves out mention of the use of a gun in (1a) and the death of the victim in (1b). Since the planner knows the consequences of omitting information, it must make the choice of which alternative to use and whether to abbreviate it. It might seem that the planner could simply indicate exactly what information it wants included in the utterance, but that would require the generator to always assume a strategy of saying as little as possible. 2 Note that these interactions aren't the result of the particular details of how the work is divided between the components. As we shall see, there are some decisions that depend on both the underlying goals driving the generator and the details of what can be expressed in a particular language. Any architecture that deals with these issues in different components will encounter the problems described below. Computational Linguistics Volume 26, Number 2</Paragraph> <Paragraph position="9"> Furthermore, decisions about what information to include may interact with other decisions. For example, the generator may want to emphasize the victim's death but not care about the means of death; it might then choose (1b) for the emphasis even though (1a) would let it skip mention of the gun. This kind of decision can only be made by the planner.</Paragraph> <Paragraph position="10"> The same kinds of interactions arise in the process of lexical choice. It would seem that lexical choice has to be handled by the planner, since it depends very much on what the generator is trying to accomplish. For example, the choice of describing someone as either firm, obstinate, or stubborn should depend on what else the generator wants to say about the person, as should the choice between meek and wimpy. The generator might describe how justice was served by an execution rather than how the prisoner was murdered by the state. Similarly, the generator might deride the comments of a dreamer, but praise the insights of a visionary. These kinds of lexical choices can only be made by the component that handles the generator's goals.</Paragraph> <Paragraph position="11"> On the other hand, there are a number of reasons why lexical choice has to be handled by the linguistic component. First of all, lexical choice is very dependent on the particular linguistic vocabulary of the language being generated. 
Thus French, for example, uses two different verbs (connaître and savoir) to express knowing a person and knowing information, but English just uses know for both concepts. Similarly, English uses to be to indicate both location and a state or property, whereas French uses se trouver for the former and être for the latter.</Paragraph> <Paragraph position="12"> Furthermore, there is in general no guarantee that there will be any lexical item to express a given concept. For example, there is no word in English for the concept of a car with a removable door. There's no inherent reason why there couldn't be; after all, there's a word for a car with a removable roof. This is just a particular fact about English. Similarly, there is a word giant meaning &quot;large man&quot;, but no corresponding word meaning &quot;large car&quot;.</Paragraph> <Paragraph position="13"> In addition, since lexical choice interacts with syntactic decisions, it cannot be done in advance of choosing syntactic structures. For example, a generator cannot decide to use probable instead of likely without knowing if the completed utterance could be the ungrammatical he is probable to be early. Similarly, the verb drink can't be chosen without knowing whether the clause will have a direct object; he drinks apple juice and he drinks actually have quite different meanings. 3 Note that the decision here doesn't depend just on whether the beverage is going to be explicitly mentioned; it depends on whether it's going to be mentioned in a specific syntactic position in the sentence. So here too it is impossible to assign the decision to a single component; the decision must be made by both components.</Paragraph> <Paragraph position="14"> The need to handle interactions such as these forces compromises in the modularity of generators. In the TEXT system (McKeown 1985), for example, some decisions about what information to include are in fact encoded into the tactical component. 
For example, TEXT's tactical component omits the attribute value WATER (used in TEXT to indicate that some object travels in or under the water) when it must be 3 McDonald has argued that lexical choice should be done in the first step in generation; in cases where lexical and syntactic decisions interact, lexical choice will constrain subsequent syntactic decisions (McDonald 1991). This approach is certainly possible (although it's not clear how to prevent independent lexical choices from imposing incompatible syntactic constraints), but it assumes that the resulting syntactic constraints don't matter to the generator. For example, choosing drink may require the generator to include a direct object even if it would prefer not to indicate that information; by the time the generator discovers this requirement, it is already committed to the choice.</Paragraph> </Section> </Section> </Paper>