<?xml version="1.0" standalone="yes"?> <Paper uid="W91-0105"> <Title>REVERSIBILITY AND MODULARITY IN NATURAL LANGUAGE GENERATION</Title> <Section position="3" start_page="0" end_page="31" type="metho"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> In general, the goal of parsing is to derive all grammatical structures that a grammar defines for a given string α (in particular, all possible logical forms of α), and the goal of the corresponding generation task is to compute all strings that a grammar defines for a given logical form Φ, i.e. strings whose logical forms are logically equivalent to Φ (see also (Shieber, 1988), (Calder et al., 1989)). Recently, there has been a strong tendency to use the same grammar for both tasks. Besides practically motivated reasons (obtaining more compact systems, or avoiding inconsistencies between the input and output of a system), there are also theoretical reasons (a single model of language behaviour) and psychological ones (empirical evidence for shared processors or facilities, cf. (Garrett, 1982), (Frazier, 1982), (Jackendoff, 1987)) for adopting this view.</Paragraph> <Paragraph position="1"> From a formal point of view, the main interest in obtaining non-directional grammars is the specification of the relationship between strings and logical forms. 1 According to van Noord (1990), a grammar is reversible if the parsing and generation problems are computable and the relation between strings and logical forms is symmetric.</Paragraph> <Paragraph position="2"> In this case parsing and generation are viewed as mutually inverse processes.</Paragraph> <Paragraph position="3"> Furthermore, there are approaches that assume it is possible to use the same algorithm for processing the grammar in both directions (e.g. (Hasida and Isizaki, 1987), (Shieber, 1988), (Dymetman et al., 1990), (Emele and Zajac, 1990)). 
A great advantage of a uniform process is that it provides a discourse- and task-independent module for grammatical processing.</Paragraph> <Paragraph position="4"> This means that the same grammatical power is potentially available for both tasks (regardless of the actual language use).</Paragraph> <Paragraph position="5"> Nevertheless, most 'real' generation systems, which consider all aspects of generating natural language utterances, use grammars that are specially designed for generation (cf. (Hovy, 1987), (Dale, 1990), (Horacek, 1990), (McKeown et al., 1990), (Reithinger, 1991)). 2 The purpose of this paper is to show that the use of a uniform architecture for grammatical processing has an important influence on the whole generation task. A consistent use of a uniform process within a natural language generation system affects the separation into strategic and tactical components. [Footnote 1: I assume a notion of grammar that integrates phonological, syntactic and semantic levels of description, e.g., (Pollard and Sag, 1987).]</Paragraph> <Paragraph position="6"> [Footnote 2: But it is important to note here that most of the proposed grammars are unification-based, which is an important property they share with current parsing grammars.]</Paragraph> <Paragraph position="7"> 
On the one hand, existing problems with this separation emerge; on the other hand, uniform architectures can serve as an important (linguistic) basis for first solutions to these problems.</Paragraph> <Paragraph position="8"> In the next section I discuss important problems and restrictions of the modular design of current generation systems, and then show why a uniform architecture as the grammatical basis can contribute to solving them.</Paragraph> </Section> <Section position="4" start_page="31" end_page="32" type="metho"> <SectionTitle> 2 Modularity in Generation Systems </SectionTitle> <Paragraph position="0"> The Problem. It is widely accepted practice to divide the problem of natural language generation (NLG) into two subtasks: * determination of the content of an utterance * determination of its linguistic realization. This 'divide and conquer' view of generation is the basis of current system architectures. With few exceptions (e.g., (Appelt, 1985)) the following two components are assumed: * 'what to say' part (strategic component) * 'how to say it' part (tactical component). But, as several authors have demonstrated ((Appelt, 1985), (Hovy, 1987), (Rubinoff, 1988), (Neumann, 1991), (Reithinger, 1991)), it is not possible to separate the two phases of the generation process completely, e.g., in the case of lexical gaps or the choice between near-synonyms or paraphrases.</Paragraph> <Paragraph position="1"> Currently, in systems that advocate the separation, the problems are sometimes 'solved' by having the strategic component provide all the information the tactical component needs to make lexical and syntactic choices ((McDonald, 1983), (McKeown, 1985), (Busemann, 1990), (Horacek, 1990)). As a consequence, the input to the tactical component is tailored to determine a good sentence, making the use of powerful grammatical processes redundant. 
In such approaches, tactical components are only front-ends, and the strategic component needs detailed information about the language to be used.</Paragraph> <Paragraph position="2"> Hence, the two are not separate modules, because both components effectively share the grammar. As pointed out by Fodor (1983), one of the characteristic properties of a module is that it is computationally autonomous. A relevant aspect of computational autonomy is that modules do not share resources (in our case the grammar).</Paragraph> <Paragraph position="3"> Looking for More Symmetric Architectures. To maintain the modular design, a more symmetric division into strategic and tactical components is needed:</Paragraph> <Paragraph position="5"> [...] with linguistic decisions. A consequence of this view is that the strategic component has no detailed information about the specific grammar and lexicon. This means that, in general, a message that is constructed precisely enough to satisfy the strategic component's goal can be underspecified from the tactical viewpoint. For example, if the strategic component specifies as input to the tactical component that 'Peter loves Maria', and 'Maria' is the current focus, then in German it is possible to utter: [...] Of course, a 'real' generation system needs to choose between the possible paraphrases. An adequate generation system should avoid uttering 2, because this utterance also has the unmarked reading that 'Maria loves Peter'.</Paragraph> <Paragraph position="6"> As long as the strategic component has no detailed knowledge of a specific grammar, it cannot express 'choose the passive form to avoid ambiguity'. 
But then the process can only choose randomly between paraphrases during generation, which means that the underlying message may not be conveyed.</Paragraph> <Paragraph position="7"> There is also psychologically grounded evidence that the input to a tactical component might not be necessary and sufficient for making linguistic decisions. This is best observed in examples of self-correction (Levelt, 1989). For example, in the following utterance: 3 &quot;but aaa, bands like aaa- aaa- aaa- errr like groups, not bands, - groups, you know what I mean like aaa.&quot; the speaker discovers two words (the near-synonymous 'group' and 'band'), each of which comes close to the underlying concept, and has trouble deciding which one is the most suitable. In this case, the problem arises from a mismatch between what the strategic component wants to express and what the language is capable of expressing (Rubinoff, 1988).</Paragraph> <Paragraph position="8"> Current Approaches. To handle these problems, more flexible tactical components are necessary that can handle, e.g., underspecified input. (Hovy, 1987), (Finkler and Neumann, 1989) and (Reithinger, 1991) describe how such more flexible components can be achieved. A major point of these systems is that they assume a bidirectional flow of control between the strategic and the tactical components.</Paragraph> <Paragraph position="9"> The problem with systems that need a high degree of feedback between the strategic and the tactical components to perform the generation task is that one component cannot perform its specific task without the help of the other. 
</Paragraph> <Paragraph position="10"> But when the mode of operation of, e.g., the tactical component is continuously influenced by feedback from the strategic component, the tactical component loses its autonomy and consequently is not a module (see also (Levelt, 1989)).</Paragraph> </Section> <Section position="5" start_page="32" end_page="36" type="metho"> <SectionTitle> 3 Integration of Parsing and Generation </SectionTitle> <Paragraph position="0"> A promising approach for achieving more autonomous tactical components is to integrate generation and parsing more strictly. By this I mean: * the use of resulting structures of one direction directly in the other direction, * the use of one mode of operation (e.g., parsing) for monitoring the other mode (e.g., generation). A main thesis of this paper is that the best way to achieve such an integrated approach is to use a uniform grammatical process as the linguistic basis of a tactical component. [Footnote 3: This example is taken from Rubinoff (1988) and is originally from a corpus of speech collected at the University of Pennsylvania.]</Paragraph> <Paragraph position="1"> Use of Same Structures in Both Directions. If parsing and generation share the same data (i.e. grammar and lexicon), then structures resulting from one direction can be used directly in the other. For example, during the generation of paraphrases of the ambiguous utterance 'Remove the folder with the system tools.' the generator can directly use the analysed structures of the two NPs 'the folder' and 'the system tools' in the corresponding paraphrases. In a similar way, parsing and generation of elliptic utterances can also be performed more efficiently. For example, consider the following dialog between persons A and B: A: 'Peter comes to the party tonight.' B: 'Mary, too.' 
To parse B's utterance, A can directly use parts of the grammatical structure of his own utterance to complete the elliptic structure. 4 Adaptability to Language Use of Others. Another very important argument for the use of uniform knowledge sources is the possibility of modelling the assumption that, during communication, the language use of one interlocutor is influenced by the language use of the others.</Paragraph> <Paragraph position="2"> For example, in a uniform lexicon it does not matter whether a lexeme was accessed during parsing or generation. This means that the interlocutor's use of linguistic elements influences the choice of lexical material during generation if the frequency of lexemes serves as a decision criterion. This helps to choose between lexemes that are synonymous in the actual situation, or when the semantic input cannot be sufficiently specified. For example, some drinking vessels can be called either 'cup' or 'mug' because their shape cannot be interpreted unequivocally. An appropriate choice would be to use the same lexeme that was previously used by the hearer (if no other information is available), in order to ensure that the same object is denoted. In principle this is also possible for the choice between alternative syntactic structures. [Footnote 4: In this particular case, A can use the whole VP 'will come to the party'. In general the process is more complicated, e.g., if B's answer were 'Mary and John, too'.]</Paragraph> <Paragraph position="3"> This adaptability to the language use of communication partners is one of the reasons why the global generation process of humans is flexible and efficient. Of course, adaptability is also a kind of co-operative behaviour. It is necessary if new ideas have to be expressed for which no mutually known linguistic terms exist (e.g., during communication between experts and novices). 
In this case adaptability to the hearer's language use is necessary so that the hearer is able to understand the new information.</Paragraph> <Paragraph position="4"> In principle this kind of adaptability means that the structures of the input computed during the understanding process carry information that can be used to parametrize the generation process. This leads to more flexibility: not all necessary parameters need to be specified in the input of a generator, because decision points can also be set at run time.</Paragraph> <Paragraph position="5"> This dynamic behaviour of a generation system will increase efficiency, too.</Paragraph> <Paragraph position="6"> According to the definition of McDonald et al. (1987), one generator design is more efficient than another if it is able to solve the same problem in fewer steps. They argue that &quot;the key element governing the difficulty of utterance production is the degree of familiarity with the situation&quot;. The efficiency of the generation process depends on the competence and experience one has acquired for a particular situation. But to have experience in the use of linguistic objects that are adequate in a particular situation means to be adaptable.</Paragraph> <Paragraph position="7"> Monitoring. As Levelt (1989) pointed out, &quot;speakers monitor what they are saying and how they are saying it&quot;, i.e. they monitor not only for meaning but also for linguistic well-formedness.</Paragraph> <Paragraph position="8"> Being able to monitor what one is saying is very important for processing underspecified input, and hence for achieving a more balanced division of the generation task (see sec. 2). For example, to choose between the two paraphrases of the example in sec. 2, the tactical component could parse the resulting strings and then choose the less ambiguous string 'Mary is loved by Peter.' 
It only needs to know from the strategic component that unambiguous utterances are preferred (as a pragmatic constraint).</Paragraph> <Paragraph position="9"> In Levelt's model, parsing and generation are performed in isolation by means of two different grammars. With such a flow of control, the complete structure has to be generated again if ambiguities are detected that have to be avoided.</Paragraph> <Paragraph position="10"> If, for example, an intelligent help system that supports a user of an operating system (e.g. Unix, (Wilensky et al., 1984)) receives as input the utterance &quot;Remove the folder with the system tools&quot;, then the system cannot perform the corresponding action directly because the utterance is ambiguous. But the system could ask the user &quot;Do you mean 'Remove the folder by means of the system tools' or 'Remove the folder that contains the system tools'?&quot;. This situation is summarized in the following figure (LF' and LF&quot; symbolize the two readings of S): S: Remove the folder with the system tools; S' (reading LF'): Remove the folder by means of the system tools; S&quot; (reading LF&quot;): Remove the folder that contains the system tools. If parsing and generation are performed in isolation, then the generation of paraphrases can be very inefficient, because the source of the ambiguity in S is not used directly to guide the generation process.</Paragraph> <Paragraph position="11"> Generation of Paraphrases. To clarify why an integrated approach can help to solve the problem, I will consider the generation of paraphrases in more detail.</Paragraph> <Paragraph position="12"> If a reversible grammar is used in both directions, then the links between the strings and logical forms [...] A first naive algorithm that generates paraphrases using a reversible grammar can be described as follows: Suppose S is the input for the parser; then the set {(S, LF'), (S, LF&quot;)} is computed. 
Now LF' and LF&quot; are each given as input to the generator to compute possible paraphrases. The sets {(LF', S'), (LF', S)} and {(LF&quot;, S), (LF&quot;, S&quot;)}, respectively, result. By comparing the elements of the sets obtained during generation with the set obtained during parsing, one can easily determine the two paraphrases S' and S&quot; because of the relationship between strings and logical forms defined by the grammar.</Paragraph> <Paragraph position="13"> [Footnote 5: [...] generation grammars as long as the symmetry property is not affected. This inherent property of a reversible grammar is very important for the generation of paraphrases because it ensures that the ambiguous structure and the corresponding paraphrases are related to each other. If this were not the case, one would only be able to generate the paraphrases but not the ambiguous structure.]</Paragraph> <Paragraph position="14"> This algorithm is naive because it assumes that all possible paraphrases can be generated at once. Although 'all-parses' algorithms are widely used for parsing in natural language systems, a corresponding 'all-paraphrases' strategy is not practical, because in general the search space during generation is much larger (which is a consequence of the modular design discussed in sec. 2).</Paragraph> <Paragraph position="15"> Of course, from a formal point of view one is interested in algorithms that compute all grammatically well-formed structures, at least potentially. So most of the currently developed generators and uniform algorithms assume, more or less explicitly, an all-paraphrases strategy (e.g., (Shieber, 1988), (Calder et al., 1989), (Shieber et al., 1989), (Dymetman et al., 1990), (Emele and Zajac, 1990)). But from a practical point of view they are not directly usable in such specific situations. 
More Suitable Strategies. A more suitable strategy would be to generate only one paraphrase for each ambiguous logical form. As long as parsing and generation are performed in isolation, the problem with this strategy is that there is no control over the choice of paraphrases. To make this point clear, I will look more closely at the underlying structure of the example utterances. The reason why there are two readings is that the PP 'with the system tools' can be attached as a modifier either to the NP 'the folder' (expressing the semantic relation that 'folder' contains 'system tools') or to the verb 'remove' (expressing semantically that 'system tools' is the instrument of the described situation). Fig. 2 and 3 (see above) show the internal grammatical structure in an HPSG-style notation (omitting details that are not relevant in this context).</Paragraph> <Paragraph position="16"> As long as the source of the ambiguity is not known, it is possible in both cases to generate the utterance 'Remove the folder with the system tools' as a paraphrase of itself. Of course, it is possible to compare the resulting strings with the input string S. But because the source of the ambiguity is not known, the loop between the isolated processes must in general be performed several times.</Paragraph> <Paragraph position="17"> A better strategy would be to recognize relevant sources of ambiguity during parsing and to use this information to guide the generation process. Meteer and Shaked (1988) propose an approach in which potential sources of ambiguity are detected during a repeated parse of an ambiguous utterance. For example, when in the case of lexical ambiguity a noun can be associated with two semantic classes, a so-called 'lexical ambiguity specialist' records the noun as the ambiguity source together with the two different classes. These two classes are then explicitly used in the generator input and are realized as, e.g., 
modifiers for the ambiguous noun.</Paragraph> <Paragraph position="18"> The only common knowledge source for the paraphraser is a higher-order intensional logic language called World Model Language. It serves as the interface between parser and generator. The problem with this approach is that parsing and generation are performed in isolation using two different grammars. If an ambiguous utterance S needs to be paraphrased, S has to be parsed again. During this repeated parse all potential ambiguities have to be recognized and recorded (i.e. monitored) by means of different 'ambiguity specialists'. The problem here is that local ambiguities that are not relevant for the whole structure also have to be considered.</Paragraph> <Paragraph position="19"> An Alternative Approach. I will now describe the basic idea of an integrated approach in which both tasks share the same grammar. The advantage of this approach is that no repeated parse is necessary to compute potential ambiguity sources, because the different grammatical structures determined during parsing are used directly to guide the generation process. This strategy also ensures that an ambiguous utterance is not generated as a paraphrase of itself.</Paragraph> <Paragraph position="20"> In principle the algorithm works as follows: During the generation of paraphrases, the generation process is monitored in such a way that, in each step, the monitor compares the structures resulting from generation with the corresponding structures from parsing, which are maintained in the alternative parse trees (I will assume that two parse trees P1 and P2, corresponding to the structures given in fig. 2 and 3, are obtained during parsing). Suppose that LF' (cf. fig. 1) is specified as the input to the generator. 
Where the generator encounters alternative grammatical structures to be expanded, the monitor guides the generator by inspecting the corresponding parse trees. Where the currently considered parts p1 and p2 of P1 and P2 (e.g., the same NPs) are equal, the generator has to choose the same grammatical structure that was used to build p1 and p2 (or, more efficiently, the generator can use the partial structure directly as a kind of compiled knowledge). Where a partial structure of, e.g., parse tree P1 has no correspondence in P2 (cf. fig. 2 and 3), an ambiguity source is detected. In this case an alternative grammatical structure has to be chosen. 6 At this point it should be clear that the easiest way to generate 'along parsed structures' is to use the same grammar in both directions. In this case grammatical structures obtained during parsing can be used directly to restrict the potential search space during generation. [Footnote 6: Of course, the described algorithm is too restrictive to handle non-structural (e.g. contextual) paraphrases. But I assume that this approach is also applicable in the case of lexical ambiguities, provided that word meanings are structurally described by means of lexical se[...]] This approach is not restricted to cases where the input is ambiguous and the paraphrases must contrast the different meanings. It can also be used for self-monitoring, when it has to be checked whether a produced utterance S of an input form LF is ambiguous. In this case S will be parsed. If during parsing, e.g., two readings LF' and LF&quot; are deduced, LF is generated again along the parse tree obtained for S. 
Now an utterance S' can be generated that has the same meaning but differs from S with respect to the ambiguity source.</Paragraph> </Section> <Section position="6" start_page="36" end_page="36" type="metho"> <SectionTitle> 4 Current Work </SectionTitle> <Paragraph position="0"> We have now started a project called BiLD (short for Bidirectional Linguistic Deduction) at the University of Saarland (Saarbrücken), which will investigate how an integrated approach to parsing and generation can be realized efficiently by means of a uniform architecture, and how such a model can be used to increase flexibility and efficiency during natural language processing. The main topic of the BiLD project is the development of a uniform parametrized deduction process for grammatical processing. This process constitutes the core of a flexible and symmetric tactical module. In order to realize the integrated approach and to obtain a high degree of efficiency in both directions, we will develop methods for a declarative encoding of control information in the lexicon and grammar.</Paragraph> <Paragraph position="1"> We follow a sign-based approach to the description of linguistic entities, based on Head-driven Phrase Structure Grammar (Pollard and Sag, 1987) and the variant described in Reape (1989). Besides theoretical reasons, there are also system design reasons for adopting this view, because all levels of description (i.e. phonological, syntactic and semantic structure) of linguistic entities (i.e. words and phrases) are described simultaneously and uniformly by means of partial information structures. None of the levels of description has a privileged status; rather, each expresses possible mutual co-occurrence restrictions on structures of different levels.</Paragraph> <Paragraph position="2"> Furthermore, a high degree of lexicalism is assumed, so that the lexicon, as a complex hierarchically ordered data structure, plays a central role in BiLD. 
As has been shown, this lexicalized view supports reversibility (cf. (Newman, 1990), (Dymetman et al., 1990)) and the realization of specific processing strategies (e.g., incremental and parallel generation, (Neumann and Finkler, 1990)).</Paragraph> <Paragraph position="3"> The task of the deduction process during generation is to construct the graphemic form for a specified feature description of a semantic form. For example, to yield the utterance &quot;A man sings.&quot; the deduction process gets as input the semantic feature structure rel : sing' agens : quant : exists' var : restr : [ pred : man' ] var : and deduces the graphematic structure [...]</Paragraph> <Paragraph position="5"> by means of successive application of lexical and grammatical information. In the same way, the deduction process computes an appropriate semantic structure from the graphematic structure in the parsing direction. A first prototype based on a head-driven bottom-up strategy is now under development (cf. (van Noord, 1991)).</Paragraph> <Paragraph position="6"> A major aspect of the BiLD project is that a specific parametrization of the deduction process is represented in the lexicon as well as in the grammar, in order to obtain efficient control structures (Uszkoreit, 1991). The main idea is that preference values are assigned to the elements (disjuncts or conjuncts) of feature descriptions. For example, in HPSG all lexical entries are put together into one large disjunctive form. From a purely declarative point of view these elements are unordered. But a preference structure is used during processing to guide the process of lexical choice efficiently, which in turn influences the grammatical process.</Paragraph> </Section> </Paper>