<?xml version="1.0" standalone="yes"?>
<Paper uid="P95-1034">
<Title>Two-Level, Many-Paths Generation</Title>
<Section position="22" start_page="257" end_page="258" type="evalu">
<SectionTitle> 9 Strengths and Weaknesses </SectionTitle>
<Paragraph position="0"> Many-paths generation leads to a new style of incremental grammar building. When dealing with some new construction, we first rather mindlessly overgenerate, providing the grammar with many ways to express the same thing. Then we watch the statistical component make its selections. If the selections are correct, there is no need to refine the grammar.</Paragraph>
<Paragraph position="1"> For example, in our first grammar, we did not make any lexical or grammatical case distinctions.</Paragraph>
<Paragraph position="2"> So our lattices included paths like Him saw I as well as He saw me. But the statistical model studiously avoided the bad paths, and in fact, we have yet to see an incorrect case usage from our generator. Likewise, our grammar proposes both his box and the box of he/him, but the former is statistically much more likely. Finally, we have no special rule to prohibit articles and possessives from appearing in the same noun phrase, but the bigram the his is so awful that the null article is always selected in the presence of a possessive pronoun. So we can get away with treating possessive pronouns like regular adjectives, greatly simplifying our grammar.</Paragraph>
<Paragraph position="3"> We have also been able to simplify the generation of morphological variants. While true irregular forms (e.g., child/children) must be kept in a small exception table, the problem of "multiple regular" patterns usually increases the size of this table dramatically. For example, there are two ways to pluralize a noun ending in -o, but often only one is correct for a given noun (potatoes, but photos). There are many such inflectional and derivational patterns.</Paragraph>
<Paragraph position="4"> Our approach is to apply all patterns and insert all results into the word lattice. Fortunately, the statistical model steers clear of sentences containing non-words like potatos and photoes. We thus get by with a very small exception table, and furthermore, our spelling habits automatically adapt to the training corpus.</Paragraph>
<Paragraph position="5"> Most importantly, the two-level generation model allows us to indirectly apply lexical constraints for the selection of open-class words, even though these constraints are not explicitly represented in the generator's lexicon. For example, the selection of a word from a pair of frequently co-occurring adjacent words will automatically create a strong bias for the selection of the other member of the pair, if the latter is compatible with the semantic concept being lexicalized. This type of collocational knowledge, along with additional collocational information based on long- and variable-distance dependencies, has been successfully used in the past to increase the fluency of generated text (Smadja and McKeown, 1991). But, although such collocational information can be extracted automatically, it has to be manually reformulated into the generator's representational framework before it can be used as an additional constraint during pure knowledge-based generation. In contrast, the two-level model provides for the automatic collection and implicit representation of collocational constraints between adjacent words.</Paragraph>
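A minimal sketch may make the overgenerate-and-rank scheme concrete. The snippet below is illustrative only; the lattice, counts, and smoothing are invented for this example rather than taken from our system. It enumerates the paths of a tiny word lattice and ranks them with a smoothed bigram model, so that corpus statistics, not the grammar, rule out Him saw I; the same scoring is what implicitly encodes preferences between adjacent words, including competing plural forms and collocations.

import math
from itertools import product

# Invented toy bigram and unigram counts, standing in for statistics
# collected from a training corpus.
BOUNDARY = "."
bigram_counts = {
    (BOUNDARY, "he"): 50, ("he", "saw"): 40, ("saw", "me"): 30, ("me", BOUNDARY): 45,
    (BOUNDARY, "him"): 5, ("him", "saw"): 1, ("saw", "i"): 1, ("i", BOUNDARY): 2,
}
unigram_counts = {BOUNDARY: 100, "he": 60, "him": 10, "saw": 50, "me": 50, "i": 40}

def bigram_logprob(prev, word):
    """Add-one smoothed estimate of log P(word | prev)."""
    vocab_size = len(unigram_counts)
    numerator = bigram_counts.get((prev, word), 0) + 1
    denominator = unigram_counts.get(prev, 0) + vocab_size
    return math.log(numerator / denominator)

def path_score(words):
    """Log probability of one lattice path under the bigram model."""
    tokens = [BOUNDARY] + words + [BOUNDARY]
    return sum(bigram_logprob(p, w) for p, w in zip(tokens, tokens[1:]))

# Each lattice slot holds every alternative the grammar overgenerated:
# case variants here; competing plural forms such as potatoes/potatos
# are handled by exactly the same mechanism.
lattice = [["he", "him"], ["saw"], ["me", "i"]]

best_path = max((list(path) for path in product(*lattice)), key=path_score)
print(" ".join(best_path))   # prints "he saw me"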
<Paragraph position="6"> In addition, in the absence of external lexical constraints, the language model prefers words that are more typical of, and common in, the domain, rather than generic, overly specialized, or formal alternatives.</Paragraph>
<Paragraph position="7"> The result is text that is more fluent and closely simulates the style of the training corpus in this respect. Note, for example, the choice of obtain in the second example of the previous section in favor of the more formal procure.</Paragraph>
<Paragraph position="8"> Many times, however, the statistical model does not finish the job. A bigram model will happily select a sentence like I only hires men who is good pilots. If we see plenty of output like this, then grammatical work on agreement is needed. Or consider They planned increase in production, where the model drops an article because planned increase is such a frequent bigram. This is a subtle interaction: is planned a main verb or an adjective? Also, the model prefers short sentences to long ones with the same semantic content; this favors conciseness, but it sometimes means selecting bad n-grams over a longer (but clearer) rendition. This is an interesting problem not encountered in otherwise similar speech recognition models. We are currently investigating solutions to all of these problems.</Paragraph>
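To make the length bias concrete, the toy computation below (not from our system; the per-bigram probabilities are invented) shows why a shorter rendition can outscore a longer, clearer one: every additional word contributes one more factor smaller than one to the sentence probability.

import math

# Invented per-bigram probabilities, for illustration only.
# "They planned increase in production": 4 word-to-word bigrams.
short_path = [0.2, 0.1, 0.15, 0.3]
# "They planned an increase in production": 5 bigrams, each at least
# as probable as its counterpart above, yet the extra factor loses.
long_path = [0.2, 0.2, 0.2, 0.15, 0.3]

def total_logprob(bigram_probs):
    """Sentence score under a bigram model: sum of log bigram probabilities."""
    return sum(math.log(p) for p in bigram_probs)

print(total_logprob(short_path))  # about -7.0: the short path wins
print(total_logprob(long_path))   # about -7.9: penalized for length

</Section>
</Paper>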