<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1064">
  <Title>Text Generation from Keywords Kiyotaka Uchimoto, Satoshi Sekine</Title>
  <Section position="7" start_page="3" end_page="3" type="relat">
    <SectionTitle>
6 Related Work
</SectionTitle>
    <Paragraph position="0"> Many statistical generation methods have been proposed. In this section, we describe the differences between our method and several previous methods.</Paragraph>
    <Paragraph position="1"> Japanese words are often followed by post-positional particles, such as &amp;quot;ga&amp;quot; and &amp;quot;wo&amp;quot;, to indicate the subject and object of a sentence. There are no corresponding words in English. Instead, English words are preceded by articles, &amp;quot;the&amp;quot; and &amp;quot;a,&amp;quot; to distinguish definite and indefinite nouns, and so on; in this case there are no corresponding words in Japanese. Knight et al. proposed a way to compensate for missing information caused by a lack of language-dependent knowledge, or a &amp;quot;knowledge gap&amp;quot; (Knight and Hatzivassiloglou, 1995; Langkilde and Knight, 1998a; Langkilde and Knight, 1998b). They use semantic expressions as input, whereas we use keywords. Also, they construct candidate-text sentences or word lattices by applying rules, and apply their language model, an n-gram model, to select the most appropriate surface text. While we cannot use their rules to generate candidate-text sentences when given keywords, we can apply their language model to our system to generate surface-text sentences from candidate-text sentences in the form of dependency trees. We can also apply the formalism proposed by Langkilde (Langkilde, 2000) to express the candidate-text sentences.</Paragraph>
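The selection step described above, using an n-gram language model to pick the most appropriate surface text from a set of candidates, can be sketched as follows. This is a minimal illustration, not the model from any of the cited papers: the toy corpus, the add-alpha smoothing, and all function names are assumptions for the example.

```python
import math
from collections import Counter

# Hypothetical toy corpus; in practice the n-gram model would be
# trained on a large monolingual corpus.
corpus = [
    "the dog chased the cat".split(),
    "the cat sat on the mat".split(),
]

# Count unigrams and bigrams, with sentence-boundary markers.
unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    tokens = ["<s>"] + sent + ["</s>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def log_prob(sentence, alpha=0.1):
    """Add-alpha smoothed bigram log-probability of a token list."""
    tokens = ["<s>"] + sentence + ["</s>"]
    vocab = len(unigrams)
    return sum(
        math.log((bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * vocab))
        for a, b in zip(tokens, tokens[1:])
    )

def best_candidate(candidates):
    """Select the candidate surface text the language model prefers."""
    return max(candidates, key=log_prob)
```

Given competing word orders such as `["the", "cat", "sat"]` and `["cat", "the", "sat"]`, the model prefers the one whose bigrams were observed in the training corpus.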
    <Paragraph position="2"> Bangalore and Rambow proposed a method to generate candidate-text sentences in the form of trees (Bangalore and Rambow, 2000). They consider dependency information when deriving trees by using XTAG grammar, but they assume that the input contains dependency information. Our system generates candidate-text sentences without relying on dependency information in the input, and our model estimates the dependencies between keywords.</Paragraph>
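The contrast above, estimating dependencies between keywords rather than requiring them in the input, can be illustrated with a brute-force sketch: given head-modifier counts gathered from an automatically parsed corpus, enumerate the possible dependency trees over a small keyword set and keep the highest-scoring one. The counts, words, and scoring scheme below are hypothetical stand-ins, not the paper's actual model.

```python
from itertools import product

# Hypothetical head->modifier co-occurrence counts; such statistics
# would come from running a dependency analyzer over a corpus.
dep_counts = {
    ("ate", "John"): 7,
    ("ate", "apple"): 9,
    ("apple", "red"): 4,
    ("ate", "red"): 1,
}

def tree_score(parent):
    """Sum of edge counts for a {modifier: head} map."""
    return sum(dep_counts.get((h, m), 0) for m, h in parent.items())

def is_tree(parent, root, mods):
    """Check that every modifier reaches the root without cycles."""
    for w in mods:
        seen = set()
        while w != root:
            if w in seen:
                return False
            seen.add(w)
            w = parent[w]
    return True

def best_tree(words):
    """Brute-force the best dependency tree over a small keyword set."""
    best, best_score = None, -1
    for root in words:
        mods = [w for w in words if w != root]
        for heads in product(words, repeat=len(mods)):
            parent = dict(zip(mods, heads))
            if any(h == m for m, h in parent.items()):
                continue
            if not is_tree(parent, root, mods):
                continue
            score = tree_score(parent)
            if score > best_score:
                best, best_score = parent, score
    return best, best_score
```

For the keywords {John, ate, apple, red}, the counts above select "ate" as the root with "John" and "apple" attached to it and "red" attached to "apple", without any dependency information in the input.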
    <Paragraph position="3"> Ratnaparkhi proposed models to generate text from semantic attributes (Ratnaparkhi, 2000). The input of these models is semantic attributes. His models are similar to ours if the semantic attributes are replaced with keywords.</Paragraph>
    <Paragraph position="4"> However, his models need a training corpus in which certain words are replaced with semantic attributes. Although our model also needs a training corpus, the corpus can be automatically created by using a morphological analyzer and a dependency analyzer, both of which are readily available.</Paragraph>
    <Paragraph position="5"> Humphreys et al. proposed using models developed for sentence-structure analysis to rank candidate-text sentences (Humphreys et al., 2001). In addition to models developed for sentence-structure analysis, we also use models developed for morphological analysis, and we found that these models contribute to the generation of appropriate text.</Paragraph>
    <Paragraph position="6"> Berger and Lafferty proposed a language model for information retrieval (Berger and Lafferty, 1999). Their concept is similar to that of our model, which can be regarded as a model that translates keywords into text, while their model can be regarded as one that translates query words into documents. However, the purpose of their model is different: their goal is to retrieve text that already exists while ours is to generate new text.</Paragraph>
    <Paragraph position="7"> 7 Conclusion We have described a method for generating sentences from &amp;quot;keywords&amp;quot; or &amp;quot;headwords&amp;quot;. This method consists of two main parts: candidate-text construction and evaluation.</Paragraph>
    <Paragraph position="8"> 1. The construction part generates text sentences in the form of dependency trees by supplying the information missing due to a &amp;quot;knowledge gap&amp;quot;, including missing function words, and thus generates natural text sentences based on a particular monolingual corpus.</Paragraph>
    <Paragraph position="9"> 2. The evaluation part consists of a model for generating an appropriate text sentence when given keywords. This model considers the dependency information between words as well as word n-gram information. Furthermore, the model considers both string and morphological information.</Paragraph>
    <Paragraph position="10"> If a language model, such as a word n-gram model, is applied to the generated-text sentences in the form of dependency trees, an appropriate surface-text sentence is generated.</Paragraph>
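The final step described above, applying a word n-gram model to a dependency tree to obtain a surface-text sentence, amounts to choosing the word order the language model scores highest. A minimal brute-force sketch, with hypothetical bigram scores standing in for a trained model:

```python
from itertools import permutations

# Toy bigram scores standing in for a trained word n-gram model.
bigram = {
    ("<s>", "she"): 0.4, ("she", "reads"): 0.5,
    ("reads", "books"): 0.6, ("books", "</s>"): 0.3,
}

def score(order):
    """Product of bigram scores, with a small floor for unseen pairs."""
    tokens = ["<s>"] + list(order) + ["</s>"]
    p = 1.0
    for a, b in zip(tokens, tokens[1:]):
        p *= bigram.get((a, b), 0.01)
    return p

def linearize(tree_words):
    """Pick the word order of a candidate dependency tree that the
    n-gram model scores highest (brute force over permutations)."""
    return max(permutations(tree_words), key=score)
```

Brute-force permutation search is only feasible for short sentences; a real system would restrict the search to orders consistent with the dependency tree.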
    <Paragraph position="11"> The word-order model proposed by Uchimoto et al. can also generate surface text in a natural order (Uchimoto et al., 2000a).</Paragraph>
    <Paragraph position="12"> There are several possible directions for our future research. In particular, * We would like to expand the generation rules. We restricted the generation rules automatically acquired from a corpus to those that generate a bunsetsu. To generate a greater variety of candidate-text sentences, we would like to expand them to rules that can generate a dependency tree. This expansion would allow content words, as well as function words, to be supplied as complements.</Paragraph>
    <Paragraph position="13"> We would also like to prepare default rules, or to classify words into several classes, for cases where no sentences containing the keywords are found in the target corpus.</Paragraph>
    <Paragraph position="14"> * Some of the N-best text sentences generated by our system are semantically and grammatically unnatural. To remove such sentences from among the candidate-text sentences, we must enhance our model so that it can consider more information, such as classified words or those in a thesaurus.</Paragraph>
    <Paragraph position="15"> * We restricted keywords to the headwords or rightmost content words in the bunsetsus.</Paragraph>
    <Paragraph position="16"> We would like to expand the definition of keywords to other content words and to synonyms of the keywords.</Paragraph>
  </Section>
</Paper>