<?xml version="1.0" standalone="yes"?> <Paper uid="C02-1056"> <Title>Paraphrasing of Chinese Utterances</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Goals and Approach </SectionTitle> <Paragraph position="0"> In the pre-processing stage of translation, Chinese paraphrasing focuses on (1) transforming spoken-language expressions into formal expressions, (2) reducing syntactic and semantic ambiguities, (3) generating as many different expressions as possible so as to include expressions that can be translated by the transfer, and (4) paraphrasing the main constituents of the utterance in case paraphrasing the whole utterance has no effect.</Paragraph> <Paragraph position="1"> The aim of paraphrasing types (1), (2) and (4) is to simplify the expressions of utterances, and that of paraphrasing type (3) is to increase the variety of utterances. At present, we focus on paraphrasing types (1), (2) and (3).</Paragraph> <Paragraph position="2"> Paraphrasing is a process that automatically generates new expressions that have the same meaning as the input sentence. At first glance, one might think that the problem could be solved by separating it into two processes: a parsing process that analyzes the input sentence and obtains its meaning, and a generation process that generates sentences from the obtained meaning. However, this solution is not practicable for the following reasons.</Paragraph> <Paragraph position="3"> * At present, the techniques for parsing and semantic analysis of the Chinese language are far below the level needed for practical application. In the study of spoken language, parsing and semantic analysis are major research themes in their own right. 
For automatic paraphrasing, we would first need to determine what kind of analysis is required and only then begin to develop a parser or a semantic analyzer.</Paragraph> <Paragraph position="4"> * Even if meanings can be obtained, goal (3) cannot be achieved if only one sentence is generated. Here, the requirement that paraphrasing generate multiple expressions is the most important one. This focus differs from that of conventional sentence generation.</Paragraph> <Paragraph position="5"> In fact, paraphrasing can be conducted at many different levels, for instance, words, phrases, or larger constituents. Although the paraphrasing of such constituents is probably related to context, it is not true that paraphrasing is impossible without understanding the whole sentence (Kataoka et al., 1999). The paraphrasing process faces the following problems: (i) how to identify objects, i.e., which components of an input sentence will be paraphrased, (ii) how to generate new sentences, and (iii) how to ensure that the generated sentences have the same meaning as the input sentence. In order to avoid the large cost of syntactic and semantic analysis, we propose a pattern-based approach to paraphrasing in which only morphological analysis is required. The focus is placed on how to generate as many different expressions as possible and how to obtain paraphrasing patterns from a paraphrasing corpus.</Paragraph> </Section> </Paper>