<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0402"> <Title>Paraphrasing of Japanese Light-verb Constructions Based on Lexical Conceptual Structure. Atsushi Fujita, Kentaro Furihata, Kentaro Inui</Title> <Section position="3" start_page="1" end_page="2" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> A light-verb construction (LVC) is a verb phrase (&quot;kandou-o ataeta (made an impression)&quot; in (1s)) that consists of a light-verb (&quot;ataeta (give-PAST)&quot;) that grammatically governs a nominalized verb (&quot;kandou (an impression)&quot;) (also see Figure 1 in Section 2.2). A paraphrase of (1s) is sentence (1t), in which the nominalized verb functions as the main verb in its verbal form (&quot;kandou-s-ase-ta (be impressed-CAUSATIVE, PAST)&quot;). (For each example, s denotes an input and t denotes its paraphrase.)</Paragraph> <Paragraph position="2"> The film made an impression on him.</Paragraph> <Paragraph position="3"> t. Eiga-ga kare-o kandou-s-ase-ta.</Paragraph> <Paragraph position="4"> film-NOM him-ACC be impressed-CAUSATIVE, PAST The film impressed him.</Paragraph> <Paragraph position="5"> To generate this type of paraphrase, we need a computational model that is capable of making the following two classes of choice (also see Section 2.2): Selection of the voice: The model needs to be able to choose the voice of the target sentence from active, passive, causative, etc. In example (1), the causative voice is chosen, which is indicated by the auxiliary verb &quot;ase (causative)&quot;. Reassignment of the cases: The model needs to be able to reassign a case marker to each argument of the main verb. 
In (1), &quot;kare (him),&quot; which was originally assigned the dative case, is reassigned the accusative case.</Paragraph> <Paragraph position="6"> The task is not as simple as it may seem, because both decisions depend not only on the syntactic and semantic attributes of the light-verb, but also on those of the nominalized verb (Muraki, 1991).</Paragraph> <Paragraph position="7"> In this paper, we propose a novel lexical semantics-based account of LVC paraphrasing, which uses the theory of Lexical Conceptual Structure (LCS) of Japanese verbs (Kageyama, 1996; Takeuchi et al., 2001). The theory of LCS offers an advantage as the basis of lexical resources for paraphrasing, because it has been developed to explain a variety of linguistic phenomena, including lexical derivations, the construction of compounds, and verb alternation (Levin, 1993; Dorr et al., 1995; Kageyama, 1996; Takeuchi et al., 2001), all of which are associated with the systematic paraphrasing we mentioned above. [Second ACL Workshop on Multiword Expressions: Integrating Processing, July 2004, pp. 9-16]</Paragraph> <Paragraph position="8"> The paraphrasing associated with LVCs is not idiosyncratic to Japanese but also appears commonly in other languages such as English (Mel'čuk and Polguère, 1987; Iordanskaja et al., 1991; Dras, 1999, etc.), as indicated by the following examples.</Paragraph> <Paragraph position="9"> (2) s. Steven made an attempt to stop playing.</Paragraph> <Paragraph position="10"> t. Steven attempted to stop playing.</Paragraph> <Paragraph position="11"> (3) s. It had a noticeable effect on the trade.</Paragraph> <Paragraph position="12"> t. 
It noticeably affected the trade.</Paragraph> <Paragraph position="13"> Our approach raises the interesting issue of whether the paraphrasing of LVCs can be modeled in an analogous way across languages.</Paragraph> <Paragraph position="14"> Our aims in this paper are: (i) exploring the regularity of LVC paraphrasing through a lexical semantics-based account, and (ii) assessing the immature Japanese semantic typology through a practical task.</Paragraph> <Paragraph position="15"> The following sections describe our motivation, target, and related work on LVC paraphrasing (Section 2), the basics of LCS and the refinements we made (Section 3), our paraphrasing model (Section 4), and our experiments (Section 5). Finally, we conclude this paper with a brief description of future work (Section 6).</Paragraph> <SectionTitle> 2 Motivation, target, and related work </SectionTitle> <Section position="1" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 2.1 Motivation </SectionTitle> <Paragraph position="0"> One of the critical issues that we face in paraphrase generation is how to develop and maintain knowledge resources that cover a sufficiently wide range of paraphrasing patterns, such as those indicating that &quot;to make an attempt&quot; can be paraphrased into &quot;to attempt,&quot; and that &quot;potential&quot; can be paraphrased into &quot;possibility.&quot; Several attempts have been made to develop such resources manually (Sato, 1999; Dras, 1999; Inui and Nogami, 2001); those studies have, however, tended to restrict their scope to specific classes of paraphrases, and their results cannot be used to construct a sufficiently comprehensive resource for practical applications.</Paragraph> <Paragraph position="1"> There is another trend in the research in this field, namely, the automatic acquisition of paraphrase patterns from parallel or comparable corpora (Barzilay and McKeown, 2001; Lin and Pantel, 2001; Pang et al., 2003; Shinyama and Sekine, 2003, etc.). 
This type of approach may be able to reduce the cost of resource development. There are problems that must be overcome, however, before such approaches can work in practice. First, automatically acquired patterns tend to be complex. For example, from the paraphrase of (4s) into (4t), we can naively obtain the pattern: &quot;X is purchased by Y = Y buys X.&quot; (4) s. This car was purchased by him.</Paragraph> <Paragraph position="2"> t. He bought this car.</Paragraph> <Paragraph position="3"> This could also, however, be regarded as a combination of a simpler pattern of lexical paraphrasing (&quot;purchase = buy&quot;) and a voice activization (&quot;X be VERB-PP by Y = Y VERB X&quot;).</Paragraph> <Paragraph position="4"> If we were to use an acquisition scheme that is not capable of decomposing such complex paraphrases correctly, we would have to collect a combinatorial number of paraphrases to gain the required coverage. Second, the results of automatic acquisition would likely include many inappropriate patterns, which would require manual correction. Manual correction, however, would be impractical if we were collecting a combinatorial number of patterns.</Paragraph> <Paragraph position="5"> Our approach to this dilemma is as follows: first, we manually develop the resources needed to cover those paraphrases that appear regularly, and then decompose and automatically refine the acquired paraphrasing patterns using those resources. The work reported in this paper is aimed at this resource development. 
</Paragraph> </Section> <Section position="2" start_page="1" end_page="2" type="sub_section"> <SectionTitle> 2.2 Target structure and required operations </SectionTitle> <Paragraph position="0"> Figure 1 shows the range that LVC paraphrasing affects, where the solid boxes denote Japanese base chunks, so-called &quot;bunsetsu.&quot;</Paragraph> </Section> <Section position="3" start_page="2" end_page="2" type="sub_section"> <SectionTitle> </SectionTitle> <Paragraph position="0"> Being involved in the paraphrasing, the modifiers of the LVC need the following operations: Change of the dependence: The dependences of the elements (a) and (b) need to be changed because the original modifiee, the light-verb, is eliminated by the paraphrasing.</Paragraph> <Paragraph position="1"> Re-conjugation: The conjugation forms of the elements (d), (e), and occasionally (c) need to be changed according to the category change of their modifiee, the nominalized verb.</Paragraph> <Paragraph position="2"> Reassignment of the cases: As described in the previous section, the case markers of the elements (b) and often (c) need to be reassigned. Selection of the voice: The voice of the nominalized verb needs to be chosen according to the combination of the nominalized verb, the light-verb, and the original voice.</Paragraph> <Paragraph position="3"> The first two operations are trivial in the field of text generation. Moreover, they can be done independently of the LVC paraphrasing. The most delicate operation is for the element (c) because it acts either as an adverb or as a case, depending on the context. In the former case, it needs the second operation. In the latter case, it needs the third operation, as does the element (b). (Note: The modifiee of the LVC is not affected because the light-verb and the main verb have the same part of speech.)
Verb | LCS for verb | Verb phrase
move | [y MOVE TO z] | My sister (Theme) moves to a neighboring town (Goal).
transmit | [x CONTROL [y MOVE TO z]] | The enzyme (Agent) transmits messages (Theme) to the muscles (Goal).
locate | [y BE AT z] | The school (Theme) locates near the river (Goal).
maintain | [x CONTROL [y BE AT z]] | He (Agent) maintains a machine (Theme) in good condition (Goal).
</Paragraph> <Paragraph position="4"> In this paper, we take into account only the element (b), namely, the sibling cases of the nominalized verb.</Paragraph> </Section> </Section> </Paper>