File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/00/w00-0205_intro.xml
Size: 12,883 bytes
Last Modified: 2025-10-06 14:00:51
<?xml version="1.0" standalone="yes"?> <Paper uid="W00-0205"> <Title>Telicity as a Cue to Temporal and Discourse Structure in Chinese-English Machine Translation*</Title> <Section position="3" start_page="34" end_page="37" type="intro"> <SectionTitle> 1 LCS representations in our system have been created for </SectionTitle> <Paragraph position="0"> Korean, Spanish and Arabic, as well as for English and Chinese. first sentence below, or explicit in a single sentence, as in the second sentence below. Implicit event-state modification (sentence 3) is prohibited.</Paragraph> <Paragraph position="1"> * Wade bought a car. He needed a way to get to work.</Paragraph> <Paragraph position="2"> * Wade bought a car because he needed a way to get to work.</Paragraph> <Paragraph position="3"> * * Wade bought a car he needed a way to get to work.</Paragraph> <Paragraph position="4"> It is exactly this third type that is permitted in standard Chinese and robustly attested in our data. If the LCS is to be truly an interlingua, we must extend the representation to allow these kinds of sentences to be processed. One possibility is to posit an implicit position connecting the situations described by the multiple clauses. In the source language analysis phase, this would amount to positing a disjunction of all possible position relations implicitly realizable in this language. Another option is to relax the well-formedness constraints to allow an event to directly modify another event. This not only fails to recognize the regularities we see in English (and other-language) LCS structures; for Chinese it merely pushes the problem back one step, since the set of implicitly realizable relations may vary from language to language, which may result in some ungrammatical or misleading translations. 
The second option can be augmented, however, by factoring out of the interlingua (and into the generation code) language-specific principles for generating connectives using information in the LCS structure proper. For the present, this is the approach we take, using lexical aspectual information, as read from the LCS structure, to generate appropriate temporal relations. Therefore not only tense but also inter-sentential discourse relations must be considered when generating English from Chinese, even at the sentence level. We report on a project to generate both temporal and discourse relations using the LCS representation. In particular, we focus on the encoding of the lexical aspect feature TELICITY and its complement ATELICITY to generate past and present tense, and corresponding temporal relations for modifying clauses within sentences. While we cannot at present directly capture discourse relations, we can garner aspectual class from LCS verb classification, which in turn can be used to predict the appropriate tense for translations of Chinese verbs into English.</Paragraph> <Paragraph position="5"> We begin with a discussion of aspectual features of sentences, and how this information can be used to provide information about the time of the situations presented in a sentence. Such information can be used to help provide clues as to both tense and relationships (and cue words) between connected situations. Aspectual features can be divided into grammatical aspect, which is indicated by lexical or morphological markers in a sentence, and lexical aspect, which is inherent in the meanings of words.</Paragraph> <Section position="1" start_page="35" end_page="35" type="sub_section"> <SectionTitle> 2.1 Grammatical aspect </SectionTitle> <Paragraph position="0"> Grammatical aspect provides a viewpoint on situation (event or state) structure (Smith, 1997). 
Since imperfective aspect, such as the English PROGRESSIVE construction be VERB-ing, views a situation from within, it is often associated with present or contemporaneous time reference. On the other hand, perfective aspect, such as the English have VERB-ed, views a situation as a whole; it is therefore often associated with past time reference (Comrie, 1976; Olsen, 1997; Smith, 1997; cf. Chu, 1998). The temporal relations are tendencies, rather than an absolute correlation: although the perfective is found more frequently in past tenses (Comrie, 1976), both imperfective and perfective co-occur in some languages with past, present, and future tense.</Paragraph> <Paragraph position="1"> In some cases, an English verb will specify tense and/or aspect for a complement. For example, continue requires either an infinitive (3)a or progressive complement (3)b (and subject drop), while other verbs like say do not place such restrictions (3)c,d.</Paragraph> <Paragraph position="2"> (3) a. Wolfe continued to publicize the baseless criticism on various occasions b. Wolfe continued publicizing the baseless criticism on various occasions c. He said the asia-pacific region already became a focal point region d. He said the asia-pacific region already is becoming a focal point region</Paragraph> </Section> <Section position="2" start_page="35" end_page="37" type="sub_section"> <SectionTitle> 2.2 Lexical aspect </SectionTitle> <Paragraph position="0"> While grammatical aspect and overt temporal cues are clearly helpful in translation, there are many cases in our corpus in which such cues are not present. These are the hard cases, where we must infer tense or grammatical aspectual marking in the target language from a source that looks like it provides no overt cues. We will show, however, that Chinese does provide implicit cues through its lexical aspect classes. 
First, we review what lexical aspect is.</Paragraph> <Paragraph position="1"> Lexical aspect refers to the type of situation denoted by the verb, alone or combined with other sentential constituents. Verbs are assigned to lexical aspect classes based on their behavior in a variety of syntactic and semantic frames that focus on three aspectual features: telicity, dynamicity and durativity. We focus on telicity, also known as BOUNDEDNESS.</Paragraph> <Paragraph position="2"> Verbs that are telic have an inherent end: winning, for example, ends with the finish line. Verbs that are atelic do not name their end: running could end with a distance (run a mile) or an endpoint (run to the store), for example. Olsen (1997) proposed that aspectual interpretation be derived through monotonic composition of marked privative features [+/0 dynamic], [+/0 durative] and [+/0 telic], as shown in Table 1 (Olsen, 1997, pp. 32-33).</Paragraph> <Paragraph position="3"> With privative features, other sentential constituents can add to features provided by the verb but not remove them. On this analysis, the [+durative, +dynamic] features of run propagate to the sentence level in run to the store; the [+telic] feature is added by the NP or PP, yielding an accomplishment interpretation. The feature specification of this compositionally derived accomplishment is therefore identical to that of a sentence containing a telic accomplishment verb, such as destroy.</Paragraph> <Paragraph position="4"> According to many researchers, knowledge of lexical aspect--how verbs denote situations as developing or holding in time--may be used to interpret event sequences in discourse (Dowty, 1986; Moens and Steedman, 1988; Passonneau, 1988). 
In particular, Dowty suggests that, absent other cues, a telic event is interpreted as completed before the next event or state, as with ran into the room in (4)a; in contrast, atelic situations, such as run and was hungry in (4)b and (4)c, are interpreted as contemporaneous with the following situations: turned on her walkman and made a pizza, respectively.</Paragraph> <Paragraph position="5"> (4) a. Mary ran into the room. She turned on her walkman.</Paragraph> <Paragraph position="6"> b. Mary ran. She turned on her walkman.</Paragraph> <Paragraph position="7"> c. Mary was hungry. She made a pizza.</Paragraph> <Paragraph position="8"> Smith similarly suggests that in English all past events are interpreted as telic (Smith, 1997; but cf. Olsen, 1997).</Paragraph> <Paragraph position="9"> Also, these tendencies are heuristic, and not absolute, as shown by the examples in (5). While we get the expected prediction that the jumping occurs after the explosion in (5)a, we get the reverse prediction in (5)b. Other factors such as consequences of described situations, discourse context, and stereotypical causal relationships also play a role.</Paragraph> <Paragraph position="10"> (5) a. The building exploded. Mary jumped.</Paragraph> <Paragraph position="11"> b. The building exploded. Chunks of concrete flew everywhere.</Paragraph> <Paragraph position="12"> Lexical Conceptual Structure (Dowty, 1979; Guerssel et al., 1985)--an augmented form of (Jackendoff, 1983; Jackendoff, 1990)--permits lexical aspect information to be read directly off the lexical entries for individual verbs, as well as composed representations for sentences, using uniform processes and representations. The LCS framework consists of primitives (GO, BE, STAY, etc.), types (Event, State, Path, etc.) 
and fields (Loc(ational), Temp(oral), Poss(essional), Ident(ificational), Perc(eptual), etc.).</Paragraph> <Paragraph position="13"> We adopt a refinement of the LCS representation, incorporating meaning components from the linguistically motivated notion of lexical semantic template (LST), based on lexical aspect classes, as defined in the work of Levin and Rappaport Hovav (Levin and Rappaport Hovav, 1995; Rappaport Hovav and Levin, 1995). Verbs that appear in multiple aspectual frames appear in multiple pairings between constants (representing the idiosyncratic meaning of the verb) and structures (the aspectual class).</Paragraph> <Paragraph position="14"> Since the aspectual templates may be realized in a variety of ways, other aspects of the structural meaning contribute to differentiating the verbs from each other. Our current database contains some 400 classes, based on an initial representation of the 213 classes in Levin (1993). Our current working lexicon includes about 10,000 English verbs and 18,000 Chinese verbs distributed across these classes.</Paragraph> <Paragraph position="15"> Telic verbs (and sentences) contain certain types of Paths, or a constant, represented by ! !, filled by the verb constant, in the rightmost leaf-node argument. Some examples are shown below: depart (go loc (* thing 2)</Paragraph> <Paragraph position="17"> Each of these telic verbs has a potential counterpart with an atelic verb plus the requisite path.</Paragraph> <Paragraph position="18"> Depart, for example, corresponds to move away, or something similar in another language.</Paragraph> <Paragraph position="19"> We therefore identify telic sentences by the algorithm formally specified in Figure 1 (cf. (Dorr and Olsen, 1997b) [156]).</Paragraph> <Paragraph position="20"> Given an LCS representation L: 1. Initialize: T(L):=[0T], D(L):=[0D], R(L):=[0R] 2. 
If Top node of L ∈ {CAUSE, LET, GO}, then T(L):=[+T]. This algorithm applies to the structural primitives of the interlingua structure rather than actual verbs in source or target language. The first step initializes the aspectual values as unspecified: atelic [0T], stative (not event: [0D]), and adurative [0R]. Next, the top node is examined for primitives that indicate telicity: if the top node is CAUSE, LET or GO, telicity is set to [+T], as with the verbs break and destroy, for example. (The node is further checked for dynamicity [+D] and durativity [+R] indicators, not in focus in this paper.) If the top node is not a telic indicator (i.e., the verb is a basically atelic predicate such as love or run), telicity may still be indicated by the presence of complement nodes of particular types: e.g., a goal phrase (to primitive) in the case of run. The same algorithm may be used to determine telicity in either individual verbal entries (break but not run) or composed sentences (John ran to the store but not John ran).</Paragraph> <Paragraph position="21"> Similar mismatches of telicity between representations of particular predicates can occur between languages, although there is remarkable agreement as to the set of templates that verbs with related meanings will fit into (Olsen et al., 1998). In the Chinese-English interlingual system we describe, the Chinese is first mapped into the LCS, a language-independent representation, from which the target-language sentence is generated. Since telicity (and other aspects of event structure) is uniformly represented at the lexical and the sentential level, telicity mismatches between verbs of different languages may then be compensated for by combining verbs with other components.</Paragraph> </Section> </Section> </Paper>