File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/90/c90-2061_metho.xml
Size: 18,784 bytes
Last Modified: 2025-10-06 14:12:25
<?xml version="1.0" standalone="yes"?> <Paper uid="C90-2061"> <Title>A Type-theoretical Analysis of Complex Verb Generation</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 The problem of complex </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.1 The difference between languages </SectionTitle> <Paragraph position="0"> In this section, we briefly describe the difficulties of complex verb translation. The term complex verb refers to those verb structures that are formed with modal verbs and other auxiliary verbs attached to the original base verb in order to express modality or aspect.</Paragraph> <Paragraph position="1"> For example, in the complex verb: seems to have been swimming (1) 'swim' is in both the progressive aspect and the past tense, and 'seem' is attached to it with the agreement inflection for the third person singular. Such a structure is often hard to translate into other languages, both because surface structures differ between languages and because some aspectual/modal concepts cannot be found in the target language during the generation process (see footnote 1). We admit that the most recalcitrant problem in verb phrase translation is the difference among the systems of tense, aspect, mood, and modality. Hence, we can hardly find a verb element in the target language that corresponds exactly in meaning to the original word. Footnote 1: The systems of tense, aspect, and modality differ from language to language, and their typology has been discussed in linguistics [1], [2], [9].</Paragraph> <Paragraph position="2"> However, we also need to admit that the problem of knowledge representation of tense and aspect (namely time), or of mood and modality, is not only a problem of verb translation but one of natural language understanding as a whole. 
In this paper, in order to realize a type calculus in syntax, we start from a rather mundane subset of tenses, aspects, and modalities, which will be discussed in section 2.</Paragraph> <Paragraph position="3"> 1.2 The difference in the structure In the Japanese translation of (1): oyoi-de-i-ta-rashii the syntactic main verb does not change from oyoi (swim), which is the original meaning carrier, and the past tense is replaced by -ta-, which does not distinguish between past and perfect in Japanese. The characteristic feature of English verb complex derivation is the alternation of the syntactic main verb, described in fig. 1. On the</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.3 The existence of external modalities </SectionTitle> <Paragraph position="0"> In the history of machine translation, the analysis of complex verb structure seems to have been rather neglected. Even in an advanced machine translation sys-</Paragraph> <Paragraph position="2"> However, this flat description fails in the following cases.</Paragraph> <Paragraph position="3"> First, certain modalities can have their own negation. The sentence You must play. (□[you play]) can be negated in two ways: The former is the negation of 'must' while the latter is the negation of the sentence, where in the parentheses above □ is the necessity operator and ¬ is the negation operator. A similar thing can be said for tense. Certain complex verbs such as 'seem to be' have two positions that can be tensed, as below: seemed to have been It has been argued that there are so-called external modal expressions, which are embedded from outside the sentence ([3], [4], [10], [11], and [14]). The key issue is that those external expressions concern only the speaker's attitude toward the sentence in illocutionary situations, expressing deonticity, uncertainty, and so on, and are indifferent to the sentence subject. 
Although whether a certain modal verb is connected to the speaker or not is open to controversy (see footnote 2), we will address this matter in the following formalization of complex verb translation.</Paragraph> </Section> </Section> <Section position="3" start_page="0" end_page="354" type="metho"> <SectionTitle> 2 Formalization of complex verb translation </SectionTitle> <Paragraph position="0"> We give a formal definition of modality as a function here, and discuss a strategy of verb complex generation in which each modal element is itself used as a function. In this section we use the term 'modal functions' for both modal operators and aspectual operators as long as there is no confusion.</Paragraph> <Paragraph position="1"> Footnote 2: Although the concept of modality is based on the epistemic/deontic distinction between possibility and necessity, as in 'can', 'may', and 'must', the term is sometimes used in a broader sense that includes 'will' (volitive) or 'can' (ability). We distinguish externality by independent tense and polarity (positive/negative), so that broad-sense modal verbs such as volitive and ability are regarded as internal, while narrow-sense ones are often external.</Paragraph> <Paragraph position="2"> Externality To clarify the externality of modal functions mentioned in the previous section, we give the following definition: Definition of externality: We call those modal functions which can have their own tense and negation external; otherwise we call them internal.</Paragraph> <Paragraph position="3"> Interlingua Here we propose simplified sets of modalities, aspects, and tenses that are recognizable both in English and in Japanese, mostly based upon the work of [7].</Paragraph> <Paragraph position="4"> tense = {past, present, future} aspect = {progressive, perfect, iterative, inchoative, terminative} external modality = {possible, necessary} × {epistemic, deontic} or {hearsay, seem} internal modality = {able} The two 
dimensional analysis of epistemic and deontic is considered, reflecting the duality in meaning of 'may' and 'must'.</Paragraph> <Paragraph position="5"> Verb elements as function We assume that the crude result of the parsing process for a complex verb is a list of verb elements such as: past1, seem, past2, progressive, swim, ... (2) Our objective is: * to construct the interlingua expression from these verb elements, such as: (+ed(seem))((+ed(progressive))(swim)) * and to derive a surface structure of the target language from the interlingua expression.</Paragraph> <Paragraph position="6"> In order to do this, we regard each verb element as a function. The main concern here is the domain of each function. For example, an external &quot;past&quot; operator +ed should operate upon external verbs only, not upon internal verbs. We will realize this idea independently of the intra-grammar of complex verbs in each language, in the following section.</Paragraph> <Paragraph position="7"> We express a verb complex as a composition of a root-word, tense, negation, and verb-complement (vcomp), in which root and verb-form are included; vcomp may be included recursively.</Paragraph> <Paragraph position="8"> Fig. 3 is an example of a λ-function for perfect: λx^{verb-structure}.perfect(x), which takes a verb structure as a parameter and produces a more complicated structure.</Paragraph> </Section> <Section position="4" start_page="354" end_page="354" type="metho"> <SectionTitle> 3 Type inference for complex verb composition </SectionTitle> <Paragraph position="0"> We will make use of polymorphic type theory [6] in the further formalization. The reason we adopt the type formalism is to realize &quot;the internality of concatenation rules&quot;; namely, each verb element should know which elements to operate upon, instead of being given a set of grammar rules for concatenation. 
Behind this purpose lie the following two issues: * parallel computation: to realize fast parallel computation on forthcoming parallel architectures * partial translation: for machine translation to be robust against ill-founded information in the lexicon</Paragraph> <Section position="1" start_page="354" end_page="354" type="sub_section"> <SectionTitle> 3.1 Types for verb phrase </SectionTitle> <Paragraph position="0"> First we set up several mnemonic types. They are not terminal symbols of the type calculus, but mnemonic identifiers of certain types.</Paragraph> <Paragraph position="1"> A set of mnemonic types: {root, int, ext, tense_i, neg_i, tense_e, neg_e} For example, we can assume the following Γ as a result of our parser.</Paragraph> <Paragraph position="2"> Γ = {seem : ext, +ed : tense_i, swim : root, progressive : int} where a : α means that a is of type α. (In that parser, each syntactic category in the source language was replaced by a mnemonic type that is actually regarded as a category of the interlingua.) We use φ as a type variable denoting 'verb phrase'. Because we can regard any verb element as a modifier from one verb phrase to another, those mnemonic types are always unified with a type 'φ → φ', except 'root', which is of type φ. Type composition is the process that gradually specifies the internal variables of φ. We introduce here the notion of 'most general unifier (mgu)' by Milner [6], which combines two type expressions and yields the set of instantiations of the variables contained in those type expressions.</Paragraph> </Section> <Section position="2" start_page="354" end_page="354" type="sub_section"> <SectionTitle> 3.2 A simple example </SectionTitle> <Paragraph position="0"> Let us consider an example of concatenating seem with +ed(swim). 
First, assume that we can infer the type of +ed(swim) to be φ_1 from Γ by a variable instantiation θ, as shown below in (3):</Paragraph> <Paragraph position="2"> As for seem : ext, because 'ext' was a mnemonic for a certain modifier of a verb phrase, we can put ext = (φ_1 → φ_2). Hence, for a new type variable φ_2, set η_1 as (4):</Paragraph> <Paragraph position="4"> Note that we have not specified the contents of η_1 yet; we will give them later. We can now infer the type of the combined verb phrase with (4), as shown in fig. 4, where we can use a canonical expression (see [5]). We use '~' to denote that the type of the left-hand side is more instantiated than that of the right-hand side, so that:</Paragraph> </Section> <Section position="3" start_page="354" end_page="354" type="sub_section"> <SectionTitle> 3.3 Full-fledged verb type </SectionTitle> <Paragraph position="0"> In order to analyze the contents of θ in (3) in detail, we need to clarify what tense_i actually does. In this subsection we discuss what each mnemonic means.</Paragraph> <Paragraph position="1"> Concretely, we specify what kinds of internal variables can occur in the various verb types.</Paragraph> <Paragraph position="2"> Our presupposition is that a 'full-fledged' complex verb contains an 'external part' as well as an 'internal part'.</Paragraph> <Paragraph position="3"> Thus a verb type φ is assumed to have two internal type variables, φ_int and φ_ext. Each of them has a 'head' verb, which takes on tense or negation if any exists. We call a compound of tense and negation a 'cap'. For a cap, we assume the construction negation(tense(X)) (see footnote 3). As a result, our full-fledged verb type is as shown in fig. 5. Footnote 3: This formalization follows from considering the following generation process. When we construct the past negative of 'walk', the derivation walk → walked → didn't walk is less natural than walk → don't walk → didn't walk, because it is 'don't' that undergoes the past-tense operation. 
Exactly the same thing happens in Japanese.</Paragraph> <Paragraph position="4"> Let us return to the type inference of +ed(swim), to see the basis of our formalism of type unification. +ed is of type tense_i, and swim is of type root, which was a mnemonic for a simple φ. θ in (3) becomes as follows:</Paragraph> <Paragraph position="6"> Because tense_i itself is not a function, it must be qualified as type cap_i in order to act as a function by itself. We call this qualification 'promotion', meaning that the component raises its type in order to connect with others. A similar thing can be said for root, which must be promoted to φ_int so as to be in the domain of cap_i. Fig. 6 depicts the type promotion.</Paragraph> <Paragraph position="7"> [Fig. 6: type promotion of tense_i and root]</Paragraph> <Paragraph position="9"> A unifier, or what we have so far called a set of instantiations, such as θ or η, is exactly a set of promotions in which some missing verb elements (compared with the full-fledged type) are ignored or replaced by other elements.</Paragraph> <Paragraph position="10"> For example, the contents of θ become as follows:</Paragraph> <Paragraph position="12"/> </Section> <Section position="4" start_page="354" end_page="354" type="sub_section"> <SectionTitle> 3.4 Component embedding </SectionTitle> <Paragraph position="0"> In the type inference of fig. 4, we happened to choose the two items seem and +ed(swim). Actually, we can combine any two items picked from Γ. In this subsection we show an example in which we try to concatenate the result of fig. 4 with progressive : int. In conventional generation, an internal verb element such as 'progressive' must be concatenated to the structure prior to an external element such as 'seem'. However, in our type expression, 'be +ing' can join the 'seem(+ed(swim))' structure and correctly choose the target element 'swim' from the structure, instead of 'seem', which is the outermost element. 
If there is a set of instantiations η_2 such that:</Paragraph> <Paragraph position="2"> then we can validate the inference in fig. 7. However, because the domain of int must be a 'cap'-less φ_int, we cannot legalize the type inference of fig. 7 immediately.</Paragraph> <Paragraph position="3"> What we are required to do now is a type 'demotion', as opposed to promotion. Roughly speaking, the verb type of seem(+ed(swim)) is regarded as below: (6) though each element of the right-hand side of (6) is promoted to some qualified type. This means that the history of promotions must be included in the unifier θ. Hence, we can scrutinize the contents of θ to find where int can be embedded. In this case, {root/head_i, head_i/φ_int} in θ (viz. φ_int ~ int) should be demoted, and we can redefine the verb type of (6) as: ext(tense_i(root)) → ext(tense_i(int(root))) Suppose that η_3 replaces the history of promotions and demotions as below: (Γ ∪ {φ : int})θη_1η_3 ⊢ seem(+ed(swim)) : φ_4 then we can make the inference in fig. 8. In this case,</Paragraph> <Paragraph position="5"> there is only one place for int to be embedded, between +ed and swim, and int operates upon a root.</Paragraph> <Paragraph position="6"> This time, we can make an inference from an abstracted verb structure, as shown in fig. 9.</Paragraph> </Section> <Section position="5" start_page="354" end_page="354" type="sub_section"> <SectionTitle> 3.5 Head shifting </SectionTitle> <Paragraph position="0"> We mentioned that another advantage of the type formalism is its partiality. Actually, we can compose a verb structure from any part of a given interlingua set, and this feature is realized dynamically. The computation of verb complex generation may be stopped at any time by ill-founded information in the machine translation system. Even in such a case, our formalism can offer the part of the surface structure that has been completed so far. 
The type definition above gives us an important clue as to how to compose verb elements. The algorithm is as follows: 1. Pick up an original meaning base.</Paragraph> <Paragraph position="1"> 2. Apply internal modal functions, while ..</Paragraph> <Paragraph position="2"> 3. Shift the internal tense and negation to a newly applied modal function.</Paragraph> <Paragraph position="3"> 4. Apply external modal functions, while ..</Paragraph> <Paragraph position="4"> 5. Shift the external tense and negation to a newly applied modal function.</Paragraph> <Paragraph position="5"> The position shift of tense, caused by the arrival of a new internal verb, is diagrammed in fig. 10. Fig. 11 shows that every step in the derivation process can offer a partial syntactic tree.</Paragraph> </Section> </Section> <Section position="5" start_page="354" end_page="354" type="metho"> <SectionTitle> 4 Discussion </SectionTitle> <Paragraph position="0"> We have shown a model of complex verb translation based upon type theory, in which verb elements in the interlingua are regarded as generation functions whose domain and range are made explicit, so that each verb element is guaranteed to acquire a correct position in the target structure. Furthermore, the flexibility of the type calculus was shown on the following two points.</Paragraph> <Paragraph position="1"> 1. We do not need to specify all the type variables every time, so that all the information a type holds can always be regarded as partial. This means that we can translate partially whatever can be translated.</Paragraph> <Paragraph position="2"> 2. 
Because the order of calculation is not specified in the type expression, the verb structure can be composed in a self-organizing way, in the sense that the structure can be decomposed and reorganized in the process.</Paragraph> <Paragraph position="3"> In the case of complex verb translation, rephrasing into another part of speech is rather easy; we only need to 'kick out' the functional expression from the verb structure to point to another lexical item. Some external expressions must be translated into a complex sentence as a consequence of externality, as we discussed in section 1.3. We give here a definition of the identity in meaning between an external verb and its corresponding complex sentence, in the type-theoretical view, as (7): ⟨agent, φ(v)⟩ : σ ⟹ ⟨⟨agent, v⟩, ψ⟩ : σ_comp (7) where we denote a sentence informally by ⟨agent, action(state)⟩, σ is the type of a sentence, and σ_comp is the type of a complex sentence. We can adopt (7) as the definition of an external verb; namely, we call a verb φ external when, for φ, there is another verb expression ψ which enables the type inference (7).</Paragraph> <Paragraph position="4"> Recent studies of categorial grammar such as [13], as well as the historical achievement of Montague semantics, support the efficacy of type expressions. The type calculus is specific neither to complex verbs nor to generation in machine translation, so it is applicable to any generation process for natural languages. Our next goal is to apply this generation mechanism to categorial grammar as a whole.</Paragraph> </Section> </Paper>