<?xml version="1.0" standalone="yes"?> <Paper uid="E89-1037"> <Title>TRANSLATION BY STRUCTURAL CORRESPONDENCES</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> A GENERAL ARCHITECTURE FOR LINGUISTIC DESCRIPTIONS </SectionTitle> <Paragraph position="0"> Our approach uses the equality- and description-based mechanisms of Lexical-Functional Grammar. As introduced by Kaplan and Bresnan (1982), lexical-functional grammar assigns to every sentence two levels of syntactic representation, a constituent structure (c-structure) and a functional structure (f-structure). These structures are of different formal types--the c-structure is a phrase-structure tree while the f-structure is a hierarchical finite function--and they characterize different aspects of the information carried by the sentence. The c-structure represents the ordered arrangement of words and phrases in the sentence while the f-structure explicitly marks its grammatical functions (subject, object, etc.). For each type of structure there is a special notation or description-language in which the properties of desirable instances of that type can be specified. Constituent structures are described by standard context-free rule notation (augmented with a variety of abbreviatory devices that do not change its generative power), while f-structures are described by Boolean combinations of function-argument equalities stated over variables that denote the structures of interest. Kaplan and Bresnan assumed a correspondence function mapping between the nodes in the c-structure of a sentence and the units of its f-structure, and used that piecewise function to produce a description of the f-structure (in its equational language) by virtue of the mother-daughter, order, and category relations of the c-structure. The formal picture developed by Kaplan and Bresnan, as clarified in Kaplan (1987), is illustrated in the following structures for sentence (1): (I) (a) The baby fell.</Paragraph> <Paragraph position="1"> (b) C/</Paragraph> <Paragraph position="3"> The c-structure appears on the left, the f-structure on the right. The c-structureto-f-structure correspondence, ~b, is shown by the linking lines. The correspondence C/ is a many-to-one function taking the S, VP and V nodes all into the same outermost unit of the f-stucture, fl.</Paragraph> <Paragraph position="4"> The node-configuration at the top of the tree satisfies the statement S~NP VP in the context-free description language for the c-structure. As suggested by Kaplan (1987}, this is a simple way of defining a collection of more specific properties of the tree, such as the fact that the S node (labeled nl) is the mother of the NP node (n2). These facts could also be written in equational form as M(n2)=nl, - 273 where M denotes the function that takes a tree-node into its mother. Similarly, the outermost f-structure satisfies the assertions</Paragraph> <Paragraph position="6"> language. Given the illustrated correspondence, we also know that fl=d~(nl) and f2--~b(n2). Taking all these propositions together, we can infer first that</Paragraph> <Paragraph position="8"> identifies the subject in the f-structure in terms of the mother-daughter relation in the tree.</Paragraph> <Paragraph position="9"> In LFG the f-structure assigned to a sentence is the smallest one that satisfies the conjunction of equations in its functional description. 
The functional description is determined from the trees that the c-structure grammar provides for the string by a simple matching process. A given tree is analyzed with respect to the c-structure rules to identify particular nodes of interest. Equations about the f-structure corresponding to those nodes (via φ) are then derived by substituting those nodes into equation-patterns or schemata.</Paragraph> <Paragraph position="10"> Thus, still following Kaplan (1987), if * appears in a schema to stand for the node matching a given rule-category, the functional description will include an equation containing that node (or an expression such as n2 that designates it) instead of *. The equation (φ(M(n2)) SUBJ)=φ(n2) that we inferred above also results from instantiating the schema (φ(M(*)) SUBJ)=φ(*) annotated to the NP element of the S rule in (2a) when that rule-element is matched against the tree in (1b). Kaplan observes that the ↑ and ↓ metavariables in the Kaplan/Bresnan formulation of LFG are simply convenient abbreviations for the complex expressions φ(M(*)) and φ(*), respectively, thus explicating the traditional, more palatable formulation in (2b).</Paragraph> <Paragraph position="11"> (2) (a) S → NP VP with the annotations (φ(M(*)) SUBJ)=φ(*) on NP and φ(M(*))=φ(*) on VP; (b) S → NP VP with the annotations (↑ SUBJ)=↓ on NP and ↑=↓ on VP</Paragraph> <Paragraph position="13"> This basic conception of descriptions and correspondences has been extended in several ways. First, this framework has been generalized to additional kinds of structures that represent other subsystems of linguistic information (Kaplan, 1987; Halvorsen, 1988).</Paragraph> <Paragraph position="14"> These structures can be related by new correspondences that permit appropriate descriptions of more abstract structures to be produced. Halvorsen and Kaplan (1988), for example, discuss a level of semantic structure that encodes predicate-argument relations and quantifier scope, information that does not enter into the kinds of syntactic generalizations that the f-structure supports.</Paragraph> <Paragraph position="15"> They point out how the semantic structure can be set in correspondence with both c-structure and f-structure units by means of related mappings σ and σ′. Kaplan (1987) raises the possibility of further distinct structures and correspondences to represent anaphoric dependencies, discourse properties of sentences, and other projections of the same string.</Paragraph> <Paragraph position="16"> Second, Kaplan (1988) and Halvorsen and Kaplan (1988) discuss other methods for deriving the descriptions necessary to determine these abstract structures. The arrangement outlined above, in which the description of one kind of structure (the f-structure) is derived by analyzing or matching against another one, is an example of what is called description-by-analysis. The semantic interpretation mechanisms proposed by Halvorsen (1983) and Reyle (1988) are other examples of this descriptive technique. In this method the grammar provides general patterns to compare against a given structure and these are then instantiated if the analysis is satisfactory. One consequence of this approach is that the structure in the range of the correspondence, the one whose description is being developed, can only have properties that are derived from information explicitly identified in the domain structure.</Paragraph> <Paragraph position="17"> Another description mechanism is possible when three or more structures are related through correspondences. 
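Before turning to that mechanism, a small sketch of our own may make the instantiation step of description-by-analysis concrete (the function name and the string-based encoding of schemata are invented for exposition and are not taken from the paper): matching a rule element against a tree node substitutes that node for the metavariables and yields an equation of the functional description.

# Illustrative sketch (ours): instantiating an annotation schema.  The
# metavariables stand for phi(M(*)) and phi(*); substituting the matched node
# for * turns the schema into an equation about the f-structure.

def instantiate(schema: str, node: str) -> str:
    """Expand the up/down metavariables of a schema for a matched node."""
    up = f"phi(M({node}))"    # up-arrow abbreviates phi(M(*))
    down = f"phi({node})"     # down-arrow abbreviates phi(*)
    return schema.replace("↑", up).replace("↓", down)

# The NP element of the rule S -> NP VP carries (↑ SUBJ) = ↓; matching it
# against node n2 of the tree for "The baby fell" yields:
print(instantiate("(↑ SUBJ) = ↓", "n2"))
# prints: (phi(M(n2)) SUBJ) = phi(n2)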
Suppose the c-structure and f-structure are related by φ as in (2a) and that the function σ then maps the f-structure units into corresponding units of semantic structure of the sort suggested by Fenstad et al. (1987). The formal arrangement is shown in Figure 1 (next page). This configuration of cascaded correspondences opens up a new descriptive possibility. If σ and φ are both structural correspondences, then so is their composition σ ∘ φ. Thus, even though the units of the semantic structure correspond directly only to the units of the f-structure and have no immediate connection to the nodes of the c-structure, a semantic description can be formulated in terms of c-structure relations.</Paragraph> <Paragraph position="20"> The expression σ(φ(M(*))) can appear on a c-structure rule-element to designate the semantic-structure unit corresponding to the f-structure that corresponds to the mother of the node that matches that rule-element.</Paragraph> <Paragraph position="21"> Since projections are monadic functions, we can remove the uninformative parentheses and write (σφM* ARG1)=σ(φM* SUBJ), or, using the metavariable, (σ↑ ARG1)=σ(↑ SUBJ).</Paragraph> <Paragraph position="22"> Schemata such as this can be freely mixed with LFG's standard functional specifications in lexical entries and c-structure rules. For example, the lexical entry for fall might be given as follows: (3) fall V (↑ PRED)='fall<(↑ SUBJ)>', (σ↑ ARG1)=σ(↑ SUBJ)</Paragraph> <Paragraph position="24"> Descriptions formulated by composing separate correspondences have a surprising characteristic: they allow the final range structure (e.g. the semantic structure) to have properties that cannot be inferred from any information present in the intermediate (f-) structure. But those properties can obtain only if the intermediate structure is derived from an initial (c-) structure with certain features. For example, Kaplan and Maxwell (1988a) exploit this capability to describe semantic structures for coordinate constructions which necessarily contain the logical conjunction appropriate to the string even though there is no reasonable place for that conjunction to be marked in the f-structure. In sum, this method of description, which has been called codescription, permits information from a variety of different levels to constrain a particular structure, even though there are no direct correspondences linking them together. It provides for modularity of basic relationships while allowing certain necessary restrictions to have their influence.</Paragraph> <Paragraph position="25"> The descriptive architecture of LFG as extended by Kaplan and Halvorsen provides for multiple levels of structure to be related by separate correspondences, and these correspondences allow descriptions of the various structures to be constructed, either by analysis or composition, from the properties of other structures. Earlier researchers have applied these mechanisms to the linguistic structures for sentences in a single language.</Paragraph> <Paragraph position="26"> In this paper, we extend this system one step further: we introduce correspondences between structures for sentences in different languages that stand in a translation relation to one another. The descriptions of the target-language structures are derived via analysis and codescription from the source-language structures, by virtue of additional annotations in c-structure rules and lexical entries. 
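As a further small illustration (again our own sketch, with invented names and deliberately simplified structures), the composition itself can be modelled directly: if φ and σ are finite maps, then σ ∘ φ designates semantic units purely in c-structure terms, which is all that a codescription schema such as (σφM* ARG1)=σ(φM* SUBJ) needs.

# Illustrative sketch (ours): composing correspondences.  phi maps c-structure
# nodes to f-structure units and sigma maps f-structure units to semantic
# units, so their composition is again a structural correspondence.

f_baby = {"PRED": "baby"}
f_fall = {"PRED": "fall<(SUBJ)>", "SUBJ": f_baby}
s_baby = {"REL": "baby"}
s_fall = {"REL": "fall", "ARG1": s_baby}

phi = {"n1": f_fall, "n2": f_baby}                   # c-structure -> f-structure
sigma = {id(f_fall): s_fall, id(f_baby): s_baby}     # f-structure -> semantics

def sigma_phi(node):
    """sigma(phi(node)): a semantic unit named purely in c-structure terms."""
    return sigma[id(phi[node])]

# The schema (sigma phi M* ARG1) = sigma(phi M* SUBJ), applied at the S node,
# asserts that ARG1 of the clause's semantics is the semantics of its SUBJ:
assert sigma_phi("n1")["ARG1"] is sigma[id(phi["n1"]["SUBJ"])]

Keying sigma by id() is only a convenience that lets Python dictionaries serve as map keys; it plays no theoretical role.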
Those descriptions are solved to find satisfying solutions, and these solutions are then the input to the target generation process.</Paragraph> <Paragraph position="27"> In the two-language arrangement sketched below, we introduce the τ correspondence to map between the f-structure units of the source language and the f-structure units of the target language. The σ correspondence maps from the f-structure of each language to its own corresponding semantic structure, and a second transfer correspondence τ′ relates those semantic structures.</Paragraph> <Paragraph position="29"> This arrangement allows us to describe the target f-structure by composing φ and τ to form expressions such as τ(φM* COMP)=(τφM* XCOMP), or simply τ(↑ COMP)=(τ↑ XCOMP).</Paragraph> <Paragraph position="30"> This maps a COMP in the source f-structure into an XCOMP in the target f-structure. The relations asserted by this equation are depicted in a source-target diagram (not reproduced here). As another example, the equation τ′(σ↑ ARG1)=(στ↑ ARG1) identifies the first arguments in the source and target semantic structures. The equation τ′σ(↑ SUBJ)=σ(τ↑ TOPIC) imposes the constraint that the semantics of the source SUBJ will translate via τ′ into the semantics of the target TOPIC but gives no further information about what those semantic structures actually contain.</Paragraph> <Paragraph position="31"> Our general correspondence architecture thus applies naturally to the problem of translation. But there are constraints on correspondences specific to translation that this general architecture does not address. For instance, the description of the target-language structures derived from the source language is incomplete. The target structures may and usually will have grammatical and semantic features that are not determined by the source. It makes little sense, for example, to include information about grammatical gender in the transfer process if this feature is exhaustively determined by the grammar of the target language. We can formalize the relation between the information contained in the transfer component and an adequate translation of the source sentence into a target sentence as follows: for a target sentence to be an adequate translation of a given source sentence, it must be the case that a minimal structure assigned to that sentence by the target grammar is subsumed by a minimal solution to the transfer description. One desirable consequence of this formalization is that it permits two distinct target strings for a source string whose meaning in the absence of other information is vague but not ambiguous.</Paragraph> <Paragraph position="32"> Thus this conceptual and notational framework provides a powerful and flexible system for imposing constraints on the form of a target sentence by relating them to information that appears at different levels of source-language abstraction. This apparatus allows us to avoid many of the problems encountered by more derivational, transformational or procedural models of transfer. We will illustrate our proposal with examples that have posed challenges for some other approaches.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> EXAMPLES </SectionTitle> <Paragraph position="0"> Changes in grammatical function. Some quite trivial changes in structure occur when the source and the target predicate differ in the grammatical functions that they subcategorize for. 
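Before the concrete examples, it may help to spell out, in a sketch of our own (the memoised tau function and the bare attribute-value dictionaries are invented for exposition, and no claim is made about the authors' implementation), how a transfer equation such as τ(↑ COMP)=(τ↑ XCOMP) can be read computationally: make the τ-image of the source COMP the XCOMP of the τ-image of the source f-structure.

# Illustrative sketch (ours): reading a transfer equation as an instruction for
# building up a description of the target f-structure.

def tau(unit, _images={}):
    """Memoised transfer correspondence: each source unit has one target image."""
    return _images.setdefault(id(unit), {})

f_comp = {}                        # some source COMP value (details irrelevant here)
f_source = {"COMP": f_comp}        # the source f-structure

# tau(up COMP) = (tau up XCOMP): add an XCOMP attribute to the target image
tau(f_source)["XCOMP"] = tau(f_source["COMP"])

target = tau(f_source)
assert target["XCOMP"] is tau(f_comp)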
We will illustrate this with an example in which a German transitive verb is translated with an intransitive verb taking an oblique complement in French: (6) (a) Der Student beantwortet die Frage. (b) L'étudiant répond à la question.</Paragraph> <Paragraph position="1"> We treat the oblique preposition as a PRED that itself takes an object. Ignoring information about tense, the lexical entry for beantworten in the German lexicon looks as follows: (7) beantworten V (↑ PRED)='beantworten<(↑ SUBJ)(↑ OBJ)>' while the transfer lexicon for beantworten contains the following mapping specifications: (8) beantworten V (τ↑ PRED FN)=répondre, (τ↑ SUBJ)=τ(↑ SUBJ), (τ↑ AOBJ OBJ)=τ(↑ OBJ)</Paragraph> <Paragraph position="3"> We use the special attribute FN to designate the function-name in semantic forms such as 'beantworten<(↑ SUBJ)(↑ OBJ)>'. In this transfer equation it identifies répondre as the corresponding French predicate. This specification controls lexical selection in the target, for example, selecting the following French lexical entry to be used in the translation: (9) répondre V (↑ PRED)='répondre<(↑ SUBJ)(↑ AOBJ)>' With these entries and the appropriate but trivial entries for der Student and die Frage we get, in (10), the f-structure in the source language and the associated f-structure in the target language for the sentences in (6) (display not reproduced). The second structure is the f-structure the grammar of French assigns to the sentence in (6b). This f-structure is the input for the generation process. Other examples of this kind are pairs like like and plaire and help and helfen.</Paragraph> <Paragraph position="4"> In the previous example the effects of the change in grammatical function between the source and the target language are purely local. In other cases there is a non-local dependency between the subcategorizing verb and a dislocated phrase. This is illustrated by the relative clause in (11): (11) (a) ...der Brief, den der Student zu beantworten scheint.</Paragraph> <Paragraph position="5"> (b) ...la lettre, à laquelle l'étudiant semble répondre.</Paragraph> <Paragraph position="6"> ...the letter that the student seems to answer.</Paragraph> <Paragraph position="7"> The within-clause functions of the relativized phrases in the source and target language are determined by predicates which may be arbitrarily deeply embedded, but the relativized phrase in the target language must correspond to the one in the source language. Let us assume that relative clauses can be analyzed by the following slightly simplified phrase structure rules, making use of functional uncertainty (see Kaplan and Maxwell 1988b for a technical discussion of functional uncertainty) to capture the non-local dependency of the relativized phrase (equations on the head NP are ignored): (12) [rules not reproduced]</Paragraph> <Paragraph position="9"> We can achieve the desired correspondence between the source and the target by augmenting the first rule with the following transfer equation: (13) (τ↑ REL-TOPIC)=τ(↑ REL-TOPIC)</Paragraph> <Paragraph position="11"> The effect of this rule is that the τ value of the relativized phrase (REL-TOPIC) in the source language is identified with the relativized phrase in the target language. However, the source REL-TOPIC is also identified with a within-clause function, say OBJ, by the uncertainty equation in (12). Lexical transfer rules such as the one given in (8) independently establish the correspondence between source and target within-clause functions. Thus, the target within-clause function will be identified with the target relativized phrase. 
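The interaction can be pictured with another sketch of ours (the dictionaries and the memoised tau are invented, and the embedding under scheinen is flattened for brevity): because the source REL-TOPIC and the source within-clause OBJ are one and the same unit, the structural transfer equation and the lexical transfer equation independently build target paths that end at the same τ-image.

# Illustrative sketch (ours): the target relativized phrase and the target
# within-clause function are identified because tau is a function, so the one
# source unit has exactly one target image.

def tau(unit, _images={}):
    """Memoised transfer correspondence from source to target units."""
    return _images.setdefault(id(unit), {})

letter = {"PRED": "Brief"}
source = {"REL-TOPIC": letter, "OBJ": letter}    # uncertainty equation: same unit

target = tau(source)
target["REL-TOPIC"] = tau(source["REL-TOPIC"])   # structural transfer equation, as in (13)
target["AOBJ"] = {"OBJ": tau(source["OBJ"])}     # lexical transfer, as for beantworten

assert target["REL-TOPIC"] is target["AOBJ"]["OBJ"]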
This necessary relation is accomplished by lexically and structurally based transfer rules that do not make reference to each other.</Paragraph> <Paragraph position="12"> Differences in control. A slightly more complex but similar case arises when the infinitival complement of a raising verb is translated into a finite clause, as in the following: (14) (a) The student is likely to work.</Paragraph> <Paragraph position="13"> (b) Il est probable que l'étudiant travaillera.</Paragraph> <Paragraph position="14"> In this case the necessary information is distributed in the following way over the source, target, and transfer lexicons, as shown in Figure 2. Here the transfer projection builds up an underspecified target structure, to which the information given in the entry of probable is added in the process of generation. Ignoring the contribution of is, the f-structure for the English sentence identifies the non-thematic SUBJ of likely with the thematic SUBJ of work (display not reproduced). The corresponding French structure in (16) contains an expletive SUBJ, il, for probable and an overtly expressed SUBJ for travailler. The latter is introduced by the transfer entry for work (not reproduced here). Again, this f-structure satisfies the transfer description and is also assigned by the French grammar to the target sentence.</Paragraph> <Paragraph position="15"> The use of multiple projections. There is one detail about the example in (14) that needs further discussion. Simplifying matters somewhat, there is a requirement that the temporal reference point of the complement has to follow the temporal reference point of the clause containing likely, if the embedded verb is a process verb. Basically the same temporal relations have to hold in French with probable. The way this is realized will depend on what the tense of probable is, which in turn is determined by the discourse up to that point. A sentence similar to the one given in (14a) but appearing in a narrative in the past would translate as the following:</Paragraph> <Paragraph position="17"> (17) Il était probable que l'étudiant travaillerait.</Paragraph> <Paragraph position="18"> In the general case the choice of a French tense does not depend on the tense of the English sentence alone but is also determined by information that is not part of the f-structure itself. We postulate another projection, the temporal structure, reached from the f-structure through the correspondence χ (from χρονικός, 'temporal'). It is not possible to discuss here the specific characteristics of such a structure. The only thing that we want to express is the constraint that the event in the embedded clause follows the event in the main clause. We assume that the temporal structure contains the following information for likely-to-V, as suggested by Fenstad et al. (1987) (the structure itself is not reproduced here).</Paragraph> <Paragraph position="21"> This is meant to indicate that the temporal reference point of the event denoted by the embedded verb extends after the temporal reference point of the main event. The time of the main event is in part determined by the tense of the verb be, which we ignore here. The only point we want to make is that aspects of these different projections can be specified in different parts of the grammar. We assume that French and English have the same temporal structure but that in the context of likely it is realized in a different way. 
This can be expressed by the following equation: χ↑ = χτ↑</Paragraph> <Paragraph position="23"> Here the identity between χ and χ∘τ provides an interlingua-like approach to this particular subpart of the relation between the two languages. This is diagrammed in Figure 3.</Paragraph> <Paragraph position="24"> Allowing these different projections to simultaneously determine the surface structure seems at first blush to complicate the computational problem of generation, but a moment of reflection will show that that is not necessarily so. Although we have split up the different equations among several projections for conceptual clarity, computationally we can consider them to define one big attribute-value structure with χ and τ as special attributes, so the generation problem in this framework reduces to the problem of generating from attribute-value structures which are formally of the same type as f-structures (see Halvorsen and Kaplan (1988), Wedekind (1988), and Momma and Dörre (1987) for discussion).</Paragraph> <Paragraph position="25"> Differences in embedding. The potential of the system can also be illustrated with a case in which we find one more level of embedding in one language than we find in the other. This is generally the case if a modifier-head relation in the source language is reversed in the target structure. One such example is the relation between the sentences in (20): (20) (a) The baby just fell.</Paragraph> <Paragraph position="26"> (b) Le bébé vient de tomber.</Paragraph> <Paragraph position="27"> One way to encode this relation is given in the following lexical entry for just (remember that all the information about the structure of venir in French will come from the lexicon and grammar of French itself): (21) just ADV (↑ PRED)='just<(↑ ARG)>', (τ↑ PRED FN)=venir, (τ↑ XCOMP)=τ(↑ ARG)</Paragraph> <Paragraph position="29"> This assigns to just a semantic form that takes an ARG function as its argument and maps it into the French venir. This lexical entry is combined with phrase-structure rule (22). This rule introduces sentence adverbs and makes the f-structure corresponding to the S node fill the ARG function in the f-structure corresponding to the ADV node.</Paragraph> <Paragraph position="31"> Note that the f-structure of the ADV is not assigned a function within the S-node's f-structure, which is shown in (23). This is in keeping with the fact that the adverb has no functional interactions with the material in the main clause.</Paragraph> <Paragraph position="32"> The relation between the adverb and the clause is instead represented only in the f-structure associated with the ADV node (not reproduced here). In the original formulation of LFG, the f-structure of the highest node was singled out and assigned a special status. In our current theory we do not distinguish that structure from all the others in the range of φ: the grammatical analysis of a sentence includes the complete enumeration of φ-associations. The S-node's f-structure typically does contain the f-structures of all other nodes as subsidiary elements, but not in this adverbial case. The target structures corresponding to the various f-structures are also not required to be integrated. These target f-structures can then be set in correspondence with any nodes of the target c-structure, subject to the constraints imposed by the target grammar. In this case, the fact that venir takes an XCOMP which corresponds to the ARG of just means that the target f-structure mapped from the ADV's f-structure will be associated with the highest node of the target c-structure. 
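A last sketch of our own (much simplified; the attribute names mirror the discussion above, but the Python encoding and the lexical details are assumptions, not the paper's) shows the reversal of embedding: the ADV's f-structure takes the clause's f-structure as its ARG, and the transfer mapping for just makes the τ-image of that ARG the XCOMP of the venir structure, so the adverb's image ends up as the outermost target unit.

# Illustrative sketch (ours): "The baby just fell" -> "Le bébé vient de tomber".

def tau(unit, _images={}):
    """Memoised transfer correspondence from source to target f-structure units."""
    return _images.setdefault(id(unit), {})

f_clause = {"PRED": "fall<(SUBJ)>", "SUBJ": {"PRED": "baby"}}
f_adv = {"PRED": "just<(ARG)>", "ARG": f_clause}   # the S's f-structure fills ARG of ADV

tau(f_adv)["PRED"] = {"FN": "venir"}               # transfer mapping for just
tau(f_adv)["XCOMP"] = tau(f_adv["ARG"])            # ARG of just -> XCOMP of venir
tau(f_clause)["PRED"] = {"FN": "tomber"}           # transfer mapping for fall (assumed)

root = tau(f_adv)     # the outermost target unit is the image of the ADV's f-structure
assert root["XCOMP"] is tau(f_clause)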
The above analysis does not require a single integrated source structure to map onto a single integrated target structure. An alternative analysis can handle differences of embedding with completely integrated structures. If we assign an explicit function to the adverbial in the source sentence, we can reverse the embedding in the target by replacing (22) with (26): (26) S → NP (ADV) VP, with the annotation (↑ SADJ)=↓ on the ADV</Paragraph> <Paragraph position="34"> In this case the embedded f-structure of the source adverb will be mapped onto the f-structure that corresponds to the root node of the target c-structure, whereas the f-structure of the source S is mapped onto the embedded XCOMP in the target. The advantages and disadvantages of these different approaches will be investigated further in Netter and Wedekind (forthcoming).</Paragraph> </Section> </Paper>