<?xml version="1.0" standalone="yes"?> <Paper uid="C00-2149"> <Title>Context-Free Grammar Rewriting and the Transfer of Packed Linguistic Representations</Title> <Section position="2" start_page="0" end_page="1017" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> There is currently much interest in translation models that support some amount of ambiguity preservation between source and target texts, so as to minimize disambiguation decisions that the system, or an interactive user, has to make during the translation process (Kay et al., 1994).</Paragraph> <Paragraph position="1"> An important aspect of such models is the ability to handle, during all the stages of the translation process, packed linguistic structures, that is, structures which factorize in a compact fashion all the different readings of a sentence and obviate the need to list and treat all these readings in isolation from each other (as is standard in more traditional models for machine translation).</Paragraph> <Paragraph position="2"> In the case of parsing, and more specifically, parsing with unification-based formalisms such as LFG, techniques for producing packed structures have been in existence for some time (Maxwell and Kaplan, 1991; Maxwell and Kaplan, 1993; Maxwell and Kaplan, 1996; Dörre, 1997; Dymetman, 1997). More recently, techniques have been appearing for generation from packed structures (Shemtov, 1997), transfer between packed structures (Emele and Dorna, 1998; Rayner and Bouillon, 1995), and the integration of such mechanisms into the whole translation process (Kay, 1999; Frank, 1999).</Paragraph> <Paragraph position="3"> This paper focuses on the problem of transfer. The method proposed is related to those of (Emele and Dorna, 1998) and (Kay, 1999). 
As in these approaches, we view packed representations as being descriptions of a finite collection of directed labelled graphs (similar to the functional structures of LFG), each representing a different non-ambiguous reading, which share certain subparts.</Paragraph> <Paragraph position="4"> The representations of (Emele and Dorna, 1998) and (Kay, 1999) are based on a notion of propositional contexts (see (Maxwell and Kaplan, 1991)), where each possible non-ambiguous reading included in the packed source representation is extracted by selecting the value (true or false) of a certain number of propositional variables that index elements of the labelled source graph. Transfer is then seen as a process of rewriting source graph elements (e.g. nodes labelled with French lexemes) into target graph elements (e.g. nodes labelled with English lexemes), while preserving the propositional contexts in which these graph elements were selected.</Paragraph> <Paragraph position="5"> In contrast, our approach, following (Dymetman, 1997), views a packed representation as being a grammar (more specifically, a context-free grammar) over the vocabulary of graph elements (labelled nodes and edges), where each word (in the sense of formal language theory) generated by the grammar represents one of the possible non-ambiguous readings of the packed representation. In other terms, the collection of non-ambiguous graphs belonging to the packed representation is seen as a language over a vocabulary of graph elements, and a packed representation is seen as a grammar which generates such a language. Packing comes from the fact that a context-free grammar is an efficient representation for the language it generates. 
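As a minimal sketch of why a context-free grammar packs its language, the following Python fragment counts and enumerates the words of a small acyclic grammar. The grammar, its symbol names and the helper functions count_words and words are hypothetical illustrations chosen for this sketch; they are not the grammar used later in the paper. The count is obtained by multiplications and sums over the rules, without listing the words, under the assumption that distinct derivations yield distinct words.

```python
from math import prod

# Hypothetical toy grammar (an assumption for illustration only).
# Keys are nonterminals; each maps to a list of alternative right-hand sides.
# Symbols that are not keys are terminals, i.e. graph description elements.
GRAMMAR = {
    "S": [["SAW", "GREEN", "light_2", "PP"]],
    "SAW": [["see_0"], ["saw_0"]],
    "GREEN": [["green1_7"], ["green2_7"]],
    "PP": [["mod_23", "on_3"], ["mod_03", "on_3"]],
}


def count_words(symbol, grammar):
    """Count the words derivable from symbol (acyclic grammar assumed).

    Alternatives are summed and the symbols of one right-hand side are
    multiplied; this is exact only if distinct derivations give distinct words.
    """
    if symbol not in grammar:
        return 1  # a terminal generates exactly one word
    return sum(
        prod(count_words(s, grammar) for s in rhs) for rhs in grammar[symbol]
    )


def words(symbol, grammar):
    """Enumerate the commutative words of symbol as frozensets of terminals."""
    if symbol not in grammar:
        return [frozenset([symbol])]
    result = []
    for rhs in grammar[symbol]:
        partial = [frozenset()]
        for s in rhs:
            # Combine every partial word with every word of the next symbol.
            partial = [p | w for p in partial for w in words(s, grammar)]
        result.extend(partial)
    return result
```

For this toy grammar, count_words("S", GRAMMAR) and the number of distinct sets returned by words("S", GRAMMAR) agree, although the first never builds any reading explicitly.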
Another essential feature of such a representation is that it is interaction-free, that is, each nondeterministic top-down traversal of the grammar succeeds without ever backtracking and results in a certain reading, without the need for checking the consistency of a set of associated propositional constraints: the representation for the collection of readings is as direct as can be while permitting a factorization of common parts.</Paragraph> <Paragraph position="6"> Based on this notion, we present an algorithm for transfer which, starting from a finite set of rewriting patterns (the transfer lexicon), associates with a given context-free grammar representing the source packed structure a context-free grammar representing the target packed structure. Therefore, the target representation remains interaction-free and transparently encodes the target structures; furthermore, under certain natural &quot;locality&quot; conditions on the rewriting rules (the graph elements in their left-hand sides tend to be &quot;close&quot; to each other in the source grammar derivations), the target grammar preserves much of the factorization and compaction properties of the source grammar.</Paragraph> <Paragraph position="7"> The paper is structured in the following way. Section 2 explains how ambiguous graphs can be seen as commutative languages over graph description elements, and how context-free grammars provide concise specifications for these languages. Section 3 extends the standard notion of non-ambiguous transfer to that of ambiguous transfer. Section 4 presents the basic language-theoretic formalism needed and introduces some operators on languages. Section 5 presents the detailed rewriting algorithm, which applies these operators not directly to languages, but to the context-free grammars specifying them. 
Section 6 gives an example of the algorithm in operation.</Paragraph> <Paragraph position="8"> 2 Ambiguous structures as languages</Paragraph> <Paragraph position="10"> Figure 1: An informal graphical representation of the 20 possible analyses for &quot;I saw the green light on the hill with a telescope&quot;.</Paragraph> <Paragraph position="11"> Let's consider the sentence &quot;I saw the green light on the hill with a telescope&quot;. In Fig. 1, we have represented informally the set of possible analyses for this sentence. Labels on the nodes correspond to predicate names ('on', 'hill', etc.). A slash is used to indicate different possible readings for a node; for instance, we assume that the surface form &quot;saw&quot; can correspond to the verbs &quot;to see&quot; or &quot;to saw&quot;, and that &quot;green&quot; is ambiguous between the color adjective &quot;green1&quot; and the noun &quot;green2&quot; (grassy lawn). Relations between nodes are indicated by labels on the edges joining two nodes: 'arg1' and 'arg2' for first and second argument, 'mod' for modifier. The solid edges correspond to relations which are satisfied in all the readings for the sentence, dotted edges to relations that are satisfied only for certain readings.</Paragraph> <Paragraph position="12"> Thus, the prepositional phrase &quot;on the hill&quot; can modify either &quot;light&quot; or &quot;see/saw&quot;, the phrase &quot;with a telescope&quot; either &quot;hill&quot;, &quot;light&quot;, or &quot;see/saw&quot;. The informal picture of Fig. 1 does not make explicit exactly which structures are actually possible analyses of the sentence. 
For instance the two crossing edges mod_03 and mod_25 (where indices are used to denote the origin and destination of the edge) cannot appear together in a reading of the given sentence. As a consequence only five of the apparent 2 × 3 prepositional attachment combinations are possible, which multiplied by the four possible lexical variants for &quot;saw&quot; and &quot;green&quot; gives 20 possible readings for the sentence.</Paragraph> <Paragraph position="13"> Each of these readings is a graph where nodes 0 and 7 now carry one label, and where one 'mod' edge has been selected for the attachment of each of nodes 3 and 5. One way to describe such a graph is by listing a collection of &quot;description elements&quot; for it, where each such element is either a labelled node such as see_0 or a labelled edge such as mod_27. Using this format, the pragmatically preferred analysis for our sentence is the set {see_0, arg1_01, i_1, arg2_02, light_2, mod_27, green1_7, mod_23, on_3, arg2_34, hill_4, mod_05, with_5, arg2_56, telescope_6}.</Paragraph> <Paragraph position="14"> If we consider the collection of all possible analyses, we then obtain a collection of sets of description elements. It is convenient to view such a collection as a commutative language over the vocabulary of all possible description elements; each word in such a language corresponds to one analysis and is a list of description elements the order of which is considered irrelevant.</Paragraph> <Paragraph position="15"> The main advantage of taking this view of ambiguous structures is that formal language theory provides standard tools for representing languages compactly. Thus it is well-known in computational lexicography that a large list of word strings can be represented efficiently by means of a finite-state automaton which factorizes common substrings. 
Such a representation is both compact and &quot;explicit&quot;: accessing and using it is as direct as the flat list of words would be.</Paragraph> <Paragraph position="16"> Although one might think of using finite-state models for representing compactly the language associated with a collection of graphs, they do not seem as relevant as context-free models for our purposes. The reason is that the source packed representations are typically obtained as the results of chart-parsing processes. A chart used in the parsing of a context-free grammar can itself be viewed as a context-free grammar, which is a specialization of the original grammar for the string being parsed, and which directly generates the derivation trees for this string relative to the original grammar (Billot and Lang, 1989). 1 The generalization of this approach to unification grammars (of the LFG or DCG type) proposed in (Dymetman, 1997) shows that, in turn, chart-parsing with these unification grammars leads naturally to packed representations for the parse results very close to the ones we are about to introduce.</Paragraph> <Paragraph position="17"> Let's consider the CFG G_0:</Paragraph> <Paragraph position="19"> Nonterminals of that grammar are written in uppercase, terminals (which are graph description elements) in lowercase. It can be verified that the language generated by this grammar is the collection of commutative words 1 This context-free grammar has polynomial size relative to the length of the string. 
While it is also possible in principle to use a finite-state model for representing the same set of derivation trees, it can be shown that such a model may be exponential relative to string length (remark due to John Maxwell).</Paragraph> <Paragraph position="20"> corresponding precisely to all the possible analyses for the sentence.</Paragraph> <Paragraph position="21"> The fact that there are 20 such words can be established by a simple bottom-up computation involving multiplications and sums. If we call ambiguity degree ad(N) of a nonterminal N the number of words it generates, then it is obvious that, for instance, ad(D30) = 2, ad(D3) = 2+3, ad(S) = 4·1·1·5 = 20. In fact, it is the multiplications which appear in such computations which are responsible for the compactness of the grammar as compared to the direct listing of the words: each time a multiplication appears, a factorization is being cashed in. 2 3 Transfer as language rewriting When working with non-ambiguous structures, transfer is a rewriting process which takes as input a source-language graph and constructs a target-language graph by applying transfer rules of the form lhs → rhs, where lhs and rhs are finite sets of description elements for source graph and target graph respectively. In outline, the &quot;non-ambiguous&quot; transfer process works in the following way: for each non-overlapping covering of the source graph with left-hand sides of transfer rules, the corresponding right-hand sides are produced and, taken together, represent a target graph (this is a non-deterministic function as there can be several such coverings).</Paragraph> <Paragraph position="22"> In the case of ambiguous structures, the aim of transfer is to take as input a language of source graphs and to produce a language of target graphs. 
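The non-ambiguous transfer process outlined above can be sketched in Python as a search over non-overlapping coverings of one source graph. This is only an illustrative sketch under stated assumptions: the function name transfer, the rule list RULES, the graph SOURCE and all target labels are hypothetical and are not the transfer rules of this paper; graphs are modelled as frozensets of description elements.

```python
# Hypothetical transfer rules (assumptions for illustration only): each rule
# is a pair (lhs, rhs) of finite sets of description elements.
RULES = [
    (frozenset({"see_0"}), frozenset({"voir_0"})),
    (frozenset({"light_2"}), frozenset({"lumiere_2"})),
    (frozenset({"light_2"}), frozenset({"feu_2"})),
    (frozenset({"green1_7"}), frozenset({"vert_7"})),
    (frozenset({"light_2", "green1_7"}), frozenset({"feu_vert_2"})),
]

# A hypothetical, deliberately tiny source graph.
SOURCE = frozenset({"see_0", "light_2", "green1_7"})


def transfer(source, rules):
    """Return the set of target graphs for one non-ambiguous source graph.

    Every way of covering the source exactly once with rule left-hand sides
    contributes one target graph, the union of the matching right-hand sides.
    """
    source = frozenset(source)
    if not source:
        return {frozenset()}  # nothing left to cover: one empty target
    # Always cover a fixed element first, so each covering is found once.
    pivot = min(source)
    targets = set()
    for lhs, rhs in rules:
        if pivot in lhs and lhs.issubset(source):
            for rest in transfer(source - lhs, rules):
                targets.add(rhs | rest)
    return targets
```

With these toy rules, transfer(SOURCE, RULES) yields three target graphs, one of which uses the two-element rule; a source graph that admits no covering yields the empty set.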
The language of target graphs should be equal to the union of all the graphs that would have been obtained if one had enumerated the source graphs one by one, applied non-ambiguous transfer, and taken the collection of all target graphs obtained. The goal of ambiguous transfer is to perform the same task on the basis of a compact representation for the collection of source graphs, yielding a compact representation for the collection of target graphs.</Paragraph> <Paragraph position="23"> For illustration purposes, we will consider the following collection of transfer rules:</Paragraph> <Paragraph position="25"> We have only listed a few rules, and have assumed that the remaining ones are straightforward one-to-one correspondences (i_1 → je_1, mod_03 → mod'_03 [we prime labels such as mod, arg1, ... in order to have disjointness of source and target vocabulary], etc.). 2 As the example shows, context-free representations of ambiguous structures have the important property (related to their interaction-freeness as described in the introduction) of being easily &quot;countable&quot;. This is to be contrasted with other possible representations for ambiguous structures, such as ones based on propositional axioms determining which description elements can be jointly present in a given analysis.</Paragraph> <Paragraph position="26"> In these representations, the problem of determining whether there exists one structure satisfying the specification can be of high complexity, let alone the problem of counting such structures.</Paragraph> <Paragraph position="27"> 3 In practice, real transfer rules are not specialized for specific nodes, but are patterns containing variables instead of numbers; in order to oh-</Paragraph> </Section> </Paper>