<?xml version="1.0" standalone="yes"?> <Paper uid="E91-1049"> <Title>A PREFERENCE MECHANISM BASED ON MULTIPLE CRITERIA RESOLUTION</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> A PREFERENCE MECHANISM BASED ON MULTIPLE CRITERIA RESOLUTION </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 2300 Kbh S, Denmark. ABSTRACT </SectionTitle> <Paragraph position="0"> This paper presents an experimental preference tool designed, implemented and tested in the Eurotra project. The mechanism is based on preference rules which can either compare subtrees pairwise or single out a subtree on the basis of some specified constraints. Scoring permits combining the effects of various preference rules.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> THE PROBLEM </SectionTitle> <Paragraph position="0"> The aim of a translation system is to produce the correct translation of a given text. In Eurotra, where translation is split up into a series of mappings among intermediate levels of representation, provisional overgeneration is a necessary evil [Raw et al. 1989]: the closer to surface structure a level of representation is, the harder it becomes for the parser to produce an unambiguous result. In the Eurotra framework, the E-framework [Bech et al. 1989], overgeneration can be partially controlled by filters which describe parse trees that are to be discarded as not obeying some specified constraints. Thus, filters apply to individual objects and are meant to delete inherently wrong representations. But there are cases where the grammar produces multiple analyses of a given input because the input is ambiguous with respect to a given level. All of these analyses are in some sense correct, although further processing might discard some of them.
Our aim was to design a preference mechanism able to choose the best among a set of acceptable candidates.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> OUR VIEW OF PREFERENCE </SectionTitle> <Paragraph position="0"> Preference has been defined in a number of ways, e.g. as a gradual fulfilment of semantic constraints [Fass and Wilks 1983], as a lexically induced syntactic bias [Ford et al. 1982], as a parsing strategy independent of linguistic criteria [Frazier and Fodor 1978, Pereira 1985], and as a system based on multiple judgements reflecting the complexity of psychological processes [Jackendoff 1985].</Paragraph> <Paragraph position="1"> Our approach, which is greatly indebted to Jackendoff's theory of preference rule systems, is based on the following assumptions: - Preference is a method which, on the basis of some preference criteria, chooses the best one among a set of possible interpretations which are all correct according to the grammar. - Each preference criterion is expressed as a set of statements, where a statement is either a binary relation between competing interpretations or the description of a subtree which satisfies some defined criteria.</Paragraph> <Paragraph position="2"> - There is no unique preference criterion according to which the best interpretation can be chosen: preference criteria are multiple, and possibly contradictory. A preference mechanism must be able to accommodate such multiplicity. - Preference criteria are heuristic principles which may vary according to the language and the text type: therefore, they are not hardwired in the system.</Paragraph> <Paragraph position="3"> In the previous Eurotra preference mechanism [Petitpierre et al. 1987], preference statements were only defined as binary relations between subtrees. Since comparing subtrees is a rather expensive operation from the computational point of view, and since a number of preference criteria - e.g.
the principle of right-low attachment - cannot be expressed in binary terms, we have allowed both binary and non-binary preference rules. The application algorithm of p-rules and the way in which various preference criteria are combined are also new with respect to the previous system.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> THE MECHANISM PROPOSED </SectionTitle> <Paragraph position="0"> The mechanism proposed is an independent module which is activated on the results output by the parser. The module consists of preference rules of two possible kinds, which we call binary and unary rules.</Paragraph> <Paragraph position="1"> A binary rule establishes a preference relation between two corresponding (sub)trees (from here on, (sub)tree will be used in the sense of a representation of an interpretation or a part of this representation). A unary rule picks out a (sub)tree on the basis of its own properties, thus implicitly establishing a preference relation between this (sub)tree and all its competitors. Each preference rule - be it binary or unary - is associated with a score, which is assigned to the preferred (sub)tree as a result of the application of the rule.</Paragraph> <Paragraph position="2"> Correspondences: The notion of correspondence between (sub)trees is central to preference rules of the binary type. A number of definitions of this concept can be envisaged: i. The correspondence between two (sub)trees is established by the user, who states that some specified constraints hold between parts of them.</Paragraph> <Paragraph position="3"> ii. A correspondence is only assumed to exist between full parse trees, and the correspondence between two subtrees is defined by specifying their derivation paths from the top node.</Paragraph> <Paragraph position="4"> iii.
The system produces a parse graph which will be a synthesis of the various parse trees, where parts common to several trees are shared; two subtrees correspond if they share a given part. The most challenging solution is (iii): we have not adopted it because of computational problems connected with the introduction of structure-sharing into the E-framework. The easiest solution to implement is (ii): this is the approach chosen in the earlier Eurotra preference tool. The solution we have adopted is (i), which unlike (ii) allows the user to state constraints on subtrees, regardless of their position in the complete parse tree. In other words, our system allows for very local and modular statements. Preference Rules: The user expresses preference statements through a set of binary or unary preference rules (p-rules). The syntax for a binary rule is</Paragraph> <Paragraph position="6"> where Annotations. where: - RuleName is a unique identifier used for trace purposes; - Score is a positive integer which indicates how strong the relation of preference is; - LHS and RHS (the left-hand side and the right-hand side of the rule) are the descriptions of the two (sub)trees to be compared; - >= is a preference sign that indicates which of the two (sub)trees is to be preferred; - Annotations is a (possibly empty) set of constraints which must hold between the constituents of the two (sub)trees to be compared. The syntax for a unary rule is</Paragraph> <Paragraph position="8"> where Annotations. where: - LHS is the description of the (sub)tree to be singled out (which we call the left-hand side to stress the parallelism with binary rules); - the other parts are as defined for binary rules. LHS and RHS are (sub)tree descriptions of any depth and relevant parts of them may be labelled with Prolog variables, called indexes. These labels are used to express simple or complex correspondence constraints in the annotation part of the rule.
A simple constraint states for instance that two indexed subtrees must or must not have the same structure. Simple constraints may be combined with the operators 'and' and 'or' to form complex constraints. Scores, which have the function of driving p-rule interaction, are positive integers. They may be either assigned by the user or generated automatically on statistical grounds, as explained below. Examples of both rule types are given in the appendix.</Paragraph> <Paragraph position="9"> General Algorithm: All the parse trees have an initial null score before preference rules are applied. For each pair of trees, if they contain two subtrees respectively matching the LHS and the RHS of a binary p-rule, while the constraints in the annotation part of the rule hold, the rule applies. Similarly, for each parse tree, if a subtree matching the LHS of a unary p-rule can be extracted, and all the constraints expressed in the rule are satisfied, the rule applies. In both cases, as a result of p-rule application, the score of the object that contains the preferred subtree is incremented by the score of the rule.</Paragraph> <Paragraph position="10"> When all binary rules have been tried out on all the possible pairs of trees in all the possible ways, and all unary rules have been fired on all the single trees, results are collected. All parse trees are partitioned into equivalence classes according to their score.
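As a rough illustration, the scoring and partitioning steps described up to this point can be sketched in Python. This is a minimal sketch under stated assumptions, not the Eurotra implementation: the function name rank_parse_trees, the toy tuple encoding of trees, the representation of a p-rule as a (score, matcher) pair and the simplified unary matcher that rewards each PP placed directly under an NP node are all hypothetical. Real p-rules match (sub)tree descriptions with annotations rather than calling Python functions.

```python
from collections import defaultdict
from itertools import permutations


def rank_parse_trees(trees, binary_rules, unary_rules):
    """Score competing parse trees and group them by score, best class first.

    Each rule is a hypothetical (score, matcher) pair; a matcher returns the
    number of ways the rule applies (0 if it does not apply).
    """
    scores = [0] * len(trees)  # every tree starts with a null score

    # Binary rules: try every ordered pair (candidate, competitor); the
    # candidate earns the rule score once per successful match.
    for i, j in permutations(range(len(trees)), 2):
        for score, matcher in binary_rules:
            scores[i] += score * matcher(trees[i], trees[j])

    # Unary rules: try every single tree on its own.
    for i, tree in enumerate(trees):
        for score, matcher in unary_rules:
            scores[i] += score * matcher(tree)

    # Partition tree indices into equivalence classes by total score.
    classes = defaultdict(list)
    for i, total in enumerate(scores):
        classes[total].append(i)
    return sorted(classes.items(), reverse=True)


def count_pp_under_np(tree):
    """Toy unary matcher: count PP nodes that are direct daughters of an NP."""
    label, daughters = tree
    here = sum(1 for d in daughters if label == "np" and d[0] == "pp")
    return here + sum(count_pp_under_np(d) for d in daughters)


# Toy trees encoded as (label, [daughters]); not actual Eurotra objects.
tree_a = ("s", [("np", []),
                ("vp", [("np", [("pp", [])]), ("pp", [])])])
tree_b = ("s", [("np", []),
                ("vp", [("np", [("pp", [("np", [("pp", [])])])])])])
tree_c = ("s", [("vp", [("pp", [])])])

ranking = rank_parse_trees([tree_a, tree_b, tree_c], [], [(2, count_pp_under_np)])
print(ranking)  # [(4, [1]), (2, [0]), (0, [2])]
```

Here the second toy tree matches the rule twice and therefore lands alone in the top class.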
Note that trees to which no preference rule has applied will belong to the lowest-ranking class: this is motivated by the assumption that unary rules prefer single trees over all the other members of the set of competing trees.</Paragraph> <Paragraph position="11"> After this partial order has been established, all the trees but those belonging to the highest-ranking class are discarded.</Paragraph> <Paragraph position="12"> A possible enhancement to the expressive power of p-rules would be the introduction of negative scores, for cases where a p-rule describes an acceptable but not totally correct subtree.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> AN EXAMPLE </SectionTitle> <Paragraph position="0"> The following set of p-rules is based on some of the criteria for the treatment of PP attachment described in [Hirst 1987]. Note that p-rule scores have been assigned manually, due to the small number of rules.</Paragraph> <Paragraph position="2"> NP2: {cat=np}] where P1=P2, NP1=NP2. In the rule above, {} delimit a node in the tree, which in the E-framework is a set of attribute value pairs, [] following a given node enclose its daughters, = means equal to and ~= means different from. The rule prefers a valency-bound PP to a PP modifier. This is a very strong criterion, which can only be overridden by semantic principles: therefore, the rule has a high score.</Paragraph> <Paragraph position="4"> The rule gives 2 points to an attachment where a PP is placed under an NP node. Note that *{} means any number of (sub)trees, without any restriction, and ~ in front of a subtree means that this subtree is weakly dominated by the top node. Assuming the following two structures</Paragraph> <Paragraph position="6"> 'plow' will only apply once to (a), but it would fire twice on (b), which will in the end collect the highest score.
The rule in fact implements the principle of right-low attachment.</Paragraph> <Paragraph position="8"> The rule above assigns 5 points to a coordinated structure where the two conjuncts have the same number of terminals. Note that constraints are stated between nodes of two competing (sub)trees and not, as was the case in 'pmod', between nodes belonging to the same (sub)tree. To see how these p-rules work, we can apply them to the set of objects resulting from the analysis of the following three Danish sentences: (1) &quot;Kommissionen diskuterede et forslag fra virksomhederne om effektiv løsning af problemerne&quot;.</Paragraph> <Paragraph position="9"> (EN: The commission discussed a proposal by the companies for the effective solution of the problems).</Paragraph> <Paragraph position="10"> (2) &quot;Virksomhederne deltager i programmet for denne periode&quot;.</Paragraph> <Paragraph position="11"> (EN: The companies take part in the programme for this period).</Paragraph> <Paragraph position="12"> (3) &quot;Kommissionen kontrollerer finansieringen af virksomhederne og samarbejdet med industrien&quot;.</Paragraph> <Paragraph position="13"> (EN: The commission controls the financing of the firms and the cooperation with industry). In all three cases the preference tool yields the correct result. The three preferred objects are shown below: p-rules that have applied are indicated on the top nodes of the relevant subtrees. In accordance with the Eurotra linguistic model, objects 1 and 2 below are dependency structures with a lowered governor, where the complements have been ordered in a canonical way and a series of phenomena (determinateness, verbal inflection, prepositions) have been featurised. What interests us here, however, is the way PPs have been analysed.
Thus, note that for all the PPs in sentence (1), the system has been able to find valency-bound syntactic functions (either subject or prepositional object).</Paragraph> <Paragraph position="15"> Consequently, modifier interpretations have been dispreferred. In the case of sentence (2) instead, the final PP has been analysed as a modifier, and the correct attachment has been found by the principle of right-low attachment. Note also that, still in (2), the verb &quot;deltage&quot; requires an obligatory prepositional object, and therefore this syntactic function has not been established by preference. Finally, in (3) the correct attachment of the two PPs has been found due to the combined effect of all three rules.</Paragraph> </Section> </Paper>