<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1044">
  <Title>Polarization and abstraction of grammatical formalisms as methods for lexical disambiguation</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Morphisms between grammatical formalisms
</SectionTitle>
    <Paragraph position="0"> Polarization and abstraction can be defined from a more general notion of morphism between grammatical formalisms. A morphism from a grammatical formalism C to a grammatical formalism A is a function f from StructC to StructA with the following properties:</Paragraph>
    <Paragraph position="2"> [...] then f(S1), ..., f(Sn) can be composed into the structure f(S) by means of rules of RulesA.</Paragraph>
    <Paragraph position="3"> Given such a morphism f and a grammar G in C, the image of G by f, denoted f(G), is the grammar in A induced by the morphism. The three properties of a morphism guarantee that the language generated by any grammar G of C is a subset of the language generated by f(G); in other words, L(G) ⊆ L(f(G)).</Paragraph>
    <Paragraph position="4"> We propose to use the notion of morphism in two ways: (1) for polarizing grammatical formalisms: in this case, morphisms are isomorphisms, and grammars are transposed from one formalism to another with the same generative power; with the previous notation, L(G) = L(f(G)); (2) for abstracting grammatical formalisms: in this case, the transposition of grammars by morphisms entails a simplification of grammars and an extension of the generated languages; we have only L(G) ⊆ L(f(G)).</Paragraph>
    <Paragraph position="5"> An example of the use of abstraction for lexical disambiguation may be found in (Boullier, 2003). We propose to link polarization with abstraction because polarities allow original methods of abstraction. Polarization is used as a preprocessing step before the application of these methods.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Polarization of grammatical formalisms
</SectionTitle>
    <Paragraph position="0"> The goal of polarizing a grammatical formalism is to make explicit the resource sensitivity that is hidden in syntactic composition, by adding polarities to the labels of its structures. When morpho-syntactic labels become polarized in syntactic structures, they get the status of consumable resources: a label associated with the polarity + becomes an available resource, whereas a label associated with the polarity − becomes an expected resource; both combine to produce a saturated resource associated with the polarity $; labels associated with the polarity = are neutral in this process. In a polarized formalism, the saturated structures are those in which all labels are associated with the polarity = or $. We call them neutral structures. The composition of structures is guided by a principle of neutralization: every positive (negative) label must unify with a negative (positive) label.</Paragraph>
    <Paragraph position="1"> The polarization of a formalism must preserve its generative power: the language generated by a polarized grammar must be the same as that generated by the initial nonpolarized grammar. This property of (weak and even strong) equivalence is guaranteed if the polarized formalism is isomorphic to the nonpolarized formalism from which it stems. Formally, given a grammatical formalism F, any formalism Fpol with a morphism pol : F → Fpol is a polarization of F if: (i) For any structure S ∈ StructF, pol(S) results from associating each label of S with one of the polarities +, −, =, $; in other words, labels of Fpol are pairs (p, l) with p a polarity and l a label of F. The set of polarities {+, −, =, $} is equipped with the operation of unification and the subsumption order defined by [...]. Subsumption and unification on pairs are the pointwise operations. That is, for any pairs (p, l) [...]</Paragraph>
    <Paragraph position="3"> (ii) SatFpol consists of the neutral structures of StructFpol.</Paragraph>
    <Paragraph position="4"> (iii) pol is an isomorphism whose inverse morphism is the function that ignores polarities and keeps the rest of the structure invariant.</Paragraph>
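    <Paragraph> As a hedged illustration of condition (i), the pointwise unification on pairs can be sketched in Python. The unification table below is an assumption read off the prose of this section (+ and − combine into $, = is neutral, every other combination is taken to be undefined); label unification is assumed to be identity, and the names POLARITY_UNIFY, unify_pairs, is_neutral and depolarize are hypothetical. The ASCII character - stands for the polarity −.
<![CDATA[
```python
from typing import List, Optional, Tuple

Pair = Tuple[str, str]  # (polarity, label); "-" stands for the negative polarity

# Assumed unification table on polarities: "=" is neutral, "+" and "-"
# combine into "$"; any pair not listed is treated as not unifiable.
POLARITY_UNIFY = {
    ("=", "="): "=",
    ("=", "+"): "+", ("+", "="): "+",
    ("=", "-"): "-", ("-", "="): "-",
    ("=", "$"): "$", ("$", "="): "$",
    ("+", "-"): "$", ("-", "+"): "$",
}


def unify_pairs(a: Pair, b: Pair) -> Optional[Pair]:
    """Pointwise unification: polarities via the table, labels by identity."""
    polarity = POLARITY_UNIFY.get((a[0], b[0]))
    if polarity is None or a[1] != b[1]:
        return None
    return (polarity, a[1])


def is_neutral(structure: List[Pair]) -> bool:
    """A structure is neutral when every label carries "=" or "$"."""
    return all(polarity in ("=", "$") for polarity, _ in structure)


def depolarize(structure: List[Pair]) -> List[str]:
    """Inverse direction of pol on labels: forget polarities, keep the labels."""
    return [label for _, label in structure]
```
]]>
    Here a structure is flattened to its list of polarized labels, which is enough to show the neutrality test of condition (ii) and the polarity-forgetting inverse of condition (iii); a real formalism would keep the tree or formula structure around the labels.</Paragraph>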
    <Paragraph position="5"> Let us illustrate this by returning to our two example formalisms.</Paragraph>
    <Paragraph position="6"> For LTAG (see Figure 2), pol consists in labelling the root of elementary syntactic trees with the polarity + and their non-terminal leaves (substitution and foot nodes) with the polarity −.</Paragraph>
    <Paragraph position="8"> [Figure 2: (...) with the adjective red in LTAG, LTAGpol, (LTAGpol)destr] In every pair of quasi-nodes, the top quasi-node is labelled with the polarity − and the bottom quasi-node is labelled with the polarity +. With respect to the classical presentation of LTAG, initial trees must be completed by an axiom with two nodes of the type sentence: a root with the polarity = and its unique daughter with the polarity −. In this way, pol establishes a bijection between the saturated structures of LTAG and the neutral structures of LTAGpol. The rules of adjunction and substitution of RulesLTAGpol mimic the corresponding rules in LTAG, taking polarities into account. We add a third composition rule, a unary rule which identifies the two quasi-nodes of a same pair. It is routine to check that pol is a polarization.</Paragraph>
    <Paragraph position="9"> In LG (see Figure 3), polarization is already explicitly present in the formalism: negative formulas and sub-formulas are input formulas (hypotheses), whereas positive formulas and sub-formulas are output formulas (conclusions).</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Abstraction of polarized
</SectionTitle>
    <Paragraph position="0"> grammatical formalisms The originality of abstracting polarized formalisms is to keep a mechanism of neutralization between opposite polarities at the heart of the abstract formalism. Furthermore, we can choose di erent levels of abstraction by keeping more or less information from the initial formal- null with the transitive verb eats in LG, LGpol, (LGpol)destr ism.</Paragraph>
    <Paragraph position="1"> As an example, we propose a high-degree abstraction, destructuring. Destructuring a polarized formalism consists in ignoring the structure of the initial syntactic objects and keeping only the multisets of polarized labels. Formally, given a polarized formalism P, we define the formalism Pdestr as follows: Any element M of StructPdestr is a multiset of labels. All elements of M are labels of P, except exactly one, the anchor, which is a neutral string.</Paragraph>
    <Paragraph position="2"> SatPdestr is made up of the multisets containing only neutral and saturated labels. The projection PhonPdestr returns the label of the anchor.</Paragraph>
    <Paragraph position="3"> RulesPdestr has two neutralization rules. A binary rule takes two multisets M1 and M2 from StructPdestr as inputs; two unifiable labels +l1 ∈ M1 (resp. M2) and −l2 ∈ M2 (resp. M1) are selected. The rule returns the union of M1 and M2 in which +l1 and −l2 are unified and the two anchors are concatenated. The unary rule differs only in that it operates inside a single multiset.</Paragraph>
    <Paragraph position="4"> A morphism destr is associated with Pdestr (see Figures 2 and 3): it takes any structure S from StructP as input and returns the multiset of its labels with an additional anchor. This anchor is the neutral string PhonP(S) if it is defined.</Paragraph>
    <Paragraph position="5"> An important property of Pdestr is that it is not sensitive to word order: if a sentence is generated by a particular grammar of Pdestr, then by permuting the words of the sentence we obtain another sentence generated by the grammar. Destructuring is an abstraction that applies to any polarized formalism, but we can design abstractions of lower degree which are specific to particular formalisms (see Section 6).</Paragraph>
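    <Paragraph> The destructuring morphism and the binary neutralization rule can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: a polarized structure is given directly as a list of (polarity, label) pairs plus its phonological string, label unification is identity, the ASCII character - stands for the polarity −, and the names destr, neutralize_binary and is_saturated as well as the two lexical entries are hypothetical.
<![CDATA[
```python
from collections import Counter
from typing import List, Optional, Tuple

Pair = Tuple[str, str]              # (polarity, label); "-" is the negative polarity
Destructured = Tuple[Counter, str]  # (multiset of polarized labels, anchor string)


def destr(polarized_labels: List[Pair], phon: str) -> Destructured:
    """Forget the structure: keep the multiset of labels and the anchor."""
    return Counter(polarized_labels), phon


def neutralize_binary(m1: Destructured, m2: Destructured,
                      label: str) -> Optional[Destructured]:
    """Unify one +label with one -label taken from two different multisets."""
    labels1, anchor1 = m1
    labels2, anchor2 = m2
    pos, neg = ("+", label), ("-", label)
    crosses = ((labels1[pos] > 0 and labels2[neg] > 0)
               or (labels1[neg] > 0 and labels2[pos] > 0))
    if not crosses:
        return None  # the rule does not apply
    merged = labels1 + labels2
    merged[pos] -= 1
    merged[neg] -= 1
    merged[("$", label)] += 1
    # Unary plus drops the entries whose count fell to zero.
    return +merged, anchor1 + " " + anchor2


def is_saturated(m: Destructured) -> bool:
    """Only neutral ("=") and saturated ("$") labels remain."""
    return all(polarity in ("=", "$") for polarity, _ in m[0])


# Hypothetical toy entries, for illustration only.
john = destr([("+", "np")], "John")
sleeps = destr([("-", "np"), ("+", "s")], "sleeps")
```
]]>
    With these toy entries, neutralize_binary(john, sleeps, "np") and neutralize_binary(sleeps, john, "np") return the same multiset and differ only in the anchor string, which is the insensitivity to word order noted above.</Paragraph>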
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Application to lexical
</SectionTitle>
    <Paragraph position="0"> disambiguation Abstraction is the basis for a general method of lexical disambiguation. Given a lexicalized grammar G in a concrete formalism C, we consider a sentence w1:::wn. For each 1 i n, let the word wi have the following entries in the lexicon of G: Si;1;Si;2:::Si;mi. A tagging of the sentence is a sequence S1;k1;S2;k2 :::Sn;kn. We suppose now that we have given an abstraction morphism abs : C ! Cabs. As L(G) L(abs(G)), any tagging in abs(G) which has no solutions comes from a bad tagging in G. As a consequence, the methods we develop try to eliminate such bad taggings by parsing the sentence w1w2:::wn within the grammar abs(G).</Paragraph>
    <Paragraph position="1"> We propose two procedures for parsing in the abstract formalism: an incremental procedure which is speci c to the destructuring abstraction, a bottom-up procedure which can apply to various formalisms and abstractions.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Incremental procedure
</SectionTitle>
      <Paragraph position="0"> We choose polarization followed by destructuring as abstraction; in other words, abs = destr ∘ pol. Let us start with the particular case where unification of labels in C reduces to identity. In this case, parsing inside the formalism Cabs is greatly simplified because composition rules reduce to the neutralization of two labels +l and −l. As a consequence, parsing reduces to counting the positive and negative polarities present in the selected tagging for every label l: every positive label counts for +1 and every negative label for −1, and the sum must be 0. Since this counting must be done for every possible tagging and for every possible label, it is crucial to factorize it. For this, we use automata, which drastically decrease the space (and also the time) complexity.</Paragraph>
      <Paragraph position="1"> For every label l of C that appears with a polarity + or − in the possible taggings of the sentence w1 w2 ... wn, we build the automaton Al as follows. The set of states of Al is [0..n] × Z.</Paragraph>
      <Paragraph position="2"> For any state (i, c), i represents the position at the beginning of the word wi+1 in the sentence and c represents a positive or negative count of labels l. The initial state is (0, 0), and the final state is (n, 0). Transitions are labeled by lexicon entries Si,j. Given any Si,j, there is a transition from (i−1, x) to (i, y) labeled Si,j if y is the sum of x and the count of labels l in the multiset destr(Si,j).</Paragraph>
      <Paragraph position="3"> Reaching state (i, c) from the initial state (0, 0) means that (a) the path taken is of the form S1,j1, S2,j2, ..., Si,ji, that is, a tagging of the first i words, and (b) c is the count of labels l present in the union of the multisets abs(S1,j1), abs(S2,j2), ..., abs(Si,ji).</Paragraph>
      <Paragraph position="4"> As a consequence, any path that leads to the final state corresponds to a neutral choice of tagging for this label l.</Paragraph>
      <Paragraph position="5"> The algorithm is now simply to construct, for each label l, the automaton Al and to compute the intersection A = ⋂ l∈Labels Al of all these automata. The result of the disambiguation is the set of paths from the initial state to the final state described by this intersection automaton. Notice that at each step of the construction of the intersection, one should prune the blind states of the automata to ensure the efficiency of the procedure.</Paragraph>
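      <Paragraph> The construction can be sketched in Python for the case where label unification is identity. This is an illustrative sketch, not the authors' code: each lexicon entry is assumed to be already destructured into a multiset of (polarity, label) pairs, "blind states" are read here as states from which the final state (n, 0) cannot be reached, the intersection is explored lazily by walking the component automata in parallel instead of being built step by step, and the lexicon at the end (including the axiom pseudo-entry and all labels) is hypothetical. The ASCII character - stands for the polarity −.
<![CDATA[
```python
from collections import Counter
from typing import Dict, List, Tuple

Entry = Tuple[str, Counter]   # (entry name, multiset of (polarity, label) pairs)
Lexicon = List[List[Entry]]   # lexicon[i] holds the candidate entries of word i+1
State = Tuple[int, int]       # (position, running count)


def count(entry: Entry, label: str) -> int:
    """+1 for each positive occurrence of label, -1 for each negative one."""
    return entry[1][("+", label)] - entry[1][("-", label)]


def build_automaton(lexicon: Lexicon,
                    label: str) -> Dict[State, List[Tuple[str, State]]]:
    """Transitions of A_l, restricted to states lying on a path to (n, 0)."""
    n = len(lexicon)
    # Forward pass: counts reachable from (0, 0) at each position.
    layers = [{0}]
    for entries in lexicon:
        layers.append({x + count(e, label) for x in layers[-1] for e in entries})
    # Backward pass: drop states from which the final state is unreachable.
    alive = [set() for _ in range(n + 1)]
    if 0 in layers[n]:
        alive[n].add(0)
    transitions = {}
    for i in range(n - 1, -1, -1):
        for x in layers[i]:
            out = [(e[0], (i + 1, x + count(e, label)))
                   for e in lexicon[i] if x + count(e, label) in alive[i + 1]]
            if out:
                alive[i].add(x)
                transitions[(i, x)] = out
    return transitions


def intersect_paths(automata, n: int) -> List[List[str]]:
    """Taggings read by every automaton, i.e. paths of their intersection.

    Assumes at least one automaton and entry names unique per position.
    """
    results = []

    def walk(i, states, path):
        if i == n:
            results.append(path)
            return
        options = [dict(a.get(s, [])) for a, s in zip(automata, states)]
        common = set(options[0]).intersection(*options[1:])
        for name in sorted(common):
            walk(i + 1, tuple(o[name] for o in options), path + [name])

    walk(0, tuple((0, 0) for _ in automata), [])
    return results


def disambiguate(lexicon: Lexicon) -> List[List[str]]:
    labels = sorted({lab for entries in lexicon for _, ms in entries
                     for (pol, lab) in ms if pol in ("+", "-")})
    automata = [build_automaton(lexicon, lab) for lab in labels]
    return intersect_paths(automata, len(lexicon))


# Hypothetical toy lexicon for "the saw cut", preceded by an axiom expecting s.
lexicon = [
    [("axiom", Counter({("-", "s"): 1}))],
    [("the_det", Counter({("+", "np"): 1, ("-", "n"): 1}))],
    [("saw_noun", Counter({("+", "n"): 1})),
     ("saw_verb", Counter({("-", "np"): 2, ("+", "s"): 1}))],
    [("cut_noun", Counter({("+", "n"): 1})),
     ("cut_verb", Counter({("-", "np"): 1, ("+", "s"): 1}))],
]
selected = disambiguate(lexicon)
```
]]>
      On this toy lexicon, four taggings are possible and selected keeps only the one in which saw is a noun and cut is a verb, because it is the only one whose counts for np, n and s all return to 0.</Paragraph>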
      <Paragraph position="6"> Now, in the general case, unification of labels in F does not reduce to identification, which introduces nondeterminism in the application of the neutralization rule. Parsing still reduces to counting polarities, but now the counting of different labels is nondeterministic and interdependent. For instance, consider the multiset {+a, +b, −a⊔b} of three different elements.</Paragraph>
      <Paragraph position="7"> If we count the number of a, we find 0 if we consider that +a is neutralized by −a⊔b and +1 otherwise; in the first case, we find +1 for the count of b and in the second case, we find 0.</Paragraph>
      <Paragraph position="8"> Interdependency between the counts of different labels is very costly to take into account, and in the following we ignore this property; therefore, in the previous example, we consider that the count of a is 0 or +1 and that the count of b is also 0 or +1, independently of the first one.</Paragraph>
      <Paragraph position="9"> To express this, given a label l of F and a positive or negative label l′ of Fpol, we define Pl(l′) as a segment of integers, which represents the possible counts of l found in l′, as follows: if l′ is positive, then Pl(l′) = [...]</Paragraph>
      <Paragraph position="11"> We generalize the function Pl to count the number of labels l present in a multiset abs(S): [...]</Paragraph>
      <Paragraph position="13"> The method of disambiguation using automata presented above is still valid in the general case, with the following change in the definition of a transition in the automaton Al: given any Si,j, there is a transition from (i−1, x) to (i, y) labeled Si,j if y is the sum of x and some element of Pl(Si,j).</Paragraph>
      <Paragraph position="14"> With this change, the automaton Al becomes nondeterministic.</Paragraph>
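      <Paragraph> A small sketch may help to see what the modified transition does. It assumes that the segment Pl(S) of each entry is already available as an inclusive pair (lo, hi); the segments used in the example are chosen by hand so that their totals agree with the counts stated above for the multiset {+a, +b, −a⊔b}, and they are not derived from the definition of Pl. The names add_intervals, total_interval, successors and accepts are hypothetical.
<![CDATA[
```python
from typing import List, Set, Tuple

Interval = Tuple[int, int]  # inclusive segment of integers [lo, hi]


def add_intervals(a: Interval, b: Interval) -> Interval:
    """Sum of two integer segments, itself an integer segment."""
    return (a[0] + b[0], a[1] + b[1])


def total_interval(intervals: List[Interval]) -> Interval:
    """Segment of possible counts for a whole multiset or tagging."""
    total = (0, 0)
    for interval in intervals:
        total = add_intervals(total, interval)
    return total


def successors(x: int, interval: Interval) -> Set[int]:
    """Counts y such that (i-1, x) has a transition to (i, y) for one entry."""
    lo, hi = interval
    return {x + d for d in range(lo, hi + 1)}


def accepts(tagging: List[Interval]) -> bool:
    """True when some path through the tagging ends on the count 0."""
    counts = {0}
    for interval in tagging:
        counts = {y for x in counts for y in successors(x, interval)}
    return 0 in counts


# Hand-chosen segments for the labels +a, +b and the negative label a⊔b.
segments_for_a = [(1, 1), (0, 0), (-1, 0)]
segments_for_b = [(0, 0), (1, 1), (-1, 0)]
count_of_a = total_interval(segments_for_a)
count_of_b = total_interval(segments_for_b)
```
]]>
      Both count_of_a and count_of_b come out as the segment (0, 1). Treating them independently accepts the combination where both counts are 0, which the interdependent analysis would exclude; this is the loss of precision accepted in the text.</Paragraph>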
      <Paragraph position="15"> The advantage of the incremental procedure is that it is global to the sentence and ignores word order. This feature is interesting for generation, where the question of disambiguation is crucial. This advantage is at the same time a drawback when we need to take word order and locality into account. From this angle, the bottom-up procedure presented below is a good complement to the incremental procedure.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Bottom-up procedure
</SectionTitle>
      <Paragraph position="0"> We propose here another procedure adapted to a formalism C with the property of projectivity. Because of this property, it is possible to use a CKY-like algorithm in the abstract formalism Cabs. To parse a sentence w1 w2 ... wn, we construct items of the form (i, j, S) with S an element of StructCabs and i and j such that wi+1 ... wj represents the phonological form of S. We assume that RulesCabs has only unary and binary rules. Then, three rules are used for filling the chart: initialization: the chart is initialized with items of the form (i, i+1, abs(Si+1,k)); reduction: if the chart contains an item (i, j, S), we add the item (i, j, S′) such that S′ is obtained by application of a unary composition rule to S; concatenation: if the chart contains two items (i, j, S) and (j, k, S′), we add the item (i, k, S″) such that S″ is obtained by application of a binary composition rule to S and S′.</Paragraph>
      <Paragraph position="1"> Parsing succeeds if the chart contains an item of the form (0, n, S) such that S is an element of SatCabs. From such an item, we can recover all taggings that are at its source if, for every application of a rule, we keep a pointer from the conclusion to the corresponding premises. The other taggings are eliminated.</Paragraph>
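      <Paragraph> The three chart rules can be sketched for the destructuring abstraction as follows. This is a sketch under assumptions, not the authors' implementation: entries are already destructured into multisets of (polarity, label) pairs, label unification is identity, anchors are left implicit in the span (i, j), and each item stores the set of taggings that derive it in place of explicit back-pointers. All function names and the toy lexicon are hypothetical; the ASCII character - stands for the polarity −.
<![CDATA[
```python
from collections import Counter
from typing import Dict, Set, Tuple

Frozen = Tuple  # hashable multiset: sorted ((polarity, label), count) pairs


def freeze(ms: Counter) -> Frozen:
    return tuple(sorted((+ms).items()))


def neutralize(ms: Counter, label: str) -> Counter:
    """Turn one +label and one -label of ms into a saturated $label."""
    new = ms.copy()
    new[("+", label)] -= 1
    new[("-", label)] -= 1
    new[("$", label)] += 1
    return +new


def unary(ms: Counter):
    """Reduction: neutralize a pair inside a single multiset."""
    return [neutralize(ms, lab) for (pol, lab) in list(ms)
            if pol == "+" and ms[("+", lab)] > 0 and ms[("-", lab)] > 0]


def binary(m1: Counter, m2: Counter):
    """Concatenation: neutralize a pair with one label from each multiset."""
    out = []
    for lab in {l for (_, l) in m1}:
        if ((m1[("+", lab)] > 0 and m2[("-", lab)] > 0)
                or (m1[("-", lab)] > 0 and m2[("+", lab)] > 0)):
            out.append(neutralize(m1 + m2, lab))
    return out


def parse(lexicon):
    """Return the taggings that yield a saturated item spanning (0, n)."""
    n = len(lexicon)
    chart: Dict[Tuple[int, int], Dict[Frozen, Set[Tuple[str, ...]]]] = {}

    def add(i, j, ms, tags):
        # Insert an item and close it under the unary rule.
        agenda = [ms]
        while agenda:
            cur = agenda.pop()
            known = chart.setdefault((i, j), {}).setdefault(freeze(cur), set())
            if tags <= known:
                continue
            known |= tags
            agenda.extend(unary(cur))

    for i, entries in enumerate(lexicon):            # initialization
        for name, ms in entries:
            add(i, i + 1, ms, {(name,)})
    for width in range(2, n + 1):                    # concatenation
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for f1, t1 in chart.get((i, j), {}).items():
                    for f2, t2 in chart.get((j, k), {}).items():
                        tags = {a + b for a in t1 for b in t2}
                        for ms in binary(Counter(dict(f1)), Counter(dict(f2))):
                            add(i, k, ms, tags)
    return sorted({t for f, ts in chart.get((0, n), {}).items()
                   if all(pol in ("=", "$") for (pol, _), _ in f)
                   for t in ts})


# Hypothetical toy lexicon for "the saw cut", preceded by an axiom expecting s.
lexicon = [
    [("axiom", Counter({("-", "s"): 1}))],
    [("the_det", Counter({("+", "np"): 1, ("-", "n"): 1}))],
    [("saw_noun", Counter({("+", "n"): 1})),
     ("saw_verb", Counter({("-", "np"): 2, ("+", "s"): 1}))],
    [("cut_noun", Counter({("+", "n"): 1})),
     ("cut_verb", Counter({("-", "np"): 1, ("+", "s"): 1}))],
]
kept = parse(lexicon)
```
]]>
      Unlike the counting procedure, this chart only combines adjacent spans, so a tagging survives only if its labels can be neutralized between contiguous constituents. Storing sets of taggings per item is a simplification that suits a toy example; with real ambiguity, back-pointers as described above avoid enumerating taggings during parsing.</Paragraph>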
    </Section>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
6 Experiments
</SectionTitle>
    <Paragraph position="0"> In order to validate our methodology, we have written two toy English grammars for the LG and the LTAG formalisms. The purpose of our tests is to observe the performance of lexical disambiguation on highly ambiguous sentences. Hence, we have chosen the following three sentences, each of which has exactly one correct reading: (a) the saw cut the butter.</Paragraph>
    <Paragraph position="1"> (b) the butter that the present saw cut cooked well.</Paragraph>
    <Paragraph position="2"> (c) the present saw that the man thinks that the butter was cut with cut well.</Paragraph>
    <Paragraph position="3"> For each test below, we give the execution time in ms (obtained with a 600 MHz Pentium III PC) and the performance (number of selected taggings / number of possible taggings).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
6.1 Incremental procedure
</SectionTitle>
      <Paragraph position="0"> The incremental procedure (IP) results are given in Figure 4:</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="0" end_page="0" type="metho">
    <SectionTitle>
Figure 4: results of the incremental procedure
</SectionTitle>
    <Paragraph position="0"> Columns, for each of LG and LTAG: ms, perf. [table rows missing]</Paragraph>
    <Paragraph position="1"> One may notice that the ratio of selected taggings to possible taggings decreases with the length of the sentence. This is a general phenomenon explained in (Bonfante et al., 2003).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
6.2 Bottom-up procedure
</SectionTitle>
      <Paragraph position="0"> The execution time for the bottom-up procedure (BUP) grows quickly with the ambiguity of the sentence, so this procedure is not very relevant if it is used alone. However, if it is used as a second step after the incremental procedure, it gives interesting results. In Figure 5, we give the results obtained with the destr abstraction.</Paragraph>
      <Paragraph position="1"> Some other experiments show that we can im-</Paragraph>
    </Section>
  </Section>
</Paper>