<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-2087">
  <Title>Interaction Grammars</Title>
  <Section position="2" start_page="600" end_page="601" type="metho">
    <SectionTitle>
1 Syntactic descriptions as linear logic formulas
</SectionTitle>
    <Paragraph position="0"> IG are formally defined as an ILL theory. Basic objects are syntactic descriptions, which are represented by linear logic formulas in the following form:</Paragraph>
    <Paragraph position="2"> If a syntactic description concerns the dominance relation between syntactic constituents, it has the type Domin; if it concerns the features which are used for characterizing syntactic or semantic properties of constituents, it has the type Feat. Finally, a description can be built recursively from two descriptions in two ways, which are expressed by the two linear logic conjunctions: the multiplicative tensor (⊗) and the additive with (&amp;).</Paragraph>
    <Section position="1" start_page="600" end_page="600" type="sub_section">
      <SectionTitle>
1.1 Multiplicative and additive conjunction of resources in descriptions
</SectionTitle>
      <Paragraph position="0"> A description D1 ⊗ D2 requires all resources of both descriptions D1 and D2, while a description D1 &amp; D2 requires either the resources of D1 or the resources of D2, but not both. This use of the two linear logic conjunctions is consistent with their left introduction rules in the linear sequent calculus:</Paragraph>
      <Paragraph position="2"> In this way, it is possible to describe all syntactic configurations of a word with a single lexical entry in the form of a syntactic description: common parts of these configurations are factorized whereas specific parts are distributed into alternations linked together with the connective with. For instance, a possible lexical entry for the finite verb voit in French has the shape Dvoit = D1 ⊗ (D2 &amp; D3) ⊗ (D4 &amp; D5): D1 contains information related to the subject which is common to all uses of the verb voit; D2 expresses the canonical order subject-verb in the sentence that is headed by the verb voit whereas D3 expresses the reverse order for which the subject must be realized under some conditions, such as in the phrase Marie que voit Jean; D4 expresses that the verb has an explicit object whereas D5 corresponds to circumstances where this object is not present, such as in the sentence Jean voit.</Paragraph>
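      <Paragraph> As an illustration only (it is not part of the original formalism), the resource reading of the two conjunctions can be sketched in Python. The tuple encoding of descriptions and the function name configurations are assumptions made for this sketch: a tensor keeps the resources of both operands, whereas a with selects exactly one operand.

```python
from itertools import product

# Hypothetical encoding: a description is an atom (a string) or a tuple
# (op, left, right) where op is "tensor" (multiplicative conjunction)
# or "with" (additive conjunction).
def configurations(desc):
    """Return every tuple of atomic resources that desc can contribute."""
    if isinstance(desc, str):
        return [(desc,)]
    op, left, right = desc
    lefts, rights = configurations(left), configurations(right)
    if op == "tensor":
        # all resources of both operands are required
        return [l + r for l, r in product(lefts, rights)]
    if op == "with":
        # exactly one operand is selected, never both
        return lefts + rights
    raise ValueError("unknown connective: " + op)

# Shape of the lexical entry discussed above for the verb voit.
d_voit = ("tensor", "D1",
          ("tensor", ("with", "D2", "D3"), ("with", "D4", "D5")))

for config in configurations(d_voit):
    print(config)
```

Under this encoding the single entry Dvoit unfolds into four alternative configurations, each of which contains the factorized part D1.</Paragraph>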
    </Section>
    <Section position="2" start_page="600" end_page="601" type="sub_section">
      <SectionTitle>
1.2 Under-specification of dominance
</SectionTitle>
      <Paragraph position="0"> A predicate N &gt; [N1, ..., Np] expresses that the constituent N is decomposed into the sub-constituents N1, ..., Np. The order between these sub-constituents is only used for identifying each one and carries no linguistic meaning; word order is dealt with at the same level as morphological information by means of features.</Paragraph>
      <Paragraph position="1"> A predicate N1 &gt; N2 expresses that N2 is an immediate sub-constituent of N1. Such a predicate is used when only partial information on the sub-constituents of a phrase is available.</Paragraph>
      <Paragraph position="2"> A predicate N1 &gt;* N2 expresses that N2 is embedded in N1 at an undetermined depth. For instance, if we continue with description D1 related to the verb voit, we can assume that it contains the formula (N3 &gt; [N4, N5]) ⊗ (N4 &gt;* N6), which is interpreted as follows: the verb phrase N3 is constituted of the verb N4 and its object N5; N6 represents the bare verb whereas N4 represents the verb which has been possibly modified by a clitic, a negation or an adverb. Under-specification of the dominance relation N4 &gt;* N6 leaves all these modifications open. Under-specification of dominance between constituents goes beyond TAG adjunction in that the nodes which are in a dominance relation do not necessarily have the same grammatical category and thus linguistic phenomena like wh-extraction can be expressed easily in this way.</Paragraph>
    </Section>
    <Section position="3" start_page="601" end_page="601" type="sub_section">
      <SectionTitle>
1.3 Polarized and under-specified features
</SectionTitle>
      <Paragraph position="0"> Descriptions of type Feat, related to features, have the following form:</Paragraph>
      <Paragraph position="2"> A feature Node : Attr Pol Val is a triplet composed of an attribute Attr, a polarity Pol and a value Val associated with a syntactic node Node. Usually, a feature is defined as a pair (attribute, value). In IG, we add a polarity to this pair so that features behave like electrostatic charges: a positive feature Attr → Val seeks a negative feature Attr ← Val to neutralize it and conversely, while a neutral feature Attr = Val only acts as a filter through constraints on its value Val when it meets another feature of type Attr at the same node.</Paragraph>
      <Paragraph position="3"> In all cases, Val is either a constant which is selected from an infinite countable set Const of feature values or a variable which is selected from an infinite countable set Var of feature variables; then, its definition domain can be constrained by two kinds of predicates: Val ∈ Dom and Val ∉ Dom; Dom is a finite set of elements taken from Const.</Paragraph>
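      <Paragraph> The electrostatic analogy can be made concrete with a small Python sketch. It is a simplification under stated assumptions, not the axiom schemes of the theory: values are concrete constants, the polarity names "+", "-" and "=" and the function name combine are hypothetical, and the treatment of two charges of the same sign as a failure is an assumption of the sketch.

```python
# Polarities used in this sketch: "+" positive, "-" negative, "=" neutral.
def combine(f1, f2):
    """Combine two features (attr, pol, val) meeting at the same node.

    Returns the resulting feature, or None when they are incompatible.
    """
    attr1, pol1, val1 = f1
    attr2, pol2, val2 = f2
    if attr1 != attr2:
        raise ValueError("features of different attributes do not interact")
    if val1 != val2:
        return None                  # values act as a filter: they must agree
    if pol1 == "=":
        return (attr1, pol2, val1)   # a neutral feature only filters
    if pol2 == "=":
        return (attr1, pol1, val1)
    if pol1 != pol2:
        return (attr1, "=", val1)    # opposite charges neutralize each other
    return None                      # assumption: same-sign charges clash

print(combine(("funct", "+", "subj"), ("funct", "-", "subj")))
print(combine(("gen", "=", "m"), ("gen", "=", "f")))
```

The first call models a positive feature meeting the negative feature it seeks; the second models two neutral agreement features whose values disagree.</Paragraph>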
      <Paragraph position="4"> Let us illustrate this presentation with a possible lexical entry for the proper noun Jean:</Paragraph>
      <Paragraph position="6"> Some features are neutral by nature, like agreement features: gen=m (gender=male), num=sg (number=singular), pers=3 (person=3). Others are polarized by nature: for instance, features of type funct, which express syntactic functions. In the example above, the feature of type funct is negative because the noun phrase represented by N is waiting to receive a syntactic function (subject, object...); this function is not determined yet and thus it is represented by a variable.</Paragraph>
      <Paragraph position="7"> The phonological form of a constituent is determined by a system of two features: phon, which gives the effective phonological form of the constituent, and ord, which gives the order in which its immediate sub-constituents must be concatenated to build this phonological form. For instance, we find the formula</Paragraph>
      <Paragraph position="9"> in Dvoit to express that the clause which has the verb voit as its head and is represented by node N1 is a concatenation subject-verb phrase (ord value 12) or verb phrase-subject (ord value 21). When a node has no children, two cases occur: either the node has an empty phonological form and the value of the feature ord is 0, or the node is a lexical anchor and the value of the feature ord is 1. In the latter case, the feature phon is used for retrieving the effective phonological form, which can be verified in the description DJean. Polarization of phonological forms expresses that some constituents are capable of giving a phonological form while others are waiting for one. As the previous example shows, this polarity is not carried by the feature phon but by the feature ord. The interest of giving privilege to the feature ord with respect to the feature phon is twofold. First, we can determine its value for a given node without being aware of the phonological form of the children; the effective phonological form will be rebuilt step by step from the leaves to the root of the final syntactic tree as soon as possible. Second, features of type ord can be dealt with like all other features; in particular, we can apply to them the same type of constraints.</Paragraph>
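      <Paragraph> The way the feature ord drives the construction of a phonological form can be sketched as follows. This is an illustrative Python sketch under assumptions: the function name phonological_form is hypothetical, and the order is read off the decimal digits of the ord value (12 or 21 as in the example above), which only covers nodes with at most nine children.

```python
def phonological_form(ord_value, children_phon=(), anchor_phon=""):
    """Rebuild the phonological form of a node from its ord feature."""
    if not children_phon:
        if ord_value == 0:
            return ""            # empty category: empty phonological form
        if ord_value == 1:
            return anchor_phon   # lexical anchor: form given by its phon feature
        raise ValueError("a childless node must have ord 0 or 1")
    # Assumption: the digits of ord list the children in concatenation order.
    order = [int(digit) for digit in str(ord_value)]
    if sorted(order) != list(range(1, len(children_phon) + 1)):
        raise ValueError("ord must encode a permutation of the children")
    parts = [children_phon[k - 1] for k in order]
    return " ".join(part for part in parts if part)

print(phonological_form(12, ("Jean", "voit")))  # subject, then verb phrase
print(phonological_form(21, ("Jean", "voit")))  # verb phrase, then subject
```

Because the function only needs the ord value and the forms already computed for the children, it can be applied bottom-up, from the leaves to the root.</Paragraph>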
      <Paragraph position="10"> Finally, it is interesting to mention that value sharing by different features is represented in an easy way by using a unique variable for the values of the concerned features.</Paragraph>
    </Section>
  </Section>
  <Section position="3" start_page="601" end_page="603" type="metho">
    <SectionTitle>
2 Syntactic composition as deduction in a linear theory
</SectionTitle>
    <Paragraph position="0"> By choosing a logical framework for a formal definition of IG, we find a natural way of expressing syntactic composition by means of deduction in linear logic according to the paradigm "parsing as deduction" of CG (for a broad survey of CG see (Retoré, 2000)). An interaction grammar is lexicalized in the sense that all linguistic resources are stored in a lexicon and these resources will be combined by using inference rules of the ILL deductive system for building the acceptable sentences of the corresponding language. Since syntactic descriptions use only a fragment of this logic, only seven ILL rules are useful if we choose the framework of the sequent calculus:</Paragraph>
    <Paragraph position="2"> With respect to the usual presentation of the ILL sequent calculus (Lincoln, 1992), axiom id is defined a bit differently but this definition is equivalent to the original one for the logical fragment used by IG.</Paragraph>
    <Paragraph position="3"> Rule ∀L is a first order rule which is used here for instantiating a node variable with a concrete node or a feature variable with a concrete feature value.</Paragraph>
    <Paragraph position="4"> Besides these general rules, we need proper axioms to express properties related to dominance relations, feature polarities, feature values and phonological forms. Concerning dominance relations, we have the following proper axiom schemes:</Paragraph>
    <Paragraph position="6"> Axiom scheme d1 expresses that immediate dominance is realized by a parent-children relation whereas axiom schemes d2 and d3 express that dominance is realized by finite sequences of parent-children relations (L and L' represent sequences of node variables).</Paragraph>
    <Paragraph position="7"> The behaviour of polarities is represented by the following proper axiom schemes:</Paragraph>
    <Paragraph position="9"> Properties related to feature domains and values are expressed by the following axiom schemes:</Paragraph>
    <Paragraph position="11"> In both axiom schemes, D represents a concrete finite set of feature values taken from Const, and ∪ and \ represent the usual operations of union and difference of sets.</Paragraph>
    <Paragraph position="12"> Finally, three axiom schemes are used for deducing the effective phonological form of a constituent from the order of the phonological forms of its children: (ph1) (N &gt; []), (N : ord = 0) ⊢ (N &gt; []) ⊗ (N : phon = ε); (ph2) (N &gt; []), (N : ord = 1) ⊢ (N &gt; []); (ph3) (N &gt; [N1, ..., Np]), O, P1 ⊢ (N &gt; [N1, ..., Np]) ⊗ P2. Schemes ph1 and ph2 respectively correspond to empty categories and lexical anchors.</Paragraph>
    <Paragraph position="13"> In scheme ph3, O is an abbreviation for (N : ord = c(σ)); σ is a permutation on ⟦1, p⟧ which expresses an order for concatenating the phonological forms v1, ..., vp of the children nodes N1, ..., Np of N and c(σ) is a bijective encoding of this permutation with an integer. P1 is an abbreviation for (N1 : phon = v1), ..., (Np : phon = vp) and P2 an abbreviation for the product</Paragraph>
    <Paragraph position="15"> A particular interaction grammar G is defined by its vocabulary VocG and by a lexicon LexG; the vocabulary VocG includes the words used for building the language LG generated by this grammar and the lexicon LexG associates a syntactic description to each word of VocG. Now, we have to combine the resources provided by LexG by means of the inference rules and proper axioms of the linear theory T which has just been defined, in order to compose well-formed and complete syntactic structures of G in the shape of closed syntactic descriptions. As a preliminary, we have to give a precise definition of a closed syntactic description: A closed syntactic description is a particular syntactic description in the shape S ⊗ F where S and F, respectively, represent the structural and feature parts of the description with the following conditions: 1. S is a product of predicates in the form (n &gt; [n1, ..., np]) where n, n1, ..., np represent concrete syntactic nodes, and the structure defined by all these parent-children relations is a tree;  2. F is a product of predicates in the form (n : attr = v), where n, attr and v represent concrete atoms, and for each pair (n, attr) present in F, there is exactly one feature (n : attr = v) in F.</Paragraph>
    <Paragraph position="16"> 3. For every syntactic node n in S, there is a feature (n : phon = v) in F.</Paragraph>
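    <Paragraph> The three conditions can be checked mechanically. The following Python sketch is only an illustration under assumptions: the structural part S is encoded as a dictionary mapping each node to the list of its children, the feature part F as a list of (node, attr, value) triples, and the function name is_closed is hypothetical.

```python
def is_closed(structure, features):
    """Check the three conditions on a candidate closed description.

    structure: dict mapping each node to the list of its children.
    features:  list of (node, attr, value) triples.
    """
    nodes = set(structure)
    for children in structure.values():
        nodes.update(children)

    # Condition 1: the parent-children relations define a tree.
    parents = {}
    for parent, children in structure.items():
        for child in children:
            if child in parents:
                return False          # a node with two parents
            parents[child] = parent
    roots = [n for n in nodes if n not in parents]
    if len(roots) != 1:
        return False                  # no root or several roots
    seen, stack = set(), [roots[0]]
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(structure.get(node, []))
    if seen != nodes:
        return False                  # some nodes are cut off from the root

    # Condition 2: exactly one feature for each (node, attr) pair in F.
    keys = [(n, attr) for n, attr, _ in features]
    if len(keys) != len(set(keys)):
        return False

    # Condition 3: every node carries a phon feature.
    with_phon = {n for n, attr, _ in features if attr == "phon"}
    return nodes.issubset(with_phon)

tree = {"n1": ["n2", "n3"], "n2": [], "n3": []}
feats = [("n1", "phon", "Jean voit"), ("n2", "phon", "Jean"),
         ("n3", "phon", "voit"), ("n2", "gen", "m")]
print(is_closed(tree, feats))
```

Since all features of a closed description are neutral, the polarity is left out of the triples in this encoding.</Paragraph>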
    <Paragraph position="17">  Condition 1 guarantees that a closed syntactic description represents a completely specified tree. Condition 2 guarantees coherence and neutrality of the feature system which is attached at each syntactic node. Condition 3 guarantees the phonological well-formedness of the whole syntactic structure. Now, let us explain how G generates closed syntactic descriptions from n lexical entries Dw1, ..., Dwn corresponding to n words w1, ..., wn taken from VocG. For this, we need an additional description Droot to represent the root of the final syntactic tree, which has the form: (N0 &gt;* N1) ⊗ ... ⊗ (N0 &gt;* Np) ⊗ (N0 : ord ← V0). Node N0 represents the root of the syntactic tree and N1, ..., Np are the nodes present in descriptions Dw1, ..., Dwn. Then: A closed syntactic description D is said to be generated from the words w1, ..., wn by grammar G if the sequent ∀N ∀V (Droot ⊗ Dw1 ⊗ ... ⊗ Dwn) ⊢ D is provable in the theory T (N and V represent all node variables and feature variables that are free in Droot, Dw1, ..., Dwn).</Paragraph>
    <Paragraph position="18"> D describes a tree which represents the syntax of a phrase given by the feature phon of its root. If we add the predicate (N0 : phon = w1 ... wn) to Droot, we transform the generation of closed syntactic descriptions into parsing of the phrase w1 ... wn.  Continuing with the verb voit, let us give a very simple illustration of this mechanism. We assume that a lexicon provides us with three descriptions Dvoit, Dil and DJean which respectively correspond to the finite verb voit, the personal pronoun il and the proper noun Jean. As described in sub-section 1.1, Dvoit has the shape D1 ⊗ (D2 &amp; D3) ⊗ (D4 &amp; D5) and it is schematized by the following diagram:</Paragraph>
    <Paragraph position="20"> To remain readable, the diagram includes only the most significant features of every node. The notation ord → 12|21 is an abbreviation for ord → V with V ∈ {12, 21} and ord ← means that the value of the feature ord is undetermined.</Paragraph>
    <Paragraph position="21"> Description Dil has a structure that is similar to</Paragraph>
    <Paragraph position="23"> The first additive component of description Dil, D7 &amp; D8, represents a choice between the absence of an explicit subject in the sentence beside the personal pronoun il, such as in the sentence il voit Jean, and the presence of this subject, such as in the sentence Jean voit-il ? The second alternative entails that the sentence is interrogative if we ignore topicalization, which explains description D8.</Paragraph>
    <Paragraph position="24"> The second additive component of description Dil, D9 &amp; D10, represents a choice between the declarative type and the interrogative type of the sentence which depends on the relative order between the verb and the clitic.</Paragraph>
    <Paragraph position="25"> Description DJean is reduced to the following single node:</Paragraph>
    <Paragraph position="27"> From the description ∀N ∀V (Droot ⊗ Dvoit ⊗ DJean ⊗ Dil), it is possible to deduce three closed syntactic descriptions Da, Db and Dc, which respectively represent the syntax of the grammatical sentences il voit Jean, voit-il Jean ? and Jean voit-il ? In concrete terms, the deduction process that leads to these three solutions consists in plugging nodes of the initial descriptions with the aim of neutralizing all polarized features while respecting dominance and feature constraints. Let us detail the resulting description Db by means of the syntactic tree it specifies:</Paragraph>
    <Paragraph position="29"> The closed syntactic description that specifies the tree above represents the syntactic structure of the sentence voit-il Jean ? The numbers that label its nodes are the traces of the nodes of the descriptions that have been plugged in the parsing process.</Paragraph>
  </Section>
  <Section position="4" start_page="603" end_page="604" type="metho">
    <SectionTitle>
3 A constraint-based implementation
</SectionTitle>
    <Paragraph position="0"> From the viewpoint of a computer scientist, a linguistic model has to show not only expressive power but also computational tractability. In the previous section, we have shown that IG computations reduce to ILL proofs. For the logical fragment that we consider here, three logical rules are a source of non-determinism in proof search: &amp;L1, &amp;L2 and ∀L. This takes the shape of three kinds of choice points in the parsing process: selecting the pertinent branch for every additive conjunction, identifying some node variables and instantiating feature variables in an appropriate manner. The NP-completeness of the implicative fragment of ILL (Kanovich, 1992) shows that it is hopeless to find a general parsing algorithm for IG that works in polynomial time in the worst cases. Experience has shown that, fortunately, these worst cases rarely occur in parsing natural languages. Nevertheless, the flexibility of IG entails a combinatorial explosion of the parsing process if we use a "generate and test" method and leads us to choose a more appropriate method. The specification of our problem prompts us in a natural way to a constraint-based approach, as was suggested by some proposals for similar problems (Duchier and C., 1999; Duchier and Thater, 1999).</Paragraph>
    <Paragraph position="1"> The problem can be formulated as follows: Given a syntactic description D0, find all closed syntactic descriptions D such that ∀N ∀V D0 ⊢ D is provable in the theory T (N and V respectively represent the node variables N1, ..., Nn and the feature variables V1, ..., Vm of D0).</Paragraph>
    <Paragraph position="2"> A fundamental property of the deduction process that leads to a solution is monotonicity, so that the problem can be expressed as a constraint satisfaction problem (CSP). A CSP is specified from a set of variables to which constraints are applied. Here, we consider three sets of variables, which correspond to the three kinds of choice points in the parsing process:</Paragraph>
  </Section>
  <Section position="5" start_page="604" end_page="605" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> 1. the set {N1, ..., Nn} of syntactic node variables; 2. the set {V1, ..., Vm} of feature variables; 3. the set {S1, ..., Sp} of selection variables; every selection variable Si is an integer variable which is associated with a connective &amp; of D0 and which is used for indicating the rank of the component of the corresponding additive conjunction that is selected in the deduction. Selection and feature variables are considered as finite domain variables, which implies that all feature values are encoded as integers (one exception is that features of type phon remain strings).</Paragraph>
    <Paragraph position="1"> Node variables are encoded indirectly via finite set variables by using the method proposed in (Duchier and C., 1999). Every node variable Ni is associated with five finite set variables eq(i), up(i), down(i), side(i) and alt(i) which are used for locating the node i with respect to the others in the system of dominance relations. Because of the presence of additive conjunctions, a node i which is present in the description D0 may be absent from a solution. In this case, eq(i) = {i}, alt(i) = ⟦1, n⟧ \ {i}, up(i) = down(i) = side(i) = ∅; in the case that i is present in a solution, alt(i) represents the nodes that are not selected in the solution whereas the selected nodes are distributed into the four sets eq(i), up(i), down(i) and side(i) according to their relative position with respect to i.</Paragraph>
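    <Paragraph> To make the role of the five sets concrete, the following Python sketch computes them for one node variable once a solution is known. It is an illustration under assumptions and not the constraint encoding itself, which works on set variables whose values are not yet known: the names locate, interp (the mapping from selected node variables to concrete tree nodes) and tree_parent are hypothetical.

```python
def locate(i, variables, interp, tree_parent):
    """Return the five sets eq, up, down, side and alt for node variable i.

    variables:   all node variable indices of the initial description.
    interp:      maps each selected variable to a concrete tree node.
    tree_parent: maps each concrete tree node to its parent (None for the root).
    """
    everything = set(variables)
    if i not in interp:
        # i was discarded by the choice of an additive component
        return {"eq": {i}, "up": set(), "down": set(),
                "side": set(), "alt": everything - {i}}

    def ancestors(node):
        found = set()
        node = tree_parent[node]
        while node is not None:
            found.add(node)
            node = tree_parent[node]
        return found

    selected = set(interp)
    above = ancestors(interp[i])
    eq = {j for j in selected if interp[j] == interp[i]}
    up = {j for j in selected if interp[j] in above}
    down = {j for j in selected if interp[i] in ancestors(interp[j])}
    side = selected - eq - up - down
    return {"eq": eq, "up": up, "down": down,
            "side": side, "alt": everything - selected}

# Variables 2 and 3 are identified with the same tree node; 5 is not selected.
tree_parent = {"a": None, "b": "a", "c": "a"}
interp = {1: "a", 2: "b", 3: "b", 4: "c"}
sets = locate(2, range(1, 6), interp, tree_parent)
print(sets)
```

In both cases the five sets returned for i form a partition of the set of all node variables.</Paragraph>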
    <Paragraph position="2"> Constraints on the variables of the problem are divided into two parts: * general constraints guarantee that the solutions D are effective closed syntactic descriptions; * specific constraints guarantee that the solutions D are models of the initial description D0.</Paragraph>
    <Section position="1" start_page="604" end_page="604" type="sub_section">
      <SectionTitle>
3.1 General constraints
</SectionTitle>
      <Paragraph position="0"> Treeness constraints. For every node i, the partition of ⟦1, n⟧ between eq(i), up(i), down(i), side(i) and alt(i) guarantees that the solution is a directed acyclic graph (DAG).</Paragraph>
      <Paragraph position="1"> To express that all dominance relations which structure a solution must only be realized by parent-children relations, we must introduce constraints over variables of type eq(i) and selection variables, stating that every selected node variable must be identified with a node variable which is the parent in a selected parent-children relation.</Paragraph>
      <Paragraph position="2"> In order to express that a solution is more than a DAG, that is a tree, we must add the following constraint: for every selected parent-children relation, the sets down(j) for the children j present in this relation must be disjoint. Such a condition can be dropped if we want to extend the formalism to take into account resource sharing, like coordination for instance; in this case, syntactic structures are no longer trees but DAGs.</Paragraph>
      <Paragraph position="3"> Neutrality constraints. Feature neutrality of a solution is guaranteed by constraints which also appeal to variables of type eq(i) and selection variables: for each attribute Attr, we consider two sets of sets in the shape eq(i): the first corresponds to all selected predicates in the form (Ni : Attr ← V) and the second to all selected predicates in the form (Ni : Attr → V). The elements of each of these sets must be disjoint sets and every element of the first set must be identified with one element of the second and conversely.</Paragraph>
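      <Paragraph> For one attribute, the neutrality condition can be tested on a candidate solution as in the following Python sketch. It is an illustration under assumptions: the sets eq(i) are given as plain Python sets of node indices for the selected negative and positive predicates, and the function name neutral is hypothetical.

```python
def neutral(negative_eqs, positive_eqs):
    """Check neutrality for one attribute.

    negative_eqs: the sets eq(i) of the selected negative predicates.
    positive_eqs: the sets eq(i) of the selected positive predicates.
    """
    def pairwise_disjoint(sets):
        seen = set()
        for current in sets:
            if not seen.isdisjoint(current):
                return False      # two charges of the same sign on one node
            seen |= current
        return True

    neg = [frozenset(s) for s in negative_eqs]
    pos = [frozenset(s) for s in positive_eqs]
    # every negative class must coincide with a positive class and conversely
    return pairwise_disjoint(neg) and pairwise_disjoint(pos) and set(neg) == set(pos)

print(neutral([{1, 4}], [{4, 1}]))   # a negative feature met by a positive one
print(neutral([{1}], [{2}]))         # charges left on two different nodes
```

In the constraint-based setting the same condition is imposed on set variables before their values are known, which is what allows it to prune the search space.</Paragraph>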
      <Paragraph position="5"> Other general constraints related to features and phonological forms are trivial.</Paragraph>
    </Section>
    <Section position="2" start_page="604" end_page="605" type="sub_section">
      <SectionTitle>
3.2 Specific constraints
</SectionTitle>
      <Paragraph position="0"> Such constraints are determined by D0. Dominance constraints are easily implemented by combining selection variables and variables of type eq(i), up(i), down(i), side(i) (Duchier and Thater, 1999).</Paragraph>
      <Paragraph position="1">  Feature constraints concern both feature variables and selection variables, which are all finite domain variables, so that their implementation appeals to classical tools in the domain of constraint programming.</Paragraph>
    </Section>
    <Section position="3" start_page="605" end_page="605" type="sub_section">
      <SectionTitle>
3.3 A prototype parser for Ih'ench
</SectionTitle>
      <Paragraph position="0"> We have implemented a prototype parser for French.</Paragraph>
      <Paragraph position="1"> It is written in the language Oz (Smolka, 1995), which combines various aspects and modules, including constraint programming. Though the linguistic coverage of the lexicon is still limited, we have learnt lessons from the first experiments: in particular, neutrality constraints play a central role in restricting the search space, which confirms the importance of polarities for computational efficiency. Conclusion: Starting from TAG and CG, we have presented a linguistic formalism which aims at better capturing the flexibility of natural language by using two notions as its basis: underspecification and polarities. In some sense, they correspond to two important properties of natural language: ambiguity and resource sensitivity.</Paragraph>
      <Paragraph position="2"> Regarding parsing as a constraint satisfaction problem fits in with the flexibility of the formalism in terms of computational efficiency and, at the same time, it allows us to move towards robustness, beyond a traditional view of parsing in which only grammatical and completely specified structures are taken into account.</Paragraph>
      <Paragraph position="3"> The success of IG does not essentially depend on the formal properties that are usually exhibited for grammatical formalisms: the characterization of the class of languages that are generated by these grammars or the complexity of general parsing algorithms. Formal properties matter but with respect to an essential goal: to extend the linguistic coverage of IG from toy lexicons to massive lexical databases. For this, IG have some advantages by making it easy to factorize and modularize information: such properties are decisive when one wants to extract information from a lexical database efficiently or to update data while maintaining the coherence of the whole base.</Paragraph>
      <Paragraph position="4"> The success of IG will also depend on their capacity to integrate other linguistic levels than the syntactic level, the semantic level especially.</Paragraph>
    </Section>
  </Section>
</Paper>