<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1019"> <Title>Parsing Ambiguous Structures using Controlled Disjunctions and Unary Quasi-Trees</Title> <Section position="4" start_page="125" end_page="128" type="metho"> <SectionTitle> 3 Controlled Disjunctions </SectionTitle> <Paragraph position="0"> Controlled disjunctions (hereafter CD) implement the relations that hold between ambiguous feature values. The example in figure (1) describes a non-covariant relation between the GENDER and HEAD features.</Paragraph> <Paragraph position="1"> More precisely, this relation is oriented: if the object is a noun, then its gender is masculine, and if the object is feminine, then it is an adjective.</Paragraph> <Paragraph position="2"> The relation between these values can be represented as implications: noun => masc and fem => adj. The main interest of CDs is that they represent the variancy between the possible values and control this variancy by complex formulae.</Paragraph> <Paragraph position="3"> Controlled disjunctions reference the formulae by name, and all the formulae are ordered. We can therefore refer directly to one of the disjuncts (or to a set of linked disjuncts) by the name of the disjunction and the rank of the disjunct. For clarity, we represent, as in figure (2), the consequent of the implication with a pair indexing the antecedent. This pair indicates the name of the disjunction and the rank of the disjunct. In this example, noun(2,1) implements noun => masc: the pair (2, 1) references the disjunct of rank 1 in the disjunction named 2. The example (3) presents the case of an ambiguity that cannot be totally controlled by an ND. This structure indicates a set of variancies, but the covariancy representation only implements a part of the relations. In fact, several &quot;complex&quot; implications (i.e. 
with a conjunction as antecedent) control these formulae as follows: {a ∧ c => f, b ∧ d => e, c ∧ e => b, d ∧ f => a}. These implications (the &quot;controlling formulae&quot;) are constraints on the positions of the disjuncts in the CD. The formula in example (4) presents a solution that uses CDs and implements all the relations. In this representation, (i = 1) ∧ (j = 1) => (k = 2) implements the implication a ∧ c => f. The set of constraints is given in brackets.</Paragraph> <Paragraph position="4"> The feature structure, constrained by this set, simply contains the elementary variations.</Paragraph> <Paragraph position="6"> From an implementation point of view, controlled disjunctions can easily be implemented in languages that provide delaying devices. An implementation using functions in Life has been described in (Blache97).</Paragraph> <Section position="1" start_page="126" end_page="128" type="sub_section"> <SectionTitle> 4.1 Unary Quasi-Trees </SectionTitle> <Paragraph position="0"> (Vijay-Shanker92) proposes the use of tree descriptions called quasi-trees within the framework of TAG. Such structures rely on the generalization of hierarchical relations between constituents. These trees bear some particular nodes, called quasi-nodes, each of which consists of a pair of categories of the same type. These categories may or may not refer to the same object. If they do not, a subtree will be inserted between them in the final structure.</Paragraph> <Paragraph position="1"> Such an approach is particularly interesting for the description of generalizations. The basic principle in TAG consists in preparing subtrees which are part of the final syntactic structure. These subtrees can be of a level greater than one: in this case, the tree predicts the hierarchical relations between a category and its ancestors. 
Quasi-trees generalize this approach using a meta-level representation that describes the general shape of the final syntactic tree.</Paragraph> <Paragraph position="2"> The idea of unary quasi-trees relies on essentially the same generalization: we propose to indicate at the lexical level some generalities about the syntactic relations. Unlike quasi-trees, the only kind of information represented here concerns hierarchy. No other information, such as subcategorization, is present. This explains why we use unary trees.</Paragraph> <Paragraph position="3"> Several properties characterize unary quasi-trees (hereafter UQTs): * A UQT is interpreted from the leaf (the lexical level) to the root (the propositional one).</Paragraph> <Paragraph position="4"> * A relation between two nodes α and β (α dominating β) indicates, in a simple PSG representation, that there exists a derivation of the form α ⇒* B such that β ∈ B.</Paragraph> <Paragraph position="6"> pose a node, this set is interpreted as a disjunction. Such nodes are called ambiguous nodes. A categorial ambiguity is then represented by a unary quasi-tree in which each node is a set of objects. * Each node is a disjunctive formula belonging to a covariant disjunction.</Paragraph> <Paragraph position="7"> * A UQT is limited to three levels: lexical, phrase-structure and propositional. The example (5) shows the UQT corresponding to the word mobile with an adjective/noun ambiguity. For clarity's sake, the tree is presented upside-down, with the leaf at the top and the root at the bottom. 
This example indicates that: * an adjective is a daughter of an AP, which is in turn a daughter of an NP, * a noun is a daughter of an NP, which is in turn a daughter of an unspecified phrase XP.</Paragraph> <Paragraph position="8"> 3 These objects, as with quasi-trees, can be atomic symbols or feature structures, depending on the linguistic formalism.</Paragraph> <Paragraph position="9"> As indicated before, each node represents a disjunctive formula and the set of nodes constitutes a covariant disjunction. Since this information is systematic, it is left implicit in the representation of UQTs (i.e. no names are indicated). The position of a value within a node is therefore relevant: it identifies the related values in the tree.</Paragraph> <Paragraph position="10"> This kind of representation can be systematized to the major categories, and we can propose a set of elementary hierarchies, as shown in figure (6), which are used to construct the UQTs.</Paragraph> <Paragraph position="11"> It is interesting to note that the notion of UQT can be represented in different formalisms, even ones that are not based on a tree representation. Figure (2) shows, for example, an HPSG implementation of the UQT described in figure (1).</Paragraph> <Paragraph position="12"> In this example, we can see that the ambiguity is not systematically propagated to all the levels: in the second-level substructure, both values belong to the same feature (HEAD-DAUGHTER). The covariation here concerns different features at different levels. There is, for example, a covariation between the HEAD features of the second level and the type of the daughter at the third level. Moreover, we can see that the noun can be projected into an NP, but this NP can be either a complement or a subject daughter. This ambiguity is represented by an embedded variation (in this case a simple disjunction). 
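The position-based covariation of the UQT for mobile described above can be sketched in Python. This is only an illustrative sketch, assuming that a UQT is stored as a list of levels read from the leaf to the root and that alternatives sharing the same position covary; the names UQT, MOBILE, readings and select are hypothetical and do not come from the paper or from any existing implementation.

```python
from typing import List, Tuple

# Assumption: a UQT is a list of levels, read from the leaf (lexical level)
# to the root. Each level is a tuple of alternatives; alternatives that share
# the same position across levels covary, so no disjunction names are needed.
UQT = List[Tuple[str, ...]]

# Hypothetical encoding of the UQT for "mobile" (adjective/noun ambiguity).
MOBILE: UQT = [
    ("Adj", "N"),   # lexical level
    ("AP", "NP"),   # phrase-structure level
    ("NP", "XP"),   # third level (root of the UQT)
]


def readings(uqt: UQT) -> List[Tuple[str, ...]]:
    """Return one leaf-to-root path per position of the covariant disjunction."""
    width = len(uqt[0])
    if any(len(level) != width for level in uqt):
        raise ValueError("all levels must list the same number of alternatives")
    return [tuple(level[i] for level in uqt) for i in range(width)]


def select(uqt: UQT, level: int, value: str) -> List[Tuple[str, ...]]:
    """Keep only the readings whose node at the given level equals value."""
    return [path for path in readings(uqt) if path[level] == value]
```

With this encoding, select(MOBILE, 1, "NP") keeps only the noun reading, because NP at the second level occupies the same position as N at the lexical level.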
The example described in figure (3) shows a French lexical item that can be categorized as an adjective, a noun or a verb (translated respectively as firm, farm and to close). In comparison with the previous example, adding the verb subcase simply consists in adding the corresponding basic tree to the structure. In this case, the covariant part of the structure has three subcases.</Paragraph> <Paragraph position="13"> This kind of representation can be considered as a description in the sense that it works as a constraint on the corresponding syntactic structure.</Paragraph> </Section> <Section position="2" start_page="128" end_page="128" type="sub_section"> <SectionTitle> 4.2 Using UQTs </SectionTitle> <Paragraph position="0"> UQTs represent ambiguities at the phrase-structure level. Such a representation has several advantages. In this section, we focus more particularly on factorization and on the representation of different kinds of constraints used to control the parsing process. The example in figure (4) presents an ambiguity which &quot;disappears&quot; at the third level of the UQT. This (incomplete) NP contains two elements with a classical adj/noun ambiguity. In this case, both combinations are possible, but the root type is always nominal. This is an example of an ambiguous structure that does not need to be disambiguated (at least at the syntactic level): the parser can use this structure directly 4.</Paragraph> <Paragraph position="1"> As seen before, controlled disjunctions can represent different kinds of relations within a structure very precisely. Applying this technique to UQTs allows the representation of dynamic relations that depend on the context. Such constraints use the selection relations existing between two categories. In case of ambiguity, they can be applied to an ambiguous group in order to eliminate inconsistencies and control the parsing process. In this case, the goal is not to disambiguate the structure, but (i) to delay the evaluation and maintain the ambiguity and (ii) to reduce the set of solutions. Figure (5) shows an example of the application of this technique.</Paragraph> <Paragraph position="2"> 4 We can also notice that covariation implements the relation between the categories in order to inhibit the noun/noun or adj/adj possibilities (cf. the CD number 1).</Paragraph> <Paragraph position="3"> The selection constraints are applied between some values of the UQTs. These relations are represented by arcs between the nodes at the lexical level. They indicate the possibility of cooccurrence of two juxtaposed categories. The constraints represented by arrows indicate subcategorization. If such a constraint is applied to an ambiguous area, then it can be propagated using the selection constraints within this area. In this example, there is a selection relation between the root S of the UQT describing &quot;possède&quot; and the node value NP at the second level of the UQT describing &quot;ferme&quot;. This information is propagated to the rest of the UQT and then to the previous element using the relation existing between the values N of &quot;ferme&quot; and Adj of &quot;belle&quot;. All these constraints are represented using controlled disjunctions: each controller value bears the references of the controlled one, as described in section (3).</Paragraph> <Paragraph position="4"> The interest of this kind of constraint is that such constraints constitute a local network which defines, in some way, a controlled ambiguous area. The parsing process itself can generate new selection constraints to be applied to an entire area (for example the selection of an NP by a verb). In this case, the constraint can be propagated through the network and eliminate inconsistent solutions (and possibly disambiguate the structure completely). This pre-parsing strategy relies on a kind of head-corner method. 
But the main goal here, as at the lexical level, is to provide constraints controlling the disambiguation of the structures, not a complete parsing strategy.</Paragraph> </Section> </Section> </Paper>