Parsing Ambiguous Structures using Controlled Disjunctions 
and Unary Quasi-Trees 
Philippe Blache 
LPL - CNRS 
29 Avenue Robert Schuman 
F-13621 Aix-en-Provence 
pb~ipl, univ-aix, fr 
Abstract 
The problem of parsing ambiguous structures 
concerns (i) their representation and (ii) the spec- 
ification of mechanisms allowing to delay and 
control their evaluation. We first propose to use 
a particular kind of disjunctions called controlled 
disjunctions: these formulae allows the represen- 
tation and the implementation of specific con- 
stralnts that can occur between ambiguous val- 
ues. But an efficient control of ambiguous struc- 
tures also has to take into account lexical as well 
as syntactic information concerning this object. 
We then propose the use of unary quasi-trees 
specifying constraints at these different levels. 
The two devices allow an efficient implementa- 
tion of the control of the ambiguity. Moreover, 
they are independent from a particular formalism 
and can be used whatever the linguistic theory. 
1 Introduction 
Most of the approaches dealing with ambi- 
guity are disambiguating techniques. This 
preliminary constatation seems trivial and 
relies on a simple presuposition: the am- 
biguous structures need to be disambiguated. 
However, this is not true from several re- 
spects. Machine translation is a good ex- 
ample: the ambiguity of a sentence in the 
source language needs very often to be pre- 
served and translated into the target one (cf. 
(Wedekind97)). 
Another remark, in the same perspective: 
most of the disambiguating techniques rely 
on a single linguistic level. In other words, 
they generally make use of lexical or syn- 
tactic or semantic information, exclusively. 
But a natural processing of natural language 
should not work in this way. All the linguis- 
tic levels of NLP (i.e. phonetic, phonologic, 
lexical, syntactic, semantic and pragmatic) 
have to be taken into account at the same 
time. In other words, processing ambigu- 
ity would have to be parallel, not sequen- 
tial. The problem is then to use ambiguous 
structures during the parse without blocking 
the analysis. In a first approximation, such a 
problem comes to parse using underspecified 
structures. We will see that this constitutes 
a part of the solution. 
The third and last preliminary remark fo- 
cuses on the control strategies for the evalu- 
ation of ambiguous structures. These strate- 
gies can rely on the formal properties of the 
ambiguous structure (for example the sim- 
plification of a disjunctive formula), on the 
contextual relations, etc. But the ambiguous 
objects can themselves bear important infor- 
mation specifying some restrictions. We will 
develop in this paper several examples illus- 
trating this point. The approach described 
here make an intensive use of this kind of 
constraints, also called control relations. 
We present in this paper a technique called 
controlled disjunctions allowing to represent 
and implement an efficient control of am- 
biguous structures at the lexical and phrase- 
structure level. We illustrate this technique 
using the HPSG framework, but it could be 
used in all kind of feature-based representa- 
tions. This approach relies (i) on the rep- 
resentation of constraints relations between 
the feature values and (ii) on the propaga- 
tion of such relations. We insist on the fact 
that this is not a disambiguating technique, 
but a control of the evaluation of ambigu- 
ous structures. In order to increase the num- 
ber of constraints controlling an ambiguous 
structure, we generalize the use of control re- 
124 
mobile = 
- r~,.., A<lsl I~A., ~o..ll .. // 
CAT adj ~" -ho~ L .l| L,,,,,.,<,<o,< 
is,,,< 
Figure 1: Control relatio~-'within a lexical entry 
lations at the phrase-structure level. We pro- 
pose for that a particular representation of 
hierarchical relations for ambiguous objects 
called unary quasi-trees. 
This paper is threefold. In a first section, 
we present the limits of the classical repre- 
sentation of ambiguity and in particular the 
technique of named disjunctions. The second 
section describes the controlled disjunction 
method applied to the lexical level. We de- 
scribe in the third section the generalization 
of this technique to the phrase-structure level 
using unary quasi-trees and we show how this 
approach is useful for an online control of the 
ambiguity during the parse. 
2 Ambiguity and Disjunctions 
Several techniques have been proposed for 
the interpretation and the control of dis- 
junctive structures. For example, delay- 
ing the evaluation of the disjunctive for- 
mulae until obtaining enough information 
allows partial disambiguation (cf. (Kart- 
tunen84)). Another solution consists in con- 
verting the disjunctive formulae into a con- 
junctive form (using negation) as proposed 
by (Nakazawa88) or (Maxwell91). We can 
also make use of the properties of the for- 
mula in order to eliminate inconsistencies. 
This approach, described in (Maxwell91), re- 
lies on the conversion of the original disjunc- 
tive formulae into a set of contexted con- 
straints which allows, by the introduction of 
propositional variables (i) to convert the for- 
mulae into a conjunctive form, and (ii) to 
isolate a subset of formulae, the disjunctive 
residue (the negation of the unsatisfiable con- 
straints). The problem of the satisfiability of 
the initial formula is then reduced to that of 
the disjunctive residue. 
This approach is fruitful and several meth- 
ods rely on this idea to refer formulae with 
an index (a propositional variable, an integer, 
etc.). It is the case in particular with named 
disjunctions (see (DSrre90), (Krieger93) or 
(Gerdemann95)) which propose a compact 
representation of control phenomena and co- 
variancy. 
A named disjunction (noted hereafter ND) 
binds several disjunctive formulae with an in- 
dex (the name of the disjunction). These for- 
mulae have the same arity and their disjuncts 
are ordered. They are linked by a covariancy 
relation: when one disjunct in a ND is se- 
lected (i.e. interpreted to true), then all the 
disjuncts occurring at the same position into 
the other formulae of the ND also have to 
be true. The example (1) presents the lexi- 
cal entry of the german determiner den. The 
covariation is indicated by three disjunctive 
formulae composing the named disjunction 
indexed by 1. 
(i) 
den :P= f ill L"O'x v, ,,<jj 
But the named disjunction technique also 
has some limits. In particular, NDs have to 
represent all the relations between formulae 
in a covariant way. This leads to a lot of 
redundancy and a loss of the compactness 
in the sense that the disjuncts don't contain 
anymore the possible values but all the pos- 
sible variancies according to the other formu- 
lae. 
125 
Some techniques has been proposed in or- 
der to eliminate this drawback and in par- 
ticular: the dependency group representa- 
tion (see (Griffith96)) and the controlled dis- 
junctions (see (Blache97)). The former re- 
lies on an enrichment of the Maxwell and 
Kaplan's contexted constraints. In this ap- 
proach, constraints are composed of the con- 
junction of base constraints (corresponding 
to the initial disjunctive form) plus a control 
formula representing the way in which values 
are choosen. The second approach, described 
in the next section, consists in a specific rep- 
resentation of control relations relying on a 
clear distinction between (i) the possible val- 
ues (the disjuncts) and (ii) the relations be- 
tween these ambiguous values and other ele- 
ments of the structure. This approach allows 
a direct implementation of the implication 
relations (i.e. the oriented controls) instead 
of simple covariancies. 
3 Controlled Disjunctions 
The controlled disjunctions (noted hereafter 
CD) implement the relations existing be- 
tween ambiguous feature values. The exam- 
ple of the figure (1) describes a non covariant 
relation between GENDER and HEAD features. 
More precisely, this relation is oriented: if the 
object is a noun, then the gender is mascu- 
line and if the object is feminine, then it is 
an adjective. 
The relation between these values can be 
represented as implications: noun => masc 
and fem :=~ adj. The main interest of CDs 
is the representation of the variancy between 
the possible values and the control of this 
variancy by complex formulae. 
Controlled disjunctions reference the for- 
mulae with names and all the formula are 
ordered. So, we can refer directly to one of 
the disjuncts (or to a set of linked disjuncts) 
with the name of the disjunction and its rank. 
For clarity, we represent, as in the figure 
(2), the consequent of the implication with 
a pair indexing the antecedent. This pair 
indicates the name of the disjunction and 
the rank of the disjunct. In this example, 
noun(2,1) implements noun => masc: the 
pair (2, 1> references the element of the dis- 
junction number 2 at the i st position. 
(2) 
mobile = 
\[ o,,} 1 
As shown in this example, CDs can repre- 
sent covariant disjunction (e.g. the disjunc- 
tion number 1) or simple disjunctions (dis- 
junction number 2). 
L w = {z v, v, f v, z v, 
The example (3) 1 present, s the case of an 
ambiguity that cannot be totally controlled 
by a ND. Tlfis structure indicates a set of 
variancies. But the ccvariancy representa- 
tion only implements a part of the relations. 
In fact, several "complex" implications (i.e. 
with a conjunction as antecedent) control 
these formulae a~s follows : 
{aAc=> f, bAd:-~ e, cAe :=> b, dA f :::> a} 
These implications (the "controlling for- 
mulae") are constraints on the positions of 
the disjuncts in the CD. The formula in the 
example (4) presents a solution using CDs 
and totally implementing all the relations. In 
this representation, (i = 1) n (j = 1) ~ (k = 2) 
implements the implication a n c ~ \]. The 
set of constraints is indicated into brackets. 
The feature structure, constrained by this 
set, simply contains the elementary varia- 
tions. 
l (i = l) A (j = l) =t" (k = 2) ! r{a Vi bl}\] 
(4) (i=2) A(j=2)~(k l) -,1{ cvjd} (j=l)^(k=l)~(i 2) 
(j=2)ACk=2)~(i 1)J L{evkf 
From an implementation point of view, the 
controlled disjunctions can easily be imple- 
mented with languages using delaying de- 
vices. An implementation using functions in 
Life has been described in (Blache97). 
1This problem was given by John Griffith. 
126 
mobile = \[\] 
"PHON O~ 
s,,,SEM i ... i HEAO {,O,,,, V, ~} 
I|S NSEM I ... I HEAO  {odjV, , OUn) 
///DT~ ~.AD.~TR I~/ ---L \[s~s~M ... H~AD 
Figure 2: UQT in a HPSG form 
fe nTt e --~ 
"PHON CX 
s¥~sE~ I... I HEAD {.ou. Vl • v, ~erb} 
DTRS Vl I,,/s'"SE"' I ' ' "EA° V' V' "e"b}/ ~ COMP..DTR V2 
~SUBJ.DTR VIj\[ \[PHON feryne 
HEAD_DTR DTRS HEAD_DTR SYNSEM 0... \] HEAD 
Figure 3: UQT of the lexical entry ,ferme 
4 Generalization to the 
Phrase-Structure Level 
4.1 Unary Quasi-Trees 
(Vijay-Shauker92) proposes the use of trees 
description called quasi-trees whithin the 
framework of TAG. Such structures rely on 
the generalization of hierarchical relations 
between constituents. These trees bear some 
particular nodes, called quasi-nodes, which 
are constituted by a pair of categories of the 
same type. These categories can refer or not 
to the same objet. If not, a subtree will be 
inserted between them in the final structure. 
Such an approach is particularly interest- 
ing for the description of generalizations. 
The basic principle in TAG consists in 
preparing subtrees which are part of the final 
syntactic structure. These subtrees can be of 
a level greater than one: in this case, the tree 
predicts the hierarchical relations between a 
category and its ancestors. Quasi-trees gen- 
eralize this approach using a meta-level rep- 
resentation allowing the description of the 
general shape of the final syntactic tree. 
The idea of the unary quasi-trees relies ba- 
sically on the same generalization and we 
propose to indicate at the lexical level some 
generalities about the syntactic relations. At 
the difference with the quasi-trees, the only 
kind of information represented here con- 
cerns hierarchy. No other information like 
subcategorization is present there. This ex- 
plain the fact that we use unary trees. 
Several properties characterizes unary 
quasi-trees (noted hereafter UQTs): 
• An UQT is interpreted from the leaf (the 
lexical level) to the root (the proposi- 
tional one). 
• A relation between two nodes ~ and/~ 
(a dominating j3) indicates, in a simple 
PSG representation, that there exists a 
derivation of the form a 3" B such that 
~eB. 
• Each node has only one daughter. 
• An unary quasi-tree is a description of 
tree and each node can be substituted 
by a subtree 2. 
2But at the difference with the quasi-trees, a node 
is not represented by a pair and no distinction is 
done between quasi-root and quasi-foot (see (Vijay- 
Shanker92)). 
127 
"PHON Ot 
SYNSEM 
DTRS 
• "" I HEAD BOBB 
v, 1,S,'HSE  I ... I HEAD  odj V, 
,l sUEJ-DTR I \] IDTRS ~EAD_DTR I 
--L LS,,..~s~ ... HEAO 
'\[ooMP_o,~ IIS,,NS~,I...I.~AOC~,~O,,,-,V,,~dj} 
t~°~-°T~ V'}v' l/ \[ I',HOH I,~'".,~ 
ADJ_DTR \]/DTRS /HEAD..DTR / L L Ls'~'s"M I "'" I "~A'~ 
Figure 4: UQT with an embedded ambiguity 
• The nodes can be constituted by a set of 
objects 3. If more than one object com- 
pose a node, this set in interpreted as a 
disjunction. Such nodes are called am- 
biguous nodes. A categorial ambiguity 
is then represented by an unary quasi- 
tree in which each node is a set of ob- 
jects. 
• Each node is a disjunctive formula be- 
longing to a covariant disjunction. 
• An UQT is limited to three levels: lexi- 
cal, phrase-structure and propositional. 
(5) 
The example (5) shows the UQT corre- 
sponding to the word mobile with an ambi- 
guity adjective/noun. For clarity's sake, the 
tree is presented upside-down, with the leaf 
at the top and the root at the bottom. This 
example indicates that: 
• an adjective is a daughter of an AP 
which is to its turn a daughter of a NP, 
• a noun is a daughter of a NP which is 
to its turn a daughter of an unspecified 
phrase XP. 
3These objects, as for the quasi-trees, can be con- 
stituted by atomic symbols or feature structures, ac- 
cording to the linguistic formalism. 
As indicated before, each node represents 
a disjunctive formula and the set of nodes 
constitutes a covariant disjunction. This in- 
formation being systematic, it becomes im- 
plicit in the representation of the UQTs (i.e. 
no names are indicated). So, the position of 
a value into a node is relevant and indicates 
the related values into the tree. 
This kind of representation can be system- 
atized to the major categories and we Can 
propose a set of elementary hierarchies, as 
shown in the figure (6) used to construct the 
UQTs. 
(6) 
It is interesting to note that the notion of 
UQT can have a representation into different 
formalisms, even not based on a tree repre- 
sentation. The figure (2) shows for example 
an HPSG implementation of the UQT de- 
scribed in the figure (1). 
In this example, we can see that the ambi- 
guity is not systematically propagated to all 
the levels: at the second level (sub'structure 
~\]), both values belong to a same feature 
(HEAD-DAUGHTER). The covariation here 
concerns different features at different levels. 
There is for example a covariation between 
the HEAD features of the second level and the 
128 
type of the daughter at the third level. More- 
over, we can see that the noun can be pro- 
jected into a NP, but this NP can be either a 
complement or a subject daughter. This am- 
biguity is represented by an embedded vari- 
ation (in this case a simple disjunction). 
The example described in the figure (3) 
shows a french lexical item that can be cat- 
egorized as an adjective, a noun or a verb 
(resp. translated as ferm, farm or to close). 
In comparison with the previous example, 
adding the verb subcase simply consists in 
adding the corresponding basic tree to the 
structure. In this case, the covariant part of 
the structure has three subcases. 
This kind of representation can be con- 
sidered as a description in the sense that it 
works as a constraint on the corresponding 
syntactic structure. 
4.2 Using UQTs 
The UQTs represent the ambiguities at the 
phrase-structure level. Such a representation 
has several interests. We focus in this section 
more particularly on the factorization and 
the representation of different kind of con- 
straints in order to control the parsing pro- 
cess. 
The example of the figure (4) presents an 
ambiguity which "disappears" at the third 
level of the UQT. This (uncomplete) NP con-" 
tains two elements with a classical ambigu- 
ity adj/noun. In this case, both combinations 
are possible, but the root type is always nom- 
inal. This is an example of ambiguous struc- 
ture that doesn't need to be disambiguated 
(at least at the syntactic level): the parser 
can use directly this structure 4. 
As seen before, the controlled disjunctions 
can represent very precisely different kind of 
relations within a structure. Applying this 
technique to the UQTs allows the represen- 
tation of dynamic relations relying on the 
context. Such constraints use the selection 
relations existing between two categories. In 
case of ambiguity, they can be applied to an 
4We can also notice that covariation implements 
the relation between the categories in order to inhibit 
the noun~noun or adj/adj possibilities (cf. the CD 
number 1). 
ambiguous group in order to eliminate incon- 
sistencies and control the parsing process. In 
this case, the goal is not to disambiguate the 
structure, but (i) to delay the evaluation and 
maintain the ambiguity and (ii) in order to 
reduce the set of solutions. The figure (5) 
shows an example of the application of this 
technique. 
The selection constraints are applied be- 
tween some values of the UQTs. These re- 
lations are r@presented by arcs between the 
nodes at the lexical level. They indicate the 
possibility of cooccurrence of two juxtaposed 
categories. The constraints represented by 
arrows indicate subcategorization. If such 
constraint is applied to an ambiguous area, 
then it can be propagated using the selec- 
tion constraints whithin this area. In this 
example, there is a selection relation between 
the root S of the UQT describing "poss~de" 
and the node value NP at the second level 
of the UQT describing "ferme". This in- 
formation is propagated to the rest of the 
UQT and then to the previous element us- 
ing the relation existing between the values 
N of "ferme" and Adj of "belle". All these 
constraints are represented using controlled 
disjunctions: each controller value bears the 
references of the controlled one as described 
in the section (3). 
The interest of this kind of constraints is 
that they constitute a local network which 
defines in some way a controlled ambiguous 
area. The parsing process itself can generate 
new selection constraints to be applied to an 
entire area (for example the selection of a NP 
by a verb). In this case, this constraint can 
be propagated through the network and elim- 
inate inconsistent solutions (and eventually 
totally disambiguate the structure). This 
pre-parsing strategy relies on a kind of head- 
corner method. But the main goal here, as 
for the lexical level, is to provide constraints 
controlling the disambiguation of the struc- 
tures, not a complete parsing strategy. 
5 Conclusion 
Controlled Disjunctions allow a precise rep- 
resentation of the relations occuring between 
feature values. Such relations can be defined 
129 
La $crram de la porte qu¢ 
The lock of the door thai 
I 
XP XP 
la b¢11¢ ferr~ p{~sexie f~-m¢ real 
the beautiful farm possesses closes badly 
I I I I I I 
Pro Aclj,~lAd j V Aa i A~ 
Det-- N~'~N N 
I~v ' V 
Figure 5: Constraint networks on ambiguous areas 
statically, in the lexicon. They can also be in- 
troduced dynamically during the parse using 
the Unary Quasi-Tree representation which 
allows the description of relations between 
categories together with their propagation. 
These relations can be seen as constraints 
used to control the parsing process in case 
of ambiguity. 
An efficient treatment of the ambiguity re- 
lies on the possibility of delaying the eval- 
uation of ambiguous structures (i.e. delay- 
ing the expansion into a disjunctive normal 
form). But such a treatment is efficient if we 
can (1) extract as much information as pos- 
sible from the context and (2) continue the 
parse using ambigous structures. The use of 
CDs and UQTs constitutes an efficient solu- 
tion to this problem. 

References 
Philippe Blache. 1997. "Disambiguating 
with Controlled Disjunctions." In Pro- 
ceedings of the International Workshop on 
Parsing Technologies. 
Jochen DSrre & Andreas Eisele. 1990. "Fea- 
ture Logic with Disjunctive Unification" 
in proceedings of COLING'90. 
Dale Gerdemann. 1995. "Term Encoding of 
Typed Feature Structures." In Proceedings 
of the Fourth International Workshop on 
Parsing Technologies, pp. 89-98. 
John Griffith. 1996. "Modularizing Con- 
texted Constraints." In Proceedings of 
COLING '96. 
Lauri Karttunen. 1984. "Features and Val- 
ues" in proceedings of COLING'8~. 
Robert Kasper & William Rounds 1990. 
"The Logic of Unification in Grammar" in 
Linguistics and Philosophy, 13:1. 
Hans-Ulrich Krieger & John Nerbon_ne. 
1993. "Feature-Based Inheritance Net- 
works for Computational Lexicons." In T. 
Briscoe, V. de Paiva and A. Copestake, ed- 
itors, Inheritance, Defaults and the Lex- 
icon. Cambridge University Press, Cam- 
bridge, USA. 
John T. Maxwell III& Ronald M. Kaplan. 
1991. "A Method for Disjunctive Con- 
straints Satisfaction." In M. Tomita, ed- 
itor, Current Issues in Parsing Technol- 
ogy. Kluwer Academic Publishers, Norwell, 
USA. 
Tsuneko Nakazawa, Laura Neher & Erhard 
Hinrichs. 1988. "Unification with Disjunc- 
tive and Negative Values for GPSG Gram- 
mars" in proceedings of ECAI'88. 
Gertjan van Noord & Gosse Bouma. 1994 
"Adjuncts and the Processing of Lexical 
Rules" in proceedings of COLING'9$. 
K. Vijay-Shanker. 1992 "Using Descriptions 
of Trees in a Tree Adjoining Grammar" in 
Computational Linguistics, 18:4. 
Jiirgen Wedekind & Ronald Kaplan. 1997 
"Ambiguity-Preserving Generation with 
LFG-and PATR-style Grammars" in 
Computational Linguistics, 22:4. 
