Grammars for Local and Long Dependencies.
Alexander Dikovsky
Université de Nantes, IRIN, 2, rue de la Houssinière
BP 92208 F 44322 Nantes cedex 3 France
Alexandre.Dikovsky@irin.univ-nantes.fr
Abstract
Polarized dependency (PD-) grammars
are proposed as a means of efficient
treatment of discontinuous construc-
tions. PD-grammars describe two kinds
of dependencies : local, explicitly de-
rived by the rules, and long, implicitly
specified by negative and positive va-
lencies of words. If in a PD-grammar
the number of non-saturated valencies
in derived structures is bounded by a
constant, then it is weakly equivalent
to a cf-grammar and has an $O(n^3)$-time parsing algorithm. It happens that
such bounded PD-grammars are strong
enough to express such phenomena as
unbounded raising, extraction and ex-
traposition.
1 Introduction
Syntactic theories based on the concept of dependency have a long tradition. Tesnière (Tesnière, 1959) was the first to describe sentence structure systematically in terms of binary relations between words (dependencies), which form a dependency tree (D-tree for short). A D-tree itself does not presume a linear order on words. However, any surface realization of it projects some linear order relation (also called precedence). Some properties of surface syntactic structure can be expressed only in terms of both dependency (or its transitive closure, called dominance) and precedence. One such property, projectivity, requires that any word occurring between a word $g$ and a word $d$ dependent on $g$ be dominated by $g$. In the first dependency grammars (Gaifman, 1961)
and in some more recent proposals: link gram-
mars (Sleator and Temperly, 1993), projective
dependency grammars (Lombardo and Lesmo,
1996) the projectivity is implied by definition. In
some other theories, e.g. in word grammar (Hudson, 1984), it is used as one of the axioms defining acceptable surface structures. In the presence of this property, D-trees are in a sense equivalent to phrase structures with head selection (see footnote 1).
It is for this reason that D-trees determined by
grammars of Robinson (Robinson, 1970), cate-
gorial grammars (Bar-Hillel et al., 1960), classi-
cal Lambek calculus (Lambek, 1958), and some
other formalisms are projective. Projectivity affects the complexity of parsing: as a rule, it allows dynamic programming techniques which lead to polynomial time algorithms (cf. the $O(n^3)$-time algorithm for link grammars in (Sleator and Temperly, 1993)). Meanwhile, projectivity is not the norm in natural languages. For example, in most European languages there are such regular non-projective constructions as WH- or relative clause extraction, topicalization, comparative constructions, and some constructions specific to a language, e.g. French pronominal clitics or left dislocation. In terms of phrase structure, non-projectivity corresponds to discontinuity. In this form it has been at the center of discussions since the 70s. There are various dependency-based approaches to this problem. In the framework of Meaning-Text Theory (Mel'čuk and Pertsov, 1987), dependencies between (sometimes non-adjacent) words are determined in terms of their local neighborhood, which leads to non-tractable parsing (the NP-hardness argument of (Neuhaus and Bröker, 1997) applies to them). More recent versions of dependency grammars (see e.g. (Kahane et al., 1998; Lombardo and Lesmo, 1998; Bröker, 1998)) impose on non-projective D-trees some constraints weaker than projectivity (cf. meta-projectivity (Nasr, 1995) or pseudo-projectivity (Kahane et al., 1998)), sufficient for the existence of a polynomial time parsing algorithm. Still another approach is developed in the context of intuitionistic resource-dependent logics, where D-trees are constructed from derivations (cf. e.g. a method in (Lecomte, 1992) for Lambek calculus). In this context, non-projective D-trees are determined with the use of hypothetical reasoning and of structural rules such as commutativity and associativity (see e.g. (Moortgat, 1990)).
1 See (Dikovsky and Modina, 2000) for more details.
In this paper, we put forward a novel ap-
proach to handling discontinuity in terms of de-
pendency structures. We propose a notion of a
polarized dependency (PD-) grammar combining
several ideas from cf-tree grammars, dependency
grammars and resource-dependent logics. Like most dependency grammars, PD-grammars are analyzing grammars. They reduce continuous groups to their types using local (context-free) reduction rules and simultaneously assign partial dependency structures to reduced groups. The valencies (positive for governors and negative for subordinates) are used to specify discontinuous (long) dependencies lacking in partial dependency structures. The mechanism of establishing long dependencies is orthogonal to reduction and is implemented by a universal and simple rule of valency saturation. A simplified version of PD-grammars adapted for theoretical analysis is introduced and explored in (Dikovsky, 2001). In this paper, we describe a notion of PD-grammar better adapted to practical tasks.
2 Dependency structures
We fix finite alphabets $W$ of terminals (words), $C$ of nonterminals (syntactic types or classes), and $N$ of dependency names.
Definition 1. Let $x \in (W \cup C)^+$ be a string. A set $\delta = \{d_1, \ldots, d_n\}$ of trees (called components of $\delta$) which cover exactly $x$, have no nodes in common, and whose arcs are labeled by names in $N$ is a dependency (D-) structure on $x$ if one component $d_t$ of $\delta$ is selected as its head (see footnote 2). We use the notation $x = w(\delta)$. $\delta$ is a terminal D-structure if $x$ is a string of terminals. When $\delta$ has only one component, it is a dependency (D-) tree on $x$.
For example, the D-structure in Fig. 1 has two components. $V(aux)$ is the root of the non-projective head component, the other component $GrPn(quan)$ is a unit tree.
[Diagram of a two-component D-structure; not recoverable from the source.]
Fig. 1.
In distinction to (Dikovsky, 2001), the nonterminals (and even dependency names) can be structured. We follow (Mel'čuk and Pertsov, 1987) and distinguish syntactical $B(s)$ and morphological $B\langle m \rangle$ features of a nonterminal $B$. The alphabets being finite, feature unification is a means of compacting a grammar.
The D-structures we will use will be polarized in the sense that some words will have valencies specifying long dependencies which must enter or go from them. A valency is an expression of one of the forms $+L{:}r$, $+R{:}r$ (a positive valency), or $-L{:}r$, $-R{:}r$ (a negative valency), $r$ being a dependency name. For example, the intuitive sense of a positive valency $+R{:}r$ of a node $n$ is that a long dependency $r$ might go from $n$ somewhere on the right. All nonterminals will be signed: we presume that $C$ is decomposed into two classes: of positive ($C^{(+)}$) and negative ($C^{(-)}$) nonterminals respectively. D-structures with valencies, DV-structures, are defined so that valency saturation would imply connectivity.
Definition 2. A terminal $n$ is polarized if a finite list of pairwise different valencies $V(n)$ (its valency list; see footnote 3) is assigned to it. $n$ is positive if $V(n)$ does not contain negative valencies. A D-tree with polarized nodes is positive if its root is positive, otherwise it is negative.
A D-structure $\delta$ on a string $x$ of polarized symbols is a DV-structure on $x$ if the following conditions are satisfied:
(v1) if a terminal node $n$ of $\delta$ is negative, then $V(n)$ contains exactly one negative valency,
(v2) if a dependency of $\delta$ enters a node $n$, then $n$ is positive,
(v3) the non-head components of $\delta$ (if any) are all negative.
The polarity of a DV-structure is that of its head.
2 We visualize $d_t$ by underlining it or its root when there are some other components.
3 In the original definition of (Dikovsky, 2001), valencies may repeat in $V(n)$, but this seems to be a natural constraint.
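Conditions (v1)-(v3) can be checked mechanically. The following Python sketch is only an illustration under assumptions that are not part of the paper: the `Node` class, the string encoding of valencies (`'+L:r'`, `'-R:r'`) and the function name `is_dv_structure` are hypothetical, a terminal is taken to be negative as soon as its list contains a negative valency, and the check that the components really are disjoint trees covering the string is left out.

```python
from dataclasses import dataclass


@dataclass
class Node:
    label: str
    terminal: bool
    valencies: tuple = ()    # terminals only, e.g. ('-R:prepos-obj',)
    negative: bool = False   # sign of a nonterminal; ignored for terminals

    def is_negative(self):
        # Assumption: a terminal is negative when its list has a negative valency.
        if self.terminal:
            return any(v.startswith('-') for v in self.valencies)
        return self.negative


def is_dv_structure(nodes, arcs, roots, head):
    """nodes: list of Node in surface order; arcs: (governor, dependent, name)
    index triples; roots: root index of each component; head: position of the
    head component in `roots`."""
    for n in nodes:                                            # (v1)
        if n.terminal and n.is_negative():
            if sum(v.startswith('-') for v in n.valencies) != 1:
                return False
    if any(nodes[dep].is_negative() for _, dep, _ in arcs):    # (v2)
        return False
    return all(nodes[r].is_negative()                          # (v3)
               for i, r in enumerate(roots) if i != head)


# Two unit components: a negative non-head and a positive head.
wh = Node('GrWh(upon)', terminal=False, negative=True)
cl = Node('Cl/obj-upon', terminal=False)
ok = is_dv_structure([wh, cl], arcs=[], roots=[0, 1], head=1)
# Selecting the negative component as head leaves a positive non-head: (v3) fails.
bad = is_dv_structure([wh, cl], arcs=[], roots=[0, 1], head=0)
```

In this toy case `ok` holds and `bad` does not, because the only difference is which of the two components is selected as the head.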
[Diagrams not recoverable from the source: the DV-structures $\delta_{11}$, $\delta_{12}$, $\delta_{13}$, $\delta_{14}$, $\delta_{15}$, $\delta_{21}$ and $\delta_{22}$.]
Fig. 2.
In Fig. 2 (see footnote 4), both words in $\delta_{12}$ have no valencies, all nonterminals in $\delta_{11}$ and $\delta_{13}$ are positive (we label only negative nonterminals), $\delta_{15}$ is positive because its head component is a positive unit D-tree, $\delta_{21}$ and $\delta_{22}$ are negative because their roots are negative.
Valencies are saturated by long dependencies.
Definition 3. Let $\delta$ be a terminal DV-structure. A triplet $l = \langle n_1, n_2, r \rangle$, where $n_1, n_2$ are nodes of $\delta$ and $r \in N$, is a long dependency with the name $r$, directed from $n_1$ to $n_2$ (notation: $n_1 \overset{r}{\dashrightarrow} n_2$), if there are valencies $v_1 \in V(n_1)$, $v_2 \in V(n_2)$ such that:
(v4) either $n_1 < n_2$ ($n_1$ precedes $n_2$), $v_1 = +R{:}r$, and $v_2 = -L{:}r$, or
(v5) $n_2 < n_1$, $v_1 = +L{:}r$, and $v_2 = -R{:}r$.
We will say that $v_1$ saturates $v_2$ by long dependency $l$.
4 For reasons of space, in our examples we are not accurate with morphological features. E.g., in the place of GrV(gov:upon) we should rather have GrV(gov:upon)<inf>.
The set of valencies in $\delta$ is totally ordered by the order of nodes and the orders in their valency lists: $v_1 < v_2$ if
(o1) either $v_1 \in V_\delta(n_1)$, $v_2 \in V_\delta(n_2)$ and $n_1 < n_2$,
(o2) or $v_1, v_2 \in V_\delta(n)$ and $v_1 < v_2$ in $V_\delta(n)$.
Let $\delta_1$ be the structure resulting from $\delta$ by adding the long dependency $l$ and replacing $V_\delta(n_1)$ by $V_{\delta_1}(n_1) = V_\delta(n_1) \setminus \{v_1\}$ and $V_\delta(n_2)$ by $V_{\delta_1}(n_2) = V_\delta(n_2) \setminus \{v_2\}$. We will say that $\delta_1$ is a saturation of $\delta$ by $l$ and denote it by $\delta \prec \delta_1$. Among all possible saturations of $\delta$ we will select the following particular one:
Let $v_1 \in V_\delta(n_1)$ be the first non-saturated positive valency in $\delta$, and $v_2 \in V_\delta(n_2)$ be the closest corresponding (see footnote 5) non-saturated negative valency in $\delta$. Then the long dependency $l = (n_1 \overset{r}{\dashrightarrow} n_2)$ saturating $v_2$ by $v_1$ is first available (FA) in $\delta$. The resulting saturation of $\delta$ by $l$ is first available or FA-saturation (notation: $\delta \prec^{FA} \delta_1$).
We transform the relations $\prec$, $\prec^{FA}$ into partial orders closing them by transitivity.
[Diagrams not recoverable from the source: the structures $\delta_0$, $\delta_1$ and $\delta_2$ over the string a a b b d c c.]
Fig. 3.
Suppose that in Fig. 3, both occurrences of $a$ in $\delta_0$ and the first occurrence of $a$ in $\delta_1$ have $V(a) = \{-R{:}r\}$, and both occurrences of $b$ in $\delta_0$ and the second occurrence of $b$ in $\delta_1$ have $V(b) = \{+L{:}r\}$. Then $\delta_0 \prec^{FA} \delta_1 \prec^{FA} \delta_2$.
5 Corresponding means:
(c1) $n_2 < n_1$ and $v_2 = -R{:}r$ if $v_1 = +L{:}r$, and
(c2) $n_1 < n_2$ and $v_2 = -L{:}r$ if $v_1 = +R{:}r$.
In (Dikovsky, 2001), we prove that
• If $\delta$ is a terminal DV-structure and $\delta \prec \delta_1$, then either $\delta_1$ has a cycle, or it is a DV-structure (Lemma 1).
As follows from Definition 3, each saturation of a terminal DV-structure $\delta$ has the same set of nodes and a strictly narrower set of valencies. Therefore, any terminal DV-structure has maximal saturations with respect to the order relations $\prec$, $\prec^{FA}$. Very importantly, there is a single maximal FA-saturation of $\delta$, denoted $MS^1(\delta)$. E.g., in Fig. 3, $MS^1(\delta_0) = \delta_2$ is a D-tree.
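The first-available strategy can be sketched in a few lines of Python. This is an illustrative reading of the definition, not the paper's implementation: the `Valency` class and `fa_saturate` are hypothetical names, only the valency lists are modelled (local dependencies and cycle detection are ignored), and the loop stops as soon as the first positive valency has no corresponding negative one, which is one possible reading of "first available".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Valency:
    sign: str    # '+' or '-'
    side: str    # 'L' or 'R'
    name: str    # dependency name


def fa_saturate(valency_lists):
    """valency_lists[i] is the valency list of the i-th node (surface order).
    Returns the long dependencies (governor, subordinate, name) in the order
    they are established, and the valency lists that remain unsaturated."""
    lists = [list(vs) for vs in valency_lists]
    links = []
    while True:
        # First non-saturated positive valency in the total order of valencies.
        first = next(((n, v) for n, vs in enumerate(lists)
                      for v in vs if v.sign == '+'), None)
        if first is None:
            break
        n1, v1 = first
        # +R:r looks rightwards for -L:r; +L:r looks leftwards for -R:r.
        if v1.side == 'R':
            partner = Valency('-', 'L', v1.name)
            candidates = range(n1 + 1, len(lists))
        else:
            partner = Valency('-', 'R', v1.name)
            candidates = range(n1 - 1, -1, -1)
        # Closest node carrying the corresponding negative valency.
        n2 = next((n for n in candidates if partner in lists[n]), None)
        if n2 is None:
            break      # assumption: no FA-saturation exists, so we stop
        lists[n1].remove(v1)
        lists[n2].remove(partner)
        links.append((n1, n2, v1.name))
    return links, lists


neg, pos = Valency('-', 'R', 'r'), Valency('+', 'L', 'r')
# a a b b d c c; d and c are assumed to carry no valencies
links, rest = fa_saturate([[neg], [neg], [pos], [pos], [], [], []])
```

On this input the first b is linked to the second a, then the second b to the first a, and no valency is left over, which matches the two saturation steps described for Fig. 3.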
In order to keep track of those valencies which are not yet saturated we use the following notion of integral valency.
Definition 4. Let $\delta$ be a terminal DV-structure. The integral valency $\Sigma^{FA}\delta$ of $\delta$ is the list $\bigcup_{n \in nodes(\delta)} V_{MS^1(\delta)}(n)$ ordered by the order of valencies in $\delta$. If $MS^1(\delta)$ is a D-tree, we say that this D-tree saturates $\delta$ and call $\delta$ saturable.
By this definition, $\Sigma^{FA} MS^1(\delta) = \Sigma^{FA}\delta$.
Saturability is easily expressed in terms of integral valency (Lemma 2 in (Dikovsky, 2001)):
Let $\delta$ be a terminal DV-structure. Then:
• $MS^1(\delta)$ is a D-tree iff it is cycle-free and $\Sigma^{FA}\delta = \emptyset$,
• $\delta$ has at most one saturating D-tree.
The semantics of PD-grammars will be defined in terms of composition of DV-structures, which generalizes string substitution.
Definition 5. Let $\delta_1 = \{d_1, \ldots, d_k\}$ be a DV-structure, $A$ be a nonterminal node of one of its components, and $\delta_2 = \{d'_1, \ldots, d'_t, \ldots, d'_m\}$ be a DV-structure of the same polarity as $A$ and with the head component $d'_t$. Then the result of the composition of $\delta_2$ into $\delta_1$ in $A$ is the DV-structure $\delta_1[A \backslash \delta_2]$, in which $\delta_2$ is substituted for $A$, the root of $d'_t$ inherits all dependencies of $A$ in $\delta_1$, and the head component is that of $\delta_1$ (changed respectively if touched on by composition) (see footnote 6).
6 This composition generalizes the substitution used in TAGs (Joshi et al., 1975) ($A$ need not be a leaf) and is not like the adjunction.
It is easy to see that the DV-structure $\delta$ in Fig. 4 can be derived by the following series of compositions of the DV-structures in Fig. 2:
$\delta_{12} = \delta_{11}[Nn \backslash dependency,\ GrNn \backslash theory]$,
$\delta_{22} = \delta_{21}[(Prep)^- \backslash upon,\ (Adj,wh) \backslash what,\ GrNn \backslash \delta_{12}]$,
$\delta_{14} = \delta_{13}[GrNn(per) \backslash we,\ GrV(mod) \backslash may,\ GrV(gov{:}upon) \backslash rely]$,
$\delta = \delta_{15}[GrWh(Prep)^- \backslash \delta_{22},\ Cl/obj\text{-}upon \backslash \delta_{14}]$,
and $d = MS^1(\delta)$.
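The composition operation used in these derivations can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formulation: `DStructure` and `compose` are hypothetical names, nodes are addressed by their position in the surface order, valencies and the polarity check are omitted, and the dependency name `'modif'` in the example is invented for the purpose of the demonstration.

```python
from dataclasses import dataclass


@dataclass
class DStructure:
    nodes: list    # labels in surface order
    arcs: set      # (governor index, dependent index, dependency name)
    roots: list    # root index of every component
    head: int      # position of the head component in `roots`


def compose(d1, a, d2):
    """Substitute d2 for the node of d1 at index a."""
    shift = len(d2.nodes) - 1
    new_a = a + d2.roots[d2.head]     # root of d2's head component replaces A

    def m1(i):                        # where a node of d1 ends up
        return i if i < a else new_a if i == a else i + shift

    nodes = d1.nodes[:a] + d2.nodes + d1.nodes[a + 1:]
    # Dependencies of A are inherited by the root of d2's head component.
    arcs = {(m1(g), m1(d), r) for g, d, r in d1.arcs}
    arcs |= {(g + a, d + a, r) for g, d, r in d2.arcs}
    # Components of d1 keep their order; non-head components of d2 are added.
    roots = [m1(r) for r in d1.roots]
    roots += [r + a for i, r in enumerate(d2.roots) if i != d2.head]
    return DStructure(nodes, arcs, roots, d1.head)


def unit(word):
    return DStructure([word], set(), [0], 0)


# A two-node tree in which GrNn governs Nn (dependency name is hypothetical).
d11 = DStructure(['Nn', 'GrNn'], {(1, 0, 'modif')}, [1], 0)
d12 = compose(compose(d11, 0, unit('dependency')), 1, unit('theory'))
```

After the two substitutions `d12.nodes` is `['dependency', 'theory']` and the single arc still runs from the second node to the first, so the local dependency of the rule is carried over to the words.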
[Diagrams not recoverable from the source: the DV-structure $\delta$, where $(+/-)v = (+L/-R)$:prepos-obj, and the D-tree $d = MS^1(\delta)$.]
Fig. 4.
The composition of DV-structures has natural properties:
• The result of a composition into a DV-structure $\delta$ is a DV-structure of the same polarity as $\delta$ (Lemma 3 in (Dikovsky, 2001)).
• If $\Sigma^{FA}\delta_1 = \Sigma^{FA}\delta_2$, then $\Sigma^{FA}\delta_0[A \backslash MS^1(\delta_1)] = \Sigma^{FA}\delta_0[A \backslash MS^1(\delta_2)]$ for any terminal $\delta_1, \delta_2$ (Lemma 4 in (Dikovsky, 2001)).
3 Polarized dependency grammars
Polarized dependency grammars determine DV-structures in a bottom-up manner in the course of reduction of phrases to their types, just as categorial grammars do. Each reduction step is accompanied by a composition of DV-structures and by subsequent FA-saturation. The yield of a successful reduction is a D-tree. In this paper, we describe a superclass of the grammars in (Dikovsky, 2001) which are more realistic from the point of view of real applications and have the same parsing complexity.
Definition 6. A PD-grammar is a system $G = (W, C, N, I, \lambda, R)$, where $W, C, N$ are as described above, $I \subseteq C^{(+)}$ is a set of axioms (which are positive nonterminals), $\lambda \subseteq W \times C \times L$ is a ternary relation of lexical interpretation, $L$ being the set of lists of pairwise different valencies, and $R$ is a set of reduction rules. For simplicity, we start with the strict reduction rules (the only rules in (Dikovsky, 2001)) of the form $\delta \rightarrow A$, where $A \in C$ and $\delta$ is a DV-structure over $C$ of the same polarity as $A$ (below we will extend the strict rules by side effects). In the special case, where the DV-structures in the rules are D-trees, the PD-grammar is local (see footnote 7).
[Diagram not recoverable from the source: a reduction by the rules $r_1$, $r_2$, $r_3$, $r_4$, with the types GrV(gov:upon), GrNn, Cl/obj-upon, (Adj,wh), ClWh, the local dependencies prepos-obj and dir-inf-obj, and the valencies +L:prepos-obj and -R:prepos-obj.]
Fig. 5.
Intuitively, we can think of $\lambda$ as the combined information available after the phase of morphological analysis (i.e. dictionary information and tags). So $(w, A, vl) \in \lambda$ means that a type $A$ and a valency list $vl$ can be a priori assigned to the word $w$.
Semantics. 1. Let $r = (w, A, vl) \in \lambda$ and $\delta_0$ be the unit DV-structure $w$ with $V(w) = vl$. Then $r$ is a reduction of the structure $\delta_0$ to its type $A$ (notation $\delta_0 \vdash_r A$) and $vl$ is the integral valency of this reduction, denoted by $vl = \Sigma_r \delta_0$.
2. Let $r = (\delta \rightarrow A)$ be a reduction rule with $k$ nonterminals' occurrences $A_1, \ldots, A_k$ in $\delta$, $k > 0$, and $\delta_1 \vdash_{\rho_1} A_1, \ldots, \delta_k \vdash_{\rho_k} A_k$ be some reductions. Then $\rho = (\rho_1 \ldots \rho_k ; r)$ is a reduction of the structure $\delta_0 = MS^1(\delta[A_1 \backslash \delta_1, \ldots, A_k \backslash \delta_k])$ to its type $A$ (notation $\delta_0 \vdash_\rho A$). $\rho_1, \ldots, \rho_k$ as well as $\rho$ itself are subreductions of $\rho$. The integral valency of $\delta_0$ via $\rho$ is $\Sigma_\rho \delta_0 = \Sigma^{FA}\delta[A_1 \backslash \delta_1, \ldots, A_k \backslash \delta_k] = \Sigma^{FA}\delta_0$. A D-tree $d$ is determined by $G$ if there is a reduction $d \vdash_\rho S$, $S \in I$. The DT-language determined by $G$ is the set $D(G)$ of all D-trees it determines. $L(G) = \{w(d) \mid d \in D(G)\}$ is the language determined by $G$. $\mathcal{L}(PDG)$ denotes the class of languages determined by PD-grammars.
7 Local PD-grammars are strongly equivalent to dependency tree grammars of (Dikovsky and Modina, 2000), which are generating and not analyzing as here.
By way of illustration, let us consider the PD-grammar $G_0$ with the lexical interpretation $\lambda$ containing triplets:
(we, GrNn(per), []),
(may, GrV(mod), []),
(rely, GrV(gov:upon), [+L:prepos-obj]),
(upon, $(Prep)^-$, [-R:prepos-obj]),
(what, (Adj,wh), []),
(dependency, Nn, []),
(theory, GrNn, []),
and the following reduction rules whose left parts are shown in Fig. 2:
$r_1 = (\delta_{11} \rightarrow GrNn)$,
$r_2 = (\delta_{13} \rightarrow Cl/obj\text{-}upon)$,
$r_3 = (\delta_{21} \rightarrow GrWh(upon)^-)$,
$r_4 = (\delta_{15} \rightarrow ClWh)$.
Then the D-tree $d$ in Fig. 4 is reducible in $G_0$ to ClWh and its reduction is depicted in Fig. 5.
As we show in (Dikovsky, 2001), the weak generative capacity of PD-grammars is stronger than that of cf-grammars. For example, the PD-grammar $G_1$:
[Rule diagrams of $G_1$ not recoverable from the source; its terminals $a$ and $b$ carry the valencies $-R{:}r$ and $+L{:}r$, respectively.]
generates a non-cf language $\{w(n) \mid w(n) = a^n b^n d c^n, n \geq 0\}$. D-tree $\delta_2$ in Fig. 3 is determined by $G_1$ on $w(2)$. Its reduction combined with the diagram of local and long dependencies is presented in Fig. 6.
[Diagram not recoverable from the source: a reduction in $G_1$ combined with the diagram of local and long dependencies.]
Fig. 6.
The local PD-grammars are weakly equivalent to cf-grammars, so they are weaker than general PD-grammars. Meanwhile, what is really important concerning dependency grammars is their strong generative capacity, i.e. the D-trees they derive. From this point of view, grammars like $G_1$ above are too strong. Let us remark that in the reduction in Fig. 6, the first saturation becomes possible only after all positive valencies emerge. This means that the integral valency of subreductions increases with $n$. This seems never to be the case in natural languages, where next valencies arise only after the preceding ones are saturated. This is why we restrict ourselves to the class of PD-grammars which have such a property.
Definition 7. Let $G$ be a PD-grammar. For a reduction $\rho$ of a terminal structure, its defect is defined as $\sigma(\rho) = \max\{|\Sigma_{\rho'}\delta'| \mid \rho' \text{ is a subreduction of } \rho\}$. $G$ has bounded (unbounded) defect if there is some (there is no) constant $q$ which bounds the defect of all its reductions. The minimal constant $q$ having this property (if any) is the defect of $G$ (denoted $\sigma(G)$).
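The defect of a single reduction is just a maximum over its subreductions, as the following Python sketch shows. The `Reduction` class and the sample values are hypothetical and serve only to illustrate the definition: each reduction is assumed to record the integral valency left after FA-saturation at that step.

```python
from dataclasses import dataclass, field


@dataclass
class Reduction:
    integral_valency: list                      # valencies left unsaturated
    subreductions: list = field(default_factory=list)


def defect(rho):
    """Largest integral-valency size over rho and all its subreductions."""
    return max([len(rho.integral_valency)]
               + [defect(sub) for sub in rho.subreductions])


# Two negative valencies accumulate before anything can saturate them.
leaf_1 = Reduction(['-R:r'])
leaf_2 = Reduction(['-R:r'])
inner = Reduction(['-R:r', '-R:r'], [leaf_1, leaf_2])
top = Reduction([], [inner])        # everything saturated at the top
```

Here `defect(top)` is 2 even though the final structure has no unsaturated valency, because the measure looks at every intermediate step of the reduction.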
There is a certain technical problem concerning PD-grammars. Even if in a reduction to an axiom all valencies are saturated, this does not guarantee that a D-tree is derived: the graph may have cycles. In (Dikovsky, 2001) we give a sufficient condition for a PD-grammar never to produce cycles during FA-saturation. We call the grammars satisfying this condition lc- (locally cycle-) free. For reasons of space, we do not cite its definition, all the more so because linguistic PD-grammars should certainly be lc-free. In (Dikovsky, 2001) we prove the following theorem.
Theorem 1. For any lc-free PD-grammar $G$ of bounded defect there is an equivalent cf-grammar.
Together with this we show an example of a
DT-language which cannot be determined by lo-
cal PD-grammars. This means that not all struc-
tures determined in terms of long dependencies
can be determined without them.
4 Side effect rules and parsing
An important consequence of Theorem 1 is that lc-free bounded defect PD-grammars have an $O(n^3)$ parsing algorithm. In fact, it is the classical Earley algorithm in charter form (the charters being DV-structures). To apply this algorithm in practice, we should analyze the asymptotic factor which depends on the size of the grammar. The idea of Theorem 1 is that the integral valency being bounded, it can be compiled into types. This means that a reduction rule $\delta \rightarrow A$ should be substituted by rules $\delta[A_1 \backslash A_1[V_1], \ldots, A_k \backslash A_k[V_k]] \rightarrow A[V_0]$ with types keeping all possible integral valencies not causing cycles. Theoretically, this might blow up $v^{k(q+1)}$ times the size of a grammar with defect $q$, $v$ valencies and the maximal length $k$ of left parts of rules. So theoretically, the constant factor in the $O(n^3)$ time bound is great. In practice, it should not be as awful, because in linguistic grammars $q$ will certainly equal 1, one rule will mostly treat one valency (i.e. $k = 1$) and the majority of rules will be local. Practically, the effect may be that some local rules will have variants propagating upwards a certain valency: $\delta[B \backslash B[v]] \rightarrow A[v]$. The actual problem lies elsewhere. Let us analyze the illustration
grammar $G_0$ and the reduction in Fig. 5. This reduction is successful due to the fact that the negative valency -R:prepos-obj is assigned to the preposition upon and the corresponding positive valency +L:prepos-obj is assigned to the verb rely. What might serve as the formal basis for these assignments? Let us start with rely. This verb has strong government over the prepositions on, upon. In the clause in Fig. 4, the group of the preposition is moved, which is of course a sufficient condition for assigning the positive valency to the verb. But this condition is not available in the dictionary, nor even through morphological analysis (rely may occur at a certain distance from the end of the clause). So it can only be derived in the course of reduction, but strict PD-grammars have no rules assigning valencies.
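The size estimate given earlier in this section can be made concrete with a few numbers. The sketch below assumes that the blow-up factor reads $v^{k(q+1)}$, with $v$ the number of valencies, $k$ the maximal length of left parts of rules and $q$ the defect; the function name `blowup_bound` and the sample values are hypothetical.

```python
def blowup_bound(v, k, q):
    """Worst-case multiplicative growth of the grammar, assuming v**(k*(q+1))."""
    return v ** (k * (q + 1))


# The practical case discussed in the text: defect 1, one valency per rule.
practical = blowup_bound(v=4, k=1, q=1)
# A less favourable hypothetical grammar: three nonterminals per rule, defect 2.
pessimistic = blowup_bound(v=4, k=3, q=2)
```

With four valencies the factor is 16 in the first case and 262144 in the second, which illustrates why a small defect and short left parts matter for the constant in the cubic bound.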
Theoretically, there is no problem: we should just introduce into the dictionary both variants of the verb description: with the local dependency prepos-obj to the right and with the positive valency +L:prepos-obj to the left. Practically, this "solution" is unacceptable because such a lexical ambiguity will lead to a brute force search. The same argument shows that we should not assign the negative valency -R:prepos-obj to upon in the dictionary, but rather "calculate" it in the reduction. If we compare the clause in Fig. 4 with the clauses "what theories we may rely upon"; "what kind of theories we may rely upon"; "the dependency theories of what kind we may rely upon" etc., we see that we can assign a $-R$ valency to wh-words in the dictionary and then raise negative valencies until the FA-saturation. The problem is that in the strict PD-grammars there are no rules of valency raising. For these reasons we extend the reduction rules by side effects sufficient for the calculations of both kinds.
Definition 8. We introduce two kinds of side effects: valency raising $([v_1](i) \uparrow v_2)$ and valency assignment $((i) := v)$, $v, v_1, v_2$ being valency names and $i$ an integer. A rule of the form
$\delta\ ([v_1](i) \uparrow v_2) \rightarrow A$
with $k$ nonterminals $A_1, \ldots, A_k$ in $\delta$ and $1 \leq i \leq k$ is valency raising if:
(r1) $v_1, v_2$ are of the same polarity,
(r2) a local dependency $r$ enters $A_i$ in $\delta$,
(r3) for positive $v_1, v_2$, $\delta \rightarrow A$ is a strict reduction rule,
(r4) if $v_1, v_2$ are negative, then $A_i, A \in C^{(-)}$, and replacing $A_i$ by any positive nonterminal we obtain a DV-structure (see footnote 8).
A rule of the form
$\delta\ ((i) := v) \rightarrow A$
with $k$ nonterminals $A_1, \ldots, A_k$ in $\delta$ and $1 \leq i \leq k$ is valency assigning if:
(a1) for a positive $v$, $\delta \rightarrow A$ is a strict reduction rule,
(a2) if $v$ is negative and $A_i$ is the root of $\delta$, then $A_i \in C^{(+)}$ and $A \in C^{(-)}$,
(a3) if $v$ is negative and $A_i$ is not the root of $\delta$, then $A \in C^{(-)}$, $A_i \in C^{(+)}$ is a non-head component of $\delta$ (see footnote 9) and replacing $A_i$ by any negative nonterminal we obtain a DV-structure.
8 So this occurrence of $A_i$ in $\delta$ contradicts point (v2) of Definition 2.
Semantics. We change the reduction semantics as follows.
• For a raising rule $D\,([v_1](k) \uparrow v_2) \to A$, the result
of the reduction is the DV-structure
$$D_0 = MS^1(\sigma(v_2, D[A_1 \backslash D_1, \ldots, A_k \backslash \pi(v_1, D_k), \ldots, A_n \backslash D_n])),$$
where $\pi(v, D')$ is the DV-structure resulting from $D'$ by
deleting $v$ from $V(root(D'))$, and $\sigma(v, D')$ is the
DV-structure resulting from $D'$ by adding $v$ to $V(root(D'))$.
• For a valency assignment rule $D\,((k) := v) \to A$, the result of
the reduction is the DV-structure
$$D_0 = MS^1(D[A_1 \backslash D_1, \ldots, A_k \backslash \sigma(v, D_k), \ldots, A_n \backslash D_n]).$$
A PD-grammar with side effect rules is a PDSE-grammar.
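The bookkeeping performed by the two side effects can be sketched in a few lines. This is a minimal illustration under strong assumptions, not the paper's construction: a DV-structure is reduced to a type label, the set of valencies still open at its root, and its components; the composed structure is assumed to inherit the root of a designated head component; the final saturation step is omitted. All names (`Structure`, `add_valency`, `delete_valency`, `compose`, `raise_valency`, `assign_valency`, and the labels in the usage lines) are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Structure:
    """Toy stand-in for a DV-structure (assumption: only root valencies matter)."""
    label: str
    root_valencies: FrozenSet[str] = frozenset()
    parts: Tuple["Structure", ...] = ()


def add_valency(v, s):
    """Return a copy of s with valency v added at the root."""
    return replace(s, root_valencies=s.root_valencies | {v})


def delete_valency(v, s):
    """Return a copy of s with valency v deleted from the root."""
    if v not in s.root_valencies:
        raise ValueError(f"{v!r} is not a root valency of {s.label}")
    return replace(s, root_valencies=s.root_valencies - {v})


def compose(label, parts, head):
    """Assumption: the result's root is the root of the head component (1-based)."""
    return Structure(label, parts[head - 1].root_valencies, tuple(parts))


def raise_valency(label, parts, head, k, v1, v2):
    """Raising: delete v1 from the k-th component, add v2 at the new root."""
    updated = list(parts)
    updated[k - 1] = delete_valency(v1, updated[k - 1])
    return add_valency(v2, compose(label, updated, head))


def assign_valency(label, parts, head, k, v):
    """Assignment: add v at the root of the k-th component, then compose."""
    updated = list(parts)
    updated[k - 1] = add_valency(v, updated[k - 1])
    return compose(label, updated, head)


# Usage with hypothetical type labels.
wh_group = Structure("WhGroup", frozenset({"-R:prepos-obj"}))
verb_group = Structure("VerbGroup")
clause = raise_valency("Clause", [wh_group, verb_group], head=2, k=1,
                       v1="-R:prepos-obj", v2="-R:prepos-obj")
prep_group = assign_valency("PrepGroup", [Structure("Prep")], head=1, k=1,
                            v="-R:prepos-obj")
```

In the usage lines the negative valency leaves the wh-group and reappears at the root of the clause, where a later saturation step could pair it with a matching positive valency.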
This definition is correct in the sense that the
result of a reduction with side effects is always a
DV-structure. We can prove
Theorem 2. For any lc-free PDSE-grammar $G$ of bounded defect there
is an equivalent cf-grammar.
Moreover, the bounded defect PDSE-grammars are also parsed in time
$O(n^3)$.
In fact, we can drop negative $v_1$ in raising rules (it is unique)
and indicate the type of $root(D)$ in both side effect rules, because
the composition we use makes this information local. Now, we can
revise the grammar $G_0$ above, e.g. excluding the dictionary
assignment (upon, (Prep)^-, [-R:prepos-obj]), and using in its place
several valency raising rules such as:
[Rule display not recoverable from the source: a valency raising rule
whose side effect concerns the negative valency -R:prepos-obj.]
5 Conclusion
The main ideas underlying our approach to dis-
continuity are the following:
^9 So this occurrence of $A_k$ in $D$ contradicts point (v3) of
Definition 2.
• Continuous (local, even if non-projective) dependencies are treated
in terms of tree composition (which is reminiscent of TAGs). E.g.,
the French pronominal clitics can be treated in this way.
• Discontinuous (long) dependencies are captured in terms of
FA-saturation of valencies in the course of bottom-up reduction of
dependency groups to their types. As compared with the SLASH of GPSG
or the regular expression lifting control in non-projective
dependency grammars, these means turn out to be more efficient under
the conjecture of bounded defect. This conjecture seems to be true
for natural languages (the contrary would mean the possibility of
unlimited extraction from extracted groups).
• The valency raising and assignment rules offer a way of deriving a
proper valency saturation without an unwarranted increase in lexical
ambiguity.
A theoretical analysis and experiments in En-
glish syntax description show that the proposed
grammars may serve for practical tasks and can
be implemented by an efficient parser.
6 Acknowledgments
I would like to express my heartfelt gratitude to
N. Pertsov for fruitful discussions of this paper.
The idea of valency raising emerged from our
joint work on a project of a PD-grammar for a
fragment of English.
References
Y. Bar-Hillel, H. Gaifman, and E. Shamir. 1960. On
categorial and phrase structure grammars. Bull.
Res. Council Israel, 9F:1–16.
N. Bröker. 1998. Separating surface order and
syntactic relations in a dependency grammar. In Proc.
COLING-ACL, pages 174–180, Montreal.
A.Ja. Dikovsky and L.S. Modina. 2000. Dependen-
cies on the other side of the Curtain. Traitement
Automatique des Langues (TAL), 41(1):79–111.
A. Dikovsky. 2001. Polarized non-projective depen-
dency grammars. In Ph. De Groote and G. Morrill,
editors, Logical Aspects of Computational Linguis-
tics, number 2099 in LNAI. Springer Verlag. To be
published.
H. Gaifman. 1961. Dependency systems and phrase
structure systems. Report p-2315, RAND Corp.
Santa Monica (CA). Published in: Information and
Control, 1965, v. 8, no. 3, pp. 304–337.
R.A. Hudson. 1984. Word Grammar. Basil Black-
well, Oxford-New York.
A.K. Joshi, L.S. Levy, and M. Takahashi. 1975. Tree
adjunct grammars. Journ. of Comput. and Syst.
Sci., 10(1):136–163.
S. Kahane, A. Nasr, and O. Rambow. 1998.
Pseudo-projectivity : A polynomially parsable
non-projective dependency grammar. In Proc.
COLING-ACL, pages 646–652, Montreal.
J. Lambek. 1958. The mathematics of sentence struc-
ture. American Mathematical Monthly, pages 154–
170.
A. Lecomte. 1992. Proof nets and dependencies. In
Proc. of COLING-92, pages 394–401, Nantes.
V. Lombardo and L. Lesmo. 1996. An Earley-type
recognizer for dependency grammar. In Proc. 16th
COLING, pages 723–728.
V. Lombardo and L. Lesmo. 1998. Formal aspects
and parsing issues of dependency theory. In Proc.
COLING-ACL, pages 787–793, Montreal.
I. Mel’čuk and N.V. Pertsov. 1987. Surface Syntax of
English. A Formal Model Within the Meaning-Text
Framework. John Benjamins Publishing Company,
Amsterdam/Philadelphia.
M. Moortgat. 1990. La grammaire catégorielle
généralisée : le calcul de Lambek-Gentzen. In Ph.
Miller and Th. Torris, editors, Structure of lan-
guages and its mathematical aspects, pages 127–
182. Hermes, Paris.
A. Nasr. 1995. A formalism and a parser for lexical-
ized dependency grammars. In Proc. Int. Workshop
on Parsing Technology, pages 186–195, Prague.
P. Neuhaus and N. Bröker. 1997. The Complexity
of Recognition of Linguistically Adequate Depen-
dency Grammars. In Proc. of 35th ACL Annual
Meeting and 8th Conf. of the ECACL, pages 337–
343.
Jane J. Robinson. 1970. Dependency structures and
transformational rules. Language, 46(2):259–285.
D. D. Sleator and D. Temperly. 1993. Parsing English
with a Link Grammar. In Proc. IWPT’93, pages
277–291.
L. Tesnière. 1959. Éléments de syntaxe structurale.
Librairie C. Klincksieck, Paris.
