Axiomatization of Restricted Non-Projective Dependency Trees through
Finite-State Constraints that Analyse Crossing Bracketings
Anssi YLI-JYRÄ
Department of General Linguistics, P.O. Box 9, FIN-00014 University of Helsinki
anssi.yli-jyra@helsinki.fi
Abstract
In this paper, a representation for syntactic depen-
dency trees (D-trees) is defined through a finite set
of axioms. The axiomatized representation consti-
tutes a string that can encode non-projective D-trees
of restricted structural complexity. Upper bounds
for the structural complexity of these D-trees are
fixed through the following new parameters: proper
embracement depth, nested crossing depth, and
non-projectivity depth.
In the representation, syntactic dependencies
between words are indicated with pairs of brackets.
When the brackets indicate dependencies that cross
each other, the crossing pairs of brackets are
distinguished by assigning separate colors to each of
them. These colors are allocated in a way (Yli-Jyrä
and Nykänen, 2004) that ensures a unique
representation for each D-tree, and entails that languages
whose nested crossing depth is not bounded cannot
be captured using a fixed number of colors.
Although the axiomatization is finite, it ensures
that the represented dependency structures are trees.
This is possible because the described D-trees have
bounded non-projectivity depth. The axioms are
also regular because proper embracement depth of
represented D-trees is bounded.
Our representation suggests that extra strong gen-
erative power can be squeezed out of finite-state
equivalent grammars. Bracketed D-tree representa-
tions (cf. annotated sentences) are structural de-
scriptions that are assigned to their subsequences
(cf. generated strings or yields of trees) where
brackets and other special-purpose characters have
been omitted.
1 Introduction
Recently, many dependency syntactic parsers using
finite-state machines (FSMs) have been presented
(Kahane et al., 1998; Elworthy, 2000; Nasr et al.,
2002; Oflazer, 2003; Yli-Jyrä, 2004a). This article
shows that a finite-state equivalent grammatical
system is capable of assigning even non-projective
syntactic dependency trees — or their representa-
tions — to terminal strings. An appropriate repre-
sentation is conveniently defined through a set of
axioms presented in this work. The complexity of
the structures assigned is bounded by some special
parameters.
1.1 Motivation
We argue that the possibilities of FSMs have not
been fully exploited in dependency syntax. So
far, almost all the dependency parsers that use
FSMs take them merely as subroutines in a system
whose generative power exceeds that of regular
languages. Although there are some pure finite-state
approaches to surface syntactic parsing (Krauwer
and des Tombe, 1981; Abney, 1996; Koskenniemi,
1997; Yli-Jyrä, 2004a), there seems to be no
pure finite-state approach that is capable of
assigning non-projective dependency structures to the
input strings.
The applicability of finite-state systems to natural
language syntax has been questioned since Chom-
sky (1957), who suggested that center embedding
in natural language is unbounded. In contrast to
this view, a recent corpus-based study (Karlsson, in
print) suggests an opposite generalisation according
to which there is an absolute limit (2 – 3) on center-
embedding of subordinate clauses in matrix clauses,
on top of which there are also category restrictions
on which type of center-embeddings are allowed at
each embedding level. Although such limits may
lack some mathematical elegance, they may entail
another kind of mathematical beauty (e.g. the
closure properties of regular languages) and,
moreover, open up new possibilities in the framework
of parameterized complexity. In natural language
engineering, where the ambiguity generated by a
syntactic parser can be very high, limits on syntactic
complexity may resolve some ambiguity and reduce
the number of non-typical analyses generated by the
parser. Several such limits on the complexity of
D-trees are proposed in this paper.
To facilitate implementation of a non-projective
dependency grammar with FSMs, this paper
introduces a suitable string representation. This
representation is inspired by Colored Non-Projective
Dependency Grammar (Yli-Jyrä and Nykänen, 2004),
where multiple index pushdowns are used to store
symbols for dependency links. The axiomatization
deals with crossing dependencies and enforces
acyclicity of the represented dependency graph. It
involves several extensions that are not present in
earlier encoding schemes where dependencies are
also indicated through matching pairs of symbols
(Oflazer, 2003; Yli-Jyrä, 2004a).
1.2 Non-Projective Dependency Trees
In dependency syntax, the analysis of a sentence is
given as a dependency tree (D-tree) whose nodes
correspond — as assumed in this paper — to the
words of the analyzed sentence. A D-tree consists
of dependency links (directed arcs) drawn above the
sentence — and implicitly, of the sentence itself.
The D-tree shows which words are related to which
words and in what way. Figure 1 gives an example
of a D-tree.
Figure 1: This tree analyses the Latin sentence
“Ultima Cumaei venit iam carminis aetas” (Vergil:
Eclogues IV.4), which means “The last era of the
Cumaean song has now arrived”. The analysis is
adapted from Covington (1990). We added the
sentence-initial wall node (/////) and the arc labels.
The graphical representation for the D-tree is
interpreted as follows. If $x_1 \to x_2$ is a (directed) arc
between two nodes, we say that $x_2$ depends
immediately on $x_1$ (or, conversely, $x_1$ governs $x_2$
immediately), and that $x_2$ is an immediate syntactic
dependent of $x_1$ (and $x_1$ is the immediate syntactic
governor of $x_2$) (Mel'čuk, 1988).
In each D-tree, there is a unique non-governed
node. For this purpose, we have reserved an
external wall node (/////) that is placed on the left of
the sentence, while the words in the sentence are
always governed by exactly one node. A D-tree is
non-projective if some of the arcs cross each other
when the tree is drawn above the sentence.
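The crossing condition above can be sketched in a few lines of Python. This is only an illustration, not part of the axiomatization: the node positions (0 for the wall, 1 to 6 for the words of Figure 1) and the names `arcs_cross` and `is_non_projective` are assumptions made for the example.

```python
from itertools import combinations

def arcs_cross(a, b):
    """True if two arcs, given as pairs of node positions, cross each other."""
    (a_lo, a_hi), (b_lo, b_hi) = sorted(a), sorted(b)
    return a_lo < b_lo < a_hi < b_hi or b_lo < a_lo < b_hi < a_hi

def is_non_projective(arcs):
    """True if at least one pair of arcs crosses."""
    return any(arcs_cross(a, b) for a, b in combinations(arcs, 2))

# Arcs of Figure 1 as (governor, dependent); assumed positions:
# 0 = wall, 1 = Ultima, 2 = Cumaei, 3 = venit, 4 = iam, 5 = carminis, 6 = aetas
figure1_arcs = [(0, 3), (6, 1), (5, 2), (3, 4), (3, 6), (6, 5)]
print(is_non_projective(figure1_arcs))
```

Arcs that merely share an end point, or that are nested inside each other, are not counted as crossing.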
1.3 The New Representation
The string representation (Figure 2) for a D-tree
consists of overlapping views (string subsequences)
that realize different aspects of the D-tree
encoding. For example, substrings delimited
by a pair of boundary symbols (#) are called nodes
in the string representation. The first node
corresponds to the wall (/////) and the other nodes contain
single word tokens of the sentence:
# ///// # Ultima # Cumaei # venit # iam # carminis # aetas #
For any given D-tree there is a unique way to assign
colors to its arcs. When the coloring is done, the
result is a colored D-tree. At each node of the colored
D-tree, there is a unique active color $i$ that is used
for the arcs that connect the node to the right. In
the string representation this is indicated by special
tokens $|i|$ and $i|$ as follows:
# 1| # 2| # |2| # 3| # |3| # |3| # 1| #
The arcs are encoded with separate pairs of brackets
as follows:
# [_1 # # # ]_1 # # # #
# # [_2 # # # # # ]_2 #
# # # [_2 # # # ]_2 # #
# # # # [_3 # # # ]_3 #
# # # # [_3 # ⟩_3 # # #
# # # # # # ⟨_3 # ]_3 #
Each substring $[_i \ldots ]_i$, $[_i \ldots \rangle_i$, and $\langle_i \ldots ]_i$,
where the intervening string $\ldots$ has a balanced
bracketing w.r.t. $[_i$, $]_i$-brackets, corresponds to an
arc between two nodes in the D-tree. When these
subsequences are put on top of each other we obtain
the following combination:
# [_1 # [_2 # [_2 # ]_1 [_3 # ⟩_3 # ]_2 ⟨_3 # ]_3 ]_2 #
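As a rough illustration of how the arcs can be read back from such a combined bracket view, the following Python sketch keeps one stack per color. The ASCII token spelling (`"[1"`, `"]1"`, `"<3"`, `">3"`), the list-of-nodes input format, and the function name are assumptions made for this example, and the input is assumed to be well formed.

```python
def arcs_from_brackets(nodes):
    """Recover undirected arcs (left node, right node) from per-node bracket tokens."""
    stacks = {}   # color -> stack of [opening node, nodes of pending '<' brackets]
    arcs = []
    for pos, tokens in enumerate(nodes):
        for tok in tokens:
            kind, color = tok[0], int(tok[1:])
            stack = stacks.setdefault(color, [])
            if kind == "[":
                stack.append([pos, []])
            elif kind == ">":
                # '[' ... '>' : arc from the innermost open '[' of this color
                arcs.append((stack[-1][0], pos))
            elif kind == "<":
                # '<' ... ']' : wait for the ']' that closes the innermost open '['
                stack[-1][1].append(pos)
            elif kind == "]":
                start, pending = stack.pop()
                arcs.append((start, pos))
                arcs.extend((p, pos) for p in pending)
    return sorted(arcs)

# Bracket tokens of the combined view, one list per node (wall first)
figure2_brackets = [["[1"], ["[2"], ["[2"], ["]1", "[3"], [">3"], ["]2", "<3"], ["]3", "]2"]]
print(arcs_from_brackets(figure2_brackets))
```

The sketch returns arcs without direction; in the representation the direction comes from the labels, not from the brackets.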
For each arc $x_1 \to x_2$ in Figure 1, there is a label
$l$ that tells how $x_2$ depends on $x_1$. These labels are
attached to the brackets as follows:
# [_1 pred # [_2 attr # [_2 attr # pred ]_1 [_3 subj adv # adv ⟩_3 # attr ]_2 ⟨_3 gntv # gntv subj ]_3 attr ]_2 #.
The direction of each arc is indicated by adding an
over-line to the dependent-side labels:
[_2 \overline{attr}   [_2 \overline{attr}   \overline{pred} ]_1   \overline{adv} ⟩_3   ⟨_3 \overline{gntv}   \overline{subj} ]_3.
The non-projectivity depth of an arc measures its
specially defined distance from the wall node (/////).
In each node, the non-projectivity depth of an
outgoing arc is greater than or equal to the depth of the
incoming arc. The depth of arcs is indicated as
follows:
# [_1 0 # [_2 1 # [_2 0 # 0 ]_1 [_3 0 0 # 0 ⟩_3 # 0 ]_2 ⟨_3 0 # 0 0 ]_3 1 ]_2 #.
When we combine all these views, we obtain a
string that represents the D-tree of Figure 1. This
string is shown in Figure 2.
# ///// 1| [_1 pred,0
# Ultima 2| [_2 \overline{attr,1}
# Cumaei |2| [_2 \overline{attr,0}
# \overline{pred,0} ]_1 venit 3| [_3 subj,0 adv,0
# \overline{adv,0} ⟩_3 iam |3|
# attr,0 ]_2 carminis |3| ⟨_3 \overline{gntv,0}
# gntv,0 \overline{subj,0} ]_3 attr,1 ]_2 aetas 1|
#
Figure 2: A string representation for a D-tree.
2 Prerequisites
We are going to define the string representation for
D-trees with axioms that are given as extended reg-
ular expressions. The intersection of the languages
described by these axioms is the set of valid repre-
sentations. In the following, we define the alphabets
and regular expressions used in the axioms.
Assume that $\Gamma$ is the set of category labels for
the arcs, and that $m$, $k$, and $p$ are the parameters
specifying upper bounds respectively for the proper
embracement depth, nested crossing depth, and
non-projectivity depth. These depth measures will be
precisely defined in an appropriate context.
2.1 Alphabets
In Figure 2, several different kinds of symbols are
involved. They belong to the following alphabets:
• the sets $B_L$, $B_R$, $A_L$, $A_R$ consisting respectively
  of brackets $[_i$, $]_i$, $\langle_i$, $\rangle_i$, $1 \le i \le k$,
• the color selectors $S = \{\, i|,\ |i| \;:\; 1 \le i \le k \,\}$,
• the dependent labels $L_D = \{ (l, i) : l \in \Gamma,\ 0 \le i \le p \}$,
• the governor labels $L_G = \{ \overline{l} : l \in L_D \}$,
• the set of word tokens $W$, and
• the set of special symbols $\{ \#, \text{/////} \}$.
The union of these alphabets is denoted as $\Sigma$. The
union of the alphabets $L_D$ and $L_G$ is $L$.
2.2 Regular Expressions
Let $A$ and $B$ be sets of strings, and $n$ a positive
integer. In the axioms we will use the following regular
operations: Kleene's closure ($A^*$), finite iteration
($A^n$), concatenation ($AB$), asymmetric difference
($A - B$), union ($A \cup B$), and intersection ($A \cap B$),
with this precedence order. The semantics of these
operations is defined in the usual way. Parentheses
( ) are used for grouping. The boxed dot $\boxdot$ denotes
the language $(\Sigma - \{\#\})^*$.
A context restriction of a center $X$ in contexts
$C_1, C_2, \ldots, C_n$ is a regular operation where $X$ is a
subset of $\Sigma^*$ and each context $C_i$, $1 \le i \le n$, is of
the form $V_i \,\_\, Y_i$, where $V_i, Y_i \subseteq \Sigma^*$. The
operation is expressed using a notation
$X \Rightarrow V_1 \,\_\, Y_1,\ V_2 \,\_\, Y_2,\ \ldots,\ V_n \,\_\, Y_n$
and it defines the set of all strings $w \in \Sigma^*$ such that,
for every possible $v, y \in \Sigma^*$ and $x \in X$, for which
$w = vxy$, there exists some context $V_i \,\_\, Y_i$,
$1 \le i \le n$, where both $v \in \Sigma^* V_i$ and $y \in Y_i \Sigma^*$.
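The definition of the context restriction operation can be tried out on single strings with a brute-force check. The sketch below is only illustrative: it assumes that the center and the context sides are given as Python regular expressions over single-character symbols, and the function name is hypothetical. It enumerates every factorization of the string into a left part, a center occurrence, and a right part, and requires one of the listed contexts to fit.

```python
import re

def satisfies_restriction(w, center, contexts):
    """Check string w against a context restriction.

    center   -- regex for the center language
    contexts -- list of (left, right) regex pairs, one pair per allowed context
    """
    for i in range(len(w) + 1):
        for j in range(i, len(w) + 1):
            v, x, y = w[:i], w[i:j], w[j:]
            if re.fullmatch(center, x) is None:
                continue  # this factor is not an occurrence of the center
            # v must end with something in `left`, y must start with something in `right`
            if not any(
                re.fullmatch(f".*(?:{left})", v, re.S)
                and re.fullmatch(f"(?:{right}).*", y, re.S)
                for left, right in contexts
            ):
                return False
    return True

print(satisfies_restriction("abc", "b", [("a", "c")]))
print(satisfies_restriction("bc", "b", [("a", "c")]))
```

A real implementation would compile the operation into an automaton instead of testing strings one by one; the check above only mirrors the set-theoretic definition.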
The axioms in this paper produce a set of quite
small automata, and the satisfiability and usability
of this lazy finite-state system for choosing a
representation have been tested with real D-trees.
(After these tests, the axioms presented here have
undergone some editing that hopefully has not
introduced typos or bugs.)
2.3 The Basic Structure
Axiom 1. (a) The string begins and ends with a
node boundary (#). (b) Between each two
boundaries there exists at least one word token or the wall,
$w \in W \cup \{\text{/////}\}$. (c) Two tokens $w, w' \in W \cup \{\text{/////}\}$
are always separated by a node boundary.
Axiom 2. (a) There are no two similar square
brackets in a node. (b) The color indices $i$ of
closing brackets increase monotonically when we
move from a right bracket towards the closest node
boundary (#) on the left.
Axiom 3. (a) All the labels $l \in L$ belong to some
surrounding bracket. (b) Each left (right) bracket
has some label that is attached to it.
Axiom 4. (a) Angle brackets $A_L \cup A_R$ do not have
more than one label attached to them. (b) Within
each node, no angle bracket $\langle_i$ ($\rangle_i$) occurs inside a
square bracket $[_i$ ($]_i$) having the same color $i$.
Axiom 5. (a) The wall (/////) and all the word tokens
$w \in W$ occur before a color selector $s \in S$. (b) All
the color selectors $s \in S$ occur after a word token
or the wall.
Axiom 6. (a) There is one and only one ungoverned
node. (b) All other nodes are governed by some
node. (c) No node depends immediately on more
than one other node. (d) No node depends on itself.
These axioms are presented more formally as the
following regular expressions over the alphabets of
Section 2.1:
$\#\,\Sigma^*\,\#$   (1a)
$\Sigma^* - \Sigma^*\,\#\,(\Sigma - (W \cup \{\text{/////}\}))^*\,\#\,\Sigma^*$   (1b)
$\Sigma^* - \Sigma^* (W \cup \{\text{/////}\}) \boxdot (W \cup \{\text{/////}\}) \Sigma^*$   (1c)
$\Sigma^* - \Sigma^* ([_i \boxdot [_i \;\cup\; ]_i \boxdot ]_i) \Sigma^*$   (2a)
$\{]_i, \rangle_i\} \Rightarrow (\{\#\} \cup B') L^* \,\_$   (2b)
$L \Rightarrow (B_L \cup A_L) L^* \,\_\,,\ \_\, L^* (B_R \cup A_R)$   (3a)
$((B_L \cup A_L) \Rightarrow \_\, L) \cap ((B_R \cup A_R) \Rightarrow L \,\_)$   (3b)
$\Sigma^* - \Sigma^* A_L L L \Sigma^* - \Sigma^* L L A_R \Sigma^*$   (4a)
$\Sigma^* - \Sigma^* \rangle_i \boxdot ]_i \Sigma^* - \Sigma^* [_i \boxdot \langle_i \Sigma^*$   (4b)
$(W \cup \{\text{/////}\}) \Rightarrow \_\, S$   (5a)
$S \Rightarrow (W \cup \{\text{/////}\}) \,\_$   (5b)
$\Sigma^* \text{/////} \Sigma^* - \Sigma^* \text{/////} \Sigma^* \text{/////} \Sigma^*$   (6a)
$W \Rightarrow L_G \boxdot \_\,,\ \_\, \boxdot L_G$   (6b)
$\Sigma^* - \Sigma^* L_G \boxdot L_G \Sigma^*$   (6c)
$\Sigma^* - \Sigma^* B_L \boxdot B_R \Sigma^*$   (6d)
where $1 \le i \le k$, $B' = \{ ]_i, \ldots, ]_k, \rangle_i, \ldots, \rangle_k \}$
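To make the basic structure concrete, the following sketch checks the conditions of Axioms 1a to 1c and 6a on a tokenized string. It is a procedural illustration rather than the regular expressions themselves; the token-list input, the spelling `"/////"` for the wall, and the function name are assumptions of the example.

```python
WALL = "/////"  # assumed spelling of the wall symbol

def check_basic_structure(tokens, words):
    """Check Axioms 1a-1c and 6a on a token list; `words` is the word-token alphabet."""
    heads = set(words) | {WALL}
    # (1a) the string begins and ends with a node boundary
    if len(tokens) < 2 or tokens[0] != "#" or tokens[-1] != "#":
        return False
    count = 0  # word or wall tokens seen in the current node
    for tok in tokens[1:]:
        if tok == "#":
            # (1b) at least one token per node, (1c) never two tokens without a boundary
            if count != 1:
                return False
            count = 0
        elif tok in heads:
            count += 1
    # (6a) exactly one wall, i.e. one ungoverned node
    return tokens.count(WALL) == 1

print(check_basic_structure(["#", WALL, "1|", "[1", "#", "]1", "venit", "1|", "#"], {"venit"}))
```

Brackets, labels, and color selectors are simply skipped here; they are the concern of the remaining axioms.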
3 Proper Embracement Depth
In the context of D-trees, a counterpart notion for
center-embedding of constituent trees is needed. We
say that an arc $x_1 \to y_1$ properly embraces
another arc $x_2 \to y_2$ if and only if
$\min\{x_1, y_1\} < \min\{x_2, y_2\}$ and
$\max\{x_2, y_2\} < \max\{x_1, y_1\}$,
where $<$ is the linear precedence order among the
nodes. The proper embracement depth $m$ of a
colored D-tree is the maximum number $m$ of arcs
$x_1 \to y_1$, $x_2 \to y_2$, $\ldots$, $x_m \to y_m$ where all the arcs
have the same color and each $x_i \to y_i$,
$1 \le i \le m - 1$, properly embraces $x_{i+1} \to y_{i+1}$. This
measure is applied to a D-tree in Figure 3. Note
that proper embracement does not generally imply
that the arcs belong to a common path: in Figure
1 the arc aetas $\to$ Ultima properly embraces the arc
carminis $\to$ Cumaei.
Figure 3: The proper embracement depth of this
clause is 3.
An arc $x_1 \to y_1$ that shares a node with another
arc $x_2 \to y_2$ does not properly embrace the latter.
If they overlap each other, the shorter of these arcs
will be represented with a pair of an angle bracket
and a square bracket as shown in Figure 2, unless the
longer arc is already represented by an angle bracket.
Thus, the maximum number of nested square
brackets needed corresponds to the proper embracement
depth of the colored D-tree.
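The definition of proper embracement depth can be illustrated with a small Python sketch that looks for the longest chain of arcs of one color in which each arc properly embraces the next. The representation of arcs as pairs of node positions and the function names are assumptions made for the example.

```python
from functools import lru_cache

def properly_embraces(a, b):
    """Arc a properly embraces arc b: b lies strictly inside a at both ends."""
    return min(a) < min(b) and max(b) < max(a)

def proper_embracement_depth(arcs):
    """Length of the longest chain of same-colored arcs, each embracing the next."""
    arcs = tuple(arcs)

    @lru_cache(maxsize=None)
    def chain_from(i):
        # Longest chain that starts at arc i; terminates because spans shrink strictly
        return 1 + max(
            (chain_from(j) for j in range(len(arcs)) if properly_embraces(arcs[i], arcs[j])),
            default=0,
        )

    return max((chain_from(i) for i in range(len(arcs))), default=0)

# Arcs of one color in Figure 1 (assumed positions: 1 = Ultima, 2 = Cumaei, 5 = carminis, 6 = aetas)
print(proper_embracement_depth([(1, 6), (2, 5)]))
```

Arcs that share a node never count, which is why such arcs can be written with an angle bracket instead of a further nested square bracket.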
In our representation, the proper embracement
depth of trees is bounded by a fixed parameter $m$.
This allows defining finite-state constraints that
define bracketings up to a bounded number of nested
square brackets. In the following, we will give
axioms that check that brackets for each color are
balanced and do not exceed the proper embracement
depth $m$.
We define first, for each color $i \in \{1, 2, 3, \ldots, k\}$,
a string homomorphism $h_i : \Sigma^* \to \{[_i, ]_i\}^*$ in
such a way that it essentially deletes all the other
symbols except the brackets $[_i$ and $]_i$, and the
regular languages $D_{m,i} \subseteq \{[_i, ]_i\}^*$ and $T_{m,i} \subseteq \Sigma^*$ in
the following way:
$D_{m,i} = \begin{cases} \epsilon & \text{if } m = 0 \\ (D_{m-1,i} \cup [_i\, D_{m-1,i}\, ]_i)^* & \text{if } m > 0 \end{cases}$
$T_{m,i} = h_i^{-1}(D_{m,i})$.
Some auxiliary languages are defined as follows:
$T'_{m,i} = T_{m,i} - T_{m,i} \langle_i T_{m,i}$
$T''_{m,i} = T_{m,i} - T_{m,i} \rangle_i T_{m,i}$.
Axiom 7. The $[_i$, $]_i$-bracketings must be balanced
and the number of nested brackets is bounded.
Axiom 8. (a+b) Left (right) angle brackets match
with a square bracket. (c) The arcs indicated with
angle-square bracket pairs do not cross each other
as in $[_i \ldots \langle_i \ldots \rangle_i \ldots ]_i$.
These axioms are given more formally as follows:
$T_{m,i}$   (7)
$\langle_i \Rightarrow \_\, T_{m,i}\, ]_i$   (8a)
$\rangle_i \Rightarrow [_i\, T_{m,i} \,\_$   (8b)
$\Sigma^* - \Sigma^* \langle_i T_{m,i} \rangle_i \Sigma^*$   (8c)
where $1 \le i \le k$
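The effect of Axiom 7 on a single string can be sketched as a counter that tracks the nesting of the square brackets of one color and ignores every other symbol, much as the homomorphism deletes them. The ASCII token spelling (`"[2"`, `"]2"`, and so on) and the function name are assumptions made for this example.

```python
def balanced_within_depth(symbols, color, max_depth):
    """Check that the '[' and ']' brackets of one color are balanced
    and nest at most max_depth deep; all other symbols are ignored."""
    opening, closing = f"[{color}", f"]{color}"
    depth = 0
    for sym in symbols:
        if sym == opening:
            depth += 1
            if depth > max_depth:
                return False  # nesting exceeds the bound
        elif sym == closing:
            depth -= 1
            if depth < 0:
                return False  # closing bracket without an open one
    return depth == 0

# Bracket symbols of Figure 2 in string order
figure2_symbols = ["[1", "[2", "[2", "]1", "[3", ">3", "]2", "<3", "]3", "]2"]
print(balanced_within_depth(figure2_symbols, 2, 2))
```

Because the depth is bounded by a constant, the counter has finitely many values, which is what keeps the constraint regular.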
4 Nested Crossing Depth
We added colors to brackets because crossing
brackets have to be separated by some means.
Unfortunately, assigning colors to brackets entails
new problems:
1. We can represent non-projective trees that are
not typical for natural language. In particular,
we conjecture that although we bound the
number of colors $k$ available, there is a set of
colored trees (in the limit $m \to \infty$) that gives
structural descriptions for the Bach language
(cf. Joshi 1985), the strings of which consist of
an equal number of a's, b's and c's.
2. In parsing, the colors must be selected in one
way or another, and this normally results in
an ambiguity where there are many colorings
available. Thus, we need a discipline that
tells how to assign colors to the arcs in an
unambiguous way.
The first problem could be addressed e.g. by
combining a constituent-based structure (topological
fields etc.) with the dependency syntax. In our
representation for D-trees, however, we need a
solution that addresses both of these problems. Such
a solution has been developed recently and
presented in many ways: by means of constraints
(Yli-Jyrä, 2003a), through an informal algorithm
(Yli-Jyrä, 2004b), and very formally as a special index
storage type used in Colored Non-projective
Dependency Grammar (Yli-Jyrä and Nykänen, 2004).
We conjecture, however, that there are no essential
differences in the allocation disciplines defined in
these works. In the following, we will adapt the
constraint-based definition (Yli-Jyrä, 2003a) to the
allocation of colors of brackets:
Axiom 9 (Plane locking). If a bracket $[_j$ is still
open at a string position, the position cannot contain
a bracket $[_i$ for which $i < j$.
An effect of this axiom can be seen in Figure 2,
where a new color 3 is selected after venit although
the color 1 is no longer in use. The reason is that
there is a bracket $[_2$ that is still open at that position.
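Plane locking can be illustrated with a simple left-to-right scan that remembers which colors currently have an open square bracket. The sketch below follows the prose statement of Axiom 9 only; the ASCII token spelling and the function name are assumptions, and angle brackets are deliberately left unconstrained here.

```python
def respects_plane_locking(symbols):
    """Axiom 9 as a scan: no '[' of a lower color while a '[' of a higher color is open."""
    open_count = {}  # color -> number of currently open '[' brackets
    for sym in symbols:
        kind, color = sym[0], int(sym[1:])
        if kind == "[":
            if any(c > color and n > 0 for c, n in open_count.items()):
                return False  # a higher color is still open: the lower plane is locked
            open_count[color] = open_count.get(color, 0) + 1
        elif kind == "]":
            open_count[color] = open_count.get(color, 0) - 1
    return True

# Bracket symbols of Figure 2 in string order
figure2_symbols = ["[1", "[2", "[2", "]1", "[3", ">3", "]2", "<3", "]3", "]2"]
print(respects_plane_locking(figure2_symbols))
```

In the Figure 2 sequence the bracket opened after venit gets color 3, because a bracket of color 2 is still open and color 1 is therefore locked.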
Axiom 10 (Left conjoin). All the opening brackets
belonging to the same node have the same color.
This axiom corresponds intuitively to the fact
that there is no need to give different colors to arcs
that do not cross each other.
Axiom 11 (Continuous tiling). A position cannot
contain a colored bracket $[_j$, where $1 < j \le k$, if, on
the left, there are no other brackets $[_j$ (of the same
color) that remain open at the position, except if
there is, on the left, another bracket $[_i$ with $i = j - 1$
(of the preceding color) that remains open at this
position but will be matched with a bracket $\rangle_i$
or $]_i$ that occurs before the bracket $[_j$ of the current
position is closed with $]_j$.
This axiom corresponds to the fact that when a
new color is introduced at some position (Figure
2), this is done due to a danger of having crossing
brackets with the same color.
The actual effect of these three axioms is that for
each D-tree there remains a unique way to assign
colors, square brackets, and angle brackets to the
arcs. The nested crossing depth of a D-tree is
the number of colors in a colored D-tree that
conforms to Axioms 9–11. In our representation, the
nested crossing depth (i.e. the number of colors) is
bounded.
Bounded nested crossing depth has considerable
linguistic relevance. The length of the longest chain
of crossing edges is a lower bound
for the nested crossing depth, but such chains are
typically very short in natural language sentences.
The possible upper bound for the nested crossing
depth has been studied experimentally (Yli-Jyrä,
2003a; Yli-Jyrä, 2004b) with the result that in
non-projective D-trees of some 700 Danish sentences,
the number of required colors is pretty low (2–3).
A few interesting exceptions¹ actually contained a
chain of up to five crossing dependencies. Such
complex examples seem to be successful
combinations of non-local dependencies, and it may be very
difficult to generalize what is possible and what is
not. Nevertheless, we conjecture that D-trees of the
Bach (or MIX) language (cf. Joshi 1985) are not
captured in our system when the number of colors is
fixed, because Colored Non-Projective Dependency
Grammar (Yli-Jyrä and Nykänen, 2004) is a linear
context-free rewriting system.
In order to facilitate formalization of Axiom 11,
we make use of color selectors and the following axiom:
Axiom 12. (a) Color selector $|j|$, where $1 < j \le k$,
indicates that there is a left bracket $[_j$ that has not
yet been closed. (b) Color selector $j|$ indicates that
no bracket $[_j$ is open at that position.
If we also assume a bound for the proper
embracement depth, we can present the above axioms
more formally as follows:
$[_j \Rightarrow \_\, (T_{m,j} - \Sigma^* \{ [_i, \langle_i : 1 \le i < j \} \Sigma^*)\, ]_j$   (9)
$[_h \Rightarrow \_\, (L \cup \{ [_h, \langle_h \})^*\, \#$   (10)
$j| \Rightarrow \_\, (T_{m,j-1} \{ ]_{j-1}, \rangle_{j-1} \} \Sigma^* \cap [_j\, T_{m,j}\, ]_j)$   (11)
$|h| \Rightarrow [_h\, T_{m,h} \,\_$   (12a)
$\Sigma^* - (\Sigma^* - T_{m,h})\; h|\; \Sigma^*$   (12b)
where $1 \le i < j \le k$, $1 \le h \le k$, and $T_{m,j}$ is the
bounded-depth bracket language of Section 3
5 Subcategorization
We have seen in Section 3 that angle brackets are
used when several overlapping arcs share a common
node. This corresponds to the use of reduced
bracketing for initial and final embedding in some systems
(Krauwer and des Tombe, 1981; Yli-Jyrä, 2003b),
and it facilitates linguistically appropriate
bracketing with FSMs.
Our axiomatization (Section 6 in particular)
requires that information about the labels and
directions of the arcs of a node is locally present both
in the dependent and the governor nodes.
Unfortunately, this kind of duplication of the labeling
information cannot be captured with regular axioms
unless there is a limit on the amount of information
that is duplicated. A solution would be to assign
each square bracket an unsaturated
subcategorization frame with a symbol that indicates a state in a
special subcategorization automaton. The automata
could be simulated by propagating, by means of
declarative constraints, the state information of
each square bracket to the first angle bracket, and
then further from one angle bracket to another. We
have chosen, however, a more restricted approach
for brevity, although we do not argue that it is the
most elegant and general solution. This approach is
presented in the sequel.
¹ E.g. “Det har_1 både_2 noget_{1,3} med_4 stolene_5 og_2 bordet
at_3 gøre_4 – og_5 pladsen udenom.”
Figure 4: Angle brackets are used for non-proper
embraced bracketing. The additional labels of the
square bracket correspond to the labels attached to
angle brackets.
We assume that the number of left or right arcs
per color is bounded by an integer $d$. Thus, at most
$d$ labels can be associated with one square bracket.
The label that is nearest to the opening (closing)
square bracket corresponds to the label that is
nearest to the corresponding closing (opening) square
bracket. Each additional label of the square bracket
corresponds to a label of an angle bracket (Figure
4).
We will now give axioms that check that the
labels of square brackets correspond to the labels of
the matching square and angle brackets:
Axiom 13. (a)+(b) The number of left (right) angle
brackets matching each right (left) square bracket is
determined by the number of labels associated with
the right (left) square bracket.
Axiom 14. (a) Every label of the right square
brackets has a corresponding bracket with a
corresponding label. (b) Every label of the left square
brackets has a corresponding bracket with a
corresponding label.
These axioms are formulated as follows:
$(\Sigma - L)\, L\, L^i\, ]_c \Rightarrow [_c\, T'_{m,c}\, (\langle_c\, T'_{m,c})^i \,\_$   (13a)
$[_c\, L\, L^i\, (\Sigma - L) \Rightarrow \_\, T''_{m,c}\, (\rangle_c\, T''_{m,c})^i\, ]_c$   (13b)
where $0 \le i < d$, $1 \le c \le k$,
a4
a29 a25a5a26a25a12 a17
a28
a31a27a7a6 a25a29 a55 [a29a9a8 a29 a25a5 ]a29 a33a48 a57 a27
a15
a25a29 if a17a102a17
a0
a31
a31a27a7a6 a25a29 a55 [a29
a4
a29 a25a5 a25a12a6a17 a3 ]a29 a33a48 a57 a27
a15
a25a29 if a17 a1
a0
a31
(14a)
a10
a29 a25a5 a25a12 a17
a28
a31a27 a6 a25a29 a55 [a29 a0 a29 a25a5 ]a29 a33a48 a57 a27
a15
a25a29 if a17 a17
a0
a31
a31a27 a6 a25a29 a55 [a29
a10
a29 a25a5 a25a12a17 a3 ]a29 a33a48 a57 a27
a15
a25a29 if a17a20a1
a0
a31
(14b)
where a0a7a13a3a2a53a13 a1 a37a39a38 a13a12a11a5a1a13a0 a37 a0a49a13 a17a15a13
a0
a37
and a8 a29 a25a5 and a0 a29 a25a5 describe what is inside the [a29a13a37 ]a29 -
square brackets, when the a31a14a11a16a15a35a0a33 th label of the left
and right square bracket, respectively, has a match-
ing label. These languages are defined as
a8 a29 a25a5a95a17a35a26
a5a18a17
a29 a31 a34a29 a27 a84a84
a15
a25a29
a33
a5
a55a44a31a98a26 a48a52 a26
a5
a26 a48a33a31a58a46a81a52 a26a40a33a27
a15
a25a29
a0 a29 a25a5 a17a18a31a27 a84
a15
a25a29
a36a29 a33
a5 a17
a29 a26
a5
a55 a27
a15
a25a29 a31a58a46a60a52 a26a40a33a31a98a26 a48a52 a26
a5
a26 a48a33
where
a17
a29 a17 a55a20a19a22a21a61a6a24a23a28a31
a39
a27
a15
a25a29
a39
a55
a39
a27
a15
a25a29
a39
a33a92a57 a27
a15
a25a29
is a language whose strings contain just matching
pairs of labels and everything that can come be-
tween them.
6 Non-Projectivity Depth
The arcs in dependency trees constitute, by the
definition of trees, an acyclic graph (our discussion
assumes that there are no secondary links in D-trees).
In the axiomatization of the string representation,
we have to enforce acyclicity by some constraints.
Procedurally, acyclicity could be decided, for
example, by trying to arrange the nodes into an order
where the arcs go from left to right (topological
sorting). Corresponding declarative solutions
would be e.g. (i) to use set constraints (Duchier,
1999) or (ii) to attach to each node an integer that
increases strictly in the nodes reached by the outgoing
arcs of the node. Both of these solutions are
problematic because the number of reached nodes is, in
practice, unbounded. An alternative solution that
is adopted here is to use a monotonically increasing
counter that is incremented only at certain critical
positions. For technical reasons, we attach such a
counter to arcs and brackets rather than to the nodes;
this change is not mathematically significant.
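The procedural alternative mentioned above, deciding acyclicity by topological sorting, can be sketched with Kahn's algorithm. This is the standard textbook procedure, not the declarative counter solution adopted in this paper; the node numbering and the function name are assumptions of the example.

```python
from collections import deque

def is_acyclic(num_nodes, arcs):
    """Kahn's algorithm: True if the directed arcs (governor, dependent)
    allow all nodes to be arranged in a topological order."""
    successors = [[] for _ in range(num_nodes)]
    indegree = [0] * num_nodes
    for gov, dep in arcs:
        successors[gov].append(dep)
        indegree[dep] += 1
    # Start from the nodes that no arc enters
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    ordered = 0
    while queue:
        node = queue.popleft()
        ordered += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # Nodes on a cycle never reach indegree 0, so they are never ordered
    return ordered == num_nodes

# Arcs of Figure 1 as (governor, dependent), with 0 as the wall (assumed numbering)
print(is_acyclic(7, [(0, 3), (6, 1), (5, 2), (3, 4), (3, 6), (6, 5)]))
```

Such a procedure needs to look at the whole graph at once, whereas the counter attached to arcs can be checked locally, node by node.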
Let a7 be the linear precedence relation over the
nodes. A node in a D-tree is an articulation node
if no arcs are passing it and the arcs coming into
it are on the opposite side than the arcs going out
from it. A chain of colored arcs a2 a3a31a5
a1
a26
a2 a7 , a2a8a7 a5
a1
a27
a2 a40 , a2 a40 a5
a1
a30
a2a22a25 , a74a75a74a75a74 a2
a50
a34
a3a35a5
a1a27a26a29a28
a26
a2
a50
, where a1a12 is
the color of an arc a2 a12 a5 a1a31a30 a2 a12a17 a3 , is called a colored
dependency path. A node a2 a12, a0 a1 a17 a1a66a47 , is critical
if either (i) a2 a12a6a17 a3 a7 a2 a12a34 a3 a7 a2 a12, (ii) a2 a12 a7 a2 a12a34 a3 a7 a2 a12a6a17 a3 ,
or (iii) a1a12a34 a3 a37 a1a12a17 a3 . The non-projectivity depth of
a (colored dependency) path that does not contain
an articulation node is the number of critical nodes
visited by it. The maximum depth of such paths in
a D-tree is the non-projectivity depth of the D-tree.
Incrementing counters only at the critical
positions has an important advantage over the other
solutions mentioned: projective trees do not contain
any critical positions, and in non-projective trees of
natural language sentences we probably need only
very small numbers. If the counter is incremented
several times, the path can be very “unnatural” as
shown in Figure 5. We conjecture that if a
dependency path is cyclic, we cannot increment the
counters at every critical position. In other words,
assigning depths to the arcs of a D-graph excludes
the alternative that the graph would be cyclic.
Figure 5: The growth of the non-projectivity depth
(arcs drawn in color 1 and color 2).
In our representation, there is a fixed upper bound
a2 for the non-projectivity depth. Based on it, we can
define, for all a17a85a17 a18a38a10a37 a0 a37 a1 a37a75a74a75a74a75a74 a37 a2 a19 , the sets a26 a27 a25a12 a17
a19 a31
a39
a37 a17a34a33a56a21
a39a66a36
a3 a24 a69 a26 a27 , and a26a40a42 a25a12 a17 a19
a39
a21
a39a66a36
a26 a27 a25a12a24 . The second component of each labela31
a39
a37 a17a34a33 is
the non-projectivity depth counter. The incremen-
tation of these counters in critical nodes gives rise
to the following axioms. They constrain nodes i.e.
substrings occurring between two word boundaries:
Axiom 15. (a) In nodes that are not articulation
nodes, there is no label a39a44a36 a26 a27 a25a12, wherea38 a13 a17 a1 a2 ,
if there is a labela77 a36 a26a28a42 a25a0 , a17 a1 a3 a13 a2 . (b) There is
no label a39a60a36 a26 a27 a25a0 , where a38 a1 a3 a13 a2 , if there is no
labela77 a36 a26a40a42 .
Axiom 16. (a) There are no labels a39 a36 a26 a42 a25a22 and
a77
a36
a26 a27 a25a22 , where a38 a13 a23 a13 a2 , that are attached
to closing brackets in this order. (b) There are no
labels a39a89a36 a26 a27 a25a22 anda77 a36 a26 a42 a25a22 , where a38 a13 a23a79a13 a2 ,
that are attached to opening brackets in this order.
(c) There are no labels a39a80a36 a26 a27 a25a22 and a77 a36 a26 a42 a25a22 ,
wherea38 a13 a23a86a13 a2 , that are attached respectively to a
closing bracket and an opening bracket so that the
color index of the closing bracket is smaller than
that of the opening bracket.
Axiom 17. There is no label a39 a36 a26a70a27 a25a0 ,a38 a1 a3a92a13 a2 ,
that is attached to an opening bracket, if on the left
of the label a39 there is no a77 a36 a26a28a42 a25a0 , or on the right
of the label a39 there is no labela77 a36 a26a70a42 a25a0 a34 a3 .
Axiom 18. There is no label a39 a36 a26 a27 a25a0 ,a38 a1 a3a92a13 a2 ,
that is attached to a closing bracket with color
a2 , if on the right of the label
a39 there is no label
a77
a36
a26 a42 a25a0 , or on the left of the label
a39 there is no
label a77 a36 a26 a42 a25a0 a34 a3 , or there is no label a77 a36 a26 a42 a25a0 a34 a3
that is attached to an opening bracket with a color
greater than a2 .
Axiom 19. In articulation nodes, the counters of the
outgoing arc labels must be zero.
More formally, these are given as follows:
a46 a48 a52a79a46 a48a31a98a16 a59a18a31a26 a27 a25a12a59a22a26a40a42 a25a0 a55a86a26a15a42 a25a0 a59a56a26 a27 a25a12a33 (15a)
a55a93a31a98a26 a27 a25a12a59a86a26a40a42 a25a0 a55a28a26a15a42 a25a0 a59a35a26 a27 a25a12a33a78a59a86a16a83a55
a26a15a42 a25a0 a59a86a16 a84a59a35a26 a27 a25a12 a55a70a26 a27 a25a12a59a56a16 a84a59a86a26a15a42 a25a0 a33a78a46 a48
a26 a27 a25a0 a71a105a26a40a42a53a59 a37 a59a35a26a40a42 (15b)
a46 a48 a52a96a46 a48a26 a42 a25a22 a59a35a26 a27 a25a22 a59a18a31a98a5a7a8a56a55a22a9a11a8a104a33a78a46 a48 (16a)
a46 a48 a52a79a46 a48a31a98a5a11a6 a55a56a9a51a6 a33a93a59a35a26 a27 a25a22 a59a35a26 a42 a25a22 a46 a48 (16b)
a46 a48a52 a46 a48a26 a27 a25a22 a26 a48a5 a84a59a53a19 [a0 a37 a36 a0a24a26 a48a26 a27 a25a22 a46 a48 (16c)
a26a15a27 a25a0 a71a73a26 a42 a25a0 a59 a37 (17)
a31a59a29a55a28a26a15a42 a25a0
a34
a3a39a33a75a55a86a26 a48a31a98a5a7a8a22a55a86a9a11a8 a33
a26 a27 a25a0 a71a105a26a40a42 a25a0
a34
a3 a59 a37 (18)
a31a58a59 a31a98a26a15a42 a25a0 a55a28a5 a84a26 a48a26a15a42 a25a0
a34
a3a39a33a78a33a75a55a7a31a58a46 a48
#
a52a14a26 a48a19]a29a10a37 a34a29a24a61a46 a48a33
a26 a27 a25a0 a71a105a16a60a59 a59a43a26a15a42 a37 a26a15a42 a59 a59a35a16 a37 (19)
a31a98a16a96a59 a26a15a42a60a55a86a16 a84a33a93a59 a37 a59a32a31a98a26a15a42a44a59a35a16 a55a22a16 a84a33
wherea38 a13 a17a2a1 a3a54a13 a2 a37a39a38 a13a25a23a86a13 a2 a37 a0a14a13 a2 a1 a1 a37
anda5 a84 a17a20a19 [a0 a37 a36 a0 a21 a2 a1 a8 a13 a1a25a24 a37 a16 a84 a17 a16a20a52a96a19a0 a18a24
7 Colored Non-projective Dependency
Grammar
A new grammatical framework, called Colored
Non-projective Dependency Grammar (CNDG)
(Yli-Jyrä and Nykänen, 2004), has been developed
on top of bounded nested crossing depth
and bounded non-projectivity depth, using a carefully
designed linear context-free rewriting system.
A regular approximation of such a grammar is obtained
by compiling the colored dependency rules
into constraints that specify local subcategorization
features (labels and bracket colors) within the node
boundaries. The axioms presented above take care of the
non-local structure of the described D-trees.
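To make the notion of bracket colors concrete, the following minimal sketch assigns colors to dependency arcs so that no two crossing arcs share a color. It is an illustration under stated assumptions: arcs are given as pairs of word positions, colors are numbered from 0, and the greedy strategy is a stand-in, not the allocation scheme of Yli-Jyrä and Nykänen (2004), which additionally guarantees a unique representation. The names crosses and color_arcs are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Arc = Tuple[int, int]  # (position of one end, position of the other end)


def crosses(a: Arc, b: Arc) -> bool:
    """True if the two arcs cross, i.e. exactly one end of one arc
    lies strictly inside the span of the other."""
    (i, j), (k, l) = sorted(a), sorted(b)
    return i < k < j < l or k < i < l < j


def color_arcs(arcs: Iterable[Arc]) -> Dict[Arc, int]:
    """Greedily give each arc the smallest color not used by any
    already colored arc that crosses it (an assumed strategy)."""
    colors: Dict[Arc, int] = {}
    # Process arcs from left to right, wider arcs first on ties.
    for arc in sorted(arcs, key=lambda a: (min(a), -max(a))):
        used = {colors[other] for other in colors if crosses(arc, other)}
        color = 0
        while color in used:
            color += 1
        colors[arc] = color
    return colors


if __name__ == "__main__":
    print(color_arcs([(0, 2), (1, 3)]))  # crossing arcs get distinct colors
    print(color_arcs([(0, 3), (1, 2)]))  # nested arcs can share a color
```

In such a scheme, a projective structure needs only one color, and the number of colors grows only when arcs cross.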
Example 1. The following set of rules describes the
dependency tree shown in Figure 1:
a1
a31 a38
a2
preda33
a31a58a33
a1
venita31 a38
a2
pred a1 a1
a2
adv subj a33
aetasa31 a0
a2
attr a1
a2
subj gntv a1 a33
iama31 a1 a2 adv a1 a33
ultimaa31 a1 a0 a2 attr a33
Cumaeia31 a1 a0 a2 attr a33
carminisa31 a0 a2 attr a1 a1
a2
gntv a33
When the second and the third rule, for example, are
compiled, we obtain the following two constraints:
pred a71 # ]a6 a31venita55a56a16 a33a48 a19 [a7 a37 a36 a7a61a24 subj [a7 a48 adv #
subj a71 # gntv ]a7 a48 a19 ]a7 a37 a34 a7a61a24 attr ]a3 a31aetasa55a22a16 a33a48 #a74
8 Further Work
In the future, the current axiomatization should be
extended to allow free dependents and to include
rules without colors and arc order. Furthermore,
efficient methods for applying the axioms should be
developed, and a standard finite-state parser using
these axioms should be specified.
The approach could also be made multi-tiered, so that
different kinds of bracketed
strings (including e.g. P-markers) are processed
with a multi-tape finite automaton. We could also
use weighted automata to improve the ranking of
alternative analyses.
We would like to develop full-scale grammars and
to evaluate the presented representation properly in
a practical setting. The possibility of inducing a grammar
automatically from a treebank could be examined.
The proposed complexity bounds could also be applied
to treebank validation and, more generally, to
linguistic studies of natural language complexity.
9 Conclusion
In this article, we have proposed a new representation
for restricted non-projective dependency trees.
The representation combines dependency and
linearization into a single string structure, and it
gives a realistic basis for non-projective finite-state
dependency parsers. The complexity measures used
here may be of interest in different areas of linguistics.
10 Acknowledgements
This work was funded by NorFA under the author’s
personal Ph.D. scholarship (ref.nr. 010529). The
author is grateful to the anonymous referees for in-
sightful comments on an earlier version of this pa-
per.

References
Steven Abney. 1996. Partial parsing via finite state
cascades. In Proceedings of the ESSLLI’96 Ro-
bust Parsing Workshop.
Noam Chomsky. 1957. Syntactic Structures. Mou-
ton, The Hague.
Michael A. Covington. 1990. Technical correspon-
dence. Computational Linguistics, 16(4).
Denys Duchier. 1999. Set constraints in computa-
tional linguistics – solving tree descriptions. In
Workshop on Declarative Programming with Sets
(DPS’99), pages 91–98, Paris.
David Elworthy. 2000. A finite state parser with
dependency structure output. In Proceedings of
International Workshop on Parsing Technologies.
Sylvain Kahane, Alexis Nasr, and Owen Ram-
bow. 1998. Pseudo-projectivity: A polynomially
parsable non-projective dependency grammar. In
COLING-ACL’98, volume I, pages 646–652.
Fred Karlsson. (in print). Limits of clausal embed-
ding complexity in Standard Average European.
Manuscript. Department of General Linguistics,
University of Helsinki, April 2004.
Kimmo Koskenniemi. 1997. Representations and
finite-state components in natural language. In
E. Roche and Y. Schabes, editors, Finite-state
language processing, pages 99–116. A Bradford
Book, MIT Press, Cambridge, MA.
Steven Krauwer and Louis des Tombe. 1981.
Transducers and grammars as theories of lan-
guage. Theoretical Linguistics, 8:173–202.
Igor A. Mel’čuk. 1988. Dependency Syntax: Theory
and Practice. State University of New York
Press, Albany.
Alexis Nasr, Owen Rambow, John Chen, and Srini-
vas Bangalore. 2002. Context-free parsing of
a tree adjoining grammar using finite-state ma-
chines. In Proceedings of TAG+6, pages 100–
105, Università di Venezia.
Kemal Oflazer. 2003. Dependency parsing with
an extended finite-state approach. Computational
Linguistics, 29(4).
Anssi Yli-Jyrä and Matti Nykänen. 2004. A hierarchy
of mildly context sensitive dependency grammars.
In Formal Grammar (FGNancy), Nancy,
France, August.
Anssi Yli-Jyrä. 2003a. Multiplanarity - a model for
dependency structures in treebanks. In The Second
Workshop on Treebanks and Linguistic Theories,
Växjö, Sweden, 14-15 November.
Anssi Yli-Jyrä. 2003b. Regular approximations
through labeled bracketing. In Gerhard Jäger,
Paola Monachesi, Gerald Penn, and Shuly Wintner,
editors, Proceedings of Formal Grammar
2003, pages 189–201, Vienna.
Anssi Yli-Jyrä. 2004a. Approximating dependency
grammars through intersection of regular string
languages. Ninth International Conference on
Implementation and Application of Automata
(CIAA), July.
Anssi Yli-Jyrä. 2004b. Coping with dependencies
and word order or how to put Arthur’s court into
a castle. In H. Holmboe, editor, Nordisk Sprogteknologi
2003. Årbog for Nordisk Sprogteknologisk
Forskningsprogram 2000-2004, pages 123–137.
Museum Tusculanums Forlag, Københavns
Universitet.
