TWO-WAY FINITE AUTOMATA AND DEPENDENCY GRAMMAR:
A PARSING METHOD FOR INFLECTIONAL FREE WORD ORDER LANGUAGES 1
Esa Nelimarkka, Harri Jäppinen and Aarno Lehtola
Helsinki University of Technology 
Helsinki, Finland 
ABSTRACT
This paper presents a parser of an 
inflectional free word order language, namely 
Finnish. Two-way finite automata are used to 
specify a functional dependency grammar and to 
actually parse Finnish sentences. Each automaton 
gives a functional description of a dependency 
structure within a constituent. Dynamic local 
control of the parser is realized by augmenting the 
automata with simple operations to make the 
automata, associated with the words of an input 
sentence, activate one another. 
I INTRODUCTION
This paper introduces a computational model
for the description and analysis of an inflectional 
free word order language, namely Finnish. We argue 
that such a language can be conveniently described 
in the framework of a functional dependency grammar 
which uses formally defined syntactic functions to 
specify dependency structures and deep case 
relations to introduce semantics into syntax. We
show how such a functional grammar can be compactly 
and efficiently modelled with finite two-way 
automata which recognize the dependants of a word
in various syntactic functions on both of its sides
and build corresponding dependency structures. 
The automata along with formal descriptions of 
the functions define the grammar. The functional 
structure specifications are augmented with simple 
control instructions so that the automata 
associated with the words of an input sentence 
actually parse the sentence. This gives a strategy 
of local decisions resulting in a strongly data 
driven left-to-right and bottom-up parse. 
A parser based on this model is being 
implemented as a component of a Finnish natural 
language data base interface where it follows a 
separate morphological analyzer. Hence, throughout 
the paper we assume that all relevant morphological 
and lexical information has already been extracted 
and is computationally available for the parser. 
1 This research is supported by SITRA (Finnish
National Fund for Research and Development).
Although we focus on Finnish we feel that the 
model and its specification formalism might be 
applicable to other inflectional free word order 
languages as well. 
II LINGUISTIC MOTIVATION
Certain features of Finnish lead us to prefer
dependency grammar to pure phrase structure
grammars as the linguistic foundation of our
model.
Firstly, Finnish is a "free word order" 
language in the sense that the order of the main 
constituents of a sentence is relatively free. 
Variations in word order configurations convey
thematic and discourse information. Hence, the
parser must be ready to meet sentences with variant 
word orders. A computational model should 
acknowledge this characteristic and cope 
efficiently with it. This demands a structure 
within which word order variations can be 
conveniently described. An important case in point 
is to avoid structural discontinuities and holes 
caused by transformations. 
We argue that a functional dependency-constituency
structure induced by a dependency grammar meets
these requirements. This structure
consists of part-of-whole relations of constituents 
and labelled binary dependency relations between 
the regent and its dependants within a constituent. 
The labels are pairs which express syntactic 
functions and their semantic interpretations. 
For example, the sentence "Nuorena poika 
heitti kiekkoa" ("As young, the boy used to throw 
the discus") has the structure 
                heitti
      /           |          \
 adverbial     subject     object
  Nuorena       poika      kiekkoa
or, equivalently, the linearized structure 
((Nuorena)advl (poika)subj heitti (kiekkoa)obj),
   TIME         AGENT             NEUTRAL
where each inflected word appears as a complex of its
syntactic, morphological and semantic properties. Hence,
our sentence structure is a labelled tree whose 
nodes are complex expressions. 
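The labelled tree described above can be pictured with a small data structure. The following Python sketch is only an illustration: the class `Constituent`, its field names and the role names TIME, AGENT and NEUTRAL are assumptions made for this example, not part of the parser described in this paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Constituent:
    """A head word with labelled dependants on both sides (hypothetical layout)."""
    word: str
    # Each entry is (syntactic function, semantic role, dependant constituent).
    left: List[Tuple[str, str, "Constituent"]] = field(default_factory=list)
    right: List[Tuple[str, str, "Constituent"]] = field(default_factory=list)

    def inner(self) -> str:
        """Linearize the head and its dependants without the outer brackets."""
        parts = [f"({d.inner()}){func}" for func, _role, d in self.left]
        parts.append(self.word)
        parts += [f"({d.inner()}){func}" for func, _role, d in self.right]
        return " ".join(parts)

    def linearize(self) -> str:
        """Return the bracketed, function-labelled form of the constituent."""
        return f"({self.inner()})"


# The example sentence; the role names are assumed for illustration.
sentence = Constituent(
    "heitti",
    left=[("advl", "TIME", Constituent("Nuorena")),
          ("subj", "AGENT", Constituent("poika"))],
    right=[("obj", "NEUTRAL", Constituent("kiekkoa"))],
)
print(sentence.linearize())
# prints: ((Nuorena)advl (poika)subj heitti (kiekkoa)obj)
```

Keeping the verb and all of its dependants on one level, as in this sketch, is what allows a change of word order to be expressed as a reordering of the `left` and `right` lists.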
The advantage of the functional dependency 
structures lies in the fact that many word order 
varying transformations can be described as 
permutations of the head and its labelled
dependants in a constituent. By reducing the depth of
structures (e.g. by having a verb and its subject, 
object, adverbials on the same level) we bypass 
many discontinuities that would otherwise appear in 
a deeper structure as a result of certain 
transformations. As an example we have the 
permutations 
((Poika)subj heitti (kiekkoa)obj (nuorena)advl),
(Heittikö (poika)subj (nuorena)advl (kiekkoa)obj)
and
((Kiekkoako)obj (poika)subj heitti (nuorena)advl).
("The boy used to throw the discus when he was
young", "Did the boy use to throw...?", "Was it
the discus that the boy used to throw...?",
respectively.)
The second argument for our choices is the
well acknowledged prominent role of a finite verb 
in regard to the form and meaning of a sentence. 
The meaning of a verb includes, for example, 
knowledge of its deep cases, and the choice of a 
particular verb to express this meaning determines 
to a great extent what deep cases are present on 
the surface level and in what functions. Moreover, 
due to the relatively free word order of Finnish, 
the main means of indicating the function of a word 
in a sentence is the use of surface case suffixes, 
and very often the actual surface case depends not 
only on the intended function or role but on the 
verb as well.
Finally, we wish to describe the sentence 
analysis as a series of local decisions of the 
following kind. Suppose we have a sequence
C_1, ..., C_{i-1}, C_i, C_{i+1}, ..., C_n of constituents as
a result of earlier steps of the analysis of an
input sentence, and assume further that the focus
of the analyzer is at the constituent C_i. In such a
situation the parser has to decide whether C_i is
(a) a dependant of the left neighbour C_{i-1},
(b) the regent of the left neighbour C_{i-1},
(c) a dependant of some forthcoming right neighbour, or
(d) the regent of some forthcoming right neighbour.
Observe that decisions (c) and (d) refer
either to a constituent which already exists on the
right side of C_i or to one which will appear there after
some steps of the analysis. Further, it should be
noticed that we do not want the parser to make any
hypothesis about the syntactic or semantic nature of
the possible dependency relation in (a) and (c) at
this moment.
We claim that a functional combination of 
dependency grammar and case grammar can be put into 
a computational form, and that the resulting model 
efficiently takes advantage of the central role of 
a constituent head in the actual parsing process by
letting the head find its dependants using 
functional descriptions. We outline in the next 
sections how we have done this with formally 
defined functions and 2-way automata. 
III FORMALLY DEFINED SYNTACTIC FUNCTIONS
We abstract the restrictions imposed on the 
head and its dependant in a given subordinate 
relation. Recall that a constituent consists of the 
head - a word regarded as a complex of its relevant
properties - and of the dependants - from zero to n 
(sub) constituents. 
The traditional parsing categories such as the 
(deep structure) subject, object, adverbial and 
adjectival attribute will be modelled as functions 
f: D_f -> C,
where C is the set of constituents and D_f ⊂ C × C
is the domain of the function.
The domain of a function f will be defined 
with a kind of Boolean expression over predicates 
which test properties of the arguments, i.e. the 
regent and the potential dependant. In the analysis
this relation is used to recognize and interpret
an occurrence of a <head, dependant> pair in the
given relation. The actual mapping of such pairs 
into C builds the structure corresponding to this 
function. 
For notational and implementational reasons we
specify the functions with a conditional expression 
formalism. A (primitive) conditional expression is 
either a truth valued predicate which tests 
properties of a potential constituent head (R) and 
its dependant (D) and deletes non-matching
interpretations of an ambiguous word, or an action
which performs one of the basic construction 
operations such as labelling (:=), attaching (:-), 
or deletion, and returns a truth value. 
Primitive expressions can be written in
series (P1 P2 ... Pn) or in parallel (P1; P2; ...;
Pn) to yield complex expressions. Logically, the
former corresponds roughly to an and-operation and
the latter to an or-operation. A conditional operation
-> and recursion yield new complex expressions 
from old ones. 
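The way series, parallel and conditional composition build complex expressions can be sketched with a few combinators. This is a rough illustration only: the function names `series`, `parallel`, `cond` and `has`, and the dictionary layout of the regent (R) and dependant (D), are assumptions of this sketch and not the formalism's actual implementation.

```python
from typing import Callable, Dict

# A context holds the feature dictionaries of the postulated regent "R"
# and dependant "D" (a hypothetical representation).
Context = Dict[str, dict]
Expr = Callable[[Context], bool]


def series(*exprs: Expr) -> Expr:
    """(P1 P2 ... Pn): succeed only if every expression succeeds, in order."""
    return lambda ctx: all(e(ctx) for e in exprs)


def parallel(*exprs: Expr) -> Expr:
    """(P1; P2; ...; Pn): succeed as soon as one alternative succeeds."""
    return lambda ctx: any(e(ctx) for e in exprs)


def cond(test: Expr, then: Expr) -> Expr:
    """test -> then: evaluate `then` only when `test` holds."""
    return lambda ctx: test(ctx) and then(ctx)


def has(who: str, feature: str, value: str) -> Expr:
    """Primitive predicate testing one property of R or D."""
    return lambda ctx: ctx[who].get(feature) == value


# Example: a verb regent whose dependant is in the partitive or accusative.
is_verb = has("R", "cat", "verb")
object_case = parallel(has("D", "case", "partitive"),
                       has("D", "case", "accusative"))
verb_with_object_case = cond(is_verb, object_case)
```

Because `all` and `any` stop at the first decisive result, the expressions are evaluated left to right and later tests or actions are skipped once the outcome is known.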
As an example, consider the expressions 'Object', 'RecObj'
and 'IntObj' in Figure 1.
[Figure 1. Definitions of the conditional expressions 'Object', 'RecObj' and 'IntObj'; listing not recoverable.]
The relation 'RecObj' approximates the
syntactic and morphological restrictions imposed on
a verb and its nominal object in Finnish. (It
represents partly the partitive-accusative
opposition of an object, and, for an accusative
object, its nominative-genitive distribution.) The
relation 'IntObj', on the other hand, tries to
interpret the postulated object using semantic
features and a subcategorization of verbs with
respect to deep case structures and their
realizations. The semantic restrictions imposed on
the underlying deep cases are checked at this
point. 'Object', after a successful match of these
syntactic and semantic conditions, labels the
postulated dependant (D) as 'Object' and attaches
it to the postulated regent (R).
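The division of labour between 'RecObj', 'IntObj' and 'Object' can be sketched as follows. The tests below are deliberately simplified stand-ins: the feature names (`cat`, `case`, `sem`, `object_restriction`) and the restrictions themselves are assumptions of this sketch and do not reproduce the actual conditions of Figure 1.

```python
def rec_obj(regent: dict, dep: dict) -> bool:
    """Toy syntactic test: a verb regent and a nominal in an object case."""
    return (regent.get("cat") == "verb"
            and dep.get("cat") == "noun"
            and dep.get("case") in {"partitive", "accusative"})


def int_obj(regent: dict, dep: dict) -> bool:
    """Toy semantic test: the noun carries the feature the verb requires."""
    required = regent.get("object_restriction")
    return required is not None and required in dep.get("sem", set())


def object_function(regent: dict, dep: dict) -> bool:
    """Label and attach the dependant only if both tests succeed."""
    if rec_obj(regent, dep) and int_obj(regent, dep):
        dep["function"] = "Object"                        # labelling
        regent.setdefault("dependants", []).append(dep)   # attaching
        return True
    return False


# Hypothetical lexical entries for the example sentence.
verb = {"word": "heitti", "cat": "verb", "object_restriction": "concrete"}
noun = {"word": "kiekkoa", "cat": "noun", "case": "partitive",
        "sem": {"concrete"}}
print(object_function(verb, noun))
```

The structure is built only after both the syntactic and the semantic test have succeeded, so a failed match leaves the regent and the candidate dependant unchanged.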
IV FUNCTIONAL DESCRIPTIONS WITH TWO-WAY AUTOMATA
We introduced the formal functions to define 
conditions and structures associated with syntactic 
dependency relations. What is also needed is a 
description of what dependants a word can have and 
in what order. 
In a free word order language we would face,
for example, a paradigm fragment of the form
(subj) V (obj) (advl) 
(advl) (subj) V (obj) 
V (subj) (obj) (advl) 
(obj) (subj) V (advl) 
for functional dependency structures of a verb. 
(Observe that we do not assume transformations to
describe the variants.) We combine the descriptions
of such a paradigm into a modified two-way finite
automaton.
A 2-way finite automaton consists of a set of
states, one of which is the initial state and some 
of which are final states, and of a set of 
transition arcs between the states. Each arc 
recognizes a word, changes the state of the 
automaton and moves the reading head either to the 
left or right. 
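For readers unfamiliar with the standard notion, a deterministic two-way automaton can be simulated in a few lines. The sketch below is a textbook-style illustration, not part of the parser: the end markers `<` and `>`, the acceptance convention (leaving the tape on the right in a final state) and the example machine are assumptions of this sketch.

```python
from typing import Dict, Set, Tuple

LEFT, RIGHT = -1, +1
Delta = Dict[Tuple[str, str], Tuple[str, int]]


def run_two_way(delta: Delta, start: str, finals: Set[str],
                word: str, max_steps: int = 10_000) -> bool:
    """Simulate a deterministic two-way finite automaton on `word`."""
    cells = ["<", *word, ">"]          # tape with end markers
    state, pos = start, 0
    for _ in range(max_steps):
        if pos == len(cells):          # head has left the tape on the right
            return state in finals
        move = delta.get((state, cells[pos]))
        if move is None:               # no arc: reject
            return False
        state, step = move
        pos += step
        if pos < 0:                    # fell off the left end: reject
            return False
    return False                       # step limit reached (possible loop)


# Example machine: accept strings over {a, b} whose last symbol is 'a'.
# It scans to the right end marker, steps back LEFT to inspect the last
# symbol, and then moves RIGHT again to leave the tape.
DELTA: Delta = {
    ("scan", "<"): ("scan", RIGHT),
    ("scan", "a"): ("scan", RIGHT),
    ("scan", "b"): ("scan", RIGHT),
    ("scan", ">"): ("check", LEFT),
    ("check", "a"): ("ok", RIGHT),
    ("ok", ">"): ("ok", RIGHT),
}
print(run_two_way(DELTA, "scan", {"ok"}, "ba"))
```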
We modify this standard notion to recognize 
left and right dependants of a word starting from 
its immediate neighbour. Instead of recognizing 
words (or word categories) these automata recognize 
functions, i.e. instances of abstract relations
between a postulated head and either of its
neighbours. In addition to mere recognition, the
transitions build the structures determined by the 
observed function, e.g. attach the neighbour as a 
dependant, label it in agreement with the function 
and its interpretation. 
STATE: ?V LEFT
((D = +Phrase) -> (Subject -> (C := ?VS));
                  (Object -> (C := ?VO));
                  (Adverbial -> (C := ?V));
                  (SentSubj -> (C := VS?));
                  (SentAdvl -> (C := ?V));
                  (T -> (C := V?)));
((D = -Phrase) -> (C := V?))
STATE: V? RIGHT
((D = +Phrase) -> (Subject -> (C := VS?));
                  (Object -> (C := VO?));
                  (SentSubj -> (C := VSentS?));
                  (SentObj -> (C := VSentO?));
                  (Adverbial -> (C := V?));
                  (SentAdvl -> (C := VSentA?));
                  (T -> (C := VFinal)));
((D = -Phrase) -> (C := V?)(BuildPhraseOn RIGHT))
STATE: ?VS LEFT
((D = +Phrase) -> (Object -> (C := ?VSO));
                  (Adverbial -> (C := ?VS));
                  (SentAdvl -> (C := VS?));
                  (T -> (C := VS?)));
((D = -Phrase) -> (C := VS?))
Figure 2.
Figure 2 exhibits part of a verb automaton
which recognizes and builds, for example, partial
structures like
  V      V      V        V           V
  |      |      |       / \         / \
 subj , obj , advl , obj  subj , advl  subj , ...
The states are divided into 'left' and 'right'
states to indicate the side where the dependant is
to be found. Each state indicates the formal
functions which are available for a verb in that
particular state. A successful application of a
function transfers the verb into another state to
look for further dependants.
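How a verb moves through 'left' and 'right' states while collecting dependants can be sketched with a transition table. The table below is hypothetical: the state names loosely follow Figure 2, but the arcs of states not shown there, the fallback states and the assumption that each neighbour has already been recognized in a single function are inventions of this sketch.

```python
# state -> (side, {function: next state}, fallback state when nothing matches)
VERB_AUTOMATON = {
    "?V":   ("LEFT",  {"Subject": "?VS", "Object": "?VO", "Adverbial": "?V"}, "V?"),
    "?VS":  ("LEFT",  {"Object": "?VSO", "Adverbial": "?VS"}, "VS?"),
    "?VO":  ("LEFT",  {"Subject": "?VSO", "Adverbial": "?VO"}, "VO?"),
    "?VSO": ("LEFT",  {"Adverbial": "?VSO"}, "VSO?"),
    "V?":   ("RIGHT", {"Subject": "VS?", "Object": "VO?", "Adverbial": "V?"}, "VFinal"),
    "VS?":  ("RIGHT", {"Object": "VSO?", "Adverbial": "VS?"}, "VFinal"),
    "VO?":  ("RIGHT", {"Subject": "VSO?", "Adverbial": "VO?"}, "VFinal"),
    "VSO?": ("RIGHT", {"Adverbial": "VSO?"}, "VFinal"),
}


def collect(left, right, state="?V"):
    """Attach neighbours to a verb, nearest first, following the table.

    `left` and `right` list the functions that the neighbours on each
    side could fill, ordered from the nearest neighbour outwards.
    Returns the attached (side, function) pairs and the last state.
    """
    left, right = list(left), list(right)
    attached = []
    while state in VERB_AUTOMATON:
        side, arcs, fallback = VERB_AUTOMATON[state]
        queue = left if side == "LEFT" else right
        if queue and queue[0] in arcs:
            func = queue.pop(0)
            attached.append((side, func))
            state = arcs[func]
        else:
            state = fallback
    return attached, state


# "Nuorena poika heitti kiekkoa": subject and adverbial on the left.
print(collect(["Subject", "Adverbial"], ["Object"]))
# "Heittikö poika nuorena kiekkoa": all dependants on the right.
print(collect([], ["Subject", "Adverbial", "Object"]))
```

Both word orders end in the same final state with the same three functions attached, which is the sense in which one automaton covers a whole paradigm of word order variants.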
Heuristic rules and look-ahead can also be
used. For example, the rule
((R1 = ',)(R2 = 'että)(C = +Sattr)
-> (C := N?Sattr)(BuildPhraseOn RIGHT))
in the state N? of the noun automaton anticipates 
an evident forthcoming sentence attribute of, say, 
a cognitive noun and sets the noun to the state 
N?Sattr to wait for this sentence. 
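The look-ahead rule above can be paraphrased in code. This is a minimal sketch under stated assumptions: the feature name `sattr`, the function names and the representation of the upcoming words as a list of strings are all hypothetical.

```python
from typing import Optional, Sequence, Set, Tuple


def anticipates_sentence_attribute(noun_features: Set[str],
                                   upcoming: Sequence[str]) -> bool:
    """True when a comma and 'että' follow a noun that accepts a sentence attribute."""
    return "sattr" in noun_features and list(upcoming[:2]) == [",", "että"]


def next_state(state: str, noun_features: Set[str],
               upcoming: Sequence[str]) -> Tuple[str, Optional[str]]:
    """Return the new state and the control operation to perform, if any."""
    if state == "N?" and anticipates_sentence_attribute(noun_features, upcoming):
        # Wait in N?Sattr while the clause on the right is being built.
        return "N?Sattr", "BuildPhraseOn RIGHT"
    return state, None


print(next_state("N?", {"sattr"}, [",", "että", "poika", "heitti"]))
# prints: ('N?Sattr', 'BuildPhraseOn RIGHT')
```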
V PARSING WITH A SEQUENCE OF 2-WAY AUTOMATA
So far we have shown how to associate a 2-way
automaton with a word via its syntactic category.
This gives a local description of the grammar. With
a few simple control instructions these local 
automata are made to activate each other and, 
after a sequence of local decisions, actually parse 
an input sentence. 
An unfinished parse of a sentence consists of 
a sequence C_1, C_2, ..., C_n of constituents, which
may be complete or incomplete. Each constituent is 
associated with an automaton which is in some state 
and reading position. At any time, exactly one of 
the automata is active and tries to recognize a 
neighbouring constituent as a dependant. 
Most often, only a complete constituent (one 
featured as '+phrase') qualifies as a potential 
dependant. To start the completion of an incomplete 
constituent the control has to be moved to its 
associated automaton. This is done with a kind of 
push operation (BuildPhraseOn RIGHT) which 
deactivates the current automaton and activates the 
neighbour next to the right (see Figure 2). This 
decision corresponds to a choice of type (d). A 
complete constituent in a final state will be 
labelled as a '+phrase' (along with other relevant 
labels such as '±sentence', '±nominal', '±main').
Operations (FindRegOn LEFT) and (FindRegOn RIGHT),
which correspond to choices (a) and (c), deactivate 
the current constituent (i.e. the corresponding 
automaton) and activate the leftmost or rightmost 
constituent, respectively. Observe that the 
automata need not remember when and why they were 
activated. The simple "local control" we have
outlined above yields a strongly data driven
bottom-up and left-to-right parsing strategy which
also has top-down features in the form of expectations of
lacking dependants.
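The local control described in this section can be illustrated with a toy driver in which exactly one constituent is active at a time. Everything in the sketch is a simplifying assumption: nominals take no dependants, a verb's automaton is reduced to a LEFT phase followed by a RIGHT phase, the syntactic function is read directly off the surface case, and the control operations are reduced to moving the `active` index.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Toy mapping from surface case to function (an assumption of this sketch).
CASE_TO_FUNCTION = {"nominative": "subj", "partitive": "obj", "essive": "advl"}


@dataclass
class Constituent:
    word: str
    cat: str                       # "verb" or "nominal"
    case: Optional[str] = None
    phase: str = "LEFT"            # side on which the automaton is looking
    complete: bool = False         # True once featured as '+phrase'
    deps: List[Tuple[str, "Constituent"]] = field(default_factory=list)


def function_for(regent: Constituent, dep: Constituent) -> Optional[str]:
    """Return the function `dep` can fill for `regent`, or None."""
    if regent.cat == "verb" and dep.cat == "nominal":
        return CASE_TO_FUNCTION.get(dep.case)
    return None


def parse(seq: List[Constituent], max_steps: int = 1000) -> List[Constituent]:
    seq, active = list(seq), 0
    for _ in range(max_steps):
        c = seq[active]
        left = seq[active - 1] if active > 0 else None
        right = seq[active + 1] if active + 1 < len(seq) else None
        if not c.complete and c.cat != "verb":
            c.complete = True                    # toy nominals need no dependants
        elif c.complete:
            # Analogue of FindRegOn: hand control to a potential regent.
            if left is not None and not left.complete:
                active -= 1
            elif right is not None:
                active += 1
            else:
                break
        elif c.phase == "LEFT":
            ok = left is not None and left.complete
            func = function_for(c, left) if ok else None
            if func:
                c.deps.append((func, left))      # attach the left neighbour
                del seq[active - 1]
                active -= 1
            else:
                c.phase = "RIGHT"
        elif right is not None and not right.complete:
            active += 1                          # analogue of BuildPhraseOn RIGHT
        else:
            func = function_for(c, right) if right is not None else None
            if func:
                c.deps.append((func, right))     # attach the right neighbour
                del seq[active + 1]
            else:
                c.complete = True
    return seq


def relations(c: Constituent) -> List[Tuple[str, str]]:
    """Order-independent view of the dependency relations of a head."""
    return sorted((f, d.word) for f, d in c.deps)


words = [("nuorena", "nominal", "essive"), ("poika", "nominal", "nominative"),
         ("heitti", "verb", None), ("kiekkoa", "nominal", "partitive")]
result = parse([Constituent(*w) for w in words])
print(relations(result[0]))
```

Note that no constituent records why it was activated: each step looks only at the active constituent and its two neighbours, which is the sense of "local control" in this sketch.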
VI DISCUSSION
As we have shown, our parser consists of a
collection of finite transition networks which
activate each other. The use of 2-way instead of
1-way automata distinguishes our parser from
ATN-parsers. (There are also other major
differences.) In our dependency oriented model
non-terminal categories (S, VP, NP, AP, ...) are
not needed, and a constituent is not postulated
until its head is found. This feature separates our
parser from those which build pure constituent
structures without any reference to dependency
relations within a constituent. In fact, each word
actively collects its dependants to make up a
constituent where the word is the head.
A further characteristic of our model is the
late postulation of syntactic functions and
semantic roles. Constituents are built blindly
without any predecided purpose so that the
completed constituents do not know why they were
built. The function or semantic role of a
constituent is not postulated until a neighbour is
activated to recognize its own dependants. Thus, a
constituent just waits to be chosen into some
function so that no registers for functions or
roles are needed.
VII REFERENCES
Hudson, R.: Arguments for a Non-transformational
Grammar. The University of Chicago Press, 1976.
Hudson, R.: Constituency and Dependency.
Linguistics 18, 1980, 179-198.
Jäppinen, H., Nelimarkka, E., Lehtola, A. and
Ylilammi, M.: Knowledge engineering approach to
morphological analysis. Proc. of the First
Conference of the European Chapter of ACL, Pisa,
1983, 49-51.
Lehtola, A.: Compilation and implementation of
2-way tree automata for the parsing of Finnish.
Helsinki University of Technology (forthcoming
M.Sc. thesis).
Nelimarkka, E., Jäppinen, H. and Lehtola, A.:
Dependency oriented parsing of an inflectional
language (manuscript).
