TOWARD A PARSING METHOD FOR FREE WORD ORDER LANGUAGES x)
Janusz S. Bień, Stanisław Szpakowicz
Institute of Informatics, Warsaw University
P.O.B. 1210, 00-901 Warszawa, Poland
"Free word order" is a traditional term that should not
be taken literally. However, we shall retain the term for its 
conciseness. 
Formal descriptions of syntax have usually been based
either on immediate constituents (IC) or on the dependency
philosophy. Neither of them seems directly applicable to free
word order languages. Intertwining phrases cannot be described
naturally by IC rules. Some coordinate constructions are
difficult to describe by means of dependency relations.
In our opinion, parsers for free word order languages should
not be based on the methods developed within the IC framework.
The scarce experiments with parsers based on the dependency
formalism, e.g. /5/, do not seem promising. Therefore, we decided
to make a fresh start and to attack the problem by reanalysing
the basic notions of syntax and parsing. We focus our
attention on those formal aspects of a language system which
might be most useful for automatic text processing. We assume
that the morphological level is described along the lines of
/2/.
x) This paper is an extended abstract of /3/. 
2. The Notion of Syntax
In this paper, we understand syntax as the domain of
formal relations between words, i.e. roughly as so-called
surface syntax. We define the notion using a morphology-based
criterion, described below.
The outcome of morphological analysis can be ambiguous
for an isolated word. In most situations, however, the
morphological features of a word are uniquely determined by some
formal properties of its context.
Sometimes the ambiguity remains, as in the following
sentence:

    Opóźnienie brygad piecowych spowodowało potępienie wuja Jana.

    [Opóźnienie [brygad piecowych]NPgen.]NPnom./acc. [spowodowało]V [potępienie [wuja Jana]NPgen.]NPnom./acc.

There are five independent ambiguities in this sentence,
yielding 32 coherent readings. Two of them are due to the
neutralization of the agent/patient function during
nominalisation. For example, "potępienie x" means "disapproval of x"
(either "x disapproves of y" or "y disapproves of x"); such an
ambiguity can be resolved only by examining the meaning of a
given phrase, so we call it a semantic one.
The next ambiguity occurs in the phrase "wuja Jana",
which means either "uncle John" (gen.) or "John's uncle" (gen.).
Here we can see two kinds of syntactic relations: case
agreement (the former interpretation) or government (the latter
one), both of which require "Jana" to be in the genitive case.
We consider such an ambiguity a purely syntactic one.
In the phrase "brygad piecowych" we can discern either
case agreement ("piecowych" (gen.) is then an adjective) or
government ("piecowych" (gen.) is then a noun). Here, the
elimination of morphological homonymy gives rise to alternative
constructions, thus increasing the syntactic ambiguity.
The last ambiguity stems from the nominative/accusative
neutralization of both a virtual subject and a virtual object
of the sentence. It suffices to assign a syntactic function
to one of them; the function of the other and the
morphological characteristics of both will then be fully
determined.
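
The figure of 32 readings follows from treating the five ambiguities as independent two-way choices (2^5 = 32). The Python sketch below merely enumerates these combinations; the dictionary keys, the labels and the names AMBIGUITIES and readings are illustrative assumptions, not notation taken from the paper.

```python
from itertools import product

# Five independent two-way ambiguities of the example sentence.
# All labels are illustrative assumptions, not the authors' notation.
AMBIGUITIES = {
    "first_nominalisation": ("genitive NP = agent", "genitive NP = patient"),
    "second_nominalisation": ("genitive NP = agent", "genitive NP = patient"),
    "wuja_Jana": ("agreement: uncle John", "government: John's uncle"),
    "brygad_piecowych": ("agreement: adjective", "government: noun"),
    "nom_acc": ("first NP is subject", "second NP is subject"),
}

def readings():
    """Yield every combination of choices as a dict keyed by ambiguity."""
    names = list(AMBIGUITIES)
    for combo in product(*(AMBIGUITIES[n] for n in names)):
        yield dict(zip(names, combo))

all_readings = list(readings())
print(len(all_readings))  # 32
```

Note that the nominative/accusative ambiguity is modelled as a single choice: once one noun phrase is taken as the subject, the other is the object.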
The example demonstrates how certain relations between
sentence components make it possible to disambiguate the
morphological properties of individual words without resorting to
their meanings. In our approach, these relations constitute the
level of syntax /3/.
Syntactic relations (e.g. agreement, government) consist
in matching syntactic properties (e.g. case, gender) of the
respective units. The basic unit is a morphological word /2/.
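
As a rough illustration of relations that "consist in matching syntactic properties", the Python sketch below tests two candidate relations on the phrase "wuja Jana". The Analysis record, the feature values and the two predicate functions are hypothetical simplifications introduced here; only the genitive reading of each word is listed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Analysis:
    """One morphological reading of a word form (hypothetical structure)."""
    form: str
    pos: str
    case: str

def agreement(head, dep):
    """Case agreement: both units carry the same case value."""
    return head.case == dep.case

def government(head, dep, required_case="gen"):
    """Government: a noun head requires its dependent in a fixed case."""
    return head.pos == "noun" and dep.case == required_case

wuja = Analysis("wuja", "noun", "gen")
jana = Analysis("Jana", "noun", "gen")

# Collect every relation whose matching condition is satisfied.
relations = [name
             for name, test in (("agreement", agreement),
                                ("government", government))
             if test(wuja, jana)]
print(relations)  # ['agreement', 'government']
```

Both conditions hold at once, which is how the purely syntactic ambiguity of this phrase shows up in such a model.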
By the syntactic structure of a sentence we understand
some explicit representation of all the syntactic relations
between its components, usually a graph. Such a graph need
not be connected. For example, some modifiers are
linked to their heads only by semantic relations and not by
syntactic ones. Similarly, some elliptic sentences may have
disconnected syntactic representations.
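
A minimal sketch of such a representation, assuming an undirected graph whose edges are labelled with relation names: the function below groups words into connected components, so a modifier with no syntactic link ends up in a component of its own. The word identifiers w1 to w4 and the edge list are invented for illustration.

```python
from collections import defaultdict

def components(words, relations):
    """Group words into connected components of the relation graph."""
    neighbours = defaultdict(set)
    for a, b, _label in relations:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, result = set(), []
    for w in words:
        if w in seen:
            continue
        # Depth-first traversal starting from an unvisited word.
        stack, comp = [w], []
        seen.add(w)
        while stack:
            node = stack.pop()
            comp.append(node)
            for n in sorted(neighbours[node]):
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        result.append(sorted(comp))
    return result

# Hypothetical sentence: w4 is linked to the rest only semantically.
words = ["w1", "w2", "w3", "w4"]
relations = [("w1", "w2", "agreement"), ("w2", "w3", "government")]
print(components(words, relations))  # [['w1', 'w2', 'w3'], ['w4']]
```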
We understand parsing as a process of establishing all
syntactic structures of a given text. Although such
structures are rather unsophisticated, they are practically very
important for low-level text processing.
In search of an adequate parsing method, we found the
idea of Marcus /4/ most appealing. He claims that natural
languages are designed to be deterministically parsed from
left to right and that writing a grammar should consist in
finding out local clues which enable the parser to select
properly what to do next. This idea seems even more
advantageous for free word order languages. Rich inflection makes
the local clues much more explicit and the parser's
expectations more precise. Besides, such an organisation of the
parsing process is compatible with the resource control
hypothesis /1/, which is hoped to account for the semantic
implications of free word order.
Conclusion
As a practical consequence of the considerations given
above, we adopt the following research program. As a starting
point we take the existing IC-based syntactic description of
Polish sentences with neutral word order /6/, consisting of
about 500 rules (some parts of it have been rewritten in
greater detail /7/, with the number of rules increasing 5-10 times).
We are going to restructure the description to obtain an index
of expectations related to each syntactic unit. We shall
incorporate the clues thus obtained into a Marcus-style
parsing strategy. We expect that this will lead to an efficient
and linguistically sound parser for Polish.

References 

/1/ Bień J.S.: A Preliminary Study on Linguistic Implications
of Resource Control in Natural Language Understanding.
ISSCO Working Paper 44, Genève 1980.

/2/ Bień J.S., Saloni Z.: The notion of morphological word
and its application to the description of Polish inflection
(preliminary version) /in Polish/. Prace Filologiczne
XXXI, to appear.

/3/ Bień J.S., Szpakowicz S.: Toward a Parsing Method for Free
Word Order Languages. In: Papers in Computational Linguistics
II. IInf UW Reports, to appear.

/4/ Marcus M.P.: A Theory of Syntactic Recognition for Natural
Language. MIT Press 1980.

/5/ Panevová J., Sgall P.: On Some Issues of Syntactic Analysis
of Czech. In: The Prague Bulletin of Mathematical
Linguistics 34, 1980, 21-32.

/6/ Szpakowicz S.: Formal syntactic description of Polish
sentences /in Polish/. Wydawnictwa Uniwersytetu
Warszawskiego, in press.

/7/ Szpakowicz S., Świdziński M.: An outline of sentence
schemes classification in contemporary written Polish
/in Polish/. Studia Gramatyczne V, Wrocław, to appear.
