Ambiguity resolution in a reductionistic parser * 
Pasi Tapanainen & Atro Voutilainen 
Research Unit for Computational Linguistics 
P.O. Box 4 (Keskuskatu 8) 
FIN-00014 University of Helsinki 
Finland 
1 Introduction 
We are concerned with grammar-based surface-syntactic
analysis of running text. Morphological and syntactic
analysis is here based on tags that express
surface-syntactic relations between functional
categories such as Subject, Modifier, Main verb etc.;
consider the following simple sentence:
I     PRON    @SUBJECT
see   V PRES  @MAINVERB
a     ART     @>N
bird  N       @OBJECT
FULLSTOP
2 Description of the parsing system 
The parsing system consists of the following modules: 
2.1 Preprocessor 
The preprocessor normalises the input text, detects 
sentence boundaries and punctuation marks, and 
identifies idioms and other fixed syntagms. 
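
The three preprocessing tasks can be illustrated with a minimal sketch.
It is not the preprocessor of the system described here: the idiom table,
the "=" joining convention and the function name `preprocess` are
assumptions made only for illustration.

```python
import re

# Hypothetical table of fixed syntagms; the actual idiom list is not given in the text.
IDIOMS = {("in", "spite", "of"): "in=spite=of"}

def preprocess(text):
    """Return a list of sentences, each a list of tokens."""
    # Normalise whitespace.
    text = re.sub(r"\s+", " ", text).strip()
    # Split words from punctuation marks.
    tokens = re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)
    sentences, current = [], []
    i = 0
    while i < len(tokens):
        # Try to join a known multiword expression starting at position i.
        for idiom, joined in IDIOMS.items():
            n = len(idiom)
            if tuple(t.lower() for t in tokens[i:i + n]) == idiom:
                current.append(joined)
                i += n
                break
        else:
            token = tokens[i]
            current.append(token)
            i += 1
            # Treat sentence-final punctuation as a sentence boundary.
            if token in {".", "!", "?"}:
                sentences.append(current)
                current = []
    if current:
        sentences.append(current)
    return sentences
```

Called on a two-sentence string, `preprocess` returns one token list per
sentence, with the idiom collapsed into a single token.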
2.2 Morphological analyser 
The ENGTWOL morphological analyser is a 55,000-entry
Koskenniemi-style morphological description of
English that assigns to each recognised input word
form all its possible morphological readings as a
disjunctive list.
Those words not recognised by the ENGTWOL
analyser are analysed by a heuristic module;
part-of-speech readings are assigned on the basis of
the form of the word (endings etc.).
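
A form-based guesser of this kind might look roughly as follows. The
ending table, the default reading and the name `guess_readings` are
hypothetical; the heuristics actually used by the module are not listed
in this paper.

```python
# Hypothetical ending-to-reading table, tried in order; not the real heuristics.
ENDING_RULES = [
    ("ed", ["V PAST", "PCP2"]),
    ("s", ["N NOM PL"]),
]
# Assumed fallback when no ending matches.
DEFAULT_READINGS = ["N NOM SG"]

def guess_readings(word):
    """Return candidate readings for a word the lexicon does not recognise."""
    lower = word.lower()
    for ending, readings in ENDING_RULES:
        # Require a stem of at least two characters before the ending.
        if lower.endswith(ending) and len(lower) > len(ending) + 1:
            return list(readings)
    return list(DEFAULT_READINGS)
```

The result stays ambiguous on purpose: as with recognised words, all
candidate readings are passed on to the parser.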
The morphologically analysed sentences are enriched
with syntactic and word boundary ambiguities and
converted into regular expressions by simple
awk programs.
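
The conversion is done with awk in the system itself; the following
Python sketch only illustrates the idea of encoding an ambiguous sentence
as one regular expression. The input format, the helper names and the
added @OBJECT/@SUBJECT alternatives are assumptions, loosely modelled on
the introductory example, and only the "@" and "@@" boundary symbols are
used.

```python
import re

def alternatives(items):
    """Regex group matching any one of the literal strings in `items`."""
    return "(?:" + "|".join(re.escape(item) for item in items) + ")"

def sentence_to_regex(words):
    """words: list of (base form, morphological readings, syntactic tags)."""
    fragments = [
        re.escape(base) + " " + alternatives(readings) + " " + alternatives(tags)
        for base, readings, tags in words
    ]
    # "@" separates words, "@@" delimits the sentence.
    return "@@ " + " @ ".join(fragments) + " @@"

# Hypothetical ambiguous input: each word lists every candidate syntactic tag.
sentence = [
    ("I", ["PRON"], ["@SUBJECT", "@OBJECT"]),
    ("see", ["V PRES"], ["@MAINVERB"]),
    ("a", ["ART"], ["@>N"]),
    ("bird", ["N"], ["@SUBJECT", "@OBJECT"]),
]
pattern = re.compile(sentence_to_regex(sentence))
```

Every fully disambiguated tagging of the sentence is one string matched
by `pattern`; the grammar then has to discard the unwanted ones.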
2.3 Finite-State parser
The Finite-State parser transforms sentences and
rules into finite-state automata. The parser computes
the intersection of the sentence automaton and
all rule automata; the intersection is the parse of the
sentence.
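
Parsing as intersection can be sketched with the standard product
construction on deterministic automata. The two rules below are toy
constraints invented for illustration, not rules of the actual grammar,
and the automata range over syntactic tags only, which is a
simplification of the representation used by the parser.

```python
ALPHABET = ["@SUBJECT", "@MAINVERB", "@>N", "@OBJECT"]

def sentence_automaton(candidates):
    """Linear automaton: state i reads any candidate tag of word i."""
    trans = {(i, tag): i + 1 for i, tags in enumerate(candidates) for tag in tags}
    return {"start": 0, "accept": {len(candidates)}, "trans": trans}

def at_most_one(tag):
    """Toy rule: `tag` may occur at most once in a sentence."""
    trans = {}
    for sym in ALPHABET:
        if sym == tag:
            trans[(0, sym)] = 1  # a second occurrence has no transition
        else:
            trans[(0, sym)] = 0
            trans[(1, sym)] = 1
    return {"start": 0, "accept": {0, 1}, "trans": trans}

def not_before(tag, marker):
    """Toy rule: `tag` may not occur before `marker`."""
    trans = {}
    for sym in ALPHABET:
        trans[(1, sym)] = 1  # after the marker everything is allowed
        if sym == marker:
            trans[(0, sym)] = 1
        elif sym != tag:
            trans[(0, sym)] = 0
    return {"start": 0, "accept": {0, 1}, "trans": trans}

def intersect(a, b):
    """Product construction: accepts exactly the strings both automata accept."""
    start = (a["start"], b["start"])
    trans, accept = {}, set()
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        p, q = state
        if p in a["accept"] and q in b["accept"]:
            accept.add(state)
        for (src, sym), dst in a["trans"].items():
            if src != p or (q, sym) not in b["trans"]:
                continue
            nxt = (dst, b["trans"][(q, sym)])
            trans[(state, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return {"start": start, "accept": accept, "trans": trans}

def accepted_strings(dfa):
    """Enumerate all accepted strings; terminates only for acyclic automata."""
    results = []
    def walk(state, path):
        if state in dfa["accept"]:
            results.append(path)
        for (src, sym), dst in dfa["trans"].items():
            if src == state:
                walk(dst, path + [sym])
    walk(dfa["start"], [])
    return results

# Candidate tags per word for "I see a bird" (ambiguities are assumed).
candidates = [["@SUBJECT", "@OBJECT"], ["@MAINVERB"], ["@>N"], ["@SUBJECT", "@OBJECT"]]
parse = sentence_automaton(candidates)
for rule in (at_most_one("@SUBJECT"), not_before("@OBJECT", "@MAINVERB")):
    parse = intersect(parse, rule)
analyses = accepted_strings(parse)
```

Of the four taggings admitted by the sentence automaton, only the one
that satisfies both toy rules survives in `analyses`. Since the sentence
automaton is acyclic, so is every intersection with it, which is what
makes the enumeration terminate.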
The grammar also contains a heuristic section that 
can be used to rank multiple analyses. 
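
One simple way to rank analyses is to sum penalties over the tags they
contain and to sort by the total. The penalty table below is purely
hypothetical and says nothing about the preferences encoded in the actual
heuristic section; the tag sequences are hand-written for illustration.

```python
# Hypothetical penalties per syntactic tag; lower totals are preferred.
PENALTIES = {"@SC": 1}

def penalty(tags):
    """Total penalty of one analysis, given as a list of syntactic tags."""
    return sum(PENALTIES.get(tag, 0) for tag in tags)

def rank(analyses):
    """Return the analyses ordered from most to least preferred."""
    # sorted() is stable, so equally penalised analyses keep their input order.
    return sorted(analyses, key=penalty)

copula = ["@>N", "@SUBJ", "@MV", "@SC", "@ADVL/N<", "@P<"]
passive = ["@>N", "@SUBJ", "@AUX", "@MV", "@ADVL", "@P<"]
ranked = rank([copula, passive])
```

Under this assumed table the analysis without @SC is ranked first; no
analysis is discarded, so the ranking never removes a reading the grammar
has accepted.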
3 Sample analysis 
The sentence Its leadership was insulted by editors
receives two analyses when no heuristics are applied:
@@
it          <NonMod> PRON GEN SG3     @>N         @
leadership  N NOM SG                  @SUBJ       @
be          V PAST SG1.OR.3 VFIN      @AUX        @
insult      <SVO> PCP2                @MV MAINC@  @
by          PREP                      @ADVL       @
editor      N NOM PL                  @P<         @
FULLSTOP                              @@

@@
it          <NonMod> PRON GEN SG3                      @>N         @
leadership  N NOM SG                                   @SUBJ       @
be          <SV> <SVC/N> <SVC/A> V PAST SG1.OR.3 VFIN  @MV MAINC@  @
insult      PCP2                                       @SC         @
by          PREP                                       @ADVL/N<    @
editor      N NOM PL                                   @P<         @
FULLSTOP                                               @@
Syntactic tags 
@>N       determiner or premodifier
@SUBJ     subject of a finite clause
@AUX      auxiliary in a finite clause
@MV       main verb in a finite clause
MAINC@    finite main clause
@ADVL     adverbial
@P<       preposition complement
@SC       subject complement
@ADVL/N<  adverbial or a postmodifier of a nominal
References 
[Voutilainen and Tapanainen, 1993] Atro Voutilainen
and Pasi Tapanainen. Ambiguity resolution
in a reductionist parser. In Proceedings of EACL-93.
Utrecht, Netherlands, 1993.
*The lexicon is adopted from the ENGCG parser that
has been supported by TEKES, the Finnish Technological
Development Center, and the work on Finite-state
syntax has been partly supported by the Academy of
Finland.
