THE IPS SYSTEM 
Eric Wehrli* 
Laboratoire d'analyse et de technologic du langage 
University of Geneva 
1211 Geneva 4 
wehrli@uni2a.unige.ch 
Abstract 
The IPS system is a large-scale interactive 
GB-based parsing system (English, French) un- 
der development at the University of Geneva. 
This paper starts with an overview of the sys- 
tem, discussing some of its basic features as 
well as its general architecture. We then turn 
to a more detailed discussion of the "right cor- 
ner" parsing strategy developed for this project. 
Combining top down and bottom up features, 
this strategy is consistent with an incremental 
interpretation of sentences. 
1 Overview of the IPS project 
The IPS (Interactive Parsing System) research 
project, at the Linguistics Departement of the 
University of Geneva, aims at developing a large, 
interactive, parsing model based on Chomsky's 
Government and Binding (henceforth GB) lin- 
guistic theory t. This project, which focuses 
both on English and French, has theoretical as 
well as practical goals, including the following: 
• to show the feasibility and the soundness 
of a GB-based approach to natural language 
parsing; 
*The research described in this paper has been sup- 
ported in part by n grant from the Swiss national sci- 
ence foundation (grant no 11-25362.88). I am grateful to 
Robin Clark, Paola Merlo~ Mira Ramluckun and Martin 
Kay for helpful comments and discussion on earlier drafts 
of this paper. 
C\], Chonmky (1981, 1988) for a discussion of 6B 
theory. Issues related to the use of Gl\] theory in natural 
language parsing delign are discussed in Berwick (1887), 
Berwick et ol. (1991) mad Wehdi (1988), 
• to demonstrate the advantages of an inter- 
active approach to natural language pro- 
cessing; 
a to develop robust, large-scale parsers suit- 
able for NLP applications (e.g. transla- 
tion). 
The IPS parser is interactive in the sense 
that it can request on-line information from 
the user. Typically, interaction will be used to 
solve ambiguities that the parser cannot han- 
die, for instance when the resolution of an am- 
biguity depends on contextual or extra-linguistic 
knowledge 2. The interactive feature is seen as a 
way to increase the reliability of the parser -- 
difllcult decisions are left to the user-- as well as 
a way to simplify the grammar, since many ad 
hoc features that would be necessary if the pars- 
er had to solve all problems by itself can now be 
dispensed with. In addition, the interactive ca- 
pabillty is also useful as a development tool, in 
the sense that on-line user interaction can sup- 
plement modules which have not yet been de- 
veloped (e.g. semantic and pragmatic compo- 
nents). 
Other important features of the IPS parser 
include: 
• Modular architecture, i.e. the parsing 
mechanism is decomposed into modules, 
which roughly correspond to some of the 
components of a standard GB grammar (e.g. 
X, chains, O, etc.). 
• The parsing strategy is left-to-right, data 
driven, with parallel treatment of alterna- 
2For arguments in favor of interactive systems (usu- 
ally in the context of machine translation, though) see 
in particular Kay (11180), Melby et al. (1980), Tomita 
(1984), Wehrli (1990), Zajac (1988). 
Acres DE COLING-92, NANTES, 23-28 AOt3"r 1992 8 7 0 PROC. OF COL1NG-92, NANTES, AUO. 23-28, 1992 
fives. The nonodeterminism of the parser is 
restricted by a selection mechaafism. 
a Use of structure-sharing techniques to cut 
down the number of explicit representations 
of alternatives. 
2 Architecture 
The IPS parser tries to associate with an input 
sentence a set of syntactic structures. These 
structures correspond to GB S-structures, i.e. 
surface structures enriched with traces of moved 
elements and other empty categories. In our 
implementation, GB grammatical modules cor- 
respond to particular processes. While some of 
the modules function as generators (2, chain 
modules, coordination module) in the sense that 
they increase the set of structures hypothesized 
at a given point, others are used as filters (Case 
module, 0-module) in the sense that their action 
tends to reduce the set of structures. The mod- 
ules apply as soon as possible at each step in the 
parsing process, triggered by particular data or 
by specific calls from other modules. 
Alternatives are considered concurrently 
(pseudo-parallelism), and a small number of 
heuristics are used to restrict the size of the hy- 
pothesis set. To give an example, one heuristic 
gives precedence to attachments satisfying for- 
real selectional features over (cf. (3), (7)) other 
kinds of attachments. Thus, if an incoming verb 
form can be attached either as a complement to 
an auxiliary or as a main verb, preference will 
be given to the former a. 
User interaction -wtfich is an optional feature 
in the IPS system- can be used to select alterna- 
tives, mostly in case of attachment ambiguities, 
but occasionally also for other types of ambigui- 
ty (lexical, thematic, ere). Alternatives are then 
displayed (in an abbreviated manner) and the 
user is asked to make a selection. 
aNotice that this heuristic might explain garden path 
sentences such as "l'invitd quail a dit des folies" (the gueJt 
he has told inzanitiez),in which readers wrongly interpret 
dit as past participle selected by the auxiliary verb a. 
3 The X module 
The central module of the IPS system is the 
module, which acts as the main generator of the 
system, and determines the syntactic structures 
of constituents. We assume the X schema in (1): 
(1) XP --~ Spec 
X--+ X Compl 
where X is a lexical or a functional category, 
Spee and Compl are lists (possibly empty) 
of maximal projections (YP). 
As indicated in (1), maximal projections are 
of order 2 (XP) and lexicai categories of order 
0 (I). For typographical reasons, categories of 
order 1 (~) are noted  ' in the illustrations be- 
low. The set of lexical categories include N, V, 
• and p, the set of functional categories includes 
D(eterminer), T(ense) et C(omplementizer). We 
also assume the DP hypothesis (cf. Abney 1987, 
Clark 1990a), and, as a consequence, the strong 
parallelism between DP and TP structures, as il- 
lustrated in (2): 
(2) 
TP DP 
DP ~ DP -D 
T VP D NP 
Lexical categories as well as functional cate+ 
gories can select other lexical or funrtional pro- 
jectioos. Tiros, u determiner cyst select a pro- 
jection of category Vp or NP, as in (3) and (4), 
respectively, corresponding to the structures (5) 
and (6) 
(3) \[each, D, \[+definite\], \[__\[D,\[ .... rail\] ...... \] 
(4) \[each, D, \[+definite\], \[__\[N,\[singular\]\] ..... \] 
(5)a, each five men. 
b. ivy \[D' each \[DP \[I)' five\[MP iN' menJJJl\]\] 
(6)a. each student. 
AcrEs DE COLING-92, NANTES, 23-28 AOlJq' 1992 8 7 1 PROC. OF COLING-92, NAbrrES. AUG. 23-28, 1992 
b. \[ DP \[ D' each \[ NP I" N' student\]\]\]\] 
Similarly, auxiliaries can select projections of 
type UP, and most prepositions projections of 
type DP. Some examples of selection features as- 
sociated with auxiliary verbs are given in (7), 
with the corresponding structures in (8) and (9): 
(7)a. 
b. 
e. 
(a)a. 
b. 
{have, V, \[+aux\], {__IV,{+ past 
par ticiplel\] .... \] 
\[be, V, \[+aux\], \[__\[V,\[past participle\]\] "a~\] 
\[be, V, \[+aux\], \[__IV,\[present participle\]\] .... \] 
the men have arrived. 
\[ \[ \[ the \[ \[ TP DP D' NP 
have \[ VP \[ V' arrived\]\]\]\] 
men\]\]\]\] \[ N' T' 
(9)a. the men must have been being cheated. 
b. \[ TP \[ l)P \[ v' the \[ ~ \[ s' men\]\]Ill \[ T' 
must \[Vp Iv' have \[Vp Iv' been \[Vp Iv' 
being \[ VP \[ V' cheated \[ DP e\]i\]\]\]\]\]\]\]\]\]\] 
The following is a summary of the fundamen- 
tal properties of constituent structures : 
• Any lexical category X projects to a maxi- 
mal projection Ip. 
a Attachments are restricted to maxlmal pro- 
jections (XP). 
• All constituents have the same architecture. 
4 The IPS parsing strategy 
In our implementation of the X module, we dis- 
tinguish three types of action: projection, at- 
tachment to the left (specifiers) and attachment 
to the riglit. The parsing strategy is left to right, 
parallel, combilting a bottom up approach with 
top down filtering, as we shall see below. 
The motivation for this particular strategy 
is to maximize at each step the interpretation 
of constituents in order to facilitate the selec- 
tion mechanism as well as user interaction dis- 
cussed in section 2. Interestingly enough, this 
requirement seems to coincide with psycholin- 
guistic evidence suggesting that sentence pars- 
ing proceeds in an incremental fashion, trying 
to integrate incoming items into maximally in- 
terpreted structures ~. 
The basic idea is that the parser must be 
sensitive to incoming words. However, strictly 
bottom up strategies are known to have some 
undesirable features. In particular, they tend 
to generate numerous locally well-formed struc- 
tures which turn out to be incompatible with the 
overall structure. Furthermore, they restrict at- 
tachment to complete constituents, which means 
that when applied to right branching languages 
such as French or English, assembling the final 
structure does not start much before the end of 
the sentence is reached. 
To illustrate these problems, consider the fol- 
lowing examples : 
(10)a. Who could the children have invited ? 
b. John must have given the students several 
of his books. 
In sentence (10a), when the parser gets to the 
word have, it tries to combine it with the left 
context, say \[ DP the children\], leading to the 
new constituent \[ the children have\]. Al- TIP 
though this new constituent is perfectly well- 
formed locally, it is not compatible with the 
modal could. 
Sentence (10b) illustrates the second and 
more serious problem. If node attachment is 
limited to complete nodes, the combination of 
the subject John and the rest of the structure 
(the whole verb phrase, which is a • in our sys- 
tem) will not occur before the last word of the 
sentence is read. 
The use of a more sophisticated strategy, such 
as the left corner strategy, addresses the first 
problem quite successfully ~. However, it fails to 
solve the second problem, since attachments are 
limited to complete constituents in the standard 
left corner parser (~. In an attempt to overcome 
this problem, and taking advantage of the clear 
4 For a detailed discussion attd review of the psycholin- 
guistic evidence for incremental parsing see Tanenhaus et 
al. (1985), Frazier (1987) and Crocker (1992). 
SSee Aho and Unman (1972), Pereira and Shieber 
(1997) for a discussion of left corner parshig. 
6Gibson (1991) proposes to relax this latter require- 
ment~ and to allow attachments of incomplete con- 
stituents. However, the generality of his proposal still 
remains unclear. 
ACTES DE COLING-92, NANTES, 23-28 AoL~r 1992 8 7 2 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
bias for right branching structures in languages 
such as French or English (cf. structure (gb)), 
we developed a strategy dubbed "right corner", 
which is based on the following principles: 
(11) "Right corner" strategy : 
constituents can be attached as soon as they 
have been hypothesized; 
attach inconfing items to maximally ex- 
panded constituents in the left context; 
constituents specify a list of active attach- 
ment sites on their right edge; 
all attachments are considered in parallel; 
Notice that tiffs strategy is clearly bottom- 
up (actions are triggered by incoming material). 
However, it differs from other bottom-up strate 
gies in some important ways. First of all, the 
combination of a new item with its left context 
is not done through a sequence of "reduce" op- 
erations (as in a shift-reduce parser). More gen- 
erally, the attachment of (most) incoming items 
is made directly to some subconstituent of the 
top node, i.e. to some attachment site speci- 
fied in the active node list of a constituent in 
the left context. This is in sharp contrast with 
standard bottom-up parsers (including the left 
corner parser), for which reduction to the start 
symbol cannot occur before the end of the sen- 
tence is reached. 
Although the right corner strategy requires 
significantly more complex data structures (con- 
stituents must specify all their potential attach- 
ment sitesT"), it has the advantage of being com- 
pntationally and psycholingulstically more ade- 
quate than other bottom up strategies. Regard- 
ing the latter point, the right corner strategy 
seems consistent with the (trivial) observations 
(i) that the analyzer is data driven but (it) al- 
though still incomplete, left context constituents 
are maximally specified, as they would in a top- 
down parser. 
A detailed example will illustrate tiffs strate- 
gy. Consider the following sentence : 
rAttachment sites for a given constituent correspond 
to the list of X nodea on its right edge. For efficiency 
reasons~ they are put together in a stack associated with 
the constituent. 
(12) Jotm has bought some flowers. 
When the parser reads tim word some, tile 
left context include, among many constituents, 
structure (13) : 
(13) \[ ~l' Johit has \[ Vl' boughtJJ 
Tile word some triggers a DP projection, with 
some as its head. Considering the left context, 
the parser raids structure (13), the stack of at- 
tachment sites of which contains the verb bought. 
The newly created DP projection combine with 
the TP structure as a complement of the verb 
bought. We now have an updated TP constituen- 
t, as in (14), with vm updated stack of active 
nodes, with at the top the DP constituent just 
attached. 
(14) I: "rP John has bought some\] 
The parser cent now read the following word 
flowers, which can attach to the left context 
structure (14), as a complement of the deter- 
miner some. 
The right corner strategy takes care of attach- 
ments to the right. In the case of projections or 
of attachments to the left (specifiers), the usual 
bottom up procedure applies. Typically projec- 
tion is triggered by inherent features (\[+tense\] 
verbs will trigger a T(ense) projection, proper 
nouns a DP projection, etc.). As for left attach- 
ment, it occurs when the current constituent can 
find a left context constituent which can func- 
tion as a possible specifier. The attachment of 
the specifier to the current constituent deter- 
mines a new constituent which may in turn find 
a specifier in its left context (iterative attach- 
meat) as it happens in the possessive construc- 
tion (e.g. John's little brother's cat). 
5 Concluding remarks 
Tile right coruer parsing strategy discussed in 
this paper has been developed to satisfy tile 
particular needs of our on-line interactive pars- 
ing model. By (i) pursuing concurrently all the 
possible analyses and (it) trying to integrate in- 
coming items into fully developed constituents, 
ACIDS DE COLING-92, Nar, rrEs, 23-28 Ao~-r 1992 8 7 3 PR()c. OF COLING-92, NANTES, AUG. 23-28, 1992 
this scheme, at each step in the parsing pro- 
cess, provides the filtering components, includ- 
ing user-interaction, with struct/tres that are as 
much interpreted as possible. Not only does this 
make the selection process much more reliable, it 
is also consistent with psycholinguistic evidence 
for incremental sentence parsing. 
Although still under development, the IPS 
parser, which uses a lexical database exceed- 
ing 80,000 entries, has a fairly broad gram- 
maticai coverage including simple and complex 
sentences, complex determiners and possessives, 
yes/no and wh-interrogatives, relatives, passive, 
some comparatives as well as some cases of coor- 
dination (of same category). The bYench version 
also handles typical Romance constructions such 
as clitics and causatives. 
References 
Abney, S., 1987. The English Noun Phrase in its 
Sentential Aspect, unpublished MIT Ph.D. 
thesis. 
Aho, A. et ,I.D. UUman, 1972. The Theo- 
ry of Parsing, Translation and Compiling, 
Prentice-Hail, Englewood Cliffs, NJ. 
Berwick, R., 1987. "Prlnciple-based parsing", 
technical report, MIT AI-lab. 
Berwiek, It., S. Abney and C. Tenny (eds.) 1991. 
Principle-Based Parsing: Computation and 
Psychollnguistics, Kinwer Academic Pub- 
lishers, Dordrecht. 
Chomsky, N., 1981. Lectures on Government 
and Binding, Foris Publications, Dordrecht. 
Chomsky, N., 1986. Knowledge of Language: Its 
Origin, Nature and Use, Praeger Publisher- 
s, New York. 
Clark, It., 1990a. "(Some) Determiners and Par- 
titives", ms., Uuiversity of Geneva. 
Clark, It., 1990b. "The Auxiliary System of En- 
glish", ms., University of Geneva. 
Crocker, M., 1992. A Logical Model of Compe- 
tence and Performance in the Human Sen- 
tence Processor, PhD dissertation, Univer- 
sity of Edinburgh. 
Frazier, L. 1987. "Syntactic Processing: Evi- 
dence from Dutch," Natural Language and 
Linguistic Theory 5, pp. 515-559. 
Gibson, E., 1991. A Computational Theory 
of Human Linguistic Processing: Memo- 
ry Limitations and Processing Breakdown 
PhD dissertation, Carnegie Mellon Univer- 
sity. 
Kay, M., 1980. "The Proper Place of Men and 
Machines in Language Translation", CSL- 
80-11, Xerox Paio Alto Research Center. 
Melby~ A.~ M. Smith and J. Peterson 
(1980). "ITS: Interactive translation sys- 
tem" in Proceedings of the 8th Internation- 
al Conference on Computational Linguistics 
(COLING-80). 
Pereira, F. and S. Shieber (1987). Prolog and 
Natural Language Analysis, CSLI Lectures 
Notes 10, Chicago Uuiversity Press. 
Tanenhaus, M., G. Carlson and M. Seidenberg 
(1985). "Do Listeners compute Linguistic 
Representations?" in D. Dowty, L. Kart- 
tunen and A. Zwicky (eds.), Natural Lan- 
guage Processing: PsychologicaJ, Computa- 
tional and Theoretical Perspectives, Cam- 
bridge University Press. 
Tomita, M. (1984)."Disambiguating grammat- 
ically ambiguous sentences by asking," 
Proceedings of the lOth International 
Conference on Computational Linguistics 
(COLING-84). 
Wehrll, E., 1988. "Parsing with a GB gram- 
mar," in U. Reyle and C. Itohrer (eds.), 
Natural Language Parsing and Linguistic 
Theories, Reidel, Dordrecht. 
Wehrll, E., 1990. "STS: An Experimental Sen- 
tence Translation System," Proceedings of 
the 13th Internation Conference on Com- 
putational Linguistics (COLING-90). 
Zajac, It. (1988). "Interactive translation: a 
new approach~" Proceedings of the 12th In- 
ternational Conference on Computational 
Linguistics (COLING-88). 
Acres DE COLING-92. NAMES, 23-28 AOUT I992 8 7 4 PROC. OF COLING-92. NANTES. AUO. 23-28. 1992 
