A MODULAR ARCHITECTURE 
FOR CONSTRAINT-BASED PARSING 
Francois Barthdlemy ~" Fran(;ois Rouaix 0 
0 INRIA Roequeneourt, BP 105, 78153 Le Chesnay cedex, France 
& Universidade Nova de Lisboa, 2825 Monte de Caparica, Portugal 
ABSTRACT 
This paper presents a framework and a system for 
implementing, comparing and analyzing parsers 
for some classes of Constraint-Based Grammars. 
The framework consists in a uniform theoretic 
description of parsing algorithms, and provides 
the structure for decomposing the system into 
logical components, with possibly several inter- 
changeable implementations. Many parsing al- 
gorithms can be obtained by compositi(m of the 
modules of our system. Modularity is also ,~ way 
of achieving code sharing for the common parts 
of these various algorithms. Furthermore, tile de- 
sign lielpi~ reusing the existing modules when im- 
plementing other algorithms. The system uses 
the flexible modularity provided by the program- 
mifig languages hleool-90, 1)ased on a type system 
that ensures the safety of module composition. 
1 INTRODUCTION 
We designed a system to study parsing. Our aim 
was not to implement only one parsing algorithm, 
but as many as possible, in such a way that we 
could compare their performances. We wanted to 
study parsers' behavior rather than using them to 
exploit their parses. Furthermore, we wanted a 
system opened to new developments, impossibh~ 
to predict at the time we began our project. 
We achieved these aims by detining a mo(lular 
architecture that gives us in addition code sharing 
between alternative implementations. 
Onr system, called APOC-II, implements more 
than 60 ditferent parsing algorithms for Context- 
Free Grammars, Tree-Adjoining Grammars, and 
Definite-Clause Grammars. The different gener- 
ated parsers are comparable, because they are im- 
plemented in the same way, with common data 
structures. Experimental comparison can involve 
more than 20 parsers for a given grammar and 
give results independent from the implementa- 
tion. 
Fnrthermore, adding new modules multiplies 
the mHnber of parsing Mgorithm. APOC-II is 
open to new parsing techniques to such an ex- 
tent that it can be seen as a library of tools for 
parsing, including constraint solvers, look-ahead, 
parsing strategies and control strategies. These 
tools make prototyping of parshlg algorithms eas- 
ier an(l qui(:ker. 
The system is I)ase(1 on a general framework 
that divides parsing matters in three different 
tasks. First, tl,e compili~tion that translates a 
grammar into a push-down automaton (tescrib- 
ing how a parse-tree is built. The automaton can 
be non-determinlstic if several trees have to be 
eonsidere(l when parsing a string. Second, the 
interl)retation of the push-down ~mtomaton that 
has to deal with non-determinism. Third, the 
constraint solving, used by 1)oth eomi)ilation and 
interpretation to perform operations related to 
constraints. 
Several algorithms can perform each of these 
three tasks: the compiler can generate either top- 
down or bottom-up automata, the interl)reter can 
make use of backtracldng or of tal)ulation and 
the solver has to deal with different kinds of con- 
straints (first-order terms, features, ... ). 
Our architecture allows different combinations 
of three components (one for each basic task) re- 
sulting into a specific parsing system. We use the 
Alcoo\[-90 progranmfing language to implement 
our mo(hlles. This language's type system allows 
the definition of alternative implementations of 
a conlponent and enmlres the safety of module 
cond)ination, i.e. each module provides what is 
neede(1 by other mo(lules and re(:eives what it re- 
quires. 
The same kind of modularity is used to split the 
main components (conll)iler, interpreter, solver) 
into independent snb-modnles. Some of these 
sub-modules can bc shared by several different 
implementations. For instance the coml)utation 
of look-ahead is the same for LL(k) and LR(k) 
techniques. 
The next section defines the class of grammar 
we consider. Then, ~t general framework for pars- 
ing and the sort of modularity it requires are pre- 
sented. Section 4 is devoted to the AIcool-90 lan- 
guage that provides a convenient module system. 
Section 5 is the detailed description of tile APOC- 
454 
II system that implements the gonoral ff~tmework 
using Alcool-90. 
2 CONSTII.AINT- B ASED C~RAMMARS 
The notion of Constraint-Based Gramm~tr aii- 
ile~tred ill computational linglfistic. It is rt useful 
allstraction of several classes of grammars, inelud- 
hlg the most commonly used to describe NatuntI 
Language in view of COmlmter processing. 
Wo give our own definition of constraint-lmsed 
grammars that may slightly differ from other def- 
initions. 
Definition 1 ConstTnint-11ased Grammar 
A constraint-based grammar is a 7-tuple 
{Nt, T, (~, V, Am, C L, R} where 
• Nt is a set of symbols called non-terminals 
• 7' is a set of symbols called terminals 
• a is a flmetion from Nt O 7' to the natm'al 
integers called the arity of the symbol,s 
• V is an infinite set of variables 
• Aa: is an element of Nt called the a:dom 
• CL is a constraint language (see definition be- 
loin) having V as variable set and being closed 
it~'tder renaming a~td conjunction 
• R is a finite set of rules of the form: 
-, (2',) .... , <2;,) 
such that so E Nt, sl ~ Nt U 7' for 0 < i _<. n, 
c e CL, Xi are tuples of (t(sl) distinct va,'i- 
ables, and the same wwiabIe cannot appear in 
two different tupIes. 
in this definitio,t, we use the notion (if con- 
straint language to define the syntax and the se- 
mantics of the constraints usod 1)y the grammars. 
Wo refer to the definition given Iiy H/Sfcld and 
Smollm in \[ITS88\]. This detinition is especially 
suitable for constraints used in NLP (unrestricted 
synt*tx, multiplicity (if interpretation donmins). 
The closure under renaming property has ~tlso 
1lees detined by IISfeld and Snlolka. It ensures 
tlt~tt constraints are independent from the vari- 
able names. This grmtnds the systematic renam- 
ing of grammar rules to avoid wtriallle conflicts. 
Definition 2 Constrnint Language 
A constraint Language is a 4-tuple (V,C,u,I) such 
that: 
• V is an infinite set of variables 
• C is a decidable set whose elements are called 
cons traints 
• u is fanction that associates a finite set of 
variables to eaeh constraint 
• I is a non-empty set of interpretations 
Ii'or bt<:k of Slm<:e we <lo not recall in detail what 
itll interpret&tioll Jill(| the "<'losuro lllldel" I'(!IlH.III ~ 
ing" pr<)perty are, and refer to \[IIS88\]. 
The semantics of Constra.int-Based Gnmmlars 
is defined by the .'-;(?lllalltics of the constra.int lan- 
guage ~tll(l l, ho notion of syntax tree. A synta.x 
trce is a tree which \]ms at grammttr rule (remtmed 
with fi'esh v~triables) as latml of ea.ch nodo. A 
constraint is associatted to at parse tree: it is the 
conjunction of all the constr~dnts of the labels and 
the oqualities between the tUllle of wtriables from 
the non-termilml ,if the loft-hand side of a label 
and the tlq)le of the relewmt symbol of tim right> 
hand side of tim l~dml of its p~trent. 
An hnportant lloint ~dmut p;trse trees is tlt*tt 
the ordor of terminal symbols of tll(~ ini)ut string 
and the order of the symhols in rig}lt-h;md sides 
of rules are signitica.nt. 
A Context-Free Gramma, r is obtained just 
by ,'omoving tutiles and constr~dnts fl'om tho 
grammar rules. Most i)m'sing techniques for 
Constraint-Bas(~d Grainmars use the underlyillg 
context-fro(! structure, to guido parsing. This al- 
lows the ,'euse of cont.ext-fl'ee lntrsing tccl,niques. 
T}Io g~r;tllllll;H's wo hltve just definod OIICOIII- 
pass several classes {if i;r&llllll;trs llSOd ill N\],\] ), 
including log;it p;l'amlttlal'S (Definite Clause Cram- 
mars and variants), UIlifica~tion Cramlmtrs, Tree 
Adjoining (h'ammars I and, at least p~trtially, 
i,exical-I;'unctioval C~l'~tlllllHli's ;ilia I/oral Phras(~ 
~.I'IIC~/.III'(~ (.*fl'~llllllllLl'S. ()1" ('OllI'S(~ 1 t,h(!r(~ ;tl'(~ syn- 
tactical differ(mces 1)(~twe(m these (:lassos altd 
Constraint-Based (ll'amlmU'S. A simple t:ransla.- 
t.ion \['r()lll on(? syntax t,/) {.he ()th(,r is n(~(:essary. 
3 A G ENF.RAI, \]?RAMEWOI{K FOIl. 
PARSING 
This section is devoted to it general fralnework 
for iiarsing ill which most of the i)arsing inethods, 
inchlding~ all the lnost COtlllllOtl OliOS, ar(\] express- 
ible. It is ;in extension of ~ contoxt-freo framo- 
work \[Lan74\]. it is based on an explicit separation 
lletween tho parsing strategy that descrilies how 
ITAGs have an underlying context-free structure, al- 
though this is not ol)vi(ms in their formM definition. See 
for instance \[I,angl\]. 
455 
syntax trees are built (e.g. top-<lown, bottom- 
Ill)), and the control strategy that <lcals with the 
non-determinism of the parsing (e.g. backtrack- 
ing, tabulation). 
3.1 EPDAs 
This separation is based on an intermediate repre- 
sentation that describes how a grammar is used 
following a given parsing strategy. This inter- 
mediate representation is a Push-Down Automa- 
ton. It is known that most context-free parsers 
can be encoded with such a stack machine. Of 
course, the usual formalism has to be extended 
to take constraints into account, and possibly use 
them to disambiguate the parsing. We. call Ex- 
tended Push-Down Automaton (EPDA) the ex- 
tended formalism. 
For lack of space, we do not give here the for- 
mal definition of EPDA. hfformally, it is a ma- 
chine using three data structures: a stack contain- 
ing at each level a stack symbol and its tuple of 
variables; a representation of the terminal string 
that distinguishes those that have already been 
used and those that are still to be read; finally 
a constraint. A configuration of an automaton 
is a triple of these three data. Transitions are 
partial fimctions from configurations to configu- 
rations. We add some restrictions to these tran- 
sitions: the only clmnge allowed for the string 
is that at most one more terminal is read; only 
the top of the stack is accessible and at most one 
symbol can be added or removed from it at once. 
These restrictions are needed to employ directly 
the generic tabular techniques for automata exe- 
cution described in \[BVdlC92\]. EPDAs may be 
non-deterministic, i.e. several transitions are ap- 
plicable on a given configuration. 
Parsing for Constraint-Based Grammars 
blen(ls two tasks: 
• The structural part, that consists in buihling 
the skeleton of parse trees. This l)art is similar 
to a context-free parsing with the underlying 
context-free projection of the grammar. 
• Solving the constraints of this skeleton. 
The two tasks are related in the following way: 
constraints appear at the nodes of the tree; the 
structure is not a valid syntax tree if the con- 
straint set is unsatisfiable. Each task can be per- 
formed in several ways: there are several context- 
free parsing methods (e.g. LL, LR) and con- 
straints sets can be solved globally or incremen- 
tally, using various orders, and several ways of 
mixing the two tasks are valid. Tree construction 
involves a stack mechanism, and constraint solv- 
ing results in a constraint. The different parsing 
teelmiques can be described as computations on 
these two data structures. EPDAs are thus able 
to enco<le various l)arsers for Constraint C~ram- 
nlars. 
Automatic translation of grammars into EP- 
DAs is possible using extensions of usual context- 
free teelmiques \[Bar93\]. 
3.2 ARCIII'rECTUP=E 
Thanks to the intermediate representation 
(EPDA), parsing can be divi<led into two inde- 
pendent passes: tile compilation that translates 
a granlnlar into an extended autonlaton; tim exe- 
cution that takes an EPDA and a string and pro- 
duees a forest of syntax trees. To achieve the in- 
dependence, the compihw is not allowed to make 
any assumptions about the way the automata it 
produces will lie executed, and the interpreter in 
charge of the execution is not allowed to make 
assumptions about the automata it executes. 
We add to this scheme reused from context- 
free parsing a thir<l component: the solver (in an 
extensive meaning) in charge of all the oi>erations 
related to constraints and wu'iables. We will try 
to make it as in<lel)en<teilt from the other two 
modules (compiler and interpreter) as possible. 
There is not a fidl in<lependenee, since both the 
compiler and the interpreter involve constraints 
and related operations, that are: l)erfornmd by 
the solver. We just want to define a (:lear inter- 
face between the solver and the other modules, 
an interface independent from the kind of the 
constraints and from the solving algorithms be- 
ing used. rl'be same coml)iler (resp. interl)reter ) 
used with different solvers will work on ditl'erent 
classes of grammars. For instance, the same com- 
piler can compih~ Unilh:ation Grammars an<l Def- 
inite Clause Grammars, using two solvers, one 
implenmnting feature unilieation, the second one 
iml)lementing tirst-order unilieation. 
We can see a complete parsing system as the 
eoml)ination of three modules, compiler, inter- 
prefer, solver. When ea(:h module has several 
implementations, we wouhl like to take any com- 
bination of three modules. This schematic ab- 
straction captures l)arsing algorithms we are in- 
terested in. However, actually defining interfaces 
for a practical system without restricting open- 
endedness or the abstraction (interehangeability 
of components) was the most difficult technical 
task of this work. 
456 
3.3 SOLVERS 
The main problem lies in the dclinition of the 
solver's interface. Some of the required ol)era- 
lions are ol)vious: renaming of constraints and 
tul)les, constraint lmilding, extraction of the vari- 
al)les from a constraint, etc. 
By the way, remark that constraint solving can 
be hidden within the solver, and thus not ap- 
pear in the interface. There is an equivalence 
relation between constraints given by their inter- 
pretations. This relation can lie used to replace 
a constraint by another eqniwdent one, l)ossibly 
siml)ler. The solving call also be explicitly used to 
enR)ree the simplification of constraints at some 
points of tile parsing. 
Unfortunately some special techniques require 
more specific operations on constraints. For in- 
stance, a family of parsing strategies related to 
Earley's algorithm m~tke use of the restrictio~ op- 
erator defined by Shieber in \[Shi85\]. Another ex- 
aml)le: some tabular techni(lues take Benetit from 
a projectioil operator that restricts constraints 
with respect to a subset of their variat)les. 
We. could define the solver's inte.rface as the 
cartesian product of all the operations used by 
;tt least one technique. There are two reasons to 
re}cot such an apI)roaeh. The first one is that 
some seldom used operations are ditli(:ult to de- 
line on some constraints domains, it is the case, 
among others, of tile projection. The second rea- 
son is that it woul(\[ restrict to the techniques aI: 
ready existing and known by us at the moment 
when we design tile interface. This contradicts 
the open-endedness requirement. A new ollera- 
tion can appear, useful for a new parsing method 
or for optimizing the old ones. 
We prefer a flexible detlnition of the interface. 
Instead of defining one single interface, we will al- 
low each alternative iniF, lenlentation of the solver 
to define exactly what it ol\['ers and each iml)h~- 
nmntation of the compiler or of the interpreter 
to detine what it demands. The conll)ination of 
modules will involve the checking that the @r<'.r 
encompasses the demand, that all tile needed op- 
erations are implemented. This imposes restric- 
tions on the combination of niodules: it is the 
overhead to obtain an open-ended system, opened 
to new developments. 
We found it language providing the. kind of llex- 
il)le modularity we needed: Alcool--90. We now 
present this language. 
4 '\]'IIE LANGUAGE ALCOOL 90 
Alcool-90 is an experimental extension of the 
functional language ML with run-time overload- 
ing \[I{ou90\]. Overloading is used as a tool for 
seamless integration of abstract data types ill 
the ML type system, retaining strong typing, 
and type inference prollerties. Abstract data 
types (encapsulating a data structure represen- 
tation and its constructors ~uld interpretive flmc- 
tiol,s) i)rovide wdues for overloaded symbols, as 
classes provide methods for messages ill object- 
o,'ientcd terminology, i{owever, strong typing 
means that the compiler guarantees that errors 
()f kind "method not found" never hal)pen. 
Abstract programs axe programs referring to 
overloaded syml)ols, which vahles will be deter- 
nfined at run-time, consistently with the calling 
environment. By grouping Mlstract l)rograms, 
we obtain parameterized abstra.ct data types (or 
fllnctors), the calling environment being here a~ 
particular instantiation of the I)arameterized adt. 
Thus, we obtain Jut environment equivalent to a 
module system, each module being an adt, even- 
tually llarameterized. 
D)r instance, ill APOC-II, (:ompilers h~tve an 
abstract data type parameterized by a solver. 
Alcool-90 also proposes an innow~tive environ- 
ment where we exploit anlbiguities due to over- 
loading for semi-automated 1)rogram configura- 
tion : the type iufin'elice eoullnltes interfaces of 
%llissing" COIllpollents to colnplete a progralll, ae- 
cording to the use of overloaded synlbols in the 
program. A search algo,'ithm finds components 
satisfying those interfaces, eventually by tind- 
ing suitable parameters for parameterized compo- 
nents. Naturally, instantiatiot, of parameterized 
coml)onents is also type-safe : actual parameters 
must have interfaces matching formal parameters 
(schematically : the actual parameter must pro- 
vide at least the functions required by the inter- 
face of the formal parameter). 
For instance, only the solvers provi(lil,g 
Shieber's restriction can })e used as the. aetlial pa.- 
ramcter of Earley with restriction compiler. But 
these solvers can also be '.lse(l l)y a.ll the eoml)ilers 
that do not use the restriction. 
Simple module systems have severe limita- 
tions when several implementations of compo- 
nents with simil~tr interfaces (:()exist in a system, 
or when some component Inay be employed in dif- 
ferent contexts. Ada generics provided a first step 
to lnodule parameterization, th(mgh at the cost 
of heavy declar~tions a.nd difficulties with type 
equiwdence. SML pral)oses a very powerful mod- 
ule system with paranleterization, but lacks sepa- 
rate comllilation and still requires a large amount 
of user decl~u'ations to detine and use functors. 
Object-oriented languages lack the type security 
that Alcoo\[-90 guarantees. 
457 
The Alcool-90 approach benefits from the sim- 
plification ot modules as abstract data types by 
adding inference facilities: the compiler is able to 
infer the interfaces of parameters required by a 
module. Moreover, the instantiation of a functor 
is simply seen as a type application, thus no ef- 
forts are required from the programmer, while its 
consistency is checked by the compiler. 
This approacl, is mostly useful when multiple 
implementations with similar interfaces are avail- 
able, whether they will coexist in the program or 
they will be used to generate several configura- 
tions. Components may have similar interfaces 
but different semantics, although they are inter- 
changeable. Choosing a configuration is simply 
choosing fl'om a set of solutions to missing emn- 
ponents, computed by the compiler. 
Several other features of Alcool-90 have not 
linen used in this experiment, namely the inheri- 
tance operator on abstract data types, and an ex- 
tension of tile type system with dynamics (where 
some type checking occurs at run-time). 
5 APOC-II 
APOC-II is a system written in Alcool-90, imple- 
menting numerous parsing techniques within the 
framework described in section 3. The user can 
choose between these techniques to buihl a parser. 
By adding new modules written in Alcool-90 to 
the library, new techniques can freely be added 
to the system. 
APOC-II has two levels of modularity: the first 
one is that of the three main components distin- 
guished above, compiler, interpreter and solver. 
Each of these components is implemented by sev- 
eral alternative modules, that are combinable us- 
ing Alcool-90 discipline. 
Tile second level of modularity consist in split- 
ring each of the three main components i,lto sev- 
era.1 modules. This makes the sharing of common 
parts of different hnplementations possible. 
We give now examples of splitting APOC-ql 
uses at the moment, in order to give an idea of 
this second level of modularity. This splitting has 
proved convenient so far, but it is not fixed and 
imposed to fllrther developments: ~t new imple- 
mentation can be added even if it uses a com- 
pletely different internal structure. 
A solver is made of: 
• a module for wtriables, variabh: generation 
and renaming, 
• a parser for constraints, 
• a pretty-printer for constraints, 
• a constraint builder (creation of abstract syn- 
tax trees for constraints, e.g. building con- 
straints expressing equality of variables), 
• a solver ill the restrictive meaning, in charge 
of constraint reduction, 
• an interface that encapsulate all the other 
modules. 
A compiler includes: 
• a grammar parser (that uses tile constrMnt 
parser given by the solver), 
• a module for look-ahead (for computation of 
look-ahead sets by static anMysis of the gram- 
I\[lar ), 
• a module for EPDA representation and han- 
dling, 
• ~t transition generator which translates gram- 
mar rules into EPDA tra.nsitions therefore de- 
ternfining the p~trsing strategy (cf. figure 1), 
• Control code, using previous modules, defin- 
ing the "compih?' function, tile only one ex- 
ported. 
The two interpreters implemented so far have 
very different structures. The tlrst one uses 
backtracking and the second one uses tabulation. 
They share some modules however, such as a 
module handling transitions and a lexer of inlmt 
strings. 
Tile interest of the modular architecture is in 
tile eomtfin~ttorhtl effect of module composition. 
It leads to many diiferent parsing algorithms. 
The tigure 1 summarizes the different ~spects of 
the parsing algorithms that can vary more or less 
independently. 
For example, the built-in parsing method of 
Prolog for DCGs is ol~t.ained by combining tim 
solver for \])CGs, the top-down strategy, 0 sym- 
bol of look-ahead a.nd a backtracking interpreter 
(and other modules not mentioned in Iigure 1 be- 
cause they do not change the algorithm, but a.t 
most its implenmntation). 
Some remarks about :figure 1: 
• we call Earle?\] parsing strategy the way Earley 
deduction \[PW8a\] builds a tree, *tot the con- 
trol method it uses. It difl'e.rs from top-down 
by the way constrMnts are taken into account. 
• the difference between garley-like tabulation 
and graph-structure stacks is the data struc- 
ture used for item storage. Several variants 
are possible, that actually change the parser's 
behavior. 
458 
Solver Context-tYee Grammars - 1)etinite Clause Grammars 
(grammar class) Tree Adjoining Grammars - Uni\]ication Grammars... 
parsing strategy top-down - pure bottom-up - Earley - Earley with restriction 
(transition generator) left-corner - LR - precedence - PLR ... 
look-ahead eontext-lYee look-ahead of 0 or 1 symbol 
context-free look-ahead of k symbols - contca't-scnsitivc look-ahead 
interpreter backtracking - Earley-like tabulation - Graph-str'acturcd Stacks... 
Agenda management Synchronization - lifo - fifo - wLrio'as weights ... 
(for tabulation only) 
Figure 1: modules of APOC-II 
Modules written iii. bold font are ah'eady iml)lemented, where.as modules written in italic m'e possible 
extensions to the system. 
• we call synchronization sL kind of breadth-first 
se~trch where sc~tnnlng a terminal is performed 
only whe.n it is needed by all the paths of the 
search-tree. The search is synchronized with 
the. input string. It is the order used by l,;str- 
h.'y's algorithin. 
• at the moment, only generic look-ahead, that 
is look-ahestd based on the first and follow 
sets, has been considered. Some more aCCll- 
rate look-ahead techniques such as the ones 
involved in SLR(k) pa,'sing are probal>ly not 
indepen<lent fi'om the parsing strategy and 
<:armor be an independent mo<lule. 
Building a parsing system with APOC-II con- 
sists roughly in choosing one module of each row 
of figure 1 and combining them. Some of the 
combinations are not possible. Thanks to type- 
checking, Alcool-90 will detect the incompatibil- 
ity and provide a tyl)e-based explanation of the 
probh;m. 
At the moment, APOC-II otDrs more than 60 
ditDrent parsing algorithms. Given a g, ralrHn.%r, 
there is a choice of more than 20 different parsers. 
Adding one module does not add only one more 
algorithm, but sewn'M new vstri;tltts. 
The techniques iinplemented by APOC-II are 
not original. For instance, the LR conq)ilation 
strategy comes from a paper I)y Nilsson, \[Nil86\], 
left-corner parsing has been used 1)y Matsumoto 
and Tanaka in \[MT83\]. As far as we know, how- 
ever, LR and left-era'her p~trsers have not been 
prolmsed for Tree-Adjoining C, rammars before. 
Notice that the modularity is also useful to vary 
implementation of algorithms. D)r instance, a 
first prototype can be quickly written by imple- 
menting constraints reduction in a naive way. A 
refined version can be written later, if needed. 
6 CONCLUSION 
APOC-II has several advantages. First of all, it 
provides comparable implementations of the most 
comnmn parsing Mgorithms. Their efficiency can 
be abstractly measured, for instance by counting 
the number of eomlmtation step (EPDA transi- 
tion applicatiol 0 performed to eomlmte a tree or 
a complete forest of parse trees. We call this 
kind of measm'ements abstract \])ecallse it does 
not rely neither on the implementlttion nor on 
the machine that runs the parser. Other compar- 
isons could be done statically, on the automaton 
or on the pstrse forest (e..g. number of transitions, 
alllOllllt ,)f determi~lisnl, size of the forest, alllOllllt 
of structure slurring). 
()therwise, APOC-II cstn be. used as a to(~lkit 
that provides :t library of modules usefld to imple- 
lllent quickly ll(!W parse.r generators. For instance, 
one has only to write a solver to obtain up to 22 
parsing a.lgorithms (perhaps less if tit(', solw!r pro- 
vides only basic operations). The library contains 
tools to deal with some constraints, look-ahead, 
lexing, tabulation, etc. Reusing these tools when- 
ever it is possible saves a lot of work. 
The limitations of APOC-II are that it is mainly 
convenient for parsing strategies that stre some- 
how static, i.e. statically determined at com- 
pih! time. Also, al)stractloll (full independence 
between coral>tiers and i,~terpreters) cannot Im 
achieved for some optimized algorithms. For in- 
Sl,&llCe, Nederhof presents in \[Ned93\] a parsing 
strategy called ELI{ for which tsdmlar execution 
can be optimized. To implement this a.lgorithm 
tit ollr system, one would have to write a Ilow 
interpreter dedicated to ELR-EPDAs. 
\¥e think that our experiment shows the in- 
t(~rest of a tlexible modul;trity for studies abollt 
parsing. We believe that the same technique can 
fiuitfully apply on other domains of Ns~tural Lan- 
guage Processing. 
4,59 
7 ACKNOWLEDGEMENTS 
The authors are grateflfl to Gabriel Pereira Lopes 
for his hell). 
REFERENCES 
\[Bar93\] Franqois Barthdlemy. Outils pour l'3- 
nalyse syntaxique contextuelle. Thb~- 
se de doetorat, Universitd d'Orldans, 
1993. 
\[BVdlC921 F. Barthdlemy and E. Villemonte 
de 13 Clergerie. Subsnmption-- 
oriented push-down autom3t3, hi 
Proe. of PLILP'92, pages 100 114, 
june 1992. 
\[II8881 M. ItShfeld and G. Smolk3. Definite 
Relations over Constraint Languages. 
Technical Report 53, LILOG, IWBS, 
IBM Deutschland, october 1988. 
\[Lan74\] Bernard Lang. Deterministic tech- 
niques for efficient non-dc'terministic 
parsers, hi Proe. of the 2 '~'l Collo- 
quium on automata, languages and 
Prvgramrning, pages 255-269, Saar- 
brlieken (Germany), 1974. Springer- 
Verlag (LNCS 14). 
\[Lan91\] Bernard Lang. The systematic con- 
struction of earley parsers: Applica- 
tion to the production of o(n a) earley 
parsers for tree adjoining grammars. 
In First International Workshop on 
Tree Adjoining Grammars, 1991. 
\[MTSal Y. Matsumoto and H. Tanaka. Bup: 
A bottom-up p3rser embedded in In'O- 
log. New Generation Computing, 
1:145-158, 1983. 
\[Ned93\] Mark-Jan Nederhof. A multidisei- 
plin3ry approach to 3 parsing algo- 
rithm. In Proceedings of the Tvmntc 
Workshop on Language Technology - 
TWLT6, december 1993. 
\[Ni1861 Ulf Nilsson. Aid: An Mternative im- 
plementation of DCGs. New Genera- 
tion Computing, 4:383-399, 1986. 
\[pwsa\] F. C. N. Pereir3 and D. II. D. War- 
ren. Parsing as deduction. In Proc. of 
the 21st Annual Meeting of the Asso- 
ciation for Computationnal Linguis- 
tic, pages 137-144, Cambridge (Mas- 
saehussetts), 1983. 
\[Rou90\] 
\[Shi85\] 
Franqois Rouaix. ALCOOL-90: Ty- 
page de 13 surcharge dons un langave 
fonetionnel. ThSse de doctorat, Uni- 
versitd Paris 7, 1990. 
Stu3rt M. Shieber. Using re- 
striction to extend parsing algori- 
thms for complex--feature--based for- 
malisms. In Proceedings of the 23 r'~ 
Annual Meetin 9 of the Association 
for Computational Linguistics, pages 
145-152, Chic3go (Illinois), 1985. 
460 
