New Approaches to Parsing Conjunctions Using Prolog 
Sandiway Fong
Robert C. Berwick
Artificial Intelligence Laboratory
M.I.T.
545 Technology Square
Cambridge MA 02139, U.S.A.
Abstract 
Conjunctions are particularly difficult to parse in traditional,
phrase-based grammars. This paper shows how a different
representation, not based on tree structures, markedly simplifies the
parsing problem for conjunctions. It modifies the union of phrase
marker model proposed by Goodall [1981], where conjunction is
considered as the linearization of a three-dimensional union of a
non-tree based phrase marker representation. A PROLOG grammar for
conjunctions using this new approach is given. It is far simpler and
more transparent than a recent phrase-based extraposition parser for
conjunctions by Dahl and McCord [1984]. Unlike the Dahl and McCord or
ATN SYSCONJ approach, no special trail machinery is needed for
conjunction, beyond that required for analyzing simple sentences.
While of comparable efficiency, the new approach unifies under a
single analysis a host of related constructions: respectively
sentences, right node raising, or gapping. Another advantage is that
it is also completely reversible (without cuts), and therefore can be
used to generate sentences.
John and Mary went to the pictures
    Simple constituent coordination
The fox and the hound lived in the fox hole and kennel respectively
    Constituent coordination with the 'respectively' reading
John and I like to program in Prolog and Hope
    Simple constituent coordination but can have a collective or
    respectively reading
John likes but I hate bananas
    Non-constituent coordination
Bill designs cars and Jack aeroplanes
    Gapping with 'respectively' reading
The fox, the hound and the horse all went to market
    Multiple conjuncts
*John sang loudly and a carol
    Violation of coordination of likes
*Who did Peter see and the car?
    Violation of coordinate structure constraint
*I will catch Peter and John might the car
    Gapping, but component sentences contain unlike auxiliary verbs
?The president left before noon and at 2, Gorbachev
Introduction 
The problem addressed in this paper is to construct a grammatical
device for handling coordination in natural language that is well
founded in linguistic theory and yet computationally attractive. The
linguistic theory should be powerful enough to describe all of the
phenomena in coordination, but also constrained enough to reject all
ungrammatical examples without undue complications. It is difficult
to achieve such a fine balance - especially since the term
grammatical itself is highly subjective. Some examples of the kinds
of phenomena that must be handled are shown in fig. 1.

The theory should also be amenable to computer implementation. For
example, the representation of the phrase marker should be conducive
to both clean process description and efficient implementation of the
associated operations as defined in the linguistic theory.
Fig 1: Example Sentences 
The goal of the computer implementation is to produce a device that
can both generate surface sentences given a phrase marker
representation and derive a phrase marker representation given a
surface sentence. The implementation should be as efficient as
possible whilst preserving the essential properties of the linguistic
theory. We will present an implementation which is transparent to the
grammar and perhaps cleaner & more modular than other systems such as
the interpreter for the Modifier Structure Grammars (MSGs) of Dahl &
McCord [1983].

The MSG system will be compared with a simplified implementation of
the proposed device. A table showing the execution time of both
systems for some sample sentences will be presented. Furthermore, the
advantages and disadvantages of our device will be discussed in
relation to the MSG implementation.

Finally we show how the simplified device can be extended to handle
multiple conjuncts and to strengthen the constraints of the system.
This representation of a phrase marker is equivalent to a proper
subset of the more common syntactic tree representation. This means
that some trees may not be representable by an RPM, whereas all RPMs
may be re-cast as trees. (For example, trees with shared nodes
representing overlapping constituents are not allowed.) An example of
a valid RPM is given in fig. 3 :-
The RPM Representation 
The phrase marker representation used by the theory described in the
next section is essentially that of the Reduced Phrase Marker (RPM)
of Lasnik & Kupin [1977]. A reduced phrase marker can be thought of
as a set consisting of monostrings and a terminal string satisfying
certain predicates. More formally, we have (fig. 2) :-
Sentence: Alice saw Bill
RPM representation:
{S, Alice.saw.Bill, NP.saw.Bill, Alice.V.Bill,
 Alice.VP, Alice.saw.NP}
Fig 3: An example of RPM representation
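Let $\Sigma$ and $N$ denote the set of terminals and non-terminals
respectively.
Let $\varphi, \psi, \chi \in (\Sigma \cup N)^*$.
Let $x, y, z \in \Sigma^*$.
Let $A$ be a single non-terminal.
Let $P$ be an arbitrary set.
Then $\varphi$ is a monostring w.r.t. $\Sigma$ & $N$ if
$\varphi \in \Sigma^* . N . \Sigma^*$.
Suppose $\varphi = xAz$ and that $\varphi, \psi \in P$ where $P$ is
some set of strings. We can also define the following predicates :-
$y$ isa* $\varphi$ in $P$ if $xyz \in P$
$\varphi$ dominates $\psi$ in $P$ if $\psi = x\chi z$,
$\chi \neq \emptyset$ and $\chi \neq A$.
$\varphi$ precedes $\psi$ in $P$ if $\exists y$ s.t. $y$ isa*
$\varphi$ in $P$, $\psi = xy\chi$ and $\chi \neq z$.
Then :-
$P$ is an RPM if $\exists A, z$ s.t. $A, z \in P$ and
$\forall \{\psi, \varphi\} \subseteq P$ then
$\psi$ dominates $\varphi$ in $P$ or $\varphi$ dominates $\psi$ in $P$
or $\psi$ precedes $\varphi$ in $P$ or $\varphi$ precedes $\psi$ in $P$.
Fig 2: Definition of an RPM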
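The definition in fig. 2 can be tried out directly. The following is a minimal sketch in Python (used here for illustration in place of the paper's Prolog dialect). It assumes strings are tuples of symbols and that the non-terminal vocabulary is the small set named in `NONTERMINALS`; the function names are hypothetical and are not part of the original system.

```python
from itertools import combinations

# Assumed non-terminal vocabulary; every other symbol counts as a terminal.
NONTERMINALS = {"S", "NP", "VP", "V", "AP"}

def split_monostring(phi):
    """Return (x, A, z) if phi holds exactly one non-terminal, else None."""
    positions = [i for i, sym in enumerate(phi) if sym in NONTERMINALS]
    if len(positions) != 1:
        return None
    i = positions[0]
    return phi[:i], phi[i], phi[i + 1:]

def is_terminal_string(s):
    return all(sym not in NONTERMINALS for sym in s)

def middle(psi, x, z):
    """Return chi such that psi = x + chi + z, or None if psi does not fit."""
    if len(psi) < len(x) + len(z):
        return None
    cut = len(psi) - len(z)
    if psi[:len(x)] != x or psi[cut:] != z:
        return None
    return psi[len(x):cut]

def dominates(phi, psi):
    """phi = xAz dominates psi if psi = x chi z, chi non-empty and chi != A."""
    parts = split_monostring(phi)
    if parts is None:
        return False
    x, a, z = parts
    chi = middle(psi, x, z)
    return chi is not None and chi != () and chi != (a,)

def precedes(phi, psi, p):
    """phi = xAz precedes psi if some y isa* phi in p, psi = x y chi, chi != z."""
    parts = split_monostring(phi)
    if parts is None:
        return False
    x, _, z = parts
    for candidate in p:
        if not is_terminal_string(candidate):
            continue
        y = middle(candidate, x, z)  # y isa* phi when x + y + z is in p
        if y is None:
            continue
        prefix = x + y
        if psi[:len(prefix)] == prefix and psi[len(prefix):] != z:
            return True
    return False

def is_rpm(p):
    """Check the RPM condition of fig. 2 over every pair of distinct elements."""
    has_root = any(len(s) == 1 and s[0] in NONTERMINALS for s in p)
    has_terminal_string = any(is_terminal_string(s) for s in p)
    if not (has_root and has_terminal_string):
        return False
    return all(
        dominates(a, b) or dominates(b, a)
        or precedes(a, b, p) or precedes(b, a, p)
        for a, b in combinations(p, 2)
    )

def rpm(*strings):
    """Build a set of strings from dotted notation such as 'Alice.V.Bill'."""
    return frozenset(tuple(s.split(".")) for s in strings)

FIG3 = rpm("S", "Alice.saw.Bill", "NP.saw.Bill",
           "Alice.V.Bill", "Alice.VP", "Alice.saw.NP")
print(is_rpm(FIG3))
```

Under these assumptions the set for "Alice saw Bill" passes the check, while a set containing two different terminal strings does not, since neither relation is defined between two terminal strings.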
This RPM representation forms the basis of the linguistic theory
described in the next section. The set representation has some
desirable advantages over a tree representation in terms of both
simplicity of description and implementation of the operations.
Goodall's Theory of Coordination 
Goodall's idea in his draft thesis [Goodall??] was to extend the
definition of Lasnik and Kupin's RPM to cover coordination. The main
idea behind this theory is to apply the notion that coordination
results from the union of phrase markers to the reduced phrase
marker. Since RPMs are sets, this has the desirable property that the
union of RPMs would just be the familiar set union operation. For a
computer implementation, the set union operation can be realized
inexpensively. In contrast, the corresponding operation for trees
would be both more complicated and less efficient than set union.
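To illustrate how cheap the union is, the sketch below merges the RPMs of "John sang loudly" and "John sang a carol" with an ordinary set union. It is written in Python rather than the paper's Prolog dialect, and the helper `rpm` and the constant names are hypothetical.

```python
def rpm(*strings):
    """Build a set of strings from dotted notation such as 'John.V.loudly'."""
    return frozenset(tuple(s.split(".")) for s in strings)

LOUDLY = rpm("John.sang.loudly", "S", "John.V.loudly",
             "John.VP", "John.sang.AP", "NP.sang.loudly")
CAROL = rpm("John.sang.a.carol", "S", "John.V.a.carol",
            "John.VP", "John.sang.NP", "NP.sang.a.carol")

merged = LOUDLY | CAROL   # union of phrase markers is plain set union
shared = LOUDLY & CAROL   # elements the two phrase markers have in common

print(len(merged), sorted(".".join(s) for s in shared))
```

The two six-element sets share the elements `S` and `John.VP`, so the merged set has ten elements.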
However, the original definition of the RPM did not envisage the
union operation necessary for coordination. The RPM was used to
represent 2-dimensional structure only. But under set union the RPM
becomes a representation of 3-dimensional structure. The
admissibility predicates dominates and precedes defined on a set of
monostrings with a single non-terminal string were inadequate to
describe 3-dimensional structure.
Basically, Goodall's original idea was to extend the dominates and
precedes predicates to handle RPMs under the set union operation.
This resulted in the relations e-dominates and e-precedes as shown in
fig. 4 :-
Assuming the definitions of fig. 2 and in addition let
$\omega, \Omega, \Theta \in (\Sigma \cup N)^*$ and
$q, r, s, t, v \in \Sigma^*$, then
$\varphi$ e-dominates $\psi$ in $P$ if $\varphi$ dominates $\psi'$ in
$P$, $\chi x \omega = \psi'$, $\Theta y \Omega = \psi$ and
$x \equiv y$ in $P$.
$\varphi$ e-precedes $\psi$ in $P$ if $y$ isa* $\varphi$ in $P$, $v$
isa* $\psi$ in $P$, $qyr \equiv svt$ in $P$, $y \neq qyr$ and
$v \neq svt$
where the relation $\equiv$ (terminal equivalence) is defined as :-
$x \equiv y$ in $P$ if $\chi x \omega \in P$ and $\chi y \omega \in P$
Figure 4: Extended definitions
This extended definition, in particular the notion of equivalence,
forms the basis of the computational device described in the next
section. However since the size of the RPM may be large, a direct
implementation of the above definition of equivalence is not
computationally feasible. In the actual system, an optimized but
equivalent alternative definition is used.
Although these definitions suffice for most examples of
coordination, they are not sufficiently constrained to reject some
ungrammatical examples. For example, fig. 5 gives the RPM
representation of "*John sang loudly and a carol" in terms of the
union of the RPMs for the two constituent sentences :-
John sang loudly
John sang a carol
{ {John.sang.loudly, S,
   John.V.loudly, John.VP,
   John.sang.AP,
   NP.sang.loudly}
  {John.sang.a.carol, S,
   John.V.a.carol, John.VP,
   John.sang.NP,
   NP.sang.a.carol} }
(When these two RPMs are merged some of the elements of the set do
not satisfy Lasnik & Kupin's original definition - these pairs are :-)
{John.sang.loudly, John.sang.a.carol}
{John.V.loudly, John.V.a.carol}
{NP.sang.loudly, NP.sang.a.carol}
(None of the above pairs satisfy the e-dominates predicate - but they
all satisfy e-precedes and hence the sentence is accepted as an RPM.)
Fig. 5: An example of union of RPMs
The above example indicates that the extended RPM definition of
Goodall allows some ungrammatical sentences to slip through. Although
the device presented in the next section doesn't make direct use of
the extended definitions, the notion of equivalence is central to the
implementation. The basic system described in the next section does
have this deficiency but a less simplistic version described later is
more constrained - at the cost of some computational efficiency.
Linearization and Equivalence 
Although a theory of coordination has been described in the previous
sections, in order for the theory to be put into practice there
remain two important questions to be answered :-
• How to produce surface strings from a set of sentences to be
conjoined?
• How to produce a set of simple sentences (i.e. sentences without
conjunctions) from a conjoined surface string?
This section will show that the processes of linearization and
finding equivalences provide an answer to both questions. For
simplicity in the following discussion, we assume that the number of
simple sentences to be conjoined is two only.

The processes of linearization and finding equivalences for
generation can be defined as :-
Given a set of sentences and a set of candidates which represent the
set of conjoinable pairs for those sentences, linearization will
output one or more surface strings according to a fixed procedure.

Given a set of sentences, finding equivalences will produce a set of
conjoinable pairs according to the definition of equivalence of the
linguistic theory.
For generation the second process (finding equivalences) is called
first to generate a set of candidates which is then used in the first
process (linearization) to generate the surface strings. For parsing,
the definitions still hold - but the processes are applied in reverse
order.

To illustrate the procedure for linearization, consider the following
example of a set of simple sentences (fig. 6) :-
{John liked ice-cream, Mary liked chocolate}
    set of simple sentences
{{John, Mary}, {ice-cream, chocolate}}
    set of conjoinable pairs
Fig 6: Example of a set of simple sentences
Consider the plan view of the 3-dimensional representation of the
union of the two simple sentences shown in fig. 7 :-
John            ice-cream
     \        /
       liked
     /        \
Mary            chocolate
Fig 7: Example of 3-dimensional structure
The procedure of linearization would take the following path shown
by the arrows in fig. 8 :-
[Diagram: the structure of fig. 7 with arrows tracing the path
John, Mary, liked, ice-cream, chocolate]
Fig 8: Example of linearization
Following the path shown we obtain the surface string "John and Mary
liked ice-cream and chocolate".
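The path-following procedure can be sketched as a small recursive generator. This is an illustrative Python version, not the paper's Prolog: it assumes word lists for strings, a list of candidate pairs, and the three cases that the paper spells out for linearization (both strings empty, a shared leading substring, and conjoining a candidate pair). The names `common_prefix` and `linearize` are hypothetical.

```python
def common_prefix(a, b):
    """Return the longest common leading sublist of a and b."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]

def linearize(s1, s2, candidates, conj="and"):
    """Yield every surface string obtainable from s1 and s2.

    candidates is a list of (c1, c2) pairs of word lists that may be conjoined.
    """
    # Base case: two empty strings linearize to the empty string.
    if not s1 and not s2:
        yield []
        return
    # Identical leading substrings: emit them once, then carry on.
    shared = common_prefix(s1, s2)
    if shared:
        k = len(shared)
        for rest in linearize(s1[k:], s2[k:], candidates, conj):
            yield shared + rest
        return
    # Conjoining: pick a candidate pair that starts both strings.
    for i, (c1, c2) in enumerate(candidates):
        if c1 and c2 and s1[:len(c1)] == c1 and s2[:len(c2)] == c2:
            remaining = candidates[:i] + candidates[i + 1:]
            for rest in linearize(s1[len(c1):], s2[len(c2):], remaining, conj):
                yield c1 + [conj] + c2 + rest

pairs = [(["John"], ["Mary"]), (["ice-cream"], ["chocolate"])]
for words in linearize("John liked ice-cream".split(),
                       "Mary liked chocolate".split(), pairs):
    print(" ".join(words))
```

For the sentences and conjoinable pairs of fig. 6 this sketch yields the single string "John and Mary liked ice-cream and chocolate".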
The set of conjoinable pairs is produced by the process of finding
equivalences. The definition of equivalence as given in the
description of the extended RPM requires the generation of the
combined RPM of the constituent sentences. However it can be shown
[Fong??] by considering the constraints imposed by the definitions of
equivalence and linearization, that the same set of equivalent
terminal strings can be produced just by using the terminal strings
of the RPM alone. There are considerable savings of computational
resources in not having to compare every element of the set with
every other element to generate all possible equivalent strings -
which would take O(n^2) time - where n is the cardinality of the set.
The corresponding term for the modified definition (given in the next
section) is O(1).
The Implementation in Prolog 
This section describes a runnable specification written in Prolog.
The specification described also forms the basis for comparison with
the MSG interpreter of Dahl and McCord. The syntax of the clauses to
be presented is similar to the Dec-10 Prolog [Bowen et al. 1982]
version. The main differences are :-
• The symbols ":-" and "," have been replaced by the more meaningful
reserved words "if" and "and" respectively.
• The symbol "." is used as the list constructor and "nil" is used to
represent the empty list.
• As an example, a Prolog clause may have the form :-
a(X Y ... Z) if b(U V ... W) and c(R S ... T)
where a, b & c are predicate names and R, S, ..., Z may represent
variables, constants or terms. (Variables are distinguished by
capitalization of the first character in the variable name.) The
intended logical reading of the clause is :-
"a" holds if "b" and "c" both hold
for consistent bindings of the arguments
X, Y, ..., Z, U, V, ..., W, R, S, ..., T
• Comments (shown in italics) may be interspersed between the
arguments in a clause.
Parse and Generate 
In the previous section the processes of linearization and finding
equivalences are described as the two components necessary for
parsing and generating conjoined sentences. We will show how these
processes can be combined to produce a parser and a generator. The
device used for comparison with the Dahl & McCord scheme is a
simplified version of the device presented in this section.

First, difference lists are used to represent strings in the
following sections. For example, the pair (fig. 9) :-
{john.liked.ice-cream.Continuation, Continuation}
Fig 9: Example of a difference list
is a difference list representation of the sentence "John liked
ice-cream".
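For readers unfamiliar with difference lists, the idea is that a pair (front, back) denotes whatever remains of front once the trailing back is removed. In Prolog the continuation is usually an unbound variable; the Python sketch below only mimics the idea with concrete lists, and the function name `difference` is hypothetical.

```python
def difference(front, back):
    """Return the words denoted by the difference pair (front, back)."""
    cut = len(front) - len(back)
    if cut < 0 or front[cut:] != back:
        raise ValueError("back must be a trailing sublist of front")
    return front[:cut]

continuation = ["and", "so", "on"]
pair = (["john", "liked", "ice-cream"] + continuation, continuation)
print(difference(*pair))
```

Whatever the continuation is, the pair denotes the same three words, which is what makes the representation convenient for threading the remaining input through a parser.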
We can now introduce two predicates linearize and equivalentpairs
which correspond to the processes of linearization and finding
equivalences respectively (fig. 10) :-
linearize(pairs S1 E1 and S2 E2 candidates Set gives Sentence)
Linearize holds when a pair of difference lists ({S1, E1} & {S2, E2})
and a set of candidates (Set) are consistent with the string
(Sentence) as defined by the procedure given in the previous section.
equivalentpairs(X Y from S1 S2)
Equivalentpairs holds when a substring X of S1 is equivalent to a
substring Y of S2 according to the definition of equivalence in the
linguistic theory.
The definitions for parsing and generating are almost logically
equivalent. However the sub-goals for parsing are in reverse order to
the sub-goals for generating, since the Prolog interpreter would
attempt to solve the sub-goals in a left to right manner.
Furthermore, the subset relation rather than set equality is used in
the definition for parsing. We can interpret the two definitions as
follows (fig. 12) :-
Generate holds when Sentence is the conjoined sentence resulting
from the linearization of the pair of difference lists (S1, nil) and
(S2, nil) using as candidate pairs for conjoining, the set of
non-redundant pairs of equivalent terminal strings (Set).
Parse holds when Sentence is the conjoined sentence resulting from
the linearization of the pair of difference lists (S1, E1) and
(S2, E2) provided that the set of candidate pairs for conjoining
(Subset) is a subset of the set of pairs of equivalent terminal
strings (Set).
Fig 12: Logical reading for generate & parse
Fig 10: Predicates linearize & equivalentpairs
Additionally, let the meta-logical predicate setof as in
"setof(Element Goal Set)" hold when Set is composed of elements of
the form Element and Set contains all instances of Element that
satisfy the goal Goal. The predicates generate and parse can now be
defined in terms of these two processes as follows (fig. 11) :-
generate(Sentence from S1 S2)
    if setof(X.Y.nil in equivalentpairs(X Y from S1 S2) is Set)
    and linearize(pairs S1 nil and S2 nil
                  candidates Set gives Sentence)
parse(Sentence giving S1 E1)
    if linearize(pairs S1 E1 and S2 E2
                 candidates SubSet gives Sentence)
    and setof(X.Y.nil in equivalentpairs(X Y from S1 S2) is Set)
Fig 11: Prolog definition for generate & parse
The subset relation is needed for the above definition of parsing
because it can be shown [Fong??] that the process of linearization is
more constrained (in terms of the permissible conjoinable pairs) than
the process of finding equivalences.
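The parsing direction can be pictured as running linearization backwards: given a conjoined sentence, enumerate every pair of simple strings and candidate set that could have produced it, and leave it to the equivalence check to reject the bad ones. The Python sketch below is an illustration under assumptions (word lists, a single conjunction word, hypothetical name `unlinearize`), not the paper's Prolog.

```python
def unlinearize(sentence, conj="and"):
    """Yield (s1, s2, candidates) triples whose linearization is sentence."""
    # Terminating case: the empty sentence comes from two empty strings.
    if not sentence:
        yield [], [], []
        return
    # Common substring case: the first word belongs to both strings.
    head, tail = sentence[0], sentence[1:]
    for s1, s2, cands in unlinearize(tail, conj):
        yield [head] + s1, [head] + s2, cands
    # Conjoin case: sentence = c1 + [conj] + c2 + rest, c1 and c2 non-empty.
    for i in range(1, len(sentence)):
        if sentence[i] != conj:
            continue
        c1 = sentence[:i]
        for j in range(i + 2, len(sentence) + 1):
            c2 = sentence[i + 1:j]
            for s1, s2, cands in unlinearize(sentence[j:], conj):
                yield c1 + s1, c2 + s2, [(c1, c2)] + cands

for s1, s2, cands in unlinearize("John and Bill liked Mary".split()):
    print(" ".join(s1), "|", " ".join(s2), "|", cands)
```

Most of the enumerated readings are not grammatical pairs of sentences, which is why parsing only requires the candidate set actually used to be a subset of the equivalent pairs found for the two strings.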
Linearize 
We can also fashion a logic specification for the process of
linearization in the same manner. In this section we will describe
the cases corresponding to each Prolog clause necessary in the
specification of linearization. However, for simplicity the actual
Prolog code is not shown here. (See Appendix A for the definition of
predicate linearize.)

In the following discussion we assume that the template for predicate
linearize has the form "linearize(pairs S1 E1 and S2 E2 candidates
Set gives Sentence)" shown previously in fig. 10. There are three
independent cases to consider during linearization :-
1. The Base Case.
If the two difference lists ({S1, E1} & {S2, E2}) are both empty then
the conjoined string (Sentence) is also empty. This simply states
that if two empty strings are conjoined then the result is also an
empty string.
2. Identical Leading Substrings.
The second case occurs when the two (non-empty) difference lists have
identical leading non-empty substrings. Then the conjoined string is
identical to the concatenation of that leading substring with the
linearization of the rest of the two difference lists. For example,
consider the linearization of the two fragments "likes Mary" and
"likes Jill" as shown in fig. 13 :-
{likes Mary, likes Jill}
which can be linearized as :-
{likes X}
where X is the linearization
of strings {Mary, Jill}
Fig. 13: Example of identical leading substrings
3. Conjoining.
The last case occurs when the two pairs of (non-empty) difference
lists have no common leading substring. Here, the conjoined string
will be the concatenation of the conjunction of one of the pairs from
the candidate set, with the conjoined string resulting from the
linearization of the two strings with their respective candidate
substrings deleted. For example, consider the linearization of the
two sentences "John likes Mary" and "Bill likes Jill" as shown in
fig. 14 :-
{John likes Mary, Bill likes Jill}
Given that the selected candidate pair is {John, Bill},
the conjoined string would be :-
{John and Bill X}
where X
is the linearization of strings {likes Mary, likes Jill}
Fig. 14: Example of conjoining substrings

Some implementation details differ between parsing and generating.
(See Appendix A.) However the three cases are the same for both.

We can illustrate the above definition by showing what linearizations
the system would produce for an example sentence. Consider the
sentence "John and Bill liked Mary" (fig. 15) :-
{John and Bill liked Mary}
would produce the strings :-
{John and Bill liked Mary,
 John and Bill liked Mary}
with candidate set {}
{John liked Mary, Bill liked Mary}
with candidate set {(John, Bill)}
{John Mary, Bill liked Mary}
with candidate set {(John, Bill liked)}
{John, Bill liked Mary}
with candidate set {(John, Bill liked Mary)}
Fig. 15: Example of linearizations

All of the strings are then passed to the predicate findequivalences
which should pick out the second pair of strings as the only
grammatically correct linearization.
Finding Equivalences
Goodall's definition of equivalence was that two terminal strings
were said to be equivalent if they had the same left and right
contexts. Furthermore we had previously asserted that the equivalent
pairs could be produced without searching the whole RPM. For example
consider the equivalent terminal strings in the two sentences "Alice
saw Bill" and "Mary saw Bill" (fig. 16) :-
{Alice saw Bill, Mary saw Bill}
would produce the equivalent pairs :-
{Alice saw Bill, Mary saw Bill}
{Alice, Mary}
{Alice saw, Mary saw}
Fig. 16: Example of equivalent pairs
We also make the following restrictions on Goodall's definition :-
• If there exist two terminal strings X & Y such that
$X = \chi x \Omega$ & $Y = \chi y \Omega$, then $\chi$ & $\Omega$
should be the strongest possible left & right contexts respectively -
provided x & y are both nonempty. In the above example, $\chi$ = nil
and $\Omega$ = "saw Bill", so the first and the third pairs produced
are redundant.
In general, a pair of terminal strings is redundant if it has the
form (uv, uw) or (uv, zv), in which case it may be replaced by the
pair (v, w) or (u, z) respectively.
• In Goodall's definition any two terminal strings themselves are
also a pair of equivalent terminal strings (when $\chi$ & $\Omega$
are both null). We exclude this case as it produces simple string
concatenation of sentences.

The above restrictions imply that in fig. 16 the only remaining
equivalent pair ({Alice, Mary}) is the correct one for this example.
However, before finding equivalent pairs for two simple sentences,
the process of finding equivalences must check that the two sentences
are actually grammatical. We assume that a recognizer/parser (e.g. a
predicate parse(S E)) already exists for determining the
grammaticality of simple sentences. Since the process only requires a
yes/no answer to grammaticality, any parsing or recognition system
for simple sentences can be used.

We can now specify a predicate findcandidates(X Y S1 S2) that holds
when {X, Y} is an equivalent pair from the two grammatical simple
sentences {S1, S2} as follows (fig. 17) :-
findcandidates(X and Y in S1 and S2)
    if parse(S1 nil)
    and parse(S2 nil)
    and equiv(X Y S1 S2)
where equiv is defined as :-
equiv(X Y X1 Y1)
    if append3(Chi X Omega X1)
    and terminals(X)
    and append3(Chi Y Omega Y1)
    and terminals(Y)
where append3(L1 L2 L3 L4) holds when L4 is equal to the
concatenation of L1, L2 & L3. terminals(X) holds when X is a list of
terminal symbols only.
Fig. 17: Logic definition of Findcandidates
Then the predicate findcquivalencos is simply de- 
fined ;t~ (fig. 18) :- 
findequivalences(X and Y in S1 and $2) 
if findcandidates(X and Y in S1 and $2) 
and not redundant(X Y) 
wl.,re redundant implements the two restrictions described. 
Fig.18: Logic definition of Findeq,ivalences 
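The candidate search and the redundancy filter of figs. 17 and 18 can be sketched in a few lines. This Python version is illustrative only: it assumes both sentences are already known to be grammatical (so the calls to parse are omitted), treats every word as a terminal, and additionally discards pairs in which either string is empty. The names `candidate_pairs`, `redundant` and `find_equivalences` are hypothetical.

```python
def common_len(a, b):
    """Length of the longest common leading sublist of a and b."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n

def candidate_pairs(s1, s2):
    """All (left, right, x, y) with s1 = chi + x + omega, s2 = chi + y + omega.

    left and right are the lengths of the shared contexts chi and omega.
    """
    max_left = common_len(s1, s2)
    max_right = common_len(s1[::-1], s2[::-1])
    shortest = min(len(s1), len(s2))
    out = []
    for left in range(max_left + 1):
        for right in range(max_right + 1):
            if left + right > shortest:
                continue
            out.append((left, right,
                        s1[left:len(s1) - right],
                        s2[left:len(s2) - right]))
    return out

def redundant(x, y):
    """True if the pair has the form (uv, uw) or (uv, zv), or has an empty member."""
    if not x or not y:
        return True
    return x[0] == y[0] or x[-1] == y[-1]

def find_equivalences(s1, s2):
    """Non-redundant equivalent pairs, excluding the whole-sentence pair."""
    result = []
    for left, right, x, y in candidate_pairs(s1, s2):
        if left == 0 and right == 0:   # both contexts null: plain concatenation
            continue
        if redundant(x, y):
            continue
        result.append((x, y))
    return result

print(find_equivalences("Alice saw Bill".split(), "Mary saw Bill".split()))
```

For the sentences of fig. 16 the candidate search produces the three pairs listed there, and the filter leaves only the pair (Alice, Mary).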
Comparison with MSGs 
The following table (fig. 19) gives the execution times in
milliseconds for the parsing of some sample sentences mostly taken
from Dahl & McCord [1983]. Both systems were executed using Dec-20
Prolog. The times shown for the MSG interpreter are based on the time
taken to parse and build the syntactic tree only - the time for the
subsequent transformations was not included.
Sample sentences                                   MSG system   RPM device
Each man ate an apple and a pear                          662          292
John ate an apple and a pear                              613          233
A man and a woman saw each train                          319          506
Each man and each woman ate an apple                      320          503
John saw and the woman heard a man that laughed           788  [illegible]
John drove the car through and completely
  demolished a window                                     275         1032
The woman who gave a book to John and drove a
  car through a window laughed                           1007         3375
John saw the man that Mary saw and Bill gave a
  book to laughed                                 [illegible]  [illegible]
John saw the man that heard the woman that
  laughed and saw Bill                                    636          323
The man that Mary saw and heard gave an apple
  to each woman                                   [illegible]  [illegible]
John saw a and Mary saw the red pear                      726          770
Fig. 19: Timings for some sample sentences
From the timings we can conclude that the proposed device is
comparable to the MSG system in terms of computational efficiency.
However, there are some other advantages such as :-
• Transparency of the grammar - There is no need for phrasal rules
such as "S → S and S". The device also allows non-phrasal
conjunction.
• Since no special grammar or particular phrase marker representation
is required, any parser can be used - the device only requires an
accept/reject answer.
• The specification is not biased with respect to parsing or
generation. The implementation is reversible allowing it to generate
any sentence it can parse and vice versa.
• Modularity of the device. The grammaticality of sentences with
conjunction is determined by the definition of equivalence. For
instance, if needed we can filter the equivalent terminals using
semantics.
A Note on SYSCONJ 
It is worthwhile to compare the phrase marker approach to the
ATN-based SYSCONJ mechanism. Like SYSCONJ, our analysis is
extragrammatical: we do not tamper with the basic grammar, but add a
new component that handles conjunction. Unlike SYSCONJ, our approach
is based on a precise definition of "equivalent phrases" that
attempts to unify under one analysis many different types of
coordination phenomena. SYSCONJ relied on a rather complicated,
interrupt-driven method that restarted sentence analysis in some
previously recorded machine configuration, but with the input
sequence following the conjunction. This captures part of the
"multiple planes" analysis of the phrase marker approach, but without
a precise notion of equivalent phrases. Perhaps as a result, SYSCONJ
handled only ordinary conjunction, and not respectively or gapping
readings. In our approach, a simple change to the linearization
process allows us to handle gapping.
Extensions to the Basic Device 
The device described in the previous section is a simplified version
for rough comparison with the MSG interpreter. However, the system
can easily be generalized to handle multiple conjuncts. The only
additional phase required is to generate templates for multiple
readings. Also, gapping can be handled just by adding clauses to the
definition of linearize - which allows a different path from that of
fig. 8 to be taken.
The simplified device permits some examples of ungrammatical
sentences to be parsed as if correct (fig. 5). The modularity of the
system allows us to constrain the definition of equivalence still
further. The extended definitions in Goodall's draft theory were not
included in his thesis [Goodall84], presumably because they were not
constrained enough. However in his thesis he proposes another
definition of grammaticality using RPMs. This definition can be used
to constrain equivalence still further in our system at a loss of
some efficiency and generality. For example, the required additional
predicate will need to make explicit use of the combined RPM.
Therefore, a parser will need to produce an RPM representation as its
phrase marker. The modifications necessary to produce the
representation are shown in Appendix B.
Acknowledgements 
This work describes research done at the Artificial Intelligence
Laboratory of the Massachusetts Institute of Technology. Support for
the Laboratory's artificial intelligence research has been provided
in part by the Advanced Research Projects Agency of the Department of
Defense under Office of Naval Research contract N00014-80-C-0505. The
first author is also funded by a scholarship from the Kennedy
Memorial Trust.
References 
Bowen et al: D.L. Bowen (ed.), L. Byrd, F.C.N. Pereira, L.M. Pereira,
D.H.D. Warren. Decsystem-10 Prolog User's Manual. University of
Edinburgh. 1982.
Dahl & McCord: V. Dahl and M.C. McCord. Treating Coordination in
Logic Grammars. American Journal of Computational Linguistics.
Vol. 9, No. 2 (1983).
Fong??: Sandiway Fong. To appear in S.M. thesis - "Specifying
Coordination in Logic" - 1985
Goodall??: Grant Todd Goodall. Draft - Chapter 3 (sections 2.1 to
2.7) - Coordination.
Goodall84: Grant Todd Goodall. Parallel Structures in Syntax. Ph.D
thesis. University of California, San Diego (1984).
Lasnik & Kupin: H. Lasnik and J. Kupin. A restrictive theory of
transformational grammar. Theoretical Linguistics 4 (1977).
Appendix A: Linearization 
The full Prolog specification for the predicate linearize is given
below.
/ Linearize for generation /
/ terminating condition /
linearize(pairs S1 S1 and S2 S2
          candidates List giving nil) if nonvar(List)
/ applicable when we have a common substring /
linearize(pairs S1 E1 and S2 E2
          candidates List giving Sentence)
    if var(Sentence)
    and not same(S1 as E1)
    and not same(S2 as E2)
    and similar(S1 to S2 common Similar)
    and not same(Similar as nil)
    and remove(Similar from S1 leaving NewS1)
    and remove(Similar from S2 leaving NewS2)
    and linearize(pairs NewS1 E1 and NewS2 E2
                  candidates List giving RestOfSentence)
    and append(Similar RestOfSentence Sentence)
/ conjoin two substrings /
linearize(pairs S1 E1 and S2 E2
          candidates List giving Sentence)
    if var(Sentence)
    and member(Cand1.Cand2.nil of List)
    and not same(S1 as E1)
    and not same(S2 as E2)
    and remove(Cand1 from S1 leaving NewS1)
    and remove(Cand2 from S2 leaving NewS2)
    and conjoin(list Cand1.Cand2.nil using 'and'
                giving Conjoined)
    and delete(Cand1.Cand2.nil from List leaving NewList)
    and linearize(pairs NewS1 E1 and NewS2 E2
                  candidates NewList giving RestOfSentence)
    and append(Conjoined RestOfSentence Sentence)
/ Linearize for parsing /
/ Terminating case /
linearize(pairs nil nil and nil nil
          candidates List giving nil)
    if var(List)
    and same(List as nil)
/ Case for common substring /
linearize(pairs Common.NewS1 nil and Common.NewS2 nil
          candidates List giving Sentence)
    if nonvar(Sentence)
    and same(Common.RestOfSentence as Sentence)
    and linearize(pairs NewS1 nil and NewS2 nil
                  candidates List giving RestOfSentence)
/ Case for conjoin /
linearize(pairs S1 nil and S2 nil
          candidates Element.Rest giving Sentence)
    if nonvar(Sentence)
    and append'(Conjoined to RestOfSentence giving Sentence)
    and conjoin(list Element using 'and' giving Conjoined)
    and same(Element as Cand1.Cand2.nil)
    and not same(Cand1 as nil)
    and not same(Cand2 as nil)
    and linearize(pairs NewS1 nil and NewS2 nil
                  candidates Rest giving RestOfSentence)
    and append(Cand1 NewS1 S1)
    and append(Cand2 NewS2 S2)
/ append' is a special form of append such that
  the first list must be non-empty /
append'(Head.nil to Tail giving Head.Tail)
append'(First.Second.Others to Tail giving First.Rest)
    if append'(Second.Others to Tail giving Rest)
similar(nil to nil common nil)
similar(Head1.Tail1 to Head2.Tail2 common nil)
    if not same(Head1 as Head2)
similar(Head.Tail1 to Head.Tail2 common Head.Rest)
    if similar(Tail1 to Tail2 common Rest)
/ conjoin is reversible /
conjoin(list First.Second.nil using Conjunct giving Conjoined)
    if nonvar(First)
    and nonvar(Second)
    and append(First Conjunct.Second Conjoined)
conjoin(list First.Second.nil using Conjunct giving Conjoined)
    if nonvar(Conjoined)
    and append(First Conjunct.Second Conjoined)
remove(nil from List leaving List)
remove(Head.Tail from Head.Rest leaving List)
    if remove(Tail from Rest leaving List)
delete(Head from nil leaving nil)
delete(Head from Head.Tail leaving Tail)
delete(Head from First.Rest leaving First.Tail)
    if not same(Head as First)
    and delete(Head from Rest leaving Tail)
Appendix B: Building the RPM 
An RPM representation can be built by adding three extra parameters
to each grammar rule together with a call to a concatenation routine.
For example, consider the verb phrase "liked Mary" from the simple
sentence "John liked Mary". The monostring corresponding to the
non-terminal VP is constructed by taking the left and right contexts
of "liked Mary" and placing the non-terminal symbol VP in between
them. In general, we have something of the form :-
phrase(from Point1 to Point2
       using Start to End giving MS.RPM)
    if isphrase(Point1 to Point2 RPM)
    and buildmonostring(Start Point1 plus 'VP'
                        Point2 End MS)
where difference pairs {Start, Point1}, {Point2, End} and
{Start, End} represent the left context, the right context and the
sentence string respectively. The concatenation routine
buildmonostring is just :-
buildmonostring(Start Point1 plus NonTerminal
                Point2 End MS)
    if append(Point1 Left Start)
    and append(Point2 Right End)
    and append(Left NonTerminal.Right MS)
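The construction described above amounts to replacing a phrase by its non-terminal while keeping its left and right contexts. The sketch below shows this in Python rather than the paper's Prolog. It assumes phrases are given as (label, start, end) word positions, which is an illustrative input format and not the grammar-rule parameters used in the paper; the function names are hypothetical.

```python
def build_monostring(words, start, end, nonterminal):
    """Replace words[start:end] by nonterminal, keeping both contexts."""
    return tuple(words[:start]) + (nonterminal,) + tuple(words[end:])

def build_rpm(words, spans):
    """Collect the terminal string and one monostring per (label, start, end)."""
    rpm = {tuple(words)}
    for label, start, end in spans:
        rpm.add(build_monostring(words, start, end, label))
    return frozenset(rpm)

print(build_monostring(["John", "liked", "Mary"], 1, 3, "VP"))

spans = [("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3), ("V", 1, 2), ("NP", 2, 3)]
for element in sorted(build_rpm(["Alice", "saw", "Bill"], spans)):
    print(".".join(element))
```

Applied to "Alice saw Bill" with the five phrases listed, this produces the six elements of the RPM shown in fig. 3.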
