Towards Convenient Bi-Directional Grammar Formalisms 
P. Newman
IBM Palo Alto Scientific Center
1530 Page Mill Road
Palo Alto, CA 94304, USA

Abstract
This paper discusses the advantages for practical
bi-directional grammars of combining a lexical focus with
the GPSG-originated principle of
immediate-dominance/linear-precedence (ID/LP) rule
partitioning. It also outlines an implementation approach
following these guidelines. The approach is inspired by
Slot Grammar, with additions including more explicit
mappings between surface and internal representations, and
preferential constituent ordering rules.
1. Introduction
The term bi-directional grammar formalism here refers
to an implementation formalism capable of producing
grammars usable in both analysis and generation. Such
grammars can be advantageous for machine translation
and other applications for reasons of economy; they also
aid in grammar validation, as suggested by Dymetman
and Isabelle (1988).
There have been major strides taken in recent years in
bi-directional formalisms, based on many different
paradigms. In many cases some elements of the
specifications are direction-unique. However,
bi-directionality is not an end in itself, as contrasted
with its potential gains. So the goal can be usefully
approximated with formalisms which make some limited
distinctions between information applying to parsing and
to generation.
Among recent efforts in this area are (a) the CRITTER
system described by Dymetman and Isabelle (1988), in
which an annotated definite clause grammar is compiled
differently based on the annotations, for the two
purposes, (b) the inversion of a systemic generator by
Kasper (1988) (in which phrase structure is said to be
added manually for parsing), (c) the generator of
Caraceni and Stock (1988), which is based on an
augmented transition network (ATN) and which seems to
employ a "generate and test" approach to generation,
and (d) the Berlin GPSG effort of Busemann and
Hauenschild (1988), in which GPSG (Gazdar et al. 1985)
is adapted for implementation purposes to allow
feasible rule specification and sequencing.
The purpose of this paper is to suggest that for future
work in operational bi-directional formalisms,
approaches combining a high degree of lexicalism with
some form of GPSG-inspired ID/LP partitioning of
information appear especially promising. Some
formalisms with these characteristics are the
head-driven grammars HPSG (Pollard and Sag 1987), and
Slot Grammar (McCord 1989a). The latter is currently
used in the machine translation system LMT-2 (McCord
1989b), but for parsing only. Aspects of Slot Grammar
will be used to illustrate the discussion, which is in
four parts. Section 2 discusses the relationship of a
strong lexical component to bi-directionality. Section 3
discusses the difficulties of obtaining realistic
bi-directional grammars without an ID/LP separation.
Section 4 discusses ways in which head-driven grammars,
in particular Slot Grammar, avoid these difficulties.
Finally, section 5 discusses some proposed extensions
of Slot Grammar to illustrate a possible organization of
information for a head-driven bi-directional system.
2. Lexicalism and Bi-Directionality 
Probably the majority of contemporary grammars place
considerable information in the lexicon. This is
especially important in a bi-directional context because
it allows direction-free statements of:
• Semantic representations of concepts and their
  associated modifiers.
• Alternative ways in which those semantic
  representations can be realized in terms of
  alternative expression of dependents, on a syntactic
  level. This includes identification of pro-forma
  elements such as required prepositions ("wait FOR
  John"), and entire pro-forma complements, such as
  "a hand" in "give a hand", and fixed-position
  information for non-compositional and frozen
  compounds.
• Unification-oriented mappings between the semantic
  and syntactic representations.
• Lexical transformations originating in LFG
  (Bresnan 1982) for changes in lexical form
  (passivization) and category (e.g., nominalization),
  as well as alternative realizations of some concepts,
  for example, those allowed via "raising" and
  "clefting".
Lexicons incorporating various combinations of these
features are used in most of the bi-directional systems
mentioned in Section 1. Also, Lancel et al. (1988)
claim that significant bi-directionality is obtained in
the SAGE system solely through the use of such lexicons,
with different syntactic components used in the two
processing directions.
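
To make the notion of a direction-free lexical statement concrete, the following is a minimal sketch in Python. It is not taken from any of the systems cited above: the names Frame, LEXICON, analyse and realize, the slot labels, and the toy entries are all assumptions introduced for illustration. The point is only that a single table of frames can be read from syntax to semantics (parsing) or from semantics to syntax (generation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One expected modifier of a head (all field names are illustrative)."""
    label: str      # ordering label such as "indobj"
    valency: int    # argument number used on the semantic side
    syntax: tuple   # sorted (feature, value) pairs on the syntactic side


# Hypothetical toy lexicon; "wait" requires the pro-forma preposition "for".
LEXICON = {
    "give": (
        Frame("obj", 2, (("cat", "np"),)),
        Frame("indobj", 3, (("cat", "np"),)),
        Frame("to_obj", 3, (("cat", "pp"), ("prep", "to"))),
    ),
    "wait": (
        Frame("for_obj", 2, (("cat", "pp"), ("prep", "for"))),
    ),
}


def analyse(head, label, syntax):
    """Parsing direction: syntactic realization -> valency numbers."""
    wanted = tuple(sorted(syntax.items()))
    return [f.valency for f in LEXICON[head]
            if f.label == label and f.syntax == wanted]


def realize(head, valency):
    """Generation direction: valency number -> alternative realizations."""
    return [(f.label, dict(f.syntax)) for f in LEXICON[head]
            if f.valency == valency]
```

Here realize("give", 3) lists both the bare noun phrase and the "to" prepositional phrase as alternative expressions of the same semantic argument, while analyse reads the same entries in the opposite direction.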
3. Inverting Non-ID/LP Grammars
Obtaining realistic bi-directional formulations in
grammars not having an ID/LP partitioning of information
is problematical, because extending their ordering
provisions to deal not only with syntactic correctness
but also with semantic and textual factors exacerbates an
already difficult situation with regard to ordering in
such grammars.
To justify this statement, we look first at current
trends in non-ID/LP grammars, and then at necessary
extensions.
3.1 Trends in Non-ID/LP Grammars
In some contemporary versions of paradigms whose
basic rules were originally intended to subsume both
"ID" and "LP" information, information is reorganized
so that the basic rules express quite limited
information, and additions are needed to express the
remainder. The fundamental cause of the modifications
is the need to effectively accommodate the relatively
free orderings in clause constituents of many languages.
Thus, for example, in augmented phrase structure
grammars (APSGs), which include definite clause
grammars, it is inconvenient to specify each legal
dependent ordering by a separate phrase structure rule.
Instead, as discussed by Jensen (1987), it is more
convenient to focus on binary rules, combining a node
containing a head with one of its modifiers, e.g.,
    VP0 -> VP1 NP; VP0 -> VP1 PP; etc.
Using these binary forms, the phrase structure portion of
the rules indicates not much more than the side of a head
on which a modifier may occur. The "augmentations"
have a number of responsibilities. They must indicate
ordering constraints among siblings in terms of features
recording subtree "states" (i.e., to the extent that such
states are not implied by the category names). Also,
explicit facilities are needed for structure building to
avoid separate nodes for each rule applied.¹ Finally, if
the rules are used in combination with a lexical
orientation, since modifiers are indicated in phrase
structure rules by general category (e.g., NP),
augmentations must locate and specify the relationship
between the modifier category and the specific modifier
type expressed (e.g., an expected complement).
The result of this (necessary) movement of function
away from the basic rules of the paradigm is a tendency
toward somewhat laborious, redundant specification.

To illustrate the kind of redundancy involved, we
construct an APSG-style binary rule for attachment of
indirect objects.

The example assumes a lexicon identifying potential
modifiers of a head by frames, which, by unification,
map between syntactic and semantic representations of
those modifiers. To allow for ordering provisions,
modifier frames have associated labels, such as "indobj".
We also assume that semantic representations of
dependents include valency numbers for complements. To
avoid the development of two examples, we also ensure
that the rule is bi-directionally applicable.
VP0 -> VP1 NP
    choosemodifier(VP0, VP1, NP, indobj, M, M1)
    eunify(VP0, VP1, (hasobj, modifiers))
    unify(VP1.hasobj, "-")
    union(M, M1, N)
    unify(VP0.modifiers, N)
    unify(VP1.modifiers, M1);
This rule can be understood bi-directionally, if we
assume that both the interpreter and "choosemodifier"
are direction-sensitive. In parsing the interpreter finds
a constituent unifying with (cat = vp) adjacent to one
unifying with (cat = np) and instantiates an almost empty
constituent VP0 (cat = vp). "Choosemodifier" then
checks the lexicon to see if the head of VP1 expects a
modifier with label "indobj" whose syntactic subframe
unifies with the constituent NP, and whose valency
number is not yet found as a modifier in VP1. If so, it
returns the result of the full unification as M, and the
current value of VP1.modifiers as M1.

"Eunify" destructively unifies two structures except for
the listed attributes. This serves in parsing to project
head features upward. In parsing the next "unify"
function ensures that a direct object has not yet been
included in VP1. The remainder of the rule, in parsing,
creates the dependent list for VP0 by expanding that of
VP1 to include the indirect object.
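
The parsing-direction reading just described can be sketched in Python as follows. This is an illustrative approximation under several assumptions, not the rule interpreter itself: feature structures are flattened to plain dictionaries, unification is non-destructive, the lexicon and the function attach_indobj are hypothetical, and only the parsing direction is covered.

```python
# Hypothetical toy lexicon: frames expected by the head "give".
LEXICON = {
    "give": [
        {"label": "obj", "valency": 2, "syntax": {"cat": "np"}},
        {"label": "indobj", "valency": 3, "syntax": {"cat": "np"}},
    ],
}


def unify(a, b):
    """Unify two flat feature dicts; return None on a value clash."""
    out = dict(a)
    for key, value in b.items():
        if out.get(key, value) != value:
            return None
        out[key] = value
    return out


def choosemodifier(vp1, np, label):
    """Parsing direction only: find an unused frame with this label."""
    used = {m["valency"] for m in vp1["modifiers"]}
    for frame in LEXICON[vp1["head"]]:
        if frame["label"] != label or frame["valency"] in used:
            continue
        syntax = unify(frame["syntax"], np)
        if syntax is not None:
            m = {"label": label, "valency": frame["valency"],
                 "syntax": syntax}
            return m, vp1["modifiers"]          # (M, M1)
    return None


def attach_indobj(vp1, np):
    """VP0 -> VP1 NP, read in the parsing direction."""
    if vp1.get("cat") != "vp" or np.get("cat") != "np":
        return None
    chosen = choosemodifier(vp1, np, "indobj")
    if chosen is None or vp1.get("hasobj", "-") != "-":
        return None                             # unify(VP1.hasobj, "-") fails
    m, m1 = chosen
    # eunify: project every feature except hasobj and modifiers upward.
    vp0 = {k: v for k, v in vp1.items() if k not in ("hasobj", "modifiers")}
    vp0["modifiers"] = m1 + [m]                 # union(M, M1, N)
    return vp0
```

Even in this reduced form, the lexicon lookup, feature projection and list building would have to be repeated in every attachment rule of the same style.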
Similar rules could be constructed for less strictly 
ordered complements, and for adjuncts. 
These operations have considerable inherent redundancy
even though much function is abstracted out within
"choosemodifier". The "choosemodifier" operation
occurs in all complement attachment rules. Feature
projection and structure building occur in all rules.
Finally, the actual precedence rule aspects can be
expressed more perspicuously than via feature state
testing. Looking ahead slightly, one way of summarizing
the situation is to say that when a grammar paradigm
which originally combines "ID" and "LP" resorts to
binary rules, especially in the presence of a lexical
focus, the grammar becomes, to a large extent, a
head-driven grammar, without the ability to take full
advantage of the factoring opportunities afforded.
3.2 Extensions for Bi-Directionality
So far we have covered somewhat old ground. Why are
these modified approaches especially problematic in a
bi-directional context? Because there one is faced with
an unpleasant choice between probably untenable
complexity and unnecessary generation.
To justify this claim, we return to the rule illustrated
in section 3.1, first examining its assumed operation in
a generative direction.
In generation the interpreter instantiates almost empty
constituents VP1 and NP. "Choosemodifier" then
attempts to find an expected modifier frame for the head
of VP0 with the given label whose semantic subframe
(containing a valency number) unifies with one of the
actual modifiers (VP0.modifiers) of VP0, and whose
syntactic subframe unifies with NP, returns the result as
M, and the remainder of the modifiers as M1, etc.
1 Similar tendencies are observed in contemporary categorial grammars. For example, Yoo and Lee (1988) use
"quotient" categories which specify unordered sets of possible arguments, together with separate LP rules. Bes and
Gardent (1989) also use sets within categories, together with order features to constrain adjacency.
But such rules do not really satisfy the requirements of
generation. They describe syntactically correct
structures but specify no ordering constraints and
preferences relating to either semantic considerations
(e.g., required orderings of adjective types in English,
and conventional orderings of verb modifiers) or to
textual considerations such as topic and focus. In
parsing such provisions are needed to detect textual
features, and in generation they are needed to use
textual features to determine ordering.
If these provisions were added, the "feature testing"
aspects relating to ordering would become considerably
more complicated, if expressible at all. This is because
detecting and using textual considerations seems to
involve taking into account the entire complex of
modifiers for a head, which is extremely awkward in terms
of binary phrase structure rules.
Hajicova (1989) describes topic/focus determination
conditions for both English and Czech; they involve both
semantic role information and complex sibling
relationships. If those conditions were expressed in the
context of binary rules, it seems that a rule such as
    VP0 -> VP1 x
in the parse would have the responsibility of assigning x
to "focus" if there has been a break to the left of x in
the conventional ordering of dependent roles (for Czech),
and indeterminate otherwise (until further dependents
are found). In generation the rule might be licensed at a
stage in generation where x is either (a) part of the
topic, and VP0 contains only topic dependents, or (b) part
of the focus and ranks highest of the dependents in VP0 in
the systemic order. (Topic/focus identification criteria
for English are also considered by Hajicova, and are
more complicated.)
So adding textual provisions to phrase structure rules 
would pose a considerable challenge. Simply put, 
attaching dependents to heads one at a time is a conven- 
ient approach in parsing, but detaching them one at a 
time is not a convenient approach for generation. 
On the other hand, if textual provisions are omitted 
from the grammar, then generation would produce all 
syntactically legitimate sentences. One would then use 
additional rule sets to select among all the generated 
utterances based on semantic and textually based
preferences.² (And rules are also needed to detect
textual features during analysis.)
4. Head Driven Grammars and Slot Grammar
Head driven grammars which combine a lexical focus
with a strict ID/LP partitioning avoid the problems
described above. We use Slot Grammar as an example.
The lexicon formulation of Slot Grammar is interesting
in that it identifies dependents, both complements and
adjuncts, by "slotnames", a device originating in earlier
work by McCord (1980). The (alternative) structures
which can be used to realize those slots are factored out
into separate "filler rules". These rules contain
conditions on both prospective fillers and associated
heads. They can thus be used to constrain/adjust features
of the constituents under consideration, e.g., to
instantiate agreement. In other words, they can be used
to express many ID constraints.
The basic linear precedence conditions of Slot Grammar
are expressed by two types of rules. "Head/slot" rules
indicate the sides of the head on which a particular
"slot" may appear. These rules are conditional in terms
of unifiers for both head and slot filler. "Slot/slot"
rules indicate, again conditionally, precedence rules
among slots on the same side of a head.
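
As a rough illustration of how the two rule types divide the work, the sketch below checks a proposed ordering against a table of head/slot rules and a set of slot/slot rules. It is a simplification under stated assumptions: the rules here are unconditional, whereas those described above are conditional on features of head and filler, and the names HEAD_SLOT, SLOT_SLOT and order_ok, as well as the slot names, are hypothetical and not Slot Grammar notation.

```python
# Head/slot rules: the side of the head on which each slot may appear.
HEAD_SLOT = {"subj": "left", "iobj": "right", "obj": "right"}

# Slot/slot rules: (a, b) means a precedes b when both are on one side.
SLOT_SLOT = {("iobj", "obj")}


def order_ok(left, right):
    """Check slot names given in surface order on each side of the head."""
    for side, slots in (("left", left), ("right", right)):
        # Every slot must be licensed on this side by a head/slot rule.
        if any(HEAD_SLOT[slot] != side for slot in slots):
            return False
        # No later slot may be required to precede an earlier one.
        for i, earlier in enumerate(slots):
            for later in slots[i + 1:]:
                if (later, earlier) in SLOT_SLOT:
                    return False
    return True
```

With these tables, a subject to the left and "iobj obj" to the right of the head is accepted, while "obj iobj" is rejected by the slot/slot rule.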
Organizing information in this way allows the
elimination of the explicit specification of many aspects
of the rule shown in section 3.1:
1. Association of "categories" with complements and
   adjuncts is eliminated - ordering is stated in terms
   of slots rather than the more general syntactic
   categories.
2. "Choosemodifier" becomes the basic, built-in
   control operation of the parser, and need not be
   expressed explicitly.
3. Structure building operations are, to a large extent,
   implicit. Only variations in feature projection, etc.
   need be expressed explicitly.
The remainder of the information in the rule is
expressed by two short rules, one which indicates that
indirect objects fall on the right sides of heads, and the
other that they precede direct objects.
2 There have been efforts to combine textual considerations with non-ID/LP grammars. For example, Uszkoreit
(1988) uses exhaustive enumeration of alternative modifier orderings, including complements and adjuncts, with
selection among alternatives made by a focus feature. However, in parsing, the suggestion must somehow "collapse"
to a set-oriented approach, using the enumerated alternatives as a kind of LP rule. Also, as implied by the results of
Hajicova (1989), and explicitly argued by Hauenschild (1988), these provisions are not sufficient.

The revised organization of information also provides
the basis for dealing with semantic and textually
conditioned ordering requirements without either undue
complexity or exhaustive generation. This is because the
inherent modularity allows the use of different control
schemes for parsing and generation. In parsing the
control scheme can be "attach one dependent at a time",
using immediate dominance rules and basic linear
precedence constraints together. In generation the
control scheme can be altered to first generate sets of
dependents, using just immediate dominance rules, and
then ordering them using both basic precedence
constraints and preference-oriented ones.
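
The difference between the two control schemes can be sketched as follows, again in Python and again purely as an illustration. The slot names, the weights, and the functions hard_ok and generate_order are assumptions; the naive search over permutations of one head's dependents merely stands in for whatever ordering procedure an actual implementation would use. What the sketch shows is that the same absolute constraints serve both directions, while preferences enter only when a complete set of dependents is being ordered.

```python
from itertools import permutations

# Absolute precedence constraints: (a, b) means a must precede b.
HARD = {("iobj", "obj")}

# Preferential constraints with hypothetical weights.
PREFER = {("obj", "loc"): 2, ("loc", "time"): 1}


def hard_ok(order):
    """Parsing-side check: absolute constraints only."""
    pos = {slot: i for i, slot in enumerate(order)}
    return not any(a in pos and b in pos and pos[a] > pos[b]
                   for a, b in HARD)


def generate_order(slots):
    """Generation side: order a whole dependent set at once."""
    best, best_score = None, -1
    for cand in permutations(slots):
        if not hard_ok(cand):
            continue
        pos = {slot: i for i, slot in enumerate(cand)}
        score = sum(w for (a, b), w in PREFER.items()
                    if a in pos and b in pos and pos[a] < pos[b])
        if score > best_score:
            best, best_score = cand, score
    return list(best) if best is not None else None
```

In the parsing direction a dispreferred but legal order such as "time iobj obj" still passes hard_ok, whereas generate_order applied to the same dependents returns the single order that satisfies every preference.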
5. Current Direction 
Recapitulating, bi-directional grammar efforts combining
lexicalism with an ID/LP separation seem most promising
because
1. A lexical focus in itself provides a great deal of
   bi-directional facility.
2. In current grammars based on non-ID/LP
   paradigms, linear precedence constraints governing
   syntactic correctness are expressed by constraints on
   features of one node of a binary rule. Extending
   this approach to deal with preferential ordering is at
   best extremely complex, and possibly infeasible, but
   the alternative seems to be exhaustive generation
   followed by filtering.
3. In contrast, the modularity provided by an ID/LP
   separation allows rules to be applied in different
   combinations in parsing and generation.
In the work underlying this paper, a multi-lingual
machine translation project, a bi-directional grammar
formalism is being developed inspired by Slot Grammar,
but with modifications including:
1. adding a fully reversible morphological component
   in the lexicon.
2. expanding the lexical provisions to include explicit
   bi-directional mappings between syntactic and more
   abstract representations.
3. revising the notation to facilitate reversibility.
4. using a slotname type-lattice to simplify the
   expression of generalizations.
5. adding preferential precedence rules dealing with
   semantic and textual considerations. The
   preferential ordering rules are used in the analysis
   phase to detect textual features, and are applied
   after a post-parse disambiguation analysis (based on a
   heuristic search algorithm described in (Newman
   1988)). In generation, however, the preferential
   ordering rules are applied together with those
   expressing absolute ordering constraints.
A preliminary description of these provisions has been
documented (Newman, to appear). The preferential
precedence rules are of two kinds: one kind relates to
the association of dependents with "zones" of a
constituent (e.g., pre-subject, pre-finite,...), and the
other kind deals with their ordering within zones. Zones
are used because some aspects of dependent ordering are
most conveniently described in those terms, as discussed
by Quirk et al (1972), Uszkoreit (1988) and others.³ Zone
association rules express the preferences of certain types
of modifiers for certain zones, and also variations in
these preferences due to textual considerations. These
preferences must be balanced, by heuristics, against the
needs of other modifiers and the constraints imposed by
the zones themselves. Optimal ways of stating and using
these preferences represent a major focus of our current
work.
3 It might be noted that to allow meaningful use of zones, the syntactic structures used in the design are very flat.
Fronting is not viewed, as in most current approaches, as an example of long-distance dependency. Rather, to
simplify the statement of zone-allocation and other ordering rules, dependents assigned to different zones are siblings
and, as in Karttunen (1986), auxiliaries are adjunct-like.

References. 

Bes, G.G., Gardent, C. "French Order Without
Order", Proc. 4th Conf. of European Chapter of
ACL (1989), 249-255

Busemann S., Hauenschild C. "A Constructive View
of GPSG or How to Make It Work", Proc
COLING 88, 77-82

Caraceni, R., O. Stock, "Reversing a Lexically
Based Parser for Generation," Applied Artificial
Intelligence, vol. 2, no. 2 (1988) 149-74

Dymetman M., Isabelle, P. "Reversible Logic
Grammars for Machine Translation", Proc 2nd
Int'l Conf on Theoretical and Methodological Issues
in the Machine Translation of Natural Languages
(1988)

Gazdar, G., E. Klein, G. Pullum, I. Sag, Generalized
Phrase Structure Grammar, Basil Blackwell
(1985)

Hajicova E. "A Dependency-Based Parser for
Topic and Focus", Proc. Int'l Workshop on
Parsing Technologies (1989) 448-457

Hauenschild C. "GPSG and German Word Order",
in U. Reyle, C. Rohrer, eds., Natural Language
Parsing and Linguistic Theories, Reidel (1988),
411-431

Jensen, K. "Binary Rules and Non-binary Trees", in
A. Manaster-Ramer (ed.), Mathematics of Language,
John Benjamins (1987)

Kasper, R. T. "An Experimental Parser for
Systemic Grammars", Proc COLING 88, 309-312

Karttunen L., "Radical Lexicalism", CSLI Report
CSLI-86-68 (1986)

Lancel, J-M, Otani M., Simonin N., Danlos L.,
"SAGE: A Sentence Parsing and Generation
System", Proc COLING 88, 359-364

McCord, M.C., "Slot Grammars", Computational
Linguistics, vol 6, 31-43 (1980)

McCord, M.C. "A New Version of Slot Grammar",
IBM Research Report RC 14506 (1989a)

McCord, M.C. "A New Version of the Machine
Translation System LMT", IBM Research Report
RC 14710 (1989b), to appear in Proc. International
Scientific Symposium on Natural Language and
Logic, Springer Lecture Notes in Computer Science

Newman, P. "Combinatorial Disambiguation",
Proc. ACL Conf. on Applied NLP (1988)

Newman, P. "Symmetric Slot Grammar", to appear
in Proc 3rd Int'l Conf on Theoretical and
Methodological Issues in Machine Translation of
Natural Languages, June 1990

Pollard, C. and I. Sag, Information-based Syntax
and Semantics Vol. 1, CSLI (1987)

Quirk, R., S. Greenbaum, G. Leech, J. Svartvik, A
Grammar of Contemporary English, Longman
(1972)

Uszkoreit, H. Linear Precedence in Discontinuous
Constituents: Complex Fronting in German. CSLI
Research Report CSLI-86-47 (1988).

Yoo S., Lee, K. "Extended Categorial Grammar",
CSLI Report CSLI-88-121 (1988)
