Exploiting a Probabilistic Hierarchical Model for Generation 
Srinivas Bangalore and Owen Rambow 
AT&T Labs Research 
180 Park Avenue 
Florham Park, NJ 07932 
{srini, rambow}@research.att.com
Abstract 
Previous stochastic approaches to generation 
do not include a tree-based representation of 
syntax. While this may be adequate or even 
advantageous for some applications, other ap- 
plications profit from using as much syntactic 
knowledge as is available, leaving to a stochas- 
tic model only those issues that are not deter- 
mined by the grammar. We present initial
results showing that a tree-based model derived
from a tree-annotated corpus improves on a tree
model derived from an unannotated corpus, and
that a tree-based stochastic model with a
hand-crafted grammar outperforms both.
1 Introduction 
For many applications in natural language
generation (NLG), the range of linguistic
expressions that must be generated is quite
restricted, and a grammar for generation can be
fully specified by hand. Moreover, in many cases
it is very important not to deviate from certain
linguistic standards in generation, in which case
hand-crafted grammars give excellent control.
However, in other applications for NLG the variety
of the output is much bigger, and the demands
on the quality of the output somewhat less
stringent. A typical example is NLG in the
context of (interlingua- or transfer-based)
machine translation. Another reason for relaxing
the quality of the output may be that not enough
time is available to develop a full grammar for
a new target language in NLG. In all these
cases, stochastic ("empiricist") methods provide
an alternative to hand-crafted ("rationalist")
approaches to NLG. To our knowledge, the
first to use stochastic techniques in NLG were
Langkilde and Knight (1998a; 1998b). In
this paper, we present FERGUS (Flexible
Empiricist/Rationalist Generation Using Syntax).
FERGUS follows Langkilde and Knight's seminal
work in using an n-gram language model, but we
augment it with a tree-based stochastic model
and a traditional tree-based syntactic grammar.
More recent work on aspects of stochastic
generation includes (Langkilde and Knight, 2000),
(Malouf, 1999) and (Ratnaparkhi, 2000).
Before we describe in more detail how we use
stochastic models in NLG, we recall the basic
tasks in NLG (Rambow and Korelsky, 1992;
Reiter, 1994). During text planning, content and
structure of the target text are determined to
achieve the overall communicative goal. During
sentence planning, linguistic means - in
particular, lexical and syntactic means - are
determined to convey smaller pieces of meaning.
During realization, the specification chosen in
sentence planning is transformed into a surface
string, by linearizing and inflecting words in the
sentence (and typically, adding function words).
As in the work by Langkilde and Knight, our
work ignores the text planning stage, but it does
address the sentence planning and the
realization stages.
The structure of the paper is as follows. In
Section 2, we present the underlying grammatical
formalism, lexicalized tree-adjoining grammar
(LTAG). In Section 3, we describe the
architecture of the system, and some of the
modules. In Section 4 we discuss three experiments.
In Section 5 we compare our work to that of
Langkilde and Knight (1998a). We conclude
with a summary of on-going work.
2 Modeling Syntax 
In order to model syntax, we use an existing
wide-coverage grammar of English, the XTAG
grammar developed at the University of
Pennsylvania (XTAG-Group, 1999). XTAG is a
tree-adjoining grammar (TAG) (Joshi, 1987a).

[Figure 1: An excerpt from the XTAG grammar to
derive There was no cost estimate for the second
phase; dotted lines show possible adjunctions
that were not made. Supertags of the trees used
in the derivation: there α1, was γ3, no γ1,
cost γ2, estimate α2, for γ4, the γ1, second γ5,
phase α1.]

In a TAG, the elementary structures are
phrase-structure trees which are composed using two
operations, substitution (which appends one
tree at the frontier of another) and adjunction
(which inserts one tree into the middle of
another). In graphical representation, nodes at
which substitution can take place are marked
with down-arrows. In linguistic uses of TAG,
we associate one lexical item (its anchor) with
each tree, and one or (typically) more trees with
each lexical item; as a result we obtain a
lexicalized TAG or LTAG. Since each lexical item
is associated with a whole tree (rather than
just a phrase-structure rule, for example), we
can specify both the predicate-argument
structure of the lexeme (by including nodes at which
its arguments must substitute) and
morpho-syntactic constraints such as subject-verb
agreement within the structure associated with the
lexeme. This property is referred to as TAG's
extended domain of locality. Note that in an
LTAG, there is no distinction between lexicon
and grammar. A sample grammar is shown in
Figure 1.
We depart from XTAG in our treatment of
trees for adjuncts (such as adverbs), and
instead follow McDonald and Pustejovsky (1985).
While in XTAG the elementary tree for an
adjunct contains phrase structure that attaches
the adjunct to nodes in another tree with the
specified label (say, VP) from the specified
direction (say, from the left), in our system the
trees for adjuncts simply express their active
valency, but not how they connect to the lexical
item they modify. This information is kept in
the adjunction table which is associated with the
grammar; an excerpt is shown in Figure 2. Trees
that can adjoin to other trees (and have entries
in the adjunction table) are called gamma-trees,
the other trees (which can only be substituted
into other trees) are alpha-trees.

stag          anchored by   adjoins to   direction
γ1            Det           NP           right
γ2            N             N            right
γ3            Aux           S, VP        right
γ4            Prep          NP, VP       left
[illegible]   [illegible]   S            right
γ5            Adj           N            right

Figure 2: Adjunction table for grammar fragment
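The adjunction table can be pictured as a simple lookup structure. The following is a minimal sketch, assuming a dictionary encoding of the rows of Figure 2 that are used in the running example; the names ADJUNCTION_TABLE, is_gamma_tree and can_adjoin are hypothetical and are not part of XTAG or FERGUS.

```python
# Hypothetical encoding of part of the adjunction table (Figure 2).
# Each supertag maps to the category of its anchor, the node labels it may
# adjoin to, and the adjunction direction. Only rows used in the running
# example are listed; this is an illustration, not the actual grammar.
ADJUNCTION_TABLE = {
    "gamma1": {"anchor": "Det", "adjoins_to": {"NP"}, "direction": "right"},
    "gamma2": {"anchor": "N", "adjoins_to": {"N"}, "direction": "right"},
    "gamma3": {"anchor": "Aux", "adjoins_to": {"S", "VP"}, "direction": "right"},
    "gamma4": {"anchor": "Prep", "adjoins_to": {"NP", "VP"}, "direction": "left"},
}


def is_gamma_tree(supertag):
    """A tree counts as a gamma-tree if it has an entry in the table."""
    return supertag in ADJUNCTION_TABLE


def can_adjoin(supertag, node_label):
    """True if the tree named by `supertag` may adjoin at `node_label`."""
    entry = ADJUNCTION_TABLE.get(supertag)
    return entry is not None and node_label in entry["adjoins_to"]
```

Under this encoding, alpha-trees are simply those supertags without an entry, which mirrors the distinction drawn in the text.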
Note that we can refer to a tree by a
combination of its name, called its supertag, and its
anchor. For example, α1 is the supertag of an
alpha-tree anchored by a noun that projects up
to NP, while γ2 is the supertag of a gamma tree
anchored by a noun that only projects to N (we
assume adjectives are adjoined at N), and, as
the adjunction table shows, can right-adjoin to
an N. So the combination of estimate and one of
its supertags is a particular tree
in our LTAG grammar. Another tree that a
supertag can be associated with is α2, which
represents the predicative use of a noun.1 Not all
nouns are associated with all nominal supertags:
the expletive there is only an α1.
When we derive a sentence using an LTAG,
we combine elementary trees from the grammar
using adjunction and substitution. For
example, to derive the sentence There was no cost
estimate for the second phase from the
grammar in Figure 1, we substitute the tree for there
into the tree for estimate. We then adjoin in
the trees for the auxiliary was, the determiner
no, and the modifying noun cost. Note that
these adjunctions occur at different nodes: at
VP, NP, and N, respectively. We then adjoin in
the preposition, into which we substitute phase,
into which we adjoin the and second. Note that
all adjunctions are by gamma trees, and all
substitutions by alpha trees.
If we want to represent this derivation
graphically, we can do so in a derivation tree, which we
obtain as follows: whenever we adjoin or
substitute a tree t1 into a tree t2, we add a new
daughter labeled t1 to the node labeled t2. As
explained above, the name of each tree used is
the lexeme along with the supertag. (We omit
the address at which substitution or adjunction
takes place.) The derivation tree for our
derivation is shown in Figure 3. As can be seen, this
structure is a dependency tree and resembles a
representation of lexical argument structure.
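The construction of a derivation tree can be sketched in a few lines. The class DerivationNode and the function build_example below are hypothetical names introduced for illustration; the supertag labels follow a reading of Figures 1 and 3 and should be treated as assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DerivationNode:
    """A derivation-tree node: a lexeme together with its supertag."""
    lexeme: str
    supertag: str
    daughters: List["DerivationNode"] = field(default_factory=list)

    def attach(self, daughter):
        """Record that `daughter` was adjoined or substituted into this tree.

        As in the text, the address of the operation is not stored.
        """
        self.daughters.append(daughter)
        return daughter


def build_example():
    """Derivation tree for 'There was no cost estimate for the second phase'."""
    estimate = DerivationNode("estimate", "alpha2")
    estimate.attach(DerivationNode("there", "alpha1"))      # substitution
    estimate.attach(DerivationNode("was", "gamma3"))        # adjunction at VP
    estimate.attach(DerivationNode("no", "gamma1"))         # adjunction at NP
    estimate.attach(DerivationNode("cost", "gamma2"))       # adjunction at N
    prep = estimate.attach(DerivationNode("for", "gamma4")) # adjunction
    phase = prep.attach(DerivationNode("phase", "alpha1"))  # substitution
    phase.attach(DerivationNode("the", "gamma1"))           # adjunction
    phase.attach(DerivationNode("second", "gamma5"))        # adjunction
    return estimate
```

Each call to attach corresponds to one substitution or adjunction step of the derivation described in the preceding paragraphs.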
Joshi (1987b) claims that TAG's properties
make it particularly suited as a syntactic
representation for generation. Specifically, its
extended domain of locality is useful in
generation for localizing syntactic properties
(including word order as well as agreement and other
morphological processes), and lexicalization is
useful for providing an interface from
semantics (the derivation tree represents the sentence's
predicate-argument structure). Indeed, LTAG
has been used extensively in generation,
starting with (McDonald and Pustejovsky, 1985).

1Sentences such as Peter is a doctor can be analyzed
with be as the head, as is more usual, or with doctor
as the head, as is done in XTAG because the be really
behaves like an auxiliary, not like a full verb.
estimate α2
  there α1
  was γ3
  no γ1
  cost γ2
  for γ4
    phase α1
      the γ1
      second γ5

Figure 3: Derivation tree for LTAG derivation
of There was no cost estimate for the second
phase
3 System Overview 
FERGUS is composed of three modules: the Tree
Chooser, the Unraveler, and the Linear
Precedence (LP) Chooser. The input to the system is
a dependency tree as shown in Figure 4. Note
that the nodes are labeled only with lexemes,
not with supertags.2 The Tree Chooser then
uses a stochastic tree model to choose TAG
trees for the nodes in the input structure. This
step can be seen as analogous to
"supertagging" (Bangalore and Joshi, 1999), except that
now supertags (i.e., names of trees) must be
found for words in a tree rather than for words
in a linear sequence. The Unraveler then uses
the XTAG grammar to produce a lattice of all
possible linearizations that are compatible with
the supertagged tree and the XTAG grammar. The LP
Chooser then chooses the most likely traversal
of this lattice, given a language model. We
discuss the three components in more detail.
The Tree Chooser draws on a tree model,
which is a representation of XTAG derivation
for 1,000,000 words of the Wall Street Journal.3

2In the system that we used in the experiments
described in Section 4, all words (including function words)
need to be present in the input representation, fully
inflected. This is of course unrealistic for applications. In
this paper, we only aim to show that the use of a Tree
Model improves performance of a stochastic generator.
See Section 6 for further discussion.
3This was constructed from the Penn Tree Bank
using some heuristics, since the Penn Tree Bank does not
contain full head-dependent information; as a result of
the use of heuristics, the Tree Model is not fully correct.

estimate
  there
  was
  no
  cost
  for
    phase
      the
      second

Figure 4: Input to FERGUS

The Tree Chooser makes the simplifying
assumption that the choice of a tree for a node
depends only on its daughter nodes, thus
allowing for a top-down dynamic programming
algorithm. Specifically, a node η in the input
structure is assigned a supertag s so that the
probability of finding the treelet composed of η with
supertag s and all of its daughters (as found
in the input structure) is maximized, and such
that s is compatible with η's mother and her
supertag s_m. Here, "compatible" means that
the tree represented by s can be adjoined or
substituted into the tree represented by s_m,
according to the XTAG grammar. For our
example sentence, the input to the system is the tree
shown in Figure 4, and the output from the Tree
Chooser is the tree as shown in Figure 3. Note
that while a derivation tree in TAG fully
specifies a derivation and thus a surface sentence,
the output from the Tree Chooser does not.
There are two reasons. Firstly, as explained at
the end of Section 2, for us trees
corresponding to adjuncts are underspecified with respect
to the adjunction site and/or the adjunction
direction (from left or from right) in the tree
of the mother node, or they may be unordered
with respect to other adjuncts (for example, the
famous adjective ordering problem). Secondly,
supertags may have been chosen incorrectly or
not at all.
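The selection criterion used by the Tree Chooser can be illustrated with a small sketch. It is a simplified greedy top-down pass, not the actual algorithm or model of FERGUS: the function choose_supertags, the candidate lists, the score table (a stand-in for the tree model's treelet probabilities) and the compatibility relation are all hypothetical toy assumptions.

```python
def choose_supertags(node, mother_supertag, candidates, score, compatible):
    """Assign supertags top-down to a dependency tree.

    node            : (lexeme, [daughter nodes])
    mother_supertag : supertag already chosen for the mother (None at the root)
    candidates      : dict lexeme -> list of possible supertags
    score           : function (lexeme, supertag, daughter_lexemes) -> float,
                      standing in for the treelet probability
    compatible      : function (supertag, mother_supertag) -> bool
    Returns (lexeme, supertag or None, [annotated daughters]).
    """
    lexeme, daughters = node
    daughter_lexemes = tuple(d[0] for d in daughters)
    allowed = [s for s in candidates.get(lexeme, [])
               if mother_supertag is None or compatible(s, mother_supertag)]
    # If nothing is compatible, the node is left without a supertag.
    best = (max(allowed, key=lambda s: score(lexeme, s, daughter_lexemes))
            if allowed else None)
    return (lexeme, best,
            [choose_supertags(d, best, candidates, score, compatible)
             for d in daughters])


# Toy data, invented for illustration only.
TREE = ("estimate", [("there", []), ("cost", [])])
CANDIDATES = {"estimate": ["alpha1", "alpha2"],
              "there": ["alpha1"],
              "cost": ["alpha1", "gamma2"]}
SCORES = {("estimate", "alpha2", ("there", "cost")): 0.6,
          ("estimate", "alpha1", ("there", "cost")): 0.1,
          ("there", "alpha1", ()): 1.0,
          ("cost", "gamma2", ()): 0.7,
          ("cost", "alpha1", ()): 0.3}
COMPATIBLE = {("alpha1", "alpha2"), ("gamma2", "alpha2"), ("gamma2", "alpha1")}


def toy_score(lexeme, supertag, daughter_lexemes):
    return SCORES.get((lexeme, supertag, daughter_lexemes), 0.0)


def toy_compatible(supertag, mother_supertag):
    return (supertag, mother_supertag) in COMPATIBLE
```

Calling choose_supertags(TREE, None, CANDIDATES, toy_score, toy_compatible) annotates each node of the toy tree; a node whose candidates are all incompatible with the mother's supertag receives None, which corresponds to the case where a supertag is not chosen at all.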
[Figure 5: Architecture of FERGUS. A TAG
derivation tree without supertags is passed to
the Tree Chooser, which yields one single
semi-specified TAG derivation tree; using the
grammar, this is turned into a word lattice, from
which a string is chosen.]

The Unraveler takes as input the
semi-specified derivation tree (Figure 3) and
produces a word lattice. Each node in the
derivation tree consists of a lexical item and a
supertag. The linear order of the daughters with
respect to the head position of a supertag is
specified in the XTAG grammar. This
information is consulted to order the daughter nodes
with respect to the head at each level of the
derivation tree. In cases where a daughter node
can be attached at more than one place in the
head supertag (as is the case in our example for
was and for), a disjunction of all these positions
is assigned to the daughter node. A
bottom-up algorithm then constructs a lattice that
encodes the strings represented by each level of
the derivation tree. The lattice at the root of
the derivation tree is the result of the Unraveler.
The resulting lattice for the example sentence is
shown in Figure 6.
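The effect of the position disjunctions can be illustrated with a small sketch. Instead of building a lattice, it simply enumerates the word sequences bottom-up, and it assumes that daughters on the same side of the head keep their input order; both are simplifications, and the function linearizations and the toy tree are hypothetical.

```python
from itertools import product


def linearizations(node):
    """Return every word sequence licensed by `node`, as a list of tuples.

    node is (word, daughters), where daughters is a list of
    (daughter_node, allowed_sides) pairs and allowed_sides is a set
    containing "left" and/or "right" (position relative to the head).
    """
    word, daughters = node
    if not daughters:
        return [(word,)]
    results = []
    # One choice of side per daughter, covering the whole disjunction.
    side_options = [sorted(allowed) for _, allowed in daughters]
    for sides in product(*side_options):
        left = [d for (d, _), s in zip(daughters, sides) if s == "left"]
        right = [d for (d, _), s in zip(daughters, sides) if s == "right"]
        # Bottom-up: combine the alternatives of each daughter subtree.
        for left_strings in product(*(linearizations(d) for d in left)):
            for right_strings in product(*(linearizations(d) for d in right)):
                seq = (tuple(w for s in left_strings for w in s)
                       + (word,)
                       + tuple(w for s in right_strings for w in s))
                results.append(seq)
    return results


# Toy input: "was" may attach on either side of the head.
TOY_TREE = ("estimate", [(("there", []), {"left"}),
                         (("was", []), {"left", "right"}),
                         (("cost", []), {"left"})])
```

For TOY_TREE the sketch yields two candidate sequences, one per position of was; a lattice encodes the same alternatives compactly by sharing common substrings.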
The lattice output from the Unraveler
encodes all possible word sequences permitted
by the derivation structure. We rank these
word sequences in the order of their
likelihood by composing the lattice with a
finite-state machine representing a trigram language
model. This model has been constructed from
1,000,000 words of Wall Street Journal corpus.
We pick the best path through the lattice
resulting from the composition using the Viterbi
algorithm, and this top ranking word sequence
is the output of the LP Chooser.
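The ranking step can be sketched as follows. The sketch scores an explicit list of candidate strings with a toy trigram model rather than composing a lattice with a finite-state machine and running Viterbi search; the add-one smoothing, the two-sentence training corpus and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def train_trigram(sentences):
    """Collect trigram and context (bigram) counts from tokenized sentences."""
    tri, bi, vocab = Counter(), Counter(), set()
    for words in sentences:
        padded = ["<s>", "<s>"] + list(words) + ["</s>"]
        vocab.update(padded)
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            bi[tuple(padded[i - 2:i])] += 1
    return tri, bi, len(vocab)


def log_prob(words, model):
    """Log probability of a word sequence under an add-one smoothed model."""
    tri, bi, vocab_size = model
    padded = ["<s>", "<s>"] + list(words) + ["</s>"]
    total = 0.0
    for i in range(2, len(padded)):
        numerator = tri[tuple(padded[i - 2:i + 1])] + 1
        denominator = bi[tuple(padded[i - 2:i])] + vocab_size
        total += math.log(numerator / denominator)
    return total


def best_string(candidates, model):
    """Return the candidate word sequence with the highest model score."""
    return max(candidates, key=lambda c: log_prob(c, model))


# Toy training data and candidates, invented for illustration.
TOY_CORPUS = [["there", "was", "no", "cost", "estimate"],
              ["there", "was", "a", "delay"]]
TOY_CANDIDATES = [("there", "was", "cost", "estimate"),
                  ("there", "cost", "estimate", "was")]
```

With this toy corpus, best_string(TOY_CANDIDATES, train_trigram(TOY_CORPUS)) prefers the candidate that begins with there was, because that word pair is attested in the training data.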
4 Experiments and Results 
[Figure 6: Word lattice for example sentence
after Tree Chooser and Unraveler using the
supertag-based model]

In order to show that the use of a tree model
and a grammar does indeed help performance,
we performed three experiments:
• For the baseline experiment, we impose a
random tree structure for each sentence of
the corpus and build a Tree Model whose
parameters consist of whether a lexeme l_d
precedes or follows her mother lexeme l_m.
We call this the Baseline Left-Right (LR)
Model. This model generates There was
estimate for phase the second no cost . for
our example input.
• In the second experiment, we derive the
parameters for the LR model from an
annotated corpus, in particular, the XTAG
derivation tree corpus. This model
generates There no estimate for the second phase
was cost . for our example input.
• In the third experiment, as described
in Section 3, we employ the supertag-based
tree model whose parameters consist of
whether a lexeme l_d with supertag s_d is a
dependent of l_m with supertag s_m.
Furthermore we use the supertag information
provided by the XTAG grammar to
order the dependents. This model generates
There was no cost estimate for the second
phase . for our example input, which is
indeed the sentence found in the WSJ.
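The parameters of an LR model can be pictured as follows. The sketch estimates, by relative frequency, how often a dependent lexeme precedes its mother lexeme; the estimation method, the observation format and the function names are assumptions, since the text only states what the parameters consist of.

```python
from collections import defaultdict


def train_lr_model(observations):
    """Estimate P(dependent precedes mother) for each lexeme pair.

    observations: iterable of (dependent, mother, precedes) triples, where
    `precedes` is True if the dependent occurred to the left of its mother.
    """
    counts = defaultdict(lambda: [0, 0])  # [follows, precedes]
    for dependent, mother, precedes in observations:
        counts[(dependent, mother)][1 if precedes else 0] += 1
    return {pair: c[1] / (c[0] + c[1]) for pair, c in counts.items()}


def preferred_side(model, dependent, mother, default=0.5):
    """Side of the mother on which the dependent is more likely to appear."""
    return "left" if model.get((dependent, mother), default) >= 0.5 else "right"


# Toy observations, invented for illustration.
TOY_OBSERVATIONS = [("no", "estimate", True),
                    ("no", "estimate", True),
                    ("for", "estimate", False)]
```

The only difference between the first two experiments, on this view, is where the observations come from: randomly imposed tree structures in the baseline, and the XTAG derivation tree corpus in the second experiment.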
As in the case of machine translation,
evaluation in generation is a complex issue. We use
two metrics suggested in the MT literature
(Alshawi et al., 1998) based on string edit distance
between the output of the generation system
and the reference corpus string from the WSJ.
These metrics, simple accuracy and generation
accuracy, allow us to evaluate without human
intervention, automatically and objectively.4
Simple accuracy is computed from the number
of insertion (I), deletion (D) and substitution (S)
errors between the target language strings in the
test corpus and the strings produced by the
generation model. The metric is summarized in
Equation (1). R is the number of tokens in the target
string. This metric is similar to the string
distance metric used for measuring speech
recognition accuracy.

\[ \mathrm{SimpleAccuracy} = 1 - \frac{I + D + S}{R} \tag{1} \]
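Equation (1) can be computed with a standard edit-distance alignment. The following is a minimal sketch, assuming unit costs for all three error types and whitespace-tokenized input; the function names edit_counts and simple_accuracy are hypothetical.

```python
def edit_counts(ref, hyp):
    """Return (insertions, deletions, substitutions) of a minimum-cost
    alignment between the reference tokens and the hypothesis tokens."""
    n, m = len(ref), len(hyp)
    # cost[i][j] is the edit distance between ref[:i] and hyp[:j].
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Trace back through the table to count each error type.
    ins = dele = subs = 0
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            subs += ref[i - 1] != hyp[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dele += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return ins, dele, subs


def simple_accuracy(ref, hyp):
    """Equation (1): 1 - (I + D + S) / R, with R the reference length."""
    ins, dele, subs = edit_counts(ref, hyp)
    return 1 - (ins + dele + subs) / len(ref)
```

For instance, simple_accuracy("a b c d".split(), "a c d".split()) counts a single deletion against a reference of four tokens.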
4We do not address the issue of whether these metrics
can be used for comparative evaluation of other
generation systems.
Tree Model                  Simple     Generation   Average time
                            Accuracy   Accuracy     per sentence
Baseline LR Model           41.2%      56.2%        186ms
Treebank derived LR Model   52.9%      66.8%        129ms
Supertag-based Model        58.9%      72.4%        517ms

Table 1: Performance results from the three tree models.
Unlike speech recognition, the task of
generation involves reordering of tokens. The simple
accuracy metric, however, penalizes a misplaced
token twice, as a deletion from its expected
position and insertion at a different position. We use
a second metric, Generation Accuracy, shown in
Equation (2), which treats deletion of a token
at one location in the string and the insertion
of the same token at another location in the
string as one single movement error (M). This
is in addition to the remaining insertions (I')
and deletions (D').

\[ \mathrm{GenerationAccuracy} = 1 - \frac{M + I' + D' + S}{R} \tag{2} \]
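One possible way to obtain the quantities in Equation (2) is sketched below. The alignment procedure is not spelled out in the text, so the sketch makes its own assumptions: it aligns the two token sequences with Python's difflib.SequenceMatcher, pairs each deleted token with an identical inserted token as one movement, and counts leftover deletion/insertion pairs as substitutions. The function name generation_accuracy is hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def generation_accuracy(ref, hyp):
    """Equation (2): 1 - (M + I' + D' + S) / R, under the assumptions above."""
    deleted, inserted = Counter(), Counter()
    matcher = SequenceMatcher(None, ref, hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.update(ref[i1:i2])   # reference tokens left unmatched
        if tag in ("insert", "replace"):
            inserted.update(hyp[j1:j2])  # hypothesis tokens left unmatched
    # A token deleted in one place and inserted elsewhere is one movement (M).
    moves = sum((deleted & inserted).values())
    remaining_del = sum(deleted.values()) - moves
    remaining_ins = sum(inserted.values()) - moves
    # Assumption: leftover deletion/insertion pairs count as substitutions (S).
    subs = min(remaining_del, remaining_ins)
    errors = moves + (remaining_ins - subs) + (remaining_del - subs) + subs
    return 1 - errors / len(ref)
```

A token that has merely moved, as in generation_accuracy("a b c d".split(), "b c d a".split()), is thus charged once rather than twice.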
The simple accuracy, generation accuracy and
the average time for generation of each test
sentence for the three experiments are tabulated in
Table 1. The test set consisted of 100 randomly
chosen WSJ sentences with an average length of
16 words. As can be seen, the supertag-based
model improves over the LR model derived from
annotated data and both models improve over
the baseline LR model.
Supertags incorporate richer information
such as argument and adjunct distinction, and
number and types of arguments. We expect to
improve the performance of the supertag-based
model by taking these features into account.
In ongoing work, we have developed
tree-based metrics in addition to the string-based
ones presented here, in order to evaluate stochastic
generation models. We have also attempted to
correlate these quantitative metrics with human
qualitative judgements. A detailed discussion
of these experiments and results is presented
in (Bangalore et al., 2000).
5 Comparison with Langkilde & Knight
Langkilde and Knight (1998a) use a
hand-crafted grammar that maps semantic
representations to sequences of words with linearization
constraints. A complex semantic structure is
translated to a lattice, and a bigram language
model then chooses among the possible surface
strings encoded in the lattice.
The system of Langkilde & Knight, Nitrogen,
is similar to FERGUS in that generation is
divided into two phases, the first of which results
in a lattice from which a surface string is chosen
during the second phase using a language model
(in our case a trigram model, in Nitrogen's case
a bigram model). However, the first phases are
quite different. In FERGUS, we start with a
lexical predicate-argument structure while in
Nitrogen, a more semantic input is used. FERGUS
could easily be augmented with a preprocessor
that maps a semantic representation to our
syntactic input; this is not the focus of our research.
However, there are two more important
differences. First, the hand-crafted grammar in
Nitrogen maps directly from semantics to a linear
representation, skipping the arborescent
representation usually favored for the representation
of syntax. There is no stochastic tree model,
since there are no trees. In FERGUS, initial
choices are made stochastically based on the
tree representation in the Tree Chooser. This
allows us to capture stochastically certain
long-distance effects which n-grams cannot, such as
separation of parts of a collocation (such as
perform an operation) through interposing
adjuncts (John performed a long, somewhat
tedious, and quite frustrating operation on his
border collie). Second, the hand-crafted
grammar used in FERGUS was crafted independently
from the need for generation and is a purely
declarative representation of English syntax. As
such, we can use it to handle morphological
effects such as agreement, which cannot in
general be done by an n-gram model and which are,
at the same time, descriptively straightforward
and which are handled by all non-stochastic
generation modules.
6 Conclusion and Outlook 
We have presented empirical evidence that
using a tree model in addition to a language model
can improve stochastic NLG.
FERGUS as presented in this paper is not
ready to be used as a module in applications.
Specifically, we will add a morphological
component, a component that handles function words
(auxiliaries, determiners), and a component
that handles punctuation. In all three cases, we
will provide both knowledge-based and
stochastic components, with the aim of comparing their
behaviors, and using one type as a back-up for
the other type. Finally, we will explore
FERGUS when applied to a language for which a
much more limited XTAG grammar is available
(for example, specifying only the basic sentence
word order as, say, SVO, and specifying
subject-verb agreement). In the long run, we intend
FERGUS to become a flexible system which will
use hand-crafted knowledge as much as possible
and stochastic models as much as necessary.

References 

Hiyan Alshawi, Srinivas Bangalore, and Shona
Douglas. 1998. Automatic acquisition of hierarchical transduction models for machine translation. In Proceedings of the 36th Annual
Meeting of the Association for Computational Linguistics, Montreal, Canada.

Srinivas Bangalore and Aravind Joshi. 1999. 
Supertagging: An approach to almost parsing. Computational Linguistics, 25(2). 

Srinivas Bangalore, Owen Rambow, and Steve
Whittaker. 2000. Evaluation Metrics for
Generation. In Proceedings of International
Conference on Natural Language Generation,
Mitzpe Ramon, Israel.

Aravind K. Joshi. 1987a. An introduction to 
Tree Adjoining Grammars. In A. Manaster- 
Ramer, editor, Mathematics of Language, 
pages 87-115. John Benjamins, Amsterdam. 

Aravind K. Joshi. 1987b. The relevance of
tree adjoining grammar to generation. In
Gerard Kempen, editor, Natural Language
Generation: New Results in Artificial
Intelligence, Psychology and Linguistics, pages
233-252. Kluwer Academic Publishers,
Dordrecht/Boston/Lancaster.

Irene Langkilde and Kevin Knight. 1998a. Generation that exploits corpus-based statistical knowledge. In 36th Meeting of the Association for Computational Linguistics and 17th International Conference on Computational
Linguistics (COLING-ACL'98), pages 704-710, Montréal, Canada.

Irene Langkilde and Kevin Knight. 1998b. 
The practical value of n-grams in genera- 
tion. In Proceedings of the Ninth Interna- 
tional Natural Language Generation Work- 
shop (INLG'98), Niagara-on-the-Lake, On- 
tario. 

Irene Langkilde and Kevin Knight. 2000. 
Forest-based statistical sentence generation. 
In Proceedings of First North American ACL, 
Seattle, USA, May. 

Robert Malouf. 1999. Two methods for
predicting the order of prenominal adjectives in
English. In Proceedings of CLIN99.

David D. McDonald and James D. Pustejovsky.
1985. TAGs as a grammatical formalism for
generation. In 23rd Meeting of the Association for Computational Linguistics (ACL'85),
pages 94-103, Chicago, IL.

Owen Rambow and Tanya Korelsky. 1992. Applied text generation. In Third Conference on
Applied Natural Language Processing, pages
40-47, Trento, Italy.

Adwait Ratnaparkhi. 2000. Trainable methods
for surface natural language generation. In
Proceedings of First North American ACL,
Seattle, USA, May.

Ehud Reiter. 1994. Has a consensus NL gen- 
eration architecture appeared, and is it psy- 
cholinguistically plausible? In Proceedings of 
the 7th International Workshop on Natural 
Language Generation, pages 163-170, Maine. 

The XTAG-Group. 1999. A lexicalized Tree
Adjoining Grammar for English. Technical
Report http://www.cis.upenn.edu/~xtag/tech-report/tech-report.html,
The Institute for Research in Cognitive Science,
University of Pennsylvania.
