Integrating a Large-scale, Reusable Lexicon with a Natural 
Language Generator 
Hongyan Jing
Department of Computer Science 
Columbia University 
New York, NY 10027, USA 
hjing@cs.columbia.edu 
Yael Dahan Netzer 
Department of Computer Science 
Ben-Gurion University 
Be'er-Sheva, 84105, Israel 
yaeln@cs.bgu.ac.il 
Michael Elhadad 
Department of Computer Science 
Ben-Gurion University 
Be'er-Sheva, 84105, Israel 
elhadad@cs.bgu.ac.il 
Kathleen R. McKeown 
Department of Computer Science 
Columbia University 
New York, NY 10027, USA 
kathy@cs.columbia.edu 
Abstract 
This paper presents the integration of a large- 
scale, reusable lexicon for generation with the 
FUF/SURGE unification-based syntactic realizer. 
The lexicon was combined from multiple existing re- 
sources in a semi-automatic process. The integra- 
tion is a multi-step unification process. This inte- 
gration allows the reuse of lexical, syntactic, and 
semantic knowledge encoded in the lexicon in the 
development of a lexical chooser module in a generation
system. The lexicon also brings other benefits
to a generation system: for example, the ability to 
generate many lexical and syntactic paraphrases and 
the ability to avoid non-grammatical output. 
1 Introduction 
Natural language generation requires lexical, syntactic,
and semantic knowledge in order to produce
meaningful and fluent output. Such knowledge is 
often hand-coded anew when a different application 
is developed. We present in this paper the integra- 
tion of a large-scale, reusable lexicon with a natural 
language generator, FUF/SURGE (Elhadad, 1992;
Robin, 1994); we show that by integrating the lexi- 
con with FUF/SURGE as a tactical component, we 
can reuse the knowledge encoded in the lexicon and 
automate to some extent the development of the lex- 
ical realization component in a generation applica- 
tion. 
The integration of the lexicon with FUF/SURGE 
also brings other benefits to generation, including 
the possibility to accept a semantic input at the 
level of WordNet synsets, the production of lexical 
and syntactic paraphrases, the prevention of non- 
grammatical output, reuse across applications, and 
wide coverage. 
We present the process of integrating the lexicon 
with FUF/SURGE, including how to represent the
lexicon in FUF format, how to unify input with the 
lexicon incrementally to generate more sophisticated 
and informative representations, and how to design 
an appropriate semantic input format so that the 
integration of the lexicon and FUF/SURGE can be 
done easily. 
This paper is organized as follows. In Section 2, 
we explain why a reusable lexical chooser for gen- 
eration needs to be developed. In Section 3, we 
present the large-scale, reusable lexicon which we 
combined from multiple resources, and illustrate its 
benefits to generation by examples. In Section 4, we 
describe the process of integrating the lexicon with 
FUF/SURGE, which includes four unification steps, 
with each step adding additional lexical or syntac- 
tic information. Other applications and comparison 
with related work are presented in Section 5. Finally, 
we conclude by discussing future work. 
2 Building a reusable lexical chooser 
for generation 
While reusable components have been widely used in 
generation applications, the concept of a "reusable 
lexical chooser" for generation remains novel. 
There are two main reasons why such a lexical 
chooser has not been developed in the past: 
1. In the overall architecture of a generator, the 
lexical chooser is an internal component that 
depends on the semantic representation and formalism
and on the syntactic realizer used by the
application. 
2. The lexical chooser links conceptual elements to 
lexical items. Conceptual elements are by defi- 
nition domain and application dependent (they 
are the primitive concepts used in an applica- 
tion knowledge base). These primitives are not 
easily ported from application to application. 
The emergence of standard architectures for generators
(RAGS, (Reiter, 1994)) and the possibility
to use a standard syntactic realizer answer the first 
issue. 
To address the second issue, one must realize that
although the whole lexical chooser cannot be made
domain-independent, major parts of it can be made reusable.
The main argument is that lexical knowledge is mod- 
ular. Therefore, while choice of words is constrained 
by domain-specific conceptual knowledge (what in- 
formation the sentences are to represent) on the one 
hand, it is also affected by several other dimensions: 
* inter-lexical constraints: collocations among
words
* pragmatic constraints: connotations of words
* stylistic constraints: familiarity of words
* syntactic constraints: government patterns of
words, e.g., thematic structure of verbs.
We show in this paper how the separation of the 
syntactic and conceptual interfaces of lexical item 
definitions allows us to reuse a large amount of lexical
knowledge across applications.
3 The lexicon and its benefits to 
generation 
3.1 A large-scale, reusable lexicon for 
generation 
Natural language generation starts from semantic
concepts and then finds words to realize such seman- 
tic concepts. Most existing lexical resources, how- 
ever, are indexed by words rather than by semantic 
concepts. Such resources, therefore, can not be used 
for generation directly. Moreover, generation needs 
different types of knowledge, which typically are en- 
coded in different resources. However, the different 
representation formats used by these resources make 
it impossible to use them simultaneously in a single 
system. 
To overcome these limitations, we built a large- 
scale, reusable lexicon for generation by combining 
multiple existing resources. The resources that are 
combined include: 
* The WordNet Lexical Database (Miller et al.,
1990). WordNet is the largest lexical database
to date, consisting of over 120,000 unique words
(version 1.6). It also encodes many types of
lexical relations between words, including synonymy,
antonymy, and many more.
* English Verb Classes and Alternations
(EVCA) (Levin, 1993). It categorized 3,104
verbs into classes based on their syntactic
properties and studied verb alternations. An
alternation is a variation in the realization of
verb arguments. For example, the alternation
"there-insertion" transforms A ship appeared
on the horizon to There appeared a ship on the
horizon. A total of 80 alternations for 3,104
verbs were studied.
* The COMLEX syntax dictionary (Grishman et
al., 1994). COMLEX contains syntactic information
for over 38,000 English words.
* The Brown Corpus tagged with WordNet senses
(Miller et al., 1993). We use this corpus for
frequency measurement.
In combining these resources, we focused on verbs, 
since they play a more important role in deciding 
sentence structures. The combined lexicon includes 
rich lexical and syntactic knowledge for 5,676 verbs. 
It is indexed by WordNet synsets (which are at the
semantic concept level) as required by the generation 
task. The knowledge in the lexicon includes: 
* A complete list of subcategorizations for each
sense of a verb.
* A large variety of alternations for each sense of
a verb.
* Frequency of lexical items and verb subcategorizations
in the tagged Brown corpus.
* Rich lexical relations between words.
The sample entry for the verb "appear" is shown 
in Figure 1. It shows that the verb appear has eight 
senses (the sense distinctions come from WordNet). 
For each sense, the lexicon lists all the applicable 
subcategorization for that particular sense of the 
verb. The subcategorizations are represented using 
the same format as in COMLEX. For each sense, 
the lexicon also lists applicable alternations, which 
we encoded based on the information in EVCA. In 
addition, for each subcategorization and alternation, 
the lexicon lists the semantic category constraints on 
verb arguments. In the figure, we omitted the fre- 
quency information derived from Brown Corpus and 
lexical relations (the lexical relations are encoded in 
WordNet). 
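As an illustration only, the structure of such an entry can be sketched in Python. The class and field names below (Frame, Sense, arg_types, prepositions, alternation) are our own assumptions and are not part of the lexicon's actual FUF encoding; the data are transcribed from senses 1, 2 and 8 of Figure 1, and treating INTRANS and THERE-V-SUBJ as two separate frames is one possible reading of the figure.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Frame:
    """One subcategorization frame, named with a COMLEX-style label."""
    name: str                               # e.g. "ADJP-PRED-RS"
    arg_types: Tuple[str, ...] = ()         # :SO values, e.g. ("sb", "sth")
    prepositions: Tuple[str, ...] = ()      # :PVAL values, if any
    alternation: Optional[str] = None       # :ALT value, if any


@dataclass
class Sense:
    number: int
    gloss: str
    frames: List[Frame] = field(default_factory=list)


# Hypothetical in-memory form of the entry for "appear" (senses 3-7 omitted).
APPEAR = {
    1: Sense(1, "give an impression", [
        Frame("PP-TO-INF-RS", ("sb",), ("to",)),
        Frame("TO-INF-RS", ("sb",)),
        Frame("NP-PRED-RS", ("sb",)),
        Frame("ADJP-PRED-RS", ("sb", "sth")),
    ]),
    2: Sense(2, "become visible", [
        Frame("PP-TO-INF-RS", ("sb", "sth"), ("to",)),
        Frame("INTRANS", ("sb", "sth")),
        Frame("THERE-V-SUBJ", ("sb", "sth"), alternation="there-insertion"),
    ]),
    8: Sense(8, "have an outward expression", [
        Frame("NP-PRED-RS", ("sth",)),
        Frame("ADJP-PRED-RS", ("sb", "sth")),
    ]),
}


def frame_names(entry, sense_number):
    """Return the frame labels licensed by one sense of a verb."""
    return [frame.name for frame in entry[sense_number].frames]


def alternations(entry, sense_number):
    """Return the alternations licensed by one sense of a verb."""
    return sorted({f.alternation for f in entry[sense_number].frames
                   if f.alternation})
```

Indexing by sense number first mirrors the point made above: subcategorizations and alternations are recorded per verb sense, not per verb.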
The construction of the lexicon is semi-automatic. 
First, COMLEX and EVCA were merged, produc- 
ing a list of syntactic subcategorizations and alter- 
nations for each verb. Distinctions in these syntac- 
tic restrictions according to each sense of a verb 
are achieved in the second stage, where WordNet 
is merged with the result of the first step. Finally, 
the corpus information is added, complementing the 
static resources with actual usage counts for each 
syntactic pattern. For a detailed description of the 
combination process, refer to (Jing and McKeown,
1998).
appear:
sense 1 give an impression
((PP-TO-INF-RS :PVAL ("to") :SO ((sb, -)))
 (TO-INF-RS :SO ((sb, -)))
 (NP-PRED-RS :SO ((sb, -)))
 (ADJP-PRED-RS :SO ((sb, -) (sth, -))))
sense 2 become visible
((PP-TO-INF-RS :PVAL ("to")
               :SO ((sb, -) (sth, -)))
 (INTRANS THERE-V-SUBJ
          :ALT there-insertion
          :SO ((sb, -) (sth, -))))
...
sense 8 have an outward expression
((NP-PRED-RS :SO ((sth, -)))
 (ADJP-PRED-RS :SO ((sb, -) (sth, -))))
Figure 1: Lexicon entry for the verb appear
3.2 The benefits of the lexicon 
There are a number of benefits that this combined 
lexicon can bring to language generation.
First, the use of synsets as semantic tags can 
help map an application conceptual model to lexi- 
cal items. Whenever application concepts are repre- 
sented at the abstraction level of a WordNet synset, 
they can be directly accepted as input to the lexicon.
In this way, the lexicon can lead to
the generation of many lexical paraphrases. For example,
{look, seem, appear} is a WordNet synset; it
includes a list of words that can convey the semantic
concept "give an impression of". We can
use synsets to find words that can lexicalize the semantic
concepts in the semantic input. By choosing
different words in a synset, we can therefore gen- 
erate lexical paraphrases. For instance, using the 
above synset, the system can generate the following 
paraphrases:
"He seems happy."
"He looks happy."
"He appears happy."
Secondly, the subcategorization information in the 
lexicon prevents generating a non-grammatical out- 
put. As shown in Figure 1, the lexicon lists appli- 
cable subcategorizations for each sense of a verb. It 
will not allow the generation of sentences like 
"*He convinced me in his innocence" 
(wrong preposition) 
"*He convinced to go to the party" 
(missing object)
"*The bread cuts"
(missing adverb (e.g., "easily"))
"*The book consists three parts"
(missing preposition)
In addition, alternation information can help generate
syntactic paraphrases. For instance, using
the "simple reciprocal intransitive" alternation, the
system can generate the following syntactic paraphrases:
"Brenda agreed with Molly."
"Brenda and Molly agreed."
"Brenda and Molly agreed with each other." 
Finally, the corpus frequency information can help 
the lexical choice process. When multiple words can
be used to realize a semantic concept, the system 
can use corpus frequency information in addition 
to other constraints to choose the most appropriate 
word. 
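A minimal sketch of frequency-based selection follows. The function name and the counts are invented for illustration; they are not figures from the tagged Brown Corpus, and a real chooser would apply the other constraints before falling back on frequency.

```python
def choose_by_frequency(candidates, counts):
    """Pick the candidate word with the highest corpus count.

    Words missing from `counts` are treated as unseen (count 0).
    Ties are resolved in favor of the earliest candidate.
    """
    return max(candidates, key=lambda word: counts.get(word, 0))


# Made-up counts for the synset {look, seem, appear}; illustration only.
toy_counts = {"look": 30, "seem": 50, "appear": 20}
best = choose_by_frequency(["look", "seem", "appear"], toy_counts)
```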
The knowledge encoded in the lexicon is general, 
thus it can be used in different applications. The 
lexicon has wide coverage: the final lexicon consists 
of 5,676 verbs in total, over 14,100 senses (on average 
2.5 senses/verb), and over 11,000 semantic concepts 
(synsets). It uses 147 patterns to represent the sub- 
categorizations and includes 80 alternations. 
To exploit the lexicon's many benefits, its format 
must be made compatible with the architecture of a 
generator. We have integrated the lexicon with the 
FUF/SURGE syntactic realizer to form a combined 
lexico-grammar. 
4 Integration Process 
In this section, we first explain how lexical choosers 
are interfaced with FUF/SURGE. We then describe 
step by step how the lexicon is integrated with 
FUF/SURGE and show that this integration pro- 
cess helps to automate the development of a lexical 
realization component. 
4.1 FUF/SURGE and the lexical chooser 
FUF (Elhadad, 1992) uses a functional unification 
formalism for generation. It unifies the input that a 
user provides with a grammar to generate sentences. 
SURGE (Elhadad and Robin, 1996) is a comprehensive
English grammar written in FUF. The role of
a lexical realization component is to map a semantic 
representation drawn from the application domain 
to an input format acceptable by SURGE, adding 
necessary lexical and syntactic information during 
this process. 
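The core operation can be illustrated with a toy unifier over nested Python dictionaries. This is only a sketch of the general idea of unifying feature structures: it is not FUF's algorithm, and it ignores paths, disjunctions, and the control features that FUF provides. The function name unify and the use of None to signal failure are our own choices.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts with atomic leaves.

    Returns the merged structure, or None when two atomic values clash.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None          # clash somewhere below this feature
                result[key] = merged
            else:
                result[key] = value      # new information is simply added
        return result
    return a if a == b else None         # atoms must match exactly


semantic = {"relation": {"concept": "become visible"}}
lexical = {"relation": {"concept": "become visible", "word": "appear"}}
enriched = unify(semantic, lexical)
clash = unify({"cat": "np"}, {"cat": "clause"})
```

Unification only ever adds compatible information, which is why each step of a lexicalization process can enrich the input without overwriting what earlier steps contributed.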
Figure 2 shows a sample semantic input (a), the 
lexicalization module that is used to map this semantic
input to SURGE input (b), and the final
SURGE input (c), taken from a real application
system (Passoneau et al., 1996). The functions of the
lexicalization module include selecting words that
can be used to realize the semantic concepts in the
input, adding syntactic features, and mapping the
arguments in the semantic input to the thematic
roles in SURGE.
Sentence: It has 24 activities, including 20 tasks and four decisions.
[Feature-structure diagrams for panels (a)-(c) not reproduced.]
(a) The semantic input (i.e., input of lexicalization module)
(b) The lexicalization module
(c) The SURGE input (i.e., output of lexicalization module)
Figure 2: A sample lexicalization component
The development of the lexicalizer component was 
done by hand in the past. Furthermore, for each
new application, a new lexicalizer component had
to be written despite the fact that some lexical and 
syntactic information is repeatedly used in different 
applications. The integration process we describe, 
however, partially automates this process. 
4.2 The integration steps 
The integration of the lexicon with FUF/SURGE 
is done through incremental unification, using four 
unification steps as shown in Figure 3. Each step 
adds information to the semantic input, and at the 
end of the four unification steps, the semantic input 
has been mapped to the SURGE input format. 
(1) The semantic input 
Different generation systems usually use different 
representation formats for semantic input. Some 
systems use case roles; some systems use flat
attribute-value representation (Kukich et al., 1994). 
For the integrated lexicon and FUF/SURGE pack- 
age to be easily pluggable in applications, we need to 
define a standard semantic input format. It should 
be designed in such a way that applications can eas- 
ily adapt their particular semantic inputs to this 
standard format. It should also be easily mapped 
to the SURGE input format. 
In this paper, we only consider the issue of seman- 
tic input format for the expression of the predicate- 
argument relation. Two questions need to be an- 
swered in the design of the standard semantic input 
format: one, how to represent semantic concepts; 
and two, how to represent the predicate-argument 
relation. 
We use WordNet synsets to represent semantic 
concepts. The input can refer to synsets in several 
ways: either using a globally unique synset number¹
or by specifying a word and its sense number
in WordNet. 
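The two ways of referring to a synset can be sketched as follows. The index numbers 9001 and 9002 are placeholders, not real WordNet identifiers, and the sense numbers given for look and seem are likewise hypothetical; only the two senses of appear follow Figure 1.

```python
# Toy synset table; identifiers and some sense numbers are placeholders.
SYNSETS = {
    9001: {"gloss": "give an impression",
           "members": {("look", 1), ("seem", 1), ("appear", 1)}},
    9002: {"gloss": "become visible",
           "members": {("appear", 2)}},
}


def resolve(ref):
    """Resolve a synset reference to a synset index number.

    `ref` is either an index number or a (word, sense_number) pair.
    Returns None when the reference does not match any synset.
    """
    if isinstance(ref, int):
        return ref if ref in SYNSETS else None
    for synset_id, synset in SYNSETS.items():
        if tuple(ref) in synset["members"]:
            return synset_id
    return None
```

Because a word sense belongs to exactly one synset, the (word, sense number) form identifies the same concept as the numeric form.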
The representation of verb arguments is a more 
complicated issue. Case roles are frequently used in 
generation systems to represent verb arguments in 
semantic inputs. For example, (Dorr et al., 1998) 
used 20 case roles in their lexical conceptual structure,
corresponding to underlying positions in a compositional
lexical structure. (Langkilde and Knight,
1998) use a list of case roles in their interlingua
representations.
We decided to use numbered arguments (similar to
the DSyntR in MTT (Mel'cuk and Perstov, 1987))
instead of case roles. The difference between the two
is not critical, but the numbered argument approach
avoids the need to commit the lexicon to a specific
ontology and seems to be easier to learn².
¹ Since there are a huge number of synsets in WordNet, we
will provide a searchable database of synsets so that users can
look up a synset and its index number easily. For a particular
application, users can adapt the synsets to their specific domain,
such as removing non-relevant synsets, merging synsets,
and relabeling the synsets for convenience, as discussed in
(Jing, 1998).
Figure 4 shows a sample semantic input. For easy 
understanding, we refer to the semantic concepts 
using their definitions rather than numerical index 
numbers. There are two arguments in the input. 
The intended output sentence for this semantic in- 
put is "A boat appeared on the horizon" or its para- 
phrases. 
(2) Lexical unification 
In this step, we map the semantic concepts in the
semantic input to concrete words. To do this, we use 
the synsets in WordNet. All the words in the same 
synset can be used to convey the same semantic con- 
cept. For the above example, the semantic concepts 
"become visible" and "a small vessel for travel on 
water" can be realized by the verb appear and
the noun boat respectively. This is the step that can 
produce lexical paraphrases. Note that when the 
system chooses a word, it also determines the par- 
ticular sense number of the word, since a word as 
it belongs to a synset has a unique sense number in 
WordNet. 
We represented all the synsets in WordNet in FUF
format. Each synset includes its numerical index
number and the list of word senses included in the
synset. This lexical unification works for both
nouns and verbs.
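As a rough sketch of this step, the code below enumerates every way of lexicalizing the concepts in a semantic input by picking one word from each concept's synset. The table CONCEPT_TO_WORDS and the function name are illustrative assumptions; in particular, listing ship next to boat is a made-up synonym used only to show how lexical paraphrases arise, and the actual system performs this step by FUF unification over synsets, not by a dictionary lookup.

```python
from itertools import product

# Toy concept-to-words table keyed by gloss; membership is illustrative only.
CONCEPT_TO_WORDS = {
    "become visible": ["appear"],
    "a small vessel for travel on water": ["boat", "ship"],
    "the line at which the sky and Earth appear to meet": ["horizon"],
}


def lexicalizations(semantic_input):
    """Yield copies of the input with a word chosen for every concept.

    `semantic_input` has a "relation" concept and numbered "args".
    Each combination of word choices is one lexical paraphrase.
    """
    slots = [("relation", semantic_input["relation"])]
    slots += sorted(semantic_input["args"].items())
    choices = [CONCEPT_TO_WORDS[concept] for _, concept in slots]
    for words in product(*choices):
        enriched = {"relation": None, "args": {}}
        for (slot, concept), word in zip(slots, words):
            node = {"concept": concept, "word": word}
            if slot == "relation":
                enriched["relation"] = node
            else:
                enriched["args"][slot] = node
        yield enriched


example = {
    "relation": "become visible",
    "args": {1: "a small vessel for travel on water",
             2: "the line at which the sky and Earth appear to meet"},
}
results = list(lexicalizations(example))
```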
(3) Structural unification 
After the system has chosen a verb (actually a 
particular sense of a verb), it uses that information 
as an index to unify with the subcategorization and 
alternations the particular verb sense has. This step 
adds additional syntactic information to the origi- 
nal input and has the capacity to produce syntactic 
paraphrases using alternation information. 
(4) Constraints on the number of arguments 
Next, we use the constraints that a subcategoriza- 
tion has on the number of arguments it requires to 
restrict unification with subcategorization patterns. 
We use 147 possible patterns. For example, the input
in Figure 4 has two arguments. Although INTRANS
(meaning intransitive) is listed as a possible
subcategorization pattern for "appear" (see sense
2 in Figure 1), the input will fail to unify with it
since INTRANS requires a single argument only.
This prevents the generation of non-grammatical
sentences. This step adds a feature which specifies 
the transitivity of the verb to FUF/SURGE input, 
selecting one from the lexicon when there is more 
than one possibility for the given verb. 
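The filtering effect of this step can be sketched as a simple arity check. The table FRAME_ARITY and the label "PP" are assumptions made for the example: the text above states only that INTRANS takes a single argument, and the three-argument value for NP-WITH-NP is read off Figure 5.

```python
# Number of arguments each pattern requires (illustrative subset only).
FRAME_ARITY = {"INTRANS": 1, "PP": 2, "NP-WITH-NP": 3}


def frames_matching_arity(candidate_frames, args):
    """Keep only the frames whose required argument count fits the input."""
    return [frame for frame in candidate_frames
            if FRAME_ARITY.get(frame) == len(args)]


# The input of Figure 4 has two numbered arguments.
figure4_args = {1: "a small vessel for travel on water",
                2: "the line at which the sky and Earth appear to meet"}
surviving = frames_matching_arity(["INTRANS", "PP"], figure4_args)
```

In the real system the mismatch shows up as a unification failure, not as an explicit count comparison, but the outcome is the same: INTRANS is ruled out for a two-argument input.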
² The difference between numbered arguments and labeled
roles is similar to that between named semantic primitives and
synsets in WordNet. Verb classes share the same definition
of which argument is denoted by 1, 2, etc. if they share some
syntactic properties as far as argument-taking properties are
concerned.
[Figure 3 labels: Semantic input; Synsets; Verbs lexicon; Structs; Input for SURGE]
Figure 3: The integration process
relation [ concept "become visible" ]
args     [ 1 [ concept "a small vessel for travel on water" ]
           2 [ concept "the line at which the sky and Earth appear to meet" ] ]
Figure 4: The semantic input using numbered arguments
(5) Mapping structures to SURGE input 
In the last step, the subcategorization and alter- 
nations are mapped to SURGE input format. The 
mapping from subcategorizations to SURGE input 
was manually encoded in the lexicon for each one 
of the 147 patterns. This mapping information can 
be reused for all applications, which is more effi- 
cient than composing SURGE input in the lexical- 
ization component of each different application. Fig- 
ure 5 shows how the subcategorization NP-WITH- 
NP (e.g., The clown amused the children with his 
antics) is mapped to the SURGE input format. This 
mapping mainly involves matching the numbered ar- 
guments in the semantic input to appropriate lexical 
roles and syntactic categories so that FUF/SURGE
can generate them in the correct order. 
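The kind of mapping stored for a pattern can be sketched as a function from numbered arguments to a nested structure loosely modeled on Figure 5. The dictionary layout and the function name map_np_with_np are illustrative assumptions; this is not actual SURGE input syntax.

```python
def map_np_with_np(verb, args):
    """Build a SURGE-like clause description for the NP-WITH-NP pattern.

    `args` maps the numbers 1, 2, 3 to already lexicalized arguments.
    Argument 3 is realized inside a prepositional phrase headed by "with".
    """
    return {
        "cat": "clause",
        "proc": {
            "type": "lexical",
            "lex": verb,
            "subcat": {
                1: {"cat": "np", "content": args[1]},
                2: {"cat": "np", "content": args[2]},
                3: {"cat": "pp", "prep": {"lex": "with"}, "np": args[3]},
            },
        },
        "lex-roles": {1: args[1], 2: args[2], 3: args[3]},
    }


# "The clown amused the children with his antics"
clause = map_np_with_np(
    "amuse",
    {1: {"word": "clown"}, 2: {"word": "child"}, 3: {"word": "antic"}},
)
```

Since the function depends only on the pattern, one such mapping per pattern can be shared by every verb and every application that uses that pattern.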
The final SURGE input for the sentence "A boat
appeared on the horizon" is shown in Figure 6. Us- 
ing the "THERE-INSERTION" alternation that the 
verb "appear" (sense 2) authorizes, the system can 
also generate the syntactic paraphrase "There ap- 
peared a boat on the horizon". The SURGE input 
the system generates for "There appeared a boat on 
the horizon" is very different from that for "A boat
appeared on the horizon". 
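At the level of surface strings, the effect of this alternation can be sketched as below. This is a string-template illustration only, under the assumption that the verb is already inflected; the system itself produces the two variants by building two different SURGE inputs, not by rearranging words.

```python
def there_insertion_paraphrases(subject, verb_past, location):
    """Return the plain sentence and its there-insertion variant."""
    plain = f"{subject[0].upper()}{subject[1:]} {verb_past} {location}."
    shifted = f"There {verb_past} {subject} {location}."
    return [plain, shifted]


variants = there_insertion_paraphrases("a boat", "appeared", "on the horizon")
```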
It is possible that for a given application some 
generated paraphrases are not appropriate. In this 
case, users can edit the synsets and the alternations 
to filter out the paraphrases they do not want.
The four unification steps are completely automatic.
The system can send feedback upon failure
of unification.
struct     np-with-np
relation   [1]<...>
args       [ 1 [2]<...>
             2 [3]<...>
             3 [4]<...> ]
proc       [ type   lexical
             lex    [1]
             subcat [ 1 [2] [ cat np ]
                      2 [3] [ cat np ]
                      3 [ cat pp
                          prep [ lex "with" ]
                          np [4] ] ] ]
lex-roles  [ 1 [2]  2 [3]  3 [4] ]
Figure 5: Mapping subcategorization "NP-WITH-NP"
to SURGE input
5 Related Work 
The lexicon, after it is integrated with
FUF/SURGE, can also be used for other tasks in
language generation. For example, revision (Robin,
1994) is a technique for building semantic inputs
incrementally. The revision process decides whether
it is appropriate to attach a new constituent to the
current semantic input, for example, by adding an
object or an adverb. Such decisions are constrained
by syntactic properties of verbs. The integrated
lexicon is useful to verify these properties.
relation       [ concept "become visible"
                 word    "appear" (a) ]
args           [ 1 [ concept "a small vessel for travel on water"
                     word    "boat" (a) ]
                 2 [ concept "the line at which the sky and Earth appear to meet"
                     word    "horizon" (a) ] ]
struct         pp (b)
argl           2
cat            clause (c)
lexical-roles  <...> (d)
(a) Enriched in first step
(b) Enriched in second step
(c) Enriched in third step
(d) Enriched in fourth step
Figure 6: SURGE input for "A boat appeared on the horizon"
Nitrogen (Langkilde and Knight, 1998), a natural 
language generation system developed at ISI, also
includes a large-scale lexicon to support the genera- 
tion process. Given that Nitrogen and FUF/SURGE 
use very different methods for generation, the way 
that we integrate the lexicon with the generation sys- 
tem is also very different. Nitrogen combines sym- 
bolic rules with statistics learned from text corpora, 
while FUF/SURGE is based on Functional Unifica- 
tion Grammar. Other related work includes (Stede, 
1998), which suggests a lexicon structure for multi- 
lingual generation in a knowledge-based generation 
system. The main idea is to handle multilingual gen- 
eration in the same way as paraphrasing of the same 
language. Stede's work concerns mostly the lexical
semantics of the transitivity alternations. 
6 Conclusion 
We have presented in this paper the integration of 
a large-scale, reusable lexicon for generation with 
FUF/SURGE, a unification-based natural language
generator. This integration makes it possible to
reuse major parts of a lexical chooser, which is the
component in a generation system that is responsible
for mapping semantic inputs to surface generator
inputs. We show that although the whole lexical
chooser can not be made domain-independent, it is 
possible to reuse a large amount of lexical, syntactic, 
and semantic knowledge across applications. 
In addition, the lexicon brings other benefits to a generation
system, including the abilities to generate many
lexical paraphrases automatically, generate syntactic
paraphrases, avoid non-grammatical output, and
choose the most frequently used word when there is
more than one candidate word. Since the lexical,
syntactic, and semantic knowledge encoded in the 
lexicon is general and the lexicon has a wide cover- 
age, it can be reused for different applications. 
In the future, we plan to validate the paraphrases 
the lexicon can generate by asking human subjects to 
read the generated paraphrases and judge whether 
they are acceptable. We would like to investigate 
ways that can systematically filter out paraphrases 
that are considered unacceptable. We are also inter- 
ested in exploring the usage of this system in multi- 
lingual generation. 

References 
B. J. Dorr, N. Habash, and D. Traum. 1998.
A thematic hierarchy for efficient generation
from lexical-conceptual structure. Technical Report
CS-TR-3934, Institute for Advanced Computer Studies,
Department of Computer Science, University
of Maryland, October.
M. Elhadad and J. Robin. 1996. An overview of 
SURGE: a re-usable comprehensive syntactic re- 
alization component. In INLG'96, Brighton, UK. 
(demonstration session). 
M. Elhadad. 1992. Using Argumentation to Control 
Lexical Choice: A Functional Unification-Based
Approach. Ph.D. thesis, Department of Computer 
Science, Columbia University. 
R. Grishman, C. Macleod, and A. Meyers. 1994. 
COMLEX syntax: Building a computational 
lexicon. In Proceedings of COLING'94, Kyoto, 
Japan.
H. Jing and K. McKeown. 1998. Combining multiple,
large-scale resources in a reusable lexicon
for natural language generation. In Proceedings
of the 36th Annual Meeting of the Association for
Computational Linguistics and the 17th International
Conference on Computational Linguistics,
volume 1, pages 607-613, Université de Montréal,
Quebec, Canada, August.
H. Jing. 1998. Applying wordnet to natural lan- 
guage generation. In Proceedings of COLING- 
ACL'98 workshop on the Usage of WordNet in 
Natural Language Processing Systems, University 
of Montreal, Montreal, Canada, August. 
K. Kukich, K. McKeown, J. Shaw, J. Robin, N. Morgan,
and J. Phillips. 1994. User-needs analysis
and design methodology for an automated doc- 
ument generator. In A. Zampolli, N. Calzolari, 
and M. Palmer, editors, Current Issues in Com- 
putational Linguistics: In Honour of Don Walker. 
Kluwer Academic Press, Boston. 
I. Langkilde and K. Knight. 1998. The practical 
value of n-grams in generation. In INLG'98, pages 
248-255, Niagara-on-the-Lake, Canada, August. 
B. Levin. 1993. English Verb Classes and Alterna- 
tions: A Preliminary Investigation. University of 
Chicago Press, Chicago, Illinois. 
I.A. Mel'cuk and N.V. Perstov. 1987. Surface- 
syntax of English, a formal model in the 
Meaning Text Theory. Benjamins, Amster- 
dam/Philadelphia. 
G. Miller, R. Beckwith, C. Fellbaum, D. Gross, and K.
Miller. 1990. Introduction to WordNet: An on-line
lexical database. International Journal of
Lexicography (special issue), 3 (4) :235-312. 
G.A. Miller, C. Leacock, R. Tengi, and R.T. Bunker. 
1993. A semantic concordance. Cognitive Science 
Laboratory, Princeton University. 
R. Passoneau, K. Kukich, J. Robin, V. Hatzivas- 
siloglou, L. Lefkowitz, and H. Jing. 1996. Gen- 
erating summaries of workflow diagrams. In Pro- 
ceedings of the International Conference on Nat- 
ural Language Processing and Industrial Appli- 
cations (NLP-IA'96), Moncton, New Brunswick, 
Canada. 
E. Reiter. 1994. Has a consensus NL generation architecture
appeared, and is it psycholinguistically
plausible? In Proceedings of the Seventh Interna- 
tional Workshop on Natural Language Generation 
(INLGW-1994), pages 163-170, Kennebunkport, 
Maine, USA. Available from the cmp-lg archive as
paper cmp-lg/9411032. 
J. Robin. 1994. Revision-Based Generation of Natural
Language Summaries Providing Historical
Background: Corpus-Based Analysis, Design, Implementation,
and Evaluation. Ph.D. thesis, Department
of Computer Science, Columbia University.
Also Technical Report CU-CS-034-94.
M. Stede. 1998. A generative perspective on verb alternations.
Computational Linguistics, 24(3):401-430,
September.
