2. Text Generation 
William Mann, Chairperson 
Information Sciences Institute (ISI) 
at the University of Southern California 
Marina del Rey, CA 90291 
Panelists 
Madeline Bates, Bolt, Beranek and Newman 
Barbara Grosz, SRI International 
David D. McDonald, University of Massachusetts 
Kathleen R. McKeown, University of Pennsylvania 
William Swartout, Information Sciences Institute 
2.1 Introduction 
This report consists of two documents describing 
the state of the art of computer generation of natural 
language text. Both were prepared by a panel of indi- 
viduals who are active in research on text generation. 
The first document assesses the state of the art, identi- 
fying four kinds of technical developments which will 
shape the art in the coming decade: linguistically justi- 
fied grammars, knowledge representation methods, 
models of the reader, and models of discourse. The 
second document is a comprehensive bibliography on 
text generation, the first of its kind. In addition to 
citations of documents, it includes descriptions of on- 
going research efforts. 
2.2 Assessing Text Generation Technology 
Our goal here is to assess the state of the art of 
text generation for two purposes: to help people who 
intend to apply that art in the near future and to aid in 
the design or selection of appropriate research. 
This assessment covers all of the technical methods 
by which computer programs create and present Eng- 
lish text in their outputs. (For simplicity we always 
call the output language English.) Because text gener- 
ation has not always been taken seriously from a tech- 
nical point of view, it has been actively pursued only 
recently as a topic in artificial intelligence. As a result 
of this late start, much of the technology available for 
application today is still rather superficial. However, 
text generation is now such an active research 
topic that this superficial technology will soon be sur- 
passed. (The last part of this report contains an ex- 
tensive bibliography on the subject.) 
2.3 What Techniques Are Now Available for Use 
in System Designs? 
Two kinds of practical text generation techniques 
are already in general use and fairly well understood. 
The first is displaying previously prepared text (or 
canned text), and the second is producing text by di- 
rect translation of knowledge structures. 
The simplest and most commonly used way to have 
a computer system produce text is for the implemen- 
ters of the system to figure out in advance what sorts 
of English output will be required and then store it as 
text strings. The computer merely displays the text 
that has been stored. (For example, almost all error 
messages are produced in this way.) It is relatively 
easy to have a program produce English in this way, 
and the text can be complex and elegantly written if 
desired. Unfortunately, because the text strings can 
be changed independently of any knowledge structures 
the program might use, there is no guarantee of con- 
sistency between what the program does and what it 
says it does. Another problem with canned text is that 
all questions and answers must be anticipated in advance; 
for large systems, that may prove to be impossible. 
Finally, since one text string looks like any other 
as far as the computer is concerned, the computer 
program cannot easily have a conceptual model of 
what it is saying. This means that one should not 
expect to see much closure: satisfying 100 needs for 
text will not make the second 100 much easier. 
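The canned-text approach can be sketched in a few lines of Python. This is only an illustration: the table `CANNED`, the functions `report` and `max_retries`, and the condition names are hypothetical and are not drawn from any particular system. The last lines show the consistency problem: the stored string and the program's behavior disagree, and nothing in the program can notice.

```python
# Canned text: every output string is written in advance by the implementers.
# All names here are hypothetical, chosen only for illustration.
CANNED = {
    "file_not_found": "The file you requested could not be found.",
    "disk_full": "The disk is full. Delete some files and try again.",
}


def report(condition: str) -> str:
    """Return the stored text for a condition the implementers anticipated."""
    # An unanticipated condition has no text at all: every need for
    # output must be foreseen when the strings are written.
    if condition not in CANNED:
        raise KeyError(f"no canned text prepared for {condition!r}")
    return CANNED[condition]


def max_retries() -> int:
    """Program behavior that a canned string might describe."""
    return 3


# The string below claims five retries, but the code performs three.
# Because the text is only a string, the program cannot detect the mismatch.
CANNED["retry_notice"] = "The system will retry five times before giving up."
```

Each new message needs a new hand-written entry, which is why satisfying one hundred needs for text does little for the next hundred.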
Another approach to providing English output 
produces text by translating knowledge structures of 
the program directly to English. This method over- 
comes many of the problems with canned text, while 
introducing some of its own. Since the structures be- 
ing transformed (or translated) are the same ones used 
in the program's reasoning process, consistency can be 
assured. Closure can be realized because transforma- 
tions are written to handle large classes of knowledge 
structures. However, since the transformations per- 
formed are usually relatively simple, the quality of the 
text depends to a great degree on how the knowledge 
is structured. If the text is to be understandable, the 
knowledge used by the program must be structured so 
62 American Journal of Computational Linguistics, Volume 8, Number 2, April-June 1982 
William Mann Text Generation 
that it is readily understood. Finally, systems employ- 
ing this technique typically have had very little linguis- 
tic knowledge, so they have produced text that is ver- 
bose, stilted, and redundant, although readable. 
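A minimal Python sketch of direct translation follows, assuming a hypothetical rule format of attribute-object-value facts. The classes `Fact` and `Rule` and the two translation functions are illustrative assumptions, not the design of any system discussed here. The same structures a program might reason with are passed through simple transformations to produce English.

```python
# Direct translation: English is produced from the same structures the
# program reasons with. The rule format below is a hypothetical example.
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Fact:
    attribute: str  # e.g. "temperature"
    obj: str        # e.g. "the patient"
    value: str      # e.g. "high"


@dataclass(frozen=True)
class Rule:
    premises: Tuple[Fact, ...]
    conclusion: Fact


def fact_to_english(fact: Fact) -> str:
    # One simple transformation covers every Fact, which gives closure.
    return f"the {fact.attribute} of {fact.obj} is {fact.value}"


def rule_to_english(rule: Rule) -> str:
    # Premises are joined with "and"; no linguistic knowledge is applied.
    premises = " and ".join(fact_to_english(p) for p in rule.premises)
    return f"If {premises}, then {fact_to_english(rule.conclusion)}."
```

Applied to a rule with premises about a patient's temperature and white cell count, this sketch yields "If the temperature of the patient is high and the white cell count of the patient is elevated, then the likely condition of the patient is infection." The sentence is consistent with the rule by construction, and it is also verbose and repetitive in the way described above.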
Practical, near-term applications of text generation 
will share certain characteristics: 
1. They require short texts: one to three sentences. 
2. They have well-elaborated program data structures 
corresponding in fairly simple ways to the desired 
texts. 
3. The important knowledge can be represented well 
with present techniques; it does not involve the 
difficulties listed in section 2.6. 
4. Limited fluency of output is acceptable. 
Some so-called "expert systems" that explain their 
reasoning in English have these characteristics. 
We believe that text-producing systems of the fu- 
ture will continue to include processes that produce 
text by translating knowledge structures. However, 
they will be integrated with other processes that use 
extensive linguistic knowledge, a discourse model, a 
model of the reader, and enhanced knowledge repre- 
sentations. 
Because of the limited capabilities of present tech- 
niques, a new project aiming to produce a benchmark 
application program in the text generation area would 
currently be counterproductive, since it would produce 
little or no transferable technology and would detract 
from the community's ability to make progress on the 
general problem. 
2.4 Basic Components for a Text Generation 
Facility 
How can the very limited capabilities now available 
be developed into fluent, powerful text generation 
methods that are easily applied to new tasks? The next 
few sections describe the kinds of methods that are 
needed and are being developed. 
The underlying model presumed here, which pres- 
ent research is moving toward, has the following char- 
acteristics: 
1. Responsibility for text generation is in a text gener- 
ation module rather than being scattered at the 
points of use. 
2. A major portion of the text generation module is 
portable and is developed cumulatively through 
many systems. The portable components include a 
grammar that encodes general knowledge of Eng- 
lish and processes that handle linguistic, task- 
independent information. 
We feel strongly that a competent text generation 
facility must have the following four identifiable com- 
ponents, and that limitations on these will be limita- 
tions on the overall state of the art for the foreseeable 
future: 
1. A comprehensive, linguistically justified grammar. 
2. A knowledge-representation formalism that can 
encode diverse kinds of information. 
3. A model of the intended reader of the text. 
4. A model of discourse structure and control. 
Each of these draws on existing noncomputational 
precedents, and each requires some special adaptation 
to the text generation task. 
Below we describe each of these basic components 
in a form that it might achieve in five to ten years of 
research. (These descriptions are followed by a pro- 
jection of the practical alternatives available to system 
designers five years hence.) 
2.5 Linguistically Justified Grammars for Text 
Generation 
Grammars are ordinarily developed by linguists 
over periods of ten to twenty or more years, in depart- 
ments of linguistics. The best ones may be written by 
a single individual, but they reflect the ideas of dozens 
or hundreds of people who have contributed to refin- 
ing particular forms. 
Present practice in linguistics emphasizes carefully 
reasoned development of small fragments of grammars. 
Hence comprehensive, linguistically justified gram- 
mars, the sort we need, are very rare. 
Several linguistic traditions (some associated with 
computation and some not) are particularly likely to 
produce suitably refined, comprehensive grammars for 
text generators. They are: 
1. The systemic tradition, founded by Michael A. K. 
Halliday around 1961. 
2. The transformational tradition, decisively articulat- 
ed by Noam Chomsky starting in 1957. 
3. The Generalized Phrase Structure Grammar tradi- 
tion, currently associated with Gerald Gazdar. 
4. The ATN tradition, begun by Bill Woods and now 
being developed by him and many others. 
5. The LSP tradition, developed so far mainly under 
the direction of Naomi Sager. 
6. The Diamond (or Diagram) grammar by Jane Rob- 
inson. 
Grammars do not appear in computers without 
extensive effort. Most linguists are not interested in 
providing or seeing the level of detail and precise defi- 
nition needed for effective computational use. There 
are enormous social barriers between the source of 
these grammars (linguists) and their potential users. It 
will be necessary to sponsor text generation research 
projects with linguists on the staff; projects staffed 
entirely by computer people can be expected to yield 
only short-term results. 
2.6 Knowledge Representation Formalisms 
Text generation programs cannot improve much on 
the knowledge they are given. The notation for 
knowledge must already contain appropriate abstrac- 
tions in an easily accessible form. Today's notations 
are relatively good at representing logical formulas and 
deductive necessities, and also hierarchies of objects. 
Coverage is particularly weak for these other kinds of 
knowledge: 
1. Time 
2. Space 
3. Events and actions 
4. Cause 
5. Collectives 
6. Likelihood 
7. Obligation 
8. Possibility 
9. Negation 
10. Quantification 
11. Continuity and discreteness 
2.7 Models of the Reader 
Text prepared without considering the reader is 
uniformly awful. Programs must have explicit models 
of the reader that encode (or make available) at least 
the following four kinds of information: 
1. What is obvious - including common factual knowl- 
edge and certain "obvious" inferable information. 
Obviousness does not agree with logical validity. 
2. What has already been told, and what is obvious 
from that. 
3. What others believe - including mutual beliefs and 
beliefs about the writer's belief. 
4. What is currently in the reader's attention. 
Beyond these, the program should be able to reason 
about belief and intent. 
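One way to picture such a model is the Python sketch below. It is a deliberately small illustration under stated assumptions: facts are plain strings, the class `ReaderModel` and the function `select_content` are hypothetical names, and both inference from what has been told and beliefs about others are left out.

```python
# A hypothetical reader model covering three of the kinds of information
# listed above: what is obvious, what has already been told, and what is
# in the reader's attention. Beliefs about others are omitted.
from dataclasses import dataclass, field


@dataclass
class ReaderModel:
    obvious: set = field(default_factory=set)
    told: set = field(default_factory=set)
    in_attention: list = field(default_factory=list)

    def needs_saying(self, fact: str) -> bool:
        # A fact is worth stating only if it is neither obvious nor told.
        return fact not in self.obvious and fact not in self.told

    def record(self, fact: str) -> None:
        # Stating a fact makes it known and brings it into attention.
        self.told.add(fact)
        self.in_attention.append(fact)


def select_content(facts, reader):
    """Keep only the facts this reader still needs, updating the model."""
    selected = []
    for fact in facts:
        if reader.needs_saying(fact):
            selected.append(fact)
            reader.record(fact)
    return selected
```

Even this crude filter removes repetition and statements of the obvious. A realistic model would also have to decide what the reader can infer, which is where obviousness and logical validity part ways.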
2.8 Models of Discourse 
This is another linguistic matter, distinct from 
grammar. Running text has subtle interactions be- 
tween its parts. When we generate multisentential 
text, we need a set of principles for organizing it. A 
few linguists and philosophers are making important 
contributions, but far more work is needed. Again, to 
develop effective models of discourse, research pro- 
jects will need to have linguists on the staff. 
An adequate discourse model will include some 
representation of at least the following: 
1. The structures that can be built out of sentences 
and larger units. 
2. The needs of the writer that each discourse struc- 
ture meets. 
3. The principal effects that the use of each structure 
produces. 
4. The effects of various discourse structures on the 
reader's attention. 
2.9 Relating the Basic Components to Text 
Generation 
How are these basic components related to the 
whole task? Why are they necessary, and how does 
their quality affect what can be done? 
2.9.1 The Text Quality Limitations of Grammars 
In order to generate text that is not awkward or 
misleading, one must be able to control a wide variety 
of language effects at the sentence level. Effects will 
be present in the mind of the reader in any case, and 
so the program must either control them or take seri- 
ous risks of misunderstanding and error. The effects 
are produced by the arrangements of words used, and 
so a theory of the arrangements of words is needed in 
order to achieve control. Theories of the arrange- 
ments of words are (or include) grammars. The ability 
of a text generation system to express many different 
ideas well will be limited to the different effects con- 
trollable through its grammar. 
Use of an ad hoc grammar limits the generator to 
expressing a narrow range of ideas. It may do well in 
a short, carefully planned demonstration, but it will be 
too narrow for many practical purposes. 
We can think of the grammar as a bottleneck or 
filter at the output of the text generator. Only those 
expressive techniques that the system can control 
through its grammar will be used. 
2.9.2 The Knowledge Representation 
The knowledge representation frameworks in a 
program limit the range of things that the program can 
usefully operate upon. Since a text generator must 
create text out of some knowledge representation, it is 
likewise limited. 
Limitations on knowledge representations include 
two important kinds: 
1. Presence of abstractions--are the concepts that 
must be conveyed in text actually symbolized? 
2. Ease of access--is there a fast, uniform method for 
finding the symbols that represent particular con- 
cepts? 
We can think of these limits as a bottleneck or 
filter at the input of the text generator. Only those 
concepts that pass through the filter will appear in the 
text. 
2.9.3 Models of the Reader or System User 
Generating acceptable text requires that the gener- 
ator take into account the knowledge of the reader. If 
this is not done, text quality is so bad that the results 
may be useless. (With canned text, this problem is 
usually avoided because the writer knows a great deal 
about the reader's knowledge.) To take the reader's 
knowledge into account requires an explicit model of 
that knowledge. 
Of the four kinds of knowledge previously identi- 
fied in section 2.7, the most critical for basic text 
quality are the knowledge of what is obvious and the 
knowledge of what has already been conveyed. 
2.9.4 Models of Discourse 
We know that single sentences are too limited to 
express some things. Moving to multisentential text 
necessarily creates discourse, which involves many 
kinds of effects that programs cannot yet control. For 
example, putting one sentence after another can be 
used to express time sequence, deductive necessity, 
cause, exemplification or other relationships, without 
any words being used to express the relation. 
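The point can be made concrete with a small Python sketch. The connective table and both functions are hypothetical, chosen only for illustration. Juxtaposition produces the same two sentences whatever relation the writer intends, so the reader must infer it; an explicit connective states it.

```python
# Hypothetical connectives for three of the relations named above.
CONNECTIVES = {
    "time sequence": "Then",
    "cause": "As a result,",
    "exemplification": "For example,",
}


def juxtapose(first: str, second: str) -> str:
    """Place one sentence after another with no word marking the relation."""
    return f"{first} {second}"


def mark_relation(first: str, second: str, relation: str) -> str:
    """Same two sentences, with the relation stated by a connective."""
    connective = CONNECTIVES[relation]
    # Simplification: lowercase the first letter of the second sentence,
    # which would be wrong if it began with a proper noun.
    return f"{first} {connective} {second[0].lower()}{second[1:]}"
```

For the sentences "The valve stuck." and "The pressure rose.", `juxtapose` returns "The valve stuck. The pressure rose.", which a reader is likely to take as cause or time sequence whether or not that was intended. `mark_relation` with "cause" returns "The valve stuck. As a result, the pressure rose."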
Creating these effects when they are desired, and 
avoiding them when they are not, requires explicit 
models of discourse phenomena. At a higher level, 
sequences of sentences and paragraphs of a text must 
be organized in a principled way. This also requires 
explicit discourse models. Until such models are de- 
veloped, texts will be awkward and misunderstanding 
will be common. 
2.10 Designing in 1986 for Practical Text 
Generation 
What sort of practical application of text genera- 
tion will be possible in five years? We expect the 
designer to be in the following situation: 
• There will be several examinable systems with devel- 
oped methods for creating the four basic compo- 
nents. For each kind of component, there will be 
some attractive precedents for future work. No 
one system will have a thoroughly elaborated ap- 
proach to all four. 
• System design based on adaptation of these preced- 
ents will be possible. The design work will involve 
creating "handcrafted" systems that embody and 
reconcile the good techniques. It will require the 
personal attention of computer scientists, linguists, 
and programmers who have been involved in the 
prior research. 
• The resulting system can be expected to create ac- 
ceptable, effective texts, limited by quality consid- 
erations to be about one page in length. 
For the message-system problem used as a focal 
problem for the workshop, there were two tasks iden- 
tified for text generation: a task of reporting system 
status and a task of reporting how and to whom par- 
ticular messages would be relevant for an identified 
collection of people. For both of these tasks it seems 
feasible for design of a practical text generation mo- 
dule to begin in five years. However, it is questiona- 
ble whether adequate techniques would be available to 
determine what message relevance to report. 
2.11 Present Research Status 
The most influential research in the next few years 
will be focused on the four basic components: linguis- 
tically justified grammars, knowledge representation 
formalisms, models of readers, and models of discourse 
structure and control. Part of the effort will go into 
developing these components individually, part into 
learning how to combine them. 
Appropriately, most of the current effort is going 
into either developing single components or combining 
two of them. Although there are several institutions 
and individuals working on all four of these compo- 
nents, no one has yet demonstrated a system in which 
all four approach the scopes of action indicated for 
them above. 
Many important topics are being neglected for lack 
of research support. (There is no lack of interested 
people; natural language processing continually gener- 
ates high interest in the AI community. We are not 
sure whether there is a shortage of interested qualified 
people.) 
More information on the state of the art and cur- 
rent activity can be found in the bibliography on text 
generation, the last part of this report, which includes 
a section on research in progress. 
2.12 Text Generation Bibliography 
This bibliography was prepared in connection with 
the authors' report on the state of the art in text gen- 
eration. It includes published works on generation of 
natural language text by computer programs as well as 
some prior noncomputational work that has been used 
as a basis for such computer programs. It is not ex- 
haustive in any sense, and no evaluation is implied by 
the presence or absence of a citation of any particular 
publication. 
Allen, J. 1978 Recognizing Intention in Dialogue. Ph.D. thesis. 
University of Toronto. 
Appelt, D.E. 1980a A planner for reasoning about knowledge and 
action. In Proceedings of the First Annual National Conference on 
Artificial Intelligence. Stanford University (August). 
Appelt, D.E. 1980b Problem solving applied to language genera- 
tion. In Proceedings of the Eighteenth Annual Meeting of the 
Association for Computational Linguistics. 
Appelt, D.E. 1981 Planning natural language utterances to satisfy 
multiple goals. Forthcoming Ph.D. thesis. Stanford University. 
Badler, N.I. 1975 The conceptual description of physical activi- 
ties. In Proceedings of the Thirteenth Annual Meeting of the 
Association for Computational Linguistics. 
Bates, M., Brown, G., and Collins, A. 1978 Socratic Teaching of 
Causal Knowledge and Reasoning. Bolt Beranek and Newman, 
Inc., Technical Report 3995 (December). 
Bates, M. 1980 Language instruction without pre-stored examples. 
In Proceedings of the Third Canadian Symposium on Instructional 
Technology. (February) 
Bates, M. and Ingria, R. 1981a Controlled transformational sen- 
tence generation. In The Nineteenth Annual Meeting of the Asso- 
ciation for Computational Linguistics. Stanford University 
(June). 
Bates, M. et al. 1981b Generative tutorial systems. In Proceedings 
of the 1981 ADCIS Conference (March). 
Bates, M. et al. 1981c ILIAD Final Report. Bolt Beranek and 
Newman, Inc., Technical Report 4771 (September). 
Berry, M. 1975 Introduction to Systemic Linguistics: Structures and 
Systems. B.T. Batsford, Ltd., London. 
Berry, M. 1977 Introduction to Systemic Linguistics: Levels and 
Links. B.T. Batsford, Ltd., London. 
Birnbaum, L., Flowers, M., and McGuire, R. 1980 Towards an 
A.I. model of argumentation. In Proceedings of the First Na- 
tional Conference on Artificial Intelligence. Stanford University 
(August). 
Boguraev, B.K. 1979 Automatic Resolution of Linguistic 
Ambiguities. Computational Laboratory, Cambridge University, 
England, Technical Report 11 (August). 
Bossie, S. 1981 A Tactical Component for Text Generation: Sentence 
Generation Using a Functional Grammar. University of Pennsyl- 
vania, Technical Report MS-CIS-81-5. 
Brown, G.P. 1974 Some problems in German to English Machine 
Translation. Massachusetts Institute of Technology, Technical 
Report 142 (December). Project MAC. 
Brown, R.H. 1974 Use of multiple-body interrupts in discourse 
generation. Massachusetts Institute of Technology, Department 
of Electrical Engineering and Computer Science, Bachelors 
Degree thesis. 
Bruce, B.C. 1975 Generation as a social action. In Proceedings of 
Theoretical Issues in Natural Language Processing - I (TINLAP). 
Cambridge, MA (June) 64-67. 
Bruce, B.C., Collins, A., Rubin, A.D., and Gentner, D. 1978 A 
Cognitive Science Approach to Writing. Bolt Beranek and New- 
man, Inc., Technical Report 89 (June). 
Carbonell, J.R. and Collins, A.M. 1973 Natural semantics in 
artificial intelligence. In Proceedings of the Third International 
Joint Conference on Artificial Intelligence. Stanford, CA: 344- 
351. 
Carr, B. and Goldstein, I. 1977 Overlays: A Theory of Modeling for 
Computer Aided Instruction. Massachusetts Institute of Technol- 
ogy, Artificial Intelligence Laboratory, Memo 406 (February). 
Chafe, W.L. 1977 Creativity and verbalization and its implications 
for the nature of stored knowledge. In Freedle, R.O., Ed., 
Discourse Processes: Advances in Research and Theory. Volume 1: 
Discourse Production and Comprehension. Ablex, NJ: 41-55. 
Chafe, W.L. 1979 The flow of thought and the flow of language. 
In Givon, T., Ed., Syntax and Semantics. Volume 12: Discourse 
and Syntax. Academic Press, NY. 
Chester, D. 1976 The translation of formal proofs into English. 
Artificial Intelligence 7 (3) Fall. 
Clancey, W.J. 1978a Tutoring Rules for Guiding a Case Method 
Dialogue. Stanford University, Department of Computer Sci- 
ence, Heuristic Programming Project, Technical Report HPP- 
78-25 (December). Also International Journal of Man-Machine 
Studies 11 (1979) 25-49. 
Clancey, W.J. 1978b An Antibiotic Therapy Selector Which Provides 
for Explanation. Stanford University, Technical Report HPP- 
78-26 (December). 
Clancey, W.J. 1979 Dialogue management for rule-based tutorials. 
In Proceedings of the Sixth International Joint Conference on 
Artificial Intelligence. Tokyo (August) 155-161. 
Clippinger, J.H. 1974 A Discourse Speaking Program as a Prelimi- 
nary Theory of Discourse Behavior and a Limited Theory of Psy- 
choanalytic Discourse. Ph.D. thesis, University of Pennsylvania. 
Clippinger, J.H. 1975 Speaking with many tongues: Some prob- 
lems in modelling speakers of actual discourse. In Proceedings 
of Theoretical Issues in Natural Language Processing - I 
(TINLAP). Cambridge, MA (June) 68-73. 
Codd, E.F. et al. 1978 Rendezvous Version 1: An Experimental 
English-Language Query Formulation System for Casual Users of 
Relational Databases. IBM Research Laboratory, San Jose, CA. 
Technical Report RJ2144. 
Cohen, P.R. and Perrault, C.R. 1977 Overview of planning speech 
acts. In Proceedings of the Fifth International Joint Conference on 
Artificial Intelligence. Cambridge, MA (August). 
Cohen, P.R. 1978 On Knowing What to Say: Planning Speech Acts. 
University of Toronto, Technical Report 118. 
Cohen, P.R. and Perrault, C.R. 1979 Elements of a plan-based 
theory of speech acts. Cognitive Science 3. 
Collins, A.M., Passafiume, J., Gould, L., and Carbonell, J.G. 1973 
Improving Interactive Capabilities in Computer-Assisted Instruction. 
Bolt Beranek and Newman, Inc., Technical Report 2631 
(August). 
Cullingford, R.E., Krueger, M.W., Selfridge, M., and Bienkowsky, 
M.A. 1981 Automated explanations as a component of a 
computer-aided design system. To appear in IEEE Transactions 
on Systems, Man, and Cybernetics. 
Danes, F., Ed. 1974 Papers on Functional Sentence Perspective. 
Academia, Publishing House of the Czechoslovak Academy of 
Sciences. 
Davey, A. 1979 Discourse Production. Edinburgh University Press, 
Edinburgh. 
de Beaugrande, Robert 1980 Advances in Discourse Processes. 
Volume IV: Text, Discourse, and Process: Toward a Multidiscipli- 
nary Science of Texts. Ablex, Norwood, NJ. 
de Joia, A. and Stenton, A. 1980 Terms in Systemic Linguistics. 
Batsford Academic and Educational, Ltd., London. 
Dehn, N. 1981a Memory in story invention. In Proceedings of the 
Third Annual Conference of the Cognitive Science Society. Univer- 
sity of California, Berkeley (August). 
Dehn, N. 1981b Story generation after TALE-SPIN. In Proceed- 
ings of the Seventh International Joint Conference on Artificial 
Intelligence. University of British Columbia (August). 
Fawcett, R.P. 1980 Exeter Linguistic Studies. Volume 3: Cognitive 
Linguistics and Social Interaction. Julius Groos Verlag Heidel- 
berg and Exeter University. 
Fillmore, C.J. 1976 The case for Case reopened. In Cole, P. and 
Sadock, J.M., Eds., Syntax and Semantics. Volume 8: Grammati- 
cal Relations. Academic Press, NY. 
Forbus, K. and Stevens, A. 1981 Using Qualitative Simulation to 
Generate Explanations. Bolt Beranek and Newman, Inc., Techni- 
cal Report 4490 (March). Also Cognitive Science 3. 
Friedman, J. 1969 Directed random generation of sentences. 
Communications of the ACM 12 (6). 
Gabriel, R.P. 1980 An Organization for Programs in Fluid Domains. 
Ph.D. thesis, Stanford University, 1980. 
Goldman, N.M. 1974 Computer Generation of Natural Language 
from a Deep Conceptual Base. Ph.D. thesis, Stanford University. 
Stanford Artificial Intelligence Laboratory Memo AIM-247. 
Goldman, N.M. 1975a The boundaries of language generation. In 
Proceedings of Theoretical Issues in Natural Language Processing - 
I (TINLAP). Cambridge, MA (June) 74-78. 
Goldman, N.M. 1975b Conceptual generation. In Schank, R.C., 
Ed., Conceptual Information Processing. North-Holland, 
Amsterdam. 
Goldstein, I. 1978 Developing a Computational Representation of 
Problem Solving Skills. Massachusetts Institute of Technology, 
Artificial Intelligence Laboratory, Cambridge, MA, memo 495 
(October). 
Grimes, J.E. 1975 The Thread of Discourse. Mouton, The Hague. 
Grishman, R. 1979 Response generation in question-answering 
systems. In Proceedings of the Seventeenth Annual Meeting of the 
Association for Computational Linguistics (August) 99-101. 
Grosz, B.J. 1979 Utterance and objective: Issues in natural lan- 
guage communication. In Proceedings of the Sixth International 
Joint Conference on Artificial Intelligence. 
Grosz, B.J. 1980 Focusing and description in natural language 
dialogs. In Joshi, A. et al., Eds., Elements of Discourse Under- 
standing: Proceedings of a Workshop on Computational Aspects of 
Linguistic Structure and Discourse Setting. Cambridge University 
Press, Cambridge. 
Habel, C., Schmit, A., and Schweppe, H. 1977 On Automatic 
Paraphrasing of Natural Language Expressions. Technische 
Universität, Berlin, Technical Report 3//17. Semantic Network 
Project. 
Halliday, M.A.K. 1961 Categories of the theory of grammar. 
Word 17. 
Halliday, M.A.K. 1967a Notes on transitivity and theme in Eng- 
lish. Journal of Linguistics 3 (1) 37-81. 
Halliday, M.A.K. 1967b Notes on transitivity and theme in Eng- 
lish. Journal of Linguistics 3 (2) 199-244. 
Halliday, M.A.K. 1968 Notes on transitivity and theme in English. 
Journal of Linguistics 4 (2) 179-215. 
Halliday, M.A.K. 1970 Language structure and language function. 
In Lyons, J., Ed., New Horizons in Linguistics. Penguin. 
Halliday, M.A.K. 1976 System and Function in Language. Oxford 
University Press, London. 
Halliday, M.A.K. 1978 Language as Social Semiotic. University 
Park Press, Baltimore, MD. 
Halliday, M.A.K. and Hasan, R. 1976 Cohesion in English. Long- 
man, London. English Language Series, Title No. 9. 
Heidorn, G. 1972 Natural Language Inputs to a Simulation Pro- 
gramming System. Naval Postgraduate School, Monterey, CA, 
Technical Report NPS-55HD72101A. 
Heidorn, G. 1975 Augmented phrase structure grammar. In 
Proceedings of Theoretical Issues in Natural Language Processing - 
I (TINLAP). Cambridge, MA (June) 1-5. 
Herskovits, A. 1973 The Generation of French from Semantic 
Structure. Stanford Artificial Intelligence Laboratory, Technical 
Report 212. 
Hobbs, J. and Evans, D. 1980 Conversation as planned behavior. 
Cognitive Science 4, 349-377. 
Hudson, R. A. 1971 North-Holland Linguistic Series. Volume 4: 
English Complex Sentences. North-Holland, London and Am- 
sterdam. 
Hudson, R. A. 1976 Arguments for a Non-Transformational 
Grammar. University of Chicago Press, Chicago, 1976. 
Hutchins, W.J. 1971 The Generation of Syntactic Structures from a 
Syntactic Base. North-Holland, Amsterdam. 
Kay, M. 1979 Functional grammar. In Proceedings of the Fifth 
Annual Meeting of the Berkeley Linguistic Society. 
Kempen, G. 1977 Building a psychologically plausible sentence 
generator. Presented at the Conference on Empirical and Meth- 
odological Foundations of Semantic Theories for Natural Lan- 
guage, Nijmegen, The Netherlands. 
Kempen, G. and Hoenkamp, E. 1979 A Procedural Grammar for 
Sentence Production. University of Nijmegen, Department of 
Psychology, The Netherlands, Technical Report, 1979. 
Klein, S. 1965 Automatic paraphrasing in essay format. Mechani- 
cal Translation 8 (3). 
Klein, S. 1975 Meta-compiling text grammars as a model for 
human behavior. In Proceedings of Theoretical Issues in Natural 
Language Processing - I (TINLAP). Cambridge, MA (June) 
94-98. 
Knaus, R. 1975 Incremental sentence processing. American Jour- 
nal of Computational Linguistics Fiche 33. 
Kripke, S. 1977 Speaker reference and semantic reference. In 
French, P.A. et al., Eds., Contemporary Perspectives in the Phi- 
losophy of Language. University of Minnesota Press, Minneapo- 
lis. 
Levin, J.A., and Goldman, N.M. 1978 Process Models of Reference 
in Context. USC/Information Sciences Institute, RR-78-72. 
Levy, D.M. 1979a Communicative goals and strategies: Between 
discourse and syntax. In Givon, T., Ed., Syntax and Semantics. 
Volume 12: Discourse and Syntax. Academic Press, New York. 
Levy, D.M. 1979b The Architecture of the Text. Ph.D. thesis, 
Stanford University, Department of Computer Science. 
Linde, C. and Labov, W. 1975 Spatial networks as a site for the 
study of language and thought. Language 50 (IV) 924-939. 
Mann, W.C. and Moore, J.A. 1980 Computer as Author - Results 
and Prospects. USC/Information Sciences Institute, RR-79-82. 
Mann, W.C. and Moore, J.A. 1981a Computer generation of 
multiparagraph English text. American Journal of Computational 
Linguistics 7 (1) January - March. 
Mann, W.C. 1981b Two discourse generators. In The Nineteenth 
Annual Meeting of the Association for Computational Linguistics. 
Sperry Univac. 
Matthiessen, C.M.I.M. 1981 A grammar and a lexicon for a text- 
production system. In The Nineteenth Annual Meeting of the 
Association for Computational Linguistics. Sperry Univac. 
McCoy, K.F. 1981 Automatic Enhancement of a Database 
Knowledge Representation for Natural Language Generation. 
University of Pennsylvania, Technical Report MS-CIS-81-6. 
McDonald, D.D. 1975a A preliminary report on a program for 
generating natural language. In Proceedings of the Fourth International 
Joint Conference on Artificial Intelligence. Tbilisi, USSR 
(August). 
McDonald, D.D. 1975b A framework for writing generation 
grammars for interactive computer programs. American Journal 
of Computational Linguistics Fiche 33. 
McDonald, D.D. 1977 Language generation: The linguistics com- 
ponent (short note). In Proceedings of the Fifth International 
Joint Conference on Artificial Intelligence. Cambridge, MA 
(August). 
McDonald, D.D. 1978 Subsequent references: Syntactic and 
rhetorical constraints. In Theoretical Issues in Natural Language 
Processing - 2 (TINLAP). ACM, New York. 
McDonald, D.D. 1980a Natural Language Production as a Process 
of Decision-Making Under Constraints. Ph.D. thesis, Massachusetts 
Institute of Technology, Dept. of Electrical Engineering 
and Computer Science. To appear as an MIT Artificial Intelli- 
gence Laboratory technical report. 
McDonald, D.D. 1980b The role of discourse structure in lan- 
guage production. In The Proceedings of the Third Biannual 
Meeting of the CSCSI/SCEIO. 
McDonald, D.D. 1981 Language production: The source of the 
dictionary. In The Nineteenth Annual Meeting of the Association 
for Computational Linguistics. Stanford University (June). 
McKeown, K.R. 1979 Paraphrasing using given and new information 
in a question-answer system. Master's thesis, University of Penn- 
sylvania, Philadelphia. Number MS-CIS-80-13. Also in Pro- 
ceedings of the Seventeenth Annual Meeting of the Association for 
Computational Linguistics (August) 67-72. 
McKeown, K.R. 1980a Generating Descriptions and Explanations: 
Applications to Questions about Database Structure. University of 
Pennsylvania, Technical Report MS-CIS-80-9. 
McKeown, K.R. 1980b Generating relevant explanations: Natural 
language responses to questions about database structure. In 
Proceedings of The First Annual National Conference on Artificial 
Intelligence. Stanford, CA (August) 306-309. 
McKeown, K.R. 1981 Generating Natural Language: Deciding 
What to Say Next. University of Pennsylvania, Technical Re- 
port MS-CIS-81-1. 
Meehan, J.R. 1975 Using planning structures to generate stories. 
American Journal of Computational Linguistics Fiche 33. 
Meehan, J.R. 1977 TALE-SPIN, an interactive program that 
writes stories. In Proceedings of the Fifth International Joint 
Conference on Artificial Intelligence (August). 
Moore, R. 1980 Reasoning about Knowledge and Action. SRI International, 
Artificial Intelligence Center, Technical Note 191. 
Perrault, C.R. and Cohen, P.R. 1978 Planning Speech Acts. University 
of Toronto, Department of Computer Science, Technical 
Report. 
Sacerdoti, E. 1977 A Structure for Plans and Behavior. Elsevier, 
North-Holland, Amsterdam. 
Schank, R., Goldman, N., and Rieger, C. 1975 Inference and 
paraphrase by computer. Journal of the ACM 22 (3) July, 
309-328. 
Schlesinger, I.M. 1977 Production and Comprehension of Utterances. 
Lawrence Erlbaum Associates. 
Shapiro, S.C. 1975 Generation as parsing from a network into a 
linear string. American Journal of Computational Linguistics 
Fiche 33. 
Shapiro, S.C. 1979 Generalized augmented transition network 
grammars for generation from semantic networks. In Proceed- 
ings of the Seventeenth Meeting of the Association for Computa- 
tional Linguistics (August) 25-29. 
Simmons, R. and Slocum, J. 1972 Generating English discourse 
from semantic networks. Communications of the ACM 15 (10) 
October, 891-905. 
Sleeman, D.J. and Hendley, R.J. 1979 ACE: A system which 
analyses complex explanations. International Journal of Man-Machine 
Studies 11 125-144. 
Slocum, J. 1973 Question Answering via Canonical Verbs and 
Semantic Models: Generating English from the Model. University 
of Texas, Department of Computer Sciences, Austin, Technical 
Report NL 13. 
Slocum, J. 1975 Speech generation from semantic nets. American 
Journal of Computational Linguistics Fiche 33. 
Stevens, A. and Steinberg, C. 1981 A Typology of Explanations and 
its Application to Intelligent Computer Aided Instruction. Bolt 
Beranek and Newman, Inc., Technical Report 4626 (March). 
Swartout, W.R. 1977 A Digitalis Therapy Advisor with Explanations. 
Massachusetts Institute of Technology, Laboratory for Comput- 
er Science, Technical Report (February). 
Swartout, W.R. 1981a Producing Explanations and Justifications of 
Expert Consulting Programs. Massachusetts Institute of Technology, 
Technical Report MIT/LCS/TR-251 (January). 
Swartout, W.R. 1981b Explaining and justifying expert consulting 
programs. In Proceedings of the Seventh International Joint Con- 
ference on Artificial Intelligence. University of British Columbia 
(August). 
Thompson, H.S. 1977 Strategy and tactics: A model for language 
production. In Papers from the Thirteenth Regional Meeting. 
Chicago Linguistic Society. 
Thompson, H.S. 1980 Stress and Salience in English: Theory and 
Practice. Xerox Palo Alto Research Center, Technical Report 
CSL-80-8 (May). 
Waltz, D.L. 1978 An English language question answering system 
for a large relational database. Communications of the ACM 21 
(7) July. 
Weiner, J.L. 1980 BLAH, a system which explains its reasoning. 
Artificial Intelligence 15 (November) 19-48. 
Winograd, T. 1972 Understanding Natural Language. Academic 
Press, Edinburgh. 
Wong, H.K.T. 1975 Generating English Sentences from Semantic 
Structures. University of Toronto, Department of Computer 
Science, Technical Report 84. 
Yngve, V.H.A. 1960 A model and a hypothesis for language 
structure. In Proceedings of the American Philosophical Society, 
444-466. 
Yngve, V.H.A. 1962 Random generation of English sentences. In 
The 1961 Conference on Machine Translation of Languages and 
Applied Language Analysis. Her Majesty's Stationery Office, 
London. 
2.13 Research in Progress 
This section describes research in text generation 
either currently in progress or recently completed but 
not yet described comprehensively in any publication. 
Like the set of references, it is not exhaustive. 
Barbara Grosz and Doug Appelt 
(SRI International) 
Barbara Grosz and Doug Appelt are developing a 
problem-solving approach to the design of text, ex- 
tending from prior work by Allen, Cohen, and Per- 
rault. A hierarchical planning system called KAMP 
(Knowledge and Modalities Planner) is being devel- 
oped, capable of planning actions that affect another 
agent's knowledge and wants. It includes critic proc- 
esses that examine the plan globally for interactions 
between the effects of actions and propose modifica- 
tions to the plan that will enable the utterance being 
planned to realize multiple illocutionary acts. KAMP's 
knowledge representation is based on Moore's possible 
worlds semantics approach to reasoning about knowl- 
edge and action. 
David McDonald 
(University of Massachusetts, Amherst) 
David McDonald is the author of MUMBLE, a sys- 
tem that performs utterance construction, grammatical 
smoothing, and maintenance of linguistic constraints 
for natural language generation by expert programs. 
MUMBLE is available to interested researchers in a 
dialect common to LISP Machine LISP and NIL. The 
author is currently extending MUMBLE's grammatical 
power so that it plans word selection in describing 
visual scenes and also plans the use of certain connectives 
such as "but," "also," and "thus." 
William Mann and Christian Matthiessen 
(Information Sciences Institute) 
William Mann, Christian Matthiessen, and others 
are developing the Penman system to explore the 
problems of creating a portable text generation facility 
useful in multiple knowledge domains. Penman will 
seek to deliver knowledge (in English) from inside a 
system that was not designed to have a text generation 
component. 
The linguistic components of Penman are based on 
Halliday's Systemic Grammar. A large systemic gram- 
mar of English has been implemented and is being 
fitted with semantic parts. 
The knowledge representation, which resembles 
Brachman's early KL-ONE, is being used for both the 
subject matter of Penman's generation and the text 
plans by which Penman generates text. The emphasis 
of the research is on providing fluent English output 
from an easily controlled source. 
Kathleen McKeown 
(University of Pennsylvania) 
Research is being completed on a text generation 
system that embodies computational solutions to the 
questions of what to say next and how to organize it 
effectively. Two mechanisms are used to handle these 
problems: (1) rhetorical techniques for communica- 
tion, encoded as schemas, guide the generation proc- 
ess, and (2) a focusing mechanism helps maintain dis- 
course coherence. Schemas define aspects of dis- 
course structure and are associated with explicit dis- 
course purposes. The focusing mechanism aids in the 
organization of the message by constraining the selec- 
tion of what to talk about next to that which ties in 
with the previous discourse in an appropriate way. 
This work is being done within the framework of a 
natural language interface to a database system; the 
completed system will generate responses to questions 
about database structure. 
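As a rough illustration of the focusing idea only (not McKeown's implementation), the sketch below assumes that each candidate proposition is annotated with the entities it mentions. It prefers a proposition that mentions the current focus, falls back to one that mentions anything already introduced, and drops propositions that tie in with nothing said so far. The names Proposition, next_proposition, order_message, and POOL, and the sample facts, are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Proposition:
    text: str
    entities: FrozenSet[str]  # entities mentioned (an assumed annotation)


def next_proposition(candidates: List[Proposition], focus: str,
                     mentioned: Set[str]) -> Optional[Proposition]:
    """Prefer a candidate about the current focus, then one that ties in
    with any entity already mentioned; return None if nothing ties in."""
    for p in candidates:
        if focus in p.entities:
            return p
    for p in candidates:
        if p.entities & mentioned:
            return p
    return None


def order_message(pool: List[Proposition], start_focus: str) -> List[Proposition]:
    """Order a pool of propositions so each one ties in with prior discourse."""
    remaining = list(pool)
    focus = start_focus
    mentioned: Set[str] = {start_focus}
    ordered: List[Proposition] = []
    while remaining:
        chosen = next_proposition(remaining, focus, mentioned)
        if chosen is None:
            break  # the rest has no link to what was said; leave it out
        ordered.append(chosen)
        remaining.remove(chosen)
        # Shift focus to a newly introduced entity, if there is one.
        new_entities = sorted(chosen.entities - mentioned)
        mentioned |= chosen.entities
        if new_entities:
            focus = new_entities[0]
    return ordered


# Hypothetical facts about a database schema, deliberately out of order.
POOL = [
    Proposition("A destroyer carries guns.", frozenset({"destroyer", "guns"})),
    Proposition("An airplane has a wingspan.", frozenset({"airplane", "wingspan"})),
    Proposition("A ship is a vehicle.", frozenset({"ship", "vehicle"})),
    Proposition("A destroyer is a kind of ship.", frozenset({"destroyer", "ship"})),
]

for p in order_message(POOL, "ship"):
    print(p.text)
```

In this toy pool, the proposition about airplanes is never selected because it shares no entity with the preceding discourse. A fuller treatment would also consult the schemas to decide which rhetorical move comes next.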
Steven Bossie, Kathleen McCoy 
(University of Pennsylvania) 
Two systems are being developed at the University 
of Pennsylvania in conjunction with McKeown's text 
generation system. One, developed by Kathleen 
McCoy, automatically enhances a metalevel descrip- 
tion of a database for use by McKeown's text genera- 
tion system. This system generates subclasses of 
classes in a given generalization hierarchy. It uses 
information in the database and a set of axioms to 
create the subclasses and select salient information 
describing the subclass divisions. 
Steven Bossie is developing a system that will take 
the ordered message created by McKeown's text gen- 
eration system and translate it into English. Bossie's 
system uses a functional grammar, based on a formal- 
ism defined by Kay 1979, which will allow for the 
direct encoding of focus constraints in the grammar. 
Thus, eventually, the system will use the focusing in- 
formation provided by McKeown's system to select 
syntactic constructions. 
Rod McGuire 
(Yale) 
Working toward his Ph.D. dissertation, Rod 
McGuire is developing a model of knowledge repre- 
sentation in human memory to account for observed 
constraints on the content of oral text. In this model, 
sentences are generated without building syntactic 
structures. In multisentential text, coherence arises 
directly from the form of representation in memory 
and from memory representation traversal algorithms, 
using a homogeneous representation to cover syntactic 
structure, rhetorical structure, and text plans. 
Madeline Bates, Robert Ingria, and Kirk Wilson 
(BBN and Boston University) 
The ILIAD system is an intelligent CAI system be- 
ing developed by Madeline Bates, Robert Ingria, Kirk 
Wilson (of Learning Tools, Inc., Brookline, Mass.) and 
others to give instruction and practice in English. The 
emphasis is not on teaching grammar, but the system 
needs to have a deep understanding of the syntactic 
relationships in the sentences used in examples and 
exercises. For this reason, the heart of the ILIAD 
system is a sentence generator that is based on the 
paradigm of transformational grammar. 
ILIAD's grammar blends some aspects of standard 
transformational theory with the extended standard 
theory. Rules have been developed to generate not 
only most of the common English structures but also 
ungrammatical sentences typical of those produced by 
people with language-delaying handicaps such as deaf- 
ness. To control the operation of the generator, sever- 
al layers of control structures have been developed. 
Constraints and syntactic specifications allow the user 
and the system to determine the syntactic form of the 
sentences at a very high level. Although semantic 
information is currently used only in lexical insertion, 
a KL-ONE interface is being designed. 
