Text Generation for Strategic Computing 
USC/Information Sciences Institute 
Marina del Rey, CA 90292 
Project Leaders: William Mann & Norman Sondheimer 
Project Staff: Robert Albano, Susanna Cumming, Thomas Galloway, Christian Matthiessen, 
Bernhard Nebel, Lynn Poulton, George Vamos, Richard Whitney 
1 Objectives 
The US military is an information-rich, computer-intensive organization. It needs to have easy, 
understandable access to a wide variety of information. Currently, information is often in obscure 
computer notations that are only understood after extensive training and practice. Although easy 
discourse between users and machines is an important objective for any situation, this issue is 
particularly critical with regard to automated decision aids such as expert-system-based battle 
management systems that have to carry on a dialog with a force commander. A commander cannot 
afford to miss important information, nor is it reasonable to expect force commanders to undergo 
highly specialized training to understand obscure computer dialects which differ from machine to 
machine. 
The great deal of work that has been done in the area of natural language understanding is starting 
to pay off with the delivery of functional interfaces that interact naturally with the user. Comparatively 
little, however, has been done in the area of natural language generation. Currently, there is no 
effective technology for expressing complex computer notations in ordinary English. If there were, 
computer-based military information could be made more accessible and understandable in a manner 
less subject to personnel changes. 
The Text Generation for Strategic Computing project is creating and demonstrating new technology 
to provide an English-in, English-out interface to computer data to be embodied in a system called 
Janus. ISI is developing the English-out (text generation) portion of this overall system, a module 
named Penman. Its initial capability will be demonstrated on a naval database, but most of the 
techniques are more general and can be reapplied to other military problems. The end 
result will be an exciting new capability for the military that produces answers to queries and 
commands in the form of text that is understandable to any user who understands English. 
This project was put in place at the beginning of FY85 in order to develop the first natural language 
generation capability robust and capable enough to be used in DARPA's Strategic Computing 
Program. It is intended that this interface be coupled to the battle management system being 
developed under DARPA's Fleet Command Center Battle Management Program (FCCBMP). In the 
first 1.5 years of the effort, we were able to demonstrate the first generation system to produce 
English from output demands in formal mathematical logic using a broad coverage grammar and an 
artificial intelligence knowledge base. We have developed the basic technology that will permit the 
realization of a full text generation system. In the next phase of the work, we will extend text 
generation from sentences to paragraphs, increase the power of the grammar, dictionary, and 
planner, expand the knowledge base to cover more fully the battle management problem, optimize 
and increase the robustness of the software, and tune the resulting system to the Navy problem. 
Finally, we will deliver our generation system to BBN, Inc., which will combine it with an understanding 
system for delivery to DARPA and subsequent integration with FCCBMP. 
This report details our approach, Penman's current status, and our plans for future work. 
2 Our Approach 
Both understanding and generation components are necessary in an effective information system 
interface. It is also necessary that the technologies used by these two components be absolutely 
consistent. If they are not, the embarrassing situation of the interface failing to understand a piece of 
text which it has just output can occur, causing a lack of confidence in the system by users. 
In order to alleviate this problem, we are developing Penman in close cooperation with a natural 
language understanding project at Bolt, Beranek, and Newman Inc. (BBN). This cooperation extends 
to joint development of knowledge representation software and lexicon design, and work to ensure 
compatibility between the RUS understanding grammar and the Nigel generation grammar. 
The understanding/generation compatibility effort can be considered the first of our goals for the 
Penman system. Another is the ability to generate substantial amounts of English text, up to the 
multiparagraph level. A useful interface should not be restricted to short replies or comments. 
A third goal is to make Penman as portable and domain independent as possible. There are many 
knowledge domains available for interface systems, and few specialists to create the interfaces. 
Thus, we are emphasizing domain independent capabilities over domain dependent ones. 
Finally, there are many linguistically and computationally significant issues that must be addressed 
if high quality generation is to be achieved. For example, we are working on improved methods for 
knowledge representation, vocabulary development, and for using semantics and discourse context 
to guide the generation of text. 
3 Accomplishments 
By the time of this report, the project had created and delivered a Master Lexicon facility, called ML, 
for vocabulary acquisition and use \[Cumming 86a, Cumming 86b, Cumming 86c\]. ML is unusual in 
that it is compatible with the two radically different grammars of English in Janus: the Nigel generation 
grammar and the RUS understanding grammar. The system features a multi-window bit-mapped 
display which can be manipulated through the keyboard and a mouse. Online documentation is provided 
for all lexical choices. ML is built to allow for the extension of the Janus lexical system. It is also 
possible to use ML with other natural language processing systems by replacement of a single data 
structure. 
In addition, work has been done on bringing the Nigel and RUS grammars into compatibility. 
Our general goal is the development of natural language generation capabilities. Our vehicle for 
these capabilities will be a reusable module designed to meet all of a system's needs for generated 
sentences. The generation module must have an input notation in which demands for expression are 
represented. This notation should be of general applicability. For example, a good notation ought to 
be generally useful in a reasoning system. Also, the notation should have a well-defined semantics 
and the generator has to have some way of interpreting the demands. This interpretation has to be 
efficient. 
In our research, we have chosen to use formal logic as a demand language. Network 
knowledge-bases are used to define the domain of discourse in order to help the generator 
interpret the logical forms. And a restricted, hybrid knowledge representation is utilized to 
analyze demands for expression using the knowledge base \[Sondheimer 86\]. We have: 
1. Developed a demand language, Penman Logical Form (PLF), based on first order 
logic \[USC/ISI 85\], 
2. Structured a NIKL (New Implementation of KL-ONE) network \[Kaczmarek 86\] to reflect 
conceptual distinctions observed by functional systemic linguists, 
3. Developed a method for translation of demands for expression into a propositional logic 
database, 
4. Employed KL-TWO \[Vilain 85\] to analyze the translated demands, and 
5. Used the results of the analyses to provide directions to the Nigel English sentence 
generation system \[Mann & Matthiessen 83\]. 
To first order logic, PLF adds restricted quantification, i.e., the ability to restrict the set quantified 
over. In addition, we allow for equality and some related quantifiers and operators, such as the 
quantifier for "there exists exactly one ..." (∃!) and the operator for "the one thing that ..." (ι). We permit 
the formation and manipulation of sets, including a predicate for set membership (ELEMENT-OF). 
And we have some quantifiers and operators based on Habel's η operator \[Habel 82\]. 
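As an illustration of what these additions amount to, the equivalences below give the standard first-order readings of restricted quantification and of "there exists exactly one". The notation (x : R(x)) for a range restriction is an assumption made here for exposition and is not claimed to be PLF's concrete syntax; R and P stand for arbitrary formulas.

```latex
\begin{align*}
(\forall x : R(x))\; P(x) &\;\equiv\; \forall x\,\bigl(R(x) \rightarrow P(x)\bigr) \\
(\exists x : R(x))\; P(x) &\;\equiv\; \exists x\,\bigl(R(x) \land P(x)\bigr) \\
\exists!\,x\; P(x) &\;\equiv\; \exists x\,\bigl(P(x) \land \forall y\,(P(y) \rightarrow y = x)\bigr)
\end{align*}
```

Keeping the restriction R(x) apart from the predication P(x), rather than folding both into one formula, preserves the distinction between the set being described and what is said about it.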
Our system's knowledge has been organized in a new way in order to facilitate English expression. 
Abstract concepts corresponding to the major conceptual categories of English have been defined in 
NIKL (a KL-ONE dialect) and used to organize the conceptual hierarchy of the domain. By defining 
concepts such as process, event, quality, relationship and object in this way, fluent English 
expression is facilitated. 
KL-TWO is a hybrid knowledge representation system that uses NIKL's formal semantics. KL-TWO 
links another reasoner, PENNI, to NIKL. For our purposes, PENNI can be viewed as restricted to 
reasoning using propositional logic. 
We translate a logical form into an equivalent KL-TWO structure. All predications appearing in the 
logical form are put into the PENNI database as assertions. A separate tree is created which reflects 
the variable scoping. Separate scopes are kept for the range restriction of a quantification and its 
predication. 
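The translation just described can be sketched roughly as follows. This is a minimal illustration in Python, not the project's implementation: the names Quant, Scope and translate, the tuple encoding of predications, and the example predicates are all hypothetical, and a plain list stands in for the PENNI database.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

# A predication is a predicate name followed by its arguments,
# e.g. ("SENT-BY", "m", "Jones"). This encoding is an assumption.
Pred = Tuple[str, ...]


@dataclass
class Quant:
    """A restricted quantification: (quantifier var : restriction) body."""
    quantifier: str
    var: str
    restriction: List["Form"]
    body: List["Form"]


Form = Union[Pred, Quant]


@dataclass
class Scope:
    """One node of the scope tree, holding the predications made directly in it."""
    label: str
    preds: List[Pred] = field(default_factory=list)
    children: List["Scope"] = field(default_factory=list)


def translate(forms: List[Form], scope: Scope, assertions: List[Pred]) -> None:
    """Put every predication into the flat assertion list (standing in for
    the PENNI database) and record variable scoping in a separate tree."""
    for form in forms:
        if isinstance(form, Quant):
            # Separate scopes for the range restriction and the predication.
            restriction = Scope(f"{form.quantifier} {form.var}: restriction")
            predication = Scope(f"{form.quantifier} {form.var}: predication")
            scope.children.extend([restriction, predication])
            translate(form.restriction, restriction, assertions)
            translate(form.body, predication, assertions)
        else:
            assertions.append(form)
            scope.preds.append(form)


# "Some message which was sent by Jones is unread" (hypothetical encoding).
logical_form = [
    Quant("exists", "m",
          restriction=[("MESSAGE", "m"), ("SENT-BY", "m", "Jones")],
          body=[("UNREAD", "m")]),
]
root = Scope("top")
facts: List[Pred] = []
translate(logical_form, root, facts)
```

After the call, facts holds all three predications as flat assertions, while root has two child scopes, one for the restriction (a message sent by Jones) and one for the predication (it is unread).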
The Nigel grammar and generator realize the functional systemic framework at the level of 
sentence generation. Within this framework, language is viewed as offering a set of grammatical 
choices to its speakers. Speakers make their choices based on the information they wish to convey 
and the discourse context they find themselves in. Nigel captures the first of these notions by 
organizing minimal sets of choices into systems. The grammar is actually just a collection of these 
systems. The factors the speaker considers in evaluating his communicative goal are shown by 
questions called inquiries inside of the chooser that is associated with each system. A choice 
alternative in a system is chosen according to the responses to one or more of these inquiries. It is 
these inquiries which we have implemented. 
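The relationship between systems, choosers and inquiries can be pictured with a small sketch. It is illustrative only: the system names, alternatives, inquiry names and context keys below are hypothetical and are not taken from Nigel, and the sketch ignores the dependencies that hold between systems in a real systemic grammar.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
# An inquiry is a question about what is to be expressed; here it is
# modeled as a function from the context to a symbolic answer.
Inquiry = Callable[[Context], str]


@dataclass
class System:
    """A minimal set of grammatical alternatives plus its chooser."""
    name: str
    alternatives: List[str]
    chooser: Callable[[Context, Dict[str, Inquiry]], str]


def number_chooser(context: Context, ask: Dict[str, Inquiry]) -> str:
    # The chooser selects an alternative from the response to an inquiry.
    return "plural" if ask["multiplicity"](context) == "multiple" else "singular"


def mood_chooser(context: Context, ask: Dict[str, Inquiry]) -> str:
    answer = ask["speech_function"](context)
    return "interrogative" if answer == "question" else "declarative"


def make_choices(grammar: List[System], context: Context,
                 ask: Dict[str, Inquiry]) -> Dict[str, str]:
    """Run every system's chooser and collect the selected alternatives."""
    choices = {}
    for system in grammar:
        choice = system.chooser(context, ask)
        if choice not in system.alternatives:
            raise ValueError(f"{system.name}: {choice!r} is not an alternative")
        choices[system.name] = choice
    return choices


# Hypothetical inquiry implementations; in the approach described here
# these would consult the knowledge base rather than a dictionary.
INQUIRIES: Dict[str, Inquiry] = {
    "multiplicity": lambda c: "multiple" if c["entity_count"] > 1 else "unitary",
    "speech_function": lambda c: "question" if c["is_question"] else "statement",
}

GRAMMAR = [
    System("NUMBER", ["singular", "plural"], number_chooser),
    System("MOOD", ["declarative", "interrogative"], mood_chooser),
]

choices = make_choices(GRAMMAR, {"entity_count": 3, "is_question": False}, INQUIRIES)
print(choices)  # {'NUMBER': 'plural', 'MOOD': 'declarative'}
```

The point of the separation is that the grammar (systems and choosers) stays fixed, while the inquiry implementations are the only part that must know how the application's knowledge is represented.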
Our implementation of Nigel's inquiries using the connection and scope structures with the NIKL 
upper structure is fairly straightforward. Since the logical forms reflecting the world view are in the 
highest level of the NIKL model, the information decomposition inquiries use these structures to do 
search and retrieval. With all of the predicates in the domain specializing concepts in the functional 
systemic level of the NIKL model, information characterization inquiries that consider aspects of the 
connection structure can test for the truth of appropriate PENNI propositions. The inquiries that 
relate to information presented in the quantification structure of the logical form will search the scope 
structure. Finally, to supply lexical entries, we associate lexical entries with NIKL concepts as 
attached data and use the retrieval methods of PENNI and NIKL to retrieve the appropriate terms. 
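Two of the mechanisms mentioned above, testing whether a domain concept specializes an upper-structure category and retrieving a lexical entry attached to a concept, can be sketched as below. This is a simplified stand-in, not NIKL or PENNI: the Concept class, the single-parent hierarchy, the concept names, and the rule of falling back to the nearest ancestor's lexical item are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Concept:
    """A node in a single-inheritance concept hierarchy (a simplification)."""
    name: str
    parent: Optional["Concept"] = None
    lexical_item: Optional[str] = None  # lexical entry kept as attached data

    def ancestors(self) -> Iterator["Concept"]:
        """Yield this concept and then each more general concept above it."""
        node: Optional[Concept] = self
        while node is not None:
            yield node
            node = node.parent


def specializes(concept: Concept, category: str) -> bool:
    """An information characterization test: is the concept a kind of category?"""
    return any(c.name == category for c in concept.ancestors())


def lexical_item_for(concept: Concept) -> Optional[str]:
    """Return the lexical item attached to the concept, or to its nearest
    ancestor that has one (this fallback rule is an assumption)."""
    for c in concept.ancestors():
        if c.lexical_item is not None:
            return c.lexical_item
    return None


# A tiny hypothetical hierarchy: domain concepts placed under upper-structure ones.
obj = Concept("OBJECT")
message = Concept("MESSAGE", parent=obj, lexical_item="message")
unread_message = Concept("UNREAD-MESSAGE", parent=message)

print(specializes(unread_message, "OBJECT"))  # True
print(lexical_item_for(unread_message))       # message
```

Because every domain concept sits somewhere under an upper-structure category, the same two lookups work for any domain model organized this way, which is what makes the approach portable across knowledge bases.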
Although we have done some generation using the BBN naval knowledge base, our most extensive 
experience comes from the Consul knowledge domain of computer mail and calendars \[Kaczmarek, 
Mark, and Sondheimer 83\]. Using the developments above, translation from first-order logic to 
English organized by NIKL has been demonstrated for a wide variety of sentence types. 
Table 1 shows a series of independent sentences that Penman is now able to generate. We have 
arranged and labeled them to show how they might enter into computer database dialogue. 
Paraphrases in <brackets> show corresponding sentences of the type that will be generated from 
Naval domain knowledge. 
Besides the technology described above, work has been done on aspects of discourse modeling. 
This included a study of object description and an initial noun phrase planner, which space does not 
allow us to detail. 
In order to produce the texts that are needed in a user assistance facility, work is under way to apply 
Rhetorical Structure Theory to text planning. A procedure has been developed which can design 
structures for paragraph-length texts given information about the text goal and communication 
situation. 
4 Future Effort 
In the next phase of our effort, Penman will be delivered to BBN, integrated into Janus, and 
demonstrated in an extensive series of laboratory exercises. This version of Penman will support 
coordinated English input and output, paraphrasing, and user assistance, as well as constructive 
rhetorical structure theory to create descriptive texts \[Mann 84\]. 
For a machine to create fluent English texts, several problems of text organization and presentation 
must be solved. We will design, implement and demonstrate two kinds of new text planning 
processes. One will be a planner for gross organization of text based on rhetorical structure theory, 
and the other a low level planner for sentence configurations. In addition, we will examine the expert 
system explanation functionality necessary within FCCBMP and will design a text planner to meet 
those needs. 
Table 1: Sentences from Penman 
1. User: What is the read status of the message which was sent by Jones? <What is the 
readiness of the ship deployed by CINCPACFLT?> 
2. System: Message 40 is unread. <The Nimitz is C4.> 
3. User: Are any messages which were sent by Jones in the tape file? <Are any ships 
deployed by CINCPACFLT in the Atlantic fleet?> 
4. User: Are any messages which were sent by Jones in the old ISI tape file which Smith 
has? <Are any ships deployed by CINCPACFLT in the new battle group which Spivak 
has?> 
5. System: Yes. <Yes.> 
6. System: Do you want me to display the read status of the messages? <Do you want me to 
display the readiness of the ships?> 
7. User: No. <No.> 
8. System: There are messages which were sent by Jones. <There are ships which were 
deployed by CINCPACFLT.> 
9. System: However, the tape is in use. <However, the battle group is en route.> 
10. User: Show me Message 40. <Show me the JFK.> 
11. User: Did Smith send any messages? <Did the JFK send any CASREPs?> 
12. System: Message 45 was sent by Smith. <CASREP 45 was sent by the JFK.> 
13. User: Did Smith schedule his meeting with me in New York? <Did the JFK schedule its 
port visit in San Diego?> 
14. System: Yes. <Yes.> 
15. User: What does the window show? <What does the map show?> 
16. System: The window shows a message which was sent by Jones. <The map shows a sail 
plan which was sent by CINCPAC.> 
17. System: The window shows a request by Jones. <The map shows a request which was 
sent by CINCPACFLT.> 
18. User: Did I create the window? <Did I create the map?> 
19. System: The window was created when you asked me to display Message 50. <The map 
was created when you asked me to display Sail Plan 33.> 
20. System: You deleted the message, then you asked me to display it. <You deleted Sail 
Plan 33, then you asked me to display it.> 
We also plan to develop enhancements to the basic interface to Penman. As part of Janus, BBN 
plans to develop methods for translating English inputs into a higher order predicate calculus form. 
In addition, the logical form will be manipulated inside the BBN understanding component to reflect 
database organization. Expressions in this form are expected to be much more complex than the 
direct expressions of the corresponding English, requiring multiple sentence expression. Penman 
will be extended so that it can express these complex forms in English. 
Following successful completion of initial laboratory testing, Penman will be delivered to NOSC for 
knowledge base development and testing. This Penman will have a basic text generation capability 
with only limited capability for responding to the knowledge of the user and the state of the 
user-machine interaction. In subsequent years, these basic capabilities will be extensively expanded to 
provide a much more useful interface, with particular development in the areas of user assistance, 
ability of the system to understand the user's knowledge and expectations, and needs for expert 
system explanation and user dialogue, as described below. 
Development of extensions is required in the area of user assistance functionality. Text generation 
can assist the user by explaining system actions and objects, and by clarifying its interpretation of 
incoming English. These capabilities must be integrated into the Janus interface in a natural, 
easy-to-use form. We expect to demonstrate increased user assistance functionality as time goes on. 
Development of extensions is also required in the area of responsiveness to users' knowledge and 
expectations. This includes making the system aware of the ongoing topics and issues, what the user 
has already been told, and what he can be expected to know without being told. This in turn involves 
more extensive use of inference. It also requires implementing a notion of the state of the dialogue -- 
what things are in attention, what interactions are in process and what has been interrupted and 
suspended. 
Penman is scheduled for use in an expert-system explanation context whose details are yet to be 
determined. We will study what functionality is required in FCCBMP. The study will lead to a design 
of code for this use of Penman. Implementation and demonstration of full functionality are expected 
following the user assistance work. 
References 
\[Cumming 86a\] Susanna Cumming, Design of a Master Lexicon, USC/Information Sciences Institute, 
Marina del Rey, CA, Technical Report ISI/RR-85-163, February 1986. 
\[Cumming 86b\] Susanna Cumming, Robert Albano, A Guide to Lexical Acquisition in the JANUS 
System, USC/Information Sciences Institute, Marina del Rey, CA, Technical Report 
ISI/RR-85-162, February 1986. 
\[Cumming 86c\] Susanna Cumming, The Lexicon in Text Generation, USC/Information Sciences 
Institute, Marina del Rey, CA, Technical Report ISI/RR-86-168, 1986. Presented at the 
Workshop on Automating the Lexicon, Pisa, Italy, May 1986. 
\[Habel 82\] Christopher Habel, "Referential nets with attributes," in Horecky (ed.), Proc. COLING-82, 
North-Holland, Amsterdam, 1982. 
\[Kaczmarek 86\] T. Kaczmarek, R. Bates, G. Robins, "Recent Developments in NIKL," in AAAI-86, 
Proceedings of the National Conference on Artificial Intelligence, AAAI, Philadelphia, PA, August 
1986. 
\[Kaczmarek, Mark, and Sondheimer 83\] T. Kaczmarek, W. Mark, and N. Sondheimer, "The 
Consul/CUE Interface: An Integrated Interactive Environment," in Proceedings of CHI '83 
Human Factors in Computing Systems, pp. 98-102, ACM, December 1983. 
\[Mann 84\] Mann, W., Discourse Structures for Text Generation, USC/Information Sciences Institute, 
Marina del Rey, CA, Technical Report RR-84-127, February 1984. 
\[Mann & Matthiessen 83\] William C. Mann & Christian M.I.M. Matthiessen, Nigel: A Systemic 
Grammar for Text Generation, USC/Information Sciences Institute, Technical Report 
ISI/RR-83-105, February 1983. 
\[Sondheimer 86\] Norman Sondheimer, Bernhard Nebel, "A Logical-Form and Knowledge-Base 
Design for Natural Language Generation," in AAAI-86, Proceedings of the National Conference 
on Artificial Intelligence, AAAI, August 1986. 
\[USC/ISI 85\] Penman's Logical Language, USC/Information Sciences Institute, Marina del Rey, CA, 
1985. 
\[Vilain 85\] M. Vilain, "The Restricted Language Architecture of a Hybrid Representation System," in 
Proceedings of the Ninth International Joint Conference on Artificial Intelligence, pp. 547-551, 
Los Angeles, CA, August 1985. 
