RESPONSE GENERATION IN QUESTION-ANSWERING SYSTEMS
Ralph Grishman 
New York University 
1. INTRODUCTION 
As part of our long-term research into techniques for
information retrieval from natural language data bases,
we have developed over the past few years a natural
language interface for data base retrieval [1,2]. In
developing this system, we have sought general,
conceptually simple, linguistically-based solutions to
problems of semantic representation and interpretation.
One component of the system, which we have recently
redesigned and are now implementing in its revised form,
involves the generation of responses. This paper will
briefly describe our approach, and how this approach
simplifies some of the problems of response generation.
Our system processes a query in four stages: syntactic
analysis, semantic analysis, simplification, and
retrieval (see Figure 1). The syntactic analysis, which is
performed by the Linguistic String Parser, constructs a
parse tree and then applies a series of transformations
which decompose the sentence into an
operator-operand-adjunct tree. The semantic analysis first
translates this tree into a formula of the predicate
calculus with set-formers and quantification over sets.
This is followed by anaphora resolution (replacement of
pronouns with their antecedents) and predicate expansion
(replacement of predicates not appearing in the data base
by their definitions in terms of predicates in the data
base). The simplification stage performs certain
optimizations on nested quantifiers, after which the
retrieval component evaluates the formula with respect to
the data base and generates a response.
Our original system, like many current question-answering
systems, had simple mechanisms for generating lists and
tables in response to questions. As we broadened our
system's coverage, however, to include predicate
expansion and to handle a broad range of conjoined
structures, the number of ad hoc rules for generating
answers grew considerably. We decided therefore to
introduce a much more general mechanism for translating
predicate calculus expressions back into English.
2. PROBLEMS OF RESPONSE GENERATION 
To understand how this can simplify response generation, 
we must consider a few of the problems of generating 
responses. The basic mechanism of answer generation is 
very simple. Yes-no questions are translated into
predicate formulas; if the formula evaluates to true, print
"yes", else "no". Wh-questions translate into
set-formers; the extension of the set is the answer to the
question.
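
The basic mechanism just described can be illustrated with
a minimal sketch. It is not the system's code: the toy fact
table and the names FACTS, STUDENTS, holds, answer_yes_no
and answer_wh are all hypothetical, chosen only to show a
formula being evaluated for a yes-no question and a
set-former being enumerated for a wh-question.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical toy data base of (predicate, subject, object) facts.
FACTS: Set[Tuple[str, str, str]] = {
    ("passed", "John", "French exam"),
    ("passed", "Mary", "French exam"),
    ("failed", "Sam", "French exam"),
}
STUDENTS: List[str] = ["John", "Mary", "Sam"]


def holds(predicate: str, subject: str, obj: str) -> bool:
    """True if the fact is recorded in the toy data base."""
    return (predicate, subject, obj) in FACTS


def answer_yes_no(formula: Callable[[], bool]) -> str:
    """Yes-no question: evaluate the formula and report yes or no."""
    return "yes" if formula() else "no"


def answer_wh(domain: Iterable[str],
              condition: Callable[[str], bool]) -> List[str]:
    """Wh-question: the answer is the extension of the set-former
    {x in domain | condition(x)}."""
    return [x for x in domain if condition(x)]
```

Under these assumptions, "Did John pass the French exam?"
becomes answer_yes_no(lambda: holds("passed", "John",
"French exam")), and "Which students passed the French
exam?" becomes answer_wh(STUDENTS, lambda s:
holds("passed", s, "French exam")).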
One complication is embedded set-formers. An embedded 
set-former arises when the question contains a quantifier 
or conjunction with wider scope than the question word. 
For example, the question 
Which students passed the French exam and which failed 
it? 
will be translated into two set-formers connected by and:
{s ∈ set-of-students | passed(s, French exam)}
and
{s ∈ set-of-students | failed(s, French exam)}
It would be confusing to print the two sets by them- 
selves. Instead, for each set to be printed, we take 
the predicate satisfied by the set, add a universal 
quantifier over the extension of the set, and convert the 
resulting formula into an English sentence. For our 
example, this would mean 
print-English-equivalent-of '(∀x ∈ S1)
passed(x, French exam)'
where S1 = {s ∈ set-of-students | passed(s, French exam)}
and
print-English-equivalent-of '(∀x ∈ S2)
failed(x, French exam)'
where S2 = {s ∈ set-of-students | failed(s, French exam)}
which would generate a response such as 
John, Paul, and Mary passed the French exam; 
Sam and Judy failed it. 
The same technique will handle set-formers within the
scope of quantifiers, as in the sentence 
Which exams did each student take? 
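
The rule for printing a set (universally quantify over its
extension and render the result in English) might be
sketched as follows. This is an illustration only: the
verb phrases, including the pronoun "it", are written by
hand, whereas the system derives them by generative
transformations, and the helper names and the treatment of
an empty set are assumptions.

```python
from typing import Callable, Sequence


def english_list(names: Sequence[str]) -> str:
    """Join names as 'A', 'A and B', or 'A, B, and C'."""
    if not names:
        return "no one"  # assumption: the empty case is not covered here
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return f"{names[0]} and {names[1]}"
    return ", ".join(names[:-1]) + ", and " + names[-1]


def clause_for_set(domain: Sequence[str],
                   condition: Callable[[str], bool],
                   verb_phrase: str) -> str:
    """English for '(forall x in S) P(x)', where S is the extension
    of {s in domain | condition(s)} and verb_phrase expresses P."""
    extension = [x for x in domain if condition(x)]
    return f"{english_list(extension)} {verb_phrase}"


def respond(clauses: Sequence[str]) -> str:
    """Conjoin one clause per set-former into a single response."""
    return "; ".join(clauses) + "."


# Hypothetical data matching the example in the text.
STUDENTS = ["John", "Paul", "Mary", "Sam", "Judy"]
PASSED = {"John", "Paul", "Mary"}

response = respond([
    clause_for_set(STUDENTS, lambda s: s in PASSED,
                   "passed the French exam"),
    clause_for_set(STUDENTS, lambda s: s not in PASSED,
                   "failed it"),
])
# response == "John, Paul, and Mary passed the French exam; "
#             "Sam and Judy failed it."
```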
Additional complications arise when the system wants to 
add some words or explanation to the direct answer to a 
question. When asked a yes-no question, a helpful 
question-answering system will try to provide more
information than just "yes" or "no". In our system, if the
outermost quantifier is existential -- (∃x ∈ S) C(x) --
we print {x ∈ S | C(x)}; if it is universal --
(∀x ∈ S) C(x) -- we print {x ∈ S | ¬C(x)}. For example,
in response to 
Did all the students take the English exam? 
our system will reply 
No, John, Mary, and Sam did not.
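
A minimal sketch of this rule follows, assuming a
hypothetical function answer_quantified and hand-supplied
verb phrases. The text specifies only which set is
printed; the exact wording of the replies, and the bare
"Yes." and "No." returned when the set is empty, are
assumptions of the sketch.

```python
from typing import Callable, Sequence


def english_list(names: Sequence[str]) -> str:
    """Join names as 'A', 'A and B', or 'A, B, and C'."""
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return f"{names[0]} and {names[1]}"
    return ", ".join(names[:-1]) + ", and " + names[-1]


def answer_quantified(quantifier: str,
                      domain: Sequence[str],
                      condition: Callable[[str], bool],
                      positive_vp: str,
                      negative_vp: str) -> str:
    """Answer a yes-no question whose outermost quantifier is
    'exists' or 'forall', adding the informative set."""
    if quantifier == "exists":
        # Print {x in S | C(x)}: the witnesses.
        witnesses = [x for x in domain if condition(x)]
        if witnesses:
            return f"Yes, {english_list(witnesses)} {positive_vp}."
        return "No."
    if quantifier == "forall":
        # Print {x in S | not C(x)}: the counterexamples.
        counterexamples = [x for x in domain if not condition(x)]
        if counterexamples:
            return f"No, {english_list(counterexamples)} {negative_vp}."
        return "Yes."
    raise ValueError(f"unknown quantifier: {quantifier}")


# Hypothetical data matching the example in the text.
STUDENTS = ["John", "Mary", "Sam", "Judy"]
TOOK_ENGLISH = {"Judy"}

reply = answer_quantified("forall", STUDENTS,
                          lambda s: s in TOOK_ENGLISH,
                          "did", "did not")
# reply == "No, John, Mary, and Sam did not."
```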
When the outermost quantifier is the product of predicate 
expansion, however, it is not sufficient to print the 
corresponding set, since the predicate which this set 
satisfies is not explicit in the question. For example, 
in the data base of radiology reports we are currently
using, a report is negative if it does not show any
positive or suspicious medical findings. Thus the question
Was the X-ray negative?
would be translated into
negative(X-ray)
and expanded into
(∀f ∈ medical-findings) ¬show(X-ray, f)
so the system would compute the set
{f ∈ medical-findings | show(X-ray, f)}
Just printing the extension of this set,
No, metastases.
would be confusing to the user. Rather, by using the
same rule as before for printing a set, we produce a
response such as
No, the X-ray showed metastases.
Similar considerations apply to yes-no questions with a
conjunction of wide scope.

[Figure 1. The structure of the NYU question-answering system.
Question analysis: question; string analysis; parse tree;
transformational decomposition; operator-operand-adjunct tree;
quantifier analysis; predicate calculus formula; predicate
calculus with pronouns resolved; predicate expansion; predicate
calculus with predicates expanded; translation to retrieval
request; retrieval request; simplification; simplified retrieval
request; retrieved data.
Response synthesis: retrieved data substituted into the
predicate formula; translation to operator-operand-adjunct tree;
generative transformations; response.]
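
The X-ray example can be sketched as below. The data, the
names MEDICAL_FINDINGS, SHOWN, show and answer_was_negative,
and the hand-written sentence frame are all hypothetical;
in the system the sentence is produced by the general
generation procedure rather than by a template.

```python
from typing import Dict, List

# Hypothetical toy data base of findings shown by each report.
MEDICAL_FINDINGS: List[str] = ["metastases", "fracture"]
SHOWN: Dict[str, List[str]] = {"X-ray": ["metastases"]}


def show(report: str, finding: str) -> bool:
    """Data base predicate: the report shows the finding."""
    return finding in SHOWN.get(report, [])


def answer_was_negative(report: str) -> str:
    """negative(report) is expanded into
    (forall f in medical-findings) not show(report, f).
    The set {f | show(report, f)} holds the counterexamples, and
    the reply states the predicate they satisfy, not just the set."""
    shown = [f for f in MEDICAL_FINDINGS if show(report, f)]
    if not shown:
        return "Yes."
    return f"No, the {report} showed {' and '.join(shown)}."


reply = answer_was_negative("X-ray")
# reply == "No, the X-ray showed metastases."
```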
3. DESIGN AND IMPLEMENTATION 
As we noted earlier, our question-analysis procedure is
composed of several stages which transform the question
through a series of representations: sentence, parse
tree, operator-operand-adjunct tree (transformational
decomposition), predicate calculus formula, retrieval
request. This multi-stage structure has made it
straightforward to design our sentence generation, or
synthesis, procedure, which constructs the same
representations in the reverse order from the analysis
procedure.
In designing the synthesis procedure, the first decision
we had to make was: which representation should the
synthesis procedure accept as input? The retrieval
procedure instantiates variables in the retrieval request,
so it might seem most straightforward for the retrieval
procedure to pass to the synthesis procedure a modified
retrieval request representation. Alternatively, we
could keep track of the correspondence between
components of the retrieval request and components of
the parse tree, operator-operand-adjunct tree, or
predicate calculus representation. Then we could
substitute the results of retrieval back into one of the
latter representations and have the synthesis component
work from there. This would simplify the synthesis
procedure, since its starting point would be "closer" to
the sentence representation.
A basic requirement for using one of these
representations is then the ability to establish a
correspondence between those components of the retrieval
request which may be significant in generating a response
and components of the other representation. Because
predicate expansion introduces variables and relations
which are not present earlier but which may have to be
used in the response, we could not use a representation
closer to the surface than the output of predicate
expansion (a predicate calculus formula). Subsequent
stages of the analysis procedure, however (translation to
retrieval request and simplification), do not introduce
structures which will be needed in generating responses.
We therefore chose to simplify our synthesizer by using as
its input the output of predicate expansion (instantiated
with the results of retrieval) rather than the retrieval
request.
The synthesis procedure has three stages, which
correspond to three of the stages of the analysis procedure
(Figure 1). First, noun phrases which can be
pronominalized are identified. Second, the predicate
calculus expression is translated into an
operator-operand-adjunct tree. Finally, a set of generative
transformations is applied to produce a parse tree, whose
frontier is the generated sentence.
The correspondence between analysis and synthesis extends
to the details of the analytic and generative
transformational stages. Both stages use the same program,
the transformational component of the Linguistic String
Parser [3]. Most analytic transformations have
corresponding members (performing the reverse
transformations) in the generative set. These
correspondences have greatly facilitated the design and
coding of our generative stage.
One problem in transforming phrases into predicate 
calculus and then regenerating them is that syntactic
paraphrases will be mapped into a single phrase (one of 
the paraphrases). For example, "the negative X-rays" and 
"the X-rays which were negative" have the same predicate 
calculus representation, so only one of these structures 
would be regenerated. This is undesirable in generating 
replies: a natural reply will, whenever possible,
employ the same syntactic constructions used in the
question. In order to generate such natural replies, each
predicate and quantifier which is directly derived from 
a phrase in the question is tagged with the syntactic 
structure of that phrase. Predicates and quantifiers not 
directly derived from the question (e.g., those produced 
by predicate expansion) are untagged. Generative
transformations use these tags to select the syntactic
structure to be generated. For untagged constructs, a
special set of transformations selects appropriate
syntactic structures (this is the only set of generative
transformations without corresponding analytic
transformations).
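
The tagging scheme can be illustrated with a short sketch.
The class TaggedPredicate, the tag names
"prenominal-adjective" and "relative-clause", and the
default chosen for untagged predicates are all
hypothetical; the system itself makes these choices with
generative transformations written in Restriction
Language.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaggedPredicate:
    """A one-place predicate applied to a noun, optionally tagged
    with the syntactic structure it had in the question."""
    adjective: str
    noun: str
    tag: Optional[str] = None  # None: produced by predicate expansion


def generate_noun_phrase(p: TaggedPredicate) -> str:
    """Regenerate the structure recorded in the tag, so that the
    reply echoes the construction used in the question."""
    if p.tag == "relative-clause":
        return f"the {p.noun} which were {p.adjective}"
    if p.tag == "prenominal-adjective":
        return f"the {p.adjective} {p.noun}"
    # Untagged: a default structure must be selected. Using the
    # prenominal adjective here is an assumption of this sketch.
    return f"the {p.adjective} {p.noun}"
```

With these assumptions, TaggedPredicate("negative",
"X-rays", "relative-clause") yields "the X-rays which were
negative", while the same predicate tagged
"prenominal-adjective" yields "the negative X-rays", even
though both have the same predicate calculus
representation.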
4. OTHER EFFORTS 
As we noted at the beginning, few question-answering
systems incorporate full-fledged sentence generators;
fixed-format and tabular responses suffice for systems
handling a limited range of quantification, conjunction,
and inference. However, several investigators have
developed procedures for generating sentences from
internal representations such as semantic nets and
conceptual dependency structures [4,5,6,7].
Sentence generation from an internal representation 
involves at least three types of operations: 
o recursive sequencing through the nested predicate 
structure 
o sequencing through the components at one level of the 
structure 
o transforming the structure or generating words of the 
target sentence. 
The last function is performed by LISP procedures in the 
systems cited (in our system it is coded in Restriction 
Language, a language specially designed for writing 
natural-language grammars). The first two functions are 
either coded into the LISP procedures or are performed 
by an augmented transition network (ATN). Although the 
use of ATNs suggests a parallelism with recognition 
procedures, the significance of the networks is actually 
quite different; a path in a recognition ATN corresponds 
to the concatenation of strings, while a path in a 
generative ATN corresponds to a sequence of arcs in a 
semantic network. In general, it seems that little 
attention has been focussed on developing parallel 
recognition and generation procedures. 
Goldman [5] has concentrated on a fourth type of
operation, the selection of appropriate words (especially
verbs) and syntactic relations to convey particular 
predicates in particular contexts. Although in general 
this can be a difficult problem, for our domain (and 
probably for the domains of all current question-answer- 
ing systems) this selection is straightforward and can 
be done by table lookup or simple pattern matching. 
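
The operations discussed in this section can be shown
together in a toy generator: recursion through the nested
predicate structure, sequencing through the operands at
one level, and word selection by table lookup. The node
format, the table VERB_TABLE and the function generate are
hypothetical, and the sketch does not reproduce any of the
cited systems or our Restriction Language implementation.

```python
from typing import Dict, List, Tuple, Union

# A node is either a finished phrase or (predicate, operands).
Node = Union[str, Tuple[str, List["Node"]]]

# Word selection by table lookup: predicate name -> verb form.
VERB_TABLE: Dict[str, str] = {
    "show": "showed",
    "report": "reported that",
}


def generate(node: Node) -> str:
    """Generate words for a nested predicate structure."""
    if isinstance(node, str):
        return node
    predicate, operands = node
    # Recursive sequencing through the nested structure, and
    # left-to-right sequencing through the operands at this level.
    phrases = [generate(operand) for operand in operands]
    subject, rest = phrases[0], phrases[1:]
    # Generating the words of the target sentence.
    return " ".join([subject, VERB_TABLE[predicate]] + rest)


inner: Node = ("show", ["the X-ray", "metastases"])
outer: Node = ("report", ["the radiologist", inner])
# generate(inner) == "the X-ray showed metastases"
```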
5. CONCLUSION
We have discussed in this paper some of the problems of 
response generation for question-answering systems, and 
how these problems can be solved using a procedure which 
generates sentences from their internal representation.
We have briefly described the structure of this procedure
and noted how our multistage processing has made it 
possible to have a high degree of parallelism between 
analysis and synthesis. We believe, in particular, that 
this parallelism is more readily achieved with our 
separate stages for parsing and transformational 
decomposition than with ATN recognizers, in which these 
stages are combined. 
The translation from predicate calculus to an operator- 
operand-adjunct tree and the generative transformations 
are operational; the pronominalization of noun phrases
is being implemented. We expect that as our question- 
answering system is further enriched (e.g., to recognize 
presupposition, to allow more powerful inferencing rules) 
the ability to generate full-sentence responses will 
prove increasingly valuable. 
6. ACKNOWLEDGEMENTS
I would like to thank Mr. Richard Cantone and 
Mr. Ngô Thanh Nhàn, who have implemented most of the
extensions to our question-answering system over the 
past year. 
This research was supported in part by the National 
Science Foundation under Grant No. MCS 78-03118, by the
Office of Naval Research under Contract No.
N00014-75-C-0571, and by the Department of Energy under
Contract No. EY-76-C-02-3077.
7. REFERENCES 
[1] R. Grishman and L. Hirschman, Question Answering
from Natural Language Medical Data Bases,
Artificial Intelligence 11 (1978) 25-43.
[2] R. Grishman, The Simplification of Retrieval
Requests Generated by Question-Answering Systems,
Proc. Fourth Intl. Conf. on Very Large Data Bases
(1978) 400-406.
[3] J. R. Hobbs and R. Grishman, The Automatic
Transformational Analysis of English Sentences: An
Implementation, Intern. J. Computer Math. A 5 (1976)
267-283.
[4] R. Simmons and J. Slocum, Generating English
Discourse from Semantic Networks, Comm. A.C.M. 15
(1972) 891-905.
[5] N. Goldman, Sentence Paraphrasing from a Conceptual
Base, Comm. A.C.M. 18 (1975) 96-106.
[6] H. Wong, Generating English Sentences from Semantic
Structures, Technical Report No. 84, Dept. of
Computer Sci., Univ. of Toronto (1975).
[7] J. Slocum, Generating a Verbal Response, in
Understanding Spoken Language, ed. D. Walker,
North-Holland (1978) 375-380.

