Epiphenomenal Grammar Acquisition with GSG
Marsal Gavaldà
Interactive Systems, Inc. 
1900 Murray Ave. Suite 203 
Pittsburgh, PA 15217, U.S.A. 
marsal@interactivesys.com
Abstract 
As a step toward conversational systems that allow for a more natural
human-computer interaction, we report on GSG, a system that, while
providing a natural-language interface to a variety of applications,
engages in clarification dialogues with the end user through which new
semantic mappings are dynamically acquired. GSG exploits task- and
language-dependent information but is fully task- and
language-independent in its architecture and strategies.
1 Introduction 
As conversational systems move from the realm of 
science fiction and research labs into people's everyday
life, and as they evolve from the plain, system-directed
interactions à la "press or say one" of so-called
interactive voice response systems based on
isolated-word recognizers and fixed-menu naviga- 
tion, to the more open, mixed-initiative dialogues 
carried out in spoken dialogue systems based on 
large-vocabulary continuous speech recognizers and 
flexible dialogue managers (see, e.g., (Allen et al., 
1996; Denecke, 1997; Walker et al., 1998; Rudnicky 
et al., 1999; Zue et al., 2000)), the overall experien- 
tial quality of the human-computer interaction be- 
comes increasingly important. That is, beyond the 
obvious factors of speech recognition accuracy and 
speech synthesis naturalness, the most critical chal- 
lenge is that of providing conversational interactions 
that feel natural to human users (cf. (Glass, 1999)). 
This, we believe, mainly translates into building sys- 
tems that possess some degree of linguistic, reason- 
ing, and learning abilities. 
In this paper we report on GSG, a conversational 
system that partially addresses these issues by being 
able to dynamically extend its linguistic knowledge 
through simple, natural-language-only interactions
with non-expert users: On a purely on-need basis, 
i.e., when the system does not understand what the 
user means, GSG makes educated guesses, poses con- 
firmation and clarification questions, and learns new 
semantic mappings from the answers given by the 
users, as well as from other linguistic information 
that they may volunteer. GSG provides, therefore, 
an extremely robust interface, and, at the same time, 
significantly reduces grammar development time be- 
cause the original grammar, while complete with 
respect to the semantic representation of the do- 
main at hand, need only cover a small portion of 
the surface variability, since it will be automatically 
extended as an epiphenomenon of engaging in clarification
dialogues with end users.
2 Brief System Description 
As sketched in Figure 1, GSG is a conversational¹
system built around the SOUP parser (Gavaldà, 2000).
GSG's principal (and possibly sole) knowledge 
source is a task-dependent, semantic context-free 
grammar (the Kernel Grammar). At run-time, the 
Grammar is initialized as the union of the Kernel 
Grammar and, possibly, the User Grammar (user-dependent
rules learned in previous sessions). The
Grammar gives rise to the Ontology and to a parsebank
(collection of parse trees), which, together
with a possible Kernel Parsebank, becomes the Parse- 
bank, from which the statistical Prediction Models 
are trained. The Ontology is a directed acyclic
graph automatically derived from the Grammar in
which the nodes correspond to grammar nonterminals
(NTs) and the arcs record immediate dominance
relations, i.e., the presence of, say, NT_i in a
right-hand side (RHS) alternative of NT_j will result
in an arc from NT_i to NT_j. Nodes are annotated
as being "Principal" vs. "Auxiliary" (via
naming convention), "Top-level" vs. "Non-top level"
(i.e., whether they are starting symbols of the grammar),
and with having "Only NT daughters" vs.
"Only T daughters" vs. "Mixed"; arcs are annotated
as being "Is-a" (estimated from being the
only non-optional NT in a RHS alternative) vs.
"Expresses" links, "Always-required" vs. "Always-optional"
vs. "Mixed," and "Never-repeatable" vs.
"Always-repeatable" vs. "Mixed". Also, a topological
sort² on the nodes is computed to derive a
general-to-specific partial order of the NTs.

¹ In the work reported here, GSG's interactions are text-based
(keyboard as input, text window as output), but GSG
is being integrated with both a speech recognizer and a speech
synthesizer.

Figure 1: GSG's system diagram. Ovals enclose knowledge sources, rectangles modules, and arrows indicate
information flow. Dashed components are optional.

Strategy                            Knowledge Source
All-top Parsing                     Grammar
Anchor Mother Predictions           Prediction Models
Required/Is-a/... Daughters Search  Ontology
Verbal Head Search                  POS Tagger, Ontology
Parser Predictions                  Grammar

Table 1: List of GSG's main prediction and learning strategies.
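
As a rough illustration of the Ontology derivation described above, the following Python sketch builds the immediate-dominance arcs from a toy grammar, annotates nodes by daughter type, and computes a general-to-specific order. It is not GSG's implementation: the dictionary encoding of the grammar, the function names, and the simplified rules for VERB_DESIRE and [sender] are assumptions made for the example.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Toy grammar: NT -> list of RHS alternatives (lists of tokens).
# '*' marks an optional token and '+' a repeatable one, as in Figure 3.
# The dict encoding and the terminals "please" and "bob" are assumptions.
GRAMMAR = {
    "[listMail]": [["*VERB_DESIRE", "LIST", "+MAIL_ARGUMENTS"]],
    "VERB_DESIRE": [["please"]],
    "LIST": [["list"], ["get"]],
    "MAIL_ARGUMENTS": [["SENDER"]],
    "SENDER": [["*SENDER_PRE", "[sender]"]],
    "SENDER_PRE": [["from"], ["by"]],
    "[sender]": [["bob"]],
}


def symbol(token):
    """Strip the optionality/repeatability marks from an RHS token."""
    return token.lstrip("*+")


def build_ontology(grammar):
    """Map each NT to the set of NTs in whose RHS it appears (arc daughter -> mother)."""
    arcs = {nt: set() for nt in grammar}
    for mother, alternatives in grammar.items():
        for rhs in alternatives:
            for token in rhs:
                daughter = symbol(token)
                if daughter in grammar:  # anything that is not an NT is a terminal
                    arcs[daughter].add(mother)
    return arcs


def daughter_kind(grammar, nt):
    """Annotate a node by the kind of immediate daughters it allows."""
    kinds = {symbol(token) in grammar for rhs in grammar[nt] for token in rhs}
    if kinds == {True}:
        return "Only NT daughters"
    if kinds == {False}:
        return "Only T daughters"
    return "Mixed"


def general_to_specific(grammar):
    """Topologically sort the NTs so that every mother precedes its daughters."""
    # TopologicalSorter expects node -> predecessors; mothers act as predecessors.
    # It raises graphlib.CycleError if the grammar is cyclic.
    return list(TopologicalSorter(build_ontology(grammar)).static_order())


order = general_to_specific(GRAMMAR)
print(order)
```

Because the sort is only defined for acyclic graphs, a cyclic grammar makes the sketch fail with an exception, which mirrors the acyclicity requirement noted for the topological sort.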
A full system description is beyond the scope of 
this paper, but, very briefly, the User Interface me- 
diates all interactions with the end-user, the stack- 
based Dialogue Manager keeps track of current and 
past utterances and ensuing clarification dialogues, 
and, together with the History Interaction, ensures 
that no answered question is asked again. The GSG 
Engine manages the core of the system's "intelligence,"
namely hypothesizing interpretations (together
with the Parse Tree Builder) and on-line learning
of semantic mappings.
3 Example Dialogue 
To illustrate the workings of GSG, let's analyze an
example interaction in an e-mail client task. Figure 2
shows the example dialogue, Figure 3 presents a relevant
fragment of the semantic context-free grammar³
used to analyze the input, and Table 1 above
lists the main prediction and learning strategies employed.

The user inputs (U1) and, as it turns out, neither
"retrieve," nor "bob," nor "christmas" is in the grammar,
and no full parse of (U1) can be found. Rather
than responding with an "I don't understand -- please
rephrase," GSG sets its predictive machinery in motion
with the goal of constructing a full interpretation
for (U1).⁴

² Requires that the grammar be acyclic.
³ Containing, approximately, 300 NTs, 500 Ts, and 800
RHS alternatives, out of which about 55% is dedicated to
cover temporal and numeric expressions, and digits and letters.
⁴ The hypothesized interpretation assumes that a single
parse tree will cover the entire problematic sentence.

(U1)  U: retrieve all messages from bob that were sent after christmas
(U2)  S: I don't understand right away what you mean but let me guess...
         "retrieve all messages from bob that were sent after christmas" is a way to express:
         1. send mail
         2. reply mail
         3. read mail
         0. None of the above
(U3)  U: none
(U4)  S: "retrieve all messages from bob that were sent after christmas" is a way to express:
         1. move mail
         2. list mail
         0. None of the above
(U5)  U: list
(U6)  S: By "from bob" do you mean:
         1. source folder
         2. sender
(U7)  U: sender
(U8)  S: "christmas" is a way to express:
         1. yesterday (date relative yesterday)
         2. tomorrow (date relative tomorrow)
         3. today (date relative today)
         0. None of the above
(U9)  U: by christmas i'm referring to december twenty fifth
(U10) S: Thanks for teaching me the meaning of "christmas"!
         Thanks for teaching me the meaning of "retrieve all messages from bob that were sent after christmas"!
         I understand "retrieve all messages from bob that were sent after december twenty fifth"
(U11) U: retrieve last email to mary before christmas
(U12) S: I understand "retrieve last email to mary before december twenty fifth"

Figure 2: Example dialogue between a user (U) and the system (S) on an e-mail client task.

The first step is to reparse (U1) in a mode in
which all NTs are considered top-level, i.e., able to
stand at the root of a parse tree. This produces
a set of still unparsed words, such as "retrieve," and
parse subtrees, such as "from bob" being parsed under
SENDER (via grammar rules (R6) to (R10)) and, ambiguously,
under [sourceFolder] as well (via grammar
rules (R11) and (R12)). All of it (i.e., the full
content of the chart) is taken as evidence by the
Prediction Models to postulate the overall meaning
of the original utterance. In this case (see (U2)
to (U5)⁵) the suggestions of the Prediction Models
are not particularly accurate (the correct choice is
presented only in fifth place), but, considering that
the head verb ("retrieve") is not even in the grammar,
such a response to (U1) is definitely better than giving
up. The effect of (U5) is to select [listMail]
as (U1)'s "anchor mother" (logical root of the overall
interpretation). But to complete the parse tree
a few details still need to be filled in. To that effect
(U6) is generated to disambiguate "from bob" and
(U8) to find the right mapping for "christmas." The
reasoning behind the rather puzzling choices offered
by (U8) comes from applying the Parser Predictions
strategy: given the context in which an unparsed sequence
(in this case, the single word "christmas") appears,
i.e., the subtree DATE_AFTER_PRE covering "after" (via
(R14)), the grammar is traversed to find likely continuations
of the context (left context only in this
case). Since DATE_AFTER_PRE can be immediately
followed by [dateAfter] (see (R13)), that makes
[dateAfter] a candidate to cover the unparsed sequence
"christmas." However, since, according to the
Ontology, [dateAfter] does not allow terminals as
immediate daughters, a search is performed to find
NTs under [dateAfter] that permit it. In this case
(via (R15) to (R19)) it suggests "yesterday," "tomorrow,"
etc.⁶ The user, though, realizing that the system
does not directly understand "christmas," volunteers
(U9)⁷, from which the mapping (M2) in Figure 4 is
learned.

⁵ The options presented in (U2) and (U4) are generated
at the same time; the only reason why they are split is to
prevent overwhelming the end user, who may be hearing the
choices spoken over the telephone. Also, note that in (U3)
the user could have also said "zero," or "none of the above" and
achieved the same result -- or, alternatively, they could have
volunteered information as in (U9).
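
The Parser Predictions reasoning described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary encoding of the grammar and the function names are invented for the example, the rule for [DATE_RELATIVE:tomorrow] is assumed by analogy with (R19), and optional tokens are not skipped when looking for followers, which a real implementation would need to handle.

```python
from collections import deque

# Fragment modeled on rules (R13) to (R19) of the e-mail grammar.
# The rule for [DATE_RELATIVE:tomorrow] is an assumption of this sketch.
GRAMMAR = {
    "[dateRange]": [["DATE_AFTER_PRE", "[dateAfter]"]],
    "DATE_AFTER_PRE": [["after"], ["from"], ["since"]],
    "[dateAfter]": [["[datePoint:DATE]"]],
    "[datePoint:DATE]": [["+DATE_POINT_ARGUMENT"]],
    "DATE_POINT_ARGUMENT": [["[datePoint:DATE_RELATIVE]"]],
    "[datePoint:DATE_RELATIVE]": [["[DATE_RELATIVE:yesterday]"],
                                  ["[DATE_RELATIVE:tomorrow]"]],
    "[DATE_RELATIVE:yesterday]": [["yesterday"]],
    "[DATE_RELATIVE:tomorrow]": [["tomorrow"]],
}


def symbol(token):
    """Strip the '*' (optional) and '+' (repeatable) marks from a token."""
    return token.lstrip("*+")


def followers(grammar, context_nt):
    """NTs that immediately follow context_nt in some RHS alternative."""
    found = []
    for alternatives in grammar.values():
        for rhs in alternatives:
            for left, right in zip(rhs, rhs[1:]):
                nxt = symbol(right)
                if symbol(left) == context_nt and nxt in grammar and nxt not in found:
                    found.append(nxt)
    return found


def allows_terminals(grammar, nt):
    """True if some RHS alternative of nt has a terminal as immediate daughter."""
    return any(symbol(tok) not in grammar for rhs in grammar[nt] for tok in rhs)


def terminal_bearing_descendants(grammar, nt):
    """Breadth-first search at or below nt for NTs that take terminal daughters."""
    queue, seen, hits = deque([nt]), {nt}, []
    while queue:
        current = queue.popleft()
        if allows_terminals(grammar, current):
            hits.append(current)
            continue
        for rhs in grammar[current]:
            for tok in rhs:
                child = symbol(tok)
                if child in grammar and child not in seen:
                    seen.add(child)
                    queue.append(child)
    return hits


def predict(grammar, context_nt):
    """Candidate NTs for an unparsed sequence that follows context_nt."""
    candidates = []
    for nt in followers(grammar, context_nt):
        candidates.extend(terminal_bearing_descendants(grammar, nt))
    return candidates


print(predict(GRAMMAR, "DATE_AFTER_PRE"))
```

On this fragment the only follower of DATE_AFTER_PRE is [dateAfter], and the search below it stops at the DATE_RELATIVE nonterminals, which is the kind of candidate list from which the choices in (U8) would be generated.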
(R1)  [listMail]                  ← *VERB_DESIRE LIST *TO_FOR_ME +MAIL_ARGUMENTS
(R2)  [moveMail]                  ← *VERB_DESIRE MOVE *+MAIL_ARGUMENTS *[sourceFolder] [destinationFolder]
(R3)  LIST                        ← list | get
(R4)  MOVE                        ← move
(R5)  MAIL_ARGUMENTS              ← SENDER | RECIPIENT | SUBJECT | DATE | MESSAGE_IDX | ...
(R6)  SENDER                      ← *SENDER_PRE [sender]
(R7)  SENDER_PRE                  ← from | by
(R8)  [sender]                    ← [name:STRING] | [emailAddress:STRING]
(R9)  [name:STRING]               ← PERSON_OR_INSTITUTION_NAME | MAILING_LIST_NAME
(R10) PERSON_OR_INSTITUTION_NAME  ← +WILDCARD
(R11) [sourceFolder]              ← from [folderName:STRING] *FOLDER
(R12) [folderName:STRING]         ← WILDCARD
(R13) [dateRange]                 ← (DATE_AFTER_PRE [dateAfter]) | (DATE_BEFORE_PRE [dateBefore]) ...
(R14) DATE_AFTER_PRE              ← after | from | since
(R15) [dateAfter]                 ← [datePoint:DATE]
(R16) [datePoint:DATE]            ← +DATE_POINT_ARGUMENT
(R17) DATE_POINT_ARGUMENT         ← [datePoint:DATE_RELATIVE] | [datePoint:DATE_FIXED] | ...
(R18) [datePoint:DATE_RELATIVE]   ← [DATE_RELATIVE:yesterday] | [DATE_RELATIVE:tomorrow] | ...
(R19) [DATE_RELATIVE:yesterday]   ← yesterday

Figure 3: Grammar fragment for an e-mail client task. '*' indicates optionality of adjacent token, '+'
repeatability, and '|' separates RHS alternatives. Terminals are italicized. WILDCARD is a special NT that
matches any out-of-vocabulary word or any in-vocabulary word present in a list for that purpose.

At this point one may wonder about the fate of the
unparsed word "retrieve," since no question was asked
about it. The answer is that GSG need not ask about
every single prediction if the confidence value is high
enough. In this case, as soon as [listMail] was
established (in (U5)) as the anchor mother, a Verbal
Head Search strategy was launched to see whether,
among the unparsed words, a verb was found that
could be placed in a mostly-verb NT⁸ directly under
[listMail]. The result was highly positive and led
to the acquisition of the RHS alternative (M1).

⁶ In fact it suggests [DATE_RELATIVE:yesterday], [DATE_RELATIVE:tomorrow],
etc., but it presents an example automatically generated from such NTs.
⁷ Obviously "the meaning of Christmas" (cf. cheerful
(U10)) may be much more profound than a shorthand for
December 25 -- but, alas, conveying that is well beyond the
simple grammar presented here.
⁸ A "verbness" ratio is automatically computed for each
candidate NT by running the POS tagger on automatically
generated sentences from the NTs in question.
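
To make the Verbal Head Search idea more concrete, here is a minimal Python sketch of a "verbness" ratio. The exact definition of the ratio is not given here, so the fraction-of-verb-tokens formula, the 0.5 threshold, the lookup-table tagger standing in for a real POS tagger, and all function names are assumptions made for illustration. Restricting the candidate NTs to those directly under the anchor mother is left to the caller.

```python
# Stand-in for a real POS tagger: a lookup table with Penn-style tags.
# The table contents are hypothetical and only cover this example.
TOY_TAGS = {
    "list": "VB", "get": "VB", "retrieve": "VB",
    "from": "IN", "by": "IN",
    "bob": "NNP", "christmas": "NNP",
}


def toy_tag(word):
    """Return a POS tag for a word, defaulting to common noun."""
    return TOY_TAGS.get(word, "NN")


def verbness(generated_phrases, tag=toy_tag):
    """Fraction of tokens tagged as verbs in phrases generated from an NT."""
    tokens = [word for phrase in generated_phrases for word in phrase.split()]
    if not tokens:
        return 0.0
    verbs = sum(1 for word in tokens if tag(word).startswith("VB"))
    return verbs / len(tokens)


def verbal_head_candidates(unparsed_words, nt_phrases, tag=toy_tag, threshold=0.5):
    """Pair each unparsed verb with each mostly-verb NT.

    nt_phrases maps a candidate NT to phrases generated from it.
    """
    mostly_verb = [nt for nt, phrases in nt_phrases.items()
                   if verbness(phrases, tag) >= threshold]
    verbs = [word for word in unparsed_words if tag(word).startswith("VB")]
    return [(nt, word) for nt in mostly_verb for word in verbs]


candidates = verbal_head_candidates(
    ["retrieve", "bob", "christmas"],
    {"LIST": ["list", "get"], "SENDER_PRE": ["from", "by"]},
)
print(candidates)
```

With these toy inputs, the only pairing proposed is "retrieve" under LIST, which corresponds to the learned alternative (M1).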
It is worth mentioning here that there are two 
kinds of mappings that GSG learns: RHS alterna- 
tives and subtree mappings. Learning new RHS al- 
ternatives is the preferred way because the knowl- 
edge can be incorporated into the Parsebank (and, in 
turn, into the Prediction Models). That is the effect 
of adding (M1) to the Grammar: Since the Parsebank 
and the Prediction Models are updated on-line, the 
presence of the word "retrieve" in subsequent utterances
becomes a strong indicator of LIST and, associatively,
of [listMail]. However, when the source
expression cannot be mapped into the desired target
structure via grammar rules, as in (M2), the only so- 
lution is to remember the equivalence. This kind of 
learning, although definitely useful since the mean- 
ing of the source expression will be henceforth re- 
membered, cannot be incorporated into the Predic- 
tion Models. 
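
The distinction between the two kinds of learned mappings can be sketched as a small data structure. This is a simplified illustration, not GSG's design: the class and method names are hypothetical, and a remembered equivalence is stored here as a word-level rewrite, whereas the system described above maps the source expression to a parse subtree.

```python
class LearnedMappings:
    """Holds new RHS alternatives (in the grammar) and remembered equivalences."""

    def __init__(self, grammar):
        # Copy the grammar so that learning does not mutate the caller's rules.
        self.grammar = {nt: [list(rhs) for rhs in alts] for nt, alts in grammar.items()}
        self.equivalences = {}  # source expression -> target expression

    def add_rhs_alternative(self, nt, rhs):
        """Add a rule; returns True if the grammar changed (models need updating)."""
        alternatives = self.grammar.setdefault(nt, [])
        if list(rhs) in alternatives:
            return False
        alternatives.append(list(rhs))
        return True

    def remember(self, source, target):
        """Store an equivalence that cannot be expressed as a grammar rule."""
        self.equivalences[source] = target

    def rewrite(self, utterance):
        """Replace remembered source expressions (whole words only) before parsing."""
        text = f" {utterance} "
        for source, target in self.equivalences.items():
            text = text.replace(f" {source} ", f" {target} ")
        return text.strip()


mappings = LearnedMappings({"LIST": [["list"], ["get"]]})
changed = mappings.add_rhs_alternative("LIST", ["retrieve"])         # like (M1)
mappings.remember("christmas", "december twenty fifth")              # like (M2)
print(mappings.rewrite("retrieve last email to mary before christmas"))
```

Only the first kind of mapping changes the grammar, which is why only it can feed back into the Parsebank and the Prediction Models.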
Right after (U9), (U1) is considered fully understood
and the interpretation is automatically
mapped into the feature structure (FS1)⁹ in Figure 5,
which is then shipped to the Back-end Application
Manager.
Finally, when (U11) comes in, a correct analysis is
produced thanks to the mappings just learned from
(U1),¹⁰ and (FS2) in Figure 5 is generated.
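
The mapping from a parse tree to a feature structure (removal of auxiliary NTs plus value extraction) might look roughly like the sketch below. Several points are assumptions: the tuple encoding of parse trees, the use of square brackets as the naming convention that marks principal NTs, and the restriction of value extraction to STRING-typed nodes (date and number extraction are omitted).

```python
def is_principal(label):
    """Assumed naming convention: principal NTs are written in square brackets."""
    return label.startswith("[") and label.endswith("]")


def leaves(node):
    """Terminal words covered by a parse-tree node, left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    return [word for child in children for word in leaves(child)]


def collapse(node):
    """Return the (feature, value) pairs that a subtree contributes."""
    if isinstance(node, str):
        return []  # bare terminals contribute nothing by themselves
    label, children = node
    inner = [pair for child in children for pair in collapse(child)]
    if not is_principal(label):
        return inner  # auxiliary NT: removed, its content is spliced upward
    name, _, value_type = label[1:-1].partition(":")
    if value_type == "STRING":
        return [(name, " ".join(leaves(node)))]  # value extraction
    return [(name, dict(inner))]


# A parse tree is (label, children); children are subtrees or terminal strings.
tree = ("[listMail]", [
    ("LIST", ["retrieve"]),
    ("MAIL_ARGUMENTS", [
        ("SENDER", [
            ("SENDER_PRE", ["from"]),
            ("[sender]", [
                ("[name:STRING]", [
                    ("PERSON_OR_INSTITUTION_NAME", [("WILDCARD", ["bob"])]),
                ]),
            ]),
        ]),
    ]),
])

print(collapse(tree))
```

For this partial tree the result keeps only listMail, sender, and name, which matches the corresponding portion of (FS1).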
⁸ (continued) The POS tagger used is a modified version of Brill's tagger (Brill, 1994).
⁹ The mapping is simply a removal of auxiliary NTs from
the parse tree, plus value extraction of dates, numbers and
strings from certain subtrees, e.g., the subtree in (M2) becomes
the substructure under datePoint in (FS1).
¹⁰ Note that rule [listMail] ← LIST +MAIL_ARGUMENTS
(extracted from the final interpretation of (U1)) would have
been learned too, but its subsumption by existing rule (R1)
was automatically detected.

(M1)  LIST ← retrieve

(M2)  "christmas," remembered as equivalent to the subtree:
      DATE_POINT_ARGUMENT
        MONTH
          MONTH_VAL
            [month:12]
              december
        DAY_OF_MONTH
          [dayOfMonth:INTEGER]
            ORDINAL-NUMBER-0-99
              CARDINAL-NUMBER-TENS
                [INTEGER-CARDINAL:20]
                  twenty
              ORDINAL-NUMBER-UNITS
                [INTEGER-ORDINAL:5]
                  fifth

Figure 4: Mappings learned from the dialogue in Figure 2.

(FS1)
listMail
  messageIdx: all
  sender
    name: bob
  dateRange
    dateAfter
      datePoint
        month: 12
        dayOfMonth: 25

(FS2)
listMail
  messageIdx: last
  recipient
    name: mary
  dateRange
    dateBefore
      datePoint
        month: 12
        dayOfMonth: 25

Figure 5: Feature structures sent to the Back-end Application Manager after (U10) and (U12) in Figure 2.

4 Discussion
The example above illustrates the philosophy of
GSG,¹¹ namely, to exploit task and linguistic knowledge
to pose clarification questions in the face of
incomplete analyses,¹² build correct interpretations,
and acquire new semantic mappings. Thus, a contribution
of GSG is the demonstration that from a simple
context-free grammar, with a very lightweight
formalism, one can extract enough information (Ontology,
Parsebank, Parser Predictions strategy) to
conduct meaningful clarification dialogues. Note,
moreover, that such dialogues occur entirely within
GSG, with the Back-end Application Manager receiving
only finalized feature structures.¹³

Another advantage is the ease with which natural-language
interfaces can be constructed for new domains:
Since all the task and linguistic knowledge
is extracted from the grammar,¹⁴ one need only develop
a Kernel Grammar that models the domain at
hand via its NTs¹⁵ but need not provide a high coverage
of the utterances possible in the domain (data
which may not be available anyway). Also, reuse of
existing grammar modules for, e.g., dates and numbers,
is straightforward.

¹¹ Based on the pioneering work of (Lehman, 1989).
¹² Detected by a lack of interpretation, excessively fragmented
interpretation, or by being told by the end user that
the automatically generated paraphrase of their input is not
what they meant.
¹³ Of course, prediction accuracy can improve if the Back-end
Application Manager can be incorporated as a knowledge
source to, for example, contribute in the ranking of hypotheses,
but the point is that it is not necessary and that, as long
as the capabilities of the back-end application are adequately
modeled by the Grammar, the construction of the correct interpretation
can be performed within GSG alone.
¹⁴ Except for the POS Tagger and the Syntactic Grammar.
However, a fear of letting the end user (indirectly)
modify a grammar is that the grammar may grow
untamed and become filled with new rules that disrupt
the Kernel Grammar. To prevent that, besides
the careful construction of interpretations via the
strategies described above, GSG employs two safety
mechanisms: before a rule is added to the grammar,
it is checked whether it introduces ambiguity
to the grammar,¹⁶ and whether it disrupts existing
(correct) interpretations.¹⁷ In this way, some of the
new rules may have to be discarded, but at least the
health of the grammar is preserved.¹⁸

¹⁵ Knowledge of, e.g., how the Ontology is computed helps,
but it coincides with the most natural way of writing well-structured,
context-free semantic grammars.
¹⁶ Accomplished by using the SOUP parser in yet another
mode: parsing of RHSs (expanded to RHS paths) instead of
terminals. In this case, existence of a parse tree covering an
entire RHS path indicates ambiguity. Note that if all RHS
paths of the new rule can be parsed under the current RHS of
the new rule's left-hand side, then the new rule is subsumed
by the existing RHS and can therefore be discarded (cf. note 10).
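
The second safety mechanism (checking that a new rule does not disrupt existing interpretations) amounts to a regression test over the Parsebank. The sketch below illustrates that idea only; the `parse` callable is a placeholder for a real parser, and `toy_parse`, which merely labels each word with the NTs that list it as a single-terminal alternative, is a hypothetical stand-in invented for the example.

```python
import copy


def toy_parse(grammar, utterance):
    """Hypothetical stand-in for a parser: per word, the NTs that list it."""
    return [sorted(nt for nt, alts in grammar.items() if [word] in alts)
            for word in utterance.split()]


def safe_to_add(grammar, nt, rhs, parsebank, parse=toy_parse):
    """Accept the rule nt <- rhs only if every stored interpretation is unchanged.

    parsebank is a list of (utterance, expected_interpretation) pairs.
    """
    trial = copy.deepcopy(grammar)  # never touch the live grammar while testing
    trial.setdefault(nt, []).append(list(rhs))
    return all(parse(trial, utterance) == expected
               for utterance, expected in parsebank)


KERNEL = {"LIST": [["list"], ["get"]], "MOVE": [["move"]]}
PARSEBANK = [(utt, toy_parse(KERNEL, utt)) for utt in ["list", "get", "move"]]

print(safe_to_add(KERNEL, "LIST", ["retrieve"], PARSEBANK))  # new word, nothing disturbed
print(safe_to_add(KERNEL, "MOVE", ["get"], PARSEBANK))       # "get" would become ambiguous
```

In this toy setting the rejected rule is also an ambiguity case, since "get" would be covered by both LIST and MOVE; the RHS-path parsing used for the real ambiguity check is not reproduced here.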
Another concern may be that the new mappings 
end up generating feature structures that are not un- 
derstood by the Back-end Application Manager. To 
avoid that, GSG only allows a principal NT to be 
dominated by another principal NT if such domi- 
nance relation is licensed by the Kernel Grammar. 
This guarantees that all resulting feature structures
are structurally correct (although they may contain
unexpected atomic values).
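
One way to picture the licensing check is to collect, from the Kernel Grammar, every pair of principal NTs in which one dominates the other through auxiliary NTs only, and to reject any learned rule that would create a pair outside that set. The sketch below follows that reading, which is an interpretation rather than the documented algorithm; the bracket convention for principal NTs, the dictionary encoding, and the function names are assumptions, and WILDCARD is simply treated as a leaf.

```python
def symbol(token):
    """Strip the '*' (optional) and '+' (repeatable) marks from a token."""
    return token.lstrip("*+")


def is_principal(nt):
    """Assumed naming convention: principal NTs are written in square brackets."""
    return nt.startswith("[")


def principal_daughters(grammar, nt, seen=None):
    """Principal NTs reachable from nt by passing through auxiliary NTs only."""
    seen = set() if seen is None else seen
    result = set()
    for rhs in grammar.get(nt, []):
        for token in rhs:
            child = symbol(token)
            if child not in grammar or child in seen:
                continue
            seen.add(child)
            if is_principal(child):
                result.add(child)
            else:
                result |= principal_daughters(grammar, child, seen)
    return result


def licensed_pairs(grammar):
    """All (principal mother, principal daughter) dominance pairs in a grammar."""
    return {(mother, daughter)
            for mother in grammar if is_principal(mother)
            for daughter in principal_daughters(grammar, mother)}


def is_licensed(kernel, nt, rhs):
    """True if adding nt <- rhs creates no principal dominance absent from the kernel."""
    trial = {k: [list(r) for r in alts] for k, alts in kernel.items()}
    trial.setdefault(nt, []).append(list(rhs))
    return licensed_pairs(trial) <= licensed_pairs(kernel)


KERNEL = {
    "[listMail]": [["LIST", "+MAIL_ARGUMENTS"]],
    "LIST": [["list"], ["get"]],
    "MAIL_ARGUMENTS": [["SENDER"]],
    "SENDER": [["*SENDER_PRE", "[sender]"]],
    "SENDER_PRE": [["from"], ["by"]],
    "[sender]": [["[name:STRING]"]],
    "[name:STRING]": [["WILDCARD"]],
    "[sourceFolder]": [["from", "[folderName:STRING]"]],
    "[folderName:STRING]": [["WILDCARD"]],
}

print(is_licensed(KERNEL, "LIST", ["retrieve"]))            # no new principal dominance
print(is_licensed(KERNEL, "[sender]", ["[sourceFolder]"]))  # [sender] over [sourceFolder] is new
```

A rule that only adds terminals, such as (M1), can never fail this check, which is consistent with the remark that feature structures stay structurally correct while atomic values may be unexpected.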
A current limitation of GSG lies in the difficulty of
segmenting long sequences of unparsed words: GSG
uses POS tagging followed by noun-phrase bracketing
(via parsing with a shallow Syntactic Grammar),
which represents an improvement over the Single
Segment Assumption (cf. (Lehman, 1989)), but is
still far from perfect and can disrupt the ensuing
clarification dialogue. Also, the number of questions
that the system can pose as it builds an interpretation
may, on occasion, exceed the patience of the
end user (but the command "cancel" is always understood).

¹⁷ Achieved by reparsing (a subset of) the Parsebank. Note
that SOUP can typically parse in the order of 100 utterances
per second (cf. (Gavaldà, 2000)).
¹⁸ Assuming minimally cooperative and consistent users.
The hardest problem we have encountered so far is
typical of natural-language interfaces but is exacerbated
in GSG (as it treats every unparsable sentence
as an opportunity to learn), and that is the difficulty
of identifying in-domain end-user sentences that go
beyond the capabilities of the end application, or, in
other words, are not expressible in the grammar.
Finally, as GSG becomes fully integrated with a
speech recognizer, it remains to be seen how an optimal
point in the tradeoff between the wide coverage
but relatively low word recognition accuracy obtained
with a loose dictation grammar, and the narrow
coverage but high word accuracy achieved with
a tight, task-dependent grammar, can be found, and
how the degradation of the input is going to affect
GSG's behavior.
Overall, however, we believe that GSG, by virtue
of its built-in robustness, minimal initial knowledge
requirements, and learning abilities, begins to embody
the kind of qualities that are necessary for conversational
systems, if they are to provide, without
exorbitant development effort, an interaction that
feels truly natural to humans.

References 
Allen, James, et al. (1996). Robust Understanding
in a Dialogue System. In Proceedings of ACL-1996.
Brill, Eric. (1994). Some Advances in Part of Speech
Tagging. In Proceedings of AAAI-1994.
Denecke, Matthias. (1997). An Information-based
Approach for Guiding Multi-modal Human-Computer
Interaction. In Proceedings of IJCAI-1997.
Gavaldà, Marsal. (2000). SOUP: A Parser for Real-world
Spontaneous Speech. In Proceedings of the
Sixth International Workshop on Parsing Technologies
(IWPT-2000).
Glass, James. (1999). Challenges for Spoken Dialogue
Systems. In Proceedings of the 1999 IEEE
ASRU Workshop.
Lehman, Jill. (1989). Adaptive Parsing: Self-extending
Natural Language Interfaces. Ph.D. dissertation,
School of Computer Science, Carnegie
Mellon University.
Rudnicky, Alex, et al. (1999). Creating Natural
Dialogs in the Carnegie Mellon COMMUNICATOR
System. In Proceedings of Eurospeech-1999.
Walker, Marilyn, et al. (1998). Learning Optimal
Dialogue Strategies: A Case Study of a Spoken
Dialogue Agent for Email. In Proceedings of
COLING/ACL-1998.
Zue, Victor, et al. (2000). JUPITER: A Telephone-Based
Conversational Interface for Weather Information.
In IEEE Transactions on Speech and Audio
Processing, Vol. 8, No. 1.
