Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 937–944,
Sydney, July 2006. ©2006 Association for Computational Linguistics
Stochastic Discourse Modeling in Spoken Dialogue Systems  
Using Semantic Dependency Graphs 
 
 
 Jui-Feng Yeh, Chung-Hsien Wu and Mao-Zhu Yang  
Department of Computer Science and Information Engineering 
National Cheng Kung University 
No. 1, Ta-Hsueh Road, Tainan, Taiwan, R.O.C. 
{jfyeh, chwu, mzyang}@csie.ncku.edu.tw 
 
 
 
 
Abstract 
This investigation proposes an approach 
to modeling the discourse of spoken dia-
logue using semantic dependency graphs. 
By characterizing the discourse as a se-
quence of speech acts, discourse modeling 
becomes the identification of the speech 
act sequence. A statistical approach is 
adopted to model the relations between 
words in the user’s utterance using the 
semantic dependency graphs. The dependency
relations between the headword and the other
words in a sentence are detected using the
semantic dependency grammar. To evaluate
the proposed method, a dialogue system for
medical service was developed.
Experimental results show that the
rates for speech act detection and task-
completion are 95.6% and 85.24%, re-
spectively, and the average number of 
turns of each dialogue is 8.3. Compared 
with the Bayes’ classifier and the Partial-
Pattern Tree based approaches, we obtain 
14.9% and 12.47% improvements in ac-
curacy for speech act identification, re-
spectively.  
1 Introduction 
Communicating with machines through spoken language has
long been a goal of computer technology (Huang et al., 2001;
Allen et al., 2001). Understanding spontaneous language is
arguably the core technology of spoken dialogue systems, since
the more accurate the information obtained by the machine
(Higashinaka et al., 2004), the more likely the dialogue task is
to be completed. Practical use of speech act theories in spoken
language processing (Stolcke et al., 2000; Walker and Passonneau,
2001; Wu et al., 2004) has given both insight into and a deeper
understanding of verbal communication. Therefore, when
considering the whole discourse, the relationship between the
speech acts of the dialogue turns becomes extremely important.
In the last decade, several practicable dialogue systems
(McTear, 2002), such as
air travel information service system, weather 
forecast system, automatic banking system, auto-
matic train timetable information system, and the 
Circuit-Fix-it shop system, have been developed to 
extract the user’s semantic entities using the se-
mantic frames/slots and conceptual graphs. The 
dialogue management in these systems is able to 
handle the dialogue flow efficaciously. However, it 
is not applicable to the more complex applications 
such as “Type 5: the natural language conversa-
tional applications” defined by IBM (Rajesh and 
Linda, 2004). In Type 5 dialog systems, users can switch
directly from one ongoing task to another. In traditional
approaches, which identify speech acts without discourse
analysis, the lack of precise speech act identification causes
task switching to fail. The capability of identifying the speech
act and extracting the semantic objects by reasoning therefore
plays an increasingly important role in dialog systems. This
research proposes a semantic dependency-based discourse model to
capture and share the semantic objects among tasks that switch
during a dialog for semantic resolution. Besides acoustic speech
recognition, natural language understanding is one of the most
important research issues, because both understanding and the
restriction of an application to a small scope depend on the data
structures used to capture and store the meaningful items. Wang
and Acero (2003) applied the object-oriented concept to provide a
new semantic representation, including semantic classes, and a
learning algorithm for the combination of context-free grammar
and N-gram.
Among these approaches, two issues are essential to dialogue
management in natural language processing. The first is how to
obtain the semantic objects from the user’s utterances. The
second is the need for a more effective speech act identification
approach for semantic understanding. Since the speech act plays
an important role in dialogue management for complex
applications, speech act identification with semantic
interpretation is the most important topic with respect to the
methods used to control the dialogue with the users. This paper
proposes an approach that integrates the semantic dependency
graph with history/discourse information to model the dialogue
discourse (Kudo and Matsumoto, 2000; Hacioglu et al., 2003; Gao
and Suzuki, 2003). Three major components, namely semantic
relations, semantic classes and semantic roles, are adopted in
the semantic dependency graph (Gildea and Jurafsky, 2002;
Hacioglu and Ward, 2003). The semantic relations constrain the
word sense and provide a means of disambiguation. Semantic roles
are assigned when a relation is established among semantic
objects. Both semantic relations and roles are defined in many
knowledge resources or ontologies, such as FrameNet (Baker et
al., 1998) and HowNet. HowNet, with 65,000 concepts in Chinese
and close to 75,000 English equivalents, is a bilingual knowledge
base describing relations between concepts and relations between
the attributes of concepts from an ontological view (Dong and
Dong, 2006). Generally speaking, a semantic class is defined as a
set whose elements are usually words with the same semantic
interpretation. Hypernyms, the superordinate concepts of the
words, are usually used as the semantic classes, as with the
hypernyms of synsets in WordNet
(http://www.cogsci.princeton.edu/~wn/) or the definitions of
words’ primary features in HowNet. In addition, the approach for
understanding tries to find the implicit semantic dependency
between the concepts, and the dependency structure between
concepts in the utterance is also taken into consideration.
Compared with the semantic frame/slot, the semantic dependency
graph can keep more information for dialogue understanding.
2 Semantic Dependency Graph 
Since speech act theory was developed to extract the functional
meaning of an utterance in a dialogue (Searle, 1979), the
discourse or history can be defined as a sequence of speech acts,
$H^t = \{SA^1, SA^2, \ldots, SA^{t-1}, SA^t\}$, and accordingly
speech act theory can be adopted for discourse modeling. Based on
this definition, discourse analysis in semantics using the
dependency graphs tries to identify the speech act sequence of
the discourse. Therefore, discourse modeling by means of speech
act identification considering the history is shown in Equation
(1), which introduces the hidden variable $D_i$, representing the
$i$-th possible dependency graph derived from the word sequence
$W$. The dependency relation $r_k$ between word $w_k$ and
headword $w_{kh}$ is extracted using HowNet and denoted as
$DR(w_k, w_{kh}) \equiv r_k$. The dependency graph, which is
composed of the set of dependency relations in the word sequence
$W$, is defined as

$$D_i(W) = \{DR_1^i(w_1, w_{1h}),\ DR_2^i(w_2, w_{2h}),\ \ldots,\ DR_{m-1}^i(w_{m-1}, w_{(m-1)h})\},$$

where $DR_k^i$ denotes the $k$-th relation of the $i$-th graph.
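As a concrete reading of this definition, the sketch below stores one dependency graph as a tuple of (word, headword, relation) records. The class and function names, the English toy sentence, and the relation labels are illustrative assumptions and are not taken from HowNet or from the system described here. It also assumes that the root headword contributes no relation, which is one way to obtain m-1 relations for m words.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DependencyRelation:
    """One relation DR(w_k, w_kh) = r_k between a word and its headword."""
    word: str      # w_k
    headword: str  # w_kh
    relation: str  # r_k (illustrative label, not a HowNet entry)


# A dependency graph D_i(W) is a collection of such relations.
DependencyGraph = Tuple[DependencyRelation, ...]


def build_graph(words: List[str],
                heads: List[Optional[str]],
                relations: List[Optional[str]]) -> DependencyGraph:
    """Build D_i(W) from parallel lists; the root word has head None and is skipped."""
    return tuple(
        DependencyRelation(w, h, r)
        for w, h, r in zip(words, heads, relations)
        if h is not None
    )


# Hypothetical toy utterance with "want" as the root headword.
words = ["I", "want", "see", "doctor"]
heads = ["want", None, "want", "see"]
relations = ["agent", None, "content", "patient"]
graph = build_graph(words, heads, relations)
```

A different candidate graph for the same words would simply be another tuple built from different `heads` and `relations` lists.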
The probability of hypothesis $SA^t$ given the word sequence $W$
and the history $H^{t-1}$ is described in Equation (1). According
to Bayes’ rule, the speech act identification model can be
decomposed into two components, $P(SA^t \mid D_i, W, H^{t-1})$
and $P(D_i \mid W, H^{t-1})$, described in the following.

$$
\begin{aligned}
SA^{*} &= \arg\max_{SA^t} P(SA^t \mid W, H^{t-1}) \\
&= \arg\max_{SA^t} \sum_{D_i} P(SA^t, D_i \mid W, H^{t-1}) \\
&= \arg\max_{SA^t} \sum_{D_i} P(SA^t \mid D_i, W, H^{t-1}) \times P(D_i \mid W, H^{t-1})
\end{aligned}
\tag{1}
$$

where $SA^{*}$ and $SA^t$ are the most probable speech act and the
potential speech act at the $t$-th dialogue turn, respectively.
$W = \{w_1, w_2, w_3, \ldots, w_m\}$ denotes the word sequence
extracted from the user’s utterance without considering the stop
words. $H^{t-1}$ is the history representing the previous $t-1$
turns.
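The decision rule in Equation (1) can be sketched in a few lines. The function name, the dictionary layout and the toy probabilities below are assumptions made for illustration; in the approach described here the two probability tables would come from the models of Sections 2.1 and 2.2.

```python
from typing import Dict, Iterable


def identify_speech_act(
    speech_acts: Iterable[str],
    graph_posterior: Dict[str, float],
    act_given_graph: Dict[str, Dict[str, float]],
) -> str:
    """Return argmax over SA^t of sum_i P(SA^t | D_i, W, H) * P(D_i | W, H)."""

    def score(act: str) -> float:
        # Marginalize over the candidate dependency graphs D_i.
        return sum(
            act_given_graph[graph].get(act, 0.0) * p_graph
            for graph, p_graph in graph_posterior.items()
        )

    return max(speech_acts, key=score)


# Hypothetical values: two candidate graphs and two speech acts.
graph_posterior = {"D1": 0.7, "D2": 0.3}  # P(D_i | W, H^{t-1})
act_given_graph = {                        # P(SA^t | D_i, W, H^{t-1})
    "D1": {"Registration": 0.6, "Time": 0.4},
    "D2": {"Registration": 0.1, "Time": 0.9},
}
best = identify_speech_act(["Registration", "Time"], graph_posterior, act_given_graph)
```

In this toy case the summed score is 0.45 for "Registration" and 0.55 for "Time", so the less likely graph D2 changes the decision that D1 alone would give.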
2.1 Speech act identification using semantic 
dependency with discourse analysis 
In this analysis, we apply the semantic dependency, word
sequence, and discourse analysis to the identification of the
speech act. Since $D_i$ is the $i$-th possible dependency graph
derived from the word sequence $W$, speech act identification
with semantic dependency can be simplified as Equation (2).

$$P(SA^t \mid D_i, W, H^{t-1}) \cong P(SA^t \mid D_i, H^{t-1}) \tag{2}$$
According to Bayes’ rule, the probability
$P(SA^t \mid D_i, H^{t-1})$ can be rewritten as:

$$P(SA^t \mid D_i, H^{t-1}) = \frac{P(D_i, H^{t-1} \mid SA^t)\, P(SA^t)}{\sum_{SA_l} P(D_i, H^{t-1} \mid SA_l)\, P(SA_l)} \tag{3}$$
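Equation (3) is a normalization of likelihood times prior over all candidate speech acts. A minimal sketch follows; the function name and the toy numbers are assumptions made for illustration only.

```python
from typing import Dict


def speech_act_posterior(likelihood: Dict[str, float],
                         prior: Dict[str, float]) -> Dict[str, float]:
    """Equation (3): P(SA | D_i, H) from P(D_i, H | SA) and P(SA)."""
    joint = {act: likelihood[act] * prior[act] for act in prior}
    evidence = sum(joint.values())  # denominator: sum over all SA_l
    return {act: value / evidence for act, value in joint.items()}


# Hypothetical values for two speech acts.
likelihood = {"Registration": 0.02, "Time": 0.01}  # P(D_i, H^{t-1} | SA)
prior = {"Registration": 0.25, "Time": 0.75}       # P(SA)
posterior = speech_act_posterior(likelihood, prior)
```

Here the posterior is 0.4 for "Registration" and 0.6 for "Time": the larger prior outweighs the smaller likelihood.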
Since the history is defined as the speech act sequence, the
joint probability of $D_i$ and $H^{t-1}$ given the speech act
$SA^t$ can be expressed as Equation (4). Because of data
sparseness in the training corpus, the probability
$P(D_i, SA^1, SA^2, \ldots, SA^{t-1} \mid SA^t)$ is hard to
obtain, and the speech act bigram model is adopted for
approximation.

$$
\begin{aligned}
P(D_i, H^{t-1} \mid SA^t) &= P(D_i, SA^1, SA^2, \ldots, SA^{t-1} \mid SA^t) \\
&\cong P(D_i, SA^{t-1} \mid SA^t)
\end{aligned}
\tag{4}
$$
To combine the semantic and syntactic structures, the relations
defined in HowNet are employed as the dependency relations, and
the hypernym is adopted as the semantic concept according to the
primary features of the words defined in HowNet. The headwords
are decided by an algorithm based on the part of speech (POS)
proposed by Academia Sinica in Taiwan. The probabilities of the
headwords are estimated according to the probabilistic
context-free grammar (PCFG) trained on the Treebank developed by
Sinica (Chen et al., 2001). That is to say, the headwords are
extracted according to the syntactic structure, and the
dependency graphs are constructed from the semantic relations
defined in HowNet. Under the previous definition, with the
independence assumption and bigram smoothing of the speech act
model using the back-off procedure, Equation (4) can be rewritten
as Equation (5).

$$
\begin{aligned}
P(D_i, SA^{t-1} \mid SA^t) = {} & \alpha \prod_{k=1}^{m-1} P(DR_k^i(w_k, w_{kh}), SA^{t-1} \mid SA^t) \\
& + (1-\alpha) \prod_{k=1}^{m-1} P(DR_k^i(w_k, w_{kh}) \mid SA^t)
\end{aligned}
\tag{5}
$$
where $\alpha$ is the mixture factor for normalization.
According to the conceptual representation of the word, the
transformation function $f(\cdot)$ transforms a word into its
hypernym, defined as its semantic class using HowNet. The
dependency relation between the semantic classes of two words is
mapped to the conceptual space, and the semantic roles among the
dependency relations are also obtained. On condition that
$SA^t$, $SA^{t-1}$ and the relations are independent, the
equation becomes

$$
\begin{aligned}
P(DR_k^i(w_k, w_{kh}), SA^{t-1} \mid SA^t) &\cong P(DR_k^i(f(w_k), f(w_{kh})), SA^{t-1} \mid SA^t) \\
&= P(DR_k^i(f(w_k), f(w_{kh})) \mid SA^t)\, P(SA^{t-1} \mid SA^t)
\end{aligned}
\tag{6}
$$
The conditional probabilities
$P(DR_k^i(f(w_k), f(w_{kh})) \mid SA^t)$ and
$P(SA^{t-1} \mid SA^t)$ are estimated according to Equations (7)
and (8), respectively.

$$P(DR_k^i(f(w_k), f(w_{kh})) \mid SA^t) = \frac{C(f(w_k), f(w_{kh}), r_k, SA^t)}{C(SA^t)} \tag{7}$$

$$P(SA^{t-1} \mid SA^t) = \frac{C(SA^{t-1}, SA^t)}{C(SA^t)} \tag{8}$$

where $C(\cdot)$ represents the number of events in the training
corpus. With the definitions in Equations (7) and (8), Equation
(6) becomes practicable.
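The count-based estimates of Equations (7) and (8) and the interpolation of Equation (5) can be sketched as follows. The toy corpus, the semantic class names, the relation labels and the value of alpha are all hypothetical; no handling of unseen speech acts is included, so the sketch only works for acts that occur in the toy counts.

```python
from collections import Counter
from math import prod

# Toy training events (hypothetical): (previous act, current act, relations),
# where a relation is (class of word, class of headword, relation label).
corpus = [
    ("Greeting", "Registration", [("human", "want", "agent"), ("doctor", "see", "patient")]),
    ("Greeting", "Registration", [("human", "want", "agent")]),
    ("Registration", "Time", [("time", "ask", "content")]),
    ("Greeting", "Time", [("time", "ask", "content"), ("human", "want", "agent")]),
]

act_count = Counter()       # C(SA^t)
bigram_count = Counter()    # C(SA^{t-1}, SA^t)
relation_count = Counter()  # C(f(w_k), f(w_kh), r_k, SA^t)
for prev_act, act, rels in corpus:
    act_count[act] += 1
    bigram_count[(prev_act, act)] += 1
    for rel in rels:
        relation_count[(rel, act)] += 1


def p_relation(rel, act):
    """Equation (7): relative frequency of a class-level relation given SA^t."""
    return relation_count[(rel, act)] / act_count[act]


def p_prev_act(prev_act, act):
    """Equation (8): C(SA^{t-1}, SA^t) / C(SA^t)."""
    return bigram_count[(prev_act, act)] / act_count[act]


def p_graph_and_prev(rels, prev_act, act, alpha=0.5):
    """Equation (5), with each joint factor expanded as in Equation (6)."""
    with_history = prod(p_relation(r, act) * p_prev_act(prev_act, act) for r in rels)
    without_history = prod(p_relation(r, act) for r in rels)
    return alpha * with_history + (1 - alpha) * without_history


value = p_graph_and_prev([("time", "ask", "content")], "Greeting", "Time")
```

The sketch follows Equations (5) and (6) literally, so the factor P(SA^{t-1} | SA^t) is multiplied in once per relation. The relations are stored at the class level, that is, after the mapping f has already been applied.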
2.2 Semantic dependency analysis using 
word sequence and discourse 
Although the discourse can be expressed as the speech act
sequence $H^t = \{SA^1, SA^2, \ldots, SA^{t-1}, SA^t\}$, the
dependency graph $D_i$ is determined mainly by $W$, not by
$H^{t-1}$. The probability that defines semantic dependency
analysis using the word sequence and discourse can therefore be
rewritten as follows:

$$
\begin{aligned}
P(D_i \mid W, H^{t-1}) &= P(D_i \mid W, SA^{t-1}, SA^{t-2}, \ldots, SA^{1}) \\
&\cong P(D_i \mid W)
\end{aligned}
\tag{9}
$$
and

$$P(D_i \mid W) = \frac{P(D_i, W)}{P(W)} \tag{10}$$
Since several dependency graphs can be generated from the word
sequence $W$, by introducing the hidden factor $D_i$, the
probability $P(W)$ can be written as the sum of the probabilities
$P(D_i, W)$, as in Equation (11).

$$P(W) = \sum_{D_i:\ \mathrm{yield}(D_i) = W} P(D_i, W) \tag{11}$$
Because $D_i$ is generated from $W$, $D_i$ is sufficient to
represent $W$ in semantics. We can therefore estimate the joint
probability $P(D_i, W)$ from the dependency relations of $D_i$
alone. Further, the dependency relations are assumed to be
independent of each other, so the joint probability simplifies to

$$P(D_i, W) = \prod_{k=1}^{m-1} P(DR_k^i(w_k, w_{kh})) \tag{12}$$
The probability of the dependency relation between words is
defined as that between the concepts, which are the hypernyms of
the words, and then the dependency rules are introduced. The
probability $P(r_k \mid f(w_k), f(w_{kh}))$ is estimated from
Equation (13).

$$
\begin{aligned}
P(DR_k^i(w_k, w_{kh})) &\equiv P(DR_k^i(f(w_k), f(w_{kh}))) \\
&= P(r_k \mid f(w_k), f(w_{kh})) \\
&= \frac{C(r_k, f(w_k), f(w_{kh}))}{C(f(w_k), f(w_{kh}))}
\end{aligned}
\tag{13}
$$
According to Equations (11), (12) and (13), Equation (10) is
rewritten as the following equation.

$$
\begin{aligned}
P(D_i \mid W) &= \frac{\prod_{k=1}^{m-1} P(DR_k^i(w_k, w_{kh}))}{\sum_{D_i:\ \mathrm{yield}(D_i) = W} \prod_{k=1}^{m-1} P(DR_k^i(w_k, w_{kh}))} \\
&= \frac{\prod_{k=1}^{m-1} \dfrac{C(r_k, f(w_k), f(w_{kh}))}{C(f(w_k), f(w_{kh}))}}{\sum_{D_i:\ \mathrm{yield}(D_i) = W} \prod_{k=1}^{m-1} \dfrac{C(r_k, f(w_k), f(w_{kh}))}{C(f(w_k), f(w_{kh}))}}
\end{aligned}
\tag{14}
$$

where the function $f(\cdot)$ denotes the transformation from the
words to the corresponding semantic classes.
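Equations (13) and (14) amount to scoring each candidate graph by a product of relative frequencies and then normalizing over all candidates for the same word sequence. The sketch below illustrates this with hypothetical class-level counts; the class names, relation labels, counts and function names are assumptions, not values from HowNet or the training corpus.

```python
from collections import Counter
from math import prod

# Hypothetical counts C(r_k, f(w_k), f(w_kh)).
triple_count = Counter({
    ("agent", "human", "want"): 8,
    ("patient", "human", "want"): 2,
    ("patient", "doctor", "see"): 6,
    ("agent", "doctor", "see"): 4,
})

# C(f(w_k), f(w_kh)) obtained by summing over the relation labels.
pair_count = Counter()
for (_, word_class, head_class), n in triple_count.items():
    pair_count[(word_class, head_class)] += n


def p_relation(rel, word_class, head_class):
    """Equation (13): C(r, f(w), f(w_h)) / C(f(w), f(w_h))."""
    return triple_count[(rel, word_class, head_class)] / pair_count[(word_class, head_class)]


def graph_posteriors(candidates):
    """Equation (14): product of relation probabilities, normalized over all candidates."""
    joint = {name: prod(p_relation(*rel) for rel in rels)
             for name, rels in candidates.items()}
    total = sum(joint.values())  # Equation (11): P(W)
    return {name: p / total for name, p in joint.items()}


# Two hypothetical graphs that yield the same word sequence.
candidates = {
    "D1": [("agent", "human", "want"), ("patient", "doctor", "see")],
    "D2": [("patient", "human", "want"), ("agent", "doctor", "see")],
}
posteriors = graph_posteriors(candidates)
```

With these counts the unnormalized scores are 0.48 for D1 and 0.08 for D2, so D1 receives a posterior of 6/7.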
 
 
Figure 1. Speech acts corresponding to multiple services in the medical domain 
3 Experiments  
To evaluate the proposed method, a spoken dialogue system for
the medical domain with multiple services was investigated. Three
main services are provided: a registration information service, a
clinic information service, and an FAQ information service. The
system mainly provides on-line registration. To this end, health
education documents are provided as the FAQ files, and the
inference engine for clinic information based on the patient’s
symptoms is constructed from a medical encyclopedia. An example
is illustrated in Figure 2.

System: 午安。需要服務嗎？
        Good afternoon. May I help you?
User:   是的，我要看醫生。
        Yes, I would like to see a doctor.
System: 請問您要掛哪一科？
        Which department would you like to make an appointment for?
User:   我不知道。
        I do not know.
System: 請問你有那些症狀？
        What symptoms do you have?
User:   我喉嚨痛又發燒。而且全身酸痛。
        I have a sore throat and a fever. I ache all over, too.
System: 好的，這些症狀多久了？
        Well, how long have you had these symptoms?
User:   好幾天了。
        For several days.
System: 建議您掛內科醫師。
        I suggest that you visit the Internal Medicine department.
User:   請問內科何時有門診？
        What is the clinic schedule of the Internal Medicine department?
System: 您可以參考醫院門診表，決定您的時間，我們的時間有…。
        You can check the Services Schedule and decide on a convenient
        time. The available times for now are....
Figure 2. An example dialogue

Twelve speech acts are defined, as shown in Figure 1. Every
service corresponds to the 12 speech acts with different
probabilities.
The acoustic speech recognition engine embedded in the dialogue
system is based on hidden Markov models (HMMs). The feature
vector consists of 26 MFCC coefficients. The decoding strategy is
based on the classical Viterbi algorithm. The character error
rate (CER) of the Chinese speech recognition system is 18.3
percent, and the vocabulary size is 25,132.
3.1 Analysis of corpus
The training corpus was collected by on-line recording at
National Cheng Kung University Hospital in the first phase and by
the Wizard-of-Oz method in the second phase. In total, there are
1,862 dialogues with 13,986 sentences in the corpus. The
frequencies of the speech acts used in the system are shown in
Figure 3.

[Figure 3: bar chart of speech act frequency (%), axis 0 to 14.
Categories as extracted: Others, Registration revision, Dr.'s
information, Time, Greeting, Clinic information, Registration,
FAQ, Dr. and Clinic, Confirmation (others), Confirmation
(clinic), Cancel registration. Bar values as extracted: 4.10,
2.70, 9.11, 13.96, 12.81, 13.46, 11.56, 10.71, 9.76, 4.70, 4.35,
2.75.]
Figure 3. Frequencies for each speech act

The number of dialogue turns is also important to the success of
the dialogue task. From the corpus, we observed that dialogues
with more than 15 turns usually failed to complete the task; that
is to say, common ground could not be achieved. These failed
dialogues were filtered out of the training corpus before the
following experiments were conducted. The distribution of the
number of turns per dialogue is shown in Figure 4.

[Figure 4: histogram; x-axis: Length (Turns), 1 to 16; y-axis:
Frequency, 0 to 350.]
Figure 4. The distribution of the number of turns per dialogue
3.2 Precision of speech act identification related to the corpus size
The size of the training corpus is crucial to the practicability
of the proposed method. In this experiment, we analyze how the
number of training sentences affects the precision of speech act
identification using the semantic dependency graphs with and
without the discourse information. The precision of speech act
identification reached 95.6% with history, for a training corpus
of 10,036 sentences, and 92.4% without history, for a training
corpus of 7,012 sentences. This means that the semantic
dependency graph with discourse outperforms that without
discourse, but more training data are needed to include the
discourse for speech act identification. Figure 5 shows the
relationship between the speech act identification rate and the
size of the training corpus. From this figure, we can see that
the semantic dependency graph with discourse analysis needs more
training sentences than that without discourse. This implies that
discourse analysis plays an important role in the identification
of the speech act.
3.3 Performance analysis of semantic depend-
ency graph 
To evaluate the performance, two systems were developed for
comparison. One is based on the Bayes’ classifier (Walker et al.,
1997), and the other uses the partial pattern tree (Wu et al.,
2004) to identify the speech act of the user’s utterances. Since
the dialogue discourse is defined as a sequence of speech acts,
the prediction of the speech act of a new input utterance becomes
the core issue for discourse modeling. The accuracy of speech act
identification is shown in Table 1.
According to the results, semantic dependency graphs obtain an
obvious improvement over the other approaches. The reason is that
identifying the speech act requires not only the meanings of the
words or concepts but also the structural information and the
implicit semantic relations defined in the knowledge base.
Besides, taking the discourse into consideration improves the
prediction of the speech act of the new or next utterance. This
means that the discourse model can improve the accuracy of speech
act identification; that is to say, discourse modeling can help
in understanding the user’s intention, especially when the answer
is very short.

[Figure 5: line chart; x-axis: Size of corpus (the number of
sentences, in thousands), 1 to 14; y-axis: Speech act
identification rate (%), 50 to 100; two curves: semantic
dependency graph with discourse analysis, semantic dependency
graph without discourse analysis.]
Figure 5. The relation between the speech act identification
rate and the size of the training corpus
                                       Semantic dependency graph
Speech act                             With         Without       PPT          Bayes’
                                       discourse    discourse                  classifier
Clinic information (26 sentences)      100 (26)     96.1 (25)     88 (23)      92 (24)
Dr.’s information (42 sentences)       97 (41)      92.8 (39)     66.6 (28)    92.8 (39)
Confirmation (others) (42 sentences)   95 (40)      95 (40)       95 (40)      95 (40)
Others (14 sentences)                  57.1 (8)     50 (7)        43 (6)       38 (5)
FAQ (13 sentences)                     70 (9)       53.8 (7)      61.5 (8)     46 (6)
Clinic information (135 sentences)     98.5 (133)   96.2 (130)    91.1 (123)   93.3 (126)
Time (38 sentences)                    94.7 (36)    89.4 (34)     97.3 (37)    92.1 (35)
Registration (75 sentences)            100 (75)     100 (75)      86.6 (65)    86.6 (65)
Cancel registration (10 sentences)     90 (9)       80 (8)        60 (6)       80 (8)
Average precision                      95.6         92.4          85           88.1
Table 1. The accuracy (%) for speech act identification; the
number of correctly identified sentences is given in parentheses
For example, the user may only say “yes” or “no” for
confirmation. Misclassification of the speech act then happens
because of the limited information. However, a better
interpretation can be obtained by introducing the semantic
dependency relations as well as the discourse information.
To obtain a single measurement, the average accuracy of speech
act identification is also shown in Table 1. The best approach is
the semantic dependency graph with discourse. This means that the
discourse information helps speech act identification. The
semantic dependency graph also outperforms the traditional
approaches owing to the semantic analysis of words with their
corresponding relations.
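The average row of Table 1 can be recomputed from the per-class counts. The sketch below assumes that the average is a micro-average, that is, the total number of correctly identified sentences divided by the total number of test sentences; the paper does not state how the average was formed. The counts are transcribed from Table 1 and the variable names are illustrative.

```python
# (speech act, sentences, correct: SDG with discourse, SDG without discourse, PPT, Bayes')
rows = [
    ("Clinic information", 26, 26, 25, 23, 24),
    ("Dr.'s information", 42, 41, 39, 28, 39),
    ("Confirmation (others)", 42, 40, 40, 40, 40),
    ("Others", 14, 8, 7, 6, 5),
    ("FAQ", 13, 9, 7, 8, 6),
    ("Clinic information", 135, 133, 130, 123, 126),
    ("Time", 38, 36, 34, 37, 35),
    ("Registration", 75, 75, 75, 65, 65),
    ("Cancel registration", 10, 9, 8, 6, 8),
]

# Column sums: total sentences followed by total correct per approach.
counts = [row[1:] for row in rows]
totals = [sum(column) for column in zip(*counts)]

# Micro-averaged accuracy (%) per approach, rounded to one decimal place.
averages = [round(100 * correct / totals[0], 1) for correct in totals[1:]]
```

With the counts as transcribed, the 395 test sentences give 95.4, 92.4, 85.1 and 88.1 for the four approaches; the average row of Table 1 reads 95.6, 92.4, 85 and 88.1.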
 
The success of the dialogue depends on achieving common ground
between the users and the machine, which is the most important
issue in dialogue management. To compare the semantic dependency
graph with previous approaches, 150 individuals who were not
involved in the development of this project were asked to use the
dialogue system to measure the task success rate. After the
incomplete tasks were filtered out, 131 dialogues were used as
the analysis data in this experiment. The results are listed in
Table 2.
 
                           SDG1    SDG2    PPT     Bayes’
Task completion rate (%)   87.2    85.5    79.4    80.2
Average number of turns    8.3     8.7     10.4    10.5
SDG1: with discourse analysis; SDG2: without discourse analysis
Table 2. Comparison of the task completion rate and the number
of dialogue turns between different approaches
 
We found that the dialogue completion rate and the average
length of the dialogues using the dependency graph are better
than those using the Bayes’ classifier and the partial pattern
tree approach. There are two main reasons. First, the dependency
graph can keep the most important information in the user’s
utterance, while in the semantic slot/frame approach, the
semantic objects that do not match the semantic slot/frame are
generally filtered out. The dependency graph approach can
therefore skip repeated or similar utterances that would fill the
same information into different semantic slots. Second, the
dependency graph-based approach can provide inference to help
interpret the user’s intention.
For semantic understanding, correct interpretation of the
information in the user’s utterances is essential. Correct speech
act identification and correct extraction of the semantic objects
are both important issues for semantic understanding in spoken
dialogue systems. Five main categories of the medical application
are analyzed in this experiment: clinic information, Dr.’s
information, confirmation of the clinic information, registration
time, and clinic inference.
 
                        SDG     PPT     Bayes’
Clinic information      95.0    89.5    90.3
Dr.’s information       94.3    71.7    92.4
Confirmation (Clinic)   98.0    98.0    98.0
Clinic                  97.3    74.6    78.6
Time                    97.6    97.8    95.5
SDG: with discourse analysis
Table 3. Correct rates (%) for semantic object extraction
 
According to the results shown in Table 3, the worst case
occurred in the query for the Dr.’s information using the partial
pattern tree. The misidentification of the speech act results in
unmatched semantic slots/frames. This does not happen with the
semantic dependency graph, since it always keeps the most
important semantic objects according to the dependency relations
rather than the semantic slots. Rather than filtering out the
unmatched semantic objects, the semantic dependency graph is
constructed to keep the semantic relations in the utterance. This
means that the system can preserve most of the user’s information
via the semantic dependency graphs. We can observe that the
identification rate of the speech act is higher for the semantic
dependency graph than for the partial pattern tree and the Bayes’
classifier, as shown in Table 3.
4 Conclusion  
This paper has presented a semantic dependency graph that
robustly and effectively deals with a variety of conversational
discourse information in spoken dialogue systems. By modeling the
dialogue discourse as a speech act sequence, a predictive method
for speech act identification is proposed based on discourse
analysis instead of keywords only. The corpus analysis indicates
that the proposed model is practicable and effective. The
experimental results show that the semantic dependency graph
outperforms the approaches based on the Bayes’ classifier and
partial pattern trees. The results also show that integrating
discourse analysis improves not only the identification rate of
the speech act but also the performance of semantic object
extraction.
Acknowledgements 
The authors would like to thank the National 
Science Council, Republic of China, for its finan-
cial support of this work, under Contract No. NSC 
94-2213-E-006-018. 
References  
J. F. Allen, D. K. Byron, D. M. Ferguson, L. Galescu, 
and A. Stent. 2001. Towards Conversational Hu-
man-Computer Interaction.  AI Magazine. 
C. F. Baker, C. J. Fillmore, and J. B. Lowe. 1998. The 
Berkeley FrameNet Project. In Proceedings of 
COLING/ACL. 86-90 
K. J. Chen, C. R. Huang, F. Y. Chen, C. C. Luo, M. C.
Chang, and C. J. Chen. 2001. Sinica Treebank: Design
Criteria, Representational Issues and Implementation.
In Anne Abeille, editor, Building and Using
Syntactically Annotated Corpora. Kluwer. 29-37
Z. Dong and Q. Dong. 2006. HowNet and the computa-
tion of meaning. World Scientific Publishing Co Inc. 
J. Gao, and H. Suzuki. 2003. Unsupervised learning of 
dependency structure for language modeling. In 
Proceedings of ACL 2003, 521-528.   
D. Gildea and D. Jurafsky. 2002. Automatic labeling of 
semantic roles. Computational Linguistics, 28(3). 
245–288. 
K. Hacioglu, S. Pradhan, W. Ward, J. Martin, and D. 
Jurafsky. 2003. Shallow semantic parsing using 
support vector machines. Technical Report TR-
CSLR-2003-1, Center for Spoken Language Re-
search, Boulder, Colorado. 
K. Hacioglu and W. Ward. 2003. Target word detection 
and semantic role chunking using support vector 
machines. In HLT-03. 
R. Higashinaka, N. Miyazaki, M. Nakano, and K. Ai-
kawa. 2004. Evaluating Discourse Understanding in 
Spoken Dialogue Systems. ACM Transactions on 
Speech and Language Processing (TSLP), Volume 1,   
1-20. 
X. Huang, A. Acero, and H.-W. Hon. 2001. Spoken
Language Processing. Prentice-Hall, Inc.
T. Kudo and Y. Matsumoto. 2000. Japanese Dependency
Structure Analysis Based on Support Vector
Machines. In Proceedings of the EMNLP. 18–25
M. F. McTear. 2002. Spoken Dialogue Technology:
Enabling the Conversational User Interface. ACM
Computing Surveys, Vol 34, No. 1, 90-169.
B. Rajesh and B. Linda. 2004. Taxonomy of speech-enabled
applications
(http://www106.ibm.com/developerworks/wireless/library/wi-tax/)
J. Searle. 1979. Expression and Meaning: Studies in the 
Theory of Speech Acts. New York, Cambridge Uni-
versity Press. 
A. Stolcke, K. Ries, N. Coccaro, E. Shriberg, R. Bates, 
D. Jurafsky, P. Taylor, R. Martin, C. Van Ess-
Dykema, and M. Meteer. 2000. Dialogue act model-
ing for automatic tagging and recognition of conver-
sational speech. Computational Linguistics 26(3), 
339--373. 
M. A. Walker, D. Litman, C. Kamm, and A. Abella, 
1997. PARADISE: a general framework for evaluat-
ing spoken dialogue agents. In Proceedings of the 
ACL, 271–280 
M. Walker  and R. Passonneau.  2001. DATE: a dia-
logue act tagging scheme for evaluation of spoken 
dialogue systems. In Proceedings of the first inter-
national conference on Human language technology 
research. 1-8. 
Y.-Y. Wang and A. Acero. 2003. Combination of CFG 
and N-gram Modeling in Semantic Grammar Learn-
ing, In Proceedings of the Eurospeech Conference. 
Geneva, Switzerland. September 2003.  
C.-H. Wu, J.-F. Yeh, and M.-J. Chen. 2004. Speech 
Act Identification using an Ontology-Based Partial 
Pattern Tree. in Proceedings of ICSLP 2004, Jeju, 
Korea, 2004. 
 
