Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume, pages 288–291,
New York City, June 2006. c©2006 Association for Computational Linguistics
Factoid Question Answering with Web, Mobile and Speech Interfaces
E.W.D. Whittaker J. Mrozinski S. Furui
Dept. of Computer Science
Tokyo Institute of Technology
2-12-1, Ookayama, Meguro-ku
Tokyo 152-8552 Japan
a0 edw,mrozinsk,furui
a1 @furui.cs.titech.ac.jp
Abstract
In this paper we describe the web and
mobile-phone interfaces to our multi-
language factoid question answering (QA)
system together with a prototype speech
interface to our English-language QA sys-
tem. Using a statistical, data-driven ap-
proach to factoid question answering has
allowed us to develop QA systems in five
languages in a matter of months. In the
web-based system, which is accessible
at http://asked.jp, we have com-
bined the QA system output with standard
search-engine-like results by integrating it
with an open-source web search engine.
The prototype speech interface is based
around a VoiceXML application running
on the Voxeo developer platform. Recog-
nition of the user’s question is performed
on a separate speech recognition server
dedicated to recognizing questions. An
adapted version of the Sphinx-4 recog-
nizer is used for this purpose. Once the
question has been recognized correctly it
is passed to the QA system and the re-
sulting answers read back to the user by
speech synthesis. Our approach is mod-
ular and makes extensive use of open-
source software. Consequently, each com-
ponent can be easily and independently
improved and easily extended to other lan-
guages.
1 Introduction
The approach to factoid question answering (QA)
that we adopt was first described in (Whittaker et
al., 2005b) where the details of the mathematical
model and how it was trained for English were
given. The approach has been successfully evalu-
ated in the 2005 text retrieval conference (TREC)
question answering track evaluations (Voorhees and
Trang Dang, 2005) where our group placed eleventh
out of thirty participants (Whittaker et al., 2005a).
Although the TREC QA task is substantially differ-
ent to web-based QA this evaluation showed that our
approach works and provides an objective assess-
ment of its quality. Similarly, for our Japanese lan-
guage system we have evaluated the performance of
our approach on the NTCIR-3 QAC-1 task (Whit-
taker et al., 2005c). Although our Japanese ex-
periments were applied retrospectively, the results
would have placed us in the mid-range of partici-
pating systems. In (Whittaker et al., 2006b) we de-
scribed how our approach could be used for the rapid
development of web-based QA systems in five very
different languages. It was shown that a developer
proficient with the tools, and with access to suitable
training data, could build a system in a new language
in around 10 hours. In (Whittaker et al., 2006a) we
evaluated the performance of the systems for four of
our five languages. We give a brief summary of our
approach to QA in Section 2.
In this paper we introduce our web-based
QA system which is publicly accessible at
http://asked.jp, permitting questions in En-
glish, Japanese, Chinese, Russian and Swedish and
288
is discussed in Section 3. Since answers in factoid
QA are inherently well-suited to display on small
screens we have also made a mobile-phone interface
which is accessible at the same address when using
an HTML browser from a mobile phone. This is dis-
cussed in Section 4. There are several other QA sys-
tems on the web including Brainboost (Brainboost,
2005) and Lexxe (Lexxe, 2005) but they only try to
answer questions in English and do not have conve-
nient mobile interfaces.
Entering whole questions rather than just key-
words is tedious especially on a mobile-phone so
we have also begun to look at speech interfaces. In
this paper we describe a prototype speech interface
to our English-language QA system. This prototype
is currently intended primarily as a platform for fur-
ther research into speech recognition and answering
of questions from an acoustic modelling point-of-
view (e.g. low-bandwidth, low-quality VoIP chan-
nel), from a language modelling perspective (e.g. ir-
regular word order in questions vs. text, and very
large out-of-vocabulary problem) and also in terms
of dialog modelling. There have been several at-
tempts at speech interfaces to QA systems in the lit-
erature e.g. (Schofield and Zheng, 2003) but as far
as we know ours is the only system that is publicly
accessible. We discuss this interface in Section 5.
2 Statistical pattern classification
approach to QA
The answer to a question depends primarily on the
question itself but also on many other factors such
as the person asking the question, the location of the
person, what questions the person has asked before,
and so on. For simplicity, we choose to consider
only the dependence of an answer a0 on the question
a1 . In particular, we hypothesize that the answer
a0
depends on two sets of features extracted from a1 :
a2a4a3a6a5a8a7 a1a10a9 and
a11
a3a13a12a14a7 a1a10a9 as follows:
a15a16a7
a0a18a17
a1a10a9 a3a19a15a20a7
a0a21a17
a2a23a22
a11
a9 a22 (1)
where a2 can be thought of as a set of a24a26a25 features
describing the “question-type” part of a1 such as
who, when, where, which, etc. and a11 is a set of fea-
tures comprising the “information-bearing” part of
a1 i.e. what the question is actually about and what
it refers to. For example, in the questions, Where
is Mount Everest? and How high is Mount Ever-
est? the information-bearing component is identical
in both cases whereas the question-type component
is different.
Finding the best answer a27a0 involves a search over
all a0 for the one which maximizes the probability of
the above model:
a27a0
a3a29a28a31a30a33a32a35a34a36a28a38a37
a39
a15a20a7
a0a21a17
a2a23a22
a11
a9a41a40 (2)
This is guaranteed to give us the optimal answer
in a maximum likelihood sense if the probability dis-
tribution is the correct one. Making various condi-
tional independence assumptions to simplify mod-
elling we obtain the final optimisation criterion:
a28a31a30a33a32a42a34a36a28a38a37
a39
a15a16a7
a0a18a17a43a11
a9
a44 a45a47a46 a48
a49a51a50a53a52a54a49a56a55a57a50a53a58a60a59a41a61
a62a42a63a56a64a33a50a53a61
a65
a15a20a7a26a2
a17a38a0
a9
a44 a45a47a46 a48
a66
a55a54a61a67a52a68a50a53a49
a62a69a63a56a64a33a50a70a61
a40 (3)
The a15a20a7 a0a71a17a72a11 a9 model is essentially a language
model which models the probability of an answer
sequence a0 given a set of information-bearing fea-
tures a11 . It models the proximity of a0 to features in
a11 . This model is referred to as the retrieval model.
The a15a20a7a26a2 a17a73a0 a9 model matches an answer a0 with
features in the question-type set a2 . Roughly speak-
ing this model relates ways of asking a question with
classes of valid answers. For example, it associates
names of people or companies with who-type ques-
tions. In general, there are many valid and equiprob-
able a0 for a given a2 so this component can only
re-rank candidate answers retrieved by the retrieval
model. Consequently, we call it the filter model.
3 Web interface
The web-based interface to our QA systems has been
accessible at http://asked.jp since Decem-
ber 2005 and although still primarily a research sys-
tem and not widely advertised it attracts around five
unique users a day. Currently we do not perform
language detection for an input question so the user
must first select a language-specific system before
inputting a question in a language other than En-
glish.
In Figure 1 we show the results page for the ques-
tion “How high is Mount Everest?”. As can be seen
289
the left-hand side of the page contains the familiar
title, link and summaries of pages relevant to the
query that is common to most of today’s web search
engines. These results are produced by an open-
source web search engine which is run locally and
currently contains about 100 million web-pages in
its database. Down the right-hand side of the results
page we present the answers that were found by our
QA system. These answers are presented in order
of probability as determined by Equation (3). When
the mouse is rolled over an answer a Java-script pop-
up box is displayed that shows more context for a
given answer. This allows the user to determine
more rapidly the validity of an answer and also par-
tially compensates for inaccurate answer identifica-
tion by the system. Each answer can also be clicked
on whereupon the user is redirected to the page from
which the answer was taken. This re-direction is ef-
fected through a redirect via our own web-server so
that for a given question we can see which answers
were clicked on. Eventually, it is hoped this could
be used for unsupervised system adaptation.
Figure 1: Results page for “How high is Mount Everest?”.
The same basic layout and design is repeated for
each of the five language-specific systems. In Fig-
ure 2 we show the results page for the Japanese ques-
tion of “What plant do Pandas eat?”.
The average response time to present the full re-
sults page for a question in each language is cur-
rently around 10 seconds. The web-search and QA
systems are run in parallel and the outputs combined
when both are complete.
Figure 2: Results page for “What plant do Pandas eat?” in
Japanese.
4 Mobile-phone interface
Since the priorities with a mobile-phone interface re-
volve around speed, display size and cost to the user,
the interface is basically a whittled down version of
the web-based interface described in the previous
section. The only requirement for being able to use
the mobile phone interface is that the phone must
contain an HTML browser. In countries like Japan
this has been fairly standard for many years but it is
expected that this will become more common world-
wide in the near future with the continued roll-out of
3G mobile phone services.
For the mobile-phone interface the standard web-
search results section has been removed entirely
from the results section and instead only the top 20
short answers are displayed without pop-up boxes
or corresponding context. Such a strategy mini-
mizes the number of bytes transmitted and ensures
that most answers are adequately displayed on most
mobile-phone interfaces with a minimum amount of
scrolling. Although not yet implemented we aim to
allow users to click on an answer and be taken to
the part of the page that contains the answer rather
than loading a whole page which could sometimes
be several megabytes in size.
5 Speech interface
Our implementation of the speech interface to the
QA system was greatly simplified by the avail-
290
ability of the Voxeo developer platform1 which
provides free access, for development purposes,
to a VoiceXML browser running our application.
The application can be accessed through: (i)
a U.S. telephone number at (800) 289-5570
then PIN:9991423955; (ii) SIP VoIP clients
at (SIP:9991423955sip.voxeo.net); (iii)
Free World Dialup at (**86919991423955); and
(iv) SkypeOut at (+99000936 9991423955).
Since most VoiceXML applications are designed
for use with small vocabulary, rule-based gram-
mars we only use VoiceXML and Voxeo’s browser
to handle negotiation of the questions and answers
with the user through simple dialogs. The recog-
nition of the question itself is performed using a
dedicated large-vocabulary speech recognizer with a
language model (LM) trained on English-language
questions. The speech recognizer we use is the
open-source Sphinx-4 recognizer (Walker et al.,
2004) which runs in a server mode and has been
adapted to use more complex LMs than those per-
mitted by the default ARPA format word a0 -gram
LMs. Currently we use a linear interpolation of
a word and class-based trigram LM each of which
were trained on a large corpus of English-language
questions (Hallmarks, 2002)—the same data used to
train the English-language QA system (Whittaker et
al., 2005b).
6 Conclusion and Further work
Having recapped a basic overview of our statistical
approach to question answering (QA), in this paper
we have described the web and mobile-phone in-
terfaces to our multi-language QA system and how
they can be accessed. In addition, we have de-
scribed our first attempt at a prototype speech in-
terface which will be used as a platform for future
research. Eventually our aim is to make the QA per-
formance of the speech interface the same as that
obtained through the web and mobile-phone inter-
faces. This will be achieved through a combination
of acoustic, language and dialog model adaptation
on the speech side, and making the QA system more
robust to underspecified and errorful questions on
the QA side. We think these demonstration systems
show significant progress has already been made and
1http://www.voxeo.com/developers
give a hint of how information access to QA systems
might be achieved in the near future.
7 Acknowledgments
This research was supported by JSPS and the
Japanese government 21st century COE programme.
The authors also wish to thank Dietrich Klakow for
all his contributions.
References
Brainboost. 2005. http://www.brainboost.com.
Academic Hallmarks. 2002. Knowledge Master Edu-
cational Software. PO Box 998, Durango, CO 81302
http://www.greatauk.com/.
Lexxe. 2005. http://www.lexxe.com.
E. Schofield and Z. Zheng. 2003. A Speech Interface for
Open-domain Question-answering. In Proceedings of
the 41st Annual Meeting of the ACL, Sapporo, Japan,
July.
E.M. Voorhees and H. Trang Dang. 2005. Overview of
the TREC 2005 Question Answering Track. In Pro-
ceedings of the 14th Text Retrieval Conference.
W. Walker et al. 2004. Sphinx-4: A Flexible Open
Source Framework for Speech Recognition. Techni-
cal report, Sun Microsystems Inc.
E.W.D. Whittaker, P. Chatain, S. Furui, and D. Klakow.
2005a. TREC2005 Question Answering Experiments
at Tokyo Institute of Technology. In Proceedings of
the 14th Text Retrieval Conference.
E.W.D. Whittaker, S. Furui, and D. Klakow. 2005b. A
Statistical Pattern Recognition Approach to Question
Answering using Web Data. In Proceedings of Cyber-
worlds.
E.W.D. Whittaker, J. Hamonic, and S. Furui. 2005c. A
Unified Approach to Japanese and English Question
Answering. In Proceedings of NTCIR-5.
E.W.D. Whittaker, J. Hamonic, T. Klingberg, Y. Dong,
and S. Furui. 2006a. Monolingual Web-based Factoid
Question Answering in Chinese, Swedish, English and
Japanese. In Proceedings of the Workshop on Multi-
language Question Answering, EACL.
E.W.D. Whittaker, J. Hamonic, T. Klingberg, Y. Dong,
and S. Furui. 2006b. Rapid Development of Web-
based Monolingual Question Answering Systems. In
Proceedings of ECIR2006.
291
