<?xml version="1.0" standalone="yes"?> <Paper uid="N06-4010"> <Title>Factoid Question Answering with Web, Mobile and Speech Interfaces</Title> <Section position="3" start_page="288" end_page="288" type="metho"> <SectionTitle> 2 Statistical pattern classification </SectionTitle> <Paragraph position="0"> approach to QA The answer to a question depends primarily on the question itself but also on many other factors such as the person asking the question, the location of the person, what questions the person has asked before, and so on. For simplicity, we choose to consider only the dependence of an answer a0 on the question a1 . In particular, we hypothesize that the answer depends on two sets of features extracted from a1 :</Paragraph> <Paragraph position="2"> where a2 can be thought of as a set of a24a26a25 features describing the &quot;question-type&quot; part of a1 such as who, when, where, which, etc. and a11 is a set of features comprising the &quot;information-bearing&quot; part of a1 i.e. what the question is actually about and what it refers to. For example, in the questions, Where is Mount Everest? and How high is Mount Everest? the information-bearing component is identical in both cases whereas the question-type component is different.</Paragraph> <Paragraph position="3"> Finding the best answer a27a0 involves a search over all a0 for the one which maximizes the probability of the above model:</Paragraph> <Paragraph position="5"> This is guaranteed to give us the optimal answer in a maximum likelihood sense if the probability distribution is the correct one. Making various conditional independence assumptions to simplify modelling we obtain the final optimisation criterion:</Paragraph> <Paragraph position="7"> The a15a20a7 a0a71a17a72a11 a9 model is essentially a language model which models the probability of an answer sequence a0 given a set of information-bearing features a11 . It models the proximity of a0 to features in a11 . This model is referred to as the retrieval model.</Paragraph> <Paragraph position="8"> The a15a20a7a26a2 a17a73a0 a9 model matches an answer a0 with features in the question-type set a2 . Roughly speaking this model relates ways of asking a question with classes of valid answers. For example, it associates names of people or companies with who-type questions. In general, there are many valid and equiprobable a0 for a given a2 so this component can only re-rank candidate answers retrieved by the retrieval model. Consequently, we call it the filter model.</Paragraph> </Section> <Section position="4" start_page="288" end_page="289" type="metho"> <SectionTitle> 3 Web interface </SectionTitle> <Paragraph position="0"> The web-based interface to our QA systems has been accessible at http://asked.jp since December 2005 and although still primarily a research system and not widely advertised it attracts around five unique users a day. Currently we do not perform language detection for an input question so the user must first select a language-specific system before inputting a question in a language other than English. null In Figure 1 we show the results page for the question &quot;How high is Mount Everest?&quot;. As can be seen the left-hand side of the page contains the familiar title, link and summaries of pages relevant to the query that is common to most of today's web search engines. These results are produced by an open-source web search engine which is run locally and currently contains about 100 million web-pages in its database. 
<Section position="4" start_page="288" end_page="289" type="metho"> <SectionTitle> 3 Web interface </SectionTitle> <Paragraph position="0"> The web-based interface to our QA systems has been accessible at http://asked.jp since December 2005 and, although it is still primarily a research system and not widely advertised, it attracts around five unique users a day. Currently we do not perform language detection for an input question, so the user must first select a language-specific system before inputting a question in a language other than English. In Figure 1 we show the results page for the question &quot;How high is Mount Everest?&quot;. As can be seen, the left-hand side of the page contains the familiar titles, links and summaries of pages relevant to the query that are common to most of today's web search engines. These results are produced by an open-source web search engine which is run locally and currently contains about 100 million web pages in its database. Down the right-hand side of the results page we present the answers that were found by our QA system. These answers are presented in order of probability as determined by Equation (3). When the mouse is rolled over an answer, a JavaScript pop-up box is displayed that shows more context for that answer. This allows the user to determine more rapidly the validity of an answer and also partially compensates for inaccurate answer identification by the system. Each answer can also be clicked on, whereupon the user is redirected to the page from which the answer was taken. This re-direction is routed via our own web server so that, for a given question, we can see which answers were clicked on. Eventually, it is hoped this could be used for unsupervised system adaptation.</Paragraph> <Paragraph position="1"> The same basic layout and design is repeated for each of the five language-specific systems. In Figure 2 we show the results page for the Japanese question &quot;What plant do Pandas eat?&quot;.</Paragraph> <Paragraph position="2"> The average response time to present the full results page for a question in each language is currently around 10 seconds. The web-search and QA</Paragraph> </Section>
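As a concrete illustration of the click-through logging described above, the sketch below shows one way a logging redirect could be implemented. It is a hypothetical example, not the deployed asked.jp server: the port, the query parameters (q, a, url) and the log file name are all illustrative assumptions.

```python
# Hypothetical sketch of an answer-click redirect: the answer link points at
# our own server, which records (question, answer, target URL) and then
# issues an HTTP redirect to the source page.

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

class ClickLogger(BaseHTTPRequestHandler):
    def do_GET(self):
        qs = parse_qs(urlparse(self.path).query)
        question = qs.get("q", [""])[0]
        answer = qs.get("a", [""])[0]
        target = qs.get("url", ["/"])[0]
        # Append the click to a log that could later drive unsupervised adaptation.
        with open("clicks.log", "a", encoding="utf-8") as log:
            log.write(f"{question}\t{answer}\t{target}\n")
        self.send_response(302)                 # redirect the user onward
        self.send_header("Location", target)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), ClickLogger).serve_forever()
```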
<Section position="5" start_page="289" end_page="289" type="metho"> <SectionTitle> 4 Mobile-phone interface </SectionTitle> <Paragraph position="0"> Since the priorities with a mobile-phone interface revolve around speed, display size and cost to the user, the interface is basically a whittled-down version of the web-based interface described in the previous section. The only requirement for being able to use the mobile-phone interface is that the phone must contain an HTML browser. In countries like Japan this has been fairly standard for many years, but it is expected to become more common world-wide in the near future with the continued roll-out of 3G mobile-phone services.</Paragraph> <Paragraph position="1"> For the mobile-phone interface the standard web-search results have been removed entirely and instead only the top 20 short answers are displayed, without pop-up boxes or corresponding context. Such a strategy minimizes the number of bytes transmitted and ensures that most answers are adequately displayed on most mobile-phone screens with a minimum amount of scrolling. Although not yet implemented, we aim to allow users to click on an answer and be taken to the part of the page that contains the answer, rather than loading a whole page, which could sometimes be several megabytes in size.</Paragraph> </Section> <Section position="6" start_page="289" end_page="290" type="metho"> <SectionTitle> 5 Speech interface </SectionTitle> <Paragraph position="0"> Our implementation of the speech interface to the QA system was greatly simplified by the availability of the Voxeo developer platform, which provides free access, for development purposes, to a VoiceXML browser running our application. The application can be accessed through: (i) a U.S. telephone number at (800) 289-5570 then PIN: 9991423955; (ii) SIP VoIP clients at (SIP:9991423955@sip.voxeo.net); (iii) Free World Dialup at (**86919991423955); and (iv) SkypeOut at (+99000936 9991423955).</Paragraph> <Paragraph position="1"> Since most VoiceXML applications are designed for use with small-vocabulary, rule-based grammars, we only use VoiceXML and Voxeo's browser to handle negotiation of the questions and answers with the user through simple dialogs. The recognition of the question itself is performed using a dedicated large-vocabulary speech recognizer with a language model (LM) trained on English-language questions. The speech recognizer we use is the open-source Sphinx-4 recognizer (Walker et al., 2004), which runs in a server mode and has been adapted to use more complex LMs than those permitted by the default ARPA-format word n-gram LMs. Currently we use a linear interpolation of word-based and class-based trigram LMs, each of which was trained on a large corpus of English-language questions (Hallmarks, 2002), the same data used to train the English-language QA system (Whittaker et al., 2005b).</Paragraph> </Section> </Paper>
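To make the language-model interpolation concrete, the following is a small self-contained sketch of linearly combining a word-based and a class-based trigram LM when scoring a word. The toy probability tables, the class map and the interpolation weight are illustrative assumptions, and the class-LM factorisation P(c3 | c1, c2) · P(w3 | c3) is a common scheme rather than necessarily the authors' exact formulation.

```python
# Sketch of linear interpolation between a word trigram LM and a class
# trigram LM. All tables, the class map and the weight lam are toy values.

import math

word_trigram = {("how", "high", "is"): 0.20}              # P(w3 | w1, w2)
class_trigram = {("WH", "ADJ", "VERB"): 0.35}             # P(c3 | c1, c2)
word2class = {"how": "WH", "high": "ADJ", "is": "VERB"}   # word -> class map
emission = {("is", "VERB"): 0.10}                         # P(w | c)

def interpolated_logprob(w1, w2, w3, lam=0.6, floor=1e-6):
    """log( lam * P_word(w3|w1,w2) + (1-lam) * P_class(w3|w1,w2) )."""
    p_word = word_trigram.get((w1, w2, w3), floor)
    c1, c2, c3 = (word2class.get(w, "OTHER") for w in (w1, w2, w3))
    p_class = class_trigram.get((c1, c2, c3), floor) * emission.get((w3, c3), floor)
    return math.log(lam * p_word + (1.0 - lam) * p_class)

if __name__ == "__main__":
    print(interpolated_logprob("how", "high", "is"))
```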