<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1906"> <Title>BRUJA: Question Classification for Spanish. Using Machine Translation and an English Classifier.</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Given a free-form question, a Question Answering (QA) system searches a large text collection and returns an accurate and concise answer to the user.</Paragraph> <Paragraph position="1"> The use of Cross-Language Information Retrieval (CLIR) systems is growing, as is their application within more general systems such as Question Answering or Question Classification.</Paragraph> <Paragraph position="2"> A CLIR system is an Information Retrieval system that works with collections in several languages and extracts relevant documents or passages (Grefenstette, 1998).</Paragraph> <Paragraph position="3"> We have proposed a Multilingual Question Answering System (BRUJA, in Spanish &quot;Busqueda de Respuestas University of Jaen&quot;) that works with collections in several languages. Because several languages are involved, tasks such as retrieving relevant documents and extracting the answer can be accomplished in two ways: using NLP tools and resources for each language, or using them for a pivot language only (English) and translating the rest of the relevant information into the pivot language when required. Because of the translation step, the second approach is less accurate but more practical, since it needs NLP resources only for English. The central question is whether the noise introduced by the translation process is too high for this approach to be usable in spite of its practical advantages.</Paragraph> <Paragraph position="4"> The first step of this system is a Question Classifier (QC). Given a query, the question classification module determines the class of the question. This information is useful for extracting the answer. 
For example, given the query &quot;Where is Madrid?&quot;, the QA system expects a location entity as the answer type. The proposed QA module works with questions in several languages, translates them into English using different online translators, and obtains the question type and some features, such as the focus, the keywords or the context. In this work we aim to find out whether a multilingual QC module is feasible when translation tools are used with English as the pivot language.</Paragraph> </Section> </Paper>