<?xml version="1.0" standalone="yes"?> <Paper uid="P01-1070"> <Title>Using Machine Learning Techniques to Interpret WH-questions</Title> <Section position="7" start_page="0" end_page="0" type="concl"> <SectionTitle> 6 Discussion and Future Work </SectionTitle> <Paragraph position="0"> We have introduced a predictive model, built by applying supervised machine-learning techniques, which can be used to infer a user's key informational goals from free-text questions posed to an Internet search service. The predictive model, which is built from shallow linguistic features of users' questions, infers a user's information need, the level of detail requested by the user, the level of detail deemed appropriate by an information provider, and the topic, focus and restrictions of the user's question. The performance of our model is encouraging, in particular for shorter queries, and for queries with certain information needs. However, further improvements are required in order to make this model practically applicable. We believe there is an opportunity to identify additional linguistic distinctions that could improve the model's predictive performance. For example, we intend to represent frequent combinations of PoS, such as NOUN+NOUN, which are currently classified as OTHER (Section 3). We also propose to investigate predictive models which return more informative predictions than those returned by our current model, e.g., a distribution of the probable informational goals, instead of a single goal. This would enable an enhanced QA system to apply a decision procedure in order to determine a course of action.
For example, if the Additional value of the Coverage Would Give variable has a relatively high probability, the system could consider more than one Information Need, Topic or Focus when generating its reply.</Paragraph> <Paragraph position="1"> In general, the decision-tree generation methods described in this paper do not have the ability to take into account the relationships among different target variables. In Section 5.3, we investigated this problem by building decision trees which incorporate predicted and actual values of target variables. Our results indicate that it is worth exploring the relationships between several of the target variables. We intend to use the insights obtained from this experiment to construct models which can capture probabilistic dependencies among variables.</Paragraph> <Paragraph position="2"> Finally, as indicated in Section 1, this project is part of a larger effort centered on improving a user's ability to access information from large information spaces. The next stage of this project involves using the predictions generated by our model to enhance the performance of QA or IR systems. One such enhancement pertains to query reformulation, whereby the inferred informational goals can be used to reformulate or expand queries in a manner that increases the likelihood of returning appropriate answers. As an example of query expansion, if Process was identified as the Information Need of a query, words that boost responses to searches for information relating to processes could be added to the query prior to submitting it to a search engine.</Paragraph> <Paragraph position="3"> Another envisioned enhancement would attempt to improve the initial recall of the document retrieval process by submitting queries which contain the content words in the Topic and Focus of a user's question (instead of including all the content words in the question). 
In the longer term, we plan to explore the use of Coverage results to enable an enhanced QA system to compose an appropriate answer from information found in the retrieved documents.</Paragraph> </Section> </Paper>