<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1809">
  <Title>Incorporating User Models in Question Answering to Improve Readability</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
QUESTION PROCESSING - DOCUMENT RETRIEVAL - ANSWER EXTRACTION
</SectionTitle>
    <Paragraph position="0"> The QA module, described in the following section, is organized according to the three-tier partition underlying most state-of-the-art systems: 1) question processing, 2) document retrieval, 3) answer generation. The module makes use of a web search engine for document retrieval and consults the user model to obtain the criteria to filter and re-rank the search engine results and to eventually present them appropriately to the user.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 User model
</SectionTitle>
      <Paragraph position="0"> Depending on the application of interest, the user model (UM) can be designed to suit the information needs of the QA module in different ways.</Paragraph>
      <Paragraph position="1"> Our current application, YourQA2, is a learningoriented system to help students find information on the Web for their assignments. Our UM consists of the user's: * age range, a [?] {7[?]11,11[?]16,adult} * reading level, r [?] {poor,medium,good} * webpages of interest/bookmarks, w The age range parameter has been chosen to match the partition between primary school, contemporary school and higher education age in</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2http://www.cs.york.ac.uk/aig/aqua
</SectionTitle>
      <Paragraph position="0"> Britain; our reading level parameter takes three values which ideally (but not necessarily) correspond to the three age ranges and may be further refined in the future for more fine-grained modelling. null Analogies can be found with the SeAn (Ardissono et al., 2001), and SiteIF (Magnini and Strapparava, 2001) news recommender systems, where information such as age and browsing history, resp. are part of the UM. More generally, our approach is similar to that of personalized search systems (Teevan et al., 2005; Pitkow et al., 2002), which construct UMs based on the user's documents and webpages.</Paragraph>
      <Paragraph position="1"> In our system, UM information is explicitly collected from the user; while age and reading level are self-assessed, the user's interests are extracted from the document set w using a keyphrase extractor (see further for details). Eventually, a dialogue framework with a history component will contribute to the construction and update of the user model in a less intruding and thus more user-friendly way. In this paper we focus on how to adapt search result presentation using the reading level parameter: age and webpages will not be discussed. null</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Related work
</SectionTitle>
      <Paragraph position="0"> Non-factoids and user modelling As mentioned above, the TREC-QA evaluation campaign, to which the vast majority of current QA systems abide, mainly approaches factoid-based answers.</Paragraph>
      <Paragraph position="1"> To our knowledge, our system is among the first to address the need for a different approach to non-factoid answers. The structure of our QA component reflects the typical structure of a web-based QA system in its three-tier composition. Analogies in this can be found for instance in MUL-DER (Kwok et al., 2001), which is organized according to a question processing/answer extraction/passage ranking pipeline. However, a significant aspect of novelty in our architecture is that the QA component is supported by the user model.</Paragraph>
      <Paragraph position="2"> Additionally, we have changed the relative importance of the different tiers: while we drastically reduce linguistic processing during question processing and answer generation, we give more relief to the post-retrieval phase and to the role of the UM. Having removed the need for fine-grained answer spotting, the emphasis is shifted towards finding closely connected sentences that are highly  relevant to answer the query.</Paragraph>
      <Paragraph position="3"> Readability Within computational linguistics, several applications have been designed to address the needs of users with low reading skills. The computational approach to textual adaptation is commonly based on natural language generation: the process &amp;quot;translate&amp;quot; a difficult text into a syntactically and lexically simpler version. In the case of PSET (Carroll et al., 1999) for instance, a tagger, a morphological analyzer and generator and a parser are used to reformulate newspaper text for users affected by aphasia. Another interesting research is Inui et al.'s lexical and syntactical paraphrasing system for deaf students (Inui et al., 2003). In this system, the judgment of experts (teachers) is used to learn selection rules for paraphrases acquired using various methods (statistical, manual, etc.).</Paragraph>
      <Paragraph position="4"> In the SKILLSUM project (Williams and Reiter, 2005), used to generate literacy test reports, a set of choices regarding output (cue phrases, ordering and punctuation) are taken by a micro-planner based on a set of rules.</Paragraph>
      <Paragraph position="5"> Our approach is conceptually different from the above: exploiting the wealth of information available in the context of a Web-based QA system, we can afford to choose among the documents available on a given subject those which best suit our readability requirements. This is possible thanks to the versatility of language modelling, which allows us to tailor the readability estimation of documents to any kind of user profile in a dynamic manner, as explained in section 3.2.3.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 QA Module
</SectionTitle>
    <Paragraph position="0"> In this section we discuss the information flow among the subcomponents of the QA module (see Figure 2 for a representative diagram) and focus on reading level estimation and document filtering. For further details on the implementation of the QA module, see (Quarteroni and Manandhar, 2006).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Question Processing
</SectionTitle>
      <Paragraph position="0"> The first step performed by YourQA is query expansion: additional queries are created replacing question terms with synonyms using WordNet3.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Retrieval and Result Processing
3.2.1 Document retrieval
</SectionTitle>
      <Paragraph position="0"> We use Google4 to retrieve the top 20 documents returned for each of the queries issued from the query expansion phase. The subsequent steps will progressively narrow the parts of these documents where relevant information is located.</Paragraph>
      <Paragraph position="1">  Keyphrase extraction is useful in two ways: first, it produces features to group the retrieved documents thematically during the clustering phase, and thus enables to present results by groups. Secondly, when the document parameter (w) of the UM is active, matches are sought between the keyphrases extracted from the documents and those extracted from the user's set of interesting documents; thus it is possible to prioritize results which are more compatible with his/her interests.</Paragraph>
      <Paragraph position="2"> Hence, once the documents are retrieved, we extract their keyphrases using Kea (Witten et al., 1999), an extractor based on Naive Bayes classification. Kea first splits each document into phrases and then takes short subsequences of these initial phrases as candidate keyphrases. Two attributes are used to classify a phrase p as a keyphrase or a non-keyphrase: its TFxIDF score within the set of retrieved documents and the index of p's first appearance in the document. Kea outputs a ranked list of phrases, among which we select the top three as keyphrases for each of our documents.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2.3 Estimation of reading levels
</SectionTitle>
      <Paragraph position="0"> In order to adjust search result presentation to the user's reading ability, we estimate the reading difficulty of each retrieved document using the Smoothed Unigram Model, a variation of a Multinomial Bayes classifier (Collins-Thompson and Callan, 2004). Whereas other popular approaches such as Flesch-Kincaid (Kincaid et al., 1975) are based on sentence length, the language modelling approach accounts especially for lexical information. The latter has been found to be more effective as the former when approaching the reading level of subjects in primary and secondary school age (Collins-Thompson and Callan, 2004). Moreover, it is more applicable than length-based approach for Web documents, where sentences are typically short regardless of the complexity of the text.</Paragraph>
      <Paragraph position="1"> The language modelling approach proceeds in two phases: in the training phase, given a range of reading levels, a set of representative documents is collected for each reading level. A unigram language model lms is then built for each set s; the model consists of a list of the word stems appearing in the training documents with their individual probabilities. Textual readability is not modelled at a conceptual level: thus complex concepts explained in simple words might be classified as suitable even for a poor reading level; However we have observed that in most Web documents lexical, syntactic and conceptual complexity are usually consistent within documents, hence it makes sense to apply a reasoning-free technique without impairing readability estimation. Our unigram language models account for the following read- null ing levels: 1) poor, i.e. suitable for ages 7 - 11; 2) medium, suitable for ages 11-16; 3) good, suitable for adults.</Paragraph>
      <Paragraph position="2">  This partition in three groups has been chosen to suit the training data available for our school application, which consists of about 180 HTML pages (mostly from the &amp;quot;BBC schools&amp;quot;5, &amp;quot;Think Energy&amp;quot;6, &amp;quot;Cassini Huygens resource for schools&amp;quot;7 and &amp;quot;Magic Keys storybooks&amp;quot;8 websites), explicitly annotated by the publishers according to the reading levels above.</Paragraph>
      <Paragraph position="3"> In the test phase, given an unclassified docu- null ment D, the estimated reading level of D is the language model lmi maximizing the likelihood L(lmi|D) that D has been generated by lmi. Such likelihood is estimated using the formula:</Paragraph>
      <Paragraph position="5"> where w is a word in the document, C(w,d) represents the number of occurrences of w in D and P(w|lmi) is the probability that w occurs in lmi (approached by its frequency).</Paragraph>
      <Paragraph position="6"> An advantage of language modelling is its portability, since it is quite quick to create word stem/frequency histograms on the fly. This implies that models can be produced to represent more fine-grained reading levels as well as the specific requirements of a single user: the only necessary information are sets of training documents representing each level to be modelled.</Paragraph>
      <Paragraph position="7">  As an indicator of inter-document relatedness, we use document clustering (Steinbach et al., 2000) to group them using both their estimated reading difficulty and their topic (i.e. their keyphrases). In particular we use a hierarchical algorithm, Cobweb (implemented using the WEKA suite of tools (Witten and Frank, 2000) as it produces a cluster tree which is visually simple to analyse: each leaf corresponds to one document, and sibling leaves denote documents that are strongly related both in topic and in reading difficulty. Figure 3 illustrates an example cluster tree for the the query: &amp;quot;Who painted the Sistine chapel?&amp;quot;. Leaf labels represent document keyphrases extracted by Kea for the corresponding documents and ovals represent non-terminal nodes in the cluster tree (these are labelled using the most common keyphrases in their underlying leaves).</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Answer Extraction
</SectionTitle>
      <Paragraph position="0"> The purpose of answer extraction is to present the most interesting excerpts of the retrieved documents according to both the user's query topics and reading level. This process, presented in sections 3.3.1 - 3.3.4, follows the diagram in Figure 2: we use the UM to filter the clustered documents, then compute the similarity between the question and the filtered document passages in order to return the best ones in a ranked list.</Paragraph>
      <Paragraph position="1">  underlying node 4 have a medium reading level; leaf 2 represents a poor reading level document.</Paragraph>
      <Paragraph position="2">  The documents in the cluster tree are filtered according to the UM reading level, r: only those compatible with the user's reading ability are retained for further analysis. However, if the number of retained documents does not exceed a given threshold, we accept in our candidate set part of the documents having the next lowest readability in case r [?] {good,medium} or a medium readability in case r = poor.</Paragraph>
      <Paragraph position="3">  Within each of the documents retained, we seek for the sentences which are semantically most relevant to the query. Given a sentence p and the query q, we represent them as two sets of words</Paragraph>
      <Paragraph position="5"> The semantic distance from p to q is then: distq(p) =summationtext1[?]i[?]m minj[d(pwi,qwj)] where d(pwi,qwj) represents the Jiang-Conrath word-level distance between pwi and qwj (Jiang and Conrath, 1997), based on WordNet 2.0. The intuition is that for each question word, we find the word in the candidate answer sentence which minimizes the word-level distance and then we compute the sum of such minima.</Paragraph>
      <Paragraph position="6">  For a given document, we can thus isolate a sentence s minimizing the distance to the query. The passage P, i.e. a window of up to 5 sentences centered on s, will be a candidate result. We assign to such passage a score equal to the similarity of s to the query; in turn, the score of P is used as the score of the document containing it. We also define a ranking function for clusters, which allows to order them according to the maximal score of their component documents. Passages from the highest ranking cluster will be presented first to the user, in decreasing order of score, followed by the passages from lower ranking clusters.</Paragraph>
      <Paragraph position="7">  To present our answers, we fix a threshold for the number of results to be returned following the ranking exposed above. Each result consists of a title and document passage where the sentence which best answers the query is highlighted; the URL of the original document is also available for loading if the user finds the passage interesting and wants to read more.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>