<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-1007">
  <Title>An Analysis of Clarification Dialogue for Question Answering</Title>
  <Section position="7" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
7 Results
</SectionTitle>
    <Paragraph position="0"> An implementation of the algorithm was evaluated on the TREC context questions used to develop the algorithm and then on the collection of 500 new clarification dialogue questions. The results on the TREC data, which was used to develop the algorithm, were as follows (see below for discussion and an  Where &amp;quot;New&amp;quot; indicates the ability to recognize whether the current question is the first in a new series of clarification questions and &amp;quot;Clarif.&amp;quot; (for &amp;quot;Clarification&amp;quot;) indicates the ability to recognize whether the current question is a clarification question. The results for the same experiments conducted on the collected data were as follows:  information and simply took a question to be a clarification question if it had any words in common with the previous n questions, else took the question to be the beginning of a new series. 64% of questions in the new collection could be recognized with this simple algorithm, which did not misclassify any &amp;quot;new&amp;quot; questions.</Paragraph>
    <Paragraph position="1"> Method 1. This method employed point 1 of the algorithm described in section 5: 62% of questions in the new collection could be recognized as clarification questions simply by looking for &amp;quot;reference&amp;quot; keywords such as he, she, this, so, etc. which clearly referred to previous questions. Interestingly this did not misclassify any &amp;quot;new&amp;quot; questions.</Paragraph>
    <Paragraph position="2"> Method 2. This method employed points 1 and 2 of the algorithm described in section 5: 5% of questions in the new collection could be recognized simply by looking for the absence of verbs, which, combined with keyword lookup (Method 1), improved performance to 66%. Again this did not misclassify any &amp;quot;new&amp;quot; questions.</Paragraph>
    <Paragraph position="3"> Method 3a. This method employed the full algorithm described in section 5 (point 3 is the similarity measure algorithm described in section 6): clarification recognition rose to 91% of the new collection by looking at the similarity between nouns in the current question and nouns in the previous questions, in addition to reference words and the absence of verbs. Misclassification was a serious problem, however with correctly classified &amp;quot;new&amp;quot; questions falling to 67%.</Paragraph>
    <Paragraph position="4"> Method 3b. This was the same as method 3a, but specified a similarity threshold when employing the similarity measure described in section 6: this required the nouns in the current question to be similar to nouns in the previous question beyond a specified similarity threshold. This brought clarification question recognition down to 89% of the new collection, but misclassification of &amp;quot;new&amp;quot; questions was reduced significantly, with &amp;quot;new&amp;quot; questions being correctly classified 83% of the time.</Paragraph>
    <Paragraph position="5"> Problems noted were: * False positives: questions following a similar but unrelated question series. E.g. &amp;quot;Are they all Muslim countries?&amp;quot; (talking about religion, but in the context of a general conversation about Saudi Arabia) followed by &amp;quot;What is the chief religion in Peru?&amp;quot; (also about religion, but in a totally unrelated context).</Paragraph>
    <Paragraph position="6"> * Questions referring to answers, not previous questions (e.g. clarifying the meaning of a word contained in the answer, or building upon a concept defined in the answer: e.g. &amp;quot;What did Antonio Carlos Tobim play?&amp;quot; following &amp;quot;Which famous musicians did he play with?&amp;quot; in the context of a series of questions about Fank Sinatra: Antonio Carlos Tobim was referred to in the answer to the previous question, and nowhere else in the exchange. These made up 3% of the missed clarifications.</Paragraph>
    <Paragraph position="7"> * Absence of relationships in WordNet, e.g. between &amp;quot;NASDAQ&amp;quot; and &amp;quot;index&amp;quot; (as in share index). Absence of verb-noun relationships in WordNet, e.g. between to die and death, between &amp;quot;battle&amp;quot; and &amp;quot;win&amp;quot; (i.e. after a battle one side generally wins and another side loses), &amp;quot;airport&amp;quot; and &amp;quot;visit&amp;quot; (i.e. people who are visiting another country use an airport to get there) As can be seen from the tables above, the same experiments conducted on the TREC context questions yielded worse results; it was difficult to say, however, whether this was due to the small size of the TREC data or the nature of the data itself, which perhaps did not fully reflect &amp;quot;real&amp;quot; dialogues.</Paragraph>
    <Paragraph position="8"> As regards the recognition of question in a series (the recognition that a clarification I taking place), the number of sentences recognized by keyword alone was smaller in the TREC data (53% compared to 62%), while the number of questions not containing verbs was roughly similar (about 6%). The improvement given by computing noun similarity between successive questions gave worse results on the TREC data: using method 3a resulted in an improvement to the overall correctness of 19 percentage points, or a 32% increase (compared to an improvement of 25 percentage points, or a 38% increase on the collected data); using method  or a 22% increase (compared to an improvement of 23 percentage points or a 35% increase on the collected data), perhaps indicating that in &amp;quot;real&amp;quot; conversation speakers tend to use simpler semantic relationships than what was observed in the TREC data.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
8 Usefulness of Clarification Dialogue
Recognition
</SectionTitle>
      <Paragraph position="0"> Recognizing that a clarification dialogue is occurring only makes sense if this information can then be used to improve answer retrieval performance.</Paragraph>
      <Paragraph position="1"> We therefore hypothesized that noting that a questioner is trying to clarify previously asked questions is important in order to determine the context in which an answer is to be sought: in other words, the answers to certain questions are constrained by the context in which they have been uttered. The question &amp;quot;What does attenuate mean?&amp;quot;, for example, may require a generic answer outlining all the possible meanings of &amp;quot;attenuate&amp;quot; if asked in isolation, or a particular meaning if asked after the word has been seen in an answer (i.e.</Paragraph>
      <Paragraph position="2"> in a definite context which constrains its meaning). In other cases, questions do not make sense at all out of a context. For example, no answer could be given to the question &amp;quot;where?&amp;quot; asked on its own, while following a question such as &amp;quot;Does Sean have a house anywhere apart from Scotland?&amp;quot; it becomes an easily intelligible query.</Paragraph>
      <Paragraph position="3"> The usual way in which Question Answering systems constrain possible answers is by restricting the number of documents in which an answer is sought by filtering the total number of available documents through the use of an information retrieval engine. The information retrieval engine selects a subset of the available documents based on a number of keywords derived from the question at hand. In the simplest case, it is necessary to note that some words in the current question refer to words in previous questions or answers and hence use these other words when formulating the IR query. For example, the question &amp;quot;Is he married?&amp;quot; cannot be used as is in order to select documents, as the only word passed to the IR engine would be &amp;quot;married&amp;quot; (possibly the root version &amp;quot;marry&amp;quot;) which would return too many documents to be of any use. Noting that the &amp;quot;he&amp;quot; refers to a previously mentioned person (e.g. &amp;quot;Sean Connery&amp;quot;) would enable the answerer to seek an answer in a smaller number of documents. Moreover, given that the current question is asked in the context of a previous question, the documents retrieved for the previous related question could provide a context in which to initially seek an answer.</Paragraph>
      <Paragraph position="4"> In order to verify the usefulness of constraining the set of documents from in which to seek an answer, a subset made of 15 clarification dialogues (about 100 questions) from the given question data was analyzed by taking the initial question for a series, submitting it to the Google Internet Search Engine and then manually checking to see how many of the questions in the series could be answered simply by using the first 20 documents retrieved for the first question in a series.</Paragraph>
      <Paragraph position="5"> The results are summarized in the following diagram  by looking within the documents used for the previous question in the series, thus indicating the usefulness of noting the occurrence of clarification dialogue.</Paragraph>
      <Paragraph position="6"> * The remaining 31% could not be answered by making reference to the previously retrieved documents, and to find an answer a different approach had to be taken. In particular: * 6% could be answered after retrieving documents simply by using the words in the question as search terms (e.g. &amp;quot;What caused the boxer uprising?&amp;quot;); * 14% required some form of coreference resolution and could be answered only by combining the words in the question with the words to which the relative pronouns in the question referred (e.g.</Paragraph>
      <Paragraph position="7"> &amp;quot;What film is he working on at the moment&amp;quot;, with the reference to &amp;quot;he&amp;quot; resolved, which gets passed to the search engine as &amp;quot;What film is Sean Connery working on at the moment?&amp;quot;); * 7% required more than 20 documents to be retrieved by the search engine or other, more complex techniques. An example is a question such as &amp;quot;Where exactly?&amp;quot; which requires both an understanding of the context in which the question is asked (&amp;quot;Where?&amp;quot; makes no sense on its own) and the previously given answer (which was probably a place, but not restrictive enough for the questioner).</Paragraph>
      <Paragraph position="8"> * 4% constituted mini-clarification dialogues within a larger clarification dialogue (a slight deviation from the main topic which was being investigated by the questioner) and could be answered by looking at the documents retrieved for the first question in the mini-series.</Paragraph>
      <Paragraph position="9"> Recognizing that a clarification dialogue is occurring therefore can simplify the task of retrieving an answer by specifying that an answer must be in the set of documents used the previous questions. This is consistent with the results found in the TREC context task (Voorhees 2002), which indicated that systems were capable of finding most answers to questions in a context dialogue simply by looking at the documents retrieved for the initial question in a series. As in the case of clarification dialogue recognition, therefore, simple techniques can resolve the majority of cases; nevertheless, a full solution to the problem requires more complex methods. The last case indicates that it is not enough simply to look at the documents provided by the first question in a series in order to seek an answer: it is necessary to use the documents found for a previously asked question which is related to the current question (i.e. the questioner could &amp;quot;jump&amp;quot; between topics). For example, given the following series of questions starting with Q1: Q1: When was the Hellenistic Age? [...] Q5: How did Alexander the great become ruler? Q6: Did he conquer anywhere else? Q7: What was the Greek religion in the Hellenistic Age? where Q6 should be related to Q5 but Q7 should be related to Q1, and not Q6. In this case, given that the subject matter of Q1 is more immediately related to the subject matter of Q7 than Q6 (although the subject matter of Q6 is still broadly related, it is more of a specialized subtopic), the documents retrieved for Q1 will probably be more relevant to Q7 than the documents retrieved for Q6 (which would probably be the same documents retrieved for Q5)</Paragraph>
    </Section>
  </Section>
</Paper>