<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3002">
  <Title>WoZ Simulation of Interactive Question Answering</Title>
  <Section position="3" start_page="11" end_page="11" type="metho">
    <SectionTitle>
3 Setting
</SectionTitle>
    <Paragraph position="0"> Referring to the headlines in the Mainichi and Yomiuri newspapers from 2000 and 2001, we selected 101 topics, which included events, persons, and organizations. For each topic, a summary of between 800 and 1600 characters and an abstract of around 100 characters were constructed using a full-text search system on the newspaper articles.2 Four experts shared this preparation work. Twenty topics were selected from among the original 101 on the basis that enough information had been gathered and compiled into the summary.3 The topics consisted of 5 persons, 2 organizations, 7 events, 5 artifacts, and 1 syndrome, and included Mr. Colin Powell; Charles, Prince of Wales; the accident of a Russian nuclear-powered submarine; the Concorde; &quot;Harry Potter&quot;; and Windows XP.</Paragraph>
    <Paragraph position="1"> The four experts who shared the preparation work acted as WoZs in the simulated QA systems; each expert participated in dialogues on the topics for which she/he had written the summary and abstract, and tried to answer users' questions using the summary, a full-text search system, and her/his memory.</Paragraph>
    <Paragraph position="2"> Six subjects participated in the experiment, and 10 topics were assigned to each subject; that is, each topic was handled by three subjects. The subjects behaved as users of the simulated QA systems. Presented with the topic and abstract, the subjects were directed to think up questions beforehand, assuming that they had to write a report on that topic, and then to participate in an information access dialogue with the &quot;QA system.&quot; The suggested number of questions for each topic was around 10.</Paragraph>
    <Paragraph position="3"> The dialogues were conducted through keyboards and character displays. The subjects were told that the system could answer only simple factoid questions, and the WoZs were instructed not to answer complicated questions, such as those asking for reasons or opinions. No restrictions on the usage of language expressions were imposed; wording was left to the discretion of the WoZs, though canned phrases such as &quot;Please wait a moment&quot; and &quot;Sorry, the answer could not be found&quot; were prepared in advance. The WoZs were also instructed that they could clarify users' questions when these were ambiguous or vague, and that their answers should be simple, although cooperative and helpful responses were not forbidden.</Paragraph>
    <Paragraph position="4"> 2 Those lengths of Japanese text roughly correspond to 300 to 600 words and 40 words of English text, respectively.</Paragraph>
    <Paragraph position="5"> 3 We prepared 101 topics to meet the additional needs of other experiments; it is by no means the case that only 20 of the 101 were acceptable for our experiments.</Paragraph>
    <Paragraph position="6"> USR: What is the title of the first book of the &quot;Harry Potter&quot; series?
SYS: &quot;Harry Potter and the Sorcerer's Stone.&quot;
USR: How many books were sold?
SYS: The Japanese translation sold 600 thousand copies in the first half year after its publication.
USR: Who is the author?
SYS: J. K. Rowling.
USR: Who are the main characters (in that series)?
SYS: Harry Potter, Ron Weasley, Hagrid and so on.
USR: What kind of person is Harry Potter?
SYS: A child with a wizard and a witch for parents.
USR: What is the plot?
SYS: It is an adventure story involving Harry Potter and his friends, who enter a school for witchcraft and wizardry.
USR: How many people went to the movie?
SYS: In the US, box-office sales of the movie &quot;Harry Potter and the Sorcerer's Stone&quot; reached 188 million dollars in the first 10 days after release.</Paragraph>
    <Paragraph position="7"> An example of the collected dialogues is shown in Figure 2. In the figure, SYS stands for the utterances of the QA system simulated by a WoZ, and USR represents those of the user, namely a subject. In the rest of the paper, these are referred to as system's utterances and user's utterances, respectively.</Paragraph>
  </Section>
  <Section position="4" start_page="11" end_page="14" type="metho">
    <SectionTitle>
4 Coding and Results
</SectionTitle>
    <Paragraph position="0"> Excluding meta-utterances for dialogue control such as &quot;Please wait a moment&quot; and &quot;That's all,&quot; 620 pairs of utterances were collected, of which 22 system utterances were for clarification. Among the remaining 598 cases, the system gave some answer in 502 cases, and the other 96 utterances were negative responses: 86 utterances said that the answer could not be found, and 10 said that the question was too complicated or that the system could not answer that type of question.</Paragraph>
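The bookkeeping above can be double-checked in a few lines; the category names below are shorthand introduced here for the paper's coding, and the counts are those reported in the text (the 502 answers plus the 86 + 10 negative responses account for all 598 non-clarification utterances):

```python
# Tally of the utterance coding reported in Section 4 (category names are
# our shorthand; counts are taken from the text).
coding = {
    "clarification": 22,     # system clarification requests
    "answered": 502,         # system gave some answer
    "answer_not_found": 86,  # "the answer could not be found"
    "cannot_answer": 10,     # question too complicated / wrong type
}

total_pairs = 620  # utterance pairs, meta-utterances excluded

non_clarification = total_pairs - coding["clarification"]
negative = coding["answer_not_found"] + coding["cannot_answer"]

print(non_clarification)              # 598
print(coding["answered"] + negative)  # 598
```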
    <Section position="1" start_page="11" end_page="12" type="sub_section">
      <SectionTitle>
4.1 Characteristics of questions and answers
</SectionTitle>
      <Paragraph position="0"> The syntactic classification of user utterances and its distribution are shown in Table 1. The numbers in parentheses are occurrence counts. Despite the direction to use wh-type questions, more than 10% of the utterances are yes-no questions or imperatives requesting information. Most of the user responses to clarification questions from the system are rephrasings of the question concerned; only one response has a declarative form. Examples of rephrasing are shown in section 4.3.</Paragraph>
      <Paragraph position="1"> The classification of user questions and requests according to the subject asked about or requested is shown in Table 2; the classification of system answers according to their syntactic and semantic categorization is shown in Table 3. In Table 2, the classification of yes-no questions was estimated from the information provided in the helpful responses to them. The classification in Table 3 was based on the syntactic and semantic form of the exact answer itself rather than on the whole system utterance. For example, the system utterance &quot;He was born on April 5, 1935,&quot; the answer to &quot;When was Mr. Colin Powell born?&quot;, is categorized not as a sentence but as a date expression.</Paragraph>
    </Section>
    <Section position="2" start_page="12" end_page="13" type="sub_section">
      <SectionTitle>
4.2 Pragmatic phenomena
</SectionTitle>
      <Paragraph position="0"> Japanese has four major types of anaphoric devices: pronouns, zero pronouns, definite noun phrases, and ellipses. Zero pronouns, in which the pronoun is not apparent on the surface, are very common in Japanese. As Japanese also has a determiner system completely different from that of English, the difference between definite and indefinite is not apparent on the surface, and definite noun phrases usually have the same form as generic noun phrases. Table 4 summarizes the pragmatic phenomena observed.</Paragraph>
      <Paragraph position="1"> The total number is more than 620, as some utterances contain more than one anaphoric expression. &quot;How many crew members were in the submarine when the accident happened?&quot; is an example of a question with multiple anaphoric expressions.</Paragraph>
      <Paragraph position="2"> Among the 203 questions with no reference expression, 60 (30%) are the first of a series of utterances on the same topic. The others fall into two cases. In the first and major case, the current focus appears literally rather than as a pronoun or definite noun phrase. The second, relatively minor, case is the beginning of a subdialogue or the resumption of the main topic after one. For example, when a user resumed talking about Charles, Prince of Wales after a few questions about his children, the user used a literal expression such as &quot;Who is the long-term partner of Charles, Prince of Wales?&quot;; after talking about the details of the accident, a user asked &quot;What is the major characteristic of Concorde?&quot; Of the 437 reference expressions in Table 4 other than ellipses, 94 (22%) refer to items other than the global topic. Moreover, 76 of those (81%) are the sole reference expression in their utterance. The existence of questions that do not refer to the global topic, together with that of questions containing no reference expression, means that the focus shifts within these dialogues, so the local focus can differ from the global topic.</Paragraph>
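The percentages quoted in this subsection follow directly from the reported counts; the variable names below are ours:

```python
# Reference-expression counts reported in Section 4.2.
no_ref_questions = 203   # questions with no reference expression
first_in_series = 60     # first utterance of a series on the same topic
ref_expressions = 437    # reference expressions excluding ellipses
non_global = 94          # refer to items other than the global topic
sole_in_utterance = 76   # sole reference expression in the utterance

print(round(100 * first_in_series / no_ref_questions))  # 30
print(round(100 * non_global / ref_expressions))        # 22
print(round(100 * sole_in_utterance / non_global))      # 81
```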
    </Section>
    <Section position="3" start_page="13" end_page="13" type="sub_section">
      <SectionTitle>
4.3 Clarifications
</SectionTitle>
      <Paragraph position="0"> As mentioned above, only 22 clarification utterances were observed in the experiment. They are categorized into those asking about an ambiguous referent (13 utterances) and those asking about a vague question subject (9 utterances). Examples of each case are shown below.</Paragraph>
      <Paragraph position="1"> Clarification asking about an ambiguous referent
USR: When was he [Mr. Colin Powell] installed as head of security of the United States?
SYS: December 6, 2000.</Paragraph>
      <Paragraph position="2"> USR: Who was the president at that time?
SYS: Mr. George W. Bush.</Paragraph>
      <Paragraph position="3"> USR: Which political party did he belong to?
SYS: Whose political party are you asking about?
USR: Which political party did Mr. Colin Powell belong to?
Clarification asking about a vague question subject
USR: Who developed Windows XP?
SYS: Are you asking for a company name?
USR: What company developed Windows XP?</Paragraph>
    </Section>
    <Section position="4" start_page="13" end_page="14" type="sub_section">
      <SectionTitle>
4.4 Cooperative and helpful responses
</SectionTitle>
      <Paragraph position="0"> Cooperative and helpful responses, in which, for example, information not directly requested by the user was additionally provided (Kaplan 1983), were frequently observed in place of simple, literal answers to the users' questions and requests. The results of analyzing such cooperative and helpful behavior in the system responses are shown in Table 5. Questions asking for reasons, procedures, situations, definitions, descriptions, or explanations, which fall into the last three categories in Table 2 apart from &quot;others,&quot; were excluded from this analysis, since for those questions it is difficult to judge objectively whether some information is additional and, in addition, most of them are outside the scope of QACIAD, which is our main interest. Table 5 shows the number of occurrences of each behavior and its percentage among positive responses. Sometimes more than one behavior was observed in a single utterance. Of the 334 positive responses analyzed, 148 (44%) contained some helpful behavior. The definitions of the cooperative and helpful behaviors listed in Table 5, with manually constructed examples, are given below.</Paragraph>
      <Paragraph position="1"> Providing additional information
Information not directly requested by the user is additionally provided.</Paragraph>
      <Paragraph position="2"> USR: What year was Mr. Colin Powell born?
SYS: He was born in 1935 in New York.</Paragraph>
      <Paragraph position="3"> Providing detailed information
More detailed information than that directly requested by the user is provided.</Paragraph>
      <Paragraph position="4"> USR: What year was Mr. Colin Powell born?
SYS: He was born on April 5, 1935.</Paragraph>
      <Paragraph position="5"> Elaboration of subject
A relatively vague subject of a question is elaborated and made concrete.</Paragraph>
      <Paragraph position="6"> In the following example, a question with the vague expression &quot;how well&quot; is answered with a concrete statistic.</Paragraph>
      <Paragraph position="7"> USR: How well did the movie do?
SYS: In the US, box-office sales reached 188 million dollars in the first 10 days after release.</Paragraph>
      <Paragraph position="8"> Providing alternative information
Rather than the exact information the user requested, related alternative information is provided. An alternative may be proposed by relaxing a condition imposed by the user. In the following example, attendance figures were replaced with box-office sales.</Paragraph>
      <Paragraph position="9"> USR: How many people went to the movie?
SYS: In the US, box-office sales reached 188 million dollars in the first 10 days after release.</Paragraph>
      <Paragraph position="10"> Exemplification
Instead of a comprehensive list or an abstract general description, some major instances are listed.</Paragraph>
      <Paragraph position="11"> USR: What kinds of character goods are on sale?
SYS: Stuffed owls, uniforms for schools of witchcraft and wizardry, and so on.</Paragraph>
      <Paragraph position="12"> Providing speculation
Speculations and opinions, rather than facts, are provided; this can be considered a special case of providing alternative information. Although Table 5 shows few cases of this behavior, many such answers were given in response to questions asking for reasons and situations.</Paragraph>
      <Paragraph position="13"> USR: What exploded?
SYS: It is surmised that torpedoes in the shed at the fore exploded.</Paragraph>
      <Paragraph position="14"> Detecting and correcting erroneous user assumptions is a well-known helpful behavior discussed in the literature (Pollack 1989; Kaplan 1983). Only two occurrences of such behavior were observed, one of which is shown below.</Paragraph>
      <Paragraph position="15"> USR: How long did it take from the explosion until the sinking?
SYS: The explosion happened while the submarine was going down, and ...</Paragraph>
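As a compact summary, the helpful behaviors of this subsection can be written down as a small coding scheme; the keys below are our shorthand, not the paper's labels, and since per-category counts appear only in Table 5, only the aggregate figure reported in the text is checked:

```python
# Helpful-behavior categories from Section 4.4 (shorthand keys are ours).
HELPFUL_BEHAVIORS = [
    "additional_information",   # info not directly requested is added
    "detailed_information",     # more detail than requested
    "subject_elaboration",      # vague subject made concrete
    "alternative_information",  # related info after relaxing a condition
    "exemplification",          # major instances instead of a full list
    "speculation",              # surmise/opinion instead of fact
    "assumption_correction",    # erroneous user assumption corrected
]

helpful, positive = 148, 334   # aggregate reported in the text
print(len(HELPFUL_BEHAVIORS))           # 7
print(round(100 * helpful / positive))  # 44
```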
    </Section>
  </Section>
</Paper>