<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0706">
  <Title>Automating Help-desk Responses: A Comparative Study of Information-gathering Approaches</Title>
  <Section position="3" start_page="0" end_page="41" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Email inquiries sent to help desks often revolve around a small set of common questions and issues .1 This means that help-desk operators spend most of their time dealing with problems that have been previously addressed. Further, a signi cant proportion of help-desk responses contain a low level of technical content, corresponding, for example, to inquires addressed to the wrong group, or insuf cient detail provided by the customer about his or her problem. Organizations and clients would bene t if the efforts of human operators were focused on dif cult, atypical problems, and an automated process was employed to deal with the easier problems.</Paragraph>
    <Paragraph position="1">  com/ar/telecom_next_generation_customer.</Paragraph>
    <Paragraph position="2"> In this paper, we report on our experiments with corpus-based approaches to the automation of help-desk responses. Our study is based on a corpus of 30,000 email dialogues between users and help-desk operators at Hewlett-Packard. These dialogues deal with a variety of user requests, which include requests for technical assistance, inquiries about products, and queries about how to return faulty products or parts.</Paragraph>
    <Paragraph position="3"> In order to restrict the scope of our study, we considered two-turn short dialogues, comprising a request followed by an answer, where the answer has at most 15 lines. This yields a sub-corpus of 6659 dialogues. As a rst step, we have automatically clustered the corpus according to the sub-ject line of the rst email. This process yielded 15 topic-based datasets that contain between 135 and 1200 email dialogues. Owing to time limitations, the procedures described in this paper were applied to 8 of the datasets, corresponding to approximately 75% of the dialogues.</Paragraph>
    <Paragraph position="4"> Analysis of our corpus yields the following observations. null O1: Requests containing precise information, such as product names or part speci cations, sometimes elicit helpful, precise answers referring to this information, while other times they elicit answers that do not refer to the query terms, but contain generic information (e.g., referring customers to another help group or asking them to call a particular phone number). Request-answer pair RA1 in Figure 1 illustrates the rst situation, while the pair RA2 illustrates the second.2 2Our examples are reproduced verbatim from the corpus (except for URLs and phone numbers which have been disguised by us), and some have user or operator errors.  RA1: Do I need Compaq driver software for my armada 1500 docking station? This in order to be able to re-install win 98? I would recommend to install the latest system rompaq, on the laptop and the docking station. Just select the model of computer and the operating system you have. http://www.thislink.com.</Paragraph>
    <Paragraph position="5"> RA2: Is there a way to disable the NAT rewall on the Compaq CP-2W so I don't get a private ip address through the wireless network? Unfortunately, you have reached the incorrect eResponse queue for your unit. Your device is supported at the following link, or at 888-phone-number. We apologize for the inconvenience.</Paragraph>
    <Paragraph position="6">  O2: Operators tend to re-use the same sentences in different responses. This is partly a result of companies having in-house manuals that prescribe how to generate an answer. For instance, answers A3 and A4 in Figure 2 share the sentence in italics.</Paragraph>
    <Paragraph position="7"> These observations prompt us to consider complementary approaches along two separate dimensions of our problem. The rst dimension pertains to the technique applied to determine the information in an answer, and the second dimension pertains to the granularity of the information.</Paragraph>
    <Paragraph position="8"> Observation O1 leads us to consider two techniques for obtaining information: retrieval and prediction. Retrieval returns an information item by matching its terms to query terms (Salton and McGill, 1983). Hence, it is likely to obtain precise information if available. In contrast, prediction uses features of requests and responses to select an information item.</Paragraph>
    <Paragraph position="9"> For example, the absence of a particular term in a request may be a good predictive feature (which cannot be considered in traditional retrieval). Thus, prediction could yield replies that do not match particular query terms.</Paragraph>
    <Paragraph position="10"> Observation O2 leads us to consider two levels of granularity: document and sentence. That is, we can obtain a document comprising a complete answer on the basis of a request (i.e., re-use an answer to a previous request), or we can obtain individual sentences and then combine them to compose an answer, as is done in multi-document summarization (Filatova and Hatzivassiloglou, 2004). The sentence-level granuA3: null If you are able to see the Internet then it sounds like it is working, you may want to get in touch with your IT department to see if you need to make any changes to your settings to get it to work. Try performing a soft reset, by pressing the stylus pen in the small hole on the bottom left hand side of the Ipaq and then release.</Paragraph>
    <Paragraph position="11"> A4: I would recommend doing a soft reset by pressing the stylus pen in the small hole on the left hand side of the Ipaq and then release. Then charge the unit overnight to make sure it has been long enough and then see what happens. If the battery is not charging then the unit will need to be sent in for repair.</Paragraph>
    <Paragraph position="12"> Figure 2: Sample answers that share a sentence.</Paragraph>
    <Paragraph position="13"> larity enables the re-use of a sentence for different responses, as well as the composition of partial responses.</Paragraph>
    <Paragraph position="14"> The methods developed on the basis of these two dimensions are: Retrieve Answer, Predict Answer, Predict Sentences, Retrieve Sentences and Hybrid Predict-Retrieve Sentences. The rst four methods represent the possible combinations of information-gathering technique and level of granularity; the fth method is a hybrid where the two information-gathering techniques are applied at the sentence level. The generation of responses under these different methods combines different aspects of document retrieval, questionanswering, and multi-document summarization. Our aim in this paper is to investigate when the different methods are applicable, and whether individual methods are uniquely successful in certain situations. For this purpose, we decided to assign a level of success not only to complete responses, but also to partial ones (obtained with the sentence-based methods). The rationale for this is that we believe that a partial high-precision response is better than no response, and better than a complete response that contains incorrect information. We plan to test these assumptions in future user studies.</Paragraph>
    <Paragraph position="15"> The rest of this paper is organized as follows.</Paragraph>
    <Paragraph position="16"> In the next section, we describe our ve methods, followed by the evaluation of their results. In Section 4, we discuss related research, and then present our conclusions and plans for future work.</Paragraph>
  </Section>
class="xml-element"></Paper>