<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3406">
  <Title>Improving &quot;Email Speech Acts&quot; Analysis via N-gram Selection</Title>
  <Section position="4" start_page="36" end_page="36" type="metho">
    <SectionTitle>
3 The Corpus
</SectionTitle>
    <Paragraph position="0"> The CSpace email corpus used in this paper contains approximately 15,000 email messages collected from a management course at Carnegie Mellon University. This corpus originated from working groups who signed agreements to make certain parts of their email accessible to researchers. In this course, 277 MBA students, organized in approximately 50 teams of four to six members, ran simulated companies in different market scenarios over a 14-week period (Kraut et al., ). The email tends to be very task-oriented, with many instances of task delegation and negotiation.</Paragraph>
    <Paragraph position="1"> Messages were mostly exchanged with members of the same team. Accordingly, we partitioned the corpus into subsets according to the teams. The 1F3 team dataset has 351 messages total, while the 2F2, 3F2, 4F4 and 11F1 teams have, respectively, 341, 443, 403 and 176 messages. All 1716 messages were labeled according to the taxonomy in Figure</Paragraph>
  </Section>
  <Section position="5" start_page="36" end_page="38" type="metho">
    <SectionTitle>
4 N-gram Features
</SectionTitle>
    <Paragraph position="0"> In this section we detail the preprocessing step and the feature selection applied to all email acts.</Paragraph>
    <Section position="1" start_page="36" end_page="36" type="sub_section">
      <SectionTitle>
4.1 Preprocessing
</SectionTitle>
      <Paragraph position="0"> Before extracting the n-gram features, a sequence of preprocessing steps was applied to all email messages in order to emphasize the linguistic aspects of the problem. Unless otherwise mentioned, all preprocessing procedures were applied to all acts.</Paragraph>
      <Paragraph position="1"> Initially, forwarded messages quoted inside email messages were deleted. Also, signature files and quoted text from previous messages were removed from all messages using a technique described elsewhere (Carvalho and Cohen, 2004). A similar cleaning procedure was executed by Cohen et al. (2004).</Paragraph>
      <Paragraph position="2"> Some types of punctuation marks (&quot;,;:.)(][&quot;) were removed, as were extra spaces and extra page breaks. We then performed some basic substitutions, such as: from &quot;'m&quot; to &quot; am&quot;, from &quot;'re&quot; to &quot; are&quot;, from &quot;'ll&quot; to &quot; will&quot;, from &quot;won't&quot; to &quot;will not&quot;, from &quot;doesn't&quot; to &quot;does not&quot; and from &quot;'d&quot; to &quot; would&quot;.</Paragraph>
      <Paragraph position="3"> Any sequence of one or more numbers was replaced by the symbol &quot;[number]&quot;. The pattern &quot;[number]:[number]&quot; was replaced with &quot;[hour]&quot;. The expressions &quot;pm or am&quot; were replaced by &quot;[pm]&quot;. &quot;[wwhh]&quot; denoted the words &quot;why, where, who, what or when&quot;. The words &quot;I, we, you, he, she or they&quot; were replaced by &quot;[person]&quot;. Days of the week (&quot;Monday, Tuesday, ..., Sunday&quot;) and their short versions (i.e., &quot;Mon, Tue, Wed, ..., Sun&quot;) were replaced by &quot;[day]&quot;. The words &quot;after, before or during&quot; were replaced by &quot;[aaafter]&quot;. The pronouns &quot;me, her, him, us or them&quot; were substituted by &quot;[me]&quot;. The typical filename types &quot;.doc, .xls, .txt, .pdf, .rtf and .ppt&quot; were replaced by &quot;.[filetype]&quot;. A list with some of these substitutions is illustrated in</Paragraph>
    </Section>
    <Section position="2" start_page="36" end_page="36" type="sub_section">
      <SectionTitle>
Symbol Pattern
</SectionTitle>
      <Paragraph position="0">
[number]     any sequence of numbers
[hour]       [number]:[number]
[wwhh]       &quot;why, where, who, what, or when&quot;
[day]        the strings &quot;Monday, Tuesday, ..., or Sunday&quot;
[day]        the strings &quot;Mon, Tue, Wed, ..., or Sun&quot;
[pm]         the strings &quot;P.M., PM, A.M. or AM&quot;
[me]         the pronouns &quot;me, her, him, us or them&quot;
[person]     the pronouns &quot;I, we, you, he, she or they&quot;
[aaafter]    the strings &quot;after, before or during&quot;
[filetype]   the strings &quot;.doc, .pdf, .ppt, .txt, or .xls&quot;

For the Commit act only, references to the first person were removed from the symbol [person], i.e., [person] was used to replace &quot;he, she or they&quot;. The rationale is that n-grams containing the pronoun &quot;I&quot; are typically among the most meaningful for this act (as shall be detailed in Section 4.2).</Paragraph>
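The substitutions listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name preprocess, the lowercasing, the rule order, and the choice to map am/pm only when it follows a time or number (so that the &quot;am&quot; produced from &quot;'m&quot; is left alone) are all assumptions. Punctuation removal is omitted, and the abbreviated day names beyond those listed in the text are filled in from the calendar.

```python
import re

# Contractions named in the text; "won't" and "doesn't" go first so that
# the shorter suffix rules cannot interfere with them.
CONTRACTIONS = [
    ("won't", "will not"), ("doesn't", "does not"),
    ("'m", " am"), ("'re", " are"), ("'ll", " will"), ("'d", " would"),
]
DAYS = ("monday tuesday wednesday thursday friday saturday sunday "
        "mon tue wed thu fri sat sun").split()


def word_class(words):
    """Compile a whole-word pattern matching any of the given words."""
    return re.compile(r"\b(?:" + "|".join(words) + r")\b")


def preprocess(text, commit_act=False):
    """Apply the symbol substitutions (rule order is an assumption)."""
    text = text.lower()
    for old, new in CONTRACTIONS:
        text = text.replace(old, new)
    text = re.sub(r"\.(?:doc|xls|txt|pdf|rtf|ppt)\b", ".[filetype]", text)
    # Times are matched on raw digits, equivalent to [number]:[number].
    text = re.sub(r"\d+:\d+", "[hour]", text)
    text = re.sub(r"\d+", "[number]", text)
    # Assumption: am/pm is only rewritten directly after a time or number.
    text = re.sub(r"(\[hour\]|\[number\])\s*[ap]\.?m\b\.?", r"\1 [pm]", text)
    # For the Commit act, first- and second-person pronouns are kept.
    persons = (["he", "she", "they"] if commit_act
               else ["i", "we", "you", "he", "she", "they"])
    classes = [
        ("[wwhh]", ["why", "where", "who", "what", "when"]),
        ("[day]", DAYS),
        ("[aaafter]", ["after", "before", "during"]),
        ("[me]", ["me", "her", "him", "us", "them"]),
        ("[person]", persons),
    ]
    for symbol, words in classes:
        text = word_class(words).sub(symbol, text)
    return " ".join(text.split())  # collapse extra whitespace


print(preprocess("Can you call me at 5:30 pm on Monday? I'll send report.doc"))
# can [person] call [me] at [hour] [pm] on [day]? [person] will send report.[filetype]
```

With commit_act=True the pronouns &quot;I&quot;, &quot;we&quot; and &quot;you&quot; are left untouched, which mirrors the Commit-specific exception described above.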
    </Section>
    <Section position="3" start_page="36" end_page="38" type="sub_section">
      <SectionTitle>
4.2 Most Meaningful N-grams
</SectionTitle>
      <Paragraph position="0"> After preprocessing the 1716 email messages, n-gram sequence features were extracted. In this paper, n-gram features are all possible sequences of 1 (unigram or 1-gram), 2 (bigram or 2-gram), 3 (trigram or 3-gram), 4 (4-gram) and 5 (5-gram) terms. After extracting all n-grams, the new dataset had more than 347500 different features. It would be interesting to know which of these n-grams are the &quot;most meaningful&quot; for each of the email speech acts.</Paragraph>
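As a rough illustration of the extraction step, the sketch below collects every 1-gram to 5-gram from a tokenized message. The function name extract_ngrams, the whitespace tokenization and the use of a set (binary presence rather than counts) are assumptions made for the example; the text does not specify these details.

```python
def extract_ngrams(tokens, max_n=5):
    """Return the set of all n-grams (n = 1..max_n) in a token list."""
    features = set()
    for n in range(1, max_n + 1):
        # A sequence of T tokens contains T - n + 1 n-grams of length n.
        for start in range(len(tokens) - n + 1):
            features.add(" ".join(tokens[start:start + n]))
    return features


tokens = "[wwhh] do [person] think ?".split()
ngrams = extract_ngrams(tokens)
print(len(ngrams))                           # 15 = 5 + 4 + 3 + 2 + 1
print("[wwhh] do [person] think" in ngrams)  # True
```

Because every window of every length becomes a feature, the vocabulary grows quickly, which is consistent with the large feature count reported above.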
      <Paragraph position="1">
1-gram | 2-gram | 3-gram | 4-gram | 5-gram
? | do [person] | [person] need to | [wwhh] do [person] think | [wwhh] do [person] think ?
please | ? [person] | [wwhh] do [person] | do [person] need to | let [me] know [wwhh] [person]
[wwhh] | could [person] | let [me] know | and let [me] know | a call [number]-[number]
could | [person] please | would [person] | call [number]-[number] | give [me] a call [number]
do | ? thanks | do [person] think | would be able to | please give give [me] a call
can | are [person] | are [person] meeting | [person] think [person] need | [person] would be able to
of | can [person] | could [person] please | let [me] know [wwhh] | take a look at it
[me] | need to | do [person] need | do [person] think ? | [person] think [person] need to

One way to accomplish this is to apply a feature selection method. By computing the Information Gain score (Forman, 2003; Yang and Pedersen, 1997) of each feature, we were able to rank the most &quot;meaningful&quot; n-gram sequences for each speech act. The final rankings are illustrated in Tables 2 and 3.</Paragraph>
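The ranking step can be illustrated with a small, self-contained computation of Information Gain for binary features and a binary act label. This is a sketch under stated assumptions, not the authors' code: the names entropy, information_gain and rank_features are hypothetical, each message is represented as a set of n-grams, and labels are 1 (message has the act) or 0. The toy data is invented purely for the example.

```python
import math


def entropy(labels):
    """Binary entropy (in bits) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    h = 0.0
    for count in (sum(labels), len(labels) - sum(labels)):
        if count:
            p = count / len(labels)
            h -= p * math.log2(p)
    return h


def information_gain(docs, labels, feature):
    """IG = H(class) - H(class | feature present or absent)."""
    present = [y for d, y in zip(docs, labels) if feature in d]
    absent = [y for d, y in zip(docs, labels) if feature not in d]
    n = len(docs)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def rank_features(docs, labels):
    """Sort all observed features by decreasing Information Gain."""
    vocabulary = set().union(*docs)
    return sorted(vocabulary,
                  key=lambda f: information_gain(docs, labels, f),
                  reverse=True)


# Toy data (invented): 1 = message carries the act, 0 = it does not.
docs = [{"please"}, {"please", "thanks"}, {"thanks"}, {"fyi"}]
labels = [1, 1, 0, 0]
print(rank_features(docs, labels))  # ['please', 'fyi', 'thanks']
```

In the toy data, &quot;please&quot; separates the two classes perfectly and therefore receives the maximum score of 1 bit, while &quot;thanks&quot; appears equally often in both classes and scores 0.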
      <Paragraph position="2"> Table 2 shows the most meaningful n-grams for the Request act. The top features clearly agree with the linguistic intuition behind the idea of a Request email act. This agreement is present not only in the frequent 1-gram features, but also in the 2-grams, 3-grams, 4-grams and 5-grams. For instance, sentences such as &quot;What do you think ?&quot; or &quot;let me know what you ...&quot; can be instantiations of the top two 5-grams, and typically indicate a request in email communication.</Paragraph>
      <Paragraph position="3"> Table 3 illustrates the top fifteen 4-grams for all email speech acts selected by Information Gain. The Commit act reflects the general idea of agreeing to do some task, or to participate in some meeting. As we can see, the list with the top 4-grams reflects the intuition of commitment very well. When accepting or committing to a task, it is usual to write emails using &quot;Tomorrow is good for me&quot; or &quot;I will put the document under your door&quot; or &quot;I think I can finish this task by 7&quot; or even &quot;I will try to bring this tomorrow&quot;. The list even has some other interesting 4-grams that can be easily associated with very specific commitment situations, such as &quot;I will bring copies&quot; and &quot;I will be there&quot;.</Paragraph>
      <Paragraph position="4"> Another act in Table 3 that visibly agrees with its linguistic intuition is Meeting. The 4-grams listed are usual constructions associated with either negotiating a meeting time/location (&quot;[day] at [hour][pm]&quot;), agreeing to meet (&quot;is good for [me]&quot;) or describing the goals of the meeting (&quot;to go over the&quot;).</Paragraph>
      <Paragraph position="5"> The top features associated with the dData act in Table 3 are also closely related to its general intuition. Here the idea is delivering or requesting some data: a table inside the message, an attachment, a document, a report, a link to a file, a URL, etc. And indeed, it seems to be exactly the case in Table 3: some of the top 4-grams indicate the presence of an attachment (e.g., &quot;forwarded message begins here&quot;), some features suggest the address or link where a file can be found (e.g., &quot;in my public directory&quot; or &quot;in the etc directory&quot;), some features request an action to access/read the data (e.g., &quot;please take a look&quot;) and some features indicate the presence of data inside the email message, possibly formatted as a table (e.g., &quot;[date] [hour] [number] [number]&quot; or &quot;[date] [day] [number] [day]&quot;).</Paragraph>
      <Paragraph position="6"> From Table 3, the Propose act seems closely related to the Meeting act. In fact, a check of the labeled dataset shows that most of the Proposals were associated with Meetings. Some of the features that are not necessarily associated with Meeting are &quot;[person] would like to&quot;, &quot;please let me know&quot; and &quot;was hoping [person] could&quot;.</Paragraph>
      <Paragraph position="7"> The Deliver email speech act is associated with two large sets of actions: delivery of data and delivery of information in general. Because of this generality, it is not straightforward to list the most meaningful n-grams associated with this act. Table 3 shows a variety of features that can be associated with a Deliver act. As we shall see in Section 5, the Deliver act has the highest error rate in the classification task.</Paragraph>
      <Paragraph position="8"> In summary, selecting the top n-gram features via Information Gain revealed an impressive agreement with the linguistic intuition behind the different email speech acts.</Paragraph>
    </Section>
    <Section position="4" start_page="38" end_page="38" type="sub_section">
      <SectionTitle>
Request Commit Meeting
</SectionTitle>
      <Paragraph position="0">
Request | Commit | Meeting
[wwhh] do [person] think | is good for [me] | [day] at [hour] [pm]
do [person] need to | is fine with [me] | on [day] at [hour]
and let [me] know | i will see [person] | [person] can meet at
call [number]-[number] | i think i can | [person] meet at [hour]
would be able to | i will put the | will be in the
[person] think [person] need | i will try to | is good for [me]
let [me] know [wwhh] | i will be there | to meet at [hour]
do [person] think ? | will look for [person] | at [hour] in the
[person] need to get | $[number] per person | [person] will see [person]
? [person] need to | am done with the | meet at [hour] in
a copy of our | at [hour] i will | [number] at [hour] [pm]
do [person] have any | [day] is fine with | to go over the
[person] get a chance | each of us will | [person] will be in
[me] know [wwhh] | i will bring copies | let's plan to meet
that would be great | i will do the | meet at [hour] [pm]

dData | Propose | Deliver
- forwarded message begins | [person] would like to | forwarded message begins here
forwarded message begins here | would like to meet | [number] [number] [number] [number]
is in my public | please let [me] know | is good for [me]
in my public directory | to meet with [person] | if [person] have any
[person] have placed the | [person] meet at [hour] | if fine with me
please take a look | would [person] like to | in my public directory
[day] [hour] [number] [number] | [person] can meet tomorrow | [person] will try to
[number] [day] [number] [hour] | an hour or so | is in my public
[date] [day] [number] [day] | meet at [hour] in | will be able to
in our game directory | like to get together | just wanted to let
in the etc directory | [hour] [pm] in the | [pm] in the lobby
the file name is | [after] [hour] or [after] | [person] will be able
is in our game | [person] will be available | please take a look
fyi - forwarded message | think [person] can meet | can meet in the
just put the file | was hoping [person] could | [day] at [hour] is
my public directory under | do [person] want to | in the commons at
</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="38" end_page="40" type="metho">
    <SectionTitle>
5 Experiments
</SectionTitle>
    <Paragraph position="0"> Here we describe how the classification experiments on the email speech acts dataset were carried out.</Paragraph>
    <Paragraph position="1"> Using all n-gram features, we performed 5-fold cross-validation tests over the 1716 email messages.</Paragraph>
    <Paragraph position="2"> A linear SVM (footnote 1) was used as the classifier. Results are illustrated in Figure 2.</Paragraph>
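The evaluation protocol can be sketched as a generic k-fold loop that reports the pooled test error rate. This is an illustrative sketch only: the names k_fold_indices, cross_validated_error and majority_class_trainer are hypothetical, the interleaved (unshuffled) fold assignment is an assumption, and the majority-class baseline is merely a stand-in so the example stays dependency-free. In the experiments described here the classifier is a linear SVM, which would be supplied through the train_fn argument.

```python
def k_fold_indices(n_items, k=5):
    """Split range(n_items) into k interleaved folds (no shuffling)."""
    return [list(range(i, n_items, k)) for i in range(k)]


def cross_validated_error(examples, labels, train_fn, k=5):
    """Train on k-1 folds, test on the held-out fold, pool the errors."""
    errors = 0
    for fold in k_fold_indices(len(examples), k):
        held_out = set(fold)
        train_idx = [i for i in range(len(examples)) if i not in held_out]
        predict = train_fn([examples[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        errors += sum(1 for i in fold if predict(examples[i]) != labels[i])
    return errors / len(examples)


def majority_class_trainer(train_x, train_y):
    """Stand-in classifier: always predicts the most frequent label."""
    majority = max(set(train_y), key=train_y.count)
    return lambda example: majority


# Toy data (invented): 3 positive and 7 negative messages.
examples = [{"msg%d" % i} for i in range(10)]
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
print(cross_validated_error(examples, labels, majority_class_trainer))  # 0.3
```

Every message is tested exactly once, so the returned value is the fraction of all messages misclassified when held out.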
    <Paragraph position="3"> Figure 2 shows the test error rate of five different experiments (bars) for all email acts. The first bar denotes the error rate obtained by Cohen et al. (2004) in a 5-fold cross-validation experiment, also using a linear SVM. Their dataset had 1354 email messages, and only 1-gram features were extracted.</Paragraph>
    <Paragraph position="4"> The second bar illustrates the error rate obtained using only 1-gram features with additional data. In this case, we used 1716 email messages. The third bar represents the same setting as the second bar (1-gram features with 1716 messages), with the difference that the emails went through the preprocessing procedure previously described.</Paragraph>
    <Paragraph position="5"> 1 We used the LIBSVM implementation (Chang and Lin, 2001) with default parameters.</Paragraph>
    <Paragraph position="6"> The fourth bar shows the error rate when all 1-gram, 2-gram and 3-gram features are used and the 1716 messages go through the preprocessing procedure. The last bar illustrates the error rate when all n-gram features (i.e., 1g+2g+3g+4g+5g) are used in addition to preprocessing on all 1716 messages.</Paragraph>
    <Paragraph position="7"> In all acts, a consistent improvement in 1-gram performance is observed when more data is added, i.e., a drop in error rate from the first to the second bar. Therefore, we can conclude that Cohen et al. (2004) could have obtained better results if they had used more labeled data.</Paragraph>
    <Paragraph position="8"> A comparison between the second and third bars reveals the extent to which preprocessing seems to help classification based on 1-grams only. As we can see, no significant performance difference can be observed: for most acts the relative difference is  A much larger performance improvement can be seen between the fourth and third bars. This reflects the power of the contextual features: using all 1-grams, 2-grams and 3-grams is considerably more powerful than using only 1-gram features. This significant difference can be observed in all acts.</Paragraph>
    <Paragraph position="9"> Compared to the original values from Cohen et al. (2004), we observed a relative error rate drop of 24.7% for the Request act, 33.3% for the Commit act, 23.7% for the Deliver act, 38.3% for the Propose act, 9.2% for the Meeting act and 29.1% for the dData act.</Paragraph>
    <Paragraph position="10"> On average, this is a relative improvement of 26.4% in error rate.</Paragraph>
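The averaged figure can be checked directly from the six per-act values quoted above. The sketch below also spells out the usual definition of a relative error rate drop; the function name relative_error_drop and the example error rates passed to it are hypothetical and are not taken from the paper, whereas the six percentages are the ones reported in the text.

```python
def relative_error_drop(baseline_error, new_error):
    """Relative reduction in error rate, as a percentage of the baseline."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Per-act relative drops as reported in the text (percent).
reported_drops = {
    "Request": 24.7, "Commit": 33.3, "Deliver": 23.7,
    "Propose": 38.3, "Meeting": 9.2, "dData": 29.1,
}
average_drop = sum(reported_drops.values()) / len(reported_drops)
print(round(average_drop, 1))  # 26.4

# Hypothetical error rates, only to show how one such figure is derived.
example = relative_error_drop(baseline_error=0.20, new_error=0.15)
```

The unweighted mean of the six values is 158.3 / 6, which rounds to the 26.4% stated above.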
    <Paragraph position="11"> We also considered adding the 4-gram and 5-gram features to the best system. As pictured in the last bar of Figure 2, this addition did not seem to improve the performance and, in some cases, even a small increase in error rate was observed. We believe this was caused by the insufficient amount of labeled data in these tests; the 4-gram and 5-gram features are likely to improve the performance of this system if more labeled data becomes available. Precision versus recall curves of the Request act classification task are illustrated in Figure 3. The top curve shows the Request act performance when the preprocessing step cues and n-grams proposed in Section 4 are applied. For the bottom curve, only 1-gram features were used. These two curves correspond to the second bar (bottom curve) and fourth bar (top curve) in Figure 2. Figure 3 clearly shows that both recall and precision are improved by using the contextual features.</Paragraph>
    <Paragraph position="12"> To summarize, these results confirm the intuition that contextual information (n-grams) can be very effective in the task of email speech act classification.</Paragraph>
  </Section>
  <Section position="7" start_page="40" end_page="40" type="metho">
    <SectionTitle>
6 The Ciranda Package
</SectionTitle>
    <Paragraph position="0"> Ciranda is an open source package for Email Speech Act prediction built on top of the Minorthird package (Cohen, 2004). Among other features, Ciranda allows customized feature engineering, extraction and selection. Email Speech Act classifiers can be easily retrained using any learning algorithm from the Minorthird package. Ciranda is currently available from http://www.cs.cmu.edu/~vitor.</Paragraph>
  </Section>
</Paper>