<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2053">
  <Title>Sydney, July 2006. (c) 2006 Association for Computational Linguistics. Towards the Orwellian Nightmare: Separation of Business and Personal Emails</Title>
  <Section position="5" start_page="407" end_page="407" type="metho">
    <SectionTitle>
3 Previous Work with the Dataset
</SectionTitle>
    <Paragraph position="0"> The work most relevant to this paper was performed at Berkeley, where Marti Hearst ran a small-scale annotation project to classify emails in the corpus by their type and purpose (Email annotation at Berkeley).</Paragraph>
    <Paragraph position="1"> In total, approximately 1,700 messages were annotated by two distinct annotators. Annotation categories captured four dimensions, broadly reflecting the following qualities of each email: coarse genre; the topic of the email, if business was selected; information about any forwarded or included text; and the emotional tone of the email. However, the categories used in the Berkeley project were incompatible with our requirements for several reasons: that project allowed multiple labels to be assigned to each email; the categories were not designed to facilitate discrimination between business and personal emails; distinctions between topic, genre, source and purpose were present in each of the dimensions; and no effort was made to analyse the inter-annotator agreement (Email annotation at Berkeley).</Paragraph>
    <Paragraph position="2"> User-defined folders are preserved in the Enron data, and some research efforts have used these folders to develop and evaluate machine-learning algorithms for automatically sorting emails (Klimt and Yang, 2004).</Paragraph>
    <Paragraph position="3"> However, users are often inconsistent in organising their emails, so the training and testing data in these cases are questionable. For example, many users have folders marked &quot;Personal&quot;, and one might think these could be used as a basis for the characterisation of personal emails. Upon closer inspection, however, it becomes clear that only a tiny percentage of an individual's personal emails are in these folders. Similarly, many users have folders containing exclusively personal content, but without any obvious folder name to reveal this. All of these problems dictate that for an effective system to be produced, large-scale manual annotation will be necessary.</Paragraph>
    <Paragraph position="4"> Researchers at Queen's University, Canada (Keila, 2005) recently attempted to categorise and identify deceptive messages in the Enron corpus. Their method drew on a hypothesis from deception theory as to what constitutes deceptive language (e.g., deceptive writing contains cues such as a reduced frequency of first-person pronouns and an increased frequency of &quot;negative emotion&quot; words). Singular value decomposition (SVD) was applied to separate the emails, and a manual survey of the results allowed them to conclude that this classification method for detecting deception in email was promising.</Paragraph>
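The SVD step described above can be sketched as follows; the term-document matrix and cue words here are invented for illustration and are not taken from Keila (2005):

```python
import numpy as np

# Toy term-document matrix: rows are deception-cue words, columns are
# emails. All counts are invented for illustration.
A = np.array([
    [4.0, 5.0, 0.0, 1.0],  # "I"     (first-person pronoun)
    [3.0, 4.0, 0.0, 0.0],  # "me"    (first-person pronoun)
    [0.0, 0.0, 3.0, 2.0],  # "bad"   ("negative emotion" word)
    [0.0, 1.0, 4.0, 3.0],  # "worry" ("negative emotion" word)
])

# Singular value decomposition factorises A into U, s and Vt.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Columns of Vt give each email's coordinates in the latent space;
# emails with similar cue-word profiles end up close together, so a
# manual survey of these coordinates can separate the two groups.
coords = Vt.T[:, :2]
print(coords)
```

In practice the matrix would be built from real token counts over the whole corpus, and the emails would be inspected (or clustered) in the reduced space rather than printed.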
    <Paragraph position="5"> Other researchers have attempted to analyse the Enron emails from a network analytic perspective (Deisner, 2005). Their goal was to analyse the flow of communication between employees at times of crisis, and develop a characterisation for the state of a communication network in such difficult times, in order to identify looming crises in other companies from the state of their communication networks. They compared the network flow of email in October 2000 and October 2001.</Paragraph>
  </Section>
  <Section position="6" start_page="407" end_page="409" type="metho">
    <SectionTitle>
4 Annotation Categories for this Project
</SectionTitle>
    <Paragraph position="0"> Because in many cases there is no definite line between business and personal emails, it was decided to mark emails with finer categories than Business and Personal. This subcategorising not only helped us to analyse the different types of email within business and personal emails, but also helped us to find the nature of the disagreements that occurred later on, in inter-annotation. In other words, this process allowed us to observe patterns in disagreement. Obviously, the process of deciding categories in any annotation project is a fraught and contentious one. It necessarily involves repeated cycles of category design, annotation, inter-annotation, analysis of disagreement, and category refinement. While this process could continue ad infinitum, the sensible project manager must identify where it is beginning to converge on a set of well-defined but nonetheless intuitive categories, and finalise them.</Paragraph>
    <Paragraph position="1"> Likewise, the annotation project described here went through several evolutions of categories, mediated by input from annotators and other researchers. Based on the final categories chosen, approximately 12,500 emails were single-annotated by a total of four annotators. The results showed that around 83% of the emails were business related, while 17% were personal: the company received one personal email for every five business emails. A third of the received emails were &quot;Core Business&quot; and a third were &quot;Routine Admin&quot;; all other categories comprised the remaining third. One could conclude that approximately one third of emails received at Enron were discussions of policy, strategy, legislation, regulations, trading, and other high-level business matters. The next third of received emails concerned the peripheral, routine matters of the company: emails related to HR, IT administration, meeting scheduling, etc., which can be regarded as part of the common infrastructure of any large-scale corporation.</Paragraph>
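The reported 83%/17% split can be checked with a line of arithmetic; the totals below are derived from the stated percentages, not from the raw annotations:

```python
# Reported single-annotation distribution over approximately 12,500 emails.
n = 12500
business = round(0.83 * n)   # business-related emails
personal = n - business      # personal emails

# Roughly one personal email for every five business emails.
ratio = round(business / personal)
print(business, personal, ratio)
```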
    <Paragraph position="2"> The rest of the emails were distributed among personal emails, emails to colleagues, company newsletters, and emails received due to subscription. The biggest portion of this last third was emails received due to subscription, whether the subscription was business or personal in nature.</Paragraph>
    <Paragraph position="3"> In any annotation project, consistency should be measured. To this end, 2,200 emails were double-annotated by four annotators. As Figure 2 below shows, for 82% of the emails both annotators agreed that the email was a business email, and for 12% both agreed that it was personal. Six percent of the emails were disagreed upon.</Paragraph>
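From these percentages the annotators' confusion matrix can be reconstructed and agreement statistics computed; the even split of the 6% disagreement between the two off-diagonal cells is an assumption made here for illustration:

```python
# Confusion matrix between two annotators over 2,200 double-annotated
# emails, reconstructed from the reported percentages (82% both
# "business", 12% both "personal", 6% disagreed). Splitting the
# disagreements evenly between the two cells is an assumption.
n = 2200
both_business = round(0.82 * n)
both_personal = round(0.12 * n)
disagree = n - both_business - both_personal
matrix = [[both_business, disagree // 2],
          [disagree - disagree // 2, both_personal]]

# Observed (raw) agreement: fraction of emails on the diagonal.
observed = (matrix[0][0] + matrix[1][1]) / n

# Cohen's kappa corrects for chance agreement via the label marginals.
row = [sum(r) for r in matrix]
col = [matrix[0][0] + matrix[1][0], matrix[0][1] + matrix[1][1]]
expected = sum(row[i] * col[i] for i in range(2)) / n ** 2
kappa = (observed - expected) / (1 - expected)

print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

Note that the observed agreement of 0.94 matches the 94% human-agreement figure quoted later for the classification experiments.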
    <Section position="1" start_page="408" end_page="409" type="sub_section">
      <SectionTitle>
Disagreement
</SectionTitle>
      <Paragraph position="0"> By analysing the disagreed categories, some patterns of confusion were found.</Paragraph>
      <Paragraph position="1"> Around one fourth of the confusions were solicited emails where it was not clear whether the employee had subscribed to a particular newsletter group out of personal interest, for private business, or for Enron's business. While some subscriptions were clearly personal (e.g., a subscription to the latest celebrity news) and some were clearly business related (e.g., daily energy reports), for some it was hard to identify the intention of the subscription (e.g., the New York Times).</Paragraph>
      <Paragraph position="2"> Eighteen percent of the confusions were due to emails about travel arrangements, flight and hotel booking confirmations, where it was not clear whether the person was acting in a business or personal role.</Paragraph>
      <Paragraph position="3"> Thirteen percent of the disagreements were over whether an email was written between two Enron employees as business colleagues or as friends, for example &quot;shall we meet for a coffee at 2:00?&quot; If insufficient information exists in the email, it can be hard to draw the line between a personal relationship and a relationship between colleagues. The annotators were advised to pick the category based on the formality of the language used in such emails, reading between the lines wherever possible.</Paragraph>
      <Paragraph position="4"> About eight percent of the disagreements were on emails about services that Enron provides for its employees: for example, Enron's running club seeking runners and sending an advertisement to Enron's employees, or Enron's Employee Assistance Program (EAP) sending an email to all employees letting them know that, should they find themselves in stressful situations, they can use some of the services that Enron provides for them or their families. One theme was encountered in many types of confusion: namely, whether to decide an email's category based upon its topic or its form. For example, should an email be categorised because it is scheduling a meeting, or because of the subject of the meeting being scheduled? One might consider this a distinction by topic or by genre.</Paragraph>
      <Paragraph position="5"> As a result, the final categories were created to reflect topic as the only dimension considered in the annotation. &quot;Solicited/Soliciting mailing&quot;, &quot;Solicited/Auto generated mailing&quot; and &quot;Forwarded&quot; were removed; &quot;Keeping Current&quot; and &quot;Soliciting&quot; were added as business categories; and &quot;Personal Maintenance&quot; and &quot;Personal Circulation&quot; were added as personal categories. Inter-annotator agreement was then measured on one hundred and fifty emails annotated by five annotators. The results confirmed that these changes had a positive effect on the accuracy of annotation.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="409" end_page="409" type="metho">
    <SectionTitle>
6 Preliminary Results of Automatic
Classification
</SectionTitle>
    <Paragraph position="0"> Some preliminary experiments were performed with an automatic classifier to determine the feasibility of separating business and personal emails by machine.</Paragraph>
    <Paragraph position="1"> The classifier used was a probabilistic classifier based upon the distribution of distinguishing words. More information can be found in (Guthrie and Walker, 1994).</Paragraph>
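The details of the Guthrie and Walker classifier are not given here, but a probabilistic classifier based on the distribution of distinguishing words can be sketched in the style of multinomial naive Bayes; the training data below are invented, and class priors are omitted for brevity:

```python
import math
from collections import Counter

def train(docs_by_class):
    """docs_by_class maps a label to a list of token lists.
    Returns per-class smoothed log-probabilities over the vocabulary."""
    vocab = {w for docs in docs_by_class.values() for d in docs for w in d}
    model = {}
    for label, docs in docs_by_class.items():
        counts = Counter(w for d in docs for w in d)
        total = sum(counts.values())
        # Laplace smoothing so unseen words get non-zero probability.
        model[label] = {w: math.log((counts[w] + 1) / (total + len(vocab)))
                        for w in vocab}
    return model

def classify(model, tokens):
    """Pick the class whose word distribution best explains the tokens."""
    scores = {label: sum(logp.get(w, 0.0) for w in tokens)
              for label, logp in model.items()}
    return max(scores, key=scores.get)

# Tiny invented training set for illustration only.
model = train({
    "business": [["contract", "trading", "meeting"], ["policy", "trading"]],
    "personal": [["dinner", "family"], ["weekend", "dinner", "movie"]],
})
print(classify(model, ["trading", "contract"]))  # classifies as business
```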
    <Paragraph position="2"> Two categories from the annotation were chosen which were considered to typify the broad categories: Core Business (representing business) and Close Personal (representing personal). The Core Business class contains 4,000 messages (approximately 900,000 words), while Close Personal contains approximately 1,000 messages (220,000 words).</Paragraph>
    <Paragraph position="3"> The following table summarises the performance of this classifier in terms of Recall, Precision and F-measure. Based upon the results of this experiment, one can conclude that automatic methods are also suitable for classifying emails as business or personal. The results indicate that the business category is well represented by the classifier, and given the disproportionate distribution of emails, the classifier's tendency towards the business category is understandable. Given that our inter-annotator agreement statistic tells us that humans only agree on this task 94% of the time, preliminary results with 93% accuracy (the statistic which corresponds exactly to agreement) for the automatic method are encouraging. While more work is necessary to fully evaluate the suitability of this task for application to a machine, the seeds of a fully automated system are sown.</Paragraph>
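Recall, Precision and F-measure are computed from raw counts as follows; the counts below are hypothetical, since the paper's per-class table is not reproduced in this extraction:

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive
    and false-negative counts for a single class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts for the business class on an imagined test set;
# they are chosen only to show the calculation, not to match the paper.
p, r, f = prf(tp=780, fp=40, fn=30)
print(f"P={p:.3f} R={r:.3f} F1={f:.3f}")
```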
  </Section>
</Paper>