<?xml version="1.0" standalone="yes"?>
<Paper uid="X96-1033">
  <Title>A SIMPLE PROBABILISTIC APPROACH TO CLASSIFICATION AND ROUTING</Title>
  <Section position="12" start_page="174" end_page="177" type="concl">
    <SectionTitle>
7. CONCLUSIONS AND FUTURE WORK
</SectionTitle>
    <Paragraph position="0"> For the small test performed, all of the methods produced about the same classification result, and the Multinomial Distribution method produced the best routing result. Future work with TREC data will determine whether these results are repeatable or whether the small test data set was particularly well suited to the Multinomial Distribution method.</Paragraph>
    <Paragraph position="1"> Although we anticipate improvements to all of the methods through the use of phrases, feedback, term expansion and clustering, these have not yet been implemented. Future efforts will investigate these modifications. This test for classification and routing was much simpler than the TREC task, since the corpus was significantly smaller and less diverse and every document was relevant to a single category. This produced results which were close to perfect for all of the methods: the Multinomial Distribution method differed from the SMART method by less than 1% in classification, and was only 5% better in routing.</Paragraph>
    <Paragraph position="2"> However, since the TREC data is very diverse and is classified into fifty classes, the Multinomial Distribution method is expected to perform even better than the other methods, as it is particularly good at distinguishing fine detail between classes.</Paragraph>
    <Paragraph position="3">  Change it to upper case.</Paragraph>
    <Paragraph position="4"> Scan for and remove any remaining embedded statements.</Paragraph>
    <Paragraph position="5"> Remove possessives.</Paragraph>
    <Paragraph position="6"> If the last character is an apostrophe, remove it.</Paragraph>
    <Paragraph position="7"> If the last two characters are 's, remove them.</Paragraph>
    <Paragraph position="8"> Remove any remaining punctuation.</Paragraph>
    <Paragraph position="9"> Discard the word if the previous steps have removed all of it.</Paragraph>
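The case, possessive, punctuation and discard steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name clean_token is hypothetical, "punctuation" is assumed to mean the ASCII set in Python's string.punctuation, and the "embedded statements" step is left out because this section does not define it.

```python
import string


def clean_token(word: str):
    """Upper-case a token, strip possessives and punctuation.

    Returns None when nothing is left (the 'discard the word' step).
    """
    word = word.upper()
    # Possessives, in the order the steps are listed:
    # first a trailing apostrophe, then a trailing 'S.
    if word.endswith("'"):
        word = word[:-1]
    if word.endswith("'S"):
        word = word[:-2]
    # Assumption: "punctuation" means the ASCII punctuation characters.
    word = "".join(ch for ch in word if ch not in string.punctuation)
    # An empty result means the previous steps removed the whole word.
    return word or None
```

For instance, clean_token("dog's") yields "DOG", while a token made only of punctuation yields None and would be dropped.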
    <Paragraph position="10"> Remove 'ies'.</Paragraph>
    <Paragraph position="11"> If the last three characters are 'ies', change them to 'y'.</Paragraph>
    <Paragraph position="12"> Remove 'ied'.</Paragraph>
    <Paragraph position="13"> If the last three characters are 'ied', change them to 'y'.</Paragraph>
    <Paragraph position="14"> Remove plural 's'.</Paragraph>
    <Paragraph position="15"> If the last character is 's' and the next to last is any consonant except 's', remove the 's'. Examples: winds -&gt; wind, pass -&gt; pass.</Paragraph>
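The 'ies', 'ied' and plural 's' rules above might be combined as in the sketch below. The function name strip_ies_ied_s is hypothetical. Two assumptions are made: the word is handled in lower case so the literals match the rules as printed (the earlier step upper-cases words), and the vowels are taken to be a, e, i, o, u, so that 'y' counts as a consonant.

```python
VOWELS = set("aeiou")  # assumption: 'y' is treated as a consonant


def strip_ies_ied_s(word: str) -> str:
    """Apply the 'ies', 'ied' and plural 's' rules to a lower-case word."""
    # 'ies' and 'ied' both become 'y'. The result ends in 'y',
    # so the plural rule below could not fire on it anyway.
    if word.endswith("ies") or word.endswith("ied"):
        return word[:-3] + "y"
    # Plural 's': only when the next-to-last character is a consonant other than 's'.
    if len(word) >= 2 and word[-1] == "s":
        prev = word[-2]
        if prev.isalpha() and prev not in VOWELS and prev != "s":
            return word[:-1]
    return word
```

With these assumptions, "winds" becomes "wind" and "pass" is left alone, matching the examples in the text.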
    <Paragraph position="16"> 10. Remove 'ing'.</Paragraph>
    <Paragraph position="17"> Do nothing if the word is 'during' or 'th' precedes the 'ing'.</Paragraph>
    <Paragraph position="18"> If the last three characters are 'ing', remove them.</Paragraph>
    <Paragraph position="19"> Examples: winding -&gt; wind.</Paragraph>
    <Paragraph position="20"> If the two characters prior to the 'ing' are the same and not 's', remove the second one.</Paragraph>
    <Paragraph position="21"> Examples: stepping -&gt; step, passing -&gt; pass. If the character prior to the 'ing' is a consonant except 'y', the previous character is a vowel, and the character before that is not a vowel, add an 'e' to the end of the word.</Paragraph>
    <Paragraph position="22"> Examples: mining -&gt; mine, keying -&gt; key, joining -&gt; join.</Paragraph>
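One way to read the 'ing' rule is sketched below. The function name strip_ing is hypothetical, the word is assumed to be lower case, and the doubled-letter and add-'e' tests are both taken to refer to the characters that originally preceded the 'ing', which is the reading that reproduces all of the listed examples. A missing third character is treated as "not a vowel", which the text does not specify.

```python
VOWELS = set("aeiou")  # assumption: 'y' is treated as a consonant


def strip_ing(word: str) -> str:
    """Apply the 'ing' rule to a lower-case word."""
    # Exceptions: 'during', and words where 'th' precedes the 'ing'.
    if not word.endswith("ing") or word == "during" or word.endswith("thing"):
        return word
    stem = word[:-3]
    last = stem[-1:]      # character prior to the 'ing' ('' if absent)
    prev = stem[-2:-1]    # the character before that
    before = stem[-3:-2]  # and the one before that
    # Doubled final letter other than 's': drop the second copy.
    if last and last == prev and last != "s":
        return stem[:-1]
    # Consonant (not 'y') after a vowel that is not itself preceded by a vowel:
    # restore the silent 'e'.
    if (last.isalpha() and last not in VOWELS and last != "y"
            and prev in VOWELS and before not in VOWELS):
        return stem + "e"
    return stem
```

Followed literally, the rule also shortens very short words (for example, a three-letter stem is not required), so a real system would probably need further exceptions than the two given here.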
    <Paragraph position="23"> 11. Remove 'ed'.</Paragraph>
    <Paragraph position="24"> Do nothing if the word is four characters or less.</Paragraph>
    <Paragraph position="25"> If the last two characters are 'ed', remove them.</Paragraph>
    <Paragraph position="26"> Examples: winded -&gt; wind.</Paragraph>
    <Paragraph position="27"> If the two characters prior to the 'ed' are the same and not 's', remove the second one. Examples: stepped -&gt; step, passed -&gt; pass. If the character prior to the 'ed' is a consonant except 'y', the previous character is a vowel, and the character before that is not a vowel, add an 'e' to the end of the word.</Paragraph>
    <Paragraph position="28"> Examples: mined -&gt; mine, keyed -&gt; key, joined -&gt; join.</Paragraph>
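The 'ed' rule has the same shape as the 'ing' rule plus a minimum-length guard, so a sketch can factor the stem repair into a helper. The names repair_stem and strip_ed are hypothetical, lower-case input is assumed, and a missing third character is again treated as "not a vowel".

```python
VOWELS = set("aeiou")  # assumption: 'y' is treated as a consonant


def repair_stem(stem: str) -> str:
    """Undo consonant doubling or restore a silent 'e' after a suffix is removed."""
    last, prev, before = stem[-1:], stem[-2:-1], stem[-3:-2]
    # Doubled final letter other than 's': drop the second copy.
    if last and last == prev and last != "s":
        return stem[:-1]
    # Consonant (not 'y'), preceded by a vowel, itself not preceded by a vowel.
    if (last.isalpha() and last not in VOWELS and last != "y"
            and prev in VOWELS and before not in VOWELS):
        return stem + "e"
    return stem


def strip_ed(word: str) -> str:
    """Apply the 'ed' rule to a lower-case word."""
    # Do nothing for words of four characters or fewer, or without the suffix.
    if len(word) >= 5 and word.endswith("ed"):
        return repair_stem(word[:-2])
    return word
```

Because words ending in 'ied' are rewritten by the earlier step, they would not normally reach this function in a full pipeline.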
  </Section>
</Paper>