<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-1030">
  <Title>Aggressive Morphology for Robust Lexical Coverage</Title>
  <Section position="3" start_page="219" end_page="219" type="intro">
    <SectionTitle>
3 Evaluation
</SectionTitle>
    <Paragraph position="0"> Since analyzing a word is done once per unknown word type and consumes a negligible fraction of the overall text-processing time, speed of operation is not considered a factor for evaluation. The interesting dimension of evaluation deals with the coverage of the rules and the kinds of errors that are made. This was tested by applying the system to two word lists randomly selected from the Brown corpus and provided to me by Philip Resnik, using some sampling tools that he developed. The first of these (the token sample) consists of 100 word tokens selected randomly, without eliminating duplicates, and the second (the type sample) consists of 100 distinct word types selected randomly from the vocabulary of the Brown corpus. Prior to a single test run on each of these samples, neither the lexicon nor the morphological rule system had any exposure to the Brown corpus, nor had either of these word lists been looked at by the experimenter. Consequently, the results are a fair evaluation of the expected performance of this system on an unknown domain.</Paragraph>
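    As a rough illustration of the two sampling schemes described above, the following Python sketch draws a token sample and a type sample from a tokenized corpus and then keeps only the words missing from a lexicon. The function names, the fixed seed, and the toy corpus are assumptions made for illustration; they are not the sampling tools actually used in this evaluation.

```python
import random
from typing import List, Set


def token_sample(tokens: List[str], k: int, seed: int = 0) -> List[str]:
    """Pick k token positions at random; the same word may appear twice."""
    rng = random.Random(seed)
    return rng.sample(tokens, k)


def type_sample(tokens: List[str], k: int, seed: int = 0) -> List[str]:
    """Pick k distinct word types at random from the corpus vocabulary."""
    rng = random.Random(seed)
    vocabulary = sorted(set(tokens))  # sorted so a given seed is repeatable
    return rng.sample(vocabulary, k)


def unknown_words(sample: List[str], lexicon: Set[str]) -> List[str]:
    """Keep the sampled words that the core lexicon does not already cover."""
    return [word for word in sample if word not in lexicon]


# Toy corpus (hypothetical): frequent words dominate the token stream.
corpus = ["the"] * 8 + ["cat", "hooliganism"]
toy_lexicon = {"the", "cat"}

tokens_drawn = token_sample(corpus, 5)
types_drawn = type_sample(corpus, 3)
print(unknown_words(types_drawn, toy_lexicon))  # ['hooliganism']
```

    Because frequent words occupy most token positions, a token sample is dominated by common words, while a type sample gives rare words the same chance of selection as frequent ones.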
    <Section position="1" start_page="219" end_page="219" type="sub_section">
      <SectionTitle>
3.1 Grading rule performance
</SectionTitle>
      <Paragraph position="0"> Since different syntactic category errors have different consequences for parsing text, it is useful to grade the syntactic category assignments of the analyzer on an A-B-C-D-F scale according to the severity of any mistakes. Grades are assigned to a lexical entry as follows:
A if all appropriate syntactic categories are assigned and no incorrect categories are assigned;
B if all categories are correct, allowing for categorizing an adjective or a name as a noun or a noun as a name;
C if an entry has at least one correct category and is correct except for missing a noun category or having a single extra category;
D if there is more than one extra category or if there is a missing category other than one of the above cases, provided that there is at least one correct category;
F if there are no correct categories.
Both A and B grades are considered acceptable assignments for the sake of evaluation, since category B errors would allow a reasonable parse to be found. This is because the grammar used for parsing sentences and phrases allows a noun to be used as an adjective modifier and a proper noun to be used in place of a noun. One parser/grammar that uses this lexicon also allows any other category to be used as a noun, at the expense of a penalty, so that a C grade will still enable a parse, although with a penalty and a substantial likelihood that other false parses might score better. Similarly, a D grade increases the likelihood that a false parse might score better.</Paragraph>
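      The grading scale above can be sketched as a small Python function. This is one possible reading of the rubric, not the procedure used by the author: the category labels (N, NPR, ADJ, V), the TOLERATED table, and the order in which the grades are tested are all assumptions made for illustration.

```python
from typing import Set

# Assumed (gold category, assigned category) confusions tolerated for grade B:
# an adjective or a name categorized as a noun, or a noun categorized as a name.
TOLERATED = {("ADJ", "N"), ("NPR", "N"), ("N", "NPR")}


def grade(gold: Set[str], assigned: Set[str]) -> str:
    """Grade one lexical entry on the A-B-C-D-F scale (assumed reading)."""
    missing = gold - assigned
    extra = assigned - gold
    correct = gold.intersection(assigned)

    # A: exactly the appropriate categories, nothing more.
    if not missing and not extra:
        return "A"

    # B: every missing category is stood in for by a tolerated assigned one,
    # and every extra category is a tolerated stand-in for some gold category.
    missing_ok = all(any((m, a) in TOLERATED for a in assigned) for m in missing)
    extra_ok = all(any((g, e) in TOLERATED for g in gold) for e in extra)
    if missing_ok and extra_ok:
        return "B"

    # F: no correct categories at all.
    if not correct:
        return "F"

    # C: only a noun category is missing, or there is a single extra category.
    only_noun_missing = missing == {"N"} and not extra
    single_extra = not missing and len(extra) == 1
    if only_noun_missing or single_extra:
        return "C"

    # D: any other case that still has one or more correct categories.
    return "D"


print(grade({"NPR"}, {"N"}))     # B: a name categorized as a noun
print(grade({"N", "V"}, {"V"}))  # C: missing noun category
print(grade({"N"}, {"V"}))       # F: no correct category
```

      Testing B before F matters under this reading: an entry whose only category is a tolerated substitute has no strictly correct category, yet the rubric treats it as acceptable.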
      <Paragraph position="1"> Separately, we measure whether count/mass distinctions are made correctly (for nouns only), and whether roots of derived and inflected forms are identified correctly. We are interested in the count/mass distinction because, like the common/proper noun distinction, it affects the grammaticality and likelihood of a noun phrase interpretation for a singular noun in the absence of an explicit determiner.</Paragraph>
    </Section>
    <Section position="2" start_page="219" end_page="219" type="sub_section">
      <SectionTitle>
3.2 Sampling rule performance
</SectionTitle>
      <Paragraph position="0"> The morphological analyzer has been applied to the words from the two sample word lists that were not already in its core lexicon. There were 17 such words from the token sample and 72 such words from the type sample. Of the 17 unknown token-sample words, 100% were graded B or better (88% A and 12% B); 85% of the roots were identified correctly (all but one); 85% of the count noun senses were found (all but one); and 100% of the mass noun senses were found. Token-sample performance is not a very challenging test for a morphological analyzer because it is biased toward a relatively small number of frequently occurring word types. Token-sample performance is used to assess the per-token error rate that one would expect in analyzing large amounts of running text. In contrast, type-sample performance gives a measure of the expected performance on new words the analyzer is likely to encounter.</Paragraph>
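      To make the arithmetic behind the token-sample percentages explicit, the following Python sketch tallies a list of grades into rounded percentages and an acceptable (A or B) rate. The helper names are hypothetical, and the counts of 15 A grades and 2 B grades are an inference from the reported 88% and 12% of 17 words, not figures stated in the text.

```python
from collections import Counter
from typing import Dict, List

ACCEPTABLE = {"A", "B"}  # grades treated as acceptable in the evaluation


def grade_percentages(grades: List[str]) -> Dict[str, int]:
    """Return the rounded percentage of entries receiving each grade."""
    counts = Counter(grades)
    total = len(grades)
    return {g: round(100 * n / total) for g, n in sorted(counts.items())}


def acceptable_rate(grades: List[str]) -> int:
    """Return the rounded percentage of entries graded A or B."""
    ok = sum(1 for g in grades if g in ACCEPTABLE)
    return round(100 * ok / len(grades))


# Inferred counts for the 17 unknown token-sample words (an assumption).
token_grades = ["A"] * 15 + ["B"] * 2

print(grade_percentages(token_grades))  # {'A': 88, 'B': 12}
print(acceptable_rate(token_grades))    # 100
```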
      <Paragraph position="1"> For the 72 words in the type sample that are not covered by the lexicon, Tables 1-3 show the syntactic category performance of the analyzer and its abilities to make count/mass distinctions and identify roots.</Paragraph>
      <Paragraph position="2"> Notes on incorrect or debatable analyses: 1. One N (noun) for a probable name (Tonio), counted as B.</Paragraph>
      <Paragraph position="3"> 2. Two NPR (proper name) for abbreviations (A. V. may be ADJ; W.B. is correct), counted as one B and one A.</Paragraph>
      <Paragraph position="4"> 3. One wrong root when the suffix ism was identified as the root of hooliganism in a hypothesized compound hooligan+ism (arguably justifiable as a kind of ism, which is known in the lexicon, but counted as an error anyway). Reanalyzing this word once hooligan is a known word yields the correct interpretation.</Paragraph>
      <Paragraph position="5"> 4. One debatable root in the hyphenated phrase reference-points whose root was listed as points rather than reference-point. This is due to a bug that caused the hyphenated word rules to incorrectly identify this as a verb, rather than a noun (counted as F for syntax).</Paragraph>
      <Paragraph position="6"> 5. One extra root for embouchure from embouche (but a correct form of the French root?).</Paragraph>
      <Paragraph position="7"> 6. One missing category N for bobbles, which was given category V but not N because the core lexicon incorrectly listed bobble only as a verb (counted as C for syntax). This is corrected by adding the missing category to the lexical entry for bobble.</Paragraph>
    </Section>
  </Section>
</Paper>