<?xml version="1.0" standalone="yes"?>
<Paper uid="P94-1052">
  <Title>CONCEPTUAL ASSOCIATION FOR COMPOUND NOUN ANALYSIS</Title>
  <Section position="5" start_page="337" end_page="338" type="evalu">
    <SectionTitle>
RESULTS
</SectionTitle>
    <Paragraph position="0"> Test Set and Evaluation: Of the noun sequences extracted from Grolier's, 655 were more than two nouns in length and were thus ambiguous. Of these, 308 consisted only of nouns in Roget's and these formed the test set. All of them were triples. Using the full context of each sequence in the test set, the author analysed each of these, assigning one of four possible outcomes. Some sequences were not CNs (as observed above for the extraction process) and were labeled Error. Other sequences exhibited what Hindle and Rooth (1993) call SEMANTIC INDETERMINACY, where the meanings associated with two attachments cannot be distinguished in the context; an example is college economics texts. These were labeled Indeterminate. The remainder were labeled Left or Right depending on whether the actual analysis is left- or right-branching.</Paragraph>
    <Paragraph position="1">  Table 1 shows the distribution of labels in the test set. Hereafter only those triples that received a bracketing (Left or Right) will be considered.</Paragraph>
    <Paragraph position="2"> The attachment procedure was then used to automatically assign an analysis to each sequence in the test set. The results show more success with left-branching attachments, so it may be possible to get better overall accuracy by introducing a bias.</Paragraph>
  </Section>
</Paper>