<?xml version="1.0" standalone="yes"?>
<Paper uid="P94-1013">
  <Title>DECISION LISTS FOR LEXICAL AMBIGUITY RESOLUTION: Application to Accent Restoration in Spanish and French</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> This paper presents a general-purpose statistical decision procedure for lexical ambiguity resolution based on decision lists (Rivest, 1987). The algorithm considers multiple types of evidence in the context of an ambiguous word, exploiting differences in collocational distribution as measured by log-likelihoods. Unlike standard Bayesian approaches, however, it does not combine the log-likelihoods of all available pieces of contextual evidence, but bases its classifications solely on the single most reliable piece of evidence identified in the target context. Perhaps surprisingly, this strategy appears to yield the same or even slightly better precision than the combination of evidence approach when trained on the same features. It also brings with it several additional advantages, the greatest of which is the ability to include multiple, highly non-independent sources of evidence without complex modeling of dependencies.</Paragraph>
    <Paragraph position="1"> Some other advantages are significant simplicity and ease of implementation, transparent understandability of the resulting decision list, and easy adaptability to new domains. The particular domain chosen here as a case study is the problem of restoring missing accents to Spanish and French text. Because it requires the resolution of both semantic and syntactic ambiguity, and offers an objective ground truth for automatic evaluation, it is particularly well suited for demonstrating and testing the capabilities of the given algorithm. It is also a practical problem with immediate application.</Paragraph>
    <Paragraph position="2"> *This research was supported by an NDSEG Fellowship, ARPA grant N00014-90-J-1863 and ARO grant DAAL 0389-C0031 PRI. The author is also affiliated with the Linguistics Research Department of AT&amp;T Bell Laboratories, and greatly appreciates the use of its resources in support of this work. He would like to thank Jason Eisner, Libby Levison, Mark Liberman, Mitch Marcus, Joseph Rosenzweig and Mark Zeren for their valuable feedback.</Paragraph>
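    <Paragraph position="3"> The decision procedure outlined in this introduction can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the additive smoothing constant alpha, and the toy French contexts for the unaccented form "cote" are all assumptions introduced purely for illustration. The sketch ranks each piece of contextual evidence by the absolute value of its log-likelihood ratio and classifies using only the first (most reliable) rule that matches.

```python
import math
from collections import Counter, defaultdict


def train_decision_list(examples, alpha=0.1):
    """Build a decision list from (feature_set, label) pairs.

    Assumes exactly two labels. The smoothing constant alpha is an
    illustrative assumption; it only serves to avoid log(0).
    """
    label_a, label_b = sorted({label for _, label in examples})
    counts = defaultdict(Counter)
    for features, label in examples:
        for feature in features:
            counts[feature][label] += 1

    rules = []
    for feature, c in counts.items():
        llr = math.log((c[label_a] + alpha) / (c[label_b] + alpha))
        if llr == 0:
            continue  # evidence that favors neither label is useless
        predicted = label_a if llr > 0 else label_b
        rules.append((abs(llr), feature, predicted))

    # Most reliable evidence first; feature name breaks exact ties.
    rules.sort(key=lambda rule: (-rule[0], rule[1]))
    default = Counter(label for _, label in examples).most_common(1)[0][0]
    return rules, default


def classify(rules, default, features):
    """Return (label, deciding_feature) using the first matching rule only."""
    for _, feature, label in rules:
        if feature in features:
            return label, feature
    return default, None


# Hypothetical toy contexts for restoring the accent on "cote".
toy_examples = [
    ({"la", "azur"}, "côte"),
    ({"la", "ouest"}, "côte"),
    ({"du", "autre"}, "côté"),
    ({"à", "de"}, "côté"),
    ({"la", "de"}, "côté"),
]
toy_rules, toy_default = train_decision_list(toy_examples)
print(classify(toy_rules, toy_default, {"la", "de"}))
```

In this toy setting the context containing both "la" and "de" is resolved by "de" alone, because it is the more reliable of the two; the weaker evidence from "la" is never consulted rather than being combined with it.</Paragraph>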
  </Section>
</Paper>