<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-2006">
  <Title>Bootstrapping a Multilingual Part-of-speech Tagger in One Person-day</Title>
  <Section position="9" start_page="2" end_page="2" type="concl">
    <SectionTitle>
6 Conclusion
</SectionTitle>
    <Paragraph position="0"> This paper has presented an alternative to traditional corpus annotation-based supervision of part-of-speech taggers. Because reference grammars and dictionaries for even obscure languages are available in large bookstores, in libraries, or online, this work focuses on using human supervision for efficient, structured entry of this seed knowledge (in the form of regular and semi-regular inflectional paradigms and often irregular closed-class part-of-speech entries). Minimally supervised bootstrapping procedures then use corpus-derived distributional data to induce lexical tag probabilities from dictionaries, to analyze irregular morphology via weighted Levenshtein-based alignment models, to induce tag sequence probabilities, and to model grammatical gender agreement. Experiments show high-accuracy coarse and fine-grained (approximately 250-tag) part-of-speech analyses using only one person-day of new human supervision, based on readily available linguistic resources.</Paragraph>
  </Section>
</Paper>