<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-2006">
  <Title>Bootstrapping a Multilingual Part-of-speech Tagger in One Person-day</Title>
  <Section position="3" start_page="0" end_page="1" type="intro">
    <SectionTitle>
2 Inducing POS Tag Candidates from
Unlabeled Bilingual Dictionaries
</SectionTitle>
    <Paragraph position="0"> A substantial percentage of foreign-language dictionaries available online or in small paperback format are simple bilingual word or phrase translation lists that do not specify part of speech.</Paragraph>
    <Paragraph position="1"> Thus one component question of this work is how one can extract preliminary part-of-speech distributions from untagged bilingual translation lists. Figure 1 illustrates such a bilingual dictionary and also gives the true part of speech for each possible translation, which we do not assume to be generally available.</Paragraph>
    <Paragraph position="2"> One approach is to take an unweighted mixture of the prior part-of-speech distributions for the English words CT</Paragraph>
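    <Paragraph position="3"> The unweighted mixture mentioned above can be illustrated with a minimal Python sketch. Because the source text is cut off, this is only one plausible reading: each English translation of a foreign word contributes its prior tag distribution with equal weight 1/n. The function name unweighted_mixture, the tag labels, and the prior values are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
from collections import defaultdict


def unweighted_mixture(translations, english_priors):
    """Average the prior tag distributions of a word's English translations.

    translations: English translations listed for one foreign word.
    english_priors: maps an English word to its prior P(tag | word).
    Translations without a known prior are skipped (an assumption of
    this sketch, not something stated in the source).
    """
    known = [e for e in translations if e in english_priors]
    if not known:
        return {}
    mixture = defaultdict(float)
    for english_word in known:
        for tag, prob in english_priors[english_word].items():
            # Every translation carries the same weight, 1 / len(known).
            mixture[tag] += prob / len(known)
    return dict(mixture)


# Hypothetical priors, invented for this example only.
english_priors = {
    "walk": {"N": 0.25, "V": 0.75},
    "stroll": {"N": 0.5, "V": 0.5},
}

# A foreign word whose dictionary entry lists both English words.
candidates = unweighted_mixture(["walk", "stroll"], english_priors)
print(candidates)
```

    With these invented priors, the foreign word receives a noun probability of 0.375 and a verb probability of 0.625, the plain average of the two English distributions.</Paragraph>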
  </Section>
</Paper>