<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-2022">
  <Title>HMM Based Chunker for Hindi</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> This paper presents an HMM-based chunk tagger for Hindi. Various tagging schemes for marking chunk boundaries are discussed along with their results.</Paragraph>
    <Paragraph position="1"> Contextual information is incorporated into the chunk tags in the form of part-of-speech (POS) information. This information is also added to the tokens themselves to achieve better precision.</Paragraph>
    <Paragraph position="2"> Error analysis is carried out to reduce the number of common errors. It is found that for certain classes of words, using the POS information is more effective than using a combination of word and POS tag as the token. Finally, the identified chunks are assigned chunk labels.</Paragraph>
  </Section>
</Paper>