<?xml version="1.0" standalone="yes"?> <Paper uid="I05-2022"> <Title>HMM Based Chunker for Hindi</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper presents an HMM-based chunk tagger for Hindi. Various tagging schemes for marking chunk boundaries are discussed along with their results.</Paragraph> <Paragraph position="1"> Contextual information is incorporated into the chunk tags in the form of part-of-speech (POS) information. This information is also added to the tokens themselves to achieve better precision.</Paragraph> <Paragraph position="2"> Error analysis is carried out to reduce the number of common errors. It is found that for certain classes of words, using the POS information is more effective than using a combination of word and POS tag as the token. Finally, chunk labels are also marked on the chunks.</Paragraph> </Section> </Paper>