File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/06/n06-1042_abstr.xml

Size: 1,432 bytes

Last Modified: 2025-10-06 13:44:54

<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1042">
  <Title>Learning Morphological Disambiguation Rules for Turkish</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper, we present a rule-based model for morphological disambiguation of Turkish. The rules are generated by a novel decision list learning algorithm using supervised training. Morphological ambiguity (e.g., lives = live+s or life+s) is a challenging problem for agglutinative languages like Turkish, where close to half of the words in running text are morphologically ambiguous. Furthermore, a word can take an unlimited number of suffixes, so the number of possible morphological tags is unlimited. We attempted to cope with these problems by training a separate model for each of the 126 morphological features recognized by the morphological analyzer.</Paragraph>
    <Paragraph position="1"> The resulting decision lists independently vote on each of the potential parses of a word, and the final parse is selected based on our confidence in these votes. The accuracy of our model (96%) is slightly above the best previously reported results, which use statistical models. For comparison, when we train a single decision list on full tags instead of using separate models for each feature, we get 91% accuracy.</Paragraph>
  </Section>
</Paper>
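
The voting scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The feature names, the rule format, the confidence values, and the scoring rule (add a list's confidence when a parse agrees with its prediction, subtract it otherwise) are all assumptions made for illustration, and the example uses the English ambiguity "lives" from the abstract rather than Turkish data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

# Hypothetical context representation, e.g. {"word": "lives", "prev": "the"}.
Context = Dict[str, str]
Parse = FrozenSet[str]  # a candidate parse, modeled as a set of feature names


@dataclass
class Rule:
    matches: Callable[[Context], bool]  # does this rule fire in the context?
    present: bool                       # predicted: is the feature present?
    confidence: float                   # assumed confidence of the rule


def apply_decision_list(rules: List[Rule], ctx: Context) -> Tuple[bool, float]:
    """Return the prediction of the first rule that matches the context."""
    for rule in rules:
        if rule.matches(ctx):
            return rule.present, rule.confidence
    return False, 0.0  # no rule fired: abstain with zero confidence


def choose_parse(candidates: List[Parse],
                 decision_lists: Dict[str, List[Rule]],
                 ctx: Context) -> Parse:
    """Let one decision list per feature vote on every candidate parse."""
    def score(parse: Parse) -> float:
        total = 0.0
        for feature, rules in decision_lists.items():
            present, conf = apply_decision_list(rules, ctx)
            # Assumed scoring: reward agreement, penalize disagreement.
            total += conf if (feature in parse) == present else -conf
        return total
    return max(candidates, key=score)


# Toy setup: two hand-written decision lists (a real system would learn them).
NOUN_PARSE: Parse = frozenset({"Noun", "Plural"})   # life+s
VERB_PARSE: Parse = frozenset({"Verb", "3sg"})      # live+s

DECISION_LISTS: Dict[str, List[Rule]] = {
    "Noun": [
        Rule(lambda c: c.get("prev") == "the", True, 0.9),
        Rule(lambda c: True, False, 0.5),  # default rule
    ],
    "Verb": [
        Rule(lambda c: c.get("prev") in {"he", "she"}, True, 0.9),
        Rule(lambda c: True, False, 0.5),  # default rule
    ],
}

after_the = choose_parse([NOUN_PARSE, VERB_PARSE], DECISION_LISTS,
                         {"word": "lives", "prev": "the"})
after_she = choose_parse([NOUN_PARSE, VERB_PARSE], DECISION_LISTS,
                         {"word": "lives", "prev": "she"})
```

In this toy setup, `after_the` is the noun parse and `after_she` is the verb parse. Scoring each feature separately means the candidate set can contain tags never seen as a whole during training, which is the motivation the abstract gives for training one model per feature instead of one model over full tags.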