<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1042">
  <Title>Learning Morphological Disambiguation Rules for Turkish</Title>
  <Section position="3" start_page="328" end_page="329" type="relat">
    <SectionTitle>
2 Related Work
</SectionTitle>
    <Paragraph position="0"> There is a large body of work on morphological disambiguation and part of speech tagging using a variety of rule-based and statistical approaches. In the rule-based approach a large number of hand crafted rules are used to select the correct morphological parse or POS tag of a given word in a given context (Karlsson et al., 1995; Oflazer and Tür, 1997). In the statistical approach a hand tagged corpus is used to train a probabilistic model which is then used to select the best tags in unseen text (Church, 1988; Hakkani-Tür et al., 2002). Examples of statistical and machine learning approaches that have been used for tagging include transformation based learning (Brill, 1995), memory based learning (Daelemans et al., 1996), and maximum entropy models (Ratnaparkhi, 1996). It is also possible to train statistical models using unlabeled data with the expectation maximization algorithm (Cutting et al., 1992). Van Halteren (1999) gives a comprehensive overview of syntactic word-class tagging.</Paragraph>
    <Paragraph position="1"> Previous work on morphological disambiguation of inflectional or agglutinative languages includes unsupervised learning for Hebrew (Levinger et al., 1995), maximum entropy modeling for Czech (Hajič and Hladká, 1998), a combination of statistical and rule-based disambiguation methods for Basque (Ezeiza et al., 1998), and transformation based tagging for Hungarian (Megyesi, 1999).</Paragraph>
    <Paragraph position="2"> Early work on Turkish used a constraint-based approach with hand crafted rules (Oflazer and Kuruöz, 1994). A purely statistical morphological disambiguation model was recently introduced (Hakkani-Tür et al., 2002). To counter the data sparseness problem the morphological parses are split across their derivational boundaries and certain independence assumptions are made in the prediction of each inflectional group.</Paragraph>
    <Paragraph position="3"> A combination of three ideas makes our approach unique in the field: (1) the use of decision lists and a novel learning algorithm that combine the statistical and rule-based techniques, (2) the treatment of each individual feature separately to address the data sparseness problem, and (3) the lack of dependence on previous tags, relying on surface attributes alone.</Paragraph>
  </Section>
</Paper>