<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-1042">
  <Title>Statistical Morphological Disambiguation for Agglutinative Languages</Title>
  <Section position="3" start_page="0" end_page="285" type="relat">
    <SectionTitle>
2 Related Work
</SectionTitle>
    <Paragraph position="0"> There has been a large number of studies in tagging and morphological disambiguation using various techniques. POS tagging systems have used either a statistical or a rule-based approach. In the statistical approach, a large corpus has been used to train a probabilistic model which then has been used to tag new text, assigning the most likely tag for a given word in a given context (e.g., Church (1988), Cutting et al. (1992)). In the rule-based approach, a large number of hand-crafted linguistic constraints are used to eliminate impossible tags or morphological parses for a given word in a given context (Karlsson et al., 1995). Brill (1995a) has presented a transformation-based learning approach, which induces disambiguation rules from tagged corpora.</Paragraph>
    <Paragraph position="1"> Morphological disambiguation in inflecting or agglutinative languages with complex morphology involves more than determining the major or minor parts-of-speech of the lexical items. Typically, morphology marks a number of inflectional or derivational features and this involves ambiguity. For instance, a given word may be chopped up in different ways into morphemes, a given morpheme may mark different features depending on the morphotactics, or lexicalized variants of derived words may interact with productively derived versions (see Oflazer and Tür (1997) for the different kinds of morphological ambiguities in Turkish). We assume that all syntactically relevant features of word forms have to be determined correctly for morphological disambiguation. In this context, there have been some interesting previous studies for different languages. Levinger et al. (1995) have reported on an approach that learns morpholexical probabilities from an untagged corpus and have used the resulting information in morphological disambiguation in Hebrew. Hajič and Hladká (1998) have used a maximum entropy modeling approach for morphological disambiguation in Czech. Ezeiza et al. (1998) have combined stochastic and rule-based disambiguation methods for Basque. Megyesi (1999) has adapted Brill's POS tagger with extended lexical templates to Hungarian. Previous approaches to morphological disambiguation of Turkish text had employed a constraint-based approach (Oflazer and Kuruöz, 1994; Oflazer and Tür, 1996; Oflazer and Tür, 1997). Although results obtained earlier in these approaches were reasonable, the fact that the constraint rules were hand crafted posed a rather serious impediment to the generality and improvement of these systems.</Paragraph>
  </Section>
</Paper>