File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/relat/97/a97-1056_relat.xml

Size: 2,237 bytes

Last Modified: 2025-10-06 14:16:05

<?xml version="1.0" standalone="yes"?>
<Paper uid="A97-1056">
  <Title>Sequential Model Selection for Word Sense Disambiguation</Title>
  <Section position="9" start_page="392" end_page="393" type="relat">
    <SectionTitle>
7 Related Work
</SectionTitle>
    <Paragraph position="0"> Statistical analysis of NLP data has often been limited to the application of standard models, such as n-gram (Markov chain) models and the Naive Bayes model. While n-grams perform well in part-of-speech tagging and speech processing, they require a fixed interdependency structure that is inappropriate for the broad class of contextual features used in word-sense disambiguation. However, the Naive Bayes classifier has been found to perform well for word-sense disambiguation both here and in a variety of other works (e.g., (Bruce and Wiebe, 1994a), (Gale et al., 1992), (Leacock et al., 1993), and (Mooney, 1996)).</Paragraph>
    <Paragraph position="1"> In order to utilize models with more complicated interactions among feature variables, (Bruce and Wiebe, 1994b) introduce the use of sequential model selection and decomposable models for word-sense disambiguation. Alternative probabilistic approaches have involved using a single contextual feature to perform disambiguation (e.g., (Brown et al., 1991), (Dagan et al., 1991), and (Yarowsky, 1993) present techniques for identifying the optimal feature to use in disambiguation). Maximum Entropy models have been used to express the interactions among multiple feature variables (e.g., (Berger et al., 1996)), but within this framework no systematic study of interactions has been proposed. Decision tree induction has been applied to word-sense disambiguation (e.g., (Black, 1988) and (Mooney, 1996)) but, while it is a type of model selection, the models are not parametric.</Paragraph>
    <Paragraph position="2"> (Bruce and Wiebe, 1994b) recommended a model selection procedure using backward sequential search (BSS) and the exact conditional test in combination with a test for model predictive power. In their procedure, the exact conditional test was used to guide the generation of new models, and the test of model predictive power was used to select the final model from among those generated during the search.</Paragraph>
  </Section>
</Paper>