<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1037">
  <Title>A Concept-based Adaptive Approach to Word Sense Disambiguation</Title>
  <Section position="6" start_page="240" end_page="242" type="evalu">
    <SectionTitle>
4 Experiments and Discussions
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="240" end_page="240" type="sub_section">
      <SectionTitle>
4.1 Experiment
</SectionTitle>
      <Paragraph position="0"> In our experiment, we use the materials of text windows of 50 words to the left and 50 words to the right of thirteen polysemous words in the Brown corpus and a sample of Wall Street Journal articles. All instances of these thirteen words are first disambiguated by two human judges. For these thirteen words under investigation, only nominal senses are considered. The experimental results show that the adaptive algorithm disambiguated correctly 71% and 77% of these test cases in the Brown corpus and the WSJ sample. Table 1 provides further details. However, there are still room for improvement in the area of precision.</Paragraph>
      <Paragraph position="1"> Evidence have shown that by exploiting the constraint of so-called &amp;quot;one sense per discourse,&amp;quot; (Gale, Church and Yarowsky 1992b) and the strategy of bootstrapping (Yarowsky 1995), it is possible to boost coverage, while maintaining about the same level of precision.</Paragraph>
    </Section>
    <Section position="2" start_page="240" end_page="241" type="sub_section">
      <SectionTitle>
4.2 Discussions
</SectionTitle>
      <Paragraph position="0"> Although it is often difficult to compare studies on different text domain, genre and experimental setup, the approach presented here seems to compare favorably with the experimental results reported in previous WSD research. Luk (1995) experiments with the same words we use except the word bank and reports that there are totally 616 instances of these words in the Brown corpus, (slightly less than the 749 instances we have experimented on). The author reports that 60% of instances are resolved correctly using the definition-based concept co-occurrence (DBCC) approach. Leacock et al. (1996) report that precision rate of 76% for disambiguating the word line in a sample of WSJ articles.</Paragraph>
      <Paragraph position="1"> One of the limiting factors of this approach is the quality of sense definition in the MRD.</Paragraph>
      <Paragraph position="2"> Short and vague definitions tend to lead to inclusion of inappropriate topics in the contextual representation. Using inferior CR, it is not possible to produce enough and precise samples in the initial step for subsequent adaptation.</Paragraph>
      <Paragraph position="3"> Table l(a) Disambiguation results for thirteen ambiguous words in Brown corpus.</Paragraph>
      <Paragraph position="4">  The experiment and evaluation shows that adaptation is most effective when a high-frequency word with topically contrasting senses is involved. For low-frequency senses such as EARTH, ROW, and ROAD senses of bank, the approach does not seem to be very effective.</Paragraph>
      <Paragraph position="5"> For instance the following passage containing an instance of bank has the ROW sense but our algorithm fails to disambiguate it.</Paragraph>
      <Paragraph position="6"> ... They slept- Mynheer with a marvelously high-pitched snoring, the damn seahorse ivory teeth watching him from a bedside table. In the ballroom below, the dark had given way to moonlight coming in through the bank of french windows, it was a delayed moon, but now the sky had cleared of scudding black and the stars sugared the silver-gray sky.</Paragraph>
      <Paragraph position="7"> Martha Schuyler, old, slow, careful of foot, came down the great staircase, dressed in her best lace-drawn black silk, her jeweled shoe buckles held forward.</Paragraph>
      <Paragraph position="8"> Non-topical sense like ROW-bank can appeared in many situations, thus are very difficult to captured using a topical contextual representation. Local contextual representation might be more effective.</Paragraph>
      <Paragraph position="9"> Infrequent and non-topical senses are problematic due to data sparseness. However, that is not specific to the adaptive approach, all other approaches in the literature suffer the same predicament. Even with a static knowledge acquired from a very large corpus, these senses were disambiguated at a considerably lower rate. S Related approaches In this section, we review recent WSD literature from the prospective of types of contextual knowledge and different representational schemes.</Paragraph>
      <Paragraph position="10">  With topical representation of context, the context of a given sense is reviewed as a bag of words without structure. Gale, Church and Yarowsky (1992a) experiment on acquiring topical context from substantial bilingual training corpus and report good results.</Paragraph>
      <Paragraph position="11">  Local context includes the structured information on word order, distance, and syntactic feature. For instance, the local content of a line from does not suggest the same sense for the word line as a line for does.</Paragraph>
      <Paragraph position="12"> Brown et al. (1990) use the trigram model as a way of resolving sense ambiguity for lexical selection in statistical machine translation. This model makes the assumption that only the previous two words have any effect on the translation, thus word sense, of the next word. The model attacks the problem of lexical ambiguity and produces satisfactory results, under some strong assumption. A major problem with trigram model is that of long distance dependency. Dagan and Itai (1994) indicate that two languages are more informative than one; an English corpus is very helpful in disambiguating polysemous words in Hebrew text. Local context in the form of lexical relations are identified in a very large corpus. Brown, et al. (1991) describe a statistical algorithm for partitioning word senses into two groups. The authors use mutual information to find a contextual feature that most reliably indicates which of the senses of the French ambiguous word is used. The authors report a 20% improvement in the performance of a machine translation system when the words are first disambiguated this way.</Paragraph>
    </Section>
    <Section position="3" start_page="241" end_page="242" type="sub_section">
      <SectionTitle>
5.2 Static vs. Adaptive Strategy
</SectionTitle>
      <Paragraph position="0"> Of the recent WSD systems proposed in the literature, almost all have the property that the knowledge is fixed when the system completes the training phase. That means the acquired knowledge never expands during the course of disambiguation. Gale, et al. (1992a) report that if one had obtained a set of training materials with errors no more than twenty to thirty percent, one could iterate training materials selection just once or twice and have training sets that had less than ten percent errors. The adaptive approach is somehow similar to their idea of incremental learning and to the bootstrap approach proposed by Yarowsky (1995). However, both approaches are still considered static models which are changed only in the training phase.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>