<?xml version="1.0" standalone="yes"?>
<Paper uid="W96-0305">
  <Title>Acquisition of Computational-Semantic Lexicons from Machine Readable Lexical Resources</Title>
  <Section position="3" start_page="32" end_page="33" type="metho">
    <SectionTitle>
3. The algorithm
</SectionTitle>
    <Paragraph position="0"> The algorithm is divided into two stages. The first stage is preprocessing: steps such as part-of-speech (POS) tagging and removal of stop words are necessary for the algorithm to obtain good results. Various methods for POS tagging have been proposed in recent years. For simplicity, we adapted the method proposed by Church (1988) to tag the definition sentences. In the second stage, we select as the result the label whose associated word lists are most similar to the definition. The procedure for labeling a dictionary sense is outlined below.</Paragraph>
    <Paragraph position="1"> Algorithm: Sense division for a head word h. Step 1: Given a head word h, read its definitions, DEFh, from LDOCE.</Paragraph>
    <Paragraph position="2"> Step 2: For each definition D of DEFh, tag each word in D with POS information.</Paragraph>
    <Paragraph position="3"> Step 3: Remove all stop words in D to obtain a list of keyword-POS pairs, KEYD. Step 4: Look up headword h in LLOCE to obtain a list of sets, SETh, that contain h. For each S in SETh, compile TOPS, the set of words listed under the topic of S, and REFS, the set of words listed under its cross references.</Paragraph>
    <Paragraph position="4"> Step 5: Compute the similarity Sim(D, S), based on the Dice coefficient, for all definitions D ∈ DEFh and labels S ∈ SETh.</Paragraph>
    <Paragraph position="5"> Sim(D, S) = [equation missing in source], where KEYD = the set of POS-keyword pairs in definition D; the cross-reference weight = the overall relevancy of cross references to a topic; wk = 1 / the degree of ambiguity of the keyword k; In(a, B) = 1 when a ∈ B; and In(a, B) = 0 when a ∉ B.</Paragraph>
    <Paragraph position="6"> Step 6: Assign to D the label S that maximizes Sim(D, S), provided the value exceeds a threshold. Initially, the candidates are limited to the set labels listed in LLOCE for the head word. If the algorithm finds all initial candidates dissimilar, a second run is executed with the candidates expanded to all topics in LLOCE.</Paragraph>
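Steps 5 and 6 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: because the similarity equation is not available in this text, the normalization used here (twice the weighted overlap divided by the sizes of KEYD, TOPS, and REFS) is an assumed Dice-style form. The function names, the `ref_relevancy` parameter, the threshold value, and the sample data are all hypothetical.

```python
def similarity(key_d, top_s, ref_s, weights=None, ref_relevancy=1.0):
    """Dice-style overlap between definition keywords and a label's word lists.

    Assumed form (the source equation is missing):
    2 * sum_k w_k * (In(k, TOPS) + r * In(k, REFS)) / (|KEYD| + |TOPS| + r * |REFS|)
    """
    weights = weights or {}
    overlap = 0.0
    for k in key_d:
        w_k = weights.get(k, 1.0)  # 1 / degree of ambiguity; defaults to 1
        in_top = 1.0 if k in top_s else 0.0
        in_ref = 1.0 if k in ref_s else 0.0
        overlap += w_k * (in_top + ref_relevancy * in_ref)
    denom = len(key_d) + len(top_s) + ref_relevancy * len(ref_s)
    return 2.0 * overlap / denom if denom else 0.0


def label_sense(key_d, candidates, all_topics, threshold=0.1):
    """Return (label, score); fall back to all topics if no candidate passes."""
    for pool in (candidates, all_topics):  # first run, then second run
        best_label, best_score = None, 0.0
        for label, (top_s, ref_s) in pool.items():
            score = similarity(key_d, top_s, ref_s)
            if score > best_score:
                best_label, best_score = label, score
        if best_label is not None and best_score >= threshold:
            return best_label, best_score
    return None, 0.0


# Hypothetical data for one sense of "interest"; only the label Je112 comes
# from the paper, the other label name and the word lists are illustrative.
key_d = {"share/n", "company/n", "business/n"}
candidates = {
    "Je112": ({"lend/v", "loan/v", "interest/n", "investment/n", "share/n"}, set()),
    "other": ({"entertain/v", "amuse/v", "game/n", "hobby/n", "interest/n"}, set()),
}
print(label_sense(key_d, candidates, all_topics={}))  # -> ('Je112', 0.25)
```

Keeping the two candidate pools in one loop makes the second run an explicit fallback, so the expanded topic set is consulted only when every initial candidate falls below the threshold.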
    <Section position="1" start_page="32" end_page="33" type="sub_section">
      <SectionTitle>
3.1 An illustrative example
</SectionTitle>
      <Paragraph position="0"> We illustrate how the algorithm functions using the 5th definition of the word &quot;interest.&quot; The preprocessing stage for this definition includes part-of-speech (POS) tagging and stop word removal, yielding the following result:</Paragraph>
      <Paragraph position="2"/>
      <Paragraph position="4"> {quiet/adj, calm/adj, ..., interest/n, excitement/n, thrill/n, ...} {like/v, fancy/v, ..., attraction/n, appeal/n, interest/n, ...} {lend/v, loan/v, ..., interest/n, investment/n, share/n, ...} {entertain/v, amuse/v, ..., game/n, hobby/n, interest/n, ...}</Paragraph>
      <Paragraph position="6"> The word lists associated with the label Je112 are most similar to the keywords of the definition.</Paragraph>
      <Paragraph position="7"> Therefore, the algorithm produces Je112 as the label for &quot;a share in a company business etc.&quot;</Paragraph>
    </Section>
    <Section position="2" start_page="33" end_page="33" type="sub_section">
      <SectionTitle>
3.2 Experiments and Evaluation
</SectionTitle>
      <Paragraph position="0"> An experiment was carried out using a test set (footnote 3) containing 12 polysemous words used in recent WSD experiments (Yarowsky 1992; Luk 1995). The 12-word test set represents much more difficult cases than average: LDOCE has on average 2.6 definitions per word, whereas the words in the test set have on average 6.4 definitions each. Table 1 displays the word-by-word performance of the algorithm. The results show that, on average, the algorithm can assign labels to 87% of the senses with 94% precision.</Paragraph>
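The two figures reported above can be computed as in the following sketch. It assumes that the first figure is the fraction of senses that receive any label and that precision is the fraction of labeled senses whose label matches a manually assigned one; this reading, the function name, and the sample sense identifiers and labels are assumptions rather than details given in the paper.

```python
def applicability_and_precision(assigned, gold):
    """Return (fraction of senses labeled, fraction of labeled senses correct).

    assigned maps sense id to a predicted label, or None when no label passed
    the threshold; gold maps sense id to the manually assigned label.
    """
    labeled = {s: lab for s, lab in assigned.items() if lab is not None}
    correct = sum(1 for s, lab in labeled.items() if gold.get(s) == lab)
    applicability = len(labeled) / len(gold) if gold else 0.0
    precision = correct / len(labeled) if labeled else 0.0
    return applicability, precision


# Hypothetical labels for four senses of one word.
gold = {"w.n.1": "Aa", "w.n.2": "Bb", "w.n.3": "Cc", "w.n.4": "Dd"}
assigned = {"w.n.1": "Aa", "w.n.2": "Bb", "w.n.3": "Xx", "w.n.4": None}
print(applicability_and_precision(assigned, gold))
```

Here three of the four senses receive a label and two of those three are correct, so the sketch reports an applicability of 0.75 and a precision of about 0.67.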
    </Section>
  </Section>
  <Section position="4" start_page="33" end_page="37" type="metho">
    <SectionTitle>
4. Discussion
</SectionTitle>
    <Paragraph position="0"> In this section, we analyze the labeling performed by the algorithm and, in particular, look into several uses made possible by the availability of the labels. We also analyze the cases in which the algorithm failed. The results of the analysis not only illustrate the merits of these labels, but also suggest possible improvements to the algorithm.</Paragraph>
    <Paragraph position="1"> 4.1. Broad coverage. About 50% of the labels are assigned during the second run of the algorithm, from the extended candidate set. These labels represent gaps in LLOCE. The algorithm can therefore produce much broader coverage than the original LLOCE.</Paragraph>
    <Paragraph position="2">  2. For simplicity, the parameter weighting cross references is set to 1.</Paragraph>
    <Paragraph position="3"> 3. Only entries in LLOCE relevant to the test set were manually entered into the computer. We are currently trying to obtain a licence for the full LLOCE entries in order to conduct a more complete test.</Paragraph>
    <Paragraph position="5"> Dolan (1994) pointed out that it is helpful to identify zero-derived noun/verb pairs for such tasks as normalizing the semantics of expressions that are only superficially different. We have noticed that zero derivatives are an important knowledge source for resolving PP-attachment ambiguity. A PP with an object involved in a noun/adjective zero-derivation has a strong tendency to attach itself to the preceding noun as a modifier. For instance, consider the following example, which has an ambiguous PP attachment: We had a lot of interests in common.</Paragraph>
    <Paragraph position="6"> (= We had a lot of common interests.) 4.3. Systematic inter-sense relations. Sanfilippo et al. (1995) contended that strong evidence suggests that a large part of word sense ambiguity is not arbitrary but follows regular patterns. Moreover, gaps frequently arise in dictionaries and thesauri in specifying this kind of virtual polysemy. Virtual polysemy and recurring inter-sense relations are closely related to polymorphic senses that can support coercion in semantic typing under the theory of the Generative Lexicon of Pustejovsky (1991).</Paragraph>
    <Paragraph position="7"> Our experimental results indicate that the labels in LLOCE make it possible to acquire important inter-sense relations. Many of those relations are reflected in the cross reference information in LLOCE. For instance, LLOCE lists the following cross references for the topic Eb (Food):  Jg: Shopkeepers and shops selling food. Most of these cross references are systematic inter-sense relations similar to those described in the above-mentioned work. We also observed that words involved in such inter-sense relations are frequently underspecified. For instance, &quot;chicken&quot; is listed under both topic Eb and topic Ad, while &quot;duck&quot; is listed under Ad but not Eb. By characterizing some 200 cross references in LLOCE, most systematic inter-sense relations can be easily identified among the labeled senses. The labels attached to senses in the MRD, coupled with these inter-sense relations, can then support and realize the automatic sense shifts advocated in Pustejovsky and Bouillon (1994). For instance, the sense of &quot;duck&quot; labeled with topic Ad can be coerced into an Eb sense when necessary, given a lexical rule stipulating a sense shift from Ad to Eb. Krovetz (1992) observed that LDOCE indicates sense shifts via direct reference (links indicated by a capitalized word with a sense number) and deictic reference (implicit links to the previous sense created by this, these, that, those, its, itself, such a, such an). Sense shifts indicated through a deictic reference are also present in our 12-word test set. For instance, the first 2 senses of &quot;issue&quot; are  1. the act of coming out.</Paragraph>
    <Paragraph position="8"> 2. an example of this.</Paragraph>
    <Paragraph position="9">  The definition of the 2nd sense indicates an ActionNoun-CountNoun sense shift from issue.n.1 to issue.n.2 through the deictic reference &quot;this.&quot; Since definition patterns of this type are not considered, the labeling algorithm fails in such cases. Further work must be undertaken to cope with direct and deictic references, so that such definitions can be appropriately labeled and information on sense shifts can be acquired.  The 4th and 5th senses are metonymically associated with two &quot;star&quot; senses, star.1.n.3 (a 5- or more pointed figure) and star.1.n.2 (a heavenly body such as a PLANET), respectively. The algorithm often fails in such cases for two reasons. First, metonymies are not clearly separated and indicated in LLOCE. Second, the genus terms in metonymical senses are often indistinguishable from each other. Further action must be taken to identify the nature of such relations before this kind of ambiguity can be successfully resolved. The presence of phrases such as &quot;as a ...&quot; or &quot;regarded as&quot; and a drastic change in topic toward the second half of the definition may be cues for identifying metonymy and metaphor.</Paragraph>
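As a starting point for handling deictic references, the cue words listed by Krovetz (1992) above can be detected with a simple pattern match. The sketch below is an assumption about how such a check might look, not part of the algorithm described in this paper; the names `DEICTIC_CUES` and `deictic_cues` are hypothetical.

```python
import re

# Cue words for deictic reference, as listed by Krovetz (1992).
DEICTIC_CUES = ("this", "these", "that", "those", "its", "itself", "such a", "such an")

# Longer cues come first so that "such an" and "itself" are tried before
# their shorter prefixes.
_CUE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(c) for c in sorted(DEICTIC_CUES, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def deictic_cues(definition):
    """Return the deictic cue words found in a definition, in order."""
    return [m.group(0).lower() for m in _CUE_PATTERN.finditer(definition)]


print(deictic_cues("an example of this"))     # second sense of "issue"
print(deictic_cues("the act of coming out"))  # first sense of "issue"
```

A definition that yields a non-empty list could be linked to the previous sense and inherit or shift its label instead of being labeled by overlap alone. Surface matching overgenerates, for example on "that" used as a relative pronoun, so the POS tags from preprocessing would be needed to filter the hits.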
  </Section>
  <Section position="5" start_page="37" end_page="37" type="metho">
    <SectionTitle>
5. Other approaches
</SectionTitle>
    <Paragraph position="0"> Sanfilippo and Poznanski (1992) proposed a so-called Dictionary Correlation Kit (DCK), a dialog-based environment for correlating word senses across pairs of MRDs, LDOCE and LLOCE. Dolan (1994) described a heuristic approach to forming unlabeled clusters of closely related senses in an MRD. The clustering program relies on LDOCE domain codes, grammar codes, and 25 types of semantic relations extracted from definitions. Yarowsky (1992) described a WSD method and an implementation based on Roget's Thesaurus and the training material of the 10-million-word Grolier's Encyclopedia. The author suggested that the method can also apply to dictionary definitions. Krovetz (1993) described a simple algorithm based on overlap of defining words to identify related senses between morphological variants. The author reported that the success rate was over 80%. No results were reported for closely related senses within a part-of-speech. In most of the above-mentioned works, experimental results are reported only for some senses of a couple of words. In this study, we have evaluated our method using all senses of 12 words that have been studied in the WSD literature. This evaluation provides an overall picture of the expected success rate of the method when applied to all word senses in the MRD. Directly comparing methods is often difficult. Nevertheless, in comparison our algorithm is simpler, requires less preprocessing, and does not rely on information idiosyncratic to LDOCE. Thus, the algorithm described in this paper can readily apply to MRDs other than LDOCE. Although our algorithm makes use of defining words that bear various semantic relations to the sense, explicit computation of those relations is not required.</Paragraph>
  </Section>
  <Section position="6" start_page="37" end_page="38" type="metho">
    <SectionTitle>
6. Conclusions and Future Work
</SectionTitle>
    <Paragraph position="0"> The method proposed in this work takes advantage of a number of linguistic phenomena: (1) Division of senses is primarily along the line of subject and topic. (2) Rather rigid schemes of text generation and predictable semantic relations are used to define senses in MRDs such as LDOCE. (3) The implicit links between instances of many of these relations are available in a thesaurus such as LLOCE.</Paragraph>
    <Paragraph position="1"> This work also underscores the effectiveness of lexical rules for coarse WSD. Hand-constructed topic-based classes of words, coupled with lexical rules such as common topic and cross references of topics, prove to be highly effective in both coverage and precision for WSD, admittedly for sense definitions, a somewhat restricted type of text.</Paragraph>
    <Paragraph position="2"> Merging senses via labeling has another implication as well. As discussed in Section 4, the senses sharing the same label (or cross-referencing labels) are frequently associated through various linguistic relations. Making those relations explicit will open the door to flexible treatment of the lexicon, semantic typing, and semantic under-specification, all of which have received ever-increasing interest.</Paragraph>
    <Paragraph position="3"> In a broader context, this paper promotes the progressive approach to knowledge acquisition for NLP as opposed to the &quot;from-scratch&quot; approach. We believe this to be a preferable means of approaching a sound and complete knowledge base.</Paragraph>
  </Section>
  <Section position="7" start_page="38" end_page="38" type="metho">
    <SectionTitle>
Acknowledgment
</SectionTitle>
    <Paragraph position="0"> The authors would like to thank the National Science Council of the ROC for financial support of this research under Contract No. NSC 85-2213-E-007-042.</Paragraph>
  </Section>
</Paper>