<?xml version="1.0" standalone="yes"?> <Paper uid="J02-2001"> <Title>Near-Synonymy and Lexical Choice</Title> <Section position="9" start_page="138" end_page="140" type="relat"> <SectionTitle> 9. Related Work </SectionTitle> <Paragraph position="0"> Most computational work on near-synonymy has been motivated by lexical mismatches in machine translation (Kameyama et al. 1991). In interlingual MT, an intermediate representational scheme, such as an ontology in knowledge-based machine translation (KBMT) (Nirenburg et al. 1992) or lexical-conceptual structures in UNITRAN (Dorr 1993), is used to encode lexical meaning (and all other meaning). But as we showed in Section 3, such methods do not work at the fine grain necessary for near-synonymy, despite their effectiveness at a coarse grain. To overcome these problems but retain the interlingual framework, Barnett, Mani, and Rich (1994) describe a method of generating natural-sounding text that is maximally close in meaning to the input interlingual representation. Like us, they define a notion of semantic closeness, but whereas they rely purely on denotational representations and (approximate) logical inference, in addition to lexical features for relative naturalness, we explicitly represent fine-grained aspects on a subconceptual level and use constraints and preferences, which gives the lexical-choice process flexibility and robustness. Viegas (1998), on the other hand, describes a preliminary solution that accounts for semantic vagueness and underspecification in a generative framework. Although her model is intended to account for near-synonymy, she does not discuss it explicitly.</Paragraph> <Paragraph position="1"> Transfer-based MT systems use a bilingual lexicon to map words and expressions from one language to another. 
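Such a lexicon is, in effect, a one-to-many table from source words to sets of target-language near-synonyms; the sketch below (a toy illustration of ours, in Python, with made-up entries and a made-up rule, not taken from any actual system) shows why handcrafted rules are needed to finish the choice:

```python
# A toy transfer-style bilingual lexicon: each source word maps to a
# set of target-language near-synonyms (entries are illustrative only).
BILINGUAL_LEXICON = {
    "erreur": ["error", "mistake", "blunder", "slip", "lapse"],  # French -> English
}

def transfer(source_word, register="neutral"):
    """Choose a target word with a toy language-pair-specific rule:
    an informal register selects 'slip' when available; otherwise
    fall back to the first (default) equivalent."""
    candidates = BILINGUAL_LEXICON.get(source_word, [])
    if not candidates:
        return None  # word not covered by the lexicon
    if register == "informal" and "slip" in candidates:
        return "slip"
    return candidates[0]
```

Even this single toy rule hints at the problem discussed in the text: every fine distinction among the target near-synonyms would need its own handcrafted, language-pair-specific condition.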
Lists, sometimes huge, of handcrafted language-pair-specific rules encode the knowledge needed to perform the mapping (e.g., in SYSTRAN [Gerber and Yang 1997]). EuroWordNet (Vossen 1998) could be used in such a system. Its Inter-Lingual-Index provides a language-independent link between synsets in different languages and has an explicit relation, EQ_NEAR_SYNONYM, for relating synsets that are not directly equivalent across languages. But, as in the individual wordnets, there is no provision for representing differences between near-synonyms.</Paragraph> <Paragraph position="2"> In statistical MT, there would seem to be some promise for handling near-synonymy. In principle, a system could choose the near-synonym that is most probable given the source sentence and the target-language model. Near-synonymy seems to have been of little concern, however, in statistical MT research: the seminal researchers, Brown et al. (1990), viewed such variations as a matter of taste; in evaluating their system, they considered two different translations of the same source that conveyed roughly the same meaning (perhaps with different words) to be equally satisfactory. More recently, though, Foster, Isabelle, and Plamondon (1997) show how such a model can be used in interactive MT, and Langkilde and Knight (1998) in text generation. Such methods are unfortunately limited in practice, because it is too computationally expensive to go beyond a trigram model (only two words of context). Even if a statistical approach could account for near-synonymy, Edmonds (1997) showed that its strength lies not in choosing the right word, but in determining which near-synonym is most typical or natural in a given context. So such an approach would not be as useful in goal-directed applications such as text generation, or even in sophisticated MT.</Paragraph> <Paragraph position="3"> 10. 
Conclusion Every natural language processing system needs some sort of lexicon, and for many systems, the lexicon is the most important component. Yet real natural language processing systems today rely on a relatively shallow coverage of lexical phenomena, which unavoidably restricts their capabilities and thus the quality of their output. (Of course, shallow lexical semantics is a necessary starting point for a practical system, because it allows for broad coverage.) The research reported here pushes the lexical coverage of natural language systems to a deeper level.</Paragraph> <Paragraph position="4"> The key to the clustered model of lexical knowledge is its subconceptual/stylistic level of semantic representation. By introducing this level between the traditional conceptual and syntactic levels, we have developed a new model of lexical knowledge that keeps the advantages of the conventional model--efficient paraphrasing, lexical choice (at a coarse grain), and mechanisms for reasoning--but overcomes its shortcomings concerning near-synonymy. The subconceptual/stylistic level is more expressive than the top level, yet it allows for tractable and efficient processing because it &quot;partitions,&quot; or isolates, the expressiveness (i.e., the non-truth-conditional semantics and fuzzy representations) in small clusters. The model reconciles fine-grained lexical knowledge with coarse-grained ontologies using the notion of granularity of representation.</Paragraph> <Paragraph position="5"> The next stage in this work is to build a more extensive lexicon of near-synonym clusters than the few handwritten clusters built for the simple implementation described in this article. 
To this end, Inkpen and Hirst (2001a, 2001b) are developing a method to automatically build a clustered lexicon of 6,000 near-synonyms (1,000 clusters) from the machine-readable text of Hayakawa's Choose the Right Word (1994).</Paragraph> <Paragraph position="6"> Besides MT and NLG, we envision other applications of the model presented in this article. For instance, an interactive dictionary--an intelligent thesaurus--would actively help a person to find and choose the right word in any context. Rather than merely list possibilities, it would rank them according to the context and to parameters supplied by the user and would also explain potential effects of any choice, which would be especially useful in computer-assisted second-language instruction. Or the model could be applied in the automatic (post)editing of text in order to make the text conform to a certain stylistic standard or to make a text more readable or natural to a given audience.</Paragraph> <Paragraph position="7"> We leave a number of open problems for another day, including recovering nuances from text (see Edmonds [1998] for a preliminary discussion); evaluating the effectiveness of the similarity measures; determining the similarity of conceptual structures; understanding the complex interaction of lexical and structural decisions during lexical choice; exploring the requirements for logical inference in the model; modeling other aspects of fine-grained meaning, such as emphasis; and understanding the context-dependent nature of lexical differences and lexical knowledge.</Paragraph> <Paragraph position="8"> Appendix: An Example Representation: The Error Cluster The following is the representation of the cluster of error nouns in our formalism.</Paragraph> <Paragraph position="9"> Tokens ending in l represent lexical items. In upper case are either variables (for cross-reference) or relations; it should be clear from the context which is which. Capitalized tokens are concepts. 
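These surface conventions are simple enough to check mechanically; the following is a minimal sketch of ours (assuming the subscript l is rendered as a _l suffix; the function name is our own, not part of the implementation described in the article):

```python
def classify_token(token):
    """Classify a token from a cluster definition by the surface
    conventions stated in the appendix (illustrative sketch only)."""
    if token.endswith("_l"):
        return "lexical item"          # e.g., error_l, blunder_l
    if token.isupper():
        return "variable or relation"  # e.g., P1, DEGREE
    if token[0].isupper():
        return "concept"               # a capitalized concept name
    return "feature value"             # e.g., medium, implication
```

Note that the uppercase test lumps variables and relations together, mirroring the remark in the text that only context distinguishes the two.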
In lower case are values of various features (such as &quot;indirectness&quot; and &quot;strength&quot;) defined in the model. We have not discussed many of the implementation details in this article, including p-link and covers (see Edmonds [1999]).</Paragraph> <Paragraph position="10"> (defcluster error_C ;;; from Gove (1984) :syns (error_l mistake_l blunder_l slip_l lapse_l howler_l) (blunder_l usually medium implication P1) ;; Mistake does not always imply blameworthiness, blunder sometimes.</Paragraph> <Paragraph position="11"> (mistake_l sometimes medium implication (P2 (DEGREE 'medium))) (error_l always medium implication (P2 (DEGREE 'medium))) (blunder_l sometimes medium implication (P2 (DEGREE 'high))) ;; Mistake implies less severe criticism than error.</Paragraph> <Paragraph position="12"> ;; Blunder is harsher than mistake or error.</Paragraph> <Paragraph position="13"> (mistake_l always medium implication (P31 (DEGREE 'low))) (error_l always medium implication (P31 (DEGREE 'medium))) (blunder_l always medium implication (P31 (DEGREE 'high))) ;; Mistake implies misconception.</Paragraph> <Paragraph position="14"> (mistake_l always medium implication P4) ;; Slip carries a stronger implication of accident than mistake.</Paragraph> <Paragraph position="15"> ;; Lapse implies inattention more than accident.</Paragraph> <Paragraph position="16"> (slip_l always medium implication P5) (mistake_l always weak implication P5) (lapse_l always weak implication P5) (lapse_l always medium implication P6) ;; Blunder expresses a pejorative attitude towards the person.</Paragraph> <Paragraph position="17"> (blunder_l always medium pejorative V1) ;; Blunder is a concrete word, error and mistake are abstract.</Paragraph> <Paragraph position="18"> (blunder_l high concreteness) (error_l low concreteness) (mistake_l low concreteness) ;; Howler is an informal term.</Paragraph> (howler_l low formality)) )</Paragraph> </Section> </Paper>