File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/98/w98-0707_abstr.xml

Size: 3,116 bytes

Last Modified: 2025-10-06 13:49:32

<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-0707">
  <Title>Towards a Representation of Idioms in WordNet</Title>
  <Section position="1" start_page="0" end_page="52" type="abstr">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> WordNet (Miller, 1995), (Fellbaum, 1998) is perhaps the most widely used electronic dictionary of English and serves as the lexicon for a variety of different NLP applications including Information Retrieval (IR), Word Sense Disambiguation (WSD), and Machine Translation (MT). Despite WordNet's large coverage, which comprises some 100,000 concepts lexicalized by approximately 120,000 word forms (strings) and is comparable to that of a collegiate dictionary, it contains relatively little figurative language. WordNet includes a number of multi-word strings, such as phrasal verbs, but many idiomatic verb phrases like smell a rat, know the ropes, and eat humble pie are missing. Idioms and metaphors abound in everyday language and are found in texts spanning many genres (see, e.g., (Jackendoff, 1997) for a numerical estimate of the frequency of idioms and fixed expressions). Clearly, a dictionary that includes extended senses of words and phrases is likely to yield more successful NLP applications.</Paragraph>
    <Paragraph position="1"> On the one hand, no system wants to retrieve the string bucket from the idiom kick the bucket.</Paragraph>
    <Paragraph position="2"> On the other hand, MT and WSD efforts need to distinguish the sense of ropes in phrases like know/learn/teach someone the ropes from the sense meaning &quot;strong cords&quot;; selecting the latter sense in any of the idiomatic phrases leads to failure. An IR query is likely to be interested only in the &quot;strong cord&quot; reading. When this sense is to be retrieved with the aid of a lexicon intended for multiple applications, the figurative sense must be successfully recognized and excluded from a text that may contain instances of the string ropes with both meanings.</Paragraph>
    <Paragraph position="3"> In this paper, we consider the possibility of extending WordNet to accommodate figurative meanings in the English lexicon. While much has been written on figurative language, there is no agreement on the boundary between literal and non-literal language; see, e.g., (Moon, 1986).</Paragraph>
    <Paragraph position="4"> Criteria that are commonly accepted include semantic non-compositionality and syntactic constraints on internal modification (such as adjective and adverb insertion) and movement transformations. Our purpose here is not to attempt a clear delimitation or definition of non-literal language, but to examine how extended senses of words and phrases from different syntactic and lexical categories (or conforming to none of the standard categories) are compatible with the network structure of a relational lexicon like WordNet and its particular way of representing words and concepts. Our discussion will focus on, but not be limited to, idiomatic verb phrases.</Paragraph>
  </Section>
</Paper>