<?xml version="1.0" standalone="yes"?> <Paper uid="W04-3205"> <Title>VERBOCEAN: Mining the Web for Fine-Grained Semantic Verb Relations</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Relevant Work </SectionTitle> <Paragraph position="0"> In this section, we describe application domains that can benefit from a resource of verb semantics.</Paragraph> <Paragraph position="1"> We then introduce some existing resources and describe previous attempts at mining semantics from text.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Applications </SectionTitle> <Paragraph position="0"> Question answering is often approached by canonicalizing the question text and the answer text into logical forms. This approach is taken, inter alia, by a top-performing system (Moldovan et al.</Paragraph> <Paragraph position="1"> 2002). In discussing future work on the system's logical form matching component, Rus (2002, p.</Paragraph> <Paragraph position="2"> 143) points to incorporating entailment and causation verb relations to improve the matcher's performance. In other work, Webber et al. (2002) have argued that successful question answering depends on lexical reasoning, and that lexical reasoning in turn requires fine-grained verb semantics in addition to troponymy (is-a relations between verbs) and antonymy.</Paragraph> <Paragraph position="3"> In multi-document summarization, knowing verb similarities is useful for sentence compression and for determining sentences that have the same meaning (Lin 1997). Knowing that a particular action happens before another or is enabled by another is also useful for determining the order of events (Barzilay et al. 2002). For example, to order summary sentences properly, it may be useful to know that selling something can be preceded by buying, manufacturing, or stealing it. 
Furthermore, knowing that a particular verb has a meaning stronger than another (e.g., rape vs. abuse and renovate vs. upgrade) can help a system pick the most general sentence.</Paragraph> <Paragraph position="4"> In lexical selection of verbs in machine translation and in work on document classification, practitioners have argued for approaches that depend on wide-coverage resources indicating verb similarity and membership of a verb in a certain class. In work on translating verbs with many counterparts in the target language, Palmer and Wu (1995) discuss inherent limitations of approaches that do not examine a verb's class membership, and put forth an approach based on verb similarity. In document classification, Klavans and Kan (1998) demonstrate that document type is correlated with the presence of many verbs of a certain EVCA class (Levin 1993). In discussing future work, Klavans and Kan point to extending coverage of the manually constructed EVCA resource as a way of improving the performance of the system. A wide-coverage repository of verb relations, including verbs linked by the similarity relation, would provide a way to automatically extend the existing verb classes to cover more of the English lexicon.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Existing resources </SectionTitle> <Paragraph position="0"> Some existing broad-coverage resources on verbs have focused on organizing verbs into classes or annotating their frames or thematic roles. EVCA (English Verb Classes and Alternations) (Levin 1993) organizes verbs by similarity and participation/nonparticipation in alternation patterns. It contains 3200 verbs classified into 191 classes. Additional manually constructed resources include PropBank (Kingsbury et al. 2002), FrameNet (Baker et al. 
1998), VerbNet (Kipper et al.</Paragraph> <Paragraph position="1"> 2000), and the resource on verb selectional restrictions developed by Gomez (2001).</Paragraph> <Paragraph position="2"> Our approach differs from the above in its focus.</Paragraph> <Paragraph position="3"> We relate verbs to each other rather than organize them into classes or identify their frames or thematic roles. WordNet does provide relations between verbs, but at a coarser level. We provide finer-grained relations such as strength, enablement, and temporal information. Also, in contrast with WordNet, we cover more than the prescriptive cases.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 Mining semantics from text </SectionTitle> <Paragraph position="0"> Previous web mining work has rarely addressed extracting many different semantic relations from a Web-sized corpus. Most work on extracting semantic information from large corpora has focused on the extraction of is-a relations between nouns. Hearst (1992) was the first, followed by recent larger-scale and more fully automated efforts (Pantel and Ravichandran 2004; Etzioni et al.</Paragraph> <Paragraph position="1"> 2004; Ravichandran and Hovy 2002). Recently, Moldovan et al. (2004) presented a learning algorithm to detect 35 fine-grained noun phrase relations. Turney (2001) studied word relatedness and synonym extraction, while Lin et al. (2003) present an algorithm that queries the Web using lexical patterns to distinguish noun synonymy from antonymy. Our approach addresses verbs and provides a richer, finer-grained set of semantics. The reliability of estimating bigram counts on the Web via search engines has been investigated by Keller and Lapata (2003).</Paragraph> <Paragraph position="2"> Semantic networks have also been extracted from dictionaries and other machine-readable resources. MindNet (Richardson et al. 
1998) extracts a collection of triples of the type "ducks have wings" and "duck capable-of flying". This resource, however, does not relate verbs to each other or provide verb semantics.</Paragraph> </Section> </Section> </Paper>