<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0632"> <Title>Using Subcategorization to Resolve Verb Class Ambiguity</Title> <Section position="7" start_page="271" end_page="272" type="concl"> <SectionTitle> 5 Discussion </SectionTitle> <Paragraph position="0"> [Table 5: semantic preferences as ranked by two judges] This paper explores the degree to which syntactic frame information can be used to disambiguate verb semantic classes. In doing so, we cast the task of verb class disambiguation in a probabilistic framework which exploits Levin's semantic classification and frame frequencies acquired from the BNC. The approach is promising in that it achieves high precision with a simple model and can be easily extended to incorporate other sources of information which can influence the class selection process (e.g., selectional restrictions).</Paragraph> <Paragraph position="1"> The semantic preferences which we generate can be thought of as default semantic knowledge, to be used in the absence of any explicit contextual or lexico-semantic information to the contrary (cf. table 5). Consider the verb write, for example. The model comes up with an intuitively reasonable ranking: we more often write things to people (&quot;message transfer&quot; reading) than for them (&quot;performance&quot; reading). However, faced with a sentence like Max wrote Elisabeth a book, pragmatic knowledge forces us to prefer the &quot;performance&quot; reading over the &quot;message transfer&quot; reading. In other cases the model comes up with a counterintuitive ranking. For the verb call, for instance, the &quot;get&quot; reading (e.g., I will call you a cab) is preferred over the more natural &quot;dub&quot; reading (e.g., John called me a fool). We still rely heavily on the verb class information provided by Levin. But part of our original aim was to infer class information for verbs not listed by Levin. 
For such a verb, P(class), and hence P(verb, frame, class), will be zero, which is not what we want. Recent work in computational linguistics (e.g., Schütze (1993)) and cognitive psychology (e.g., Landauer and Dumais (1997)) has shown that large corpora implicitly contain semantic information, which can be extracted and manipulated in the form of co-occurrence vectors. The idea would be to compute the centroid (the average) of the vectors of all members of a semantic class.</Paragraph> <Paragraph position="2"> Given an unknown verb (i.e., a verb not listed in Levin) we can decide its semantic class by comparing its semantic vector to the centroids of all semantic classes. We could (for example) determine class membership on the basis of the closest distance to the centroid representing a semantic class (cf. Patel et al. (1998) for a proposal similar in spirit). Once we have chosen a class for an unknown verb, we are entitled to assume that it will share the broad syntactic and semantic properties of that class.</Paragraph> <Paragraph position="3"> We also intend to experiment with a full-scale subcategorization dictionary acquired from the BNC. We believe this will address issues such as: (a) relations between frames and classes (for which frames is the semantic class predicted most accurately?) and (b) relations between verbs and classes (for which verbs is the semantic class predicted most accurately?). We also plan to experiment with different classification schemes for verb semantics, such as WordNet (Miller et al., 1990) and intersective Levin classes (Dang et al., 1998).</Paragraph> </Section></Paper>