<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-2007">
  <Title>Towards a Semantic Classification of Spanish Verbs Based on Subcategorisation Information</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Lexical semantic classes group together words that have a similar meaning. Knowledge about verbs is especially important, since verbs are the primary means of structuring and conveying meaning in sentences. Manually built semantic classifications of English verbs have been used for different applications such as machine translation (Dorr, 1997), verb subcategorisation acquisition (Korhonen, 2002a) or parsing (Schneider, 2003). Levin (1993) established a large-scale classification of English verbs based on the hypothesis that the meaning of a verb and its syntactic behaviour are related, and that semantic information can therefore be induced from the syntactic behaviour of the verb. A classification of Spanish verbs based on the same hypothesis has been developed by Vázquez et al. (2000). However, manually constructing large-scale verb classifications is a labour-intensive task. For this reason, various methods for automatically classifying verbs using machine learning techniques have been attempted (Merlo and Stevenson, 2001; Stevenson and Joanis, 2003; Schulte im Walde, 2003).</Paragraph>
    <Paragraph position="1"> In this article we present experiments aiming at automatically classifying Spanish verbs into lexical semantic classes based on their subcategorisation frames. We adopt the idea that a description of verbs in terms of their syntactic behaviour is useful for acquiring their semantic properties. The classification task is carried out in several steps. We first extract the probabilities of the subcategorisation frames for each verb from a partially parsed corpus. The acquired probabilities are then used as features describing the verbs and given as input to an unsupervised classification algorithm that clusters the verbs according to the similarity of their descriptions. For the task of acquiring verb subcategorisation frames, we adapt well-known techniques developed for English to the specificities of the Spanish language, and our results compare favourably to the state-of-the-art results obtained for English (Korhonen, 2002b). For the verb classification task, we use a hierarchical clustering algorithm, and we compare the output clusters to a manually constructed classification developed by Vázquez et al. (2000).</Paragraph>
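    <Paragraph position="2"> The two steps described above (estimating frame probabilities per verb, then clustering verbs on those probabilities) can be illustrated with a minimal Python sketch. It is not the implementation used in the experiments: the toy observations, the frame labels (NP, PP_a, PP_de, INTRANS), the Euclidean distance and the average-link merging criterion are all assumptions chosen for illustration, since this section does not specify them.

```python
from collections import Counter
from itertools import combinations
import math

# Hypothetical toy data: (verb, frame) pairs standing in for the frames
# that would be extracted from a partially parsed corpus.
OBSERVATIONS = [
    ("comer", "NP"), ("comer", "NP"), ("comer", "INTRANS"),
    ("beber", "NP"), ("beber", "NP"), ("beber", "INTRANS"),
    ("ir", "PP_a"), ("ir", "PP_a"), ("ir", "INTRANS"),
    ("venir", "PP_a"), ("venir", "PP_de"), ("venir", "INTRANS"),
]


def frame_probabilities(observations):
    """Map each verb to a vector of relative frame frequencies."""
    counts = {}
    for verb, frame in observations:
        counts.setdefault(verb, Counter())[frame] += 1
    frames = sorted({frame for _, frame in observations})
    return {
        verb: [c[frame] / sum(c.values()) for frame in frames]
        for verb, c in counts.items()
    }


def euclidean(p, q):
    """Euclidean distance between two feature vectors (an assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cluster_distance(c1, c2, vectors):
    """Average-link distance between two clusters of verbs (an assumption)."""
    dists = [euclidean(vectors[a], vectors[b]) for a in c1 for b in c2]
    return sum(dists) / len(dists)


def agglomerative(vectors, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [frozenset([verb]) for verb in sorted(vectors)]
    while len(clusters) > k:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], vectors),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


if __name__ == "__main__":
    vectors = frame_probabilities(OBSERVATIONS)
    for cluster in agglomerative(vectors, k=2):
        print(sorted(cluster))
```

    On this toy input the verbs that share a transitive profile (comer, beber) end up in one cluster and the verbs that favour prepositional complements (ir, venir) in the other. Stopping at a fixed number of clusters k is one simple way of cutting the hierarchy; other stopping criteria are equally possible.</Paragraph>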
  </Section>
</Paper>