<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0612">
  <Title>Constructing an English Valency Lexicon*</Title>
  <Section position="2" start_page="0" end_page="94" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The creation of a valency lexicon of English verbs is part of the ongoing Prague English Dependency Treebank (PEDT) project. PEDT is being built from the Wall Street Journal section of the Penn Treebank by converting it into dependency trees and adding a deep-syntactic annotation layer, working within the linguistic framework of Functional Generative Description (FGD) (Sgall et al., 1986).</Paragraph>
    <Paragraph position="1"> The deep-syntactic annotation in terms of FGD pays special attention to valency. By valency we mean the ability of lexemes (verbs, nouns, adjectives and some types of adverbs) to combine with other lexemes. Capturing valency is profitable in Machine Translation, Information Extraction and Question Answering, since it enables machines to correctly recognize types of events and their participants even when they are expressed by many different lexical items. * The research reported in this paper has been partially supported by the grant of the Grant Agency of the Czech Republic GA405/06/0589, the project of the Information Society No. 1ET101470416, and the grant of the Grant Agency of the Charles University No. 372/2005/A-INF/MFF.</Paragraph>
    <Paragraph position="2"> A valency lexicon of verbs is indispensable for the Prague English Dependency Treebank project as a supporting tool for the deep-syntactic corpus annotation. We are not aware of any lexical source from which such a lexicon could be automatically derived in the desired quality. Manual creation of gold-standard data for computational applications, however, is very time-consuming and expensive. With this in mind, we decided to adapt the existing lexical source PropBank (Palmer, Gildea and Kingsbury, 2005) to FGD, making it comply with the structure of the original Czech valency lexicons VALLEX (Žabokrtský and Lopatková, 2004) and PDT-VALLEX (Hajič et al., 2003), which were designed for the deep-syntactic annotation of the Czech FGD-based treebanks (the Prague Dependency Treebank 1.0 and 2.0) (Hajič et al., 2001; Hajič, 2005). The automatic procedure is followed by manual editing. We report on work that is still ongoing, though nearing completion.</Paragraph>
    <Paragraph position="3"> Therefore, this paper focuses on the general conception of the lexicon and on its technical solutions; it cannot yet offer a thorough evaluation of the completed work.</Paragraph>
    <Paragraph position="4"> The paper is structured as follows. In Section 2, we present current and previous related projects in more detail. In Section 3, we introduce the formal structure of the EngValLex lexicon. In Section 4, we describe how we created the lexicon semi-automatically and present the annotation tool. Finally, in Section 5, we outline our plans for the future development and uses of the lexicon.</Paragraph>
  </Section>
</Paper>