<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1906"> <Title>Corpus-based Induction of an LFG Syntax-Semantics Interface for Frame Semantic Processing</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Corpus and Grammar Resources </SectionTitle> <Paragraph position="0"> Frame Semantic Corpus Annotations. The basis for our work is a corpus of manual frame annotations, the SALSA/TIGER corpus (Erk et al., 2003). The annotation follows the FrameNet definitions of frames and their semantic roles. Underlying this corpus is a syntactically annotated corpus of German newspaper text, the TIGER treebank (Brants et al., 2002). TIGER syntactic annotations consist of relatively flat constituent graph representations, with edge labels that indicate functional information, such as head (HD) and subject (SB); cf. Figure 1. The SALSA frame annotations are flat graphs connected to syntactic constituents. Figure 1 displays frame annotations where the REQUEST frame is triggered by the (discontinuous) frame evoking element (FEE) fordert ... auf (requests). The semantic roles (or frame elements, FEs) are represented as labelled edges that point to syntactic constituents in the TIGER syntactic annotation: the noun SPD for the SPEAKER, Koalition for the ADDRESSEE, and the PP zu Gespräch über Reform for the MESSAGE. LFG Grammar Resources. We aim to build a computational syntax-semantics interface for frame semantics, to be used for (semi-)automatic corpus annotation for training stochastic role assignment models, and ultimately as a basis for automatic frame assignment. As a grammar resource we chose a wide-coverage computational LFG grammar for German (developed at IMS, University of Stuttgart). This German LFG grammar has already been used for semi-automatic syntactic annotation of the TIGER corpus, with a reported coverage of 50% and a precision of 70% (Brants et al., 2002). 
The grammar runs on the XLE grammar processing platform, which provides stochastic training and online disambiguation packages. The grammar is currently being extended further and will be enhanced with stochastic disambiguation, along the lines of Riezler et al. (2002). LFG Corpus Resource. Alongside the German LFG grammar, there is a 'parallel' LFG f-structure corpus, which Forst (2003) derived from the TIGER treebank by applying treebank conversion methods. We use this parallel treebank to induce LFG frame annotation rules from the SALSA/TIGER annotations.</Paragraph> </Section> </Paper>