<?xml version="1.0" standalone="yes"?> <Paper uid="N03-2008"> <Title>A Maximum Entropy Approach to FrameNet Tagging</Title> <Section position="3" start_page="0" end_page="2" type="intro"> <SectionTitle> 2 Method </SectionTitle> <Paragraph position="0"> Training (32,251 sentences), development (3,491 sentences), and held-out test (3,398 sentences) sets were generated from the June 2002 FrameNet release, following the divisions used in Gildea and Jurafsky (2000). Because human-annotated syntactic information could be obtained for only a subset of their data, the training, development, and test sets used here are approximately 10% smaller than those used in Gildea and Jurafsky (2000).</Paragraph> <Paragraph position="1"> On average there are 2.2 Frame Elements per sentence, each falling into one of 126 unique classes.</Paragraph> <Paragraph position="2"> Maximum Entropy. Maximum Entropy (ME) models implement the intuition that the best model is the one that is consistent with all the evidence but is otherwise as uniform as possible (Berger et al., 1996). Following recent successes in applying ME to many NLP tasks (Och and Ney, 2002; Koeling, 2000), we use it to implement a Frame Element classifier. We use the YASMET ME package (Och, 2002) to train an approximation of the model below: P(r | pt, voice, position, target, gf, h). Here r indicates the element type, pt the phrase type, gf the grammatical function, h the head word, and target the target predicate. Because of data sparsity, we do not estimate this model directly but instead model various feature combinations, as described in Gildea and Jurafsky (2000).</Paragraph> <Paragraph position="3"> The classifier was trained using only features that occurred at least once in the training data, and training continued until performance on the development set ceased to improve.
Feature weights were smoothed using a Bayesian method in which weight limits are Gaussian-distributed with mean 0 and standard deviation 1.</Paragraph> <Paragraph position="4"> Tagging. Frame Elements do not occur in isolation; rather, they depend heavily on which other Elements occur in the sentence. For example, if a Frame Element is tagged as an Agent, it is highly unlikely that the next Element will also be an Agent. We exploit this dependency by treating the Frame Element classification task as a tagging problem.</Paragraph> <Paragraph position="5"> The YASMET MEtagger was used to apply an n-gram tag model to the classification task (Bender et al., 2003). The feature set for the training data was</Paragraph> </Section> </Paper>