<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1157">
  <Title>Verb Phrase Ellipsis detection using Automatically Parsed Text</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Previous work
</SectionTitle>
    <Paragraph position="0"> Hardt's (1997) algorithm for detecting VPE in the Penn Treebank (see Section 3) achieves recall levels of 53% and precision of 44%, giving an F1 of 48% (see footnote 1), using a simple search technique, which relies on the parse annotation having identified empty expressions correctly. Footnote 1: Precision, recall and F1 are defined as: $\mathrm{Recall} = \frac{\mathrm{No}(\text{correct ellipses found})}{\mathrm{No}(\text{all ellipses in test})}$ (1)</Paragraph>
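The reported F1 can be checked against the reported precision and recall. The sketch below assumes the standard definitions, which are not all spelled out in this section: precision as correct ellipses found over all ellipses found, and F1 as the harmonic mean of precision and recall. The function names are illustrative only.

```python
def recall(correct_found: int, all_in_test: int) -> float:
    """Correct ellipses found divided by all ellipses in the test data."""
    return correct_found / all_in_test if all_in_test else 0.0


def precision(correct_found: int, all_found: int) -> float:
    """Assumed standard definition: correct ellipses found over all ellipses found."""
    return correct_found / all_found if all_found else 0.0


def f1(p: float, r: float) -> float:
    """Assumed standard definition: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Figures reported for Hardt (1997): precision 44%, recall 53%.
hardt_f1 = f1(0.44, 0.53)
print(f"{hardt_f1:.0%}")  # prints 48%
```

Under these assumed definitions, the harmonic mean of 44% and 53% rounds to the 48% quoted above.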
    <Paragraph position="1"> In previous work (Nielsen, 2003a; Nielsen, 2003b) we performed experiments on the British National Corpus using a variety of machine learning techniques. These earlier results are not directly comparable to Hardt's, due to the different corpora used. The expanded set of results is summarised in Table 1. For all of these experiments, the training features consisted of lexical forms and Part of Speech (POS) tags of the words in a three-word forward/backward window of the auxiliary being tested. This context size was determined empirically to give optimum results, and will be used throughout this paper. L-BFGS-MaxEnt uses Gaussian Prior smoothing optimized for the BNC data, while GIS-MaxEnt has a simple smoothing option available, but this deteriorates results and is not used. Both maximum entropy models were experimented with to determine thresholds for accepting results as VPE; GIS-MaxEnt was set to a 20% confidence threshold and L-BFGS-MaxEnt to 35%. MBL was used with its default settings.</Paragraph>
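The feature window and the confidence thresholds described above can be sketched as follows. This is a minimal illustration and not the code used in the experiments: the names `window_features` and `accept_as_vpe`, the padding token `_PAD_`, the inclusion of the auxiliary itself at offset 0, the use of a greater-or-equal comparison, and the example sentence with its tags are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (lexical form, POS tag)
PAD: Token = ("_PAD_", "_PAD_")  # hypothetical filler for positions outside the sentence


def window_features(tokens: List[Token], index: int, size: int = 3) -> Dict[str, str]:
    """Collect word forms and POS tags in a +/- `size` window around tokens[index]."""
    features: Dict[str, str] = {}
    for offset in range(-size, size + 1):
        position = index + offset
        # Fall back to the padding token when the window runs off the sentence.
        word, tag = tokens[position] if position in range(len(tokens)) else PAD
        features[f"word[{offset:+d}]"] = word
        features[f"pos[{offset:+d}]"] = tag
    return features


def accept_as_vpe(probability: float, threshold: float) -> bool:
    """Accept a candidate as VPE when the model's confidence reaches the threshold."""
    return probability >= threshold


# Thresholds reported above for the two maximum entropy models.
GIS_MAXENT_THRESHOLD = 0.20
LBFGS_MAXENT_THRESHOLD = 0.35

# Illustrative sentence with illustrative tags; "did" (index 4) is the auxiliary tested.
tokens = [("John", "NP0"), ("left", "VVD"), ("and", "CJC"), ("Mary", "NP0"),
          ("did", "VDD"), ("too", "AV0"), (".", "PUN")]
features = window_features(tokens, 4)
print(features["word[-1]"], features["pos[+0]"], features["word[+3]"])
```

With a window of three on each side, every candidate auxiliary yields 14 features (seven word forms and seven tags), and a hypothetical model confidence of 0.25 would be accepted under the GIS-MaxEnt threshold but rejected under the L-BFGS-MaxEnt one.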
    <Paragraph position="2"> While TBL gave the best results, the software we used (Lager, 1999) ran into memory problems and proved problematic with larger datasets. Decision trees, on the other hand,</Paragraph>
    <Paragraph position="4"> tend to oversimplify due to the very sparse nature of ellipsis, and produce a single rule that classifies everything as non-VPE. This leaves Maximum Entropy and MBL for further experiments.</Paragraph>
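To illustrate why a single rule that labels everything as non-VPE is attractive to a learner yet useless for detection, the sketch below scores such a rule on made-up data. The counts (20 elliptical auxiliaries out of 1000) are hypothetical and chosen only to mimic sparsity; they are not figures from the corpus, and the `evaluate` helper is an illustrative name.

```python
from typing import List, Tuple


def evaluate(gold: List[bool], predicted: List[bool]) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall), treating VPE (True) as the positive class."""
    pairs = list(zip(gold, predicted))
    true_positives = sum(1 for g, p in pairs if g and p)
    found = sum(predicted)   # candidates the classifier labelled as VPE
    actual = sum(gold)       # candidates that really are VPE
    accuracy = sum(1 for g, p in pairs if g == p) / len(pairs)
    precision = true_positives / found if found else 0.0
    recall = true_positives / actual if actual else 0.0
    return accuracy, precision, recall


# Hypothetical sparse data: 20 elliptical auxiliaries among 1000 candidates.
gold = [True] * 20 + [False] * 980
always_non_vpe = [False] * 1000  # the degenerate single-rule classifier

accuracy, precision, recall = evaluate(gold, always_non_vpe)
print(accuracy, precision, recall)
```

On data this sparse the degenerate rule scores very high accuracy while finding no ellipses at all, so its recall is zero.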
  </Section>
</Paper>