<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-1002">
  <Title>Conditional Structure versus Conditional Estimation in NLP Models</Title>
  <Section position="4" start_page="0" end_page="0" type="relat">
    <SectionTitle>
4 Related Results
</SectionTitle>
    <Paragraph position="0"> Johnson (2001) describes two parsing experiments.</Paragraph>
    <Paragraph position="1"> First, he examines a PCFG over the ATIS treebank, trained both using RFEs to maximize JL and using a CG method to maximize what we have been calling CL. He does not give results for the unconstrained CL, but even in the constrained case the effects from Section 2 occur: CL and parsing accuracy are both higher using the CL estimates. He also describes a conditional shift-reduce parsing model, but notes that it underperforms the simpler joint formulation.</Paragraph>
    <Paragraph position="2"> We take these two results not as contradictory, but as confirmation that conditional estimation, though often slow, generally improves accuracy, while conditional model structures must be used with caution.</Paragraph>
    <Paragraph position="3"> The conditional shift-reduce parsing model he describes can be expected to exhibit the same type of competing-variable explaining-away issues that occur in MEMM tagging. As an extreme example, if all words have been shifted, the rest of the parser actions will be reductions with probability one.</Paragraph>
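The extreme case above can be illustrated with a minimal sketch. It assumes a toy parser with only two unlabeled actions, shift and reduce, and a softmax over whichever actions are legal in the current state; the function name, its arguments, and the scores are hypothetical and are not taken from Johnson (2001).

```python
import math

def action_distribution(scores, buffer_size, stack_size):
    """Locally normalized (softmax) distribution over the legal actions.

    scores is a hypothetical dict mapping an action name to a real score.
    """
    legal = []
    if buffer_size > 0:
        legal.append("shift")   # a word is still waiting in the buffer
    if stack_size >= 2:
        legal.append("reduce")  # two stack items can be combined
    total = sum(math.exp(scores[a]) for a in legal)
    return {a: math.exp(scores[a]) / total for a in legal}

# Mid-sentence: both actions compete, so the scores matter.
mid = action_distribution({"shift": 1.0, "reduce": -2.0},
                          buffer_size=3, stack_size=2)

# All words shifted: reduce is the only legal action, so local
# normalization gives it probability one however poorly it scores.
end = action_distribution({"shift": 1.0, "reduce": -50.0},
                          buffer_size=0, stack_size=4)

print(mid)
print(end)
```

Because each step is normalized on its own, the very low score for reduce in the second call has no effect on the probability of the derivation, which is the kind of behavior the paragraph above warns about.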
    <Paragraph position="4"> Goodman (1996) describes parse selection algorithms in which the criterion being maximized is the same bracket-based accuracy measure by which parses are scored. He shows a test-set accuracy benefit from optimizing accuracy directly.</Paragraph>
    <Paragraph position="5"> Finally, model structure and parameter estimation are not the only factors that determine the behavior of a model. Model features are crucial, and the ability to incorporate richer features in a relatively sensible way also leads to improved models. This is the main basis of the real-world benefit that has been derived from maxent models.</Paragraph>
  </Section>
</Paper>