<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-1026">
  <Title>Statistical Sentence Condensation using Ambiguity Packing and Stochastic Disambiguation Methods for Lexical-Functional Grammar</Title>
  <Section position="6" start_page="0" end_page="0" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> We presented an approach to sentence condensation that employs linguistically rich LFG grammars in a parsing/generation-based stochastic sentence condensation system. Fine-grained dependency structures are output by the parser, then modified by a highly expressive transfer system, and filtered by a constraint-based generator. Stochastic selection of generation-filtered reduced structures uses a powerful Maximum-Entropy model.</Paragraph>
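The stochastic selection step described above can be sketched as a log-linear (maximum-entropy) choice among generation-filtered candidates. The candidates, feature names, and weights below are invented for illustration; a trained model would use many more features:

```python
def maxent_select(candidates, weights):
    """Log-linear (maximum-entropy) selection: each candidate's score is
    exp(sum_i w_i * f_i(c)); since exp is monotone, taking the argmax of
    the linear score suffices."""
    def score(cand):
        return sum(weights.get(name, 0.0) * value
                   for name, value in cand["features"].items())
    return max(candidates, key=score)

# Hypothetical generation-filtered candidates with sparse feature vectors.
candidates = [
    {"string": "the company shipped the product last year",
     "features": {"length": 7, "kept_main_verb": 1}},
    {"string": "the company shipped the product",
     "features": {"length": 5, "kept_main_verb": 1}},
    {"string": "the product",
     "features": {"length": 2, "kept_main_verb": 0}},
]
weights = {"length": -0.5, "kept_main_verb": 2.0}  # assumed trained weights

print(maxent_select(candidates, weights)["string"])
```

With these weights the model trades sentence length against retention of the main verb, preferring the mid-length candidate.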
    <Paragraph position="1"> As shown in an experimental evaluation, the summarization quality of the system output is state-of-the-art, and grammaticality of condensed strings is guaranteed. [Table: evaluation results on test set I for the lower bound, system selection, and the upper bound given by structures for original uncondensed sentences.]</Paragraph>
    <Paragraph position="2"> Robustness techniques for parsing and generation ensure that the system produces non-empty output for unseen input.</Paragraph>
    <Paragraph position="3"> Overall, the summarization quality achieved by our system is similar to the results reported in Knight and Marcu (2000). This might seem disappointing considering the more complex machinery employed in our approach. It has to be noted, however, that these results are partially due to the somewhat artificial nature of the data used in the experiments of Knight and Marcu (2000), and therefore in our experiments: the human-written condensations in the data set extracted from the Ziff-Davis corpus show the same word order as the original sentences and do not exhibit any of the structural modifications that are common in human-written summaries. For example, humans tend to make use of structural modifications such as nominalization and verb alternations such as active/passive or transitive/intransitive alternations in condensation. Such alternations can easily be expressed in our transfer-based approach, whereas they pose severe problems for approaches that operate only on phrase-structure trees. In the given test set, however, the condensation task was restricted to the operation of deletion. Creating additional condensations of the original sentences, beyond the condensed versions extracted from the human-written abstracts, would provide a more diverse test set, and would furthermore make it possible to match each system output against any number of independent human-written condensations of the same original sentence. This idea of computing matching scores against multiple reference examples was proposed by Alshawi et al. (1998), and later by Papineni et al. (2001), for the evaluation of machine translation systems. Similar to these proposals, an evaluation of condensation quality could consider multiple reference condensations and record the matching score against the most similar example.</Paragraph>
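The multi-reference scoring proposed above can be illustrated with a toy matcher. The unigram-overlap F1 used here is a crude stand-in for the dependency-based match scores used in the evaluation, and the sentences are invented:

```python
from collections import Counter

def f1_overlap(candidate, reference):
    """Unigram-overlap F1 between two token lists (a simple stand-in for
    a dependency-based match score)."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum((cand & ref).values())  # clipped token matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def multi_reference_score(candidate, references):
    """Record the match against the most similar human condensation."""
    return max(f1_overlap(candidate, ref) for ref in references)

system_output = "the court rejected the appeal".split()
references = [
    "the court rejected the appeal yesterday".split(),
    "the appeal was rejected".split(),
]
print(round(multi_reference_score(system_output, references), 3))
```

Taking the maximum over references means a system output is not penalized for legitimately agreeing with only one of several divergent human condensations.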
    <Paragraph position="4"> Another desideratum for future work is to carry condensation all the way through without unpacking at any stage. Work on employing packing techniques not only for parsing and transfer, but also for generation and stochastic selection, is currently underway (see Geman and Johnson (2002)). This will eventually lead to a system whose components work on packed representations of all or n-best solutions, completely avoiding costly unpacking of representations.</Paragraph>
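Why working on packed representations avoids costly unpacking can be illustrated by dynamic programming over a small AND/OR forest. The node names and scores below are invented; the point is that the best of the (potentially exponentially many) readings is found without enumerating them:

```python
# A tiny packed (AND/OR) forest: an OR-node lists alternative analyses,
# an AND-node lists sub-nodes whose scores are summed, and a leaf carries
# a local score. Node names and scores are invented for illustration.
FOREST = {
    "S":          ("or",  ["S_full", "S_reduced"]),
    "S_full":     ("and", ["NP", "VP_full"]),
    "S_reduced":  ("and", ["NP", "VP_reduced"]),
    "NP":         ("leaf", 1.0),
    "VP_full":    ("leaf", 0.5),
    "VP_reduced": ("leaf", 1.5),
}

def best_score(node, memo=None):
    """Dynamic programming over the packed forest: each node is scored
    once, so shared substructure (here, NP) is never re-derived and no
    reading is ever unpacked individually."""
    if memo is None:
        memo = {}
    if node in memo:
        return memo[node]
    kind, payload = FOREST[node]
    if kind == "leaf":
        result = payload
    elif kind == "and":
        result = sum(best_score(child, memo) for child in payload)
    else:  # "or": keep only the best-scoring alternative
        result = max(best_score(child, memo) for child in payload)
    memo[node] = result
    return result

print(best_score("S"))
```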
  </Section>
</Paper>