<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1048"> <Title>Models for Sentence Compression: A Comparison across Domains, Training Requirements and Evaluation Measures</Title> <Section position="9" start_page="382" end_page="383" type="concl"> <SectionTitle> 7 Conclusions and Future Work </SectionTitle> <Paragraph position="0"> In this paper we have provided a comparison between a supervised (constituent-based) and a minimally supervised (word-based) approach to sentence compression. Our results demonstrate that the word-based model performs equally well on spoken and written text. Since it does not rely heavily on training data, it can be easily extended to languages or domains for which parallel compression corpora are scarce. When no parallel corpora are available, the parameters can be manually tuned to produce compressions. In contrast, the supervised decision-tree model is not particularly robust on spoken text; it is sensitive to the nature of the training data, and did not produce adequate compressions when trained on the human-authored Broadcast News corpus. A comparison of the automatically gathered Ziff-Davis corpus with the Broadcast News corpus revealed important differences between the two corpora, thus suggesting that automatically created corpora may not reflect human compression performance.</Paragraph> <Paragraph position="1"> We have also assessed whether automatic evaluation measures can be used for the compression task. Our results show that grammatical relations-based F-score (Riezler et al. 2003) correlates reliably with human judgements and could thus be used to measure compression performance automatically. 
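As a minimal sketch of the measure discussed above: relations-based F-score compares the sets of grammatical relations extracted from the system compression and from a gold-standard compression. In practice the relations come from a parser (Riezler et al. use an LFG parser); the hand-written (label, head, dependent) tuples below are purely illustrative, not taken from the paper's data.

```python
def relations_f_score(gold, system):
    """F1 over sets of grammatical relations.

    Each relation is a (label, head, dependent) tuple such as
    ("subj", "sold", "company"). Hypothetical example relations;
    real ones would be produced by a syntactic parser.
    """
    gold, system = set(gold), set(system)
    if not gold or not system:
        return 0.0
    overlap = gold & system
    precision = len(overlap) / len(system)
    recall = len(overlap) / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative relations for a gold compression and a system output
# that drops one modifier relation.
gold = {("subj", "sold", "company"),
        ("obj", "sold", "unit"),
        ("mod", "unit", "software")}
system = {("subj", "sold", "company"),
          ("obj", "sold", "unit")}

print(round(relations_f_score(gold, system), 2))  # precision 1.0, recall 2/3
```

Because the score is computed over relations rather than word n-grams, a compression that preserves the sentence's predicate-argument structure scores highly even when the surface word order changes.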
For example, it could be used to assess progress during system development or for comparing across different systems and system configurations with much larger test sets than currently employed.</Paragraph> <Paragraph position="2"> In its current formulation, the only function driving compression in the word-based model is the language model. The word significance and SOV scores are designed to single out important words that the model should not drop. We have not yet considered any functions that encourage compression. Ideally these functions should be inspired by the underlying compression process.</Paragraph> <Paragraph position="3"> Finding such a mechanism is an avenue for future work. We would also like to enhance the word-based model with more linguistic knowledge; we plan to experiment with syntax-based language models and more richly annotated corpora.</Paragraph> <Paragraph position="4"> Another important future direction lies in applying the unsupervised model presented here to languages with more flexible word order and richer morphology than English (e.g., German, Czech).</Paragraph> <Paragraph position="5"> We suspect that these languages will prove challenging for creating grammatically acceptable compressions. Finally, our automatic evaluation experiments motivate the use of relations-based F-score as a means of directly optimising compression quality, much in the same way MT systems optimise model parameters using BLEU as a measure of translation quality.</Paragraph> </Section> </Paper>