<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1015">
  <Title>Sentence Compression for Automated Subtitling: A Hybrid Approach</Title>
  <Section position="5" start_page="0" end_page="0" type="concl">
    <SectionTitle>
4 Conclusion
</SectionTitle>
    <Paragraph position="0"> We have described a hybrid approach to sentence compression that seems to work in general. Combining statistics with a set of rules that filter out ungrammatical results is a feasible approach to automated sentence compression.</Paragraph>
    <Paragraph position="1"> Combining the probability estimates of chunk removal to rank the generated sentence alternatives works reasonably well, but could be improved by using more fine-grained chunk types for data collection.</Paragraph>
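    The ranking step described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the chunk types, the probability values, the independence assumption used to multiply the estimates, the single grammaticality rule, and all function names are hypothetical, and the example sentence is in English purely for readability.

```python
from itertools import combinations
from math import prod

# Hypothetical removal-probability estimates per chunk type (made-up numbers).
REMOVAL_PROB = {"NP": 0.05, "VP": 0.02, "ADVP": 0.70, "PP": 0.60}

# Hypothetical chunked input sentence: (chunk type, chunk text) pairs.
EXAMPLE = [
    ("NP", "the minister"),
    ("VP", "resigned"),
    ("ADVP", "yesterday"),
    ("PP", "after the debate"),
]


def candidates(chunks):
    """Yield each set of chunk indices to remove, never removing all chunks."""
    indices = range(len(chunks))
    for k in range(len(chunks)):
        for removed in combinations(indices, k):
            yield frozenset(removed)


def is_grammatical(chunks, removed):
    """Assumed toy rule: an alternative must keep at least one NP and one VP."""
    kept = {ctype for i, (ctype, _) in enumerate(chunks) if i not in removed}
    return {"NP", "VP"}.issubset(kept)


def score(chunks, removed):
    """Combine per-chunk estimates by multiplication (independence assumed)."""
    return prod(
        REMOVAL_PROB[ctype] if i in removed else 1.0 - REMOVAL_PROB[ctype]
        for i, (ctype, _) in enumerate(chunks)
    )


def rank(chunks):
    """Return the rule-filtered alternatives, most probable first."""
    valid = [r for r in candidates(chunks) if is_grammatical(chunks, r)]
    valid.sort(key=lambda r: score(chunks, r), reverse=True)
    return [
        " ".join(text for i, (_, text) in enumerate(chunks) if i not in r)
        for r in valid
    ]


if __name__ == "__main__":
    for alternative in rank(EXAMPLE):
        print(alternative)
```

    In this sketch, finer-grained chunk types would simply mean more keys in the probability table, so that, for example, different kinds of prepositional phrases could receive different removal estimates.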
    <Paragraph position="2"> A full syntactic analysis of the input sentence would lead to better results, as the current sentence analysis tools have one very weak point: the handling of coordinating conjunctions, which leads to chunking errors both in the input sentence and in the processing of the parallel corpus used. These errors cause misestimation of the compression probabilities and create noise in the behaviour of our system.</Paragraph>
    <Paragraph position="3"> Making use of semantics would most probably lead to better results, but a semantic lexicon and semantic analysis tools are not available for Dutch, and creating them is beyond the scope of the current project.</Paragraph>
    <Paragraph position="4"> In future research we will check the effects of improved word-reduction modules, as word reductions often seem to lead to inaccurate compressions. Leaving out the word-reduction module would probably lead to an even larger number of cases with no output. This will also be checked in future research.</Paragraph>
  </Section>
</Paper>