<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1015"> <Title>Sentence Compression for Automated Subtitling: A Hybrid Approach</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> A sentence compression tool has been built to automate subtitle generation for the deaf and hard-of-hearing. Verbatim transcriptions cannot be presented, because the subtitle presentation rate is between 690 and 780 characters per minute, which is roughly 5.5 seconds for two lines (ITC, 1997; Dewulf and Saerens, 2000), while average speech runs well above the equivalent of 780 characters per minute.</Paragraph> <Paragraph position="1"> The actual amount of compression needed depends on the speaker's speech rate and on the amount of time available after the sentence. In documentaries, for instance, there are often long silent intervals between two sentences, the speech is often slower and the speaker is off-screen, so the available presentation time is longer. When the speaker is off-screen, the synchrony of the subtitles with the speech is of minor importance. In news subtitling, the speech rate is often very high, so the amount of reduction needed to allow the synchronous presentation of subtitles and speech is much greater. The sentence compression rate is a parameter that can be set for each sentence.</Paragraph> <Paragraph position="2"> Note that the sentence compression tool described in this paper is not a subtitling tool. During subtitling, a sentence is sent to the sentence compression tool only when it needs to be reduced and the required amount of reduction is known.</Paragraph> <Paragraph position="3"> The sentence compression tool is thus a module of an automated subtitling tool. 
The output of the sentence compression tool needs to be processed according to subtitling guidelines such as (Dewulf and Saerens, 2000) in order to obtain the correct layout, which makes it usable for actual subtitling. Manual post-editing of the subtitles will still be required, as for some sentences no automatic compression is generated.</Paragraph> <Paragraph position="4"> In real subtitling it often occurs that sentences are not compressed but, to keep the subtitles synchronized with the speech, some sentences are removed entirely.</Paragraph> <Paragraph position="5"> In section 2 we describe the processing of a sentence in the sentence compressor, from input to output. In section 3 we describe how the system was evaluated and the results of the evaluation. Section</Paragraph> </Section> </Paper>