<?xml version="1.0" standalone="yes"?> <Paper uid="W00-0506"> <Title>Pre-processing Closed Captions for Machine Translation</Title> <Section position="8" start_page="43" end_page="44" type="evalu"> <SectionTitle> 6 Performance </SectionTitle> <Paragraph position="0"> We ran a test to evaluate how the recognizer and segmenter affected translation quality. We selected a sample of 200 lines of closed captioning, comprising four continuous sections of 50 lines each. The sample was run through the MT system twice, once with the recognizer and segmenter activated and once without. The results were evaluated by two native Spanish speakers. We adopted a very simple evaluation measure, asking the subjects to judge whether one translation was better than the other. The translations differed for 32 of the 200 input lines (16%). Table (3) shows the evaluation results, with input lines as the unit of measurement.</Paragraph> <Paragraph position="1"> The third column shows the intersection of the two evaluations, i.e. the evaluations on which the two subjects agreed. The three rows show how often the translation was better (i) with pre-processing, (ii) without pre-processing, or (iii) no difference could be discerned.</Paragraph> <Paragraph position="2"> The results show a discrepancy between the two evaluations. One evaluator also pointed out that it is hard to make sense of transcribed closed captions without the audio-visual context. These two facts suggest that an appropriate evaluation should be done in the operational context in which closed captions are normally used. Still, the intersection of the subjects' evaluations shows that pre-processing improves the output quality. In three of the four cases where the two evaluators agreed that pre-processing yielded a worse result, the worse performance was due to incorrect name recognition or segmentation.
However, in two of the three cases, the original problem was an incorrect tagging.</Paragraph> <Paragraph position="3"> Note that even when the name recognizer and segmenter are off, the system can identify some names, and recover from translation failures by piecing together translations of fragments. Therefore, what was being tested was not so much name recognition and segmenting per se, but the idea of having separate modules for these tasks in the system front end.</Paragraph> <Paragraph position="4"> Finally, the test did not take speed into account, as we set higher time thresholds than an embedded application would require. Since segmentation reduces processing time, it is also expected to reduce the impact of tighter time thresholds, all other things being equal.</Paragraph> <Paragraph position="5"> We are planning to conduct an operational evaluation of the system. The goal is to evaluate the system output in its proper visual context, and to compare the results with parallel results for human-translated closed captions. Different groups of participants will watch a video with either human- or machine-translated subtitles, and complete a questionnaire based on the subtitles in the video. The questionnaire will contain a set of questions to elicit the subject's assessment of the translation quality, and a set of questions to assess the subject's level of comprehension of the program.</Paragraph> </Section> </Paper>