File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/00/c00-1024_evalu.xml
Size: 3,227 bytes
Last Modified: 2025-10-06 13:58:33
<?xml version="1.0" standalone="yes"?> <Paper uid="C00-1024"> <Title>A Multilingual News Summarizer</Title> <Section position="5" start_page="163" end_page="164" type="evalu"> <SectionTitle> 5. Evaluation of Summarization Results </SectionTitle> <Paragraph position="0"> The same six events specified in Section 3.1 are used to measure the performance of the two summarization models. Three metrics are considered: the document reduction rate, the reading-time reduction rate, and the information carried. The higher the document reduction rate, the more time the reader may save, but the more likely it is that important information is lost. Tables 2 and 3 list the document reduction rates for focusing and browsing summarization, respectively. Only focuses are displayed in focusing summarization, so its average document reduction rate is higher than that of browsing summarization.</Paragraph> <Paragraph position="1"> Besides the document reduction rate, we also measure the correct rate of question answering and the reading-time reduction rate. Assessors read only the highlighted parts in the browsing summarization and answer 3 to 5 questions.</Paragraph> <Paragraph position="2"> Table 4 lists the evaluation results of the six events. The average document reduction rate is 43.79%. On average, the summary saves 30.86% of reading time. When only the summary is read, the correct rate of the question-answering task is 88.46%.</Paragraph> <Paragraph position="3"> Conclusion This paper sketches an architecture for a multilingual news summarizer. In multilingual clustering, matching all pairs of news clusters in all languages is time-consuming. Because only English and Chinese news articles are considered in this paper, this is not a problem here. In general, an effective way is to predefine a sequence of language pairs according to the degree of translation ambiguity.
The language pair with less ambiguity is tried first.</Paragraph> <Paragraph position="4"> To discuss which fragments of multilingual news stories denote the same things, this paper defines the concept of MUs. Punctuation marks, linking elements and topic chains are cues to identify MUs for Chinese. Select-high-frequent English translation and name transliteration are adopted to translate Chinese MUs into English. Five models are proposed to link similar MUs together. Different formats used in time, date and monetary expressions, e.g., implicit time zones, affect the performance of linking. This issue should be studied in the future.</Paragraph> <Paragraph position="5"> In the presentation of summarization results, the information decay strategy helps reduce redundancy, and the user can get all the information provided by the news sites.</Paragraph> <Paragraph position="6"> However, the news sequence is not presented in order of importance. The user may quit reading and miss the information not yet shown. The voting strategy from reporters gives a shorter summarization in terms of user-preferred languages. However, it also misses some unique information reported by only one site. A hybrid strategy should be developed in the future to meet all the requirements.</Paragraph> </Section> </Paper>