File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/03/p03-1020_concl.xml
Size: 1,127 bytes
Last Modified: 2025-10-06 13:53:36
<?xml version="1.0" standalone="yes"?> <Paper uid="P03-1020"> <Title>tRuEcasIng</Title> <Section position="6" start_page="0" end_page="0" type="concl"> <SectionTitle> 5 Conclusions </SectionTitle> <Paragraph position="0"> We have discussed truecasing, the process of restoring case information to badly cased or non-cased text, and we have proposed a statistical, language-modeling-based truecaser that achieves 98% agreement with professionally written news articles.</Paragraph> <Paragraph position="1"> Although its most direct impact is improved legibility, truecasing is also useful for case normalization across styles, genres, and sources, and it is a valuable component in further natural language processing. Task-based evaluation shows a 26% F-measure improvement in named entity recognition when truecasing is used. In the context of automatic content extraction, mention detection on automatic speech recognition text improves by a factor of 8. Truecasing also enhances the legibility of machine translation output and yields a BLEU score improvement of 80.2% over the original system.</Paragraph> </Section> </Paper>