<?xml version="1.0" standalone="yes"?> <Paper uid="P93-1001"> <Title>Char_align: A Program for Aligning Parallel Texts at the Character Level</Title> <Section position="9" start_page="7" end_page="7" type="concl"> <SectionTitle> 9. Conclusion </SectionTitle> <Paragraph position="0"> The performance of char_align is encouraging. The error rates are often very small, usually well within the length of a sentence or the length of a concordance line. The program is currently being used by translators to produce bilingual concordances for terminology research. For this application, it is necessary that the alignment program accept noisy (realistic) input, e.g., raw OCR output, with little or no manual cleanup. It is also highly desirable that the program produce constructive diagnostics when confronted with texts that don't align very well because of various snafus such as missing and/or misplaced pages. Char_align has succeeded in meeting many of these goals because it works at the character level and does not depend on finding sentence and/or paragraph boundaries, which are surprisingly elusive in realistic applications.</Paragraph> </Section> </Paper>