<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2198"> <Title>WORD CLASS DISCOVERY FOR POSTPROCESSING CHINESE HANDWRITING RECOGNITION</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> WORD CLASS DISCOVERY FOR POSTPROCESSING CHINESE HANDWRITING RECOGNITION Chao-Huang Chang </SectionTitle> <Paragraph position="0"> E000/CCI~, Building 11, Industrial Technology Research Institute, Chutung, Hsinchu 31015, TAIWAN, R.O.C.</Paragraph> <Paragraph position="1"> Summary This article presents a novel Chinese class n-gram model for contextual postprocessing of handwriting recognition results. The word classes in the model are automatically discovered by a corpus-based simulated annealing procedure. Three other language models, least-word, word-frequency, and the powerful inter-word character bigram model, have been constructed for comparison. Extensive experiments on large text corpora show that the discovered class bigram model outperforms the other three competing models.</Paragraph> </Section> </Paper>