File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/98/p98-1076_concl.xml
Size: 846 bytes
Last Modified: 2025-10-06 13:58:06
<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1076"> <Title>One Tokenization per Source</Title> <Section position="9" start_page="462" end_page="462" type="concl"> <SectionTitle> 7 Conclusion </SectionTitle> <Paragraph position="0"> The hypothesis of one tokenization per source conforms surprisingly well (99.92% - 99.97%) with corpus evidence, and works extremely well (90% - 97%) in critical ambiguity resolution. It is formulated on the critical tokenization theory and inspired by the parallel hypotheses of one sense per discourse and one sense per collocation, and is postulated as a particular articulation of the general law of one realization per expression. We also argue for the further generalization of regarding it as a new paradigm for studying the twin issue of token and tokenization.</Paragraph> </Section> </Paper>