File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/06/p06-1085_abstr.xml
Size: 880 bytes
Last Modified: 2025-10-06 13:45:02
<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1085"> <Title>Contextual Dependencies in Unsupervised Word Segmentation</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> Developing better methods for segmenting continuous text into words is important for improving the processing of Asian languages, and may shed light on how humans learn to segment speech. We propose two new Bayesian word segmentation methods that assume unigram and bigram models of word dependencies, respectively. The bigram model greatly outperforms the unigram model (and previous probabilistic models), demonstrating the importance of such dependencies for word segmentation. We also show that previous probabilistic models rely crucially on sub-optimal search procedures.</Paragraph> </Section> </Paper>