File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/06/n06-1029_concl.xml
Size: 2,059 bytes
Last Modified: 2025-10-06 13:55:07
<?xml version="1.0" standalone="yes"?> <Paper uid="N06-1029"> <Title>Unsupervised and Semi-supervised Learning of Tone and Pitch Accent</Title> <Section position="7" start_page="229" end_page="229" type="concl"> <SectionTitle> 6 Conclusion &amp; Future Work </SectionTitle> <Paragraph position="0"> We have demonstrated the effectiveness of both unsupervised and semi-supervised techniques for recognizing Mandarin Chinese syllable tones and English pitch accents, using acoustic features alone to capture pitch target height and slope. Although outperformed by fully supervised classification techniques that use much larger samples of labeled training data, these unsupervised and semi-supervised techniques perform well above most-common-class assignment, in the best cases approaching 90% of supervised levels, and, where comparable, well above a good discriminative classifier trained on a comparably small set of labeled data. Unsupervised techniques achieve accuracies of 87% on the cleanest read speech, 57% on data from a standard Mandarin broadcast news corpus, and over 78% on pitch accent classification for English broadcast news. Semi-supervised classification in the Mandarin four-class task reaches 94% accuracy on read speech and 70% on broadcast news data, improving dramatically over both the simple baseline of 25% and a standard SVM with an RBF kernel trained only on the labeled examples.</Paragraph> <Paragraph position="1"> Future work will consider a broader range of tone and intonation classification tasks, including the richer tone set of Cantonese as well as Bantu family tone languages, for which annotated data is truly scarce.</Paragraph> <Paragraph position="2"> We also hope to integrate a richer contextual representation of tone and intonation, consistent with phonetic theory, within this unsupervised and semi-supervised learning framework.
We will further explore improvements in classification accuracy based on increases in labeled and unlabeled training examples.</Paragraph> </Section> </Paper>