<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2155">
  <Title>Constituent-based Accent Prediction</Title>
  <Section position="7" start_page="943" end_page="944" type="concl">
    <SectionTitle>5 Conclusion</SectionTitle>
    <Paragraph position="0">Accent prediction experiments on noun phrase constituents demonstrated that deviations from citation-form accentuation (the supra, reduced, and shift classes) can be directly modeled. Machine learning experiments using not only lexical and syntactic features but also discourse focusing features identified by a new theory of accent interpretation in discourse showed that accent assignment can be improved by up to 4%-6% relative to a hypothetical baseline system that would produce only citation-form accentuation, giving error rate reductions of 11%-25%.</Paragraph>
    <Paragraph position="1">In general, constituent-based accentuation is most accurately learned from lexical information readily available in TTS systems. For CTS systems, comparable performance may be achieved using only higher-level attentional features. There are several other lessons to be learned, concerning individual-speaker, domain-dependent, and domain-independent effects on accent modeling.</Paragraph>
    <Paragraph position="2">First, it is, perhaps counterintuitively, harder to predict deviations from citation-form accentuation for speakers who exhibit a great deal of non-citation-style accenting behavior, such as speaker H3. Accent prediction results for H1 exceeded those for H3, although about 15% more of H3's tokens exhibited non-citation-form accentuation. Finding the appropriate parameters by which to describe the prosody of individual speakers is an important goal that can be advanced by using machine learning techniques to explore large spaces of hypotheses.</Paragraph>
    <Paragraph position="3">Second, it is evident from the strong performance of the word lemma sequence models that deviations from citation-form accentuation may often be expressed by lexicalized rules of some sort. Lexicalized rules have in fact proven useful in other areas of statistical natural language modeling, such as POS tagging (Brill, 1995) and parsing (Collins, 1996).</Paragraph>
    <Paragraph position="4">The specific lexicalized rules learned for many of the models would not have followed from any theoretical or empirical proposals in the literature. It may be that domain-dependent training using automatic learning is the appropriate way to develop practical models of accenting patterns on different corpora. Especially for different speakers in the same domain, automatic learning methods seem to be the only efficient way to capture what may be idiolectal variation in accenting.</Paragraph>
    <Paragraph position="5">Finally, it should be noted that, notwithstanding individual-speaker and domain-dependent effects, domain-independent factors identified by the new theory of accent and attention do contribute to experimental performance. The two local focusing features, grammatical function and form of referring expression, enable improvements above the citation-form baseline, especially in combination with lexical information. Global focusing information is of limited use by itself but, as might be hypothesized, contributes to accent prediction in combination with local focus, lexical, and syntactic features.</Paragraph>
  </Section>
</Paper>
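As a reading aid for the figures reported above, the following is a minimal sketch, not taken from the paper, of the arithmetic linking an absolute accuracy gain over the citation-form-only baseline to a relative error rate reduction. The baseline error rates used here are hypothetical, chosen only so that 4- and 6-point gains fall near the reported 11%-25% reduction range; the actual per-speaker figures are in the body of the paper, not reproduced here.

```python
# Minimal sketch (not from the paper): relating an absolute accuracy gain over a
# citation-form-only baseline to a relative error rate reduction. The baseline's
# error rate equals the fraction of tokens whose accentuation deviates from
# citation form (the supra, reduced, and shift classes).

def relative_error_reduction(baseline_error: float, absolute_gain: float) -> float:
    """Relative reduction in error rate given an absolute accuracy gain.

    baseline_error : error rate of the citation-form-only baseline, in [0, 1]
    absolute_gain  : absolute accuracy improvement of the learned model, in [0, 1]
    """
    new_error = baseline_error - absolute_gain
    return (baseline_error - new_error) / baseline_error  # == absolute_gain / baseline_error


if __name__ == "__main__":
    # Hypothetical baseline error rates, chosen only to illustrate how 4-6 point
    # gains can correspond to 11%-25% relative error reductions.
    for baseline_error, gain in [(0.36, 0.04), (0.24, 0.06)]:
        reduction = relative_error_reduction(baseline_error, gain)
        print(f"baseline error {baseline_error:.0%}, gain {gain:.0%} "
              f"-> {reduction:.0%} relative error reduction")
```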