<?xml version="1.0" standalone="yes"?> <Paper uid="H91-1074"> <Title>Predicting Intonational Boundaries Automatically from Text: The ATIS Domain</Title> <Section position="5" start_page="379" end_page="380" type="concl"> <SectionTitle> 4 Discussion </SectionTitle> <Paragraph position="0"> The experiments described above indicate that it is indeed possible to relate intonational boundaries to the text of an utterance with fair success (see footnote 4), using information that current NLP technology can extract automatically. This application of CART techniques to the problem of predicting phrase boundaries increases our understanding of the importance of several among the numerous variables which might plausibly be related to boundary location. Future work will extend the set of variables for analysis to include distance metrics defined in terms of stressed syllables, automatic NP-detection [5], and MUTUAL INFORMATION and GENERALIZED MUTUAL INFORMATION scores, which may serve as indicators of intonational phrase boundaries [10]. We will also examine possible interactions among the statistically important variables which have emerged from our initial study. CART's step-wise treatment of variables, optimization heuristics, and dependence on binary splits obscure the possible relationships that exist among the various factors. 
Now that we have discovered a set of variables which do well at predicting intonational boundary location, we need to understand just how these variables interact.</Paragraph> <Paragraph position="1"> While we have not yet attempted the parallel classification of boundary sites from acoustic information for the ATIS sample, previous research [12] and our own preliminary analysis of a smaller set of training data collected for the VEST (Voice English-Spanish Translation) project suggest that intonational boundaries can be identified with some success from simple measures of final lengthening (inferred from relative word or syllable duration) and of pausal duration. For the VEST data, for example, boundary location can be inferred correctly from such metrics in 92% of cases. In future work, these features, as well as amplitude and other potential boundary indicators, will be examined in the ATIS database.</Paragraph> <Paragraph position="2"> Footnote 4: For purposes of comparison with classification efforts that measure only success of boundary prediction (not success of non-boundary prediction as well), the best cross-validated prediction from the analyses done for this study has a 79.5% success rate and the best prediction from a full tree classifies 89.7% correctly.</Paragraph> </Section> </Paper>