<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1107">
  <Title>Spoken Language Recognition and Understanding</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1. PROJECT GOALS
</SectionTitle>
    <Paragraph position="0"> The goal of this research is to demonstrate spoken language systems in support of interactive problem solving.</Paragraph>
    <Paragraph position="1"> The system accepts continuous speech input and handles multiple speakers without explicit speaker enrollment.</Paragraph>
    <Paragraph position="2"> The MIT spoken language system combines SUMMIT, a segment-based speech recognition system, and TINA, a probabilistic natural language system, to achieve speech understanding. The system engages in interactive dialogue with the user, providing output in the form of tabular displays, as well as spoken and written output. The system has been demonstrated on several applications, including travel planning and direction assistance.</Paragraph>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. RECENT RESULTS
</SectionTitle>
    <Paragraph position="0"> * Reduced spontaneous speech recognition word error rate by more than a factor of two since the February 1991 evaluation through the use of low perplexity language models and context-dependent phonetic models.</Paragraph>
    <Paragraph position="1"> * Reduced natural language weighted error by almost a factor of 2 on class A sentences through the use of a robust parsing mechanism, which integrates parsed phrases into a single semantic representation, using a slight extension of the existing discourse processing. null * Demonstrated a near real-time interactive spoken language system, running on a Sun SPARCstation or an IBM RX6000.</Paragraph>
    <Paragraph position="2"> * Developed and experimented with alternative metrics for the evaluation of interactive spoken language systems, including use of task completion and time-to-completion for air travel planning tasks, as well as a log file based evaluation procedure.</Paragraph>
    <Paragraph position="3"> * Collected nearly 20,000 sentences for the Wall Street Journal pilot corpus in support of research and development in large-vocabulary speech recognition systems.</Paragraph>
    <Paragraph position="4"> * Chaired the MADCOW multi-site ATIS data collection effort, contributed over 5000 sentences of spontaneous ATIS data, and participated in the common evaluation for speech, spoken language and text input. null</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="474" type="metho">
    <SectionTitle>
3. PLANS FOR THE COMING YEAR
</SectionTitle>
    <Paragraph position="0"> Improve SUMMIT recognition performance by developing phonetic and language models for silence and filled pauses, and mechanisms to detect and &amp;quot;repair&amp;quot; false starts and repeats.</Paragraph>
    <Paragraph position="1"> Continue experimentation on low perplexity language models (N-gram, LR parser, probablistic parsing) to improve speech recognition performance.</Paragraph>
    <Paragraph position="2"> Model discourse and dialogue, including the use of error and clarification messages, to improve both recognition performance and the interactive nature of spoken language systems.</Paragraph>
    <Paragraph position="3"> Develop evaluation metrics for interactive spoken language systems based on experiments using the real-time ATIS spoken language system.</Paragraph>
    <Paragraph position="4"> Experiment with alternative user interaction strategies using near real-time data collection system.</Paragraph>
  </Section>
class="xml-element"></Paper>