File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/01/h01-1001_concl.xml

Size: 2,669 bytes

Last Modified: 2025-10-06 13:53:00

<?xml version="1.0" standalone="yes"?>
<Paper uid="H01-1001">
  <Title>Activity detection for information access to oral communication</Title>
  <Section position="8" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6. CONCLUSION AND FUTURE WORK
</SectionTitle>
    <Paragraph position="0"> It has been shown that activities can be detected and that they may be efficient indices for access to oral communication. Overall, high-level distinctions are easy to make with automated methods, while fine-grained distinctions are hard to make even for humans; on the other hand, automatic methods are still able to model some aspects of them (Fig. 3).</Paragraph>
    <Paragraph position="1"> To obtain a reduction in entropy, a relatively large database such as CallHome Spanish (120 dialogues) is required. Alternatives to activities might be emotional and dominance distributions, which are easier to detect and may be natural for users to understand. If activities are used only for local navigation support within a rejoinder, one could also visualize the dialogue act patterns for each channel on a time line.</Paragraph>
    <Paragraph position="2"> The author has also observed that topic clusters and activities are largely independent in the meeting domain, resulting in orthogonal indices. Since activities are intuitive for naive users and may be remembered, it can be assumed that users would be able to make use of these constraints.</Paragraph>
    <Paragraph position="3"> Ongoing work includes the use of speaker activity for dialogue segmentation and further assessment of features for information access. Overall, the methods presented here and the ongoing work are improving the ability to index oral communication. It should be noted that some of the techniques presented lend themselves to implementations that don't require (full) speech recognition: speaker identification and dialogue act identification may be done without an LVCSR system, which would lower the computational requirements and allow for a more robust system.</Paragraph>
    <Paragraph position="4">  tion of high-level genre, as exemplified by the differentiation of corpora, can be done with high accuracy using simple features (Ries, 1999). Similarly, it was fairly easy to discriminate between male and female speakers on Switchboard (Ries, 1999). Discriminating between sub-genres such as TV-show types (Sec. 4) can be done with reasonable accuracy. However, it is a lot harder to discriminate between activities within one conversation for personal phone calls (CallHome) (Ries et al., 2000) or for general rejoinders (Santa) and meetings (Sec. 2).</Paragraph>
  </Section>
</Paper>