<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-0318">
  <Title>Automatic Disambiguation of Discourse Particles</Title>
  <Section position="2" start_page="0" end_page="107" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Discourse particles, such as German ja, nein, ach, oh, and ähm, and English well, yes, oh, ah, and uhm (Schiffrin, 1987), are extremely frequent in spontaneous spoken-language dialogues.</Paragraph>
    <Paragraph position="1"> For instance, in informal German human-to-human communication, their frequency ranges between 8.8% and 9.8% (Fischer and Johanntokrax, 1995). In human-computer interaction they are less prominent; however, they may still constitute 6.6% of the 150 most frequent words (Fischer and Johanntokrax, 1995). Despite their quantitative importance, discourse particles have so far been neglected in automatic speech processing; if they are identified at all, then only in order to eliminate them (O'Shaugnessy, 1993). There may be two reasons for this: first, it is not clear what they could contribute to the aims of automatic speech processing, and second, they may fulfill so many different functions that it seems difficult to identify the information relevant to such aims.</Paragraph>
    <Paragraph position="2"> This study first investigates what these discourse particles contribute to natural human-to-human conversation, with the aim of determining which of their functions can be useful for automatic speech processing. In order to make use of the information they provide, however, they need to be disambiguated. Two different strategies will be employed to disambiguate them automatically: on the one hand, their position with respect to the turn and utterance in which they occur will be investigated in order to see how much it contributes to the interpretation of a discourse particle occurrence; on the other, their role in the dialogue structure, especially regarding a dialogue model, will be analyzed. These two types of information, position and dialogue acts, are particularly easy to obtain during automatic speech processing. The aim of this investigation is thus to determine * what discourse particles contribute to human-to-human dialogues; * what they may contribute to automatic speech processing; * to what extent their surface properties, in particular their position regarding the turn and the utterance, influence their functions; * what the dialogue structure may contribute to their automatic disambiguation; * how these aspects interact and how this interaction can be modelled in an actual automatic speech processing system.</Paragraph>
    <Paragraph position="3"> It will be shown that the two types of information involved suffice to disambiguate a considerable portion of discourse particle occurrences and that there is consequently no reason to eliminate them from automatic speech processing. Finally, a model for an implementation in a semantic network for the automatic disambiguation of discourse particles will be proposed.</Paragraph>
  </Section>
</Paper>