File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/05/p05-1031_intro.xml
Size: 2,552 bytes
Last Modified: 2025-10-06 14:03:04
<?xml version="1.0" standalone="yes"?> <Paper uid="P05-1031"> <Title>Towards Finding and Fixing Fragments: Using ML to Identify Non-Sentential Utterances and their Antecedents in Multi-Party Dialogue</Title> <Section position="3" start_page="0" end_page="247" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Non-sentential utterances (NSUs) as in (1) are pervasive in dialogue: recent studies put the proportion of such utterances at around 10% across different types of dialogue (Fernández and Ginzburg, 2002; Schlangen and Lascarides, 2003).</Paragraph> <Paragraph position="1"> (1) a. A: Who came to the party? B: Peter. (= Peter came to the party.) b. A: I talked to Peter.</Paragraph> <Paragraph position="2"> B: Peter Miller? (= Was it Peter Miller you talked to?) c. A: Who was this? Peter Miller? (= Was this Peter Miller?) Such utterances pose an obvious problem for natural language processing applications, namely that the intended information (in (1-a)-B, a proposition) has to be recovered from the uttered information (here, an NP meaning) with the help of information from the context.</Paragraph> <Paragraph position="3"> While some systems that automatically resolve such fragments have recently been developed (Schlangen and Lascarides, 2002; Fernández et al., 2004a), they have the drawback that they require &quot;deep&quot; linguistic processing (full parses, and also information about discourse structure) and hence are not very robust. We have identified a well-defined subtask of this problem, namely identifying fragments (certain kinds of NSUs, see below) and their antecedents (in multi-party dialogue, in our case), and present a novel machine learning approach to it, which we hypothesise will be useful for tasks such as automatic meeting summarisation.1 The remainder of this paper is structured as follows. In the next section we further specify the task and different possible approaches to it.
We then describe the corpus we used, some of its characteristics with respect to fragments, and the features we extracted from it for machine learning. Section 4 describes our experimental settings and reports the results. After a comparison to related work in Section 5, we close with a conclusion and some further work that is planned.</Paragraph> <Paragraph position="4"> 1 Zechner and Lavie (2001) describe a related task, linking questions and answers, and evaluate its usefulness in the context of automatic summarisation; see Section 5.</Paragraph> </Section> </Paper>