<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-1036">
  <Title>Recognizing Textual Parallelisms with edit distance and similarity degree</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Detection of discourse structure is crucial in many text-based applications such as Information Retrieval, Question-Answering, Text Browsing, etc.</Paragraph>
    <Paragraph position="1"> Thanks to a discourse structure, one can precisely point out a piece of information, provide it with a local context, situate it globally, and link it to other information.</Paragraph>
    <Paragraph position="2"> The context of our research is to improve automatic discourse analysis. A key feature of the most popular discourse theories (RST (Mann and Thompson, 1987), SDRT (Asher, 1993), etc.) is the distinction between two sorts of discourse relations or rhetorical functions: the subordinating and the coordinating relations (some parts of a text play a subordinate role relative to other parts, while some others have equal importance).</Paragraph>
    <Paragraph position="3"> In this paper, we focus our attention on a discourse feature that we assume supports coordination relations, namely Textual Parallelism. Based on psycholinguistic studies (Dubey et al., 2005), our intuition is that similarities concerning the surface, the content and the structure of textual units can be a way for authors to make explicit their intention to consider these units as having the same rhetorical importance. Parallelism can be encountered in many specific discourse structures such as continuity in information structure (Kruijff-Korbayová and Kruijff, 1996), frame structures (Charolles, 1997), VP ellipses (Hobbs and Kehler, 1997), headings (Summers, 1998), enumerations (Luc et al., 1999), etc. These phenomena are usually treated mostly independently within individual systems with ad hoc resource developments.</Paragraph>
    <Paragraph position="4"> In this work, we argue that, depending on the description granularity at which we can proceed, computing syntagmatic (succession axis of linguistic units) and paradigmatic (substitution axis) similarities between units allows us to handle such discourse structural phenomena generically. Section 2 introduces the discourse parallelism phenomenon.</Paragraph>
    <Paragraph position="5"> Section 3 develops three methods we implemented to detect it: a similarity degree measure, a string editing distance (Wagner and Fischer, 1974) and a tree editing distance (Zhang and Shasha, 1989).</Paragraph>
    <Paragraph position="6"> Section 4 discusses and evaluates these methods and their relevance. The final section reviews related work.</Paragraph>
  </Section>
</Paper>