File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/99/e99-1015_intro.xml
Size: 6,111 bytes
Last Modified: 2025-10-06 14:06:50
<?xml version="1.0" standalone="yes"?> <Paper uid="E99-1015"> <Title>An annotation scheme for discourse-level argumentation in research articles</Title> <Section position="4" start_page="0" end_page="111" type="intro"> <SectionTitle> 2 The annotation scheme </SectionTitle> <Paragraph position="0"> We wanted the scheme to cover one text type, namely research articles, but from different presentational traditions and subject matters, so that we can use it for text summarization in a range of fields. This means we cannot rely on similarities in external presentation, e.g. section structure and typical linguistic formulaic expressions. (Proceedings of EACL '99)</Paragraph> <Paragraph position="1"> Previous discourse-level annotation schemes (e.g. Liddy, 1991; Kircz, 1991) show that information retrieval can profit from added rhetorical information in scientific texts. However, the definitions of the categories in these schemes rely on domain-dependent knowledge such as typical research methodology, and are thus too specific for our purposes.</Paragraph> <Paragraph position="2"> General frameworks of text structure and argumentation, like Cohen's (1984) theoretical framework for general argumentation and Rhetorical Structure Theory (Mann and Thompson, 1987), are theoretically applicable to many different text types. However, we believe that restricting ourselves to the text type of research articles will give us an advantage over such general schemes, because it will allow us to rely on communicative goals typically occurring within that text type.</Paragraph> <Paragraph position="3"> Swales' (1990) CARS (Creating a Research Space) model provides a description at the right level for our purposes. Swales claims that the regularities in the argumentative structure of research article introductions follow from the authors' primary communicative goal: namely to convince their audience that they have provided a contribution to science. 
From this goal follow highly predictable subgoals which he calls argumentative moves (&quot;recurring and regularized communicative events&quot;). An example of such a move is &quot;Indication of a gap&quot;, where the author argues that there is a weakness in an earlier approach which needs to be addressed.</Paragraph> <Paragraph position="4"> Swales' model has been used extensively by discourse analysts and researchers in the field of English for Specific Purposes, for tasks as varied as teaching English as a foreign language, human translation and citation analysis (Myers, 1992; Thompson and Ye, 1991; Duszak, 1994), but always for manual analysis by a single person. Our annotation scheme is based on Swales' model, but we needed to modify it. Firstly, the CARS model only applies to introductions of research articles, so we needed new moves to cover the other paper sections; secondly, we needed more precise guidelines to make the scheme suitable for reliable annotation by several annotators who are not discourse analysts (and for potential automatic annotation).</Paragraph> <Paragraph position="5"> For the development of our scheme, we used computational linguistics articles. The papers in our collection cover a challenging range of subject matters due to the interdisciplinarity of the field, such as logic programming, statistical language modelling, theoretical semantics and computational psycholinguistics. Because the research methodology and tradition of presentation are so different in these fields, we would expect the scheme to be equally applicable in a range of disciplines other than those named.</Paragraph> <Paragraph position="6"> Our annotation scheme consists of the seven categories shown in Figure 1. There are two versions of the annotation scheme. The basic scheme distinguishes three types of textual segment, a distinction which we think is a necessary precondition for argumentatively justified summarization. 
This distinction is concerned with the attribution of authorship to scientific ideas and solutions described in the text. Authors need to make clear, and readers need to understand: * which sections describe generally accepted statements (BACKGROUND); * which ideas are attributed to some other, specific piece of research outside the given paper, including the authors' own previous work (OTHER); * and which statements are the authors' own new contributions (OWN).</Paragraph> <Paragraph position="7"> The full annotation scheme consists of the basic scheme plus four other categories, which are based on Swales' moves. The most important of these is AIM (Swales' move &quot;Explicit statements of research goal&quot;), as these moves are good characterizations of the entire paper. We are interested in how far humans can be trained to consistently annotate these sentences; similar experiments where subjects selected one or several 'most relevant' sentences from a paper have traditionally reported low agreement (Rath et al., 1961). There is also the category TEXTUAL (Swales' move &quot;Indicate structure&quot;), which provides helpful information about section structure, and two moves having to do with attitude towards previous research, namely BASIS and CONTRAST.</Paragraph> <Paragraph position="8"> The relative simplicity of the scheme was a compromise between two demands: we wanted the scheme to contain enough information for automatic summarization, but still be practicable for hand coding.</Paragraph> <Paragraph position="9"> Annotation proceeds sentence by sentence according to the decision tree given in Figure 2. No instructions about the use of cue phrases were given, although some of the example sentences given in the guidelines contained cue phrases. The categorisation task resembles the judgements performed e.g. 
in dialogue act coding (Carletta et al., [Figure 1 (fragment): ...ing out weaknesses in other research; sentences stating that the research task of the current paper has never been done before; direct comparisons. BASIS: Statements that the own work uses some other work as its basis or starting point, or gets support from this other work.]</Paragraph> </Section> </Paper>