File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/04/w04-0206_concl.xml
Size: 2,751 bytes
Last Modified: 2025-10-06 13:54:08
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0206"> <Title>Discourse-Level Annotation for Investigating Information Structure</Title> <Section position="7" start_page="11" end_page="11" type="concl"> <SectionTitle> 6 Conclusions and Perspectives </SectionTitle> <Paragraph position="0"> We presented the details of the discourse-level annotation scheme that we developed within the MULI project. MULI is a pilot project; as such, the annotation has so far been restricted to a relatively small amount of data, since the experimental design of the study required testing of tools as well as manual annotation. We plan to extend the size of the corpus by manual and semi-automatic annotation in a follow-up project.</Paragraph> <Paragraph position="1"> The challenge in the MULI project has been to define theory-neutral and language-independent annotation schemes for annotating linguistic data with information that pertains to the realisation and interpretation of information structure. An important characteristic of the MULI corpus, arising from its theory-neutrality, is that it is descriptive. The corpus annotation is not based on explanatory mechanisms: we have to derive such explanations from the data. (See Skut et al. (1997) for related methodology pertaining to syntactic treebanks.) The MULI corpus facilitates linguistic investigation of how phenomena at different annotation levels interact. For example, how do syntactic structure and intonation interact to realise information structure? Or, how does information structure interact with anaphoric relationships? Such linguistic investigations can help to extend existing accounts of information structure, and can also be used to verify (or falsify) predictions made by such accounts. 
The corpus also makes it possible to construct computational models from the corpus data.</Paragraph> <Paragraph position="2"> Theory-neutrality enhances the reusability of linguistic resources, because it facilitates integration with other theory-neutral resources. To some extent we have already explored this in MULI, combining, e.g., Tiger annotation with discourse-level annotation. Another possibility to explore is to integrate MULI annotation with, e.g., the SALSA corpus (Erk et al., 2003), which provides more detailed semantico-pragmatic information in the style of FrameNet.</Paragraph> <Paragraph position="3"> Our initial investigation also reveals where additional annotation would be needed. For instance, the text example discussed above constitutes a concession scheme, which we cannot identify without annotating discourse/rhetorical relations. This in turn requires extending the annotation scheme to non-nominal markables.</Paragraph> </Section> </Paper>