<?xml version="1.0" standalone="yes"?> <Paper uid="N03-1030"> <Title>Sentence Level Discourse Parsing using Syntactic and Lexical Information</Title> <Section position="7" start_page="0" end_page="0" type="concl"> <SectionTitle> 6 Conclusion </SectionTitle> <Paragraph position="0"> In this paper, we have introduced a discourse parsing model that uses syntactic and lexical features to estimate the adequacy of sentence-level discourse structures. Our model defines and exploits a set of syntactically motivated lexico-grammatical dominance relations that fall naturally from a syntactic representation of sentences.</Paragraph> <Paragraph position="1"> The most interesting finding is that these dominance relations encode sufficient information to enable the derivation of discourse structures that are almost indistinguishable from those built by human annotators. Our experiments empirically show that, at the sentence level, there is an extremely strong correlation between syntax and discourse. This is even more remarkable given that the discourse corpus (RST-DT, 2002) was built with no syntactic theory in mind. The annotators used by Carlson et al. (2003) were not instructed to build discourse trees that were consistent with the syntax of the sentences. Yet, they built discourse structures at the sentence level that are not only consistent with the syntactic structures of sentences, but also derivable from them.</Paragraph> <Paragraph position="2"> Recent work on Tree Adjoining Grammar-based lexicalized models of discourse (Forbes et al., 2001) has already shown how to exploit lexical, syntactic, and discourse cues within a single framework. Various linguistic studies have also shown how intertwined syntax and discourse are (Maynard, 1998). 
However, to our knowledge, this is the first paper that empirically shows that the connection between syntax and discourse can be computationally exploited at high levels of accuracy on open-domain newspaper text.</Paragraph> <Paragraph position="3"> Another interesting finding is that the performance of current state-of-the-art syntactic parsers (Charniak, 2000) is not a bottleneck for coming up with a good solution to the sentence-level discourse parsing problem. Little improvement comes from using manually built syntactic parse trees instead of automatically derived trees. However, experiments show that there is much to be gained if better discourse segmentation algorithms are found; 83% accuracy on this task is not sufficient for building highly accurate discourse trees.</Paragraph> <Paragraph position="4"> We believe that semantic/discourse segmentation is a notoriously under-researched problem. For example, Gildea and Jurafsky (2002) present a semantic parser that optimistically assumes that it has access to perfect semantic segments. Our results suggest that more effort needs to be put into semantic/discourse-based segmentation. Improvements in this area will have a significant impact on both semantic and discourse parsing.</Paragraph> </Section> </Paper>