<?xml version="1.0" standalone="yes"?>
<Paper uid="J05-2005">
  <Title>Representing Discourse Coherence: A Corpus-Based Study</Title>
  <Section position="3" start_page="467" end_page="467" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> 3 Although Lascarides and Asher (1991) do not explicitly disallow crossed dependencies, they argue that when a discourse structure is being constructed, the right frontier of an already existing discourse structure is the only possible attachment point for a new incoming discourse segment (cf. also Polanyi 1996; Polanyi and Scha 1984; Webber et al. 1999). This constraint on building discourse structures effectively disallows crossed dependencies.</Paragraph>
    <Paragraph position="1">  Wolf and Gibson Representing Discourse Coherence Some accounts of informational discourse structure do not assume tree structures (e.g., Bergler [1991] and Hobbs [1985] for monologue and Penstein Rose et al. [1995] for dialogue structure). However, none of these accounts provides systematic empirical support for using more general graphs rather than trees. Providing a systematic empirical study of whether trees are descriptively adequate for representing discourse coherence is the goal of this article.</Paragraph>
    <Paragraph position="2"> There are also accounts of informational discourse structure that argue for trees as a "backbone" for discourse structure but allow certain violations of tree constraints (crossed dependencies or nodes with multiple parents). Examples of such accounts include Webber et al. (1999) and Knott (1996). Similarly to our approach, Webber et al. (1999) investigated informational coherence relations. The kinds of coherence relations they used are basically the same as those that we used (cf. also Hobbs 1985). However, they argue for a tree structure as a backbone for discourse structure but have also addressed violations of tree structure constraints. In order to accommodate violations of tree structure constraints (in particular, crossed dependencies), Webber et al. (1999) argue for a distinction between "structural" discourse relations, on the one hand, and "nonstructural" or "anaphoric" discourse relations on the other hand. Structural discourse relations are represented within a lexicalized tree-adjoining grammar framework, and the resultant structural discourse structure is represented by a tree. However, more recently, Webber et al. (2003) have argued that structural discourse structure should allow nodes with multiple parents, but no crossed dependencies. It is unclear, however, why Webber et al. (2003) allow one kind of tree constraint violation (nodes with multiple parents) but not another (crossed dependencies).</Paragraph>
    <Paragraph position="3"> Note that there seems to be a problem with the definition of "structural" versus "nonstructural" discourse structure in Webber et al. (1999): According to Webber et al. (1999), nonstructural discourse relations are licensed by anaphoric relations and can be involved in crossed dependencies. However, Webber et al. (1999) also argue that one criterion for nonstructural coherence relations is that they can cross (non)structural coherence relations. Since this definition of "nonstructural" appears to be circular, it is necessary to find an independent way to validate the difference between structural and nonstructural coherence relations. Knott (1996) might provide a way to empirically formalize the claims in Webber et al. (1999), or at least claims that seem to be very similar to those in Webber et al. (1999): Based on the observation that he cannot identify characteristic cue phrases for elaboration relations (e.g., because would be a characteristic cue phrase for cause-effect), Knott argues that elaboration relations are more permissive than other types of coherence relations (e.g., cause-effect, similarity, contrast). As a consequence, Knott argues, elaboration relations would be better described in terms of focus structures (cf. Grosz and Sidner 1986), which Knott argues are less constrained, than in terms of rhetorical relations (cf. Hobbs 1985; Mann and Thompson 1988), which Knott argues are more constrained. This hypothesis makes testable empirical claims: Elaboration relations should in some way pattern differently from other coherence relations. We come back to this issue in sections 4.1 and 4.2.</Paragraph>
    <Paragraph position="4">  Computational Linguistics Volume 31, Number 2 In this article we present evidence against trees as a data structure for representing discourse coherence. Note, though, that the evidence does not support the claim that discourse structures are completely arbitrary. The goal of our research program is to first determine which constraints on discourse structure are empirically viable. To us, the work we present here seems to be the crucial first step in avoiding arbitrary constraints on inferences for building discourse structures. In other words, the point we wish to make here is that although there might be other constraints on possible discourse annotations that will have to be identified in future research, tree structure constraints do not seem to be the right kinds of constraints. This appears to be a crucial difference between approaches like Knott's (1996), Marcu's (2000), or Webber et al.'s (2003), on the one hand, and our approach, on the other hand. The goal of the former approaches seems to be to first specify a set of constraints on possible discourse annotations and then to annotate texts with these constraints in mind.</Paragraph>
    <Paragraph position="5"> The following two sections illustrate problems with trees as a representation of discourse coherence structures. Section 3.1 shows that the discourse structures of naturally occurring texts contain crossed dependencies, which cannot be represented in trees. Another problem for trees, in addition to crossed dependencies, is that many nodes in coherence graphs of naturally occurring texts have multiple parents. This is shown in section 3.2. Because of these problems for trees, we argue for a representation such as chain graphs (cf. Frydenberg 1989; Lauritzen and Wermuth 1989), in which directed arcs represent asymmetrical or directed coherence relations and undirected arcs represent symmetrical or undirected coherence relations (this is equivalent to arguing for directed graphs with cycles). For all the examples in sections 3.1 and 3.2, chain-graph-based analyses are given. RST analyses are given only for those examples that are also annotated by Carlson, Marcu, and Okurowski (2002) (in those cases, the RST analyses are those provided by Carlson, Marcu, and Okurowski).</Paragraph>
    <Section position="1" start_page="467" end_page="467" type="sub_section">
      <SectionTitle>
3.1 Crossed Dependencies
</SectionTitle>
      <Paragraph position="0"> Consider the text passage in example (20) (modified from SAT practice materials):  (20) 1. Schools tried to teach students history of science. 2. At the same time they tried to teach them how to think logically and inductively.</Paragraph>
      <Paragraph position="1"> 3. Some success has been reached in the first of these aims. 4. However, none at all has been reached in the second.  Figure 1 shows the coherence graph for example (20). Note that the arrowheads of the arcs represent directionality for asymmetrical relations (elaboration) and bidirectionality for symmetrical relations (similarity, contrast).</Paragraph>
      <Paragraph position="2"> The coherence structure for example (20) can be derived as follows: • Elaboration relation between discourse segments 3 and 1: Discourse segment 3 provides more details (the degree of success) about the teaching described in discourse segment 1.</Paragraph>
      <Paragraph position="3"> • Elaboration relation between discourse segments 4 and 2: Discourse segment 4 provides more details (the degree of success) about the teaching described in discourse segment 2. Figure 1 Coherence graph for example (20). contr = contrast; elab = elaboration.</Paragraph>
      <Paragraph position="4"> In the resultant coherence structure for (20), there is a crossed dependency between {3, 1} and {4, 2}.</Paragraph>
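      <Paragraph> As a minimal sketch of the crossing test applied informally to example (20), assume that each arc is a pair of segment numbers and that two arcs cross when exactly one endpoint of one arc lies strictly between the endpoints of the other. The names crosses and arcs_20 are illustrative only and are not part of the annotation procedure described in this article; arcs_20 contains just the two elaboration arcs described for example (20).

```python
from itertools import combinations


def crosses(arc_a, arc_b):
    """Return True if two arcs over linearly ordered segments cross."""
    lo_a, hi_a = sorted(arc_a)
    lo_b, hi_b = sorted(arc_b)
    # Crossing: the arcs interleave (lo_a, lo_b, hi_a, hi_b) in either order;
    # nested arcs and arcs that merely share an endpoint do not cross.
    return hi_b > hi_a > lo_b > lo_a or hi_a > hi_b > lo_a > lo_b


# Assumption: only the two elaboration arcs described for example (20).
arcs_20 = [(3, 1), (4, 2)]

crossing_pairs = [(a, b) for a, b in combinations(arcs_20, 2) if crosses(a, b)]
print(crossing_pairs)  # [((3, 1), (4, 2))]
```

      Under these assumptions, the only crossing pair reported is {3, 1} with {4, 2}, which matches the crossed dependency stated for example (20).</Paragraph>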
      <Paragraph position="5"> In order to be able to represent a structure like the one for (20) in a tree without violating validity assumptions about tree structures (Diestel 2000), one might consider augmenting a tree either with feature propagation (Shieber 1986) or with a coindexation mechanism (Chomsky 1973). There is a problem, however, with both feature propagation and coindexation mechanisms: The tree structure itself and the features or coindexations all represent the same kind of information (coherence relations). It is unclear how a dividing line could be drawn between tree structures and their augmentation. That is, it is unclear how one could decide which part of a text coherence structure should be represented by the tree structure and which part should be represented by the augmentation. Other areas of linguistics have faced this issue as well. Researchers investigating data structures for representing intrasentential structure, for instance, generally fall into two groups. One group tries to formulate principles that allow representation of some aspects of structure in the tree itself and other aspects in some augmentation formalism (e.g., Chomsky 1973; Marcus et al. 1994). Another group argues that it is more parsimonious to assume a unified dependency-based representation that drops the tree constraint of allowing no crossed dependencies (e.g., Brants et al. 2002; Skut et al. 1997; König and Lezius 2000). Our approach falls into the latter group. As we point out, there does not seem to be a well-defined set of constraints on crossed dependencies in discourse structures. Without such constraints, it does not seem viable to represent discourse structures as augmented tree structures.</Paragraph>
      <Paragraph position="6"> An important question is how many different kinds of crossed dependencies occur in naturally occurring discourse. If there are only a very limited number of different structures with crossed dependencies in natural texts, one could make special provisions to account for these structures and otherwise assume tree structures. Example (20), for instance, has a listlike structure. It is possible that listlike examples are exceptional in natural texts. However, there are many other naturally occurring nonlistlike structures that contain crossed dependencies. As an example of a nonlistlike structure with a crossed dependency (between {4, 2} and {3, 1-2}), consider example (21) (constructed): (21) 1. Susan wanted to buy some tomatoes 2. and she also tried to find some basil 3. because her recipe asked for these ingredients.</Paragraph>
      <Paragraph position="7"> 4. The basil would probably be quite expensive at this time of the year. The coherence structure for (21), shown in Figure 2, can be derived as follows: • Similarity relation between 1 and 2: 1 and 2 both describe shopping for grocery items.</Paragraph>
      <Paragraph position="8"> • Cause-effect relation between 3 and 1-2: 3 describes the cause for the shopping described by 1 and 2.</Paragraph>
      <Paragraph position="9">  (22) 1. The flight Sunday took off from Heathrow Airport at 7:52pm 2. and its engine caught fire 10 minutes later, 3. the Department of Transport said.</Paragraph>
      <Paragraph position="10"> 4. The pilot told the control tower he had the engine fire under control.</Paragraph>
      <Paragraph position="11"> The coherence structure for example (22) can be derived as follows: • Temporal sequence relation between 1 and 2: 1 describes the takeoff that happens before the engine fire described by 2 occurs.</Paragraph>
      <Paragraph position="12"> • Attribution relation between 3 and 1-2: 3 mentions the source of what is said in 1-2.</Paragraph>
      <Paragraph position="13"> • Elaboration relation between 4 and 2: 4 provides more detail about the engine fire in 2.</Paragraph>
      <Paragraph position="14"> The resulting coherence structure, shown in Figure 3, contains a crossed dependency between {4, 2} and {3, 1-2}.</Paragraph>
      <Paragraph position="15"> Consider example (23) (from wsj 0655; Wall Street Journal 1989 corpus [Harman and Liberman 1993]): Figure 4 Coherence graph for example (23). expv = violated expectation; elab = elaboration; attr = attribution. Figure 5 Coherence graph for example (23) with discourse segment 1 split into two segments. expv = violated expectation; elab = elaboration; attr = attribution. Figure 6 Tree-based RST annotation for example (23) from Carlson, Marcu, and Okurowski (2002). Broken lines represent the start of asymmetric coherence relations; continuous lines represent the end of asymmetric coherence relations; symmetric coherence relations have two continuous lines (cf. section 2.3). attr = attribution; elab = elaboration.</Paragraph>
      <Paragraph position="16"> The annotations based on our annotation scheme with the discourse segmentation based on the segmentation guidelines in Carlson, Marcu, and Okurowski (2002) are presented in Figure 4, and those with the discourse segmentation based on our segmentation guidelines from section 2.1 are presented in Figure 5. Figure 6 shows a tree-based RST annotation for example (23) from Carlson, Marcu, and Okurowski (2002). The only difference between our approach and that of Carlson, Marcu, and Okurowski with respect to how example (23) is segmented is that Carlson and her colleagues assume discourse segment 1 to be one single segment. By contrast, based on our segmentation guidelines, discourse segment 1 would be segmented into two segments (because of the comma that does not separate a complex NP or VP), 1a and 1b, as indicated by the brackets in example (24):  4 Based on our segmentation guidelines, the complementizer that in discourse segment 3 would be part of discourse segment 2 instead (cf. (15)). However, since this would not make a difference in terms of the resulting discourse structure, we do not provide alternative analyses with that as part of discourse segment 2 instead of discourse segment 3.</Paragraph>
      <Paragraph position="17">  The coherence structure for example (23) can be derived as follows: • If discourse segment 1 is segmented into 1a and 1b (following our discourse segmentation guidelines), elaboration relation between 1a and 1b: 1b provides additional detail (a name) about what is stated in 1a (Mr. Baker's assistant).</Paragraph>
      <Paragraph position="18"> • Same relation between 1 (or 1a) and 4: The subject NP in 1 (Mr. Baker's assistant) is separated from its predicate in 4 (acknowledged) by intervening discourse segments 2 and 3 (and 1b in our discourse segmentation).</Paragraph>
      <Paragraph position="19"> • Attribution relation between 2 and 3: 2 states the source (the elided Mr. Baker) of what is stated in 3.</Paragraph>
      <Paragraph position="20"> • Elaboration relation between the group of discourse segments 2 and 3 and discourse segment 1 (or the group of discourse segments 1a and 1b in our discourse segmentation): 2 and 3 state additional detail (a statement about a political process) about what is stated in 1 (or 1a and 1b) (Mr. Baker's assistant).</Paragraph>
      <Paragraph position="21"> • Attribution relation between 4 (and by virtue of the same relation, also  • Violated expectation relation between the group of discourse segments 2 and 3 and the group of discourse segments 4 and 5: Although Mr. Baker's assistant acknowledges cease-fire violations by one side (discourse segments 2 and 3), he acknowledges that it is in fact difficult to clearly blame one side for cease-fire violations (discourse segments 4 and 5).</Paragraph>
      <Paragraph position="22"> The resulting coherence structure, shown in Figure 4 (discourse segmentation from Carlson, Marcu, and Okurowski [2002]) and Figure 5 (our discourse segmentation), contains a crossed dependency: The same relation between discourse segment 1 and discourse segment 4 crosses the violated expectation relation between the group of discourse segments 2 and 3 and the group of discourse segments 4 and 5.</Paragraph>
      <Paragraph position="23"> Figure 6 represents a tree-based RST annotation for example (23) from Carlson, Marcu, and Okurowski (2002); in Figure 6, dashed lines represent the start of asymmetric coherence relations and continuous lines mark the end of asymmetric coherence relations; symmetric coherence relations have two continuous lines (cf. section 2.3 for the distinction between symmetric and asymmetric coherence relations and for the directions of asymmetric coherence relations). Carlson, Marcu, and Okurowski (2002) do not provide descriptions of how they derived tree-based RST structures for their examples that are used in this article. Therefore, instead of discussing how the tree-based RST structures were derived, we show comparisons of the RST structure and our chain-graph-based structure; the comparison for (23) is provided in Table 5. Note in particular that the RST structure for example (23) does not represent the violated expectation relation between 2-3 and 4-5; that relation could not be annotated without violating the tree constraint of not allowing crossed dependencies.</Paragraph>
      <Paragraph position="24"> Table 5 Comparison for example (23) of tree-based RST structure from Carlson, Marcu, and Okurowski (2002) and our chain-graph-based structure.
        Tree-based RST structure | Our chain-graph-based structure
        (1a and 1b are one discourse segment) | Elaboration between 1a and 1b
        Same between 1-2 and 4 | Same between 1 (or 1a) and 4
        Attribution between 1 and 2 | Attribution between 1 and 2
        Elaboration between 2-3 and 1 | Elaboration between 2-3 and 1 (or 1a and 1b)
        Attribution between 1-4 and 5 | Attribution between 4 and 5
        (no relation) | Violated expectation between 2-3 and 4-5
        Figure 7 Coherence graph for example (25). cond = condition; attr = attribution; elab = elaboration.</Paragraph>
    </Section>
    <Section position="2" start_page="467" end_page="467" type="sub_section">
      <SectionTitle>
3.2 Nodes with Multiple Parents
</SectionTitle>
      <Paragraph position="0"> In addition to including crossed dependencies, many coherence structures of natural texts include nodes with multiple parents. Such nodes cannot be represented in tree structures. Consider example (25) (from ap890103 = 0014; AP Newswire 1989 corpus [Harman and Liberman 1993]).</Paragraph>
      <Paragraph position="1">  (25) 1. "Sure I'll be polite," 2. promised one BMW driver 3. who gave his name only as Rudolf.</Paragraph>
      <Paragraph position="2"> 4. "As long as the trucks and the timid stay out of the left lane."  The coherence structure for example (25) can be derived as follows: • Attribution relation between 2 and 1 and between 2 and 4: 2 states the source of what is stated in 1 and 4, respectively.</Paragraph>
      <Paragraph position="3"> • Elaboration relation between 3 and 2: 3 provides additional detail (the name) about the BMW driver in 2.</Paragraph>
      <Paragraph position="4"> • Condition relation between 4 and 1: 4 states the BMW driver's condition for being polite, stated in 1.</Paragraph>
      <Paragraph position="5">  This condition relation is also indicated by the phrase "as long as." In the resultant coherence structure for example (25), node 1 has two parents: one attribution and one condition ingoing arc (cf. Figure 7). 5 A cultural reference: In Germany, when driving on a highway, it is only lawful to pass on the left side. Thus, Rudolf is essentially saying that he will be polite as long as the trucks and the timid do not keep him from passing other cars.</Paragraph>
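      <Paragraph> The multiple-parent check for example (25) can be sketched as a count of ingoing arcs per node. The triples below encode the four relations listed for (25); the arc directions are an assumption that follows the description of node 1 as having one attribution and one condition ingoing arc, and the names arcs_25 and parents_by_node are illustrative only.

```python
from collections import defaultdict

# (source, target, relation) triples for example (25).
# Assumption: each arc points from the first-mentioned segment of a
# relation to the second, so that node 1 receives attr and cond arcs.
arcs_25 = [
    (2, 1, "attr"),
    (2, 4, "attr"),
    (3, 2, "elab"),
    (4, 1, "cond"),
]


def parents_by_node(arcs):
    """Map each target node to the list of (source, relation) ingoing arcs."""
    parents = defaultdict(list)
    for source, target, relation in arcs:
        parents[target].append((source, relation))
    return dict(parents)


parents = parents_by_node(arcs_25)
# A tree allows at most one parent per node; keep the nodes that violate this.
multi_parent_nodes = {node: p for node, p in parents.items() if len(p) > 1}
print(multi_parent_nodes)  # {1: [(2, 'attr'), (4, 'cond')]}
```

      Under these assumptions, node 1 is the only node with more than one ingoing arc, which is what rules out a tree representation of (25).</Paragraph>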
      <Paragraph position="6">  Figure 8 Coherence graph for example (26). Additional coherence relation used (from Carlson, Marcu, and Okurowski [2002]): evaluation-s = the situation presented in the satellite assesses the situation presented in the nucleus (evaluation-s would be elaboration in our annotation scheme).</Paragraph>
      <Paragraph position="8"> Figure 9 Coherence graph for example (26) with discourse segments 1 and 2 merged into one single discourse segment. Additional coherence relation used (from Carlson, Marcu, and Okurowski [2002]): evaluation-s = the situation presented in the satellite assesses the situation presented in the nucleus (evaluation-s would be elaboration in our annotation scheme). attr = attribution; cond = condition.</Paragraph>
      <Paragraph position="9"> As another example of a discourse structure that contains nodes with multiple parents, consider the structure of example (26) (from wsj 0655; Wall Street Journal 1989 corpus [Harman and Liberman 1993]):  (26) (they in 4 and 6 = Contra supporters; this is clear from the whole text wsj 0655) 1. "The administration should now state 2. that 3. if the February election is voided by the Sandinistas 4. they should call for military aid," 5. said former Assistant Secretary of State Elliott Abrams.</Paragraph>
      <Paragraph position="10"> 6. "In these circumstances, I think they'd win."  Our annotations are shown in Figures 8 (discourse segmentation from Carlson, Marcu, and Okurowski [2002]) and 9 (our discourse segmentation); Carlson et al.'s (2002) tree-based RST annotation is shown in Figure 10. The only difference between our annotation and that of Carlson, Marcu, and Okurowski is that we do not assume two separate discourse segments for 1 and 2; 1 and 2 are one discourse segment in our annotation (represented by the node 1+2 in Figure 9). Note also that in discourse segment 3 of example (23) "that" is not in a separate discourse segment; it is unclear why in example (26), "that" is in a separate discourse segment (discourse segment 2) and not part of discourse segment 3. The discourse structure for example (26) can be derived as follows:  1. According to our discourse segmentation guidelines (cf. section 2.1), 1 and 2 should be one single discourse segment. Therefore, either there is a same relation between 1 and 2 (cf. Figure 8), or 1 and 2 are merged into one single discourse segment, 1+2 (cf. Figure 9).</Paragraph>
      <Paragraph position="11">  Figure 10 Tree-based RST annotation for example (26) from Carlson, Marcu, and Okurowski (2002). Broken lines represent the start of asymmetric coherence relations; continuous lines represent the end of asymmetric coherence relations; symmetric coherence relations have two continuous lines (cf. section 2.3). Additional coherence relation used (from Carlson, Marcu, and Okurowski [2002]): evaluation-s = the situation presented in the satellite assesses the situation presented in the nucleus (evaluation-s would be elaboration in our annotation scheme). attr = attribution; cond = condition.</Paragraph>
      <Paragraph position="12">  2. Attribution relation between 1 or 1+2 and 3-4: 1 or 1+2 state the source (the administration) of what is stated in 3-4.</Paragraph>
      <Paragraph position="13"> 3. Condition relation between 3 and 4: 3 states the condition for what is stated in 4 (the condition relation is also signaled by the cue phrase if in 3). 4. Attribution relation between 5 and 1-4: 5 states the source of what is stated in 1-4.</Paragraph>
      <Paragraph position="14"> 5. Attribution relation between 5 and 6: 5 states the source of what is stated in 6. 6. Evaluation-s relation between 6 and 3-4: 3-4 state what is evaluated by 6: the Contra supporters should call for military aid, and if the February election is voided (group of discourse segments 3-4), the Contra supporters might win (discourse segment 6). Note that in our annotation scheme, the evaluation-s relation would be an elaboration relation (6 provides additional detail about 3-4: Elliott Abrams's opinion on the Contras' chances of winning).</Paragraph>
      <Paragraph position="15"> In the resultant coherence structure for example (26), node 3-4 has multiple parents or ingoing arcs: one attribution ingoing arc and one evaluation-s ingoing arc (cf. Figures 8 and 9).</Paragraph>
      <Paragraph position="16"> Table 6 presents a comparison of the RST annotation and our chain-graph-based annotation for (26). Note in particular that the attribution relation between 5 and 6 cannot be represented in the RST tree structure. Note furthermore that the RST tree contains an evaluation-s relation between 6 and 1-5. However, this evaluation-s relation seems to hold rather between 6 and 3-4: What is being evaluated is a chance for the Contras to win a military conflict under certain circumstances. But a coherence relation between 6 and 3-4 could not have been annotated in a tree structure. 6 The relation evaluation-s is part of the annotation scheme in Carlson, Marcu, and Okurowski (2002) but not part of our annotation scheme. In an evaluation-s relation, the situation presented in the satellite assesses the situation presented in the nucleus (Carlson, Marcu, and Okurowski 2002). An evaluation-s relation would be an elaboration relation in our annotation scheme.</Paragraph>
      <Paragraph position="17">  Table 6 Comparison for (26) of tree-based RST structure (from Carlson, Marcu, and Okurowski [2002]) and our chain-graph-based structure.</Paragraph>
      <Paragraph position="18"> Tree-based RST structure | Our chain-graph-based structure
        Same between 2 and 3-4 | Same between 1 and 2, or merging of 1 and 2 to 1+2
        Attribution between 1 and 2-4 | Attribution between 1 or 1+2 and 3-4
        Condition between 3 and 4 | Condition between 3 and 4
        Attribution between 5 and 1-4 | Attribution between 5 and 1-4
        (no relation) | Attribution between 5 and 6
        Evaluation-s between 6 and 1-5 | Evaluation-s between 6 and 3-4</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="467" end_page="467" type="metho">
    <SectionTitle>
4. Statistics
</SectionTitle>
    <Paragraph position="0"> We performed a number of statistical analyses on our annotated database to test our hypotheses. Each set of statistics was calculated for both annotators separately. However, since the statistics for both annotators were never different from each other (as confirmed by significant R  s &gt; 1), we report only the statistics for one annotator in the following sections.</Paragraph>
    <Paragraph position="1"> An important question is how frequent the phenomena discussed in the previous sections are. The more frequent they are, the more urgent the need for a data structure that can adequately represent them. The following sections report statistical results on crossed dependencies (section 4.1) and nodes with multiple parents (section 4.2).</Paragraph>
    <Section position="1" start_page="467" end_page="467" type="sub_section">
      <SectionTitle>
4.1 Crossed Dependencies
</SectionTitle>
      <Paragraph position="0"> The following sections report counts on crossed dependencies in the annotated database of 135 texts (cf. section 1). Section 4.1.1 reports results on the frequency of crossed dependencies, section 4.1.2 reports results concerning the question of what types of coherence relations tend to be involved in crossed dependencies, and section 4.1.3 reports results on the arc lengths of coherence relations involved in crossed dependencies. Section 4.1.4 provides a short summary of the statistical results on crossed dependencies.</Paragraph>
      <Paragraph position="1">  dependencies for the coherence structure graph of each text, we counted the minimum number of arcs that would have to be deleted in order to eliminate crossed dependencies in the coherence structure. Figure 11 illustrates this process. The example graph depicted in the figure contains the following crossed dependencies: {1, 3} crosses with {2, 4}, {3, 5} with {2, 4}, and {5, 7} with {6, 8}. By deleting {2, 4}, two crossed dependencies can be eliminated: the crossing of {1, 3} with {2, 4} and the crossing of {3, 5} with {2, 4}. By deleting either {5, 7} or {6, 8} the remaining crossed dependency between {5, 7} and {6, 8} can be eliminated. Therefore two edges would have to be deleted from the graph in Figure 11 in order to make it free of crossed dependencies.  12.5% of arcs in a coherence graph have to be deleted in order to make the graph free of crossed dependencies. Seven texts out of the 135 had no crossed dependencies. The mean number of arcs for the coherence graphs of these texts was 36.9 (minimum: 8, maximum: 69, median: 35). The mean number of arcs for the other 128 coherence graphs (those with crossed dependencies) was 125.7 (minimum: 20, maximum: 293, median: 115.5). Thus, the graphs with no crossed dependencies had significantly fewer arcs than the graphs that had crossed dependencies (χ² test with correction for continuity applied, p &lt; 10⁻⁶).</Paragraph>
      <Paragraph position="3"> This is a likely explanation for why these seven texts had no crossed dependencies.</Paragraph>
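      <Paragraph> The arc-deletion count illustrated with Figure 11 can be sketched as an exhaustive search over deletion sets. How the minimum was computed for the 135 annotated texts is not specified in this section, so the brute-force search below is an assumption that is practical only for small graphs; the names crosses, has_crossing, and min_arcs_to_delete are illustrative, and the arcs are assumed to be distinct pairs of segment numbers.

```python
from itertools import combinations


def crosses(arc_a, arc_b):
    """Return True if two arcs over linearly ordered segments cross."""
    lo_a, hi_a = sorted(arc_a)
    lo_b, hi_b = sorted(arc_b)
    return hi_b > hi_a > lo_b > lo_a or hi_a > hi_b > lo_a > lo_b


def has_crossing(arcs):
    """Return True if any pair of arcs in the list crosses."""
    return any(crosses(a, b) for a, b in combinations(arcs, 2))


def min_arcs_to_delete(arcs):
    """Try deletion sets of increasing size until no crossing remains.

    Exponential in the number of arcs; intended for toy graphs only.
    """
    for k in range(len(arcs) + 1):
        for removed in combinations(arcs, k):
            remaining = [arc for arc in arcs if arc not in removed]
            if not has_crossing(remaining):
                return k, removed


# The five arcs named in the description of Figure 11.
arcs_fig11 = [(1, 3), (2, 4), (3, 5), (5, 7), (6, 8)]
count, removed = min_arcs_to_delete(arcs_fig11)
print(count, removed)  # 2 ((2, 4), (5, 7))
```

      For the Figure 11 arcs the search returns two deletions, {2, 4} together with one of {5, 7} or {6, 8}, in line with the count given in the text.</Paragraph>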
      <Paragraph position="4"> More generally, linear regressions show a correlation between the number of arcs in a coherence graph and the number of crossed dependencies. The more arcs a graph has, the higher the number of crossed dependencies (R² = 0.39, p &lt; 10⁻⁴; cf. Figure 12). The same linear correlation holds between text length and number of crossed dependencies: The longer a text, the more crossed dependencies are in its coherence structure graph (for text length in discourse segments: R²  to the question of how frequent crossed dependencies are, another question is whether there are certain types of coherence relations that participate more or less frequently in crossed dependencies than other types of coherence relations. For an arc to participate in a crossed dependency, it must be in the set of arcs that would have to be deleted from a coherence graph in order to make that graph free of crossed dependencies (cf. the procedure outlined in section 4.1.1). In other words, the question is whether the frequency distribution over types of coherence relations is different for arcs participating in crossed dependencies compared to the overall frequency distribution over types of coherence relations in the whole database.</Paragraph>
      <Paragraph position="5"> Figure 13 shows that the overall distribution over types of coherence relations participating in crossed dependencies is not different from the distribution over types of coherence relations overall. This is confirmed by the results of a linear regression, which show a significant correlation between the two distributions of percentages (R² = 0.84, p &lt; .0001). Note that the overall distribution includes only arcs with length greater than one, since arcs of length one cannot participate in crossed dependencies. However, there are some differences for individual coherence relations. Some types of coherence relations occur considerably less frequently in crossed dependencies than overall in the database. Table 8 shows the data from Figure 13 ranked by the ratio of "percentage of overall coherence relations" to "percentage of coherence relations participating in crossed dependencies." The proportion of same relations, for instance, is 15.23 times greater, and the percentage of condition relations is 5.59 times greater, overall in the database than in crossed dependencies. We do not yet understand the reason for these differences and plan to address this question in future research. Another way of testing whether certain coherence relations contribute more than others to crossed dependencies is to remove coherence relations of a certain type from the database and then count the remaining number of crossed dependencies. For example, it is possible that the number of crossed dependencies is reduced once all elaboration relations are removed from the database. Table 9 shows that by removing all elaboration relations from the database of 135 annotated texts, the percentage of coherence relations involved in crossed dependencies is reduced from 12.5% to 4.96% of the remaining coherence relations.
That percentage is reduced even further, to 0.84%, by removing all elaboration and similarity relations from the database. These numbers seem to be partial support for Knott's (1996) hypothesis: Knott argued that elaboration relations are less constrained than other types of coherence relations (cf. the discussion of Knott [1996] in section 3).</Paragraph>
      <Paragraph position="6"> However, there is a possible alternative hypothesis to Knott's (1996). In particular, elaboration relations are very frequent (37.97% of all coherence relations; cf. Table 8). It is possible that removing elaboration relations from the database reduces the number of crossed dependencies only because a large number of coherence relations are removed when elaborations are removed. In other words, an alternative hypothesis to that of Knott (1996) is that the lower number of crossed dependencies is just due to less dense coherence graphs (i.e., the less dense coherence graphs are, the lower the chance for crossed dependencies). We tested this hypothesis by correlating the percentage of coherence relations removed with the percentage of crossed dependencies that remain after removing a certain type of coherence relation.</Paragraph>
      <Paragraph position="7">  Figure 14 shows that the higher the percentage of removed coherence relations, the lower the percentage of coherence relations becomes that are involved in crossed dependencies. This correlation is confirmed by a linear regression (R² = 0.7697, p &lt; .0005; after removing the elaboration data point: R² = 0.4504, p &lt; .05; these linear regressions do not include the data point elaboration + similarity). Thus, although removing certain types of coherence relations reduces the number of crossed dependencies, it results in a very impoverished representation of coherence structure (i.e., after removing all elaboration and all similarity relations, only 39.12% of all coherence relations would still be represented [cf. Table 8]; the figure is 52.13% based on the distribution over coherence relations including those with absolute arc length one [cf. Table 11]).</Paragraph>
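      <Paragraph> The R² values reported for such regressions can be computed as in the following minimal sketch. For a simple linear regression with one predictor, R² equals the squared Pearson correlation between the two variables. The function name r_squared and the toy numbers are hypothetical and are not taken from Table 9 or Figure 14.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a simple linear regression of ys on xs.

    With a single predictor this equals the squared Pearson correlation.
    """
    n = len(xs)
    if n != len(ys) or 2 > n:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Toy values only (hypothetical): percentage of relations removed versus
# percentage of remaining relations involved in crossed dependencies.
removed_pct = [1.0, 2.0, 3.0, 4.0]
remaining_crossed_pct = [8.0, 6.0, 4.0, 2.0]
print(r_squared(removed_pct, remaining_crossed_pct))
```
</Paragraph>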
      <Paragraph position="8"> With respect to Knott's (1996) hypothesis, note that leaving out elaboration relations still leaves the proportion of remaining crossed dependencies at 4.96% (cf. Table 9). 7 Note that the percentages of removed coherence relations do not include coherence relations of absolute arc length one, since removing those coherence relations cannot have any influence on the number of crossed dependencies (coherence relations of absolute arc length one cannot be involved in crossed dependencies). Thus, the percentages of coherence relations removed in Figure 14 are from the third column of Table 8.</Paragraph>
      <Paragraph position="9">  Figure 14 Correlation between removed percentage of overall coherence relations and remaining percentage of crossed dependencies. Note that the data point for elaboration + similarity is not included in the figure. R² = 0.7699, p &lt; .0005.</Paragraph>
      <Paragraph position="10"> In order to further reduce the proportion of remaining crossed dependencies, it is necessary to remove similarity relations in addition to removing elaboration relations (cf. Table 9). This is a pattern of results that is not predicted by any literature that we are aware of (including Knott [1996], among others, although he predicts these results partially). We believe this issue should be addressed in future research. Another question is how great the distance typically is between discourse segments that participate in crossed dependencies, or how great the arc length is for coherence relations that participate in crossed dependencies.</Paragraph>
      <Paragraph position="11">  It is possible, for instance, that crossed dependencies primarily involve long-distance arcs and that more local crossed dependencies are disfavored. However, Figure 15 shows that the distribution over arc lengths is practically identical for the overall database and for coherence relations participating in crossed dependencies (linear regression: R² = 0.937, p &lt; 10⁻⁴), suggesting a strong locality bias for coherence relations overall as well as for those participating in crossed dependencies.</Paragraph>
      <Paragraph position="12">  The arc lengths are normalized in order to take into account the varying length of texts. Normalized arc length is calculated by dividing the absolute length of an arc by the maximum length that that arc could have, given its position in its text. For example, if there is a coherence relation between discourse segment 1 and discourse segment 4 in a text, the raw distance between them would be three. If these discourse segments are part of a text that has five discourse segments total (i.e., 1 to 5), the normalized distance would be 3/4 = 0.75 (because four would be the maximum possible length of an arc that originates in discourse segment 1 or 4, given that the text has five discourse segments in total).</Paragraph>
      <Paragraph position="13"> 8 The distance between two discourse segments is not measured in terms of how many coherence links one has to follow from any discourse segment x to any discourse segment y to which discourse segment x is related via a coherence relation. Instead, distance is measured in terms of the number of intervening discourse segments. Thus, distance between nodes reflects linear distance between two discourse segments in a text. For example, the distance between a discourse segment 1 and a discourse segment 4 would be three. 9 The arc length distribution for the database overall does not include arcs of (absolute) length one, since such arcs cannot participate in crossed dependencies.</Paragraph>
      <Paragraph position="14">  Computational Linguistics Volume 31, Number 2 Figure 15 Comparison of normalized arc length distributions. For each condition (&quot;overall statistics&quot; and &quot;crossed-dependencies statistics&quot;), the sum over all coherence relations is 100; each bar in each condition represents a fraction of the total of 100 in that condition.</Paragraph>
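      <Paragraph> The normalization of arc length can be sketched as follows. The sketch assumes one reading of "the maximum length that that arc could have": the longest arc that could originate in either endpoint, given the number of discourse segments in the text. This reading reproduces the worked example (segments 1 and 4 in a five-segment text), but the function name normalized_arc_length and the handling of other cases are assumptions.

```python
def normalized_arc_length(seg_a, seg_b, n_segments):
    """Absolute arc length divided by the longest arc either endpoint allows.

    Segments are numbered 1..n_segments. The denominator is an assumed
    reading of "maximum length that the arc could have, given its position".
    """
    in_range = n_segments >= seg_a >= 1 and n_segments >= seg_b >= 1
    if not in_range:
        raise ValueError("segment positions must lie in 1..n_segments")
    if seg_a == seg_b:
        raise ValueError("an arc must connect two different segments")
    raw_length = abs(seg_a - seg_b)
    longest_possible = max(
        seg_a - 1, n_segments - seg_a,  # longest arc starting at seg_a
        seg_b - 1, n_segments - seg_b,  # longest arc starting at seg_b
    )
    return raw_length / longest_possible


# Worked example from the text: segments 1 and 4 in a text of 5 segments.
print(normalized_arc_length(1, 4, 5))  # 0.75
```
</Paragraph>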
      <Paragraph position="15"> The results on crossed dependencies suggest that crossed dependencies are too frequent to be ignored by accounts of coherence. Furthermore, the results suggest that any type of coherence relation can participate in a crossed dependency. However, there are some cases in which knowing the type of coherence relation that an arc represents can be informative as to how likely that arc is to participate in a crossed dependency. The statistical results reported here also suggest that crossed dependencies occur primarily locally, as evidenced by the distribution over lengths of arcs participating in crossed dependencies.</Paragraph>
    </Section>
    <Section position="2" start_page="467" end_page="467" type="sub_section">
      <SectionTitle>
4.2 Nodes with Multiple Parents
</SectionTitle>
      <Paragraph position="0"> Section 3.2 provided examples of coherence structure graphs that contain nodes with multiple parents. In addition to crossed dependencies, nodes with multiple parents are another reason why trees are inadequate for representing natural language coherence structures. The following sections report statistical results from our database on nodes with multiple parents. As in the previous section on crossed dependencies, we report results on the frequency of nodes with multiple parents (section 4.2.1), the types of coherence relations ingoing to nodes with multiple parents (section 4.2.2), and the arc length of coherence relations ingoing to nodes with multiple parents (section 4.2.3). Section 4.2.4 provides a short summary of the statistical results on nodes with multiple parents.</Paragraph>
      <Paragraph position="1"> Table 10 In-degree of nodes in the overall database.</Paragraph>
      <Paragraph position="2">  Figure 16 Correlation between number of arcs and number of nodes with multiple parents.</Paragraph>
      <Paragraph position="3"> We determined the frequency of nodes with multiple parents by counting the number of nodes with in-degree greater than one. We assume nodes with in-degree greater than one in a graph to be the equivalent of nodes with multiple parents in a tree. The results of our count indicated that 41.22% of all nodes in the database have an in-degree greater than one. In addition to counting the number of nodes with in-degree greater than one, we determined the mean in-degree of the nodes in our database. Table 10 shows that the mean in-degree (= mean number of parents) of all nodes in the investigated database of 135 texts is 1.6. As for coherence relations involved in crossed dependencies (cf. section 4.1.1), a linear regression showed a significant correlation between the number of arcs in a coherence graph and the number of nodes with multiple parents (cf. Figure 16; R² [...]). The proportion of nodes with in-degree greater than one and the mean in-degree of the nodes in our database suggest that even if a mechanism could be derived for representing crossed dependencies in (augmented) tree graphs, nodes with multiple parents present another significant problem for trees representing coherence structures. As with crossed dependencies, an important question is whether there are certain types of coherence relations that are more or less frequently ingoing to nodes with multiple parents than other types of coherence relations. In other words, the question is whether the frequency distribution over types of coherence relations is different for arcs ingoing to nodes with multiple parents compared to the overall frequency distribution over types of coherence relations in the whole database. Figure 17 shows that the overall distribution over types of coherence relations ingoing to nodes with multiple parents is not different from the distribution over types of coherence relations overall.</Paragraph>
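      <Paragraph> Counting nodes with in-degree greater than one and computing the mean in-degree can be sketched as follows. The sketch assumes that a coherence graph is given as a list of node identifiers plus directed (source, target) arcs, and that both statistics are taken over all listed nodes; which nodes enter the reported mean is not stated here, so that choice is an assumption. The function name in_degree_stats is hypothetical.

```python
from collections import Counter


def in_degree_stats(nodes, arcs):
    """Return (mean in-degree, percentage of nodes with in-degree above one).

    nodes: iterable of node identifiers (assumed to list every node).
    arcs:  iterable of (source, target) pairs; an arc counts toward the
           in-degree of its target.
    """
    nodes = list(nodes)
    if not nodes:
        raise ValueError("at least one node is required")
    in_degree = Counter(target for _source, target in arcs)
    degrees = [in_degree[node] for node in nodes]
    mean_in_degree = sum(degrees) / len(degrees)
    pct_multi_parent = 100.0 * sum(1 for d in degrees if d > 1) / len(degrees)
    return mean_in_degree, pct_multi_parent


# Toy graph (hypothetical data): node 3 has two parents.
print(in_degree_stats([1, 2, 3, 4], [(1, 3), (2, 3), (3, 4)]))
```
</Paragraph>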
      <Paragraph position="4">  This is confirmed by the results of a linear regression, which show [...]. 10 Note that, unlike in section 4.1.2, the distribution over coherence relations for all coherence relations includes arcs with length one, since there was in this case no reason to exclude them. Table 11 Proportion of coherence relations.</Paragraph>
      <Paragraph position="5"> Columns: Coherence relation | Percentage of coherence relations ingoing to nodes with multiple parents | Percentage of overall coherence relations | Factor (= overall / ingoing to nodes with multiple parents). Unlike for crossed dependencies (cf. Table 8), there are no big differences for individual coherence relations. Table 11 shows the data from Figure 17, ranked by the factor obtained by dividing &quot;percentage of overall coherence relations&quot; by &quot;percentage of coherence relations ingoing to nodes with multiple parents.&quot; As for crossed dependencies, we also tested whether removing certain kinds of coherence relations reduced the mean in-degree (number of parents) and/or the percentage of nodes with in-degree greater than one (more than one parent). Table 12 shows that removing all elaboration relations from the database reduces the mean in-degree of nodes from 1.60 to 1.238 and the percentage of nodes with in-degree greater than one from 41.22% to 20.29%. Removing all elaboration as well as all similarity relations reduces these numbers further to 1.142 and 11.24%, respectively. As Table 12 also shows, removing other types of coherence relations does not lead to as great a reduction in the mean in-degree and the percentage of nodes with in-degree greater than one. However, as with crossed dependencies (cf. section 4.1.2), we also tested whether the reduction in nodes with multiple parents could simply be due to removing more and more coherence relations (i.e., the less dense a graph is, the smaller the chance that there are nodes with multiple parents). We correlated the percentage of coherence relations removed with the mean in-degree of the nodes after removing different types of coherence relations.</Paragraph>
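      <Paragraph> The removal test described above can be sketched as follows: drop all arcs of the chosen relation types, then recompute the percentage of arcs removed, the mean in-degree, and the percentage of nodes with in-degree greater than one. The sketch assumes arcs are (source, target, relation) triples and that statistics are taken over all listed nodes; the function name ablate_relations and the toy graph are hypothetical and do not reproduce the values in Table 12.

```python
from collections import Counter


def ablate_relations(nodes, arcs, removed_relations):
    """Remove arcs of the given relation types and recompute in-degree statistics.

    arcs: list of (source, target, relation) triples.
    Returns (percentage of arcs removed, mean in-degree after removal,
             percentage of nodes with in-degree above one after removal).
    """
    nodes = list(nodes)
    if not nodes or not arcs:
        raise ValueError("nodes and arcs must be non-empty")
    kept = [arc for arc in arcs if arc[2] not in removed_relations]
    pct_removed = 100.0 * (len(arcs) - len(kept)) / len(arcs)
    in_degree = Counter(target for _source, target, _relation in kept)
    mean_in_degree = sum(in_degree[node] for node in nodes) / len(nodes)
    pct_multi_parent = 100.0 * sum(1 for node in nodes if in_degree[node] > 1) / len(nodes)
    return pct_removed, mean_in_degree, pct_multi_parent


# Toy graph (hypothetical data, not from the annotated database).
toy_nodes = [1, 2, 3, 4]
toy_arcs = [
    (1, 3, "elaboration"),
    (2, 3, "condition"),
    (3, 4, "elaboration"),
    (1, 4, "similarity"),
]
print(ablate_relations(toy_nodes, toy_arcs, {"elaboration"}))
```

 Repeating the call for each relation type yields one (percentage removed, mean in-degree) pair per type, which is the kind of input a regression over removal percentages would take.</Paragraph>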
      <Paragraph position="6">  Figure 18 shows that the higher the percentage of removed coherence relations, the lower the mean in-degree of the nodes in the database becomes. This correlation is confirmed by the results of a linear regression (R² = 0.8310, p &lt; .0005; note that these linear regressions do not include the data point elaboration + similarity). We also correlated the percentage of coherence relations removed with the percentage of nodes with in-degree greater than one after removing different types of coherence relations. Figure 19 shows that the higher the percentage of removed coherence relations, the lower the percentage of nodes with in-degree greater than one. This correlation is also confirmed by the results of a linear regression (R² = 0.8146, p &lt; .0005; note that these correlations do not include the data point elaboration + similarity). 11 Note that in the correlations in this section, the proportions of removed coherence relations include coherence relations of absolute arc length one, because removing these coherence relations also has an effect on the mean in-degree of nodes and the proportion of nodes with in-degree greater than one. Thus, the proportions of coherence relations removed in Figure 18 and in Figure 19 are from the third column of [...]. Figure 18 Correlation between percentage of removed coherence relations and mean in-degree of remaining nodes. Note that the data point for elaboration + similarity is not included in the figure.</Paragraph>
      <Paragraph position="7"> Thus, although removing certain types of coherence relations (the same ones as for crossed dependencies, i.e., elaboration and similarity; cf. section 4.1.2) can reduce the mean in-degree of nodes and the proportion of nodes with in-degree greater than one, the result is a very impoverished coherence structure. For example, after removing both elaboration and similarity relations, only 52.13% of all coherence relations would still be represented (cf. Table 11). Furthermore, note that this pattern of results is not predicted by any literature we are aware of, including Knott (1996), although he predicts the results partially (he predicts that removing elaboration relations is necessary, but not that removing similarity relations as well is necessary, in order to remove basically all nodes with multiple parents; cf. the discussion in the last paragraph of section 4.1.2). This issue will have to be investigated in future research. Figure 19 Correlation between percentage of removed coherence relations and percentage of nodes with in-degree &gt; 1. Note that the data point for elaboration + similarity is not included in the figure.</Paragraph>
      <Paragraph position="8">  As for crossed dependencies, we also compared arc lengths. Here, we compared the length of arcs that are ingoing to nodes with multiple parents to the overall distribution of arc lengths. Again, we compared normalized arc lengths (see section 4.1.3 for the normalization procedure). In contrast to the comparison for crossed dependencies, we included in this comparison arcs of (absolute) length one, because such arcs can be ingoing to nodes with either single or multiple parents. Figure 20 shows that the distribution over arc lengths is practically identical for the overall database and for arcs ingoing to nodes with multiple parents (linear regression: R² = 0.993, p &lt; 10⁻⁴), suggesting a strong locality bias for coherence relations overall as well as for those ingoing to nodes with multiple parents.</Paragraph>
      <Paragraph position="9"> The statistical results on nodes with multiple parents suggest that they are a frequent phenomenon and that they are not limited to certain kinds of coherence relations. However, as with crossed dependencies, removing certain kinds of coherence relations (elaboration and similarity) can reduce the mean in-degree of nodes and the proportion of nodes with in-degree greater than one. But also as with crossed dependencies, our data at present do not distinguish whether this reduction in nodes with multiple parents is due to a property of the coherence relations removed (elaboration and similarity) or whether it is just that removing more and more coherence relations simply reduces the chance for nodes to have multiple parents. We plan to address this question in future research. In addition to the results on frequency of nodes with multiple parents and types of coherence relations ingoing to nodes with multiple parents, the statistical results reported here suggest that ingoing arcs to nodes with multiple parents are primarily local. Figure 20 Comparison of normalized arc length distributions. For each condition (&quot;overall statistics&quot; and &quot;arcs ingoing to nodes with multiple parents&quot;), the sum over all coherence relations is 100; each bar in each condition represents a fraction of the total of 100 in that condition.</Paragraph>
    </Section>
  </Section>
</Paper>