File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/99/w99-0106_evalu.xml
Size: 7,303 bytes
Last Modified: 2025-10-06 14:00:39
<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0106"> <Title>Discourse Structure and Co-Reference: An Empirical Study</Title> <Section position="9" start_page="49" end_page="51" type="evalu"> <SectionTitle> 3.2.2 Results </SectionTitle> <Paragraph position="0"> The graph in Figure 3 shows the potentials of the Linear-k and Discourse-VT-k models to correctly determine co-referential links for each k from 1 to 20. The graph in Figure 4 represents the same potentials but focuses only on ks in the interval [2,9]. As these two graphs show, the potentials increase monotonically with k, the VT-k models always doing better than the Linear-k models. Eventually, for large ks, the potential performance of the two models converges to 100%.</Paragraph> <Paragraph position="1"> The graphs in Figures 3 and 4 also suggest resolution strategies for implemented systems. For example, the graphs suggest that by choosing to work with EDRAs of size 7, a discourse-based system has the potential of resolving more than 90% of the co-referential links in a text correctly. To achieve the same potential, a linear-based system needs to look back 8 units. If a system does not look back at all and attempts to resolve co-referential links only within the unit under scrutiny (k = 0), it has the potential to correctly resolve about 40% of the co-referential links.</Paragraph> <Paragraph position="2"> To provide a clearer idea of how the two models differ, Figure 5 shows, for each k, the value of the Discourse-VT-k potentials divided by the value of the Linear-k potentials. For k = 0, the potentials of both models are equal because both use only the unit in focus in order to determine co-referential links.</Paragraph> <Paragraph position="3"> For k = 1, the Discourse-VT-1 model is about 7% better than the Linear-1 model. 
As the value of k increases, the ratio Discourse-VT-k/Linear-k converges to 1.</Paragraph> <Paragraph position="4"> In Figures 6 and 7, we display the number of exceptions, i.e., co-referential links that Discourse-VT-k and Linear-k models cannot determine correctly. As one can see, over the whole corpus, for each k ≤ 3, the Discourse-VT-k models have the potential to determine correctly about 100 more co-referential links than the Linear-k models. As k increases, the performance of the two models converges. [Figure caption fragment: ... cannot be correctly determined by Discourse-VT-k and Linear-k models (0 ≤ k ≤ 20).]</Paragraph> <Paragraph position="5"> In order to assess the statistical significance of the difference between the potentials of the two models to establish correct co-referential links, we carried out a Paired-Samples T Test for each k. In general, a Paired-Samples T Test checks whether the mean of casewise differences between two variables differs from 0. For each text in the corpus and each k, we determined the potentials of both VT-k and Linear-k models to establish correct co-referential links in that text. For ks smaller than 4, the difference in potentials was statistically significant. For example, for k = 3, t = 3.345, df = 29, P = 0.002. For values of k larger than or equal to 4, the difference was no longer significant. 
These results are consistent with the graphs shown in Figures 3 to 7, which all show that the potentials of Discourse-VT-k and Linear-k models converge to the same value as the value of k increases.</Paragraph> <Paragraph position="6"> [Figure caption fragment: ... cannot be correctly determined by Discourse-VT-k and Linear-k models (1 ≤ k &lt; 10).]</Paragraph> <Section position="1" start_page="51" end_page="51" type="sub_section"> <SectionTitle> 3.3 Comparing the effort required to establish co-referential links 3.3.1 Method </SectionTitle> <Paragraph position="0"> The method described in Section 3.2.1 estimates the potential of Linear-k and Discourse-VT-k models to determine correct co-referential links by treating EDRAs as sets. However, from a computational perspective (and presumably, from a psycholinguistic perspective as well) it also makes sense to compare the effort required by the two classes of models to establish correct co-referential links. We estimate this effort using a very simple metric that assumes that the closer an antecedent is to a corresponding referential expression in the EDRA, the better. Hence, in estimating the effort to establish a co-referential link, we treat EDRAs as ordered lists. For example, using the Linear-9 model, to determine the correct antecedent of the referential expression the smaller company in unit 9 of Figure 1, it is necessary to search back through 4 units (to unit 5, which contains the referent Genetic Therapy). Had unit 5 been Mr. Cosset succeeds M. James Barrett, 50, we would have had to go back 8 units (to unit 1) in order to correctly resolve the RE the smaller company. 
In contrast, in the Discourse-VT-9 model, we go back only 2 units because unit 1 is two units away from unit 9 (EDRA_9(9) = 9, 8, 1, 7, 6, 5, 4, 3, 2).</Paragraph> <Paragraph position="1"> We consider that the effort e(M, a, EDRA_k) of a model M to determine correct co-referential links with respect to one referential expression a in unit u, given a corresponding EDRA of size k (EDRA_k(u)), is given by the number of units between u and the first unit in EDRA_k(u) that contains a co-referential expression of a.</Paragraph> <Paragraph position="2"> The effort e(M, C, k) of a model M to determine correct co-referential links for all referential expressions in a corpus of texts C using EDRAs of size k was computed as the sum of the efforts e(M, a, EDRA_k) of all referential expressions a in C.</Paragraph> <Paragraph position="3"> [Figure caption fragment: ... Discourse-VT-k models to determine correct co-referential links (0 &lt; k &lt; 100).]</Paragraph> </Section> </Section> <Section position="10" start_page="51" end_page="52" type="evalu"> <SectionTitle> 3.3.2 Results </SectionTitle> <Paragraph position="0"> Figure 8 shows the Discourse-VT-k and Linear-k efforts computed over all referential expressions in the corpus and all ks. It is possible, for a given referent a and a given k, that no co-referential link exists in the units of the corresponding EDRA_k.</Paragraph> <Paragraph position="1"> In this case, we consider that the effort is equal to k. As a consequence, for small ks the effort required to establish co-referential links is similar for both theories, because both can establish only a limited number of links. However, as k increases, the effort computed over the entire corpus diverges dramatically: using the Discourse-VT model, the search space for co-referential links is reduced by about 800 units for a corpus containing roughly 1200 referential expressions. 
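As an illustration only, the effort metric of Section 3.3.1 can be sketched in Python as follows. The sketch assumes that an EDRA is passed as a list of unit numbers in search order, beginning with the unit that holds the referential expression, and that the units containing a co-referential expression are known in advance. The names effort, corpus_effort, linear_9 and vt_9 are hypothetical and do not come from the study itself; the linear ordering is simply assumed to be the units in reverse textual order.

```python
from typing import Iterable, Sequence, Set, Tuple


def effort(edra: Sequence[int], antecedent_units: Set[int], k: int) -> int:
    """Effort for one referential expression (hypothetical helper).

    edra: unit numbers in search order, starting with the current unit.
    antecedent_units: units that contain a co-referential expression.
    Returns the position of the first EDRA unit holding an antecedent,
    or k when no unit of the EDRA holds one.
    """
    for distance, unit in enumerate(edra):
        if unit in antecedent_units:
            return distance
    return k


def corpus_effort(cases: Iterable[Tuple[Sequence[int], Set[int]]], k: int) -> int:
    """Sum of the per-expression efforts over all (edra, antecedents) pairs."""
    return sum(effort(edra, antecedents, k) for edra, antecedents in cases)


# Orderings for unit 9, following the example in Section 3.3.1.
linear_9 = [9, 8, 7, 6, 5, 4, 3, 2, 1]  # assumed: reverse textual order
vt_9 = [9, 8, 1, 7, 6, 5, 4, 3, 2]      # ordering quoted for the VT model

print(effort(linear_9, {5}, 9))  # antecedent in unit 5: 4
print(effort(linear_9, {1}, 9))  # antecedent only in unit 1: 8
print(effort(vt_9, {1}, 9))      # antecedent only in unit 1: 2
```

With the orderings from the example in Section 3.3.1, the sketch gives the distances quoted there: 4 and 8 units under the linear ordering, and 2 units under the VT ordering.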
A Paired-Samples T Test was performed for each k.</Paragraph> <Paragraph position="2"> For each text in the corpus and each k, we determined the effort of both VT-k and Linear-k models to establish correct co-referential links in that text. For all ks the difference in effort was statistically significant. For example, for k = 7, we obtained the values t = 3.51, df = 29, P = 0.001. These results are intuitive: because EDRAs are treated as ordered lists and not as sets, the effect of the discourse structure on establishing correct co-referential links is not diminished as k increases.</Paragraph> </Section> </Paper>