<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2033"> <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics. Conceptual Coherence in the Generation of Referring Expressions</Title> <Section position="5" start_page="259" end_page="260" type="evalu"> <SectionTitle> 4 Evaluation </SectionTitle> <Paragraph position="0"> It has been known at least since Dale and Reiter (1995) that the best distinguishing description is not always the shortest one. Yet brevity plays a part in all GRE algorithms, sometimes in a strict form (Dale, 1989), sometimes by letting the algorithm approximate the shortest description (for example, in Dale and Reiter's IA). This is also true of references to sets, the clearest example being Gardent's constraint-based approach, which always finds the description with the smallest number of logical operators. Such proposals do not take coherence (in our sense of the word) into account. This raises obvious questions about the relative importance of brevity and coherence in reference to sets.</Paragraph> <Paragraph position="1"> The evaluation took the form of an experiment comparing the output of our Coherence Model with that of the family of algorithms that have placed Brevity at the centre of content determination. Participants were asked to compare pairs of descriptions of one and the same target set, selecting the one they found more natural. Each description could be either optimally brief or not (±b) and either optimally coherent or not (±c). Non-brief descriptions took the form 'the A, the B and the C'.</Paragraph> <Paragraph position="2"> Brief descriptions 'aggregated' two disjuncts into one (e.g. 'the A and the D's', where D comprises the union of B and C). 
We expected to find that: H1: +c descriptions are preferred over −c.</Paragraph> <Paragraph position="3"> H2: (+c,−b) descriptions are preferred over ones that are (−c,+b).</Paragraph> <Paragraph position="4"> H3: +b descriptions are preferred over −b.</Paragraph> <Paragraph position="5"> Confirmation of H1 would be interpreted as evidence that, by taking coherence into account, our algorithm is on the right track.</Paragraph> <Paragraph position="6"> [Figure 4: Three old manuscripts were auctioned at Sotheby's. e1: One of them is a book, a biography of a composer. e2: The second, a sailor's journal, was published in the form of a pamphlet. It is a record of a voyage. e3: The third, another pamphlet, is an essay by Hume. (+c,−b) The biography, the journal and the essay were sold to a collector. (+c,+b) The book and the pamphlets were sold to a collector. (−c,+b) The biography and the pamphlets were sold to a collector. (−c,−b) The book, the record and the essay were sold to a collector.] If H3 were confirmed, then earlier algorithms were (also) on the right track by taking brevity into account. Confirmation of H2 would be interpreted as meaning that, in references to sets, conceptual coherence is more important than brevity (defined as the number of disjuncts in a disjunctive reference to a set). Materials, design and procedure. Six discourses were constructed, each introducing three entities. Each set of three could be described using all 4 possible combinations of ±b × ±c (see Figure 4). Entities were human in two of the discourses, and artefacts of various kinds in the remainder. Properties of entities were introduced textually; the order of presentation was randomised. A forced-choice task was used. Each discourse was presented with 2 possible continuations, each consisting of a sentence with a plural subject NP, and participants were asked to indicate the one they found more natural. The 6 comparisons corresponded to 6 sub-conditions: C1. Coherence constant a. 
(+c,−b) vs. (+c,+b) b. (−c,−b) vs. (−c,+b) C2. Brevity constant a. (+c,−b) vs. (−c,−b) b. (+c,+b) vs. (−c,+b) C3. Tradeoff/control a. (+c,−b) vs. (−c,+b) b. (−c,−b) vs. (+c,+b) Participants saw each discourse in a single condition. They were randomly divided into six groups, so that each discourse was used for a different condition in each group. Thirty-nine native English speakers, all undergraduates at the University of Aberdeen, took part in the study.</Paragraph> <Paragraph position="7"> Results and discussion. Results were coded according to whether a participant's choice was ±b and/or ±c. Table 4 displays response proportions. Overall, the conditions had a significant impact on responses, both by subjects (Friedman</Paragraph> <Paragraph position="9"> 30.2, p < .001). When coherence was kept constant (C1a and C1b), a response was no more likely to be +b than −b (C1a: χ² = .023, p = .8; C1b: χ² = .64, p = .4); the conditions C1a and C1b did not differ significantly</Paragraph> <Paragraph position="11"> where brevity was kept constant (C2a and C2b) resulted in very significantly higher proportions of</Paragraph> <Paragraph position="13"> observed between C2a and C2b (χ² = .08, p = .8).</Paragraph> <Paragraph position="14"> In the tradeoff case (C3a), participants were much more likely to select a +c description than a +b one (χ² = 39.0, p < .001); a majority opted for the (+c,+b) description in the control case</Paragraph> <Paragraph position="16"> The results strongly support H1 and H2, since participants' choices are impacted by Coherence.</Paragraph> <Paragraph position="17"> They do not indicate a preference for brief descriptions, a finding that echoes Jordan's (2000), to the effect that speakers often relinquish brevity in favour of observing task or discourse constraints. 
Since this experiment compared our algorithm against the current state of the art in references to sets, these results do not necessarily warrant the affirmation of the null hypothesis in the case of H3. We limited Brevity to the number of disjuncts, omitting negation, and varied description length only between 2 and 3 disjuncts. Longer or more complex descriptions might evince different tendencies. Nevertheless, the results show a strong impact of Coherence, compared to (a kind of) brevity, in strong support of the algorithm presented above, as a realisation of the Coherence Model.</Paragraph> </Section> </Paper>
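<!-- A minimal sketch of the design described under "Materials, design and procedure": the six sub-conditions are exactly the unordered pairs of the four (±c, ±b) description types, grouped by which factor is held constant. The function names are illustrative, and the cyclic rotation in `rotation` is an assumption: the text says only that each discourse was used for a different condition in each group, not how the assignment was made.

```python
from itertools import combinations

# The four description types as (coherent, brief) pairs.
DESCRIPTIONS = [(True, False), (True, True), (False, True), (False, False)]


def label(desc):
    """Render a (coherent, brief) pair in the paper's notation, e.g. (+c,-b)."""
    coherent, brief = desc
    return "({}c,{}b)".format("+" if coherent else "-", "+" if brief else "-")


def classify(first, second):
    """Name the comparison type for two distinct description types."""
    if first[0] == second[0]:
        return "C1 coherence constant"
    if first[1] == second[1]:
        return "C2 brevity constant"
    return "C3 tradeoff/control"


def sub_conditions():
    """Enumerate all unordered pairs of the four description types."""
    return [
        (label(a), label(b), classify(a, b))
        for a, b in combinations(DESCRIPTIONS, 2)
    ]


def rotation(n=6):
    """Hypothetical cyclic assignment: group g sees discourse d in condition (g + d) % n."""
    return [[(g + d) % n for d in range(n)] for g in range(n)]


if __name__ == "__main__":
    for first, second, kind in sub_conditions():
        print(kind, first, "vs.", second)
```

Under this assumed rotation, every group sees each condition once and every discourse appears in each condition once across groups, which is one way to satisfy the constraint stated in the text. -->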