File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/06/w06-1417_evalu.xml
Size: 3,707 bytes
Last Modified: 2025-10-06 13:59:51
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1417"> <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics Generation of Biomedical Arguments for Lay Readers</Title> <Section position="9" start_page="117" end_page="118" type="evalu"> <SectionTitle> 7 Experiment </SectionTitle> <Paragraph position="0"> Argument generation was evaluated in the following experiment. Five biology graduate students, screened beforehand for writing ability in biology, were shown two patient letters. The letters were created by the experimenter by paraphrasing the output of discourse generation that would be input to the realizer. The paraphrases are similar in syntax and lexical style to letters in the corpus, but the genetic disorders covered in the experiment's letters differ from those covered in the corpus. One letter concerns a child confirmed to have cystic fibrosis (CF); the other a child whose test results for Waardenburg syndrome (WS) were negative.</Paragraph> <Paragraph position="1"> The text of letter CF is given in Figure 2. The first column contains annotations describing the communicative function of the information: C for claim, D for data, W for warrant, and B for backing. Each label is subscripted with an integer referring to the argument. (The row labeled C functions as both the claim of argument 2 and the data of argument 3.) Annotations were not shown to the experiment's participants. Communicative function was used to determine presentation order within each argument. Letters CF and WS had 23 and 25 segments, respectively, where a segment is defined as a unit fulfilling one of the above functions, or a non-argument-related function.</Paragraph> <Paragraph position="2"> The goal of the experiment was to conduct a preliminary evaluation of the acceptability of the arguments in terms of content, explicitness, and presentation order within arguments. 
The participants were asked to revise each letter as needed to make it more appropriate for its intended recipients, the biological parents of a patient. Participants were told they could reword, reorder, and make deletions and additions to a letter. The results are summarized in Table 1, which gives the average number of segments to which information was added (New), from which information was deleted (Delete), and which were reordered (Reorder). (Rewordings are not tabulated since it was not our goal to evaluate wording.) New and Delete are measures of acceptability of argument content and explicitness. Reorder is a measure of acceptability of ordering. On average, the numbers of New, Delete, and Reorder revisions were low: fewer than two per letter, with most revisions in the category of Reorder. This is encouraging since the system to be built for genetic counselors should provide acceptable arguments requiring a minimum of revision.</Paragraph> <Paragraph position="3"> In more detail, the only segments to which participants added information are warrants. The deletions of data consist of information presumably already known to the recipients, e.g. D in letter CF; other deletions are of part or all of a warrant or all of a backing. The only deletions of claims consist of information duplicated in another part of the letter; there were no cases where a claim was deleted even though it could be inferred from data and warrant. The reorderings were either across-argument, which violates conventional topic structure in the genre, or within-argument. Of the latter, half repositioned a claim from final position in an argument to a position before the warrant or backing; the other half repositioned the warrant or backing before the data.</Paragraph> </Section> </Paper>