<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1415">
  <Title>Clause Aggregation Using Linguistic Knowledge</Title>
  <Section position="2" start_page="0" end_page="138" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> An expression is more concise than another expression if it conveys the same amount of information in fewer words. Complex sentences generated by combining clauses are more concise than corresponding simple sentences because multiple references to the recurring entities are removed.</Paragraph>
    <Paragraph position="1"> For example, clauses like &quot;Jones is a patient&quot; and &quot;Jones has hypertension&quot; can be combined into a more concise sentence, &quot;Jones is a hypertensive patient.&quot; To illustrate the common occurrence of such repeated entities in generation, let us take a shipping company's database as an example.</Paragraph>
    <Paragraph position="2"> Each database tuple being conveyed is transformed into one or multiple propositions or clauses (we use these terms interchangeably throughout the paper). Each proposition refers to a piece of information which usually corresponds to a simple sentence. The database might contain multiple shipments to the same location, possibly on the same day. Generating a sentence for each tuple separately would produce text containing repetitive and potentially redundant references to the same location or date.</Paragraph>
    <Paragraph position="3"> Though we used a relational database as the example, the observation about recurring entities in the input is also valid for other types of input, such as execution traces from expert systems.</Paragraph>
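The shipping-database observation above can be made concrete with a minimal sketch. It assumes a hypothetical three-column schema (item, destination, date) and a hypothetical sentence template; neither is given in the paper, and this is not CASPER's implementation. It renders one simple sentence per tuple and then lists the destination and date pairs that more than one tuple refers to, which are the candidates for aggregation.

```python
from collections import defaultdict
from typing import NamedTuple


class Shipment(NamedTuple):
    # Hypothetical schema: the paper does not specify the table's columns.
    item: str
    destination: str
    date: str


def to_proposition(s: Shipment) -> str:
    """Render one tuple as one simple sentence (one proposition per tuple)."""
    return f"{s.item} was shipped to {s.destination} on {s.date}."


def recurring_entities(shipments: list[Shipment]) -> dict[tuple[str, str], list[str]]:
    """Group items by (destination, date) to expose repeated references."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for s in shipments:
        groups[(s.destination, s.date)].append(s.item)
    # Keep only the entity pairs that more than one tuple mentions.
    return {key: items for key, items in groups.items() if len(items) > 1}
```

Generating `to_proposition` output for every tuple repeats the destination and date once per shipment; the groups returned by `recurring_entities` show where a single combined sentence could mention them only once.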
    <Paragraph position="4"> CASPER (Clause Aggregation in Sentence PlannER) is a sentence planner which focuses on generating concise sentences. Clause aggregation can happen at three levels: inferential, rhetorical, and linguistic. At the inferential level, user modeling, domain knowledge, and common sense reasoning are used to reduce the number of concepts to convey. Such operations are implemented in the content planner, and clauses are combined without consulting lexical resources. Text summarization is an application which uses inferential operators extensively. For example, the two sentences &quot;John hit Mary&quot; and &quot;Mary kicked John&quot; might imply that &quot;John and Mary fought.&quot; Defining a set of inferential operators for unrestricted text is beyond the state of the art. Because it is unlikely that the inferential operators for our domains (medical briefings and telephone network plan descriptions) will be reusable for other applications, we have directed our effort toward aggregation operations at the other levels. At the rhetorical level, clauses are combined based on their rhetorical relationships [Mann and Thompson, 1986], such as CONTRAST and CONDITION. We will take advantage of such information in future aggregation work. At the linguistic level, lexical and syntactic information is used to combine clauses. In this paper, we concentrate on two types</Paragraph>
    <Paragraph position="6"> The patient's past medical history is significant for bladder carcinoma(1), status post cystectomy with a urostomy tube insertion(2), left nephrolithiasis(3), status post surgery(4), recurrent syncope(5), questionable vagovagal(6), a neurological workup was negative(7), and the EPS was negative(8), abdominal aortic aneurysm approximately 5 cm(9), high cholesterol(10), exertional angina(11), past tobacco smoker, quit about one year ago(12).</Paragraph>
    <Paragraph position="7"> of linguistic aggregation operators: hypotactic and paratactic. The term hypotaxis describes the relation between a dependent element and its dominant element. Hypotactic operators transform one clause into a modifier and attach the modifier to the dominant clause. In contrast, paratactic aggregation operators combine clauses using constructions of equal status, such as coordination.</Paragraph>
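The contrast between the two operator types can be illustrated with a toy sketch. The `Clause` class, the function names, and the caller-supplied modifier form (standing in for a lexical lookup such as hypertension to hypertensive) are all assumptions made for illustration; the paper's actual semantic representation is described in its Section 3 and is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Clause:
    # Toy representation (an assumption, not CASPER's): a subject, a verb,
    # and a noun phrase split into determiner, modifiers, and head.
    subject: str
    verb: str
    determiner: str
    head: str
    modifiers: list[str] = field(default_factory=list)

    def noun_phrase(self) -> str:
        parts = [self.determiner, *self.modifiers, self.head]
        return " ".join(p for p in parts if p)

    def render(self) -> str:
        return f"{self.subject} {self.verb} {self.noun_phrase()}."


def hypotactic(dominant: Clause, dependent: Clause, as_modifier: str) -> Clause:
    """Turn the dependent clause into a modifier attached to the dominant clause."""
    if dominant.subject != dependent.subject:
        raise ValueError("clauses must share a subject")
    return Clause(dominant.subject, dominant.verb, dominant.determiner,
                  dominant.head, [*dominant.modifiers, as_modifier])


def paratactic(first: Clause, second: Clause) -> Clause:
    """Coordinate two clauses of equal status that share subject and verb."""
    if (first.subject, first.verb) != (second.subject, second.verb):
        raise ValueError("clauses must share a subject and a verb")
    joined = f"{first.noun_phrase()} and {second.noun_phrase()}"
    return Clause(first.subject, first.verb, "", joined)
```

With the paper's own example, `hypotactic(Clause("Jones", "is", "a", "patient"), Clause("Jones", "has", "", "hypertension"), "hypertensive")` yields a clause that renders as the single sentence about a hypertensive patient, while `paratactic` keeps both clauses at equal status and joins their noun phrases with a coordinating conjunction.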
    <Paragraph position="8"> CASPER is used in two separate projects, MAGIC (Multimedia Abstract Generation for Intensive Care) and PLANDoc, to increase the fluency of the generated text. MAGIC [Dalal et al., 1996; McKeown et al., 1997] automatically generates multimedia briefings to describe the post-operative status of a patient who has undergone Coronary Artery Bypass Graft (CABG) surgery. It uses the existing computerized information infrastructure in the operating rooms at Columbia Presbyterian Medical Center. PLANDoc [Kukich et al., 1994; McKeown et al., 1994] generates English summaries based on somewhat cryptic traces of the interaction between planning engineers and LEIS-PLAN(TM). PLANDoc documents the timing, placement, and cost of new facilities for routes in telephone networks.</Paragraph>
    <Paragraph position="9"> In Section 2, we present a corpus analysis to identify the complexity of the target output in MAGIC. Section 3 describes the semantic representation used in CASPER. Details of hypotactic operators are presented in Section 4. Paratactic operators are described in Section 5. Section 6 describes related work.</Paragraph>
  </Section>
</Paper>