<?xml version="1.0" standalone="yes"?>
<Paper uid="E85-1027">
  <Title>A Computational Theory of Prose Style for Natural Language Generation</Title>
  <Section position="2" start_page="0" end_page="188" type="abstr">
    <SectionTitle>
2. Prose Style
</SectionTitle>
    <Paragraph position="0"> Style is an intuitive notion involving the manner in which something is said. It has been more often the professional domain of literary critics and English teachers than linguists, which is entirely reasonable given that it involves optional, often conscious decisions and preferences rather than the unconscious, inviolable rules that linguists term Universal Grammar.</Paragraph>
    <Paragraph position="1"> To illustrate what we mean by style, consider the three paragraphs in Figure 1. As we see it, the first two of these have the same style, and the third has a different one.</Paragraph>
    <Paragraph position="2"> The Ibibio are a group of six related peoples living in southeastern Nigeria. They have a population estimated at 1,500,000, and speak a language in the Benue-Niger subfamily of the Niger-Congo languages. Most Ibibio are subsistence farmers, but two subgroups are fishermen.</Paragraph>
    <Paragraph position="3"> The Ashanti are an AKAN-speaking people of central Ghana and neighboring regions of Togo and Ivory Coast, numbering more than 900,000. They subsist primarily by farming cacao, a major cash crop. The Ashanti are an African people. They live in central Ghana and neighboring regions of Togo and Ivory Coast. Their population is more than 900,000.</Paragraph>
    <Paragraph position="4"> They speak the language Akan. They subsist primarily by farming cacao. This is a major cash crop. The first two of these paragraphs are extracted from the Academic American Encyclopedia; they are the lead paragraphs from the two articles on those respective tribes. The third paragraph was written by taking the same information that we have posited underlies the Ashanti paragraph and regenerating from it with an impoverished set of stylistic rules.</Paragraph>
    <Paragraph position="5"> We began looking at texts like these during the summer of 1983, as part of the work on the &quot;Knoesphere Project&quot; at Atari Research (Borning et al. [1983]). Our goal in that project was to develop a representation for the kind of information appearing in encyclopedias which would not be tied to the way in which it would be presented. The same knowledge base objects were to be used whether one was recreating an article like the original, or making a simpler version to give to children, or answering isolated questions about the material, or giving an interactive multi-media presentation coordinated with maps and icons, and so on.</Paragraph>
    <Paragraph position="6"> With the demise of Atari Research, this ambitious goal has had to be put on the shelf; we have, however, continued to work with the articles on our own. Research on these articles led us to begin work on prose style. This remains an interesting domain in which to explore style since we are working with a body of texts whose organization is not totally dictated by its internal form. These paragraphs are representative of all the African tribe articles in the Academic American, which is not surprising since all of the articles were written by the same person and under tight editorial control. What was most striking to us when we first looked at these articles was their similarity to each other, both in the information they contained and the way they were structured as a text. We will assume that for such texts, &quot;encyclopedia style&quot; involves at least the following two generalizations: (1) be consistent in the information that you provide about each tribe; and (2) adopt a complex, &quot;information loaded&quot; sentence structure in your presentation. This sentence structure is typified by a rich set of syntactic constructions, including the use of conjunction reduction, reduced relative clauses, coordination, secondary adjunction, and prenominal modification whenever possible.</Paragraph>
    <Paragraph position="7"> A contrasting style might be, for example, one that was aimed at children; we have rewritten the information on the Ashanti tribe as it might look in such a style. We have not yet tried implementing this style since it will call for doing lexicalization under stylistic control, which we have not yet designed.</Paragraph>
    <Paragraph position="8"> &quot;The Ashanti are an African people. They live in West Africa in a country called Ghana and in parts of Togo and the Ivory Coast. There are about 900,000 people in this tribe, and they speak a language named AKAN. Most of the Ashanti are cacao farmers.&quot; Figure 2 The style of the Academic American paragraphs, on the other hand, is much tighter, with more compact sentence structure, and a more sophisticated choice of phrasing. Such differences are the sort of thing that rules of prose style must capture.</Paragraph>
    <Paragraph position="9"> 3. Our Theory of Generation Looking at the generation process as a whole, we have always presumed that it involved three different stages, with our own research concentrating on the last.</Paragraph>
    <Paragraph position="10"> (1) Determine what goals to (attempt to) accomplish with the utterance. This initiates the other activities and posts a set of criteria they are to meet, typically information to be conveyed (e.g. pointers to frames in the knowledge base) and speech acts to be carried out.</Paragraph>
    <Paragraph position="11"> (2) Deciding which specific propositions to express and which to leave for the audience to infer on their own. This cannot be separated from working out what rhetorical constructions to employ in expressing the specified speech acts, or from selecting the key lexical items for communicating the propositions. The result of this activity is a text plan, which has a principally conceptual vocabulary with rhetorical and lexical annotations. The text plan is seen by the next stage as an executable &quot;specification&quot; that is to be incrementally converted into a text. The specification is given in layers, i.e. not all of the details are planned at once. Later, once the linguistic context of the units within the specification has been determined, this planner will be recursively invoked, unit by unit, until the planning has been done in enough detail that only linguistic problems remain.</Paragraph>
    <Paragraph position="12"> (3) Maintaining a representation of the surface structure of the utterance, traversing and interpreting this structure to produce the words of the text and constrain further decisions. This stage is responsible for the grammaticality of the text and its fluency as a discourse (e.g. ensuring that the correct terms are pronominalized, the correct focus maintained, etc.). The central representation is an explicit model of the surface structure of the text being produced, which is used both to determine control flow and to constrain the activities of the other stages (see discussion in McDonald [1984]). The surface structure is defined in terms of a stream of phrasal nodes, constituent positions, words, and embedded information units (which will eventually have to be sent back to the planner and then realized linguistically, extending the surface structure in the process). The entities in the stream and their relative order are indelible (i.e. once selected they cannot be changed); however, more material can be spliced into the stream at specified points.</Paragraph>
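    <Paragraph position="13"> The indelible stream described in stage (3) can be pictured with a minimal sketch. Everything named here (SurfaceStream, splice_at, the "point" entity kind, and the demo sentence) is a hypothetical illustration and not part of the system described in this paper. The sketch assumes only what the text states: entities keep their relative order once placed, nothing is deleted, and new material may enter only at designated points.

```python
from typing import List, Tuple

# (kind, value); kinds used in this sketch: "word" and "point".
Entity = Tuple[str, str]


class SurfaceStream:
    """Hypothetical model of an indelible surface-structure stream.

    No method removes or reorders entities; material can only be
    appended at the end or spliced in after a named splice point.
    """

    def __init__(self) -> None:
        self._entities: List[Entity] = []

    def append(self, kind: str, value: str) -> None:
        self._entities.append((kind, value))

    def splice_at(self, point_name: str, new_entities: List[Entity]) -> None:
        # Raises ValueError if the named splice point is not in the stream.
        index = self._entities.index(("point", point_name))
        self._entities[index + 1:index + 1] = new_entities

    def words(self) -> List[str]:
        return [value for kind, value in self._entities if kind == "word"]


def demo() -> List[str]:
    stream = SurfaceStream()
    stream.append("word", "The")
    stream.append("word", "Ashanti")
    stream.append("point", "after-subject")  # assumed splice point
    stream.append("word", "farm")
    stream.append("word", "cacao")
    # Later material is spliced in; the earlier words keep their order.
    stream.splice_at("after-subject", [("word", "of"), ("word", "Ghana")])
    return stream.words()
```

Calling demo() returns the words in stream order, with the spliced material sitting between the subject and the verb and every earlier entity left where it was.</Paragraph>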
    <Section position="1" start_page="187" end_page="188" type="sub_section">
      <SectionTitle>
3.1 WHERE IS STYLE CONSIDERED?
</SectionTitle>
      <Paragraph position="0"> According to our theory, prose style is a consequence of what decisions are made during the transition from the conceptual representational level to the linguistic level. The conceptual representation of what is to be said (the text plan) is modeled as a stream of information units selected by the content planning component. The attachment process takes units from this stream and positions them in the surface structure somewhere ahead of the point of speech. The prose style one adopts dictates what choice the attachment process makes when faced with alternatives in where to position a unit: should one extend a sentence with a nonrestrictive relative clause or start a new one; express modification with a prenominal adjective or a postnominal prepositional phrase? The collective pattern of such decisions is the computational manifestation of one's style.</Paragraph>
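      <Paragraph position="1"> As a rough illustration of how a style could steer attachment, the sketch below reduces the choice to a single decision: extend the current sentence or start a new one. This is a deliberate simplification under stated assumptions. The function attach, the max_units_per_sentence setting, the two style dictionaries, and the unit strings are all hypothetical, and the real attachment process chooses among syntactic constructions rather than counting units.

```python
from typing import Dict, List

# Hypothetical style settings; the paper does not define rules in this form.
INFORMATION_LOADED = {"max_units_per_sentence": 3}
IMPOVERISHED = {"max_units_per_sentence": 1}

# Assumed information units, paraphrasing the Ashanti example.
ASHANTI_UNITS = [
    "is an African people",
    "lives in central Ghana",
    "numbers more than 900,000",
    "speaks Akan",
    "farms cacao",
    "cacao is a major cash crop",
]


def attach(units: List[str], style: Dict[str, int]) -> List[List[str]]:
    """Group units into sentences, taking them in stream order."""
    max_units = style["max_units_per_sentence"]
    sentences: List[List[str]] = []
    for unit in units:
        # The style decides between the two alternatives at each unit.
        if not sentences or len(sentences[-1]) >= max_units:
            sentences.append([unit])    # start a new sentence
        else:
            sentences[-1].append(unit)  # extend the current sentence
    return sentences
```

With the same units, attach(ASHANTI_UNITS, IMPOVERISHED) puts each unit in its own sentence, while attach(ASHANTI_UNITS, INFORMATION_LOADED) packs them into fewer, denser sentences. The content is identical in both cases; only the pattern of attachment decisions differs.</Paragraph>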
    </Section>
  </Section>
</Paper>