<?xml version="1.0" standalone="yes"?>
<Paper uid="W91-0107">
  <Title>SYNTACTIC CHOICE IN LANGUAGE GENERATION</Title>
  <Section position="3" start_page="0" end_page="45" type="metho">
    <SectionTitle>
2 Introduction
</SectionTitle>
    <Paragraph position="0"> &amp;quot;A thematic system is one where corresponding members of the contrasting terms normally have the same propositional meaning, and the same illocutionary potential.&amp;quot; (Huddleston 1984:p437).</Paragraph>
    <Paragraph position="1"> Most phrase structure or categorial unification-based grammars encode some form of thematic system. The simplest would involve the inclusion of both active and passive voice. Typically, the grammar defines the syntactic structure of each form, but does not include the pragmatic information that distinguishes one from another. When using such a grammar for parsing, this is not important, so long as the information is not required by the system using the parser's output. However, there has recently been an upsurge in the use of these grammars for generation. The lack of pragmatic data now becomes important: the generator is under-constrained, being capable of producing any of the available thematic forms. One way of applying the necessary constraints is to introduce a system of &amp;quot;functional&amp;quot; features into the feature structure of the grammar itself. These features are so called because they refer to the function of the various parts of the sentence in a discourse. McKeown suggested the use of functional features for the TEXT system (McKeown 1985), in which the grammar was based on the FUG formalism (Kay 1979). The functional features were defined as part of the initial specification of the sentence, which was then filled out by traversing the grammar in a &amp;quot;Top Down&amp;quot; fashion. For example, the following was given by McKeown as an initial sentence specification.</Paragraph>
    <Paragraph position="3"> The functional feature is &amp;quot;TOPIC&amp;quot;, and is specified as being the agent (or PROTagonist) of the semantic structure. The feature value controls whether an active or passive sentence will be produced.</Paragraph>
    <Paragraph position="4"> The work reported in this paper extends this technique to a grammar which encodes a richer thematic system than just active and passive. We use a unification based grammar with a phrase structure backbone, which was originally developed to provide a simple computational description of current linguistic theories (mainly GPSG, Gazdar 1985). As in the example above, a system of functional features is introduced. A bottom-up generation algorithm allows the production of sentences given an initial semantic form. The assignment of some initial values to the functional features constrains the structures generated, and typically just one sentence will be generated for each semantic input.</Paragraph>
    <Paragraph position="5"> This work was done in the context of a database enquiry system with single sentence output. We assume there is a discourse manager which initiates generation by passing the generator a &amp;quot;message&amp;quot;. This message consists of the propositional content of the output required, and some pragmatic information. The rest of this paper is in three main parts. The first is the definition of a coherent set of discourse parameters that describe the behaviour in discourse of the various elements of a sentence. The second section describes the thematic system used, and how each member relates to the discourse parameters. Finally, we see how the grammar can be augmented with functional features to provide filtering during generation consistent with the discourse parameters.</Paragraph>
  </Section>
  <Section position="4" start_page="45" end_page="46" type="metho">
    <SectionTitle>
3 Discourse Parameters
</SectionTitle>
    <Paragraph position="0"> The members of the thematic system to be described below behave differently in discourse.</Paragraph>
    <Paragraph position="1"> In the linguistics literature, there is a long tradition of assigning labels to various clause constituents in order to describe this behaviour.</Paragraph>
    <Paragraph position="2"> Labels include &amp;quot;given&amp;quot; and &amp;quot;new&amp;quot;, &amp;quot;topic&amp;quot; and &amp;quot;comment&amp;quot;, &amp;quot;theme&amp;quot; and &amp;quot;rheme&amp;quot;, and so on (a summary can be found in Quirk 1985, 18.9). We have adopted a set which allows a distinction between the members of the thematic system we use.</Paragraph>
    <Section position="1" start_page="46" end_page="46" type="sub_section">
      <SectionTitle>
3.1 Speech Act Type
</SectionTitle>
      <Paragraph position="0"> This parameter conveys information about the sentence as a whole. Something similar is to be found in most grammars, but precedents in generation can be found in Appelt 1985, and Bunt 1987. Values are :-</Paragraph>
    </Section>
    <Section position="2" start_page="46" end_page="46" type="sub_section">
      <SectionTitle>
3.2 Theme
</SectionTitle>
      <Paragraph position="0"> The theme is: &amp;quot;... somehow an element semantically crucial to the clause ... the communicative point of departure for the rest of the clause&amp;quot; - Quirk 1985. In general, the theme is the established or given part of a message, and lays the ground for the rest of the communication. So, when it occurs in its expected or unmarked form, it will tend to be the first element of the sentence.</Paragraph>
    </Section>
    <Section position="3" start_page="46" end_page="46" type="sub_section">
      <SectionTitle>
3.3 Focus
</SectionTitle>
      <Paragraph position="0"> The label &amp;quot;focus&amp;quot; has been widely used in linguistics and A.I. to name a whole range of concepts. We use the following definition: &amp;quot;The focus ... indicates where the new information lies&amp;quot; - Quirk 1985.</Paragraph>
      <Paragraph position="1"> This definition is easy to assimilate in terms of a database enquiry system where the new data is easily identified. As to where the focus occurs in the sentence: &amp;quot;The neutral position of focus is what we may call END-FOCUS, that is (generally speaking) chief prominence on the last open-class item or proper noun in the clause&amp;quot; - Quirk 1985.</Paragraph>
      <Paragraph position="2"> There may be several elements in the generator's input which are given, and several which are new. For simplicity, we assume the discourse manager is able to specify one as the most thematic, and one as the most focussed.</Paragraph>
    </Section>
    <Section position="4" start_page="46" end_page="46" type="sub_section">
      <SectionTitle>
3.4 Emphasis
</SectionTitle>
      <Paragraph position="0"> The emphasis parameter indicates that some stress is to be laid on the indicated sentence element, above that supplied by an unmarked sentence, as when correcting a false presupposition. Emphasis is associated with particular marked sentence constructions, as we will see below. Either the theme or the focus may be emphasised; other sentence elements may not.</Paragraph>
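      <Paragraph> The four parameters of this section can be pictured as a small record passed from the discourse manager to the generator. The following is a minimal sketch only: the class name DiscourseMessage, its field names, and the speech-act label used in the example are assumptions made for illustration and are not part of the original system. The one constraint it enforces, that emphasis may fall only on the theme or the focus, is taken from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DiscourseMessage:
    """Hypothetical container for the message handed to the generator."""
    logical_form: str               # propositional content, e.g. "LOVE(ENT1,ENT2)"
    speech_act: str                 # sentence-level parameter; value names are assumed
    theme: str                      # entity marked as most thematic
    focus: str                      # entity marked as most focussed
    emphasis: Optional[str] = None  # entity to be stressed, if any

    def __post_init__(self):
        # Only the theme or the focus may be emphasised.
        if self.emphasis is not None and self.emphasis not in (self.theme, self.focus):
            raise ValueError("emphasis must fall on the theme or the focus")


# Example: ENT1 is the theme, ENT2 the focus, nothing is emphasised.
msg = DiscourseMessage("LOVE(ENT1,ENT2)", "decl", theme="ENT1", focus="ENT2")
```

Keeping the check in the record itself means an ill-formed message is rejected before generation starts, rather than surfacing later as a failed generation.</Paragraph>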
    </Section>
  </Section>
  <Section position="5" start_page="46" end_page="47" type="metho">
    <SectionTitle>
4 Discourse parameters and the thematic system
</SectionTitle>
    <Paragraph position="0"> We can now move on to see how the discourse parameters relate to the thematic system in the grammar. In general, guided by Quirk's definitions, we have adopted the simple rule that the theme is the first NP in the sentence, and the focus is the last.</Paragraph>
    <Section position="1" start_page="46" end_page="46" type="sub_section">
      <SectionTitle>
4.1 Active
</SectionTitle>
      <Paragraph position="0"> The active sentence is considered the &amp;quot;unmarked&amp;quot; form, in which the parameters adopt their default or neutral values. Thus the subject NP will be the theme, and the focus will be on the verb, direct object, indirect object, or verb modifier, whichever comes last.</Paragraph>
      <Paragraph position="1"> 4. John slept in the garden. [theme = John, focus = the garden]</Paragraph>
    </Section>
    <Section position="2" start_page="46" end_page="47" type="sub_section">
      <SectionTitle>
4.2 Passive
</SectionTitle>
      <Paragraph position="0"> Creider (1979) classifies the passive as principally a topicalising structure, whilst Quirk (1985) discusses the focussing effect. We have modeled these effects as follows. With transitive verbs, the subject is focussed and the object becomes theme. If the subject is omitted, the verb itself can be focussed, but in addition, this produces some emphasis. If the subject is not omitted, the verb can still be focussed and emphasised by fronting the object, which then becomes the theme (see fronting). Modifiers may take the emphasis.</Paragraph>
      <Paragraph position="1"> 5. Mary was loved by Jim. [theme = Mary, focus = Jim] For bi-transitive verbs, the direct or indirect object can be thematised.</Paragraph>
      <Paragraph position="2"> 6. Mary was sold a book by Jim. [theme = Mary, focus = Jim]</Paragraph>
    </Section>
    <Section position="3" start_page="47" end_page="47" type="sub_section">
      <SectionTitle>
4.3 The indirect object transformation
</SectionTitle>
      <Paragraph position="0"> Creider (1979) classifies this transformation as having a thematising function. Q. What did you give to George? A. I gave George a pennywhistle.</Paragraph>
      <Paragraph position="1"> A. ?I gave a pennywhistle to George. This is modeled by transferring theme to the indirect object, and focus to the direct object.</Paragraph>
      <Paragraph position="2"> 7. I gave George a pennywhistle. [theme = George, focus = a pennywhistle] The transformation can be combined with class II passivisation. The result is treated as a passive: 8. A book was given by John to Mary. [theme = a book, focus = Mary]</Paragraph>
    </Section>
    <Section position="4" start_page="47" end_page="47" type="sub_section">
      <SectionTitle>
4.4 Fronting
</SectionTitle>
      <Paragraph position="0"> This construction is generally accepted as establishing the theme (see Creider 1979 - he calls theme &amp;quot;topic&amp;quot;, and fronting &amp;quot;topicalisation&amp;quot;). The fronted item is not new data, and seems to be associated with some form of contrast. This shows up in examples like 9. John I like, but Mary I hate.</Paragraph>
      <Paragraph position="1"> This is modeled by assigning both the &amp;quot;theme&amp;quot; and &amp;quot;emphasis&amp;quot; parameters to the fronted item, the focus being at the end of the sentence as usual.</Paragraph>
      <Paragraph position="2"> 10. To Mary John gave a book. [theme = Mary, focus = a book, emphasis = Mary]</Paragraph>
    </Section>
    <Section position="5" start_page="47" end_page="47" type="sub_section">
      <SectionTitle>
4.5 Clefts
</SectionTitle>
      <Paragraph position="0"> These constructions introduce the clefted element as new data, and apply special emphasis, as when correcting a presupposition: Q: Was it John who robbed the bank? A: No, it was Arthur. Usually, the other entities in the sentence are given, and uncontested. As we saw in the description of the grammar above, any NP or modifier in a sentence can be clefted. So, the clefted item is in focus, and the theme now moves to the end of the sentence.</Paragraph>
      <Paragraph position="1"> 11. It was to Mary that John gave a book.</Paragraph>
      <Paragraph position="2"> [theme = a book, focus = Mary, emphasis = Mary]</Paragraph>
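      <Paragraph> The mapping from thematic forms to discourse parameters can be read as a lookup table. The sketch below is illustrative and is not the mechanism used in the paper, which encodes the constraints as grammar features. The entries for the passive, fronted and cleft forms reproduce examples 8, 10 and 11; the plain active and dative-shifted entries are filled in by analogy with examples 4 and 7 and are assumptions. The names FORMS and select_forms are hypothetical.

```python
# Each entry: (surface form, theme, focus, emphasis).
FORMS = [
    # Assumed by analogy with example 4 (active: first NP theme, last NP focus).
    ("John gave a book to Mary.", "John", "Mary", None),
    # Assumed by analogy with example 7 (indirect object transformation).
    ("John gave Mary a book.", "Mary", "a book", None),
    # Example 8: transformation combined with passivisation.
    ("A book was given by John to Mary.", "a book", "Mary", None),
    # Example 10: fronting.
    ("To Mary John gave a book.", "Mary", "a book", "Mary"),
    # Example 11: cleft.
    ("It was to Mary that John gave a book.", "a book", "Mary", "Mary"),
]


def select_forms(theme, focus, emphasis=None):
    """Return the surface forms whose annotations match the requested parameters."""
    return [s for (s, t, f, e) in FORMS if (t, f, e) == (theme, focus, emphasis)]


# A request with theme and emphasis on Mary and focus on the book
# selects only the fronted form.
fronted = select_forms("Mary", "a book", emphasis="Mary")
```

In this table each combination of parameter values picks out at most one form, which mirrors the stated aim that typically just one sentence is generated for each input.</Paragraph>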
    </Section>
    <Section position="6" start_page="47" end_page="47" type="sub_section">
      <SectionTitle>
4.6 Intonation
</SectionTitle>
      <Paragraph position="0"> The intonational centre is assumed to be at the end of the phrase, except in cleft forms, where it falls at the end of the first clause.</Paragraph>
      <Paragraph position="1"> If the theme or focus is realised as a relative clause, the intonational centre comes at the end of that clause. These are important assumptions since non-standard intonation can serve to shift the emphasis or focus to almost any part of a sentence.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="47" end_page="48" type="metho">
    <SectionTitle>
5 The Grammar Formalism
</SectionTitle>
    <Paragraph position="0"> The grammar is encoded in a framework built as part of the Alvey natural language tools project, and known as the GDE (Grammar Development Environment). The syntactic analyses are based on those developed by Pulman 1987, with extensions to cover all the thematic forms mentioned in the last section. They are couched within a simple unification-enriched phrase structure formalism. Semantic rules are associated with the syntactic rules on a rule-to-rule basis. The semantic rules are instructions for building logical forms of a typed higher order logic. The semantic translation of an expression is assembled using function application and composition, and by using beta-reduction. The logical forms the rules build are a type of &amp;quot;intensionless Montague&amp;quot;, similar to PTQ (Dowty 1981), but without the intension and extension operators. Here, we are only interested in the syntactic part of the rules, so the semantics can be omitted. The following rules couched in GDE notation will serve as an illustration: Here, the prefix &amp;quot;@&amp;quot; denotes a variable. NPs are type raised. Syntactic categories, subcategorisation, and unbounded dependencies are treated similarly to GPSG (Gazdar 1985). Topicalisation, cleft forms, and relatives are all treated as problems of unbounded dependency, using gap threading techniques. The tricky problems of passives and dative shift are covered by a version of the neat treatment presented in Pulman 1987.</Paragraph>
    <Paragraph position="1"> This involves the construction of passive and dative shifted versions of verbs, before inclusion in the rules which combine them with noun phrases, such as R4. No special structure rules for passives are needed.</Paragraph>
  </Section>
  <Section position="7" start_page="48" end_page="49" type="metho">
    <SectionTitle>
6 The generation algorithm
</SectionTitle>
    <Paragraph position="0"> The current GDE generation system uses a chart based bottom-up grammar traversal algorithm, similar to that described in (Shieber 1988).</Paragraph>
    <Paragraph position="1"> The starting point for generation is a logical form involving symbols which represent entities in the discourse model of the application program, for example &amp;quot;LOVE(ENT1,ENT2)&amp;quot;. The referring expressions for these entities are pre-generated and entered in the chart, along with all the lexical items compatible with the rest of the logical form.</Paragraph>
    <Paragraph position="2"> During generation, chart entries are repeatedly combined into larger constituents via the grammar rules. A semantic filter blocks any constituents whose semantic formulae are incompatible with the goal logical form.</Paragraph>
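    <Paragraph> The interplay of bottom-up combination and semantic filtering can be sketched with a toy chart. This is a simplified illustration under stated assumptions and not the GDE implementation: the two rules (VP from V and NP, S from NP and VP), the tuple encoding of logical forms, and the function names combine, compatible and generate are all hypothetical, and no functional features are modelled.

```python
from itertools import product

# Goal logical form LOVE(ENT1,ENT2), encoded as (predicate, subject, object).
GOAL = ("LOVE", "ENT1", "ENT2")

# Initial chart entries: (words, category, semantics).
LEXICON = [
    ("John", "NP", "ENT1"),
    ("Mary", "NP", "ENT2"),
    ("loves", "V", "LOVE"),
]


def combine(left, right):
    """Apply the toy rules VP -> V NP and S -> NP VP; return a new entry or None."""
    lw, lc, ls = left
    rw, rc, rs = right
    if (lc, rc) == ("V", "NP"):
        # The object slot is filled; the subject slot is still open (None).
        return (lw + " " + rw, "VP", (ls, None, rs))
    if (lc, rc) == ("NP", "VP"):
        return (lw + " " + rw, "S", (rs[0], ls, rs[2]))
    return None


def compatible(sem, goal):
    """Semantic filter: every filled slot must agree with the goal logical form."""
    if isinstance(sem, str):
        return sem in goal
    return all(s is None or s == g for s, g in zip(sem, goal))


def generate(goal, lexicon):
    """Combine chart entries until nothing new can be added; return complete sentences."""
    chart = set(lexicon)
    changed = True
    while changed:
        changed = False
        for left, right in product(list(chart), repeat=2):
            entry = combine(left, right)
            if entry is not None and entry not in chart and compatible(entry[2], goal):
                chart.add(entry)
                changed = True
    return sorted(w for (w, c, s) in chart if c == "S" and s == goal)


sentences = generate(GOAL, LEXICON)
```

Here the filter rejects the candidate VP built from the verb and the NP for ENT1, because its object slot disagrees with the goal, so only the sentence with ENT1 as subject survives.</Paragraph>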
    <Paragraph position="4"> 7 How the discourse parameters are encoded in the grammar
So, how can the discourse parameters be embodied in the feature system of the grammar? The speech act type of the sentence is introduced at the sentence level using the features &amp;quot;sentence-type&amp;quot; and &amp;quot;wh&amp;quot;. Assignments are as follows: The other parameters, theme, focus, and emphasis, are connected with entities in the application program's discourse model. For generation, they are added to the initial chart entries for those entities. Assume, to begin with, that we have a functional feature for each discourse parameter, &amp;quot;thm&amp;quot;, &amp;quot;foc&amp;quot; and &amp;quot;emp&amp;quot;, which take the values + or - as appropriate. Then, given the start logical form above, assume ENT1 is pre-generated as &amp;quot;John&amp;quot; and ENT2 as &amp;quot;Mary&amp;quot;. From the discourse model, we discover that ENT1 is to be the theme, ENT2 the focus, and that neither is to receive emphasis. This gives us an initial chart with the following entries for these entities: We could constrain the generator to produce just the active form by augmenting the grammar rules as follows (irrelevant features will be omitted from the rules; altered rules retain their original numbers, augmented with a, b, c and so on):</Paragraph>
    <Paragraph position="6"> Functional features on the verb will be included for completeness, but are not actually used in the current system.</Paragraph>
    <Paragraph position="7"> Here, the NP of R4a is assumed to be the last constituent in the sentence. Our treatment of passives means that these rules would generate passive sentences correctly as well, since there is no separate passive transformation rule. Rules for intransitive and bitransitive verbs could be handled in the same way. However, the system breaks down when we introduce VP modifiers. Now, we no longer know which NP will be last until the VP has been incorporated into a sentence. This can be handled by making the focus value of the NP dependent on a similar feature in the mother VP, as follows:</Paragraph>
    <Paragraph position="9"> This, however, only works if there are no gaps. If the NP of rule R4b were a gap, and there were no modifiers, the V would then carry the focus. This can be handled by threading the focus feature through the NP. If the NP turns out to be a trace (that is, the creation of a gap), the focus value is threaded through to the V, but if it is a real NP, it keeps the focus value for itself, and passes the value &amp;quot;foc -&amp;quot; to the V. The &amp;quot;foc&amp;quot; feature is now replaced by &amp;quot;fin&amp;quot; and &amp;quot;fout&amp;quot; features. This allows a gap in the VPMOD as well. If there is a fronted NP, the theme shifts to it, from the subject NP. This can be accounted for by linking the value of &amp;quot;thm&amp;quot; to the sentence. If a fronted element takes the theme, this is set to -, otherwise it is set to +. Below, the topicalisation rule assigns + to the thm of the fronted NP, and - to the thm of the subsequent sentence. The thematised NP receives emphasis as well. Transitive or bitransitive verbs which end up as the focus also receive emphasis. So, we also link the emp value of such a verb to its &amp;quot;fout&amp;quot; value. [Rule listing: R0a, R1d, R2b, R4d, R5b] Now we need to deal with clefting. In this construction, the theme is shifted from the front of the sentence to the end, and the focus shifts to the clefted element, which is also emphasised. In response to this, we need to introduce a &amp;quot;shifted theme&amp;quot; feature, &amp;quot;sthm&amp;quot;, and link the fin feature up to the sentence category. Once shifted, the theme needs to be treated just like the focus, landing at the end of the sentence. That means it needs threading, and we replace thm with the features &amp;quot;tin&amp;quot; and &amp;quot;tout&amp;quot;.
Treatment of clefting, then, causes the following alterations: [Rule listing: R0b, R1e, R4e] Finally, for dative movement, focus stays at the end of the sentence (unless a cleft form is used), but the theme moves to the indirect object. This can happen if the theme has already been shifted by a cleft, or if it hasn't. This is treated by introducing one final feature, &amp;quot;normal shifted theme&amp;quot; or &amp;quot;nst&amp;quot;. This feature is set to - if there is a dative shift, and + otherwise. Then, wherever tin used to be set to +, it now takes its value from the nst feature. The exception is topicalisation, when dative movement is prevented by setting nst to -. The rule changes that implement this are as follows:</Paragraph>
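    <Paragraph> The focus threading described above can be sketched in a few lines. This is one reading of the prose, offered under explicit assumptions, since the rule listings themselves are not reproduced in this text: a focussed NP is taken to carry fin + and fout -, while a trace or a non-focussed NP passes its incoming value through unchanged. The function names np_fout and vp_focus are hypothetical, and unification failure is modelled by returning None.

```python
def np_fout(is_focus, fin):
    """Return the NP's outgoing focus value, or None if the feature values clash."""
    if is_focus:
        # A focussed NP carries fin + and fout -: it fits only where
        # the incoming value is +, and it keeps the focus for itself.
        return "-" if fin == "+" else None
    # A trace or a non-focussed NP threads the incoming value through.
    return fin


def vp_focus(np_is_focus, vp_fin="+"):
    """Toy V NP rule: the NP receives the VP's fin, the V receives the NP's fout."""
    fout = np_fout(np_is_focus, vp_fin)
    if fout is None:
        return None
    return {"np_fin": vp_fin, "v_foc": fout}


real_np = vp_focus(True)    # the NP keeps the focus; the verb gets -
gap_np = vp_focus(False)    # the value passes through; the verb gets +
```

Threading of this kind lets a single rule cover both the case where the object NP is present and the case where it is a gap, without a separate rule for each.</Paragraph>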
    <Section position="1" start_page="49" end_page="49" type="sub_section">
      <SectionTitle>
7.1 Initial feature values
</SectionTitle>
      <Paragraph position="0"> An NP now carries five functional features, as opposed to the three we assumed at the start.</Paragraph>
      <Paragraph position="1"> They are initially set as follows. If the entity is theme, we have [tin +, tout -]. If the entity is focus, we have [fin +, fout -]. Otherwise, theme and focus values are threaded, as in [tin @t, tout @t, fin @f, fout @f].</Paragraph>
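      <Paragraph> The initial assignment can be written as a small function. This is a minimal sketch: the function name initial_np_features and the dictionary encoding are assumptions made for illustration, and variables are represented simply by their names as strings, so no actual unification is performed. The fifth feature is taken to be emp, with values + or - as introduced earlier.

```python
def initial_np_features(entity, theme, focus, emphasis=None):
    """Return the five functional features for an entity's initial chart entry."""
    feats = {"emp": "+" if entity == emphasis else "-"}
    if entity == theme:
        feats.update(tin="+", tout="-")
    else:
        # Not the theme: thread the theme value through unchanged.
        feats.update(tin="@t", tout="@t")
    if entity == focus:
        feats.update(fin="+", fout="-")
    else:
        # Not the focus: thread the focus value through unchanged.
        feats.update(fin="@f", fout="@f")
    return feats


# ENT1 is the theme, ENT2 the focus, and neither is emphasised.
john = initial_np_features("ENT1", theme="ENT1", focus="ENT2")
mary = initial_np_features("ENT2", theme="ENT1", focus="ENT2")
```
</Paragraph>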
      <Paragraph position="2">  which are compatible with the initial logical form.</Paragraph>
      <Paragraph position="3"> From this position, C2 and C3 can be combined via rule R4e to give the new chart entry: C4 loves Mary:VP[tin @t, nst +, fin +]. Then, C1 and C4 can be combined via rule R1e to give: C5 John loves Mary:S[type decl, tin +, nst +, fin +]. Other sentence forms are blocked by the functional features. If the NP &amp;quot;Mary&amp;quot; were originally assigned &amp;quot;emp +&amp;quot;, the generation would only be able to succeed by using the cleft form &amp;quot;It was Mary who was loved by John&amp;quot;. If &amp;quot;John&amp;quot; were emphasised, generation would fail: the current system has no way of emphasising a thematised agent. It would be necessary to use a different verb, or use prosodic stress. Neither of these methods is available in the current system.</Paragraph>
    </Section>
  </Section>
</Paper>