<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1601">
  <Title>Generation of single-sentence paraphrases from predicate/argument structure using lexico-grammatical resources</Title>
  <Section position="3" start_page="1" end_page="1" type="metho">
    <SectionTitle>
2 Typical generation methodology
</SectionTitle>
    <Paragraph position="0"> Sentence generation takes as input some semantic representation of the meaning to be conveyed in a sentence. We make the assumption that  Ability to handle variety in a uniform manner is also importantinmultilingual generation as some forms available in one language may not be available in another.</Paragraph>
  </Section>
  <Section position="4" start_page="1" end_page="1" type="metho">
    <SectionTitle>
ENJOY
EXPERIENCER THEME
AMY INTERACTION
</SectionTitle>
    <Paragraph position="0"> the input is a hierarchical predicate#2Fargument structure such as that shown in Fig. 1. The output of this process should be a set of grammatical sentences whose meaning matches the original semantic input.</Paragraph>
    <Paragraph position="1"> One standard approach to sentence generation from predicate#2Fargument structure #28like the semantic-head-driven generation in #28Shieber et al., 1990#29#29 involves a simple algorithm.</Paragraph>
    <Paragraph position="2">  head in step 1 In realizing the input in Fig. 1, the input can be decomposed into the top predicate which can be realized by a syntactic head #28a transitive verb#29 and its two arguments, the experiencer and the theme. Suppose that the verb enjoy is chosen to realize the top predicate. The two arguments can then be independently realized as Amy and the interaction. Finally, the realization of the experiencer, Amy, can be placed in the subject position and that of the theme, the interaction, in the complement position, yielding #282a#29.</Paragraph>
    <Paragraph position="3"> Our architecture is very similar but we argue for a more central role of lexico-grammatical resources driving the realization process.</Paragraph>
  </Section>
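To make the input representation and the standard decompose/realize/combine strategy concrete, here is a minimal Python sketch; the Pred class and the tiny lexicon entries are illustrative assumptions, not the paper's actual data structures.

```python
# Minimal sketch of a hierarchical predicate/argument input (as in Fig. 1)
# and the standard head-driven strategy of section 2.  The Pred class and
# the toy lexicon are illustrative assumptions.

class Pred:
    def __init__(self, name, **roles):
        self.name = name          # predicate or entity name, e.g. "ENJOY"
        self.roles = roles        # thematic roles -> argument subtrees

# Fig. 1: ENJOY(EXPERIENCER: AMY, THEME: INTERACTION)
fig1 = Pred("ENJOY",
            EXPERIENCER=Pred("AMY"),
            THEME=Pred("INTERACTION"))

# Each entry realizes a predicate with a syntactic head and says where
# each role's realization goes relative to that head.
LEXICON = {
    "ENJOY": {"head": "enjoyed", "order": ["EXPERIENCER", "head", "THEME"]},
    "AMY": {"head": "Amy", "order": ["head"]},
    "INTERACTION": {"head": "the interaction", "order": ["head"]},
}

def realize(node):
    # 1) choose a head for the top predicate, 2) realize the arguments
    #    independently, 3) place their realizations around the head.
    entry = LEXICON[node.name]
    parts = []
    for slot in entry["order"]:
        parts.append(entry["head"] if slot == "head" else realize(node.roles[slot]))
    return " ".join(parts)

print(realize(fig1))  # -> "Amy enjoyed the interaction"   (2a)
```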
  <Section position="5" start_page="1" end_page="2" type="metho">
    <SectionTitle>
3 Challenges in generating paraphrases
</SectionTitle>
    <Paragraph position="0"> paraphrases Paraphrases come from various sources. In this section, we give examples of some types of paraphrases we handle and discuss the challenges they pose to other generators. We also identify types of paraphrases we do not consider.</Paragraph>
    <Section position="1" start_page="1" end_page="1" type="sub_section">
      <SectionTitle>
3.1 Paraphrases we handle
</SectionTitle>
      <Paragraph position="0"> Simple synonymy The simplest source of paraphrases is simple synonymy. We take simple synonyms to be di#0Berent words that have the same meaning and are of the same syntactic category and set up the same syntactic context.</Paragraph>
      <Paragraph position="1"> #281a#29 Booth killed Lincoln.</Paragraph>
      <Paragraph position="2"> #281b#29 Booth assassinated Lincoln.</Paragraph>
      <Paragraph position="3"> A generation system must be able to allow the same semantic input to be realized in different ways. Notice that the words kill and assassinate are not always interchangeable, e.g., assassinate is only appropriate when the victim is a famous person. Such constraints need to be captured with selectional restrictions lest inappropriate realizations be produced.</Paragraph>
      <Paragraph position="4"> Di#0Berent placement of argument realizations Sometimes di#0Berent synonyms, like the verbs enjoy and please, place argument realizations di#0Berently with respect to the head, as illustrated in #282a-2b#29.</Paragraph>
      <Paragraph position="5"> #282a#29 Amy enjoyed the interaction.</Paragraph>
      <Paragraph position="6"> #282b#29 The interaction pleased Amy.</Paragraph>
      <Paragraph position="7"> To handle this variety, a uniform generation methodology should not assume a #0Cxed mapping between thematic and syntactic roles but let each lexical item determine the placementof argument realizations. Generation systems that use such a #0Cxed mapping must override it for the divergent cases #28e.g., #28Dorr, 1993#29#29. Words with overlapping meaning There are often cases of di#0Berentwords that realize different but overlapping semantic pieces. The easiest way to see this is in what has been termed incorporation, where a word not only realizes a predicate but also one or more of its arguments.</Paragraph>
      <Paragraph position="8"> Di#0Berentwords may incorporate di#0Berent arguments or none at all, which may lead to paraphrases, as illustrated in #283a-3c#29.</Paragraph>
      <Paragraph position="9"> #283a#29 Charles #0Dew across the ocean.</Paragraph>
      <Paragraph position="10"> #283b#29 Charles crossed the ocean by plane.</Paragraph>
      <Paragraph position="11"> #283c#29 Charles went across the ocean by plane.</Paragraph>
      <Paragraph position="12"> Notice that the verb #0Dy realizes not only going but also the mode of transportation being a plane, the verb cross with its complement realize going whose path is across the object realized by the complement, and the verb go only realizes going. For all of these verbs, the remaining arguments are realized by modi#0Cers.</Paragraph>
      <Paragraph position="13"> Incorporation shows that a uniform generator should use the word choices to determine 1#29 what portion of the semantics they realize, 2#29 what portions are to be realized as arguments of the realized semantics, and 3#29 what portions remain to be realized and attached as modi#0Cers.</Paragraph>
      <Paragraph position="14"> Generation systems that assume a one-to-one mapping between semantic and syntactic units #28e.g., #28Dorr, 1993#29#29 must use special processing for cases of overlapping semantics.</Paragraph>
      <Paragraph position="15"> Di#0Berent syntactic categories Predicates can often be realized by words of di#0Berent syntactic categories, e.g., the verb found and the noun founding, as in #284a-4b#29.</Paragraph>
      <Paragraph position="16"> #284a#29 I know that Olds founded GM.</Paragraph>
      <Paragraph position="17"> #284b#29 I know about the founding of GM by Olds.</Paragraph>
      <Paragraph position="18"> Words of di#0Berent syntactic categories usually have di#0Berent syntactic consequences. One such consequence is the presence of additional syntactic material. Notice that #284b#29 contains the prepositions of and by while #284a#29 does not. These prepositions might be considered a syntactic consequence of the use of the noun founding in this con#0Cguration. Another syntactic consequence is a di#0Berent placement of argument realizations. The realization of the founder is the subject of the verb found in #284a#29 while in #284b#29 the use of founding leads to its placement in the object position of the preposition by.</Paragraph>
      <Paragraph position="19"> Grammatical alternations Words can be put in a variety of grammatical alternations such as the active and passivevoice, as in #285a-5b#29, the topicalized form, the it-cleft form, etc.</Paragraph>
      <Paragraph position="20"> #285a#29 Oswald killed Kennedy.</Paragraph>
      <Paragraph position="21"> #285b#29 Kennedy was killed by Oswald.</Paragraph>
      <Paragraph position="22"> The choice of di#0Berent grammatical alternations has di#0Berent syntactic consequences which must be enforced in generation, such as the presence or absence of the copula and the di#0Berent placement of argument realizations. In some systems such as ones based on Tree-Adjoining Grammars #28TAG#29, including ours, these consequences are encapsulated within elementary structures of the grammar. Thus, such systems do not have to speci#0Ccally reason about these consequences, as do some other systems.</Paragraph>
      <Paragraph position="23"> More complex alternations The same content of excelling at an activity can be realized by the verb excel, the adverb well, and the adjective good, as illustrated in #286a-6c#29.</Paragraph>
      <Paragraph position="24"> #286a#29 Barbara excels at teaching.</Paragraph>
      <Paragraph position="25"> #286b#29 Barbara teaches well.</Paragraph>
      <Paragraph position="26"> #286c#29 Barbara is a good teacher.</Paragraph>
      <Paragraph position="27"> This variety of expression, often called head switching, poses a considerable di#0Eculty for most existing sentence generators. The di#0Eculty stems from the fact that the realization of a phrase #28sentence#29 typically starts with the syntactic head #28verb#29 which sets up a syntactic context into which other constituents are #0Ct. If the top predicate is the excelling, wehavetobe able to start generation not only with the verb excel but also with the adverb well and the adjective good, typically not seen as setting up an appropriate syntactic context into which the remaining arguments can be #0Ct. Existing generation systems that handle this variety do so using special assumptions or exceptional processing, all in order to start the generation of a phrase with the syntactic head #28e.g., #28Stede, 1999#29, #28Elhadad et al., 1997#29, #28Nicolov et al., 1995#29, #28Dorr, 1993#29#29. Our system does not require that the semantic head map to the syntactic head.</Paragraph>
      <Paragraph position="28"> Di#0Berent grammatical forms realizing semantic content Finally, we consider a case, which to our knowledge is not handled by other generation systems, where grammatical forms realize content independently of the lexical item on which they act, as in #287a-7b#29.</Paragraph>
      <Paragraph position="29"> #287a#29 Who rules Jordan? #287b#29 Identify the ruler of Jordan! The wh-question form, as used in #287a#29, realizes a request for identi#0Ccation by the listener #28in this case, the ruler of Jordan#29. Likewise, the imperative structure #28used in #287b#29#29 realizes a request or a command to the listener #28in this case, to identify the ruler of Jordan#29.</Paragraph>
    </Section>
    <Section position="2" start_page="1" end_page="1" type="sub_section">
      <SectionTitle>
3.2 Paraphrases we do not consider
</SectionTitle>
      <Paragraph position="0"> Since our focus is on sentence generation and not sentence planning, we only consider the generation of single-sentence paraphrases. Hence, we do not have the ability to generate #288a-8b#29 from the same input.</Paragraph>
      <Paragraph position="1"> #288a#29 CS1 has a programming lab.</Paragraph>
      <Paragraph position="2"> #288b#29 CS1 has a lab. It involves programming. Since we do not reason about the semantic input, including deriving entailment relations, we cannot generate #289a-9b#29 from the same input.  Generation in our system is driven by the semantic input, realized by selecting lexico-grammatical resources matching pieces of it, starting with the top predicate. The realization of a piece containing the top predicate provides the syntactic context into which the realizations of the remaining pieces can be #0Ct #28their placement being determined by the resource#29.</Paragraph>
      <Paragraph position="3"> The key to our abilitytohandle paraphrases in a uniform manner is that our processing is driven by our lexicon and thus we do not make any a priori assumptions about 1#29 the amount of the input realized by a lexical unit, 2#29 the relationship between semantic and syntactic types #28and thus the syntactic rank or category of the realization of the top piece#29, 3#29 the nature of the mapping between thematic roles and syntactic positions, and 4#29 the grammatical alternation #28e.g., there are di#0Berent resources for the same verb in di#0Berent alternations: the active, passive, topicalized, etc.#29. Because this information is contained in each lexico-grammatical resource, generation can proceed no matter what choices are speci#0Ced about these in each individual resource. Our approach is fundamentally di#0Berent from systems that reason directly about syntax and build realizations by syntactic rank #28#28Bateman, 1997#29, #28Elhadad et al., 1997#29; #28Nicolov et al., 1995#29; #28Stone and Doran, 1997#29#29.</Paragraph>
    </Section>
    <Section position="3" start_page="1" end_page="1" type="sub_section">
      <SectionTitle>
4.1 Our algorithm
</SectionTitle>
      <Paragraph position="0"> Our generation algorithm is a simple, recursive, semantic-head-driven generation process, consistent with the approach described in section 2, but one driven by the semantic input and the lexico-grammatical resources.</Paragraph>
      <Paragraph position="1"> 1. given an unrealized input, #0Cnd a lexico-grammatical resource that matches a portion including the top predicate and satis- null resource in step 1, as determined by the resource in step 1 Notice the prominence of lexico-grammatical resources in steps 1 and 3 of this algorithm. The standard approach in section 2 need not be driven by resources.</Paragraph>
    </Section>
    <Section position="4" start_page="1" end_page="1" type="sub_section">
      <SectionTitle>
4.2 Lexico-grammatical resources
</SectionTitle>
      <Paragraph position="0"> The key to the simplicity of our algorithm lies in the lexico-grammatical resources, which encapsulate information necessary to carry through generation. These consist of three parts: #0F the semantic side: the portion of semantics realized by the resource #28including the predicate and any arguments; this part is matched against the input semantics#29 #0F the syntactic side: either word#28s#29 in a syntactic con#0Cguration or a grammatical form without words, and syntactic consequences #0F a mapping between semantic and syntactic constituents indicating which constituent on the semantic side is realized by which constituent on the syntactic side Consider the resources for the verbs enjoy and please in Fig. 2. The semantic sides indicate that these resources realize the predicate ENJOY and the thematic roles EXPERIENCER and THEME.</Paragraph>
      <Paragraph position="1"> The arguments #0Clling those roles #28whichmust be realized separately, as indicated by dashed outlines#29 appear as variables X and Y which will be matched against actual arguments. The syntactic sides contain the verbs enjoy and please in the active voice con#0Cguration. The mappings include links between ENJOY and its realization as well as links between the unrealized agent#28X#29 or theme #28Y#29 and the subject or the complement. Our mapping between semantic and syntactic constituents bears resemblance to the pairingsin Synchronous TAG #28Shieber and Schabes, 1990#29.</Paragraph>
      <Paragraph position="2">  critical for combining realizations #28in step 3 of our algorithm in section 4.1#29. There are, however, advantages that our approach has. For one, we are not constrained by the isomorphism requirement in a Synchronous TAG derivation.</Paragraph>
      <Paragraph position="3"> Also, the DSG formalism that we use a#0Bords greater #0Dexibility, signi#0Ccant in our approach, as discussed later in this paper #28and in more detail in #28Kozlowski, 2002b#29#29.</Paragraph>
    </Section>
    <Section position="5" start_page="1" end_page="1" type="sub_section">
      <SectionTitle>
4.3 The grammatical formalism
</SectionTitle>
      <Paragraph position="0"> Both step 3 of our algorithm #28putting realizations together#29 and the needs of lexico-grammatical resources #28the encapsulation of syntactic consequences such as the position of argument realizations#29 place signi#0Ccant demands on the grammatical formalism to be used in the implementation of the architecture. One grammatical formalism that is well-suited for our purposes is the D-Tree Substitution Grammars #28DSG, #28Rambow et al., 2001#29#29, a variant of Tree-Adjoining Grammars #28TAG#29. This formalism features an extended domain of locality and #0Dexibility in encapsulation of syntactic consequences, crucial in our architecture.</Paragraph>
      <Paragraph position="1"> Consider the elementary DSG structures on the right-hand-side of the resources for enjoy and please in Fig. 2. Note that nodes marked with #23 are substitution nodes corresponding to syntactic positions into which the realizations of  the resources for enjoy and please arguments will be substituted. The positions of both the subject and the complement are encapsulated in these elementary structures. This allows the mapping between semantic and syntactic constituents to be de#0Cned locally within the resources. Dotted lines indicate domination of length zero or more where syntactic material #28e.g., modi#0Cers#29 may end up.</Paragraph>
    </Section>
    <Section position="6" start_page="1" end_page="1" type="sub_section">
      <SectionTitle>
4.4 Using resources in our algorithm
</SectionTitle>
      <Paragraph position="0"> Step 1 of our algorithm requires matching the semantic side of a resource against the top of the input and testing selectional restrictions. A semantic side matches if it can be overlaid against the input. Details of this process are given in #28Kozlowski, 2002a#29. Selectional restrictions #28type restrictions on arguments#29 are associated with nodes on the semantic side of resources.</Paragraph>
      <Paragraph position="1"> In their evaluation, the appropriate knowledge base instance is accessed and its type is tested.</Paragraph>
      <Paragraph position="2"> More details about using selectional restrictions in generation and in our architecture are given in #28Kozlowski et al., 2002#29.</Paragraph>
      <Paragraph position="3"> Resources for enjoy and please which match the top of the input in Fig. 1 are shown in Fig. 2. In doing the matching, the arguments AMY and INTERACTION are uni#0Ced with X and Y. The dashed outlines around X and Y indicate that the resource does not realize them. Our algorithm calls for the independent recursive realization of these arguments and then putting together those realizations with the syntactic side of the resource, as indicated by the mapping.</Paragraph>
      <Paragraph position="4">  portion realized by cross in bold This is shown in Fig. 3. The argument realizations, Amy and the interaction, are placed in the subject and complement positions of enjoy and please, according to the mapping in the corresponding resources.</Paragraph>
    </Section>
    <Section position="7" start_page="1" end_page="1" type="sub_section">
      <SectionTitle>
4.5 Driving decomposition by resources
</SectionTitle>
      <Paragraph position="0"> The semantic side of a resource determines which arguments, if any, are realized by the resource, while the matching done in step 1 of our algorithm determines the portions that must be realized by modi#0Cers. This is always done the same way regardless of the resources selected and how much of the input they realize, such as the two resources realizing the predicate GO shown in Fig. 4, one for #0Dy which incorporates MODE PLANE and another for cross which incor- null are to be realized as arguments. The remaining thematic role MODE with the argument PLANE #0Clling it, is to be realized by a modi#0Cer.</Paragraph>
    </Section>
    <Section position="8" start_page="1" end_page="1" type="sub_section">
      <SectionTitle>
4.6 Encapsulation of syntactic consequences
</SectionTitle>
      <Paragraph position="0"> All syntactic information should be encapsulated within resources and transparent to the algorithm. This includes the identi#0Ccation of arguments, including their placement with respect to the realization. Another example of a syntactic consequence is the presence of additional syntactic material required by the lexical item in the particular syntactic con#0Cguration. The verb found inthe active con#0Cguration, as in #284a#29, does not require any additional syntactic material.</Paragraph>
      <Paragraph position="1"> On the other hand, the noun founding in the con#0Cguration with prepositional phrases headed by of and by, as in #284b#29, may be said to require the use of the prepositions. The resources for found and founding are shown in Fig. 6. Encapsulation of such consequences allows us to avoid special mechanisms to keep track of and enforce  them for individual resources.</Paragraph>
    </Section>
    <Section position="9" start_page="1" end_page="2" type="sub_section">
      <SectionTitle>
4.7 Syntactic rank and category
</SectionTitle>
      <Paragraph position="0"> No assumptions are made about the realization of a piece of input semantics, including its syntactic rank and category. For instance, the predicate EXCEL can be realized by the verb excel, the adverb well, and the adjective good, as illustrated in #286a-6c#29. The processing is the same: a resource is selected and any argument realizations are attached to the resource.</Paragraph>
      <Paragraph position="1"> Fig. 7 shows a resource for the predicate EXCEL realized by the verb excel. What is interesting about this case is that the DSG formalism we chose allows us to encapsulate the PRO in the subject position of the complement as a syntactic consequence of the verb excel in this con#0Cguration. The other resource for EXCEL shown in Fig. 7 is unusual in that the predicate is realized byanadverb, well. Note the link between the uninstantiated theme on the semantic side and the position for its corresponding syntactic realization, the substitution node VP  Also notice that the experiencer of EXCEL is considered realized by the well resource and coindexed with the agent of the theme of EXCEL, to be realized by a separate resource.</Paragraph>
      <Paragraph position="2">  tics. The matching in step 1 of our algorithm determines that the subtree of the input rooted at TEACH must be recursively realized. The realization of this subtree yields Barbara teaches. Because of the link between the theme of EXCEL and the VP  node of well, the realization Barbara teaches is substituted to the VP  node of well. This is a more complex substitution than in regular TAG #28where the substitution node is identi#0Ced with the root of the argument realization#29, and is equivalent to the adjunction of well to Barbarateaches. In DSG, we are able to treat structures such as the well structure as initial and not auxiliary, as TAG would. Thus, argument realizations are combined with all structures in a uniform fashion.</Paragraph>
    </Section>
    <Section position="10" start_page="2" end_page="2" type="sub_section">
      <SectionTitle>
4.8 Grammatical forms
</SectionTitle>
      <Paragraph position="0"> As discussed before, grammatical forms themselves can realize a piece of semantics. For instance, the imperative syntactic form realizes a request or a command to the listener, as shown in Fig. 9. Likewise, the wh-question form realizes a request to identify, also shown in Fig. 9.</Paragraph>
      <Paragraph position="1"> In our system, whether the realization has any lexical items is not relevant.</Paragraph>
    </Section>
    <Section position="11" start_page="2" end_page="2" type="sub_section">
      <SectionTitle>
4.9 The role of DSG
</SectionTitle>
      <Paragraph position="0"> We believe that the choice of the DSG formalism plays a crucial role in maintaining our simple methodology. Like TAG, DSG allows capturing syntactic consequences in one elementary structure. DSG, however, allows even greater #0Dexibilityinwhat is included in an elementary structure. Note that in DSG wemayhave nonimmediate domination links between nodes of  di#0Berent syntactic categories #28e.g., between the S and NP in Fig. 9 and also in the excel at structure in Fig. 7#29. DSG also allows uniform treatment of complementation and modi#0Ccation using the operations of substitution #28regardless of the realization of the predicate, e.g., the structures in Fig. 7#29 and adjunction, respectively.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>