<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1033">
  <Title>Building Parallel LTAG for French and Italian Marie-Hélène Candito</Title>
  <Section position="3" start_page="211" end_page="213" type="metho">
    <SectionTitle>
2. The metagrammar
</SectionTitle>
    <Paragraph position="0"> Formally, the MG takes up the proposal of (VSS92) to represent a grammar as a multiple inheritance network whose classes specify syntactic structures as partial descriptions of trees (Rogers &amp; Vijay-Shanker, 94). While a tree specifies, for any pair of nodes, either a precedence relation or a path of parent relations, a partial description of trees is a set of constraints that may leave the relation between two nodes underspecified.</Paragraph>
    <Paragraph position="1"> The relation between two nodes may be further specified, either directly or by inference, by adding constraints, either in sub-classes or in lateral classes in the inheritance network.</Paragraph>
    <Paragraph position="2"> In the MG, nodes of partial descriptions are augmented with two feature structures: one holding the features of the future tree sketches, and one holding features that are specific to the MG, called meta-features. Meta-features include, for instance, the possible parts of speech of a node, or the index (cf. Section 1) in the case of argumental nodes.</Paragraph>
    <Paragraph position="3"> So a class of an instantiated MG may specify the following slots:
* the (ordered) list of direct parent classes
* a partial description of trees
* feature structures associated with nodes (footnote 3)
Contrary to (VSS92), nodes are global variables within the whole inheritance network, and classes can add features to nodes without involving them in the partial description.</Paragraph>
    <Paragraph position="4"> Footnote 3: Actually the tree description language, which we will not detail here, involves constants that name nodes of satisfying trees. Several constants may be equal and thus name the same node. The equality is either inferred or explicitly stated in the description.</Paragraph>
    <Paragraph position="5"> Inheritance of partial descriptions is monotonic. The aim is to be able to build pre-lexicalized structures respecting the PACP, and to group together structures likely to pertain to the same lexeme. In order to achieve this, MG makes use of syntactic functions to express either monolingual or cross-linguistic generalizations (cf. the work in LFG, Meaning-Text Theory or Relational Grammar (RG); see (Blake, 90) for an overview). Positing syntactic functions, characterized by syntactic properties, makes it possible to draw parallels between constructions in different languages that differ on the surface (in word order or morpho-syntactic marking) but share a representation in terms of functional dependencies. Within a language, it makes it possible to abstract away from the different surface realizations of a given function and from the different diatheses a predicate can show.</Paragraph>
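As an illustration of the class slots and of monotonic inheritance described in this section, here is a minimal Python sketch. The names MGClass and collect, and the encoding of constraints as tuples such as ("precedes", "N0", "V"), are assumptions made for this example only; the actual tree description language is not detailed in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class MGClass:
    """Hypothetical encoding of one class of the inheritance network."""
    name: str
    parents: list = field(default_factory=list)    # ordered direct parent classes
    description: set = field(default_factory=set)  # constraints of the partial description
    features: dict = field(default_factory=dict)   # node name -> feature dict


def collect(cls):
    """Gather inherited constraints and node features, parents first.

    Inheritance is monotonic in this sketch: constraints are only added,
    and a feature value already set on a node is never overwritten.
    """
    constraints, features = set(), {}
    for parent in cls.parents:
        p_constraints, p_features = collect(parent)
        constraints |= p_constraints
        _merge(features, p_features)
    constraints |= cls.description
    _merge(features, cls.features)
    return constraints, features


def _merge(target, source):
    # Nodes are global variables: the same node name denotes the same node
    # in every class, so features from different classes accumulate on it.
    for node, feats in source.items():
        slot = target.setdefault(node, {})
        for key, value in feats.items():
            if key in slot and slot[key] != value:
                raise ValueError(f"feature clash on {node}.{key}")
            slot[key] = value
```

A class can thus add a feature to a node (say, a mode feature on V) without mentioning that node in its own partial description, which is the departure from (VSS92) noted above.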
    <Paragraph position="6"> So in MG, subcategorization (hereafter subcat) of predicates is expressed as a list of syntactic functions, and their possible categories.</Paragraph>
    <Paragraph position="7"> Following RG, an initial subcat is distinguished, namely the one for the unmarked case, and is modifiable by redistribution of the functions associated with the arguments of the predicate.</Paragraph>
    <Paragraph position="8"> Technically, this means that argumental nodes in partial descriptions bear a meta-feature "initial-function" and a meta-feature "function". The "function" value is by default the "initial-function" value, but can be revised by redistribution. Redistributions, in a broad sense, comprise:
* pure redistributions that do not modify the number of arguments (e.g. full passive)</Paragraph>
    <Paragraph position="9"> * reductions of the number of arguments (e.g. agentless passive)</Paragraph>
    <Paragraph position="10"> * augmentations of the number of arguments (mainly causative).</Paragraph>
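The interplay between "initial-function" and "function" can be sketched as follows. This is a simplified illustration, not the MG encoding: the function apply_redistributions, the dictionary representation of a subcat, and the function labels are all assumptions of the example, and augmentations of the number of arguments (causative) are left out.

```python
def apply_redistributions(initial_subcat, redistributions):
    """Return the successive subcats, from the initial one to the final one.

    initial_subcat: dict mapping an argument index to its initial function.
    redistributions: list of dicts, each mapping an old function to a new
    function, or to None when the argument is deleted.
    """
    current = dict(initial_subcat)  # "function" defaults to "initial-function"
    history = [dict(current)]
    for redistribution in redistributions:
        revised = {}
        for arg, function in current.items():
            new_function = redistribution.get(function, function)
            if new_function is not None:  # None: the argument is removed
                revised[arg] = new_function
        current = revised
        history.append(dict(current))
    return history


transitive = {0: "subject", 1: "object"}
full_passive = {"subject": "agt-object", "object": "subject"}
agentless_passive = {"subject": None, "object": "subject"}

print(apply_redistributions(transitive, [full_passive])[-1])
print(apply_redistributions(transitive, [agentless_passive])[-1])
```

Each redistribution is applied to all arguments simultaneously, so that the full passive swaps functions without the promoted object being demoted again in the same step.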
    <Paragraph position="11">  In MG, structures sharing the same initial subcat can be grouped to form a set of structures likely to be selected by the same lexeme. For verbal predicates, a minimal clause is partly represented by an ordered list of successive subcats, from the initial one to the final one. Minimal clauses sharing a final subcat may differ in the surface realizations of the functions. The MG represents this distribution of information by imposing a three-dimension inheritance network (footnote 4):
* dimension 1: initial subcat
* dimension 2: redistributions of functions
* dimension 3: surface realizations of syntactic functions.</Paragraph>
    <Paragraph position="12"> Footnote 4: More precisely, a hierarchy is defined for each category of predicate. Dimension 2 is primarily relevant for verbal predicates. Further, the remaining structures, for instance those for argument-less lexemes or for auxiliaries and raising verbs, are represented in an additional network, by classes that may inherit shared properties but that are entirely written by hand.</Paragraph>
    <Paragraph position="13"> In an instantiated MG for a given language, each terminal class of dimension 1 describes a possible initial subcat and partially describes the verbal morpho-syntax (the verb may appear with a frozen clitic, or with a particle in English). Each terminal class of dimension 2 describes a list of ordered redistributions (including the case of no redistribution). The redistributions may impose a verbal morphology (e.g. the auxiliary for passive). Each terminal class of dimension 3 represents the surface realization of a function (independently of the initial function). For some inter-dependent realizations, a class may represent the realizations of several functions (for instance for clitics in Romance languages). Terminal classes of the hand-written hierarchy are pieces of information that can be combined to form a tree sketch that respects the PACP.</Paragraph>
    <Paragraph position="14"> For a given language, some of the terminal classes are incompatible. This is stated either by the content of the classes themselves or within an additional set of language-dependent constraints (compatibility constraints). For instance, a constraint is set for French to block the co-occurrence of an inverted subject with an object in canonical position (while this is possible in Italian).</Paragraph>
    <Paragraph position="15"> 3. Compilation of MG to LTAG
The compilation is a two-step process, illustrated in Figure 2. First, the compiler automatically creates additional classes of the inheritance network: the "crossing classes". Then each crossing class is translated into one or several tree sketches.</Paragraph>
    <Section position="1" start_page="213" end_page="213" type="sub_section">
      <SectionTitle>
3.1 Automatic extension of the hierarchy
</SectionTitle>
      <Paragraph position="0"> A crossing class is a linguistic description that must fulfill the PACP. Using syntactic functions and the three-dimension partition, MG makes this well-formedness principle more precise. A crossing class is a class of the inheritance network that is automatically built as follows:
* a crossing class inherits exactly one terminal class of dimension 1
* then, a crossing class inherits exactly one terminal class of dimension 2
These two super-classes define an ordered list of subcats, from the initial one to the final one.</Paragraph>
      <Paragraph position="1"> * then, a crossing class inherits classes of dimension 3, representing the realizations of every function of the final subcat.</Paragraph>
      <Paragraph position="2"> Further, for a crossing class to be well-formed, all unifications involved during the inheritance process must succeed, either for feature structures or for partial descriptions. Clashes between features or inconsistencies in partial descriptions are used to rule out some irrelevant crossings of linguistic phenomena. Finally, the compatibility constraints must be respected (cf Section 2).</Paragraph>
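The construction of crossing classes can be sketched as a filtered product over the three dimensions. The sketch below is a toy under stated assumptions: feature structures are reduced to flat dictionaries, partial descriptions are ignored, redistributions that introduce a new function (such as the impersonal) are not modelled, and every class name and function name (unify, crossing_classes) is hypothetical.

```python
from itertools import product


def unify(a, b):
    """Unify two flat feature dicts; return None on a clash."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None
        merged[key] = value
    return merged


def crossing_classes(dim1, dim2, dim3, incompatible):
    """Enumerate well-formed crossings.

    dim1: {class name: initial subcat, as a tuple of functions}
    dim2: {class name: (function mapping, features)}; None deletes a function
    dim3: {function: {class name: features}}
    incompatible: set of frozensets of dimension-3 class names
    """
    results = []
    for (c1, subcat), (c2, (mapping, feats2)) in product(dim1.items(), dim2.items()):
        if not set(mapping).issubset(subcat):
            continue  # the redistribution needs a function this subcat lacks
        final = [mapping.get(f, f) for f in subcat]
        final = [f for f in final if f is not None]
        options = [list(dim3[f].items()) for f in final]
        for choice in product(*options):
            names = [name for name, _ in choice]
            feats = dict(feats2)
            for _, feats3 in choice:
                if feats is not None:
                    feats = unify(feats, feats3)
            if feats is None:
                continue  # feature clash: this crossing is ruled out
            if any(bad.issubset(names) for bad in incompatible):
                continue  # blocked by a compatibility constraint
            results.append((c1, c2, tuple(names)))
    return results


# Toy French-like data; all names are illustrative.
dim1 = {"n0Vn1": ("subject", "object"), "n0V": ("subject",)}
dim2 = {
    "no-redistribution": ({}, {}),
    "agentless-passive": ({"subject": None, "object": "subject"}, {"morphology": "passive"}),
}
dim3 = {
    "subject": {"preverbal-subject": {}, "inverted-subject": {}},
    "object": {"canonical-object": {}, "clitic-object": {}},
}
incompatible = {frozenset({"inverted-subject", "canonical-object"})}

for crossing in crossing_classes(dim1, dim2, dim3, incompatible):
    print(crossing)
```

The compatibility constraint in the toy data mirrors the French constraint mentioned in Section 2, which blocks an inverted subject together with an object in canonical position.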
    </Section>
    <Section position="2" start_page="213" end_page="213" type="sub_section">
      <SectionTitle>
3.2 Translation into LTAG families
</SectionTitle>
      <Paragraph position="0"> While crossing classes specify a partial description with feature structures, LTAG uses trees. So the compiler takes the "representative" tree(s) of the partial description (see Rogers &amp; Vijay-Shanker, 94 for a formal definition).</Paragraph>
      <Paragraph position="1"> Intuitively these representative trees are trees minimally satisfying the description. There can be several for one description. For example, the relative order of several nodes may be underspecified in a description, and the representative trees show every possible order.</Paragraph>
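To make the word-order example concrete, the following sketch enumerates every linear order of a set of sibling nodes that is consistent with an underspecified set of precedence constraints. It covers linear order only and is a simplification: it is not the formal definition of representative trees, and the function name compatible_orders and the node names are assumptions of the example.

```python
from itertools import permutations


def compatible_orders(nodes, precedes):
    """List every total order of nodes that respects the precedence constraints.

    precedes: set of (a, b) pairs meaning node a must come before node b.
    Pairs of nodes not mentioned in precedes are left underspecified.
    """
    orders = []
    for order in permutations(nodes):
        rank = {node: i for i, node in enumerate(order)}
        if all(rank[b] > rank[a] for a, b in precedes):
            orders.append(order)
    return orders


# The subject precedes the verb; the position of the adverb is left open.
for order in compatible_orders(["N0", "V", "Adv"], {("N0", "V")}):
    print(order)
```

With one constraint over three nodes, three of the six permutations survive, one for each position the unconstrained node can take.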
      <Paragraph position="2"> A family is generated by grouping all the trees computed from crossing classes that share the same class of dimension 1.</Paragraph>
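The grouping of tree sketches into families amounts to indexing them by the dimension 1 class of the crossing class they come from. A minimal sketch, in which build_families and the sketch names are hypothetical:

```python
from collections import defaultdict


def build_families(tree_sketches):
    """Group tree sketches by the dimension-1 class of their crossing class.

    tree_sketches: iterable of (dim1_class, sketch_name) pairs.
    """
    families = defaultdict(list)
    for dim1_class, sketch in tree_sketches:
        families[dim1_class].append(sketch)
    return dict(families)


sketches = [
    ("n0Vn1", "active-preverbal-subject"),
    ("n0Vn1", "agentless-passive-preverbal-subject"),
    ("n0V", "active-preverbal-subject"),
]
print(build_families(sketches))
```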
    </Section>
  </Section>
  <Section position="4" start_page="213" end_page="216" type="metho">
    <SectionTitle>
4. Metagrammars for French and Italian: a contrast
</SectionTitle>
    <Paragraph position="0"> We have instantiated the metagrammar for French, starting with an existing LTAG (Abeillé, 91). The recompilation MG → LTAG ensures coherence (a phenomenon is handled consistently throughout the grammar) and completeness (all valid crossings are performed). The coverage of the grammar has been extended (footnote 5).</Paragraph>
    <Paragraph position="1"> We then adapted the French MG to Italian, to obtain a "parallel" LTAG for Italian that stays close to the French one with respect to linguistic analyses. The general organization of the MG gives a methodology for systematic syntactic contrast. We describe some pieces of the inheritance network for French and Italian, with particular emphasis on dimension 2 and, in dimension 3, on the surface realizations of the subject.</Paragraph>
    <Section position="1" start_page="213" end_page="214" type="sub_section">
      <SectionTitle>
4.1 Dimension 1
</SectionTitle>
      <Paragraph position="0"> We do not give a description of the content of this dimension, but rather focus on the differences between the two languages. A first difference in dimension 1 is that in Italian there exist verbs without arguments (footnote 6) (atmospheric verbs), while in French a subject is obligatory, though it may be impersonal.</Paragraph>
      <Paragraph position="1"> Another difference is known as the unaccusative hypothesis (see (Renzi, 88, vol I) for an account). Syntactic evidence shows that the unique argument of avere-selecting intransitives (e.g. (1)) and that of essere-selecting intransitives (the unaccusatives, e.g. (2)) behave differently when post-verbal:</Paragraph>
      <Paragraph position="2"> (1) *Ne hanno telefonato tre. (of-them have phoned three) Three of them have phoned.</Paragraph>
      <Paragraph position="4"> (2) Ne sono rimaste tre. Three of them have remained.</Paragraph>
      <Paragraph position="5"> We represent unaccusatives as selecting an initial object and no initial subject. A redistribution in dimension 2 promotes this initial object into a special subject (showing subject properties and some object properties, like the ne-licensing shown in (2)). This redistribution is also used for specifying passive and middle, which both trigger unaccusative behavior (see next section).</Paragraph>
      <Paragraph position="6"> Footnote 5: The number of tree sketches rose from 800 to 1100 (without causative trees).</Paragraph>
      <Paragraph position="7"> Footnote 6: An alternative analysis would be to consider that these verbs select a subject pronoun that is not realized in Italian (a pro-drop language).</Paragraph>
      <Paragraph position="8"> Footnote 7: We take an approach simpler than that of RG, one that accounts for most of the Italian data. The auxiliary change for verbs when goal-phrases are added is not handled (see (Dini, 95) for an analysis in HPSG).</Paragraph>
    </Section>
    <Section position="2" start_page="214" end_page="215" type="sub_section">
      <SectionTitle>
4.2 Dimension 2
</SectionTitle>
      <Paragraph position="0"> The MGs for French and Italian cover the following types of redistribution (footnote 8): passive, middle, causative and impersonal (only for French). Causative verbs plus infinitives are analysed in Romance as complex predicates. For lack of space, we do not describe their encoding in MG here. Figure 3 shows the inheritance links of dimension 2 for French (without causative). Terminal classes are shown without a frame.</Paragraph>
      <Paragraph position="2"> Figure 3: Dimension 2 for French (without causative)
The verbal morphology is affected by redistributions, so it appears in the hierarchy.</Paragraph>
      <Paragraph position="3"> The hierarchy includes the case of no redistribution, which inherits an active morphology: it simply states that the anchor of the future tree sketch is also the verb that receives inflections for tense, agreement, etc.</Paragraph>
      <Paragraph position="4"> Referring to the notion of a hierarchy of syntactic functions (à la Keenan-Comrie), we can say that the redistributions shown comprise a subject demotion (which can be a deletion) and a promotion of an element to subject.</Paragraph>
      <Paragraph position="5"> For active impersonal (3), the subject is demoted to object (class SUBJECT→OBJECT), and the impersonal il is introduced as subject (class IMPERS→SUBJECT).</Paragraph>
      <Paragraph position="6"> (3) Il est arrivé trois lettres pour vous.</Paragraph>
      <Paragraph position="7"> (IL is arrived three letters for you) There arrived three letters for you.</Paragraph>
      <Paragraph position="8"> Passive is characterized by a particular morphology (auxiliary bearing inflections + past participle) and the demotion of the subject (which is either deleted, class SUBJECT→EMPTY, or demoted to a by-phrase, class SUBJECT→AGT-OBJ), but not necessarily by a promotion of the object to subject (class OBJECT→SUBJECT) (cf. (Comrie, 77)). In French, the alternative to object promotion is the introduction of the impersonal subject (class IMPERS→SUBJECT) (footnote 9).</Paragraph>
      <Paragraph position="9"> This gives four possibilities: agentless personal (4), full personal (5), agentless impersonal (6), and full impersonal; this last possibility is not well attested.</Paragraph>
      <Paragraph position="10"> (4) Le film sera projeté mardi prochain.</Paragraph>
      <Paragraph position="11"> The movie will be shown next Tuesday.</Paragraph>
      <Paragraph position="12"> (5) La voiture a été doublée par un vélo.</Paragraph>
      <Paragraph position="13"> The car was overtaken by a bike.</Paragraph>
      <Paragraph position="14"> (6) Il a été décrété l'état d'urgence.</Paragraph>
      <Paragraph position="15"> (IL was declared the state of emergency) The state of emergency was declared.</Paragraph>
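The four French passive variants arise from crossing two independent choices: how the subject is demoted, and what fills the subject position. The enumeration below is only an illustration; the class names are those used in the text (with "->" standing for the arrow), while the dictionaries and the labels attached to them are assumptions of the example.

```python
from itertools import product

# Class names follow the text; "->" stands for the arrow in class names.
subject_demotions = {"SUBJECT->EMPTY": "agentless", "SUBJECT->AGT-OBJ": "full"}
subject_fillers = {"OBJECT->SUBJECT": "personal", "IMPERS->SUBJECT": "impersonal"}

french_passives = {
    f"{demotion_label} {filler_label}": (demotion, filler)
    for (demotion, demotion_label), (filler, filler_label)
    in product(subject_demotions.items(), subject_fillers.items())
}

for label, classes in french_passives.items():
    print(label, classes)
```

The enumeration produces the "full impersonal" combination as well, which the text notes is not well attested.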
      <Paragraph position="16"> Middle is characterized by a deletion of the subject, and a middle morphology (a reflexive clitic se). Here also we have the alternative OBJECT→SUBJECT (7) or IMPERS→SUBJECT (8). The interpretation is generic or deontic in French.</Paragraph>
      <Paragraph position="17"> (7) Le thé se sert à 5h. (Tea SE serves at 5.) One should serve tea at 5.</Paragraph>
      <Paragraph position="18"> (8) Il se dit des horreurs ici.</Paragraph>
      <Paragraph position="19"> (IL SE says horrible things here) Horrible things are said in here.</Paragraph>
      <Paragraph position="20"> Now let us contrast this hierarchy with the one for Italian. Figure 4 shows dimension 2 for Italian. In Italian, what is called impersonal (9a) is a special realization of subject (by a clitic si), meaning either people, one or we (cf. Monachesi, 95). The French equivalent is the nominative clitic on (9b).</Paragraph>
      <Paragraph position="21"> Footnote 8: The locative alternation (John loaded the truck with oranges / John loaded oranges into the truck) is not covered at present, but can easily be added. It requires choosing an initial subcat for the verb. Footnote 9: So we do not analyse impersonal passive as a passive to which impersonal applies. This makes it possible to account for the (rare) cases of impersonal passives with no personal passive counterpart.</Paragraph>
      <Paragraph position="22"> (9a) it. Si partì.</Paragraph>
      <Paragraph position="23"> (SI left) People / we left.</Paragraph>
      <Paragraph position="24"> (9b) fr. On partit.</Paragraph>
      <Paragraph position="25"> This impersonal si is thus coded as a realization of subject, in dimension 3, and we have no IMPERS→SUBJECT promotion for the Italian dimension 2. The impersonal si can appear with all redistributions except the middle. The Italian middle is similar to French, with a reflexive clitic si. Indeed impersonal si, with transitive verbs and singular object (10), is ambiguous with a middle analysis (and subject inversion).</Paragraph>
      <Paragraph position="26"> (10) Si mangia il gelato.</Paragraph>
      <Paragraph position="27"> (SI eat-3sg the ice-cream) The ice-cream is eaten.</Paragraph>
      <Paragraph position="28"> With a plural nominal object, some speakers do not accept the impersonal (with singular verb, (11a)) but only the middle (with verb agreement, (11b)).</Paragraph>
      <Paragraph position="29"> (11a) Si mangia le mele. (SI eat-3sg the apples)
(11b) Si mangiano le mele. (SI eat-3pl the apples)</Paragraph>
      <Paragraph position="30"> Another difference with French redistributions is that when the object is promoted, in passive or middle, it becomes a subject showing unaccusative behavior (e.g. ne-licensing, cf. section 4.1). To represent this, we use the class OBJECT→EXTENDED-SUBJECT, which is also used for the spontaneous promotion of the initial object of unaccusative intransitives (cf. section 4.1). So for Italian, passive (agentless or full) and middle (11b) comprise a subject demotion (a mandatory deletion for middle) and the promotion OBJECT→EXTENDED-SUBJECT, while for intransitive unaccusatives, this promotion is spontaneous.</Paragraph>
      <Paragraph position="31"> Other differences between French and Italian concern the interaction of causative with other redistributions : passive and middle can apply after causative in Italian, but not in French.</Paragraph>
    </Section>
    <Section position="3" start_page="215" end_page="216" type="sub_section">
      <SectionTitle>
4.3 Dimension 3
</SectionTitle>
      <Paragraph position="0"> We describe in dimension 3 the classes for the surface realizations of subject. This function is special as it partially imposes the mode of the clause. The subject is empty for infinitives and imperatives (footnote 10). Adnominal participial clauses are represented as auxiliary trees that adjoin on an N; the subject is the foot node of the auxiliary tree (we do not detail here the different participial clauses).</Paragraph>
      <Paragraph position="1"> Footnote 10: See (Abeillé, 91) for the detail of the linguistic analyses chosen for French. We describe here the hierarchical organization.</Paragraph>
      <Paragraph position="2"> For French (Figure 5), when realized, the subject is either sentential, nominal or pronominal (clitic). Nominal subjects may be in preverbal position or inverted, relativized or cleft. These last two realizations inherit also classes describing relative clauses and cleft clauses.</Paragraph>
      <Paragraph position="3"> Sentential subjects are here only preverbal. Clitic subjects are preverbal (post-verbal subject clitics are not shown here, as their analysis is special). Note that in dimension 2, the class IMPERS→SUBJECT specifies that the subject is clitic, and dominates the word il. This will only be compatible with the clitic subject realization.</Paragraph>
      <Paragraph position="5"> Figure 5: Subject realizations for French
For Italian (Figure 6), the hierarchy for subjects is almost the same: a class for non-realized subjects is added, since Italian is a pro-drop language, and pronominal subjects are not realized. But we mentioned in section 4.2 the special case of the impersonal subject clitic si.</Paragraph>
      <Paragraph position="6"> To handle this clitic, the Italian class for clitic subject introduces the si.</Paragraph>
      <Paragraph position="7"> Figure 6 : Subject realizations for Italian (differences with French in bold)</Paragraph>
    </Section>
  </Section>
</Paper>