<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-2179">
  <Title>The implementation of a computational grammar of French using the Grammar Development Environment</Title>
  <Section position="4" start_page="0" end_page="1024" type="metho">
    <SectionTitle>
2 Comparison of the two grammars
</SectionTitle>
    <Paragraph position="0"> The ANLT grammar was used as a model and initial source of inspiration for the design and implementation of the grammar of French, despite obvious differences between the two languages. Both grammars strive to account for the same properties of natural language although they differ on points of detail, the differences being essentially structural.</Paragraph>
    <Section position="1" start_page="0" end_page="1024" type="sub_section">
      <SectionTitle>
2.1 Similarities
</SectionTitle>
      <Paragraph position="0"> X-bar theory is used in (Gazdar et al., 1985) to characterize constituent structure. In our grammar, as in the ANLT grammar, X-bar schemata are respected in general, although there are differences of detail. There is no specifier in the verb phrase (VP), thus the V2 immediately dominates the V; complex specifiers in noun phrases (NP) and adjectival and adverbial phrases (AdjP and AdvP) are given special treatment: there are X2 level specifiers of X2 constituents, as for example in: (1) N2 → N2[POSS +], H2 the man's black hat</Paragraph>
      <Paragraph position="2"> une foule de ces étudiants (a crowd of those students) Furthermore, some constituents have a specifier at level X1 even if there is a specifier at level X2: (2) N2 → Spec, H2 tous les enfants (all the children)</Paragraph>
      <Paragraph position="4"> les enfants (the children) Adjectival, adverbial and prepositional phrases are treated in a similar fashion in both grammars.  Thus in many respects our grammar closely follows the ANLT grammar.</Paragraph>
    </Section>
    <Section position="2" start_page="1024" end_page="1024" type="sub_section">
      <SectionTitle>
2.2 Differences
</SectionTitle>
      <Paragraph position="0"> The structure of the NP and that of the VP in the CGF differ from those in the ANLT grammar. This is a reflection of phenomena which are characteristic of French: cliticization, present in French but not in English, and agreement, which is more limited in English than in French. We give an overview of these phenomena and then examine the structures.</Paragraph>
      <Paragraph position="1">  There are many more cases of agreement in French than in English and many more lexical items exhibit agreement features (determiners, adjectives, past participles and conjugated verbs). In particular, two sets of rules are required to account for AdvP and AdjP constituents in CGF, since only the latter is subject to agreement. The agreement patterns are more complex as well. In the VP for example the past participle can agree with a direct object when it is anteposed (3), or with its subject when it is conjugated with être (for a non-pronominal verb); otherwise it remains invariant.</Paragraph>
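      <Paragraph> The three-way decision just described for the past participle can be written out as a small function. This is a minimal Python sketch of the descriptive generalization only, not the CGF's rule set: the class Agr, the function participle_agreement and all parameter names are hypothetical and introduced here purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agr:
    """Agreement features; the names are illustrative, not taken from the CGF."""
    gender: str  # "masc" or "fem"
    number: str  # "sg" or "pl"


def participle_agreement(auxiliary, subject, direct_object=None,
                         object_anteposed=False, pronominal=False):
    """Return the features the past participle agrees with, or None if invariant."""
    # Case 1: an anteposed direct object controls agreement.
    if direct_object is not None and object_anteposed:
        return direct_object
    # Case 2: with être and a non-pronominal verb, the subject controls agreement.
    if auxiliary == "être" and not pronominal:
        return subject
    # Otherwise the participle remains invariant.
    return None


paul = Agr("masc", "sg")
table = Agr("fem", "sg")

# "la table que Paul a faite": the anteposed object controls agreement.
print(participle_agreement("avoir", paul, table, object_anteposed=True))
# "Paul a fait la table": the object follows the verb, so no agreement.
print(participle_agreement("avoir", paul, table))
```

      The sketch returns the controlling feature bundle rather than an inflected form, so that the same function could in principle feed any morphological component.</Paragraph>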
      <Paragraph position="2">  (3) la table que Paul a faite (the table that Paul made); il les a vus (he saw them) In quantified NPs, we examine two cases: 1. When the subject of a verb is an NP quantified by an adverb or an adjective, the complement of the quantifier determines agreement: (4) beaucoup de garçons sont arrivés (many boys have arrived (masc. pl.)); beaucoup de filles sont arrivées (many girls have arrived (fem. pl.)) 2. When the subject of a verb is a collective noun, the verb can agree either with the collective or with the element it quantifies: (5) une foule de garçons est arrivée (a crowd of boys has arrived (fem. sg.)); une foule de garçons sont arrivés (a crowd of boys have arrived (masc. pl.))  Cliticization refers to the phenomenon where a verbal complement can be pronominalized and adjoined to the verb: (6) Pierre voit le garçon (Peter sees the boy); Pierre le voit ("Peter him sees") This phenomenon is interesting because it bears a superficial resemblance to long distance dependencies, which are dealt with by SLASH in GPSG.</Paragraph>
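      <Paragraph> The two cases of subject agreement with quantified NPs, illustrated in (4) and (5), can be summarized as a function that lists the agreement options open to the verb. The following Python sketch is an assumption-laden illustration: the function name, the quantifier_kind labels and the encoding of features as (gender, number) pairs are all hypothetical and do not come from the grammar itself.

```python
def subject_agreement_options(quantifier_kind, quantifier_agr, complement_agr):
    """List the (gender, number) pairs the verb may agree with."""
    if quantifier_kind in ("adverb", "adjective"):
        # Case 1, as in (4): only the complement of the quantifier counts.
        return [complement_agr]
    if quantifier_kind == "collective":
        # Case 2, as in (5): either the collective noun or what it quantifies.
        return [quantifier_agr, complement_agr]
    raise ValueError("unknown quantifier kind: " + quantifier_kind)


# beaucoup de garçons sont arrivés: an adverb carries no agreement features.
print(subject_agreement_options("adverb", None, ("masc", "pl")))
# une foule de garçons est arrivée / sont arrivés: two options.
print(subject_agreement_options("collective", ("fem", "sg"), ("masc", "pl")))
```
      </Paragraph>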
    </Section>
  </Section>
  <Section position="5" start_page="1024" end_page="1026" type="metho">
    <SectionTitle>
3 Structural differences
</SectionTitle>
    <Paragraph position="0"> The data relative to cliticization and agreement have dictated the structure of the French VP and NP.</Paragraph>
    <Section position="1" start_page="1024" end_page="1025" type="sub_section">
      <SectionTitle>
3.1 The structure of the French VP
</SectionTitle>
      <Paragraph position="0"> The structure for the VP in English developed in the ANLT grammar corresponds to that in (Gazdar et al., 1982): a binary branching structure where each verbal element takes a VP complement of a specific type (the "cascading structure"). The same type of structure for the French VP is generally assumed in transformational or generative grammar and also in GPSG analyses (Miller, 1991). We have departed from these traditional analyses and have implemented a flat structure for the compound tenses, while retaining the cascading one for the passive. For example, the structure for a direct transitive verb is expressed by the rule  in (7) and the passive by the rule in (8): (7) SV → H[AUX +], V[AUX -], N2 (8) SV → H[SUBCAT être], X2[PRD]  In rule (8), the X2 can be realized as a passive participle or any predicative complement, as in the GPSG analysis of passives.</Paragraph>
      <Paragraph position="1">  The arguments in favour of distinguishing the two structures are numerous. Many have been discussed in (Abeillé &amp; Godard, 1994) and also in (Emirkanian &amp; Da Sylva, 1995). Directly relevant to our discussion is the fact that only the passive participle may be cliticized (9a), [2] not the participle involved in compound tenses (9b): (9) a. il est aimé par Marie (he is loved by Mary); il l'est ("he it is") b. il a mangé une pomme (he ate an apple); * il l'a ("he it has") Thus in the passive structure, the copula behaves like control verbs such as the modal vouloir (to want) which take a V2 complement: il veut partir (he wants to leave) yields il le veut ("he it wants").</Paragraph>
      <Paragraph position="2"> Our analysis of the French VP also provides an account of past participle agreement. In GPSG, agreement is handled by the Control-Agreement Principle (CAP). [3] However, we have been unable to account for past participle agreement in French using the CAP (see (Emirkanian et al., in press)); [4] the latter account makes use of other devices, such as the Feature Cooccurrence Restrictions and the Feature Specification Defaults.</Paragraph>
      <Paragraph position="3"> [2] With or without its complements. See (Abeillé &amp; Godard, 1994) for more details. [3] In the GDE implementation of the grammar of English (Grover et al., 1993), the CAP seems satisfactorily transposed: it is implemented by a relatively small set of propagation rules, which respect the generality of the CAP.</Paragraph>
      <Paragraph position="4"> [4] This article shows on the other hand how, for the past participle, the CAP can account for agreement in predicative structures, for example the passive.</Paragraph>
    </Section>
    <Section position="2" start_page="1025" end_page="1025" type="sub_section">
      <SectionTitle>
3.1.2 GDE Implementation of a flat VP
</SectionTitle>
      <Paragraph position="0"> The insertion of the auxiliary into the VP structure to produce a flat VP is done by a metarule.</Paragraph>
      <Paragraph position="1"> However, implementing this flat structure in the GDE was found to be highly problematic. In the GDE, grammars are pre-compiled into ordered phrase structure rules, and the number of these rules necessary to account for the VP turned out to be extremely large. To give an idea of the size of the resulting grammar, consider the lexical ID rule for a verb requiring an NP and a PP complement: (10) V2 → H[3], N2, P2[à] Paul donne un livre à Marie (Paul gives a book to Mary) The following metarules operate on this ID rule: passive; direct object extraction (SLASH N2); direct object cliticization (accusative); direct object cliticization (oblique); indirect object extraction (SLASH P2[à]); indirect object cliticization (dative); indirect object cliticization (locative); direct object extraction (SLASH N2[de]); adverb insertion; auxiliary insertion; supercompound aux insertion; negative adverb (pas) insertion; subject clitic insertion.</Paragraph>
      <Paragraph position="2"> Although not all rules are compatible with each other (in particular, no direct object extraction rule can apply to the result of the passive rule), the combinatorics are complex: in a test grammar, we found that the number of phrase structure rules corresponding to this ID rule was of the order of 2^13, or over 8000. This is of course unacceptable, as the CGF comprises 45 different lexical ID rules. Not all of them give rise to so many rules, but the compilation time and size of the output grammar made this solution impractical.</Paragraph>
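      <Paragraph> To see where a figure of this order comes from, one can count the ways of switching each of the thirteen metarules listed above on or off. The Python sketch below is a toy model under stated assumptions: it treats a compiled rule as a subset of metarules, ignores rule ordering and repeated application, and encodes only the single incompatibility mentioned in the text (passive versus direct object extraction). The metarule names are copied from the list above; the function names are hypothetical. The figure reported for the test grammar comes from the GDE compilation itself, not from this count.

```python
from itertools import combinations

METARULES = [
    "passive",
    "direct object extraction (SLASH N2)",
    "direct object cliticization (accusative)",
    "direct object cliticization (oblique)",
    "indirect object extraction (SLASH P2[à])",
    "indirect object cliticization (dative)",
    "indirect object cliticization (locative)",
    "direct object extraction (SLASH N2[de])",
    "adverb insertion",
    "auxiliary insertion",
    "supercompound aux insertion",
    "negative adverb (pas) insertion",
    "subject clitic insertion",
]

DIRECT_OBJECT_EXTRACTION = {
    "direct object extraction (SLASH N2)",
    "direct object extraction (SLASH N2[de])",
}


def all_subsets(items):
    """Yield every subset of items, from the empty set to the full list."""
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def compatible(subset):
    """Reject subsets combining passive with a direct object extraction rule."""
    chosen = set(subset)
    return not ("passive" in chosen and chosen.intersection(DIRECT_OBJECT_EXTRACTION))


upper_bound = sum(1 for _ in all_subsets(METARULES))
pruned = sum(1 for subset in all_subsets(METARULES) if compatible(subset))

print(upper_bound)  # 8192, i.e. 2 ** 13
print(pruned)       # 5120 once the one stated incompatibility is applied
```

      Even with that incompatibility removed, the count stays in the thousands for a single lexical ID rule, which is the point the paragraph above makes.</Paragraph>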
      <Paragraph position="3"> Instead, we implemented the structure of the VP as a verbal complex of the auxiliary and the past participle which is sister to the complements. The above metarules apply either to the verbal complex or to the complements, thus reducing considerably the combinatorics. The resulting structure is sufficiently equivalent from a theoretical perspective (i.e., the past participle and its complements do not form a constituent) and it allowed us to implement the bulk of our grammar.</Paragraph>
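      <Paragraph> The saving obtained by splitting the metarules between the verbal complex and the complements can be illustrated with simple arithmetic. The sketch below is hypothetical: the source does not state how many metarules target each part, so the split of 5 and 8 is chosen only for illustration, and the function names are not part of the GDE.

```python
def variants_flat(n_metarules):
    """Upper bound when every metarule can combine with every other one."""
    return 2 ** n_metarules


def variants_split(n_on_complex, n_on_complements):
    """Upper bound when the two groups of metarules expand separate rules."""
    return 2 ** n_on_complex + 2 ** n_on_complements


total = 13
on_complex = 5  # hypothetical split, not a figure from the paper
print(variants_flat(total))                            # 8192
print(variants_split(on_complex, total - on_complex))  # 288
```

      Whatever the actual split, the sum of two smaller powers of two is far below the single large one, which is consistent with the reduction described above.</Paragraph>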
      <Paragraph position="4"> The size of our grammar is now as follows: 185 ID rules and 81 metarules. After compilation, these 185 ID rules expand to 2630 expanded ID rules and 3053 phrase structure rules.</Paragraph>
      <Paragraph position="5"> Let us now turn to our account of the structure of the NP.</Paragraph>
    </Section>
    <Section position="3" start_page="1025" end_page="1026" type="sub_section">
      <SectionTitle>
3.2 The structure of the French NP
</SectionTitle>
      <Paragraph position="0"> The data relative to cliticization and agreement have shaped the structure of the NP in various ways. Our study of the French NP has been influenced mainly by the work of (Milner, 1978). We will concentrate here on quantitative structures (such as beaucoup d'enfants, many children) and partitive structures (such as beaucoup de ces enfants, many of these children). These involve a quantifier (adverb, pronoun, collective or adjective) whose "complement" contains respectively a determiner-less NP with de (de filles) or an NP with a definite determiner (de ces filles).</Paragraph>
      <Paragraph position="1"> The ANLT grammar's treatment of partitive NPs such as "all of the children", akin to trois de ces filles and also to beaucoup de filles, assumes a three-way branching structure, given by the following rule: (11) N2[+SPEC] → A2, P[of], N2[+SPEC] This structure must be rejected for French based on data from cliticization: the de and what follows must form a constituent, which may be cliticized as en, as a P2[PFORM de] can be: (12) je parle de Marie (I speak of Mary); j'en parle ("I of-her speak"); je vois beaucoup de filles (I see many girls); j'en vois beaucoup ("I of-them see many") But rather than treat this constituent as a P2, we treat it as an N2 with [PFORM de], because of agreement data: agreement features from the complement must be allowed to percolate upwards to explain possible agreement patterns with it, as in une foule d'hommes sont venus (a crowd of men have come). This can be achieved only if the complement is allowed to be an N2 head of the higher NP, and not a P2 (there is no justification for agreement features on a PP in French). (Miller, 1991) also argues in favour of treating de and à in French as markers on N2, rather than as prepositions, yielding N2[PFORM de] and N2[PFORM à]. The following rules for a subset of French quantitative constructions highlight the prevalence of this N2: [6] (13) a. N2[PFORM nil] → Adv2[+QTE], H2[de] beaucoup de filles (many girls) b. N2[PFORM nil] → N2[Coll], H2[de] une foule de filles (a crowd of girls) c. N2[PFORM nil] → A2[+QTE], H2[de] trois filles (three girls) d. N2[PFORM nil] → H2[de] des filles (girls) [6] We have omitted the description of the internal structure of this N2[de].
Our grammar accounts for the absence of an explicit de in certain contexts (i.e., trois filles) as well as its presence in extraposed contexts (j'en ai trois, de filles) and its contraction with the determiner des to produce simply de (de filles - règle de cacophonie).</Paragraph>
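      <Paragraph> The role of the N2[de] head in rules (13a) to (13d) can be illustrated by a toy feature-percolation function. This is a minimal Python sketch, not GDE notation: categories are encoded as plain dictionaries, and the function name quantified_np and the keys cat, PFORM, AGR and form are assumptions made for this example. It covers only percolation from the head, which is the step the agreement argument relies on.

```python
def quantified_np(quantifier, head):
    """Build the mother N2[PFORM nil] from a quantifier and an N2[de] head."""
    if head["cat"] != "N2" or head["PFORM"] != "de":
        raise ValueError("the head daughter must be an N2[PFORM de]")
    return {
        "cat": "N2",
        "PFORM": "nil",
        # Agreement percolates from the head daughter, which is possible
        # because that daughter is an N2 rather than a P2.
        "AGR": head["AGR"],
        "daughters": [quantifier, head],
    }


# une foule d'hommes sont venus (a crowd of men have come)
foule = {"cat": "N2", "Coll": True, "AGR": ("fem", "sg"), "form": "une foule"}
hommes = {"cat": "N2", "PFORM": "de", "AGR": ("masc", "pl"), "form": "d'hommes"}

np = quantified_np(foule, hommes)
print(np["AGR"])  # ('masc', 'pl')
```

      Had the complement been analysed as a P2, the check on the head would fail and there would be no agreement features to pass up, which mirrors the argument made in the text.</Paragraph>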
      <Paragraph position="2">  The following examples featuring cliticization and dislocated structures, where the de reappears, also argue for the N2[de]: (14) a. il voit beaucoup de filles ("he sees many of girls"); il en voit beaucoup ("he of-them sees many"); il en voit beaucoup, de filles ("he of-them sees many, of girls") (14) b. il voit une foule de filles (he sees a crowd of girls); il en voit une foule, de filles ("he of-them sees a crowd, of girls") (14) c. il voit trois filles (he sees three girls); il en voit trois ("he of-them sees three"); il en voit trois, de filles ("he of-them sees three, of girls") We posit an analogous structure for NPs with the determiner des in rule (13d), i.e., a partitive structure where the quantifier is not specified. Indeed, cliticization in en is possible (14d).</Paragraph>
      <Paragraph position="3"> (14) d. il voit des filles (he sees girls); il en voit ("he of-them sees"); il en voit, des filles ("he of-them sees, girls") In all structures described above, we postulate an N2[de] which may be cliticized. In that case, it leaves behind a "stranded" quantifier (except in the case of rule (13d), where that quantifier is null).</Paragraph>
    </Section>
  </Section>
</Paper>