<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1153">
  <Title>Generating the XTAG English grammar using metarules</Title>
  <Section position="6" start_page="19" end_page="19" type="evalu">
    <SectionTitle>
4 Evaluation
</SectionTitle>
    <Paragraph position="0"> An important methodological point is that the grammar was generated to match a pre-existent English grammar, so the generated trees could be checked directly against an independent reference. We can therefore claim that the evaluation was quite accurate. Differences between the generated and pre-existent trees had to be explained and discussed with the group of grammar developers. Often this led to the discovery of errors and of better ways of modeling the grammar. Perhaps the best expression of the success of this enterprise is that we were able to generate the 53 verb families (783 trees) from only the corresponding 53 declarative trees (or so) plus 21 metarules, a quite compact initial set. More importantly, this compact set can be effectively used for grammar development. We now turn to the problems found, as well as to some interesting observations.</Paragraph>
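    <Paragraph position="1"> The comparison between generated and pre-existent trees can be pictured as a pair of set differences. The following is only a minimal sketch under assumptions: the tree names are hypothetical placeholders, and nothing here reproduces the actual XTAG inventories or the tooling used in the experiment.

```python
# Hypothetical tree names; the real XTAG inventories are not reproduced here.
reference = {"tree_decl", "tree_passive_by", "tree_passive_no_by", "tree_idiosyncratic"}
generated = {"tree_decl", "tree_passive_by", "tree_passive_no_by", "tree_extra"}

# Undergeneration: trees of the reference grammar that were not generated.
undergenerated = sorted(reference - generated)

# Overgeneration: generated trees that the reference grammar does not contain.
overgenerated = sorted(generated - reference)

print("undergenerated:", undergenerated)
print("overgenerated:", overgenerated)
```

Each name in either list corresponds to a difference that would have to be explained, either as a gap in the metarules or as an error or design decision in the reference grammar.</Paragraph>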
    <Section position="1" start_page="19" end_page="19" type="sub_section">
      <SectionTitle>
4.1 We undergenerate:7
</SectionTitle>
      <Paragraph position="0"> There are about 20 idiosyncratic trees not generated, involving trees for &amp;quot;-ed&amp;quot; adjectives, restricted to transitive and ergative families, and Determiner Gerund trees, which lack a clear pattern across the families.8 These trees should be separately added to the families. Similarly, there are 10 trees involving punctuation in the sentential complement families which are not worth generating automatically.</Paragraph>
      <Paragraph position="1"> We do not yet handle: the passivization of the [...] of&amp;quot;); the occurrence of the &amp;quot;by phrase&amp;quot; before sentential complements (&amp;quot;I was told by Mary that ...&amp;quot;); and wh-extraction of sentential complements and of exhaustive PPs. Except for the first case, all can be easily accounted for.</Paragraph>
      <Paragraph position="2"> 8 [...] selects a prepositional complement introduced by the preposition of: &amp;quot;The finding of the treasure (by the pirates) was news for weeks.&amp;quot; But the &amp;quot;of&amp;quot; insertion is not uniform across families: cf. &amp;quot;the accounting for the book.&amp;quot;</Paragraph>
      <Paragraph position="3"> No. | DESCRIPTION | EXAMPLE
1 | Declarative | He put the book on the table
2 | Passive w. by | The book was put on the table by him
3 | Passive w.o. by | The book was put on the table
4 | Gerundive nominals | He putting the book on the table was unexpected
5 | Gerundive for passive w. by | The book being put on the table by him ...
6 | Gerundive for passive w.o. by | The book being put on the table ...
7 | Subject extraction | Who put the book on the table ?
8 | Subj. extr. from passive w. by | What was put on the table by him ?
9 | Subj. extr. from passive w.o. by | What was put on the table ?
10 | 1st obj. extraction | What did he put on the table ?
11 | 2nd obj. NP extraction | Where did he put the book on ?
12 | 2nd obj. NP extr. from pass. w. by | Where was the book put on by him ?
13 | Agent NP extr. from pass. w. by | Who (the hell) was this stupid book put on the table by ?
14 | 2nd obj. NP extr. from pass. w.o. by | Where was the book put on ?
15 | PP obj. extr. | On which table did he put the book ?
16 | PP obj. extr. from pass. w. by | On which table was the book put by him ?
17 | By-clause extr. from pass. w. by | By whom was the book put on the table ?
18 | PP obj. extr. from pass. w.o. by | On which table was the book put ?
19 | Imperative | Put the book on the table !
20 | Declarative with PRO subject | I want to [ PRO put the book on the table ]
21 | Passive w. by w. PRO subject | The cat wanted [ PRO to be put on the tree by J. ]
22 | Passive w.o. by w. PRO subject | The cat wanted [ PRO to be put on the tree ]
23 | Ger. noms. with PRO subject | John approved of [ PRO putting the cat on the tree ]
24 | Ger. noms. for passive w. by w. PRO subj. | The cat approved of [ PRO being put on the tree by J.]
25 | Ger. noms. for passive w.o. by w. PRO subj. | The cat approved of [ PRO being put on the tree]
</Paragraph>
    </Section>
    <Section position="2" start_page="19" end_page="19" type="sub_section">
      <SectionTitle>
4.2 We overgenerate:
</SectionTitle>
      <Paragraph position="0"> We generate 1200 trees (instead of 1008).9 However, things are not as bad as they look: 206 of them are for passives related to multi-anchor trees, as we explain next. A certain amount of overgeneration in the tree families is acknowledged to exist, owing to the separation between the lexicon and the tree templates. For instance, it is widely known that not all transitive verbs can undergo passivization, yet the transitive family contains passive trees. The two can be reconciled through features assigned to verbs that block the selection of a particular tree. However, in the family for verb particle with two objects (e.g., for &amp;quot;John opened up Mary a bank account&amp;quot;), the four lexical entries were judged not to undergo passivization, and the corresponding trees (64) were omitted from the family. It is not surprising, then, that the metarules overgenerate them. Still, 100 out of the 206 are for passives in the unfinished idiom families and are definitely lexically dependent. The other 42 overgenerated passives are in the light verb families. There are a few other cases of overgeneration due to lexically dependent judgments, not worth detailing. Finally, a curious case involved empty elements that could be generated at slightly different positions which are not distinguished at the surface (e.g., before or after a particle). The choice to have only one alternative in the grammar is practical in nature (related to parsing efficiency) rather than linguistic.</Paragraph>
      <Paragraph position="1"> 9 Which means more than an excess of 192 trees, since there is also some undergeneration, already mentioned.</Paragraph>
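      <Paragraph position="2"> The counts reported in this subsection can be cross-checked with a few lines of arithmetic. The numbers below are taken from the text; the variable names and the helper function are illustrative assumptions, and the exact number of undergenerated trees is left as a parameter because the text gives it only approximately.

```python
generated_total = 1200   # trees produced by the metarules (from the text)
reference_total = 1008   # trees in the pre-existent families (from the text)

# Overgenerated passives related to multi-anchor trees, by group (from the text).
passive_excess = {
    "verb particle with two objects": 64,
    "idiom families": 100,
    "light verb families": 42,
}

net_excess = generated_total - reference_total
passive_total = sum(passive_excess.values())

def gross_overgeneration(net, undergenerated):
    """Overgenerated trees implied by a net excess and a count of missing trees."""
    return net + undergenerated

print(net_excess)      # 192
print(passive_total)   # 206
```

Because missing trees lower the generated total, the number of overgenerated trees equals the net excess plus the number of undergenerated trees, which is why the net figure of 192 understates the overgeneration.</Paragraph>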
    </Section>
    <Section position="3" start_page="19" end_page="19" type="sub_section">
      <SectionTitle>
4.3 Limitations to further compaction:
</SectionTitle>
      <Paragraph position="0"> All the metarules for wh-object extraction do essentially the same thing, but currently they cannot be unified. Further improvements in the implementation of the metarule system could solve the problem at least partially, by allowing symbols and indices to be treated as separate variables. A more difficult problem is posed by some subtle differences in the feature equations across the grammar (e.g., those that make a separate tree necessary for relativization of the subject in passive trees). By far, feature equations constitute the hardest issue to handle with the metarules.</Paragraph>
    </Section>
    <Section position="4" start_page="19" end_page="19" type="sub_section">
      <SectionTitle>
4.4 A metarule shortcoming:
</SectionTitle>
      <Paragraph position="0"> Currently, metarules do not allow negative structural constraints on matching to be specified. There is one feature equation related to punctuation that needed 5 separate metarules (not described above) to handle, by exhaustion, the following constraint: the equation should be added if and only if the tree has some non-empty material after the verb which is not a &amp;quot;by-phrase&amp;quot;.</Paragraph>
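      <Paragraph position="1"> The difference between stating such a constraint negatively and covering it by exhaustion can be sketched as follows. This is a toy illustration under stated assumptions: post-verbal material is reduced to a list of labels, and the inventory of labels is hypothetical. It does not reproduce the five metarules actually used, which are not described in this paper.

```python
# Hypothetical inventory of post-verbal constituent labels (an assumption).
POSITIVE_KINDS = ["NP", "PP", "S"]

def wants_equation(post_verbal):
    """Negative formulation: some non-empty material that is not a by-phrase."""
    return any(item not in ("by-phrase", "empty") for item in post_verbal)

def wants_equation_by_exhaustion(post_verbal):
    """Negation-free formulation: one positive pattern per kind of material."""
    return any(kind in post_verbal for kind in POSITIVE_KINDS)

samples = [[], ["empty"], ["by-phrase"], ["NP"], ["by-phrase", "PP"], ["S", "by-phrase"]]
for sample in samples:
    print(sample, wants_equation(sample), wants_equation_by_exhaustion(sample))
```

The two functions agree only as long as the positive inventory is complete, which is the cost of working without negative constraints: every admissible kind of material needs its own pattern.</Paragraph>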
    </Section>
    <Section position="5" start_page="19" end_page="19" type="sub_section">
      <SectionTitle>
4.5 Other cases:
</SectionTitle>
      <Paragraph position="0"> A separate metarule was needed to convert foot nodes into substitution nodes in sentential complement trees. These families depart from the rest of the grammar in that their base tree is an auxiliary tree, to allow extraction from the sentential complement. But the corresponding relative clauses must have the S complement as a substitution node.</Paragraph>
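      <Paragraph position="1"> The effect of such a conversion can be sketched on a toy tree representation. The Node class, its kind labels, and the example tree are assumptions made for illustration only; they are not the XTAG tree format or the notation of the metarule system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    label: str
    kind: str = "internal"   # "internal", "anchor", "foot", or "subst"
    children: tuple = ()

def foot_to_subst(node):
    """Return a copy of the tree in which every foot node is a substitution node."""
    kind = "subst" if node.kind == "foot" else node.kind
    return Node(node.label, kind, tuple(foot_to_subst(child) for child in node.children))

# Simplified auxiliary tree for a sentential complement verb: S( NP, VP( V, S* ) ).
base = Node("S", children=(
    Node("NP", "subst"),
    Node("VP", children=(Node("V", "anchor"), Node("S", "foot"))),
))

converted = foot_to_subst(base)
print(converted.children[1].children[1])
```

The input tree is left unchanged and only the status of the S complement differs in the result, which is what a relative clause tree derived from these families requires.</Paragraph>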
    </Section>
  </Section>
  <Section position="7" start_page="19" end_page="19" type="evalu">
    <SectionTitle>
5 Discussion
</SectionTitle>
    <Paragraph position="0"> A question might arise about the rationale behind the ordering of the rules. There has been some debate about how lexical or syntactic rules should apply to generate an LTAG. Becker's metarules have been criticized for the unboundedness of their recursive application. Becker has responded (Becker, 2000) by suggesting principles under which boundedness would arise as a natural consequence. What we propose here is a clear separation between the metarules, as a formal system for deriving trees from trees, and the control mechanism that says which rule is applied when. Given the experiment reported in this paper, it seems undeniable that such an approach should be considered at least valid.</Paragraph>
    <Paragraph position="1"> As for the particular order we adopted, as mentioned before, it comes partly from reasonable assumptions about precedence of lexical redistribution rules over extraction rules (which can also be empirically observed), and partly as a mere simplification of a partial order relation.</Paragraph>
    <Paragraph position="2"> On a related issue, it is important to note that the ordering is not among rules but among instances of rule applications, as observed in (Evans et al., 2000). It was just by &amp;quot;accident&amp;quot; that rules were applied only once. For instance, one could imagine that in languages where double wh-movement is possible, a wh-rule would have to be applied twice. That does not entitle one to reject an a priori ordering among the instances: in this case the wh-rule would simply appear twice in the graph.</Paragraph>
    <Paragraph position="3"> Still another issue that can be raised concerns the monotonicity of the approach, especially in the face of the problems we had with passives. As in (Candito, 1996), we overgenerate: ultimately, trees are incorrectly being assigned to some lexical items. In our particular case, however, this can be attributed to the architecture of the XTAG English grammar. The obvious way to handle this kind of problem in the XTAG grammar is by way of features in the lexical items that block their effective selection of a template. On the other hand, if one wants to adopt a stronger lexicalist approach, it is easy to see how one could allow the lexical item to influence the base trees so as to control which rules in the chain are effectively applied, e.g., as in (Evans et al., 2000).</Paragraph>
    <Paragraph position="4"> Or, in other words: a metarule by itself is just a mechanism for tree-transformation.10</Paragraph>
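    <Paragraph position="5"> The separation argued for in this section, between metarules as tree transformations and a control mechanism that decides which rule instance applies when, can be sketched as follows. Everything in the sketch is an assumption made for illustration: trees are abbreviated to tuples recording the steps applied, the rule names are hypothetical, and letting a lexical item block a rule is only one possible reading of the stronger lexicalist option.

```python
# Metarules as plain transformations; a tree is abbreviated to a tuple of step names.
def passive(tree):
    return tree + ("passive",)

def wh(tree):
    return tree + ("wh",)

RULES = {"passive": passive, "wh": wh}

def derive(base, chain, blocked=frozenset()):
    """Control mechanism: apply rule instances in order, skipping blocked rules."""
    tree = base
    for rule_name in chain:
        if rule_name not in blocked:
            tree = RULES[rule_name](tree)
    return tree

# The ordering is over rule instances, so the wh rule may occur twice in the chain.
chain = ["passive", "wh", "wh"]

print(derive(("decl",), chain))
print(derive(("decl",), chain, blocked={"passive"}))
```

The transformation functions know nothing about ordering or about the lexicon; both concerns live in the chain and in the blocked set, so either can be changed without touching the metarules themselves.</Paragraph>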
  </Section>
</Paper>