File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/00/c00-2155_evalu.xml

Size: 3,207 bytes

Last Modified: 2025-10-06 13:58:41

<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-2155">
  <Title>An HPSG-to-CFG Approximation of Japanese</Title>
  <Section position="8" start_page="1048" end_page="1049" type="evalu">
    <SectionTitle>
4 Evaluation
</SectionTitle>
    <Paragraph position="0"> The Japanese HPSG grammar used in our experiment consists of 43 rule schemata (28 unary, 15 binary), 1,208 types and a test lexicon of 2,781 highly diverse entries. The lexicon restrictor, as introduced in section 1.1 and depicted in figure 1, maps these entries onto 849 lexical abstractions. This restrictor tells us which parts of a feature structure have to be deleted; it is the kind of restrictor that we usually use. We call this a negative restrictor, in contrast to the positive restrictors used in the PATR-II system, which specify those parts of a feature structure that survive restriction.</Paragraph>
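The difference between a negative and a positive restrictor can be illustrated with a small sketch. It is not the authors' implementation: it assumes feature structures encoded as nested Python dicts, and the feature names (SYNSEM, CAT, HEAD, PHON), the sample entry and both restrictors are hypothetical, chosen only for illustration.

```python
def restrict_negative(fs, restrictor):
    """Return a copy of fs with every path marked in restrictor deleted.

    fs         -- feature structure as a nested dict (hypothetical encoding)
    restrictor -- nested dict; True means "delete this feature",
                  a nested dict means "descend and keep restricting"
    """
    result = {}
    for feature, value in fs.items():
        mark = restrictor.get(feature)
        if mark is True:
            continue  # negative restrictor: the marked feature is deleted
        if isinstance(mark, dict) and isinstance(value, dict):
            result[feature] = restrict_negative(value, mark)
        else:
            result[feature] = value  # unmarked features are kept as they are
    return result


def restrict_positive(fs, restrictor):
    """Return a copy of fs that keeps only the paths marked in restrictor."""
    result = {}
    for feature, value in fs.items():
        mark = restrictor.get(feature)
        if mark is True:
            result[feature] = value  # positive restrictor: marked feature survives
        elif isinstance(mark, dict) and isinstance(value, dict):
            result[feature] = restrict_positive(value, mark)
        # unmarked features are dropped
    return result


# Hypothetical lexicon entry and restrictors (assumptions, not from the paper).
entry = {
    "SYNSEM": {"CAT": {"HEAD": "verb"}, "CONTENT": {"RELN": "eat"}},
    "PHON": "taberu",
}
negative = {"SYNSEM": {"CONTENT": True}, "PHON": True}
positive = {"SYNSEM": {"CAT": True}}

abstraction_neg = restrict_negative(entry, negative)
abstraction_pos = restrict_positive(entry, positive)
```

In this toy case both restrictors yield the same abstraction. The negative one names what to remove, while the positive one names what to keep.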
    <Paragraph position="1"> Since a restrictor could have reentrance points, one can even define a recursive (or cyclic) restrictor to foresee recursive embeddings as is the case in HPSG.</Paragraph>
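A cyclic restrictor can be sketched as a structure that points back to itself, so that the same deletion applies at every depth of embedding. The dict encoding, the COMP feature used for the embedding and the sample structure below are assumptions made for illustration only.

```python
def apply_restrictor(fs, restrictor):
    """Delete every feature that the (possibly cyclic) negative restrictor marks."""
    if not isinstance(fs, dict):
        return fs  # atomic value: nothing to restrict
    result = {}
    for feature, value in fs.items():
        mark = restrictor.get(feature)
        if mark is True:
            continue  # marked for deletion
        if isinstance(mark, dict):
            result[feature] = apply_restrictor(value, mark)
        else:
            result[feature] = value
    return result


# Hypothetical cyclic restrictor: delete CONTENT here and, via the
# reentrance under COMP, at every deeper level of embedding as well.
cyclic = {"CONTENT": True}
cyclic["COMP"] = cyclic  # reentrance point back to the restrictor's root

# Hypothetical feature structure with two levels of embedding.
nested = {
    "CAT": "v",
    "CONTENT": "c0",
    "COMP": {
        "CAT": "v",
        "CONTENT": "c1",
        "COMP": {"CAT": "n", "CONTENT": "c2"},
    },
}

restricted = apply_restrictor(nested, cyclic)
```

The recursion follows the cycle only as deep as the feature structure itself goes, so it terminates for any finite, acyclic input.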
    <Paragraph position="2"> The rule restrictor looks quite similar, additionally cutting off information contained only in the daughters. Since both restrictors remove the CONTENT feature (and hence the semantics, which is a source of infinite growth), two very productive head-adjunct schemata could be collapsed into a single rule.</Paragraph>
    <Paragraph position="3"> This has helped to keep the number of feature structures in the fixpoint relatively small.</Paragraph>
    <Paragraph position="4"> We reached the fixpoint after 5 iteration steps, obtaining 10,058 feature structures. The computation of the fixpoint took about 27.3 CPU hours on a 400MHz SUN Ultrasparc 2 with Franz Allegro Common Lisp under Solaris 2.5.</Paragraph>
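The general shape of such a fixpoint iteration can be sketched as follows. This is a simplified illustration, not the system described here: abstract structures are plain strings, set membership stands in for whatever equivalence or subsumption test a real system would use, and the toy rules and category names are hypothetical. The way steps are counted is also only one possible convention.

```python
def compute_fixpoint(lexical, unary_rules, binary_rules, restrict):
    """Apply all rules to all known structures until nothing new appears.

    Returns the final set of structures and the number of steps that
    added at least one new structure.
    """
    known = set(lexical)
    steps = 0
    while True:
        new = set()
        for rule in unary_rules:
            for a in known:
                out = rule(a)
                if out is not None:
                    new.add(restrict(out))  # restrict every rule result
        for rule in binary_rules:
            for a in known:
                for b in known:
                    out = rule(a, b)
                    if out is not None:
                        new.add(restrict(out))
        new -= known  # keep only structures not seen before
        if not new:
            return known, steps  # fixpoint reached
        known |= new
        steps += 1


# Hypothetical toy rules over atomic categories (assumptions for illustration).
def unary_np(a):
    return "NP" if a == "N" else None


def binary_vp(a, b):
    return "VP" if (a, b) == ("NP", "V") else None


def binary_s(a, b):
    return "S" if (a, b) == ("NP", "VP") else None


structures, steps = compute_fixpoint(
    {"N", "V"}, [unary_np], [binary_vp, binary_s], restrict=lambda fs: fs
)
```

With real feature structures the restrictor is what keeps this loop finite: without it, features such as CONTENT could grow with every rule application and the set would never stabilize.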
    <Paragraph position="5"> Given the feature structures from the fixpoint, the 43 rules might lead to 28 × 10,058 + 15 × 10,058 × 10,058 = 1,517,732,084 CF productions in the worst case. Our method produces 19,198,592 productions, i.e., 1.26% of all possible ones. We suspect that the enormous set of productions is due to the fact that the grammar was developed for spoken Japanese (recall section 2 on the ambiguity of Japanese). Likewise, the choice of a 'wrong' restrictor often leads to a dramatic increase of structures in the fixpoint, and hence of CF rules. We are not sure at this point whether our restrictor is a good compromise between the specificity of the context-free language and the number of context-free rules.</Paragraph>
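The worst-case bound and the reported share can be checked with a few lines of arithmetic. All numbers are taken from the text; only the variable names are introduced here.

```python
unary_rules = 28
binary_rules = 15
fixpoint_size = 10_058   # feature structures in the fixpoint
produced = 19_198_592    # CF productions actually generated

# Each unary rule may combine with any structure and each binary rule
# with any ordered pair of structures.
worst_case = unary_rules * fixpoint_size + binary_rules * fixpoint_size ** 2
share = produced / worst_case

print(f"worst case: {worst_case:,}")
print(f"share produced: {share:.2%}")
```

The quadratic term from the 15 binary rules dominates the bound, so the number of structures in the fixpoint has a far larger effect on the worst case than the number of rules.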
    <Paragraph position="6"> We are currently implementing a CF parser that can handle such an enormous set of CF rules.</Paragraph>
    <Paragraph position="7"> In (Kiefer and Krieger, 2000b), we report on a similar experiment that we carried out using the English Verbmobil grammar, developed at CSLI, Stanford. In that paper, we showed that the workload on the HPSG side can be drastically reduced by using a CFG filter obtained from the HPSG. Our hope is that these results can be carried over to the Japanese grammar. (Footnote: In addition, the rule restrictor cuts off the DAUGHTERS feature.)</Paragraph>
  </Section>
</Paper>