<?xml version="1.0" standalone="yes"?>
<Paper uid="E93-1023">
  <Title>A Probabilistic Context-free Grammar for Disambiguation in Morphological Parsing</Title>
  <Section position="6" start_page="190" end_page="190" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> As the results show, this fully implemented system, running with a morpheme lexicon of 17,087 entries on a randomly selected test sample of 3,077 words, is successful. This success may to a large extent be attributed to the augmentation of the context-free grammar to a PCFG (see footnote 14).</Paragraph>
    <Paragraph position="1"> As mentioned above, the accuracy of a PCFG depends heavily on the accuracy of the empirical estimate of the probability function. We were lucky to have at our disposal a training set which was both large enough and representative. However, because MORPA and the training set yield different analyses in some cases, and because token frequencies for string-ambiguous words were not disambiguated, we expect our estimate to have become less reliable. In order to improve MORPA's performance on text test samples, we will have to &quot;repair&quot; the token frequencies. It is often argued that a PCFG provides only poor estimates of probability, and that probabilistic grammars require more sensitivity to lexical context. After all, PCFGs provide only very general information on how likely a production rule is to appear anywhere in a sample of the language, and production rules are not always context-free [Magerman and Marcus, 1991; Resnik, 1992]. However, most work on probabilistic context-free grammars has been done for syntax, and as I hope to have shown that a PCFG yields good results for morphology, it might be interesting to find out if, for one reason or another, PCFGs are more successful for morphology than for syntax.</Paragraph>
    <Paragraph position="2"> 12 For reasons I will not go into here, the newspaper and dictionary words did not comprise highly frequent words [Nunn and van Heuven, 1993].</Paragraph>
    <Paragraph position="3"> 13 For a comparison with a data-oriented system for Dutch grapheme-to-phoneme transcription, see [van den Bosch and Daelemans, 1993]. Note that in this comparison syllabification and stress assignment have not been taken into account.</Paragraph>
    <Paragraph position="4"> 14 Before this augmentation, the parser was enriched with some preliminary criteria imposing an order on the set of alternatives. At that stage, performance reached 85%.</Paragraph>
  </Section>
</Paper>