<?xml version="1.0" standalone="yes"?> <Paper uid="W00-1201"> <Title>Two Statistical Parsing Models Applied to the Chinese Treebank</Title> <Section position="5" start_page="2" end_page="2" type="concl"> <SectionTitle> 4 Conclusions and Future Work </SectionTitle> <Paragraph position="0"> There is no question that a great deal of care and expertise went into creating the Chinese Treebank, and that it is a source of important grammatical information that is unique to the Chinese language. However, there are definite similarities between the grammars of English and Chinese, especially when viewed through the lens of the statistical models we employed here. In both languages, the nouns, adjectives, adverbs, and verbs have preferences for certain arguments and adjuncts, and these preferences--in spite of the potentially vastly different configurations of these items--are effectively modeled. As discussed in the introduction, lexical items' idiosyncratic parsing preferences are modeled by lexicalizing the grammar formalism, using a lexicalized PCFG in one case and a lexicalized stochastic TAG in the other. Linguistically reasonable independence assumptions are made, such as the independence of grammar productions in the case of the PCFG model, or the independence of the composition operations in the case of the LTAG model, and we would argue that these assumptions are no less reasonable for the Chinese grammar than they are for that of English. While results for the two languages are far from equal, we believe that further tuning of the head rules and analysis of development test set errors will yield significant performance gains on Chinese to close the gap.</Paragraph> <Paragraph position="1"> Finally, we fully expect that absolute performance will increase greatly as additional high-quality Chinese parse data becomes available.</Paragraph> </Section> </Paper>