<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1013">
<Title>Probabilistic Parsing for German using Sister-Head Dependencies</Title>
<Section position="9" start_page="3" end_page="3" type="concl">
<SectionTitle> 7 Conclusions </SectionTitle>
<Paragraph position="0"> We presented the first probabilistic full parsing model for German trained on Negra, a syntactically annotated corpus. This model uses lexical sister-head dependencies, which makes it particularly suitable for parsing Negra's flat structures. The flatness of the Negra annotation reflects the syntactic properties of German, in particular its semi-free word order. In Experiment 1, we applied three standard parsing models from the literature to Negra: an unlexicalized PCFG model (the baseline), Carroll and Rooth's (1998) head-lexicalized model, and Collins's (1997) model based on head-head dependencies. The results show that the baseline model achieves a performance of up to 73% recall and 70% precision. Both lexicalized models perform substantially worse. This finding is at odds with what has been reported for parsing models trained on the Penn Treebank. As a possible explanation we considered lack of training data: Negra is about half the size of the Penn Treebank. However, the learning curves for the three models failed to produce any evidence that they suffer from sparse data.</Paragraph>
<Paragraph position="1"> In Experiment 2, we therefore investigated an alternative hypothesis: the poor performance of the lexicalized models is due to the fact that the rules in Negra are flatter than those in the Penn Treebank, which makes lexical head-head dependencies less useful for correctly determining constituent boundaries.</Paragraph>
<Paragraph position="2"> Based on this assumption, we proposed an alternative model that replaces lexical head-head dependencies with lexical sister-head dependencies. This can be thought of as a way of binarizing the flat rules in Negra. The results show that sister-head dependencies improve parsing performance not only for NPs (which is well known for English), but also for PPs, VPs, Ss, and coordinate categories. The best performance was obtained for a model that uses sister-head dependencies for all categories. This model achieves up to 74% recall and precision, thus outperforming the unlexicalized baseline model.</Paragraph>
<Paragraph position="3"> It can be hypothesized that this finding carries over to other treebanks that are annotated with flat structures. Such annotation schemes are often used for languages that (unlike English) have a free or semi-free word order. Testing our sister-head model on these languages is a topic for future research.</Paragraph>
</Section>
</Paper>
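
The contrast between head-head and sister-head conditioning described in paragraph 2 can be made concrete with a minimal sketch. This is not the paper's implementation: the probability functions p_head and p_sibling are hypothetical stand-ins for the model's smoothed estimates, and distance, direction, and stop symbols are omitted. It shows only the structural difference in conditioning, and how chaining each sibling to its neighbour implicitly binarizes a flat rule.

# Minimal sketch (hypothetical, simplified) of the two decompositions
# for a flat rule such as S -> PDS VVFIN NP PP.
from typing import Callable, List, Tuple

Child = Tuple[str, str]  # (category, lexical head), e.g. ("NP", "Kinder")
ProbFn = Callable[..., float]

def head_head_prob(parent: str, head_idx: int, children: List[Child],
                   p_head: ProbFn, p_sibling: ProbFn) -> float:
    # Collins-style: every non-head child is conditioned on the
    # category and lexical head of the phrasal head.
    head_cat, head_word = children[head_idx]
    prob = p_head(head_cat, head_word, parent)
    for i, (cat, word) in enumerate(children):
        if i != head_idx:
            prob *= p_sibling(cat, word, parent, head_cat, head_word)
    return prob

def sister_head_prob(parent: str, head_idx: int, children: List[Child],
                     p_head: ProbFn, p_sibling: ProbFn) -> float:
    # Sister-head: each child is conditioned on the adjacent sister
    # just generated, so the flat rule decomposes into a chain of
    # binary dependencies.
    head_cat, head_word = children[head_idx]
    prob = p_head(head_cat, head_word, parent)
    for step in (1, -1):  # expand rightward, then leftward, from the head
        prev_cat, prev_word = children[head_idx]
        i = head_idx + step
        while 0 <= i < len(children):
            cat, word = children[i]
            prob *= p_sibling(cat, word, parent, prev_cat, prev_word)
            prev_cat, prev_word = cat, word
            i += step
    return prob

if __name__ == "__main__":
    # Toy usage: uniform stand-in probabilities instead of trained estimates.
    flat = [("PDS", "das"), ("VVFIN", "sehen"), ("NP", "Kinder"), ("PP", "im")]
    uniform = lambda *args: 0.1
    print(head_head_prob("S", 1, flat, uniform, uniform))
    print(sister_head_prob("S", 1, flat, uniform, uniform))

With uniform stand-in probabilities the two functions return the same value; with trained estimates they differ, because sister-head conditioning lets each dependency reflect the locally adjacent sister rather than a possibly distant phrasal head, which matters for flat rules with many children.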