<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-1033">
<Title>Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network</Title>
<Section position="5" start_page="0" end_page="0" type="concl">
<SectionTitle>4 Conclusion</SectionTitle>
<Paragraph position="0">We have shown how broad feature use, when combined with appropriate model regularization, produces a superior level of tagger performance. While experience suggests that the final accuracy number presented here could be slightly improved upon by classifier combination, it is worth noting that not only is this tagger better than any previous single tagger, but it also appears to outperform Brill and Wu (1998), the best-known combination tagger (they report an accuracy of 97.16% over the same WSJ data, but using a larger training set, which should favor them).</Paragraph>
<Paragraph position="1">[Footnote 10] On a 2GHz PC, this is still an important difference: our largest models require about 25 minutes per iteration to train.</Paragraph>
<Paragraph position="2">[Footnote 11] In practice one notices some wiggling in the curve, but the trend remains upward even beyond our chosen convergence point.</Paragraph>
<Paragraph position="3">While part-of-speech tagging is now a fairly well-worn road, and our ability to win performance increases in this domain is starting to be limited by the rate of errors and inconsistencies in the Penn Treebank training data, this work also has broader implications. Across the many NLP problems which involve sequence models over sparse multinomial distributions, it suggests that feature-rich models with extensive lexicalization, bidirectional inference, and effective regularization will be key elements in producing state-of-the-art results.</Paragraph>
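<Paragraph position="4">As a purely illustrative sketch of these ingredients (not code from the paper), the Python fragment below shows a lexicalized local log-linear model that conditions on both the left and the right tag context and is trained against an objective with a Gaussian-prior (L2) penalty; all function names, feature templates, and parameter values are assumptions made for illustration only.</Paragraph>

import math

def features(words, i, prev_tag, next_tag):
    """Lexicalized feature templates for position i, using tag context
    from both directions (the bidirectional element of the model).
    Illustrative templates, not the paper's feature set."""
    w = words[i]
    return [
        "w=" + w,
        "suffix3=" + w[-3:],
        "prev=" + prev_tag,                       # left tag context
        "next=" + next_tag,                       # right tag context
        "prev+next=" + prev_tag + "_" + next_tag,
    ]

def local_log_prob(weights, tagset, words, i, prev_tag, next_tag, tag):
    """log P(tag | context) under a log-linear (maximum entropy) model."""
    def score(t):
        return sum(weights.get("t=" + t + "|" + f, 0.0)
                   for f in features(words, i, prev_tag, next_tag))
    log_z = math.log(sum(math.exp(score(t)) for t in tagset))
    return score(tag) - log_z

def regularized_loss(weights, data, tagset, sigma2=0.5):
    """Negative conditional log-likelihood plus a quadratic
    (Gaussian-prior) penalty; minimizing this trades data fit against
    weight magnitude, i.e. the regularization the conclusion credits.
    sigma2 is an assumed, illustrative prior variance."""
    loss = 0.0
    for words, tags in data:
        padded = ["BOS"] + list(tags) + ["EOS"]   # boundary tags
        for i in range(len(words)):
            loss -= local_log_prob(weights, tagset, words, i,
                                   padded[i], padded[i + 2], tags[i])
    loss += sum(v * v for v in weights.values()) / (2.0 * sigma2)
    return loss

<Paragraph position="5">In the paper's actual model, such local distributions are tied together in a cyclic dependency network and decoded jointly over the whole sentence; the sketch shows only the local scoring and the regularized objective.</Paragraph>
</Section>
</Paper>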