<?xml version="1.0" standalone="yes"?> <Paper uid="P04-2010"> <Title>A Machine Learning Approach to German Pronoun Resolution</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Automatic coreference resolution, pronominal and otherwise, has been a popular research area in Natural Language Processing for more than two decades, and both the rule-based and the machine learning approaches are extensively documented.</Paragraph> <Paragraph position="1"> For the latter, good results have been achieved with large feature sets (including syntactic, semantic, grammatical and morphological information) derived from hand-annotated corpora. However, for applications that work with plain text (e.g. question answering, text summarisation), this approach is not practical.</Paragraph> <Paragraph position="2"> The system presented in this paper resolves German pronouns in free text by imitating the manual annotation process with off-the-shelf language software. As the availability and reliability of such software are limited, the system can use only a small number of features. The fact that most German pronouns are morphologically ambiguous poses an additional challenge.</Paragraph> <Paragraph position="3"> The choice of boosting as the underlying machine learning algorithm is motivated both by its theoretical foundations and by its performance on other NLP tasks. Because boosting is an ensemble learning method, i.e. it combines the decisions of several classifiers, the combined hypothesis can be expected to be more accurate than one learned by a single classifier. On the practical side, boosting has distinguished itself by achieving good results with small feature sets.</Paragraph> </Section> </Paper>