File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/05/w05-0709_abstr.xml

Size: 1,198 bytes

Last Modified: 2025-10-06 13:44:36

<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0709">
  <Title>The Impact of Morphological Stemming on Arabic Mention Detection and Coreference Resolution</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Arabic presents an interesting challenge to natural language processing, being a highly inflected and agglutinative language. In particular, this paper presents an in-depth investigation of the entity detection and recognition (EDR) task for Arabic. We start by highlighting why segmentation is a necessary prerequisite for EDR, continue by presenting a finite-state statistical segmenter, and then examine how the resulting segments can be better included into a mention detection system and an entity recognition system; both systems are statistical, build around the maximum entropy principle. Experiments on a clearly stated partition of the ACE 2004 data show that stem-based features can significantly improve the performance of the EDT system by 2 absolute F-measure points. The system presented here had a competitive performance in the ACE 2004 evaluation.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML