File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/05/w05-0705_intro.xml

Size: 1,330 bytes

Last Modified: 2025-10-06 14:03:15

<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0705">
  <Title>Modifying a Natural Language Processing System for European Languages to Treat Arabic in Information Processing and Information Retrieval Applications</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> When a natural language processing (NLP) system is created in a modular fashion, it can be relatively easy to extend treatment to new languages (Maynard et al., 2003), depending on the depth and completeness desired. We present here lessons learned from extending our NLP system, originally implemented for Romance and Germanic European languages (footnote 1), to Arabic, a member of the Semitic language family. Though our system was designed modularly, this new language posed new problems. We present our answers to the problems encountered in creating an Arabic processing system, and illustrate its integration into an online cross-language information retrieval (CLIR) system dealing with documents written in Arabic, English, French and Spanish.</Paragraph>
    <Paragraph position="1"> Footnote 1: European languages from non-Indo-European families (Basque, Finnish and Hungarian) pose some of the same problems that Arabic does.</Paragraph>
  </Section>
</Paper>