<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-2023">
  <Title>An Unsupervised System for Identifying English Inclusions in German Text</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Motivation
</SectionTitle>
    <Paragraph position="0"> In natural language, new inclusions typically fall into two major categories: foreign words and proper nouns. They cause substantial problems for NLP applications because they are hard to process and unbounded in number. It is difficult to predict which foreign words will enter a language, let alone to compile an exhaustive gazetteer of them. German readers frequently encounter documents containing English expressions in business, science and technology, advertising and other sectors. A look at current headlines confirms the existence of this phenomenon:</Paragraph>
  </Section>
</Paper>