<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-2007">
  <Title>Language Independent NER using a Unified Model of Internal and Contextual Evidence</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2. Entity-Internal Information
</SectionTitle>
    <Paragraph position="0"> Two types of entity-internal evidence are used in a unified framework. The first consists of the prefixes and suffixes of candidate entities. For example, in Spanish, names ending in -ez (e.g., Alvarez and Gutierrez) are often surnames, and names ending in -ia are often locations (e.g., Austria, Australia, and Italia). Likewise, the common beginnings and endings of multiword entities (e.g., Asociacion de la Prensa de Madrid and Asociacion para el Desarrollo Rural Jerez-Sierra Suroeste, both organizations) are good indicators of entity type.</Paragraph>
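    <Paragraph position="1"> The affix evidence described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name affix_features, the max_len cutoff, and the feature-string format are all hypothetical choices made for illustration. The sketch simply collects character prefixes and suffixes of a candidate entity, plus the first and last words of a multiword candidate.

```python
def affix_features(entity, max_len=3):
    """Collect simple entity-internal affix features for a candidate entity.

    Assumption (not from the source): affixes are plain character slices of
    up to max_len characters, lowercased, and encoded as "name=value" strings.
    """
    lowered = entity.lower()
    tokens = lowered.split()
    features = []

    # Character-level prefixes and suffixes, e.g. "-ez" or "-ia".
    for n in range(1, min(max_len, len(lowered)) + 1):
        features.append("prefix=" + lowered[:n])
        features.append("suffix=" + lowered[-n:])

    # Word-level beginnings and endings, only for multiword candidates.
    if tokens[1:]:
        features.append("first_word=" + tokens[0])
        features.append("last_word=" + tokens[-1])

    return features


if __name__ == "__main__":
    for candidate in ["Alvarez", "Austria", "Asociacion de la Prensa de Madrid"]:
        print(candidate, affix_features(candidate))
```

Under these assumptions, Alvarez yields the feature suffix=ez and Austria yields suffix=ia, while Asociacion de la Prensa de Madrid additionally yields first_word=asociacion and last_word=madrid. How such features are weighted and combined with contextual evidence is left open here.</Paragraph>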
  </Section>
</Paper>