<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1127">
  <Title>Location Normalization for Information Extraction*</Title>
  <Section position="5" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
6 Algorithm and Experiment
</SectionTitle>
    <Paragraph position="0"> With information from local context, discourse context, and knowledge of default senses, the location normalization process proved efficient and precise. The processing flow has five steps: Step 1. Look up the location gazetteer to associate candidate senses with each location NE; Step 2. Call the pattern matching sub-module to resolve the ambiguity of NEs involved in local patterns such as &quot;Williamsville, New York, USA&quot;, retaining only one sense for the NE as early as possible; Step 3. Apply the 'one sense per discourse' principle to each disambiguated location name, propagating the selected sense to its other occurrences within the document; Step 4. Call the global sub-module, a graph search algorithm, to resolve the remaining ambiguities; Step 5. If the decision score for a location name is lower than a threshold, choose the default sense of that name as the result.</Paragraph>
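The five steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the LocNZ implementation: the toy gazetteer, the default-sense table, the `region_of` helper, and the threshold value are all hypothetical, and Step 4 uses a simple count of shared regions as a stand-in for the graph search algorithm, whose details are not given in this section.

```python
# Hypothetical toy gazetteer: location name -> candidate senses.
GAZETTEER = {
    "Williamsville": ["Williamsville, New York, USA", "Williamsville, Illinois, USA"],
    "Buffalo": ["Buffalo, New York, USA", "Buffalo, Wyoming, USA"],
    "Springfield": ["Springfield, Illinois, USA", "Springfield, Missouri, USA"],
}

# Hypothetical default senses (the paper derives these separately).
DEFAULT_SENSE = {
    "Williamsville": "Williamsville, New York, USA",
    "Buffalo": "Buffalo, New York, USA",
    "Springfield": "Springfield, Missouri, USA",
}


def region_of(sense):
    """Return the containing region, e.g. 'Buffalo, New York, USA' -> 'New York, USA'."""
    return sense.split(", ", 1)[1] if ", " in sense else sense


def normalize(mentions, threshold=1):
    """mentions: list of (name, local_context or None) pairs from one document."""
    # Step 1: gazetteer lookup of candidate senses.
    candidates = {name: list(GAZETTEER.get(name, [])) for name, _ in mentions}

    # Step 2: local patterns such as "Williamsville, New York, USA".
    resolved = {}
    for name, context in mentions:
        if context is None:
            continue
        matching = [s for s in candidates[name] if s.endswith(context)]
        if len(matching) == 1:
            resolved[name] = matching[0]

    # Step 3: one sense per discourse. Senses are stored per name, so every
    # other occurrence of a resolved name inherits the same sense.

    # Step 4 (stand-in for the graph search): score each candidate by how many
    # already-resolved senses share its containing region.
    anchors = [region_of(s) for s in resolved.values()]
    for name, senses in candidates.items():
        if name in resolved or not senses:
            continue
        scores = {s: anchors.count(region_of(s)) for s in senses}
        best = max(senses, key=lambda s: scores[s])
        # Step 5: fall back to the default sense when the score is too low.
        if scores[best] >= threshold:
            resolved[name] = best
        else:
            resolved[name] = DEFAULT_SENSE.get(name, senses[0])

    return [resolved.get(name) for name, _ in mentions]


doc = [("Williamsville", "New York, USA"), ("Buffalo", None),
       ("Williamsville", None), ("Springfield", None)]
result = normalize(doc)
```

In this toy document, the first mention is settled by its local pattern, the bare "Williamsville" inherits that sense, "Buffalo" is pulled toward New York by the shared region, and "Springfield" has no supporting evidence and so falls back to its (hypothetical) default sense.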
    <Paragraph position="1"> To evaluate system performance, we used 53 documents from a travel site (http://www.worldtravelguide.net/navigate/region/nam.asp), CNN News, and the New York Times. Table 2 shows sample results from our test collections. For the results in Column 4, we applied the default senses of location names available from the Tipster Gazetteer, following the rules specified in the gazetteer document. If no ranking value is tagged for a location name, we select the first sense in the gazetteer as its default. This baseline achieved an accuracy of 42%. For Column 5, we tagged the corpus with the default senses derived by the method described in Section 5, which resolved 78% of the location name ambiguities. Column 6 in Table 2 gives the result of our LocNZ system, which uses the algorithm described above together with the default senses we derived. The system achieved 93.8% accuracy.</Paragraph>
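The Column 4 baseline and the accuracy figures can be illustrated with a short sketch. It assumes, purely for illustration, that each gazetteer entry is a (sense, ranking value) pair and that a smaller ranking value marks a more preferred sense; the actual Tipster Gazetteer rules are not reproduced here, and the function names are hypothetical.

```python
def baseline_default(senses):
    """senses: list of (sense, rank or None) pairs in gazetteer order.

    Assumption: a smaller rank means a more preferred sense.
    """
    ranked = [(rank, i, sense)
              for i, (sense, rank) in enumerate(senses)
              if rank is not None]
    if ranked:
        # Pick the best-ranked sense; ties go to the earlier gazetteer entry.
        return min(ranked)[2]
    # No ranking value tagged: use the first sense in the gazetteer.
    return senses[0][0]


def accuracy(predicted, gold):
    """Fraction of location names whose predicted sense matches the gold sense."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

Accuracy here is simply the share of location names tagged with the correct sense, which is how the 42%, 78%, and 93.8% figures are read in this sketch.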
  </Section>
</Paper>