File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/03/p03-2031_concl.xml

Size: 1,066 bytes

Last Modified: 2025-10-06 13:53:36

<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-2031">
  <Title>Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web</Title>
  <Section position="5" start_page="0" end_page="0" type="concl">
    <SectionTitle>
4 Conclusions
</SectionTitle>
    <Paragraph position="0"> In this paper, we presented a method that automatically generates an NE-tagged corpus from a very large number of web documents. We use an internet search engine with an NE list to collect web documents that may contain the NE instances. The web documents are segmented into sentences and refined through sentence separation and text refinement procedures.</Paragraph>
    <Paragraph position="1"> The sentences are finally tagged with the NE categories. We experimentally demonstrated that the proposed method can acquire, without any human intervention, an NE-tagged corpus that is sufficiently large and as useful as a manually tagged corpus. In the future, we plan to apply more sophisticated natural language processing techniques to automatically generate a more accurate NE-tagged corpus.</Paragraph>
  </Section>
</Paper>