<?xml version="1.0" standalone="yes"?>
<Paper uid="A97-1030">
  <Title>Categorizing and standardizing proper nouns for efficient information retrieval, In B. Boguraev and</Title>
  <Section position="11" start_page="207" end_page="207" type="concl">
    <SectionTitle>
9 Conclusion
</SectionTitle>
    <Paragraph position="0"> Ambiguity remains one of the main challenges in the processing of natural language text. Efforts to resolve it have traditionally focussed on the development of full-coverage parsers, extensive lexicons, and vast repositories of world knowledge. For some natural-language applications, the tremendous effort involved in developing these tools is still required, but in other applications, such as information extraction, there has been a recent trend towards favoring minimal parsing and shallow knowledge (Cowie and Lehnert 1996). In its minimal use of resources, Nominator follows this trend: it relies on no syntactic information and on a small semantic lexicon - an authority list which could easily be modified to include information about new domains.</Paragraph>
    <Paragraph position="1"> Other advantages of using limited resources are robustness and execution speed, which are important in processing large amounts of text.</Paragraph>
    <Paragraph position="2"> In another sense, however, developing a module like Nominator still requires considerable human effort to discover reliable heuristics, particularly when only minimal information is used. These heuristics are somewhat domain dependent: different generalizations hold for names of drugs and chemicals than for names of people or organizations. In addition, because the heuristics depend on linguistic conventions, they are language dependent and need updating when stylistic conventions change. Note, for example, the recent popularity of software names that include exclamation points as part of the name. Because of these difficulties, we believe that for the foreseeable future, practical applications that discover new names in text will continue to require the sort of human effort invested in Nominator.</Paragraph>
  </Section>
</Paper>