<?xml version="1.0" standalone="yes"?>
<Paper uid="H93-1080">
  <Title>ROBUSTNESS, PORTABILITY, AND SCALABILITY OF NATURAL LANGUAGE SYSTEMS</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
ROBUSTNESS, PORTABILITY, AND SCALABILITY
OF NATURAL LANGUAGE SYSTEMS
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1. OBJECTIVE
</SectionTitle>
    <Paragraph position="0"> In the DoD, every unit, from the smallest to the largest, communicates through messages. Messages are fundamental in command and control, in intelligence analysis, and in planning and replanning. Our objective is to create algorithms that will 1) robustly process open source text, identifying relevant messages and updating a database from those messages; 2) reduce the effort required to port natural language (NL) message processing software to a new domain from months to weeks; and 3) scale to broad domains with vocabularies of tens of thousands of words.</Paragraph>
    <Paragraph position="1"> 2. APPROACH  Our approach is to apply probabilistic language models and training over large corpora in all phases of natural language processing. This new approach will enable systems to adapt both to new task domains and to linguistic expressions not seen before by semi-automatically acquiring 1) a domain model, 2) facts required for semantic processing, 3) grammar rules, 4) information about new words, 5) probability models of frequency of occurrence, and 6) rules for mapping from representation to application structure. For instance, a statistical model of word categories will enable a system to predict the most likely category of a word it has never encountered before and to focus on that word's most likely interpretation in context, rather than skipping the word or considering all possible interpretations. Markov modelling techniques will be used for this problem.</Paragraph>
    <Paragraph position="2"> In an analogous way, statistical models of language will be developed and applied at the level of syntax (form), at the level of semantics (content), and at the contextual level (meaning and impact).</Paragraph>
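    <Paragraph position="3"> As an illustration of the word-category idea in the approach, the following is a minimal sketch of a first-order Markov (bigram) tagger decoded with the Viterbi algorithm. It is not the system described here: the toy corpus, the tag names, the add-one smoothing, and the function names train, smoothed, and tag_sentence are all assumptions made for this example. A word never seen in training receives the same emission score under every category, so the category sequence around it determines its most likely category.

```python
from collections import Counter, defaultdict

# Toy hand-tagged corpus (hypothetical data, for illustration only).
CORPUS = [
    [("the", "DET"), ("unit", "NOUN"), ("sends", "VERB"), ("a", "DET"), ("message", "NOUN")],
    [("the", "DET"), ("commander", "NOUN"), ("reads", "VERB"), ("the", "DET"), ("report", "NOUN")],
    [("a", "DET"), ("unit", "NOUN"), ("moves", "VERB")],
]


def train(corpus):
    """Count category-to-category transitions and category-to-word emissions."""
    trans = defaultdict(Counter)
    emit = defaultdict(Counter)
    for sentence in corpus:
        prev = "START"
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    return trans, emit


def smoothed(counter, key, n_outcomes):
    """Add-one smoothed relative frequency (an assumed smoothing choice)."""
    return (counter[key] + 1) / (sum(counter.values()) + n_outcomes)


def tag_sentence(words, trans, emit):
    """Return the most likely category sequence for words (Viterbi decoding)."""
    tags = sorted(emit)
    vocab = {w for counts in emit.values() for w in counts}

    def emission(tag, word):
        if word not in vocab:
            # Unseen word: identical score for every category, so the
            # transition probabilities (the context) decide its category.
            return 1.0
        return smoothed(emit[tag], word, len(vocab))

    # best[tag] = (probability of the best path ending in tag, that path)
    best = {
        t: (smoothed(trans["START"], t, len(tags)) * emission(t, words[0]), [t])
        for t in tags
    }
    for word in words[1:]:
        step = {}
        for t in tags:
            p, path = max(
                ((best[s][0] * smoothed(trans[s], t, len(tags)), best[s][1]) for s in tags),
                key=lambda item: item[0],
            )
            step[t] = (p * emission(t, word), path + [t])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


trans, emit = train(CORPUS)
# "battalion" does not occur in the toy corpus.
print(tag_sentence(["the", "battalion", "sends", "a", "message"], trans, emit))
```

On this toy corpus the unseen word "battalion" is assigned the category NOUN, because a noun is the most probable category between a determiner and a verb. A fuller model would also estimate how likely each category is to produce unseen words instead of treating all categories alike.</Paragraph>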
  </Section>
</Paper>