File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/02/w02-1031_abstr.xml

Size: 1,381 bytes

Last Modified: 2025-10-06 13:42:36

<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-1031">
  <Title>The SuperARV Language Model: Investigating the E ectiveness of Tightly Integrating Multiple Knowledge Sources</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> A new almost-parsing language model incorporating multiple knowledge sources that is based upon the concept of Constraint Dependency Grammars is presented in this paper. Lexical features and syntactic constraints are tightly integrated into a uniform linguistic structure called a SuperARV that is associated with a word in the lexicon. The SuperARV language model reduces perplexity and word error rate compared to trigram, part-of-speech-based, and parser-based language models. The relative contributions of the various knowledge sources to the strength of our model are also investigated by using constraint relaxation at the level of the knowledge sources. We have found that although each knowledge source contributes to language model quality, lexical features are an outstanding contributor when they are tightly integrated with word identity and syntactic constraints. Our investigation also suggests possible reasons for the reported poor performance of several probabilistic dependency grammar models in the literature.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML