<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1035">
  <Title>Exploiting Syntactic Structure for Language Modeling</Title>
  <Section position="7" start_page="229" end_page="229" type="concl">
    <SectionTitle>
6 Conclusions and Future Directions
</SectionTitle>
    <Paragraph position="0"> The large difference between the perplexity of our model calculated on the &amp;quot;development&amp;quot; set -- used  for model parameter estimation -- and &amp;quot;test&amp;quot; set -unseen data -- shows that the initial point we choose for the parameter values has already captured a lot of information from the training data. The same problem is encountered in standard n-gram language modeling; however, our approach has more flexibility in dealing with it due to the possibility of reestimating the model parameters.</Paragraph>
    <Paragraph position="1"> We believe that the above experiments show the potential of our approach for improved language models. Our future plans include: * experiment with other parameterizations than the two most recent exposed heads in the word predictor model and parser; * estimate a separate word predictor for left-to-right language modeling. Note that the corresponding model predictor was obtained via re-estimation aimed at increasing the probability of the &amp;quot;N-best&amp;quot; parses of the entire sentence; * reduce vocabulary of parser operations; extreme case: no non-terminal labels/POS tags, word only model; this will increase the speed of the parser thus rendering it usable on larger amounts of training data and allowing the use of deeper stacks -resulting in more &amp;quot;N-best&amp;quot; derivations per sentence during re-estimation; * relax -- flatten -- the initial statistics in the re-estimation of model parameters; this would allow the model parameters to converge to a different point that might yield a lower word-level perplexity; * evaluate model performance on n-best sentences output by an automatic speech recognizer.</Paragraph>
  </Section>
class="xml-element"></Paper>