<?xml version="1.0" standalone="yes"?>
<Paper uid="P99-1080">
  <Title>A Pylonic Decision-Tree Language Model with Optimal Question Selection</Title>
  <Section position="6" start_page="608" end_page="608" type="concl">
    <SectionTitle>
3 Evaluation of the Model
</SectionTitle>
    <Paragraph position="0"> The decision tree is being trained and tested on the Wall Street Journal corpus from 1987 to 1989, containing 45 million words. The data is divided into 15 million words for growing the nodes, 15 million for cross-validation, 10 million for estimating probabilities, and 5 million for testing. To compare the results with other similar attempts (Bahl et al., 1989), the vocabulary consists of only the 5000 most frequent words and a special &quot;unknown&quot; word that replaces all the others. The model tries to predict the word following a 20-word history.</Paragraph>
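The vocabulary setup described above (keep the 5000 most frequent words, map everything else to a single "unknown" token) can be sketched as follows; the function name build_vocab and the token string "&lt;unk&gt;" are illustrative choices, not identifiers from the paper:

```python
from collections import Counter

def build_vocab(words, size=5000, unk="<unk>"):
    # Keep the `size` most frequent word types; every other word
    # is replaced by the special "unknown" token, as in the paper's
    # 5000-word vocabulary setup.
    vocab = {w for w, _ in Counter(words).most_common(size)}
    return [w if w in vocab else unk for w in words]
```

On the real corpus this mapping would be applied to all 45 million words before the 15/15/10/5 million split into growing, cross-validation, probability-estimation, and test data.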
    <Paragraph position="1"> At the time this paper was written, the implementation of the presented algorithms was nearly complete and preliminary results on the performance of the decision tree were expected soon. The evaluation criterion to be used is the perplexity of the test data with respect to the tree. A comparison with the perplexity of a standard back-off trigram model will indicate which model performs better. Although decision-tree letter language models are inferior to their N-gram counterparts (Potamianos and Jelinek, 1998), the situation should be reversed for word language models. In the case of words</Paragraph>
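The evaluation criterion named above, perplexity of the test data under a model, is the exponential of the average negative log-probability the model assigns to each test word given its history. A minimal sketch (the function name perplexity is illustrative; the paper does not give an implementation):

```python
import math

def perplexity(probs):
    # probs: the model probabilities p(w_i | history_i) assigned to
    # each word of the test data. Perplexity is the exponential of
    # the average negative log-probability; lower is better, so the
    # decision tree beats the back-off trigram if its perplexity on
    # the same test set is smaller.
    n = len(probs)
    return math.exp(-sum(math.log(p) for p in probs) / n)
```

For example, a model that assigned every test word probability 1/4 would have perplexity 4, as if it were choosing uniformly among four words at each step.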
  </Section>
</Paper>