<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1202">
  <Title>Natural Language Learning by Recurrent Neural Networks: A Comparison with probabilistic approaches</Title>
  <Section position="4" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2. Methods
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1. The data
</SectionTitle>
      <Paragraph position="0"> The natural language corpus used in these experiments was obtained from a first-year primary school reader published circa 1950's (Hume). This text was chosen because of its limited vocabulary and sentence structure.</Paragraph>
      <Paragraph position="1"> For this initial study, sentences with embedded structures (relative clauses) and a length of more than eight words were eliminated. The resulting corpus consisted of 106 sentences ranging from three to eight words in length, average length 5.06 words. The words were converted to 10 lexieal categories, including a sentence boundary marker. The categories and their abbreviations as used in the subsequent text and figures are listed in Table 1.</Paragraph>
      <Paragraph position="2"> The resulting data consisted of a string of 643 categories in 106 sentences. There were 62 distinct sentence sequences of which 43 occurred only once, the rest being replicated. The maximum replication of any sequence was eight-fold. Category frequencies are given in Table I.</Paragraph>
      <Paragraph position="3"> For some experiments, the total data were used, for training and in other experiments the data were divided into training and test sets. The test set consisted of every fourth sentence taken from the total data yielding a string of 158 categories in 26 sentences. The training set consisted of the remaining data, a string of 486 categories in 80 sentences. Due to replication of some sentences, the test set contained sentence sequences that also occurred in the training set.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2. The networks
</SectionTitle>
      <Paragraph position="0"> The two networks used in this study were an Elman network and an RCC network. For both nets there were ten input and ten output units representing the sparsely coded lexical categories. The task in all cases was to predict the next lexical category given the current category. State unit activations were not reset to zero on presentation of a sentence boundary as is sometimes done. The Elman network was trained by standard backpropagation using momentum = 0.9. Step-size and number of training epochs varied depending on the requirement for slow or fast training. The slow training regime used stepsize = 0.0001 for 100,000 epochs and a typical fast training regime was 200 epochs at 0.01 followed by another 200 at 0.001. One training epoch consisted of one complete presentation of the training data. The RCC network was trained by Quickprop.</Paragraph>
      <Paragraph position="1"> Error of the outputs was measured as the root-meansquare (rms) of the difference between the output and some target or reference value averaged over the outputs. The entropy of the outputs was calculated as</Paragraph>
      <Paragraph position="3"> In order to assess the learning of the neural networks, prediction performance was compared with that of n-grams obtained by a statistical analysis of the data.</Paragraph>
      <Paragraph position="4"> Using the complete data sequence of 643 word categories, 48%, 62%, 72% and 76% correct predictions Towsey, Diederich, Schellhammer, Chalup, Brugman 4 Natural Language Learning by Recurrent Neural Nets</Paragraph>
      <Paragraph position="6"> could be obtained using bigram, trigram, 4-gram and 5-gram probabilities respectively.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>