<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1027"> <Title>PART-OF-SPEECH TAGGING WITH NEURAL NETWORKS</Title> <Section position="6" start_page="772" end_page="775" type="evalu"> <SectionTitle> 6 RESULTS </SectionTitle> <Paragraph position="0"> The 2-layer version of the Net-Tagger was trained on a 2 million word subpart of the Penn-Treebank corpus.</Paragraph> <Paragraph position="1"> Its performance was tested on a 100,000 word subpart which was not part of the training corpus. The settings of the network parameters were as follows: the number of preceding words in the context p was 3, the number of following words f was 2, and the number of training cycles was 4 million. The training of the tagger took one day on a Sparc10 workstation, and the tagging of 100,000 words took 12 minutes on the same machine.</Paragraph> <Paragraph position="2"> In table 2, the accuracy rate of the Net-Tagger is compared to that of a trigram-based tagger (Kempe, 1993) and a Hidden Markov Model tagger (Cutting et al., 1992), which were trained and tested on the same data. In order to determine the influence of the size of the training sample, the taggers were also trained on corpora of different sizes and tested again 7. The resulting percentages of correctly tagged words are shown in figure 4.</Paragraph> <Paragraph position="3"> These experiments demonstrate that the performance of the Net-Tagger is comparable to that of the trigram tagger and better than that of the HMM tagger. They further show that the performance of the Net-Tagger is less affected by a small amount of training data than that of the trigram tagger. This may be due to a much smaller number of parameters in the Net-Tagger: while the trigram tagger must accurately 7 For this test, a slightly simpler network structure with two preceding and one following word in the input context was used. 
It was further tested whether an additional hidden layer in the network with 50 units would improve the accuracy of the tagging. It turned out that the accuracy actually deteriorated slightly, although the number of training cycles had been increased to 50 million 8.</Paragraph> <Paragraph position="4"> Also, the influence of the size of the input context was determined. Shrinking the context from three preceding and two following words to two preceding and one following word reduced the accuracy by only 0.1%. Enlarging the context gave no improvement.</Paragraph> <Paragraph position="5"> A context of three preceding and two following words seems to be optimal.</Paragraph> <Paragraph position="6"> As mentioned previously, the tagger can produce an alternative tag if the decision between two tags is difficult. In that way, the accuracy can be raised to 97.79% at the expense of 4.6% ambiguously tagged words.</Paragraph> <Paragraph position="7"> An analysis of the errors of the Net-Tagger and the trigram tagger shows that both have problems with the same words, although the individual errors are often different 9.</Paragraph> </Section> </Paper>