File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/00/c00-1074_concl.xml

Size: 2,125 bytes

Last Modified: 2025-10-06 13:52:44

<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-1074">
  <Title>Hybrid Neuro and Rule-Based Part of Speech Taggers</Title>
  <Section position="8" start_page="514" end_page="514" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> To construct a practical tagger that needs as little training data as possible, neuro taggers, which have high generalizing ability and are therefore good at dealing with the problems of data sparseness, have been proposed so far. Neuro taggers, however, have crucial shortcomings: they cannot utilize lexical information; they have trouble learning rules with single inputs; and they cannot learn training data to an accuracy of 100%. To make up for these shortcomings, we introduced a rule-based corrector for post-processing, which consists of a set of transformation rules obtained by error-driven learning, and constructed a hybrid tagging system. By examining the transformation rules acquired in the computer experiments, we found that 99.9% of them were rules that the neuro tagger can hardly acquire, even when using a</Paragraph>
    <Paragraph position="1"> template set including those for generating the rules that the neuro tagger can easily acquire.</Paragraph>
    <Paragraph position="2"> This reinforced our expectation that the rule-based approach is well suited to coping with the shortcomings of the neuro tagger. Computer experiments showed that 19.7% of the errors made by the neuro tagger were corrected by the transformation rules, so the hybrid system reached an accuracy of 95.5% counting only the ambiguous words and 99.1% counting all the words in the testing data, when a small corpus with only 22,311 ambiguous words was used for training. This indicates that our tagging system can nearly reach a practical level in terms of tagging accuracy even when a small Thai corpus is used for training. This kind of tagging system can be used to construct multilingual corpora that include languages for which large corpora have not yet been constructed.</Paragraph>
  </Section>
</Paper>