<?xml version="1.0" standalone="yes"?> <Paper uid="C96-2130"> <Title>Learning Part-of-Speech Guessing Rules from Lexicon: Extension to Non-Concatenative Operations*</Title> <Section position="7" start_page="774" end_page="774" type="concl"> <SectionTitle> 5 Discussion and Conclusion </SectionTitle> <Paragraph position="0"> The target of the research reported in this paper was to incorporate the learning of morphological word-POS guessing rules which do not obey simple concatenations of main words with affixes into the learning paradigm proposed in (Mikheev, 1996). To do that we extended the data structures and the algorithms for guessing-rule application to handle mutations in the last n letters of the main words. Thus simple concatenative rules naturally became a subset of the mutative rules: they can be seen as mutative rules with the zero mutation, i.e. when the M element of the rule is empty. Simple concatenative rules, however, are not necessarily regular morphological rules, and quite often they capture other non-linear morphological dependencies. For instance, consonant doubling is naturally captured by the affixes themselves and obeys simple concatenation, as described, for example, by the suffix rule A^s: [S = ging I = (NN VB) R = (JJ NN VBG) M = &quot;&quot;]. This rule will work, for example, for word pairs like tag - tagging or dig - digging. Note that here we do not specify the prerequisites that the stem word have one syllable and end with the same consonant as the one at the beginning of the affix. Our task here is not to provide a precise morphological description of English but rather to support computationally effective POS-guessing by employing some morphological information. So, instead of using a proper morphological processor, we adopted an engineering approach which is argued for in (Mikheev &amp; Liubushkina, 1995). 
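The rule format discussed above can be made concrete with a minimal Python sketch. It assumes that a suffix rule fires when stripping S from the unknown word and appending M yields a lexicon entry whose POS-class equals I, in which case the word receives the class R. The names SuffixRule, guess, LEXICON, GING_RULE and IED_RULE are hypothetical, the toy lexicon entries are invented for illustration, and the "ied" rule is an assumed example of a non-empty mutation rather than a rule quoted from the text.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional

PosClass = FrozenSet[str]


@dataclass(frozen=True)
class SuffixRule:
    """Hypothetical container for a guessing rule [S, I, R, M]."""
    suffix: str             # S: ending stripped from the unknown word
    stem_class: PosClass    # I: POS-class the stem must have in the lexicon
    result_class: PosClass  # R: POS-class assigned to the unknown word
    mutation: str = ""      # M: letters restored on the stem; "" is the zero mutation


def guess(word: str, rule: SuffixRule, lexicon: Dict[str, PosClass]) -> Optional[PosClass]:
    """Return the R class if the rule fires for `word`, otherwise None."""
    if not word.endswith(rule.suffix) or len(word) == len(rule.suffix):
        return None
    # Strip S, then append M to rebuild the candidate stem.
    stem = word[: len(word) - len(rule.suffix)] + rule.mutation
    if lexicon.get(stem) == rule.stem_class:
        return rule.result_class
    return None


# Toy lexicon (assumed entries, for illustration only).
LEXICON = {
    "tag": frozenset({"NN", "VB"}),
    "dig": frozenset({"NN", "VB"}),
    "try": frozenset({"NN", "VB"}),
}

# The rule from the text: S = ging, I = (NN VB), R = (JJ NN VBG), M empty.
GING_RULE = SuffixRule("ging", frozenset({"NN", "VB"}), frozenset({"JJ", "NN", "VBG"}))

# Hypothetical mutative rule (not taken from the text): strip "ied", restore "y".
IED_RULE = SuffixRule("ied", frozenset({"NN", "VB"}), frozenset({"VBD", "VBN"}), mutation="y")

print(sorted(guess("tagging", GING_RULE, LEXICON)))  # ['JJ', 'NN', 'VBG']
print(sorted(guess("tried", IED_RULE, LEXICON)))     # ['VBD', 'VBN']
```

In this sketch the concatenative rule is simply the mutative rule with the default empty mutation, which mirrors the subset relation stated above. The sketch also shows why "ging" works as a segment without any explicit consonant-doubling condition: the doubled consonant is part of S itself.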
There is, of course, nothing wrong with morphological processors per se, but it is hardly feasible to re-train them fully automatically for new tag-sets or to induce new rules. Our shallow technique, on the contrary, allows such rules to be induced completely automatically and ensures that these rules will have enough discriminative features for robust guessing. In fact, we abandoned the notion of morpheme and are dealing with word segments regardless of whether they are &quot;proper&quot; morphemes or not. So, for example, in the rule above &quot;ging&quot; is considered a suffix, which in principle is not right: the suffix is &quot;ing&quot; and &quot;g&quot; is the doubled consonant. Clearly, such nuances are impossible to learn automatically without specially prepared training data, which the technique in use rules out. On the other hand, it is not clear that this fine-grained information would contribute to the task of morphological guessing. The simplicity of the proposed shallow morphology, however, ensures fully automatic acquisition of such rules, and the empirical evaluation presented in section 2.3 confirmed that they are just right for the task: precision and recall of such rules were measured in the range of 96-99%.</Paragraph> <Paragraph position="1"> The other aim of the research reported here was to assess whether non-concatenative morphological rules would improve the overall performance of the cascading guesser. As measured in (Mikheev, 1996), simple concatenative prefix and suffix morphological rules improved the overall precision of the cascading guesser by about 5%, which resulted in 2% higher tagging accuracy on unknown words. The additional rule-set of suffix rules with one-letter mutation caused some further improvement: the precision of the guessing increased by about 1%, and the tagging accuracy on a very large set of unknown words increased by about 1%. 
In conclusion we can say that although the ending-guessing rules, which are much simpler than morphological rules, can handle words with affixes longer than two characters almost equally well, in the framework of POS-tagging even a fraction of a percent is an important improvement. Therefore the contribution of the morphological rules is valuable and necessary for the robust POS-tagging of real-world texts.</Paragraph> </Section> </Paper>