<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1069">
  <Title>MACHINE LEARNING OF MORPHOLOGICAL RULES BY GENERALIZATION AND ANALOGY</Title>
  <Section position="1" start_page="0" end_page="292" type="metho">
    <SectionTitle>
MACHINE LEARNING OF MORPHOLOGICAL RULES
BY GENERALIZATION AND ANALOGY
Klaus Wothke
Arbeitsstelle Linguistische Datenverarbeitung
INSTITUT FÜR DEUTSCHE SPRACHE
</SectionTitle>
    <Paragraph position="0"> Mannheim, West Germany. ABSTRACT: This paper describes an experimental procedure for the inductive automated learning of morphological rules from examples. At first an outline of the problem is given. Then a formalism for the representation of morphological rules is defined. This formalism is used by the automated procedure, whose anatomy is subsequently presented. Finally the performance of the system is evaluated and the most important</Paragraph>
    <Paragraph position="1"> unsolved problems are discussed.</Paragraph>
    <Paragraph position="2"> 1. Outline of the Problem. Learning algorithms for the domain of natural languages were in the past mainly developed to model the acquisition of syntax and to generate syntactic descriptions from examples (cf. Pinker 1979; Cohen/Feigenbaum 1982: 494-511). There exist also some systems which learn rules for the automatic phonetic transcription of orthographic text (cf. Oakey/Cawthorn 1981, Wolf 1977). Like the system presented in this paper, all these systems are still experimental systems. The inductive automatic learning of morphological rules has till now been investigated only to a small degree. Research on this problem was carried out by Ring (1978), Jansen-Winkeln (1985) and Wothke (1985).</Paragraph>
    <Paragraph position="3"> The task of the system described here is to learn rules for inflectional and derivational morphology. The system is not designed as a standard program, but as an experimental system. It is used for the experimental development and the testing of fundamental algorithmic learning strategies. Later these strategies could perhaps become necessary components of a standard learning program devised for the interactive development of linguistic algorithms for the domain of morphology.</Paragraph>
    <Paragraph position="4"> Input to the system is a set of examples called a learning corpus. Each example is an ordered pair of words. We call the first word of each pair the source. The second word is called the target. Between the source and the target of each given pair there must exist an inflectional or a derivational morphological relation.</Paragraph>
    <Paragraph position="5"> By applying the processes of generalization and detection of analogies, the system has to construct a set of instructions which describe on a purely graphemic basis how the target of each pair is generated from the source.</Paragraph>
    <Paragraph position="6"> (Semantic features of morphemes are at present ignored by the system.) Such a set of instructions should not only generate correct targets for the sources given in the learning corpus: The instructions should also generate correct targets for the majority of the sources not in the corpus which participate in the same inflectional or derivational relationship as the source-target-pairs in the learning corpus. Suppose for example that the following learning corpus is fed into the system. In this case the learning algorithm has to construct a set of instructions which generates for each singular noun (= source, in the left column) of this corpus a string which is identical with the corresponding plural form (= target, in the right column).</Paragraph>
    <Paragraph position="7"> Furthermore, the instructions should also generate the correct plural form for the majority of English singular nouns which are not members of the learning corpus. For instance, the instructions should also generate "flies" from "fly", "tables" from "table", "foxes" from "fox", "lays" from "lay", "classes" from "class", and "thieves" from "thief". Of course there will also be singular nouns for which the instructions will not be adequate. These will include all nouns whose pattern of pluralization is not represented by examples in the learning corpus. With the given learning corpus one could not expect the inferred instructions to be adequate e.g. for the pluralizations "ox" -> "oxen", "tooth" -> "teeth", "index" -> "indices", "foot" -> "feet", and "addendum" -> "addenda". As this example illustrates, the linguistic adequacy of the instructions does not only depend on the quality of the automated learning strategies but also on the representativity of a given learning corpus for a morphological pattern.</Paragraph>
    <Paragraph position="8"> 2. Formalism for the Representation of Morphological Rules. There are two main types of instruction the learning algorithm uses for the formulation of morphological rules: Prefixal substitution instructions change the beginning of a source in order to generate the corresponding target. They have the general form X -> Y / # (Z(1) | ... | Z(i) | ... | Z(n))</Paragraph>
    <Paragraph position="10"> Such an instruction means: If a source begins with the string X and if immediately on the right of X follows the string Z(1) or ... or Z(i) or ... or Z(n), then substitute X by Y. ("#" signifies the word boundary and marks the position where X must occur in order to be substitutable by Y, namely at the beginning of a source (right of "#") and immediately before Z(1) or ... or Z(i) or ... or Z(n).)</Paragraph>
    <Paragraph position="11"> Suffixal substitution instructions change the end of a source in order to generate the corresponding target. They have the form X -> Y / (Z(1) | ... | Z(i) | ... | Z(n)) #</Paragraph>
    <Paragraph position="13"> The meaning of such an instruction is: If a source ends with the string X and if immediately on the left of X is the string Z(1) or ... or Z(i) or ... or Z(n), then substitute X by Y.</Paragraph>
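For concreteness, the two instruction types can be rendered as a small data structure with an applicability test. This is a sketch in Python, not PRISM's PL/I code; the class name `Instruction` and its fields are assumptions made for illustration.

```python
# Sketch of the paper's two instruction types (prefixal and suffixal
# substitution). Names and representation are illustrative assumptions,
# not PRISM's actual PL/I data structures.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Instruction:
    kind: str                   # "prefixal" or "suffixal"
    x: str                      # string X to be replaced
    y: str                      # replacement string Y
    contexts: Tuple[str, ...]   # Z(1) ... Z(n); empty tuple = no context required

    def applicable(self, source: str) -> bool:
        if self.kind == "prefixal":
            # X must occur at the beginning (right of "#"), followed by some Z(i)
            if not source.startswith(self.x):
                return False
            rest = source[len(self.x):]
            return not self.contexts or any(rest.startswith(z) for z in self.contexts)
        else:
            # X must occur at the end, immediately preceded by some Z(i)
            if not source.endswith(self.x):
                return False
            rest = source[:len(source) - len(self.x)] if self.x else source
            return not self.contexts or any(rest.endswith(z) for z in self.contexts)
```

For example, the suffixal instruction "y" -> "ies" / ("l" | "r") # is applicable to "fly" (the "y" is preceded by "l") but not to "boy".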
    <Paragraph position="14"> Each set of instructions constructed by the learning algorithm is ordered, i.e. the later application of the instructions to a given source must be tried in a fixed sequence in order to generate a target: The first applicable prefixal instruction in the sequence of prefixal substitution instructions must be determined, and the first applicable suffixal instruction in the sequence of suffixal substitution instructions must be determined. Then both must be applied to the source concurrently, thus generating the target.</Paragraph>
    <Paragraph position="15"> The order and application of sets of instructions may be illustrated by a small example: Suppose the learning algorithm has constructed the following set of instructions for the negation of English adjectives (the set is linguistically not fully adequate; "" is the null string, i.e. the string with the length 0):</Paragraph>
    <Paragraph position="17"> Then the negation of "perfect" is formed by first determining the first applicable prefixal substitution instruction: (1) is not applicable, since "perfect" does not begin with "l".</Paragraph>
    <Paragraph position="18"> (2) is not applicable, since "perfect" does not begin with "r".</Paragraph>
    <Paragraph position="19"> (3) is applicable, since "perfect" begins with "p". The first applicable suffixal substitution instruction is the only suffixal instruction at hand, namely (5): "perfect" ends with "". By the concurrent application of (3) and (5) to "perfect" the target "imperfect" is generated, which is the negation of "perfect".</Paragraph>
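The ordered, concurrent application regime just illustrated can be sketched as follows. The full negation rule set is not preserved in this text, so the rules below are a hedged reconstruction: only (3) (insert "im" before "p") and the identity suffix rule (5) are directly attested by the worked example, and the function names are invented here.

```python
# Sketch of ordered, concurrent application: find the first applicable
# prefixal and the first applicable suffixal instruction, then apply both
# at once. The negation rule set is a hedged partial reconstruction.

def first_applicable(rules, source, prefixal):
    """Return the (X, Y) of the first rule in the ordered list matching source."""
    for x, y, contexts in rules:
        if prefixal:
            rest = source[len(x):]
            ok = source.startswith(x) and (not contexts or any(rest.startswith(z) for z in contexts))
        else:
            rest = source[:len(source) - len(x)] if x else source
            ok = source.endswith(x) and (not contexts or any(rest.endswith(z) for z in contexts))
        if ok:
            return (x, y)
    return None

def derive(source, prefix_rules, suffix_rules):
    px, py = first_applicable(prefix_rules, source, prefixal=True)
    sx, sy = first_applicable(suffix_rules, source, prefixal=False)
    # concurrent application: substitute at the front and at the back together
    core = source[len(px):len(source) - len(sx)] if sx else source[len(px):]
    return py + core + sy

# assumed reconstruction of the negation rule set sketched in the text
prefix_rules = [("", "il", ("l",)), ("", "ir", ("r",)), ("", "im", ("p",)), ("", "in", ())]
suffix_rules = [("", "", ())]   # (5): identity suffix instruction

print(derive("perfect", prefix_rules, suffix_rules))  # -> imperfect
```

With the same (assumed) ordering, `derive("legal", ...)` yields "illegal" via rule (1) and `derive("rational", ...)` yields "irrational" via rule (2), showing why the fixed sequence matters: the catch-all "in" rule is only reached when the more specific context rules fail.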
    <Paragraph position="20"> 3. Anatomy of the System for the Automated Learning of Morphological Rules. The system is written in the programming language PL/I. It has the name PRISM, which is an acronym for "PRogram for the Inference and Simulation of Morphological rules".</Paragraph>
    <Paragraph position="21"> PRISM has the macro structure shown in Figure 3. The main procedure MONITOR at first activates GETOPTN, which reads the user's options for the control of PRISM and checks them for syntactic well-formedness and for plausibility. Then MONITOR activates the component indicated by the user's control options. There are three alternative components: - A learning component which infers sets of instructions from a learning corpus given by the user of PRISM. This component comprises the procedures CHKCRPS, DISCOV, STMTOUT, TODSET, and others. The learning process is performed by DISCOV. The other procedures perform peripheral functions.</Paragraph>
    <Paragraph position="22"> - A component for the application of instructions which were inferred by the learning component. This component comprises the procedures FRODSET, APPLY, DERIVE, and others.</Paragraph>
    <Paragraph position="23"> - A third, marginal component which prepares instructions for their printout.</Paragraph>
    <Paragraph position="24"> It consists of FRODSET, SIMTOUT, and other procedures.</Paragraph>
    <Paragraph position="25"> The activation of the learning algorithm starts with a call of CHKCRPS by MONITOR. CHKCRPS checks a given learning corpus for formal errors. The procedure activated next is DISCOV, which performs the learning processes. DISCOV first determines the different types of substitution patterns in the given learning corpus.</Paragraph>
    <Paragraph position="26"> [Flow chart not reproducible here: it relates the learning procedures (CHKCRPS, DISCOV, STMTOUT, TODSET), the application procedures (FRODSET, APPLY, DERIVE), and the printout procedures to the learning corpus, the knowledge base, and the source and target data sets.] Figure 3: Macro structure of PRISM. (For reasons of lucidity some macro features of PRISM have been ignored in this chart.)</Paragraph>
    <Paragraph position="27"> Types of substitution patterns are the different (X, Y)-pairs which are implicitly present in the learning corpus. (For the status of X and Y compare the definition of the formalism for the representation of morphological rules.) The second step of DISCOV computes the frequency of each substitution pattern in the corpus. DISCOV's learning strategy presupposes that the substitution patterns occurring more frequently in a language also occur more frequently in the learning corpus. Therefore DISCOV creates more general instructions for the more frequent patterns of a learning corpus and more specific instructions for the less frequent patterns, i.e. the contextual strings Z(i) of an instruction X -> Y / # (Z(1) | ... | Z(i) | ... | Z(n)) or X -> Y / (Z(1) | ... | Z(i) | ... | Z(n)) # are the more general the more frequently the substitution pattern (X, Y) occurs. They are the more specific the more rarely the substitution pattern occurs. Provided that a learning corpus is representative of the morphological substitution patterns of a language and of the contextual strings Z(i), this general strategy for the determination of the Z(i)'s increases the probability that the inferred instructions generate correct targets for such sources as are not elements of the given learning corpus. DISCOV arranges the substitution instructions in such a way that the more specific instructions precede the more general ones. This order of the instructions guarantees during their later application that potentially each instruction can be applied. STMTOUT transforms substitution instructions inferred by DISCOV from their internal representation, which allows their easy and fast automatic treatment, into an external representation and prints them out. For this external representation the notation is used which was introduced above in the definitions of the two types of substitution instructions. Finally TODSET stores the instructions in an external knowledge base, from which they can later be read by the other two components of PRISM. (In the knowledge base the instructions are stored in their internal representation.)</Paragraph>
    <Paragraph position="32"> The application component starts with FRODSET, which loads a set of instructions to be applied from the knowledge base into the central memory. Then the two procedures APPLY and DERIVE apply the instructions to words given by the user and thereby generate targets which are written to an output data set. The kind of morphological relation between the generated targets and the given words depends on the specific set of instructions which is applied.</Paragraph>
    <Paragraph position="33"> 4. Evaluation of the System. The performance of PRISM was evaluated under the following conditions.</Paragraph>
    <Paragraph position="34"> 1. A set of instructions should always generate correct targets if it is applied to the sources of the learning corpus from which it was inferred.</Paragraph>
    <Paragraph position="35"> 2. The larger the learning corpus is for a given morphological relation, the higher should be on average the percentage of correctly generated targets for such sources as are not elements of the learning corpus (but nevertheless
    <Paragraph position="36"> 3. A set of instructions inferred From a linguistically representative learning corpus should generate correct targets for at \].east 90% of the sources which are not elements off the learning corpus (but which nevertheless participate in the morphological relationship under discussion). null 4. If a linguistically representative learning corpus is given, the learning algorithm should classify as regular those morphological patterns which linguists also usually classify as regular.</Paragraph>
    <Paragraph position="37"> Condition 1 is fulfilled. This could be proved deductively with reference to the structure of the learning algorithm. (The proof is given in Wothke 1985, 144-154.) The fulfilment of conditions 2-4 could only be tested inductively by applying PRISM's learning algorithm to different learning corpora and evaluating the results.</Paragraph>
    <Paragraph position="38"> Condition 2 was tested by applying the learning component to learning corpora of different sizes compiled for two morphological relations: derivation of nomina actionis from verbs in German (e.g. "betreuen" -> "Betreuung") and derivation of female nouns from male nouns in French (e.g. "spectateur" -> "spectatrice"). With the sets of instructions inferred from these learning corpora, PRISM's application component generated targets for a set of words not in the learning corpora. The statistical results of these tests showed that the percentage of correctly generated targets for such sources as are not elements of the learning corpus is, on average, the higher the larger the learning corpus is. A further important result was that the percentage of correctly generated targets is the higher the more regular the morphological relation is: The tests yielded better results for the more regular derivation of female nouns from male nouns in French than for the less regular derivation of nomina actionis from verbs in German.</Paragraph>
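The measure used in these tests — the percentage of correctly generated targets over held-out source-target pairs — is simple to state in code. Here `apply_instructions` is a stand-in for PRISM's application component, and the naive "-s" rule below is a toy, not one of PRISM's inferred instruction sets.

```python
# Sketch of the evaluation measure: the share of held-out pairs for which
# an instruction set produces the right target. `apply_instructions` is a
# placeholder for PRISM's application component.

def accuracy(apply_instructions, held_out_pairs):
    correct = sum(1 for src, tgt in held_out_pairs if apply_instructions(src) == tgt)
    return 100.0 * correct / len(held_out_pairs)

# toy stand-in for an inferred instruction set: always append "-s"
naive_plural = lambda w: w + "s"
test_pairs = [("table", "tables"), ("fly", "flies"), ("fox", "foxes"), ("lay", "lays")]
print(accuracy(naive_plural, test_pairs))  # 50.0
```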
    <Paragraph position="39"> To test the fulfilment of the third condition, representative learning corpora were manually compiled for the derivation of nomina actionis from verbs in German (9,167 source-target-pairs) and for the derivation of female nouns from male nouns in French (89 source-target-pairs). The two sets of instructions automatically inferred from these two corpora were applied to large sets of sources which were not members of the learning corpora (4,793 sources for German, 211 sources for French). In both cases the percentage of correctly generated targets was 100%.</Paragraph>
    <Paragraph position="40"> Condition 4 was tested with learning corpora for the pluralization of English nouns and for the derivation of female nouns from male nouns in French. An exact quantification of the degree of accuracy is not possible, since this condition contains some vague expressions such as "regular" and "usually". My subjective judgement is that the instructions constructed by the learning algorithm for (approximately) representative corpora are quite similar to the morphological regularities described in traditional grammars. This may be illustrated by an example: The learning corpus shown in the figure is approximately representative for the regular pluralization patterns of English nouns. From this corpus PRISM inferred the following set of instructions, which represent the most important pluralization rules:
(1) "" -> "" / #
(2) "f" -> "ves" / #
(3) "fe" -> "ves" / #
(4) "y" -> "ies" / ("d" | "l" | "p" | "r" | "t") #
(5) "" -> "es" / ("ch" | "sh" | "s" | "x" | "z") #
(6) "" -> "s" / #</Paragraph>
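Reading the OCR-damaged instruction list above as the familiar regular patterns (f/fe -> ves, y -> ies after the listed consonant letters, -es after sibilant spellings, -s by default) — a reconstruction, not a verbatim transcription — the ordered suffixal instructions can be run as follows; the prefixal identity instruction (1) is omitted since it changes nothing.

```python
# The inferred pluralization instructions, as reconstructed from the
# damaged text. Rules are tried in order, so the specific instructions
# shadow the general "-s" default, matching the paper's ordering rule.

RULES = [
    ("f",  "ves", ()),                          # (2) thief -> thieves
    ("fe", "ves", ()),                          # (3) knife -> knives
    ("y",  "ies", ("d", "l", "p", "r", "t")),   # (4) fly -> flies
    ("",   "es",  ("ch", "sh", "s", "x", "z")), # (5) fox -> foxes
    ("",   "s",   ()),                          # (6) default
]

def pluralize(noun):
    for x, y, contexts in RULES:
        stem = noun[:len(noun) - len(x)] if x else noun
        if noun.endswith(x) and (not contexts or any(stem.endswith(z) for z in contexts)):
            return stem + y
    return noun

for w in ["table", "fly", "fox", "class", "thief", "lay"]:
    print(w, "->", pluralize(w))
```

This reproduces the behaviour claimed in Section 1: "flies" from "fly", "tables" from "table", "foxes" from "fox", "lays" from "lay" (the stem "la" matches none of the listed consonant contexts), "classes" from "class", and "thieves" from "thief" — while irregular nouns like "ox" or "tooth" naturally fall through to the wrong "-s"/"-es" default.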
  </Section>
</Paper>