Morphological Rule Induction for Terminology Acquisition 
Bdatrice Daille 
IRIN, 2, rue de la Itoussinib, re. BP 92208, 44322 Nmll;es Cedex 3 France 
daille(@irin.univ-nantes.fl 
Abstract 
We 1)res(;111; the identiti(:ation in corl)Ol"a of 
th:elM1 relatio11M adjectives (RAdj) such as 
gazc'uz (gaseous) which is derived from the noun 
gaz (.(/as). RAdj at)t)earillg in nonlinal phras('s 
are int('resting tbr ternlinology acquisition 1)e- 
cause they 11ollt a llalning flnl(:tion. The (leriw> 
tio11M rules emt)loyed to (:omt)ute the nora1 front 
which has been deriv('(t the RAdj are a(xluired 
s('nli-mltonmti(:ally fronl n t~Gge(t and ~l leln- 
matize(t (:orl)ora. q'hesc rules are then integr;tt- 
ell into ~t t('~rmer whi('h i(h',ntifies I{A(lj tlmnks 
to their 1)roi)el"ty of being paraphrasal)h; l)y 
a prepositionM phrase. RA(tj and comt)ound 
nouns which inchlde a I{A(tj m'e 1;111;11 (tuanti- 
fled, their linguistic precision is lneasured and 
their iutbrmative status is e.vahl;Lted thnnks to 
~ thesaurus of the. dOUlMn. 
1 Introduction 
litelM\[ying relationM adje(:tives (l{.Adj) such a.s 
malarial, ~md 11Ol111 phrases in whirl1 they hi)- 
pear su(:h as malarial mosq'uitoc.s, could be iu- 
teresting in several tiehts (if NLP, such as ternli- 
nology acquisition, tot)i(: detection, updating of 
thesauri, tm(:mlS(; they hold ~t 11~mlillg flln(:tiou 
acknowledged t)y linguists: (Levi, 1978), (M61is- 
Pudluhl, 1991), (;to. The us(', of RAdj is par- 
ticularly ti'e(luent in scieutiti(: tields (Monceaux, 
1993). P~m~(loxically~ ternlinology acquisition 
systems suc\]l as TEI{MINO (David ~md Plant(': 
11,)90), LEXTER (Bourigault, 1992), TERMS 
(.hlsteson and Katz, 1995), have not 1)een con- 
(:erned with RAdj. Even (I1)ekwe-Sanjua, 1998) 
in her study of tern1 wMatiOllS for idelltit)illg 
research tot)its fi'onl texts does 11ot; take into 
account derivatio1111 WMmltS. Our (:ou(:ern is: 
1. 31) idelltit~y 1101111 phrases in wlli(:ll relation- 
al adje(:tives nt)l)ear, as well as the prel)o- 
sitiollM I)llrases l)y which they could 1)c 
t)ar;~l)llrase(t. We will see through anotll- 
(;1" source 1)resented iu section 2 that this 
t)l"Ot)erl~y of parai)hrase (;tin be used to i- 
dellti(y these adjectives. 
2. To check the naming character of these ad- 
je('tives and to evahlate the 11;ruling (:lmra(:- 
ter of the, noun 1)hras(;s in which they ;11)- 
l)em'. 
Moreover, i(hmtitlying both the a(tje(:tive ~ll(l 
the t)ret)ositional phrase is useflll ill th(; tMd of 
ternlinology a(:(luisition for t)eribrlning accurate 
tel'n1 llornlalization l)y grout)ing synonynL tbt'lnS 
referring to an uuique coneet)t such as p~vduit 
laitier (dairy prod'uct,) :re(l, pro&tit a'u lair (prod- 
'uc/; 'with, milk), pTvd,uit de lair (p'~vd',,ct of milh), 
p~vd'uit issv, du lair (p~wd,uct, made o.f milk), (;t(:. 
3\]) (:m'ry out this i(tentitic;ttion, we use shal- 
low t)m'Sillg (Almey, 1991), and then, tbr m()r- 
i)hologi(:al processing, a dymmli(: nlethod wllMl 
t~lkes ~s input n (:orl)us \]M)eled with t)m't-ol "- 
sl)eech and lellUUa l;;lgs. ~J)lle lnorl)hologicM 
rules m'e l)uilt selui-autonlaticMly ti'oln the (:or- 
l)US. 
I\]1 this stu(ty, we tirst defiue, and give some lin- 
guistic 1)roperties of RAdj. We then l)resent the 
method to build morphological rules mM how to 
integrate then into a ternl extractor. \¥e qUall- 
tit~y the resullis ot)tailled fl'Oln a te(:hnical eorl)us 
in the tield of agriculture \[AGII,IC\] and evaluate 
their linguistic mid int'or111~tive precision. 
2 Linguistic properties of relational 
adjectives 
Ac(:ording to linguistic and gralnlnaticM tradi- 
tion, there are two nlain categories aUlOllg adjec- 
tives: el)ithetic slM1 as important (,sign'~ificant) 
and relatio11M adjectives such as laitier Malty). 
The tirst ones cannot \]l~ve an ~gentive interl)re- 
215 
ration in contrast to the second: tile adjective 
laiticr (dairy) within the uoun phrase pr'oduc- 
lion laiti~re (dairy production) is an argument 
to the predicative noun production (production) 
and this is not the case fbr the adjective impof 
tant (significant) within the phrase production 
importante (significant production). Relation- 
al adjectives (RAdj) possess the following well- 
known linguistic properties: 
• they are either denonfinal adjectives -- 
morphologically derived from a noun 
thanks to suttix--, or adjectives having a 
noun usage such as mathdmatique (math- 
cmatical/mathcmatics). For the former, 
not all the adjective-tbrming sufiqxes lead 
to relational adjectives. The following suf- 
tixes are considered by (Dubois, 1962) as 
appropriate:-ain, -air'e, -al, -el, -estr'c, 
ien,-icr',-il(e),-in,-ique. However, (Guy- 
on, 1993) remarks that a suffix, even 
the most appropriate, is never necessary 
nor sufficient. Several adjectives carry- 
ing a favorable suffix are not relationah 
this is the case with the adjectives ending 
with -iquc (-ic), which characterize chem- 
istry and which are not derived from a 
noun, such as ddsox!/ribonucldique (deoryri- 
bonucleic), dodecanoiquc Modecanoic), etc. 
Other suffixes inappropriate are sometimes 
used such as the suffixes -d and -e'a:~:: car- 
bone &,'bon) -* car'bon~J (~.a,'bo,,,,eeo'a,~), 
c,,,,ce," #a,,,cer9 + ca,~c~r'e'a:~ &ancc','o'~,~), 
etc. 
• they own tile possibility, in special condi- 
tions, of replacing tile attributiw'~ use of 
a corresponding prepositional phrase. The 
preposition employed, as well as tile pres- 
ence or not of a deternfiner, depends on the 
head noun of the noun phrase: 
aciditd sanguine (blood acidity) ~_ aciditd 
du sang (acidity of the blood) 
conqugtc spatiale (space conquest) ~_ con- 
qu~tc de l'espace (conquest of space) 
ddbit horairc (hourly rate) ~- ddbit par' 
heure (rate per" h, our) 
cxpdrimentations animales (animal experi- 
mentation) ~ cxpdrimcntations sur lea an- 
imaux (experimentation on animals) 
• and several other properties such the im- 
possibility of a predicative position, the ill- 
compatibility with a degree modification, 
etc. 
3 Morphological Rule Induction 
~lb identify RAdj trough a term extractor, we 
use their paraphrastic property which inchldes 
the morphological property, the morl)hological 
property being insufficient alone. We need rules 
to recover the lemma of the noun fl'om which the 
lemma of the RAdj has been derived. 
These rules tbllow the tbllowing schemata: 
r~ = \[-S +M \]{exceptions} where: 
S is the relational suffix to be deleted from the 
end of an adjective. The result of this dele- 
tion is the stem R; 
M is the mutative segment to be concatenated 
to R in order to tbrm a noun; 
exceptions list the adjectives that should not 
be submitted to this rule. 
For example, the rule \[-d -l-e \]{agd} says that 
if there is an adjective which ends with d, we 
should strip this ending from it and append tile 
string c to tile stem except if this a4jective be- 
longs to tile list of exceptions, namely agd. 
We extract these mort)hological rules Kom 
the corpora following the method presented in 
(Mikheev, 1997) with the difl'erenee that we 
don't limit the length of the mutative segmen- 
t. The relational suffixes are known, only the 
nmtative segments have to be guessed. For tlm 
lemma of an adjective ending with a relational 
suffix in the corpus Adji, we strip this suffix of 
Adji and store the resulting stem ill R. Then, 
wc try to segment this stein R to each noun 
Nounj at)pearing in the corpus. If the subtrac- 
tion result in all non-empty string, the system 
creates a morphological rule where tile muta- 
tive segment is tile result of the subtraction of 
R to Nounj. We thus obtained couples (Ad.ii, 
Nounj) associated to a morphological rule. For 
example: (gazeux, gaz) \[-cux +""\]. 
This schemata doesn't take into account stem 
alternants such as: 
el6 alphabe t/ aph, abd t-ique 
~/~ hygi~ ne/hygidn-ique 
e/i polle n/polli n-ique 
x/c th, orux / thorac-ique 
216 
In order to h~mdle this alh)mort)hy, we, use the 
Lcvenshtein's weighted distance (l,cvenshtcin, 
1.966) which determines the minimum numl)e,r of 
insertions or deletions of characters to transfor- 
m one word into another. (Wagner and Fisher, 
1974) presents n re(:nrsive ~dgorithm to (:ah:ulate 
this dist~mcc. 
• ! &.s t ('w ~,i, 'w~ ,j ) = 
min(di.st (w~ ,i-~, ""~ ,j) + q, 
• ~ di,~t(wi,i, 
with w~,m 1)eing the substring t)egimfing nt tlm 
1l I'h' C\]I}II'}I, CtCI" }~ll(1 tinishing after tim mth char- 
acl;(;r of the word w, 
dis@c,y) = 1 i.f:c--y 
= 0 if :~: ¢ y 
and 
q cost; of the, inserl;ion/de, h',tion of one, character 
p cost of" t;he sul)stitution of one (:h;~racter |)y 
~mothcr. 
Generally, a substitution is (:onsidcr(~,d as a dch~- 
lion fi)llowed 1)y ;m insertion, thus I ) -- 2(1• Wc 
apply this alg()rithm to e,a(:h stem 1{, ()l)tahm(t 
;d'te, r the (h~letion of tim r(~,lational suffix, that 
had not; 1)c(m found ~s a stem ()f n llOllll. \]~llt,, 
we add the constraint that l/. ~m(1 the n(mn must 
share the same, two first; characters, i.e. the sul)- 
string comput(:d t)cgin at character 3. We only 
rel;~fin cout)les comi)oscd of ml ~uljectivc and a 
noun with it Levenshtcin's w(;ightcd e(tual (;o 3 
(i.e. one sul)stitutiol~ + one insertion) . From 
the, se tout)los, wc dcdu(:c new rel;~tional suffix- 
cs to l)c ~ulded to list; of ~dlowc, d sullixes. More, 
1)re('iscly, we (:onsidcr theft such suffixes are, al- 
lomorphic w~rbmts of the relation suffixes. Wc 
also add new mort)hok)gic;d ruh',s. For cxam- 
ple, for the couple (hygi&t,c, hygidniquc,), we add 
the suffix -~niquc which is conside, red as an al- 
lomorph of the sutfix -iquc, mid creatc tim rule: 
\[-&t, ique +&~,e\]. However, this method doesn't 
rc,~ricve, RAdj lmilt from non ~mtonomous t)ascs 
of such 
nor from Latin noun 1)ases such as ph'r(,./patc/r 
(fathen'/patcr), vill@urb (tov,,,/,~rb). 
We check m~mmflly the rules ot)tained and 
ll,elational Number of Number 
Suffix allomorphs of rules 
-al 3 5 
-airc 4 8 
-d 2 2 
-d 1 2 
-or 1 2 
-cu:c 1 3 
-ion 1 2 
-i~:r 1 2 
-if 2 6 
- in 1 2 
-iquc 8 18 
-isle 1 1 
-cite 1 1 
Total 25 54 
Figm:c l: Numl)er of varimlts mid rules 1)y rel;L- 
tiomd suffix 
added to the list; of cxccptions thc wrong (lcriva- 
lions obtain(',(l. %d)\]c I prescnt, s tim 1mini)or of 
rules r(',t~dn(xt nn(t the mnubcr of v~riants fl)r 
(~(:11 suffix. 
4 Term Extractor 
First, we present the tcrm e, xtr~mtor ('hosen 
the, n, the modifications perfi)nn to enable the 
al)l)li('ation of the dcriw~tional rules. 
4.1 Initial Term Extractor 
ACAB\]T (\])ailh~, 1996), the term cxtra(:tor used 
ti)r this (!xt)(',rim(mt; eases I;he task ()f t;he, t;ernli- 
no\]ogist l)y proposing, \['or ;~ given (:orl)uS , a, list 
of (:mldi(l~tc terms ranked, from the most rei)- 
rcscnl;ativc of the domain to the lc:~sl; using a 
st~tistical score. Can(lid~tte terms whi(:h are cx- 
tr;tctcd fl:om the corlms t)elong to a Sl)CCiM type 
of cooc(:m:rcnces: 
• the cooc(:urrcn(:c is oriented and follows the 
lincar ordcr of the text;; 
• it; is ('Oml)OS(,xl of two lexi(:al milts whi('h (lo 
not l)elong to the, (:lass of functional words 
such as prcl)ositions, articles, etc.; 
• it m~tchcs one of the morphosyntactic pat- 
tcrns of wh~Lt wc will (:all "l)~se terms", or 
one of their t)ossible vm'iations. 
The l)atterns for base \[;CI'IlIS arc: 
Noun1 Adj cmballagc biod@radablc 
(biodcqradabh: packag(;) 
Noun1 Noun2 ions calcium 
217 
Nounl (Prep (Det)) Noun2 ions calcium 
(calcium ion) pTvtdine de poissons (fish 
protein), ehimioprophylaxie a'u r'~fa~n, pine 
(riJhmpicin chemoprophylazis) 
Noun1 5_ Vinf viandes ~t 9riller (grill meat) 
These base structures are not frozen structures 
and do accept several variations. Those which 
are taken into account are: 
1. Inflexional and Internal morphosyntactic 
variants: 
• graphic and orthographic variants 
which gather together predictable in- 
flexional variants: conservation de 
p~vduit (product preservation), conser- 
vations de p'rvduit (product preserva- 
tions), or not: conservation dc prod'ait- 
s (products preservation) and ease dif 
ferences. 
• variations of the preposition: eh, w- 
matographie en colonne (column 
chrwnatography), chromatographic sur 
colonne (chrvmatograph, y on col'area); 
• optional character of the preposition 
and of the z~rticle: fixation azote (hi- 
trogen fization), fixation d'azote (fiz- 
ation of nitrogen), fi.~:ation de l~azote 
(fization of the nitrogen); 
2. Intermfl modification variants: insertion in- 
side the base-term structure of a modifi- 
er such as the adjective inside the Noun1 
(Prep (Det)) Nom~2 structure: lair de bre- 
bis (goat's milk), lait cru de brebis (milk 
straigh, t .from the goat); 
3. Coordinational w~riants: coordination of 
base term structures: alimentation hu- 
maine (human diet), alimentation animale 
et hnmaine (human and animal diet); 
4. Predicative variants: the predicative role of 
the adjective: peetinc mdthylgc (mcthylate 
pectin), cos pectines sont m6thyldes (these 
pectins are metylated). 
The corpus is tagged and lemmatized. The pro- 
gram scans the corpus, counts and extracts col- 
locations whose syntax characterizes base-terms 
or one of their variants. This is done with shal- 
low parsing using local grammars based on reg- 
ular expressions (Basili et al., 1993). These 
grammars use the morphosyntactie information 
associated with the words of the corpus by the 
tagger. The different occurrences are grouped 
as pairs formed by lemmas of the candidate ter- 
m and sorted following an association measure 
which takes into account the frequence of the 
COOCCtlrrOllCeS. 
4.2 Term Extractor modifications 
The identilication of relational adjective takes 
place afl;er extraction of the occurrences of the 
candidate terms and their syntactic variation- 
s. The algorithm below resmnes the successive 
steps tbr identifying relational adjectives: 
1. Examine each candidate of Noun Adj struc- 
ture; 
2. Apply a transtbrmational rule in order 
to generate all the possible corresponding 
base nouns. We added morphosyntactie 
constraints for some suffixes, such as tbr 
the suffix -er, that the identitied adjective 
is not a past-participle; 
3. Search the set of candidate terms tbr a pair 
formed with Nomtl (identical between a 
Noun1 (Prep (l)°t)) Nou,~2 and a Noun1 
Adj structures) and Noun2 generated from 
step 2. 
4. If step 3 succeeds, group the two base struc- 
tures mlcter a new candidate term. Take 
out all the Noun Adj structures owing this 
adjective from the set; of Noun Adj candi- 
dates and rename them as a Nomt RAdj 
structure. 
I11 Step 2, morl)hoh)gical rules generate one or 
several nouns tbr a given adjective. We gener- 
ate a notllt for each relational suffix class. A 
class of suffixes includes the allomorphic vari- 
ants. This overgeneration method used in in- 
forlnation retrieval by (aacquemin and Tzouk- 
ermann, 1999) gives low noise because the base 
noun must not only be an attested for in the 
corpus, but must also appear as an extension of 
a head noun. For exanti)le, with the adjective 
ioniqne (ionic), we generate both ionic ('ionia) 
and ion (ion), but only ion (ion) is an attested 
tbrm; with the adjective gazeux (gaseous), the 
noun forms gaz #as) and gaze #auze); are gen- 
erated and the two of them are attested; but, 
the adjective gazeux (gaseous) appears with the 
218 
Nmnber of oc(:urrences 1 > 2 Total 
1)ase slir~l(:l;ures 
Nora1 Prep (\])et) Nora2 17 232 5 949 23 181 
Nora Adj 12 344 4 778 17 122 
Nora h Vinf 203 16 219 
'.FoCal 29 912 10 895 40 807 
Figure 2: Quantitative (bfl;a on 1)nse, stru(:tures 
llOllll dchange (ezch, ange) whi(:h is t)aral)hrased 
in the tort)us t)y dchangc de gaz (.qa.s ezchange) 
and not by ~.changc de gaze (gauze exehanftc). 
I,i)r adjectives with a mmn fimction, as for ex- 
ample pwbldmc technique (te.ehnical pTvblem) 
and Frobl&nc de tech.nique~ (pwbh:m of tech- 
7~,ics), we tl;tve ac(:el)ted th~tt ~t (:;m(ti(l~te term 
(:ouhl share several base stru(:tur('.s: on(; ()f type 
Nounl (Prep (l)et)) No,m2 and ;mother of type. 
N(mnl Adj. No comtmtalfion is n(;('.(lcd to see 
that Noun2 as Noun2 and Adj shin'(; the s;une 
1CIlSIlI~L 
5 Results and Evaluation 
Ore: corI)us, (:alled \[AGRIC\], is made up of 7 272 
aJ)str;tcts (/130000 wor(ls) fronl th'en(:h texts 
in tlm ~tgri(:ulture (tomnil~ mM extra(:te(t from 
PASCAL. We used 1;t5(; Brill t)a.rt-ofSt)ee(:h Tag- 
ger (Brill, 1992) trained for l,¥en(:h by (Le(:olntc~ 
and Pm'out)ek, 1996)) and the lelmnatizer (h> 
veh)ped t)y F. Na.mer (\[Ibussaint et M., 1998). 
5.1 Quantitative results 
q_~d)le 2 resmnes the mmfl)er of l)ase stru(:tures 
extr;mted from \[AGRIC\] corlms. \]q:om these 
t)ase structures, 395 groul)ings were identitied. 
The linked presence of noun l)hrases of which 
the extension is fultilled either 1)y a rebttional 
adjective, or l)e a l)rel)ositional phrase the nmn- 
ber is rare --a little bit more than 1. % of the 
tol;al of occurrence, s- . B15t, these groupings al- 
low us to extract from the 5mmerous hal);,x -- 
more than 70 % of l;he totM of occurrences 
candidates which, we presu5ne, will t)e, highly 
denonfinative and to increase the numt)er of oc- 
currences of a candidate term. The mmfl)er 
of relational adjectives which h~ve l)een identi- 
fied is 129: agTvnomique (agTvnomical), alimen- 
tai,'c, (fl, od), araeh, idier (groundn,,d), aromatiq'ac (arow, atie), 
etc. 
5.2 Linguistic Precision 
We chc(:k(;d tim linguistic accuracy of the 395 
structural wu'iations which group ~ Noun1 Prep 
(Det) N(mn2 structure ~md a Nounl RAd- 
j structure. Reported errors COlmern 3 incof 
re('t groupings due to 1;15('. homograi)hy , and 
the non homonymy, of the adjective ;tn(l the 
noun: fin gh, in (A@/(,',,d (Nou@), ,:o,a'ra,> 
t (ordi,,,ary(Adj)/e'm're.nt(Nov, n)), potentiel (po- 
tential). This lead us to a linguisti(" i)rc(:i- 
sion of more than 99 % in the identitication 
of relational adjectives. As ~ matter of com- 
1)arison, (Ja(:quenfin, 1999) obtained a pr(:(:i- 
sion of 69,6 % for the Nora5 to Adj morl)hO- 
synl, tmti(: wtriations (:M(:ulat(',d according to the 
morl)hologi(:M fimfilies l)roduced 1)y ~ sl;enl- 
ruing algorithm al)l)lied to the MUI;.I)F, XT lex- 
i(:;d datM)ase (MUIT.13'3XT, 1998) on the StLllle 
French corpus \[AGRIC\]. 
5.3 Informative Precision 
The thes~mrus (AGI/,()V()C, 1998) is ~ taxono- 
my of M)out 15 000 terms ;~ssocbtted with syn- 
onyms in n SGML fi)rm;~t, which leads to 25 964 
(tiff('xent terms. AGROVOC is used for indexing 
with (l~tta tittillg ;tgri(:ultural retriev;tl syst('.lliS 
and indexing syst(mlS. \~e lna(le two ('Oml)~tr- 
is(ms with AGI/OVOC: we tirst (:h(;(:k('A whetllcr 
thc.se RA(tjl~. were re.ally t)~rt of terms of it ml(t 
se(:oll(l, we colnt)~re(t the c~mdi(t,~te terlllS ex- 
tracted with a I/.A(lj with its terms. We ('onsi(t- 
or |;hat the t)resence of the I/,A(tj in AGR,()VOC 
(:ontirms its informative character, mM th}tt the 
l)resen(:e of a (:an(li(late t(;rm ~ttests its termi- 
nological wtlue. 
5.3.1 Relational adjectives alone 
Fronl the 124 correct RAdj, 68 appear insid- 
e terms of the thesaurus in epithetic 1)osition, 
and 15 only under their noun tbrm in an exten- 
sion position, for exmnple arach, idier (ground- 
n'at) does not appear but arach, ide is used in an 
extension position. Moreover, among the 124 
adjectives, 73 appear in AGROVOC under their 
noun term as mfitenns. The adjectives which 
are not l>resent ill the thesaurus in an extension 
t>osition tamer either their adje(:tiwfl or n<mn 
form are 11 in mmflmr. So 93% of them m'e 
indeed highly inf'ormtLtive. 
219 
5.3.2 Candidate terms with a relational 
adjective 
Pour 9 AdjR belonging to AGROVOC, we com- 
pute the tbllowing indexes: 
TA tile number of terms in AGROVOC in 
which tile relational adjective appears in an 
epithetic position, i.e. the terms of Noun 
RAdj structure. Fox" example TA=15 tbr 
the adjective cellulairc (eellular) because it 
appears in 15 terms of AGROVOC such 
as di./~renciation cellulairc (cellular differ'- 
enciation), division cclIulaire (cellular divi- 
sion). 
TN the number of terms in AGROVOC in 
which the noun from which has 1)een de- 
rived the relational adjective appears in- 
side ~ prepositional phrase, i.e. the terms 
of Nounl Prep (Det) Nounl~Adj structure. 
For example TN=4 tbr the noun eellulc 
(cell) because it appears in 4 terms of A- 
GROVOC such as banque de ccllulcs (cell 
bank), c'alt'a,'e de ecUules (e~tlt~u'e of cells). 
C A the number of candidate terms of Noun 
RAdj structure. For example, CA=61 for 
the adjective celluIaire (cellular) because it 
appears in 61candidate terms such as acidc 
cellulaire (cellular acid), activitd cell'alaire 
(cclluhtr activity), agr@at cell'ulaire (ccll'a- 
la'r aggregate). 
C N the munber of candidate terms of Noun1 
Prep (Det) NounltAd j structure. For exam- 
ple CN=58 tbr the noun eellule (cell) be- 
cause it appears in 58 candidate terms such 
as ADN de cellule &ell DNA), addition de 
cellules (cell addition). 
Then, tbr each candidate term of CA and CN, 
we checked tbr their presence in AGROVOC. 
Tile only matches that we have accepted are 
exact matches. With this comparison, we ob- 
tained the following indexes: 
a the number of candidate terms of Noun RAdj 
structure tbund in AGR.OVOC under the 
Noun RAdj structure. 
b the number of candidate terms of Noun RAdj 
structure tbund in AGROVOC muler the 
Nounl Prep (Det) NounlIAdj structure. 
Noun RAdj N1 Prep (Det) NIIA4i 
Precision 0,34 {},{}4 
Recall 0,46 O, 14 
Figure 3: Averages of precisions and recalls 
c the number of candidate l;erms of Nounl 
Prep (Det) Nounl~Adj structure found in A- 
GROVOC under the Noun RAdj structure. 
d the number of candidate terms of Nounl. 
Prep (Det) Noun~Adj structure found in 
AGROVOC under the Noun1 Prep (Det) 
NounRAdj structure. 
These indexes allow us to compute precision 
P and recall R for each Noun RAdj structure 
and each Noun1 Prep (Det) Noun~Adj structure 
with the help of the fbllowing tbrmula: 
((,, + b) I'No~,~A~j -- C~ (1) 
+ d) (2) 
aNounPrep(Del.)Nounl~A,lj -- CN 
(a + t,) (3) ~NounRAdj -- TA 
(c +d) 
l~,Nounl)rep(Det)Nounl¢A4i -- TAr (4) 
The averages of precision and recall for the t- 
wo structures are summarized in table 3. This 
comparison of the average of precision comput- 
ed shows that candidate terms with a Noun 
RAdj structure are 10 times more likely to be 
terms than their eqniwflent in Nounl Prep (De- 
t) Nounl~.Adj. The analysis of the average of re- 
call is also impressive: it is generally difficult to 
obtain a recall sut)erior to 25 % when comparing 
candidate terms extracted from a corpus and 
a thesaurus of the same domain (Daille et el., 
1998). The average of recalls obtained thanks 
to the identification of RAdj shows that nearly 
half of the terms lmilt with the defined RAdj are 
identified. These good wflues of precision and 
recall have been obtained on linguistic criteria 
only without taking into account frequency. 
6 Conclusion 
Tile method proposed in this study to acquire 
morphological rules fl:om corpora in order to re- 
cover derivational term variations trough a ter- 
m extractor and identi(y relational adjectives 
220 
shows an excellent I)recision. We h~v(; Mso 
proved that noun l)hrases including a l l,Ad.i arc 
fitr more infornlativ(; l;hmt their equivMent in 
Nounl Pre 1) (Det) Nounlbb/j stru(;ture. \¥c still 
h~we to write the program whose task will t)e to 
merge, new mort)hologicM rules ttcquire, d Kom 
another (:orlms with t\]le existing Olle, S. 

References 

S. A1)n(;y. 1991. l~&rsing with (:hunks. In 
R. Berwi('k mid C. Tcnny, extitors, Principh;- 
Base Parsing, I)agcs 257 278. Kluwer Aca(h;- 
too(: Pul)lishers. 

AGR()VOC, 1998. A GI~OVOG'- M'altiling'aal 
Agricult'mul Th, c,.s'a'aru.s', l?ood and Agricul- 
tural ()rganiz~tion of the United N;~tions. 
httl)://www.f~u/.org. 

l{.ol)crto Basili, Mm:b~ 'l.bresa l)azienza, mM 
l)aob~ Velar(li. 1993. Acquisition of Selc(:tiolF 
al PaA, terns in Sul)lmlgu~gcs. Math, in('. 7;ran- 
lation,, 8:175 201. 

Didier Bourigault. 1992. Surface grammatical 
analysis for the extraction of terminological 
noun phrases. In COLING '92, pages 977-981, Nantes, Frmme. 

Eric Brill. 1992. A simple rule-based part of 
speech tagger. In ANLP '92, pages 152-155, 
Trcnl;o, mar(:h. 

Bd~d;ri(:(,' l)Mlle, Eri(: Ga.ussier, ;m(l .le, ml-Mm'(: 
LanK& 1998. An (',wduati()n ()f statisti(:al 
s(;or(~s fOl' Wolxl ass()(:inti()n. In .lonathan 
(finzt)urg, Zm'al) Kha.si(tashvili, C:u'l Vogel, 
&;;m-,\]a(:(tues Ldvy, ~md Era'i(: Va.llduvi, ed- 
itors, 77~,e 7'blisi Symposium on l,ogic , Lan- 
g'uafle and Computation: ,~clccl,('d Papers, 
pnges 1177 188. CSLI Publications. 

Bdatrice \])Mlle. 1996. Study ;rod imt)l(',menta- 
tion of ('onfl)in(;(l techni(tue, s for ;mt()nl~ti(: ex- 
traction ()f terminology. In Judith l~. l(bwan- 
s and Philil) Rcsnik, (;ditors, The, Bala'aci'nfl 
Act - Combining Symbolic and Statistical Ap- 
proach, es to Language , (:hal)ter 3, t)~ges 28 49. 
MIrl? \]?tess. 

Sot)hie David and 17. Plante,. 1990. L(; 1)rogi- 
(:iel tcrmino : l)e, la ndc(;ssit;d (l'mie, ml~lyse 
morphosyntaxique pour le ddt)ouillement ter- 
minologique, des textes. In lCO, volume 2. 

,l. Dul)ois. 1962. Etude s'ar ht ddrivation suf- 
.fixale (',',, F'ra',,~:ai.~' 'm, odcrne ~:l, co'nicmi)orain. 
Lm:oussc, Paris. 

Anne, Guyon. 199"1. Lt's adjeet'(fs r('Jalion',,t',ls 
arguments de noms pre~dieat@. Ph.D. thesis, 
Univea'sitd Paris 7. 

Fidelia Ibekwe-Sanjua. 1998. Terminological 
variation, a mean of identifying research topics
from texts. In COLING-ACL'98, volrume 1, t)t~g(;s 564-570, MontrM, Canada. 

Christian ,la(:quemin mM Evelyne Tzoukerman- 
n. \]999. Np1 tbr term variant extra('tion: 
Syn(;rgy between mort)hoh)gy, lexicon ~md 
synt~x. In T. StrzMkowski, editor, Nat, u- 
ral Language Processing and IT~:formation Re- 
trieval. Kluwer, Boston, MA. 

Christian Jacquemin. 1999. Syntagmatic and 
Paradigmatic Representation of Term Variation. In ACL'99, University of Marylnnd. 

,l. Justeson ;rod S. K;tl;z. 1995. Technical ter- 
minology: Some linguistic l)roperties mM ml 
Mgorithm for id(mtitic~tion in text. \]ill ,lour- 
'hal fff Li',,g'H, isbh: Enflinecri'n,9, volum(; \]. 

.\]os(;l;t(', Le('omtc ~11(t Patri(:l{ 1)nr()ul)e,k. 1996. 
l,e (:at(goris(',ur (t'(;ri(: t)rill, raise (',n (mlvr(', (le 
la version (;ntr:md(; n l'imdt'. ~lb, t:hlfical tel)Oft , 
CNllS-INAIAL 

V.I. l~e,v(msht(;in. 1966. Binary (:ode, s cat)al)le of 
(:orr(;('ting deletions, insertions mM l"eversa\]s. 
Soy. \])h, ys.-Dokl., 10(8):707 710. 

Judith Levi. 1978. 7'he .syntaz and the seman- 
tics of complez 'nominals. A('adenfi(: Press, 
I~on(lon. 

A. Mdlis-1)u(:hulu. 1!)91. Les adj(;('tit~ 
ddnomina.ux : (h;s ~utje(:titls lie "r(,J~ttion". 
\],c:riq.uc, 10:33 60. 

Andrei Mikheev. 1997. Automatic rule induction 
for unknown-word guessing. Computational  
Linguistics, 23(3):405-423. 

Mine Moncemlx. 1993. La .formation des 'sore- 
s composds de str'act'are NOM ADJECTI£. 
Thb, s(; (le do(:tornt en linguisl;ique thdorique, 
et formcllc, Universitd de Mm:nc 1~ Valid(;. 

MULTEXT, 1998. \]~M)or~toire Pa.role et Ira.n- 
gag(;, httl):/ /www.ll)l.univ-aix.fr. 

Ymmi(:k Toussaint, l.'imnetta Nalner, Bdatrice 
l)aille, Christian ,\]a~c(tuentin , .\](;all l{oymd:d, 
mM Nal)il llIathout. 1998. Une api)roche 
linguistique et stntistique 1)ore: l'mmlyse de, 
l'informntion (',n corpus. In TALN'98, pages 
182 191, Pro'is. 

R. A. Wagner and M. J. Fisher. 1974. The 
string-to-string correction problem. Journal 
of the Association for Computing Machinery, 
21 (1):168 173. 
