<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-1032">
  <Title>Morphological Rule Induction for Terminology Acquisition Béatrice Daille</Title>
  <Section position="6" start_page="218" end_page="219" type="evalu">
    <SectionTitle>
5 Results and Evaluation
</SectionTitle>
    <Paragraph position="0"> Our corpus, called [AGRIC], is made up of 7 272 abstracts (430 000 words) from French texts in the agriculture domain, extracted from PASCAL. We used the Brill Part-of-Speech Tagger (Brill, 1992), trained for French by (Lecomte and Paroubek, 1996), and the lemmatizer developed by F. Namer (Toussaint et al., 1998).</Paragraph>
    <Section position="1" start_page="218" end_page="218" type="sub_section">
      <SectionTitle>
5.1 Quantitative results
</SectionTitle>
      <Paragraph position="0"> Table 2 summarizes the number of base structures extracted from the [AGRIC] corpus. From these base structures, 395 groupings were identified.</Paragraph>
      <Paragraph position="1"> The linked presence of noun phrases whose extension is fulfilled either by a relational adjective or by a prepositional phrase is rare: a little more than 1 % of the total number of occurrences. But these groupings allow us to extract, from the numerous hapaxes (more than 70 % of the total number of occurrences), candidates which, we presume, will be highly denominative, and to increase the number of occurrences of a candidate term. The number of relational adjectives which have been identified is 129: agronomique (agronomical), alimentaire (food), arachidier (groundnut), aromatique (aromatic), etc.</Paragraph>
    </Section>
    <Section position="2" start_page="218" end_page="218" type="sub_section">
      <SectionTitle>
5.2 Linguistic Precision
</SectionTitle>
      <Paragraph position="0"> We checked the linguistic accuracy of the 395 structural variations which group a Noun1 Prep (Det) Noun2 structure and a Noun1 RAdj structure. Reported errors concern 3 incorrect groupings due to the homography, and the non-homonymy, of the adjective and the noun: fin (thin (Adj)/end (Noun)), courant (ordinary (Adj)/current (Noun)), potentiel (potential). This leads us to a linguistic precision of more than 99 % in the identification of relational adjectives. As a matter of comparison, (Jacquemin, 1999) obtained a precision of 69.6 % for the Noun to Adj morphosyntactic variations calculated according to the morphological families produced by a stemming algorithm applied to the MULTEXT lexical database (MULTEXT, 1998) on the same French corpus [AGRIC].</Paragraph>
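      <Paragraph> As a quick check of the figure above, the following minimal Python sketch recomputes the linguistic precision from the two counts reported in this subsection (395 groupings, 3 of them incorrect). The function name linguistic_precision is illustrative and does not come from the paper.

```python
def linguistic_precision(total_groupings: int, incorrect_groupings: int) -> float:
    """Share of groupings judged linguistically correct."""
    if total_groupings == 0:
        raise ValueError("total_groupings must be non-zero")
    correct = total_groupings - incorrect_groupings
    return correct / total_groupings


# Counts reported in Section 5.2: 395 groupings, 3 of them incorrect.
precision = linguistic_precision(395, 3)
print(f"{precision:.1%}")  # 99.2%
```

      The result, 392 correct groupings out of 395, is consistent with the stated precision of more than 99 %.</Paragraph>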
    </Section>
    <Section position="3" start_page="218" end_page="219" type="sub_section">
      <SectionTitle>
5.3 Informative Precision
</SectionTitle>
      <Paragraph position="0"> The thesaurus (AGROVOC, 1998) is a taxonomy of about 15 000 terms associated with synonyms in an SGML format, which leads to 25 964 different terms. AGROVOC is used for indexing data for agricultural retrieval systems and indexing systems. We made two comparisons with AGROVOC: first, we checked whether these RAdj were really part of its terms and, second, we compared the candidate terms extracted with an RAdj with its terms. We consider that the presence of the RAdj in AGROVOC confirms its informative character, and that the presence of a candidate term attests its terminological value.</Paragraph>
      <Paragraph position="1"> 5.3.1 Relational adjectives alone  From the 124 correct RAdj, 68 appear inside terms of the thesaurus in epithetic position, and 15 only under their noun form in an extension position; for example, arachidier (groundnut) does not appear, but arachide is used in an extension position. Moreover, among the 124 adjectives, 73 appear in AGROVOC under their noun form as uniterms. The adjectives which are not present in the thesaurus in an extension position under either their adjectival or noun form are 11 in number. So 93 % of them are indeed highly informative.</Paragraph>
      <Paragraph position="2">  5.3.2 Candidate terms with a relational adjective  For 9 RAdj belonging to AGROVOC, we compute the following indexes. TA: the number of terms in AGROVOC in which the relational adjective appears in an epithetic position, i.e. the terms of Noun RAdj structure. For example, TA=15 for the adjective cellulaire (cellular) because it appears in 15 terms of AGROVOC such as différenciation cellulaire (cellular differentiation), division cellulaire (cellular division). TN: the number of terms in AGROVOC in which the noun from which the relational adjective has been derived appears inside a prepositional phrase, i.e. the terms of Noun1 Prep (Det) NounRAdj structure.</Paragraph>
      <Paragraph position="3"> For example, TN=4 for the noun cellule (cell) because it appears in 4 terms of AGROVOC such as banque de cellules (cell bank), culture de cellules (culture of cells). CA: the number of candidate terms of Noun RAdj structure. For example, CA=61 for the adjective cellulaire (cellular) because it appears in 61 candidate terms such as acide cellulaire (cellular acid), activité cellulaire (cellular activity), agrégat cellulaire (cellular aggregate).</Paragraph>
      <Paragraph position="4"> CN: the number of candidate terms of Noun1 Prep (Det) NounRAdj structure. For example, CN=58 for the noun cellule (cell) because it appears in 58 candidate terms such as ADN de cellule (cell DNA), addition de cellules (cell addition).</Paragraph>
      <Paragraph position="5"> Then, for each candidate term counted in CA and CN, we checked for its presence in AGROVOC.</Paragraph>
      <Paragraph position="6"> The only matches that we have accepted are exact matches. With this comparison, we obtained the following indexes. a: the number of candidate terms of Noun RAdj structure found in AGROVOC under the Noun RAdj structure.</Paragraph>
      <Paragraph position="7"> b: the number of candidate terms of Noun RAdj structure found in AGROVOC under the Noun1 Prep (Det) NounRAdj structure.  c: the number of candidate terms of Noun1 Prep (Det) NounRAdj structure found in AGROVOC under the Noun RAdj structure.</Paragraph>
      <Paragraph position="8"> d: the number of candidate terms of Noun1 Prep (Det) NounRAdj structure found in AGROVOC under the Noun1 Prep (Det) NounRAdj structure.</Paragraph>
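      <Paragraph> The four match counts can be illustrated with a minimal Python sketch. It assumes, for illustration only, that every term has already been reduced to a triple (head noun, base noun of the modifier, structure), so that an exact match across structures becomes a set lookup; the paper does not say how the comparison was implemented. The names Term and count_matches and all the toy data are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Term(NamedTuple):
    head: str        # head noun, e.g. "division"
    base: str        # noun underlying the modifier, e.g. "cellule"
    structure: str   # "A" for Noun RAdj, "N" for Noun1 Prep (Det) NounRAdj


def count_matches(candidates: list[Term], thesaurus: set[Term]) -> Counter:
    """Count exact matches per (candidate structure, thesaurus structure)."""
    labels = {("A", "A"): "a", ("A", "N"): "b", ("N", "A"): "c", ("N", "N"): "d"}
    counts = Counter({label: 0 for label in labels.values()})
    for cand in candidates:
        for target in ("A", "N"):
            if Term(cand.head, cand.base, target) in thesaurus:
                counts[labels[(cand.structure, target)]] += 1
    return counts


# Hypothetical toy data, not results from the paper.
thesaurus = {
    Term("division", "cellule", "A"),   # division cellulaire
    Term("banque", "cellule", "N"),     # banque de cellules
    Term("culture", "cellule", "N"),    # culture de cellules
}
candidates = [
    Term("division", "cellule", "A"),   # found under Noun RAdj: counts toward a
    Term("culture", "cellule", "A"),    # found under the other form: counts toward b
    Term("acide", "cellule", "A"),      # no match
    Term("banque", "cellule", "N"),     # found under the same form: counts toward d
    Term("division", "cellule", "N"),   # found under Noun RAdj: counts toward c
]
print(dict(count_matches(candidates, thesaurus)))
```

      Reducing adjective and noun to a shared base noun is a design choice of this sketch; it stands in for whatever derivational link between RAdj and noun is used in practice.</Paragraph>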
      <Paragraph position="10"> These indexes allow us to compute precision P and recall R for each Noun RAdj structure and each Noun1 Prep (Det) NounRAdj structure with the help of the following formula:</Paragraph>
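      <Paragraph> The formula is not reproduced in this section. One reading consistent with the index definitions, offered here as an assumption and not as the paper's confirmed formula, takes precision as the share of candidates of a structure found in AGROVOC under either form, and recall as the share of AGROVOC terms of a structure matched by a candidate of either form: P_A = (a + b) / CA, P_N = (c + d) / CN, R_A = (a + c) / TA, R_N = (b + d) / TN. The Python sketch below implements that reading; the names Indexes, ratio and precision_recall are hypothetical, and the counts are invented solely to exercise the function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indexes:
    a: int   # Noun RAdj candidates found under Noun RAdj
    b: int   # Noun RAdj candidates found under Noun1 Prep (Det) NounRAdj
    c: int   # Noun1 Prep (Det) NounRAdj candidates found under Noun RAdj
    d: int   # Noun1 Prep (Det) NounRAdj candidates found under the same structure
    CA: int  # candidate terms of Noun RAdj structure
    CN: int  # candidate terms of Noun1 Prep (Det) NounRAdj structure
    TA: int  # AGROVOC terms of Noun RAdj structure
    TN: int  # AGROVOC terms of Noun1 Prep (Det) NounRAdj structure


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def precision_recall(ix: Indexes) -> dict[str, float]:
    """Assumed reading of P and R; not confirmed by the source text."""
    return {
        "P_A": ratio(ix.a + ix.b, ix.CA),
        "P_N": ratio(ix.c + ix.d, ix.CN),
        "R_A": ratio(ix.a + ix.c, ix.TA),
        "R_N": ratio(ix.b + ix.d, ix.TN),
    }


# Hypothetical counts chosen only to exercise the function.
toy = Indexes(a=3, b=1, c=2, d=1, CA=10, CN=20, TA=8, TN=4)
print(precision_recall(toy))
```

      Under this reading, an AGROVOC term matched by candidates of both structures is counted twice in the recall numerator, so a stricter variant would count distinct matched terms instead.</Paragraph>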
      <Paragraph position="12"> The averages of precision and recall for the two structures are summarized in Table 3. Comparing the average precisions shows that candidate terms with a Noun RAdj structure are 10 times more likely to be terms than their equivalents in Noun1 Prep (Det) NounRAdj. The average recall is also notable: it is generally difficult to obtain a recall above 25 % when comparing candidate terms extracted from a corpus with a thesaurus of the same domain (Daille et al., 1998). The average recall obtained thanks to the identification of RAdj shows that nearly half of the terms built with the defined RAdj are identified. These good values of precision and recall have been obtained on linguistic criteria only, without taking frequency into account.</Paragraph>
    </Section>
  </Section>
</Paper>