<?xml version="1.0" standalone="yes"?>
<Paper uid="W99-0630">
  <Title>Automatically Merging Lexicons that have Incompatible Part-of-Speech Categories</Title>
  <Section position="6" start_page="251" end_page="256" type="evalu">
    <SectionTitle>
5 Experiment
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="251" end_page="253" type="sub_section">
      <SectionTitle>
5.1 Setup
</SectionTitle>
      <Paragraph position="0"> We tested the above method in a set of experiments using four commonly-used machine-readable dictionaries: Brill's lexicon, the Moby lexicon, the Collins lexicon, and the Oxford machine-readable dictionary, with characteristics as summarized in table 1. The lexicons use distinct POS tagsets of different tag granularities, as summarized in table 2.</Paragraph>
      <Paragraph position="1"> With these four training lexicons we can test twelve pairwise lexicon merging tasks, as shown in table 3. For each pair of lexicons, we intersected them using the strategy described above and produced a new set of training lexicons for each task.</Paragraph>
      <Paragraph position="2"> Note that the trimmed down Brill lexicon in the &amp;quot;Brill-to-Collins&amp;quot; task is not the same as the trimmed down Brill lexicon in &amp;quot;Brill-to-Moby&amp;quot;.</Paragraph>
      <Paragraph position="3"> In order to evaluate the accuracy of our methods, we asked a linguist to manually create twelve &amp;quot;gold standard&amp;quot; sets of POS mapping rules, R, one for each of the twelve pairwise lexicon combinations, based on the semantics of the POS tags alone. We then ran the experiments to automatically generate two sets of POS mapping tables, one under the complete world assumption and one using an anti-lexicon, in each merging task. We evaluated precision and recall on POS mapping rules as follows:

precision on POS mapping rules = |E'| / |E|

where

* E is the resulting tagset mapping table containing all mapping rules obtained from the experiment;

* E' is the subset of E which contains all correct mapping rules in R (E' ⊆ R).

recall on POS mapping rules = |E'| / |R|

Using an anti-threshold λ = 0.00001, we created twelve anti-lexicons which can then be used in our algorithm. We obtained the POS mapping results shown in table 4.</Paragraph>
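The two measures can be computed directly as set ratios; a minimal sketch (the tag pairs below are hypothetical, not drawn from the four lexicons):

```python
def mapping_rule_metrics(extracted, gold):
    """Precision/recall for a learned set of POS mapping rules.

    extracted: set of (source_tag, target_tag) rules produced by training (E)
    gold: the hand-built gold-standard rule set (R)
    """
    correct = extracted & gold  # E' = the correct rules found in E
    precision = len(correct) / len(extracted) if extracted else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical tag pairs for illustration:
learned = {("NN", "noun"), ("VB", "verb"), ("JJ", "noun")}
gold = {("NN", "noun"), ("VB", "verb"), ("JJ", "adj"), ("RB", "adv")}
p, r = mapping_rule_metrics(learned, gold)  # p = 2/3, r = 2/4
```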
      <Paragraph position="4"> lexicon_insertor(L^P, L^Q)
input:
1. Two sets of lexemes, each lexeme in the form of a pair of lemma and POS:
   L^P = {(m_i, p_j), ...}, L^Q = {(m_k, q_l), ...}
2. A POS mapping table B from the POS tagset P to the POS tagset Q:
   B = {(p_r, q_s), ...}
output: an enlarged set of lexemes in L^Q, which contains newly inserted lexemes converted from L^P:
   L^Q' = L^Q ∪ {(m_i, q_s)} where (p_j, q_s) ∈ B and (m_i, q_s) ∉ L^Q, for all i, j, s
algorithm:
foreach (m_i, p_j) in L^P do
  foreach (p_j, q_s) in B do
    if (m_i, q_s) not in L^Q then
      L^Q ← L^Q ∪ {(m_i, q_s)}
    end
  end
end
Algorithm 3: Lexicon merging algorithm</Paragraph>
      <Paragraph position="5"> In the baseline model, the precision is very low, mainly due to data sparseness caused by the fact that machine-readable lexicons usually do not contain full lexeme coverage. This means our &amp;quot;complete lexicon assumption&amp;quot;, which says that we can interpret entries not being in the lexicon as &amp;quot;negative examples&amp;quot;, is not correct.</Paragraph>
      <Paragraph position="6"> In the anti-lexicon model, the precision greatly improves, with some experiments even achieving 100% precision. Unfortunately, recall suffers sharply. After automatically constructing the POS mapping tables from training, we proceeded to merge lexicons in each testing task using the lexicon merging algorithm described above, and evaluated the accuracy of the merged lexicons as follows.</Paragraph>
      <Paragraph position="7"> In each merging task, we randomly selected 100 lexemes from the additional lexicon. Given these 100 lexemes, a linguist first manually constructed a set of correctly converted lexemes, which is used as the &amp;quot;gold standard&amp;quot; set of lexemes, R^L. Similar to the evaluation criteria outlined for POS mappings, we define precision and recall on lexicon merging as follows:

precision on lexicon merging = |E^L'| / |E^L|

where

* E^L is the set of lexemes generated by the lexicon insertor.</Paragraph>
      <Paragraph position="8"> * E^L' is the subset of E^L that contains all lexemes in R^L.</Paragraph>
      <Paragraph position="9"> recall on lexicon merging = |E^L'| / |R^L|</Paragraph>
    </Section>
    <Section position="2" start_page="253" end_page="255" type="sub_section">
      <SectionTitle>
5.2 Results
</SectionTitle>
      <Paragraph position="0"> We obtained the lexicon merging results shown in table 5.</Paragraph>
      <Paragraph position="1"> The anti-lexicon model significantly improves the precision in both the generated POS mapping rules and merged lexicons.</Paragraph>
      <Paragraph position="2"> Most of the 12 lexicon merging tasks achieve more than 92% precision, which cannot be obtained even by using the gold standard mapping rules, as shown in table 6. The recall degradation from using an anti-lexicon is lower in lexicon merging than in POS mapping rule learning, owing to the fact that not all POS tags appear in lexicons with the same frequency. For example, nouns and verbs occur far more frequently than prepositions and adverbs. High recall on POS mapping rules will not necessarily yield more accurately converted lexemes if all the mapping rules obtained are for rarely-occurring POS tags. Conversely, the successful generation of a single correct mapping rule for a frequently-occurring POS tag greatly improves recall. The mapping rules generated by our anti-lexicon model confirm this assumption: recall for POS mapping rules is 8%, but for lexicon merging it improves to about 22%.</Paragraph>
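The frequency effect described above can be checked with a toy count; the 90/10 split below is assumed purely for illustration (the paper only states that nouns and verbs dominate):

```python
# Toy source lexicon: 90 noun entries and 10 preposition entries.
source = [("w%d" % i, "NN") for i in range(90)] + \
         [("p%d" % i, "IN") for i in range(10)]

def merging_recall(rules):
    """Fraction of source lexemes that some mapping rule can convert."""
    converted = {(m, q) for m, p in source for pj, q in rules if pj == p}
    return len(converted) / len(source)

# One correct rule for a rare tag vs. one for a frequent tag:
rare_rule_recall = merging_recall({("IN", "prep")})      # covers 10/100
frequent_rule_recall = merging_recall({("NN", "noun")})  # covers 90/100
```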
      <Paragraph position="3"> Recall suffers sharply, but precision is more important than recall in lexicon merging. This is because the cost of post-merge clean-up of lexemes with incorrect POS tags is very high. A set of high-precision POS mapping rules guarantees a much cleaner resulting lexicon after merging. Thus during lexicon merging, a conservative algorithm, which generates fewer but more exact lexemes, is preferable. To show how anti-lexicons affect the precision and recall of lexicon merging, we also ran experiments using different combinations of sim-thresholds and anti-thresholds. In most cases, the precision of lexicon merging obtained with an anti-lexicon is much higher than without. The results are summarized in tables 7 and 8. The best precision for lexicon merging is obtained with γ = 0.8 and λ = 0.00001 in a grid search.</Paragraph>
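A grid search of the kind described can be sketched as follows; the `score` callback and the grids are hypothetical stand-ins for running and scoring one merging task, not the paper's pipeline:

```python
def grid_search(score, sim_grid, anti_grid):
    """Exhaustively try every (sim-threshold, anti-threshold) pair and
    keep the one whose score (e.g. merging precision) is highest."""
    return max(((score(g, a), g, a) for g in sim_grid for a in anti_grid),
               key=lambda t: t[0])

# Toy score surface that peaks at gamma = 0.8, lambda = 1e-5:
toy_score = lambda g, a: 1.0 - abs(g - 0.8) - abs(a - 1e-5)
best_score, best_gamma, best_anti = grid_search(
    toy_score, [0.6, 0.7, 0.8, 0.9], [1e-5, 1e-4, 1e-3])
```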
    </Section>
    <Section position="3" start_page="255" end_page="256" type="sub_section">
      <SectionTitle>
5.3 Discussion
</SectionTitle>
      <Paragraph position="0"> As mentioned earlier, the mapping rule learning algorithm we used permits m-to-n mappings so long as the mapping rules created for every tag in a lexicon reach the sim-threshold, that is, the confidence level specified by the lexicographer. An alternative approach that we are experimenting with is to allow only m-to-1 mappings, by simply choosing the mapping rule with the highest similarity score. In theory, this would seem to limit the possible accuracy of the algorithm, but empirically we have found that this approach often yields higher precision and recall. Further investigation is needed.</Paragraph>
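The m-to-1 variant amounts to an argmax over similarity scores rather than thresholding; a minimal sketch (the scores are hypothetical):

```python
def m_to_1(sim_scores):
    """Keep, for each source tag, only the single best-scoring target tag
    (the m-to-1 variant), instead of every rule above the sim-threshold.

    sim_scores: dict mapping (source_tag, target_tag) -> similarity score.
    """
    best = {}
    for (p, q), s in sim_scores.items():
        if p not in best or s > best[p][1]:
            best[p] = (q, s)
    return {(p, q) for p, (q, _) in best.items()}

# Hypothetical similarity scores:
scores = {("NN", "noun"): 0.9, ("NN", "verb"): 0.2, ("VB", "verb"): 0.7}
rules = m_to_1(scores)  # keeps ("NN", "noun") and ("VB", "verb")
```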
      <Paragraph position="1"> Different similarity scoring functions can also be used. If data sparseness is a serious problem, we can use a similarity score that counts only the lemmas that carry a given tag, ignoring the lemmas that do not.</Paragraph>
      <Paragraph position="2"> One effect of ignoring unlikely tags in this way is that the need for an anti-lexicon is eliminated. We are also currently investigating the mapping power of such variant methods.</Paragraph>
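One way such a positive-evidence-only score could look is sketched below; the Jaccard form and the lemma sets are our assumption, since the paper does not spell out the exact formula:

```python
def positive_only_similarity(lemmas_p, lemmas_q):
    """Similarity between two tags using only positive evidence --
    lemmas that actually carry each tag -- so that entries absent from a
    lexicon are never treated as negative examples, removing the need
    for an anti-lexicon. (Jaccard overlap is an assumed instantiation.)

    lemmas_p / lemmas_q: sets of lemmas carrying the two tags.
    """
    union = lemmas_p | lemmas_q
    return len(lemmas_p & lemmas_q) / len(union) if union else 0.0

# Hypothetical lemma sets:
sim = positive_only_similarity({"run", "walk", "eat"},
                               {"run", "walk", "blue"})  # 2 shared of 4 total
```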
      <Paragraph position="3"> In general, we have observed different behaviors depending on factors such as the granularity of the tagsets, the linguistic theories behind the tagsets, and the coverage of the lexicons.</Paragraph>
      <Paragraph position="4"> Finally, in addition to lexicon merging, POS mapping tables are also useful in other applications. Wu and Wong apply them in their SITG channel model to improve performance in their translation application (Wu and Wong, 1998).</Paragraph>
      <Paragraph position="5"> There is a serious problem of low recall in our anti-lexicon model. This is because our model prunes out many possible POS mapping rules, which results in very conservative lexeme selection during the lexicon merging process. Moreover, our model cannot discover which POS tags in the original lexicon have no corresponding tag in the additional lexicon.</Paragraph>
      <Paragraph position="6"> Our model took POS mapping rules as a natural starting point since this representation has been used in earlier related work. However, our experiments, which show low precision on lexicon merging even when using the human-generated gold standard mapping rules, indicate that it might not be a good approach to use POS mapping rules at all to tackle the lexicon merging problem. Our next step will be to investigate models that are not constrained by the POS mapping rule representation.</Paragraph>
    </Section>
  </Section>
</Paper>