<?xml version="1.0" standalone="yes"?>
<Paper uid="W01-0702">
  <Title>Combining a self-organising map with memory-based learning</Title>
  <Section position="6" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
6 Results
</SectionTitle>
    <Paragraph position="0"> Table 1 gives the results of the experiments. The columns are as follows: - "features". This column indicates how the features are made up.</Paragraph>
    <Paragraph position="1"> "lex" means the features are the lexical space vectors representing the POS tags.</Paragraph>
    <Paragraph position="2"> "orth" means that orthogonal vectors are used. "tags" indicates that the POS tags themselves are used. MBL uses a weighted overlap similarity metric, while SOMMBL and LSOMMBL use the Euclidean distance.</Paragraph>
    <Paragraph position="3"> For LSOMMBL, SOMMBL and MBL, training was performed on sections 15 to 18. The fscores of the best performers in each case for LSOMMBL, SOMMBL and MBL have been highlighted in bold. - "window". This indicates the amount of context, in the form of "left-right", where "left" is the number of words in the left context and "right" is the number of words in the right context.</Paragraph>
    <Paragraph position="4"> - "Chunk fscore" is the fscore for finding base NPs. The fscore F is computed as F = 2PR / (P + R), where P (precision) is the percentage of base NPs found that are correct and R (recall) is the percentage of base NPs defined in the corpus that were found.</Paragraph>
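    <Paragraph> As a quick sanity check of the fscore definition above, a minimal sketch in Python (the function name and the example percentages are our own, not taken from the paper's tables):
```python
def fscore(precision: float, recall: float) -> float:
    """Fscore with beta = 1: the harmonic mean of precision and recall.

    Both arguments are percentages, as in the paper's Table 1,
    so the result is also a percentage.
    """
    return 2 * precision * recall / (precision + recall)

# Illustrative values: 92% precision and 90% recall
print(round(fscore(92.0, 90.0), 2))  # 90.99
```
Because the fscore is a harmonic mean, it is pulled towards the lower of the two percentages, which is why it is a stricter summary than chunk tag accuracy.</Paragraph>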
    <Paragraph position="7"> - "Chunk tag accuracy" gives the percentage of correct chunk tag classifications. This is provided to give a more direct comparison with the results in (Daelemans et al., 1999a). However, for many NL tasks it is not a good measure, and the fscore is more accurate.</Paragraph>
    <Paragraph position="8"> - "Max comparisons". This is the maximum number of comparisons per novel item, computed as c(n + m), where c is the number of categories, n is the number of units and m is the maximum number of items associated with a unit in the SOM. This number depends on how the map has organised itself in training. The number given in brackets is the percentage this number represents of the number of training items, i.e. of the maximum number of comparisons under MBL. The average number of comparisons is likely to be closer to the average mentioned in Section 4.1.</Paragraph>
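    <Paragraph> The bound just described can be sketched as follows (a hypothetical helper with illustrative numbers, not figures from the paper; we assume the reconstruction c(n + m) of the garbled formula is correct):
```python
def max_comparisons(categories: int, units: int, max_items_per_unit: int) -> int:
    """Upper bound on comparisons per novel item under (L)SOMMBL.

    For each category, the novel item is compared against every SOM unit
    to find the winning unit, and then against the items stored at that
    unit; hence c * (n + m).
    """
    return categories * (units + max_items_per_unit)

# Illustrative: 3 categories, a 10x10 map (100 units),
# at most 40 training items stored at any single unit
print(max_comparisons(3, 100, 40))  # 420
```
Under MBL, by contrast, every novel item is compared against all training items, so the bracketed percentage is simply this bound divided by the training set size.</Paragraph>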
    <Paragraph position="9"> Table 2 gives the sizes of the SOMs used for each context size, and the number of training items.</Paragraph>
    <Paragraph position="10"> For small context sizes, LSOMMBL and MBL give the same performance. As the window size increases, LSOMMBL falls behind MBL. The worst drop in performance is just over 1.0% on the fscores (and just over 0.5% on chunk tag accuracy). This is small considering that, for example, with the largest context the number of comparisons used was at most 6.8% of the number of training items. To investigate whether the method is less risky than the memory editing techniques used in (Daelemans et al., 1999b), we re-analysed their data for the same task, albeit with lexical information, to find the exact drop in chunk tag accuracy.</Paragraph>
    <Paragraph position="11"> In the best case with the editing techniques used by (Daelemans et al., 1999b), the drop in performance was 0.66% in chunk tag accuracy at 50% usage (i.e. 50% of the training items were used). Our best case involves a drop of only 0.23% in chunk tag accuracy at 20.4% usage. Furthermore, their worst case involves a drop of 16.06% in chunk tag accuracy, again at the 50% usage level, whereas ours involves only a 0.54% drop in accuracy at the 6.8% usage level. This confirms that our method may be less risky, although a more direct comparison is required to demonstrate this in a fully systematic manner. For example, our system does not use lexical information in the input where theirs does, which might make a difference to these results.</Paragraph>
    <Paragraph position="12"> Comparing the SOMMBL results with the LSOMMBL results for the same context size and tagset, the differences in performance are insignificant, typically under 0.1 points on the fscore and under 0.1% on the chunk tag accuracy. Furthermore, the differences are sometimes in favour of LSOMMBL and sometimes not. This suggests they may be due to noise from different weight initialisations in training rather than to a systematic difference.</Paragraph>
    <Paragraph position="13"> Thus, at the moment it is unclear whether SOMMBL or LSOMMBL has an advantage over the other. It does appear, however, that using orthogonal vectors to represent the tags leads to slightly worse performance than using the vectors derived from the lexical space representations of the words.</Paragraph>
  </Section>
</Paper>