<?xml version="1.0" standalone="yes"?> <Paper uid="P04-3026"> <Title>A Practical Solution to the Problem of Automatic Word Sense Induction</Title> <Section position="5" start_page="0" end_page="0" type="evalu"> <SectionTitle> 4 Results </SectionTitle> <Paragraph position="0"> Before we proceed to a quantitative evaluation, let us first give a qualitative impression of some results by looking at a few examples, and consider the contribution of SVD to the performance of our algorithm. Figure 1 shows a dendrogram for the word palm (corpus frequency in the lemmatized BNC: 2054) as obtained after applying the algorithm described in the previous section, with the sole modification that the SVD step was omitted, i.e. no dimensionality reduction was performed.</Paragraph> <Paragraph position="1"> The horizontal axis of the dendrogram is dissimilarity (1 - cosine), i.e. 0 means identical items and 1 means no similarity. The vertical axis has no special meaning; the order of the words is chosen only so that line crossings are avoided when connecting clusters.</Paragraph> <Paragraph position="2"> As we can see, the dissimilarities among the top 30 associations to palm are all in the upper half of the scale and not very distinct. The two expected clusters for palm, one relating to its hand sense and the other to its tree sense, have essentially been found.</Paragraph> <Paragraph position="3"> According to our judgment, all words in the upper branch of the hierarchical tree are related to the hand sense of palm, and all other words are related to its tree sense. However, it is somewhat unsatisfactory that the word frond seems equally similar to both senses, whereas intuitively we would clearly place it in the tree section.</Paragraph> <Paragraph position="4"> Let us now compare figure 1 to figure 2, which has been generated using exactly the same procedure, with the only difference that the SVD step (reduction to 3 dimensions) has been conducted in this case. 
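The pipeline behind these dendrograms can be sketched as follows. The word list, the toy co-occurrence matrix, and the choice of average linkage are our own illustrative assumptions, not the BNC-derived data or the exact settings of the paper:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Toy co-occurrence matrix: rows are hypothetical top associations of
# "palm", columns are context features. Values are invented for
# illustration; the paper works with much larger BNC-derived matrices.
words = ["hand", "finger", "wrist", "tree", "coconut", "frond"]
X = np.array([
    [8.0, 1.0, 0.5],
    [7.5, 1.2, 0.4],
    [7.0, 0.9, 0.6],
    [0.5, 6.0, 5.5],
    [0.4, 6.5, 5.0],
    [1.0, 5.5, 5.2],
])

# SVD step: project onto the top k singular dimensions (the paper
# reduces to 3 dimensions; k = 2 suffices for this toy matrix).
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_red = U[:, :k] * s[:k]

# Dissimilarity is 1 - cosine, as on the horizontal axis of the
# dendrograms; hierarchical clustering then builds the tree.
Z = linkage(pdist(X_red, metric="cosine"), method="average")

# Cutting the tree at its first split should separate the two senses.
labels = fcluster(Z, t=2, criterion="maxclust")
print(dict(zip(words, labels)))
```

Passing `Z` to `scipy.cluster.hierarchy.dendrogram` would draw a figure analogous to figures 1 and 2; omitting the SVD projection (clustering `X` directly) corresponds to the figure 1 setting.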
In figure 2 the similarities are generally at a higher level (dissimilarities lower), the relative differences are bigger, and the two expected clusters are much more salient. Also, the word frond is now well within the tree cluster. Obviously, figure 2 reflects human intuitions better than figure 1, and we can conclude that SVD was able to find the right generalizations. Although space constraints prevent us from showing similar comparative diagrams for other words, we hope that this novel way of comparing dendrograms makes it clearer what the virtues of SVD are, and that it is more than just another method for smoothing.</Paragraph> <Paragraph position="5"> Our next example (figure 3) is the dendrogram for poach (corpus frequency: 458). It is also based on a matrix that had been reduced to 3 dimensions.</Paragraph> <Paragraph position="6"> The two main clusters nicely distinguish between the two senses of poach, namely boil and steal.</Paragraph> <Paragraph position="7"> The upper branch of the hierarchical tree consists of words related to cooking; the lower one mainly contains words related to the unauthorized killing of wildlife in Africa, which is apparently an important topic in the BNC.</Paragraph> <Paragraph position="8"> Figure 3 nicely demonstrates what distinguishes the clustering of local contexts from the clustering of global co-occurrence vectors. To see this, let us turn our attention to the various species of animals that are among the top 30 associations to poach. Some of them seem more often affected by cooking (pheasant, chicken, salmon), others by poaching (elephant, tiger, rhino). 
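The contrast between clustering local contexts and clustering one global vector per word can be made concrete with a small sketch; the two-dimensional feature space and all values are invented for illustration:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Each row is one LOCAL context in which a word like "rabbit" co-occurs
# with "poach"; the two columns are invented features ("cooking-ness"
# and "wildlife-crime-ness") standing in for real context-word counts.
local_contexts = np.array([
    [0.9, 0.1],  # e.g. a recipe-like context
    [0.8, 0.2],
    [0.1, 0.9],  # e.g. a wildlife-crime context
    [0.2, 0.8],
])

# Clustering the local contexts keeps the two senses apart.
Z = linkage(pdist(local_contexts, metric="cosine"), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")

# A single GLOBAL vector averages over all contexts, so the sense
# distinction is washed out before any clustering can begin.
global_vector = local_contexts.mean(axis=0)
print(labels, global_vector)
```

Here the local contexts fall cleanly into a cooking cluster and a poaching cluster, while the global vector sits halfway between the two directions, which is why global clustering would lump all animals together.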
According to figure 3, only the rabbit is equally suitable for both activities, although, fortunately for the rabbit, its affinity to cooking is lower than the chicken's, and its affinity to poaching lower than the rhino's.</Paragraph> <Paragraph position="9"> That is, by clustering local contexts our algorithm was able to separate the different kinds of animals according to their relationship to poach. If we instead clustered global vectors, it would most likely be impossible to obtain this separation: from a global perspective all animals have most properties (context words) in common, so they would likely end up in a single cluster. Note that what we have exemplified here for animals applies to all linkage decisions made by the algorithm, i.e. all decisions must be seen from the perspective of the ambiguous word.</Paragraph> <Paragraph position="10"> This implies that the clustering may often be counterintuitive from the global perspective that we, as humans, tend to have when looking at isolated words. That is, the clusters shown in figures 2 and 3 can only be understood if the ambiguous words they are derived from are known. However, this is exactly what we want in sense induction.</Paragraph> <Paragraph position="11"> In an attempt to provide a quantitative evaluation of our results, for each of the 12 ambiguous words shown in table 1 we manually assigned the top 30 first-order associations to one of the two senses provided by Yarowsky (1995). We then looked at the first split in our hierarchical trees and assigned each of the two clusters to one of the given senses.</Paragraph> <Paragraph position="12"> In no case was there any doubt about which way round to assign the two clusters to the two given senses. Finally, we checked whether there were any misclassified items in the clusters.</Paragraph> <Paragraph position="13"> According to this judgment, on average 25.7 of the 30 items were correctly classified, and 4.3 items were misclassified. 
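This manual protocol amounts to mapping each first-split cluster to its majority sense and counting the items whose cluster maps to a different sense. A minimal sketch, with invented labels rather than the paper's actual per-word data:

```python
from collections import Counter

def first_split_accuracy(gold, pred):
    """Score a two-way first split against hand-assigned senses.

    gold: one sense label per association (invented "hand"/"tree" data);
    pred: one cluster id per association from the first split of the tree.
    Each cluster is mapped to its majority gold sense; returned is the
    fraction of items whose cluster maps to their own sense.
    """
    mapping = {
        c: Counter(g for g, p in zip(gold, pred) if p == c).most_common(1)[0][0]
        for c in set(pred)
    }
    return sum(mapping[p] == g for g, p in zip(gold, pred)) / len(gold)

# Invented example: one hand-sense item lands in the tree cluster.
gold = ["hand", "hand", "hand", "tree", "tree", "tree"]
pred = [1, 1, 2, 2, 2, 2]
print(first_split_accuracy(gold, pred))  # 5 of 6 correct
```

Averaging this score over all 12 test words, with 30 associations each, yields the overall accuracy reported below.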
These counts give an overall accuracy of 85.6%. Reasons for misclassifications include the following: some of the top 30 associations are more or less neutral towards the senses, so even for us it was not always possible to clearly assign them to one of the two senses. In other cases, outliers led to a poor first split, as would be the case if, in figure 1, the first split were located between frond and the rest of the vocabulary. In the case of sake, the beverage sense is extremely rare in the BNC and was therefore not represented among the top 30 associations, so the clustering algorithm had no chance to find the expected clusters.</Paragraph> </Section> </Paper>