<?xml version="1.0" standalone="yes"?> <Paper uid="P04-3015"> <Title>Hierarchy Extraction based on Inclusion of Appearance</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Experiment Corpus </SectionTitle> <Paragraph position="0"> A good deal of linguistic research has focused on the syntactic and semantic functions of abstract nouns (Nemoto, 1969; Takahashi, 1975; Schmid, 2000; Kanzaki et al., 2003). In the example &quot;Yagi (goat) wa seishitsu (nature) ga otonashii (gentle) (The nature of goats is gentle)&quot;, Takahashi (1975) recognized that the abstract noun &quot;seishitsu (nature)&quot; is a hypernym of the attribute that the predicative adjective &quot;otonashii (gentle)&quot; expresses. Kanzaki et al. (2003) defined abstract nouns that co-occur with adjectives in this way as adjective hypernyms, and extracted these co-occurrence relations between abstract nouns and adjectives from corpora such as newspaper articles. In the resulting linguistic data, there is a set of co-occurring adjectives for each abstract noun - the total number of abstract noun types is 365 and the number of adjective types is 10,525. Some examples are as follows.</Paragraph> <Paragraph position="1"> OMOI (feeling): ureshii (glad), kanashii (sad), shiawasena (happy), ...</Paragraph> <Paragraph position="2"> KANTEN (viewpoint): igakutekina (medical), rekishitekina (historical), ...</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Complementary Similarity Measure </SectionTitle> <Paragraph position="0"> The complementary similarity measure (CSM) was originally used in a character recognition method for binary images that is robust against heavy noise or graphical designs (Sawaki and Hagita, 1996). Yamamoto et al. (2002) applied CSM to estimate one-to-many relations between words.
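The noun-to-adjective co-occurrence data above can be turned into the binary appearance patterns that the similarity measures below operate on. A minimal Python sketch (the toy `cooccur` data is illustrative, not the paper's 365-noun corpus):

```python
# Sketch (not the authors' code): represent each abstract noun's
# co-occurring adjectives as an n-dimensional binary appearance vector.
cooccur = {  # toy subset of the noun -> adjective data from Section 2
    "OMOI":   {"ureshii", "kanashii", "shiawasena"},
    "KANTEN": {"igakutekina", "rekishitekina"},
}
adjectives = sorted(set().union(*cooccur.values()))  # fixed dimension order

def appearance_vector(noun):
    """1 in dimension i iff the noun co-occurs with adjective i."""
    adjs = cooccur[noun]
    return [1 if a in adjs else 0 for a in adjectives]

print(appearance_vector("OMOI"))
```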
They estimated one-to-many relations from the inclusion relations between the appearance patterns of two words.</Paragraph> <Paragraph position="1"> The appearance pattern is expressed as an n-dimensional binary feature vector. Now, let F = (f_1, f_2, ..., f_n) and T = (t_1, t_2, ..., t_n), with f_i, t_i in {0, 1}, be the feature vectors of the appearance patterns for a word and another word, respectively. The CSM of F to T is defined as CSM(F, T) = (ad - bc) / sqrt((a + c)(b + d)), where a = |{i : f_i = 1, t_i = 1}|, b = |{i : f_i = 1, t_i = 0}|, c = |{i : f_i = 0, t_i = 1}|, and d = |{i : f_i = 0, t_i = 0}|.</Paragraph> <Paragraph position="5"> The CSM of F to T represents the degree to which F includes T; that is, the inclusion relation between the appearance patterns of two words.</Paragraph> <Paragraph position="6"> In our experiment, each &quot;word&quot; is an abstract noun. Therefore, n is the number of adjectives in the corpus, a indicates the number of adjectives co-occurring with both abstract nouns, b and c indicate the number of adjectives co-occurring with only one of the two abstract nouns, and d indicates the number of adjectives co-occurring with neither abstract noun.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Overlap Coefficient </SectionTitle> <Paragraph position="0"> The overlap coefficient (OVLP) is a similarity measure for binary vectors (Manning and Schutze, 1999). OVLP is essentially a measure of inclusion.</Paragraph> <Paragraph position="1"> It has a value of 1.0 if every dimension with a non-zero value for the first vector is also non-zero for the second vector, or vice versa. In other words, the value is 1.0 when the first vector completely includes the second vector, or vice versa. OVLP of F and T is defined as OVLP(F, T) = a / min(a + b, a + c), with a, b, and c as above.</Paragraph> <Paragraph position="2"> The EDR electronic dictionary was developed for advanced processing of natural language by computers and is composed of eleven sub-dictionaries. The sub-dictionaries include a concept dictionary, word dictionaries, bilingual dictionaries, etc. We verify and analyse the hierarchies that are extracted based on a comparison with the EDR dictionary.
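Both inclusion measures can be computed directly from the a, b, c, d counts over two binary appearance patterns. A minimal Python sketch (not the authors' code; it assumes neither vector is all-zero, so the denominators are positive):

```python
import math

def counts(F, T):
    """Counts a, b, c, d over binary vectors F and T (Section 3)."""
    a = sum(f and t for f, t in zip(F, T))          # both 1
    b = sum(f and not t for f, t in zip(F, T))      # only F is 1
    c = sum(t and not f for f, t in zip(F, T))      # only T is 1
    d = sum(not f and not t for f, t in zip(F, T))  # both 0
    return a, b, c, d

def csm(F, T):
    """Complementary similarity measure: degree to which F includes T."""
    a, b, c, d = counts(F, T)
    return (a * d - b * c) / math.sqrt((a + c) * (b + d))

def ovlp(F, T):
    """Overlap coefficient: 1.0 when one vector fully includes the other."""
    a, b, c, d = counts(F, T)
    return a / min(a + b, a + c)

F = [1, 1, 1, 1, 0, 0]  # e.g. a broadly co-occurring abstract noun
T = [1, 1, 0, 0, 0, 0]  # its appearance pattern is included in F's
print(csm(F, T), ovlp(F, T))  # OVLP is 1.0 since F fully includes T
```

Note that CSM is asymmetric (csm(F, T) != csm(T, F) in general), which is what lets it orient a pair as hypernym/hyponym, while OVLP is symmetric.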
However, the hierarchies in EDR consist of hypernymic concepts represented by sentences. On the other hand, our extracted hierarchies consist of hypernyms such as abstract nouns. Therefore, we have to replace each concept expressed as a sentence with a sequence of words. We replace the description of concepts with entry words from the &quot;Word List by Semantic Principles&quot; (1964) and add synonyms. We also add abstract nouns in order to reduce differences in representation. In this way, conceptual hierarchies of adjectives in the EDR dictionary are defined by sequences of words.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 6 Hierarchy Extraction Process </SectionTitle> <Paragraph position="0"> The processes for hierarchy extraction from the corpus are as follows. &quot;TH&quot; is a threshold on the similarity of each pair under consideration. If TH is low, we can obtain long hierarchies. However, if TH is too low, the number of word pairs taken into consideration increases overwhelmingly and the measurement reliability diminishes. In this experiment, we set TH to 0.2.</Paragraph> <Paragraph position="1"> 1. Compute the similarity between appearance patterns for each pair of words. The hierarchical relation between the two words in a pair is determined by the similarity value. We express the pair as (X, Y), where X is a hypernym of Y and Y is a hyponym of X.</Paragraph> <Paragraph position="2"> 2. Sort the pairs by their normalized similarities and discard the pairs whose similarity is less than TH.</Paragraph> <Paragraph position="3"> 3.
For each abstract noun, A) Choose the pair (B, C) in which hypernym B has the highest value.</Paragraph> <Paragraph position="4"> The hierarchy B - C is set as the initial hierarchy.</Paragraph> <Paragraph position="5"> B) Choose a pair (C, D) where hyponym D is not contained in the current hierarchy and has the highest value among pairs in which the last word of the current hierarchy, C, is the hypernym.</Paragraph> <Paragraph position="6"> C) Connect hyponym D to the tail of the current hierarchy.</Paragraph> <Paragraph position="7"> D) While such a pair can be chosen, repeat B) and C).</Paragraph> <Paragraph position="8"> E) Choose a pair (A, B) where hypernym A is not contained in the current hierarchy and has the highest value among pairs in which the first word of the current hierarchy, B, is the hyponym.</Paragraph> <Paragraph position="9"> F) Connect hypernym A to the head of the current hierarchy.</Paragraph> <Paragraph position="10"> G) While such a pair can be chosen, repeat E) and F).</Paragraph> <Paragraph position="11"> 4. For the hierarchies that are built, A) If a short hierarchy is included in a longer hierarchy with the order of the words preserved, the short one is dropped from the list of hierarchies. B) If a hierarchy differs from another hierarchy by only one or a few words, the two hierarchies are merged.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 7 Extracted Hierarchy </SectionTitle> <Paragraph position="0"> Some extracted hierarchies are as follows.
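The greedy chaining in step 3 of Section 6 can be sketched as follows. This is one reading of the procedure, since the paper leaves some details open (e.g. exactly how the initial pair is chosen per noun); the similarity scores here are toy values, not the paper's:

```python
TH = 0.2  # similarity threshold from Section 6

# (hypernym, hyponym) -> similarity score; toy values for illustration
pairs = {
    ("koto", "joutai"): 0.9, ("joutai", "kankei"): 0.8,
    ("kankei", "kakawari"): 0.6, ("koto", "kankei"): 0.5,
    ("kankei", "tsukiai"): 0.1,  # discarded below: score < TH
}
pairs = {p: s for p, s in pairs.items() if s >= TH}  # step 2

def build_hierarchy(noun):
    # step 3A: best-scoring pair in which `noun` is the hyponym
    cands = [(s, hyper) for (hyper, hypo), s in pairs.items() if hypo == noun]
    if not cands:
        return [noun]
    chain = [max(cands)[1], noun]
    # steps 3B-D: repeatedly append the best unused hyponym of the tail
    while True:
        cands = [(s, hypo) for (hyper, hypo), s in pairs.items()
                 if hyper == chain[-1] and hypo not in chain]
        if not cands:
            break
        chain.append(max(cands)[1])
    # steps 3E-G: repeatedly prepend the best unused hypernym of the head
    while True:
        cands = [(s, hyper) for (hyper, hypo), s in pairs.items()
                 if hypo == chain[0] and hyper not in chain]
        if not cands:
            break
        chain.insert(0, max(cands)[1])
    return chain

print(build_hierarchy("kankei"))
```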
In our experiment, we get koto (matter) as the common hypernym.</Paragraph> <Paragraph position="1"> koto (matter) -- joutai (state) -- kankei (relation) -- kakawari (something to do with) -- tsukiai (have an acquaintance with) koto (matter) -- toki (when) -- yousu (aspect) -- omomochi (one's face) -- manazashi (a look) -- iro (on one's face) -- shisen (one's eye)</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> 8 Comparison </SectionTitle> <Paragraph position="0"> We analyse the extracted hierarchies by using the number of nodes that agree with the EDR hierarchy. Specifically, we count the number of nodes (nouns) that agree with a word in the EDR hierarchy, preserving the order of each hierarchy. For example, if two hierarchies are &quot;A - B - C - D - E&quot; and &quot;A - B - D - F - G,&quot; they have three agreement nodes: &quot;A - B - D.&quot; Table 1 shows the distribution of the depths of the CSM hierarchies, and the number of nodes that agree with the EDR hierarchy at each depth. Table 2 shows the same for the OVLP hierarchies. &quot;Agreement Level&quot; is the number of agreement nodes. The bold font represents the number of hierarchies completely included in the EDR hierarchy.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 8.1 Depth of Hierarchy </SectionTitle> <Paragraph position="0"> The number of hierarchies made from the EDR dictionary (EDR hierarchies) is 932 and the deepest level is 14. The number of CSM hierarchies is 105 and their depths range from 3 to 14 (Table 1). The number of OVLP hierarchies is 179 and their depths range from 2 to 9 (Table 2). These results show that CSM builds deeper hierarchies than OVLP, though it produces fewer hierarchies.
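The order-preserving agreement count described in Section 8 is the length of a longest common subsequence of the two node sequences; a short Python sketch (the algorithm name is our gloss, not the paper's):

```python
def agreement_nodes(h1, h2):
    """Longest-common-subsequence length: nodes shared by both
    hierarchies with their relative order preserved."""
    dp = [[0] * (len(h2) + 1) for _ in range(len(h1) + 1)]
    for i, x in enumerate(h1):
        for j, y in enumerate(h2):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if x == y
                                else max(dp[i][j + 1], dp[i + 1][j]))
    return dp[-1][-1]

# the Section 8 example: agreement level 3 ("A - B - D")
print(agreement_nodes(list("ABCDE"), list("ABDFG")))  # -> 3
```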
Also, the deepest level of CSM equals that of EDR.</Paragraph> <Paragraph position="1"> Therefore, comparison with the EDR dictionary is an appropriate way to verify the hierarchies that we have extracted.</Paragraph> <Paragraph position="2"> In both tables, we find most hierarchies have an agreement level from 2 to 4. The deepest agreement level is 6. For an agreement level of 5 or better, the OVLP hierarchies include only two hierarchies while the CSM hierarchies include nine. This means CSM can extract hierarchies having more nodes that agree with the EDR hierarchy than is possible with OVLP.</Paragraph> <Paragraph position="3"> Also, many abstract nouns agree with the hyperonymic concept around the top level. In current thesauri, words are categorized in a top-down manner based on human intuition.</Paragraph> <Paragraph position="4"> Therefore, we believe the hierarchy that we have built is consistent with human intuition, at least around the top level of hyperonymic concepts.</Paragraph> </Section> </Section> </Paper>