<?xml version="1.0" standalone="yes"?> <Paper uid="W98-0715"> <Title>Semi-automatic Induction of Systematic Polysemy from WordNet</Title> <Section position="3" start_page="108" end_page="109" type="metho"> <SectionTitle> 2 Systematic Polysemy </SectionTitle> <Paragraph position="0"> Before presenting the induction method, we first clarify what we consider systematic polysemy in the work described in this paper, and explain the assumptions we made for such polysemy.</Paragraph> <Paragraph position="1"> Our systematic polysemy is analogous to logical polysemy in (Pustejovsky, 1995): word senses in which there is no change in lexical category, and the multiple senses of the word have overlapping, dependent, or shared meanings. This definition excludes meanings obtained by cross-categorical alternations (e.g. denominals) or morphological alternations (e.g.</Paragraph> <Paragraph position="2"> suffixing with -ify), as well as homonyms and metaphors, and includes only the senses of the word of the same category and form that have some systematic relations.</Paragraph> <Paragraph position="3"> For example, the INSTITUTION and BUILDING meanings of the word school are systematically polysemous because BUILDING relates to INSTITUTION as the location of the institution.</Paragraph> <Paragraph position="4"> For nouns, each polysemous sense often refers to a different object. In the above example, school as INSTITUTION refers to an organization, whereas school as BUILDING refers to a physical object.</Paragraph> <Paragraph position="5"> On the other hand, for verbs, polysemous senses refer to different aspects of the same action. For example, the word write in the sentence &quot;John wrote the book&quot; is ambiguous between the CREATION (of the book) and COMMUNICATION (through the content of the book) meanings. But they both describe the same action of John writing the particular book. 
Here, these two meanings are systematically related by referring to the causation aspect (CREATION) or the purpose aspect (COMMUNICATION) of the write action. This view is largely consistent with the entailment relations (temporal inclusion and causation) used to organize WordNet verb taxonomies (Fellbaum, 1990).</Paragraph> <Paragraph position="6"> Another assumption we made is the dependency between related senses. In the work in this paper, sense dependency is viewed as sense extension, similar to (Copestake and Briscoe, 1995), in which a primary sense causes the existence of secondary senses. This assumption is in accord with lexical rules (Copestake and Briscoe, 1995; Ostler and Atkins, 1992), where meaning extension is expressed by if-then implication rules. In the above example of the noun school, the INSTITUTION meaning is considered the primary and BUILDING the secondary, since institutions are likely to have office space but a building may be occupied by other entities besides institutions. Similarly for the verb write, CREATION is considered the primary and COMMUNICATION the secondary, since communication takes place through the object that is just produced, but communication can take place without producing an object.</Paragraph> </Section> <Section position="4" start_page="109" end_page="112" type="metho"> <SectionTitle> 3 Induction Method </SectionTitle> <Paragraph position="0"> Our induction method is semi-automatic, requiring a manual filtering step between the phased automatic processing. The basic scheme of our method is to first identify the prominent pair-wise cooccurrence between any two basic types (abstract senses), and then build more complex types (underspecified classes) by the composition of those cooccurrences. 
But instead of generating/composing all possible types statically, we only maintain the pair-wise relations in a graph representation called the type dependency graph, and dynamically form/induce the underspecified classes during the phase in which each WordNet entry is assigned its class label(s).</Paragraph> <Paragraph position="2"> Based on the definitions and assumptions described in section 2, underspecified semantic classes are induced from WordNet 1.6 (released December 1997) by the following steps: 1. Select a set of abstract (coarse-grained) senses from WordNet taxonomies as basic semantic types. This step is done manually, to determine the right level of abstraction to capture systematic polysemy.</Paragraph> <Paragraph position="3"> 2. Create a type dependency graph from ambiguous words in WordNet. This step is done by two phased analyses: an automatic analysis followed by a manual filtering.</Paragraph> <Paragraph position="4"> 3. Generate a set of underspecified semantic classes by partitioning the senses of each word into a set of basic types. Each set becomes an underspecified semantic class. This step is fully automatic.</Paragraph> <Paragraph position="5"> Each step is described in detail below.</Paragraph> <Section position="1" start_page="109" end_page="110" type="sub_section"> <SectionTitle> 3.1 Coarse-grained Basic Types </SectionTitle> <Paragraph position="0"> As has been pointed out previously, there are many regularities between polysemous senses, and these regularities seem to hold across words. For example, words such as chicken and duck which have an ANIMAL sense often have a MEAT meaning as well (i.e., the animal-grinding lexical rule (Copestake and Briscoe, 1992)). This generalization holds at an abstract level rather than at the word sense level. Therefore, the first step in the induction is to select a set of abstract senses that are useful in capturing the systematicity. 
To this end, WordNet is a good resource because word senses (or synsets) are organized in taxonomies.</Paragraph> <Paragraph position="1"> Ideally, basic types should be semantically orthogonal, to function essentially as the &quot;axes&quot; in a high-dimensional semantic space. Good candidates would be the top abstract nodes in the WordNet taxonomies or the lexicographers' file names listed in the sense entries. However, both of them fall short of forming a set of orthogonal axes, for several reasons. First, domain categories are mixed in with ontological categories (e.g. the competition and body verb categories). Second, some categories are ontologically more general than others (e.g. the change category in verbs). Third, particularly for the verbs, senses that seem to take different argument noun types are found under the same category (e.g. &quot;ingest&quot; and &quot;use&quot; in the consumption category). Therefore, some WordNet categories are broken into more specific types.</Paragraph> <Paragraph position="2"> For the verbs, the following 18 abstract basic types are selected.</Paragraph> <Paragraph position="4"> These are mostly taken from the classifications made by lexicographers.</Paragraph> <Paragraph position="5"> Two classes (&quot;consumption&quot; and &quot;creation&quot;) are subdivided into finer categories (ingestion/use and physical/mental/verbal_creation, respectively) according to the different predicate-argument structures they take.</Paragraph> <Paragraph position="6"> For the nouns, 31 basic types are selected from WordNet top categories (unique beginners): 2</Paragraph> <Paragraph position="8"> Senses under the lexicographers' class &quot;group&quot; are redirected to other classes, assuming a collection of a type has the same basic semantic properties as the individual type.</Paragraph> </Section> <Section position="2" start_page="110" end_page="111" type="sub_section"> <SectionTitle> 3.2 Type Dependency Graph </SectionTitle> <Paragraph 
position="0"> After the basic types are selected, the next step is to create a type dependency graph: a directed graph in which nodes represent the basic types, and directed edges correspond to the systematic relations between two basic types.</Paragraph> <Paragraph position="1"> The type dependency graph is constructed by an automatic statistical analysis followed by a manual filtering process, as described below. The premise here is that, if there is a systematic relation between two types, and if the regularity is prominent, it can be captured by the type cooccurrence statistics. In machine learning, several statistical techniques have been developed which discover dependencies among features (or causal structures), such as Bayesian network learning (e.g. Spirtes et al., 1993). 2 Noun top categories in WordNet do not match exactly with the lexicographers' file names; in our experiment, noun types are determined by actually traversing the hierarchies, and therefore they correspond to the top categories.</Paragraph> <Paragraph position="3"> Those techniques use sophisticated methods that take multiple antecedents/causations into consideration, and build a complex and precise model with probabilities associated with the edges. In our present work, however, WordNet is compiled from human lexicographers' entries, and thus the data has a fair amount of arbitrariness (i.e., noisy data). 
Therefore, we chose a simple technique which yields a simpler network, and used the result as a rough approximation of the type dependencies, to be corrected manually in the next phase.</Paragraph> <Paragraph position="4"> The advantage of this automatic analysis is twofold: not only does it discover/reveal the semantic type associations with respect to the basic types selected in the previous step, it also helps the manual filtering become more informed and consistent than judging by mere intuition, since the result is based on the actual content of WordNet.</Paragraph> <Paragraph position="5"> The type dependency graph is constructed in the following way. First, for all type pairs extracted from the ambiguous words in WordNet, mutual information is computed to obtain the association, using the standard formula: for types t1 and t2, the mutual information is</Paragraph> <Paragraph position="7"> MI(t1, t2) = log2 ( (f(t1, t2) * N) / (f(t1) * f(t2)) ), where f(t) is the number of occurrences of the type t, f(t1, t2) is the number of their cooccurrences, and N is the size of the data. The association between two types is considered prominent when the mutual information value is greater than some threshold (in our current implementation, it is 0).</Paragraph> <Paragraph position="8"> At this point, type associations are undirected because mutual information is symmetric (i.e., commutative). Then, these associations are manually inspected to create a directed type dependency graph in the next phase. The manual filtering does two things: it filters out the spurious relations (i.e., false positives) and adds back the missing ones (i.e., false negatives), and it determines the direction of the correct associations. Detected false positives are mostly homonyms (including metaphors) (e.g.</Paragraph> <Paragraph position="9"> the WEA-EMO (weather and emotion) verb type pair for words such as ignite). False negatives are mostly ones that we know exist but were not significant according to the cooccurrence statistics (e.g. ANI-FOOD in nouns). 
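As an illustration, the pair-wise statistical analysis above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the function name `prominent_type_pairs`, the input format (a mapping from each ambiguous word to the basic types of its senses), and the choice of N as the number of words are all hypothetical, not taken from the paper.

```python
from collections import Counter
from itertools import combinations
from math import log2

def prominent_type_pairs(words):
    """Given {word: [basic types of its senses]}, return the type pairs
    whose pointwise mutual information exceeds the threshold 0, following
    the formula MI(t1, t2) = log2(f(t1, t2) * N / (f(t1) * f(t2)))."""
    pair_count = Counter()   # f(t1, t2): words exhibiting both types
    type_count = Counter()   # f(t):      words exhibiting type t
    for types in words.values():
        uniq = sorted(set(types))
        type_count.update(uniq)
        pair_count.update(combinations(uniq, 2))
    n = len(words)  # N, the size of the data (an assumption: word count)
    prominent = {}
    for (t1, t2), f12 in pair_count.items():
        mi = log2(f12 * n / (type_count[t1] * type_count[t2]))
        if mi > 0:  # the threshold used in the current implementation
            prominent[(t1, t2)] = mi
    return prominent
```

For example, with `{"chicken": ["ANI", "FOOD"], "duck": ["ANI", "FOOD"], ...}` the pair `("ANI", "FOOD")` comes out with positive mutual information, mirroring the animal-grinding regularity discussed in section 3.1.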
As a heuristic to detect the false negatives, we used the cross-categorical inheritance in the taxonomies, in which the category switches as the hierarchy is traversed upward.</Paragraph> <Paragraph position="10"> The direction of the associations is determined by the sense extension described in section 2. In addition, we used the ontological generality of the basic types as another criterion. This is because a transitive inference through an ontologically general type may result in a relation in which unrelated (specific) types are combined, particularly when the specific types are domain categories. For instance, the verb category CHA (change) is ontologically general, and may occur with specific types in an entailment relation. But a transitive inference made through this general type does not necessarily guarantee the systematicity between the associated specific types. In order to prevent this kind of implausible inference, we restricted the direction of a systematic relation to be from the specific type to the general type, if one of the member types is the generalization of the other. Note that for some associations which involve ontologically equally general/specific types (such as COG (cognition) and COMM (communication)), the direction was considered bidirectional (unless sense extension strongly suggests the dependency). A part of the type dependency graph for WordNet verbs is shown in Figure 1.</Paragraph> </Section> <Section position="3" start_page="111" end_page="111" type="sub_section"> <SectionTitle> 3.3 Underspecified Semantic Classes </SectionTitle> <Paragraph position="0"> Underspecified semantic classes are automatically formed by partitioning the ambiguous senses of each word according to the type dependency graph.</Paragraph> <Paragraph position="1"> Using the type dependency graph, all words in the WordNet verb and noun categories are assigned one or more type partitions. 
A partition is an ordered set of basic types (abstracted from the fine-grained word senses in the first step), keyed by the primary type encompassing the secondary types. From the list of frequency-ordered senses of a WordNet word, a partition is created by taking one of the three most frequent types (listed as the first three senses in the WordNet entry) as the primary and collecting the secondary types from the remaining list according to the type dependency graph. 3 Here, the secondary types are taken only from the nodes/types that are directly connected to the primary type. That is because we assumed that if an indirect transitive dependency of t1 on t3 through t2 is strong enough, it will be captured as a direct dependency. This scheme also ensures the existence of a core concept in every partition (and is thus more plausible than a transitive composition). This procedure is applied recursively if the sense list of a word is not covered by one partition (note that in this case, the word is a homonym). 3 The reason we look at the first three senses is that primary types are not always listed as the most frequent sense in the WordNet sense lists (or in actual usage, for that matter). We chose the first three senses because the average degree of polysemy is around 3 for WordNet (version 1.6) verbs and nouns.</Paragraph> <Paragraph position="3"/> </Section> <Section position="4" start_page="111" end_page="112" type="sub_section"> <SectionTitle> Verb Class Verbs </SectionTitle> <Paragraph position="0"> CONT-CHA: blend, crush, enclose, fasten, fold, puncture, tie, weld. CONT-MOT: beat, chop, fumble, jerk, kick, press, spread, whip. CONT-POSS: pluck, release, seize, sponge. CONT-MOT-CHA: dip, gather, mount, take_out. CONT-MOT-POSS: carry, cover, fling, toss. 
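The partitioning procedure just described can be sketched as follows. This is a hypothetical reading of the recursive scheme, under stated assumptions: the type dependency graph is given as an adjacency mapping from a primary type to its directly connected types, and the function name `form_classes` is illustrative, not the paper's implementation.

```python
def form_classes(sense_types, graph):
    """Partition a word's frequency-ordered basic types into underspecified
    classes: try each of the first three types as the primary, keep the
    candidate partition covering the most types via direct edges in the
    type dependency graph, and recurse on any leftover types (homonymy)."""
    types = list(dict.fromkeys(sense_types))  # de-duplicate, keep order
    classes = []
    while types:
        best = None
        for primary in types[:3]:  # primary must be among the first three senses
            covered = [t for t in types
                       if t == primary or t in graph.get(primary, set())]
            if best is None or len(covered) > len(best):
                best = covered
        classes.append(best)
        types = [t for t in types if t not in best]
    return classes

# The paper's example: write has senses (VCR COMM PCR CHA), and PCR is
# directly connected to the other three types in the dependency graph,
# so a single class covering all four types is formed.
graph = {"PCR": {"VCR", "COMM", "CHA"}}
print(form_classes(["VCR", "COMM", "PCR", "CHA"], graph))
# → [['VCR', 'COMM', 'PCR', 'CHA']]
```

If no single partition covers the sense list, the leftover types form further partitions, matching the paper's treatment of homonyms.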
As an example, for the verb write, whose sense list is (VCR COMM PCR CHA), 4 the first 3 types VCR, COMM, and PCR are picked in turn as the primary type to see whether a partition can be created that encompasses all other member types. In this case, a partition keyed by PCR can cover all member types (see the type dependency graph in Figure 1), thus a class VCR-COMM-PCR-CHA is created. The systematic relation of this class would be &quot;a change or creation action which involves words (and results in some object), performed for a communication purpose (through the object)&quot;.</Paragraph> <Paragraph position="1"> For the verbs and nouns in WordNet 1.6, 136 underspecified verb classes and 325 underspecified noun classes are formed. Some verbs of the classes involving contact (CONT) are shown in Table 1.</Paragraph> <Paragraph position="2"> We can observe from the words assigned to each class that the member types are indeed systematically related. For example, the CONT-MOT class represents an action which involves physical contact resulting from motion (MOT). Words assigned to this class do seem to have a motion flavor. On the other hand, the CONT-POSS class represents a transfer of possession (POSS) which involves physical contact. Again, words in this class do seem to be used in contexts in which possession of something is changed. For the more polysemous class CONT-MOT-POSS, words in this class, for instance toss, do seem to cover all three member types.</Paragraph> <Paragraph position="3"> By using the underspecified classes, the degree of ambiguity in WordNet has substantially decreased.</Paragraph> <Paragraph position="4"> Table 2 shows a summary of our results (indicated by Und) compared to the original WordNet statistics. There, the advantage of our underspecified classes for reducing ambiguity seems very effective for polysemous verbs (from 3.57 to 2.39, a 33% decrease). 
This is an encouraging result because many familiar (frequently used) verbs are polysemous in actual usage.</Paragraph> </Section> </Section> <Section position="5" start_page="112" end_page="112" type="metho"> <SectionTitle> 4 Application </SectionTitle> <Paragraph position="0"> To observe how the induced underspecified classes facilitate abductive inference in the contextual understanding of real-world texts, predicate-argument structures were extracted from the Brown corpus. 5 Table 3 shows some examples of the extracted verb-object relations involving the verb class VCR (verbal_creation). Abductive inference facilitated by underspecified classes is most significant when both the predicate and the argument are systematically polysemous.</Paragraph> <Paragraph position="1"> We call this multi-facet matching. 6 As an example, the verb write (VCR-COMM-PCR-CHA) takes the object noun paper (AFT-COMM) in a sentence in the Brown corpus: In 1948, Afranio Do Amaral, the noted Brazilian herpetologist, wrote a technical paper on the giant snakes.</Paragraph> <Paragraph position="2"> In this sentence, by matching the two systematically polysemous words write and paper, multiple interpretations are simultaneously possible. The most preferred reading, according to the hand-tagged corpus WNSEMCOR, would be the match between VCR of the verb (sense # 3 of write - to have something published, as shown in section 1) and COMM of the noun (sense # 2 of paper - an essay), giving rise to the reading &quot;to publish an essay&quot;. However, in this context other readings are possible as well. For instance, the match between verb VCR and noun AFT (a printed medium), which gives rise to the reading &quot;to have a written material printed for publishing&quot;. 
Or another reading is possible from the match between verb COMM (sense # 2 of write - to communicate (thoughts) by writing) and noun AFT, which gives rise to the reading &quot;to communicate through a printed medium&quot;. This reading implies the purpose and entailment of the write action (as COMM): a paper was written to communicate some thoughts, and those thoughts were very likely understood by the readers. 5 Predicate-argument structures (verb-object and subject-verb relations in this experiment) are extracted by syntactic pattern matching, similar to the cascaded finite-state processing used in FASTUS (Hobbs et al., 1997). In the preliminary performance analysis, recall was around 50% and precision was around 80%.</Paragraph> <Paragraph position="3"> 6 By taking the first sense for both predicate verb and argument noun, 78% of the verb-object relations and 66% of the subject-verb relations were systematically polysemous for at least one constituent.</Paragraph> <Paragraph position="4"> dramatize: comment (COMM-ACT), fact (COG-COMM-STA), scene (LOC). write: article (AFT-COMM-ART-REL), book (AFT-COMM), description (COMM-ACT-COG), fiction (COMM), letter (COMM-ACT), paper (AFT-COMM), song (AFT-ACT-COMM).</Paragraph> <Paragraph position="5"> Also from those readings, we can infer that the paper is an artifact, that is, a physical object rather than an intangible mental object such as &quot;idea&quot;. Those secondary readings can be used later in the discourse to make further inferences on the write action, and to resolve references to the paper either via the content of the paper (i.e., an essay) or via the physical object itself (i.e., a printed artifact). 
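To make the multi-facet matching concrete, the write/paper example above can be sketched as an enumeration over the two classes' member types. The selection table `compatible` and the function name are illustrative assumptions; the paper does not give an explicit algorithm for enumerating the simultaneous readings.

```python
def multi_facet_matches(verb_types, noun_types, compatible):
    """Enumerate the simultaneous readings licensed when a systematically
    polysemous verb takes a systematically polysemous noun argument.
    `compatible` maps a verb basic type to the noun basic types it can
    select for (a hypothetical table, not from the paper)."""
    return [(v, n) for v in verb_types for n in noun_types
            if n in compatible.get(v, set())]

# write (VCR-COMM-PCR-CHA) taking the object paper (AFT-COMM)
compatible = {
    "VCR": {"COMM", "AFT"},   # publish an essay / a printed artifact
    "COMM": {"AFT"},          # communicate through a printed medium
}
readings = multi_facet_matches(
    ["VCR", "COMM", "PCR", "CHA"], ["AFT", "COMM"], compatible)
# → [("VCR", "AFT"), ("VCR", "COMM"), ("COMM", "AFT")]
```

Each pair corresponds to one of the co-present readings discussed above, which later discourse can draw on without committing to a single disambiguation.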
One interesting observation on multi-facet matching is the polysemous degrees of the matched classes.</Paragraph> <Paragraph position="6"> Table 4 shows the predicate verbs of different systematically polysemous classes and the average polysemous degree of argument nouns observed in verb-object and subject-verb relations. 7 The result indicates that, as the verb becomes more polysemous, the polysemous degree of the argument stays about the same for both subject and object nouns. This suggests a complex multi-facet matching between verb and noun basic types, since the polysemous degree of nouns does not monotonically increase.</Paragraph> </Section> <Section position="6" start_page="112" end_page="113" type="metho"> <SectionTitle> 5 Discussion </SectionTitle> <Paragraph position="0"> The induction method described above should be considered an initial attempt at automatically acquiring systematic polysemy from a broad-coverage lexical resource. The task is essentially to map our semantic/ontological knowledge about the systematicity of word meanings onto some computational terms for a given lexical resource. In our present work, we mapped the systematicity to the cooccurrence of word senses. 7 The predicate-argument structures in this table represent the ones in which both verb and noun entries are found in WordNet. The total numbers of structures extracted from the Brown corpus were 47287 for verb-object and 39266 for subject-verb. Discrepancies were mostly due to proper nouns and pronouns, which WordNet does not encode.</Paragraph> <Paragraph position="1"> But the mapping by computational/automatic means (mutual information) alone was not possible: manual filtering was further needed to enhance the mapping.</Paragraph> <Paragraph position="2"> Also, there was a difficulty with the type dependency graph. In the current scheme, systematicity among polysemous senses is represented by binary relations between a primary and a secondary sense in the graph. 
A partition, and eventually an underspecified class, is formed by taking all the secondary senses from the primary sense listed in each WordNet entry. The difficulty is that some combinations do not seem correct collectively. For example, a class PER-COG-CONT consists of two binary relations: PER-COG (to reason about what is perceived, e.g. detect), and PER-CONT (to perceive through physical contact, e.g. hide). Although each one correctly represents a systematic relation, PER-COG-CONT does not seem correct as a collection. In the WordNet entries, the verb bury is assigned to this class PER-COG-CONT. Here, the CONT sense seems to select for a physical object (as in &quot;they buried the stolen goods&quot;), whereas the COG sense (to dismiss from the mind) seems to select for a mental, non-physical object. Therefore the construction of type partitions needs more careful consideration. Also, the applicability of the induced classes must be evaluated in further analysis.</Paragraph> </Section> </Paper>