<?xml version="1.0" standalone="yes"?> <Paper uid="C00-2094"> <Title>Using a Probabilistic Class-Based Lexicon for Lexical Ambiguity Resolution</Title> <Section position="2" start_page="0" end_page="649" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Disambiguation of lexical ambiguities in naturally occurring free text is considered a hard task for computational linguistics. For instance, word sense disambiguation is concerned with the problem of assigning sense labels to occurrences of an ambiguous word. Resolving such ambiguities is useful in constraining semantic interpretation. A related task is target-word disambiguation in machine translation. Here a decision has to be made as to which of a set of alternative target-language words is the most appropriate translation of a source-language word. A solution to this disambiguation problem is directly applicable in a machine translation system which is able to propose the translation alternatives. A further problem is the resolution of attachment ambiguities in syntactic parsing.</Paragraph> <Paragraph position="1"> Here the decision of verb versus argument attachment of noun phrases, or the choice of verb phrase versus noun phrase attachment of prepositional phrases, can build upon a resolution of the related lexical ambiguities.</Paragraph> <Paragraph position="2"> Statistical approaches have been applied successfully to these problems. The great advantage of statistical methods over symbolic-linguistic methods has been deemed to be their effective exploitation of minimal linguistic knowledge. However, the best performing statistical approaches to lexical ambiguity resolution themselves rely on complex information sources such as &quot;lemmas, inflected forms, parts of speech and arbitrary word classes [...] local and distant collocations, trigram sequences, and predicate argument association&quot; (Yarowsky (1995), p.
190) or large context-windows of up to 1000 neighboring words (Schütze, 1992). Unfortunately, in many applications such information is not readily available. For instance, in incremental machine translation, it may be desirable to decide on the most probable translation of the arguments of a verb with only the translation of the verb as information source but no large window of surrounding translations available. In parsing, the attachment of a nominal head may have to be resolved with only information about the semantic roles of the verb but no other predicate argument associations at hand.</Paragraph> <Paragraph position="3"> The aim of this paper is to use only minimal, yet precise, information for lexical ambiguity resolution. We will show that good results are obtainable by employing a simple and natural look-up in a probabilistic class-labeled lexicon for disambiguation. The lexicon provides a probability distribution on semantic selection-classes labeling the slots of verbal subcategorization frames. Induction of distributions on frames and class-labels is accomplished in an unsupervised manner by applying the EM algorithm. Disambiguation then is done by a simple look-up in the probabilistic lexicon. We restrict our attention to a definition of senses as alternative translations of source-words. Our approach provides a very natural solution for such a target-language disambiguation task: look for the most frequent target-noun whose semantics fits best with the</Paragraph> <Paragraph position="5"> p_LC(n|19) distribution and their probabilities, and at left are the 30 most probable verbs in the p_LC(v|19) distribution. 19 is the class index. Those verb-noun pairs which were seen in the training data appear with a dot in the class matrix. Verbs with suffix .as : s indicate the subject slot of an active intransitive.
Similarly .aso : s denotes the subject slot of an active transitive, and .aso : o denotes the object slot of an active transitive.</Paragraph> <Paragraph position="6"> semantics required by the target-verb. We evaluated this simple method on a large number of real-world translations and obtained results comparable to related approaches such as that of Dagan and Itai (1994), where much more selectional information is used.</Paragraph> </Section> </Paper>