File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/90/c90-2042_metho.xml
Size: 11,220 bytes
Last Modified: 2025-10-06 14:12:26
<?xml version="1.0" standalone="yes"?> <Paper uid="C90-2042"> <Title>SOLVING AMBIGUITIES IN THE SEMANTIC REPRESENTATION OF TEXTS</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> SOLVING AMBIGUITIES IN THE SEMANTIC REPRESENTATION OF TEXTS Marie-Claude Landau IBM France Paris Scientific Center 3-5 Place Vendôme 75021 PARIS cedex 01 FRANCE Abstract </SectionTitle> <Paragraph position="0"> One of the issues of Artificial Intelligence is the transfer of the knowledge conveyed by Natural Language into formalisms that a computer can interpret. In the Natural Language Processing department of the IBM France Paris Scientific Center, we are developing and evaluating a system prototype whose purpose is to build a semantic representation of written French texts in a rigorous formal model (the Conceptual Graph model, introduced by J.F. Sowa [10]).</Paragraph> <Paragraph position="1"> The semantic representation of texts may then be used in various applications, such as intelligent information retrieval. The accuracy of the semantic representation is therefore crucial in order to obtain valid results in any subsequent applications. In this article we explain how ambiguities related to Natural Language may be solved by semantic analysis using the Conceptual Graph model.</Paragraph> <Paragraph position="2"> Key words: Natural Language Understanding, Computational Linguistics, Conceptual Graph Model.
almost completely solved by the syntactic analyzer.</Paragraph> <Paragraph position="3"> * Structural ambiguities, a consequence of the multiple possible attachments of the syntactic components in a sentence.</Paragraph> <Paragraph position="4"> This kind of ambiguity may be solved to a large extent by the semantic analyzer.</Paragraph> <Paragraph position="5"> * Anaphoric ambiguities, which could be solved in part by syntactic analysis within a sentence [3], but cannot be solved across different sentences because a syntactic analyzer processes each sentence independently. In our system, the resolution of anaphoric ambiguities is done exclusively by the semantic analyzer.</Paragraph> <Paragraph position="6"> * Ellipses, which could also be solved in part by syntactic analysis. An incomplete syntactic analysis may in some cases be complemented by the semantic analysis.</Paragraph> <Paragraph position="7"> * Semantic ambiguities coming from polysemous lemmas, which can only be solved at the semantic level (unless a polysemy leads to different syntactic constructions).</Paragraph> <Paragraph position="8"> In this article, we concentrate especially on the practical solving of the different kinds of ambiguities, showing that these problems are inter-related and may be solved by unified methods.</Paragraph> <Paragraph position="9"> Introduction In the system prototype we have been developing, the analysis of a text is carried out in two steps: first syntactic and then semantic [1].</Paragraph> <Paragraph position="10"> We assume that the syntax of a text conveys some meaning, but since our syntactic analyzer does not take semantics into account, a lot of ambiguities remain: * Lexical ambiguities, coming from the fact that the same word may correspond to several lemmas in the syntactic dictionary.
This kind of ambiguity can be The Conceptual Graph model The Conceptual Graph model is a very promising unified model, because it generalizes many ideas contained in preceding works on natural language semantics, such as Fillmore [7], Schank [9], Montague [5], Wilks [12], and Kamp [8], for example. For the sake of clarity, we briefly recall here the Conceptual Graph model introduced by J.F. Sowa [10]. A Conceptual Graph is an oriented graph made up of concept nodes related by conceptual relation edges. The concepts are represented by boxes, the relations by circles. Example:</Paragraph> <Paragraph position="12"> The concepts may have referents which specialize them. A referent can be a constant ('Sue') to denote individuals, a variable to denote cross-references, or more complex expressions. Most of the relations are binary relations (OBJ), some are unary. The concepts are organized in a concept type lattice with a partial ordering relation. The top concept type is ENTITY. Example:</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> ENTITY ECONOMIC_ENTITY MEASURE_UNIT INTEREST_RATE CURRENCY TIME_UNIT DOLLAR FRANC MONTH </SectionTitle> <Paragraph position="0"> Conceptual Graphs may be combined together using various algorithms, the most important of which are the projection and the join algorithms. They are pattern matching algorithms which take the concept type hierarchy into account.</Paragraph> <Paragraph position="1"> The projection of one Conceptual Graph into another one is a restriction of the first graph to a sub-graph of the second one.
The projection also gives the pending edges of the second Conceptual Graph in relation to the result.</Paragraph> <Paragraph position="2"> The join of two Conceptual Graphs forms a common overlap, while keeping the most specialized concept types in the result, and attaches to the common overlap the pending edges remaining in the two graphs.</Paragraph> <Paragraph position="3"> [Figure: example graphs G1, G2 and G3 (concepts include GIRL: 'Sue', DRIVE, FAST and CAR, with a referent 'John'), the result of the projection of G1 into G2, and the result of the join of G1 and G3.] The semantic analyzer: general method The semantic analyzer produces one or more Conceptual Graphs for each sentence, including cross-references within a sentence or between different sentences.</Paragraph> <Paragraph position="4"> Our semantic analyzer is written in the VM/Programming in Logic (VM/PROLOG) programming language [11]. The semantic analyzer takes as input the annotated syntactic tree(s) resulting from the syntactic analysis. Applying compositionality rules, it links together the Conceptual Graphs corresponding to each word or locution of the sentence, according to the indications given by the syntactic tree(s).</Paragraph> <Paragraph position="5"> The Conceptual Graphs for each word or locution are retrieved from a semantic lexicon. The words of the Natural Language may be coded in a semantic lexicon general to Natural Language and/or in a semantic lexicon specific to an application. In our project, we have concentrated on developing specialized semantic lexicons, in order to get fast results on texts dealing with a specific subject (economics, pharmacology). In cases of polysemy there may be several entries (hence several Conceptual Graphs) for one word in the semantic lexicon. If, however, a word is missing in the semantic lexicon, default options are taken.</Paragraph> <Paragraph position="6"> The directed join algorithm as a disambiguation tool The Conceptual Graphs for words are linked by an algorithm that we call the directed join.
In fact, the directed join is a deterministic version of the join algorithm described by J.F. Sowa: we force a designated concept box in the first graph to be mapped onto a designated concept box in the second graph, by use of attachment point labels which lie inside the concept boxes. The join may then be propagated along the edges related to those initial concept boxes. Semantic constraints on the concept types, contained in the concept type lattice, make it possible to rule out invalid polysemous combinations, and in some cases to discard non-pertinent syntactic analyses.</Paragraph> <Paragraph position="7"> In addition, we have implemented a directed join management algorithm which allows the &quot;best&quot; possible solution to be chosen. Indeed, when two semantic structures must be linked together, all the conceptual choices (corresponding to the different entries for each word in the semantic lexicon) are simultaneously taken into account by the directed join management algorithm, which only keeps the solutions leading to a maximum overlap between the two sets of Conceptual Graphs (according to the link constraints).</Paragraph> <Paragraph position="8"> For example, suppose we have the following coding for the verb &quot;passer&quot; (&quot;to go from ... to&quot;) in the semantic lexicon:</Paragraph> <Paragraph position="10"> For the sentence &quot;le dollar est passé de 6 francs à 5 francs&quot; (&quot;The dollar went down from 6 francs to 5 francs&quot;), the directed join algorithm will only give solution 2, automatically discarding solution 1.</Paragraph> <Paragraph position="11"> [Figure: the graphs for SOLUTION 1 and SOLUTION 2.] Therefore, the final result is usually not the combinatorial product of all the entries of polysemous words in the semantic lexicon. We thus see that the directed join algorithm is a powerful tool which can help disambiguate polysemy.
It also helps fill in the gaps of incomplete syntactic information, as well as solve anaphors, as we shall explain below.</Paragraph> <Paragraph position="12"> Processing of incomplete syntactic information We prefer to speak here of incomplete syntactic information rather than of ellipses, in that the solving of true ellipses has not yet been done in our system.</Paragraph> <Paragraph position="13"> In our system, the solving of incomplete syntactic information deals with missing subjects of complement clauses (infinitive verbs, verbal prepositional groups). The choice of the missing subject is made according to: * the preposition introducing the complement clause (if applicable), * the subject, object and dative of the main verb (i.e. the verb to which the complement clause is syntactically related), * in some cases, the adverbial phrases of the complement clause.</Paragraph> <Paragraph position="14"> For this processing it is necessary to have a knowledge base about the verbs of the Natural Language, along with their possible prepositional syntactic constructions. This knowledge base is organized into classes of verbs for which similar syntactic constructions lead to the same choice for the missing subject. Surprisingly enough, we have found that these classes also correspond in French to semantic classes (necessity, motion, perception, accompaniment, intention, delegation of power, etc.). Our algorithm has been written for the French language and should be partially or totally rewritten for other Natural Languages.</Paragraph> <Paragraph position="15"> Here is an example of the kind of results we get: &quot;Le directeur demande à son employé de faire réparer le terminal par le service d'entretien&quot; (&quot;The manager asks his employee to have the terminal repaired by the maintenance people&quot;) Sometimes, the solution is not so straightforward.
For example, let us consider the sentences: &quot;J'ai entendu jouer les enfants&quot; (&quot;I heard the children playing&quot;) &quot;J'ai entendu jouer la musique&quot; (&quot;I heard the music playing&quot;) In one of these sentences (both in French and in English), the noun phrase following the infinitive is its subject, in the other it is</Paragraph> </Section> </Paper>
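The role the concept type lattice plays in the directed join can be illustrated with a small Python sketch. It is not the VM/PROLOG implementation described in the paper. The type names are taken from the text, but the parent links between them are an assumption (the original lattice diagram is not legible), and the two lexicon entries for "passer", named `sense_a` and `sense_b`, are hypothetical stand-ins for the entries the paper showed as figures. Graphs are reduced to labelled slots, so edge propagation is left out.

```python
# Parent links of a small concept type lattice. The type names come from the
# text; the edges are an ASSUMPTION, since the original diagram is garbled.
PARENTS = {
    "ENTITY": [],
    "ECONOMIC_ENTITY": ["ENTITY"],
    "MEASURE_UNIT": ["ENTITY"],
    "INTEREST_RATE": ["ECONOMIC_ENTITY"],
    "CURRENCY": ["ECONOMIC_ENTITY", "MEASURE_UNIT"],
    "TIME_UNIT": ["MEASURE_UNIT"],
    "DOLLAR": ["CURRENCY"],
    "FRANC": ["CURRENCY"],
    "MONTH": ["TIME_UNIT"],
}


def is_subtype(t, ancestor):
    """True if t equals ancestor or lies below it in the lattice."""
    return t == ancestor or any(is_subtype(p, ancestor) for p in PARENTS[t])


def join_types(a, b):
    """Return the more specialized of two types, or None if they are unrelated."""
    if is_subtype(a, b):
        return a
    if is_subtype(b, a):
        return b
    return None


def directed_join(entry, fillers):
    """Join slot by slot, pairing concepts that carry the same attachment label.

    Returns (joined_slots, overlap), or None when some pair cannot be joined.
    """
    joined, overlap = {}, 0
    for label, slot_type in entry.items():
        if label not in fillers:
            joined[label] = slot_type  # nothing to map onto: slot kept unchanged
            continue
        merged = join_types(slot_type, fillers[label])
        if merged is None:
            return None  # type constraint violated: this entry is ruled out
        joined[label] = merged
        overlap += 1
    return joined, overlap


def disambiguate(lexicon_entries, fillers):
    """Keep only the entries whose join succeeds with maximum overlap."""
    results = []
    for name, entry in lexicon_entries.items():
        outcome = directed_join(entry, fillers)
        if outcome is not None:
            results.append((name, *outcome))
    if not results:
        return []
    best = max(overlap for _, _, overlap in results)
    return [(name, joined) for name, joined, overlap in results if overlap == best]


# HYPOTHETICAL entries for "passer"; the paper's real entries were figures.
LEXICON_PASSER = {
    "sense_a": {"from": "TIME_UNIT", "to": "TIME_UNIT"},
    "sense_b": {"from": "CURRENCY", "to": "CURRENCY"},
}
# Concept types contributed by "de 6 francs" and "à 5 francs".
FILLERS = {"from": "FRANC", "to": "FRANC"}

print(disambiguate(LEXICON_PASSER, FILLERS))
```

With these assumed entries, only `sense_b` survives, because FRANC specializes CURRENCY but is unrelated to TIME_UNIT. The surviving slots carry the more specialized type, in line with the paper's remark that the join keeps the most specialized concept types.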