File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/79/j79-1083_metho.xml
Size: 121,040 bytes
Last Modified: 2025-10-06 14:11:15
<?xml version="1.0" standalone="yes"?> <Paper uid="J79-1083"> <Title>Association for Computational Linguistics SUMMARY OF A LEXICON FOR A COMPUTER QUESTION-ANSWERING SYSTEM</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> SUMMARY OF A LEXICON FOR A COMPUTER QUESTION-ANSWERING SYSTEM </SectionTitle> <Paragraph position="0"> An integral part of any natural language understanding system, but one which has received very little attention in application, is the lexicon. It is needed during the parsing of the input text, for making inferences, and for generating language output or performing some action. This paper discusses the principal questions concerning the lexicon as it relates in particular to a question-answering system and proposes a specific type of lexicon to fulfill the needs of this system.</Paragraph> <Paragraph position="1"> Rather than make a distinction between dictionary and encyclopedia, we have a single global data base which we call the lexicon. Homographs are differentiated and phrases with fixed meanings are treated as separate entries. All the information in this lexicon is encoded in the form of relations and words or word senses. These form a large network with the words as nodes and the relations as edges. In addition the relations define semantic fields and these are used to treat problems of ambiguity. Relations are used to encode both syntactic and semantic information. Axiom schemes are associated with each relation and these are used for inferencing.</Paragraph> <Paragraph position="2"> The lexical relations then are at the heart (or brain) of the system for representation, retrieval, and inferencing.</Paragraph> <Paragraph position="3"> For each relation we describe its semantics and the axioms appropriate to it. In the positing of lexical relations our approach has been influenced by the work of Apresyan, Mel'cuk, and Zolkovsky. The lexical relations we have posited are the traditional synonymy and antonymy, taxonomy, part-whole, grading, and approximately forty others.</Paragraph> <Paragraph position="4"> The whole set, deliberately left open-ended, is subdivided into nine subsets which include attribute relations, collocational relations and paradigmatic ones.</Paragraph> <Paragraph position="5"> Each relation has its own lexical entry giving its properties and telling how to interpret lexical relationships in a first order predicate calculus form. For example, the information for the lexical entry dog includes the statement dog T animal, that is, that a dog is a kind of animal. The lexical entry for T, the taxonomic relation, in its turn includes information which allows the statement to be interpreted as Holds(Ncom(dog,X)) -> Holds(Ncom(animal,X)).</Paragraph> <Paragraph position="6"> The inventory of relations is expandable simply by adding lexical entries for new relations. In addition, having both the lexical entries and the relations in the entries expressed in the same notational form as that of input sentences, namely in a first order predicate calculus notation, allows for a consistent, coherent, and easily modifiable system for analysis, inference, and synthesis.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> TABLE OF CONTENTS </SectionTitle> <Paragraph position="0"> 1. Introduction
2. Design Decisions
   a. The Dictionary and the Encyclopedia - One Data Base
or Two?
   b. Lexical Models - Componential Feature Analysis vs. Relational Networks
   c. Selection Preferences
   d. The Homonymy - Polysemy Problem - Criteria for Separate Entries
   e. Idioms
   f. Preliminary Design Decisions for the Lexicon
3. Some Theories of Lexical Relations
4. The Set of Lexical Relations
   a. The Classical Relations: Taxonomy and Synonymy
   b. Antonymy
   c. Grading
   d. Attribute Relations
   e. Parts and Wholes
   f. Typical Case Relations
   g. Other Collocation Relations
   h. Paradigmatic Relations
   j. Inflectional Relations
5. The Organization of the Lexicon and the Semantic Representations
6. The Form of the Lexical Entry
7. Summary
Appendix I. The Semantic Representations
References</Paragraph> </Section> <Section position="4" start_page="0" end_page="8" type="metho"> <SectionTitle> 1. INTRODUCTION </SectionTitle> <Paragraph position="0"> The lexicon presented here is being developed as an integral part of a computer question-answering system which answers multiple-choice questions about simple children's stories. It thus must make information readily available for the parsing process, for building an internal model of the story being read, and for making inferences. Knowledge about words and knowledge about the world must both be stored in a compact but immediately accessible form.</Paragraph> <Paragraph position="1"> Many decisions must be made, therefore, about the design of the lexicon. The first problem is to decide on an organizing structure. Should lexical and "encyclopedic" information be stored separately or together? Which items will have separate lexical entries? Which will be included in other entries? What about homonymy and polysemy? What connecting links between words and word senses will be recorded and how? The next problem is to determine a characterization of word meanings. This leads to some deep theoretical questions: What kind of lexical semantic representations are appropriate? What is the structure of these representations? What are the semantic primes, the elements of that structure? The design of the lexical entry is thus subject to theoretical biases as well as the practical constraints of space, retrieval efficiency, and effective support of inference-making.</Paragraph> <Paragraph position="2"> The decision to use lexical relations as fundamental elements of the structure of the lexicon has strongly influenced our design. Relations are used to encode both semantic and syntactic information. Axiom schemes essential to inferencing are associated with each relation. Relational information makes up a significant part of the lexical entry. Lexical relations offer significant advantages. They allow us to generalize familiar inference patterns into axiom schemes. They can encapsulate the defining formulae of the commercial dictionary.
They have an intuitive appeal which we believe reflects a certain measure of psychological reality. On a practical level they allow us to express both syntactic and semantic information in a form that is compact and easy to retrieve. They can be used in many ways. For example, the following paragraph from a test administered to first and second graders by a local school system says: (P1) Ted has a puppy. His name is Happy.</Paragraph> <Paragraph position="3"> Ted and Happy like to play.</Paragraph> <Paragraph position="4"> (Q1) The pet is a: dog boy toy In order to answer this question we need to know what pet means. In our lexicon the lexical entry for pet contains a simple definition: a pet is an animal that is owned by a human. In order to answer this question we also need to know that a puppy is a young dog. This information in predicate calculus form would be part of the lexical entry for puppy. We would, of course, need axioms of the same form as well for the entries for kitten, lamb, etc. Instead of such a representation we express this information by using a lexical relation, CHILD. The lexical entry for puppy contains CHILD dog. Similarly, the lexical entry for kitten contains CHILD cat, while the lexical entry for CHILD contains the axiom scheme from which the relevant axioms are formed when needed (a small sketch of this look-up appears below).</Paragraph> <Paragraph position="5"> We treat verbs in a similar way. Corresponding to each case relation there is a lexical relation which points to typical fillers of that case slot. The lexical entry for bake2 includes TLOC kitchen. It also includes T make, where T is the well-known taxonomy relation, so that if the story says that "Mother baked a cake" we can infer that she made one, and CAUSE bake1, so that we can deduce that the cake has baked. The selection restrictions that help us tell instances of bake1 and bake2 apart can also be expressed compactly using the T relation. We also need to make deductions from main verbs in predicate complement constructions, deductions such as the speaker's view of the truth of the proposition stated in the complement as derived from the factivity of the verb. In order to answer several questions from the test cited above the reader must infer that everything that Mother says is true.</Paragraph> <Paragraph position="6"> Lexical entries for main verbs that take predicate complements contain pointers to the implication class. These relations can then be expanded to give the proper axioms.</Paragraph> <Paragraph position="7"> The lexicon includes separate entries for each derived form unless the root can be identified by a simple suffix-chopping routine. Lexical relations are useful here too in saving space. The lexical entry for men contains PLURAL man. The lexical entry for went consists of PAST go. The lexical entry for death consists of NOMV die. There are, as well, lexical entries for some multiple word expressions such as birthday party, ball game, piggy bank, and thank you.</Paragraph> <Paragraph position="8"> As to the form of presentation here, the next section presents some of the practical problems and theoretical convictions which determined our most critical design decisions.
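The following minimal sketch (ours, not part of the system described in this paper) illustrates how entries built from the CHILD and T relations could support the pet/puppy inference above. The dictionary fragment and the function name are invented for illustration.

    # Toy lexicon: each entry lists (relation, related word) pairs.
    LEXICON = {
        "puppy":  [("CHILD", "dog")],   # a puppy is a young dog
        "kitten": [("CHILD", "cat")],
        "dog":    [("T", "animal")],    # a dog is a kind of animal
        "pet":    [("T", "animal")],    # real entry adds "owned by a human"
    }

    def isa(word, target, lexicon=LEXICON):
        """True if 'target' is reachable from 'word' via T or CHILD edges.

        The CHILD axiom scheme licenses 'x CHILD y -> x T y' (a young y
        is still a y), so both relations are followed here.
        """
        if word == target:
            return True
        for rel, other in lexicon.get(word, []):
            if rel in ("T", "CHILD") and isa(other, target, lexicon):
                return True
        return False

    # (Q1) The pet is a: dog / boy / toy -- 'puppy' reaches 'dog' in one step.
    assert isa("puppy", "dog")
    assert isa("puppy", "animal")   # via dog T animal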
Then, after a brief description of some earlier developments in the theory of lexical relations, we explain the system of relations which structures our lexicon, discussing each group of relations in turn. Finally we describe the actual form of our lexical entries.</Paragraph> </Section> <Section position="5" start_page="8" end_page="14" type="metho"> <SectionTitle> 2. DESIGN DECISIONS </SectionTitle> <Paragraph position="0"> The lexicon in this system must make information readily available for parsing, for building the story model, and for making inferences during question answering. Both knowledge about words and knowledge about the world must be stored in a compact but immediately accessible form.</Paragraph> <Paragraph position="1"> Therefore, many decisions must be made concerning the design of the lexicon. The problems involved include the organization of lexical and encyclopedic information, the choice of a lexical model and the determination of appropriate semantic primes, the representation of selection preferences, the recognition and storage of homographs, and the criteria for establishing separate entries for idioms and other fixed phrases. This paper attempts to develop some consistent solutions to these problems, solutions which determine the design decisions for the lexicon in this question-answering system.</Paragraph> <Paragraph position="2"> a. The Dictionary and the Encyclopedia - One Data-Base or Two? Any question-answering system must use lexical information in at least two ways, in parsing and in making inferences. The first critical decision that must be made is whether two separate data-bases are needed to support these separate functions or whether a single unified global data-base is better. Traditionally human beings have used two separate stores of information, the dictionary and the encyclopedia.</Paragraph> <Paragraph position="3"> Some linguistic and computational models of language have also been based on the assumption that information about words should be stored in two separate collections.</Paragraph> <Paragraph position="4"> In Chomsky's Aspects model (1965) there are two separate storage places for lexical information, one in the base component and another in the semantic component. Kirr (1972) stored syntactic information in a "dictionary" and semantic information in an "encyclopedia". Winograd (1971) has two separate word lists, one used by the parser and one by the semantic routines, even though the parsing and the semantic routines are very closely interwoven in his BLOCKS system. Before deciding on whether to carry on this tradition one must ask whether there is really a clearcut distinction between these two kinds of lexical information. Is there a simple algorithm for deciding which data should go where?
Bierman, Bierwisch, Kiefer, and the Semantic Function of the Lexicon. Both the dictionary and the encyclopedia are ways of recording information stored in human memory. But human memory is probably not organized in the usual graphic form of an alphabetic word list; therefore alternative memory structures should be examined. One such alternative has been presented by Bierman (1964). In his system lexical-semantic fields are primary; they define the basic organization of semantic information. The function of the lexicon, if it has one in the semantic domain, is to index these fields, to store pointers to the location of a word in the various fields containing it. An appropriate image for such a system is a very large single-page dictionary with language-specific nodes connected by semantic relations (see also Werner 1969). Can the dictionary and the encyclopedia be distinguished in this context? Bierwisch and Kiefer (1970) assume that both kinds of information are contained in the same lexical entry. The distinction between lexical and encyclopedic knowledge corresponds then to the difference between the core and the periphery of a lexical entry, where The core of a lexical reading comprises all and only those semantic specifications that determine, roughly speaking, its place within the system of dictionary entries, i.e. delimit it from other (non-synonymous) entries. The periphery consists of those semantic specifications which could be removed from its reading without changing its relation to other lexical readings within the same grammar (ibid: 69-70). Unfortunately they do not specify whether the lexical-semantic relations which form the structure of the fields are part of the core or the periphery. The major difficulty with this criterion is its instability. As new entries are added to the system, information sufficient to distinguish one entry from another may have to be shifted from the periphery to the core -- and thus from the encyclopedia to the lexicon. For instance, suppose a new entry, "leopard--a large, wild cat", is to be added. The entire lexicon must be searched for entries which mention large wild cats. If one is found, say "lion--a large wild cat", then enough information must be added to both definitions to differentiate leopard and lion from each other. Soviet Lexicography and the Lexical Universe.</Paragraph> <Paragraph position="5"> Apresyan, Zolkovsky, and Mel'cuk run into the same difficulty of distinguishing dictionary and encyclopedic information in attempting to define the lexical universe of a word Co: The main themes dealt with under the heading 'lexical universe' are: 1) the types of Co; 2) the main parts or phases of Co; 3) typical situations occurring before or after Co, etc. Thus, the section lexical universe for the word skis consists of a list of the types of skis (racing, mountain, jumping, hunting), their main parts (skis proper and bindings), the main objects and actions necessary for the correct use (exploitation) of skis (sticks, grease, to wax), the main types of activities connected with skis (a ski-trip, a ski-race ...) and so on. Even these scanty examples make it clear that the information about the lexical universe is, at least partially, of an encyclopaedic nature. We say "partially" because genuine encyclopaedic information about skis (their history, the way they are manufactured, etc.) is not supplied here: the sections contain only such words and phrases as are necessary for talking on the topic, and nothing else.
(1970:19) The problem here is that "what is needed for talking about the topic" depends very much on who is going to do the talking.</Paragraph> <Paragraph position="6"> The definition of ski in Webster's New International (2nd Edition) begins: One of a pair of narrow strips of wood, metal, or plastic, usually in combination, bound one on each foot and used for gliding over a snow-covered surface.</Paragraph> <Paragraph position="7"> Apresyan, Zolkovsky, and Mel'cuk do not provide for three of the items mentioned here: what skis are made of (wood, plastic, or metal), what shape they come in (long and narrow) and where they belong spatially (on the feet). Yet these items could be essential in understanding inferences in a story.</Paragraph> <Paragraph position="8"> It was snowing. Jim took out his skis.</Paragraph> <Paragraph position="9"> He waxed the wooden strips....</Paragraph> <Paragraph position="10"> You could need this information in answering questions.</Paragraph> <Paragraph position="11"> Jim skied down the mountain....</Paragraph> <Paragraph position="12"> What was he wearing on his feet: slippers skis skates? Although in English or Russian it is possible to refer to skis without knowing that they are long and narrow, it is not possible in Navajo, where physical shapes determine verb forms. While the entry in Webster's goes on at length beyond the sentence given above, it does not include all the items which Apresyan, Zolkovsky, and Mel'cuk mention.</Paragraph> <Paragraph position="13"> This, however, is not surprising; the boundaries of the lexical universe are not well defined.</Paragraph> <Paragraph position="14"> Difficulties in updating a system with separate dictionary and encyclopedia. This lack of definition causes tremendous problems in a dynamic system. A "real" dictionary or encyclopedia, the one in a person's brain, is constantly changing. Information is added, corrected, and perhaps lost. A truly interesting memory model must be dynamic. The problems of updating this information are not easy to solve; the problem of distinguishing between dictionary and encyclopedic information in the updating process seems insuperable. Recognizing definitions phrased in ordinary English is already difficult (Bierwisch and Kiefer 1970, Lawler 1972). Determining the reliability of such information is also a problem and the dichotomy of dictionary and encyclopedia increases this difficulty. Unfortunately information does not come neatly packaged and marked "for the dictionary" or "for the encyclopedia". And addition of information to one part of the entry may necessitate updating other parts of the entry. For example, if we learn that record is a verb as well as a noun we need to add morphological information, describe the relations between record and write, and we should probably describe recording materials. Mention must be made that record is a factive, i.e. if someone records that something happened, one can assume that from the standpoint of the speaker the something really did happen. Which of this information is dictionary information and which is encyclopedic? And once this decision is made, information added to that entry may require additions to other entries. In the record example, the entries for erase and write would have to be updated.
Also, a decision must be made on whether a new entry is needed and whether homography or polysemy exists for this new entry.</Paragraph> <Paragraph position="15"> The work of Kiparsky and Kiparsky (1970), Lakoff (1971) and McCawley (1968) has shown that syntax and semantics cannot be separated into such neat compartments. But if syntax and semantics are interwoven, then does it make sense to put syntactic information in one box and semantic information in another? The answer to this question given at least by generative semantics calls into question the traditional distinction between the dictionary and the encyclopedia.</Paragraph> <Paragraph position="16"> We accept the generative semanticist arguments that syntax and semantics cannot be separated and thus do not separate syntactic and semantic information. Furthermore, as shown above there seem to be no practical criteria for distinguishing dictionary information from encyclopedic information. Thus our system has one single global data base. For brevity, and since it is a kind of collection of words, it will be called "the lexicon".</Paragraph> <Paragraph position="17"> b. Lexical Models - Componential Feature Analysis vs. Relational Networks. A second critically important decision involves the choice of an appropriate lexical model, the determination of what semantic primes to use and how they should be combined in lexical semantic structures. Two important competing models are provided by componential feature analysis and by relational networks.</Paragraph> <Paragraph position="18"> In a componential analysis model the primes are semantic features and words are defined by bundles of features. This is a natural extension of the distinctive feature approach to phoneme description which has been used to explain many phonological phenomena. Certain practical problems arise. The number of words in any language is far larger than the number of phonemes.</Paragraph> <Paragraph position="19"> The number of distinctive features which serve to discriminate them must be larger too.</Paragraph> <Paragraph position="20"> The word-semantic feature matrix for a given language would be vastly larger than the phoneme-phonetic feature matrix. In addition, this matrix would be extremely sparse. Also, it is not clear whether all the entries in this matrix could be +/- as in a phoneme matrix. Are semantic features either definitely absent or definitely present or are some features present by degrees? The size of the componential analysis matrix would immediately introduce difficulties in a computerized model. Fortunately, both numerical analysis and document retrieval offer experience in handling immense matrices by machine. When a matrix is extremely sparse it turns out to be sensible to store a list of entries with row and column numbers. Here it would mean storing a list of features for each word. This, in fact, is close to Katz's proposal (1966).</Paragraph> <Paragraph position="21"> In a relational network model, however, the primes are relations and words or word senses. Relations connect words together in a network in which the words are nodes and the relations are edges. In fact, words are defined in terms of their relationships to other words.</Paragraph> <Paragraph position="22"> These models differ radically in their approach to the critical lexical task of finding related words. In the componential analysis model related words share related features.
Presumably, the more features two words share the more closely related they are. Thus, some kind of cluster analysis must be used to identify related words. In the relational network model the lexicon is formed from relationships between words. Thus related words are immediately available. Both models, componential and relational, require a search for semantic primes. The componential analysis model requires the discovery of possibly thousands of semantic features.</Paragraph> <Paragraph position="23"> For a relational network model an inventory of lexical relations and their properties must be developed.</Paragraph> <Paragraph position="24"> This is apparently a significantly simpler task than the discovery of semantic features, for the number of relevant relations is probably quite small.</Paragraph> <Paragraph position="25"> Zolkovsky and Mel'cuk (1970) list about fifty in their paper.</Paragraph> <Paragraph position="26"> Related to both of these models is the notion of semantic fields. Intuitively, semantic fields are collections of related words used to talk about a particular subject.</Paragraph> <Paragraph position="27"> Semantic fields seem to offer some help in coping with the problems of ambiguity and context. Many utterances, taken out of context, are ambiguous. But remarkably, people almost never perceive this ambiguity. They immediately choose the correct word sense and ignore the others. Apparently the topic of conversation determines a semantic field and the word sense chosen is the one which lies in this field. The semantic field somehow defines the verbal context. (Or as Fillmore 1977:59 phrases it, "meanings are relativized to scenes".) The componential analysis model makes it possible to define distinct semantic fields, but getting from one word in the field to the others may take a significant amount of processing time. Every set of semantic markers can be used to define a semantic field; the field consists of all the words with definitions containing the markers. The smaller the number of markers the larger the field obtained. It is possible to decide immediately whether a given word is in the field or not, just by checking its list of markers.</Paragraph> <Paragraph position="28"> In the relational network model related words are easy to find, but the boundaries of semantic fields are extremely fuzzy and indistinct. A semantic field can be defined by starting at a particular node and going a given number of steps in any direction. The semantic fields obtained this way, however, have very arbitrary boundaries and overlap considerably. Certain basic philosophical-psychological assumptions may create a strong bias for one of these models over the other. Someone who believes that semantic features exist as Platonic ideals or who accepts them as psychological realities may easily find componential analysis a most natural kind of description and regard the necessary search for features or sememes as highly relevant. Someone who feels that "There is no thought without words" would be much more likely to prefer a relational network description. A lexicon is, in an important sense, a memory model. Intuitions about our own internal memory models must have a strong influence on the lexicon we shape.</Paragraph> <Paragraph position="29"> We have chosen a relational network model for both intuitive and practical reasons. We find lexical-semantic relations theoretically interesting (see Evens et al.
ms). Useful inventories of these relations are available; in a later section we describe some of these sources. As will be shown, they provide a convenient way of storing axiom schemes for deductive inference.</Paragraph> <Paragraph position="30"> As lexical semantic structures we use the same first-order predicate calculus notation in which semantic representations are written in the question-answering system - meanings of words and meanings of sentences must have the same underlying form. As McCawley (1970) has argued, "dentist" and "doctor who treats teeth" must contain the same units of meaning tied together in the same way.</Paragraph> <Paragraph position="31"> c. Selection Preferences. A third important problem to be faced in constructing a lexicon which is to support a parser is the problem of selection restrictions. Chomsky (1965) developed the theory of selection restrictions in order to block the generation of nonsense sentences in the syntactic component of his model. The lexical entry for frighten, for example, contains the information that it requires as object a noun with the feature [+animate], while drink requires an animate subject. If these conditions are not met, generation is blocked. Selectional restrictions seem much too restrictive. Trailer trucks drink diesel fuel and the earth drinks in the rain. In describing dreams we can invent perfectly appropriate sentences in which inanimate objects by the dozens get up, run around, and drink until frightened back to place. Still it is true that sentences like these are somehow more surprising than sentences in which cows drink from a brook and are frightened by lightning. We need some method of recording the ordinary, everyday ways in which words combine without excluding the unusual, the poetic, the metaphoric uses. We will call them selection preferences instead of selection restrictions. Some truly semantic means of identifying semantic anomalies are needed. Raphael mentions this question rather casually, almost as an aside, in the SIR paper. He draws taxonomic trees, one for the nouns and one for the verbs, from the vocabulary of a first grade reader. Then he makes statements like this: 1. Any noun below node 1 is a suitable subject for any verb below node 1'.</Paragraph> <Paragraph position="32"> 2. Only nouns below nodes 3 or 4 may be subjects for verbs below node 3'. (1968, p. 51) He makes it clear that he is indeed trying to solve the selectional problem: The complete model, composed of tree structures and statements about their possible connections, is a representation for the class of all possible events. In other words, it represents the computer's knowledge of the world. We now have a mechanism for testing the 'coherence' or 'meaningfulness' of new samples of text. (1968, p. 51) Werner (1972) has suggested a method for handling the selectional problem which uses noun taxonomies in very much the same way that Raphael does. His proposal includes an elegant way of storing selectional information within his memory model. In his network model, noun phrase arguments are connected to the verb by prepositions. The node representing the lexical entry for the verb has arcs connecting it to compound nodes, one for each preposition which can be used with the verb. The object of each preposition is a node in the noun taxonomy. This noun or any noun below it in the taxonomy may serve as an argument for the verb.
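A minimal sketch of this taxonomy test, anticipating the sell network discussed next: an argument satisfies a selection indicator if it is at or below the indicated node in the noun taxonomy. The toy taxonomy and frame here are our own illustrative assumptions, not Werner's data.

    PARENT = {              # child -> superordinate taxon
        "Sam": "human", "Navajo": "human", "Mother": "human",
        "human": "animal", "animal": "thing", "money": "thing",
    }

    def at_or_below(noun, node):
        """Climb parent links; True if 'noun' is 'node' or a descendant."""
        while noun is not None:
            if noun == node:
                return True
            noun = PARENT.get(noun)
        return False

    SELL_FRAME = {"subject": "human", "object": "thing",
                  "to": "human", "for": "money"}

    def satisfies(role, filler):
        return at_or_below(filler, SELL_FRAME[role])

    assert satisfies("subject", "Mother")   # any node under [human] will do
    assert not satisfies("for", "Navajo")   # a Navajo is not money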
Here is an over-simplified example of a network for sell.</Paragraph> <Paragraph position="32"> This network says that sell takes a human subject, a thing as object, the preposition to followed by a human, the preposition for followed by money. The square brackets around [human] indicate that this is just a pointer to the top noun in the taxonomy for human beings. Any node in this taxonomy below the node marked human, whether it is Sam or a Navajo or Mother, can be used as a subject for sell. He does not use the verb taxonomy as Raphael does.</Paragraph> <Paragraph position="33"> Each verb has its own set of selection indicators. In his discussion of the goals of a semantic theory Winograd describes semantic markers and selection restrictions, quotes Katz and Fodor (1964) and indicates that he intends to embody this theory in his system. But in fact semantic markers in the BLOCKS program are derived from a marker tree (Winograd, 1971, Figure 59) which is organized taxonomically. In the implementation process Winograd seems to have moved from a strict Katz-Chomsky position to a position somewhat closer to Raphael and Werner. The Raphael and Werner proposals are the guiding principles here, adapted to accommodate case-defined arguments.</Paragraph> <Paragraph position="34"> The lexical entry for move1, the intransitive move, must tell us about selection as well as how to relate subject, object, and prepositional phrases to cases. The information is organized this way:

move1 #
     grammatical function    case         selection information
  1. subject                 experiencer  thing
  2. from                    source       thing, place
  3. to, into, onto          goal         thing, place

The numbers 1, 2, 3 indicate argument positions for the predicate calculus representation. The next column lists the grammatical function. Next come case indications. Last comes the selection information, the top node in the relevant part of the taxonomy.</Paragraph> <Paragraph position="35"> For move1 the subject is an experiencer. The source is usually marked by the preposition from. The goal is usually marked by a preposition like to, into, or onto.</Paragraph> <Paragraph position="36"> The selection information in column four is rather dull, since any argument can be a physical object or thing, and the source and goal can both be places. There is a rule that any physical goal can be replaced by a class of adverbs containing back and there, so these alternatives do not have to be listed. An attempt is being made to use the verb taxonomy as Raphael suggested. In this lexicon go is marked as taxonomically related to move. The entry for go does not contain the information labelled # above. Instead, when this information is needed, the look-up routine climbs the taxonomic tree in the lexicon until it finds a verb which has this information and copies it from that entry (see the sketch below). Thus it gets case-argument and selectional information for go from the entry for move. It is not clear yet whether this will really work with a sizable vocabulary. This selectional information is treated as selection preference and not selection restriction. Each candidate word sense for a verb is checked for selectional preference.
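A minimal sketch of the look-up routine just described, with invented entry structure: when a verb's entry lacks a case frame, climb the T links until an ancestor supplies one.

    VERB_LEXICON = {
        "move1": {
            "T": None,      # top of this taxonomic chain
            "frame": [("subject", "experiencer", "thing"),
                      ("from", "source", ("thing", "place")),
                      ("to/into/onto", "goal", ("thing", "place"))],
        },
        "go": {"T": "move1"},   # no frame of its own; inherited from move1
    }

    def case_frame(verb):
        """Return the nearest case frame found on the verb's taxonomic chain."""
        while verb is not None:
            entry = VERB_LEXICON[verb]
            if "frame" in entry:
                return entry["frame"]
            verb = entry["T"]
        raise KeyError("no case frame found on the taxonomic chain")

    assert case_frame("go") == case_frame("move1")   # copied from the ancestor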
If no arrangement of the available noun phrase arguments is consistent with these preferences another word sense is examined. But if all word senses have been rejected on the basis of selectional information, the sentence is not rejected. Instead we look again at the candidate word senses and count for each one the number of steps up the taxonomic tree we have to make to resolve the conflict. The word sense which requires the fewest steps is chosen. The hope is that the system will be able to "understand" simple metaphors this way. It would be interesting to try to create metaphors by picking noun phrase arguments close to but not under the nodes indicated by the selection information.</Paragraph> <Paragraph position="37"> d. The Homonymy-Polysemy Problem - Criteria for Separate Entries.</Paragraph> <Paragraph position="38"> Words with the same physical shape but different meanings constantly cause trouble in natural language processing.</Paragraph> <Paragraph position="39"> In designing a lexicon we must decide whether or not to create a separate entry for each variation in meaning and type of use.</Paragraph> <Paragraph position="40"> Quillian is particularly interested in words with multiple meanings and he experimented with several in his memory model. In Quillian (1968) the word plant is treated as a three-way homonym with three separate type nodes, each with a separate definition-plane:</Paragraph> </Section> <Section position="6" start_page="14" end_page="14" type="metho"> <SectionTitle> PLANT </SectionTitle> <Paragraph position="0"> 1. Living structure which is not an animal, frequently with leaves, getting its food from air, water, earth.</Paragraph> <Paragraph position="1"> 2. Apparatus used for any process in industry. 3. Put (seed, plant, etc.) in earth for growth. The type node for the first forms a disjunctive set with token nodes pointing to the other two. The word food has a single definition with alternative formulations: That which a living being has to take in to keep it living and for growth. Things forming meals, especially other than drink.</Paragraph> <Paragraph position="2"> A polysemous word like this has a single type node and a single definition-plane, but the two alternative definitions are combined with an OR link. Apresyan, Mel'cuk and Zolkovsky attack the homonymy-polysemy problem with vigor. Graphically coincident words are considered homonyms, given distinctive superscripts and listed as separate entries, if their definitions "have no common part" (Apresyan, Zolkovsky and Mel'cuk 1970:3). They do not define "a common part," but they do give an example: KOCA1 (scythe), KOCA2 (braid of hair), KOCA3 (spit). If two definitions have a single common part, the word is classified as polysemantic with a single entry divided into separate parts. They distinguish two types of polysemy. In one case the difference between two words is regular. The relation of a verb to its typical object is such a regular meaning change, e.g. record(v) - record(n), fish(v) - fish(n), and aid(v) - aid(n). These regular variations in meaning are numbered with Arabic numerals, while irregular variations are numbered with Roman numerals. Thus part 3 of the lexical entry for bow, the definition, might have the form: bow I. 1. To bend the head in assent or reverence. (vt) 2. To submit or yield. (vi) 3. To cause to bend. (vt) 4. An inclination of the head. (n) 5.
A bent implement used to propel an arrow or play a stringed instrument.</Paragraph> <Paragraph position="3"> (n) II. 1. The forward part of a boat. (n) 2. One who rows in the bow of a boat.</Paragraph> <Paragraph position="4"> (n) There seems to be some redundancy between definition-elements and the lexical functions. Shouldn't regular variations in meaning be captured by regular lexical functions? If so, then the distinction Apresyan, Zolkovsky and Mel'cuk make between regular and irregular meaning variations will be apparent from the form and need not be indicated by different notation, such as Arabic and Roman numerals.</Paragraph> <Paragraph position="5"> For convenience in lexical lookup we have a single physical entry for each graphical form. Each word sense, whether irregular or regular, is numbered separately with Arabic numerals.</Paragraph> <Paragraph position="6"> Thus the adjective is cool1, cool2 is the verb to become cool1, and cool3 is the verb meaning to cause to become cool1.</Paragraph> <Paragraph position="7"> Separate information about lexical relations, etc. is stored for each subentry.</Paragraph> <Paragraph position="8"> e. Idioms. Idioms present a serious problem to the designer of an English lexicon. Some criteria must be established for deciding which idioms deserve separate lexical entries and how multi-word phrases should be stored. When does an idiom deserve to be treated as a separate lexical unit? Apresyan, Zolkovsky, and Mel'cuk (1970) and Kiparsky (1975) represent opposite poles of opinion here. In the explanatory-combinatory dictionary (ECD) of the Soviets, word combinations which have a definition of their own or "a peculiar combinability pattern" have separate entries. Kiparsky (1975) considers an idiom as a separate lexical unit only if it involves syntactic patterns which are no longer productive. Thus "house beautiful" and "come hell or high water" are treated as units, but "make headway" is not. Instead headway is defined as "progress" and marked as appearing after make and lose. Kiparsky's proposal places a greater burden on the recognition program, which would have to be able to retrieve and put together the pieces of the idiom using his lexicon. The system described here follows Apresyan, Zolkovsky and Mel'cuk, and treats fixed phrases as units. In particular, all noun-noun combinations like piggy bank and birthday cake are separately defined, although this is certainly a productive part of English.</Paragraph> <Paragraph position="9"> Judith Levi (1974, 1975) has proposed a theoretically elegant and intuitively attractive method of generating these forms. According to Levi the underlying structure for "birthday boy" is "boy-have-birthday" and the underlying structure for "birthday cake" is "cake-for-birthday." Then under certain conditions have, for, etc. can be deleted to give us the noun adjunct expressions. Given these rules, she argues, it is not necessary to treat these expressions as separate lexical items. While her rules seem sufficient to allow us to synthesize these compounds correctly, difficulties arise when we try to use them for analysis. The question-answering system needs to be able to infer from "birthday boy" that the boy in question is having a birthday, but to avoid inferring from "birthday cake" that the cake is having a birthday.
For correct recognition we need to be able to recover the unique underlying structure if one exists. (For a similar criticism see Downing 1977:814-15.) Levi's theory accounts for the generation of new noun-noun compounds. However, in order to account for the recognition process we need lexical entries for familiar fixed compounds and her theory to analyze new compounds. We have used Levi's structure as a basis for our representation of compound nouns. Noun-noun compounds have separate entries. A birthday cake is treated as "a cake for a birthday." A ball game is represented as "a game that has a ball". A piggy bank is defined as "a bank that is a pig." The system is told that Jim has a piggy bank and asked what the bank looks like. It could be argued that anyone with sufficient cultural knowledge ought to be able to answer this even if all the banks in his past were shaped like bee-hives, but we need a place to write down this cultural encyclopedic knowledge and a lexical entry for piggy bank seems like a good place to put it.</Paragraph> <Paragraph position="10"> Becker in his work on "The Phrasal Lexicon" (1975) has produced evidence on the Soviet side of this argument. His data suggest that fixed phrases comprise approximately half of our spoken output and have an independent lexical existence.</Paragraph> <Paragraph position="11"> He includes in his lexicon euphemisms ("the oldest profession"), phrasal constraints ("by sheer coincidence"), deictic locutions ("for that matter"), sentence builders ("(person A) gave (person B) a long song and dance about (a topic)"), situational utterances ("How can I ever repay you?"), and verbatim texts (proverbs, song titles, etc.). He claims that we speak mostly by stitching together swatches of text that we have heard before; productive processes have the secondary role of adapting the old phrases to the new situation....most utterances are produced in stereotyped social situations, where the communicative and ritualistic functions of language demand not novelty, but rather an appropriate combination of formulas, cliches, idioms, allusions, slogans... (1975:60) He has collected 25,000 phrases for the phrasal lexicon.</Paragraph> <Paragraph position="12"> Catherine Flournoy (1975) has found several hundred fixed phrases in a computer study of Father Coughlin's speeches.</Paragraph> <Paragraph position="13"> This is not a new idea to students of oral epic poetry. Homer constantly used fixed phrases to fit syntactic and metrical slots. Dawn is always "rosy-fingered"; Hector is constantly "tall Hector of the shining helm." There is a serious space-time tradeoff here between parsing time and lexical storage space. It is probably true that people possess and constantly use a phrasal lexicon. Whether we should use storage space for items which we can parse and produce without ambiguity is another question. Currently we provide separate entries for any phrase that we cannot parse and interpret correctly from the entries for individual words. Brief entries for these phrases seem absolutely necessary for any practical recognition scheme. These entries also seem to be the appropriate place for indexing pointers to the cultural information necessary for making inferences and answering questions about birthday cakes and birthday parties. There are theoretical arguments for such entries as well.
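A minimal sketch, under our own assumptions about tokenization, of how a phrasal lexicon with entries like piggy bank might be consulted during recognition: prefer the longest fixed phrase starting at each position, and fall back to single-word entries otherwise.

    PHRASAL = {("piggy", "bank"), ("birthday", "cake"), ("ball", "game"),
               ("thank", "you")}
    MAX_LEN = max(len(p) for p in PHRASAL)

    def segment(tokens):
        """Greedy longest-match segmentation against the phrasal lexicon."""
        out, i = [], 0
        while i < len(tokens):
            for n in range(min(MAX_LEN, len(tokens) - i), 1, -1):
                if tuple(tokens[i:i + n]) in PHRASAL:
                    out.append(" ".join(tokens[i:i + n]))
                    i += n
                    break
            else:                      # no phrase matched; emit one word
                out.append(tokens[i])
                i += 1
        return out

    assert segment("Jim has a piggy bank".split()) == \
        ["Jim", "has", "a", "piggy bank"]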
We believe, as Becker does, in the phrasal lexicon, although we do not include entries for any phrases that can be parsed and interpreted correctly without a separate entry. Any complete system for language processing must also, of course, contain rules like Levi's to provide an ability to process novel forms.</Paragraph> <Paragraph position="14"> f. Preliminary Design Decisions for the Lexicon.</Paragraph> <Paragraph position="15"> The goal of this project is a lexicon sufficient for parsing, forming semantic representations, and making inferences, compact but still allowing rapid lexical lookup.</Paragraph> <Paragraph position="16"> The lexicon is a data-base for the question-answering system, a combination lexicon-encyclopedia. Syntactic and semantic information are combined in the same lexical entries. Lexical semantic representations are written in the same form as the semantic representations for sentences, in a many-sorted first order predicate calculus. Homographs which vary in meaning or use are differentiated by Arabic numeral subscripts. Separate entries are included for phrases with fixed meaning.</Paragraph> <Paragraph position="17"> The lexicon is organized in terms of lexical relations. Semantic fields defined by relations are used to handle problems of ambiguity and context. The relations are used to express and retrieve many different kinds of information, from past participles to selection preferences to proper habitats for lions.</Paragraph> <Paragraph position="18"> Thus the system of lexical relations is crucial to representation, retrieval, and inference.</Paragraph> </Section> <Section position="7" start_page="14" end_page="33" type="metho"> <SectionTitle> 3. SOME THEORIES OF LEXICAL RELATIONS </SectionTitle> <Paragraph position="0"> While developing our lexical relations we examined a variety of relational theories in anthropology and linguistics and even collected folk definitions of our own (Evens 1975).</Paragraph> <Paragraph position="1"> We have been particularly influenced by the anthropological fieldwork of Casagrande and Hale (1967), by the memory models of Raphael (1968) and Werner (1974), and most of all by the ECD of Apresyan, Mel'cuk, and Zolkovsky (1970).</Paragraph> <Paragraph position="2"> But we looked at each of these relational theories from the peculiar point of view of computer question-answering and the particular lexical environment of children's stories, adding and discarding relations to fit the problem. Casagrande and Hale - Lexical Relations in Folk Definitions.</Paragraph> <Paragraph position="3"> Casagrande and Hale (1967) collected 800 Papago folk-definitions and sorted them into groups on the basis of semantic and grammatical similarities. They produced the following list of thirteen lexical relations. (Table 1)

Relation            Word           English Gloss of Papago Definition
1. Attributive      burrowing owl  but they are small; and they act like mice; they live in holes
2. Contingency      to get angry   when we do not like something we get angry
3. Function         tongue         with which we speak
4. Spatial          bucket         in which we get water
5. Operational      bread          which we eat
6. Comparison       wolf           they are rather like coyotes, but they are big
7. Exemplification  sweet          as sugar
8. Class inclusion  crane          a bird
9. Synonymy         amusing        funny
10. Antonymy        low            not high
11. Provenience     milk           we get it from a cow
12. Grading         Monday         the one following Sunday
13.
Circularity         near           when something is sitting nearby we say near

Casagrande and Hale make no claim that they have found all possible lexical relations. These definitions were collected as part of a study of dialect variation in Papago and Pima. The words to be defined were chosen because they might exhibit dialect differences and not to elicit all possible defining formulae. They suggest for intuitive reasons adding the part-whole relation to their list although they did not identify it in their data. They also provide an interesting discussion of word association data in which they give stimulus-response pairs from the Minnesota norms of Russell and Jenkins (1954) exemplifying each of their lexical semantic relations (except for circularity). They cite some word association pairs which do not have exact analogues in the Papago definitions. These are "coordinate" pairs like "needle-thread" or "bread-butter", "clang" responses like "table-stable", or sequential responses "fish-bone" and "whistle-stop". They remark about the bread-butter pair that the relationship involved between "bread" and "butter" is similar to that discussed for contingency, except that in the Papago sample, the contingency relationship is not used if both X and Y are nominal concepts. Webster's Collegiate Dictionary does not mention butter in the bread entry but it has a separate entry: "bread and butter. Bread spread with butter; hence, Colloq., livelihood...." (p. 103) It does mention thread in the needle entry and needle in the thread entry. This kind of association belongs in every lexicon. Werner's Lexical Relations.</Paragraph> <Paragraph position="4"> There are two ways to go from the study of folk definitions. One way is to find or invent lexical relations to fit all the folk definitions one can collect in a given language, and then look for more in the formulae of published dictionaries of that language. The other is to abstract a minimal set of language-universal lexical-semantic relations and then attempt to express other proposed lexical relations in terms of the minimal set. Werner has made substantial steps in this second direction (Werner and Topper 1976).</Paragraph> <Paragraph position="5"> Werner's basic semantic relations are the taxonomic relation (T), the modification or attribution relation (M) and the temporal sequencing relation, queuing (Q). These he calls "the basic cement of the organization of cultural knowledge and memory." (1974:173) The relation of taxonomy (T), the one expressed in English by "a canary is a (kind of) bird", is written (bird) T (canary) and is represented in Werner's diagrams by a directed arc labelled T.</Paragraph> <Paragraph position="9"> The relation of modification or attribution (M), the one expressed in English by "the yellow bird" or "the bird is yellow", is represented by a directed arc labelled M: (bird) -M-> [yellow]. These last two diagrams can be combined to express the idea that a canary is a yellow bird. [diagram: (bird) with an M arc to [yellow] and a T arc to (canary)] The queuing relation Q represents the idea of order or sequence. For example, (Monday) Q (Tuesday). This relation is fundamental in the representation of plans in Werner's memory model. "Knowing how... requires the retention of temporal order.
there are things to be done first, second, and so on and usually nonsense results if the order is changed (one can't drink the beer before the bottle cap is removed)." (ibid, p. 11) Relations like 'consists of,' 'part of,' 'cause of,' 'like' are handled as complex relations and composed from the primitive relations M and T using the logical operators not (-), and (.), or (v), and particular lexical items. For the 'part of' relation he gives the example "the thumb is a part of the hand" (ibid, pp. 50, 51). [diagram: (thumb) linked by T to a node built from [hand] and (part)] This diagram essentially says "the thumb is a (kind of) hand-part." This is an extremely elegant and general theory. Werner's claim of linguistic universality seems well-founded. His model is in many ways intuitively appealing although we are not convinced that our basic lexical relations and our basic semantic relations are the same. Our decision to try to design a lexicon with a larger set of lexical relations is really an engineering decision, based on two probably temporary practical difficulties.</Paragraph> <Paragraph position="6"> (i) We do not know how to prove theorems in Werner's model.</Paragraph> <Paragraph position="7"> (ii) We believe that a variety of language specific lexical relations can produce a more compact lexicon with more efficient search routines.</Paragraph> <Paragraph position="8"> Raphael's Semantic Information Retrieval program (1968) combined a semantic net representation with a relational calculus which makes inferences in this net. SIR inputs simple English sentences, translates them into node-relation-node form, uses a relational calculus to prove theorems, asks for more information if needed, and answers questions using those inferences. The relations which Raphael used are: x ⊂ y (An x is a y, e.g. A boy is a person.) x ∈ y (x is a y, e.g. John is a person.) equiv[x;y] (x and y are two names for the same thing.) owng[x;y] (Every y owns an x.) own[x;y] (y owns an x.) partg[x;y] (Some x is part of every y.) part[x;y] (An x is part of y.) right[x;y] (x is to the right of y.) jright[x;y] (x is just to the right of y.) (ibid, p. 92) Each relation R has an inverse R'. If aRb then the pair (R,b) is stored on the property list of a and (R',a) is stored on the property list of b. For each relation there are axioms. Further axioms describe how different relations interact. For instance, the set inclusion relation has the following property: x ⊂ y ∧ y ⊂ z → x ⊂ z, i.e., set inclusion is transitive. The interaction between set inclusion and partg is expressed by the axiom partg[x;y] ∧ z ⊂ y → partg[x;z]. In other words, if an x is part of a y and a z is a y then an x is part of a z. For example, if you know that mammals have hair and that whales are mammals, then you know that whales have hair. Some of Raphael's relations represent particular information, some represent generic information. It is the generic relations which correspond to the kind of lexical relations we are working with: set inclusion, equiv, partg, and owng.</Paragraph> <Paragraph position="9"> Apresyan, Zolkovsky, and Mel'cuk.</Paragraph> <Paragraph position="10"> The Explanatory-Combinatory Dictionary of Apresyan, Mel'cuk, and Zolkovsky (1970) contains a wide variety of lexical relations. Whenever they notice a lexical regularity, they invent a lexical relation to express it.
Their paper contains about fifty relations and they outline ways of combining the given relations to get still more. Many of these relations appear in an earlier paper by Zolkovsky and Mel'cuk, which emphasizes the importance of specifying the grammatical transformations associated with each lexical pairing. Suppose a story says: The prince's gift of a magic apple to Zamiya dismayed his mother.</Paragraph> <Paragraph position="11"> In order to represent this correctly or answer a question like "What did the prince give Zamiya?" the system needs to know not only the lexical relation between give and gift but also the transformation which carries one string into another. In this lexicon the accompanying transformation will be indicated in the lexical entry for the relation, not in the lexical entry for the particular words give and gift. Most of the relations given in these two papers are as appropriate in English as in Russian. Some, although appropriate in English, embody more sophistication than seems necessary in this project. The Soviet collection of relations is open-ended.</Paragraph> <Paragraph position="12"> They expect to identify more in further lexicographical work and to discover further properties of the relations already identified. This seems highly intuitive. It is probably the case that people go on expanding their repertoires of lexical relations and learning their properties and that this learning continues to a much greater age than the acquisition of syntax. Lexical relations can be added to our lexicon just by adding a lexical entry for the relation. At this point the actual addition of entries can only be done by internal manipulation. Eventually it would be preferable for the system to "learn" such relations or at least accept them in English form. The authors refer to their relations as functions and the examples are written in functional notation: Figur(passion) = flame; Anti(beautiful) = plain, ugly. Since these functions are definitely not single-valued, we have used the term lexical relation, in deference to the mathematical conventions.</Paragraph> </Section> <Section position="8" start_page="33" end_page="41" type="metho"> <SectionTitle> 4. THE SET OF LEXICAL RELATIONS </SectionTitle> <Paragraph position="0"> The research reviewed above and our own experience with children's stories has led us to posit nine major categories of relations (see the table below).</Paragraph> <Section position="1" start_page="33" end_page="41" type="sub_section"> <Paragraph position="0"> Many of the relations themselves seemed to share some commonality, usually semantic, and so it became natural to group them into sets of categories. Our category list begins with the more familiar and classical relations of synonymy and taxonomy, and presents an expanded sub-categorization within antonymy. The grading category includes a somewhat diverse collection of three relations. The attribute relations and the part-whole category seem firmly motivated. The next two categories consist of co-occurrence or collocational relations. The last two groups of relations are paradigmatic in nature. The set of relations presented here is by no means complete. Indeed, it is deliberately open-ended. Whenever a new lexical regularity is seen in the data, a new relation is added. In order to make the system of relations extensible, therefore, a separate lexical entry has been constructed for each relation containing its special properties and associated axiom schemes.
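A minimal sketch, with invented field names, of this idea that each relation is itself a lexical entry carrying its properties and an axiom scheme, so that adding a relation means adding one entry. The CHILD scheme and the string-based instantiation are our illustrative assumptions; the T scheme follows the paper's Holds(Ncom(dog,X)) -> Holds(Ncom(animal,X)) reading.

    RELATION_ENTRIES = {
        "T": {
            "properties": {"transitive": True, "reflexive": False},
            "axiom_scheme": "Holds(Ncom({x},X)) -> Holds(Ncom({y},X))",
        },
        "CHILD": {   # assumed scheme: a young y is still a y
            "properties": {"transitive": False},
            "axiom_scheme": "Holds(Ncom({x},X)) -> Holds(Ncom({y},X)) & Young(X)",
        },
    }

    def axiom_for(relation, x, y):
        """Instantiate the relation's axiom scheme for the word pair (x, y)."""
        return RELATION_ENTRIES[relation]["axiom_scheme"].format(x=x, y=y)

    print(axiom_for("T", "dog", "animal"))
    # Holds(Ncom(dog,X)) -> Holds(Ncom(animal,X))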
(Examples of this appear below, for example, in section d. In addition, definitions of properties, such as transitivity, and a discussion of their use in this system can be found in Appendix II.)</Paragraph> <Paragraph position="2"> There are several arguments for this methodology. Primarily we are convinced that lexical relations do not constitute a fixed set of language-universal semantic primes. We also feel that we have not yet discovered the most appropriate collection for our own use. In addition we hope to model the acquisition of relations at some later point. And finally, we are attempting to introduce some modularity of design into a difficult programming project. The full set of relations, with examples, is:
a. Classical Relations: 1. T taxonomy (lion T animal); 2. S synonymy (amusing S funny).
b. Antonymy: 1. COMP complementarity (single COMP married); 2. ANTI antonymy (hot ANTI cold); 3. CONV converseness (to buy CONV (3-2-1-4) to sell); 4. RECK reciprocal kinship (husband RECK wife).
c. Grading: 1. Q queuing (Monday Q Tuesday); 2. SET set-element (flock SET sheep); 3. STAGE manifestation (ice STAGE water).
d. Attribute Relations: 1. MALE male - unmarked term (drake MALE duck); 2. FEMALE female - unmarked term (lioness FEMALE lion); 3. CHILD juvenile - parent (calf CHILD cow); 4. HOME habitat - object (Africa HOME lion); 5. SON characteristic sound - animal (bark SON dog); 6. MADEOF substance (ski MADEOF wood).
e. Parts and Wholes: 1. PART part - whole (horn PART cow); 2. CAP head - organization (chief CAP tribe); 3. EQUIP personnel - object (crew EQUIP ship); 4. PIECE count - mass (lump PIECE sugar); 5. COMESFROM provenience (milk COMESFROM cow).
f. Typical-Case Relations: 1. TAGENT typical agent (conqueror TAGENT to conquer); 2. TOBJECT typical object (dinner TOBJECT to dine); 3. TRESULT typical result (hole TRESULT to dig); 4. TCAGENT typical counter-agent (loser TCAGENT to beat2); 5. TINST typical instrument (needle TINST to sew); 6. TSOURCE typical source (earth TSOURCE to sprout); 7. TEXPER typical experiencer (lover TEXPER to love); 8. TLOC typical location (kitchen TLOC to bake2).
g. Other Collocation Relations: 1. COPUL special copula verb (to fall COPUL victim); 2. LIQU destroying verb (to erase LIQU mistake); 3. PREPAR verb which means prepare (to lay PREPAR table); 4. DEGRAD verb to deteriorate (to decay DEGRAD teeth); 5. INC increase verb / DEC decrease verb (to grow INC child); 6. PREPOS preposition (on PREPOS list).
h. Paradigmatic Relations: 1. CAUSE cause - thing or action effected; 2. BECOME become + adj - be + predicate; 3. BE be + predicate (to neighbor BE near); 4. NOMV process noun - verb (death NOMV to die); 5. ADJN noun - adjective (soap ADJN soapy); 6. ABLE, used with case relations (legible OBJECT*ABLE to read); 7. IMPER irregular imperative (go ahead! IMPER to talk).
i. Inflectional Relations: 1. PAST past tense - infinitive (went PAST to go); 2. PP past participle - infinitive (gone PP to go); 3. PLURAL plural - singular (men PLURAL man).</Paragraph> <Paragraph position="5"> a. The Classical Relations: Taxonomy and Synonymy. Aristotle demanded that every definition begin with the statement of the genus to which the term belonged. The genus now is called a superordinate taxon and the relation between the term and its genus is labelled as the taxonomy relation. Even today commercial lexicographers following the classical tradition use taxonomy along with synonymy as the fundamental relations. These relations also have played an essential part in attempts at question-answering. In Raphael's (1968) system they appear as set inclusion and equivalence. In Simmons' (1973) system they are called IMPLY and EQ. The inference-making scheme in Marx's question-answering system is based on these two relations. For example, one of his test paragraphs says that a dog is brown and the question asks, &quot;Is the animal brown?&quot; (Marx 1972:224). A dictionary lookup of dog finds the taxonomic relationship between dog and animal. Animal is substituted for dog and the two sides match. Marx uses synonymy in the same way. Suppose the text says &quot;John wants money&quot; and a question asks &quot;Does John desire money?&quot; (1972:229). A dictionary lookup finds that desire is a synonym of want. The substitution of one for the other results in a successful pattern match.
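Marx's substitution step can be sketched in a few lines. This is an illustrative reconstruction assuming a flat store of T and S pairs, not the original program.

    T_PAIRS = {("dog", "animal")}                       # dog T animal
    S_PAIRS = {("want", "desire"), ("desire", "want")}  # rough synonymy

    def match(story_word, question_word):
        return (story_word == question_word
                or (story_word, question_word) in T_PAIRS
                or (story_word, question_word) in S_PAIRS)

    print(match("dog", "animal"))    # "Is the animal brown?"   -> yes
    print(match("want", "desire"))   # "Does John desire money?" -> yes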
1.) Taxonomy. The taxonomy relation T is expressed in many ways in English; perhaps &quot;is a kind of&quot; is the most typical: A dog is a kind of animal. A dog is an animal. Dogs are animals.</Paragraph> <Paragraph position="10"> The notation dog T animal is used to state this relationship. In the lexicon it is represented by an edge from the dog entry to the animal entry labelled T. Werner's work on the taxonomy relation in memory models has shown that this relation plays a crucial role in lexical theory as well as in practical question answering. He has discussed the theoretical aspects of the taxonomy relation at length (Werner 1969, 1972, 1973, Perchonock and Werner 1969, Werner and Fenton 1970) and has used it in several studies (Werner and Begishe 1969, 1970).</Paragraph> <Paragraph position="11"> Casagrande and Hale (1967) and Raphael (1968) use the name inclusion for this relation. It is certainly related to set inclusion. If A T B then the set of objects named by A, the extension of A, is a subset of the set of objects named by B, the extension of B. The set of dogs is a subset of the set of animals. If we look instead at the intensions of A and B, the sets of attributes implied by the terms, we again find a set inclusion relationship but in the other direction. If A T B then the intension of A includes the intension of B. The characteristics that let us identify an object as a dog include the characteristics that make it an animal. Because of the possible confusion about the direction of the inclusion relation, it seemed like a good idea to use another name. The term taxonomy is the natural choice since it is now well-known in anthropology.</Paragraph> <Paragraph position="12"> 2.) Synonymy. The synonymy relation poses some difficult philosophical problems. Do two words ever have the same meaning, or are there always differences? What criteria can be used to decide whether two words are synonymous? Apresyan, Zolkovsky and Mel'cuk (1970:5) have attempted to state a precise criterion: the two words should be semantically substitutable for each other; the meaning of one should be expressible through the other in any context.</Paragraph> <Paragraph position="13"> But this criterion substitutes one problem for another. How can one tell whether such a substitution is successful, whether the resulting sentences have the same meaning? It can be argued that different sentence forms exist precisely in order to allow the expression of differences in meaning. However impossible it may be to define synonymy precisely, this concept is used daily in ordinary discourse. Dictionary writers use it constantly. To simplify matters it is assumed here that the synonymy relation holds between two words whenever any of the dictionaries in the bibliography defines one as the other. This should be read as &quot;rough synonymy&quot; or approximate synonymy.
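The interpretation of dog T animal as the implication from Holds(Ncom(dog,X)) to Holds(Ncom(animal,X)), with T transitive, can be sketched as follows. The single-parent taxonomy and all names below are simplifying assumptions.

    TAXONOMY = {"puppy": "dog", "dog": "animal", "animal": "creature"}

    def taxa(word):
        # every superordinate taxon reachable through T
        while word in TAXONOMY:
            word = TAXONOMY[word]
            yield word

    FACTS = {("dog", "X1")}                  # Holds(Ncom(dog,X1))

    def ncom_holds(noun, x):
        if (noun, x) in FACTS:
            return True
        return any(noun in taxa(n) for (n, y) in FACTS if y == x)

    print(ncom_holds("animal", "X1"))        # True
    print(ncom_holds("creature", "X1"))      # True, since T is transitive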
b. Antonymy. Antonymy has long been recognized as a lexical relation. Webster's New Collegiate Dictionary, for example, regularly lists antonyms. Its definition of cold includes &quot;Ant. hot&quot; (1951:161). (The definition of hot, although it mentions cold, does not include &quot;Ant. cold.&quot;) The same dictionary defines antonym as &quot;A word so opposed in meaning to another word that it negates or nullifies every single one of its implications.&quot; It is true that antonymy indicates some important facts about implications, and these need to be captured, but it is not true that antonymy involves negating every proposition in sight. The problem is that there are many kinds of oppositeness of meaning.</Paragraph> <Paragraph position="15"> We have found four separate lexical relations which correspond to separate subcategories of antonymy: complementarity, antonymy proper, converseness, and reciprocal kinship.</Paragraph> <Paragraph position="16"> 1.) Complementarity, isolated by Lyons (1968), is the kind of oppositeness that holds between single and married or male and female. The denial of one implies the assertion of the other; the assertion of one implies the denial of the other.</Paragraph> <Paragraph position="17"> If John is married, then John is not single. If John is not married, then John is single. If John is single, then John is not married. If John is not single, then John is married.</Paragraph> <Paragraph position="18"> This kind of relation seems to hold primarily between two adjectives or two adverbs belonging to the same primitive concept. If we set up a lexical relation COMP, then the appropriate axiom schemes seem to be, for the case where Adj1 COMP Adj2: if Z2, looked at along dimension Z1, has property Adj1, then it also has the property Not(Adj2) and vice versa. In the notation used for the semantic representations in the question-answering system this is stated: Holds(P(Z1,Z2,Adj1)) ↔ Holds(P(Z1,Z2,Not(Adj2))). If, on the other hand, it has the property Not(Adj1), then it also has the property Adj2 and vice versa: Holds(P(Z1,Z2,Not(Adj1))) ↔ Holds(P(Z1,Z2,Adj2)) (and similarly for adverbs). COMP is a symmetric relation. If A COMP B, then B COMP A. In other words it is its own inverse. In this lexicon if A is marked COMP B, then B is marked COMP A, and so inferences are available in both directions. Anything marriageable is either married or single, not both; if one term applies, the other must not. 2.) Antonymy. Lyons restricts the term antonymy to the situation where the assertion of one implies the denial of the other, but the denial of one does not imply the assertion of the other. Red and green are antonyms in this sense. If X is red, it is not green. On the other hand, if X is not red it does not have to be green. It could be blue or yellow instead. Hot and cold behave in the same way. If X is hot then it is not cold, but if X is not hot we do not know for sure that it is cold; it may just be lukewarm. We set up a lexical relation ANTI to express this kind of antonymy. Again it applies particularly to adjectives and adverbs belonging to the same primitive concept. The lexical entry for ANTI gives an appropriate axiom scheme for the case in which Adj1 ANTI Adj2: if Z2 is Adj1 then it is not Adj2. Holds(P(Z1,Z2,Adj1)) → Holds(P(Z1,Z2,Not(Adj2))) (similarly for adverbs.)
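The difference between the two schemes is just the direction of the implication. The sketch below instantiates both for concrete adjective pairs; the string form of the axioms follows the notation above, but the functions themselves are illustrative only.

    def comp_axioms(adj1, adj2):
        # COMP is biconditional and symmetric
        return [f"Holds(P(Z1,Z2,{adj1})) <-> Holds(P(Z1,Z2,Not({adj2})))",
                f"Holds(P(Z1,Z2,{adj2})) <-> Holds(P(Z1,Z2,Not({adj1})))"]

    def anti_axioms(adj1, adj2):
        # ANTI licenses only the denial of the opposite term
        return [f"Holds(P(Z1,Z2,{adj1})) -> Holds(P(Z1,Z2,Not({adj2})))",
                f"Holds(P(Z1,Z2,{adj2})) -> Holds(P(Z1,Z2,Not({adj1})))"]

    for axiom in comp_axioms("married", "single") + anti_axioms("hot", "cold"):
        print(axiom)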
Verbs may be included in this kind of antonymy.</Paragraph> <Paragraph position="25"> Consider the pairs love-hate and open-shut. For a child, at least, &quot;X loves Y&quot; may imply &quot;X does not hate Y.&quot; The appropriate axiom scheme for verb1 ANTI verb2 would be: if a simple sentence containing verb1 is true, then the negation is true when verb2 is substituted for verb1: Holds(R(verb1,Z1,Z2,Z3,Z4)) → ¬Holds(R(verb2,Z1,Z2,Z3,Z4)). Since such verb pairs do not appear in our examples such problematical inferences have been avoided.</Paragraph> <Paragraph position="26"> There are some important semantic realities here which are not being captured. There is a set of incompatible color terms: red, orange, yellow, green, blue, purple, brown, black, white. One can describe any small area of a physical object in one of these terms if it is forbidden to use hedges like turquoise and pink. Hot and cold, like big and small, are opposite ends of a scale. Between hot and cold, warm and cool can be placed somewhere. Binary lexical relations are not adequate here. Perhaps developments in the theory of fuzzy sets will eventually provide a better description.</Paragraph> <Paragraph position="27"> There are logical problems here too. If the story says the toy is red, then we want to answer &quot;no&quot; to the question &quot;Is the toy green?&quot; But toys can be both red and green in spots, patches, or stripes.</Paragraph> </Section> </Section> <Section position="9" start_page="41" end_page="49" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> If the story says that the toy is red and green, we do not want to get lost in a self-contradiction.</Paragraph> <Paragraph position="1"> Adjectives which imply grading (cf. section c below) involve potential self-contradictions of a slightly different kind. Lyons discusses the sentence &quot;A small elephant is a large animal.&quot; The current representation for that sentence in our system would be: Ncom(elephant,X1) ∧ Ncom(animal,X1); P1: P(size,X1,small); P2: P(size,X1,large).</Paragraph> <Paragraph position="3"> For more details see the section on semantic representations. But small ANTI large, so we must conclude from P1 that P(size,X1,Not(large)). The problem is that when we call something a small elephant we imply a comparison with some norm for elephants. However, this comparison does not appear in our representation. (This difficulty has also been discussed in the literature.) 3.) Converseness. This is Lyons' name for a third kind of antonymy. As examples he gives the pairs buy-sell and husband-wife. This kind of oppositeness does not seem to involve negation at all. Rather it involves some kind of permutation of the associated individuals. Dale calls this relation reciprocity and explains it this way: Buy and sell are reciprocals, as are give and receive. What distinguishes these from antonyms (which they are, in a sense) is that whenever a sentence using one of them is appropriate, there is another appropriate sentence using the other member of the pair. For example, John buys books from Bill has the same meaning as Bill sells books to John. He gave flowers to her has the same meaning as She received flowers from him. This is a sort of &quot;semantic passive&quot;--like the passive transformation in syntax, it presents the same meaning from a different point of view. (1972:144)
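The permutation idea can be applied mechanically. The sketch below, with invented argument tuples, shows how a (3-2-1-4) permutation turns the arguments of buy into the arguments of sell.

    def converse(verb2, permutation, args):
        # permutation[i] names the argument of verb1 (1-indexed) that
        # fills slot i of verb2
        return (verb2,) + tuple(args[i - 1] for i in permutation)

    # John(1) buys books(2) from Bill(3) for five dollars(4)
    print(converse("sell", (3, 2, 1, 4), ("John", "books", "Bill", "$5")))
    # -> ('sell', 'Bill', 'books', 'John', '$5'): Bill sells books to John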
Whether Dale's sentences have exactly the same meaning or not is debatable, but anyone would agree that one implies the other. What is needed is some compact way to indicate what these other appropriate sentences are and to derive them when they are needed. Zolkovsky and Mel'cuk (1970) have a clever way of doing this for verbs. They use a notation of the form CONV(3-2-1-4), in which the numbers show how the arguments of one verb must be permuted to give the arguments of the other. We have borrowed this notation, applying it to cases rather than subjects and objects.</Paragraph> <Paragraph position="6"> It is interesting that the Soviets include regular syntactic passives in their discussion of this relation. Since in this system inferences are made on the basis of the fully formed semantic representations from which passives have been eliminated, they need not be included here.</Paragraph> <Paragraph position="7"> 4.) Reciprocal Kinship. If we had followed Reichenbach (1966) in treating kinship relations as functions of several arguments then we could have used CONV for pairs like husband-wife also. Since kinship and social relationships like teacher-student are expressed in terms of have, however, it makes sense to posit a new relation RECK for RECiprocal Kinship and other social terms.</Paragraph> <Paragraph position="8"> Husband and wife relationships are represented this way: Len is Martha's husband, R(have,X1,X2,husband); Martha is Len's wife, R(have,X2,X1,wife). We want to be able to derive one of these sentences from the other, using the lexical information husband RECK wife, i.e. if X1 has X2 as husband then X2 has X1 as wife. The axiom scheme for A RECK B says that if X1 has X2 as A, then X2 has X1 as B: Holds(R(have,X1,X2,A)) → Holds(R(have,X2,X1,B)).</Paragraph> <Paragraph position="11"> Other kinds of converseness or reciprocity have not occurred often enough to warrant a separate relation and a separate axiom scheme. They are entered as individual inferences in each entry. Antonymy seems to be a highly diverse lexical concept. With further study it may spawn still more lexical relations. c. Grading. Grading relations, like antonymy relations, involve alternatives of some kind. Graded alternatives appear to be organized in lists or other kinds of formal structures. Our collection of grading relations is in a state of flux; many aspects of grading are still not properly defined. 1.) Queuing. The notation Q is borrowed from Werner but used in a very restricted sense to connect adjacent items on lists, as in Monday Q Tuesday. It could be read &quot;is immediately followed by.&quot; 2.) Set-element. SET relates the name for the set to the name of the elements, e.g. flock SET sheep. This is the relation which the Soviets call Mult. This relation seems to be particularly well-founded psychologically, for English has many special words of this type - pride of lions, bevy of maidens, gaggle of geese - and it is certainly a source of word-play. 3.) Manifestation. By contrast the STAGE relation, as in ice STAGE water, seems very shaky. The axiom schemes are not satisfactory and some of the territory is covered by the CHILD relation described in the section on attribute relations.</Paragraph> <Paragraph position="19"> There seems to be a gap in our collection here. We have no parallel to the comparison relation of Casagrande and Hale (1967). Of course in the most common type of examples, where the items related are taxonomic brothers, or cohyponyms as they are sometimes called, the comparison relation can be expressed by a combination of T and its inverse T̄. Recent work by Litowitz (1977) suggests that comparisons are an important component of the defining strategy of children. The boundary between the grading relations and the attribute relations described in the next section is also uncomfortably arbitrary.
d. Attribute Relations. According to Casagrande and Hale (1967:168), whenever &quot;X is defined with respect to one or more distinctive or characteristic attributes Y&quot;, a definition is &quot;attributive&quot;. Given this all-inclusive description it is not surprising that the attributive category was the largest in their sample. They propose several subcategories including stimulus properties like size and color, distinctive markers, habitat, behavior, sex, generation, and line of descent. But in order to facilitate inference we need to associate axiom schemes with each relation. Thus we have broken these subcategories into still more precise relations.</Paragraph> <Paragraph position="21"> 1.) Male. The relation MALE, as in drake MALE duck, relates the masculine to the unmarked term. We want to be able to infer that if something is a drake, then it is a duck and it is male, i.e. Holds(Ncom(drake,X)) → Holds(Ncom(duck,X)) ∧ Holds(P(sex,X,male)).</Paragraph> <Paragraph position="22"> This axiom can be derived when needed from an axiom scheme in the lexical entry for MALE which says that whenever ZN1 MALE ZN2, then a ZN1 is also a ZN2 and it is male; i.e., Holds(Ncom(ZN1,X)) → Holds(Ncom(ZN2,X)) ∧ Holds(P(sex,X,male)).</Paragraph> <Paragraph position="24"> 2.) Female. Similarly, FEMALE, as in lioness FEMALE lion, relates the name of the female to the unmarked term. 3.) Terms for juveniles. The most common attribute relation in our vocabulary is CHILD, which relates the term for the offspring to the term for its parent, as in puppy CHILD dog, kitten CHILD cat, lamb CHILD sheep. The lexical entry for CHILD contains the axiom scheme Holds(Ncom(ZN1,X)) → Holds(Ncom(ZN2,X)) ∧ Holds(P(age,X,young)).</Paragraph> <Paragraph position="26"> When puppy and dog have been substituted for ZN1 and ZN2 respectively we get an axiom that tells us that if Z1 is a puppy then Z1 is a dog and Z1 is young.</Paragraph> </Section> <Section position="10" start_page="49" end_page="60" type="metho"> <SectionTitle> 4.) Habitat. </SectionTitle> <Paragraph position="0"> The habitat relation we have called HOME, so that Africa HOME lion.</Paragraph> <Paragraph position="1"> 5.) Characteristic Sound. The relation SON was borrowed from the Soviets. SON relates an object and the verb expressing the kind of sound it produces: to bark SON dog, to roar SON lion, to meow SON cat, to choo-choo SON train. This relation seems to underlie a crucial part of the vocabulary of young children. Why is such a tremendous amount of time spent teaching children words like meow? Was this information once life-preserving, or is it a way of teaching how sound is structured into words, the phonology of the language? For whatever reason, children who never see a farm are carefully taught to associate the sound moo with cows. 6.) Substance. The relation we call MADEOF, as in ski MADEOF wood, relates an object to the substance of which it is made. Casagrande and Hale classify as provenience both batea: &quot;which is made out of mesquite&quot; and milk: &quot;we get it from a cow&quot; (1967:184). Since in English these relationships are expressed in different ways, for example, the ski is made of wood - wooden ski, but milk comes from a cow - cow's milk, and since the appropriate inferences are different (the milk was once in the cow but the ski was not in the wood), we chose to classify them separately. As the vocabulary expands we expect the list of attribute relations to expand. Litowitz (1977) is currently collecting definitions from children and isolating further relations. Smith and Maxwell (1977) have identified certain attribute relations which occur repeatedly in defining formulae in Webster's Seventh: COLOR, TIME, LOCATION, SIZE, and QUALITY. These relations, among others, will be added eventually to our lexicon.
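Since every attribute relation carries its axiom schemes in its own lexical entry, instantiation is simple substitution. A sketch, assuming string-valued schemes:

    SCHEMES = {
        "MALE":  ["Holds(Ncom(ZN1,X)) -> Holds(Ncom(ZN2,X))",
                  "Holds(Ncom(ZN1,X)) -> Holds(P(sex,X,male))"],
        "CHILD": ["Holds(Ncom(ZN1,X)) -> Holds(Ncom(ZN2,X))",
                  "Holds(Ncom(ZN1,X)) -> Holds(P(age,X,young))"],
    }

    def instantiate(relation, zn1, zn2):
        return [s.replace("ZN1", zn1).replace("ZN2", zn2)
                for s in SCHEMES[relation]]

    print(instantiate("MALE", "drake", "duck"))
    print(instantiate("CHILD", "puppy", "dog"))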
e. Parts and Wholes.</Paragraph> <Paragraph position="3"> 1.) Part-Whole. The relation which links finger to hand and carburetor to car we call PART: finger PART hand, carburetor PART car. The PART relation seems to be crucial in the definition of many everyday objects. While it is clearly important in computer models of memory, it seems hard to isolate from natural English sentences. Raphael's (1968) SIR model used some subtle heuristics to determine whether a particular instance of the verb have should be represented by the part relation or the own relation. Sometimes dialog with a human is necessary to resolve the ambiguity. Simmons (1973) recognizes a three-way ambiguity in have which is represented variously as HASPART, POSSess, and ASSOC (1973:76): Mary has long fingers - HASPART; Mary has money - POSSess; Mary has fun in the park - ASSOC. Apparently the part-whole relation is hard to identify in Papago also. Casagrande and Hale do not find it in their Papago sample. They classify as exemplification definitions which are translated into English as &quot;cows have horns&quot; and &quot;horses have tails.&quot; However, on the basis of intuition and the word-association data of Russell and Jenkins (1954) they posit a fourteenth relation (1967:191): Constituent: X is defined as being a constituent or part of Y. The example given is cheek-face. Apresyan, Mel'cuk and Zolkovsky do not have an explicit part-whole relation but they do include two relations in this same area. We have borrowed both. 2.) Head-Organization. CAP relates the head to the organization: chief CAP tribe. 3.) Personnel-Object. EQUIP relates the associated staff to the organization or object they serve: crew EQUIP ship. 4.) Count-Mass. The relation PIECE which carves a countable chunk out of a mass also belongs to the part-whole family. For example, lump PIECE sugar, item PIECE news. Jespersen was intrigued by this mechanism, which he named individualization (1933:209); he discovered and listed many such examples. This seems to be the relation which the ECD calls SING (Apresyan et al., 1970:11). 5.) Provenience. We include here also the relation COMESFROM, as in milk COMESFROM cow. This is one aspect of the relation which Casagrande and Hale (1967) call provenience. (It should possibly be listed as an attribute relation along with its close cousin MADEOF.) Our current lexicon contains only two axioms for the part-whole relation. One is transitivity: if X PART Y and Y PART Z, then X PART Z. The other, borrowed from Raphael, connects PART and taxonomy. Essentially it says that if all X's are Y's and all Y's have Z's as parts, then all X's also have Z's as parts. There is an extensive philosophical literature involving this relation.
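The two part-whole axioms - transitivity and the interaction with taxonomy - can be sketched as a recursive test. The stores and examples below are illustrative, and the PART graph is assumed acyclic.

    PART = {("finger", "hand"), ("hand", "arm"), ("hair", "mammal")}
    TAX = {("whale", "mammal")}

    def part_of(x, z):
        if (x, z) in PART:
            return True
        wholes = {w for (_, w) in PART}
        if any((x, y) in PART and part_of(y, z) for y in wholes):
            return True                    # X PART Y, Y PART Z -> X PART Z
        # all Z's are Y's and Y's have X's as parts -> Z's have X's as parts
        return any((z, y) in TAX and part_of(x, y) for (_, y) in TAX)

    print(part_of("finger", "arm"))   # True, by transitivity
    print(part_of("hair", "whale"))   # whales are mammals; mammals have hair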
Martin (1971) presents a system of axioms for part-whole and a review of work by Lesniewski, Woodger, and Tarski. f. Typical-Case Relations. Casagrande and Hale discovered that certain familiar objects, body parts, foods, tools, and other objects of material culture were most often defined not by the relations discussed above but rather by their use in daily life, by common activities associated with them. For example, under the &quot;function&quot; relation they classify examples in which &quot;X is defined as the means of effecting Y&quot;, such as eye: &quot;...with which we see things&quot;; money: &quot;...we buy things with it&quot; (1967:175). The &quot;operational&quot; class includes examples in which X is defined as &quot;the characteristic goal or recipient&quot; of action Y: bridle: &quot;...which they put on horses&quot; (1967:178). What they call the &quot;spatial&quot; relation also seems to be of this same type: grindstone: &quot;...on which a knife is sharpened&quot; (1967:177). Folk definitions collected from speakers of English often are of this variety, sometimes combined with taxonomy, e.g. &quot;a house is a building in which people reside&quot; (Evens 1975:340). Children in particular seem to prefer functional definitions (cf. Ruth Krauss' collection of children's definitions, A Hole is to Dig, 1952).</Paragraph> <Paragraph position="8"> Apresyan, Mel'cuk, and Zolkovsky's system includes a family of functions S1, S2, S3, S4 which relate nouns and verbs or adjectives. Their semantic structures are based on grammatical relations. For verbs these are a subject relation, a direct object relation, and two kinds of indirect object relations.</Paragraph> <Paragraph position="9"> The functions S1, S2, S3, and S4 correspond to these grammatical relations. S1 relates the verb to its generic subject. S2 relates the verb to its generic direct object, etc. S4(to sell)=price (that for which the goods are sold). The ECD also contains four other substantive relations (1970:11). The values are nouns. The arguments can apparently be verbs, adjectives or nouns. First is Smod, which gives the noun denoting the mode of action: Smod(to write)=handwriting. Sloc gives the noun denoting the place of the argument: Sloc(action)=scene. Sinstr gives the noun denoting the instrument: Sinstr(communication)=means, Sinstr(to think)=brain. Sres gives the noun denoting the result: Sres(to hunt)=bag.</Paragraph> <Paragraph position="12"> Since the semantic representations in the question-answering system are structured in terms of cases rather than grammatical relations, we have set up a group of &quot;typical-case&quot; relations, one for each case relation in our case system.</Paragraph> <Paragraph position="13"> The typical-case relation relates the verb to typical fillers of that case argument slot.</Paragraph> <Paragraph position="14"> Thus, corresponding to the semantic relation AGENT we have a lexical relation TAGENT. The fact that someone who bakes can be called a baker is expressed in our lexicon: baker TAGENT to bake. The stuff that you eat is usually called food; food TOBJECT to eat. The result of digging is usually a hole; hole TRESULT to dig. When the Cubs beat the Cardinals the Cardinals are the losers; loser TCAGENT to beat2. The thing you sew with is called a needle; needle TINST to sew. (This is the Casagrande and Hale operational relation.)
Most plants sprout from earth: earth TSOURCE to sprout. One who loves is called a lover; lover TEXPER to love. People usually bake cakes in a kitchen; kitchen TLOC to bake2. It should be noticed that the relation TLOC bears a close resemblance to HOME, which gives the typical habitat for an animal or other object. The Soviet relation Sloc seems to include both. It is not clear that semantic theory can justify using two relations here. We have made a distinction because our system of semantic representations treats nouns and verbs differently, so that the associated axiom schemes for TLOC and HOME are formally different. It would be possible to use only one relation and test the argument for part of speech before choosing an axiom scheme. Perhaps the real problem is in the system of semantic representations.</Paragraph> <Paragraph position="16"> This particular choice of lexical relations is based on the particular case system being used. We claim, however, that the same basic scheme would be effective for a lexicon functioning with a different system of semantic representations based on any other set of case or grammatical relations. This is so since in this scheme, corresponding to each semantic relation in the semantic representation, there is a lexical relation in the lexicon relating verbs and typical fillers of argument slots.</Paragraph> <Paragraph position="17"> g. Other Collocation Relations. The relations in this group, like the typical case relations examined in the preceding section, are basically cooccurrence relations. They connect words which cooccur constantly and point to words which have special meanings in particular contexts. This is an important part of the lexical knowledge of the native speaker, often neglected in dictionaries.</Paragraph> <Paragraph position="19"> Most of our relations in this group are borrowed from the Soviet lexicographers: COPUL, LIQU, PREPAR, DEGRAD.</Paragraph> <Paragraph position="20"> 1.) Special Copula Verb. The COPUL relation indicates the correct copula verb for nouns where be/become is not appropriate. For example, to fall is the special copula verb for victim, to fall COPUL victim, as in &quot;Constance fell victim to Louis' charm.&quot; 2.) Destroying Verb. LIQU relates a noun and the verb which means to liquidate or destroy it. This seems to be useful in English as well, and some examples belong to a child's vocabulary: to erase LIQU mistake, to wipe out LIQU traces. 3.) Prepare for use. The relation PREPAR relates a noun and the verb which means to prepare the object, to make it ready for use. This is particularly useful in making deductions about why people are doing things: to lay PREPAR table, to make PREPAR bed, to load PREPAR gun. 4.) Verb to deteriorate. The relation DEGRAD connects nouns and the appropriate verbs meaning to deteriorate, to go bad: to decay DEGRAD teeth, to wear out DEGRAD clothes. 5.) Increase and decrease in activity. The pair of relations INCrease and DECrease connect nouns and special-purpose verbs for increase and decrease: to grow INC child.</Paragraph> <Paragraph position="27"> 6.) Preposition. The relation PREPOS, which the Soviets call LOC, links suitable prepositions to particular nouns. In English things go on lists, not in them. The fact that on is the appropriate preposition for list is recorded as on PREPOS list.
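Collocation relations amount to table lookups from a (relation, noun) pair to the privileged word. A sketch with a hypothetical store:

    COLLOC = {("COPUL", "victim"): "to fall",
              ("LIQU", "mistake"): "to erase",
              ("PREPAR", "table"): "to lay",
              ("DEGRAD", "teeth"): "to decay",
              ("INC", "child"): "to grow",
              ("PREPOS", "list"): "on"}

    def collocate(relation, noun):
        return COLLOC.get((relation, noun))

    print(collocate("PREPAR", "table"))  # 'to lay': to lay the table
    print(collocate("PREPOS", "list"))   # things go ON a list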
These are all the collocational relations that we have observed in our data. Mel'cuk's ECD contains even more collocation relations, but we have not included them because they seem too literary or too sophisticated for the vocabulary of children's stories. For example, Bon (Apresyan, Mel'cuk, and Zolkovsky 1970:13) points to attributes meaning &quot;good&quot;.</Paragraph> <Paragraph position="30"> Both the typical-case relations and the other collocation relations which we have described are syntagmatic relations. They connect words with other words which cooccur frequently in natural language sentences, sometimes with special meanings. We turn now to a group of paradigmatic relations which connect words which express aspects of the same core of meaning as it appears in various contexts or in different parts of speech.</Paragraph> <Paragraph position="31"> h. Paradigmatic Relations. The relations which we have grouped together as paradigmatic relations are highly disparate in kind and importance. CAUSE, BECOME, and NOMV are, we believe, essential to the structure of the English lexicon; ABLE and ADJN seem potentially quite useful. There seem to be very few examples of BE. All except BECOME were influenced by the inventory of Apresyan, Zolkovsky, and Mel'cuk.</Paragraph> <Paragraph position="32"> 1.) Cause. Traditional dictionaries use cause constantly to describe relationships between verbs. Dennison (1972) defines to send as &quot;to cause to go&quot;. Webster's New Collegiate (1951) defines to boil as &quot;to cause to bubble...&quot; (p.96). Schank (1975) treats cause as the most important relation. McCawley (1975c) in discussing to open argues for two lexical entries, open1 for &quot;intransitive&quot; uses: &quot;the door opened&quot;, and open2 for &quot;transitive&quot; uses: &quot;John opened the door.&quot; Open1 and open2 are related by cause: to open2 is to cause to open1. McCawley's formulation will be followed here.</Paragraph> <Paragraph position="33"> The first and longest entry in Webster's New Collegiate Dictionary for open belongs to the adjective. The definition of the intransitive verb begins &quot;to become open&quot;. This suggests a renumbering: open1 - adjective - &quot;the door is open&quot;; open2 - to become open1 - verb intransitive - &quot;the door opens&quot;; open3 - to cause to open2 - verb transitive - &quot;John opens the door&quot;. Open is only one of hundreds of verb-adjective homographs in English. Cool behaves like open. We start with the adjective cool1, &quot;the jello was cool&quot;. The intransitive verb cool2 means &quot;to become cool1&quot;, &quot;the jello cooled in the refrigerator.&quot; The transitive verb cool3 means &quot;to cause to become cool1&quot;, &quot;Jane cooled the jello in the refrigerator.&quot; Other verb-adjective homographs like clean show a different pattern; the intransitive verb is missing: clean1 - adjective - The room was clean; *clean - to become clean1 - *The room cleaned; clean2 - to cause to become clean1 - Jane cleaned the room. Not all verb-adjective pairs are homographs. Modern English retains traces of an old suffix -en which turns adjectives into verbs. To redden is to make or become red.
Sometimes the verb and the adjective are etymologically distant: to age is to become old.</Paragraph> <Paragraph position="35"> We need a lexical relation CAUSE relating send and go, open3 and open2. The appropriate axiom scheme for the case verb1 CAUSE verb2 tells us that if the sentence containing verb1 holds, then so does the sentence containing verb2.</Paragraph> <Paragraph position="36"> 2.) Become. The relation BECOME relates a verb of becoming to the corresponding predicate adjective, as in open2 BECOME open1. The axiom scheme says that if the sentence containing the verb holds, then its subject comes to have the property Adj1, where ZC1 is the primitive concept corresponding to Adj1. (This axiom may conceivably react in uncomfortable ways with tense.) For the moment the relation between clean2 and clean1, the &quot;cause to become&quot; relation, will be compounded from CAUSE and BECOME. It will probably occur often enough to deserve a name of its own, perhaps MAKE.</Paragraph> <Paragraph position="37"> 3.) Be. The relation BE parallels BECOME very closely. While BECOME relates the verb of becoming and the predicate adjective, BE relates the verb of being and the predicate adjective. For example, to neighbor is the verb which means to be near: to neighbor BE near. This is the inverse of the relation which the Soviets call PRED. For some reason it seems to be much less common than BECOME.</Paragraph> <Section position="1" start_page="54" end_page="60" type="sub_section"> <SectionTitle> 4.) Process Noun and Verb. </SectionTitle> <Paragraph position="0"> NOMV relates a process noun and its verb. Death is the nominalization of the verb to die; death NOMV to die. This is the Soviet relation V0 and the inverse of the relation S0. 5.) Noun and Adjective. The relation ADJN relates a noun and the adjective derived from it. It has been suggested that adjectives formed from nouns by adding -y, e.g. sunny, mean &quot;having more than a normal amount of&quot; whatever the noun denotes. Adjectives in -al and -ful may present certain other semantic regularities.</Paragraph> <Paragraph position="2"> 6.) Able. The relation ABLE is used in combination with case relations only: understandable OBJECT*ABLE to understand; literate AGENT*ABLE to read; legible OBJECT*ABLE to read. The Soviet version of this relation has different subcategories - Able1, Able2, Able3, Able4 - to indicate grammatical arguments of the verb. Able1(to burn) = combustible, since combustible things are precisely those which can be subjects of the verb to burn. On the other hand, Able2(to eat) = edible, since edible things are those which can be objects of the verb to eat. Since the semantic representation system in the question-answerer uses cases to connect verbs and arguments, we handle different kinds of ABLEness by combining ABLE with a case.</Paragraph> <Paragraph position="4"> 7.) Irregular imperative. The relation IMPER comes directly from the Soviet inventory. It relates colloquial imperative expressions to the appropriate main verb: fire! IMPER to shoot; go ahead! IMPER to talk. This relation essentially involves very irregular imperatives, and this brings us to the inflectional relations.</Paragraph> <Paragraph position="5"> i. Inflectional Relations. Inflectional relations are dull but useful. Regular noun plurals and verb forms are handled by a suffix-chopping algorithm, but words like men and sang defeat it completely. We get around this difficulty in essentially the same way as some commercial dictionaries do. A separate entry is included for these words. The lexical entry for men consists of PLURAL man. The entry for sang is PAST to sing; for sung we have PP to sing.
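The lookup path for an inflected form - irregular entry first, suffix chopping second - might look like this. The chopping rules here are deliberately crude placeholders.

    IRREGULAR = {"men":  ("PLURAL", "man"),
                 "sang": ("PAST", "to sing"),
                 "sung": ("PP", "to sing"),
                 "went": ("PAST", "to go")}

    def analyze(form):
        if form in IRREGULAR:                  # a separate entry wins
            return IRREGULAR[form]
        if form.endswith("ed"):                # crude suffix-chopping
            return ("PAST", "to " + form[:-2])
        if form.endswith("s"):
            return ("PLURAL", form[:-1])
        return ("BASE", form)

    print(analyze("men"))      # ('PLURAL', 'man')
    print(analyze("walked"))   # ('PAST', 'to walk')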
The axiom-generator for PLURAL changes the number associated with the object if necessary and moves to the main entry to pick up other axiom schemes there.</Paragraph> <Paragraph position="1"> The inflectional relations are, of course, paradigmatic relations, but are grouped separately because of their strong family resemblance and particularly uninteresting nature.</Paragraph> </Section> </Section> <Section position="11" start_page="60" end_page="60" type="metho"> <SectionTitle> 5. THE ORGANIZATION OF THE LEXICON AND THE SEMANTIC REPRESENTATIONS </SectionTitle> <Paragraph position="0"> The lexicon is a large network in which the nodes are lexical entries and the arcs are lexical relations; all the arcs are doubled. To represent the network in the data base, each entry contains a list of attribute-value pairs. Each pair consists of an arc (i.e. a relation name) and the name of the entry at the other end of the arc. Each lexical relation L has an inverse L̄. If entry1 contains the attribute-value pair L-entry2, then entry2 contains L̄-entry1. Each relation also has a lexical entry which gives its properties and also tells how to interpret lexical relationships in the predicate calculus.</Paragraph> <Paragraph position="1"> For example, the entry for dog includes the information dog T animal (dog is taxonomically related to animal). The system uses the information in the lexical entry for T to interpret this as Holds(Ncom(dog,X)) → Holds(Ncom(animal,X)). The lexical entry for T also tells us that T is transitive. The inventory of relations is expandable. To add a relation we need only add a lexical entry. When the meaning of the word cannot be expressed solely in terms of lexical relations, a definition is added to the lexical entry, phrased in the same form as the semantic representations and using the same depth lexis. These lexical semantic relations are written in the same form as semantic representations for sentences. The lexical entry for pet includes the information that a pet is an animal which is owned by a human: Ncom(pet,Z1) → Ncom(animal,Z1) ∧ Ncom(human,Z2) ∧ Holds(R(own,Z2,Z1)). If an individual Z1 is a pet, then Z1 is an animal and is owned by a human Z2. Thus this lexicon is a relational network model with words and lexical relations as semantic primes. Definitions are written using lexical relations and first order predicate calculus formulas. The design of this lexicon is independent of a particular representation scheme and the lexical relations we propose can be equally useful in another context.
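The doubled arcs mean that adding one relation updates two entries. A sketch of that bookkeeping, with an assumed "-bar" naming convention for inverses (a self-inverse relation like COMP would simply map to itself):

    lexicon = {}   # word -> list of (relation, other-entry) pairs

    def inverse(rel):
        return rel[:-4] if rel.endswith("-bar") else rel + "-bar"

    def add_relation(entry1, rel, entry2):
        lexicon.setdefault(entry1, []).append((rel, entry2))
        lexicon.setdefault(entry2, []).append((inverse(rel), entry1))

    add_relation("dog", "T", "animal")
    add_relation("puppy", "CHILD", "dog")
    print(lexicon["dog"])     # [('T', 'animal'), ('CHILD-bar', 'puppy')]
    print(lexicon["animal"])  # [('T-bar', 'dog')]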
Nevertheless an overview of the semantic representations is included here in order to enable the reader to understand the notation in the examples of lexical entries in the next section. Anyone who does not find notational problems attractive should skip these paragraphs; with the exception of a few lines of formal detail, the rest of this paper will make sense without it.</Paragraph> <Paragraph position="2"> An Overview of the System of Semantic Representations. The question-answering system of which this lexicon is a fundamental part uses a first order predicate calculus system of semantic representations. As it reads a paragraph, the system makes an internal model of the story, identifying objects and events and the relationships between them. The representations are written in a first order predicate calculus so that they can be used in an existing theorem prover (Henschen, Overbeek, and Wos 1974). In a first order predicate calculus we are allowed predicates, functions, and quantifiers like &quot;there exists&quot; and &quot;for all&quot;, but predicates are not allowed to be arguments of other predicates. This particular calculus is many-sorted; that is, there are many different classes of objects in the system.</Paragraph> <Paragraph position="3"> Suppose a story begins: Peter heard a meow. Mother said, &quot;The kitten is hungry.&quot; She sent Peter to the store. He bought milk and a big, red lollipop.</Paragraph> <Paragraph position="4"> As we process this story we need first of all to recognize the different entities in the story. Here we have seven individual objects: X1 Peter, X2 the meow, X3 Mother, X4 the kitten, X5 the store, X6 the milk, X7 the lollipop.</Paragraph> <Paragraph position="6"> We can write Ncom(lollipop,X7) to signify that X7 is a lollipop since lollipop is the common noun that names X7. The story mentions two properties of the lollipop; it is big and it is red. The lexicon tells us that red is an adjective of color, so we represent this property using a functional notation P(color,X7,red). Similarly, P(size,X7,big) records the fact that the lollipop is big. These properties are numbered and put on a list for convenient retrieval.</Paragraph> <Paragraph position="8"> This story also tells us some relations between entities. &quot;He bought a lollipop&quot; can be expressed as R(buy,X1,X7), since he refers to X1, Peter, and X7 is the lollipop. The third sentence in the story, She sent Peter to the store, contains a relation R(send,X3,X1) and a property of that relation, P4, giving its destination: P(location,R(send,X3,X1),Prep(to,X5)).</Paragraph> <Paragraph position="10"> The predicate Holds is used to make assertions. To assert the third sentence we write Holds(P4). The connection between the milk and the lollipop in the last sentence is described by an interrelation I, I(and,X6,X7), so that the whole sentence becomes Holds(R(buy,X1,I(and,X6,X7))). (There is a rule to rewrite this later as Holds(R(buy,X1,X6)) ∧ Holds(R(buy,X1,X7)), but it is applied only if some kind of inference is required from this sentence, e.g., if a question asks, &quot;Did Peter buy some milk?&quot;) To obtain these representations we, of course, need a great deal of information from the lexicon (like the information mentioned above that red is an adjective of color and that big is an adjective of size). Lexical information is also used in setting up representations for questions like 1. What color is the lollipop?</Paragraph> <Paragraph position="12"> The answer to this question can be found by a simple matching process because the story representation already contains this kind of lexical information. A question such as 2. Did Peter buy some candy? requires further lexical lookup since the word candy does not appear in the story. The answer is found using the lexical relation T (taxonomy or class inclusion) between lollipop and candy - the entry for lollipop includes T candy. Similarly, the entry for candy includes T̄ lollipop, where T̄ is the inverse or converse relation of T, which relates the same pairs of objects in the opposite order. Likewise a multiple choice question such as 3. Where does milk come from: cats, cows, trees, cars? can be answered correctly using the provenience relation COMESFROM listed in the entries for milk and cow.
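The internal story model is just a growing set of these terms. A sketch, with constructors that build tagged tuples rather than real predicate calculus objects:

    def Ncom(noun, x):     return ("Ncom", noun, x)
    def P(attr, x, value): return ("P", attr, x, value)
    def R(verb, *args):    return ("R", verb) + args
    def I(conj, *args):    return ("I", conj) + args

    model = set()
    def holds(term): model.add(term)      # assert with Holds

    holds(Ncom("lollipop", "X7"))
    holds(P("color", "X7", "red"))
    holds(P("size", "X7", "big"))
    holds(R("buy", "X1", I("and", "X6", "X7")))

    # "What color is the lollipop?" matches the pattern P(color,X7,?)
    print([t for t in model if t[:3] == ("P", "color", "X7")])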
The question 4. Did Peter go to the store? requires inference. The lexicon is then used to look for connections between go and send. The lexical entry for send includes the information CAUSE go. The entry for the lexical relation CAUSE contains several axiom schemes. With send and go substituted in the correct positions we get the axioms we need.</Paragraph> <Paragraph position="15"> In order to answer the question 5. How old is the cat? we must first identify the cat in the story, that is, recognize that a kitten is a cat and then realize that it is a young one. The lexical relation CHILD is essential to this task. The definition of kitten consists of CHILD cat. The lexical entry for cat contains CHILD kitten. The lexical entry for the relation CHILD contains axiom schemes which, when kitten and cat are filled in in the proper places, tell us that if X is a kitten then it is a cat and it is young. That is, if Ncom(kitten,X) then Ncom(cat,X) and P(age,X,young).</Paragraph> <Paragraph position="16"> In addition some questions force us to look at the interaction between two or more lexical relations. To answer the question 6. What animal did Peter hear? we need to know that a meow is a typical cat sound, which is expressed by the lexical relation SON, meow SON cat. We also need to know that a cat is an animal, cat T animal, and that a kitten is a young cat, as above, kitten CHILD cat.</Paragraph> <Paragraph position="17"> This has been an extremely brief introduction to the semantic system used in the question-answering scheme of which this lexicon is a part. For those who are interested in the representations themselves, Appendix I contains a brief formal presentation. A more complete description is in preparation. (M. Evens and G. Krulee, &quot;Semantic Representations for Question-Answering Systems.&quot;) Lexical relations, we are convinced, are an extremely useful addition to any lexicon, whatever the underlying semantic system. The axioms which are associated with each relation, of course, have to be expressed in the semantic representations of the system in which the lexicon is being used.</Paragraph> </Section>
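Question 6 above chains three lexical relations. A sketch of the chain with a hypothetical flat triple store:

    REL = {("meow", "SON", "cat"),      # the typical cat sound
           ("kitten", "CHILD", "cat"),  # a kitten is a young cat
           ("cat", "T", "animal")}      # a cat is an animal

    def related(word, rel):
        return {b for (a, r, b) in REL if a == word and r == rel}

    def animal_heard(sound):
        for creature in related(sound, "SON"):
            if "animal" in related(creature, "T"):
                return creature
        return None

    print(animal_heard("meow"))   # 'cat': Peter heard a (young) cat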
<Section position="12" start_page="60" end_page="60" type="metho"> <SectionTitle> 6. THE FORM OF THE LEXICAL ENTRY </SectionTitle> <Paragraph position="0"> The most crucial step for the lexicographer is the design of the lexical entry.</Paragraph> <Paragraph position="1"> Somehow all the different kinds of lexical information previously decided upon must be neatly packaged into a compact, consistent, and accessible package. The lexicon is a large network in which the nodes are lexical entries and the arcs are lexical relations. Lexical entries can be found from an alphabetic list, so that the network may be entered at any point.</Paragraph> <Paragraph position="2"> There is a subnetwork containing lexical relations and their logical properties.</Paragraph> <Paragraph position="3"> Each entry begins with the letter string which names it. Homographs are numbered 1, 2, 3, ... to prevent confusion. Thus, clear1 is the adjective, clear2 is the verb 'to become clear1', and clear3 is the verb 'to cause to clear2' or 'to cause to become clear1'. Entries contain: (i) Category - part of speech, sort, lexical relation, etc. (ii) Irregular inflectional morphology. This latter is stated in terms of a special set of lexical relations -- PAST, PP (past participle), and PLUR(al) are the only ones needed for our simple data-base. The lexical entry for make includes PAST made. (iii) Lexical relations and pointers to their values in the form of attribute-value pairs. The lexical entry for puppy contains CHILD - dog. The lexical entry for dog contains CHILD - puppy. The lexical entry for the lexical relation CHILD tells us how to interpret these. It contains an axiom scheme which when filled in tells us that X is a puppy if and only if X is a dog and X is young: Ncom(puppy,X) means that Ncom(dog,X) and also P(age,X,young).</Paragraph> <Paragraph position="5"> Information often classed as derivational morphology will be included here; the lexical entry for soap, for example, contains ADJN - soapy. Some of this derivational information could be stated instead in general rules and probably should be in any larger data base.</Paragraph> <Paragraph position="6"> (iv) Parameters appropriate to particular categories. (v) Definitions. These are in the form of logical inferences that may be drawn when a given word is used, and which are idiosyncratic enough not to be coded in terms of lexical relations. Only a few words have definitions. Puppy, for example, does not, because the information that a puppy is a young dog is indicated by the lexical relation CHILD - dog. Pet, on the other hand, has a definition. In other words, if some individual Z1 is a pet, then Z1 is an animal owned by some human Z2. Omitted from this lexicon are the examples which are an important and valuable part of other dictionaries. This system does not have the generalizing power to use examples effectively and, in addition, they occupy a great deal of space. The most natural way of handling examples in such a model might be to accumulate them from semantic representations of sentences which the system parses. The task of organizing, pruning, and generalizing from examples is too formidable to tackle here.</Paragraph> <Paragraph position="7"> Nouns: Taxonomy seems to be the most important lexical relation for nouns, but many others appear in the texts as well. dog T animal - A dog is an animal. cent T money - A cent is a kind of money. puppy CHILD dog - A puppy is a young dog. soil S earth - Soil is the same thing as earth. cake TRESULT bake - The typical bring-into-being verb for cake is bake. bubble TRESULT blow - The typical bring-into-being verb for bubble is blow.</Paragraph> <Paragraph position="8"> The syntactico-semantic features are used in noun entries only. Originally, following Winograd, the number and count features were combined into a single feature with three values: singular, plural, and mass. But McCawley has recently (1975a) given examples of plural mass nouns: clothes, guts, brains, etc. It is impossible to argue with counterexamples from everyday language.
The feature information can be expressed compactly as a vector of 1's and 0's.</Paragraph> <Paragraph position="13"> These features are used to determine pronoun choice, for example, not to provide semantic information.</Paragraph> <Paragraph position="14"> Puppy is marked as having the animate feature so that the system can parse &quot;the puppy who barks&quot; and &quot;the cat who walks alone.&quot; Definitions for nouns begin with the specification of the function, Ncom or Nprop:</Paragraph> </Section> <Section position="13" start_page="60" end_page="77" type="metho"> <SectionTitle> BANK Ncom(bank,Z1): P(location,R(save,Z2,Z3),Prep(in,Z1)) </SectionTitle> <Paragraph position="0"> (A bank is a place where things are saved.) Smith and Maxwell (1977) include here commonly understood metaphorical extensions, metaphorical cliches (e.g. pitch=hell). These also can be expressed by lexical relations (cf. the Soviet Figur function which gives figurative forms; presumably pitch Figur hell). No obvious ones occur in this data base, so that this item is not currently included.</Paragraph> <Paragraph position="1"> Sample entry for puppy: Category: common noun. Relations: S pup, CHILD dog. Parameters: 111 110 10 10. The relationship puppy T animal is not included. It can be inferred from puppy CHILD dog and dog T animal. The omission of relationships which can be easily inferred saves space but costs time. It is probably the case that people actually store these relationships directly. The fact that most, if not all, puppies in the child's world are pets is not stored either. This is open to question. The word owner definitely belongs in the lexical universe of pet. We can recover it from the presence of own in the definition and the fact that owner TEXPER own. In a child's world, though, the pet-owner relationship seems to be a reciprocal kinship relationship like daughter-mother.</Paragraph> <Paragraph position="9"> Non-Copula Verbs: Every non-copula verb entry includes case information, in the form of a list of one or more arguments. For each argument we need four pieces of information: (i) How it is realized syntactically: subject, object, or a list of prepositions. (ii) The case(s) involved. (iii) Whether the case must be explicitly specified (OBL), whether it is optional and unnecessary (OPT), or whether when absent it must be understood (ELLiptical). (iv) Selection preferences: the top node of the taxonomy subtree. (The classification names in (iii) are borrowed from the SPEECHLIS project, Nash-Webber, 1974.) The elliptical cases belong to verbs which Chomsky (1965) marked [+object-deletion], which allow the object deletion transformation. Such verbs are eat and read, where the object is easily understood. But this phenomenon also occurs with other associated noun phrases, not just the object. The sentence John and Mary gave an alarm clock begs for a dative-experiencer in isolation, but sounds perfectly appropriate in answer to the question What did John and Mary give the Andersons for a wedding present? For give both the object and the experiencer may be deleted. A sign on the door saying &quot;We gave&quot; is acceptable because everybody understands that it means &quot;We gave money to the United Fund.&quot; For buy the arguments are: (i)-(iv): 1. Subject - agent, source - OBL - human, organization. 2. Object - objective - OBL - thing. (In the Wall Street Journal dialect this argument is ELLiptical.) 3. From - source - OPT - human, organization. 4. For - instrument - OPT - money. For give, they are: 1. Subject - agent, source - OBL - human, organization. 2. Object - objective - ELL - thing. 3. To, Object - experiencer - ELL - human, organization.
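The four-part argument descriptions lend themselves to a record format. A sketch of buy, with assumed key names for items (i)-(iv):

    BUY_ARGS = [
        {"syntax": "subject", "cases": ["agent", "source"],
         "status": "OBL", "prefer": ["human", "organization"]},
        {"syntax": "object", "cases": ["objective"],
         "status": "OBL", "prefer": ["thing"]},
        {"syntax": ["from"], "cases": ["source"],
         "status": "OPT", "prefer": ["human", "organization"]},
        {"syntax": ["for"], "cases": ["instrument"],
         "status": "OPT", "prefer": ["money"]},
    ]

    def must_be_present(args):
        # OBL arguments must appear; ELL ones are understood when absent
        return [a["syntax"] for a in args if a["status"] == "OBL"]

    print(must_be_present(BUY_ARGS))   # ['subject', 'object']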
The next item tells us whether a verb is an action verb or not (ACTION). Action verbs and adjectives can appear in imperative sentences but non-action verbs and adjectives cannot: Throw the ball! *Own the house! Be sensible! *Be tall! These can also appear in embedded sentences dependent on imperative performatives like order and tell: Sally told Sam to throw the ball.</Paragraph> <Paragraph position="11"> The next item tells us whether the verb allows a regular passive or not. Only those which do not allow a passive are marked. Apresyan, Mel'cuk, and Zolkovsky treat this also using lexical relations. Eventually this will probably be computable from other information in the entry. Some important items apply only to verbs that take sentential complements. This includes the complementizer(s) the verb takes and whether or not it allows not-transportation. The possible complementizers are THAT, FORTO, ING (Mother did not like Mike's sitting there), and FROM (Mother prevented Mike from going).</Paragraph> <Paragraph position="12"> Verbs like think which give us roughly synonymous sentences whether not is in the main clause or the subordinate clause are said to permit not-transportation: John didn't think Mary had gone. John thought Mary hadn't gone.</Paragraph> <Paragraph position="13"> Many verbs do not permit not-transportation, of course. These sentences are not synonymous: John didn't say that Mary had gone. John said that Mary hadn't gone.</Paragraph> <Paragraph position="14"> This complementizer information is coded by adding to the entry: THAT, FORTO, ING, FROM, or NOT, as appropriate.</Paragraph> <Paragraph position="15"> The next item is the implicational structure of the verb. There are seven such verb-classes and an eighth wastebasket class from which no inferences can be made (Joshi and Weischedel 1973; Karttunen 1970; Kiparsky and Kiparsky 1970); see Table 3. In this system factives are the unmarked case since we always assume that we can assert arguments unless we are explicitly told not to. The lexical entry for each verb which can take a predicate complement and which is not a factive is marked with its class name. Each class name appears in the lexicon with its appropriate inference pattern. For a negative-if verb, for example, this is: If R(V,Z1,S) can be asserted then S can be denied. Examples (the class labels were lost in extraction): ...Bill to accept the job. Larry prevented Bill from winning the game. John failed to go. Hugh refrained from smoking. Mary pretended that Ben went home. No implications: Jerry wanted Meg to elope with him. Table 3. Classification of Main Verbs in Predicate Complement Constructions (adapted from Joshi and Weischedel 1973).
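Each implicational class names an inference pattern, with factives as the unmarked default. A sketch with invented class labels:

    PATTERNS = {
        "factive":     lambda s: ("assert", s),
        "negative-if": lambda s: ("deny", s),   # assert R(V,Z1,S): deny S
        "wastebasket": lambda s: None,          # no implications
    }

    VERB_CLASS = {"prevent": "negative-if", "fail": "negative-if",
                  "want": "wastebasket"}

    def complement_inference(verb, s):
        cls = VERB_CLASS.get(verb, "factive")   # factives are unmarked
        return PATTERNS[cls](s)

    print(complement_inference("fail", "John went"))    # ('deny', 'John went')
    print(complement_inference("want", "Meg eloped"))   # None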
<Paragraph position="8"> The next item tells us whether a verb is an action verb or not (±ACTION). Action verbs and adjectives can appear in imperative sentences but non-action verbs and adjectives cannot: Throw the ball! *Own the house! Be sensible! *Be tall! These can also appear in embedded sentences dependent on imperative performatives like order and tell: Sally told Sam to throw the ball.</Paragraph> <Paragraph position="9"> The next item tells us whether the verb allows a regular passive or not. Only those which do not allow a passive are marked. Apresyan, Mel'cuk, and Zolkovsky treat this also using lexical relations. Eventually this will probably be computable from other information in the entry. Some important items apply only to verbs that take sentential complements. This includes the complementizer(s) the verb takes and whether or not it allows not-transportation. The possible complementizers are THAT, FOR-TO, ING (Mother did not like Mike's sitting there), and FROM (Mother prevented Mike from going).</Paragraph> <Paragraph position="10"> Verbs like think, which give us roughly synonymous sentences whether not is in the main clause or the subordinate clause, are said to permit not-transportation: John didn't think Mary had gone. John thought Mary hadn't gone. Many verbs do not permit not-transportation, of course. These sentences are not synonymous: John didn't say that Mary had gone. John said that Mary hadn't gone. This complementizer information is coded by adding to the entry: THAT, FORTO, ING, FROM, or NOT, as appropriate.</Paragraph> <Paragraph position="11"> The next item is the implicational structure of the verb. There are seven such verb-classes and an eighth wastebasket class from which no inferences can be made (Joshi and Weischedel 1973; Karttunen 1970; Kiparsky and Kiparsky 1970); see Table 3. In this system factives are the unmarked case, since we always assume that we can assert arguments unless we are explicitly told not to. The lexical entry for each verb which can take a predicate complement and which is not a factive is marked with its class name. Each class name appears in the lexicon with its appropriate inference pattern. For a negative-if verb, for example, this is: If R(V,Z1,S) can be asserted then S can be denied.</Paragraph>
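A toy sketch of how such class-indexed inference patterns might be applied follows. Only the negative-if pattern is stated in the text, so the class assignments and the names VERB_CLASS and infer are illustrative assumptions rather than the system's inventory.

```python
# Toy sketch of class-indexed inference patterns for predicate-complement
# verbs.  The class assignments below are assumptions based on the
# example sentences in Table 3, not the paper's full classification.

VERB_CLASS = {
    "prevent": "negative-if",      # Larry prevented Bill from winning the game.
    "fail":    "negative-if",      # John failed to go.
    "want":    "no-implications",  # the wastebasket class: no inferences
}

def infer(verb, z1, s):
    """Given an assertable R(V, Z1, S), apply the inference pattern
    stored in the lexicon under the verb's class name."""
    cls = VERB_CLASS.get(verb)     # unmarked verbs default to factive
    if cls == "negative-if":
        return ("deny", s)         # if R(V,Z1,S) is asserted, S is denied
    if cls == "no-implications":
        return None                # nothing may be concluded about S
    return ("assert", s)           # factive default: S may be asserted

print(infer("prevent", "Larry", "Bill won the game"))
# ('deny', 'Bill won the game')
```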
<Paragraph position="12"> Table 3. Classification of Main Verbs in Predicate Complement Constructions (adapted from Joshi and Weischedel 1973). Examples:
  ... Bill to accept the job.
  Larry prevented Bill from winning the game.
  John failed to go.
  Hugh refrained from smoking.
  Mary pretended that Ben went home.
  No implications: Jerry wanted Meg to elope with him.</Paragraph> <Paragraph position="13"> The next-to-last item is the performative classification. The classification used is that proposed by McCawley (1975b) as an extension to the work of Austin and Vendler: Verdictive, Operative, Advisory, Imperative, Commissive, Behabitive, Expositive (1-7). This is really a luxury in a recognition-only system for children's paragraphs. The only speech-act verbs involved in our data are say and tell. Performative classification does interact with syntax (especially modals), particularly in use with "would like to", "would", "will", and "let me". The last item tells whether a verb takes indirect questions (IQ). It is probably the case that when ±activity and performative structure are understood, this item will be predictable. The IQ verbs are apparently all expositives, but not all expositives are IQs, and the IQ classification seems to cut across McCawley's subclassification of the expositives. Presuppositions are included in the definition, at present, rather than as a separate item. (That is, if someone tells somebody to perform an action then he is saying that he orders that person to perform the action.)</Paragraph> <Paragraph position="14"> Copula Verbs: These are marked as verbs of perception or verbs of motion, as appropriate, if they are not of the 'be-become-seem' variety. Verbs of perception are marked with the perceptual sphere. This helps to construct appropriate semantic representations. There is a close relation between the following sentences, and we need to make inferences from one to another: Sally listened to the trumpets. (active) Sally heard the trumpets. (cognitive) The trumpets sounded beautiful to Sally. (flipped) The third sentence is called flipped because its arguments are switched from those in the first two. Sound is the flip perception verb for hear (cf. Rogers 1972). Thus, the entry for the copula verb sound is marked: type - perception; sphere - aural; flip - hear.</Paragraph> <Paragraph position="15"> Adjectives: The first special item for an adjective is the primitive concept. For red it is color; for big and small it is size. The second item is the selection preference. For red it is thing; for big it is thing, thought. The selection preference could probably be stated once in the entry for the primitive concept and not repeated. Since it is useful to have it readily available in parsing, it is included separately in every adjective entry. With adjectives, as with verbs, we often have causally related homographs. The warm in "warm coat" has a different meaning from the warm in "warm pie." A warm pie has a temperature greater than room temperature, but a warm coat makes you warm. These are called warm1 and warm2 and are connected by CAUSE. How does one recognize which is which? If the head noun is clothing or one of the 'furnace-stove-oven' family, or indeed anything else which has the function heat, the causative warm is assumed. Adjectives, like verbs, are marked 'Action - yes' or 'Action - no'.</Paragraph> <Paragraph position="16"> Lexical entries for adverbs are very much like those for adjectives. The main strategy followed in the design of the lexical entry has been to make it as compact as possible. It seems likely that more information will have to be added later.</Paragraph> </Section> </Paper>