<?xml version="1.0" standalone="yes"?> <Paper uid="C90-2070"> <Title>MacWhinney, B. Competition and Lexical Categoriza-</Title> <Section position="4" start_page="407" end_page="407" type="metho"> <SectionTitle> 3. Why MIDAS Works </SectionTitle> <Paragraph position="0"> We believe that MIDAS works because it is exploiting metaphoric subregularity by a form of analogical reasoning. That is, it finds a metaphorical usage that is closest to the given case according to some conceptual metric; it then exploits the structure of the prior metaphor usage to construct an analogous one for the case at hand, and proposes this new structure as a hypothetical word sense. Note that according to this explanation, metaphor does not play a crucial role in the extension process. Rather, it is the fact that the metaphor is a subregularity, rather than the fact that it is a metaphor, that makes it amenable to analogical exploitation.</Paragraph> <Paragraph position="1"> Analogy, of course, has played a prominent role in traditional linguistics. Indeed, rather influential linguists (for example, Paul (1891) and Bloomfield (1933)) seemed to attribute all novel language use to analogy. However, today, analogy seems almost entirely relegated to diachronic processes. A notable exception to this trend is the work of Skousen (in press), who appears to advocate a view quite similar to our own, although the primary focus of his work is morphological.</Paragraph> <Paragraph position="2"> Analogy has also been widely studied in artificial intelligence and cognitive psychology. The work of Carbonell (1982) and Burstein (1983) is most relevant to our enterprise, as it explores the role of analogy in knowledge acquisition. Similarly, Alterman's (1985, 1988) approach to planning shares some of the same concerns.</Paragraph> <Paragraph position="3"> However, many of the details of Carbonell's and Alterman's proposals are specific to problem solving, and Burstein's work is focused on formulating constraints on the relations to be considered for analogical mapping. Thus, their work does not appear to have an obvious application to our problem. Many of the differences between analogical reasoning for problem solving and language knowledge acquisition are discussed at length in Martin (1988).</Paragraph> <Paragraph position="4"> Another line of related work is the connectionist approach initiated by Rumelhart and McClelland (1987), and explicitly considered as an alternative to acquisition by analogy by MacWhinney et al. (1989). However, there are numerous reasons we believe an explicitly analogical framework to be superior. The Rumelhart-McClelland model maintains no memory of specific cases, but only a statistical summary of them. Also, the analogy-based model can use its knowledge more flexibly, for example, to infer that a word encountered is the past tense of a known word, a task that an associationist network could not easily be made to perform. In addition, we interpret as evidence supportive of a position like ours psycholinguistic results such as those of Cutler (1983) and Butterworth (1983), which suggest that words are represented in their full &quot;undecomposed&quot; form, along with some sorts of relations between related words.</Paragraph> </Section>
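To make the extension process just described concrete, here is a minimal, purely illustrative sketch in Python. It is not the MIDAS implementation: the case representation, concept names, and single-hierarchy distance measure are all assumptions made for exposition. The idea is simply that the closest stored metaphoric case, under some conceptual distance, lends its source-to-target structure to the new usage.

```python
# Illustrative sketch only; all names, structures, and the toy metric are
# invented here and are not the actual MIDAS representation or code.
from dataclasses import dataclass

@dataclass
class MetaphorCase:
    """A remembered metaphoric usage: a source concept mapped to a target sense."""
    word: str
    source_concept: str       # e.g. "giving"
    target_concept: str       # e.g. "causing-to-have-an-idea"

def conceptual_distance(concept_a, concept_b, hierarchy):
    """Toy conceptual metric: links to the nearest common ancestor in an is-a chain."""
    def chain(c):
        out = []
        while c is not None:
            out.append(c)
            c = hierarchy.get(c)        # parent concept, or None at the top
        return out
    chain_a, chain_b = chain(concept_a), chain(concept_b)
    common = next((c for c in chain_a if c in chain_b), None)
    if common is None:                  # no shared ancestor: maximally distant
        return len(chain_a) + len(chain_b)
    return chain_a.index(common) + chain_b.index(common)

def hypothesize_sense(new_word, new_source_concept, cases, hierarchy):
    """Find the closest prior case and analogize its structure to the new usage."""
    best = min(cases, key=lambda c: conceptual_distance(new_source_concept,
                                                        c.source_concept, hierarchy))
    # Re-use the best case's source->target mapping as a hypothetical new sense.
    return (new_word, new_source_concept, best.target_concept)

hierarchy = {"giving": "transfer", "taking": "transfer", "transfer": "action", "action": None}
cases = [MetaphorCase("give", "giving", "causing-to-have-an-idea"),
         MetaphorCase("see", "seeing", "understanding")]
print(hypothesize_sense("take", "taking", cases, hierarchy))
# -> ('take', 'taking', 'causing-to-have-an-idea')
```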
<Section position="5" start_page="407" end_page="407" type="metho"> <SectionTitle> 3.1 Other Kinds of Lexical Subregularities </SectionTitle> <Paragraph position="0"> If MIDAS works by applying analogical reasoning to exploit metaphoric subregularities, then the question arises as to what other kinds of lexical subregularities there might be. One set of candidates is approached in the work of Brugman (1981, 1984) and Norvig and Lakoff (1987). In particular, Norvig and Lakoff (1987) offer six types of links between word senses in what they call lexical network theory. However, their theory is concerned only with senses of one word. Also, there appear to be many more links than these. Indeed, we have no reason to believe that the number of such subregularities is bounded in principle.</Paragraph> <Paragraph position="1"> We present a partial list of some of the subregularities we have encountered. The list below uses a rather informal rule format, and gives a couple of examples of words to which each rule is applicable. It is hoped that explicating a few examples below will let the reader infer the meanings of some of the others:
(1) function-object-noun -> primary-activity-&quot;determinerless&quot;-noun (&quot;the bed&quot; -> &quot;in bed, go to bed&quot;; &quot;a school&quot; -> &quot;at school&quot;; &quot;any lunch&quot; -> &quot;at lunch&quot;; &quot;the conference&quot; -> &quot;in conference&quot;)
(2) noun -> resembling-in-appearance-noun (&quot;tree&quot; -> &quot;(rose) tree&quot;; &quot;tree&quot; -> &quot;(shoe) tree&quot;; &quot;tiger&quot; -> &quot;(stuffed) tiger&quot;; &quot;pencil&quot; -> &quot;pencil (of light)&quot;)
(3) noun -> having-the-same-function-noun (&quot;bed&quot; -> &quot;bed (of leaves)&quot;)
(4) noun -> &quot;involve-concretion&quot;-verb (&quot;a tree&quot; -> &quot;to tree (a cat)&quot;; &quot;a knife&quot; -> &quot;to knife (someone)&quot;)
(5) verb -> verb-with-role-splitting (&quot;take a book&quot; -> &quot;take a book to Mary&quot;; &quot;John shaved&quot; -> &quot;John shaved Bill&quot;)
(6) verb -> profiled-component-verb (&quot;take a book&quot; -> &quot;take a book to the Cape&quot;)
(7) verb -> frame-imposition-verb (&quot;take a book&quot; -> &quot;take someone to dinner&quot;; &quot;go&quot; -> &quot;go dancing&quot;)
(8) activity-verb-t -> concretion-activity-verb-i (&quot;eat an apple&quot; -> &quot;eat [a meal]&quot;; &quot;drink a coke&quot; -> &quot;drink [alcohol]&quot;; &quot;feed the dog&quot; -> &quot;the dog feeds&quot;)
(9) activity-verb-t -> dobj-subj-middle-voice-verb-i (&quot;drive a car&quot; -> &quot;the car drives well&quot;)
(10) activity-verb-i -> activity-verb+primary-category (&quot;John dreamed&quot; -> &quot;John dreamed a dream&quot;; &quot;John slept&quot; -> &quot;John slept the sleep of the innocent&quot;)
(11) activity-verb-i -> do-cause-activity-verb-t (patient as subject) (&quot;John slept&quot; -> &quot;The bed sleeps five&quot;)
(12) activity-verb -> activity-of-noun (&quot;to cry&quot; -> &quot;a cry (in the wilderness)&quot;; &quot;to punch&quot; -> &quot;a punch (in the mouth)&quot;)
(13) activity-verb <-> product-of-activity-noun (&quot;copy the paper&quot; <-> &quot;a copy of the paper&quot;; xerox, telegram, telegraph)
(14) functional-noun -> use-function-verb (&quot;the telephone&quot; -> &quot;telephone John&quot;; machine, motorcycle, telegraph)
(15) object-noun -> central-component-of-object (&quot;a bed&quot; -> &quot;bought a bed [=frame with no mattress]&quot;; &quot;an apple&quot; -> &quot;eat an apple [=without the core]&quot;)
Consider the first rule. This rule states that, for some noun whose core meaning is a functional object, there is another sense, also a noun, that occurs without determination, and means the primary activity associated with the first sense. For example, the word &quot;bed&quot; has as a core meaning a functional object used for sleeping.</Paragraph> <Paragraph position="2"> However, the word can also be used in utterances like &quot;go to bed&quot; and &quot;before bed&quot; (but not, say, &quot;*during bed&quot;). In these cases, the noun is determinerless, and means something akin to sleeping. Other examples include &quot;jail&quot;, &quot;conference&quot;, &quot;school&quot;, and virtually all the meal terms, e.g., &quot;lunch&quot;, &quot;tea&quot;, &quot;dinner&quot;. British English allows &quot;in hospital&quot;, while American English does not.</Paragraph> <Paragraph position="3"> The dialect difference underscores the point that this is truly a subregularity: concepts that might be expressed this way are not necessarily expressed this way. Also, we chose this example not because it in itself is a particularly important generalization about English, but precisely because it is not. That is, there appear to be many such facts of limited scope, and each of them may be useful for learning analogous cases.</Paragraph>
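As a concrete illustration of how such informal rules might be stored and applied, the following sketch encodes rule (1). It is a hypothetical rendering, not our system's representation: the Sense and Subregularity structures, the field names, and the PRIMARY_ACTIVITY table (standing in for world knowledge) are all invented for exposition.

```python
# Hypothetical encoding of a lexical subregularity such as rule (1);
# every name here is illustrative, not an actual system structure.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Sense:
    word: str
    category: str     # e.g. "noun", "determinerless-noun"
    meaning: str      # informal gloss standing in for a real semantic structure

@dataclass
class Subregularity:
    name: str
    source_category: str
    target_category: str
    map_meaning: Callable[[Sense], str]   # how the new meaning is derived

# World knowledge must supply the primary activity (sleeping for "bed", etc.).
PRIMARY_ACTIVITY = {"bed": "sleeping", "lunch": "eating lunch", "school": "attending school"}

rule1 = Subregularity(
    name="function-object-noun -> primary-activity-determinerless-noun",
    source_category="noun",
    target_category="determinerless-noun",
    map_meaning=lambda s: PRIMARY_ACTIVITY.get(s.word,
                                               f"primary activity involving {s.meaning}"),
)

def apply_rule(rule, sense):
    """Propose a hypothetical new sense by analogy with a stored subregularity."""
    if sense.category != rule.source_category:
        raise ValueError("rule does not apply to this sense")
    return Sense(sense.word, rule.target_category, rule.map_meaning(sense))

bed1 = Sense("bed", "noun", "functional object used for sleeping")
print(apply_rule(rule1, bed1))
# -> Sense(word='bed', category='determinerless-noun', meaning='sleeping')
```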
<Paragraph position="4"> Consider also rule 4, which relates function nouns to verbs. Examples of this are &quot;tree&quot; as in &quot;The dog treed the cat&quot; and &quot;knife&quot; as in &quot;The murderer knifed his victim&quot;. The applicable rule states that the verb means some specific activity involving the core meaning of the noun. I.e., the verbs are treated as a sort of conventionalized denominalization. Note that the activity is presumed to be specific, and that the way in which it must be &quot;concreted&quot; is assumed to be pragmatically determined. Thus, the rule can only tell us that &quot;treeing&quot; involves a tree, but only our world knowledge might suggest to us that it involves cornering; similarly, the rule can tell us that &quot;knifing&quot; involves the use of a knife, but cannot tell us that it means stabbing a person, and not, say, just cutting.</Paragraph> <Paragraph position="5"> As a final illustration, consider rule 5, so-called &quot;role splitting&quot; (this is the same as Norvig and Lakoff's semantic role differentiation link). This rule suggests that a verb in which two thematic roles are realized by a single complement may have another sense in which those roles are realized by separate complements.</Paragraph> <Paragraph position="6"> For example, in &quot;John took a book from Mary&quot;, John is both the recipient and the agent. However, in &quot;John took a book to Mary&quot;, John is only the agent, and Mary is the recipient. Thus, the sense of &quot;take&quot; involved in the first sentence, which we suggest corresponds to a core meaning, is the basis for the sense used in the second, in which the roles coinciding in the first are realized separately. A similar prediction might be made from an intransitive verb like &quot;shave&quot;, in which agent and patient coincide, to the existence of a transitive verb &quot;shave&quot; in which the patient is realized separately as the direct object. (Of course, the tendency of patients to get realized as direct objects in English should also help motivate this fact, and can presumably also be exploited analogically.)</Paragraph> </Section>
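A rough sketch of rule (5), role splitting, in the same illustrative spirit: the frame dictionaries, role names, and realization labels below are simplifications assumed for exposition, not the representation actually used.

```python
# Illustrative sketch of "role splitting" (rule 5); the frame format is assumed.
def split_roles(core_frame, role_a, role_b, new_marker):
    """From a sense in which role_a and role_b are filled by the same participant,
    hypothesize a sense in which role_b is a distinct, separately realized role."""
    if core_frame["roles"][role_a] != core_frame["roles"][role_b]:
        raise ValueError("roles do not coincide in the core sense")
    new_frame = {
        "verb": core_frame["verb"],
        "roles": dict(core_frame["roles"]),
        "realization": dict(core_frame["realization"]),
    }
    new_frame["roles"][role_b] = "?y"            # now a distinct participant
    new_frame["realization"][role_b] = new_marker
    return new_frame

# "John took a book from Mary": John fills both agent and recipient.
take_core = {
    "verb": "take",
    "roles": {"agent": "?x", "recipient": "?x", "theme": "?z", "source": "?w"},
    "realization": {"agent": "subject", "theme": "object", "source": "from-PP"},
}

# Hypothesized sense for "John took a book to Mary": recipient realized as a to-PP.
print(split_roles(take_core, "agent", "recipient", "to-PP"))
```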
<Section position="6" start_page="407" end_page="407" type="metho"> <SectionTitle> 4. An Analogy-based Model of Lexical Acquisition </SectionTitle> <Paragraph position="0"> We have been attempting to extend MIDAS-style word hypothesizing to be able to propose new word senses by using analogy to exploit these other kinds of lexical subregularities. At this point, our work has been rather preliminary, but we can at least sketch out the basic architecture of our proposal and comment on the problems we have yet to resolve.</Paragraph> <Paragraph position="1"> (A) Detect unknown word sense. For example, suppose the system encountered the following phrase: &quot;at breakfast&quot; Suppose further that the function noun &quot;breakfast&quot; were known to the system, but the determinerless usage were not. In this case, the system would hypothesize that it is lacking a word sense because of a failure to parse the sentence.</Paragraph> <Paragraph position="2"> (B) Find relevant cases/subregularities. Cues from the input would be used to suggest prior relevant lexical knowledge. In our example, the retrieved cases might include the following: bed-1/bed-3, class-1/class-4 Here we have numbered word senses so that the first element of each pair designates a sense involving a core meaning, and the latter a determinerless-activity type of sense. We may also have already computed and stored relevant subregularities. If so, then these would be retrieved as well.</Paragraph> <Paragraph position="3"> Relevant issues here are the indexing and retrieval of cases and subregularities. Our assumption is that we can retrieve relevant cases by a conjunction of simple cues, like &quot;noun&quot;, &quot;functional meaning&quot;, &quot;extended determinerless noun sense&quot;, etc., and then rely on the next phase to discriminate further among these.</Paragraph> <Paragraph position="4"> (C) Choose the most pertinent case or subregularity.</Paragraph> <Paragraph position="5"> Again, by analogy to MIDAS, some distance metric is used to pick the best datum to analogize from. In this case, perhaps the correct choice would be the following: class-1/class-4 One motivation for this selection is that &quot;class&quot; is compatible with &quot;at&quot;, as is the case in point. Finding the right metric is the primary issue here. The MIDAS metric is a simple sum of two factors: (i) the length of the core-relationship from the input source to the source of the candidate metaphor, and (ii) the hierarchical distance between the two concepts. Both factors are measured by the number of links in the representation that must be traversed to get from one concept to the other. The hierarchical distance factor of the MIDAS metric seems directly relevant to other cases. However, there is no obvious counterpart to the core-relationship component. One possible reason for this is that metaphoric extensions are more complex than most other kinds; if so, then the MIDAS metric may still be applicable to the other subregularities, which are just simpler special cases.</Paragraph>
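To illustrate the two-factor metric just described, here is a toy computation. This is our reading of the description above rather than MIDAS code: both factors are counted as links over small, invented concept graphs, and other cues mentioned in the text (such as compatibility with &quot;at&quot;) are ignored.

```python
# Toy rendering of the MIDAS-style distance: core-relationship path length plus
# hierarchical distance, each counted in links. The graphs and concepts are invented.
from collections import deque

def link_distance(graph, start, goal):
    """Shortest number of links between two concepts (breadth-first search)."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if node == goal:
            return depth
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return float("inf")   # unreachable: no known relation of this kind

def midas_style_distance(core_relations, hierarchy, input_source, candidate_source):
    """Factor (i): core-relationship path length; factor (ii): hierarchical distance."""
    return (link_distance(core_relations, input_source, candidate_source)
            + link_distance(hierarchy, input_source, candidate_source))

# Symmetric adjacency lists, purely for illustration.
hierarchy = {
    "breakfast": ["meal"], "lunch": ["meal"], "class": ["scheduled-event"],
    "meal": ["breakfast", "lunch", "scheduled-event"],
    "scheduled-event": ["meal", "class"],
}
core_relations = {
    "breakfast": ["eating"], "lunch": ["eating"], "eating": ["breakfast", "lunch"],
    "class": ["instruction"], "instruction": ["class"],
}

for candidate in ("lunch", "class"):
    print(candidate, midas_style_distance(core_relations, hierarchy, "breakfast", candidate))
# lunch scores 4 (2 + 2); class scores inf because no core relation links it to breakfast.
```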
<Paragraph position="6"> (D) Analogize to a new meaning. Given the best case or subregularity, the system will attempt to hypothesize a new word sense. For example, in the case at hand, we would like a representation for the meaning in quotes to be produced.</Paragraph> <Paragraph position="7"> class-1/class-4 :: breakfast-1/&quot;period of eating breakfast&quot; In the case of MIDAS, the metaphoric structure of previous examples was assumed to be available. Then, once a best match was established, it is relatively straightforward to generalize or extend this structure to apply to the new input. The same would be true in the general case, provided that the relation between stored polysemous word senses is readily available.</Paragraph> <Paragraph position="8"> (E) Determine the extent of generalization. Supposing that a single new word sense can be successfully proposed, the question arises as to whether just this particular word sense is all the system can hypothesize, or whether some &quot;local productivity&quot; is possible. For example, if this is the first meal term the system has seen as having a determinerless activity sense, we suspect that only the single sense should be generated. However, if it is the second such meal term, then the first one would have been the likely basis for the analogy, and a generalization to meal terms in general may be attempted.</Paragraph> <Paragraph position="9"> (F) Record a new entry. The new sense needs to be stored in the lexicon, and indexed for further reference. This task may interact closely with (E), although generalizing to unattested cases and computing explicit subregularities are logically independent.</Paragraph> <Paragraph position="10"> There are many additional problems to be addressed beyond the ones alluded to above. In particular, there is the issue of the role of world knowledge in the proposed process. In the example above, the system must know that the activity of eating is the primary one associated with breakfast. A more dramatic example is the role of world knowledge in hypothesizing the meaning of &quot;treed&quot; in expressions like &quot;the dog treed the cat&quot;, assuming that the system is acquainted with the noun &quot;tree&quot;. All an analogical reasoning mechanism can do is suggest that some specific activity associated with trees is involved; the application of world knowledge would have to do the rest.</Paragraph> </Section> </Paper>