<?xml version="1.0" standalone="yes"?> <Paper uid="P87-1024"> <Title>On the Acquisition of Lexical Entries: The Perceptual Origin of Thematic Relations</Title> <Section position="2" start_page="0" end_page="175" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper describes a computational model of concept acquisition for natural language. We develop a theory of lexical semantics, the Extended Aspect Calculus, which together with a &quot;markedness theory&quot; for thematic relations, constrains what a possible word meaning can be.</Paragraph> <Paragraph position="1"> This is based on the supposition that predicates from the perceptual domain are the primitives for more abstract relations. We then describe an implementation of this model, TULLY, which mirrors the stages of lexical acquisition for children.</Paragraph> <Paragraph position="2"> 1. Introduction In this paper we describe a computational model of concept acquisition for natural language that makes use of positive-only data and is modelled on a theory of lexical semantics. This theory, the Extended Aspect Calculus, acts together with a markedness theory for thematic roles to constrain what a possible word type is, just as a grammar defines what a well-formed tree structure is in syntax. We argue that linguistically specific knowledge and learning principles are needed for concept acquisition from positive evidence alone. Furthermore, this model posits a close interaction between the predicates of visual perception and the early semantic interpretation of thematic roles as used in linguistic expressions. In fact, we claim that these relations act as constraints on the development of predicate hierarchies in language acquisition. 
Finally, we describe TULLY, an implementation of this model in ZETALISP, and discuss its design in the context of machine learning research.</Paragraph> <Paragraph position="3"> There has been little work on the acquisition of thematic relations and case roles, owing to the absence of any consensus on their formal properties. In this research we begin to address what a theory of thematic relations might look like, using learnability theory as a metric for evaluating the model. We claim that, for a child, there is an important relationship between visual or imagistic perception and the development of thematic relations in linguistic usage. This has been argued recently by Jackendoff (1983, 1985) and was an assumption in the pioneering work of Miller and Johnson-Laird (1976). Here we argue that the conceptual abstraction of thematic information does not develop arbitrarily but along a given, predictable path: a developmental path that starts with tangible perceptual predicates (e.g. spatial, causative) and later forms the more abstract mental and cognitive predicates. In this view thematic relations are actually sets of thematic properties, related by a partial ordering. This effectively establishes a markedness theory for thematic roles that a learning system must adhere to in the acquisition of lexical entries for a language.</Paragraph> <Paragraph position="4"> We will discuss two computational methods for concept development in natural language: (1) Feature Relaxation of particular features of the arguments to a verb. 
This is performed by a constraint propagation method.</Paragraph> <Paragraph position="5"> (2) Thematic Decoupling of semantically incorporated information from the verb.</Paragraph> <Paragraph position="6"> When these two learning techniques are combined with the model of lexical semantics adopted here, the stages of development for verb acquisition are similar to those acknowledged for child language acquisition.</Paragraph> <Paragraph position="7"> 2. Learnability Theory and Concept Development Work in machine learning has shown the usefulness to an inductive concept-learning system of inducing &quot;bias&quot; in the learning process (cf. [Mitchell 1977, 1978], [Michalski 1983]). An even more promising development is the move to base the bias on domain-intensive models, as seen in [Mitchell et al. 1985], [Utgoff 1985], and [Winston et al. 1983]. This is an important direction for those concerned with natural language acquisition, as it converges with a long-held belief of many psychologists and linguists that domain-specific information is necessary for learning (cf. [Slobin 1982], [Pinker 1984], [Bowerman 1974], [Chomsky 1980]). Indeed, Berwick (1984) moves in exactly this direction. Berwick describes a model for the acquisition of syntactic knowledge based on a restricted X-bar syntactic parser, a modification of the Marcus parser ([Marcus 1980]). The domain knowledge specified to the system in this case is a parametric parser and learning system that adapts to a particular linguistic environment, given only positive data. This is just the sort of biasing necessary to account for data on syntactic acquisition.</Paragraph> <Paragraph position="8"> One area of language acquisition that has not been sufficiently addressed within computational models is the acquisition of conceptual structure. 
For language acquisition, the problem can be stated as follows: How does the child identify a particular thematic role with a specific grammatical function in the sentence? This is the problem of mapping the semantic functions of a proposition into specified syntactic positions in a sentence.</Paragraph> <Paragraph position="9"> Pinker (1984) makes an interesting suggestion (due originally to D. Lebeaux) in answer to this question. He proposes that one of the strategies available to the language learner involves a sort of &quot;template matching&quot; of argument to syntactic position. There are canonical configurations that are the default mappings, and non-canonical mappings for the exceptions. The template consists of two rows, one of thematic roles and the other of syntactic positions. A canonical mapping exists if no lines joining the two rows cross. Figure 1 shows a canonical mapping representing the sentence in (1), while Figure 2 illustrates a non-canonical mapping representing sentence (2).</Paragraph> <Paragraph position="10"> (1) Mary hit Bill.</Paragraph> <Paragraph position="11"> (2) Bill was hit by Mary.</Paragraph> <Paragraph position="12"> With this principle we can represent the productivity of verb forms that are used but not heard by the child. We will adopt a modified version of the canonical mapping strategy for our system, and embed it within a theory of how perceptual primitives help derive linguistic concepts. As mentioned, one of the motivations for adopting the canonical mapping principle is the power it gives a learning system in the face of positive-only data. In terms of learnability theory, Berwick (1985) (following [Angluin 1978]) notes that to ensure successful acquisition of the language after a finite number of positive examples, something like the Subset Principle is necessary. We can compare this principle to a Version Space model of inductive learning ([Mitchell 1977, 1978]) with no negative instances. 
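To make the comparison concrete, the following is a minimal sketch of conservative generalization from positive examples only, in the style of the specific boundary of a version space. The feature tuples, the wildcard symbol, and the function names are illustrative assumptions and are not part of TULLY.

```python
WILDCARD = "?"  # hypothetical marker meaning "any value is allowed"


def narrowest_cover(examples):
    """Return the most specific conjunctive description covering all examples.

    Each example is a tuple of feature values; only positive examples are seen.
    """
    hypothesis = None
    for example in examples:
        if hypothesis is None:
            # The first positive example is adopted verbatim.
            hypothesis = list(example)
        else:
            # Relax only the positions where the new example disagrees.
            hypothesis = [h if h == e else WILDCARD
                          for h, e in zip(hypothesis, example)]
    return hypothesis


def covers(hypothesis, example):
    """True if the example satisfies every non-wildcard constraint."""
    return all(h == WILDCARD or h == e for h, e in zip(hypothesis, example))


# Hypothetical observations of one argument: (animacy, directness, tangibility).
positives = [("animate", "direct", "concrete"),
             ("animate", "indirect", "concrete")]
learned = narrowest_cover(positives)
```

In this sketch the learned description keeps the two constraints shared by both observations and relaxes only the directness feature, so an unseen inanimate argument is still excluded without any negative evidence. 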
Generalization proceeds in a conservative fashion, taking only the narrowest concept that covers the data.</Paragraph> <Paragraph position="13"> How does this principle relate to lexical semantics and the way thematic relations are mapped to syntactic positions? We claim that the connection is very direct. Concept learning begins with spatial, temporal, and causal predicates being the most salient. This follows from our supposition that these are innate structures, or are learned very early. Following Miller and Johnson-Laird (1976), [Miller 1985], and most psychologists, we assume the prelinguistic child is already able to discern spatial orientations, causation, and temporal dependencies. We take this as a point of departure for our theory of markedness, which is developed in the next section.</Paragraph> <Section position="1" start_page="172" end_page="173" type="sub_section"> <SectionTitle> 3.0 Theoretical Assumptions 3.1 The Extended Aspect Calculus </SectionTitle> <Paragraph position="0"> In this section we outline the semantic framework which defines our domain for lexical acquisition. In the current linguistic literature on case roles or thematic relations, there is little discussion of what logical connection exists between one θ-role and another. Although thematic relations are the workhorse for motivating several principles of syntax (cf. [Chomsky 1981], [Williams 1980]), the most that is claimed is that Universal Grammar specifies a repertoire of thematic relations (or case roles), such as Agent, Theme, Patient, Goal, Source, and Instrument, and that every NP must carry one and only one role. It should be remembered, however, that thematic relations were originally conceived in terms of the argument positions of semantic predicates such as CAUSE and DO. [fn. 1] That is, a verb did not simply have a list of labelled arguments such as Agent and Patient, but had an interpretation in terms of more primitive predicates where the notions Agent and Patient were defined. 
The causer of an event (following Jackendoff (1976)) is defined as an Agent; for example, $\mathrm{CAUSE}(x, e) \rightarrow \mathrm{Agent}(x)$.</Paragraph> <Paragraph position="1"> Similarly, the first argument position of the predicate GO is interpreted as Theme, as in $\mathrm{GO}(x, y, z)$. The second argument here is the SOURCE and the third is called the GOAL.</Paragraph> <Paragraph position="2"> The model we have in mind acts to constrain the space of possible word meanings. In this sense it is similar to Dowty's aspect calculus but goes beyond it in embedding his model within a markedness theory for thematic types. Our model is a first-order logic that employs symbols acting as special operators over the standard logical vocabulary. These are taken from three distinct semantic fields: causal, spatial, and aspectual.</Paragraph> <Paragraph position="3"> The predicates associated with the causal field are Causer ($C_1$), Causee ($C_2$), and Instrument ($I$). The spatial field has only one predicate, Locative, which is predicated of an object we term the Theme. [fn. 1: Cf. Jackendoff (1972, 1976) for a detailed elaboration of this theory.]</Paragraph> <Paragraph position="4"> Finally, the aspectual field has three predicates, representing the three temporal intervals $t_1$ (beginning), $t_2$ (middle), and $t_3$ (end). From the interaction of these predicates all thematic types can be derived. We call the lexical specification for this aspectual and thematic information the Thematic Mapping Index (TMI).</Paragraph> <Paragraph position="5"> As an example of how these components work together to define a thematic type, consider first the distinction between a state, an activity (or process), and an accomplishment. A state can be thought of as reference to an unbounded interval, which we will simply call $t_2$; that is, the state spans this interval. [fn. 3] An activity or process can be thought of as referring to a designated initial point and the ensuing process; in other words, the situation spans the two intervals $t_1$ and $t_2$. 
Finally, an event can be viewed as referring to both an activity and a designated terminating interval; that is, the event spans all three intervals, $t_1$, $t_2$, and $t_3$. Now consider how these bindings interact with the other semantic fields for the verb run in sentence (8) and give in sentence (9).</Paragraph> <Paragraph position="6"> (8) John ran yesterday.</Paragraph> <Paragraph position="7"> (9) John gave the book to Mary.</Paragraph> <Paragraph position="8"> We associate with the verb run an argument structure of simply $\mathrm{run}(x)$. For give we associate the argument structure $\mathrm{give}(x, y, z)$. The Thematic Mapping Index for each is given below in (10) and (11). [(10), (11): TMI diagrams for run and give]</Paragraph> <Paragraph position="10"> The sentence in (8) represents a process with no logical culmination, and the one argument is linked to the named case role, Theme. The entire process is associated with both the initial interval $t_1$ and the middle interval $t_2$. The argument $x$ is linked to $C_1$ as well, indicating that it is an Actor as well as a moving object (i.e. Theme). This represents one TMI for an activity verb. [fn. 3: This is a simplification of our model, but for our purposes the difference is moot. A state is actually interpreted as a primitive homogeneous event-sequence, with downward closure. Cf. [Pustejovsky 1987].] [fn. 4: [Jackendoff 1985] develops a similar idea, but vide infra for discussion.]</Paragraph> <Paragraph position="11"> The structure in (9) specifies that the meaning of give carries with it the supposition that there is a logical culmination to the process of giving. This is captured by reference to the final subinterval, $t_3$. The linking between $x$ and the $L$ associated with $t_1$ is interpreted as Source, while the other linked arguments, $y$ and $z$, are Theme (the book) and Goal, respectively. Furthermore, $x$ is specified as a Causer, and the object which is marked Theme is also an affected object (i.e. Patient). 
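The three-row organization described for run and give can be written down as a small data structure. This is only a sketch: the class name, field names, and role labels are assumptions made for illustration, and the encoding is reconstructed from the prose description rather than from the TMI diagrams themselves.

```python
from dataclasses import dataclass

# Aspectual classes named in the text, keyed by the intervals a situation spans.
ASPECT_CLASSES = {
    frozenset({"t2"}): "state",
    frozenset({"t1", "t2"}): "activity",
    frozenset({"t1", "t2", "t3"}): "accomplishment",
}


@dataclass
class TMI:
    """Hypothetical encoding of a Thematic Mapping Index (not TULLY's own)."""
    verb: str
    intervals: frozenset  # aspectual row: intervals the situation spans
    causal: dict          # causal row: predicate name mapped to an argument
    spatial: dict         # spatial row: role name mapped to an argument

    def aspect_class(self):
        """Classify the verb by the set of intervals it spans."""
        return ASPECT_CLASSES.get(self.intervals, "unknown")

    def roles_of(self, arg):
        """List every causal and spatial label linked to one argument."""
        roles = [r for r, a in self.causal.items() if a == arg]
        roles += [r for r, a in self.spatial.items() if a == arg]
        return sorted(roles)


# run(x): x is both Actor (C1) and Theme; the process spans t1 and t2.
RUN = TMI("run", frozenset({"t1", "t2"}), {"C1": "x"}, {"Theme": "x"})

# give(x, y, z): x is Causer and Source, y is Theme and Patient (C2), z is Goal.
GIVE = TMI("give", frozenset({"t1", "t2", "t3"}),
           {"C1": "x", "C2": "y"},
           {"Source": "x", "Theme": "y", "Goal": "z"})
```

Under this encoding a single argument can carry several labels at once, which is the property the markedness theory relies on when it later selects the maximally specific role. 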
This structure for give is one of the TMIs for an accomplishment.</Paragraph> <Paragraph position="12"> In these examples the three subsystems are shown as rows, and the configuration given is lexically specified.</Paragraph> </Section> <Section position="2" start_page="173" end_page="175" type="sub_section"> <SectionTitle> 3.2 A Markedness Theory for Thematic Roles </SectionTitle> <Paragraph position="0"> As mentioned above, the theory we are outlining here is grounded on the supposition that all relations in the language are sufficiently described in terms of causal, spatial and aspectual predicates. A thematic role in this view is seen as a set of primitive properties relating to the predicates mentioned above. The relationship between these thematic roles is a partial ordering over the sets of properties defining them. It is this partial ordering that allows us to define a markedness theory for thematic roles.</Paragraph> <Paragraph position="1"> Why is this important? If thematic roles were assigned randomly to a verb, then one would expect there to be verbs that have only a Patient or an Instrument, or two Agents or two Themes, for example. Yet this is not what we find. What appears to be the case is that thematic roles are not assigned to a verb independently of one another, but rather that some thematic roles are fixed only after other roles have been established. For example, a verb will not be assigned a GOAL unless a THEME has been assigned first. Similarly, a LOCATIVE is dependent on there being a THEME present. This dependency can be viewed as an acquisition strategy for learning the thematic relations of a verb.</Paragraph> <Paragraph position="2"> Now let us outline the theory. We begin by establishing the most unmarked relation that an argument can bear to its predicate. Let us call this role $\mathrm{Theme}_u$. The only semantic information this carries is that of an existential quantifier. It is the only named role outside of the three interpretive systems defined above. 
Normally, we think of a Theme as an object in motion. This is only half correct, however, since statives carry a Theme reading as well. It is in fact the feature [±motion] that distinguishes the role of Mary in (1) and (2) below.</Paragraph> <Paragraph position="3"> (1) Stative: [-motion] Mary sleeps.</Paragraph> <Paragraph position="4"> (2) Active: [+motion] Mary fell.</Paragraph> <Paragraph position="5"> This gives us our first markedness convention: (3) $\mathrm{Theme}_u \rightarrow \mathrm{Theme}_A \,/\, [+\mathrm{motion}]$; $\mathrm{Theme}_u \rightarrow \mathrm{Theme}_S \,/\, [-\mathrm{motion}]$, where $\mathrm{Theme}_A$ is an &quot;activity&quot; Theme and $\mathrm{Theme}_S$ is a stative Theme.</Paragraph> <Paragraph position="6"> Within the spatial subsystem, there is one variable type, Location, and a finite set of them, $L_1, L_2, \ldots, L_n$. The most unmarked location is that carrying no specific aspectual binding. That is, the named variables are $L_S$ and $L_G$, commonly referred to as Source and Goal. Thus, $L_u$ is the unmarked role. The number of named locative variables is perhaps constrained only by the aspectual system of the language (the richer the aspectual distinctions, the more named locative variables). The markedness conventions here are: (4) $L_u \rightarrow S \,/\, B$ (5) $L_u \rightarrow G \,/\, E$ Within the causal subsystem there are three predicates, $C_1$, $C_2$, and $I$. $C_2$ (the traditional Patient role) is less marked than $C_1$, but more marked than $I$. These conventions give us the core of the primitive semantic relations. To be able to perform predicate generalization over each relation, however, we define a set of features that applies to each argument within the semantic subsystems. These are the abstraction operators that allow a perceptual-based semantics to generalize to nonperceptual relations. These features also have marked and unmarked values, as we will show below. There are four features that contribute to the generalization process in concept acquisition. The first feature, abstract, distinguishes tangible objects from intangible ones. 
Direct allows a gradience in the notions of causation and motion. The third feature, complete, picks out the extension of an argument as either an entire object or only part of it. Animate has the standard semantics of labeling an object as alive or not.</Paragraph> <Paragraph position="7"> Let us illustrate how these operators abstract over primitive thematic roles. By changing the value of a feature, we can alter the description and hence the set of objects in its extension. Assume, for example, that the predicate $C_1$ has [+Direct] as its unmarked value.</Paragraph> <Paragraph position="8"> (6) $C_1[u\mathrm{Direct}] \rightarrow [+\mathrm{Direct}]$ By changing the value of this feature we allow $C_1$, the direct agent of an event, to refer to an indirect causer. (7) $\mathrm{Agent}[+\mathrm{Direct}] &lt; \mathrm{Agent}[-\mathrm{Direct}]$ Similarly, we can change the value of the default setting for the feature [+Complete] to refer to a subcausation (or causation by part).</Paragraph> <Paragraph position="9"> (8) $\mathrm{Agent}[+\mathrm{Complete}] &lt; \mathrm{Agent}[-\mathrm{Complete}]$ These changes define a new concept, &quot;effector&quot;, which is a superset of the previous concepts given in the system. The same can be done with $C_2$ to arrive at the concept of an &quot;effected object.&quot; We see the difference in interpretation in the sentences below.</Paragraph> <Paragraph position="10"> a. John intentionally broke the chair. (Agent-direct) b. John accidentally broke that chair when he sat down. (Agent-indirect) c. John broke the chair when he fell. (Effector) Given the manner in which the features of primitive thematic roles are able to change their values, we are defining a predictable generalization path that relations incorporating these roles will take. In other words, two concepts may be related thematically, but may have very different extensional properties. For example, give and take are clearly definable perceptual transfer relations. 
But given the abstractions available from our markedness theory, they are thematically related to something as distant as &quot;experiencer verbs&quot;, e.g. please, as in &quot;The book pleased John.&quot; This relation is a transfer verb with an incorporated Theme, namely the &quot;pleasure.&quot; [fn. 5] If we apply these features in the spatial subsystem, we can arrive at generalized notions of location, as well as abstracted interpretations for Theme, Goal and Source. For example, given the thematic role $\mathrm{Theme}_A$ with the feature [-Abstract] in the default setting, we can generalize to allow for abstract relations such as like, where the object is not affected, but is an abstract Theme. Similarly, the Theme can be concrete and direct, as in (a), or abstract, as in (b).</Paragraph> <Paragraph position="11"> (a) have(L, Th) Mary has a book.</Paragraph> <Paragraph position="12"> (b) have(L, Th) Mary has a problem with Bill.</Paragraph> <Paragraph position="13"> In conclusion, we can give the following dependencies between thematic roles: {Theme} {L} {S, G} {$C_1$} [fn. 5: Cf. Pustejovsky (1987) for an explanation of this term and a full discussion of the extended aspect calculus.] The generalization features apply to this structure to build hierarchical structures (cf. [Keil 1979], [Kodratoff 1986]). This partial ordering allows us to define a notion of covering, as with a semi-lattice, from which a strong principle of functional uniqueness is derivable (cf. [Jackendoff 1985]). The mapping of a thematic role to an argument obeys the following principle: (9) Maximal Assignment Principle: An argument will receive the maximal interpretation consistent with the data.</Paragraph> <Paragraph position="14"> This says two things. First, it says that an Agent, for example, will always have a location and theme role associated with it. Furthermore, an Agent may be affected by its action, and hence be a Patient as well. 
Secondly, this principle says that although an argument may bear many thematic roles, the grammar picks out that function which is maximally specific in its interpretation, according to the markedness theory. Thus, the two arguments might both be Themes in &quot;John chased Mary&quot;, but the thematic roles which maximally characterize their functions in the sentence are Agent (A) and Patient (P), respectively.</Paragraph> </Section> </Section> </Paper>