Enjoy the Paper: Lexical Semantics via Lexicology 
Ted Briscoe & Ann Copestake
Computer Laboratory, Cambridge University
Pembroke Street, Cambridge, CB2 3QG, UK
Bran Boguraev
IBM Thomas J. Watson Research Center
PO Box 704, Yorktown Heights, New York 10598, USA
Abstract: Current research being undertaken at both 
Cambridge and IBM is aimed at the construction of 
substantial lexicons containing lexical semantic information 
capable of use in automated natural language processing 
(NLP) applications. This work extends previous research 
on the semi-automatic extraction of lexical information 
from machine-readable versions of conventional 
dictionaries (MRDs) (see e.g. the papers and references in 
Boguraev & Briscoe, 1989; Walker et al., 1988). The 
motivation for this and previous research using MRDs is 
that entirely manual development of lexicons for practical 
NLP applications is infeasible, given the labour-intensive 
nature of lexicography (e.g. Atkins, 1988) and the 
resources likely to be allocated to NLP in the foreseeable 
future. In this paper, we motivate a particular approach to 
lexical semantics, briefly demonstrate its computational 
tractability, and explore the possibility of extracting the 
lexical information this approach requires from MRDs and, 
to some extent, textual corpora. 
1. Lexical Semantics 
A theory of lexical semantics should provide an 
efficient representation of lexical semantic information in 
the paradigmatic plane which is capable of integrating with 
a genuinely compositional semantic account in the 
syntagmatic plane. Our starting point for this research is 
the work of Levin (e.g. 1985) and others on verbal 
alternations (diathesis), Pustejovsky (e.g. 1989) on lexical 
coercion and qualia theory, and Evans & Gazdar (e.g. 
1989) on default inheritance within unification-based 
formalisms. It can be seen as a further contribution to the 
use of unification-based formalisms in linguistic description 
and specifically as an enriching of the minimal sort-based 
lexical semantic taxonomy incorporated into the Esprit 
ACORD system (Moens et al., 1989) and the SRI 
(Cambridge) CLE system (Alshawi et al., 1989). We 
propose a system in which a standard graph-based 
unification formalism, such as PATR-II, is augmented with 
minimal disjunction (of atomic terms) and minimal default 
inheritance (allowing only 'orthogonal' multiple inheritance 
in a manner similar to Evans & Gazdar's DATR). Using 
such a system we are able to see the beginnings of 
solutions to three problems concerning the integration of 
lexical semantics with a general theory of linguistic 
description and processing - alternations, coercion, and 
decomposition / representation. 
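To make the proposed extension concrete, the following is a minimal sketch of unification over feature structures with disjunction restricted to atomic terms. It is not the prototype itself: the dict-based representation, the function names and the feature names (`pred`, `comp`, `cat`) are our assumptions, and reentrancy is deliberately left out.

```python
# Feature structures are nested dicts; atomic values are strings;
# a disjunction of atomic terms is a frozenset of strings.
# Reentrancy (structure sharing) is not modelled in this sketch.

class UnificationFailure(Exception):
    pass


def as_set(value):
    """View an atom or an atomic disjunction as a set of alternatives."""
    return value if isinstance(value, frozenset) else frozenset([value])


def unify(a, b):
    """Unify two feature structures and return a new structure."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            result[feature] = unify(a[feature], value) if feature in a else value
        return result
    if isinstance(a, dict) or isinstance(b, dict):
        raise UnificationFailure("complex value cannot unify with an atom")
    common = as_set(a) & as_set(b)
    if not common:
        raise UnificationFailure(f"{a!r} and {b!r} are incompatible")
    # A singleton disjunction collapses back to a plain atom.
    return next(iter(common)) if len(common) == 1 else common


# Hypothetical single entry for one sense of 'believe': the complement
# category is an atomic disjunction rather than one entry per frame.
believe = {"pred": "BELIEVE3", "comp": {"cat": frozenset({"S", "NP", "VPinf"})}}
np_complement = {"comp": {"cat": "NP"}}
result = unify(believe, np_complement)
print(result)
```

The point of the design is that a single entry covers several grammatical realisations; unifying with a particular complement simply narrows the disjunction.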
The first problem emerges with systems, such as the 
Alvey Tools grammar (Carroll & Grover, 1989), which 
attempt to characterise the grammatical behaviour of 
lexical items in terms of sets of subcategorisation frames. 
Intuitively, this often seems arbitrary and inelegant because 
the occurrence of alternation seems to be semantically 
motivated. This problem has been discussed in connection 
with verbs mostly, but also arises with nouns and 
adjectives. For instance, in the Tools lexicon the verb 
believe has eight entries. Six of these separate entries relate 
to the same or a very similar sense of believe; namely, 
believe3 (Longman Dictionary of Contemporary English, 
LDOCE) 'to hold as an opinion; suppose' which is a 
relation between an individual (the believer) and a 
proposition (what is believed). Treating the various 
grammatical realisations of this sense of believe separately 
predicts that it is pure accident that they share the same 
sense. It also suggests that the range of possible 
alternations is unpredictable and must simply be listed 
from verb to verb. Most of the work on alternations has 
concentrated on attempts to characterise semantic classes 
of verbs which undergo similar alternations (e.g. Levin, 
1985). This enterprise has not been particularly successful 
(Boguraev & Briscoe, 1989b), but in any case ignores or 
simply assumes the prior point that it is possible to 
construct a system in which there is just one entry for 
believe3. 
Nevertheless, it seems correct that examples like John 
believed that Mary was clever / Mary (to be) clever / 
Mary / the rumour should be related to one entry for 
believe because this would allow us to account for the 
interpretation of John believed Mary as something like 
'John believed something(s) that Mary asserted'; that is, as 
standing for some 'understood' proposition involving 
Mary. Pustejovsky (1989b) refers to this process as 
coercion and compares it to examples such as John 
considers Mary a genius where it is usual (e.g. in GPSG, 
Gazdar et al., 1985) to claim that a genius functions 
predicatively because the subcategorisation frame for 
consider forces this interpretation. In general, coercion is a 
problem in theories which take the syntactic aspect of 
grammatical realisation as primary, but would be a natural 
consequence of a theory which took the sense and the fact 
that believe3 is a relation between an individual and a 
proposition as basic. In such an account an NP 
complement of a verb denoting a relation between an 
individual and a proposition would either denote a 
proposition 'directly' (the rumour) or be coerced to the 
appropriate semantic type (Mary). 
When coercion occurs some additional information is 
required to 'flesh out' the elevated semantic type of the 
complement. Pustejovsky (1989) dubs this logical 
metonymy. In the case of believed Mary this is that it is 
some assertion of Mary's which is believed. This 
information appears to be inherited from the verb. In other 
cases, such as John enjoyed (watching) the film, John 
began (reading) the book, or John finished (drinking) the 
beer, it is more plausible that the missing information is 
provided by the lexical specification of the NP 
complements (cf: John enjoyed (drinking) the beer, John 
finished (reading) the book). Pustejovsky (1989, 1989b) 
and Pustejovsky & Anick (1988) propose that the lexical 
representation of nouns is enriched to include a 
specification of processes typically associated with the 
objects they denote and that, in cases of coercion, this 
information is utilised. In their terms, this is the telic role 
of the qualia structure of the noun. 
We see the inheritance of this information from the 
verb or complement as a default process which operates in 
the absence of more marked pragmatic information. For 
example, one would normally enjoy (watching) the play, 
but it would not be difficult to construct a discourse 
context in which someone (say lecturer or student) enjoyed 
(reading) the play, and so forth. So we propose that enjoy 
in this sense is a relation between an individual and an 
event and that, by default, nouns such as film or play 
inherit 'watch' as a specification of the typical event 
(process) in which they participate. The entry for enjoy 
will, also by default, state that in cases of coercion the 
specification of the process will be inherited from the 
nominal complement. In cases where the default is 
overridden by pragmatic information, more specific 
instances of the entries for enjoy and/or play are created in 
which the defaults are replaced by pragmatically 
appropriate specifications. (The precise nature of the 
processes which trigger this or the retrieval of the relevant 
information we take to be a part of 'pragmatics' and not 
lexical semantics.) 
One final (mostly methodological) point is that the 
approach we are advocating provides a slightly different 
viewpoint on the problem of lexical decomposition / 
representation. Early approaches to lexical semantics within 
the generative tradition were criticised for the arbitrariness 
of the representations produced. Following Dowty (1979), 
Pustejovsky (1989) and others, we suggest that one 
strategy for uncovering the optimal lexical representation, 
or level of 'decomposition', is to posit representations 
which provide elegant accounts of the interaction of lexical 
semantics with grammatical realisation and with 
compositional semantics. Pursuing this methodology, we 
have been led to a model of lexical semantic representation 
which supports a (somewhat enriched) compositional 
account of (sentence) meaning by enriching lexical 
representations of nouns and collapsing those for verbs 
with alternate grammatical realisations. In this framework, 
there are still many inferences which are not captured at 
the level of lexical organisation, but we argue that these 
inferences are 'pragmatic' in the sense that they are not 
based on default processes operating within the lexicon. 
Thus, our position is opposed to that of Hobbs et al. (1987) 
who argue that there is no distinction between lexical 
semantics and general knowledge. In our approach, simple 
default 'lexical' inference procedures do quite a lot of 
work. Of course, the way is always open to us to argue 
that any inference which cannot be captured by these 
procedures is 'non-default'. Nevertheless, in section 3 we 
argue that this distinction is supported by natural data both 
in terms of the rarity of non-default cases and also the 
marked, informationally-rich nature of the contexts in 
which lexical defaults are overridden. Thus, our approach 
gives us a handle on which aspects of lexical meaning 
should be represented in the lexicon and therefore on the 
type of information we want to extract from our MRDs. 
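[Diagram: inheritance network rooted at anyObj, with nodes PhysObj, Artefact, Abstract, Representation, Literature and VisualRep, and leaves potato1, cake2, statue1, equation1, book1 and film3. The links are not recoverable here; the text places book under both Literature and PhysObj.]
Figure 1 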
2. An Implementation 
It is possible to implement a system capable of 
coercion and default specification using a unification-based 
formalism extended with 'orthogonal' default inheritance of 
(paradigmatic) lexical specifications. We also make use of 
minimal disjunctive specifications to allow for the range of 
grammatical alternation within one sense of a predicate. 
Our prototype extends PATR-II (Shieber, 1986) with 
disjunction of atomic terms and uses the template 
mechanism to impose a natural subsumption ordering on 
the lexical taxonomy which defines the inheritance 
network. The taxonomy implicit in the fragment 
implemented so far is shown in Figure 1. This taxonomy is 
adequate to cover the metonymies discussed in this paper 
and others discussed in Pustejovsky (1989). (Numbers on 
concepts are related to LDOCE sense numbers.) 
An entry for book is given in template form in Figure 
2a. Its position in the network in Figure 1 defines the 
pattern of inheritance for the qualia structure. 
Lexical entry for "book": 
book -- 1 N Literature PhysObj; 
Dag for "the book": 
[CAT: NP
 SEMFS: [CAT: OBJ
         TYPESHIFTED: FALSE]
 TRANS: [DET: DEFINITE
         PRED: BOOK1
         VAR: <DAG61> = REF25
         ARG1: <DAG61>]
 QUALIA:
   [TSTRUCT:
     [TRANS: [PRED: ?READ
              VAR: <DAG62> = []
              EVENT: <DAG62>
              ARG1: <DAG63> = []
              ARG2: <DAG61>]
      COMBINES: [FIRST:
                  [TRANS: [VAR: <DAG63>]]]]]]
Figure 2a 
The telic role for book is thus inherited from the default 
role associated with 'Literature'. The entry will also inherit 
information from 'PhysObj' but the orthogonality 
constraint rules out conflicts with the attributes inherited 
from 'Literature'. In fact the template 'PhysObj' does not 
contain any information about the telic part of the qualia 
structure. 
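The 'orthogonal' inheritance just described can be sketched as follows. This is an illustration under our own assumptions, not the PATR-II template mechanism: templates are flattened to path/value pairs, the contents of each template (including placing `semfs cat` on PhysObj) are guesses for the sake of the example, and the override shown for play assumes, hypothetically, that play sits under VisualRep.

```python
# Templates flattened to {path: value}; contents are illustrative assumptions.
TEMPLATES = {
    "Literature": {("qualia", "telic"): "read"},
    "VisualRep": {("qualia", "telic"): "watch"},
    "PhysObj": {("semfs", "cat"): "OBJ"},
}


def inherit(parents, own=None, templates=TEMPLATES):
    """Combine defaults from several parents, then apply entry-specific values.

    Orthogonality: no two parents may specify the same path.
    Defaults: values given in `own` replace whatever was inherited.
    """
    inherited = {}
    for parent in parents:
        for path, value in templates[parent].items():
            if path in inherited:
                raise ValueError(f"parents are not orthogonal at {path}")
            inherited[path] = value
    inherited.update(own or {})
    return inherited


# 'book' inherits its telic role from Literature and the rest from PhysObj.
book = inherit(["Literature", "PhysObj"])

# A pragmatically overridden instance: the play is read rather than watched.
play_as_text = inherit(["VisualRep", "PhysObj"], own={("qualia", "telic"): "read"})
print(book, play_as_text)
```

Because parents may not compete for the same path, no ordering or conflict-resolution strategy among parents is needed; only the entry's own specification can override a default.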
The DAG for the NP the book is also shown in Figure 
2a. This still denotes an object; when combined with a 
normal, non-coercing verb the telic role makes no 
contribution to the semantic structure. However some 
grammar rules allow type-shifting; one allows NPs with an 
associated telic role to be type-shifted to be equivalent to 
untensed VPs and to denote events. 
Figure 2b shows the NP after application of this rule. 
Once type-shifted, the logical formula associated with the 
book is the same as that associated with reading the book, 
except that the question mark indicates defeasibility and 
could be interpreted as 'possibly(P = read) & P(e' j x)'. 
[CAT: NP
 SEMFS: [CAT: EVENTUALITY
         TYPESHIFTED: TELIC]
 TRANS: [PRED: AND
         VAR: <DAG39> = REF26
         ARG1: [DET: DEFINITE
                PRED: BOOK1
                VAR: <DAG40> = REF25
                ARG1: <DAG40>]
         ARG2: [PRED: ?READ
                VAR: <DAG39>
                EVENT: <DAG39>
                ARG1: <DAG41> = []
                ARG2: <DAG40>]]
 COMBINES: [FIRST:
             [TRANS: [VAR: <DAG41>]]]]
Figure 2b 
In (la) we show the formula which can be read off the 
DAG in Figure 2b given straightforward assumptions about 
the semantic interpretation of the formalism (e.g. Moore, 
1989). The lexical entry for enjoy specifies that its 
complement must denote an event which can be 
syntactically an NP or progressive VP and that, if the NP 
is type-shifted, the telic role supplies the understood 
predicate. The resulting formulae associated with the VP 
and S are shown in (lb,c). 
(1) 
a) λ x e' ∃ y ?read(e' x y) & book(y) 
b) λ x ∃ e e' y past(e) & enjoy(e x e') & ?read(e' x y) & book(y) 
c) ∃ e e' y past(e) & enjoy(e j e') & ?read(e' j y) & book(y) 
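As a rough illustration of how type-shifting feeds composition, the sketch below builds a formula in the style of (1c) as a string. It is a simplification under our own assumptions: the `NP` class, `type_shift` and `enjoyed` are hypothetical names, and string assembly stands in for the unification of DAGs in the actual formalism.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NP:
    pred: str                    # e.g. 'book'
    telic: Optional[str] = None  # default predicate from the qualia structure


def type_shift(np):
    """Return the defeasible ('?'-marked) predicate supplied by the telic role."""
    if np.telic is None:
        raise ValueError(f"{np.pred} has no telic role to shift on")
    return "?" + np.telic


def enjoyed(subject, np):
    """Formula for '<subject> enjoyed the <np>' with a coerced NP complement."""
    understood = type_shift(np)
    return (
        f"∃ e e' y past(e) & enjoy(e {subject} e') & "
        f"{understood}(e' {subject} y) & {np.pred}(y)"
    )


book = NP(pred="book", telic="read")
formula = enjoyed("j", book)
print(formula)
```

The subject constant fills both the enjoyer argument and the agent of the understood predicate, which is the effect the COMBINES path achieves in the feature structures.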
We follow Hobbs (1985), Alshawi et al. (1989), 
Moens et al. (1989) and others in using an event-based 
calculus for reasons of computational tractability, and also 
because distinctions amongst types of events are likely to 
be important in the characterisation of the recovery of 
understood predicates in logical metonymies. In a fuller 
account it would be possible to constrain the type of event 
selected by a particular verb; for instance, enjoy might be 
constrained to unify by default with the telic role of a 
noun if this specified a process or culminating event. This 
would predict the relative oddity of examples such as John 
enjoys his house, in which we assume that the telic role is 
something like 'living in' and that this specifies a state 
rather than process. It would also be possible to alter the 
aspect of qualia structure selected by a particular verb. An 
example like John regrets that book by default receives an 
interpretation in which 'writing' is selected to flesh out the 
metonymy. In this case, we might specify that regret, in 
contrast to enjoy, selects the agentive path in the noun's 
qualia structure. 
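The two refinements suggested here, constraining the event type and choosing which qualia role a verb selects, might be sketched as a pair of lookup tables. Everything in the tables is an assumption made for illustration: the aspectual labels, the role names as dictionary keys, and the decision to leave regret unconstrained.

```python
# noun -> role -> (predicate, aspectual class); illustrative values only
QUALIA = {
    "book": {"telic": ("read", "process"), "agentive": ("write", "process")},
    "house": {"telic": ("live_in", "state")},
}

# verb -> which role it selects and which event types it accepts
# (None means no constraint is assumed)
VERBS = {
    "enjoy": {"role": "telic", "allowed": {"process", "culminating_event"}},
    "regret": {"role": "agentive", "allowed": None},
}


def understood_predicate(verb, noun):
    """Default predicate recovered for 'verb + noun', or None if none applies."""
    spec = VERBS[verb]
    entry = QUALIA.get(noun, {}).get(spec["role"])
    if entry is None:
        return None
    predicate, aspect = entry
    if spec["allowed"] is not None and aspect not in spec["allowed"]:
        return None  # e.g. the relative oddity of 'John enjoys his house'
    return predicate


print(understood_predicate("enjoy", "book"))
print(understood_predicate("regret", "book"))
print(understood_predicate("enjoy", "house"))
```

A `None` result here only signals that no lexical default is available; on the account above, recovery of the predicate would then fall to pragmatic inference.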
Another area in which this approach to lexical 
semantics is suggestive relates to adjectival modification. It 
is well-known that adjectives such as good, fast, long, and 
so forth, have meanings which are hard to specify 
independently of some 'aspect' of the noun they modify. 
Pustejovsky (1989) suggests that in examples like fast car, 
fast typist, or fast waltz, fast should be treated as a 
modifier of the telic role associated with these nouns, so 
that these examples can be paraphrased fast car to drive or 
fast waltz to dance. The adjective long appears to be (at 
least) ambiguous between a telic role modifier and a 
formal role modifier - a long book can either be a 
comment on shape, size or number of pages, or a comment 
on the length of time required for reading. In the event- 
based calculus we adopt we could associate the logical 
form in (2b) with the interpretation of (2a) where long is a 
telic role modifier. 
(2) 
a) John enjoyed the long book 
b) ∃ e e' e" x y enjoy(e j e') & ?read(e' j y) & 
book(y) & long(e") & ?read(e" x y) 
However, note that it would be inappropriate to 
automatically conflate the events e' and e" because this 
would predict that John's reading of the long book was 
necessarily a long event which, whilst plausible, is not 
entailed under this interpretation of long. In order to avoid 
this effect using unification-based techniques it is necessary 
to explicitly copy the structure that specifies the telic role. 
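The need for an explicit copy can be seen in a small sketch in which coreference is modelled by object identity. The representation (`Var`, the dictionary keys, `copy_telic`) is hypothetical; the only point carried over from the text is that the copy gets a fresh event and agent while still sharing the book.

```python
import itertools

_fresh = itertools.count(1)


class Var:
    """A logical variable; two mentions corefer only if they are the same object."""

    def __init__(self, sort):
        self.name = f"{sort}{next(_fresh)}"


def copy_telic(telic):
    """Copy a telic structure with a fresh event and agent, sharing the object."""
    return {
        "pred": telic["pred"],
        "event": Var("e"),
        "agent": Var("x"),
        "object": telic["object"],  # still the same book y
    }


y = Var("y")
telic = {"pred": "?read", "event": Var("e"), "agent": Var("x"), "object": y}

# 'long' modifies a copy, so the long event is e" rather than e'.
long_target = copy_telic(telic)

# 'enjoy' identifies its complement event with the original telic event e';
# the copy modified by 'long' is unaffected.
enjoyed_event = telic["event"]
print(long_target["event"].name, enjoyed_event.name)
```

Had long modified `telic` itself, the event John enjoyed and the event described as long would be one and the same variable, which is exactly the unwanted entailment.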
We suggested in section 1 that NPs, such as the fact, 
can denote propositions 'directly'. Similarly, we think that 
there is no metonymy involved in examples such as John 
enjoyed the experience / film-making and so forth. In these 
cases, we claim that the NPs in question denote events 
'directly'. Thus, we are led to an 'ontologically 
promiscuous' semantics (Hobbs, 1985). However, recent 
developments in model-theoretic semantics which treat 
properties as basic entities (e.g. Chierchia & Turner, 1988) 
support this position. Indeed the interpretation of event- 
denoting NPs in complement position with enjoy strongly 
suggests that these NPs must be analysed as denoting 
propositional functions since their 'missing argument' must 
be associated with the subject of enjoy. For instance, John 
likes marriage can mean that John likes the institution but 
John enjoys marriage can only mean that he enjoys being 
in the state of marriage (to someone). 
3. Data concerning Logical Metonymies 
The previous sections have demonstrated the nature of 
the phenomenon of logical metonymy and have outlined a 
computationally-tractable unification-based treatment. A 
crucial aspect of this treatment is that, with the predicates 
we have considered, the missing information is supplied, 
by default, by the qualia structure of the head noun in the 
type-shifted complement. In order to demonstrate the 
presence of logical metonymies in naturally-occurring text 
and to evaluate the plausibility of our default approach, we 
examined data drawn from the Lancaster-Oslo/Bergen 
(LOB) corpus containing predicates capable, in principle, 
of coercing the type of their complements. 
A set of type-coercing predicates similar to enjoy was 
obtained by extracting verbs coded to take both NP and 
progressive or infinitive VP complements in LDOCE (see 
Boguraev & Briscoe, 1989b for an account of these codes 
and the extraction techniques). Further manual editing of 
this list led to 24 predicates which we felt were capable of 
exhibiting logical metonymies parallel to that of enjoy. To 
date, we have analysed all the data obtainable from the 
LOB corpus for seven of these predicates. The results of 
this analysis are summarised in Figure 3. (Numbers after 
predicates refer to LDOCE sense numbers.) 
Pred       Prog   Inf    NP    Ev   Met  Prag
enjoy1        6     /    59    21    25     4
prefer1       4    30  3[?]    10    13     1
finish1       8     /    31     8    23     6
start1,3     45    28    63    42    21     0
begin1        1  5[?]    11     8     3     2
miss5         3     /   [?]    10    13     4
regret1       2     1    17    14     0     0
Figure 3 
Columns headed NP, Inf(initive) and Prog(ressive) show 
the number of times each predicate occurred with this type 
of complement. (A stroke in these columns indicates that 
this complement type would be ungrammatical with a 
particular predicate.) The remaining columns give further 
information about the NP complements. Ev(ent) indicates 
the number of times that the NP complement was judged 
to denote an event (or in a few cases a proposition) 
directly. Met(onymic) indicates the number of times we 
judged that coercion had occurred. And Prag(matic) 
indicates the number of times that we judged the 
understood predicate was not recovered via the head 
noun's qualia structure in the metonymic cases. In some 
cases, the number of NP complements is greater than the 
sum of Event and Metonymic because we felt unable to 
classify some examples. These examples were either (semi-) 
idiomatic, such as miss the boat, or involved NPs whose 
status was unclear because of modification of the head 
noun, such as enjoy the warm evening. 
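The Met and Prag columns of Figure 3, together with the enjoy row, can be tallied directly. The short sketch below does so using only cells that are legible in the table; the variable names are ours and nothing beyond the printed counts is assumed.

```python
# (predicate, Met, Prag) as read from Figure 3
MET_PRAG = [
    ("enjoy", 25, 4),
    ("prefer", 13, 1),
    ("finish", 23, 6),
    ("start", 21, 0),
    ("begin", 3, 2),
    ("miss", 13, 4),
    ("regret", 0, 0),
]

met_total = sum(met for _, met, _ in MET_PRAG)
prag_total = sum(prag for _, _, prag in MET_PRAG)
prag_share_of_met = prag_total / met_total

# enjoy: 6 progressive VP complements plus 59 NP complements
enjoy_total = 6 + 59
enjoy_coerced_share = 25 / enjoy_total

print(f"metonymic examples: {met_total}")
print(f"pragmatic cases:    {prag_total} ({prag_share_of_met:.0%} of metonymic)")
print(f"enjoy: {enjoy_total} occurrences, {enjoy_coerced_share:.0%} coerced")
```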
The first thing to note about Figure 3 is the 
comparatively high numbers of metonymic examples 
relative to the complete sets. It is instructive that the 
apparently more complex metonymic complement pattern 
is selected quite frequently despite the availability, with all 
these predicates, of an explicit VP complement pattern. For 
instance, enjoy and morphological variants occur 65 times 
in the relevant sense and coerce the NP complement in 25 
of these cases. The second and crucial observation, from 
the perspective of our default theory of the recovery of the 
understood predicate in the metonymic cases, is that the 
numbers in the Pragmatic column are relatively low by 
comparison with the total number of metonymic examples. 
Given the default theory, we would expect most 
metonymic examples to be resolvable via the head noun's 
qualia structure and there to be relatively few 'pragmatic' 
examples involving less constrained and more complex 
inferences, and, in fact, these cases represent about 17% of 
the metonymic examples and about 4% of the total set of 
examples considered. Further examination of these 17 
cases revealed that, in most, the immediate context was 
informationally-rich and therefore marked enough for the 
appropriate pragmatic inference to go through. For 
example, compare the a) examples with the b) examples in 
(3). 
(3) 
a) Willie enjoyed the hot sweet tea, standing on the 
deck in the cool of the night 
b) She can lie back and enjoy her baby until the 
midwife, knowing the afterbirth is ready to pop out, 
a) Loddon paid his own account, finished his 
cigarette, and got up. 
b) The book was never finished, for his illness and 
death intervened while he was in the course of 
writing it. 
a) If you prefer a Burgundy try a 1955 Charmes 
Chambertin costing round 1 pound. 
b) Then again so many people much prefer the sea 
or river to the baths. Having learned to swim in the 
sea ... 
In each of the a) examples we think that the understood 
predicate is supplied via the qualia structure of the head 
noun in the NP complement. In the b) cases, it seems 
implausible that the telic role of babies is to be cuddled, or 
that seas or rivers are (mainly) for swimming in. The 
agentive role of book will specify the predicate 'write', so 
we could treat finish as selecting this role by default and 
this would, in fact, deal with four of the 'pragmatic' cases, 
but others would become 'pragmatic' since in our 
implementation only one unification path into the qualia 
structure can be selected by default. However, in each of 
these examples the context shown provides enough 
information to infer the relevant predicate. It is not always 
the case that the (remainder of the) context provides the 
relevant information or intuitively seems so 'rich' in the 
default cases. 
Another way in which we can evaluate the default 
theory is by considering the status of the predicates which 
are supplied explicitly when a VP complement is selected. 
We might expect VPs to be selected precisely in those 
situations when defaults based on qualia structure would 
lead to the wrong interpretations. We tested this idea by 
examining the VP complements of start. In many cases, 
the predicates were intransitive, ditransitive, and so forth, 
so that the hypothesis did not apply. However, in the 
straightforwardly transitive cases 21 examples exhibited 
clear non-default predicates, such as started to open the 
bottle, started to play a Waltz, or started flirting with the 
first pretty girl that you met, whilst only 4 cases arguably 
involved default predicates recoverable from the head 
noun's qualia structure -- start making a fuss, started to 
fire distress rockets, started pulling the communication 
cord, and started making bubbling noises. 
This analysis is hardly conclusive; however, it does, we 
think, demonstrate that logical metonymies occur quite 
regularly with certain predicates in natural text. We have 
also provided some evidence that default inference based 
on lexical organisation (in this case the qualia structure of 
nouns) would succeed in a large number of cases. 
Furthermore, there seems to be some support in this data 
for the claim that contexts in which 'pragmatic' recovery 
of the understood predicate occurs are quite 
informationally-rich and would therefore constrain an 
otherwise rather unconstrained process. Finally, we have 
shown that, in the case of start, there is evidence that VP 
complementation is chosen when default recovery of 
understood predicates on the basis of qualia structure 
would lead to the wrong interpretation. 
4. Acquiring Lexical Semantic Information 
In this section, we describe three exploratory studies 
aimed at the (semi-)automatic acquisition of qualia 
structure, in particular telic roles, from MRDs. The first 
involves exploiting subject and box codes in the LDOCE 
MRD (see papers in Boguraev & Briscoe, 1989 for a full 
description), while the second is based on an analysis of 
the LDOCE definitions. These techniques are aimed at 
allowing qualia structure to be inherited appropriately; the 
third attempts to determine the predicates associated with a 
word by analysis of dictionary definitions and, to some 
extent, more general corpus material. 
The machine-readable version of LDOCE contains some 
residual 'database-like' features which do not appear in the 
printed dictionary. These include a taxonomy of many 
words in terms of 'subject matter'. This taxonomy defines 
a 'flat' hierarchy of, at most, two levels and many 
relationships are left implicit; for instance, 'sports' is a 
main category with subdivisions such as 'archery' but 
'football' is a main category with subdivisions such as 
'rugby'. Nevertheless, this taxonomy can be used to 
identify 'lexical conceptual paradigms' (Pustejovsky & 
Anick, 1988); for example, there is a class 'beverages' 
(147 word senses), a class 'motion pictures' (113 word 
senses), and a class 'literature' (377 word senses). These 
words could straightforwardly be associated with the 
'deeper' inheritance network given in Figure 1 with default 
telic (and possibly other) roles, such as 'drinking', 
'watching' and 'reading' associated appropriately. There 
are a few problems though, for instance the category 
'beverages' includes publican, and 'motion pictures' 
includes usherette. It is possible to exclude these examples 
from the target network by making use of box codes 
which, amongst other things, associate semantic features 
with nouns, because the exceptions are coded 'animate' 
and 'human'. Nevertheless, this approach is limited 
because the LDOCE semantic taxonomy will undoubtedly 
not contain all the classifications which eventually will 
prove desirable and there will be errors of omission in its 
construction. In addition, we are utilising an idiosyncratic 
feature of the LDOCE MRD, whilst we would like our 
extraction techniques to be generally applicable. 
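The filtering step described in this paragraph might look as follows. The sketch does not reproduce the LDOCE encoding: the record layout, the feature names and the sample senses other than publican and usherette are invented for illustration, and real subject and box codes would first have to be decoded into this form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    word: str
    subject: str         # decoded subject-matter class
    features: frozenset  # decoded box-code semantic features


# Default telic role per subject class, as suggested in the text.
DEFAULT_TELIC = {
    "beverages": "drink",
    "motion pictures": "watch",
    "literature": "read",
}

# Senses carrying these features are kept out of the target network.
EXCLUDED = frozenset({"animate", "human"})


def assign_telic(senses):
    """Map each admissible word sense to the default telic role of its class."""
    return {
        s.word: DEFAULT_TELIC[s.subject]
        for s in senses
        if s.subject in DEFAULT_TELIC and not (s.features & EXCLUDED)
    }


SAMPLE = [
    Sense("beer", "beverages", frozenset({"liquid"})),
    Sense("publican", "beverages", frozenset({"human"})),
    Sense("film", "motion pictures", frozenset({"abstract"})),
    Sense("usherette", "motion pictures", frozenset({"human"})),
]
telic_by_word = assign_telic(SAMPLE)
print(telic_by_word)
```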
An improvement to this approach is to utilise 
taxonomies constructed from the dictionary definitions. For 
example we have built a taxonomy of substances by 
extracting the genus senses of LDOCE definitions in which 
145 word senses such as Burgundy appear directly or 
indirectly under the main nominal sense of drink. We are 
currently investigating an approach whereby lexical entries 
inherit some of their structure from higher nodes in the 
taxonomy. Qualia structure could thus be inherited from 
word senses rather than abstract templates; for example 
Burgundy would inherit its telic role from the noun drink. 
If abstract templates were still needed they could be 
inserted into the inheritance hierarchy at the appropriate 
points. 
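Inheritance of a telic role along genus links can be sketched as a walk up the taxonomy. The chain from Burgundy to drink via wine is a hypothetical example (the text says only that such senses appear directly or indirectly under drink), and the dictionaries stand in for a taxonomy extracted from definitions.

```python
# word sense -> genus sense, as might be extracted from definitions
GENUS = {"Burgundy": "wine", "wine": "drink"}  # hypothetical chain

# telic roles stated explicitly on some senses
TELIC = {"drink": "drink"}


def inherited_telic(sense, genus=GENUS, telic=TELIC):
    """Return the telic role of the nearest sense up the genus chain, if any."""
    seen = set()
    while sense is not None and sense not in seen:
        if sense in telic:
            return telic[sense]
        seen.add(sense)  # guards against circular definitions
        sense = genus.get(sense)
    return None


print(inherited_telic("Burgundy"))
```

Taking the nearest specified ancestor means that a more specific sense can carry its own telic role and so override what it would otherwise inherit.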
The approaches above only specify how the qualia 
structure is inherited, rather than how it is initially 
determined. In recent work, the IBM lexical systems group 
have used their lexical database system (e.g. Neff & 
Boguraev, 1989) with a number of MRDs to generate lists 
of predicates which are applied to books by searching 
through definition fields for the occurrence of book in a 
position denoting 'typical object' of the headword. For 
instance, LDOCE defines sag with '(of a book, 
performance, etc.) to become uninteresting during part of 
the length'. Using these techniques with three dictionaries 
resulted in the following list of verbs: abridge, abstract, 
annotate, appreciate, autograph, ban, bang about, borrow, 
bring out, burlesque, bowdlerize, call in, castigate, 
castrate, catalogue, censor, chuck away, churn out, 
classify, collate, commission, compile, consult, cross-index, 
dramatize, entitle, excoriate, expurgate, footnote, page, 
pirate. It is obvious that this technique yields specific, 
often rare, predications with typical objects, whilst qualia 
structure is likely to involve typical predications with 
specific (classes of) nouns. 
In order to automatically obtain typical (frequent) 
predications of book, four corpora were searched - 
LDOCE example sentences, the Brown corpus, 1.2 million 
words of Readers' Digest, and 26 million words of tapes 
from the American Publishing House for the Blind. 
Analysing those citations in which book occurs as direct 
object revealed that read and write are the two most 
common predicates across the four corpora, although there 
are considerable differences within each corpus (see 
Boguraev et al., 1989 for details). This approach could and 
should be extended in several ways, for instance by 
dealing with semantically related nouns such as novel, and, 
of course, by attempting a similar analysis for many more 
nouns. Nevertheless, these preliminary results do suggest 
that a noun's qualia structure should be recoverable from 
MRDs and corpora in a semi-automatic way. 
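The corpus step amounts to counting the verbs that take a given noun as direct object. The sketch below assumes that verb-object pairs have already been extracted by some parser or pattern matcher; the pairs shown are invented toy data, not counts from the four corpora.

```python
from collections import Counter


def typical_predicates(pairs, noun, top=2):
    """Return the `top` most frequent verbs taking `noun` as direct object."""
    counts = Counter(verb for verb, obj in pairs if obj == noun)
    return [verb for verb, _ in counts.most_common(top)]


# Invented (verb, direct object) pairs for illustration only.
PAIRS = [
    ("read", "book"), ("read", "book"), ("read", "book"),
    ("write", "book"), ("write", "book"),
    ("borrow", "book"),
    ("drink", "beer"),
]
top_for_book = typical_predicates(PAIRS, "book")
print(top_for_book)
```

Ranking by frequency is what separates this from the definition-based search above, which surfaces specific and often rare predications.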
5. Conclusion 
We have attempted to motivate an approach to lexical 
semantics which enhances the representation of nouns in 
terms of their qualia structure. We have shown that 
incorporating this information into a default inheritance 
hierarchy and enriching the notion of compositionality to 
allow for type-shifting of NPs allows for a computationally 
tractable and plausible account of logical metonymy. We 
have, however, said very little about what qualia structure 
is. Whilst Pustejovsky (1989) relates this idea back to 
Aristotle's four causes, we think that for the purposes of 
the computational implementation described above we need 
only assume that qualia structure constitutes (part of) the 
lexical information associated with a word sense, in the 
sense that it is the information which is most accessible 
given the organisation of the inheritance network. One 
could imagine that other more general or 'encyclopedic' 
information concerning concepts would simply be less 
accessible or 'close' in terms of the same network. The 
preliminary work with MRDs/corpora suggests that both 
types are recoverable semi-automatically. 

References 

Alshawi, H., Carter, D., van Eijck, J., Moore, R., Moran, 
D., Pereira, F., Pulman, S. & Smith, A. 1989. Final 
Report: Core Language Engine. SRI (Cambridge) 
Technical Report, Project No. 2989. 

Atkins, B.T. 1988. Course Notes. ESF Summer School on 
Automating the Lexicon. To appear in Atkins, B.T. & 
Zampolli, A. Automating the Lexicon, Oxford University 
Press, Oxford. 

Boguraev, B. & Briscoe, E. 1989. Computational 
Lexicography for Natural Language Processing. 
Longman/Wiley, London/New York. 

Boguraev, B. & Briscoe, E. 1989b. Utilising the LDOCE 
grammar codes. In Boguraev & Briscoe 1989. 

Boguraev, B., Byrd, R., Klavans, J. & Neff, M. 1989. 
From structural analysis of lexical resources to semantics 
in a knowledge base. IBM Research, Mimeo. 

Carroll, J. & Grover, C. 1989. The derivation of a large 
computational lexicon of English from LDOCE. In 
Boguraev & Briscoe 1989. 

Chierchia, G. & Turner, R. 1988. Semantics and property 
theory. Linguistics & Philosophy, 11, 261-302. 

Dowty, D. 1979. Word Meaning and Montague Grammar. 
Reidel, Dordrecht. 

Evans, R. & Gazdar, G. 1989. Inference in DATR. Proc. 
of 4th Eur. ACL., Manchester, pp.66-71. 

Gazdar, G., Klein, E., Pullum, G. & Sag, I. 1985. 
Generalized Phrase Structure Grammar. Blackwell, 
Oxford. 

Hobbs, J. 1985. Ontological promiscuity. Proc of 23rd 
ACL., Chicago, pp.61-9. 

Hobbs, J., Croft, W., Davies, T., Edwards, D. & Laws, K. 
1987. Commonsense metaphysics and lexical semantics. 
Computational Linguistics, 13, 241-50. 

Levin, B. 1985. Lexical Semantics in Review. MIT 
Working Papers on the Lexicon & in Walker et al. 1988. 

Moens, M., et al. 1989. Expressing generalisations in 
unification-based grammar formalisms. Proc. of 4th Eur. 
ACL., Manchester, pp.174-81. 

Moore, R. 1989. Unification-based semantic interpretation. 
Proc. of 27th ACL., Vancouver, pp.33-41. 

Neff, M. & Boguraev, B. 1989. Dictionaries, dictionary 
grammars and dictionary entry parsing. Proc. of 27th ACL., 
Vancouver, pp.91-101. 

Pustejovsky, J. 1989. Current Issues in Computational 
Lexical Semantics. Proc. of 4th Eur. ACL., Manchester, 
pp.xvii-xxv. 

Pustejovsky, J. 1989b. Type Coercion and Selection. Proc. 
of WCCFL VIII, Vancouver. 

Pustejovsky, J. & Anick, P. 1988. On the semantic 
interpretation of nominals. Coling88, Budapest, pp.518-23. 

Shieber, S. 1986. An Introduction to Unification-based 
Approaches to Grammar, U. Chicago Press, Chicago. 

Walker, D., Zampolli, A. & Calzolari, N. 1988, in press. 
Automating the Lexicon: Research and Practice in a 
Multilingual Environment. Cambridge University Press, 
Cambridge. 
