INTERPRETING COMPOUNDS FOR MACHINE TRANSLATION
BARBARA GAWRONSKA, CHRISTER JOHANSSON,
ANDERS NORDNER, CAROLINE WILLNERS
Dept. of Linguistics, University of Lund,
Helgonabacken 12, S-223 62 LUND, Sweden
SUMMARY: 
The paper presents a procedure for interpretation of English
compounds and for automatic translation of such compounds into
Slavic languages and French. In the target languages, a compound
nominal is as a rule to be rendered by an NP with an adjective or
genitive attribute, or with an attributive participle construction.
The model is based on Bierwisch's theory of word formation, which
in turn is inspired by categorial grammar. The procedure is applied
to a specific domain (asthma research).
0. INTRODUCTION
The need for a component interpreting complex lexical items in an
MT system translating from Germanic languages into e.g. French or
Slavic languages is obvious. Many rules (or patterns) of word
formation are highly productive, which makes it impossible to store
all complex lexical entries in a static lexicon. An effective MT
system must also be able to match the interpretation of a complex
entry with the correct morphosyntactic pattern in the target
language. For example, a program translating from German into
Polish must distinguish the relations between the parts of a
compound like Universitätslehrer (university teacher) from the
relations holding between Musik and Lehrer in Musiklehrer (teacher
of music). The first compound is to be translated as a noun
followed by an adjective (nauczyciel uniwersytecki - 'teacher
university+adjective ending'), the latter as a noun and a genitive
attribute (nauczyciel muzyki - 'teacher music+gen'). Similar
problems occur when translating into French or Czech: cf.
Musikabend - Fr. soirée musicale (n a), Cz. hudební večer (a n);
Musiklehrer - Fr. professeur de musique (n prep n), Cz. učitel
hudby (n n+gen).
The models for compound interpretation and generation proposed by
general linguists (cf. Lees 1960, Selkirk 1982, Fanselow 1988,
Bierwisch 1989) require as a rule several modifications in order
to be applicable in an MT system. Since, in our opinion, a model
aimed to serve as an efficient tool for NLP and MT must be
linguistically valid, we will discuss a number of theoretical
questions and relate our model to general linguistics before
presenting our experimental procedure for domain-restricted
compound translation.
1. THE STATUS OF WORD FORMATION
RULES
1.1. 'Where's morphology?'
The above question, put by Stephen Anderson (1982), is still
waiting for a definitive answer. Word formation rules have been
claimed to obey syntactic principles and hence to be a part of UG
(Lees 1960, Pesetsky 1985), to form a grammatical level of their
own (Di Sciullo & Williams 1987), to be explainable in semantic
terms alone (Fanselow 1988) or to belong to the lexicon (Chomsky
1970, Jackendoff 1975, Bierwisch 1989).
We will propose a quite simple answer to Anderson's question:
morphology shall be seen as a component of the grammar, the notion
'grammar' to be understood as an integrated model where no borders
are drawn between syntax, morphology, and semantics.
1.2. Towards an integration of syntax, 
semantics and morphology 
Fanselow (1985, 1988) argues, on the basis of psycholinguistic
evidence, for treating word formation rules not as generative
processes, but as a 'primitive' process of concatenating
morphematic items, a very easily learnable procedure. His
argumentation is restricted to morphology in the traditional sense
of the term. We would like to go even further and claim that the
grammar as a whole can be regarded as a set of patterns of
concatenation or cooccurrence of lexical items, each concatenation
pattern associated with principles of semantic interpretation.
This approach is to some extent inspired by (but far from
identical to) categorial grammar and Bierwisch's lexicon theory
(Bierwisch 1989). At the same time, it is in its very essence not
incompatible with Constraint Grammar (Karlsson 1990, Koskenniemi
1990).
1.3. Compounds as collocations 
English compounds provide an argument in favour of our approach
to grammar. It seems impossible to draw a clear-cut borderline
between strings traditionally labelled as compounds and those
classified as noun phrases. Cf. the following examples, taken
from a corpus of medical abstracts:
ragweed allergic rhinitis
house-dust-allergic asthma
house dust asthma
patient daily symptom diary cards
fluticasone propionate aqueous nasal spray
In most grammatical descriptions, strings consisting of nouns
(like house dust asthma) are treated as compound nouns, whereas a
complex including an adjective followed by a noun is normally
labelled as an NP. The above examples show, however, that such a
distinction is not unproblematic. Phrases like house-dust-allergic
asthma and fluticasone propionate aqueous nasal spray may be
analysed either as NPs containing a compound adjective and a head
noun, or as compounds including optional adjective constituents
(house dust asthma and fluticasone spray are perfectly
well-formed). Furthermore, parts of an English compound may
provide referents for elliptic constructions, as in the following
examples:
The variations in provocation concentrations ...
were small during both placebo and active
drug treatment
the difference between a single allergen
provocation and continuous exposure...
Thus, a noun included in a compound can still have a referent of
its own, an ability normally associated with nominal phrases. Such
facts indicate that there is no absolute distinction to be drawn
between compound nominals and complex nominal phrases in English.
It seems more appropriate to talk about more or less lexicalized
collocations. However, in the following the traditional term
'compound' will be used.
2. AUTOMATIC INTERPRETATION AND
TRANSLATION OF COMPOUNDS
2.1. The theoretical foundation 
Bierwisch (1989; cf. also Olsen 1991) regards the process of
compounding as a functional application, where one of the thematic
roles of the head noun becomes 'absorbed'. For example, a noun
like payer is supposed to have the following interpretation:
λy λx [z INST [x PAY y]],
where x is the external theta-role, y the internal one, and z
represents the 'referential role'. In a compound like bill payer,
the internal role of pay becomes instantiated:
λx [z INST [x PAY BILL]].
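The absorption of a role can be illustrated with a minimal Python sketch. The data structure (a tuple of open roles plus a predicate template) and the names NounEntry and absorb are assumptions made for illustration only; they are neither Bierwisch's formalism nor part of the program described in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NounEntry:
    """Hypothetical stand-in for a lexical entry with open theta-roles."""
    open_roles: tuple  # roles still awaiting an argument
    body: str          # predicate template with {role} placeholders


def absorb(head: NounEntry, role: str, filler: str) -> NounEntry:
    """Instantiate one open role of the head with the modifier's meaning."""
    if role not in head.open_roles:
        raise ValueError(f"role {role!r} is not open in this entry")
    remaining = tuple(r for r in head.open_roles if r != role)
    return NounEntry(remaining, head.body.replace("{" + role + "}", filler))


# payer: both the internal role y and the external role x are open.
payer = NounEntry(("y", "x"), "[z INST [{x} PAY {y}]]")
# bill payer: the internal role y is absorbed by BILL, x stays open.
bill_payer = absorb(payer, "y", "BILL")
```

After the call, only the external role remains open in bill_payer, which mirrors the step from the first formula to the second.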
Our analysis of compounds is not incompatible with Bierwisch's
approach. However, for the purpose of MT, a classification of
valency in terms of three kinds of theta roles only (external,
internal and referential) seems insufficient. A procedure for
compound interpretation must also take into account optional
thematic roles, e.g. location (university teacher). It must in
addition be able to deal with compounds that do not include
deverbal components. Hence, we decided to modify the theory
proposed by Bierwisch at two main points:
a. the valency of a verbal stem is to be represented not in terms
of external and internal theta roles, but in terms of the
components of the event or situation the verb may refer to
b. the interpretation of compounds that do not contain deverbal
elements is based on morpho-semantic patterns specifying the
default readings of combinations that include members of
different semantic categories.
2.2. An experimental procedure for
understanding derived nouns and
compounds
In an experimental program, implemented in LPA MacProlog, we
structured a very restricted lexicon of Swedish stems and affixes
(basal lexical entries, BLE) according to the approach outlined
above. Each verbal stem was provided with a list of elements of
its typical event referent, e.g.:
lex([lär],m(teach,stem),v,vt,[agent,
sem_object,domain,place,time,result],[]).
Affixes were specified with respect to the following features:
- the category or categories of stems the affix may be combined
with
- the resulting category, including the morphosyntactic
specification
- the default semantic interpretation of the affix.
For example, the Swedish agentive suffix -are was represented as:
slex([are],suff(n,agr(sg,re,indef)),v,agent,[]).
Underived nouns got a quite simplified semantic specification
formulated in traditional terms like 'human', 'animate',
'abstract', 'concrete', 'potential location' etc. On this basis,
the interpretation procedure tried to match the semantic
specification of the affix or of the noun and associate the
morphematic entries attached to the verb stem with the most
probable elements of the stem's semantic valency. The program
distinguished correctly between compounds like grammatiklärare
(teacher of grammar) and universitetslärare (university teacher),
as shown in the following printout.
:- analyse([grammatiklärare])
m([domain(grammar),
   head(m([agent(suff),
           head(teach)]))])
category: n agr(sg, re, indef)
constituents [grammatik, lärare, [lär, are]]
:- analyse([universitetslärare])
m([place(university),
   head(m([agent(suff),
           head(teach)]))])
category: n agr(sg, re, indef)
constituents [universitet, lärare, [lär, are]]
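The matching step behind these two analyses can be approximated by the following Python sketch. It is not the MacProlog program; the tables NOUN_CATEGORY and PREFERRED_ROLE, and the order of preference they encode, are assumptions chosen so that the two examples come out as in the printout.

```python
# Valency list of the stem, as in the lex/7 entry for 'lär' (teach).
VERB_VALENCY = {
    "teach": ["agent", "sem_object", "domain", "place", "time", "result"],
}
# Assumed semantic categories of the modifier nouns.
NOUN_CATEGORY = {
    "grammar": "abstract",
    "university": "potential_location",
}
# Assumed defaults: which valency elements a noun category most likely fills.
PREFERRED_ROLE = {
    "abstract": ["domain", "sem_object"],
    "potential_location": ["place"],
}


def analyse(modifier, verb_stem, suffix_role="agent"):
    """Return a nested dict mirroring the m([...]) terms, or None if no role fits."""
    # The role taken by the suffix (-are = agent) is no longer available.
    free = [r for r in VERB_VALENCY[verb_stem] if r != suffix_role]
    for role in PREFERRED_ROLE[NOUN_CATEGORY[modifier]]:
        if role in free:
            return {role: modifier,
                    "head": {suffix_role: "suff", "head": verb_stem}}
    return None
```

With these tables, analyse("grammar", "teach") assigns the modifier to domain and analyse("university", "teach") assigns it to place.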
The program was also able to interpret somewhat unusual, but
fully possible compounds like universitetsmördare (university
killer). In the case of 'university killer', three alternative
interpretations were given, all of them acceptable in Swedish:
1) a person who kills in university buildings, 2) somebody who
causes destruction of a university, 3) somebody who uses a
university for destructive purposes.
The flexibility of the quite simple interpretation procedure and
its ability to 'understand' even unusual complex words encouraged
us to apply the method tested by means of the toy program for a
more serious goal, viz. for interpretation and translation of
medical abstracts dealing with asthma and allergy research.
2.3. Translation of compounds within a
restricted domain (medical texts on
asthma and allergy research)
2.3.1. Domain-related requirements 
In order to construct a domain-specific lexicon and to design
appropriate parsing and translation algorithms, we investigated a
corpus of about 140 medical abstracts. Already the preliminary
inspection provided evidence for the need of a special procedure
for compound interpretation. The frequency of compounds in the
texts was extremely high. Cf. the following sample:
A large-scale multicenter investigation
was undertaken in 3 cities with comparable
pollen seasons and atmospheric pollen
concentrations in order to obtain more
definite information about the safety and
efficacy of cromolyn sodium in the treatment
of pollen-induced seasonal rhinitis.
Complex names of chemical substances, such as cromolyn sodium, do
not pose especially great problems to an MT system, since chemical
symbols may be efficiently used as interlingual representations.
Highly lexicalized and highly idiosyncratic compounds, like
airways or hay fever, may also be stored in the basic lexicon.
The main difficulty lies rather in the translation of productive
compounds referring to different allergic syndromes, types of
medical treatment and patient groups (ragweed pollen asthma,
late-summer rhinitis, flunisolide test, flunisolide patient group
etc.). In different texts, the same syndrome may be referred to by
different phrases, e.g. ragweed asthma, ragweed-induced asthma,
ragweed pollen asthma, ragweed-allergic asthma etc. A correct
interpretation of the semantic relations between the constituents
of such collocations is necessary for correct translation.
Otherwise, a phrase like childhood asthma would be translated into
French not as asthme des enfants, but as asthme induit par enfance
(lit. asthma induced by childhood, by analogy to e.g. pyrethrum
asthma - asthme induit par pyréthrines). A procedure for
interpretation of compounds and complex NPs must therefore include
a kind of domain knowledge, preferably encoded in the lexicon.
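The role of the modifier's semantic category in choosing the French pattern can be sketched as follows. This is a simplified Python illustration, not the translation component itself: the glossary contains only the glosses quoted above, and the two templates (in particular the time_period template, which is tailored to childhood asthma) are assumptions.

```python
# Hypothetical mini-glossary; only glosses quoted in the text are used.
FRENCH = {"asthma": "asthme", "pyrethrum": "pyréthrines", "childhood": "enfants"}
# Domain categories of the English nouns.
CATEGORY = {"pyrethrum": "allergen", "childhood": "time_period", "asthma": "syndrom"}

# Assumed templates: modifier category -> French pattern for a syndrom head.
TEMPLATES = {
    "allergen": "{head} induit par {mod}",
    "time_period": "{head} des {mod}",
}


def translate(modifier, head):
    """Pick the French pattern from the modifier's category, then fill it in."""
    template = TEMPLATES[CATEGORY[modifier]]
    return template.format(head=FRENCH[head], mod=FRENCH[modifier])
```

Without the category lookup, both compounds would fall under the same noun-noun rule and childhood asthma would receive the causal reading.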
2.3.2. The lexicon 
An MT system aimed at translation of scientific texts should give
the user a possibility of adding new entries to the lexicon in a
simple way. A system for medical abstract translation would not be
really useful if the user could not introduce names of new
medicines, new terms denoting syndromes, symptoms, treatment
methods etc. Since the user of such a system would, with a high
degree of probability, be a non-linguist, the linguist designing
the method for lexicon extension must adapt the form of
interactions to the expected competence of the user.
It would be naive to believe that a non-linguist could manage to
specify the lexical items in terms of internal and external
theta-roles. Even terms like agent, theme and semantic object
would probably cause confusion. Hence, it seems most reasonable to
formulate the semantic classification in domain-specific terms (in
our case, in terms like allergen, syndrom, body-part etc.). There
are actually linguistic reasons for this solution, as scientific
sublanguages differ semantically from each other as well as from
everyday conversational language. For a botanist, pyrethrum is
primarily a plant belonging to the chrysanthemum family, whereas
an allergy researcher regards pyrethrum as an allergy-inducing
factor, having much in common with grass pollen and house dust.
In the preliminary model of the lexicon deve- 
loped until now we classify nouns as members 
of the following categories: 
- syndrom (asthma, rhinitis) 
- symptom (sneezing, irritation) 
- allergen (pyrethrum, ragweed) 
- body part (airways, skin) 
- body function (inhalation) 
- chemical substance: 
medicine (antihistamine) or 
not used as medicine (histamine) 
- medical treatment (injection) 
- scientific method (measurement, test) 
- time period (season, childhood) 
- human: patient or not (the latter distinction is
needed for correct interpretation of e.g.
asthma patient and asthma researcher)
- amount: mass or countable (dose, group) 
- others: concrete or abstract 
2.3.3. Interactive lexicon extension 
The user has the possibility to classify new nouns to be added to
the lexicon by marking the desired alternative in an interaction
window. The same entry may be marked as belonging to several
categories. For example, inhalation may be regarded as both body
function and medical treatment (house dust inhalation / steroid
inhalation). When adding a compound, the user is asked to specify
its constituents according to the category list above. New words
may be typed in by the user or read in from a text file.
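The bookkeeping behind this step can be sketched in Python as below. The function name add_noun, the dictionary representation and the underscore spellings of the category labels are assumptions for illustration; the interaction window itself is not modelled.

```python
# Category labels adapted from the list in 2.3.2 (spellings are assumptions).
CATEGORIES = {
    "syndrom", "symptom", "allergen", "body_part", "body_function",
    "medicine", "chemical_substance", "medical_treatment",
    "scientific_method", "time_period", "patient", "human",
    "mass_amount", "countable_amount", "concrete", "abstract",
}


def add_noun(lexicon, word, categories):
    """Add a noun under one or more categories, keeping earlier markings."""
    unknown = set(categories) - CATEGORIES
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    lexicon.setdefault(word, set()).update(categories)
    return lexicon


lexicon = {}
add_noun(lexicon, "inhalation", ["body_function"])
add_noun(lexicon, "inhalation", ["medical_treatment"])
```

Storing a set of categories per entry lets inhalation be read as a body function in house dust inhalation and as a medical treatment in steroid inhalation.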
It is assumed that the lexical entries to be added will belong to
open lexical classes: nouns, verbs and adjectives. To distinguish
between these three classes is not an impossible task for a
non-linguist, especially if an appropriate instruction is
provided. Adjectives are classified in a way similar to nouns,
e.g. nasal, bronchial - denoting body part; stuffy, runny (as in
stuffy nose) - denoting symptom and attribute of body part.
A user-adapted classification of verbs is more difficult to
achieve. In our preliminary model, the user is presented with
questions combined with example patterns, for instance: 'Does the
verb take an object, like investigate the effects?' 'Does it also
take a complement with a certain preposition, like: shield the
patient from house dust?' 'What preposition is required?' If the
verb in question turns out to be transitive, a further question is
asked about the semantic category of the typical object, according
to the standard category list. The specification of verbs takes
more time than that of nouns and adjectives. However, the need to
introduce new verbs is usually not as great as the need to add
new nouns.
2.3.4. Compound interpretation and 
generation of target equivalents 
The present program covers the most frequent types of compounds
found in the corpus. After having filtered out the most frequent
verbs (auxiliaries, modals) and items belonging to closed lexical
classes (pronouns, articles, prepositions etc.), we first
investigated word frequencies, and then the (unfiltered)
environment of about the thirty most frequent words. On this
basis, we could state that the most usual compounds containing the
most frequent nouns (disregarding names of chemical substances)
display the following patterns:
i. (attribute, concrete)-allergen-(adj)-syndrom
house dust (allergic) asthma
(grass) pollen (seasonal) asthma
ii. medicine/allergen-medical treatment
antigen injection
allergen injection
steroid treatment
iii. (time period)-adj/allergen/medicine-
(body part)-scientific method
allergen (skin) test
9 week double-blind study
iv. syndrom-patient-(countable amount)
hay fever patient group
v. medicine-(patient)-countable amount
steroid patient group
flunisolide group
vi. body part-body function/symptom
skin hyperresponsiveness
airway patency
vii. (attribute, concrete)-allergen-time period
grass pollen season
viii. (medicine/allergen)-medical treatment/body
function-time period
steroid treatment period
house dust inhalation period
The procedure for compound interpretation is based on a Prolog
formalization of the most frequent patterns. The following program
fragment shows what the format for basal lexical entries looks
like and how the interpretation rules are constructed.
lex([asthma],n,[syndrom],_,_,_).
lex([dust],n,[allergen],_,_,_).
lex([pollen],n,[allergen],_,_,_).
lex([patient],n,[patient],_,_,_).
lex([season],n,[time_period],_,_,_).
lex([steroid],n,[medicine],_,_,_).
lex([grass],n,[concrete],_,_,_).
/* pattern: grass pollen */
tlex([G,P],mean([G,P]),n,
     [allergen],F1,F2,F3):-
    lex([G],_,[concrete],_,_,_),
    lex([P],n,[allergen],_,_,_).
/* pattern: allergen-syndrom:
   ragweed asthma */
tlex(Tlex,mean(Complex),n,
     [syndrom],A,B,C):-
    append(All,Dis,Tlex),
    lex(All,n,[allergen],_,_,_),
    lex(Dis,n,[syndrom],A,B,C),
    append(Dis,[because_of],New),
    append(New,[allergen(All)],Complex).
/* pattern: allergen,complex-(a)-syndrom:
   grass pollen (allergic) asthma */
tlex(Tlex,mean(Complex),n,
     [syndrom],A,B,C):-
    append([Attr,All],Dis,Tlex),
    (lex(Dis,n,[syndrom],A,B,C),
     tlex([Attr,All],M,n,[allergen],_,_,_),
     append(Dis,[because_of],New),
     append(New,[allergen([Attr,All])],
            Complex));
    (lex(Dis,a,Sem,_,_,_),
     append([Attr,All,Dis],Next,Tlex),
     lex(Next,n,[syndrom],A,B,C),
     tlex([Attr,All],M,n,[allergen],_,_,_),
     append([attr(Dis)],Next,Head),
     append(Head,[because_of],New),
     append(New,[allergen([Attr,All])],Complex)).
lex = basic lexical entry
tlex = temporary lexical entry
The rules simply specify the default interpretation of a sequence
of nouns and deliver a semantic representation coded in 'Machinese
English', as shown in the printouts below:
:- interpret([house, dust, asthma, patient])
mean([patient,
      suffering_from,
      syndrom([asthma,
               because_of, allergen([house, dust])])])
grammatical category : n
semantic category : [patient]
:- interpret([house, dust, inhalation])
mean([inhalation,
      of_object,
      allergen([house, dust])])
grammatical category : n
semantic category : [body_function]
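The way such representations are assembled can be approximated in Python. The sketch below is a simplification and not a port of the Prolog rules: it reads the compound left to right, merges an optional concrete attribute with a following allergen, and then wraps the material read so far under each new head. The tables LEX and RELATION and the tuple encoding of the result are assumptions.

```python
# Assumed mini-lexicon: word -> domain category.
LEX = {
    "house": "concrete", "grass": "concrete",
    "dust": "allergen", "pollen": "allergen",
    "asthma": "syndrom", "patient": "patient",
    "inhalation": "body_function",
}

# (category read so far, category of next head) -> linking relation.
RELATION = {
    ("allergen", "syndrom"): "because_of",
    ("syndrom", "patient"): "suffering_from",
    ("allergen", "body_function"): "of_object",
}


def interpret(words):
    """Return (meaning, semantic category) for a noun sequence."""
    cats = [LEX[w] for w in words]
    # An optional concrete attribute merges with a following allergen noun.
    if len(words) >= 2 and cats[0] == "concrete" and cats[1] == "allergen":
        meaning, cat, rest = ("allergen", words[:2]), "allergen", words[2:]
    else:
        meaning, cat, rest = (cats[0], [words[0]]), cats[0], words[1:]
    for word in rest:
        head_cat = LEX[word]
        # A KeyError here means that no pattern covers the sequence.
        relation = RELATION[(cat, head_cat)]
        meaning, cat = (head_cat, [word, relation, meaning]), head_cat
    return meaning, cat
```

For example, interpret(["house", "dust", "inhalation"]) yields the category body_function and a meaning in which inhalation is linked to the allergen by of_object, in parallel to the second printout.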
The Machinese representations can without difficulty be matched
with the appropriate target morphosyntactic patterns. For example,
the semantic representation of grass pollen asthma patient becomes
associated with the Polish pattern (simplified notation):
patient,suffering_from,
syndrom([X,because_of,allergen(Attr,All)]) -->
n(patient,Agr,nom),
prtact(suffer,Agr,nom),
prep(suffer,Prep,Case),
n(X,Agr2,Case),
prtpass(cause,Agr2,Case),
n(allergy,Agr3,ins),
prep(_,na,ack),
n(All,Agr3,ack),
n(Attr,agr(Gen,pl),gen).
ins = the instrumental case
The pattern above correctly generates the Polish equivalent of
grass pollen asthma patient:
pacjent cierpiący na   astmę
patient suffering prep asthma-acc
spowodowaną
caused-acc
uczuleniem  na   pyłek      kwiatowy   traw
allergy-ins prep pollen-acc flower-adj grass-gen
In a similar way, the program disambiguates ragweed asthma and
childhood asthma when translating into French. Still, certain
ambiguities may remain: the present program can, for example, not
decide whether grass pollen asthma should be translated into
French as asthme induit par pollen des graminées or par pollen de
l'herbe. The decision has to be made by the user.
Translation of frequent compounds of the type noun+past participle
(allergen-shielded, allergen-tested, placebo controlled) is
handled in a way similar to the one used in the prototype program
when translating compounds like university teacher and university
killer. The semantic category of the noun is compared with the
semantic specification of the valency of the verb stem and the
noun is associated with the most probable verbal argument. Thus,
allergen shielded room is interpreted as 'a room shielded from
allergen', while allergen tested skin gets the reading 'skin
tested by exposure to allergen'.
3. CONCLUSIONS AND IMPLICATIONS
FOR FURTHER RESEARCH
3.1. Remaining problems 
The method proposed here has so far led to good translation
results. However, the problem lies not only in interpreting a
compound, but also in identifying an English word sequence as a
compound. For the time being, we use a parsing procedure based on
a combination of dependency grammar and categorial grammar. The
main parsing difficulty, when dealing with an English input, is to
decide whether a lexical stem functions as a finite predicate or
as a nominal. We try to remove the ambiguity by starting the
parsing with a procedure called 'verbfinder', searching for
possible candidates for the predicate function. The function of
ambiguous items, like result, control etc., may often be
identified on the basis of their environment: if the word in
question is immediately preceded by a preposition and/or an
article, it can be easily identified as a nominal element. The
parsing procedure may still be made more efficient by utilizing
results of statistical investigations of the corpus (Steier &
Belew 1991, Johansson 1993).
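The environment test for ambiguous items can be sketched in Python as follows. This covers only the preposition/article heuristic mentioned above, not the 'verbfinder' procedure as a whole; the word lists and the function name is_nominal are assumptions.

```python
# Small illustrative word lists (assumptions, far from complete).
PREPOSITIONS = {"of", "in", "on", "with", "by", "from", "for", "under"}
ARTICLES = {"a", "an", "the"}
AMBIGUOUS = {"result", "control"}


def is_nominal(tokens, i):
    """True if the ambiguous token at index i directly follows a preposition or article."""
    if tokens[i].lower() not in AMBIGUOUS:
        raise ValueError(f"{tokens[i]!r} is not listed as ambiguous")
    if i == 0:
        return False
    return tokens[i - 1].lower() in PREPOSITIONS | ARTICLES
```

In "the result was small" the token result follows an article and is taken as a nominal; in "these findings result in" the test fails, so result remains a candidate for the predicate function.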
3.2. Future plans 
The advantage of the model outlined here lies in the fact that the
general approach to the grammar underlying the translation system
may be adapted to different domains without violating any
theoretical assumptions. However, the theory alone does not
guarantee a high-quality translation. The preliminary system
outlined above is to be developed and improved along the following
lines:
- statistical methods will be used in order to reduce ambiguities
and to discover cooccurrence patterns on the basis of larger
corpora
- the medical vocabulary will be enlarged by using large
computational medical data-bases (e.g. MEDLINE) and by consulting
specialists who are native speakers of the languages involved in
the system
- the interactive procedures will be evaluated and refined by
testing their usefulness in experiments with non-linguists.
The results of the corpus investigations and the experiments with
translation of abstracts are to be used in a system for automatic
abstracting and multilingual abstract generation.
REFERENCES:
Anderson, S. 1982. Where's morphology?
Linguistic Inquiry 13. pp. 571-612.
Bierwisch, M. 1989. Event nominalization:
Proposals and problems. Motsch, W. (ed.):
Wortstruktur und Satzstruktur. Berlin: VEP,
(=Linguistische Studien, Reihe A 194). pp.
1-73.
Chomsky, N. 1970. Remarks on nominalizations.
R. Jacobs & P. Rosenbaum (eds.):
Readings in English Transformational Grammar.
184-221. Waltham: Ginn & Co.
Di Sciullo, A.M. & E. Williams. 1987. On the
definition of word. Cambridge: MIT Press.
Fanselow, G. 1985. What is a possible complex
word? J. Toman (ed.): Studies in German
Grammar. Dordrecht: Foris. pp. 289-318.
Fanselow, G. 1988. 'Word Syntax' and semantic
principles. G. Booij & J. v. Marle
(eds.): Theorie des Lexikons. Universität
Düsseldorf. pp. 1-32.
Jackendoff, R. 1975. Morphological and semantic
regularities in the lexicon. Language
51. pp. 639-71.
Johansson, C. 1992. Using a statistical measure
to find relations between words: Semantics
from frequency of co-occurrence? (MS)
Karlsson, F. 1990. Constraint grammar for
parsing running texts. CoLing '90, Helsinki.
pp. 168-173.
Koskenniemi, K. 1990. Finite-state parsing
and disambiguation. CoLing '90, Helsinki.
pp. 229-32.
Lees, R. 1960. The grammar of English
nominalizations. The Hague: Mouton.
Olsen, S. 1991. Zur Grammatik des Wortes:
Argumente zur Argumentstruktur. Theorie
des Lexikons. Universität Düsseldorf. pp.
31-58.
Pesetsky, D. 1990. Experiencer predicates and
universal alignment principles. Cambridge:
MIT Press.
Selkirk, E. 1982. The syntax of words.
Cambridge: MIT Press.
Steier, A.M. & R.K. Belew. 1991. A statistical
analysis of topical language. R. Casey & B.
Croft (eds.): 2nd Symposium on Document
Analysis and Information Retrieval.
