THE MERGED UPPER MODEL: A LINGUISTIC ONTOLOGY
FOR GERMAN AND ENGLISH
Renate Henschel and John Bateman*
Project KOMET
GMD/Institut für Integrierte Publikations- und Informationssysteme
Dolivostr. 15, D-64293 Darmstadt, Germany
{henschel, bateman}@ipsi.darmstadt.gmd.de
Abstract 
A detailed comparison of the Penman Upper Model and the KOMET German Upper Model has been carried out in order to construct a new Merged Upper Model capable of serving as the ideational basis for generation in both English and German. Previously proposed criteria for conducting such a merge are expanded on and evaluated. It is established that no (semi-)automatic merging of such knowledge sources can be expected to produce a reasonable result and that detailed comparison of the kind reported is essential. The result of the merge is now being used as the basis for sentence generation in English, German and Dutch.
1 INTRODUCTION: MULTILINGUAL LINGUISTIC 'ONTOLOGIES'
With the need to develop re-usable frameworks for organizing information (cf., e.g., the ARPA Knowledge Sharing Effort [Patil et al., 1992]), workable proposals for 'ontologies' to provide such organization are increasing in importance. Such ontologies are now commonly applied in Natural Language Processing systems since there the representation of commonsense and domain knowledge is essential. Accordingly, a number of organizations of knowledge have been developed, some quite extensive. These organizations are evaluated by the extent to which they prove re-usable across distinct domains and applications. The consideration of the reuse of these organizations is, however, complicated by the range of differing design criteria that are employed in their construction; an extensive overview of approaches is given in [Bateman, 1992a]. One particular approach is to define linguistically motivated ontologies, where the criteria for organization rest on semantic distinctions that the grammar of a language needs to have drawn in order to motivate its deployment of grammatical distinctions.
Two sizeable linguistic ontologies constructed in this way are the Penman Upper Model developed for English text generation within the Penman project at USC/ISI [Penman Project, 1989] and the Upper Model for German developed similarly for text generation within the KOMET project at GMD/IPSI [Bateman et al., 1991a]. The English Upper Model (EUM) is described in [Bateman et al., 1990]; the concepts of the German Upper Model (GUM) go back to [Steiner et al., 1988] and [Teich, 1992]. Both ontologies are individually examples of the most detailed such ontologies currently under development, each with over 200 domain and application independent concepts arranged in a subsumption lattice, spanning distinct types of processes (mental, communication, relational, actions), and diverse qualities and objects. Both ontologies have been used in a number of domains and show good re-usability characteristics, mainly due to the fact that they are linguistically motivated. Thus, for example, if language generation is required there is typically 100% re-usability across domains, in contrast to the 50% described by [Dirlein, 1993] for the largely non-linguistically motivated LILOG ontology.¹
* Also on indefinite leave from the Penman Project, USC/Information Sciences Institute, Marina del Rey, Los Angeles.
There are a number of suggestions for the evaluation of ontologies in terms of formal properties of consistency and coherence of the information those ontologies contain (e.g., [Horacek, 1989, Guarino, 1994]). With a restriction to linguistically motivated ontologies, we can now state further design principles concerning what is to be represented and how. Although these principles were originally developed in order to carry out a detailed comparison of the EUM and the GUM, they are generally applicable to all linguistically motivated ontologies; evaluating the concepts proposed within such an ontology according to the principles described in this paper should improve the status of that ontology overall.
The main result of the EUM-GUM comparison is a Merged Upper Model presently used within the KOMET project as the basis for multilingual sentence generation in English, German and Dutch.² This then also provides an early answer to a question concerning a different kind of re-usability of linguistically motivated ontologies, i.e., the extent to which they can be re-used across distinct languages rather than across distinct domains.
¹ This is simply because a linguistically motivated ontology is bound to the semantics of a grammar and not to general, possibly domain-transcendent knowledge. The two kinds of ontologies should therefore be seen as performing different kinds of work. For extensive motivations for maintaining a linguistically motivated ontology, see [Halliday and Matthiessen, to appear, Bateman, 1992a].
² The comparison is based on the English Upper Model and German Upper Model data files from July 1992. Both are expressed in the knowledge representation language LOOM ([MacGregor and Brill, 1989]).
2 THE MERGING METHOD
2.1 Starting points
Merging distinct ontologies is a problem that will occur more frequently as new proposals are to be reconciled. [Hovy and Nirenburg, 1992] propose a general method for creating a merged ontology out of different ontologies where it does not matter whether differences are language dependent or due to different linguistic theories. The commonalities and differences in two ontologies are classified according to Hovy and Nirenburg as follows:
1. Identity: The same concept is found in both ontologies.
2. Extension: There is a concept in one ontology which is missing in the other, but which specializes the latter ontology further.
3. Cross classification: The partitioning of identified concepts into subconcepts differs in the considered ontologies.
The merging procedure then keeps all concepts of cases (1) and (2) and resolves case (3) by exhaustive cross classification.
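The three cases and the exhaustive cross classification can be illustrated with a minimal Python sketch. It is not the procedure of Hovy and Nirenburg itself: the function name `merge_partitions`, the representation of a partition as a list of subconcept names, and the use of set equality as the identity test are all assumptions made for illustration.

```python
from itertools import product

def merge_partitions(subs_a, subs_b):
    """Classify one concept and return (case, merged subconcepts).

    subs_a / subs_b: list of subconcept names the concept has in each
    ontology, or None if the concept is absent from that ontology.
    (Hypothetical representation, chosen only for this sketch.)
    """
    if subs_a is None and subs_b is None:
        raise ValueError("concept must occur in at least one ontology")
    if subs_a is None or subs_b is None:
        # Case 2: the concept exists in only one ontology and is kept as is.
        present = subs_a if subs_a is not None else subs_b
        return "extension", list(present)
    if set(subs_a) == set(subs_b):
        # Case 1: same concept, same partitioning.
        return "identity", list(subs_a)
    # Case 3: partitions differ, so every pairing becomes a cross concept.
    return "cross classification", [f"{a}/{b}" for a, b in product(subs_a, subs_b)]

case, subs = merge_partitions(
    ["directed", "nondirected"],
    ["agent-only", "affected-only", "agent-centered"],
)
```

With two subconcepts on one side and three on the other, the exhaustive strategy yields six cross concepts, which shows how quickly the merged ontology grows when partitions do not match.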
A simplified version of this procedure is proposed in [Hovy and Knight, 1993]. Here, the cross classification resulting from nonmatching partitions of identified concepts into subconcepts is replaced by parallel subordination of those subconcepts. This results in a substantial reduction in the concepts necessary, but leaves open the question of the mutual relation between concepts stemming from different source ontologies. Participating NLP components can be controlled well by such a shared ontology, but its adequacy as a point of communication in a joint MT system is less clear.
We have found that it is necessary to go beyond the original merging methodology in a number of ways. Nevertheless, as a consequence of the criteria for merging that we propose, not all existent concepts of the EUM and GUM need find their representation in the Merged Upper Model and cross classification is still significantly reduced without impairing inter-translatability across concepts arising from different source ontologies.
2.2 Problems with identity 
The crucial point in [Hovy and Nirenburg, 1992] is the notion of 'identity'. The decision how to deal with different concepts (identification, extension, or cross classification) is based on the possibility of stating an identity between concepts of different language ontologies. This is somewhat problematic. In the comparison between the English and the German upper models, we took as identification criterion the equivalence of the sentences or phrases which can be generated by the concepts. This correspondence relies on the assumption that German and English sentences have a one-to-one mapping and that translation is a totally information preserving relation. Although this is not true in general, we based our merging on the assumption that it may be true for simple sentences if we abstract out the textual and interpersonal, i.e., non-experiential (examples below), dimensions of utterances, and the language distance is close. Hence, the whole construction has to be seen in the context of its own relativity.
2.3 Further Principles 
Principle 1: Removal of non-experiential concept discrimination
Difficulties can arise when ontologies to be merged are themselves inherently problematic in some way. Internal problems should not be automatically transmitted to a merged ontology. Thus, during merging, the distinctions drawn in individual ontologies need always to be evaluated internally before being admitted. The exclusion of textual and interpersonal information in a merged upper model provides an additional important criterion for an 'extended identification' of concepts within ontologies to be merged.
Two common kinds of non-experiential concept discriminations were found in the GUM. The first introduces distinct upper model concepts in order to motivate lexicogrammatical realization by differing types of grammatical units. The second introduces distinct upper model concepts to motivate the selection of semantic roles from a given semantic configuration that are to be lexicogrammatically expressed.
An example of the first kind is offered by the concepts G-Relational and G-Relationship.³ These are both responsible for the generation of processes experientially classifiable as relational, but whereas G-Relationship causes an attributive or adverbial realization, G-Relational causes a clausal realization. Thus, phrases (1) and (2) can only be generated from different semantic input, expressed here in the form of the typed semantic assertions of the Penman Sentence Plan Language (SPL) [Kasper, 1989]:⁴
(1) Das Mädchen ist krank. (The girl is sick.)
(a / classificatory
   :attribuant (m / person :lex mädchen)
   :classifier (k / quality :lex krank))
(2) das kranke Mädchen (the sick girl)
(a / property-ascription
   :domain (m / person :lex mädchen)
   :range (k / quality :lex krank))
This problem does not surface so often in the EUM, although there are occasional violations, e.g., the inclusion of 'rhetorical relations' that are explicitly textual (see [Bateman et al., 1990] for details).
Examples of the second kind are offered by sentences (3a-b).
(3)a. Der Lehrer antwortet, dass das Raumschiff zurückgekehrt ist.
      (The teacher answers that the spaceship has returned.)
   b. Der Lehrer antwortet den Schülern, dass das Raumschiff zurückgekehrt ist.
      (The teacher answers the students that...)
The differences in (3) arise from differences in the number of semantic participants in the answering-event that are made grammatically explicit. Both (3a) and (3b) could be used to describe the same experiential event, the selection being made on non-experiential grounds (e.g., lack of relevance of a participant, being known from context, known from preceding text, etc.) specific to the text being created.
³ Concepts from the English Upper Model and the German Upper Model will be differentiated where relevant by prefixing either 'E-' or 'G-' as appropriate.
⁴ Classificatory is a subtype of GUM concept Relational, property-ascription a subtype of Relationship. Further, in the SPL examples in this paper, lexical selection is specified directly by means of the keyword :lex to avoid complicating the discussion unnecessarily.
Both are, however, classified semantically underneath distinct GUM concepts. These distinct concepts have differing obligatory role configurations, which requires that the selection of semantic (experiential) type has to be made according to the participants that are to be expressed, a decision that is often made on textual grounds without a change in experiential perspective. In the EUM the only semantic distinction in this area of communication processes is between 'telling'-like events (addressee-oriented) and 'saying'-like events (non-addressee-oriented), which is a difference in experiential perspective.⁵ In the EUM, concept discrimination is made with respect to differing possible realizations of roles, not straightforward absence/presence of roles as in the GUM. 'Missing' surface participants can be modelled more adequately by an upper model-grammar interface which allows defined semantic roles to have zero realization. This is an elegant way to deal with optional participants, passive, and impersonal constructions.
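The idea of zero realization can be sketched in Python as follows. This is only an illustration under stated assumptions: the role names `sayer`, `addressee` and `saying` and the function `surface_roles` are hypothetical and do not come from the EUM, the GUM or the KOMET interface.

```python
def surface_roles(roles, zero_realized=()):
    """Return only the roles that are expressed grammatically.

    Roles listed in zero_realized stay part of the semantic
    configuration but receive no surface realization.
    """
    return {name: filler for name, filler in roles.items()
            if name not in zero_realized}

# A single semantic configuration for the answering event of (3);
# the role names are hypothetical.
answering = {
    "sayer": "der Lehrer",
    "addressee": "den Schülern",
    "saying": "dass das Raumschiff zurückgekehrt ist",
}

with_addressee = surface_roles(answering)                    # as in (3b)
without_addressee = surface_roles(answering, {"addressee"})  # as in (3a)
```

Both variants start from the same semantic type, so the choice between (3a) and (3b) no longer forces a choice between two upper model concepts.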
The net effect of both kinds of violations⁶ of principle 1 is that the number of concepts is increased and necessary decisions concerning lexicogrammatical realizations are avoided. The proliferation of concepts, if allowed, would complicate considerably the task of 'identification' of similar concepts in ontologies to be merged.
Principle 2: Intelligent cross classification
Even following extended identification of concepts, it is not sufficient to provide cross products for those concepts that are not identifiable but which classify overlapping semantic areas. This is clarified by the following concrete, although much abbreviated, example of merging in the area of material (action) process types. The decisions that are required here are typical of the merging process as a whole.
The E-Material-Process hierarchy distinguishes processes more or less with regard to transitivity patterning. An E-Nondirected-Action is a process without external causation (mostly intransitive, although transitive sentences where the object is not affected or created by the action also fall into this class). E-Ambient-Process and E-Motion-Process are not exhaustive subconcepts of E-Nondirected-Action.
The vase broke.      Nondirected-Action
I play piano.        Nondirected-Action
The tourist ran.     Motion-Process
It rains here.       Ambient-Process
An E-Directed-Action in contrast is a process with an external causer as additional participant. E-Directed-Actions divide into E-Creative-Material-Action and E-Dispositive-Material-Action.
The child broke the vase.       Dispositive-Material-Action
The lion chased the tourist.    Dispositive-Material-Action
Mary baked a cake.              Creative-Material-Action
⁵ The former necessarily involves an addressee semantically, someone who is intended to be listening, while the latter does not. This difference is grammaticized in English in the acceptability/non-acceptability of 'I told him that...'/'I said him that...'. In order to grammaticize an addressee in a saying-like event, it is necessary to respect its less central role and to use the form 'I said to him that...'.
⁶ There are further similar cases; for example, the GUM also includes interpersonally motivated concept discriminations such as negative-feature-ascription and negative-quality. These govern the generation of negative assertions, thus pre-empting a more appropriate speech function control of negation.
The GUM differentiates G-Agent-centered, G-Affected-centered, G-Agent-only and G-Affected-only as disjoint G-Action subtypes. Here, we have at first a classification with regard to kind and number of participants. Examples for the semantic representations of the intransitive process types are given in (4) and (5), again in SPL notation:
(4) Der Tourist rannte. (The tourist ran.)
(r / action :lex rennen
   :agent (t / tourist))
(5) Die Pflanze geht ein. (The plant is dying.)
(e / action :lex eingehen
   :affected (p / pflanze))
The transitive processes (with two participants) are further broken up into G-Agent-centered and G-Affected-centered. The G-Affected-centered process type is a very special case of a transitive process. The definition is given in [Steiner et al., 1988] thus: "X affected-centered-verb Y iff X causes that Y affected-centered-verb". Examples are:
Das Kind zerbricht die Vase.
Das Kind bewirkt, dass die Vase zerbricht.
The child breaks the vase.
The child brings it about that the vase breaks.
Thus a process is called G-Affected-centered if the realizing verb is able to form an ergative pair. All G-Affected-centered processes have at least two participants, the G-Agent and the G-Affected.
The G-Agent-centered process is differentiated with regard to the different participant types for the second participant:
Affecting   Der Bauer fällt den Baum. (The farmer is felling the tree.)      G-Agent + G-Affected
Effecting   Die Mutter malt ein Haus. (The mother is painting a house.)      G-Agent + G-Effected
Ranging     Ich spiele Klavier. (I play piano.)                              G-Agent + G-Process-range
At first sight, there are few commonalities between these two ontologies. Without deeper introspection, one can only state an identity
E-Ambient-Process = G-Natural-Phenomenon
and could mechanically build a cross classification as shown in Figure 1. Some created concepts should, however, be omitted from this 'cross product ontology'.
Figure 1: Mechanical merge of the material processes by cross classification
The first obvious argument is the number of participants. These are contradictory in the following cross concepts: E-Directed-Action/G-Agent-only and E-Directed-Action/G-Affected-only. A comparison of the low level concepts shows further that the following GUM and EUM concepts can in fact be identified:
E-Dispositive-Material-Action = G-Affecting + G-Affected-centered
E-Creative-Material-Action = G-Effecting.
This rules out the cross concepts:
E-Dispositive-Material-Action/G-Effecting,
E-Dispositive-Material-Action/G-Ranging,
E-Creative-Material-Action/G-Affected-centered,
E-Creative-Material-Action/G-Affecting,
E-Creative-Material-Action/G-Ranging.
Furthermore, it is known from the definition of E-Nondirected-Action in [Bateman et al., 1990] that such processes are either intransitive or they have a second participant which is in meaning nothing else than the G-Process-range participant. Hence, the cross concepts:
E-Nondirected-Action/G-Affecting,
E-Nondirected-Action/G-Effecting,
E-Nondirected-Action/G-Affected-centered
as well as its subconcepts
E-Motion-Process/G-Affecting,
E-Motion-Process/G-Effecting
are ruled out. Finally, the exhaustive coverage of the low level subtypes in the EUM and GUM supports the following identities:
E-Nondirected-Action/G-Natural-Phenomenon
  = E-Ambient-Process/G-Natural-Phenomenon,
E-Nondirected-Action/G-Agent-centered
  = E-Nondirected-Action/G-Ranging,
E-Directed-Action/G-Affected-centered
  = E-Dispositive-Material-Action/G-Affected-centered,
E-Directed-Action/G-Agent-centered
  = E-Dispositive-Material-Action/G-Affecting
  + E-Creative-Material-Action/G-Effecting.
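The filtering step described above can be sketched in Python: build a mechanical cross product and remove the pairs the text explicitly excludes. The selection of concepts that are crossed here is an assumption made for illustration and is not a reconstruction of Figure 1; only the twelve excluded pairs are taken from the text, and the pairs that survive are not claimed to be the final merged concepts.

```python
from itertools import product

# Illustrative selection of concepts to cross (an assumption, not Figure 1).
EUM = ["E-Nondirected-Action", "E-Motion-Process", "E-Directed-Action",
       "E-Dispositive-Material-Action", "E-Creative-Material-Action"]
GUM = ["G-Agent-only", "G-Affected-only", "G-Affecting", "G-Effecting",
       "G-Ranging", "G-Affected-centered"]

# Cross concepts excluded in the text.
RULED_OUT = {
    # contradictory number of participants
    ("E-Directed-Action", "G-Agent-only"),
    ("E-Directed-Action", "G-Affected-only"),
    # excluded by the identities for dispositive and creative actions
    ("E-Dispositive-Material-Action", "G-Effecting"),
    ("E-Dispositive-Material-Action", "G-Ranging"),
    ("E-Creative-Material-Action", "G-Affected-centered"),
    ("E-Creative-Material-Action", "G-Affecting"),
    ("E-Creative-Material-Action", "G-Ranging"),
    # excluded by the definition of E-Nondirected-Action
    ("E-Nondirected-Action", "G-Affecting"),
    ("E-Nondirected-Action", "G-Effecting"),
    ("E-Nondirected-Action", "G-Affected-centered"),
    ("E-Motion-Process", "G-Affecting"),
    ("E-Motion-Process", "G-Effecting"),
}

mechanical = list(product(EUM, GUM))
surviving = [pair for pair in mechanical if pair not in RULED_OUT]
```

The exclusion set has to be written by hand from linguistic arguments, which is the point of the principle: the cross product is mechanical, the filter is not.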
By these kinds of detailed considerations, we have filtered an intelligent merge out of the mechanical merge. Within the intelligent merge, we omit the German differences concerning the participant number (G-Agent-only, G-Ranging) since these violate principle 1, and do not establish the very subtle G-Affected-centered type. Preferring the English terminology, the result is given in Figure 2.
This turns out to be mainly the EUM subhierarchy for material processes. To also cover the German requirements, the Nondirected-Action concept is differentiated into Nondirected-Doing and Nondirected-Happening according to the distinction between Agent-only and Affected-only. Therefore we do not need to preserve the German participant types Agent and Affected, and can infer the relevant information from the new Nondirected-Action subconcepts. The German SPL examples (4) and (5) then have the revised semantic form:
(4') (r / nondirected-doing :lex rennen
        :actor (t / tourist))
(5') (e / nondirected-happening :lex eingehen
        :actor (p / pflanze))
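The step from (4) and (5) to (4') and (5') can be written as a small rewriting function. The sketch below is in Python rather than SPL, and the dictionary encoding of an SPL term and the function name `revise` are assumptions made for illustration; it covers only the two intransitive cases discussed here.

```python
def revise(term):
    """Rewrite a GUM-style intransitive action into the merged form.

    The agent/affected distinction moves from the participant role
    into the process type, and the single participant becomes :actor.
    """
    roles = term["roles"]
    if term["type"] == "action" and set(roles) == {"agent"}:
        return {**term, "type": "nondirected-doing",
                "roles": {"actor": roles["agent"]}}
    if term["type"] == "action" and set(roles) == {"affected"}:
        return {**term, "type": "nondirected-happening",
                "roles": {"actor": roles["affected"]}}
    return term  # anything else is left unchanged in this sketch

example_4 = {"type": "action", "lex": "rennen", "roles": {"agent": "tourist"}}
example_5 = {"type": "action", "lex": "eingehen", "roles": {"affected": "pflanze"}}

revised_4 = revise(example_4)
revised_5 = revise(example_5)
```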
Because we have fixed the semantic differences between the G-Agent and the G-Affected participant in the process types, we do not need this differentiation as participant roles again. Hence, we choose the English participant types E-Actor and E-Actee, the correspondence of which to the German G-Agent, G-Affected, G-Effected and G-Process-range differs with the process type (see Figure 2). For further details of the merging of all 12 top-level regions of the two ontologies, see [Henschel, 1993].
Figure 2: Merging proposal for the material process type
Principle 3: Flexible semantics-grammar interface
One peculiarity of the proposed merging is that we do not assume a straightforward correspondence between concepts (especially process types) and sets of surface sentences. That means, disjoint concepts in the Merged Upper Model do not necessarily correspond to disjoint sets of surface sentences, only to disjoint semantic perspectives on them. The interface between the upper model and the grammar needs to be written in such a way that it is possible in some cases to generate the same sentence from different semantic input. This approach meets the differences between the process type partitioning in the EUM and the GUM without eliminating both perspectives and without creating new cross product types (as would be the case in the simple merging strategy), but by giving the upper model-grammar interface more flexibility.⁷ As a consequence, sentences such as (6) can now have two distinct semantic representations (7a-b) according to the Merged Upper Model; the concept destination in (7b) is a subconcept of relational-process, which is disjoint to material-process in (7a).
(6) Der Sohn begleitet seinen Vater in die Stadt.
    (The son accompanies his father to the city.)
(7)a. (b / material-process :lex begleiten
         :actor (p / person :lex sohn)
         :actee (v / person :lex vater)
         :destination (s / one-or-two-d-location
                         :lex stadt))
   b. (b / destination :lex begleiten
         :domain (v / person :lex vater)
         :range (s / one-or-two-d-location
                   :lex stadt)
         :third-party-agent
            (p / person :lex sohn))
The two semantic representations correspond to two genuinely alternative experiential perspectives on the events, one focusing more on its action-like nature, the other more on its relational-like nature.
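How two disjoint semantic types can lead to one surface sentence may be sketched as follows. This Python fragment is not the KOMET upper model-grammar interface: the table `ROLE_TO_FUNCTION`, the grammatical function labels and the dictionary encoding of (7a) and (7b) are assumptions made for illustration.

```python
# Hypothetical mapping from semantic roles to grammatical functions,
# stated separately for each process type.
ROLE_TO_FUNCTION = {
    "material-process": {"actor": "subject", "actee": "object",
                         "destination": "directional"},
    "destination": {"third-party-agent": "subject", "domain": "object",
                    "range": "directional"},
}

def to_frame(term):
    """Map a semantic term to a grammatical frame (verb plus functions)."""
    mapping = ROLE_TO_FUNCTION[term["type"]]
    frame = {"verb": term["lex"]}
    for role, filler in term["roles"].items():
        frame[mapping[role]] = filler
    return frame

reading_7a = {"type": "material-process", "lex": "begleiten",
              "roles": {"actor": "sohn", "actee": "vater",
                        "destination": "stadt"}}
reading_7b = {"type": "destination", "lex": "begleiten",
              "roles": {"domain": "vater", "range": "stadt",
                        "third-party-agent": "sohn"}}

frame_a = to_frame(reading_7a)
frame_b = to_frame(reading_7b)
```

Both readings arrive at the same frame, so a grammar working from the frame would produce sentence (6) in either case, while the semantic inputs stay distinct.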
⁷ Giving this interface this flexibility is in any case argued for on other grounds in [Bateman, 1992b].
3 RESULTS AND CONCLUSIONS
3.1 The Merged Upper Model
By applying the principles for merging set out above, it was possible to fully replace both upper models by a single merged upper model that differs very slightly from the EUM. The Merged Upper Model can in fact be obtained from the EUM by a small number of additions (8 new concepts and 1 change of role restrictions to an existing concept). This lack of difference supports the claims concerning multilinguality of functional descriptions made in [Bateman et al., 1991b]. There it is argued that a functional grammar already goes beyond strictly language specific distinctions: the re-usability of the vast majority of the EUM organization for German demonstrates that that organization is not tied solely to English. This is further reinforced by the experience during the merge that where the EUM extended on distinctions made in the GUM, those extensions were generally equally applicable and useful for German (see [Henschel, 1993] for relevant examples).
The result of our merging procedure is an ontology fulfilling the construction ideas of [Hovy and Knight, 1993] in that the resulting ontology contains all concepts necessary for the operation of the PENMAN module and the KOMET module. However, it contradicts the merging theory of Hovy and Knight in that it states some theoretical principles for the merge construction which should be maintained by the source ontologies as well as by the merge.
3.2 Merging Statistics 
Because of their questionable status, we leave the 'rhetorical relations' out of account in the statistical comparison. Without this RST-subhierarchy the EUM includes 252 concepts. The GUM makes no precise distinction between upper and domain model. For the comparison, 235 GUM concepts are considered. The Merged Upper Model contains 288 concepts.
Identity
We found 167 identical concept names (excluding the RST-relations), from which only 87 concepts can actually be identified. Identical meaning can strongly be stated for 106 concepts (i.e., 19 have distinct names). The main identification areas are the object and the quality hierarchy as well as the temporal one. The precise distribution for strong identical meaning is shown by the numbers outside of brackets in Figure 3.
Union 
If both considered ontologies are equally weighted as in [Hovy and Nirenburg, 1992], individual concepts in an ontology must be maintained in any merge. However, in our approach we have extensively made use of an ontology-internal concept union. This is a result of the general ontology design principles given in Section 2.3. The clause/PP distinction, for example, which is often a concept discrimination criterion in the GUM, violates Principle 1 and so this discrimination is not preserved in the Merged Upper Model. Therefore, leaving out of account the clause/PP distinction, identical concepts then amount to 163. The number and distribution of concepts identical after union is shown by the numbers in brackets in Figure 3. 106 concepts are strongly identical and 57 merged concepts are identical with the unions of different GUM concepts.
Extension 
Extension can be found in both directions. Because of the emphasis we have given to the EUM, most of the extensions are EUM concepts which extend the GUM further. These are 60 concepts, 11 for the Mental-Process, 11 Participants and 38 others from the Relational-Process hierarchy. On the other hand, only 4 German participant concepts have found their way into the Merged Upper Model.
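The counts reported for identity, union and extension can be cross-checked with simple arithmetic. The short Python fragment below only restates figures given in the text; the variable names are chosen for illustration.

```python
# Identity: concepts identified under the same name plus those with distinct names.
same_name_identified = 87
distinct_name_identified = 19
strong_identity = same_name_identified + distinct_name_identified

# Union: strongly identical concepts plus those identical to unions of GUM concepts.
identical_via_union = 57
identical_after_union = strong_identity + identical_via_union

# Extension: EUM concepts extending the GUM, and GUM concepts carried over.
eum_extensions = 11 + 11 + 38   # Mental-Process, Participants, Relational-Process
gum_extensions = 4
total_extensions = eum_extensions + gum_extensions
```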
Cross classification 
An essential field for cross classification has been avoided by the relaxation of the upper model-grammar interface stated in Principle 3 in Section 2.3. For example, whereas the cross classification discussed for the Material-Process/Action hierarchy in Section 2.3 would have cross classified 2 English subconcepts with 5 German subconcepts and their subhierarchies respectively, resulting in 42 merged concepts, 9 concepts are sufficient to cover all distinctions expressed in the EUM and the GUM.
Summary
Summarizing the merging statistics, strong identity can be found for 41%. If we allow identification of unified concepts, identity can be stated for 63%. About 25% of the merged UM are created by extension, and only 3.6% by cross classification. Beside this, there is a small part of the Merged Upper Model (8%) where the concepts are not created by identification, extension and cross classification, but by preferring EUM concepts over GUM ones.
3.3 Future work 
In the current merging process, we have only looked for identities and differences between the given English and German Upper Models. We did not try to improve the inherent consistency of both, although it became clear during the merge that certain distinctions should be removed and others further developed; these local improvements are detailed in [Henschel, 1993] and will be incorporated in future versions of the Merged Upper Model.
Figure 3: Identity statistics and distribution. Strong identity, with identity after union in brackets: UM-Thing 106 (163); Object/Entity 35; Quality 30; Process 41 (98), of which Material 2, Mental 3, Relational 33 (90).
In addition, one of our aims with the Merged Upper Model is to provide a stable basis for further extension both to include further linguistic phenomena and to cover further languages. We expect that an organization of information based on the requirements of natural language grammars will provide a more stable and re-usable result than organizations based on the requirements of individual computational systems. We are already using the Merged Upper Model as the basis for sentence generation in Dutch and there is suggestion here that, again, few additional concepts appear necessary. Of more interest is the extension to rather different languages, some of which has already been begun. Detailed accounts of this work of extension and comparison are necessary since automatic merging will rarely be possible when these most general levels of information organization are considered.
Finally, extensions in future may also be made by comparison with other ontologies, although here it is necessary to be very careful concerning the kinds of ontologies considered. Since the Merged Upper Model is explicitly a linguistically motivated ontology, comparison with ontologies with differing motivations can be difficult. In considering the ontology of the LILOG project, for example, the mixture of linguistic and non-linguistic information criticized by [Lang, 1991] should not be carried over into the merge.
The evaluation of the resulting linguistic ontologies as potential semantic type hierarchies for representations in machine translation, analysis and multilingual generation is then a clear further step.
References 
[Bateman et al., 1990] John A. Bateman, Robert T. Kasper, Johanna D.
Moore, and Richard A. Whitney. A general organization of knowledge for
natural language processing: the PENMAN upper model. Technical report,
USC/Information Sciences Institute, Marina del Rey, California, 1990.
[Bateman et al., 1991a] John A. Bateman, Elisabeth A. Maier, Elke
Teich, and Leo Wanner. Towards an architecture for situated text
generation. In International Conference on Current Issues in
Computational Linguistics, Penang, Malaysia, 1991. Also available as
technical report of GMD/Institut für Integrierte Publikations- und
Informationssysteme, Darmstadt, Germany.
[Bateman et al., 1991b] John A. Bateman, Christian M.I.M. Matthiessen,
Keizo Nanri, and Licheng Zeng. The re-use of linguistic resources
across languages in multilingual generation components. In Proceedings
of the 1991 International Joint Conference on Artificial Intelligence,
Sydney, Australia, volume 2, pages 966-971. Morgan Kaufmann
Publishers, 1991.
[Bateman, 1992a] John A. Bateman. The theoretical status of ontologies
in natural language processing. In Susanne Preuß and Birte Schmitz,
editors, Text Representation and Domain Modelling - ideas from
linguistics and AI, pages 50-99. KIT-Report 97, Technische Universität
Berlin, May 1992. (Papers from KIT-FAST Workshop, Technical University
Berlin, October 9th-11th 1991).
[Bateman, 1992b] John A. Bateman. Towards Meaning-Based Machine
Translation: using abstractions from text generation for preserving
meaning. Machine Translation, 6(1):1-37, 1992. (Special edition on the
role of text generation in MT).
[Guarino, 1994] Nicola Guarino. The ontological level. In R. Casati,
B. Smith, and G. White, editors, Philosophy and the Cognitive
Sciences. Hölder-Pichler-Tempsky, Vienna, 1994.
[Halliday and Matthiessen, to appear] Michael A.K. Halliday and
Christian M.I.M. Matthiessen. Construing experience through meaning: a
language-based approach to cognition. de Gruyter, Berlin, to appear.
[Henschel, 1993] Renate Henschel. Merging the English and the German
Upper Model. Technical report, GMD/Institut für Integrierte
Publikations- und Informationssysteme, Darmstadt, Germany, 1993.
[Horacek, 1989] Helmut Horacek. Towards principles of ontology. In D.
Metzing, editor, Proceedings of the German Workshop on Artificial
Intelligence: GWAI-89, pages 323-330. Springer-Verlag, Berlin,
Heidelberg, New York, 1989.
[Hovy and Knight, 1993] Eduard Hovy and Kevin Knight. Motivating
shared knowledge resources: an example from the Pangloss
collaboration. In Proceedings of IJCAI Workshop on Knowledge Sharing
and Information Interchange, International Joint Conference on
Artificial Intelligence, 1993.
[Hovy and Nirenburg, 1992] E. Hovy and S. Nirenburg. Approximating an
interlingua in a principled way. In Proceedings of the DARPA Speech
and Natural Language Workshop, Arden House, New York, 1992. Also
available from USC/Information Sciences Institute (Marina del Rey, Los
Angeles) as Technical Report ISI/RR-93-345, February 1993.
[Kasper, 1989] Robert T. Kasper. A flexible interface for linking
applications to PENMAN's sentence generator. In Proceedings of the
DARPA Workshop on Speech and Natural Language, 1989. Available from
USC/Information Sciences Institute, Marina del Rey, CA.
[Lang, 1991] Ewald Lang. The LILOG ontology from a linguistic point of
view. In O. Herzog and C.-R. Rollinger, editors, Text understanding in
LILOG: integrating computational linguistics and artificial
intelligence, Final report on the IBM Germany LILOG-Project, pages
464-481. Springer-Verlag, Berlin, 1991. Lecture notes in artificial
intelligence, 546.
[MacGregor and Brill, 1989] Robert MacGregor and David Brill. The LOOM
manual, 1989. USC/Information Sciences Institute, Marina del Rey, CA.
[Patil et al., 1992] Ramesh S. Patil, Richard E. Fikes, Peter F.
Patel-Schneider, Don McKay, Tim Finin, Thomas R. Gruber, and Robert
Neches. The DARPA knowledge sharing effort: progress report. In
Charles Rich, Bernhard Nebel, and William R. Swartout, editors,
Principles of knowledge representation and reasoning: proceedings of
the third international conference, Cambridge, MA, 1992. Morgan
Kaufmann.
[Penman Project, 1989] Penman Project. PENMAN documentation: the
Primer, the User Guide, the Reference Manual, and the Nigel manual.
Technical report, USC/Information Sciences Institute, Marina del Rey,
California, 1989.
[Pirlein, 1993] Thomas Pirlein. Reusing a large domain-independent
knowledge base. In Fifth International Conference on Software
Engineering and Knowledge Engineering (SEKE'93), San Francisco, 1993.
[Steiner et al., 1988] Erich H. Steiner, Ursula Eckert, Birgit Weck,
and Jutta Winter. The development of the EUROTRA-D system of semantic
relations. In Erich H. Steiner, Paul Schmidt, and Cornelia
Zelinsky-Wibbelt, editors, From Syntax to Semantics: insights from
Machine Translation. Frances Pinter, London, 1988.
[Teich, 1992] Elke Teich. Komet: grammar documentation. Technical
report, GMD/Institut für Integrierte Publikations- und
Informationssysteme, Darmstadt, West Germany, 1992.


