American Journal of Computational Linguistics
Microfiche 20 
AN APPROACH TO 
VERBALIZATION AND TRANSLATION
BY MACHINE 
WALLACE L. CHAFE
University of California 
Berkeley 
Copyright 1975 
Association for Computational Linguistics 
This report describes a model for machine translation developed
at Berkeley during 1972-74. The model is built around a set of
procedures called verbalization, intended to simulate the processes
employed by a speaker or writer in turning stored knowledge into
words. Verbalization is seen to consist of subconceptualization
and lexicalization processes which involve creative choices on the
part of the verbalizer, together with algorithmic syntactic processes
determined by the language. Translation is viewed as
(1) the reconstruction of the verbalization processes which went
into the original source language text and (2) the application of
parallel verbalization processes in the target language. The target
language verbalization looks for creative choices in the source
language verbalization and tries to apply corresponding choices, at
the same time that it applies syntactic processes dictated by the
grammar of the target language. Verbalization and translation
processes are illustrated, with examples taken from English and
Japanese. A few of these processes have been implemented in an
interactive program at the facilities of the Lawrence Berkeley
Laboratory, but the intent of the report is to demonstrate the
kinds of processes that need to be incorporated in a system.
Abstract
Acknowledgments
I. Overview
II. Subconceptualization
III. An Example
IV. Lexicalization of a CC
V. Lexicalization of a PI
VI. The Lexicon
VII. Discourse Information and Readjustments
VIII. Translation
IX. Miscellaneous Problems in Translation
This report deals with work performed by the Contrastive
Semantics Project in the Department of Linguistics of the University
of California at Berkeley. The project was supported by
Air Force Contract No. F30602-72-C-0/cc)6. Associated with the
project during its entire life, in addition to the author, were
Patricia M. Clancy, Leonard M. Faltz, Christopher Murcano, and
Hasmig Seropian. Also active during more than half of this period
were Masayoshi Shibatani and Linda oobek. Associated during shorter
periods of time were Teresa M. Chen, Charles J. Fillmore, Robert E.
Gaskins, and Marie-Claude Jorlatid. Masayoshi Hirose served as a
consultant on Japanese during the final two months.
This same report, in slightly different form, was published
by Rome Air Development Center, Griffiss Air Force Base, New York,
as RADC-TR-74-271 (October 1974).
Central to the view of translation that will be presented
here is the notion of verbalization. Verbalization is the application
of processes by which some holistic conceptual chunk,
recalled from memory, is converted into sentences and words--
into a phonetically or graphically communicable linguistic
representation. Such a notion assumes that the underlying content of
what is being communicated is not, or need not be, in verbal form
to begin with. At the very least it may be a complex system of
discrete elements and relations, representable perhaps as a network
of nodes and arcs. It may also involve an important nondiscrete
or analog component, representable only in some other terms. (For
excellent summaries of both sides of this particular issue see
Pylyshyn 1973 and Paivio 1971.) Whatever may turn out to be the
case here, it seems clear that some sorts of processes must be
applied in order to transform the original form of storage into a
verbal output: that the stored material must be verbalized.
In any particular instance of translation there are two
instances of verbalization. One is the original verbalization
performed by the creator of the source language text. The other is
the verbalization produced in the target language by the translator.
Besides being in different languages, these two verbalizations
are fundamentally different in one other respect. The source
language verbalization is, we might say, autonomous. It is freely
produced by the speaker or writer in any way he decides is
appropriate to the content and the occasion, provided he adheres to the
rules of his culture and the language he is using. The target
language verbalization, on the other hand, is parasitic on the
source language one. Not only must the translator adhere to the
rules of his own language, he must also produce a verbalization
that communicates, so far as possible, the same underlying content
or knowledge that was communicated by the source language
verbalization. The verbalization in the target language is thus subject
to this special kind of constraint. Its producer is not free to
"say what he wants," but must insofar as possible say the same
thing as the producer of the source language text. We suggested
in an earlier report that there are two dimensions of high quality
translation, which we termed naturalness and fidelity. Naturalness
is achieved when the target language verbalization adheres to all
the constraints of that language; the output will then sound
"natural". Fidelity is achieved to the extent that the target
language verbalization communicates the same content as the source
language one.
Verbalization in general, as we see it, consists of a mixture
of two kinds of processes: those which necessitate creative
decisions on the part of the verbalizer and those which do not,
being governed by the constraints imposed by the language. We
might speak of creative processes and algorithmic processes.
Creative processes are ultimately governed by the content which
underlies the verbalization; the verbalizer has to decide how best to
verbalize that content. Normally a range of choices will be open
to him, and he must decide what will most effectively convey what
he has in mind. After he has made such choices, there are often
automatic consequences which follow from them because of the
particular rules of the language (but which are themselves likely
to lead to the necessity of further creative choices). We can say,
then, with respect to the two verbalizations involved in a
translation, that the producer of the source language verbalization has
applied both creative and algorithmic processes, whereas in the
target language verbalization only algorithmic processes are
autonomously applied, the necessary creative choices being determined
by the choices that were made in the source language verbalization.
Thus the naturalness of the final translation depends largely on
adherence to the algorithmic processes of the target language,
while its fidelity depends on the extent to which the translation
has been able to incorporate creative choices that correspond to
those originally applied in the source language. In all
probability there are cases where exact correspondence in these choices
is not possible, and where a certain amount of autonomous
creativity has to be introduced into the target verbalization as well.
These are the cases where automatic translation becomes most
problematic. One useful goal of machine translation research can
be to determine precisely the nature and extent of such cases.
We are led, then, to the general picture of translation which
is shown in Figure 1. The two vertical columns represent the two
verbalizations which are involved: on the left the source language
verbalization and on the right the target verbalization. The input
to a translation procedure, of course, is an already produced
verbal output or text in the source language. The first major
component of the translation procedure will have to be the
reconstruction from that text of the verbalization processes by
which it was produced, a kind of "deverbalization". We refer
to this as the parsing component, although it is clearly different
from conventional parsing. It aims to reconstruct, not a single
deep structure underlying the surface text, but rather a set of
processes by which that text was created from the knowledge--not
only nonverbal but possibly even nondiscrete--which the speaker or
writer had in mind. The output of the parsing component is ideally
a complete reconstruction of both the creative and the algorithmic
processes which the source language verbalizer applied.
The other major component of the translation procedure is the
translation component. It is equivalent to a verbalization in the
target language. The processes which make up this verbalization
are, to the extent that they are algorithmic, those which express
target language constraints and, to the extent that they are
creative, those which correspond to choices already made in the
reconstructed source language verbalization. The necessity of
reference to the source language verbalization for creative choices
at many points is suggested in Figure 1 by the zigzag arrows.
We believe that this picture provides a plausible basis for
translation research, but needless to say it presents many problems
whose solutions are only dimly foreseen at the present time. Our
project concentrated more of its attention on verbalization itself
than on parsing or translation, since both of the latter depend on
a prior understanding of verbalization. Any other ordering of
priorities would be putting the cart before the horse. Any detailed
investigation of the parsing component would be futile if we did not
know what sort of output we would expect that component to produce:
the processes that went into a particular verbalization. The
translation component is a verbalization, though one of a special sort,
and there again a detailed understanding of verbalization
processes is necessary. This report, then, will be most concerned with
the nature of verbalization. We will also devote considerable space
to the nature of that special sort of verbalization which is
translation. We will have the least to say about parsing. Examples will
be cited from English and Japanese.
[Figure 1. Source language verbalization (left column) and target
language verbalization (right column), connected by zigzag arrows.]
For about the last nine months of the project we were concerned
with the development of an interactive computer program that would
implement the verbalization processes we hypothesized. Although
the program remained primitive, the intention was that it would
gradually achieve increased sophistication in its ability to
simulate verbalization, translation, and parsing. As it presently
simulates the processes of verbalization, it begins with an item
that represents the initial holistic idea which the speaker or
writer of a text wishes to communicate. It then asks the user,
seated at a teletype, to make the series of creative choices that
are necessary in the production of the final text. At the same
time it attempts to apply on its own the algorithmic processes
which are called for. It knows when creative choices are necessary,
but must ask the user what choices to make. Ideally it should be
able to apply the algorithmic processes without help. As it
simulates translation it should likewise be able to apply the algorithmic
processes of the target language automatically, and also to apply
certain creative processes on its own by looking at the source
language verbalization to see what creative choices were made
there. Whenever it is not able to make a creative choice, the
program asks the user to do so. We find that this kind of
machine-user interaction provides a valuable research technique.
Taking as our ultimate goal the eventual elimination
of the user from the translation program altogether, we
start with a situation in which the user intervenes at many points.
As we learn more we can gradually give the machine more to do and
the user less. This technique can be followed not only in
verbalization, but also in parsing. Whether the user will eventually
disappear from the picture altogether is uncertain.
However that may be, the goal of a program in which the
contribution of the user is significantly diminished in relation to that
of the machine seems workable. Short of the final goal of
eliminating the user altogether, an intermediate goal identifiable as
"human-aided" machine translation can more easily be foreseen.
Here the machine will do the many things for which it is suited,
but a human brain will be introduced at those points where the
machine has reached its limits. This intermediate goal has, we
believe, significant practical as well as theoretical value.
Funding for this project ceased in June 1974. The report
must be read, therefore, as a summary of work that was interrupted
in mid-course, and as a partial blueprint for further work should
the necessary funding ever materialize. At this point, six months
after the termination of the project, the need for various
modifications is already evident. It seems best, however, to document
consistently how things stood at the time of interruption, without
trying to introduce new and untested material.
II. Subconceptualization
We assume that a speaker or writer begins with a single,
unitary, holistic conceptual chunk that he has recalled from
memory and has decided, for some reason, to communicate. Thus he
may have in mind some incident in which he was involved, something
of interest he was previously told about or read about, some
experiment he wishes to report on, or whatever. We label such a
chunk, as well as the smaller chunks into which it will be analyzed,
with the prefix CC (for "conceptual chunk") followed by a
four-digit number. The first digit indicates the language in which
verbalization is to take place ("1" for English and "2" for
Japanese), and the remaining three digits constitute an arbitrary
index for the particular chunk. Thus CC-1001 might be the name
given to some particular chunk of this sort that is about to be
verbalized in English.
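
The labeling convention just described can be illustrated with a
minimal Python sketch. It is not part of the original program; the
names ChunkLabel and parse_chunk_label are hypothetical, and the only
facts taken from the text are the CC prefix, the four digits, and the
language codes "1" (English) and "2" (Japanese).

```python
import re
from dataclasses import dataclass

# Language codes stated in the text: "1" for English, "2" for Japanese.
LANGUAGES = {"1": "English", "2": "Japanese"}


@dataclass(frozen=True)
class ChunkLabel:
    """Hypothetical container for the two parts of a CC label."""
    language: str  # language in which verbalization takes place
    index: int     # arbitrary three-digit index of the chunk


def parse_chunk_label(label: str) -> ChunkLabel:
    """Split a label such as 'CC-1001' into language and index."""
    match = re.fullmatch(r"CC-([12])(\d{3})", label)
    if match is None:
        raise ValueError(f"not a valid chunk label: {label!r}")
    return ChunkLabel(LANGUAGES[match.group(1)], int(match.group(2)))


print(parse_chunk_label("CC-1001"))
```

A label beginning with any digit other than 1 or 2 is rejected here,
which is an assumption of the sketch rather than a rule stated in the
report.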
We assume, furthermore, that while this chunk is from one
point of view a unit, from another point of view it has a more
or less rich content, and that it is this content which the
speaker wishes to convey to his audience. Sometimes, though not
in most cases, the initial chunk itself may have a linguistic
label. If it is a folktale, for example, it may have a name like
"Cinderella" or "The Three Bears". But someone who has decided
to tell a story is not likely to say just "Cinderella" and let it
go at that. (One is reminded of the old story about a convention
of comedians at which people said things like "49" or "178" and
elicited laughter each time because everyone knew the jokes these
numbers stood for.) Normally it is necessary instead for the
speaker to get inside the content of this initial unit--to analyze
it into smaller chunks. This kind of process can be pictured as
shown in Figure 2, where the initial chunk CC-1001 has been, as we
say, subconceptualized into chunks CC-1002 and CC-1003. In a text
of any size each of these smaller chunks will be further broken
down into still smaller ones, and so on, so that a hierarchical
structure of successively smaller subconceptualizations emerges.
Subconceptualization belongs to the class of verbalization
processes which are creative. Normally a chunk does not
automatically determine a particular subconceptual breakdown, but the
speaker must creatively choose how to subconceptualize each one.
It is useful to think of the content of each chunk--each circle in
Figure 2--as if it were a mountainous landscape, with the most
salient aspects standing out in bold relief and the less salient
appearing as only minor hills. All other things being equal, the
more salient some aspect of the total content is, the more likely
the speaker is to express it when he subconceptualizes. He is not
likely to make exactly the same subconceptual breakdown each time
he communicates the same initial chunk, partly because he may
judge different things to be salient in different contexts and
partly because the landscape itself may change over time, the
relative salience of its different aspects being modified in
long-term memory. We assume that any particular subconceptualization
necessarily leaves out part of the content of what is
being subconceptualized, as suggested by the area that lies within
the larger circle but outside the two smaller circles in Figure 2.
Subconceptualization, that is, is necessarily a selective process.
No one ever says everything he could say about what he has in mind.
Subconceptualization of a particular chunk, say CC-1001,
produces two or more new chunks, say CC-1002 and CC-1003. These new
chunks, furthermore, are conceived of as related to each other in
some way. For example, CC-1002 might be the "reason" for CC-1003.
Suppose the entire text consisted of the sentences, "I bought a
bike yesterday. I decided I need more exercise." Let us say that
the first sentence is a verbalization of CC-1003 and the second
sentence of CC-1002. We can say that CC-1002 is the reason for
CC-1003. We write a subconceptualization process of this kind in
the following way:
1) CC-1001 S> CJ-REASON (CC-1002, CC-1003)
This statement says that the initial chunk, CC-1001, is
subconceptualized (S>) into the chunks CC-1002 and CC-1003, and that
these two new chunks are related by the predicate labeled CJ-REASON.
The prefix CJ stands for "conjunction" (derived from the grammatical,
not the logical use of this term). Any relation between CCs is
labeled with this prefix.
We use a different notation to represent each of the various
stages in the verbalization process. At the outset, in this example
the initial chunk CC-1001 was all that was present. This initial
representation, before any verbalization processes had been applied,
was simply:
2) CC-1001
After the subconceptualization specified in 1) was applied, the
representation became:
3) CJ-REASON
       CC-1002
       CC-1003
Subconceptualization processes are thus rewrite rules, which replace
one stage in a verbalization with a subsequent stage. The format
we use to represent such stages, as in 3), shows predicates with
their arguments written indented below them.
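
The idea of a subconceptualization as a rewrite rule over successive
stages can be sketched in Python as follows. This is an illustration
under stated assumptions, not the program described in this report:
the class Node and the functions render and subconceptualize are
hypothetical names, and the four-space indentation is an arbitrary
choice.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A chunk (CC-...) or a predicate (CJ-...) with its arguments."""
    label: str
    args: list = field(default_factory=list)


def render(node, depth=0):
    """Return text lines, with arguments indented below their predicate."""
    lines = ["    " * depth + node.label]
    for arg in node.args:
        lines.extend(render(arg, depth + 1))
    return lines


def subconceptualize(stage, target, relation, parts):
    """Rewrite the chunk `target` as CJ-relation(parts); return a new stage."""
    if stage.label == target and not stage.args:
        return Node("CJ-" + relation, [Node(p) for p in parts])
    return Node(stage.label,
                [subconceptualize(a, target, relation, parts)
                 for a in stage.args])


stage2 = Node("CC-1001")  # the initial representation
stage3 = subconceptualize(stage2, "CC-1001", "REASON",
                          ["CC-1002", "CC-1003"])
print("\n".join(render(stage3)))
```

Each call returns a new stage and leaves the earlier one untouched, so
the whole sequence of stages in a verbalization remains available for
inspection.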
In simulating verbalization our program presently asks the user
to specify all the creative choices, restricting its own contribution
to the application of algorithmic processes determined by the grammar
of discourse, sentences, and words in the language involved. The
program is labeled VAT (for "verbalizer and translator"), and we
can illustrate conversations between VAT and the user, identifying
them as V and U respectively. The program begins by asking:
4) V: WHAT VAT TASK DO YOU WANT PERFORMED?
to which one possible answer is:
5) U: VERBALIZE CC-1001
Skipping several steps to illustrate only the general outlines of
subconceptualization, we are interested just now in the question:
6) V: HOW IS CC-1001 SUBCONCEPTUALIZED?
to which a possible answer is:
7) U: REASON (CC-1002, CC-1003)
At this point VAT will construct the representation shown in 3).
In giving an answer like that in 7) the user of this program is
assumed to be making explicit a decision which a normal speaker would
make unconsciously on the basis of a variety of complex criteria.
We do not pretend to understand how such a decision is reached; we
can at least introduce the decision itself into the verbalization
model at this stage.
VAT will now apply an algorithmic or, as we say, syntactic
process triggered by the presence of CJ-REASON in 3). The process
applied is of a type that is also not clearly understood, but we
may view what we do at present as a first approximation. At the
moment VAT simply takes the two CCs related by CJ-REASON and orders
them so that the second will be expressed before the first. That
is, for example, if CC-1002 is eventually going to be verbalized as
"I decided I need more exercise" and CC-1003 as "I bought a bike
yesterday", we want the two sentences to be expressed with CC-1003
preceding CC-1002. Thus VAT will automatically change the
representation in 3) to the following:
8) CC-1003
   CC-1002
This kind of representation, in which no predicate is shown above
the two CCs, indicates that they (or their eventual verbalizations)
are to occur in the final text in the order shown, with CC-1003
preceding CC-1002.
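
As a rough illustration, the English ordering step for CJ-REASON can
be written out in Python. The function name order_reason and the
dictionary of sentence texts are assumptions made for this sketch;
only the reordering itself (second argument before first) comes from
the description above.

```python
def order_reason(predicate, args):
    """First approximation of the English algorithm for CJ-REASON:
    the second CC (the result) is placed before the first (the reason)."""
    if predicate != "CJ-REASON" or len(args) != 2:
        raise ValueError("expected CJ-REASON with exactly two CCs")
    reason, result = args
    return [result, reason]


# Hypothetical lookup of the eventual verbalization of each chunk.
texts = {
    "CC-1002": "I decided I need more exercise.",
    "CC-1003": "I bought a bike yesterday.",
}

ordered = order_reason("CJ-REASON", ["CC-1002", "CC-1003"])
print(" ".join(texts[cc] for cc in ordered))
```

The returned list carries no predicate, matching the convention that
an ordered sequence of CCs is shown with nothing above it.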
In Japanese the corresponding syntactic process will typically
lead to the attachment of CJ-"KARA" at the end of the second
sentence. Thus if a representation like that in 3) were produced
in a Japanese verbalization VAT would automatically change it to:
The quotation marks around "KARA" indicate that this is an item
which will actually appear as a word in the text. Quotation marks
are used for items that have a surface lexical representation.
The representation in 9) is deficient in that it fails to show
that CJ-"KARA" will be part of the same sentence as CC-1002, whereas
CC-1003 will (or is likely to) form a different sentence.
We indicate sentence boundaries with the notation CJ-".", since
the period will appear in the final text. Thus fuller versions of
8) and 9) are respectively:
The creation of these periods is a housekeeping task that need not
be described in detail here.
Given a representation like that in 10), VAT will go on to ask
about the subconceptualization of the first CC in the ordering. The
general principle followed here is one of "depth first", in the sense
that earlier items in the text are completely verbalized before the
verbalization of later items is begun. This procedure probably has
some psychological validity; that is, a speaker is likely to think
of later parts of what he is going to say only in terms of the most
general chunks, while he is elaborating the earlier parts in detail.
Only after he has finished the verbalization of these earlier
parts will he turn his attention to a full verbalization of the
later ones.
Thus, omitting various considerations not yet discussed,
subconceptualization proceeds interactively in the following fashion:
12) V: WHAT VAT TASK DO YOU WANT PERFORMED?
    U: VERBALIZE CC-1001
    (VAT creates the following representation:)
    V: HOW IS CC-1001 SUBCONCEPTUALIZED?
    (VAT creates first the following representation:)
        CJ-REASON
            CC-1002
            CC-1003
    (and immediately applies a stored syntactic algorithm that
    changes it to:)
    V: HOW IS CC-1003 SUBCONCEPTUALIZED?
    etc.
In this fashion a subconceptual hierarchy of any degree of
complexity can be constructed and expressed.
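
The depth-first order of questioning can be simulated with a short
Python sketch in which a dictionary of scripted answers stands in for
the user at the teletype. Everything here is an assumption made for
illustration: the function simulate, the treatment of any chunk
without a scripted answer as terminal, and the further breakdown of
CC-1003 into CC-1004 and CC-1005, which does not occur in the report.

```python
def simulate(initial, answers):
    """Depth-first subconceptualization with scripted user answers.

    `answers` maps a chunk to (relation, [subchunks]). A chunk with no
    entry is treated as terminal (an assumption of this sketch).
    Returns the questions asked and the terminal chunks in text order.
    """
    questions, terminals = [], []
    stack = [initial]
    while stack:
        chunk = stack.pop()
        if chunk not in answers:
            terminals.append(chunk)
            continue
        questions.append(f"HOW IS {chunk} SUBCONCEPTUALIZED?")
        relation, parts = answers[chunk]
        # English REASON algorithm: second CC is expressed before the first.
        ordered = list(reversed(parts)) if relation == "REASON" else list(parts)
        # Push in reverse so that the earliest chunk in the text is handled next.
        stack.extend(reversed(ordered))
    return questions, terminals


scripted = {
    "CC-1001": ("REASON", ["CC-1002", "CC-1003"]),
    "CC-1003": ("REASON", ["CC-1004", "CC-1005"]),  # hypothetical breakdown
}
asked, text_order = simulate("CC-1001", scripted)
print(asked)
print(text_order)
```

As in 12), the question about CC-1003 follows directly after the one
about CC-1001, because CC-1003 has been ordered first in the text.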
The organization of a text may not be entirely hierarchical,
however. Not only does a speaker break down larger chunks into
smaller chunks--larger "concepts" into subconcepts; one chunk may
also remind him of another, so that the organization which results
may be in part concatenative. We have been viewing concatenation
in terms of excursions away from the main hierarchy, and have been
calling such excursions digressions. In some discourse, however,
there is no necessary constraint that the main hierarchy be
returned to, and the result may be a rambling text in which digression
is added to digression. In a more tightly organized text digressions
are more likely to appear as parenthetical remarks: brief sidepaths
which quickly return to the main hierarchy. We use the term
parenthesis for this brief and transient kind of digression.
If subconceptualization be represented in terms of a tree
diagram (which does not, however, provide a convenient means of
showing the relations between subconcepts, like CJ-REASON), then
digressions can be pictured as subtrees attached to the main tree
at one point or another, as suggested in Figure 3.
One other important modification of the strictly hierarchical
model of subconceptualization results from the common occurrence
of summarization. It is frequently the case in verbalization that
an initial chunk will be subject to two separate hierarchies of
subconceptualization, one of which can be identified as a summary
of the other. It is characteristic of a summary that its
subconceptualization processes never proceed beyond some relatively large
chunks--chunks which package a relatively large content. We can
contrast a subconceptualization hierarchy which is a summary with
a hierarchy which constitutes the body of the text and consists of
subconceptualization processes that produce a larger number of chunks
of smaller size.
A summary is typically expressed at the beginning or end of a
text; that is, preceding or following the body. Various conventions
for summaries are associated with different genres of writing. For
example, a scientific article may begin with the self-conscious kind
of summary that is called an abstract; a news report typically
contains an opening paragraph telling who, what, where, and when; a
fable is likely to end with a moral, and so on. Our program at
present simply asks, for the initial CC, whether it has an initial
summary (one expressed at the beginning of the text). If the
answer is yes it asks first for subconceptualization of the
summary, and moves on to ask about the body of the text only after the
summary has been completely verbalized. At the end of the text it
asks whether there is a final summary.
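
The order in which the program works through summary and body can be
stated compactly. The following Python fragment is only a sketch of
that ordering; the function name plan_sections and the section names
it returns are hypothetical.

```python
def plan_sections(has_initial_summary, has_final_summary):
    """Return the parts of a text in the order they are verbalized.

    An initial summary is verbalized completely before the body, and a
    final summary is asked about only at the end of the text.
    """
    sections = []
    if has_initial_summary:
        sections.append("initial summary")
    sections.append("body")
    if has_final_summary:
        sections.append("final summary")
    return sections


print(plan_sections(True, False))
```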
Creativity within a discourse is likely to be limited by the
genre to which the discourse belongs. It would appear that there
is a continuum ranging from maximally stereotyped to maximally
creative discourse. Most stereotyped are those forms of discourse,
such as rituals, in which the speaker has very little choice as to
what he is going to say or how he is going to say it. With such
discourse the "grammar" of the genre provides many of the answers
to the questions VAT would otherwise have to ask the user. In other
words, VAT should be able to produce ritual texts with minimum
recourse to creative decisions. At the other extreme are forms of
discourse such as descriptions of unique personal experiences which
have never been described before, where the speaker is relatively
free to make a great variety of creative decisions.
We believe it would be of considerable interest to incorporate
into the verbalization process the constraints imposed by several
different genres, but we have not as yet done this. As it now
stands our program does ask WHAT IS THE GENRE? as soon as it has
established that a verbalization is to be performed. Possible
answers that we would like to implement in the future are, for
example, STORY, BIOLOGY ARTICLE, FABLE, and the like.
An example of these procedures as applied to a real text can
be based on the following United Press report taken, slightly
condensed, from the San Francisco Chronicle of May 16, 1974:
13) 1. An 11-year-old boy using a new "super-glue"
2. accidentally glued his eye shut
3. while building a model airplane,
4. and a doctor had to reopen the eye surgically.
5. Mike Harris said
6. he rubbed his left eye
7. after several drops of the glue squirted into it last
Sunday
8. and found his eyelid would not move.
9. An eye surgeon debated briefly about
10. using a super glue solvent
11. but decided against it
12. for fear it might damage the boy's eye.
13. The surgeon, who asked not to be identified,
14. finally put Mike in the operating room,
15. trimmed Mike's eyelashes,
16. then opened the eyelid surgically.
17. Mike was released from the hospital Tuesday.
It is approximately the case that each of the numbered lines in this
text expresses a terminal subconcept (see below). We assume that
the text contains a number of intermediate subconcepts as well,
which need to be elucidated in a subconceptual hierarchy.
Let us suppose that the combination of VAT and the user is
attempting to simulate the verbalization processes that went into
the production of this text. For the moment we are concerned only
with subconceptualization processes (and associated syntactic
algorithms). Many of the user's answers in the following conversation
with VAT are intuitively based. The success of our eventual parsing
component will depend on the extent to which these intuitive
answers can be predicted from the text together with whatever items
of background knowledge are relevant. The example will be carried
only far enough to suggest the nature of the procedure.
The exchange begins in the usual way:
VAT creates the following representation, including a text-final
period:
VAT's next question seeks to establish what genre constraints apply
in this text:
16) V: WHAT IS THE GENRE?
    U: NEWS REPORT
VAT will now assume that the text is a typical news report which
begins with a summary. Its first questions will deal with the
subconceptualization of the summary (expressed in the text in
sentences 1-4):
17) V: HOW IS CC-1001 SUBCONCEPTUALIZED IN THE SUMMARY?
    U: YIELD (CC-1002, CC-1003)
The user has answered that the first breakdown of the summary is
into two subconcepts: CC-1002 (to be expressed as "An 11-year-old
boy using a new "super-glue" accidentally glued his eye shut while
building a model airplane") and CC-1003 (to be expressed as "a
doctor had to reopen the eye surgically"). Furthermore the relation
between these two CCs has been identified as one labeled YIELD, in
which the first CC "leads to" or "results in" the second. YIELD
differs from another, similar relation which is labeled CAUSE in
that the event conceptualized by the second CC is not a necessary
consequence of the first. It is, however, something that presumably
would not have happened if the event conceptualized by the first CC
had not taken place. (Evidently YIELD can be equated with INITIATE
as this term is used by Rumelhart 1974, the relationship between an
external event and the willful reaction of an anthropomorphized
being to that event. Schank 1974 uses INITIATE differently.) As
a result of the user's answer in 17) VAT first creates the
representation:
18) CJ-YIELD
        CC-1002
        CC-1003
and immediately applies syntactic processes which change it to:
That is, the two CCs are to be expressed with the "yielder"
preceding the "yielded", and they are to be connected with a comma
followed by the word "AND". This is not the only way in which
YIELD can be realized, but for the sake of the example we may
regard it as such. VAT will now proceed to ask about the
subconceptualization of the earliest CC in 19):
20) V: HOW IS CC-1002 SUBCONCEPTUALIZED IN THE SUMMARY?
The user has answered that CC-1002 is broken down into two CCs,
CC-1004 ("building a model airplane") and CC-1005 ("An 11-year-
old boy using a new "super-glue" accidentally glued his eye shut").
They are related by FRAME, a temporal relation in which the first
CC occupies a time period larger than and including the time period
of the second. In other words the time period of CC-1004 includes
that of CC-1005. VAT creates, sequentially, the following two
representations:
Although there may be several possibilities for the expression
of FRAME, VAT has assumed here that two factors are involved: an
ordering of the two CCs so that the "framer" precedes the "framed",
and a prefixing of the word "WHILE" to the first CC. (In this
example the ordering of these two CCs will be reversed in a
subsequent operation.) If FRAME may be expressed in other ways, we
assume (gratuitously, for the moment) that subtle conceptual
differences are involved; that there is not, in other words, free
variation among possible syntactic algorithms. This remains for
now an article of faith.
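
The two realization rules just described (YIELD and the first
instance of FRAME) can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not the program that
was run at Berkeley: the function name realize, the flat-list
representation, and the in-place expansion step are all hypothetical.

```python
# Hypothetical sketch: realize a relation between two CCs as a flat
# sequence of connectives (CJ) and CC labels. Only the two rules
# described in the text are covered.

def realize(relation, first, second):
    """Return the ordered connectives and CCs for one relation."""
    if relation == "YIELD":
        # The "yielder" precedes the "yielded"; a comma and "AND" join them.
        return [first, 'CJ-", AND"', second]
    if relation == "FRAME":
        # The "framer" precedes the "framed"; "WHILE" is prefixed to the framer.
        return ['CJ-"WHILE"', first, second]
    raise ValueError(f"no realization rule for {relation}")

# The summary is first realized as a YIELD between CC-1002 and CC-1003.
summary = realize("YIELD", "CC-1002", "CC-1003")

# CC-1002 is then replaced in place by its FRAME subconceptualization.
i = summary.index("CC-1002")
summary[i:i + 1] = realize("FRAME", "CC-1004", "CC-1005")

print(summary)
```

The sketch stops before the later reordering that the text mentions;
it only shows how one relation can be expanded inside another.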
We would expect VAT to ask next about the subconceptualization
of CC-1004, but by a means not yet discussed VAT will discover that
this is a terminal CC (one not further subconceptualized). If
CC-1004 were followed by "." or by ", AND", VAT would proceed to
ask questions directed at the complete verbalization of this CC.
But since CC-1004 is not followed by one of these boundaries,
attention is next focused on CC-1005:
23) V: HOW IS CC-1005 SUBCONCEPTUALIZED IN THE SUMMARY?
VAT creates the following representation:
24) CJ-"WHILE"
    CC-1004
    CJ-FRAME
    CC-1006
    CC-1007
    CJ-", AND"
    CC-1003
    CJ-"."
The user has said that CC-1006 ("an 11-year-old boy using a new
"super-glue"") occupies a time period which includes CC-1007
("accidently glued his eye shut"). So far we would expect this second
instance of FRAME to be expressed by prefixing the word "WHILE" to
CC-1006, as was done in 22). Let us suppose, however, that FRAME
actually triggers a more complex algorithm which says in effect
that one "WHILE" in a sentence is enough, and that a second instance
of FRAME will lead to a different expression. Here the second
instance leads to the creation of a relative clause which will
modify one of the constituents of CC-1007. Furthermore, the already
created "WHILE" clause will be moved to a position after CC-1007.
(This ordering of the CCs does appear to be maximally natural. It
would be slightly less desirable, for example, to produce "While he
was building a model airplane an 11-year-old boy, using a new super
glue, accidently glued his eye shut." Certainly, the differences in
this area are very subtle.) We will indicate the relative clause
status of CC-1006, to be embedded within the expression, with slash
notation:
The representation in 25) will be discovered to be the final
one in the subconceptualization of the summary, which has been
found to consist of four CCs (ultimately four clauses) joined
together in the manner indicated. VAT will now proceed to verbalize
the summary completely, making use of other kinds of processes.
When that has been done, it will say:
SUBCONCEPTUALIZED?
U: YIELD (CC-1002, CC-1003)
This is, of course, the same answer that was given to the
corresponding question in 17) above. As CC-1002 and CC-1003 are
further elaborated, however, many differences will emerge. Ultimately
CC-1002, which was expressed in sentences 1-3 of the summary, will
be expressed in the body of the text in sentences 5-8. CC-1003,
expressed in the summary as sentence 4, will be expressed in the
body in sentences 9-14.
We will not repeat here the operations involved in the
subconceptualization of the body of the text. They are for the most
part similar to those illustrated above. Various other relations
between CCs are introduced: for example, that between CC-1015
("An eye surgeon debated briefly about using a super glue solvent
but decided against it for fear it might damage the boy's eye.") and
CC-1016 ("The surgeon, who asked not to be identified, finally put
Mike in the operating room, trimmed Mike's eyelashes, then opened
the eyelid surgically.") The first of these involves an
alternative that is rejected in favor of the alternative conceptualized
in the second; thus, the relation may be labeled REJECTED-IN-FAVOR-
OF. Within CC-1015 there is a relation of denial of expectation
between CC-1017 ("An eye surgeon debated briefly about
using a super glue solvent") and CC-1018 ("decided against it for
fear it might damage the boy's eye"). It will be of considerable
interest to isolate relations of this sort in a variety of texts,
and to determine the ways in which they may be expressed under
varying circumstances in different languages.
The text does contain one example of a parenthesis, expressed
in the nonrestrictive relative clause in line 13 ("The surgeon, who
asked not to be identified, "). The fact that the surgeon asked not
to be identified is a minor digression from the mainstream of the
account. It is attached to the node representing the surgeon which
will become a constituent of CC-1022 ("finally put Mike in the
operating room, trimmed Mike's eyelashes, then opened the eyelid
surgically").
IV. Lexicalization of a CC
We use the term lexicalization to refer to another major
component of verbalization: specifically to a cluster of processes
that are involved in the choice of a particular linguistic
expression for a CC. Subconceptualization breaks down an initial chunk
into smaller chunks. These smaller chunks, however, remain
conceptual in nature, and other operations are necessary to convert them
into surface linguistic representations. Roughly speaking,
lexicalization involves the choice of "words" that will appropriately
communicate the content of CCs.
Lexicalization of a CC takes place at the point where the
speaker decides that he has subconceptualized enough. The
aim of subconceptualization is to produce chunks of a size
appropriate to linguistic expression, and particularly to linguistic
expression that will convey neither too little nor too much
information to the addressee. Too little information is, for example,
provided by a summary, where subconceptualization has proceeded
only to a point where lexicalization will give the addressee a
"general idea" of the content of the whole. At the other end of
the scale, we are all familiar with expositions in which too much
information is conveyed, where we are told more than we want to
know. One aspect of a speaker's creativity, then, is to decide
exactly where in the process of subconceptualization he should stop,
taking into account the needs and interests of the addressee. It
is at this point that he turns to lexicalization.
The speaker may also be influenced in such decisions by the
resources his language makes available for packaging chunks of
different sizes. Consider, for example, the amount of content that is
packaged in an English sentence like "He hit into a double play."
If our language did not provide this particular expression, we
would have to subconceptualize this chunk considerably further and
come up with chunks that would have to be expressed in some such
way as "He hit the ball to the shortstop, who threw it to the second
baseman before the runner previously on first base could reach
second. The second baseman then threw the ball to the first
baseman before the batter could reach first. Thus his hit caused two
outs to be made." Presumably a language makes available packaging
at various levels of subconceptualization according to predominant
communicative needs within the culture of its speakers.
How are conceptual chunks communicated? One way to approach
this question is by looking at the spatial and temporal properties
of such chunks. A chunk is typically either an event ("He rubbed
his left eye") or a situation ("The glue was next to the lamp").
Both events and situations have a particular locus in space and
time (the difference being that an event involves some spatial
change through time, whereas a situation does not). Such chunks,
then, can be regarded as assignable to particular coordinates in
both a spatial and a temporal continuum. (We omit consideration
here of generic chunks, expressed in sentences like "Dogs chase
cats" or "The house had two chimneys", where particularity is
absent. Genericness calls for extended discussion that would take
us too far afield at this point.)
If we assume that most of the chunks a speaker wants to find
linguistic expression for are events or situations, and thus have
both spatial and temporal particularity, it is not surprising that
language fails to provide direct labels for them. We cannot, in
the course of subconceptualization, arrive at something like CC-1011,
then remember that the name for this chunk is "BLURGH", and
communicate it by uttering that word. Particular events and situations are
too numerous, and our experience of them too idiosyncratic, for each
to have its own name. The way this problem is solved is through
the interpretation of many different CCs as instances of the same
category. Thus the time last December when I gave my mother a
Christmas present, the time when the mailman gave me a registered
letter this morning, the time yesterday when the teacher gave my
son a note to take home, etc. etc. are all categorizable as instances
of "giving". We label the category itself UC-"GIVE" (UC standing
for "universal category") and specify the choice of this category
by the speaker with the notation:
27) CC-1053 C> UC-"GIVE"
Such a statement is to be read "CC-1053 is categorized as an
instance of the category UC-"GIVE"". It should be noted that the
English word "GIVE" is not the name of this category; rather
any particular CC which is so categorized can be communicated with
the word "GIVE". In other words, the decision described in 27)
allows us to use "GIVE" as a name for CC-1053.
The way in which a speaker decides that a particular CC can
be categorized as an instance of some UC is of course a fundamental
psychological question. One thing that seems clear is that some
CCs are more easily categorized than others; ease of
categorizability has been called "codability" (Brown and Lenneberg 1954). In
a closer approximation to human mental processes, therefore, a
statement like 27) ought to be qualified as valid to a certain
degree, and not as an all-or-nothing decision. If the degree to
which a particular CC is an instance of some UC is very high--if
the CC is highly codable--then the use of the word provided by the
UC will succeed quite well in conveying the content which the
speaker has in mind. If, on the other hand, the content of the CC is
not very well captured by assigning it to the UC, then the speaker
is likely to want to add one or more modifiers to mold the content
more closely to the content of the CC he has in mind. Adverbs are
an obvious device by which such molding is accomplished. Thus, the
speaker might decide that the content of CC-1053 is better captured
in an intersection of UC-"GIVE" and UC-"GRUDGING":
28) CC-1053 C> UC-"GIVE" & UC-"GRUDGING"
in which case the eventual lexicalization will be "give grudgingly",
and not simply "give".
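
One way to picture a graded categorization decision is the following
Python sketch. It rests on several assumptions that are not in the
text: codability is represented as a number between 0 and 1, a fixed
threshold decides when modifiers are added, and "&" is used as the
intersection symbol. The function name categorization is hypothetical.

```python
# Hypothetical sketch of graded categorization. The numeric degree,
# the threshold, and the "&" symbol are illustrative assumptions.

def categorization(cc, uc, degree, modifiers=(), threshold=0.8):
    """Build a C> statement, adding modifier categories when the
    CC is not codable enough under the main category alone."""
    parts = [f'UC-"{uc}"']
    if degree < threshold:
        # Low codability: mold the content with extra categories.
        parts += [f'UC-"{m}"' for m in modifiers]
    return f"{cc} C> " + " & ".join(parts)

# Highly codable: the bare category suffices.
print(categorization("CC-1053", "GIVE", 0.95, ["GRUDGING"]))

# Less codable: the category is intersected with a modifier.
print(categorization("CC-1053", "GIVE", 0.60, ["GRUDGING"]))
```

A real decision would presumably weigh the addressee's needs as well;
the single threshold here only makes the all-or-nothing contrast
visible.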
Suppose CC-1053 is a conceptual chunk that will eventually
be verbalized with the sentence:
28) Mrs. Brown gave Tommy a cookie.
We have said that the word "GIVE" is available as a label for this
CC. Up to a point that is correct; there was a giving which took
place. But sentence 28) contains more than the word "GIVE". What
kind of conceptual information is conveyed by "MRS. BROWN", "TOMMY",
and "A COOKIE"? Each of these items evidently communicates a concept
that is different in nature from a CC. This other kind of concept
we label a PI (for "particular individual"). The chief difference
between a PI and a CC seems to have to do with temporal
particularity. A CC is conceived of as occupying a specific and usually
fairly limited period of time. The time period occupied by, say,
Mrs. Brown is much less specific, and is not likely to be
something we are very interested in when we utter a sentence like 28).
In other words, although a PI may have temporal particularity in
the sense of a lifespan or total time of existence, such a time
period tends to be of a different order of magnitude from that
occupied by a CC, and more often than not is of little relevance
when the PI is communicated. Furthermore, any one PI may
participate in an indeterminate number of different CCs. (Mrs. Brown
has done many other things besides that which was reported in
28).)
Why do PIs play a necessary role in the communication of a
CC? The answer may have something to do with the necessity for
providing anchor points in the addressee's mind. Because of its
lack of temporal particularity, the concept of a PI is a relatively
stable concept, and one which is liable to enter consciousness
again and again with respect to a wide variety of CCs. Thus, the
only way a speaker can effectively install the content of a CC
in the addressee's mind is to tie it to one or more PIs already
known to the addressee. That is, the usual way of communicating
information is by bringing one or more PI nodes into the addressee's
consciousness, and by predicating something of these nodes.
Language usually involves taking one PI (the "topic") as a starting
point and either predicating something of it alone, or tying it to
other PIs through a relational predicate.
It should be noted in passing that not everything which is
expressed syntactically as a noun is conceptually a PI. A word
like "Tuesday", for example, may be used as the name for what we
call a PT: a "particular time" which might be used to provide
temporal orientation in a sentence like "On Tuesday Mrs. Brown
gave Tommy a cookie."
In deciding to categorize a CC in a certain way, say as an
instance of UC-"GIVE", a speaker simultaneously establishes a
framework of PIs which are separated out from the content of the
CC. In the case of UC-"GIVE" these PIs will function as agent,
beneficiary, and patient (the giver, the givee, and the given).
The fact that these three PIs are entailed by the choice of UC-"GIVE"
is expressed as follows:
The letters A, B, C, and D in this statement are variables
ranging over particular four digit numbers. For example, CC-A might
be CC-1053, PI-B might be PI-1687, etc. The symbol E> is to be
read "entails", and F> is to be read "is framed as". (The
notation to the right of F> can be regarded as a "case frame"; hence
the appropriateness of the term "framing".)
The statement in 29), then, says that when one has chosen
to categorize a CC as an instance of UC-"GIVE", this
decision entails that the CC will be framed as, or expressed by,
the verb (VB) "GIVE" accompanied by three PIs, functioning as
agent, beneficiary, and patient. Statements like that in 29)
are stored in our English lexicon. This statement actually forms
only part of the lexical entry for UC-"GIVE". The complete entry
for this category contains a number of additional lines which
state various other entailments, for example that giving involves
transfer of ownership. These other aspects of lexical entries will
be discussed below.
To summarize, a CC of the appropriate size, arrived at through
subconceptualization, will be subject to categorization in terms
of some UC, the effect of which will be to create, by way of the
lexicon, a verbal label for the CC together with a framework of
associated nouns. The framing operation, in effect, will have
factored out those elements (PIs) having no significant temporal
particularity, leaving a word (the VB) to which alone that temporal
particularity will be assigned.
It is probably a consequence of its being left with this
temporal role that the VB is likely to end up carrying a temporal
marker of some kind, such as a tense and/or aspect suffix. If,
for example, the CC occupies a temporal locus that precedes the
locus of the speech act, the VB is likely to end up with a past
tense suffix attached. This part of lexicalization we call
inflection. Its implementation will be illustrated immediately
below.
Our program tries to establish at the outset for each CC
whether it can be categorized, on the assumption that the speaker
is aiming at such categorization as a goal, and that
subconceptualization takes place only when the content of the CC is such that
categorization is not appropriate. Thus the first question asked
of any CC is of the sort:
30) V: CAN CC-1053 BE CATEGORIZED?
If the user's answer is no, VAT goes on to ask how this CC is to
be subconceptualized, as in the example given in section III.
If, on the other hand, the user's answer is yes, VAT will go on
to ask questions relevant to the tense/aspect properties of the CC.
At present it asks first:
31) V: IS CC-1053 GENERIC?
since special considerations have to be given to CCs that do not
have temporal particularity. If the answer to 31) is no, VAT
presently assumes by default that CC-1053 has a temporal locus
preceding that of the speech act. This is certainly the most
probable state of affairs for most kinds of discourse. We would
like eventually to elaborate other possibilities, which are likely
to depend on adverbial and other means of establishing temporal
particularity. Our program at present will, under these
circumstances, add the inflectional notation "PAST" after a slash, as in:
32) CC-1053 / "PAST"
It is now time for the following exchange:
The user says that the decision has been to categorize this CC as
an instance of the category UC-"GIVE". VAT then looks into the
lexicon and, on the basis of the last line in 29), replaces 32) with:
Two other considerations are relevant at this point. For one
thing, VAT will want to replace the PI variables in 34) with
particular four digit numbers. Our easiest recourse at present is to
have VAT ask the user about each PI:
whereupon VAT will replace 34) with:
At least some of the answers to the questions in 35) ought, under
some circumstances, to be derivable from the context. We hope
gradually to teach VAT to discover such answers for itself
whenever possible.
A second consideration at this point is to establish which PI
is the subject or topic, the PI on which the speaker intends the
addressee's attention to be focused and concerning which something
will be asserted. Again the easy way out is for VAT to ask the user:
37) V: WHICH PI IS THE SUBJECT?
    U: PI-1234
The question in 37) is appropriate for a subject-prominent language
like English. If the verbalization is in a topic-prominent language
VAT will ask instead about the topic (Li 1974). In English this
may be the point at which functional relations such as agent,
beneficiary, and patient should be replaced by surface syntactic
roles like subject, indirect object, and direct object. (In
Japanese the introduction of particles like o and ni
would be appropriate here.) Thus, after 37) VAT may change the
representation in 36) to:
38) VB-"GIVE" / "PAST"
    PI-1234 SUBJ
    PI-1345 IO
    PI-1456 DO
where IO and DO stand for "indirect object" and "direct object".
Again, the identity of the topic will often be derivable from the
context. For example, all other things being equal, topics have a
tendency to remain constant from one clause to the next, agents
are more likely to be topics than patients, and so on. Considerable
empirical work will be necessary before all such factors have been
sorted out.
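
The sequence of steps for a categorized CC (inflection, framing,
binding of PI variables, and assignment of surface roles) can be
sketched as follows. This is an illustration under assumptions, not
a reconstruction of VAT: the dictionary layout, the role
abbreviations AGT, BEN and PAT for the GIVE frame, the role mapping,
and the function name lexicalize_cc are hypothetical, and the sketch
handles only the case where the subject is the agent.

```python
# Assumed lexicon fragment: each UC supplies a verb and a case frame.
LEXICON = {"GIVE": {"verb": "GIVE", "frame": ["AGT", "BEN", "PAT"]}}

# Assumed mapping from functional relations to English surface roles.
SURFACE = {"AGT": "SUBJ", "BEN": "IO", "PAT": "DO"}

def lexicalize_cc(uc, generic, bindings, subject):
    """Return the representation lines for a categorized CC."""
    entry = LEXICON[uc]
    if bindings["AGT"] != subject:
        # Non-agent subjects (e.g. passives) are outside this sketch.
        raise NotImplementedError("only agent subjects are handled")
    head = f'VB-"{entry["verb"]}"'
    if not generic:
        # Default: the CC's temporal locus precedes the speech act.
        head += ' / "PAST"'
    lines = [head]
    for role in entry["frame"]:
        # Bind each framed PI and replace its role with a surface role.
        lines.append(f"{bindings[role]} {SURFACE[role]}")
    return lines

result = lexicalize_cc(
    "GIVE",
    generic=False,
    bindings={"AGT": "PI-1234", "BEN": "PI-1345", "PAT": "PI-1456"},
    subject="PI-1234",
)
print("\n".join(result))
```

In the dialogue version each argument would come from a user answer;
here they are passed in directly so that the ordering of the steps
is easy to follow.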
If the codability of CC-1053 had been somewhat lower and the
modified categorization exemplified in 28) had been chosen, the
representation at this stage would include an adverb (AV):
The lexicalization of CC-1053, then, has involved
categorization, possibly modification, inflection, and framing. The next
step in verbalization is to lexicalize the several PIs which are
contained in a representation like 38) or 39). We will see that
the lexicalization of a PI involves categorization, possibly
modification, and inflection. Framing is for the most part restricted
to the lexicalization of a CC.
V. Lexicalization of a PI
A PI is the concept of a concrete object, be it animate or
inanimate, or of an abstraction which has been reified and is being
treated linguistically in ways analogous to the treatment of
physical objects. The surface linguistic representation of a PI
may be a proper noun, a common noun, a pronoun, or nothing at all.
Furthermore, by agreement processes certain features of the PI may
be incorporated into the verb with which it is associated. Each
language has its own idiosyncrasies in the treatment of PIs. Some,
like Japanese, are especially fond of deleting the PI altogether
whenever it is predictable from context. Some, of the polysynthetic
type, seem to go overboard in the extent to which they incorporate
features of the noun within the verb. Some make a point of adding
inflectional features expressing "definiteness", plurality, and the
like to the surface noun, while others seem to get along well
without such expression. For illustrative purposes we will confine
ourselves in this section to the main outlines of how a PI is
lexicalized in English.
Much depends on whether or not the PI in question is "given"--
whether it is a piece of knowledge that the speaker believes has
already been brought into the addressee's consciousness in some
way, prior to the uttering of the present sentence (Chafe 1974).
Here again we have a case where the easiest course for VAT at this
preliminary stage of its development is to ask the user:
40) V: IS PI-1234 GIVEN?
Certainly in many cases, however, VAT can be taught to decide this
for itself. If, for example, PI-1234 was mentioned in the preceding
sentence the answer to 40) must be yes. If the preceding sentence
was "Mrs. Brown came over from next door" and we are concerned with
the lexicalization of PI-1234 within the sentence "PI-1234 gave
Tommy a cookie", the givenness of PI-1234 will result in its
lexicalization as "SHE". We can actually go a fair distance in
establishing the givenness of a PI on this basis alone, but the
question of how else givenness is established, including its
introduction from knowledge external to the linguistic text
altogether, calls for extensive further work.
Let us assume first that the answer to 40) has been yes, in
which case English is likely to lexicalize PI-1234 with a pronoun.
This is not always the case; sometimes a PI that is given will not
be pronominalized. The principal criterion here seems to be whether
pronominalization will produce ambiguity, and ultimately VAT will
need to decide whether ambiguity will result. For now, however,
we proceed on the assumption that a PI which is given will
automatically be pronominalized.
The procedure we are currently using for pronominalization in
English asks first:
41) V: IS PI-1234 THE ADDRESSEE?
This question is asked first because the pronoun "YOU" does not
distinguish number, and if the answer to 41) is yes it will not be
necessary for VAT to do anything beyond lexicalizing PI-1234 as
NN-"YOU" (NN, of course, for "noun"). If, on the other hand, the
answer to 41) is no, then VAT must ask:
42) V: WHAT IS THE CARDINALITY OF PI-1234?
We assume that a PI is from one point of view the concept of a
set of objects, and that the cardinality of the set is relevant
in establishing expressions of singularity and plurality, among
other things. Actually the distinction between one and more than
one as possible answers to 42) is all that is relevant at the
moment. More interesting questions do arise in this area. For
example, with cardinalities up to about five there is likely to be
a need for distinguishing each member of the set with a specific
PI number, whereas with larger cardinalities the set is likely to
be conceived of simply as containing "a number of" or "many" members.
If we assume first that the answer to 42) is one, then VAT will
ask:
43) V: IS PI-1234 THE SPEAKER?
If the answer is yes, then PI-1234 is lexicalized as NN-"I". If
no, then we are dealing with a third person referent and VAT must
determine its gender:
44) V: IS PI-1234 ANTHROPOMORPHIZED?
This classification includes human beings, but also named animals
such as pets. If the answer to 44) is no, VAT will lexicalize
PI-1234 as NN-"IT". Otherwise it must find the sex of this referent:
45) V: IS PI-1234 MALE OR FEMALE?
and lexicalize it as NN-"HE" or NN-"SHE" accordingly.
If the answer to 42) was a number greater than one, VAT must
decide between "WE" and "THEY", the pronouns which are explicitly
plural. Essentially it must ask:
46) V: IS THE SPEAKER A MEMBER OF PI-1234?
If yes, it will produce the lexicalization NN-"WE" and if no,
NN-"THEY".
There are again a variety of ways in which VAT might be able
to answer questions like 41) through 46) without asking the user.
The identity of speaker and addressee will have been established
by providing such discourse parameters at the very beginning of the
discourse; at present we use the arbitrary convention that PI-1001
is the speaker and PI-1002 the addressee. In questions 41) and 43)
VAT is asking whether PI-1234 is identical to PI-1002 or PI-1001.
But, depending on the context, this identity may already have been
established. As for the cardinality of PI-1234, it may have been
made explicit through a numeral or in some other way. And the
gender of the referent might have been established through the
previous use of a sex-specific proper name, or through some other
fact that has already been supplied.
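
The pronominalization questions 41) through 46) form a small decision
tree, which the following Python sketch makes explicit. The answers
that VAT would obtain from the user are passed in as arguments; the
function name pronoun and its parameter names are hypothetical, and
only the convention that PI-1001 is the speaker and PI-1002 the
addressee comes from the text.

```python
SPEAKER, ADDRESSEE = "PI-1001", "PI-1002"  # convention stated in the text

def pronoun(pi, cardinality=1, members=(), anthropomorphized=True, sex=None):
    """Choose an English pronoun for a given PI."""
    # 41) "YOU" does not distinguish number, so it is checked first.
    if pi == ADDRESSEE:
        return 'NN-"YOU"'
    # 42) Only the contrast between one and more than one matters here.
    if cardinality == 1:
        # 43) Is the PI the speaker?
        if pi == SPEAKER:
            return 'NN-"I"'
        # 44) Third person: non-anthropomorphized referents are "IT".
        if not anthropomorphized:
            return 'NN-"IT"'
        # 45) Otherwise the sex of the referent decides.
        if sex not in ("male", "female"):
            raise ValueError("sex must be 'male' or 'female'")
        return 'NN-"HE"' if sex == "male" else 'NN-"SHE"'
    # 46) Plural: is the speaker a member of the set?
    return 'NN-"WE"' if SPEAKER in members else 'NN-"THEY"'

print(pronoun("PI-1234", sex="female"))
print(pronoun("PI-1300", cardinality=2, members=("PI-1001", "PI-1234")))
```

Replacing an argument with a lookup in the preceding discourse is
the kind of step by which the user questions could gradually be
eliminated.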
Let us now turn to the possibility that PI-1234 is not given--
that the answer to question 40) was no. In that case, lexicalization
must be either in terms of a proper name, or through the use of a
categorization and ultimately a common noun. VAT first asks:
47) V: DOES PI-1234 HAVE A NAME?
If yes, the user gives the name and VAT lexicalizes PI-1234 as
NN-"JOHN" or the like. The real situation is not quite this simple,
since a PI is likely to have more than one proper name (John, Mr.
Brown, Daddy, etc.), the choice of which, if any, among them to
use will depend on various interpersonal considerations. Eventually
our program should include questions relevant to such a choice.
If the answer to 47) is no, then VAT follows a procedure
roughly analogous to that associated with the categorization of a
CC:
48) V: HOW IS PI-1234 CATEGORIZED?
    U: TEACHER
(for example). Basically, at this point, VAT will replace PI-1234
with NN-"TEACHER". At the same time it will store the statement:
49) PI-1234 C> UC-"TEACHER"
and will look at the lexical entry for this category for whatever
relevant information is stored there.
Just as a CC may be given a lexicalization that is inflected for
tense and/or aspect, the lexicalization of a PI may be given
inflections for such features as number and/or definiteness. If the
lexicon shows, for example, that UC-"TEACHER" entails that PI-1234
is countable, VAT will also in this case ask about its cardinality,
as in 42) above. If the answer is a number greater than one, VAT
will create a representation like NN-"TEACHER" / "PLURAL".
Independent of this number question, VAT will need to determine whether
the use of this category in this context will enable the addressee
to know what particular instance of the category is being talked
about. We put this in terms of the question:
50) V: DOES UC-"TEACHER" IDENTIFY PI-1234?
If yes, VAT will add the definite article (AR) as an inflection:
NN-"TEACHER" / AR-"THE". If no--that is, if the addressee is
assumed not to be able to identify a previously known PI as the referent,
VAT will decide between the indefinite articles AR-"A" and AR-
"SOME" depending on whether the cardinality of PI-1234 is one or
greater than one. The outcome will thus be either NN-"TEACHER" /
AR-"A" or NN-"TEACHER" / "PLURAL" / AR-"SOME"; that is, "a teacher"
or "some teachers". We have attempted to formalize some of the
contextual grounds on which VAT will be able to answer a question
like 50) without asking the user, and this matter will be discussed
in section VII below.
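
The number and definiteness inflections for a categorized PI can be
sketched as a single function. The sketch makes assumptions beyond
the text: the prefix AR for articles is a guess at a label that is
damaged in the source, and the function name lexicalize_noun and its
parameters are hypothetical. Uncountable categories are simply left
without a plural marker.

```python
# Hypothetical sketch of noun inflection for a PI that is not given.
# "AR" is an assumed label for the article slot.

def lexicalize_noun(category, cardinality=1, identifiable=False, countable=True):
    """Return a slash-notation representation of an inflected noun."""
    parts = [f'NN-"{category}"']
    plural = countable and cardinality > 1
    if plural:
        parts.append('"PLURAL"')
    if identifiable:
        # The category suffices for the addressee to pick out the referent.
        article = "THE"
    else:
        # Indefinite: "A" for one, "SOME" for more than one.
        article = "SOME" if plural else "A"
    parts.append(f'AR-"{article}"')
    return " / ".join(parts)

print(lexicalize_noun("TEACHER"))
print(lexicalize_noun("TEACHER", cardinality=3))
print(lexicalize_noun("TEACHER", identifiable=True))
```

The identifiable argument stands in for the answer to the
definiteness question, which the program would eventually try to
derive from context.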
VI. The Lexicon
In all its operations VAT must at many points have access to
a store of more or less permanent lexical knowledge which we have
formalized in terms of entailments of categories. The statements
in the lexicon specify what we know about a particular CC or PI as
a result of its being identified as an instance of a certain
category. Or, to look at it from the opposite point of view, these
statements say what properties a particular CC or PI must have in
order to be categorized in a certain way. From the first point of
view we can say that once we know that a particular CC has been
categorized as an instance of UC-"GIVE", for example, the lexicon
tells us a number of other things that we must know about this
CC. From the second point of view we can say that the lexical
entry for UC-"GIVE" tells us what we must know about a CC in order
to assign it to this category. These two ways of viewing lexical
entries are not in contradiction, but are different sides of the
same coin.
From a psychological standpoint the lexicon approximates a
description of everything that is involved in a person's interpretation
of the world, at least so far as his interpretive grid is
dependent on verbal categories. We are unable, of course, to focus
on individual differences, but must attempt to deal with a core that
is common to the speakers of a particular language. The lexicon is
the heart of our program, whether we are engaged in verbalization
or translation, and everything else depends on the success
with which the lexicon has been elaborated. A separate lexicon has
to be developed for each language with which the program tries to
deal. In a full-fledged implementation certainly a very high
proportion of the total developmental effort will have to be devoted
to lexical questions.
As a simple illustration of the kind of information a lexical
entry might contain, as well as of the formalism we have been using
to represent such information, let us consider at least part of
what it means for a particular CC to be categorized as an instance
of UC-"LIFT". We will want to say that when X lifts Y, it entails
that X does something which causes a change of state from Y being
in one location to Y being in another location, and furthermore
that the new location is above the old location. The lexical entry
for UC-"LIFT", insofar as it captures this much information, is
written as follows:
51) CC-A C> UC-"LIFT"
    E>
    CC-A F> VB-"LIFT" (PI-B AGT, PI-C PAT)
    CC-A S> CJ-CAUSE (CC-D, CC-E)
    CC-D F> VB-ACT (PI-B)
    CC-E S> CJ-CONJUNCTION (CJ-CHANGE (CC-F, CC-G), CC-H)
    CC-F F> VB-AT (PI-C, PL-I)
    CC-G F> VB-AT (PI-C, PL-J)
    CC-H F> VB-ABOVE (PL-J, PL-I)
The first two lines are to be read, "If CC-A is categorized as an
instance of UC-"LIFT", this entails..." The first line under E> then
gives the case frame, saying that there will be a clause containing
the verb "LIFT" accompanied by an agent (PI-B) and a patient (PI-C).
The second line under E> says that it is alternatively possible to
subconceptualize CC-A in a certain way, which amounts to a paraphrase.
That is, although the speaker has chosen not to subconceptualize
CC-A further (presumably because the choice of UC-"LIFT" has been
judged to provide the right packaging for CC-A), if he had decided
to subconceptualize further he could have done it in the manner
specified in this line, where two new CCs, CC-D and CC-E, are joined
by CJ-CAUSE. In other words CC-D is conceived of as causing CC-E.
The third line under E> says something about the content of CC-D,
namely that it involves an act by PI-B. (It may be noted that the
absence of quotes around ACT in VB-ACT indicates that this is not a
conceptual unit that will lead to a direct surface structure
representation, as will VB-"LIFT".) The fourth line under E> says
that CC-E, which is caused by this act, can be subconceptualized
into two conjoined elements. The first of these is a change from
CC-F to CC-G and the second is CC-H. The fifth and sixth lines under
E> specify the nature of the prior and subsequent states, CC-F and
CC-G. Both involve PI-C being at some location, first PL-I and then
PL-J (PL standing for "particular location"). The last line
elucidates CC-H, stating that the new location (PL-J) is above the
old location (PL-I). Thus 51) has captured formally the several bits
of knowledge about CC-A that were summarized discursively at the
beginning of this paragraph.
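As a minimal sketch of how an entry of this kind might be held in a
program, the Python fragment below stores a subset of the lines of 51)
as records. The class names Statement and LexicalEntry and the function
is_surface are hypothetical and are not part of VAT; the only rule
taken from the text is that a quoted label (as in VB-"LIFT") leads to
a surface word while an unquoted one (as in VB-ACT) does not.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    subject: str       # e.g. "CC-A"
    relation: str      # "C>", "F>" or "S>"
    predicate: str     # e.g. 'VB-"LIFT"' or "CJ-CAUSE"
    args: tuple = ()   # variables, optionally paired with a case role


@dataclass
class LexicalEntry:
    head: Statement
    entailments: list = field(default_factory=list)


# A subset of the lines of 51); the fourth line under E> is left out.
LIFT = LexicalEntry(
    head=Statement("CC-A", "C>", 'UC-"LIFT"'),
    entailments=[
        Statement("CC-A", "F>", 'VB-"LIFT"', (("PI-B", "AGT"), ("PI-C", "PAT"))),
        Statement("CC-A", "S>", "CJ-CAUSE", ("CC-D", "CC-E")),
        Statement("CC-D", "F>", "VB-ACT", ("PI-B",)),
        Statement("CC-F", "F>", "VB-AT", ("PI-C", "PL-I")),
        Statement("CC-G", "F>", "VB-AT", ("PI-C", "PL-J")),
        Statement("CC-H", "F>", "VB-ABOVE", ("PL-J", "PL-I")),
    ],
)


def is_surface(predicate: str) -> bool:
    """A quoted label leads directly to a surface word; an unquoted one is abstract."""
    return '"' in predicate


surface_predicates = [s.predicate for s in LIFT.entailments if is_surface(s.predicate)]
```

Here surface_predicates contains only the case-frame verb; every other
line of the entry is abstract knowledge that is available for
paraphrase.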
Let us now turn to a more complicated example. This example
came up initially as a result of the observation that the Japanese
verb kasu can be translated into English as either rent (out) or
lend. In other words this verb is nonspecific as to whether the
agent does or does not receive money for the goods or services he
provides. We were interested in how a translation from Japanese
into English would decide whether to use rent or lend where the
Japanese had used kasu. This problem led us to consider lexical
entries for several verbs involving transfers and transactions, and
we arrived at a system of cross-referencing and embedding within
lexical entries that captures the content of abstract notions
(such as transfer and transaction) at the same time that it links
entries one to another in a way that is generally useful.
We may begin by defining a transfer. We assume a category
UC-TRANSFER which, since it does not contain quotation marks, is
understood to be abstract and not immediately convertible into a
surface structure verb. The lexical entry reads as follows:
52) CC-A C> UC-TRANSFER
    E>
       CC-A S> CJ-CHANGE (CC-B, CC-C)
       CC-B F> VB-HAVE (PI-D, PI-E)
       CC-C F> VB-HAVE (PI-F, PI-E)
Discursively, a CC-A which has been categorized as an instance of
UC-TRANSFER can alternatively be subconceptualized (or paraphrased)
in terms of a change from CC-B to CC-C, where the former involves
PI-D "having" PI-E, and the latter involves another party, PI-F,
having PI-E. In other words, a transfer involves a change in the
having of some object (PI-E) from one individual to another. The
English word have of course performs a variety of semantic functions;
our use of it in this formalism is meant to include at least two
varieties of having--ownership, which we will label HAVE-OWN, and
having the use of something, which we will call HAVE-USE. Simple
HAVE, as in 52), is meant to be nonspecific as to which of these
varieties of having is involved, as may be accounted for with the
following two statements:
53) CC-A C> UC-HAVE-OWN
    E>
       CC-A C> UC-HAVE
    CC-A C> UC-HAVE-USE
    E>
       CC-A C> UC-HAVE
One example of a transfer is the kind which is categorizable
with UC-"GIVE", whose lexical entry can be given as follows:
54) CC-A C> UC-"GIVE"
    E>
       CC-A F> VB-"GIVE" (PI-B↑AGT, ?PI-C↑BEN, PI-D↑PAT)
       CC-A C> UC-TRANSFER
          PI-D = PI-B
          PI-F = PI-C
          PI-E = PI-D
That is, a CC which has been categorized as an instance of UC-"GIVE"
has the case frame shown in the first line under E>. The question
mark before the beneficiary indicates that it is optional; one can
say "Roger gave a book" without mentioning a beneficiary. The
second line under E> shows that this CC can also be categorized as
an instance of UC-TRANSFER. This fact means that the CC also has
the entailments listed in 52). Since the variables within each
lexical entry are arbitrarily labeled A, B, C, etc., it is necessary
now to state equivalences between the variables in the entry for
UC-"GIVE" and those in the entry for UC-TRANSFER. These equivalences
are listed, indented, in the last three lines of 54). They are to
be read, "PI-D of the TRANSFER entry is equivalent to PI-B of the
"GIVE" entry (the giver); PI-F of the TRANSFER entry is equivalent
to PI-C of the "GIVE" entry (the givee); and PI-E of the TRANSFER
entry is equivalent to PI-D of the "GIVE" entry (the given)." In
this way 54) and 52) are brought into the correct alignment.
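The alignment just described can be pictured as a renaming of
variables. The Python sketch below is only an illustration under
stated assumptions: the function inherit, the "T." prefix given to
TRANSFER variables that have no stated equivalent, and the decision to
treat CC-A as shared by both entries are all hypothetical. The
renaming is done in a single pass so that mapping PI-D to PI-B and
PI-E to PI-D do not interfere with each other.

```python
import re

# Entailments of UC-TRANSFER, as in 52).
TRANSFER = [
    "CC-A S> CJ-CHANGE (CC-B, CC-C)",
    "CC-B F> VB-HAVE (PI-D, PI-E)",
    "CC-C F> VB-HAVE (PI-F, PI-E)",
]

# Left: variable of the TRANSFER entry. Right: variable of the "GIVE" entry.
GIVE_EQUIVALENCES = {"PI-D": "PI-B", "PI-F": "PI-C", "PI-E": "PI-D"}

# Matches variables such as CC-B, PI-D or PL-J, but not labels like CJ-CHANGE.
VARIABLE = re.compile(r"\b(?:CC|PI|PL)-[A-Z]\b")


def inherit(statements, equivalences, prefix="T."):
    """Rewrite borrowed statements in terms of the borrowing entry's variables."""
    def rename(match):
        name = match.group(0)
        if name == "CC-A":          # assumption: the categorized CC is shared
            return name
        # Unmapped variables get a prefix so they cannot clash with local ones.
        return equivalences.get(name, prefix + name)
    return [VARIABLE.sub(rename, s) for s in statements]


give_entailments = inherit(TRANSFER, GIVE_EQUIVALENCES)
```

After renaming, the two states read as the giver (PI-B) having the
given object (PI-D) and then the givee (PI-C) having it.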
Another, more complicated kind of transfer is that involved in
the category UC-"LEND":
55) CC-A C> UC-"LEND"
    E>
       CC-A F> VB-"LEND" (PI-B↑AGT, ?PI-C↑BEN, PI-D↑PAT)
       CC-A C> UC-TRANSFER
          PI-D = PI-B
          PI-F = PI-C
          PI-E = PI-D
          CC-B = CC-E
          CC-C = CC-F
       CC-E C> UC-HAVE-USE
       CC-F C> UC-HAVE-USE
       VB-HAVE-OWN (PI-B, PI-D)
       CC-A * -C> UC-TRANSACTION
The first seven lines of this entry are entirely parallel to the
entry for UC-"GIVE" in 54). It then becomes necessary to refer to
the earlier and later states, CC-B and CC-C, of the TRANSFER entry.
These are equated with CC-E and CC-F of the "LEND" entry. It is
said that both of these states involve HAVE-USE. That is, when X
lends an object to Y, in the earlier state X has use of the object
and in the later state Y does. The next to last line says that
PI-B, the agent of the lending, maintains ownership of PI-D
throughout. The last line says that CC-A cannot be categorized as a
transaction, as explained below. Evidently the only difference between
55) and the entry for UC-"KAS-" (kasu) in Japanese is that for
the latter the last line of 55) is missing. Thus, kasu leaves it
undecided whether a transaction was involved or not.
What, then, is a transaction? Essentially it is a linking of
two transfers, where one of the transfers is for the purpose of the
other. In buying, for example, a typical transaction, the buyer
gives money to the seller so that the seller will give him some
object in return. With buying, a change of ownership is involved
in both transfers, but that need not be the case. With renting,
for example, there is a change of ownership of the money, but only
a change of use of the object. We define a transaction as follows:
56) CC-A C> UC-TRANSACTION
    E>
       CC-A S> CJ-PURPOSE (CC-B, CC-C)
       CC-B C> UC-TRANSFER
          PI-D = PI-D
          PI-E = PI-E
          PI-F = PI-F
       CC-C C> UC-TRANSFER
          PI-F = PI-D
          PI-E = PI-G
          PI-D = PI-F
The first line under E> states that CC-A can be paraphrased in terms
of CC-B and CC-C, the former being for the purpose of the latter.
CC-B is a transfer in which PI-D (e.g. the buyer) transfers PI-E
(e.g. money) to PI-F (e.g. the seller). CC-C is a transfer in which
the roles of PI-D and PI-F (and hence their relation to the variables
in 52)) are reversed. Furthermore, the object transferred (e.g.
the thing bought) is a different one--here PI-G.
Besides buying and selling, another typical transaction is
renting. The English word rent is ambiguous, and we will illustrate
here the entry for what we call UC-"RENT-2", which is renting out:
57) CC-A C> UC-"RENT-2"
    E>
       CC-A F> VB-"RENT" (PI-B↑AGT, ?PI-C↑BEN, ?PI-D↑MSR, PI-E↑PAT)
       CC-A C> UC-TRANSACTION
          PI-F = PI-B
          PI-D = PI-C
          PI-E = PI-D
          PI-G = PI-E
          CC-B = CC-F
          CC-C = CC-G
       CC-F C> UC-TRANSFER
          CC-B = CC-H
          CC-C = CC-I
       CC-G C> UC-TRANSFER
          CC-B = CC-J
          CC-C = CC-K
       PI-D C> UC-MEDIUM-OF-EXCHANGE
       CC-H C> UC-HAVE-OWN
       CC-I C> UC-HAVE-OWN
       CC-J C> UC-HAVE-USE
       CC-K C> UC-HAVE-USE
       VB-HAVE-OWN (PI-B, PI-E)
The first line under E> gives the case frame, which includes two
obligatory cases, an agent and a patient ("Bill rented (out) his
lawnmower") and an optional beneficiary and measure (MSR) ("Bill
rented his lawnmower to Tom for five dollars"). The second line
under E> says that CC-A is a transaction; it thus conforms to 56)
and it is necessary to state the equivalences between the PIs in 57)
and those in 56). Below these PI equivalences it is also stated
that the CC-B of the TRANSACTION definition (the transfer of money)
is equivalent to CC-F of the "RENT-2" definition, while CC-C of the
TRANSACTION definition (the transfer of the object) is equivalent
to CC-G of "RENT-2". The two states of the first TRANSFER are
named CC-H and CC-I, while the two states of the second TRANSFER are
named CC-J and CC-K. It is then said that the measure, PI-D, must
be something categorizable as a MEDIUM-OF-EXCHANGE--normally money,
but potentially anything that would perform this function. The two
states of the first TRANSFER are then both said to be instances of
UC-HAVE-OWN, since the money actually changes ownership. The two
states of the second transfer, on the other hand, are instances of
UC-HAVE-USE, since the object does not change ownership, but only
use. The last line, like the next to last line of 55), says that
the agent of the renting retains ownership of the object.
It was mentioned that the lexical entry for Japanese UC-"KAS-"
is the same as that for English UC-"LEND", as in 55), except that
the Japanese entry lacks the last line of 55) in which it is
stipulated that lending cannot be a transaction. It can now be
seen that UC-"KAS-" is compatible with both 55) and 57). We thus
have a formal explanation for the fact that kasu may be translated
as either lend or rent. In order to decide between the two
translations, it is necessary to search the context in which this CC
occurs to discover whether it is or is not a transaction. We will
return to this matter in our discussion of translation in section
VIII.
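The compatibility argument can be reduced to a small Python sketch.
It is an illustration only: the table TRANSACTION_CONSTRAINT and the
function english_for_kasu are hypothetical names, and each English
entry is collapsed to the single fact that distinguishes it here,
namely whether it excludes a transaction (the last line of 55)) or
requires one (the second line under E> in 57)).

```python
from typing import List, Optional

# False: the entry says the CC cannot be a transaction (LEND).
# True:  the entry says the CC is a transaction (RENT-2).
TRANSACTION_CONSTRAINT = {"LEND": False, "RENT-2": True}


def english_for_kasu(is_transaction: Optional[bool]) -> List[str]:
    """Return the English categories still compatible with UC-"KAS-".

    is_transaction is what a search of the context has established:
    True, False, or None when the context says nothing either way.
    """
    return [
        category
        for category, required in TRANSACTION_CONSTRAINT.items()
        if is_transaction is None or required == is_transaction
    ]


undecided = english_for_kasu(None)
with_payment = english_for_kasu(True)
without_payment = english_for_kasu(False)
```

With no contextual evidence both candidates survive, which mirrors
the fact that kasu itself leaves the matter undecided.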
Lexical entries for categories whose instances are PIs are
designed to elucidate the knowledge which is entailed by the
assignment of a particular PI to some category. Such entries do
not contain a case frame, but are otherwise similar in format to
the entries for categories whose instances are CCs, as described
above. As a simple example, we may note that when a PI is
categorized as an instance of UC-"CAR" there is an entailment that this
PI will "have" a trunk. This kind of having is different from
those discussed in connection with transfers and transactions in
the last section; we represent it with HAVE-AS-PART:
58) PI-A C> UC-"CAR"
    E>
       VB-HAVE-AS-PART (PI-A, PI-B)
       PI-B C> UC-"TRUNK"
It is useful here (and elsewhere in the lexicon) to distinguish
between necessary entailments and expected entailments or default
options. The latter constitute knowledge that is normally entailed
by the category, but not necessarily so. We indicate entailments
of this sort with a prefixed "E:". As an example we may note that
something which has been categorized as a MEDIUM-OF-EXCHANGE (cf.
57)) is normally expected to be money, although in some circumstances
it might be cowry shells or wampum:
59) PI-A C> UC-MEDIUM-OF-EXCHANGE
    E>
       E: PI-A C> UC-"MONEY"
A more complex example involves the categorization of a PI
as an instance of UC-"BEAGLE". In this case we know that the PI is
also categorizable as an instance of UC-"DOG", that we may expect
that it will have a tail (although some dogs do not), that it will
bark, and that it will chase cats:
60) PI-A C> UC-"BEAGLE"
    E>
       PI-A C> UC-"DOG"
       E: VB-HAVE-AS-PART (PI-A, PI-B)
       PI-B C> UC-"TAIL"
       E: VB-BARK (PI-A)
       E: VB-CHASE (PI-A, PI-C)
       PI-C C> UC-"CAT"
It may be that E: should be expressed as a probability;
that is, that there is a continuous range over which we may expect
something to be entailed, with necessary entailment being one extreme.
At least for practical purposes, however, it proves useful to make
a three-way distinction between necessary entailments (unmarked),
default expectations (E:), and a third type which we call optional
entailments and mark with "O:". These last represent a lower
degree of probability; they are entailments which are neither
necessary nor expected, but which are easily possible. For example, a
bicycle need not have a basket and is not expected to have a basket,
but it may very well have one:
The distinction between necessary or expected and optional
entailments is of interest when it comes to the assignment of
definiteness, as discussed in the following section.
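A minimal Python sketch of the three-way marking follows. The names
Strength and parse_entailment are hypothetical, and the "O:" line used
in the example is an invented illustration of the notation rather than
a line quoted from the lexicon.

```python
from enum import Enum


class Strength(Enum):
    NECESSARY = ""     # unmarked
    EXPECTED = "E:"    # default expectation
    OPTIONAL = "O:"    # neither necessary nor expected, but easily possible


def parse_entailment(line: str):
    """Split one entailment line into its strength and its content."""
    line = line.strip()
    for strength in (Strength.EXPECTED, Strength.OPTIONAL):
        if line.startswith(strength.value):
            return strength, line[len(strength.value):].strip()
    return Strength.NECESSARY, line


necessary = parse_entailment('PI-A C> UC-"DOG"')
expected = parse_entailment("E: VB-BARK (PI-A)")
optional = parse_entailment("O: VB-HAVE-AS-PART (PI-A, PI-C)")  # invented line
```

If E: were instead treated as a probability, as the text allows, the
enumeration could be replaced by a number between 0 and 1 with
necessary entailment at one extreme.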
VII. Discourse Information and Readjustments
A speaker needs access to three major classes of information
in order to verbalize successfully. First, of course, he must have
an idea of what he wants to talk about: the content of the
verbalization. Second, he must have access to general knowledge that is
relevant, the kind of knowledge that we are attempting to
characterize in the lexicon. But there is a third kind also. The speaker
must keep track of knowledge having to do with the very fact that
he is verbalizing: knowledge about the speech act itself, and its
effect on the person his verbalization is addressed to. It is this
third kind of knowledge that we are calling discourse information.
We are concerned in this area with such factors as the identity and
social relationship of the speaker and the addressee, the time and
place of the speech act, and factors which relate the content of the
discourse to what is assumed to be going on in the mind of the
[...] the act of verbalization as an event in itself, since the
verbal[...] discourse. Discourse information is kept by VAT in
temporary storage. Unlike information in the lexicon, it is specific to
[...] within a particular discourse rather than being
potentially applicable to an unlimited number of different discourses.
Our treatment of discourse information is at present rudimentary
and uneven. So far as speaker and addressee are concerned,
we simply enter into discourse information storage statements like
the following:
(The prefix SP stands for "system predicate"; it is used for a
variety of predicates associated with discourse information.)
The program makes use of this information in various ways. For
example, in deciding how to lexicalize PI-1001 and PI-1002 VAT makes
use of information like that in 62) in order to arrive at first and
second person pronouns; cf. question 1) and 43) in section V above.
Probably in most languages to some degree, but especially in
many Asian languages, the social relationship between the speaker
and addressee plays a role of some kind in verbalization. We have
been interested in introducing such considerations into our
verbalization procedure, and have so far concentrated on the question of
how VAT should decide to categorize in Japanese a CC which in
English would be categorized as an instance of UC-"GIVE". There
are several categories in the Japanese lexicon, all of which conform
to the definition of UC-"GIVE" in 54) above, but which differ from
each other with respect to the speaker-addressee relationship. How
the choice can be made is most easily illustrated in the context of
a translation procedure, and we will return to this example in
section IX.
VAT does little at present with considerations of the time and
place of the speech act. Statements like the following can be
included with discourse information:
(where PL stands for "particular location" and PT for "particular
time"). Whether PL-1357 and PT-1579 remain throughout the discourse
or are replaced by other places and times depends on the nature of
the discourse itself; sometimes there will be significant changes
in these parameters and sometimes not. In any case it is possible
for VAT to answer questions about tense, for example, by asking
whether the time of the CC that is being verbalized is before or
after, or whether it includes, the time which has been specified
as NOW, such as PT-1579 in 63).
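The three-way question about tense can be sketched as an interval
comparison. The Python fragment below is an assumption-laden
illustration: it represents each time as a (start, end) pair on an
arbitrary numeric line, which the text does not specify, and the
function name relation_to_now is hypothetical.

```python
def relation_to_now(event, now):
    """Compare the time of a CC with the time specified as NOW.

    Both arguments are (start, end) pairs on an arbitrary numeric time line.
    """
    e_start, e_end = event
    n_start, n_end = now
    if e_end < n_start:
        return "BEFORE"
    if e_start > n_end:
        return "AFTER"
    if e_start <= n_start and n_end <= e_end:
        return "INCLUDES"
    return "OVERLAPS"   # partial overlap; not one of the three cases in the text


NOW = (5, 5)
earlier = relation_to_now((0, 1), NOW)
later = relation_to_now((6, 7), NOW)
ongoing = relation_to_now((0, 10), NOW)
```

How each relation is then mapped onto a tense form would depend on
the grammar of the language being verbalized.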
Discourse information is subject to change as the discourse
proceeds. The way in which VAT presently accomplishes such changes
is through readjustment processes, applied immediately after each
sentence has been completely verbalized. These readjustments
specify the ways in which the store of discourse information has
been affected by the sentence. One of them, for example, creates
a CC which is the concept of the event of producing the sentence
itself, which subsequently can be treated like any other event.
Everything involved in the verbalization of that sentence belongs
to the content of this CC. If, for example, the speaker subsequently
has reason to repeat what he originally said, he may verbalize in
exactly the same way (quote himself directly), or he may "say the
same thing in different words" by making different choices in
categorization and so on. The relevant information is available within
the CC that represents the original verbalization.
Another readjustment has to do with the establishment of
"givenness" for items communicated in the sentence. For each PI-A,
for example, there will be, when the sentence has been completely
verbalized, a readjustment process statable as:
64) SP-GIVEN (PI-A)
If, for example, the sentence in question was "Mrs. Brown gave
Tommy a cookie" and Mrs. Brown, Tommy, and the cookie are PI-1234,
PI-1345, and PI-1456 respectively, then readjustments after the
production of this sentence will create the statements:
65) SP-GIVEN (PI-1234)
    SP-GIVEN (PI-1345)
    SP-GIVEN (PI-1456)
If any or all of these PIs occur in the next sentence, they will
be pronominalized, and it will not be necessary for VAT to ask the
user a question like 40) above (IS PI-1234 GIVEN?). Thus, the next
sentence might be "He took them from her gratefully."
It is difficult to decide when statements like those in 65)
should be deleted from the store of discourse information--when
givenness evaporates. After a certain period of time has elapsed
in which the PI has not been talked about or otherwise kept in the
addressee's consciousness, the speaker will probably no longer
pronominalize it. At present we let statements like those in 65)
remain only through the following sentence. Thus, if PI-1234, for
example, does not occur in the next sentence it will not be treated
as given two sentences later, and will not be pronominalized. Not
all discourse works in this way, but this device provides a useful
temporary approximation.
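The one-sentence lifetime of SP-GIVEN statements can be sketched in a
few lines of Python. The class GivennessStore and its method names are
hypothetical; the only behavior taken from the text is that the
readjustment runs after each sentence and that a PI not mentioned in
a sentence stops being given once that sentence is complete.

```python
class GivennessStore:
    """Holds the PIs for which an SP-GIVEN statement is currently stored."""

    def __init__(self):
        self.given = set()

    def is_given(self, pi: str) -> bool:
        return pi in self.given

    def readjust(self, pis_in_sentence):
        """Run after a sentence has been completely verbalized."""
        # SP-GIVEN statements last only through the following sentence,
        # so the new set replaces the old one instead of adding to it.
        self.given = set(pis_in_sentence)


store = GivennessStore()
store.readjust(["PI-1234", "PI-1345", "PI-1456"])   # "Mrs. Brown gave Tommy a cookie"
given_in_second_sentence = store.is_given("PI-1234")
store.readjust(["PI-1345", "PI-1456"])              # a sentence without PI-1234
given_in_third_sentence = store.is_given("PI-1234")
```

A longer lifetime could be approximated by keeping a counter per PI
instead of replacing the whole set.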
A rather similar kind of readjustment has to do with the
establishment of a relation between a UC and a PI which we call
IDENTIFIES. The presence of this relation eventually leads to
the lexicalization of the PI with the definite article. Suppose
the speaker says "I bought a bicycle yesterday." During the
verbalization of this sentence VAT will have created the statement:
66) PI-1987 C> UC-"BICYCLE"
That is, PI-1987 has been categorized as an instance of UC-"BICYCLE".
This statement then triggers a readjustment process which creates
the discourse information:
67) SP-IDENTIFIES (UC-"BICYCLE", PI-1987)
[...] to know what particular instance it is (in this case PI-1987).
When, during a later sentence, VAT comes to the question:
68) V: DOES UC-"BICYCLE" IDENTIFY PI-1987?
as in 50) above, it is in a position to provide its own answer
without recourse to the user. Thus it will, on its initiative,
lexicalize PI-1987 with the definite article: NN-"BICYCLE" / "THE".
It is in ways such as this that we are attempting to
increase VAT's ability to answer its own questions.
As in the case of givenness, the question arises when
a statement concerning identifiability like 67) should be deleted
from the store of discourse information. All that is clear now is
that such statements generally last longer than SP-GIVEN statements,
and for the moment we do not delete SP-IDENTIFIES statements before
the end of the discourse. It is undoubtedly the case, however,
that some of them should be deleted sometimes, and it will be
necessary also to deal eventually with discourses in which there
are multiple instances of the same category: "the first bicycle,
the second bicycle," etc.
The presence of lexical information of the type that was
described at the end of section VI has an interesting and desirable
effect on readjustments, specifically with respect to statements
like 67). As an example, we might have a lexical entry for
UC-"BICYCLE" which includes:
That is, something categorized as an instance of UC-"BICYCLE" has
as a necessary part something categorizable as an instance of
UC-"FRAME". It also has as an optional part something categorizable
as an instance of UC-"BASKET". Now, it may be noted that the second
line under E>, which deals with the categorization of PI-B, is a
statement like that in 66) above. After a sentence like "I bought
a bicycle yesterday" has been produced, this line will therefore
trigger a readjustment process which creates the statement:
70) SP-IDENTIFIES (UC-"FRAME", PI-1468)
(with whatever number it is appropriate to assign to this PI). As
a consequence, if PI-1468 occurs in a subsequent sentence it will
be lexicalized with the definite article, as in "The frame is extra
large." Thus, as is desirable, definiteness is created not only
for instances of the category first mentioned, but also through
entailments of that category. It should also be noted that in this
context it is a little odd to say "The basket is extra large",
talking about PI-C. One would be more likely to say "It has a
basket which is extra large", or in some other way to introduce the
basket explicitly. In other words the process just described works
better for necessary parts than for optional parts of the
first-mentioned object (PI-A). We therefore exclude from this
readjustment process PIs that have been introduced through optional
entailments.
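The readjustment just described, including the exclusion of optional
parts, might be sketched as follows in Python. The table PARTS, the
function identifies_after, and the way fresh PI numbers are drawn from
a counter are all hypothetical; the part list for BICYCLE simply
restates the frame and basket example from the text.

```python
import itertools

# Hypothetical stand-in for lexical entries: category -> (part, strength) pairs.
PARTS = {
    "BICYCLE": [("FRAME", "necessary"), ("BASKET", "optional")],
}


def identifies_after(category, pi, fresh_ids):
    """Return the SP-IDENTIFIES pairs created once `pi` has been categorized."""
    statements = [(category, pi)]
    for part, strength in PARTS.get(category, []):
        if strength == "optional":
            continue   # optional parts must be introduced explicitly later
        statements.append((part, f"PI-{next(fresh_ids)}"))
    return statements


# The starting number 1988 is arbitrary; any unused PI number would do.
created = identifies_after("BICYCLE", "PI-1987", itertools.count(1988))
```

Because the basket is skipped, a later mention of it would not be
lexicalized with the definite article until it had been introduced in
its own right.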
VIII. Translation
The general nature of the translation procedure was outlined
in section I, and diagramed in Figure 1. To summarize again, VAT
will start with a text in the source language, will reconstruct the
verbalization processes which produced that text, and will then
itself produce a parallel verbalization in the target language.
During this last procedure it will apply syntactic processes
appropriate to the target language whenever it can, but at each of
those many points where it must make a choice of some kind it will
look across to the source language verbalization to see what choice
was made there. If possible it will equate that choice directly
with a corresponding choice in the target language. If no direct
correspondence is available, it will compare the lexicons of the
two languages to determine what correspondences are possible, and
will then search the context to decide which of them should be
chosen. We will be particularly concerned in this section with
illustrating a case in which such a complex choice must be made--
in which the zigzag arrows in Figure 1 have considerable content.
First, however, it may be useful to provide a framework by
illustrating a relatively simple case where the correspondences are more
direct. We will use as our first example the following brief
text from Japanese:
71) Reizooko o utta. Okane ga hituyoo datta kara.
    refrigerator sold money needed was because
We will want to consider some of the procedures VAT will follow in
translating this sentence into English:
72) I sold the refrigerator. I needed the money.
Actually our attention in this example will focus on the first
sentence, since we will later want to consider the complications that
are added by changing the verb in the first sentence from utta
'sold' to kasita 'rented' or 'lent'.
Let us first review the manner in which VAT will reconstruct
the original verbalization of the Japanese text. Since our eventual
parsing component will follow a kind of "analysis by synthesis"
procedure, we will also be suggesting here the steps of the parsing
program. The only difference, and of course it is a big one, is
that for the moment VAT will ask that decisions be made by the user
instead of itself deriving them from the text together with its
own knowledge of the world. The conversation with the user will
proceed as follows:
1. V: WHAT VAT TASK DO YOU WANT PERFORMED?
2. U: VERBALIZE CC-2001
5. V: CAN CC-2001 BE CATEGORIZED?
As explained for example 9) in section II, and with the proper
insertion of periods, VAT's representation now is:
VAT's representation, as explained for example 32) in section IV,
now includes:
VAT finds UC-"UR-" in the Japanese lexicon. The first three lines
of this entry are:
CC-A C> UC-"UR-"
[...]
As in example 34) in section IV, VAT creates the representation:
Since the beneficiary and measure PIs are optional, VAT next asks:
17. V: IS THE MEASURE EXPLICIT?
The next two questions are:
19. V: WHAT IS THE AGENT?
21. V: WHAT IS THE PATIENT?
VAT now has the following representation (cf. 36) above):
VB-"UR-" / "PAST"
PI-2001↑AGT
PI-2003↑PAT
CJ-"."
CC-2002
CJ-"KARA"
CJ-"."
VAT next asks:
whereupon for Japanese it creates the structure:
CJ-"."
CC-2002
CJ-"KARA"
CJ-"."
VAT is now at a point where it can lexicalize PI-2001 and PI-2003.
Beginning with PI-2001, it might ask first:
25. V: IS PI-2001 GIVEN?
In fact, however, we assume that the speaker (and addressee) are
automatically given, so that VAT contains a general entailment to the
effect that:
Since by convention PI-2001 is the speaker, the following is already
stored as discourse information:
Thus VAT was able to give an affirmative answer to question 25
above without asking the user. Pronominalization in Japanese is a
complex matter, depending in part on social relationships, and
we have not as yet constructed a procedure to introduce the correct
pronoun for a PI that is given. We have, however, taken advantage
of the simple fact that given PIs are very often deleted, with no
surface representation at all. In the present example, and in many
others, the simple deletion of such a PI produces the correct result,
so that an affirmative answer to question 25 leads to the
representation:
VB-"UR-" / "PAST"
PI-2003 / "O"
CJ-"."
CC-2002
CJ-"KARA"
CJ-"."
VAT now turns its attention to PI-2003:
27. V: IS PI-2003 GIVEN?
29. V: DOES PI-2003 HAVE A NAME?
31. V: HOW IS PI-2003 CATEGORIZED?
(We omit here considerations of cardinality.) The representation
now is:
VB-"UR-" / "PAST"
NN-"REIZOOKO" / "O"
CJ-"."
CC-2002
CJ-"KARA"
CJ-"."
The first three lines of the above are actually as far as we go at
the present time in the surface representation of a sentence. We
try to include in such a representation everything that is needed
to arrive at a correct linear sequence of words. In this case the
combination VB-"UR-" / "PAST" will yield the surface word utta,
which will be placed in sentence-final position (followed by the
period). That leaves reizooko o as the first words in the sentence.
VAT would next ask about CC-2002, but we will not carry the
verbalization process further here. We are interested in how just
this much of the text will be translated into English. By and
large VAT will ask the same questions it asked in the course of the
Japanese verbalization. It will look for the answers in the answers
that were given there, and when possible will apply corresponding
answers in English. Along the way, whenever appropriate, it will
apply syntactic processes that are called for by the structure of
English. The translation, then, begins with the same question that
began the verbalization in Japanese:
V: WHAT VAT TASK DO YOU WANT PERFORMED?
The answer given in line 2 above was VERBALIZE CC-2001. The English
translation must use its own four digit numbers; in what follows we
will simply substitute the English digit "1" for the Japanese digit
"2".
Of course here as elsewhere this question is not actually asked of
the user, but is answered internally by VAT. The next questions
exactly parallel lines 3-8 above:
V: WHAT IS THE GENRE?
V: CAN CC-1001 BE CATEGORIZED?
We assume that English would not in this case use the word because,
but simply juxtapose the two sentences, as in example 8) in section
II. Thus the representation now is:
Lines 9-13 of the Japanese verbalization have a direct correspondence:
V: CAN CC-1003 BE CATEGORIZED?
U: NO
V: HOW IS CC-1003 CATEGORIZED?
At this point the Japanese answer was UR-. That is, the categorization
was in terms of the Japanese category UC-"UR-". It is necessary to
find an English category that corresponds. The procedure at this
point is to look first in a stored list of bilingual category
equivalences which we call interlingua. The entries in interlingua are of
the following sort:
UR- SELL 
That is, the list contains pairs of categories, where the members of
each pair are assumed to categorize what is, for all practical
purposes, identical content. The assumption is that if a CC can be
categorized as an instance of UC-"UR-" in Japanese it can also be
categorized as an instance of UC-"SELL" in English, and vice versa.
Similarly, Japanese UC-"HON" and English UC-"BOOK" are equivalent
categories. As a general strategy we expect that pairs will
gradually be removed from interlingua as differences between the paired
categories are discovered. Linguistic research has not
yet progressed to the point that we can say with complete
certainty that any two categories from two different languages embrace
exactly the same content. At the outset, however, it is useful at
least to pretend that UC-"UR-" and UC-"SELL" are equivalent, and
probably there are at least some pairs in interlingua that will
remain viable for some time.
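A minimal Python sketch of such a list follows. The names INTERLINGUA,
JA_TO_EN, EN_TO_JA and lookup are hypothetical; the pair UR- / SELL is
the one given above, and the pair REIZOOKO / REFRIGERATOR is the one
assumed in the translation example of this section. A missing pair is
reported as None, which is the signal that the two lexicons have to be
compared instead.

```python
# Pairs of (Japanese category, English category) assumed to be equivalent.
INTERLINGUA = [
    ("UR-", "SELL"),
    ("REIZOOKO", "REFRIGERATOR"),
]

# The list is symmetric ("and vice versa"), so build a table for each direction.
JA_TO_EN = dict(INTERLINGUA)
EN_TO_JA = {english: japanese for japanese, english in INTERLINGUA}


def lookup(category, table):
    """Return the paired category, or None when interlingua has no answer."""
    return table.get(category)


english_for_ur = lookup("UR-", JA_TO_EN)
japanese_for_sell = lookup("SELL", EN_TO_JA)
english_for_kas = lookup("KAS-", JA_TO_EN)   # no pair is stored for this category
```

Removing a pair from INTERLINGUA, as the text anticipates, would make
lookup return None for it and push the decision to the lexicon
comparison.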
The present example was chosen because the answer to the last
question above can be found in interlingua. Later we will consider
a case where it cannot. At this point VAT answers its own question
with:
then looks at the lexical entry for UC-"SELL" (which we assume does
not differ from that for UC-"UR-"), and creates the representation:
VB-"SELL" / "PAST"
PI-B↑AGT
?PI-C↑BEN
?PI-D↑MSR
PI-E↑PAT
CJ-"."
CC-1002
CJ-"."
The questions and answers which parallel lines 15-22 of the
Japanese verbalization are straightforward:
V: IS THE [...] EXPLICIT?
V: WHAT IS THE AGENT?
V: WHAT IS THE PATIENT?
The representation now is:
The next exchange is:
which creates the representation:
With the lexicalization of PI-1001 the procedure is different in
English, since this item cannot simply be deleted as in the Japanese.
We follow the questions illustrated in examples 40) through 43) in
section V:
V: IS PI-1001 GIVEN?
Thus the representation now is:
Now comes the lexicalization of the direct object, PI-1003. The
initial questions parallel lines 27-31 of the Japanese verbalization:
V: IS PI-1003 GIVEN?
V: DOES PI-1003 HAVE A NAME?
The Japanese answer was REIZOOKO; VAT will now look in interlingua
to see whether that item is there, and we assume that it will be
found paired with English REFRIGERATOR. Although Japanese was able
to terminate the verbalization of PI-2003 at this point, English
must ask the question introduced in example 50) of section V:
The answer depends on the context, but let us assume that it is yes.
The representation now is:
'de now have the kind of re~rcsantation of the first sentence that 
is our current goal.. ~~ormal Enel-ish word order will put the subject 
first, the verb second, ic~d the direct object last to yield the 
final representation '"1 sola the refrigerator" of 72). 
The above example was chosen to illustrate a minimally simple
case of translation: one in which, in particular, the answers to
all questions about cross-language categorization could be found in
interlingua. The interesting cases, however, are those in which
interlingua does not provide all the answers. It is in these cases
that the zigzag arrows of Figure 1 must be further elaborated. The
general method of elaboration is suggested in Figure 4. Assume
that we are producing a verbalization in the target language and,
coming down from the upper righthand corner, we arrive at a point
where a CC or PI needs to be categorized. Following arrow 1, we
look across to the source language verbalization to find that the
corresponding CC or PI was categorized in a certain way, let us say
as an instance of category A. We look next at interlingua (arrow 2).
If A were there, we would take the target language category paired
with it (such as SELL and REFRIGERATOR in the example above),
introduce it into the target language verbalization, and proceed.
Now, however, we are considering those cases in which A is not
found in interlingua. The next step, following arrow 3, is to look
at the entailments of A in the source language lexicon. We next
follow arrow 4 to search the target language lexicon for entries
whose entailments are compatible with those of A. (This search
procedure is likely to present challenging problems when the source
language lexicon reaches any interesting size. It is, however,
facilitated by the presence of abstract features like TRANSFER
and TRANSACTION which can be used to limit the domain of search.)
Suppose that we find two entries in the target language lexicon,
B and C, both of whose entailments are compatible with the
entailments of A. We then look to see how the entailments of B and C
differ and find, let us say, that B contains entailment(s) X while
C contains entailment(s) Y.
We then follow arrow 5 back to the source language verbalization,
hoping to find something in it that will allow us to choose between
X and Y. (Again there are challenging problems in searching the
source language text for the answer.) Let us now assume that we find
something in the source language text that is compatible with X but
not with Y. We are then able to choose B as the correct target
language category. We introduce that category into the target
language verbalization via arrow 6 and proceed. In those cases where
the choice between X and Y (and hence between B and C) cannot be
made--where the source language text does not provide the
answer--VAT must resort to asking the user for the correct
categorization.

Figure 4. Columns, left to right: source language verbalization,
source language lexicon, interlingua, target language lexicon,
target language verbalization. Boxes, in the order in which the
arrows connect them: "choice of category needed" (target language
verbalization); "choice of category A" (source language
verbalization); "A not found" (interlingua); "look at entailments
of A" (source language lexicon); "find categories B and C whose
entailments are both compatible with entailments of A; B and C
differ with respect to entailments X and Y" (target language
lexicon); "is X or Y compatible with source language text?" (source
language verbalization); "choice of B or C accordingly" (target
language verbalization).
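
The procedure traced by arrows 1 through 6 can be sketched in Python.
This is only a minimal illustration under stated assumptions:
entailments are modeled as sets of strings, compatibility as set
inclusion, and the names choose_category, interlingua, source_lexicon,
target_lexicon, source_text_facts, and ask_user are hypothetical, not
part of VAT.

```python
from typing import Callable, Dict, Optional, Set

# Assumed encoding: category name -> set of entailment labels.
Lexicon = Dict[str, Set[str]]


def choose_category(
    a: str,
    interlingua: Dict[str, str],
    source_lexicon: Lexicon,
    target_lexicon: Lexicon,
    source_text_facts: Set[str],
    ask_user: Callable[[list], str],
) -> Optional[str]:
    """Pick a target language category for source category `a`."""
    # Arrow 2: take the paired category if interlingua has one.
    if a in interlingua:
        return interlingua[a]
    # Arrow 3: look up the entailments of A in the source lexicon.
    entailments_a = source_lexicon[a]
    # Arrow 4: candidates contain everything A entails, and possibly more.
    candidates = sorted(
        name for name, ents in target_lexicon.items() if entailments_a <= ents
    )
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Arrow 5: keep candidates whose additional entailments are all
    # supported by what the source language verbalization establishes.
    supported = [
        name
        for name in candidates
        if (target_lexicon[name] - entailments_a) <= source_text_facts
    ]
    # Arrow 6: a unique survivor is the choice; otherwise ask the user.
    if len(supported) == 1:
        return supported[0]
    return ask_user(candidates)
```

With a toy lexicon in which A entails a transfer of use, B adds "not a
transaction" and C adds "a transaction", a source text that
establishes a transaction selects C, and a source text that
establishes neither sends the question to the user.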
We will illustrate this procedure with the brief Japanese
text:
73) Reizooko o kasita. Okane ga hituyoo datta kara.
    refrigerator rented money needed was because
We will want VAT to translate these two sentences into English:
74) I rented the refrigerator. I needed the money.
We are not concerned in this example with the fact that the first
English sentence is ambiguous between rented (to someone) and
rented (from someone), but with the fact that the first Japanese
sentence is ambiguous between rented and lent. In both cases, it
seems, the second sentence serves to disambiguate. What we are
interested in now is the fact that VAT must somehow choose between
RENT and LEND as the proper correspondent for Japanese KAS-.
We can assume that most of the verbalization in both languages
proceeds along the lines already exemplified, since 71) and 73)
are minimally different. Imagine, then, that we have arrived at
the point in the English verbalization where the question is:
V: HOW IS CC-1003 CATEGORIZED?
We are now in the upper right of Figure 4, and we follow arrow 1
to find that the corresponding CC in the Japanese verbalization was
categorized in terms of UC-"KAS-". We then follow arrow 2, and
find that KAS- is not in interlingua. We look next via arrow 3 at
the entailments of UC-"KAS-" and find that they are as specified in
example 55), section VI above, but without the last line of that
example:
75) CC-A C> UC-"KAS-"
    E>
    CC-A F> VB-"KAS-" (PI-B+AGT, ?PI-C+BEN, PI-D+PAT)
    CC-A C> UC-TRANSFER
    PI-D = PI-B
    PI-F = PI-C
    PI-E = PI-D
    CC-B = CC-E
    CC-C = CC-F
Substituting four digit numbers for the variables, we obtain:
76) CC-2003 C> UC-"KAS-"
    E>
    CC-2003 F> VB-"KAS-" (PI-2001+AGT, ?PI-2902+BEN, PI-2003+PAT)
    CC-2003 C> UC-TRANSFER
    PI-D = PI-2001
    PI-F = PI-2902
(PI-2902, CC-2905, and CC-2906 have been inserted here as arbitrary
numbers. It is quite possible, however, that these are items which
show up explicitly elsewhere in the Japanese verbalization. For
example, PI-2902, the one who receives the refrigerator, might well
be mentioned elsewhere in the text.)
Since CC-2003 involves a transfer, VAT must also assign numbers
within the definition of UC-TRANSFER, given in section VI above as
example 52):
Thus there is a change from the renter or lender (PI-2001) having
the object (PI-2003) to the rentee or borrower (PI-2902) having it.
The last three lines of 76) made it clear that this was not a change
in ownership but only a change in use, and that PI-2001 retains
ownership throughout.
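
The substitution step that turns 75) into 76) can be illustrated with
a short Python sketch. The string encoding of entries and the helper
name instantiate are assumptions made for illustration only; case
role markers are omitted, and the bindings are those given in the
text.

```python
import re
from typing import Dict, List


def instantiate(entry: List[str], bindings: Dict[str, str]) -> List[str]:
    """Replace lettered variables such as CC-A or PI-B with numbered items."""
    pattern = re.compile(r"\b(?:CC|PI)-[A-Z]\b")
    return [
        # Unbound variables are left as they are.
        pattern.sub(lambda m: bindings.get(m.group(0), m.group(0)), line)
        for line in entry
    ]


# Hypothetical string rendering of the first lines of example 75).
entry_75 = [
    'CC-A C> UC-"KAS-"',
    'CC-A F> VB-"KAS-" (PI-B, PI-C, PI-D)',
    "CC-A C> UC-TRANSFER",
]
bindings = {
    "CC-A": "CC-2003",
    "PI-B": "PI-2001",
    "PI-C": "PI-2902",
    "PI-D": "PI-2003",
}
entry_76 = instantiate(entry_75, bindings)
```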
Following arrow 4, we carry these entailments across to the
English lexicon and search for entries whose entailments are
compatible with 76). Compatibility means that these entries will
contain what is in 76), but may also contain more. Let us
say that we find two such entries, one for the category UC-"LEND",
which was given in 55) above, and one for UC-"RENT-2", which was
given in 57).
The next step is to isolate the differences between UC-"LEND"
and UC-"RENT-2". UC-"LEND", as mentioned, differs from 75) in
containing an additional final line:
78) CC-A -C> UC-TRANSACTION
That is, CC-A cannot be categorized as a transaction. UC-"RENT-2",
on the other hand, contains the statement:
79) CC-A C> UC-TRANSACTION
At one level of abstraction the question which must be answered,
therefore, is whether CC-1003 is or is not a transaction. Informally,
this is a matter of whether PI-2001, the renter or lender, did or
did not receive money in exchange for the transfer of use of the
object.
The following digits can be inserted for the variables in the
lexical entry for UC-"RENT-2":
80) CC-1003 C> UC-"RENT-2"
    E>
    CC-1003 F> VB-"RENT" (PI-1001+AGT, ?PI-1901+BEN,
                          ?PI-1902+MSR, PI-1003+PAT)
    CC-1003 C> UC-TRANSACTION
    PI-F = PI-1001
    PI-D = PI-1901
    PI-E = PI-1902
    PI-G = PI-1003
    CC-B = CC-1901
    CC-C = CC-1902
    CC-1901 C> UC-TRANSFER
    CC-B = CC-1903
    CC-C = CC-1904
    CC-1902 C> UC-TRANSFER
    CC-B = CC-1905
    CC-C = CC-1906
    PI-1902 C> UC-MEDIUM-OF-EXCHANGE
    CC-1903 C> UC-HAVE-OWN
    CC-1904 C> UC-HAVE-OWN
    CC-1905 C> UC-HAVE-USE
    CC-1906 C> UC-HAVE-USE
    VB-HAVE-OWN (PI-1001, PI-1003)
What all this says is that the categorization of CC-1003 as an
instance of UC-"RENT-2" involves a number of things. First, there
must be a person who does the renting out (PI-1001), a person who
receives the rented object (PI-1901), the money that is paid in rent
(PI-1902), and the rented object itself (PI-1003). Furthermore,
CC-1003 is said to be a transaction, and certain equivalences are
stated between the RENT-2 definition and the TRANSACTION definition.
VAT must therefore assign these particular PI and CC numbers within
the definition of UC-TRANSACTION which was given as example 56) in
section VI above:
81) CC-1003 C> UC-TRANSACTION
    E>
    CC-1003 S> CJ-PURPOSE (CC-1901, CC-1902)
    CC-1901 C> UC-TRANSFER
    PI-D = PI-1901
    PI-E = PI-1902
    PI-F = PI-1001
    CC-1902 C> UC-TRANSFER
    PI-F = PI-1901
    PI-E = PI-1003
    PI-D = PI-1001
This says that CC-1003 can be paraphrased as two transfers, CC-1901
and CC-1902, the first of which was for the purpose of the second.
(CC-1901 is the transfer of money, and CC-1902 the transfer of the
rented object.) VAT must, therefore, look also at the definition of
UC-TRANSFER, given in section VI above as example 52), and introduce
again the proper PI and CC numbers for each of these particular
transfers. The first of them will be represented as:
82) CC-1901 C> UC-TRANSFER
    E>
    CC-1901 S> CJ-CHANGE (CC-1903, CC-1904)
    CC-1903 F> VB-HAVE (PI-1901, PI-1902)
    CC-1904 F> VB-HAVE (PI-1001, PI-1902)
That is, the first transfer involves a change from CC-1903 to CC-1904.
In CC-1903 the rentee (PI-1901) has the money (PI-1902), and in
CC-1904 the renter (PI-1001) has it. The second transfer is
represented as:
Here there is a change from CC-1905 to CC-1906. In CC-1905 the
renter (PI-1001) has the object to be rented (PI-1003), and in
CC-1906 the rentee (PI-1901) has it.
In 80) it is also stated that PI-1902 can be categorized as an
instance of MEDIUM-OF-EXCHANGE, in all probability therefore an
instance of UC-"MONEY" (see example 59) in section VI above).
Furthermore it is stated that the change in the having of the money
(from CC-1903 to CC-1904) involves a change in ownership, whereas the
change in the having of the rented object (from CC-1905 to CC-1906)
involves a change in the use. Finally, it is stated that the renter
(PI-1001) retains ownership of the rented object throughout.
What VAT wants to find out, then, is whether these things that
must be true if CC-1003 is to be an instance of UC-"RENT-2" are
indeed true, or whether the bottom line in the entailments of
UC-"LEND", example 78), is fulfilled instead. VAT tries to decide this
by following arrow 5 to the verbalization of the Japanese text. Of
course there are many ways in which the answer might appear in that
verbalization, if it appears at all. If VAT is unsuccessful in its
search it will have to ask the user directly:
84) V: IS CC-1003 CATEGORIZED AS LEND OR RENT?
In 73), however, we have made things easy by supplying a context
which ought to decide the question. It will be remembered that the
second sentence in 73) expresses CC-2002, which is the reason for
CC-2003, or what is expressed in the first sentence. Now, CC-2002
is categorized in the Japanese as an instance of UC-"HITUYOO DA",
which means something like "be needed". Let us assume that the
Japanese lexicon contains an entry for this category which includes
the following:
85) CC-A C> UC-"HITUYOO DA"
    E>
    CC-A F> VB-"HITUYOO DA" (PI-B+BEN, PI-C+PAT)
    CC-A F> VB-WANT (PI-B, CC-D)
    CC-D F> VB-HAVE (PI-B, PI-C)
    . . .
The case frame immediately under the E> identifies PI-B as the
beneficiary, the person who needs something, while the thing needed
is labeled PI-C. The second line under the E> says that an
alternative framing is possible in terms of an abstract verb WANT,
wherein PI-B wants CC-D, and CC-D is then characterized in terms of
PI-B having PI-C. In other words, when one needs something, one wants
to have it. (If this is not always true, at least it is the expected
entailment.)
If 85) is going to provide an answer to 84), there must also
be a general principle of some kind which relates what is entailed
by CC-2002 to what is entailed by CC-2003. This general principle
can be stated as follows:
86) CC-A F> VB-WANT (PI-B, CC-C)
    CC-D F> VB-ACT (PI-B+AGT)
    CJ-REASON (CC-A, CC-D)
    E>
    CC-D E> CC-C
The first line says that PI-B wants CC-C. The second line says
that PI-B does something. The third line says that his wanting
CC-C is the reason he does something. All of this together is then
said to entail that his doing something entails what he wants, or
CC-C. In other words, if one wants something and does something
because of that, then what one does must entail what one wants.
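
Principle 86) amounts to a small inference rule, sketched here in
Python. The dictionary and tuple encoding of statements and the
function name apply_reason_principle are hypothetical; the sketch
assumes that wants, acts, and reasons have already been recorded
during verbalization. The sample values are those that the text goes
on to use when the principle is filled out for the Japanese example.

```python
from typing import Dict, List, Set, Tuple


def apply_reason_principle(
    wants: Dict[str, Tuple[str, str]],  # CC -> (wanter, wanted CC)
    acts: Dict[str, str],               # CC -> agent of the act
    reasons: Set[Tuple[str, str]],      # (reason CC, act CC) pairs
) -> List[Tuple[str, str]]:
    """Return (act CC, entailed CC) pairs licensed by principle 86)."""
    derived = []
    for reason_cc, act_cc in sorted(reasons):
        if reason_cc in wants and act_cc in acts:
            wanter, wanted_cc = wants[reason_cc]
            # The principle applies only when the wanter is also the agent.
            if acts[act_cc] == wanter:
                derived.append((act_cc, wanted_cc))
    return derived


# PI-2001 wants CC-2904 (CC-2002) and, for that reason, does CC-2003.
derived = apply_reason_principle(
    wants={"CC-2002": ("PI-2001", "CC-2904")},
    acts={"CC-2003": "PI-2001"},
    reasons={("CC-2002", "CC-2003")},
)
```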
During the verbalization of CC-2002 as part of the verbalization
of the Japanese text, VAT will have recorded the fact that
CC-2002 was categorized as an instance of UC-"HITUYOO DA", and
will have entered the following statements in accordance with 85):
87) CC-2002 C> UC-"HITUYOO DA"
    E>
    CC-2002 F> VB-"HITUYOO DA" (PI-2001+BEN, PI-2902+PAT)
    CC-2002 F> VB-WANT (PI-2001, CC-2904)
    CC-2904 F> VB-HAVE (PI-2001, PI-2902)
At this point VAT also has all the particulars needed for principle
86), which can be filled out as follows:
88) CC-2002 F> VB-WANT (PI-2001, CC-2904)
    CC-2003 F> VB-"KAS-" (PI-2001+AGT)
    CJ-REASON (CC-2002, CC-2003)
    E>
    CC-2003 E> CC-2904
The first line of 88) was obtained from 87). The second line was
obtained from 76). The third line comes from line 8 of the Japanese
verbalization set forth at the beginning of this section. What we
are interested in now is the last line of 88), which says in effect
that CC-2003 is categorized in such a way that CC-2904 is true, and
looking back to 87) we see that CC-2904 involves PI-2001 having
PI-2902, or the agent of kasu having okane 'money'. Making the
necessary correspondences in English, this means that CC-1003 must be
categorized in such a way that CC-1904 is true, where:
This is exactly what VAT finds as the last line of 82). Since 82)
is entailed by UC-"RENT-2" but not by UC-"LEND", the question in 84)
has been answered, and the arrow labeled 6 in Figure 4 carries
back the choice of UC-"RENT-2" into the English verbalization,
which then proceeds as it did in the translation illustrated earlier.
By this complex process involving comparisons of entailments
within and across languages, as well as the general principle stated
in 86), VAT has been able to make the correct choice. So long as
the answer to 84) was derivable from something discoverable within
the Japanese verbalization, VAT could in principle succeed. It is
clear, however, that the route to the answer could be extremely
complex, involving chains of entailments of unforeseeable length.
There is no doubt that such procedures are necessary to answer such
questions, and that they present an extraordinary challenge to our
techniques for information storage and search.
IX. Miscellaneous Problems in Translation 
Since we have spent considerable time looking into various
specific translation problems beyond those illustrated above, we
present here a few additional examples of the sorts of things that
will have to be taken into account during the implementation of
machine translation along the lines suggested above. Two of these
examples will, like those in the last section, involve the choice
of a category in the target language when that choice is not directly
provided by interlingua. One has to do with the translation of
Japanese osieru into English; the other, the translation of English
give into Japanese. A third example will illustrate the kind of
problem that arises at the stage of subconceptualization and sentence
formation.
The following three sentences illustrate three possible English
translations of the Japanese verb osieru:
90) Gaido wa Kookyo ga doko ni aru ka osiete kuremasita.
    guide Imperial Palace where is showed
    Soko kara tookyoo tawaa e ikimasita.
    there from Tokyo tower to went
    The guide showed us where the Imperial Palace was.
    From there we went to the Tokyo Tower.
91) Gaido wa Kookyo ga doko ni aru ka osiete kuremasita
    guide Imperial Palace where is told
    ga watasitati ga soko e itta toki ni moo simatte
    but we there to went when already closed
    imasita.
    was
    The guide told us where the Imperial Palace was, but when we
    got there it was already closed.
92) Kimatu siken no tame ni sensei wa
    semester-final exam of for the purpose teacher
    Kookyo ga doko ni aru ka osiete kudasaimasita.
    Imperial Palace where is taught
    For the final exam the teacher taught us where the Imperial
    Palace was.
Each of these examples contains the phrase:
93) Kookyo ga doko ni aru ka osiete
which is translated in three different ways, determined by the context:
    in 90): show where the Imperial Palace is
    in 91): tell where the Imperial Palace is
    in 92): teach where the Imperial Palace is
The difference is localized in the translation of osiete, a
participial form of the verb osieru. This verb may be translated into
English as show, tell, or teach according to the context, and the
problem is to identify what the determining factors are.
The Japanese category UC-"OSIE-" as well as the English
categories UC-"SHOW", UC-"TELL", and UC-"TEACH" are all included within
the more abstract category UC-COMMUNICATION, which can be defined
as follows:
94) CC-A C> UC-COMMUNICATION
    E>
    CC-A F> VB-INTEND (PI-B, CC-C)
    CC-C S> CJ-CAUSE (CC-D, CC-E)
    CC-D F> VB-ACT (PI-B)
    CC-E S> CJ-CHANGE (CC-F, CC-G)
    CC-F F> -VB-KNOW (PI-H, CC-I)
    CC-G F> VB-KNOW (PI-H, CC-I)
This entails that someone (PI-B) intends something (CC-C) and that
what he intends is that CC-D cause CC-E. CC-D is some act that PI-B
performs, and CC-E, caused by that act, is a change from state CC-F
to state CC-G. CC-F is a state in which another person (PI-H) does
not know something (CC-I), and CC-G is a state in which that person
does know it.
Subcategories of UC-COMMUNICATION may differ as to the nature
of the act (CC-D) performed by the communicator, as to the kind of
knowing that results (e.g., whether it is retained in surface or
deep memory), and in other ways such as the authoritativeness of
the communicator with respect to what is communicated (CC-I).
UC-"OSIE-" is unrestrictive as to the act performed by the
communicator; apparently he can do almost anything that will have a
communicative function. UC-"TELL", on the other hand, entails a
verbal act, UC-"SHOW" an act which directs the other person's visual
attention to CC-I, UC-"TEACH" an act which is didactic in nature. It
is difficult to delimit the acts which qualify as teaching, but
evidently they must have an instructional quality which is not
necessary for UC-"OSIE-". UC-"TEACH" may also be unique in requiring
that the knowing be deep or long-term knowing, at least in the
intention of PI-B. [...], for the most part, require that PI-B be
authoritative with respect to the content of what is being
communicated (CC-I).
But how is it, for example, that the context in 90) restricts
the translation of "OSIE-" to "SHOW"? The second sentence in 90)
says that we went from there (soko), whose referent is the location
of the Imperial Palace. Thus, at the time of the communicative
event, we must have been at the Imperial Palace. Now, there is
evidently a general principle, like 86) in the last section, which
says that a verbal act is not used to communicate where something
is when the beneficiary of the act is already at that place. There
is evidently no such restriction on directing visual attention to
where it is, hence UC-"SHOW" is preferred to UC-"TELL". Since
there is nothing in the context of 90) to suggest that teaching
methods were involved, UC-"SHOW" is left as the only candidate.
In 91) the situation is otherwise. The second clause makes
it clear through the phrase translated "when we got there" that we
were not at the Imperial Palace at the time of the communicative
act. Another general principle says that visual attention can be
directed only at things within visual range. Thus UC-"SHOW" is in
this case ruled out, as is UC-"TEACH" again because of the absence
of didactic context. UC-"TELL" is thus the choice here.
In 92) the didactic context is evident. The Japanese words
kimatu, siken, and sensei all belong within the semantic field of
teaching, a fact to be noted in the lexical entry for each of them.
Hence the English category UC-"TEACH", obviously a member of the
same semantic field, will be the choice here. Probably we should
also take account of the fact that the idiomatic verb at the end
of this sentence, kudasaimasita 'gave', reinforces the superior
relationship of the communicator: in this case, the fact that he is
authoritative with respect to what is being communicated.
The point of this example of the translation of osieru is to
emphasize the complexity of the criteria which may have to be
invoked to decide between possible translations. Here we have seen
a link between different kinds of communicative acts and the location
of the recipient of the communication, information on the latter
being derivable from information about the movement of the recipient
to or from the place of communication, together with temporal
information. It is also of interest that this example, like the
second example in section VIII, led us to recognize certain general
principles: that one does not communicate verbally about where
something is when the addressee is already there, for example, and
the obvious principle that one does not call visual attention to
something that is not visible. Detailed implementation of this
kind of translation research will undoubtedly lead to the
recognition of a number of such principles.
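
The way the two general principles and the didactic context narrow
the choice in 90) through 92) can be sketched in Python. This is a
deliberately reduced illustration: the function name translate_osieru
and the two Boolean features are assumptions, and being at the place
is treated here as equivalent to the place being within visual range.

```python
def translate_osieru(addressee_at_place: bool, didactic_context: bool) -> str:
    """Choose among show, tell and teach for osieru, following 90) to 92)."""
    candidates = {"show", "tell", "teach"}
    if not didactic_context:
        # Nothing suggests teaching methods, so teach is ruled out.
        candidates.discard("teach")
    if addressee_at_place:
        # One does not communicate verbally about where something is
        # when the addressee is already there.
        candidates.discard("tell")
    else:
        # Visual attention can be directed only at things in visual range.
        candidates.discard("show")
    # A didactic context, where present, decides the matter.
    if "teach" in candidates:
        return "teach"
    # Without a didactic context exactly one candidate remains.
    return candidates.pop()
```

Under these assumptions the context of 90) corresponds to the call
translate_osieru(True, False), that of 91) to
translate_osieru(False, False), and that of 92) to
translate_osieru(False, True).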
The word kudasaimasita in 92) leads us to a different kind
of complication: that involved in the need to pay special attention
in Japanese verbalization to the social relationship existing
between the speaker and various other persons. Although we are
changing the direction of translation here, it is of some interest
to consider questions that arise in translating the English category
UC-"GIVE" into Japanese. We may assume that UC-"GIVE" has the
entailments listed in section VI above, and that furthermore the
categories underlying the Japanese verbs to be mentioned share these
same entailments. Each Japanese category, however, has additional
entailments of its own, and it is the nature of these additional
entailments that we are interested in. What follows is based on the
analysis in Kuno 1973:127-135.
The verb kureru is used to express instances of a category
whose entailments include those of UC-"GIVE" plus the following
(where PI-B is the agent and PI-C the beneficiary of the giving):
VB-CLOSE-TO-SPEAKER (PI-C)
VB-CLOSER-TO-SPEAKER-THAN (PI-C, PI-B)
-VB-HIGHER (PI-B, PI-C)
That is, UC-"KURE-" is the category chosen if the beneficiary of
the giving is socially close to the speaker, closer to the speaker
than the agent of the giving, and the agent is not socially higher
than the beneficiary. In translating texts where such information
is relevant, VAT will either have to store a network of social
relations linking all the relevant individuals, a network which
may in part be derivable from the text, or it will have to ask the
user questions like:
The verb kudasaru is used to express instances of a category whose
entailments are as follows:
In other words, the entailments of UC-"KUDASAR-" are the same as
those of UC-"KURE-" except that the agent of the giving is socially
higher than the beneficiary. (It was the exalted position of sensei,
the teacher, in 92) that led to the use of kudasaimasita in that
example.)
Another possibility is the verb yaru:
C-VB-GLOSE-T~-S~~~M~CX (PI-C ) 
The braces indicate a disjunction. Thus one of the ways in which
this category differs from the last two is in the beneficiary of the
giving not being socially close to the speaker, or else in his not
being closer to the speaker than the agent of the giving. As in
97) the agent is socially higher than the beneficiary. Furthermore,
as stated in the last line, the beneficiary is not being
treated respectfully by the speaker.
The verb ageru is like yaru, except that the agent of the
giving is not socially higher than the beneficiary:
{
-VB-CLOSE-TO-SPEAKER (PI-C)
-VB-CLOSER-TO-SPEAKER-THAN (PI-C, PI-B)
}
-VB-HIGHER (PI-B, PI-C)
-VB-RESPECTED (PI-C)
The last verb that we will consider here is sasiageru:
T~-CL~:~uL-'20-Al--21~LL-i (PI - 3) 
In other words, the agent of the giving is close to the
speaker, while the beneficiary is socially higher than the agent
and is being treated respectfully by the speaker. It is also
possible to use this category when the agent is not socially close
to the speaker, but evidently Japanese speakers are not completely
comfortable about the choice in that case; nevertheless, there is
no other category available.
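
The conditions stated in prose for the five verbs can be collected
into one decision function. The Python sketch below is an
illustration only: the class GivingContext, its field names, and the
order in which the conditions are tested are assumptions, and a
return value of None stands for the case in which VAT would have to
ask the user.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GivingContext:
    beneficiary_close: bool      # beneficiary is socially close to the speaker
    beneficiary_closer: bool     # beneficiary is closer to the speaker than the agent
    agent_higher: bool           # agent is socially higher than the beneficiary
    beneficiary_higher: bool     # beneficiary is socially higher than the agent
    beneficiary_respected: bool  # speaker treats the beneficiary respectfully


def choose_giving_verb(ctx: GivingContext) -> Optional[str]:
    """Pick a Japanese verb of giving from the social relations in `ctx`."""
    if ctx.beneficiary_close and ctx.beneficiary_closer:
        # kureru and kudasaru differ only in the agent's social height.
        return "kudasaru" if ctx.agent_higher else "kureru"
    if ctx.beneficiary_higher and ctx.beneficiary_respected:
        return "sasiageru"
    if not ctx.beneficiary_respected:
        # yaru and ageru differ only in the agent's social height.
        return "yaru" if ctx.agent_higher else "ageru"
    return None  # no category fits; the user must be asked
```

For the teacher in 92), who is socially higher than the students to
whom the speaker belongs, the first branch applies and the function
returns "kudasaru".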
One way in which VAT might be able to answer questions regarding
social relationships is through the occurrence in the text of
categorizations that entail such relationships. For example, the
occurrence of an instance of UC-"SENSEI" in example 92) entails a
socially higher status for the PI thus categorized than for the PIs
who are this teacher's students. It thus leads to the choice of
UC-"KUDASAR-". Kinship terms also provide examples of automatically
entailed social status. If we take a PI that is an instance of
UC-"OTOOSAN" 'father', for example, there are entailments of the
following sort:
That is, PI-A must be the father of someone (PI-B) and will be
socially higher than PI-B. If the PI who is the father of PI-B is at
the same time the speaker, PI-B will be socially close to the
speaker. The entailments derived from both 101) and 102) are relevant
to the choice of a translation for English UC-"GIVE", as sketched
above.
So far all our examples of translation problems have involved
categorization. Certainly, however, there are also problems which
arise in subconceptualization, and in the associated application
of syntactic processes which lead to clause and sentence formation.
We have not paid as much attention to questions of this sort, since
for the most part we have been able to translate sentence for
sentence with reasonable success. One example which seems fairly
clear arose early in our investigation, and will be repeated here
as an illustration of the challenges which are likely to arise in
this respect.
At issue is the translation of the English sentence in 103),
the first sentence of a fable, into the sequence of two Japanese
sentences in 104):
103) There was once a wolf who saw a lamb drinking at a river
     and wanted to create an excuse to eat it.
104) Mukasi aru tokoro ni kawa de mizu o nonde
     once certain place in river at water drinking
     iru ko-hituzi o mituketa ippiki no ookami ga imasita.
     be lamb saw one wolf was
     Sosite sono ookami wa sono ko-hituzi o taberu tame no
     and that wolf that lamb eat for
     iiwake o tukuri-ta-gatte imasita.
     excuse make-want-seeming was
The question we are concerned with is why it is desirable for the
Japanese translation to create two sentences where the English had
one. We may note first of all that the English sentence contains two
conjoined relative clauses ("who saw...and wanted..."). Japanese
relative clauses differ from those in English in being preposed to
the noun they modify. Hence, if the Japanese were to preserve
the structure of the English in a single sentence, the speaker
would have to say everything that the wolf saw and wanted before he
ever was able to mention the wolf. The subject of the seeing and
the wanting would be held in suspense for so long that the addressee
or reader might have some problem in interpreting what was being
said. Another reason for not repeating the English structure of
two relative clauses has to do with the beginning of the next
sentence: in English, "For that purpose...he accused the lamb of
stirring up the water..." The referent of that purpose in English
is clear. It refers to the immediately preceding relative clause:
"wanted to create an excuse to eat it."
His wanting to create this excuse was
his purpose for accusing the lamb. But
if the clause in question were preposed to ookami (which would then
be followed by the main verb of the sentence, imasita), the referent
of that purpose would no longer be clear. By making the clause
about the wolf's wanting to create the excuse into an independent
sentence, the Japanese is able to refer to it at the beginning
of the next sentence without difficulty.
We have not formalized the processes by which VAT would decide
to create two sentences in the target where the source
verbalization has one, but evidently principles such as the following
must eventually be included. First, there must be a restriction of
some kind on the amount of material that can be included in a
preposed relative clause, and perhaps especially in a relative clause
that introduces the main character of a story (whose introduction
cannot be put off for too long). Second, there is a need for a
sentence-introductory phrase like for that purpose to have a clear
referent which immediately precedes it. The task of introducing such
principles into VAT's operations is formidable, but perhaps not
impossible of accomplishment.

References

R. W. Brown and E. H. Lenneberg. 1954. A study in language and cognition. Journal of Abnormal and Social Psychology 49:454-462.

W. L. Chafe. 1974. Language and Consciousness. Language 50:111-133.

S. Kuno. 1973. The structure of the Japanese language. Cambridge, Mass. MIT Press.

C. Li. 1974. Subject and topic: a new typology for language. Paper for the Linguistic Society of America - Annual Meeting, New York.

A. Paivio. Images, propositions, and knowledge. The Univ. of Western Ontario, Department of Psychology, Research Bulletin 309.

Z. W. Pylyshyn. 1973. What the mind's eye tells the mind's brain: a critique of mental imagery. Psychological Bulletin 80:1-24.

D. E. Rumelhart. 1974. Notes on a schema for stories. Paper for the Carbonell Memorial Conference, Pajaro Dunes, California.

R. C. Schank. 1974. Understanding paragraphs. Istituto per gli Studi Semantici e Cognitivi, Castagnola, Switzerland.
