Recorlstructirlg Spati~l hm~ge from Natural Language Texts 
Atsushi Yamada Tadashi Yall).alllolo Hisashi Ikrda 
Toyoaki Nishida Shuji Doshita 
Dt'l);,u'tnwllt of Infl)rmation S('i('n('(', Faculty of F,11gin('('rillp, Kyoto University 
Sakyo ku, I(yoto 6(16-0 l, .Japan 
phon(': +81-75-753-4952 
entail: yalnada tt kuis,kyot oqLa('.j p 
Abstract 
This 1)al)(,r drs('ribrs tit(' un(h,rstanding l)ro('('ss of the 
spatial descriptions in ,lal)anese. In order to under- 
statt(I tlw described worhl, tit(' a, it|hors 113" to r('('Oll- 
stru('t tit(' gc(nm,tric model of tit(' gh)bal s('en(' frmn 
tlw scenic descriptions drawing a spaco. It is done by 
an experimental ('Omlmter itrogranl SPR INT. whiclt 
lakes natural language texts att,l l)roduces it nmdel 
of the described win'hi. To reconstruct the modvl, 
the altthors extract the qualitative ~patial constraints 
front the text, and represent Ihellt ;IS tlw nunierical 
constraints on the spatial attributes of the entities, 
This makes it lmssibh' to eXln'ess the vaguent,ss of tit(, 
spatial concepts attd to derive the ntaxinlally plausi- 
ble interl)retation froltl it ('hllnk of illforltl&tloil ;t('('ll- 
nmlated ;m tit(, constraints. The int('rln'ehdi(m r('- 
fleets the tentlmrary 1)clief about the world. 
1 Introduction 
This p~ttter (!OltCelltl'3I,(!s (ill l;h(! tllld(wst3.1tdillg 
proc(!ss of l;he verbal (!xtn'essiolls ('OlWerllillg 
it\[)olt|; space. ()lit, can easily inla.gilw ~11(} (h> 
scribed worM fl'om the vcrbM expressions. We 
l'(!gltl'(\[ the iltgerpl'('.~0.Lit)ll of ({(!s(rriptiolts ;US ;~11 
;tcLiv(! process, t;IIM, is t}m proc(~ss of I'(R:OlIS|il'lt{'- 
t,ion ()flt situation which tim st)eak(w intended. In 
this process, tree will use many kinds of infitrma- 
Lion. The natm'a} bulgu~ge descriptitms contain 
st)Ill(} of th(~lll, itltd it, is vm'y illl\])ort, allL to extl'itct 
and use t}mnl, but they are not enough. Among 
t}tem, inforntation about the configm'at, ion of the 
wor}d ill (rim's illlltge plays ;tit inl\[)Ol'tltltt, rote. 
W(! h3,v(. ~ lllltd(! itIl (iXpCl'itil(?ll\[,itl (!Olllplltcl" 
in'ogrmn SPHINT (fl,r "'S}'atial Repres(,'ntI~tion 
INTtwttrt~ter'), whi('h takes spatial (tescrii)tions 
wrigtelt ill ,\]itl)itll(!,q(L I'{',COllS|.I'/ICt,8 3-dillt(~Itsit)lllt\]. 
nto(l(~l of the wm'}d, and t)llI;plltS thc ct)rl'(}Spolld- 
ing inlitge on tht~ graphic disp}ay. 
In this palter. We (h~scrib(~ t, lw overvi(,*w of this 
syst, enl. 
2 Tile Approach 
The esscn('(~ of our approach is its fo}}ows: 
a ~\[(qillillg of thP llal(l(';ll lktll{4111tg(' rxl)rrssions ILs 
the constraints among tit(' sl)atiM era;ties 
• lmagv r('l>r('srntation of Ill(, worhl as a collection 
of tit(' lmram('trrizvd entili(,s 
• int(,rl)rrting tit(, qualit;ttiv,, r(qations as tit(' ml- 
ttWt'il'a\[ ('ottstr~lilHs KltlOtlg lilt' p;tr;tttt(qt)t's 
• Pot(qttbfl (ql(.'rgy fun('tions for lhr vague con- 
st raitll ~, 
• l'~xtracting, thr procedur(, of the ro('onstrll('tiol; 
froltt lit(' itatttral lallgtlag(' (,xl)t't,ssiotls 
• Su('('essive refinement mul modification of tit(' 
worhl model. 
We regard the worm its ml i~ssembly of the 
spatial ent, iti(~s, and r(q)resent each entity a.s the 
(:onfltimd;i\[m of its pl'ototylm lind the real wd- 
ues t)}" il:s t)itl';ttlltlt,(!l',q. W(! \[n'eplu'e the gl'ap}tic 
object, s corr(!sponding to tim prototypes. Each 
graphic (d)ject is represented by the parameters 
prescribing the details of it. The pa, rametcrs pre- 
scribe its }ot:ation, orientation, and extmd,. 
Now the t~sk becomes to gen(,'rate the graphic 
objects corresponding to the described entities 
}tll(| to (|(!tCl'lllill(? th(~ p}Ll.'itltl(!tCl' values pres(tl'ib- 
ing thmn. 
AcrEs DE COLING-92, NANTES, 23-28 Aour 1992 1 2 7 9 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
*ynt~etic rule 
diction ~ry 
object dictlonary 
~emal~tic rub" 
@172: ...... 
patlal co'nst raint~s 
of objel cls }pltranletric reprenel~ 
~ (Ioeal) minimum 
'1 ...... ' ............ I ~)t~ t pu t Image 
Figure 1: The Overview of the Experilnental Sys- 
tem SPRINT 
It is difficult to deter,nine the tmrameter values 
directly front the natural language descriptions. 
beeattse of the I)artiality of the information and 
tim vagueness about the sttatial rehttions alnollg 
tile entities. So. at first, we extract such informa- 
tion as the qualitative spatial constraints among 
the spatial attributes of the entities, and then. 
interpret these COllstrailltS alld calcuhtte the t>a: 
ranleter values. This process is shown in figure 1. 
Given ~ text, SPRINT makes a sm'faee ease 
structure using the lexical information. Each en- 
tity is described ms a noltlt. Next, SPRINT ex- 
tracts spatial constraints about the entities by 
analyzing the related words in the case. struc- 
ture. At this time. SPRINT also extracts the 
sequence of the information references fl'om the 
lexical mtbrmation as dependencies, which are 
used as cues in the calculation of the paranw- 
ters. 
At the next stage, the extracted qualitative 
constraints are intertn'eted as the nmnerical con- 
straints among the e,ltity paranletc.rs. These nu- 
lllerical COllstrailltS :4re repl'esented }kS the COlllbi- 
natiolt of the primitive constraints. Tit(! tmten - 
tial energy function is one of such primitives, and 
this is an efficient method to treat the vagllelteSS 
in the constraints. Other primitives are the tot m- 
logical constraints and the regions. The potential 
energy flmction is a kind of the cost fllnetiolls 
which t~tkes all related paranmters and output 
the ('()st. The less the wthm of the potential (m- 
ergy flmction, the more credit the combinatiml 
of the gemnetric parameters gains. Using the 
gradient descendent method, the solution with 
nlininnnn cost is calculated. The t)otcntial (m- 
c.rgy flmction 1)rovides a means for accmnulating 
fl'om fragnmntal'y infornuttion. (Tit(; basic i(te~ 
of the pote,ltial ene.rgy fm|etion is reported in 
\[4\].1 
3 The Example 
Suptlose that the fi)llowing sentences are the in- 
puts to SPRINT. 
(1) Fill V~i0'Dq,~E~at,fI.g;0%b/oo J (There is a 
fount;tin at the center of till, Yalnmshita Park.) 
(2) \[,fI*a) & ~_ 70~,65v,\[$ia)~a)lrq~. 5 ~:)klll)lt,~ 
~& ~ &:0~ ~ 70. j (Front that place, you can 
sec Hikawa-maru (a slfip) Iwyond the fence of 
the park.) 
(3) V).k JIl .,ic ~ ,~i N l,: l~. "q" ') "./47 V- g: /: o "C 
~)5o j (There is a marine tOwl.r to tit(. right 
hand of Hikawa-mm'u.) 
Frmn these sentences, SPRINT gets the sur- 
fttee eR.se strlwtltres a, ttd interprel, s each eOllltec- 
tion in tit(! structures to extr;u:t sp;ttial con- 
straints. Tit(; ext.racled constraints in this ex- 
ample is shown in table 1. 
Then SPRINT calculates the entity parame- 
ter vahles based Oll these COllstrailltS llSillg po- 
tential c.nergy flmctions. The exanq)le of the t)l)- 
tentiM energy fllllctioll is showlt ill figm'e 2. This 
is one which is used to calculate the location of 
the ship. In this figure, the line repre.sents the 
edge of the park. and the thither side of the line 
me;ms the inside of the t)ark. Finally SPRINT 
draw a world image (m the graphic display. This 
is shown in figure 3. 
4 The Analysis of the View 
In the bunt I~xanlple. the treatment of the view 
is vm'y important. Usually an ob.s(wver sees the 
world ~md notices how the world is. If you did 
Ilot klloW which directiol~ the observer sees. yell 
wouM not dete,'mine the directi(m "'to the right" 
alia (,(mid llOt illlagille wll(!r(! the tower is. All- 
other way to (tetern|ine the direction "'to the 
AcrEs DE COLING-92, NANTES, 23-28 AOt)r 1992 1 2 8 0 P~OC. OF COLING-92, NANTEs, AUG. 23-28. 1992 
Table \]: '\['he Exta'acted Sl)al:ial (?.nstrainl:s 
• The Spatial Eutitivs in the World 
ganmshita Park (a park), il i~,ttntaill. ;t fl,ncc, 
Ilikawa-malu (a ship), a lllD.lilll' tliwl'l (a IOW('I) 
• The l/ehttiow, alllOltg tlw l':ntilics 
Iota/i/m( Fmlnt ain )--Ileal{ ('cnt ~'1{ Park ) ). 
inside( region( l'alk ) ) 
local ion( }':y,' lloint )=near(locat ion( l"oultl ain ) ). 
o,llsid.'(l egion( l"ou nt ain ) ) 
local iolx( l"cncc)=at (\]mundaiy(l c~ion( l'at k) )) 
orient alton{ View) =h'onl( h)('ali(m( Ey(,q)oiut ). 
t o(h)cation(Fenc(.)) 
hwation(Am>l)oint)-hint(,l(Fenc(,. Eycqmint ), 
location(Ship) 
location(Towel )=right(Shill. Eye l)Omt ) 
• 'Fill' Used i,:nowh,dg(, almnl the Entiti,,s 
Park = TwlJ-dilncllSiOllal legion (a kind of 
Gromtd) 
Fomltain = Thic(,-dimensiomd object with \Va- 
tel 
Font*. = Throe dilnvnsional objvct al tilt' 
|)oundary of the two-dinlonsiomtl region 
Shill = Tlnev-dinwllSiolml object (a kind of V,,- 
hich, on \Valet) 
To',ve, = Thrce-dinu'u~,iollal M~jvct (a kind of 
Building) 
• The General World Klmwh.dgv 
Ally two ohjvc/s CllllliOl o('t'llpy the S}llll( placc 
al the s~Ull(* tiHle. 
Evcly object is undel tit(' law of glavitation 
{i.v it nmst SUlqmlted mdoss a special reason) 
Every object has its plausibl,' t.xlent 
Ther,' I'xists a diStall('(' scale act'oldillp, to the 
exteltt. 
Figure 2: The Exlunpl(~ of the l'ol;mdial Energy 
Fmwtion 
(L __ 2::: - - ~.\ 
}~'i,~;ul"/' 3: The ()ull)lll: lltli't~,~l' of the \[nt./!rtn'eLit- 
l:ion 
'l'M~h, 2: Tlw Basic (:onslrainls from I;he Eye 
l'oinl: 
• constrain the ,'y¢ point hy lit,' iminl of "ol~sm- 
Vii/i/ill" 
COllstraill I}lo ailll point by tit(' itillled elltity 
• COllstrahl Ihl' vip'w Ily tlw eye lloint 
• collstl'ailt the view by Ilw aim lmilH 
,J coustrain the view by Ihe ,,ye direction 
righl:'" is t,o cal('uhtl:e il only from t;he orienl, al;ion 
of i:he ship. l)ul we: do m~l: lhink it is usual. Thi.~ 
lllC}Llts t;\[t~L\[ ILl\[It' sp~d,ial inut,e,,~ rl!fh!('l;s l:\[lC histllry 
,:d' l;\]w illi'cr(utcl~, and t.}w ('(Hl.%l;lllC|.l!d itn;tgc is 
used ag.~tin I.o Ulldl!t'st mid file next; ,qt!ll~,(tllC/L 
St, SPtl.IN'F also lias to 
• I)ltl'SllO of IIw eye ll,fint o\[ the o\])sct'ver. 
• sot tho ;'it",'," of the ohm'rvvr from tlw eyo lmim. 
• infer tlw spatial c,mtigura¢ion from 111,, view. 
For ~his s;tkc, w,~ lnOdt~led t.hc view of t.}w ob 
server as one of t;lw spat:ia\[ et~l,it:ies, which has 
lhc ey~ iminl, I,hc aim point:. ~tliil the eye dire/" 
lion. lu this mwl;i/m, we itlladyzt! tho descripl:ions 
aboltI; view in &'/:alia. 
At first, w(~ detin,~ the relali, m almul "'seN" as 
f.llows: 
"There is no visible obsl:acles bid:wccn t.hc 
eye point mid the aimed ent.il:y." 
The constraint.s about tht, eye tminl:, eye dire(:- 
t.ion, and t:h/', itilll point, comes IJt'Olll t:his detini- 
lion. 
Th,~ silulth~.st ci~se is shown in table 2. For 
/!XallI|)l/~, t,\]wre ILl'(? 5 ClHl,qIl'l~illt.S t.O t,h/~ S/!ll- 
t.nce \[ ~{liiJ'\]j~l$~ (j) c b C 4l/'3' C, .It: e) E L & V -- ¢)%1 
ACI'ES DE COLING-92. NANTES, 23-28 AOt~'l 1992 1 28 I I'ROC. OF COLING-92. NANTES, AUG. 23-28, 1992 
~. 7a \] (You can see a tower north from tile sta- 
tion square. ) 
(A) loeation(Eye-l)ointl)=Station Square 
(B) location( Aim-lmint 1) =Trover 
(C) star t - point (View 1 )=Eye-point 1 
(D) end-point (Viewl)=Ainl-point 1 
(El direction(View1 ).=Nor th 
If the eye tloint has its own direct(ira, the 5th 
(!Ollst, r~illt ill '2 \[)eCOllle.N ;~ relative one llased Oil 
the direction of the eye t)oint. For example, to 
~ To\] (If you get a('ross the crossroad, You can 
sec a tower to the right hand.) the. constraint (13) 
~t)ove lice(lilies 
(E') direction(view 1)=t o-t he-right (view- point 2) 
which means the (lireetiml "'to tit(.' right" is deter- 
mined by the direction of tile eye of the observer. 
In this case tile observer get across the crossroad 
and no other information is obt~dned, so tile di- 
rectiml of tim eye is determined as the same ~m 
that of the transfm" of the observer. 
There are the cases where the directi(m of the 
eye changes l'UllOllg the trallsfer. Ill slletl closes, 
the last eye direction must be calculated ~meord- 
ing to the intermediate eh~mges. So the change 
point is imt, ~md it nle(liates the change of the di- 
rection of the tralmfer. The necessary COllStl'ttitlts 
are as follows: 
* eonstrtfint about the change point 
$ COllStrailll 0.b(lut Ill(' IrallSf('r before the 
ehallge 
• ('onstrain( 0.l)ollt tile transfer after Ill(, 
('hallge 
Fro' example, there are 5 constraints for the 
sentence \[ t~?:~ ~/i!~)r~ 70 j (turn left at the 
crossroad). 
(A) loeation(Changeq)oint 1) =Crossroad 
(B) start-point(Transfi, r-vectorl)=the hust 
Eye-point 
(C) end-point (Transfi, r-vector 1) =Change- 
point1 
(D) star t- point (Transfer-veetor2)=Change- 
pointl 
(El direct ioll{Trausfer- reel or2 )= t o- the- 
left (Transfiu'- veer or 1 ) 
Figure 4: The View hltert)retati,m of the Trans- 
fer with the Interlnediate Change 
The direction of the eye after the change is same 
~Ls the diructioll of the' tl'~ulsf't~r-vector2. For ex- 
amph~, the sentence V'l'rF-~'~),i~I?-~" ;5 L, ,~P- 
l: # r\]_ ?)~,~ )5 \] (If you turn left at the cross- 
road, you can see a tower to the right h~md.) is 
interpreted as in figure 4. hi this case, the di- 
rectiml "to the right" is calculated Kom the last 
direction of the eye. 
This interpretation satisfies the constraints in 
the Sellte, ltee, howe.ver, olle llllty think this is not 
tlle Sallle its he/she imagine becmlse in this in- 
tel\])l'etltt.ioIl the obse, l'vel' ('gll see the tower (IV(,~ll 
before the crossroad. The sentence "If you turn 
left....'" seems to imply tlu~t "until you turn left 
at tile Cr(lssro3(l, yOll CgllllOt see 3, tower yet." 
and this is not ill the case of the logical sense. 
Of course this iv not Mways true,. Suppose the 
situation where you see a t(lwer now ~md arc toM 
the last se.utcnce (llrotlably in English you say 
not "a tower" /)ut "'the tower"), this will be the 
c~me of the integration of the several views. Sll 
the additional pragmatic constraints are strongly 
influen(:ed by the puri)ose of the utterance,. 
Anyway if you do not want to see a towm' t)e- 
fore the crossroad, one of the solutions to this 
problem is like this: put some obstacle on the 
view of tit(: observer 1)efore the crossroad, that 
means put it between the p()int of the observer 
and the tower. In this ease, till the observer turn 
at tile corller, there is llO way to kllow the loca- 
ti(m of the tower, so no way to trot the obstacle. 
Tile interpretation ~u'eor(ling to this solution is 
showll in figure 5. 
()lie ()f the other solutilms is that you know 
AcrEs DE COLING-92, NAMES, 23-28 ^ot~'r 1992 1 2 8 2 PROC. OF COLING-92, N^NI'ES, AUG. 23-28, 1992 
72__ -:~ z(_ ~ 
/ ,/ .__,a,.. /7 
Figure 5: The View Int:erln'et;ation with the 
Added Obstacle 
|,here is buildings or sorer:thing Mong a st.reel, and 
llS(~ tili(~lll }Lq IX ol)stacle. 
This kind .f "ilwisil)le' sit:uation must, tm dis- 
cussed with respect t:o the read world and the 
daily language use. 
5 Related Work 
From the lmrc linguistic point of view, A. Her- 
skovits \[1\] analyzed locativ(~ expr(~ssi(ms in En- 
glish. As for constructing a ('Oml)uter model, 
conventional logic fMls short of ore" tmrpose. 
Anlong the formulations based Imrely on C()llVell- 
tioual logic, n,ost, t,ypical is slot-tiller ret)resen~a- 
|,ion such as a tbrmulal:ion by Gordon Nowtk ,It" 
\[2\]. Tlmre also is a work |,y D. Waltz\[3\]. i1 is 
however hlu'd to draw logical c(mchmion out. of a 
set of axioms which lliay involve predicates vague 
and to get a reusable model of the world toni|g- 
unction. 
Ore' apln'oach allows both cont, iiluous and dis- 
t:oiil;iillIOliS fllntq,iOll8 go l'(!pl'(}Sell~, spatiM COli- 
stxMnts, so that the prot)ability changes eit:lwr 
contimumsly and discontilmously. 
It Mso works as ~t clmnk of the information. 
Though it seems that our aptn'oach is rathm' sub- 
jective, it seems imtiossil)le to construct a model 
for the worhl without smnc kind of subjective. 
6 Conclusions 
texts. The area of spa('e-language relationstrip 
contains a lot of hm'd issues, and some liroblems 
related to this work are mentioned tMow. 
ln'esenl;~d,i(m of the image, 
Our progr~tm makes ~t internal 
3-(lin,ensional model of the world, but the 
t)i't!Selii:~ttioil t)iI l,lio s(!rt!t!ii is now llHtiill~tlly 
done. which means that. tim camera position 
fin' t.he computm' graphics is lnmmltlly de- 
cided (it is usmdly a bird's-eye view). How 
to iiresellt tht! iilI;erllld (!oiifigura, tiOll ~ts an 
image is a flu'thin' prot)lem. 
* inl:egration of the iuiti;d image. 
if ;dl tim model is t:onstrut:l;rx\[ b~me.d Oll fill(~ 
wn'bal inf'ol'nt;tl, itm, how to give the initial 
v3.hi(!s of the t)0,ritlilt~|;(:i,q effectively t}(}(:t)llie,~ 
the t)roblenl. If the tim reconstruction tie- 
gins wii:h an initial image, the inl.egration of 
t, hat inmge aim the verbM inforniation is the 
other tiroblein. (Proll,d)ly the initial ilmtge 
is also vague.) 
We are now considering t;lw pn~gmatic use of 
the verbal eXl)l'ession in tim world model, and 
making a model of l:he visual disappearance. 
References 
\[1\] A. Herskovits. Language and Spatial Cogni- 
tion, C~mfl)ri(tge Universit, y Press, 1986. 
\[2\] G.S. Nowtk Jr. II.elWe,sentations of knowledge 
in a pr()grmn for solving l)hysics probhmm. \]u 
Proc. L1CAI-77, l)ages 286 291, 1977. 
\[3\] D.L. Walt, z. Towards a det~dled modd of pro- 
<:essing fro' hmguag(~ describing the l)hysicM 
world. In lbvc. L1CAI-81, t)itges 1 6, 1981. 
\[4\] A. Ymnatl~t, '\]'. Nishida. mM S. Doshit~u Fig- 
tiring out most t)lausillh~ interpretation f,'om 
si)atial descriptions. In Proc. COLING-88, 
l)ages 764 769, 1988. 
We have presented i~ll exlmriillellt, al colllputer 
t)rogrmn which produces 3-(limensional image as 
~ttl illtert)re~ittiotl of I, he givoll liar.lira|| lltllgtlag~! 
ACRES DE COLING-92, NAN'IIlS, 23-28 AoIsr 1992 I 2 8 3 PRoc. OF COLING-92, NAMES, AUO. 23-28, 1992 
