LangLAB: A Natural Language Analysis System
TOKUNAGA Takenobu, IWAYAMA Makoto, TANAKA Hozumi
Department of Computer Science
Tokyo Institute of Technology
KAMIWAKI Tadashi
Hitachi Research Laboratory
Hitachi Ltd.
Abstract
This paper presents a natural language analysis system LangLAB based on BUP-XG, which parses with a bottom-up and depth-first strategy and has the ability to handle left extraposition. We have already developed a grammar formalism XGS, which is a superset of DCG. With XGS, left extraposition phenomena are naturally expressed in grammar rules. We have also optimized BUP-XG clauses. Experiments showed that in comparison to the original BUP-XG system, the analysis sped up 10 times in the interpreter mode and 4 times in the compiled mode. The TRIE structured dictionary in LangLAB requires less memory, provides faster dictionary reference and also handles complicated idioms with versatility. Consequently, the utilization of LangLAB for practical purposes has become feasible.
1 Introduction
So far, several grammar formalisms based on the logic programming paradigm, such as Metamorphosis Grammar [2] and DCG [9], have been presented. In Metamorphosis Grammar, each grammar rule is translated into a Horn clause, and the Prolog interpreter parses the input sentence with these Horn clauses using a top-down and depth-first strategy. Unlike in the past, where parsers had to be constructed for syntactic analysis, in this method we do not have to, because the Prolog interpreter itself works as one. Metamorphosis Grammar also provides a natural language processing method which interleaves syntactic analysis and semantic analysis. This is a desirable feature from the point of view of cognitive science.
Following Metamorphosis Grammar, Pereira et al. developed grammar formalisms called Definite Clause Grammar (DCG) [9] and Extraposition Grammar (XG) [8]. The grammar rules written in DCG are also translated into a Prolog program, and the Prolog interpreter works as a top-down and depth-first parser interleaving syntax analysis and semantic analysis. XG is the extended version of DCG capable of handling left extraposition.
However, top-down parsers have the problem that the program falls into an infinite loop when a left recursive rule appears in the grammar rules. This problem can be solved either by translating grammar rules with left recursive rules into ones without left recursive rules or by using a bottom-up parsing strategy. Since the former solution may give unnatural parsing results, the latter is preferable.
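The left recursion problem can be reproduced in a few lines. The following is a minimal sketch in Python (used here purely for illustration; the systems discussed in this paper are written in Prolog). The two toy grammars and the function names are assumptions made for this example and do not come from any of the cited systems.

```python
# Toy grammars (hypothetical): each category maps to a list of right-hand sides.
LEFT_RECURSIVE = {"np": [["np", "pp"], ["you"]], "pp": [["there"]]}
RIGHT_RECURSIVE = {"np": [["you", "pps"]],
                   "pps": [["pp", "pps"], []],
                   "pp": [["there"]]}

def parse_top_down(grammar, cat, tokens, pos=0):
    """Return every end position reachable by expanding `cat` at `pos`,
    trying rules in order and expanding the leftmost symbol first."""
    if cat not in grammar:  # terminal symbol: must match the next token
        return {pos + 1} if pos < len(tokens) and tokens[pos] == cat else set()
    ends = set()
    for rhs in grammar[cat]:
        positions = {pos}
        for sym in rhs:
            positions = {e for p in positions
                         for e in parse_top_down(grammar, sym, tokens, p)}
        ends |= positions
    return ends

def recognizes(grammar, cat, tokens):
    """True or False on normal termination; None if the expansion of a
    left recursive rule never bottoms out (Python's recursion limit is hit)."""
    try:
        return len(tokens) in parse_top_down(grammar, cat, tokens)
    except RecursionError:
        return None
```

With the rule `np -> np pp`, the parser re-enters `np` at the same input position without consuming a word, which is the infinite loop described above. The rewritten grammar terminates, but it assigns a different (right-branching) structure to the same string.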
Matsumoto of Electrotechnical Laboratory developed a system in which the grammar rules written in DCG are translated into Horn clauses called BUP clauses and the Prolog interpreter works as a bottom-up and depth-first parser for these rules [14]. Matsumoto's system is called the BUP system. The BUP system can handle left recursive rules and treats grammar rules and the dictionary separately.
Figure 1: Structure of LangLAB (grammar rules and the dictionary pass through their translators to become BUP-XG clauses and the TRIE structured dictionary, which are loaded into the Prolog system)
Konno of Tokyo Institute of Technology extended the BUP system to the BUP-XG system [5], which can handle the left extraposition phenomena elegantly. The BUP-XG system introduced the grammar description form called XGS (Extraposition Grammar with Slash Category).
This paper presents a natural language analysis system LangLAB based on Konno's BUP-XG system. Figure 1 shows the structure of the LangLAB system. Users should prepare grammar rules written in XGS and a dictionary written in DCG. The grammar rules and the dictionary are translated into BUP-XG clauses and a TRIE structured dictionary respectively by translators. Translated results are consulted by the Prolog system and the Prolog interpreter works as a parser.
In chapter 2, we briefly explain the fundamentals of the BUP system and the grammar description form XGS adopted in LangLAB. We will also describe the BUP-XG translator, which translates the grammar written in XGS into BUP-XG clauses, and its optimizations. In chapter 3, we will touch on the TRIE structured dictionary adopted in LangLAB. The TRIE structured dictionary requires less memory and provides faster dictionary reference and flexible idiom handling. In chapter 4, we shall present results of experiments verifying the effect of the optimization described in chapter 2. Experiments showed that the analysis sped up 10 times in the interpretive mode and 4 times in the compiled mode. The authors believe that LangLAB performs well enough to be of practical use.
s --> np, vp.        (d-1)
np --> pron.         (d-2)
pron --> [you].      (d-3)
vp --> [walk].       (d-4)
Figure 2: Sample grammar written in DCG
np(G) --> {link(np,G)},      (b-1)
          goal(vp),
          s(G).
pron(G) --> np(G).           (b-2)
dict(pron) --> [you].        (b-3)
dict(vp) --> [walk].         (b-4)
Figure 3: BUP clauses translated from figure 2
2 XGS and BUP-XG 
In this chapter, we shall explain the grammar description form XGS adopted in LangLAB and the BUP-XG translator which translates grammar rules written in XGS into the BUP-XG clauses. Before explaining BUP-XG, we will briefly explain the mechanism of the BUP system, the predecessor of BUP-XG. The basic parsing mechanism of BUP is left-corner parsing with top-down prediction.
2.1 BUP system 
In the BUP system, grammar rules written in DCG (Figure 2) are translated into rules called BUP clauses, which are also of DCG format, and some Prolog programs (link clauses and termination clauses, explained later).
Figure 3 shows the results of the translation. These BUP clauses are then translated into a Prolog program (Figure 4) by the DCG translator which is embedded in the Prolog system. Two more arguments are added to each predicate which denotes a nonterminal symbol in figure 4. These arguments constitute a difference list which represents the input string. With the special predicate goal, which is necessary for bottom-up parsing, this Prolog program can parse the input string with a bottom-up and depth-first strategy. Figure 5 shows the definition of the predicate goal.
Now, we shall give a step by step explanation of the parsing algorithm of the BUP system. We will use the grammar shown in figure 4 and the input sentence "you walk" as an example. Calling the predicate goal activates the parsing process:
?- goal(s, [you,walk], []).
np(G,X,Z) :- link(np,G),       (p-1)
             goal(vp,X,Y),
             s(G,Y,Z).
pron(G,X,Y) :- np(G,X,Y).      (p-2)
dict(pron,[you|X],X).          (p-3)
dict(vp,[walk|X],X).           (p-4)
Figure 4: Prolog programs translated from figure 3
goal(G,X,Y) :-                         (g-1)
    ( wf_goal(G,X,_)
    ; fail_goal(G,X), !, fail
    ), !,
    wf_goal(G,X,Y).
goal(G,X,Z) :-                         (g-2)
    dict(C,X,Y),
    link(C,G),
    P =.. [C,G,Y,Z],
    call(P),
    assertz(wf_goal(G,X,Z)).
goal(G,X,Y) :-                         (g-3)
    assertz(fail_goal(G,X)), !,
    fail.
Figure 5: Definition of the goal clause
This call checks to see if:
A parse tree, the root of which is the category "s", can be constructed from the input string denoted by the difference between the list [you, walk] and the list [] ([you, walk] in this example).
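The difference-list reading of the two extra arguments can be made concrete as follows. This is a small Python sketch for illustration only; the function names are assumptions, and the two lexicon entries merely mirror (p-3) and (p-4) of figure 4.

```python
LEXICON = {"you": "pron", "walk": "vp"}  # mirrors dict/3 in (p-3) and (p-4)

def difference(whole, rest):
    """Return the words denoted by the difference between `whole` and its suffix `rest`."""
    if rest and whole[-len(rest):] != rest:
        raise ValueError("rest is not a suffix of whole")
    return whole[:len(whole) - len(rest)]

def dict_entry(whole):
    """Like dict(C, [Word|X], X): consume one word and return its category and the rest."""
    word, *rest = whole
    return LEXICON[word], rest
```

For instance, `difference(["you", "walk"], [])` denotes the whole sentence, while the pair `["you", "walk"]` and `["walk"]` denotes only the word "you", which is exactly the span that a dictionary lookup for "you" covers.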
The first call of goal invokes the clause (g-1) in figure 5. The clause (g-1) checks to see if the same analysis has been made before, to avoid recomputation, using the information previously asserted as wf_goal and fail_goal.
As the execution of the clause (g-1) fails in this case, the system chooses the next clause (g-2). In the body of the clause (g-2), the system consults the dictionary by calling "dict(C, [you, walk], Y)". This predicate call picks (p-3) in figure 4 and the system matches "pron" with variable C and "[walk]" with variable Y.
In the second line of (g-2), the system calls the predicate link to see if the category which is obtained by the previous dictionary consultation ("pron" in this example) can be the left corner of the current goal ("s" in this example). The link clauses are calculated by the BUP translator. Supposing this test succeeds, the system calls the predicate "pron":
P =.. [pron,s,[walk],[]],
call(P).
Calling "pron(s, [walk], [])" invokes (p-2). Then, the system executes its body, that is, "np(s, [walk], [])". Calling "np(s, [walk], [])" invokes the clause (p-1). After calling the predicate link to check reachability from "np" to "s", the system invokes "goal(vp, [walk], [])". At this point, the system has analyzed the string "you" as "np" and is predicting that the trailing string "walk" should be bundled up to the category "vp".
In the same manner, a bottom-up analysis with a top-down prediction proceeds until the execution of goal with the termination clauses succeeds. See [14] for the details of the termination clauses.
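The walkthrough above can be condensed into a small recognizer. The following Python sketch is an illustration under stated assumptions, not the BUP implementation: `goal` and `climb` are hypothetical names, the `c == g` test plays the role of the termination clauses, and the link table is written out by hand for the grammar of figure 2.

```python
RULES = [("s", ["np", "vp"]), ("np", ["pron"])]   # (d-1), (d-2)
LEXICON = {"you": ["pron"], "walk": ["vp"]}       # (d-3), (d-4)
# link(C, G): category C can be the left corner of G (hand-written here).
LINK = {("pron", "pron"), ("pron", "np"), ("pron", "s"),
        ("np", "np"), ("np", "s"), ("vp", "vp"), ("s", "s")}

def goal(g, tokens, pos):
    """Yield each end position e such that tokens[pos:e] can be analyzed as g."""
    if pos >= len(tokens):
        return
    for c in LEXICON.get(tokens[pos], []):        # dictionary consultation
        if (c, g) in LINK:                        # top-down prediction filter
            yield from climb(c, g, tokens, pos + 1)

def climb(c, g, tokens, pos):
    """A constituent of category c has just been completed; grow it toward g."""
    if c == g:                                    # stands in for the termination clauses
        yield pos
    for lhs, rhs in RULES:
        if rhs[0] == c and (lhs, g) in LINK:      # c is the left corner of this rule
            ends = [pos]
            for sym in rhs[1:]:                   # predict the remaining categories
                ends = [e for p in ends for e in goal(sym, tokens, p)]
            for e in ends:
                yield from climb(lhs, g, tokens, e)

def recognize(cat, tokens):
    return len(tokens) in goal(cat, tokens, 0)
```

Tracing `recognize("s", ["you", "walk"])` reproduces the steps in the text: "you" is looked up as `pron`, climbs to `np`, the rule for `s` then predicts `vp` for the trailing word, and the analysis ends when `s` itself is reached.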
Results of an analysis, whether it succeeded or failed, are asserted as wf_goal at the end of (g-2) and as fail_goal in the clause (g-3) respectively. This information is used in (g-1) as described.
s --> np, vp.                     (x-1)
np --> pron.                      (x-2)
np --> det, noun, s_rel../np.     (x-3)
vp --> vt, np.                    (x-4)
s_rel --> rel_pron, s.            (x-5)
Figure 6: Sample grammar written in XGS
2.2 BUP-XG system
The embedded sentence which appears in relative clauses in English can be viewed as a structure in which a noun phrase is missing from a declarative sentence. A gap is formed as a result of moving the antecedent from within the declarative sentence to the left of the relative clause. Linguists call such phenomena "left extraposition". By considering the gap left by the moved constituents as a "trace", and incorporating a mechanism that looks for such a "trace" automatically, the number of grammar rules can be decreased and the grammar rules become easier to read. Moreover, incorporating such a mechanism contributes to faster analysis.
Top-down parsers like ATNG [13], [12] and XG [8] incorporate such a mechanism. The top-down parser can predict what category the trailing input string may be bundled up to. Efficient trace searching is possible as the system assumes the existence of traces only when a particular category is predicted as a goal.
A pure bottom-up parser is not capable of such predictions, and inefficiency results because of the necessity to assume the existence of a trace between every two words. However, since the BUP system incorporates top-down prediction in the bottom-up parsing strategy described in 2.1, it is possible to implement the mechanism to look for the traces efficiently. Konno developed a BUP-XG system which incorporated such a mechanism [5].
The XGS adopted in LangLAB provides grammar writers with the facility with which left extraposition can be naturally expressed in grammar rules. Figure 6 shows a small English grammar which is written in XGS.
The notation "../" (called "slash") in the rule (x-3) is introduced in XGS. This rule means that there exists the syntactic category "np" which dominates the "trace" under the syntactic category "s_rel" ("s_rel" means relative sentence). This idea is influenced by the "slash category" in GPSG [3]. We call the category after "../" a "slash category". Rule (x-3) also shows that the category "np" consists of the categories "det", "noun" and "s_rel" and that the trace left behind by the left extraposition of the noun phrase consisting of "det" and "noun" is dominated by "s_rel". During an analysis, when the system finds the trace under "s_rel", as shown in figure 7, it associates the trace in the embedded sentence with the moved phrase ("the man").
Figure 7: Matching between slash category and its trace (parse tree of "the man who [trace] loves her", in which the trace under "s_rel" is matched with the noun phrase "the man")
XGS also provides a notation to represent "Ross's Complex NP constraint" [10]. Following is an example of this notation. This notation is called "open (<)" and "close (>)" following Pereira [8].
a --> b, c, <d>.
This rule means that category "a" consists of categories "b", "c" and "d". Open-close notation defines the scope of extraposition. This example says that the movement from under "b" or "c" to the outside of "a" is permissible, but the movement from under "d" to the outside of "a" is not. Sentences violating "Ross's Complex NP constraint" are rejected by modifying (x-3) to become (x-3'):
np --> det, noun, <s_rel../np>.   (x-3')
With (x-3'), the trace which is dominated by slash category "np" under "s_rel" can only correspond to the noun phrase which consists of "det" and "noun".
In addition, XGS also provides a double arrow notation (==>) and the notation to describe X lists (explained later) explicitly. With these notations, "coordinate structure" can be represented in a natural way (see [5]).
2.3 BUP-XG translator 
Just like in the BUP system, the grammar rules written in XGS are translated into BUP-XG clauses, link clauses and termination clauses by the BUP-XG translator. The BUP-XG translator in the LangLAB system has been improved so as to generate BUP-XG clauses more optimized than those in the original BUP-XG system. Furthermore, it is also equipped with a new function which inserts parse tree information automatically. The translator takes about three seconds to translate a grammar of about 200 rules. The following subsections explain these improvements.
2.3.1 Representation of link clauses
As the number of grammar rules increases, more link clauses are generated by the translator. For example, from about 200 grammar rules of English which we have developed, the BUP-XG translator generates about 700 link clauses. Shortening the search time of link clauses would contribute to an efficient analysis.
Link clauses are called in the body of BUP-XG clauses and in the predicate goal. Since both the arguments of link are atoms in both cases, a link clause
link(a, b).
which denotes the reachability from the category "a" to "b" can be changed to the form
a(b) :- !.
This form of representation reduces the search space of the reachability test. The BUP-XG translator in LangLAB generates link information of this form.
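Both the calculation of link information and the benefit of keying it by the first category can be sketched briefly. The Python below is an illustration under assumptions: `compute_links` is a hypothetical helper that takes the reflexive and transitive closure of the left-corner relation, the four rules are a toy grammar, and a dictionary keyed by category stands in for the `a(b) :- !.` representation.

```python
def compute_links(rules):
    """Map each category to the set of categories it can be the left corner of."""
    cats = {lhs for lhs, _ in rules} | {sym for _, rhs in rules for sym in rhs}
    reach = {c: {c} for c in cats}          # every category reaches itself
    changed = True
    while changed:                          # iterate until nothing new is added
        changed = False
        for lhs, rhs in rules:
            for c in cats:
                if rhs[0] in reach[c] and lhs not in reach[c]:
                    reach[c].add(lhs)
                    changed = True
    return reach

RULES = [("s", ["np", "vp"]), ("np", ["pron"]),
         ("np", ["det", "noun"]), ("vp", ["vt", "np"])]
LINKS = compute_links(RULES)

# One flat relation, as with link(a, b) facts: a test may scan every pair.
FLAT = [(a, b) for a in sorted(LINKS) for b in sorted(LINKS[a])]

def link_flat(a, b):
    return (a, b) in FLAT

# Keyed by the first category, as with a(b): only the entries for `a` are inspected.
def link_indexed(a, b):
    return b in LINKS.get(a, ())
```

Both tests give the same answers; the keyed form simply restricts each reachability check to the clauses belonging to one category, which matters once there are several hundred link clauses.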
2.3.2 Indexes for difference list
As described in subsection 2.1, input strings are represented by a difference list and intermediate analysis results are asserted with the predicates wf_goal and fail_goal. Since the last two arguments of wf_goal constitute a difference list of the input string, the longer the input string becomes, the more memory wf_goal consumes. By indexing difference lists, the amount of memory required is reduced, and faster reference to intermediate results is possible.
For example, when the system gets the input string "you walk", the predicates text are asserted as follows:
text(s0, []).
text(s1, [walk]).
text(s2, [you, walk]).
The dictionary reference program gets a difference list by calling text with indexes (s1, s2, ...) as the key, before consulting the dictionary.
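The indexing scheme can be mimicked as follows. This Python sketch is illustrative only; `index_text` and `span` are hypothetical names, and the index labels follow the `text/2` facts shown above, where `s0` names the empty remainder.

```python
def index_text(tokens):
    """Name every suffix of the input: s0 is the empty rest, sN the whole input."""
    n = len(tokens)
    return {f"s{k}": tokens[n - k:] for k in range(n + 1)}

def span(text, start, end):
    """Words between two indexes, i.e. the difference of the two named suffixes."""
    whole, rest = text[start], text[end]
    return whole[:len(whole) - len(rest)]

TEXT = index_text(["you", "walk"])

# An intermediate result now stores two short atoms instead of two word lists.
wf_goal = {("np", "s2"): ["s1"]}
```

A record such as `("np", "s2") -> "s1"` has constant size regardless of sentence length, whereas storing the suffix lists themselves grows with the input.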
2.3.3 Representation of intermediate results
Generally, a long input string gives rise to more wf_goals and fail_goals, which results in longer search time for intermediate analysis results. Wf_goals and fail_goals have as their arguments the index to the difference list denoting the partial input string, and its analysis. As described in 2.1, goal first consults wf_goals and fail_goals with the indexes of the input string as the key. In the LangLAB system, the predicate names of intermediate analysis results are the indexes to the difference list instead of "wf_goal" or "fail_goal". This modification reduces the search space of the intermediate analysis results and speeds up the analysis process.
2.3.4 Insertion of parse tree information
Users sometimes require the results of syntactic analysis to be expressed as parse trees, and in both the BUP system and the original BUP-XG system, users are required to insert an argument in each category to accommodate parse tree information. However, it is not a difficult task to make the translator insert this information automatically. In the BUP-XG translator of LangLAB, this information is inserted automatically unless instructed otherwise. This function is similar to the one in McCord's MLG (Modular Logic Grammar) [7]. However, unlike MLG, all the nonterminal symbols can be a node of parse trees.
2.3.5 Example of translation 
Figure 8 shows the BUP-XG clauses translated from the grammar in figure 6. The variables beginning with "X" in figure 8 are introduced to handle left extraposition. Such a variable is called an X list (extraposition list), which was introduced in XG [8]. Information pertaining to slash categories is pushed into the X list and is then transferred from category to category during the analysis process. The predicate goal_x is an extended version of the predicate goal in the BUP system, which pops the slash category from the X list when the trace is found. Note that variables for parse tree information, the names of which begin with "T", are automatically inserted and that the representation of link information (in braces) is also modified.
np(Goal,[T1],Info,X0,X1,XR) -->
    goal_x(vp,[T2],X1,X2),
    s(Goal,[[s,T1,T2]],Info,X0,X2,XR).
pron(Goal,[T1],Info,X0,X1,XR) -->
    { np(Goal) },
    np(Goal,[[np,T1]],Info,X0,X1,XR).
det(Goal,[T1],Info,X0,X1,XR) -->
    { np(Goal) },
    goal_x(noun,[T2],X1,X2),
    goal_x(s_rel,[T3],x(np,[np(t)],X2),X3),
    np(Goal,[[np,T1,T2,T3]],Info,X0,X3,XR).
vt(Goal,[T1],Info,X0,X1,XR) -->
    { vp(Goal) },
    goal_x(np,[T2],X1,X2),
    vp(Goal,[[vp,T1,T2]],Info,X0,X2,XR).
rel_pron(Goal,[T1],Info,X0,X1,XR) -->
    { s_rel(Goal) },
    goal_x(s,[T2],X1,X2),
    s_rel(Goal,[[s_rel,T1,T2]],Info,X0,X2,XR).
Figure 8: BUP-XG clauses translated from figure 6
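How an X list lets a left-corner parser assume a trace only where one is predicted can be sketched as follows. This Python recognizer is a simplified reading for illustration, not the BUP-XG code: the tuple `("s_rel", "np")` stands for `s_rel../np`, the X list is a plain tuple used as a stack, the small lexicon is invented, and parse trees, the `Info` argument and the open/close notation are all omitted.

```python
RULES = [("s", ["np", "vp"]), ("np", ["pron"]),
         ("np", ["det", "noun", ("s_rel", "np")]),   # s_rel../np
         ("vp", ["vt", "np"]), ("s_rel", ["rel_pron", "s"])]
LEXICON = {"the": ["det"], "man": ["noun"], "who": ["rel_pron"],
           "loves": ["vt"], "her": ["pron"]}          # hypothetical entries

def name(sym):
    return sym[0] if isinstance(sym, tuple) else sym

def compute_links(rules):
    cats = {l for l, _ in rules} | {name(s) for _, r in rules for s in r}
    reach = {c: {c} for c in cats}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            for c in cats:
                if name(rhs[0]) in reach[c] and lhs not in reach[c]:
                    reach[c].add(lhs)
                    changed = True
    return reach

LINK = compute_links(RULES)

def goal_x(g, tokens, pos, xlist):
    """Yield (end, xlist_after) for each analysis of category g starting at pos."""
    if pos < len(tokens):
        for c in LEXICON.get(tokens[pos], []):
            if g in LINK[c]:
                yield from climb(c, g, tokens, pos + 1, xlist)
    if xlist and g in LINK[xlist[0]]:
        # Assume a trace: pop the slash category and consume no input.
        yield from climb(xlist[0], g, tokens, pos, xlist[1:])

def climb(c, g, tokens, pos, xlist):
    if c == g:
        yield pos, xlist
    for lhs, rhs in RULES:
        if rhs[0] != c or g not in LINK[lhs]:
            continue
        states = [(pos, xlist)]
        for sym in rhs[1:]:
            if isinstance(sym, tuple):       # push the gap, require that it is used
                cat, gap = sym
                states = [(e, x2) for p, x in states
                          for e, x2 in goal_x(cat, tokens, p, (gap,) + x)
                          if x2 == x]
            else:
                states = [(e, x2) for p, x in states
                          for e, x2 in goal_x(sym, tokens, p, x)]
        for e, x in states:
            yield from climb(lhs, g, tokens, e, x)

def recognize(cat, sentence):
    tokens = sentence.split()
    return any(e == len(tokens) and x == ()
               for e, x in goal_x(cat, tokens, 0, ()))
```

In "the man who loves her", no word after "who" can start an `s`, but the pushed `np` can, so the trace is assumed exactly there. Because a trace is tried only when the X list is non-empty and the link test succeeds, the parser never has to hypothesize a gap between every two words.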
v(info(get)) --> [get].
v(ref(get,[[vf|ed]])) --> [got].
v(ref(get,[[vf|en]])) --> [gotten].
v(info(get_up)) --> [get, up].
v(info(get_on)) --> [get, on].
Figure 9: Sample dictionary including idioms
3 TRIE structured dictionary
This chapter explains the TRIE structured dictionary, another extension to the BUP-XG system and the BUP system. The TRIE structured dictionary requires less memory, provides faster dictionary reference and flexible idiom handling.
3.1 TRIE structure
The name "TRIE" is taken from "reTRIEval" [1] and it means a kind of tree structure. A dictionary written in DCG is translated into a TRIE structured dictionary by the TRIE dictionary translator. The TRIE structure is a tuple which has three elements, that is, "word", "information for word(s)" and "its child TRIE structure".
For example, the dictionary written in DCG shown in figure 9 would be translated to the TRIE structured dictionary shown in figure 10.
To look up a TRIE structured dictionary, the dictionary reference program searches through the tree matching the input string with the first element of the TRIE structure, and information for the string of input is retrieved only after the last word of the input string is matched. Actually, the translator bundles up the dictionary entries which have the same first word into a clause (see how the entries "get", "get on" and "get up" are translated in figure 10). By using this structure for the dictionary, the system can avoid backtracking at a shallow level in dictionary reference.
dict(get,
     [[v,[info(get)]]],
     [[on,
       [[v,[info(get_on)]]], []],
      [up,
       [[v,[info(get_up)]]], []]]).
dict(got,
     [[v,[ref(get,[[vf|ed]])]]],
     []).
dict(gotten,
     [[v,[ref(get,[[vf|en]])]]],
     []).
Figure 10: TRIE structure translated from figure 9
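The lookup described here can be illustrated with an ordinary trie. The Python sketch below is a minimal illustration, not the LangLAB dictionary format: nodes are plain dictionaries with hypothetical `info` and `children` fields, and the entries loosely follow figure 9.

```python
def build_trie(entries):
    """entries: list of (words, info). Entries sharing a first word share a path."""
    root = {"info": [], "children": {}}
    for words, info in entries:
        node = root
        for w in words:
            node = node["children"].setdefault(w, {"info": [], "children": {}})
        node["info"].append(info)
    return root

def lookup(trie, tokens, pos=0):
    """Return (end, info) for every entry matching a prefix of tokens[pos:],
    found in one left-to-right walk without re-reading any word."""
    results, node = [], trie
    for i in range(pos, len(tokens)):
        node = node["children"].get(tokens[i])
        if node is None:
            break
        results.extend((i + 1, info) for info in node["info"])
    return results

TRIE = build_trie([(["get"], ("v", "get")),
                   (["get", "up"], ("v", "get_up")),
                   (["get", "on"], ("v", "get_on"))])
```

Looking up "get up early" walks `get` and then `up` once, yielding both the single word and the idiom. With three separate dictionary clauses, the word "get" would be matched again for each candidate entry after backtracking.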
In figure 9, the argument of the head is the information of the entry. The argument "info(X)" means the information of the entry "X". The argument of the entries "got" and "gotten" is a structure "ref" which denotes a pointer to the entry denoted by the first argument of "ref" (in this case, a pointer to the entry "get"). Dictionary entries whose information differs from each other only partially, e.g. the root form and the conjugated form of an irregular verb, can be written in this manner.
The second argument of the structure "ref" is the differential information between this entry ("got" or "gotten") and the entry pointed to by the first argument of "ref". In this example, feature "vf" means "verb form" and its values "ed" and "en" mean "past" and "past participle" respectively. With such a description, users do not have to write additional idiom entries which include the conjugated form of irregular verbs. In the case of regular verbs, since conjugated forms are processed by the morphological analysis program built in the dictionary reference program, idiom entries which include the conjugated form are not necessary. For example, users do not have to write the idiom entry "kicked the bucket" if the entry "kick the bucket" is written.
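The role of "ref" can be sketched as follows. This Python fragment is an illustration under assumptions: the dictionary layout, the `root` feature and the function names are hypothetical, and only the idea of a pointer plus differential information comes from the text.

```python
DICTIONARY = {
    "get":    {"info": {"root": "get"}},
    "got":    {"ref": ("get", {"vf": "ed"})},   # pointer to "get" plus the difference
    "gotten": {"ref": ("get", {"vf": "en"})},
    "up":     {"info": {"root": "up"}},
}
IDIOMS = {("get", "up"): "get_up"}               # written once, in root form

def resolve(word):
    """Follow a ref pointer and overlay the differential information."""
    entry = DICTIONARY[word]
    if "ref" in entry:
        target, diff = entry["ref"]
        return {**resolve(target), **diff}
    return dict(entry["info"])

def root_forms(words):
    return tuple(resolve(w)["root"] for w in words)
```

Because "got" resolves to the root "get", the phrase "got up" reaches the single idiom entry for "get up" without a separate entry for the conjugated form.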
The TRIE structured dictionary can include Prolog programs to check some constraints and syntactic categories in its "word" position (first element of a tuple). This feature makes it possible to handle idioms including non-frozen elements, such as "not only ... but also ...". In the BUP system and the BUP-XG system, the system regards such idioms as a two-element word, that is, a prefix terminal part and a following nonterminal part. The former part is included in the dictionary and the latter part is included in the grammar rules. In LangLAB, the TRIE structured dictionary is able to handle all such idioms as dictionary entries.
The idiom entries which include non-frozen elements such as "not only ... but also ..." can be written as shown in figure 11.
adj([],[]) --> [not, only], adj(_,_),
               [but, also], adj(_,_).
np(Np,[]) --> [not, only], np(Np1,_),
              [but, also], np(Np2,_),
              {join(Np1,Np2,Np)}.
Figure 11: Sample dictionary with nonterminal symbols and Prolog programs in the rule body
dict(not, [],
  [[only, [],
    [[[adj,_,_], [],
      [[but, [],
        [[also, [],
          [[[adj,_,_], [[adj,[[],[]]]], []]]]]]]],
     [[np,Np1,_], [],
      [[but, [],
        [[also, [],
          [[[np,Np2,_], [],
            [[(join(Np1,Np2,Np)), [[np,[Np,[]]]], []]]]]]]]]]]]]).
Figure 12: TRIE structure translated from figure 11
Figure 12 is the result of the translation. In the case of DCG, as an idiom entry such as the one in figure 11 is usually handled as a grammar rule, the number of grammar rules increases and the analysis process becomes inefficient. It is preferable to handle grammar rules and dictionary entries separately.
As shown in figure 12, the translator converts the Prolog program in the dictionary entry, "{join(Np1,Np2,Np)}", into the form "(join(Np1,Np2,Np))". The dictionary reference program calls the program enclosed by parentheses when it encounters such a form. In the same way, a syntactic category in the dictionary entries such as "np(Np1,_)" is converted into a list, the first element of which is a category name and the rest of which are the arguments of the category ([np,Np1,_]). The dictionary reference program calls the predicate goal (goal(np,[Np1,_],X,Y)) for such a form.
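The three kinds of element that may occupy the "word" position (a literal word, a syntactic category, an embedded program) can be illustrated with a single dictionary path. The Python sketch below is a deliberately simplified illustration: `parse_category` is a hypothetical one-word stand-in for the call to goal, `join` is an invented counterpart of the embedded Prolog program, and the path is matched as a flat list rather than as a full TRIE.

```python
CATEGORY_WORDS = {"np": {"cats", "dogs"}}   # hypothetical stand-in for the parser

def parse_category(cat, tokens, pos):
    """Stand-in for calling goal on a category: accepts a single known word."""
    if pos < len(tokens) and tokens[pos] in CATEGORY_WORDS.get(cat, ()):
        return pos + 1, tokens[pos]
    return None

def join(bindings):
    """Stand-in for an embedded program such as {join(Np1,Np2,Np)}."""
    bindings["Np"] = [bindings["Np1"], bindings["Np2"]]
    return True

# One path of the dictionary: words, (category, variable) slots and a program.
NOT_ONLY_NP = ["not", "only", ("np", "Np1"), "but", "also", ("np", "Np2"), join]

def match(pattern, tokens, pos=0):
    """Return (end, bindings) if the whole path matches tokens[pos:], else None."""
    bindings = {}
    for item in pattern:
        if callable(item):                       # embedded program: just call it
            if not item(bindings):
                return None
        elif isinstance(item, tuple):            # syntactic category: call the parser
            cat, var = item
            found = parse_category(cat, tokens, pos)
            if found is None:
                return None
            pos, bindings[var] = found
        else:                                    # frozen element: compare the word
            if pos >= len(tokens) or tokens[pos] != item:
                return None
            pos += 1
    return pos, bindings
```

Matching "not only cats but also dogs" binds both noun phrase slots and then runs the embedded program, so the whole idiom is handled from the dictionary side without adding a grammar rule.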
The TRIE structured dictionary enables the LangLAB system to handle idioms with versatility [4].
4 Experiments
We conducted experiments to verify the effect of optimization of BUP-XG clauses. We measured the time for syntactic analysis of ten sample sentences. The experiment environment is as follows:
• Machine : SUN3/260 Workstation
• Prolog : Quintus Prolog Release 1.6
• Grammar : 1($3 rules in XGS
In the experiment, we measured the time required to obtain all parse trees before and after the optimization for each sample sentence. This analysis does not include morphological analysis. Table 1 is the result of the experiment in the interpretive mode and table 2 is the one in the compiled mode. The fourth and the fifth columns of each table are the time to analyze the sentence in the original BUP-XG system and in the LangLAB system respectively. Time is shown in milliseconds.

Table 1: Analysis time using interpretive code
No.   Number of   Number of   Analysis Time [msec]          Ratio
      Words       Trees       (1) BUP-XG    (2) LangLAB     (1)/(2)
 1    14           9             80,415         8,552         9.40
 2     4          12             18,868         2,700         6.99
 3     3           7             46,700         4,983         9.37
 4     1          10             30,900         3,600         8.58
 5     3          11             39,634         4,050         9.79
 6     4          18             95,933         9,550        10.05
 7     9          21            323,167        26,183        12.34
 8     2          19             87,550         9,349         9.36
 9     4          17            180,300        15,816        11.40
10     1          25            116,284        12,083         9.62
average                                                       9.69

Table 2: Analysis time using compiled code
No.   Number of   Number of   Analysis Time [msec]          Ratio
      Words       Trees       (1) BUP-XG    (2) LangLAB     (1)/(2)
 1    14           9             20,485         4,134         4.96
 2     4          12              2,467         1,299         1.90
 3     3           7              4,783         2,284         2.09
 4     1          10              2,884         1,566         1.84
 5     3          11              4,383         1,917         2.29
 6     4          18             18,768         4,500         4.17
 7     9          21            127,400        14,000         9.10
 8     2          19             13,450         4,450         3.02
 9     4          17             59,468         8,216         7.24
10     1          25             23,650         5,801         4.08
average                                                       4.07
Results showed that in comparison to the original BUP-XG system, the analysis sped up about 10 times in the interpretive mode (average ratio 9.69) and about 4 times in the compiled mode (average ratio 4.07). The optimization is less effective in the compiled mode than in the interpretive mode. However, this optimization is practical because debugging is usually done in the interpretive mode. We believe that LangLAB has the capacity for practical use.
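The reported ratios follow directly from the measured times. The short Python check below recomputes them from the values in tables 1 and 2; the helper names are chosen for this example, and the average is taken over the ten per-sentence ratios.

```python
# (BUP-XG msec, LangLAB msec) for sentences 1..10, copied from tables 1 and 2.
INTERPRETED = [(80415, 8552), (18868, 2700), (46700, 4983), (30900, 3600),
               (39634, 4050), (95933, 9550), (323167, 26183), (87550, 9349),
               (180300, 15816), (116284, 12083)]
COMPILED = [(20485, 4134), (2467, 1299), (4783, 2284), (2884, 1566),
            (4383, 1917), (18768, 4500), (127400, 14000), (13450, 4450),
            (59468, 8216), (23650, 5801)]

def ratios(rows):
    """Per-sentence speed-up (1)/(2), rounded to two decimals as in the tables."""
    return [round(before / after, 2) for before, after in rows]

def average_ratio(rows):
    """Mean of the unrounded per-sentence ratios, rounded to two decimals."""
    return round(sum(before / after for before, after in rows) / len(rows), 2)
```

The per-sentence speed-up ranges from about 7 to 12 in the interpretive mode and from about 2 to 9 in the compiled mode, so the "10 times" and "4 times" figures are averages rather than uniform gains.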
There is a related work, SAX [6], by Matsumoto. SAX is also a parsing system based on logic programming, but its parsing strategy is bottom-up and breadth-first. Okunishi of ICOT reports that LangLAB is 6 ~ d0 times faster than SAX in the interpretive mode. However, in the compiled mode, SAX is 6 ~ 16 times faster than LangLAB [11]. SAX has yet to be modified to handle idioms. If this modification is introduced, debugging can be done on LangLAB in the interpretive mode and the debugged grammar can be executed on SAX in the compiled mode.
5 Conclusion
We have made the following modifications to the original BUP-XG:
• Optimized and enhanced translated code
• Adopted TRIE structured dictionary
With these modifications, the analysis sped up in comparison to the original BUP-XG system and flexible idiom handling became possible. We believe that LangLAB has become a more powerful and practical tool for natural language processing. We plan to develop a natural language processing system which includes semantic analysis, based on LangLAB.

References 

[1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman. Data Structures and Algorithms. Addison-Wesley, 1983.

[2] A. Colmerauer. Metamorphosis grammar. In Natural Language Communication with Computers, pages 133-190, Springer-Verlag, 1978.

[3] G. Gazdar and A. F. Pullum. Generalized Phrase Structure Grammar: A Theoretical Synopsis. Indiana University Linguistics Club, 1982.

[4] M. Gross. Lexicon-grammar: the representation of compound words. In COLING '86, pages 1-6, 1986.

[5] S. Konno and H. Tanaka. Processing left-extraposition in bottom up parsing system. Computer Software, 3(2):115-125, 1986. (in Japanese).

[6] Y. Matsumoto and R. Sugimura. A parsing system based on logic programming. In IJCAI '87, pages 671-674, 1987.

[7] M. McCord. Natural language processing in Prolog. In Adrian Walker, editor, Knowledge Systems and Prolog, chapter 5, pages 291-402, Addison-Wesley, 1987.

[8] F. Pereira. Extraposition grammar. American Journal of Computational Linguistics, 7(4):243-256, 1981.

[9] F. Pereira and D. Warren. Definite clause grammar for language analysis: a survey of the formalism and a comparison with augmented transition networks. Artificial Intelligence, 13(3):231-278, 1980.

[10] J. R. Ross. Constraints on variables in syntax. In On Noam Chomsky: Critical Essays, Anchor Books, 1974.

[11] T. Okunishi, et al. Comparison of logic programming based natural language parsing systems. In 2nd International Workshop on Natural Language Understanding and Logic Programming, pages 90-102, 1987.

[12] T. Winograd. Language as a Cognitive Process. Volume 1: Syntax, Addison-Wesley, 1983.

[13] W. A. Woods. Experimental parsing system for transition network grammar. In Natural Language Processing, Algorithmic Press, 1971.

[14] Y. Matsumoto, et al. BUP: a bottom-up parser embedded in Prolog. New Generation Computing, 1(2):145-158, 1983.
