Disambiguation of Super Parts of Speech (or Supertags):
Almost Parsing
Aravind K. Joshi and B. Srinivas
Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104, USA
{joshi, srini}@linc.cis.upenn.edu
Abstract: In a lexicalized grammar formalism such as Lexicalized
Tree-Adjoining Grammar (LTAG), each lexical item is associated with at
least one elementary structure (supertag) that localizes syntactic and
semantic dependencies. Thus a parser for a lexicalized grammar must
search a large set of supertags to choose the right ones to combine for
the parse of the sentence. We present techniques for disambiguating
supertags using local information such as lexical preference and local
lexical dependencies. The similarity between LTAG and Dependency
grammars is exploited in the dependency model of supertag
disambiguation. The performance results for various models of supertag
disambiguation such as unigram, trigram and dependency-based models
are presented.
1 Introduction
Part-of-speech disambiguation techniques (taggers) are often used to
eliminate (or substantially reduce) the part-of-speech ambiguity prior
to parsing. The taggers are all local in the sense that they use
information from a limited context in deciding which tag(s) to choose
for each word. As is well known, these taggers are quite successful.
In a lexicalized grammar such as the Lexicalized Tree-Adjoining Grammar
(LTAG), each lexical item is associated with at least one elementary
structure (tree). The elementary structures of LTAG localize
dependencies, including long distance dependencies, by requiring that
all and only the dependent elements be present within the same
structure. As a result of this localization, a lexical item may be
(and, in general, almost always is) associated with more than one
elementary structure. We will call these elementary structures
supertags, in order to distinguish them from the standard
part-of-speech tags. Note that even when a word has a unique standard
part-of-speech, say a verb (V), there will usually be more than one
supertag associated with this word. Since there is only one supertag
for each word when the parse is complete (assuming there is no global
ambiguity), an LTAG parser (Schabes, 1988) needs to search a large
space of supertags to select the right one for each word before
combining them for the parse of a sentence. It is this problem of
supertag disambiguation that we address in this paper.
Since LTAGs are lexicalized, we are presented with a novel opportunity
to eliminate or substantially reduce the supertag assignment ambiguity
by using local information such as local lexical dependencies, prior to
parsing. As in standard part-of-speech disambiguation, we can use local
statistical information in the form of n-gram models based on the
distribution of supertags in an LTAG parsed corpus. Moreover, since the
supertags encode dependency information, we can also use information
about the distribution of distances between a given supertag and its
dependent supertags.
Note that, as in standard part-of-speech disambiguation, supertag
disambiguation could have been done by a parser. However, carrying out
part-of-speech disambiguation prior to parsing makes the job of the
parser much easier and therefore speeds it up. Supertag disambiguation
as proposed in this paper reduces the work of the parser even further.
After supertag disambiguation, we have effectively completed the parse
and the parser need 'only' combine the individual structures; hence the
term almost parsing. This method can also be used to parse sentence
fragments in cases where the supertag sequence after the disambiguation
may not combine into a single structure.
The main goal of this paper is to present techniques for disambiguating
supertags, and to evaluate their performance and their impact on LTAG
parsing. Although presented with respect to LTAG, these techniques are
applicable to lexicalized grammars in general. Section 2 provides an
introduction to Lexicalized Tree Adjoining Grammars. The objective of
supertag disambiguation is illustrated through an example in Section 3.
Section 4 briefly describes the system used to collect the data needed
for supertag disambiguation. Various methods and their performance
results for supertag disambiguation are discussed in detail in
Section 5.
2 Lexicalized Tree Adjoining Grammars
Lexicalized Tree Adjoining Grammar (LTAG) is a lexicalized tree
rewriting grammar formalism. The primary structures of LTAG are
ELEMENTARY TREES. Each elementary tree has a lexical item (anchor) on
its frontier and provides an extended domain of locality over which the
anchor specifies syntactic and semantic (predicate-argument)
constraints. Elementary trees are of two kinds: INITIAL TREES and
AUXILIARY TREES. Examples of initial trees (αs) and auxiliary trees
(βs) are shown in Figure 1. Nodes on the frontier of initial trees are
marked as substitution sites by a '↓', while exactly one node on the
frontier of an auxiliary tree, whose label matches the label of the
root of the tree, is marked as a foot node by a '*'. The other nodes on
the frontier of an auxiliary tree are marked as substitution sites.
LTAG factors recursion from the statement of the syntactic
dependencies. Elementary trees (initial and auxiliary) are the domain
for specifying dependencies. Recursion is specified via the auxiliary
trees. Elementary trees are combined by the Substitution and Adjunction
operations. Substitution inserts elementary trees at the substitution
nodes of other elementary trees. Adjunction inserts auxiliary trees
into elementary trees at the node whose label is the same as the root
label of the auxiliary tree. As an example, the component trees (α8,
α2, α3, α4, β8, α5, α6), shown in Figure 1, can be combined to form the
sentence John saw a man with the telescope¹ as follows:
1. α8 substitutes at the NP0 node in α2.
2. α3 substitutes at the DetP node in α4, the result of which is
   substituted at the NP1 node in α2.
3. α5 substitutes at the DetP node in α6, the result of which is
   substituted at the NP node in β8.
4. The result of step (3) above adjoins to the VP node of the result of
   step (2). The resulting parse tree is shown in Figure 2(a).
The process of combining the elementary trees resulting in the parse of
the sentence is represented by the derivation tree, shown in
Figure 2(b). The nodes of the derivation tree are the tree names that
are anchored by the appropriate lexical item. The composition operation
is indicated by the nature of the arcs: dashed line for substitution
and bold line for adjunction, while the address of the operation is
indicated as part of the node label. The derivation tree can also be
interpreted as a dependency graph with unlabeled arcs between words of
the sentence as shown in Figure 2(c).
We will call the elementary structures associated with each lexical
item super parts-of-speech (super POS) or supertags.
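As an illustration of how the derivation tree doubles as a dependency graph, the four composition steps above can be recorded in a small data structure. This is only a sketch: the class and function names are hypothetical, the pairing of tree names with anchors is read off Figure 2(b), and node addresses are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class DerivationNode:
    """One node of a derivation tree: a supertag anchored by a word."""
    supertag: str            # tree name, e.g. "alpha2"
    anchor: str              # lexical anchor, e.g. "saw"
    operation: str = ""      # "subst" or "adjoin"; empty for the root
    children: list = field(default_factory=list)

    def attach(self, child, operation):
        """Record that `child` was substituted or adjoined into this tree."""
        child.operation = operation
        self.children.append(child)
        return child


def dependency_arcs(node):
    """Read the derivation tree as unlabeled (head, dependent) word arcs."""
    arcs = []
    for child in node.children:
        arcs.append((node.anchor, child.anchor))
        arcs.extend(dependency_arcs(child))
    return arcs


# Steps 1-4 of the example derivation (illustrative encoding).
saw = DerivationNode("alpha2", "saw")
saw.attach(DerivationNode("alpha8", "John"), "subst")             # step 1
man = saw.attach(DerivationNode("alpha4", "man"), "subst")        # step 2
man.attach(DerivationNode("alpha3", "a"), "subst")                # step 2
with_pp = saw.attach(DerivationNode("beta8", "with"), "adjoin")   # step 4
telescope = with_pp.attach(DerivationNode("alpha6", "telescope"), "subst")  # step 3
telescope.attach(DerivationNode("alpha5", "the"), "subst")        # step 3

arcs = dependency_arcs(saw)
```

Dropping the operation labels and addresses from this structure leaves exactly the word-to-word arcs of a dependency graph, which is the property the dependency model of Section 5.3 relies on.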
3 Example of Supertagging
As a result of localization in LTAG, a lexical item may be associated
with more than one supertag. The example in Figure 3 illustrates the
initial set of supertags assigned to each word of the sentence John saw
a man with the telescope. The order of the supertags for each lexical
item in the example is not significant. Figure 3 also shows the final
supertag sequence assigned by the supertagger, which picks the best
supertag sequence using statistical information (described in
Section 5) about individual supertags and their dependencies on other
supertags. The chosen supertags are combined to derive a parse, as
explained in Section 2.
Without the supertagger, the parser would have to process combinations
of the entire set of trees (28); with it the parser must only process
combinations of 7 trees.

¹ The parse with the PP attached to the NP has not been shown.

[Figure 1: Elementary trees of LTAG]
[Figure 2: Structures of LTAG: (a) Parse Tree, (b) Derivation Tree,
(c) Dependency Graph]
[Figure 3: Supertag Assignment for John saw a man with the telescope.
Sentence: John saw a man with the telescope.
Final Assignment: α8 α2 α3 α4 β8 α5 α6]
4 Data Collection
The data required for disambiguating supertags (discussed in Section 5)
have been collected by parsing the Wall Street Journal², IBM-manual and
ATIS corpora using the wide-coverage English grammar being developed as
part of the XTAG system (Doran et al., 1994). The parses generated by
the system for these sentences from the corpora are not subjected to
any kind of filtering or selection. All the derivation structures are
used in the collection of the statistics.
4.1 About XTAG
XTAG is a large ongoing project to develop a wide-coverage grammar for
English, based on the LTAG formalism. It also serves as an LTAG grammar
development system and consists of a predictive left-to-right parser,
an X-window interface, a morphological analyzer and a part-of-speech
tagger. The wide-coverage English grammar of the XTAG system contains
317,000 inflected items in the morphology (213,000 of these are nouns
and 46,500 are verbs) and 37,000 entries in the syntactic lexicon. The
syntactic lexicon associates words with the trees that they anchor.
There are 385 trees in all, in a grammar which is composed of 40
different subcategorization frames. Each word in the syntactic lexicon,
on the average, depending on the standard parts-of-speech of the word,
is an anchor for up to about 40 elementary trees.
5 Models, Experiments and Results
The supertag statistics which have been used in the preliminary
experiments described below have been collected from the XTAG parsed
corpora. The derivation structures resulting from parsed corpora (Wall
Street Journal, for the experiments described here) serve as training
data for these experiments.

² Sentences of length < 15 words
5.1 Unigram model
One method of disambiguating the supertags assigned to each word is to
order the supertags by the lexical preference that the word has for
them. The frequency with which a certain supertag is associated with a
word is a direct measure of its lexical preference for that supertag.
Associating frequencies with the supertags and using them to associate
a particular supertag with a word is clearly the simplest means of
disambiguating supertags. Thus,

$$\mathrm{Supertag}(w_i) = t_k \;\text{such that}\; k = \arg\max_{k} \mathrm{unigram}(t_k \mid w_i)$$
5.1.1 Experiments and Results
Owing to sparseness of data, we have backed-off from word/supertag
pairs to part-of-speech/supertag pairs, i.e., collected the unigram
frequencies of supertags associated with the part-of-speech assigned to
words instead of the words themselves. Table 1 illustrates the nature
of the statistics used, with a few sample entries.

[Table 1: Sample entries of unigram database. Columns: Part of Speech;
(Supertag, unigram probability). Rows: N, V, D]

[Table 2: Results from the Unigram Supertag Model. Columns: Top n
Supertags; % Success]

The words are first assigned standard parts-of-speech using a
conventional tagger (Church, 1988). Then the set of supertags
associated with each word is retrieved from XTAG's syntactic database.
These supertags are ordered based on their unigram frequency, and the
top n supertags are associated with the word. Table 2 summarizes the
success percentage on a held out test set of 100 Wall Street Journal
sentences, as n is varied. If a sentence parses using the n supertags
selected for each word then the assignment is considered a success.
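The ranking procedure just described can be sketched in a few lines. This is a minimal illustration, not the XTAG implementation: the function names are hypothetical, the counts are toy values, and the candidate list stands in for the set retrieved from the syntactic database.

```python
from collections import Counter, defaultdict


def train_unigram(tagged_corpus):
    """Count supertags per part-of-speech from (pos, supertag) pairs."""
    counts = defaultdict(Counter)
    for pos, supertag in tagged_corpus:
        counts[pos][supertag] += 1
    return counts


def top_n_supertags(counts, pos, candidates, n):
    """Rank a word's candidate supertags by the unigram frequency observed
    for its part-of-speech and keep the n most frequent ones."""
    freq = counts[pos]
    ranked = sorted(candidates, key=lambda tag: freq[tag], reverse=True)
    return ranked[:n]


# Toy training pairs (hypothetical counts, not taken from the paper).
corpus = [("V", "alpha2"), ("V", "alpha2"), ("V", "beta3"),
          ("N", "alpha4"), ("N", "alpha4"), ("N", "alpha6"), ("N", "beta5")]
counts = train_unigram(corpus)

# Candidate supertags for a verb, as a lexicon lookup might return them.
best = top_n_supertags(counts, "V", ["beta3", "alpha2", "alpha9"], n=2)
```

Because the statistics are indexed by part-of-speech rather than by word, every verb receives the same ranking over its candidates; only the candidate set retrieved for the word differs.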
The unigram supertagger that selects the top three supertags has been
interfaced with XTAG. This speeds the runtime of the parser by 87% on
the average, whenever the supertagger succeeds.

[Table 3: Dependency Data. Columns: (P.O.S, Supertag); Direction of
Dependent Supertag; Dependent Supertag; Ordinal position; Prob]
5.2 n-gram model
In a unigram model a word is always associated with the supertag that
is most preferred by the word, irrespective of the context in which the
word appears. An alternate method that is sensitive to context is the
n-gram model. The n-gram model takes into account the contextual
dependency probabilities between supertags within a window of n words
in associating supertags with words. Thus the most probable supertag
sequence for an N word sentence is given by

$$\hat{T} = \arg\max_{T} \Pr(T_1, T_2, \ldots, T_N) \cdot \Pr(W_1, W_2, \ldots, W_N \mid T_1, T_2, \ldots, T_N)$$

To compute this using only local information, we approximate, taking
the probability of a word to depend only on its supertag

$$\Pr(W_1, W_2, \ldots, W_N \mid T_1, T_2, \ldots, T_N) \approx \prod_{i=1}^{N} \Pr(W_i \mid T_i)$$

and also use an n-gram (trigram, in this case) approximation

$$\Pr(T_1, T_2, \ldots, T_N) \approx \prod_{i=1}^{N} \Pr(T_i \mid T_{i-2}, T_{i-1})$$
5.2.1 Experiments and Results
A trigram model has been used to model the contextual dependencies in
supertag sequences. Again, due to sparseness of data, the particular
words have been ignored and the training of the trigram model has been
done on the part-of-speech/supertag pair. The model has been tested on
the same set of held out sentences as in the unigram experiment. The
percentage success is 68%, i.e., 68% of the words of the test corpus
were assigned the correct supertag.
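To make the trigram formulation of Section 5.2 concrete, the sketch below scores a candidate supertag sequence as the sum of log trigram and log emission probabilities and picks the best sequence by exhaustive search. Everything here is an assumption for illustration: the paper does not describe its decoding procedure, the probability tables are toy values, the `floor` smoothing constant and the `<s>` padding symbol are invented, and a practical tagger would use dynamic programming rather than enumerating all sequences. Following the back-off described above, the observed symbols are parts-of-speech rather than words.

```python
import itertools
import math


def sequence_log_prob(tags, observed, trans, emit, floor=1e-6):
    """log of prod_i Pr(T_i | T_{i-2}, T_{i-1}) * Pr(W_i | T_i).

    `trans` maps ((T_{i-2}, T_{i-1}), T_i) to a probability and `emit`
    maps (T_i, W_i) to a probability; unseen events get `floor`.
    """
    padded = ["<s>", "<s>"] + list(tags)
    score = 0.0
    for i, (tag, symbol) in enumerate(zip(tags, observed)):
        context = (padded[i], padded[i + 1])   # the two preceding supertags
        score += math.log(trans.get((context, tag), floor))
        score += math.log(emit.get((tag, symbol), floor))
    return score


def best_sequence(observed, candidates, trans, emit):
    """Exhaustive argmax over all candidate sequences (exponential cost)."""
    options = (candidates[symbol] for symbol in observed)
    return max(itertools.product(*options),
               key=lambda tags: sequence_log_prob(tags, observed, trans, emit))


# Toy model over two positions (hypothetical numbers).
observed = ["D", "N"]
candidates = {"D": ["alpha3", "beta1"], "N": ["alpha4", "beta5"]}
trans = {
    (("<s>", "<s>"), "alpha3"): 0.6, (("<s>", "<s>"), "beta1"): 0.4,
    (("<s>", "alpha3"), "alpha4"): 0.7, (("<s>", "alpha3"), "beta5"): 0.3,
    (("<s>", "beta1"), "alpha4"): 0.5, (("<s>", "beta1"), "beta5"): 0.5,
}
emit = {("alpha3", "D"): 0.9, ("beta1", "D"): 0.1,
        ("alpha4", "N"): 0.8, ("beta5", "N"): 0.2}

best = best_sequence(observed, candidates, trans, emit)
```

Working in log space turns the two products into sums, which avoids numerical underflow on longer sentences.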
5.3 Dependency model
In the n-gram model for disambiguating supertags, dependencies between
supertags that appear beyond the n word window cannot be incorporated
into the model. This limitation can be overcome if no a priori bound is
set on the size of the window but instead a probability distribution of
the distances of the dependent supertags for each supertag is
maintained. A supertag is dependent on another supertag if the former
substitutes or adjoins into the latter³.
5.3.1 Experiments and Results
Table 3 shows the data required for the dependency model of supertag
disambiguation. Ideally each entry would be indexed by a (word,
supertag) pair but, due to sparseness of data, we have backed-off to a
(POS, supertag) pair. Each entry contains the following information.
• POS and Supertag pair.
• List of + and -, representing the direction of the dependent
  supertags with respect to the indexed supertag. (Size of this list
  indicates the total number of dependent supertags required.)
• Dependent supertag.
• Signed number representing the direction and the ordinal position of
  the particular dependent supertag mentioned in the entry from the
  position of the indexed supertag.
• A probability of occurrence of such a dependency. The sum probability
  over all the dependent supertags at all ordinal positions in the same
  direction is one.

³ We are computing dependencies between words with respect to supertags
associated with the words, although the complete structure of the
supertags is not used. It is of interest to compare our work with some
other dependency-based approaches as described by, for example, Sleator
(Sleator and Temperley, 1991), Hindle (Hindle, 1993), Milward (Milward,
1992).
For example, the fourth entry in Table 3 reads that the tree α2,
anchored by a verb (V), has a left and a right dependent (-, +) and the
first word to the left (-1), with the tree α8, is dependent on the
current word. The strength of this association is represented by the
probability 0.300.
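One way to picture an entry of the dependency data is as a small record with the five fields listed above. The sketch below is an assumed encoding: the class name, field names and helper function are hypothetical, and only the fourth entry described in the text is filled in.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DependencyEntry:
    """One row of the dependency data, indexed by a (POS, supertag) pair."""
    pos: str                       # part-of-speech of the indexed supertag
    supertag: str                  # indexed supertag
    directions: Tuple[str, ...]    # one "+" or "-" per required dependent
    dependent: str                 # dependent supertag
    offset: int                    # signed ordinal position of the dependent
    prob: float                    # probability of this dependency

    def direction(self):
        """Direction of this particular dependent: "-" is left, "+" is right."""
        return "-" if self.offset < 0 else "+"


def direction_mass(entries, pos, supertag, direction):
    """Total probability of all entries for one (POS, supertag) in one
    direction; per the description above this should sum to one over the
    full data."""
    return sum(e.prob for e in entries
               if (e.pos, e.supertag) == (pos, supertag)
               and e.direction() == direction)


# The fourth entry as described in the text: tree alpha2 anchored by a verb,
# one left and one right dependent, alpha8 on the first word to the left.
fourth = DependencyEntry("V", "alpha2", ("-", "+"), "alpha8", -1, 0.300)
```

The length of `directions` gives the number of dependents the indexed supertag requires, so a lookup keyed on `(pos, supertag)` is enough to know when a supertag has been saturated.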
The dependency model of disambiguation works as follows. Suppose α2 is
a member of the set of supertags associated with a word at position n
in the sentence. The algorithm proceeds to satisfy the dependency
requirement of α2 by picking up the dependency entries for each of the
directions. It picks a dependency data entry (the fourth entry, say)
from the database that is indexed by α2 and proceeds to set up a path
with the first word to the left that has the dependent supertag (α8) as
a member of its set of supertags. If the first word to the left that
has α8 as a member of its set of supertags is at position m, then an
arc is set up between α2 and α8. Also, the arc is verified not to
kite-string-tangle⁴ with any other arcs in the path up to α2. The path
probability up to α2 is incremented by log 0.300 to reflect the success
of the match. The path probability up to α8 incorporates the unigram
probability of α8. On the other hand, if no word is found that has α8
as a member of its set of supertags then the entry is ignored. The
algorithm makes a greedy choice by selecting the path with the maximum
path probability to extend to the remaining directions in the
dependency list. A successful supertag sequence is one which assigns a
supertag to each position such that each supertag has all of its
dependents and maximizes the accumulated path probability. It is to be
noted that the algorithm, when pairing the head and its dependent, is
not really parsing since it does so even without looking at the
structure of the string between the head and the dependent.
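A single matching step of this procedure can be sketched as follows. This covers only one arc attempt, not the full greedy search, and it rests on assumptions: the function names are hypothetical, word positions are 0-based indices, the crossing test follows the paper's footnoted definition of kite-string-tangle, and a tangling arc is treated as a failed match because the text does not say what happens in that case.

```python
import math


def tangles(arc1, arc2):
    """Arcs (a, c) and (b, d) tangle if a < b < c < d or b < a < d < c."""
    a, c = sorted(arc1)
    b, d = sorted(arc2)
    return a < b < c < d or b < a < d < c


def find_dependent(candidates, head_pos, dependent, offset):
    """Position of the |offset|-th word to the left (offset < 0) or right
    (offset > 0) of head_pos whose supertag set contains `dependent`."""
    step = -1 if offset < 0 else 1
    remaining = abs(offset)
    pos = head_pos + step
    while 0 <= pos < len(candidates):
        if dependent in candidates[pos]:
            remaining -= 1
            if remaining == 0:
                return pos
        pos += step
    return None


def try_attach(candidates, arcs, log_prob, head_pos, dependent, offset, prob):
    """Try to extend a path with one dependency arc.

    Returns the path unchanged if no matching word exists (the entry is
    ignored) or if the new arc would tangle with an existing one (assumed).
    """
    dep_pos = find_dependent(candidates, head_pos, dependent, offset)
    if dep_pos is None:
        return arcs, log_prob
    new_arc = (head_pos, dep_pos)
    if any(tangles(new_arc, arc) for arc in arcs):
        return arcs, log_prob
    return arcs + [new_arc], log_prob + math.log(prob)


# Candidate supertag sets for "John saw a man" (toy sets for illustration).
candidates = [{"alpha8", "alpha1"}, {"alpha2"}, {"alpha3"}, {"alpha4"}]

# Apply the entry for alpha2: dependent alpha8, first word to the left, 0.300.
arcs, log_prob = try_attach(candidates, [], 0.0, head_pos=1,
                            dependent="alpha8", offset=-1, prob=0.300)
```

Note that `find_dependent` inspects only the supertag sets of the intervening words, never their structure, which mirrors the remark that this pairing is not really parsing.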
The implementation and testing of this model of supertag disambiguation
is underway. Table 4 shows preliminary results on the same held out
test set of 100 Wall Street Journal sentences that was used in the
unigram and trigram models. The table shows two measures of evaluation.
In the first, the dependency link measure, the test sentences were
independently hand tagged with dependency links and then were used to
match the links output by the dependency model. The columns show the
total number of dependency links in the hand tagged set, the number of
matched links output by this model and the percentage correctness. The
second measure, supertags, shows the total number of correct supertags
assigned to the words in the corpus by this model.

⁴ Two arcs (a,c) and (b,d) kite-string-tangle, if a < b < c < d or
b < a < d < c.

Table 4: Results of Dependency model

Criterion          Total number   Number correct   % correct
Dependency links   [illegible]    [illegible]      [illegible]
Supertags          915            707              77.26%
6 Conclusion
Lexicalized grammars associate with each word richer structures (trees
in case of LTAGs and categories in case of Combinatory Categorial
Grammars (CCGs)) over which the word specifies syntactic and semantic
constraints. Hence every word is associated with a much larger set of
more complex structures than in the case where the words are associated
with standard parts-of-speech. However, these more complex descriptions
allow more complex constraints to be imposed and verified locally on
the contexts in which these words appear. This feature of lexicalized
grammars can be taken advantage of, to further reduce the
disambiguation task of the parser, as shown in supertag disambiguation.
Hence supertag disambiguation can be used as a general pre-parsing
component of lexicalized grammar parsers.
The degree of distinction between supertag disambiguation and parsing
varies, depending on the lexicalized grammar being considered. For both
LTAG and CCG, supertag disambiguation serves as a pre-parser filter
that effectively weeds out inappropriate elementary structures (trees
or categories) given the context of the sentence. It also indicates the
dependencies among the elementary structures but not the specific
operation to be used to combine the structures or the address at which
the operation is to be performed: "an almost parse". In cases where the
supertag sequence for the given input string cannot be combined to form
a complete structure, the "almost parse" may indeed be the best one can
do.
In case of LTAG, even though no explicit substitutions or adjunctions
are shown, the dependencies among LTAG trees uniquely identify the
combining operation between the trees, and the node at which the
operation can be performed is almost always unique⁵. Thus supertag
disambiguation is almost parsing for LTAGs. In contrast, the
dependencies among the CCG categories do not result in directly
identifying the combining operations between the categories since two
categories can often be combined in more than one way. Hence for CCG
further processing needs to be performed to obtain the complete parse
of the sentence, although without any supertag ambiguities.
The supertag disambiguation, dependency model in particular, is even
closer to parsing in dependency grammar formalism. Dependency parsers
establish relationships among words, unlike the phrase-structure
parsers which construct a phrase-structure tree spanning the words of
the input. Since LTAGs are lexicalized and each elementary tree is
associated with at least one lexical item, the supertag disambiguation
for LTAG can therefore be viewed as establishing the relationship⁶
among words as dependency parsers do. Then the elementary structures
that the related words anchor are combined to reconstruct the
phrase-structure tree similar to the result of phrase-structure
parsers. Thus the interplay of both dependency and phrase-structure
grammars can be seen in LTAGs. Rambow and Joshi (Rambow and Joshi,
1993) discuss in greater detail the use of LTAG in relating dependency
analyses to phrase-structure analyses and propose a dependency-based
parser for a phrase-structure based grammar.
In summary, we have presented a new technique that performs the
disambiguation of supertags using local information such as lexical
preference and local lexical dependencies. This technique, like
part-of-speech disambiguation, reduces the disambiguation task that
needs to be done by the parser. After the disambiguation, we have
effectively completed the parse of the sentence and the parser needs
'only' to complete the adjunctions and substitutions. This method can
also serve to parse sentence fragments in cases where the supertag
sequence after the disambiguation may not combine to form a single
structure. We have implemented this technique of disambiguation using
the n-gram models, with the probability data collected from an LTAG
parsed corpus. The similarity between LTAG and Dependency grammars is
exploited in the dependency model of supertag disambiguation. The
performance results of these models have been presented.

⁵ In some cases, the dependency information between an auxiliary and an
elementary tree may be insufficient to uniquely identify the address of
adjunction, if the auxiliary tree can adjoin to more than one node in
the elementary tree, since the specific attachments are not shown.
⁶ The relational labels between two words in LTAG are associated with
the address of the operation between the trees that the words anchor.
References
Church, K. (1988). A Stochastic Parts Program and Noun Phrase Parser
for Unrestricted Text. In 2nd Applied Natural Language Processing
Conference, 1988.
Doran, C., Egedi, D., Hockey, B.A. and Srinivas, B. (1994). XTAG
Technical Report. Department of Computer and Information Sciences,
University of Pennsylvania, Philadelphia, PA. In progress.
Hindle, D. (1993). Prediction of Lexicalized Tree Fragments in Text.
ARPA Workshop on Human Language Technology, March 1993.
Milward, D. (1992). Dynamics, Dependency Grammars and Incremental
Interpretation. In Proceedings of the 14th International Conference on
Computational Linguistics (COLING'92), Nantes, France, August.
Rambow, O. and Joshi, A.K. (1993). Dependency Parsing for
Phrase-Structure Grammars. Manuscript, University of Pennsylvania.
Sleator, D. and Temperley, D. (1991). Parsing English with a Link
Grammar. Technical report CMU-CS-91-196, Department of Computer
Science, Carnegie Mellon University, 1991.
Schabes, Y., Abeillé, A. and Joshi, A.K. (1988). Parsing strategies
with 'lexicalized' grammars: Application to tree adjoining grammars. In
Proceedings of the 12th International Conference on Computational
Linguistics (COLING'88), Budapest, Hungary, August.
Schabes, Y. (1990). Mathematical and Computational Aspects of
Lexicalized Grammars. Ph.D. thesis, University of Pennsylvania,
Philadelphia, PA, August. Available as technical report (MS-CIS-90-48,
LINC LAB179) from the Department of Computer and Information Science.
