Integrating Shallow Linguistic Processing into a Unification-based
Spanish Grammar
Montserrat Marimon
gilcUB
Grup d'InvestigacioenLingustica Computacional
Universitat de Barcelona
montse@gilc.ub.es
Abstract
This paper describes to what extent deep processing may benefit
from shallow processing techniques and it presents a NLP system
which integrates a linguistic PoS tagger and chunker as a
preprocessing module of a broad-coverage unification-based grammar
of Spanish. Experiments show that the efficiency of the overall
analysis improves significantly and that our system also provides
robustness to the linguistic processing, while maintaining both the
accuracy and the precision of the grammar.
1 Introduction
Deep linguistic processing produces a complete syntactic and
semantic analysis of the sentences it processes; however, it fails
to produce a result when the linguistic structure being processed
and/or words in the input sentences fall beyond the coverage of the
grammatical resources. Natural Language Processing (NLP) systems
with monolithic grammars, in addition, have to deal with a huge
search space due to several sources of non-determinism (i.e.
ambiguity). This is particularly true of broad-coverage
unification-based grammars where all dimensions of linguistic
information are interleaved, as theories such as HPSG propose. Lack
of robustness and inefficient processing make such systems
inadequate for practical applications, e.g. Natural Language
Interfaces (NLI).
This paper presents a NLP system which integrates a linguistic (as
opposed to data-driven) Part-of-Speech (PoS) tagger and chunker as
a preprocessing module of a broad-coverage unification-based
grammar of Spanish.
By integrating shallow and deep processing the efficiency of the
overall analysis process improves significantly, since we can
release the parser from certain tasks that may be efficiently and
reliably dealt with by computationally less expensive techniques.
The integration of shallow processing, in addition, provides the
unification-based grammar with larger coverage for syntactic
structures and allows us to implement default lexical entry
templates for virtually unlimited lexical coverage while avoiding
an increase in ambiguity.
The system we present is inspired by (Abney, 1992) and it is in
accordance with (Srinivas et al., 1997; Ciravegna and Lavelli,
1997; Yoon et al., 1999; Venkova, 2000; Watanabe, 2000; Prins and
van Noord, 2001; Grover and Lascarides, 2001; Crysmann et al.,
2002).
In the following section we briefly present the unification-based
grammar. Section 3 describes Latch, the linguistic tagger and
chunker. Section 4 discusses the extensions required by our system
in order to transfer the information delivered by the tagger and
chunker into the grammar. In section 5 we describe the default
lexical entries we have defined. Results on the system performance
are provided in section 6. The paper ends by presenting the general
conclusions.
2 The Unification-based Grammar
The development of the grammar that served as the basis of our
research work was done in the framework of the Advanced Language
Engineering Platform (ALEP) (Simpkins et al., 1993) during the
project LS-GRAM (LRE 61029) (Schmidt et al., 1996) and it was used
in the project MELISSA (ESPRIT 22252) (Bredenkamp et al., 1998) for
the first time in an industrial context. The grammar is currently
being used in the project IMAGINE (IST-2000-29490). The main goal
of the IMAGINE project is to develop software technology that
allows the interaction with e-business applications by using a
multi-lingual NLI from mobile devices and other appliances.[1]
2.1 Coverage of the Grammar
The range of linguistic phenomena that the grammar handles
includes: all types of subcategorization structures, determination
(simple and complex), a full coverage of agreement (subject-verb,
subject-attribute, agreement within the NP), null-subjects
(pro-drop, impersonal sentences), compound tenses and periphrastic
forms, clausal complements (completive clauses and indirect
questions), control and raising structures, support verb
constructions, passive constructions (with the copula, with or
without the `by-agent' complement, and reflexive passive),
modifiers of verbs, nouns, adjectives and adverbs, negation,
sentential adjuncts, topicalization, relative and interrogative
clauses, surface word order variation, coordination (binary,
enumeration and coordination of unlike categories), clitics
(clitic-NP alternation, clitic doubling, clitic climbing,
enclitics), NPs with no noun-head, non-sentential input strings and
special constructions (numbers, dates, ...).
2.2 The ALEP Architecture
ALEP distinguishes preprocessing operations and linguistic
processing operations. The former, Text Handling (TH) and
morphographemic analyses, account for surface properties of input
text (document formatting, delimitation of textual structural
elements, orthographemic aspects of morphology), while the latter,
parsing and refinement, deal with its non-surface properties
(morphosyntactic analysis, constituent structure, semantic
representation).[2] A special rule-based operation, Lifting,
interfaces the output of the preprocessing operation with the
parsing operation.
2.3 The ALEP Linguistic Formalism
The ALEP linguistic formalism has been developed on the basis of
the specifications resulting from the ET-6 design study (Alshawi et
al., 1991). It is a so-called "lean" formalism compilable into
first-order (Prolog) terms, thus avoiding computationally expensive
formal devices.

[1] See http://www.rtd.softwareag.es/imagine.

[2] A distinctive feature of the ALEP processing architecture is
the division of the analysis task into two sub-tasks: `parsing',
which builds up a complete but shallow phrase structure tree, and
`refinement', which traverses the structure top-down, thus
monotonically performing feature decoration, typically with
semantic information.
An ALEP grammar is implemented by specifying lexical entries and
grammar rules, based on a type system that constitutes a monotonic
simple type hierarchy with appropriateness conditions.
Lexical entries are based on the data structure Linguistic
Description (LD), collecting constraints on the type system. The
lexical component of our grammar plays a crucial role in the
grammatical description needed for processing. It is a highly
lexicalized grammar where linguistic phenomena, such as
subject-verb agreement, subcategorization, modification, control
relations, etc., traditionally dealt with by means of specialized
phrase structure rules, are treated in the lexicon. Grammar rules
are thus reduced to a small set of binary-branching context-free
phrase structure rules, which are based on the data structure
Linguistic Structure (LS).[3]

The approach adopted in the grammar we present follows HPSG
proposals (Pollard and Sag, 1994).
3 Latch: The Linguistic Tagger and
Chunker
Latch was first conceived as a lexical disambiguation tool based on
analyses promotion/reduction by means of weighted symbolic context
rules (Porta, 1996).
It is a lean formalism where lexical information, including full
form, lemma and MorphoSyntactic Description (MSD), is expressed by
regular expressions. The pivots of the rules, which specify the
tokens to be disambiguated, are sequences of lexical elements that
receive a vote on their morphosyntactic analyses. Votes may be
positive or negative, to promote or to eliminate them,
respectively. In addition, a precondition may be expressed in the
pivots to specify the type of ambiguity the rule refers to. Linear
generalizations are expressed by means of contextual operators for
immediate, unbounded and constrained unbounded contextual
conditions.
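The voting scheme described above can be illustrated with a minimal Python sketch. It is not the Latch formalism itself: the `Token` and `ContextRule` classes, the rule weights and every tag other than 'Ncfs-' are hypothetical, and only an immediate left-context condition is modelled (no unbounded operators).

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    form: str
    votes: dict  # candidate MSD tag -> accumulated vote


@dataclass
class ContextRule:
    target_msd: str  # regex: pivot analyses that receive the vote
    left_msd: str    # regex: condition on the MSDs of the token to the left
    weight: float    # positive promotes, negative demotes


def apply_rules(tokens, rules):
    """Add each rule's vote to the matching analyses of ambiguous pivots."""
    for i in range(1, len(tokens)):
        pivot, left = tokens[i], tokens[i - 1]
        if len(pivot.votes) < 2:  # precondition: the pivot is ambiguous
            continue
        for rule in rules:
            if not any(re.fullmatch(rule.left_msd, m) for m in left.votes):
                continue
            for msd in pivot.votes:
                if re.fullmatch(rule.target_msd, msd):
                    pivot.votes[msd] += rule.weight
    return tokens


def best_analyses(token):
    """Keep the top-voted analyses; ties remain ambiguous."""
    top = max(token.votes.values())
    return sorted(m for m, v in token.votes.items() if v == top)


# Hypothetical readings: "la" (determiner or pronoun), "casa" (noun or verb).
sentence = [
    Token("la", {"Dafs": 0.0, "Pp3fs": 0.0}),
    Token("casa", {"Ncfs-": 0.0, "Vmip3s": 0.0}),
]
rules = [
    ContextRule(target_msd=r"Nc.*", left_msd=r"D.*", weight=1.0),
    ContextRule(target_msd=r"V.*", left_msd=r"D.*", weight=-1.0),
]
apply_rules(sentence, rules)
print([(t.form, best_analyses(t)) for t in sentence])
```

In this toy run the noun reading of "casa" wins, while "la" stays ambiguous because no rule votes on it; an unresolved token of this kind is what a later component would have to handle.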
[3] Besides phrase structure rules, a set of word structure rules
is applied at the parsing component, performing morphosyntactic
analysis.
In a further development stage, the Latch formalism was extended so
that it can also be used to mark chunks (or intra-clausal partial
constituents) (Abney, 1996) and use that information for PoS
disambiguation. This interaction of PoS disambiguation and partial
parsing considerably reduces the effort needed for writing rules
and improves results (Marimon and Porta, 2000).[4]
4 Integrating PoS Tags and Chunks
into the Grammar
The integration of shallow processing techniques (PoS tagging and
partial parsing) is fully supported by the open architecture of
ALEP, which allows easy integration of external modules.
Our system requires some changes to the default architecture of the
ALEP system, where both the TH system and the morphographemic
analysis component are replaced by a single external preprocessing
module (Latch). It also requires the lifting component to be
extended in order to transfer the information delivered by the
external preprocessing module into the high-level linguistic
processing components. The changes to be made in the high-level
linguistic processing components, however, are very small: word
structure rules have to be extended, but phrase structure rules and
lexical entries can be left untouched.
4.1 Text Structure to Linguistic
Structure Rules
The integration of both the PoS tags and chunk mark-ups delivered
by Latch is done by the lifting component of the ALEP system, which
converts them into data structures suitable for deep linguistic
analysis.

The lifting component is based on a particular set of rules, the
so-called Text Structure to Linguistic Structure (TS-LS) rules.
Three levels are assumed at the lifting component (`M', `W' and
`S'), which in the default architecture of the system were
converted into LDs representing morphemes, full forms, and the top
node establishing the axiom of the grammar.[5][6] Structure rules,
then, are distributed according to the different types of
structural units involved in the parsing operation: `morphemes to
words' (word structure rules) or `words to sentences' (phrase
structure rules).

[4] Latch is currently being used to annotate the 125 million word
Corpus Diacrónico del Español (CORDE) and the 125 million word
Corpus de Referencia del Español Actual (CREA) by the Departamento
de Lingüística Computacional de la Real Academia Española. Some
results on the first version of the tool can be found in (Sánchez
et al., 1999).
4.1.1 Lifting PoS Tags
Integrating PoS information in a system like ALEP means defining
TS-LS rules propagating the morphosyntactic information associated
with full forms (i.e. PoS tag and lemma) delivered by the tagger to
the relevant morphosyntactic features of the lexical entries of the
grammar.
The integration of PoS tags into ALEP is done at the level `M'. By
using the lowest tag level to lift the lexical information
associated with full forms, we can propagate the ambiguities which
cannot be reliably solved by the shallow processing tool to the
grammar component, thus ensuring that the accuracy of the grammar
remains the same.

(1) shows the rule we defined to lift the tag 'Ncfs-'.
(1)
    ts_ls_rule(
      ld[ synsem|loc: t_local[
            morph: t_morph[ lemma:    #2,
                            morpheme: #1,
                            agr:      (fem&sing) ],
            cat:   t_subst[ head: t_noun[ nclass: common ] ] ] ],
      'M', [POS = 'Ncfs-', LEMMA = #2], #1 ).
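As a rough illustration of what rule (1) does, the following Python sketch maps a tagger reading onto a nested dictionary standing in for an LD. It is not ALEP code: the dictionary layout mirrors the feature names of rule (1), but the positional decoding of the tag (second character for noun class, third for gender, fourth for number) and the values other than `common`, `fem` and `sing` are assumptions made for the example.

```python
# Hypothetical decoding tables for a positional noun tag such as 'Ncfs-'.
GENDER = {"f": "fem", "m": "masc"}
NUMBER = {"s": "sing", "p": "plur"}
NOUN_CLASS = {"c": "common", "p": "proper"}


def lift_noun_reading(pos, lemma, morpheme):
    """Build a nested dict standing in for the LD of one tagger reading."""
    if len(pos) < 4 or pos[0] != "N":
        raise ValueError(f"not a noun tag: {pos!r}")
    return {
        "synsem": {"loc": {
            "type": "t_local",
            "morph": {
                "type": "t_morph",
                "lemma": lemma,
                "morpheme": morpheme,
                "agr": (GENDER[pos[2]], NUMBER[pos[3]]),
            },
            "cat": {
                "type": "t_subst",
                "head": {"type": "t_noun", "nclass": NOUN_CLASS[pos[1]]},
            },
        }}
    }


def lift_readings(morpheme, readings):
    """Lift every reading the tagger left unresolved, one LD per reading."""
    return [lift_noun_reading(pos, lemma, morpheme) for pos, lemma in readings]


lds = lift_readings("casa", [("Ncfs-", "casa")])
print(lds[0]["synsem"]["loc"]["morph"]["agr"])
```

Returning one structure per remaining reading reflects the point made in the text: ambiguities the tagger cannot resolve reliably are passed on to the grammar rather than decided early.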
4.1.2 Lifting Chunks
Similar to the integration of PoS information, the integration of
chunk mark-ups in the ALEP system requires TS-LS rules to convert
them into LD data structures used by the linguistic processing
components of ALEP.
[5] Normally, this will be the sentence node, though it can also be
any phrasal node when partial input strings are to be processed.

[6] The output of the lifting process is a Partial Linguistic
Structure (PLS) where the hierarchical relations between the
different structural elements are expressed in terms of weak
dominance relations.
The integration of chunk mark-ups into ALEP is done at the level
`W'. By integrating chunk mark-ups at the intermediate level, we
avoid modifying phrase structure rules which build up a LD on top
of the converted LDs: (i) attaching post-head sisters (modifiers
and/or complements to the right of the head element), (ii) and/or
attaching modifiers and/or specifiers to the left of the head
element when the chunk has only been partially recognized.
Furthermore, we avoid interference with the set of phrase structure
rules which build up the same type of LDs. These rules are
maintained to build up nodes that have not been marked up by the
preprocessing module.[7]
The system we propose, in addition, integrates into the high-level
components of ALEP LDs which do not need to be re-built by phrase
structure rules, since, even though they are quite underspecified
w.r.t. the head element of the chunk (they only contain information
about its part-of-speech), they already specify syntactic and
semantic information about the non-head elements that have been
attached to the head element.[8] This allows us to deal with
low-frequency syntactic structures whose coverage by means of our
ALEP grammar, though feasible, would increase both the parsing
search space and the ambiguity.[9]

(2) shows the rule for adjectival chunks which have the head
element and a degree adverb.
4.2 Word Structure Rules
Besides the TS-LS rules we have presented, the strategy we propose
also requires unary word structure rules to consolidate the
structural nodes provided by the `lift' operation for the new tags
`M' and `W'.
These rules, in addition, are in charge of percolating the
linguistic information of the head element of the chunk, which is
encoded in the lexicon, to the mother node, which already contains
information about the non-head elements already attached by the
preprocessing tool.

[7] These rules are applied when parsing words to sentences,
whereas lifted chunk mark-ups are dealt with by word structure
rules (cf. section 4.2).

[8] This strategy, however, requires very specialized TS-LS rules
not only w.r.t. the category of the head element (noun, verb,
adjective, adverb) but also the number, category (determiner,
adjective, adverb, auxiliary, ...) and type (definite, indefinite,
...) of non-head elements.

[9] Examples of such syntactic structures are given in section 6.
(2)
    ts_ls_rule(
      ld[ string: < muy interesante >,
          ...: t_local[
            cat: t_subst[ head: t_adj ],
            sem: t_sem[
              indx|ind: sf(index(_, #1)),
              mods: < t_sem_mod[
                        rel:      sf(rel(degree, #1, #2)),
                        indx|ind: sf(index(nevent, #3)),
                        predarg:  t_predarg[ pred: sf(pred(#3, #2)) ] ] > ] ] ],
      'W', ['TYPE' = 'CHUNK', CHUNK-TY = 'AX', ADV = #3] ).
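The interplay between rule (2) and the unary word structure rules of section 4.2 can be sketched in Python as follows. This is a simplified stand-in under stated assumptions, not the ALEP rules: the functions `lift_chunk` and `percolate_head`, the flat dictionary layout and the toy lexicon are hypothetical; only the attribute names 'TYPE', 'CHUNK-TY' and 'ADV' and the type names come from rule (2).

```python
import copy


def lift_chunk(attrs, words):
    """Turn chunk mark-up attributes into a dict standing in for a 'W' LD."""
    if attrs.get("TYPE") != "CHUNK" or attrs.get("CHUNK-TY") != "AX":
        raise ValueError("only adjectival chunks are sketched here")
    return {
        "string": list(words),
        # The head is underspecified: only its part of speech is known.
        "cat": {"type": "t_subst", "head": {"type": "t_adj"}},
        # The non-head element is already attached as a degree modifier.
        "sem": {"type": "t_sem",
                "mods": [{"type": "t_sem_mod", "rel": "degree",
                          "pred": attrs["ADV"]}]},
    }


def percolate_head(chunk_ld, lexical_head):
    """Unary step: copy the head's lexical information to the mother node."""
    mother = copy.deepcopy(chunk_ld)
    if mother["cat"]["head"]["type"] != lexical_head["type"]:
        raise ValueError("lexical entry does not match the chunk head type")
    mother["cat"]["head"].update(lexical_head)
    return mother


chunk = lift_chunk({"TYPE": "CHUNK", "CHUNK-TY": "AX", "ADV": "muy"},
                   ["muy", "interesante"])
# Hypothetical lexicon entry for the head adjective.
lexicon = {"interesante": {"type": "t_adj", "lemma": "interesante"}}
mother = percolate_head(chunk, lexicon["interesante"])
print(mother["cat"]["head"], mother["sem"]["mods"][0]["pred"])
```

The lifted structure already carries the degree modifier, so no phrase structure rule has to attach "muy"; the unary step only fills in what the lexicon knows about the head.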
5 Default Lexical Entries
Supplementary to the integration of the shallow processing tool,
default lexical entries have been implemented in our ALEP grammar
to provide robust deep processing.
Default lexical entries are lexical entry templates which are
activated when the system cannot find a specific lexical entry to
apply. Note that having default lexical entries in a system like
ALEP increases ambiguity, and thus the parsing search space, unless
a mechanism is used to restrict as much as possible the templates
that are activated. The integration of the tagger, which supplies
the PoS information to the linguistic processing modules of our
system, allows us to increase robustness while avoiding an increase
in PoS ambiguity.
There are two basic ways to define default lexical entries. One is
to implement underspecified lexical entry templates assigned to
each major word class such that, while parsing, the system fills in
the missing information of each unknown word (Horiguchi et al.,
1995; Music and Navarretta, 1996; Mitsuishi et al., 1998; Grover
and Lascarides, 2001). In the other approach, very detailed default
lexical entries for each major word class are defined.
The approach we have followed falls between the two. We have
defined several default lexical entry templates for the different
major word classes (verbs, nouns, adjectives and adverbs) which
cover their most frequent subcategorization frames. These
templates, however, are unspecified w.r.t. those features which
encode the subcategorization restrictions imposed on their subjects
and complements, e.g. marking prepositions, lexical semantics, etc.
This information is filled in by the application of phrase
structure rules.
First experiments testing the effect of our default lexical entries
showed that, by covering the most frequent subcategorization
frames, we ensured that the accuracy of the grammar (percentage of
input sentences that received the correct analysis) remained the
same. The precision of the grammar (percentage of input sentences
that received no superfluous (or wrong) analysis), however, was
very low, since we could not restrict the lexical template to be
activated for each word type.
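The two measures defined above can be written down as a short Python sketch. The functions and the toy data are illustrative only; in particular, treating a sentence with no analysis at all as having no superfluous analysis is a literal reading of the definition and an assumption of this sketch.

```python
def accuracy(results):
    """Percentage of sentences whose analyses include the correct one."""
    hits = sum(1 for analyses, gold in results if gold in analyses)
    return 100.0 * hits / len(results)


def precision(results):
    """Percentage of sentences with no superfluous or wrong analysis."""
    # Assumption: an empty list of analyses also counts as "clean".
    clean = sum(1 for analyses, gold in results if set(analyses) <= {gold})
    return 100.0 * clean / len(results)


# Toy data: (analyses returned by the parser, correct analysis).
results = [
    (["A"], "A"),               # correct and unambiguous
    (["B", "B2"], "B"),         # correct, one superfluous analysis
    (["X"], "C"),               # wrong analysis only
    (["D", "D2", "D3"], "D"),   # correct, two superfluous analyses
]
print(accuracy(results), precision(results))
```

On this toy data accuracy is 75.0 and precision 25.0, which mirrors the situation described in the text: the correct analysis is usually found, but it is rarely the only one.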
To improve the precision of the system we extended the PoS tags of
our external lexicon (i.e. the lexicon we use for morphosyntactic
annotation in Latch) so that they included syntactic information
about the subcategorized-for elements (category, marking
prepositions, ...). This allowed us to reduce the number of default
lexical templates to be applied.[10]
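How PoS and frame information narrow down the default templates can be illustrated with a small Python sketch. The template inventory and frame names below are invented for the example and do not reproduce the templates of the ALEP grammar; only the four major word classes come from the text.

```python
# Hypothetical default templates: word class -> subcategorization frames.
DEFAULT_TEMPLATES = {
    "verb": ["intransitive", "transitive", "ditransitive", "prepositional"],
    "noun": ["no_complement", "pp_complement"],
    "adjective": ["no_complement", "pp_complement"],
    "adverb": ["no_complement"],
}


def candidate_templates(word_class=None, frame=None):
    """Return the default templates compatible with the available tag."""
    classes = [word_class] if word_class else list(DEFAULT_TEMPLATES)
    return [(cls, frm)
            for cls in classes
            for frm in DEFAULT_TEMPLATES[cls]
            if frame is None or frm == frame]


no_tag = candidate_templates()                         # unknown word, no tagger
pos_only = candidate_templates("verb")                 # PoS tag only
pos_frame = candidate_templates("verb", "transitive")  # PoS+Frame tag
print(len(no_tag), len(pos_only), len(pos_frame))
```

With these made-up numbers an unknown word activates nine templates without a tag, four with a PoS tag and one with a PoS+Frame tag, which is the kind of reduction the extended tags are meant to achieve.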
6 Experiments and Results
The two experiments described in this section were used to evaluate
the performance of the integrated system w.r.t. both efficient
processing and robustness.
In the first experiment, our goal was to perform a comparative
study of the processing time of our ALEP grammar before and after
the integration of the PoS tagger and chunker. For this experiment,
therefore, we required test cases which were already fully covered
by our grammar before the integration of the tagger and chunker. In
this experiment, we used a subset of the test suites we have used
in the LS-GRAM and the MELISSA projects.
In the second experiment, our goal was to investigate to what
extent the ALEP grammar benefited from the default lexical entries
in terms of robustness. In this experiment, we tested our system on
a test corpus which was selected randomly.[11]

[10] This information was not manually encoded, but it was
extracted from the lexical resources developed in the project
PAROLE (Melero and Villegas, 1998).
a) Experiment A
To evaluate the efficiency of the system, we defined two test
suites and ran them with our ALEP grammar both before and after the
integration of the shallow processing tools.[12]
The first test suite included short instructive sentences or
queries from the corpus of the MELISSA project[13] and sentences we
selected from the different test suites we have used for diagnosis
and evaluation purposes in the LS-GRAM and the MELISSA
projects.[14] Test cases were selected according to: (i) the
syntactic function of the chunk, e.g. subject, complement and
adjunct for nominal chunks, complement and adjunct for adjectival
chunks, etc.; (ii) the position of the chunk in the sentence; and
(iii) the category and the number of non-head elements. This test
suite included 1500 cases.
In running the test suite with the new system, processing time of
the overall process improved by an average of 65% due to the
reduction of both lexical ambiguity and sentence length.[15]
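The reduction in sentence length can be pictured with a short Python sketch in which every chunk marked by the preprocessing module is handed to the parser as a single unit. The function `collapse_chunks` and the span format (half-open token index pairs) are assumptions of this illustration; the sentence is example (3.b) from this paper.

```python
def collapse_chunks(tokens, chunk_spans):
    """Group tokens so that each chunk span becomes one parsing unit."""
    units, pos = [], 0
    for start, end in sorted(chunk_spans):
        if start < pos or end > len(tokens) or start >= end:
            raise ValueError(f"bad chunk span: {(start, end)}")
        units.extend([tok] for tok in tokens[pos:start])  # unchunked tokens
        units.append(tokens[start:end])                   # one unit per chunk
        pos = end
    units.extend([tok] for tok in tokens[pos:])
    return units


tokens = "Se incrementarán en los próximos ocho meses".split()
units = collapse_chunks(tokens, [(3, 7)])  # [los próximos ocho meses]
print(len(tokens), len(units))
```

Here seven tokens become four units, so the parser has fewer elements to combine.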
Once positive results were achieved with this type of sentential
structure, we evaluated our system with much more complex
sentences, showing a high interaction of phenomena. For this, we
used an article of 250 words from the newspaper "El Diario Vasco",
taken from the LS-GRAM corpus.

[11] Test suites and corpora are the two tools traditionally used
for evaluating and testing NLP systems. The main properties of test
suites are: systematicity, control over data, exhaustivity, and
inclusion of negative data. Test corpora, by contrast, reflect
naturally occurring data (cf. (Lehmann et al., 1996)).

[12] Experiments have been run on a 128 Mb Ultra Sparc-10. Mean CPU
time values were calculated for 50 samples.

[13] NL utterances which users made in interacting with ICAD, an
administrative purchase and acquirement handling system, employed
at ONCE (Organización Nacional de Ciegos de España), dealing with
budget proposals and providing information to help decision makers.

[14] These test suites are organized on the basis of a hierarchical
classification of linguistic phenomena. Test suites including cases
with interaction of phenomena and negative cases are also included.

[15] The reduction of the sentence length is due to the fact that
elements that are wrapped together in a chunk by the preprocessing
module are lifted to the parsing component of the grammar as a
unique element.
Two experiments were carried out, first by integrating the PoS tags
into ALEP and then the chunk mark-ups. For the first experiment,
the reduction of morphosyntactic ambiguity, an average of 0.40,
reduces the processing time of the overall process by 45.9% (35.9%
on average per sentence). For the second experiment, the system
processing time is reduced by 52.6% (an average of 42.7% per
sentence). Here, parsing speed-up is due to the fact that by
integrating chunk mark-ups, we not only avoid generating irrelevant
constituents not contributing to the final parse tree but we also
provide part of the structure that the analysis component has to
compute.[16]
b) Experiment B
The evaluation of the effect of default lexical entries on the ALEP
grammar was done with free input text. Here we used a 300 word
article from "El País" (September 2001).
In running the second experiment we observed that our first
approach ensured that the accuracy of the grammar (percentage of
input sentences that received the correct analysis) remained the
same, even though 67.7% of the major words which appeared in the
article were not encoded in the ALEP lexicon. The precision of the
grammar (percentage of input sentences that received no superfluous
(or wrong) analysis), however, was very low: we got an average of 8
analyses per sentence. By adding framing information to the PoS
tags of our external lexicon we reduced overgeneration down to an
average of 2.5 analyses per sentence.
[16] A detailed analysis of the results showed us that, while in
processing simple sentences, such as the ones we included in the
first test suite, the most relevant factor for improving processing
time was the reduction of the number of tokens of the sentences, in
processing complex sentential constructions, e.g. sentences
including embedded clauses, efficiency gains were mainly due to the
reduction of the morphosyntactic ambiguity, since this drastically
reduced the structural ambiguity.

Besides, our system provides structural robustness to the
high-level processing. We observed that a number of linguistic
structures which could not be handled by the grammar before the
integration of the shallow processing tools are currently covered.
Examples are:
(3) a. No dieron [crédito alguno] a ...
       ((they) did not believe in ...)
    b. Se incrementarán en [los próximos ocho meses]
       (They will be increased in the following eight months)
(3.a) shows a nominal chunk where the indefinite alguno is
postponed; (3.b) shows a nominal chunk where the canonical
`cardinal + adjective' order is inverted.
7 Conclusions
This paper has described research into the development of an
engineered large-scale grammar to provide more robust and efficient
deep grammatical analysis of linguistic expressions in real-world
applications, e.g. NLI, while maintaining both the accuracy of the
grammar and its precision.
We plan to extend the chunker to cover ungrammatical or incomplete
intra-clausal partial constituents which can then be integrated
into the ALEP linguistic processing components. We also plan to add
semantic information to the PoS+Frame tags encoded in the lexical
resources developed in the project SIMPLE.

References

S. Abney. 1992. Prosodic Structure, Performance Structure and
Phrase Structure. In Proceedings of the Workshop on Speech and
Natural Language, Morgan Kaufmann Publishers, San Mateo, CA.

S. Abney. 1996.
http://www.sfs.nphil.uni-tuebingen.de/~abney/Papers.html#96i.

H. Alshawi, D.J. Arnold, R. Backofen, D.M. Carter, J. Lindop,
K. Netter, S. Pulman, J. Tsujii, and H. Uszkoreit. 1991. Eurotra
ET6/1: Rule Formalism and Virtual Machine Design Study (final
report). Commission of the European Communities, Luxembourg.

A. Bredenkamp, T. Declerck, P. Groenendijk, M. Phelan, S. Rieder,
P. Schmidt, H. Schulz, and A. Theofilidis. 1998. Natural Language
Access to Software Applications. In Proceedings of COLING-ACL'98,
Montreal, Canada.

F. Ciravegna and A. Lavelli. 1997. Controlling Bottom-up Chart
Parsers Through Text Chunking. In Proceedings of IWPT'97, Boston,
MA.

B. Crysmann, A. Frank, B. Kiefer, H.-U. Krieger, S. Müller,
G. Neumann, J. Piskorski, U. Schäfer, M. Siegel, H. Uszkoreit, and
F. Xu. 2002. An Integrated Architecture for Shallow and Deep
Processing. In Proceedings of ACL'2002, University of Pennsylvania,
Philadelphia, PA.

C. Grover and A. Lascarides. 2001. XML-Based Data Preparation for
Robust Deep Parsing. In Proceedings of ACL-EACL 2001, Toulouse,
France.

K. Horiguchi, K. Torisawa, and J. Tsujii. 1995. Automatic
Acquisition of Content Words Using an HPSG-Based Parser. In
Proceedings of the NLP Pacific Rim Symposium, Seoul, Korea.

S. Lehmann, S. Oepen, S. Regnier-Prost, K. Netter, V. Lux,
J. Klein, K. Falkedal, F. Fouvry, D. Estival, E. Dauphin,
H. Compagnion, J. Baur, L. Balkan, and D. Arnold. 1996. TSNLP: Test
Suites for Natural Language Processing. In Proceedings of
COLING-96, Copenhagen, Denmark.

M. Marimon and P. Porta. 2000. PoS Disambiguation and Partial
Parsing: Bidirectional Interaction. In Proceedings of LREC-2000,
Athens, Greece.

M. Melero and M. Villegas. 1998. Issues on the Encoding of a
Computational Lexicon. In Proceedings of LREC-1998, Granada, Spain.

Y. Mitsuishi, K. Torisawa, and J. Tsujii. 1998. HPSG-Style
Underspecified Japanese Grammar with Wide Coverage. In Proceedings
of COLING-ACL'98, Montreal, Canada.

B. Music and C. Navarretta. 1996. Documentation of the LS-GRAM
Danish Lingware. Deliverable E-D8-DK, Center for Sprogteknologi,
Copenhagen.

C. Pollard and I.A. Sag. 1994. Head-Driven Phrase Structure
Grammar. University of Chicago Press, Chicago.

J. Porta. 1996. Rtag. Technical Report, Grup de Investigació en
Lingüística Computacional, Universitat de Barcelona.

R. Prins and G. van Noord. 2001. Unsupervised POS-Tagging Improves
Parsing Accuracy and Parsing Efficiency. In Proceedings of
IWPT'2001, Beijing, China.

F. Sánchez, J. Porta, J.L. Sancho, A. Nieto, A. Ballester,
A. Fernández, L. Gómez, E. Raigal, and R. Ruiz. 1999. La anotación
de los corpus CREA y CORDE. In Proceedings of SEPLN'99, Lleida,
Spain.

P. Schmidt, A. Theofilidis, S. Rieder, and T. Declerck. 1996. Lean
formalism, Linguistic Theory, and Applications. Grammar Development
in ALEP. In Proceedings of COLING-96, Copenhagen, Denmark.

N. K. Simpkins, M. Groenendijk, and G. Cruickshank. 1993. ALEP User
Guide. Commission of the European Communities, Luxembourg.

B. Srinivas, C. Doran, B.A. Hockey, and A. Joshi. 1997. An Approach
to Robust Partial Parsing and Evaluation Metrics. In Proceedings of
IWPT'97, Boston, MA.

T. Venkova. 2000. A Local Grammar Disambiguator of Compound
Conjunctions as a Pre-Processor for Deep Analysis. In Proceedings
of Workshop on Linguistic Theory and Grammar Implementation.
ESSLLI-2000, Birmingham, UK.

H. Watanabe. 2000. A Method for Accelerating CFG-Parsing by Using
Dependency Information. In Proceedings of COLING-2000, Saarbrücken,
Luxembourg, Nancy.

J. Yoon, K.S. Choi, and M. Song. 1999. Three Types of Chunking in
Korean and Dependency Analysis Based on Lexical Association. In
Proceedings of ICCPOL.
