A Computational Lexicon of Portuguese for Automatic Text Parsing 
Ehsabete RANCHHOD 
FLUL/CAUTL-IST 
Av Rovlsco Pals, 1 
1049-001 Llsboa, Portugal 
ehsabet@label 1st utl pt 
Cnstma MOTA 
CAUTL-IST 
Av Rovisco Pals, I 
1049-001 Llsboa, Portugal 
cnstIna@label2 1st utl pt 
Jorge BAPTISTA 
UALG/CALrI'L-IST 
Av Rovlsco Pals, 1 
1049-001 Llsboa, Portugal 
jbaptis @ualg pt 
Abstract 
Using standard methods and formats 
established at LADL, and adopted by several 
European research teams to construct large- 
coverage electronic dictionaries and 
grammars, we elaborated for Portuguese a set 
of lexlcal resources, that were implemented in 
IN'rEX We describe the main features of 
such linguistic data, refer to their mmntenance 
and extension, and gwe different examples of 
automatic text parsing based on those 
dictionaries and grammars 
Keywords Text parsing, large-coverage 
dictionaries, computational lexicons; word 
tagging, information retneval. 
1 Introduction and Background 
The French DELA system was conceived and 
developed at LADL (Laboratoire d'Automatlque 
Documentaire et LmgmstIque) It includes 
monohngual hngulstlc resources (mainly for 
French and English) specifically elaborated to be 
integrated into NLP systems Standard methods 
and formats-have been defined and are now used 
by other national teams working on their own 
languages German, Greek, Italian, Portuguese 
and Spanish Within that common framework, 
important fragments of the descnptmn of the 
languages involved have been worked out the 
syntactic and semantic properties of free and 
frozen sentences are descnbed and formalized e~s 
for the lexicon, a major component of NLP, large 
coverage electromc dlctionanes have been built 
Simple and compound words have been 
descnbed, and their hnguIstlc characteristics have 
been hand-coded by computational lexicographers 
using a common method 
Most of these lex~cal resources can now be 
imported into the Intex NLP system ~, and then 
automatically applied to large texts Within the 
scope of this article, we describe the set of lexical 
resources built so far for Portuguese, and we gwe 
different examples of automatic Portuguese text 
parsing 
2 Portuguese Electronic Dictionaries 
By electromc dictionary, we mean a computerized 
lexicon specifically elaborated to be used tn 
automatic text parsing operations (indexing, 
recognition of complex words, technical and 
common, etc ) Thus, large coverage electronic 
dlctlonanes were built for Portuguese for that 
purpose 
The set of lexIcal data is organized according to 
the formal complexity of the lexlcal units The 
Portuguese DELAS IS the central element of the 
d~ctionary system Itcontains more than 110,000 
simple words, whose grammatical attributes are 
systemaucally described and encoded The set of 
compound words Is structured m the Portuguese 
DELAC At the moment, it Is constituted by a 
lexicon of 22,000 compound nouns and 3,000 
frozen adverbs, so it is stdl far from adequate 
completion 2 
2.1 The DELAS and DELAF Dictionaries 
As said before, DELAS is the dIctmnary of simple 
words We understand by simple words the lexlcal 
units that correspond to a continuous string of 
i See http//www ladl lUSSleu fr/IN'I~X/mdex htm ll 
2 The French DELAC contains (Sdberztem (1997 189) 
about 130,000 entries 
74 
letters The lexlcal entnes of DELAS have the 
followmg general structure 
<word>, <formal description> 
where word represents the canomcal form (the 
lemma) of a simple lexlcal umt (m general the 
masculine smgular for the nouns and adjecuves, 
the mfinmve for the verbs), and formal 
description corresponds to an alphanumenc code 
contzanmg mformat~on on the grammatical 
attributes of the entries their grammatical class 
(eventually, sub-class), and their morphological 
behavior 
The mflected forms are automaucally generated 
from the association of a lemma to an mflecuonal 
code the hst of all reflected words constitutes the 
Portuguese DELAF (1,250,000 word forms) 
In Portuguese, the major grammatical classes 
nouns, adjectives and verbs have mflected forms 
-nouns and adjectives can appear m the 
femmme and/or m the plural, they can recewe 
dtmmuUve and augmentaave suffixes, the 
superlative degree of the adjectives can be 
expressed by morphological means (suffixes), 
-verbs are conjugated (mood, tense, person, 
number), furthermore, some verbal forms can 
undergo formal mo&ficattons reduced by the 
presence of a clmc pronoun 
Thus, the DELAS entries 
gato, NOID1 
gordo, AOIDIS1 
(where N and A mdtcate that gato (cat) ts a noun 
and gordo (fat) Is an adjective, 01 corresponds to 
the mflectlon rule for mascuhne, feminine, 
smgular and plural, DI and $1 exphclt the type of 
dtmmuttve and superlative suffixes that can be 
accepted by these entries) produce the following 
infected forms (DELAF entnes) 
gato, gato N ms (cat) 
gata, gato N fs 
gatos, gato N mp 
gatas, gato N fp 
gatmho, gato N Dms (httle cat) 
gatmha, gato N Dfs 
gattnhos, gato N Dmp 
gatmhas, gato N Dfp 
gordo, gordo A ms (fat) 
gorda, gordo A fs 
gordos, gordo Amp 
gordas, gordo A fp 
75 
gordmho, gordo A Dins (rather fat) 
gordmha, gordo A Dfs 
gordmhos, gordo A Dmp 
gordmhas, gordo A Dfp 
gordlsslmo, gordo A Sms (very fat) 
gordfsstma, gordo A Sfs 
gord£sstmos, gordo A Stop 
gord\[sstmas, gordo A Sfp 
As for the verbs, for mstance, dar (to gwe) 
dar, VO2t 
gives rise to a hst of 73 reflected forms that 
correspond to the normal conjugation of a non- 
defective verb, in addmon, dar can be constructed 
with clmc pronouns (t), m the posmon of 
accusative and dative complements So, m 
(1) Nds demos o hvro ~ Maria 
(Lit We gave the book to Maria) 
the verb form demos expresses mdlcatlve mood, 
past tense, and first person plural 
From a syntactic point of view, dar Is constructed 
with three arguments, subject Nds (we) and two 
complements o hvro (the book), fi Maria (to 
Maria) The complement syntacuc posmons can 
be fulfilled by clmc pronouns, respecavely, o (it), 
accusative, and lhe (her), dauve, as m 
(2) N6s demo-lo ~ Maria 
(Lit We gave tt to Maria) 
(3) N6s demos-lhe o hvro 
(Lit We gave her the book) 
(4) Nds demos-lho 
(Lit We gave her_t0 
In (2), the direct object has been chttclzed, and, 
due to historical phonetic reasons, both the 
accusative pronoun and the verb have undergone 
formal modifications o>lo, demos>demo In (4), 
both pronouns (dative and accusative) are 
obhgatonly agglutinated, forming the contraction 
lho ( <lhe + o) 
So, even though the analysts of the combinations 
verb-chttc Is a syntactic matter, given the 
morphological changes reduced by such 
combinations m Portuguese, a first descnptlon 
had to be made at the morphcqegtcal level 
On the other hand, the example m (4) dlustrates a 
case where the formal notion of simple word does 
not correspond to an adequate hngutstlc analys~s 
Indeed, the form lho results from the contraction 
of two Independent oronouns lhe + o In 
Portuguese, contracted forms Issued from the 
agglutmatton of two different words (and two 
different grammaucal categones) are commonly 
observed We give some simple examples of 
contracuons resulting from the merging of 
preposmons with determiners, pronouns and 
adverbs 
pel(o,a, os, as) < por + (o,a, os, as) 
(by the) 
del(e,a, es, as) < de + (ele, ela, eles, elas) 
(of (him, her, them)) 
daqut < de + aqut 
(from here) 
The relationship between contractions and their 
base constituent categories are estabhshed by 
finite-state transducers (see below) 
2.2 Dictionaries for Compounds 
Compound words, l e, lexical units that are 
constituted by a fixed combination of s~mple 
words, represent a large amount of the lexicon of 
any language One has only to underline m a text 
the sequences of words that are frozen together to 
some extent to realize that compounds constitute 
an important percentage of the text 3 It is therefore 
illusory to envisage any sort of automatic 
processing before a slgmficant lexlcal coverage is 
achieved. The Issue is even more acute if one 
considers the description of sclenttfic or technical 
texts or any speciahzed lexicon, where the 
number of compounds can rise up to appalhng 
figures 
As said in 2 compounds are structured m the 
Portuguese DELAC Priority was given tO the 
hstlng and formahzatlon of compound nouns, that 
can mflect lua de mel - luas de mel (honeymoon), 
and to compound adverbs, that are invariable de 
repente (suddenly) From the point of view of the 
lexicon, the mare focus, especially as far as 
compound nouns are concerned, has been the 
every-day, not too techmcal, lexicon 
In order to ~dentlfy compound words, and 
dlstmgulsh them from formally Identical word 
free combinations, a set of morpho-syntactlc 
criteria was adopted (Ranchhod (1991), Bapttsta 
(1995)) In short, compounds are the sequences of 
words that present restncUons to the 
See 5 ~Parsmg Texts Using INTEX Tools- 
combinatorial properttes that they were supposed 
to have 
The formallzauon of compound dtcuonary entries 
is slmdar to that of simple words Since 
compound adverbs, preposluons and conjunctions 
do not reflect, their formats are rather s~mple 
de repente, ADV+PC 
(suddenly) 
para corn, PREP 
(towards) 
afire de, CONJ 
(m order to) 
Compound nouns, however, have generally 
reflected forms The rules for the mflect~on of 
compound nouns presented by grammarians do 
apply to some cases, but most compound nouns 
exhibit mflecuonal restrictions on gender or 
number that cannot be accounted by the 
morphological properties of their constituents In 
the DELA format, the inflectional properties of 
compound nouns are specified according to the 
same criteria as m the dictionary of simple words 
Thus, g~ven the following nominal entries of the 
DELAC 
ser(21)humano(Ol), N + NA ms - + 
(human being) 
guerra frta, N + NA fs - - 
(cold war) 
vtstta(30) de estudo, N + NDN fs - + 
(field trip) 
The first two compound nouns, ser humano and 
guerra frta have an internal structure Noun 
Adjective (NA), the most productwe class m 
Portuguese, vtstta de estudo ~s a compound of 
structure Noun de Noun, also a very productwe 
one Each entry is characterized by the posslbdlty 
(+) or lmposstbtllty (-) of gender and number 
reflection, respecttvely, the elements of the 
compound that can be inflected receive the 
mflecuonal code that they have m the DELAS 
both constltuents of ser humano inflect (in 
number) according to, respectively, the rules 21 
and 01 ser humano - seres humanos, guerra frta 
is invariable, and the noun vlstta de estudo only 
allows the inflection of vtstta vtstta de estudo - 
vtsttas de estudo 
As well as for other languages (e g French), 
addluonal mformat~on Is being added, namely 
semantic 
76 
2.3 Local Grammars 
Most of the local hngutstgc phenomena, as well as 
many complex sentences, are represented m a 
natural way by the formalism of finite-state 
automata (FSA) For instance, frozen or semt- 
frozen structures are very naturally described by 
graphs, that represent FSAs (Sllberztem (1997)) 
We illustrate the use of graphs with an elementary 
example, selected from the hbrarles of Portuguese 
local grammars This grammar descnbes a family 
of adverbial expressions (dates), which refer to a 
period of ttme around the middle of the months 
(or, by extensmn, of some years) as m the 
underlined expression 
lsso aconteceu nos tdos de Marfo 
(That happened on the ides of March) 
\ / ?a l M~ 
,' ~o,. / ~,ok, ' ,l,a. 
t ,, ~), ',,,,,,,,,, @N~ 
- --'" ~ .... ,: t~,,,,,,, 
IO,,z.,,t(, 
The following examples show how transducers 
are used to analyze contracuons, ambtgumes and 
compound numerical determiners 
3.1 Analysis of Contracted Words 
As stated above (2 1 ), contracted forms resulting 
from the agglutination of two independent words 
are commonly observed nn Portuguese To 
properly analyze these entrees we built flmte state 
transducers (FST) that, given a contracted form, 
produce an output corresponding to the 
decomposition of the contractxon into ~ts base 
constituents For xnstance, the FST 
(de, ~ t~q~,,Al~ 
F~g 2 - Analysis of the contracted form daqut 
decomposes daqut (a contraction of the 
preposmon de (from) and the adverb aqut (here)) 
m ~ts base constituents and, s~multaneously, 
associate to them the grammatical reformation of 
the dtcuonary 
3.2 Disambiguation I ,' ', ..... 
~__ / ~ ........... \[:~j" \[ Dtsambtguanon can be done at different moments • 
~{i-~.'rld ......... . ~, ),v,m, ! , of parsing '~ I..~.,.I ~ m 
~'--'~. , I'~'," ! / "-- ~, I""" I t a) Dtsambtguatmn durmg normahzatwn 
-,~lallll *J i: ~' (I The normahzation of texts for hnguistlc analysis 
I~ \[ uses FST to identify sentences and unambiguous • 
It: | compounds, to solve contractions and ehsxons As 
i~,~ an example of dxsamblguatlon at this level, we • 
I p.I ,.i.." ~, still use the case of contracuons 
.. l,','j 
' The form dele results from the contrachon of de \[\] 
Fig 1-Advldos grf (of) with the ambiguous personal pronoun ele (he, 
Th~s set of adverbial phrases corresponds to a h~m), which can be either a subjective (coded N) \[\] 
linguistic object of clearly flmte-state nature, but or a genitive form (coded O) 
hngmst~c phenomena of a more complex nature ele, eu PRO+Pes N3ms 03ms However, only • 
can be efficmntly described by such formahsms genitive forms can occur m the contraction dele 
(Gross (1997, 1995)) (de + ele) So, the FST \[\] 
I~ PI/FI'I le-r* ~ Pilt'1,-)~prl O,m~l 
. , ,.,,.AI.- 
". \[& .l~i't.~r.l~l \[eCe-,,e~ H,lO+l'-e.,~.~) 
\[,Ik" Pl~..ll~N r-N~,eu,PPd'l"rPe,m,ONp) 
Fig 3 - Analysis of the contracted forms dele, 
dela, deles, delas 
From the graphs of the local grammars, parsers 
(FSTs) can be automatically constructed, that 
applied to texts m combination with the 
dictionaries, allow the detection of a large variety 
of hngmstlc patterns (see below) 
3 Transducers 
Finite-state automata and transducers can be 
efficiently apphed at various levels of hngutstm 
analysis 
77 
ts used not only to decompose the contracted form 
dele m its base consmuents but to dlsamb~guate 
the pronouns ele, ela, eles, elas Identical FSTs 
can be used to analyze more complex situations 
where both constituents of a contraction can 
mflect mdependently 4 
b) Dtsambiguatwn for tagging 
In Portuguese, a word such as compra can be 
either a noun or a verb, the form o can be a 
determiner, a demonstranve pronoun and a 
personal pronoun So, the linear combination of 
these elements allows six different analyses 
However, m sentences hke 
Ela compra-o hoje (She buys it today) 
compra Is only a verb, and o is only a personal 
pronoun, bound to the verb by an hyphen 
The following FST 
Fig 4 - FST for the dlsamblguanon of 
verbs and clmcs 
was built to solve these amb~gmttes the five 
erroneous analyses are not taken mto account, 
compra and o receive the correct tags 
3.3 Numerical Determiners 
The Portuguese numerical determiners from dots 
(2) to novecentos e noventa e nove md novecentos 
e noventa e nove (999,999) are plural forms 
However, some of them can reflect m gender 
dol_.~s <hvro__s> 
(two <books>) 
du_a~s <cadeira_.ss> 
(two <chmrs>) 
trezentos e vmte e dots <hvro__~s> 
(three hundred and twenty-two <books>) 
trezenta___~s e vmte e duas <cadetra_._s_s> 
(three hundred and twenty-two <chmrs>) 
~ That ~s the case of aqueloutra which Is the 
contraction of the demonstrative pronouns aquela + 
outra (that(fs) + otherOes)) In Portuguese, even though 
contracted words are numerous, the hst of contractions 
~s stdl a closed set So its descnpuon with FSTs is 
possible However, this solution would not be adequate 
to describe productive phenomena revolving 
agglutination, as it is probably the case of most 
compound nouns In German, for instance 
Others are mvanant an respect to gender 
vinte <hvros> 
(twenty <books>) 
vtnte <cadeiras> 
(twenty <chairs>) 
rail e sete <hvros> 
(one hundred and seven <books>) 
mde sete <cadeiras> 
(one hundred and seven <chmrs>) 
Numerical deternuners such as dots, duas and 
vmte are simple words and therefore they are 
formahzed tn the DELAF dicnonary, numerical 
deterrnmers such as trezentos e vmte e dots, 
trezentas e vinte e duas and mtle sete can be seen 
as specml compound words that are more 
adequately described by FST 
The first FST m figure 
-In " 
d 
i I 
/ 
)) ~ - . / ) V-- .... {~' ' 
"- "\[ r,J,~ f'I ) 
.f 
1 ", ,~~, / 
Fig 5 - FST for the ldenuficatlon of Numerical 
Determiners 
describes all the compound numencal determiners 
from vmte e um (21) to novecentos e noventa e 
nove mtl novecentos e noventa e nove (999,999), 
including feminine and mvanant forms, 
assoctatmg to each of them the grammaucal 
category and the corresponding numerical value, 
as m the examples 
78 
trezentos e vmte e dots, trezentos e vmte e dots 
DET+Num+VaI=322 mp 
trezentas e vmte e duas, trezentas e vmte e duas 
DET+Num+Val=322 fp 
mtle sete, mtl e sete DET+Num+VaI=1007 mfp 
The FST shaded nodes refer to embedded FST, 
for instance, CentenasMF refers to the sub-graph 
that represents all mvanant compound 
determiners from cento e tr~s (103) to cento e 
noventa e nove (199) and UmdadesMF represents 
all mvanant umts from tr~s (3) to nove (9) 
4 Parsing Texts Using INTEX Tools 
The hngmstlc resources that we briefly described 
have been imported into INTEX, that apply them 
to large texts We gave here some examples of text 
processing, using a small text 
a) Recognttton of all compound words of the text 
A semelhan~a de um c6dlgo de banas que pemute 
ldentlficar uma mfimdade de produtos, dependendo 
da sequ6ncm de ntimeros, o genoma humano 
tamb6m encerra quase todos os nossos segredos e, 
~osso modo, basta uma hge~ra muta~o num gene 
para que se mamfeste urea doenqa ou, pelo 
contr, irto, uma resmt~.ncm ~t rnesma A toda a hora 
novos genes s~o ~denuficados um cha 6 um gene 
assocmdo ~t repulsa do tabaco, noutro um que 
traduz uma minor susceptabdldade de se ficar 
mfectado por deterrmnado vfrus 
Hfi um c6dlgo para tudo Mas todos estes dados 
consmuem apenas 10 por cento do patrtm6mo 
gen6t~co humano conhec~do Um facto que deverfi 
" ser alterado em Feveretro do pr6x~mo ano, se se 
puderem cumpnr as prev~s6es dos respons~ive~s 
pelo amb~c~oso Projecto do Genoma Humano 
In the example, the compound words have been 
underlined 
b) Indexmg all utterances of a gtven word 
All the forms assocmted to the mfinmve of the 
verb ser (to be) 
ao ldentfflcados um dm 6 um gene assocmdo ~ rep 
toda a hora novos genes silo ldent~ficados um dm 
o Um facto que deverfi se___r alterado em Fevere~ro 
were ~dent~fied and extracted into a concordance 
c) Indexmg a morphologtcal pattern 
The rataonal expression 
<DET+Art+Ind fs> (<E>+<A fs>) <N fs> 
(<E>+<PREP><N>) 
or the eqmvalent FST 
Identify femmme smgular (fs) noun phrases, that 
are specified by a determiner (DET) belonging to 
the class of indefinite articles (Art + lnd), the 
head of the noun phrase ts a feminine singular 
noun (Nfs), optionally (E) modified by an 
adjectave m pre-nomlnal posmon or a 
preposmonal phrase (PREP N) In the first 
paragraph of the sample text, the NPs 
corresponding to those structures are (underhned) 
A semelhanqa de um c6dtgo de barras que 
perrrute ldenuficar urea mfimdade de produtos, 
dependendo da sequ8ncm de ntimeros, o 
genoma humano tamb6m encerra quase todos 
os nossos segredos e, grosso modo, basta urea 
hge~ra mutaq~o hum gene para que se 
mamfeste uma doenqa ou, pelo contr~no, uma 
teslst~nc~a h mesma A toda a hora novos genes 
s.~o ~dentlficados um dm 6 um gene assocmdo 
repulsa do tabaco, noutro um que traduz .u.ma 
minor suscept~bd~dade de se ficar mfectado por 
detemunado vfrus 
d) Locating lextco-syntacttc patterns 
A regular expressaon (or a local grammar) of the 
form 
(<dever>+<poder>) (.<ADV>+<E>) <V W> 
corresponds to syntacuc constructions wath modal 
verbs dever, poder (must, can) The mare verbs 
are m the mfinmve form <V W>, an insert or an 
adverb (simple or compound) can occur between 
the two Verbs In the text, there are two 
construcuons of such type 
conhecldo Um facto que devcr,i scr alterado em Fe 
ro do pr6xlmo ano, se se puderern curnpnt as prev~s 
5 Maintaining and Increasing Dictionaries, 
using INTEX features 
5.1 Simple words 
To evaluate the coverage of the extstmg 
dlcuonary we apply ~t to vaned corpora the non- 
recognttton of a word form Indicates m general 
that (0 it is not m the dlcttonary, (n) tt was 
incorrectly formahzed (m) ~t is a proper name, 
0v) it Is an acronym, (v) It is misspelled 
Each of these failures reqmre different soluttons 
0) all the new words (with good prospects to 
79 
\[\] 
mm 
mm 
1 
\[\] 
m 
m 
1 
1 
ll 
ll 
mm 
1 
1 
1 
1 
1 
m 
mm 
1 
mm 
m 
remam m the lexicon of the language) are 
formahzed and added to the dictionary, (n) the 
erroneous entries must be corrected, (m) proper 
names must be hsted m spectal d~ct~onanes, built 
from the explorauon of existing catalogs 
However a lot of proper nouns are homographs 
wtth common ones, that m some contexts are 
written m capitals (Bush and Rose can be e~ther a 
proper noun or a common one), (tv) acronyms (if 
they have good prospects to survtve) must be 
hsted and assocmted with the words that they 
represent In general, acronyms are formally 
s~mple words, but they represent compounds Our 
expenment of braiding such dlcuonanes indicates 
that the assocmtlon of both types of lexlcal umts tt 
~s not a tnvml task 
5.2 Compound nouns 
The d~cuonanes of compound nouns are being 
enlarged m a seml-automatlc way We write 
regular expressions that correspond to typical 
patterns of compound nouns (e g <N ms> <A 
ms>), and then we ask INTEX to extract from 
texts (to which dicnonanes have been apphed 
prewously) all patterns that match that structure 
The resulting hsts, integrated into a concordance, 
contain not only the combinations of a noun and 
an adjecuve but also compound nouns of that 
form that are followed by an adjective Lmgmsts 
mteracttvely validate the hsts of candidates to 
binary or ternary compounds 
Acknowledgements 
This research was partly supported by the FCT 
(Programme PRAXIS XXI, 2/2 1/CSH/775/95) 
• 80 

References 
Baplasta J (1995), Estabelecimento e formahzafdo de 
classes de nomes compostos, M A Thesis, 
Umversldade de Llsboa 

Courtols B (1990), Un syst~me de dlctlonnmres 
61ectromques pour les mots s~mples du franqms, 
Langue Franfatse, 87, <<Dlctlonnatres 61ectromques 
du franqms>>, Paris Larousse, pp 11-22 

Eleut6no S, Ranchhod E, Frelre H, Baptlsta J 
(1995), A System of Electroruc Dictionaries of 
Portuguese Lmgvtsttcae lnvesttgatwnes, XIX 1, 
'" Amsterdam/Phlladelphm John Benjarmns 
Pubhshmg Company, pp 57-82 

Gross M (1995), Representation of Finite Utterances 
and the Automatic Parsing of Texts, Language 
Research, Vol 31, No 2, Seoul Language Research 
Institute, pp 291-307 

Gross M (1997), The Constmctaon of Local 
Grammars, F, mte-State Language Processing, 
Cambridge, Mass/London MIT Press, pp 329-354 

Laporte E (1997), Les mots Un derru-sl~cle de 
traitements, t a l, vol 38, n ° 2, Pans Association 
pour le Traitement Automahque des Langues, 
pp 47-68 

Ranchhod E (1991), Frozen Adverbs Comparatwe 
Form~ como C tn Portuguese, LmgvlsUcae 
Invesugationes, XV 1, Amsterdam/Pluladelphm 
John Benjarmns, pp 141-170 

Ranchhod E (1998a), Dlclon~inos Electr6mcos e 
Amihse Lexical Autom~iuca, In Actas do Workshop 
sobre Lmgulsttca Computactonal da APL 

Ranchhod, E and Mota C (1998b), Elaboraq,~o de 
dlclon~los termtol6gacos Seguros In Actas do 
Workshop sobre Lmgu£sttca Computactonal da APL 

Sdberztem M (1993), Dlctlonnatres 61ectromques et 
analyse automataque de textes le syst6me INTEX, 
Pans Masson, 233 p 

Sllberztem M (1997), The Lexical Analysis of Natural 
Language, Ftmte-State Language Processmg, 
Cambridge, Mass/London MIT Press, pp 175-203 
