Proceedings of EACL '99 
Full Text Parsing using Cascades of Rules: 
an Information Extraction Perspective 
Fabio Ciravegna and Alberto Lavelli 
ITC-irst Centro per la Ricerca Scientifica e Tecnologica 
via Sommarive, 18 
38050 Povo (TN) 
ITALY 
{cirave\[lavelli} Qirst.itc.it 
Abstract 
This paper proposes an approach to full 
parsing suitable for Information Extrac- 
tion from texts. Sequences of cascades of 
rules deterministically analyze the text, 
building unambiguous structures. Ini- 
tially basic chunks are analyzed; then ar- 
gumental relations are recognized; finally 
modifier attachment is performed and 
the global parse tree is built. The ap- 
proach was proven to work for three lan- 
guages and different domains. It was im- 
plemented in the IE module of FACILE, 
a EU project for multilingual text classi- 
fication and !E. 
1 Introduction 
Most successful approaches in IE (Appelt et al., 
1993; Grishman, 1995; Aone et al., 1998) make a 
very poor use of syntactic information. They are 
generally based on shallow parsing for the anal- 
ysis of (non recursive) NPs and Verba~ Groups 
(VGs). After such step regular patterns are ap- 
plied in order to trigger primitive actions that fill 
template(s); meta-rules are applied to patterns to 
cope with different syntactic clausal forms (e.g., 
passive forms). If we consider the most com- 
plex MUC-7 task (i.e., the Scenario Template task 
(MUC7, 1998)), the current technology is not able 
to provide results near an operational level (ex- 
pected F(1)=75%; the best system scored about 
50% (Aone et al., 1998)). One of the limita- 
tions of the current technology is the inability 
to extract (and to represent) syntactic relations 
among elements in the sentence, i.e. grammati- 
cal functions and thematic roles. Scenario Tem- 
plate recognition needs the correct treatment of 
syntactic relations at both sentence and text level 
(Aone et al., 1998). Full parsing systems are gen- 
erally able to correctly model syntactic relations, 
but they tend to be slow (because of huge search 
spaces) and brittle (because of gaps in the gram- 
mar). The use of big grammars partially solves the 
problem of gaps but worsens the problem of huge 
search spaces and makes grammar modifications 
difficult (Grishman, 1995). Grammar modifica- 
tions are always to be taken into account. Many 
domain-specific texts present idiosyncratic phe- 
nomena that require non-standard rules. Often 
such phenomena are limited to some cases only 
(e.g., some types of coordinations are applied to 
people only and not to organizations). Inserting 
generic rules for such structures introduces (use- 
less) extra complexity into the search space and 
- when applied indiscriminately (e.g., on classes 
other than people) - can worsen the system re- 
sults. It is not clear how semantic restrictions can 
be introduced into (big) generic grammars. 
In this paper we propose an approach to full 
parsing for IE based on cascades of rules. The 
approach is inspired by the use of finite-state cas- 
cades for parsing (e.g., (Abney, 1996) uses them 
in a project for inducing lexical dependencies from 
corpora). Our work is interesting in an IE per- 
spective because it proposes: 
• a method for efficiently and effectively per- 
forming full parsing on texts; 
• a way of organizing generic grammars that 
simplifies changes, insertion of new rules 
and especially integration of domain-oriented 
rules. 
The approach proposed in this paper for parsing 
has been extended to the whole architecture of an 
IE system. Also lexical (lexical normalization and 
preparsing), semantic (default reasoning and tem- 
plate filling) and discourse modules are based on 
the same approach. The system has been devel- 
oped as part of FACILE (Ciravegna et al., 1999), 
a successfully completed project funded by the 
European Union. FACILE deals with text clas- 
sification and information extraction from text in 
102 
Proceedings of EACL '99 
the financial domain. The proposed approach has 
been tested mainly for Italian, but proved to work 
also for English and, as of the time of this writing, 
partially for Russian. Applications and demon- 
strators have been built in four different domains. 
In this paper we first introduce the adopted for- 
malism and then go into details on grammar orga- 
nization and on the different steps through which 
parsing is accomplished. Finally we present some 
experimental results. 
2 Representation and Rules 
Every lexical element a in the input sentence w 
is abstractly represented by means of elementary 
objects, called tokens. A token T is associated 
with three structures: 
• \[T\]dep is a dependency tree for a, i.e. a tree 
representing syntactic dependencies between 
a and other lexical elements (its dependees) 
in w. 
• \[T\]leat is a feature structure representing syn- 
tactic and semantic information needed to 
combine a with other elements in the input. 
• \[T\]zy is a Quasi Logical Form (QLF) providing 
a semantic interpretation for the combination 
of a with its dependees. 
Rules operate on tokens, therefore they can access 
all the three structures above. Rules incremen- 
tally build and update the above structures. Lex- 
ical, syntactic and semantic constraints can then 
be used in rules at any level. The whole IE ap- 
proach can be based on the same formalism and 
rule types, as both lexical, syntactic and semantic 
information can be processed uniformly. 
The general form of a rule is a triple 
(Ta~, FT, FA>, 
where 
• 7c~d is a non-empty string of tokens, called 
the rule pattern; cr is called the rule core 
and is non-empty, 7, fi are called the rule con- 
text and may be empty; 
• FT is a set of boolean predicates, called rule 
test, defined over tokens in the rule pattern; 
• FA is a set of elementary operations, called 
rule action, defined over tokens in the sole 
rule core. 
The postfix, unary operators "," (Kleene star) 
and "?" (optionality operator) can be used in the 
rule patterns. 
A basic data structure, called token chart, is 
processed and dynamically maintained. This is 
a directed graph whose vertices are tokens and 
whose arcs represent binary relations from some 
(finite) basic set. Initially, the token chart is a 
chain-like graph with tokens ordered as the corre- 
sponding lexical elements in w, i.e. arcs initially 
represent lexical adjacency between tokens. Dur- 
ing the processing, arcs might be rewritten so that 
the token chart becomes a more general kind of 
graph. 
For a rule to apply, a path cr must be found 
in the token chart that, when viewed as a string 
of tokens, satisfies the two following conditions: 
(i) ~ is matched by 7a~; and (ii) all the boolean 
predicates in FT hold when evaluated on c~. 
When a rule applies, the elementary operations 
in FA are executed on the tokens of ¢ matching 
the core of the rule. The effect of action execution 
is that \[T\]dep, IT\]lear and \[Tit/are updated for the 
appropriate matching tokens. 
Rules are grouped into cascades that are fi- 
nite, ordered sequences of rules. Cascades rep- 
resent elementary logical units, in the sense that 
all rules in a cascade deal with some specific con- 
struction (e.g., subcategorization of verbs). From 
a functional point of view, a cascade is composed 
of three segments: 
• sl contains rules that deal with idiosyncratic 
cases for the construction at hand; 
• s2 contains rules dealing with the regular 
cases; 
• s3 contains default rules that fire only when 
no other rule can be successfully applied. 
3 Parsing 
The parsing model is strongly influenced by IE 
needs. Its aim is to build the sufficient IE ap- 
proximation (SIEA) of the correct parse tree for 
each sentence, i.e. a complete parse tree where 
all the relations relevant for template filling are 
represented, while other relations are left implicit. 
The parser assumes that there is one and only one 
possible correct parse tree for each sentence and 
therefore also only one SIEA. 
Parsing is based on the application of a fixed 
sequence of cascades of rules. It is performed in 
three steps using different grammars: 
* chunking (analysis of NPs, VGs and PPs); 
. subcategorization frame analysis (for verbs, 
nominalizations, etc.); 
• modifier attachment. 
103 
Proceedings of EACL '99 
\[ACME\] np 
A CME 
\[iniziare\] vg 
start 
\[in raodo da\]compt 
so to 
\[ha deciso\]vg, 
has decided, 
\[1' emis s lone\] np 
the issue 
\[divers if icare\] vg 
diversify 
\[informa\] vg 
tells 
\[di obbligazioni\] pp 
of bonds 
\[il proprio impegno\]np 
its obligation 
\[una nora\] np, \[di\] eompt 
a press release, to 
\[per 12 milioni di Euro\]pp 
for 1~ million (o\]) Euro 
\[nel mercato\]pp. 
in the market. 
Figure 1: The Italian sentence used as an example. 
The first two steps are syntax driven and based on 
generic grammars. Most rules in such grammars 
are general syntactic rules, even if they strongly 
rely on the semantic information provided by a 
foreground lexicon (see also Section 3.2). Dur- 
ing modifier attachment, mainly semantic pat- 
terns are used. At the end of these three steps 
the SIEA is available. 
We use deterministic dependency parsing 1 op- 
erating on a specific linear path within the to- 
ken chart (the parsing path); at the beginning the 
parsing path is equal to the initial token chart. 
When a rule is successfully applied, the parsing 
path is modified so that only the head token is 
visible for the application of the following rules; 
this means that the other involved elements are 
no longer available to the parser. 
3.1 Chunking 
Chunking is accomplished in a standard way. 
In Figure I an example of chunk recognition is 
shown. 2 
3.2 A-structure Analysis 
A-structure analysis is concerned with the 
recognition of argumental dependencies between 
chunks. All kinds of modifier dependencies (e.g., 
PP attachment) are disregarded during this step. 
More precisely: let w be the input sentence. 
A dependency tree for w is a tree represent- 
ing all predicate-argument and predicate-modifier 
syntactic relations between the lexical elements in 
w. The A-structure for w is a tree forest ob- 
tained from the dependency tree by unattaching 
all nodes that represent modifiers. A-structures 
are associated with the token that represents the 
semantic head of w (Tsent in the following). A- 
structure analysis associates to Ts~nt: 
• \[Tsent\]dep: the A-structure spanning the 
whole sentence; 
t Even if we use dependency parsing, in this paper 
we will make reference to constituency based struc- 
tures (e.g., PPs) because most readers are more ac- 
quainted with them than with dependency structures. 
2In this paper we use literal English translations of 
Italian examples. 
* \[Ts~.t\]f~,t: its feature structure; 
• \[Tsent\]tf: its QLF. 
A-structure analysis is performed without recur- 
sion by the successive application of three se- 
quences of rule cascades: 3 
. The first sequence of cascades performs anal- 
ysis of basic (i.e., non-recursive) sentences. 
It does so using the subcategorization frames 
of available chunks. Three cascades of rules 
are involved: one for the subcategorization 
frames of NP and PPs, one for those of VGs, 
and one for combining complementizers with 
their clausal arguments. 
• The second sequence of cascades performs 
analysis of dependencies between basic sen- 
tences. This sequence processes all sentential 
arguments and all incidentals by employing 
only two cascades of rules, without any need 
for recursion. This sequence is applied twice, 
i.e. it recognizes structures with a maximum 
of two nested sentences. 
* The third sequence of cascades performs re- 
covery analysis. During this step all tree frag- 
ments not yet connected together are merged. 
Tokens not recognized as arguments at the end of 
A-structure analysis are marked as modifiers and 
left unattached in the resulting A-structure. They 
will be attached in the parse tree during modifier 
attachment (see Section 3.3). 
We adopt a highly lexicalized approach. In 
a pure IE perspective the information for A- 
structure analysis is provided by a foreground lex- 
icon (Kilgarriff, 1997). Foreground lexica pro- 
vide detailed information about words relevant 
for the domain (e.g., subcategorization frame, re- 
lation with the ontology); for words outside the 
foreground lexicon a large background lexicon pro- 
vides generic information. The term subcatego- 
rization frame is used here in restricted sense: it 
3The order of the cascades in the sequences de- 
pends on the intrasentential structure of the specific 
language coped with. 
104 
Proceedings of EACL '99 
PATTERN TEST ACTION Matched Input 
T1 
T2* 
T3 
\[T1\]yeat.cat=NP 
\[T2\]/eat.cat=ld j unct 
\[T3\]yeat.cat=PP 
\[T3\]\]eat=\[T1\]\]e~t.subcat.int-arg 
Depend ant (IT1\] aep, \[T3\] aep) 
\[T1\]yeat.subcat.int-arg=\[T3\]yeat 
\[T1\]t/.patient =\[7"3\]ty.head 
"the issue" 
"of bonds" 
Figure 2: The rule that recognizes \[the issue of bonds\]np. 
generally includes subject, object and - possibly - 
one indirect complement. The information in the 
foreground lexicon allows to classify known tokens 
as arguments of other known tokens with high re- 
liability. Let's go back to our example. The first 
sequence of cascades recognizes: 
* \[the issue of bonds\]np: the rule in Figure 
2 recognizes \[of bonds\]pp as internal argu- 
ment of \[the issue\]np; the rule uses the in- 
formation associated with issue in the fore- 
ground lexicon, i.e. that it is the nominal- 
ization of "to issue". The subcategorization 
frame of such verb specifies its arguments 
and their semantic restrictions. The syntactic 
rule adds the condition that - being a nomi- 
nalization - the internal argument is realized 
as a PP marked by of; 
- \[ACME has decided\]ip: \[ACME\]np is the ex- 
ternal argument of \[has decided\]vg; 
• \[tells a press release\]ip: \[a press 
release\]np is the external argument of 
\[tells\]vg; 
• \[start the issue of bonds for 12 
million Euro\]ip: \[issue\]np is the internal 
argument of \[start\]v9; \[for 12 million 
Euro\]pp is a modifier (as it is not subcatego- 
rized by other tokens) and is unattached in 
the A-structure; 
- \[diversify its obligation in the 
market\]ip: \[its obligation\]np is the 
internal argument of \[diversify\]v9; \[in 
the market\]pp is a modifier; 
• \[to start the issue of bonds for 12 
million Euro\]ep: \[tO\]compl gets its argu- 
ment \[start ...lip; 
• \[so to diversify its obligation in 
the market\]cp: \[so tO\]compl gets its 
argument \[diversify . . . \]ip. 
The result after the application of the first se- 
quence is: 
\[ACME has decided\]ip 
\[tells a press release\]ip 
\[to start the issue of bonds for 12 
million Euro\] cp 
\[so to diversify its obligation in the 
market\] cp 
The second sequence recognizes that \[ACME has 
decided\]@ still needs a sentential argument, i.e. 
a subordinate clause introduced by to (such infor- 
mation comes from the foreground lexicon). Such 
argument is found after the incidental \[tells 
a press release\]: it is the CY headed by \[to 
start\]. The result of the second phase is: 
\[ACME has decided, tells a press release, 
to start the issue of bonds for 12 
million Euro\] sentence 
\[so to diversify its obligation in the 
market\] ep 
Finally, the recovery sequence collapses the two 
constituents above into a single sentence struc- 
ture; the CP is considered a clausal modifier (as 
it was not subcategorized by anything) and is left 
unattached in the A-structure. 
At the end of the A-structure recognition Tsent 
is the token associated with \[has decided'\]. 
\[Tsent\]dep is integrated with the search space for 
each unattached modifier (see Section 3.3). 
The way A-structures are produced is interest- 
ing for a number of reasons. 
First of all generic grammars are used to cope 
with generic linguistic phenomena at sentence 
level. Secondly we represent syntactic relations 
in the sentence (i.e., grammatical functions and 
thematic roles); such relations allow a better 
treatment of linguistic phenomena than possi- 
ble in shallow approaches (Aone et ah, 1998; 
105 
Proceedings of EACL '99 
Kameyama, 1997). 
The initial generic grammar is designed to cover 
the most frequent phenomena in a restrictive 
sense. Additional rules can be added to the gram- 
mar (when necessary) for coping with the un- 
covered phenomena, especially domain-specific id- 
iosyncratic forms. The limited size of the gram- 
mar makes modifications simple (the A-structure 
grammar for Italian contains 66 rules). 
The deterministic approach combined with the 
use of sequences, cascades and segments makes 
grammar modifications simple, as changes in a 
cascade (e.g., rule addition/modification) influ- 
ence only the following part of the cascade or the 
following cascades. This makes the writing and 
debugging of grammars easier than in recursive 
approaches (e.g., context-free grammars), where 
changes to a rule can influence the application of 
any rule in the grammar. 
The grammar organization in cascades and seg- 
ments allows a clean definition of the grammar 
parts. Each cascade copes with a specific phe- 
nomenon (modularity of the grammar). All the 
rules for the specific phenomenon are grouped to- 
gether and are easy to check. 
The segment/cascade structure is suitable for 
coping with the idiosyncratic phenomena of re- 
stricted corpora. As a matter of fact domain- 
oriented corpora can differ from the standard use 
of language (such as those found in generic cor- 
pora) in two ways: 
• in the frequency of the constructions for a 
specific phenomenon; 
• in presenting different (idiosyncratic) con- 
structions. 
Coping with different frequency distributions is 
conceptually easy by using deterministic parsing 
and cascades of rules, as it is just necessary to 
change the rule order within the cascade coping 
with the specific phenomenon, so that more fre- 
quently applied rules are first in the cascade. Cop- 
ing with idiosyncratic constructions requires the 
addition of new rules. Adding new rules in highly 
modularized small grammars is not complex. 
Finally from the point of view of grammar or- 
ganization, defining segments is more than just 
having ordered cascades. Generic rules ~in s2) are 
separated from domain specific ones (in sl); rules 
covering standard situations (in s2) are separated 
from recovery rules (in s3). In s2, rules are generic 
and deal with unmarked cases. In principle s2 
and s3 are units portable across the applications 
without changes. Domain-dependent rules are 
grouped together in sl and are the resources the 
application developer works on for adapting the 
grammar to the specific corpus needs (e.g., cop- 
ing with idiosyncratic cases). Such rules generally 
use contexts and/or introduce domain-dependent 
(semantic) constraints in order to limit their ap- 
plication to well defined cases. S1 rules are ap- 
plied before the standard rules and then idiosyn- 
cratic constructions have precedence with respect 
to standard forms. 
Segments also help in parsing robustly. $3 deals 
with unexpected situations, i.e. cases that could 
prevent the parser from continuing. For example 
the presence of unknown words is coped with after 
chunking by a cascade trying to guess the word's 
lexical class. If every strategy fails, a recovery 
rule includes the unknown word in the immedi- 
ately preceding chunk so to let the parser con- 
tinue. Recovery rules are applied only when rules 
in sl and s2 do not fire. 
3.3 Modifier attachment 
The aim of modifier attachment is to find the cor- 
rect position for attaching relevant modifiers in 
the parse tree and to add the proper semantic 
relations between each modifier and its modifiee 
in \[Tsent\]ty. \[Tsent\]dep and \[Tsent\]t$ are used to 
determine the correct attachments and are also 
modified during this step. Modifier attachment 
is performed in two steps: first all the possible 
attachments are computed for each modifier (its 
search space, SP). Here mainly generic syntac- 
tic rules are used. Then the correct attachment in 
the search space is determined for each modifier, 
applying domain-specific rules. The rules always 
modify both \[Tsent\]dep and \[Tsent\]ty. Only modi- 
fiers relevant for the IE task are attached in the 
proper position. Other modifiers are attached in 
a default position in \[T~ent\] d~p. 
Initially modifiers are attached in the lowest po- 
sition in \[Tsent\]dep. No semantic relation is hy- 
pothesized between each modifier and the rest of 
the sentence in \[Ts~nt\]t/. Given: 
• T~: a modifier token derived from chunk n, 
• Tn-l: the token, derived from chunk n - 1, 
immediately preceding n in the sentence, 
in right branching languages (such as Italian) the 
lowest possible attachment for T,~ is in the position 
of modifier of Tn-x. 
Afterwards, the possible SP for each modifier is 
computed. The SP for a modifier Tn is a path in 
the token chart connecting T,~ with other elements 
in \[Zsent\]dep that - from a syntactic point of view 
- can be modified by Tn. The initial SP for Tn 
is given by the path in \[T~nt\]d~p connecting Ta-1 
106 
Proceedings of EACL '99 
PATTERN 
TI 
T2* 
T3 
T4* 
T5 
TEST 
\[T1\] t/.head=TO-INCREASE 
\[T2\]/eat.cat =Adjunct 
\[T3\]i/.head=PROFIT 
\[T3\]/oor.cat=VV 
\[Ta\]/~,~t.marked=' 'of" 
\[T4\]Ieat.Cat :Adjunct 
\[Tb\] I/.head:PERCENTAGE 
\[Tb\]/eat.cat=PP 
\[Ts\]feat.marked: " of/by'' 
ACTION 
Dependant (\[T1\] dev, \[Tb\] dev) 
\[ T1\] l\] .increased-by=\[Ta\]t/.head 
Matched Input 
"an increase" 
"of profits" 
"of/by 20%" 
Figure 3: An example of modifier attachment rule. 
with Tsent. Then rules are applied to filter out 
elements in SPs according to syntactic constraints 
(e.g., NPs or PPs can be modified by a relative 
clause, but VGs can not). 
After SPs have been computed, modifiers are 
attached using a sequence of cascades of rules. 
A first cascade, mainly composed by generic 
syntactic rules, attaches subordinates (e.g., rela- 
tive clauses). Many of these rules are somehow 
similar to A-structure recognition rules. They are 
truly syntactic rules recognizing part of the sub- 
categorization frame of subordinated verbs, using 
semantic information provided by the foreground 
lexicon. Note however that they are applied onto 
SPs not on the parsing path (as A-structure rules 
are). 
Other cascades are used to attach different 
types of modifiers, such as PPs. Such rules 
mainly involve semantic constraints. For exam- 
ple, the rule shown in Figure 3 can recognize 
una crescita dei profitti del 20Y, (lit. an 
increase of profits of/by 20%).4 
4Generally rules involve two elements (i.e. the 
modifier and the modifiee), taking into account in- 
tervening elements (such as other adjuncts) that do 
not have further associated conditions. The example 
above, instead, is more complex as it introduces con- 
straints also on one of the intervening adjuncts (i.e., on 
T3). Such domain-oriented rule solves a recurring am- 
biguity in the domain of company financial results. As 
a matter of fact of/by 20% could modify both nouns 
from a syntactic and semantic point of view. The rule 
Rules for modifier attachment are easy to write. 
The SP allows to reduce complex cases to simple 
ones. For exampl e the rule in Figure 3 also applies 
to: 
• an increase in 1997 of profits of/by 
20Z 
• an increase, news report, of profits 
of/by 20Z 
• an increase, considering the 
inflation rate, of profits (both 
gross and net) of/by 20~ 
Patterns are usually developed having in mind 
the simplest case, i.e. a sequence of contiguous 
chunks in the sentence (such as in \[an increase\] 
\[of profits\] \[of/by 20%\]) that can be inter- 
leaved by other non relevant chunks. 
Conceptually this step is very similar to that 
used by shallow parsing approaches such as in 
(Grishman, 1997). Note however that rules are 
not applied on a list of contiguous chunks, but 
on the search space (where the parse tree and re- 
lated syntactic relations are available). Parse-tree 
based modifier attachment is less error prone than 
attachment performed on fiat chunk structures (as 
used in shallow parsing). For example it is possi- 
ble to avoid cases of attachments violating syntac- 
tic constraints, as it would be the case in attaching 
allows to solve the ambiguity attaching of/by 20% to 
increase. 
107 
Proceedings of EACL '99 
NP 
~pp ".. ,,, 
an increase, considering the inflation rate, of profits (both gross and net) of/by 20% 
Figure 4: Violation of syntactic constraints 
(in the third example above) profit to increase 
and 20% to inflation rate (see Figure 4). 
At the end of modifier attachment the final 
parse tree is available in \[T~ent\] aep, together with 
\[Tse,~t\]feat and \[Tsent\]g. The syntactic informa- 
tion in \[Tse,,t\]aep and \[Tsent\].feat is useful in the 
steps following parsing because, for example, the 
availability of syntactic relations increases the ac- 
curacy in determining discourse relations. As a 
matter of fact at discourse level it is possible to 
adopt strategies to compute salience for corefer- 
ence resolution that take into account both the 
syntactic relations among constituents and the ar- 
gumental structure of the sentence. Shallower ap- 
proaches do not produce anything similar either to 
\[Tse,~t\]aep or to \[Tse,~t\]yeat. They generally adopt 
heuristics such as linear ordering and recency of 
basic chunks: such heuristics have been shown not 
as effective as those based on full syntactic rela- 
tions, even if for some languages they represent an 
acceptable approximation (Kameyama, 1997). 
The obtained final parse tree is very close to 
the SIEA mentioned at the beginning of Section 
3. In this tree all the A-structures are correctly 
built and all the modifiers are attached. Modi- 
fiers relevant for the IE application are attached 
in the correct position in the tree and a valid se- 
mantic relation is established with the modifiee 
in \[Tsent\]g. Irrelevant modifiers are attached in 
a default position in the tree (the lowest possi- 
ble attachment) and a null semantic relation is 
established with the modifiee. The only differ- 
ence between the produced tree and the SIEA 
is in the A-structure, where all the relations are 
captured (and not only those relevant for the do- 
main). Modeling also the argumental structures 
of irrelevant constituents can be useful in order to 
correctly assign salience at discourse level. For ex- 
ample when interesting relations involve elements 
that are dependees of irrelevant verbs. 
4 Remarks and Conclusions 
The approach to parsing proposed in this paper 
was implemented in the IE module of FACILE 
(Ciravegna et al., 1999), a EU project for multi- 
lingual text classification and IE. It was tested on 
four domains and in three languages. In particular 
for Italian one application (about bond issues) has 
been fully developed and two others have reached 
the level of demonstration (management succes- 
sion and company financial results). For English 
a demonstrator for the field of economic indica- 
tors was developed. A Russian demonstrator for 
bond issues was developed till the level of modi- 
fier attachment. The approach to rule writing and 
organization adopted for parsing '(i.e., the type of 
rules, cascades, segments, and available primitives 
and rule interpreter) was extended to the whole 
architecture of the IE module. Also lexical (lexi- 
cal normalization and preparsing), semantic (de- 
fault reasoning and template filling) and discourse 
levels are organized in the same way. 
Provided that the approach to parsing proposed 
in this paper is strongly influenced by IE needs, 
it is difficult to evaluate it by means of the stan- 
dard tools used in the parsing community. Ap- 
proximate indications can be provided by the ef- 
fectiveness in recognizing A-structures and by the 
measures on the overall IE tasks. 
Effectiveness in recognizing A-structures was 
experimentally verified for Italian on a corpus of 
95 texts in the domain of bond issue: 33 texts 
were used for training, 62 for test. Results were: 
P=97, R=83 on the training corpus, P=95, R=71 
on the test corpus. In our opinion the high preci- 
sion demonstrates the applicability of the method. 
The lower recall shows difficulties of building com- 
plete foreground lexica, a well known fact in IE. 
Concerning the effectiveness of the IE process, 
in the Italian application on bond issues the sys- 
tem reached P=80, R--72, F(1)=76 on the 95 
108 
Proceedings of EACL '99 
texts used for development (33 ANSA agency 
news, 20 "II Sole 24 ore" newspaper articles, 42 
Radiocor agency news; 10,472 words in all). Ta- 
ble 4 shows the kind of template used for this ap- 
plication. Effectiveness was automatically calcu- 
lated by comparing the system results against a 
user-defined tagged corpus via the MUC scorer 
(Douthat, 1998). The development cycle of the 
template application was organised as follows: re- 
sources (grammars, lexicon and knowledge base) 
were developed by carefully inspecting the first 33 
texts of the corpus. Then the system was com- 
pared against the whole corpus (95 texts) with 
the following results: Recall=51, Precision=74, 
F(1)=60. Note that the corpus used for train- 
ing was composed only by ANSA news, while the 
test corpus included 20 "I1 Sole 24 ore" newspa- 
per articles and 42 Radiocor agency news (i.e., 
texts quite different from ANSA's in both termi- 
nology and length). Finally resources were tuned 
on the whole corpus mainly by focusing on the 
texts that did not reach sufficient results in terms 
of R&P. The system analyzed 1,125 word/minute 
on a Sparc Ultra 5, 128M RAM (whole IE pro- 
cess). 
issuer 
kind of bond 
amount 
currency 
announcement date 
placement date 
interest date 
maturity 
average duration 
global rate 
first rate 
a template element 
a label 
a monetary amount 
a string from the text 
a temporal expression 
a temporal expression 
a temporal expression 
a temporal expression 
a temporal expression 
a string from the text 
a string from the text 
Table 1: The template to be filled for bond issues. 
Acknowledgments 
Giorgio Satta has contributed to the whole work 
on parsing for IE via cascades of rules. The au- 
thors would like to thank him for the constant help 
and the fruitful discussions in the last two years. 
He also provided useful comments to this pa- 
per. The FACILE project (LE 2440) was partially 
funded by the European Union in the framework 
of the Language Engineering Sector. The En- 
glish demonstrator was developed by Bill Black, 
Fabio Rinaldi and David Mowatt (Umist, Manch- 
ester) as part of the FACILE project. The Russian 
demonstrator was developed by Nikolai Grigoriev 
(Russian Academy of Sciences, Moscow). 
References 
Steven Abney. 1996. Partial parsing via finite- 
state cascades. In Proceedings of the ESSLI '96 
Robust Parsing Workshop. 
Chinatsu Aone, Lauren Halverson, Tom Hamp- 
ton, and Mila Ramos-Santacruz. 1998. 
SRA: description of the IE 2 system used 
for MUC-7. In Proceedings of the Seventh 
Message Understanding Conference (MUC-7), 
http://www.muc.saic.com/. 
Douglas E. Appelt, Jerry R. Hobbs, John Bear, 
David Israel, and Mabry Tyson. 1993. FAS- 
TUS: A finite-state processor for information 
extraction from real-world text. In Proceed- 
ings of the Thirteenth International Joint Con- 
ference on Artificial Intelligence, Chambery, 
France. 
Fabio Ciravegna, Alberto Lavelli, Nadia Mann, 
Luca Gilardoni, Silvia Mazza, Massimo Ferraro, 
Johannes Matiasek, William J. Black, Fabio Ri- 
natdi, and David Mowatt. 1999. FACILE: Clas- 
sifying texts integrating pattern matching and 
information extraction. In Proceedings of the 
Sixteenth International Joint Conference on Ar- 
tificial Intelligence, Stockholm, Sweden. 
Aaron Douthat. 1998. The message un- 
derstanding conference scoring software user's 
manual. In Proceedings of the Seventh 
Message Understanding Conference (MUC- 7}, 
http://www.muc.saic.com/. 
Ralph Grishman. 1995. The NYU system for 
MUC-6 or where's syntax? In Sixth mes- 
sage understanding conference MUC-6. Morgan 
Kaufmann Publishers. 
Ralph Grishman. 1997. Information extraction: 
Techniques and challenges. In M. T. Pazienza, 
editor, Information Extraction: a multidisci- 
plinary approach to an emerging technology. 
Springer Verlag. 
Megumi Kameyama. 1997. Recognizing referen- 
tial links: An information extraction perspec- 
tive. In Mitkov and Boguraev, editors, Proceed- 
ings of ACL/EACL Workshop on Operational 
Factors in Practical, Robust Anaphora Resolu- 
tion for Unrestricted Texts, Madrid, Spain. 
Adam Kilgarriff. 1997. Foreground and back- 
ground lexicons and word sense disambiguation 
for information extraction. In International 
Workshop on Lexically Driven Information Ex- 
traction, Frascati, Italy. 
MUC7. 1998. Proceedings of the Seventh Message 
Understanding Conference (MUC-7}. SAIC, 
http://www.muc.saic.com/. 
109 
