Information Extraction from Single and Multiple Sentences
Mark Stevenson
Department of Computer Science
Regent Court, 211 Portobello Street,
University of Sheffield
Sheffield
S1 4DP, UK
marks@dcs.shef.ac.uk
Abstract
Some Information Extraction (IE) systems are limited to extracting
events expressed in a single sentence. It is not clear what effect this
has on the difficulty of the extraction task. This paper addresses the
problem by comparing a corpus which has been annotated using two
separate schemes: one which lists all events described in the text and
another listing only those expressed within a single sentence. It was
found that only 40.6% of the events in the first annotation scheme were
fully contained in the second.
1 Introduction
Information Extraction (IE) is the process of identifying specific
pieces of information in text, for example, the movements of company
executives or the victims of terrorist attacks. IE is a complex task and
the description of an event may be spread across several sentences or
paragraphs of a text. For example, Figure 1 shows two sentences from a
text describing management succession events (i.e. changes in corporate
executive management personnel). It can be seen that the fact that the
executives are leaving and the name of the organisation are listed in
the first sentence. However, the names of the executives and their posts
are listed in the second sentence although it does not mention the fact
that the executives are leaving these posts. The succession events can
only be fully understood from a combination of the information contained
in both sentences.
    Pace American Group Inc. said it notified two top executives it
    intends to dismiss them because an internal investigation found
    evidence of "self-dealing" and "undisclosed financial
    relationships." The executives are Don H. Pace, cofounder,
    president and chief executive officer; and Greg S. Kaplan, senior
    vice president and chief financial officer.

Figure 1: Event descriptions spread across two sentences

Combining the required information across sentences is not a simple task
since it is necessary to identify phrases which refer to the same
entities, "two top executives" and "the executives" in the above
example. Additional difficulties occur because the same entity may be
referred to by a different linguistic unit. For example, "International
Business Machines Ltd." may be referred to by an abbreviation ("IBM"),
nickname ("Big Blue") or anaphoric expression such as "it" or "the
company". These complications make it difficult to identify the
correspondences between different portions of the text describing an
event.
Traditionally IE systems have consisted of several components, some
responsible for analysing individual sentences and others for combining
the events they discover. These systems were often designed for a
specific extraction task and could only be modified by experts. In an
effort to overcome this brittleness, machine learning methods have been
applied to port systems to new domains and extraction tasks with minimal
manual intervention. However, some IE systems using machine learning
techniques only extract events which are described within a single
sentence; examples include (Soderland, 1999; Chieu and Ng, 2002; Zelenko
et al., 2003). Presumably an assumption behind these approaches is that
many of the events described in the text are expressed within a single
sentence and there is little to be gained from the extra processing
required to combine event descriptions.
Systems which only attempt to extract events described within a single
sentence report results across those events alone. But the proportion of
events described within a single sentence is not known and this has made
it difficult to compare the performance of those systems against ones
which extract all events from text. This question is addressed here by
comparing two versions of the same IE data set, the evaluation corpus
used in the Sixth Message Understanding Conference (MUC-6) (MUC, 1995).
The corpus produced for this exercise was annotated with all events in
the corpus, including those described across multiple sentences. An
independent annotation of the same texts was carried out by Soderland
(1999), although he only identified events which were expressed within a
single sentence. Directly comparing these data sets allows us to
determine what proportion of all the events in the corpus are described
within a single sentence.
The remainder of this paper is organised as
follows. Section 2 describes the formats for rep-
resenting events used in the MUC and Soder-
land data sets. Section 3 introduces a common
representation scheme which allows events to
be compared, a method for classifying types of
event matches and a procedure for comparing
the two data sets. The results and implications
of this experiment are presented in Section 4.
Some related work is discussed in Section 5.
2 Event Scope and Representation
The topic of the sixth MUC (MUC-6) was
management succession events (Grishman and
Sundheim, 1996). The MUC-6 data has been
commonly used to evaluate IE systems. The
test corpus consists of 100 Wall Street Jour-
nal documents from the period January 1993
to June 1994, 54 of which contained manage-
ment succession events (Sundheim, 1995). The
format used to represent events in the MUC-6
corpus is now described.
2.1 MUC Representation
Events in the MUC-6 evaluation data are
recorded in a nested template structure. This
format is useful for representing complex events
which have more than one participant, for ex-
ample, when one executive leaves a post to be
replaced by another. Figure 2 is a simplified event from the MUC-6
evaluation similar to one described by Grishman and Sundheim (1996).

    <SUCCESSION_EVENT-9402240133-2> :=
        SUCCESSION_ORG: <ORGANIZATION-9402240133-1>
        POST: "chairman"
        IN_AND_OUT: <IN_AND_OUT-9402240133-4>
        VACANCY_REASON: DEPART_WORKFORCE
    <IN_AND_OUT-9402240133-4> :=
        IO_PERSON: <PERSON-9402240133-1>
        NEW_STATUS: IN
        ON_THE_JOB: NO
        OTHER_ORG: <ORGANIZATION-9402240133-1>
        REL_OTHER_ORG: SAME_ORG
    <ORGANIZATION-9402240133-1> :=
        ORG_NAME: "McCann-Erickson"
        ORG_ALIAS: "McCann"
        ORG_TYPE: COMPANY
    <PERSON-9402240133-1> :=
        PER_NAME: "John J. Dooner Jr."
        PER_ALIAS: "John Dooner"
                   "Dooner"

Figure 2: Example Succession event in MUC format

This template describes an event in which "John J. Dooner Jr." becomes
chairman of the company "McCann-Erickson". The MUC templates are too
complex to be described fully here but some relevant features can be
discussed. Each SUCCESSION_EVENT contains the name of the POST,
organisation (SUCCESSION_ORG) and references to at least one IN_AND_OUT
sub-template, each of which records an event in which a person starts or
leaves a job. The IN_AND_OUT sub-template contains details of the PERSON
and the NEW_STATUS field which records whether the person is starting a
new job or leaving an old one.
Several of the fields, including POST, PERSON and ORGANIZATION, may
contain aliases which are alternative descriptions of the field filler
and are listed when the relevant entity was described in different ways
in the text. For example, the organisation in the above template has two
descriptions: "McCann-Erickson" and "McCann". It should be noted that
the MUC template structure does not link the field fillers onto
particular instances in the texts. Consequently if the same entity
description is used more than once then there is no simple way of
identifying which instance corresponds to the event description.
The MUC templates were manually filled by annotators who read the texts
and identified the management succession events they contained. The MUC
organisers provided strict guidelines about what constituted a
succession event and how the templates should be filled, which the
annotators sometimes found difficult to interpret (Sundheim, 1995).
Interannotator agreement was measured on 30 texts which were examined by
two annotators. It was found to be 83% when one annotator’s templates
were assumed to be correct and compared with the other.
2.2 Soderland’s Representation
Soderland (1999) describes a supervised learning system called WHISK
which learned IE rules from text with associated templates. WHISK was
evaluated on the same texts from the MUC-6 data but the nested template
structure proved too complex for the system to learn. Consequently
Soderland produced his own simpler structure to represent events which
he described as "case frames". This representation could only be used to
annotate events described within a single sentence and this reduced the
complexity of the IE rules which had to be learned.
The succession event from the sentence "Daniel Glass was named president
and chief executive officer of EMI Records Group, a unit of London’s
Thorn EMI PLC." would be represented as follows (see footnote 1):

    @@TAGS Succession
    {PersonIn DANIEL GLASS}
    {Post PRESIDENT AND CHIEF EXECUTIVE OFFICER}
    {Org EMI RECORDS GROUP}

Events in this format consist of up to four
components: PersonIn, PersonOut, Post and
Org. An event may contain all four components
although none are compulsory. The minimal sets
of components which can form an
event are (1) PersonIn, (2) PersonOut or (3)
both Post and Org. Therefore a sentence must
contain a certain amount of information to be
listed as an event in this data set: the name
of an organisation and post participating in a
management succession event or the name of a
person changing position and the direction of
that change.
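
The minimum-information condition above can be written as a small check. This is only an illustrative sketch: the function name `is_valid_case_frame` and the use of a plain dictionary keyed by slot name are assumptions made here, not part of Soderland's data format.

```python
def is_valid_case_frame(frame):
    """Return True if a case frame holds enough slots to count as an event.

    `frame` is a hypothetical dict mapping the slot names PersonIn,
    PersonOut, Post and Org to their fillers; absent slots are omitted.
    """
    # (1) PersonIn or (2) PersonOut is sufficient on its own.
    has_person = bool(frame.get("PersonIn") or frame.get("PersonOut"))
    # (3) Otherwise both Post and Org must be present.
    has_post_and_org = bool(frame.get("Post") and frame.get("Org"))
    return has_person or has_post_and_org
```

Under this check a frame containing only a Post, or only an Org, would not be listed as an event.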
Soderland created this data from the MUC-6 evaluation texts without
using any of the existing annotations. The texts were first
pre-processed using the University of Massachusetts BADGER syntactic
analyser (Fisher et al., 1995) to identify syntactic clauses and the
named entities relevant to the management succession task: people, posts
and organisations. Each sentence containing relevant entities was
examined and succession events manually identified.
Footnote 1: The representation has been simplified slightly for
clarity.
This format is more practical for machine
learning research since the entities which par-
ticipate in the event are marked directly in the
text. The learning task is simplified by the fact
that the information which describes the event
is contained within a single sentence and so the
feature space used by a learning algorithm can
be safely limited to items within that context.
3 Event Comparison
3.1 Common Representation and
Transformation
There are advantages and disadvantages to the event representation
schemes used by MUC and Soderland. The MUC templates encode more
information about the events than Soderland’s representation but the
nested template structure can make them difficult to interpret manually.
In order to allow comparison between events each data set was
transformed into a common format which contains the information stored
in both representations. In this format each event is represented as a
single database record with four fields: type, person, post and
organisation. The type field can take the values person_in, person_out
or, when the direction of the succession event is not known,
person_move. The remaining fields take the person, position and
organisation names from the text. These fields may contain alternative
values which are separated by a vertical bar ("|").
MUC events can be translated into this format in a straightforward way
since each IN_AND_OUT sub-template corresponds to a single event in the
common representation. The MUC representation is more detailed than the
one used by Soderland and so some information is discarded from the MUC
templates. For example, the VACANCY_REASON field which lists the reason
for the management succession event is not transferred to the common
format. The event listed in Figure 2 would be represented as follows:

    type(person_in)
    person(‘John J. Dooner Jr.’|‘John Dooner’|‘Dooner’)
    org(‘McCann-Erickson’|‘McCann’)
    post(chairman)

Alternative fillers for the person and org fields are listed here and
these correspond to the PER_NAME, PER_ALIAS, ORG_NAME and ORG_ALIAS
fields in the MUC template.
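
The translation from a MUC template can be sketched as follows. The dictionary layout, the function name `muc_to_common` and the use of sets for alternative fillers are assumptions made for illustration; the slot names come from Figure 2. The sketch also assumes that NEW_STATUS only takes the values IN and OUT, which may not cover every value used in the MUC data.

```python
# Assumed mapping; any other NEW_STATUS value raises a KeyError below.
STATUS_TO_TYPE = {"IN": "person_in", "OUT": "person_out"}

def muc_to_common(event, templates):
    """Build one common-format record per IN_AND_OUT sub-template."""
    org = templates.get(event.get("SUCCESSION_ORG"), {})
    records = []
    for io_id in event["IN_AND_OUT"]:
        in_and_out = templates[io_id]
        person = templates[in_and_out["IO_PERSON"]]
        record = {
            "type": {STATUS_TO_TYPE[in_and_out["NEW_STATUS"]]},
            # The name and its aliases become alternative fillers.
            "person": {person["PER_NAME"], *person.get("PER_ALIAS", [])},
        }
        if "ORG_NAME" in org:
            record["org"] = {org["ORG_NAME"], *org.get("ORG_ALIAS", [])}
        if "POST" in event:
            record["post"] = {event["POST"]}
        # Slots such as VACANCY_REASON are deliberately not copied.
        records.append(record)
    return records

# The template from Figure 2, keyed by template identifier.
templates = {
    "ORGANIZATION-9402240133-1": {
        "ORG_NAME": "McCann-Erickson", "ORG_ALIAS": ["McCann"]},
    "PERSON-9402240133-1": {
        "PER_NAME": "John J. Dooner Jr.",
        "PER_ALIAS": ["John Dooner", "Dooner"]},
    "IN_AND_OUT-9402240133-4": {
        "IO_PERSON": "PERSON-9402240133-1", "NEW_STATUS": "IN"},
}
event = {
    "SUCCESSION_ORG": "ORGANIZATION-9402240133-1",
    "POST": "chairman",
    "IN_AND_OUT": ["IN_AND_OUT-9402240133-4"],
    "VACANCY_REASON": "DEPART_WORKFORCE",
}
records = muc_to_common(event, templates)
```

Each field is stored as a set so that alternative fillers can later be compared by set intersection.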
The Soderland succession event shown in Section 2.2 would be represented
as follows in the common format:

    type(person_in)
    person(‘Daniel Glass’)
    post(‘president’)
    org(‘EMI Records Group’)

    type(person_in)
    person(‘Daniel Glass’)
    post(‘chief executive officer’)
    org(‘EMI Records Group’)

In order to carry out this transformation an event has to be generated
for each PersonIn and PersonOut mentioned in the Soderland event.
Soderland’s format also lists conjunctions of post names as a single
slot filler ("president and chief executive officer" in this example).
These are treated as separate events in the MUC format. Consequently
they are split into the separate post names and an event generated for
each in the common representation.
It is possible for a Soderland event to consist of only a Post and Org
slot (i.e. there is neither a PersonIn nor a PersonOut slot). In these
cases an underspecified type, person_move, is used and no person field
listed. Unlike MUC templates Soderland’s format does not contain
alternative names for field fillers and so these never occur when an
event in Soderland’s format is translated into the common format.
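
A minimal sketch of this transformation is given below. The function name `soderland_to_common`, the dictionary input and the set-valued fields are assumptions made for illustration. Splitting posts on the word "and" is a deliberately naive stand-in for whatever splitting was actually performed, and would wrongly divide a post title that itself contains "and".

```python
import re

def soderland_to_common(frame):
    """Expand one case frame into common-format records."""
    # Split a conjunction of posts into the separate post names.
    if frame.get("Post"):
        posts = re.split(r"\s+and\s+", frame["Post"], flags=re.IGNORECASE)
    else:
        posts = [None]
    # One event is generated for each PersonIn and PersonOut slot.
    people = [(event_type, frame[slot])
              for slot, event_type in (("PersonIn", "person_in"),
                                       ("PersonOut", "person_out"))
              if frame.get(slot)]
    if not people:
        # Only Post and Org are present: underspecified type, no person.
        people = [("person_move", None)]
    records = []
    for event_type, person in people:
        for post in posts:
            record = {"type": {event_type}}
            if person:
                record["person"] = {person}
            if post:
                record["post"] = {post}
            if frame.get("Org"):
                record["org"] = {frame["Org"]}
            records.append(record)
    return records

frame = {"PersonIn": "Daniel Glass",
         "Post": "president and chief executive officer",
         "Org": "EMI Records Group"}
records = soderland_to_common(frame)
```

For the Daniel Glass frame this yields two records, one per post, mirroring the two events listed above.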
3.2 Matching
The MUC and Soderland data sets can be compared to determine how many of
the events in the former are also contained in the latter. This provides
an indication of the proportion of events in the MUC-6 domain which are
expressible within a single sentence. Matches between Soderland and MUC
events can be classified as full, partial or nomatch. Each of these
possibilities may be described as follows:
Full A pair of events can only be fully matching if they contain the
same set of fields. In addition there must be a common filler for each
field. The following pair of events are an example of two which fully
match.

    type(person_in)
    person(‘R. Wayne Diesel’|‘Diesel’)
    org(‘Mechanical Technology Inc.’|‘Mechanical Technology’)
    post(‘chief executive officer’)

    type(person_in)
    person(‘R. Wayne Diesel’)
    org(‘Mechanical Technology’)
    post(‘chief executive officer’)

Partial A partial match occurs when one event contains a proper subset
of the fields of another event. Each field shared by the two events must
also share at least one filler. The following event would partially
match either of the above events; the org field is absent therefore the
matches would not be full.

    type(person_in)
    person(‘R. Wayne Diesel’)
    post(‘chief executive officer’)

Nomatch A pair of events do not match if the conditions for a full or
partial match are not met. This can occur if corresponding fields do not
share a filler or if the sets of fields in the two events are neither
equal nor one a subset of the other.
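
The three match types can be expressed compactly if each event is stored as a mapping from field name to a set of alternative fillers. The sketch below is one possible reading of the definitions above, not a record of the original implementation. The names `classify_match` and `comparable_fields` are hypothetical, fillers are assumed to be already normalised for case, and the underspecified person_move type is treated as an absent field, which is an assumption since the definitions do not say how that case is handled.

```python
def comparable_fields(event):
    """Drop an underspecified type so it is not compared (assumption)."""
    return {name: fillers for name, fillers in event.items()
            if not (name == "type" and fillers == {"person_move"})}

def classify_match(a, b):
    """Classify a pair of events as 'full', 'partial' or 'nomatch'."""
    fa, fb = comparable_fields(a), comparable_fields(b)
    # Every field present in both events must share at least one filler.
    if any(not (fa[name] & fb[name]) for name in fa.keys() & fb.keys()):
        return "nomatch"
    if fa.keys() == fb.keys():
        return "full"
    # One event holds a proper subset of the other's fields.
    if fa.keys() < fb.keys() or fb.keys() < fa.keys():
        return "partial"
    return "nomatch"

muc_event = {
    "type": {"person_in"},
    "person": {"R. Wayne Diesel", "Diesel"},
    "org": {"Mechanical Technology Inc.", "Mechanical Technology"},
    "post": {"chief executive officer"},
}
soderland_event = {
    "type": {"person_in"},
    "person": {"R. Wayne Diesel"},
    "org": {"Mechanical Technology"},
    "post": {"chief executive officer"},
}
without_org = {
    "type": {"person_in"},
    "person": {"R. Wayne Diesel"},
    "post": {"chief executive officer"},
}
```

With these definitions the first two example events fully match, and the event without an org field partially matches either of them.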
Matching between the two sets of events is carried out by going through
each MUC event and comparing it with each Soderland event for the same
document. The MUC event is first compared with each of the Soderland
events to check whether there are any full matches. If one is found a
note is made and the matching process moves onto the next event in the
MUC set. If a full match is not found the MUC event is again compared
with the same set of Soderland events to see whether there are any
partial matches. We allow more than one Soderland event to partially
match a MUC event so when one is found the matching process continues
through the remainder of the Soderland events to check for further
partial matches.
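
The procedure for a single document might be sketched as follows. The function names are hypothetical, and the pairwise classifier is passed in as a parameter (`classify`) which is assumed to return "full", "partial" or "nomatch" for a pair of events. For brevity the sketch classifies every pair once and then inspects the labels, which gives the same outcome as the two passes described above.

```python
from collections import Counter

def match_document(muc_events, soderland_events, classify):
    """Return (match type, indices of matching Soderland events) per MUC event."""
    results = []
    for muc in muc_events:
        labels = [classify(muc, sod) for sod in soderland_events]
        # First pass: a single full match settles the MUC event.
        full = next((i for i, label in enumerate(labels)
                     if label == "full"), None)
        if full is not None:
            results.append(("full", [full]))
            continue
        # Second pass: keep every partially matching Soderland event.
        partial = [i for i, label in enumerate(labels) if label == "partial"]
        results.append(("partial", partial) if partial else ("nomatch", []))
    return results

def count_match_types(results):
    """Tally how many MUC events fall into each match category."""
    return Counter(label for label, _ in results)
```

Because several Soderland events may be recorded against one partially matched MUC event, counts taken from the Soderland side can exceed those taken from the MUC side.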
4 Results
4.1 Event level analysis
After transforming each data set into the common format it was found
that there were 276 events listed in the MUC data and 248 in the
Soderland set. Table 1 shows the number of matches for each data set
following the matching process described in Section 3.2. The counts
under the "MUC data" and "Soderland data" headings list the number of
events which fall into each category for the MUC and Soderland data sets
respectively along with corresponding percentages of that data set. It
can be seen that 112 (40.6%) of the MUC events are fully covered by the
second data set, and 108 (39.1%) partially covered.

    Match       MUC data           Soderland data
    Type        Count    %         Count    %
    Full        112      40.6%     112      45.2%
    Partial     108      39.1%     118      47.6%
    Nomatch      56      20.3%      18       7.3%
    Total       276                248

Table 1: Counts of matches between MUC and Soderland data.

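
The percentages in Table 1 follow directly from the counts. The short check below recomputes them; the variable and function names are chosen here for illustration only.

```python
# Match counts per data set, copied from Table 1.
table1_counts = {
    "MUC": {"full": 112, "partial": 108, "nomatch": 56},
    "Soderland": {"full": 112, "partial": 118, "nomatch": 18},
}

def match_percentages(counts):
    """Express each category count as a percentage of the data set total."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1)
            for category, n in counts.items()}

table1_percentages = {name: match_percentages(counts)
                      for name, counts in table1_counts.items()}
```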
Table 1 shows that there are 108 events in
the MUC data set which partially match with
the Soderland data but that 118 events in the
Soderland data set record partial matches with
the MUC data. This occurs because the match-
ing process allows more than one Soderland
event to be partially matched onto a single
MUC event. Further analysis showed that the
difference was caused by MUC events which
were partially matched by two events in the
Soderland data set. In each case one event
contained details of the move type, person involved
and post title and another contained the
same information without the post title. This is
caused by the style in which the newswire sto-
ries which make up the MUC corpus are writ-
ten where the same event may be mentioned in
more than one sentence but without the same
level of detail. For example, one text contains
the sentence "Mr. Diller, 50 years old, succeeds
Joseph M. Segel, who has been named to the
post of chairman emeritus." which is later followed
by "At that time, it was announced that
Diller was in talks with the company on becoming
its chairman and chief executive upon Mr.
Segel’s scheduled retirement this month."
Table 1 also shows that there are 56 events in
the MUC data which fall into the nomatch cat-
egory. Each of these corresponds to an event in
one data set with no corresponding event in the
other. The majority of the unmatched MUC
events were expressed in such a way that there
was no corresponding event listed in the Soder-
land data. The events shown in Figure 1 are
examples of this. As mentioned in Section 2.2, a sentence must contain a
minimum amount of information to be marked as an event in Soderland’s
data set: either the name of an organisation and post, or the name of a
person changing position and whether they are entering or leaving. In
Figure 1 the first sentence lists the organisation and the fact that
executives were leaving. The second sentence lists the names of the
executives and their positions. Neither of these sentences contains
enough information to be listed as an event under Soderland’s
representation. Consequently the MUC events generated from these
sentences fall into the nomatch category.
It was found that there were eighteen events
in the Soderland data set which were not in-
cluded in the MUC version. This is unexpected
since the events in the Soderland corpus should
be a subset of those in the MUC corpus. Analysis showed that half of
these corresponded to spurious events in the Soderland set which could
not be matched onto events in the text. Many of these were caused by
problems with the BADGER syntactic analyser (Fisher et al., 1995) used
to pre-process the texts before the manual analysis stage in which the
events were identified. Mistakes in this pre-processing sometimes caused
the texts to read as though the sentence contained an event when it did
not. We examined the MUC texts themselves to determine whether there was
an event rather than relying on the pre-processed output.
Of the remaining nine events it was found that the majority (eight)
corresponded to events in the text which were not listed in the MUC data
set. These were not identified as events in the MUC data because of the
strict guidelines, for example that historical events and non-permanent
management moves should not be annotated. Examples of these event types
include "... Jan Carlzon, who left last year after his plan for a merger
with three other European airlines failed." and "Charles T. Young, chief
financial officer, stepped down voluntarily on a ‘temporary basis
pending conclusion’ of the investigation." The analysis also identified
one event in the Soderland data which appeared to correspond to an event
in the text but was not listed in the MUC scenario template for that
document. It could be argued that these nine events should be added to
the set of MUC events and treated as full matches. However, the MUC
corpus is commonly used as a gold standard in IE evaluation and it was
decided not to alter it. Analysis indicated that one of these nine
events would have been a full match and eight partial matches.
It is worth commenting that the analysis carried out here found errors
in both data sets. There appeared to be more of these in the Soderland
data but this may be because the event structures are much easier to
interpret and so errors can be more readily identified. It is also
difficult to interpret the MUC guidelines in some cases and it is
sometimes necessary to make a judgement over how they apply to a
particular event.
4.2 Event Field Analysis
A more detailed analysis can be carried out by examining the matches
between each of the four fields in the event representation
individually. There are 1,094 fields in the MUC data. Although there are
276 events in that data set seven of them do not mention a post and
three omit the organisation name. (Organisation names are omitted from
the template when the text mentions an organisation description rather
than its name.)
Table 2 lists the number of matches for each of the four event fields
across the two data sets. Each of the pairs of numbers in the main body
of the table refers to the number of matching instances of the relevant
field and the total number of instances in the MUC data.
The column headed "Full match" covers the MUC events which were fully
matched against the Soderland data and, as would be expected, all of
their fields are matched. The column marked "Partial match" covers the
MUC event fields which are matched onto Soderland fields via partially
matching events. The column headed "Nomatch" lists the event fields for
the 56 MUC events which are not represented at all in the Soderland
data.
Of the total 1,094 event fields in the MUC data 727 (66.5%) can be found
in the Soderland data. The rightmost column lists the percentages of
each field for which there was a match. The counts for the type and
person fields are the same since the type and person fields are combined
in Soderland’s event representation and hence can only occur together.
These figures also show that there is a wide variation between the
proportion of matches for the different fields with 76.8% of the person
and type fields being matched but only 43.2% of the organisation field.
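
The field-level proportions quoted above can be recomputed from the counts in Table 2. The following is only an arithmetic check; the names used are illustrative.

```python
# (matched via full events, matched via partial events, total in MUC data),
# copied from Table 2.
field_counts = {
    "type": (112, 100, 276),
    "person": (112, 100, 276),
    "org": (112, 6, 273),
    "post": (111, 74, 269),
}

def coverage(full, partial, total):
    """Percentage of MUC field instances found in the Soderland data."""
    return round(100 * (full + partial) / total, 1)

field_coverage = {name: coverage(*counts)
                  for name, counts in field_counts.items()}

# Overall figure across all four fields.
matched = sum(full + partial for full, partial, _ in field_counts.values())
total_fields = sum(total for _, _, total in field_counts.values())
overall_coverage = round(100 * matched / total_fields, 1)
```

The totals agree with the text: 276 events give 276 type and 276 person fields, while the seven events without a post and the three without an organisation name leave 269 and 273 instances of those fields.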
This difference between fields can be explained by looking at the style
in which the texts forming the MUC evaluation corpus are written. It is
very common for a text to introduce a management succession event near
the start of the newswire story and this event almost invariably
contains all four event fields. For example, one story starts with the
following sentence: "Washington Post Co. said Katharine Graham stepped
down after 20 years as chairman, and will be succeeded by her son,
Donald E. Graham, the company’s chief executive officer." Later in the
story further succession events may be mentioned but many of these use
an anaphoric expression (e.g. "the company") rather than explicitly
mention the name of the organisation in the event. For example, this
sentence appears later in the same story: "Alan G. Spoon, 42, will
succeed Mr. Graham as president of the company." Other stories again may
only mention the name of the person in the succession event. For
example, "Mr. Jones is succeeded by Mr. Green" and this explains why
some of the organisation fields are also absent from the partially
matched events.
4.3 Discussion
From some perspectives it is difficult to see why there is such a
difference between the number of events which are listed when the entire
text is viewed compared with considering single sentences. After all, a
text consists of an ordered list of sentences and all of the information
the text contains must be in these. However, as we have seen, individual
sentences may contain information which is difficult to connect with the
rest of the event description when the sentence is considered in
isolation.
The results presented here are, to some extent, dependent on the choices
made when representing events in the two data sets. The events listed in
Soderland’s data require a minimal amount of information to be contained
within a sentence for it to be marked as containing information about a
management succession event, although it is difficult to see how any
less information could be viewed as representing even part of a
management succession event.
5 Related Work
Huttunen et al. (2002) found that there is variation between the
complexity of IE tasks depending upon how the event descriptions are
spread through the text and the ways in which they are encoded
linguistically. The analysis presented here is consistent with their
finding as it has been observed that the MUC texts are often written in
such a way that the name of the organisation in the event is in a
different part of the text to the rest of the event description and the
entire event can only be constructed by resolving anaphoric expressions
in the text. The choice over which information about events should be
extracted could have an effect on the difficulty of the IE task.

              Full match    Partial match    Nomatch    TOTAL         %
    Type      112 / 112     100 / 108        0 / 56     212 / 276     76.8%
    Person    112 / 112     100 / 108        0 / 56     212 / 276     76.8%
    Org       112 / 112       6 / 108        0 / 53     118 / 273     43.2%
    Post      111 / 111      74 / 108        0 / 50     185 / 269     68.8%
    Total     447 / 447     280 / 432        0 / 215    727 / 1094    66.5%

Table 2: Matches between MUC and Soderland data at field level

6 Conclusions
It seems that the majority of events are not fully described within a
single sentence, at least for one of the most commonly used IE
evaluation sets. Only around 40% of events in the original MUC data set
were fully expressed within the Soderland data set. It was also found
that there is a wide variation between different event fields and some
information may be more difficult to extract from text when the
possibility of events being described across multiple sentences is not
considered. This observation should be borne in mind when deciding which
approach to use for a particular IE task and should be used to put the
results reported for IE systems which extract from a single sentence
into context.
Acknowledgements
I am grateful to Stephen Soderland for allowing
access to his version of the MUC-6 corpus and
advice on its construction. Robert Gaizauskas
and Beth Sundheim also provided advice on the
data used in the MUC evaluation. Mark Hep-
ple provided valuable comments on early drafts
of this paper. I am also grateful to an anony-
mous reviewer who provided several useful sug-
gestions.

References

H. Chieu and H. Ng. 2002. A Maximum Entropy Approach to Information
Extraction from Semi-structured and Free Text. In Proceedings of the
Eighteenth International Conference on Artificial Intelligence
(AAAI-02), pages 768-791, Edmonton, Canada.

D. Fisher, S. Soderland, J. McCarthy, F. Feng, and W. Lehnert. 1995.
Description of the UMass system as used for MUC-6. In Proceedings of the
Sixth Message Understanding Conference (MUC-6), pages 221-236, San
Francisco, CA.

R. Grishman and B. Sundheim. 1996. Message understanding conference - 6:
A brief history. In Proceedings of the 16th International Conference on
Computational Linguistics (COLING-96), pages 466-470, Copenhagen,
Denmark.

S. Huttunen, R. Yangarber, and R. Grishman. 2002. Complexity of Event
Structures in IE Scenarios. In Proceedings of the 19th International
Conference on Computational Linguistics (COLING-2002), pages 376-382,
Taipei, Taiwan.

MUC. 1995. Proceedings of the Sixth Message Understanding Conference
(MUC-6), San Mateo, CA. Morgan Kaufmann.

S. Soderland. 1999. Learning Information Extraction Rules for
Semi-structured and free text. Machine Learning, 31(1-3):233-272.

B. Sundheim. 1995. Overview of results of the MUC-6 evaluation. In
Proceedings of the Sixth Message Understanding Conference (MUC-6), pages
13-31, Columbia, MA.

D. Zelenko, C. Aone, and A. Richardella. 2003. Kernel methods for
relation extraction. Journal of Machine Learning Research, 3:1083-1106.
