Investigations on Event Evolution in TDT
Juha Makkonen
Department of Computer Science,
P.O. Box 26, 00014 University of Helsinki
Finland
jamakkon@cs.helsinki.fi
Abstract
Topic detection and tracking approaches mon-
itor broadcast news in order to spot new, pre-
viously unreported events and to track the de-
velopment of the previously spotted ones. The
dynamical nature of the events makes the use of
state-of-the-art methods difficult. We present a
new topic definition that has potential to model
evolving events. We also discuss incorporat-
ing ontologies into the similarity measures of
the topics, and illustrate a dynamic hierarchy
that decreases the exhaustive computation per-
formed in the TDT process. This is mainly
work-in-progress.
1 Introduction
A fairly novel area of retrieval called topic detection and
tracking (TDT) attempts to design methods to automati-
cally (1) spot new, previously unreported events, and (2)
follow the progress of the previously spotted events (Allan
et al., 1998c; Yang et al., 1998).
Our contribution deals with three problems in TDT.
Firstly, we present a new definition for a topic that would
model the event evolution, i.e., the changing nature of a
topic. The previous event definitions do not really lend
themselves to this change. Secondly, we investigate an
approach suggested by Makkonen, Ahonen-Myka and
Salmenkivi (2002). They partitioned the termspace into
four semantic classes and represented each class with a
designated vector. Unlike the term-weighting model of
Yang et al. (2002) this approach enables the use of intro-
duction of different similarity measures for each semantic
class. We formalize the comparison method and suggest
a a0 NN approach based on this formalization. Thirdly,
we suggest the use of dynamic hierarchies in a TDT sys-
tem that would decrease the exhaustive computation of
the first story detection. In practice this means that we
import text categorization on top of TDT. The purpose of
this paper is to outline the main aspects of our ongoing
and future work. As this is mainly work-in-progress, we
do not have empirical motivation for our work.
This paper is organized as follows: We will discuss the
problems of TDT in Section 2 In Section 3 we examine
the definitions of an event and a topic. Section 4 presents
a novel event representation and an approach to measure
the similarity of such elements. In Section 5 we deal with
dynamic hierarchies. In Section 6 we discuss our conclu-
sions.
2 Problems in TDT
The events are taking place in the world, and some of
them are reported in the news. A TDT system does not
perceive the events themselves, rather makes an effort in
deducing them from the continuous news-stream – which
is in a sense like the shadows on the wall in Plato’s cave
analogy. Given this setting, what is it that we are trying
to model?
Typically, the text categorization is conducted us-
ing some machine learning system (Sebastiani, 2002;
Yang and Liu, 1999). Such system is taught to rec-
ognize the difference between two or more predefined
classes or categories by providing a good number of
pre-labeled samples to learn from. As to classes and
word frequencies, this training material is assumed to
lend itself to the same underlying distribution as the
material that is to be categorized. More formally,
the documents a1a3a2a5a4a7a6a9a8a11a10a12a6a14a13a11a10a7a15a16a15a16a15a7a10a12a6a18a17a19a20a17a22a21 and their labels
a23
a2a5a4a25a24 a8 a10a26a24 a13 a10a7a15a16a15a16a15a16a10a27a24a11a17a28a29a17a30a21 yield to a unknown distribution.
This distribution is expressed as a function a31a32 that assigns
to each document-label pair
a4a34a33a35a6a37a36a27a10a27a24a26a38a40a39a42a41a43a1a45a44
a23a47a46a20a48a50a49a52a51a53a49a54a46
a1
a46
a10
a48a55a49a57a56a47a49a54a46a23a47a46
a21
a boolean value indicating their relevance, i.e., a31a32a52a58 a1a59a44
a23a61a60
a4a63a62
a48
a10
a48
a21a34a15 The task of classification is to come up
with a hypothesis a32a64a58 a1a65a44
a23a54a60
a4a63a62
a48
a10
a48
a21 that represents
                                                               Edmonton, May-June 2003
                                                 Student Research Workshop , pp. 43-48
                                                         Proceedings of HLT-NAACL 2003
a31
a32 , practically, with the ’highest’ accuracy. This accuracy
is evaluated with a pre-labeled testing material.
Now, with TDT the problem is different. Let us assume
that the documents and events yield to an unknown distri-
bution represented by the function
a31
a66
a58
a1a67a44a69a68
a60
a4a63a62
a48
a10
a48
a21
that assigns each document a6a70a36a71a41a43a1 a boolean value in-
dicating whether it discussed event a72a7a38a50a41a43a68 or not. The
problem is that domain of a68a54a2a5a4a40a72a73a8a74a10a27a72a25a13a11a10a7a15a16a15a16a15a16a10a27a72a75a17a76a53a17a77a21 is time-
dependent. The hypothesis a66 a58 a1a61a44a43a68 a60 a4a63a62 a48 a10 a48 a21 built
from the training data does not work with evaluation
data, because these two data sets do not discuss the same
events. Moreover, the events are very small in size com-
pared to categories and their identity, that is, the most im-
portant terms evolve over time. We can, however, model
similarity between two documents. By examining the
pair-wise comparisons in the training set, we can formu-
late a hypothesis a0a78a58 a1a79a44a80a1 a60 a4a34a62 a48 a10 a48 a21 that assigns
the pair a33a81a6 a36 a10a12a6 a38 a39a42a41a64a1a45a44a43a1 a boolean value 1 if the doc-
uments discuss the same event, -1 otherwise. Any two
documents of same event are (ideally) similar in a similar
way. This somewhat trivial observation has some impli-
cations worth mentioning.
Firstly, by definition news documents report changes,
something new with respect to what is already known.
This would lead to think that the identity of an event
eludes all static representations and that the representa-
tion for a topic would have to adapt automatically to the
various changes in the reporting of the event.
Secondly, so far the parameters and thresholds of the
state-of-the-art methods in IR have tried to capture this
similarity of similarity, but there does not seem be a rep-
resentation expressive enough (Allan et al., 2000).
Thirdly, the detection and tracking is based on pair-
wise comparisons which requires exhaustive computa-
tion. Yang et al. (2002) suggested topic-categories that
could be used to limit the search space of the first-story
detection. However, building topic-categories automati-
cally is difficult. In the following we outline some sug-
gestions to these problems: event modeling, event repre-
sentation and decreasing the computational cost.
3 Events and Topics
Although the concept of event might seem intuitively
clear and self-explanatory, formulating a sound definition
appears to be difficult. Predating TDT research, numer-
ous historians and researchers of political science have
wrestled with the definitions (Falk, 1989; Gerner et al.,
1994). What seems to be somewhat agreed upon is that
an event is some sort of activity conducted by some agent
and taking place somewhere at some time.
Definition 1 An event is something that happens at some
specific time and place (Yang et al., 1999).
This initial definition was adopted to TDT project
and it is intuitively quite sound. Practically all of the
events of the TDT test set yield to the temporal proxim-
ity (“burstiness”) and the compactness. However, there
is also a number of problematic cases which this def-
inition seems to neglect: events which either have a
long-lasting nature (Intifada, Kosovo–Macedonia, strug-
gle in Columbia), escalate to several large-scale threads
or campaigns (September 11), or are not tightly spatio-
temporally constrained (BSE-epidemics).
The events in the world are not as autonomous as this
definition assumes. They are often interrelated and do
not necessarily decay within weeks or a few months.
Some of these problematic events would classify as ac-
tivities (Papka, 1999), but when encountering a piece of
news, we do not know a priori whether it is a short term
event or long term activity, a start for a complex chain of
events or just a simple incident.
Definition 2 An event is a specific thing that happens at
a specific time and place along with all necessary pre-
conditions and unavoidable consequences (Cieri et al.,
2002).
This is basically a variant of Definition 1 that in some
sense tries to address the autonomy assumption. Yet, it
opens a number of questions as to what are the necessary
preconditions for certain event, an oil crisis, for exam-
ple. What are the necessary preconditions and unavoid-
able consequences of Secretary Powell’s visit to Middle
East or of suicide-bombing in Ramallah?
Definition 3 A topic is an event or an activity, along with
all related events and activities (Cieri et al., 2002).
Here, Cieri et al. explicate the connection between a
topic and an event: they are more or less synonyms. Rules
of interpretation that have been issued to help to draw the
line and to attain consistency. In TDT, there are eleven
topic types that tell what kind of other topic types are
relevant. The topic type of the topic is determined by
the seminal event. Since TDT2 and TDT3 corpora are
produced along this guideline, this is in a sense the de
facto definition.
Definition 4 A topic is a series of events, a narrative that
evolves and may fork into several distinct topics.
Definition 4 makes an attempt at addressing the chang-
ing or evolving nature of a topic. A seminal event can
lead to several things at the same time and the connec-
tion between the various outcomes and the initial cause
become less and less obvious as the events progress. As
a practical consequence, the event evolution (Yang et al.,
1999; Papka, 1999) causes changes in the vocabulary, es-
pecially in the crucial, identifying terms.
The news documents are temporally linearly ordered,
and the news stories can be said to form series of different
lengths. Identifying these chains as topics is motivated by
Falk’s investigations on historical events (Falk, 1989). A
narrative begins as soon as the first story is encountered.
Then the narrative is developed into one or more direc-
tions: simple events, like plane accidents might not have
as many sub-plots as a political scandal, a war or econom-
ical crises. Then, at some point one could say the latest
story is so different from the initial one that it is consid-
ered a first story for a new event. However, there could
remain some sort of link that these two topics (narratives)
are somehow relevant. Hence, this kind of a narrative has
a beginning, a middle and an end. An event evolution is
illustrated in Figure 1.
A
B
A
B
A
N N+1
N−1
4. K
K+1
K
N
N−1
N+1
N+2
N+3
N+4
1.
2.
3.
D
C K+1
5.
Figure 1: An example of event evolution.
Initially, in phase a48 we have only one document, a first
story a82 , an it constitutes an event that is depicted by the
dashed line. Then in phase a83 , document a84 is found rel-
evant to this event. Since it is found similar to a82 , there
is link in between them. In phase a85 there are two more
relevant documents: a23 and a86 . The former is more found
similar to a84 than to a82 , and thus it continues the off-spring
started by a84 . On the contrary, a86 appears closer to a82 and
thus it starts a new direction. Phase a87 shows two sto-
ries, a88a90a89 a48 and a91a3a89 a48 outside the dashed ellipse. This
represents a situation, where the vocabulary of the two
expulsed documents is diverging from the rest of the doc-
uments, i.e., the inner cohesion of the topic is violated
too much. The dotted ellipse represents the domain of
possible topical shifts, i.e., stories that lead too far from
the original topic. They are still regarded as part of the
topic, but are on the brink of diverging from the topic and
hence candidates for new first stories or seminal events.
Finally, in phase a92 the separation takes place: Three new
documents, a88a67a89a93a83 , a88a67a89a52a85 and a88a67a89a80a87 , are found similar
to a88a94a89 a48 . As a result, document a88a94a89 a48 is separated into
its own topic. Note that there is no follow-ups for a91a95a89 a48 ,
and therefore it is not cut off.
The problem of text summarization is similar to detect-
ing topical shifts: traces of all the main topics occurring
in the given text need to be retained in the summariza-
tion. On the other hand, text segmentation shares some
qualities with the topic shift detection. Lexical cohe-
sion (Boguraev and Neff, 2000) has been employed in the
task as well as in text segmentation (Stokes et al., 2002).
A model of Definition 4 has many open issues. For
example, what is the topic representation and what kind
of impact will there be on the evaluation? We will try to
address the former question in the following.
4 Multi-vector Event Model
It has been difficult to detect two distinct train accidents
or bombings as different events (Allan et al., 1998a).
The terms occurring in the two documents are so sim-
ilar that the term-space or the weighting-scheme in use
fails to represent the required very delicate distinction.
Furthermore, Allan, Lavrenko and Papka suspect that
only a small number of terms is adequate to make the
distinction between different news events (Allan et al.,
1998b). Intuitively, when reporting two different train
accidents, it would seem that the location and the time,
possibly some names of people, are the terms that make
up the difference. Papka observes that when increasing
the weights of noun phrases and dates the classification
accuracy improves and when decreasing them, the accu-
racy declines (Papka, 1999).
4.1 Event Vector
A news document reporting an event states at the very
barest what happened, where it happened, when it hap-
pened, and who was involved. The automatic extraction
of these facts for natural language understanding is quite
troublesome and time-consuming, and could still perform
poorly. Previous detection and tracking approaches have
tried to encapsulate these facts in a single vector. In or-
der to attain the delicate distinctions mentioned above, to
avoid the problems with the term-space maintenance and
still maintain robustness, we assign each of the questions
a semantic class, i.e., i.e. groups of semantically related
words, similarly to approach suggest by Makkonen et al.
(2002). The semantic class of LOCATIONS contains all
the places mentioned in the document, and thus gives an
idea, where the event took place. Similarly, TEMPORALS,
i.e., the temporal expressions name an object, that is, a
point or an interval of time, and bind the document onto
the time-axis. NAMES are proper noun phrases that rep-
resent the people or organizations involved in the news
story. What happened is represented by ’normal’ words
which we call TERMS.
This approach has an impact on the document and the
event representations. Instead of having just one vector,
we issue four sub-vectors – one for each semantic class
as illustrated in Figure 2.
prime ministerpalestinian
Ramallah
Yasser Arafat
Wednesday
Mahmoud Abbas
West Back
appoint
Event vector
TERMS
NAMES
LOCATIONS
TEMPORALS
Figure 2: “RAMALLAH, West Bank - Palestinian leader
Yasser Arafat appointed his longtime deputy Mahmoud
Abbas as prime minister Wednesday, . . . ” (AP: Wednes-
day, March 19, 2003)
4.2 Similarity of Hands
One could claim that the meaning of a word is in the
word’s relation to other words without getting too deep
into philosophical discussions as to what and how the
meaning is. This meaning, that is, relation, can be repre-
sented in an ontology, where similar terms relate to each
other in different manner than dissimilar ones.
The similarity of event vectors is determined class-
wise: Each semantic class has its own similarity measure,
and the over-all similarity could be the weighted sum of
these measures, for example. The interesting thing is that
now we can introduce semantics into the vector-based
similarity by mapping the terms of a semantic class onto a
formal space. Each pair of terms in this space has a sim-
ilarity, i.e., a distance. Two TEMPORAL terms relate to
each other on the time-axis, and the similarity of two LO-
CATION terms can be based on a geographical proximity
represented in an ontology. For example, the utterances
next week and the last week of March 2003 do not coin-
cide on the surface, but when evaluated with respect to the
utterance time, the expressions refer to the same tempo-
ral interval. Similarly, London and Thames can be found
relevant based on an spatial ontology. Similarity in these
ontologies could be a distance on the time-axis or a dis-
tance in a tree, as we have previously noted (Makkonen
et al., 2003).
Now, let us present the above discussion more for-
mally. Each term in the document is a member of exactly
one semantic class. Hence, the documents are composed
of the union of semantic classes, or equivalently, the doc-
ument is a structure of a language specified by the unary
relations that represent the semantic classes.
Definition 5 Let a96 be a universe and let a97
be a language consisting of a98 unary relations
a97a93a2a5a4a74a99 a8 a10a26a99 a13 a10a16a15a7a15a16a15a7a10a26a99a9a100
a46
a99 a36a71a101 a96a78a21 . A model is a97 -
structure a102a103a2a54a104a105a96a57a10a106a97a108a107 .
Now, consider a96 as the set of natural language terms
and a97 as the set of semantic classes. A document repre-
sentation would be a a97 -structure consisting of terms
a4a40a109a5a41a43a96
a46
a109a5a41
a100
a110
a36a22a111a18a8
a99a9a36a12a21a63a10
i.e., a document is simply a union of the semantic classes.
The class-wise similarity of two such structures would be
as follows:
Definition 6 Let a112a16a36 be a function a112a7a36 a58 a96a113a44a43a96 a60a61a114a115
that indicates the similarity of two elements in a99a116a36
The similarity of two a97 -structures is a function
a117
a51
a98
a58
a96a118a44a43a96
a60a61a114a115
a100 such that
a117
a51
a98a119a104a120a102a47a10a12a121a122a107a123a2a124a112 a36 a104a105a99a71a125
a36
a10a106a99a71a126
a36
a107
a100
a36a127a111a128a8
a2a130a129 a36 a15 (1)
This type of similarity we call the similarity of hands 1.
Hence, the similarity of two documents, a102 and a121 ,
would be a vector a104a35a129 a8 a10a12a129 a13 a10a16a15a7a15a16a15a131a10a27a129a11a100a50a107a42a41 a114a115 a100 . There are
many ways to go about turning the vector into a single
score (Makkonen et al., 2002). One way is to define the
similarity as a weighted sum of each value of a112a25a36 , i.e.,
a117
a51
a98a57a104a35a102a132a10a12a121a122a107a123a2
a100
a133
a36a127a111a128a8a128a134
a36 a112 a36 a104a105a99a71a125
a36 a10a106a99a71a126a36 a107a135a10 (2)
where
a134
a36a71a41
a114a115 is the relative weight of class
a99a136a36 . The sim-
ilarities a112a16a36a26a104a120a99 a125
a36
a10a26a99 a126
a36
a107 have also been interpreted as van Ri-
jsbergen’s (van Rijsbergen, 1980) similarity coefficients.
Unlike detective stories, news documents give away
the plot in the first few sentences. Therefore, the simi-
larity measure could exploit the ranking, the ordinal of
the sentence in which the term appears, in weighting the
class-wise similarity.The rank-score of a term a137 occurring
a98 times is a138
a117
a104a81a137a12a107a53a2
a100
a133
a139
a111a128a8
a48
a140a22a141a63a142
a104a35a137
a139
a107
a10 (3)
where a137
a139 is the ranking of the
a0 th instance of term
a137 .
Hence the similarity a112 a36 would yield
a112a74a143a26a144a135a145
a139a131a146
a36
a104a120a99a123a125
a36
a10a26a99a123a126
a36
a107a29a2
a145
a133
a139
a111a128a8
a138
a117
a104a81a137
a139
a107a131a10 (4)
where term a137
a139 occurs
a147 times in intersection a99 a125
a36a80a148
a99 a126
a36
.
Currently we are experimenting with similarity of
hands technique as a relevance score (Yang et al., 2000)
for ranking the a0 nearest neighbours for each semantic
1Consider a simple game where one would have to deter-
mine the similarity of two hands of cards of arbitrary size (up
to 52) drawn from two distinct decks and assume that there is a
designated similarity measure for each suit. For example, with
hearts low cards could be of more value. Furthermore, the suits
could be weighted, i.e., clubs could be trump and unchallenged
clubs would lead to dissimilarity.
class. In other words, we find the a0 nearest events with
respect to TEMPORAL, a0 nearest events with respect to
NAMES, etc. In a sense, each semantic class votes for a0
candidates based on the relevance score and the respec-
tive weight of the semantic class. Once we have the four
sets of candidates, we elect the one with highest number
of votes.
Hence, let a149a103a2a54a4a40a86 a8 a10a27a86 a13 a10a16a15a16a15a7a15a16a10a26a86a151a150a152a21 be the set of
previous a97 -structures (i.e., events). The function
a129a75a153a25a137a154a72
a117
a58
a96a113a44
a114
a88a45a44a155a96
a150
a60
a96
a139
a100 returns
a98 vectors of a97 -
structures of length a0 consisting of structures closest to
a102 with respect to relation a99 a36 . In other words,
a129a75a153a25a137a154a72
a117
a104a120a102a47a10
a0
a10a154a149a156a107a118a2 a104a35a112 a36 a104a120a102a47a10a27a86
a139
a107
a100
a36a22a111a18a8
a107
a150
a139
a111a128a8
a2 a104a12a104a35a1 a8 a107
a139
a8
a10a7a104a81a1 a13 a107
a139
a8
a10a7a15a16a15a7a15a131a10a7a104a35a1a69a100a55a107
a139
a8
a107a135a10
where a104a35a1 a36 a107
a139
a8
is a length-a0 vector of a97 -structures closest
to a102 with respect to relation a99 a36 . The election is a function
a72a25a157a35a72a25a24a135a137
a58
a96
a139
a100
a60
a96 such that
a72a25a157a35a72a25a24a135a137a131a104a27a104a81a1a158a8a135a107
a139
a8
a10a7a104a35a1a69a13a25a107
a139
a8
a10a16a15a16a15a7a15a131a10a7a104a35a1 a100 a107
a139
a8
a107a53a2
a139
a159
a160
a111a128a8
a100
a159
a36a127a111a128a8
a104a35a1a47a36a105a107a81a161a116a15
Quite obviously, the intersection is too strong a func-
tion in this case. Some vector a104a81a1 a36 a107
a139
a8 might be empty
which would make the intersection empty as well. How-
ever, we believe that it would be easier to find optimal
weights for the semantic classes via this voting scheme
than trying to optimize Equation 2, because there are less
parameters.
5 Dynamic Hierarchies
One of the problems that plagues many TDT efforts is the
need to compare each incoming document with all the
preceding documents. Even if we issue a time-window
and have a straight-forward similarity measure, the num-
ber of required comparisons increases drastically as new
documents come in. There have been efforts to decrease
the amount of work by centroid vectors (Yang et al.,
2000), and by building an ad hoc classifier for each topic-
category (Yang et al., 2002), for example.
We suggest the we adopt text categorization on top
of topic detection and tracking, similar to Figure 3.
There has been good results in text categorization (see,
e.g., (Yang and Liu, 1999; Sebastiani, 2002)) The pre-
defined categories would form the static hierarchy – the
IPTC Subject Reference System 2, for example – on top
of all event-based information organization, and the mod-
els for the categories could be built on the basis of the test
set.
Below the static hierarchy there would be a dynamic
hierarchy that evolves as new documents come in and
2International Press Telecommunications Council,
http://www.iptc.org
new topics are detected. There is also a time-window
to limit the temporal scope. Once a topic expires, it is
removed from the dynamic hierarchy and archived to a
news repository of lower operational priority.
The use of static hierarchy has some of the benefits
the topic-categories of Yang et al. (2002) had. It de-
creases the search space and enables a category-specific
weighting-scheme for terms. For example, when a docu-
ment is categorized to the class ’science’, there is no need
to compare it against the events of any other class; ideally,
all the relevant events have also been categorized to the
same class.
6 Conclusion
We have discussed three problems relating to TDT and
mainly its event evolution. The novel topic definition al-
lows the topic to evolve into several directions and ul-
timately to distinguish new topics. A semantic class-
based event vector enables harnessing of domain spe-
cific ontologies, such as the time-axis and the geographic
distances. Finally, we presented a TDT system with
dynamic hierarchies that would cut down the excessive
computation required in the TDT process.
Our previous results were done with a Finnish online
news corpus smaller than the TDT corpora (Makkonen et
al., 2002). The use of semantic classes proved to be ben-
eficial. We have also built a temporal expression scheme
and a geographical ontology for TDT purposes (Makko-
nen et al., 2003). In this paper, all our discussions were
preliminary and should be regarded as such. In the future
we will work to motivate these mostly intuitive theories
with empirical results.
References
James Allan, Jaime Carbonell, George Doddington,
Jonathan Yamron, and Yiming Yang. 1998a. Topic
detection and tracking pilot study: Final report. In
Proc. DARPA Broadcast News Transcription and Un-
derstanding Workshop, February.
James Allan, Victor Lavrenko, and Ron Papka. 1998b.
Event tracking. Technical Report IR – 128, De-
partment of Computer Science, University of Mas-
sachusetts.
James Allan, Ron Papka, and Victor Lavrenko. 1998c.
On-line new event detection and tracking. In Proc.
ACM SIGIR, pages 37–45.
James Allan, Victor Lavrenko, and Hubert Jin. 2000.
First story detection in TDT is hard. In Proc. 9th
Conference on Information Knowledge Management
CIKM, pages 374–381, McClean, VA USA.
ECONOMYARTS/ENT.
SPACE TECH.
SCIENCEPOLITICSHEALTH
EPIDEMIC MEDICAL
Columbia Shuttle Crash Asian Pneumonia
WHO: Killer Pneumonia Being contained Outside Asia
US Restricts Travel to Vietnam over Virus
... ... ... ...
STATIC
DYNAMIC
TIMELINE
Columbia Board Hearing
Killer Pneumonia Eludes Attempts to Find Cause
time−window
Ebola
Ebola kills 100 in Kongo
Recorder Find Heartens Shuttle Searches
NASA plans Shuttle’s Return by Fall
Figure 3: A dynamic hierarchy with a static IPTC taxonomy on top and a topic-based time-varying structure on the
bottom.
Branimir K. Boguraev and Mary S. Neff. 2000.
Lexical cohesion, discourse segmentation and docu-
ment summarization. In Proc.RIAO’2000 (Recherche
d’Informations Assistee par Ordinateur), pages 237–
246, Paris.
Christopher Cieri, Stephanie Strassel, David Graff, Nii
Martey, Kara Rennert, and Mark Liberman. 2002.
Corpora for topic detection and tracking. In James
Allan, editor, Topic Detection and Tracking – Event-
based Information Organization, chapter 3, pages 33–
66. Kluwer Academic Publisher.
Pasi Falk. 1989. The past to come. Economy and Soci-
ety, 17(3):374–394.
Deborah J. Gerner, Philip A. Schrodt, Ronald Francisco,
and Julie L. Weddle. 1994. The analysis of political
events using machine coded data. International Stud-
ies Quarterly, 38:91–119.
Juha Makkonen, Helena Ahonen-Myka, and Marko
Salmenkivi. 2002. Applying semantic classes in event
detection and tracking. In Proc. International Con-
ference on Natural Language Processing (ICON’02),
Mumbai, India.
Juha Makkonen, Helena Ahonen-Myka, and Marko
Salmenkivi. 2003. Topic detection and tracking with
spatio-temporal evidence. Accepted in ECIR 2003.
Ron Papka. 1999. On-line New Event Detection, Clus-
tering and Tracking. Ph.D. thesis, Department of
Computer Science, University of Massachusetts.
Fabrizio Sebastiani. 2002. Machine learning in auto-
mated text categorization. ACM Computing Surveys,
34(1):1–47.
Nicola Stokes, Joe Carthy, and Alan F. Smeaton. 2002.
Segmenting broadcast news streams using lexical
chains. In Proc. STarting AI Researchers Symposium,
(STAIRS 2002), pages 145–154, Lyon.
C. J. van Rijsbergen. 1980. Information Retrieval. But-
terworths, 2nd edition.
Yiming Yang and Xin Liu. 1999. A re-examination of
text categorization methods. In Proc. ACM SIGIR,
pages 42–49, Berkley.
Yiming Yang, Thomas Pierce, and Jaime Carbonell.
1998. A study on retrospective and on-line event de-
tection. In Proc. ACM SIGIR, pages 28–36, Mel-
bourne.
Yiming Yang, Jaime Carbonell, Ralf Brown, Thomas
Pierce, Brian T. Archibald, and Xin Liu. 1999. Learn-
ing approaches for detecting and tracking news events.
IEEE Intelligent Systems Special Issue on Applications
of Intelligent Information Retrieval, 14(4):32 – 43.
Yiming Yang, Thomas Ault, Thomas Pierce, and Charles
Lattimer. 2000. Improving text categorization meth-
ods for event detection. In Proc. ACM SIGIR, pages
65–72.
Yiming Yang, Jian Zhang, Jaime Carbonell, and Chun
Jin. 2002. Topic-conditioned novelty detection. In
Proc. ACM SIGKDD (to appear), Edmonton, Canada.
