TENSE TREES AS THE "FINE STRUCTURE" OF DISCOURSE 
Chung Hee Hwang & Lenhart K. Schubert
Department of Computer Science 
University of Rochester 
Rochester, New York 14627, U. S. A. 
{hwang, schubert}@cs.rochester.edu
ABSTRACT 
We present a new compositional tense-aspect deindex- 
ing mechanism that makes use of tense trees as com- 
ponents of discourse contexts. The mechanism allows 
reference episodes to be correctly identified even for 
embedded clauses and for discourse that involves shifts 
in temporal perspective, and permits deindexed logical 
forms to be automatically computed with a small num- 
ber of deindexing rules. 
1 Introduction 
Work on discourse structure, e.g., \[Reichman, 1985; 
Grosz and Sidner, 1986; Allen, 1987\], has so far taken 
a rather coarse, high-level view of discourse, mostly 
treating sentences or sentence-like entities ("utterance
units," "contributions," etc.) as the lowest-level discourse
elements. To the extent that sentences are analyzed
at all, they are simply viewed as carriers of certain
features relevant to supra-sentential discourse structure: 
cue words, tense, time adverbials, aspectual class, into- 
national cues, and others. These features are presumed 
to be extractable in some straightforward fashion and 
provide the inputs to a higher-level discourse segment 
analyzer. 
However, sentences (or their logical forms) are not in 
general "flat," with a single level of structure and fea- 
tures, but may contain multiple levels of clausal and ad- 
verbial embedding. This substructure can give rise to 
arbitrarily complex relations among the contributions 
made by the parts, such as temporal and discourse rela- 
tions among subordinate clausal constituents and events 
or states of affairs they evoke. It is therefore essen- 
tial, in a comprehensive analysis of discourse structure, 
that these intra-sentential relations be systematically 
brought to light and integrated with larger-scale dis- 
course structures. 
Our particular interest is in tense, aspect and other 
indicators of temporal structure. We are developing a 
uniform, compositional approach to interpretation in 
which a parse tree leads directly (in rule-to-rule fashion)
to a preliminary, indexical logical form, and this
LF is deindexed by processing it in the current context
(a well-defined structure). Deindexing simultaneously 
transforms the LF and the context: context-dependent 
constituents of the LF, such as operators past, pres and 
perf and adverbs like today or earlier, are replaced by 
explicit relations among quantified episodes; (anaphora 
are also deindexed, but this is not discussed here); and 
new structural components and episode tokens (and 
other information) are added to the context. This 
dual transformation is accomplished by simple recur- 
sive equivalences and equalities. The relevant context 
structures are called tense trees; these are what we pro- 
pose as the "fine structure" of discourse, or at least as 
a key component of that fine structure. 
In this paper, we first review Reichenbach's influen- 
tial work on tense and aspect. Then we describe tem- 
poral deindexing using tense trees, and extensions of 
the mechanism to handle discourse involving shifts in 
temporal perspective. 
2 Farewell to Reichenbach 
Researchers concerned with higher-level discourse struc- 
ture, e.g., Webber \[1987; 1988\], Passonneau \[1988\] and 
Song and Cohen \[1991\], have almost invariably relied on 
some Reichenbach \[1947\]-like conception of tense. The
syntactic part of this conception is that there are nine 
tenses in English, namely simple past, present and fu- 
ture tense, past, present and future perfect tense, and 
posterior past, present and future tense 1 (plus progres- 
sive variants). The semantic part of the conception is 
that each tense specifies temporal relations among ex- 
actly three times particular to a tensed clause, namely 
the event time (E), the reference time (R) and the 
speech time (S). On this conception, information in 
discourse is a matter of "extracting" one of the nine Re- 
ichenbachian tenses from each sentence, asserting the 
1 Examples of expressions in posterior tense are would, was
going to (posterior past), is going to (posterior present), and will 
be going to (posterior future). 
appropriate relations among E, R and S, and appro- 
priately relating these times to previously introduced 
times, taking account of discourse structure cues im- 
plicit in tense shifts. 
It is easy to understand the appeal of this approach 
when one's concern is with higher-level structure. By 
viewing sentences as essentially flat, carrying tense as a 
top-level feature with nine possible values and evoking a 
triplet of related times, one can get on with the higher- 
level processing with minimum fuss. But while there is 
much that is right and insightful about Reichenbach's 
conception, it seems to us unsatisfactory from a mod- 
ern perspective. One basic problem concerns embedded 
clauses. Consider, for instance, the following passage. 
(1) John will find this note when he gets home. 
(2) He will think(a) Mary has left(b). 
Reichenbach's analysis of (2) gives us Eb < S, Rb <
Ra, Ea, where t1 < t2 means t1 is before t2, as below.
\[Timeline: Eb, then S and Rb (simultaneous), then Ra and Ea (simultaneous).\]
That is, John will think that Mary's leaving took 
place some time before the speaker uttered sentence 
(2). This is incorrect; it is not even likely that John 
would know about the utterance of (2). In actuality, 
(2) only implies that John will think Mary's leaving 
took place some time before the time of his thinking, 
i.e., S < Ra, Ea and Eb < Rb, Ra, as shown below.
\[Timeline: S precedes Ra and Ea (simultaneous); Eb precedes Rb, which coincides with Ra.\]
Thus, Reichenbach's system fails to take into account 
the local context created by syntactic embedding. 
Attempts have been made to refine Reichenbach's 
theory (e.g., \[Hornstein, 1977; Smith, 1978; Nerbonne, 
1986\]), but we think the lumping together of tense 
and aspect, and the assignment of E, R, S triples to 
all clauses, are out of step with modern syntax and se- 
mantics, providing a poor basis for a systematic, com- 
positional account of temporal relations within clauses 
and between clauses. In particular, we contend that 
English past, present, future and perfect are separate 
morphemes making separate contributions to syntactic 
structure and meaning. Note that perfect have, like 
most verbs, can occur untensed ("She is likely to have 
left by now"). Therefore, if the meaning of other tensed 
verbs such as walks or became is regarded as compos- 
ite, with the tense morpheme supplying a "present" or 
"past" component of the meaning, the same ought to be 
said about tensed forms of have. The modals will and 
would do not have untensed forms. Nevertheless, con- 
siderations of syntactic and semantic uniformity suggest 
that they too have composite meanings, present or past 
tense being one part and "future modality" the other. 
This unifies the analyses of the modals in sentences like 
"He knows he will see her again" and "He knew he 
would see her again," and makes them entirely parallel 
to paraphrases in terms of going to, viz., "He knows he 
is going to see her again" and "He knew he was going 
to see her again." We take these latter "posterior tense" 
forms to be patently hierarchical (e.g., is going to see 
her has 4 levels of VP structure, counting to as an aux- 
iliary verb) and hence semantically composite on any 
compositional account. Moreover, going to can both 
subordinate, and be subordinated by, perfect have, as 
in "He is going to have left by then." This leads to ad- 
ditional "complex tenses" missing from Reichenbach's 
list. 
We therefore offer a compositional account in which 
operators corresponding to past (past), present (pres), 
future (futr) and perfect (perf) contribute separately 
and uniformly to the meanings of their operands, i.e., 
formulas at the level of LF. Thus, for instance, the temporal
relations implicit in "John will have left" are obtained
not by extracting a "future perfect" and asserting
relations among E, R and S, but rather by successively 
taking account of the meanings of the nested pres, futr 
and perf operators in the LF of the sentence. As it 
happens, each of those operators implicitly introduces 
exactly one episode, yielding a Reichenbach-like result 
in this case. (But note: a simple present sentence like 
"John is tired" would introduce only one episode con- 
current with the speech time, not two, as in Reichen- 
bach's analysis.) Even more importantly for present 
purposes, each of pres, past, futr and perf is treated uniformly
in deindexing and context change. More specifically,
they drive the generation and traversal of tense
trees in deindexing. 
3 Tense Trees 
Tense trees provide that part of a discourse context 
structure 2 which is needed to interpret (and deindex) 
temporal operators and modifiers within the logical 
form of English sentences. They differ from simple lists 
of Reichenbachian indices in that they organize episode 
tokens (for described episodes and the utterances them- 
selves) in a way that echoes the hierarchy of temporal 
and modal operators of the sentences and clauses from 
which the tokens arose. In this respect, they are anal- 
2 In general, the context structure would also contain speaker
and hearer parameters, temporal and spatial frames, and to- 
kens for salient referents other than episodes, among other 
components--see \[Allen, 1987\]. 
ogous to larger-scale representations of discourse struc- 
ture which encode the hierarchic segment structure of 
discourse. (As will be seen, the analogy goes further.) 
Tense trees for successive sentences are "overlaid" in 
such a way that related episode tokens typically end up 
as adjacent elements of lists at tree nodes. The traver- 
sal of trees and the addition of new tokens is simply and 
fully determined by the logical forms of the sentences 
being interpreted. 
The major advantage of tense trees is that they al- 
low simple, systematic interpretation (by deindexing) 
of tense, aspect, and time adverbials in texts consisting 
of arbitrarily complex sentences, and involving implicit 
temporal reference across clause and sentence bound- 
aries. This includes certain relations implicit in the 
ordering of clauses and sentences. As has been fre- 
quently observed, for a sequence of sentences within 
the same discourse segment, the temporal reference of 
a sentence is almost invariably connected to that of the 
previous sentence in some fashion. Typically, the rela- 
tion is one of temporal precedence or concurrency, de- 
pending on the aspectual class or aktionsart involved 
(cf. "John closed his suitcase; He walked to the door"
versus "John opened the door; Mary was sleeping"). 
However, in "Mary got in her Ferrari. She bought it 
with her own money," the usual temporal precedence is 
reversed (based on world knowledge). Also, other dis- 
course relations could be implied, such as cause-of, ex- 
plains, elaborates, etc. (more on this later). Whatever 
the relation may be, finding the right pair of episodes 
involved in such relations is of crucial importance for 
discourse understanding. Echoing Leech \[1987, p41\], we 
use the predicate constant orients, which subsumes all 
such relations. Note that the orients predications can 
later be used to make probabilistic or default inferences 
about the temporal or causal relations between the two 
episodes, based on their aspectual class and other infor- 
mation. In this way they supplement the information 
provided by larger-scale discourse segment structures. 
We now describe tense trees more precisely. 
Tense Tree Structure 
The form of a tense tree is illustrated in Figure 1. As 
an aid to intuition, the nodes in Figure 1 are annotated 
with simple sentences whose indexical LFs would lead 
to those nodes in the course of deindexing. A tense tree 
node may have up to three branches--a leftward past 
branch, a downward perfect branch, and a rightward 
future branch. Each node contains a stack-like list of 
recently introduced episode tokens (which we will often 
refer to simply as episodes). 
In addition to the three branches, the tree may have 
(horizontal) embedding links to the roots of embed- 
ded tense trees. There are two kinds of these embed- 
ding links, both illustrated in Figure 1. One kind, 
\[Figure 1. A Tense Tree. The utterance node and the pres root are labelled; example sentences annotating the other nodes include "He left," "He had left," "He would have left," and "She will think."\]
indicated by dashed lines, is created by subordinat- 
ing constructions such as VPs with that-complement 
clauses. The other kind, indicated by dotted lines, is 
derived from the surface speech act (e.g., telling, ask- 
ing or requesting) implicit in the mood of a sentence. 
On our view, the utterances of a speaker (or sentences 
of a text, etc.) are ultimately to be represented in 
terms of modal predications expressing these surface 
speech acts, such as \[Speaker tell Hearer (That Φ)\]
or \[Speaker ask Hearer (Whether Φ)\]. Although these
speech acts are not explicitly part of what the speaker 
uttered, they are part of what the hearer gathers from 
an utterance. Speaker and Hearer are indexical con- 
stants to be replaced by the speaker(s) and the hearer(s) 
of the utterance context. The two kinds of embedding 
links require slightly different tree traversal techniques 
as will be seen later. 
A set of trees connected by embedding links is called 
a tense tree structure (though we often refer loosely to 
tense tree structures as tense trees). This is in effect a 
tree of tense trees, since a tense tree can be embedded 
by only one other tree. At any time, exactly one node 
of the tense tree structure for a discourse is in focus, 
and the focal node is indicated by ⊙. Note that the
"tense tree" in Figure 1 is in fact a tense tree structure, 
with the lowest node in focus. 
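The structure just described can be pictured as a small data structure. The Python below is only an illustrative sketch, not the authors' implementation: the names TenseNode, TenseTreeStructure, branch, embed and shift are hypothetical, and episode tokens are represented as plain strings.

```python
class TenseNode:
    """A tense tree node (hypothetical layout, for illustration only)."""

    KINDS = ("past", "perf", "futr")   # leftward, downward, rightward

    def __init__(self, parent=None, embedder=None):
        self.episodes = []         # stack-like list; most recent token last
        self.branches = {}         # branch kind -> daughter node
        self.embedded = []         # roots of trees embedded at this node
        self.parent = parent       # mother node within the same tree
        self.embedder = embedder   # embedding node (set only on a tree root)

    def branch(self, kind):
        """Return the daughter along `kind`, creating it if necessary."""
        if kind not in self.KINDS:
            raise ValueError(f"unknown branch kind: {kind}")
        if kind not in self.branches:
            self.branches[kind] = TenseNode(parent=self)
        return self.branches[kind]

    def embed(self):
        """Embed a new tense tree at this node and return its root."""
        root = TenseNode(embedder=self)
        self.embedded.append(root)
        return root


class TenseTreeStructure:
    """A tree of tense trees with exactly one node in focus."""

    def __init__(self):
        self.root = TenseNode()
        self.focus = self.root

    def shift(self, *kinds):
        """Move the focus down the given branches, in order."""
        for kind in kinds:
            self.focus = self.focus.branch(kind)
        return self.focus


# "He would have left": past, then futr, then perf.
tts = TenseTreeStructure()
node = tts.shift("past", "futr", "perf")
node.episodes.append("e1")
```

Because each embedded root records a single embedder, a set of such trees forms a tree of tense trees, as stated above.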
By default, an episode added to the right end of a 
list at a node is "oriented" by the episode which was 
previously rightmost. For episodes stored at different 
nodes, we can read off their temporal relations from the 
tree roughly as follows. At any given moment, for a 
pair of episodes e and e' that are rightmost at nodes n 
and n', respectively, where n' is a daughter of n, if the 
branch connecting the two nodes is a past branch, \[e'
before e\] 3; if it is a perfect branch, \[e' impinges-on e\]
(as we explain later, this yields entailments \[e' before e\] 
if e' is nonstative and \[e' until e\] if e' is stative, respec- 
tively illustrated by "John has left" and "John has been 
working"); if it is a future branch, \[e' after e\]; and if it
is an embedding link, \[e' at-about e\]. These orienting
relations and temporal relations are not extracted post 
hoc, but rather are automatically asserted in the course 
of deindexing using the rules shown later. 
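The read-off scheme in this paragraph amounts to a lookup from link type to relation, plus the aspect-dependent entailment for the perfect. A minimal sketch follows; the function name read_off and the triple format are hypothetical, and the occasional same-time reading of a past branch mentioned in footnote 3 is deliberately left out.

```python
# Relation between the rightmost episode e' at a daughter node and the
# rightmost episode e at its mother, keyed by the kind of link.
LINK_RELATION = {
    "past": "before",
    "perf": "impinges-on",
    "futr": "after",
    "embed": "at-about",
}


def read_off(e_child, e_parent, link, child_is_stative=False):
    """Return (episode, relation, episode) triples for one link."""
    facts = [(e_child, LINK_RELATION[link], e_parent)]
    if link == "perf":
        # impinges-on entails `until` for statives, `before` otherwise.
        entailed = "until" if child_is_stative else "before"
        facts.append((e_child, entailed, e_parent))
    return facts


left = read_off("e_left", "e_ref", "perf")                 # "John has left"
working = read_off("e_work", "e_ref", "perf", True)        # "John has been working"
```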
As a preliminary example, consider the following pas- 
sage and a tense tree annotated with episodes derived 
from it by our deindexing rules: 
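(3) John picked up the phone.
(4) He had told Mary that he would call her.
\[Tense tree for (3)-(4): the utterance node holds u3, u4 and is linked to the root of the tense tree. The past branch from that root leads to a node holding e_pick, e1; the perf branch below it leads to a node holding e_tell, which embeds a further tree. In the embedded tree, the past branch leads to a node holding e2, and the futr branch below it leads to a node holding e_call.\]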
u3 and u4 are utterance episodes for sentences (3) and 
(4) respectively. 
Intuitively, the temporal content of sentence (4) is
that the event of John's telling, e_tell, took place before
some time e1, which is at the same time as the event
of John's picking up the phone, e_pick; and the event of
John's calling, e_call, is located after some time e2, which
is at the same time as the event of John's telling,
e_tell. For the most part, this information can be read
off directly from the tree: \[e_pick orients e1\], \[e_tell before
e1\] and \[e_call after e2\]. In addition, the deindexing rules
yield \[e2 same-time e_tell\]. From this, one may infer \[e_tell
before e_pick\] and \[e_call after e_tell\], assuming that the
orients relation defaults to same-time here.
How does \[e_pick orients e1\] default to \[e_pick same-time
e1\]? In the tense tree, e1 is an episode evoked by the
past tense operator which is part of the meaning of had
in (4). It is a stative episode, since this past operator
logically operates on a sentence of form (perf Φ),
and such a sentence describes a state in which Φ has
occurred--in this instance, a state in which John has 
told Mary that he will call her. It is this stativity of 
el which (by default) leads to a same-time interpreta- 
tion of orients. 4 Thus, on our account, the tendency 
of past perfect "reference time" to align itself with a 
3 Or, sometimes, same-time (cf. "John noticed that Mary
looked pale" vs. "Mary realized that someone broke her vase").
This is not decided in an ad hoc manner, but as a result of systematically
interpreting the context-charged relation bef_T. More
on this later.
4 More accurately, the default interpretation is \[(end-of e_pick)
same-time e1\], in view of examples involving a longer preceding
event, such as "John painted a picture. He was pleased with the 
result." 
previously introduced past event is just an instance of a 
general tendency of stative episodes to align themselves 
with their orienting episode. This is the same tendency 
noted previously for "John opened the door. Mary was 
sleeping." We leave further comments about particu- 
larizing the orients relation to a later subsection. 
We remarked that the relation \[e2 same-time e_tell\] is
obtained directly from the deindexing rules. We leave it
to the reader to verify this in detail (see Past and Futr
rules stated below). We note only that e2 is evoked
by the past tense component of would in (4), and denotes
a (possible) state in which John will call Mary.
Its stativity, and the fact that the subordinate clause
in (4) is "past-dominated," 5 causes \[e2 bef_T e_tell\] to be
deindexed to \[e2 same-time e_tell\].
We now show how tense trees are modified as dis- 
course is processed, in particular, how episode tokens 
are stored at appropriate nodes of the tense tree, and 
how deindexed LFs, with orients and temporal ordering 
relations incorporated into them, are obtained. 
Processing of Utterances
The processing of the (indexical) LF of a new utter- 
ance always begins with the root node of the current 
tense tree (structure) in focus. The processing of the 
top-level operator immediately pushes a token for the 
surface speech act onto the episode list of the root node. 
Here is a typical indexical LF: 
(decl (past \[John know (That
(past (¬ (perf \[Mary leave\]))))\]))
"John knew that Mary had not left." 
(decl stands for declarative; its deindexing rule introduces
the surface speech act of type "tell"). As mentioned
earlier, our deindexing mechanism is a compositional
one in which operators past, futr, perf, ¬, That,
decl, etc., contribute separately to the meaning of their
operands. As the LF is recursively transformed, the
tense and aspect operators encountered, past, perf and
futr, in particular, cause the focus to shift "downward"
along existing branches (or new ones if necessary). That
is, processing a past operator shifts the current focus
down to the left, creating a new branch if necessary.
The resulting tense tree is symbolized as ↙T. Similarly
perf shifts straight down, and futr shifts down to the
right, with respective results ↓T and ↘T. pres maintains
the current focus. Certain operators embed new trees
at the current node, written ↦T (e.g., That), or shift
focus to an existing embedded tree, written ↪T (e.g.,
decl). Focus shifts to a parent or embedding node are
symbolized as ↑T and ←T respectively. As a final tree
operation, ⊙T denotes storage of episode token e_T (a
new episode symbol not yet used in T) at the current
5 A node is past-dominated if there is a past branch in its ancestry
(where embedding links also count as ancestry links).
focus, as rightmost element of its episode list. As each 
node comes into focus, its episode list and the lists at 
certain nodes on the same tree path provide explicit reference
episodes in terms of which past, pres, futr, perf,
time adverbials, and implicit "orienting" relations are 
rewritten nonindexically. Eventually the focus returns 
to the root, and at this point, we have a nonindexical 
LF, as well as a modified tense tree. 
Deindexing Rules
Before we proceed with an example, we show some of 
the basic deindexing rules here. 6 In the following, "**" is
an episodic operator that connects a formula with the
situation it characterizes. Predicates are infixed and
quantifiers have restrictions (following a colon). 7
Decl: (decl Φ)_T ↔
(∃e_T: \[\[e_T same-time Now_T\] ∧
\[Last_T immediately-precedes e_T\]\]
\[\[Speaker tell Hearer (That Φ_{↪⊙T})\]
** e_T\])
Tree transform: (decl Φ)·T = ←(Φ·(↪⊙T))
Pres: (pres Φ)_T ↔
(∃e_T: \[\[e_T at-about Emb_T\] ∧ \[Last_T orients e_T\]\]
\[Φ_{⊙T} ** e_T\])
Tree transform: (pres Φ)·T = (Φ·(⊙T))
Past: (past Φ)_T ↔
(∃e_T: \[\[e_T bef_T Emb_T\] ∧ \[Last_{↙T} orients e_T\]\]
\[Φ_{⊙↙T} ** e_T\])
Tree transform: (past Φ)·T = ↑(Φ·(⊙↙T))
Futr: (futr Φ)_T ↔
(∃e_T: \[\[e_T after Emb_T\] ∧ \[Last_{↘T} orients e_T\]\]
\[Φ_{⊙↘T} ** e_T\])
Tree transform: (futr Φ)·T = ↑(Φ·(⊙↘T))
Perf: (perf Φ)_T ↔
(∃e_T: \[\[e_T impinges-on Last_T\] ∧
\[Last_{↓T} orients e_T\]\]
\[Φ_{⊙↓T} ** e_T\])
Tree transform: (perf Φ)·T = ↑(Φ·(⊙↓T))
That: (That Φ)_T ↔ (That Φ_{↦T})
Tree transform: (That Φ)·T = ←(Φ·(↦T))
As mentioned earlier, Speaker and Hearer in the Decl-rule
are to be replaced by the speaker(s) and the
hearer(s) of the utterance. Note that each equivalence 
pushes the dependence on context one level deeper into 
the LF, thus deindexing the top-level operator. The 
6See \[Hwang, 1992\] for the rest of our deindexing rules. Some 
of the omitted ones are: Fpres ( "futural present," as in "John has 
a meeting tomorrow"), Prog (progressive aspect), Pred (predica- 
tion), K, Ka and Ke ("kinds"), those for deindexing various oper- 
ators (especially, negation and adverbials), etc. 
7 For details of Episodic Logic, our semantic representation, see
\[Schubert and Hwang, 1989; Hwang and Schubert, 1991\]. 
symbols Now_T, Last_T and Emb_T refer respectively to the
speech time for the most recent utterance in T, the last-stored
episode at the current focal node, and the last-stored
episode at the current embedding node. bef_T
in the Past-rule will be replaced by either before or
same-time, depending on the aspectual class of its first
argument and on whether the focal node of T is past-dominated.
In the Perf-rule, Last_T is analogous to
the Reichenbachian reference time for the perfect. The
impinges-on relation confines its first argument e_T (the
situation or event described by the sentential operand of
perf) to the temporal region preceding the second argument.
As in the case of orients, its more specific import
depends on the aspectual types of its arguments. If e_T is
a stative episode, impinges-on entails that the state or
process involved persists to the reference time (episode),
i.e., \[e_T until Last_T\]. If e_T is an event (e.g., an accomplishment),
impinges-on entails that it occurred sometime
before the reference time, i.e., \[e_T before Last_T\],
and (by default) its main effects persist to the reference
time. 8
An Example 
To see the deindexing mechanism at work, consider now 
sentences (5a) and (6a).
(5) a. John went to the hospital. 
b. (decl ↑a (past ↑b \[John goto Hospital\])) ↑c
c. (∃ e1: \[e1 same-time Now1\]
\[\[Speaker tell Hearer (That
(∃ e2: \[e2 before e1\]
\[\[John goto Hospital\] ** e2\]))\]
** e1\])
(6) a. The doctor told John he had broken his ankle.
b. (decl ↑d (past ↑e \[Doctor tell John (That ↑f
(past ↑g (perf ↑h \[John break Ankle\])))\])) ↑i
c. (∃ e3: \[\[e3 same-time Now2\] ∧
\[e1 immediately-precedes e3\]\]
\[\[Speaker tell Hearer (That
(∃ e4: \[\[e4 before e3\] ∧ \[e2 orients e4\]\]
\[\[Doctor tell John (That
(∃ e5: \[e5 same-time e4\]
\[(∃ e6: \[e6 before e5\]
\[\[John break Ankle\] ** e6\])
** e5\]))\]
** e4\]))\]
** e3\])
8 We have formulated tentative meaning postulates to this effect
but cannot dwell on the issue here. Also, we are setting
aside certain well-known problems involving temporal adverbials 
in perfect sentences, such as the inadmissibility of * "John has left 
yesterday." For a possible approach, see \[Schubert and Hwang, 
1990\]. 
The LFs before deindexing are shown in (5,6b) (where 
the labelled arrows mark points we will refer to); the 
final, context-independent LFs are in (5,6c). The trans- 
formation from (b)'s to (c)'s and the corresponding 
tense tree transformations are done with the deindex- 
ing rules shown earlier. Anaphoric processing is presup- 
posed here. 
The snapshots of the tense tree while processing (5b)
and (6b), at points a-i, are as follows (with a null
initial context).
\[Figure: tense tree snapshots at points a-i. The utterance node holds e1 at a-c and e1, e3 at d-i. The past node below it holds e2 from b onward and e2, e4 from e onward. The tree embedded at that node holds e5 at its past node from g onward and e6 at the perf node below that from h onward.\]
The resultant tree happens to be unary, but additional 
branches would be added by further text, e.g., a future 
branch by "It will take several weeks to heal." 
What is important here is, first, that Reichenbach-like 
relations are introduced compositionally; e.g., \[e6 before 
e5\], i.e., the breaking of the ankle, e6, is before the state 
John is in at the time of the doctor's talking to him, e4. 
In addition, the recursive rules take correct account of 
embedding. For instance, the embedded present perfect 
in a sentence such as "John will think that Mary has 
left" will be correctly interpreted as relativized to John's 
(future) thinking time, rather than the speech time, as 
in a Reichenbachian analysis. 
But beyond that, episodes evoked by successive sen- 
tences, or by embedded clauses within the same sen- 
tence, are correctly connected to each other. In par- 
ticular, note that the orienting relation between John's 
going to the hospital, e2, and the doctor's diagnosis, e4, 
is automatically incorporated into the deindexed for- 
mula (6c). We can plausibly particularize this orienting 
relation to \[e4 after e2\], based on the aspectual class of 
"goto" and "tell" (see below). Thus we have established 
inter-clausal connections automatically, which in other 
approaches require heuristic discourse processing. This 
was a primary motivation for tense trees. Our scheme 
is easy to implement, and has been successfully used in 
the TRAINS interactive planning advisor at Rochester 
\[Allen and Schubert, 1991\]. 
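The rules and the worked example of this section can be traced mechanically. The following Python is a minimal sketch under stated assumptions, not the TRAINS implementation: LFs are encoded as nested tuples, all names (Node, deindex, stative) are hypothetical, only relation triples are produced rather than full deindexed formulas, each node keeps just its most recent embedded tree, operands headed by perf or futr are assumed to be stative, and bef_T is resolved to same-time exactly when the operand is stative and the focal node is past-dominated, which is one reading of the description above.

```python
import itertools


class Node:
    """One tense-tree node: an episode list plus past/perf/futr branches."""

    def __init__(self, parent=None, embedder=None, via=None):
        self.episodes, self.branches, self.embedded = [], {}, None
        self.parent, self.embedder, self.via = parent, embedder, via

    def last(self):
        return self.episodes[-1] if self.episodes else None

    def down(self, kind):                    # follow or create a branch
        if kind not in self.branches:
            self.branches[kind] = Node(parent=self, via=kind)
        return self.branches[kind]

    def embed(self, fresh):                  # fresh: new tree (That); else reuse (decl)
        if fresh or self.embedded is None:
            self.embedded = Node(embedder=self)
        return self.embedded

    def tree_root(self):
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def past_dominated(self):                # embedding links count as ancestry
        node = self
        while node is not None:
            if node.via == "past":
                return True
            node = node.parent if node.parent is not None else node.embedder
        return False


def stative(lf):
    # Assumption: perf/futr operands and ("state", ...) atoms describe states.
    return lf[0] in ("perf", "futr", "state")


def deindex(lf, focus, rels, ids):
    """Process one indexical LF at `focus`; return the focus afterwards."""
    op = lf[0]
    if op in ("pred", "state"):              # ("pred", text, *clausal_args)
        for arg in lf[2:]:
            focus = deindex(arg, focus, rels, ids)
        return focus
    if op == "That":                         # embed a new tree, then go back
        inner = deindex(lf[1], focus.embed(fresh=True), rels, ids)
        return inner.tree_root().embedder
    e = f"e{next(ids)}"
    if op == "decl":
        if focus.last():
            rels.append((focus.last(), "immediately-precedes", e))
        rels.append((e, "same-time", "Now"))
        focus.episodes.append(e)
        inner = deindex(lf[1], focus.embed(fresh=False), rels, ids)
        return inner.tree_root().embedder
    host = focus.tree_root().embedder
    emb = host.last() if host is not None else None      # Emb_T
    if op == "pres":
        rel, ref, node = "at-about", emb, focus
    elif op == "past":
        same = stative(lf[1]) and focus.past_dominated()
        rel, ref, node = ("same-time" if same else "before"), emb, focus.down("past")
    elif op == "futr":
        rel, ref, node = "after", emb, focus.down("futr")
    elif op == "perf":
        rel, ref, node = "impinges-on", focus.last(), focus.down("perf")
    else:
        raise ValueError(f"unknown operator: {op}")
    rels.append((e, rel, ref))
    if node.last():
        rels.append((node.last(), "orients", e))
    node.episodes.append(e)
    after = deindex(lf[1], node, rels, ids)
    return after if op == "pres" else after.parent


# Sentences (5b) and (6b), processed in sequence from a null context.
tree, ids, rels = Node(), itertools.count(1), []
s5 = ("decl", ("past", ("pred", "John goto Hospital")))
s6 = ("decl", ("past", ("pred", "Doctor tell John",
      ("That", ("past", ("perf", ("pred", "John break Ankle")))))))
focus = tree
for lf in (s5, s6):
    focus = deindex(lf, focus, rels, ids)
```

On this encoding the collected triples include \[e2 orients e4\] and \[e5 same-time e4\], matching (6c), while the perfect contributes \[e6 impinges-on e5\]; the sketch leaves the further step to \[e6 before e5\] to the aspect-dependent entailment described earlier.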
More on Particularizing the ORIENTS Relation
The orients relation is essentially an indicator that 
there could be a more specific discourse relation between 
the argument episodes. As mentioned, it can usually 
be particularized to one or more temporal, causal, or 
other "standard" discourse relation. Existing propos- 
als for getting these discourse relations right appear to 
be of two kinds. The first uses the aspectual classes 
of the predicates involved to decide on discourse re- 
lations, especially temporal ones, e.g., \[Partee, 1984\], 
\[Dowty, 1986\] and \[Hinrichs, 1986\]. The second ap- 
proach emphasizes inference based on world knowledge, 
e.g., \[Hobbs, 1985\] and \[Lascarides and Asher, 1991; 
Lascarides and Oberlander, 1992\]. The work by Lascarides
et al. is particularly interesting in that it makes
use of a default logic and is capable of retracting previ- 
ously inferred discourse relations. 
Our approach fully combines the use of aspectual 
class information and world knowledge. For example, in 
"Mary got in her Ferrari. She bought it with her own 
money," the successively reported "achievements" are 
by default in chronological order. Here, however, this 
default interpretation of orients is reversed by world 
knowledge: one owns things after buying them, rather 
than before. But sometimes world knowledge is mute on 
the connection. For instance, in "John raised his arm. 
A great gust of wind shook the trees," there seems to be 
no world knowledge supporting temporal adjacency or 
a causal connection. Yet we tend to infer both, perhaps 
attributing magical powers to John (precisely because 
of the lack of support for a causal connection by world 
knowledge). So in this case default conclusions based 
on orients seem decisive. In particular, we would as- 
sume that if e and e' are nonstative episodes, 9 where e 
is the performance of a volitional action and e' is not, 
then \[e orients e'\] suggests \[e right-before e'\] and (less
firmly) \[e cause-of e'\]. 10
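The aspect-based defaults mentioned so far can be collected in one small function. This is only a sketch of the defaults as stated in the text, with world knowledge left out entirely; the function name, the dictionary keys and the strength labels "default" and "weaker" are hypothetical stand-ins for the probabilistic weighting the approach actually envisages.

```python
def particularize(e, e_next):
    """Default readings of [e orients e_next]; world knowledge may override.

    Each episode is a dict with boolean keys "stative" and "volitional".
    Returns (relation, strength) pairs, strongest first.
    """
    if e_next["stative"]:
        # A stative episode tends to align with its orienting episode.
        return [("same-time", "default")]
    if e["stative"]:
        # The text states no default for a stative orienting episode.
        return []
    if e["volitional"] and not e_next["volitional"]:
        return [("right-before", "default"), ("cause-of", "weaker")]
    # Successively reported nonstative episodes: chronological order.
    return [("before", "default")]


raise_arm = {"stative": False, "volitional": True}    # "John raised his arm."
gust = {"stative": False, "volitional": False}        # "A great gust of wind ..."
sleeping = {"stative": True, "volitional": False}     # "Mary was sleeping."
readings = particularize(raise_arm, gust)
```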
4 Beyond Sentence Pairs 
The tense tree mechanism, and particularly the way in 
which it automatically supplies orienting relations, is 
well suited for longer narratives, including ones with 
tense shifts. Consider, for example, the following 
(slightly simplified) text from \[Allen, 1987, p400\]: 
(7) a. Jack and Sue went{e1} to a hardware store
b. as someone had{e2} stolen{e3} their lawnmower.
c. Sue had{e4} seen{e5} a man take it
9 Non-statives could be achievements, accomplishments, culminations,
etc. Our aspectual class system is not entirely settled
yet, but we expect to have one similar to that of \[Moens and 
Steedman, 1988\]. 
10 Our approach to plausible inference in episodic logic in general,
and to such default inferences in particular, is probabilistic
(see \[Schubert and Hwang, 1989; Hwang, 1992\]). The hope is that 
we will be able to "weigh the evidence" for or against alternative 
discourse relations (as particularizations of orients). 
d. and had{e6} chased{e7} him down the street,
e. but he had{e8} driven{e9} away in a truck.
f. After looking{e10} in the store, they realized{e11}
that they couldn't afford{e12} a new one.
Even though {b-e} would normally be considered a sub- 
segment of the main discourse {a, f}, both the temporal 
relations within each segment and the relations between 
segments (i.e., that the substory temporally precedes 
the main one) are automatically captured by our rules. 
For instance, e1 and e11 are recognized as successive
episodes, both preceded at some time in the past by
e3, e5, e7, and e9, in that order.
This is not to say that our tense tree mechanism ob- 
viates the need for larger-scale discourse structures. As 
has been pointed out by Webber \[1987; 1988\] and oth- 
ers, many subnarratives introduced by a past perfect 
sentence may continue in simple past. The following is 
one of Webber's examples: 
(8) a. I was{e1} at Mary's house yesterday.
b. We talked{e2} about her sister Jane.
c. She had{e3} spent{e4} five weeks in Alaska
with two friends.
d. Together, they climbed{e5} Mt. McKinley.
e. Mary asked{e6} whether I would want to go to
Alaska some time.
Note the shift to simple past in d, though as Webber
points out, past perfect could have been used. The
abandonment of the past perfect in favor of simple past
signals the temporary abandonment of a perspective
anchored in the main narrative, thus bringing readers
"closer" to the scene (a zoom-in effect). In such
cases, the tense tree mechanism, unaided by a notion of
higher-level discourse segment structure, would derive
incorrect temporal relations such as \[e5 orients e6\] or
\[e6 right-after e5\].
We now show possible deindexing rules for perspec- 
tive shifts, assuming for now that such shifts are inde- 
pendently identifiable, so that they can be incorporated 
into the indexical LFs. new-pers is a sentence operator
initiating a perspective shift for its operand, and prev-pers
is a sentence (with otherwise no content) which
gets back to the previous perspective. Recent_T is the
episode most recently stored in the subtree immediately
embedded by the focal node of T.
New-pers: (new-pers Φ)_T
↔ \[Φ_{↦T} ∧ \[Recent_T orients Recent_T'\]\]
where T' = Φ·(↦T)
Tree transform:
(new-pers Φ)·T = Φ·(↦T)
Prev-pers: prev-pers_T ↔ ⊤ (True)
Tree transform: prev-pers·T = ⇐T
When new-pers is encountered, a new tree is created 
and embedded at the focal node, the focus is moved to 
the root node of the new tree, and the next sentence is 
processed in that context. In contrast with other operators, 
new-pers causes an overall focus shift to the 
new tree, rather than returning the focus to the original 
root. Note that the predication \[Recent_T orients 
Recent_T'\] connects an episode of the new sentence with 
an episode of the previous sentence. prev-pers produces 
a trivial True, but it returns the focus to the embedding 
tree, simultaneously blocking the link between the 
embedding and the embedded tree (as emphasized by the 
use of the blocked-return operator ↚ instead of the 
ordinary return operator ←). 
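As a rough illustration of the two tree transforms, the following Python sketch replays the perspective shift of (8d) on a toy tense tree. It is not the implementation used in this work: the names TenseTree, embed, leave, recent, new_pers, and prev_pers, the global storage counter, and the policy of re-entering the latest unblocked embedded tree for ordinary sentences are all assumptions made for the example.

```python
import itertools

class Node:
    """Toy tense-tree node; all names here are illustrative assumptions."""
    def __init__(self, host=None):
        self.episodes = []   # (stamp, name) pairs stored at this node
        self.children = {}   # branch label ('past', 'perf', ...) -> Node
        self.embedded = []   # [root, blocked] pairs for trees embedded here
        self.host = host     # for a root: the node it is embedded at

def _episodes(node):
    """Yield all episodes at or below node, skipping blocked embedded trees."""
    yield from node.episodes
    for child in node.children.values():
        yield from _episodes(child)
    for root, blocked in node.embedded:
        if not blocked:
            yield from _episodes(root)

class TenseTree:
    def __init__(self):
        self.focus = Node()
        self._stamp = itertools.count(1)   # records global storage order

    def store(self, name, path=()):
        """Store an episode at the node reached from the focus along path."""
        node = self.focus
        for label in path:
            node = node.children.setdefault(label, Node())
        node.episodes.append((next(self._stamp), name))

    def embed(self, fresh=False):
        """Move the focus into a tree embedded at the focal node."""
        open_links = [l for l in self.focus.embedded if not l[1]]
        if fresh or not open_links:
            link = [Node(host=self.focus), False]
            self.focus.embedded.append(link)
        else:
            link = open_links[-1]          # assumption: reuse the latest open tree
        self.focus = link[0]

    def leave(self, block=False):
        """Return the focus to the embedding node, optionally blocking the link."""
        for link in self.focus.host.embedded:
            if link[0] is self.focus:
                link[1] = link[1] or block
        self.focus = self.focus.host

    def recent(self):
        """Latest episode in the most recent unblocked tree embedded at the focus."""
        open_links = [l for l in self.focus.embedded if not l[1]]
        found = list(_episodes(open_links[-1][0])) if open_links else []
        return max(found)[1] if found else None

def declare(tree, name, path):
    """Process one ordinary episode: enter the embedded tree, store, return."""
    tree.embed()
    tree.store(name, path)
    tree.leave()

def new_pers(tree, process_operand):
    before = tree.recent()        # Recent_T
    tree.embed(fresh=True)        # new tree; the focus stays at its root
    process_operand(tree)
    return ("orients", before, tree.recent())   # second argument is Recent_T'

def prev_pers(tree):
    tree.leave(block=True)        # back to the embedding tree, link blocked
    return True

tree = TenseTree()
for name, path in [("e1", ("past",)), ("e2", ("past",)),
                   ("e3", ("past",)), ("e4", ("past", "perf"))]:
    declare(tree, name, path)
relation = new_pers(tree, lambda t: declare(t, "e5", ("past",)))
prev_pers(tree)
print(relation)   # ('orients', 'e4', 'e5')
```

After prev_pers, the tree holding e5 is no longer reachable from the focus, so recent() again yields e4, the last episode of the main narrative.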
We now illustrate how tense trees get modified over 
perspective changes, using (8) as an example. We repeat 
(8d,e) below, augmenting them with perspective 
changes, and show snapshots of the tense trees at the 
points marked. In the trees, ul,...,u5 are utterance 
episodes for sentences a,..., e, respectively. 
(8) d. ↑T1 (new-pers Together, they climbed{e5} Mt. 
McKinley.) ↑T2 
prev-pers ↑T3 
e. Mary asked{e6} whether I would want to go to 
Alaska some time. ↑T4 
(Figure: tense tree snapshots T1, T2, T3, and T4, taken at the points marked in (8d,e).) 
Notice the blocked links to the embedded tree in T3 and 
T4. Also, note that Recent_T1 = e4 and Recent_T2 = e5. 
So, by New-pers, we get \[e4 orients e5\], which can be 
later particularized to \[e5 during e4\]. It is fairly obvious 
that the placement of new-pers and prev-pers operators 
is fully determined by discourse segment boundaries 
(though not in general coinciding with them). So, 
as long as the higher-level discourse segment structure 
is known, our perspective rules are easily applied. In 
that sense, the higher-level structure supplements the 
"fine structure" in a crucial way. 
However, this leaves us with a serious problem: dein- 
dexing and the context change it induces is supposed 
to be independent of "plausible inferencing"; in fact, 
it is intended to set the stage for the latter. Yet the 
determination of higher-level discourse structure--and 
hence of perspective shifts--is unquestionably a matter 
of plausible inference. For example, if past perfect is fol- 
lowed by past, this could signal either a new perspective 
within the current segment (see 8c,d), or the closing of 
the current subsegment with no perspective shift (see 
7e,f). If past is followed by past, we may have either 
a continuation of the current perspective and segment 
(see 9a,b below), or a perspective shift with opening of 
a new segment (see 9b,c), or closing of the current seg- 
ment, with resumption of the previous perspective (see 
9c,d). 
(9) a. Mary found that her favorite vase was broken. 
b. She was upset. 
c. She bought it at a special antique auction, 
d. and she was afraid she wouldn't be able to find 
anything that beautiful again. 
Only plausible inference can resolve these ambiguities. 
This inference process will interact with resolution of 
anaphora and introduction of new individuals, identifi- 
cation of spatial and temporal frames, the presence of 
modal/cognition/perception verbs, and most of all will 
depend on world knowledge. In (9), for instance, one 
may have to rely on the knowledge that one normally 
would not buy broken things, or that one does not buy 
things one already owns. 
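The ambiguities listed above can be summarized as a small lookup table. The Python sketch below records only the readings enumerated in the text. The names READINGS, candidate_readings, and is_ambiguous are hypothetical, and the table makes no attempt to choose among the readings, which is the job of plausible inference.

```python
# Candidate readings for a pair of consecutive tenses, as enumerated in the text.
READINGS = {
    ("past perfect", "past"): [
        "new perspective within the current segment",          # cf. (8c,d)
        "close current subsegment, no perspective shift",      # cf. (7e,f)
    ],
    ("past", "past"): [
        "continue current perspective and segment",            # cf. (9a,b)
        "perspective shift, opening a new segment",            # cf. (9b,c)
        "close current segment, resume previous perspective",  # cf. (9c,d)
    ],
}

def candidate_readings(prev_tense, next_tense):
    """Return the readings listed for this tense pair (empty if none is listed)."""
    return READINGS.get((prev_tense, next_tense), [])

def is_ambiguous(prev_tense, next_tense):
    """A pair is ambiguous when surface tense alone leaves several readings open."""
    return len(candidate_readings(prev_tense, next_tense)) > 1

print(len(candidate_readings("past", "past")))   # 3
```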
As approaches to this general difficulty, we are considering 
the following two strategies: (A) Make a best initial 
guess about presence or absence of new-pers/prev-pers, 
based on surface (syntactic) cues, and then use 
failure-driven backtracking if the resulting interpretation 
is incoherent. A serious disadvantage would be lack 
of integration with other forms of disambiguation. (B) 
Change the interpretation of Last_T, in effect providing 
multiple alternative referents for the first argument of 
orients. In particular, we might use 
Last_T = {e_i | e_i is the last-stored episode at the 
focus of T, or was stored in the subtree 
rooted at the focus of T after the last- 
stored episode at the focus of T}. 
Subsequent processing would resemble anaphora disam- 
biguation. In the course of further interpreting the dein- 
dexed LF, plausible inference would particularize the 
schematic orienting relation to a temporal (or causal, 
etc.) relation involving just two episodes. The result 
would then be used to make certain structural changes 
to the tense tree (after LF deindexing). 
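The set-valued Last_T of strategy (B) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the Node class, the integer storage stamps, and the layout of the tree after (8a-c), with e1, e2, e3 at the past node and e4 at its perfect child, are assumed for the example, as is the choice to return an empty set when nothing is stored at the focus.

```python
class Node:
    """Illustrative tense-tree node; the class and field names are assumptions."""
    def __init__(self):
        self.episodes = []   # (stamp, name) pairs in storage order
        self.children = {}   # branch label -> Node

def subtree_episodes(node):
    """Yield every (stamp, name) pair stored at node or below it."""
    yield from node.episodes
    for child in node.children.values():
        yield from subtree_episodes(child)

def last_candidates(focus):
    """Candidate referents for Last_T under strategy (B)."""
    if not focus.episodes:
        return set()   # assumption: no candidates if nothing is stored at the focus
    stamp, name = focus.episodes[-1]
    # Episodes stored anywhere in the subtree after the last one at the focus.
    later = {n for s, n in subtree_episodes(focus) if s > stamp}
    return {name} | later

# Assumed state after (8a-c): e1, e2, e3 at the past node, e4 at its perfect child.
past = Node()
past.episodes = [(1, "e1"), (2, "e2"), (3, "e3")]
past.children["perf"] = Node()
past.children["perf"].episodes = [(4, "e4")]
candidates = last_candidates(past)
print(sorted(candidates))   # ['e3', 'e4']
```

For (8d), plausible inference would then have to choose between e3 and e4 as the first argument of orients, much as in anaphora disambiguation.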
For instance, suppose such a schematic orienting re- 
lation is computed for a simple past sentence following 
a past perfect sentence (like 8c,d). Suppose further that 
the most coherent interpretation of the second sentence 
(i.e., 8d) is one that disambiguates the orienting rela- 
tion as a simple temporal inclusion relation between the 
successively reported events. One might then move the 
event token for the second event (reported in simple 
past) from its position at the past node to the right- 
most position at the past perfect node, just as if the sec- 
ond event had been reported in the past perfect. (One 
might in addition record a perspective shift, if this is 
still considered useful.) In other words, we would "re- 
pair" the distortion of the tense tree brought about by 
the speaker's "lazy" use of simple past in place of past 
perfect. Then we would continue as before. 
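The repair step just described amounts to moving one event token between two nodes. A minimal Python sketch follows, assuming for illustration that a tense tree is flattened into a dictionary from node paths to episode lists. The function name repair_lazy_past and the path keys "past" and "past.perf" are hypothetical.

```python
def repair_lazy_past(tree, episode, past="past", perf="past.perf"):
    """Move episode from the past node to the rightmost slot of the past perfect node."""
    tree[past].remove(episode)                 # undo the storage caused by simple past
    tree.setdefault(perf, []).append(episode)  # store as if past perfect had been used
    return tree

# Assumed state after (8a-d) without any perspective operators:
# e5 was stored at the past node because (8d) is in simple past.
tree = {"past": ["e1", "e2", "e3", "e5"], "past.perf": ["e4"]}
repair_lazy_past(tree, "e5")
print(tree)   # {'past': ['e1', 'e2', 'e3'], 'past.perf': ['e4', 'e5']}
```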
In both strategies we have assumed a general 
coherence-seeking plausible inference process. While it 
is clear that the attainment of coherence entails delin- 
eation of discourse segment structure and of all relevant 
temporal relations, it remains unclear in which direction 
the information flows. Are there independent principles 
of discourse and temporal structure operating above the 
level of syntax and LF, guiding the achievement of full 
understanding, or are higher-level discourse and tem- 
poral relations a mere byproduct of full understanding? 
Webber \[1987\] has proposed independent temporal focusing 
principles similar to those in \[Grosz and Sidner, 
1986\] for discourse. These are not deterministic, 
and Song and Cohen \[1991\] sought to add heuristic con- 
straints as a step toward determinism. For instance, 
one constraint is based on the presumed incoherence 
of simple present followed by past perfect or posterior 
past. But there are counterexamples; e.g., "Mary is 
angry about the accident. The other driver had been 
drinking." Thus, we take the question about indepen- 
dent structural principles above the level of syntax and 
LF to be still open. 
5 Conclusion 
We have shown that tense and aspect can be analyzed 
compositionally in a way that accounts not only for their 
more obvious effects on sentence meaning but also, via 
tense trees, for their cumulative effect on context and 
the temporal relations implicit in such contexts. As 
such, the analysis seems to fit well with higher-level 
analyses of discourse segment structure, though ques- 
tions remain about the flow of information between lev- 
els. 
Acknowledgements 
We gratefully acknowledge helpful comments by James 
Allen and Philip Harrison on an earlier draft and much 
useful feedback from the members of the TRAINS group 
at the University of Rochester. This work was sup- 
ported in part by NSERC Operating Grant A8818 and 
ONR/DARPA research contract no. N00014-82-K-0193, 
and the Boeing Co. under Purchase Contract W-288104. 
A preliminary version of this paper was presented at 
the AAAI Fall Symposium on Discourse Structure in 
Natural Language Understanding and Generation, Pa- 
cific Grove, CA, November 1991. 

References 
\[Allen, 1987\] J. Allen, Natural Language Understand- 
ing, Chapter 14. Benjamin/Cummings Publ. Co., 
Reading, MA. 
\[Allen and Schubert, 1991\] J. Allen and L. K. Schu- 
bert, "The TRAINS project," TR 382, Dept. of Comp. 
Sci., U. of Rochester, Rochester, NY. 
\[Dowty, 1986\] D. Dowty, "The effect of aspectual 
classes on the temporal structure of discourse: se- 
mantics or pragmatics?" Linguistics and Philosophy, 
9(1):37-61. 
\[Grosz and Sidner, 1986\] B. J. Grosz and C. L. Sidner, 
"Attention, intentions, and the structure of discourse," 
Computational Linguistics, 12:175-204. 
\[Hinrichs, 1986\] E. Hinrichs, "Temporal anaphora in 
discourses of English," Linguistics and Philosophy, 
9(1):63-82. 
\[Hobbs, 1985\] J. R. Hobbs, "On the coherence and 
structure of discourse," Technical Report CSLI-85- 
37, Stanford, CA. 
\[Hornstein, 1977\] N. Hornstein, "Towards a theory of 
tense," Linguistic Inquiry, 3:521-557. 
\[Hwang, 1992\] C. H. Hwang, A Logical Framework for 
Narrative Understanding, PhD thesis, U. of Alberta, 
Edmonton, Canada, 1992, To appear. 
\[Hwang and Schubert, 1991\] C. H. Hwang and L. K. 
Schubert, "Episodic Logic: A situational logic for 
natural language processing," In 3rd Conf. on Situation 
Theory and its Applications (STA-3), Oiso, 
Kanagawa, Japan, November 18-21, 1991. 
\[Lascarides and Asher, 1991\] A. Lascarides and N. 
Asher, "Discourse relations and defeasible knowl- 
edge," In Proc. 29th Annual Meeting of the ACL, 
pages 55-62. Berkeley, CA, June 18-21, 1991. 
\[Lascarides and Oberlander, 1992\] A. Lascarides and 
J. Oberlander, "Temporal coherence and defeasible 
knowledge," Theoretical Linguistics, 8, 1992, To ap- 
pear. 
\[Leech, 1987\] G. Leech, Meaning and the English Verb 
(2nd ed), Longman, London, UK. 
\[Moens and Steedman, 1988\] M. Moens and M. Steed- 
man, "Temporal ontology and temporal reference," 
Computational Linguistics, 14(2):15-28. 
\[Nerbonne, 1986\] J. Nerbonne, "Reference time and 
time in narration," Linguistics and Philosophy, 
9(1):83-95. 
\[Partee, 1984\] B. Partee, "Nominal and Temporal 
Anaphora," Linguistics and Philosophy, 7:243-286. 
\[Passonneau, 1988\] R. J. Passonneau, "A Computa- 
tional model of the semantics of tense and aspect," 
Computational Linguistics, 14(2):44-60. 
\[Reichenbach, 1947\] H. Reichenbach, Elements of Sym- 
bolic Logic, Macmillan, New York, NY. 
\[Reichman, 1985\] R. Reichman, Getting Computers to 
Talk Like You and Me, MIT Press, Cambridge, MA. 
\[Schubert and Hwang, 1989\] L. K. Schubert and C. H. 
Hwang, "An Episodic knowledge representation for 
Narrative Texts," In Proc. 1st Inter. Conf. on Prin- 
ciples of Knowledge Representation and Reasoning 
(KR '89), pages 444-458, Toronto, Canada, May 15- 
18, 1989. Revised, extended version available as TR 
345, Dept. of Comp. Sci., U. of Rochester, Rochester, 
NY, May 1990. 
\[Schubert and Hwang, 1990\] L. K. Schubert and C. H. 
Hwang, "Picking reference events from tense trees: A 
formal, implementable theory of English tense-aspect 
semantics," In Proc. Speech and Natural Language, 
DARPA Workshop, pages 34-41, Hidden Valley, PA, 
June 24-27, 1990. 
\[Smith, 1978\] C. Smith, "The syntax and interpreta- 
tions of temporal expressions in English," Linguistics 
and Philosophy, 2:43-99. 
\[Song and Cohen, 1991\] F. Song and R. Cohen, "Tense 
interpretation in the context of narrative," In Proc. 
AAAI-91, pages 131-136. Anaheim, CA, July 14-19, 
1991. 
\[Webber, 1987\] B. L. Webber, "The Interpretation of 
tense in discourse," In Proc. 25th Annual Meeting 
of the ACL, pages 147-154, Stanford, CA, July 6-9, 
1987. 
\[Webber, 1988\] B. L. Webber, "Tense as discourse 
anaphor," Computational Linguistics, 14(2):61-73. 
