Anytime Algorithms fl)r Speech Parsing?* 
Gfinl;her (\]6rz Marcus Kesscler 
Univorsit;y of l,'zlange|>Niirnberg, IMMD VIII 
goerz@inf ormat ik. uni-erl angen, de 
TOPICAI, PAI'\]~I/ 
Keywords: anytime Mgorithms, p;~rsing, speech analysis 
Abstract 
This paper discusses to which extent the concept of 
"anytime algorithms" can be applied to parsing algo- 
rithms with feature unification. We first try to give a 
more precise definition of what an anytinm algorithm 
is. We arque that parsing algorithms have to be clas- 
sified as contract algorithms as opposed to (truly) it> 
terruptible algorithms. With the restriction that the 
transaction being active at the time an inl,errupt, is is- 
sued has to be COml)leted before the interrupt cart be 
executed, it is possible to provide a parser with linritcd 
anytime t)ehavior, which is in fact t)dng realized in our 
re.search l)rototype. 
1 Introduction 
The idea of '%nylime algorithms", which has been 
around in the tieht of plmming for some time 1, has 
recently been suggested for application in natural lan- 
guage and speech l)rocessing (NL/SP) 2. An anytime 
algorithm is an algorit.hm "whose quality of results 
(legrades graceflflly a~s computation time decreases" 
(\[Russell attd Zilt)erstein 1991\], p. 212). In the follow- 
ing we will first give a more specilic definition of which 
properties allow an algorithm to be implemented and 
used as an anytime algorithm. We then apply this 
knowledge to a specitic aspect of NL/SP, namely pars- 
ing algorithms in a speech understanding system. In 
the Appendix we present the A I)C protocol which sup- 
ports anytime computations. 
We will discuss these matters in the framework of 
the Verbmobil joint research project a, where we are 
working on the implementation of an incremental chart 
parser 4. The conception of this I)arser has been derived 
from earlier work by the llrst author 5. 
lef. e.g. \[llussell mM Zilberstein 1991\] 
P'so \[Wahlster 1992\] in his invited talk at CO1,\[N(I-92 
a ~lThe Verbmnbil joint research project has been defined in the 
document \[Verbmobil tl.eport 1991\] 
4 the Verl)mol,il/15 parser, of. \[Weber 1!)!)'2\] 
Sthe GuLP parser, of. \[Gi~rz 1988\]. 
2 Ai,ytime Algorithms 
\[1)etm and Boddy 1988\] give the fi)llowing characteri~ 
zation of anytime algorithms: 
1. 
2. 
3. 
they lend themselves to preemptive scheduling 
techniques (i.e., they cart bc suspended and re- 
sumed with negligible overhead), 
they can be terminated at any time and will return 
SOl\[le answer~ and 
the attswers reI, urned iml)rove in some welt- 
behaved maturer as a function of time. 
Unforl,unately this characterization does not make 
a clear distinction between the intplementation of an 
algorithm and tile algorithm as such. 
Point (1) is true of a great many Mgorithms imple- 
mented on preenq)tive operatirLg systems. 
Poin~ (2) can be made true for any algorithm by 
adding all explicit Result slot, that is I)reset by a wdue. 
denoting a w)id result. I,et us call the implementation 
of an anyl;inm algorithm an anytime producer. Accord- 
ingly we ttanle the entity interested in the result of such 
an anytime computation the anytime consumer. Fig- 
urc 1 shows two such processes in a tightly coupled 
synchronization loop. Figure 2 shows the same com- 
municating processes decoupled by the introduction of 
the Result slot. Note that synrhronisation is much 
cheaper in terms of perceived complexity R)r the pro- 
gramrne.r and runtime synehronisation overhead (just 
the time to cheek and eventually traverse the mutual 
exclusion barrier). In such an architecture producer 
and consumer work under a regime that allows the 
consmner to interrupt the producer at any lime and 
dentand a result. The risk that the consumer incurs by 
such flexibility is a eertMn non-zero probability that 
this result is void ~ or mtchanged since the last result 
retrievah 
6The faihn'e to provide an answer within a given anmunt of 
time nlay ill itself I)e an interesting and meaningflal result for the 
ally Linle consumet'. 
997 
9 
-"" 
............. % ) 
Result 
Anytime Anytime 
Consumer Producer 
Figure 1: Tightly coupled processes with complex syn- 
chronization internals. 
Result Slot Mutex 
(preset with Barrier 
"VOID") ~', / \ '\ ,/ 
&~ .~.....~ ? "X /+ 
-f t / 
M./'7,;7d,- .... ,._-/M./ 
Anytime Anytime 
Consumer Producer 
Figure 2: Processes decoupled by using a result slot 
protected by a simple mutual exclusion barrier. 
Point (3) is surely a much too strong restriction, 
since it is not always possible to define what exactly 
an improvement is for any given algorithm. In NL/SP, 
where we are often dealing with scored hypotheses, it 
is difficult, if not impossible, to devise algorithms that 
supply answers that improve monotonically as a flmc- 
tion of invested computational resources (time or pro- 
cessing units in a parallel architecture). 
We propose the following characterization of any- 
time algorithms: 
An algorithm is fit to be used as an anytime 
producer if its implementation yields a pro- 
gram that has a Result Production Granular- 
ity (RPG) that is cmnpatible with the time 
constraints of the consumer. 
The notion of RPG is based on the following obser- 
vation: Computations being performed on fnite state 
machines do not proceed directly from goal state to 
goal state. Instead they go through arbitrarily large 
sequences of states that yield no extractable or intelli- 
gible data to an outside observer. To interrupt a pro- 
ducer on any of these intermediate states is fruitless, 
since the result obtained could at best, according to 
the observation made on point (2) above, be the result 
that was available in the last goal state of the producer. 
From the point of view of the consumer the transitions 
from goal state to goal state in the producer are atomic 
transactions. 
The average length of these transactions in the al- 
gorithm correspond to average time intervals in the im- 
plementation, so that we can speak of a granularity 
with which results are produced. 
The time constraints under which the eonsumer is 
operating then give the final verdict if the implemeuta- 
tion of an algorithm is usable as an anytime producer. 
Let us illustrate this by an example: In a real-time 
NL/SP-system tim upper bound for the RPG will I, yp- 
ieally be in the range of 10 lOOms. That is, a producer 
implemented with such an RPG ofl>rs the consumer 
the chance to trade a 500ms delay for 5 to 50 fllrther 
potential solutions. 
Note that goal states can also be associated with 
intermediate results in the producer algorithm. Con- 
ceptually there really is not much of a difference be- 
tween a result and an intermediate result,, but in highly 
optimized implementations there might be the need to 
explicitly export such intermediate results, due to data 
representation incompatibilities or simply because the 
data might be overwritten by other (non-result) data. 
Section 4 gives an example of how the RPG of an imple- 
mentation can be reduced by identifying intermediate 
goal states that yield information which is of interest 
to the consumer. 
3 Breadth and Depth of Analy- 
sis 
In the following we will ask whether and how the idea 
of anytime producers can be applied within the active 
chart parsing algorithm scheme with feature unifica- 
tion. Although the analogy to decision making in plan- 
ning where the idea of anytime algorithms has been 
developed seems to be rather shallow, we can, for 
the operation of the parser, distinguish between depth 
and breadth of analysis 7. 
We define depth of analysis as the concept refering 
to the growing size of information content in a fea- 
ture structure over a given set of non-competing 
word hypotheses in a certain time segment dur- 
ing its computation. Larger depth corresponds to 
a more detailed linguistic description of the same 
objects. 
In contrast, we understand by breadth of analy- 
sis the consideration of linguistic descriptions re- 
sulting from the analysis of growing sets of word 
hypotheses, either from growing segments of the 
utterance to be parsed or from a larger number of 
competing word hypotheses in a given time seg- 
ment. 
q'o regard breadth of analysis as a measure in the 
context of the anytime algorithm concept is in a sense 
r not to |)e confused with depth-first or breadth-first search. 
998 
trivial: Considering only one l)arse, the more process- 
ing time the parser is given the larger the analyzed 
segment of the input utterance will be. In general, 
larger breadth corresponds to more information about 
competing word hypotheses in an (half-) open time in- 
terval as opposed to more information about a given 
word sequence. So, obviously, breadth of analysis does 
not correspond to what is intended by the concel)t of 
anytime algorithms, whereas depth of analysis meets 
the inliention. 
If an utterance is syntactically ambiguous, we (:an 
compute more parses the more processing time the 
parser is given. Therefore, tohis case is apart, icular 
instance of depth of analysis, beeaase the same word 
sequence is considered, and not of breadth of analysis 
given the definition above. In this case one would like 
to get the best analysis in terms of the quality scores of 
its constituents first, and other readings late,', ordered 
by score. If the parser works incrementally, what hap- 
pens to be the case for the Verbmobil/15 parser s, the 
intended effect car, be achieved by the adjustment of a 
strategy parameter namely to report the analysis of 
a grammatical fragment of the input utterance as soon 
as it is found. 
At least one distinction might be useful for the 
Verbmobil/\[5 parser. In our parser a category check 
is performed on two chart edges for eIficiency reasons, 
and only if this check is successflfi, the unificatkm of the 
associated feature structures is performed, llence, an 
interrupt would be admissible after ,,he category check. 
In this case we emphasize a factorization of the set; of 
constraints in two distinct subsets: phrasal constraints 
which are processed by the act.iw~ chart parsing algo- 
rithm schema (with l)olynomial complexity), and func- 
tional constraints which are solved by the unification 
algorithm (with exponential complexity). 'rhe interface 
between both types of constraints is a crucial place for 
the introduction of control in the parsing process in 
general 9 
Since we use a constraint-hased grammar formal- 
ism, whose central operation is the unification of fea- 
ture structures, it does not make sense to admit inter 
rupts at any time. Instead, the operation of the parser 
consists of a sequence of transactions. At the most 
coarse grained level, a transaction would be an appli- 
cation of the flmdamental rule of active chart t)arsing, 
i.e. a series of operations which ends when a new edge 
is introduced into the chart, including the computation 
of the feature structure associated with it. Of course 
this argument holds when an application of the fun- 
damental rule results in another application of it on 
subunits due to the reeursive structure of the grammar 
ruleQ °. Certainly one might ask whether a smaller 
grain size makes sense, i.e. the construclion of a fea- 
ture structure should itself he interruptible. In this 
case one could think of the possibility of au interrupt. 
Sand for Gul,t' as well 
9 cf. \[Maxwell and Kaplan 1994\] 
l°This h,'ts been implemented in the interrupt system of (lul,l) 
\[Ggrz 1988\]. 
after one feature in one of the two feature structures 
to be unified has been l)roeessed. We think that this 
possibility shouhl be rejected, since feature structures 
usually contain eoreli'.rences. If we consider a partial 
feature structure - - as in an intermediate step in the 
unitication of two feature structures in the situation 
where just one feature has been processed, this struc- 
ture might not be a realistic partial description of the 
part of speech under consideration, but simply inad- 
equate as long as not all embedded eoreferences have 
been established. It seems obvious that the grain size 
cannot be meaningfully decreased below the processing 
of one feature. Therefore we decided that transactions 
must be defined in terms of computations of whole fea- 
ture structures. 
Nevertheless, a possibility for interrupting the com- 
putation of a feature structure could be considered in 
case the set of featnre, s is divided in ~wo classes: fea- 
tures which are obligatory and features which are op- 
tional. Members of the last group are candidates for 
constraint relaxation which seems to be relevant with 
respect to robustness at least in the case of speech 
parsing. We have just started to work on the constraint 
relaxation problem, but there is no doubt that this is 
an important issue for further research. Nevertheless, 
at the time being we doubt whether the above men- 
tione.d problem with coreferences couht be avoided in 
this case. 
A further opportunity for interrupts comes up in 
cases where the processing of alternatives in unifying 
disjm)ctiw~' feature structures is delayed. In this case, 
unilication with one of the disjuncts can be considered 
as a transaction. 
Another chance R)r the implementation of anytime 
behavior in parsing arises if we consider the gram- 
mar from a linguistic perspective ~ oppose.d to the 
purely formal view taken above. Since semantic con- 
struction is done by our grammar as well, the func- 
tional constraints contain a distinct subset for the pur- 
pose of semantic construction. In a separate b, vesti- 
gation \[Fischer 1994\] implemented a version of A-I)t{;I ~ 
\[l)inkal 1993\] within the. same feature unification fo> 
realism which buihts semantic structures within the 
framework of Discourse Representation Theory. It has 
been shown that the process of DRS construction can 
be split in two types of transactions, one which can be 
performed incrementally basically the construction 
of event representations without temporal information 
-- and another one which cannot be concluded before 
the end of an utterance has been reached - - supplying 
temporal information. Since the first kind of transac- 
tions represents meaningfnl partial semantic analyses 
those can be supplied immediately on demand under 
au anytime regime. 
The possibility to process interrupts with the re- 
striction that the currently active transaction has to be 
complete.d in advance has been built into the Verhmo- 
bil/15 parser, using the APC protocol (of. Appendix). 
It therefore exhibits a limited anytime behavior. 
999 
4 Feature Unification as an 
Anytime Algorithm? 
Up to now, in our discussion of an appropriate grain 
size for the unification of feature structures we consid: 
ered two cases: the unification of two whole feature 
structures or the unification of parts of two feature 
structures on the level of disjuncts or individual fea- 
tures..In all of these cases unitication is considered as a 
single step, neglecting its real cost, i.e. time constraints 
would only affect the number of unification steps, but 
not the execution of a particular unification operation. 
Alternatively, one might consider the unification algo- 
rithm itself as an anytime algorithm with a property 
which one might call "shallow unification". A shallow 
unification process would quickly come up with a first, 
incomplete and only partially correct solution which 
then, given more computation time, would have to be 
refined and possibly revised. It seems that this prop- 
erty cannot be achieved by a modification of existing 
unification algorithms, but would require a radically 
different approach. A prerequisite for that would be 
a sort of quality measure 11 tbr different partial feature 
structures describing a given linguistic object which is 
distinct from the subsumption relation. To our knowl- 
edge, the definition of such a measure is an open re: 
search question. 
5 Conclusion 
According to \[Russell and Zilberst, ein 1991\] parsing al- 
gorithms with feature unification have to be classified 
as contract algorithms as opposed to (truly) interrupt- 
ible algorithms: They must be given a particular time 
allocation in advance, because interrupted at any time 
shorter than the contract time they will not yield useflll 
results. At least the transaction which is active at the 
time an interrupt occurs has to be completed before 
the interrupt can be executed. With this restriction, 
it is possible to provide a parser with linqited anytime 
behavior, which is in fact being realized in the current 
version of the Verbmobil/15 parser. 
Acknowledgements. The authors would like to 
thank Gerhard Kraetzschmar, Herbert Stoyan, and 
Hans Weber for w~luable comments on a previous ver.- 
sion of this paper. 
References 
\[Dean and Boddy 1988\] 
Thomas Dean and Mark Boddy: An Analysis of 
'I\]me-Dependent Planning. AAAI 1988, 49--54 
\[Dongarra, Geist, Manchek and Sundaram 1993\] Jack 
Dongarra, G. A. Geist, Robert Manchek and V. S. 
Sundaram: Integrated PVM Framework Supports 
11 c.f. \[Russell and Wefald 1989\] 
Heterogeneous Network Computing. Comlmters in 
Physics, Vol. 7, No. 2, 1993, 166-175 
\[Fischer 1994\] Fischer, I.: Die kompositionelle Bildung 
yon .Diskursrepriisentationsstrtzkturen fiber einer 
Chart. Submitted to KONVENS 94, Vienna. 
\[Gfrz 1988\] G5rz, G.: Struktm'analyse natfirli&er 
Spra&e. Bonn: Addison-Wesley, 1988 
\[Maxwell and Kaplan 1.994\] Maxwell, J.T, Kaplan, R.: 
The Interface between Pbr<~sal and F~metional 
Constraints. Computational l,inguistics, Vol. 19, 
1994, 571- 590 
\[Pinkal 1993\] PinkM, M.: Semantik. In: Gfrz, G. 
(Ed.): Einffihr,ng in die Kfinstliehe Intelligenz. 
Bonn: Addison-Wesley, 1993, 425-498 
\[Russell and Wefald 1989\] Rnssell, S.J. and Wefald, E: 
Principles of Metareasoning. Proc. KR-89, 1989, 
400 411. 
\[Russell and Zilberstein 1991\] Russell, S. a., Zilber- 
stein, S.: Composing Real-Time Systems. Proc. 
I3CAI-91, Sydney, 1991, 212-217 
\[Verbmobil Report 1991\] Verbmobil Konsor- 
titan (13¢1.): Verbmobil- Mobiles Dohnets&ge- 
rat. BMFT Report, Miinchen, 1991 
\[Wahlster 1992\] Wahlster, W.: Complltational Models 
of Face-to-Face Dialogs: Multimodality, Negotia- 
tion and Translation. Invited talk at COLING-92, 
Nantes, 1992. Not contained in the proceedings; 
copies of slides are available from the author. 
\[Weber 1992\] Weber, 1I.: Chart Parsing m ASI, ASL 
Tecbnical Report ASL-TR-28-92/UER, Univer- 
sity of 1;rlaugen-Niirnberg, IMMD VIII, Erlangen, 
1992 
Appendix: A Protocol for Any- 
time Producer/Consumer Pro- 
cesses 
In the following we introduce the APC (Anytime Pro- 
ducer Consumer) protocol which allows for easy estab- 
lishment of anytime producer/consumer relationships 
on parallel architectures. 
Let Producer be the flmction that implements the 
producer algorithm. In a purely sequential procedural 
call/return implementation this function would have a 
control structure similar to: 
(defml Producer (...) 
(Initialize) 
(let ((Result nil)) 
(~hile (not (GoodEnough? Result)) 
(ImproveResult)) 
Result)) 
1000 
The RP(\] of Producer is at least that of the func- 
tion ImproveResult. It is finer if ImproveResult is 
itself made of loops that produce intermediate results 
that are ext)ortable to consumers. 
q'he consumer is ilnplemented as the function 
Consumer, that at some point calls the l)roducer: 
(defun Consumer (...) 
(Producer ...) 
We now translate Producer and Consumer into 
parallel processes ,sing the APC protocol, which is 
directly implemented by functions that act as in- 
terfaces to the underlying communication/synchro- 
nization system. All functions implementing the 
protocol have the prefix APC: (In our imphunenta- 
|ion all of them are in the Conmlon~l,isp package 
anyt |me-producer-consumer). 
(defun AnytimeProducer (...) 
(Initialize) 
(let ((Result nil)) 
(while (not (GoodEnough? Result)) 
(ImproveResult) 
;; Make Result available to consumers 
(APC:SetResult! Result) 
;; Check for messages/instructions 
;; from Consumer 
(APC:CheckStatus) 
Result)) 
In a paralM implements||or, it is not sullicient for 
the consumer to simt)ly call the producer. The pro° 
ducer has to be spawT~ed or forked as a separate process: 
(defun AnytimeConsumer (...) 
• Create a new process 
(let ((P-AnytimeProducer-1 
(AI'g:StartProcess (AnytimeProducer ...)))) 
(let ((Result 
(hPC:GetResult P-hnytimeProducer-1))) 
(while (not (ConsumerGoodEnough? Result)) 
; I)o something else, like going to sleep 
; to give tile producer some more time 
(setf Result, 
(APe :GetResnlt P-AnytimeProducer-1) ) 
) ) 
(APe :hbortProcess P-AnytimeProducer-:l) ) 
The APC Pr()/;()('()l 
APC:StartProcess F starts a new process in which 
the procedure F is executed. This function is also 
responsibh; for tile creation of the protected Result 
slot. APe : StartProcess returns the id of the new 
process. 
No~e that an arbitrm:y number of producers may be 
started by a consnlner. A prodtlcer may o\[' course 
also start other producers. 
APC:AbortProcess Proc aborts the process Prec. 
APC:SetResult! R sets the value of the Result slot 
to R. 
APC:GetResult P • retrieves the current value of 
the Result slot from process P. Remember 
that APC:SetResult ! and APC:GetResult avoid 
read/write conflicts by a locking mechanism that 
implements mutual exclusion. 
APC : gesetProcess Proc I - restarts the process Proc 
with new input I. 
APC:CheckStatus \[Proc\] check if any inessages or 
instructions have arrived from Proc. Often par- 
allel soft;ware environments offer only very crude 
process scheduling and control primitives. The 
user may have to implement sortie of them by 
himself. APC:ResetProeess, for example, is (lit" 
ticult to formulate in a general way. Reset 
can also involve, ltla.intenance or eleannp work, 
which is clearly beyond any process-oriented im- 
ph'.mentation of Reset. 'l'he idea is that these 
user implemented control procedures are hooked 
into hPC:CheckStatus \[Proc\]. 'lk) a|,tain a line- 
grained control relationship between consunter and 
t)roducer, the user simply inserts APC : CheckStatus 
at key-positions in the code. 
The AP(; protocol has been implemented aud 
tested under a coarse grained paralM Common 
l,ist) System running on a four processor SUN- 
SPARC MP-670. UNIX IPC 1~ shared mem- 
ory and sen|spheres are used to implement the 
h)w-level communica.tion and synchronisation facil- 
ities. We are currently porting the system to 
Solaris 2.3, with PVM (Parallel Virtual Machine, 
see \[l)ongarra, Geist, Manchek and Snndaram 1993\]) 
as the basic communicatkm facility. IWM would al- 
low us to mow~ our parallel system h'om tile current 
high communication and low memory bandwidth im 
plementation on a shared memory machine, to a low 
communication/high memory bandwidth implementa- 
tion tutoring on a cluster of workstations. 
12 \[lit,el-pl'og(!ss (\]ommunlcation Facilitie, s 
7007 
