Integrating Syntactic and Prosodic Information for the 
Efficient Detection of Empty Categories 
Anton Batliner†, Anke Feldhaus‡, Stefan Geißler‡,
Andreas Kießling*, Tibor Kiss‡, Ralf Kompe*, Elmar Nöth*
† LMU München, Institut f. Deutsche Philologie, Schellingstr. 3, D-80799 München
‡ IBM Deutschland Informationssysteme, Inst. f. Logik und Linguistik, Vangerowstr. 18, D-69115 Heidelberg
* FAU Erlangen-Nürnberg, Lehrstuhl f. Mustererkennung, Martensstr. 3, D-91058 Erlangen
Abstract 
We describe a number of experiments 
that demonstrate the usefulness of 
prosodic information for a processing 
module which parses spoken utterances 
with a feature-based grammar employing 
empty categories. We show that a derivation can be
accomplished more efficiently by requiring certain
prosodic properties of those positions in the input
where the presence of an empty category has to be
hypothesized. The approach has been implemented in
the machine translation project VERBMOBIL and results
in a significant reduction of the workload for
the parser¹.
1 Introduction 
In this paper we describe how syntactic and 
prosodic information interact in a translation 
module for spoken utterances which tries to meet 
the two (often conflicting) main objectives: the
implementation of theoretically sound solutions
and the efficient processing of these solutions.
As an analysis which meets the first criterion 
but seemingly fails to meet the second one, we take 
an analysis of the German clause which relies on 
traces in verbal head positions in the framework of 
Head-driven Phrase Structure Grammar (HPSG, cf. (Pollard&Sag, 1994)).
The methods described in this paper have
been implemented as part of the IBM-SynSem-Module
and the FAU-Erlangen/LMU-Munich-Prosody-Module
in the MT project VERBMOBIL
(cf. (Wahlster, 1993)), where spontaneously spoken
utterances in a negotiation dialogue are translated.
In this system, an HPSG is processed by a
bottom-up chart parser that takes word lattices as
its input. The output of the parser is the semantic
representation for the best string hypothesis in
the lattice.
¹ This work was partially funded by the German
Federal Ministry for Research and Technology
(BMFT) in the framework of the Verbmobil Project
under Grant 01 IV 101 V (Verbmobil). The responsibility
for the contents of this study lies with the authors.
It is our main result that prosodic informa- 
tion can be employed in such a system to de- 
termine possible locations for empty elements in 
the input. Rather than treating prosodic information
as virtual input items which have to match
an appropriate category in the grammar rules
(Bear&Price, 1990), or which by virtue of being 
'unknown' in the grammar force the parser to close 
off the current phrase (Marcus&Hindle, 1990), our 
parser employs prosodic information as affecting 
the postulation of empty elements. 
2 An HPSG Analysis of German
Clause Structure
HPSG makes crucial use of "head traces" to analyze
the verb-second (V2) phenomenon found
in German, i.e. the fact that finite verbs appear in
second position in main clauses but in final position
in subordinate clauses, as exemplified in (1a)
and (1b).
1. (a) Gestern reparierte er den Wagen. 
(Yesterday fixed he the car) 
'Yesterday, he fixed the car.' 
(b) Ich dachte, daß er gestern den Wagen
reparierte.
(I thought that he yesterday the car
fixed)
'I thought that he fixed the car yesterday.'
Following (Kiss&Wesche, 1991) we assume that
the structural relationship between the verb and
its arguments and modifiers is not affected by the
position of the verb. The overt relationship between
the verb 'reparierte' and its object 'den Wagen'
in (1b) is preserved in (1a), although the verb
shows up in a different position. The apparent
contradiction is resolved by assuming an empty
element which serves as a substitute for the verb
in second position. The empty element fills the position
occupied by the finite verb in subordinate
clauses, leading to the structure of main clauses
exemplified in (2).
[Tree diagram; recoverable node labels: 'Gestern', 'den Wagen', X0-i]
(2): Syntax tree for 'Gestern reparierte
er den Wagen.'
The empty verbal head in (2) carries syntactic
and semantic information. In particular, the
empty head licenses the realization of the syntactic
arguments of the verb according to the rule
schemata of German and HPSG's Subcategorization
Principle.
The structure of the main clause presented in
(2) can be justified on several grounds. In particular,
the parallelism in verbal scope between verb-final
and V2 clauses, exemplified in (3a) and (3b),
can be modeled best by assuming that the scope
of a verb is always determined w.r.t. the final
position.
3. (a) Ich glaube, du sollst nicht töten.
(I believe you shall not kill)
'I believe you should not kill.'
(b) Ich glaube, daß du nicht töten sollst.
(I believe that you not kill shall)
'I believe that you should not kill.'
In a V2 clause, the scope of the verb is deter- 
mined with respect to the empty verbal head only. 
Since the structural position of an empty verbal 
head is identical to the structural position of an 
overt finite verb in a verb final clause, the invari- 
ance does not come as a surprise. 
Rather than exploring alternative approaches
here, we will briefly touch upon the representation
of the dependency in terms of HPSG's featural
architecture. Information pertaining to empty
heads is projected along the DOUBLE SLASH
(DSL) feature instead of the SLASH feature (cf.
(Borsley, 1989)). The empty head is described in
(4), where the LOCAL value is coindexed with the
DSL value.
[ SYNSEM [ LOC           [1]
           NONLOC | DSL  {[1]} ] ]
(4): Feature description of a head trace
The DSL of a head is identical to the DSL of the
mother, i.e. DSL does not behave like a NONLOCAL
feature but like a HEAD feature.
A DSL dependency is bound if the verbal pro- 
jection is selected by a verb in second position. 
A lexical rule guarantees that the selector shares 
all relevant information with the DSL value of the
selected verbal projection. The relationship be- 
tween a verb in final position, a verb in second 
position and the empty head can be summarized 
as follows: For each final finite verb form, there is 
a corresponding finite verb form in second position 
which licenses a verbal projection whose empty 
head shares its LOCAL information with the cor- 
responding final verb form. It is thus guaranteed 
that the syntactic arguments of the empty head 
are identical to the syntactic arguments required 
by the selecting verb. 
3 Processing Empty Elements 
Direct parsing of empty elements can become a 
tedious task, decreasing the efficiency of a system 
considerably. 
Note first that a reduction of empty elements
in a grammar in favor of disjunctive lexical representations,
as suggested in (Pollard&Sag, 1994,
ch. 9), cannot be pursued.
(Pollard&Sag, 1994) assume that an argument
may occur on the SUBCAT or on the SLASH list.
A lexical operation removes the argument from
SUBCAT and puts it onto SLASH. Hence, no further
need for a syntactic representation of empty
elements emerges. This strategy, however, will not
work for head traces because they do not occur as
dependents on a SUBCAT list.
If empty elements have to be represented syn- 
tactically, a top-down parsing strategy seems bet- 
ter suited than a bottom-up strategy. Particu- 
larly, a parser driven by a bottom-up strategy has 
to hypothesize the presence of empty elements at 
every point in the input. 
In HPSG, however, only very few constraints are
available for a top-down regime since most information
is contained in lexical items. The parser
will not restrict the stipulation of empty elements
until a lexical element containing restrictive information
has been processed. The apparent advantage
of top-down parsing is thus lost when HPSGs
are to be parsed. The same criticism applies to
other parsing strategies with a strong top-down
orientation, such as left-corner parsing or head-corner
parsing.
We have thus chosen a bottom-up parsing strategy
where the introduction of empty verbal heads
is constrained by syntactic and prosodic information.
The syntactic constraints build on the facts
that a) a verb trace always occurs to the right
of its licenser and b) always 'lower' in the syntax
tree. Furthermore, c) since the DSL percolation
mechanism ensures structure sharing between the
verb and its trace, a verb trace always comes with
a corresponding overt verb.
As a consequence of c) the parser has a fully 
specified verb form - although with empty phonol- 
ogy - at hand, rather than having to cope with the 
underspecified structure in (4). This form can be 
determined at compile time and stored in the lexi- 
con together with the corresponding verb form. It 
is pushed onto the trace stack whenever this verb 
is accessed. 
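The compile-time pairing of a verb with its trace and the push onto the trace stack can be pictured with a minimal sketch. The names `Sign`, `LEXICON`, `TraceStack` and the subcategorization strings are illustrative assumptions and are not taken from the VERBMOBIL implementation; the sketch only shows that the trace shares the argument requirements of the overt verb while having empty phonology.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sign:
    phon: str      # surface form; the empty string stands for a trace
    subcat: tuple  # syntactic arguments the head still requires


def make_trace(verb: Sign) -> Sign:
    """Build the trace of a V2 verb: same arguments, empty phonology."""
    return Sign(phon="", subcat=verb.subcat)


# Hypothetical toy lexicon: each V2 verb form is stored together with
# its precompiled trace, so nothing has to be computed at parse time.
_reparierte = Sign("reparierte", ("NP[nom]", "NP[acc]"))
LEXICON = {"reparierte": (_reparierte, make_trace(_reparierte))}


@dataclass
class TraceStack:
    items: list = field(default_factory=list)

    def access(self, word: str):
        """Look up a word; if it is a V2 verb, push its trace."""
        entry = LEXICON.get(word)
        if entry is None:
            return None
        verb, trace = entry
        self.items.append(trace)
        return verb


stack = TraceStack()
for w in "Gestern reparierte er den Wagen".split():
    stack.access(w)
```

After the loop the stack holds exactly one fully specified trace, which the parser could later insert at a licensed position.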
Although a large number of bottom-up hy- 
potheses regarding the position of an empty el- 
ement can be eliminated by providing the parser 
with the aforementioned information, the number 
of wrong hypotheses is still significant. 
In a verb-2nd clause most of the input follows 
a finite verb form so that condition a) indeed is 
not very restrictive. Condition b) rules out a large 
number of structures but often cannot prevent the 
stipulation of traces in illicit positions. Condition 
c) has the most restrictive effect in that the syn- 
tactic potential of the trace is determined by that 
of the corresponding verb. 
If the number of possible trace locations could 
be reduced significantly, the parser could avoid a 
large number of subanalyses that conditions a)-c) 
would rule out only at later stages of the deriva- 
tion. The strategy that will be advocated in the 
remainder of this paper employs prosodic infor- 
mation to accomplish this reduction. 
Empty verbal heads can only occur in the right 
periphery of a phrase, i.e. at a phrase bound- 
ary. The introduction of empty arcs is then not 
only conditioned by the syntactic constraints men- 
tioned before, but additionally, by certain require- 
ments on the prosodic structure of the input. 
It turns out, then, that a fine-grained prosodic
classification of utterance turns, based on correlations
between syntactic and prosodic structure,
is of use not only to determine the segmentation
of a turn but also to predict which positions are
eligible for trace stipulation. The following section
focuses on the prosodic classification schema;
section 5 presents the results of the current experiments.
4 Classifying Prosodic Information 
The standard unit of spoken language in a dialogue
is the turn. A turn like (5) can be composed
of several sentences and subsentential phrases,
i.e. free elements like the phrase 'Im April' which
do not stand in an obvious syntactic relationship
with the surrounding material and which occur
much more often in spontaneous speech than in
other environments. One of the major tasks of a
prosodic component of a processing system is the
determination of phrase boundaries between these
sentences and free phrases.
5. Im April. Anfang April bin ich in Urlaub. 
Ende April habe ich noch Zeit. 
(In April beginning April am I on vacation 
end April have I still time) 
'In April. I am on vacation at the beginning 
of April. I still have time at the end of April.' 
In written language, phrase boundaries are 
often determined by punctuation, which is, of 
course, not available in spoken discourse. For the
recognition of these phrase boundaries, we use a
statistical approach in which acoustic-prosodic features
computed from the speech signal are classified.
The classification experiments for this pa- 
per were conducted on a set of 21 human- 
human dialogs, which are prosodically labelled (cf. 
(Reyelt, 1995)). We chose 18 dialogs (492 turns, 
36 different speakers, 6996 words) for training, 
and 3 dialogs for testing (80 turns, 4 different 
speakers, 1049 words). 
The computation of the acoustic-prosodic features
is based on a time alignment of the phoneme
sequence corresponding to the spoken or recognized
words. To exclude word recognition errors,
for this paper we only used the spoken word sequence,
thus simulating 100% word recognition.
The time alignment is done by a standard hid- 
den Markov model word recognizer. For each syl- 
lable to be classified the following prosodic fea- 
tures were computed fully automatically from the 
speech signal for the syllable under consideration 
and for the six syllables in the left and the right 
context: 
• the normalized duration of the syllable nu- 
cleus 
• the minimum, maximum, onset, and offset of
fundamental frequency (F0) and the maximum
energy and their positions on the time
axis relative to the position of the current syllable
• the mean energy, and the mean F0
• flags indicating whether the syllable carries 
the lexical word accent or whether it is in a 
word final position 
The following features were computed only for 
the syllable under consideration: 
• the length of the pause (if any) preceding or 
succeeding the word containing the syllable 
• the linear regression coefficients of the F0- 
contour and the energy contour computed 
over 15 different windows to the left and to 
the right of the syllable 
This amounts to a set of 242 features, which so
far achieved best results on a large database of
read speech; for a more detailed account of the
feature evaluation, cf. (Kießling, 1996).
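One of the feature types listed above, the linear regression coefficients of a contour, can be illustrated with a short sketch. The frame shift of 10 ms, the function name and the sample values are assumptions made for the example; the actual window definitions are not given here and are described in (Kießling, 1996).

```python
def regression_coefficients(values, frame_shift=0.01):
    """Fit a least-squares line to a contour sampled every `frame_shift`
    seconds and return (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two frames to fit a line")
    times = [i * frame_shift for i in range(n)]
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


# Invented F0 values (Hz) for a falling contour over five frames.
f0_window = [120.0, 118.0, 116.0, 114.0, 112.0]
slope, intercept = regression_coefficients(f0_window)
```

For this invented window the slope is about -200 Hz/s, i.e. a steadily falling contour of the kind often found before a boundary; the same function can be applied to an energy contour.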
The full set of features could not be used due
to the lack of sufficient training data. Best results
were achieved with a subset of features containing
mostly durational features and F0 regression
coefficients. A first set of reference labels
was based on perceptive evaluation of prosodically
marked boundaries by non-naive listeners
(cf. (Reyelt, 1995)). Here, we will only
deal with major prosodic phrase boundaries (B3),
which correspond closely to the intonational phrase
boundaries in the ToBI approach (cf. (Beckman&Ayers,
1994)), vs. all other boundaries (no
boundary, minor prosodic boundary, irregular
boundary). Still, a purely perceptual labelling of
the phrase boundaries under consideration seems
problematic. In particular, we find phrase boundaries
which are marked according to the perceptual
labelling although they do not correspond
to a syntactic phrase boundary. Illustrations
are given below, where perceptually labelled
but syntactically unmotivated boundaries are denoted
with a vertical bar.
6. (a) Sollen wir uns dann im Monat März |
einmal treffen?
(Shall we us then in month March meet)
'Should we meet then in March?'
(b) Wir treffen uns am Dienstag | den
dreizehnten April.
(We meet us on Tuesday the thirteenth
April.)
'We meet on Tuesday the thirteenth of
April.'
Guided by the assumption that only the bound- 
ary of the final intonational phrase is relevant for 
the present purposes, we argue for a categorial 
labelling (cf. (Feldhaus&Kiss, 1995)), i.e. a la- 
belling which is solely based on linguistic defini- 
tions of possible phrase boundaries in German. 
Thus instead of labelling a variety of prosodic 
phenomena which may be interpreted as bound- 
aries, the labelling follows systematically the syn- 
tactic phrasing, assuming that the prosodic real- 
ization of syntactic boundaries exhibits properties 
that can be learned by a prosodic classification al- 
gorithm. 
The 21 dialogues described above were labelled
according to this scheme. For the classification
reported in the following, we employ three main
labels, S3+ (syntactic boundary obligatory), S3-
(syntactic boundary impossible), and S3? (syntactic
boundary optional). Table 1 shows the correspondence
between the S3 and B3 labels (not
taking turn-final labels into account).
        cases   B3   not-B3
S3+       844   82     18
S3-      5907    3     97
S3?       570   32     68
Table 1: Correspondence between S3 and B3
labels in %.
Multi-layer perceptrons (MLP) were trained to
recognize S3+ labels based on the features and
data as described above. The MLP has one output
node for S3+ and one for S3-. During training
the desired output for each of the feature vectors
is set to one for the node corresponding to the
reference label; the other one is set to zero. With
this method, the MLP in theory estimates a posteriori
probabilities for the classes under consideration.
However, in order to balance for the a priori
probabilities of the different classes, during training
the MLP was presented with an equal number
of feature vectors from each class. For the experiments,
MLPs with 40/20 nodes in the first/second
hidden layer showed best results.
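The target coding and the class balancing described above can be sketched as follows. How the equal number of vectors per class was actually obtained is not stated; the sketch assumes random downsampling of the larger class, and the function name and toy data are hypothetical.

```python
import random


def balanced_training_set(vectors_by_class, seed=0):
    """Return (classes, samples) where samples is a shuffled list of
    (feature_vector, target) pairs with equally many vectors per class.
    Targets are 1.0 for the reference class node and 0.0 otherwise."""
    rng = random.Random(seed)
    classes = sorted(vectors_by_class)
    # Assumption: balance by downsampling to the size of the smallest class.
    n = min(len(vectors_by_class[c]) for c in classes)
    samples = []
    for idx, label in enumerate(classes):
        target = [1.0 if i == idx else 0.0 for i in range(len(classes))]
        for vec in rng.sample(vectors_by_class[label], n):
            samples.append((vec, target))
    rng.shuffle(samples)
    return classes, samples


# Toy one-dimensional feature vectors; real vectors would hold the
# selected prosodic features.
data = {
    "S3+": [[0.9], [0.8]],
    "S3-": [[0.1], [0.2], [0.3], [0.15], [0.05]],
}
classes, samples = balanced_training_set(data)
```

With balanced presentation the network outputs no longer reflect the a priori class frequencies, which matters here because S3- positions vastly outnumber S3+ positions.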
For both S3 and B3 labels we obtained overall
recognition rates of over 80% (cf. table 2).
Note that due to limited training data, errors
in F0 computation and variabilities in the acoustic
marking of prosodic events across speakers, dialects,
and so on, one cannot expect an error-free
detection of these boundaries.
Table 2 shows the recognition results in percent
for the S3+/S3- classifier and for the B3/not-B3
classifier using the S3 positions as reference (first
column), again not counting turn-final boundaries.
For example, in the first row the number 24
means that 24% of the S3+ labels were classified
as S3-, and the number 75 means that 75% of the S3+
labels were classified as B3.
        cases   S3+   S3-   B3   not-B3
S3+       110    76    24   75     25
S3-       766    14    86   14     86
S3?        93    43    57   46     54
Table 2: Recognition rates for S3 labels in % for
S3 and B3 classifiers.
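As a plausibility check, an overall rate for the S3 classifier can be approximated from the rounded per-class figures in Table 2. The sketch assumes that the optional S3? positions are left out, since neither decision counts as an error there; this treatment is an assumption, not something stated in the text.

```python
# Number of test positions and percentage classified correctly,
# read off the S3+ and S3- rows of Table 2.
cases = {"S3+": 110, "S3-": 766}
correct_pct = {"S3+": 76, "S3-": 86}

total = sum(cases.values())
overall = sum(cases[k] * correct_pct[k] for k in cases) / total
print(f"overall recognition rate: {overall:.1f}%")
```

The result of roughly 84.7% is close to the overall rate of 84.5% reported for the S3 classifier; the small difference is plausibly due to the rounding of the table entries.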
What table 2 shows, then, is that syntactic S3
boundaries can be classified using only prosodic
information, yielding recognition rates comparable
to those for the recognition of perceptually
identified B3 boundaries. For our purposes, this
means that we do not need to label boundaries
perceptually, but can instead employ an approach
such as the one advocated in (Feldhaus&Kiss, 1995),
using only the transliterated data. While this system
turned out to be very time-consuming when
applied to larger quantities of data, (Batliner et
al., 1996) report promising results from applying a
similar but less labor-intensive system.
It has further to be considered that the recognition
rate for perceptual labelling contains those
cases where phrase boundaries have been recognized
in positions which are impossible on syntactic
grounds (cf. the number of cases in table (1)
where an S3- position was classified as B3 and vice
versa).
It is important to note that this approach does
not take syntactic boundaries and phonological
boundaries to be one and the same thing. It is a
well-known fact that these two phenomena are often
orthogonal to each other. However, the question
to be answered was: can we devise an automatic
procedure to identify the syntactic boundaries
with (at least) about the same reliability as
the prosodic ones? As the figures in table (2)
demonstrate, the answer to this question is yes.
Our overall recognition rate of 84.5% for
the S3-classifier (cf. table (2)) cannot be compared
directly with results reported in
other studies because these studies were either
based on read and carefully designed material
(cf., e.g., (Bear&Price, 1990), (Ostendorf&Veilleux,
1994)), or they did not use automatically
computed acoustic-prosodic features
but textual and perceptual information (cf.
(Wang&Hirschberg, 1992)).
5 Results 
In order to approximate the usefulness of prosodic
information to reduce the number of verb trace
hypotheses for the parser we examined a corpus
of 104 utterances with prosodic annotations denoting
the probability of a syntactic boundary after
every given word. For every node whose S3
boundary probability exceeds a certain threshold
value, we considered the hypothesis that this node
is followed by a verb trace. These hypotheses were
then rated valid or invalid by the grammar writer.
Note that such a setting, where a position in the
input is annotated with scores representing the respective
boundary probabilities, is much more robust
w.r.t. unclear classification results than a pure
binary 'boundary-vs.-nonboundary' distinction.
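The threshold test can be sketched in a few lines. The function name is hypothetical and the probabilities are invented for illustration; only the rule itself, that a trace is hypothesized after a word whose S3 boundary probability exceeds the threshold, comes from the description above.

```python
def trace_candidates(words, boundary_probs, threshold=0.01):
    """Return the indices i for which a verb trace is hypothesized
    directly after words[i], i.e. where the S3 boundary probability
    exceeds the threshold."""
    if len(words) != len(boundary_probs):
        raise ValueError("one boundary probability per word is required")
    return [i for i, p in enumerate(boundary_probs) if p > threshold]


words = "Gestern reparierte er den Wagen".split()
# Invented S3 boundary probabilities, one per word.
probs = [0.004, 0.002, 0.03, 0.001, 0.97]
candidates = trace_candidates(words, probs)
```

In this toy input only the positions after 'er' and after 'Wagen' remain as trace candidates, so the parser has to consider two positions instead of five.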
The observations were rated according to the
following scheme²:
              X0 position    no X0 position
X0 prop.      Correct: 138   False: 274
no X0 prop.   Miss: 6        X: 703
Table 3: Classification results for verb trace
positions
Evaluation of these figures for our test corpus
and a threshold value of 0.01 yielded the following
result:
Recall = 95.8%
Precision = 33.5%
Error = 25.0%
Table 4: Recall, Precision and Error for the
identification of possible verb trace positions.
where:
\[ \mathrm{Recall} = \frac{\mathrm{Correct}}{\mathrm{Correct} + \mathrm{Miss}} \]
\[ \mathrm{Precision} = \frac{\mathrm{Correct}}{\mathrm{Correct} + \mathrm{False}} \]
\[ \mathrm{Error} = \frac{\mathrm{Miss} + \mathrm{False}}{\mathrm{Correct} + \mathrm{False} + \mathrm{Miss} + \mathrm{X}} \]
² X0 position means that the relevant position is
occupied by an X0 gap; X0 prop. means that the
classifier proposes an X0 at this position.
In practice this means that the number of locations
where the parser has to assume the presence
of a verb trace could be reduced from 1121 to 412
while only 6 necessary trace positions remained
unmarked. These results were obtained from a
corpus of spoken utterances, many of which contained
several independent phrases and sentences.
These segments, however, are also often separated
by an S3 boundary, so that the error rate is likely
to drop considerably if a segmentation of utterances
into syntactically well-formed phrases is performed
prior to the trace detection. Since cases
where the verb trace is not located at the end of
a sentence (i.e. where extraposition takes place)
involve a highly characteristic categorial context,
we expect a further improvement if the trace/no-trace
classification based on prosodic information
is combined with a language model.
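The three measures can be recomputed from the four counts of the rating scheme. In the sketch below, Miss = 6 and X = 703 are the reported counts, while Correct = 138 and False = 274 follow from the 412 proposed positions and the reported precision; the function name is an assumption.

```python
def rates(correct, false, miss, x):
    """Recall, precision and error as defined for the rating scheme."""
    recall = correct / (correct + miss)
    precision = correct / (correct + false)
    error = (miss + false) / (correct + false + miss + x)
    return recall, precision, error


correct, false, miss, x = 138, 274, 6, 703
recall, precision, error = rates(correct, false, miss, x)
print(f"Recall {100 * recall:.1f}%  "
      f"Precision {100 * precision:.1f}%  "
      f"Error {100 * error:.1f}%")
```

The four counts add up to the 1121 positions of the corpus, and the computed values round to 95.8%, 33.5% and 25.0%. The low precision is acceptable in this setting because a false proposal only costs parsing time, whereas a miss makes the correct analysis unreachable.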
The problem with the approach described above
is that a careful estimation of the threshold value
is necessary and this threshold may vary from
speaker to speaker or between certain discourse
situations. Furthermore, the analysis fails in those
cases where the correct position is rated lower
than this value, i.e. where the parser does not
consider the correct trace position at all. Thus, in
a second experiment we examined how the syntactically
correct verb trace position is ranked among
the positions proposed by the prosody module
w.r.t. its S3 boundary probability. If the correct
position turns out to be consistently ranked
among the positions with the highest S3 probability
within a sentence, then it might be preferable
for the parsing module to consider the S3 positions
in descending order rather than to introduce
traces for all positions ranked above a threshold.
For the second experiment we considered only
those segments in the input that represent V2
clauses, i.e. we assumed that the input has been
segmented correctly. Within these sentences we
ranked all the spaces between words according to
the associated S3 probability and determined the
rank of the correct verb trace position. When performing
this test on 134 sentences the following
picture emerged:
[Table 5 body: rank of the correct position vs. number of occurrences; the individual counts are illegible in the source.]
Table 5: Ranking of the syntactically correct
verb trace position within a sentence according
to the S3 probability.
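The ranking used in this experiment can be sketched as follows. The function name, the tie-breaking rule (earlier position first) and the probabilities are assumptions made for the example.

```python
def rank_of_correct(boundary_probs, correct_index):
    """1-based rank of the correct trace position when all positions of a
    sentence are sorted by descending S3 boundary probability.
    Ties are broken in favor of the earlier position (an assumption)."""
    order = sorted(range(len(boundary_probs)),
                   key=lambda i: (-boundary_probs[i], i))
    return order.index(correct_index) + 1


# Invented probabilities for the positions after each of five words.
probs = [0.004, 0.002, 0.03, 0.001, 0.97]
rank_final = rank_of_correct(probs, 4)   # trace at the end of the sentence
rank_medial = rank_of_correct(probs, 2)  # trace before extraposed material
```

A parser that tries positions in this order would find a sentence-final trace on the first attempt in the toy example and a medial one on the second, without depending on a speaker-specific threshold.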
Table 5 shows that in the majority of cases the
position with the highest S3 probability turns out
to be the correct one. It has to be added, though,
that in many cases the correct verb trace position
is at the end of the sentence, which is often very
reliably marked with a prosodic phrase boundary,
even if this sentence is uttered in a sequence together
with other phrases or sentences. This end-of-sentence
marker will be assigned a higher S3
probability in most cases, even if the correct verb
trace position is located elsewhere.
Finally, in a third experiment we were interested
in the overall speedup of the processing module
that resulted from our approach. In order to estimate
this, we parsed a corpus of 109 turns in
two different settings: while in the first pass
the threshold value was set as described above,
we selected a value of 0 for the second pass. The
parser thus had to consider every position in the
input as a potential head trace location, just as if
no prosodic information about syntactic boundaries
were available at all. It turns out (cf.
table (6)) that employing prosodic information reduces
the parser runtime for the corpus by about
46%.
            With Prosody   Without Prosody
Average     6.5            11.9
Speedup     45.96%         n/a
Table 6: Comparison of runtimes (in secs) for
parsing batch-jobs with and without the use of
prosodic information, resp.
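The relation between the two average runtimes and the reported speedup can be checked with a short calculation. It uses only the rounded averages from Table 6, so a small deviation from the reported figure is to be expected.

```python
with_prosody = 6.5      # average runtime in seconds, from Table 6
without_prosody = 11.9  # average runtime in seconds, from Table 6

# Relative reduction of the runtime obtained by using prosodic information.
reduction = 100 * (1 - with_prosody / without_prosody)
print(f"runtime reduction: {reduction:.1f}%")
```

The rounded averages give a reduction of about 45.4%; the reported 45.96% was presumably computed from the unrounded runtimes.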
6 Conclusion 
It has been shown that prosodic information can
be employed in a speech processing system to determine
possible locations of empty elements. Although
the primary goal of the categorial labelling
of prosodic phrase boundaries was to adjust the
division of turns into sentences to the intuitions
behind the grammar used, it turned out that the
same classification can be used to minimize the
number of wrong hypotheses pertaining to empty
productions in the grammar.
We found a very useful correspondence between
an observable physical phenomenon (the prosodic
information associated with an utterance) and a
theoretical construct of formal linguistics (the location
of empty elements in the respective derivation).
The method has been successfully implemented
and is currently being refined by training
the classifier on a much larger set of examples
and by integrating categorial information about
the relevant positions into the probability score
for the various kinds of boundaries.
Contact: 
The authors can be contacted under the following 
email addresses: 
anton.batliner@phonetik.uni-muenchen.d400.de
feldhaus@heidelbg.ibm.com 
stefan.geissler@heidelbg.ibm.com 
kiessling@informatik.uni-erlangen.de 
tibor@heidelbg.ibm.com 
kompe@informatik.uni-erlangen.de 
noeth@informatik.uni-erlangen.de 
References 
Batliner, Anton, Andreas Kießling, Ralf Kompe,
Heinrich Niemann, Elmar Nöth: Syntactic-prosodic
Labelling of Large Spontaneous Speech
Data-bases. In: Int. Conf. on Spoken Language
Processing, Philadelphia. 1996. (to appear).
Bear, John and Patti Price: Prosody, Syntax, and
Parsing. In: Proceedings of the 28th Conference
of the Association for Computational Linguistics.
1990. pp. 17-22.
Beckman, Mary E. and Gayle M. Ayers: Guidelines
for ToBI transcription, version 2. Department
of Linguistics, Ohio State University.
1994.
Borsley, Robert: Phrase Structure Grammar and
the Barriers Conception of Clause Structure. In:
Linguistics, 27. 1989. pp. 843-863.
Feldhaus, Anke and Tibor Kiss: Kategoriale
Etikettierung der Karlsruher Dialoge. VERBMOBIL-Memo
Nr. 94, IBM Deutschland Informationssysteme,
Heidelberg. 1995.
Kießling, Andreas: Extraktion und Klassifikation
prosodischer Merkmale in der automatischen
Sprachverarbeitung. PhD thesis, Universität
Erlangen-Nürnberg. 1996. (to appear).
Kiss, Tibor and Birgit Wesche: Verb order and
Head-Movement in German. In: Herzog, O./C.-R.
Rollinger (eds.): Text Understanding in
LILOG. Integrating Artificial Intelligence and
Computational Linguistics. Springer, Berlin.
1991. pp. 216-240.
Marcus, Mitchell and Donald Hindle: Description
Theory and Intonation Boundaries. In: Altmann,
Gerry (ed.): Cognitive Models of Speech
Processing. The MIT Press, Cambridge. 1990.
pp. 483-512.
Ostendorf, Mari and N.M. Veilleux: A Hierarchical
Stochastic Model for Automatic Prediction
of Prosodic Boundary Location. In: Computational
Linguistics, Vol. 20. 1994. pp. 27-53.
Pollard, Carl and Ivan A. Sag: Head-driven
Phrase Structure Grammar. Univ. of Chicago
Press, Chicago. 1994.
Reyelt, Matthias: Consistency of Prosodic Transcriptions.
Labelling Experiments with Trained
and Untrained Transcribers. In: Proc. XIIIth Int.
Cong. of Phonetic Sciences, Stockholm, Vol. 4.
1995. pp. 212-215.
Wahlster, Wolfgang: Verbmobil: Übersetzung von
Verhandlungsdialogen. VERBMOBIL-Report 1.
DFKI Saarbrücken. 1993.
Wang, Michelle Q. and Julia Hirschberg: Automatic
Classification of Intonational Phrase
Boundaries. In: Computer Speech & Language,
Vol. 6. 1992. pp. 175-190.
