Acquiring German Prepositional
Subcategorization Frames from Corpora
Erika F. de Lima
GMD - German National Research Center
for Information Technology
Dolivostrasse 15
64293 Darmstadt, Germany
delima@darmstadt.gmd.de
July 7, 1997
Abstract 
This paper presents a procedure to automatically learn German
prepositional subcategorization frames from text corpora. It is based
on shallow parsing techniques employed to identify high-accuracy cues
for prepositional frames, the EM algorithm to solve the PP attachment
problem implicit in the task, and a method to rank the evidence for
subcategorization provided by the collected data.
1 Introduction 
The description of lexical forms in both computational and human-oriented
lexica includes prepositional subcategorization information. For instance in
German, the verb arbeiten ('to work') subcategorizes for a PP headed by the
preposition an ('on'), and the verb erinnern ('to remind'), for an accusative
NP and a PP headed by an:

(1) Mary arbeitet an der Frage.
    Mary works on the question

(2) Mary erinnert ihren Freund an den Termin.
    Mary reminds her friend on the deadline
    'Mary reminds her friend of the deadline.'
Subcategorization information is usually compiled by hand. A procedure to 
automatically learn prepositional subcategorization would enable the acqui- 
sition of broad-coverage lexica which reflect evolving usage and which are 
less subject to lexical gaps. 
Learning prepositional subcategorization automatically is not a trivial
task; it entails a PP attachment decision problem, and requires being able
to distinguish complement from adjunct prepositional cues. For instance in
(2) above, it is (syntactically) possible to attach the prepositional phrase
[PP an den Termin] to the noun phrase object as well as to the verb phrase.
Sentence (2) cannot be considered conclusive evidence of a verbal frame
based on syntactic information alone.
In (3) the prepositional phrase [PP in der Nacht] ('at night') is an adjunct
PP which may occur with any (aspectually compatible) verb. It is not
specific to the verb arbeiten ('to work') and should not be considered evidence
of subcategorization. 
(3) Mary arbeitete in der Nacht. 
Mary worked in the night 
'Mary worked at night.' 
This paper proposes a method to automatically acquire German preposi- 
tional subcategorization frames (SFs) from text corpora. It is based on shal-
low parsing techniques employed to identify high-accuracy cues for prepo- 
sitional SFs, and a method to rank the evidence for subcategorization pro- 
vided by the collected data. The PP attachment problem implicit in the 
task is dealt with by using the EM algorithm to rank alternative frames. 
The subcategorization frames considered are shown in figure 1. 
2 Method 
The automatic extraction of German prepositional SFs is based on the ob- 
servation that certain constructs involving so-called pronominal adverbs are 
high-accuracy cues for prepositional subcategorization. Pronominal adverbs 
are compounds in German consisting of the adverbs da(r)- and wo(r)- and 
certain prepositions. For instance in (4c), the pronominal adverb daran 
('about it') is used as a pro-form for the personal pronoun es ('it') as the 
object of the preposition an ('about'). (Note that the usage of the pronoun
in (4b) is ungrammatical.) In (4d), the pronominal adverb daran occurs in
a correlative construct with a subordinate daß ('that') clause immediately
following it. 
(4) a. Mary denkt an Johns Ankunft.
       Mary thinks on John's arrival
       'Mary thinks about John's arrival.'
Example                                               SF Description
PP[auf] V[warten] 'wait for'                          Verb with PP
PP[an] NPA V[erinnern] 'remind NP of'                 Verb with accusative object and PP
PP[für] NPD V[danken] 'thank NP for'                  Verb with dative object and PP
PP[auf] sich V[vorbereiten] 'to prepare oneself for'  Reflexive verb with PP
PP[auf] N[Hoffnung] 'hope for'                        Noun with PP
PP[auf] A[stolz] 'proud of'                           Adjective with PP

Figure 1: Subcategorization frames learned by the system
    b. *Mary denkt an es.
       Mary thinks on it
    c. Mary denkt daran.
       Mary thinks on it
       'Mary thinks about it.'
    d. Mary denkt daran, daß John bald ankommt.
       Mary thinks on it that John soon arrives
       'Mary thinks about the fact that John will arrive soon.'
Unlike prepositional phrases, pronominal adverb correlative constructs pro-
vide reliable cues for prepositional subcategorization. For instance the oc-
currence of the pronominal adverb daran in the correlative construct in (4d)
can be used to infer that the verb denken ('to think') subcategorizes for a
PP headed by the preposition an ('about').
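The compound pattern just described (da(r)-/wo(r)- plus a preposition) lends itself to a simple recognizer. The sketch below is illustrative only and is not part of the paper's system; the preposition inventory and the vowel rule for the linking -r- are our assumptions.

```python
import re

# Illustrative only (not the paper's detection grammar): recognizing German
# pronominal adverbs, i.e. compounds of da(r)-/wo(r)- and a preposition.
# The preposition inventory below is a partial assumption.
PREPOSITIONS = ["an", "auf", "aus", "bei", "durch", "für", "gegen",
                "in", "mit", "nach", "über", "um", "unter", "von", "zu"]

_PATTERN = re.compile(r"^(da|wo)(r?)(%s)$" % "|".join(PREPOSITIONS),
                      re.IGNORECASE)

def pronominal_adverb_preposition(token):
    """Return the embedded preposition if token is a pronominal adverb,
    else None."""
    m = _PATTERN.match(token)
    if m is None:
        return None
    prep = m.group(3).lower()
    # the linking -r- occurs exactly before vowel-initial prepositions
    if (m.group(2).lower() == "r") != (prep[0] in "aäeioöuü"):
        return None
    return prep
```

Under these assumptions, daran, worauf and damit yield an, auf and mit respectively, while ordinary words such as Dame are rejected.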
In the next section, a learning procedure is described which makes use
of pronominal adverb correlative constructs to infer prepositional subcate-
gorization. It consists of four components: SF detection, mapping, disam-
biguation, and ranking.
2.1 SF Detection
This component makes use of shallow parsing techniques to detect possible
prepositional SF structures; a standard CFG parser is used with a hand-
written grammar defining pairs of main and subordinate clauses in correla-
tive constructs such as (4d). Main clauses covered by the grammar include
copular constructs as well as active and passive verb-second and verb-final
constructs. Subordinate clauses considered include those headed by daß
('that'), indirect interrogative clauses, and infinitival clauses.
The internal structure of the clause pair consists of phrase-like con-
stituents; these include nominative (NC), prepositional (PC), adjectival
(AC), verbal (VC), and clausal constituents. Their definition is non-standard;
for instance, all prepositional phrases, whether complement or not, are left
unattached. As an example, the shallow parse structure for the sentence
fragment in (5) is shown in (5') below.
(5) Er lobte die Reaktion der öffentlichen Meinung in Rußland als Beweis dafür, daß ...
    he praised the reaction the public opinion in Russia as proof for it that
    'He praised the reaction of the public opinion in Russia as proof of
    the fact that ...'
(5') [NC Er] [VC lobte] [NC die Reaktion] [NC der öffentlichen Meinung]
     [PC in Rußland] [PC als Beweis] [PC dafür [SC daß ...]]
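Such a flat structure can be rendered, for instance, as a list of labelled constituents; this particular encoding is our illustration, not the paper's data structure.

```python
# Illustrative only: one way to represent the shallow structure (5') as a
# flat list of labelled constituents, with all PCs left unattached.
shallow_parse = [
    ("NC", "Er"),
    ("VC", "lobte"),
    ("NC", "die Reaktion"),
    ("NC", "der öffentlichen Meinung"),
    ("PC", "in Rußland"),
    ("PC", "als Beweis"),
    ("PC", "dafür"),
    ("SC", "daß ..."),
]

# because no attachment decision has been made, the candidate PCs can be
# enumerated directly
pcs = [text for label, text in shallow_parse if label == "PC"]
```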
2.2 SF Mapping 
The SF Mapping component maps a shallow parse structure of a main clause
in a pronominal adverb correlative construct to a set of putative subcatego-
rization frames reflecting structural as well as morphological ambiguities in
the original sentence. Alternative SFs usually stem from an ambiguity in the
attachment of the pronominal adverb PP. The mapping is defined as follows.
(In the following, p denotes the preposition within the pronominal adverb
in a correlative construct main clause, and VC the main verbal constituent in
the clause; v in VC[v] denotes the head lemma of the verbal constituent,
analogously for NC[n].)
VC[v]/NC[n]. An active verb-second or verb-final clause with one NC is
mapped to {PP[p] V[v]} if the NC precedes the finite verb/auxiliary in the
clause, otherwise to {PP[p] V[v], PP[p] N[n]}.
For instance, sentence (6) is a verb-second clause with an adverbial in 
the first position in the clause and one NC following the verb. In this 
construct, the PP headed by the pronominal adverb may potentially be 
attached to the verb phrase or to the nominal phrase immediately preceding 
it. According to this rule, this sentence is mapped to {PP[an] V[arbeiten],
PP[an] N[Student]}.
(6) Jetzt arbeitet der Student daran, ... 
Now works the student on it 
'The student is now working on ... ' 
VC[v]/NC1[n1]/NC2[n2]. An active verb-second or verb-final clause with
two nominal constituents NC1 and NC2 such that NC2 follows NC1 in the
clause is mapped to {PP[p] NPA V[v], PP[p] N[n2]} if the head of NC2 is a
noun, and to {PP[p] NPA V[v]} otherwise.
Sentences (7a,b) are examples to which this rule applies. In (7a) the
verb erinnern ('to remind') subcategorizes for an accusative NP and a PP
headed by the preposition an ('on'), while in (7b), the verb nehmen ('to
take') is a support verb and Rücksicht ('consideration') a noun which sub-
categorizes for a PP headed by the preposition auf. Since their shallow
structure is ambiguous, they are each mapped to an SF set reflecting both
attachment alternatives; (7a) is mapped to the set {PP[an] NPA V[erinnern],
PP[an] N[Freund]}, and (7b) to the set {PP[auf] NPA V[nehmen], PP[auf]
N[Rücksicht]}.
(7) a. Mary erinnert ihren Freund daran, daß ...
       Mary reminds her friend on it that
       'Mary reminds her friend of the fact that ...'
    b. Mary nimmt keine Rücksicht darauf, daß ...
       Mary takes no consideration on it that
       'Mary shows no consideration for the fact that ...'
Copula/NC1[n1]/NC2[n2]. A copula clause with two nominal constituents
NC1[n1] and NC2[n2] such that NC2 follows NC1 and n2 is a noun is mapped
to {PP[p] N[n2]}. For instance (8) is mapped with this rule to {PP[auf] N[Hinweis]}.
(8) Weil dies ein Hinweis darauf ist, daß ...
    because this an indication on it is that
    'Because this is an indication (of the fact) that ...'
Copula/NC[n]/AC[a]. A copula clause with one nominal and one ad-
jectival constituent is mapped to {PP[p] N[n], PP[p] A[a]}. For instance,
with this rule the clause in (9) is mapped to {PP[auf] A[stolz], PP[auf] N[Student]}.
(9) Stolz ist der Student darauf, daß ...
    proud is the student on it that
    'The student is proud of the fact that ...'
PCs. Any clause in which a PC immediately precedes the pronominal
adverb is mapped as in the appropriate rule with the additional element
'PP[p] N[n]' in the set, where n is the head of the NC within the prepositional
constituent. For instance, (10) is mapped to {PP[an] V[arbeiten], PP[an]
N[Woche]} with the VC/NC and PC rules.
(10) Mary arbeitet seit zwei Wochen daran, ... 
Mary works since two weeks on it 
'Mary has been working for two weeks on ...' 
Morphology. Any clause in which a possible locus of attachment is mor-
phologically ambiguous is mapped with the appropriate rule applied to all
morphology alternatives. For instance, (11) is mapped with the VC/NC and
Morphology rules to {PP[an] V[denken], PP[an] V[gedenken]}, since gedacht
is the past participle of both the verbs denken ('to think') and gedenken ('to
consider').
(11) Er hat daran gedacht, daß ...
     he has on it thought/considered that
     'He thought of ...'
Passive/VC[v]/NC[n]. This rule is applied to werden ('to be') passive
verb-second or verb-final clauses with one NC. In case n is not the pronoun
es ('it'), the clause is mapped to {PP[p] NPA V[v]} if NC precedes the verb,
and to {PP[p] NPA V[v], PP[p] N[n]} otherwise. In case n is the pronoun
es, the clause is mapped to {PP[p] NPA V[v], PP[p] V[v]}. For instance,
(12) is mapped to {PP[an] NPA V[erinnern]}.
(12) Mary wird daran erinnert, daß ...
Mary is on it reminded that 
'Mary is reminded (of the fact) that ...' 
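As an illustration, the first two mapping rules can be sketched as follows. This is a hypothetical encoding; the tuple representation of frames and the boolean flags standing in for properties of the shallow parse are our assumptions.

```python
# A hypothetical encoding of the first two mapping rules (VC/NC and
# VC/NC1/NC2); the tuple representation of frames and the boolean flags
# standing in for properties of the shallow parse are our assumptions.

def map_vc_nc(p, v, nc_precedes_verb, n=None):
    """VC[v]/NC[n]: active verb-second or verb-final clause with one NC."""
    frames = {("PP[%s]" % p, "V[%s]" % v)}
    # if the NC follows the finite verb/auxiliary, the pronominal-adverb PP
    # may also attach to the NC
    if not nc_precedes_verb and n is not None:
        frames.add(("PP[%s]" % p, "N[%s]" % n))
    return frames

def map_vc_nc1_nc2(p, v, n2, n2_is_noun):
    """VC[v]/NC1[n1]/NC2[n2]: active clause with two NCs, NC2 after NC1."""
    frames = {("PP[%s]" % p, "NPA", "V[%s]" % v)}
    # NC2 is an attachment candidate only if its head is a noun
    if n2_is_noun:
        frames.add(("PP[%s]" % p, "N[%s]" % n2))
    return frames
```

For sentence (6), `map_vc_nc("an", "arbeiten", nc_precedes_verb=False, n="Student")` yields both attachment alternatives, matching the set given in the text.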
2.3 SF Disambiguation
The disambiguation component uses the expectation-maximization (EM)
algorithm to assign probabilities to each frame in an SF alternative, given
all SF sets obtained for a given corpus. The EM algorithm (Dempster, Laird,
and Rubin, 1977) is a general iterative method to obtain maximum likelihood
estimators in incomplete data situations. See (Vardi and Lee, 1993) for a
general description of the algorithm as well as numerous examples of its
application. The EM algorithm has been used to induce valence information
in (Carrol and Rooth, 1997).
In the current setting, the algorithm is employed to rank the frames in
a given SF set by using the relative evidence obtained for each frame in the
set. The algorithm is shown below.
Algorithm. Let F be a set of frames. Further, let S be a finite set of
nonempty subsets of F, and let F0 = U_{X in S} X.

Initialization step: for each frame x in F0:

    c_0(x) = Sum_{X in S} I(x, X) * g_C(X)

Step k + 1 (k >= 0):

    c_{k+1}(x) = c_k(x) + Sum_{X in S} P_k(x, X) * g_C(X)

where g_C is a function from S to the natural numbers mapping a set X to
the number of times it was produced by the SF mapping for a given corpus
C. Further, I, P_k, and p_k are functions defined as follows:

    I : F x S -> [0,1]
    I(x, X) = 1/|X| if x in X, 0 else

    P_k : F x S -> [0,1]
    P_k(x, X) = p_k(x) / Sum_{y in X} p_k(y) if x in X and |X| > 1, 0 else

    p_k : F -> [0,1]
    p_k(x) = c_k(x) / Sum_{y in F0} c_k(y)

Definition. A frame x is best in the set X at the iteration k if x in X and
p_k(x) is an absolute maximum in {p_k(y) : y in X}.
In the algorithm above, S denotes the set of SF sets produced by the SF
mapping for a given corpus C. In the initialization step, c_0 assigns an initial
"weight" to each frame, depending on its relative frequency of occurrence,
and on whether the structures in which it occurred are ambiguous. The
weight c_k(x) of a frame x is used to estimate its probability p_k(x). In
each iteration of the algorithm, the weight of a frame x is calculated by
considering the totality of alternatives in which x occurs (i.e., the sets for
which x in X and |X| > 1), and its probability within each alternative.
The best frames in a set are the most probable frames given the evidence
provided by the data. In the experiment described in section 3, the final
number of iterations was set empirically.
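A compact sketch of the disambiguation step, under our reading of the algorithm above (in particular, the initialization weight I(x, X) = 1/|X| and termination after a fixed number of iterations):

```python
# A sketch of the disambiguation step, under our reading of the algorithm in
# section 2.3 (initial weight I(x, X) = 1/|X|, fixed iteration count).
# Keys of `counts` are frozensets of frames (the SF sets); values are g_C(X).
def em_disambiguate(counts, iterations=10):
    frames = set().union(*counts)                      # F0
    # c_0(x) = sum over X of I(x, X) * g_C(X)
    c = {x: sum(g / len(X) for X, g in counts.items() if x in X)
         for x in frames}
    for _ in range(iterations):
        total = sum(c.values())
        p = {x: c[x] / total for x in frames}          # p_k
        c_next = dict(c)
        for X, g in counts.items():
            if len(X) <= 1:
                continue                               # P_k is 0 unless |X| > 1
            z = sum(p[y] for y in X)
            for x in X:
                # c_{k+1}(x) = c_k(x) + P_k(x, X) * g_C(X)
                c_next[x] += (p[x] / z) * g
        c = c_next
    total = sum(c.values())
    p = {x: c[x] / total for x in frames}
    # the "best" frame of each ambiguous set is its most probable member
    return {X: max(X, key=p.get) for X in counts if len(X) > 1}
```

For instance, given ten unambiguous occurrences of PP[an] V[arbeiten], one of PP[an] N[Student], and three ambiguous sets containing both, the ambiguous sets resolve to the verbal frame.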
2.4 SF Ranking
This component ranks the SFs obtained by the previous component of the
system. Let L_C be the set of head lemmata (verbs, nouns and adjectives)
in the subcategorization cues (i.e., best frames in the SF sets) for a given
corpus C. Let F be the set {NPA V[-], NPD V[-], V[-], PP[an] V[-], PP[an]
NPA V[-], ...} of SF structures. (Roughly, an SF structure is an SF with-
out its head lemma.) The analysis of SF cues is performed by creating a
contingency table containing the following counts for each lemma L in L_C
and prepositional structure S in F: k(L,S) (k(L,~S)) is the count of lemma L
with (without) structure S, and k(~L,S) (k(~L,~S)) is the count of all lemmata
in L_C except L with (without) structure S.
If a lemma L occurs independently of a structure S, then one would
expect that the distribution of L given that S is present and that of L given
that S is not present have the same underlying parameter. The log likelihood
statistic is used to test this hypothesis. This statistic is given by

    -2 log lambda = 2(log L(p1, k1, n1) + log L(p2, k2, n2)
                      - log L(p, k1, n1) - log L(p, k2, n2)),

where log L(p, k, n) = k log p + (n - k) log(1 - p), and p1 = k1/n1,
p2 = k2/n2, p = (k1 + k2)/(n1 + n2). (For a detailed description of the
statistic used, see (Dunning, 1993).)
In the formulae above, k1 is k(L,S), n1 is the total number of occurrences
of S, k2 is k(L,~S), and n2 the total number of occurrences of structures other
than S. A large value of -2 log lambda for a lemma L and structure S means that
the outcome is such that the hypothesis that the two distributions have
the same underlying parameter is unlikely, and that the lemma L is highly
associated with the structure S in the given corpus. This value is used to
rank the subcategorization cues produced by the previous components of
the system.
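The statistic can be computed directly from the four contingency counts. The sketch below follows Dunning's formulation as given above; the function names are ours.

```python
from math import log

# Sketch of the ranking statistic (Dunning, 1993) as given above; the
# function names are ours. k1 = k(L,S), n1 = total occurrences of S,
# k2 = k(L,~S), n2 = total occurrences of all other structures.
def log_l(p, k, n):
    # binomial log likelihood, with the convention 0 * log 0 = 0
    total = 0.0
    if k > 0:
        total += k * log(p)
    if n - k > 0:
        total += (n - k) * log(1 - p)
    return total

def neg2_log_lambda(k1, n1, k2, n2):
    p1 = k1 / n1
    p2 = k2 / n2
    p = (k1 + k2) / (n1 + n2)
    return 2 * (log_l(p1, k1, n1) + log_l(p2, k2, n2)
                - log_l(p, k1, n1) - log_l(p, k2, n2))
```

With the counts for hinweisen in the first row of figure 2 (k1 = 1225, n1 = 1225 + 1691, k2 = 2842, n2 = 2842 + 3897009), this reproduces the tabulated value of about 13270.3; when L is distributed identically with and without S, the statistic is exactly 0.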
3 Results 
The method described in the previous section was applied to 1 year of the
newspaper Frankfurter Allgemeine Zeitung containing approximately 36 mil-
lion word-like tokens. A total of 16795 sentences matched the pronominal 
adverb correlative construct grammar described in section 2.1. 
3.1 SF Disambiguation 
Of the 16795 sets produced by the SF mapping, 5581 contained more than 
one SF, i.e., reflected some form of ambiguity in the original sentence, of 
which 4365 were unique. A random set of 400 sets was obtained from these 
unique ambiguous sets. The disambiguation component produced a decision 
for 359 of these 400 sets. These results were compared to the blind judge- 
ments of a single judge; 305 were found to be correct, 23 incorrect. The 
remaining 31 sets were considered to contain only incorrect SFs. Although
an error rate of over 15% is not negligible, it is comparable to that of other
PP attachment experiments (Collins and Brooks, 1995).
3.2 Acquired Dictionary 
The system acquired a dictionary of 1663 unique subcategorization frames. 
Figures 2 and 3 show the 30 most and 10 least plausible frames according to
the system. Starred structures are considered to be errors. 
Examination of the ranked SF table shows that frames with a low -2 log lambda
value consist mostly of errors. The cues produced by the system are not
perfect predictors of subcategorization. False cues stem from incorrect de- 
cisions in the disambiguation component as well as parsing and mapping 
errors, spurious adjuncts, or actual errors in the original text. 
In figure 3, two errors are due to the disambiguation component (nehmen,
Amt); three errors stem from mistaking reflexive verbs for verbs taking an
accusative object (sich treffen mit ('to meet with'), sich bekennen zu ('to de-
clare oneself for'), sich halten an ('to comply with')). These stem from the
grammar specification, and can be avoided with further development of the
detection component.
By far the most frequent type of error was the inclusion of an accusative 
or dative NP in a verbal frame when the verb in fact only takes a PP. For 
instance, of the errors in the 31 sets (out of the 400 ambiguous sets exam-
ined) containing incorrect SFs only, about 42% were due to the fact that an
additional accusative/dative NP was incorrectly included in a verbal frame,
-2 log lambda  k(L,S)  k(~L,S)  k(L,~S)  k(~L,~S)  L               S
13270.3167      1225    1691     2842    3897009   hinweisen       PP[auf] V[-] ('point to')
 6757.6162       498     337     1183    3900749   ausgehen        PP[von] V[-] ('assume')
 4857.2234       482     328     7909    3894048   rechnen         PP[mit] V[-] ('reckon with')
 4241.0161       429     529     5792    3896017   erinnern        PP[an] V[-] ('remind of')
 3307.6279       406    2510     3479    3896372   verweisen       PP[auf] V[-] ('refer to')
 3179.3391       339     234    11293    3890901   bestehen        PP[in] V[-] ('consist in')
 3156.5375       342     433     8013    3893979   sorgen          PP[für] V[-] ('care for')
 3118.6878       255     158     2810    3899544   aussprechen     PP[für] sich V[-] ('speak for')
 2897.6673       293     888     2766    3898820   beitragen       PP[zu] V[-] ('contribute to')
 2548.1622       385     796    20598    3880988   führen          PP[zu] V[-] ('lead to')
 2253.9826       234    2682      860    3898991   ankommen        PP[auf] V[-] ('depend on')
 2002.4658       174     128     3706    3898759   begründen       PP[mit] NPA V[-] ('substantiate NP with')
 1605.2355       146     629     1155    3900837   plädieren       PP[für] V[-] ('plead for')
 1193.6521       190     383    25066    3877128   liegen          PP[in] V[-] ('lie in')
 1042.8259       115     298     4968    3897386   einsetzen       PP[für] sich V[-] ('support')
  876.6903        64     121      610    3901972   hindern         PP[an] NPA V[-] ('hinder NP from')
  813.5798        74     761      510    3901422   abhängen        PP[von] V[-] ('depend on')
  789.1838        78     122     4432    3898135   Hinweis         PP[auf] N[-] ('reference to')
  777.0291       121     837     7860    3893949   denken          PP[an] V[-] ('think of')
  776.2428        62      92     1368    3901245   aufmerksam      PP[auf] A[-] ('attentive to')
  766.8966       122    1059     6794    3894792   dienen          PP[zu] V[-] ('serve for')
  766.6407        65      89     2105    3900508   stolz           PP[auf] A[-] ('proud of')
  764.9588       135     640    16398    3885594   sprechen        PP[für] V[-] ('speak for')
  686.0675        48     393      107    3902219   hinwegtäuschen  PP[über] V[-] ('obscure')
  684.1054        85      40    27806    3874836   sehen           PP[in] NPA V[-] ('see NP in')
  677.7402        70    1111      660    3900926   neigen          PP[zu] V[-] ('tend to')
  656.8561        67     455     1435    3900810   Beweis          PP[für] N[-] ('proof of')
  577.1696        58     383     1371    3900955   nachdenken      PP[über] V[-] ('think about')
  569.0955        43      78      815    3901831   abhalten        PP[von] NPA V[-] ('prevent NP from')
  555.6635        61      89     7787    3894830   Interesse       PP[an] N[-] ('interest in')

Figure 2: 30 most plausible frames
-2 log lambda  k(L,S)  k(~L,S)  k(L,~S)  k(~L,~S)  L          S
0.0126          1       301     11527    3890938   treffen    *PP[mit] NPA V[-]
0.0117          1       463      9351    3892952   bekennen   *PP[zu] NPA V[-]
0.0112          4       831     17723    3884209   wissen     PP[von] V[-] ('know of')
0.0087          1       204     20866    3881696   nehmen     *PP[auf] NPA V[-]
0.0054          1       159     22650    3879957   finden     *PP[durch] NPA V[-]
0.0047          1       184     22565    3880017   halten     *PP[an] NPA V[-]
0.0029          1       809      5082    3896875   einsetzen  *PP[mit] V[-]
0.0011          1       957      3938    3897871   verdienen  PP[an] V[-] ('make a profit on')
0.0005          1       204     18633    3883929   bringen    PP[auf] NPA V[-]
0.0002          1       521      7580    3894665   Amt        *PP[für] N[-]

Figure 3: 10 least plausible frames
although the preposition in the frame was subcategorized for. These stem
from erroneous alternatives in the segmentation of nominal constituents as
defined by the grammar and could be eliminated with further development
of the detection component.
Yet another type of error stems from pronominal adverbs which are con-
junction/adverb homographs, or which are used anaphorically, while the
verb in the main clause subcategorizes for a daß ('that') clause, so the sen-
tence is erroneously considered to be a correlative construct. This is the
source of most errors for frames involving the prepositions gegen ('against'),
bei ('by') and nach ('to'), and cannot be avoided given the learning strategy.
Given the fact that the cues produced by the system are not perfect 
predictors of subcategorization, a test of significance could be introduced in 
order to filter out potentially erroneous cues. However, it was observed that
truly "new" prepositional frames--frames not listed in broad coverage pub-
lished dictionaries, or even considered to be erroneous by a native speaker
until confronted with examples from the corpus--behaved with respect to
their rankings very much like errors. So the current version of the learn-
ing procedure relies on manual post-editing assisted by the SF ranking and
examples from the corpus in order to discard false frames.
3.3 Precision and Recall 
Evaluating the acquired dictionary is not straightforward; linguists often dis-
agree on the criteria for the complement/adjunct distinction. Instead of 
attempting a definition, the acquired dictionary was compared to a broad 
coverage published dictionary containing explicit information on preposi-
tional subcategorization. 
A random set of 300 verbs occurring more than 1000 times in the corpus
was obtained.1 The prepositional SFs for these verbs which were listed in
(Wahrig, Kraemer, and Zimmerman, 1980) and in the acquired lexicon were
noted. There was a total of 307 verbal prepositional frames listed in either 
dictionary. Of these, 136 were listed only in the published dictionary, and 
121 only in the acquired dictionary. 
These prepositional SFs were used to calculate a lower bound for the 
precision and recall rates of the system; an SF is considered correct if and
only if it is listed in the published dictionary.2 A lower bound for the recall
rate of the system is given by the number of learned correct frames divided 
by the number of frames listed in the published dictionary, or 52/173. This 
recall rate is a lower-bound for the actual rate with respect to the corpus, 
since there are prepositional SFs listed in the published dictionary with no 
instance in the corpus. 
A lower bound for the precision of the system is given by the number of 
learned correct frames divided by the number of learned frames, or 52/188. 
This rate is a lower bound for the actual precision rate of the system, since
it does not take into account the fact that the system did learn true SFs
not listed in the published dictionary, so the precision rate of the system
is actually higher. Further, not all prepositions contributed equally to the 
precision and recall rates. For instance the precision and recall for the 
prepositions aus ('out') was 60% and 42%, that of von ('of') 50% and 53%,
while that of gegen ('against') 6% and 11%, respectively.
4 Related Work 
The automatic extraction of English subcategorization frames has been con- 
sidered in (Brent, 1991; Brent, 1993), where a procedure is presented that 
takes untagged text as input and generates a list of verbal subcategorization
frames. The procedure uses a very simple heuristic to identify verbs; the
syntactic types of nearby phrases are identified by relying on local morpho-
syntactic cues. Once potential verbs and SFs are identified, a final com-
1There was a total of 15178 unique verbs (known to the morphology) occurring in the 
corpus, of which 913 occurred more than 1000 times. 
2No dictionary is exempt from errors (of omission). However it (hopefully) provides a
uniform classification for PP subcategorization.
ponent attempts to determine when a lexical form occurs with a cue often
enough so that it is unlikely to be due to errors; an automatically computed 
error rate is used to filter out potentially erroneous cues. Prepositional 
frames are not considered, since, according to the author, "it is not clear 
how a machine learning system could do this \[determine which PPs are 
arguments and which are adjuncts\]." 
In (Manning, 1991) another method is introduced for producing a dictio-
nary of English verbal subcategorization frames. This method makes use of a
stochastic tagger to determine part of speech, and a finite state parser which
runs on the output of the tagger, identifying auxiliary sequences, noting pu-
tative complements after verbs and collecting histogram-type frequencies of
possible SFs. The final component assesses the frames encountered by the
parser by using the same model as (Brent, 1993), with the error rate set em-
pirically. Prepositional verbal frames are learned by the system by relying
on PPs as cues for subcategorization; since the system cannot differenti-
ate between complement and adjunct prepositional cues, it learns frequent
prepositional adjuncts as well.
In order to evaluate the acquired dictionary, Manning compares the
frames obtained for 40 random verbs to those in a published dictionary,
yielding for these verbs overall precision and recall rates of 90% and 43%
respectively. However, if only the prepositional frames listed for these verbs
are considered, the rates drop to approximately 84% and 25%, respectively.
In the experiment described, the error bounds for the filtering procedure
were chosen with the aim of "get[ting] a highly accurate dictionary at the ex-
pense of recall." His system did not consider nominal and adjectival frames.
(Carrol and Rooth, 1997) present a learning procedure for English sub-
categorization information. Unlike previous approaches, it is based on a
probabilistic context free grammar. The system uses expected frequencies
of head words and frames--calculated using a hand-written grammar and
occurrences in a text corpus--to iteratively estimate probability parameters
for a PCFG using the expectation maximization algorithm. These parame-
ters are used to characterize verbal, nominal and adjectival SFs. The model
does not distinguish between complement and adjunct prepositional cues.
5 Conclusion 
This paper presents a method for learning German prepositional subcatego- 
rization frames. Although other attempts have been made to learn English 
verbal/prepositional SFs from text corpora, no previous work considered a 
partially free word-order language such as German, nor differentiated be- 
tween complement and adjunct prepositional cues. 
The overall precision rate for the system described in this paper is lower 
than that of similar systems developed for English, since no test of signif- 
icance was used to filter out possibly erroneous cues. In the experiment 
described in the previous section, truly new prepositional frames behaved
with respect to frequency of occurrence very much like errors, and would
possibly have been discarded by a filtering mechanism.
A problem in the current version of the system was the fact that segmen- 
tation of nominal constituents was not optimally handled by the detection 
component, leading to a large number of verbal frames with correct preposi-
tions, but with an additional erroneous accusative/dative NC in the frame. 
So the precision of the system can be significantly improved with further 
development of the detection component. 
Further, the system should be extended to handle other types of pronom- 
inal adverb cues, such as pro-forms for interrogative, personal and relative 
pronouns; possibly PPs headed by prepositions should also be considered. 
Finally, the method--low-level parsing together with a procedure to rank
the alternatives obtained--should be extended to other frames as well.

References
Brent, Michael R. 1991. Automatic acquisition of subcategorization frames
from untagged text. In Proceedings of the 29th Annual Meeting of the
ACL, pages 209-214.
Brent, Michael R. 1993. From grammar to lexicon: Unsupervised learning
of lexical syntax. Computational Linguistics, 19(2):243-262.
Carrol, Glenn and Mats Rooth. 1997. Valence induction with a head-
lexicalized PCFG. http://www2.ims.uni-stuttgart.de/~mats.
Collins, Michael and James Brooks. 1995. Prepositional phrase attachment
through a backed-off model. In Proceedings of the Third Workshop on
Very Large Corpora.
Dempster, A.P., N.M. Laird, and D.B. Rubin. 1977. Maximum likelihood
from incomplete data via the EM algorithm. J. R. Statist. Soc. B, 39:1-38.
Dunning, Ted. 1993. Accurate methods for the statistics of surprise and
coincidence. Computational Linguistics, 19(1):61-74.
Manning, Christopher D. 1991. Automatic acquisition of a large subcate-
gorization dictionary from corpora. In Proceedings of the 29th Annual
Meeting of the ACL, pages 235-242.
Vardi, Y. and D. Lee. 1993. From image deblurring to optimal invest-
ments: Maximum likelihood solutions for positive linear inverse prob-
lems. J. R. Statist. Soc. B, 55(3):569-612.
Wahrig, Gerhard, Hildegard Kraemer, and Harald Zimmerman. 1980.
Brockhaus Wahrig Deutsches Wörterbuch in sechs Bänden. F.A. Brock-
haus und Deutsche Verlags-Anstalt GmbH, Wiesbaden.
