Resolving Anaphoric References on 
Deficient Syntactic Descriptions 
Roland Stuckardt 
Im Mellsig 25 
D-60433 Frankfurt am Main, Germany 
stuckardt@compuserve.com
Abstract 
Syntactic coindexing restrictions are by 
now known to be of central importance 
to practical anaphor resolution approaches. 
Since, in particular due to structural am- 
biguity, the assumption of the availability 
of a unique syntactic reading proves to be 
unrealistic, robust anaphor resolution relies 
on techniques to overcome this deficiency. 
In this paper, two approaches are presented 
which generalize the verification of coin- 
dexing constraints to deficient descriptions. 
First, a partly heuristic method is described,
which has been implemented. Second, a provably
complete method is specified. It provides the means to exploit
the results of anaphor resolution for a fur- 
ther structural disambiguation. By render- 
ing possible a parallel processing model, 
this method exhibits, in a general sense, 
a higher degree of robustness. As a prac- 
tically optimal solution, a combination of 
the two approaches is suggested. 
1 Introduction 
The interpretation of anaphoric expressions is known 
to be a difficult problem. In principle, a variety of 
constraints and preference heuristics, including fac- 
tors which rely on semantic, pragmatic, and world 
knowledge, contribute to this task (Carbonell and 
Brown, 1988). Operational approaches to anaphor 
resolution on unrestricted discourse, however, are 
confined to strategies exploiting globally available 
evidence like morphosyntactic, syntactic, and sur- 
face information. 
Among the most promising practical work are ap- 
proaches relying on the availability of syntactic sur- 
face structure by employing coindexing restrictions, 
salience criteria, and parallelism heuristics (Lappin 
and Leass, 1994; Stuckardt, 1996b). However, even 
the assumption of the availability of a unique syntac- 
tic description is unrealistic since, in general, parsing 
involves the solution of difficult problems like attach- 
ment ambiguities, role uncertainty, and the instan- 
tiation of empty categories. Based on this observa- 
tion, Kennedy and Boguraev suggest an adaptation 
of the Lappin and Leass approach to the analysis 
frontend of English Constraint Grammar (Karlsson 
et al., 1995), which provides a part-of-speech tag- 
ging comprising an assignment of syntactic function 
but no constituent structure. This information de- 
ficiency is partially overcome by the application of 
a regular filter which heuristically reconstructs con- 
stituent structure (Kennedy and Boguraev, 1996). 
The approach of Kennedy and Boguraev resorts to 
shallow input and heuristic reconstruction of surface 
structure in general, thus leaving open the question 
what may be gained by relying on the possibly par- 
tial, but potentially more reliable output of a con- 
ventional parser. This question is dealt with in the 
present paper. An operational approach to anaphor 
resolution is advocated which achieves robustness by 
a generalization to deficient syntactic descriptions 
rather than by resorting to shallow input. In sec- 
tion 2, notions of robustness are defined according 
to which different methods may be classified. Sec- 
tion 3 develops the perspective of fragmentary syn- 
tax and identifies the coindexing restrictions of bind- 
ing theory as an important anaphor resolution strat- 
egy which is in particular affected by this loss of 
configurational evidence. In section 4, a solution 
is presented which accomplishes robustness against 
syntactic deficiency by a partly heuristic verification 
of coindexing constraints on fragmentary syntax. Fi- 
nally, in section 5, a non-heuristic algorithm is spec- 
ified which works on the standardized representa- 
tion of ambiguous syntactic description by packed, 
shared parse forests. It achieves a higher degree of 
robustness by making available referential evidence 
for a further disambiguation of syntactic structure. 
A combination of the two approaches is suggested as 
the practically optimal solution. 
2 Notions of Robustness 
2.1 Robustness in Natural Language 
Processing 
In natural language processing in general, the ro- 
bustness issue comprises the ability of a software 
system to cope with input that gives rise to deficient 
descriptions at some descriptional layer. 1 More or
less implicit is the assumption that the system ex- 
hibits some kind of monotonic behaviour: the less 
deficient the description, the higher the quality of 
the output (Menzel, 1995). 
Following Menzel further, this intuitive character- 
ization may be refined. Processing should exhibit 
autonomy in the sense that complete failures at one 
stage of analysis should not cause complete failures 
at other stages of analysis or even a failure of the 
overall processing. Moreover, the processing model 
should ideally employ some kind of interaction be- 
tween different stages of analysis: deficiency at one 
stage of analysis should be compensated by the in- 
formation gained at other stages. 
2.2 Robustness and Anaphor Resolution 
In the light of the above description, the robustness 
requirement for the anaphor resolution task may be 
rendered more precisely. In the aforesaid operational 
approaches, a sequential processing model is followed 
according to which anaphor resolution is performed 
by referring to the result of an already completed syn- 
tactic analysis. This architecture, however, tacitly 
ignores evidence for structural disambiguation that 
may be contributed by strong expectations at the 
referential layer (Stuckardt, 1996a). In terms of the 
general goals of robust processing, this means that, 
since there is no interaction, the robustness require- 
ment merely shows up in form of the monotonic- 
ity and autonomy demands: the anaphor resolution 
module has to cope with deficient or shallow syntac- 
tic information. Besides the trivial way to achieve 
this kind of robustness by simply not exploiting defi- 
cient syntactic descriptions, the following two mod- 
els may be followed: 
• the shallow description model: by exploiting
heuristic rules to reconstruct syntactic description,
the anaphor resolution strategies are adapted to
shallow input data which are never defective. 2
• the deficient description model: by extending
anaphor resolution strategies to work on a possibly
ambiguous or incomplete description, syntactic
evidence is exploited as far as available.
1 The deficiency may result either because the input
itself is deficient, or due to shortcomings of the
processing resources, e.g. lexicon, grammar/parser, or
semantic/pragmatic disambiguation.
In contrast to the approach of Kennedy and Bogu- 
raev, which is based on the shallow description 
model, the subsequent sections develop two methods 
that follow the deficient description model. At first, 
a new partly heuristic approach will be described. 
Secondly, a non-heuristic algorithm will be specified 
which establishes the conceptually superior degree 
of robustness through interaction: it makes avail- 
able the results of anaphor resolution for syntactic 
disambiguation. 
3 Fragmentary Syntax 
3.1 Phenomena 
The main phenomena which give rise to structural 
ambiguity of syntactic descriptions are uncertainty 
of syntactic function (involving subject and direct 
object) and attachment ambiguities of prepositional 
phrases, relative clauses, and adverbial clauses. In 
the example 
Peter observes the owner of the telescope with it. 
depending on the availability of disambiguating in- 
formation, it may be uncertain whether the under- 
lined prepositional phrase with it should be inter- 
preted adverbially or attributively. From the con- 
figurational perspective, these ambiguities give rise 
to fragmentary syntactic descriptions which consist 
of several tree-shaped connected components. With 
the exception of the topmost tree fragment, all com- 
ponents correspond to a syntagma of type PP, S, or 
NP whose attachment or role assignment failed. 
In addition, cases in which no reading exists give 
rise to fragmentary syntactic descriptions compris- 
ing the constituents whose combination failed due to 
constraint violation. 
3.2 Fragmentary Syntax and Anaphor 
Resolution 
Among the anaphor resolution strategies potentially
affected by fragmentary syntax are heuristics as well
as constraints. Preference criteria like salience
factors and syntactic parallelism are not affected by
all types of syntactic defects. Moreover, there is
a plethora of heuristics which do not rely on
syntactic function or structure. Structural coindexing
constraints, however, may lose evidence in all of the
above cases of fragmentary syntax. Since they are known
to be of central importance to the antecedent
filtering phase of operational anaphor resolution
approaches, the subsequent discussion focuses on the
impact of deficient surface structure description on
this class of restrictions.
2 Here, the monotonicity demand of intuitive
robustness virtually vanishes, since there is no longer
a syntactic input prone to deficiency.
According to the Government and Binding Theory 
of Chomsky, the core of the syntactic coindexing re- 
strictions is stated as follows 
Definition 1 (binding principles) 
(A) A reflexive or reciprocal is bound in its binding
category. 
(B) A pronominal is free (i.e. not bound) in its 
binding category. 
(C) A referring expression is free in any domain. 
where binding category denotes the next domina- 
tor containing some kind of subject (Chomsky, 
1981), and binding is defined as coindexed and c- 
commanding: 
Definition 2 (the c-command relation)
Surface structure node X c-commands node Y if and 
only if the next "branching node" which dominates 
X also dominates Y and neither X dominates Y, Y 
dominates X nor X = Y. 
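Definition 2 can be illustrated by a minimal sketch, assuming a plain tree in which the "next branching node" of X is the lowest ancestor of X with at least two children. The class `Node` and the helper names are hypothetical and serve only as an illustration; they are not taken from the implementation described in this paper.

```python
from typing import List, Optional


class Node:
    """A surface structure node; `parent` is set when the node is attached."""

    def __init__(self, label: str, children: Optional[List["Node"]] = None):
        self.label = label
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = []
        for child in children or []:
            child.parent = self
            self.children.append(child)


def dominates(x: Node, y: Node) -> bool:
    """True if x is a proper ancestor of y."""
    node = y.parent
    while node is not None:
        if node is x:
            return True
        node = node.parent
    return False


def branching_node(x: Node) -> Optional[Node]:
    """Lowest ancestor of x with at least two children (assumed reading of
    'next branching node which dominates X')."""
    node = x.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node


def c_commands(x: Node, y: Node) -> bool:
    """X c-commands Y in the sense of Definition 2."""
    if x is y or dominates(x, y) or dominates(y, x):
        return False
    bn = branching_node(x)
    return bn is not None and dominates(bn, y)


# (S (NP Peter) (VP (V observes) (NP owner)))
peter = Node("NP Peter")
owner = Node("NP owner")
vp = Node("VP", [Node("V observes"), owner])
s = Node("S", [peter, vp])
```

In this toy tree, the subject NP c-commands the object NP, but not vice versa, because the branching node of the object is the VP.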
A further structural well-formedness condition, com- 
monly named i-within-i filter, rules out "refer- 
entially circular" coindexings, i.e. configurations 
matching the pattern $[_{\alpha} \ldots [_{\beta} \ldots ]_i ]_i$.
In the above example, the latter restriction comes 
to an application, licensing a coindexing of telescope 
and it only if the PP containing it is not interpreted 
as an attribute to telescope - otherwise, in contradic- 
tion to the i-within-i condition, the pronoun would 
be contained in the NP of the tentative antecedent. 
Hence, if the PP attachment ambiguity has not been 
resolved prior to anaphor resolution, the fragmen- 
tary syntactic description does not contribute the 
configurational evidence which is necessary for defi- 
nitely confirming antecedent candidate telescope. 
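The dependence of the i-within-i check on the attachment decision may be sketched as follows. Each reading is encoded as a hypothetical child-to-parent map over node names; the names and the encoding are assumptions made for this illustration only.

```python
from typing import Dict, Optional


def contained_in(node: str, ancestor: str, parent: Dict[str, str]) -> bool:
    """True if `ancestor` properly dominates `node` under the given parent map."""
    current: Optional[str] = parent.get(node)
    while current is not None:
        if current == ancestor:
            return True
        current = parent.get(current)
    return False


def violates_i_within_i(pronoun: str, antecedent_np: str,
                        parent: Dict[str, str]) -> bool:
    """A coindexing is circular if the pronoun lies inside the antecedent NP."""
    return contained_in(pronoun, antecedent_np, parent)


# Part of the structure shared by all readings of
# "Peter observes the owner of the telescope with it."
base = {
    "NP_Peter": "S", "VP": "S", "V_observes": "VP", "NP_owner": "VP",
    "PP_of": "NP_owner", "NP_telescope": "PP_of", "NP_it": "PP_with",
}
adverbial = dict(base, PP_with="VP")                       # PP modifies the verb
attributive_telescope = dict(base, PP_with="NP_telescope")  # PP modifies telescope
attributive_owner = dict(base, PP_with="NP_owner")          # PP modifies owner
```

Only under the reading in which the PP is attached to telescope does the coindexing of telescope and it violate the filter; as long as the attachment is open, the check cannot be decided.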
4 Checking Binding Constraints on 
Fragmentary Syntax 
4.1 Basic Observations 
The first step towards the verification of binding con- 
straints on fragmentary syntax is suggested by the 
following observation: 
If the anaphor as well as the antecedent 
candidate are contained in the same con- 
nected component of the fragmentary syn- 
tactic description, no (direct) binding the- 
oretic evidence is lost. 
In this case, the verification of the binding restric- 
tions of anaphor and antecedent will be possible in 
a non-heuristic manner, since the necessary positive 
(→ binding principle A) and negative (→ binding
principles B, C) syntactic-configurational evidence is
entirely available. 3 If, however, the two occurrences 
belong to different fragments, relevant information 
may be lost. 
These considerations give rise to a first solution: to 
be able to detect which one of the two cases holds, 
the descriptions of the discourse referent occurrences 
are supplemented with an attribute which uniquely 
identifies the syntactic fragment to which the
corresponding NP 4 belongs (e.g. a pointer to the root
of the fragment, or a natural number). For a given
pair of anaphor α and antecedent candidate γ, the
following procedure is applied:
If anaphor α and candidate γ occur in the
same fragment, verify binding restrictions
as in case of unique syntactic description.
If they occur in different fragments,
consider γ a configurationally acceptable
candidate for α, but reduce the plausibility
score associated with the pair (α, γ).
Consequently, in certain, recognizable cases, the ro- 
bust binding constraint verification merely yields a 
heuristic approval of coindexing. The strategy loses 
a part of its former strictness because configurational 
evidence is only partially available. 
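The two-case procedure may be sketched as follows. The fragment identifier is modelled as a natural number, as suggested above; the penalty factor `CROSS_FRAGMENT_PENALTY` and the callback `binding_ok` are hypothetical placeholders, since the paper does not fix the amount by which the plausibility score is reduced.

```python
from dataclasses import dataclass
from typing import Callable, Optional

CROSS_FRAGMENT_PENALTY = 0.5  # hypothetical value, chosen for illustration only


@dataclass(frozen=True)
class Occurrence:
    name: str
    fragment_id: int  # identifies the connected component containing the NP


def check_pair(anaphor: Occurrence, candidate: Occurrence, score: float,
               binding_ok: Callable[[Occurrence, Occurrence], bool]) -> Optional[float]:
    """Return the (possibly reduced) plausibility score, or None if the
    candidate is rejected by the strict binding check."""
    if anaphor.fragment_id == candidate.fragment_id:
        # Same fragment: no direct binding theoretic evidence is lost.
        return score if binding_ok(anaphor, candidate) else None
    # Different fragments: heuristic approval with reduced plausibility.
    return score * CROSS_FRAGMENT_PENALTY


it = Occurrence("it", 2)
telescope = Occurrence("telescope", 1)
peter = Occurrence("Peter", 1)
```

The strict check is only consulted when both occurrences share a fragment; otherwise the pair survives the filter with a lower score.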
4.2 Rule Patterns 
Even in the disadvantageous case, a closer look at
the tree fragments of anaphor and antecedent
candidate may reveal additional information. Figure 1
shows rule patterns which exploit this evidence. 5
[F1] √ {... Fi = [... bc(γ)(... γ_typeB ...) ...], ..., Fj = [... bc(α)(... α_typeB ...) ...] ...}
[F2] * {... Fi = [... bn(γ)(... γ_typeB/C ...) ...], ..., Fj = [... bc(α)(... α_typeA ...) ...] ...}
[FE1] √ {... Fd = [... γ_typeB/C ...], ..., Fe = [... bc(α)(... α_typeB ...) ...] ...}
[FE2] * {... Fd = [... γ_typeB/C ...], ..., Fe = [... bc(α)(... α_typeA ...) ...] ...}
[FE3] * {... Fd = [... γ_typeB/C ...], ..., Fe = [... α_typeC ...] ...}, if γ c-commands α independently of the attachment choice
[FE4] * {... Fd = [... α_typeA ...], ..., Fe = [... γ_typeB/C ...] ...}, if root(Fe) ≠ γ
Figure 1: rule patterns for binding constraint verification on fragmentary syntax
In fragmentations matching pattern [F1], both
fragments are constituents which contain the binding
categories bc(α) and bc(γ) of the respective
occurrences α and γ of type B. In particular, this
implies that the fragments may not be attached in
a way that one occurrence locally c-commands the
other. Therefore, in both cases, binding principle B
is respected, and the coindexing of α and γ is
non-heuristically approved.
Conversely, if pattern [F2] is matched, γ is
definitively ruled out as the local antecedent prescribed
for type A anaphor α: it is impossible to connect the
two fragments in a way that γ locally c-commands
α. Here, the fragment of the candidate is only
required to contain the branching node of γ, because
this suffices to preclude that γ c-commands α if Fi
is embedded in Fj.
For certain successive fragment pairs, the parsing
result comprises additional information about
immediate or transitive embedding. Based on this evidence,
further non-heuristic rules ([FE1], [FE2], [FE3], and
[FE4]) become applicable (Fd = dominating fragment,
Fe = embedded fragment).
This list may be supplemented by rules which are
based on more subtle configurational case
distinctions, and, moreover, by heuristic rules which
employ standardized assumptions about typical
decision patterns of structural disambiguation. With
each of the latter rules, an individual decision
plausibility weight may be associated.
In an application context, the extension of heuristic
rules should be limited to configurations which are
known to be of practical relevance. Based on a
suitable corpus annotated with syntactic and referential
information, it should be possible to determine
probabilities for different coindexing configurations by a
statistical distribution analysis. These probabilities
may then be used to derive promising plausibility
weighted rules.
3 This statement, however, solely applies to the
direct comparison of the involved occurrences, since in
case of further, transitive coindexings, negative evidence
stemming from decision interdependency (cf. (Stuckardt,
1996a)) may get lost.
4 This slightly sloppy assumption of a bijection
between NP nodes of syntactic structure and discourse
referent occurrences does not affect the validity of the
subsequent discussion.
5 The following notational conventions are used:
round brackets delimit constituents; square brackets
emphasize fragment boundaries; bc(X) denotes the
binding category of surface structure node X; bn(X) denotes
the branching node dominating X according to the
c-command definition; the subscript of X_typeY denotes
that the binding theoretic class of the occurrence
contributed by X is Y ∈ {A, B, C}, e.g. P_typeB is a
pronominal. √/* indicates that a rule pattern admits/forbids
coindexing.
4.3 An example 
The following example illustrates the application of 
some of the above rules: 6 
Der Mann hat den Präsidenten besucht,
der ihn von sich überzeugte.
"The man has the president visited,
who him from himself convinced."
Because of the intervening past participle, the rel- 
ative clause may be interpreted as an attribute to 
either Mann or Präsidenten. Hence, syntactic
ambiguity arises, yielding a surface structure description
which consists of the following two fragments
(S Mann
   (VP Präsident))
(S der
   (VP ihn
      (VP (PP sich))))
In addition, it is known that the second fragment is 
embedded in the first. There are three pronominal 
anaphors to be resolved: the reflexive pronoun sich 
of type A, the nonreflexive pronoun ihn of type B, 
and the relative pronoun der of type B. 
For the reflexive pronoun sich, the syntactic
restrictions may be applied nonheuristically. Candidates
der and ihn are contained in the same surface
structure fragment. Consequently, binding theoretic
evidence is completely available. Since the candidates
locally c-command sich, they are both determined to
be possible antecedents. The two candidates Mann
and Präsident, however, occur in the other fragment.
Hence, it is attempted to apply one of the above rule
patterns. Since the reflexive pronoun is of binding
theoretic type A, and the fragment in which it occurs
contains its binding category (the S node of the
relative clause), the Fe fragment pattern of rule [FE2]
is matched; analogously, the (dominating) fragment
containing the type C candidates matches the Fd
fragment pattern. Hence, rule [FE2] applies: it
nonheuristically rules out Mann as well as Präsident.
Similarly, for the pronouns ihn and der, type C
candidates Mann and Präsident are definitively
confirmed. Since these anaphors are of type B, the Fe
fragment pattern of rule [FE1] is matched. Moreover,
the (dominating) antecedent candidate fragment
matches pattern Fd. Consequently, [FE1] applies
and predicts the admissibility of the candidates.
6 The example is given in German, because the
structural ambiguity comes out more strikingly.
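The decisions of this example can be replayed by a small sketch of rules [FE1] and [FE2]. It assumes that the candidate occurs in the dominating fragment, that the anaphor occurs in the embedded fragment, and that the candidate slot of both rules accepts occurrences of type B or C; the function name and the boolean encoding (True = admitted, False = forbidden, None = no rule applies) are hypothetical.

```python
from typing import Dict, Optional, Tuple


def embedded_fragment_rule(anaphor_type: str, candidate_type: str,
                           anaphor_fragment_has_bc: bool) -> Optional[bool]:
    """Candidate in dominating fragment Fd, anaphor in embedded fragment Fe."""
    if candidate_type not in ("B", "C") or not anaphor_fragment_has_bc:
        return None
    if anaphor_type == "B":
        return True   # [FE1]: coindexing is admitted
    if anaphor_type == "A":
        return False  # [FE2]: coindexing is forbidden
    return None


# Binding theoretic types as stated in the example
types = {"sich": "A", "ihn": "B", "der": "B", "Mann": "C", "Präsident": "C"}

# The relative clause fragment contains the binding category of all three anaphors.
results: Dict[Tuple[str, str], Optional[bool]] = {
    (anaphor, candidate): embedded_fragment_rule(types[anaphor], types[candidate], True)
    for anaphor in ("sich", "ihn", "der")
    for candidate in ("Mann", "Präsident")
}
```

Both candidates are ruled out for sich and admitted for ihn and der, in line with the discussion above.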
4.4 An Implementation 
The above technique for achieving robustness ac- 
cording to the deficient description model has been 
integrated into an anaphor resolution system for 
German text (Stuckardt, 1996b). 7 At present, only 
nonheuristic rules are employed. 
In a quantitative evaluation on a corpus of architect 
biographies (Lampugnani, 1983), the algorithm cor- 
rectly resolved about 82 per cent of type B pronouns 
(including possessives). In an idealized test scenario 
in which correct syntactic readings were manually 
provided, a precision of 90 per cent was obtained. 
Hence, on fragmentary syntax, the result quality
only decreases by 8 percentage points. Compared
with the 75 per cent achieved by the shallow de- 
scription approach of Kennedy and Boguraev, this 
indicates that approaches to robust anaphor reso- 
lution which follow the deficient description model 
may achieve a higher precision. A principled, in- 
structive comparison based on a broader set of text 
genres has to confirm this improvement. 
5 A Complete Algorithm 
By the nonheuristic rules of the above method, only 
those parts of surface structure description are ex- 
ploited which are valid independently of further dis- 
ambiguation. It is, however, possible to follow a 
more principled approach which utilizes configura- 
tional evidence that is confined to certain readings. 
As will be shown in the following, this may be
achieved by tracing the reading dependency of an- 
tecedent decisions relying on particular configura- 
tions. Through this technique, the results of refer- 
ential disambiguation can be utilized as evidence for 
further narrowing structural ambiguity. 
7 In its principal layout, the algorithm coincides with
the one that will be described in section 5.3. 
5.1 Dominance Relations in Packed Shared 
Forests 
Following a standardized framework, ambiguous 
parsing results are henceforth assumed to be rep- 
resented as packed shared forests (PSFs) (Tomita, 
1985). In this representation, structural ambiguity 
is encoded by packing different derivation variants 
of input substrings into single interior nodes. More- 
over, subtrees common to different readings are al- 
lowed to be shared. Formally, such a parse forest can 
be described as a directed acyclic graph (DAG) with 
a distinguished topmost element, and leaves corre- 
sponding to input words. 
Given a PSF T with nodes $V = \{v_1, \ldots, v_k\}$, let
$P(v_1), \ldots, P(v_k)$ be the respective derivation
variants according to packing. Hence,
$$ n := \prod_{i=1}^{k} |P(v_i)| $$
denotes the maximum number of readings (parse 
trees) represented by T. Consequently, sets of read- 
ings may be specified by bit vectors of length n. 
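As a small worked illustration, the bound n and the bit vector encoding of reading sets may be sketched with Python integers serving as bit vectors. The variant counts and the helper names below are hypothetical.

```python
from math import prod
from typing import Iterable


def max_readings(variant_counts: Iterable[int]) -> int:
    """Upper bound n: product of the numbers of derivation variants per node."""
    return prod(variant_counts)


def reading_set_to_vector(readings: Iterable[int]) -> int:
    """Encode a set of reading indices as a bit vector (bit r set = reading r)."""
    vector = 0
    for r in readings:
        vector |= 1 << r
    return vector


def all_readings(n: int) -> int:
    """The vector (1, ..., 1) of length n."""
    return (1 << n) - 1


variant_counts = [1, 2, 1, 3]  # hypothetical |P(v_i)| for four PSF nodes
n = max_readings(variant_counts)
```

With one node packed twice and another packed three times, at most six readings have to be distinguished, so that vectors of six bits suffice.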
The application of binding principles crucially rests 
on the availability of information about dominance 
relations between parse tree nodes. In ambigu- 
ous structure, configurational relations may be con- 
fined to certain subsets of readings. The idea now 
is to qualify dominance information by bit vectors 
of length n. For each pair $(v_i, v_o)$ consisting of
an interior node $v_i$ and an occurrence contributing
node $v_o$ (usually preterminal), vectors
$\vec{ad}(v_i, v_o)$ and $\vec{ld}(v_i, v_o)$ are
introduced. They characterize the readings in which
$v_i$ arbitrarily dominates, or (in the sense of binding
theory) locally dominates $v_o$. Based on these vectors, it will
be possible to apply the binding restrictions in a 
reading sensitive way. 
By generalizing a technique for unambiguous syntax
(Correa, 1988), the reading-qualified dominance
information may be precomputed as follows. 8 The
tree traversal process starts at the preterminal nodes
which are assumed to be shared among all readings.
(If this condition is not satisfied, a preprocessing
is performed which, by topdown propagation,
determines reading characterization vectors which are
then taken for a qualified initialization of the vectors
$\vec{ld}$ and $\vec{ad}$.) Each preterminal node $v_p$ is
assigned the following vectors:
$$ \vec{ad}(v_p, v_o) = \vec{ld}(v_p, v_o) := \begin{cases} (1, \ldots, 1), & v_p = v_o \\ (0, \ldots, 0), & v_p \neq v_o \end{cases} $$
$$ \vec{ad}(v_i, v_o) := \bigvee_{1 \le m \le |P(v_i)|} \Big( \vec{pv}(P(v_i), m) \wedge \big( \bigvee_{v_d \in D(P(v_i), m)} \vec{ad}(v_d, v_o) \big) \Big) $$
$$ \vec{ld}(v_i, v_o) := \begin{cases} \bigvee_{1 \le m \le |P(v_i)|} \Big( \vec{pv}(P(v_i), m) \wedge \big( \bigvee_{v_d \in D(P(v_i), m)} \vec{ld}(v_d, v_o) \big) \Big), & \text{if } \neg\, bcateg(v_i) \\ (0, \ldots, 0), & \text{if } bcateg(v_i) \end{cases} $$
Figure 2: bottom-up computation of dominance vectors
8 Alternatively, this information may be determined
on demand of the anaphor resolution task.
The computation proceeds bottom-up as follows.
Let $v_i$ be an interior node for which all descendants
of all derivation variants in $P(v_i)$ have already been
processed. By taking into consideration whether the
node delimits a local domain of binding, the vectors
are assigned as shown in figure 2 (the operators $\wedge$
and $\vee$ denote bitwise conjunction and disjunction of
vectors, respectively).
The computation of the vectors $\vec{ad}(v_i, v_o)$, which
characterize the readings in which $v_i$ arbitrarily
dominates $v_o$, constitutes the basic case. The
outermost vector disjunction ranges over the derivation
variants $P(v_i)$ which, due to packing, exist for the
interior node $v_i$. Bit vector $\vec{pv}(P(v_i), m)$ acts as a
filter characterizing the subset of readings in which
the m-th derivation variant of $v_i$, $1 \le m \le |P(v_i)|$,
is valid. 9 For each derivation variant, there exists a
set $D(P(v_i), m)$ of descendants $v_d$ which correspond
to nonterminals on the right-hand side of the
respective rewriting rule. The nodes that are dominated
by these nonterminals are transitively dominated by
$v_i$. Hence, the overall result is obtained by
recursively summing up the individual contributions of
the descendants and qualifying them by conjoining
them with $\vec{pv}(P(v_i), m)$.
Vectors $\vec{ld}(v_i, v_o)$, which characterize the readings in
which $v_i$ locally dominates $v_o$, are computed
similarly. The only difference arises if $v_i$ delimits a local
domain of binding. In this case, the dominance
relation computation starts from scratch, i.e. with the
zero vector.
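The bottom-up computation may be sketched as follows, with Python integers as bit vectors. The sketch follows the case distinction of figure 2 literally. The toy forest, the node names, and the explicitly supplied filter vectors (called `pv` in the code) are assumptions made for illustration, since the encoding scheme for reading characterization is left open here. Bit 0 stands for the adverbial reading and bit 1 for the attributive reading of the telescope example; the verb is omitted.

```python
from typing import Dict, List, Set, Tuple

Variant = Tuple[int, List[str]]  # (filter vector of the variant, child nodes)


def dominance_vectors(order: List[str],
                      variants: Dict[str, List[Variant]],
                      binding_categories: Set[str],
                      preterminals: List[str],
                      target: str,
                      n: int) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Compute ad(v, target) and ld(v, target) for all nodes v."""
    full = (1 << n) - 1
    ad: Dict[str, int] = {}
    ld: Dict[str, int] = {}
    for p in preterminals:  # initialization at the preterminal nodes
        ad[p] = ld[p] = full if p == target else 0
    for node in order:  # children must precede their parents
        ad_sum = ld_sum = 0
        for pv, children in variants[node]:
            ad_children = ld_children = 0
            for child in children:
                ad_children |= ad[child]
                ld_children |= ld[child]
            ad_sum |= pv & ad_children
            ld_sum |= pv & ld_children
        ad[node] = ad_sum
        # A node delimiting a local domain of binding restarts with the zero vector.
        ld[node] = 0 if node in binding_categories else ld_sum
    return ad, ld


variants = {
    "PP": [(0b11, ["it"])],
    "NP_tel": [(0b11, ["telescope"])],
    "NP_big": [(0b10, ["NP_tel", "PP"])],                  # attributive reading only
    "VP": [(0b01, ["NP_tel", "PP"]), (0b10, ["NP_big"])],  # packed node
    "S": [(0b11, ["Peter", "VP"])],
}
order = ["PP", "NP_tel", "NP_big", "VP", "S"]
ad, ld = dominance_vectors(order, variants, {"S"},
                           ["Peter", "telescope", "it"], "it", 2)
```

In this toy forest, `ad["NP_big"]` marks exactly the attributive reading, i.e. the only reading in which the pronoun is contained in the NP headed by telescope.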
5.2 Binding Principle Verification on 
Packed Shared Forests 
The vectors $\vec{ad}(v_i, v_o)$ and $\vec{ld}(v_i, v_o)$ are used to
perform referential disambiguation which is
sensitive to structural ambiguity. For this purpose,
during anaphor resolution, each pair of anaphor α and
antecedent candidate γ (identified with the
corresponding preterminal nodes) is assigned a vector
$\vec{cv}(\alpha, \gamma)$ characterizing the readings under which a
coindexing of α and γ is configurationally
admissible. By taking into account the respective binding
principles, $\vec{cv}(\alpha, \gamma)$ may be determined as follows:
$$ \vec{cv}(\alpha, \gamma) := \Big( \bigvee_{v \in bn(\gamma)} \big( \vec{pv}(P(v), m) \wedge \vec{bps}(v, \alpha) \big) \Big) \wedge \Big( \bigvee_{v \in bn(\alpha)} \big( \vec{pv}(P(v), m) \wedge \vec{bpw}(v, \gamma) \big) \Big) $$
where bn(x) represents the set of dominators of a
node x which are branching nodes for x in the sense
of the c-command definition. The first conjunct
specifies the bit vector characterizing the subset of
readings under which the binding principle of the
anaphor α is (constructively) satisfied. Analogously,
the second conjunct describes the (unconstructive)
binding principle verification for the antecedent
candidate γ. In both cases, branching nodes v have to
be considered because they determine the starting
point for the application of the dominance
information. Since the property of being a branching node
relative to another node may in general depend on
packing variants, reading dependency arises. Again,
this subtlety is modeled by adding up a set of
disjuncts which are qualified by vectors $\vec{pv}(P(v), m)$,
$v \in bn(x)$. Here, these vectors characterize the
subset of readings in which the property of being
a branching node relative to node x holds for v.
The strong (constructive) and weak (unconstructive)
verification of binding principles is accomplished by
a conjunction with vectors $\vec{bps}(v, \alpha)$ and $\vec{bpw}(v, \gamma)$,
respectively, which, depending on the applicable
binding principle, exploit the reading-qualified
domination information:
$$ \vec{bps}(v_1, v_2) := \begin{cases} \vec{ld}(v_1, v_2), & \text{if } bttype(v_2) = A \\ \neg\, \vec{ld}(v_1, v_2), & \text{if } bttype(v_2) = B \\ \neg\, \vec{ad}(v_1, v_2), & \text{if } bttype(v_2) = C \end{cases} $$
$$ \vec{bpw}(v_1, v_2) := \begin{cases} (1, \ldots, 1), & \text{if } bttype(v_2) = A \\ \neg\, \vec{ld}(v_1, v_2), & \text{if } bttype(v_2) = B \\ \neg\, \vec{ad}(v_1, v_2), & \text{if } bttype(v_2) = C \end{cases} $$
The sole difference between the strong and the weak
version holds for type A occurrences. While binding
principle A constructively demands the existence of
a local binder, it does not preclude further nonlocal
coindexings. This prediction is in accordance with
the following, intuitively acceptable example:
The barber_i admits that he_i shaves himself_i.
1. For each anaphoric NP α, determine the set of admissible antecedents γ:
(a) Verify morphosyntactic or lexical agreement with γ.
(b) If the antecedent candidate γ is intrasentential: by checking that $\vec{cv}(\alpha, \gamma) \neq (0, \ldots, 0)$, verify that
i. the binding restriction of α is constructively satisfied,
ii. the binding restriction of γ is not violated.
(c) If α is a type B pronoun, antecedent candidate γ is intrasentential, and, according to surface order, γ follows α, verify that γ is definite.
2. Scoring and sorting:
(a) For each remaining anaphor-candidate pair (α_i, γ_j), determine, according to salience and parallelism heuristics, the numerical plausibility score v(α_i, γ_j).
(b) For each anaphor α: sort candidates γ_j according to decreasing plausibility v(α, γ_j).
(c) Sort the anaphors α according to decreasing plausibility of their individual best antecedent candidate.
3. Antecedent Selection: Initialization $\vec{rv} := (1, \ldots, 1)$. Consider anaphors α in the order determined in step 2c. Suggest antecedent candidates γ(α) in the order determined in step 2b. Select γ(α) as candidate if there is no interdependency, i.e. if
(a) the morphosyntactic features of α and γ(α) are still compatible,
(b) for each NP δ whose coindexing with γ(α) has been determined in the current invocation of the algorithm: the coindexing of α and δ which results as a side effect when choosing γ(α) as antecedent for α does not violate the binding principles, i.e. $\vec{cv}(\alpha, \delta) \neq (0, \ldots, 0)$. (Here, for both occurrences, the weak predicate $\vec{bpw}$ applies.)
(c) $\vec{rv} := \vec{rv} \wedge \big( \bigwedge_{\delta \text{ as above}} \vec{cv}(\alpha, \delta) \big) \wedge \vec{cv}(\alpha, \gamma)$ does not become $(0, \ldots, 0)$.
Figure 3: robust anaphor resolution on PSFs
9 Upon fixation of a particular encoding scheme for
reading characterization, vectors $\vec{pv}(P(v_i), m)$ may be
computed according to a simple formula.
During the candidate filtering phase of anaphor
resolution, compliance with the binding theoretic
disjoint reference rules is now verified by computing
vectors $\vec{cv}(\alpha, \gamma)$. γ is considered a suitable candidate
for α only if $\vec{cv}(\alpha, \gamma)$ is not completely zero. In the
course of the antecedent selection phase, the vectors
of the individual decisions as well as the vectors
pertaining to dynamically resulting transitive
coindexings are conjoined to form a vector $\vec{rv}$ which
characterizes the overall reading dependency:
$$ \vec{rv} = \bigwedge_{X \text{ and } Y \text{ coindexed}} \vec{cv}(X, Y) $$
This gives rise to a further restriction which has to
be checked during the decision interdependency test
step: to assure the existence of an overall reading,
only choices may be combined for which $\vec{rv}$ does not
become the zero vector.
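The strong and weak predicates and the stepwise conjunction of decision vectors may be sketched as follows, again with integers as bit vectors. The sketch assumes that principles B and C are checked against the complement of the respective dominance vector; the function names and the example vectors are hypothetical, and the greedy loop deliberately omits any retraction of earlier decisions.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]


def bps(ld: int, ad: int, bttype: str, full: int) -> int:
    """Strong (constructive) check for an occurrence of the given type."""
    if bttype == "A":
        return ld          # a local binder must exist
    if bttype == "B":
        return full & ~ld  # free in the binding category
    return full & ~ad      # type C: free in any domain


def bpw(ld: int, ad: int, bttype: str, full: int) -> int:
    """Weak (unconstructive) check; differs from bps only for type A."""
    return full if bttype == "A" else bps(ld, ad, bttype, full)


def combine(decisions: Iterable[Tuple[Pair, int]], n: int) -> Tuple[List[Pair], int]:
    """Accept decisions as long as the overall reading vector stays non-zero."""
    rv = (1 << n) - 1
    accepted: List[Pair] = []
    for pair, cv in decisions:
        if rv & cv:
            rv &= cv
            accepted.append(pair)
    return accepted, rv


decisions = [(("it", "telescope"), 0b0011),
             (("he", "Peter"), 0b0110),
             (("him", "owner"), 0b1000)]
accepted, rv = combine(decisions, 4)
```

In the example, the third decision is rejected because no reading remains that is compatible with all three coindexings.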
5.3 An Algorithm 
Figure 3 shows an anaphor resolution algorithm 
which employs the above specified method. In step 
1, restrictions are applied. By determining vectors
$\vec{cv}(\alpha, \gamma)$, the binding constraints are verified. In
step 2, numerical preference scoring and sorting is
performed. 10 Finally, in step 3, antecedent selection
takes place. Only decisions which do not interdepend
are combined. The reading compatibility is verified
by the stepwise computation of vector $\vec{rv}$. 11
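The sorting of step 2 may be sketched as follows. The plausibility scores are hypothetical numbers; how they are derived from salience and parallelism heuristics is not modelled.

```python
from typing import Dict, List, Tuple


def order_for_selection(scores: Dict[Tuple[str, str], float]
                        ) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return the anaphor order (step 2c) and, per anaphor, its candidates
    sorted by decreasing plausibility (step 2b)."""
    by_anaphor: Dict[str, List[Tuple[float, str]]] = {}
    for (anaphor, candidate), score in scores.items():
        by_anaphor.setdefault(anaphor, []).append((score, candidate))
    ranked = {
        anaphor: [c for _, c in sorted(pairs, key=lambda p: p[0], reverse=True)]
        for anaphor, pairs in by_anaphor.items()
    }
    best = {anaphor: max(score for score, _ in pairs)
            for anaphor, pairs in by_anaphor.items()}
    anaphor_order = sorted(best, key=lambda a: best[a], reverse=True)
    return anaphor_order, ranked


scores = {("sich", "der"): 0.9, ("sich", "ihn"): 0.4,
          ("ihn", "Mann"): 0.7, ("ihn", "Präsident"): 0.8}
anaphor_order, ranked = order_for_selection(scores)
```

Anaphors with a highly plausible best candidate are thus decided first, which lets the most reliable decisions constrain the later ones.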
A theoretical analysis shows that, under the
assumption of a clever organization of the computation,
the number of bit vector conjunctions and
disjunctions (including dominance vector determination) is
bounded by $O(bq^2 + s)$, where q is the number of
occurrence contributing NPs, s denotes the size of the
PSF, and b stands for the maximal degree of
branching due to packing and sharing. For natural
language grammars, it is justified to assume that b is a
(small) constant. 12 The complexity of sorting in step
2 is $O(q^2 \log q)$. Hence, the overall time complexity
of the approach amounts to $O(q^2 (n + \log q) + s)$.
The practical contribution of n, however, is reduced:
in a reasonable implementation, the conjunction or
disjunction of w bits (w = processor word length)
will be performed by an elementary operation.
10 The optimal choice of preference factors and weights
remains to be investigated; at least some of the
criteria which have been investigated by (Lappin and Leass,
1994) do not immediately generalize to deficient syntax.
11 In some cases, it may be necessary to retract
decisions. Hence, step 3 has to be supplemented with a
backtracking facility.
12 In the general case, however, b may be $O(|V|)$.
5.4 Structural Disambiguation by 
Referential Evidence 
Since vectors $\vec{rv}$ describing the reading dependency
are available, any set of anaphor resolution choices
may now be referred to as further evidence for
structural disambiguation. PSF trees which are thereby
rendered invalid can be eliminated by pruning all packing
variants whose characterizing vectors are orthogonal
to $\vec{rv}$. By this means, based on the above described
framework, it becomes possible to realize a paral- 
lel processing model which accomplishes the refined 
version of robustness by interaction. 
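The pruning step may be sketched as follows. Two bit vectors are taken to be orthogonal if their bitwise conjunction is zero. The toy forest, the node names, and the explicitly supplied variant vectors are hypothetical; bit 0 stands for an adverbial and bit 1 for an attributive attachment of a PP.

```python
from typing import Dict, List, Tuple

Variant = Tuple[int, List[str]]  # (characterizing vector of the variant, child nodes)


def prune(variants: Dict[str, List[Variant]], rv: int) -> Dict[str, List[Variant]]:
    """Drop all packing variants that share no reading with rv; nodes left
    without any variant are removed as well."""
    pruned: Dict[str, List[Variant]] = {}
    for node, node_variants in variants.items():
        kept = [(pv, children) for pv, children in node_variants if pv & rv]
        if kept:
            pruned[node] = kept
    return pruned


variants = {
    "PP": [(0b11, ["it"])],
    "NP_tel": [(0b11, ["telescope"])],
    "NP_big": [(0b10, ["NP_tel", "PP"])],
    "VP": [(0b01, ["NP_tel", "PP"]), (0b10, ["NP_big"])],
    "S": [(0b11, ["Peter", "VP"])],
}
# Assume the chosen coindexing is admissible only in the adverbial reading.
pruned = prune(variants, 0b01)
```

After pruning, the packed VP node retains a single derivation variant, so that the referential decision has resolved the attachment ambiguity.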
6 Conclusion 
Two approaches to robust anaphor resolution have 
been presented. The first one, which has been imple- 
mented, works on fragments of parses representing 
subtrees which, due to ambiguity or constraint vio- 
lation, have not been conjoined. The second one has 
been formally specified. It exploits structural infor- 
mation as far as possible by taking into account dom- 
inance relations that are confined to certain read- 
ings. Moreover, by producing an exact description of 
the reading dependency of its decisions, anaphor res- 
olution according to the latter model yields further 
evidence for structural disambiguation, thereby ren- 
dering possible a higher degree of robust processing 
by allowing structural and referential disambigua- 
tion to interact. 
An implementation of the theoretically described ap- 
proach has to show whether its practical behaviour, 
which depends on the maximum number of PSF 
readings n determining the length of the bitvectors, 
is acceptable. A further generalization is needed 
for the processing of (truly) fragmentary PSFs in
case there is no reading at all. A practically fea- 
sible solution might be obtained by combining the 
two approaches: the theoretically complete method 
is applied to cope with ambiguity that is clearly con- 
fined to certain "local" fragments (e.g. PP attach- 
ment within clausal fragments), thereby keeping n 
small. The heuristic approach handles the remain- 
ing cases (e.g. unattached clausal fragments). While 
a systematic investigation and evaluation of the lat- 
ter issue is pending, the first results for the prac- 
tical method confirm that, in accordance with psy- 
cholinguistic evidence, high-quality anaphor resolu- 
tion does not hinge on the availability of a unique 
syntactic description. 

References 
Jaime G. Carbonell and Ralf D. Brown. 1988. Ana- 
phora Resolution: A Multi-Strategy Approach. In: 
Proceedings of the 12th International Conference 
on Computational Linguistics, COLING, Vol. I, 
96-101, Budapest. 
Noam Chomsky. 1981. Lectures on Government and 
Binding. Foris Publications, Dordrecht. 
Nelson Correa. 1988. A Binding Rule for Govern- 
ment-binding Parsing. In: Proceedings of the 12th 
International Conference on Computational Lin- 
guistics, COLING, Vol. I, 123-129, Budapest. 
Fred Karlsson, Atro Voutilainen, Juha Heikkilä, and
Arto Anttila. 1995. Constraint Grammar: A
language-independent system for parsing free text.
Mouton de Gruyter, Berlin/New York.
Christopher Kennedy and Branimir Boguraev. 1996. 
Anaphora for Everyone: Pronominal Anaphora 
Resolution without a Parser. In: Proceedings of 
the 16th International Conference on Computa- 
tional Linguistics (COLING), Vol. I, August 1996, 
Copenhagen, 113-118.
Vittorio M. Lampugnani (ed). 1983. Lexikon 
der Architektur des 20. Jahrhunderts. Hatje, 
Stuttgart. 
Shalom Lappin and Herbert J. Leass. 1994. An 
Algorithm for Pronominal Anaphora Resolution. 
In: Computational Linguistics, 20 (4), 535-561. 
Wolfgang Menzel. 1995. Robust Processing of Nat- 
ural Language. Working report, Arbeitsbereich 
Natürlichsprachige Systeme (NaTS), Universität
Hamburg, Vogt-Kölln-Straße 30, 22527 Hamburg,
Germany.
Roland Stuckardt. 1996a. An Interdependency- 
Sensitive Approach to Anaphor Resolution. In: 
Approaches to Discourse Anaphora: Proceedings 
of the Discourse Anaphora and Anaphor Resolu- 
tion Colloquium (DAARC96), UCREL Technical 
Papers Series, Vol. 8, Lancaster University, July 
1996. 400-413. 
Roland Stuckardt. 1996b. Anaphor Resolution and 
the Scope of Syntactic Constraints. In: Proceed- 
ings of the 16th International Conference on Com- 
putational Linguistics (COLING), Vol. II, August 
1996, Copenhagen, 932-937.
Masaru Tomita. 1985. Efficient Parsing for Natu- 
ral Language. Kluwer Academic Publishers, Dor- 
drecht, The Netherlands. 
