THE RESEARCH PROJECT "ANAPHORA" (IN ITS PRESENT STATE OF 
ADVANCEMENT)
F. Studnicki, B. Polanowska, E. Stabrawa, J. M. Pall, 
A. Łachwa
Institute of Computer Science, Jagellonian University
Cracow, Poland 
1. The aim of the project is to work out a method of
automatically resolving the anaphoric clauses of a certain
class, in particular those used in formulating interdocumentary
cross-references in primary legal texts (statutory texts). By
resolving an anaphoric clause of that class we mean searching
out, as far as possible, all of its referends. The
implementation of the planned method should enable the users of
full-text legal data banks to obtain in search operations, apart
from the documents satisfying the requirements defined in the
usual queries, also those documents to which the former
explicitly, or even implicitly, refer. The project has been
planned in three parts. A report on the results of part I was
presented at the Fifth ICCH Conference in Ann Arbor in May
1981. The present text aims at showing the main outlines of
the approach applied in part II. To make some aspects of that
part clear, however, certain references must be made to part I.
Part III is, as yet, at the stage of preliminary discussions.
2. The general approach applied in the whole of the project
is of a semantic kind. It has been assumed, in particular,
that at a certain level of generalization all elementary
anaphoric clauses of the above class (let us call them the
a-clauses) have, in spite of the diversity of their types, an
analogous semantic structure, which can be represented by the
following diagram:

                  the elementary a-clause
                   /                   \
    the anaphoric functor     the argument of the anaphoric functor
                                 /                   \
                       the standard           the specification
                       of the argument        of the argument

Consider the following fictitious example of a legal 
provision in which an elementary a-clause is inherent: 
"Art. 56. In cases when the price is to be paid in cash 
article 44 of the civil code should be applied." 
In art. 56 the a-functor is represented in the surface
structure by the phrase "should be applied", the argument by the
phrase "article 44", the standard by the phrase "article" and
the specification by the phrase "44". The role of the a-functor
is confined to signalling the fact that the clause in
question has the illocutionary status of an anaphoric
utterance, while that of the argument (and its immediate semantic
constituents) consists in carrying information relevant to
identifying the referends of that clause. Four types of
elementary a-clauses are distinguished: type A (the explicitly
addressing), type D (the deictic), type R (the implicitly
referring) and type S (the semantic). The distinction
corresponds to four types of indication met in the clauses in
question. By indication we mean the way in which the referends
of the a-clause are referred to by it.
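
The decomposition described above can be illustrated by a minimal
sketch in Python. The class and field names are hypothetical and not
taken from the project. Assigning the fictitious art. 56 to type A is
likewise an assumption made only for illustration, since the text
does not classify that example.

```python
from dataclasses import dataclass
from enum import Enum


class IndicationType(Enum):
    """The four types of elementary a-clauses named in the text."""
    A = "explicitly addressing"
    D = "deictic"
    R = "implicitly referring"
    S = "semantic"


@dataclass(frozen=True)
class Argument:
    """Argument of the a-functor, split into its two constituents."""
    standard: str        # e.g. "article"
    specification: str   # e.g. "44"

    def surface(self) -> str:
        # Rebuild the surface phrase from the two constituents.
        return f"{self.standard} {self.specification}"


@dataclass(frozen=True)
class ElementaryAClause:
    functor: str               # phrase signalling the anaphoric status
    argument: Argument         # carries the identifying information
    indication: IndicationType


# The fictitious art. 56 from the text.
art_56 = ElementaryAClause(
    functor="should be applied",
    argument=Argument(standard="article", specification="44"),
    indication=IndicationType.A,  # assumption: not stated in the text
)

print(art_56.argument.surface())
```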
3. The operation of automated resolving of an a-clause
can be conceived of as composed of four stages. Stage 1
consists in identifying an a-clause of this kind within a
definite document (article, paragraph, ...), mostly by recognizing
the phrase representing its a-functor. At stage 2 certain
of the semantic properties of the analyzed a-clause, relevant
for the selection of the most appropriate search procedures,
are identified by the program. This stage results in the
generation of a formula which is a generalized semantic
representation of the a-clause actually analyzed. Such a formula
(the SR-formula) is built in a specific language of semantic
representation (the SR-language), which is a language with
a drastically reduced vocabulary and a very simple syntax,
outlined in part I. At stage 2 only the semantic properties
accounted for by the specific frame-like interpretation
scheme IS are taken into consideration. Stage 3 consists in
utilizing the SR-formulas generated at stage 2 in the
automated selection of the search procedures to be employed at
stage 4. The selection is made from among a set of such
procedures supplied by the corresponding program. At stage
4 the selected procedures are employed in the process of
searching out the referends of the anaphoric clause actually
analyzed (i.e. the documents to which it explicitly or
implicitly refers).
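
The four stages can be pictured as a small pipeline. The following
Python sketch is a toy illustration only. The functor list, the
regular expression, the dictionary form of the SR-formula, the
procedure table and the miniature corpus are all assumptions
introduced here, and none of them is the project's actual algorithm.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical list of phrases representing the a-functor.
FUNCTOR_PHRASES = ["should be applied"]

SRFormula = Dict[str, str]
Procedure = Callable[[SRFormula, Dict[str, str]], List[str]]


def identify_a_clause(text: str) -> Optional[str]:
    """Stage 1: recognize the phrase representing the a-functor."""
    for phrase in FUNCTOR_PHRASES:
        if phrase in text:
            return phrase
    return None


def build_sr_formula(text: str) -> Optional[SRFormula]:
    """Stage 2: toy SR-formula holding only an explicit address."""
    match = re.search(r"\b(article|paragraph)\s+(\d+)", text, re.IGNORECASE)
    if match is None:
        return None
    return {
        "indication": "A",  # assumption: explicit address found
        "standard": match.group(1).lower(),
        "specification": match.group(2),
    }


def search_by_address(formula: SRFormula, corpus: Dict[str, str]) -> List[str]:
    """A search procedure for explicitly addressed referends."""
    key = f"{formula['standard']} {formula['specification']}"
    return [doc_id for doc_id in corpus if doc_id == key]


# Stage 3 selects from this (hypothetical) table of procedures.
PROCEDURES: Dict[str, Procedure] = {"A": search_by_address}


def resolve(text: str, corpus: Dict[str, str]) -> List[str]:
    if identify_a_clause(text) is None:                  # stage 1
        return []
    formula = build_sr_formula(text)                     # stage 2
    if formula is None:
        return []
    procedure = PROCEDURES.get(formula["indication"])    # stage 3
    if procedure is None:
        return []
    return procedure(formula, corpus)                    # stage 4


ART_56 = ("Art. 56. In cases when the price is to be paid in cash "
          "article 44 of the civil code should be applied.")
CORPUS = {"article 44": "(text of article 44)",
          "article 56": ART_56}

print(resolve(ART_56, CORPUS))
```

Keeping stage 3 as a table lookup keyed on the SR-formula mirrors the
separation in the text between describing a clause and choosing how
to search for its referends.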
4. The simplest version of IS, to be used in interpreting
the elementary a-clauses (i.e. the a-clauses in which
only a single indication is inherent), can be conceived of as
an ordered pair <T, R>, where T stands for a data structure
(called "the ladder") composed of 8 consecutive fields
(terminals) and R for a set of rules governing the filling
of definite terminals. According to the rules R, the two
leftmost terminals are destined to carry information on the
type of indication inherent in the elementary a-clause
actually analyzed. The remaining terminals (3-8) each account
for a definite semantic property of such a clause.
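
A possible reading of the pair <T, R> is sketched below in Python.
The names Ladder and InterpretationScheme are hypothetical, and so
is the modelling of R as one predicate per terminal. The single rule
shown, which restricts terminal 1 to the four type letters, is an
assumption, since the text does not list the actual rules or the
values the terminals may take.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

LADDER_SIZE = 8  # the ladder has 8 terminals

Rule = Callable[[str], bool]


@dataclass
class Ladder:
    """T: eight terminals, numbered 1..8 from the left."""
    terminals: List[Optional[str]] = field(
        default_factory=lambda: [None] * LADDER_SIZE
    )

    def fill(self, position: int, value: str) -> None:
        if not 1 <= position <= LADDER_SIZE:
            raise IndexError(f"no terminal {position}")
        self.terminals[position - 1] = value

    def indication(self) -> Tuple[Optional[str], Optional[str]]:
        # Terminals 1-2 carry the type of indication.
        return (self.terminals[0], self.terminals[1])

    def properties(self) -> List[Optional[str]]:
        # Terminals 3-8 each carry one semantic property.
        return self.terminals[2:]


@dataclass
class InterpretationScheme:
    """IS = <T, R>: a ladder plus the rules governing its filling."""
    ladder: Ladder
    rules: Dict[int, Rule]

    def fill(self, position: int, value: str) -> None:
        rule = self.rules.get(position)
        if rule is not None and not rule(value):
            raise ValueError(f"{value!r} not allowed in terminal {position}")
        self.ladder.fill(position, value)


# Assumed rule: terminal 1 accepts only one of the four type letters.
scheme = InterpretationScheme(
    ladder=Ladder(),
    rules={1: lambda value: value in {"A", "D", "R", "S"}},
)
scheme.fill(1, "A")
print(scheme.ladder.indication())
```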
5. By composed a-clauses are meant those in which more
than a single indication is inherent. Such clauses are
semantically represented by the composed formulas of the
SR-language, in particular by a number of filled out
"ladders", connected by the use of certain connectives of the
classical calculi.
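
A composed formula of this kind can be sketched as a small tree of
ladders. The Python fragment below is illustrative only: the text
does not say which connectives are used, so conjunction and
disjunction are assumed here, and the ladder is reduced to a tuple
of filled terminals.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Ladder:
    terminals: Tuple[str, ...]  # simplified: filled terminals only


@dataclass(frozen=True)
class And:  # assumed connective
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:   # assumed connective
    left: "Formula"
    right: "Formula"


Formula = Union[Ladder, And, Or]


def ladders(formula: Formula) -> List[Ladder]:
    """Collect the elementary ladders, one per indication."""
    if isinstance(formula, Ladder):
        return [formula]
    return ladders(formula.left) + ladders(formula.right)


def render(formula: Formula) -> str:
    """Write the formula out in a made-up linear notation."""
    if isinstance(formula, Ladder):
        return "[" + "|".join(formula.terminals) + "]"
    name = "AND" if isinstance(formula, And) else "OR"
    return f"({render(formula.left)} {name} {render(formula.right)})"


# A clause with two indications, e.g. one referring to two articles.
composed = And(Ladder(("A", "44")), Ladder(("A", "45")))
print(render(composed))
```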
6. The empirical investigations which form the subject
of part II were carried out on a representative sample of
the Polish statutory texts of the years 1944 - 1979. The
research aimed at reconstructing all possible ways in which
the semantic properties of all kinds of a-clauses may be
represented in the original texts. Such a reconstruction was
indispensable to building the algorithms for transforming the
"natural" a-clauses into the corresponding SR-formulas, as
well as to building procedures, as effective as possible, for
searching out the referends of the analyzed a-clauses.
The research resulted in forming: a) the lists of words
which occur in the phrases representing in the surface
structure the corresponding semantic constituents of the
a-clauses of all types, b) the lists of words which occur
in the phrases reflecting definite semantic properties of
such clauses, and c) lists of grammars reconstructing the
empirically observed syntactic connections between those
words. Such a way of presenting the results of the empirical
investigations inherent in part II seems most suitable for
constructing the aforementioned algorithms and procedures.
7. Part III of the project is concerned with the ways
of implementing the planned method. Only a few of the
corresponding algorithms and procedures have already been
worked out by the authors.
