INFORMATION STATES AS FIRST CLASS CITIZENS 
Jorgen Villadsen 
Centre for Language Technology, University of Copenhagen 
Njalsgade 80, DK-2300 Copenhagen S, Denmark 
Internet: jv@cst.ku.dk 
ABSTRACT 
The information state of an agent is changed when 
a text (in natural language) is processed. The 
meaning of a text can be taken to be this information
state change potential. The inference of a consequence
makes explicit something already implicit
in the premises -- i.e. no information state
change occurs if the (assumed) consequence text is
processed after the (given) premise texts have been
processed. Elementary logic (i.e. first-order logic)
can be used as a logical representation language
for texts, but the notion of an information state (a
set of possibilities -- namely first-order models) is
not available from the object language (it belongs to
the meta-language). This means that texts with
other texts as parts (e.g. propositional attitudes 
with embedded sentences) cannot be treated directly.
Traditional intensional logics (e.g. modal
logic) allow (via modal operators) access to the 
information states from the object language, but 
the access is limited and interference with (exten- 
sional) notions like (standard) identity, variables 
etc. is introduced. This does not mean that the 
ideas present in intensional logics will not work 
(possibly improved by adding a notion of partial- 
ity), but rather that often a formalisation in the 
simple type theory (with sorts for entities and in- 
dices making information states first class citizens 
-- like individuals) is more comprehensible, flexi- 
ble and logically well-behaved. 
INTRODUCTION 
Classical first-order logic (hereafter called elementary
logic) is often used as a logical representation
language. For instance, elementary logic has
proven very useful when formalising mathematical
structures, as in axiomatic set theory, number
theory etc. Also, in natural language processing
(NLP) systems, "toy" examples are easily formalised
in elementary logic:
Every man lies. John is a man. 
So, John lies. (1) 
$$\frac{\forall x\,(man(x) \rightarrow lie(x)), \quad man(John)}{lie(John)} \qquad (2)$$
The formalisation is judged adequate since the 
model theory of elementary logic is in correspon- 
dence with intuitions (when some logical maturity 
is gained and some logical innocence is lost) -- 
moreover the proof theory gives a reasonable no- 
tion of entailment for the "toy" examples. 
Extending this success story to linguistically 
more complicated cases is difficult. Two problem- 
atic topics are: 
Anaphora 
It must be explained how, in a text, a dependent 
manages to pick up a referent that was introduced 
by its antecedent. 
Every man lies. John is a man. 
So, he lies. (3) 
Attitude reports 
Propositional attitudes involve reports about cognition
(belief/knowledge), perception etc.
Mary believes that every man lies. 
John is a man. 
So, Mary believes that John lies. (4) 
Characteristically, if one starts with the
"toy" examples in elementary logic, it is very difficult
to make progress on the above-mentioned
problematic topics. Much of the work on the
first topic comes from the last decade --
in the case of the second topic, pioneering work by
Hintikka, Kripke and Montague started in the sixties.
The aim of this paper is to show that by taking 
an abstract notion of information states as start- 
ing point the "toy" examples and the limitations 
of elementary logic are better understood. We argue
that information states are to be taken seriously
in logic-based approaches to NLP. Furthermore,
we think that information states can be regarded 
as sets of possibilities (structural aspects can be 
added, but should not be taken as stand-alone). 
Information states are at the meta-level only 
when elementary logic is used. Information states 
are still mainly at the meta-level when intensional 
logics (e.g. modal logic) are used, but some ma- 
nipulations are available at the object level. 
This limited access is problematic in connec- 
tion with (extensional) notions like (standard) 
identity, variables etc. Information states can be 
put at object level by using a so-called simple type 
theory (a classical higher-order logic based on the
simply typed λ-calculus) -- this gives a very elegant
framework for NLP applications.
The point is not that elementary logic or the various
intensional logics are wrong -- on the contrary
they include many important ideas -- but for the 
purpose of understanding, integrating and imple- 
menting a formalisation one is better off with a 
simple type theory (stronger type theories are pos- 
sible, of course). 
AGENTS AND TEXTS 
Consider an agent processing the texts $t_1, \ldots, t_n$.
By processing we mean that the agent ac- 
cepts the information conveyed by the texts. The 
texts are assumed to be declarative (purely infor- 
mative) and unambiguous (uniquely informative). 
The texts are processed one by one (dynamically) 
-- not considered as a whole (statically). The dy- 
namic interpretation of texts seems more realistic 
than the static interpretation. 
By a text we mean a (complete) discourse
-- although as examples we use only single (complete)
sentences. We take the completeness to
mean that the order of the texts is irrelevant. In 
general texts have expressions as parts whose or- 
der is important -- the completeness requirement 
only means that the (top level) texts are complete 
units. 
INFORMATION STATES 
We first consider an abstract notion of an infor- 
mation state (often called a knowledge state or a 
belief state). The initial information state $I_0$ is
assumed known (or assumed irrelevant). The information
state of the agent changes as follows:
$$I_0 \xrightarrow{\tau_1} I_1 \xrightarrow{\tau_2} I_2 \xrightarrow{\tau_3} \cdots \xrightarrow{\tau_n} I_n$$
where $\tau_i$ is the change in the information state
when the text $t_i$ is processed.
An obvious approach is to identify information 
states with the set of texts already processed --
hence nothing is lost. Some improvements are possible
(normalisation and the like). Since the texts
are concrete objects they are easy to treat compu- 
tationally. We call this approach the syntactical 
approach. 
An orthogonal approach (the semantical ap- 
proach) identifies information states with sets of 
possibilities. This is the approach followed here. 
Note that a possibility need not be a so-called 
"possible world" -- partiality and similar notions 
can be introduced, see Muskens (1989). 
A combination of the two approaches might 
be the optimal solution. Many of these aspects 
are discussed in Konolige (1986). 
Observe that the universal and empty sets are
understood as opposites: the empty set of possibilities
and the universal set of texts represent the
(absolute) inconsistent information state; and the
universal set of possibilities and the empty set of
texts represent the (absolute) initial information
state. Other notions of consistency and initiality 
can be defined. 
A partial order on information states ("getting
better informed") is easily obtained. For the syntactical
approach this is trivial -- more texts make
one better informed. For the semantical approach
one could introduce previously eliminated possibilities
in the information state, but we assume
eliminative information state changes: $\tau(I) \subseteq I$
for all $I$ (this does not necessarily hold for non-monotonic
logics / belief revision / anaphora(?)
-- see Groenendijk and Stokhof (1991) for further 
details). 
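The two orderings and the eliminativity assumption can be illustrated with a minimal sketch. It assumes a toy setting that is not in the paper: a possibility is a truth assignment to two hypothetical atoms (`rain`, `wind`), a text is a Boolean test on such an assignment, and the names `update`, `better_informed` and `better_informed_syntactic` are invented for the illustration.

```python
from itertools import product

# Hypothetical toy setting: a possibility assigns a truth value to each atom.
ATOMS = ("rain", "wind")
ALL = frozenset(product((False, True), repeat=len(ATOMS)))  # initial state: nothing excluded


def update(state, text):
    """Eliminative change: keep only the possibilities compatible with `text`."""
    return frozenset(p for p in state if text(dict(zip(ATOMS, p))))


def better_informed(i, j):
    """Semantical ordering: fewer possibilities means better informed."""
    return i <= j


def better_informed_syntactic(s, t):
    """Syntactical ordering: more processed texts means better informed."""
    return s >= t


def it_rains(v):
    return v["rain"]


def no_wind(v):
    return not v["wind"]


i0 = ALL
i1 = update(i0, it_rains)
i2 = update(i1, no_wind)

# Every update only removes possibilities, so the chain is decreasing.
print(len(i0), len(i1), len(i2))
# Processing a contradicting text leaves the empty (inconsistent) state.
print(update(i2, lambda v: not v["rain"]) == frozenset())
```

In this sketch the semantical order is set inclusion and the syntactical order is its mirror image on sets of texts, which matches the remark that the universal and empty sets play opposite roles in the two approaches.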
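Given the texts $t_1, \ldots, t_n$ the agent is asked
whether a text $t$ can be inferred; i.e. whether processing
$t$ after processing $t_1, \ldots, t_n$ would change
the information state or not:
$$I_0 \xrightarrow{\tau_1} I_1 \xrightarrow{\tau_2} \cdots \xrightarrow{\tau_n} I_n \xrightarrow{\tau} I_n$$
Here $\tau$, the change when $t$ is processed, acts as the identity function on $I_n$.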
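The inference test just described can be sketched as follows. This is only an illustration under assumptions not made in the paper: possibilities are truth assignments to two hypothetical atoms `p` and `q`, texts are Boolean tests, and `process` and `can_be_inferred` are invented names.

```python
from itertools import product

# Hypothetical toy setting: possibilities are truth assignments to two atoms.
ATOMS = ("p", "q")
INITIAL = frozenset(product((False, True), repeat=len(ATOMS)))


def update(state, text):
    """Eliminative change for one text."""
    return frozenset(v for v in state if text(dict(zip(ATOMS, v))))


def process(state, texts):
    """Process the texts one by one (dynamically), not as a whole."""
    for text in texts:
        state = update(state, text)
    return state


def can_be_inferred(premises, conclusion, initial=INITIAL):
    """The conclusion is inferred iff processing it changes nothing."""
    state = process(initial, premises)
    return update(state, conclusion) == state


def p_implies_q(v):
    return (not v["p"]) or v["q"]


def p(v):
    return v["p"]


def q(v):
    return v["q"]


print(can_be_inferred([p_implies_q, p], q))
print(can_be_inferred([p_implies_q], q))
```

Note that under this definition an inconsistent (empty) state supports every conclusion, since no update can change it.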
ELEMENTARY LOGIC 
When elementary logic is used as a logical representation
language for texts, information states are
identified with sets of models. 
Let the formulas $\phi_1, \ldots, \phi_n, \phi$ be the translations
of the texts $t_1, \ldots, t_n, t$. The information
state when $t_1, \ldots, t_k$ have been processed is the
set of all models in which $\phi_1, \ldots, \phi_k$ are all true.
$t_1, \ldots, t_n$ entail $t$ if the model set corresponding
to the processing of $t_1, \ldots, t_n$ does not change
when $t$ is processed. Equivalently, consider a
particular model $M$ -- if $\phi_1, \ldots, \phi_n$ are all true in
$M$ then $\phi$ must be true in $M$ as well (this is the
usual formulation of entailment).
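For the "toy" example (1), the agreement between the two formulations of entailment can be checked mechanically in a small sketch. It rests on simplifying assumptions that are not part of the paper: only models over one fixed two-element domain are enumerated (genuine first-order entailment ranges over all models), names denote themselves, and `Model`, `models_of`, `entails_by_update` and `entails_by_truth` are hypothetical names.

```python
from itertools import combinations, product
from typing import FrozenSet, NamedTuple

# Assumption: a single fixed finite domain; names denote themselves.
DOMAIN = ("John", "Mary")


class Model(NamedTuple):
    man: FrozenSet[str]  # extension of the predicate man
    lie: FrozenSet[str]  # extension of the predicate lie


def subsets(xs):
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


MODELS = frozenset(Model(m, l) for m, l in product(subsets(DOMAIN), repeat=2))


def models_of(formulas, models=MODELS):
    """Information state: the models in which all formulas are true."""
    return frozenset(m for m in models if all(f(m) for f in formulas))


def entails_by_update(premises, conclusion):
    """The model set does not change when the conclusion is processed."""
    state = models_of(premises)
    return models_of([conclusion], state) == state


def entails_by_truth(premises, conclusion):
    """Usual formulation: true premises force a true conclusion."""
    return all(conclusion(m) for m in MODELS if all(p(m) for p in premises))


def every_man_lies(m):
    return m.man <= m.lie


def john_is_a_man(m):
    return "John" in m.man


def john_lies(m):
    return "John" in m.lie


premises = [every_man_lies, john_is_a_man]
print(entails_by_update(premises, john_lies), entails_by_truth(premises, john_lies))
```

The set `MODELS` lives entirely in the Python meta-level here, which mirrors the point made next: the formulas themselves have no way of referring to it.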
Hence, although any proof theory for elemen- 
tary logic matches the notion of entailment for 
"toy" example texts, the notion of information 
states is purely a notion of the model theory 
(hence in the meta-language; not available from 
the object language). This is problematic when 
texts have other texts as parts, like the embedded 
sentence in propositional attitudes, since a direct 
formalisation in elementary logic is ruled out. 
TRADITIONAL APPROACH 
When traditional intensional logics (e.g. modal 
logics) are used as logical representation languages 
for texts, information states are identified with 
sets of possible worlds relative to a model M = 
(W,...), where W is the considered set of possible 
worlds. 
The information state when $t_1, \ldots, t_k$ have
been processed is, relative to a model, the set of
possible worlds in which $\phi_1, \ldots, \phi_k$ are all true.
The truth definition for a formula $\phi$ allows for
modal operators, say $\odot$, such that if $\phi$ is $\odot\psi$ then
$\phi$ is true in the possible worlds $W_\phi \subseteq W$ if $\psi$ is
true in the possible worlds $W_\psi \subseteq W$, where $W_\phi =
f_\odot(W_\psi)$ for some function $f_\odot : \mathcal{P}(W) \to \mathcal{P}(W)$
(hence $M = \langle W, f_\odot, \ldots \rangle$).
For the usual modal operator $\Box$ the function
$f_\Box$ reduces to a relation $R_\Box : W \times W$ such that:
$$W_\phi = f_\Box(W_\psi) = \bigcup_{w_\psi \in W_\psi} \{ w_\phi \mid R_\Box(w_\psi, w_\phi) \}$$
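The step from a relation on possible worlds to a function on sets of possible worlds can be sketched as below. The sketch follows the displayed union literally: for each world in the argument set it collects the worlds related to it. The worlds, the relation `R` and the helper name `lift` are hypothetical and chosen only for illustration.

```python
# Hypothetical set of possible worlds and relation between them.
W = frozenset({"w1", "w2", "w3"})
R = frozenset({("w1", "w2"), ("w2", "w3"), ("w3", "w3")})


def lift(relation):
    """Turn a relation on worlds into a function on sets of worlds."""

    def f(worlds):
        # Union, over each world u in `worlds`, of the worlds v with (u, v) in the relation.
        return frozenset(v for (u, v) in relation if u in worlds)

    return f


f_R = lift(R)

print(sorted(f_R(frozenset({"w1"}))))
print(sorted(f_R(frozenset({"w1", "w2"}))))
```

A function obtained this way distributes over unions of world sets, which is one way of seeing that a relation carries less freedom than an arbitrary function from sets of worlds to sets of worlds.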
By introducing more modal operators the informa- 
tion states can be manipulated further (a small set 
of "permutational" and "quantificational" modal 
operators would suffice -- compare combinatory 
logic and variable-free formulations of predicate 
logic). However, the information states as well as 
the possible worlds are never directly accessible 
from the object language. 
Another complication is that the function $f$ of a modal operator
cannot be specified in the object language directly
(although equivalent object language formulas can
often be found -- cf. the correspondence theory for
modal logic).
Perhaps the most annoying complication is 
the possible interference with (extensional) no- 
tions like (standard) identity, where Leibniz's Law 
fails (for non-modally closed formulas) -- see 
Muskens (1989) for examples. If variables are 
present the inference rule of ∀-Introduction fails
in a similar way. 
SIMPLE TYPE THEORY 
The above-mentioned complications become even
more evident if elementary logic is replaced by a
simple type theory while keeping the modal operators
(cf. Montague's Intensional Logic). The λ-calculus
in the simple type theory allows for an elegant
compositionality methodology (category to
type correspondence over the two algebras). Often 
the higher-order logic (quantificational power) fa- 
cilities of the simple type theory are not necessary 
-- or so-called general models are sufficient. 
The complication regarding variables mentioned
above manifests itself in the way that β-reduction
does not hold for the λ-calculus (again,
see Muskens (1989) and references therein). Even
more damaging: the (simply typed!) λ-calculus is
not Church-Rosser (due to the limited α-renaming
capabilities of the modal operators).
What seems needed is a logical representation
language in which the information states are explicitly
manipulable, like the individuals in elementary
logic. This point of view is forcefully defended
by Cresswell (1990), where the possibilities of the 
information states are optimised using the well- 
known technique of indexing. Hence we obtain an 
ontology of entities and indices. 
In recent papers we have presented and discussed
a categorial grammar formalism capable
of (in a strictly compositional way) parsing and
translating natural language texts, see Villadsen
(1991a,b,c). The resulting formulas are terms in a 
many-sorted simple type theory. An example of a 
translation (simplified): 
Mary believes that John lies. (5) 
$\lambda i.\,believe(i, Mary, (\lambda j.\,lie(j, John)))$ (6)
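A term of this shape can be mimicked with ordinary functions once indices are passed around like entities. The following is a minimal sketch, not the formalism of the cited papers: the facts in `LIES`, the table `DOX` of belief alternatives, and the reading of `believe` as truth at all of an agent's alternatives are assumptions made only for the illustration.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Index = str   # sort of indices
Entity = str  # sort of entities
Prop = Callable[[Index], bool]  # a proposition is a function from indices to truth values

INDICES: Tuple[Index, ...] = ("i0", "i1", "i2")

# Hypothetical facts: who lies at which index.
LIES: FrozenSet[Tuple[Index, Entity]] = frozenset(
    {("i1", "John"), ("i2", "John"), ("i2", "Mary")}
)

# Hypothetical belief alternatives: the indices compatible with what
# the agent believes at the given index.
DOX: Dict[Tuple[Index, Entity], FrozenSet[Index]] = {
    ("i0", "Mary"): frozenset({"i1", "i2"}),
    ("i1", "Mary"): frozenset({"i0", "i1"}),
    ("i2", "Mary"): frozenset({"i2"}),
}


def lie(j: Index, x: Entity) -> bool:
    return (j, x) in LIES


def believe(i: Index, agent: Entity, p: Prop) -> bool:
    """Assumed reading: p holds at every belief alternative of agent at i."""
    return all(p(j) for j in DOX[(i, agent)])


# Counterpart of term (6): the embedded sentence is itself a function of an index.
def mary_believes_john_lies(i: Index) -> bool:
    return believe(i, "Mary", lambda j: lie(j, "John"))


# The information state of the whole text is a set of indices, built at object level.
state = frozenset(i for i in INDICES if mary_believes_john_lies(i))
print(sorted(state))
```

The embedded sentence is passed to `believe` as an ordinary argument, so no special operator is needed to reach the indices at which it is evaluated.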
Adding partiality along the lines in Muskens 
(1989) is currently under investigation. 
ACKNOWLEDGMENTS 
Reports work done while at Department of Com- 
puter Science, Technical University of Denmark. 

REFERENCES 
M. J. Cresswell (1990). Entities and Indices. 
Kluwer Academic Publishers. 
J. Groenendijk and M. Stokhof (1991). Two Theo- 
ries of Dynamic Semantics. In J. van Eijck, editor, 
Logics in AI - 91, Amsterdam. Springer-Verlag 
(Lecture Notes in Computer Science 478). 
K. Konolige (1986). A Deduction Model of Belief.
Pitman. 
R. Muskens (1989). Meaning and Partiality. PhD 
thesis, University of Amsterdam. 
J. Villadsen (1991a). Combinatory Categorial 
Grammar for Intensional Fragment of Natural 
Language. In B. Mayoh, editor, Scandinavian 
Conference on Artificial Intelligence - 91, Roskilde.
IOS Press. 
J. Villadsen (1991b). Categorial Grammar and In- 
tensionality. In Annual Meeting of the Danish As- 
sociation for Computational Linguistics - 91, Aal- 
borg. Department of Computational Linguistics, 
Arhus Business School. 
J. Villadsen (1991c). Anaphora and Intensional- 
ity in Classical Logic. In Nordic Computational 
Linguistics Conference - 91, Bergen. To appear. 
