Language Processing and Commonsense Reasoning 
Robert Wilensky, Principal Investigator 
Computer Science Division 
University of California at Berkeley 
Berkeley, CA 94720 U.S.A. 
PROJECT GOALS 
The goal of this project is to develop the tech- 
nology to construct intelligent, natural-language- 
capable agents. Such agents will have natural lan- 
guage and reasoning capabilities that facilitate in- 
teraction with untrained users who are seeking in- 
formation pertinent to tasks in which they are en- 
gaged. Emphasis is given to techniques for creating 
such agents for new domains, focusing specifically 
on portability and knowledge acquisition. In partic- 
ular, we have been developing DIRC (Domain Inde- 
pendent Retargetable Consultant), a kind of intel- 
ligent, naturai-language-capable consultation shell, 
and a number of mechanisms for world and language 
knowledge acquisition. 
RECENT RESULTS 
Our implementation of DIRC is about 80% complete. 
We also continued our theoretical work on finding a 
probablistic basis for text inference. Current proba,. 
bilistic and statistical inductive models are difficult 
to bias to generalize from training data; specifically, 
the most common methods for inducing prior prob- 
ability distributions -- relative frequency priors and 
maximum entropy priors -- are inadequate. To ad- 
dress this gap we propose a family of inductive meth- 
ods, called the gamma-continuum. These are a pa- 
rameterized set of methods for generating priors from 
a training set incorporating an a priori abstractive 
bias that causes the model to make generalizations. 
Our driving application is probabilistic pattern com- 
pletion to support integrated natural language pars- 
ing and semantic interpretation, where the patterns 
combine lexical, syntactic, and semantic structures. 
We have developed an accurate, relatively low- 
overhead method for the disambiguation of English 
noun homonyms using a large corpus of free text. 
The objective of the algorithm is the following: given 
an English sentence or sentence fragment containing 
the target noun, determine which of a set of pre- 
determined senses should be assigned to the noun. 
This is accomplished by checking the context sur- 
rounding the target noun against that of previously 
recorded instances, and choosing the sense for which 
the most evidence is found. Initial results are promis- 
ing. 
We are continuing work on intelligent dictio- 
nary reading for natural language vocabulary ac- 
quisition. In particular, we have been studying 
noun/preposition patterns and methods for acquir- 
ing their meanings. In some cases, these patterns 
are defined directly in the dictionary. In other cases, 
428 
the semantics of the patterns can be derived by par- 
tially productive relations (subregularities) between 
the complement structures of the nouns and the 
verbs they are derived from. Thus we are exploiting 
our work on subregularities to aid in the interpreta- 
tion of dictionary entries. 
PLANS FOR THE COMING YEAR 
Over the coming year we plan to finish DIRC, and 
continue to develop the various methods of lexical 
and world knowledge acquisition we have discussed 
previously. In addition, we are planning to use some 
of the technology we have developed to produce more 
intelligent text retrieval methods. Such methods are 
both useful additions to our intelligent agents ap- 
proach to "help" systems, as well as ways of improv- 
ing the performance of conventional information re- 
trieval systems. 
