Knowledge Representation and Semantics in a Complex Domain: 
The UNIX Natural Language Help System GOETHE 
Gerhard Heyer, Ralf Kese, Frank Oenfig, Friedrich Dudda 
OLIVETTI D.O.R. - TA RESEARCH LAB 
TA TRIUMPH-ADLER AG 
Fiirtherstr. 21.2 
8500 Nuremberg (West Germany) 
1 Abstract 
Natural language help systems for complex 
domains requirc, in our view, an integration 
of semantic representation and knowledge 
base in order to adequately and efficiently 
deal with cognitively misconceived user in-. 
put. We present such an integration by way 
of the notiml of a frame-semae~tics that has 
been implemented for the purposes of a 
natural language help system for UNIX. 
2 Introduction 
It is commonly agreed ihat natural language 
systems fox" semantically rich domains re- 
quire a level of sema~tic representation in 
order to provide for a sufficiently deep 
processing of the natural language input. 
The level of semantic representation is 
sometimes called a representation of lingu- 
istic knowledge. In addition, a natural 
language help system also requires a 
lo~owledge base of the application domain in 
order to answer the requests for domain 
specific help. The level of knowledge 
representation is sometimes called a re- 
presentation of world knowledge. Most 
present day natural language processing 
sy,'stems, including, amongst others, SRI's 
Core I_,anguage Engine (Alshawi et.al. 1986), 
ESPRIT I project ACORD (Bez et.al. 1990), 
and the UNIX natural language help systems 
UNIX-Consultant (Wilensky et.al. 1988), 
SINIX-Consultant (Wahlster et.al. 1988), and 
AQUA (Ouilici et.al. 1986), keep the two 
levels of representation distinct. In ad- 
dition, there usually is no feed-back of 
information between the semantic represen- 
tation and the knowledge base. Thus, 
parsing is supposed to result in a complete 
semantic representation of the user input 
which then is passed on to tilt knowledge 
base manager for further processing. This 
kind of architecture follows a strategy, that 
can be called stepwise processing. We claim 
that for complex domains this kind of ap- 
proach is inadequate because it ig,mrcs the 
user's cognitive misconceptions about the 
particular application. Instead, we wish to 
argue that at least with respect to seman- 
tics and knowledge representation in natural 
language help systems an h~tegmted ap- 
proach should be preferred, in the approach 
we advocate, semantics and knowledge 
representation interact to correct (or 
complete) a possibly incorrect (or incom- 
plete) semantic representation. The mecha- 
nism by which this is achieved is based on 
the notion of a fi'ame-semandcs (cf. Heyer 
et.al. 1988, tlausser 1989). We demonstrate 
our "integrated approach with examples from 
GOETHE, a natural language help system 
for UNIX as a complex domain. 
GOETHE (cf. Kese/ Oemig 1.989) has been 
developed together with OLIVETTI AI Cen- 
ter, Ivrea, and Tecsicl AI Lab, Rome, for 
UNIX V on the OLIVETTI LSX 30xxCompu- 
ter Series. The present prototype includes a 
protocol for monitoring user's actions and 
has the natural language mode of inter- 
action fully integrated into a graphical 
DeskTop under InterViews (based oll a 
cooperation with Fraunhofer Society, 
Stuttgart), thus allowing also for deictic 
natural language and graphical interactions. 
It covers all of UNIX' file handling, 
containing a static knowledge base of more 
than 70 UNIX programs. It is written in 
Quintus-PROLOG and C, and takes by 
average less than 10 seconds for generating 
an answer to a user's request. 
361 
3 Requirements on Knowledge Re- 
presentation and Semantics for a 
UNiX natural language help system 
It is the task of the knowledge base in a 
UNIX natural language help system to serve 
as a baals for correctly and adequately 
answering a user's questions in one of the 
following situations: (1) the user needs to 
know a UNIX cormnand, or series of 
commands in order to carry out a certain 
i:ask, (2) hc has sent off a I JNIX command 
and the system has resulted in a different 
.qa\[e !ban he expected, or (3) he wants to 
gc.t irffcrmalion about a UNIX command. In 
g,.mcr~fl, !hi:; ',,,'ill require two knowledge 
~,:)urces: 5'laEc knowledge about UNIX as a 
co\]!c;ction of possible, man-machine inter- 
actions, and dynamic knowledge about the 
respc'cl.i-,e UNIX st:ate (in particular, i- 
node~ a~d the associated files with their 
permissions), the u.-,~:t ...... s actions, and the 
i'"' ...... " r~:actions (in particular, error ': ~ l.,~ ill ,~ 
messages), it is the task of the semantic 
representation to provide ,t .. ,,'~c knowledge 
base manager wi:\[t a corrc.ct and adequate 
semantic representation of \[he user's input 
Si\[ ll\[ff\[ Ion) (in -~ spccii!ic : ' ' There basically 
are two strat¢oies..:, . available at this point. 
()q ~.he (:re na~.:d, adhering to the idea that 
c~-occur~en:e ...,,~, ,..ns va,so for missing 
:. c u-,.:t,g q ,.:.cc2d to be treated as real 
res!ric{io~u,, p., ;:-ibly even as syntactic 
:c.<tric:ions (('homsky !965), we can insist 
that ;:' there is a semantic representation 
~>f an input sentence at all, it will be 
COITCCl zl~cl c'ot?,\])IU.e ~-:' O.,.~ respect to the 
domain e,f .q)p!ication). He.nee, the system 
will " ' tat! ~X) provide ~\[I1 ;&nswer to a user's 
\[tilde z,. he rCqtlCSt ~ t .... D~rases his question in a 
correct and comp!ctc way. On the other 
hand, co-occurrence restrictions may not be 
~ak6'n as; real restrictions but rather as 
scnzezmic d@z,/ts which may be over- 
written by additional knowledge base 
itfformatic.,n. Fhis allows for a much more 
use>friendly and cooperative natural 
language processing, but requires that the 
se:n}antic re'presentation is closely tied to 
)- 'd'.e know edge base. 
For the purpost:s el! the GOETHE system, 
we have opted for the second alternative, 
because the cognitive misconceptions a user 
may have about UNIX not only cause him 
to invoke the help system, but also cause 
him in most cases to phrase his questions 
~1 the way he does: If the system is 
presented with a semantically incorrect 
question, this is to be taken as an indi- 
cation, that the user needs help, and a 
reminder that he better rephrase his 
question in a correct way will not be of 
much use to him. Of course, it would have 
also been possible to relax tile syntactic 
co-occurrence restrictions. In effect, 
however, this would have resulted in a 
duplication of knowledge base information 
in the lexicon. The second alternative, 
therefore, not only appears to be the more 
adequate, but also the more efficient 
solution. 
4 Frame Semantics 
Output of the parser in GOETHE is a 
possibly incorrect, or incomplete, semantic 
representation where the meaning of tile 
individual- and predicate-constants of tile 
logical representation are represented as 
frame-theoretic icons (Hcyer et.al. :19881). 
We call this kind of semantic representation 
flame-semantics, or database-semantics 
(Hausser 1989). Taking the frame represen- 
tation of UN\[X (including attached proce- 
dares and, additionally, the protocolled 
history) as the context-model relative to 
which a user's input is interpreted, this 
flame-semantics allows for a simple and 
efficient processing of a semantic represen- 
tation for correction, completion, or the 
retrieval of the requested information via 
the knowledge base manager. As an illu- 
stration, consider the following examples: 
1) "How can I edit a text named 'test"?" 
\[qword(how), 
\[action(edit), 
line(file) ,attr(name,test)\]\]\] 
362 
2) "Why didn't you list dir 'tc.stdir' 
sorted by date!" 
\[qword(why-not), 
\[action(show), 
\[so(directory),attr (narne,testdir)\], 
\[mo(file),quant(all)\], 
attr(name,_), 
app(descending, attr(date,_))\]\] 
(Note that "list directory" = "show all 
files"; "so" = source object, "too" = main 
object, "attr" = attribute). 
Why: search for a fl'ame representing a 
program in the history and compare the 
used commands with the intended goal 
with respect to identities 
Why-not: search for a flame representing 
a program in the history and compare the 
used commands with the intended goal 
with respect to differences. 
Literature 
In these lists (which might equally be 
represented as trees), each argument points 
to a frame in the UNIX knowledge base. 
Semantic processing then basically consists 
of successively unifying each of these 
frames (where the predicates are slots in 
the frame referred to by the respective 
mother-node). 
In case the unification of a set of frames 
fails, GOETttE tries a number of heuristics 
to actfieve unification, including: Identifi- 
catior,. (identifying the referents of proper 
names as denoting a file, a directory, a 
UNIX command, an owner, or a group), 
generalisation (taking .:he frame-generali- 
sation of the (first) action node as a 
candidate for unification), and precondition- 
checlc (checking whether existence of a file, 
ownership, and read-, write-, execution- 
rights are fulfilled a:; required). 
Once a set of frames is consistent, 
retri~:ving the answer to a request is stirred 
by the frames for How, Why, and Why-not, 
always appearing on the top-level node of 
the '~emantic representation. These frames 
can be understood as representing strategies 
for searching the knowledge base as 
follows: 
ltow: Search for a frame with a goal 
component containing a special 
command entry 

References

Alshawi et.al. 1986, "Feasibility Study 
for a Research Prograrmne in Natural- 
language Procesosing", SRI International, 
Cambridge, \]986. 

Bez et.at. 1990, "Construction and 
Interrogation of Knowledge-Bases using 
Natural Language Text and Graphics: 
ESPRIT Project 393, ACORD. Final Report", 
Springer Verlag, 1990 (in print). 

Chomsky 1965, "Aspects of the Theory 
of Syntax", M.I.T. Press, 1965. 

Hausser 1989, "Computation of Lan- 
guage", Springer Verlag, 1989. 

Heyer et.al. 1988, "Specification of the 
KB-Manager: A Frame-Extension of DRSs 
for Supporting Conceptual Reasoning", 
ACORD Project, Deliverable T5.4, 1988. 

Ralf Kese, Frank Oenfig 1989, "GOETHE: 
Ein kontextsensitives Beratungssystem fiJr 
UNIX", LDV-Forum, Vol.6, 1989. 

A.E.Quilici, M.G. Dyer, M.Flowers, 1986, 
"AQUA: An intelligent UNIX Advisor", 
Proceedings of the 7th European Conference 
on Artificial Intelligence (ECAI), 1986, 
Vol.II. 

W.Wah!ster, M.Hecking, C.Kemke, 1988, 
"SC: Ein intelligentes Hilfesystem far 
SINIX", in: Gollan, Paul, Schmitt (eds.):'In- 
novative Informationsinfrastrukturen", 
Informatik-Fachberichte Nr.184, 1988. 

R.Wilensky, D.Chin, M.Luria, J.Martin, 
J.Mayfield, D.Wu: "The Berkeley UNIX 
Consultant Project", Computational Lingui- 
stics, Vo1.14, No.4, 1988. 
