NKRL, a Knowledge Representation Language for Narrative 
Natural Language Processing 
Gian Piero Zarri 
Centre National de la Recherche Scientifique 
CNRS - CAMS 
54, boulevard Raspail 
75270 PARIS Cedex 06, France 
zarri@cams.msh-paris.fr
Abstract 
NKRL is a conceptual language which intends to 
provide a normalised, pragmatic description of the 
semantic contents (in short, the "meaning") of NL 
narrative documents. We introduce firstly the 
general architecture of NKRL, and we give some 
examples of its characteristic features. We supply, 
afterward, some sketchy information about the 
inference techniques and the NLP procedures 
associated with this language. 
1. Introduction 
NKRL (Narrative Knowledge Representation 
Language) aims to propose some possible, pragmatic 
solutions for the set up of a standardised description of 
the semantic contents (in short, the "meaning") of 
natural language (NL) narrative documents. With the
term "narrative documents" we denote here NL texts of
an industrial and economic interest corresponding, e.g.,
to news stories, corporate documents, normative texts, 
intelligence messages, etc. 
The NKRL code can be used according to two 
main modalities. It can be employed as a standard 
vehicle for the interchange of content information 
about narrative documents. It can also be utilised to
support a wide range of industrial applications, like 
populating large knowledge bases which can support, 
thereafter, all sorts of "intelligent" applications
(advanced expert systems, case-based reasoning, 
intelligent information retrieval, etc.). NKRL is a 
fully implemented language ; the most recent versions
have been realised in the framework of two European
projects : NOMOS, Esprit P5330, and COBALT, 
LRE P61011. 
2. The architecture of NKRL 
NKRL is a two layer language. 
The lower layer consists of a set of general tools 
which are structured into several integrated 
components, four in our case. 
The descriptive component concerns the tools used 
to produce the formal representations (called predicative 
templates) of general classes of narrative events, like 
"moving a generic object", "formulate a need", "be 
present somewhere". Predicative templates are 
characterised by a threefold format, where the central 
piece is a semantic predicate (a primitive, like 
BEHAVE, EXPERIENCE, MOVE, PRODUCE etc.) 
whose arguments (role fillers) are introduced by roles 
as SUBJ(ect), OBJ(ect), SOURCE, DEST(ination), 
etc. ; the data structures proper to the descriptive 
component are then similar to the case-grammar 
structures. Templates are structured into a hierarchy, 
H_TEMP(lates), corresponding, therefore, to a
"taxonomy of events". 
Templates' instances (predicative occurrences), 
i.e., the NKRL representation of single, specific 
events like "Tomorrow, I will move the wardrobe", 
"Lucy was looking for a taxi", "Peter lives in Paris", 
are in the domain of the factual component.
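As an illustration only, the threefold format (predicate, roles, arguments) and the distinction between templates and occurrences can be sketched in Python. The class names, the `instantiate` helper, the role constraints and the label `c_demo` are assumptions made for this sketch ; they are not part of NKRL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Template:
    """A predicative template: one predicate plus its admissible roles."""
    name: str
    predicate: str                       # e.g. "MOVE", "EXPERIENCE"
    roles: dict                          # role name -> constraint on the filler
    parent: Optional["Template"] = None  # more general template in H_TEMP

    def ancestors(self):
        """Names of the more general templates, nearest first."""
        chain, node = [], self.parent
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return chain


@dataclass
class Occurrence:
    """A predicative occurrence: a template instantiated for one event."""
    label: str
    template: Template
    fillers: dict                        # role name -> concrete argument


def instantiate(template, label, **fillers):
    """Create an occurrence, rejecting roles the template does not define."""
    unknown = set(fillers) - set(template.roles)
    if unknown:
        raise ValueError(f"roles not in template: {sorted(unknown)}")
    return Occurrence(label, template, dict(fillers))


# Hypothetical fragment of H_TEMP and one occurrence built from it.
move = Template("move", "MOVE", {"SUBJ": "human_being", "OBJ": "physical_entity"})
move_object = Template("move_generic_object", "MOVE", dict(move.roles), parent=move)
occ = instantiate(move_object, "c_demo", SUBJ="lucy_", OBJ="wardrobe_1")
print(occ.template.predicate, occ.fillers, move_object.ancestors())
```

Keeping the parent link on the template, rather than on the occurrence, mirrors the idea that the "taxonomy of events" is a property of H_TEMP and not of individual events.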
The definitional component supplies the NKRL 
representations, called concepts, of all the general 
notions, like physical_entity, human_being, taxi_, 
city_, etc., which can play the role of arguments 
within the data structures of the two components
above. The concepts correspond to sets or collections,
organised according to a generalisation/specialisation
(tangled) hierarchy which, for historical reasons, is
called H_CLASS(es). The data structures used for the 
concepts are, substantially, frame-like structures ; 
H_CLASS corresponds relatively well, therefore, to 
the usual ontologies of terms. 
The enumerative component of NKRL concerns 
the formal representation of the instances (concrete, 
countable examples, see lucy_, wardrobe_1, taxi_53)
of the concepts of H_CLASS ; their formal
representations take the name of individuals. 
Throughout this paper, we will use the italic type 
style to represent a "concept", the roman style to 
represent an "individual_". 
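A minimal sketch of how a tangled hierarchy of concepts and its individuals might be stored and queried is given below. The placement of the concepts in the hierarchy and the concept means_of_transport_ are assumptions introduced only to show a node with two parents ; the actual contents of H_CLASS are not reproduced here.

```python
# Concept -> list of direct parents. A "tangled" hierarchy allows several.
PARENTS = {
    "physical_entity": [],
    "means_of_transport_": [],                             # hypothetical concept
    "human_being": ["physical_entity"],                    # assumed placement
    "taxi_": ["physical_entity", "means_of_transport_"],   # assumed placement
}

# Individual -> the concept it is an instance of (enumerative component).
INSTANCE_OF = {"lucy_": "human_being", "taxi_53": "taxi_"}


def subsumes(general, specific):
    """True if `general` is `specific` or one of its ancestors."""
    if general == specific:
        return True
    return any(subsumes(general, parent) for parent in PARENTS.get(specific, []))


def is_instance(individual, concept):
    """True if the individual belongs to the concept or to one of its specialisations."""
    return subsumes(concept, INSTANCE_OF[individual])


print(is_instance("taxi_53", "means_of_transport_"))
print(is_instance("lucy_", "taxi_"))
```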
The upper layer of NKRL consists of two parts. 
The first is a "catalogue", giving a complete 
description of the formal characteristics and the 
modalities of use of the well-formed, "basic templates" 
(like "moving a generic object" mentioned above) 
associated with the language -- presently, about 150, 
pertaining mainly to a (very general) socio-economico-political
context where the main characters are human
beings or social bodies. By means of proper
specialisation operations it is then possible to obtain, 
from the basic templates, the (specific) "derived" 
templates that could be concretely needed to implement 
a particular, practical application -- e.g., "move an 
industrial process" -- and the corresponding 
occurrences. In NKRL, the set of legal, basic 
templates can be considered, at least in a first 
approach, as fixed.
Analogously, the general concepts which pertain 
to the upper levels of H_CLASS -- such as 
human_being, physical_entity, modality_, etc. --
form a sort of upper-level, invariable ontology.
3. Some characteristic NKRL features 
Fig. 1 supplies a simple example of NKRL code. It
translates a small fragment of COBALT news :
"Milan, October 15, 1993. The financial daily Il Sole
24 Ore reported Mediobanca had called a special board
meeting concerning plans for capital increase".
c1) MOVE     SUBJ    (SPECIF sole_24_ore financial_daily): (milan_)
             OBJ     #c2
             date-1: 15_october_93
             date-2:

c2) PRODUCE  SUBJ    mediobanca_
             OBJ     (SPECIF summoning_1
                       (SPECIF board_meeting_1 mediobanca_ special_))
             TOPIC   (SPECIF plan_1 (SPECIF cardinality_ several_)
                       capital_increase_1)
             date-1: circa_15_october_93
             date-2:

Figure 1. An NKRL coding.
In Fig. 1, c1 and c2 are symbolic labels of
occurrences ; MOVE and PRODUCE are predicates ;
SUBJ, OBJ, TOPIC ("à propos of...") are roles.
With respect now to the arguments, sole_24_ore,
milan_, mediobanca_ (an Italian merchant bank),
summoning_1, etc. are individuals ; financial_daily,
special_, cardinality_ and several_ (this last belonging,
like some_, all_ etc., to the logical_quantifier
intensional sub-tree of H_CLASS) are concepts. The
attributive operator, SPECIF(ication), with syntax
(SPECIF e1 p1 ... pn), is used to represent some of
the properties which can be asserted about the first
element e1, concept or individual, of a SPECIF list ;
several_ is used within a SPECIF list having
cardinality_ as first element as a standard way of
representing the plural number mark, see c2.
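The SPECIF syntax lends itself to a nested-list reading. The following sketch, which assumes a plain parenthesised text form and uses hypothetical helper names (`parse`, `describe`, `is_plural`), reads a SPECIF list, separates its first element from its properties, and detects the plural mark described above.

```python
def parse(text):
    """Parse a parenthesised expression into nested Python lists of strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1

    tree, end = read(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after the expression")
    return tree


def describe(spec):
    """Return (first element, list of properties) of a SPECIF list."""
    operator, head, *properties = spec
    if operator != "SPECIF":
        raise ValueError("not a SPECIF list")
    return head, properties


def is_plural(spec):
    """True if the properties include the (SPECIF cardinality_ several_) mark."""
    _, properties = describe(spec)
    return ["SPECIF", "cardinality_", "several_"] in properties


topic = parse("(SPECIF plan_1 (SPECIF cardinality_ several_) capital_increase_1)")
print(describe(topic)[0], is_plural(topic))
```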
The arguments, and the templates/occurrences as a
whole, may be characterised by the presence of
particular codes, the determiners. For example, the
location determiners, represented as lists, are associated
with the arguments (role fillers) by using the colon,
":", operator, see c1. For the determiners date-1 and
date-2, see (Zarri, 1992a).
A MOVE construction like that of occurrence c1
(completive construction) is necessarily used to
translate any event concerning the transmission of
information ("... Il Sole 24 Ore reported ...").
Accordingly, the filler of the OBJ(ect) slot in the
occurrences (here, c1) which instantiate the MOVE
transmission template is always a symbolic label (c2)
which refers to another predicative occurrence, i.e., that
bearing the informational content to be spread out ("...
Mediobanca had called a meeting ..."). We can note
that the enunciative situation can be either explicit or
implicit. For example, the completive construction
can be used to deal with a problem originally raised by
Nazarenko (1992) in a conceptual graphs context,
namely, that of the correct rendering of causal
situations where the general framework of the
antecedent consists of an (implicit) speech situation.
Let us examine briefly one of Nazarenko's
examples (1992 : 881) : "Peter has a fever since he is
flushed". As Nazarenko remarks, "being flushed" is not
the "cause" of "having a fever", but that of an implicit
enunciative situation where we claim (affirm, assert
etc.) that someone has a fever. Using the completive
construction, this example is easily translated in
NKRL using the four occurrences of Fig. 2.
c3) MOVE        SUBJ  human_being_or_social_body
                OBJ   #c4

c4) EXPERIENCE  SUBJ  peter_
                OBJ   fevered_state_1

c5) EXPERIENCE  SUBJ  peter_
                OBJ   flushing_state_1
                [obs]

c6) (CAUSE c3 c5)

Figure 2. An implicit enunciative situation.
We can remark that, in Fig. 2, c6 is a binding
occurrence. Binding structures -- i.e., lists where the
elements are conceptual labels, c3 and c5 in Fig. 2 --
are second-order structures used to represent the
logico-semantic links which can exist between predicative
templates or occurrences. The binding occurrence c6 --
meaning that c3, the main event, has been caused by
c5 -- is labelled using one (CAUSE) of the four
operators which define together the taxonomy of
causality of NKRL, see (Zarri, 1992b). The presence
in c5 of a specific determiner -- a temporal modulator,
"obs(erve)", see again (Zarri, 1992a) -- leads to an
interpretation of this occurrence as the description of a
situation that, at that very moment, is observed to exist.
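The role of the symbolic labels in the completive construction and in the binding occurrence can be made concrete with a small sketch. The dictionary layout and the helper names `reported_content` and `causes_of` are assumptions of this example ; only the labels, predicates and fillers come from Fig. 2.

```python
# Predicative occurrences of Fig. 2, stored by label.
occurrences = {
    "c3": {"predicate": "MOVE", "SUBJ": "human_being_or_social_body", "OBJ": "#c4"},
    "c4": {"predicate": "EXPERIENCE", "SUBJ": "peter_", "OBJ": "fevered_state_1"},
    "c5": {"predicate": "EXPERIENCE", "SUBJ": "peter_", "OBJ": "flushing_state_1",
           "modulators": ["obs"]},
}

# Binding occurrences are second-order: they only mention labels.
bindings = {"c6": ("CAUSE", "c3", "c5")}


def reported_content(label):
    """Follow the '#label' reference in the OBJ slot of a completive MOVE."""
    occurrence = occurrences[label]
    filler = occurrence.get("OBJ", "")
    if occurrence["predicate"] == "MOVE" and filler.startswith("#"):
        return occurrences[filler[1:]]
    return None


def causes_of(label):
    """Labels of the occurrences bound to `label` as its cause."""
    return [cause for operator, effect, cause in bindings.values()
            if operator == "CAUSE" and effect == label]


print(reported_content("c3"))
print(causes_of("c3"), causes_of("c4"))
```

The second call shows the point of the example : the flushing (c5) is recorded as the cause of the act of asserting (c3), not of the fever itself (c4).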
We give now, in Fig. 3, a (slightly simplified)
NKRL representation of the narrative sentence : "We
have to make orange juice" which, according to
Hwang and Schubert (1993 : 1298), exemplifies
several interesting semantic phenomena.
c7) BEHAVE    SUBJ    (COORD informant_1
                        (SPECIF human_being
                          (SPECIF cardinality_ several_)))
              [oblig, ment]
              date-1: observed date
              date-2:

c8) *PRODUCE  SUBJ    (COORD informant_1
                        (SPECIF human_being
                          (SPECIF cardinality_ several_)))
              OBJ     (SPECIF orange_juice
                        (SPECIF amount_ 0))
              date-1: observed date + i
              date-2:

c9) (GOAL c7 c8)

Figure 3. Wishes and intentions.
Fig. 3 illustrates the standard NKRL way of 
representing the "wishes, desires, intention" domain. 
To translate the idea of "acting in order to obtain a 
given result", we use : 
i) An occurrence (here c7), instance of a basic
template pertaining to the BEHAVE branch of the
H_TEMP hierarchy, and corresponding to the
general meaning of focusing on a result. This
occurrence is used to express the "acting"
component -- i.e., it identifies the SUBJ(ect) of
the action, the temporal co-ordinates, etc.
ii) A second predicative occurrence, here c8, an
instance of a template structured around a different
predicate (e.g., PRODUCE in Fig. 3) and which is
used to express the "intended result" component.
iii) A binding occurrence, c9, which links together the
previous predicative occurrences and which is
labelled by means of GOAL, another operator
included in the taxonomy of causality of NKRL.
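The three ingredients listed above can be checked mechanically. The sketch below encodes a simplified version of Fig. 3 (the COORD and SPECIF structures of the SUBJ filler are reduced to a single individual) and tests whether a binding occurrence follows the pattern ; the function name `is_intention_construct` and the dictionary layout are assumptions of this example.

```python
# Simplified rendering of Fig. 3: structured arguments are flattened.
occurrences = {
    "c7": {"predicate": "BEHAVE", "SUBJ": "informant_1",
           "modulators": ["oblig", "ment"]},
    "c8": {"predicate": "PRODUCE", "SUBJ": "informant_1", "OBJ": "orange_juice"},
}
c9 = ("GOAL", "c7", "c8")


def is_intention_construct(binding, occurrences):
    """Check the pattern: GOAL binding, BEHAVE first, another predicate second."""
    if len(binding) != 3 or binding[0] != "GOAL":
        return False
    acting = occurrences.get(binding[1])
    result = occurrences.get(binding[2])
    if acting is None or result is None:
        return False
    return acting["predicate"] == "BEHAVE" and result["predicate"] != "BEHAVE"


print(is_intention_construct(c9, occurrences))
```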
Please note that "oblig" and "ment" in Fig. 3 are,
like "obs" in Fig. 2, "modulators", see (Zarri, 1992b),
i.e., particular determiners used to refine or modify the
primary interpretation of a template or occurrence as
given by the basic "predicate -- roles -- argument"
association. "ment(al)" pertains to the modality
modulators. "oblig(atory)" suggests that "someone is
obliged to do or to endure something, e.g., by
authority", and pertains to the deontic modulators
series. Other modulators are the temporal modulators,
"begin", "end", "obs(erve)", see also Fig. 2.
Modulators work as global operators which take as
their argument the whole (predicative) template or
occurrence. When a list of modulators is present, as in
the occurrence c7 of Fig. 3, they apply successively to
the template/occurrence in a Polish notation style to
avoid any possibility of scope ambiguity. In the
standard constructions for expressing wishes, desires
and intentions, the absence of the "ment(al)" modulator
in the BEHAVE occurrence means that the SUBJ(ect)
of BEHAVE takes some concrete initiative (acts
explicitly) in order to fulfil the result ; if "ment" is
present, as in Fig. 3, no concrete action is undertaken,
and the "result" reflects only the wishes and desires of
the SUBJ(ect).
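One possible reading of the "Polish notation style" is that a modulator list such as [oblig, ment] denotes the nested application oblig(ment(occurrence)), the leftmost modulator having the widest scope. The sketch below implements that reading, which is an assumption of this example, together with the interpretation rule for "ment" stated above ; both function names are hypothetical.

```python
def apply_modulators(modulators, occurrence_label):
    """Wrap an occurrence in its modulators, prefix style, leftmost outermost."""
    wrapped = occurrence_label
    for modulator in reversed(modulators):
        wrapped = (modulator, wrapped)
    return wrapped


def reading(behave_modulators):
    """Interpret a BEHAVE occurrence according to the presence of 'ment'."""
    if "ment" in behave_modulators:
        return "wish only, no concrete action"
    return "concrete action towards the result"


print(apply_modulators(["oblig", "ment"], "c7"))
print(reading(["oblig", "ment"]))
print(reading(["oblig"]))
```

Because each modulator takes the whole inner structure as its single argument, no bracketing choice is left open, which is how scope ambiguity is avoided in this reading.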
4. Inferences and NL processing 
Each of the four components of NKRL is characterised
by the association with a class of basic inference
procedures. For example, the key inference mechanism
for the factual component is the Filtering and
Unification Module (FUM). The primary data
structures handled by FUM are the "search patterns"
that represent the general properties of an information
to be searched for, by filtering or unification, within a
knowledge base of occurrences. The most interesting
component of the FUM module is represented by the
matching algorithm which unifies the complex
structures -- like "(SPECIF summoning_1 (SPECIF
board_meeting_1 mediobanca_ special_))" in
occurrence c2 of Fig. 1 -- that, in the NKRL
terminology, are called "structured arguments".
Structured arguments are built up in a principled way
by making use of a specialised sub-language which
includes four expansion operators, the "disjunctive
operator", the "distributive operator", the "collective
operator", and the "attributive operator"
(SPECIFication), see (Zarri and Gilardoni, 1996) for
more details.
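To give a feeling for what matching a search pattern against a structured argument involves, here is a deliberately simplified sketch. It is not the FUM algorithm, whose details are not given in this paper ; the matching policy (same operator, subsuming head, every pattern property matched by some argument property), the `ISA` links and the function names are all assumptions of this example.

```python
# Hypothetical generalisation links (individual or concept -> more general concept).
ISA = {
    "summoning_1": "summoning_",
    "board_meeting_1": "board_meeting_",
    "board_meeting_": "assembly_",
}


def subsumes(general, specific):
    """True if `general` is reached from `specific` by following ISA links."""
    while specific is not None:
        if specific == general:
            return True
        specific = ISA.get(specific)
    return False


def matches(pattern, argument):
    """Match a search pattern against a (possibly structured) argument."""
    if isinstance(pattern, str):
        # An atomic pattern is compared with the first element of the argument.
        head = argument[1] if isinstance(argument, list) else argument
        return subsumes(pattern, head)
    if not isinstance(argument, list) or pattern[0] != argument[0]:
        return False
    if not matches(pattern[1], argument[1]):
        return False
    # Every property required by the pattern must be found in the argument.
    return all(any(matches(p, a) for a in argument[2:]) for p in pattern[2:])


argument = ["SPECIF", "summoning_1",
            ["SPECIF", "board_meeting_1", "mediobanca_", "special_"]]
pattern = ["SPECIF", "summoning_", ["SPECIF", "assembly_", "special_"]]
print(matches(pattern, argument))
```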
The basic inference mechanisms can then be used
as building blocks for implementing all sorts of high
level inference procedures. An example is given by
the "transformation rules", see (Ogonowski, 1987).
NKRL's transformations deal with the problem of
obtaining a plausible answer from a database of factual
occurrences also in the absence of the explicitly
requested information, by searching semantic affinities
between what is requested and what is really present in
the base. The fundamental principle employed is then
to "transform" the original query into one or more
different queries which -- unlike "transformed" queries
in a database context -- are not strictly "equivalent"
but only "semantically close" to the original one.
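The principle can be sketched as a fallback loop : the original query is tried first and, if it fails, semantically close variants produced by transformation rules are tried in turn. The rule shown (`execution_to_plan`, which turns a question about carrying something out into a question about something being planned) is purely hypothetical, as are the flat dictionary encoding and the function names.

```python
def search(query, base):
    """Occurrences whose slots include every slot/value pair of the query."""
    return [occ for occ in base if query.items() <= occ.items()]


def answer(query, base, transformations):
    """Try the query as is, then each semantically close variant in turn."""
    hits = search(query, base)
    if hits:
        return query, hits
    for rule in transformations:
        variant = rule(query)
        if variant is None:
            continue
        hits = search(variant, base)
        if hits:
            return variant, hits
    return None, []


def execution_to_plan(query):
    """Hypothetical rule: look for X as the TOPIC of an event instead of its OBJ."""
    if "OBJ" not in query:
        return None
    variant = dict(query)
    variant["TOPIC"] = variant.pop("OBJ")
    return variant


base = [{"label": "c2", "predicate": "PRODUCE", "SUBJ": "mediobanca_",
         "TOPIC": "capital_increase_"}]
query = {"predicate": "PRODUCE", "SUBJ": "mediobanca_", "OBJ": "capital_increase_"}
used, hits = answer(query, base, [execution_to_plan])
print(used, [occ["label"] for occ in hits])
```

Returning the query that actually succeeded, alongside the hits, lets a caller tell the user that the answer is only "semantically close" to what was asked.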
With respect now to the NL/NKRL translation
procedures, they are based on the well-known principle
of locating, within the original texts, the syntactic and
semantic indexes which can evoke the conceptual
structures used to represent these texts. Our
contribution has consisted in the set up of a rigorous
algorithmic procedure, centred around the two
following conceptual tools :
• The use of rules -- evoked by particular lexical
items in the text examined and stored in proper
conceptual dictionaries -- which take the form of
generalised production rules. The left hand side
(antecedent part) is always a syntactic condition,
expressed as a tree-like structure, which must be
unified with the results of the general parse tree
produced by the syntactic specialist of the
translation system. If the unification succeeds, the
right hand sides (consequent parts) are used, e.g., to
generate well-formed templates ("triggering rules").
• The use, within the rules, of clever mechanisms to
deal with the variables. For example, in the
specific, "triggering" family of NKRL rules, the
antecedent variables (a-variables) are first declared
in the syntactic (antecedent) part of the rules, and
then "echoed" in the consequent parts, where they
appear under the form of arguments and constraints
associated with the roles of the activated templates.
Their function is that of "capturing" -- during the
match between the antecedents and the results of
the syntactic specialist -- NL or H_CLASS terms
to be then used as specialisation terms for filling
up the activated templates and building the final
NKRL structures.
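The "capturing" role of the a-variables can be illustrated with a small tree unifier. This is a sketch under stated assumptions : trees are nested Python lists, a-variables are the strings x1, x2, ..., and the condition and parse tree shown are toy structures, far simpler than the output of a real syntactic specialist.

```python
import re


def is_variable(term):
    """a-variables are written x1, x2, x31, ... in this sketch."""
    return isinstance(term, str) and re.fullmatch(r"x\d+", term) is not None


def unify(condition, tree, bindings=None):
    """Match a syntactic condition against a parse tree, capturing variables.

    Returns the dictionary of captured values, or None if the match fails.
    """
    bindings = {} if bindings is None else bindings
    if is_variable(condition):
        if condition in bindings and bindings[condition] != tree:
            return None
        bindings[condition] = tree
        return bindings
    if isinstance(condition, str):
        return bindings if condition == tree else None
    if not isinstance(tree, list) or len(condition) != len(tree):
        return None
    for sub_condition, sub_tree in zip(condition, tree):
        if unify(sub_condition, sub_tree, bindings) is None:
            return None
    return bindings


condition = ["s", ["subj", ["np", ["noun", "x1"]]], ["vcl", ["verb", "call"]]]
parse_tree = ["s", ["subj", ["np", ["noun", "mediobanca"]]], ["vcl", ["verb", "call"]]]
print(unify(condition, parse_tree))
```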
A detailed description of these tools can be found,
e.g., in (Zarri, 1995) ; see also Azzam (1995). Their
generality and their precise formal semantics make
possible, e.g., the quick production of useful sets of
new rules by simply duplicating and editing the
existing ones.
We reproduce now, in Fig. 5, one of the several
triggering rules to which the lexical entry "call" --
pertaining to the NL fragment examined at the
beginning of Section 3 -- contains a pointer, i.e.,
one of the rules corresponding to the meaning "to
issue a call to convene". This rule allows the
activation of a basic template (PRODUCE4.12) giving
rise, at a later stage, to the occurrence c2 of Fig. 1 ;
the x symbols in Fig. 5 correspond to a-variables.
We can remark that all the details of the full
template are not actually stored in the consequent,
given that the H_TEMP hierarchy is part of the
"common shared data structures" used by the translator.
Only the parameters relating to the specific triggering
rule are, therefore, really stored. For example, in Fig.
5, the list "constr" specialises the constraints on some
of the variables, while others -- e.g., the constraints
on the variables x1 (human_being/social_body) and x4
(planning_activity) -- are unchanged with respect to
the constraints permanently associated with the
variables of template PRODUCE4.12.
trigger: "call"

syntactic condition:
(s (subj (np (noun x1)))
   (vcl (voice active) (t = x2 = call))
   (dir-obj
     (np (modifiers (adjs x31)) (noun x3)
         (modifiers (pp (prep about | concerning | ... )
                        (np (noun x4)
                            (modifiers (pp (prep of | for ...)
                                           (np (noun x5))))))))))

parameters for the template :
(PRODUCE4.12 (roles subj x1 obj (SPECIF x2
                (SPECIF x3 x31)) +topic (SPECIF x4 x5))
             (constr x3 assembly_ x31 quality_ x5
                modification_procedures))

Figure 5. An example of triggering rule.
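Once the a-variables have been captured, the consequent of a triggering rule can be used to fill the roles of the activated template and to check the "constr" list. The sketch below follows the roles and constraints of Fig. 5, but the bindings, the `ISA` links used to verify the constraints, and the function names are hypothetical ; in particular, the mapping from captured NL words to NKRL individuals such as summoning_1 is assumed to have already taken place.

```python
# Hypothetical links used only to check the constraints of the rule.
ISA = {
    "board_meeting_1": "assembly_",
    "special_": "quality_",
    "capital_increase_1": "modification_procedures",
}


def satisfies(value, concept):
    """True if `value` is the concept or is linked to it through ISA."""
    while value is not None:
        if value == concept:
            return True
        value = ISA.get(value)
    return False


def substitute(structure, bindings):
    """Replace every variable in a (nested) role filler by its captured value."""
    if isinstance(structure, list):
        return [substitute(item, bindings) for item in structure]
    return bindings.get(structure, structure)


def fill_template(roles, constraints, bindings):
    """Check the constraints, then instantiate the roles of the template."""
    for variable, concept in constraints.items():
        if not satisfies(bindings[variable], concept):
            raise ValueError(f"{variable}={bindings[variable]} is not a {concept}")
    return {role: substitute(filler, bindings) for role, filler in roles.items()}


roles = {
    "SUBJ": "x1",
    "OBJ": ["SPECIF", "x2", ["SPECIF", "x3", "x31"]],
    "TOPIC": ["SPECIF", "x4", "x5"],
}
constraints = {"x3": "assembly_", "x31": "quality_", "x5": "modification_procedures"}
bindings = {"x1": "mediobanca_", "x2": "summoning_1", "x3": "board_meeting_1",
            "x31": "special_", "x4": "plan_1", "x5": "capital_increase_1"}
filled = fill_template(roles, constraints, bindings)
print(filled)
```

The result is only a first approximation of occurrence c2 : elements such as the plural mark on plan_1 would have to be added by later stages of the translation.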
The "standard" prototype of an NL/NKRL
translation system -- e.g., the COMMON LISP
translator realised in the NOMOS project -- is a
relatively fast system which takes 3 min 16s on a Sun
SparcStation 1 with 16Mb to process a medium-size
text of 4 sentences and 150 word forms ; it takes 1 min
06s for the longest sentence. This pure conceptual
parser, however, is not suitable, per se, for dealing
directly with huge quantities of unrestricted data. In the
COBALT project, we have then used a commercial
product, TCS (Text Categorisation System, by
Carnegie Group) to pre-select from a corpus of Reuters
news stories those concerning in principle the chosen
domain (financial news about mergers, acquisitions,
capital increases etc.). The candidate news items (about
200) have then been translated into NKRL format, and
examined through a query system in order to i) confirm
their relevance ; ii) extract their main content elements
(actors, circumstances, locations, dates, amounts of
shares or money, etc.). Of the candidate news stories,
80% have been (at least partly) successfully translated ;
"at least partly" means that, sometimes, the translation
was incomplete due, e.g., to the difficulty of
instantiating correctly some binding structures. Other
quantitative information about the COBALT results
can be found in (Azzam, 1995 ; Zarri, 1995).
5. Conclusion 
Possible, general advantages of NKRL with
respect to other formalisms that also claim to be able
to represent extensive chunks of semantics, see, e.g.,
(Lehmann, 1992), are at least the following :
• The addition of a "taxonomy of events" to the
traditional "taxonomy of concepts" : often,
"normal" ontologies elude in fact the problem of
representing how the concepts interact with each
other in the context of real-life events. Recently,
Park (Park, 1995) has presented a language which
provides a set of ontological primitives to be used
to model the dynamic aspects ("events") of a
domain. However, Park's system seems to be a
very "young" one, and it lacks tools for
describing essential narrative features like the
relationships between events, the temporal
information, etc.
• The presence of a catalogue of standard, basic
templates, which can be considered as part and
parcel of the definition of the language. This
implies that : i) a system-builder does not have to
create himself the structural knowledge needed to
describe the events proper to a (sufficiently) large
class of narrative documents ; ii) it becomes easier
to secure the reproduction and the sharing of
previous results.
References 
Azzam, S. (1995). "Anaphors, PPs and Disambiguation
Process for Conceptual Analysis". In Proceedings of
the 14th International Joint Conference on Artificial
Intelligence. Morgan Kaufmann, San Mateo (CA).
Hwang, C.H., and Schubert, L.K. (1993). "Meeting the
Interlocking Needs of LF-Computation, Deindexing
and Inference: An Organic Approach to General NLU".
In Proceedings of the 13th International Joint
Conference on Artificial Intelligence. Morgan
Kaufmann, San Mateo (CA).
Nazarenko-Perrin, A. (1992). "Causal Ambiguity in
Natural Language: Conceptual Representation of 'parce
que/because' and 'puisque/since'". In Proceedings of
the 15th International Conference on Computational
Linguistics (COLING 92), Nantes, France.
Lehmann, F., editor (1992). Semantic Networks in
Artificial Intelligence. Pergamon Press, Oxford.
Ogonowski, A. (1987). "MENTAT : An Intelligent and
Cooperative Natural Language DB Interface". In
Proceedings of the 7th Avignon International
Conference on Expert Systems and Their Applications
(Avignon '87), vol. 2. EC2 & Cie., Paris.
Park, B.J. (1995). "A Language for Ontologies Based on
Objects and Events". In Proceedings of the IJCAI'95
Workshop on Basic Ontological Issues in Knowledge
Sharing. Department of Computer Science of the
University of Ottawa.
Zarri, G.P. (1992a). "Encoding the Temporal
Characteristics of the Natural Language Descriptions
of (Legal) Situations". In A. Martino, editor, Expert
Systems in Law. Elsevier Science, Amsterdam.
Zarri, G.P. (1992b). "The 'Descriptive' Component of a
Hybrid Knowledge Representation Language". In F.
Lehmann, editor, Semantic Networks in Artificial
Intelligence. Pergamon Press, Oxford.
Zarri, G.P. (1995). "Knowledge Acquisition from
Complex Narrative Texts Using the NKRL
Technology". In B.R. Gaines and M. Musen, editors,
Proceedings of the 9th Banff Knowledge Acquisition
for Knowledge-Based Systems Workshop, vol. 1.
Department of Computer Science of the University of
Calgary.
Zarri, G.P., and Gilardoni, L. (1996). "Structuring and
Retrieval of the Complex Predicate Arguments Proper
to the NKRL Conceptual Language". In Proceedings of
the Ninth International Symposium on Methodologies
for Intelligent Systems (ISMIS'96). Springer-Verlag,
Berlin.
