A GRAMMAR AND A LEXICON 
FOR A TEXT-PRODUCTION SYSTEM 
Christian M.I.M. Matthiessen 
USC/Information Sciences Institute 
ABSTRACT 
In a text-production system high and special demands are placed on the
grammar and the lexicon. This paper will view these components in
such a system (overview in section 1). First, the subcomponents dealing
with semantic information and with syntactic information will be
presented separately (section 2). The problems of relating these two
types of information are then identified (section 3). Finally, strategies
designed to meet the problems are proposed and discussed (section 4).
One of the issues that will be illustrated is what happens when a
systemic linguistic approach is combined with a KL-ONE like knowledge
representation -- a novel and hitherto unexplored combination. 1
1. THE PLACE OF A GRAMMAR AND A 
LEXICON IN PENMAN 
This paper will view a grammar and a lexicon as integral parts of a text
production system (PENMAN). This perspective leads to certain
requirements on the form of the grammar and that of the subparts of the
lexicon and on the strategies for integrating these components with
each other and with other parts of the system. In the course of the
presentation of the components, the subcomponents and the
integrating strategies, these requirements will be addressed. Here I will
give a brief overview of the system.
PENMAN is a successor to KDS (\[12\], \[14\] and \[13\]) and is being
created to produce multi-sentential natural English text. It has as some
of its components a knowledge domain, encoded in a KL-ONE like
representation, a reader model, a text-planner, a lexicon, and a
sentence generator (called NIGEL). The grammar used in NIGEL is a
Systemic Grammar of English of the type developed by Michael Halliday
-- see below for references.
For present purposes the grammar, the lexicon and their environment
can be represented as shown in Figure 1-1.
The lines enclose sets; the boxes are the linguistic components. The
dotted lines represent parts that have been developed independently of
the present project, but which are being implemented, refined and
revised, and the continuous lines represent components whose design
is being developed within the project.
The box labeled syntax stands for syntactic information, both of the
general kind that is needed to generate structures (the grammar; the left
part of the box) and of the more specific kind that is needed for the
syntactic definition of lexical items (the syntactic subentry of lexical
items; to the right in the box -- the term lexicogrammar can also be used
to denote both ends of the box).
1This research was supported by the Air Force Office of Scientific Research contract
No. F49620-79-C-0181. The views and conclusions contained in this document are those
of the author and should not be interpreted as necessarily representing the official
policies or endorsements, either expressed or implied, of the Air Force Office of Scientific
Research or the U.S. Government. The research represents a joint effort and so do the ideas
stemming from it which are the substance of this paper. I would like to thank in
particular William Mann, who has helped me think, given me many helpful ideas and
suggestions and commented extensively on drafts of the paper; without him it would not
be. I am also grateful to Yasutomo Fukumochi for helpful comments on a draft and to
Michael Halliday, who has made clear to me many systemic principles and insights.
Naturally, I am solely responsible for errors in the presentation and content.
[Figure: CONCEPTUALS, enclosing the semantics box; the SYNTAX box, divided into Grammar (General) and Lexis (Specific); the Lexicon.]
Figure 1-1: System overview.
The other box (semantics) represents that part of semantics that has to
do with our conceptualization of experience (distinct from the
semantics of interaction -- speech acts etc. -- and the semantics of
presentation -- theme structure, the distinction between given and new
information etc.). It is shown as one part of what is called conceptuals --
our general conceptual organization of the world around us and our
own inner world; it is the linguistic part of conceptuals. For the lexicon
this means that lexical semantics is that part of conceptuals which has
become lexicalized and thus enters into the structure of the vocabulary.
There is also a correlation between conceptual organization and the
organization of part of the grammar.
The double arrow between the two boxes represents the mapping
(realization or encoding) of semantics into syntax. For example, the
concept SELL is mapped onto the verb sold. 2
The grammar is the general part of the syntactic box, the part
concerned with syntactic structures. The lexicon cuts across three
levels: it has a semantic part, a syntactic part (lexis) and an
orthographic part (or spelling; not present in the figure). 3 The lexicon
consists entirely of independent lexical entries, each representing one
lexical item (typically a word).
2I am using the general convention of capitalizing terms denoting semantic entities.
Capitals will also be used for roles associated with concepts (like AGENT, RECIPIENT and
OBJECT) and for grammatical functions (like ACTOR, BENEFICIARY and GOAL). These
notions will be introduced below.
3This means that an entry for a lexical item consists of three subentries -- a semantic
entry, a syntactic entry and an orthographic entry. The lexicon box is shown as containing
parts of both syntax and semantics in the figure (the shaded area) to emphasize the
nature of the lexical entry.
This figure, then, represents the part of the PENMAN text production
system that includes the grammar, the lexicon and their immediate
environment.
PENMAN is at the design stage; consequently the discussion that
follows is tentative and exploratory rather than definitive. -- The
component that has advanced the farthest is the grammar. It has been
implemented in NIGEL, the sentence generator mentioned above. It has
been tested and is currently being revised and extended. None of the
other components (those demarcated by continuous lines) have been
implemented; they have been tested only by way of hand examples.
This paper will concentrate on the design features of the grammar
rather than on the results of the implementation and testing of it.
2. THE COMPONENTS 
2.1. Knowledge representation and semantics 
The knowledge representation 
One of the fundamental properties of the KL-ONE like knowledge
representation (KR) is its intensional -- extensional distinction, the
distinction between a general conceptual taxonomy and a second part
of the representation where we find individuals which can exist, states
of affairs which may be true etc. This is roughly a distinction between
what is conceptualizable and actual conceptualizations (whether they
are real or hypothetical). In the overview figure in section 1, the two
are together called conceptuals.
For instance, to use an example I will be using throughout this paper,
there is an intensional concept SELL, about which no existence or
location in time is claimed. An intensional concept is related to
extensional concepts by the relation Individuates: intensional SELL is
related to individual instances of extensional SELLs by the Individuates
relation. If I know that Joan sold Arthur ice-cream in the park, I have a
SELL fixed in time which is part of an assertion about Joan and it
Individuates intensional SELL. 4 A concept has internal structure: it is a
configuration of roles. The concept SELL has an internal structure
which is the three roles associated with it, viz. AGENT (the seller),
RECIPIENT (the buyer) and OBJECT. These roles are slots which are
filled by other concepts and the domains over which these can vary are
defined as value restrictions. The AGENT of SELL is a PERSON or a
FRANCHISE and so on.
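The configuration of roles just described can be sketched in code. The following is a minimal illustration in Python, not the KL-ONE formalism itself: the class `Concept`, its field names, and the value restrictions on RECIPIENT and OBJECT are assumptions made for the example; only the AGENT restriction (PERSON or FRANCHISE) comes from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Concept:
    """A node in a KL-ONE like net (hypothetical, simplified encoding)."""
    name: str
    # role name -> concepts allowed as fillers (the value restriction)
    roles: Dict[str, Tuple[str, ...]] = field(default_factory=dict)
    # for an extensional concept: the intensional concept it individuates
    individuates: Optional["Concept"] = None
    # for an extensional concept: role name -> actual filler
    fillers: Dict[str, str] = field(default_factory=dict)


# Intensional SELL: no existence or location in time is claimed.
SELL = Concept(
    "SELL",
    roles={
        "AGENT": ("PERSON", "FRANCHISE"),  # restriction given in the text
        "RECIPIENT": ("PERSON",),          # assumed for illustration
        "OBJECT": ("THING",),              # assumed for illustration
    },
)

# Extensional SELL: "Joan sold Arthur ice-cream in the park."
sell_1 = Concept(
    "SELL-1",
    individuates=SELL,
    fillers={"AGENT": "Joan", "RECIPIENT": "Arthur", "OBJECT": "ice-cream"},
)

print(sell_1.individuates.name)  # SELL
print(sorted(SELL.roles))        # ['AGENT', 'OBJECT', 'RECIPIENT']
```

Keeping the roles on the intensional concept and the fillers on the extensional one mirrors the intensional -- extensional distinction; a fuller model would presumably make the fillers concepts too.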
In other words, a concept is defined by its relation to other concepts
(much as in European structuralism). These relations are roles
associated with the concept, roles whose fillers are other concepts.
This gives rise to a large conceptual net.
There is another relation which helps define the place of a concept in
the conceptual net, viz. SuperCategory, which gives the conceptual net
a taxonomic (or hierarchic) structure in addition to the structure defined
by the role relations. The concept SELL is defined by its place in the
taxonomy by having TRANSACTION as a SuperCategory. If we want to,
we can define a concept that will have SELL as a SuperCategory (i.e.
bear the SuperCategory relation to SELL), for example SELLOB 'sell on
the black market'. As a result, part of the taxonomy of events is
TRANSACTION --- SELL --- SELLOB.
4It should be emphasized that calling the concept SELL says nothing whatsoever about
the English expression for it: the reasons for giving it that name are purely mnemonic. The
only way the concept can be associated with the word sold is through being part of a
lexical entry.
If TRANSACTION has a set of roles associated with it, this set may be
inherited by SELL and by SELLOB -- this is a general feature of the
SuperCategory relation. In the examples involving SELL that follow, I
will concentrate on this concept and not try to generalize to its
supercategories.
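Role inheritance along the SuperCategory relation can be illustrated with a small sketch. This is a hedged example only: the class and method names are hypothetical, and placing the three roles on TRANSACTION is an assumption made to show inheritance, not a claim about the actual knowledge domain.

```python
from typing import List, Optional


class Concept:
    """Intensional concept with a SuperCategory link (hypothetical encoding)."""

    def __init__(self, name: str, super_category: Optional["Concept"] = None,
                 roles: Optional[List[str]] = None) -> None:
        self.name = name
        self.super_category = super_category
        self.own_roles = list(roles or [])

    def all_roles(self) -> List[str]:
        """Own roles plus every role inherited along SuperCategory links."""
        inherited = self.super_category.all_roles() if self.super_category else []
        return inherited + [r for r in self.own_roles if r not in inherited]

    def lineage(self) -> List[str]:
        """Names from the most general supercategory down to this concept."""
        above = self.super_category.lineage() if self.super_category else []
        return above + [self.name]


# Assumption: the three roles are stated once, on TRANSACTION.
TRANSACTION = Concept("TRANSACTION", roles=["AGENT", "RECIPIENT", "OBJECT"])
SELL = Concept("SELL", super_category=TRANSACTION)
SELLOB = Concept("SELLOB", super_category=SELL)

print(" --- ".join(SELLOB.lineage()))  # TRANSACTION --- SELL --- SELLOB
print(SELLOB.all_roles())              # ['AGENT', 'RECIPIENT', 'OBJECT']
```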
The Semantic Subentry 
In the overview figure (1-1), the semantics is shown as part of the
conceptuals. The consequence of this is that the set of semantic
entries in the lexicon is a subset of the set of concepts. The subset is
proper if we assume that there are concepts which have not been
lexicalized (the assumption indicated in the figure). The assumption is
perfectly reasonable; I have already invented the concept SELLOB for
which there is no word in standard English: it is not surprising if we have
formed concepts for which we have to create expressions rather than
pick them ready-made from our lexicon. Furthermore, if we construct a
conceptual component intended to support say a bilingual speaker,
there will be a number of concepts which are lexicalized in only one of
the two languages.
A semantic entry, then, is a concept in the conceptuals. For sold, we
find SELL with its associated roles, AGENT, RECIPIENT and OBJECT.
The right part of figure 4-1 below (marked "se:"; after a figure from \[1\])
gives a more detailed semantic entry for sold: a pointer identifies the
relevant part in the KR, the concept that constitutes the semantic entry
(here the concept SELL).
The concept that constitutes the semantic entry of a lexical item has a
fairly rich structure. Roles are associated with the concept and the
modality (necessary or optional), the cardinality of and restrictions on
(value of) the fillers are given.
Through the value restriction the linguistic notion of selection 
restriction is captured. The stone sold a carnation to the little girl is odd 
because the AGENT role of SELL is value restricted to PERSON or 
FRANCHISE and the concept associated with stone falls into neither
type. 
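How a value restriction captures a selection restriction can be shown with a small check. This is a sketch under stated assumptions: the mini-taxonomy `SUPER` and the concept names STONE and GIRL are invented for the example; only the restriction of AGENT to PERSON or FRANCHISE is taken from the text.

```python
# Hypothetical mini-taxonomy: concept -> its SuperCategory (None at the top).
SUPER = {
    "THING": None,
    "PERSON": "THING",
    "FRANCHISE": "THING",
    "STONE": "THING",
    "GIRL": "PERSON",
}

# Value restriction on the AGENT role of SELL, as given in the text.
AGENT_OF_SELL = ("PERSON", "FRANCHISE")


def is_a(concept, target):
    """True if `concept` is `target` or has it as a (transitive) SuperCategory."""
    while concept is not None:
        if concept == target:
            return True
        concept = SUPER[concept]
    return False


def satisfies(filler, restriction):
    """A filler is acceptable if it falls under any concept in the restriction."""
    return any(is_a(filler, allowed) for allowed in restriction)


print(satisfies("GIRL", AGENT_OF_SELL))   # True
print(satisfies("STONE", AGENT_OF_SELL))  # False: "The stone sold ..." is odd
```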
The strategy of letting semantic entries be part of the knowledge
representation would not have been possible in a notation designed to
capture specific propositions only. However, since KL-ONE provides
the distinction between intension and extension, the strategy is
unproblematic in the present framework.
So what is the relationship between intensional-extensional and
semantic entries? The working assumption is that for a large part of the
vocabulary, it is the concepts of the intensional part of the KR that may
be lexicalized and thus serve as semantic entries. We have words for
intensional objects, actions and states, but not for individual
extensional objects etc., with the exception of proper names. They have
extensional concepts as their semantic entries. For instance, Alex
denotes a particular individuated person and The War of the Roses a
particular individuated war.
Both the SuperCategory relation and the Individuates relation provide
ways of walking around in the KR to find expressions for concepts. If
we are in the extensional part of the KR, looking at a particular
individual, we can follow the Individuates link up to an intensional
concept. There may be a word for it, in which case the concept is part of
a lexical entry. If there is no word for the concept, we will have to
consider the various options the grammar gives us for forming an
appropriate expression.
The general assumption is that all the intensional vocabulary can be
used for extensional concepts in the way just described: expressibility
is inherited with the Individuates relation.
Expression candidates for concepts can also be located along the
SuperCategory link by going from one concept to another one higher
up in the taxonomy. Consider the following example: Joan sold Arthur
ice-cream. The transaction took place in the park. The SuperCategory
link enables us to go from SELL to TRANSACTION, where we find the
expression transaction.
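The walk through the KR described in the last paragraphs can be sketched as follows. This is an illustrative assumption about how such a search might be written, not PENMAN's procedure: the three dictionaries and the function name `expression_candidates` are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical fragment of the KR. INDIVIDUATES links an extensional concept
# to an intensional one; SUPER_CATEGORY links intensional concepts upward.
INDIVIDUATES: Dict[str, str] = {"SELL-1": "SELL"}
SUPER_CATEGORY: Dict[str, str] = {"SELLOB": "SELL", "SELL": "TRANSACTION"}

# Lexicalized concepts only; SELLOB deliberately has no word of its own.
WORD_FOR: Dict[str, str] = {"SELL": "sold", "TRANSACTION": "transaction"}


def expression_candidates(concept: str) -> List[str]:
    """Collect words reachable via Individuates and then SuperCategory links."""
    node: Optional[str] = INDIVIDUATES.get(concept, concept)
    found: List[str] = []
    while node is not None:
        if node in WORD_FOR:
            found.append(WORD_FOR[node])
        node = SUPER_CATEGORY.get(node)
    return found


print(expression_candidates("SELL-1"))  # ['sold', 'transaction']
print(expression_candidates("SELLOB"))  # ['sold', 'transaction']
```

For a non-lexicalized concept such as SELLOB the candidates are only starting points; the grammar would still have to supply the rest of the expression.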
Lexical Semantic Relations 
The structure of the vocabulary is parasitic on the conceptual structure.
In other words, lexicalized concepts are related not only to one another,
but also to concepts for which there is no word encoding in English (i.e.
non-lexicalized concepts).
Crudely, the semantic structure of the lexicon can be described as 
being part of the hierarchy of intensional concepts -- the intensional 
concepts that happen to be lexicalized in English. -- The structure of 
English vocabulary is thus not the only principle that is reflected in the 
knowledge representation, but it is reflected. Very general concepts 
like OBJECT, THING and ACTION are at the top. In this hierarchy, roles 
are inherited. This corresponds to the semantic redundancy rules of a 
lexicon. 
Considering the possibility of walking around in the KR and the 
integration of lexicalized and non-lexicalized concepts, the KR suggests
itself as the natural place to state certain text-forming principles, some 
of which have been described under the terms lexical cohesion (\[8\]) 
and Thematic Progression (\[6\]). 
I will now turn to the syntactic component in figure 1-1, starting with a 
brief introduction to the framework (Systemic Linguistics) that does the 
same for that component as the notion of semantic net did for the 
component just discussed. 
2.2. Lexicogrammar
Systemic Linguistics stems from a British tradition and has been
developed by its founder, Michael Halliday (e.g. \[7\], \[9\], \[10\]) and
other systemic linguists (see e.g. \[5\], \[4\] for a presentation of Fawcett's
interesting work on developing a systemic model within a cognitive
model) for over twenty years covering many areas of linguistic concern,
including studies of text, lexicogrammar, language development, and
computational applications. Systemic Grammar was used in SHRDLU
\[15\] and more recently in another important contribution, Davey's
PROTEUS \[3\].
The systemic tradition recognizes a fundamental principle in the
organization of language: the distinction between choice and the
structures that express (realize) choices. Choice is taken as primary
and is given special recognition in the formalization of the systemic
model of language. Consequently, a description is a specification of the
choices a speaker can make together with statements about how he
realizes a selection he has made. This realization of a set of choices is
typically linear, e.g. a string of words. Each choice point is formalized as
a system (hence the name Systemic). The options open to the speaker
are two or more features that constitute alternatives which can be
chosen. The preconditions for the choice are entry conditions to the
system. Entry conditions are logical expressions whose elementary
terms are features.
All but one of the systems have non-empty entry conditions. This
causes an interdependency among the systems with the result that the
grammar of English forms one network of systems, which cluster when
a feature in one system is (part of) the entry condition to another
system. This dependency gives the network depth: it starts (at its
"root") with very general choices. Other systems of choice depend on
them (i.e. have a feature from one of these systems -- or a combination
of features from more than one system -- as entry conditions) so that the
systems of choice become less general (more delicate, to use the
systemic term) as we move along in the network.
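A network of systems with entry conditions can be sketched in a few lines. This is a toy illustration, not NIGEL's representation: the system names, the `imperative` option, the encoding of entry conditions as predicates, and the assumption that systems are listed from general to delicate are all made up for the example.

```python
from typing import Callable, FrozenSet, List, Tuple

Features = FrozenSet[str]

# Each system: (name, entry condition over features chosen so far, options).
# Only the root system has an empty (always true) entry condition.
SYSTEMS: List[Tuple[str, Callable[[Features], bool], Tuple[str, ...]]] = [
    ("MOOD", lambda f: True, ("indicative", "imperative")),
    ("INDICATIVE-TYPE", lambda f: "indicative" in f,
     ("declarative", "interrogative")),
    ("INTERROGATIVE-TYPE", lambda f: "interrogative" in f,
     ("wh-interrogative", "yes/no-interrogative")),
]


def traverse(choose: Callable[[str, Tuple[str, ...]], str]) -> List[str]:
    """Enter every system whose entry condition holds and record the choice."""
    selected: List[str] = []
    for name, entry_condition, options in SYSTEMS:
        if entry_condition(frozenset(selected)):
            selected.append(choose(name, options))
    return selected


# A chooser that takes "indicative" at the root and the last option elsewhere.
def chooser(name: str, options: Tuple[str, ...]) -> str:
    return options[0] if name == "MOOD" else options[-1]


print(traverse(chooser))
# ['indicative', 'interrogative', 'yes/no-interrogative']
```

The `choose` callback is where all non-determinism sits; the traversal itself is mechanical, which matches the division of labour between the network and the realization statements.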
The network of systems is where the control of the grammar resides, its
non-deterministic part. Systemic grammar thus contrasts with many
other formalisms in that choice is given explicit representation and is
captured in a single rule type (systems), not distributed over the
grammar as e.g. optional rules of different types. This property of
systemic grammar makes it a very useful component in a
text-production system, especially in the interface with semantics and in
ensuring accessibility of alternatives.
The rest of the grammar is deterministic -- the consequences of
features chosen in the network of systems. These consequences are
formalized as feature realization statements whose task is to build the
appropriate structure.
For example, in independent indicative sentences, English offers a
choice between declarative and interrogative sentences. If
interrogative is chosen, this leads to a dependent system with a choice
between wh-interrogative and yes/no-interrogative. When the latter is
chosen, it is realized by having the FINITE verb before the SUBJECT.
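A feature realization statement of this kind can be pictured as a deterministic mapping from a chosen feature to an ordering of function symbols. The sketch below is an assumption about one possible encoding, not the notation used in NIGEL; the table `REALIZATIONS` and the function `order_for` are hypothetical, and only two features are covered.

```python
from typing import Dict, Iterable, List, Tuple

# Feature -> order of the function symbols SUBJECT and FINITE.
# yes/no-interrogative: FINITE before SUBJECT (stated in the text).
# declarative: SUBJECT before FINITE (the speaker is giving information).
REALIZATIONS: Dict[str, Tuple[str, str]] = {
    "declarative": ("SUBJECT", "FINITE"),
    "yes/no-interrogative": ("FINITE", "SUBJECT"),
}


def order_for(features: Iterable[str]) -> List[str]:
    """Return the ordering realized by the first feature that has a statement."""
    for feature in features:
        if feature in REALIZATIONS:
            return list(REALIZATIONS[feature])
    raise ValueError("no ordering statement for these features")


print(order_for(["indicative", "declarative"]))
# ['SUBJECT', 'FINITE']
print(order_for(["indicative", "interrogative", "yes/no-interrogative"]))
# ['FINITE', 'SUBJECT']
```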
Since it is the general design of the grammar that is the focus of 
attention, I will not go through the algorithm for generating a sentence 
as it has been implemented in NIGEL. The general observation is that 
the results are very encouraging, although it is incomplete. The 
algorithm generates a wide range of English structures correctly. There 
have not been any serious problems in implementing a grammar written 
in the systemic notation. 
Before turning to the lexico- part of lexicogrammar, I will give an
example of the top-level structure of a sentence generated by the
grammar. (I have left out the details of the internal structure of the 
constituents.) 
[Structure diagram: three layers of function symbols, \[1\] to \[3\], over the constituents below.]
In the park | Joan | sold | Arthur | ice-cream
The structure consists of three layers of function symbols, all of which
are needed to get the result desired. The structure is not only
functional (with function symbols labeling the constituents instead of
category names like Noun Phrase and Verb Phrase) but it is
multifunctional.
Each layer of function symbols shows a particular perspective on the
clause structure. Layer \[1\] gives the aspect of the sentence as a
representation of our experience. The second layer structures the
sentence as interaction between the speaker and the hearer; the fact
that SUBJECT precedes FINITE signals that the speaker is giving the
hearer information. Layer \[3\] represents a structuring of the clause as a
message; the THEME is its starting point. The functions are called
experiential, interpersonal and textual respectively in the systemic
framework: the function symbols are said to belong to three different
metafunctions. In the rest of the paper I will concentrate on the
experiential metafunction, partly because it will turn out to be highly
relevant to the lexicon.
The syntactic subentry.
In the systemic tradition, the syntactic part of the lexicon is seen as a
continuation of grammar (hence the term lexicogrammar for both of
them): lexical choices are simply more detailed (delicate) than
grammatical choices (cf. \[9\]). The vocabulary of English can be seen
as one huge taxonomy, with Roget's Thesaurus as a very rough model.
A taxonomic organization of the relevant part of the vocabulary of
English is intended for PENMAN, but this organization is part of the
conceptual organization mentioned above. There is at present no
separate lexical taxonomy.
The syntactic subentry potentially consists of two parts. There is always
the class specification -- the lexical features. This is a statement of the
grammatical potential of the lexical item, i.e. of how it can be used
grammatically. For sold the class specification is the following:
    verb
    class 10
    class 02
    benefactive
where "benefactive" says that sold can occur in a sentence with a
BENEFICIARY, "class 10" that it encodes a material process
(contrasting with mental, verbal and relational processes) and "class
02" that it is a transitive verb.
In addition, there is a provision for a configurational part, which is a
fragment of a structure the grammar can generate, more specifically the
experiential part of the grammar. 5 The structure corresponds to the top
layer (\[1\]) in the example above. In reference to this example, I can
make more explicit what I mean by fragment. The general point is that
(to take just one class as an example) the presence and character of
functions like ACTOR, BENEFICIARY and GOAL -- direct participants in
the event denoted by the verb -- depend on the type of verb, whereas
the more circumstantial functions like LOCATION remain unaffected
and applicable to all types of verb. Consequently, the information about
the possibility of having a LOCATION constituent is not the type of
information that has to be stated for specific lexical items. The
information given for them concerns only a fragment of the experiential
functional structure.
The full syntactic entry for sold is:
    PROCESS     = verb
                  class 10
                  class 02
                  benefactive
    ACTOR       =
    GOAL        =
    BENEFICIARY =
This says that sold can occur in a fragment of a structure where it is
PROCESS and there can be an ACTOR, a GOAL and a BENEFICIARY.
The usefulness of the structure fragment will be demonstrated in
section 4.
3. THE PROBLEM 
I will now turn to the fundamental problem of making a working system
out of the parts that have been discussed.
The problem has two parts to it, viz.
1. the design of the system as a system with integrated parts
and
2. the implementation of the system.
I will only be concerned with the first aspect here.
The components of the system have been presented. What remains --
and that is the problem -- is to design the missing links; to find the
strategies that will do the job of connecting the components.
Finding these strategies is a design problem in the following sense. The
strategies do not come as accessories with the frameworks we have
used (the systemic framework and the KL-ONE inspired knowledge
representation). Moreover, these two frameworks stem from two quite
disparate traditions with different sets of goals, symbols and terms.
I will state the problem for the grammar first and then for the lexicon. As
it has been presented, the grammar runs wild and free. It is organized
around choice, to be sure, but there is nothing to relate the choices to
the rest of the system, in particular to what we can take to be semantics.
In other words, although the grammar may have a part that faces
semantics -- the system network, which, in Halliday's words, is
semantically relevant grammar -- it does not make direct contact with
semantics. And, if we know what we want the system to encode in a
sentence, how can we indicate what goes where, that is what a
constituent (e.g. the ACTOR) should encode?
The lexicon incorporates the problem of finding an appropriate strategy
to link the components to each other, since it cuts across component
boundaries. The semantic and syntactic subparts of a lexical entry
have been outlined, but nothing has been said about how they should
be matched up with one another. The reason why this match is not
perfectly straightforward has to do with the fact that both entries may be
structures (configurations) rather than single elements. In addition,
there are lexical relations that have not been accounted for yet,
especially synonymy and polysemy.
5This configurational part does not stem from the systemic tradition, but is an
[...] in the present design.
4. LOOKING FOR THE SOLUTIONS 
4.1. The Grammar 
Choice experts and their domains. 
The control of the grammar resides in the network of systems. Choice
experts can be developed to handle the choices in these systems.
The idea is that there is an expert for each system in the network and
that this expert knows what it takes to make a meaningful choice, what
the factors influencing its choice are. It has at its disposal a table which
tells it how to find the relevant pieces of information, which are
somewhere in the knowledge domain, the text plan or the reader model.
In other words, the part of the grammar that is related to semantics is
the part where the notion of choice is: the choice experts know about
the semantic consequences of the various choices in the grammar and
do the job of relating syntax to semantics. 6
The recognition of different functional components of the grammar
relates to the multi-functional character of a structure in systemic
grammar I mentioned in relation to the example In the park Joan sold
Arthur ice-cream in section 2.2. The organization of the sentence into
PROCESS, ACTOR, BENEFICIARY, GOAL, and LOCATIVE is an
organization the grammar imposes on our experience, and it is the
aspect of the organization of the sentence that relates to the conceptual
organization of the knowledge domain: it is in terms of this organization
(and not e.g. SUBJECT, OBJECT, THEME and NEW INFORMATION)
that the mapping between syntax and semantics can be stated. The
functional diversity Halliday has provided for systemic grammar is
useful in a text-production system; the other functions find uses which
space does not permit a discussion of here.
Pointers from constituents.
In order for the choice experts to be able to work, they must know
where to look. Assume that we are working on in the park in our
example sentence In the park Joan sold Arthur ice-cream and that an
expert has to decide whether park should be definite or not. The
information about the status in the mind of the reader of the concept
corresponding to park in this sentence is located at this concept: the
trick is to associate the concept with the constituent being built. In the
example structure given earlier, in the park is both LOCATION and
THEME, only the former of which is relevant to the present problem. The
solution is to set a pointer to the relevant extensional concept when the
function symbol LOCATION is inserted, so that LOCATION will carry the
pointer and thus make the information attached to the concept
accessible.
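The interplay of a choice expert and a pointer from a constituent can be sketched as follows. This is a deliberately small illustration under stated assumptions: the class `Constituent`, the set `READER_KNOWS` standing in for the reader model, the concept name PARK-1, and the feature names `definite` and `indefinite` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Constituent:
    """A constituent under construction, labelled by function symbols."""
    functions: Set[str]
    concept_pointer: str  # set when the experiential function is inserted


# Hypothetical reader model: extensional concepts the reader already knows.
READER_KNOWS: Set[str] = {"PARK-1"}


def definiteness_expert(constituent: Constituent) -> str:
    """Choose between two assumed features by following the concept pointer."""
    if constituent.concept_pointer in READER_KNOWS:
        return "definite"
    return "indefinite"


# "in the park" is both LOCATION and THEME; LOCATION carries the pointer.
in_the_park = Constituent(functions={"LOCATION", "THEME"},
                          concept_pointer="PARK-1")
print(definiteness_expert(in_the_park))  # definite
```

The expert never inspects the grammar's other layers; everything it needs is reached through the pointer, which is the point of attaching it when LOCATION is inserted.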
4.2. The lexicon and the lexical entry
I have already introduced the semantic subentry and the syntactic
subentry. They are stated in a KL-ONE like representation and a
systemic notation respectively. The question now is how to relate the
two.
In the knowledge representation the internal structure of a concept is a
configuration of roles and these roles lead to new concepts to which the
concept is related. A syntactic structure is seen as a configuration of
aA ~ d~lnitk~n ot the h~i soTintlca ol tt~ gnlmm•r ik Is • nliA# ot 
IOl~'mlC, h0 "minimti~ • what ti~ Brlmm•~cll ~ ~ io~ at*. in the Ixment 
'4/mcusWon, I ~ focun~l on Ine know~dge domain one, ~ ~ this bl me 
mosl r~J~Im to MmiP.~ ~'T~li~. 
function symbols; syntactic categories serve these functions -- in the
generation of a structure the functions lead to an entry of a part of the
network. For example, the function ACTOR leads to a part of the
network whose entry feature is Nominal Group just as the role AGENT
(of SELL) leads to the concept that is the filler of it. The parallels between
the two representations in this area are the following:
    KNOWLEDGE REPRESENTATION     SYNTACTIC REPRESENTATION
    role                         function
    filler                       exponent
(Where exponent denotes the entry feature into a part of the network
(e.g. Nominal Group) that the function leads to.)
This parallel clears the path for a strategy for relating the semantic entry
and the syntactic entry. The strategy is in keeping with current ideas in
linguistics. 7 Consider the following crude entry for sold, given here as
an illustration:
Subentries:
    semantic            syntactic                          orthographic
                        Functions        Lexical
                                         features
    SELL-concept   =    PROCESS     =    verb               "sold"
                                         class 10
                                         class 02
                                         benefactive
    AGENT          =    ACTOR
    OBJECT         =    GOAL
    RECIPIENT      =    BENEFICIARY
where the previously discussed semantic and syntactic subentries are 
repeated and paired off against each other. 
This full lexical entry makes clear the usefulness of the second part of 
the syntactic entry -- the fragment of the experiential functional
structure in which sold can be the PROCESS. 
Another piece of the total picture also falls into place now. The notion of
a pointer from an experiential function like BENEFICIARY in the
grammatical structure to a point in the conceptual net was introduced
above. We can now see how this pointer may be set for individual lexical
items: it is introduced as a simple relation between a grammatical
function symbol and a conceptual role in the lexical entry of e.g. SELL.
Since there is an Individuates link between this intensional concept and
any extensional SELL, the extensional concept that is part of the
particular proposition that is being encoded grammatically, the pointer
is inherited and will point to a role in the extensional part of the
knowledge domain.
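The pairing of subentries and the inheritance of pointers through the Individuates link can be sketched as below. This is a minimal illustration, not the PENMAN data format: the dictionary layout, the names `SOLD`, `SELL_1` and `set_pointers`, and the extensional filler names are assumptions made for the example.

```python
from typing import Dict

# Crude lexical entry for "sold", following the three subentries in the text.
SOLD = {
    "semantic": "SELL",
    "syntactic": {
        "PROCESS": ("verb", "class 10", "class 02", "benefactive"),
        # grammatical function -> conceptual role of SELL
        "functions": {"ACTOR": "AGENT", "GOAL": "OBJECT",
                      "BENEFICIARY": "RECIPIENT"},
    },
    "orthographic": "sold",
}

# Hypothetical extensional SELL that individuates intensional SELL.
SELL_1 = {
    "individuates": "SELL",
    "fillers": {"AGENT": "JOAN-1", "RECIPIENT": "ARTHUR-1",
                "OBJECT": "ICE-CREAM-1"},
}


def set_pointers(entry: dict, instance: dict) -> Dict[str, str]:
    """Point each grammatical function at the filler of its paired role."""
    if instance["individuates"] != entry["semantic"]:
        raise ValueError("instance does not individuate the entry's concept")
    pairing = entry["syntactic"]["functions"]
    return {function: instance["fillers"][role]
            for function, role in pairing.items()}


print(set_pointers(SOLD, SELL_1))
# {'ACTOR': 'JOAN-1', 'GOAL': 'ICE-CREAM-1', 'BENEFICIARY': 'ARTHUR-1'}
```

The function-to-role pairing is stated once, in the lexical entry; each extensional SELL then supplies the concrete targets of the pointers.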
At this point, I will refer again to the figure below, whose right half I have
already referred to as a full example of a semantic subentry ("se:").
"sp:" is the spelling or orthographic subentry; "ge:" is the syntactic subentry.
We have two configurations in the lexical entry: in the semantic
subentry the concept plus a number of roles and in the syntactic
subentry a number of grammatical functions. The match is represented
in figure 4-1 by the arrows.
7The mechanism for mapping has much in common with one developed for Lexical
Functional Grammar (see e.g. \[2\]), although the levels are not the same. The entry
[...] a lexical entry in the Pan-Lexicalism framework developed by Hudson in \[11\].
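[Figure: the lexical entry for sold, with the syntactic subentry (class 02 etc.) on the left, the semantic subentry on the right, and arrows matching functions to roles.]
Figure 4-1: Lexical entry for sold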
in the first step I introduced the KL-ONE like knowledge representation 
All three roles of SELL have the modaJity "r~c~___,~_~'. This does not 
dictate the grammatical pos.~bilities. The grammar in Nigei offers a 
choice between e.g. They sold many books to their customers and The 
book sold well, In the second example, the grammar only Dicks out a 
subset of the roles of SELL for expras~on. In other words, the grammar 
makes the adoption of different persl~¢tives possible. II I can now 
return to the ol:~ervation that the functional diversity Hallidey has 
provldat for systemic grammar is useful for our pu~__o'-'e~-__; The fact that 
grammatical structure is multi.layered means that those aspects of 
grammatical structure that are relevant to the mapping between the two 
lexical entries are identified, made explicit (as ACTOR BENEFICIARY 
etc.) and kept seperate from pdnciplas of grernmatical structuring that 
are not directly relevant to this mapl:dng (e.g. SUBJECT, NEW and 
THEME). 
In conclusion, a strategy for accounting for synonymy and polysemy
can be mentioned.
The way to capture synonymy is to allow a concept to be the semantic
subentry for two distinct orthographic entries. If the items are
syntactically identical as well, they will also share a syntactic subentry.
Polysemy works the other way: there may be more than one concept for
the same syntactic subentry.
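The sharing of subentries just described can be illustrated with a small
table of entries. This is a sketch under stated assumptions: the words
(buy, purchase, bank), the concept names and the helper functions are
hypothetical examples chosen for illustration and do not come from the
PENMAN lexicon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntacticSubentry:
    word_class: str
    functions: tuple  # grammatical functions the item provides


# Hypothetical syntactic subentries, each defined once and then shared.
TRANSITIVE_VERB = SyntacticSubentry("verb", ("ACTOR", "GOAL"))
COUNT_NOUN = SyntacticSubentry("noun", ())

# Each row links a spelling, a concept and a syntactic subentry.
ENTRIES = [
    ("buy", "BUY", TRANSITIVE_VERB),                # synonymy: two spellings,
    ("purchase", "BUY", TRANSITIVE_VERB),           # one concept, one syntax
    ("bank", "FINANCIAL-INSTITUTION", COUNT_NOUN),  # polysemy: one syntax,
    ("bank", "RIVER-BANK", COUNT_NOUN),             # more than one concept
]


def spellings_for(concept):
    """Orthographic entries that share the given semantic subentry."""
    return sorted({s for s, c, _ in ENTRIES if c == concept})


def concepts_for(spelling):
    """Concepts attached to the given orthographic entry."""
    return sorted({c for s, c, _ in ENTRIES if s == spelling})


def syntax_for(spelling):
    """Syntactic subentry of the first row with the given spelling."""
    for s, _, syntax in ENTRIES:
        if s == spelling:
            return syntax
    raise KeyError(spelling)


print(spellings_for("BUY"))
print(concepts_for("bank"))
print(syntax_for("buy") is syntax_for("purchase"))
```

Keeping the syntactic subentry as a separate shared object means that
synonyms which are syntactically identical point to one and the same
subentry rather than to two copies of it.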
5. CONCLUSION 
I have discussed a grammar and a lexicon for PENMAN in two steps.
First I looked at them as independent components -- the semantic entry,
the grammar and the syntactic entry -- and then, after identifying the
problems of integrating them into a system, I turned to strategies for
relating the grammar to the conceptual representation and the syntactic
entry to the semantic one within the lexicon.
In the first step I introduced the KL-ONE like knowledge representation
and the systemic notation and indicated how their design features can
be put to good use in PENMAN. For instance, the distinction between
intension and extension in the knowledge representation makes it
possible to let lexical semantics be part of the conceptuals. It was also
suggested that the relations SuperCategory and Individuates can be
used to find expressions for a particular concept.
The second step attempted to connect the grammar to semantics
through the notion of the choice expert, making use of a design
principle of systemic grammars where the notion of choice is taken as
basic. I pointed out the correlation between the structure of a concept
and the notion of structure in the systemic framework and showed how
the two can be matched in a lexical entry and in the generation of a
sentence, a strategy that could be adopted because of the
multi-functional nature of structure in systemic grammars. This second
step has been at the same time an attempt to start exploring the
potential of a combination of a KL-ONE like representation and a
Systemic Grammar.
Although many aspects have had to be left out of the discussion, there
are a number of issues that are of linguistic interest and significance.
The most basic one is perhaps the task itself: designing a model where
a grammar and a lexicon can actually be made to function as more than
just structure generators. One issue related to this that has been
brought up was that different ... external to the grammar find
resonance in different parts of the grammar and that there is a partial
correlation between the conceptual structure of the knowledge
representation and the grammar and lexicon.
As was emphasized in the introduction, PENMAN is at the design stage:
there is a working sentence generator, but the other aspects of what
has been discussed have not been implemented and there is no
commitment yet to a frozen design. Naturally, a large number of
problems still await their solution, even at the level of design and,
clearly, many of them will have to wait. For example, selectivity among
terms, beyond referential adequacy, is not addressed.
In general, while noting correlations between linguistic organization
and conceptual organization, we do not want the relation to be
deterministic: part of being a good verbalizer is being able to adopt
different viewpoints -- verbalize the same knowledge in different ways.
This is clearly an area for future research. Hopefully, ideas such as
grammars organized around choice and choice experts will prove
useful tools in working out extensions.
REFERENCES 
1. Brachman, Ronald, A Structural Paradigm for Representing
   Knowledge, Bolt, Beranek, and Newman, Inc., Technical Report,
   1978.
2. Bresnan, J., "Polyadicity: Part I of a Theory of Lexical Rules and
   Representation," in Hoekstra, van der Hulst & Moortgat (eds.),
   Lexical Grammar, Dordrecht, 1980.
3. Davey, Anthony, Discourse Production, Edinburgh University
   Press, Edinburgh, 1979.
4. Fawcett, Robin P., Exeter Linguistic Studies. Volume 3:
   Cognitive Linguistics and Social Interaction, Julius Groos Verlag
   Heidelberg and Exeter University, 1980.
5. Fawcett, R. P., Systemic Functional Grammar in a Cognitive Model
   of Language. University College, London. Mimeo, 1973.
6. Danes, F., ed., Papers on Functional Sentence Perspective,
   Academia, Publishing House of the Czechoslovak Academy of
   Sciences, 1974.
7. Halliday, M. A. K., "Categories of the theory of grammar," Word
   17, 1961.
8. Halliday, M. A. K. and R. Hasan, Cohesion in English, Longman,
   London, 1976. English Language Series, Title No. 9.
9. Halliday, M. A. K., System and Function in Language, Oxford
   University Press, London, 1976.
10. Hudson, R. A., North Holland Linguistic Series. Volume 4: English
    complex sentences, North Holland, London and Amsterdam, 1971.
11. Hudson, R. A., DDG Working Paper, University College, London.
    Mimeo, 1980.
12. Mann, William C., and James A. Moore, Computer as
    Author--Results and Prospects, USC/Information Sciences
    Institute, Research report 79-82, 1980.
13. Mann, William C. and James A. Moore, Computer Generation of
    Multiparagraph English Text, 1979. AJCL, forthcoming.
14. Moore, James A., and W. C. Mann, "A snapshot of KDS, a
    knowledge delivery system," in Proceedings of the Conference,
    17th Annual Meeting of the Association for Computational
    Linguistics, pp. 51-52, August 1979.
15. Winograd, Terry, Understanding Natural Language, Academic
    Press, Edinburgh, 1972.

