AN INTEGRATED SYNTACTIC AND SEMANTIC SYSTEM FOR NATURAL LANGUAGE UNDERSTANDING
UN SYSTÈME SYNTAXIQUE ET SÉMANTIQUE INTÉGRÉ POUR LA COMPRÉHENSION DU LANGAGE NATUREL
Frédérique SEGOND(1) and Karen JENSEN(2)
(1) Institut National des Télécommunications (Evry, France); email: segond at vaxu.int-evry.fr
(2) Microsoft Corporation (Redmond, Washington State, USA); email: karenje at microsoft.com
ABSTRACT
The strategy presented here underlies an integrated Natural Language Processing system, the PLUS system (Progressive Language Understanding System), in which the traditional theoretical components, syntax, semantics, and discourse, are linked to form a whole. The system is written in a single formalism, PLNLP (Programming Language for Natural Language Processing; Heidorn 1972), which provides an efficient architecture for unifying the different components. The system offers an elegant strategy for broad-coverage, domain-independent Natural Language understanding. At present, six components make up the PLUS/PLNLP system; they can be briefly described as follows: (1) syntax (PEG, the PLNLP grammar of English); (2) refined syntax (reattachment of constituents); (3) derivation of a logical form (PEGASUS); (4) sense disambiguation; (5) normalization of semantic relations; (6) paragraph-level discourse model.
Components (1) and (3) are already fairly advanced and have been tested in various applications. Components (2) and (4) are under development. The techniques for building the discourse model are established but not yet implemented. This article focuses on components (3) and (5), with particular attention to (5), which lays the grammatical foundations used by the discourse model. Descriptions of the other components can be found in the literature (for example in Chanod et al. 1991 and Jensen et al. 1992, forthcoming).
Component (3), PEGASUS, is a decisive passage from syntax to semantics, semantics being understood as involving, at a minimum, the definition of cases or thematic roles (i.e., predicate-argument structure). The best illustration of this is the difference between the input and output representations. The input is a syntactic tree; the output is a labeled directed graph. A tree is first of all a syntactic representation, in which linear order and grammatical dominance carry information. A graph is a semantic representation; linear order is no longer significant, since the information now appears in the labels of the graph's arcs or in its attributes. In order to derive the logical form, PEGASUS must handle argument-assignment phenomena on a large scale, including difficult cases such as unbounded dependencies (for example, associating the right object with the verb "ate" in the sentence "What did Mary say that John ate?"), functional control (for example, finding the subjects and objects of infinitives), active/passive relations (ensuring that active and passive forms have the same underlying arguments), and so on. The program must also bring out the relations between phrase heads and their modifiers or adjuncts. In addition, it must take anaphora into account: nominal anaphora, including pronouns and definite NP referents, and verbal anaphora, i.e. associating the right arguments and constituents in cases of ellipsis. The entire input string must be correctly evaluated. At its current stage of development, PEGASUS does not handle definite NP reference or quantification, but treats all the other phenomena mentioned. The interest of PEGASUS lies in proposing a method for computing predicate-argument structures in a post-processor, which distinguishes it from the procedures commonly employed in other NLP systems.
Component (5) deals with semantic relations. This component brings out the semantic links hidden in syntactic relations. To do so it must, among other things, bring together paraphrastic semantic structures. For this purpose, the result obtained with PEGASUS (the predicate-argument structure) is modified with the help of a "concept grammar." The task of the concept grammar is to construct a well-founded network in which semantic relations are established between concept nodes. This grammar is a set of procedures written in PLNLP that perform, under certain constraints, a number of operations on graphs. The arcs of the analysis are labeled with relation names, themselves derived systematically from the combination of the syntax and the semantics of the input text. The rules of this grammar are similar in form to the rules of the earlier components, but they operate on different aspects of the information common to the structures, analyzing the relations between the nodes of the sentence graph and normalizing the semantic structures and lexical relations of a variety of syntactic domains, without losing access to the surface structure (which contains the syntactic differences). This article shows how, starting from the predicate-argument structure obtained as output of PEGASUS, the grammar produces semantic graphs while preserving the characteristic of the overall system: broad coverage and domain independence.
ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 - 890 - PROC. OF COLING-92, NANTES, AUG. 23-28, 1992
1. Derivation of logical form (PEGASUS)
We present PLUS (Progressive Language Understanding System), an integrated NLP analysis system. In its current state, PLUS consists of six components, roughly described as: (1) syntax (PEG, the PLNLP English Grammar); (2) corrected syntax (reassignment of constituents); (3) derivation of logical form (PEGASUS); (4) sense disambiguation; (5) normalization of semantic relations; and (6) paragraph (discourse) model. The current system architecture is sequential, because this makes it easier to concentrate on developing techniques for processing truly unrestricted input. However, this control structure is expected to become more parallel in the future.
The purpose of the third component, PEGASUS, is to simplify the derivation of a semantic representation, or logical form, for each input sentence or sentence fragment. To do this it computes: (a) the structure of arguments and adjuncts for each clause; (b) NP (pronoun) anaphora; (c) VP-anaphora (for elided VPs). Simultaneously it must maintain broad coverage (that is, accept and analyze unrestricted input text). In most NLP systems, the computation of such meaning structures is considered impossible unless a particular domain is specified.
Consider the sentence, "After dinner, Mary gave a cake to John." Figure 1 shows the syntactic (tree) representation for that sentence after it has been processed by the first two analysis components, and Figure 2 shows the semantic graph produced by PEGASUS for the same sentence:
--------------------------------------------
DECL1 ---- PP1 ------ PREP1 ----- "After"
      |        |----- NOUN1* ---- "dinner"
      |        |----- PUNC1 ----- ","
      |--- NP1 ------ NOUN2* ---- "Mary"
      |--- VERB1* --- "gave"
      |--- NP2 ------ DETP1 ----- ADJ1* --- "a"
      |        |----- NOUN3* ---- "cake"
      |--- PP2 ------ PREP2 ----- "to"
      |        |----- NOUN4* ---- "John"
      |--- PUNC2 ---- "."
--------------------------------------------
Figure 1. Syntactic parse tree
[Graph image not recoverable from the source; only the arc label DOBJ survives.]
Figure 2. Semantic graph for the sentence in Figure 1
A graph is produced by displaying only those attributes and values that are defined to be semantic. However, the underlying record structure contains all attributes resulting from the parse. In this fashion, all levels and types of information, from morphological to syntactic to semantic and beyond, are constantly available. This principle of accountability holds throughout the PLUS/PLNLP system.
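To make the accountability principle concrete, here is a minimal Python sketch of how a filtered "semantic view" can coexist with a full analysis record. The flat dict encoding and the attribute inventory shown here are illustrative assumptions, not the actual PLNLP record structure.

```python
# Hypothetical full analysis record for the node "give" in
# "After dinner, Mary gave a cake to John." -- every attribute from the
# parse is kept, from morphology up.
record = {
    "PRED": "give",      # basic term (lemma)
    "SEGTYPE": "VERB",   # syntactic category: kept, but not displayed
    "TENSE": "PAST",     # morphological detail: kept, but not displayed
    "DSUB": "Mary",      # deep subject          (semantic)
    "DIND": "John",      # deep indirect object  (semantic)
    "DOBJ": "cake",      # deep object           (semantic)
    "after": "dinner",   # adjunct arc named by the preposition's lemma
}

# The attributes declared to be semantic; only these reach the graph display.
SEMANTIC = {"PRED", "DSUB", "DIND", "DOBJ", "DNOM", "DCMP", "after"}

def semantic_view(rec):
    """Display only the semantic attributes, as the graph does."""
    return {k: v for k, v in rec.items() if k in SEMANTIC}

print(semantic_view(record))   # the graph view
print(record["TENSE"])         # ...but nothing is lost underneath
```

The filtered view is what the reader of Figure 2 sees; the record itself keeps every attribute of the parse available to later components.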
In an NLP system that uses attribute-value pairs, argument structures can be produced (a) by defining, for each node, attribute names that correspond to the desired argument or adjunct types, and (b) by assigning values to those attributes. It is customary to think of argument names like AGENT, PATIENT, etc. However, although these labels are tantalizingly semantic in nature, there is as yet no uniformly acceptable way of relating syntactic structure to them. Therefore we avoid such labels, at least for the time being. We adopt, instead, the notion of "deep" cases or functional roles:
DSUB: deep subject 
DIND: deep indirect object 
DOBJ: deep object 
DNOM: deep predicate nominative 
DCMP: deep object complement 
All deep argument attributes are added to the analysis record structure by PEGASUS. For very simple clauses, deep arguments correspond exactly to the surface syntactic arguments. For example, in "John ate the cake," the NP "John" fills the roles of both surface and deep subject; "the cake" fills the roles of both surface and deep object. In such simple cases, the deep argument attributes could as well have been assigned by the syntax rules; they are assigned by PEGASUS just to simplify the overall system architecture.
Each major class node is examined, and, if it contains more than just one single (head) word, each associated word is evaluated for possible assignment to some deep-structure attribute. In addition to the deep case labels, the following non-syntactic, non-argument attributes define the fully elaborated structure:
PRED: predicate (basic term) label
PTCL: particle in two-part verbs
OPS: operator, like demonstratives and quantifiers
NADJ: adjective modifying a noun
PADJ: predicate adjective
PROP: otherwise unspecified modifier that is a clause
MODS: otherwise unspecified modifier that is not a clause; also, members of a coordinated structure
And in addition to these, attributes are defined to point to adjunct prepositional phrases and subordinate clauses. The names of these attributes are actually the lemmas of those prepositions and conjunctions that begin their phrases and clauses. In this fashion, a step is taken toward a more semantic analysis of these constituents, without the necessity of going all the way to case labels like "locative" and "durative."
The procedure starts by renaming the surface arguments in all cases, as described previously. Then it calls a set of sub-procedures, each one of which is designed to solve a particular piece of the argument puzzle. Here is an outline of the flow of control taken for the specification of arguments and adjuncts:
1. Assign arguments and modifiers to all VP nodes:
A. Assign arguments, in this order:
1) Unbounded dependencies, e.g., in "What did Mary say that John ate?" the DOBJ of "ate" is "What."
2) Functional control, e.g., in "John wanted to eat the cake," the DSUB of "eat" is "John."
3) Passives, e.g., in "The cake was eaten by 
John," the DSUB is "John" and the DOBJ is "the 
cake." 
4) Indirect object paraphrases, e.g., the 
structure for "Mary gave a surprise to John" must be 
identical to the structure for "Mary gave John a 
surprise." 
5) Indirect object special cases, e.g., in "I told the story," the syntactic object "the story" is the DOBJ; but in "I told the woman," the syntactic object "the woman" is the DIND.
6) Extraposition, e.g., "John ate the cake" is 
the DSUB of the sentence "It appears that John ate 
the cake." 
B. Assign modifiers (all adjuncts): prepositional, 
adjectival and adverb phrases; adverbial noun phrases; 
subordinate clauses; infinitives; comment clauses; 
participial modifiers; sentential relative clauses; etc. 
2. Assign modifiers (including arguments) to all NP nodes. 
3. Assign modifiers to all AJP (adjective phrase) nodes. 
4. Assign modifiers to all AVP (adverb phrase) nodes. 
5. Clean up the attribute-value structure by deleting some unwanted features.
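The flow of control outlined above can be sketched as code. This is a toy Python rendering, not PLNLP: the sub-procedures are stubs that merely record their firing order, since the point of the outline is the fixed sequence in which they apply to a VP node.

```python
applied = []

def make_step(name):
    # Stand-in for a real sub-procedure; the body only logs its name.
    def step(node):
        applied.append(name)
    return step

resolve_unbounded_dependencies = make_step("unbounded dependencies")        # 1)
resolve_functional_control     = make_step("functional control")            # 2)
resolve_passives               = make_step("passives")                      # 3)
resolve_dative_paraphrases     = make_step("indirect-object paraphrases")   # 4)
resolve_dative_special_cases   = make_step("indirect-object special cases") # 5)
resolve_extraposition          = make_step("extraposition")                 # 6)

def assign_vp_arguments(node):
    # The order is essential: unbounded dependencies before functional
    # control, and both before passives.
    for step in (resolve_unbounded_dependencies,
                 resolve_functional_control,
                 resolve_passives,
                 resolve_dative_paraphrases,
                 resolve_dative_special_cases,
                 resolve_extraposition):
        step(node)

assign_vp_arguments({"cat": "VP", "head": "ate"})
print(applied)
```

Each stub would be replaced by a procedure that rewrites the node's attribute-value record; only the ordering is taken from the text.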
The focus of linguistic interest here is on the assignment of arguments to VP nodes. Ordering of the sub-procedures is important. Long-distance dependencies must be resolved before functional control is assigned, and both of these maneuvers must be performed before passives are handled. The ordering presented here was experimentally determined by parsing sentences that contain more than one of the phenomena noted.
Subcategorization features on verbs are used more strictly here than they are used in the first component, the broad-coverage syntactic sketch. Also, although selectional features were not found to be useful in constructing the syntactic sketch, they are both useful and necessary for defining deep arguments in PEGASUS. With unbounded dependencies, it is important to distinguish the probable subcategorization types of verbs in the sentence, and also some selectional ("semantic") features on nouns, since the argument structure will vary depending on the interplay between these two pieces of information.
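As a toy illustration of that interplay (the feature inventory below is invented, not the PLNLP one): the fronted "What" in "What did Mary say that John ate?" must land as DOBJ of "ate" rather than of "say," because "say" is already saturated by its clausal object while the transitive "ate" still lacks one.

```python
# Hypothetical subcategorization table; "chain" lists each verb on the path
# of the unbounded dependency together with the object it already has.
verbs = {
    "say": {"subcat": "clausal-object"},
    "ate": {"subcat": "transitive"},
}

def place_filler(filler, chain):
    """Attach the fronted filler as DOBJ of the first verb that is
    transitive and still missing its object."""
    for verb, existing_object in chain:
        if existing_object is None and verbs[verb]["subcat"] == "transitive":
            return (verb, "DOBJ", filler)
    return None

# "What did Mary say [that John ate __ ]?"
chain = [("say", "that-clause"), ("ate", None)]
print(place_filler("what", chain))
```

A fuller version would also consult selectional features on the filler noun, as the text notes; this sketch shows only the subcategorization half of the decision.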
The sub-procedure for functional control handles not only infinitive clauses, but also participial clauses, both present and past. These constructions often require argument assignment over long intervening stretches of text. In the sentence "Mary, just as you predicted, arrived excitedly waving her hands," "Mary" is DSUB of the present participle "excitedly waving her hands." In the sentence "Bolstered by an outpouring of public confidence, John accepted the post," "John" is DOBJ of the past participle "Bolstered by an outpouring..."
All of the other sub-procedures for argument assignment are linguistically interesting to various degrees, but none of them is quite so complex as the procedures for unbounded dependency and functional control.
2. Semantic normalization 
Semantic relations are represented by a graph. The nodes of the graph contain words; but, since these are linked with dictionary definitions, synonyms, and other related words, it is possible to say that these nodes represent concepts.1 It is the job of the concept grammar to construct a well-motivated network in which semantic relations are properly drawn among concept nodes.
In order to do this job, one of the important problems that has to be addressed is the problem of showing equivalences between paraphrases. This problem was first approached by PEGASUS, where, for example, both active and passive forms of a clause are provided with the same argument structure. The work is continued by the concept grammar, and expanded to handle a much wider set of paraphrase situations. The basic intuition remains the same, however: different sentences that have essentially the same meaning (truth-value) will have the same semantic graph. And the same principle of accountability applies here as there: the system will always have access to the original surface syntactic variability, so that no nuances of expression need ever be lost.
As an example, all of the following sentences have the same essential meaning, and therefore should be associated with the same semantic graph: "There is a blue block"; "The block is blue"; "The block is a blue block"; "The block is a blue one." These are not classical syntactic variants, like active and passive; but they are variants of the same semantic facts: a block exists, and it is blue.
The sentences are analyzed by the syntax and 
PEGASUS. (Because our descriptive sentences are 
purposely kept very simple, we can avoid using the second 
and fourth components, reassignment and sense 
disambiguation.) The result is a graph for each sentence, 
corresponding to the basic arguments and adjuncts of that 
sentence. The concept grammar examines each sentence 
graph, checking for certain configurations that signal the 
presence of common underlying conceptual categories. 
Here is where the syntactic variants will be normalized. 
The operation of the concept grammar can be compared to the operation of a syntactic grammar: syntax takes words and phrases, and links them, via common morpho-syntactic relationships, into a structural whole; the concept grammar takes arguments and adjuncts, and links them, via common semantic relationships, into a conceptual whole. Syntax works with syntactic category labels; the concept grammar works with semantic arc labels.
2.1. The "block" sentence paraphrases
Consider the four "block" sentences above. The argument and adjunct structures (sentential graphs) provided by PEGASUS for these sentences, and shown in Figure 3, use just four semantic arc labels: DSUB, NADJ, PADJ, and DNOM (see above):2
See Segond and Jensen 1991 for an explanation of the assignment of NP- and VP-anaphora, a discussion of advantages to using a post-processor, and a comparison of PEGASUS with other current strategies for deriving predicate-argument structures.
1 See Sowa 1984 for an introduction to conceptual graph structures.
2 Although only the head lemmas are displayed in the graph nodes, the underlying record structure keeps access to all syntactic details, such as determiners, tense, etc.
ACRES DE COLING-92, NANTES, 23-28 AO~" 1992 8 9 2 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
[Graph images not recoverable from the source; only the arc labels NADJ and DSUB survive.]
"There is a blue block."
"The block is blue."
"The block is a blue block."
"The block is a blue one."
Figure 3. Sentential graphs for the "block" sentences
These four sentential graphs are quite different; but, since the sentences have the same meaning, there should be just one semantic graph for all of them:
Figure 4. Canonical semantic graph for the sentences 
in Figure 3 
This is a case of paraphrase that requires normalization. In order to achieve it, first of all we delete the node "be" in all graphs. It is well known that the English copula "be" carries very little semantic weight.
RULE 1: Delete the copula "be."
Second, if an adjective carries a lexical feature that marks it as a "color" word, then we change the arc label NADJ to the label COLOR. The effect is to change the name of the relation between the noun and the adjective.
RULE 2: Change NADJ from a node with a "color" word to COLOR.
To achieve the desired semantic graph for "There is a 
blue block," we apply Rule 1 and Rule 2, deleting the node 
"be" and changing the name of the relation between the 
node "block" and the adjective "blue." 
When the predicate is an adjective (PADJ), there is, in the argument structure, no direct relation between the subject (DSUB) and the adjective (PADJ). Both of them are attributes of the node "be." In this case, we create a new relation, NADJ, between the subject and the adjective, and delete the relation PADJ. (We will deal later with the difference between predicative (PADJ) and attributive (NADJ) adjectives.)
RULE 3: Create NADJ arc between subject and 
predicate adjective. 
Once this new arc is created, rules 1 and 2 will 
recognize that the adjective is a "color" word, change the 
name of the relation NADJ to COLOR, and delete the node 
"be." These operations will turn the sentential graph for 
"The block is blue" into the desired semantic graph in 
Figure 4. 
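Rules 1, 2, and 3 can be sketched as small graph rewrites. The nested-dict graph encoding and the color lexicon below are illustrative assumptions, not PLNLP; only the ordering (Rule 3 first, then Rules 2 and 1) follows the text.

```python
COLOR_WORDS = {"blue", "red", "black"}   # stand-in for a "color" lexical feature

def rule3_padj_to_nadj(g):
    """RULE 3: create a NADJ arc between subject and predicate adjective."""
    if g.get("PRED") == "be" and "PADJ" in g and "DSUB" in g:
        g["DSUB"]["NADJ"] = g.pop("PADJ")

def rule2_nadj_to_color(g):
    """RULE 2: rename NADJ to COLOR when the adjective is a color word."""
    for node in [g] + [v for v in g.values() if isinstance(v, dict)]:
        if node.get("NADJ", {}).get("PRED") in COLOR_WORDS:
            node["COLOR"] = node.pop("NADJ")

def rule1_delete_copula(g):
    """RULE 1: delete the copula "be", promoting its deep subject."""
    if g.get("PRED") == "be" and "DSUB" in g:
        return g["DSUB"]
    return g

# Sentential graph for "The block is blue": DSUB and PADJ both hang off "be".
g = {"PRED": "be",
     "DSUB": {"PRED": "block"},
     "PADJ": {"PRED": "blue"}}

rule3_padj_to_nadj(g)
rule2_nadj_to_color(g)
g = rule1_delete_copula(g)
print(g)   # {'PRED': 'block', 'COLOR': {'PRED': 'blue'}}
```

The result matches the canonical graph of Figure 4: a "block" node with a COLOR arc to "blue."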
When the predicate is a noun or a noun phrase (DNOM), as in the remaining two "block" sentences, we have to ask if that predicate nominative is the same term as the subject (or is an equivalent empty anaphoric term, like "one"), or if it is different from the subject, and not empty. In the first case we unify the subject and the predicate NPs. All the nodes which point to the first are made to point to the second, and vice versa. Once this is done, the problems of the color adjective and of the empty copula are automatically handled by existing rules, and the sentential graphs for the last two "block" sentences are transformed into the canonical graph in Figure 4.
RULE 4: Unify subject and predicate under appropriate conditions.
In the second case, when there is a DNOM that is different from the subject NP, we create a new relation between the subject and the predicate. In the simplest case, we give this relation the ISA label:
RULE 5: Create ISA link under appropriate 
conditions. 
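Rules 4 and 5 can be sketched the same way. The test for an "empty" predicate nominative (same lemma, or the anaphoric "one") is a simplification of the conditions hinted at in the text, and the encoding is again illustrative.

```python
def handle_dnom(g):
    subj, pred = g["DSUB"], g["DNOM"]
    if pred["PRED"] == subj["PRED"] or pred["PRED"] == "one":
        # RULE 4: unify subject and predicate NPs by merging attributes.
        for k, v in pred.items():
            if k != "PRED":
                subj.setdefault(k, v)
        del g["DNOM"]
    else:
        # RULE 5: otherwise, link subject and predicate with an ISA arc.
        subj["ISA"] = g.pop("DNOM")

# "The block is a blue block": DNOM is the same term as the subject.
g1 = {"PRED": "be",
      "DSUB": {"PRED": "block"},
      "DNOM": {"PRED": "block", "NADJ": {"PRED": "blue"}}}
handle_dnom(g1)
print(g1["DSUB"])   # {'PRED': 'block', 'NADJ': {'PRED': 'blue'}}

# "The block is an object": DNOM differs, so an ISA link is created.
g2 = {"PRED": "be",
      "DSUB": {"PRED": "block"},
      "DNOM": {"PRED": "object"}}
handle_dnom(g2)
print(g2["DSUB"])   # {'PRED': 'block', 'ISA': {'PRED': 'object'}}
```

After Rule 4 fires, the existing copula and color rules apply to the unified record; after Rule 5, the ISA arc yields the graph of Figure 5.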
Hence the sentence "The block is an object" has the following semantic graph:
Figure 5. Semantic graph for "The block is an object"
The reader should not conclude from the previous examples that dealing with paraphrases requires a lot of ad hoc solutions. On the contrary, the rules (or procedures) of the concept grammar are general in nature. They identify and represent typical semantic relations in a formal way. A syntactic grammar does the same thing, but at a different level of structure. The concept grammar tries to catch what might be called "the semantics of the syntax." These operations are straightforward, just as the operations that build constituent structure in a syntactic grammar are straightforward. But this simplicity should not obscure the elegance of what is going on here. With minimal effort, using easily accessible parse information, we are automating the creation of a conceptual structure. This conceptual structure will ultimately have a high degree of abstractness, generality, and language independence.
2.2. Locative prepositional phrases 
Consider the following set of sentences, which should all have the same semantic graph (Figure 6):
(1)
There is a blue block on the red block.
The blue block is on the red block.
There is a red block under the blue block.
The red block is under the blue block.
Figure 6. Canonical graph for sentences in (1) 
Figure 7. Sentential graph for "There is a blue block on the 
red block" 
Note the graph node labeled "position." This word was never used in the paraphrase sentences, but the concept was implicit in all of them. (The link between preposition names and the word/concept "position" can be validated in dictionaries and thesauri.) One interesting and significant result of setting out to normalize these paraphrases is the emergence of what might be called the essential meaning of the expressions, namely, a statement of the relative position of two objects. In this fashion, the writing of a concept grammar results naturally, and pragmatically, in the emergence of terms that we might want to consider as "semantic primitives." It should be emphasized, however, that we are not committed beforehand to any basic conceptual or semantic primitives. In this example, the relations ONTOP and UNDER appear in the canonical graph of the sentence, but this is just for purposes of the present exposition. What we are interested in is to establish an appropriate link between the two blocks. Instead of ONTOP and UNDER we could have ABOVE (or ON) and BELOW, etc.
It is not necessary to discuss the treatment of each of the paraphrases. The first sentence in (1) will serve as an example. Figure 7, above, shows its sentential graph. What we want to do is to link the deep subject ("blue block") with the object of the preposition ("red block") by using the relation names ONTOP and UNDER, which spring from the concept POSITION. We delete the copula "be," and create the new node POSITION, motivated by dictionary definitions for locative prepositions. Then we add two attributes, ONTOP and UNDER, to this node (pointing respectively to the subject and the noun phrase object of the preposition), and delete the attribute ON in the list of attributes of the subject. Notice that if the sentence read "above" instead of "on," the treatment would be the same.
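The sequence just described can be sketched as a single rewrite. The dict encoding is illustrative, and the preposition table is a stand-in for the dictionary-validated link between prepositions and the concept "position": "on"/"above" give the subject the ONTOP arc, "under"/"below" give it the UNDER arc.

```python
# Which arc the subject and the prepositional object receive, per preposition.
LOCATIVE = {"on": ("ONTOP", "UNDER"), "above": ("ONTOP", "UNDER"),
            "under": ("UNDER", "ONTOP"), "below": ("UNDER", "ONTOP")}

def normalize_locative(g):
    """Replace a copula + locative-PP graph with a POSITION node."""
    subj = g["DSUB"]
    for prep, (subj_arc, obj_arc) in LOCATIVE.items():
        if prep in subj:
            obj = subj.pop(prep)    # delete the ON attribute on the subject
            # The copula "be" is dropped along with g itself.
            return {"PRED": "position", subj_arc: subj, obj_arc: obj}
    return g

# Sentential graph for "There is a blue block on the red block."
g = {"PRED": "be",
     "DSUB": {"PRED": "block", "COLOR": "blue",
              "on": {"PRED": "block", "COLOR": "red"}}}

pos = normalize_locative(g)
print(pos["ONTOP"]["COLOR"], pos["UNDER"]["COLOR"])   # blue red
```

Swapping "on" for "above" changes nothing in the output, which is exactly the point made in the text.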
Of course, this does not mean that looking at the syntactic relations between words is enough; the semantics of the words themselves are also important. For instance, the kind of relation involved between a subject NP and the NP object of a PP in the case of a locative prepositional phrase (e.g., the cat is in the garden, the cat is under the table) is not the same as the one involved with the PP in the sentence "The cat is in love." But still, in all three sentences, what we are interested in is building the relation between "the cat" and the NP object of the PP (garden, table, love). Giving a name to the relation (and, for that purpose, knowing that love is a concept, garden is a place, and table is an object) is the task of the sense disambiguation component, which consults dictionary definitions to find the necessary semantic information.
2.3. Relative clauses 
One way of combining propositions (the block is blue, is on the table, etc.) into one sentence is to use a relative clause. We can say:
(2) (a) The block that is blue is on the table.
(b) On the table is the block that is blue.
(c) The block, which is on the table, is blue.
Figure 8 shows the sentential graph for (2a). The 
attribute PROP points to the semantic structure of the 
relative clause "that is blue," and the attribute REF 
identifies the referent of the relative pronoun "that": 
Figure 8. Sentential graph for "The block that is blue 
is on the table" (2a) 
In the sentences of (2), we want to relate the deep subjects of the relative clauses with their predicates. All we have to do, in this case, is to unify the DSUB of the PROP with the REF of the DSUB of the PROP, deleting the REF attribute. The result is a record, pointed to by PROP, which has a DSUB identical to the DSUB of the whole sentence, and therefore possesses both the attributes that it gains from the relative clause, and the attributes of the DSUB of the whole sentence. Now the system is able to handle recursively all the other problems (copula, predicate adjective, and spatial prepositional relationships), and we obtain the same graph as is obtained for sentences such as "The blue block is on the table" or "There is a blue block on the table":
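The unification step can be sketched as follows, with the relative clause stored under PROP and the relative pronoun's referent under REF; after the rewrite, the clause's DSUB and the sentence's DSUB are literally one shared record. The encoding is an illustrative assumption.

```python
def unify_relative(g):
    """Unify the DSUB of the PROP with the REF of that DSUB, deleting REF."""
    prop = g["DSUB"]["PROP"]
    if prop["DSUB"].get("REF") is g["DSUB"]:   # pronoun refers to the head NP
        del prop["DSUB"]["REF"]
        prop["DSUB"] = g["DSUB"]               # one shared record for both clauses
    return g

# Sentential graph for "The block that is blue is on the table." (2a)
block = {"PRED": "block"}
block["PROP"] = {"PRED": "be",
                 "DSUB": {"PRED": "that", "REF": block},
                 "PADJ": {"PRED": "blue"}}
g = {"PRED": "be", "DSUB": block, "on": {"PRED": "table"}}

unify_relative(g)
# The relative clause's deep subject is now the very same record as the
# sentence's deep subject, so later rules (copula deletion, PADJ) can
# apply recursively and attach "blue" to "block" itself.
print(g["DSUB"]["PROP"]["DSUB"] is g["DSUB"])   # True
```

Because the two DSUB slots share a record, any attribute gained inside the relative clause automatically belongs to the subject of the whole sentence.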
Figure 9. Canonical semantic graph for the sentences 
in (2) 
2.4. Toward the discourse model 
Our work also involves normalizing across sentence boundaries. For instance, from (3a-b):
(3) (a) The blue block is on the red block.
(b) The red block is on the black block.
we want to be able to infer (3c-d):
(3) (c) The blue block is above the black block.
(d) The black block is below the blue block.
Inference across sentence boundaries does not differ, in essence, from inference within a single sentence; after all, two sentences may become one sentence, under coordination:
(3) (a AND b) The blue block is on the red block AND the red block is on the black block.
From an implementation point of view, the strategy is the same. We consider all nodes called "position." There is one such node in the graph for (3a), and another in the graph for (3b). We look at the records for both "position" nodes and obtain two lists: one, a list of all ONTOP attributes; and the other, a list of all UNDER attributes. We look at the intersection of those lists. If they have an element in common (for instance, in the previous example, "red block" will appear in both of them), then we know that we can infer the graph in Figure 10:
Figure 10. Inferential graph for (3d) 
Figure 10 displays only the inferences in (3c-d), derived from (3a-b). But the system does not lose access to information about the existence and placement of the red block mentioned in (3a-b).
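The intersection strategy can be sketched as follows. Block names stand in for full NP records, and only the ABOVE direction is emitted (BELOW is its mirror); this is an illustrative rendering, not the PLNLP procedure.

```python
def infer_above(positions):
    """If one 'position' node's UNDER equals another's ONTOP, the first
    node's ONTOP is transitively ABOVE the second node's UNDER."""
    inferred = []
    for p in positions:
        for q in positions:
            if p is not q and p["UNDER"] == q["ONTOP"]:
                inferred.append((p["ONTOP"], "ABOVE", q["UNDER"]))
    return inferred

# (3a) "The blue block is on the red block."
# (3b) "The red block is on the black block."
positions = [
    {"PRED": "position", "ONTOP": "blue block", "UNDER": "red block"},
    {"PRED": "position", "ONTOP": "red block", "UNDER": "black block"},
]

print(infer_above(positions))   # [('blue block', 'ABOVE', 'black block')]
```

The shared element "red block" triggers the inference, and the original two position nodes remain untouched, so no information about the red block is lost.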
All the examples given in this paper involve sentences with the verb "be." "Be" and other state verbs comprise a complicated and interesting class. They accept a lot of different constructions (adjectival predicates, nominal predicates, prepositional phrase complements, etc.), and provide a convenient and convincing field for preliminary investigations. At the same time, much of the work done for state verbs (coordination, PP relationships, etc.) can be applied to other verb classes.
3. Conclusion 
We hope to have made two substantial contributions in this paper: (1) to suggest a novel method for computing argument structures in a post-processor, in order to simplify the derivation of logical forms for sentences; (2) to show the birth of a concept grammar, which receives syntactic and semantic information from earlier stages of the system, and automatically provides a grammatical foundation for the next stage, discourse. We dealt with some linguistic problems, including different kinds of paraphrases. We also suggested methods for handling logical properties of natural language, such as the spatial properties of prepositions. (See Segond and Jensen 1991 for additional constructions handled by the concept grammar.)
Dealing with locative prepositions is not the same as dealing with the whole of natural language. However, we have tried to avoid specific or ad hoc solutions. The rules of the concept grammar are generic in nature. They express semantic facts about English (and, in some cases, about language in general), just as a morpho-syntactic grammar expresses syntactic facts about English. Therefore they are in no way restricted to a semantic subdomain.
This structure of very general relations is one of the steps leading to an ideal semantic representation of sentences. It provides a universal representation, independent from the surface structure but without losing the information contained in the surface structure.
Another contribution of the paper is to illustrate how this approach leads to an articulated architecture for a natural language understanding system. The architecture provides both modularity and integration of NLP tasks, and allows for a smooth flow from syntax through semantics to discourse. Starting with an initial syntactic sketch, we obtain a conceptual graph step by step, without adding a lot of hand-coded semantic information in the dictionary.
Acknowledgments 
We are grateful to all the people who have helped us. Among these, we acknowledge here especially the following: George Heidorn, who provided us with tools and advice; Joel Fagan, who initialized the concept grammar work (see Fagan 1990); and Wlodek Zadrozny, with whom we have had lively and interesting conversations about semantics. Of course, any errors in this work remain our responsibility.
References 
Chanod, J.-P., B. Harriehausen, and S. Montemagni. 1991. "Post-processing multi-lingual argument structures" in Proceedings of the 11th International Workshop on Expert Systems and Their Applications, Avignon, France.
Fagan, J.L. 1990. "Natural Language Text: The Ideal Knowledge Representation Formalism to Support Content Analysis for Text Retrieval" in P.S. Jacobs, ed., Text-based Intelligent Systems: Current Research in Text Analysis, Information Extraction and Retrieval. GE Research and Development Center, Technical Information Series, 90CRD198, pp. 48-52. (Originally presented at the AAAI 1990 Spring Symposium on Text-based Intelligent Systems, March 27-29, 1990, Stanford University, Stanford, California, USA.)
Heidorn, G.E. 1972. "Natural Language Inputs to a Simulation Programming System." PhD dissertation, Yale University, New Haven, Connecticut, USA.
Jensen, K., G.E. Heidorn and S.D. Richardson. 1992, forthcoming. Natural Language Processing: the PLNLP Approach. Kluwer Academic Publishers, Boston, Massachusetts, USA.
Segond, F. and Jensen, K. 1991. "An Integrated Syntactic and Semantic System for Natural Language Understanding." IBM RC 16914, T.J. Watson Research Center, Yorktown Heights, New York, USA.
Sowa, J.F. 1984. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley Pub. Co., Reading, Massachusetts, USA.
