RELATING SYNTAX AND SEMANTICS: 
THE SYNTACTICO-SEMANTIC LEXICON OF THE SYSTEM VIE-LANG
Ingeborg Steinacker, Ernst Buchberger 
Department of Medical Cybernetics 
University of Vienna, Austria 
ABSTRACT 
This paper describes the structure 
and evaluation of the syntactico-semantic 
lexicon (SSL) of the German Natural 
Language Understanding System VIE-LANG 
[3]. VIE-LANG uses an SI-Net [2] as
internal representation. The SSL contains 
the rules according to which the mapping 
between net-structures and surface 
structures of a sentence is carried out. 
This information is structured so that
it can be evaluated in two directions.
The parser interprets it as 
production-rules that control the 
analysis. Syntactic and semantic features 
of the input sentence are evaluated and 
individuals are created in the semantic 
net. The generator uses the same rules to 
express selected net-structures in 
adequate natural language expressions. It 
is shown how both processes can make 
effective use of the SSL. The different 
possibilities for evaluating the SSL are 
explained and illustrated by examples. 
I OVERVIEW OF THE SYSTEM VIE-LANG 
A. Representation 
In the system VIE-LANG real world 
knowledge is represented within a semantic 
net (SN) which is realized in the 
formalism of an SI-Net [2]. The net is
organized in two layers. 
The generic layer contains the static 
knowledge of the system. At the generic 
level real world knowledge is represented 
in the form of concepts and roles. A 
concept is defined by its attributes which 
consist of two parts: role and value 
restriction. The value restriction is a 
concept which defines the range of 
possible fillers for the attribute, the 
role defines the relation of the filler to 
the concept being defined. 
Generic concepts are organized in a 
hierarchy of super- and subconcepts in 
which a subconcept inherits all attributes 
of its superconcepts. 
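
The generic layer described above can be sketched as follows. This is a minimal illustration in Python, not the representation used in VIE-LANG; the class `Concept`, its methods, and the concrete roles and value restrictions in the example are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Concept:
    """A generic concept: a name, an optional superconcept, and local attributes."""
    name: str
    parent: Optional["Concept"] = None
    # Each attribute maps a role name to the name of its value-restriction concept.
    local_attributes: Dict[str, str] = field(default_factory=dict)

    def attributes(self) -> Dict[str, str]:
        """Return inherited plus local attributes (local ones take precedence)."""
        inherited = self.parent.attributes() if self.parent else {}
        return {**inherited, **self.local_attributes}

    def is_a(self, other: str) -> bool:
        """True if this concept is `other` or one of its subconcepts."""
        node = self
        while node is not None:
            if node.name == other:
                return True
            node = node.parent
        return False


# Hypothetical fragment of a generic layer; the restrictions are illustrative only.
action = Concept("ACTION", None, {"AGENT": "ANIMATE", "OBJECT": "THING"})
objtrans = Concept("OBJTRANS", action, {"RECIPIENT": "PERSON", "SOURCE": "PERSON"})

print(objtrans.attributes())   # AGENT and OBJECT are inherited from ACTION
print(objtrans.is_a("ACTION"))
```

The subconcept OBJTRANS lists only its own attributes; the attributes of its superconcept are obtained by walking up the hierarchy.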
The second layer of the net contains 
the dynamic knowledge which consists of 
individualized concepts. The parser 
creates individuals of those net 
structures which are addressed by the 
input words. As more input is analyzed 
more individuals and links are created. 
These individuals constitute the episodic 
layer of the net. 
The conceptual content of the net is 
organized according to the idea of 
semantic primitives [8] which are
characterized by typical attributes. 
Action primitives have attributes which 
correspond to cases of a case grammar 
(AGENT, OBJECT, RECIPIENT, LOCATION, etc.) 
[4], [11].
B. Parsing
Our parser belongs to the class of
semantic parsers as suggested by [1], [7].
Since syntax carries a lot of information
in German, it has to be considered in
analysis: the syntactic role of a
constituent cannot be determined by
word order; instead, the morphological
endings, which indicate the surface case of
the constituent, have to be evaluated.
The parser is a data-driven system, 
which combines syntactic and semantic 
processes. Syntax is used as the tool to 
gain information concerning the 
constituents of the sentence, but the 
syntactic processes interact with semantic 
ones in order to confirm their hypotheses 
about a constituent. To recognize NPs and 
PPs the parser uses an ATN, which accepts 
semantically valid interpretations only. 
The resultant structures include syntactic 
and semantic information about the 
constituent. These structures are then 
collected in a constituent list. 
The semantic representation of a 
sentence is built by linking the 
constituents to the predicate. This 
process is controlled by the SSL-entry for 
the verb. First the dominant verb has to 
be disambiguated [9]. SSL entries for
verbs specify how
verb-dependent constituents are mapped 
onto the cases represented within the net. 
In a last step referents for modifying 
constituents are determined and attached. 
A sentence is considered to have been 
parsed successfully after all constituents 
of the sentence have been incorporated. 
As a result the parser produces a 
configuration of individuals in the net - 
the semantic representation of the input. 
C. Generation 
The task of the generator is to 
convert a selected part of the episodic 
layer of the semantic net into surface 
sentences. This part, a root node together
with the edges and nodes attached to it that form a
coherent graph, is assumed to have been
determined previously by the dialogue 
component. Generation is accomplished in 
two steps: step one performs a mapping of 
the SN to an intermediate structure (IMS) 
containing words together with syntactical 
and morphological data, and step two 
transforms the IMS to surface sentences by 
applying syntactical transformations, 
linearizations and a morphological 
synthesis. 
To produce a single sentence, the 
dominating verb is selected first, as it 
plays a central role in a sentence. The 
semantic primitives of which the SN is 
composed imply that there is no one-to-one 
correspondence between concepts in the net 
and words of the language. Therefore the 
decision which verb to select depends on 
the pattern of individuals in the episodic 
layer of the net. The criteria for this 
selection are attached to the generic 
concept of the root node in the form of a
discrimination net (DN) [5]. Its tests
evaluate the filled attributes of the root 
primitive. The evaluation of this DN 
results not only in a verb, but in a 
verb-sense. 
The generator accesses the SSL entry 
for this verb-sense and continues by 
processing the different rules of which it 
is composed. The rules are evaluated from 
right to left. Right sides mainly deal 
with entities in the SN, especially 
individuals. If an individual is relevant 
to generation, it is put on a stack 
("current individual"). When the left 
side is processed, syntactical data along 
with the result of a recursive call of 
this part of the generator is passed to 
the IMS. The current individual (the 
argument of this call) is then removed 
from the stack and control is returned to 
the calling procedure, thus allowing the 
next rule to be processed. The IMS which 
is created during this part of the process 
forms the input for the step two processor 
which will finally produce the output 
sentence [6].
II THE SYNTACTICO-SEMANTIC LEXICON 
By means of the SSL the mapping 
between surface expressions in natural 
language and structures of the 
representation is achieved. For an NLU 
dialogue system the relation between 
surface and representation is of interest 
in the context of parsing and the context 
of generating. The structure of the SSL 
allows interpretation by both processes. 
Attributes of actions realize the 
ideas of a case grammar. This leads to a 
correspondence between roles in the net 
and surface cases within the sentence. 
On the one hand, cases of a case grammar
show regularities in their relation to
syntactic constituents (subject -> AGENT);
on the other hand, the relation between a
role and a surface case is verb-dependent.
E.g. the verb 'bekommen' (to get) relates 
the subject to the role RECIPIENT, the 
verb 'geben' (to give) relates the subject 
to the role SOURCE. The verb 'geben' 
requires the RECIPIENT to be expressed by 
a dative. Such dependencies are captured 
in the entries of the SSL whereas the 
regularities are treated by defaults. 
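
The division of labour between defaults and verb-specific entries can be sketched as a lookup with fallback. This is an illustrative Python sketch, not the SSL format itself; the table names, the labels for syntactic functions, and the function `role_for` are assumptions.

```python
# Regularities: default correspondence between syntactic function and net role.
DEFAULT_ROLE = {"subject": "AGENT", "direct_object": "OBJECT"}

# Verb-dependent exceptions, as they would be recorded in SSL entries.
VERB_OVERRIDES = {
    "bekommen": {"subject": "RECIPIENT"},
    "geben": {"subject": "SOURCE", "dative_object": "RECIPIENT"},
}


def role_for(verb, function):
    """Return the net role for a syntactic function, preferring the verb's own entry."""
    overrides = VERB_OVERRIDES.get(verb, {})
    if function in overrides:
        return overrides[function]
    return DEFAULT_ROLE.get(function)


print(role_for("bekommen", "subject"))  # RECIPIENT (verb-specific)
print(role_for("essen", "subject"))     # AGENT (default)
```

Only the deviations from the regular pattern need to be stored per verb; everything else falls through to the defaults.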
A. Structure of the SSL 
The basic unit in the SSL is the 
entry for a word-sense. Associated with
each word-sense is a variable number of
pairs which we will describe by the terms 
'Left Side' (LS) and 'Right Side' (RS). A 
pair describes how a word (phrase) of the 
sentence is represented within the 
semantic net. 
LSs describe features of the surface 
sentence. Most features refer to 
syntactic properties, e.g. constituents 
of a given surface case, infinitive 
constructions, lexical categories, surface 
words, and some features indicate 
selectional restrictions. If an LS
contains more than one feature, they are
combined with an operator (e.g. AND). One of the
most frequent patterns that is used in LSs 
combines a syntactic feature with a net 
concept which is interpreted as 
selectional restriction. This combination 
reflects our general parsing approach to 
combine syntax with semantics. 
RSs refer mostly to structures within 
the semantic net. There is no one-to-one 
correspondence between word-senses and 
conceptual primitives. To represent word 
(or phrase) meanings primitives are linked 
forming more complex structures. By 
definition there is one distinguished
concept in each RS, the 'root concept',
which is the central element of the
representation. All other structures 
referenced in an RS are linked to it. 
Although the number of 
action-primitives is relatively small 
(14), the net provides possibilities to 
express differences between related verbs. 
This is done by filling attributes with 
certain values by default. Such an 
attribute does not correspond to a 
constituent of the sentence but is 'part' 
of the verb-sense, e.g. 'gehen' (to go)
is represented by the concept
CHANGE OF LOCATION, 'laufen' (to run)
addresses the same concept, but its
attribute SPEED is filled by a different 
value. 
Not all SSL entries are relevant to 
parser and generator - some entries are 
relevant to one process only. This should 
not be regarded as a disadvantage, on the 
contrary, such entries support efficient 
use of the SSL. Since each subsystem has 
its own typical way of interpreting 
entries (LS and RS), process-specific 
entries are simply disregarded by the 
other system. 
B. Evaluation of the SSL 
Parser and generator treat the 
entries in the SSL as production-rules, 
each interpreting LS and RS in its own 
way. The parser works from LSs to RSs 
whereas the generator works in the 
opposite direction. 
1. Parsing
The parser needs to map 
surface-constituents onto elements of the 
semantic net. To produce the semantic 
representation of an input word the parser 
accesses the SSL entry of this word. For 
each word there may be several word-sense 
entries. The LSs of all word-sense 
entries for a word incorporate the 
information necessary to distinguish one 
sense from the others. The parser 
interprets the LSs as conditions that have 
to be fulfilled by the input sentence. 
The SSL contains at least one pair LS - RS 
for each word-sense. In order to choose 
the correct interpretation the LSs of the 
different word-senses are evaluated. 
After the parser has chosen a word-sense 
by matching sentence-patterns and 
LS-conditions the associated RSs are 
interpreted as actions and evaluated 
sequentially. For the parser the 
structures in the RS are interpreted as 
representation of the word, therefore the 
indicated net-structures are 
individualized. The complete structure 
that has been created after all RSs have 
been executed is used as the 
representation of the input-word. 
Verb-entries for example specify 
the relation between surface constituents 
and the cases which are attributes of the 
action concept. Each verb-sense calls for 
a typical sentential pattern in which each 
constituent has to fulfil certain semantic 
restrictions. The parser selects a 
verb-sense if the features of constituents 
in the constituent list satisfy the 
conditions of the LSs. After having 
selected one word-sense its RSs are 
evaluated and the constituents are linked 
to the action as case-fillers. 
The parser uses the SSL entries 
to disambiguate verbs. The LSs
incorporate the factors by which 
word-senses can be discriminated from each 
other. For many verbs the selectional 
restriction of the direct object is a 
decisive factor. E.g. the verb 
'bekommen' (to get) is interpreted as 
OBJTRANS iff the semantic restriction of 
the direct object belongs to the class 
OWNABLE-OBJECT (see Fig. 1). The
mechanisms by which disambiguation is
carried out if the LS is not met are
explained elsewhere [10].
(BEKOMMEN
  (1
    [(AND (CASE ACC)
          (RESTR OWNABLE-OBJECT))
     ---->
     ((IND OBJTRANS)
      (VAL + OBJECT *))]
    [(T (CASE NOM))
     ---->
     ((VAL + RECIPIENT *))]
    [(AND (PP VON)
          (RESTR PERSON INSTITUTION))
     ---->
     ((VAL + SOURCE *))]))
Fig. 1
SSL entry for 'bekommen', word-sense 1
('to get an object')
When the parser analyses the 
sentence 'Hans bekommt von dieser Frau ein 
Buch.' (John gets a book from this woman.) 
there are three constituents on the 
constituent list. 
Interpretation of the first pair 
of the entry for bekommen-1 leads to the
instantiation of the root concept OBJTRANS 
(RS: (IND OBJTRANS)) and the creation of 
the value OBJECT filled by the 
representation of book. 
The parameter '+' refers to the 
root individual for all pairs of the 
word-sense entry. For the parser the 
parameter '*' in the SSL refers to the 
representation of the constituent selected 
by the LS which is local to one pair. 
The second pair leads to the 
instantiation of the value RECIPIENT 
filled by the representation for 'Hans' 
and the third one finally instantiates 
SOURCE filled by the representation of 
'Frau'. The resulting representation of 
the sentence is shown in Fig. 2. 
[Diagram not recoverable; legible link labels: object, source, recipient, name]
Fig. 2
Net structure for
'Hans bekommt von dieser Frau ein Buch.'
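
The parser's use of the entry in Fig. 1 can be sketched as follows. This is a simplified Python illustration under several assumptions: the dictionary encoding of LSs, the names `matches` and `parse_with_entry`, the individual name PERSON-10, and the rule that every pair of the entry must find a constituent are all choices made for the sketch, not properties of VIE-LANG.

```python
# Each pair is (LS, RS). An LS holds syntactic features and an optional RESTR set;
# an RS is a list of actions, with '+' (root) and '*' (selected constituent) implicit.
BEKOMMEN_1 = [
    ({"case": "ACC", "restr": {"OWNABLE-OBJECT"}},
     [("IND", "OBJTRANS"), ("VAL", "OBJECT")]),
    ({"case": "NOM"}, [("VAL", "RECIPIENT")]),
    ({"pp": "VON", "restr": {"PERSON", "INSTITUTION"}}, [("VAL", "SOURCE")]),
]


def matches(ls, constituent):
    """Interpret an LS as a condition on one constituent."""
    for key in ("case", "pp"):
        if key in ls and constituent.get(key) != ls[key]:
            return False
    restr = ls.get("restr")
    return restr is None or bool(restr & constituent["classes"])


def parse_with_entry(entry, constituents):
    """Build the root individual for a word-sense, or return None if an LS fails."""
    root = {}
    remaining = list(constituents)
    for ls, rs in entry:
        selected = next((c for c in remaining if matches(ls, c)), None)
        if selected is None:
            return None
        remaining.remove(selected)
        for action in rs:
            if action[0] == "IND":
                root["concept"] = action[1]               # instantiate the root concept
            elif action[0] == "VAL":
                root[action[1]] = selected["individual"]  # fill the role with '*'
    return root


# Constituent list for 'Hans bekommt von dieser Frau ein Buch.'
constituents = [
    {"case": "NOM", "individual": "PERSON-9", "classes": {"PERSON"}},
    {"pp": "VON", "individual": "PERSON-10", "classes": {"PERSON"}},
    {"case": "ACC", "individual": "BOOK-4", "classes": {"BOOK", "OWNABLE-OBJECT"}},
]
print(parse_with_entry(BEKOMMEN_1, constituents))
```

Each LS selects one constituent, and the RS of the same pair links that constituent's representation to the root individual, so the three pairs yield an OBJTRANS individual with OBJECT, RECIPIENT and SOURCE filled.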
Action primitives typically have 
an AGENT and an OBJECT attribute. In most 
cases their surface equivalents are 
subject and direct object respectively. 
Therefore it would be redundant to include 
these relations for every verb. In these 
cases only the root concept is given in 
the RS (see Fig. 3). The mapping is 
carried out by default mechanisms which 
are applied whenever the LSs do not refer 
to subject or direct object. 
(ESSEN
  (1
    [(T) --> ((IND INGEST))]))
Fig. 3
SSL entry for 'essen', word-sense 1
('to eat')
In the default cases the 
selectional restrictions are checked 
implicitly. The net does not allow 
instantiation of structures that do not 
correspond to the patterns given in the 
generic concepts. If this is attempted, e.g. for
the sentence 'He will eat his hat.', an
error message is generated because the
semantic concept for 'hat', GARMENT, is 
not compatible with the restriction 
SUBSTANCE for the OBJECT of the concept 
INGEST. At this stage of development we 
do not loosen selectional restrictions as 
suggested by Wilks's preference semantics
[12].
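
The default mapping and the implicit check of selectional restrictions can be sketched together. The following Python fragment is an illustration only: the tables, the superconcept lists, the restriction ANIMATE for AGENT, and the names `link_by_default` and `RestrictionError` are assumptions; only the incompatibility of GARMENT with the restriction SUBSTANCE for the OBJECT of INGEST is taken from the text.

```python
# Hypothetical generic-layer fragment: value restrictions for the roles of INGEST
# and the superconcepts of a few concepts.
VALUE_RESTRICTION = {"INGEST": {"AGENT": "ANIMATE", "OBJECT": "SUBSTANCE"}}
SUPERCONCEPTS = {
    "APPLE": ["FOOD", "SUBSTANCE"],
    "GARMENT": ["PHYSICAL-OBJECT"],
    "PERSON": ["ANIMATE"],
}
DEFAULT_ROLE = {"subject": "AGENT", "direct_object": "OBJECT"}


class RestrictionError(Exception):
    """Raised when a filler violates the value restriction of a role."""


def is_a(concept, restriction):
    return concept == restriction or restriction in SUPERCONCEPTS.get(concept, [])


def link_by_default(root_concept, constituents):
    """Map subject and direct object onto AGENT and OBJECT, checking restrictions."""
    individual = {"concept": root_concept}
    for function, concept in constituents.items():
        role = DEFAULT_ROLE[function]
        restriction = VALUE_RESTRICTION[root_concept][role]
        if not is_a(concept, restriction):
            raise RestrictionError(
                f"{concept} is not compatible with {restriction} "
                f"for {role} of {root_concept}"
            )
        individual[role] = concept
    return individual


print(link_by_default("INGEST", {"subject": "PERSON", "direct_object": "APPLE"}))
try:
    link_by_default("INGEST", {"subject": "PERSON", "direct_object": "GARMENT"})
except RestrictionError as err:
    print("error:", err)
```

No restriction appears in the SSL entry for 'essen'; the check comes entirely from the generic concept, which is why it is called implicit.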
2. Generator 
When generating a sentence, the 
generator starts by regarding the root 
node which has been passed to it by the 
dialogue component. Normally, this root 
node will, together with the attributes 
attached to it, correspond to a verb, so 
this verb is selected first. As mentioned 
above, a discrimination net is used to 
accomplish this task. The DN selects a 
verb-sense according to the attributes of 
the root node. 
We will show the further 
processing by means of the example shown 
in Fig. 2. Let us assume the verb-sense 1 
of the verb 'bekommen' (Fig. 1) has
already been selected. The entries of the 
SSL are treated from right to left by the 
generator, so we start with (IND 
OBJTRANS). This will result in a null 
action for the generator, as an instance 
of OBJTRANS (OBJTRANS-11) is already known
as current root node and it has been put 
as first element onto the stack for the 
current individual. (VAL + OBJECT *) is 
considered next. + denotes the root node, 
* the individual attached to that role of
it which is specified by the second 
parameter, i.e. OBJECT. This element, 
namely BOOK-4, is put on the stack. 
Now the generator proceeds with 
the LS: (CASE ACC) is a recursive call to 
the generator with the current individual, 
BOOK-4, as new root node together with the 
information that the result shall bear 
accusative case endings. The generator 
processes the DN for the concept 'BOOK' 
and returns 'Buch'. This lexeme together 
with the case information now forms part 
of the IMS. After having processed the 
current individual BOOK-4, it is removed 
from the stack. The action (RESTR
OWNABLE-OBJECT) results in a no-op for the
generator, as this information has already
been processed in the DN when deciding to
use the verb-sense 'bekommen-1' (see
below). 
The second RS-LS-pair is treated 
in a similar way: The individual attached 
to RECIPIENT is put on the stack, (CASE 
NOM) calls the generator with PERSON-9 as 
new root node and says that the resultant 
structure shall be rendered as a 
nominative. The DN of PERSON supplies the 
information that persons are best 
specified by their names (if present in 
the net - if not, other criteria are 
considered), and so the word 'Hans'
completes the structure being passed to 
the IMS. 
As for the last 
test-action-pair, PP causes a
prepositional phrase, 'von der Frau', to
be created. In German, the preposition
'von' implies dative case, so no
additional entry (CASE DAT) is required in 
the SSL. (Note that this omission enables 
the parser to ignore case errors in the 
input sentence that do not influence the 
semantics.) 
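
The right-to-left evaluation by the generator can be sketched in the same style. This Python fragment is a simplification under stated assumptions: the net contents (including PERSON-10 and the concept WOMAN), the lexeme table, and the function `word_for`, which stands in for the recursive generator call and the DN of the filler concept, are hypothetical; the IMS is reduced to a list of dictionaries.

```python
# Hypothetical episodic-layer fragment for the example sentence.
NET = {
    "OBJTRANS-11": {"concept": "OBJTRANS", "OBJECT": "BOOK-4",
                    "RECIPIENT": "PERSON-9", "SOURCE": "PERSON-10"},
    "BOOK-4": {"concept": "BOOK"},
    "PERSON-9": {"concept": "PERSON", "NAME": "Hans"},
    "PERSON-10": {"concept": "WOMAN"},
}
LEXEME = {"BOOK": "Buch", "WOMAN": "Frau"}   # stand-in for the DNs of these concepts
PREP_CASE = {"VON": "DAT"}                   # 'von' implies dative case

# Pairs are (LS, RS), as in Fig. 1.
BEKOMMEN_1 = [
    ([("CASE", "ACC"), ("RESTR", "OWNABLE-OBJECT")],
     [("IND", "OBJTRANS"), ("VAL", "OBJECT")]),
    ([("CASE", "NOM")], [("VAL", "RECIPIENT")]),
    ([("PP", "VON"), ("RESTR", "PERSON", "INSTITUTION")], [("VAL", "SOURCE")]),
]


def word_for(individual):
    """Stand-in for the recursive call: prefer a name, else the concept's lexeme."""
    node = NET[individual]
    return node.get("NAME") or LEXEME[node["concept"]]


def generate(entry, root, verb):
    stack = [root]                 # the root node is the first current individual
    ims = [{"verb": verb}]
    for ls, rs in entry:
        pushed = False
        for action in rs:          # right side first
            if action[0] == "VAL":
                stack.append(NET[root][action[1]])   # '*': filler of that role of '+'
                pushed = True
            # IND is a null action: the root individual is already known
        for feature in ls:         # then the left side
            if feature[0] == "CASE":
                ims.append({"word": word_for(stack[-1]), "case": feature[1]})
            elif feature[0] == "PP":
                ims.append({"word": word_for(stack[-1]),
                            "prep": feature[1].lower(),
                            "case": PREP_CASE[feature[1]]})
            # RESTR is a no-op here: it was already used when choosing the verb-sense
        if pushed:
            stack.pop()            # done with the current individual
    return ims


for item in generate(BEKOMMEN_1, "OBJTRANS-11", "bekommen"):
    print(item)
```

The same three pairs that guided the parser now yield one IMS element each; linearization and morphological synthesis (step two) are outside the scope of this sketch.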
3. Creating Discrimination Nets
So far, the use of the SSL has 
been demonstrated only partially: in the 
example above some of the elements in RSs 
and LSs have been treated as no-ops, 
especially IND and RESTR. These
elements, instead of being used in the
process of generation, provide information
for building data structures for the
generator, namely the above mentioned DNs.
As an example, consider the
entry for 'bekommen' (Fig. 1): (IND
OBJTRANS) informs us about a
correspondence between the concept
OBJTRANS and the verb-sense 'bekommen-1'.
This correspondence leads to the
incorporation of 'bekommen-1' as a leaf
node in the DN for the concept OBJTRANS.
Other clues for constructing the DNs are 
provided by the VALs, thus giving them a 
double usage: (VAL + RECIPIENT *) in the 
SSL entry for 'bekommen-1' (Fig. 1)
implies that an individual attached to the 
RECIPIENT role of an OBJTRANS individual 
is a prerequisite for selecting this 
verb-sense. (The absence of a recipient 
in the net would lead to the selection of 
'weggeben' (to give away).) 
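
How DNs might be derived from SSL entries can be sketched as follows. This Python fragment is a strong simplification: the DN is flattened into an ordered list of tests per concept, every VAL is treated as a prerequisite, and the entry shown for 'weggeben' is hypothetical. The names `build_dn` and `select_verb_sense` are assumptions as well.

```python
# Right sides of SSL entries, reduced to their IND and VAL actions.
SSL = {
    "bekommen-1": [("IND", "OBJTRANS"), ("VAL", "OBJECT"),
                   ("VAL", "RECIPIENT"), ("VAL", "SOURCE")],
    # Hypothetical entry, for illustration only.
    "weggeben": [("IND", "OBJTRANS"), ("VAL", "OBJECT"), ("VAL", "SOURCE")],
}


def build_dn(ssl):
    """Collect, per root concept, the roles each verb-sense requires to be filled."""
    dn = {}
    for sense, actions in ssl.items():
        root = next(a[1] for a in actions if a[0] == "IND")
        required = frozenset(a[1] for a in actions if a[0] == "VAL")
        dn.setdefault(root, []).append((required, sense))
    for leaves in dn.values():
        # Try the most demanding verb-sense first.
        leaves.sort(key=lambda leaf: len(leaf[0]), reverse=True)
    return dn


def select_verb_sense(dn, individual):
    """Return the first verb-sense whose required roles are all filled."""
    filled = {role for role in individual if role != "concept"}
    for required, sense in dn[individual["concept"]]:
        if required <= filled:
            return sense
    return None


dn = build_dn(SSL)
with_recipient = {"concept": "OBJTRANS", "OBJECT": "BOOK-4",
                  "RECIPIENT": "PERSON-9", "SOURCE": "PERSON-10"}
without_recipient = {"concept": "OBJTRANS", "OBJECT": "BOOK-4",
                     "SOURCE": "PERSON-10"}
print(select_verb_sense(dn, with_recipient))     # bekommen-1
print(select_verb_sense(dn, without_recipient))  # weggeben
```

In this sketch the IND action determines under which concept a verb-sense is filed, and the VAL actions supply the tests, which mirrors the double usage of VALs described above.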
III SUMMARY 
We have shown how a lexicon that 
includes syntactic and semantic 
information has to be structured to allow 
efficient use by two processes, parser and 
generator. Whereas both must have access 
to knowledge about syntax as well as 
representation, their starting position 
differs: The parser is confronted with 
surface expressions, therefore LSs are 
evaluated first. The generator has to 
process net structures, so it begins by 
evaluating RSs. The reciprocal relation 
between analysis and synthesis is realized 
in the SSL by pairing off LSs and RSs. 
Flexibility is ensured by the fact that
parser as well as generator treat LS and 
RS each in an idiosyncratic way. 
ACKNOWLEDGEMENTS 
This research was sponsored by the 
Austrian 'Fonds zur Foerderung der 
wissenschaftlichen Forschung', grant no 
4158 (supervision Robert Trappl). 
REFERENCES 
[1] Boguraev B.K.: Automatic Resolution
of Linguistic Ambiguities, Univ. of
Cambridge, Comp. Laboratory, TR-11;
1979.
[2] Brachman R.J.: A Structural
Paradigm for Representing Knowledge,
Bolt, Beranek and Newman; 1978.
[3] Buchberger E., Steinacker I., Trappl
R., Trost H., Leinfellner E.:
VIE-LANG - A German Language
Understanding System, in: Trappl
R. (ed.), Cybernetics and Systems
Research, North-Holland, Amsterdam;
1982.
[4] Fillmore C.: The Case for Case, in:
Bach E., Harms R.T. (eds.),
Universals in Linguistic Theory, Holt,
Rinehart & Winston, New York; 1968.
[5] Goldman N.M.: Computer Generation of
Natural Language from a Deep
Conceptual Base, Stanford AI Lab Memo
AIM-247; 1974.
[6] Horacek H.: Generierung im System
VIE-LANG: Linguistischer Teil,
TR 83-04, Dept. of Medical
Cybernetics, Univ. of Vienna,
Austria; 1983.
[7] Riesbeck C.K., Schank R.C.:
Comprehension by Computer:
Expectation-based Analysis of
Sentences in Context, Yale Univ.,
RR-78; 1976.
[8] Schank R.C.: Conceptual Information
Processing, North-Holland, Amsterdam;
1975.
[9] Steinacker I., Trost H., Leinfellner
E.: Disambiguation in German, in:
Trappl R. (ed.), Cybernetics and
Systems Research, North-Holland,
Amsterdam; 1982.
[10] Steinacker I., Trost H.: Structural
Relations - A Case Against Case,
Proceedings of the IJCAI 83,
Karlsruhe; 1983.
[11] Trost H.: Erstellen der inhaltlichen
Komponenten eines Semantischen
Netzes, TR 81-03, Dept. of Medical
Cybernetics, Univ. of Vienna,
Austria; 1983.
[12] Wilks Y.: Making Preferences more
Active, University of Edinburgh,
D.A.I., RR-32; 1977.
