AN EXPERIMENT ON SYNTHESIS OF RUSSIAN PARAMETRIC CONSTRUCTIONS
I.S. Kononenko, E.L. Pershina 
AI Laboratory, Computing Center, 
Siberian Branch of the USSR Ac. Sci., 
Novosibirsk 630090, USSR 
ABSTRACT 
The paper describes an experimental 
model of syntactic structure generation 
starting from the limited fragment of se- 
mantics that deals with the quantitative 
values of object parameters. To present 
the input information the basic semantic 
units of four types are proposed: "object",
"parameter", "function" and "constant". 
For the syntactic structure representation 
the system of syntactic components is used 
that combines the properties of the depen- 
dency and constituent systems: the syntac- 
tic components corresponding to wordforms 
and exocentric constituents are introduced 
and two basic subordinate relations ("ac- 
tant" and "attributive") are claimed to be 
necessary. Special attention has been
devoted to problems of complex
correspondence between the semantic units and
lexical-syntactic means. In the process of
synthesis such sections of the model as
the lexicon, the syntactic structure
generation rules, the set of syntactic
restrictions and morphological operators are
used to generate a sizable subset of
Russian parametric constructions.
1 INTRODUCTION
The semantics of Russian parametric 
constructions deals with the quantitative 
values of object parameters. The parametric
information is more or less easily
explicated by means of basic semantic units
of four types: "object" ('table', 'boy'), 
"parameter" ('weight', 'length', 'age'), 
"function" ('more', 'equal', 'almost equal') 
and "constant" ('two meters', 'from 3 to 5 
years'). 
In simple situations each of these 
units is separately realized in a lexeme 
or a phrase, their combinations forming 
full expressions with the given sense: 
malchik vesit bolshe dvadcati kilogrammov 
'boy weighs more than twenty kilograms'.
It is precisely these direct and simple 
means of expressions that are usually used 
in systems generating natural language texts. 
Natural languages, however, operate 
with more complex means of expression;
one-to-one correspondence between semantic 
units and lexical items is not always the 
case. We propose to explain the complex
situations in terms of decomposition of the
input semantic representation (cf. the notion
of form-reduction in Bergelson and Kibrik
(1980)). This phenomenon is exemplified by
such Russian lexemes as stometrovka
'hundred-meters-long-distance', which
semantically incorporates the four constituents
of the parametric semantics.
As an ideal, a language model should 
embrace mechanisms that provide generation 
and understanding of the constructions 
that make use of the various possibilities 
of lexicalization and grammaticalization 
of sense. The presented model deals with 
some aspects of the phenomena that have
not been considered before: all the
possibilities of decomposition of the input
information are taken into account and the
means of syntactic structure representa- 
tion are developed to provide the synthe- 
sis of the parametric syntactic structure. 
The paper is organized as follows. 
In section 2 the set of semantic components 
is described. In section 3 the relevant 
syntactic notions are introduced. In sec- 
tion 4 the process of synthesis is outlin- 
ed, followed by conclusions in section 5. 
2 SEMANTIC COMPONENTS
The information to-be-communicated is 
represented as a set of four semantic 
units each of them being marked with the 
type-symbol (o - "object", p - "parameter", 
f - "function", c - "constant"). 
At the initial step of synthesis a 
process involving the decomposition of the 
input semantic structure into a system of 
semantic components takes place. Usually, 
a semantic structure corresponds to seve- 
ral decompositions. The forming of a com- 
ponent may be motivated by the following 
reasons. 
In the event of separate lexicalization a component represents exactly one
semantic unit. There are four components 
of this kind according to the number of 
unit types. So, the object component Ko
represents a unit of the "object" type and 
is realized in a noun (dom 'house') or a 
possessive adjective (papin 'father's'). 
The parameter component Kp is lexicalized 
in parametric nouns, verbs and particip- 
les. The function component Kf is realiz- 
ed in lexemes of different syntactic clas- 
ses: prepositions, comparative verbs and 
participles and forms of comparative de- 
gree of some adjectives and adverbs. The 
constant component Kc corresponds to
measure adjectives and some quantitative
constructions described in Kononenko et al. (1980).
A component represents more than one 
semantic unit in two situations. 
(1) The first one has been mentioned 
above. It concerns the phenomenon of in- 
corporation of several units in one lexe- 
me: thus, the component Kopfc is intro- 
duced to account for the lexemes like sto- 
metrovka and Kpf component is a proto- 
type of parametric-comparative adverbs 
like shire 'wider'. 
(2) On the other hand, the introduc- 
tion of a component may be connected with 
the fact that a certain unit is not lexi- 
calized at all. Such "reduced" elements of 
sense are considered to be realized on the 
surface by the type of the syntactic struc- 
ture composed of the lexicalized units of 
the component. For example, in Russian ap- 
proximative constructions litrov pjat 
'about-five-liters' it is only the "cons- 
tant" unit that is lexicalized and the 
unit of the "function" type ('almost equal')
is expressed by purely syntactic means,
i.e. the inverted word-order in the quan- 
titative phrase. The corresponding compo- 
nent represents both the "function" and 
"constant" units. 
3 SYNTACTIC STRUCTURES 
The syntactic structures of Russian
parametric constructions are quite varied.
The full system of rules (Kononenko
and Pershina, 1982) provides the
generation of nominal phrases and simple
sentences, but the structures within the
complex sentence such as komnata, dlina
kotorojj ravna pjati metram 'room whose length
is five meters' are left out of account.
So, the model allows for the following ex- 
amples: shestiletnijj malchik 'six-years- 
old boy'; bashnja vysotojj bolee sta metrov 
'tower of more than hundred meters height'; 
kniga stoit pjat rublejj 'book costs five 
roubles' etc. 
To represent the syntactic structures,
the system of syntactic components suggested
in Narinyani (1978), which combines the
properties of the dependency and constituent
systems, proved to be useful. Two
different types of syntactic components,
the elementary and non-elementary ones,
are claimed to be necessary. The elementary
component corresponds to a wordform
and is traditionally represented by a le- 
xeme symbol marked with syntactic and mor- 
phological features. 
The non-elementary component is com- 
posed of syntactically related elementary 
components. The outer syntactic relations 
of the non-elementary component cannot be 
described in terms of syntactic and mor- 
phological characteristics of the consti- 
tuent elementary components. The notion of 
a non-elementary component is a convenient 
tool for describing the syntactic behavi- 
our of Russian quantitative constructions 
composed of a noun and a numeral: the mor- 
phological features of the subject quanti- 
tative phrase (nominative, plural) are not 
equivalent to those of the nominal consti- 
tuent (genitive, singular). 
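The distinction between the two component types can be illustrated with a small data-structure sketch. The class names, the feature labels and the example phrase dva metra 'two meters' are illustrative assumptions and are not taken from the model itself; the point is only that the non-elementary component carries outer features of its own.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Elementary:
    """A wordform: a lexeme symbol marked with its own features."""
    lexeme: str
    features: Tuple[str, ...]


@dataclass(frozen=True)
class NonElementary:
    """Syntactically related wordforms with separate outer features."""
    parts: Tuple[Elementary, ...]
    features: Tuple[str, ...]


# Hypothetical example: dva metra 'two meters' used as a subject.
numeral = Elementary("dva", ("Num", "nom"))
noun = Elementary("metr", ("S", "gen", "sg"))
phrase = NonElementary((numeral, noun), ("nom", "pl"))

# The outer features of the phrase differ from those of its nominal constituent.
print(phrase.features, noun.features[1:])
```

Keeping the outer features on the phrase object, rather than deriving them from the noun, mirrors the claim above that they cannot be described in terms of the constituents' characteristics.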
The minimal syntactic structure that 
is not equal to a wordform is described 
in terms of a syntagm, i.e. a bipartite 
pattern in which syntactic components are 
connected by an actant or attributive syn- 
tactic relation. Each component is marked 
with the relevant syntactic and morpholo- 
gical features. 
The actant relation holds within the
pattern in which the predicate component X
governs the form of the actant component
Y, e.g.: shirina [X] ehkrana [Y] 'width
of-screen', where the governing lexeme shirina
determines the genitive of the noun-actant.
The attributive relation connects the
component X with its syntactic modifier,
or attribute, Y. The attributive syntagm
is typically composed of a noun and an
adjective (stometrovaja [Y] vysota [X]
'one-hundred-meters height'), a noun and a
participle, a noun and another noun, a verb
and an adverb or a preposition.
The syntactic relation is represented
by an "act" or "attr" arrow leading from X
to Y.
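A syntagm as defined above can be sketched as a bipartite record with a labelled relation. The class and method names are hypothetical, and the class features assigned to the example wordforms are assumptions made for illustration; governed morphological features such as the genitive of the actant are deliberately left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Component:
    """A syntactic component: a wordform with its syntactic class features."""
    lexeme: str
    features: Tuple[str, ...]


@dataclass(frozen=True)
class Syntagm:
    """A bipartite pattern: X is linked to Y by an 'act' or 'attr' relation."""
    x: Component   # predicate (act) or modified component (attr)
    y: Component   # actant (act) or attribute (attr)
    relation: str

    def __post_init__(self):
        if self.relation not in ("act", "attr"):
            raise ValueError(f"unknown relation: {self.relation}")

    def show(self) -> str:
        # The arrow leads from X to Y, as in the notation described above.
        return f"{self.x.lexeme} [X] --{self.relation}--> {self.y.lexeme} [Y]"


# shirina [X] ehkrana [Y] 'width of-screen'
width_of_screen = Syntagm(
    Component("shirina", ("Sparam",)),
    Component("ehkrana", ("Sobj",)),
    "act",
)
# stometrovaja [Y] vysota [X] 'one-hundred-meters height'
hundred_meter_height = Syntagm(
    Component("vysota", ("Sparam",)),
    Component("stometrovaja", ("Ameas",)),
    "attr",
)

print(width_of_screen.show())
print(hundred_meter_height.show())
```

The record stores X and Y by role rather than by surface position, so the linear order of the wordforms is left to a later stage.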
The syntactic class features reflect
the combinatorial properties of the
components in the constructions under
consideration. The following are some examples of
the syntactic features:
"Sobj" - object nouns (dom 'house')
"Sparam" - parametric nouns (ves 'weight')
"Aposs" - possessive adjectives (papin 'father's')
"Vparam" - parametric verbs (stoit 'to-cost')
"Pparam" - parametric participles (vesjashhijj 'weighing')
"Ameas" - measure adjectives (pjatiletnijj 'five-years-old')
The syntactic structure does not con- 
tain any syntactically motivated morpholo- 
gical features connected with government 
or agreement (the latter are described se- 
parately in the morphological operators 
section of the model). The case of the 
noun used as attribute is reflected in the 
syntactic structure representation since 
this feature is relevant in distinguish- 
ing syntagms. 
4 STRUCTURE GENERATION
The first step of synthesis is the
decomposition of the input semantic
representation into the set of semantic
components. The possibilities of lexicalization
of components are determined by the
lexicon that provides every lexeme with its
semantic prototype - the set of semantic
units incorporated in the meaning of the
lexeme. The lexicalization rules replace
the semantic components by the concrete
lexemes, e.g.: 'weight' [Kp] is replaced
by the lexemes ves [Sparam], vesit [Vparam]
or vesjashhijj [Pparam].
The semantic types of components
determine their combinatorial properties on
the syntactic level. The grammar is
developed as the set of rules each of which
provides all the syntagms realizing the
initial pair of components.
For example, the pair {Ko, Kp}
corresponds to six syntagms:
(a) Aposs attr Sparam: papin ves 'father's weight'
(b) Sobj attr Sparam,gen: ehkran shiriny 'screen of-width (gen)'
(c) Sobj attr Sparam,instr: bashnja vysotojj 'tower of height (instr.)'
(d) Sobj attr Pparam: kniga stojashhaja 'book costing'
(e) Sobj act Vparam: malchik vesit 'boy weighs'
(f) Sobj act Sparam: vysota doma 'height of-house'
The rules applicable to different
fragments of the same decomposition are
bound with the syntagmatic restrictions
that prevent the unacceptable combinations
of syntagms. Thus the combination of the
syntagm (c) for {Ko, Kp} and the adjective
lexicalization of the "constant" component
forms the unacceptable syntactic structure
*ehkran pjatimetrovojj shirinojj 'screen
of 5-meters-long width (instr)'.
The process of synthesis yields all
the possible syntactic structures
corresponding to the input semantic
representation.
5 CONCLUSION
In this report on the basis of the 
very limited data of the parametric const- 
ructions an attempt has been made to con- 
sider a simplified model of synthesis of 
the text expression beginning from the gi- 
ven semantic representation. The scheme 
presented above is planned to be implement- 
ed within the framework of the question- 
answering system. 
Right from the start of synthesis the
process of decomposition of the input
semantics takes place in order to capture
different cases of complex correspondence
between the semantic units and the
lexical-syntactic means. To generate a
sizable subset of Russian parametric
constructions such sections of the language
model as the lexicon, the grammar
generating the syntactic structures, the
set of syntactic restrictions and
morphological operators are used. The listed
constituents, however, do not exhaust all
the necessary mechanisms of synthesis,
since the problems of word-order are left
to be investigated and an additional
reference to various aspects of the
communicative setting is required. We believe that,
being of primary importance for automatic
synthesis of natural language texts, the
communicative aspect of text generation
presents one of the most promising research
directions for future activity.
6 REFERENCES 
Bergelson, M.B.; Kibrik, A.E., 1980. 
"Towards the General Theory of Language 
Reduction". In: Formal Description of
Natural Language Structure. pp. 147-161. Novosibirsk (in Russian). 
Kononenko, I.S.; Krasnova, V.A.; Pershina,
E.L., 1980. The Structure of Russian
Quantitative Constructions. Preprint
No. 237. Novosibirsk (in Russian).
Kononenko, I.S.; Pershina, E.L., 1982. A Model Generating Syntactic Structures
of Some Russian Parametric Constructions. 
In: Formal Representation of Linguistic 
Information. pp. 103-122. Novosibirsk (in Russian). 
Narinyani, A.S., 1978. Formal Model:
General Scheme and Choice of Adequate Means.
Preprint No. 107. Novosibirsk (in Russian).
