ABSTRACT 
My paper deals with the theoretical and practical
questions of the co-textual analysis of texts.
The theoretical frame is contained in the chapters
titled Introduction and Conclusion. My aim here is to
show that the automation of text analysis requires
the accumulated experience of general linguistics
and documentation theory to be summed up in one
coherent theory. Linguistics has to examine the problems
of documentational thesauri, abstracting and indexing from
its own point of view, and has to insert these means and
methods of documentation into the system of text analysis.
The practical questions, certain concrete phases of
the co-textual analysis of texts, are dealt with in
chapters 2 and 3. In connection with certain steps we hope
that our demonstrative examples contain solutions that will
necessarily become simpler later on. Our aim was, first
of all, the presentation of the complete process on a
special type of text. Not all parts of this complete
process can be automated yet, but the presentation is
always made with respect to automation.
The next step should be the setting up of an analysing
system built on the thesauristic elaboration of a
relatively small body of material. Then we have to solve
those special problems of certain phases of the analysis
that could not even be mentioned in this flow-chart-like
survey. Besides and after the theoretical analysis, only
this can be the test of the usefulness of the system.
1.
INTRODUCTION 
1.1 Before analysing the problems of the technical
means and of a method for text analysis,
I should like to touch on a more general question. I wish
to give a short sketch of the connection between the method
to be presented and the problems of the modern theory of
grammar and the theory of documentation.
To see the connections with the theory of grammar
more clearly, let us start from Figure 1.
Every message is an element of a net of connections
/context/ determined by time, geographical place, cultural
environment, etc. Linguistic analysis can set itself only
the aim of discovering the inner /co-textual/ linguistic
structure of the text with a claim to completeness.
/It can approach the contextual connections only to
the extent made possible by the structure of the
dictionary used in the analysis./
The co-textual analysis --and the immediate
constituent and sentence analysis that are organic parts
of it-- can be done in different ways.
Figure 1 shows the components of Chomsky's
generative theory. Chomsky's generative process /G/ starts
from the 'base component' /B/. The deep structure of the
sentence /DS/ comes into existence from the sentence base
by filling it up with lexical elements from the 'lexicon'
/L/. The surface structure /SS/ comes into existence by
applying different kinds of transformations in the
'transformational component' /T/. /The placing of the
lexicon and the transformational component in Figure 1 is
meant to show that we may need to turn to the lexicon even
while carrying out the transformations./ The interpretive
components proper /IDS and ISS/ provide the
surface and deep structure with phonetic and semantic
interpretations. These interpretations, to a certain
extent, already relate to extralinguistic reality.
Besides the phonological interpretation of the surface
structure, the phonetic interpretation also contains the
basic elements necessary for the actual pronunciation of
the sentence /PHTR = phonetic representation/. /The
elements originating in the speaker's subjective
interpretation either settle on this or colour it./ The
semantic interpretation --besides establishing the semantic
character of the immediate constituents-- tends towards
the discovery of the language-independent logical
connections hidden behind the given verbal representation
/LOSR = logical semantic representation versus LISR =
linguistic semantic representation/.

Figure 1. /Diagram not reproduced; legible label:
'Contextual relations'./
A part of the investigations within generative
linguistic theory, like examinations of other kinds,
raises the possibility of generating sentences
in another way, too. 1 This generative process /G'/ starts
from the universal logical construction /LOSR/. /The
universal logical construction is a net of connections
containing 'notion-words'. 2/ We have to assign a linguistic
structure /DS/ --reflecting the characteristics of the
given language-- to this construction, and make a
concrete sentence of it /SS/.
The way of sentence analysis /A/ runs in the
opposite direction to both. At the same time, the
analysis sketched here is built on the theoretical basis
that has developed during the investigation of
the problems of the generative processes /G and G'/
summarized above. But, as text analysis goes beyond the
boundaries of the sentence, we have to widen the
sentence-centered theoretical basis as well. 3
1.2 When analysing the notion 'text-structure' we can
distinguish between the linguistic and the
sound-textural components of the text. Both can be
linearly and hierarchically patterned.
The 'linear patterning' is a net of the recurrences
of elements that interweave the whole text. The
'hierarchical patterning' means the way in which the text
as a whole is built up from the basic units of the
structure through the levels of composition units of
different complexity. 4
Of course there are texts where the sound-textural
component is the organizer of the structure, but as it
is not my intention to deal with this now, I shall ignore
their problems at present.
A large part of the studies dealing with text analysis
consists in making lists of structural elements of
similar construction. /Such work mostly remains within
sentence boundaries, so the use of a sentence-centered
theory is enough./ These lists undoubtedly capture
essential characteristics of a text, but they reflect only
one aspect of the text structure. This is what I called
'linear patterning'.
The other --and, from a certain point of view, more
important-- aspect is the 'hierarchical patterning' of
the text. Its analysis raises many more problems.
The problems differ according to the types
of texts. Even if we consider only the most homogeneous
texts --scientific didactic prose, or prose told
only in the third person and containing no indirect
speech-- even then we meet the following basic problem:
grammatical /syntactic, semantic/ connections can be
discovered only among the sentences constituting the
so-called 'paragraphs'. /At the same time, no one has yet
established the types of paragraph, nor has anyone
described its rules as a system./ Among text-units larger
than 'paragraphs' only such connections can be shown as
are carried by the 'content structure'. Thus, besides
examining the grammatical construction of the
'paragraphs', we also have to look for means that help us
to discover connections of this kind.
1.3 In this field we can get the greatest help from
documentation theory. During the last ten years
documentalists have worked out means and methods that will
probably prove useful in the analysis of non-scientific
texts as well.
I should like to mention the thesaurus as the most
significant means. The thesaurus --as is known-- is a
notion-dictionary serving the normalization of the indexing
and searching language, which provides the different
connections among the notions. It is this feature that,
I think, makes this means very useful to linguistics
for its own purposes, too.
Of the methods, those of abstracting and
automatic indexing may become significant in our analysis.
In text analysis there is one 'great obstacle': the
texts are too extensive. To make the
analysis easier and the hierarchical structure clear-cut,
it is both useful and necessary to replace the larger
structure-units by their abstracts. When speaking of
abstracting --though abstracts made by statistical methods
may be a good help temporarily-- I do not think of
statistical abstracts alone.
In broad outline, these aspects of documentation
theory form the wider frame in which certain questions
of the following analysing method can be examined.
2. 
ON THE MEANS OF THE CO-TEXTUAL ANALYSIS OF TEXTS 
In my opinion, the following means are necessary 
for the co-textual analysis of texts: 
1. a thesaurus including different sectors;
2. a rule-system working on sentences:
   a/ phonological, morphonological rules,
   b/ syntactic rules,
   c/ semantic rules
      for the linguistic semantic interpretation,
      for the logical semantic interpretation;
3. a system of the syntactic-semantic rules of the 
basic composition units; 
4. a rule-system of an abstracting process; 
5. a rule-system of a process that is able to 
establish the thematic connections of a 'text' 
consisting of abstracts. 
In this chapter I want to deal in detail only with the
questions of the structure of the thesaurus.
The working of all the rule-systems is based on the
thesaurus structure sketched below.
2.1 The thesaurus 
The base of the linguistic analysis has to be a
thesaurus that unites the structure of the thesaurus
made for the purposes of documentation with that of the
lexicon developed in the course of general linguistic
investigations.
Such a thesaurus consists of two main parts: the 
sector of definitions and that of classifications. 
2.1.1 On the sector of definitions 
The entry of a headword in the documentational thesaurus
--combining the structural characteristics of the more
important thesauri 5-- contains the following constituents:
           headword /descriptor/
SY         synonyms
EQ         equivalent terms
TR         translation
DEF        definition
SN         scope note
SF         semantic factors are
ISF        is semantic factor of
FIELD      field, thematic group
CAT        category /material, feature, process/
BT         broader terms
   BT-LOG  broader terms - logical
   BT-WH   broader terms - whole
   BT-CON  broader terms - connected
           /terms to which the headword is connected
           but not as a logical broader term or a
           whole/
NT         narrower terms
   NT-LOG
   NT-PT
   NT-CON
COL        collateral terms
   COL-LOG
   COL-PT
   COL-CON
ASC        associated terms
   ASCR    associated reflexive
   ASCT    associated to
   ASCF    associated from
EC         empirically connected
Let us demonstrate these constituents by the 
thesaurus-entry of two concrete words: 
madár /bird/
   constituents:  DEF, ST, ISF, FIELD, BT-LOG, -WH, NT-LOG,
                  -PT, -CON, COL-LOG, EC, ASCR, ASCT, ASCF
   values:        'poultry', 'migratory bird', ...
                  'animals'
                  'vertebrate'
                  'living being'
                  'singing-birds', 'birds-of-prey'
                  'beak', 'wing'
                  'migratory birds'
                  'mammals', 'reptilia'
                  'nest', 'air', 'tree', 'water'
                  'flying'
                  'song', 'chirping', 'shrieking'

tenger /sea/
   constituents:  DEF, ST, ISF, FIELD, BT-LOG, -WH, NT-LOG,
                  -PT, -CON, COL-LOG, -PT, ASCR, ASCT, ASCF
   values:        'ocean'
                  'seaside', 'sea level'
                  'still waters'
                  'Earth'
                  'North Sea', 'East Sea'
                  'bay', 'sweet water', 'salt water'
                  'lake', 'tarn'
                  'land'
                  'coast', 'island', 'harbour', 'ship', 'buoy',
                  'lighthouse'
                  'infinitude'
                  'waves', 'storm'
                  'gulls'

/We shall deal with DEF --that contains SF and CAT--
separately later./
We did not meet TR and SN in these examples, though
in certain cases they may be necessary in non-technical
texts as well, e.g. with foreign words and with
geographical names related to the given language:
archipelagus /archipelago/
           TR 'insular world'
Duna
           SN 'the main river of Central Europe
              /and Hungary/'
Stockholm
           SN 'the capital of Sweden'
The DEF /the lexical definition/ has to contain
the following informations, on the basis of the results of
linguistic theoretical research referring to a lexical
unit /LU/: 6
PHM          phonological matrix
SYI          syntactic informations
   USC       the undifferentiated syntactic
             category of the given LU
   USC-CH    the undifferentiated syntactic
             category-chain marking the micro-
             syntactic structure of the given LU
   SYMS      the ordered sets of syntactic markers
             of the given LU
MOI          morphological informations
SEI          the ordered sets of semantic markers
             of the given LU
COI          informations referring to the syntactic
             semantic conditions and consequences
             of the use of the given LU
OTH          other informations: the origin, the
             stylistic value, etc. of the given LU
Thus the DEF of a lexical unit is of the following
structure:
érzés /feeling/
PHM          /é//r//z//é//s/
SYI  USC     N
     USC-CH  V+n
     SYMS    +common
             -count
             ±abstract
MOI
SEI          ... or mental awareness
COI          /V: +animate X/
OTH          ......

kísér /'he' accompanies/
PHM          /k//í//s//é//r/
SYI  USC     V
     USC-CH  -
     SYMS    +transitive
MOI
SEI          go with;
             happen or do
             at the same time
COI          /X .../
OTH          ......
Nevertheless, in the thesaurus we have to find not
only the entries of the lexical units but those of the
grammatical morphemes /GM/ as well. 7
The structure of the DEF of the grammatical
morphemes is simpler, though it shows features analogous
to the DEF of lexical units:
PHM          phonological matrix
SYI  USC     the undifferentiated syntactic
             category of the LU that the given GM
             can join
SEI          the grammatical meaning of the given GM
COI          informations referring to the possible
             immediate context of the given GM
OTH          informations referring to the rules
             that realize the morphonological
             changes that are connected with
             the use of the given GM
Thus the DEF of a grammatical morpheme is of the
following structure:
-e
PHM          /e/
SYI  USC
SEI          possessive personal suffix,
             3. person singular
COI          root or formative suffix /N/ __ suff.
OTH          ......

-t
PHM          /t/
SYI  USC
SEI          sign of 'past'
COI          root or formative suffix /V/ __ suff.
OTH          ......
I have dealt above with the construction of the entries
of lexical units and grammatical morphemes in the
thesaurus. I should like to add the following to what
I have said so far.
1. By dividing the entry of the thesaurus into a
lexical definition /LDEF/ and a thesauristic information
block /a thesauristic definition: TDEF/, verbally
manifested knowledge can be separated from so-called
encyclopedic knowledge.
2. The syntactic informations of the lexical
definitions naturally contain only the informations
relevant in the given language.
3. The semantic informations are demonstrated here
only with a lexical definition. Of course this is not the
form that I find suitable; the suitable form would be
definitions written in a semantic meta-language.
4. The thesauristic informations are the results of
a multihierarchical classification. For expressing them
we have to elaborate the thesaurus's own 'notional
language'.
5. The information FIELD points to the groups and
fields into which the notion represented by the given
word is classified. The group and field structure has to
unite the virtues of the thematic classifications of the
linguistic thesauri /Roget, Dornseiff/, the documentation
thesauri and the illustrated dictionaries. But its
realization can be built only on a classification-
theoretical basis.
6. To reach a satisfactory degree
of division of the informations BT, NT and COL we have
to analyse them further.
7. When defining the informations ASC we --of
course-- think only of the minimal informations that can
be defined with a relatively high probability. We wanted to
mark the indefiniteness of the definition by separating
it from the others with a dotted line.
2.1.2 On the sector of classifications 
The sector of classifications first of all has to
contain an alphabetic index that includes all the roots
of words and grammatical morphemes, with reference to the
form by which they can be found in the sector of
definitions.
/Besides the basic alphabetic index we may construct
an a-tergo list and a morphematic KWIC list as well. The
latter reflects the connections SF and ISF./
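
The difference between the basic alphabetic index and an a-tergo list can be illustrated with a few lines of Python. This is only a sketch under assumptions: the function names are hypothetical, the word list is taken from the sample text analysed in chapter 3, and plain code-point comparison stands in for proper Hungarian collation.

```python
def alphabetic_index(forms):
    """Return the distinct forms sorted from the beginning of the word."""
    return sorted(set(forms), key=str.lower)


def a_tergo_index(forms):
    """Return the distinct forms sorted by their reversed spelling.

    Forms sharing an ending (for instance the same suffix) end up
    next to each other, which is the point of an a-tergo list.
    Assumption: code-point order is used, not Hungarian collation.
    """
    return sorted(set(forms), key=lambda form: form.lower()[::-1])


if __name__ == "__main__":
    roots = ["hajó", "folyó", "ember", "tenger"]
    print(alphabetic_index(roots))
    print(a_tergo_index(roots))
```

In the a-tergo ordering 'ember' and 'tenger' become neighbours, as do 'hajó' and 'folyó', because each pair shares its final letters.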
Different special classifications can be made on
the basis of the lexical and the thesauristic informations.
Lexical classifications:
   The classes of lexical units having the same
      syntactic categories,
      semantic markers,
      syntactic semantic conditions,
      ......
   The classes of
      derivative endings,
      suffixes,
      ...... of the same character.
Thesauristic classifications:
   Fields of the elements having the
   same FIELD marker.
   Different hierarchical and/or associative nets.
We can provide the classifications with
identifiers, and we can assign the identifiers to
the thesaurus entries.
Thus the structure of a thesaurus-unit is as follows:
L-DEF /lexical definition/
T-DEF /thesauristic definition/
IDENT /identifiers/
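
The three-part thesaurus-unit can be pictured as a small record type. The following Python sketch is illustrative only: the class and attribute names are hypothetical, the relation values and classification identifiers are made-up sample data, and only a subset of the constituents listed in 2.1.1 is modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LexicalDefinition:
    """L-DEF: a reduced version of the lexical definition."""
    phm: str = ""                                   # phonological matrix
    usc: str = ""                                   # undifferentiated syntactic category
    syms: List[str] = field(default_factory=list)   # syntactic markers
    sei: str = ""                                   # semantic informations
    coi: str = ""                                   # conditions of use


@dataclass
class ThesauristicDefinition:
    """T-DEF: thematic field plus relations keyed by their code."""
    thematic_field: str = ""
    relations: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class ThesaurusUnit:
    headword: str
    l_def: LexicalDefinition
    t_def: ThesauristicDefinition
    ident: List[str] = field(default_factory=list)  # classification identifiers

    def related(self, code: str) -> List[str]:
        """Return the terms stored under a relation code such as 'NT-LOG'."""
        return self.t_def.relations.get(code, [])


def units_with_identifier(units, identifier):
    """List the headwords that carry a given classification identifier."""
    return [unit.headword for unit in units if identifier in unit.ident]


if __name__ == "__main__":
    # Hypothetical sample data, not the author's actual entries.
    tenger = ThesaurusUnit(
        headword="tenger",
        l_def=LexicalDefinition(usc="N"),
        t_def=ThesauristicDefinition(relations={"NT-LOG": ["North Sea"]}),
        ident=["class:N"],
    )
    print(tenger.related("NT-LOG"), units_with_identifier([tenger], "class:N"))
```

Keeping the identifiers on the unit itself is one possible reading of "assigning the identifiers to the thesaurus entries"; a separate table from identifier to entries would serve equally well.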
2.2 As I have already mentioned, I do not wish to deal
here with the other means of the co-textual text
analysis. The character of their structure can be sensed
from the presentation of the way of analysing. Their real
elaboration can be carried out only on the basis of a
profound analysis of the mutual influences of all the
means.
3.
ON THE METHOD OF THE CO-TEXTUAL ANALYSIS OF TEXTS 
3.0 The algorithm of the analysis 
The process of the co-textual analysis of texts
consists of the following larger units:
1. the segmentation of the text,
2. the discovery of the grammatical structure of
the individual sentences,
3. the compiling of special text-thesauri,
4. the discovery of the net of connections of the
communication units, the establishment of the
basic composition units,
5. making the abstracts of the basic composition
units,
6. the thematic analysis of the text consisting
of abstracts --in several steps.
3.1 The segmentation of the text
The notion of 'text' is interpreted on the basis
of extra-linguistic criteria. The text is always given to
us. The 'whole' is obvious for typographical reasons,
or --mainly in the case of spoken texts-- the beginning and
the end are determined in another way.
For illustrating the method of analysis we have
chosen a written, 'simple narrative text', one that
contains only communications in the third person and,
possibly, direct speech in the first person besides. The
boundaries of the segments of the primary classification
--as we have a written text-- are marked by the author's
punctuation marks.
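
A primary segmentation driven only by the author's punctuation marks can be sketched as follows. This is a minimal Python illustration under assumptions: the function name is hypothetical, only sentence-final marks /. ? !/ followed by white space are treated as boundaries, and the finer division into units such as 4.1 and 4.2 is not attempted.

```python
import re


def segment_sentences(text):
    """Split a text after '.', '?' or '!' and number the segments from 1.

    Runs of marks such as '...' stay attached to the segment they close,
    because the split only happens at the white space that follows them.
    """
    parts = re.split(r"(?<=[.?!])\s+", text.strip())
    return [(number, part) for number, part in enumerate(parts, start=1) if part]


if __name__ == "__main__":
    sample = "De tenger felett is láttam sirályokat. Csak az övék? Több is."
    for number, sentence in segment_sentences(sample):
        print(number, sentence)
```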
The text chosen for the demonstration and its informative
English equivalent are as follows. /The larger line-spaces
mark the boundaries of the paragraphs set by the
author./
      Madarak a tenger felett
         Birds above the sea

1.    A Duna hídján, a korlátnál, emberek állnak, felnőttek
      és gyermekek vegyesen.
         There are people standing on the bridge of the
         Danube, at the rail, adults as well as children.
2.    Nézik a folyó felett hintázó s a vízre le-leszálló
      sirályokat.
         They are looking at the gulls that are swinging
         above the river and flying down to the water from
         time to time.
3.    A mi Duna-szakaszunkon is gyakran látom ezeket a víz
      felett lebegő korallpiros csőrű madarakat.
         I often see these coralline-beaked birds hovering
         above the water at our reach of the Danube as well.
4.1   Néha magasra felcsapnak,
         Sometimes they dart up high
4.2   s a szemem elveszti őket a kékségben vagy a ködben.
         and my eyes lose them in the azure or in the mist.
5.    De tenger felett is láttam sirályokat.
         But I have seen gulls above the sea, too.
6.1   Akkor már téliesre fordult az idő,
         Then the weather had already turned winter-like,
6.2   sárgán világítottak a sziget nyírfái
         the birches of the island shone yellow
6.3   s az Északi-tenger haragosan csapkodta gránit-partjait.
         and the North Sea dashed against its granite coasts
         with anger.
7.1   Hajóm finn kikötőből indult el
         My ship had started from a Finnish harbour
7.2   s a svéd archipelaguson haladt Stockholm irányában.
         and was heading for Stockholm through the Swedish
         archipelago.
8.1   Amerre haladtunk, gyakran felbukkant egy-egy sötét
      gránitsziget,
         Wherever we passed, dark granite islands often
         appeared,
8.2   rajta fenyő és nyír,
         with pines and birches,
8.3   fenyő és nyír.
         pines and birches on them.
9.1   Magányos világítótorony emelkedett némelyiken,
         There were lonely lighthouses towering on some of
         them,
9.2   s el kellett hinnem, hogy ott ember él.
         and I had to believe that there was a man living
         there.
10.   Ember, aki talán lélekben is magányos, mert azzá
      nevelte az Északi-tenger nagy mélabúja.
         A man, perhaps lonely in soul too, because that is
         what he was made by the great melancholy of the
         North Sea.
11.1  Süvöltött a fedélzeten a szél,
         The wind was wuthering on the deck,
11.2  s a hullámok fenyegetően csapkodták a hajó oldalát.
         and the waves threateningly dashed against the side
         of the ship.
12.1  Már télikabát volt rajtam,
         I had my winter-coat on already,
12.2  s egy ideig kinn maradtam a fedélzeten.
         and I stayed out on the deck for a time.
13.1  "My native land --good night" --küldtem Byron
      szavaival a búcsúszót az eltűnt finn partok felé,
         "My native land --good night" I sent farewell to
         the vanished Finnish coasts with Byron's words,
13.2  pedig nem is szülőföldemtől búcsúztam.
         though it was not my native land I said farewell to.
14.   De olyan otthonos érzésem volt ott, hogy azt éreztem:
      ha nem otthon, itt tudnék élni, s talán boldogan.
         But I had such a homely feeling there, that I felt:
         if not at home, I could live here, and maybe
         happily.
15.1  Sohasem láttam Itáliát,
         I have never seen Italy,
15.2  mindig északra húzott a vágyam...
         I have always been drawn to the North by my
         desire...
16.1  Sirályok szegődtek a hajó nyomába
         Gulls joined the track of the ship
16.2  s én azokat néztem.
         and I was looking at them.
17.1  Nagyobbak, mint a dunaiak:
         They were bigger than those at the Danube:
17.2  tengeri sirályok.
         seagulls.
18.1  Olykor egészen a fedélzetig s a kéményig emelkedtek,
         Sometimes they rose quite up to the deck and the
         funnel,
18.2  majd leszálltak a hullámok fölé,
         then flew down above the waves,
18.3  s a felcsapó hideg hab meg-megfürdette őket.
         and the splashing cold foam bathed them again and
         again.
19.   Mintha uszálya lettek volna a hajónak, úgy kísérték.
         They accompanied the ship as if they had been a
         tow-boat to it.
20.   Ritmus volt a lengésükben és lebegésükben.
         There was rhythm in their swaying and hovering.
21.1  Néha felcsaptak a fejem fölé,
         Sometimes they darted up above my head,
21.2  bizonyosan azt várták, hogy megetessem őket.
         they probably expected me to feed them.
22.1  Különös kép lehettem:
         I must have been a strange picture:
22.2  egyetlen fekete utas a fedélzeten,
         one single black passenger on the deck,
22.3  s körülöttem a vijjogó, rikácsoló, követelődző
      tajtékfehér sirályok.
         and screeching, crying, peremptory, spray-white
         gulls around me.
23.   Ha nem kísértek volna, kibírhatatlanul nyomasztó lett
      volna az a zord és viharos tengeri komorság.
         If they had not accompanied me, that severe and
         stormy sea mournfulness would have been unbearably
         depressing.
24.1  Milyen jó, hogy kísérnek,
         How good it is, that they are accompanying,
24.2  milyen jó, hogy ezt rikácsolják: "A mi országunk a
      tenger.
         how good it is, that they are crying: "Our country
         is the sea.
25.   Sirályország..."
         The country of gulls..."
26.   Csak az övék?
         Only theirs?
27.   Egyszerre panaszos, magas madárhang ütötte meg a
      fülemet.
         Suddenly a plaintive, high bird-voice struck my
         ears.
28.   Több is.
         More.
29.   Olyan hang volt, amilyent ősszel hallok, mikor
      nyírfáimra száll egy fájdalmasan síró kismadár-csapat.
         It was a voice that I hear in autumn when a group
         of painfully crying little birds flies on my
         birches.
30.   Csupa i, csupa i, de olyan szomorú i, hogy az ember
      szíve megfájdul ...
         All i-s, all i-s, but such sad i-s, that one's
         heart begins to ache.
31.   A hajó felett szállt át a síró madárkák csapata.
         The group of the crying birdies was flying over
         the ship.
32.   Álmélkodva néztem utánuk.
         I was looking after them amazed.
33.   Apró madarak a tenger felett északon?
         Tiny birds above the sea in the North?
34.   Költözőmadarak volnának?
         Would they be migratory birds?
35.   Tudom, a fecske is, a sárgarigó is, a gyurgyalag is,
      a fülemüle is vállalja a nagy utat, ha ösztöne
      megszólal.
         I know, the swallow, the yellow-bird, the
         bee-eater, the nightingale all shoulder the journey
         when their instinct calls them.
36.   De ezek talán csak vonuló madarak.
         But perhaps these are only moving birds.
37.   S talán a svéd szigetvilág magyarázza meg nekem hogy
      miért találkozom itt velük.
         And perhaps it is the Swedish insular world that
         explains to me why I meet them here.
38.   Szigetről szigetre vándorolnának?
         Would they move from island to island?
39.   S azért merik vállalni az utat a haragosan hullámzó
      tenger felett, mert a szigetek nincsenek nagy
      távolságra egymástól?
         And do they dare to shoulder the journey above the
         angrily waving sea because the islands are not very
         far from each other?
40.1  A sirályoknak ételmaradékot dobnak ki a tengerészek
         The sailors throw out scrapings to the gulls
40.2  s azok mohón, szinte verekedve falják.
         and they wolf them eagerly, nearly fighting.
41.   De ki eteti meg a tengernek ezeket az apró madarait?
         But who feeds these tiny birds of the sea?
42.   Csodálatos ez a tengeri utat vállaló vakmerőség.
         Wonderful is this recklessness that shoulders the
         journey above the sea.
43.1  Biztató
         Hopeful
43.2  és felemelő.
         and elevating.
44.   Bátorító ..., akiben olyan szomorúság van, mint
      bennem.
         Encouraging for one who has sorrow inside like me.
45.1  Mert egy drága halottat viszek magamban fel-felüvöltő
      hullámok között az Északi-tengeren,
         Because I am carrying a dear deceased in myself
         amongst the yelling waves on the North Sea,
45.2  s néha úgy sír bennem valami, mint azok az apró drága
      madarak sírtak a fejem felett...
         and sometimes something cries in me just as those
         tiny dear birds cried above my head...
3.2 The discovery of the grammatical structure of the
individual sentences
The discovery of the sentence-structure happens in
the following steps. /The examples show only the result
of the analysis./
1. The morphematic segmentation of the sentences
/its morphonological re-interpretation/:
... Duna-szakaszunkon ... /at our reach of the Danube/
Duna - szakasz + unk + on
/the Danube of reach our at/
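
The segmentation of a word-form into root and suffixes can be sketched as a recursive suffix-stripping search over the thesaurus entries. The Python below is only an illustration under assumptions: the function name is hypothetical, the tiny root and suffix inventories stand in for a real thesaurus, and the morphonological re-interpretation /e.g. stem alternations/ is not handled.

```python
def segment(word, roots, suffixes):
    """Return [root, suffix, ...] for a word-form, or None if no analysis exists.

    A suffix is stripped from the right end only when the remainder can
    itself be analysed, so wrong guesses are undone by backtracking.
    """
    if word in roots:
        return [word]
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            rest = segment(word[: -len(suffix)], roots, suffixes)
            if rest is not None:
                return rest + [suffix]
    return None


if __name__ == "__main__":
    # Hypothetical miniature inventories taken from the sample sentence.
    roots = {"szakasz", "lát", "ez"}
    suffixes = {"unk", "on", "om", "ek", "et"}
    print(" + ".join(segment("szakaszunkon", roots, suffixes)))
    print(" + ".join(segment("látom", roots, suffixes)))
```

A word-form for which the function returns None corresponds to the case, mentioned in the next step, of a word missing from the thesaurus.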
2. The substitution of lexical /thesauristic/
entries for concrete words and grammatical morphemes

A     mi     Duna - szakasz + unk  + on     is        gyakran   lát + om
T     Pr     N      N         suff   suff   C         Adv       V     suff
      poss                    poss   N                                V
      pl.1                    pl.1   loc                              s.1
/the  our    Danube of reach  our    at     as well   often     see   I

ez  + ek   + et     a     víz     felett   lebeg + ő      korall + piros
Pr    suff   suff   T     N       Postp    V       der.   N        A
                                                   adj
      pl.    acc
these               the   water   above    hover   -ing   coralline

csőr + ű      madar + ak   + at
N      der.   N       suff   suff
       adj            N      N
                      pl.    acc
beaked        birds /
The lexical entries are indicated by the concrete
chains of letters and the syntactic categories written
under them.
During the substitution it can happen that there
is a word in the text which is missing from our thesaurus.
Its lexical entry has to be made and inserted into
the thesaurus.
In parallel with the substitution we can make the
list of 'word forms' of the analysed text.
3. The discovery of the syntactic relations among
the elements of the given surface structure:
//A mi Duna-szakaszunkon/.../AdvP 1 + /is/C + /gyakran/AdvP 2
//látom/V + //ezeket/Det + /a víz felett lebegő/AdjP 1 +
+ /korallpiros csőrű/AdjP 2 ...
It seems necessary to mark the constituents of the
'topic-comment' relation in the surface structure.
These marks have to be preserved in the deep structure
as well.
4. The discovery of the 'deep structure' belonging
to the given surface structure -- incomplete in many
cases. 9
The deep structure is demonstrated by Figure 2.
5. The linguistic semantic interpretation of the
deep structure.
In the linguistic semantic interpretation the
semantic character of the immediate
constituents is defined on the basis of the syntactic and
semantic markers of the basic word and of the suffixes
and/or postpositions.
E.g.   -on        Duna-szakasz            nyár
       ......     ......                  ......
       +place     +place                  ......
       +time      ......                  +time
       ......     ......                  ......

       Duna-szakaszon                     nyáron
       /adv. of place:                    /adv. of time:
       at the reach of the Danube/        in Summer/

In the analysed sentence:
AdvP 1         adverb of place
AdvP 2         adverb of time
Det /NP/       deictic pronoun /one pointing out
               of the sentence/
AdjP 1 /st/    local attribute
AdjP 2 /el/    qualitative attribute
The analysed sentence does not contain semantic
incompatibility.
6. The logical semantic interpretation of the deep
structure.
The analysed sentence expresses the following
relations:
/1/  where: reach of the Danube
     when:  often
     P:     sees
     sy:    I                       sg: these birds
/2/  P:     belongs to
     sy:    reach of the Danube     sg: we
/3/  where: above the water
     P:     hovers
     sg:    birds
/4/  P:     has
     sy:    birds                   sg: beak
/5/  P:     is
     sy:    beak                    coralline

Figure 2. /Deep structure of the analysed sentence;
diagram not reproduced./

Notes:
Roughly this succession is the way from ... through
ISS to ... on Figure 1.
The steps from 3 to 6 happened on the basis of the
lexical /thesauristic/ informations.
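
How step 5 can rest on lexical informations alone may be illustrated by matching the semantic markers of a suffix against those of the basic word, as in the '-on' examples above. The Python sketch below is an assumption-laden illustration: the function name, the marker sets and the mapping from markers to adverb types are hypothetical simplifications of the ordered marker sets of the thesaurus.

```python
# Hypothetical mapping from a shared semantic marker to the resulting reading.
ADVERB_TYPE = {"+place": "adv. of place", "+time": "adv. of time"}


def interpret(stem_markers, suffix_markers):
    """Return the readings licensed by markers common to stem and suffix.

    An empty result signals a semantic incompatibility between the
    basic word and the suffix.
    """
    common = set(stem_markers) & set(suffix_markers)
    return sorted(ADVERB_TYPE[marker] for marker in common if marker in ADVERB_TYPE)


if __name__ == "__main__":
    suffix_on = {"+place", "+time"}            # markers of the suffix -on
    print(interpret({"+place"}, suffix_on))    # Duna-szakasz + on
    print(interpret({"+time"}, suffix_on))     # nyár + on
```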
The 'conditions' informations /COI/ of the substituted
lexical units mean a certain prediction referring to the
analysis both inside and outside the sentences. For
example, the information on the compulsory complements of
the verbs makes it possible to look algorithmically for
the compulsory, the possible and the non-complement-like
adverbs.
The logical semantic interpretation of the deep
structure makes it possible, on the one hand, to establish
the synonymity of sentences and, on the other, to discover
the net of thematic connections of the text.
3.3 The compiling of special text-thesauri
After finishing the analysis of the sentences we
compile different kinds of special text-thesauri.
It is at this point that the whole interpretation of
the linear patterning of the text takes place on the
level of sentences. I have dealt with its problems in
detail in another place. Here I should like to mention
only the things that are necessary for the further
investigation of the analysing process.
3.3.1 First of all we make the index of the word-forms of
the whole text. After each word-form this index
gives the numbers of the sentences in which the given
word-form occurs. We shall not present this index
completely here; we shall just give the list of the roots
of nouns, verbs and adjectives that occur more than once.
/Considering the shortness of the text, a word occurring
twice can be relevant as well./
The underlined numbers in the list mark the 'implicit'
occurrences of the given words. /We speak about implicit
occurrences when the given word is represented by
pronouns, verbal endings or demonstrative pronouns. Their
identification with the proper words has to be done
already at the semantic interpretation of the sentences
--and has to be marked by a special code in the
continuously compiled text-dictionary./
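
The index of word-forms with sentence numbers can be sketched as a simple inverted index. The Python below is a minimal illustration under assumptions: the function names are hypothetical, tokens are plain runs of letters, and only explicit word-forms are counted; reducing forms to roots and recording implicit occurrences would need the morphological and semantic analysis described above.

```python
import re
from collections import defaultdict


def build_index(sentences):
    """Map every word-form to the list of sentence numbers where it occurs.

    `sentences` is a list of (number, text) pairs; a form that occurs
    twice in one sentence is listed twice, as in the printed index.
    """
    index = defaultdict(list)
    for number, text in sentences:
        for token in re.findall(r"\w+", text.lower()):
            index[token].append(number)
    return dict(index)


def recurring(index, minimum=2):
    """Keep only the forms that occur at least `minimum` times."""
    return {form: numbers for form, numbers in sorted(index.items())
            if len(numbers) >= minimum}


if __name__ == "__main__":
    sample = [
        (2, "Nézik a sirályokat."),
        (5, "De tenger felett is láttam sirályokat."),
        (8, "rajta fenyő és nyír, fenyő és nyír."),
    ]
    for form, numbers in recurring(build_index(sample)).items():
        print(form, *numbers)
```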
root /gloss/                  category   sentence numbers
apró /tiny/                   A          33 41 45
csap /dash/                   V          4 6 11 18 21
drága /dear/                  A          45 45
Duna /Danube/                 N          1 3 17 18
ember /man/                   N          1 2 9 10 30
emelkedik /rise, tower/       V          9 18
él /live/                     V          9 14
észak /North/                 N          6 10 15 33 45
fedélzet /deck/               N          11 12 22
fej /head/                    N          21 45
finn /Finnish/                A          ...
fenyő /pine/                  N          8 8
gránit /granite/              N          6 8 8
hajó /ship/                   N          7 11 16 19 31
halad /go/                    V          7 8
haragos /anger/               A          6 39
hullám /wave/                 N          11 18 39 45
"i"                           N          30 30 30
jó /good/                     A          24 24
kísér /accompany/             V          19 23 24
lát /see/                     V          3 5 15
lett /be/                     V          19 23
madár /bird/                  N          3 27 29 31 32 33 34 36 37
                                         41 45
magas /high/                  A          ...
nagy /big/                    A          10 35 39
néz /look/                    V          2 16 32
nyír /birch/                  N          6 8 8 29
part /coast/                  N          6 13 14 14
rikácsol /cry/                V          22 24
sirály /gull/                 N          2 3 4 4 5 16 17 18 18 18 19
                                         20 21 21 21 22 23 24 24 25
sír /cry/                     V          ...
svéd /Swedish/                A          7 37
száll /fly/                   V          2 18 29 31
sziget /island/               N          6 8 8 9 9 37 38 38 39
szomorú /sad/                 A          30 44
tenger /sea/                  N          5 6 10 17 23 24 33 39 40 41
                                         42 45
tél /winter/                  N          6 12
út /journey/                  N          22 35 39 42
vakmerőség /recklessness/     N          42 43 44
vállal /shoulder/             V          35 39 42
víz /water/                   N          2 3
On the basis of this list --especially in the case of
shorter texts-- we can also compile the list of the
logical relations in which the most frequently occurring
nouns take part.
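The index of word-forms described in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `build_index` and `repeated` are hypothetical, English glosses stand in for the Hungarian roots, and the flags marking implicit occurrences are invented toy values, since the underlining of the original list is not reproduced here.

```python
from collections import defaultdict

def build_index(sentences):
    """Map each word-form to its occurrences as (sentence number, is_implicit).

    `sentences` maps a sentence number to a list of (word_form, is_implicit)
    pairs; is_implicit marks an occurrence that was recovered from a pronoun
    or a verbal ending during the semantic interpretation.
    """
    index = defaultdict(list)
    for number in sorted(sentences):
        for word_form, is_implicit in sentences[number]:
            index[word_form].append((number, is_implicit))
    return dict(index)

def repeated(index, minimum=2):
    """Keep only the word-forms that occur at least `minimum` times."""
    return {word: occ for word, occ in index.items() if len(occ) >= minimum}

# Toy data (hypothetical annotation of three sentences).
sentences = {
    2: [("gull", False), ("look", False)],
    3: [("gull", True), ("see", False)],
    5: [("gull", False), ("see", False), ("sea", False)],
}
index = build_index(sentences)
frequent = repeated(index)
```

With a threshold of two, only 'gull' and 'see' remain in `frequent`, which mirrors the decision to list the roots that occur more than once.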
3.3.2 Making a list of the conjugated verb-forms, personal
pronouns, possessive pronouns and nouns with possessive
personal suffixes, which reflect the 'communicational net'
of the text, is an important analysing device.
Let us see for example the list of the words
referring to the first person:
 3.  mi /our/, Dunaszakaszunkon /at our reach of the Danube/
     V/Present/: látom /I see/
 4.  szemem /my eyes/
 5.  V/Past/: láttam /I saw/
 7.  hajóm /my ship/
 8.  V/Past/: haladtunk /we went/
 9.  V/Past/: hinnem kellett /I had to believe/
12.  rajtam /on me/
     V/Past/: maradtam /I remained/
13.  szülőföldemtől /from my native land/
     V/Past/: küldtem /I sent/, búcsúztam /I said farewell/
14.  érzésem /my feeling/
     V/Present/: tudnék /I could/
     V/Past/: éreztem /I felt/
15.  vágyam /my desire/
     V/Past/: láttam /I saw/
16.  V/Past/: néztem /I looked at/
21.  fejem /my head/
     V/Present/: megetessem /for me to feed/
22.  körülöttem /around me/
     V/Past/: lehettem /I must have been/
23.  engem /me/
24.  engem /me/
27.  fülemet /my ears/
29.  nyírfáimra /on my birches/
     V/Present/: hallok /I hear/
32.  V/Past/: néztem /I looked at/
35.  V/Present/: tudom /I know/
37.  nekem /for me/
44.  bennem /in me/
45.  magamban /in myself/, bennem /in me/
     V/Present/: viszek /I bring/
Making a list of the words referring to the time
and place of the narrated event is a help as well.
Let us see here the list of the temporal 'connecting
elements':
akkor /then/               6
egyszerre /suddenly/       27
gyakran /often/            3 8
majd /later/               18
már /already/              6 12
mikor /when/               29
mindig /always/            15
néha /sometimes/           4 21 45
olykor /sometimes/         18
sohasem /never/            15
egy ideig /for a time/     12
ősszel /in autumn/         29
Some of the special thesauri have to be made in every case;
the making of others can become necessary because of the
information accumulated in the process of analysis.
3.4 The discovery of the net of connections of the
communication units
3.4.1 After the analysis of the sentences and the
compilation of the special text-thesauri we have
to discover the structure of the basic composition units.
Let us see what is at our disposal at the beginning
of this process.
We have
- the 'trans-segmentated' text;
/On the basis of the sentence-analysis we can,
on the one hand, separate smaller units in certain
sentences of the author --we have written the text
in this form in the pages 15-19-- and, on the other,
in certain cases we may contract two sentences into
one unit./ The units gained by the trans-segmentation
are no longer called sentences but communication units.
- the system of connections --discovered during
the semantic interpretation-- among the separate
communication units;
- the list of predictions determined by the
predicates of the communication units and not
realized in the given communication unit;
- the special text-thesauri.
3.4.2 Let us examine here the system of the connections
discovered among the communication units.
The list below uses the following conventions:
X/Y      X is a constituent ordered immediately
         under Y /if Y is the dominant sentence
         itself, we never write it out/,
X/Z/     Z is a piece of information that defines the
         character of the constituent X more
         precisely -- if it is a number, it
         stands for the communication unit that
         contains X as its constituent
         /after a semicolon we have the concrete
         lexical unit that keeps the communication
         units in question together/,
X :: Y   Y is the definition of X,
X = Y    X is the repetition of the constituent Y,
X : Y    the missing constituent X is identical
         to Y,
X -: Y   the constituent X refers to Y /stands
         for Y/,
         The underlined constituents on the left
         of the signs =, :, or -: represent the
         proper paraphrase standing on the right.
NP, VP, AdvP are immediate constituents,
NPo      the 'subject' is marked only by the
         verbal personal suffix,
         the 'subject' is marked only by the
         adjectival predicate without copula,
Pe       personal pronoun,
Po       possessive pronoun,
Pd       demonstrative pronoun,
Px       indefinite pronoun,
Det      the determinant of a NP,
D        demonstrative element.
The punctuation marks after the numbers that designate the
communication units represent the punctuation marks
that end the communication units immediately preceding
them.
2~_ ° MPo : NP/1. ;people/ 95--.1 |j • /Det/l~4~'+~-: NP/~ll~t7"- igulls/ 
4 . ~o : ~P/V~FII 
, D/~e/VP-: Z~/W/~/ 
Np- 
5. • 6.1 . 
.5 
. 
17 • a: , 
1 . 
II.I 
12.1 
e2 
14. -, . 
15.1 • 
.2 
16.1- . 
.2- 
17.1- . 
.2- : 
18,1- . 
.2- • 
19. 20. 
21.1 
• 2- p 
22.1 
.2 
2 :37 : 
25. \] ' 
26. 27. "; 
28. t " 29. . 
31. . ~2. "\] . 
HPo ; NPI7.1 ; my ship/ 
Adv/Pe/-; NP/8.1 ; granite islands/ 
NP = NP/8.2 ; pines and birches/ 
Adv/Px//VP-: NPIa.I/ 
Adv/D/-: NP/9.1 ;lighthouses/ 
NP = NPI9.2/ 
RP/VP : /13.1 ; mY native land/ 
Ad~/D/-: NP/VP/15.1 ; Finnish coasts/ 
NP/Pd//VP -: NPII6.1 ;gulls/ 
RP-- : NP/16.1/ 
~P- : ~/16.1/ 
KPo : NP/16.1/ 
RPo : ~T/16.1/ 
NP/Pe//VP-: NP/16.1/ 
HPo : NP/16.1/ 
Pos/N/A~vP-: NP/16,1/ NPo : NP/16.1/ 
NPo : NP/16.1/ 
NPolS/a~vPl : ~P122,3 ;gulls/ 
~olS/~l : NP122.3/ 
HPolS~PI : NP122.31 
NP- -: VP/S/NP/VP/24.2; sea/ 
: : Det/NPI27 ;bird-voice/ 
NP- : NPI27/ 
NP- : NPI271 
NP/Pel/VP-: NPI31 ; group of birdies/ 
3 .9-I 54.u , ? NPo : NPI53 ;tiny birds/ 
35. , ? 
56.q "J ..: NP/Pd/-: N~/~31 
~7.9 . NPIPe//~2-: ~ml~l 38.~ NPo : NP/~3/ 
39. -~ "~ I~o :-~/5~/ 
40:~j" ? NP/Pd/-: NP/VP/40 ;gulls/ 
41. 
42. ~ ? 
4~.lJ . NP- : NP/42 ; recklessness/ 2..~ ~P- : 
NPI421 
44" • ~- : ~P1421 
45. 
/We cannot find predictions in this text.
We presented some of the special text-thesauri
in 3.3./
3.4.3 As the compositional analysis starts with the
analysis of communication units, we build the
definitions of the higher units on the notion 'communication
unit' instead of the undefined notion 'paragraph'.
But we need some 'transitional notions', too:
What we call a continuous communication chain /CC/
is the succession of communication units with predicates of
first person in direct speech. /In this text we can see it
in the communication units 12.2 - 15.2./
What we call a simple block of communication units
/SB/ is the --mostly continuous-- chain of communication
units all of which contain one explicit or implicit
reference element pointing to the same 'reference'. This
reference element has to be replaced by the 'referred
element' if we want to give the semantic interpretation
of the communication unit.
What we call a complex block of communication units
/CB/ is a --mostly continuous-- chain of communication
units connected by several 'referred elements'.
/For example the communication units 16.1 - 21.2
form a SB, the communication units 8.1 - 10 form a CB./
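The notion of the simple block can be made concrete with a small sketch. It is a simplification, not the procedure of the paper: it finds only strictly continuous chains (the definition above allows 'mostly continuous' ones), the function name `simple_blocks` is hypothetical, and the per-unit referent annotation is assumed to have been produced already by the semantic interpretation.

```python
def simple_blocks(units):
    """Return (referent, [unit ids]) for every continuous chain of two or
    more communication units that point to the same referent.

    `units` is an ordered list of (unit_id, set_of_referents).
    """
    blocks = []
    open_runs = {}  # referent -> unit ids of the chain currently running
    for unit_id, referents in units:
        # A chain ends as soon as a unit no longer points to its referent.
        for referent in list(open_runs):
            if referent not in referents:
                run = open_runs.pop(referent)
                if len(run) > 1:
                    blocks.append((referent, run))
        for referent in referents:
            open_runs.setdefault(referent, []).append(unit_id)
    # Close the chains that are still running at the end of the text.
    for referent, run in open_runs.items():
        if len(run) > 1:
            blocks.append((referent, run))
    return blocks

# Excerpt with an assumed annotation of the referents.
units = [
    ("7.1", {"my ship"}), ("7.2", {"my ship"}),
    ("8.1", {"granite island"}), ("8.2", {"granite island"}),
    ("8.3", {"granite island"}),
    ("11.1", set()),
]
blocks = simple_blocks(units)
```

A unit without any referent, like the last one of the excerpt, stays outside every block and is a candidate for an 'orphan communication unit'.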
We define the basic units of the compositional
structure in the following way:
We shall call basic composition unit /or composition
unit of the first degree/ the structure unit that forms
one thematic unit and that arises immediately
from the communication units. /This always has to consist
of more than one communication unit. The 'orphan
communication unit' inserted between the composition units
forms a composition unit of the zero degree./
Generally: we shall call composition unit of the n-th
degree the structure unit that also forms one thematic
unit and that contains composition units of the /n-1/-th
degree as well.
3.4.4 Both the simple and the complex blocks of
communication units and the continuous communication
chains already indicate some kind of compositional
dissection. But the discovery of the compositional structure
demands a detailed thematic analysis.
In the thematic analysis the first fixed point is
given by the 'referential elements'.
When the referential elements hold only a few
communication units together, this mostly means a smallest
thematic unit, too.
Inside the longer simple blocks and in the parts of
the text outside the simple and complex blocks the
discovery of the thesauristic connections among the lexical
units may help us to separate the composition units.
The projection of the continuous communication chain
on the text can be a help of another kind.
/The thesauristic connections separate for example
the communication units 37-39 /'insular world' -- 'from
island to island' -- 'islands'/ as one independent
composition unit.
The communication chain helps at the compositional
segmentation of the text consisting of the communication
units 11.1-15.2: the simple block 13.1-14 is separated by
the referential element 'Finnish coast', and 12.1-12.2 is
separated from 11.1-11.2 by the Person I./
After all we can separate the following composition
units of the zero and first degree:
serial number   serial number    referential          character of the
of the comp.    of the comm.     elements             comp. unit
unit            units
 1.             1-2              people               SB
 2.             3-4.2            Pers I + gulls       SB
 3.             5                Pers I + gulls       orphan comm.u.
 4.             6.1-6.3          ?                    orphan comm.u.
 5.             7.1-7.2          my ship              SB
 6.             8.1-8.3          granite island       SB
 7.             9.1-10           lighthouse           CB
 8.             11.1-11.2        ?                    orphan comm.u.
 9.             12.1-12.2        ?                    the beginning of a CC
10.             13.1-14          Finnish coast        CB
11.             15.1-15.2        ?                    orphan comm.u.
12.             16.1-20          gulls + ship         SB
13.             21.1-24.1        gulls + Person I     a CC that keeps a not
                                                      continuous SB together
14.             24.2-25          gulls + sea          SB
15.             26               ?                    orphan comm.u.
16.             27-30            bird-voice           SB
17.             31-32            the group of birdies SB
18.             33-36            'birds'              a not continuous SB
                                                      including one orphan
                                                      comm.u.
19.             37-39            birds + islands      CB
20.             40.1-40.2        gulls                SB
21.             41               ?                    orphan comm.u.
22.             42-44            recklessness         SB
23.             45               ?                    orphan comm.u.
3.5 The process of making the abstracts of the basic
composition units
The making of the abstracts of the composition units
established in 3.4 takes place in several steps.
3.5.1 We take the composition units one by one and
reduce their deep structures.
This means that the constituents containing the most
frequent words and the ones expressing time are kept, while
the other constituents of completive character /that is AdvP
embedded in NP/S or NP/VP/S/ are omitted.
/Naturally, 'deep structure' means here, too, only the
incomplete deep structures containing the really existing
constituents of the original communication units./
Then we write out the surface representations one
by one according to the previously reduced deep structures.
As the result of this step we shall get the
following 'reduced text'. /Here we illustrate this process
only with the composition units 1-11./
1. There are people standing on the bridge of the Danube.
They are looking at the gulls.
2. I often see these birds at our reach of the Danube
as well.
Sometimes they dart up high.
3. But I have seen gulls above the sea, too.
4. Then the weather had already turned to winter-like.
The birches of the island shone yellow.
The North Sea dashed against its granite coasts with
anger.
5. My ship had started from a Finnish harbour and was
leaving for Stockholm.
6. Wherever we passed, there were often appearing some
granite islands.
There were pines and birches on it.
7. There were lonely lighthouses towering on some of them.
8. The wind was wuthering on the deck.
The waves threateningly dashed against the side of
the ship.
9. I stayed out on the deck for a time.
10. I sent farewell to the Finnish coasts.
I had such a homely feeling there.
11. I have never seen Italy.
I have always been drawn to the North by my desire.
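The reduction rule that produces such a reduced text can be sketched as a filter over the constituents of one deep structure. This is one possible reading of the rule, not a definitive implementation: the `Constituent` record, its field names and the sample constituents are hypothetical, and the frequent words are assumed to come from the index of word-forms.

```python
from dataclasses import dataclass

@dataclass
class Constituent:
    category: str           # e.g. "NP", "VP", "AdvP"
    embedded_in: str        # e.g. "S", "NP/S", "NP/VP/S"
    words: tuple
    temporal: bool = False  # True if the constituent expresses time

# Positions in which an AdvP counts as 'completive' in this sketch.
COMPLETIVE_POSITIONS = {"NP/S", "NP/VP/S"}

def reduce_structure(constituents, frequent_words):
    """Drop completive AdvP-s unless they are temporal or frequent."""
    kept = []
    for c in constituents:
        completive = c.category == "AdvP" and c.embedded_in in COMPLETIVE_POSITIONS
        has_frequent = any(word in frequent_words for word in c.words)
        if not completive or has_frequent or c.temporal:
            kept.append(c)
    return kept

# Hypothetical constituents of one communication unit.
constituents = [
    Constituent("NP", "S", ("gulls",)),
    Constituent("AdvP", "NP/S", ("slowly",)),
    Constituent("AdvP", "NP/VP/S", ("sea",)),
    Constituent("AdvP", "NP/S", ("then",), temporal=True),
]
kept = reduce_structure(constituents, frequent_words={"gulls", "sea"})
```

In this toy case only the AdvP containing 'slowly' is omitted; writing out the surface representation of the kept constituents remains a separate step.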
3.5.2 Before beginning the second step of making the
abstract we have to examine the orphan blocks
consisting of several communication units. We have to decide
whether they represent one or more composition units.
This is relatively easy in cases like that of the
composition unit 8. Its two communication units are
connected in their content by the explicit thesauristic
connection between the 'deck' and the 'side of the ship'
/both are parts of the 'ship'/. /We can find an
implicit connection as well: "the wind was wuthering",
"the waves threateningly dashed"; the question is to what
extent this can be made explicit./
The communication units of the composition unit 11
are connected in their form by the pair of adverbs
never - always.
The composition unit 4 raises more problems.
We can establish from the 'birches of the island' --on
the basis of the later occurrences of the 'island' and
the 'birches' /see the composition unit 6/-- that it
refers to the birches of sea islands. The 'granite island' --
'granite coast' connection also links the last two
communication units. The 'turned to winter-like' stands
only in a loose associative connection with the predicate
'dashed ... with anger', and it is not in any explicitly
demonstrable connection with the predicate 'shone yellow'.
Thus we could regard the second and third communication
units as an independent composition unit.
3.5.3 In the next step we substitute one communication
unit for each separate composition unit. This
means either the choice of one communication unit /preserving
its original form/ or the generating of a new one,
not occurring originally in this form.
To be able to carry out these tasks we must have
the complete analysis of the sentences of the reduced
text.
The choice can be made on a structural and/or
statistical basis.
We choose for example the first communication unit
from the reduced text of the composition unit 16 on a
structural basis, because the others are such attributes
of the NP of this communication unit that can be regarded
as paraphrases of the attributes occurring inside the
communication units /'plaintive' -- 'painfully crying' --
'sad'/.
We choose for example the second communication
unit from the reduced text of the composition unit 11
on a statistical basis, because this is connected by 'North'
to the other communication units, while the first one is
not connected to them by 'Italy'.
We choose the last communication unit from the
reduced text of the composition unit 18 on a structural
and statistical basis: structurally because this is an
answer to the first two, interrogative communication units, and
statistically because the last-but-one communication unit
contains NP-s that occur only once in the whole text
/'swallow', 'yellow-bird', .../.
When we generate /that means a generating of G'
type --see Figure 1/ both the structural and the
statistical aspects play an important part. Actually it
is enough to speak of generating only, because the choice
can be considered as a kind of generating by which we
make a communication unit identical with one of the
original communication units.
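The statistical side of the choice can be illustrated with a minimal sketch. It covers only the counting of lexical connections, not the structural criteria; the function name `choose_by_connections` and the word sets standing for the other communication units are assumptions made for the example.

```python
def choose_by_connections(candidates, other_units):
    """Pick the candidate whose words connect it to most other units.

    `candidates` is a list of (sentence, set_of_content_words);
    `other_units` is a list of word sets of the remaining text.
    """
    def score(words):
        # Number of other units sharing at least one word with the candidate.
        return sum(1 for unit in other_units if words & unit)
    return max(candidates, key=lambda item: score(item[1]))

# Composition unit 11 of the reduced text; the sets below are hypothetical
# stand-ins for the content words of other communication units.
candidates = [
    ("I have never seen Italy.", {"Italy"}),
    ("I have always been drawn to the North by my desire.", {"North", "desire"}),
]
other_units = [{"North", "sea"}, {"ship", "deck"}, {"desire"}]
chosen = choose_by_connections(candidates, other_units)
```

The second sentence is chosen because 'North' and 'desire' recur elsewhere while 'Italy' does not, which is the argument given for the composition unit 11.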
Let us demonstrate the character of generating by
making the abstract of the composition unit 10.
The logical deep structures of the communication
units of this composition unit are as follows:
/I/ P: says farewell to 
s.lv: I 's~: coasts 
121 
/3/ P: is s~: i 
I i sg: coasts :Finnish 
where:coasts 
P.-f~eels oneself 
somehow:homely 
Considering the paraphrase-possibilities of the /1/
and /2/ and the tenses occurring in the text we can
generate the following sentence from these:
'I said farewell to the homely Finnish coasts.'
or, holding to the structure of the text:
'I sent farewell towards the homely Finnish coasts.'
When we generate, of course it is not necessary to
condense all the information --hidden in the logical
representation-- into one sentence.
After the generating or the choice of the sentences
we get the following 'abstract-text':
1. People on the bridge of the Danube are looking at the
gulls.
2. I often see these birds at our reach of the Danube as
well.
3. But I have seen gulls above the sea, too.
4. Then the weather had already turned to winter-like.
5. My ship passed from a Finnish harbour towards Stockholm.
6. Wherever we passed, there were often appearing some
granite islands.
7. There were lonely lighthouses towering on some of them.
8. The wind was wuthering on the deck.
9. I stayed out on the deck for a time.
10. I sent farewell to the homely Finnish coasts.
11. I have always been drawn to the North by my desire.
12. Gulls joined the track of the ship.
13. If they had not accompanied me, that severe and stormy
sea mournfulness would have been unbearably depressing.
14. How good it is, that they are crying: "Our country is
the sea."
15. Only theirs?
16. Suddenly a plaintive bird-voice struck my ears.
17. The group of the crying birdies was flying over the ship.
18. Maybe these are not migratory, just moving birds.
19. Do they dare to shoulder the journey above the angrily
waving sea because the islands are not very far from
each other?
20. The sailors throw out scrapings to the gulls.
21. But who feeds these tiny birds of the sea?
22. This recklessness that shoulders the journey above the
sea is encouraging for one who has sorrow inside like
me, because I am carrying a dear deceased in myself,
and sometimes something cries in me just as those tiny
dear birds cried above my head.
The information condensed into the abstract of the
literary text is relevant in a different way from
that of the scientific text. In this abstract every
composition unit has to be represented by a communication
unit.
3.6 The thematic analysis of the 'abstract-text'
Among the communication units of the 'abstract-
text' we can establish very few connections with the
character of those in the primary text. Here it is rather
the thesauristic connections that come forward. The
analysis is similar to the one described earlier. Its result
can be summarized as follows.
The different thesauristic connections dissect
the text of the abstract in the following way:
1 - 2 'the Danube'
3 - 22 'the sea'
/In all the composition units except 1, 2, 4, 15,
16, 18 we can find 'the sea' itself or some element being
in some thesauristic connection with it./
The units 1-2 form a composition unit of the second
degree; its other connecting element is the 'gulls'.
We can dissolve further the unit kept together by
the 'sea'. The inner 'thematic connecting elements' of
the composition units of the second degree gained in this
way are as follows:
5 - 7 'passed' + 'islands'
9 - 11 Person I
12 - 14 'gulls' + 'ship'
16 - 19 'birdies'
20 - 21 'feeding'
This compositional division obviously corresponds to
the net of connections of the communication units of the
primary text. The orphan composition units /3, 4, 8, 15,
22/ include the orphan communication units.
The composition units of the second degree of the
unit linked by the 'sea' form two composition units of the
third degree:
/5-7/ + /9-11/: Person I /'my ship', 'I remained',
'I sent', 'my desire'/
/This includes the composition unit 8
as well./
/12-14/ + /16-19/: 'bird' /either as a primary lexical
unit or as a semantic and/or
thesauristic element/
/This includes the composition unit 15
as well./
The compositional structure of the whole text can be
summarized in the following way:
0  1  2  3
Birds above the sea
   I.  Gulls above the Danube                          1- 2
   II. Birds above the sea                             3-22
      A. But I have seen gulls above the sea, too      3
      B. Then the weather had already turned to
         winter-like                                   4
      C. 1. My ship passed from a Finnish
            harbour towards Stockholm                  5- 7
         2. The wind was wuthering on the deck         8
         3. I sent farewell to the homely
            Finnish coasts                             9-11
      D. 1. Gulls joined the track of the ship         12-14
         2. Does the sea belong only to the
            gulls?                                     15
         3. A group of the crying birdies was
            flying over the ship.                      16-19
         4. The sailors feed the gulls, but
            who feeds the crying birdies?              20-21
      E. This recklessness that shoulders the
         journey above the sea is encouraging          22
In the hierarchical structure we cannot establish
levels equally valid for every text, but we can define the
'depth-level' of every composition unit as the
compositional distance measured from the 'whole of the
work of art' as the broadest composition unit. /The upper
line of numbers marks the depth-levels./ We can observe
that we can meet e.g. a composition unit of the first
degree on the most different depth-levels.
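The difference between the degree and the depth-level of a composition unit can be shown on a part of the structure given above. The sketch rests on two assumptions of ours: the degree of a unit is taken as one more than the highest degree among the units it contains, and the members of the unit 5-7 are placed one level below the deepest level printed in the summary. The class `Unit` and both functions are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Unit:
    label: str
    leaf_degree: int = 0          # used only when the unit has no children
    children: list = field(default_factory=list)

def degree(unit):
    """Degree n = 1 + the highest degree among the contained units."""
    if not unit.children:
        return unit.leaf_degree
    return 1 + max(degree(child) for child in unit.children)

def depth_levels(unit, depth=0):
    """Yield (unit, depth) pairs; the whole work is at depth 0."""
    yield unit, depth
    for child in unit.children:
        yield from depth_levels(child, depth + 1)

# Partial tree; leaf degrees follow the table of composition units
# (orphan units have degree 0, SB and CB units have degree 1).
ship = Unit("C.1 (5-7)", children=[Unit("5", 1), Unit("6", 1), Unit("7", 1)])
c = Unit("C", children=[ship])
danube = Unit("I (1-2)", children=[Unit("1", 1), Unit("2", 1)])
sea = Unit("II (3-22)", children=[Unit("A = 3", 0), c, Unit("E = 22", 1)])
work = Unit("Birds above the sea", children=[danube, sea])

first_degree_depths = sorted(
    {d for u, d in depth_levels(work) if not u.children and u.leaf_degree == 1}
)
```

In this partial tree the units 1-2 and 5-7 come out as units of the second degree and C as a unit of the third degree, while units of the first degree appear both at depth 2 and at depth 4.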
CONCLUSIONS 
My paper dealt with the questions of the co-textual
analysis of a special text-type. I tried to
think through the complete text-analysing process
algorithmically. This process has parts that can already
be done automatically, and theoretically -- I
suppose -- all its parts can be automatized. But the
practical realisation requires the solution of different
problems that are among the key-problems of
linguistics and documentation. My intention was first
of all to show how these are connected even in the
analysis of the simplest text. Finally I should like to
point to some problems of basic importance again.
4.1 Before the compilation of the thesaurus we have
to select the so-called 'notion-words'. Mostly
these 'notion-words' take part in the sector of the
thesauristic definitions.
We also have to elaborate the 'semantic basic
language' by which the words of the given language can
be semantically defined, and we have to define the system
of the relations to be used in the semantic definitions.
/The 'notion-words' are words, the elements of the
'semantic basic language' are of feature character./
The thesaurus has to provide the interrelations
among the different 'notion-words' and the different
'lexical units'. These interrelations are of one-many
character:
TDEF_i  -->  LDEF_i1, ..., LDEF_in
That is: several lexical units /LDEF/ may belong to
one 'notion word' /TDEF/.
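The one-many relation between 'notion words' and 'lexical units' can be sketched as a simple mapping. The entries are illustrative only: they are taken from the thesauristic connections mentioned in 3.5.2 and 3.6, and the names `THESAURUS`, `notion_words` and `connected` are hypothetical.

```python
# Each 'notion word' (TDEF) is mapped to the lexical units (LDEF)
# that belong to it. The entries are a toy selection.
THESAURUS = {
    "ship": {"ship", "deck", "side of the ship"},
    "bird": {"bird", "gull", "birdie"},
}

def notion_words(lexical_unit):
    """Return every notion word under which the lexical unit is listed."""
    return {notion for notion, units in THESAURUS.items() if lexical_unit in units}

def connected(first, second):
    """Two lexical units are thesauristically connected in this sketch
    if they share at least one notion word."""
    return bool(notion_words(first) & notion_words(second))

deck_and_side = connected("deck", "side of the ship")
deck_and_gull = connected("deck", "gull")
```

With such a mapping 'deck' and 'side of the ship' are connected through 'ship', while 'deck' and 'gull' are not; a real thesaurus would also have to state the kind of the relation, for instance part and whole.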
4.2 The relation 'logical semantic representation' --
'linguistic semantic representation' is an analogous
equivalent of the relation 'notion word' -- 'lexical unit'.
The 'logical semantic representation' /LOSR/
represents 'logical connections' among 'notion words',
while the 'linguistic semantic representation' /LISR/
represents 'verbal connections' among 'lexical units'.
/In fact, LISR is a deep structure that demonstrates the
semantic character of the constituents, too./
In connection with these we have to define the
systems of both the logical and linguistic connections,
the way of their representation, the correspondence of
these two systems of connections, and the way of passing
over from one to the other.
The interrelations are here interrelations of sets
and they are also of one-many character. /That is: to a
given set of elementary logical relations several different
text-structures can belong. Each of these text-structures
may contain only as many independent deep structures
as the number of the elementary logical relations./
The efficiency of the analysing system depends first of all
on how well determined the elements and relations
of the logical and linguistic systems and the rules of
the matching between the two systems are.
4.3 We have to mention that the relation of the
phonetic and phonological representation is also
analogous to the one described above, though this analogy
contains an oblique symmetry. This is a natural
consequence of the continuous and one-way one-many relation
passing from the logical representation to the phonetic
one.
4.4 From the point of view of the relation of the
linguistic deep and surface structure the structure
of the syntactic information and the condition
information of the LDEF is of primary importance. It is
significant to approach it from the transformations.
/That is, we have to examine to what extent the concrete
lexical units occurring in a given deep structure determine
the character of the transformations to be used./
4.5 Only a solution of the problems sketched above
that rests on a theoretical basis forming
one coherent system makes possible the solution of
such questions as
- the automatic realization of the transitions
from the surface structure to the incomplete
deep structure,
from the incomplete to the complete deep
structure,
from the deep structure to the logical
representation,
- the automatic discovery of the thematic connections
built on the TDEF-s,
- the automatic establishment of the elements of the
deep structure that are irrelevant in the given
connection,
- the automatic realization of the reduction
building on the irrelevant elements,
- finally, the automatic dispatching of the generating of
G' character necessary during the process.
4.6 The demonstrating material in the chapters 2 and 3
of my paper was a so-called 'simple narrative text'.
When listing the problems we must not ignore the fact
that the different kinds of texts will enlarge the list
of basic problems enumerated above with their own special
problems as well.

References

A.K. Zholkovski, I.A. Mel'chuk, O semanticheskom sinteze.
Problemy kibernetiki 19./1967/ 177-238.

J. Ihwe, Linguistik und Literaturwissenschaft
/Bibliographie/. 2./1968/ 28-42.

J.S. Petőfi, On the structural linguistic analysis
of poetic works of art. Computational Linguistics /Budapest/
VI./1967/ 53-82.

D. Soergel, Klassifikationssysteme und Thesauri.
Frankfurt am Main, 1969.

J.S. Petőfi, A tezaurusz-kérdés jelenlegi helyzete /The
present state of the thesaurus problem/. Budapest,
OMKDK 1969.

A magyar nyelv szóvégmutató szótára /Reverse-alphabetized
dictionary of the Hungarian language/. Compiled by F.
Pap. Budapest, 1969.

Verzeichnis der ungarischen Suffixe und Suffixkombinationen.
Zusammengestellt von W. Veenker. Hamburg, 1968.

J.S. Petőfi, Notes on semantic interpretation of
verbal works of art. Computational Linguistics /Budapest/
VII./1968/ 79-105.
