American Journal of Computational Linguistics Microfiche 50
THE FINITE STRING 
NEWSLETTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
TABLE OF CONTENTS 
An organization for a dictionary of word senses - Dick H. Fredericksen ....... 2
Current Bibliography ....... 24
Cahiers du groupe de travail Analyse et experimentation dans les sciences de
l'homme par les methodes informatiques - E. Chouraqui and J. Virbel ....... 93
Bibliography and subject index, current computing ....... 94
Directory of university computer science ....... 95
Privacy, security, and the information processing industry - Dahl A. Gerberick ....... 96
AMERICAN JOURNAL OF COMPUTATIONAL LINGUISTICS is published 
by the Center for Applied Linguistics for the Association 
for Computational Linguistics. 
EDITOR: David G. Hays Professor of Linguistics, SUNY Buffalo 
EDITORIAL ASSISTANT: William Benzon 
EDITORIAL ADDRESS: Twin Willows, Wanakah, New York 14075
MANAGING EDITOR: A. HOOD ROBERTS Deputy Director, Center for
Applied Linguistics
MANAGEMENT ASSISTANT : James Megginson 
PRODUCTION AND SUBSCRIPTION ADDRESS : 1611 North Kent Street, 
Arlington, Virginia 22209 
Copyright 1976 
Association for Computational Linguistics 
American Journal of Computational Linguistics Microfiche 50 : 2 
DICK H. FREDERICKSEN 
IBM THOMAS J. WATSON RESEARCH CENTER 
YORKTOWN HEIGHTS, NEW YORK 10598
ABSTRACT: This paper describes a lexical organization in which "senses" are represented in their
own right, along with "words" and "phrases", by distinct data items. The objective of the scheme is to
facilitate recognition and employment of synonyms and stock phrases by programs which process
natural language. Besides presenting the proposed organization, the paper characterizes the lexical
"senses" which result.
1. Introduction. 
This paper describes an internal lexical organization which is particularly designed to capture the 
facts about synonymy. Besides recording the inclusion of each word in one or more synonym sets 
(identified with its various "senses"), the scheme attempts to distribute attributes perspicuously
between "senses", "wordings", and the intersections of the two. In addition, there is provision to
record multi-word idioms, stock phrases, and the like, and to include these as elements in synonym
sets when appropriate.
Briefly, "senses" are represented in their own right, along with "words" and "phrases", by distinct 
data items. Each word or phrase is associated with a list of the "senses" which it can express; 
conversely, each "sense" is associated with a list of "alternative wordings". Additionally, each
word is associated with a list of phrases in which it occurs. 
Grammatical category, features, selection restrictions, and the like are applicable at three different 
levels: to words or phrases as such, to "senses" as such, or to particular usages of words or phrases
(equivalently, to particular wordings of "senses"). 
An Organization for a Dictionary of Word Senses 
This lexical organization has been implemented at IBM Research, Yorktown Heights, N.Y., by a
program -- not to be described here -- which builds such dictionaries in a very compact form,
giving interactive assistance to the person making the entries. (For example, the program points
out the possibility of merging "senses" whenever their wordings overlap and their attributes are
compatible, and merges them if so directed.) There are suitable facilities for saving the results,
retrieving them in various ways, and for altering such things as schemes of classification without
scrapping previously prepared work.
The ultimate intent is that the "dictionary of senses" should serve as the lexical component in a
natural language fact-retrieval system. Pending its incorporation in that role, it will be used to
amass and organize information on the semantic relations among words and phrases. 
The balance of this paper comes in two sections:
Section 2 presents the proposed lexical data structures, and suggests how they are to be used.
Included is a sketch of how various types of grammatical and semantic "attributes" fit into the
scheme.
Section 3 discusses the character of the "senses" encoded in the resulting dictionary. Reasons are
advanced for regarding lexical "senses" as something far short of semantic primitives. At the same
time, synonym sets are defended against the view that "true paraphrases are rare or nonexistent".
2. The Internal Representation.
It will be our purpose in this section to say just enough about internal representation to lay bare 
the organizing principles of the lexicon. The focus is on architecture and motivations; details of
field layouts, internal codes, etc. are not at issue here. 
To make the discussion concrete, suppose we are interested in the senses of the word "change".
Assuming that none of the words are unfamiliar, the following should put us in mind of two senses: 
change: 1. v alter; 
2. n small coin. 
This, of course, is just a dictionary entry in the traditional format (though with synonyms offered 
in lieu of definitions).
On the other hand, we might approach the same information from a different direction: starting 
with the two concepts, we might seek words to express them. It is difficult to picture this latter 
situation without assigning artificial labels to the concepts. Call them concepts 1 and 2, and 
suppose for a moment that there were a practical way to look the concepts up (without having
thought of either word for either concept). Then the information to be retrieved might be envisioned
this way:
1. v change, alter
2. n change, small coin 
It is this duality of viewpoint -- that words have senses, while senses have wordings -- that our 
lexical representation must reflect. 
The starting point, then, is that words, phrases, and "senses" are separately represented. There are 
three principal types of data item, plus a standard connector: 
1. A "Key Data Item" (KDI) represents a single word.
2. A "Phrase Data Item" (PDI) represents a string of two or more words which are to
serve as a unit in some context.
3. A "Sense Data Item" (SDI) represents one distinct sense common to a set of words
and/or phrases. In general, a word or phrase may be usable in more than one sense,
while a given sense may have alternative (synonymous) wordings. Both these types of
variability are recorded making use of the next data item:
4. A "Sense Link Element" (SLE) is a connective item, to be explained shortly.
Three principal fields will engage our attention in each type of data item. Fig. 1 summarizes the 
fields for each type. 
Data item                  Principal fields
KDI (Key Data Item)        Global attributes; "alternative senses" link; "phrase involvements" link
PDI (Phrase Data Item)     Global attributes; "alternative senses" link; "component word" links
SDI (Sense Data Item)      Global attributes; "alternative wordings" link; "sense chain" link
SLE (Sense Link Element)   Local attributes; "alternative senses" link; "alternative wordings" link
Fig. 1
Schematic of Data Items, with Principal Contents
Each KDI (Key Data Item) or PDI (Phrase Data Item) contains an "alternative senses" link -- a
pointer to the first SLE (Sense Link Element) in a chain of SLE's which represent the various
senses of the word or phrase. The SLE's are chained via their own "alternative senses" links, and
the final member points back to the KDI or PDI. Thus, we shall speak of such a chain as a ring --
specifically, an "alternative senses ring". If no senses are on record for a particular word or
phrase, the "alternative senses" link in the KDI or PDI is self-referent.
Reciprocally, each SDI (Sense Data Item) contains an "alternative wordings" link. This leads to a
chain of SLE's which represent more-or-less synonymous wordings that express the sense. These
SLE's are chained through their own "alternative wordings" links, and again the chain is closed
into a ring -- this time beginning and ending with the SDI.
The structure that is shaping up may now be seen in Fig. 2. The crucial point is that each SLE
represents the intersection between an "alternative senses" ring and an "alternative wordings"
ring. From the standpoint of the word or phrase, it represents a particular sense; from the
standpoint of the sense, it represents a particular wording.
Starting from a KDI or PDI, one gets to the SDI for a particular sense by advancing along the
"alternative senses" ring to the relevant SLE, then detouring along the ring which connects the
latter to the SDI (as one of the SDI's "alternative wordings"). Starting from an SDI, one gets to a
particular wording by the reverse process. Since each "alternative senses" ring contains exactly
one KDI or PDI, while each "alternative wordings" ring contains exactly one SDI, each SLE is tied
to exactly one sense of one word or phrase. (Equivalently, it is tied to one wording of one sense.)
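The two kinds of ring can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation built at IBM Research (whose details the paper does not give): the class Node, the functions connect, ring, owner, senses_of and wordings_of, and the convention that an empty ring is a link pointing at its own item are all names and choices introduced here for the example.

```python
class Node:
    """One data item of any of the four types (hypothetical layout)."""
    def __init__(self, kind, label=None):
        self.kind = kind              # "KDI", "PDI", "SDI", or "SLE"
        self.label = label            # spelling or sense label; None for an SLE
        self.senses_link = self       # "alternative senses" link (KDI, PDI, SLE)
        self.wordings_link = self     # "alternative wordings" link (SDI, SLE)

def connect(wording, sense):
    """Splice one new SLE into both rings, at their intersection."""
    sle = Node("SLE")
    sle.senses_link, wording.senses_link = wording.senses_link, sle
    sle.wordings_link, sense.wordings_link = sense.wordings_link, sle
    return sle

def ring(start, link):
    """Yield the SLEs on the ring that begins and ends at `start`."""
    node = getattr(start, link)
    while node is not start:
        yield node
        node = getattr(node, link)

def owner(sle, link):
    """Follow a ring from an SLE to the single non-SLE item on it."""
    node = getattr(sle, link)
    while node.kind == "SLE":
        node = getattr(node, link)
    return node

def senses_of(wording):
    # Advance along the "alternative senses" ring, detouring to each SDI.
    return [owner(sle, "wordings_link") for sle in ring(wording, "senses_link")]

def wordings_of(sense):
    # The reverse process: around the "alternative wordings" ring.
    return [owner(sle, "senses_link") for sle in ring(sense, "wordings_link")]

# The example used in the text: "change", "alter", "small coin".
change, alter = Node("KDI", "change"), Node("KDI", "alter")
small_coin = Node("PDI", "small coin")
sense1, sense2 = Node("SDI", "1"), Node("SDI", "2")
connect(change, sense1)
connect(alter, sense1)
connect(change, sense2)
connect(small_coin, sense2)
```

Because each SLE lies on exactly one ring of each kind, owner always ends at one word or phrase, or at one sense, which is the property the paragraph above relies on.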
The next point of interest is that "attribute" fields are present in all four types of data item -- even
in the connectors (SLE's). The attributes which may be recorded in each, however, come from 
different bags. 
To begin with, the attributes found in an SDI characterize all the wordings of a given sense
whenever the wordings are used in that sense. In Fig. 2, for example, sense "1" should be marked
as a "verb" sense, while sense "2" is a "noun". One would not wish to record the attribute "verb"
in the KDI for the word "change", for the KDI represents facts about the word itself, irrespective
of sense, and "verb" does not hold for all uses of the word "change". On the other hand, "verb"
does characterize all wordings of sense "1", whenever they're being employed to express that
sense. It would furthermore apply to any additional wordings which we might think of, such as
"modify", provided they are really used in a synonymous way.
As a matter of fact, it turns out that the traditional parts of speech -- noun, verb, adjective,
preposition, etc. -- fit best in this scheme as global attributes of senses, recorded in the SDI's.
Fig. 2
"Alternative Senses" and "Alternative Wordings" Rings
(The first sense has two wordings: "alter" and "change". The second sense has wordings "change"
and "small coin". Two senses are recorded for "change", and one sense each for "alter" and
"small coin".)
A different sort of attribute may be recorded in a KDI, as a global feature of the word itself. For
example, we may note of the word "change" that it is "regularly conjugated". That is, when used
as a verb, it forms the third person singular by adding "s", and both past and past participle by
adding "ed". To be sure, this "global" attribute applies only to the "verb" senses of "change"; but
a moment's reflection will confirm that "change" has more than one "verb" sense, and the
regularity of its conjugation is common to all of them. Thus, it is useful to note this regularity as an
attribute of the word itself. (Contrast this with the behavior of the word "can", which is regular
when it means "to pack in cans", but irregular when it means "is able to".)
Various other attributes suggest themselves as global characterizers of the words themselves, to be
recorded in the KDI's. For example, one might wish to note of "change" that it drops its final "e"
when adding "ing" (this is the normal rule) but of "singe" that it doesn't.
Still other attributes are appropriate when characterizing multi-word units (in PDI's). A string of
words whose meaning is not evident from the mere juxtaposition of its constituents (such as "give
up") may be classified as an "idiom". A string of words whose meaning could be figured out from
the meanings of its constituents, but which occurs with enough frequency to warrant inclusion in 
the dictionary, might be classed as a "stock phrase". (Example: "drop dead".) A string like 
"perform in a subordinate role", which one would not normally expect to encounter in its own 
right, might be classed as a "definition" (for a certain sense of the word "accompany", difficult to 
reword except with a definition). 
Perhaps the most unexpected site for recording attributes is in the connective elements (SLE's). 
These are the logical place, though, to note features that apply to a specific sense of a word, 
without being global to either the sense or the word. Consider the following four sentences:
On the way to the office, he stopped daydreaming. 
On the way to the office, he ceased daydreaming. 
On the way to the office, he ceased to daydream. 
versus: 
On the way to the office, he stopped to daydream.
Suppose we choose to view this as a restriction upon the (surface) object of the verb: "stop", when 
applied to an action, must take a gerund as its object; "cease" can take either a gerund or an 
infinitive. (It wouldn't affect the point being made if we said that "stop" inhibits a certain 
grammatical transformation en route to surface structure, while "cease" permits it.)
Now, we wouldn't want to mark "gerund object only" as a global attribute of the sense, for we
have just shown that "cease" and "stop", two wordings of the sense, differ with respect to this
restriction. On the other hand, it doesn't belong among the global attributes of the word "stop" as
such, for "stop" has other verb senses, even transitive ones, to which the restriction is completely
inapplicable. (Consider "stop a hole in the dike", "stop a catastrophe", etc.) That leaves the
alternative we are suggesting: treat the restriction as an attribute of one particular usage of the
word (equivalently, one particular wording of the sense).
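The three sites for attributes can be illustrated with a small Python sketch. It is a simplification under assumptions of our own: the classes Word, Sense and Usage stand in for the KDI, SDI and SLE, the rings are replaced by direct references, the glosses "discontinue an action" and "plug up" are hypothetical labels, and taking the union of the three attribute sets is just one plausible way a client program might combine them.

```python
from dataclasses import dataclass, field

@dataclass
class Word:                      # stands in for a KDI
    spelling: str
    attributes: set = field(default_factory=set)   # global to the word

@dataclass
class Sense:                     # stands in for an SDI
    gloss: str                   # hypothetical label, for readability only
    attributes: set = field(default_factory=set)   # global to the sense

@dataclass
class Usage:                     # stands in for an SLE
    word: Word
    sense: Sense
    attributes: set = field(default_factory=set)   # local to this usage

    def all_attributes(self):
        """Everything that holds when this word expresses this sense."""
        return self.word.attributes | self.sense.attributes | self.attributes

stop, cease = Word("stop"), Word("cease")
discontinue = Sense("discontinue an action", {"verb"})
plug = Sense("plug up", {"verb"})

# The restriction is recorded on one usage only, not on "stop" or on the sense.
stop_discontinue = Usage(stop, discontinue, {"gerund object only"})
cease_discontinue = Usage(cease, discontinue)
stop_plug = Usage(stop, plug)
```

With the restriction stored on the usage, "cease" in the same sense and "stop" in its other verb senses remain unaffected, which is the point of the argument above.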
Fig. 3
"Phrase Involvement" Rings
[Figure: the KDI's for "play", "down", and "payment", linked in rings through the PDI's for
"play down" and "down payment".]
(Where numbers are shown on connecting links, they indicate the position of the word in the
phrase which is linked to.)
Besides having senses, individual words are involved in phrases, and this fact is also represented in
our data structure. Fig. 3 shows the plan of attack. In the KDI for each word, there is a link
connecting it to the PDI for the first phrase in which the word is known to occur, together with a
number designating the position of the word (1st, 2nd, 3rd, etc.) in that phrase. In the PDI itself,
there is a continuation link for each word of the phrase, together with its number in the next
phrase. In the final PDI involving a given word, the link for that word points back to the KDI.
Thus, independent of its "alternative senses" ring, each KDI may have a "phrase involvements"
ring.
This structure makes it possible to retrieve all the idioms, stock phrases, definitions, etc., in which 
a given word has made its appearance, anywhere in the dictionary. As the same structure is used to 
encode every multi-word unit, no occurrence of a word is ever lost sight of, and a phrase can be
looked up via any of its constituent words. 
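A "phrase involvements" ring of this kind might be sketched as follows. The sketch is ours and makes several assumptions: the class names KDI and PDI are borrowed from the text but their fields (phrase_link, links) are invented, each link is modelled as a pair (next item, position of the word in that item), a link back to the KDI carries the position None, and new phrases are spliced in at the head of each ring.

```python
class KDI:
    def __init__(self, word):
        self.word = word
        self.phrase_link = (self, None)     # empty ring: points back at itself

class PDI:
    def __init__(self, words):
        self.words = words
        self.links = [None] * len(words)    # one continuation link per word

def add_phrase(lexicon, text):
    """Create a PDI and splice it into the ring of each component word."""
    words = text.split()
    pdi = PDI(words)
    for i, w in enumerate(words):
        kdi = lexicon.setdefault(w, KDI(w))
        pdi.links[i] = kdi.phrase_link      # continue to the former first phrase
        kdi.phrase_link = (pdi, i + 1)      # 1-based position in the new phrase
    return pdi

def phrases_containing(kdi):
    """Walk the ring, yielding each phrase and the word's position in it."""
    item, pos = kdi.phrase_link
    while item is not kdi:
        yield " ".join(item.words), pos
        item, pos = item.links[pos - 1]

lexicon = {}
add_phrase(lexicon, "play down")
add_phrase(lexicon, "down payment")
involvements = list(phrases_containing(lexicon["down"]))
```

Here involvements holds both phrases from Fig. 3, each with the position of "down" in it, and the walk stops when the link leads back to the KDI.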
Of the fields to which Fig. 1 calls attention, we have discussed all but one. In the SDI for each
"sense", there is a "sense chain" link field. This links the SDI to its successor in a global chain of
"senses". Using this chain, it is possible to make an exhaustive, non-duplicative list of all the
"senses" recorded in the dictionary. The listing program has only to proceed down the chain,
retrieve from each SDI its attributes, decode them, then chase around the "alternative wordings"
ring of the SDI and list the wordings alongside the attributes.
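The listing procedure amounts to a walk down a singly linked chain. A minimal Python sketch follows, with assumptions stated plainly: the class SDI and its fields are invented for the example, the "alternative wordings" ring is replaced by a plain list so that the block stays self-contained, and the output format is arbitrary.

```python
class SDI:
    def __init__(self, attributes, wordings, next_sense=None):
        self.attributes = attributes    # global attributes of the sense
        self.wordings = wordings        # stand-in for the "alternative wordings" ring
        self.next_sense = next_sense    # "sense chain" link; None ends the chain

def list_senses(head):
    """Visit every sense exactly once, listing wordings beside attributes."""
    lines = []
    sdi = head
    while sdi is not None:
        attrs = " ".join(sorted(sdi.attributes))
        lines.append(f"{attrs}: {', '.join(sdi.wordings)}")
        sdi = sdi.next_sense
    return lines

# The two senses of the running example, chained in order.
sense2 = SDI({"noun"}, ["change", "small coin"])
sense1 = SDI({"verb"}, ["change", "alter"], next_sense=sense2)
listing = list_senses(sense1)
```

Since every SDI is on the chain once, the listing is exhaustive and free of duplicates even though "change" appears under both senses.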
One more feature of the internal representation deserves mention: the data items for words occur 
as "leaves" in a lexical tree (Fig. 4). That is, the KDI for a word can be looked up letter-by-letter, 
following a chain of pointers that correspond to successive letters. The chain ends at a KDI after 
following a substring sufficient to distinguish the word from the nearest thing like it in the
dictionary. The lexical tree has the advantage that words can be looked up either at random or in 
sequence. 
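A tree of this kind, in which each path is only as long as needed to tell a word from its nearest neighbor, can be sketched in Python with nested dictionaries. The representation is an assumption made for illustration: a leaf is simply the word itself (standing in for its KDI), the empty string END is a hypothetical end-of-word marker that lets "a" coexist with "above", and the function names are ours.

```python
END = ""   # hypothetical end-of-word marker; sorts before every letter

def insert(tree, word):
    """Add a word, pushing an existing leaf down only as far as necessary."""
    node, i = tree, 0
    while True:
        key = word[i] if i < len(word) else END
        child = node.get(key)
        if child is None:
            node[key] = word                 # new leaf
            return
        if isinstance(child, str):
            if child == word:
                return                       # already present
            # Replace the leaf by a branch and re-hang it one letter deeper.
            old_key = child[i + 1] if i + 1 < len(child) else END
            node[key] = {old_key: child}
            child = node[key]
        node, i = child, i + 1

def lookup(tree, word):
    """Random access: follow successive letters until a leaf is reached."""
    node, i = tree, 0
    while isinstance(node, dict):
        key = word[i] if i < len(word) else END
        node = node.get(key)
        i += 1
    return node if node == word else None

def in_order(tree):
    """Sequential access: volunteer the words in alphabetical order."""
    for key in sorted(tree):
        child = tree[key]
        if isinstance(child, dict):
            yield from in_order(child)
        else:
            yield child

tree = {}
for w in ["a", "above", "abate", "monkey"]:
    insert(tree, w)
```

For the four-word dictionary of Fig. 4, "monkey" is reached after a single letter, while "above" and "abate" need three. The final comparison in lookup is required because a leaf is reached before the whole word has been spelled out.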
Recapitulating, these are the essential features of the representation: 
*1) "Senses" are represented separately from "wordings", and the mutual connections 
between them are made explicit in both directions. 
*2) "Wordings" may be either single words or multi-word phrases. These are represented by
distinct types of data item, and may be subject to distinct schemes of classification, but
they are on the same footing with regard to "sense" connections. With each word is 
associated an exhaustive list of the phrases in which it occurs. 
Fig. 4
Lexical Tree
(For a dictionary containing only the words "a", "above", "abate", and "monkey", this would be
the full tree. The path to each word is only as long as needed to distinguish it from the neighbor
with which it shares the longest leading substring.)
*3) Classifiers and features, drawn from appropriate sets, may be attributed separately to 
words, to phrases, to senses, or to particular senses of words or phrases (i.e., to particu- 
lar wordings of senses). 
*4) The data items which represent senses are globally chained, and may be exhaustively 
listed. 
*5) The data items which represent words are accessible as "leaves" of a lexical tree; hence
they may either be retrieved by lookup (in response to presentation of the words) or
volunteered in alphabetical order.
Given a commitment to represent a lexicon as suggested by points *1 through *5 above, various
implementations would be possible. Alternative implementations of individual points (though not
of the scheme as a whole) have in fact been described by other writers. The lexical tree (*5), for
example, is no great novelty: Sydney M. Lamb and William H. Jacobsen describe implementation
details of one such tree [5]. [10] also concerns a dictionary which uses this general style of
organization for lookup. For that matter, the lexical tree is reminiscent of Feigenbaum's
"discrimination tree" [1].
More interestingly, the separate representation of senses and wordings has been incorporated in
other systems by R. F. Simmons ([11], [12]) and by Larry R. Harris [3]. This way of looking at
matters led Harris to remark some of the same points that we have been stressing: that senses have
alternative wordings just as words have alternative senses; that multi-word phrases might occur on
the same footing as individual words in the expression of a sense; and (interestingly enough) that
part-of-speech information really adheres to the "sense", not to the "word". Similarly, Simmons
associates his "deep case" information with lexical nodes representing "wordsenses", while words
themselves are treated as "print image" attributes of the wordsenses.
Harris's dictionary was only a minor component in a small-scale model of concept acquisition. No 
great number of either words or concepts was required to illustrate the principles at stake, so 
Harris programmed the dictionary as an array, with words represented by rows and "concepts" by 
columns. Elements of the array were merely frequencies, indicating the strength of association 
between each word and each concept. 
Needless to say, for a full-scale vocabulary of words and concepts, such an array is mostly empty; 
nobody would dream of expanding it in that form. From a programming standpoint, the only 
thinkable choice is some form of list structure. Having decided in principle to use "some form of 
list structure", though, one might well ask: Why chains? Why rings? Why not just include in each 
Key Data Item a full list of pointers to the corresponding Sense Data Items, and vice-versa?
The answer is simply one of convenience. It's easier to handle insertions and deletions when they 
don't require the movement of expanded items to new quarters, or the provision of "overflow" 
pointers. It's easier to reclaim freed storage when deleted items come in a handful of standard 
sizes. As for "rings", they eliminate the need for two-way pointers, since one can break into a ring 
at any point and follow it to its source. 
It should be noted that to make rings an attractive representation, the details of the material being
represented must cooperate. In particular, the rings must not become too long, or the processing
required to follow them becomes excessive. It happens that "alternative senses" rings and
"alternative wordings" rings are typically short -- rarely more than a dozen links per ring. "Phrase
involvement" rings, on the other hand, can become spectacularly long, especially for words like
"a" and "to". In practice, it's necessary to provide these rings with short-cut links.
Any of these programming details could be altered, however, without abandoning the essence of
the scheme, which is given in points *1 through *5 above.
3. The Character of Lexical Senses. 
Perhaps the first thing to get straight about the "senses" represented in this dictionary is what they
are not. They are not "concepts"; they are not a set of "primitives" into which human experience
can be decomposed. No conjecture is put forward here that any such collection of discrete, intrinsic
concepts even exists, let alone that it might be finite.
Rather, the "senses" of the dictionary are in the nature of fuzzy equivalence sets among words.
(This is only a metaphor; we shall do more and more violence to the technical notion of an
"equivalence set" as we proceed.) Each "sense" groups a set of words which, in a set of appropriate
contexts, might be used more or less interchangeably. That the equivalence sets are fuzzy, one
can convince oneself with but the briefest immersion in the materials of the language -- trying to
decide whether particular words belong in particular groups or justify the creation of new groups.
Consider, for example, the following set of words and phrases: 
(abandon, give up, surrender, relinquish, let go, desert, leave, forsake, abdicate)
Clearly, there is a common theme that can run through all of these, given the right circumstances.
It might be expressed as "reluctant parting from somebody or something". This can be seen by
coupling the verbs with various possible objects:
(abandon, give up, surrender) a town to the enemy 
(abandon, give up) all hope 
(give up, relinquish) one's claim to an estate 
(give up, let go) our entire stock at a loss 
(abandon, desert, leave) one's wife and children 
(desert, forsake) a friend in need 
(give up, abdicate) the throne 
(abandon, desert) an exhausted mine 
(forsake, give up) all other, keeping thee only to her/him 
(abandon, desert, leave) the area threatened by the storm 
Should we, then, declare this group of words to be a "sense"? There are difficulties. The various 
words carry nuances, which it may or may not be easy to ignore in a particular context.
"Forsake", for example, can suggest that there is something reprehensible about the action. It can 
also connote formal renunciation, and the above example from a marriage vow shows that the 
formality can be present without the reprehensibility. Nuances get in the way of interchangeabili- 
ty; it would sound strange to substitute "desert" into the marriage vow. 
Besides nuances, the individual words have conventional areas of application. One does not 
normally say that the doctors "deserted" all hope, or that an errant husband "surrendered" his 
wife and children. The minister officiating at a wedding would be considered daft if he adjured the
bride and groom to "abdicate" all others, and a merchant would not advertize that he was
"relinquishing" his entire stock at a loss. (Somehow, the latter situation calls for more pedestrian
language.) 
At the opposite extreme, overawed by this lack of interchangeability, we might decide to respect 
the unique personality of each word, abolishing equivalence classes altogether. The inconvenience
of such a cop-out is obvious: we then have to introduce some other mechanism for recognizing the
equivalence of utterances that are intended synonymously, though they employ different words.
But beyond being inconvenient, the exclusion of equivalence sets is a denial of linguistic facts --
just as bad, in its own way, as the naive attribution of unconditional synonymy. 
For it is a commonplace of everyone's experience that the speaker and the listener agree to ignore 
the nuances of words, whenever nuances get in the way of communication. A writer who has used 
the word "give up" eight times in five lines will surely cast about for some alternative ways of 
saying the same thing. If "relinquish" and "abandon" would normally be too flowery, or if
"surrender" would in other circumstances call to mind an armistice ceremony in a railway wagon,
that will not deter the writer from tossing in a few occurrences of those words -- once a context 
has been established that discourages the overtones. Nor will the reader understand matters any 
differently. It is as if writer and reader conspired: "We're fed up with that word, let's hear 
another." Or, perhaps, the writer simply connives at jolting the reader awake with frequent 
changes of idiom, maybe even an occasional incongruity. In any case, synonymy is imposed upon
the words, and this literary behavior merely exaggerates what people do habitually in common
speech.
Not only can words be stripped of nuances normally present; they can take on colorations
suggested by the context. The suggestion of "reluctance" conveyed by all the verbs of our example
can be inferred, in at least one case, from the setting alone; and in this case, a variety of more
neutral verbs could be used synonymously:
(part with, take leave of) our entire stock at a loss
One could even substitute the word "sell", and it wouldn't change the meaning that was already
read into the utterance. But to admit context-dependent synonymy of this degree is to stretch the
"equivalence sets" to the point of uselessness.
It comes to this: neither the grouping nor the separation of words can be fully justified. Grouping 
is nearly always conditional, and separation is often so. If one could anticipate all possible contexts 
in which a group of words could occur, one could perhaps enumerate all possible equivalence sets 
-- one for each combination of word group with a set of contexts making the words interchangea- 
ble. Anyone, however, can see the futility of that aspiration. 
In the end, one settles for messy compromises. Words are grouped if a largish set of contexts in 
which they are interchangeable springs readily to mind. They are separated (into perhaps overlap- 
ping groups) if the imagination readily suggests contexts in which their meanings differ 
"significantly" -- whatever "significantly" may mean. In doubtful cases, when words are grouped 
somewhat questionably, one promises oneself to add markings some day that will prevent misuse
of the equivalence. When words are separated somewhat questionably, one promises oneself to 
add a mechanism some day that will recognize their relatedness. 
In the end, too, one assigns internal structure to the equivalence sets. That's the effect of assigning 
local attributes to the alternative wordings ("animate subject", "object a vehicle", etc.): constraints
are imposed upon the interchangeability of the wordings. More radical structuring can be
accomplished if, for example, one notes "government" as an alternative wording of the sense 
"govern, rule, control", with the attribute "nominalization". 
A trenchant discussion of such difficulties may be found in Kelly and Stone [4]. There the 
emphasis is upon disambiguation: given a word in a passage of text, they seek to identify (by 
selection from a fixed list of possibilities) the sense in which it is used. Building a computerized 
dictionary for the purpose, they soon became concerned with the arbitrariness and the proliferation
of target "senses", as taken from standard desk dictionaries. They argue, with persuasive examples,
that what lexicographers conventionally distinguish as separate senses of a word are often just
applications of the word's underlying concept to different contexts. To cover the various contexts,
the underlying concept has to be stretched a little, by a process of metaphoric extension. This
metaphoric process is beyond our present power to computerize, but for the long run looks
indispensable for successful language processing. Meanwhile, the authors advocate a dictionary
which records for each word as few discrete senses as practicable, combining into one sense all the
usages which can reasonably be united by a common underlying thought.
It is interesting to re-examine Kelly and Stone's argument with a different task in mind: not the
disambiguation of one word, but the recognition of synonymy between two words. A metaphorical
capability would be as useful for the one task as for the other, but in the case of synonym
recognition, some of the considerations which have guided traditional lexicography remain pertinent. In
particular, it is necessary to ask not merely whether the concepts overlap, but whether the one
word may in fact be used in place of the other. As noted before, usage is restricted by conventional
domains of application; for example, an "alteration" is conceptually both a "change" and a
"modification", but one wouldn't call it a change or a modification when painting a sign for a
tailor's shop.
The arbitrariness of the equivalence sets is not all that disqualifies them as "conceptual
primitives". There is a much deeper difficulty in the fact that practically all "senses" can be 
paraphrased in terms of other "senses". Take, for example, the intransitive sense of "change" (as 
in "My, but you've changed!"). Surely, one would suppose, the concept of "change" must be 
primitive? Change of state is what well-nigh a third of all verbs are about. 
But if "change" is a "primitive", it's a peculiar sort of "primitive", for it can be paraphrased in a 
variety of ways: 
(change, become different, cease to be the same, assume new characteristics, make a 
transition into a new state) 
Note that the multi-word paraphrasals are not idioms; the individual words contribute their usual 
meanings to concatenated meanings which express the concept "change".
But perhaps we were merely unlucky? Perhaps we chanced upon a concept which looked elemental 
but actually turned out to be complex. Maybe the real primitives are "become", "be", "cease", 
"different", "same", etc. Let's dig into that possibility. 
What does it mean to "become X", where X is an adjective? The meaning can be variously
expressed:
(become X, come to be X, get to be X, get X, turn X, grow X, assume the characteristic X)
That's a discouraging number of ways for a "primitive" to be re-expressible -- though if we choose
to regard "come to be" and "get to be" as idiomatic concatenations of words, only one of the
alternatives makes use of other concepts to explain the one at hand.
As for "different", it implies a whole underlying anecdote about somebody making a comparison, 
after first making a judgment about relevant things to compare. In the combination of the two 
concepts -- "become different" --, we furthermore drop mention of the objects being compared. 
It's simply understood that they are certain attributes of the subject at two points in time. 
It is tempting to invent ad-hoc "transformational" explanations for these phenomena. One might 
conjecture, for example, that "The man changed." is a surface realization of four underlying 
sentences: 
(Man be X at time m. Man be Y at time n. X not equal Y. Time n greater-than time m.) 
The trouble with explanations of this sort -- apart from the fact that they introduce growing 
complexity into the understanding of straightforward utterances -- is that they assign arbitrary 
primacy to some concepts at the expense of others. Why should 
"time n greater-than time m" 
be an assumed primitive? May we not equally well conjecture that "time n greater-than time m" is 
a surface realization of these?: 
(Time be m. Time change. Then time be n.) 
For that matter, why not view 
"Time elapsed. " 
as a surface form of this?: 
"At least one thing in the universe changed." 
After all, what is "time" but a nominalized way of talking about the presence and partitioning of 
change? 
The difficulty, it would seem, lies in the very notion of context-independent "conceptual 
primitives". The metaphor itself is at fault: it calls to mind a fixed set of elements, like those of 
which matter is composed, out of which all ideas must be compounded. But where concepts are 
concerned, primitivity is a matter of focus. Shift the perspective a little, and new elements swim 
into view as fundamentals, while former simples become complex. 
A more promising metaphor is the analogy to a vector space. A set of basis vectors is, in a way, a 
set of "primitives" out of which all the entities in the space can be composed. These primitives 
have the appealing property that they are only primitive relative to one frame of reference. Rotate 
your point of view, and what used to come natural as basis vectors are now at an angle; they 
become easier to express as sums of vectors that lie along new axes. That bears a resemblance to 
what we have seen in the case of lexical "primitives". 
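The vector-space analogy can be made concrete in two dimensions. The following is only an illustrative sketch: the function name and the choice of a 45-degree rotation are assumptions made for the example, not part of the proposed lexical organization. It shows that a vector which is "primitive" (a basis vector) in one frame becomes a sum of two components once the frame is rotated, while a vector lying along a new axis becomes simple.

```python
import math

def coordinates_in_rotated_basis(vector, angle):
    """Express a 2-D vector in a frame whose axes are rotated by `angle` radians.

    The rotated basis vectors are e1' = (cos a, sin a) and e2' = (-sin a, cos a);
    the new coordinates are the projections of the vector onto them.
    """
    x, y = vector
    c, s = math.cos(angle), math.sin(angle)
    return (x * c + y * s, -x * s + y * c)

# A former basis vector, viewed from a frame rotated by 45 degrees,
# now needs two non-zero components.
old_primitive = coordinates_in_rotated_basis((1.0, 0.0), math.pi / 4)

# A vector that was "composite" in the old frame lies along a new axis.
new_primitive = coordinates_in_rotated_basis(
    (math.cos(math.pi / 4), math.sin(math.pi / 4)), math.pi / 4
)
```

Neither frame is privileged; which vectors count as simple depends entirely on the frame chosen.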
Thus far and no further may the analogy be pushed, however. The elements which span 
"conceptual space" can be no such uniform set of objects as those in a vector space, while the 
rules of composition are coextensive with grammar -- at a minimum. Composition of concepts 
itself contributes to the meaning. (For that matter, it is arguable whether concepts are sufficiently 
separable to model them as discrete objects at all -- whether simple or composite.) Moreover, as 
"conceptual space" must encompass all things thinkable, the rules of composition must themselves 
be part of the space. That is, the operators as much as the things operated upon lie within the space 
to be spanned. 
A seeming counterexample to these remarks may be found in the "primitive ACT's" of conceptual 
dependency theory, as propounded by Schank, Goldman, Rieger, and Riesbeck ([2], [7], [8], [9]). 
On a close reading, however, the "primitive ACT's" turn out to be verb paradigms -- powerful, 
semantically motivated generalizations about large classes of verbs. The names of these paradigms 
replace specific verbs as building blocks in the "conceptual" representation of an utterance. The 
effect is to provide strong guidelines for the inference of unstated information, for the comparison 
of related utterances, for paraphrasal, etc. 
To represent a particular verb in terms of these ACT's, however, it is necessary to augment each 
ACT with various substructures which detail the manner, the means, the type of actor or object, 
etc. No reduced set of representatives is as yet offered for the adverbs, nouns, adjectives, etc. in 
terms of which the "primitive ACT's" are qualified. If such additional condensation were 
attempted, the elaboration of a given utterance in terms of the full set of "primitives" might well 
ramify without practical end. In other words, reduction of the set of names for nodes (and labels 
for arcs) must be purchased at the expense of extending the number of them required to represent 
each utterance. 
In conceptual dependency representation, just as in the "semantic networks" of Quillian [6], 
Simmons ([11], [12]), Slocum, and others, reality ultimately appears as a shimmering web, every 
part of which trembles when any part of it is touched upon. Taken in its totality, the system -- as 
yet -- is entirely compatible with skepticism about a comprehensive set of "conceptual primitives". 
In any case, the verbal "senses" proposed here lie at a far lower level of generality than the 
"primitive ACT's" used in conceptual dependency theory. In terms of that theory, they come 
closest to the so-called "CONCEXICON entries" used by Goldman in realizing surface expressions 
of a concept from its conceptual representation [2]. Given a primitive ACT, Goldman 
narrows it down to a particular "CONCEXICON" entry by applying the tests in a discrimination 
tree to the rest of the structure in which the ACT appears. 
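The narrowing step just described can be pictured with a small sketch. Everything in it is hypothetical: the tree shape, the `object_kind` attribute, and the entry names are invented for illustration and are not taken from Goldman's system. The sketch only shows the general mechanism of walking a discrimination tree, applying one test at each branch to the structure surrounding the ACT until a leaf names an entry.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Leaf:
    entry: str  # name of the selected lexicon entry (hypothetical)

@dataclass
class Branch:
    predicate: Callable[[dict], bool]  # test applied to the surrounding structure
    if_true: "Node"
    if_false: "Node"

Node = Union[Leaf, Branch]

def select_entry(tree: Node, structure: dict) -> str:
    """Walk the tree, applying each test to the structure, until a leaf is reached."""
    node = tree
    while isinstance(node, Branch):
        node = node.if_true if node.predicate(structure) else node.if_false
    return node.entry

# Hypothetical tree for an INGEST-like ACT: the kind of object decides the verb sense.
INGEST_TREE = Branch(
    predicate=lambda s: s.get("object_kind") == "liquid",
    if_true=Leaf("drink"),
    if_false=Branch(
        predicate=lambda s: s.get("object_kind") == "gas",
        if_true=Leaf("breathe"),
        if_false=Leaf("eat"),
    ),
)

chosen = select_entry(INGEST_TREE, {"act": "INGEST", "object_kind": "liquid"})
```

In this toy form the leaves stand at roughly the level of generality of the verbal "senses" discussed here, while the root stands for the far more general ACT.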
Our lexical "senses", therefore, are left with a humbled role. If they span anything, it might best 
be thought of as "communication space", not "conceptual space". Even in this light, they are a 
hugely redundant basis, and a not at all unique one. They form no inventory of the experiences 
being communicated about; "meaning" is still a step removed, still evoked rather than embodied 
by the elements of this basis. 
If we persist in calling these things "senses", it is because that is the traditional term for what is 
brought to mind as the synonym sets of a given word are enumerated. The tie-in with meaning is 
tenuous, but the human user is able to supply it. There is at least this much justification for the 
term: synonym sets, more forcefully than words, direct attention to the points at which a tie-in 
must be made between the tokens of communication and the underlying representation of "world 
knowledge". 
In a full-fledged system for processing natural language, then, we must envision the "dictionary of 
senses" as a component stretching vertically across the "upper" layers. Its "sense data items" must 
link, in some way, to the deeper-lying data structures which encode "knowledge of the world" (the 
"pragmatic component"). The "key data items" and "phrase data items" register tokens to be 
expected or employed in "surface" utterances. Global and local attributes recorded in the various 
data items guide parsing and interpretation. Where one takes it from there depends upon the 
linguistic approach to be used. 
American Journal of Computational Linguistics Microfiche 50 : 24 
CURRENT BIBLIOGRAPHY 
Despite repeated predictions to the contrary, both the 
selection of material for this issue and the choice of 
subject categories are tentative. The Editor and his 
collaborators have found the reconstruction of intellec- 
tual and mechanical systems more onerous than they had 
expected. 
Completeness of coverage, especially for reports circulated 
privately, depends on the cooperation of authors. Summaries 
or articles to be summarized should be sent to the editorial 
office, Twin Willows, Wanakah, New York 14075. 
Many summaries are authors' abstracts, sometimes edited for 
clarity, brevity, or completeness. Where possible, an infor- 
mative summary is provided. 
The Informatheque de linguistique de l'Universite d'Ottawa, 
Dermot Ronan F. Collis, Director, provides a portion of our 
entries. AJCL gratefully acknowledges the assistance of J. 
Beck, B. Harris, and D. Castonguay. 
See the following frame for a list of subject headings with 
frame numbers. 
SUBJECT HEADINGS 
GENERAL ........ 26 
  Chinese ........ 35 
PHONETICS - PHONOLOGY ........ 36 
  Recognition ........ 38 
  Chinese ........ 43 
WRITING ........ 44 
  Recognition ........ 44 
  Chinese ........ 45 
  Synthesis 
  Chinese ........ 46 
  Text Input 
  Chinese ........ 48 
  Character sets 
  Chinese ........ 50 
  Chinese ........ 54 
LEXICOGRAPHY - LEXICOLOGY 
  Statistics ........ 56 
  Text Handling ........ 57 
  Dialectology ........ 58 
  Thesauri ........ 58 
GRAMMAR ........ 59 
  Generator ........ 60 
SEMANTICS - DISCOURSE ........ 60 
  Memory 
  Question Answering ........ 67 
  Text Grammar ........ 67 
LINGUISTICS 
  Methods 
  Mathematical 
COMPUTATION 
  Programming ........ 70 
  Languages ........ 71 
  Pictorial Systems ........ 72 
DOCUMENTATION ........ 73 
  Classification ........ 76 
  Retrieval ........ 76 
TRANSLATION ........ 76 
SOCIAL-BEHAVIORAL SCIENCE ........ 79 
  Anthropology ........ 82 
  Psychology ........ 83 
  Learning ........ 84 
HUMANITIES ........ 85 
  Concordance ........ 86 
  Analysis ........ 86 
INSTRUCTION ........ 88 
BRAIN THEORY ........ 91 
ROBOTICS 
GENERAL 
Putnam and Clarke and Mind and Body 
Yorick Wilks 
Artificial Intelligence Laboratory, Stanford 
British Journal for the Philosophy of Science 26:213-225, September 1975 ISSN 0007-0882 
Putnam argues for a satirical privacy for machines by asserting that, just as it makes no sense 
to ask John how he knows that he is in pain, so it makes no sense to ask a Turing Machine 
(TM) how it knows that it is in state n. When addressed to an abstract TM the question is 
absurd, but not when addressed to a physically realized TM. Putnam equivocates about the 
notion of state, discussing only abstract TM's when introducing the notion of state, but 
making an argument which is coherent only with respect to a physically realized TM. There 
is thus, in effect, a confusion between 'state' as of an automaton and 'state' as of a real 
machine which is executing a program which realizes that automaton--any of a number of 
states of the machine might correspond to one state of the automaton embodied in the 
program. Thus Putnam's argument fails. Clarke's criticisms of Putnam are misguided but 
instructive. A more serious notion of machine privacy can be constructed by noting that it 
is impossible to infer the machine's real activity determinately from the content of registers. 
GENERAL 
A Graphical Programming System with Speech Input 
Chacko C. Neroth 
University of California, Berkeley 
Computers & Graphics 1:227-231, 1975 
The experimental problem solving environment is one of formulating, specifying, debugging 
and executing (algebraic) procedures interactively on a small processor. The speech 
recognition system is a real time, syntax directed, limited vocabulary, highly cost effective 
scheme specifically tailored to this environment. The data transformation operations of the 
language are verbally specified and the control flow is specified graphically as a two- 
dimensional directed graph. The semantics of the latter structure is independent of the time 
sequence of its input. An input restricted (conditional input) pseudo-finite state machine 
model is used for the continuous syntax checking of the input on an atomic token basis and 
for directing the speech recognizer. 
GENERAL 
Computerized Natural Language Information System 
Stewart N. T. Shen 
Computer Science Department, Virginia Polytechnic Institute and State University, 
Blacksburg 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese 
Input/Output Systems, Academia Sinica, 573-588 
General problems in NL processing are discussed and a methodology is presented. A NL 
system should consist of a supervisory module which reads and interprets certain input 
sentences stated in some specific way. These sentences tell the system what kind of job is 
being done. The system would have various syntactic, semantic, and pragmatic processing 
modules available to it. Technological developments may well make it practical for 
individuals to have CNLIS (Computerized Natural Language Information System) terminals in 
their homes. A typical user terminal may include a microcomputer, an interactive TV, and 
an electric typewriter. In the computerization of Chinese, a simplification of the written 
characters is urged. 
GENERAL 
Design Concepts of Chinese Language Data Processing Systems 
Yaohan Chu 
Department of Computer Science, University of Maryland, College Park 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese 
Input/Output Systems, Academia Sinica, 117-136 
Five types of Chinese language data processing systems are discussed. 1) Accept assembly 
code in English and hand-coded Chinese data and use an expanded subroutine library. 2) 
Accept assembly code and data, both in Chinese, by adding a pre-assembler and translators to 
the manufacturer-supplied assembler and linkage editor. 3) Accept a high-level Chinese 
programming language and Chinese data. 4) Those which use the Chinese-language-oriented 
postfix string as the machine language. 5) In which the high-level language itself is the 
machine language (i.e. one-level language). This type of data processing system has no 
intermediate language, no assembly language, no relocatable language, and no absolute 
language. 
GENERAL 
Design Philosophy of a Chinese-Oriented Computer 
John Y. Hsu 
Department of Computer Science and Statistics, California Polytechnic State University, San 
Luis Obispo 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese 
Input/Output Systems, Academia Sinica, 135-150 
Basic design philosophy of a Chinese-oriented computer includes consideration of the 
idiosyncrasies of such a computer. The following topics are discussed: 1) Internal coding of 
Chinese characters, 2) Chinese Input/Output devices, 3) Instruction repertoire to manipulate 
Chinese characters, and 4) Miscellaneous. The design approach toward such a Chinese- 
oriented computer is also commented on. 
GENERAL 
Is Technology Ready for Chinese/Japanese Data Processing 
B. J. Greenblott, and M. Y. Hsiao 
IBM, Poughkeepsie, New York 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese 
Input/Output Systems, Academia Sinica, 151-161 
The technology for the computer processing of Chinese characters on a large scale is almost 
around the corner. The major bottleneck is the training required to key in 5,000 different 
Chinese characters quickly and correctly by either a set of keyboards or some cleverly 
combined coded form or keyboard design. Advances in LSI technology, mechanical or 
magnetic keys, CRT, etc., will all contribute to the realization of a data processing system 
capable of handling ideographic languages. An automatic pattern recognition system was not 
chosen to represent the major future trend because its technical development is still beyond 
the level of practical large scale implementation. 
GENERAL 
Conversation, Cognition and Learning 
Gordon Pask 
System Research Ltd., Richmond, Surrey, England 
Elsevier, Inc., Amsterdam and New York, 1975, $37.00/Dfl 96.00, xii + 570 pages, ISBN 0-444-41193-3 
This book describes a theory of man/man or man/machine conversations and cognitive 
processes (with emphasis upon the dynamics of learning and teaching at an individual level) 
together with several special experimental methods and practical applications. Most of the 
illustrations and data supporting the argument stem from education, course design, and 
similar fields and the material is relevant to epistemology, subject matter organisation, as well 
as such disciplines as pedagogy, computer aided instruction etc. Some experiments, however, 
deal with laboratory learning and the acquisition of perceptual motor skills, and an attempt is 
made to identify the theory and methods with many standard paradigms in social and 
experimental psychology. An account of consciousness and self-reference is given in the 
theory. 
GENERAL 
Information Processing and Cognition: The Loyola Symposium 
Robert L. Solso, editor 
Loyola University of Chicago 
Halsted Press Division, John Wiley & Sons, New York, 1975 ISBN 0-470-81230-3 HC 
$19.95 
Contents 
Preface ........ xi 
SECTION I 
1 MEMORY, PERCEPTION, AND DECISION IN LETTER IDENTIFICATION. W. K. Estes ........ 3 
  Introduction ........ 3 
  Apparent Effects of World Context on Perception of Letters ........ 6 
  Detection Procedures and Revival of the Redundancy Hypothesis ........ 7 
  The Postexposure Probe Technique ........ 9 
  An Interpretation of the Role of Positional Uncertainty ........ 16 
  Levels of Information Processing ........ 18 
  A Model for Levels of Information Processing in Letter Identification ........ 19 
  Interpretations of Empirical Phenomena ........ 22 
  Discussion ........ 26 
  References 
2 STUDIES OF VISUAL INFORMATION PROCESSING IN MAN. M. S. Mayzner ........ 31 
  Introduction ........ 31 
  Current Hardware System and Associated Software Package ........ 34 
  General Research Strategy ........ 35 
  A Brief Review of Sequential Blanking and Displacement ........ 36 
  Dynamic Visual Movement ........ 39 
  Subjective Color Experiences Associated with Dynamic Visual Movement ........ 42 
  Pattern Recognition Mechanisms as Found in Overprinting Paradigms ........ 44 
  Overview ........ 52 
  References ........ 53 
3 ATTENTION AND COGNITIVE CONTROL. Michael I. Posner and Charles R. R. Snyder 
  Introduction 
  Automatic Pathway Activation 
  A View of Conscious Attention 
  Strategies and Conscious Control 
  The Place of Value in a Judgment of Fact 
  Overview 
  References 
4 FORM, FORMATION, AND TRANSFORMATION OF INTERNAL REPRESENTATIONS. Roger N. Shepard ........ 87 
  Some Central Issues ........ 87 
  Summary of Some Experimental Findings ........ 96 
  Theoretical Discussion ........ 103 
  Acknowledgments and Historical Note ........ 117 
  References ........ 117 
SECTION II 
5 RETRIEVAL AS A MEMORY MODIFIER: AN INTERPRETATION OF NEGATIVE RECENCY AND RELATED PHENOMENA. Robert A. Bjork ........ 123 
  Negative Recency ........ 124 
  Related Phenomena ........ 1'16 
  Conclusion ........ 142 
  References ........ 143 
6 ENCODING, STORAGE, AND RETRIEVAL OF ITEM INFORMATION. Bennet B. Murdock, Jr. and Rita E. Anderson ........ 145 
  Encoding ........ 153 
  Storage ........ 158 
  Retrieval ........ 164 
  Retrieval at Short and Long Lags ........ 185 
  References ........ 192 
7 WITHIN-INDIVIDUAL DIFFERENCES IN "COGNITIVE" PROCESSES. William F. Battig 
  Some Questions about Current Cognitive Research Practices 
  Processing Differences within Individuals 
  Serial Learning 
  Paired-Associate Learning 
  Verbal-Discrimination Learning 
  Free-Recall Learning 
  General Conclusions 
  References 
8 CONSCIOUSNESS: RESPECTABLE, USEFUL, AND PROBABLY NECESSARY. George Mandler 
  The Revival of Consciousness 
  Conscious Contents and Processes 
  The Limitation of Conscious Capacity and the Flow of Consciousness 
  Conclusion 
  References 
DISCUSSION: SECTIONS I AND II 
SECTION III 
9 MEMORY REPRESENTATIONS OF TEXT. Walter Kintsch ........ 269 
  A Model for Episodic Memory ........ 270 
  Some Remarks on Text Bases ........ 272 
  Memory for Text ........ 277 
  Answering Questions about a Text from Memory ........ 284 
  Matches at Linguistic and Propositional Levels ........ 286 
  Postscript: List Learning and Text Learning ........ 291 
  References ........ 293 
*10 COMPUTER SIMULATION OF A LANGUAGE ACQUISITION SYSTEM: A FIRST REPORT. John R. Anderson ........ 295 
  Formal Results on Grammar Induction ........ 296 
  The Role of Semantics ........ 299 
  Rationale for the Research Approach ........ 301 
  The Program LAS.1 ........ 302 
  Bracket: The Graph-Deformation Condition ........ 317 
  An Example of Grammar Induction ........ 324 
  Assessment of LAS.1 ........ 334 
  Generalizations about Noun Phrases ........ 335 
  A Prediction about Language Learnability ........ 339 
  Summary ........ 344 
  Appendix ........ 345 
  References ........ 347 
11 SEMIOTIC EXTENSION. David McNeill ........ 351 
  The Study of Performance ........ 352 
  Production Mechanisms ........ 354 
  Ontogenesis ........ 355 
  Evidence for this Argument ........ 361 
  Further Evidence for the Argument ........ 364 
  Other Influences on Word Order ........ 365 
  Emergence of Patterned Speech ........ 367 
  Semiotic Extension ........ 372 
  Gestures ........ 373 
  Conclusion ........ 379 
  References ........ 379 
12 THE CONSTRUCTION AND USE OF REPRESENTATIONS INVOLVING LINEAR ORDER. Tom Trabasso and Christine A. Riley ........ 381 
  Problem Origins: the Development of Transitive Reasoning ........ 382 
  Training Results: Serial Position Effects ........ 385 
  Testing Results: Memory and Inference Correlations ........ 386 
  Stochastic Retrieval Models ........ 388 
  What Occurs in Training: The Serial-Position Effect ........ 392 
  What Occurs in Testing? ........ 394 
  Reaction-Time Experiments: Six-Term Series Problems ........ 395 
  Linear Order Is Independent of Input ........ 399 
  The Use of A Linear-Order Representation ........ 400 
  Accessing A Linear Order: Strength or Distance? ........ 401 
  Conclusions ........ 407 
  References ........ 409 
DISCUSSION: SECTION III ........ 411 
Author Index ........ 427 
Subject Index ........ 433 
*This article has been abstracted under SOCIAL-BEHAVIOURAL SCIENCE: PSYCHOLOGY 
on this fiche. 
GENERAL 
Proceedings of the First International Symposium on Computers and Chinese 
Input/Output Systems 
Academia Sinica, xiii + 1331 pages 
The Symposium was held on August 14-16, 1973 at Taipei, Taiwan, Republic of China. 
Many of the papers have been abstracted elsewhere on this microfiche. 
GENERAL: CHINESE 
Interactive Processing of Chinese Characters and Texts 
J. T. Tou, J. C. Tsay, and J. K. Yoo 
Center for Informatics Research, University of Florida, Gainesville 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese 
Input/Output Systems, Academia Sinica, 1-28 
The system provides a tool to teach pupils how to write Chinese ideographs, how to make 
proper pronunciations, and how to translate into a foreign language, and features dynamic 
display of characters. The system can also perform text-editing operations. Techniques for 
Chinese character representation, based on chain codes for stroke sequence, and dictionary 
generation, in which each character or subcharacter is represented as a subroutine in the 
dictionary, are introduced. Text-editing routines are discussed and the paper concludes with 
an illustration of text-editing operations. The final edited text can be transcribed from the 
display scope for making hard copies. The system will be further developed for editing maps, 
for typesetting and for use as a Chinese typewriter. 
PHONETICS-PHONOLOGY 
A Small Computer in the Phonetics Laboratory 
Claes-Christian Elert 
Professor of Phonetics, Umea University, Sweden 
World Papers in Phonetics, The Phonetic Society of Japan, Tokyo, 145- 162, 1974 
With adequate programming facilities at hand the phonetician would be able to make his 
table-top computer perform practically everything that was done earlier by conventional 
equipment for analysis and registration. In addition, data may be stored for automatic 
processing, and sequences of events, such as qualitative or quantitative variations of 
parameters or stimuli in experiments with human subjects, can be governed according to a 
pre-set program, or by incoming signals of random pulses. Topics considered: 1) the nature 
of available equipment, 2) programming, 3) speech analysis, 4) speech synthesis, 5) the 
computer in experimental work, 6) dialectology and phonology, 7) teaching phonetics. 
PHONETICS-PHONOLOGY 
A Study of Time-Domain Speech Compression by Means of a New Analog 
Speech Processor 
I. M. Bennett, and J. G. Linvill 
Department of Electrical Engineering, Stanford University, Stanford, California 94305 
Journal of the Audio Engineering Society 23:713-721, November 1975 
Time-domain speech compression using the SDA (sample, discard, about) procedure at 
compression ratios of 0.25 to 0.75 is studied by means of a new analog speech processor and 
minicomputer algorithms. Fourier transform methods have been used to establish a 
correspondence between the quality of the reconstructed compressed speech waveforms and the 
subjective recognition of compressed speech. The results of two psychoacoustic experiments 
indicate that 1) the interruption frequency should be equal to the pitch frequency of the 
voice waveform for optimum recognition of the compressed speech, and 2) smoothing of the 
discontinuities with electronic techniques significantly improves the recognition of the 
compressed speech. The optimum smoothing parameters, window width and characteristic 
function, are also obtained from this study. 
PHONETICS-PHONOLOGY 
On the Characteristics of Individual Vowels and the Statistical Characteristics 
of Formant Frequency Patterns in Connected Speech 
Yoshbari Kanamori 
Research Institute of Electrical Communication, Tohoku University, Sendai 980, Japan 
Systems - Computers - Controls, 6, No. 1:22-30, 1975 
The loci of formant frequency patterns of vowels in many kinds of CVC contexts were 
represented in F1-F2 space. The areas enclosing these loci were obtained for each vowel. 
The positions of vowels in connected speech lie inside an area surrounded by the isolated 
vowels because of the neutralization of vowels in connected speech. The faster the speaking 
rate, the more the areas tend to concentrate deeper inside. Also, the areas of individual 
vowels overlap each other and the faster the speaking rate, the more the overlapping areas 
increase. The overlapping areas were estimated in F1-F2 space and, to investigate the effect 
of F3, in F1-F2-F3 space. The distribution of F3 is nearly approximated by the normal 
density function, because the effect of imbalance of the vowel occurrence frequencies is not 
clearly observed in the frequency distribution of F3. Areas reflecting the bound of 
articulatory movement in the acoustic domain were obtained from the loci of formant 
frequencies represented in F1-F2 and F1-F3 spaces. We conclude with a comparison of the 
discussed areas and those obtained from the articulatory model by Lindblom. 
Epoch Extraction of Voiced Speech 
T. V. Ananthapadmanabha, and B. Yegnanarayana 
Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore 
560012, India 
IEEE Transactions on Acoustics, Speech, and Signal Processing 23:562-570, December 1975 
A general theory of epoch extraction of overlapping nonidentical waveforms is presented and 
applied to outputs of models of voiced speech production (model 1, impulse excitation of a 
two-resonator system; model 2, glottal wave excitation of a two-resonator system) and to 
actual speech data. Some typical glottal waveshapes are considered to explain their effect on 
the speech output. The points of excitation of the vocal tract can be precisely identified for 
continuous speech and it is possible to obtain accurate pitch information by this method even 
for high-pitched sounds. 
PHONETICS-PHONOLOGY: RECOGNITION 
Real-Time Digital Hardware Pitch Detector 
Ronald W. Schafer 
Department of Electrical Engineering, Georgia Institute of Technology, Atlanta 30332 
John J. Dubnowski, and Lawrence R. Rabiner 
Bell Laboratories, Murray Hill, New Jersey 07974 
IEEE Transactions on Acoustics, Speech, and Signal Processing 24:2-8, February 1976 
A high-quality pitch detector has been built in digital hardware and operates in real time at a 
10 kHz sampling rate. The hardware is capable of providing energy as well as pitch-period 
estimates. The pitch and energy computations are performed 100 times/s (i.e., once per 10 
ms interval). The algorithm to estimate the pitch period uses center clipping, infinite peak 
clipping, and a simplified autocorrelation analysis. The analysis is performed on a 300 
sample section of speech which is both center clipped and infinite peak clipped, yielding a 
three-level speech signal where the levels are -1, 0, and +1 depending on the relation of the 
original speech sample to the clipping threshold. Thus computation of the autocorrelation 
function of the clipped speech is easily implemented in digital hardware using simple 
combinatorial logic, i.e., an up-down counter can be used to compute each correlation point. 
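The clipping and counting scheme summarized above can be sketched in software. This is a minimal illustration under stated assumptions, not the authors' hardware design: the function names, the fixed `threshold` argument, and the lag search range are all hypothetical, and the way the real detector sets its clipping level is not given in the summary. The sketch shows why a three-level signal reduces each correlation point to an up-down count.

```python
import math

def three_level_clip(samples, threshold):
    """Center clip and infinite-peak clip: map each sample to -1, 0, or +1."""
    return [1 if x > threshold else -1 if x < -threshold else 0 for x in samples]

def correlation_point(clipped, lag):
    """Autocorrelation at one lag, computed as an up-down counter.

    Each product of two three-level samples is -1, 0, or +1, so the sum
    needs only counting: up when signs agree, down when they differ.
    """
    count = 0
    for a, b in zip(clipped, clipped[lag:]):
        if a == 0 or b == 0:
            continue
        count += 1 if a == b else -1
    return count

def estimate_pitch_period(samples, threshold, min_lag, max_lag):
    """Return the lag (in samples) with the largest correlation count."""
    clipped = three_level_clip(samples, threshold)
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: correlation_point(clipped, lag))

# Hypothetical test signal: a 300-sample section of a 100 Hz tone at 10 kHz.
section = [math.sin(2 * math.pi * n / 100) for n in range(300)]
period = estimate_pitch_period(section, threshold=0.5, min_lag=25, max_lag=200)
```

At a 10 kHz sampling rate the estimated period in samples converts to a pitch frequency as 10000 / period.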
A Comparison of Three Methods of Extracting Resonance Information from 
Predictor-Coefficient Coded Speech 
Randall L. Christensen 
Naval Weapons Center, China Lake, California 93555 
William J. Strong, and E. Paul Palmer 
Department of Physics and Astronomy, Brigham Young University, Provo, Utah 84602 
IEEE Transactions on Acoustics, Speech, and Signal Processing 24:8-14, February 1976 
The methods: finding roots of the polynomial in the denominator of the transfer function 
using Newton iteration, picking peaks in the spectrum of the transfer function, and picking 
peaks in the negative of the second derivative of the spectrum. A relationship was found 
between the bandwidth of a resonance and the magnitude of the second derivative peak. 
Data, accumulated from a total of about two minutes of running speech from both female 
and male talkers, are presented illustrating the relative effectiveness of each method in 
locating resonances. The second-derivative method was shown to locate about 98 percent of 
the significant resonances while the simple peak-picking method located about 85 percent. 
PHONETICS-PHONOLOGY: RECOGNITION 
A Method for the Correction of Garbled Words Based on the Levenshtein
Metric
Teruo Okuda
Systems Design Section, Systems Development Department, Fujitsu Limited, Kawasaki, Japan
Eiichi Tanaka, and Tamotsu Kasai
Department of Electrical Engineering, Faculty of Engineering, University of Osaka
Prefecture, Sakai, Japan
IEEE Transactions on Computers 25:172-178, February 1976
Using a method for correcting garbled words based on Levenshtein distance and weighted
Levenshtein distance we can correct substitution errors, insertion errors, and deletion errors.
According to the results of computer simulation on nearly 1000 high occurrence English
words, higher error correcting rates can be achieved by this method than by any other method
tried to date. Short words remain a problem; solving it will probably require utilization of
contextual information. Hardware realization of the method is possible, though complicated.
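As an illustration of the idea, the following sketch computes a weighted Levenshtein distance and corrects a garbled word by choosing the nearest dictionary entry. The cost parameters, the function names, and the tiny word list are assumptions made for the example; the paper's actual weights and decision rule are not given in this abstract.

```python
def weighted_levenshtein(source, target, sub_cost=1.0, ins_cost=1.0, del_cost=1.0):
    """Minimum total cost of substitutions, insertions, and deletions."""
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        curr = [i * del_cost]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + del_cost,                            # delete s
                curr[j - 1] + ins_cost,                        # insert t
                prev[j - 1] + (0.0 if s == t else sub_cost),   # match or substitute
            ))
        prev = curr
    return prev[-1]


def correct(garbled, dictionary, **costs):
    """Return the dictionary word closest to the garbled input."""
    return min(dictionary, key=lambda w: weighted_levenshtein(garbled, w, **costs))


best = correct("wkrld", ["world", "would", "word"])
```

With unit costs this reduces to the ordinary Levenshtein distance. Short words have many near neighbors at distance one, which is consistent with the abstract's remark that they remain a problem.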
PHONETICS-PHONOLOGY: RECOGNITION
Speaker-Identifying Features Based on Formant Tracks
Ursula C. Goldstein
Department of Electrical Engineering and Computer Science and Research Laboratory of
Electronics, Massachusetts Institute of Technology, Cambridge, 02139
Journal of the Acoustical Society of America 59:176-182, January 1976
The formant structure of three diphthongs, four tense vowels, and three retroflex sounds was
examined in detail for possible speaker-identifying features. Formant tracks were computed
for each sound under investigation using covariance-type pitch-asynchronous linear prediction
together with a root-finding algorithm. The interspeaker variability of about 200
measurements made on these formant tracks was compared initially with intraspeaker
variability through the calculation of F ratios. Those with average F ratios greater than 80
were evaluated further with a probability-of-error criterion. Features that are potentially
most effective in identifying speakers are the minimum second-formant value in [ar], the
maximum first-formant value in [ar], the maximum second-formant values of [o] and [ ],
and the minimum third-formant value of [ ]. The individual differences apparent in these
sounds presumably depend more on speaker habits than on vocal-tract anatomy. The error
bound predicted for a speaker identification procedure based on these five features is 0.24%.
An identification experiment using only the best two features gave 12 errors out of 50
identifications.
PHONETICS-PHONOLOGY: RECOGNITION 
Linear Estimation of Nonstationary Signals
Louis A. Liporace
Institute for Defense Analysis, Communications Research Division, Princeton, New Jersey
08540
Journal of the Acoustical Society of America 56:1296-, December 1975
Implicit in the use of linear prediction is the assumption that within each analysis frame the
signal is stationary. The acoustic signal is assumed to be suitably approximated by a
recursion which describes a linear time-invariant acoustic system composed of a
concatenation of equal-length, constant-diameter nondissipative tubes. That is, associated
with the coefficients (c ) in the recursion is a stylized articulatory configuration which
remains fixed throughout the analysis interval. If we allow the coefficients to be functions
of time rather than constants we can obtain a more realistic model in which the parameters
of the model change continuously and automatically with articulation, rather than
discontinuously at fixed intervals. The time-varying area function can be estimated by
adapting Wakita's procedure.
PHONETICS-PHONOLOGY: RECOGNITION 
A Semiautomatic Pitch Detector (SAPD)
Carol A. McGonegal, Lawrence R. Rabiner, and Aaron E. Rosenberg
Bell Laboratories, Murray Hill, New Jersey 07974
IEEE Transactions on Acoustics, Speech, and Signal Processing 23:570-574, December 1975
The determination of an utterance's pitch contour utilizes simultaneous display (on a 10 ms
section-by-section basis) of the low-pass filtered waveform, the autocorrelation of a 400-
point segment of the wideband recording. For each of the separate displays (i.e., waveform,
autocorrelation, and cepstrum) an independent estimate of the pitch period is made on an
interactive basis with the computer, and the final pitch period decision is made by the user
based on results of each of the measurements. Formal tests of the method were made in
which four people were asked to use the method on three different utterances, and their
results were then compared. During voiced regions, the standard deviation in the value of the
pitch period was about 0.5 samples across the four people. The standard deviation of the
location of the time at which voiced regions became unvoiced, and vice versa, was on the
order of half a section duration, or 5 ms. The major limitation of the proposed method is
that it requires about 30 min to analyze 1 s of speech. 
PHONETICS-PHONOLOGY: RECOGNITION 41 
A Pitch-Synchronous Digital Feature Extraction System for Phonemic
Recognition of Speech
Wolfgang J. Hess
Institut fuer Informationstechnik (Datenverarbeitung), Technische Universitaet Muenchen
IEEE Transactions on Acoustics, Speech, and Signal Processing 24:14-25, February 1976
The system has three portions: pitch extraction, segmentation, formant analysis. The pitch
extractor uses an adaptive digital filter in time domain, transforming the speech signal into a
signal similar to the glottal waveform. Using the levels of the speech signal and the
differenced signal as parameters in time domain, the subsequent segmentation algorithm
derives a signal parameter which describes the speed of articulatory movement. From this,
the signal is divided into "stationary" and "transitional" segments; one stationary segment is
associated to one phoneme. For the formant tracking procedure a subset of the pitch periods
is selected by the segmentation algorithm and is transformed into frequency domain. The
formant tracking algorithm uses a maximum detection strategy and continuity criteria for
adjacent spectra. After this step the total parameter set is offered to an adaptive universal
pattern classifier which is trained by selected material before working. For stationary
phonemes, the recognition rate is about 85 percent when training material and test material
are uttered by the same speaker. The recognition rate is increased to about 90 percent when
segmentation results are used.
PHONETICS-PHONOLOGY: RECOGNITION 
Analysis of Intonation Signals by Computer Simulation of Pitch-Perception 
Behavior in Human Listeners 
Yukio Takefuta 
Ohio State University 
SIGLASH Newsletter 8, No. 1:1-8, February 1975
Pauses are used to delimit utterances into segments. Linear regression analysis of pitch 
patterns allows a 4-way classification of slopes of lines: fast rising, rising, level, falling. 
These are the 4 Fundamental Pattern Features (FPF). A combination of 2 or 3 (of the 4) 
FPF's per segment of utterance is a pitch pattern (80 possible). An intonation pattern is a 
combination of pitch patterns. The position of the highest frequency value in the utterance
is important. In comparing 2 utterances, if the high point occurs in different segments the
intonations are contrastive even if the pitch patterns are the same. Of the 80 possible pitch
patterns, some must be recognized as cardinal patterns and some as cognate patterns to the
cardinal patterns. Different sets of rules must be used for the "high" segment and the final 
segment of the utterance. 
PHONETICS-PHONOLOGY: RECOGNITION 42 
Graph-Theoretic Cluster Analysis and Its Application to Speech Recognition 
Z. Chen
School of Electrical Engineering, Purdue University, West Lafayette, Indiana
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 225-242
A clustering algorithm is mainly a two stage process: 1) selection of a pairwise similarity
measure between every two samples or objects in the data set, 2) the similarity measure is
used in a sorting procedure whereby groups of similar samples are extracted. In a
graph-theoretic clustering algorithm a graph is constructed for the given data and subgraphs G
satisfying certain properties are obtained. The clustering algorithm features a flexible method
of edge construction (k-nearest neighbor threshold method), which allows the grouping of
samples to be more effective, and the generalized Frisch's labelling algorithm, which detects
and removes the possible chaining effect in the data. The algorithm is applied to the
recognition of nasal consonants.
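The two-stage process can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the edge rule used here (two samples are joined when each is among the other's k nearest neighbors and their distance is within a threshold) is one plausible reading of a "k-nearest neighbor threshold method", clusters are taken to be connected components, and the generalized Frisch's labelling step for removing chaining is not implemented. All names are hypothetical.

```python
import math


def knn_threshold_graph(points, k, threshold):
    """Join i and j when they are mutual k-nearest neighbors within threshold."""
    n = len(points)
    neighbors = []
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        neighbors.append(set(ranked[:k]))
    edges = {i: set() for i in range(n)}
    for i in range(n):
        for j in neighbors[i]:
            if i in neighbors[j] and math.dist(points[i], points[j]) <= threshold:
                edges[i].add(j)
                edges[j].add(i)
    return edges


def connected_components(edges):
    """Extract clusters as connected subgraphs (depth-first search)."""
    seen, clusters = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in edges[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(sorted(component))
    return clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = connected_components(knn_threshold_graph(points, k=2, threshold=2.0))
```

Requiring mutual neighborhood is one simple way to limit long chains of weakly related samples, although it is not a substitute for the labelling algorithm the abstract names.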
PHONETICS-PHONOLOGY: RECOGNITION 
A Comparison of Several Speech-Spectra Classification Methods
H. F. Silverman, and N. R. Dixon
Speech Processing Group, Computer Sciences Department, IBM Thomas J. Watson Research
Center, Yorktown Heights, New York
IBM Research Report 5584, 15 August 1975
Two measures of performance of speech spectra classification--accuracy and stability--were
derived through the use of an automatic performance evaluation system. Over 3000
hand-labelled spectra were used. The most successful classification method involved a
linearly-mean-corrected minimum distance measure, on a 30-point spectral representation with a
square (or cube) norm. Straight minimum distance is the worst performer. The question of
appropriate point representation is really one of adequate information retention. The
50-point representation contains too many components above 3kHz while 20- and 10-point
representations contain insufficient information relative to the classes to be discriminated.
The value of the norm exponent primarily relates to the weight given extrema in the norm
kernel; a heavier weighting (2 or 3) should be placed on extrema.
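A minimal sketch of a minimum-distance classifier with an adjustable norm exponent is given below. It assumes that "mean-corrected" means subtracting each spectrum's own mean level before comparison, which is an interpretation and not something the abstract spells out; the prototype spectra, labels, and function names are likewise hypothetical.

```python
def mean_corrected(spectrum):
    """Remove the overall level so only the spectral shape is compared."""
    mean = sum(spectrum) / len(spectrum)
    return [v - mean for v in spectrum]


def norm_distance(a, b, exponent=2):
    """Sum of |difference| ** exponent; larger exponents stress extrema."""
    return sum(abs(x - y) ** exponent for x, y in zip(a, b))


def classify(spectrum, prototypes, exponent=2):
    """Label of the prototype nearest to the spectrum after mean correction."""
    x = mean_corrected(spectrum)
    return min(prototypes,
               key=lambda label: norm_distance(x, mean_corrected(prototypes[label]),
                                               exponent))


prototypes = {"a": [1, 5, 1], "e": [5, 1, 1]}
label = classify([11, 15, 11], prototypes)  # same shape as "a", higher level
```

Raising the exponent from 1 to 2 or 3 makes the largest point-wise differences dominate the distance, which matches the abstract's remark about weighting extrema.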
PHONETICS-PHONOLOGY: RECOGNITION: CHINESE 
Speech Recognition and Chinese Voice Input for Computer 
Kung-Pu Li 
TRW Systems Group, One Space Park, Redondo Beach, California
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 211-223
The machine recognition of Mandarin monosyllables seems to be feasible at present. The
basic syllable structure contains three major parts: initial, tone, and final. The initial
contains only consonantal phonemes of four different categories: sonorant, plosive,
fricative, aspirate. There are four tonemes in Mandarin Chinese; the pitch contours
cover only the final part of the syllable. In the vowel part of a syllable, although seven
phonemes are sufficient to describe all possible vowels, the final can also contain diphthongs
and triphthongs composed of more than one vowel phoneme. An integrated recognition
procedure for monosyllable utterances has been suggested, and some results are reported.
PHONETICS-PHONOLOGY: CHINESE
Chinese Phonemes Analysis and Synthesis
T. Y. Chou, and K. C. Huang
National Chiao Tung University, Hsinchu, Taiwan, Republic of China
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 1227-1241
The system for producing consonants is a noise generator followed by a pole-zero resonator,
while that for producing vowels is a quasi-periodic pulse generator with variable period
followed by a resonator with three variable poles called formant frequencies, ranging from 0
to 3 KHz. The results of analyzing, by means of sonagrams obtained from ten male voices,
show that the 16 vowel phonemes can be classified into two classes as single and compounded
vowel sounds. Some of the synthesized single vowels are very monotonic and can be
recognized. Others, with third formant frequency slightly greater than 3 KHz, are not as
clear, due to the fast decaying of high frequency components in the generated pulses. The
compounded vowels are also synthesized by a step variation of formant frequency derived
from its components. The result is also well recognizable. The sonogram analysis of a
continuous Chinese speech shows that every word and its spelling phonemes are quite
independent and separable, and are therefore very different from English speech.
WRITING 44 
Graphemic Synthesis: The Ultimate Solution to the Chinese Input/Output
Problem
Carl Leban
The University of Kansas, Lawrence
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 533-552
In design of I/O systems for human graphics it is necessary to simulate the activity of writing
and not the graphic result of that activity. Orthographic rules are essentially sets of criteria
for determining proper serial order of graphic signs, upon which further subsets of phonetic,
semantic, graphic and other conventions are imposed. Chinese orthography is of the
polyalternating polyvariable type in which a number of undefined subsets of graphic signs
combine with each other in any of several possible juxtapositional modes according to rules
not yet fully elucidated. But if the logography is not converted to an invariable series it
cannot be input to, manipulated in, or output from a digital computer. The temporal series
in which elements are composed into logographs is invariably serial, and therefore computer
compatible. Graphemic synthesis is a procedure by which logographs are mechanically
produced in a manner simulating the normal writing procedure. Since logographs are
synthesized from a small finite set of component elements, there is no need to prestore
logographs, but only the small grapheme set. Output in normal logography is achieved as
needed only at the output end by reversal of the synthesizing process. It is possible to
achieve a synthesis at somewhat less than the ideal level, pseudographemic synthesis, and this
has been implemented in the SINCO system.
WRITING: RECOGNITION 
Feature Extraction on a Finite Set of Binary Patterns 
Paul P. Wang, and William S. Hodgkiss, Jr.
Department of Electrical Engineering, Duke University, Durham, North Carolina
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 183-194
A "best" subset of mutually orthogonal features which are minimum in number but sufficient
to discriminate a finite set of patterns is chosen from a much larger set of available features
in a systematic and deterministic manner by a heuristic program based on the criterion of
maximum separability. The unique code words for the finite set of binary patterns are
established through a learning procedure derived from a theorem on necessary and sufficient
conditions for mutual independence of these vectors over a binary field.
WRITING: RECOGNITION: CHINESE 45 
An Experimental System for the Recognition of Handwritten Chinese 
Characters 
Shi-Kuo Chang
Institute of Mathematics, Academia Sinica, Nankang, Taipei, Republic of China
Der Her Lo
Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan, Republic
of China
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 257-267
A Chinese character can be thought of as composed from a set of straight line segments. A
stroke is ideally a line segment or a concatenation of several straight line segments. Each
straight line segment has its starting point, direction and length. There is also a specified
sequence among these segments. The specified sequence of line segments for a character is
the same as the sequence of their starting points. Therefore, when a character is drawn on
the tablet of the digitizer, the output paper tape containing the (x,y) coordinates of sampling
points of each of its line segments presents these points in the proper order. The
preprocessor produces a sequence of simplified straight line segments, each with a direction
code and length, from the paper tape input, and sends the results to the classifier which
constructs a dictionary of characters which it then uses in the recognition process. The
program achieved 95% recognition for a test sample of 300 Chinese characters.
WRITING: RECOGNITION: CHINESE 
Computer-Aided Chinese Character Recognition by Forward Markovian 
Dynamic Programming 
Yung-Lung Ma 
Department of Electrical Engineering, National Taiwan University, Taipei, Republic of China
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 269-286
A Chinese character may be called a kind of block picture. Each stroke seems to be a
hieroglyph. Curves formed by several strokes or by a continuous stroke do not happen very
often. Some characters differ only by a single stroke. If the Markovian processes mentioned
in this paper are used, detailed recognition for each row and column is available and even a
single stroke would not be missed, so increasing the degree of correct
recognition is by no means a problem. A plane block picture may be divided into plane
blocks while a solid one can be discussed by dividing it into solid blocks. The greater the
number of layers, the higher the degree of correct recognition. The input pattern is divided
into 20 layers for individual recognition. If every layer satisfies the condition, then
recognition is complete. If one of the layers has a great difference, then the pattern should
be another picture.
WRITING: RECOGNITION: CHINESE 46 
The Topological Analysis, Classification, and Encoding of Chinese Characters
for Digital Computer Interfacing--Part I
Paul P. Wang
Department of Electrical Engineering, Duke University, Durham, North Carolina
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 417-439
A set of features believed to be useful in classification and recognition and which is deduced
from topological properties and heuristic properties is proposed. An encoding scheme offers
a unique code word for each character (signature) of a dictionary of about 6,000 items. A
three-stage machine recognition system, based upon the optimal multiple category
classification principle, has been proposed to solve the problem of automatic reading of
Chinese characters. A by-product of this research is the development of a topologically based
lexicographical ordering for a useful Chinese dictionary. Finally, some recommendations
concerning machine recognition of printed Chinese ideographs are made.
WRITING: SYNTHESIS: CHINESE 
Software Method in Kanji Information Processing 
Teiji Kakinuma
Fujitsu Limited, Minato-ku, Tokyo, Japan
Kenn~i Tsukatani 
PETY Limited, Chiyoda-ku, Tokyo, Japan 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 983-998
In the FCL (FACOM Composition Language) System, information is punched on paper tape
with a Kanji keyboard. The layout data turns out the forms of final printed matter as
parameters, punched with an alphanumeric keyboard, and these are applied to the FCL as an
input together with the text data. The results of the editing by the FCL are output to a
cassette magnetic tape, transferred to the photo type-autosetter, and are printed on film. At
this stage a print for correction is completed and this print is placed in the correction
processing cycle. The correction processing generates the corrected data concerning errors in
the text and layout data and this corrected data is again input to the FCL. The FCL saved
file is utilized as the objective of the correction processing. The FRAME program is the
portion of the layout control system which defines paragraph groups which have the same
character and shape.
WRITING: SYNTHESIS: CHINESE 
Photo-Electrostatic Kanji Printer 
Atsushi Ishi, Yoichi Hagiwara, Yoshimitsu Masui, and Yoshiyuki Aida
Fujitsu Ltd., Minato-ku, Tokyo, Japan
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 969-981
This Kanji printer consists of a character generator and a printer. The character generator
has a small rotating image disc with 5,376 characters printed on it, and the character patterns
are converted into video signals by a vidicon. The printer has an optical fiber tube and a
photo-electrostatic recording element. Reproduced character patterns on the surface of the
optical fiber tube are recorded on the dielectric coated paper by a photo-electrostatic element.
This Kanji printer is capable of printing 100 characters a second, and is usable for any
application of printing in Kanji.
WRITING: SYNTHESIS: CHINESE 
Designing Storage/Output Units for Chinese Input/Output Digital Computers
Kai Huang
Department of Electrical Engineering, University of Miami, Coral Gables, Florida
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 931-942
The storage unit (SU) provides a permanent filing cabinet for storing Chinese characters in
any predetermined binary-coded form. The stored information is addressable and readable
from external control. The SU contains a large-scale cellular array of Read-only Memories
(ROM) with the associated address decoder, control logic, memory address/data registers and
sense amplifier to enable readability. The output unit is used for printing or displaying
decoded Chinese in ideographic form. It consists of a k2-segment character decoder, two
buffer registers, and auxiliary display terminals (DT). The DT may take many forms, such as
a multiple-head printer, D/A converter and storage CRT monitor or a graphical display
console made with an array of light emitting devices (LED).
WRITING: SYNTHESIS: CHINESE 48
Computer-Aided Design of Chinese Character Patterns
Hideo Hirahara, Kiyoshi Kibuchi, and Masamitsu Satou
Information Systems Research Laboratory, Toshiba Research and Development Center,
Kawasaki-City, Japan
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 909-930
The system generates dot patterns from any original design pattern or handwritten items.
The source patterns are scanned with a vidicon camera and recorded on magnetic tape. The
following procedures are used: 1) Noise elimination and smoothing of scanned patterns, 2)
Line enhancement, 3) Matrix size compression, 4) Interactive refinement, 5) Automatic
generation of Chinese character read only memory (ROM) patterns. The obtained dot
patterns are then translated into a paper tape for a numerical controller which in turn drives
the wiring system for the read only memory. Chinese character line printers, CRT displays,
and other Chinese character output devices can be implemented by this Chinese character read
only memory.
WRITING: TEXT INPUT: CHINESE 
A System Design for the Input of Chinese Characters through the Use of
Phonetic and Orthographic Symbols
H. C. Li, S. P. Hu, C. L. Jen, H. Chou, S. Shan, and E. T. Chen
Department of Economics, Bryant College, Smithfield, R.I.
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 501-511
The following principles have been observed in system design: 1) Easy to Learn and Use, 2)
Inexpensive to Implement, 3) Higher Input Rate, 4) Unique Code for Dictionary Search, 5)
Facilitate Other Related Applications. The input of Chinese characters is through the use of
phonetic and orthographical symbols. The total number of symbols needed to transcribe a
single Chinese character varies from a lower limit of three to a maximum of eight. A single
Chinese character requires a maximum of three phonetic symbols and one intonation notation
to indicate pronunciation. By coupling one to four of fifteen orthographical symbols with
the pronunciation symbols, each Chinese character can be uniquely transcribed into a set of
symbols which indicates the pronunciation as well as the orthographical structure of the
ideograph. No new hardware is required for implementation. With some minor
modifications, the keypunch machines and other typewriter-like input peripherals now
available on the market can be used immediately.
WRITING: TEXT INPUT: CHINESE 49 
A New Approach to a Chinese (Tele) Typewriter, Which Can be Used as a 
Telex, Data Terminal and Computer Input/Output Device
Ye-San Liu
Directorate General of Telecommunications, Taipei, Republic of China
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 489-499
A typing keyboard is proposed which is arranged like an English teletypewriter, using an
8-unit code as in an ASCII code with even parity. By switching a lever key, you will be able to
type Chinese or English. When you type Chinese, you need only to push a key three times at
most to complete the selection of a character. Then the character will come out by pushing
the space bar once. If it is so arranged as to push a key three times for all characters, we
shall be able to save the process of pushing the space bar for every character. The rules of
decomposing Chinese characters are studied. The key layout of the keyboard is so arranged
as to make recognition of key position easy. Four simple typing rules have been determined.
There are only 21, out of the 3,000 characters which often appear in current newspapers, that
share the same 3 keyed codes, and so they have been treated as exceptions.
WRITING: TEXT INPUT: CHINESE 
PEACE--A Phonetic Encoding and Chinese Editing System 
Shi-Kuo Chang, Chi-Shion Chiu, Ming-Hwei Yang, and Bao-Shuh Lin 
Computation Laboratory, Institute of Mathematics, Academia Sinica, Nankang, Republic of
China
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 29-47
Different Chinese characters may have the same phonetic transcription (using Chinese
National Phonetic Symbols), requiring methods to disambiguate homonyms. If, however, the
Chinese text is coded into phrases separated by delimiters, then the phrases can often be
decoded unambiguously to obtain the corresponding string of Chinese characters. The file
structure for the PEACE system consists of a character file and a phrase file. Chinese
characters are stored in the form of a composition rule. The phonetic encoding method has
also proved to be quite satisfactory, especially for the generation and editing of Chinese texts,
where more than 60% of the characters are embedded in phrases. Character encoding rates of
the order of 30 characters per minute can be attained with this system.
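The benefit of coding text as delimited phrases can be illustrated with a small sketch. Everything in it is hypothetical: romanized syllables with tone digits stand in for the National Phonetic Symbols, the two dictionaries stand in for the phrase file and character file, and the composition-rule storage of characters is not modeled.

```python
from itertools import product

# Hypothetical character file: one phonetic code maps to several homophones.
CHARACTER_FILE = {
    "dian4": ["電", "店", "殿"],
    "nao3": ["腦", "惱"],
}

# Hypothetical phrase file: a whole phrase usually has one reading.
PHRASE_FILE = {
    ("dian4", "nao3"): "電腦",
}


def decode(phrase):
    """Return the candidate character strings for one delimited phrase."""
    key = tuple(phrase)
    if key in PHRASE_FILE:
        return [PHRASE_FILE[key]]  # decoded unambiguously via the phrase file
    # Fall back to every combination of homophones from the character file.
    choices = [CHARACTER_FILE[code] for code in key]
    return ["".join(chars) for chars in product(*choices)]


unambiguous = decode(["dian4", "nao3"])
ambiguous = decode(["dian4"])
```

A single syllable leaves three candidates for the operator to choose from, while the two-syllable phrase resolves to one string, which is the effect the abstract attributes to phrase coding.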
WRITING: CHARACTER SETS: CHINESE 
A New Alphameric Code for Chinese Ideographs 
Nelson Ling-Sun Chou 
East Asia Library, Rutgers University 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 471-488
The following are assumed: 1) All Chinese ideographs are composed of one or more
components, and thus may be classified by the pattern of these components. 2) Each of the
components is in turn composed of one or more graphic elements. The total number of
graphic elements is fairly limited. Ideographs may be divided into four major patterns:
Horizontal, Vertical, Bordered, and Independent. After identification of an ideograph's
pattern, a component's structure, and the basic elements, one can then perform the coding by
following rules for: 1) Ideograph as a whole, 2) Components, 3) Elements, 4) Relationship or
separation signs, 5) Coding sequence, 6) Component of bordered pattern, 7) Independent
ideographs of components.
WRITING: CHARACTER SETS: CHINESE 
Chinese Input-Output with Standard IBM Selectric Typewriter Terminal 
Ching C. Tsao 
IBM Corporation, Armonk, New York
Emerson W. Pugh
IBM T.J. Watson Research Center, Yorktown Heights, New York
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 459-469
A multicorner indexing system has been developed for entering Chinese characters, using the
numeric keys of a standard computer terminal. Each character is uniquely coded with a
sequence of one to nine digits in a nine-corner extension of Wang's Four Corner System. A
20- by 21-dot array output code has been programmed in APL for use on the IBM 2741
Standard Selectric Terminal with the fine-plot printing element. These two techniques can
be combined to provide an easily learned Chinese input-output system for use on standard
computer hardware.
WRITING: CHARACTER SETS: CHINESE 51 
The Creation of a Set of Alphabets for the Chinese Written Language 
Kendall L. Su 
School of Electrical Engineering, Georgia Institute of Technology, Atlanta
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 441-458
A set of symbols has been derived by dividing up each in a set of 4,600 Chinese characters.
This set of symbols can represent practically all Chinese characters. These symbols can be
used as the alphabets of the language, except that they offer no phonetic information on each
character. The spelled-out Chinese is readable without ambiguity and referential aid. The
immediate and long-term applications of these alphabets are discussed. Extensive examples
are given.
WRITING: CHARACTER SETS: CHINESE 
On the Formal Description of Chinese Characters
Marion R. Finley, Jr.
Universite Laval, Quebec, Canada
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 513-531
Chinese characters may be defined as objects produced by a generative grammar suitably
extended to two dimensions. At the heart of this extension is the coordinate-free
configuration operator which, together with its inflections, permits the desired stroke linking
to compose a given character. The character so generated corresponds then to a
derivation-tree which reveals structural properties common with other characters. This tree in turn leads
to a quasi-algebraic expression for the character which can be coded to give character
indexing for storage and retrieval purposes. The recognition problem is considered briefly.
Some experimental results with a character design language are presented.
WRITING: CHARACTER SETS: CHINESE
The Chiao-Tung Radical System, Part I: The Analysis and Design of the
Chiao-Tung Radical System
Ching-chun Hsieh, Yung-wen Hwang, Shu Lin, and Kwei-min Hsu
Computer Science Department, National Chiao-Tung University, Hsinchu, Taiwan, Republic of
China
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 49-62
Chinese characters are characterized by strokes which are well distributed in a block space of
a definite size. The use of radicals plus weight design allows composition of characters from
radicals to yield pleasing results. Radicals with fewer strokes are given smaller weights; those
with complicated strokes are given greater weights, so that the character obtained from the
composition is evenly distributed in the block. The characters thus composed directly look
very much like those integral characters which are obtained from dot matrices without
decomposition. Using only 496 radicals, 18,713 Chinese characters can be generated. The
radical system proposed is a precedence grammar.
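The weighting idea can be illustrated with a minimal sketch. It assumes, hypothetically, that a character is a left-to-right row of radicals and that a radical's weight is simply its share of the block width; the radical names, the weights, and the function `allocate_widths` are illustrative and are not taken from the paper.

```python
def allocate_widths(weights, block_width=32):
    """Split a block of block_width units among radicals in proportion to weight.

    Assumption: weights is an ordered mapping radical name -> weight, and the
    radicals sit side by side. The paper's actual weight design is not given.
    """
    total = sum(weights.values())
    names = list(weights)
    widths = {}
    used = 0
    for name in names[:-1]:
        w = round(block_width * weights[name] / total)
        widths[name] = w
        used += w
    # The last radical takes the remainder so the widths always fill the block.
    widths[names[-1]] = block_width - used
    return widths


layout = allocate_widths({"simple_radical": 1, "complex_radical": 3})
print(layout)
```

A radical with more strokes receives a larger weight and therefore a wider slice of the block, which is one simple way to keep stroke density roughly even.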
WRITING: CHARACTER SETS: CHINESE
The Chiao-Tung Radical System, Part II: Character Composition and Methods
to Represent Radicals
Ming-Wen Du, Ching-Chun Hsieh, Shu-Hong Chi, and Shu-Ching Chu
Computer Science Department, National Chiao-Tung University, Hsinchu, Taiwan, Republic of
China
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 63-78
There are two steps in composing a Chinese character from its radical formula: 1) Calculate
the position of each radical and the area occupied by each in the character. 2) Compress
each radical and place the radicals into their right positions. There are seven methods for
representing the radical system in a computer: 1) dot matrix method, 2) absolute line segment
method (the method adopted in implementing the system), 3) relative line segment method, 4)
core method, 5) chain code method (basically a line segment method), 6) analytical method,
7) mixed mode, which requires the smallest memory space (7.992 K Bytes) to store all 496
radicals.
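The second composition step can be sketched for a line-segment representation. This is an illustration under assumptions, not the authors' implementation: radicals are taken to be stored as segments in a unit square, and the hypothetical function `compress` scales them into the rectangle computed in step 1.

```python
def compress(segments, box):
    """Scale segments drawn in the unit square into box = (x0, y0, width, height)."""
    x0, y0, w, h = box
    return [
        ((x0 + ax * w, y0 + ay * h), (x0 + bx * w, y0 + by * h))
        for (ax, ay), (bx, by) in segments
    ]


# A made-up cross-shaped radical: one vertical and one horizontal stroke.
cross = [((0.5, 0.0), (0.5, 1.0)), ((0.0, 0.5), (1.0, 0.5))]

# Place it in the left half of a 32 x 32 block.
left_half = compress(cross, (0, 0, 16, 32))
print(left_half)
```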
WRITING: CHARACTER SETS: CHINESE
The Upper-Right Corner Indexing System for Chinese Language and a New
Chinese Character Encoding System
T. Y. Kiang
Department of Electrical Engineering, National Taiwan University, Taipei, Republic of China
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 79-93
All Chinese characters may be classified into three classes: 1) Characters with no common
radicals, 2) Characters composed of a common radical and a main part, 3) Characters
including more than one common radical. 100 common radicals were selected for use in the
system and nearly 70% of the characters belong to class 2. Since most of the common
radicals are on the left sides of characters, we suggest indexing a character according to its
upper or right stroke form. In the resulting system it is necessary to learn only 47 indices.
The characters as well as the main parts are equally distributed under the indices. The system
is easy to learn, fast in operation, has a large vocabulary, is low in cost, and is readily
implementable on a mini-computer.
WRITING: CHARACTER SETS: CHINESE 
Character Identification Using the Phonetic Code and the Four-Corner Code 
(in Chinese) 
T. he, an C. Ysng 
National Tsing-Hua University, Hsinchu, Taiwan, Republic of China
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 95-101
WRITING: CHARACTER SETS: CHINESE
Use of Contextual Information in the Design of a Chinese Character Pattern
File
I-Lieh Huang
IBM Watson Research Center, Yorktown Heights, New York
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 1007-1012
We assume that Chinese characters are digitized into n by n matrices of 0's and 1's and
represented in packed form in computer storage, usually on a magnetic disk. Assume that an
unblocked indexed sequential method is used for organizing the c.p. (character pattern) file.
Each record contains a 2-byte key and 34n bytes of data, where n is the number of Chinese
character patterns contained in the record. We assume here that we use 2 bytes for character
ID and 32 bytes of character patterns. If we increase the number of character patterns in
each record from one to four, we reduce the number of records per track (disc memory) by a
factor less than 2. A study of contexts of occurrence will tell us what c.p.'s should be stored
together. Thus a record should contain not only the c.p. of a given character, but the c.p.'s
of some characters that would follow the given character with high probability.
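The grouping idea can be sketched as follows. This is a minimal illustration, assuming that successor counts from a sample text are an adequate estimate of "high probability"; Latin letters stand in for Chinese characters, and `build_records` and `record_bytes` are hypothetical names. Only the record size formula (2-byte key plus 34 bytes per pattern) comes from the abstract.

```python
from collections import Counter, defaultdict


def record_bytes(n):
    """Size of one record: 2-byte key plus 34 bytes (2 ID + 32 pattern) per c.p."""
    return 2 + 34 * n


def build_records(text, per_record=4):
    """Map each character to a record holding it plus its most likely successors."""
    followers = defaultdict(Counter)
    for cur, nxt in zip(text, text[1:]):
        followers[cur][nxt] += 1
    records = {}
    for ch in sorted(set(text)):
        likely = [c for c, _ in followers[ch].most_common() if c != ch]
        records[ch] = [ch] + likely[: per_record - 1]
    return records


records = build_records("ABABACAB")
print(records, record_bytes(4))
```

Fetching the record for one character then brings along the patterns most likely to be needed next, at the cost of some duplication across records.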
WRITING: CHINESE 
An Intelligent Terminal for Chinese Character Processing
F. F. Fang, C. N. Liu, and D. T. Tang
IBM Thomas J. Watson Research Center, Yorktown Heights, New York
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 103-114
The proposed terminal system consists of three modules: Input, Output, Control. The input
module includes: 1) a character board consisting of a position sensing input device or an
array of keys which may be used to input a selected character by transforming its x, y
positions to any desired character code, 2) a general purpose keyboard which may contain 50
to 300 keys each of which is associated with a discriminating mask, and 3) a set of
discriminating masks which, when used singly or as a group, perform the character set
selection logic. The output module consists of: 1) a character description file which is
organized according to the character code and contains information for generation of
characters for printing or display, and 2) a printer or display for physical output of
characters. The control for the input and/or output system is obtained through either
software or firmware. Various degrees of intelligence may be programmed to achieve
additional character selection and character generation features.
WRITING: CHINESE
Techniques for the Implementation of a Chinese Input/Output System
G. W. Crawford and S. H. Chung
Westinghouse Electric Corporation, Pittsburgh, Pennsylvania
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 961-968
The input problem is solved using an extension of Wang's Four-Corner Dictionary Method
that uniquely describes a Chinese word by a six digit number. The technique is then
simplified by utilizing a five digit number and an interactive CRT terminal that allows
complete resolution of ambiguities. Various methods of output are briefly considered and it
is concluded that a Computer Output Microfilm (COM) unit is the most logical device
presently available. A specific example of an implementation of a Chinese input/output
system based on a CDC 7600 computer driving a Stromberg-Datagraphix COM unit is
described and a specific example of the output included.
WRITING: CHINESE 
The Modern Chinese Computer System 
Y-Tze Kuo
B.P. 5080, Kinshasa, Zaire, Africa
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 957-959
Input is handled by the author's mixed typing system Modern Chinese typewriter which has a
total of 2400 normal and half-size Chinese words which permit the formation of the
necessary 8000 common Chinese words. With this smaller number of Chinese words it is
possible to represent them by different punched holes on the same kind of punch card as in
the occidental computers. Any occidental computer and any new Chinese computer could be
used with the system for Chinese language as well as for English and other languages. Output
is through the author's Modern Chinese typewriter equipped with automatic typing and
positioning devices.
WRITING: CHINESE
The Chiao-Tung Radical System, Part III: Keyboard Design and an
Implementation of the Chiao-Tung Radical
Yung-Wen Huang, Ching-Chun Hsieh, Sic-Lin Liu, and Tai-Chung Chen
Computer Science Department, National Chiao-Tung University, Hsinchu, Taiwan
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 943-956
The system consists of: 1) HP 2100A mini-computer, 2) keyboard of 640 keys, 3) or
keyboard of 88 keys, 4) tape reader, 5) electrostatic printer, 6) display scope. Chinese
characters may be fed into the computer either through keyboards or through paper tape.
Output is printed by the printer and also displayed on the scope. The input radicals are
combined into characters by the composition method in the computer. After three hours of
familiarization, an operator may reach a speed of 10 characters per minute. It is estimated
that a speed of 30 cpm may be reached in a month. To be compatible with other input
methods (phonetic alphabets, four-corner code, standard telephone code, etc.) only a table is
needed. Since it takes merely 1 ms to compress a character, time sharing may be utilized.
LEXICOGRAPHY-LEXICOLOGY: STATISTICS
Computer Generated Word Classes and Sentence Structures
A. Tretiakoff
Commissariat a l'Energie Atomique, BP 6, 92, Fontenay aux Roses, France
Information Processing 74, North-Holland Publishing Co., 919-93, 1974
First a dictionary of the corpus (here an extract from S. Maugham's The Painted Veil) is
produced. Classification of words is based on the maximum information principle which
considers that for a given number of groups, the greater the quantity of information, the
better the distribution of the words into these groups. The words are distributed into two
groups in such a way that the quantity of information associated with this classification is
maximized. Dichotomization is carried on until the statistical uncertainty on the amount of
information is greater than the gain of information obtained by a new dichotomy. For each
sentence of the corpus, the code produces a structure based on the degree of correlation of
two consecutive words. Inside the sentence consecutive words are connected two by two in
order of decreasing degree of correlation.
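The sentence-structuring step can be sketched as a greedy bracketing. The sketch assumes that a correlation score for each pair of neighbouring words has already been computed; the scores below are invented for illustration, and `link_sentence` is a hypothetical name rather than the paper's procedure.

```python
def link_sentence(words, correlations):
    """Bracket a sentence by joining neighbours in order of decreasing correlation.

    correlations[i] is the (assumed, precomputed) correlation between
    words[i] and words[i + 1].
    """
    units = list(words)          # current constituents, left to right
    scores = list(correlations)  # scores[i] sits between units[i] and units[i + 1]
    while len(units) > 1:
        i = max(range(len(scores)), key=scores.__getitem__)
        units[i:i + 2] = [(units[i], units[i + 1])]  # join the closest neighbours
        del scores[i]
    return units[0]


tree = link_sentence(["the", "old", "man", "slept"], [0.2, 0.9, 0.4])
print(tree)
```

The most strongly correlated pair is joined first, so it ends up deepest in the resulting binary structure.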
LEXICOGRAPHY-LEXICOLOGY: STATISTICS
Text Connexitivity and Word-Frequency Distribution
Hans Karlgren
KVAL, Stockholm
Hakan Ringbom, Ed., Style and Text, Sprakforlaget Skriptor, Stockholm, 335-348, 1975
ISBN 91-7282-095-0
The distribution of a word in a text can be described in two ways: by the manner in which it
differs from that in some other texts and by the manner in which it varies from one part of
the text to another. Thematic words are those which stand out in comparison with a given
background. The use of the proper statistical techniques (a measure due to Hassler-Goransson,
which is defined as the chi-square value divided by the number of degrees of
freedom, is suggested) makes it possible to study the way in which words enter and leave the
scene in a pattern which characterizes the text in much the same way as the intrats and
exits determine a play. Using more sophisticated methods it is possible to study the strength
of connection between any two parts of the text, and a segmentation of the text into
internally more strongly connected and mutually more loosely connected portions can be
tested or even mechanically suggested.
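A measure of the kind described (chi-square divided by degrees of freedom) can be sketched as follows. The exact definition used in the paper is not given in the abstract; this version assumes that the expected count of a word in each text segment is proportional to segment length, and `dispersion` is a hypothetical name.

```python
def dispersion(counts, segment_sizes):
    """Chi-square of a word's counts over text segments, divided by degrees of freedom."""
    total = sum(counts)
    size = sum(segment_sizes)
    chi2 = 0.0
    for observed, seg in zip(counts, segment_sizes):
        expected = total * seg / size  # count expected under even spread
        chi2 += (observed - expected) ** 2 / expected
    return chi2 / (len(counts) - 1)


even = dispersion([5, 5, 5, 5], [1000] * 4)     # word spread evenly
bursty = dispersion([20, 0, 0, 0], [1000] * 4)  # word confined to one segment
print(even, bursty)
```

A value near zero indicates a word spread evenly through the text; a large value indicates a word that enters and leaves the scene.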
LEXICOGRAPHY-LEXICOLOGY: TEXT HANDLING
Machine Readable Texts in Latin (with Special Reference to the Didactic
Implications of a Computer-Aided Method)
E. M. Goldstein
University of Ottawa, Canada
SIGLASH Newsletter No. 4: 1-3, October 1975
Four items of information are recorded for each entry: (1) the form of the word as it is in the
text, (2) the lemma, the form of the word in a dictionary, (3) a detailed reference to the
form, (4) its coded analysis from morphological and syntactic viewpoints. It was found that
for Caesar's Commentarii de bello Gallico the 704 words which occurred more than 10 times
covered 86.03% of the text. Studies of this sort are important in determining what
vocabulary should be taught to students; there is no immediate need to teach words of low
occurrence.
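The coverage figure quoted above is a simple ratio, sketched here on a toy token list. The tokens are made up and `coverage` is a hypothetical helper; only the threshold (more than 10 occurrences) comes from the abstract.

```python
from collections import Counter


def coverage(tokens, min_count=11):
    """Return (number of frequent word types, share of tokens they cover)."""
    freq = Counter(tokens)
    frequent = {w: c for w, c in freq.items() if c >= min_count}
    return len(frequent), sum(frequent.values()) / len(tokens)


# Toy corpus: two words occur more than 10 times, two occur once.
tokens = ["et"] * 12 + ["in"] * 11 + ["gallia", "omnis"]
types, share = coverage(tokens)
print(types, share)
```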
LEXICOGRAPHY-LEXICOLOGY: DIALECTOLOGY
An Organization for a Dictionary of Senses
Dick H. Fredericksen
IBM Thomas J. Watson Research Center, Yorktown Heights, New York
IBM Research Report 5548, 4 June 1975
"Senses" are represented separately from "wordings" and the mutual connections between them
are made explicit in both directions. "Wordings" may be either single words or multi-word
phrases which are on the same footing with regard to "sense" connections. Each word is
associated with an exhaustive list of the phrases in which it occurs. Classifiers and features,
drawn from appropriate sets, may be attributed separately to words, to phrases, to senses, or
to particular senses of words or phrases (i.e. to particular wordings of senses). The data items
which represent senses are globally chained, and may be exhaustively listed. The data items
which represent words are accessible as "leaves" of a lexical tree and may be retrieved either
by lookup or volunteered in alphabetical order. The "senses" represented in this dictionary
are not a set of primitives into which human experience can be decomposed; meaning is still
a step removed, still evoked rather than embodied by the elements of this basis. In a
full-fledged system for natural language processing the "dictionary of senses" could be envisioned
as a component stretching vertically across the "upper" layers. The "sense data items" must
link to the deeper-lying data structures which encode "knowledge of the world."
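The two-way linkage between senses and wordings can be sketched with ordinary dictionaries. This is an illustration of the organization described in the abstract, not the report's actual data layout; the class `SenseDictionary`, its method names, and the sense labels `S1` and `S2` are all hypothetical.

```python
from collections import defaultdict


class SenseDictionary:
    """Senses and wordings as separate items, linked in both directions."""

    def __init__(self):
        self.senses_of = defaultdict(set)     # wording -> sense ids
        self.wordings_of = defaultdict(set)   # sense id -> wordings
        self.phrases_with = defaultdict(set)  # word -> phrases containing it

    def link(self, wording, sense):
        self.senses_of[wording].add(sense)
        self.wordings_of[sense].add(wording)
        words = wording.split()
        if len(words) > 1:  # a multi-word phrase: index it under each word
            for w in words:
                self.phrases_with[w].add(wording)

    def synonyms(self, wording):
        """All other wordings that share at least one sense with wording."""
        out = set()
        for s in self.senses_of[wording]:
            out |= self.wordings_of[s]
        out.discard(wording)
        return out


d = SenseDictionary()
d.link("die", "S1")
d.link("kick the bucket", "S1")
d.link("die", "S2")
d.link("stamp", "S2")
print(d.synonyms("die"), d.phrases_with["bucket"])
```

Words and phrases are treated alike as wordings, so a stock phrase can be a synonym of a single word.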
LEXICOGRAPHY-LEXICOLOGY: THESAURI
The Structuring of an Associative Empirical Thesaurus of English
Christine M. Armstrong and J. R. Piper
Medical Research Council, Speech and Communication Unit, University of Edinburgh
SIGLASH Newsletter 8, No. 2-3:1-6, April-June 1975
A collection of English language data based on free association is organized as a network:
node = word, arc = associative frequency. We obtained 100 responses for each of 8,400
words; the net has 56,000 nodes. For each of 35,000 words we can obtain an environment of
related words and this environment is fairly large and relevant in content for 8,400 stimulus
words. The environment can be clustered into subsets, which turn out to be a semantic
sorting of the environment. The search can be forward (stimulus to associate), inverse, or in
both directions. Since growing environments can produce ponderously large subsets of
network, environments are limited by techniques involving transmittance to each node, the
number of paths traversed which lead to each node, path length, and a frequency cut-off. It
seems better to consider forward and inverse environments separately.
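Two of the limiting techniques mentioned, path length and a frequency cut-off, can be sketched with a breadth-first search. The network, the frequencies, and the function `environment` are invented for illustration; transmittance and path counting are not modelled here.

```python
from collections import deque


def environment(net, start, max_len=2, min_freq=5):
    """Forward environment of start: words reachable within max_len arcs,
    following only arcs whose associative frequency is at least min_freq."""
    seen = {start: 0}  # word -> path length from start
    queue = deque([start])
    while queue:
        word = queue.popleft()
        if seen[word] == max_len:
            continue  # path length limit reached
        for assoc, freq in net.get(word, {}).items():
            if freq >= min_freq and assoc not in seen:
                seen[assoc] = seen[word] + 1
                queue.append(assoc)
    del seen[start]
    return seen


# Hypothetical stimulus -> {associate: frequency} data.
net = {
    "bread": {"butter": 60, "flour": 3},
    "butter": {"milk": 20},
    "milk": {"cow": 30},
}
env = environment(net, "bread")
print(env)
```

An inverse environment could be obtained the same way from a network with the arcs reversed.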
GRAMMAR
Optimal Encoding of Linguistic Information
Kazuhiko Ozeki
NHK (Japan Broadcasting Corporation) Technical Research Laboratories, Tokyo, Japan 157
Systems - Computers - Controls 5, No. 3:96-103, 1974
Stochastic context-free languages as models of NL. A sentence is parsed first and
transformed into a sentence derivation, and then this derivation, expressed as a string of
results of "productions," is coded by coding each production. For this coding procedure, it is
shown that if the productions are divided into groups of the same left-hand side and if each
production group is Huffman-coded, then the mean code length is less than the sum of the
entropy and the mean number of steps of sentence derivations. Furthermore, under certain
conditions this code becomes optimal in the sense that the mean code length and the entropy
coincide, that is, there is no uniquely decodable code with shorter mean code length.
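The per-group coding step can be sketched with a standard Huffman construction. The toy grammar and its probabilities are hypothetical; the sketch only shows Huffman code lengths computed separately for each left-hand side, and it compares mean code length with entropy for one group.

```python
import heapq
from math import log2


def huffman_lengths(probs):
    """Return {symbol: code length} for a Huffman code over probs."""
    if len(probs) == 1:
        return {s: 0 for s in probs}  # a single alternative needs no bits
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    tie = len(heap)  # unique tie-breaker so the symbol lists are never compared
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths


# Hypothetical grammar: productions grouped by left-hand side, with probabilities.
grammar = {
    "S": {"S -> NP VP": 1.0},
    "NP": {"NP -> N": 0.5, "NP -> Det N": 0.25, "NP -> Det Adj N": 0.25},
}
codes = {lhs: huffman_lengths(group) for lhs, group in grammar.items()}
np_mean = sum(p * codes["NP"][r] for r, p in grammar["NP"].items())
np_entropy = -sum(p * log2(p) for p in grammar["NP"].values())
print(codes, np_mean, np_entropy)
```

With these probabilities (all powers of 1/2) the mean code length of the NP group equals its entropy, which is an instance of the case where the code is optimal.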
GRAMMAR 
Stochastic Context-Free Grammar and Markov Chain 
Kazuhiko Ozeki 
NHK Technical Research Laboratories, Tokyo, Japan 157
Systems - Computers - Controls 5, 3:104-110, 1974
A stochastic context-free grammar (scfg) is an automaton which stops when a sentence is
generated. However, in investigating the information theoretic properties of a long text it is
convenient to use a scfg-based system which returns to the start symbol and begins the
generation of another sentence as soon as it completes the generation of one sentence. The
system becomes a Markov chain having the set of all derivations as the state space, which is
called here the Markov chain associated with an scfg. It turns out that: 1) The Markov chain
associated with an scfg is irreducible. 2) For the chain to be recurrent it is necessary and
sufficient that the language generated by the scfg be a probability space. 3) For the chain to
be positive recurrent, it is necessary and sufficient that the mean number of steps of the
sentence derivations be finite. 4) It is well known that when a Markov chain is positive
recurrent it has an invariant distribution and its entropy per step H1 is defined. If an scfg
satisfies certain conditions, we have H1 = H(E)/(M(E) + 1).
GRAMMAR: GENERATOR
On the Generation of English Sentence
Franz Huber
Cardiovascular Pulmonary Research Laboratory, University of Colorado Medical Center,
Denver 80220
IEEE Transactions on Computers 25:90-91, January 1976
Beckmann's error-detecting code model for the structure of natural languages is used in a
simple efficient procedure for generating a large number of English sentences with simple
grammatical structure and a high rate of word repetition, such as diagnostic messages of a
compiler, messages in interactive systems, etc. The grammar described is implemented with a
state table with 8 states: Article, Numeral, Adjective, Noun, Conjunction, Preposition, Verb,
Auxiliary Verb. The syntactic type of the various words is specified by the location of the
entry in the dictionary. Sentences are formed by supplying the routine with a sequence of
pointers to dictionary entries; the routine itself checks whether this is a possible sentence,
calculates the correct check morphemes, and puts a period at the end.
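A state-table generator of this general kind can be sketched as follows. The transition table, the dictionary, and the function `generate` are hypothetical: the abstract names the eight states but not the transitions, and the sketch stores the word class with each entry instead of deriving it from the entry's location. Check morphemes are not computed.

```python
# Hypothetical transition table: which word class may follow which.
NEXT = {
    "START": {"Article", "Numeral", "Adjective", "Noun"},
    "Article": {"Numeral", "Adjective", "Noun"},
    "Numeral": {"Adjective", "Noun"},
    "Adjective": {"Adjective", "Noun"},
    "Noun": {"Verb", "Auxiliary Verb", "Conjunction", "Preposition", "END"},
    "Conjunction": {"Article", "Numeral", "Adjective", "Noun"},
    "Preposition": {"Article", "Numeral", "Adjective", "Noun"},
    "Verb": {"Article", "Numeral", "Adjective", "Noun", "Preposition", "END"},
    "Auxiliary Verb": {"Verb"},
}

# Hypothetical dictionary: entry number -> (word, word class).
DICTIONARY = [
    ("the", "Article"),
    ("compiler", "Noun"),
    ("found", "Verb"),
    ("two", "Numeral"),
    ("errors", "Noun"),
]


def generate(pointers):
    """Return the sentence for a sequence of dictionary pointers, or None
    if the sequence of word classes is not allowed by the state table."""
    state = "START"
    words = []
    for p in pointers:
        word, cls = DICTIONARY[p]
        if cls not in NEXT[state]:
            return None
        words.append(word)
        state = cls
    if "END" not in NEXT[state]:
        return None
    return " ".join(words).capitalize() + "."


print(generate([0, 1, 2, 3, 4]))
```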
SEMANTICS-DISCOURSE 
Montague Grammar and Transformational Grammar 
Barbara Partee 
Department of Linguistics, University of Massachusetts, Amherst
Linguistic Inquiry 6:203-300, Spring 1975
A truth-definition (in the manner of Tarski) or something to the same effect must be a part
of any adequate semantic theory. The syntax of a Montague grammar is a simultaneous
recursive definition of all of the syntactic categories of the language. For every syntactic
category there must be a unique corresponding semantic category and for every syntactic rule
that combines (operates on) phrases of categories A and B to produce a category C, there
must be a unique semantic rule that operates on the corresponding semantic interpretation to
give a semantic interpretation for the resulting phrase; that interpretation will be of the
semantic category corresponding to the syntactic category C. Techniques involving labelled
bracketing and "starred variables" enable the addition of transformations to Montague
grammars.
SEMANTICS-DISCOURSE 
Contemporary Research in Philosophical Logic and Linguistic Semantics 
D. J. Hockney, E. William Harper, and B. Freed, eds. 
University of Western Ontario
D. Reidel Publishing Company, Dordrecht, Holland, 1975
HC: ISBN 90-277-0511-9
PC: ISBN 90-277-0512-7
Contents
Preface .......... vii
Counterfactuals and Comparative Possibility, by David Lewis .......... 1
Presuppositions, by Robert Stalnaker .......... 31
Incomplete Assertion and Belnap Connectives, by Bas C. van Fraassen .......... 43
Dimensions of Truth, by Hans G. Herzberger .......... 71
Speaking of Nothing, by Keith S. Donnellan .......... 93
The Structure of Efficacy, by Zeno Vendler .......... 119
Harris and Chomsky at the Syntax-Semantics Boundary,
by Ray C. Dougherty .......... 137
Some Transformational Extensions of Montague Grammar,
by Barbara Partee .......... 195
Hedges: A Study in the Meaning Criteria and the Logic of
Fuzzy Concepts, by George Lakoff .......... 221
Comments: Lakoff's Fuzzy Propositional Logic, by Bas C. van Fraassen .......... 273
On the Semantics of Negation, by Howard Lasnik .......... 279
Verbs of Bitching, by James McCawley .......... 313
SEMANTICS-DISCOURSE
Understanding Language: An Information-Processing Analysis of Speech
Perception, Reading, and Psycholinguistics
Dominic W. Massaro, Ed.
Department of Psychology, University of Wisconsin, Madison
Academic Press, New York, 1975. ISBN 0-12-478350-8
HC: $16.50
Contents
List of Contributors .......... ix
Preface .......... xi
Part I Introduction
1 Language and Information Processing, Dominic W. Massaro
I. Introduction .......... 3
II. Information Processing .......... 5
III. Auditory Information Processing .......... 7
IV. Visual Information Processing .......... 21
V. Conclusion .......... 27
References .......... 27
Part II Speech Perception
2 Articulatory and Acoustic Characteristics of Speech Sounds, Lucinda Wilder
I. Introduction .......... 31
II. Production of Speech Sounds .......... 33
III. General Acoustic Properties of Speech Sounds .......... 36
IV. Articulation of Speech Sounds .......... 43
V. Occurrence of Speech Sounds .......... 46
VI. Vowel Phonemes of English .......... 51
VII. Coarticulation .......... 56
VIII. Consonant Phonemes of English .......... 57

Part III Reading
6 Visual Features, Preperceptual Storage, and Processing Time in Reading, Dominic
W. Massaro and Joseph Schmuller
I. Introduction .......... 207
II. Visual Features .......... 210
III. Perceptual Storage .......... 231
IV. Processing Time .......... 234
References .......... 237
7 Primary and Secondary Recognition in Reading, Dominic W. Massaro
I. Introduction .......... 241
II. Utilization of Redundancy .......... 242
III. Phonological Mediation .......... 260
IV. Nonmediated Models of Reading .......... 273
V. Mediated Models of Reading .......... 278
References .......... 286
8 Reading Eye Movements from an Information-Processing Point of View, Wayne Shebilske
I. Introduction: Significance of Reading Eye Movements .......... 291
II. Characteristics of Reading Eye Movements .......... 292
III. Oculomotor Control During Reading .......... 301
IV. Summary and Conclusion .......... 308
References .......... 309
Part IV Psycholinguistics
9 Linguistic Theory and Information Processing, Kenneth B. Solberg
I. Introduction .......... 315
II. Linguistics and Psychology .......... 316
III. Theories of Grammar .......... 318
IV. Semantics and Syntax .......... 331
V. Models of the Language User
References
10 Word and Phrase Recognition in Speech Processing, Arthur Freund
I. Introduction
II. Word Recognition and Context
III. Acoustic Cues of Grammatical Structure
IV. Click Localization
V. Summary
References
11 An Analysis of Some Psychological Studies of Grammar: The Role of Generated
Abstract Memory, Joseph B. Hellige
I. Introduction
II. The Nature of Short-Term Memory (STM)
III. The Psychological Reality of Constituent Structure
IV. The Psychological Function of Certain Transformation Rules
V. Summary
References
Author Index
Subject Index
SEMANTICS-DISCOURSE 
Seven Theses on Artificial Intelligence and Natural Language 
Yorick Wilks 
Department of Artificial Intelligence, University of Edinburgh
Institute for Semantic and Cognitive Studies, Fondazione Dalle Molle, Working Paper 17,
1'16 pages, 1975 
Words and semantic primitives are not ultimately of different types, although procedural
benefits come from making the distinction. Plans stated in high level programming languages
are most appropriately seen as texts, since they do not seem to be either real world procedures
or procedures for handling natural language. As a general method of parsing natural
language, expectation is radically defective unless it has some general and systematic capacity
for attending to what it is reading. The important distinction between causes and reasons in
the explanation of human behavior should have procedural and not only taxonomic reflection
in an understanding system. Template to template (in preferential semantics) inference rules
are: GOAL, CAUSE, IMPLIC. The CAUSE/GOAL distinction often reduces to no more than
the temporal directionality of the rule. Real world knowledge can be represented quite
usefully at quite local levels in an understanding system and can function there as part of a
general system of linguistic inference. The use of frames threatens to swamp us in large frame
structures for relatively trivial matters and it is not clear how frames are related to lower
level structures. We need multilevel representation. It is not at all obvious that a NL
understanding system should be responsible for modeling general knowledge of the world.
On Natural Language Based Query Systems 
Stanley R. Petrick
IBM T. J. Watson Research Center, Mathematical Sciences Department, Yorktown Heights, New
York
IBM Research Report 5577, 17 August 1975
Some of the arguments which have been given both for and against the use of natural
languages in question-answering (QA) systems are discussed. The following systems are
considered in evaluating the current level of QA system development: LSNLIS, REL,
SHRDLU, REQUEST. There is a trade-off between syntactic and semantic complexity. A
system with relatively simple syntactic capabilities must have complex semantic analysis
procedures while a system, such as REQUEST, with sophisticated syntax can produce
underlying syntactic structures which directly reflect meaning without the need for "creative"
interpretation. A brief comparison between processing times in LSNLIS and REQUEST is
given.
SEMANTICS-DISCOURSE: MEMORY: QUESTION ANSWERING 
Question Answering in a Story Understanding System 
Wendy Lehnert 
Department of Computer Science, Yale University
Research Report 57, December 1975
This theory of question answering is based on SAM (Script Applier Mechanism).
In the interpretative phase it takes a question in Conceptual Dependency form
and categorizes it in terms of particular question types: 1) why, 2) how, 3) yes or no, 4)
occurrence, 5) component. Each question type corresponds to a specific form of CD
representation. In the response phase the memory is searched for the answer; this may
involve nothing more than simple information retrieval or it may entail inferring the answer
using general knowledge of the world. The system tries for completeness in answering; a
yes/no question will be answered with yes or no plus an account of that answer. Work is
being done on a Generation-Selection paradigm in which each question generates a number
of feasible answers (the problem of memory representation) and a selection procedure chooses
among them (pure QA). Selection rules are presented and discussed.
SEMANTICS-DISCOURSE: TEXT GRAMMAR 
Beyond the Sentence, Between Linguistics and Logic 
Janos S. Petofi 
Universitaet Bielefeld 
Hakan Ringbom, Ed., Style and Text, Sprakforlaget Skriptor AB, Stockholm, 377-390, 1975
ISBN 91-7282-095-0 
The 'text-structure-world-structure theory' (TeSWeST) is an empirically motivated
logic-oriented theory aiming at the grammatical description of a text as a complex sign (intensional
semantic description) and the assignment of the possible extensional interpretations to the
intensional-semantically described text structure (extensional-semantic theory). The
intensional-semantic and extensional-semantic descriptions are such that they also contain the
description of the pragmatic aspects. The grammatical component of the TeSWeST is a
generative transformational text grammar operating with linearly not fixed canonic basic
structures. The formation rule system of this grammar consists of a so-called communicative
rule (R^E) expressed informally as: A communicative basis (TB) is a communicative
predicate-complex: a communicator (C1) communicates (COMM) to a (potential)
communicator/interpreter (C^b) at a given time (Q^t) in a given place (Q^l) the message TB.
The elements of the normed implicit representation of the text intension are definienda in
the lexicon and the elements of the normed explicit representation of the text intension are
definientes in the lexicon. The task of the extensional semantic component is to assign
possible extensional semantic interpretations to the possible intensional semantic
representations.
LINGUISTICS: METHODS 
Linguistics and Artificial Intelligence 
Petr Sgall 
Centre of Numerical Mathematics, Charles University, Prague
Prague Bulletin of Mathematical Linguistics 24:9-33, 1975
(1) Winograd's approach to language has forced a reevaluation of the relationship between the
theory of competence and the study of performance, pragmatics, etc., though it is still wise to
acknowledge the description of language as a relatively independent enterprise. (2) The use
of linguistic descriptions in AI provides a test for linguistic theories. (3) Winograd's
"imperative form" of representing knowledge and semantics is more effective than one based
entirely on deductive logic. Considerations of topic and focus, theme and rheme, functional
sentence perspective, etc. are similarly "imperative" in their emphasis on the different "uses"
which various items of information in a communication have. (4) Winograd's work, in his
way of introducing new definitions into the semantic component, suggests that in
man-machine communication the burden of learning the other participant's language may be
shiftable from man to the computer. (5) The significance of linguistics for AI is connected
with making programming languages more and more like NLs. (6) The study of tense and
time and the study of negation allow us to see how linguistic meaning and non-linguistic
content can be distinguished.
LINGUISTICS: METHODS: MATHEMATICAL 
Formal Philosophy: Selected Papers of Richard Montague
Richmond H. Thomason, ed.
University of Pittsburgh
Yale University Press, New Haven, Connecticut, 1974. ISBN 0-300-01527-5
HC: 112.39 
Contents
Introduction, by Richmond Thomason .......... 1
1. Logical Necessity, Physical Necessity, Ethics, and Quantifiers .......... 71
2. 'That' (with Donald Kalish) .......... 84
3. Pragmatics .......... 95
4. Pragmatics and Intensional Logic .......... 119
5. On the Nature of Certain Philosophical Entities .......... 148
6. English as a Formal Language .......... 188
7. Universal Grammar .......... 222
8. The Proper Treatment of Quantification in Ordinary English .......... 247
9. A Paradox Regained (with David Kaplan) .......... 271
10. Syntactical Treatments of Modality, with Corollaries on
Reflexion Principles and Finite Axiomatizability .......... 286
11. Deterministic Theories .......... 303
Works of Richard Montague, a Bibliography .......... 361
Index .......... 365
COMPUTATION: PROGRAMMING 
More on In-Core Sort-Search Methods in PL/1 for Lexical Data 
Richard K. Brewer 
Eastern Michigan University, Ypsilanti
SIGLASH Newsletter 8, No. 1:9-12, February 1975
Four procedures based on binary tree structures are presented along with PL/1 code. 1)
Symmetric or inorder traversal (Knuth) exploits the powerful syntax of PL/1 in the use of
modular code and recursion. 2) Explicit Stack (Knuth). 3) "Threading" a tree (Perlis &
Thornton) permits traversal without giving up space for either an implicit or an explicit
stack. 4) This technique requires an estimate of outer bound for the size of the list; by
allocation of a fixed binary array at the outset the tree can be constructed by basing fresh
nodes on progressively higher elements in the array.
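The first two procedures in this abstract can be illustrated with a minimal sketch. It is written in Python rather than PL/1, and the names `Node`, `insert`, `inorder_recursive`, and `inorder_stack` are hypothetical, chosen for illustration only; none of this is taken from Brewer's code.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: str) -> Node:
    """Insert key into a binary search tree, ignoring duplicates."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right
        else:
            return root  # key already present


def inorder_recursive(node: Optional[Node]) -> Iterator[str]:
    """Symmetric (inorder) traversal using recursion (implicit stack)."""
    if node is not None:
        yield from inorder_recursive(node.left)
        yield node.key
        yield from inorder_recursive(node.right)


def inorder_stack(root: Optional[Node]) -> Iterator[str]:
    """The same traversal with an explicit stack instead of recursion."""
    stack: List[Node] = []
    node = root
    while stack or node is not None:
        while node is not None:      # descend as far left as possible
            stack.append(node)
            node = node.left
        node = stack.pop()           # visit the deepest unvisited node
        yield node.key
        node = node.right            # then traverse its right subtree


def build(words: List[str]) -> Optional[Node]:
    root = None
    for word in words:
        root = insert(root, word)
    return root
```

Both traversals yield the stored words in sorted order, which is what makes the tree usable as an in-core sort-search structure for lexical data.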
COMPUTATION: PROGRAMMING 
SNOBOL4 Applications in Natural Language Research 
James L. Wyatt
Florida State University, Tallahassee
SIGLASH Newsletter 8, No. 2-3:12-19, April-June 1975
Topics discussed: 1) Pattern matching basics, 2) the equals mark to create or alter character
strings, 3) statement format, 4) INPUT and OUTPUT, 5) simple programs. A simple
program to count word frequencies in a text is described and discussed and 12 examples of
output from student programs are given.
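For readers unfamiliar with the task, a word-frequency count of the kind mentioned here might look as follows. This is a Python sketch, not Wyatt's SNOBOL4 program; the function name `word_frequencies` and the tokenization rule (lowercase letters and apostrophes) are assumptions made for illustration.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count how often each word occurs, ignoring case and punctuation."""
    # Assumed tokenization: runs of letters and apostrophes form a word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


if __name__ == "__main__":
    freq = word_frequencies("The cat saw the other cat.")
    for word, count in sorted(freq.items()):
        print(word, count)
```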
COMPUTATION: PROGRAMMING: LANGUAGES 
Revised Report on the Algorithmic Language ALGOL 68 
A. van Wijngaarden, B. J. Mailloux, J. E. L. Peck, C. H. A. Koster, M. Sintzoff, C. H.
Lindsey, L. G. L. T. Meertens, and R. G. Fisker, Eds.
International Federation for Information Processing
Acta Informatica 5:1-236, 1975
Contents
Acknowledgements ............................................ 6
0. Introduction ............................................. 8
PART I: Preliminary Considerations
1. Language and metalanguage ................................ 17
2. The computer and the program ............................. 35
PART II: Fundamental Constructions
3. Clauses .................................................. 53
4. Declarations, declarers and indicators ................... 66
5. Units .................................................... 77
PART III: Context Dependence
6. Coercion ................................................. 91
7. Modes and nests .......................................... 98
PART IV: Elaboration-independent constructions
8. Denotations .............................................. 108
9. Tokens and symbols ....................................... 113
PART V: Environment and Examples
10. Standard Environment .................................... 124
11. Examples ................................................ 209
12. Glossaries .............................................. 218
COMPUTATION: PICTORIAL SYSTEMS
Interactive Graphics on Intelligent Terminals in a Time-Sharing Environment
W. K. Giloi, and S. Savitt
University of Minnesota, Minneapolis, 55455
J. Encarnacao
Fachbereich Informatik, EG Graphische Datenverarbeitung, Technische Hochschule
Darmstadt, D-61, Germany
Acta Informatica 5:257-271, 1975
Interactive graphics in a time-sharing environment should be organized in such a way that
the user's activities are locally processed in order to avoid unacceptably long response times.
On the other hand, the host computer must be kept informed about the user's actions and,
conversely, the display file in the terminal has to be updated whenever the execution of the
application program causes a change in the visual representation. In order to avoid the
transmission of redundancy, the display file is decomposed into two intersecting parts such
that the part in the host computer and the part in the terminal each contain only the locally
required information. The necessary communication between both parts is maintained by an
information module generated on the base of a low-level intermediate language (L^4) and
exchanged between computer and terminal. This leads to the notion of an abstract terminal
whose "machine language" is L^4, facilitating the implementation and portability of graphic
programming systems.
COMPUTATION: PICTORIAL SYSTEMS 
A Multilevel Modeling Structure for Computer Generated Graphical Symbol 
Design 
Chuan Lee 
McDonnell-Douglas Astronautics Company, Huntington Beach, California
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 387-406
A multilevel data structure has been developed for constructing graphical symbols
on an automated drafting machine. The implemented logic allows a user to define several sets
of alphanumerical characters and graphical symbols in terms of three levels of tables for
creating and retrieving the needed line segment strokes. It is believed that the same logic
could be applied to provide a solution to the computer output of Chinese characters. To call
any desired Chinese character as output, it is simply a matter of calling the character's legal
designator within a particular library.
COMPUTATION: PICTORIAL SYSTEMS
Interactive Graphical Data Analysis with Implementation of Chinese Characters 
Karen K. Yuen
Department of Biomathematics, University of California, Los Angeles
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 373-377
Treating Chinese characters as pictures we can display them on the screen without difficulty.
Subroutines are written to generate each word or part of the word. On the input side, one
can use a light-pen on the appropriate part of the screen to choose which part of the
program one wants to go to, e.g., go to next frame, delete this point, etc. A push of the
designated function key switch will also convey information. Chernoff proposed the mapping
of multidimensional variables onto features of faces. Mouth, nose, eyes, facial contour, etc.,
are modified in form and size to represent each vector of measurements. A graphics program
was written using this idea and extending it to provide the histogram comparison and
interactive classification of individual cases.
DOCUMENTATION 
Two Major Flaws in the CODASYL DDL 1973 and Proposed Corrections 
G. M. Nijssen 
Control Data Europe, 46, Avenue des Arts, B-1040, Brussels, Belgium
Information Systems 1:115-132, 1975
In a schema written in CODASYL DDL 1973 it is syntactically correct to describe an entity
either by declaring a data item in the record of the entry or by declaring a CODASYL set
type in which the record, describing the entity, is a member record. This is a major flaw
because extension of a database or integration of two existing databases will then lead to
either reprogramming or inconsistency, or both. The flaw can be corrected by requiring that
all attributes are represented as a data item in the (logical) schema. In the CODASYL DDL
1973, there are five places to optionally declare a record identifier and four of these five
places are not in the record but in the CODASYL set type. Declaring record identifiers
therefore results in fairly complex and non-orthogonal declarations. This could be simplified
by abandoning these five places and by introducing a record identifier clause in the record
type entry. For integrity reasons it is necessary to require that at least one record identifier
is declared in every record type entry. The previous two corrections will make it possible to
design a CODASYL set selection clause still providing the same functional capabilities. The
corrected DDL is functionally equivalent, yet offers more data independence, is simpler and
more orthogonal. Examples are given.

DOCUMENTATION 
The Definition of Concepts 
Fred W. Riggs 
University of Hawaii
Occasional Paper No. 6, International Studies Association, University Center for
International Studies, University of Pittsburgh, 39-76, 1975
Definitions should exploit the distinction between definiendum and definiens. The definiens
designates a concept in context while the definiendum defines the concept without dependence
on context. The definiendum of one concept X1 may appear as a term in the definiens of
another concept X2, thus permitting recursive construction of definitions. A machine based
archive of social science concepts is proposed in which the core element of each entry will be
a definition copied from relevant scholarly writings. Each entry will also contain: a) a
possible translation of the definition into English and into formal precise language, b)
specification of related terms and relevant theoretical context, c) documentation of the source
of the definition, d) name of the contributor, e) anything else which seems appropriate.
Terms found in the new Thesaurus of the American Political Science Association will be used
from the beginning with later additions as needed. The project will be administered through
the Information Utilization Laboratory at the University of Pittsburgh.
DOCUMENTATION: CLASSIFICATION
Word Segmentation by Letter Successor Varieties
Margaret A. Hafer, and Stephen F. Weiss
Department of Computer Science, University of North Carolina, Chapel Hill
Information Storage and Retrieval 10:371-385, 1974
Within a word, the ith letter is to some degree dependent on the i-1 letters that precede it.
Taking advantage of this, successor and predecessor letter variety counts are used to indicate
where words should be divided. Four segmentation strategies are used: a) Cutoff, b) Peak and
plateau, c) Complete word, d) Entropy. Experiments have been run testing the machine
implementation of various combinations of these strategies. A technique involving whole
word segmentation techniques and cutoff points on successor and predecessor counts was
chosen for use in information retrieval experiments. The information retrieval results
obtained are virtually identical to those obtained with the more manually oriented forms of
stemming.
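The notion of successor variety can be made concrete with a small sketch. The Python code below is illustrative only: the toy corpus, the function names, and the strict local-peak rule are assumptions, and the rule is a simplification that should not be read as the authors' exact definition of any of the four strategies. The successor variety of a prefix is taken to be the number of distinct letters that follow that prefix in the corpus.

```python
from typing import List


def successor_varieties(word: str, corpus: List[str]) -> List[int]:
    """For each prefix of word, count the distinct letters that follow it in corpus."""
    counts = []
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        followers = {w[i] for w in corpus if len(w) > i and w.startswith(prefix)}
        counts.append(len(followers))
    return counts


def peak_segments(word: str, corpus: List[str]) -> List[str]:
    """Cut after any prefix whose variety strictly exceeds both neighbours (assumed rule)."""
    counts = successor_varieties(word, corpus)
    segments, start = [], 0
    for i in range(1, len(counts) - 1):
        if counts[i - 1] < counts[i] > counts[i + 1]:
            cut = i + 1                      # counts[i] belongs to the prefix of length i + 1
            segments.append(word[start:cut])
            start = cut
    segments.append(word[start:])
    return segments


# Hypothetical toy corpus, for illustration only.
CORPUS = ["ABLE", "APE", "BEATABLE", "FIXABLE", "READ", "READABLE",
          "READING", "READS", "RED", "ROPE", "RIPE"]
```

With this corpus the varieties for READABLE rise sharply after the prefix READ, where three different letters (A, I, S) can follow, so the sketch cuts the word there.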
DOCUMENTATION: CLASSIFICATION
Application of Minimum Spanning Trees to Information Storage
R. C. T. Lee, and C. L. Chang
Heuristics Laboratory, Division of Computer Research and Technology, National Institutes of
Health, Department of Health, Education, and Welfare, Bethesda, Maryland
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 1245-1256
The use of minimum spanning trees allows efficient storage of information by permitting
information which is redundant across similar items to be stored at only one point in the
system. Let the distance between two points be a function of the number of changes of
values between these two points. A minimum spanning tree contains all the points and is
such that its total sum of distances is a minimum among all such possible trees. Examples
discussed: classification of animals based on amino acid sequences, voting records of countries
in the UN. Applications to Chinese characters are discussed.
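The storage idea can be sketched as follows, under stated assumptions: items are equal-length sequences, the distance is the plain count of differing positions (one possible "function of the number of changes"), and the tree is built with Prim's algorithm, a standard method substituted here because the abstract does not say which construction the authors use. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def hamming(a: Sequence, b: Sequence) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(items: List[Sequence]) -> List[Tuple[int, int, int]]:
    """Prim's algorithm; returns edges as (parent_index, child_index, distance)."""
    if not items:
        return []
    # best[j] = (smallest known distance from the tree to item j, tree node giving it)
    best = {j: (hamming(items[0], items[j]), 0) for j in range(1, len(items))}
    edges = []
    while best:
        child = min(best, key=lambda k: best[k][0])
        dist, parent = best.pop(child)
        edges.append((parent, child, dist))
        for k in best:                       # relax distances via the new tree node
            d = hamming(items[child], items[k])
            if d < best[k][0]:
                best[k] = (d, child)
    return edges


def delta(parent: Sequence, child: Sequence) -> List[Tuple[int, object]]:
    """Only the positions where child differs from parent need to be stored."""
    return [(i, c) for i, (p, c) in enumerate(zip(parent, child)) if p != c]
```

Storing the root item in full and every other item as its `delta` from its tree parent keeps shared values in one place, so the total number of stored changes equals the total edge weight of the tree.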
DOCUMENTATION: RETRIEVAL 
An Efficient Retrieval Method for a Hierarchical Fact-Retrieval System and its 
Evaluation 
Fujio Nishida, and Shinobu Takamatsu
Faculty of Engineering, University of Osaka Prefecture, Sakai-shi, 591 Japan
Systems-Computers-Controls 6, No. 1.92-60, 1975
A fact-retrieval system consisting of a data base, axiom set and inference system can deduce
answers to questions by combining a certain number of facts. A predicate symbol followed
by a series of constant terms or variable terms and its complement is a literal. A literal
without variable terms is a ground literal. A clause is a logical sum of literals. A
hierarchical fact-retrieval system is studied by viewing data as a set of clauses, each of which
consists of a ground literal. Concrete fact-retrieval is assumed not to contain function
symbols by limiting the objective. Under such conditions efficient procedures are given for:
1) retrieving constant terms by specifying predicate symbols and part of the constant terms
and 2) retrieving predicate symbols and the remaining constant terms by specifying part of
the constant terms. Furthermore, the number of comparisons, which is the criterion for the
retrieval time in each procedure, is examined analytically and experimentally.
DOCUMENTATION: RETRIEVAL 
A Linguistic Approach to Information Retrieval--I
Petr Sgall, and Eva Hajicova
Centre of Numerical Mathematics, Charles University, Prague
Information Storage and Retrieval 10:411-417, 1974
The system should consist of: 1) the brain, which organizes the whole repertory of data, 2)
the analysis, which takes, as its input, text and questions in NL mixed with formal notations,
and presents, as output, disambiguated translations in a language the brain can directly use,
and 3) the synthesis, translating the answers delivered by the brain into the appropriate NL.
The representation of the sentence on the tectogrammatical level (the artificial language) has
the form of a dependency graph, with the verb as its root. The participants of the verb are
ordered according to: 1) an inherent order of participants determined by the language system,
2) topicalization and communicative dynamism, 3) rules of grammar. The semantic
representation is marked as to which elements are contextually bound and which are not.
The verb always stands between its contextually bound and contextually non-bound
participants.
DOCUMENTATION: RETRIEVAL 
A Linguistic Approach to Information Retrieval--II
Petr Sgall, Jarmila Panevova, and Svatava Machova
Centre of Numerical Mathematics, Charles University, Prague
Information Processing and Management 11:147-153, 1975
Synthesis of Czech in MT. The generative component is a context-free phrase structure
grammar and uses modifying, substitutional and selectional rules. The right-hand side of the
rules cannot contain more than two non-terminal symbols. There are recursive rules. The order of
application is determined by the form of the rule itself (by a selection of non-terminal
symbols). The rules are not distinguished as obligatory and optional ones. The translation of
the semantic representation to the graphemic level proceeds in two steps. First, the semantic
representation is translated into a surface syntactic level, and this is then (secondly) translated
into a morphemic representation. The sequence of computer programs is based on the formal
pattern of pushdown transducers.
TRANSLATION 
Some Computer Functions for Machine-Aided Translation
J. Mathias 
CETA Group, USA.
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 589-592
The experimental computer configurations use alphanumeric input by keyboard (and
simulation of graphic input). Both systems depend on retrieval from a glossary file of
Chinese characters with English meanings and a cross reference file of characters containing
transliterations. The first set of computer functions put together is based on the prime
method of input by alphanumeric keyboard and graphic output for displaying Chinese
characters. The most important of these is the segmenting function, which is capable of
searching the glossary file for meanings for each individual character and for sets of
contiguous characters in the query string. The system has two indexing functions: partial
telecode indexing and Pin Yin indexing. The second set of computer aid functions is based
on input by Pin Yin romanization. The operator can type in the Pin Yin spelling (without
tone) for a query of up to seven characters in length and the system searches the glossary file
for all terms that fit the Pin Yin. This method is effective only for compound queries.
TRANSLATION 
Mechanical Translation Between English and Japanese 
Shigeharu Sugita
Department of Information Science, Kyoto University, Yoshida Sakyo-Ku, Japan
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese
Input/Output Systems, Academia Sinica, 555-572
Phrase structure grammar is supposed for both English and Japanese, and a syntax to syntax
translation by ordered context-free grammar is adopted which concerns word order exchange
and insertion and deletion of auxiliary particles, and does not concern the semantic aspect
of translation words. The main part of the algorithm, syntax analysis and synthesis, is
carried out by using context-free type rewriting rules. These rules are classified into several
hierarchies and there exist priorities in each hierarchy when rules are used. Program and
linguistic data are separated as much as possible, so that a change of grammar does not affect
the main program. Therefore the program size becomes very small: about 1500 statements in
assembler language. The word dictionary contains 5000 head words and the grammar table
contains about 900 rewriting rules. This translation system can in principle accept any
complex structure, and about 70% of sentences from scientific or technological papers are
analyzed correctly from the syntactical point of view. Results of English into Japanese
translation can be spoken through a speaker by a voice synthesizer.
SOCIAL-BEHAVIORAL SCIENCE
Tower of Babel: On the Definition and Analysis of Concepts in the Social
Sciences
Giovanni Sartori
University of Florence
Fred W. Riggs
University of Hawaii
Henry Teune
University of Pennsylvania
Occasional Paper No. 6, International Studies Association, University Center for
International Studies, University of Pittsburgh, Pennsylvania, 1975
Contents
Preface ................................................... 1
Introduction: Toward the Mastery of Obstacles to Conceptual Clarity
1. Obstacles .............................................. 1
2. A Broad Strategy ....................................... 2
1) conceptualization; 2) hierarchies; 3) the "paradigm" level
3. Summary and Conclusion ................................. 5
4. Next Steps: Plans and Prospects ........................ 5
Chapter One: The Tower of Babel, Giovanni Sartori
1. Introduction ........................................... 7
1) the loss of etymological anchorage; 2) the loss of historical anchorage; 3) the
loss of mainstream discourse; 4) novitism; 5) the freezing of language; 6) the
cards and the game; 7) counteracting chaos; 8) the paradigm-model juncture
2. Concepts, Words, Phenomena ............................. 12
1) which is the start? 2) language and thinking; 3) the impact of words; 4) the
neo-Baconian start
3. Conceptual Analysis .................................... 15
1) terms as concepts and terms in sentences; 2) logical systematizing; 3) analysis
by classification; 4) vertical classification; 5) rules of transformation along a
ladder of abstraction; 6) an illustration; 7) concepts as data containers; 8)
theory poor and data cheated
4. Measurements and Quantification ........................ 21
1) the broad meaning; 2) the narrow meaning; 3) kinds and degrees; 4)
objections; 5) pros and cons; 6) quantities of what?
5. Applied Logic
1) science and logic; 2) strategy of allocation; 3) truth-operators and
quantifiers; 4) theoretical terms and observational terms; 5) object concepts and
property concepts; 6) entities should not be multiplied; 7) the allocation of
dispositional terms; 8) simple definitions; 9) varieties of definitions; 10)
operational definitions; 11) minimal definitions; 12) arbitrariness in defining
6. Hindthoughts
1) Heraclitus versus Descartes; 2) the missing bridge
Chapter Two: The Definition of Concepts, Fred W. Riggs
1. The Definition of Definition
1) words and dictionaries; 2) dictionary definitions; 3) the definition of
definitions; 4) lexical definitions; 5) lexicographical propositions; 6) analytic
definitions; 7) defining propositions; 8) statements; 9) nominal definitions; 10)
prescriptive definitions; 11) stipulative definitions; 12) summary of definitions;
13) so-called definitions
2. The Definition of Concepts
1) the ambiguity of nominal definitions; 2) the dictionary definition of
concepts; 3) definitional functions; 4) terminological agreement; 5) the origin
of concepts; 6) explicative definitions; 7) strategies for concept formation; 8)
the explicative strategy; 9) the constructive strategy; 10) evaluating the
alternative; 11) natural science precedents; 12) summary: complementary
approaches; 13) reconstruction vs. explication; 14) a choice-concept puzzle?
3. The Transformation of Concepts and Definitions
1) symbols and the symbolized; 2) properties of definitions vs properties of
concepts? 3) the definiendum as concept designator; 4) the definiens as concept
connoter; 5) some implications; 6) transformation of definitions; 7) toward
significance and concreteness; 8) changes on the ladder; 9) feedback from
relevance; 10) terminological confusion from unrecognized concept changes
4. The Clarity of Definitions: Parsimony
1) a necessary but not sufficient condition; 2) on the optimality of definitions;
3) parsimony in definitions; 4) simple redundancy; 5) logical redundancy; 6)
empirical redundancy; 7) indicators and redundancies; 8) detectors and measures
5. The Adequacy of Definitions
1) an illustration: red; 2) concepts as defining criteria; 3) properties and
dimensions; 4) intrinsic and extrinsic properties; 5) adequacy of definitions: the
status of concepts
6. A Concept Inventory
1) an open archive; 2) sources of concept definitions; 3) priorities for concept
selection; 4) inventory financing; 5) retrieval technology; 6) inventory
utilization; 7) long-term strategies
Chapter Three: On The Analysis of Concepts, Henry Teune
1. Introduction
2. Science and Philosophy of Science
1) divergence between philosophy and science; 2) the role of philosophy of
science; 3) inadequacy of philosophy of science for social science; 4) some
reasons for the inadequacy
3. Concepts and Definitions
1) meanings of concepts; 2) value concepts; 3) logical concepts; 4) empirical
concepts; 5) definitions and concepts
4. Evaluating Concepts
1) criteria for evaluating concepts; 2) empirical precision; 3) theoretical
importance; 4) relationship between empirical precision and theoretical
relevance
5. Composition Rules
1) simple and complex definitions; 2) the structure of composition rules; 3) the
theoretical importance of composition rules; 4) empirical limits on the use of
composition rules
6. Object Concepts
1) the nature of object concepts; 2) importance of object concepts in the social
sciences; 3) the cutting edge of object definitions; 4) object definitions and near
tautologies; 5) relationships between object and property concepts
7. Types of Object Concepts: Aggregated Systems and Levels
1) various meanings of levels; 2) object concepts and generality; 3) classes of
objects; aggregates; 4) systems and classes of systems
8. Differentiating Types of Property Concepts
1) according to function; 2) according to theory; 3) according to the structure
of definition; 4) according to the characteristics of the objects; 5) according to
properties; 6) according to the nature of the composition rule; 7) some
miscellaneous types of concepts
9. Measurement
1) measurement contrasted with definitions; 2) measurement languages,
operations, and statements; 3) two types of measurement; 4) two traditions of
measurement; 5) indicators
10. A Concluding Statement ................................ 93
Conclusion, Fred W. Riggs ................................. 95
Footnotes ................................................. 99
*Chapter Two has been abstracted in the Documentation section of this fiche.
SOCIAL-BEHAVIORAL SCIENCE: ANTHROPOLOGY 
On Binary Categories and Primary Symbols: Some Rotinese Perspectives 
James J. Fox
Harvard University
Roy Willis, ed., The Interpretation of Symbolism, Halsted Press Division, John Wiley &
Sons, 1975, 99-132. ISBN 0-470-94920-1
HC: $17.75
Rotinese chants are ordered by extensive parallelism. Each opposition in the Rotinese ritual
language is a dyadic set. Some terms may appear in more than one dyadic set. Tracing
relations among these semantic elements involves chains and cycles along the edges of a
symmetric graph. The analysis of 5000 lines of verse has yielded a dictionary of the ritual
language consisting of 1000+ dyadic sets over 1400 entries. By concentrating on elements
included in 5 or more dyads (the frequency cutoff is somewhat arbitrary) and with at least 2
links in common a core set of 21 entries is identified. The core includes directional
coordinates, words for 'earth', 'water', 'rock', and 'tree', terms for plants and plant-parts, body
parts, and a peculiar collection of verbs of position involving ideas of balance, border, ascent,
and descent. This core has been investigated using cluster analysis programs.
SOCIAL-BEHAVIORAL SCIENCE: PSYCHOLOGY 
On the Complexity of Causal Models 
B. R. Gaines
Man-Machine Systems Laboratory, Department of Electrical Engineering Science, University
of Essex, Colchester, England
IEEE Transactions on Systems, Man, and Cybernetics 6:56-59, January 1976
The principle of causality is fundamental to human thinking. It has been observed
experimentally that causal thinking leads to complex hypothesis formation by human subjects
attempting to solve comparatively simple problems involving acausal randomly generated events.
The assumption of causality in modeling acausal systems leads to meaningless models that
cannot reflect any stochastic structure present. This correspondence provides an
automata-theoretic explanation of this phenomenon by analyzing the performance of an optimal
modeler observing the behavior of a system and forming a minimal-state model of it.
SOCIAL-BEHAVIORAL SCIENCE: PSYCHOLOGY 
Computer-Determined Readability Profiles 
David M. Locke 
Illinois Institute of Technology, Chicago
Alan K. Stewart
Illinois Institute of Technology Research Institute, Chicago
SIGLASH Newsletter 8, No. 4:8-12, October 1975
The readability measure is based on sentence length in words and word length in 
alphanumeric characters. The program measures readability on paragraphs and paragraph 
blocks and proceeds paragraph by paragraph, cumulating data on analyst assigned blocks and 
over the entire passage. Paragraphs can be skipped if necessary. An example of the 
program's operation is given. 
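The two inputs named in this abstract, sentence length in words and word length in alphanumeric characters, can be gathered paragraph by paragraph and cumulated as in the sketch below. The abstract does not give the readability formula itself, so none is assumed here; the function names and the sentence-splitting rule (split on runs of ".", "!" and "?") are illustrative assumptions only.

```python
import re
from typing import Dict, List


def paragraph_profile(paragraph: str) -> Dict[str, int]:
    """Raw counts for one paragraph: sentences, words, alphanumeric characters."""
    # Assumed rule: a sentence ends at a run of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", paragraph) if s.strip()]
    words = re.findall(r"[A-Za-z0-9]+", paragraph)
    return {
        "sentences": len(sentences),
        "words": len(words),
        "chars": sum(len(w) for w in words),
    }


def cumulate(profiles: List[Dict[str, int]]) -> Dict[str, float]:
    """Sum counts over a block of paragraphs and derive the two averages."""
    total: Dict[str, float] = {
        key: sum(p[key] for p in profiles) for key in ("sentences", "words", "chars")
    }
    total["words_per_sentence"] = (
        total["words"] / total["sentences"] if total["sentences"] else 0.0
    )
    total["chars_per_word"] = total["chars"] / total["words"] if total["words"] else 0.0
    return total
```

Skipping a paragraph, as the abstract allows, amounts to leaving its profile out of the list passed to `cumulate`.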
SOCIAL-BEHAVIORAL SCIENCE: PSYCHOLOGY 
Computer Simulation of a Language Acquisition System: A First Report
John R. Anderson
Human Performance Center, University of Michigan
Robert L. Solso, Ed., Information Processing and Cognition: The Loyola Symposium, John
Wiley & Sons, 295-349, 1975
LAS (Language Acquisition System) is an interactive program which accepts as input lists of
words, which it treats as sentences, and scene descriptions encoded in a variant of the HAM
propositional language. It utilizes an augmented transition network grammar (Woods) for
both parsing and generation. The SPEAK program starts with a HAM network of
propositions tagged as to-be-spoken and a topic of sentence. SPEAK uses a depth-first
strategy to find a path in the parsing network which can be used to generate the requested
sentence. In the UNDERSTAND program it is necessary, when one path through the network
fails, to consider the possibility that the failure may be in a parsing of a subnetwork called
on that path and to go back into the network to attempt a different parsing. In acquisition,
BRACKET is an algorithm for taking a sentence of an arbitrary language and a HAM
conceptual structure and producing a bracketing of the sentence that indicates its surface
structure. This surface structure prescribes the hierarchy of networks required to parse the
sentence. After BRACKET is complete, SPEAKTEST is called to test whether its grammar is
capable of generating a sentence and, if it is not, appropriately to modify the grammar so that
it can. A list is kept of all networks created by SPEAKTEST; GENERALIZE is then called
to determine which networks are identical. At present LAS operates in a highly restricted
semantic domain, but a more challenging domain is planned for future work.
SOCIAL-BEHAVIORAL SCIENCE: PSYCHOLOGY: LEARNING
A Mathematical Theory of Learning Transformational Grammar
Henry Hamburger, and Kenneth Wexler
School of Social Sciences, University of California, Irvine
Journal of Mathematical Psychology 12:137-177, 1975
The language-learning theory has four interrelated aspects: (a) the class, G, of possible
grammars, (b) the kind, I, of information made available to the learner, (c) the language-
learning procedure, P, and (d) the criterion, C, of success. The theory must deduce a
guarantee that the correct member, g, of G can be discovered by applying procedure P to
information of type I about g. The sense in which g is "discovered" must be made precise by
a formal criterion C, and the proof must hold for arbitrary g in G. The system presented
converges: that is, the learning process learns the language to a formal criterion. The claim
that the theory is reasonable rests on the plausibility of G, I, P, and C. G is the class of all
TG's. Each element of I is a sentence coupled with its underlying structure. The procedure
P doesn't require explicit memory for past data and must acquire the ability to map each
phrase-marker (akin to meaning) to the appropriate surface structure.
HUMANITIES
Open-Ended Application of Information Processing Techniques in the
Humanities
Richard S. Thill, and Jerry L. Ray
University of Nebraska, Omaha
SIGLASH Newsletter 8, No. 2-3:7-11, April-June 1975
A system to store, manipulate, and retrieve linguistically structured material in interactive
usage has been applied to the poetic corpus of Heine. The system: 1) presents word
occurrence frequency lists alphabetically and by frequency, 2) is provided with a table of
contents about material stored on the tape, 3) has a word occurrence index with volume (in
the printed edition) and page references and frequency of occurrence, 4) provides key word
occurrence lists--the researcher lists words or stems of interest and receives a list of contexts
with volume, page, and line numbers, 5) provides a chronological histogram--the researcher
specifies term or terms for examination and is given a graphic display of variation in word
class frequency through time. Applications to CAI, Bibliography, and Reference materials are
being developed.
HUMANITIES 
The Computer and Creativity 
Jay A. Leavitt 
University of Minnesota, Minneapolis 
Allen R. Hanson 
Hampshire College, Amherst, Massachusetts 
SIGLASH Newsletter 8, No. 4:4-7, October 1975 
Though the computer itself cannot be considered creative, the artist can use the computer in 
2 modes: 1) as an active device contributing, through simulation, to the final artistic 
product, and 2) passively, like a tool to be manipulated by the user. Once we think we 
understand the structure of a class of objects under consideration, that is, how the parts 
making up the whole are interrelated, we can actively use this information and the random 
element to "create" objects similar to those being studied. 
Wandering in mist 
Reaching out to soft sunlight 
Blue-scaled dragons pause. 
HUMANITIES: CONCORDANCE 
Sasanian Pahlavi Inscriptions: A Concordance 
J. A. Moyne 
Department of Computer Science, Queens College, Department of Linguistics, Graduate 
Center of the City University of New York 
Computers and the Humanities 8, 27-39, 1974 
The collected volume contains all the known inscriptions in Pahlavi or Middle Persian of the 
Sasanian era (A.D. 226 to A.D. 651). The collection of 21 inscriptions was transliterated and 
punched on 718 IBM cards used as input for the concordance program, which produced the 
following output material: 1) a complete listing of the texts of the inscriptions, each 
separated under its heading, 2) two alphabetized concordances generated from all inscriptions, 
which are treated as one text, one for Pahlavi and one for Aramaic words, 3) two indices or 
listings (one for Pahlavi, one for Aramaic) produced for the words in the input corpus with 
line references following each word. This general-purpose concordance and text-processing 
computer program can be used for similar productions for any text in any language for 
which a suitable transliteration convention is available. Examples. 
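As a rough illustration of the third kind of output (a word listing with line references), the Python sketch below builds such an index from a list of text lines. It is not the program described in the article: the names `build_index` and `format_index`, the whitespace tokenization, and the sample lines are all assumptions made for this example.

```python
from collections import defaultdict

def build_index(lines):
    """Map each word to the 1-based line numbers on which it occurs."""
    index = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        # Assumption: words in the transliteration are separated by whitespace.
        for word in line.split():
            # Record a line only once even if the word repeats on it.
            if not index[word] or index[word][-1] != number:
                index[word].append(number)
    return dict(index)

def format_index(index):
    """Return alphabetized 'word  line, line, ...' entries, one per word."""
    return [f"{word}  {', '.join(map(str, refs))}"
            for word, refs in sorted(index.items())]

# Invented sample lines; they only stand in for a transliterated text.
sample = ["the king of kings", "king of the land", "the land"]
for entry in format_index(build_index(sample)):
    print(entry)
```

A full concordance would also print the context of each occurrence; the index above keeps only the line references.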
HUMANITIES: ANALYSIS 
Pericles and the Question of Structural Unity 
T. R. Waldo 
University of Florida, Gainesville 
SIGLASH Newsletter 8, Nos. 2-3:22-27, April-June 1975 
The analysis of Pericles by computer establishes the organic function of ideas clustered 
around themes of chance (external forces operating on man), with 927 occurrences, and 
choice (humanly controllable forces), with 979 occurrences. An occurrence table indicates 
quantity of occurrence, distribution of themes through scenes, and proportion of occurrence 
of a theme--defined in terms of the percentage of words in a scene with a certain classification 
symbol (a theme indicator) divided by the percentage of words of that class in the play as a 
whole. The use of the thematic profile must be supplemented with conventional techniques 
of literary analysis. 
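The "proportion of occurrence" defined above is a ratio of two percentages. A minimal sketch of that ratio follows, assuming each word has already been tagged with a set of classification symbols. The function names and the toy data are hypothetical and are not taken from the study.

```python
def percentage(tagged_words, symbol):
    """Percentage of words that carry the given classification symbol."""
    if not tagged_words:
        return 0.0
    hits = sum(1 for _, symbols in tagged_words if symbol in symbols)
    return 100.0 * hits / len(tagged_words)

def proportion_of_occurrence(scene, play, symbol):
    """Scene percentage divided by whole-play percentage for one theme symbol."""
    play_pct = percentage(play, symbol)
    if play_pct == 0.0:
        raise ValueError("theme does not occur in the play")
    return percentage(scene, symbol) / play_pct

# Toy data (invented): a 4-word scene inside a 20-word "play".
scene = [("fortune", {"chance"}), ("storm", {"chance"}), ("the", set()), ("sea", set())]
rest = [("word", set())] * 14 + [("hap", {"chance"})] * 2
play = scene + rest

# 50% of the scene versus 20% of the play carry the symbol.
print(proportion_of_occurrence(scene, play, "chance"))
```

Under this definition a value above 1 means the theme is denser in the scene than in the play as a whole.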
HUMANITIES: ANALYSIS 
Mathematical Theory of Free Rhythm 
Evzen Kindler 
Department of Mathematical Informatics, Faculty of Mathematics and Physics, Charles 
University, Prague, Czechoslovakia 
Free rhythm is that used in ancient Greek and Latin prose and in the music of the early 
Middle Ages in Europe. An elementary arsis (AE) is an elementary time value plus an upbeat; 
an elementary thesis (TE) is one or two elementary time values plus a downbeat. An 
elementary rhythm (RE) is: AE TE. A simple rhythm consists of a simple arsis (not the same 
as an elementary arsis) followed by a simple thesis. Rhythms may be recursively composed 
of other rhythms to form a phrase (K). A phrase is a result of the highest rhythmical 
synthesis of the elementary time values while the ordering of phrases is no more an affair of 
rhythm but of stylistic form. A grammar of free rhythm is an ordered 4-tuple: G = (V, V1, 
K, S), where V is a set of 6 terminal symbols, V1 is a set of 13 auxiliary symbols, K is a 
member of V and is the initial symbol, and S is a set of 21 production rules. A modified 
grammar (G*) has 5 terminal symbols, 6 auxiliary symbols, and 13 production rules, but offers 
an equivalent rhythmic synthesis. An algorithm is given to complete texts generated by G* 
and transform them into texts generated by G. 
HUMANITIES: ANALYSIS 
Proving Musical Theorems I: The Middleground of Heinrich Schenker's Theory 
of Tonality 
Michael Kassler 
Basser Department of Computer Science, School of Physics, The University of Sydney 
Technical Report No. 107, August 1975 
Schenker's theory of tonality asserts that every composition that is an instance of tonality can 
be derived from one of three 'Ursaetze' (background) by the successive application of a small 
number of rules of inference called 'prolongation techniques' (middleground). Harmonic, 
rhythmic, melodic, dynamic, etc. details belong to the foreground stage of derivation. The 
middleground rules for the major mode involve two formalized languages, S1, which has 11 
inference rules (corresponding to Schenker's prolongation techniques), and S2, with 5 
inference rules. S1 governs dilynear compositions (a lyne being one musical voice) and has 
three primitive axioms, while S2 governs trilynear compositions and contains an axiom 
structure which relates the systems S1 and S2. The decision procedure provides one proof for 
any given theorem (i.e. composition) rather than all possible proofs. At the middle level it 
seems that alternative minimal proofs in no way demonstrate a semantically important 
ambiguity, though the situation is likely to change significantly with the explication of 
foreground structures. 
HUMANITIES: ANALYSIS 
Rhythmic Variation in the Use of Time References 
Jan Svartvik 
University of Lund 
Hakan Ringbom, Ed., Style and Text, Sprakforlaget Skriptor AB, Stockholm, 416-432, 1975 
ISBN 91-7282-095-0 
A study of time references in James D. Watson's The Double Helix reveals that they occur 
rhythmically throughout the book, which has 29 chapters. If one graphs the percentage of 
time references per chapter against chapter numbers, the resultant curve has five peaks 
(Chapters 1, 8, 15, 22, and 29) with six chapters between successive peaks. There are four 
troughs (Chapters 2, 10, 17, and 25). The first and last gaps between the troughs cover seven 
chapters each while the middle gap covers six chapters. A similar rhythmic variation in time 
reference usage appears in the paragraph structure within individual chapters. 
INSTRUCTION 
Computer-Assisted Tutorial in College Mathematics 
J. L. Caldwell 
Department of Mathematics, University of Wisconsin-River Falls, 54022 
Douglas Polley 
Department of Mathematics, University of Minnesota, Minneapolis, 55455 
Computers and People 24, No. 12:22-23, December 1975 
The programs deal with quadratic equations and equations of straight lines. The student, 
seated at a teletype, receives a problem with randomly chosen parameters. The student then 
solves the problem and enters his solution. If his solution agrees with the computer's 
solution, then another problem is presented. If the student's solution is incorrect, the 
computer then works the problem incorrectly in several different ways, each time checking its 
answer against the student's. If a match is obtained, then the computer suggests that the 
student has made a certain error and asks him to try again. If none of the incorrect answers 
matches the student's answer, then the computer will check the student's work step by step. If 
a student requires detailed solutions to several problems, he is asked to contact the instructor. 
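The diagnostic idea above (work the problem wrongly in several known ways and compare each wrong answer with the student's) can be sketched in Python as follows. This is only an illustration under stated assumptions: the two mistake models, the function names, and the restriction to quadratics with integer roots are invented here and are not taken from the programs described.

```python
import math
import random

def correct_roots(a, b, c):
    """Roots of a*x^2 + b*x + c = 0 by the quadratic formula (real roots assumed)."""
    root = math.sqrt(b * b - 4 * a * c)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

def sign_error_on_b(a, b, c):
    """Hypothetical mistake model: uses +b where the formula has -b."""
    root = math.sqrt(b * b - 4 * a * c)
    return sorted([(b - root) / (2 * a), (b + root) / (2 * a)])

def forgot_to_divide(a, b, c):
    """Hypothetical mistake model: omits the division by 2a."""
    root = math.sqrt(b * b - 4 * a * c)
    return sorted([-b - root, -b + root])

MISTAKES = [
    ("sign error on b", sign_error_on_b),
    ("forgot to divide by 2a", forgot_to_divide),
]

def same(xs, ys):
    """Compare two collections of roots, ignoring order and rounding noise."""
    return len(xs) == len(ys) and all(
        math.isclose(x, y, abs_tol=1e-9) for x, y in zip(sorted(xs), sorted(ys)))

def diagnose(a, b, c, student_roots):
    """Return 'correct', the name of a matching mistake, or None if nothing matches."""
    if same(student_roots, correct_roots(a, b, c)):
        return "correct"
    for name, wrong_method in MISTAKES:
        if same(student_roots, wrong_method(a, b, c)):
            return name
    return None  # no match: the abstract's next stage is a step-by-step check

def random_problem(rng=random):
    """Pick integer roots at random; return coefficients (a, b, c) with a = 1."""
    r1, r2 = rng.randint(-9, 9), rng.randint(-9, 9)
    return 1, -(r1 + r2), r1 * r2

# x^2 - 5x + 6 = 0 has roots 2 and 3; the answer -3, -2 matches the sign-error model.
print(diagnose(1, -5, 6, [-3, -2]))
```

Checking the correct answer first matters: for some problems a mistake model happens to give the right roots, and the student should not be told he erred.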
INSTRUCTION 
Computer Applicability in a Programmed Instruction System for Chinese/ 
Japanese Characters 
Benjamin K. T'sou, and Yat-Shing Cheung 
Linguistics Department, University of California, San Diego 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese 
Input/Output Systems, Academia Sinica, 641-650 
The set of video equipment consists of a video recorder with a T.V. monitor, a specially 
modified camera and an electronic interface device called the Genloc System. The camera is 
installed under the monitor in a cabinet and aims at a piece of translucent glass fitted onto a 
protruding panel. The student places a piece of paper on top of the glass and practices 
writing by copying the model on the screen. The system has three basic components: 1) 
Display, 2) Input of instructional material, which is organized into three distinct phases, and 
3) Student input. When the student turns the switch for the Concurrent Display Mode, the 
character produced by him will be superimposed onto the instructor-produced character and 
concurrently displayed on the screen. In the simulated computer aided system, the Concurrent 
Display Mode will be monitored not only by the student, but also by a research assistant 
through a remote monitoring unit. The research assistant will do what a computer would be 
expected to do. 
INSTRUCTION 
Teaching and Learning Chinese in a Computer Environment 
Susan Poh 
The MITRE Corporation, Westgate Research Park, McLean, Virginia 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese 
Input/Output Systems, Academia Sinica, 617-639 
People in Western civilization encounter great difficulty when attempting to learn the 
Chinese language. The major difficulties seem to be due to the non-alphabetic, non-phonetic 
and pictorially-structured characters. The system described applies computers and computer 
graphics to teach the Chinese ideogram in a systematic, simple and effective manner. A set 
of LOGO procedures is defined which draw simple strokes, and these procedures are then 
used as primitive commands to construct ideograms on a CRT, one stroke at a time. Five 
demonstration courseware lessons are discussed to illustrate the teaching methodology. 
INSTRUCTION 
Computer Aided Instruction in Chinese Characters, II--Implementation 
Shang-Chun Chen, and Henry Y. H. Chuang 
Computer Systems Laboratory, Washington University, St. Louis, Missouri 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese 
Input/Output Systems, Academia Sinica, 605-616 
In character coding for display, the system uses a method based on "piecewise linear 
approximation". The procedure for stroke recognition has three steps: 1) Transform the 
stroke code into a "direction sequence". 2) Transform the direction sequence into a "major 
direction sequence". 3) Identify the stroke type by processing the major direction sequence 
with an automaton. The structure of a character can be unambiguously described as a 
context-free grammar with strokes as the terminal elements of the grammar. The 
nonterminals include 'units' and 'components'. A unit consists of a group of closely related 
strokes and components consist of two units or of two smaller components. Based on this 
structure, the computer builds a structured base for error analysis. This base consists of a set 
of relation matrices, one for each production of its grammar. Error detection proceeds by 
comparing entries in the relation matrices of the model character with those of the character 
written by the student. 
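The three recognition steps named above can be sketched in Python as follows. The abstract does not say how the sequences are computed, so every detail here is an assumption: the 8-way direction codes, the `min_run` threshold for a "major" direction, the transition table, and the stroke-type names are hypothetical.

```python
import math

def direction_sequence(points):
    """Step 1: code each segment of a piecewise-linear stroke as a direction 0..7.

    Assumption: 0 = east, codes increase counter-clockwise, y grows upward.
    """
    seq = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0, y0) == (x1, y1):
            continue  # skip zero-length segments
        angle = math.atan2(y1 - y0, x1 - x0)
        seq.append(round(angle / (math.pi / 4)) % 8)
    return seq

def major_direction_sequence(seq, min_run=2):
    """Step 2: keep directions that persist for at least min_run segments."""
    majors = []
    i = 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == seq[i]:
            j += 1
        if j - i >= min_run and (not majors or majors[-1] != seq[i]):
            majors.append(seq[i])
        i = j
    return majors

# Step 3: a small finite automaton over major directions (hypothetical table).
TRANSITIONS = {
    ("start", 0): "horizontal",
    ("start", 6): "vertical",
    ("horizontal", 6): "turn",
}
ACCEPTING = {"horizontal", "vertical", "turn"}

def classify_stroke(points):
    """Run the automaton; return a stroke type or None if the stroke is rejected."""
    state = "start"
    for direction in major_direction_sequence(direction_sequence(points)):
        state = TRANSITIONS.get((state, direction))
        if state is None:
            return None
    return state if state in ACCEPTING else None

# A stroke that runs east and then south is classified as a "turn".
print(classify_stroke([(0, 0), (1, 0), (2, 0), (2, -1), (2, -2)]))
```

Dropping short runs in step 2 is one simple way to ignore pen jitter; the actual system may define major directions differently.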
INSTRUCTION 
Computer Aided Instruction in Chinese Characters, I--The System 
Henry Y. H. Chuang, and Shang-Chun Chen 
Computer Systems Laboratory, Washington University, St. Louis, Missouri 
S. Gould, Ed., Proceedings of the First International Symposium on Computers and Chinese 
Input/Output Systems, Academia Sinica, 599-603 
The system functions completed so far include: 1) Teach the pronunciations, meanings, and 
the written form of characters, as well as how to write them, 2) Guide the writing, stroke by 
stroke, with or without information regarding the relative size and position of the strokes, 3) 
Monitor the student's writing to assist the student in correcting mistakes by himself, and 4) 
Analyze the student's writing and then indicate errors and how to improve. 
BRAIN THEORY 
Brains, Robots, and the Evolution of Language 
Michael Arbib 
Department of Computer and Information Science, University of Massachusetts, Amherst 
Technical Report 74C-2, 40 pp. 
For interacting with the world, animals and robots need a spatial framework, the ability to 
segment the world into chunks, and long-term and short-term models of the world. The 
ability to see what objects are in the world and to interact appropriately with the world is 
shared by all animals. Language is not a magic separate device to be explained by formal 
grammars, but is rather an ability which evolved naturally out of our ability to perceive the 
world. The child's acquisition of language builds on his cognitive capacities and uses a 
strategy in which he reconstructs sentences according to his own rules and proceeds by 
fine-tuning of linguistic structures. A distributed information processing machine (DIPM) is 
suggested as a theoretical model in which the logic functions of the machine are distributed 
and not centralized. Consequently no single center exists in which damage could lead to the 
breakdown of the whole machine. The cybernetic study of language needs a theoretical 
language which encompasses AI work and brain theory. 
BRAIN THEORY 
Artificial Intelligence and Brain Theory: Unities and Diversities 
Michael A. Arbib 
Computer and Information Science Department, Center for Systems Neuroscience, University 
of Massachusetts, Amherst 
Technical Report 75C-6, 62 pages, September 1975 
In the control of movement, AI offers insight into overall planning of behavior, while control 
theory enables BT (Brain Theory) to model feedback and feedforward adjustments by the 
spinal cord, brainstem and cerebellum. A schema is an internal representation of an 'object' 
and comprises input-matching routines, action routines, and competition and cooperation 
routines. The internal representation of the world is then given by a 'collage' of tuned and 
activated schemas. A number of studies in AI and BT are discussed which offer hope of a 
unified theory of competition and cooperation within a single subsystem. We then turn to 
the modelling of a set of brain regions as a cooperative computation system--a distributed 
structure in which each system has its own 'goal structure' for selecting information to act on 
from its environment, and for transmitting the results to suitable receivers. Finally we 
sample AI studies of speech understanding. Of particular interest is work by the Carnegie- 
Mellon Hearsay group on a system using a network of PDP 11's, each functioning as a 
knowledge source, which interact through a communication center called a blackboard rather 
than being controlled by an executive. 
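The blackboard idea mentioned above can be illustrated with a toy Python sketch in which independent knowledge sources read and write a shared structure. It does not reproduce the Hearsay design: the two knowledge sources, the `LEXICON`, and the plain loop that stands in for separate processors are all assumptions made for this example.

```python
# Hypothetical lexicon used by the word-level knowledge source.
LEXICON = {"robot", "brain"}

def syllable_source(board):
    """Knowledge source: splits the raw signal into syllables (toy rule: on '-')."""
    if "signal" in board and "syllables" not in board:
        board["syllables"] = board["signal"].split("-")
        return True
    return False

def word_source(board):
    """Knowledge source: proposes a word if the joined syllables are in the lexicon."""
    if "syllables" in board and "word" not in board:
        candidate = "".join(board["syllables"])
        if candidate in LEXICON:
            board["word"] = candidate
            return True
    return False

def run(board, sources):
    """Offer the blackboard to every source until none has anything to add.

    No source calls another; each reacts only to what is on the blackboard.
    """
    changed = True
    while changed:
        changed = False
        for source in sources:
            if source(board):
                changed = True
    return board

# The order in which sources are listed does not affect the result.
print(run({"signal": "ro-bot"}, [word_source, syllable_source]))
```

Because each source tests the blackboard for its own preconditions, sources can be added or removed without changing the others.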
ROBOTICS 
Robot Systems 
James S. Albus, and John M. Evans, Jr. 
Office of Developmental Automation and Control Technology, National Bureau of 
Standards 
Scientific American 234, No. 2, 77-86B, February 1976 
Each level in a robot's control hierarchy accepts commands from the next higher level and 
responds by issuing ordered sequences of commands to the next lower level, making use of 
sensory feedback to close control loops where they are appropriate. At each level feedback 
signals are sent to the next higher level and other feedback signals are received from below. 
These signals indicate the state of the manipulator and the environment. The type of 
feedback needed depends strongly on the degree of uncertainty encountered in the 
environment. Large uncertainties in the robot's world require that the upper levels of control 
structure incorporate a "world model" which can represent the state of the environment in a 
meaningful way. When such a robot (e.g. SRI's Shakey) is faced with an input command to 
be executed, it tries to set up a hypothetical desired world model and then to devise a set of 
procedures that can convert the existing world model into the desired one. Various types of 
industrial robots and experimental approaches to machine vision are briefly discussed. 
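The control hierarchy described above, in which each level expands a command into an ordered sequence for the level below and reports feedback upward, can be sketched in Python as follows. The decomposition table, the command names, and the `sensor_ok` callback are hypothetical; a real controller would close its loops on continuous sensor data rather than a pass/fail flag.

```python
# Hypothetical task decomposition: each command maps to an ordered list of
# commands for the next lower level. Commands absent from the table are primitives.
DECOMPOSITION = {
    "fetch part": ["move to bin", "grasp", "move to fixture", "release"],
    "move to bin": ["rotate joint", "extend arm"],
    "move to fixture": ["rotate joint", "extend arm"],
}

def execute(command, sensor_ok, trace):
    """Carry out a command; return True on success so the level above gets feedback."""
    subcommands = DECOMPOSITION.get(command)
    if subcommands is None:
        # Lowest level: perform the primitive and report what the sensors say.
        trace.append(command)
        return sensor_ok(command)
    for sub in subcommands:
        if not execute(sub, sensor_ok, trace):
            return False  # pass the failure up instead of continuing blindly
    return True

trace = []
succeeded = execute("fetch part", lambda primitive: True, trace)
print(succeeded, trace)
```

If a primitive fails, the remaining commands at every level above it are abandoned, which is the simplest possible use of upward feedback.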
American Journal of Computational Linguistics Microfiche 50 : 93 
NEW JOURNAL 
CAHIERS 
DU GROUPE DE TRAVAIL 
ANALYSE ET EXPERIMENTATION DANS LES SCIENCES DE L'HOMME 
PAR LES METHODES INFORMATIQUES 
CONTENTS 
Programme des travaux du groupe de travail 
Suites aleatoires selon Kolmogorof et Martin-Lof par 
Luis Farinas 
Informatique et Sciences Humaines: les enseignements 
d'un colloque par P. Cibois 
Nouvelles breves 
ADDRESS 
Laboratoire d'Informatique pour les Sciences de l'Homme 
31, Chemin Joseph-Aiguier 
13274 Marseille - Cedex 2 
France 
The Groupe de Travail is attached to AFCET, Division Theorie 
et Techniques de l'Informatique. 
American Journal of Computational Linguistics 
Microfiche 50 : 94 
BIBLIOGRAPHY AND SUBJECT INDEX 
CURRENT COMPUTING LITERATURE 
Sections: Citations 
Author index 
Subject index by Computing Reviews categories 
Keyword (permuted title) index 
Coverage: Reviews and abstracts published in Computing Reviews 
during 1974 (annual publication) 
Address: Association for Computing Machinery 
P. O. Box 12105 
Church Street Station 
New York, New York 10249 
Price : $10 for ACM Members 
$25 for others 
American Journal of Computational Linguistics 
Microfiche 50 : 95 
ADMINISTRATIVE DIRECTORY 
Author: John W. Hamblen and students 
University of Missouri, Rolla 
Entries: 300 computer science departments 
900 computer centers 
societies and government agencies 
Contents: Name of chairman or director; address 
Degree programs 
Computing equipment 
Address : ACM Order Department 
P. O. Box 12105 
Church Street Station 
New York, New York 10249 
Price: 
$5 for ACM members and those listed in the Directory 
$7.50 for others 
200 pages 
American Journal of Computational Linguistics 
Microfiche 50 : 96 
PRIVACY, SECURITY, AND THE 
INFORMATION PROCESSING INDUSTRY 
Author: Dahl A. Gerberick, Chairman 
Ombudsman Committee on Privacy 
Los Angeles Chapter of ACM 
Topics : Administrative, technological, and legal considerations 
Guidelines for implementation 
Comprehensive Right to Privacy Bill 
Privacy Act of 1974 
Data Center Security Check List 
Glossary of Terms 
Bibliography 
Address: ACM Order Department 
P. O. Box 12105 
Church Street Station 
New York, New York 10249 
Price: $9.00 for ACM members 
$12.00 for others 

References
[1] Feigenbaum, Edward A. (1963), "Simulation of Verbal Learning Behavior", in Computers 
and Thought, eds. E. A. Feigenbaum and J. Feldman, McGraw-Hill. 
[2] Goldman, Neil (1975), "Sentence Paraphrasing from a Conceptual Base", Communications of 
the ACM, February 1975, Vol. 18 No. 2. 
[3] Harris, Larry R. (1972), "A Model for Adaptive Problem Solving Applied to Natural Language 
Acquisition", Cornell University, Ithaca, N.Y. PB-211 378. 
[4] Kelly, Edward, and Stone, Philip (1975), "Computer Recognition of English Word Senses", 
Chapter IV, North-Holland Publishing Co., Amsterdam. 
[5] Lamb, Sydney M., and Jacobsen, William H., Jr. (1966), "A High-Speed Large-Capacity 
Dictionary System", in Readings in Automatic Language Processing, ed. David G. Hays, 
American Elsevier Publishing Company, New York. 
[6] Quillian, M. Ross (1968), "Semantic Memory", in Semantic Information Processing, ed. 
Marvin Minsky, The MIT Press, Cambridge, Massachusetts. 
[7] Schank, R., Goldman, N., Rieger, C., and Riesbeck, C. (1973), "Margie: Memory, Analysis, 
Response Generation, and Inference on English", Proceedings, Third International Joint 
Conference on Artificial Intelligence, Stanford Research Institute, Stanford, California. 
[8] Schank, Roger C. (1973), "Identification of Conceptualizations Underlying Natural 
Language", in Computer Models of Thought and Language, eds. R. Schank and K. Colby, 
W. H. Freeman & Co., San Francisco. 
[9] Schank, Roger C. (1973), "The Conceptual Analysis of Natural Language", in Natural 
Language Processing, ed. Randall Rustin, Algorithmics Press, Inc., New York. 
[10] Schmidt, Charles T. (1970), "A Dictionary Structure for Use with an English Language 
Preprocessor to a Computerized Information Retrieval System", Naval Postgraduate School, 
Monterey, California. AD 710 363. 
[11] Simmons, R. F., and Slocum, J. (1972), "Generating English Discourse from Semantic 
Networks", Communications of the ACM, October 1972, Vol. 15 No. 10. 
[12] Simmons, R. F. (1973), "Semantic Networks: Their Computation and Use for Understanding 
English Sentences", in Computer Models of Thought and Language, eds. R. Schank and 
K. Colby, W. H. Freeman & Co., San Francisco. 
