A SEARCH ALGORITHM AND DATA STRUCTURE FOR AN EFFICIENT INFORMATION SYSTEM 
by 
Shou-chuan Yang
Data and Computation Center 
University of Wisconsin 
Madison, Wisconsin 
Abstract 
This paper describes a system for information storage, retrieval, 
and updating, with special attention to the search algorithm and data 
structure demanded for maximum program efficiency. Program
efficiency is especially warranted when a natural language or a symbolic
language is involved in the searching process.
The system is a basic framework for an efficient information system. 
It can be implemented for text processing and document retrieval; 
numerical data retrieval; and for handling of large files such as 
dictionaries, catalogs, and personnel records, as well as graphic
information. Currently, eight commands are implemented and
operational in batch mode on a CDC 3600: STORE, RETRIEVE, ADD, DELETE,
REPLACE, PRINT, COMPRESS, and LIST. Further development will focus on
the use of a teletype console, CRT terminal, and plotter in a
time-sharing environment to produce immediate responses.
Maximum program efficiency is obtained through a unique search
algorithm and data structure. Instead of examining the recall ratio
and the precision ratio at a higher level, this efficiency is
measured in the most basic terms: the "average number of searches"
required for looking up an item. In order to identify an item,
at least one search is necessary even if it is found the first time.
However, through the use of the hash address of a key or keyword,
in conjunction with an indirect-chaining list-structured table and
a large available space list, the average number of searches
required for retrieving a given item is 1.25 regardless of the size of
the file in question. This compares with 15.6 searches for the
binary search technique in a 50,000-item file, and 5.8 searches for
the letter-table method regardless of file size.
*This study was supported in part by the National Science Foundation 
and the University of Wisconsin. 
Best of all, since the program can use the same technique for
storing and updating information, the same efficiency applies to
those operations with equal ease. This eliminates the
inefficiency otherwise incurred in establishing and updating a file.
I. MOTIVATION 
In daily life there are many occasions for looking up some type of
information: checking a new word in a dictionary, finding a
telephone number or an address in a directory, searching a library
card catalog for a book by author, title, or subject, etc. Before
the desired information is found, one has to examine a number of
items or entries closely. In mechanized information systems this
quantity is usually termed the "number of searches", "number of
lookups", or "number of file accesses".
However, as King pointed out in his article in the Annual Review
of Information Science and Technology, volume 3 (pp. 74-75), the
most common measures of accuracy of an information system are the
recall ratio and the precision ratio. These two measures have come under
considerable criticism for their indifference to retrieval
characteristics, for being misleading, and for producing varying
results. They probably should be used primarily to highlight a
system's unsatisfactory performance. According to the failure
analyses of Hooper, King, Lancaster and others, the reasons for such
performance are: incorrect query formulation, indexing errors,
mechanical errors, incorrect screening, etc.
In the same volume (p. 139), Shoffner commented on the evaluation
of systems that "it is important to be able to determine the extent
to which file structures and search techniques influence the recall,
precision, and other measures of system performance". Until very
recently, file structures and search techniques were apparently
unpopular topics among information scientists, except for Salton and a
few others. Nevertheless, these topics have been attacked constantly
by system scientists, who deal with much smaller files but for whom
maximum efficiency is a vital factor for the total system. They are
frequently discussed under the title of "symbol table techniques", or
"scatter storage techniques" as used by Morris as the title of his
article. In addition to the "number of searches" and the "number of
lookups", other terms used by system scientists for this most basic
measure are the "number of probes", the "number of attempts", and the
"search length".
Since entering the computer profession in 1964, the author has
noticed that the efficiency of a file handling system is always
crippled by its file searching technique, no matter how sophisticated
the system. This was especially the case during 1965 and 1966 when
the author was employed at the Itek Corporation on an Air Force project,
a Chinese-to-English machine translation experiment. The best
search technique used for dictionary lookups was the binary search,
which is still considered one of the best techniques available today.
For a large file with a huge number of records, entries or items,
the binary search technique still yields a substantial number
of searches, which is a function of the file size. Typical files
are: dictionaries of any sort, telephone directories, library
catalog cards, personnel records, merchandise catalogs, document
collections, etc. For example, in a 50,000-entry file system the
average number of searches for finding an entry is 15.6, calculated as
log2 N. This figure is not very satisfactory if search inquiries
to a file are frequent. A search for better techniques turned up
at least three algorithms that are more satisfactory than the binary
search: Lamb and Jacobsen's "Letter Table Method", Peterson's
"Open-Addressing Technique", and Johnson's "Indirect Chaining Method".
They have a rather interesting common feature: the file size is
no longer a factor in the search efficiency.
II. EFFICIENCY OF VARIOUS SEARCH ALGORITHMS
In order to have a broad understanding of various search
algorithms, six of them are examined and compared with respect to
their search efficiencies.
1. Linear Search
This is also called sequential search or sequential scan.
The linear search of an unordered list or file is the simplest
method, but it is inefficient because the average number of searches
for a given entry in an N-entry file is about N/2. For example,
if N = 50,000, the average number of searches for a given entry
is an enormous 25,000. It is assumed that the probability of
finding a given entry in the file is one. The average number
of searches in a linear search is calculated as:
$$S = \frac{N + 1}{2} \quad \text{or} \quad S \approx \frac{N}{2}$$
if N is a large number.
The linear search has to be performed in a consecutive
storage area, and this sometimes causes inconvenience if
the required storage area is very large. The inconvenience
can be avoided by using the last computer word (or some bits
of it) to index the location of the next section of storage
area used, thus forming a single chain for searching. This
variation of the linear search method is called the single
chain method. It differs from the linear search in storage
flexibility but is otherwise the same in efficiency.
2. Directory Search 
This is also called block search. With the aid of a
directory which contains the addresses of every Bth entry of
the ordered file, a better result can be achieved because the
average number of searches is greatly reduced. For the best
result, choosing the blocking factor B = 220 in the example
above, the answer is 223.6 searches, which is calculated as:
$$S = \frac{N + B^2}{2B} \quad \text{or} \quad S = \frac{N}{2B} + \frac{B}{2}$$
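The choice of blocking factor can be checked with a short derivation. This is a sketch that treats B as a continuous variable (an assumption made only for the calculus) and minimizes S for a fixed N:

```latex
\begin{aligned}
S(B) &= \frac{N}{2B} + \frac{B}{2}, \\
\frac{dS}{dB} &= -\frac{N}{2B^{2}} + \frac{1}{2} = 0
  \;\Longrightarrow\; B = \sqrt{N}, \\
S_{\min} &= \frac{N}{2\sqrt{N}} + \frac{\sqrt{N}}{2} = \sqrt{N}.
\end{aligned}
```

For N = 50,000 this gives B of about 223.6 and S of about 223.6. The rounded choice B = 220 gives S = 113.6 + 110 = 223.6 to one decimal place, so the curve is quite flat near its minimum.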
3. Binary Search 
Using the binary search method will yield a more satisfactory
result. The search starts with the midpoint of the file,
and goes to the midpoint of the associated remaining half if
a match fails. The comparison of the two values decides which
half of the file should be tried next. This process is
repeated until a match is found. The average number of
searches in the example is calculated through the following formula
as 15.6 searches:
$$S = \frac{N + 1}{N} \log_2 (N + 1) - 1 \quad \text{or} \quad S \approx \log_2 N$$
if N is a large number.
Hibbard's Double Chain Method and Sussenguth's Distributed
Key Method are comparable to the binary search in
search efficiency but have a much better update efficiency
because of the list-structured address-chaining mechanism.
The respective calculations for the example are:
Hibbard: S = 1.4 log2 N = 21.9 and
Sussenguth: S = 1.24 log2 N = 19.4
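The sample figures quoted so far for N = 50,000 can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the original test program; the function names are chosen here for illustration, and the binary search uses the large-N approximation log2 N.

```python
import math


def linear_search_avg(n):
    """Average searches for a linear scan of an n-entry file."""
    return (n + 1) / 2


def directory_search_avg(n, b):
    """Average searches for a directory (block) search with blocking factor b."""
    return (n + b * b) / (2 * b)


def binary_search_avg(n):
    """Large-N approximation for binary search."""
    return math.log2(n)


def hibbard_avg(n):
    """Double chain method, using the factor 1.4 quoted in the text."""
    return 1.4 * math.log2(n)


def sussenguth_avg(n):
    """Distributed key method, using the factor 1.24 quoted in the text."""
    return 1.24 * math.log2(n)


if __name__ == "__main__":
    N = 50_000
    print("linear     ", linear_search_avg(N))
    print("directory  ", round(directory_search_avg(N, 220), 1))
    print("binary     ", round(binary_search_avg(N), 1))
    print("Hibbard    ", round(hibbard_avg(N), 1))
    print("Sussenguth ", round(sussenguth_avg(N), 1))
```

All of these grow with N, which is the point of the comparison with the three methods that follow.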
4. Letter Table Method 
This attractive method, suggested by Lamb and Jacobsen
in 1961 for dictionary lookup in a machine translation
system, has not received much attention for its possible
applications in general information systems. The reasons could be
the immediate reaction to the numerous letter tables needed after the
second level, which suggests storage inefficiency, and the
fact that no clear search efficiency or update efficiency was
stated.
Suppose only the twenty-six English letters are involved.
In theory there are twenty-six tables at the first level, 26^2
tables at the second level, 26^3 tables at the third level,
etc. In practice the number of letter tables will be reduced
drastically after the second level because of the actual
limitation on letter combinations in forming a vocabulary. However,
no studies of this sort are available for calculating the
storage requirement to disprove its storage inefficiency.
The average number of searches, or the expected search length,
of this method cannot be calculated as a function of the file or
dictionary size. It is simply the average number of letters or
characters per word in a given language plus one space character or
other delimiter. For the English language, it is a favorable
5.8 searches (S = 4.8 + 1), regardless of the file
size. Its update efficiency is comparable to its search
efficiency and may be estimated at less than twice the average
number of searches.
In order to achieve the above efficiency, the letter tables
at each level should be structured in alphabetic order, and
every letter should be converted into a numeric value such
as A = 1, B = 2, C = 3, ... , Z = 26 and the space delimiter = 0
or 27 through a simple table-lookup procedure. These converted
values are then used as the direct-access address within
each subset of alphabetic letters at each letter-table level.
This removes the need for a binary search within each subset of
"brothers" as in the cases of Hibbard's and Sussenguth's
searches.
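The direct-access idea can be sketched as a tree of 27-slot tables, one slot per letter value. This is an illustrative reading of the description above, not Lamb and Jacobsen's implementation; it assumes single words made of the letters A to Z, uses 0 for the space delimiter, and the function names are hypothetical.

```python
def letter_value(ch):
    """Map the space delimiter to 0 and the letters A..Z to 1..26."""
    return 0 if ch == " " else ord(ch) - ord("A") + 1


def new_table():
    """One letter table: slot 0 for the delimiter, slots 1..26 for letters."""
    return [None] * 27


def insert(root, word, entry):
    """Create the tables along the word's letters and store the entry."""
    table = root
    for ch in word:
        slot = letter_value(ch)
        if table[slot] is None:
            table[slot] = new_table()
        table = table[slot]
    table[0] = entry  # the delimiter slot holds the entry itself


def lookup(root, word):
    """Return (entry, searches); entry is None when the word is absent."""
    table = root
    searches = 0
    for ch in word + " ":
        searches += 1
        found = table[letter_value(ch)]
        if found is None or ch == " ":
            return found, searches
        table = found


if __name__ == "__main__":
    root = new_table()
    insert(root, "CAT", "a small feline")
    print(lookup(root, "CAT"))
    print(lookup(root, "CAR"))
```

A successful lookup costs one table access per letter plus one for the delimiter, which is where the figure of the average word length plus one comes from.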
5. Open Addressing Method 
As early as 1957, Peterson introduced this method for
random access storage addressing. This method is also called
linear probing. It assumes the existence of a certain hash
function to transform the key or keyword of an entry into a
numerical value within the range of the table size, which is
predetermined as 2^M for some integer value of M. The table
size should be large enough to accommodate all the entries
of the file. As in other methods, this method also assumes the
probability of finding an entry in the file is equal to one.
Under these two assumptions, and if a good hash function 
is selected for a balanced distribution of hash values, the 
open addressing method will resolve the situation if more than 
one key is mapped into a particular slot in the table, and 
yields a very attractive average number of searches in most of 
the cases. The algorithm is best described in Morris' phrases: 
"The first method of generating successive
calculated addresses to be suggested in the 
literature was simply to place colliding en- 
tries as near as possible to their nominally 
allocated position, in the following sense. 
Upon collision, search forward from the 
nominal position (the initial calculated 
address), until either the desired entry is 
found or an empty space is encountered-- 
searching circularly past the end of the 
table to the beginning, if necessary. If 
an empty space is encountered, that space be- 
comes the home for the new entry." 
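The procedure Morris describes can be sketched as follows. This is a minimal illustration, not Peterson's program: it assumes keys are already non-negative integers, takes the key modulo the table size as the nominal position (a stand-in for whatever hash function is actually chosen), and the class and method names are hypothetical.

```python
class OpenAddressTable:
    """Linear probing: colliding entries go to the next free slot, circularly."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size  # each slot holds (key, value) or None

    def _home(self, key):
        # Stand-in hash function (assumption): key modulo table size.
        return key % self.size

    def store(self, key, value):
        """Store the entry and return the slot it ended up in."""
        pos = self._home(key)
        for _ in range(self.size):
            if self.slots[pos] is None or self.slots[pos][0] == key:
                self.slots[pos] = (key, value)
                return pos
            pos = (pos + 1) % self.size  # search forward, wrapping around
        raise OverflowError("table is full")

    def lookup(self, key):
        """Return (value, probes); value is None if the key is absent."""
        pos = self._home(key)
        for probes in range(1, self.size + 1):
            slot = self.slots[pos]
            if slot is None:
                return None, probes
            if slot[0] == key:
                return slot[1], probes
            pos = (pos + 1) % self.size
        return None, self.size


if __name__ == "__main__":
    table = OpenAddressTable(8)
    for key, value in [(3, "a"), (11, "b"), (4, "c")]:
        print(key, "stored at", table.store(key, value))
    print(table.lookup(11))
```

Note that key 4 is displaced from its own nominal position by the earlier collision between 3 and 11, which is why the search count rises quickly as the table fills.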
Peterson did some simulations of open addressing by generating
random numbers and storing them into a 500-entry table. The
average number of searches from nine different runs
is compared with the value calculated from Morris' formula
or Salton's formula (L is the loading factor, i.e., the fraction
of the table that is full at the time of search):
$$\text{Morris:} \quad S = \frac{1 - L/2}{1 - L}$$
$$\text{Salton:} \quad S = \frac{L}{2(1 - L)} + 1$$
Table 1. Average number of searches in open addressing
Loading factor    Peterson    Morris/Salton
     0.1            1.053        1.056
     0.2            1.137        1.125
     0.3            1.23         1.214
     0.4            1.366        1.333
     0.5            1.541        1.5
     0.6            1.823        1.75
     0.7            2.26         2.167
     0.8            3.223        3.0
     0.9            5.526        5.5
     1.0           16.914
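The two formulas are algebraically the same, since (1 - L/2)/(1 - L) = (2(1 - L) + L)/(2(1 - L)) = 1 + L/(2(1 - L)), which is why Table 1 gives them a single column. A small sketch that evaluates both (the function names are chosen here for illustration):

```python
def morris(load):
    """Morris' form: S = (1 - L/2) / (1 - L), valid for L < 1."""
    return (1 - load / 2) / (1 - load)


def salton(load):
    """Salton's form: S = L / (2(1 - L)) + 1, valid for L < 1."""
    return load / (2 * (1 - load)) + 1


if __name__ == "__main__":
    for tenths in range(1, 10):
        load = tenths / 10
        print(f"{load:.1f}  {morris(load):.3f}  {salton(load):.3f}")
```

Both expressions are undefined at L = 1.0, which is consistent with the empty cell in the last row of the table.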
It is thus clear that unless the table is nearly full, the
average number of searches will be surprisingly small. For example,
if the loading factor is equal to or less than 0.9, the average number
of searches will be an amazing 1.965. This can be achieved by
allowing an extra ten percent of the table size. In this case,
its storage efficiency becomes less attractive. However, its
search efficiency and update efficiency are excellent due to its
extremely low average number of searches.
6. Indirect Chaining Method 
Since this method makes the same two assumptions as the open
addressing method and is heavily dependent upon hash
addressing, a more descriptive name for this method is suggested:
Hash-Addressed Indirect-Chaining Search (HAICS). Other names
found in the literature are scatter index tables, direct chaining
(a variation in chaining structure), closed addressing (direct
and indirect chaining), and virtual scatter tables (matching
additional hashed bits).
The HAICS method uses a structured four-field table, an
additional non-addressable overflow area of the table or a separate
overflow table, and a free storage area called the available
space list. It aims to fully utilize all spaces reserved for
the table before using the overflow area and the free storage
area. This method treats the addressable table area as end-rounded,
i.e., the first address of the table is considered to follow the
last address. When overflow occurs, the non-addressable overflow
area is made available as an extended table area. This
arrangement achieves better storage efficiency, since in most cases
there is no need for the additional overflow area and thus it can
be omitted at the beginning and added on when the need arises.
The HAICS chaining table has four fields: keyword (key or
data), index, link and pointer. The keyword field is usually one
computer word in size, to accommodate the symbols which identify
the entry. The index field should have enough bits to
specify the largest relative address in the available space list,
so that the variable-length entry stored in the available space
list can be indexed from this table. The link field
indicates the next table address where information on
entries with the same hashed value can be found. The pointer field
contains the address of the first of the entries
with the same hashed value. Both the link field and the pointer
field should be long enough in bits to store
the largest relative address of the addressable table area, i.e.,
the size of the addressable table.
Entries are entered at their hashed addresses first, and then
upon collision allotted to the next (or surrounding) empty addresses.
Their pointers and links are set up for the proper chaining. When
an entry is being looked up, the first step is to check the pointer
of the entry at the hashed address, then go to the address pointed to
and start searching from this beginning entry. If it is not found,
the entry pointed to by the link of the current entry is searched, and
so on until the entry is found or there is no further link. The latter
case indicates that there is no match for this search. When the entry
is found through keyword identification, the address stored in the index
field points to the actual entry storage in the available space
list. The index field and the available space list are needed only
if the entries are of variable length, so that storage space can be
conserved. In the case of fixed-length entries, the available
space list is no longer needed and the index field in the table
should be changed into an entry field with the desired fixed length.
A great advantage of hash addressing is that updating
entries in a file requires no sorting or resorting of any kind. In
the HAICS method, to delete an entry is to follow the algorithm
until the entry is retrieved, and then to hook up the next entry
in the chain to the previous entry. All the storage previously
occupied by this entry is freed for later use. To add an entry, the
same algorithm is used to retrieve the last entry in a particular
chain; the linkage is then set up to the next empty space in
the table, and the information on the added entry is stored there.
The added entry itself is stored in some free storage area in the
available space list and indexed from the chaining table.
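The lookup, add, and delete procedures described above can be sketched with four parallel arrays. This is an illustrative reading of the description, not the author's CDC 3600 program: it omits the overflow area, models the available space list as a simple Python list, uses 0 to mean "none", and the class name, method names, and hash function are all hypothetical.

```python
class HAICSTable:
    """Sketch of a hash-addressed indirect-chaining table (0 means 'none')."""

    def __init__(self, size, hash_fn):
        self.size = size
        self.hash_fn = hash_fn           # must return an address in 1..size
        self.keyword = [0] * (size + 1)  # slot 0 is unused
        self.index = [0] * (size + 1)    # where the entry sits in asl
        self.link = [0] * (size + 1)     # next slot in the same chain
        self.pointer = [0] * (size + 1)  # first slot of the chain for this address
        self.asl = [None]                # available space list; asl[0] unused

    def _find(self, key):
        """Return (slot, previous_slot, searches); slot is 0 if key is absent."""
        prev, slot, searches = 0, self.pointer[self.hash_fn(key)], 0
        while slot:
            searches += 1
            if self.keyword[slot] == key:
                return slot, prev, searches
            prev, slot = slot, self.link[slot]
        return 0, prev, searches

    def _empty_slot(self, home):
        """Scan from the hashed address, treating the table as circular."""
        for step in range(self.size):
            slot = (home - 1 + step) % self.size + 1
            if self.keyword[slot] == 0:
                return slot
        raise OverflowError("chaining table is full")

    def add(self, key, entry):
        home = self.hash_fn(key)
        slot, last, _ = self._find(key)
        if slot:
            raise KeyError(f"{key!r} is already stored")
        slot = self._empty_slot(home)
        self.keyword[slot] = key
        self.asl.append(entry)
        self.index[slot] = len(self.asl) - 1
        if last:
            self.link[last] = slot      # hook onto the end of the chain
        else:
            self.pointer[home] = slot   # first entry for this hashed address

    def retrieve(self, key):
        """Return (entry, searches); entry is None if key is absent."""
        slot, _, searches = self._find(key)
        return (self.asl[self.index[slot]] if slot else None), searches

    def delete(self, key):
        slot, prev, _ = self._find(key)
        if not slot:
            return False
        if prev:
            self.link[prev] = self.link[slot]   # hook next entry to previous
        else:
            self.pointer[self.hash_fn(key)] = self.link[slot]
        self.asl[self.index[slot]] = None       # free the entry storage
        self.keyword[slot] = self.index[slot] = self.link[slot] = 0
        return True
```

The pointer field belongs to the hashed address rather than to whatever keyword happens to occupy that slot, so an entry displaced into a neighbouring slot never lengthens the chain of the address it lands on. A toy hash such as `lambda key: ord(key[0]) % 8 + 1` is enough to exercise collisions on a table of size 8.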
This method was first introduced in 1961 by Johnson, and its
average number of searches is calculated simply as:
$$S = 1 + \frac{L}{2}$$
More interesting yet, this formula is still valid when the loading
factor L is greater than one, which means the number of entries
exceeds the allotted table size; the information on overflow
entries is kept in the overflow area while the entries themselves
are again placed in the available space list. The cost of
overflow increases linearly at merely 0.5 searches per 100%
increase of overflow. This provision virtually eliminates the fear
of overflow, which frequently causes almost unmanageable difficulties
at very high expense.
Before the table is full, as in the usual case, the average
number of searches of the indirect chaining method is a
hard-to-believe 1.25, with a maximum of 1.5 when the table is about full.
Given these two figures and the above-mentioned update efficiency
and overflow advantage, the author believes that some storage
inefficiency and programming complexity should be tolerated painlessly.
Table 2. Average number of searches in indirect chaining
Loading factor    Johnson
     0.1           1.05
     0.2           1.1
     0.3           1.15
     0.4           1.2
     0.5           1.25
     0.6           1.3
     0.7           1.35
     0.8           1.4
     0.9           1.45
     1.0           1.5
     1.5           1.75
     2.0           2.0
     3.0           2.5
     4.0           3.0
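Johnson's formula can be checked against a small simulation. The sketch below is not from the paper: it assumes a perfectly uniform hash (modelled with Python's seeded pseudo-random generator), ignores where chained entries are physically placed, and counts only the position of each entry within its chain.

```python
import random


def johnson(load):
    """Johnson's formula: S = 1 + L/2."""
    return 1 + load / 2


def average_chain_searches(table_size, n_entries, seed=0):
    """Average searches to find a stored entry under a uniform random hash."""
    rng = random.Random(seed)
    chain_len = [0] * table_size
    total = 0
    for _ in range(n_entries):
        home = rng.randrange(table_size)
        chain_len[home] += 1
        total += chain_len[home]  # the new entry sits at the end of its chain
    return total / n_entries


if __name__ == "__main__":
    size = 2 ** 16
    for load in (0.5, 1.0, 2.0):
        measured = average_chain_searches(size, int(load * size))
        print(f"L = {load:.1f}  simulated {measured:.3f}  formula {johnson(load):.3f}")
```

Because only chain positions are counted, the same calculation applies for L greater than one, in line with the remark that the formula stays valid under overflow.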
An overview of the various search algorithms discussed above is
given in the following table:
Table 3. Comparison of various search algorithms
Search method            Average number       Sample S        Search      Update      Storage
                         of searches          (N = 50,000)    efficiency  efficiency  efficiency
1. Linear search         N/2                  25,000          Poor        Good        Excellent
   Single chain          N/2                  25,000          Poor        Good        Excellent
2. Directory search      (N + B^2)/(2B)       223.6           Average     Average     Excellent
                                              (B = 220)
3. Binary search         log2 N               15.6            Good        Poor        Good
   Double chain          1.4 log2 N           21.9            Good        Good        Average
   (Hibbard)             (optimal)
   Distributed key       1.24 log2 N          19.4            Good        Good        Average
   (Sussenguth)
4. Letter table method   Average number of    5.8             Excellent   Good        Poor
   (Lamb and Jacobsen)   letters per word     (English)
                         plus one space
5. Open addressing       L/(2(1 - L)) + 1     1.965           Excellent   Excellent   Good
   (Peterson)                                 (L <= 0.9)
6. Indirect addressing   L/2 + 1              1.25            Excellent   Excellent   Average
   (Johnson)                                  (L < 1.0)
Most search algorithms not included in the above table
are variations and combinations of the linear search, single chain,
directory search, binary search, double chain and distributed key
methods, aimed at improving a certain efficiency. For example, the
double chain method itself is a combination of binary search, a
variation of single chain, and the linear search, and it is aimed at
improving the update efficiency of the binary search.
III. KEYWORD CONSTRUCTION AND HASH FUNCTION
It is understood from the previous comparison of various search
algorithms that the turning point for excellent performance in
search and update efficiencies is hash addressing, which is
essentially a simple procedure applying a certain hash function
to a search key or keyword. Since the same keyword will always
be hashed into the same hash value for table addressing, the
criterion for selecting or constructing a keyword to identify an
entry is uniqueness. And, in order to minimize
undesired collisions at hashed addresses, a good hash
function should be selected such that it yields a balanced
distribution of hashed values within the range of the table size.
1. Keyword Construction
Under the consideration of the programming and computing 
efficiency and of the storage efficiency, usually a keyword of 
one computer word size is more desirable, e.g., eight-character 
keyword in a 48-bit word machine. 
In machine files such as dictionaries, thesauruses, keyword
indices, and merchandise catalogs, the keyword is almost readily
available for hashing. If the keyword is longer than the
allowable number of characters, a simple word truncation at the
right end or some word compression scheme can be used to
reduce the word to the desired number of characters.
For example, standard word abbreviation, or a simple
procedure that eliminates all the vowels and one of any two
identical consecutive consonants in a word, is acceptable
for this purpose.
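The truncation and compression schemes can be sketched as below. This is one possible reading, with assumptions labeled: doubled consonants are detected in the original spelling before vowels are dropped, compression is applied only when the word is too long, and the eight-character width follows the 48-bit example. The function names are hypothetical.

```python
VOWELS = set("AEIOU")


def compress(word):
    """Drop all vowels and the second of any doubled consonant."""
    kept = []
    previous = ""
    for ch in word.upper():
        if ch not in VOWELS and ch != previous:
            kept.append(ch)
        previous = ch
    return "".join(kept)


def make_keyword(word, width=8):
    """Fit a word into `width` characters: compress if too long, then truncate."""
    word = word.upper()
    if len(word) > width:
        word = compress(word)
    return word[:width].ljust(width)  # pad short keywords with blanks


if __name__ == "__main__":
    for sample in ("CABINET", "COMMITTEE", "ABBREVIATION"):
        print(repr(make_keyword(sample)))
```

Compression is lossy, so two different words can yield the same keyword; the uniqueness criterion stated earlier still has to be checked against the actual vocabulary.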
Author indices and catalogs, membership
rosters, alphabetical telephone directories, taxation records,
census records, personnel records, student files, and any other file using
a person's name as the primary source of indexing can conveniently
use the last name plus one space character and
the initials as the keyword. The word compression scheme is
certainly applicable if necessary.
For title indices and catalogs, subject indices and catalogs,
business telephone directories, scientific and technical dictionaries,
lexicons and idiom-and-phrase dictionaries, and other descriptive
multi-word information, the first character of each
non-trivial word may be selected in the original word sequence to
form a keyword. For example, the rather lengthy title of this
paper may have a keyword as SADSIRS. Several known information
systems are named exactly in this manner, such as SIR (Raphael's
Semantic Information Retrieval), SADSAM (Lindsay's Sentence
Appraiser and Diagrammer and Semantic Analyzing Machine),
BIRS (Vinsonhaler's Basic Indexing and Retrieval System), and
CGC (Klein and Simmons' Computational Grammar Coder).
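The initial-letter scheme is easy to sketch. The list of "trivial" words below is a hypothetical stop list chosen for illustration; the paper does not specify one, and the function name is likewise an assumption.

```python
# Hypothetical stop list of trivial words; not specified in the paper.
TRIVIAL = {"A", "AN", "AND", "FOR", "IN", "OF", "ON", "THE", "TO"}


def initials_keyword(title, width=8):
    """Take the first letter of each non-trivial word, in original order."""
    words = title.upper().replace("-", " ").split()
    letters = [word[0] for word in words if word not in TRIVIAL]
    return "".join(letters)[:width]


if __name__ == "__main__":
    for title in (
        "Semantic Information Retrieval",
        "Basic Indexing and Retrieval System",
        "Computational Grammar Coder",
    ):
        print(initials_keyword(title))
```

The result is truncated to one computer word (eight characters here), so very long titles lose their trailing initials.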
An alternative for the multi-word situation,
with a possible improvement in the uniqueness of the
resulting hashed value, is to perform some arithmetic or logical
manipulation on the binary representation of the multi-word item.
When the multi-word item is stored in consecutive computer words,
the binary representation of each computer word is treated as
an individual constant. Then either an arithmetic operation
(e.g., ADD, SUBTRACT, MULTIPLY, and DIVIDE) or a logical
operation (e.g., AND, OR) is performed on these computer
words to collapse them into one single computer word as the
keyword. The resulting keyword from this kind of manipulation
is not human readable but will serve its purpose for hash
addressing.
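Collapsing a multi-word item by addition might look like the sketch below. The 48-bit word and eight characters per word follow the example machine mentioned earlier; the 6-bit character code is hypothetical (it is not the CDC 3600 character set), and ADD is only one of the operations the text lists.

```python
WORD_BITS = 48
CHARS_PER_WORD = 8
MASK = (1 << WORD_BITS) - 1


def char_code(ch):
    """Hypothetical 6-bit code: A..Z map to 1..26, anything else to 27."""
    return ord(ch) - ord("A") + 1 if "A" <= ch <= "Z" else 27


def pack_words(text):
    """Pack text into 48-bit words, eight 6-bit characters per word."""
    text = text.upper()
    words = []
    for start in range(0, len(text), CHARS_PER_WORD):
        chunk = text[start:start + CHARS_PER_WORD].ljust(CHARS_PER_WORD)
        value = 0
        for ch in chunk:
            value = (value << 6) | char_code(ch)
        words.append(value)
    return words


def fold_add(words):
    """Collapse several computer words into one by addition modulo 2**48."""
    total = 0
    for word in words:
        total = (total + word) & MASK
    return total


if __name__ == "__main__":
    words = pack_words("INFORMATION RETRIEVAL")
    print(len(words), "words folded into", oct(fold_add(words)))
```

Blanks are given a non-zero code here so that padded words do not contribute zeros to the result.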
In some cases where a unique number is assigned to an
entry, there is no need to hash this number provided that
the number is inside the range of the allotted table size.
This is mostly seen when a record or document is arranged by
its accession number or location index. Otherwise the number
can be treated as letters and a keyword constructed by one of the
methods described above.
2. Hash Function 
The various functions used for random number generation
can also serve as the hash function if a nearly one-to-one
relation can be established between the keyword and the resulting
random number. This is also subject to the restriction that
only numbers inside the range of the table size are acceptable.
Frequently this method will not give a balanced distribution of
table addresses and thus affects the search and update efficiencies.
The arithmetic or logical manipulation described above for
handling multi-word items can also be used as a hash function.
In one method, called the division hash code and suggested by Maurer,
the binary representation of a keyword is treated as an
integer and divided by the table size. The remainder of this
division is thus inside the range of the table size and is
used as the hash value. As Maurer noticed, this method has the
disadvantage that it sometimes does not produce indices which
are equally distributed.
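The division hash code is a single remainder operation. The sketch below also illustrates one well-known way the distribution can become unbalanced, namely keys that share a factor with the table size; this example is supplied here for illustration and is not taken from Maurer.

```python
def division_hash(key, table_size):
    """Treat the key as an integer and keep the remainder after division."""
    return key % table_size


if __name__ == "__main__":
    keys = range(0, 400, 4)  # sample keys that are all multiples of 4
    for size in (16, 17):
        used = {division_hash(key, size) for key in keys}
        print(f"table size {size}: {len(used)} distinct addresses used")
```

With a table size of 16 these keys reach only four addresses, while a table size of 17 spreads them over all seventeen.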
Three methods of computing hash addresses with proven 
satisfactory results were described very neatly by Morris: 
"If the keys are names or other objects that 
fit into a single machine word, a popular method 
of generating a hash address from the key is to 
choose some bits from the middle of the square 
of the key--enough bits to be used as an index 
to address any item in the table. Since the value 
of the middle bits of the square depends on all of the
bits of the key, we can expect that different keys
will give rise to different hash addresses with
high probability, more or less independently of
whether the keys share some common feature, say all
beginning with the same bit pattern.
"If the keys are multiword items, then some bits 
from the product of the words making up the key may 
be satisfactory as long as care is taken that the 
calculated address does not turn out to be zero most 
of the time. The most dangerous situation in this 
respect is when blanks are coded internally as zeros 
or when partial word items are padded to full word 
length with zeros. 
"A third method of computing a hash address is
to cut the key up into N-bit sections, where N is 
the number of bits needed for the hash address, and 
then to form the sum of all of these sections. The 
low order N bits of the sum is used as the hash 
address. This method can be used for single-word 
keys as well as for multiword keys .... " 
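The third method in the quotation can be sketched as follows. The parameter names are chosen here for illustration, and the 48-bit default word length is an assumption based on the example machine.

```python
def section_sum_hash(key_words, n_bits, word_bits=48):
    """Cut each key word into n_bits sections, add them, keep the low n_bits."""
    mask = (1 << n_bits) - 1
    total = 0
    for word in key_words:
        remaining = word_bits
        while remaining > 0:
            total += word & mask   # take the next section from the low end
            word >>= n_bits
            remaining -= n_bits
    return total & mask


if __name__ == "__main__":
    # A single 6-bit word 101101 cut into two 3-bit sections: 101 + 101.
    print(section_sum_hash([0b101101], n_bits=3, word_bits=6))
    # A two-word key hashed to a 10-bit address for a 1024-entry table.
    print(section_sum_hash([0o1234567012345670, 0o7654321076543210], n_bits=10))
```

Because a single-word key is passed as a one-element list, the same routine serves both single-word and multi-word keys.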
All three of these methods assume one slight restriction:
the size of the table has to be a power of two because of the
binary bit selection. Personally the author prefers the first
of the three methods because of the extremely simple programming
involved. Depending on the machine, the main operation
requires about five machine language instructions: load the A
register with the keyword, integer multiply by the keyword,
left shift the A and Q registers X bits so that the desired bits
are at the left end of the Q register, clear the A register, and left
shift the A and Q registers again Y bits so that the desired bits
reside at the right end of the A register (CDC 3600 COMPASS).
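In a high-level language the same middle-of-the-square selection reduces to a multiply, a shift, and a mask. The sketch below is an illustration rather than a translation of the COMPASS sequence; it assumes a 48-bit key whose square occupies up to 96 bits (the combined A and Q registers), and the function name is hypothetical.

```python
def mid_square_hash(key, n_bits, word_bits=48):
    """Square the key and take n_bits from the middle of the double-length product."""
    square = key * key                       # up to 2 * word_bits bits long
    shift = (2 * word_bits - n_bits) // 2    # low-order bits below the middle
    return (square >> shift) & ((1 << n_bits) - 1)


if __name__ == "__main__":
    # Toy case: 4-bit key 13, square 10101001, middle four bits 1010.
    print(mid_square_hash(13, n_bits=4, word_bits=4))
    # A 48-bit key hashed to a 10-bit address for a 1024-entry table.
    print(mid_square_hash(0o1234567012345670, n_bits=10))
```

The table size is 2**n_bits, which is the power-of-two restriction mentioned above.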
If the second method described by Morris is used, the
keyword construction for multi-word items can be eliminated
if there is no risk of the kind described. The third method
is more interesting because it has the generality of accepting
both single-word and multi-word items, at a slight cost of
some more programming which is offset by the saving of
multi-word keyword construction.
IV. HAICS DATA STRUCTURE 
In response to the needs of search and update efficiencies,
the data structure for the HAICS technique has to be organized in
a much more sophisticated way, with some additional storage
requirement over the entries themselves. As previously described under
the Indirect Chaining Method, it requires a fixed-size four-field
addressable chaining table for bookkeeping the keyword and all the
information for the chaining mechanism, a reserved free storage
area called the available space list for storing the
variable-length entries themselves, and an added-on non-addressable overflow
table area for overflow chaining.
The overall HAICS data structure is quite list oriented but 
it is packed into the form of arrays for a more efficient indexing 
and searching procedure. A test program has been written in CDC 
3600 Fortran (a variation of Fortran IV) for the convenience of 
adapting to other computers. The discussions following will fre- 
quently refer to the Fortran language and the list structure for 
a better clarification. 
1. The Chaining Table
The four-field table can easily be set up as four
one-dimensional arrays or as a four-dimensional array, at the cost
of several wasted bits in the computer word for storing the
index to the available space list, the link to the next table
address, or the pointer to indicate the beginning of a chained
sub-list. The saving is fewer computer word-packing and
unpacking operations.
The positions of each four-field table item in relation
to the first item, i.e., the hash addresses, can be viewed
as the main list of the table. The linked entries of the same
hashed address are treated as a sub-list. Since the relative
position of each table item is identified with its hashed
address, there is no need to set up an address index for each
item. Besides, since each entry can be hash-addressed with
an average of 1.25 searches and most chains
are not much longer than one or two entries, it is not necessary
to have a backward link within a chain.
The layout of the chaining table is shown in Table 4 with
some sample linkages indicated by hand-drawn circles and
arrows.
Table 4. The chaining table
[Table not recoverable from the scan: a printout of the chaining
table listing sample keywords (EARPHONE, CABINET, BAFFLE, and BABBLE
are legible) with their index, link, and pointer values.]
The four-field table items can also be viewed as four-field
nodes or cells in a list-structured presentation:
| keyword | index | link | pointer |
An example of the list-structured presentation of the HAICS
system can be illustrated as follows:
Figure 1. List presentation of the HAICS data structure
[Figure not recoverable from the scan. Legible fragments include the
keyword KENT LB and the entries 857 MASS. AVE. and 2222 SUNSET BLVD.,
LOS ANGELES, CALIF.]
2. The Available Space List 
This can be a single one-dimensional array for the best
storage efficiency in accommodating variable-length entries.
The beginning of an entry is indexed in the index field of
the chaining table by the relative position of its beginning
computer word in the available space list. This is also
shown in the examples of Figure 1. The end of an entry
can either be indicated by a special symbol, such as two
consecutive blanks or EOE as the abbreviation of end-of-entry,
or be determined from the entry length, which is calculated and
placed at the beginning of the entry.
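The end-of-entry convention can be sketched with a plain list standing in for the one-dimensional array. This is a minimal illustration using the EOE marker variant; the class and method names are hypothetical, each list element stands for one computer word, and positions are 1-based so that 0 can still mean "none" in the index field.

```python
EOE = "EOE"  # end-of-entry marker


class AvailableSpaceList:
    """Variable-length entries stored back to back, each closed by EOE."""

    def __init__(self):
        self.words = []  # one element per computer word

    def store(self, entry_words):
        """Append an entry and return its 1-based starting position."""
        start = len(self.words) + 1
        self.words.extend(entry_words)
        self.words.append(EOE)
        return start

    def fetch(self, start):
        """Return the words from `start` up to, but not including, EOE."""
        end = self.words.index(EOE, start - 1)
        return self.words[start - 1:end]


if __name__ == "__main__":
    asl = AvailableSpaceList()
    first = asl.store(["FIRST EN", "TRY TEXT"])
    second = asl.store(["SECOND"])
    print(first, asl.fetch(first))
    print(second, asl.fetch(second))
```

The returned starting position is what would be placed in the index field of the chaining table.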
Multi-dimensional arrays are usually wasteful for storing 
variable length entries. If the entries are of fixed length 
then, as described before, there is no need to have a separate 
available space list. These fixed length entries can be put 
directly in the enlarged index field of the chaining table 
for a more efficient processing. 
3. The Overflow Table 
The overflow table is structured the same way as the
chaining table except that it is not accessible through
the hash function. It serves as an emergency device only
after the chaining table is fully utilized, provided additional
storage area is available at that time. When the overflow
table is established to meet the emergency, its array names
and size are made available to the HAICS procedure as an
extended area for the chaining table.
V. HAICS ALGORITHMS FOR STORING, RETRIEVING, UPDATING, AND UTILITY
FUNCTIONS
The logical procedure of the HAICS technique is described in
algorithms for easy adaptation to procedure-oriented languages such
as Fortran and Algol. Currently, eight commands have been implemented
and are operational in the CDC 3600 test program: STORE, RETRIEVE,
ADD, DELETE, REPLACE, PRINT, COMPRESS, and LIST. They can be
functionally classified into three groups: the main algorithms for
STORE and RETRIEVE; the updating algorithms for ADD, DELETE, and
REPLACE; and the utility algorithms for PRINT, COMPRESS, and LIST.
The two main algorithms are frequently used by the other algorithms,
except PRINT.
These algorithms are presented in detail in the following:
1. Algorithm STORE (S)
This algorithm is to be used for establishing a HAICS
file at the very beginning. It is assumed that the arrays
for the chaining table and the available space list have
been set up properly, and that the keywords and the hash
function have been constructed or selected appropriately
for all subsequent uses of this file.
S1   Clear the arrays for the Chaining Table and
     the Available Space List, and set up proper
     indices for these tables
S2   Compute the hash value of the given keyword,
     I = HASH(KEY)
S3   If POINTER(I) = 0 and KEYWORD(I) = 0, then
     set KEYWORD(I) = KEY and POINTER(I) = I
S4   The first available address in the Available
     Space List, J, for storing the entry is placed
     as INDEX(I) = J
S5   The entry is stored in the Available Space List
     (ASL) sequentially starting at ASL(J), with
     a special symbol EOE placed at the end of the
     entry in ASL, and exit on success.
S6   If POINTER(I) = 0 and KEYWORD(I) ≠ 0, then
     go to Step S7
S7   Search the keyword array downward and end-around
     until an address K with KEYWORD(K) = 0 is found
S8   Set POINTER(I) = K, I = K, KEYWORD(I) = KEY, and
     go to Step S4
S9   If a KEYWORD(K) = 0 cannot be found in the
     keyword array, a message is given to indicate
     the overflow of the Chaining Table, and then
     exit on failure.
S10  If POINTER(I) ≠ 0, then I = POINTER(I)
S11  If LINK(I) = 0, then go to Step S7 for a
     KEYWORD(K) = 0; upon failure, go to Step S9;
     upon success, go to Step S12
S12  Set LINK(I) = K, I = K, KEYWORD(I) = KEY, and
     go to Step S4
S13  If LINK(I) ≠ 0, then set I = LINK(I) and go to
     Step S11
S14  Repeat for additional entries starting at Step S2.
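The storing steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the CDC 3600 program: the class name `ChainingTable`, the helper `_empty_cell`, and the dictionary-based hash function are hypothetical; cell 0 is left unused so that 0 can mean "empty"; the home cell's pointer is set to the cell actually used, which is what the retrieval steps rely on; and a check for a full Available Space List is added that the steps themselves do not state.

```python
EOE = "EOE"

class ChainingTable:
    def __init__(self, size, asl_size, hash_fn):
        self.n = size
        self.hash_fn = hash_fn              # assumed to return 1..size
        # S1: cleared arrays; cell 0 is unused so that 0 can mean "empty".
        self.keyword = [0] * (size + 1)
        self.index = [0] * (size + 1)
        self.link = [0] * (size + 1)
        self.pointer = [0] * (size + 1)
        self.asl = [None] * (asl_size + 1)
        self.free = 1                       # first available ASL address, J

    def _empty_cell(self, start):
        """S7: scan downward from `start`, wrapping around, for KEYWORD = 0."""
        for step in range(1, self.n + 1):
            k = (start - 1 + step) % self.n + 1
            if self.keyword[k] == 0:
                return k
        return 0                            # S9: chaining table overflow

    def store(self, key, words):
        if self.free + len(words) >= len(self.asl):
            return False                    # ASL full (check added for this sketch)
        home = self.hash_fn(key)            # S2
        head = self.pointer[home]
        if head == 0 and self.keyword[home] == 0:
            cell = home                     # S3: the home cell itself is free
        else:
            tail = home
            if head != 0:                   # S10, S13: walk to the end of the chain
                tail = head
                while self.link[tail] != 0:
                    tail = self.link[tail]
            cell = self._empty_cell(tail)   # S7
            if cell == 0:
                return False                # S9
        if head == 0:
            self.pointer[home] = cell       # S3 / S8: start a new chain
        else:
            self.link[tail] = cell          # S12: extend the existing chain
        self.keyword[cell] = key
        self.index[cell] = self.free        # S4
        for w in words:                     # S5
            self.asl[self.free] = w
            self.free += 1
        self.asl[self.free] = EOE
        self.free += 1
        return True

# Keys A and B collide on home address 1; C finds its home cell 2 occupied by B.
homes = {"A": 1, "B": 1, "C": 2}
t = ChainingTable(4, 20, homes.__getitem__)
ok = [t.store("A", ["alpha"]), t.store("B", ["beta", "two"]), t.store("C", ["gamma"])]
```

In this small run, B is chained from A through the link field, while C is reached through the pointer field of cell 2 even though that cell holds B's keyword.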
Examples in Tables 5 and 6 show the result of several 
stored entries under this algorithm. 
Table 5. The chaining table after a STORE command
[Computer printout: THE CHAINING TABLE, listing VALUE, KEYWORD,
INDEX, LINK, and POINTER for table addresses 1 through 32.]
Table 6. The available space list after a STORE command
[Computer printout: THE AVAILABLE SPACE LIST, listing ADDRESS and
INFORMATION in rows of eight computer words. The entries are the
title and definitions from the Data Transmission and Data Processing
Dictionary by James F. Holmes, followed by telephone directory
listings, each entry terminated by EOE.]
2. Algorithm RETRIEVE (R)
With a given keyword, this algorithm will retrieve the
entry which is associated with the keyword, under the same
assumptions made in Algorithm STORE.
R1   Compute I = HASH(KEY)
R2   If POINTER(I) = 0, exit on failure.
R3   If POINTER(I) ≠ 0, then I = POINTER(I)
R4   If KEYWORD(I) = KEY, then J = INDEX(I),
     move the entry in ASL starting from ASL(J)
     to a working area until an EOE is encountered,
     and exit on success.
R5   If KEYWORD(I) ≠ KEY and LINK(I) = 0, then
     exit on failure.
R6   If KEYWORD(I) ≠ KEY and LINK(I) ≠ 0, then
     I = LINK(I), go to Step R4.
R7   Repeat for additional entries starting at
     Step R1.
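The retrieval steps can be sketched in Python as follows. This is an illustration only: the small hand-built arrays, the `HOME` dictionary standing in for HASH(KEY), and the choice to send unknown keys to cell 4 are all assumptions made for the example, and cell 0 is unused so that 0 can mean "empty".

```python
EOE = "EOE"

# Hand-built state: A and B share home address 1; C has home address 2
# but is stored in cell 3, so POINTER(2) = 3.
HOME = {"A": 1, "B": 1, "C": 2}           # stand-in for HASH(KEY)
keyword = [0, "A", "B", "C", 0]
index   = [0, 1, 3, 6, 0]
link    = [0, 2, 0, 0, 0]
pointer = [0, 1, 3, 0, 0]
asl = [None, "alpha", EOE, "beta", "two", EOE, "gamma", EOE]

def retrieve(key):
    """Return (entry, searches), or (None, searches) on failure."""
    i = pointer[HOME.get(key, 4)]          # R1 to R3
    searches = 0
    while i != 0:
        searches += 1
        if keyword[i] == key:              # R4: copy the entry up to EOE
            j = index[i]
            entry = []
            while asl[j] != EOE:
                entry.append(asl[j])
                j += 1
            return entry, searches
        i = link[i]                        # R5, R6: follow the chain
    return None, searches                  # exit on failure

found = {k: retrieve(k) for k in ["A", "B", "C", "Z"]}
```

The second value counts keyword comparisons, the "number of searches" that the paper uses as its efficiency measure. Here only B, the second member of a chain, needs more than one.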
Examples in Tables 5 and 6 will also illustrate this
algorithm in actual applications. The execution of the
RETRIEVE command will not change the contents of the
Chaining Table and the Available Space List in any event.
3. Algorithm ADD (A)
This algorithm is used when an additional or new entry
is put into an already established HAICS file. It is an
operation of "adding" an entry to the end of the chain of its
hashed address, rather than breaking up the chain and "inserting"
the entry according to some order or hierarchy. This is so
because each chain in the HAICS file is mostly very short, with
only one or two entries, and "inserting" would gain very
little in search and update efficiency.
This algorithm differs from Algorithm STORE in that
no clearing operations are performed on the arrays of the
Chaining Table and the Available Space List. In addition,
a relative address in the Available Space List is accepted
as the first available address to store the added entries
themselves.
A1   Set J = the given first available address
A2   If J ≤ size of ASL, go to Step S2 in Algorithm
     STORE and return to Step A4 upon exit from
     Algorithm STORE
A3   If J > size of ASL, exit on failure.
A4   Repeat for additional entries starting at
     Step A2
Tables 7 and 8 exhibit the results of the ADD command of 
some new entries upon Tables 5 and 6. 
Table 7. The chaining table after an ADD command
[Computer printout: THE CHAINING TABLE, listing VALUE, KEYWORD,
INDEX, LINK, and POINTER for table addresses 1 through 32.]
Table 8. The available space list after an ADD command
[Computer printout: THE AVAILABLE SPACE LIST, listing ADDRESS and
INFORMATION in rows of eight computer words from address 1 through
433. It contains the dictionary definitions and telephone directory
listings, each terminated by EOE, with the added entries, among them
(MADISON ADJUSTMENT SYSTEM), (MADISON ACCEPTANCE CORP INC), and
(MADISON AIR SERVICE), at the end of the list.]
4. Algorithm DELETE (D)
This is used to delete an entry from the HAICS file.
This algorithm is heavily dependent upon the Algorithm
RETRIEVE, but instead of just retrieving the entry it traces
back and deletes the entry itself from the Available Space
List and all the pertinent information in the Chaining Table.
D1   Go to Step R1 in Algorithm RETRIEVE, return
     to Step D2 upon exit on failure from Algorithm
     RETRIEVE, or return to Step D3 upon exit on
     success from Algorithm RETRIEVE
D2   Exit on failure.
D3   Clear the occupied section of the entry in ASL,
     including the special symbol EOE at the end of
     the entry
D4   INDEX(I) = 0 and KEYWORD(I) = 0
D5   If POINTER(I) = I and LINK(I) = 0, then
     POINTER(I) = 0, exit on success.
D6   If POINTER(I) = I and LINK(I) ≠ 0, then
     POINTER(I) = LINK(I), LINK(I) = 0, exit
     on success.
D7   If POINTER(I) ≠ I and LINK(I) = 0, then
     trace back the previous link which contains
     I and set it to zero when it is found; otherwise
     trace back the original pointer and set
     it to zero, exit on success.
D8   If POINTER(I) ≠ I and LINK(I) ≠ 0, then trace
     back the previous link which contains I and
     replace it with LINK(I), exit on success.
D9   Repeat for additional entries starting at
     Step D1.
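The deletion steps can be sketched in Python as follows. This is an illustration under assumptions, not the paper's code: the hand-built arrays and the `HOME` dictionary standing in for HASH(KEY) are hypothetical, and instead of tracing back after the fact the sketch remembers the previous cell while it walks the chain. It also resets the deleted cell's link and redirects the home pointer when the first member of a chain is removed, details that Steps D7 and D8 do not spell out.

```python
EOE = "EOE"

HOME = {"A": 1, "B": 1, "C": 2}           # stand-in for HASH(KEY)
keyword = [0, "A", "B", "C", 0]
index   = [0, 1, 3, 6, 0]
link    = [0, 2, 0, 0, 0]
pointer = [0, 1, 3, 0, 0]
asl = [None, "alpha", EOE, "beta", "two", EOE, "gamma", EOE]

def delete(key):
    home = HOME.get(key, 4)                # unknown keys hash to cell 4 here
    prev, i = 0, pointer[home]             # D1: search as in RETRIEVE
    while i != 0 and keyword[i] != key:
        prev, i = i, link[i]
    if i == 0:
        return False                       # D2
    j = index[i]                           # D3: blank the entry and its EOE
    while asl[j] != EOE:
        asl[j] = None
        j += 1
    asl[j] = None
    index[i] = 0                           # D4
    keyword[i] = 0
    successor = link[i]
    if prev != 0:
        link[prev] = successor             # D7 / D8: bypass the cell in its chain
    else:
        pointer[home] = successor          # D5 / D6: cell was first in its chain
    link[i] = 0
    return True

results = [delete("B"), delete("A"), delete("Z")]
```

After these calls only C remains: its cell, its pointer from home address 2, and its words in the Available Space List are untouched, while the space of A and B is blank and can be reclaimed later.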
Sample results of the DELETE command executed upon Tables 
7 and 8 are shown in Tables 9 and 10. 
Table 9. The chaining table after a DELETE command
[Computer printout: THE CHAINING TABLE, listing VALUE, KEYWORD,
INDEX, LINK, and POINTER for table addresses 1 through 32, with the
cells of the deleted entries cleared.]
Table 10. The available space list after a DELETE command
[Computer printout: THE AVAILABLE SPACE LIST, listing ADDRESS and
INFORMATION in rows of eight computer words from address 1 through
433, with the sections formerly occupied by the deleted entries
left blank.]
5. Algorithm REPLACE (RP)
This is for the replacement of an entry itself in the
Available Space List, with the keyword and linkages in
the Chaining Table unchanged. Replacement entries longer than
the original entries can be treated in a few different ways.
The current algorithm will truncate the excess at the end and give
a message to indicate the situation. A remedy, if desired, can then
be made through the deletion of the incomplete entry and
the addition of the complete entry as a new entry. This
algorithm will make use of the Algorithms RETRIEVE and STORE
to find the desired entry and then replace the old contents
with the new contents in the Available Space List.
RP1  Compute I = HASH(KEY)
RP2  If POINTER(I) = 0, exit on failure.
RP3  If POINTER(I) ≠ 0, then I = POINTER(I)
RP4  If KEYWORD(I) = KEY, then J = INDEX(I), clear
     the old entry in ASL starting from ASL(J) and
     including an EOE
RP5  The new entry is stored in ASL starting from
     ASL(J)
RP6  If the new entry plus an EOE can be accommodated
     in the old space, exit on success.
RP7  If the new entry plus an EOE cannot be
     accommodated in the old space, then store
     the new entry up to the same length as the
     old entry and put an EOE at the end, exit
     on partial success.
RP8  If KEYWORD(I) ≠ KEY and LINK(I) = 0, then exit
     on failure.
RP9  If KEYWORD(I) ≠ KEY and LINK(I) ≠ 0, then
     I = LINK(I), go to Step RP4.
RP10 Repeat for additional entries starting at
     Step RP1
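The replacement steps, including the truncation of an over-long replacement, can be sketched in Python as follows. This is an illustration only: the hand-built arrays, the `HOME` dictionary standing in for HASH(KEY), and the three result strings are assumptions made for the example.

```python
EOE = "EOE"

HOME = {"A": 1, "B": 1, "C": 2}           # stand-in for HASH(KEY)
keyword = [0, "A", "B", "C", 0]
index   = [0, 1, 3, 6, 0]
link    = [0, 2, 0, 0, 0]
pointer = [0, 1, 3, 0, 0]
asl = [None, "alpha", EOE, "beta", "two", EOE, "gamma", EOE]

def replace(key, new_words):
    i = pointer[HOME.get(key, 4)]          # RP1 to RP3
    while i != 0 and keyword[i] != key:    # RP8, RP9: follow the chain
        i = link[i]
    if i == 0:
        return "failure"
    j = index[i]                           # RP4: measure and clear the old entry
    old_len = 0
    while asl[j + old_len] != EOE:
        old_len += 1
    for a in range(j, j + old_len + 1):
        asl[a] = None
    kept = new_words[:old_len]             # RP7: truncate to the old length
    asl[j:j + len(kept)] = kept            # RP5
    asl[j + len(kept)] = EOE
    return "success" if len(new_words) <= old_len else "partial"

r1 = replace("B", ["x"])                   # shorter: fits, one word left blank
r2 = replace("A", ["p", "q"])              # longer: truncated to one word
r3 = replace("Z", ["y"])                   # unknown keyword
```

A shorter replacement leaves blank words behind the new EOE. These blanks are the space groups that accumulate in the Available Space List over repeated updates.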
Table 11 displays the changes made through the use of
this algorithm upon Table 10.
Table 11. The available space list after a REPLACE command
[Computer printout: THE AVAILABLE SPACE LIST, listing ADDRESS and
INFORMATION in rows of eight computer words from address 1 through
433. Replaced entries appear in their original positions, for
example (EMERSON A T)1234 RUTLEDGE,123-4567 and
(EMERSON HARLAND S)333 HILLTOP DR,333-3333.]
6. Algorithm PRINT (P)
This is a simple algorithm for the utility function of
arranging information in table form and printing out the
Chaining Table and the Available Space List, as in
Table 4 to Table 14 in this paper.
P1   Set I = 1 and print titles for the Chaining
     Table
P2   Print I, KEYWORD(I), INDEX(I), and POINTER(I)
P3   I = I + 1, repeat Step P2 until the end of the
     table is reached
P4   Set J = 1 and print the titles for the Available
     Space List
P5   Print J and ASL(J)
P6   J = J + 1, repeat Step P5 until the end of the
     Available Space List is reached, exit on success.
7. Algorithm COMPRESS (C)
This algorithm is designed to serve the same purpose as a
"garbage collector" in list processing languages: better
storage efficiency. In practical applications, the
Available Space List is a huge free storage area which
can be on a secondary bulk storage device such as a drum
or disk for random access. After several updating functions
have been performed on the HAICS file, there will inevitably be
some space groups residing in the middle of the used portion of the
Available Space List. Eventually a situation will arise in which
the end of the Available Space List is reached while
many space groups remain scattered in the middle.
To remedy this situation, a periodic operation of the
COMPRESS command is desirable to repack the Available Space
List for better storage utilization. Many strategies or
hierarchies can be used to achieve this purpose, with some
variations in computing efficiency. The current algorithm
starts with the last entry in the Available Space List and
moves it to the first accommodatable space group found from
the beginning of the List. The process is repeated until
all the accommodatable space groups found are filled, and thus
a large space group is accumulated at the end of the Available
Space List for subsequent additions of new entries.
C1   Search for the largest INDEX(I) in the Chaining
     Table, then set J = INDEX(I)
C2   Count the length of this last entry in ASL
     starting at ASL(J) until an EOE is found
C3   Check an internal table of space groups in ASL
     to find an accommodatable space group, such that
     the number of spaces in the space group is greater
     than the length of the last entry; go to Step C6
     if it is not found, go to Step C4 if it is found
C4   Move the last entry to the space group found,
     go to Step C5 if some spaces are left unused,
     otherwise exit on success.
C5   Store information of the space group to the
     internal table, exit on success.
C6   Search ASL from its beginning for a space
     group, go to Step C10 if it is not found,
     go to Step C7 if it is found
C7   If the space group found is accommodatable,
     then go to Step C4, otherwise go to Step C8
C8   Store information of the space group found
     to the internal table
C9   Search ASL continuously for a space group,
     go to Step C10 if it is not found before the
     search reaches the original location of the
     last entry, go to Step C7 if it is found
C10  Exit on table not compressible.
C11  Repeat for additional compression upon exit
     on success by entering at Step C1.
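The repacking strategy can be sketched in Python as follows. This is a simplified illustration, not the paper's procedure in full: the function names are hypothetical, the internal table of space groups (Steps C3, C5, C8) is omitted and the Available Space List is simply rescanned on every pass, and blank words are represented by `None`.

```python
EOE = "EOE"

def gaps_before(asl, limit):
    """Yield (start, size) for each run of blank words in asl[1:limit]."""
    start = None
    for a in range(1, limit):
        if asl[a] is None:
            if start is None:
                start = a
        elif start is not None:
            yield start, a - start
            start = None
    if start is not None:
        yield start, limit - start

def compress_once(index, asl):
    """Move the last entry into the first space group that can hold it."""
    used = [i for i in range(1, len(index)) if index[i] != 0]
    if not used:
        return False
    cell = max(used, key=lambda i: index[i])    # C1: entry with the largest index
    j = index[cell]
    n = 0
    while asl[j + n] != EOE:                    # C2: length of the last entry
        n += 1
    for start, size in gaps_before(asl, j):     # C6 to C9
        if size > n:                            # room for the entry plus its EOE
            block = asl[j:j + n + 1]
            asl[j:j + n + 1] = [None] * (n + 1)
            asl[start:start + n + 1] = block    # C4
            index[cell] = start                 # keep the chaining table in step
            return True
    return False                                # C10: not compressible

def compress(index, asl):
    """C11: repeat until no further move is possible; return the move count."""
    moves = 0
    while compress_once(index, asl):
        moves += 1
    return moves

# Three entries with a three-word space group after the first one.
index = [0, 1, 6, 11]
asl = [None, "a", EOE, None, None, None, "b", "c", EOE, None, None, "d", EOE, None]
moves = compress(index, asl)
```

Only the index field of the moved entry changes, so keywords, links, and pointers in the chaining table remain valid after compression.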
A sample result of the COMPRESS algorithm upon Tables 9
and 11 is shown in the following Tables 12 and 13:
Table 12. The chaining table after a COMPRESS command
[Computer printout: THE CHAINING TABLE, listing VALUE, KEYWORD,
INDEX, LINK, and POINTER for table addresses 1 through 32, with the
INDEX values of the moved entries updated.]
Table 13. The available space list after a COMPRESS command
[Computer printout: THE AVAILABLE SPACE LIST, listing ADDRESS and
INFORMATION in rows of eight computer words from address 1 through
433. Entries from the end of the list, such as (MADISON AIR
SERVICE), now occupy space groups earlier in the list, leaving the
end of the list blank.]
8. Algorithm LIST (L)
In contrast to the Algorithm PRINT, LIST will initiate
an alphabetical sort on the keywords stored in the Chaining
Table, and output an alphabetical list of all entries in the
HAICS file. The final output of this algorithm performed on
Tables 12 and 13 is illustrated in Table 14.
L1   Sort array KEYWORD(I) alphabetically and carry
     along the original sequence in the array, I,
     during the sort process
L2   Take the first original sequence number in the
     sorted keyword order, I
L3   Set J = INDEX(I), print the hash value, the
     keyword, the entry starting address in ASL,
     and the entry itself, exit on success.
L4   Repeat for the next keyword and its original
     sequence number in the sorted array until
     this array is exhausted.
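The listing steps can be sketched in Python as follows. This is an illustration only: the function name, the small hand-built arrays, and the tuple layout of the result are assumptions for the example, and the table address of each keyword is used as the value reported alongside it.

```python
EOE = "EOE"

# Hand-built state; cell 0 is unused and a keyword of 0 marks an empty cell.
keyword = [0, "EMS", "BAFFLE", 0, "CABINET"]
index   = [0, 6, 1, 0, 3]
asl = [None, "shield", EOE, "case", "box", EOE, "service", EOE]

def alphabetical_list(keyword, index, asl):
    # L1: sort the keywords, carrying each one's original table address along.
    order = sorted((kw, i) for i, kw in enumerate(keyword) if kw != 0)
    rows = []
    for kw, i in order:                    # L2, L4: walk the sorted order
        j = index[i]                       # L3: locate the entry in ASL
        end = asl.index(EOE, j)
        rows.append((i, kw, j, asl[j:end]))
    return rows

rows = alphabetical_list(keyword, index, asl)
for value, kw, address, entry in rows:
    print(f"VALUE = {value}  ENTRY ADDRESS IN ASL = {address}")
    print(f"{kw}  {' '.join(entry)}")
```

Sorting pairs of (keyword, address) rather than the table itself leaves the chaining table untouched, so the list can be produced at any time without disturbing hashed access.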
Table 14. The alphabetical list after a LIST command
THE ALPHABETICAL LIST --
HASH VALUE = 24    ENTRY ADDRESS IN ASL = 11
ABSOADDR  (ABSOLUTE ADDRESS)THE IDENTIFICATION OF A SPECIFIC REGIS
          TER OR LOCATIPN IN STORAGE. EOE
HASH VALUE = 2     ENTRY ADDRESS IN ASL = 23
ABSOCOD   (ABSOLUTE CODING)CODING IN WHICH ALL ADDRESSES REFER TO
          SPECIFIC MACHINE REGISTERS AND MEMORY LOCATIONS.EOE
HASH VALUE = 25    ENTRY ADDRESS IN ASL = 37
ABSORBTN  (ABSORPTION)THE LOSS OF ENERGY IN THE TRANSMISSION OF WA
          VES OVER RADIO OR WIRE PATHS DUE TO CONVERSION INTO HEA
          T OR OTHER FORMS OF ENERGY. IN WIRE TRANSMISSION, THE T
          ERM IS USUALLY APPLIED ONLY TO LOSS OF ENERGY INEXTRANEO
          US MEDIA. EOE
HASH VALUE = 30    ENTRY ADDRESS IN ASL = 68
BABBLE    (1). AGGREGATE CROSSTALK FROM A LARGE NUMBER OF DISTURBI
          NG CHANNELS. (2). UNWANTED DISTURBING SOUNDS IN A C
          ARRIER OR OTHER MULTIPLE CHANNEL SYSTEM WHICH RESULT FRO
          M THE AGGREGATE CROSSTALK OR MUTUAL INTERFERENCE FROM OT
          HER CHANNELS. EOE
HASH VALUE = 12    ENTRY ADDRESS IN ASL = 99
BACKNOIS  (BACKGROUND NOISE)THE TOTAL SYSTEM NOISE INDEPENDENT OF
          THE PRESENCE OR ABSENCE OF A SIGNAL. THE SIGNAL IS NOT
          TO BE INCLUDED AS PART OF THE NOISE. EOE
HASH VALUE = 26    ENTRY ADDRESS IN ASL = 119
BAFFLE    A SHIELDING STRUCTURE OR PARTITION USED TO INCREASE THE
          EFFECTIVE LENGTHOF THE EXTERNAL TRANSMISSION PATH BETWEE
          N TWO POINTS AS FOR TRANSMISSIONPATH BETWEEN TWO POINTS
          AS FOR EXAMPLE, BETWEEN THE FRONT AND THE BACK OF AN ELE
          CTROACOUSTIC TRANSDUCER.EOE
HASH VALUE = 13    ENTRY ADDRESS IN ASL = 151
CABINET   EQUIPMENT--CASE DESIGNED TO HOUSE RELAYS AND OTHER APPAR
          ATUS. KEY--CASE INSTALLED ON A CUSTOMER'S PREMISES,
          TO PERMIT DIFFERENT LINES TO THE CENTRAL OFFICE TO BE CO
          NNECTED TO VARIOUS TELEPHONE STATIONS. IT HAS SIGNALS T
          O INDICATE ORIGINATING CALLS AND BUSY LINES. TEST--B
          OX CONTAINING APPARATUS FOR TROUBLE LOCATION AND ROUTINE
          MAINTENANCE. EOE
rl~ vALJk = 5 ENTRY ADDRESS IN ASL = ,~;.?Z 
I)~,S~< TH.~EE tl, lll LI".,~I(.;TMS OF SUSTAINED SIGNALt WHEN TR'~NSMITTED 
..................... t~A. UA~, _~_/JJ._AUIDtglI~-ALLY~-~-E- EDJ.LDJ#_~LLJ~X~LLJ.EJ~ ..... 
TH i~F SI\[.F..~'ICF.. TERM USED IN RADIOTELI:.GRAPHY,, FOE 
r~a~;H vALuE = 9 ENTRY ADDRESS IN ASL = ~78 
.t,T~rlr:C (~Z=rA CIRCUI\[)COMMUNICATION F'AGILITY PERMITTING TR.~NSMIS 
SIDe', OF INFORMATION IN DIGITAL FORM. FOE 
..... ~,~;t1 vALUE = 23 ENTRY ADDRESS IN ASL = l 
t;T'~,.)P.) ,.~Ar~, IRANSMISSION AND DATA PROCESSING DICTIONARY~ BY JAN 
FS ~',, HOLMES EVE 
~ASH VALUE = ~ ENTRY ADDRESS IN ASL = ;"w)l -4-0- 
EA~ ELECTRIGAL ACCOUNTIN6 MACHINE. EOE 
--HA~H v~LUE = 1 ENTRY ADDRESS IN ASL = ~6 
EA~PH~NF AN ELECTRgACg_USTI£_\[RANS~g£R_E_R__ITbITE_ELDED TO BE CLnSELY COU 
PLEI) ACOUSTICALLy TO THE EAR. EUE 
HASH VALUE 
E~EH% AT (EMERbON 
E~ERS CR (E~ERSON 
EO~. 
HASH VALUe 
EMFRS G (E~IHSOiq 
HA%H vALuE 
= 2R ENTRY ADDRESS IN ASL = 3\]5 
A T)IZ3~ RUTLEDOEtl~3-ASb7+ .... £~E.._ 
CHAS H)~31+ MATHEWS RD MIDDLEToN,Z38-577b. 
= 31 ENTRy ADDRESS IN ASL = , 3~9 
(+AILI?~? LAKE LAwN PL,222-2222,EUE 
= 37 -- ENTRy ADDRESS IN AsL = 3~2 
H|J~)I C)|~P~ TOMPKINS DRt~-163.8+ EOF 
\]% H~SH VAL.IJE 
E (.;~" 
tiA~H vat ll~ 
E HI&" ~ ~ ht~ ( I-.~4E RSI')N 
= ~ ENTRY ADDRESS IN ASL = 3~9 
HARRY I MRS}2+~|~ CHA_~,_T_G~RO NF~.~,~__~IP~R-f,,O/~7. ._ = .............. _ 
-- 1~ ENTRY ADI)RFSS IN ASL --, 335 
HARLAND S)3}3 HILLTOP {)R+333-3333, EOE ..... 
rl~ b,~ vALUE 
E.+,r HS I (Ei-~E RSL) ~ 
~ A %'+'! V~LUP 
E '-~E I-" % I~ ( F~ ~;I- R S 0,,i 
................... F_!_Jr.+ 
H (3 %,.++ v +', LIJI~ 
E"+F R ~; PF (E "+F.HSOPJ 
-- ,~0 ENTRY ADDRES~ IN ASL. = 
II}A MR$),',I".) JEAN*~6-BI2H. l;.OE 
~7~' 
= .~ ENTRy ADURE'SS IN ASI. = ~',7 
ROl'~hh~y)6;)6,~ ELHwO01) AV MIDOLETON,~-3m-S?69. 
: ;"I ENTRY ADDRFSS IN ASL = "65 
RICHARD EI I61~J CAMERON DR+L~3~-1126. EVE 
HASH VwLIJE = 15 E NT+RY ADQRE5_S IN ASI..-+__~2 ...... 
r"~'Fi"v I)F (E~'+~RY |'~ONA F)~5(+6 ~CDIVITT RL).PSb-\]+~B4.F..OE 
HA~}H VALtJF = 113 ENTRy ADDRESS IN ASI.. = 3 8 
IEF'kR(JENCY HLUICAL SERVIcE)~5 W MAIN,~55"BSbT. EOE. 
HASH vAt,JE = 16 ENTRY ADDRESS rN ASI = 210 
~bCI (~lJiSOJq ACCEPTANCE CORP INC}I2hI W ~ELTLINE Hf. ~57-IC9 
I. .tOE 
HAbH vALUE = 17 ENTRY ADDRESS IN ASL = ~.A 
H~SH VALUE = ~ ENTRY ADORESS IN ASL = ~43 
~ (~AI) ISON ADJUSTMENT SYSTEM)303 PRICE PLt 238-2616. 
HASH V_A UJ.~L _-- .... 19 ............. ~ IEY_~5 ._/N__ASL _--__19_~__ .... 
(MA;JISON AIR 5ERVl'CE)33,92 NORTH STOUGidTt)~ RD~, ;='49-6478+ 
• EuE 
:No OF I~RU RU+~ -41- 
VI. DISCUSSION 
1. Sample Statistics of the Main and Update HAICS Algorithms
For the purpose of demonstrating the actual performance of
the HAICS main and update algorithms, the statistics gathered
from the test run (which also produced Tables 5-11) are listed
below in Tables 15 and 16 and are the basis for a preliminary
discussion.
Table 15. Sample statistics of the main HAICS algorithms

                    STORE                       RETRIEVE
Sample      Number of   Accumulative    Number of   Accumulative
entry       searches    average         searches    average
sequence    of current  number of       of current  number of
            entry       searches        entry       searches
   1           0          0                1          1.0
   2           0          0                1          1.0
   3           0          0                1          1.0
   4           1          0.25             1          1.0
   5           0          0.2              1          1.0
   6           0          0.167            1          1.0
   7           0          0.143            1          1.0
   8           1          0.25             2          1.125
   9           0          0.222            1          1.111
  10           0          0.2              1          1.1
  11           1          0.273            1          1.091
  12           0          0.25             1          1.083
  13           0          0.231            1          1.077
  14           0          0.214            2          1.143
  15           0          0.2              1          1.133
  16           1          0.25             1          1.125
  17           2          0.353            1          1.118
  18           1          0.389            3          1.222
  19           0          0.368            2          1.263
  20           0          0.35
  21           0          0.333
  22           0          0.318
  23           0          0.304
  24           1          0.333
  25           0          0.32
  26           0          0.308
  27           0          0.296
  28           0          0.286
Table 16. Sample statistics of the update HAICS algorithms

                  ADD                   DELETE                 REPLACE
Sample     Number of  Accumu-    Number of  Accumu-    Number of  Accumu-
entry      searches   lative     searches   lative     searches   lative
sequence   of         average    of         average    of         average
           current    number of  current    number of  current    number of
           entry      searches   entry      searches   entry      searches
   1          2        2.0          1        1.0          1        1.0
   2          0        1.0          1        1.0          2        1.5
   3          1        1.0          1        1.0          1        1.333
   4          0        0.75         1        1.0
   5                                2        1.2
The STORE efficiency, i.e., the accumulative average number of
searches for the STORE algorithm, as shown in Table 15, reveals that,
starting with an empty chaining table, it is a low 0.286 at 87.5%
fullness of the table (28 of 32 entries). Most entries are entered
into this table with no search at all, which implies a well-balanced
distribution of keyword hash values.
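The accumulative average is simply a running mean of the per-entry search counts. The short Python sketch below recomputes the STORE column from the search counts listed in Table 15. The function name and the rounding to three places are choices made for this illustration; the table size of 32 is taken from the text.

```python
def accumulative_averages(searches):
    """Running mean of per-entry search counts, rounded to three places."""
    averages = []
    total = 0
    for n, count in enumerate(searches, start=1):
        total += count
        averages.append(round(total / n, 3))
    return averages


# STORE search counts for the 28 sample entries of Table 15.
store_searches = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
                  0, 1, 2, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
table_size = 32  # size of the full chaining table, as stated in the text

store_avg = accumulative_averages(store_searches)
fullness = len(store_searches) / table_size
print(store_avg[-1], f"{fullness:.1%}")  # prints: 0.286 87.5%
```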
The ADD efficiency is a function of the STORE efficiency. In
this sample's statistics, the ADD efficiency obtained through
the addition of four entries to make a full chaining table is in
fact the same as if these four entries were placed at the end of the
STORE command. Thus the ADD efficiency of 0.75 for four entries
can be combined with the STORE efficiency for twenty-eight entries,
and the result is a STORE efficiency of 0.344 for a full 32-entry
chaining table. It is noted that the ADD efficiency is always
greater than (or equal to) the STORE efficiency, that is, ADD needs
at least as many searches on average, due to the non-emptiness of
the chaining table.
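The combined figure can be checked by weighting each average by its number of entries. The totals used here (8 searches over the 28 STORE entries and 3 searches over the 4 ADD entries) are read from the search-count columns of Tables 15 and 16:

```latex
\frac{28 \times 0.286 + 4 \times 0.75}{28 + 4}
  \approx \frac{8 + 3}{32}
  = \frac{11}{32}
  \approx 0.344
```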
The RETRIEVE efficiency is always identical with the search
efficiency as indicated in Table 3, which is an average of 1.25 for
the indirect chaining method. The accumulative average number of
searches, 1.263 at 59.4% table fullness, does fall into the range
between the minimum of 1.0 and the maximum of 1.5.
Both the DELETE and REPLACE efficiencies are functions
of the RETRIEVE efficiency or the search efficiency. The
sample statistics of accumulative average number of searches, of
1.2 for deleting five entries and of 1.333 for replacing three
entries, give some indication that the DELETE and REPLACE
efficiencies are compatible with the RETRIEVE efficiency.
As mentioned before, the above discussion is preliminary
and even premature. The statistics in Tables 15 and 16 do
not cover some unusual circumstances, although they are a typical
example of several regular test runs. To support, or oppose,
the above discussion will demand several further extensive
tests of each of these five efficiencies under a controlled
and isolated environment.
2. A Framework for Information Systems 
The HAICS method is a basic framework aimed at improving
the total efficiency of an information system. It can be
programmed in a number of languages, from the fundamental machine
language or assembly language of a particular family of computers
to the high-level procedure-oriented languages such as Fortran
and Algol, which are acceptable to most computers.
With an average of only 1.25 searches per entry,
this method will certainly make natural language processing
not much worse than numerical computation. It is ready
to be implemented for text processing and document retrieval;
numerical data retrieval; and for handling of large files such
as dictionaries, catalogs, and personnel records, as well as
graphic information. In the test program coded in Fortran
and a machine language, COMPASS, eight commands as described
before are currently implemented and operational in batch mode
on a CDC 3600. Further development will be on the use of teletype
console, CRT terminal, and plotter under a time-sharing
environment for producing immediate responses. This is in keeping
with the ideal of placing the most complete encyclopedia or a
tailored index-reference work at one's fingertips.
Specifically, the dictionary lookup operation, as the
principal operation of an information system, is no longer a
lengthy and painful procedure, and thus no longer a barrier in
natural language processing. Linguistic analysis may be provided
with complete freedom in referring back and forth to any entry
in the dictionary and the grammar, and the information gained
at any stage of analysis can be stored and retrieved in the
same way. Document retrieval may go deeper in content analysis
and provide a synonym dictionary for better query
descriptor transformations and matching functions. As Shoffner
noted, "it is important to be able to determine the extent
to which file structure and search techniques influence recall,
precision, and other measures of system performance." This
paper tends to support Shoffner's statement by presenting an
analysis of current search techniques and a detailed description
of the HAICS method, which is a possible framework for most
information systems.

REFERENCES 

Becker, Joseph, and Hayes, Robert M. Information Storage and
     Retrieval: tools, elements, theories. Wiley, New York,
     1963.

Bobrow, D. G. "Syntactic Theory in Computer Implementations," 
Automated Language Processing, edited by Harold Borko. 
Wiley, New York, 1967, pp.217-251. 

Borko, Harold. "Indexing and Classification," Automated Language 
Processing, edited by Harold Borko. Wiley, New York, 1967, 
pp.99-125. 

Bourne, Charles P. Methods of Information Handling. Wiley, New 
York, 1963. 

Hays, David G. Introduction to Computational Linguistics.
     American Elsevier, New York, 1967.

Johnson, L.R. "Indirect Chaining Method for Addressing on
     Secondary Keys," Communications of the ACM, 4 (May, 1961),
     pp.218-222.

King, Donald W. "Design and Evaluation of Information Systems,"
     Annual Review of Information Science and Technology, Volume
     3, edited by Carlos A. Cuadra. Encyclopedia Britannica,
     Chicago, 1968, pp.61-103.

Knuth, Donald E. The Art of Computer Programming, Volume 1:
     Fundamental Algorithms. Addison-Wesley, Reading, Massachusetts,
     1968.

Lamb, Sydney M. and Jacobsen, William H., Jr. "A High-Speed Large-
     Capacity Dictionary System," Readings in Automatic Language
     Processing, edited by David G. Hays. American Elsevier, New
     York, 1966, pp.51-72. Also in Mechanical Translation, 6
     (November, 1961), pp.76-107.

Lee, T. C.; Wang, H. T.; and Yang, S. C. "An Experimental Model
     for Chinese to English Machine Translation." Paper presented
     at the Annual Meeting of the Association for Machine Translation
     and Computational Linguistics, San Francisco, 1966.

Lee, T.C.; Wang, H. T.; Yang, S. C.; and Farmer, E. Linguistic 
Studies for Chinese to English Machine Translation. Itek 
Corporation, Lexington, Massachusetts, 1965. Also available 
from ERIC Document Reproduction Service as ED 010 872. 

Maurer, W. D. Programming: an introduction to computer languages
     and techniques. Holden-Day, San Francisco, 1968.

Meadow, Charles T. The Analysis of Information Systems: A Programmer's
     Introduction to Information Retrieval. Wiley, New York, 1967.

Morris, Robert. "Scatter Storage Techniques," Communications of the
     ACM, 11 (January, 1968), pp.38-44.

Pendergraft, E.D. "Translating Languages," Automated Language
     Processing, edited by Harold Borko. Wiley, New York, 1967,
     pp.291-323.

Peterson, W.W. "Addressing for Random-Access Storage," IBM J.
     Res. Dev., 1 (April, 1957), pp.130-146.

Salton, Gerard. Automatic Information Organization and Retrieval.
     McGraw-Hill, New York, 1968.

Sedelow, Sally Yeates, and Sedelow, Walter A., Jr. "Stylistic
     Analysis," Automated Language Processing, edited by Harold
     Borko. Wiley, New York, 1967, pp.181-213.

See, Richard. "Machine-Aided Translation and Information Retrieval,"
     Electronic Handling of Information: Testing & Evaluation,
     edited by Allen Kent; Orrin E. Taulbee; Jack Belzer; and
     Gordon D. Goldstein. Thompson, Washington, D.C.,
     and Academic Press, London, 1967, pp.89-108.

Shoffner, Ralph M. "Organization, Maintenance and Search of Machine
     Files," Annual Review of Information Science and Technology,
     Volume 3, edited by Carlos A. Cuadra. Encyclopedia Britannica,
     Chicago, 1968, pp.137-167.

Simmons, R.F. "Answering English Questions by Computer," Automated
     Language Processing, edited by Harold Borko. Wiley, New York,
     1967, pp.253-289.

Travis, Larry E. "Analytic Information Retrieval," Natural Language 
and the Computer, edited by Paul L. Garvin. McGraw-Hill, New 
York, 1963, pp.310-353. 

Venezky, Richard L. "Storage, Retrieval, and Editing of Information 
for a Dictionary," American Documentation, 19 (January, 1968), 
pp.71-79. 

Wegner, Peter. Programming Languages, Information Structures, and
     Machine Organization. McGraw-Hill, New York, 1968.

Wyllys, Ronald E. "Extracting and Abstracting by Computer," Automated
     Language Processing, edited by Harold Borko. Wiley, New York,
     1967, pp.127-179.

Yang, S. C. "Automatic Segmentation and Phrase-Structure Parsing:
     a Simple Chinese Parser," Thought and Word, 6 (January, 1969),
     pp.324-331.
