AN INTERACTIVE PHONOLOGICAL
RULE TESTING SYSTEM
Victoria A. Fromkin 
and 
D. Lloyd Rice 
University of California, Los Angeles 
September 1969 
One of the many ways the high-speed computer is useful to linguistic
researchers is for the evaluation of generative grammars. Several programming
systems for this purpose have been described in the literature. 1,2,3,4
A transformational generative grammar consists of a syntactic component, a
phonological component and a semantic component. This paper is concerned
solely with the phonological component. While this component is a dependent
part of the entire grammar, systems of phonological rules for specific
languages, i.e., the phonological components of the grammars of these
languages, have been separately presented by Chomsky and Halle 5, Kuroda 6,
Schachter and Fromkin 7 and others. The Sound Pattern of English 5
(hereafter, SPE) includes the 'formalism used for presenting phonological
rules and the schemata that represent them, and the interpretation of this
formalism' (p. 390). This formal description is taken as the basis for the
rule structures discussed in this paper.
Chomsky and Halle state that 'The rules of the grammar operate in a
mechanical fashion; one may think of them as instructions that might be given
to a mindless robot, incapable of exercising any judgment or imagination in
their application. Any ambiguity or inexplicitness in the statement of rules
must in principle be eliminated, since the receiver of the instructions is
assumed to be incapable of using intelligence to fill in gaps or to correct
errors.' They find it 'a curious fact', however, that 'this condition of
preciseness of formulation...has led many linguists to conclude that the
motivation for such grammars must be...some...use of computers.' We also
believe that there are more basic theoretical motives for clarity and
completeness; we further believe that this very explicitness makes possible
the use of the computer for testing such rules.
Furthermore, the complexities of natural language are reflected in
the component's phonological rules. Anyone who has attempted to teach a
group of graduate students the phonology of English, using the rules
presented in SPE, can attest to the fact that even a single rule schema
presents endless problems for the brightest of students when he attempts to
expand the schema and to apply this set of rules to convert an abstract
surface structure of a sentence into its phonetic representation. While the
linguist or the student may be possessed of greater intelligence than the
mindless robot, he is also possessed of human fallibility, and limited time
and energy. For these reasons, the mindless robot can perform far more
effectively than a minded human. The computer program which is described in
this paper was written to aid human phonologists in the writing of rules,
the testing
of rules, and the teaching of phonology. The importance of such a computerized
phonological rule tester becomes very apparent when one selects at random
any twenty-five English words, attempts to provide what one assumes to be the
underlying phonological representation, and then applies the rules of SPE
as specified. One of the authors of this paper made such an attempt. After
more hours than she wishes to remember, and using every possible underlying
segment, she found that eleven of these randomly chosen words could not be
correctly derived. Nor were these strange foreign loan words, unless one
believes the word 'America' to be an exceptional item in the English
vocabulary.
This example is not offered as a criticism of Sound Pattern of English,
probably the most important published book on English phonology and
phonological theory. Nor are we concerned here with any theoretical weaknesses
which may or may not be present in this work. What is apparent is that had
a phonological rule tester been used prior to the publication of this set
of rules, many of the problems in rule ordering, omitted contexts, etc. could
have easily been corrected, and those rules which present problems and which
cannot work would have at least been revealed. Furthermore, because of the
speed of the computer, one could have tested not only twenty-five words, but
hundreds -- determining the correct underlying forms of formatives, and
thereby providing a lexicon on which the rules can operate.
A major problem faced in setting up a program for such a tester is
the compromise often necessary between the computer input format and
the rule description schemata used by linguists to express their
phonological rules. The Phonological Rule Tester of Bobrow and Fraser 8 solves
this problem by offering a variety of logical combinatorial devices which may
be used to group either segments within a rule or complete rules for
disjunctive or conjunctive application. Such a system has very general
descriptive capability, but complex rules appear in the computer input form
rather different from the linguist's format.
In consideration of the descriptive powers of such systems, it appears
that the input format should be made as specific as possible to the proposed
theoretical structure, since a more general descriptive scheme requires the
rule writer to learn a more powerful meta-language than is needed. This
parallels the general direction of development of computer languages: from
the general machine-oriented coding to the specific problem-oriented languages.
This paper describes the translational core or compiler of a system which
accepts phonological rules in a format very close to that formalized in The
Sound Pattern of English and produces as output the coding similar to phonetic
segments necessary to evaluate the input rules in a phonological testing
program. The input format of this system is especially applicable to keyboard
entry on a CRT graphic terminal such as the IBM 2260 and is planned for
possible use in an on-line classroom system for teaching the properties and
operation of the phonological component, as well as for the writing and testing
of phonological rules by the linguist.
The rule testing program consists of an input block, a sequence of
phonological rules, and a printout block (see Figure 1). The input block will
accept a string of characters from the operator's console representing the
underlying form of a word or phrase or any form assumed to occur in a
derivation
in the phonological component. This form is then tested against the
environmental conditions specified by the stored rules and modified according
to those rules whenever a match is found. The string of phonological units,
i.e., segments and boundaries, and/or the binary matrix resulting from the
application of any rule may be optionally displayed on the operator's console
after the application of that rule.
[Figure 1: The Phonological Rule Tester. Rule specifications pass through
the compiler to form the rule sequence R1, R2, ...; the input form enters
the rule sequence and the results go to the printout block.]
The structure of the program is such that any rule or sequence of rules
can be tested using the same input and output blocks. The rules initially
coded for testing and described in this paper are taken from Chomsky and Halle
(1968). The program, however, is not limited to these particular rules, but
can be used to test any set of rules comprising the phonological component of
a generative grammar.
The Input Format for Rule Description 
The input to the phonological component consists of a structurally 
analyzed string consisting of syntactic brackets (e.g. Noun Phrase, Noun, 
Adjective, Verb etc), segments, and boundaries. The segments and boundaries 
are composite feature bundles. 
The system used in our Phonological Rule Tester specifies these units in
any of several ways:
a. As a combination of upper-case alphabetic characters representing the
various phonological segments defined in the system;
b. As a cluster of distinctive feature specifications enclosed in angle
brackets. These may be spaced horizontally or vertically, i.e.
<+voc -cons -round>
or
<+voc>
<-cons>
<-round>,
but are to be considered simultaneously as a cluster rather than
conjunctively as in the square bracketed series;
c. As a sequence of segments of specific predetermined types;
'V' indicates any vowel segment, i.e., defined as <+voc -cons>,
'C' indicates any non-vowel segment, i.e. a true consonant, a
liquid or a glide, i.e., defined as either -voc or +cons,
'~' indicates any sequence of units not containing the boundary
unit #,
'C' with subscript i indicates a string of at least i consonants,
'C' with subscript i and superscript j indicates a string of at least i
and not more than j consonants;
d. As any of several boundary units #, +, or =, which signify themselves;
e. As a combination of brackets and upper-case alphabetic characters
representing the syntactic brackets defined in the system.
Rules in the Phonological Component are of the form
A → B / X -- Y
where A and B are single units or the null element; the arrow stands for
'is actualized by'; the diagonal line means 'in the context'; and X and Y
represent, respectively, the left and right hand environments in which A
appears. These may also be null, or may consist of units or strings of
units and include labeled syntactic brackets.
Our system accepts rules written in this format, i.e.
LHS → RHS / context specification.
A rule is applicable if the LHS matches some unit in the test string
and any context specified in the rule is found to exist at or adjacent to the
matched unit. The context specification may consist of any sequence of one
or more units, and must include a marker -- to indicate where the LHS fits
into the specified context, or more exactly, how the environment must be
configured around the matched unit in order for the rule to be applied.
In the Phonological Component of a grammar, two partially identical
rules may be coalesced into a single rule schema by enclosing the corresponding
non-identical parts in braces, i.e. A → B / -- {...}. Schemata using such
braces coalesce a conjunctive series of rules. The rules are applied in order.
A conjunctive series of units is written in our program as a vertical list
bounded left and right by columns of left and right square brackets. This
corresponds to the braces. For example, the phonological rule (1)
(1) V -- V -- \[+S~aX\] 
has the interpretation that the rule will be tried first with the context
specification consisting of the segment symbolized as N (i.e. nasal), and then
with the context consisting of the segment V.
In our system a rule containing a conjunctive series is matched against
the test string taking each of the conjunctive items in the order they appear
in the series, applying the rule immediately any time the matching string
including the current item matches the appropriate portion of the test string.
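The order of application described for a conjunctive series can be illustrated with a minimal sketch in Python rather than the PL/I of the system itself. The function name `apply_conjunctive`, the representation of units as plain strings, and the toy rule are assumptions made for illustration; only a right-hand context is handled.

```python
def apply_conjunctive(units, lhs, rhs, items):
    """Rewrite lhs as rhs before each context item, trying items in order.

    units : list of unit symbols (the test string)
    items : list of right-hand contexts, each a list of unit symbols
    """
    out = list(units)
    for item in items:                      # items are tried in listed order
        for i in range(len(out)):
            following = out[i + 1:i + 1 + len(item)]
            if out[i] == lhs and following == item:
                out[i] = rhs                # apply immediately on a match
    return out


# Hypothetical rule:  A -> B / -- [N]
#                                 [V]
print(apply_conjunctive(["A", "N", "A", "V"], "A", "B", [["N"], ["V"]]))
```

Each item of the series gets its own pass over the test string, so a unit left unchanged by the first item may still be rewritten by a later one.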
In the phonological theory underlying our system, rules may also be
disjunctively ordered. Such rules are represented in schemata by the use of
parentheses and angled brackets.
A disjunction is written as a unit or sequence of units enclosed in
parentheses. It differs from the conjunctive series in the sequence of
applicability to a particular test string. A disjunction is matched against the
test string by considering first the context including the disjunctive item.
If this match is successful the rule is applied and no further matching is
attempted. If the first match is unsuccessful, then the match is attempted
omitting the disjunctive item, applying the rule if a match is found.
Clearly, the context must specify exactly one relative position of
the LHS, marked by the double dash, --. Thus, the LHS position marker may be
in a conjunctive series if it appears once in each item of the series, but
it may not occur inside a disjunction.
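The two-step matching of a disjunction (first with the parenthesized item, then without it) can be sketched as follows in Python, not the system's PL/I. The function name `apply_disjunctive`, its parameters, and the toy rule are assumptions for illustration; the sketch handles a single un-nested disjunction in a right-hand context.

```python
def apply_disjunctive(units, lhs, rhs, before_opt, optional, after_opt):
    """Apply lhs -> rhs / -- before_opt (optional) after_opt at each position."""
    out = list(units)
    longer = before_opt + optional + after_opt   # context including the item
    shorter = before_opt + after_opt             # context omitting the item
    for i in range(len(out)):
        if out[i] != lhs:
            continue
        for context in (longer, shorter):        # longer expansion is tried first
            if out[i + 1:i + 1 + len(context)] == context:
                out[i] = rhs
                break                            # no further matching is attempted
    return out


# Hypothetical rule:  A -> B / -- (C) D
print(apply_disjunctive(["A", "C", "D", "A", "D", "A", "C"],
                        "A", "B", [], ["C"], ["D"]))
```

The `break` is what makes the two expansions disjunctive: once the longer context has matched at a position, the shorter one is never tried there.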
The items of a conjunctive series or of a disjunction may in turn
include either conjunctive series or disjunctions. Conjunctive series must
be written with the bracket columns extending below and not above the line
external to the conjunction. Extra spaces may be included either horizontally
or vertically for clarity and in some cases may be needed for disambiguation.
Rule (12) from SPE would be expressed as follows in this system: 
\[~ -- Ere \] \] 
# ~ w / \[ \[(Jr\]) \] \] \[ \[ \[Zm\] #\] \] 
\[ \] 
\[ (*round -VOC +cons> -- \] 
A context specification may consist of stacked contexts according to
the convention that
A → B / D -- E / C -- F
is interpreted as
A → B / C D -- E F.
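The stacking convention amounts to concatenating the contexts outward from the LHS position marker. A minimal Python sketch, in which the function name `flatten_contexts` and the representation of each context as a (left, right) pair of lists are assumptions for illustration:

```python
def flatten_contexts(contexts):
    """Combine stacked (left, right) contexts, innermost first, into one pair."""
    left, right = [], []
    for ctx_left, ctx_right in contexts:
        left = list(ctx_left) + left        # an outer left context lies further left
        right = right + list(ctx_right)     # an outer right context lies further right
    return left, right


# A -> B / D -- E / C -- F   is read as   A -> B / C D -- E F
print(flatten_contexts([(["D"], ["E"]), (["C"], ["F"])]))
```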
System Structure 
The rule testing program proper consists of 4 sections. They are:
1. the system storage definitions which include definition of the feature
set used, 2. the mechanics necessary to accept an input form and set up the
test matrix with the features of the input form, 3. a rule test loop which
controls cyclic ordering, and 4. the routines to print out the results, either
in segment string or in binary matrix form, following the application of any
rule. The rules are then included as blocks of coding inserted as desired in
the test loop.
Initially, four values are defined which determine the size of the
various tables and matrices in the system.
DECLARE L(60) CHAR(2);
DECLARE (F(60,20), M(50,20)) BIT(1);
DECLARE S(50) FIXED;
The amount of memory reserved may be easily changed by altering only
the lines defining the size limits.
An array L of CHARACTER STRING variables is declared to have length
60 and a logical matrix F is declared to have a length of 60 columns, each
column having 20 elements or bits. Immediately after program execution has
begun, input of the feature set is requested. The feature specification
consists of the
character representation for each phonological unit followed by at least 1
space followed by an ordered string of +'s and -'s corresponding to the
feature value assignment. The ordering is as in Table 2 below. The
character representation of the nth unit entered is stored in the nth
element of the string array L and the feature values are stored as 1's and
0's in the nth column of the feature reference matrix F (n ≤ 60). If
fewer than 20 binary values are specified for any unit the remainder of the
column is filled with 0's (i.e. -'s). Table 1 is a listing of the units
and feature values used for testing the rules of English in the present
study (from Chomsky and Halle, 1968).
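The loading of the feature set can be sketched in Python (not the system's PL/I) as below. The function name `load_feature_set`, the use of Python lists in place of the arrays L and F, and the sample entries are assumptions; the sample feature strings are illustrative only and are not taken from Table 1.

```python
NUM_FEATURES = 20   # column height declared for F


def load_feature_set(lines):
    """Return (labels, columns): unit symbols and their 0/1 feature columns."""
    labels, columns = [], []
    for line in lines:
        symbol, signs = line.split(None, 1)          # symbol, then at least 1 space
        bits = [1 if ch == "+" else 0 for ch in signs.strip()]
        bits += [0] * (NUM_FEATURES - len(bits))     # short entries are padded with -
        labels.append(symbol)
        columns.append(bits)
    return labels, columns


# Hypothetical entries, in the row order of Table 2.
labels, columns = load_feature_set(["II ++-+", "#  -"])
print(labels, columns[0][:4])
```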
[Table 1: Unit Feature Values. Columns: Symbol, feature values (a string
of + and - in the row order of Table 2), Phonemic Unit. Legible symbols
include II, UU, EE, OO, AA, OE, I, U, E, &, &&, O, Y, W, R, L, P, B, F, V,
T, D, TH, DH, S, Z, SH, ZH, H and the boundary units + and #.]
In order to generate computer instructions as necessary to man- 
ipulate the values in the binary matrix, the rules as specified above must 
be made compatible with the requirements of the internal logical structure 
of the computer. This is accomplished through a compilation process on 
the above rules. 
A logical matrix M is declared to have a length of 50 columns, each
column having 20 elements. The Jth feature in the Ith column is referred to
with the notation M(I,J). The value of each M(I,J) may be either 0 or 1,
representing logical False or True (the feature value - or +) respectively.
The input string (the form to be tested) is then stored as a pattern of
features in the test matrix M such that each unit occupies one column of the
matrix, allowing the entry of any string of segments and boundaries up to
length 50. The features for each unit are stored in the corresponding column
of M by transferring the values from the appropriate column of the previously
defined feature matrix F.
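The transfer of feature columns from F into the test matrix M can be sketched in Python as follows. The function name `build_test_matrix`, the toy two-feature columns, and in particular the assumption that units in the input form are separated by spaces are ours; the paper does not state how adjacent symbols are delimited.

```python
def build_test_matrix(form, labels, columns, max_units=50):
    """Copy the feature column of each unit in `form` into a new test matrix."""
    symbols = form.split()                  # assumption: units separated by spaces
    if len(symbols) > max_units:
        raise ValueError("input form longer than the declared matrix")
    # list(...) copies each column so later rule changes leave F untouched
    return [list(columns[labels.index(sym)]) for sym in symbols]


# Toy feature set with two units and two features each (hypothetical values).
labels = ["A", "B"]
columns = [[1, 0], [0, 1]]
print(build_test_matrix("B A B", labels, columns))
```

Copying rather than sharing the columns matters because the rules modify M in place while F must remain a fixed reference.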
Symbols were chosen to have mnemonic value relating to the features 
used. These symbols are assigned values corresponding to the row in the matrix 
F having that feature value. 
Value  Symbol  Feature Represented
1      SEG     segment
2      VOC     vocalic
3      CONS    consonantal
4      HIGH    high
5      BACK    back
6      LOW     low
7      ANT     anterior
8      COR     coronal
9      ROUND   round
10     TENSE   tense
11     VOICE   voice
12     CONT    continuant
13     NASAL   nasal
14     STRID   strident
Table 2: Feature Values in Matrix Rows
Several more rows of the matrix M are declared so that they are
available for specification of diacritic information about each unit. The
number of such spaces is determined by the declared size of 20, the column
height. Table 3 defines six additional matrix rows.
Value  Symbol   Diacritic Represented
15     FLAG 20  Rule 20
16     FLAG 30  Rule 30
17     FLAG 32  Rule 32
18     FLAG 34  Rule 34
19     DMSR     D (see Main Stress Rule)
20     FVSR     F (see Vowel Shift Rule)
Table 3: Diacritic Feature Values in Matrix Rows
An additional row associated with M is declared to have length 50
and elements which may have any integer value (up to the computer word size).
This row has the symbol S and is used to store the stress value assigned to
each unit. The stress on the Ith unit is referenced by the notation S(I).
The value 0 is initially stored in the array S for all units entered, which
represents [-stress] for all units. This is very convenient from the point
of view of the programming language, as an integer value 0 is also logical 0
while any integer value greater than zero is read as logical 1. It is
planned at a later date to be able to enter a non-zero stress value for any
unit in the input string.
A way was needed to store the information in the matrix representing
boundary units and the syntactic bracketing of the input string. Because the
previously described distinctive feature set has the common feature [+segment]
it is clear that the positions in a column of the matrix representing this
feature set need only be so defined when the first position has the value
[+segment]. When the first element in a column has the value [-segment] the
next 13 spaces are in effect free to be defined so as to represent the boundary
unit information. Thus a duplicate set of values is defined on the matrix
as in Table 4.
Value  Symbol  Feature Represented
1      SEG     segment
2      FB      formative boundary (+)
3      WB      word boundary (#)
4      BRAC    bracket
5      RBRAC   right bracket
6      NBRAC   noun bracket
7      ABRAC   adjective bracket
8      VBRAC   verb bracket
9      SBRAC   stem bracket
10     PBRAC   prefix bracket
Table 4: Boundary Unit Feature Values in Matrix Rows
Positions 4 through 10 representing bracketing information are defined
only in case the feature set [-seg, -FB, +WB] occurs in positions 1
through 3. Presence of bracketing (M(I,BRAC)=1) then implies occurrence of
the word boundary #. At present, 4 spaces remain in the matrix column for
addition of syntactic markers other than those defined here.
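The double use of the matrix rows, read as segment features when the first position is 1 and as boundary features when it is 0, can be sketched in Python. The row names follow Tables 2 and 4; the function name `describe_column` and the sample columns are assumptions for illustration.

```python
SEGMENT_ROWS = ["SEG", "VOC", "CONS", "HIGH", "BACK", "LOW", "ANT",
                "COR", "ROUND", "TENSE", "VOICE", "CONT", "NASAL", "STRID"]
BOUNDARY_ROWS = ["SEG", "FB", "WB", "BRAC", "RBRAC", "NBRAC", "ABRAC",
                 "VBRAC", "SBRAC", "PBRAC"]


def describe_column(column):
    """Name the features set to 1 in a column, reading it by its SEG bit."""
    names = SEGMENT_ROWS if column[0] == 1 else BOUNDARY_ROWS
    return [name for name, bit in zip(names, column) if bit == 1]


# Hypothetical columns: a high vowel, and a word boundary carrying a noun bracket.
print(describe_column([1, 1, 0, 1] + [0] * 16))
print(describe_column([0, 0, 1, 1, 0, 1] + [0] * 14))
```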
When entry of the segmental form to be tested is complete and before
the test cycle is begun, the matrix positions corresponding to the
diacritics Rule n (FLAG 20 through FLAG 34) are set to 1.
At a point within the test sequence when the adjustment rules have 
been applied, the test string is scanned and the bracketing is located. A 
pair of pointers, LEFT and RIGHT, are set to the left and right innermost 
brackets. If no brackets are specified in the input form, brackets are added 
to the left and right ends of the form as referents for these pointers.
Corresponding to the cyclic order of application of the phonological rules,
all rules begin the environmental search at LEFT and continue right to
RIGHT. This is accomplished in the programming language with a DO-END
statement pair as follows:
DO I = LEFT TO RIGHT;
   /* this block of coding is executed repeatedly   */
   /* with I = LEFT, LEFT+1, LEFT+2, ..., RIGHT     */
END;
Any reference to I within the DO-loop range uses the current value
of the variable I for that repetition of the loop.
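The setting of the scan limits can be sketched in Python as below. Representing every bracket as a bare "[" or "]" string is a simplifying assumption (the system's brackets are labeled matrix columns), and the names `innermost_brackets` and `scan_limits` are hypothetical.

```python
def innermost_brackets(units):
    """Return (left, right) indices of the first innermost bracket pair."""
    left = None
    for i, unit in enumerate(units):
        if unit == "[":
            left = i                 # remember the most recent left bracket
        elif unit == "]" and left is not None:
            return left, i           # the first right bracket closes an innermost pair
    return None


def scan_limits(units):
    """Return the form (bracketed if necessary) and its LEFT and RIGHT limits."""
    pair = innermost_brackets(units)
    if pair is None:                 # no brackets given: add them at both ends
        units = ["["] + list(units) + ["]"]
        pair = (0, len(units) - 1)
    return units, pair


print(scan_limits(["[", "[", "A", "B", "]", "C", "]"]))
print(scan_limits(["A", "B"]))
```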
Because several of the rules needed for the phonetic specifications
in a language require insertion or deletion of phonological units in the string,
it is desirable to be able to print out the results of the application of any
rule after that rule has been applied to the string. This ability has been
provided in the present program with the characteristic that the results may
be optionally printed after the application of any rule in the test sequence.
Rule Coding
We may now consider the coding for one of the rules to be programmed.
In rule 32, Glide Vocalization, we have the specification;
First, the Ith unit must be checked to see if it has the feature
[+segment]. If not, the scan is continued to the next unit. If it does,
but is [+consonantal] or [-back], the scan is also continued. This is
represented by the following coding.
DO I = LEFT TO RIGHT;
IF ¬M(I,SEG) THEN GO TO END32;
IF M(I,CONS) | ¬M(I,BACK) THEN GO TO END32;
...
END32: END;
(PL/I uses the symbol ¬ for 'NOT' and | for 'OR')
The next step is to determine the occurrence of the environment
Y 
convention that any rule be interpreted as applicable in the presence of the
formative boundary +, which has the features [-seg, +FB], in any case in
which the rule is otherwise applicable. That is, no rule should be blocked
by the presence of + in any context where that + is unmarked in the
environmental conditions. On the other hand, if the + is marked in the
environmental specification it must be present in the string before that
rule is applicable.
From the preceding discussion we see that the environment for this
rule must be interpreted as
To reference any unit a fixed distance to the left or right of the
currently scanned unit I, it would be possible to add or subtract a constant
to the column pointer I. That is, M(I-1,J) would reference the unit
immediately to the left of I. In this case, however, the unit in question
may be either 1 or 2 spaces to the left of I, depending on whether the unit
at I-1 has the features [-seg, +FB]. Actually, it is necessary to check only
the first 2 features, since the set [-seg, +FB, +WB] is not defined in the
input vocabulary and may be assumed not to occur. A set of pointers is
available to indicate the distance of the desired unit from the currently
scanned unit, L1 through L9 for distance to the left and R1 through R9 for
distance to the right. These pointers, when used in a rule, are initially
set to 1 at the beginning of the environmental search in each matrix
position. With this convention, I-L1 initially refers to the unit immediately
to the left of the Ith unit. If the unit I-1 is found to have the features of
the formative boundary + then L1 is set equal to 2. I-L1 now refers correctly
to the segment to be checked for the environmental condition specified.
DO I = LEFT TO RIGHT;
L1 = 1;
IF ¬M(I,SEG) THEN GO TO END32;
IF M(I,CONS) | ¬M(I,BACK) THEN GO TO END32;
IF ¬M(I-1,SEG) & M(I-1,FB) THEN L1 = 2;
IF M(I-L1,ROUND) ¬= M(I-L1,HIGH) THEN GO TO END32;
[V] is defined to be the coincidence of the features [+vocalic,
-consonantal], which may be checked simply in one statement, while application
of this rule specifies the value assignment M(I,VOC)=1. Following application
of the rule the printout option flags are checked and if either is set
the corresponding print subroutine is executed. The coding for the rule
may now be completed.
                                            Description

DO I=LEFT TO RIGHT;                         Units to be scanned start
                                            at left-most unit, I=LEFT,
                                            and include successive
                                            units I=LEFT+1, I=LEFT+2,
                                            to right-most unit I=RIGHT.

L1=1;                                       Set the pointer equal to
                                            1, at unit to immediate
                                            left of I.

IF ¬M(I,SEG) THEN GO TO END32;              If the currently scanned
                                            unit, I, is specified as
                                            [-segment] go to next I
                                            (i.e. In+1).

IF M(I,CONS) | ¬M(I,BACK) THEN              If scanned unit, I, is
   GO TO END32;                             either [+consonantal] or
                                            [-back] (i.e. does not
                                            match the rule condition),
                                            go to the next unit.

IF ¬M(I-1,SEG) & M(I-1,FB) THEN L1=2;       If the unit immediately to
                                            the left of I is specified
                                            as [-segment] and [+FB],
                                            then set the pointer to 2
                                            (i.e. I-L1 will refer to
                                            two units to the left of I).

IF ¬M(I-L1,SEG) THEN GO TO END32;           If I-L1 is a [-segment] go
                                            to next unit.

IF M(I-L1,ROUND) ¬= M(I-L1,HIGH) THEN       If the unit in the left
   GO TO END32;                             environment does not have
                                            the same feature values for
                                            roundness and highness (i.e.
                                            does not meet the rule
                                            condition [α round, α high]),
                                            go to the next unit.

IF ¬M(I-L1,VOC) | M(I-L1,CONS) THEN         If the unit in the left-
   GO TO END32;                             most environment is either
                                            [-vocalic] or [+consonantal]
                                            (i.e. not a true vowel),
                                            go to the next unit.

M(I,VOC)=1;                                 All the conditions have
                                            been satisfied; change
                                            the value of the feature
                                            [vocalic] from - to +
                                            (i.e. apply Rule 32).

PUT LIST ('RULE 32, AT', I);                Instruction to print the
                                            rule number (R32) and
                                            state the matrix feature
                                            column to which it has
                                            been applied, i.e. I.

IF P(32,1) THEN CALL STROUT;                If a display of the string
                                            resulting from application
                                            of Rule 32 is desired, go
                                            to subroutine STROUT.

IF P(32,2) THEN CALL MATOUT;                If a display of the matrix
                                            resulting from application
                                            of Rule 32 is desired, go
                                            to subroutine MATOUT.

END32: END;                                 Scan unit In+1, where
                                            In = previously scanned I.
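For readers without access to PL/I, the same sequence of tests can be followed in a Python sketch. This is an illustration under stated assumptions, not the system's coding: row numbers are 0-based (so SEG is row 0 rather than row 1), the helper `column` and the function name `glide_vocalization` are hypothetical, and explicit bounds checks stand in for the bracket column that the PL/I version relies on at LEFT.

```python
SEG, VOC, CONS, HIGH, BACK, ROUND = 0, 1, 2, 3, 4, 8   # 0-based rows of Table 2
FB = 1                                                  # second row when SEG is 0


def column(*rows):
    """Build a 20-row column with the named rows set to 1 (helper for examples)."""
    col = [0] * 20
    for r in rows:
        col[r] = 1
    return col


def glide_vocalization(m, left, right):
    """Apply the Rule 32 tests to columns left..right of m; return changed indices."""
    applied = []
    for i in range(left, right + 1):
        if not m[i][SEG]:
            continue                     # boundary unit: go to next unit
        if m[i][CONS] or not m[i][BACK]:
            continue                     # not [-consonantal, +back]
        l1 = 1
        if i - 1 >= 0 and not m[i - 1][SEG] and m[i - 1][FB]:
            l1 = 2                       # skip an unmarked formative boundary +
        if i - l1 < 0 or not m[i - l1][SEG]:
            continue
        if m[i - l1][ROUND] != m[i - l1][HIGH]:
            continue                     # round and high must agree
        if not m[i - l1][VOC] or m[i - l1][CONS]:
            continue                     # left unit must be a true vowel
        m[i][VOC] = 1                    # apply the rule
        applied.append(i)
    return applied


# Hypothetical test form: a high back round vowel, the boundary +, and a glide.
m = [column(SEG, VOC, HIGH, BACK, ROUND), column(FB), column(SEG, HIGH, BACK, ROUND)]
print(glide_vocalization(m, 0, 2), m[2][VOC])
```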
Compiler Code Generation
To illustrate, the output coding to evaluate a simple right-handed
context of the form
A → B / -- context
will be examined. It will be seen that this coding can be generalized to
evaluate a left-handed context as well. If the context matching process
is considered to be anchored at the point between the LHS position marker
and the context body, then conjunctive and disjunctive items farther to the
right in the context may be tried without rematching items to their left
in the context string. This would be true even after the rule has been
applied to the currently matched unit, provided that the matched unit is
again tested against the LHS after application of the RHS to that unit and
before the context match continues.
The run-time environment in the object machine requires a single
push-down stack and a few simple variable storage locations. A test string
is assumed to be stored in the object machine which may have been entered
prior to execution of the rule match or may be the result of application of
a prior rule in the system.
The semantics for matching particular units in the test string will
not be described, but will be abbreviated in the output coding as
MATCH UNIT __; ELSE GO TO __;
which is taken to mean that a jump to the ELSE GO TO label occurs if the
specified unit was not successfully matched. Further abbreviations in the
output coding are in the application of the RHS of a rule, indicated by
DO RULE;
and in the declaration of program block and procedure structures. Otherwise,
the coding presented constitutes a valid PL/I program segment.
In the coding necessary to evaluate simple contextual expressions
including an un-nested disjunction, it may be seen that no looping
back to previously matched units is necessary. When the left parenthesis
is encountered the current location of the match pointer, stored in
the variable P, is saved. If any subsequent item match fails before the
right parenthesis is encountered the pointer location is reset to the saved
value and the matching process resumes with the next unit outside the
parenthesis. The saved pointer location is erased when the right parenthesis
is encountered whether or not the disjunctive item was successfully
matched. This scheme achieves the desired disjunctivity quite simply in that
only one match is attempted. If the match of units inside the disjunction
succeeds, the matching process continues normally. If it fails, the
enclosed string is effectively ignored and the matching process continues as
before. This process may be made recursive to any level by saving the
pointer location in a push-down stack, freeing the top stack item when a
right parenthesis is encountered. Such a stack may easily be implemented
in PL/I by using the CONTROLLED form of dynamic storage allocation for a
variable STK. A new level in the stack is secured with the statement
ALLOCATE STK;, saving all previous values. The top level is erased with
the statement FREE STK;, bringing the previous value into accessibility.
In the coding examples presented below, two variables, LEFT and
RIGHT, are assumed to contain the currently applicable left and right limits
for matching the test string. These will be set by scanning the test string
to locate the innermost syntactic brackets or other such test string delimiters.
The index variable N will be used to indicate the left-most end of the
matching process; in this case, the anchoring point following the LHS position
marker. The statement MATCH UNIT __; ELSE GO TO __; is assumed to
increment the current match pointer P and fail at any time the value of P
exceeds the right delimiter value, stored in RIGHT.
The coding to evaluate a context of the form
RULEN: A → B / -- C D (E (F G) H I) J
would have the following appearance.
RULEN: DO N = LEFT TO RIGHT;
       P = N;
       IF MATCH UNIT A; ELSE GO TO NEXT;
       IF MATCH UNIT C; ELSE GO TO NEXT;
       IF MATCH UNIT D; ELSE GO TO NEXT;
       ALLOCATE STK;
       STK = P;
       IF MATCH UNIT E; ELSE GO TO PN1;
       ALLOCATE STK;
       STK = P;
       IF MATCH UNIT F; ELSE GO TO PN2;
       IF MATCH UNIT G; ELSE GO TO PN2;
       GO TO SK2;
PN2:   P = STK;
SK2:   FREE STK;
       IF MATCH UNIT H; ELSE GO TO PN1;
       IF MATCH UNIT I; ELSE GO TO PN1;
       GO TO SK1;
PN1:   P = STK;
SK1:   FREE STK;
       IF MATCH UNIT J; ELSE GO TO NEXT;
       DO RULE;
NEXT:  END RULEN;
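The pointer-saving scheme for nested disjunctions can also be followed in a small Python sketch that interprets a context pattern directly, rather than compiling it to straight-line code as the system does. The function name `match_context`, the use of "(" and ")" as pattern tokens, and the use of the end of the list in place of RIGHT are assumptions for illustration; matching of the LHS unit is left out.

```python
def match_context(units, start, pattern):
    """Match pattern tokens against units from index start.

    pattern is a list of unit symbols plus "(" and ")" marking disjunctions.
    Returns the index just past the match, or None if a required unit fails.
    """
    stack = []                 # saved pointer values, one per open parenthesis
    p = start
    k = 0
    while k < len(pattern):
        tok = pattern[k]
        if tok == "(":
            stack.append(p)    # corresponds to ALLOCATE STK; STK = P;
        elif tok == ")":
            stack.pop()        # corresponds to FREE STK;
        elif p < len(units) and units[p] == tok:
            p += 1             # unit matched: advance the match pointer
        elif stack:
            p = stack.pop()    # corresponds to P = STK; FREE STK;
            depth = 1          # skip ahead to the parenthesis closing this group
            while depth:
                k += 1
                if pattern[k] == "(":
                    depth += 1
                elif pattern[k] == ")":
                    depth -= 1
        else:
            return None        # failure outside any disjunction
        k += 1
    return p


# Context of RULEN above:  C D (E (F G) H I) J
pattern = ["C", "D", "(", "E", "(", "F", "G", ")", "H", "I", ")", "J"]
for form in ("C D E F G H I J", "C D E H I J", "C D J", "C D E F J"):
    print(form, "->", match_context(form.split(), 0, pattern))
```

As in the PL/I coding, a failure inside a parenthesized group never causes units to its left to be rematched; the pointer simply returns to the value saved at the left parenthesis.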
The attempt to formulate the coding to evaluate a context including
a conjunctive series brings to light a different type of problem. It is not
possible to match units from left to right in an orderly fashion as for
simple or disjunctive contexts. Once a match for the entire string has been
attempted using the first item of the conjunctive series, it is necessary,
whether the rule was applied or not, to reset the current match pointer to
its value at the time the left bracket was encountered, and then continue
the matching process using the units of the second conjunctive item as the
matching patterns. In order to loop back in this manner, it is necessary
to save three values during a matching pass over the string: 1) the
bracket-pair number, 2) the pointer value at the time the left bracket is
encountered, and 3) the item number within the bracket pair. These three
values are saved in the push-down stack in the order listed when the
conjunctive series match is begun. It is convenient in the PL/I language to
accomplish the branching by using the stacked values as subscript values in an
assigned-label GO TO statement. The labels ITEM(1,1):, ITEM(1,2):, ITEM(1,3):,
... are attached to the statements in the coding which perform the pointer
reset following matching of the corresponding conjunctive items. Branching
is accomplished with the statement GO TO ITEM(I,J); following the proper
assignment of values to the variables I and J.
An initial value of zero is put in the stack prior to rule evaluation.
The stack is then checked for a non-zero top item before it is unstacked
for label assignment, and an empty stack indicates that all conjunctive items
in the rule have been used in the matching process. If the stack is not
empty, the top two items are unstacked and stored as the variables J and P
respectively. The remaining top stack item is accessed and the value stored
in the variable I, but it is not freed from the stack. The value of P must
then be restored to the stack so it will be handled properly by the
end-of-item coding. The details of this scheme may be seen in the following
example, coded to evaluate a context of the form
RULE J: A -> B / -- C [DE, FGH, I] J
RULEJ:
DECLARE ITEM(1,3) LABEL;
ALLOCATE STK;
STK=0;
DO N=LEFT TO RIGHT;
P=N;
IF MATCH UNIT A; ELSE GO TO NEXT;
IF MATCH UNIT C; ELSE GO TO NEXT;
ALLOCATE STK;
STK=1;
ALLOCATE STK;
STK=P;
ALLOCATE STK;
STK=1;
IF MATCH UNIT D; ELSE GO TO BI1;
IF MATCH UNIT E; ELSE GO TO BI1;
GO TO BR1;
BI1: J=STK;
FREE STK;
P=STK;
ITEM(1,1): ALLOCATE STK;
STK=J+1;
IF MATCH UNIT F; ELSE GO TO BI2;
IF MATCH UNIT G; ELSE GO TO BI2;
IF MATCH UNIT H; ELSE GO TO BI2;
GO TO BR1;
BI2: J=STK;
FREE STK;
P=STK;
ITEM(1,2): ALLOCATE STK;
STK=J+1;
IF MATCH UNIT I; ELSE GO TO BI3;
GO TO BR1;
BI3: FREE STK;
ITEM(1,3): FREE STK; FREE STK;
GO TO NEXT;
BR1: IF MATCH UNIT J; ELSE GO TO NEXT;
DO RULE;
NEXT: IF STK=0 THEN GO TO SCAN;
J=STK;
FREE STK;
P=STK;
FREE STK;
I=STK;
ALLOCATE STK;
STK=P;
GO TO ITEM(I,J);
SCAN: END RULEJ;
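The effect of the conjunctive coding above may be easier to follow in a structured language, where the assigned-label GO TO is replaced by an ordinary loop over the items. The Python fragment below is a hedged sketch under stated assumptions, not the authors' implementation: the names match_units and conjunctive_matches are hypothetical, units are single characters, and "applying the rule" is reduced to recording the item number rather than altering the test string.

```python
def match_units(units, text, p):
    """Match a plain sequence of units from pointer p; return new pointer or None."""
    for u in units:
        if p < len(text) and text[p] == u:
            p += 1
        else:
            return None
    return p


def conjunctive_matches(prefix, items, suffix, text, start):
    """Return the numbers of the conjunctive items for which the rule applies."""
    applied = []
    p = match_units(prefix, text, start)
    if p is None:
        return applied                       # GO TO NEXT before the bracket
    saved = p                                # pointer value at the left bracket
    for j, item in enumerate(items, start=1):
        q = match_units(item, text, saved)   # reset pointer for every item
        if q is None:
            continue                         # BIj: try the next item
        if match_units(suffix, text, q) is not None:   # BR1: rest of context
            applied.append(j)                # DO RULE
    return applied


# Assumed context  C [DE, FGH, I] J  following the unit A
ITEMS = [["D", "E"], ["F", "G", "H"], ["I"]]
print(conjunctive_matches(["A", "C"], ITEMS, ["J"], "ACDEJ", 0))    # [1]
print(conjunctive_matches(["A", "C"], ITEMS, ["J"], "ACFGHJ", 0))   # [2]
print(conjunctive_matches(["A", "C"], ITEMS, ["J"], "ACXJ", 0))     # []
```

Every item is tried from the same saved pointer whether or not an earlier item led to an application, which is the behavior the three stacked values make possible in the PL/I version.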
A further complication arises when a conjunctive series is embedded
inside of a disjunction. Specifically, the pointer location should not be
reset to the value stored in the stack for any failure to match the internal
sequence of units, but only if the match fails for all items in the embedded
conjunctive series. Because the last conjunctive item may fail to match,
while a previously tested item matched successfully, it is necessary to use
a "rule applied" flag, which is cleared (reset) when entering the match of
a disjunction and set by any application of the rule. The setting of this
flag determines the action taken concerning the pointer setting on exit from
the disjunction, when all conjunctive items have been tried.
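A small sketch may make the role of the flag concrete. The Python fragment below is illustrative only and rests on several assumptions: the context is taken to have the form ( [item1, item2, ...] ) followed by a suffix, the function names are hypothetical, and an application with the optional part omitted is recorded as item number 0.

```python
def match_units(units, text, p):
    """Match a plain sequence of units from pointer p; return new pointer or None."""
    for u in units:
        if p < len(text) and text[p] == u:
            p += 1
        else:
            return None
    return p


def match_optional_conjunction(items, suffix, text, p):
    """Evaluate ( [items...] ) suffix and list the applications of the rule."""
    saved = p                  # pointer stacked at the left parenthesis
    flag = False               # "rule applied" flag, cleared on entry
    applied = []
    for j, item in enumerate(items, start=1):
        q = match_units(item, text, saved)
        if q is not None and match_units(suffix, text, q) is not None:
            applied.append(j)  # DO RULE
            flag = True        # set by any application of the rule
    if not flag:
        # No item led to an application: reset the pointer and try the
        # remainder of the context with the optional part omitted.
        if match_units(suffix, text, saved) is not None:
            applied.append(0)
    return applied


print(match_optional_conjunction([["J"], ["X"]], ["J"], "JJ", 0))   # [1]
print(match_optional_conjunction([["D"], ["E"]], ["J"], "J", 0))    # [0]
```

In the first call the last item fails after the first has succeeded; without the flag the pointer would be reset and the rule applied a second time with the parenthesized part skipped.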
The Compiling Process
It may be seen from the coding examples given that the output from
the compiler occurs essentially in the same order as the symbols in the
linear input form, suggesting that a preliminary stage of syntactic analysis
is unnecessary. It is only necessary to save the RHS specification in the
compiler from the time it is input until it is output in coded form at the
end of the context coding. Observing the three different types of
failure-to-match exit branches, it appears that the most direct solution is
a three-state table-driven translator used in conjunction with a number of
indices defined during the compiling operation for the purpose of counting
brackets and parentheses, generating sequential labels, etc. Entries in the
table indicate for each of the three states what output coding should be
generated and what compiler index operations should be carried out as a
result of each possible input symbol.
The table and listing of compiler actions shown below specify a
compiler system capable of producing PL/I coding such as shown in the
examples. Notations used in the compiler table and action specifications
are explained briefly. 
1. Upper-case letters in the output are output as shown.
2. Lower-case letters in the output represent compiler variables
for which the currently assigned value is output.
3. Abbreviated output coding has the meaning discussed above; for
example, DO RULE expresses the coding necessary to incorporate
into the marked unit in the test string the characteristics or
features given as the RHS of the rule.
4. The state transfer from state 2 on input of a right parenthesis
is a conditional transfer, depending on the value of the compiler
variable m. The test is shown as a fourth pseudo-state.
5. Compiler initialization, shown as state 0, must be accomplished
at the beginning of compilation for each rule.
6. Three of the input actions are identical for all states,
indicating that it is unnecessary to store those actions in the
state table.
7. No action is specified for error inputs. It is assumed that
the compiler would respond with some indication of the trouble;
for example, a comma input when in state 2 could cause the
reply "Comma illegal inside parenthesis".
The compiler uses seven variables: four of them, i, j, k and l, as
push-down stacks with the CONTROLLED attribute, and three, m, n and o, as
simple variables.
Compiler State Table

                                      Input
State               unit    [      ,      ]      (      )      END OF RULE

0.  Action:         Allocate l; l=1; n=0; o=0;
    Next State:     1

1.  Action:         1       4      5      6      7      9      10
    Next State:     1       3      3      1      2      1      0

2.  Action:         2       4      error  error  7      8      10
    Next State:     2       3                    2      4      0

3.  Action:         3       4      5      6      7      error  10
    Next State:     3       3      3      1      2             0

4.  Conditional transfer state, no input:
    If m=0 then go to state 1, else go to state 2.
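The way such a table drives the translator can be sketched briefly. The Python fragment below is an illustration under stated assumptions rather than a transcription of the authors' program: it assumes the input columns of the table are, in order, unit, left bracket, comma, right bracket, left parenthesis, right parenthesis and end of rule; it treats every other character as a unit; and it performs only the updates of the variable m (as given in the listing of compiler actions) that are needed to resolve pseudo-state 4. The names STATE_TABLE and trace_actions are hypothetical.

```python
# (action number, next state) per state and input class; absent entries are errors.
STATE_TABLE = {
    1: {"unit": (1, 1), "[": (4, 3), ",": (5, 3), "]": (6, 1),
        "(": (7, 2), ")": (9, 1), "end": (10, 0)},
    2: {"unit": (2, 2), "[": (4, 3), "(": (7, 2), ")": (8, 4), "end": (10, 0)},
    3: {"unit": (3, 3), "[": (4, 3), ",": (5, 3), "]": (6, 1),
        "(": (7, 2), "end": (10, 0)},
}
PUNCTUATION = {"[", ",", "]", "(", ")"}


def trace_actions(context):
    """Return the sequence of action numbers selected for a linear context."""
    kinds = [c if c in PUNCTUATION else "unit" for c in context] + ["end"]
    state, m, actions = 1, 0, []          # state 0 (initialization) is implicit
    for kind in kinds:
        if kind not in STATE_TABLE[state]:
            raise ValueError(f"input {kind!r} illegal in state {state}")
        action, state = STATE_TABLE[state][kind]
        actions.append(action)
        if action == 4:                   # left bracket: m=0
            m = 0
        elif action == 7:                 # left parenthesis: m=m+1
            m += 1
        elif action == 8:                 # right parenthesis: m=m-1
            m -= 1
        if state == 4:                    # conditional transfer pseudo-state
            state = 1 if m == 0 else 2
    return actions


print(trace_actions("ACD(E(FG)HI)J"))
print(trace_actions("AC[DE,FGH,I]J"))
```

A comma met in state 2 has no table entry and raises an error, corresponding to the "Comma illegal inside parenthesis" reply suggested in note 7.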
Compiler Actions

Action    Compiler
number    Operations               Output Code

1.        Output                   "IF MATCH UNIT __; ELSE GO TO NEXT;"
2.        Output                   "IF MATCH UNIT __; ELSE GO TO PNk;"
3.        Output                   "IF MATCH UNIT __; ELSE GO TO Bij;"
4.        n=n+1;
          Allocate i;
          i=n;
          Allocate j;
          j=1;
          l=l+1;
          m=0;
          Output                   "ALLOCATE STK; STK=i;"
          Output                   "ALLOCATE STK; STK=P;"
          Output                   "ALLOCATE STK; STK=1;"
5.        Output                   "GO TO BRi;"
          Output                   "Bij: J=STK; FREE STK; P=STK;"
          Output                   "ITEM(i,j): ALLOCATE STK; STK=J+1;"
          j=j+1;
6.        l=l-1;
          Output                   "GO TO BRi;"
          Output                   "Bij: FREE STK;"
          Output                   "ITEM(i,j): FREE STK; FREE STK;"
          If l>0 then go to 6a.
          Output                   "IF FLAG=0 THEN GO TO PNk;"
          Output                   "FREE STK;"
6a.       Output                   "GO TO NEXT;"
          Output                   "BRi: "
          Free i;
          Free j;
7.        o=o+1;
          Allocate k;
          k=o;
          Allocate l;
          l=0;
          m=m+1;
          Output                   "ALLOCATE STK; STK=P;"
          Output                   "FLAG=0;"
8.        Free l;
          m=m-1;
          Output                   "GO TO SKk;"
          Output                   "PNk: P=STK;"
          Output                   "SKk: FREE STK;"
          Free k;
9.        Free l;
          Output                   "GO TO SKk;"
          Output                   "PNk: P=STK; FREE STK;"
          Output                   "SKk: "
          Free k;
10.       Free l;
          Output                   "DO RULE; FLAG=1;"
          Output                   "NEXT: IF STK=0 THEN GO TO SCAN;"
          Output                   "J=STK; FREE STK;"
          Output                   "P=STK; FREE STK;"
          Output                   "I=STK; ALLOCATE STK; STK=P;"
          Output                   "GO TO ITEM(I,J);"
          Output                   "SCAN: END RULE;"
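To show how the actions produce coding in input order, the following Python fragment sketches a deliberately reduced compiler. It is an assumption-laden illustration, not the system described: it covers only units and parentheses (the parts of actions 1, 2, 7 and 8 that emit code, plus the DO RULE of action 10), omits the FLAG statements and all bracket handling, treats each character as one unit, and uses the hypothetical name compile_context.

```python
def compile_context(context):
    """Emit abbreviated PL/I-style coding for a context of units and parentheses."""
    out = []
    k_stack = []      # compiler stack k: numbers of the open parentheses
    o = 0             # running count of parentheses seen so far
    for sym in context:
        if sym == "(":                       # action 7 (without FLAG=0)
            o += 1
            k_stack.append(o)
            out.append("ALLOCATE STK; STK=P;")
        elif sym == ")":                     # action 8
            k = k_stack.pop()
            out += [f"GO TO SK{k};", f"PN{k}: P=STK;", f"SK{k}: FREE STK;"]
        else:                                # actions 1 and 2
            target = f"PN{k_stack[-1]}" if k_stack else "NEXT"
            out.append(f"IF MATCH UNIT {sym}; ELSE GO TO {target};")
    out.append("DO RULE;")                   # first output of action 10
    return out


for line in compile_context("ACD(E(FG)HI)J"):
    print(line)
```

For the context ACD(E(FG)HI)J the sketch emits the same sequence of statements as the parenthesized example listing given earlier, from the first IF MATCH UNIT A through DO RULE, with each ALLOCATE STK; STK=P; pair printed on one line.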
The Generality of the Process
The only references to left-right directionality in the matching
scheme described are in the left to right scan of the current LHS marker
in attempting to fit the test string and in the assumption that the coding
for matching particular units included an instruction to increment the
matching location pointer, P. A left-handed context may be evaluated by
similar coding by letting the pattern match move from the LHS outward,
i.e., to the left. The same compiling system can be used by reversing
the symbols of the left-hand context during the initial linearization,
substituting left for right and right for left brackets and parentheses.
Thus, a rule of the form 
A -> B / EF(G)H -- I [J]
                     [K]
would appear in the linear format as 
A -> B / H(G)FE -- I[J,K]+
An additional dimension would be added to the compiler state table,
providing for the production of unit match coding which would decrement
instead of incrementing the current matching pointer. The LHS marker would
still scan the test string from left to right. If the LHS marker occurred
within the items of a conjunction, separate coding would have to be
produced for the left and right parts of each item. The details of the
matching process for this case have not been worked out, but do not appear
to present any major difficulties for the system presented here.
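The reversal of a left-hand context during linearization is simple enough to sketch directly. The Python fragment below is an illustration only; the function name reverse_left_context is hypothetical, and it assumes that every unit is written as a single character, so that reversing characters is the same as reversing units.

```python
# Each bracket or parenthesis is exchanged for its opposite-facing partner.
SWAP = {"(": ")", ")": "(", "[": "]", "]": "["}


def reverse_left_context(context):
    """Reverse the symbols of a left-hand context, exchanging left and right delimiters."""
    return "".join(SWAP.get(ch, ch) for ch in reversed(context))


print(reverse_left_context("EF(G)H"))   # H(G)FE
```

The result for EF(G)H agrees with the linear format shown above; contexts whose units are multi-character feature specifications would need a tokenizing step first.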

References

Blair, F., Programming of the Grammar Tester, in Specification and
Utilization of a Transformational Grammar, Sci. Rep. 1, IBM Corp.,
Yorktown Hts., New York, 1966

Friedman, Joyce, A Computer System for Transformational Grammar,
Computer Sci. Rep. CS-~ AF-21, Stanford, Ca., Jan. 1968

Gross, L.N., On-Line Programming System User's Manual, MTP-~, The
MITRE Corp., Bedford, Mass., March 1967

Londe, D. and Schoene, W., TGT, Transformational Grammar Tester,
TM-3759/000/00, System Development Corp., Santa Monica, Ca., 1967

Chomsky, Noam, and Halle, Morris, The Sound Pattern of English, Harper and Row, N.Y., 1968

Kuroda, S.-Y., Yawelmani Phonology, MIT Press, Cambridge, 1967

Schachter, Paul and Fromkin, Victoria, A Phonology of Akan: Akuapem,
Asante and Fante, Working Papers in Phonetics No. 9, UCLA, August 1968

Bobrow, D.G. and Fraser, Bruce, The Phonological Rule Tester, Comm. ACM,
Vol. 11, no. 11, November 1968

Rice, D. Lloyd, and Hofshi, Reuben, An Interactive Phonological Rule
Tester, Working Papers in Phonetics No. 10, UCLA, Dec. 1968

Chomsky, Noam, Some General Properties of Phonological Rules, Language,
Vol. 43, no. 1, March 1967

Kimball, J.P., Conjunctive Stacks and Disjunctive Sequences in Language
Change, Quarterly Prog. Rep., Research Lab. of Electronics, No. 88,
Mass. Inst. of Technology, Jan. 1968
