<?xml version="1.0" standalone="yes"?>
<Paper uid="C69-5301">
  <Title>An Interactive Phonological Rule Testing System</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
AN INTERACTIVE PHONOLOGICAL
RULE TESTING SYSTEM
</SectionTitle>
    <Paragraph position="0"> September 1969 One of the many ways the high-speed computer is useful to linguistic researchers is for the evaluation of generative grammars. Several processing systems for this purpose have been described in the literature. 1,2,3,4 A transformational generative grammar consists of a syntactic component, a phonological component and a semantic component. This paper is concerned solely with the phonological component. While this component is a dependent part of the entire grammar, systems of phonological rules for specific languages, i.e., the phonological components of the grammars of these languages, have been separately presented by Chomsky and Halle 5, Kuroda 6, Schachter and Fromkin 7 and others. The Sound Pattern of English 5 (hereafter, SPE) includes the 'formalism used for presenting phonological rules and the schemata that represent them, and the interpretation of this formalism'. (p. 390) This formal description is taken as the basis for the rule structures discussed in this paper.</Paragraph>
    <Paragraph position="1"> Chomsky and Halle state that 'The rules of the grammar operate in a mechanical fashion; one may think of them as instructions that might be given to a mindless robot, incapable of exercising any judgment or imagination in their application. Any ambiguity or inexplicitness in the statement of rules must in principle be eliminated, since the receiver of the instructions is assumed to be incapable of using intelligence to fill in gaps or to correct errors'. They find it 'a curious fact' however that 'this condition of preciseness of formulation...has led many linguists to conclude that the motivation for such grammars must be...some...use of computers'. We also believe that there are more basic theoretical motives in clarity and completeness; we further believe that this very explicitness makes possible the use of the computer for testing such rules.</Paragraph>
    <Paragraph position="2"> Furthermore, the complexities of natural language are reflected in the component's phonological rules. Anyone who has attempted to teach a group of graduate students the phonology of English, using the rules presented in SPE, can attest to the fact that even a single rule schema presents endless problems for the brightest of students when he attempts to expand the schema and to apply this set of rules to convert an abstract surface structure of a sentence into its phonetic representation. While the linguist or the student may be possessed with greater intelligence than the mindless robot, he is also possessed with human fallibility, and limited time and energy. For these reasons, the mindless robot can perform far more effectively than a minded human. The computer program which is described in this paper was written to aid human phonologists in the writing of rules, the testing of rules, and the teaching of phonology. The importance of such a computerized phonological rule tester becomes very apparent when one selects at random any twenty-five English words, attempts to provide what one assumes to be the underlying phonological representation, and then applies the rules of SPE as specified. One of the authors of this paper made such an attempt. After more hours than she wishes to remember, and using every possible underlying segment, she found that eleven of these randomly chosen words could not be correctly derived. Nor were these strange foreign loan words, unless one believes the word 'America' to be an exceptional item in the English vocabulary. This example is not offered as a criticism of The Sound Pattern of English, probably the most important published book on English phonology and phonological theory. Nor are we concerned here with any theoretical weaknesses which may or may not be present in this work. 
What is apparent is that had a phonological rule tester been used prior to the publication of this set of rules, many of the problems in rule ordering, omitted contexts, etc. could have easily been corrected, and those rules which present problems and which cannot work would have at least been revealed. Furthermore, because of the speed of the computer, one could have tested not only twenty-five words, but hundreds, determining the correct underlying forms of formatives, and thereby providing a lexicon on which the rules can operate. A major problem encountered in setting up a program for such a tester is the compromise often necessary between the computer input format and the rule description schemata used by linguists to express their phonological rules. The Phonological Rule Tester of Bobrow and Fraser 8 solves this problem by offering a variety of logical combinatorial devices which may be used to group either segments within a rule or complete rules for disjunctive or conjunctive application. Such a system has very general descriptive capability, but complex rules appear in the computer input form rather different from the linguist's format.</Paragraph>
    <Paragraph position="3"> In consideration of the descriptive powers of such systems, it appears that the input format should be made as specific as possible to the proposed theoretical structure, since a more general descriptive scheme requires the rule writer to learn a more powerful meta-language than is needed. This parallels the general direction of development of computer languages: from the general machine-oriented coding to the specific problem-oriented languages. This paper describes the translational core or compiler of a system which accepts phonological rules in a format very close to that formalized in The Sound Pattern of English and produces as output the coding similar to phonetic segments necessary to evaluate the input rules in a phonological testing program. The input format of this system is especially applicable to keyboard entry on a CRT graphic terminal such as the IBM 2260 and is planned for possible use in an on-line classroom system for teaching the properties and operation of the phonological component, as well as for the writing and testing of phonological rules by the linguist.</Paragraph>
    <Paragraph position="4"> The rule testing program consists of an input block, a sequence of phonological rules, and a printout block (see Figure 1). The input block will accept a string of characters from the operator's console representing the underlying form of a word or phrase or any form assumed to occur in a derivation in the phonological component. This form is then tested against the environmental conditions specified by the stored rules and modified according to those rules whenever a match is found. The string of phonological units, i.e., segments and boundaries, and/or the binary matrix resulting from the application of any rule may be optionally displayed on the operator's console after the application of that rule.</Paragraph>
    <Paragraph position="5"> The structure of the program is such that any rule or sequence of rules can be tested using the same input and output blocks. The rules initially coded for testing and described in this paper are taken from Chomsky and Halle (1968). The program, however, is not limited to these particular rules, but can be used to test any set of rules comprising the phonological component of a generative grammar.</Paragraph>
    <Paragraph position="6"> The Input Format for Rule Description The input to the phonological component consists of a structurally analyzed string consisting of syntactic brackets (e.g. Noun Phrase, Noun, Adjective, Verb etc), segments, and boundaries. The segments and boundaries are composite feature bundles.</Paragraph>
    <Paragraph position="7"> The system used in our Phonological Rule Tester specifies these units in any of several ways: a. As a combination of upper-case alphabetic characters representing the various phonological segments defined in the system; b. As a cluster of distinctive feature specifications enclosed in angle brackets. These may be spaced horizontally or vertically, i.e. I</Paragraph>
    <Paragraph position="9"> &lt;-round&gt;, but are to be considered simultaneously as a cluster rather than conjunctively as in the square bracketed series; c. As a sequence of segments of specific predetermined types; 'V' indicates any vowel segment, i.e., defined as [+voc, -cons]; 'C' indicates any non-vowel segment, i.e., a true consonant, a liquid or a glide, defined as either [-voc] or [+cons]; 'X' indicates any sequence of units not containing the boundary unit #; 'C' with subscript i indicates a string of at least i consonants; 'C' with subscript i and superscript j indicates a string of at least i and not more than j consonants; d. As any of several boundary units #, +, or =, which signify themselves; e. As a combination of brackets and upper-case alphabetic characters representing the syntactic brackets defined in the system. Rules in the Phonological Component are of the form A -&gt; B / X -- Y where A and B are single units or the null element; the arrow stands for 'is actualized by'; the diagonal line means 'in the context'; and X and Y represent, respectively, the left and right hand environments in which A appears. These may also be null, or may consist of units or strings of units and include labeled syntactic brackets.</Paragraph>
    <Paragraph position="10"> Our system accepts rules written in this format, i.e.</Paragraph>
    <Paragraph position="11"> LHS -&gt; RHS / context specification.</Paragraph>
    <Paragraph position="12"> A rule is applicable if the LHS matches some unit in the test string and any context specified in the rule is found to exist at or adjacent to the matched unit. The context specification may consist of any sequence of one or more units, and must include a marker -- to indicate where the LHS fits into the specified context, or more exactly, how the environment must be configured around the matched unit in order for the rule to be applied. In the Phonological Component of a grammar, two partially identical rules may be coalesced into a single rule schema by enclosing the corresponding non-identical parts in braces, i.e. A -&gt; B / -- {...}. Schemata using such braces coalesce a conjunctive series of rules. The rules are applied in order. A conjunctive series of units is written in our program as a vertical list bounded left and right by columns of left and right square brackets. This corresponds to the braces. For example, given the phonological rule (1)</Paragraph>
    <Paragraph position="14"> has the interpretation that the rule will be tried first with the context specification consisting of the segment symbolized as N (i.e. nasal), and then with the context consisting of the segment V.</Paragraph>
    <Paragraph position="15"> In our system a rule containing a conjunctive series is matched against the test string taking each of the conjunctive items in the order they appear in the series, applying the rule immediately any time the matching string including the current item matches the appropriate portion of the test string. In the phonological theory underlying our system, rules may also be disjunctively ordered. Such rules are represented in schemata by the use of parentheses and angled brackets.</Paragraph>
    <Paragraph position="16"> A disjunction is written as a unit or sequence of units enclosed in parentheses. It differs from the conjunctive series in the sequence of applicability to a particular test string. A disjunction is matched against the test string by considering first the context including the disjunctive item. If this match is successful the rule is applied and no further matching is attempted. If the first match is unsuccessful, then the match is attempted omitting the disjunctive item, applying the rule if a match is found.</Paragraph>
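The disjunctive ordering just described can be sketched as follows. This is a minimal illustration in Python rather than the system's PL/I; the function names `match_at` and `apply_disjunctive`, and the representation of a right-hand context as lists of unit symbols with a single optional item, are assumptions made for the sketch and are not taken from the program itself.

```python
def match_at(units, pos, pattern):
    """Return True if pattern occurs in units starting at index pos."""
    return units[pos:pos + len(pattern)] == pattern


def apply_disjunctive(units, pos, before, optional, after):
    """Try the context including the optional item, then the one omitting it.

    Returns "long" if the context with the disjunctive item matched,
    "short" if only the context without it matched, and None otherwise.
    The second attempt is made only when the first one fails.
    """
    if match_at(units, pos, before + optional + after):
        return "long"
    if match_at(units, pos, before + after):
        return "short"
    return None
```

For a context written B (C) D, the call `apply_disjunctive(list("ABCD"), 1, ["B"], ["C"], ["D"])` takes the longer expansion, and the shorter one is tried only when the longer one fails.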
    <Paragraph position="17"> Clearly, the context must specify exactly one relative position of the LHS, marked by the double dash, --. Thus, the LHS position marker may be in a conjunctive series if it appears once in each item of the series, but it may not occur inside a disjunction.</Paragraph>
    <Paragraph position="18"> The items of a conjunctive series or of a disjunction may in turn include either conjunctive series or disjunctions. Conjunctive series must be written with the bracket columns extending below and not above the line external to the conjunction. Extra spaces may be included either horizontally or vertically for clarity and in some cases may be needed for disambiguation. Rule (12) from SPE would be expressed as follows in this system:</Paragraph>
    <Paragraph position="20"> A context specification may consist of stacked contexts according to the convention that A -&gt; B / D -- E / C -- F is interpreted as A -&gt; B / C D -- E F.</Paragraph>
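The stacked-context convention can be illustrated with a short Python sketch (Python is used here for illustration only; the system itself is coded in PL/I). The function name `flatten_stacked` and the representation of each context as a (left, right) pair of unit lists, listed innermost first, are assumptions of the sketch.

```python
def flatten_stacked(contexts):
    """Combine stacked (left, right) contexts, given innermost first.

    Each further context wraps the ones before it: its left part is
    placed further to the left of the position marker and its right
    part further to the right.
    """
    left, right = [], []
    for ctx_left, ctx_right in contexts:
        left = list(ctx_left) + left
        right = right + list(ctx_right)
    return left, right


# A -> B / D -- E / C -- F  is read as  A -> B / C D -- E F
print(flatten_stacked([(["D"], ["E"]), (["C"], ["F"])]))
```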
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
System Structure
</SectionTitle>
      <Paragraph position="0"> The rule testing program proper consists of 4 sections. They are: 1. the system storage definitions which include definition of the feature set used, 2. the mechanics necessary to accept an input form and set up the test matrix with the features of the input form, 3. a rule test loop which controls cyclic ordering, and 4. the routines to print out the results, either in segment string or in binary matrix form, following the application of any rule. The rules are then included as blocks of coding inserted as desired in the test loop.</Paragraph>
      <Paragraph position="1"> Initially, four values are defined which determine the size of the various tables and matrices in the system.</Paragraph>
      <Paragraph position="2">  DECLARE L(60) CHAR(2); DECLARE F(60,20), M(50,20) BIT(1); DECLARE S(50) FIXED;  The amount of memory reserved may be easily changed by altering only the lines defining the size limits.</Paragraph>
      <Paragraph position="3"> An array L of CHARACTER STRING variables is declared to have length 60 and a logical matrix F is declared to have a length of 60 columns, each column having 20 elements or bits. Immediately after program execution has begun, input of the feature set is requested. The feature specification consists of the character representation for each phonological unit followed by at least 1 space followed by an ordered string of +'s and -'s corresponding to the feature value assignment. The ordering is as in Table 2 below. The character representation of the nth unit entered is stored in the nth element of the string array L and the feature values are stored as 1's and 0's in the nth column of the feature reference matrix F (n ≤ 60). If fewer than 20 binary values are specified for any unit the remainder of the column is filled with 0's (i.e. -'s). Table 1 is a listing of the units and feature values used for testing the rules of English in the present study (from Chomsky and Halle, 1968).</Paragraph>
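The feature-set input step described above might look roughly like this in Python (an illustrative stand-in for the PL/I input routine, which is not reproduced in the paper). The function names are hypothetical, and any feature strings passed in are example data rather than entries of Table 1.

```python
N_FEATURES = 20  # column height declared for the matrices F and M


def parse_feature_line(line):
    """Split 'SYMBOL +-+-...' into a symbol and a 20-element 0/1 column."""
    symbol, _, signs = line.strip().partition(" ")
    column = [1 if ch == "+" else 0 for ch in signs.strip()]
    column = column[:N_FEATURES]
    # Unspecified trailing features are filled with 0, i.e. '-'.
    column += [0] * (N_FEATURES - len(column))
    return symbol, column


def load_feature_set(lines):
    """Build the label array L and the feature reference matrix F."""
    labels, matrix = [], []
    for line in lines:
        symbol, column = parse_feature_line(line)
        labels.append(symbol)
        matrix.append(column)
    return labels, matrix
```

Keeping the nth label and the nth column at the same index mirrors the pairing of L(n) with the nth column of F.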
      <Paragraph position="4">  In order to generate computer instructions as necessary to manipulate the values in the binary matrix, the rules as specified above must be made compatible with the requirements of the internal logical structure of the computer. This is accomplished through a compilation process on the above rules.</Paragraph>
      <Paragraph position="5"> A logical matrix M is declared to have a length of 50 columns, each column having 20 elements. The Jth feature in the Ith column is referred to with the notation M(I,J). The value of each M(I,J) may be either 0 or 1, representing logical False or True (the feature value - or +) respectively. The input string (the form to be tested) is then stored as a pattern of features in the test matrix M such that each unit occupies one column of the matrix, allowing the entry of any string of segments and boundaries up to length 50. The features for each unit are stored in the corresponding column of M by transferring the values from the appropriate column of the previously defined feature matrix F.</Paragraph>
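Filling the test matrix from the feature reference matrix can be sketched as below, again in Python for illustration. The name `build_test_matrix` and the assumption that the input form arrives as space-separated unit symbols are choices made for this sketch; the paper does not state how the console string is divided into units.

```python
def build_test_matrix(form, labels, feature_matrix, max_units=50):
    """Copy one column of F into M for each unit of the input form."""
    units = form.split()
    if len(units) > max_units:
        raise ValueError("input form exceeds the declared matrix length")
    # list(...) copies the column so later rule application on M
    # leaves the reference matrix F unchanged.
    return [list(feature_matrix[labels.index(u)]) for u in units]
```

An unknown symbol raises `ValueError` from `labels.index`, which stands in for whatever input checking the actual program performs.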
      <Paragraph position="6"> Symbols were chosen to have mnemonic value relating to the features used. These symbols are assigned values corresponding to the row in the matrix F having that feature value.</Paragraph>
    </Section>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 VOC vocalic
3 CONS consonantal
4 HIGH high
5 BACK back
6 LOW low
7 ANT anterior
8 COR coronal
</SectionTitle>
    <Paragraph position="0"> Several more rows of the matrix M are declared so that they are available for specification of diacritic information about each unit. The number of such spaces is determined by the declared size of 20, the column height. Table 3 defines six additional matrix rows.</Paragraph>
    <Paragraph position="1"> An additional row associated with M is declared to have length 50 and elements which may have any integer value (up to the computer word size). This row has the symbol S and is used to store the stress value assigned to each unit. The stress on the Ith unit is referenced by the notation S(I). The value 0 is initially stored in the array S for all units entered, which represents [-stress] for all units. This is very convenient from the point of view of the programming language as an integer value 0 is also logical 0 while any integer value greater than zero is read as logical 1. It is planned at a later date to be able to enter a non-zero stress value for any unit in the input string.</Paragraph>
    <Paragraph position="2"> A way was needed to store the information in the matrix representing boundary units and the syntactic bracketing of the input string. Because the previously described distinctive feature set has the common feature [+segment] it is clear that the positions in a column of the matrix representing this feature set need only be so defined when the first position has the value [+segment]. When the first element in a column has the value [-segment] the next 13 spaces are in effect free to be defined so as to represent the boundary unit information. Thus a duplicate set of values is defined on the matrix. At present, 4 spaces remain in the matrix column for addition of syntactic markers other than those defined here.</Paragraph>
    <Paragraph position="3"> When entry of the segmental form to be tested is complete and before the test cycle is begun, the matrix positions corresponding to the diacritics Rule n (FLAG 20 through FLAG 34) are set to 1.</Paragraph>
    <Paragraph position="4"> At a point within the test sequence when the adjustment rules have been applied, the test string is scanned and the bracketing is located. A pair of pointers, LEFT and RIGHT, are set to the left and right innermost brackets. If no brackets are specified in the input form, brackets are added to the left and right ends of the form as referents for these pointers. Corresponding to the cyclic order of application of the phonological rules, all rules begin with environmental search at LEFT and continue right to RIGHT. This is accomplished in the programming language with a DO-END statement pair as follows:</Paragraph>
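Locating the innermost bracket pair can be sketched as follows in Python (for illustration; not the program's PL/I). The sketch assumes, for simplicity, that brackets appear as the plain symbols "[" and "]" in a list of units, whereas the actual system stores labeled syntactic brackets as feature columns; `innermost_brackets` is a hypothetical name.

```python
def innermost_brackets(units):
    """Return (units, left, right), indexing the innermost bracket pair.

    If the form carries no brackets, a pair is added at its two ends so
    that LEFT and RIGHT always have something to point at.
    """
    units = list(units)
    if "]" not in units:
        units = ["["] + units + ["]"]
    # The first closing bracket and the nearest opening bracket to its
    # left delimit an innermost bracketed stretch.
    right = units.index("]")
    left = max(i for i in range(right) if units[i] == "[")
    return units, left, right


units, LEFT, RIGHT = innermost_brackets(["[", "A", "[", "B", "C", "]", "D", "]"])
for i in range(LEFT, RIGHT + 1):  # counterpart of DO I=LEFT TO RIGHT
    pass
```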
    <Paragraph position="6"> Any reference to I within the DO-loop range uses the current value of the variable I for that repetition of the loop.</Paragraph>
    <Paragraph position="7"> Because several of the rules needed for the phonetic specifications in a language require insertion or deletion of phonological units in the string, it is desirable to be able to print out the results of the application of any rule after that rule has been applied to the string. This ability has been provided in the present program with the characteristic that the results may be optionally printed after the application of any rule in the test sequence.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Rule Coding
</SectionTitle>
      <Paragraph position="0"> We may now consider the coding for one of the rules to be programmed. In rule 32, Glide Vocalization, we have the specification; First, the Ith unit must be checked to see if it has the feature [+segment]. If not, the scan is continued to the next unit. If it does, also continue the scan. This is represented by the following coding.</Paragraph>
      <Paragraph position="2"> vention that any rule be interpreted as applicable in the presence of the formative boundary +, which has the features [-seg, +FB], in any case in which the rule is otherwise applicable. That is, no rule should be blocked by the presence of + in any context where that + is unmarked in the environmental conditions. On the other hand, if the + is marked in the environmental specification it must be present in the string before that rule is applicable.</Paragraph>
      <Paragraph position="3"> From the preceding discussion we see that the environment for this rule must be interpreted as To reference any unit a fixed distance to the left or right of the currently scanned unit I, it would be possible to add or subtract a constant to the column pointer I. That is, M(I-1,J) would reference the unit immediately to the left of I. In this case, however, the unit in question may be either 1 or 2 spaces to the left of I, depending on whether the unit at I-1 has the features [-seg, +FB]. Actually, it is necessary to check only the first 2 features; a set that is not defined in the vocabulary may be assumed not to occur. A set of pointers is available to indicate the distance of the desired unit from the currently scanned unit, L1 through L9 for distance to the left and R1 through R9 for distance to the right. These pointers, when used in a rule, are initially set to 1 at the beginning of the environmental search in each matrix position. With this convention, I-L1 initially refers to the unit immediately to the left of the Ith unit. If the unit I-1 is found to have the features of the formative boundary + then L1 is set equal to 2. I-L1 now refers correctly to the segment to be checked for the environmental condition specified.</Paragraph>
      <Paragraph position="5"> IF M(I-L1,ROUND) ¬= M(I-L1,HIGH) THEN GO TO END32; [V] is defined to be the coincidence of the features [+vocalic, -consonantal] which may be checked simply in one statement, while application of this rule specifies the value assignment M(I,VOC)=1. Following application of the rule the printout option flags are checked and if either is set the corresponding print subroutine is executed. The coding for the rule may now be completed.</Paragraph>
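The adjustment of the left-distance pointer can be sketched in Python as follows (an illustration, not the PL/I coding). The row numbers assigned to `SEG` and `FB` are hypothetical, since the actual row assignments come from the feature tables, and `left_offset` is a name introduced only for this sketch.

```python
SEG, FB = 0, 1  # hypothetical row numbers for [segment] and [FB]


def left_offset(matrix, i):
    """Return the value L1 should take for the unit at index i.

    L1 starts at 1 and becomes 2 when the unit at i-1 is the formative
    boundary +, i.e. [-segment, +FB], so that matrix[i - L1] skips it.
    The caller is assumed to pass i >= 1.
    """
    unit = matrix[i - 1]
    if not unit[SEG] and unit[FB]:
        return 2
    return 1
```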
    </Section>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
DO I=LEFT TO RIGHT; Description
</SectionTitle>
    <Paragraph position="0"> Units to be scanned start at left-most unit, I=LEFT, and include successive units I=LEFT+1, I=LEFT+2, to right-most unit I=RIGHT. L1=1; Set the pointer equal to 1, at unit to immediate left of I.</Paragraph>
    <Paragraph position="1"> IF ¬M(I,SEG) THEN GO TO END32; If the currently scanned unit, I, is specified as [-segment] go to next I (i.e. I+1).</Paragraph>
    <Paragraph position="3"> If scanned unit, I, is either [+consonantal] or [-back] (i.e. does not match the rule condition), go to the next unit.</Paragraph>
    <Paragraph position="4"> IF ¬M(I-1,SEG) &amp; M(I-1,FB) THEN L1=2; If the unit immediately to the left of I is specified as [-segment] and [+FB], then set the pointer to 2 (i.e. I-L1 will refer to two units to the left of I).</Paragraph>
    <Paragraph position="5"> IF ¬M(I-L1,SEG) THEN GO TO END32; If I-L1 is a [-segment] go to next unit.</Paragraph>
    <Paragraph position="6"> IF M(I-L1,ROUND) ¬= M(I-L1,HIGH) THEN GO TO END32; If the unit in the left environment does not have the same feature values for roundness and highness (i.e.</Paragraph>
    <Paragraph position="7"> does not meet the rule condition, αround, αhigh), go to the next unit.</Paragraph>
    <Paragraph position="8">  (i.e. not a true vowel), go to the next unit.</Paragraph>
    <Paragraph position="9"> All the conditions have been satisfied; change the value of the feature [vocalic] from - to + (i.e. apply Rule 32).</Paragraph>
    <Paragraph position="10"> Instruction to print the rule number (R32) and state the matrix feature column to which it has been applied, i.e. I.</Paragraph>
    <Paragraph position="11"> If a display of the string resulting from application of Rule 32 is desired, go to subroutine STROUT.</Paragraph>
    <Paragraph position="12"> If a display of the matrix resulting from application of Rule 32 is desired, go to subroutine MATOUT.</Paragraph>
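Taken together, the tests listed above for Rule 32 can be transcribed into a Python sketch (for illustration; the original coding is PL/I and is only partly preserved here). The row numbers are hypothetical stand-ins for the feature table ordering, the function name is invented for the sketch, and the print-out options are reduced to returning the indices at which the rule applied.

```python
# Hypothetical row numbers; the feature tables fix the real ordering.
SEG, VOC, CONS, HIGH, BACK, ROUND, FB = range(7)


def glide_vocalization(m, left, right):
    """Scan units left..right of m in place; return the indices changed.

    m[left] is assumed to be a bracket, i.e. a [-segment] unit, so the
    look-behind m[i - l1] never reaches past the start of the scan.
    """
    changed = []
    for i in range(left, right + 1):
        l1 = 1                                  # distance to the left unit
        if not m[i][SEG]:
            continue                            # boundaries are skipped
        if m[i][CONS] or not m[i][BACK]:
            continue                            # unit is not a back glide
        if not m[i - 1][SEG] and m[i - 1][FB]:
            l1 = 2                              # step over the boundary +
        env = m[i - l1]
        if not env[SEG]:
            continue
        if env[ROUND] != env[HIGH]:
            continue                            # round and high must agree
        if not env[VOC] or env[CONS]:
            continue                            # left unit is not a true vowel
        m[i][VOC] = 1                           # apply the rule
        changed.append(i)
    return changed
```

Each `continue` plays the part of GO TO END32, and the assignment `m[i][VOC] = 1` corresponds to M(I,VOC)=1.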
    <Paragraph position="13"> To illustrate, the output coding to evaluate a simple right-handed context of the form A -&gt; B / -- context will be examined. It will be seen that this coding can be generalized to evaluate a left-handed context as well. If the context matching process is considered to be anchored at the point between the LHS position marker and the context body, then conjunctive and disjunctive items farther to the right in the context may be tried without rematching items to their left in the context string. This would be true even after the rule has been applied to the currently matched unit provided that the matched unit is again tested against the LHS after application of the RHS to that unit and before the context match continues. The run-time environment in the object machine requires a single push-down stack and a few simple variable storage locations. A test string is assumed to be stored in the object machine which may have been entered prior to execution of the rule match or may be the result of application of a prior rule in the system.</Paragraph>
    <Paragraph position="14"> The semantics for matching particular units in the test string will not be described, but will be abbreviated in the output coding as</Paragraph>
    <Paragraph position="16"> which is taken to mean that a jump to the ELSE GO TO label occurs if the specified unit was not successfully matched. Further abbreviations in the output coding are in the application of the RHS of a rule, indicated by</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
DO RULE;
</SectionTitle>
    <Paragraph position="0"> and in the declaration of program block and procedure structures. Otherwise, the coding presented constitutes a valid PL/I program segment.</Paragraph>
    <Paragraph position="1"> From the coding necessary to evaluate simple contextual expressions including an un-nested disjunction, it may be seen that no looping back to previously matched units is necessary. When the left parenthesis is encountered the current location of the match pointer, stored in the variable P, is saved. If any subsequent item match fails before the right parenthesis is encountered the pointer location is reset to the saved value and the matching process resumes with the next unit outside the parenthesis. The saved pointer location is erased when the right parenthesis is encountered whether or not the disjunctive item was successfully matched. This scheme achieves the desired disjunctivity quite simply in that only one match is attempted. If the match of units inside the disjunction succeeds, the matching process continues normally. If it fails, the enclosed string is effectively ignored and the matching process continues as before. This process may be made recursive to any level by saving the pointer location in a push-down stack, freeing the top stack item when a right parenthesis is encountered. Such a stack may easily be implemented in PL/I by using the CONTROLLED form of dynamic storage allocation for a variable STK. A new level in the stack is secured with the statement ALLOCATE STK;, saving all previous values. The top level is erased with the statement FREE STK;, bringing the previous value into accessibility.</Paragraph>
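The pointer-saving scheme for nested disjunctions can be sketched in Python, with a list standing in for the CONTROLLED variable STK (append for ALLOCATE, pop for FREE). This is an interpretive illustration of the scheme, not the compiled PL/I output; the token-list form of the pattern and the names `match_context` and `skip_to_close` are assumptions of the sketch.

```python
def skip_to_close(pattern, k):
    """Return the index of the ')' closing the parenthesis that encloses k."""
    depth = 0
    for j in range(k, len(pattern)):
        if pattern[j] == "(":
            depth += 1
        elif pattern[j] == ")":
            if depth == 0:
                return j
            depth -= 1
    raise ValueError("unbalanced parentheses in pattern")


def match_context(units, start, pattern):
    """Match pattern against units from start; return end index or None."""
    p = start    # current match pointer P
    stack = []   # saved pointer values, one per open parenthesis
    k = 0
    while len(pattern) > k:
        tok = pattern[k]
        if tok == "(":
            stack.append(p)              # save P on entering a disjunction
        elif tok == ")":
            stack.pop()                  # erase the saved value on exit
        elif len(units) > p and units[p] == tok:
            p += 1
        else:
            if not stack:
                return None              # failure outside any parenthesis
            p = stack.pop()              # reset P and ignore the enclosed string
            k = skip_to_close(pattern, k)
        k += 1
    return p
```

As in the scheme described, a failure inside a parenthesis causes the whole enclosed item to be skipped and matching to resume just outside it; no earlier units are rematched.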
    <Paragraph position="2"> In the coding examples presented below, two variables, LEFT and RIGHT, are assumed to contain the currently applicable left and right limits for matching the test string. These will be set by scanning the test string to locate the innermost syntactic brackets or other such test string delimiters. The index variable N will be used to indicate the left-most end of the matching process; in this case, the anchoring point following the LHS position marker. The statement MATCH UNIT __; ELSE GO TO __; is assumed to increment the current match pointer P and fail at any time the value of P exceeds the right delimiter value, stored in RIGHT. The coding to evaluate a context of the form RULEN: A -&gt; B / -- C D (E (Y G) H J) K would have the following appearance.</Paragraph>
    <Paragraph position="3"> The attempt to formulate the coding to evaluate a context including a conjunctive series brings to light a different type of problem. It is not possible to match units from left to right in an orderly fashion as for simple or disjunctive contexts. Once a match for the entire string has been attempted using the first item of the conjunctive series, it is necessary, whether the rule was applied or not, to reset the current match pointer to its value at the time the left bracket was encountered, and then continue the matching process using the units of the second conjunctive item as the matching patterns. In order to loop back in this manner, it is necessary to save three values during a matching pass over the string: 1) The bracket-pair number, 2) The pointer value at the time the left bracket is encountered, and 3) The item number within the bracket pair. These three values are saved in the push-down stack in the order listed when the conjunctive series match is begun. It is convenient in the PL/I language to accomplish the branching by using the stacked values as subscript values in an assigned-label GO TO statement. The labels ITEM(1,1):, ITEM(1,2):, ITEM(1,3):, ... are attached to the statements in the coding which perform the pointer reset following matching of the corresponding conjunctive items. Branching is accomplished with the statement GO TO ITEM(I,J); following the proper assignment of values to the variables I and J. An initial value of zero is put in the stack prior to rule evaluation.</Paragraph>
    <Paragraph position="4"> The stack is then checked for a non-zero top item before it is unstacked for label assignment and an empty stack indicates that all conjunctive items in the rule have been used in the matching process. If the stack is not empty, the top two items are unstacked and stored as the variables J and P respectively. The remaining top stack item is accessed and the value stored in the variable I, but it is not freed from the stack. The value of P must then be restored to the stack so it will be handled properly by the end-of-item coding. The details of this scheme may be seen in the following example, coded to evaluate a context of the form</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
GO TO ITEM(I,J);
SCAN: END RULEJ;
</SectionTitle>
    <Paragraph position="0"> A further complication arises when a conjunctive series is embedded inside of a disjunction. Specifically, the pointer location should not be reset to the value stored in the stack for a~y failure to match the internal sequence of units, but onl~ if the match fails for all items in the embedded conjunctive series. Because the last conjunctive item may fail to match, while a previously tested item matched successfully, it is necessary to use a '~lle applied&amp;quot; flag, which is cleared (reset) wbe,- entering the match of a disjunction and set by any application of the rule. The setting of this flag determines the action taken concerning the pointer setting on exit from the disjunction, when all conjunctive items have been tried.</Paragraph>
    <Paragraph position="1"> The Compiling Process. It may be seen from the coding examples given that the output from the compiler occurs essentially in the same order as the symbols in the linear input form, suggesting that a preliminary stage of syntactic analysis is unnecessary. It is only necessary to save the RHS specification in the compiler from the time it is input until it is output in coded form at the end of the context coding. Observing the three different types of failure-to-match exit branches, it appears that the most direct solution is a three-state, table-driven translator used in conjunction with a number of indices defined during the compiling operation for the purpose of counting brackets and parentheses, generating sequential labels, etc. Entries in the table indicate, for each of the three states, what output coding should be generated and what compiler index operations should be carried out as a result of each possible input symbol.</Paragraph>
    <Paragraph position="2"> The table and listing of compiler actions shown below specify a compiler system capable of producing PL/I coding such as that shown in the examples. Notations used in the compiler table and action specifications are explained briefly.</Paragraph>
    <Paragraph position="3">  1. Upper-case letters in the output are output as shown.</Paragraph>
    <Paragraph position="4"> 2. Lower-case letters in the output represent compiler variables for which the currently assigned value is output.</Paragraph>
    <Paragraph position="5"> 3. Abbreviated output coding has the meaning discussed above; for example, DO RULE expresses the coding necessary to incorporate into the marked unit in the test string the characteristics or features given as the RHS of the rule.</Paragraph>
    <Paragraph position="6">  4. The state transfer from state 2 on input of a right parenthesis is a conditional transfer, depending on the value of the compiler variable m. The test is shown as a fourth pseudo-state.</Paragraph>
    <Paragraph position="7"> 5. Compiler initialization, shown as state 0, must be accomplished at the beginning of compilation for each rule.</Paragraph>
    <Paragraph position="8"> 6. Three of the input actions are identical for all states, indicating that it is unnecessary to store those actions in the state table.</Paragraph>
    <Paragraph position="9"> 7. No action is specified for error inputs. It is assumed that the compiler would respond with some indication of the trouble; for example, a comma input when in state 2 could cause the reply "Comma illegal inside parenthesis." The compiler uses seven variables: four of them, i, j, k and l, as push-down stacks with the CONTROLLED attribute, and three, m, n and o, as simple variables.</Paragraph>
    <Paragraph position="10">  The Generality of the Process. The only references to left-right directionality in the matching scheme described are in the left-to-right scan of the current LHS marker in attempting to fit the test string and in the assumption that the coding for matching particular units included an instruction to increment the matching location pointer, P. A left-hand context may be evaluated by similar coding by letting the pattern match move from the LHS outward, i.e., to the left. The same compiling system can be used by reversing the symbols of the left-hand context during the initial linearization, substituting left for right and right for left brackets and parentheses. Thus, a rule of the form</Paragraph>
    <Paragraph position="12"> would appear in the linear format as A → B / H(G)FE -- I[J,K]+ An additional dimension would be added to the compiler state table, providing for the production of unit match coding which would decrement instead of incrementing the current matching pointer. The LHS marker would still scan the test string from left to right. If the LHS marker occurred within the items of a conjunction, separate coding would have to be produced for the left and right parts of each item. The details of the matching process for this case have not been worked out, but do not appear to present any major difficulties for the system presented here.</Paragraph>
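    <Paragraph> The reversal of a left-hand context during linearization can be sketched as follows. This is an illustration in Python, not the paper's procedure. The function name linearize_left_context and the mapping SWAP are hypothetical, every unit is assumed to be a single character, and the input EF(G)H used in the usage note is only inferred from the linear form H(G)FE shown above, since the original display of the rule is not preserved in this text.

```python
# Hypothetical mapping: left and right brackets and parentheses trade places.
SWAP = {"(": ")", ")": "(", "[": "]", "]": "["}


def linearize_left_context(context):
    """Reverse the symbols of a left-hand context and swap its delimiters.

    The result lists the units in the order in which they are met when
    the match moves outward (leftward) from the unit being rewritten.
    Each unit is assumed to be one character (illustrative simplification).
    """
    return "".join(SWAP.get(symbol, symbol) for symbol in reversed(context))
```

Under these assumptions, linearize_left_context("EF(G)H") yields "H(G)FE", so the optional unit G stays properly parenthesized after the reversal.</Paragraph>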
  </Section>
</Paper>