<?xml version="1.0" standalone="yes"?> <Paper uid="C73-1008"> <Title>WORKING WITH THE INTERACTIVE VERSION OF THE T.G.T.-SYSTEM OF JOYCE FRIEDMAN</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> ISTVÁN BÁTORI WORKING WITH THE INTERACTIVE VERSION OF THE T.G.T.-SYSTEM OF JOYCE FRIEDMAN </SectionTitle> <Paragraph position="0"> The present paper does not claim to be a description of the TGT-System, since the system was already presented by Professor Friedman herself at the International Conference on Computational Linguistics in Stockholm in 1969. In addition, the system has been described in the book JOYCE FRIEDMAN, A Computational Model of Transformational Grammar, Elsevier, 1971. Our intention is to present the new interactive version of the TGT-System, which has been developed at the Basic Research of IBM Germany, and to show how it can be used in linguistic research.</Paragraph> <Paragraph position="1"> In order to appreciate the present interactive version, it will, however, be necessary to recall some essential aspects of the TGT-System, although we do not want to discuss the Friedman System as such in a systematic fashion.</Paragraph> <Paragraph position="2"> Accordingly, in the first part of the paper I shall talk about the batch version and about our experiences with the system, and then I shall proceed to the interactive version.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. THE TGT-SYSTEM IN GENERAL </SectionTitle> <Paragraph position="0"> The TGT-System of Friedman grew out of the necessity to verify or control a formal grammar. It becomes increasingly difficult to control any formal system beyond a certain size: if one wishes to follow the interaction of two or three abstract rules with all their implications, he may still use his head; for a dozen rules, he will need paper and pencil; and for hundreds of rules, he must have a computer.</Paragraph> <Paragraph position="1"> The new interactive version of the System has actually been installed at the C.N.U.C.E. in order to enable the participants of the Conference to see the system as it works. I take this opportunity to thank the organizers of the conference, the C.N.U.C.E., and particularly Professor Faedo, Torrigiani, and Zampolli, once again, for their generous support of the demonstration. I also thank my colleagues Mrs. Schirmer, Miss Zoeppritz and Mr. Henning, who helped me to prepare the demonstration. I am especially indebted to Dr. Picchi, who adapted the interactive version to the local CMS system.</Paragraph> <Paragraph position="2"> Friedman's primary objective was to give computational aid to transformationally oriented linguists. Her system as it stands now can, however, also be considered an attempt to formalize transformational grammar in the strict mathematical sense.</Paragraph> <Paragraph position="3"> The basic intention of Friedman was not to argue for a specific type of generative grammar but rather to offer a framework as general as possible and let the linguist impose restrictions on his particular grammar. However, it cannot be overlooked that the starting point of Friedman is clearly CHOMSKY'S Aspects-model.</Paragraph> <Paragraph position="4"> Accordingly, it is easy to learn how to work with the TGT-System if you are familiar with transformational theory. On the other hand, you can use the system &quot;to learn&quot; transformational grammar, as a tutorial aid. Since we want to discuss neither transformational grammar directly nor the purely technical details of Friedman's System, let me presume familiarity with the basic notions of generative grammar and refer once again to Friedman's book for the purely notational conventions.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. 
THE FORM OF THE GRAMMAR </SectionTitle> <Paragraph position="0"> The form of the Grammar is strictly prescribed, but, as already mentioned, it is very close to current transformationalist notation.</Paragraph> <Paragraph position="1"> For the TGT-System a grammar consists of a phrase structure, a lexicon, and a transformational part. In the first phase of the processing the grammar is built up according to the user's specifications, and in the second, subsequent phase one or more sentences are constructed according to the grammar. Each of these major components is subdivided further into smaller units. The structuring of the Grammar is indicated by keywords, which must be used in certain positions and are anticipated by the System.</Paragraph> <Paragraph position="2"> Let me briefly comment on some points of this scheme of grammar.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3. THE TREATMENT OF THE CONTEXTUAL FEATURES </SectionTitle> <Paragraph position="0"> Friedman introduced a new type of feature, called contextual, which comprises Chomsky's strict subcategorization and selectional restrictions.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> That is, for further processing it is all the same whether a contextual rule involves features, as in (a), or just category symbols, as in (b) in Fig. 1.</Paragraph> <Paragraph position="1"> But apart from this simplification, the treatment of these contextual features is significantly different from that of CHOMSKY in the Aspects-model. 
The main innovation is the concept of &quot;side effects&quot;, which makes the selectional rules independent of the order in which lexical items are inserted into the derivational tree.</Paragraph> <Paragraph position="2"> If the contextual feature refers to a node (or nodes) to which a lexical entry has already been attached (as in (1) in Fig. 1), the program checks the compatibility of the item with its environment, just as in the Aspects-model. If, on the other hand (as in (2) in Fig. 1), the node referred to in the contextual rule is still empty, the new item is introduced and the consequences of the contextual features, i.e. of the features on which the insertion depends, are projected into the environment.</Paragraph> <Paragraph position="3"> </Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4. THE REPRESENTATION OF TREES </SectionTitle> <Paragraph position="0"> Note that the output trees are turned on their side to simplify printing. In addition, the nodes are numbered for ease of reference. These numbers can be used, among other things, to locate the features that belong to a specific node, including higher, non-terminal nodes. Note also that features coming from the lexicon are originally associated with the lexical entries. After lexical insertion they are adjoined to the immediately dominating category node and no longer to the actual word.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 5. THE FORM OF TRANSFORMATIONS </SectionTitle> <Paragraph position="0"> In comparison with the phrase structure rules, the notational conventions for transformations are less uniform. The notational unsteadiness is largely due to the lack of a strict, mathematically founded and universally accepted transformational theory.</Paragraph> <Paragraph position="1"> There are two notational styles in use; the more popular of them is the MIT-style (Fig. 
2).</Paragraph> <Paragraph position="2"> Verbal Description of Passive:</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. EXCHANGE SUBJECT AND OBJECT 2. INSERT THE WORD BY AS LEFT SISTER OF THE AGENT 3. MARK THE MAIN VERB AS PAST PARTICIPLE </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"/> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 ALESE 2, 2 ARISE 3. Abbreviations </SectionTitle> <Paragraph position="0"> ALESE: add left sister; ARISE: add right sister</Paragraph> </Section> <Section position="10" start_page="0" end_page="0" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> This convention for transformations is generally advocated in standard introductory works. Accordingly, transformations are written in the form of pseudo-rewriting rules, where, apparently, the structural description (SD) part should be replaced by the structural change (SC) part. In other words: you define the input and you define the output. The convention is self-explanatory, but perhaps somewhat vague. The MIT notation is regarded even by its own adherents as a convenient shorthand for indicating structural changes rather than as a proper, full-scale formalism.</Paragraph> <Paragraph position="1"> The other style is the MITRE notation, which is less well known and resembles computer commands. This convention defines the input to a transformation and lists the elementary operations to be carried out on the input tree. The elementary operations should be defined in advance. On the whole this way of representing transformations is more abstract, but it can be formalized more readily. 
Friedman uses this style of notation; it is no problem to reformulate a transformation from the pseudo-rewriting style into the operational representation.</Paragraph> </Section> <Section position="11" start_page="0" end_page="0" type="metho"> <SectionTitle> 6. &quot;THE TRAFFIC RULES&quot; </SectionTitle> <Paragraph position="0"> The purpose of the control program (CP) is to determine in which order, and at which point in a derivational tree, a transformation should be applied.</Paragraph> <Paragraph position="1"> By means of a FORTRAN-like control language (the so-called &quot;traffic rules&quot;), the linguist can execute the transformations cyclically, i.e. apply the same set of transformations to every clause; he can determine in which order the clauses of a sentence should be processed; and he may change the order of execution depending on certain conditions, e.g. on the success of preceding transformations. This control part of Friedman's System provides enormous generative power, the possibilities of which have hardly been discussed in linguistics. With the CP of Friedman you can easily define several successive transformational cycles, and you can solve the ordering problem of transformations by defining unique jumps in order to skip the execution of a transformation, which would be impossible in a &quot;simple&quot;, cyclically ordered grammar.</Paragraph> <Paragraph position="3"/> </Section> <Section position="12" start_page="0" end_page="0" type="metho"> <SectionTitle> 7. USING THE SENTENCE GENERATOR </SectionTitle> <Paragraph position="0"> The actual testing of the Grammar is done by the Sentence Generator. As already said, the Grammar is set up in the first phase of the processing, and subsequently the system should be instructed to generate sentences according to the given grammar. 
Trivially, inasmuch as the system generates correct sentences, the grammar is verified; to the extent that the generated sentences are false, the grammar is wrong and has to be corrected.</Paragraph> <Paragraph position="1"> The sentence generator as such can operate in one of three modes (Fig. 3): 1. It can generate sentences completely at random, where a random number generator controls the selection of grammatical rules and lexical insertion. All you have to do is to enter the sentence symbol S.</Paragraph> <Paragraph position="2"> 2. You can predefine a sentence entirely at the level of deep structure and let the system check the tree and carry out the transformations leading up to the surface structure.</Paragraph> <Paragraph position="3"> 3. You can use partially defined input, e.g. defining just the structure but leaving open the lexical insertions, or just specifying a particular structural configuration you are interested in, while letting the system fill in the rest at random.</Paragraph> <Paragraph position="4"> For practical testing the second and the third ways of using the sentence generator are clearly preferable. The random generator may produce spectacular sentences, but practically never the ones which have a bearing on the problem you are interested in. The sentences delivered by the random generator may be, and are, revealing, and nobody experimenting with the system would resist the temptation to see what his grammar would produce &quot;left entirely alone&quot;, but random generation is not suitable for systematic work. 
You may correct a mistake detected by the random generator, but you had better test the correction with a predetermined skeleton; otherwise you may get a totally different sentence, from which you cannot see whether the error has really been corrected or not.</Paragraph> <Paragraph position="5"> According to our experience, entirely predetermined structures including lexical entries are the best for testing a grammar. In this case you can anticipate a normal sentence as the final output of the generator, and can immediately decide whether the generation is correct or not. There are two input formats: a free, bracketed format (FTRIN) and a fixed tree format (TRIN). It is perhaps a matter of personal taste, yet for us FTRIN, that is, the bracketed input, seemed to be more convenient (Fig. 4).</Paragraph> </Section> <Section position="13" start_page="0" end_page="0" type="metho"> <SectionTitle> FTRIN Format: </SectionTitle> <Paragraph position="0"/> </Section> <Section position="14" start_page="0" end_page="0" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> Usually, the interaction of the phrase structure rules is fairly straightforward, while that of the transformational rules is much more intricate.</Paragraph> <Paragraph position="1"> Therefore you can easily predefine a skeleton by applying your own phrase structure rules &quot;manually&quot; and then let the system apply the transformations to the prefabricated input. If you use partially predetermined trees, you may be distracted by mistakes which occur at places that are of no interest to you. Note that you cannot correct all errors, at least not at once, and therefore you had better concentrate on a few points; otherwise you lose sight of your own grammar.</Paragraph> <Paragraph position="2"> 8. 
THE OUTPUT OF THE BATCH VERSION The original batch output of the TGT-System was designed to provide all the information about the processing which the linguist may possibly need. First the input grammar is listed, followed by the content of the major internal tables, according to which the subsequent generation proceeds. Then the process of sentence generation is reported in such a manner that the linguist can follow the significant steps of the processing (Fig. 5 (1)).</Paragraph> </Section> <Section position="15" start_page="0" end_page="0" type="metho"> <SectionTitle> 9. THE INTERACTIVE VERSION OF THE TGT-SYSTEM </SectionTitle> <Paragraph position="0"> The present interactive version has been developed on the basis of the experience gained by working with the original batch version.</Paragraph> <Paragraph position="1"> We have noticed in general that we are interested in the linguistic aspects of the derivation, such as changes in the tree, or in the final output, but not in the actual computation.</Paragraph> <Paragraph position="2"> The demand for a more condensed output is even more imperative in a terminal environment, where the time and the output should be restricted to a minimum. Therefore we defined a new, additional output file containing just the essential information in which a linguist is interested (Fig. 5).</Paragraph> <Paragraph position="3"> The original batch protocol enables you to follow the actual flow of computation, e.g. in the case of a transformation you get the modules called to perform the successive steps of the processing. The interplay of the different subroutines is, however, always the same: ANTEST calls PASSIV, PASSIV calls ELEMOP, etc. Since Friedman's System works practically free of error, there is no need to check the subroutine calls every time. 
This information, therefore, can be dispensed with for most purposes.</Paragraph> <Paragraph position="4"> We have designed a slightly different, more comprehensible format, which contains only the linguistically relevant information. The new output format of the interactive version makes clear reference to the input grammar, such as the name of the transformation, the names of the elementary operations, and the nodes affected by them. In one point the interactive version provides information which was not explicitly reported in the original batch version. You can now follow the feature operations in the same form as you follow the tree operations: the interactive protocol delivers the feature names and the actual feature values.</Paragraph> </Section> <Section position="16" start_page="0" end_page="0" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> For a linguist testing feature operations this is an innovation over the original batch version, which merely gives a hint at this point that the feature operation has been successfully completed, without further details.</Paragraph> <Paragraph position="1"> It should be noted that batch output and terminal output are not mutually exclusive: the terminal output is a summary extracted from the original and placed on a separate output file. The original output is, however, still available. The file on which it is written is normally set to dummy, but it can be reactivated and listed in the very same form as in the original version. 10. THE COMMANDS OF THE INTERACTIVE VERSION The interactive version on the whole uses a fairly straightforward language. The answer to most of the questions is either yes or no (or just the first letter of these words). Every question is prompted and should be answered by saying yes or no. In cases where another answer is expected, the book of Friedman should be consulted. 
Note that if you want to enter the input skeleton from somewhere other than the terminal, you must have the file allocated prior to calling the TGT-System.</Paragraph> </Section> <Section position="17" start_page="0" end_page="113" type="metho"> <SectionTitle> 11. THE CONTROL OF THE INPUT </SectionTitle> <Paragraph position="0"> Summarizing: if you want to run the TGT-System you have to define and enter a grammar, give a command for the sentence generator, and deliver a skeleton to be expanded (Fig. 6). Originally all three kinds of input were entered in sequence into the system on the same file as data.</Paragraph> <Paragraph position="1"> It should be noted that the grammar is a part of the input data, which is entered and processed in each run. This homogeneous input is then interpreted by the system as grammar or as input to the sentence generator according to the internal logic of the program. In order to achieve greater flexibility while testing a grammar, we separated the three logically different kinds of input into three separate files. The input grammar, usually a text of several hundred lines, is normally already stored on an external device and entered accordingly. The generator command (the $MAIN-card) may be attached to the grammar; if not, it is prompted and you may enter it from the terminal.</Paragraph> <Paragraph position="2"> Similarly, you may predefine input skeletons to be tested and enter them as a separate file, just as you enter the grammar. You have, however, the choice to enter skeletons directly from the terminal. If you wish, you may enter as many skeletons as you like. 
The random generator then provides for variation.</Paragraph> <Paragraph position="3"> Technically, the separation of the three logically different kinds of input has been accomplished by introducing a file variable, which is first set to accept the grammar from a permanent data set and then switched over to the terminal or to another permanent input data set according to user specifications at session time.</Paragraph> </Section> <Section position="18" start_page="113" end_page="113" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> 12. THE TREATMENT OF THE ERROR MESSAGES The same file variable technique is used to control the error messages. The error file is set either to the terminal or to the batch file. There would be no problem in assigning the error messages permanently, yet a possible change of the file requirements in the terminal environment would mean a revision of several hundred error messages, while a file variable can be controlled by a single instruction. There is a further problem to be faced, and that is the reference point of the error message. In the original batch version the error message precedes the actual erroneous line in the grammar or is inserted in the protocol at the appropriate point.</Paragraph> <Paragraph position="1"> In the first case the interactive version does not display the original input grammar, and therefore messages such as that brackets are opened but not closed, or &quot;special character expected&quot; but not found, and the like are not very informative, since the user would be left alone to find the critical place in the grammar. Therefore the error messages during the processing of the input grammar are preceded by the actual line in which the error has occurred. 
The line numbering will help the linguist to locate the erroneous section in the input grammar.</Paragraph> <Paragraph position="2"> If, on the other hand, the error occurs during sentence generation, the message will be inserted in the terminal protocol at the appropriate place.</Paragraph> </Section> <Section position="19" start_page="113" end_page="113" type="metho"> <SectionTitle> 13. THE CONTROL OF OUTPUT </SectionTitle> <Paragraph position="0"> Another crucial point is the control of the terminal output. You have the following choices as regards the extent of the output: 1) You are not interested in any further details and do not want to see the full input tree. In this case you still get (a) the linear representation of the input, (b) the list of transformations which have been applied, and (c) the output of the transformations, also in linear form. This is the minimal amount of output (Fig. 7). 2) If you wish to see the input tree to the transformational component, you answer the question PRINTOUT INPUT TREE? by saying &quot;yes&quot;. In this case you get also the full output tree of the</Paragraph> </Section> <Section position="20" start_page="113" end_page="113" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"/> </Section> <Section position="21" start_page="113" end_page="113" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> 3) You may also want to see the features associated with the nodes in the tree; then you respond correspondingly to the next question of the system, PRINTOUT FEATURES, and you get the features of both the input and the output tree displayed. In addition you get the list of transformations applying, now including the feature operations (Fig. 
9):</Paragraph> </Section> <Section position="22" start_page="113" end_page="113" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> 4) You may be interested in even more details, for instance in some intermediate trees, and you have inserted TRACE-cards in the control program of the grammar just as they are inserted in the original batch version. Now if you answer the question PRINTOUT INPUT TREE by saying ALL, you will receive every intermediate tree as well, in addition to the input and output trees with features and feature operations. Otherwise the TRACE function returns just the terminal string of the derivation. Fig. 10 shows the general logic of the output. A grammar developed directly with the aid of the TGT-System is practically never complete; it generates only a subset of the language in question. You may add, change, or remove parts of the grammar, and thus you can easily produce minor variants of the same grammar, one of which may be preferable to the other. In fact this is the normal way to work with the system.</Paragraph> <Paragraph position="1"> At the C.N.U.C.E. installation there were a number of test grammars (German, Italian, English and Spanish) offered to the participants to try out what such testing looks like. The participants of the Conference were invited to look at the Grammar Tester as it works. In the Centro Nazionale Universitario di Calcolo Elettronico the Transformational Grammar Tester was running on an IBM System/360 Model 67 under CP-CMS-67.</Paragraph> </Section> </Paper>