File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/69/c69-2501_metho.xml
Size: 17,143 bytes
Last Modified: 2025-10-06 14:11:05
<?xml version="1.0" standalone="yes"?> <Paper uid="C69-2501"> <Title>Type N Type M Type V IMPLICIT CORRELATOR I * AM l I NI--N2 AM I I I M2--MI SERIOUSLY, HE LEFT I Uv2~ Vl-- - 3 - EXPLICIT CORRELATOR Type E DUCKS IN ATHENS I I I</Title> <Section position="2" start_page="0" end_page="12" type="metho"> <SectionTitle> I READ THE BOOK </SectionTitle> <Paragraph position="0"> ii) past tense b) the sequence of correlational indices (Ic's), that is, the string of potential links that each word-sense has. Each Ic represents a possible syntactic connection between two items and is identified by: i) the code number of the relation it establishes between two items; 2) the 'type' of correlation. There are six different types of correlation which split into two groups:'explicit' correlators and 'implicit' correlators.</Paragraph> <Paragraph position="1"> By 'explicit' correlator we mean a linking element which is represented by a linguistic item; prepositions and conjunctions are explicit correlators; by 'implicit' correlator we mean a relation between two items, which is not expressed by any linguistic item but is indicated by the relative position of the two items (which we call their correlational function).</Paragraph> <Paragraph position="3"> For each type there are different correlational functions which determine the position a word has in a correlation. When two adjacent words have complementary functions of the same Ic - for instance, word A has 5050 N1 and word B has 5050 N2 - a 'product' is made and recorded in the form: Word A 5050 N Word B This product is considered as one piece and can become first or second correlatum in a wider correlation and is therefore treated as though it were a single word, i.e., it is assigned strings of Ic's which indicate its correlational possibilities both with adjacent words and with adjacent products already made. Single words, however, being vocabulary items, have their strings of Ic's assigned a priori; products, since they arise during the procedure, have to be assigned their Ic-strings dynamic&lly. The assignation of specific Ic's to a product depends on: a) the correlator responsible for the particular correlation;</Paragraph> <Paragraph position="5"> makes up the first or the second correlatum.</Paragraph> <Paragraph position="6"> The operational cycle that assigns Ic's to a product we call 'reclassification'.</Paragraph> <Paragraph position="7"> The amount of data involved in an analysis of this kind is really enormous. Let us consider a sentence consisting of ten words, each of which has two different senses (S's). On an average 50 correlational indices are assigned to each sense of a word. Now, just to check the correlational compatibility of two adjacent words about 10,000 matching operations would be necessary; the matching procedure for all the words of the sentence would involve about 90,000 operations. On an average five products would result from the first 10,000 matching operations; each of them would be assigned about 50 correlational indices that represent the product's correlational possibilities to correlate with a another adjacent piece - either a word or a product. The procedure to match these five products with another piece would involve about 637,000 operations. If to this figure~ we add the number of operations necessary starting from level 3 (see p. 7) with all the products made in the immediately preceding levels (200,000), the total number of operations involved would come to 927,000.</Paragraph> <Paragraph position="8"> The reclassification routine also involves a great number of operations of this kind: about half of the correlational indices a product is assigned depends on the corre- 5 lator responsible for that correlation; the other half depends on the strings of indices which the two correlata of the product have. According to the presence or absence of specific indices in the strings of the first or second correlatum, pre-established sets of indices are assigned to the product; or sets of indices are assigned to the product only if they are present in the strings of its two correlata. The reclassification of each product would require about 2,000 operations, which means 100,000 for the average of 50 products in a sentence of i0 words, bringing the total of operations to over a million onl\[ for the matchin~ procedure. This would imply - for this part of the program alone - processing times of the order of some seconds of machine time if the most modern computer is available, or of about an hour - at best - if the work is done with an older model.</Paragraph> <Paragraph position="9"> The amount of work and money involved in a procedure of this kind made us try to find a quicker and more economical way of handling correlational indices: as a result of our efforts the Multistore system was developed.( Bibl. i) The basic idea of the Multistore consists in pre-estab~ lishing in a given area of the machine's central core as many separate positions as there ard correlators in the system. The arrangement of these positions representing the correlators orders them according to type. This assures that at any point of the procedure each Ic is not used sev- 6 eral times and in different ways according to the diverse data it contains, but only as one single item which by its positional coordinate~ implies its various significations.</Paragraph> <Paragraph position="10"> Moreover, the Ic's do not have to be compared one by one with the Ic's of other adjacent words or products, but are simply addressed to one and only one pre-established position. Thus the mass of operations of comparison is avoided and also the necessity to ascertain, after every successful match, which items the matched Ic's represent is eliminated, because the very position of the matched Ic's immediately implies what they stand for. To establish whether two . Ic's are complementaryand represent a correlation thus becomes the simple task of checking already present information according to the rules of sequence, of correlational function, and of correlator type, all of which are implicit in the location of the markers which are being handled.</Paragraph> <Paragraph position="11"> The Multistore can be represented as a rectangular area divided into lines and columns. (see Fig. 1 below)</Paragraph> <Paragraph position="13"> Every column is dedicated to one Ic and subdivided into two subcolumns, if the Ic is of type N, M, or V (implicit); if the Ic is of type E, F, or H (explicit), the column is divided into three subcolumns.</Paragraph> <Paragraph position="14"> The lines LI, L2, L3 etc. divide the area into levels. The levels are determined by the succession of words in input.</Paragraph> <Paragraph position="15"> Thus each level bears the number of the word it represents.</Paragraph> <Paragraph position="16"> Every input word causes for each Ic in its Ic string the insertion of a marker into the Multistore column corresponding to that Ic; and the level of that marker in the Ic column corresponds to the input number and the position of that word in the sentence. Thus all the markers inserted for one word represent the correlational possibilities of that word.</Paragraph> <Paragraph position="17"> If, on the line of level i, an Ic of the string representing correlator type N of the first word has caused the insertion of a marker into the column corresponding to that Ic and if an Ic of string N of the second word has caused the insertion,into the line of level two, of a marker representing CF2 of the same Ic, this implicitly means that with the same correlator a correlation is made containing the first and the second word; this product No x, consisting of word 1 (S 'a\[) and word 2 (S .'b') is the product of correlator No y and type N. This product belongs to level 2 and when it has been assigned a string of Ic's by the appropriate rules of reclassification, it will be inserted into the Multistore on the second level; this means that it can enter into combinations only with those words that belong to the immediately preceding level, or with products which contain the words of the immediately preceding level.</Paragraph> <Paragraph position="18"> Such a correlation, whenever it is made, would still belong to the level of product x. In our specific case product x could correlate only with an item of level zero, which does not exist, because product x is on level two and already contains word No. i. Hence we can formulate a restrictive rule to the effect that a product can be a potential second correlatum in an N correlation only if its lower level is larger than 1. The Multistore system lends itself to the introduction of many such restriction rules.</Paragraph> <Paragraph position="19"> When on a given level all products that have sprung from the insertion of markers corresponding to the word of that level have been reclassified,and the products originating from that reclassification have, in turn, been reclassified and have inserted their markers, and there are no more products to be reclassified, then the procedure inserts the next word and thus begins the next level. This means that once a subsequent word of the sentence has been inserted, all preceding words and products become 'inactive' pieces, having exhausted every possible attempt of correlation with 'active' pieces; from then on they represent merely latent correlational possibilities with subsequent items. This state of inactivity in the case of the last word of the sen- 9 tence determines the end of the analysis. At this point the product (or products) that contains all words of the sentence is called 'complete' and represents the hierarchical structure of the sentence.</Paragraph> <Paragraph position="20"> The first tentative program MP1 (Bibl. i) was written for use on a GE 425 computer and its main purpose was to show the applicability of the Multistore system to correlational grammar and to check the method of programming based on 'significant addresses'.</Paragraph> <Paragraph position="21"> The present program, MP2 (Bibl. 3) , is a revised and enlarged version of MPI written for use on an IBM 360/67 computer. On the basis of our previous experience it can be considered an actual working tool.</Paragraph> <Paragraph position="22"> Many solutions, as well as many restraints, depend on the fact that under many respects it is a machine-oriented program. The program is structured on a large area of the central core, divided into lines and columns, whose size is 528 x 330 bytes. Each line (330 all together) consists of 528 bytes and is divided into two sections: A and B. Section A contains all the data necessary to define a line; section B consists of 496 bytes, that is, of as many bytes as there are correlators operative in the i system. Each line specifies as permanent data a reclassification rule - whose definition is given in section A of the same line - and the set of indices assigned by that rule (bit 6). The relevance of the rule to a given product - l0 is specified in the columns of section B.</Paragraph> <Paragraph position="23"> marker of CF3 (explicit correlator) marker of CF2 (right-hand piece) marker of CFI (left-hand piece) marker for special linguistic rules reclassification rule marker Ic assignation marker Fig. 3 Bits 1 to 4 are therefore used in the matching procedure, whereas bit,5 and 6 are pre-established data to be used in the reclassification routine.</Paragraph> <Paragraph position="24"> - ii - null Each S-value of a word occupies one line of the Multistore area and its specifications are recorded in section A of the same line. For each Ic contained in the string of that S-value of the word, a marker is inserted , according to the correlational function, in bit 1,2 or 3 of the corresponding byte of the line, that is, in the byte which bears that Ic as label.</Paragraph> <Paragraph position="25"> According to its function, a marker can be a left-hand piece 'LH', and as such it~is simply recorded, or a right-hand piece 'RH', in which case, immediately after it has been recorded, the column is searched for a complementary and contiguous LH piece. If this is found, an indication of product is recorded in the first free line of the Multistore; this address consists of three data: a) the address of the line where the LH piece was found, which is recorded in the area 'first correlatum';of the line of the product; b) the address of the line where the RH piece was recorded, which is recorded in the area 'second correlatum' and c) the relative address of the column which characterizes both both LH and RH pieces, which is recorded in the area 'correlator'. null After the product's specificatibns are recorded, and if there are no other LH pieces with which the RH piece in hand can combine, the routine for the insertion of Ic's is resumed. If the insertion of the next Ic of that S'value</Paragraph> <Paragraph position="27"> The position 6 in the Multistore area corresponds to coTrelator No y in the same way as position 7 corresponds to correlator No z, an~ so on,</Paragraph> </Section> <Section position="3" start_page="12" end_page="14" type="metho"> <SectionTitle> INSERTION DIAGRAM </SectionTitle> <Paragraph position="0"> of the word causes a new product to be made, the procedure is repeated and the product is recorded on the next free line of the Multistore area. Only when all the Ic's of the piece which has caused the production have been inserted, the reclassification routine takes place, starting from the first product newly recorded.</Paragraph> <Paragraph position="1"> The information contained in the area 'correlator' of the line containing the product's record gives the address of the Multistore column dedicated to the correlator responsible for that product. The column is then searched , from the top down, for a bit 5 set ON (see Fig. 3 on p.10).</Paragraph> <Paragraph position="2"> If it is found, this implicitly means that on the line to which the bit belongs, there will be found the record of a reclassification rule relevant to the product to be reclassified. Section A of the same line contains the instructions concerning the assignation of the Ic's whose markers are contained in bit 6 (see Fig.3). A bit 6 set ON implicitly indicates either the column in which to check (if the rule requires it) the record of the first or second correlatum(the addresses of which are recorded in section A of the line in which the rule is recorded) for the presence or the absence (as specified by the rule) of bits 1 to 3 set ON (which represent Ic's); or i~ may simply indicate the place in which to insert a marker, i.e. a reclassification Ic, The routine for the insertion of reclassi fication markers is exactly the same as the routine for the insertion of markers for words.</Paragraph> <Paragraph position="4"/> <Paragraph position="6"> Conditioned rule. i Unconditioned rule. # Check on 1st correlatum.</Paragraph> <Paragraph position="7"> Ic 4 CFI. @ Assign--the string CFI contained in the rule to the product.</Paragraph> <Paragraph position="8"> _% Assign the string CF2 contained in the rule to the product.</Paragraph> </Section> <Section position="4" start_page="14" end_page="16" type="metho"> <SectionTitle> RECLASSIFICATION DIAGRAM </SectionTitle> <Paragraph position="0"> The analysis of the sentence is complete when the last marker of the last word-sense has been inserted and there are no further products to be reclassified or re-cycled.</Paragraph> <Paragraph position="1"> At this point the output routine starts. Three different kinds of output are produced: a) a list of all the products made in the course of the analysis of the sentence; b) a list of all Ic's assigned to each product during the reclassification routine ; c) a graphic representation of the hierarchical structure of all 'complete' products (that is, containing all words of the sentence). This structure is equivalent to a tree structure with words at the terminals and correlators at the nodes. (see Appendix) This is a general outline of the procedure of combination, production, reclassification and output. In addition to that there are several routines which meet special requirements. A special rule, for instance prevents specific RH pieces from becoming eligible LII pieces once a certain correlation - which contains them as RH pieces - has been made. A word like &quot;LITTLE&quot;, for instance, in its function as a quantifier, once it has been correlated with the de~ finite article and made the product &quot;THE//LITTLE&quot; cannot become LH piece in the correlation:</Paragraph> <Paragraph position="3"> The indication 'discard' on print-out type 'a' - i.e. on the list of all the products made during the analysis will show that &quot;LITTLE&quot; is no more available as LH piece for any other correlation. * Another restraint concerns some 'complete' products which, though grammatically correct, cannot be accepted as interpretation of the sentence. For instance, in a sentence like:</Paragraph> </Section> class="xml-element"></Paper>