<?xml version="1.0" standalone="yes"?> <Paper uid="T87-1009"> <Title>COMIT ==&gt;* PATR II</Title> <Section position="1" start_page="0" end_page="39" type="abstr"> <SectionTitle> COMIT ==&gt;* PATR II </SectionTitle> <Paragraph position="0"> Here is the history of linguistics in one sentence: once upon a time linguists (i.e. syntacticians) used augmented phrase structure grammars, then they went over to transformational grammars, and then some of them started using augmented phrase structure grammars again, (space for moral). Whilst we are in this careful scholarly mode, let us do the same service for computational linguistics: once upon a time computational linguists (i.e. builders of parsers) used augmented phrase structure grammars, then they went over to augmented transition networks, and then many of them started using augmented phrase structure grammars again, (space for moral). There are people who would have you believe in one or other of these stories (e.g.</Paragraph> <Paragraph position="1"> Chomsky 1983, p65, for the first). And, of course, there is an element of truth in each of them.</Paragraph> <Paragraph position="2"> If an unrestricted rewriting system is an &quot;augmented phrase structure grammar&quot;, then we can say that Chomsky (1951) propounds an augmented phrase structure grammar. 1 Turning to computational linguistics, let us consider two fairly well-known exemplars, one for the old grammatism (COMIT - Yngve 1958) and one for the new (PATR II - Shieber 1984).</Paragraph> <Paragraph position="3"> Both are computer languages, both were designed for computational linguistic purposes, notably the specification of natural language grammars with a view to their use in parsers. 
The two general criteria that Yngve explicitly notes as having motivated the design of COMIT, namely &quot;that the rules be convenient for the linguist -- compact, easy to use, and easy to think in terms of&quot; and &quot;that the rules be flexible and powerful -- that they not only reflect the current linguistic views on what grammar rules are, but also that they be easily adaptable to other linguistic views&quot; (1958, p26) are indistinguishable from two of the three general criteria that motivate the design of PATR II (Shieber 1985, pp194-197) [the third -- computational effectiveness -- may have been too obviously pressing in the late 1950s for Yngve to have thought worth mentioning explicitly].</Paragraph> <Paragraph position="4"> Both have been implemented on a variety of hardware, and substantial grammar fragments have been written in both. 2 Both COMIT and PATR II are, in some sense, and not necessarily the same sense, augmented phrase structure grammar formalisms. In examining the differences between them, it will be convenient to divide the topic into (i) consideration of categories, and (ii) consideration of rules.</Paragraph> <Paragraph position="5"> Looking at the category formalisms first, both formalisms allow categories to have an internal feature structure, but there the resemblance ends. A COMIT category consists of a monadic name (e.g. &quot;NP&quot;), an optional integer &quot;subscript&quot;, and a set containing any number of attribute-value pairs (called &quot;logical subscripts&quot;). Attributes are atomic, but values are sets containing between 0 and 36 atomic members. This is a sophisticated and expressive feature system by contrast to the impoverished phonology-based binary systems that most transformational syntacticians seemed content to assume, though scarcely to use, during the 1960s and 1970s. 
A PATR II category, however, is an arbitrary directed acyclic graph (dag) whose nodes are labeled with atomic names drawn from some finite set. Thus it is easy to see how to translate a set of COMIT categories into a set of PATR II categories: the only minor complication concerns how you choose to encode the COMIT integer subscripts. But translation in the other direction is in general impossible, for all practical purposes, since COMIT logical subscripts do not permit any recursive structure to be built. 3</Paragraph> <Paragraph position="7"> 1 The notation Chomsky used mostly suggests a context sensitive rewriting system which allows null productions (hence type 0 rather than type 1). However, one nonstandard augmentation that is employed throughout the work is the &quot;sometimes&quot; notation, as in the following example from page 30. This remarkable innovation does not seem to have found favor in later work except, perhaps, as the precursor of the &quot;variable rules&quot; that became fashionable in sociolinguistics in the 1970s.</Paragraph> <Paragraph position="8"> 2 For some example COMIT grammars, see Dinneen (1962), Fabry (1963), Satterthwait (1962), Weintraub (1970), and Yngve (1967).</Paragraph> <Paragraph position="9"> Switching our attention now to rules, we observe that both COMIT and PATR II allow one to write rules that say that an expression of category A can consist of an expression of category B followed by an expression of category C. But a COMIT rule is a rewriting rule whose primary concern is that of mapping strings into strings, whereas a PATR II rule is a statement about a permissible structural configuration, a statement that concerns itself with strings almost incidentally. A rule with more than one symbol on the left-hand side makes no sense in the PATR II conception of grammar, but it makes perfectly good sense when the function of a rule is to change one string of categories into another string, as in the COMIT conception. 
COMIT rules give you unrestricted string rewriting, PATR II rules permit concatenation only. Thus COMIT rules cannot, in general, be translated into PATR II rules, and PATR II rules, thanks to the category system employed, cannot, in general, be translated into COMIT rules. COMIT rules are inextricably embedded in a procedural language: the rules are ordered in their application, every rule has an address, every rule ends with a GOTO-on-success, and rules can set and consult global variables in the environment (the &quot;dispatcher&quot;). PATR II rules, by contrast, are order independent, side effect free, and pristinely declarative. Both languages allow the user to manipulate features in rules, but whilst COMIT offers the user a small arsenal of devices -- deletion, complementation, merger -- of which the last-named appears to be the one most used, PATR II offers only unification. But are &quot;merger&quot; and &quot;unification&quot; two names for the same concept? The answer here is no: merge(A,B), where A and B are attribute values (hence sets), is the intersection of A and B if the latter is nonempty, and B otherwise.</Paragraph> <Paragraph position="10"> There is nothing too surprising in any of the foregoing: as one might expect from the chronology, PATR II stands in much the same relation to COMIT as Scheme does to Fortran. If anyone wanted to do COMIT-style computational linguistics in 1987, then they would probably be better off using Icon than they would be using PATR II. What is distinctive about the new grammatism, as canonically illustrated by PATR II (but also exemplified in CUG, DCG, FUG, GPSG, HPSG, JPSG, LFG, UCG, ...) 
is (i) the use of a basically type 2 rule format (single mother, unordered, no explicit context sensitivity) under (ii) a node admissibility rather than a string rewriting interpretation, with (iii) a recursively defined tree or dag based category set, and (iv) unification as the primary operation for combining syntactic information.</Paragraph> <Paragraph position="11"> It would be interesting to learn of any computational linguistic work done in the 1950s or 1960s that exhibits more than one of these characteristics.</Paragraph> </Section> </Paper>