<?xml version="1.0" standalone="yes"?> <Paper uid="J79-1047"> <Title>Association for Computational Linguistics A SURVEY OF SYNTACTIC ANALYSIS PROCEDURES FOR NATURAL LANGUAGE</Title> <Section position="4" start_page="11" end_page="19" type="intro"> <SectionTitle> PROCEDURES </SectionTitle> <Paragraph position="0"> We can impose several rough groupings on the set of parsers in order to structure the following survey. To begin with, we may try to separate those systems developed with some reference to transformational theory from the nontransformational systems.</Paragraph> <Paragraph position="1"> This turns out also to be an approximate historical division, since most systems written since 1965 have made some connection with transformational theory, even though their methods of analysis may be only distantly related to transformational mechanisms.</Paragraph> <Paragraph position="2"> The transformational systems may in turn be divided into those parsers which have been systematically derived from a specific transformational generative grammar and those which have &quot;sacrificed&quot; this direct connection with a generative grammar in order to obtain a more direct and efficient algorithm for recovering base structures. This division appears to be in part a result of our inadequate theoretical understanding of transformational grammars, and may be reduced by some recent theoretical work on transformational grammars.</Paragraph> <Section position="1" start_page="11" end_page="11" type="sub_section"> <SectionTitle> 2.1 Early Systems: Context-Free and Context-Sensitive Parsers </SectionTitle> <Paragraph position="0"> The pretransformational systems, developed mostly between 1959 and 1965, were, with a few exceptions, parsers for context-free languages, although cloaked in a number of different guises. These systems were based on immediate constituent analysis, dependency theory, linguistic string theory, or sometimes no theory at all.</Paragraph> <Paragraph position="1"> The largest and probably the most important of these early projects was the Harvard Predictive Analyzer [Kuno 1962]. A predictive analyzer is a top-down parser for context-free grammars written in Greibach normal form; this formulation of the grammar was adopted from earlier work by Ida Rhodes for her Russian-English translation project. The size of the grammar was staggering: a 1963 report [Kuno 1963] quotes figures of 133 word classes and about 2100 productions. Even with a grammar of this size, the system did not incorporate simple agreement restrictions of English syntax. Since the program was designed to produce parses for sentences which were presumed to be grammatical (and not to differentiate between grammatical and nongrammatical sentences), it was at first hoped that it could operate without these restrictions. It was soon discovered, however, that these restrictions were required to eliminate invalid analyses of grammatical sentences. Because the direct inclusion of, say, subject-verb number agreement would cause a large increase in an already very large grammar, the Harvard group chose instead to include a special mechanism in the parsing program to perform a rudimentary check on number agreement. Thus the Harvard Predictive Analyzer, though probably the most successful of the context-free analyzers, clearly indicated the inadequacy of a context-free formulation of natural language grammar.</Paragraph>
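To make predictive analysis concrete, here is a minimal sketch in Python, assuming an invented toy grammar in Greibach normal form (every production is one word class followed by zero or more predicted nonterminals); the grammar, word classes, and function names are illustrative, not the Harvard implementation:

    # Toy predictive analysis: the parser keeps a stack of predicted
    # nonterminals and tries to satisfy the first prediction with each
    # production whose leading word class matches the next word.
    GRAMMAR = {
        "S":  [("noun", ["VP"])],                # S  -> noun VP
        "VP": [("verb", ["NP"]), ("verb", [])],  # VP -> verb NP | verb
        "NP": [("noun", [])],                    # NP -> noun
    }

    def analyses(predictions, words):
        """Yield one result per way of matching all predictions."""
        if not predictions:
            if not words:                # predictions and words both used up
                yield []
            return
        if not words:
            return
        goal, rest = predictions[0], predictions[1:]
        for word_class, predicted in GRAMMAR.get(goal, []):
            if words[0] == word_class:   # terminal matches next word class
                for tail in analyses(predicted + rest, words[1:]):
                    yield [(goal, word_class)] + tail

    print(list(analyses(["S"], ["noun", "verb", "noun"])))

Backing up exhaustively through every such alternative is what made the all-analyses version of the program, described next, exponential in sentence length.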
<Paragraph position="2"> The Harvard Predictive Analyzer parsing algorithm progressed through several stages. The first version of the predictive analyzer produced only one analysis of a sentence. The next version introduced an automatic backup mechanism in order to produce all analyses of a sentence. This is an exponential time algorithm, hence very slow for long sentences; a 1962 report gives typical times as 1 minute for an 18 word sentence and 12 minutes for a 35 word sentence. An improvement of more than an order of magnitude was obtained in the final version of the program by using a bit matrix for a path-elimination technique [Kuno 1965]. When an attempt was made to match a nonterminal symbol to the sentence beginning at a particular word and no match was found, the corresponding bit was turned on; if the same symbol came up again later in the parsing at the same point in the sentence, the program would not have to try to match it again.</Paragraph>
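A hedged sketch of this path-elimination technique, reusing the toy GRAMMAR above (the details of Kuno's bit-matrix implementation differ): a set of (nonterminal, position) pairs plays the role of the bit matrix, and a bit is turned on whenever a nonterminal fails to match at a position, so the failure is never recomputed:

    # failed holds one "bit" per (nonterminal, word position) pair that
    # is known not to match the sentence starting at that position.
    failed = set()

    def spans(goal, words, pos):
        """Return every position where a parse of goal starting at pos
        could end; record (goal, pos) in failed if there is none."""
        if (goal, pos) in failed:
            return []                        # bit already on: skip the work
        ends = []
        for word_class, predicted in GRAMMAR.get(goal, []):
            if pos < len(words) and words[pos] == word_class:
                frontier = [pos + 1]         # positions reached so far
                for sym in predicted:        # extend through each prediction
                    frontier = [e for p in frontier
                                  for e in spans(sym, words, p)]
                ends.extend(frontier)
        if not ends:
            failed.add((goal, pos))          # turn the bit on
        return ends

    print(spans("S", ["noun", "verb", "noun"], 0))  # [3, 2]: "S" can end
                                                    # after word 3 or word 2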
<Paragraph position="3"> Another important early parser was the immediate constituent analyzer used at RAND. This system used a grammar in Chomsky normal form and a parsing algorithm designed by John Cocke, which produced all analyses bottom-up in a single left-to-right scan of the sentence [Hays 1967]. This was a fast algorithm, but because all parses were developed simultaneously it needed a lot of space for long sentences; the RAND system appears therefore to have been limited to sentences of about 30 words.</Paragraph>
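A compact sketch of Cocke's bottom-up scheme (essentially what is now called the Cocke-Kasami-Younger algorithm), with an invented Chomsky-normal-form toy grammar; all analyses of all spans develop in parallel in a chart, which is why the space demands grow with sentence length:

    # Every rule is A -> B C or A -> word class.  chart[(i, j)] holds
    # the nonterminals that can cover words i..j-1.
    UNARY  = {"noun": {"NP"}, "verb": {"V"}}             # A -> terminal
    BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}  # A -> B C

    def recognize(words):
        n = len(words)
        chart = {(i, i + 1): set(UNARY.get(w, set()))
                 for i, w in enumerate(words)}
        for span in range(2, n + 1):              # widen spans gradually
            for i in range(n - span + 1):
                j = i + span
                cell = chart.setdefault((i, j), set())
                for k in range(i + 1, j):         # every split point
                    for b in chart.get((i, k), ()):
                        for c in chart.get((k, j), ()):
                            cell |= BINARY.get((b, c), set())
        return "S" in chart[(0, n)]

    print(recognize(["noun", "verb", "noun"]))    # True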
<Paragraph position="4"> A different bottom-up analysis procedure was used in the first linguistic string analysis program developed at the University of Pennsylvania [Harris 1965]. This procedure, called a cycling cancelling automaton, makes a series of left-to-right passes through the sentence; in each pass one type of reduction was performed. The string parser recognized two classes of strings: first order, not containing verb-object, and second order, containing verb-object; the reduction of the sentences was correspondingly done in two stages. In addition to these reductions, which corresponded to context-free rules, the parsing program also included some syntactic restrictions which were checked when second order strings were reduced.</Paragraph> <Paragraph position="5"> A system incorporating this cycling automaton scheme was later used by Bross at Roswell Park for the analysis of medical reports [Bross 1968, Shapiro 1971].</Paragraph> <Paragraph position="6"> As far as we know, only one major parsing system has been developed using a context-sensitive phrase structure grammar. This was DEACON, Direct English Access and Control, which was designed as a natural language interface to a command, control, and information retrieval system for the Army and was developed at General Electric [Craig 1966]. DEACON was one of the first systems to provide flexible, systematic interaction between the parser and the semantic component. Associated with each production in the grammar was a semantic rule. These rules operated on a ring-structured data base and had the functions of locating, adding, and changing information in the data base.</Paragraph> <Paragraph position="7"> The parsing was done bottom-up, developing all analyses of the sentence in parallel. As each reduction was performed, the associated semantic rule was invoked. In the case of a query, the sequence of rules associated with the correct analysis was supposed to locate the desired answer in the data base. In some cases a rule could not be applied to the data base (e.g., a particular relation between two items did not exist); the rule then returned a failure signal to the parser, indicating that the analysis was semantically anomalous, and this analysis was aborted.</Paragraph> <Paragraph position="8"> Woods has noted [Woods 1970a] that the parser used in the DEACON project may produce redundant parses, and has given a parsing algorithm for context-sensitive languages which remedies this deficiency.</Paragraph> <Paragraph position="9"> 2.2 Transformational Analyzers: First Systems
When the theory of transformational grammar was elaborated in the early 1960s there was considerable interest in finding a corresponding recognition procedure. Because the grammar is stated in a generative form, however, this is no simple matter. A (Chomsky) tree transformational grammar consists of a set of context-sensitive phrase structure rules, which generate a set of base trees, and a set of transformations, which act on base trees to produce the surface trees. A (Harris) string transformational grammar consists of a finite set of sequences of word categories, called kernel sentences, and a set of transformations which combine and modify these kernel sentences to generate the other sentences of the language. There are at least three basic problems in reversing the generative process: (1) for a tree transformational grammar, assigning to a given sentence a set of parse trees which includes all the surface trees which would be assigned by the transformational grammar; (2) given a tree not in the base, determining which sequences of transformations might have applied to generate this tree; (3) having decided on a transformation whose result may be the present tree, undoing this transformation. If we attack each of these problems in the most straightforward manner, we are likely to try many false paths which will not lead to an analysis. For the first problem, we could use a context-free grammar which will give all the surface trees assigned by the transformational grammar, and probably lots more. The superabundance of &quot;false&quot; surface trees is aggravated by the fact that most English words have more than one word category (play more than one syntactic role), although normally only one is used in any given sentence. For the second and third problems, we can construct a set of reverse transformations; however, since we are probably unable to determine uniquely in advance the transformations which produced a given tree, we will have to try many sequences of reverse transformations which will not yield a base tree.</Paragraph> <Paragraph position="10"> Because of these problems, the earliest recognition procedure, suggested by Matthews, was based on the idea of synthesizing trees to match a given sentence. Although some checks were to have been made against the sentence during the generation procedure, it was still an inherently very inefficient procedure and was never implemented. Two major systems were developed in the mid-60's, however, which did have limited success: the system of Zwicky et al. at MITRE and that of Petrick.</Paragraph> <Paragraph position="11"> The transformational generative grammar from which the MITRE group worked had a base component with about 275 rules and a set of 54 transformations [Zwicky 1965]. For the recognition procedure they developed manually a context-free &quot;covering&quot; grammar with about 550 productions to produce the surface trees and a set of 134 reverse transformational rules. Their recognition procedure had four phases: (1) analysis of the sentence using the context-free covering grammar (with a bottom-up parser); (2) application of the reverse transformational rules; (3) for each candidate base tree produced by steps (1) and (2), a check whether it can in fact be generated by the base component; (4) for each base tree and sequence of transformations which passes the test in step (3), the (forward) transformations are applied to verify that the original sentence can in fact be generated. (The final check in step (4) is required because the covering grammar may lead to spurious matches of a transformation to the sentence in the reverse transformational process, and because the reverse transformations may not incorporate all the constraints included in the forward transformations.) The covering grammar produced a large number of spurious surface analyses which the parser must process. The 1965 report, for example, cites a 12 word sentence which produced 48 parses with the covering grammar; each must be followed through steps (2) and (3) before most can be eliminated. The system was therefore very slow; 36 minutes were required to analyze one 11 word sentence.</Paragraph> <Paragraph position="12"> Two measures were taken by the MITRE group to speed up the program: &quot;super-trees&quot; and rejection rules [Walker 1966]. &quot;Super-trees&quot; was the MITRE term for a nodal span representation, in which several parse trees were represented in a single structure. They intended to apply the reverse transformations to these super-trees, thus processing several possible surface trees simultaneously; it is not clear if they succeeded in implementing this idea. Rejection rules were tests which were applied to the tree during the reverse transformational process (step (2) above), in order to eliminate some trees as early as possible in the parsing. The rejection rules incorporated some constraints which previously were only in the forward transformational component, and so eliminated some trees in step (2) which before had survived to step (4). The rejection rules had a significant effect on parsing times: the 11 word sentence which took 36 minutes before now took only 6.</Paragraph> <Paragraph position="13"> The system developed by Petrick [Petrick 1965, 1966; Keyser 1967] is similar in outline: applying a series of reverse transformations, checking if the resulting tree can be generated by the base component, and then verifying the analysis by applying the forward transformations to the base tree. There are, however, several differences from the MITRE system, motivated by the desire to have a parser which could be produced automatically from the generative formulation of the grammar. Petrick devised a procedure to generate, from the base component and transformations, an enlarged context-free grammar sufficient to analyze the surface sentence structures. He also automatically converted a set of forward transformations meeting certain conditions into pseudo-inverse (reverse) transformations.</Paragraph>
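To make the notion concrete, here is a small invented sketch (not the MITRE or Petrick formalism) of a reverse transformation as a tree rewrite: a pattern with variables is matched against the current tree, and a template rebuilds the presumed earlier tree. Undoing a toy passive is shown; the tree shapes and labels are made up for illustration:

    # Trees are (label, child, ...) tuples; "?x" strings are variables.
    # Reverse passive:  (S obj (VP was verb (PP by subj)))
    #               ->  (S subj (VP verb obj))
    PATTERN = ("S", "?obj", ("VP", "was", "?verb", ("PP", "by", "?subj")))
    REBUILD = ("S", "?subj", ("VP", "?verb", "?obj"))

    def match(pattern, tree, bindings):
        if isinstance(pattern, str):
            if pattern.startswith("?"):
                bindings[pattern] = tree      # bind variable to subtree
                return True
            return pattern == tree            # literal node label
        return (isinstance(tree, tuple) and len(pattern) == len(tree)
                and all(match(p, t, bindings)
                        for p, t in zip(pattern, tree)))

    def build(template, bindings):
        if isinstance(template, str):
            return bindings.get(template, template)
        return tuple(build(part, bindings) for part in template)

    surface = ("S", "cake", ("VP", "was", "eaten", ("PP", "by", "Mary")))
    env = {}
    if match(PATTERN, surface, env):
        print(build(REBUILD, env))  # ('S', 'Mary', ('VP', 'eaten', 'cake'))

When several such patterns match a tree, nothing in the match itself says which transformation actually applied in generation; that uncertainty is the source of the exponential search described below.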
<Paragraph position="14"> His parsing procedure also differed from the MITRE algorithm in the way in which the reverse transformations are applied. In the MITRE program reverse transformations operated on a sentence tree, just like forward transformations in a Chomsky grammar. Petrick, on the other hand, did not construct a surface tree in the analysis phase; when a particular reverse transformation came up for consideration, he built just enough structure above the sentence (using the enlarged context-free grammar) to determine if the transformation was applicable. If it was, the transformation was applied and the structure above the sentence then torn down again; what was passed from one reverse transformation to the next was only the string of word categories. In the verifying phase, of course, Petrick had to follow the rules of Chomsky grammar and apply the forward transformations to a sentence tree.</Paragraph> <Paragraph position="15"> The price for generality was paid in efficiency. Petrick's problems were more severe than MITRE's for two reasons. First, the absence of a sentence tree during the application of the reverse transformational rules meant that many sequences of reverse transformations were tried which did not correspond to any sequence of tree transformations and hence would eventually be rejected. Second, if several reverse transformations could apply at some point in the analysis, the procedure could not tell in advance which would lead to a valid deep structure. Consequently, each one had to be tried and the resulting structure followed to a deep structure or a &quot;dead end&quot; (where no more transformations apply). This produces a growth in the number of analysis paths which is exponential in the number of reverse transformations applied. This explosion can be avoided only if the reverse transformations include tests of the current analysis tree to determine which transformations applied to generate this tree. Such tests were included in the manually prepared reverse transformations of the MITRE group, but it would have been far too complicated for Petrick to produce such tests automatically when inverting the transformations.</Paragraph> <Paragraph position="16"> Petrick's system has been significantly revised over the past decade [Petrick 1973, Plath 1974a]. In the current system the covering grammar and reverse transformations are both prepared manually. The transformational decomposition process works on a tree (as did MITRE's), and considerable flexibility has been provided in stating the transformations and the conditions of applicability. The transformations and conditions may be stated either in the traditional form (used by linguists) or in terms of elementary operations combined in LISP procedures. The resulting system is fast enough to be used in an information retrieval system with a grammar of moderate size; most requests are processed in less than one minute.</Paragraph> </Section>
<Section position="2" start_page="11" end_page="19" type="sub_section"> <SectionTitle> 2.3 Transformational Analyzers: Subsequent Developments </SectionTitle> <Paragraph position="0"> One result of the early transformational systems was a recognition of the importance of finding an efficient parsing procedure if transformational analysis was ever to be a useful technique. As the systems indicated, there are two main obstacles to an efficient procedure. First, there is the problem of refining the surface analysis, so that each sentence produces fewer trees for which transformational decomposition must be attempted. This has generally been approached by using a more powerful mechanism than a context-free parser for the surface analysis. Second, there is the problem of determining the base structure (or kernel sentences) from the surface structure in a relatively direct fashion. This has generally been done by associating particular rules for building the deep structure with rules of the surface structure analysis. The approach here has generally been ad hoc, developing a reverse mapping without explicit reference to a corresponding set of forward transformations.</Paragraph> <Paragraph position="1"> Several groups which have played a significant role in the development of current parsing systems have been tied together by their common use of recursive transition networks. Although their use of these transition networks is not central to their basic contribution, it is frequently referred to and so deserves a few words of explanation. A transition network is a set of nodes (including one initial and at least one terminal node) and a set of directed arcs between the nodes, labeled with symbols from the language; it is a standard representation for regular languages. A recursive transition network is a set of transition networks in which the arcs of one network may also be labeled with the names of other networks; it is a form of representation of context-free languages. In contrast to the usual context-free phrase structure grammars, this is equivalent to allowing regular expressions in place of finite sequences of elements in productions. This does not increase the weak generative capacity of the grammars, but allows nonrecursive formulations for otherwise recursive constructions.</Paragraph>
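A minimal recursive transition network recognizer (an invented illustration, not any one of the systems surveyed): each network is a list of arcs, and an arc whose label names another network invokes that network recursively:

    # NETWORKS maps a network name to arcs (from_state, label, to_state).
    # A label naming a network causes a recursive call; any other label
    # must match the next word class.  "end" is the terminal state.
    NETWORKS = {
        "S":  [("start", "NP", "mid"), ("mid", "verb", "end")],
        "NP": [("start", "det", "n"), ("n", "noun", "end"),
               ("start", "noun", "end")],
    }

    def walk(net, state, words, pos):
        """Yield every word position reachable at this net's end state."""
        if state == "end":
            yield pos
            return
        for src, label, dst in NETWORKS[net]:
            if src != state:
                continue
            if label in NETWORKS:                  # recursive arc
                for p in walk(label, "start", words, pos):
                    yield from walk(net, dst, words, p)
            elif pos < len(words) and words[pos] == label:
                yield from walk(net, dst, words, pos + 1)

    words = ["det", "noun", "verb"]
    print(any(p == len(words) for p in walk("S", "start", words, 0)))  # True

Note how the two arcs out of the NP network's start state express an optional determiner the way a regular expression would, without an extra recursive rule.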
<Paragraph position="2"> The first system using such a network was developed by Thorne, Bratley, and Dewar at Edinburgh [Thorne 1968, Dewar 1969]. They started with a regular base grammar, i.e., a transition network. The importance of using a regular base lies in their claim that some transformations are equivalent in effect to changing the base to a recursive transition network. Transformations which could not be handled in this fashion, such as conjunction, were incorporated into the parsing program. Parsing a sentence with this surface grammar should then also give some indication of the associated base and transformational structure. Their published papers do not describe, however, the process by which the surface grammar is constructed, and so it is not clear just how the transformational and base structure is extracted from their parse.</Paragraph> <Paragraph position="3"> The recursive transition network was developed into an augmented recursive transition network grammar in the system of Bobrow and Fraser [Bobrow 1969]. An augmented network is one in which an arbitrary predicate, written in some general purpose language (in this case, LISP), may be associated with each arc in the network. A transition in the network is not allowed if the predicate associated with the arc fails. These predicates perform two functions in the grammar. First, they are used to incorporate restrictions in the language which would be difficult or impossible to state within the context-free mechanisms of the recursive network, e.g., agreement restrictions. Second, they are used to construct the deep structure tree as the sentence is being parsed.</Paragraph> <Paragraph position="4"> The augmented transition network was further developed by Woods at Bolt Beranek and Newman. In order to regularize the predicates, he introduced a standard set of operations for building and testing the deep structure [Woods 1970b]. He considerably enlarged the scope of the grammar and added a semantic component for translating the deep structure into information retrieval commands. With these additions, the system served as a moderately successful natural language input interface to a retrieval system for data about moon rocks [Woods 1972, 1973]. The augmented transition network, and in particular the formalism developed by Woods, has proven to be an effective instrument for constructing natural language front-ends which is relatively simple to implement and use; it is probably the most widely used procedure today.</Paragraph>
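A hedged sketch of the augmentation (an invented example, not the Woods or Bobrow-Fraser formalism): each arc carries a predicate that can block the transition, here a number-agreement check, and an action that fills registers from which a deep structure could later be built:

    # Arcs: (state, category, predicate, action, next_state).  Words are
    # (category, number) pairs; regs is the register bank the actions
    # fill in as the sentence is parsed.
    def agrees(word, regs):                   # predicate: number agreement
        return regs["subj_num"] in (None, word[1])

    ARCS = [
        ("start", "noun", lambda w, r: True,
         lambda w, r: r.update(subj=w, subj_num=w[1]), "have_subj"),
        ("have_subj", "verb", agrees,
         lambda w, r: r.update(verb=w), "end"),
    ]

    def parse(words):
        state = "start"
        regs = {"subj": None, "subj_num": None, "verb": None}
        for word in words:
            for src, cat, pred, act, dst in ARCS:
                if src == state and word[0] == cat and pred(word, regs):
                    act(word, regs)           # transition allowed
                    state = dst
                    break
            else:
                return None                   # category or predicate failed
        return regs if state == "end" else None

    print(parse([("noun", "sing"), ("verb", "sing")]))  # registers filled
    print(parse([("noun", "sing"), ("verb", "plur")]))  # None: no agreement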
<Paragraph position="5"> Like several of the systems described above, Proto-RELADES, developed at IBM Cambridge [Culicover 1969], tried to obtain an efficient transformational decomposition algorithm by linking the rules for building the deep structure to the productions of the surface grammar. Their surface grammar was also augmented by restrictions (in PL/I this time). However, their system differed from those mentioned earlier in several important respects. First, the surface grammar allowed context-sensitive as well as context-free rules. Second, the rules which built the deep structure during the parse were in the form of reverse transformations acting on an (incomplete) sentence tree (in contrast to the rules used by Woods, for example, which first put words into registers labeled &quot;subject&quot;, &quot;verb&quot;, and &quot;object&quot; and later built a tree out of them). Proto-RELADES was tested as a restricted English language preprocessor for a library card catalog retrieval system [Loveman 1971].</Paragraph> <Paragraph position="6"> One drawback of these procedures was the relatively ad hoc methods, from a linguistic point of view, used to construct the surface grammars and to tie them in to the appropriate reverse transformations. A more principled approach to transformational decomposition was proposed by Joshi and Hiz [Joshi and Hiz 1962, Hiz 1967].</Paragraph> <Paragraph position="7"> In contrast to the systems described above, their procedure was based on Harris' string transformational grammar.</Paragraph> <Paragraph position="8"> One advantage of the Harrisian theory over that of Chomsky is the theoretical basis it provides for the segmentation of the sentence into &quot;linguistic strings&quot; (Chomsky's theory, in contrast, makes no general assertions about the surface structure of sentences). The procedure of Joshi and Hiz was predicated on the claim that, from an analysis of the sentence into linguistic strings, one could directly determine the transformations which acted to produce the sentence, without having to try many sequences of reverse transformations. Their proposed system therefore consisted of a procedure for linguistic string analysis (a context-free parsing problem at the level of simplification of their original proposal) and a set of rules which constructed from each string a corresponding kernel-like sentence.</Paragraph> <Paragraph position="9"> Their original proposal was a simplified scheme which accounted for only a limited set of transformations. It has been followed by a good deal of theoretical work on adjunct grammars and trace conditions [Joshi 1973] which has laid a formal basis for their procedures. These studies indicate how it may be possible, starting from a transformational grammar not specifically oriented towards recognition, to determine the features of a sentence which indicate that a particular transformation applied in generating it, and hence to produce an efficient analysis procedure.</Paragraph> <Paragraph position="10"> Another group which has used linguistic string analysis is the Linguistic String Project at New York University, led by Sager [Sager 1967, 1973; Grishman 1973a, 1973b]. Their system, which has gone through several versions since 1965, is based on a context-free grammar augmented with restrictions. Because they were concerned with processing scientific text, rather than commands or queries, they were led to develop a grammar of particularly broad coverage. The present grammar has about 250 context-free rules and about 200 restrictions; although not as swift as some of the smaller systems, the parser is able to analyze most sentences in less than one minute. Because of the large size of their grammar, this group has been particularly concerned with techniques for organizing and specifying the grammar which will facilitate further development. In particular, the most recent implementation of their system has added a special language designed for the economical and perspicuous statement of the restrictions [Sager 1975].</Paragraph> <Paragraph position="11"> One of the earlier versions of this system, with a much more restricted grammar, was used as the front end for an information retrieval system developed by Cautin at the University of Pennsylvania [Cautin 1969].</Paragraph> <Paragraph position="12"> The Linguistic String Project system has recently been extended to include a transformational decomposition phase; this phase follows the linguistic string analysis [Hobbs 1975]. As in the case of the Joshi-Hiz parser, the strings identified in the sentence generally indicate which reverse transformations must be applied. The transformations are written in an extension of the language which was used for writing the restrictions.</Paragraph> <Paragraph position="13"> The systems of Woods, Petrick, and Sager exhibit a range of approaches to the problem of transformational decomposition. Their parsing procedures are similar in many respects: they have a context-free grammar as the framework for their surface analysis, and they use procedures both to express grammatical constraints and to effect the reverse transformations. Petrick's system differs from the others in two primary respects. First, the restrictions on the context-free grammar are imposed by filtering transformations which act early in the transformational phase to reject ill-formed trees, rather than by procedures operating during the surface analysis. This would seem to be disadvantageous from the point of view of efficiency, since erroneous parses which might be aborted at the beginning of the surface analysis must be followed through the entire surface analysis and part of the transformational decomposition. Second, the transformations are not associated with particular productions of the surface grammar, but rather with particular patterns in the tree (&quot;structural descriptions&quot;), so pattern matching operations are required to determine which transformations to apply.
These differences reflect Petrick's desire to remain as close as is practical to the formalism of transformational linguistics.</Paragraph> <Paragraph position="14"> The primary distinction of the Woods system is that the deep structure tree is built during the surface analysis. Consequently, his &quot;transformational&quot; procedures consist of tree building rather than tree transforming operations. The tradeoffs between this approach and the two-stage analyzers of Petrick and Sager are difficult to weigh at this time. They are part of the more general problem of parallel vs. serial processing: e.g., should semantic analysis be done concurrently with syntactic analysis? Parallel processing is preferred if the added time required by the deeper analysis is outweighed by the fraction of incorrect analyses which can be eliminated early in the parsing process. In the case of semantic analysis, it clearly depends on the relative complexity of the syntactic and semantic components. In the case of transformational analysis, it depends on the fraction of grammatical and selectional constraints which can be expressed at the surface level (if most of these can only be realized through transformational analysis, concurrent transformational analysis is probably more efficient). This may depend in turn on the type of surface analysis; for example, the relationships exhibited by linguistic string analysis are suitable for expressing many of these constraints, so there is less motivation in the Linguistic String Project system for concurrent transformational decomposition.</Paragraph> </Section> <Section position="3" start_page="19" end_page="19" type="sub_section"> <SectionTitle> 2.4 Other Syntactic Analysis Procedures </SectionTitle> <Paragraph position="0"> The system developed by Winograd at M.I.T. [Winograd 1971] for accepting English commands and questions about a &quot;block world&quot; also uses a context-free grammar augmented by restrictions. Winograd's context-free grammar was encoded as a set of procedures instead of a data structure to be interpreted, but this is not a material difference. His grammar is based on Halliday's &quot;systemic grammar&quot; to the extent that it extracts from a sentence the set of features described by Halliday; however, Halliday's grammar (at least in its present stage of development) is essentially descriptive rather than generative, so most of the detailed grammatical structure had to be supplied by Winograd. His parser does not construct a deep structure; rather, it builds semantic structures directly during parsing. The primary distinctive feature of his system is the integration of the syntactic component with semantics and pragmatics (the manipulation of objects in the block world); his parser is thus able to use not only syntactic constraints but also semantic and pragmatic information in selecting a proper sentence analysis. With regard to the serial vs. parallel distinction drawn in the previous section, his system would be characterized as highly parallel.</Paragraph> <Paragraph position="2"> A number of natural language systems have used grammars composed of unrestricted phrase-structure rewriting rules.</Paragraph> <Paragraph position="3"> Since unrestricted rewriting rules, like transformational grammars, can be used to define any recursively enumerable language, they may be sufficient for analyzing both surface and deep structure. As with transformational grammars, it will in practice be necessary to impose some constraint (such as ordering) on the rules, so that the language defined is recursive; otherwise a parser will never be able to determine whether some sentences are grammatical or not.</Paragraph>
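A small invented sketch of the idea (not Kay's or the Q-system machinery): the rewriting rules are run in reverse as ordered reductions over the word-class string, and the input is accepted if it reduces to the start symbol. The rule ordering is what keeps this toy process terminating and deterministic:

    # Each rule rewrites its right-hand side back to its left-hand side.
    # Rules are tried in order, leftmost match first, until none applies.
    RULES = [
        ("NP", ["det", "noun"]),        # NP -> det noun
        ("NP", ["noun"]),               # NP -> noun
        ("S",  ["NP", "verb", "NP"]),   # S  -> NP verb NP
    ]

    def reduce_once(symbols):
        for lhs, rhs in RULES:                      # rule ordering
            for i in range(len(symbols) - len(rhs) + 1):
                if symbols[i:i + len(rhs)] == rhs:  # leftmost match
                    return symbols[:i] + [lhs] + symbols[i + len(rhs):]
        return None                                 # no rule applies: stop

    def accepts(symbols):
        while symbols != ["S"]:
            symbols = reduce_once(symbols)
            if symbols is None:
                return False
        return True

    print(accepts(["det", "noun", "verb", "noun"]))  # True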
<Paragraph position="4"> One parser for unrestricted rewriting rules was described by Kay [Kay 1967]. This parser included a number of mechanisms for restricting the application of rules, such as rule ordering, specifying part of the structure dominated by one element of the rule, or requiring the equality of the structures dominated by two elements. These mechanisms do not increase the generative power of the grammars, but are designed to make grammars easier to write. Kay described how his parser could be used to effect some reverse transformations.</Paragraph> <Paragraph position="5"> Kay's parser was incorporated into a system called REL (Rapidly Extensible Language) developed by Thompson, Dostert, et al. at the California Institute of Technology [Thompson 1969, Dostert 1971]. Kay's original parser was augmented by allowing a set of binary features to be associated with each node, including feature tests as part of the rewrite rules, and permitting more general restrictions where the features were inadequate. The REL system was designed to support a number of grammars, each interfaced to its own data base. One of these is REL English, which analyzes a subset of English into a set of subject-verb-object-time modifier deep structures; this grammar has 239 rules. In support of the use of general rewrite rules with features, they note that only 29 of the 239 rules required constraints which could not be conveniently stated in terms of feature tests. This is also a factor in efficiency, since binary feature tests can be performed very quickly.</Paragraph> <Paragraph position="6"> Another system which uses unrestricted rewriting rules with optional conditions on the elements is the &quot;Q&quot; system developed by Colmerauer [Colmerauer 1970]. This system is presently being used in a machine translation project at the University of Montreal [Kittredge 1973].</Paragraph> <Paragraph position="7"> Colmerauer and de Chastellier [de Chastellier 1969] have also investigated the possibility of using Wijngaarden grammars (as were developed for specifying ALGOL 68) for transformational decomposition and machine translation. Like unrestricted rewriting rules, W-grammars can define every recursively enumerable language, and so can perform the functions of the surface and reverse transformational components. They show how portions of transformational grammars of English and French may be rewritten as W-grammars, with the pseudo-rules in the W-grammar taking the place of the transformations.</Paragraph> </Section> <Section position="4" start_page="19" end_page="19" type="sub_section"> <SectionTitle> 2.5 Parsing with Probability and Graded Acceptability </SectionTitle> <Paragraph position="0"> In all the systems described above, a sharp line was drawn between correct and incorrect parses: a terminal node either did or did not match the next word in the sentence; an analysis of a phrase was either acceptable or unacceptable. There are circumstances under which we would want to relax these requirements. For one thing, in analyzing connected speech, the segmentation and identification of words can never be done with complete certainty.
At best, one can say that a certain sound has some probability of being one phoneme and some other probability of being another phoneme; some expected phonemes may be lost entirely in the sound received. Consequently, one will associate some number with each terminal node, indicating the probability or quality of match; nonterminal nodes will be assigned some value based on the values of the terminal nodes beneath. Another circumstance arises in natural language systems which are sophisticated enough to realize that syntactic and semantic restrictions are rarely all-or-nothing affairs, and that some restrictions are stronger than others. For example, the nominative-accusative distinction has become quite weak for relative pronouns (?The man who I met yesterday.) but remains strong for personal pronouns (*The man whom me met yesterday.). As a result, a parser which wants to get the best analysis even if every analysis violates some constraint must associate a measure of grammaticality or acceptability with the analyses of portions of the sentence, and ultimately with the analyses of the entire sentence.</Paragraph> <Paragraph position="1"> In principle, one could generate every sentence analysis with a nonzero acceptability or probability of match, and then select the best analysis obtained. Hobbs [1974] has described a modification to the bottom-up nodal spans parsing algorithm which uses this approach. Wilks [1975] uses an essentially similar technique in his language analyzer based on &quot;preference semantics&quot;. A more efficient approach, called &quot;best-first&quot; parsing, has been developed by Paxton and Robinson of the Stanford Research Institute as part of a speech understanding system [Paxton 1973]. Their procedure involves a modification of the standard top-down serial parsing algorithm for context-free grammars. The standard algorithm generates one possible parse tree until it gets stuck (generates a terminal node which does not match the next sentence word); it then &quot;backs up&quot; to try another alternative. The best-first procedure instead tries all alternatives in parallel. A measure is associated with each alternative path, indicating the likelihood that this analysis matches the sentence processed so far and that it can be extended to a complete sentence analysis. At each moment, the path with the highest likelihood is extended; if its measure falls below that of another path, the parser shifts its attention to that path.</Paragraph>
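A hedged sketch of best-first parsing (invented grammar and scoring, not the SRI procedure): a priority queue holds the alternative paths, each scored by a penalty for the quality of its matches so far, and the parser always extends the path with the lowest penalty:

    import heapq

    # A path is (penalty, predictions, position); lower penalty = more
    # likely.  PENALTY scores a predicted word class against a heard
    # word, standing in for the uncertain match scores of speech input.
    GRAMMAR = {"S": [["NP", "verb"]], "NP": [["det", "noun"], ["noun"]]}
    PENALTY = {("noun", "noun"): 0.0, ("det", "det"): 0.0,
               ("verb", "verb"): 0.0, ("noun", "verb"): 2.0}

    def best_first(words):
        heap = [(0.0, ["S"], 0)]
        while heap:
            cost, preds, pos = heapq.heappop(heap)  # likeliest path first
            if not preds:
                if pos == len(words):
                    return cost                 # best complete analysis
                continue
            goal, rest = preds[0], preds[1:]
            if goal in GRAMMAR:                 # expand a nonterminal
                for rhs in GRAMMAR[goal]:
                    heapq.heappush(heap, (cost, rhs + rest, pos))
            elif pos < len(words):              # score a terminal match
                p = PENALTY.get((goal, words[pos]))
                if p is not None:
                    heapq.heappush(heap, (cost + p, rest, pos + 1))
        return None

    print(best_first(["det", "noun", "verb"]))  # 0.0: a perfect match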
<Paragraph position="2"> 2.6 Conjunction and Adjunction
There are certain pervasive natural language constructions which do not fit naturally into the standard syntax analysis procedures, such as augmented context-free parsers. Two of these are coordinate conjunctions and adjuncts. Special measures have been developed to handle these constructions; these measures deserve brief mention here.</Paragraph> <Paragraph position="3"> The allowed patterns of occurrence of conjoinings in a sentence are quite regular. Loosely speaking, a sequence of elements in the sentence tree may be followed by a conjunction and by some or all of the elements immediately preceding the conjunction. For example, allowed patterns of conjoining include subject-verb-object-and-subject-verb-object (I drank milk and Mary ate cake.), subject-verb-object-and-verb-object (I drank milk and ate cake.), and subject-verb-object-and-object (I drank milk and seltzer.). There are certain exceptions, known as gapping phenomena, in which one of the elements following the conjunction may be omitted; for example, subject-verb-object-and-subject-object (I drank milk and Mary seltzer.).</Paragraph> <Paragraph position="4"> The trouble with coordinate conjunctions is that they can occur almost anywhere in the structure of a sentence. Thus, while it would be possible to extend a context-free surface grammar to allow for all possible conjoinings, such an extension would increase the size of the grammar by perhaps an order of magnitude. The alternative scheme which has therefore been developed involves the automatic generation of productions which allow for conjunction as required during the parsing process. When a conjunction is encountered in the sentence, the normal parsing procedure is interrupted and a special conjunction node is inserted in the parse tree. The alternative values of this node provide for the various conjoined element sequences allowed at this point.</Paragraph> <Paragraph position="5"> An interrupt mechanism of this sort, including provision for gapping, is part of the Linguistic String Project parser [Sager 1967]. A similar mechanism is included in Woods' augmented transition network parser [Woods 1973] and a number of other systems.</Paragraph> <Paragraph position="6"> This solves the problem of correcting the context-free grammar for conjunctions, but the context-free grammar is generally only a small part of the total system. The task remains of modifying the routines which enforce grammatical constraints and the transformations to account for conjunctions. Since practically every routine which examines a parse tree is somehow affected by conjunction, this can be a large job, but fortunately the changes are very regular for most routines.</Paragraph> <Paragraph position="7"> The Linguistic String Project grammar, by performing all operations on the parse tree through a small number of low-level routines, was able to localize the changes to these routines and a small number of restrictions (such as number agreement) which are specially affected by conjunction [Raze 1974].</Paragraph> <Paragraph position="8"> Certain classes of adjuncts or modifiers give rise to a different kind of problem: a high degree of syntactic ambiguity. For instance, in the sentence &quot;I fixed the pipe under the sink in the bathroom with a wrench&quot; there is no syntactic basis for deciding whether the pipe had a wrench, the sink had a wrench, the bathroom had a wrench, or the fixing was done with a wrench. If semantic and pragmatic restrictions are invoked during the syntactic analysis, the parser will have to generate several analyses, all but one of which will (hopefully) be rejected by the restrictions; this is moderately inefficient. If syntactic analysis precedes semantic processing, the ambiguities of the various adjuncts will be multiplied, producing dozens of analyses for a sentence of moderate size; this is hopelessly inefficient. A more efficient solution has the parser identify the adjuncts and list for each adjunct the words it could be modifying, without generating a complete separate analysis for each possibility. The ambiguities associated with the adjuncts are thus factored out. The semantic and pragmatic components may then choose for each adjunct its most likely or acceptable host (modified word). This may be done either during the syntactic analysis [Woods 1973, Simmons 1975] or after the syntax phase is complete [Borgida 1975, Hobbs 1975].</Paragraph>
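A tiny invented sketch of the factoring: one analysis pairs each adjunct with the heads it could modify, in place of one complete parse per attachment combination; a semantic component then picks a host per adjunct. The plausibility table is, of course, made up:

    from math import prod

    # "I fixed the pipe under the sink in the bathroom with a wrench."
    factored = {
        "under the sink":  ["fixed", "pipe"],
        "in the bathroom": ["fixed", "pipe", "sink"],
        "with a wrench":   ["fixed", "pipe", "sink", "bathroom"],
    }

    # One factored analysis stands in for this many separate parses:
    print(prod(len(hosts) for hosts in factored.values()))  # 24

    # A stub semantic component simply picks the most plausible host.
    plausible = {"under the sink": "pipe", "in the bathroom": "sink",
                 "with a wrench": "fixed"}
    print({adj: plausible[adj] for adj in factored})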
</Section> </Section> </Paper>