<?xml version="1.0" standalone="yes"?> <Paper uid="C86-1124"> <Title>FORMAL SPECIFICATION OF NATURAL LANGUAGE SYNTAX USING TWO-LEVEL GRAMMAR</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. TWO-LEVEL GRAMMARS </SectionTitle> <Paragraph position="0"> A two-level grammar consists of two separate grammars, the metaproduction rules (metarules) and the hyperrules. The metarules are generally context-free rules which take the form:</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> METANOTION :: hypernotion-1; hypernotion-2; ... ; hypernotion-n. </SectionTitle> <Paragraph position="0"> where METANOTION is the left-hand side &quot;nonterminal&quot; symbol of the production and hypernotion-1, hypernotion-2, ..., hypernotion-n are the n alternatives of the production right-hand side. Each hypernotion consists of protonotions (terminal symbols) and other metanotions. In the case of English, the terminal symbols of the meta-grammar are English words.</Paragraph> <Paragraph position="1"> The meta-grammar itself is used to define the context-free aspects of English. Example metarules are:</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> SENTENCE :: DETERMINER NOUN VERB. DETERMINER :: a; an; the; these; those; this; that. </SectionTitle> <Paragraph position="0"> The hyperrules are of the form hypernotion : hyperaltern-1; hyperaltern-2; ... ; hyperaltern-n. The hyperalternatives separated by semicolons are distinct production alternatives. Each of these hyperalternatives may be divided into a sequence of hypernotions separated by commas. In a two-level grammar derivation tree, there will be one branch for each element in the sequence. A two-level grammar with either hyperrules having more than one hyperalternative or two distinct hyperrules having the same hypernotion on the production left-hand side is nondeterministic. 
If each hyperrule has only one hyperalternative and all hypernotions in production left-hand sides are distinct from one another, then the two-level grammar (tlg) is deterministic.</Paragraph> <Paragraph position="1"> A hyperrule is actually a production rule &quot;pattern&quot; since each hyperrule can possibly represent an infinite number of production rules in a context-free grammar. This is because each occurrence of a metanotion in the hyperrule represents all sequences of protonotions that can be derived from that metanotion. That is, a hyperrule may be viewed as a set of production rules (called strict production rules) in which all metanotions are replaced by the protonotions they derive. The only restriction here is that if there is more than one occurrence of a single metanotion, then each is replaced by the same protonotion sequence in deriving the strict production rules. This is called consistent substitution. For example, in the hyperrule where WORD is WORD : true.</Paragraph> <Paragraph position="2"> both occurrences of the metanotion WORD represent the same protonotion. The set of allowable protonotions in this rule is defined by the metarules for WORD. If these metarules define an infinite number of possible protonotions, then the above hyperrule also represents an infinite number of strict production rules. It is this feature of two-level grammars that allows them to define context-sensitive and recursively enumerable languages [12].</Paragraph> <Paragraph position="3"> If consistent substitution is not required (or desired) for metanotions with the same root metarules (and name), then these metanotions may be distinguished by subscripts. For example, where SENTENCE1 and SENTENCE2 are correct : where SENTENCE1 is correct, where SENTENCE2 is correct.</Paragraph> <Paragraph position="4"> In this hyperrule, SENTENCE1 and SENTENCE2 
are defined by the same metarules (and root metanotion SENTENCE) but need not have the same instantiations.</Paragraph> <Paragraph position="5"> Some hyperrules called predicates act as conditions which must be satisfied for the derivation to be successful. A predicate begins with the word where or condition; its terminal derivation is the empty string if the condition is satisfied, and it derives a &quot;blind alley&quot; (i.e. does not derive any terminal string) if the condition is not satisfied. In the two-level grammar of English presented in this paper, all hyperrules are predicates and serve to perform context checks such as subject-verb agreement, object-verb agreement, and any additional required context checks which cannot be conveniently specified by a context-free grammar (i.e. the metarules).</Paragraph> </Section> <Section position="6" start_page="0" end_page="527" type="metho"> <SectionTitle> 3. METARULES FOR ENGLISH </SectionTitle> <Paragraph position="0"> The metarules of the two-level grammar for English define the context-free aspects of English syntax. Some lexical items from English can not be easily defined in a formal way (i.e. using context-free rules). These include the nouns, verbs, adjectives, proper names, and titles, given names and surnames for people, which are lexical categories containing a large number of elements. The formal specification of these categories would be production rules of the form:</Paragraph> <Paragraph position="2"> For simplicity we choose to omit more formal specifications of the above categories. A more complete list of words in these categories may be found in [14].</Paragraph> <Paragraph position="3"> The metarules in our two-level grammar illustrate the specific subset of English grammar defined in this paper. The subset includes declarative sentences with the subject noun premodified and postmodified, including postmodification by relative clauses. 
The choice of this subset is rather arbitrary since we have used two-level grammars to define a wide variety of English sentences (e.g. in [7], more extensive modification is allowed and also compound sentences). This subset will serve to illustrate the power of two-level grammars for the purposes of defining English syntax. Because the notation for metarules follows context-free grammar conventions using natural language vocabulary, our meta-grammar is fairly self-explanatory. The rules of English syntax that have been incorporated into our grammar are based on English grammar rules given in [3], [11], [13], and [19]. We now enumerate the metarules used in our two-level grammar of English. A sentence consists of a noun phrase and a verb phrase. The noun phrase consists of an optional sentence modifier such as a &quot;viewpoint&quot; adverbial and a subject sequence. The subject sequence consists of one main subject or of two main subjects separated by the coordinator and. The main subjects may be either a list of nouns premodified and postmodified or a proper name premodified by a restricter.</Paragraph> <Paragraph position="4"> 1. SENTENCE :: NOUN_PHRASE VERB_PHRASE PERIOD.</Paragraph> <Paragraph position="5"> 2. NOUN_PHRASE :: SENTENCE_MODIFIER SUBJECT_SEQUENCE.</Paragraph> <Paragraph position="6"> 3. SENTENCE_MODIFIER :: VIEWPOINT COMMA; EMPTY.</Paragraph> <Paragraph position="7"> 4. VIEWPOINT :: artistically; economically; ethically; financially; geographically; linguistically; militarily; morally; personally; politically; psychologically; publically; theoretically; visually. 5. SUBJECT_SEQUENCE :: MAIN_SUBJECT; MAIN_SUBJECT and MAIN_SUBJECT.</Paragraph> <Paragraph position="8"> 6. MAIN_SUBJECT :: MODIFIED_NAMED_SUBJECT; PRE_NOUN_MODIFICATION NOUN_HEAD POST_NOUN_MODIFICATION.</Paragraph> <Paragraph position="9"> 7. MODIFIED_NAMED_SUBJECT :: RESTRICTERS NAMED_SUBJECT.</Paragraph> <Paragraph position="10"> 8. 
NAMED_SUBJECT :: PROPER_NAME; GIVEN_NAME; SURNAME; TITLE SURNAME.</Paragraph> <Paragraph position="11"> 9. RESTRICTERS :: chiefly; especially; even; just; largely; mainly; mostly; primarily; not even; only; EMPTY.</Paragraph> <Paragraph position="12"> 10. NOUN_HEAD :: NOUN; NOUN and NOUN; NOUN_LIST COMMA_OPTION and NOUN.</Paragraph> <Paragraph position="13"> 11. NOUN_LIST ::</Paragraph> </Section> <Section position="7" start_page="527" end_page="528" type="metho"> <SectionTitle> NOUN_LIST COMMA NOUN; NOUN COMMA NOUN. </SectionTitle> <Paragraph position="0"> The verb phrase consists of a predicate sequence and an object sequence. The predicate sequence consists of an auxiliary sequence (an optional auxiliary adverb such as a focusing or maximizing adverb followed by an active or passive auxiliary verb) and the main verb of the sentence. 12. VERB_PHRASE :: PREDICATE_SEQUENCE OBJECT_SEQUENCE.</Paragraph> <Paragraph position="1"> 13. PREDICATE_SEQUENCE :: AUXILIARY_SEQUENCE VERB. 14. AUXILIARY_SEQUENCE :: AUXILIARY_ADVERB_OPTION; AUXILIARY_ADVERB_OPTION ACTIVE_OR_PASSIVE_AUXILIARY.</Paragraph> <Paragraph position="2"> 15. AUXILIARY_ADVERB_OPTION :: AUXILIARY_ADVERB; EMPTY. 16. AUXILIARY_ADVERB :: FOCUSING_ADVERB; MAXIMIZING_ADVERB.</Paragraph> <Paragraph position="3"> 17. FOCUSING_ADVERB :: again; also; as well; at least; equally; especially; even; further; in addition; in particular; just; largely; likewise; mainly; merely; mostly; notably; only; particularly; primarily; principally; purely; purely and simply; similarly; simply; specifically.</Paragraph> <Paragraph position="4"> 18. MAXIMIZING_ADVERB :: absolutely; altogether; completely; entirely; fully; in all respects; perfectly; quite; thoroughly; totally; utterly; very fully; very thoroughly.</Paragraph> <Paragraph position="5"> 19. ACTIVE_OR_PASSIVE_AUXILIARY :: ACTIVE_AUXILIARY; PASSIVE_AUXILIARY.</Paragraph> <Paragraph position="6"> 20. 
ACTIVE_AUXILIARY :: AUXILIARY_HAVE AUXILIARY_ADVERB_OPTION. 21. PASSIVE_AUXILIARY :: AUXILIARY_BE AUXILIARY_ADVERB_OPTION; AUXILIARY_HAVE AUXILIARY_ADVERB_OPTION been.</Paragraph> <Paragraph position="7"> 22. AUXILIARY_BE :: am; is; were; was.</Paragraph> <Paragraph position="8"> 23. AUXILIARY_HAVE :: have; had; has.</Paragraph> <Paragraph position="9"> 24. AUXILIARY_VERB :: AUXILIARY_BE; AUXILIARY_HAVE. 25. AUXILIARY_TRAILER :: AUXILIARY_ADVERB_OPTION; AUXILIARY_ADVERB_OPTION been.</Paragraph> <Paragraph position="10"> The object sequence of a verb phrase can contain both direct and indirect objects followed by an optional adverbial such as a maximizing adverb or a time adverb. Objects can be either a proper name, possibly modified by the restricters given above, or a noun expression, possibly premodified and postmodified.</Paragraph> <Paragraph position="11"> 26. OBJECT_SEQUENCE :: INDIRECT_OBJECT DIRECT_OBJECT OBJECT_SEQUENCE_ADVERB; DIRECT_OBJECT OBJECT_SEQUENCE_ADVERB.</Paragraph> <Paragraph position="12"> 27. OBJECT_SEQUENCE_ADVERB :: OBJECT_SEQUENCE_ADVERBIAL; EMPTY.</Paragraph> <Paragraph position="13"> 28. OBJECT_SEQUENCE_ADVERBIAL :: MAXIMIZING_ADVERB; TIME_ADVERB.</Paragraph> <Paragraph position="14"> 29. TIME_ADVERB :: again; early; first; last; late; next; now; recently; simultaneously; since; then; today; yesterday. 30. INDIRECT_OBJECT :: OBJECT.</Paragraph> <Paragraph position="15"> 31. DIRECT_OBJECT :: OBJECT.</Paragraph> <Paragraph position="16"> 32. OBJECT :: MODIFIED_NAMED_SUBJECT; PRE_NOUN_MODIFICATION NOUN_HEAD POST_NOUN_MODIFICATION.</Paragraph> <Paragraph position="17"> We now turn to the pre-noun-modifiers specified in our grammar. The modifier is a determiner optionally followed by a list of possessive nouns, an adjective, a sequence of nouns, another list of possessive nouns and a denominal noun. 
Examples of this type of construct include &quot;the murderer's empty black pistol&quot; and &quot;a very rich man's thick wallet.&quot; For context-sensitive purposes, the determiners are divided into &quot;universal&quot; determiners which may precede both singular and plural nouns and determiners which may only precede singular nouns. Furthermore, a context-free restriction of the pre-noun-modifiers is that there can be at most one list of possessive nouns in a sequence. For convenience we choose to enforce this condition in the hyperrules instead of the metarules. 33. PRE_NOUN_MODIFICATION :: DETERMINER PRE_NOUN_MODIFIERS.</Paragraph> <Paragraph position="18"> 34. PRE_NOUN_MODIFIERS :: EMPTY; POSSESSIVE_NOUN_LIST ADJECTIVE_OPTION NOUN_SEQUENCE POSSESSIVE_NOUN_LIST DENOMINAL_NOUN.</Paragraph> <Paragraph position="19"> 35. DETERMINER :: UNIVERSAL_DETERMINER; SINGULAR_DETERMINER.</Paragraph> <Paragraph position="20"> 36. UNIVERSAL_DETERMINER :: the; some; any; my; your; his; her; its; our; their. 37. SINGULAR_DETERMINER :: either; neither; another; NOT_OPTION NEGATABLE_SINGULAR_DETERMINER.</Paragraph> <Paragraph position="21"> 38. NEGATABLE_SINGULAR_DETERMINER :: a; an; each; every. 39. NOT_OPTION :: not; EMPTY.</Paragraph> <Paragraph position="22"> 40. POSSESSIVE_NOUN_LIST :: EMPTY; POSSESSIVE_NOUN_LIST POSSESSIVE_NOUN.</Paragraph> <Paragraph position="23"> 41. POSSESSIVE_NOUN :: NOUN's; NOUN'.</Paragraph> <Paragraph position="24"> 42. ADJECTIVE_OPTION :: ADJECTIVE; EMPTY.</Paragraph> <Paragraph position="25"> 43. NOUN_SEQUENCE :: NOUN; NOUN and NOUN; EMPTY. The nouns in the NOUN_SEQUENCE denote the physical composition of items (e.g. &quot;the fisherman's rusted iron hook&quot;) and thus act as adjectives. Denominal nouns are adjectives which denote some quality of the noun being modified (e.g. &quot;her social life&quot; and &quot;his moral responsibility&quot;). Since there are a large number of these, we omit their formal specification here. 
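As an illustration of how metarules of this kind can be read as ordinary context-free productions, the following Python sketch encodes the determiner metarules 35-39 and tests whether a word sequence can be derived from a metanotion. The names METARULES, derives, and matches are hypothetical and do not come from the paper, and the naive recognizer shown here is only an assumption that suits small rule sets without left recursion.

```python
# Hypothetical encoding of metarules 35-39.
# Each metanotion maps to its alternatives; an alternative is a tuple of symbols.
METARULES = {
    "DETERMINER": [("UNIVERSAL_DETERMINER",), ("SINGULAR_DETERMINER",)],
    "UNIVERSAL_DETERMINER": [(w,) for w in
        ["the", "some", "any", "my", "your", "his", "her", "its", "our", "their"]],
    "SINGULAR_DETERMINER": [("either",), ("neither",), ("another",),
        ("NOT_OPTION", "NEGATABLE_SINGULAR_DETERMINER")],
    "NEGATABLE_SINGULAR_DETERMINER": [("a",), ("an",), ("each",), ("every",)],
    "NOT_OPTION": [("not",), ()],  # the empty tuple encodes EMPTY
}

def derives(symbol, tokens):
    """Return True if the metanotion or word `symbol` derives exactly `tokens`."""
    tokens = tuple(tokens)
    if symbol not in METARULES:  # a protonotion, i.e. a terminal word
        return tokens == (symbol,)
    return any(matches(alt, tokens) for alt in METARULES[symbol])

def matches(sequence, tokens):
    """Return True if the sequence of symbols derives exactly `tokens`."""
    if not sequence:
        return not tokens
    head, rest = sequence[0], sequence[1:]
    # Try every split point: head takes tokens[:i], the rest takes tokens[i:].
    return any(derives(head, tokens[:i]) and matches(rest, tokens[i:])
               for i in range(len(tokens) + 1))
```

Under these assumptions, derives("DETERMINER", ["not", "every"]) holds through NOT_OPTION followed by NEGATABLE_SINGULAR_DETERMINER, while "not the" is rejected because "the" is only a universal determiner.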
In our grammar subset we restrict post-noun-modifiers to relative clauses involving people. Many other forms of post-noun-modification are formally specified in [7]. 44. POST_NOUN_MODIFICATION :: RELATIVE_CLAUSE; EMPTY. 45. RELATIVE_CLAUSE :: who PREDICATE_SEQUENCE OBJECT_SEQUENCE.</Paragraph> <Paragraph position="26"> Finally, the punctuation in our grammar is given below. 46. PERIOD :: . .</Paragraph> <Paragraph position="27"> 47. COMMA :: , .</Paragraph> <Paragraph position="28"> 48. COMMA_OPTION :: COMMA; EMPTY.</Paragraph> <Paragraph position="29"> 49. EMPTY :: .</Paragraph> </Section> <Section position="8" start_page="528" end_page="529" type="metho"> <SectionTitle> 4. HYPERRULES FOR ENGLISH </SectionTitle> <Paragraph position="0"> The hyperrules of the two-level grammar for English define the context-sensitive aspects of English syntax which can not be specified by the context-free rules of the meta-grammar. Unlike the meta-grammar, the hyperrules do not generate any part of the English sentence. They serve only to verify the context-sensitive conditions of the grammar. This is done by using predicates as described earlier. Predicates will derive the empty string if they are satisfied and will derive nonterminal strings of useless symbols otherwise. The notion that the hyperrules will not generate any terminal string but instead verify context-sensitive conditions of a terminal string already generated by the context-free metarules is a unique feature of our approach to designing two-level grammars (e.g. in contrast, see [2]). This will greatly simplify parsing two-level grammars as we will see later.</Paragraph> <Paragraph position="1"> We will define two types of predicates. The first of these will be preceded by the protonotion condition and will be given explicitly in the formal grammar. As with the meta-grammar, however, there will be some rules which can not be precisely defined in the formal system. 
These rules relate to qualities of the unspecified lexical classes (e.g. nouns, verbs, etc.) and will be designated by the protonotion where. For example, the hypernotions where NOUN is singular, where VERB is past participle, and where NOUN and VERB agree in person and number can not be precisely defined except by a very large number of formal rules such as those given below: where aardvark is singular : EMPTY.</Paragraph> <Paragraph position="2"> where abandoned is past participle : EMPTY.</Paragraph> <Paragraph position="3"> where Adam and ere agree in person and number : EMPTY.</Paragraph> <Paragraph position="4"> In the subsequent discussion of hyperrules we will use the notation Hn to denote hyperrule number n. The start hyperrule (H1) of the two-level grammar is: 1. SENTENCE : condition SENTENCE is a well-formed sentence.</Paragraph> <Paragraph position="5"> This hyperrule has as its start notion an English sentence which is well-formed with respect to the context-free rules of the meta-grammar for metanotion SENTENCE. The next hyperrule (H2) expands the sentence with respect to what conditions must be satisfied. The formalization of these is self-explanatory.</Paragraph> <Paragraph position="6"> condition OBJECT_SEQUENCE is a well-formed object.</Paragraph> <Paragraph position="7"> The first condition is that the subject sequence must agree with the predicates specified by the auxiliary sequence and verb. In our grammar, agreement means that the subject and the subject-verb must agree in person and number. There are two possibilities for subject-verbs: 1) the auxiliary sequence is empty (H3) in which case the main verb must be consistent with the subject, and 2) the auxiliary sequence is non-empty (H4) in which case it is the auxiliary verb which must be consistent with the subject. Subjects may be in one of three forms: 1) the subject is a proper name (H5), possibly modified by a restricter (e.g. &quot;even Mr. 
Smith&quot; or &quot;primarily Mrs. Jones&quot;), and therefore requires a singular verb; 2) the subject is a single subject (H6-H7) in which case it need only agree with the subject-verb; or 3) the subject may be a compound subject co-ordinated with and (H8-H9), in which case it requires a plural verb (e.g. &quot;John and Bill are here.&quot;).</Paragraph> <Paragraph position="8"> agrees in person and number with VERB.</Paragraph> <Paragraph position="9"> 7. condition NOUN agrees in person and number with VERB : where NOUN and VERB agree in person and number.</Paragraph> <Paragraph position="10"> 8. condition NOUN_LIST COMMA_OPTION and NOUN agrees in person and number with VERB : where VERB is plural.</Paragraph> <Paragraph position="11"> 9. condition MAIN_SUBJECT1 and MAIN_SUBJECT2 agrees in person and number with VERB : where VERB is plural.</Paragraph> <Paragraph position="12"> To satisfy the second condition that the subject of a sentence must be well-formed, the subject may fall into one of the following categories: 1) if the subject is a name (H10), then it is already well-formed by the metarules; 2) if the subject is modified (H11), then the modifiers must be correct; and 3) if the subject is a compound subject (H12), then each component of the compound subject must be well-formed according to rules 1 and 2.</Paragraph> <Paragraph position="13"> 10. condition MODIFIED_NAMED_SUBJECT is a well-formed subject : premodified and postmodified. We first give the hyperrules which enforce correct premodification. Premodification (H13) requires 1) correct determiner usage (i.e. with respect to singular and plural nouns) and 2) any premodifying nouns must be singular or &quot;mass&quot; nouns (i.e. nouns which denote item composition such as aluminum, brass, etc.). A singular determiner (e.g. a, an, each, etc.) requires a singular noun (H14) but a &quot;universal&quot; determiner (e.g. some, the, etc.) may be used with singular or plural nouns (H15). 
If there are no premodifying nouns, then hyperrule H16 will apply. A single premodifying noun (H17) may be either singular or a mass noun. Note that rule H17 is nondeterministic in that there are two hyperalternatives. The condition is satisfied if either one of these hyperalternatives is satisfied. If the premodifying nouns are co-ordinated with and (H18), then both nouns must be mass nouns (e.g. &quot;the wooden and iron door&quot; is correct but &quot;the forest and garden path&quot; is not). 16. condition EMPTY are singular or mass nouns : EMPTY.</Paragraph> <Paragraph position="14"> 17. condition NOUN are singular or mass nouns : where NOUN is singular; where NOUN is a mass noun.</Paragraph> <Paragraph position="15"> 18. condition NOUN1 and NOUN2 are singular or mass nouns : where NOUN1 is a mass noun, where NOUN2 is a mass noun.</Paragraph> <Paragraph position="16"> Hyperrules H19-H27 define the conditions for postmodification. Any postmodification of the subject must be in the form of a relative clause which begins with who. This type of relative clause requires a human noun and the verb of the relative clause must agree with the modified noun. For example, in &quot;The men who fix computers were very helpful,&quot; the noun men must be a human noun since it is modified by who and the verb fix must be compatible with men. This type of relative clause may be considered as describing two separate sentences: &quot;The men fix computers.&quot; and &quot;The men were very helpful.&quot; In the hyperrules which verify these conditions, the sub-sentence described by the relative clause is formed and then checked for correctness using hyperrule H2 recursively.</Paragraph> <Paragraph position="17"> condition NOUN_HEAD is a human noun, condition the verb of RELATIVE_CLAUSE agrees with DETERMINER NOUN_HEAD.</Paragraph> <Paragraph position="18"> 22. 
condition NOUN is a human noun : where NOUN is a human noun.</Paragraph> <Paragraph position="19"> 23. condition NOUN1 and NOUN2 is a human noun : where NOUN1 is a human noun, where NOUN2 is a human noun.</Paragraph> <Paragraph position="20"> 24. condition NOUN_LIST COMMA_OPTION and NOUN is a human noun : condition NOUN_LIST is a human noun, where NOUN is a human noun.</Paragraph> <Paragraph position="21"> 25. condition NOUN1 COMMA NOUN2 is a human noun : where NOUN1 is a human noun, where NOUN2 is a human noun.</Paragraph> <Paragraph position="22"> 26. condition NOUN_LIST COMMA NOUN is a human noun : condition NOUN_LIST is a human noun, where NOUN is a human noun.</Paragraph> <Paragraph position="23"> 27. condition the verb of</Paragraph> <Paragraph position="25"> is a well-formed sentence. The third condition that the English sentences defined by our grammar must satisfy is that the predicate (verb) and objects should agree. The type of verb must correspond to the number of objects in the sentence: if the verb is intransitive, then no objects are allowed except for adverbs (H28); if the verb is transitive, then a direct object is required (H29); and if the verb is ditransitive, then both a direct and an indirect object are required (H30).</Paragraph> <Paragraph position="26"> 28. condition OBJECT_SEQUENCE_ADVERB shows object-predicate agreement with VERB : where VERB is intransitive.</Paragraph> <Paragraph position="27"> 29. condition DIRECT_OBJECT OBJECT_SEQUENCE_ADVERB shows object-predicate agreement with VERB : where VERB is transitive.</Paragraph> <Paragraph position="28"> 30. 
condition INDIRECT_OBJECT DIRECT_OBJECT OBJECT_SEQUENCE_ADVERB shows object-predicate agreement with VERB : where VERB is ditransitive.</Paragraph> <Paragraph position="29"> The fourth condition for a well-formed sentence is that the auxiliary adverbs and main verb are in correct grammatical sequence. If there are no auxiliary verbs (H31), then the auxiliary sequence is correct according to the meta-grammar. If auxiliary verbs are present then the verb must be a past participle (H32).</Paragraph> <Paragraph position="30"> 31. condition AUXILIARY_ADVERB_OPTION VERB is a well-formed predicate : EMPTY.</Paragraph> <Paragraph position="31"> 32. condition AUXILIARY_ADVERB_OPTION</Paragraph> </Section> <Section position="9" start_page="529" end_page="529" type="metho"> <SectionTitle> ACTIVE_OR_PASSIVE_AUXILIARY VERB </SectionTitle> <Paragraph position="0"> is a well-formed predicate : where VERB is a past participle.</Paragraph> <Paragraph position="1"> The fifth and final condition which must be satisfied is for the object of the sentence to be well-formed. A simple object (H33) must satisfy the same conditions as a subject and hyperrules H10-H12 will apply recursively. An object sequence (H34) is well-formed if the indirect and direct objects are well-formed.</Paragraph> <Paragraph position="2"> 33. condition OBJECT OBJECT_SEQUENCE_ADVERB is a well-formed object : condition OBJECT is a well-formed subject.</Paragraph> <Paragraph position="3"> 34. condition INDIRECT_OBJECT DIRECT_OBJECT OBJECT_SEQUENCE_ADVERB is a well-formed object : condition INDIRECT_OBJECT is a well-formed object, condition DIRECT_OBJECT is a well-formed object.</Paragraph> <Paragraph position="4"> It can be seen that the above set of hyperrules is relatively concise and the conditions being described are readily understandable. We claim that the other goals of consistency, precision (for our subset of English), and unambiguity are also achieved. 
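To make the behaviour of such predicates concrete, the object-predicate agreement conditions H28-H30 can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the lexicon VERB_CLASSES stands in for the informal where rules that the grammar leaves unspecified, and the function names where and object_predicate_agreement are hypothetical. A satisfied predicate is modelled by the empty string and a blind alley by None.

```python
# Hypothetical lexicon standing in for the unspecified "where VERB is ..." rules.
VERB_CLASSES = {
    "slept": {"intransitive"},
    "fixed": {"transitive"},
    "gave": {"transitive", "ditransitive"},
}

def where(verb, verb_class):
    """Return the empty string if the predicate holds, None for a blind alley."""
    return "" if verb_class in VERB_CLASSES.get(verb, set()) else None

def object_predicate_agreement(verb, indirect_object=None, direct_object=None):
    """Select H28, H29, or H30 by the objects present, then test the verb."""
    if indirect_object is not None and direct_object is not None:
        return where(verb, "ditransitive")   # H30
    if direct_object is not None:
        return where(verb, "transitive")     # H29
    if indirect_object is None:
        return where(verb, "intransitive")   # H28
    return None  # an indirect object without a direct object matches no rule
```

With this toy lexicon, a call such as object_predicate_agreement("gave", "Mrs. White", "a present") yields the empty string, whereas pairing "slept" with a direct object yields None.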
In the next section it will be shown how this specification may be implemented automatically.</Paragraph> </Section> <Section position="10" start_page="529" end_page="530" type="metho"> <SectionTitle> 5. TWO-LEVEL PARSING </SectionTitle> <Paragraph position="0"> Our method of natural language specification has two levels: metarules for context-free syntax and hyperrules for context-sensitive syntax. Similarly our method of parsing a two-level grammar requires a parser for metarules and a parser for hyperrules. Since the metarules are context-free, any of the well-known context-free parsing algorithms (e.g.</Paragraph> <Paragraph position="1"> see [17]) may be used to derive a context-free structure of some input sentence. Context-free parsing will eliminate all sentences which do not satisfy the context-free syntax of the language but is unable to eliminate structures which are correct in the context-free sense but incorrect with respect to context-sensitive syntax. The hyperrule parser will further reduce the set of sentences which are considered to be grammatically valid by analyzing the context-free parse tree for context-sensitive violations.</Paragraph> <Paragraph position="2"> The &quot;parser&quot; for the hyperrules is actually an interpreter developed by the authors in [4] which evaluates the hyperrules in much the same way as a programming language interpreter executes programs. The hyperrules are interpreted sequentially in the order that conditions are enumerated in the grammar. Interpretation proceeds by expanding the start notion and applying the hyperrules to all of the branches of the hyperrule derivation tree until all of the predicates are evaluated. As interpretation proceeds, each node of the derivation tree (corresponding to a hypernotion) is expanded by matching it with a hyperrule left-hand side. The right-hand side of the matched hyperrule is then used to create a subtree for that node. 
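The expansion step just described might be sketched in Python as below. This is a simplified illustration, not the interpreter of [4]: it assumes the hyperrules have already been instantiated into strict production rules, so no metanotion matching or consistent substitution is performed, and the names HYPERRULES, FACTS, and evaluate are hypothetical. The two sample rules are instances of H17 and H18.

```python
# Hypothetical, already-instantiated hyperrules: each left-hand side maps to
# a list of hyperalternatives, and each alternative is a list of hypernotions.
HYPERRULES = {
    "condition iron are singular or mass nouns": [
        ["where iron is singular"],
        ["where iron is a mass noun"],
    ],
    "condition wooden and iron are singular or mass nouns": [
        ["where wooden is a mass noun", "where iron is a mass noun"],
    ],
}

# Hypothetical lexical facts standing in for the informal "where" predicates.
FACTS = {"where iron is a mass noun", "where wooden is a mass noun"}

def evaluate(hypernotion):
    """Return the empty string if the hypernotion is satisfied, else None."""
    if hypernotion.startswith("where "):
        return "" if hypernotion in FACTS else None
    for alternative in HYPERRULES.get(hypernotion, []):
        # Expand the node and evaluate its branches from left to right.
        derived = [evaluate(branch) for branch in alternative]
        if all(d is not None for d in derived):
            return "".join(derived)
    return None  # no alternative succeeded: a blind alley
```

In this sketch the instance of H17 succeeds through its second hyperalternative, and a hypernotion with no matching left-hand side (for example one built from "forest and garden") ends in a blind alley.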
Each branch of the tree is evaluated from left to right in a preorder traversal. The English sentence is syntactically correct if and only if the resulting terminal string derived by the hyperrule tree is the empty string.</Paragraph> <Paragraph position="3"> The method of writing hyperrules to derive only the empty string greatly simplifies the parsing process. Traditionally (e.g. [2, 10]), two-level grammars use the hyperrules to generate the terminal strings of the language with the metarules being used only to instantiate hyperrules. For example, in our grammar the metanotion SENTENCE is used to generate English sentences which are then input to the hyperrules for analysis. In other two-level grammar styles, however, the components of the sentence would also be generated by hyperrules. The result of hyperrules generating terminal strings is that parsing becomes considerably more difficult and is not accomplished without restrictions being placed on hyperrules (e.g. [15]). Our method of interpreting hyperrules places no restrictions, therefore allowing the tlg to be more general. The differences in writing styles are explored further in [4].</Paragraph> <Paragraph position="4"> The hyperrule interpretation algorithm is outlined below: 1. Find the hyperrule to apply which has the hypernotion as its left-hand side. This rule will be of the form: hypernotion : hypernotion-1, hypernotion-2, ..., hypernotion-n.</Paragraph> <Paragraph position="5"> 2. Expand the derivation tree with hypernotion as the root of the current subtree and the branches being hypernotion-1, hypernotion-2, ..., hypernotion-n.</Paragraph> <Paragraph position="6"> 3. Evaluate (hypernotion-i) for i = 1, 2, ..., n.</Paragraph> <Paragraph position="7"> To explain how this interpreter works, consider the example sentence &quot;Professor White and the students who attend the university gave Mrs. 
White a present today.&quot; This sentence is seen to be correct with respect to context-free syntax and its structural representation is shown in Figure 1. The specific metarules applied are numbered. We will now apply the hyperrules to this sentence to show how the context-sensitive conditions are verified. For notational convenience we have italicized the protonotions which correspond to metanotions in the hyperrules. Since the tree will be traversed from left to right we will label the branches (i.e. nodes) using a number (0-8) to denote the level in the tree and a letter (a-e) to indicate left to right ordering.</Paragraph> <Paragraph position="8"> The root of the hyperrule derivation tree is the sentence itself.</Paragraph> <Paragraph position="9"> Hyperrule H1 will be applied to initiate the verification process. This will be followed by H2 which divides the derivation tree into five separate branches, one for each condition which the sentence must satisfy.</Paragraph> <Paragraph position="10"> Hyperrule H12 will be applied to expand branch 2b and decompose the compound subject into its components. Hyperrules H10 and H11 will then analyze each of the two respective sub-subjects for well-formedness. The expansion of branch 4d is one of the more interesting aspects of the context-sensitive analysis since it involves a relative clause. The analysis is performed by hyperrules H19, H21, H22 and H27. Note that rule H27 rearranges the relative clause into a new sentence and recursively calls hyperrule H2 to analyze the new sentence.</Paragraph> <Paragraph position="11"> Instead of expanding branch 7b further, we will resume our example at branch 2c to verify the condition that the original sentence must have object-predicate agreement. 
Since the object sequence contains an indirect object, direct object and an adverb, hyperrule H30 will be applied next and since the verb gave is ditransitive, object-predicate agreement will be satisfied.</Paragraph> <Paragraph position="12"> The final condition that the sentence must satisfy is well-formedness of the object. Since the object is a sequence, rule H34 will be applied to branch 2e to decompose the object sequence and analyze the indirect and direct objects individually by rule H33. Rule H33 calls rules H10-H12 recursively. Since Mrs. White is a named subject, hyperrule H10 is satisfied for the indirect object. By applying hyperrules H11, H13, H14, H16, H19 and H20, the direct object a present will also be verified as a well-formed object. The analysis is now complete and the sentence has been determined to be correct through the process of our two-level grammar interpretation method.</Paragraph> </Section> <Section position="11" start_page="530" end_page="531" type="metho"> <SectionTitle> 6. CONCLUSIONS </SectionTitle> <Paragraph position="0"> We have shown that two-level grammars may be used very elegantly to give a formal specification of English context-free and context-sensitive syntax. In addition to the subset we have defined in this paper, many other types of English declarative sentences have been formally specified using two-level grammars [7]. There seems to be no obstacle to using tlg specifications for any type of natural language syntactic specification.</Paragraph> <Paragraph position="1"> The principal advantages of the two-level grammar metalanguage are: 1) it is very readable and may be used to give a formal description using a structured form of natural language; 2) it is formal with many well-known mathematical properties; and 3) it is directly implementable by interpretation. 
The significance of the latter fact is that once we have written a two-level grammar for natural language syntax, we can derive a parser automatically without writing any additional specialized computer programs. The combination of readability and implementability is unique in grammar theory for natural languages.</Paragraph> <Paragraph position="2"> To give a complete specification of natural language, semantics and knowledge representation must be specified in addition to syntax. Our future goals are the investigation of two-level grammar for semantic specification. Because of the ease with which two-level grammars may express logic [6] and their Turing computability [12], we expect that tlgs will also be very suitable for these goals.</Paragraph> </Section> <Section position="12" start_page="531" end_page="532" type="metho"> <SectionTitle> Figure 1 (structural representation of the example sentence; surviving node labels: SENTENCE, NOUN PHRASE, SENTENCE MODIFIER, SUBJECT SEQUENCE, PREDICATE SEQUENCE) </SectionTitle> <Paragraph position="0"/> </Section> </Paper>