SAUMER: SENTENCE ANALYSIS USING METARULES 
Fred Popowich 
Natural Language Group 
Laboratory for Computer and Communications Research 
Department of Computing Science 
Simon Fraser University 
Burnaby, B.C., CANADA V5A 1S6
ABSTRACT 
The SAUMER system uses specifications of natural 
language grammars, which consist of rules and metarules,
to provide a semantic interpretation of an input sentence.
The SAUMER Specification Language (SSL) is a
programming language which combines some of the
features of generalised phrase structure grammars (Gazdar,
1981), like the correspondence between syntactic and
semantic rules, with definite clause grammars (DCGs)
(Pereira and Warren, 1980) to create an executable
grammar specification. SSL rules are similar to DCG rules
except that they contain a semantic component and may
also be left recursive. Metarules are used to generate new
rules from existing rules before any parsing is attempted.
An implementation is tested which can provide semantic
interpretations for sentences containing topicalisation,
relative clauses, passivisation, and questions.
1. INTRODUCTION 
The SAUMER system allows the user to specify a 
grammar for a natural language using rules and metarules.
This grammar can then be used to obtain a semantic
interpretation of an input sentence. The SAUMER
Specification Language (SSL), which is a variation of
definite clause grammars (DCGs) (Pereira and Warren,
1980), captures some of the features of generalised phrase
structure grammars (GPSGs) (Gazdar, 1981) (Gazdar and
Pullum, 1982), like rule schemata, rule transformations,
structured categories, slash categories, and the
correspondence between syntactic and semantic rules. The
semantics currently used in the system are based on
Schubert and Pelletier's description in (Schubert and
Pelletier, 1982), which adapts the intensional logic
interpretation associated with GPSGs into a more
conventional logical notation.
2. THE SEMANTIC LOGICAL NOTATION 
The logical notation associated with the grammar
differs from the usual notation of intensional logic since it
captures some intuitive aspects of natural language.1
Thus, individuals and objects are treated as entities,
instead of collections of properties, and actions are n-ary
relations between these entities. Many of the problems
that the intensional notation would solve are handled by
allowing ambiguity to be represented in the logical
notation. Consequently, as is common in other approaches
(e.g., Gawron, 1982), much of the processing is deferred to
the pragmatic stage. The structure of the lexicon, and the
appearance of post processing markers (sharp angle 
brackets) are designed to reflect this ambiguity. The 
lexicon is organised into two levels. For the semantic 
interpretation, the first level gives each word a tentative 
interpretation. During the pragmatic analysis, more 
complete processing information will result in the final 
interpretation being obtained from the second level of the 
lexicon. For example, the sentence John misses John could
be given an initial interpretation of:
(2.1) [John1 miss2 John3]
with John1, miss2 and John3 obtained from the first level
of the two level lexicon. The pragmatic stage will
determine if John1 and John3 both refer to the same
entry, say JOHN_SMITH1, of the second level of the
lexicon, or if they correspond to different entries, say
JOHN_JONES1 and JOHN_EVANS1. During the
pragmatic stage, the entry of MISS which is referred to 
by miss2 will be determined (if possible). For example, 
does John miss John because he has been away for a long 
time, or is it because he is a poor shot with a rifle? 
Any interpretation contained in sharp angle brackets,
<...>, may require post processing. This is apparent in
interpretations containing determiners and co-ordinators.
The proverb:
(2.2) every man loves some woman
could be given the interpretation:
(2.3) [<every1 man2> love3 <some4 woman5>]
without explicitly stating which of the two readings is
intended. During pragmatic analysis, the scope of every
and some would presumably be determined.
1It should also be noted that, due to the separability of the semantic
component from the grammar rule, a different semantic notation could
easily be introduced as long as the appropriate semantic processing
routines were replaced. The use of SAUMER with "an "AI-adapted"
version of Montague's Intensional Logic" is being examined by Fawcett
(1984).
The syntax of this logical notation can be summarised
as follows. Sentences and compound predicate formulas
are contained within square brackets. So, (2.4) states that
John wants to kiss Mary:
(2.4) [John1 want2 [John1 kiss3 Mary4]]
These formulas can also be expressed equivalently in a
more functional form according to the equivalence
(2.5) [t_n P t_1 ... t_{n-1}]
      = ( ... ((P t_1) t_2) ... t_n)
      = (P t_1 ... t_n)
Consequently, (2.4) could also be represented as:
(2.6) ((want2 ((kiss3 Mary4) John1)) John1)
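The rewriting from the square-bracket form to the functional form can be sketched in a few lines. This is only an illustration of equivalence (2.5), not part of SAUMER: the nested-list representation and the names to_functional and show are assumptions made for the example.

```python
def to_functional(formula):
    """Rewrite [t_n, P, t_1, ..., t_{n-1}] into curried form, as in (2.5).

    A formula is a nested Python list (an assumed representation);
    anything that is not a list is treated as an atom such as "John1".
    """
    if not isinstance(formula, list):
        return formula
    subject, predicate, *others = [to_functional(part) for part in formula]
    result = predicate
    # Apply the predicate to t_1 ... t_{n-1} first and to the subject t_n last.
    for term in others + [subject]:
        result = (result, term)  # one application (f x), stored as a pair
    return result


def show(term):
    """Render a curried term with parentheses, in the style of (2.6)."""
    if isinstance(term, tuple):
        return "(" + show(term[0]) + " " + show(term[1]) + ")"
    return term


example_2_4 = ["John1", "want2", ["John1", "kiss3", "Mary4"]]
print(show(to_functional(example_2_4)))
```

Applied to (2.4), the sketch prints the form given in (2.6).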
However, this notation is usually used for incomplete
phrases, with the square brackets used to obtain a
conventional final reading. Modified predicate formulas
are contained in braces. Thus, a little dog likes Fido could
be expressed as:
(2.7) [<a1 {little2 dog3}> likes4 Fido5]
The lambda calculus operations of lambda abstraction and 
elimination are also allowed. When a variable is 
abstracted from an expression as in: 
(2.8) λx [x want2 [x love3 Mary4]]
application of this new expression to an argument, say
John1:
(2.9) (λx [x want2 [x love3 Mary4]] John1)
will result in an interpretation of John wants to love Mary:
(2.10) [John1 want2 [John1 love3 Mary4]]
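Lambda elimination amounts to substituting the argument for the abstracted variable. The following minimal sketch assumes formulas are nested lists and an abstraction is a (variable, body) pair; both the representation and the function names are illustrative assumptions, and variable capture is not handled.

```python
def substitute(expr, var, value):
    """Replace every occurrence of var in a nested-list formula by value."""
    if isinstance(expr, list):
        return [substitute(part, var, value) for part in expr]
    return value if expr == var else expr


def apply_lambda(abstraction, argument):
    """Apply an abstraction, given as a (variable, body) pair, to an argument."""
    var, body = abstraction
    return substitute(body, var, argument)


# The abstraction applied to John1, following the example in the text.
wants_to_love_mary = ("x", ["x", "want2", ["x", "love3", "Mary4"]])
print(apply_lambda(wants_to_love_mary, "John1"))
```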
Further details on this notation are available in (Schubert
and Pelletier, 1982).
3. THE SAUMER SPECIFICATION LANGUAGE 
The SAUMER Specification Language (SSL) is a
programming language that allows the user to define a
grammar of a natural language in terms of rules and
metarules. Metarules operate on rules to produce new
rules. The language is basically a GPSG realised in a
DCG setting. Unlike GPSGs, the grammars defined by
this system are not required to be context-free since
procedure calls are allowed within the rules, and since
logic variables are allowed in the grammar symbols.
The basic objects of the language are atoms, variables,
terms, and lists. Any word starting with a lower case
letter, or enclosed in single quotes, is an atom. Variables
start with a capital letter or an underscore. A term is an
atom, optionally followed by a series of objects
(arguments), which are enclosed in parentheses and
separated by commas. Lastly, a list is a series of one or
more objects, separated by commas, that are enclosed in
square brackets.
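The lexical conventions for atoms and variables can be illustrated with a small classifier. It is a sketch of the two rules stated above only; the function name classify and the fallback label "other" are assumptions, and terms and lists are not parsed.

```python
def classify(token):
    """Classify a single word as an SSL atom or variable by its spelling."""
    if not token:
        return "other"
    # A word enclosed in single quotes is an atom, whatever it contains.
    if len(token) >= 2 and token.startswith("'") and token.endswith("'"):
        return "atom"
    # A word starting with a lower case letter is an atom.
    if token[0].islower():
        return "atom"
    # Variables start with a capital letter or an underscore.
    if token[0].isupper() or token[0] == "_":
        return "variable"
    return "other"


for word in ["npr", "'John'", "Num", "_"]:
    print(word, classify(word))
```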
3.1 Rules 
The rules are presented in a variation of the DCG
notation, augmented with a semantic rule corresponding to
each syntactic rule. Each rule is of the form
"α --> β : γ," where α is a term which denotes a
nonterminal symbol, β is either an atom list representing
a terminal symbol or a conjunction of terms (separated by
commas) corresponding to nonterminal symbols, and γ is a
semantic rule which may reference the interpretation of
the components of β in determining the semantics of α.
The rule arrow, -->, separates the two sides of the rule,
with the colon, :, separating the syntactic component from
the semantic component. If the rule is preceded by the
word add, it can be subjected to the transformations
described in section 3.2. The nonterminal symbols can
possess arguments, which may be used to capture the
flavour of the structured categories of GPSGs. β may also
possess arbitrary procedural restrictions contained in braces.
γ consists of expressions in the semantic notation.
The different terms of this semantic expression are joined
by the semantic connector, the ampersand "&". The
ampersand differs from the syntactic connector, the
comma, since the former associates to the right while the
latter associates to the left. The logical and symbol,
which traditionally may also be denoted by the
ampersand, must be entered as "&&". Due to constraints
imposed by the current implementation, "( expr )" must
be entered as "<[ expr ]", "< expr >" as "< <[ expr ]",
and "λx expr" as "x lmda expr". An expression may
contain references to the interpretations of the elements of
β by stating the appropriate nonterminal followed by the
left quote, "'". To prevent ambiguity in these references
that may arise when two identical symbols appear in β, a
nonterminal may be appended with a minus sign followed
by a unique integer.
Unlike standard Prolog implementations of DCGs, left
recursion is allowed in rules, thus permitting more natural
descriptions of certain phenomena (like co-ordination).
Since the left recursive rules are interpreted, rather than
converted into rules that are not left recursive, the
number of rules in the database will not be affected.
However, the efficiency of the sentence analysis may be
affected due to the extra processing required. Rules of
the form "A --> A, A" are not accepted.
An example of a production that derives John from a
proper noun, npr, is shown in (3.1):
(3.1) npr --> ['John'] : 'John'#
The semantic interpretation of this npr will be John#,
with "#" replaced by a unique integer during evaluation.
(3.2) illustrates a verb phrase rule that could be used in
sentences like John wants to walk:
(3.2) vp(Num) -->
          v(Num,Root) with Root in [want,like], vp(inf)
          : x## lmda [ x## & v' & [x## & vp'] ]
First notice that a restriction on the verb appears within
the with statement. In the GPSG formalism, this type of
restriction would be obtained by naming the rules and
associating a list of valid rule names with each lexical
entry. Although the with restriction may contain any
valid procedure, typically the in operation (for determining
list membership) is used. The double pound, ##, is
replaced by the same unique integer in the entire
expression when the expression is evaluated. If "#" were
used instead, each instance of x# would be different. For
the above example, if v' is want2 and vp' is run3, then
the semantic expression could evaluate to:
(3.3) x4 lmda [x4 & want2 & [x4 & run3]]
Furthermore, if np' is John1, then:
(3.4) [np' & vp']
could result in:
(3.5) [John1 & want2 & [John1 & run3]]
3.2 The Metarules 
Traditional transformational grammars provide
transformations that operate on parse trees, or similar
structures, and often require the transformations to be
used in sentence recognition rather than in generation
(Radford, 1981). However, the approach suggested by
(Gazdar, 1981) uses the transformations generatively and
applies them to the grammar. Thus, the grammar can
remain context-free by compiling this transformational
knowledge into the grammar. Transformations and rule
schemata form the metarules of SSL.2
Rule schemata allow the user to specify entire classes
of rules by permitting variables which range over a
selection of categories to appear in the rule. To control
the values of the variables, the forall control structure can
be used in the schema declaration. The schema
forall X in List, Body will execute Body for each element
of List, with X instantiated to the current element. The
use of this statement is illustrated in the following
metarule that generates the terminal productions for proper
nouns:
(3.6) forall Terminal in ['Bob','Carol','Ted','Alice'],
          (npr --> [Terminal] : Terminal#) .
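The effect of a schema can be pictured as one rule per list element. The sketch below only mimics this expansion with strings; the helper name expand_forall, the string encoding of rules, and the shortened list of terminals are assumptions made for illustration.

```python
def expand_forall(values, make_rule):
    """Instantiate a rule body once for each element of the list."""
    return [make_rule(value) for value in values]


# A shortened, illustrative list of proper noun terminals.
terminals = ["'Bob'", "'Carol'", "'Alice'"]
rules = expand_forall(terminals, lambda t: f"npr --> [{t}] : {t}#")
for rule in rules:
    print(rule)
```

Each printed line is one terminal production of the kind the schema declares.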
Transformations match with grammar rules in the
database, using a rule pattern that may be augmented
with arbitrary procedures, and produce new rules from
the old rules. A transformation is of the form:
(3.7) α --> β : γ ==> α' --> β' : γ'
The metarule arrow, ==>, separates the pattern,
α --> β : γ, from the template, α' --> β' : γ'.
2Often, metarules are considered to consist of transformations only,
while schemata are put into a category of their own. However, since
they can both be considered as part of a metagrammar, they are called
metarules in this discussion.
The syntactic pattern, α --> β, contains nonterminals,
which correspond to symbols that must appear in the
matched rule, and free variables, which represent don't
care regions of zero or more nonterminals. The pattern
nonterminals may also possess arguments. For each rule
symbol, a matching pattern symbol describes properties
that must exist, but not all the properties that may exist.
Thus, if vp appeared in the pattern, it would match any
of vp, vp(Num), or vp(Num,Type) with Type in [trans].
However, pp(to) would not match pp or pp(from), but it
would match pp(to,_). The matching conditions are
summarised in Figures 3-1 and 3-2. In Figure 3-1, A and
B are nonterminals, X is a free variable, and α and β are
conjunctions of one or more symbols. γ and δ of Figure
3-2 are also conjunctions of one or more symbols. "=" is
defined as unification (Clocksin and Mellish, 1981). Parts
of the rule contained in braces are ignored by the pattern
matcher. The syntactic pattern may also contain arbitrary
restrictions,3 enclosed in braces, that are evaluated during
the pattern match. The semantic pattern, γ, is very
primitive. It may contain a free variable, which will
bind to the entire semantics field of the matched rule, or
it may contain the structure <[? x], which will bind to
the entire structure containing the symbol x. If <[? y]
then appears in γ', the result will be the semantic
component of the matched rule with x replaced by y.
                              Rule
Pattern    (B, β)                        B

(A, α)     A matches B and               A matches B and
           α matches β                   α is a free variable

(X, α)     (X, α) matches β or           α matches B
           α matches (B, β)

A          No                            A matches B

X          Yes                           Yes

Figure 3-1: Pattern Matching for Conjunctions
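The conditions of Figure 3-1 can be read as a recursive matcher in which a free variable absorbs zero or more rule symbols. The sketch below is an illustration under simplifying assumptions: symbols are plain strings, a nonterminal matches only an identical string, free variables are the capitalised names, and the function names are invented for the example.

```python
def is_free_variable(symbol):
    """Free variables are written with a leading capital letter here."""
    return symbol[0].isupper()


def match_conjunction(pattern, rule):
    """Return True if a pattern (list of symbols) matches a rule body."""
    if not pattern:
        return not rule
    head, rest = pattern[0], pattern[1:]
    if is_free_variable(head):
        # A don't care region: try absorbing 0, 1, 2, ... rule symbols.
        return any(match_conjunction(rest, rule[i:])
                   for i in range(len(rule) + 1))
    # A nonterminal must match the next rule symbol exactly in this sketch.
    return bool(rule) and head == rule[0] and match_conjunction(rest, rule[1:])


print(match_conjunction(["X", "vp"], ["np", "vp"]))  # X absorbs np
print(match_conjunction(["vp"], ["np", "vp"]))       # np is left unmatched
```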
                                     Rule
Pattern                b(β1,...,βn)          b(β1,...,βn) with δ

a(α1,...,αm)           a=b, m≤n,             a=b, m≤n,
                       αi=βi, 1≤i≤m          αi=βi, 1≤i≤m

a(α1,...,αm) with γ    No                    a=b, m≤n,
                                             αi=βi, 1≤i≤m,
                                             γ matches δ

Figure 3-2: Pattern Matching for Nonterminals
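The first row of Figure 3-2 can be sketched as follows. This is a simplification, not the system's matcher: nonterminals are (name, argument list) pairs, unification is approximated by treating any capitalised or underscore argument as a variable that matches anything, and with restrictions are ignored. All names are assumptions made for the example.

```python
def is_variable(arg):
    """Arguments starting with a capital letter or underscore are variables."""
    return arg[0].isupper() or arg[0] == "_"


def args_unify(pattern_arg, rule_arg):
    """Crude stand-in for unification of two atomic arguments."""
    return is_variable(pattern_arg) or is_variable(rule_arg) or pattern_arg == rule_arg


def matches(pattern, rule):
    """Check a = b, m <= n, and that the first m arguments unify pairwise."""
    (p_name, p_args), (r_name, r_args) = pattern, rule
    return (p_name == r_name
            and len(p_args) <= len(r_args)
            and all(args_unify(p, r) for p, r in zip(p_args, r_args)))


print(matches(("vp", []), ("vp", ["Num"])))
print(matches(("pp", ["to"]), ("pp", ["from"])))
print(matches(("pp", ["to"]), ("pp", ["to", "_"])))
```

The three calls mirror the vp and pp examples given in the text.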
3Apparently not present in the Hewlett Packard system (Gawron,
1982) or the ProGram system (Evans and Gazdar, 1984).
The behaviour of patterns can be seen in the following
examples. Consider the sentence rule:
(3.8) s(decl) --> np(nom,Numb),
          vp(Numb) with agreement(Numb)
          : [ np' & vp' ]
The patterns shown in (3.9a) will match (3.8), while
those of (3.9b) will not match it.
(3.9) (a) s(A) --> {not element(A,[foo])}, X, vp : Sem
          s --> np(nom), X, vp(pass), Y : Sem
      (b) s(inter) --> np, vp : Sem
          s --> vp : Sem
For the verb phrase rule shown in (3.10):
(3.10) vp(active,[M|N]) -->
           v([M|N],Root,Type,_) with (intrans in Type)
           : v'
the patterns of (3.11a) will result in a successful match,
while those of (3.11b) will not:
(3.11) (a) vp --> v : <[? v']
           vp --> v(_,_,Type,_)
               with (X, intrans in Type, Y),
               Z : Sem
       (b) vp --> v(_,_,Type,_)
               with (X, trans in Type)
               : Sem
           vp --> v(_,Root,_,_)
               with (Root in [foo], X)
               : Sem
For every rule that matches the pattern, the template
of the transformation is executed, resulting in the creation of
a new rule. Any nonterminal, N, that matches a symbol
β_i on the left side of the transformation, will appear in
the new rule if there is a symbol β_i' in β' that
intra-transformation (IT) matches with β_i. If there are
several symbols in β' that IT-match β_i, the leftmost
symbol will be selected. No symbol on one side of the
transformation may IT-match with more than one symbol
on the other side. Two symbols will IT-match only if
they have the same number of arguments, and those
arguments are identical. Any with expressions and
modifiers associated with symbols are ignored during
IT-matching. β' may also contain extra symbols that do not
correspond to anything in β. In this case, they are
inserted directly into the new rule. Once again, if the
transformation is preceded by the command add, then the
resulting rule can be subjected to subsequent
transformations.
3.3 Modifiers
Both rules and metarules may contain modifiers that
alter the structure of the nonterminal symbols. There are
two types of modification, which have been dubbed
external and internal modification.
With external modification, any nonterminal, or
variable instantiated to a nonterminal, may be followed
by the sequence @mod. This will result in mod being
inserted into the argument list following the specified
arguments. Thus, if N@junk appeared in a rule when N
was instantiated to np(more), it would be expanded as
np(more,junk). Similarly, if the pattern symbol vp
matched vp(Numb) in a rule, then the appearance of
vp@foo in the template would result in vp(foo,Numb)
appearing in the new rule. This extra argument,
introduced by the modifier, can be useful when dealing
with the missing components of slash or derived categories
(Gazdar, 1981).
Internal modification allows the modifier to be put
directly into the argument list. If an argument is
followed by @mod, it will be replaced by mod. In the
case where @mod appears as an argument by itself, mod is
added as a new argument. For example, if
v(Numb@pastpart) were contained in a template, it would
IT-match v(Numb) in the pattern, and would result in the
appearance of v(pastpart) in the new rule.
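The two kinds of modification can be sketched on a simple representation. The (name, argument list) pairs, the explicit count of specified arguments, and the function names below are assumptions made for illustration; they are not taken from the SAUMER implementation.

```python
def external(nonterminal, mod, specified):
    """Insert mod after the first `specified` arguments of the nonterminal.

    `specified` is the number of arguments written on the symbol that
    carried the @mod suffix (an assumption of this sketch).
    """
    name, args = nonterminal
    return (name, args[:specified] + [mod] + args[specified:])


def internal(nonterminal):
    """Resolve arguments of the form Arg@mod (replace) or @mod (add)."""
    name, args = nonterminal
    new_args = []
    for arg in args:
        if "@" in arg:
            # Arg@mod becomes mod; a bare @mod also contributes mod.
            new_args.append(arg.split("@", 1)[1])
        else:
            new_args.append(arg)
    return (name, new_args)


print(external(("np", ["more"]), "junk", 1))   # N@junk with N = np(more)
print(external(("vp", ["Numb"]), "foo", 0))    # vp@foo matched against vp(Numb)
print(internal(("v", ["Numb@pastpart"])))      # v(Numb@pastpart)
```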
4. IMPLEMENTATION 
The SAUMER system is currently implemented in
highly portable C-Prolog (Pereira, 1984), and runs on a
Motorola 68000 based SUN Workstation supporting UNIX.4
Calls to Prolog are allowed by the system, thus providing
useful tools for debugging grammars, and tracing
derivations. However, due to the highly declarative
nature of SSL, it is not restricted to a Prolog
implementation. Implementations in other languages would
differ externally only in the syntax of the procedure calls
that may appear in each rule. Use of the system is
described in detail in (Popowich, 1985).
The current implementation converts the grammar as
specified by the rules and metarules into Prolog clauses.
This conversion can be examined in terms of how rules
are processed, and how the schemata and transformations
are processed.
4.1 Rule Processing 
The syntactic component of the rule processor is based
on Clocksin and Mellish's definite clause grammar
processor (Clocksin and Mellish, 1981) which has been
implemented in C-Prolog. For a DCG rule, each
nonterminal is converted into a Prolog predicate, with two
additional arguments, that can be processed by a top-down
parser. These two arguments correspond to the list to be
parsed, and the remainder of the list after the predicate
has parsed the desired category. With the addition of
semantics to each rule, another argument is required to
represent the semantic interpretation of the current
symbol. Thus, whenever a left quoted category name, x',
appears in the semantics of the rule, it is replaced by a
variable bound to the semantic argument of the
corresponding symbol, x, in the rule. The semantic
expression is then evaluated by the eval routine with the
result bound to the semantic argument of the nonterminal
on the left hand side of the production.
4UNIX is a trademark of Bell Laboratories.
For example, the sentence rule:
(4.1) add s(decl) -->
          np(nom,Numb),
          vp(Numb) with agreement(Numb)
          : [ np' & vp' ]
will result in a Prolog expression of the form:
(4.2) s(SemS,decl,_1,_3) :-
          np(SemNP,nom,Numb,_1,_2),
          vp(SemVP,Numb,_2,_3),
          agreement(Numb),
          eval([SemNP & SemVP],SemS).
Consequently, to process the sentence John runs, one
would try to satisfy:
(4.3) :- s(Sem, Type, ['John',runs], []).
The first argument returns the interpretation, the second
argument returns the type of sentence, the third is the
initial input list, and the final argument corresponds to
the list remaining after finding a sentence. Any rule R,
that is preceded by add will have the axiom rule(R)
inserted into the database. These axioms are used by the
transformations during pattern matching.
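The argument-threading scheme can be imitated outside Prolog. In the sketch below each nonterminal is a generator that takes the remaining input and yields (semantics, rest of input) pairs, which plays the role of the semantic argument and the two list arguments. The tiny lexicon, the function names, and the omission of agreement checks and unique integers are all assumptions made for illustration.

```python
def np(tokens):
    """np --> proper noun; yields (semantics, remaining tokens)."""
    if tokens and tokens[0] in ("John", "Mary"):  # hypothetical lexicon
        yield tokens[0], tokens[1:]


def vp(tokens):
    """vp --> intransitive verb; yields (semantics, remaining tokens)."""
    if tokens and tokens[0] in ("runs", "walks"):  # hypothetical lexicon
        yield tokens[0], tokens[1:]


def s(tokens):
    """s --> np, vp : [np' & vp'], threading the input through both parts."""
    for sem_np, after_np in np(tokens):
        for sem_vp, after_vp in vp(after_np):
            yield [sem_np, "&", sem_vp], after_vp


# A complete parse is one whose remaining input is empty.
print([sem for sem, rest in s(["John", "runs"]) if rest == []])
```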
The eval routine processes the suffix symbols, # and
##, along with the lambda expressions, and may perform
some reorganisation of the given expression before
returning a new semantic form. For each expression of
the form name#, a unique integer N is created and
name-N is returned. With "##", the procedure is the
same except that the first occurrence of "##" will generate
a unique integer that will be saved for all subsequent
occurrences. To evaluate an expression of the form:
(4.4) ( expr_i lmda expr_j & X )
every subexpression of expr_j is recursively searched for an
occurrence of expr_i, which is then replaced by X.
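The treatment of the two suffixes can be sketched as a single walk over a nested-list expression. The representation, the function name eval_suffixes, and the rendering of the result as the name followed directly by the integer (as in John1) are assumptions of this example; lambda expressions are not handled here.

```python
import itertools


def eval_suffixes(expr, counter):
    """Replace name# by a fresh integer and name## by one shared integer."""
    shared = []  # holds the integer chosen at the first "##", once generated

    def walk(item):
        if isinstance(item, list):
            return [walk(part) for part in item]
        if item.endswith("##"):
            if not shared:
                shared.append(next(counter))
            return item[:-2] + str(shared[0])
        if item.endswith("#"):
            return item[:-1] + str(next(counter))
        return item

    return walk(expr)


expression = ["x##", "&", "John#", "&", ["x##", "&", "John#"]]
print(eval_suffixes(expression, itertools.count(1)))
```

Both occurrences of x## receive the same integer, while each John# receives its own.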
Left recursion is removed with the aid of a gap
predicate identical to the one defined to process gapping
grammars (Dahl and Abramson, 1984) and unrestricted
gapping grammars (Popowich, forthcoming). For any rule
of the form:
(4.5) A --> A, B, α
where A does not equal B, the result of the translation is:
(4.6) A(_1,Nn) :- gap(G,_1,_2), B(_2,N0), A(G,[]),
          α1(N0,N1), ..., αn(Nn-1,Nn).
According to (4.6), a phrase is processed by skipping over
a region to find a B, the first non-terminal that does
not equal A. The skipped region is then examined to
ensure that it corresponds to an A before the rest of the
phrase is processed.
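The skipping strategy can be tried on a toy grammar. The sketch below assumes the grammar A --> A, B and A --> [a], with B --> [b]; it guesses a gap, looks for a B after it, and only then checks that the gap is itself a complete A. The grammar and function names are illustrative assumptions, and the recursion terminates because the gap is always shorter than the input.

```python
def parse_b(tokens):
    """B --> [b]; yields the input remaining after a B."""
    if tokens and tokens[0] == "b":
        yield tokens[1:]


def parse_a(tokens):
    """A --> [a] and A --> A, B, with the left recursion handled by a gap."""
    # Non-recursive alternative: A --> [a]
    if tokens and tokens[0] == "a":
        yield tokens[1:]
    # Left recursive alternative: skip a non-empty gap, then find a B.
    for i in range(1, len(tokens)):
        gap, rest = tokens[:i], tokens[i:]
        for after_b in parse_b(rest):
            # The skipped region must parse as an A with nothing left over.
            if any(remainder == [] for remainder in parse_a(gap)):
                yield after_b


print([] in list(parse_a(["a", "b", "b"])))  # a complete parse exists
print([] in list(parse_a(["b", "a"])))       # no parse
```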
4.2 Schema Processing 
To process the metarule control structures used by
schemata, a fail predicate is inserted to force Prolog to try
all possible alternatives. The simple recursive definition
of forall X in List:
(4.7) forall(X in [], Body).
      forall(X in [Y|Rest], Body) :-
          (X=Y, call1(Body), fail) ;
          forall(X in Rest, Body).
uses fail to undo the binding of Y, the first element of
the list, to X before calling forall with the remainder of
the list. The predicate call1 is used to evaluate Body
since it will prevent the fail predicate from causing
backtracking into Body.
4.3 Transformation Processing 
Execution of transformations requires the most
complex processing of all of the metagrammatical
operations. This processing can be divided into the three
stages of transformation creation, pattern matching, and rule
creation.5
During the transformation creation phase, the predicate
trans(M,X,Y) is created for the metarule, M. This
predicate will transform a list of elements, X, into
another list, Y, according to the syntax specification of the
metarule. Elements that IT-match will be represented by
the same free variable in both lists. This binding will be
one to one, since an element cannot match with more than
one element on the other side. Symbols that appear on
only one side will not have their free variable appearing
on the opposite side. Expressions in braces are ignored
during this stage. If a transformation like:
(4.8) a --> b, c, X ==> a@foo --> b, X, c(foo)
appears, then a predicate of the form:
(4.9) trans(M, [_1,_2,_3,X], [_1,_2,X,_4])
will be created. Notice that the appearance of a modifier
does not cause a@foo to be distinguished from a, since all
modifiers are removed before the pattern-template match is
attempted. However, c and c(foo) are considered to be
different symbols. M is a unique integer associated with
the transformation.
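The construction of the two lists can be sketched as follows. Symbols are plain strings, free variables are the capitalised names, IT-matching is approximated by string equality after stripping an external modifier, and numbered slots stand in for the anonymous Prolog variables. These choices, the function names, and the lack of support for internal modifiers are assumptions of the example.

```python
def strip_modifier(symbol):
    """Remove an external modifier such as the @foo in a@foo."""
    return symbol.split("@", 1)[0]


def is_free_variable(symbol):
    return symbol[0].isupper()


def make_trans(pattern, template):
    """Build the two slot lists of a trans predicate for a transformation."""
    counter = 0
    left, right = [], []
    slots = {}  # pattern symbol -> slot numbers not yet claimed by the template
    for symbol in pattern:
        if is_free_variable(symbol):
            left.append(symbol)
        else:
            counter += 1
            left.append(counter)
            slots.setdefault(symbol, []).append(counter)
    for symbol in template:
        base = strip_modifier(symbol)
        if is_free_variable(base):
            right.append(base)
        elif slots.get(base):
            right.append(slots[base].pop(0))  # shared slot: the symbols IT-match
        else:
            counter += 1
            right.append(counter)             # symbol appears on this side only
    return left, right


print(make_trans(["a", "b", "c", "X"], ["a@foo", "b", "X", "c(foo)"]))
```

For the transformation of (4.8), the two lists correspond to the arguments shown in (4.9).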
The pattern match phase determines if a rule matches
the pattern, and produces a list for each successful match
which will be transformed by the trans predicate. Each
element of the list is either one of the matched symbols
from the rule, or a list of symbols corresponding to the
don't care region of the pattern. Any predicates that
appear in braces in the pattern are evaluated during the
pattern match.
5(Popowich, forthcoming) examines a method of transformation
processing that uses the transformations during the parse, instead of using
them to generate new rules.
Consider the operation of an active-passive
verb phrase transformation:
(4.10) vp(active,Numb) -->
           v(Numb,R,Type,SType)
               with (X, trans in Type, Y),
           np, Z
           : <[? np']
       ==>
       vp(pass,Numb) -->
           v(Numb,be,T,S)-1 with aux in T,
           v(Numb@pastpart,R,Type,SType)
               with (X, trans in Type, Y),
           Z, pp(by,_)
           : x## lmda [pp(by)' & <[? x##]]
on the following verb phrase:
(4.11) vp(active,Numb) -->
           v(Numb,R,Type,_) with trans in Type,
           np([x,A,x],_,_)
           : <[ v' & np' ] .
The list produced by the pattern match would resemble:
(4.12) [ vp(active,Numb),
         v(Numb,R,Type,_) with [[],trans in Type,[]],
         np([x,A,x],_,_),
         [] ]
Notice that there was nothing in the rule to bind with X,
Y or Z. Consequently, these variables were assigned the
null list, []. The pattern match of the semantics of the
rule will result in an expression which lambda abstracts
np' out of the semantics:
(4.13) <[ np' lmda <[ v' & np' ] ]
Finally, the rule creation phase applies the
transformation to the list produced by the pattern match,
and then uses the new list and the template to obtain a
new rule. This phase includes conversion of the new list
back into rule form, the application of modifiers, and the
addition of any extra symbols that appear on the right
hand side only. To continue with our example, the trans
predicate associated with (4.10) would be:
(4.14) trans(N, [_1,_2,_3,Z], [_3,_4,_2,Z,_5])
Notice that the two vp's on opposite sides of the metarule
do not match. So the transformed list would resemble:
(4.15) [ _3,
         _4,
         v(Numb,R,Type,_) with [[],trans in Type,[]],
         [], _5 ]
The rule generated by the rule creation phase would be:
(4.16) vp(pass,Numb) -->
           v(Numb,be,T,S)-1 with aux in T,
           v(pastpart,R,Type,_) with trans in Type,
           pp(by,_)
           : x## lmda [ pp(by)' & <[ v' & x## ] ]
Notice that the expression "<[ v' & x## ]", which is
contained in the semantics of (4.16), was obtained by the
application of (4.13) to x##.
5. APPLICATIONS 
To examine the usefulness of this type of grammar
specification, as well as the adequacy of the
implementation, a grammar was developed that uses the
domain of the Automated Academic Advisor (AAA)
(Cercone et al., 1984). The AAA is an interactive
information system under development at Simon Fraser
University. It is intended to act as an aid in "curriculum
planning and management", that accepts natural language
queries and generates the appropriate responses. Routines
for performing some morphological analysis, and for
retrieving lexical information were also provided.
The SSL grammar allows questions to be posed,
permits some possessive forms, and allows auxiliaries to
appear in the sentences. From the base of twenty-six
rules, eighty additional rules were produced by three
metarules in about eighty-five seconds. Ten more rules
were needed to link the lexicon and the grammar. A
selection of the rules and metarules appears in Figure 5-1.
The complete grammar and lexicon is provided in
(Popowich, 1985).
In the interpretations of some sample sentences, which
can be found in Figure 5-2, some liberties are taken with
the semantic notation. Variables of the form wN, where
N is any integer, represent entities that are to be
instantiated from some database. Thus, any interpretation
containing wN will be a question. Possessives, like John's
table, are represented as:
(5.1) <table & [John poss table]>
Although multiple possessives which associate from left to
right are allowed, group possessives as seen in:
(5.2) the man who passed the course's book
and in phrases like:
(5.3) John's driver's licence
cannot be interpreted correctly by the grammar.
Inverted sentences are preceded by the word Query in the
output. Also, proper nouns are assumed to unambiguously
refer to some object, and thus are no longer followed by
a unique integer. Analysis times for obtaining an
interpretation are given in CPU seconds. The total time
includes the time spent looking for all other possible
parses.
/* Case is described by a mask, [N,A,G], with free variables for Nom., Acc. and Gen. */
add vp(active,Numb) --> v(Numb, Root, T, _) with (Root in [pass,give,teach,offer], indobj in T, trans in T),
        np([x,D,x],_,_), np([x,A,x],_,_)-1 : <[ v' & np' & np-1' ]
/* WH-questions in inverted sentences */
eval(w#, Var), NP = np(Case,Numb,Feat)
    , ( NP@NP --> [], {agreement(Case)} : Var )
    , ( s(inv) --> np([x,A,x],Numb,Feat) with Clword in Feat, s(inv)@np([x,A,x],Numb,Feat)
        : <[ (Var lmda s') & np' ] ).
/* passive transformation */
add vp(active,Numb) --> v(Numb,R,Type,Subtype) with (X, trans in Type, Y), np, Z : <[? np']
    ==> vp(pass,Numb) --> v(Numb,be,T,S)-1 with aux in T,
        v(Numb@pastpart, R, Type, Subtype) with (X, trans in Type, Y),
        Z, optional(pp(by,_)) : x## lmda [ optional' & <[? x##] ] .
/* sentence inversion */
add vp(T,[M|N]) --> v([M|N],R,Type,S) with (X, aux in Type, Y), Z : Sem
    ==> s(inv) --> v([M|N],R,Type,S) with (X, aux in Type, Y), np([N1,x,x],[M|N],_), Z : [np' & Sem].
/* metarule for the propagation of "holes" in the "slash" categories */
forall Hole in [pp(Prep,Feat),np(Case,Numb,Feat)]
    , ( forall Cat1 in [s(Type),vp,pp(Prep,Feat),optional]
    , ( forall Cat2 in [vp,pp(Prep,Feat),np(Case,Numb,Feat),optional]
    , ( Cat1 --> X, Cat2, Y : Sem ==> Cat1@Hole --> X, Cat2@Hole, Y : Sem ) ) ) .
Figure 5-1: Excerpt from Grammar

Sentence:  did Fred take cmpt101.
Query:     [Fred takes cmpt101]
Analysis:  2.25 sec. Total: 4.28334 sec.

Sentence:  who wants to teach Fred's professor's course.
Semantics: [ <w1 & [w1 animate]>
             want4
             [ <w1 & [w1 animate]>
               teach13
               <course14 & [ <professor15 & [Fred poss professor15]> poss course14]>
             ]
           ]
Analysis:  6.58337 sec. Total: 18.9834 sec.

Sentence:  whose course does the student whom John likes want to be taking.
Query:     [ <<the38 student39> & [John like45 <the38 student39>]>
             want46
             [ <<the38 student39> & [John like45 <the38 student39>]>
               take56
               <course29 & [<w30 & [w30 animate]> poss course29]>
             ]
           ]
Analysis:  21.9999 sec. Total: 39.4 sec.

Sentence:  to whom does the professor want which paper to be given.
Query:     [ <the14 professor15>
             want17
             [ x39 give35 <w7 & [w7 animate]> <w21 & [w21 paper22]> ]
           ]
Analysis:  14.3167 sec. Total: 29.5167 sec.
Figure 5-2: Summary of Test Results

Results obtained with SAUMER compare favourably to
those obtained from the ProGram system (Evans and
Gazdar, 1984). ProGram operates on grammars defined
according to the current GPSG formalism (Gazdar and
Pullum, 1982), but was not developed with efficiency as a
major consideration. The grammar used with ProGram,
which is given in (Popowich, 1985), is similar to the AAA
grammar used by SAUMER, except that it has a much
smaller lexicon, and allows neither relative clauses nor
possessive forms. Running on the same machine as
SAUMER, ProGram required about 35 seconds to parse the
sentence does John take cmpt101, with a total processing
time of about 140 seconds. SAUMER required just over 2
seconds to parse this phrase, and had a total processing
time of about 4 seconds.
As it stands, the semantic notation used by SAUMER
does not contain much of the relevant information that
would be required by a real system. Tense, number and
adverbial information, including concepts like location and
time, would be required in the AAA. If the SSL
description were to be extended, with the resulting system
behaving as a natural language interface of the AAA, a
more database directed semantic notation would prove
invaluable.
6. PRESENT LIMITATIONS
Although this application of metarules allows succinct 
descriptions of a grammar, several problems have been 
observed. 
Since each metarule is applied to the rule base only
once, the order of the metarules is very important. In
our sample grammar, the passive verb phrases were
generated before the sentence inversion transformation was
processed, and then the slash category propagation
transformations were executed. For the current
implementation, if a rule generated by transformation T1
is to be subjected to transformation T2, then T1 must
appear before T2. Moreover, no rule that is the result of
T2 can be operated on by T1. It would be preferable to
remove this restriction and impose one that is less severe,
such as the finite closure restriction which is described in
(Thompson, 1982) and used by ProGram. With this
improvement, the only restriction would be that a
transformation could only be applied once in the
derivation of a rule.
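
The difference between the two regimes can be illustrated with a small Python sketch. It is not the SAUMER or ProGram implementation: the rule encoding (a left-hand side plus a tuple of right-hand-side symbols) and the two toy metarules `passive` and `slash_pp` are hypothetical stand-ins chosen only to make the ordering effect visible.

```python
from typing import Callable, List, Optional, Tuple

Rule = Tuple[str, Tuple[str, ...]]          # (lhs, rhs symbols); illustrative encoding
Metarule = Callable[[Rule], Optional[Rule]]

def passive(rule: Rule) -> Optional[Rule]:
    """Hypothetical passive metarule: only applies to plain 'vp' rules with an np."""
    lhs, rhs = rule
    if lhs == "vp" and "np" in rhs:
        return ("vp_pass", ("be",) + tuple(s for s in rhs if s != "np"))
    return None

def slash_pp(rule: Rule) -> Optional[Rule]:
    """Hypothetical slash metarule: removes a pp and marks the lhs as lhs/pp."""
    lhs, rhs = rule
    if "pp" in rhs and "/" not in lhs:
        return (lhs + "/pp", tuple(s for s in rhs if s != "pp"))
    return None

def apply_in_order(rules: List[Rule], metarules: List[Metarule]) -> List[Rule]:
    """Each metarule is applied exactly once, to the rule base as it stands."""
    base = list(rules)
    for m in metarules:
        for r in list(base):                # snapshot of the current rule base
            new = m(r)
            if new is not None and new not in base:
                base.append(new)
    return base

def finite_closure(rules: List[Rule], metarules: List[Metarule]) -> List[Rule]:
    """Any metarule may apply, but at most once in the derivation of a rule."""
    result = list(rules)
    agenda = [(r, frozenset()) for r in rules]   # (rule, metarules already used)
    while agenda:
        rule, used = agenda.pop()
        for i, m in enumerate(metarules):
            if i in used:
                continue
            new = m(rule)
            if new is None:
                continue
            agenda.append((new, used | {i}))
            if new not in result:
                result.append(new)
    return result

base = [("vp", ("v", "np", "pp"))]
good_order = apply_in_order(base, [passive, slash_pp])
bad_order = apply_in_order(base, [slash_pp, passive])
closure = finite_closure(base, [slash_pp, passive])
```

With `passive` listed first, the single pass yields the slashed passive rule; with the order reversed it is lost, because `passive` output is never offered to `slash_pp`. Under the finite closure regime the listing order of the metarules no longer affects the resulting rule set.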
The system cannot currently process rules expressed
in the Immediate Dominance/Linear Precedence (ID/LP)
format (Gazdar and Pullum, 1982). With this format, a
production rule is expressed with an unordered right hand
side with the ordering determined by a separate
declaration of linear precedence. For example, a passive
verb phrase rule could appear something like:
(6.1) vp(pass,[MIN]) -->
          v([MIN], be, _, _),
          v(_, Root, Type, _) with
              (Root in [pass,carry,give],
               indobj in Type,
               trans in Type),
          pp(to),
          optional(pp(by))
          : x## lmda
              [optional" & <[v" & pp(to)" & x##]]
with the components having a linear precedence of: 
(6.2) v(_,be) < v < pp
The result would be that the pp(by) could appear before
or after the pp(to), since there is no restriction on their
relative positions. If this format were implemented, only
one passive metarule would have to be explicitly stated.
The direct processing of ID/LP grammars is discussed in
(Shieber, 1982), (Evans and Gazdar, 1984), and (Popowich,
forthcoming).
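
As a rough illustration of how a linear precedence declaration like (6.2) constrains an unordered right hand side, the following Python sketch enumerates the admissible orderings of the constituents of (6.1). It is only a brute-force sketch under stated assumptions, not one of the direct parsing methods cited above: the function name `lp_orderings`, the category labels, and the treatment of the precedence declaration as a single chain are all hypothetical, the optionality of pp(by) is ignored, and every constituent's category is assumed to appear in the chain.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

def lp_orderings(items: Sequence[Tuple[str, str]],
                 precedence_chain: Sequence[str]) -> List[List[str]]:
    """Return every ordering of `items` consistent with the precedence chain.

    items            -- (constituent name, category) pairs of the unordered rhs
    precedence_chain -- categories listed from earliest to latest, e.g.
                        ["v_be", "v", "pp"] for  v(_,be) < v < pp
    """
    rank = {cat: i for i, cat in enumerate(precedence_chain)}
    admissible = []
    for perm in permutations(items):
        ranks = [rank[cat] for _, cat in perm]
        # An ordering is admissible when the ranks never decrease, so that
        # no constituent precedes one of a category that must come earlier.
        if all(a <= b for a, b in zip(ranks, ranks[1:])):
            admissible.append([name for name, _ in perm])
    return admissible

# Constituents of rule (6.1), each paired with its category label.
rhs = [("v(be)", "v_be"), ("v", "v"), ("pp(to)", "pp"), ("pp(by)", "pp")]
orderings = lp_orderings(rhs, ["v_be", "v", "pp"])
for o in orderings:
    print(" ".join(o))
```

Only two orderings survive, differing in the relative position of pp(to) and pp(by), which is the freedom described in the text. Enumerating permutations is exponential in the length of the right hand side, so this is suitable for illustration only.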
7. CONCLUSIONS 
SSL appears to adequately capture the flavour of 
GPSG descriptions while allowing more procedural control. 
Investigation into a relationship between SSL and GPSG 
grammars could result in a method for translating GPSG 
grammars into SSL for execution by SAUMER. Further 
research could also provide a relationship between SSL and 
other grammar formalisms, such as lexical-functional
grammars (Kaplan and Bresnan, 1982). The Prolog
implementation of SAUMER, allowing left recursion in
rules, should facilitate a more detailed study of the
specification language, and of some problems associated
with metarule specifications. Due to the easy separability
of the semantic rules, one could attempt to introduce a
more database oriented semantic notation and develop an
interface to a real database. One could then examine
system behaviour with a larger rule base and more
involved transformations in an applications environment
like that of the AAA. However, as is apparent from the
application presented here and from preliminary
experimentation (Popowich, 1984) (Popowich, 1985),
further investigation of the efficient operation of this
Prolog implementation with large grammars will be
required.
ACKNOWLEDGEMENTS 
I would like to thank Nick Cercone for reading an
earlier version of this paper and providing some useful
suggestions. The comments of the referees were also
helpful. Facilities for this research were provided by the
Laboratory for Computer and Communications Research.
This work was supported by the Natural Sciences and
Engineering Research Council of Canada under Operating
Grant no. A4309, Installation Grant no. SMI-74 and
Postgraduate Scholarship #800.
REFERENCES 
Cercone, N., Hadley, R., Martin, F., McFetridge, P. and
Strzalkowski, T. Designing and automating the
quality assessment of a knowledge-based system: the
initial automated academic advisor experience, pages
193-205. IEEE Principles of Knowledge-Based Systems
Proceedings, Denver, Colorado, 1984.
Clocksin, W.F. and Mellish, C.S. Programming in Prolog.
Berlin-Heidelberg-New York: Springer-Verlag, 1981.
Dahl, V. and Abramson, H. On Gapping Grammars.
Proceedings of the Second International Joint Conference
on Logic, University of Uppsala, Sweden, 1984.
Evans, R. and Gazdar, G. The ProGram Manual.
Cognitive Science Programme, University of Sussex,
1984.
Fawcett, B. personal communication. Dept. of
Computing Science, University of Toronto, 1984.
Gawron, J.M. et al. Processing English with a
Generalized Phrase Structure Grammar, pages 74-81.
Proceedings of the 20th Annual Meeting of the
Association for Computational Linguistics, June, 1982.
Gazdar, G. Phrase Structure Grammar. In P. Jacobson
and G.K. Pullum (Ed.), The Nature of Syntactic
Representation, D. Reidel, Dordrecht, 1981.
Gazdar, G. and Pullum, G.K. Generalized Phrase
Structure Grammar: A Theoretical Synopsis.
Technical Report, Indiana University Linguistics Club,
Bloomington, Indiana, August 1982.
Kaplan, R. and Bresnan, J. Lexical-Functional Grammar:
A Formal System for Grammatical Representation. In
J. Bresnan (Ed.), Mental Representation of
Grammatical Relations, MIT Press, 1982.
Pereira, F.C.N. (ed). C-Prolog User's Manual. Technical
Report, SRI International, Menlo Park, California, 1984.
Pereira, F.C.N. and Warren, D.H.D. Definite Clause
Grammars for Language Analysis. Artificial
Intelligence, 1980, 13, 231-278.
Popowich, F. SAUMER: Sentence Analysis Using
METaRULEs (Preliminary Report). Technical
Report TR-84-10 and LCCR TR-84-2, Department of
Computing Science, Simon Fraser University, August
1984.
Popowich, F. The SAUMER User's Manual. Technical
Report TR-85-3 and LCCR TR-85-4, Department of
Computing Science, Simon Fraser University, 1985.
Popowich, F. Effective Implementation and Application
of Unrestricted Gapping Grammars. Master's thesis,
Department of Computing Science, Simon Fraser
University, forthcoming.
Radford, A. Transformational Syntax. Cambridge
University Press, 1981.
Schubert, L.K. and Pelletier, F.J. From English to Logic:
Context-Free Computation of "Conventional" Logical
Translation. American Journal of Computational
Linguistics, January-March 1982, 8(1), 26-44.
Shieber, S.M. Direct Parsing of ID/LP Grammar.
draft, 1982.
Thompson, H. Handling Metarules in a Parser for
GPSG. Technical Report D.A.I. No. 175, Department
of Artificial Intelligence, University of Edinburgh,
1982.
