QPATR and Constraint Threading 
James Kllbury 
Seminar ffir Allgemelne Sprachwissenschaft 
Unlverslt~t Diisseldorf, Unlversltiitsstr. 1 
D-4000 Diisseldorf 1, Fed. Rep. of Germany 
e-maih KIIbury@DDORUD81.BITNET 
Abstract 
QPATR is an MS-DOS Arity/PROLOG implemen- 
tation of the PATR-II formalism for unification grammar. 
The fbnnalism has been extended to include the constraints 
of LFG as well as negation and disjunction, which are 
implemented with the disjunction and negation-as-failure of 
PROLOG itself. A technique of constraint threading is 
employed to collect negative and constraining conditions in 
PROLOG difference lists. The parser of QPATR uses a 
left-corner algorithm for context-free grammars and 
includes a facility for identifying new lexical items in 
input on the basis of contextual information. 
I Introduction 
QPATR ("Quick PATR") is an MS -DOS 
Arity/PROLOG implementation of the PATR-II formalism 
(of Shieber et al. 1983, Shieber 1986) with certain logical 
extensions. The name was chosen to reflect the fact that 
the prototype system was developed in a short period of 
time but nevertheless runs quickly enough for practical 
use. QPATR was developed at the University of 
Dt~sseldorf within the research project "Simulation of 
Lexical Acquisition", which is funded by the Deutsche 
Forsehtmgsgemeinschaft. 
In contrast to most existing PATR implementations 
such as D-PATR (cf Karttunan 1986a, 1986b), QPATR 
runs under MS-DOS and thus makes minimal hardware 
demands. Like ProP (of Carpenter 1989) QPATR is 
implemented in PROLOG but uses both the negation and 
disjunction of PROLOG in the extended PATR formalism; 
moreover, it employs a left-comer parser with a "linking" 
relation and PROI/3G baclctracking rather than a pure 
bottom-up chart parser. 
The system comprises the following components: (1) 
grammar compiler, (2) unification, (3) left-comer parser, 
(4) lexieal look-up, (5) input/output, (6) testing off-line 
input, and (7) tracing. The grammar compiler (1) 
transforms syntax rules and lexical entries from their 
external notation to an internal form; at the same time 
partial feature structure matrices (FSMs) are constructed 
and the linking relation (see below) is constructed. The 
unification package (2) uses techniques introduced by 
Eisele and D5rre (1986) and described by Gazdar and 
Mellish (1989) to implement the unification of FSMs with 
the term unification of PROLOG. A facility of prediction 
is included in the input/output package that allows new 
lexicai items in input to be identified on the basis of 
contextual information. While QPATR uses a full-foma 
lexicon at present, a package for morphological analysis is 
being developed. 
Since QPATR is distributed in a compiled version, 
knowledge of PROLOG is only needed in order to write 
macros (see below) but not to write grammars or to rttrl 
the system. Thus, QPATR can also be used in instruction 
with students who have no background in PROLOG 
programming. 
2 Descriptions of FSMs 
The formalism of PATR-H has been adopted for 
QPATR and will not be inuoduced here. As presented by 
Shieber (1986: 21) rules consist of a context-flee skeleton 
introducing variables for FSMs and a conjunction of path 
equations that describe the FSMs, e.g.: 
Xo --> Xt X2 
~o cat> = S 
<X, cat> =np 
<X~ cat> = vp 
<Xo head> = <X, head> 
<Xo head subject> = <X, head> 
where cat, head, and subject are attributes. Such path 
equations are written with "*=" in QPATR, which is 
implemented with the nonaal ("destructive") PROLOG 
unification. Furthermore, QPATR provides for pseudo- 
constraints written with "*==" in the path equations, which 
capture the expressiveness of constraining schemata in 
LFG (of Kaplan/Bresnan 1982: 213) and allow the 
grammar writer to specify that some attribute must ,ugt 
receive a value unifiable with the indicated value. These 
are implemented with the "==" unification of PROLOG. 
FSMs are described in QPATR with a logic 
generally based on that developed by Kasper and Rounds 
(1986). The presentation of the logical description language 
here is parallel to that of Carpenter (1989). 
Atomic well-formed formulas (wffs) of this logic 
consist of the two types of equations just introduced as 
well as macro heads (see below); heads of macros defined 
in terms of constraints are prefixed with the operator "@" 
in atomic wffs. Equations contain two designators, which 
are atoms or FSM variables, implemented with PROLOG 
atoms and variables, respectively, or else paflm. The latter 
are defined recursively and may contain atoms or paths as 
attribute expressions. The evaluation of emtwxlded paths 
must yield an atom. 
All derived wffs of the logic are built from atomic 
descriptions with conjunction ",", disjunction ";", and 
negation "not"; parentheses may be simplified in the 
customary manner. Disjunction and negation are not 
directly reflected in the FSMs generated in QPATR. 
Disjunctions are implemented with PROLOG backtracking, 
wtfile negations are treated like pseudo-constraints, which 
are executed as tests after the complete FSM of an input 
phrase has been constructed by the parser. The "negation" 
employed here is thus the negation-as.failure of PROLOG. 
FSMs themselves are represented internally as a 
PROLOG list of feature-value pairs with a variable 
382 
remainder list (ef Eisele/D0rre 1986: 551; Oazdar/MeUish 
1989: 228). Since FSMs are described rather than directly 
represented in the grammar and lexicon, these internal 
PROLOG representations normally are neither constructed 
nor seen by the user. 
The syntax of the logical description language is 
defined here in Backus-Naur form: 
well-formed formula 
<wff> :::= <awff> I 
'(' <wff> ',' <wff> 5'1 
'(' <wff> ';' <wff> ')' I 
'(' 'not' <wff> ')' 
conjunction 
disjunction 
negation 
atomic wff 
<awff> ::-'= <deser> I 
<cdescr> I 
<macro-head> I see below 
'(' '@' <macro-head> ')' constraining macro 
FSM description 
<descr> :::= '(' <desig> '*=' <desig> ')' 
constraining FSM description 
<cdescra ::= '(' <desig> '*==' <desig> ')' 
designator 
<desig> ::= <atom> I <fsm-variable> I <path> 
path 
<path> ::= <fsm-variable> '/' <attr-exprs> 
attribute expressions 
<a~-exprs> ::= <attr-expr> I <attr-expr> '/' <attr-exprs> 
attribute expression 
<attr-expr> ::= <atom> I '{' <path> '}' 
3 Maer~ 
Macros (or templates; cf Shieber 1986: 51) may be 
employed in QPATR to reduced redundancy in syntax 
rules and lexical entries and thereby to capture 
generalizations. In the present version of QPATR macros 
are defined as conjunctions of other macros and FSM 
descriptions with "*=" and "*=="; they may not contain 
disjunctions or negations. Furthermore, macros may not be 
defined reeursively as this would lead to nonterminating 
loops. 
Since macros are ultimately defined in terms of FSM 
descriptions with "*=" and "*==", which themselves are 
implemented as executable PROLOG goals, macros are 
represented in the present QPATR version simply as 
PROLOG inference rules with a head consisting of the 
macro name as its predicate and the variables for FSMs 
referred to as its arguments. This is the only part of the 
system that requires elementary PROLOG programming in 
order to write grammars in the formalism. 
A special representation language for the definition 
of macros is being developed and will be included in new 
versions of QPATR. 
4 Rules and Lexlcal Entries 
Syntax rules are indexed with an hlteger which is 
used by the linking relation constructed during compilation 
of the grammar into its intea:nal form (see below). The 
mtmbering of rules is arbitrary and need not be 
consecutive or ordered. 
Category descriptions are macro heads. In principle, 
a single dummy macro name cat can be used for all 
categories so that all information about the FSMs 
contained in a rule is put in the description wff of the 
right-hand side; however, the linking relation would then 
lose its value for the parser. In order to modularirz the 
grammatical description, the wffs of rules and entries may 
be defined exclusively in terms of macros. 
The syntax of rules and lexical entries is defined as 
follows: 
lille 
<rule> ::= <integer> '#' <cat> '--->' <rhs> '.' 
right-hand side 
<rhs> ::= <cats> I <cats> '::' <wff> 
categories 
<eats> ::= <eat> I <cat> ',' <cats> 
category 
<cat> ::= <macro-head> 
lexical entry 
entry ::= <atom> 'lex' <khs> '.' 
lexieal right-hand side 
<lrhs> ::= <cat> I <cat> '::' <wff> 
Orthographic word forms are represented as 
PROLOG atoms. 
5 Constraint Threading 
By convention, the wffs of rules and lexical entries 
are written in conjunctive normal form as a list of atomic 
wffs, disjunctions, and negations. When a rule or entry is 
compiled the list representing its wff is sorted into lists of 
atomic wffs (except constraints), disjunctions, and 
constraints (including negations) whose members are 
executed as PROLOG goals before, during, and after 
parsing, respectively. The execution of the atomic wffs 
without constraints builds partial FSMs which contribute to 
the information encoded in the linking relation (see below). 
In their compiled form rules and entries thus contain 
partial FSMs associated with lists of disjunctions and 
negations that apply to them. 
Disjunctions are executed during parsing and make 
use of the normal backtracking mechanism of PROI.£K\] 
while constraints and negations are executed after parsing 
to test whether a FSM in fact fulf'dls all conditions of the 
original wff. During parsing the constraints and negations 
contributing to the complete description of the FSM 
associated with the input must be collected. In order to 
accomplish this a technique of constraint threading is 
introduced based on the difference lists used by Pereira 
383 
and Shieber (1987) for gap threading. The PROLOG term 
associated with a syntactic constituent contains difference 
lists of constraints associated with the constituent before 
and after it has been parsed. The first difference list for an 
entire input phrase is the empty list, whi!e the second is 
instantiated with the complete list of constraints and 
negations after parsing is completed. 
A complication arises from the fact that constraints 
and negations may be embedded in disjunctions and that 
their execution must be deferred. This can be dealt with 
by "percolating" such embedded constraints up into rite 
difference lists for constraint threading when the 
disjunction is solved. The following program implements 
the execution of disjunctions during parsing: 
% solve disjunctions( 
% <disjunctions>,<constrah~ts0>,<constralnts>) 
solve_disjunctions(\[\], C, C). 
solve disjunctions(\[DIDs\], C0, C) :- 
dsolve(D, C0, C1), 
solve_disjunctions(Ds, C1, C). 
dsolve((Wff ; Wffs), CO, C):- 
l, (dsolve(Wff, C0,C) ; dsolve(Wffs,C0,C)). 
dsolve(fWff , Wffs), C0, C) :- 
I, dsolve(Wff, C0,C1), dsolve(Wffs,C1,C). 
dsolve((not Wff), C, \[(not Wff)lC\]) :- I. 
dsolve((@ Wff), C, \[WfflC\]) :- I. 
dsolve(Wff, C, C) :- call(Wff). 
6 The Parser of QPATR 
The parser is based on a left-comer algorithm with 
backtracking for context-free grammars (cf Kilbury 1988 
and Pereira/Shieber 1987: 179fO. The efficiency of the 
parser is improved with top-down filtering in the feral of a 
linking relation (cf Pereira/Shieber 1987: 182). This 
ordinarily is a transitive binary relation over categories 
represented as PROliX\] atoms or terms with atomic 
category labels as functors. The PATR formalism requires 
a modified technique since the syntax rules contain FSMs, 
whose unification is more costly than that of atomic 
category lables. QPATR therefore uses numbered syntax 
rules and then defines the filter with a binary relation over 
the rule indices. If the grammar contains some rules 
i # F~ ---> Fit ..... F~ 
j # Fie ---> F~ ..... F~, 
where the subscripted F's are FSMs, then we have 
dlink(ij) iff F~z subsumes F~0, i.e. if F~ is an immediate 
left corner of F/0. Then link(ij) is the reflexive and 
transitive closure of dlink(ij). 
7 Lexlcal Prediction 
QPATR includes a facility of prediction whereby 
FSMs are proposed for new lexical items encountered in 
input but not contained in the lexicon. Predictions are 
made on tim basis of contextual infomaation coUoeted 
during the analysis of input. A ftmdamental distinction is 
made between open and closed lexical categories, and dds 
inl'ormation must be represented with definitions of 
con'esponding nmcros in the grammar. These definitions 
may refer to semantic as well as syntactic categorial 
information. A prediction is blocked if the proposed FSM 
does not match an open lexical class or if it is described 
by an entry already in the lexicon, but FSMs may be 
constructed tbr new lexieal items having homonyms in the 
lexicon. The definition of open is not used actively to 
propose an FSM but rather passively to test rite 
admissibility of an FSM already constructed from the 
context. 

References 

Carpenter. Bob (1989) Prop Documentation. Computational 
Linguistics Program, Carnegie Mellon University. 

Eisele, Andreas / Dtrre, Jochen (1986) A l.~xieal 
Functional Grammar System in PROLOG, Proceedings of 
COLING-86, 551-3. 

Gazdar, Gerald / Mellish, Chris (1989) Natural Language 
Processing in PROLOG. Wokingham, England et al.: 
Addison-Wesley. 

Kaplan, Ronald M. / Bresnan, Joan (1982) Lexieal- 
Functional Grammar: A System for Grammatical 
Representation, in The Mental Representation of 
Grammatical Relations (Joan Bresnan, ed.). Cambridge, 
Mass. / London: MIT Press. 

Karttunen, Lauri (1986a) D-PATR: A Development 
Environment for Unification-Based Grammars, Proceedings 
of COLING-86, 74-80. 

Karttunen, Lauri (1986b) D-PATR: A Development 
Environment for Unification-Based Grammars (& CSLI 
Report No. 86-61). Stanford, CaliL: CSLI. 

Kasper, Robert T. / Round, William C. (1986) A Logical 
Semantics for Feature Structures, Praceedingsof the 24th 
Annual Conference of the ACL, 235-242. 

Kilbury, James (1988) Parsing with Category Cooceurrence 
Restrictions, Proceedings of COLING.88, 324-327. 

Pereira, Femando C. N. / Shieber, Smart M. (1987) Prolog 
and Natural-Language Analysis (= CSLI Lecture Notes, 
10). Stanford, Calif.: University of Chicago Press. 

Shieber, Smart M. (1986) An Introduction to Unification- 
Based Approaches to Grammar (= CSLI Lecture Notes, 4). 
Stanford, Calif.: University of Chicago Press. 

Shieber. Smart M. et al. (1983) The Structure and 
Implementation of PATR-II, Research on Interactive 
Acquisition and Use of Knowledge, 39-93. Menlo Park, 
Calif.: SRI International 
