PET: PROCESSING ENGLISH TEXT 
P. Oppacher 
Dept. of Computer Science, Concordia University, Montreal, 
Canada 
This paper describes a new parser that combines top
down and bottom up strategies and the natural language
processing system PET which uses the parser. PET is designed to
facilitate the interactive construction of natural language
front ends and to support experiments in computational
linguistics. The system, which has been implemented in UT-LISP,
provides facilities for performing the following tasks:
natural language parsing according to context-free and
transformational grammars; disambiguation of word senses by
pattern-directed inference; construction of a semantic network
data base from English sentences; deductive information
retrieval to answer simple English questions.
Most extant natural language processing systems have 
very complicated control structures and are, therefore, 
difficult to extend modularly. It appears that the lack of 
modularity of these systems is due to the fact that their 
syntactic expertise is not conveniently located in one routine
but, in effect, distributed throughout the entire program.
In the system described here, modularization is achieved
by making the control structure largely transparent, i.e.
by allowing it to reside in the parser.
The method, in a nutshell, is this: the parser outputs
a phrase structure tree, or, if the analyzed sentence is 
structurally ambiguous, a list of several phrase structure 
trees. A tree for a sentence is generated in bottom up fashion 
under the control of context-free rules and restricted
context-sensitive rules. The latter consist of tests and
tree-modifying functions. The context-sensitive tests inspect
the immediate environment of a node about to be entered into
the tree. Depending on the outcome of such tests, the
tree-modifying functions can change partially constructed parse trees.
The interior nodes of the tree are occupied by functions cor- 
responding to transformational and/or semantic rules, and the 
leaf nodes are occupied by the dictionary entries of the words 
in the surface string. Since LISP, unlike other high level 
languages, makes no distinction between programs and data 
structures, the tree generated by the parser can be immediately
executed as a program. The tree, interpreted as a program,
constitutes the control structure referred to above, and governs
the semantic interpretation of the sentence whose structure
it reflects.
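The idea of a parse tree that doubles as a program can be sketched in a few lines. The example below is only an illustration in Python, not PET's UT-LISP code: the toy dictionary, the rule functions `sem_vp` and `sem_s`, and the `evaluate` helper are all hypothetical names introduced here. Interior nodes hold functions, leaves hold words whose dictionary entries are looked up, and evaluating the tree yields a simple semantic triple.

```python
# Hypothetical illustration, not PET's actual code: a parse tree whose
# interior nodes are functions and whose leaves are dictionary entries.

# Toy dictionary (assumed): each entry carries a category and a semantic value.
DICTIONARY = {
    "john": {"cat": "NP", "sem": "john"},
    "mary": {"cat": "NP", "sem": "mary"},
    "loves": {"cat": "V", "sem": "love"},
}


def sem_vp(verb, obj):
    """Assumed semantic rule for VP -> V NP: a predicate awaiting its subject."""
    return lambda subj: (verb["sem"], subj, obj["sem"])


def sem_s(subj, vp):
    """Assumed semantic rule for S -> NP VP: apply the predicate to the subject."""
    return vp(subj["sem"])


def evaluate(tree):
    """Run the tree as a program.

    A leaf (a word) evaluates to its dictionary entry; an interior node
    calls its function on the values of its children.
    """
    if isinstance(tree, str):
        return DICTIONARY[tree]
    rule, *children = tree
    return rule(*(evaluate(child) for child in children))


# The tree for "John loves Mary", written directly as nested tuples.
tree = (sem_s, "john", (sem_vp, "loves", "mary"))
result = evaluate(tree)
print(result)  # ('love', 'john', 'mary')
```

In LISP the same effect comes for free, since the tree is itself a list that can be handed to the evaluator; the explicit `evaluate` function here stands in for that step.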
As can be seen from this very rough description, control 
issues involved in processing natural language text are in- 
deed largely taken care of by the syntactic component. However, 
to eliminate semantically uninterpretable parses and to help 
resolve subtle syntactic ambiguities, the parser must
occasionally communicate with the semantic component.
The parser attempts to eliminate the overhead incurred
by pure top down (TD) or bottom up (BU) algorithms. TD
algorithms may have to do a lot of backtracking because of
wrongly predicted goals. BU algorithms build many temporary
structures which will not figure in the final parse. Both
backtracking and the generation of all possible BU
interpretations can be avoided by suitably combining TD and BU
strategies: TD expectations based on the grammar itself and on
what has been parsed so far guide and constrain the BU search,
while BU results are used at once to refute or confirm TD
expectations.
The parser is described and contrasted with several 
other TD and BU parsers. 
The most notable features of the operation of the parser
are the following. As the sentence is traversed from left
to right, TD expectations are associated with each position
in the sentence. Each word in the sentence is thought of as
lying between a beginning and an ending position, so that the
i-th word lies between positions i-1 and i. Nodes are built
only when they agree with prior expectations and when they 
meet additional context-sensitive tests. 
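The position numbering and the filtering of nodes by expectations can be made concrete with a small sketch. Everything below is an assumption made for illustration: the toy grammar, the lexicon, and the helper names `word_spans`, `expected_at_start` and `admit` do not come from PET. The expectations are approximated by the set of categories that may begin a phrase of the goal category, and the context-sensitive tests are left out.

```python
# Hypothetical sketch: toy grammar, lexicon, and helper names are assumptions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
}
LEXICON = {"the": "Det", "dog": "N", "cat": "N", "sees": "V"}


def word_spans(words):
    """Return (start, end, word) triples.

    Counting words from 1, the i-th word lies between positions i-1 and i.
    """
    return [(i, i + 1, word) for i, word in enumerate(words)]


def expected_at_start(goal):
    """Categories that can begin a phrase of category `goal`.

    This is the closure over the first symbol of each rule, used here as a
    simple stand-in for top down expectations at a position.
    """
    seen, stack = set(), [goal]
    while stack:
        cat = stack.pop()
        if cat in seen:
            continue
        seen.add(cat)
        for rhs in GRAMMAR.get(cat, []):
            stack.append(rhs[0])
    return seen


def admit(category, expectations):
    """A bottom up node is built only if its category is expected."""
    return category in expectations


spans = word_spans("the dog sees the cat".split())
expectations = expected_at_start("S")   # expectations at position 0
first_ok = admit(LEXICON["the"], expectations)    # Det can start an S
verb_ok = admit(LEXICON["sees"], expectations)    # V cannot start an S
print(spans[0], spans[-1], first_ok, verb_ok)
```

With this numbering a phrase is identified by its start and end positions, so a node covering "the dog" spans positions 0 to 2 and can be checked against the expectations stored at position 0.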
At each moment, the parser attempts to keep the forest 
of partial trees as shallow as possible. After the input words 
have been processed from left to right, the roots of the 
trees constructed so far are visited in alternating right to 
left and left to right order. With each pass the height of 
some trees in the parse forest increases until the root for 
the entire sentence is built. 
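The alternating passes over the roots of the forest can be sketched as follows. This is a deliberately simplified illustration under stated assumptions, not the parser itself: the binary rule table `RULES`, the lexicon, and the functions `one_pass` and `parse` are hypothetical, and the sketch omits the TD expectations and context-sensitive tests that guide the real parser. Each pass combines non-overlapping adjacent roots, so every tree grows by at most one level per pass and the forest stays shallow.

```python
# Hypothetical sketch of alternating passes; rules and names are assumptions.
RULES = {("Det", "N"): "NP", ("V", "NP"): "VP", ("NP", "VP"): "S"}
LEXICON = {"the": "Det", "dog": "N", "cat": "N", "sees": "V"}


def one_pass(roots, right_to_left):
    """Combine non-overlapping adjacent roots, scanning in one direction.

    Only roots present at the start of the pass take part, so each tree
    grows by at most one level per pass.
    """
    n = len(roots)
    used = [False] * n
    merged = {}  # index of the left root -> newly built tree
    order = range(n - 2, -1, -1) if right_to_left else range(n - 1)
    for i in order:
        if used[i] or used[i + 1]:
            continue
        parent = RULES.get((roots[i][0], roots[i + 1][0]))
        if parent:
            merged[i] = (parent, roots[i], roots[i + 1])
            used[i] = used[i + 1] = True
    new_roots = []
    for i in range(n):
        if i in merged:
            new_roots.append(merged[i])
        elif not used[i]:
            new_roots.append(roots[i])
    return new_roots


def parse(words):
    """Left to right lexical pass, then alternating passes over the roots."""
    roots = [(LEXICON[word], word) for word in words]
    right_to_left = True  # the first pass over the roots runs right to left
    while len(roots) > 1:
        new_roots = one_pass(roots, right_to_left)
        if len(new_roots) == len(roots):
            break  # no rule applied; a further pass would not help
        roots = new_roots
        right_to_left = not right_to_left
    return roots


forest = parse("the dog sees the cat".split())
print(forest[0][0])  # S
```

For the example sentence the roots go from Det N V Det N to NP V NP after the first pass, to NP VP after the second, and to S after the third.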
