Lexicase Parsing: A Lexicon-driven Approach to Syntactic Analysis
Stanley STAROSTA
University of Hawaii Social Science Research Institute and
Pacific International Center for High Technology Research
Honolulu, Hawaii 96822, U.S.A.
Hirosato NOMURA
NTT Basic Research Laboratories
Musashino-shi, Tokyo, 180, Japan
Abstract
This paper presents a lexicon-based approach to syntactic
analysis, Lexicase, and applies it to a lexicon-driven computational
parsing system. The basic descriptive mechanism in a Lexicase
grammar is lexical features. The properties of lexical items are
represented by contextual and non-contextual features, and
generalizations are expressed as relationships among sets of these
features and among sets of lexical entries. Syntactic tree structures
are represented as networks of pairwise dependency relationships
among the words in a sentence. Possible dependencies are marked as
contextual features on individual lexical items, and Lexicase parsing
is a process of picking out words in a string and attaching dependents
to them in accordance with their contextual features. Lexicase is an
appropriate vehicle for parsing because Lexicase analyses are
monostratal, flat, and relatively non-abstract, and it is well suited to
machine translation because grammatical representations for
corresponding sentences in two languages will be very similar to each
other in structure and inter-constituent relations, and thus far easier
to interconvert.
1. Introduction
There are a number of current frameworks of syntactic analysis
which have been used as the basis for natural language processing.
Many suffer from serious metatheoretical or practical defects,
especially in the areas of power and descriptive adequacy. Several
more recent syntactic frameworks, including Lexical-Functional
Grammar [1], Generalized Phrase Structure Grammar [2], and
Lexicase [3] have begun to take these problems seriously, and to
consider applications to natural language processing. This paper will
be concerned with the application of lexicase grammatical theory to
computer parsing of natural language texts.
The point of view which we will adopt here is a very simple one:
sentences are hierarchically structured strings of words, and
grammar is a statement about the internal composition and external
distributions of words. Proceeding from this basis, it is possible to
construct a formal and explicit grammatical framework of limited
generative power which is capable of stating language-specific and
universal generalizations in a natural way, unhindered by
pretheoretical a priori assumptions about VP's, etc. The framework
so constructed, lexicase [13], [4], [5], turns out to have a significant
potential for application in the processing of natural language [6].
The basic descriptive mechanism in a lexicase grammar is lexical
features. The properties of lexical items are represented by
contextual and non-contextual features, and generalizations are
expressed as relationships among sets of these features. The ways in
which words can combine together are strongly restricted by the
Sisterhead Constraint [3], which states that a word can contract a
grammatical relationship only with the head of a dependent sister
construction, and the One-bar Constraint [op. cit.], which requires
every construction to have at least one lexical head. The result is
syntactic tree representations which are flatter, since there are no
intermediate nodes between lexical entries and their maximal
projections, and more universal, since there are only a very limited
number of ways in which languages can differ in their grammars.
These properties turn out to make lexicase especially well suited to
machine translation, since the grammatical representations for
corresponding sentences in two languages will be very similar to each
other in structure and inter-constituent relations, and thus far easier
to interconvert.
This paper begins with a brief description of the basic structure of
a lexicase grammar, and then describes an algorithm which applies
lexicase principles to sentence parsing. Because of space limitations,
we will not provide a full explication of the whole theory here.
Instead, we will place the primary focus on the ways in which
particular lexicase principles aid in the straightforward and efficient
construction of syntactic tree representations for input sentences.
Section 2 describes the way in which grammatical information can be
presented as a set of generalizations about classes of lexical items
represented in a dependency-type tree format. Section 3 describes the
various types of lexicase features and their respective roles in a
grammar. Section 4 discusses the representation of structural
information about individual sentences in terms of a tree representation,
and sections 5 and 6 present an algorithm showing how the
information provided by a lexicase grammar may be used in parsing.
2. Rules and representations in lexicase theory
Lexicase is part of the generative grammar tradition, with its
name derived from Chomsky's lexicalist hypothesis [7] and Fillmore's
Case Grammar [8]. It has also been strongly influenced by European
grammatical theory, especially the localistic case grammar and
dependency approaches of John Anderson [9] and his recent and
classical predecessors. Like Chomskyan generative grammar, it is an
attempt to provide a psychologically valid description of the linguistic
competence of a native speaker, but it differs from Chomsky's
grammatical framework in power, since it has no transformational
rules, and in generativity, since it requires grammatical rules and
representations to be expressed formally and explicitly and not just
talked about. The rules of lexicase grammar proper are lexical rules,
rules that express relations among lexical items and among features
within lexical entries. There are no rules for constructing or
modifying trees, and trees are generated by the lexicon rather than by
rules: the structural representation of a sentence is any sequence of
words connected by lines in a way which satisfies the contextual
features of all the words and does not violate the Sisterhead or
One-bar Constraints or the conventions for constructing well-formed trees.
A lexicase parsing algorithm, accordingly, is just a mechanism for
linking pairs of words together in a dependency relationship which
satisfies these contextual features and tree-forming conventions.
([11], [12], and [13] are a very similar but independently developed
approach which evolved from the computational rather than the
linguistic direction.)
Figure 1 lists the rule types in a lexicase grammar and their
interrelationships. Redundancy rules supply all predictable features
to lexical entries, which are stored in their maximally reduced forms,
with all predictable features extracted. For example, all pronouns are
necessarily members of the class of nouns, and since the feature [+N]
is thus predictable from the [+prnn] (pronoun) feature, [+N] can be
omitted from pronoun entries in the lexicon and supplied to the entry
by a demon, a lexical Redundancy Rule, during processing.
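The way a Redundancy Rule restores a predictable feature can be illustrated with a minimal Python sketch. The set-of-strings feature encoding, the rule table, and the function name `apply_redundancy_rules` are assumptions made for illustration; only the rule that [+prnn] implies [+N] comes from the text.

```python
# Each rule pairs a set of triggering features with the features it implies.
# Only the pronoun rule is taken from the text; the encoding is an assumption.
REDUNDANCY_RULES = [
    (frozenset({"+prnn"}), frozenset({"+N"})),
]

def apply_redundancy_rules(features, rules=REDUNDANCY_RULES):
    """Return the entry's features plus everything the rules predict."""
    full = set(features)
    changed = True
    while changed:                      # repeat until no rule adds anything new
        changed = False
        for trigger, implied in rules:
            if trigger <= full and not implied <= full:
                full |= implied
                changed = True
    return full

# A pronoun is stored in reduced form, without [+N].
stored_entry = {"+prnn"}
full_entry = apply_redundancy_rules(stored_entry)
```

Here `full_entry` contains both `+prnn` and `+N`, while the stored entry stays in its reduced form.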
Subcategorization rules characterize choices that are available
within a particular category. These rules are of two subtypes,
inflectional and lexical. For example, one inflectional
Subcategorization Rule states that English count nouns may be
marked as either singular or plural. The other type of
Subcategorization Rule does not allow an actual choice, but rather
characterizes binary subcategories of a lexical category. For
example, there is a non-inflectional Subcategorization Rule which
states that English non-pronouns are either proper or common.
Inflectional Redundancy Rules state the contextual consequences 
of a particular choice of inflectional feature. Thus the choice of the 
feature 'plural' on a head noun triggers the addition of a contextual 
feature to its matrix stating that none of its dependent sisters may be 
singular. 
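As a rough illustration, the interaction of an inflectional Subcategorization Rule with an Inflectional Redundancy Rule might be sketched as follows. The dictionary layout, the `+count` label, the use of `+plrl`/`-plrl` for number, and the function name are assumptions for this sketch, and the spelling of the plural form is not modelled.

```python
def inflect_count_noun(entry):
    """Expand a count-noun entry into its singular and plural variants."""
    variants = []
    for number in ("-plrl", "+plrl"):   # Subcategorization Rule: a free choice
        variant = {
            "features": set(entry["features"]) | {number},
            "contextual": set(entry.get("contextual", set())),
        }
        if number == "+plrl":
            # Inflectional Redundancy Rule: a plural head noun
            # allows no singular (non-plural) dependent sisters.
            variant["contextual"].add("-[-plrl]")
        variants.append(variant)
    return variants

singular, plural = inflect_count_noun({"features": {"+N", "+count"}})
```

Only the plural variant acquires the negative contextual feature; the singular variant is left without it in this simplified version.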
Derivation Rules characterize relations between distinct but
related lexical entries. For example, they provide a means of
associating 'quality' adjectives with corresponding -ly manner
adverbs. Due to the non-productivity of almost all derivational
relations, both derived and underived lexical items must be stored
and accessed separately in the lexicon, so these rules play only a
minor role in parsing. (They are however the major lexicase
mechanism for stating the interrelationships of sentence
constructions such as active and passive clauses.)
Phrase-level phonological rules and anaphoric rules are the only
non-lexical rules in the lexicase system. The latter mark pronouns,
'gaps' or 'holes', and other anaphoric devices as coreferential or
non-coreferential, and so are a very important component of an adequate
parsing system. However, a discussion of this question would go well
beyond the intended boundaries of this paper.
With the rules and constraints outlined in this section, it is 
possible to radically simplify a grammar and the associated lexicon in 
ways which facilitate parsing, as detailed below. 
3. Features in lexicase
As mentioned above, lexical features in a lexicase grammar are of
two types: contextual and non-contextual. Contextual features 
specify ordering and dependency relationships among major syntactic 
categories ('parts of speech'), agreement and government 
requirements, and 'selection', semantic implications imposed by head 
items on their dependents. Non-contextual features characterize 
class memberships, including membership in major syntactic 
categories, subcategory features, inflectional features (including 
person, number, gender, and tense features as well as localistic case 
form and case relation features, which will not be discussed in this 
paper; but see \[3\]), and the minimum number of semantic features 
needed to distinguish non-synonyms from each other. 
[Figure 1: flow diagram with boxes labelled Lexical entries, Derivation
rules, Redundancy rules, Subcategorization rules, Morphological rules,
Inflectional Redundancy rules, Fully specified lexical entries, Phrases,
Phrase-level phonological rules, Anaphoric rules, Fully specified
phrases, Discourse context, and Interpreted phrases, with the labels
PARSING and PRODUCTION.]
Fig. 1 Lexicase theory construction
(1) Case relations 
Lexicase assumes only five 'deep' case relations, with inner and
outer functions distinguished for three of them [5], as shown in Figure
2. The inventory of case relations is as short as it is because lexicase
establishes a more efficient division of labor: much of the semantic
information formerly carried by case relation differences in
Fillmorean-type case relations is now carried by the semantic
subcategory features of classes of verbs, and by the semantic features
of the case markers themselves. The resulting reduced non-redundant
case relation inventory improves the efficiency of case-related
parsing procedures, and makes it possible to capture
significant generalizations about case marking that are not possible
with the usual extended inventories used in other case grammar and
natural language processing systems. It is necessary to refer to case
relations in parsing structures containing multi-argument
predicates, in accounting for anaphora and semantic scope
phenomena and text coherence, and of course in translation. Again,
however, a discussion here of this aspect of lexicase parsing would go
beyond the scope of this paper.
(2) Case forms
Unlike case relations, syntactic-semantic categories whose
presence is inferred indirectly in order to account for lexical
derivation and scope and anaphora phenomena, case forms are
configurations of surface case markers such as word order,
prepositions, postpositions, case inflections, or relator nouns which
function to mark the presence of case relations. They are grouped
together into equivalence classes functionally in terms of which case
relations they identify, and semantically on the basis of shared
localistic features as established by means of componential analysis.
Case forms in a lexicase grammar are thus composite rather than
atomic. Each is composed of one or more features, either purely
grammatical ones such as ±Nominative (±Nom), which
characterizes the grammatical subject of a sentence, or localistic ones
such as source, goal, terminus, surface, association, etc.
Semantically, case forms carry most of the relational information
in a sentence, and are used by the parser in recognizing the presence
of particular case relations. For example, it is necessary to refer to
them in identifying subjects in order to check for subject-verb
agreement. Since so much 'case relation'-type information has
been found to be present lexically in the case markers themselves,
they bear much of the semantic load in the semantic analysis of
relationships among lexical items, so that this information need not
be duplicated by proliferating parallel case relations. This means
that in parsing, such information is obtainable directly by simply
accessing the lexical entries of the case-markers rather than by the more
complex inference procedures needed to identify the presence of the
more usual Fillmore-type case relations.
Patient (PAT):
the perceived central participant in a state or event
Agent (AGT):
the perceived external instigator, initiator, controller, or
experiencer of the action, event, or state
Locus (LOC):
inner: the perceived concrete or abstract source, goal, or
location of the Patient
outer: the perceived concrete or abstract source, goal, or
location of the action, event, or state
Correspondent (COR):
inner: the entity perceived as being in correspondence with
the Patient
outer: the perceived external frame or point of reference for
the action, event, or state as a whole
Means (MNS):
inner: the perceived immediate affector or effector of the
Patient
outer: the means by which the action, state, or event as a
whole is perceived as being realized
Fig. 2 Case relations in lexicase
(3) Syntactic category features 
A small inventory of major atomic syntactic category features is
assumed by lexicase, currently limited to the following seven: noun
(N), verb (V), adverb (Adv), preposition or postposition (P), sentence
particle (SPart), adjective (Adj), and determiner (Det).
Major syntactic categories are divided into syntactic
subcategories based on differences in distribution. Thus nouns are
divided into pronouns (no modifiers allowed), proper nouns (no
adjectives and typically no determiners allowed), mass nouns (not
pluralizable), etc., and similarly for the other syntactic classes. The
contextual features associated with the words in these various
distributional classes determine which words are dependent on which
other words, and thus are very important in assigning correct trees to
parsed sentences.
(4) Inflectional features 
Traditional inflectional categories such as person, number,
gender, case, tense, etc., are treated in lexicase as freely variable
features which are not stored in their lexical entries (except in the
cases of unpredictable forms), but are rather added as needed by a
Subcategorization Rule in the course of processing. Inflection is
typically involved in agreement, and agreement relationships (in
conjunction with the Sisterhead Constraint) are important in locating
and linking together those words bearing a head-dependent
relationship to each other.
(5) Semantic features 
Lexicase assumes that there must be enough semantic features
marked on lexical items so that every lexical item is differentiated
from every other (non-synonymous) item by at least one distinctive 
semantic feature. These features are not directly involved in parsing, 
but may figure in the identification of metaphors in sentences which 
do not have any other well-formed parsings. 
(6) Contextual features 
Contextual features are the part of the lexical representation
which makes phrase structure rules unnecessary. A contextual
feature is a kind of atomic valence, stating which other words may
attach to a given word as dependents to form the molecules called
'sentences'. Contextual features may function syntactically,
morphologically, or semantically. For example, the feature
[-[+Det]] on English nouns states that English determiners may
not follow their nouns; another feature, [+[+Det]], is marked on
definite common nouns to show that they must cooccur with
determiners, and a third, [-[-plrl]], marks plural nouns as not
allowing non-plural attributes. The feature [+([+Adj])] on common
nouns states that they may have adjectival attributes, a possibility
which would otherwise be excluded by the Omega-rule (see below).
Contextual features may refer to dependents occurring on the left
or on the right, or they may be non-directional, referring to sister
dependents on either side when the presence of some category is
important but the order varies (as in topicalization and English
subject-auxiliary inversion) or is irrelevant (as in free word-order
languages).
Selectional features are also contextual, but they differ in
function from grammatical contextual features. Thus a verb like
'love' may impose an animate interpretation on its subject by means
of the following selectional feature: [⊃[+AGT, -anmt]]. Although
the violation of a selectional feature does not result in
ungrammaticality, selectional features are useful in parsing to pick
the most promising branch in parsing a sentence when two or more
different links are possible for a given word, or in identifying
metaphors when no well-formed parse of a sentence is otherwise
possible.
[Figure 3: tree diagrams for (a) a noun phrase and (b) a sentence.]
Fig. 3 Lexicase tree representations
Since the 'range' of contextual features is sharply limited by the
Sisterhead Constraint, only certain kinds of links between words are
possible, and only those words directly connected by a single link need
be checked for the satisfaction of grammatical requirements such as
case frames, agreement features, etc. This greatly limits the number
of places a parser has to check in determining the well-formedness of
a given sentence, and so facilitates parsing.
Contextual features may be positive, negative or optional.
Positive contextual features state the presence of a required
dependent, and are used in parsing to establish initial links between
pairs of words. Negative features identify classes of words which are
not allowed to occur as dependent sisters, and serve in parsing to
reject some of the links made in accordance with positive features.
Optional features do not require or reject any links, but rather serve
to keep open the possibility of linking pairs of words by a general
procedure applying near the end of the algorithm (see 6.3 below). All
links which are not marked as permissible in this way are ruled out
by the 'Omega Rule', a lexical Redundancy Rule which states the
default value for the 'linkability' of given pairs of words: all linkings
which are not explicitly allowed for are disallowed.
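The three polarities and the Omega Rule default can be pictured with a small Python sketch. The `(polarity, features)` encoding, the `"?"` marker for optional features, and the function name `link_status` are assumptions introduced here; directionality (left versus right dependents) is deliberately ignored.

```python
def link_status(head_contextual, dependent_features):
    """Classify a proposed head-dependent link.

    head_contextual is a list of (polarity, features) pairs, where polarity
    is "+" (required), "-" (forbidden) or "?" (optional).
    """
    matches = [pol for pol, feats in head_contextual
               if feats <= dependent_features]
    if "-" in matches:
        return "rejected"      # a negative feature vetoes the link
    if "+" in matches:
        return "required"      # a positive feature establishes the link
    if "?" in matches:
        return "optional"      # kept open for the later general procedure
    return "disallowed"        # Omega Rule: not explicitly allowed

# Hypothetical contextual features for a plural common noun.
plural_noun = [("+", {"+Det"}), ("?", {"+Adj"}), ("-", {"-plrl"})]
```

With these hypothetical features, a plural determiner is classed as `required`, a singular determiner as `rejected`, an adjective as `optional`, and an adverb falls through to `disallowed`.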
The most important characteristic of all contextual features for
the purposes of parsing is the Sisterhead Constraint: in determining
whether a contextual feature is satisfied for a given item, the parser
need look only at the head words of its sister categories.
4. Lexicase tree representation
In lexicase, tree diagrams are graphic representations of
dependency and constituency relationships holding among pairs of
words in a sentence, and thus indirectly of relations among the
constructions of which these words are the heads. Two types of
constructions are recognized: endocentric and exocentric. These two
construction types can be identified and their internal and external
dependency relations determined directly from the kinds of lines by
which they are connected in a lexicase tree representation (or,
equivalently, by their bracketing in a LISP-type parenthesis
notation):
i) vertical lines link a phrasal node with its head: a unit-length
line indicates a lexical head, and a two-unit-length line
identifies a phrasal head of an exocentric construction;
ii) slanting lines link an endocentric phrasal node with its
dependents; and
iii) horizontal lines link the vertical lines above the lexical or
phrasal heads of an exocentric construction.
An endocentric construction is any syntactic construction which
has only one obligatory member, i.e. one head, which in accordance
with the lexicase One-Bar Constraint must be a single lexical item.
The other constituents of such constructions are phrases which are
syntactically optional dependents of the head word. Noun Phrases
and Sentences for example are endocentric constructions, headed by
nouns and verbs respectively. In a tree, the head word of an
endocentric construction has a vertical line of unit-length above it.
[Figure 4: tree diagram, not recoverable from the source.]
Fig. 4 The domain of the verb 'saw'
[Figure 5: tree diagram; legible labels include 'saw', 'Tom', 'and',
'my', and [+PAT]. Legend: ccnj = coordinating conjunction.]
Fig. 5 Tree representation with category features
An exocentric construction on the other hand has more than one
obligatory constituent. Again, the One-Bar Constraint requires that 
at least one of the constituents must be a single word, the lexical head 
of the construction. The other obligatory head (or heads) may be a 
word or a phrase. Examples of exocentric constructions are 
prepositional phrases and coordinate constructions. In a tree, each of 
the co-heads of an exocentric construction has a vertical line above it, 
of unit-length above lexical co-heads and two-unit-length above the 
lexical heads of phrasal co-heads. The apexes of the vertical lines are 
joined by a horizontal line, in effect an elongated node. Examples of 
both types of phrases appear in Figure 3. 
The grammatically relevant relationships between pairs of nodes in
a tree are expressed in lexicase in terms of the notions 'command' and 
'cap-command' (from Latin caput, capitis 'head'): 
i) a word cap-commands the lexical heads of its dependent sisters;
thus in the two trees in Figure 3,
a) 'boy' cap-commands 'that', 'on', and 'bus', since 'boy' has
two dependent sister constituents (indicated by slanting lines),
'that' and 'on the bus there'. The lexical head of the construction
'that' (shown by a vertical line) is the word 'that'. However 'on the
bus there' is an exocentric construction (shown by a horizontal
line) which has two heads (shown by vertical lines), 'on' and 'the
bus there'. The lexical head of 'on' is 'on', and the lexical head of
'the bus there' (vertical line) is 'bus'.
b) 'on' cap-commands 'bus', since 'on' has a single dependent
sister (the phrasal co-head of the exocentric construction 'on the
bus there'), 'the bus there', and the lexical head of 'the bus there'
is 'bus'. Finally,
c) 'bus' cap-commands 'the' and 'there', since 'bus' has two
dependent sisters, 'the' and 'there', and the respective heads of
these two constructions are the words 'the' and 'there'.
ii) a word X commands a word Y if either
a) X cap-commands Y, or
b) X cap-commands Z and Z commands Y.
Thus for example 'boy' commands 'there' because 'boy'
cap-commands 'bus' and 'bus' cap-commands 'there'; however 'that' does
not command 'there' because 'that' has no dependent sisters at all,
and so does not cap-command anything.
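The two definitions can be checked mechanically against the noun phrase 'that boy on the bus there'. The following Python sketch is only one possible encoding: the `Construction` class, its `dependents` and `coheads` fields, and the function names are assumptions made here, and words are assumed to be unique within the phrase so that they can serve as dictionary keys.

```python
from dataclasses import dataclass, field

@dataclass
class Construction:
    head: str                                       # lexical head word
    dependents: list = field(default_factory=list)  # dependent sister constructions
    coheads: list = field(default_factory=list)     # phrasal co-heads (exocentric only)

def lexical_heads(c):
    """The lexical head of c plus the lexical heads of its phrasal co-heads."""
    heads = [c.head]
    for co in c.coheads:
        heads.extend(lexical_heads(co))
    return heads

def cap_command_table(c, table=None):
    """Map every word to the list of words it cap-commands."""
    table = {} if table is None else table
    sisters = c.dependents + c.coheads   # a phrasal co-head counts as a sister
    table[c.head] = [h for s in sisters for h in lexical_heads(s)]
    for s in sisters:
        cap_command_table(s, table)
    return table

def commands(table, x, y):
    """X commands Y if X cap-commands Y, or cap-commands some Z that commands Y."""
    return any(z == y or commands(table, z, y) for z in table.get(x, []))

the_bus_there = Construction("bus", dependents=[Construction("the"),
                                                Construction("there")])
on_the_bus_there = Construction("on", coheads=[the_bus_there])
noun_phrase = Construction("boy", dependents=[Construction("that"),
                                              on_the_bus_there])
table = cap_command_table(noun_phrase)
```

The resulting table reproduces cases a) to c) above, and `commands(table, "boy", "there")` holds while `commands(table, "that", "there")` does not.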
The notion 'cap-command' plays a crucial role in defining the 
domain of subcategorization. To determine which constituents are 
relevant in subcategorization, lexicase appeals to the Sisterhead 
Constraint, which maintains that 'contextual features are marked on 
the lexical heads of constructions, and refer only to lexical heads of
sister constructions' \[3\]. That is, a word is subcategorized only by the 
words which it cap-commands. For example, a verb may be 
subcategorized by the heads of the noun phrases which are its sisters, 
but not by the other constituents which are inside the NP's. 
Conversely, a noun may not be subcategorized by any constituent 
outside the NP. However, in the case of exocentric constructions such 
as prepositional phrases, the head words of both/all obligatory
co-head constituents are accessible for subcategorization, since they are
all cap-commanded by the higher head item. 
To illustrate, in the Noun Phrase in Figure 3 (a), the lexical head
of the construction is the noun 'boy'. Following the Sisterhead
Constraint, the contextual features marked on 'boy' can refer only to
features of the words it cap-commands, in this case 'that' and the
heads of the exocentric PP, 'on' and 'bus', but not to 'the' or 'there'.
The features of both the preposition and the head of its sister NP fall
within the domain of subcategorization of the cap-commanding
lexical item and jointly subcategorize it. Their features taken
together are said to form a 'virtual matrix', i.e. a matrix which is not
the lexical specification of any single lexical item, but which is rather
a composite of the (non-contextual) features of all of the lexical heads
of the construction [3]. In the lexicase parsing algorithm discussed in
this paper, the effect of a virtual matrix has been achieved by copying
the features of the phrasal head (the lexical head of the phrasal
co-head, e.g. 'bus' in 'on the bus') into the matrix of the lexical head (e.g.
'on' in 'on the bus' in Figure 3). The matrix of the preposition 'on' then
becomes in effect the virtual matrix of the exocentric construction,
representing the grammatically significant features for the whole PP.
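A minimal sketch of this copying step, assuming that feature matrices are stored as Python dictionaries holding a set of non-contextual features and a set of contextual features. The entry layout, the specific feature strings, and the function name `virtual_matrix` are illustrative assumptions, not the paper's data structures.

```python
def virtual_matrix(lexical_head, phrasal_head):
    """Merge the phrasal head's non-contextual features into the lexical head's matrix."""
    return {
        "form": lexical_head["form"],
        # composite of the non-contextual features of both heads
        "features": lexical_head["features"] | phrasal_head["features"],
        # contextual features stay those of the lexical head alone
        "contextual": set(lexical_head["contextual"]),
    }

# Hypothetical entries for 'on' and 'bus' in 'on the bus'.
on = {"form": "on", "features": {"+P"}, "contextual": {"+[+N]"}}
bus = {"form": "bus", "features": {"+N"}, "contextual": {"+[+Det]"}}
pp_matrix = virtual_matrix(on, bus)
```

The sketch returns a new matrix rather than modifying the entry for 'on', so the lexicon itself is left untouched; whether the original implementation copied in place is not stated in the text.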
The Sisterhead Constraint makes it possible to define the notion
of syntactic domain as all those constituents whose heads are referred
to by the contextual features of a particular lexical item. For
example, the domain of the verb 'saw' in the example of Figure 3 is
indicated in Figure 4 with case relations. Thus the domain of the verb
'saw' in this sentence consists of the arguments marked [+PAT] and
[+AGT]. The determiner 'my', on the other hand, is not in the
domain of the verb; rather, it is in the domain of its own dominating
noun, 'Dad'.
There are a number of other constraints in lexicase which apply to
syntactic trees [3]. The effect of these constraints is to limit the class
of possible trees and, consequently, the class of possible analyses. One
constraint is that all terminal nodes are words, not morphemes or
empty categories. A related constraint states that syntactic features
are marked only on lexical items, not on nodes or on ad hoc abstract
lexical categories. Finally, lexicase requires that every construction
have at least one immediate lexical head; that is, there can be no
intervening non-lexical node between the phrasal node and the
lexical head of the phrase. In X-bar terminology, lexicase allows
phrasal nodes with a maximum of one bar, where an S is equivalent to
V-bar.
The interaction of the tree-drawing conventions, the One-bar
limitation, and the Sisterhead Constraint makes it possible to
eliminate both phrasal and major category labels from syntactic trees
without any loss of information [3]. The matrix of an individual
lexical item contains information about its syntactic category,
making a category node label redundant. With the One-Bar
Constraint, the nature of the phrasal construction can be determined
with reference to the lexical category of the head of the construction,
which is identifiable by the unit-length vertical line above it. Thus
any node directly attached to a lower [+N] item by a vertical line of
unit-length is an NP, so it is redundant to mark such a node by the
label 'NP'. As a consequence, the tree representation in Figure 5
which has no node labels overtly marked is adequate for the
representation of all constituency and dependency information. Note
that the CCJN ('conjunction-bar') 'my Dad and Rufus' in Figure 5 is
still an NP in function, because a coordinate construction is
exocentric, and so the virtual matrix associated with 'my Dad and
Rufus' contains the feature [+N] as well as [+ccjn], making it an NP
for external subcategorizing purposes.
The single-level lexicase tree notation incorporates the 
information carried by the three different kinds of tree structure 
contrasted by Winograd \[10\], dependency (head and modifier), phrase 
structure (immediate constituents), and role structure (slot and 
filler). Because it allows no VP constituent, it can equate constituent 
structure with dependency structure. The case role of a constituent is 
the case role of its lexical head. Thus semantic information is readily 
extracted from the syntactic representation, because the 
representation links together those words which are semantically as 
well as syntactically related. 
5. The parsing algorithm 
Figure 6 shows the fundamental components of the lexicase
parser. The function of these components in brief is as follows: 
(1) Pre-processor
This procedure replaces the word forms in the input sentence by
homographic fully specified lexical entries, that is, entries with
identical spelling, specified for all contextual and non-contextual
syntactic features as well as contextual and non-contextual semantic
features ('selectional restrictions'). If an input form matches more
than one lexical entry, the form is replaced by a 'cluster', a list of all the
lexical entries whose forms match the input form. The output is a
string composed of lexical entries and clusters of lexical entries which
is isomorphous with the input string of word forms.
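The substitution step might look roughly like the following in Python. The toy lexicon, its two-feature entries (stand-ins for fully specified entries), and the function name `preprocess` are all hypothetical; in this sketch an unlisted form yields an empty cluster, leaving it for the morphological analyzer.

```python
# Toy lexicon: each form maps to one or more entries (hypothetical data).
LEXICON = {
    "the": [{"form": "the", "features": {"+Det"}}],
    "saw": [
        {"form": "saw", "features": {"+N"}},
        {"form": "saw", "features": {"+V"}},
    ],
}

def preprocess(words, lexicon=LEXICON):
    """Replace each word form by its entry, or by a cluster (list) of entries."""
    output = []
    for word in words:
        entries = lexicon.get(word, [])
        if len(entries) == 1:
            output.append(entries[0])     # unambiguous: a single entry
        else:
            output.append(list(entries))  # ambiguous: a cluster; empty if unlisted
    return output

result = preprocess(["the", "saw"])
```

The output list has one element per input word, which is what keeps it isomorphous with the input string.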
(2) Morphological analyzer 
If an input form is not matched by any item listed in the lexicon,
the morphological analyzer checks to see if the form matches any
stored stem-affix pattern. If it does, the form is divided into stem plus
inflectional affix and the stem is marked with the syntactic class
features associated with the pattern. Using inflectional
Subcategorization Rules, the stem is expanded into its full
inflectional paradigm, and the original input word form is replaced by
a 'cluster' composed of those (fully specified) members of the
inflectional paradigm which are homographic with the original word
form.
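The first half of this step, dividing an unlisted form into stem plus affix, can be sketched in Python as below. The suffix table and the function name `split_stem_affix` are invented for illustration, only suffixes are handled, and the subsequent expansion into a full inflectional paradigm is not shown.

```python
# Hypothetical stem-affix patterns: suffix -> syntactic class features.
PATTERNS = [
    ("s", {"+N"}),
    ("ed", {"+V"}),
]

def split_stem_affix(form, patterns=PATTERNS):
    """Return (stem, affix, class features) for every pattern the form matches."""
    return [
        (form[:-len(affix)], affix, set(features))
        for affix, features in patterns
        if form.endswith(affix) and len(form) > len(affix)
    ]

analyses = split_stem_affix("blorks")
```

A form that matches no pattern simply returns an empty list, which a fuller implementation would have to treat as an unknown word.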
(3) Placeholder substitution
Each cluster of homographic lexical entries in the substitution
string is temporarily replaced by a 'placeholder' entry composed of the
intersection of the form and features of all the entries in the cluster.
If the entries have nothing in common but the form itself, then the
placeholder will be the form alone, with no associated feature matrix.
If the lexical entries in a cluster have enough features in common
to be equivalent in terms of linking potential, they are linked into the
tree structure as a group during the parsing process. When the
structures containing clusters of entries are subsequently resolved
into lexically unambiguous structures during placeholder expansion,
many of the necessary links will have already been made, and will
not have to be repeated for each separate but syntactically equivalent
homographic entry.
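Computing a placeholder amounts to a set intersection over the entries in a cluster. The sketch below assumes the same dictionary-based entries as a toy encoding; the clusters for 'sheep' and 'saw' and the function name `placeholder` are illustrative assumptions, and a non-empty cluster is assumed.

```python
def placeholder(cluster):
    """Build a placeholder entry from what all entries in a cluster share."""
    shared = set.intersection(*(set(e["features"]) for e in cluster))
    return {"form": cluster[0]["form"], "features": shared}

# Hypothetical clusters of homographic entries.
sheep = [
    {"form": "sheep", "features": {"+N", "-plrl"}},
    {"form": "sheep", "features": {"+N", "+plrl"}},
]
saw = [
    {"form": "saw", "features": {"+N", "-plrl"}},
    {"form": "saw", "features": {"+V", "+past"}},
]
sheep_placeholder = placeholder(sheep)
saw_placeholder = placeholder(saw)
```

The 'sheep' placeholder keeps `+N`, so it can be linked as a noun before its number is resolved, whereas the 'saw' placeholder is left with the form alone.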
(4) Placeholder expansion
Each substitution string containing placeholder clusters is
expanded into separate structures by replacing the clusters with
subclusters of items sharing more features in common, and
ultimately with their original constituent individual entries. After
each cluster is resolved into subclusters or individual entries, the
resultant substitution strings are passed through the parser again to
add links that become possible as the new clusters and entries become
accessible.
As with the previous parsing phase, this phase establishes links 
that work for clusters of homographic items, so that these links do not 
have to be made separately and repeatedly for each substitution 
string containing a different homographic item. In this way, no 
sequence of words ever has to be reparsed. 

[Fig. 6. Fundamental components of the lexicase parser: Input, Morphological Analyzer, Placeholder Substitution, Placeholder Expansion, Parser (P's, V's, N's, Det's, Adj's, Adv's, Conjunctions, Orphans), Output] 
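One way the expansion of a cluster into subclusters, as described under (4), might be organized is sketched below. The text does not say how subclusters are chosen, so the splitting criterion here (split on the first feature, in sorted order, that some entries bear and others lack) is purely an assumption, as are the function name `expand` and the feature labels.

```python
def expand(cluster):
    """Split a cluster of feature sets into two subclusters.

    Assumed criterion: pick one feature borne by some but not all entries,
    and separate the entries that bear it from those that do not.
    """
    if len(cluster) <= 1:
        return [cluster]
    shared = frozenset.intersection(*cluster)
    distinguishing = sorted(set().union(*cluster) - shared)
    if not distinguishing:
        # All entries are featurally identical; nothing to split on.
        return [cluster]
    feature = distinguishing[0]
    bearing = [entry for entry in cluster if feature in entry]
    lacking = [entry for entry in cluster if feature not in entry]
    return [bearing, lacking]


v_finite = frozenset({"+V", "+fint"})
v_nonfinite = frozenset({"+V", "-fint"})
noun = frozenset({"+N"})
```

Applied to `[v_finite, v_nonfinite, noun]`, one call separates the noun reading from the two verb readings, which still form a cluster sharing `+V`; a second call on that subcluster yields the individual entries. After each such step the parser would be run again on the resulting strings.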
(5) Parser 
Based on the positive contextual syntactic features of head lexical 
items, the heads are linked to eligible and accessible dependent items. 
As each link is established, the negative contextual features are 
checked. If there is a violation, that track is immediately abandoned. 
Note that exactly the same negative contextual feature mechanism 
takes care of two distinct contextual dependency phenomena: 
i) general cooccurrence properties, such as the fact that English 
nouns may not have following Determiners, and 
ii) grammatical agreement; thus for example subject-verb 
agreement is stated as a negative contextual feature: a finite 
verb marked for plural may not have a dependent Nominative 
sister marked singular. (Actually the matter is somewhat more 
complex than this, but a full discussion would go beyond the 
scope of this paper.) 
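The way a single negative-feature check can cover both phenomena may be illustrated with the following sketch. The encoding is an assumption made for this example: a negative contextual feature is written as a pair of a side ("left", "right", or "any") and a set of features that a dependent on that side must not bear together. The names `Item` and `link_allowed` and the feature labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    form: str
    features: frozenset
    # Negative contextual features: (side, banned feature set) pairs.
    negative: tuple = ()


def link_allowed(head, head_pos, dep, dep_pos):
    """Check a proposed head-dependent link against the head's negative features."""
    side = "left" if dep_pos < head_pos else "right"
    for neg_side, banned in head.negative:
        if neg_side in (side, "any") and banned <= dep.features:
            return False  # violation: this track would be abandoned
    return True


# i) Cooccurrence: an English noun may not have a following Determiner.
dog = Item("dog", frozenset({"+N", "+Nom", "-plrl"}),
           negative=(("right", frozenset({"+Det"})),))
the = Item("the", frozenset({"+Det"}))

# ii) Agreement: a plural finite verb may not have a singular Nominative dependent.
sleep = Item("sleep", frozenset({"+V", "+fint", "+plrl"}),
             negative=(("any", frozenset({"+Nom", "-plrl"})),))
dogs = Item("dogs", frozenset({"+N", "+Nom", "+plrl"}))
```

In this toy setup, "the dog" passes the check but "dog the" does not, and "dogs sleep" passes while "dog sleep" is rejected, all through the same function.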
After each pair of words has been linked in accordance with 
positive and negative grammatical contextual features, implicational 
semantic contextual features ('selectional restrictions') are checked 
for compatibility. If a violation is found, that string is semantically 
anomalous. 
Lexicase theory is designed such that only the heads of sister 
categories need to be considered in determining whether there is an 
inconsistency in a structure being parsed. That is, only words directly 
connected by a single line need to be checked for the satisfaction or 
violation of any grammatical or selectional contextual requirement, 
and this checking can be done immediately after each link is first 
made. If a violation is found, the structure can be shunted off on a 
siding immediately without wasting time examining surrounding 
material. The parsing procedure will be considered in somewhat 
more detail in section 6. 
(6) Output 
The output of the algorithm is zero or more syntactic analyses of 
the input sentence, but at the same time it can be considered an 
intensional semantic representation: it presents all the semantic 
distinctive features for each word, and specifies the head-modifier and 
semantic implication relations between each linked pair of words. 
The 'extensional' meaning of the sentence then is just the range of 
external situations which are compatible with the intension, the 
lexical meanings and interrelationships characterized by this 
structure. Lexicase is very well suited to characterizing this 
intensional semantic representation because it formally defines the 
range of possible lexical linkages. The structure is simple yet rich 
enough to in principle carry enough information to serve as the input 
to a knowledge extraction or machine translation system. 
6. The parsing procedure 
6.1 Words 
(1) Prepositions: Link each preposition by contextual 
features with an accessible N, V, or P. Prepositions are linked first 
because they link with N's, V's, or other P's to form PP's which delimit 
closed domains whose internal non-head constituents are then 
inaccessible to connections with external elements. Subsequent 
parsing stages then search inside of or outside of these domains, but 
do not need to consider links between PP-internal non-heads and 
PP-external lexical items. 
(2) Verbs: Verbs are linked with their attributes to form 
clauses or sentences. Note that in the lexicase framework, 'sentence' 
refers to any verb-headed construction, regardless of the finiteness of 
its verbal head or its position in the tree. The searching proceeds 
from left to right in English, but would scan from right to left in a 
verb-final left-branching language such as Japanese. In a 
dependency grammar framework such as lexicase, a (verbal) sentence 
is defined as a verb together with its syntactic dependents. A 
sentence is the basic unit of syntax because it is the maximum 
domain of dependencies. Once a sentence unit has been established 
in this way, subsequent parsing stages can ignore links between 
sentence-internal and sentence-external items. 
(3) Nouns: Nouns are linked with their dependents to form 
Noun Phrases. Noun Phrases and Sentences ('verb phrases') are the 
syntactically and semantically basic sentence constituents. Like 
other head items, nouns establish domains whose non-head 
constituents are inaccessible to external links, so that cross-domain 
linkages can be ignored on subsequent passes, thereby radically 
limiting the number of pairs of items that have to be considered on 
each subsequent pass and again cutting down on computation time. 
(4) Determiners: Link each Determiner with an accessible 
Noun. In English, the Determiner marks the left boundary of a Noun 
Phrase. Linking the N and its Det establishes one boundary of the 
NP, and subsequent parsing can ignore links between elements 
inside this domain and elements outside it. 
(5) Adjectives: Link each Adjective with an adjacent noun. 
Because previous passes will have already delimited major 
constituent boundaries and radically narrowed the set of possible 
connections, very little checking will need to be done to link an 
Adjective with the correct head Noun. 
(6) Adverbs: Link each Adverb with a head Verb or Adjective. 
Structural ambiguity is most likely to appear in connection with 
alternate attachments of PP's and Adverbs with other words in a 
sentence. By saving Adverb linking until near the end of the parsing 
sequence, we establish domains of inaccessibility which greatly 
reduce the number of possible Adverb attachment points which need 
to be considered. 
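The effect of closed domains on the number of word pairs a later pass must consider can be illustrated with a small sketch. It assumes, for illustration only, that a domain is recorded as a contiguous span of word positions together with the position of its head; the names `Domain`, `PASS_ORDER`, `accessible`, and `candidate_pairs` are hypothetical.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    head: int   # position of the head word
    start: int  # leftmost position inside the domain
    end: int    # rightmost position inside the domain


# Order of the word-class passes described in 6.1.
PASS_ORDER = ("P", "V", "N", "Det", "Adj", "Adv")


def accessible(i, j, domains):
    """True if words i and j may still be linked, given the closed domains.

    A link that crosses a domain boundary is allowed only when the word
    inside the domain is that domain's head.
    """
    for d in domains:
        i_inside = d.start <= i <= d.end
        j_inside = d.start <= j <= d.end
        if i_inside == j_inside:
            continue  # both inside or both outside: boundary not crossed
        inner = i if i_inside else j
        if inner != d.head:
            return False
    return True


def candidate_pairs(n_words, domains):
    """All word pairs that a later pass would still have to consider."""
    return [
        (i, j)
        for i in range(n_words)
        for j in range(i + 1, n_words)
        if accessible(i, j, domains)
    ]


# "the dog slept in the old barn": the PP headed by "in" spans positions 3-6.
pp = Domain(head=3, start=3, end=6)
```

For this seven-word string there are 21 unordered word pairs before any domain is closed; once the PP is established, `candidate_pairs(7, [pp])` leaves 12, since the three PP-internal non-heads can no longer pair with the three PP-external words.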
6.2 Coordination 
Link each conjunction with one or more major constituents (S, 
NP, PP, AdjP, or AdvP) on each side. At this point, all the major 
constituents have already been established, so the conjunction 
linking procedure needs to consider only the head word of each major 
constituent. Since every conjunction will at this time be either at the 
highest level, that is, linkable only to the immediate constituents of 
the sentence, or inside the domain of some other construction, the 
number of linking choices will be extremely limited. 
6.3 Orphanage 
Link all remaining upwardly unlinked Nouns, Determiners, 
Adjectives, Adverbs, Prepositions, and Verbs with an accessible 'elder 
sister' (or 'regent' \[12\]). At this point unattached lexical items will be 
found only embedded inside of other constructions, with very few 
accessible attachment possibilities to consider (usually only one). 
Thus there will generally be no backtracking and stacking required. 
The exception will be Adverbs and PP's, which account for most of the 
structural ambiguity likely to be encountered. By saving these 
alternative connection possibilities until near the end of the parsing 
process, we minimize the amount of computation that has to be done 
'on top of' the alternative structures produced at this stage. 
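The branching that remains at this stage can be sketched as follows, under the assumption that a partial analysis is a mapping from word positions to head positions and that the accessible regents of each orphan have already been computed. The function name `attach_orphans` and this representation are illustrative only.

```python
from itertools import product


def attach_orphans(heads, candidates):
    """Return one completed analysis per combination of orphan attachments.

    heads: dict mapping word position -> head position (None if unlinked).
    candidates: dict mapping each orphan's position -> list of accessible regents.
    """
    orphans = sorted(candidates)
    analyses = []
    for choice in product(*(candidates[o] for o in orphans)):
        completed = dict(heads)
        completed.update(zip(orphans, choice))
        analyses.append(completed)
    return analyses


# "saw the man with the telescope": everything is linked except the PP
# headed by "with" (position 3), which may attach to "saw" (0) or "man" (2).
heads = {0: None, 1: 2, 2: 0, 3: None, 4: 5, 5: 3}
analyses = attach_orphans(heads, {3: [0, 2]})
```

Because only the PP has more than one accessible regent here, exactly two analyses result, and all the links established earlier are shared between them rather than recomputed.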
7. Overall assessment and conclusion 
The parsing approach we advocate here is in principle very simple 
because lexicase requires no rules for normal parsing situations at 
all, and is based on linguistic principles designed to maximize the 
generality and simplicity of descriptions. It has no deep structure or 
transformations; instead, 'transformed' and 'untransformed' lexical 
entries are listed separately in the lexicon, thereby placing the 
parsing burden on memory rather than processing. Since lexicase 
automatically determines which items are relevant to the satisfaction 
of particular contextual requirements, no feature percolation or 
feature copying mechanism is needed to move features around in a 
tree to get them into a position where they are accessible to related 
items. 
Lexicase parsing is bottom-up in the sense that it begins with 
individual words rather than some 'root node' S. It scans from left to 
right or vice versa, depending on whether the language is verb- 
initial, verb-medial, or verb-final, but in fact it is a mechanism which 
works from head to dependent rather than primarily from one end or 
the other. Since it forms constituents from heads and dependents at 
all levels simultaneously, it thus incorporates virtues of both top- 
down and bottom-up parsers. Lexicase accomplishes this by only 
making links allowed or required by contextual features of head 
lexical items, and since the 'overall structure of the sentence' is 
determined by just these features, it is not possible to make links 
which are not compatible with this overall structure. 
Since lexicase has no Phrase Structure rules, a lexicase parser 
cannot blunder into the loops caused by left-recursive rules. Lexicase 
generates linguistically correct structures: they directly represent 
head-attribute relationships, they characterize the concept of 
grammatical relatedness, they allow various other important 
generalizations to be captured, and they account adequately for 
speakers' intuitions. 

References 

Kaplan, R., and Bresnan, J., Lexical-Functional Grammar: A 
Formal System for Grammatical Representation. In J. Bresnan 
(ed), The Mental Representation of Grammatical Relations. 
Cambridge University Press. 1982. 

Gazdar, G., Klein, E., Pullum, G., and Sag, I., Generalized Phrase 
Structure Grammar. Harvard University Press. 1985. 

Starosta, S., The End of Phrase Structure as We Know It. 
Linguistic Agency - University of Duisburg (Trier) Series A, 
Paper no. 147. 1985. 

Starosta, S., Case in the Lexicon. Proceedings of the Eleventh 
International Congress of Linguists. 1975. 

Starosta, S., Patient Centrality and English Verbal Derivation. 
Proceedings of the Thirteenth International Congress of 
Linguists. 1983. 

Starosta, S., and Nomura, H., Lexicase and Japanese Language 
Processing. Musashino Electrical Communication Laboratory 
Technical Report. 1984. 

Chomsky, N., Remarks on Nominalization. In Jacobs, R. A., and 
Rosenbaum, P. S. (eds), Readings in English Transformational 
Grammar. Ginn and Company. 1970. 

Fillmore, C. J., The Case for Case. In Bach, E., and Harms, R. T. 
(eds), Readings in English Transformational Grammar. Ginn and 
Company. 1970. 

Anderson, J., The Grammar of Case: Towards a Localistic Theory. 
Cambridge Studies in Linguistics 4. Cambridge University 
Press. 1971. 

Winograd, T., Language as a Cognitive Process, Volume I: 
Syntax. Addison-Wesley Publishing Company. 1983. 

Lehtola, A., Jäppinen, H., and Nelimarkka, E., Language-based 
environment for natural language parsing. Proceedings of the 
Second European Conference of the Association for 
Computational Linguistics. 1985. 

Nelimarkka, E., Jäppinen, H., and Lehtola, A., A computational 
model of Finnish sentence structure. In Anna Sågvall Hein (ed), 
Föredrag vid De nordiska datalingvistikdagarna. 1983. 

Nelimarkka, E., Jäppinen, H., and Lehtola, A., Parsing an 
inflectional free word order language with two-way finite 
automata. In T. O'Shea (ed), ECAI-'84: Advances in artificial 
intelligence. Elsevier Science Publishers B.V. 1984. 
