1. Introduction 
When a language is analyzed in accordance with a phrase
structure grammar, it is customary to regard a terminal string
x as grammatical according to a grammar G if one can start from
the initial string of the grammar and apply the rules of G, successively
rewriting strings until x is obtained. With the resulting
derivation of a generated string x, a structural description of x
is associated, consisting of a labeled bracketing which indicates the
nonterminal symbol(s) rewritten to obtain substrings of x. When
a phrase structure grammar contains only context-free rules,
each generated string can be analyzed and its structural
descriptions computed with considerable efficiency. In the event
that some rules are context-sensitive, however, no general
analysis procedure of comparable efficiency is known. In this paper
I discuss a means for allowing the use of context-sensitive rules
in the description of context-free languages, to the end of providing
greater economy of description and analysis. I will show that
if phrase structure grammars are allowed to define languages in
a different way than is usual, then certain context-free languages
can be analyzed more quickly, using less storage, than under the
standard interpretation, although no noncontext-free languages
can be so analyzed. Furthermore, the new way in which a grammar
defines a language seems to be a more adequate reconstruction
of the use to which context-sensitive rules were put in immediate
constituent analysis.
Assume we are given a phrase structure grammar G and a
string x, and we ask whether it is possible to analyze x in accordance
with the rules of G. The answer is in the affirmative if G
assigns some labeled bracketing to x as its structural description.
This suggests that we think of x as being provided with an arbitrary
well-formed labeled bracketing φ and check whether each
phrase of x determined by a matched pair of labeled brackets in
φ is divided into subphrases in accord with the rules of G. For a
phrase to satisfy a rule R of G, the matched pair of brackets determining
that phrase must enclose the particular sequence of phrases
and members of G's terminal vocabulary that R says the phrase
may immediately contain. Furthermore, if R is context-sensitive
with context α_1 ··· α_m — β_1 ··· β_n, then immediately to the left
(right) in x of the phrase in question must be a sequence y_1 ... y_m
(z_1 ... z_n) of strings such that (a) y_i = α_i (z_j = β_j) if α_i (β_j) is in
the terminal vocabulary and (b) y_i (z_j) is a phrase of type α_i (β_j)
according to the labeled bracketing φ of x if α_i (β_j) is in the
nonterminal vocabulary, for 1 ≤ i ≤ m (1 ≤ j ≤ n). If some well-formed
labeled bracketing of x is analyzable by G in this fashion,
we can think of it as a structural description assigned to x by G.
If G is context-free, the language associated with it in this rather 
natural fashion is clearly the same as the language generated by 
G in the usual fashion and the structural descriptions assigned to 
strings by G are the same in the two cases. If G contains rules 
with nonnull context, however, it is not obvious whether the
language associated with it in the above manner is the same as the language
generated. So that we can investigate this question, let us proceed 
with precise definitions of the new concepts which have appeared 
informally. 
2. Definitions
For familiar concepts I will simply refer to definitions in the
literature (cf. Peters and Ritchie, 1969b). Recall that a (context-sensitive)
phrase structure grammar is an ordered quadruple
(V_T, V_N, S, R) such that V_T and V_N are finite, nonempty, disjoint
sets (the terminal vocabulary and nonterminal vocabulary,
respectively), S is a member of V_N (the initial symbol) and R is a
finite set of rules of the type (1),

(1)  A → γ_1 ··· γ_ℓ / α_1 ··· α_m — β_1 ··· β_n

where ℓ > 0, m, n ≥ 0, A ∈ V_N, γ_i, α_j, β_k ∈ V_T ∪ V_N (1 ≤ i ≤ ℓ,
1 ≤ j ≤ m, 1 ≤ k ≤ n) and →, / and — are special symbols not
in V_T ∪ V_N. The rule (1) is often written as (2).

(2)  α_1 ··· α_m A β_1 ··· β_n → α_1 ··· α_m γ_1 ··· γ_ℓ β_1 ··· β_n
The notation (I) more clearly brings out the possibilities for 
immediate constituency allowed by the rule and the contextual 
conditions imposed by the rule on those possibilities. Let
L = {[_A | A ∈ V_N} and R = {]_A | A ∈ V_N} be sets of left and
right labeled brackets.
Definition 1: A labeled bracketing (finite string over V_T ∪ L ∪ R) φ
is said to be well-formed if (i) φ ∈ V_T, (ii) φ = [_A ψ ]_A or
(iii) φ = ψω, where ψ and ω are well-formed labeled bracketings
and A ∈ V_N.
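Definition 1 lends itself to a simple mechanical check. The sketch below is a minimal illustration and not part of the paper: it assumes a labeled bracketing is encoded as a Python list of tokens, with a left bracket [_A written "[A" and a right bracket ]_A written "]A" (my encoding), and it enforces the three clauses of the definition.

```python
def well_formed(tokens, terminals):
    """Check Definition 1: a labeled bracketing is well-formed iff it is
    a nonempty concatenation of units, each unit being a terminal symbol
    or [A ... ]A with a well-formed (hence nonempty) interior.
    Terminals are assumed not to begin with '[' or ']'."""
    if not tokens:
        return False
    stack = []   # labels of currently open left brackets
    units = [0]  # number of complete units seen at each nesting depth
    for t in tokens:
        if t in terminals:
            units[-1] += 1
        elif t.startswith("["):
            stack.append(t[1:])
            units.append(0)
        elif t.startswith("]"):
            if not stack or stack[-1] != t[1:]:
                return False            # unmatched or mislabeled bracket
            if units.pop() == 0:
                return False            # interior must itself be well-formed
            stack.pop()
            units[-1] += 1
        else:
            return False                # symbol outside V_T, L and R
    return not stack and units[0] > 0
```

For example, `well_formed(["[S", "[NP", "the", "man", "]NP", "[VP", "runs", "]VP", "]S"], {"the", "man", "runs"})` holds, while `["[S", "]S"]` is rejected because clause (ii) requires a well-formed interior.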
The language generated by G (written L(G)) and the set of
structural descriptions generated by G (written Σ(G)) are as
usual (cf. Peters and Ritchie, 1969b, definitions in §2). A set
L of strings is called a context-sensitive language if there is a
phrase structure grammar G such that L = L(G). A phrase structure
grammar G is context-free if every rule (1) of G has
m = n = 0 (i.e. α_1 ··· α_m = β_1 ··· β_n = e, where e is the empty
string). A set L of strings is a context-free language if there is
a context-free grammar G such that L = L(G).
Definition 2: A triple (φ_1, φ_2, φ_3) is called a node of a well-formed
labeled bracketing φ if φ = φ_1 φ_2 φ_3 and there are A ∈ V_N
and a well-formed labeled bracketing ψ such that φ_2 = [_A ψ ]_A.
The node (φ_1, φ_2, φ_3) satisfies rule (1) if there are labeled
bracketings ψ_0, ψ_1, ..., ψ_m, ᾱ_1, ..., ᾱ_m, χ_0, χ_1, ..., χ_ℓ,
ω_1, ..., ω_ℓ, ρ_0, ρ_1, ..., ρ_n and β̄_1, ..., β̄_n such that

(i) φ_1 = ψ_0 ᾱ_1 ψ_1 ··· ᾱ_m ψ_m, φ_2 = [_A χ_0 ω_1 χ_1 ··· ω_ℓ χ_ℓ ]_A
and φ_3 = ρ_0 β̄_1 ρ_1 ··· β̄_n ρ_n,

(ii) ψ_i, χ_j, ρ_k ∈ (L ∪ R)* for 1 ≤ i ≤ m, 0 ≤ j ≤ ℓ and
0 ≤ k ≤ n − 1, and

(iii) ᾱ_i = α_i if α_i ∈ V_T and ᾱ_i = [_{α_i} τ ]_{α_i} for some
well-formed labeled bracketing τ if α_i ∈ V_N (1 ≤ i ≤ m); ω_j = γ_j
if γ_j ∈ V_T and ω_j = [_{γ_j} τ ]_{γ_j} for some well-formed τ if
γ_j ∈ V_N (1 ≤ j ≤ ℓ); and β̄_k = β_k if β_k ∈ V_T and
β̄_k = [_{β_k} τ ]_{β_k} for some well-formed τ if β_k ∈ V_N (1 ≤ k ≤ n).
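The nodes of a well-formed labeled bracketing, one triple (φ_1, φ_2, φ_3) per matched pair of labeled brackets, can be enumerated in a single left-to-right pass. A hedged sketch, assuming tokens are encoded with "[A" for [_A and "]A" for ]_A (my convention, not the paper's):

```python
def nodes(tokens):
    """Enumerate the nodes (phi1, phi2, phi3) of a well-formed labeled
    bracketing: one triple per matched pair of labeled brackets.
    Assumes the input is well-formed and that terminal symbols do not
    begin with '[' or ']'."""
    result = []
    stack = []  # indices of currently open left brackets
    for i, t in enumerate(tokens):
        if t.startswith("["):
            stack.append(i)
        elif t.startswith("]"):
            j = stack.pop()  # index of the matching left bracket
            result.append((tokens[:j], tokens[j:i + 1], tokens[i + 1:]))
    return result
```

Each φ_2 returned is of the form [_A ψ ]_A, as the definition of a node requires.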
Definition 3: The debracketing function d is the homomorphism
from (V_T ∪ L ∪ R)* onto V_T* defined by

(i) d(α) = α if α ∈ V_T and d(α) = e if α ∈ L ∪ R, and

(ii) d(φψ) = d(φ)d(ψ) for any labeled bracketings φ and ψ.

A labeled bracketing φ is analyzed by G if d(φ) ∈ V_T*, if there
is a well-formed labeled bracketing ψ such that φ = [_S ψ ]_S and
if every node of φ satisfies some member of R. We say that a
string x is parsed by G if there is a labeled bracketing φ such
that φ is analyzed by G and d(φ) = x. The set of labeled bracketings
analyzed by G will be written A(G) and the set of strings
parsed by G will be written P(G).
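Since d is a homomorphism that erases brackets and preserves terminals, it is one line in the token encoding used here ("[A" and "]A" for labeled brackets, my convention):

```python
def debracket(tokens, terminals):
    """The homomorphism d of Definition 3: erase every labeled bracket
    and keep every terminal symbol.  A token counts as a terminal iff
    it is listed in `terminals`."""
    return [t for t in tokens if t in terminals]
```

A string x is then parsed by G exactly when x = d(φ) for some φ analyzed by G, i.e. P(G) is the image of A(G) under d.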
3. The Languages Parsed by Phrase Structure Grammars
We can think of the labeled bracketings analyzed by a phrase
structure grammar G as being strings over a terminal vocabulary
which is the union of G's terminal vocabulary and its set
of left and right labeled brackets. We may then ask what type
of language A(G) is. Theorem 1 provides the answer that A(G)
is a context-free language, and from this, Theorem 3.8 of Peters
and Ritchie (1969a) follows immediately as Corollary 1. We
now proceed to state these results.
Theorem 1: If G is a phrase structure grammar, then A(G)
is a context-free language.
Proof: Let G = (V_T, V_N, S, R) be any phrase structure
grammar and let L and R be the corresponding sets of
left and right labeled brackets. To prove the theorem, it
suffices to describe a pushdown-storage automaton M
which accepts A(G), since pushdown-storage automata accept
just the context-free languages (Chomsky, 1963, Theorem
6). I will describe the automaton M informally, since this
will provide more insight into its operation. Formal
construction of M from this description is a straightforward
and tedious exercise and is therefore omitted.
M can receive as input any string over V_T ∪ L ∪ R.
Its pushdown-store can contain symbols from V_T ∪ V_N ∪ L ∪ R′,
where R′ is a set of symbols each corresponding to the string
resulting from inserting a single "pointer" (|) in the left-context
portion of a rule (e.g. (3)) or to the string resulting
from insertion of a | in any string which is the right-context
of a rule of R (e.g. β_1 ··· | β_j ··· β_n).

(3)  A → γ_1 ··· γ_ℓ / α_1 ··· α_i | ··· α_m — β_1 ··· β_n
M contains a finite set of states sufficient to "remember"
two tables: a rule table and a right-context table. The rule
table plays a dual role; it is used to ensure that a node of
the input is tentatively indicated as satisfying a rule only if the
left-context of that rule is indeed satisfied when the left bracket
determining the node is reached in the input, and it is used to
store an indicator at that point which will allow M to check, as
the input is read further, whether the immediate constituency
and the right-context of the node are as required by the rule.
The right-context table is used in checking whether the right-context
of a rule tentatively identified as being satisfied by a
node does indeed appear immediately to the right of the right
bracket determining that node. For each rule (1) of R, the rule
table contains m + 1 positions and the i-th position contains an
entry consisting either of the symbol (3) or the symbol (4).

(4)  A → γ_1 ··· γ_ℓ / | α_1 ··· α_m — β_1 ··· β_n

The rule table will be updated as the input is read so that whenever
any position corresponding to any rule (1) of R contains the
entry (3), then immediately to the left in the input of the scanned
symbol is a string analyzable as α_1 ··· α_i. Thus if a pointer
appears in the entry of a position immediately to the left of the
symbol — (dash), then the left-context of the corresponding
rule is satisfied at that point in the input. It is clear that the rule
table can be "remembered" in a finite set of states. For each distinct
string β_1 ··· β_n appearing as the right-context of a rule in R,
the right-context table contains n + 1 positions, the i-th one of which
contains either the entry β_1 ··· | β_i ··· β_n or the entry β_1 ··· β_n |. When
the right bracket determining a node is reached in the input, a
position corresponding to the right-context of the rule which was
tentatively identified as being satisfied at the node receives a pointer
to the left of its leftmost symbol. As the input is read further,
pointers are advanced to the right in this string as each successive
portion of the context appears under the scanning head. This
allows M to check whether the tentatively identified rule is indeed
satisfied by the node. "Remembering" the right-context table also
requires only a finite number of states.
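The pointer discipline for one right-context table entry can be pictured as a tiny state update: a pointer at the extreme right is no bar to acceptance, while anywhere else the next scanned symbol must match the symbol to the pointer's right or the computation blocks. A hypothetical sketch (the names `step`, `context` and `pos` are mine, not the paper's):

```python
def step(entry, symbol):
    """Advance the pointer of one right-context table entry.
    entry = (context, pos): `pos` symbols of the right-context string
    have already appeared under the scanning head."""
    context, pos = entry
    if pos == len(context):
        return entry               # pointer at extreme right: satisfied
    if context[pos] == symbol:
        return (context, pos + 1)  # next portion of the context appeared
    return None                    # mismatch: this computation blocks
```

Acceptance at the end of the input requires every live entry to have reached the extreme-right position.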
When started in its initial state scanning the leftmost symbol
on the input tape with an empty pushdown-store, M prints S on the
store and initializes its tables as follows: for each rule (1) of R,
a corresponding position of the rule table receives the entry (4)
and each position of the right-context table receives an entry with
a pointer at its extreme right. At each successive step of its
computation, M performs whichever one of the operations (5)-(8)
is possible in view of the top symbol on its pushdown-store,
the scanned symbol on its input tape and the contents
of its tables. If none of the operations can be performed, M
blocks and fails to accept the input. Since M is nondeterministic,
a particular input string is accepted if some computation of
M on that input terminates in the accepting state with an empty
pushdown-store.
(5) If you see a nonterminal symbol A on top of the pushdown-store,
if the scanned input symbol is [_A and if some rule table
position contains the entry (3) with A to the left of the arrow and
a pointer immediately to the left of the dash, then (i) advance the
input tape one square, (ii) remove the symbol A from the top of
the pushdown-store, (iii) for every rule table entry
B → δ_1 ··· δ_u / ε_1 ··· | A ··· ε_v — ν_1 ··· ν_w nondeterministically
decide whether to leave it unchanged or to change it to
B → δ_1 ··· δ_u / | ε_1 ··· ε_v — ν_1 ··· ν_w and insert in the
pushdown-store the single symbol
B → δ_1 ··· δ_u / ε_1 ··· A | ··· ε_v — ν_1 ··· ν_w, (iv) for every
right-context table entry δ_1 ··· | A ··· δ_k nondeterministically
decide whether to leave it unchanged or to change it to δ_1 ··· δ_k |
and insert the single symbol δ_1 ··· A | ··· δ_k in the pushdown-store
and (v) insert in the pushdown-store the ℓ + 2 symbols
| β_1 ··· β_n, ]_A, γ_ℓ, ..., γ_1 (so that γ_1 is on top).
(6) If you see a member a of V_T on top of the pushdown-store,
if the scanned input symbol is a and if every right-context table
entry has a pointer either at its extreme right or immediately to
the left of an a, then (i) advance the input tape one square,
(ii) for every rule table entry (3) change it to
A → γ_1 ··· γ_ℓ / α_1 ··· a | ··· α_m — β_1 ··· β_n if the pointer stood
immediately to the left of an occurrence of a, and otherwise change
it to (4) unless the | is next to the dash, (iii) for every entry
δ_1 ··· | a ··· δ_k in the right-context table change it to δ_1 ··· δ_k |
and enter δ_1 ··· a | ··· δ_k in the appropriate table position and
(iv) remove the a from the top of the pushdown-store.

(7) If a right bracket ]_A is on top of the pushdown-store and if
]_A is the scanned input symbol, then (i) advance the input tape one
square, (ii) remove the symbol ]_A from the top of the pushdown-store
and (iii) if every right-context table entry has a pointer at its
extreme right, then nondeterministically decide whether or not
to enter the accepting state.
(8) If you see a member of R′ on top of the pushdown-store, then
remove it and enter it in the appropriate position of the rule table
or the right-context table.
Let φ be any labeled bracketing in A(G). Since φ is
analyzed by G, every node (φ_1, φ_2, φ_3) of φ satisfies some rule in
R, say (1). By Definition 2, φ can be factored into ψ's, ᾱ's, [_A,
χ's, ω's, ]_A, β̄'s and ρ's with the appropriate properties. But
then as M scans the first symbol of ᾱ_1 it can advance a pointer past
α_1 in its rule table (and store the resulting symbol if α_1 is a
member of V_N). Continuing in this fashion, M can advance a
pointer across the entire left-context of (1) since, if any α_i is in
V_N, the symbol (3) appears in the pushdown-store just below the
]_{α_i} determining the node which satisfied this portion of the environment
and thus will be reentered in the rule table for further advancement
of the pointer just after the corresponding ]_{α_i} has been
scanned on the input tape, and hence just in time for ᾱ_{i+1} to be
spotted. So the pointer in the left-context of (1) will be immediately
to the left of the dash when the first symbol of φ_2 is scanned. At
this time the A which can be on top of the pushdown-store can be
removed and replaced by γ_1, ..., γ_ℓ, ]_A and | β_1 ··· β_n; then as each
γ_j is scanned M can proceed, ultimately removing the ]_A from the
pushdown-store and entering | β_1 ··· β_n in the right-context
table. The pointer can be advanced across the β̄_j's just as
across the ᾱ_i's and thus the right-context table will contain no
bar to acceptance of φ when the end of the input tape is reached.
For this reason M accepts φ.
For the other direction, let φ be any string which is
accepted by M; it is clear that φ must be well-formed. Let
(φ_1, φ_2, φ_3) be a node of φ. Consider a computation by which
M accepts φ and let (1) be the rule which was utilized by operation
(5) when the first symbol of φ_2 was scanned on the input tape.
From the description of M one can find the ψ's, ᾱ's, [_A, χ's,
ω's, ]_A, β̄'s and ρ's of Definition 2 and thus determine that the
node satisfies rule (1). But since (φ_1, φ_2, φ_3) was any node of
φ, φ is analyzed by G, completing the sketch of the proof of the
theorem.
Corollary 1: For every phrase structure grammar G, P(G) is a
context-free language, and conversely.

Proof: Let G be any phrase structure grammar. By Theorem 1,
A(G) is a context-free language. By Definition 3, P(G) is the image
of A(G) under the homomorphism d. The context-free languages
are closed under homomorphism (Chomsky, 1963, Theorem 31).
Therefore P(G) is a context-free language. For the converse, let
G be any context-free grammar. Clearly Σ(G) ⊆ A(G) since any
labeled bracketing that can be obtained by rewriting the initial
symbol of G is analyzed by G. But A(G) ⊆ Σ(G) also, since a top-to-bottom,
left-to-right derivation of any φ ∈ A(G) can be obtained by
reading off the left labeled brackets of φ. Thus Σ(G) = A(G) and
so L(G) = d(Σ(G)) = d(A(G)) = P(G).
Remark: For any phrase structure grammar G, a pushdown-storage
automaton M′ accepting P(G) can be obtained from the automaton
M described in the proof of Theorem 1 by altering operations (5)
and (7) so that they apply regardless of what input symbol is
scanned and do not move the input tape.
4. Applications
In a context-free grammar, the only way to express 
grammatical agreement between phrases which are not immediate 
constituents of the same phrase is by introducing additional 
nonterminal symbols and rules into the grammar. For example,
there are good reasons to split an English declarative sentence 
into a subject noun phrase and a predicate verb phrase. The 
noun phrase will contain the subject noun as a constituent and the 
verb phrase will contain the main verb of the sentence. Now the 
noun and verb must agree in number and person and with the 
constituency described the only way to achieve this effect with con- 
text-free rules is by means of rules such as (9). 
(9)  S → NP_sg VP_sg
     S → NP_pl VP_pl
     NP_sg → Det N_sg
     NP_pl → Det N_pl
     VP_sg → V_sg
     VP_pl → V_pl
     VP_sg → V_sg NP
     VP_pl → V_pl NP
     NP → NP_sg
     NP → NP_pl
It would be better to use context-sensitive rules such as in (10) to 
describe these constructions. 
(10)  S → NP VP
      NP → Det N
      N → N_sg
      N → N_pl
      VP → V
      VP → V NP
      V → V_sg / N_sg —
      V → V_pl / N_pl —
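Under the parsing interpretation, the two context-sensitive rules of (10) amount to a check on the debracketed string: the word immediately to the left of the verb (here, the subject noun) must carry the same number subscript. A simplified, hypothetical sketch (the token spellings "N_sg", "V_pl" and the function `agreement_ok` are my encoding; the check inspects only the left-context terminal, not full phrases as Definition 2 requires):

```python
def agreement_ok(tokens, terminals):
    """Check the context-sensitive rules V -> V_sg / N_sg - and
    V -> V_pl / N_pl - of (10) on the debracketed string: each verb
    must be immediately preceded by a noun with the same number
    feature ('_sg' or '_pl')."""
    words = [t for t in tokens if t in terminals]  # debracket
    for i, w in enumerate(words):
        if w.startswith("V_"):
            left = words[i - 1] if i > 0 else None
            # w[1:] is the feature suffix, e.g. "_sg" for "V_sg"
            if left is None or not left.endswith(w[1:]):
                return False
    return True
```

For instance a bracketing of "Det N_pl ... V_sg" fails the check, while "Det N_sg ... V_sg" passes, mirroring the agreement that the context-free rules in (9) encode with extra nonterminals.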
If we are concerned only with analyzing context-free languages, we
can use such rules to parse sentences rather than to generate them.
Straightforward modification of existing context-free analysis
computer programs such as that of Earley (1969) will permit them
to handle arbitrary phrase structure grammars with the same
efficiency they possess for context-free grammars. Thus for each
grammar G, there is a constant k_G such that Earley's program can
parse an input string of length n in an amount of time no more
than k_G n³. But k_G depends on the number of rules in G, so using
fewer context-sensitive rules rather than more context-free rules
can speed up parsing by a constant factor. This gain in speed
could be of significance in natural language processing situations.

References

Chomsky, N. (1963) "Formal Properties of Grammars", in R. Luce,
R. Bush and E. Galanter (eds.) Handbook of Mathematical
Psychology, Vol. II, New York, Wiley.

Earley, J. (1969) "An Efficient Context-Free Parsing Algorithm"
(to appear).

Peters, S. and R. W. Ritchie (1969a) "Context-Sensitive Immediate
Constituent Analysis: Context-Free Languages Revisited"
(submitted to J.A.C.M.).

Peters, S. and R. W. Ritchie (1969b) "On the Generative Power of
Transformational Grammars", (submitted to Information Sciences).
