Motivations and Methods tbr Text Simplification 
R. Chandrasekar* Christine Doran B. Srinivas 
Institute for Research in l)el)artm<;nt of Deparl;mcnt of 
Cognitive, Science & (\]cntcr for \[,inguistics (\]Oml)uter $¢ 
the Advanced Study of hldia InlbrHlatioll Scienc(; 
University ot7 Pcnnsylwmia, lqfiladclphia, PA 19104 
{ra±ckeyc, cdoran, srin± }Ol±nc. cis. upenn, edu 
Abstract 
Lottg alld eolni)licated seltteltces prov(: to b(: 
a. stumbling block for current systems rely- 
ing on N\[, input. These systenls stand to 
gaill frolil ntethods that syntacti<:aHy sim- 
plily su<:h sentences. '\]b simplify a sen= 
tence, we nee<t an idea of tit(." structure of 
the sentence, to identify the <:omponents to 
be separated out. Obviously a parser couhl 
be used to obtain the complete structure of 
the sentence. \]\[owever, hill parsing is slow 
a+nd i)rone to fa.ilure, especially on <:omph!x 
sentences. In this l)aper, we consider two 
alternatives to fu\]l parsing which could be 
use<l for simplification. The tirst al)l)roach 
uses a Finite State Grammar (FSG) to pro- 
dn<:e noun and verb groups while the second 
uses a Superta.gging model to i)roduce de- 
pendency linkages. We discuss the impact of 
these two input representations on the sim- 
plification pro(:ess. 
1 Reasons for Text Simplification 
l,ong and <:oml)licatcd sentences prove to be a 
stumlJing block for <'urrent systems which rely on 
natural language input. 'l'lmsc systems stand to 
gain from metho<ls that preprocess such sentences 
so as to make them simpler. Consider, for exam- 
ph;, the following sentence: 
(l) 7'he embattled Major government survived a 
crucial 'vole on coal pits closure as its 
last-minute concessions curbed the extent of 
'lbry revolt over an issue that generated 
u'ausual heat in the l\]ousc of Commons and 
brought the miners to London streets. 
Such sentences are not uncommon in newswire 
texts. (\]ompare this with the multi-sentence ver- 
sion which has been manually simplified: 
(2) The embatlled Major governmcnl survived a 
crucial vote o'u coal pits closure. Its 
last:minute conccssious curbed the cxlenl o\]" 
*On leave fl'om the National Centre for Soft, ware 
Techno\]ogy, (lulmohar (?ross Road No. 9, Juhu, 
Bombay 4:0(/ (149, India 
Tory revolt over the coal-miue issue. Th.is 
issue generaled unusual heat in the ltousc of 
Commons. II also brought the miners to 
London streels. 
If coml>lex text can be made simph'x, sen- 
ten(-es beconae easier to process, both for In:O- 
grams and humans. Wc discuss a simplifica- 
tion process which identifies components of a sen- 
tence that may be separated out, and transforms 
each of these into frec-sta,ding simpler sentences. 
(\]learly, some mmnees of meaning from the origi- 
nal text may be lost in the simplification process. 
Simplitication is theretbre inappropriate for texts 
(such as legal docunlents) where it is importa.nt 
not to lose any nuance. I|owew;r, one c.~tl\] COil- 
ceive of several areas of natural language process- 
ing where such simplitication would be of great 
use. This is especially true in dolnains such as Ina- 
chine translation, which commonly have a manual 
post-processing stage, where semantic and prag- 
matic repairs may be <'arried out if ne<;essary. 
• Parsing: Syntactically <:omplex sentence's arc 
likely to generate a large number of parses, 
and may cause parsers to fail altogether. Re- 
solving ambiguities in attachment of con- 
stituents is non-trivial. This ambiguii, y is re- 
duced for simpler sentences sin<'e they involve 
fewer constituents. 'Fhus simpler sentences 
lead to faster parsing and less parse aml)igu- 
ity. Once the i>arses for the simpler sentences 
are obtained, the subparses can be assembled 
to form a full parse, or left as is, depending 
on the application. 
• Machine Translation (MT): As in the pars- 
ing case, simplification results in simpler scn- 
tential structures and reduced ambiguity. As 
argued in (Chandrasekar, 1994), this conld 
lead to improvements in the quality of ma- 
chine translation. 
• Information Retrieval: IR systems usually re- 
trieve large segments of texts of which only a 
part n\]ay bc reh~'wml,. Wit|, simplified texts, 
it is possible to extract Sl>eCific phrases or 
simple sentences of relevance in response to 
queries. 
1041 
• Summarization: With the overload of infor- 
mation that people face today, it would be 
very helpful to have text summarization tools 
that; reduce large bodies of text to the salient 
minimum. Simplification can be used to weed 
out irrelevant text with greater precision, and 
thus aid in summarization. 
• Clarity of Text: Assembly/use/maintenance 
manuals must be clear and simple to follow. 
Aircraft companies use a Simplified English 
for maintenance manuals precisely for this 
reason (Wojcik et M., 1993). However, it 
is not easy to create text in such an artifi- 
cially constrained language. Automatic (or 
semi-automatic ) simplification could be used 
to ensure that texts adhere to standards. 
We view simplification as a two stage process. 
The first stage provides a structural representa- 
tion for a sentence on which the second stage ap- 
plies a sequence of rules to identify and extract the 
components that can be simplified. One could use 
a parser to obtain the complete structure of the 
sentence. If all the constituents of the sentence 
along with the dependency relations are given, 
simplification is straightforward, ttowever, full 
parsing is slow and prone to failure, especially on 
complex sentences. To overcome the limitations of 
full parsers, researchers have adopted FSG based 
approaches to parsing (Abney, 1994; Hobbs et al., 
1992; Grishman, 1995). These parsers are fast 
and reasonably robust; they produce sequences 
of noun and verb groups without any hierarchical 
structure. Section 3 discusses an FSG based ap- 
proach to simplification. An alternative approach 
which is both fast and yields hierarchical struc- 
ture is discussed in Section 4. In Section 5 we 
compare the two approaches, and address some 
general concerns for the simplification task in Sec- 
tion 6. 
2 The Basics of Simplification 
Text simplification uses the f~ct that complex 
texts typically contains complex syntax, some of 
which may be particular to specific domain of dis- 
course, such as newswire texts. We assume that 
the simplification system will process one sentence 
at a time. Interactions across sentences will not bc 
considered. Wc also assume that sentences have 
to be maximally simplified. 
2'o simplify sentences, we nced to know where 
we can split them. We define articulation-points 
to be those points at which sentences may be log- 
ically split. Possible articulation points include 
the beginnings and ends of phrases, punctuation 
marks, subordinating and coordinating conjunc- 
tions, and relative pronouns. These articulation 
points are gcneral, and should apply to arbitrary 
English texts. These may, however, be augmented 
with domain-specific articulation points. We can 
use these articulation-points to define a set of rules 
which map froln given sentence patterns to sim- 
pler sentences patterns. These rules are repeat- 
edly applied on each sentence until they do not 
apply any more. For example, the sentence (3) 
with a relative clause can be simplified into two 
sentences (4). 
(3) Talwinder Singh, who masterminded the 
Kanishka crash in 198~, was killed in a 
fierce lwo-honr e~.connter... 
(4) Talwindcr Singh was killed in a .fierce 
two-hoar cncounler ... Talwinder Siugh 
masterminded the Kanishka crash in 198~. 
3 FSG based Simplification 
(Chandrasekar, 1994) discusses an approach that 
uses a FSG for text simplification as part of 
a machine aided translation prototype named 
Vaakya. In this approach, we consider sentences 
to be composed of sequence of word groups, or 
chunks. Chunk boundaries are regarded as poten- 
tial articulation-points. Chunking allows us to de- 
fine the syntax of a sentence and the structure of 
simplification rules at a coarser granularity, since 
we need no longer be concerned with the internal 
structure of the chunks. 
In this approach, we first tag each word with its 
part-of-speech. Chunks are then identified nsing a 
FSG. Each chunk is a word group consisting of a 
verb phrase or a noun phrase, with some attached 
modifiers. The noun phrase recognizer also marks 
the number (singular/plural) of the phrase. The 
verb phrase recognizer provides some information 
on tense, voice and aspect. Chunks identified by 
this mechanism include phrases such as the extent 
of Tory ~evolt and have recently bcen finalizcd. 
The chunked sentences are then simplified using 
a set of ordered simplification rules. The orderi~g 
of the rules is decided manually, to take care of 
more frequent transformations first, and to avoid 
unproductive rule interaction. An example rule 
that simplifies sentences with a relative pronoun 
is shown in (5). 
(5) X:tiP, ReXPron Y, Z --* X:tiP Z. X:tiP Y. 
The rule is interpreted as follows. If a sentence 
starts with a noun phrase (X:tiP), and is followed 
by a phrase with a relative pronoun, of the \['orm 
( , l%elPron Y ,) followed by soIne (Z), where 
Y and Z are arbitrary sequences of words, then 
the sentence may be simplified into two sentences, 
namely the sequence (X) followed by (Z), and (X) 
followed by (Y). The resulting sen\];ences are then 
recursively simplified, to the extent possible. 
The system has been tested on news text, and 
performs well on certain classes of sentences. See 
(Chandrasekar and R, amani, 1996) ibr details of 
quantitative evaluation of the system, including 
an evaluation of the acceptability of the resulting 
1042 
sentences. A set of news stories, consist, ing of 224 
sentences, was simplitied by the prototype system, 
resulting in 369 simplified sentences. 
Ilowever, there are certain weMenesses in this 
system, caused mostly by the relatively simple 
mechanisms used to detect phrases and attach- 
meats. Sentences which include long distance 
or crossed del)enden('ies, and sentences which 
have malt|ply stacked appositives are not handled 
llrOl)erly; nor are sentences with atnbiguous or un- 
ch'.ar attachnwnts. Some of these prol)\]oms can be 
handh'd I)y augmenting the ruh' set but what is 
i'eally require(I is ntorc syntactic firel)ower. 
4 A Dependency-based model 
A second a.I)l)roaeh to simplification is to use 
ri(:her syntactic in\[brmation, in terms of both con- 
stituency inlbrmation and dependency inf'orma- 
tion. We use partial parsing and simple depen-. 
dency attachment techniques as an alternative to 
the FSG I)ased simpliiication. This ~no(M (the 
I)SM) is based on a sinq)le dependency tel)r(> 
sentation provided l)y I,exicalized Tree. Adjoining 
(Ira.tmnar (I/FAG) and uses the "SUl>ertaggiug" 
l;echniques described in (Josh| and Srinivas, 1994). 
4.1 Brief Ovt;rvlt;w of LTAGs 
The primitive elements el LTA(~ formalism are ('.l- 
(:lnentary trees. Elementary trees are of two 
types: initial frees and au,iliary trees. Initial 
/;rees are minimal linguistic structures that con- 
tain no recurs|on, such as sitnph; sentences, N Ps, 
l)Ps etc. Auxiliary trees are recursive stru<-turcs 
which represent constituents that arc adjuncts to 
basic structure (e.g. relative clauses, sentential 
adjuncts, a(Iw'.rbials). For a more R)rmal and (le- 
taile(I (lescription of l,'l'A(\]s see (Schabes et M., 
J988). 
4.2 SuI)(*xl;agging 
Tlte elemmttary trees of LTAG localize dependen- 
(-ies, including hmg distance dependencies, by re- 
quiring that all and only the dependent elements 
be present within the same tree. As a result of 
this localization, a lexical item may be (and al- 
most alwws is) associated with more than one eL- 
ementary tree, We call these elementary trees su- 
pcrlags, since they conttdn more information (such 
as sul)categorization and agreement information) 
than standard part-of speech tags. Henc.e, each 
word is associated with more than one supertag. 
At the end of a complete l)arse, each word is asso- 
ciated with just one supertag (assuming there is 
no global ambiguity), and the supertags of all the 
words in a sentence are combined by sul)stitution 
and adjunct|on. 
As in standard part-of-speech disambiguation, 
we can use local statistical information in the form 
of N-gram models based on the distribution of sn- 
l)ertags ill a LTAG parsed corl)us for disamhigua- 
tion. We. use a trigram model to disambiguate tile 
supcrtags so as to assign one SUl)ertag tbr each 
word, in a process termed supertagging. '\['he tri- 
gram model of supcrtagging is very efficient (in 
linear time) and robust (Josh| and Srinivas, \] 994). 
'1'o establish the dependency links among the 
words of the sentence, we ('xph)it the dei)endency 
information present in the supertags. Each su- 
perl;ag associated with a word allocates slots for 
the arguments o1' the word. These slots have a 
polarity value re\[lecting their orientation wii;h re- 
Sl)ect to the anchor o\[' the SUl)ertag. Also asso- 
('iated with a supertag is a list of internal nodes 
(hmluding the root node) thai, appear in the su- 
pertag. Using I;his information, a simple algo- 
rithnt may be used to annotate the sentence with 
d(,pe.ndency links. 
4.3 Simplification with DeI)(mden('y links 
Tlte output provide(\[ by t, he dellendency analyzer 
not only contains depen(hmcy links annmg words 
but also in(lical,cs the constituent strncture as cn- 
code(I by snpertags. The constituent information 
is used to identify whether a supertag contains a 
clausal constituent and the dependency links are 
used to identify the span of the clause. Thus, 
embedded clauses can easily be identified and ex- 
tracte(t, akmg with their arguments. \])nnctuation 
can be used to identify constituents such as appos- 
itives which can also 1)e sel)arate(I ont. As with 
the finite-state al)l)roach, the resulting segments 
may 1)e incomplete as indellt'ndetlt clauses. I\[' the 
segments are to I)e reassembh'd, no further pro- 
cessing need be done on them. 
l?igm'e 1 shows a rule \[br extracting relative 
('lauses, in dependency notation. We tits| iden- 
tify the relative clause tree (Z), and then extract 
the verb which anchors it along with all of its (te- 
pendents. The right hand side shows the two re- 
suiting trees. The gap in the relative clause (Y) 
need only be tilled if the clauses are not going to 
bc reconlbined. Examples (6) and (7) show a sen- 
tence belbre and after this rule has applied. 
X:S 
Y:NP W 
Z: RelClause 
=> 
yZ':S 
:NP Y:NP W/ 
Figure 1: R,ule for extracting relative clauses 
(6) ... an issue \[that generated unnsnal heat 
in the IIouse of Commons \] ... 
(7) An issne \[generated unusnal heat in the 
Ilouse of Commons \]. The issue ... 
1043 
5 Evaluation 
The objective of the evaluation is to examine the 
advantages of the DSM over the FSG-based model 
for simplification. In the FSG approach since the 
input to the simplifier is a set of noun and verb 
groups, the rules for the simplifier have to identify 
basic predicate argument relations to ensure that 
the right chunks remain together in the output. 
The simplifier in the DSM has access to infor- 
mation about argument structure, which makes 
it much easier to specify simplification patterns 
involving complete constituents. Consider exam- 
pie 8, 
(8) Th.e creator of Air India, Mr. JRD 7hta, 
believes that the airline, which celebrated 60 
years today, could return to its old days of 
glory. 
q'he FSG-based model fails to recognize the rel- 
ative clause on the embedded subject the airline 
in example (8), because Rule 5 looks for matrix 
subject NPs. On the other hand, the DSM cor- 
rectly identifies the relative clause using the rule 
shown in Figure 1, which holds for relative clauses 
in all positions. 
Other differences are in the areas of modifier at- 
tachment and rule generality. In contrast to the 
/)SM approach, the FSG output does not have all 
modifiers attached, so the bulk of attachment de- 
cisions must be made by the simplification rules. 
The FSG approach is forced to enumerate all pos- 
sible variants of the LHS of each simplification 
rule (eg. Subject versus Object relatives, singular 
versus plural NPs) whereas in the DSM approach, 
the rules, encoded in supertags and the associated 
constituent types, are more general. 
Preliminary results using the DSM model are 
very promising. Using a corpus of newswire data, 
and only considering relative clause and apposi- 
tive simplification, we correctly recovered 25 out 
of 28 relative clauses and i4 of 14 appositives. We 
generated 1 spurious relative clause and 2 spuri- 
ous appositives. A version of the FSG model on 
the same data recovered 17 relative clauses and 3 
appositives. 
6 Discussion 
Simplification can be used for two general (:lasses 
of tasks. The first is as a preprocessor to a flfll 
parser so as to reduce \];he parse ambiguity for the 
parser. Tile second class of tasks demands that 
the output of the simplifier be free-standing sen- 
tences. Maintaining the coherence of the simpli- 
fied text raises the fbllowing problems: 
• Determining the relative order of the simpli- 
fied sentences, which impacts the choice of 
referring expressions to be used and the over- 
all coherence of the text. 
• Choosing referring expressions: For instance, 
when separating relative clauses fi'om the 
nouns they modify, copying the head noun 
into the relative clause is simple, but leads 
to quite awkward sounding texts. IIowever, 
choosing an appropriate pronoun or choosing 
between definite and indefinite NPs involves 
knowledge of complex discourse information. 
• Selecting the right tense when creating new 
sentences presents similar problems. 
• No matter how sophisticated the simplifica- 
tion heuristics, the subtleties of meaning in- 
tended by the author may be diluted, if not 
lost altogether. For many computer appli- 
cations, this disadvantage is outweighed by 
the advantages of simplification (i.e. gains of 
speed and/or accuracy), or may be corrected 
with the use of human l)ost-processiug. 
Acknowledgements 
This work is partially supported by NSF grant NSF- 
STC SBR 8920230, ARPA grant N00014-94 and All.() 
grant DAAH04-94-G0426. 
References 
Steven Abney. 1994. I)epcndency Grammars and 
Context-Free Grammars. Manuscript, University 
of Tubingen, March. 
R. Chandrasekar and S. Ramani. 1996. Auto- 
matic Simplifica.tion of Natural Language Text. 
M~muscript, National Centre for So\[tware Technol: 
ogy, Bombay. 
R. Chandrasekar. 1994. A Hybrid Approach to Ma- 
chine Translation using Man Machine Communica- 
tion. Ph.D. thesis, 'Pata Institute of I"undamcnta\] 
Research/University of Bombay, Bombay. 
Ralph Grishman. 1995. Where's the Syntax? The 
New York University M UC-6 System. \[n P~occe(g 
ings of the Sixth Message Understanding Confer- 
ence, Columbia, Maryland. 
Jerry Hobbs, Doug Appelt, John Bear, \])avid Israel, 
and W. Mary Tyson. 11992. FAST(IS: a system, for 
extracting information from natural language text. 
Technical Report 519, SRI. 
Aravind K. Joshi and B. Srinivas. 1994. Disam- 
biguation of Super Parts of Speech (or Supertags): 
Almost Parsing. In Proceedings of the 17 th Inter- 
national Conference on Computational Linguistics 
(COLING '94), Kyoto, Japan, August. 
Yves Schabes, Anne Abeilld, and Aravind K. Joshi. 
1988. Parsing strategies with 'lexicMized' gram- 
mars: Application to Tree Adjoining Grammars. 
In Proceedings of the 12 th International Co@rencc 
on Computational Linguistics (COL1NG'88), Bu- 
dapest, Ilungm:y, August. 
Richard II. Wojcik, Philip tIarrison, ,rod John Bremer. 
1993. Using bracketed parses to evMuate a grain- 
mar checking ;rpplication. In Proceedings of the 31'~t 
Conference of Association of Computational Lin- 
guistics, Ohio State University, Columbus, Ohio. 
1044 
