American Journal of Computational Linguistics
Microfiche 12
Department of Linguistics
University of Alberta
Edmonton
Copyright 1975
Association for Computational Linguistics
This paper approaches the question of whether the
styles of different prose genres (here referring to
classifications such as fiction, newspaper reportage,
learned journals, etc.) are partially characterized by
differences in the co-variation of a number of common
syntactic structures.
Thirty-six syntactic variables are tabulated in sample
sentences drawn from five genres. The variables are of
several different sorts; for example, sentence types, some
focus phenomena, elements of the verb structure, conjoined
structures, and various modification and complementation
structures.
The co-variation of these syntactic variables is
analysed by means of a discriminant function analysis. The
analysis shows that the verbal styles of the specific genres
considered are characterized by differential patterns of
occurrence [...]. The patterns appear to be artefacts of the
[...] (communicative purpose) of the [...] genres.
TABLE OF CONTENTS
Abstract
Introduction
Procedure
Results
Conclusion
References
TABLES AND FIGURES
Table 1. Syntactic variables selected for analysis
Table 2. Sentence-length blocks for sampling
Table 3. Frequency of occurrence of each variable
Table 4. Variables discriminating fiction from the
four non-fiction genres
Table 5. Variables discriminating among non-fiction
genres
Figure 1. Plot of group centroids for the five genres
on the first two discriminant functions
Figure 2. Plots of the group centroids for the four
non-fiction genres for discriminant functions
I, II and I, III
INTRODUCTION
It is clear that linguists should attempt to use all of
the empirical evidence obtainable to describe usefully and
completely the significant patterns which occur in language
data, but it is not always clear just what patterns or
relationships in the data should be included in such
descriptions. One aspect of language data--generally
referred to as style--has been known for some time to be
subjectively rather obvious but extremely difficult to
characterize in objective terms, especially if one ignores
the obvious role of lexical selection.
It would be possible to talk about style in non-
subjective terms if one were able to identify those surface
elements of language that appear to participate in stylistic
variations and then to relate these to subsets of language
data defined by external criteria of known genre,
provenance, author, or situation. However, simple counts of
elements with respect to such subsets have not proved to be
overly instructive, primarily because it is presumably not
the simple occurrence of the elements which leads one to
perceive a certain style but, rather, the far less obvious
co-occurrence of sets of elements (Marckworth, 1973). These
co-occurrences are ones which are not required by the
grammar of a language but which have a high probability of
being linked in the genre in which they occur.
A useful definition of style might be the overall
pattern or gestalt perceivable in language data which
transmits information about an utterance with respect to the
personal, social, and cultural relationships assumed between
speaker and hearer, the historical or geographical
provenance of the source, and the identity or mental state
of the speaker. Recognition of such a gestalt allows the
hearer or reader to assign the language data to a specific
subpopulation about which he already has prototype
information.
Assuming this provides a workable view of style, then
the basic task of an investigation of the phenomenon is one
of generating hypotheses about possible language-element
participants in style-marking constellations which can be
investigated in themselves and in patterns of co-occurrence.
The results of such investigations, if successful in
differentiating externally defined subpopulations of
language, will provide additional data about language to
which any grammatical description must be responsible.
Genre, the focal point of this study, is quite
difficult to define in specific terms. As Marckworth
(1973, pp. 24-25) has noted:
    Genre tends to be a Humpty-Dumpty term, a useful
    concept which is redefined within the limits of
    each discussion of it with little requirement that
    such definitions have more than a superficial
    agreement with each other. Genre may distinguish
    prose from poetry from drama; it may distinguish
    historical plays from French farces, sonnets from
    epics, mystery stories from all other novels. As a
    construct it has, in fact, much in common with
    classification as defined by pattern recognition
    studies: a group of items along some
    characteristic (attribute) continuum whose members
    are closer to each other than to their neighbors
    and which are not separated by a gap across a
    critical perceptual boundary or distance. The
    characteristic of the continuum and the critical
    distance are continually respecifiable in terms of
    the stimuli perceived and the fineness of the
    classificatory system required.

    When viewed in these terms, it is immediately
    clear why genre is at once such a vague and such a
    useful term. Continua based on a number of
    different characteristics may be used as the basis
    for defining clusters of similar texts and the
    genre classifications formed on different continua
    need not be parallel nor form mutually exclusive
    groups. Some of the continua used in genre
    classification are content, as in the case of the
    mystery story or the Cavalier love poems; intent,
    as in the case of humorous writing or fantasy;
    form, as in the case of drama or the novel; and
    context, as in the case of technical or
    belletristic prose. This is not to claim that such
    characteristics are independent of each other or
    that any text may not be defined in terms of all
    four; simply that groupings along a continuum
    defined by any one of them alone may be cited to
    establish contrasting genres.
This paper reports an investigation of a large variety
of surface syntactic features, ranging from sentence type
through elements of the verb, and the covariant patterns of
occurrence of such structures in five genres of written
modern English. It is recognized, of course, that powerful
determiners of the recognizability of various genre styles
exist at both the infra-sentential (e.g., specific lexical
selection) and supra-sentential (e.g., general content)
level, but it is the aim of this study only to assess, at
the level of the sentence, the contribution of syntactic
devices to a measurable difference in genre-defined styles.
PROCEDURE 
Thirty-six syntactic structures were selected for
consideration on the basis of being possible contributors to
the constellations determining genre style. They were chosen
from four main categories: sentence-type, including the
range of possible interrogative patterns; focus phenomena of
various sorts; elements of the main verb phrase; and a group
of conjoined or embedded structures. This latter group
included structures of noun-phrase modification, verbal
complementation, sentence modification, and parallel element
conjoining.
Table 1 shows the syntactic structures whose frequency
of occurrence was counted, along with the identification
numbers assigned to each of these variables for use in
subsequent tables and discussions. Note that the various
structures are clearly not completely independent of each
other.
TABLE 1
SYNTACTIC VARIABLES SELECTED FOR ANALYSIS

Category        Variable
                Number    Variable Name

Sentence          1.  Declarative
Type              2.  Interrogative: word-order inversion type
                  3.                 tag type
                  4.                 Wh- type
                  5.                 intonation type
                  6.  Imperative

Focus             7.  Non-standard word order
Phenomena         8.  Passive construction
                  9.  Cleft construction
                 10.  Extraposition construction

Verb             11.  Auxiliaries: modals and catenatives
Structure        12.               progressive aspect
                 13.               perfective aspect
                 14.               emphatic
                 15.               past tense marking
                 16.  Main Verb: transitive
                 17.             intransitive
                 18.             to be
                 19.             other copulas
                 20.  Contracted verbal forms

Conjoined or     21.  Simple conjoining: of full sentences or clauses
Embedded         22.                     of phrases or words
Structures       23.  Inclusion of direct discourse
                 24.  Nominalization: in the NP (cleftable)
                 25.                  in the VP (non-cleftable)
                 26.                  adverbial clauses
                 27.                  indirect questions
                 28.  NP modification: adjectives
                 29.                   locatives
                 30.                   appositives
                 31.                   full relative clauses
                 32.                   partially reduced
                                       relative clauses
                 33.                   noun-adjuncts
                 34.                   noun-heads, including
                                       pronouns
                 35.  Adverbials: prepositional phrases
                 36.              other adverbials
Extraposition, e.g., implies an embedded clause; a passive
construction implies a transitive verb; conditional clauses
may imply the past tense marker; contracted verbal forms
imply auxiliaries; and emphatic do and other auxiliaries are
mutually exclusive. However, none of these relationships
(referred to here as grammatical co-occurrence
restrictions), with the exception of the last, is
reciprocal. Extraposition implies an embedded clause, but an
embedded clause does not necessarily imply extraposition; a
passive construction implies a transitive verb, but a
transitive verb does not imply a passive construction, etc.
Since these various elements are not completely redundant,
they are able to operate at least semi-independently as
possible syntactic indicators of style.
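
The non-reciprocal character of these restrictions can be illustrated with a minimal sketch. It assumes, purely for illustration, that each sentence is represented as a set of feature labels; the label names and the function `broken_restrictions` are hypothetical and are not part of the original analysis.

```python
# Hypothetical feature labels; each sentence is the set of structures found in it.
# Each pair reads "antecedent implies consequent", and only in that direction.
ONE_WAY_IMPLICATIONS = [
    ("extraposition", "embedded_clause"),
    ("passive", "transitive_verb"),
]


def broken_restrictions(features):
    """Return the (antecedent, consequent) pairs that a sentence violates."""
    return [
        (antecedent, consequent)
        for antecedent, consequent in ONE_WAY_IMPLICATIONS
        if antecedent in features and consequent not in features
    ]


# A passive sentence must contain a transitive verb ...
passive_sentence = {"passive", "transitive_verb"}
# ... but a transitive verb may occur without a passive construction.
active_sentence = {"transitive_verb"}

print(broken_restrictions(passive_sentence))
print(broken_restrictions(active_sentence))
print(broken_restrictions({"passive"}))
```

Only the last call reports a violation, which is the sense in which the variables are linked but not redundant.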
Five genres were chosen for investigation on the basis
of context of utterance which permits identification by
place of publication. The genres selected were: Learned
Journals, Newspaper Reportage, Popular Journals, Government
Documents, and Fiction.
The actual data were drawn from the Brown University
million-word English corpus, A Standard Sample of Present-
Day Edited American English for Use with Digital Computers.
This corpus consists of 500 samples of English-language
texts published in the United States in 1961, each sample
approximately 2,000 words long. This large number of
relatively short samples minimizes the effect of any single
author or topic, and the restrictions on date and place of
publication control variables associated with provenance. A
complete description of this corpus and its content may be
found in Francis (1964) or in Kucera & Francis (1967).
A total sample of 500 sentences was drawn, 100 from
each of the five genres. Each genre subset of 100 consisted
of ten sentences from each of ten sentence-length blocks.
Sentence length was measured in words, and the blocks are
shown in Table 2.
TABLE 2
SENTENCE-LENGTH BLOCKS FOR SAMPLING

Block Number    Block Length in Words

These block lengths were chosen to mirror roughly the
distribution of sentence lengths in the entire corpus. A
structured sample of this kind was drawn to prevent sentence
length as such from acting as a variable since sentence-
length distribution was already known to differentiate among
genres (Marckworth & Bell, 1967), and also to guarantee that
those syntactic devices which tend to be associated with
greater sentence length would have equal opportunities to
appear within each genre sample. Again, the emphasis in this
study was on the sentence as the basic unit of analysis, and
on the co-occurrence of syntactic structures within
sentences. The reader should keep in mind, however, that a
random sample of sentences would permit certain sentence
lengths to dominate in specific genres, possibly obscuring
the co-occurrence patterns of interest here but,
nevertheless, reflecting another property which can clearly
be said to characterize genre style.
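
The structured sampling scheme described above (ten sentences from each of ten length blocks, for each genre) might be sketched as follows. This is only an illustration under stated assumptions: the block boundaries in `BLOCKS` are invented placeholders, not the boundaries of Table 2, and the function names are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical block boundaries (inclusive word counts). These are
# placeholders for illustration, NOT the blocks given in Table 2.
BLOCKS = [(1, 5), (6, 10), (11, 15), (16, 20), (21, 25),
          (26, 30), (31, 35), (36, 40), (41, 50), (51, 10_000)]


def block_of(sentence):
    """Return the index of the length block a sentence falls into."""
    n_words = len(sentence.split())
    for index, (low, high) in enumerate(BLOCKS):
        if low <= n_words <= high:
            return index
    return None  # e.g. an empty string


def stratified_sample(sentences_by_genre, per_block=10, seed=0):
    """Draw `per_block` sentences from every length block of every genre."""
    rng = random.Random(seed)
    sample = {}
    for genre, sentences in sentences_by_genre.items():
        by_block = defaultdict(list)
        for sentence in sentences:
            block = block_of(sentence)
            if block is not None:
                by_block[block].append(sentence)
        chosen = []
        for block in range(len(BLOCKS)):
            pool = by_block[block]
            if len(pool) < per_block:
                raise ValueError(
                    f"{genre}: block {block} has only {len(pool)} sentences")
            chosen.extend(rng.sample(pool, per_block))
        sample[genre] = chosen
    return sample
```

With ten blocks and `per_block=10`, each genre contributes 100 sentences, so five genres give the 500 observations used in the study. Fixing the number drawn per block is what keeps sentence length from acting as a variable.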
Each of the sample sentences was analyzed for the
occurrence of the syntactic variables indicated in Table 1,
and the number of occurrences of each structure was
recorded. A discussion of the basis on which the syntactic
elements were identified may be found in Marckworth (1973,
pp. 44-48). The basic data for analysis thus consisted of
500 observations (sentences) with each observation scored on
36 variables and classified by genre and length. The
subsequent analysis of these data was based primarily on
discriminant functions which were used to determine how the
variables served to distinguish one genre from another.
Basically, discriminant function analysis is the
multivariate extension of the univariate F ratio which is
used to distinguish among previously established groups. It
represents, however, a considerable increase in both
complexity and analytical power since it focuses not only on
the simple differences between groups on each variable, but
also on the interrelationships among differences on the
several variables considered simultaneously. It serves to
maximize group differences by developing maximally efficient
weights which, when applied to the original data, will yield
the clearest distinctions among the groups being analyzed.
The method of discriminant function analysis is discussed
fully in Rulon, et al. (1967, pp. 299-315).
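
The idea of weights that maximize group separation can be illustrated in its simplest form: Fisher's linear discriminant for two groups and two variables, where the weight vector is the inverse of the pooled within-group scatter matrix applied to the difference of the group means. This sketch is not the multi-group procedure used in the study, and the toy counts below are invented for illustration only.

```python
def mean_vector(rows):
    """Column means of a list of (x, y) observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mean):
    """2x2 within-group matrix of sums of squares and cross-products."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_weights(group_a, group_b):
    """Weights w = W^-1 (mean_a - mean_b), W the pooled within-group scatter."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    w = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    inv = [[w[1][1] / det, -w[0][1] / det],
           [-w[1][0] / det, w[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def score(row, weights):
    """Discriminant score of one observation."""
    return row[0] * weights[0] + row[1] * weights[1]


# Invented toy data: (past tense markings, passive constructions) per sentence.
FICTION = [(2, 0), (3, 0), (2, 1), (3, 1)]
NON_FICTION = [(0, 1), (1, 1), (0, 2), (1, 2)]

WEIGHTS = fisher_weights(FICTION, NON_FICTION)
print(WEIGHTS)
print([score(r, WEIGHTS) for r in FICTION])
print([score(r, WEIGHTS) for r in NON_FICTION])
```

In this toy case the first variable receives a positive weight and the second a negative one, so every fiction sentence scores higher than every non-fiction sentence even though the second variable alone overlaps between the groups.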
RESULTS
The frequency of occurrence of each of the variables
within each of the genre categories is shown in Table 3.
Eleven of the variables, indicated by an asterisk following
the total column, appeared in less than five percent of the
sentences examined. Because of this low incidence,
interpretation of these variables would be difficult and
tenuous so they were omitted from further analyses.

TABLE 3
FREQUENCY OF OCCURRENCE OF EACH VARIABLE

                              Genres
Vari-   Learned   Newspaper  Popular   Government
ables   Journals  Reportage  Journals  Documents   Fiction  Total

*Omitted from main analysis because of low frequency of
occurrence in the total sample (< 25).
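
The exclusion rule can be stated compactly. The sketch below assumes, following the footnote to Table 3, that a variable is dropped when its total frequency in the 500-sentence sample is below 25; the totals shown are invented placeholders, since the body of Table 3 is not reproduced here, and `retained_variables` is a hypothetical helper name.

```python
def retained_variables(totals, min_total=25):
    """Return the variable numbers whose total frequency reaches `min_total`."""
    return sorted(v for v, total in totals.items() if total >= min_total)


# Invented totals for four variables, for illustration only.
example_totals = {1: 480, 3: 4, 9: 24, 15: 25}
print(retained_variables(example_totals))
```

A threshold of 25 corresponds to five percent of the 500 sentences sampled.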
Each variable was first examined individually in an
analysis of variance for the five genre groups and four
tests were run on each to determine if the variable would
differentiate:
a. fiction from the non-fiction genres,
b. formal (Learned Journals and Government Documents)
from informal (Newspaper Reportage and Popular
Journals) non-fiction,
c. Learned Journals from Government Documents,
d. Newspaper Reportage from Popular Journals.
Fourteen variables (8, 11, 12, 13, 15, 16, 17, [illegible],
21, 24, 26, 28, 35, and 36) were found significantly to
differentiate fiction from non-fiction (p ≤ .05); three
variables (1, 26, and 33) distinguished formal from
informal non-fiction; none distinguished between Learned
Journals and Government Documents; and only one (1)
differentiated Newspaper Reportage from Popular Journals.
Two of the variables (6 and 23) could not be examined
in this way because of zero incidence in some genres, but
the data pattern for 23 (inclusion of direct discourse)
suggests an obvious distinction between formal and informal
non-fiction. Variable 6 (imperatives) would clearly
distinguish between Newspaper Reportage and Popular
Journals, which is not surprising in view of the number of
how-to-do-it articles in the latter genre. These two
variables (6 and 23) could be and were retained for the
discriminant function analyses. The remaining variables (10,
22, 25, 30, 31, 32, and 34) showed no distinction as
univariate indices on the four tests but they were,
nevertheless, also retained for the multivariate analysis
since they could, when analyzed in conjunction with other
variables, still provide information for genre distinction.
This is because the simple univariate analyses discussed
above do not take into account the possible
intercorrelations (constellation effects) among the
variables.
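
For a single variable, a univariate test of this kind reduces to the familiar F ratio: the between-group mean square divided by the within-group mean square. The sketch below shows one plausible form of test (a), fiction against the pooled non-fiction sentences, on invented counts. How the four contrasts were actually computed is not specified in the text, so this is an assumption, and judging significance would still require comparing the result with a critical value of F for the appropriate degrees of freedom.

```python
def f_ratio(groups):
    """One-way analysis-of-variance F ratio for a list of groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented per-sentence counts of one variable, for illustration only.
fiction = [2, 3, 2, 3]
non_fiction = [0, 1, 0, 1]  # all non-fiction genres pooled
print(f_ratio([fiction, non_fiction]))
```

Because each variable is tested on its own, a test of this kind cannot detect a variable that discriminates only in combination with others, which is the limitation the multivariate analysis is meant to address.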
The first multivariate analysis was a five-group
discriminant function analysis, performed on the five
genres. It indicated a clear differentiation of fiction from
the four non-fiction genres (see Figure 1), on the basis of
the eight syntactic variables listed in Table 4. These
results demonstrate that sentences from all of the non-
fiction genres are more alike in syntactic structure than
any of them are like sentences from fiction. This may, at
first view, be surprising in light of the range of non-
fiction genres included in the study, but it bears out the
findings of at least one other investigation of quantitative
characteristics of the language of different genres
(Marckworth and Bell, 1967, on sentence-length
distributions): that the major measurable stylistic
distinction is between fiction and non-fiction genres.

Figure 1. Plot of the group centroids for the five genres on
the first two discriminant functions. (Legend: GD =
government documents; LJ = learned journals; PJ = popular
journals; NR = newspaper reportage; F = fiction. Horizontal
axis: Discriminant Function I.)

TABLE 4
VARIABLES DISCRIMINATING FICTION
FROM THE FOUR NON-FICTION GENRES

Variable                                    Discriminant
Number    Variable                          Weight*

15        past tense marking                 .73
20        contracted verbal forms            .47
13        perfective aspect                  .39
26        adverbial clauses
23        inclusion of direct discourse

* Those variables having a positive value are characteristic
of fiction sentences, but not of non-fiction; those with a
negative value, vice versa.
Several interesting observations may be made about the
syntactic variables that participate in the discrimination
(see Table 4). The most obvious point is the heavy
involvement of syntactic features of the verbal unit in
differentiating fiction from non-fiction styles. With the
exception of inclusion of direct discourse, which seems
transparently attributable to the dialogue characteristic of
fiction, and the lack of passive constructions, since voice
has been shown to be a whole-sentence focus phenomenon
(Andrew, 1974), all of the variables in Table 4 are
associated with verbal rather than nominal elements of the
sentence. Marking for past tense and perfective aspect,
intransitive verbs, and contracted verbal forms are all
specific to the verb phrase, and adverbial clauses and other
adverbials are either specifically verb-modifying or are
whole-sentence-modifying. Apparently elements of the noun
phrase, or at least those considered in this study, do not
participate in the distinctively style-associated
constellations of syntactic structures that distinguish
fiction from non-fiction.
A second notable feature distinguishing the fiction
sentence set is the amount of indication of past time
action. This is conveyed not only by the formal past tense
variable, which in the great majority of cases does indicate
a past time action, but also by the perfective aspect, which
always indicates a past time event whether marked for past
or present tense. This feature is perhaps understandable in
view of the usual function of fiction as a narrative of past
events, and it should also be noted that this same function
may utilize another role of perfective aspect--that of
interrelating sequential events through time.
A question may be raised about the relationship of two
of the fiction-distinguishing variables: the presence of
intransitive verbs and the absence of passive constructions.
Since these two exhibit a non-reciprocal grammatical co-
occurrence restriction between voice and verb type--passive
voice implies a transitive verb but not vice versa--it is
possible that the paucity of passive constructions in
fiction is simply an artefact of the frequency of
intransitive verbs. In other words, we must ask whether
passive sentences occur less in fiction than in non-fiction
simply because they have less opportunity to do so, or
whether non-passiveness is an independent syntactic feature
of fiction style. A comparison of the ratios of occurrence
of passive construction to transitive verb in the fiction
and non-fiction genres shows the latter case to be the true
one. (Such a ratio expresses the actual occurrence of
passive sentences in relation to the possible occurrences.)
The ratios for the five genres are
Fiction
Learned Journals
Newspaper Reportage
Popular Journals
Government Documents
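
The ratio in question is simply the number of passive constructions divided by the number of transitive verbs in each genre, i.e. actual passives relative to the opportunities for a passive. The sketch below shows the computation on invented counts only; the figures are placeholders and must not be read as the study's data, and `passive_ratios` is a hypothetical helper name.

```python
def passive_ratios(counts):
    """Ratio of passive constructions to transitive verbs for each genre."""
    return {genre: c["passive"] / c["transitive"]
            for genre, c in counts.items()}


# Invented counts, for illustration only.
example_counts = {
    "Fiction": {"passive": 10, "transitive": 100},
    "Learned Journals": {"passive": 40, "transitive": 100},
}
print(passive_ratios(example_counts))
```

If the ratio for fiction is lower than for the other genres, the scarcity of passives cannot be put down to a shortage of transitive verbs alone, which is the argument made in the text.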
It is tempting to speculate about just why the fiction
genre should be characterized by significantly more
intransitive verbs to cause this variable to be
discriminatory. One possible explanation, which suggests a
simple characterization of fiction style, could be based on
analyzing a large subset (if not the whole class) of verbs
usually called intransitive as items which can occur both
with no object (traditionally called intransitive verbs) and
with one or two objects (traditionally called transitives).
In such an analysis it is presumed that when the
grammatical object of a verb of this class is either
redundant or not completely specified it is suppressed and
the result is a one-place predicate sentence, e.g., John
sang. When the grammatical object carries new or required
information, it is present and the result is a multiple-place
predicate, e.g., John sang a Greek folksong. This sort of
analysis of the "intransitive" verb opens the door to a very
general characterization of genre differences. Thus viewed,
the intransitive verb variable characterizes sentences which
are not heavily information oriented--sentences in which a
major component, the grammatical object, is either so
predictable or so unimportant that it is not even specified.
Such sentences are significantly more characteristic of
fiction than of non-fiction writing, and this analysis of
them suggests a measurable basis for the old rule of thumb
that successful (although not necessarily good) fiction
writing is strongly action oriented. It also suggests the
validity of the common-sense intuition that a primary source
of the differences between fiction and non-fiction is that
the latter is designed foremost as an information-conveying
instrument; that in the dichotomy of literary purpose it is
more likely to teach than to delight.
Since one of the major ways in which information can be
packed into a sentence is through heavy use of nominal
elements, we looked at a simple measure of this
characterization of fiction style as action-oriented as
opposed to non-fiction style as information-oriented: the
verb/noun ratio for each genre.
These ratios are
Fiction
Learned Journals
Newspaper Reportage
Popular Journals
Government Documents
bearing out the supposition that sentences in the non-
fiction genres have more nouns in proportion to verbs than
do those from fiction. It may also be noted that, in
addition to being high in nouns in proportion to verbs, non-
fiction sentences also exhibit somewhat more noun
modification than fiction, as shown by the following ratios
of all noun-modifier types (variables 28, 29, 30, 31, 32,
and 33) to nouns:
Fiction
Learned Journals
Newspaper Reportage
Popular Journals
Government Documents
Thus, from the discriminatory variables identified by
the five-group discriminant function analysis and the
further observations suggested by them, a picture emerges of
distinctive syntactic structure constellations in at least
two major genre categories: fiction, with the syntactic
structures determined by the function of past-time, action-
oriented, narrative communication; and non-fiction, with
structures determined by an information-carrying function.
The five-group discriminant function analysis showed
such a major distinction between fiction and the non-fiction
genres that it seemed possible that differences in the non-
fiction genres might have been obscured. In consequence, a
four-group discriminant function analysis was done on data
from these genres only. The result indicated a distinction
in syntactic structure between the formal genres (Learned
Journals and Government Documents) and the informal
(Newspaper Reportage and Popular Journals) along the first
axis, and along the second axis a distinction between
Popular Journals and Newspaper Reportage. The third
dimension distinguished Learned Journals from Government
Documents (see Figure 2).
Table 5 shows the syntactic variables that participate
in these three discriminations. These discriminating
structures present a less distinct picture of different
types of writing than do those differentiating fiction and
non-fiction, but nevertheless illustrate some interesting
points about genre and style.
Of the four items that characterize informal non-
fiction sentences, only transitive verb is not susceptible
to immediate explanation, although we can note that its
presence as an informal marker must be due almost entirely
to sentences from Popular Journals since, in the
discrimination between Newspaper Reportage and Popular
Journals, it has a negative weighting for the former; that
is, in the comparison of this pair of genres, Newspaper
Reportage is distinguished by the absence of transitive
verbs (see Variable 16, Discriminant Function II, in Table
5).

Figure 2. Plots of the group centroids for the four non-
fiction genres for discriminant functions I, II
and I, III. (Horizontal axis of both plots:
Discriminant Function I.)

TABLE 5
VARIABLES DISCRIMINATING AMONG NON-FICTION GENRES

Discriminant   Variable                                  Discriminant
Function       Number    Variable                        Weight

I. INFORMAL FROM FORMAL NON-FICTION GENRES
               23   inclusion of direct discourse          .47
               20   contracted verbal forms
               15   past tense
               16   transitive verbs                       .29
               32   partially reduced relative clauses    -.25
               33   noun adjuncts                         -.32
               26   adverbial clauses                     -.32
II. NEWSPAPER REPORTAGE FROM POPULAR JOURNALS
                1   declarative sentences
               23   inclusion of direct discourse
               16   transitive verbs                      -.35
                6   imperative sentences
III. LEARNED JOURNALS FROM GOVERNMENT DOCUMENTS
               24   nominalization in noun phrase
                8   passive constructions
               22   conjoined words and phrases
               33   noun adjuncts                         -.26
               11   modals and catenatives                 31
The other three distinguishing features of the informal
genres are inclusion of direct discourse, contracted verbal
forms, and past tense. Inclusion of direct discourse is
probably present as a result of the fact that the parsing
procedure did not differentiate between true direct
discourse of the sort found in fiction dialogue and the
inclusion of quoted material of the sort found in Newspaper
Reportage in which one or two words may be quoted. The
supposition that Newspaper Reportage contributed this
discriminant variable to the informal category is borne out
by its appearance as a characteristic distinguishing that
genre from Popular Journals, as shown by the second function
in Table 5.
Contracted verbal forms are a typical and, frequently,
a deliberate indicator of informal style, which probably
explains the presence of this variable as a discriminator.
In addition, editorial policy (or a writer's perception of
it) usually discourages the use of these forms in any sort
of formal written language, which explains their absence in
Learned Journals and Government Documents. Why past tense
should differentiate informal from formal non-fiction is not
really clear; one possible contributory cause may be that
Learned Journals frequently discuss things as they are (or
appear to be), Government Documents (which in this sample
are largely proclamations of future legal interpretations or
holidays) discuss things as they will be, but Newspaper
Reportage discusses things as they have been.
Several other transparent discriminatory variables are
to be seen in Table 5. Variable 8, passive constructions,
which distinguishes Learned Journals from Government
Documents, is almost certainly a result of conscious
editorial policy. The presence of declarative sentences and
the absence of imperative sentences which distinguish
Newspaper Reportage from Popular Journals are a joint result
of the presence in Popular Journals of a number of how-to-
do-it articles: "Hold the brick in your left hand ..."
A less transparent, but perhaps more analytically
interesting set of variables is shown in Table 5 (32, 33,
26, and 24, 22, 33). The first three, partially reduced
relative clauses, noun adjuncts, and adverbial clauses, are
atypical of informal non-fiction in the informal/formal
discrimination; the other three are involved in
distinguishing Learned Journals from Government Documents--
nominalizations in noun phrase and conjoined words and
phrases by their presence in Learned Journals and noun
adjuncts by its absence. (The activity of noun adjuncts as
discriminatory by its absence in both of the informal genres
and in Learned Journals indicates its heavy use in
Government Documents: "I, John Chaffee, Governor of Rhode
Island....") All five of these structures are significantly
characteristic of one or both of the formal non-fiction
genres, and all five are from the category of conjoined or
embedded syntactic elements--that is, syntactic structures
whose primary purpose is to compress and relate information
within the sentence. Seemingly, those genres in which the
author's intent is to convey maximum information with
maximum explicitness are just those that make maximum use of
such syntactic techniques. (We may note in passing that,
except for adverbial clauses and for conjoined words and
phrases, which may involve either nouns or verbs, the
significant elements are members of the noun phrase; it
appears that only at this level of genre discrimination is
anything but a verbal or whole-sentence element a
significant stylistic indicator.)
CONCLUSION
What seems to be evident from the above results is
that, while there are indeed significantly different
patterns of syntactic occurrence between genres, these
patterns (with the exception of editorially determined use
of passive constructions in Learned Journal style and
avoidance of contracted verbal forms in any formal style)
result primarily from general semantic constraints operating
within the genres and based in the communicative purposes of
the genres. To wit, fiction, no matter what its topic, is
typically a narration of past but interconnected actions,
and the syntactic structures that differentiate fiction from
non-fiction are ones which convey this semantic content;
non-fiction, again no matter what it is about, is in general
a data-conveying instrument, even though there are
detectable differences in the manner in which the data are
conveyed, e.g., degree of specificity of data (Learned
Journals), degree of didacticism (Popular Journals), and
degree of included narration (Newspaper Reportage). Again,
these broad semantic similarities and differences are
reflected in the syntactic structures that differentiate the
genre styles. In summary, quantitative differences in
syntactic structures can indeed be found between
independently-defined sub-populations of language (genres),
but they appear to correspond to--and are presumably the
result of--generic communicative purposes of the genres, and
should consequently be viewed as internally-constrained
artefacts of this semantic component rather than externally-
defined elements of style.

REFERENCES 

C. M. Andrew. 1974. An experimental approach to grammatical focus. PhD dissertation. Univ. of Alberta, Dept. of Linguistics.

W. N. Francis. 1964. A standard sample of present-day edited American English, for use with digital computers. Brown University, Dept. of Linguistics.

W. N. Francis. 1964. Manual of information to accompany a standard sample of present-day American English, for use with digital computers. Brown University, Dept. of Linguistics.

H. Kucera and W. N. Francis. 1967. Computational analysis of present-day American English. Brown University Press.

M. L. Marckworth and L. M. Bell. 1967. Sentence-length distribution in the Corpus. In Kucera and Francis, pp. 368-405.

M. L. Marckworth. 1973. Statistical determination of some elements of genre style. PhD dissertation. Brown University, Dept. of Linguistics.

P. Rulon, D. V. Tiedeman, M. M. Tatsuoka, and C. R. Langmuir. 1967. Multivariate statistics for personnel classification. Wiley, NY.
