Automatic Detection of Text Genre 
Brett Kessler Geoffrey Nunberg Hinrich Schfitze 
Xerox Palo Alto Research Center 
3333 Coyote Hill Road 
Palo Alto CA 94304 USA 
Department of Linguistics 
Stanford University 
Stanford CA 94305-2150 USA 
emaih {bkessler,nunberg,schuetze}~parc.xerox.com 
URL: ftp://parcftp.xerox.com/pub/qca/papers/genre 
Abstract 
As the text databases available to users be- 
come larger and more heterogeneous, genre 
becomes increasingly important for com- 
putational linguistics as a complement to 
topical and structural principles of classifi- 
cation. We propose a theory of genres as 
bundles of facets, which correlate with var- 
ious surface cues, and argue that genre de- 
tection based on surface cues is as success- 
ful as detection based on deeper structural 
properties. 
1 Introduction 
Computational linguists have been concerned for the 
most part with two aspects of texts: their structure 
and their content. That is. we consider texts on 
the one hand as formal objects, and on the other 
as symbols with semantic or referential values. In 
this paper we want to consider texts from the point 
of view of genre: that is. according to the various 
functional roles they play. 
Genre is necessarily a heterogeneous classificatory 
principle, which is based among other things on the 
way a text was created, the way it is distributed, 
the register of language it uses, and the kind of au- 
dience it is addressed to. For all its complexity, this 
attribute can be extremely important for many of 
the core problems that computational linguists are 
concerned with. Parsing accuracy could be increased 
by taking genre into account (for example, certain 
object-less constructions occur only in recipes in En- 
glish). Similarly for POS-tagging (the frequency of 
uses of trend as a verb in the Journal of Commerce 
is 35 times higher than in Sociological Abstracts). In 
word-sense disambiguation, many senses are largely 
restricted to texts of a particular style, such as col- 
loquial or formal (for example the word pretty is far 
more likely to have the meaning "rather" in informal 
genres than in formal ones). In information retrieval, 
genre classification could enable users to sort search 
results according to their immediate interests. Peo- 
ple who go into a bookstore or library are not usually 
looking simply for information about a particular 
topic, but rather have requirements of genre as well: 
they are looking for scholarly articles about hypno- 
tism, novels about the French Revolution, editorials 
about the supercollider, and so forth. 
If genre classification is so useful, why hasn't it fig- 
ured much in computational linguistics before now? 
One important reason is that, up to now, the digi- 
tized corpora and collections which are the subject 
of much CL research have been for the most part 
generically homogeneous (i.e., collections of scientific 
abstracts or newspaper articles, encyclopedias, and 
so on), so that the problem of genre identification 
could be set aside. To a large extent, the problems 
of genre classification don't become salient until we 
are confronted with large and heterogeneous search 
domains like the World-Wide Web. 
Another reason for the neglect of genre, though, is 
that it can be a difficult notion to get a conceptual 
handle on. particularly in contrast with properties of 
structure or topicality, which for all their complica- 
tions involve well-explored territory. In order to do 
systematic work on automatic genre classification. 
by contrast, we require the answers to some basic 
theoretical and methodological questions. Is genre a 
single property or attribute that can be neatly laid 
out in some hierarchical structure? Or are we really 
talking about a muhidimensional space of properties 
that have little more in common than that they are 
more or less orthogonal to topicality? And once we 
have the theoretical prerequisites in place, we have 
to ask whether genre can be reliably identified by 
means of computationally tractable cues. 
In a broad sense, the word "genre" is merely a 
literary substitute for "'kind of text," and discus- 
sions of literary classification stretch back to Aris- 
32 
totle. We will use the term "'genre" here to re- 
fer to any widely recognized class of texts defined 
by some common communicative purpose or other 
functional traits, provided the function is connected 
to some formal cues or commonalities and that the 
class is extensible. For example an editorial is a 
shortish prose argument expressing an opinion on 
some matter of immediate public concern, typically 
written in an impersonal and relatively formal style 
in which the author is denoted by the pronoun we. 
But we would probably not use the term "genre" 
to describe merely the class of texts that have the 
objective of persuading someone to do something, 
since that class -- which would include editorials, 
sermons, prayers, advertisements, and so forth -- 
has no distinguishing formal properties. At the other 
end of the scale, we would probably not use "genre" 
to describe the class of sermons by John Donne, since 
that class, while it has distinctive formal characteris- 
tics, is not extensible. Nothing hangs in the balance 
on this definition, but it seems to accord reasonably 
well with ordinary usage. 
The traditional literature on genre is rich with 
classificatory schemes and systems, some of which 
might in retrospect be analyzed as simple at- 
tribute systems. (For general discussions of lit- 
erary theories of genre, see, e.g., Butcher (1932), 
Dubrow (1982), Fowler (1982), Frye (1957), Her- 
nadi (1972), Hobbes (1908), Staiger (1959), and 
Todorov (1978).) We will refer here to the attributes 
used in classifying genres as GENERIC FACETS. A 
facet is simply a property which distinguishes a class 
of texts that answers to certain practical interests~ 
and which is moreover associated with a characteris- 
tic set of computable structural or linguistic proper- 
ties, whether categorical or statistical, which we will 
describe as "generic cues." In principle, a given text 
can be described in terms of an indefinitely large 
number of facets. For example, a newspaper story 
about a Balkan peace initiative is an example of a 
BROADCAST as opposed to DIRECTED communica- 
tion, a property that correlates formally with cer- 
tain uses of the pronoun you. It is also an example 
of a NARRATIVE, as opposed to a DIRECTIVE (e.g.. 
in a manual), SUASXVE (as in an editorial), or DE- 
SCRIPTIVE (as in a market survey) communication; 
and this facet correlates, among other things, with 
a high incidence of preterite verb forms. 
Apart from giving us a theoretical framework for 
understanding genres, facets offer two practical ad- 
vantages. First. some applications benefit from cat- 
egorization according to facet, not genre. For ex- 
ample, in an information retrieval context, we will 
want to consider the OPINION feature most highly 
when we are searching for public reactions to the 
supercollider, where newspaper columns, editorials. 
and letters to the editor will be of roughly equal in- 
terest. For other purposes we will want to stress 
narrativity, for example in looking for accounts of 
the storming of the Bastille in either novels or his- 
tories. 
Secondly. we can extend our classification to gen- 
res not previously encountered. Suppose that we 
are presented with the unfamiliar category FINAN- 
CIAL ANALYSTS' REPORT. By analyzing genres as 
bundles of facets, we can categorize this genre as 
INSTITUTIONAL (because of the use of we as in edi- 
torials and annual reports) and as NON-SUASIVE or 
non-argumentative (because of the low incidence of 
question marks, among other things), whereas a sys- 
tem trained on genres as atomic entities would not 
be able to make sense of an unfamiliar category. 
1.1 Previous Work on Genre Identification 
The first linguistic research on genre that uses quan- 
titative methods is that of Biber (1986: 1988; 1992; 
1995), which draws on work on stylistic analysis, 
readability indexing, and differences between spo- 
ken and written language. Biber ranks genres along 
several textual "dimensions", which are constructed 
by applying factor analysis to a set of linguistic syn- 
tactic and lexical features. Those dimensions are 
then characterized in terms such as "informative vs. 
involved" or "'narrative vs. non-narrative." Factors 
are not used for genre classification (the values of a 
text on the various dimensions are often not infor- 
mative with respect to genre). Rather, factors are 
used to validate hypotheses about the functions of 
various linguistic features. 
An important and more relevant set of experi- 
ments, which deserves careful attention, is presented 
in Karlgren and Cutting {1994). They too begin 
with a corpus of hand-classified texts, the Brown 
corpus. One difficulty here. however, is that it is 
not clear to what extent the Brown corpus classi- 
fication used in this work is relevant for practical 
or theoretical purposes. For example, the category 
"Popular Lore" contains an article by the decidedly 
highbrow Harold Rosenberg from Commentary. and 
articles from Model Railroader and Gourmet, surely 
not a natural class by any reasonable standard. In 
addition, many of the text features in Karlgren and 
Cutting are structural cues that require tagging. We 
will replace these cues with two new classes of cues 
that are easily computable: character-level cues and 
deviation cues. 
33 
2 Identifying Genres: Generic Cues 
This section discusses generic cues, the "'observable'" 
properties of a text that are associated with facets. 
2.1 Structural Cues 
Examples of structural cues are passives, nominal- 
izations, topicalized sentences, and counts of the fre- 
quency of syntactic categories (e.g.. part-of-speech 
tags). These cues are not much discussed in the tra- 
ditional literature on genre, but have come to the 
fore in recent work (Biber, 1995; Karlgren and Cut- 
ting, 1994). For purposes of automatic classification 
they have the limitation that they require tagged or 
parsed texts. 
2.2 Lexical Cues 
Most facets are correlated with lexical cues. Exam- 
ples of ones that we use are terms of address (e.g., 
Mr., Ms.). which predominate in papers like the New 
~brk Times: Latinate affixes, which signal certain 
highbrow registers like scientific articles or scholarly 
works; and words used in expressing dates, which are 
common in certain types of narrative such as news 
stories. 
2.3 Character-Level Cues 
Character-level cues are mainly punctuation cues 
and other separators and delimiters used to mark 
text categories like phrases, clauses, and sentences 
(Nunberg, 1990). Such features have not been used 
in previous work on genre recognition, but we be- 
lieve they have an important role to play, being at 
once significant and very frequent. Examples include 
counts of question marks, exclamations marks, cap- 
italized and hyphenated words, and acronyms. 
2.4 Derivative Cues 
Derivative cues are ratios and variation measures de- 
rived from measures of lexical and character-level 
features. 
Ratios correlate in certain ways with genre, and 
have been widely used in previous work. We repre- 
sent ratios implicitly as sums of other cues by trans- 
forming all counts into natural logarithms. For ex- 
ample, instead of estimating separate weights o, 3, 
and 3' for the ratios words per sentence (average 
sentence length), characters per word (average word 
length) and words per type (token/type ratio), re- 
spectively, we express this desired weighting: 
, II'+l C+I W+I alog~+31og~+3,1og T+I 
as follows: 
"(c~ -/3 + 7) log(W + 1)- 
a log(S + 1) + 31og(C + 1) - ~. log(T + l) 
(where W = word tokens. S = sentences. C =char- 
acters, T = word types). The 55 cues in our ex- 
periments can be combined to almost 3000 different 
ratios. The log representation ensures that. all these 
ratios are available implicitly while avoiding overfit- 
ting and the high computational cost of training on 
a large set of cues. 
Variation measures capture the amount of varia- 
tion of a certain count cue in a text (e.g.. the stan- 
dard deviation in sentence length). This type of use- 
ful metric has not been used in previous work on 
genre. 
The experiments in this paper are based on 55 
cues from the last three groups: lexical, character- 
level and derivative cues. These cues are easily com- 
putable in contrast to the structural cues that have 
figured prominently in previous work on genre. 
3 Method 
3.1 Corpus 
The corpus of texts used for this study was the 
Brown Corpus. For the reasons mentioned above, 
we used our own classification system, and elimi- 
nated texts that did not fall unequivocally into one 
of our categories. W'e ended up using 499 of the 
802 texts in the Brown Corpus. (While the Corpus 
contains 500 samples, many of the samples contain 
several texts.) 
For our experiments, we analyzed the texts in 
terms of three categorical facets: BROW, NARRA- 
TIVE, and GENRE. BROW characterizes a text in 
terms of the presumptions made with respect to the 
required intellectual background of the target au- 
dience. Its levels are POPULAR, MIDDLE. UPPER- 
MIDDLE, and HIGH. For example, the mainstream 
American press is classified as MIDDLE and tabloid 
newspapers as POPULAR. The ,NARRATIVE facet is 
binary, telling whether a text is written in a narra- 
tive mode, primarily relating a sequence of events. 
The GENRE facet has the values REPORTAGE, ED- 
ITORIAL, SCITECH, LEGAL. NONFICTION, FICTION. 
The first two characterize two types of articles from 
the daily or weekly press: reportage and editorials. 
The level SCITECH denominates scientific or techni- 
cal writings, and LEGAL characterizes various types 
of writings about law and government administra- 
tion. Finally, NONFICTION is a fairly diverse cate- 
gory encompassing most other types of expository 
writing, and FICTION is used for works of fiction. 
Our corpus of 499 texts was divided into a train- 
"ing subcorpus (402 texts) and an evaluation subcor- 
pus (97). The evaluation subcorpus was designed 
34 
to have approximately equal numbers of all repre- 
sented combinations of facet levels. Most such com- 
binations have six texts in the evaluation corpus, but 
due to small numbers of some types of texts, some 
extant combinations are underrepresented. Within 
this stratified framework, texts were chosen by a 
pseudo random-number generator. This setup re- 
sults in different quantitative compositions of train- 
ing and evaluation set. For example, the most fre- 
quent genre level in the training subcorpus is RE- 
PORTAGE, but in the evaluation subcorpus NONFIC- 
TION predominates. 
3.2 Logistic Regression 
We chose logistic regression (LR) as our basic numer- 
ical method. Two informal pilot studies indicated 
that it gave better results than linear discrimination 
and linear regression. 
LR is a statistical technique for modeling a binary 
response variable by a linear combination of one or 
more predictor variables, using a logit link function: 
g(r) = log(r~(1 - zr)) 
and modeling variance with a binomial random vari- 
able, i.e., the dependent variable log(r~(1 - ,7)) is 
modeled as a linear combination of the independent 
variables. The model has the form g(,'r) = zi,8 where 
,'r is the estimated response probability (in our case 
the probability of a particular facet value), xi is the 
feature vector for text i, and ~q is the weight vector 
which is estimated from the matrix of feature vec- 
tors. The optimal value of fl is derived via maximum 
likelihood estimation (McCullagh and Netder, 1989), 
using SPlus (Statistical Sciences, 1991). 
For binary decisions, the application of LR was 
straightforward. For the polytomous facets GENRE 
and BROW, we computed a predictor function inde- 
pendently for each level of each facet and chose the 
category with the highest prediction. 
The most discriminating of the 55 variables were 
selected using stepwise backward selection based on 
the AIC criterion (see documentation for STEP.GLM 
in Statistical Sciences (1991)). A separate set of 
variables was selected for each binary discrimination 
task. 
3.2.1 Structural Cues 
In order to see whether our easily-computable sur- 
face cues are comparable in power to the structural 
cues used in Karlgren and Cutting (1994), we also 
ran LR with the cues used in their experiment. Be- 
cause we use individual texts in our experiments in- 
stead of the fixed-length conglomerate samples of 
Karlgren and Cutting, we averaged all count fea- 
tures over text length. 
3.3 Neural Networks 
Because of the high number of variables in our ex- 
periments, there is a danger that overfitting occurs. 
LR also forces us to simulate polytomous decisions 
by a series of binary decisions, instead of directly 
modeling a multinomial response. Finally. classical 
LR does not model variable interactions. 
For these reasons, we ran a second set of experi- 
ments with neural networks, which generally do well 
with a high number of variables because they pro- 
tect against overfitting. Neural nets also naturally 
model variable interactions. We used two architec- 
tures, a simple perceptron (a two-layer feed-forward 
network with all input units connected to all output 
units), and a multi-layer perceptron with all input 
units connected to all units of the hidden layer, and 
all units of the hidden layer connected to all out- 
put units. For binary decisions, such as determining 
whether or not a text is :NARRATIVE, the output 
layer consists of one sigmoidal output unit: for poly- 
tomous decisions, it consists of four (BRow) or six 
(GENRE) softmax units (which implement a multi- 
nomial response model} (Rumelhart et al., 1995). 
The size of the hidden layer was chosen to be three 
times as large as the size of the output layer (3 units 
for binary decisions, 12 units for BRow, 18 units for 
GENRE). 
For binary decisions, the simple perceptron fits 
a logistic model just as LR does. However, it is 
less prone to overfitting because we train it using 
three-fold cross-validation. Variables are selected 
by summing the cross-entropy error over the three 
validation sets and eliminating the variable that if 
eliminated results in the lowest cross-entropy error. 
The elimination cycle is repeated until this summed 
cross-entropy error starts increasing. Because this 
selection technique is time-consuming, we only ap- 
ply it to a subset of the discriminations. 
4 Results 
Table 1 gives the results of the experiments. ~For each 
genre facet, it compares our results using surface 
cues (both with logistic regression and neural nets) 
against results using Karlgren and Cutting's struc- 
tural cues on the one hand (last pair of columns) 
and against a baseline on the other (first column). 
Each text in the evaluation suite was tested for each 
facet. Thus the number 78 for NARRATIVE under 
method "LR (Surf.) All" means that when all texts 
were subjected to the NARRATIVE test, 78% of them 
were classified correctly. 
There are at least two major ways of conceiving 
what the baseline should be in this experiment. If 
35 
the machine were to guess randomly among k cat- 
egories, the probability of a correct guess would be 
1/k. i.e., 1/2 for NARRATIVE. 1/6 for GENRE. and 
1/4 for BROW. But one could get dramatic improve- 
ment just by building a machine that always guesses 
the most populated category: NONFICT for GENRE. 
MIDDLE for BROW, and No for NARRATIVE. The 
first approach would be fair. because our machines 
in fact have no prior knowledge of the distribution of 
genre facets in the evaluation suite, but we decided 
to be conservative and evaluate our methods against 
the latter baseline. No matter which approach one 
takes, however, each of the numbers in the table is 
significant at p < .05 by a binomial distribution. 
That is, there is less than a 5% chance that a ma- 
chine guessing randomly could have come up with 
results so much better than the baseline. 
It will be recalled that in the LR models, the 
facets with more than two levels were computed by 
means of binary decision machines for each level, 
then choosing the level with the most positive score. 
Therefore some feeling for the internal functioning of 
our algorithms can be obtained by seeing what the 
performance is for each of these binary machines, 
and for the sake of comparison this information is 
also given for some of the neural net models. Ta- 
ble 2 shows how often each of the binary machines 
correctly determined whether a text did or did not 
fall in a particular facet level. Here again the ap- 
propriate baseline could be determined two ways. 
In a machine that chooses randomly, performance 
would be 50%, and all of the numbers in the table 
would be significantly better than chance (p < .05, 
binomial distribution). But a simple machine that 
always guesses No would perform much better, and 
it is against this stricter standard that we computed 
the baseline in Table 2. Here, the binomial distribu- 
tion shows that some numbers are not significantly 
better than the baseline. The numbers that are sig- 
nificantly better than chance at p < .05 by the bi- 
nomial distribution are starred. 
Tables 1 and 2 present aggregate results, when 
all texts are classified for each facet or level. Ta- 
ble 3, by contrast, shows which classifications are 
assigned for texts that actually belong to a specific 
known level. For example, the first row shows that 
of the 18 texts that really are of the REPORTAGE 
GENRE level, 83% were correctly classified as RE- 
PORTAGE, 6% were misclassified as EDITORIAL, and 
11% as NONFICTION. Because of space constraints, 
we present this amount of detail only for the six 
GENRE levels, with logistic regression on selected 
surface variables. 
5 Discussion 
The experiments indicate that categorization deci- 
sions can be made with reasonable accuracy on the 
basis of surface cues. All of the facet level assign- 
ments are significantly better than a baseline of al- 
ways choosing the most frequent level (Table 1). and 
the performance appears even better when one con- 
siders that the machines do not actually know what 
the most frequent level is. 
When one takes a closer look at the performance 
of the component machines, it is clear that some 
facet levels are detected better than others. Table 2 
shows that within the facet GENRE, our systems do 
a particularly good job on REPORTAGE and FICTION. 
trend correctly but not necessarily significantly for 
SCITECH and NONFICTION, but perform less well for 
EDITORIAL and LEGAL texts. We suspect that the 
indifferent performance in SCITECH and LEGAL texts 
may simply reflect the fact that these genre levels are 
fairly infrequent in the Brown corpus and hence in 
our training set. Table 3 sheds some light on the 
other cases. The lower performance on the EDITO- 
RIAL and NONFICTION tests stems mostly from mis- 
classifying many NONFICTION texts as EDITORIAL. 
Such confusion suggests that these genre types are 
closely related to each other, as ill fact they are. Ed- 
itorials might best be treated in future experiments 
as a subtype of NONFICTION, perhaps distinguished 
by separate facets such as OPINION and INSTITU- 
TIONAL AUTHORSHIP. 
Although Table 1 shows that our methods pre- 
dict BROW at above-baseline levels, further analysis 
(Table 2) indicates that most of this performance 
comes from accuracy in deciding whether or not a 
text is HIGH BROW. The other levels are identified 
at near baseline performance. This suggests prob- 
lems with the labeling of the BRow feature in the 
training data. In particular, we had labeled journal- 
istic texts on the basis of the overall brow of the host 
publication, a simplification that ignores variation 
among authors and the practice of printing features 
from other publications. Vv'e plan to improve those 
labelings in future experiments by classifying brow 
on an article-by-article basis. 
The experiments suggest that there is only a 
small difference between surface and structural cues, 
Comparing LR with surface cues and LR with struc- 
tural cues as input, we find that they yield about the 
same performance: averages of 77.0% (surface) vs. 
77.5% (structural) for all variables and 78.4% (sur- 
face) vs. 78.9% (structural) for selected variables. 
Looking at the independent binary decisions on a 
task-by-task basis, surface cues are worse in 10 cases 
36 
Table 1: Classification Results for All Facets. 
Baseline LR (Surf.) \[ 2LP 3LP LR (Struct.) 
Facet All Sel. \] All Sel. All Sel. All Sel. 
Narrative 54 78 80 82 82 86 82 78 80 
Genre 33 61 66 75 79 71 74 66 62 
Brow 32 44 46 47 -- 54 -- 46 53 
Note. Numbers are the percentage of the evaluation subcorpus (:V = 97) which were correctly assigned to 
the appropriate facet level: the Baseline column tells what percentage would be correct if the machine always 
guessed the most frequent level. LR is Logistic Regression, over our surface cues (Surf.) or Karlgren and 
Cutting's structural cues (Struct.): 2LP and 3LP are 2- or 3-layer perceptrons using our surface cues. Under 
each experiment. All tells the results when all cues are used, and Sel. tells the results when for each level 
one selects the most discriminating cues. A dash indicates that an experiment was not run. 
Levels 
Table 2: Classification Results for Each Facet Level. 
Baseline LR (Surf.) 2LP 3LP LR (Struct.) 
Genre 
Rep 
Edit 
Legal 
Scitech 
Nonfict 
Fict 
Brow 
Popular 
Middle 
Uppermiddle 
High 
All 
81 89* 
81 75 
95 96 
94 100" 
67 67 
81 93* 
74 74 
68 66 
88 74 
70 84* 
Sel. 
88 
96 
96 
68 
96* 
75 
67 
78 
88* 
94* 
74 
95 
99* 
78* 
99* 
74 
64 
86 
89* 
All All 
94* 
8O 
95 
94 
67 
81 
74 
S4 
88 
90" 
All Sel. 
90* 90* 
79 77 
93 93 
93 96 
73 74 
96* 96* 
72 73 
58 64 
79 82 
85* 86* 
Note. Numbers are the percentage of the evaluation subcorpus (N = 97) which was correctly classified on a 
binary discrimination task. The Baseline column tells what percentage would be got correct by guessing No 
for each level. Headers have the same meaning as in Table 1. 
* means significantly better than Baseline at p < .05, using a binomial distribution (N=97, p as per first 
column). 
Table 3: Genre Binary 
Actual 
Rep 
Edit 
Legal 
Scitech 
Nonfict 
Fict 
Level Classification Results by Genre Level. 
Guess 
Rep Edit Legal Scitech Nonfict Fict 
83 6 0 0 11 0 
17 61 0 0 17 6 
20 0 20 0 60 0 
0 0 0 83 17 0 
3 34 0 6 47 9 
0 6 0 0 0 94 
N 
18 
18 
5 
6 
32 
18 
Note. Numbers are the percentage of the texts actually belonging to the GENRE level indicated in the first 
column that were classified as belonging to each of the GENRE levels indicated in the column headers. Thus 
the diagonals are correct guesses, and each row would sum to 100%, but for rounding error. 
37 
and better in 8 cases. Such a result is expected if 
we assume that either cue representation is equally 
likely to do better than the other (assuming a bino- 
mial model, the probability of getting this or a more 
8 extreme result is ~-':-i=0 b(i: 18.0.5) = 0.41). We con- 
clude that there is at best a marginal advantage to 
using structural cues. an advantage that will not jus- 
tify the additional computational cost in most cases. 
Our goal in this paper has been to prepare the 
ground for using genre in a wide variety of areas in 
natural language processing. The main remaining 
technical challenge is to find an effective strategy for 
variable selection in order to avoid overfitting dur- 
ing training. The fact that the neural networks have 
a higher performance on average and a much higher 
performance for some discriminations (though at the 
price of higher variability of performance) indicates 
that overfitting and variable interactions are impor- 
tant problems to tackle. 
On the theoretical side. we have developed a tax- 
onomy of genres and facets. Genres are considered 
to be generally reducible to bundles of facets, though 
sometimes with some irreducible atomic residue. 
This way of looking at the problem allows us to 
define the relationships between different genres in- 
stead of regarding them as atomic entities. We also 
have a framework for accommodating new genres as 
yet unseen bundles of facets. Finally, by decompos- 
ing genres into facets, we can concentrate on what- 
ever generic aspect is important in a particular appli- 
cation (e.g., narrativity for one looking for accounts 
of the storming of the Bastille). 
Further practical tests of our theory will come 
in applications of genre classification to tagging, 
summarization, and other tasks in computational 
linguistics. We are particularly interested in ap- 
plications to information retrieval where users are 
often looking for texts with particular, quite nar- 
row generic properties: authoritatively written doc- 
uments, opinion pieces, scientific articles, and so on. 
Sorting search results according to genre will gain 
importance as the typical data base becomes in- 
creasingly heterogeneous. We hope to show that the 
usefulness of retrieval tools can be dramatically im- 
proved if genre is one of the selection criteria that 
users can exploit. 

References 
Biber, Douglas. 1986. Spoken and written textual 
dimensions in English: Resolving the contradic- 
tory findings. Language, 62(2):384-413. 
Biber. Douglas. 1988. Variation across Speech 
and Writing. Cambridge University Press. Cam- 
bridge. England. 
Biber. Douglas. 1992. The multidimensional ap- 
proach to linguistic analyses of genre variation: 
An overview of methodology and finding. Com- 
puters in the Humanities, 26(5-6):331-347. 
Biber. Douglas. 1995. Dimensions of Register Vari- 
ation: A Cross-Linguistic Comparison. Cam- 
bridge University Press. Cambridge. England. 
Butcher, S. H.. editor. 1932. Aristotle's Theory of 
Poetry and Fine Arts. with The Poetics. Macmil- 
lan, London. 4th edition. 
Dubrow, Heather. 1982, Genre. Methuen. London 
and New York. 
Fowler, Alistair. 1982. Kinds of Literature. Harvard 
University Press. Cambridge. Massachusetts. 
Frye. Northrop. 1957. The Anatomy of Criticism, 
Princeton University Press. Princeton, New Jer- 
sey. 
Hernadi, Paul. 1972. Beyond Genre. Cornell Uni- 
versity Press. Ithaca, New York. 
Hobbes, Thomas. 1908. The answer of mr Hobbes 
to Sir William Davenant's preface before Gondib- 
ert. In J.E. Spigarn, editor. Critical Essays of the 
Seventeenth Century. The Clarendon Press, Ox- 
ford. 
Karlgren, Jussi and Douglass Cutting. 1994. Recog- 
nizing text genres with simple metrics using dis- 
criminant analysis. In Proceedings of Coling 94, 
Kyoto. 
McCullagh, P. and J.A. Nelder. 1989. Generalized 
Linear Models. chapter 4, pages 101-123. Chap- 
man and Hall, 2nd edition. 
Nunberg, Geoffrey. 1990. The Linguistics of Punc- 
tuation. CSLI Publications. Stanford. California. 
Rumelhart. David E.. Richard Durbin. Richard 
Golden. and Yves Chauvin. 1995. Backprop- 
agation: The basic theory. In Yves Chau- 
vin and David E. Rumelhart, editors, Back. 
propagation: Theory, Architectures, and Applica- 
tions. Lawrence Erlbaum. Hillsdale, New Jersey, 
pages 1-34. 
Staiger, Emil. 1959. Grundbegriffe der Poetik. At- 
lantis, Zurich. 
Statistical Sciences. 1991. S-PLUS Reference Man- 
ual. Statistical Sciences. Seattle, Washington. 
Todorov, Tsvetan. 1978. Les genres du discours. 
Seuil, Paris. 
