Integrated Text and Image Understanding for Document 
Understanding 
Sttzalzo, e Liebowitz T~tylor, Deborah A. Dahl, Mark Lips/~utz, Carl Weir, 
Lewis M. Norton, Roslyn Nilson and Marcia Liueb~trgcr 
Unisys Corporation 
Paoli, PA 19301 
ABSTRACT 
Because of the complexity of documents and the variety of ap- 
plications which must be supported, document understand- 
ing requires the integration of image understanding with text 
understanding. Our docum(,nt understanding technology is 
implemented in a system called IDUS (Intelligent Document 
Undcrstanding System), which creates the data for a text 
retrieval application and the automatic generation of hyper- 
ttrxt li.ks. This paper summarizes the areas of research dur- 
ing IDUS development where we have found the most benelit 
from the integration of image and text understanding. 
1. INTRODUCTION 
As more and more of our daily transactions involve comput- 
ers, we would expect the volume of paper documents gener- 
ated to decrease. However, exactly the opposite is happening. 
('onsiderable amounts of information are still generated only 
in paper form. This, compounded by volumes of legacy l)a.per 
documents still cluttering offices, creates a need for efficient 
methods for converting hardcopy material into a computer- 
usable form. However, because of the Sol)histication of ap- 
plications requiring electronic documents (e.g. routing and 
retrieval) and the complexity of the documents themselves, 
it is not sufficient to simply scan and perform OCR (optical 
character recognition) on documents; deeper understanding 
of the document is needed. 
Comprehensive document understanding involves determin- 
ing the form (layout), as well as the function and the meaning 
of the document. Document understanding is thus a technol- 
ogy area which benefits greatly from the integration of text 
understanding with image understanding. Text understand- 
ing is necessary to operate on the textual content of the docu- 
ment and image understanding is necessary to operate on the 
pixel content of the document. We have found great benefit 
from intertwining the two technologies instead of employing 
them in a pipeline fashion. We expect that more sophisti- 
cated document applications in the future will require even 
closer knitting. 
Our document understanding technology is implemented in 
a system called IDUS (Intelligent Document Understanding 
System, described in Section 2), which creates the data for 
a text retrieval application and the automatic generation of 
hypertext links. This paper summarizes areas of research 
during IDUS development where we have found the most 
benefit from the integration of image processing with natural 
language/text processing: Document layout analysis (Section 
3), OCR correction (Section 4), and Text analysis (Section 
4). We also discuss two applications we have implemented 
(Sections 5 and 6) and future plans in Section 7. 
2. GENERAL IDUS SYSTEM 
DESCRIPTION 
IDUS employs four general technologies - image understand- 
ing, OCR, do(:ument layout analysis and text understand- 
ing - in a knowledge-based cooperative fashion\[l~ 2\]. The 
curre,t implemenlation is on a SPAl~Cstation TM II with 
the UNIX TM operating system using the 'C' and Prolog 
programlning languages. OCR is performed with the Xe- 
rox hnaging Systems ScanWorX TM Application Program- 
mer's Interface toolkit. All features are accessible via an 
X-Windows TM/Motif TM user interface. 
After scanning the document page(s), IDUS works pagg-by- 
page, pcrtorming image-based segmentation to initially locatc 
the primitiw~ regions of text and nontext which are manipu- 
lated d,riug the logical and functional analysis of the docu- 
ment. Each text unit's content is internally homogeneous in 
physical attributes such as font size and spacing style. 
The ASCII text associated with each block is found through 
OCR and a set of features based both on text attrib.tes 
(e.g., number of text lines, font size and type) and geometric 
attributes (e.g., location on page, size of block) is used to 
refine the segmentation and organize the blocks into proper 
logical groupings, i.e., "articles". The ASCII text for each 
"article" is assembled in a proper reading order. During this 
process the column structure of the document is determi,(,(I, 
and noise and nontext blocks are eliminated. A text process- 
ing component performs a finguistic analysis to extract key 
ideas from each article and then represent them by a seman- 
tic component, the case frame. Each "article" text is saved 
as part of the document corpus and may be retrieved through 
a query interface. 
3. DOCUMENT LAYOUT 
ANALYSIS 
Document layout analysis determines the intra- and inter- 
page physical, logical, functional and topical organization of 
a document. Many applications (e.g. automatic indexing or 
database population) require some level of understanding of 
the document's textual components. However, it is also criti- 
cal to discover the document layout structure for two reasons: 
(1) document layout attributes such as position of text. and 
font specifications are key clues into the relative importance 
of textual content on a page and (2) understanding the layout 
421 
puts the text in the correct reading order for OCR conver- 
sion to ASCII/machine-readable text. Although image-based 
techniques provide valuable information for the physical de- 
composition of the document and feature inpnt to the logical 
and fnnctional areas, the incorporation of textual information 
ensures a more comprehensive layout analysis. 
The current implementation of the document layout analysis 
module of IDUS consists of physical analysis, logical analysis, 
and a rudimentary functional component. We are currently 
developing a more sophisticated fnnctional analysis. 
3.1. Physical Analysis 
Physical analysis of the document determines the primitive 
segments that make up each page and their attributes. Prim- 
itives include both text and non-text segments (pholographs, 
graphics, rule lines, etc.). Attributes may include chromatic 
features (coh)r or gray scale) and typesetting features (mar- 
gins, point size and face style). Initially, physical analysis 
is achieved through page segmentation of the image of each 
page of the document and OC, R (also an image-based pro- 
cess). Page segmentation starts with a binary image of a doc- 
ument page and partitions it into fnndamental units of text or 
non-text. A desirable segmentation will form text units by 
merging adjacent portions of text with similar physical at- 
tributes, such as font size and spacing, within the framework 
of the physical layout of the document. The final output of 
the segment ation is a list of polygonal blocks, their locations 
on the page,, and the locations of horizontal rule lines on the page\[~\]. 
Each text region in the document image is fed to the OCR to 
obtain its AS('II characters, as well as text attributes such 
as facestyh,, pointsize, aad indentation. These attributes are 
used as input to the logical page analysis along with the 
image-based feature output from the image segmentation. 
3.2. Logical Analysis 
Logical analysis groups the primitive regions into fimctional 
units both within and among pages and deternfines the read- 
ing order of the text units through manipulation of both 
image-based and text-based features. Logical analysis is first 
done on a page level anti then on a document level. 
The emphasis of the current logical analysis module has been 
at the page level\[2\] where we group appropriately (e.g., into 
articles in the case of a newspaper page) the text components 
which comprise a document page, sequence them in the cor- 
rect reading order and establish the dominance pattern (e.g., 
find the lead article). A list of text region locations, rule line 
locations, associated ASCII text (as found from an OCR) for 
the text blocks, and a list of text attributes (such as face style 
and point size) are input to logical page analysis. 
Transforming the blocks produced by the image segmenta- 
tion into a logical structure is accomplished via the rule-based 
methods developed by Tsujimoto and Asada\[3\]. A geometric 
structure tree is created from the image segmentation, then a 
set of rules are applied to transform the geometric structure 
tree to a logical structure tree. In addition to the location 
and size of the blocks, these rules employ a gross functional 
classification as to whether a block is a head or a body. Head 
blocks are defined as those which serve in some way as labels 
or pointers, such as headlines and captions; body blocks are 
those which are referenced by head blocks and which have 
substantive content. Our head/hody classification relies o, 
features calculated for each block and for the page as a whole. 
The inputs to the feature calculations are the outputs of im- 
age segmentation and OCR. Construction of the geometric 
structure tree is predicated on the way that head blocks are 
positioned relative to their associated body blocks. 
The Tsujimoto and Asada method for building the geomet- 
ric structure tree assumes that the layout is well behaved 
in terms of uniformity of column configurations and place- 
ment of blocks along column boundaries. By finding ways of 
creating the geometric structure tree when the layout does 
not behave in the above manner, we have extended their ap- 
proach. 
Key to both the basic melhod and the enhancements is th,, 
determination of a page's underlying column structure. V~'e 
compute column boundaries by applying area-based criteria 
to the set of all page-spanning column chains, where neigh- 
boring links represent contiguous blocks. The method is in- 
dependent of the width of individual columns. 
To handle complex layouts we create multiple geometric 
structure trees for the page which are merged during the cre- 
ation of a logical structure tree. For a document with mnl- 
tiple sets of column bounds, we look for consistent column 
patterns in subregions of the document and construct a geo- 
metric structure tree within each subregion. The individual 
trees can be merged by treating each as a child of "root", to 
form a single tree for the page. Phenomena like inset I)lo~ks 
w\]fich interrupt straight column flows are "peeled off" in!o a, 
overlay plane. Separate trees are constructed for this plato. 
and the base plane and then merged so that an inset block 
is placed in the context of, but logically subordinate to, its 
surroumling article. The logical transformation then applies 
layout rules which associate blocks in the geometric struc! u re 
tree(s) into "article" groupings in the proper reading order. 
Text analysis enhances logical analysis particularly in the 
area of determining the reading order. Continuity of arli- 
cles in a document may be very simple or very complex. Au 
article may require many pages, but flow through the (h),'- 
ument in a linear fashion. In this case continuity may be 
determined by proximity. Popular periodicals, though, often 
have nmltiple articles per page with continuations separated 
by many pages. In fact, continuations may occur on pages 
which precede the initial text, as well as the more typical case 
of those which succeed the initial text. 
q'o determine where an article continues, it may be sufficiet,t 
to look for text such as "continued from page 13" or "colllin- 
ned on next page". However, there are several cases where. ;t 
deeper linguistic interpretation will be necessary to find th,. 
article flow. Consider the scenario of nmltiple articles p,.r 
page. If reading order is determined solely h'om geometti,: 
and typesetting attributes, then it is possible that some text 
components will be grouped incorrectly. If the reading order 
in this case nmst be verified, it may be necessary to incor- 
porate more sophisticated hnguistic processing, for example, 
422 
comparing the last. sentence fragment of one text region and 
verifying that is consistent with the sentence fragment of the 
candidate following region (either syntactically or semanti- 
cally). 
3.3. Functional Analysis 
Functional analysis labels each unit (a~ determined through 
h)gical analysis) by role or function within the document. For 
example, a popular journal article will have a title, author ref- 
erence, body text and photos. Functional analysis combines 
image-based and text-based features. 
Functional labeling of regions, first at a high level to dis- 
criminate between head and body or text and non-text, and 
ultimately to deternfine tile particular role a region plays, 
can be enhanced through the use of textual clues. Often 
these clues would be used in conjunction with format clues 
and clues derived from pixel-level features\[4\]. 
The simplest type of textual clue is string-based pattern 
matching or basic keyword search (such as looking for "Fig." 
or "Figure" to identify a non-text region). Since OCR sys- 
tems tend to be overeager in recognizing regions as text, a 
reasonable cluc combination would have the system look for 
the presence of a left-justified or centered figure tag below a 
region with a high number of character recognition errors. If 
b,)th conditions are met, there is strong support for designat- 
ing lqat region as non-text. 
kVhih, much can be gained fr,)m simple string-based pattern 
matching, there are olher inh~rmalion retrieval techniques of 
wtrying kinds and dvgrees of complexity which can be ap- 
plied to advantage. %'e mention a few below, using informal 
d(,scriptions. 
Siring-based pattern matching with boolean operators and po- 
sitional constraints. The "Figure" example above falls in this 
c~ttegory. As another example, we might classify a block as 
an atfiliation block (hence giving evidence that the document 
is a scholarly article) if it contains "(gollege" or "University" 
within two lines of a (U.S.) state name or abbreviation. 
Adherence to a simple grammar. The section headings for 
scholarly articles can be characterized by a grammar which 
specifies a sequence of digits and dots followed by an alpha- 
numeric string. As another example, newspaper headlines 
and captions may conform to a certain style which has a 
syntactic basis, such as noun phrase followed by present par- 
ticiple for a picture caption : "Heroic firefighter rescuing cat 
from tree". 
k)'equency o\] string occurrences. In government Requests for 
Proposals (RFPs), phrases of the form the offerer shall and 
the contractor shall occur with much higher frequency than 
in other types of documents. This frequency information can 
bc used to infer document type. 
D'equency o\] syntactic structure occurrences. Newspaper ar- 
tic'les typically contain a higher frequency of appositive con- 
structions than other types of prose. Using syntactic analysis 
to detect the presence or absence of appositive constructions 
would license the inference that a newspaper writing style was 
present in a given text block, and therefore that a newspaper 
document type should be expected. 
3.4. Topical Analysis 
Topical analysis includes referential links among components, 
such as from the body of a text to a photograph it mentions. 
Examples of references include both explicit references (e.g., 
"see Figure 1.5") and implicit references (e.g.. "following the 
definition described in the introduction"). The topical orga- 
nizalion could also accomnmdate text summary temphttes or 
other interpretive notes. Linguistic processing should play a 
key role in determining tile topical organization. 
4. OPTICAL CHARACTER 
RECOGNITION CORRECTION 
Language understanding serves two functions in document 
interpretation. First it is important if the text of tile doc- 
ument must be interpreted and processed ill some way, for 
example, to fill a database. In addition, natural languag,. 
understanding technology call be used to imprt,ve tilt. a,:cu- 
racy of optical character recognition (OCR) output, whether 
the ultimate goal is to interpret tile text, to find textual clues, 
or simply to obtain accurate character recognition. 
Our method for using linguistic context to improve OCR 
output\[5\] is based on extending work previously done in 
Sl)ok(:n language uuderstanding\[6, 7\], where a natural lan- 
guag,' system is used to constrain tile output of a spee('h 
recognizer. We have applied linguistic constraints to the 
OCR using a variation of the N-best interface with th,: 
natural language processing system, PUNDIT. PUNDIT is 
a large, domain-independent naturai language processing 
system which is modul~tr in design, and includes distinct 
syntactic\[8\], semantic\[9\] and application\[10\] components. 
The data used for this experiment were obtained from pro- 
cessing facsimile output, since optical character recognition 
for these documents typically results in a high error rate. We 
selected a domain of input texts on which the natural lan- 
guage system was already known to perform well, air trav('l 
planning information\[Ill. The OCR output is sent to the 
Alternative Generator, which consults with a lexicon of al- 
lowable words and a spelling corrector, developed as part. of 
this project. The spelhng corrector by itself successfully cor- 
rects many OCR errors. The output of the spelling corrector 
is also used to generate an ordered list of N-best alternatiw" 
sentences for each input. These sentences are sent to PUNDIT, 
which selects the first meaningful alternative sentence. 
The generation of an N-best list of sentence candidates is a 
simple matter of taking a "cross-product" of all word alter- 
natives across the words of a given sentence. The score of 
each candidate sentence is the product of tile scores of th,~ 
words in it; words without nmltil)le alternatives are assigw,,I 
a score of 1.0. N-best sentence candidates are presented to 
the natural language system in ascending order of score. The 
sentence with the lowest score that is accepted by the natural 
language system is taken to be the output of the iutegralt,d 
OCR-natural language recognition system. 
The system was tested with 120 sentences of air travel plan- 
423 
ning data on which the natural language system had previ- 
ously been trained. A known text was formatted in ~TEX in 
both roman and italic fonts, and printed on an Imagen TM 8/3(10 
laser printer. It was then sent through a Fujitsu 
dex9 TM facsimile machine and received by a Lanier 2000 TM 
facsimile machine. The faxed output was used as input to 
a Xerox ScanWorX TM scanner and OCR system. Figure 1 
shows a typical sentence after scanning, after spelling cor- 
rection (with N-best sentence candidates), and after natural 
language correction. 
Performance was measured using software originally designed 
to ewduate speech recognizers, distributed by the National 
Institute of Standards and Technology \[14\]. The word error 
rate for output directly from the OCR was 14.9% for the 
roman font and 11.3% for the italic font. After the output was 
sent through the alternative generator, it was scored on the 
basis of the first candidate of the N-best set of alternatives. 
The error rate was reduced to 6% for the roman font and to 
3.2% for the italic font. Finally, the output from the natural 
language system was scored, resulting in a final error rate of 
5.2% for the roman font and 3.1% for the italic font. 
Most of the improvement seen in these results is the effect 
of the spelling corrector, although there is also a small but 
consistent improvement in word accuracy due to natural lan- 
guage correction. Spelling correction and natural language 
correction have a much bigger effect on application accuracy. 
Sending the uncorrected OCR output directly into the nat- 
nral language system h)r processing without correction leads 
to a 73% average weighted error rale as measured by the 
ARPA error metric defined for ATIS\[14\]. 
This high error rate is due to the fact that the entire sen- 
tence must be nearly correct in order to correctly perform 
the database task. Spelling correction improves the error 
rate to 33%. Finally, with natural language correction, the 
application error rate improves to 28%. 
The tagged articles are input to the parser, which uses a 
grammar consisting of 400 productions to derive regular- 
ized parses which reflect each sentence's logical structure. 
'\]'he parser achieves domain independence by using an 80,01ill 
word lexicon, and by using time-onts and a skip-and-lit nm(:h- 
anism to partially parse sentences for which it lacks appro- 
priate productions. Thus, no sentence will be left comph:tcly 
unanalyzed; the system will extract as much information as 
it can from eyery sentence. Average parsing time is approxi- 
mately 0.5 see/sentence. 
The semantic component represents the meaning of the sen- 
tence in the form of a case frame, a data structure which 
includes a fl'ame, representing a situation and typically cor- 
responding to a verb, as well as its case roles, which represent 
the participants in the situation and which typically corre- 
spond to the subjects and objects of verbs\[17\]. The semant i(: 
component of IDUS takes the parsed output and maps it into 
case frames reflecting each sentence's semantic content. 
The domain-independent case-frame generator uses unigram 
frequencies of case frame mappings based on training data 
obtained from a corpus in the Air Travel Planning domain 
to select the most probable case frame mappings h)r th(, 
arguments of verbs\[7, 18\]. '\]'his training data is supple- 
mented by additional mapping rules based on the semantics 
of prepositions occurring in the University of Pennsylvania 
Treebank\[19\]. 
6. APPLICATIONS 
We have developed two applications to demonstrate the util- 
ity of document understanding from hardcopy material: t,,xt 
retrieval and automatic hypertext generation. 
Articles found on nmlti-artiele pages during the docum~.,t 
layout analysis are extracted separately and stored in a cor- 
pus. The text retrieval interface to IDUS allows the user to 
pose a query in natural language. 
5. TEXT UNDERSTANDING 
Once the ASCII text is assembled in the correct reading or- 
der, we can employ text understanding to support a doc- 
ument application. The text understanding module is a 
generic, fast, robust, domain-independent, natural language 
processing system consisting of components for syntactic and 
semantic processing. The syntactic component includes a sta- 
tistical part-of-speech tagger developed at the University of 
Pennsylvania\[15\], and a robust Prolog parser developed at 
New York University\[16\]. The semantic component is a case 
frame interpreter developed at Unisys. 
Trigram and unigram probabilities are used by the part-of- 
speech tagger to assign parts of speech to words in a cor- 
pus. It was trained on the 1M word Brown corpus of general 
English. The tagger includes models for part-of-speech as- 
signment for unknown words and heuristics for recognizing 
proper nouns. Thus, unknown words resulting from OCR er- 
rors, new proper names, and unusual technical terminology 
can be correctly tagged, and will not cause problems during 
later processing. 
A natural language query is made and a search for matcl,.s 
in the tagged corpus is performed based on part. of spe¢',:h 
tagging of the query. Every sentence in tile tagged c,)rl)us 
receives a score based on: (1) the ratio of tile numl)er of 
distinct words and their associated part-of-speech in common 
between the query aml the sentence and (2) the total numlwr 
of words in the sentence and the query. The score of a, 
entire article is the maximum score of any sentence in lit,. 
article. Grammatical function words such as prepositions, 
conjunctions, and pronouns are ignored during the matching 
process. By basing the comparison on tagged words rather 
than raw words, fewer irrelevant articles are retrieved because 
parts-of-speech corresponding to grammatical function words 
can be ignored and because words in the query and corpus 
must have the same part of speech in order to count as a 
match. After articles have been retrieved, a menu then pops 
up with a list of "hits". The user then has an option to 
examine the ASCII text associated with the article and/or 
the image from which the article came which affords the user 
the fnll richness of context. An example of the text retrieval 
application is shown in Figure 2. 
We have also developed a hypertext generation module which 
424 
Uncorrected OCROutput: 
What is the latest fgght fiom Phaadelphi& to Boston 
After Spelling Correction: 
flight What is the latest night 
right 
from Philadelphia to Boston 
After Natural Language Correction: 
What is the latest flight from Philadelphia to Boston 
Figure 1: Successive improvement in OCR output occurs with spelling and natural language correction. 
converts the output of the IDUS system to a hypertext repre- 
sentation. The target of this application was legacy technical 
manuals. 
An initial prototype system was developed that generates 
SGMI,\[20\] aml hypertext\[21\] finks from raw OCR output. 
The output of this prototype is provided directly as input 
to the Unisys hypertext system IDE/AS, TM, which encodes 
hypertext links as SGML tags. The main steps are the classi- 
fication of the individual text lines, the grouping of them into 
functional units, the labeling of these units and embedding 
the corresponding SGML tags. 
We employ a two-stage fuzzy grammar to classify and group 
lines into logical units regardless of physical page boundaries 
in the original document. The first stage tokenizes each line, 
recognizing key character sequences, such as integers sepa- 
rated by periods, which wouhl be useful dowttst.ream in deter- 
mining line type, in this case some level of (sub)section. This 
method allows us to overcome certain kinds of OCR errors. 
The second stage finds a best match of the sequence of tokens 
representing each fine against a dictionary of token sequences 
to arrive at a line classification. Classifications include page 
header, page footer, blank line, ordinary line, four levels of 
section indicator lines and list item lines. The validity of the 
individual line classifications is then checked in context. If a 
line type does not fit with respect, to its neighboring fines, it 
is reclassified. 
A further set of rules is applied to group lines together 
into functional units, label the units and choose titles for 
them. With this preparation, the SGML tags upon which 
the IDE/AS TM hypertext system reties are generated and in- 
serted. 
From the unit labels and titles an active index is generated; 
a click on any entry takes the reader directly to the corre- 
sponding frame. Additional processing of each line searches 
for cross-references to paragraphs (e.g., "See paragraph 1.2"), 
tables and figures in order to insert tags which transforms 
the references into "hotspots". Paragraph, table and figure 
cross-references are also activated. Text blocks are fed to the 
hypertext generation module from IDUS as fines of ASCII 
characters. 
7. FUTURE PLANS 
Currently, we are working on enhancements to IDUS include 
the ability to analyze new classes of information, specitically 
multi-lingual busine.-s documents and newspapers. This in- 
voh'es extending logical analysis to multiple page doeum,,nts 
and the tighter coupling of text and image understamling. 
References 
1. S. L. Taylor, M. Lipshutz, D. A. Dahl, and C. Weir. 
"An intelligent document understanding system," in 
Second International Con\]erence on Document A,ablsis 
and Recognition, (Tsuknba City, Japan), October l!J!~3. 
2. S, L. Taylor, M. Lipshutz, and C. Weir, "l)ocunwl~t 
structure interpretation by integrating multiple knowl- 
edge sources," in Symposium on Documtnt A,,,l,l.~is 
and Information Retrieval, (Las Vegas, Nevada). 16 18 
March 1992. 
3. S. Tsujimoto anti H. Asada, "Understanding mulli- 
articled documents," in lOlh Int. Conf. on Patter, 
Recognition, (Atlantic City, N J), 16-21 June 1990, pp. 
551-556. 
4. J. Fisher, "Logical structure descriptions of segmented 
document images," in First International Confer~n,e 
on Document Analysis and Recognition, (Saint-Malo. 
France), 30 September - 2 October 1991. 
5. D. A. Dahl, L. M. Norton, and S. L. Taylor, "Improv- 
ing OCR accuracy with linguistic knowledge," in 2,d 
Symposium on Document Analysis and Retrieval, Apr. 
1993. 
6. D. A. Dahl, L. Hirschman, L. M. Norton, M. C. 
Linebarger, D. Magerman, and C. N. Ball, "Training 
and evaluation of a spoken language understanding sys- 
tem," in Proceedings of the DARPA Speech and Lan- 
guage Workshop, (Hidden Valley, PA), June 1990. 
7. L. M. Norton, hi. C. Linebarger, D. A. Dahl, and 
N. Nguyeu, "Augmented role filling capabilities for s,- 
mantle interpretation of natural language," in Proceol- 
ings of the DARPA Speech and Language ll~rksht,l,. 
(Pacific Grove, CA), February 1991. 
8. L. Hirschman and J. Dowding, "Restriction grammar: 
A logic grammar," in P. Saint-Dizier and S. Szpakowic/. 
eds., (Logic and Logic Grammars for Language Proc¢.~.~- 
ing), pp. 141-167. Ellis Horwood, 1990. 
9. hi. Palmer, Semantic Processing .for Finite Dom,i,~. 
Cambridge, England: Cambridge University Press, 
1990. 
425 
fine Edit ~ew 9_pUtts ~etp 
, .ll,~.-,i =\[t,i ~ i ~1 | ii iiI pilliltll Z 1'~'~1 ........ 
~.~ ~..~~~.~'~'t 1'~1~''I~¢~ " ;:~ ~':*/; ; ACIIVAIE VlONLDN011.S .......... . " .... 
• ~rl~tm ka ~sd. t~a~- ~;~.~E2gF~ ~r~ 
~:, ~.~ ~ ~ ~ ~- 
~ ,~ ENTER QUERY: '~- 
m CRl,iA~IrebChll~gOulputAt&~ljil1,¢ottmlm,l.wm~j~.~h .... p~ohlem:Ulofactory(luotaollOfree 1 R ............. ~_ 
I:llt~ll~ ~ l U~ for lirdrl'ied ~nl,~iloyt)e~ only is riot suffidmll for Uleir tllEl|hilR ileel\]l~ . | I I ~ J'~'zl * ~1 Ifi Ih 
Ill tact , Ule wmla's m~lst populous sooety is fac.mg a cnsts Of Cmldoals , | H ¢ ~ .~., I,~t ~ ~ 
OIk~.* S 7 mdhort u~aes.~ , morro than dmdde Rle number Of prem;mtal and irxtr~naunl~l sin( IS nsmg ~ . | a ~ nt~,rl~rsi~ 
~vr~t~eneed~r~nd~ms~Pu~at~nbur.eaucratsarePmss~n~Umna1~n~a~e~mdus1r~stre~h Ill ~,.~-.a,=.,~.~ ,ult~t 
quk F;Iy by 100 ,uillimz , Io a pear of I..5 bit Ikm a yeot. | Im -~.'~,~,~-=" 
Check oflr ......... Imp/lot =. ! .... O! SUlppNy ~ aemana Ove¢ central plamdrl 9 . I H ~,~,~P;~'. "~='=; 
I m~Jht yea~ ago, say them are lint not mmJSI1 prophytaallc.~ tO go am*tna. I II .,;i*,~,) ;~. 
(31Wna's hwje ~ of baby boomers, born in 'the t~6111S when r~lalrwlan It4ao's OdttlrM SevoIuti011 crippled i li '~¢ , " 
the rmtJon'$ emllr~t'oMc blrth-conbl01 ~ , h~re re~l£1N~d ellildbe~dtn 9 age . | II 
Ihe~havealsodevelopedloosersexual habits : I il Jlr~ld~tmtaltlut~lpl~l\]t=t" 
...................................... t • ~ ,. n,~*..,',..,~.rkrdtrl.t 
,,-t~,,..t*,z.',,'. ly,,.4n.,~,,znd.~a~t,~,,,,,a,~,e~i~,m,,,.~_nm 1 ~,a~_n.t~t I ~. i 
r--~,~--1 V-Z77--II":: ! 
.......... U ~ '~= %% " i 
............................. M ~;.'° ° ": 
Figure 2: Text retrieval application for the IDUS prototype system. 
10. C. N. Ball, D. Dahl, L. M. Norton, L. tlirschman. 
C. ~Veir. and M. Liuebarger, "Answers and questions: 
Processing messages and queries," in Proceedings of the 
DA RPA Speech and Natural Language Workshop, (Cape 
Cod, MA), Oct. 1989. 
11. P. Price, "Evaluation of spoken language systems: the 
ATIS domain," in Proccedings of the Speech and Naturul 
Language Workshop, Morgan Kaufmann, 1990, pp. 91- 
95. 
12. C. Filhnore, "The ease for case," in Bach and Harms, 
eds., (Universals in Linguistic Theory), pp. 1-88. New 
York: Holt, Reinhart, and Winston, 1980. 
13. D. A. Dahl and C. N. Ball, "Reference resolution in 
PUNDIT," in P. Saiut-Dizier and S. Szpakowicz, eds., 
(Logic and logic grammars \]or iunguage processing). El- 
lis Horwood Linfited, 1990. 
14. D. S. Pallett, "DARPA resource management and 
ATIS benchmark poster session," in Proceedings of the 
D A RPA Speech and Language Workshop, (Pacific Grove, 
CA), February 1991. 
15. K. W. Church, "A stochastic parts program and noun 
phrase parser for unrestricted text," in Proceedings oll 
the Second Conilerence on Applied Natural Language 
Processing, (Austin), ACL, February 1988, pp. 136-143. 
16. T. Strzalkowski and B. Vauthey, "Information retrieval 
using robust natural language processing," in Proceed- 
ings off the Thirtieth Annual Meeting of the Association 
for Computational Linguistics, 1992, pp. 104-111. 
17. C. Fillmore, "The case for ease reopened," in P. Coh, 
and J. Sadock, eds., (Syntax and Semantics, ~,~htmr 
8: Grammatical Relations). New York: Academic Press, 
1977. 
18. C. T. Hemphill, J. J. Godfrey, and G. R. Doddington. 
"The ATIS spoken language systems pilot corpus," in 
Proceedings of the DARPA Speech and Lanouage Work- 
shop, (Hidden Valley, PA), June 1990. 
19. M. Marcus, "Very large annotated database of American 
English," ill Proceedings of the DARPA Speech and La,- 
9uage Workshop, (Hidden Valley, PA), Morgan Kaub 
mann, June 1990. 
20. E. van Herwijnin, Practical SGML Norwell, MA: 
Kluwer Academic Publishers, 1990. 
21. J. Nielsen, Hypertext and Hypermedia. Academic Prcss, 
Inc., 1990. 
426 
