Language Understanding Research at Paramax 
Deborah A. Dahl, Carl Weir, Suzanne Liebowitz Taylor,
Lewis M. Norton, Marcia C. Linebarger and Mark Lipshutz * 
Paramax Systems Corporation, a Unisys Company 
P.O. Box 517 
Paoli, PA, 19301 
ABSTRACT 
Language understanding work at Paramax focuses on ap- 
plying general-purpose language understanding technology 
to spoken language understanding, text understanding, and 
document processing, integrating language understanding 
with speech recognition, knowledge-based information re- 
trieval and image understanding. 
1. INTRODUCTION 
The basic language understanding technology in the 
Paramax language understanding architecture has been 
designed to be independent of both the language input
modality (e.g., spoken input vs. scanned OCR input)
and the end application (e.g., command and control
vs. data extraction). This architecture is shown in
Figure 1.
Figure 1: Paramax Language Understanding Inputs and 
Applications 
2. SPOKEN LANGUAGE 
UNDERSTANDING 
The focus of our spoken language understanding work is
the development of a domain- and application-independent
dialog processing architecture which can be coupled
with a variety of speech recognizers. Some of our recent
achievements include:
• a non-monotonic reasoning capability, which will
support "what if?" exploratory dialogs
• a query paraphrase component to enhance user-friendliness
• enhanced reference resolution capabilities to support
additional types of context-dependent references
• automated training techniques for semantics
*This paper was partially supported by DARPA contract
N000014-89-C0171, administered by the Office of Naval Research,
and by internal funding from Paramax Systems Corporation
(formerly Unisys Defense Systems).
Figure 2 illustrates the integration of a speech recognizer,
a language understanding component, and a dialog
manager (VFE) into a modular spoken language architecture
interacting with the user and the application. A
more detailed description of this work can be found in
[1].
Figure 2: Paramax spoken language system architecture
[Figure labels: Spoken Language Kernel; N-best output from speech recognizer; display to user]
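
The modular coupling of recognizer, language understanding component, and dialog manager can be pictured with a minimal sketch. It is not the Paramax implementation: every name below (`Hypothesis`, `DialogManager`, the toy recognizer and understander) is hypothetical, and the strategy of trying ranked recognizer hypotheses until one can be interpreted is an assumption chosen only to illustrate how the components might plug together.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """One candidate transcription from a speech recognizer (hypothetical type)."""
    text: str
    score: float  # higher means the recognizer is more confident


# Any callable with these shapes can fill the recognizer or understanding
# role, which is what keeps the components swappable in this sketch.
Recognizer = Callable[[bytes], List[Hypothesis]]
Understander = Callable[[str], Optional[dict]]


@dataclass
class DialogManager:
    """Sits between the user and the application and keeps the dialog history."""
    recognize: Recognizer
    understand: Understander
    history: List[dict] = field(default_factory=list)

    def handle(self, audio: bytes) -> Optional[dict]:
        # Try hypotheses from best to worst and keep the first one that the
        # language understanding component can interpret (an assumed strategy).
        ranked = sorted(self.recognize(audio), key=lambda h: h.score, reverse=True)
        for hyp in ranked:
            meaning = self.understand(hyp.text)
            if meaning is not None:
                self.history.append(meaning)
                return meaning
        return None


# Toy stand-ins for a real recognizer and understanding component.
def toy_recognizer(audio: bytes) -> List[Hypothesis]:
    return [
        Hypothesis("show flights to boston", 0.6),
        Hypothesis("show fights to boston", 0.7),
    ]


def toy_understander(text: str) -> Optional[dict]:
    words = text.split()
    if "flights" in words:
        return {"action": "list_flights", "destination": words[-1]}
    return None


manager = DialogManager(toy_recognizer, toy_understander)
result = manager.handle(b"")
```

Because the manager depends only on the two callable signatures, a different recognizer can be substituted without touching the understanding or dialog code, which is the property the architecture aims for.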
3. TEXT UNDERSTANDING 
In text understanding we emphasize an architecture in 
which a variety of components cooperate to produce 
an analysis of texts. Not all components may be re- 
quired for any one particular application - they can be 
intermixed in various ways to suit each application's 
needs. This architecture, shown in Figure 3, includes
keyword-based information retrieval for message routing.
A knowledge-based information retrieval component
(KBIRD) extracts text information which is accessible
without a detailed natural language analysis.
If an application requires detailed natural language
processing, a natural language processing system is
available. Finally, a database record generator formats the
extracted information for database generation. This
approach is described in [2].
Figure 3: Paramax text understanding architecture 
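
The idea that components can be intermixed to suit an application can be sketched as a list of interchangeable stages operating on a shared message. This is only an illustration under stated assumptions: the function names are hypothetical, and the regular-expression extractor is a crude stand-in for shallow extraction, not a description of how KBIRD works.

```python
import re
from typing import Callable, Dict, List

Message = Dict[str, object]
Component = Callable[[Message], Message]


def keyword_router(keywords: List[str]) -> Component:
    """Keyword-based retrieval: mark whether a message is relevant."""
    def run(msg: Message) -> Message:
        text = str(msg["text"]).lower()
        msg["relevant"] = any(k in text for k in keywords)
        return msg
    return run


def shallow_extractor(msg: Message) -> Message:
    """Stand-in for extraction that needs no detailed linguistic analysis:
    here it only pulls out four-digit years with a regular expression."""
    msg["years"] = re.findall(r"\b(?:19|20)\d{2}\b", str(msg["text"]))
    return msg


def record_generator(msg: Message) -> Message:
    """Format whatever was extracted as a flat database-style record."""
    msg["record"] = {"relevant": msg.get("relevant"), "years": msg.get("years", [])}
    return msg


def run_pipeline(components: List[Component], text: str) -> Message:
    """Pass one message through the chosen components in order."""
    msg: Message = {"text": text}
    for component in components:
        msg = component(msg)
    return msg


# One application uses all three stages; another skips extraction entirely.
full = [keyword_router(["workshop"]), shallow_extractor, record_generator]
routing_only = [keyword_router(["workshop"]), record_generator]

out = run_pipeline(full, "The workshop was held in February 1992.")
routed = run_pipeline(routing_only, "Quarterly budget report.")
```

A stage for detailed natural language processing would slot into the same list, so an application pays for it only when it is configured in.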
4. DOCUMENT UNDERSTANDING 
The object of Paramax document understanding re- 
search is to develop novel knowledge-based approaches 
to the processing of document images. Starting with the 
scanned image(s) of a document's pages, our goal is to 
determine the document's functional and physical orga- 
nization and to extract the key ideas from the ASCII rep- 
resentation of its text. Our current focus is on producing 
the structural interpretation of the document and the ac- 
companying ASCII text for each component. The ASCII 
text would then be analyzed by a text-understanding 
system, as discussed in Section 3.
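
One way to picture a structural interpretation that carries both physical and functional organization, with ASCII text attached to each component, is a simple tree. The sketch below is an assumption made for illustration: the `Component` type, the role labels, and the pixel bounding boxes are hypothetical and are not taken from the Paramax system.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Component:
    """A region of a page: where it is (physical) and what it is (functional)."""
    role: str                        # e.g. "title", "paragraph" (assumed labels)
    box: Tuple[int, int, int, int]   # left, top, right, bottom in pixels
    text: str = ""                   # ASCII text recognized for this region
    children: List["Component"] = field(default_factory=list)


def walk(node: Component) -> Iterator[Component]:
    """Yield the node and all of its descendants in depth-first order."""
    yield node
    for child in node.children:
        yield from walk(child)


def text_by_role(root: Component, role: str) -> List[str]:
    """Collect the text of every component with the given functional role."""
    return [c.text for c in walk(root) if c.role == role and c.text]


page = Component("page", (0, 0, 850, 1100), children=[
    Component("title", (100, 50, 750, 90), "Language Understanding Research at Paramax"),
    Component("section", (100, 120, 750, 600), children=[
        Component("heading", (100, 120, 400, 140), "1. INTRODUCTION"),
        Component("paragraph", (100, 150, 750, 300), "Body text of the first paragraph."),
    ]),
])

# Text handed on to a downstream text-understanding step.
paragraphs = text_by_role(page, "paragraph")
```

Keeping the text on the component that produced it lets a later text-understanding step know whether a string came from a title, a heading, or body text.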
An intelligent document understanding system is shown
in Figure 4, with the scope of our project falling in the
shaded region.
This work is discussed in detail in [3].
Figure 4: Multiple sources of knowledge are integrated
in this document processing architecture.
[Figure labels: Image Understanding; Text Understanding]
References
1. L. M. Norton, D. A. Dahl, and M. C. Linebarger, "Recent
improvements and benchmark results for the Paramax
ATIS system," in Proceedings of the DARPA Speech
and Language Workshop, (Harriman, New York),
February 1992.
2. T. Finin, R. McEntire, C. Weir, and B. Silk,
"A three-tiered approach to natural language text retrieval,"
AAAI-91 Workshop Program, July 1991.
3. S. Liebowitz Taylor, M. Lipshutz, and C. Weir, "Doc- 
ument structure interpretation by integrating multiple 
knowledge sources," in Proceedings of the Symposium 
on Document Analysis and Information Retrieval, March 
1992. 
