A Text Understander that Learns 
Udo Hahn & Klemens Schnattinger 
Computational Linguistics Lab, Freiburg University 
Werthmannplatz 1, D-79085 Freiburg, Germany 
{hahn,schnattinger}@coling.uni-freiburg.de 
Abstract 
We introduce an approach to the automatic acquisition of new concepts from natural language texts which is tightly integrated with the underlying text understanding process. The learning 
model is centered around the 'quality' of differ- 
ent forms of linguistic and conceptual evidence 
which underlies the incremental generation and 
refinement of alternative concept hypotheses, 
each one capturing a different conceptual read- 
ing for an unknown lexical item. 
1 Introduction 
The approach to learning new concepts as a 
result of understanding natural language texts 
we present here builds on two different sources 
of evidence -- the prior knowledge of the do- 
main the texts are about, and grammatical con- 
structions in which unknown lexical items oc- 
cur. While there may be many reasonable inter- 
pretations when an unknown item occurs for the 
very first time in a text, their number rapidly 
decreases when more and more evidence is gath- 
ered. Our model tries to make explicit the rea- 
soning processes behind this learning pattern. 
Unlike the current mainstream in automatic 
linguistic knowledge acquisition, which can be 
characterized as quantitative, surface-oriented 
bulk processing of large corpora of texts (Hin- 
dle, 1989; Zernik and Jacobs, 1990; Hearst, 
1992; Manning, 1993), we propose here a 
knowledge-intensive model of concept learning 
from few, positive-only examples that is tightly 
integrated with the non-learning mode of text 
understanding. Both learning and understand- 
ing build on a given core ontology in the format 
of terminological assertions and, hence, make 
abundant use of terminological reasoning. The 
'plain' text understanding mode can be consid- 
ered as the instantiation and continuous filling 
[Figure graphic not recoverable from the source; it depicts the parser feeding hypothesis spaces 1..n, the quality machine, and the qualifier.]
Figure 1: Architecture of the Text Learner 
of roles with respect to single concepts already 
available in the knowledge base. Under learning 
conditions, however, a set of alternative concept 
hypotheses has to be maintained for each un- 
known item, with each hypothesis denoting a 
newly created conceptual interpretation tenta- 
tively associated with the unknown item. 
The underlying methodology is summarized 
in Fig. 1. The text parser (for an overview, cf. 
Bröker et al. (1994)) yields information from 
the grammatical constructions in which an un- 
known lexical item (symbolized by the black 
square) occurs in terms of the corresponding de- 
pendency parse tree. The kinds of syntactic con- 
structions (e.g., genitive, apposition, compara- 
tive), in which unknown lexical items appear, 
are recorded and later assessed relative to the 
credit they lend to a particular hypothesis. The 
conceptual interpretation of parse trees involv- 
ing unknown lexical items in the domain knowl- 
edge base leads to the derivation of concept hy- 
potheses, which are further enriched by concep- 
tual annotations. These reflect structural pat- 
terns of consistency, mutual justification, anal- 
ogy, etc. relative to already available concept 
descriptions in the domain knowledge base or 
other hypothesis spaces. This kind of initial ev- 
idence, in particular its predictive "goodness" 
for the learning task, is represented by corre- 
sponding sets of linguistic and conceptual qual- 
Syntax    Semantics
C ⊓ D     C^I ∩ D^I
C ⊔ D     C^I ∪ D^I
∀R.C      {d ∈ Δ | R^I(d) ⊆ C^I}
R ⊓ S     R^I ∩ S^I
C|R       {(d,d') ∈ R^I | d ∈ C^I}
R|C       {(d,d') ∈ R^I | d' ∈ C^I}

Table 1: Some Concept and Role Terms

Axiom     Semantics
A ≐ C     A^I = C^I
a : C     a^I ∈ C^I
Q ≐ R     Q^I = R^I
a R b     (a^I, b^I) ∈ R^I

Table 2: Axioms for Concepts and Roles
ity labels. Multiple concept hypotheses for each 
unknown lexical item are organized in terms of 
corresponding hypothesis spaces, each of which 
holds different or further specialized conceptual 
readings. 
The quality machine estimates the overall 
credibility of single concept hypotheses by tak- 
ing the available set of quality labels for each 
hypothesis into account. The final computa- 
tion of a preference order for the entire set of 
competing hypotheses takes place in the qual- 
ifier, a terminological classifier extended by an 
evaluation metric for quality-based selection cri- 
teria. The output of the quality machine is a 
ranked list of concept hypotheses. The ranking 
yields, in decreasing order of significance, either 
the most plausible concept classes which classify 
the considered instance or more general concept 
classes subsuming the considered concept class 
(cf. Schnattinger and Hahn (1998) for details). 
2 Methodological Framework 
In this section, we present the major method- 
ological decisions underlying our approach. 
2.1 Terminological Logics 
We use a standard terminological, KL-ONE-style concept description language, here referred to as CDL (for a survey of this paradigm, cf. 
Woods and Schmolze (1992)). It has several 
constructors combining atomic concepts, roles 
and individuals to define the terminological the- 
ory of a domain. Concepts are unary predicates, 
roles are binary predicates over a domain A, 
with individuals being the elements of A. We 
assume a common set-theoretical semantics for 
CDL: an interpretation I is a function that 
assigns to each concept symbol (the set A) a 
subset of the domain Δ, I : A → 2^Δ, to each 
role symbol (the set P) a binary relation over Δ, 
I : P → 2^(Δ×Δ), and to each individual symbol 
(the set I) an element of Δ, I : I → Δ. 
Concept terms and role terms are defined in- 
ductively. Table 1 contains some constructors 
and their semantics, where C and D denote concept terms, while R and S denote roles. R^I(d) 
represents the set of role fillers of the individual 
d, i.e., the set of individuals e with (d, e) ∈ R^I. 
By means of terminological axioms (for a sub- 
set, see Table 2) a symbolic name can be intro- 
duced for each concept to which are assigned 
necessary and sufficient constraints using the 
definitional operator '≐'. A finite set of such 
axioms is called the terminology or TBox. Con- 
cepts and roles are associated with concrete in- 
dividuals by assertional axioms (see Table 2; a, b 
denote individuals). A finite set of such axioms 
is called the world description or ABox. An in- 
terpretation Z is a model of an ABox with re- 
gard to a TBox, iff Z satisfies the assertional 
and terminological axioms. 
Considering, e.g., a phrase such as 'The 
switch of the Itoh-Ci-8 ..', a straightforward 
translation into corresponding terminological 
concept descriptions is illustrated by: 
(P1) switch.1 : SWITCH
(P2) Itoh-Ci-8 HAS-SWITCH switch.1
(P3) HAS-SWITCH ≐
     ((OUTPUTDEV ⊔ INPUTDEV ⊔ STORAGEDEV ⊔ COMPUTER) | HAS-PART) | SWITCH
Assertion P1 indicates that the instance 
switch.1 belongs to the concept class SWITCH. 
P2 relates Itoh-Ci-8 and switch.1 via the re- 
lation HAS-SWITCH. The relation HAS-SWITCH 
is defined, finally, as the set of all HAS-PART 
relations which have their domain restricted to 
the disjunction of the concepts OUTPUTDEV, 
INPUTDEV, STORAGEDEV or COMPUTER and 
their range restricted to SWITCH. 
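To make the set-theoretic reading of (P1)-(P3) concrete, here is a minimal sketch (not the authors' classifier machinery) of the domain- and range-restriction semantics from Table 1; the extra part 'fan.1' is invented purely for illustration:

```python
# Illustrative sketch: semantics of the restricted role HAS-SWITCH from (P3).

def restrict_domain(rel, concept_ext):
    """C|R from Table 1: keep pairs whose first element is in C."""
    return {(d, e) for (d, e) in rel if d in concept_ext}

def restrict_range(rel, concept_ext):
    """R|C from Table 1: keep pairs whose second element is in C."""
    return {(d, e) for (d, e) in rel if e in concept_ext}

# A toy interpretation of the example ABox.
SWITCH = {"switch.1"}
OUTPUTDEV, INPUTDEV, STORAGEDEV, COMPUTER = set(), set(), set(), {"Itoh-Ci-8"}
HAS_PART = {("Itoh-Ci-8", "switch.1"), ("Itoh-Ci-8", "fan.1")}

# (P3): restrict HAS-PART's domain to the device disjunction, its range to SWITCH.
device_union = OUTPUTDEV | INPUTDEV | STORAGEDEV | COMPUTER
HAS_SWITCH = restrict_range(restrict_domain(HAS_PART, device_union), SWITCH)

print(HAS_SWITCH)  # {('Itoh-Ci-8', 'switch.1')} -- the fan pair is filtered out
```

The range restriction is what later licenses the classifier's specialization of HAS-PART assertions to HAS-SWITCH (cf. Subsection 2.2).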
In order to represent and reason about con- 
cept hypotheses we have to properly extend the 
formalism of CDL. Terminological hypotheses, 
in our framework, are characterized by the fol- 
lowing properties: for all stipulated hypotheses 
(1) the same domain A holds, (2) the same con- 
cept definitions are used, and (3) only different 
assertional axioms can be established. These 
conditions are sufficient, because each hypoth- 
esis is based on a unique discourse entity (cf. 
(1)), which can be directly mapped to associ- 
ated instances (so concept definitions are stable 
(2)). Only relations (including the ISA-relation) 
among the instances may be different (3). 
Axiom       Semantics
(a : C)_h   a^I ∈ C^(I_h)
(a R b)_h   (a^I, b^I) ∈ R^(I_h)

Table 3: Axioms in CDL^hypo
Given these constraints, we may annotate 
each assertional axiom of the form 'a : C' and 
'a R b' by a corresponding hypothesis label h so 
that (a : C)h and (a R b)h are valid terminolog- 
ical expressions. The extended terminological 
language (cf. Table 3) will be called CDL^hypo. 
Its semantics is given by a special interpreta- 
tion function I_h for each hypothesis h, which is 
applied to each concept and role symbol in the 
canonical way: I_h : A → 2^Δ; I_h : P → 2^(Δ×Δ). 
Notice that the instances a, b are interpreted by 
the interpretation function I, because there exists 
only one domain Δ. Only the interpretation 
of the concept symbol C and the role symbol R 
may be different in each hypothesis h. 
Assume that we want to represent two of the 
four concept hypotheses that can be derived 
from (P3), viz. Itoh-Ci-8 considered as a storage 
device or an output device. The corresponding 
ABox expressions are then given by: 
(Itoh-Ci-8 HAS-SWITCH switch.1)_h1
(Itoh-Ci-8 : STORAGEDEV)_h1

(Itoh-Ci-8 HAS-SWITCH switch.1)_h2
(Itoh-Ci-8 : OUTPUTDEV)_h2
The semantics associated with this ABox 
fragment has the following form: 

I_h1(HAS-SWITCH) = {(Itoh-Ci-8, switch.1)},
I_h1(STORAGEDEV) = {Itoh-Ci-8},
I_h1(OUTPUTDEV) = ∅

I_h2(HAS-SWITCH) = {(Itoh-Ci-8, switch.1)},
I_h2(STORAGEDEV) = ∅,
I_h2(OUTPUTDEV) = {Itoh-Ci-8}
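The hypothesis-space mechanics can be sketched as per-hypothesis ABoxes sharing one domain; `tell` and `extension` are illustrative stand-ins, not the system's actual API:

```python
# Sketch: each hypothesis label h owns its own set of assertional axioms,
# while the instances themselves live in one shared domain.
from collections import defaultdict

hypothesis_spaces = defaultdict(set)

def tell(axiom, h):
    """Assert an axiom (a triple) inside hypothesis space h."""
    hypothesis_spaces[h].add(axiom)

# The two readings of Itoh-Ci-8 derived from (P3):
tell(("Itoh-Ci-8", "HAS-SWITCH", "switch.1"), "h1")
tell(("Itoh-Ci-8", "isa", "STORAGEDEV"), "h1")
tell(("Itoh-Ci-8", "HAS-SWITCH", "switch.1"), "h2")
tell(("Itoh-Ci-8", "isa", "OUTPUTDEV"), "h2")

def extension(concept, h):
    """I_h(concept): instances asserted to belong to the concept in space h."""
    return {a for (a, rel, c) in hypothesis_spaces[h]
            if rel == "isa" and c == concept}

print(extension("STORAGEDEV", "h1"))  # {'Itoh-Ci-8'}
print(extension("STORAGEDEV", "h2"))  # set()
```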
2.2 Hypothesis Generation Rules 
As mentioned above, text parsing and con- 
cept acquisition from texts are tightly coupled. 
Whenever, e.g., two nominals or a nominal and 
a verb are supposed to be syntactically related 
in the regular parsing mode, the semantic in- 
terpreter simultaneously evaluates the concep- 
tual compatibility of the items involved. Since 
these reasoning processes are fully embedded in 
a terminological representation system, checks 
are made as to whether a concept denoted by 
one of these objects is allowed to fill a role of 
the other one. If one of the items involved is 
unknown, i.e., a lexical and conceptual gap is 
encountered, this interpretation mode generates 
initial concept hypotheses about the class mem- 
bership of the unknown object, and, as a conse- 
quence of inheritance mechanisms holding for 
concept taxonomies, provides conceptual role 
information for the unknown item. 
Given the structural foundations of termi- 
nological theories, two dimensions of concep- 
tual learning can be distinguished -- the tax- 
onomic one by which new concepts are located 
in conceptual hierarchies, and the aggregational 
one by which concepts are supplied with clus- 
ters of conceptual relations (these will be used 
subsequently by the terminological classifier to 
determine the current position of the item to 
be learned in the taxonomy). In the follow- 
ing, let target.con be an unknown concept de- 
noted by the corresponding lexical item tar- 
get.lex, base.con be a given knowledge base con- 
cept denoted by the corresponding lexical item 
base.lex, and let target.lex and base.lex be re- 
lated by some dependency relation. Further- 
more, in the hypothesis generation rules below 
variables are indicated by names with leading 
'?'; the operator TELL is used to initiate the 
creation of assertional axioms in CDL^hypo. 
Typical linguistic indicators that can be exploited for taxonomic integration are appositions ('.. the printer @A@ ..'), exemplification phrases ('.. printers like the @A@ ..') or nominal compounds ('.. the @A@ printer ..'). These 
constructions almost unequivocally determine 
'@A@' (target.lex) when considered as a proper 
name 1 to denote an instance of a PRINTER (tar- 
get.con), given its characteristic dependency re- 
lation to 'printer' (base.lex), the conceptual cor- 
relate of which is the concept class PRINTER 
(base.con). This conclusion is justified indepen- 
dent of conceptual conditions, simply due to the 
nature of these linguistic constructions. 
The generation of corresponding concept hy- 
potheses is achieved by the rule sub-hypo (Ta- 
ble 4). Basically, the type of target.con is carried 
over from base.con (function type-of). In addi- 
tion, the syntactic label is asserted which char- 
acterizes the grammatical construction figuring 
as the structural source for that particular hy- 
1Such a part-of-speech hypothesis can be derived 
from the inventory of valence and word order specifi- 
cations underlying the dependency grammar model we 
use (Bröker et al., 1994). 
sub-hypo (target.con, base.con, h, label)
  ?type := type-of(base.con)
  TELL (target.con : ?type)_h
  add-label((target.con : ?type)_h, label)

Table 4: Taxonomic Hypothesis Generation Rule 
pothesis (h denotes the identifier for the selected 
hypothesis space), e.g., APPOSITION, EXEMPLI- 
FICATION, or NCOMPOUND. 
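A hedged sketch of sub-hypo in Python; the `kb` dictionary, `type_of` lookup, and label bookkeeping are invented stand-ins for the terminological system's TELL and add-label operations:

```python
# Sketch of the sub-hypo rule (Table 4): copy the base concept's type to the
# target and record the syntactic quality label for the hypothesis.

def sub_hypo(target_con, base_con, h, label, kb):
    ctype = kb["type_of"][base_con]          # ?type := type-of(base.con)
    axiom = (target_con, "isa", ctype)       # TELL (target.con : ?type)_h
    kb["spaces"].setdefault(h, set()).add(axiom)
    kb["labels"].setdefault((axiom, h), []).append(label)

kb = {"type_of": {"printer": "PRINTER"}, "spaces": {}, "labels": {}}
# '.. the printer @A@ ..' -- an apposition relating target '@A@' to 'printer':
sub_hypo("A", "printer", "h1", "APPOSITION", kb)
print(kb["spaces"]["h1"])  # {('A', 'isa', 'PRINTER')}
```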
The aggregational dimension of terminologi- 
cal theories is addressed, e.g., by grammatical 
constructions causing case frame assignments. 
In the example '.. @B@ is equipped with 32 MB 
of RAM ..', role filler constraints of the verb 
form 'equipped' that relate to its PATIENT role 
carry over to '@B@'. After subsequent semantic interpretation of the entire verbal complex, 
'@B@' may be anything that can be equipped 
with memory. Constructions like prepositional 
phrases ('.. @C@ from IBM ..') or genitives ('.. 
IBM's @C@ ..') in which either target.lex or 
base.lex occur as head or modifier have a simi- 
lar effect. Attachments of prepositional phrases 
or relations among nouns in genitives, however, 
open a wider interpretation space for '@C@' 
than for '@B@', since verbal case frames provide 
a higher role selectivity than PP attachments 
or, even more so, genitive NPs. So, any concept 
that can reasonably be related to the concept 
IBM will be considered a potential hypothesis 
for '@C@', e.g., its departments, products, Fortune 500 ranking. 
Generalizing from these considerations, we 
state a second hypothesis generation rule which 
accounts for aggregational patterns of concept 
learning. The basic assumption behind this 
rule, perm-hypo (cf. Table 5), is that target.con 
fills (exactly) one of the n roles of base.con it 
is currently permitted to fill (this set is deter- 
mined by the function perm-filler). Depending on the actual linguistic construction one encounters, it may occur, in particular for PP 
and NP constructions, that one cannot decide 
on the correct role yet. Consequently, several 
alternative hypothesis spaces are opened and 
target.con is assigned as a potential filler of 
the i-th role (taken from ?roleSet, the set of 
admitted roles) in its corresponding hypothesis 
space. As a result, the classifier is able to de- 
rive a suitable concept hypothesis by specializ- 
ing target.con according to the value restriction 
of base.con's i-th role. The function member-of 
perm-hypo (target.con, base.con, h, label)
  ?roleSet := perm-filler(target.con, base.con, h)
  ?r := |?roleSet|
  FORALL ?i := ?r DOWNTO 1 DO
    ?role_i := member-of(?roleSet)
    ?roleSet := ?roleSet \ {?role_i}
    IF ?i = 1
      THEN ?hypo := h
      ELSE ?hypo := gen-hypo(h)
    TELL (base.con ?role_i target.con)_?hypo
    add-label((base.con ?role_i target.con)_?hypo, label)

Table 5: Aggregational Hypothesis Generation Rule 
selects a role from the set ?roleSet; gen-hypo 
creates a new hypothesis space by asserting 
the given axioms of h and outputs its identi- 
fier. Thereupon, the hypothesis space identified 
by ?hypo is augmented through a TELL op- 
eration by the hypothesized assertion. As for 
sub-hypo, perm-hypo assigns a syntactic qual- 
ity label (function add-label) to each i-th hy- 
pothesis indicating the type of syntactic con- 
struction in which target.lex and base.lex are 
related in the text, e.g., CASEFRAME, PPAT- 
TACH or GENITIVENP. 
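perm-hypo can be sketched along the same lines; `gen_hypo` clones a hypothesis space under a fresh identifier, and the role names in the usage example are invented for illustration:

```python
# Sketch of the perm-hypo rule (Table 5): each admitted role of the base
# concept opens its own hypothesis space for the target as filler.
import itertools

_counter = itertools.count(1)

def gen_hypo(h, spaces):
    """Clone hypothesis space h under a fresh identifier."""
    new_h = f"{h}.{next(_counter)}"
    spaces[new_h] = set(spaces.get(h, set()))
    return new_h

def perm_hypo(target_con, base_con, h, label, role_set, spaces, labels):
    roles = sorted(role_set)
    for i, role in enumerate(roles):
        # The last admitted role reuses h; every other role opens a new space.
        hypo = h if i == len(roles) - 1 else gen_hypo(h, spaces)
        axiom = (base_con, role, target_con)
        spaces.setdefault(hypo, set()).add(axiom)
        labels.setdefault((axiom, hypo), []).append(label)

spaces, labels = {"h": set()}, {}
# '.. @C@ from IBM ..': a PP attachment admits several roles of IBM, so
# several hypothesis spaces are opened (role names invented):
perm_hypo("C", "IBM", "h", "PPATTACH",
          {"HAS-PRODUCT", "HAS-DEPARTMENT"}, spaces, labels)
print(sorted(spaces))  # ['h', 'h.1'] -- one space per admitted role
```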
Getting back to our example, let us assume 
that the target Itoh-Ci-8 is predicted already as 
a PRODUCT as a result of preceding interpreta- 
tion processes, i.e., Itoh-Ci-8 : PRODUCT holds. 
Let PRODUCT be defined as: 
PRODUCT ≐
  ∀HAS-PART.PHYSICALOBJECT ⊓ ∀HAS-SIZE.SIZE ⊓
  ∀HAS-PRICE.PRICE ⊓ ∀HAS-WEIGHT.WEIGHT
At this level of conceptual restriction, four 
roles have to be considered for relating the tar- 
get Itoh-Ci-8 - as a tentative PRODUCT - to 
the base concept SWITCH when interpreting the 
phrase 'The switch of the Itoh-Ci-8 .. '. Three of 
them, HAS-SIZE, HAS-PRICE, and HAS-WEIGHT, 
are ruled out due to the violation of a simple 
integrity constraint ('switch' does not denote a 
measure unit). Therefore, only the role HAS- 
PART must be considered in terms of the expres- 
sion Itoh-Ci-8 HAS-PART switch.1 (or, equiva- 
lently, switch.1 PART-OF Itoh-Ci-8). Due to the 
definition of HAS-SWITCH (cf. P3, Subsection 
2.1), the instantiation of HAS-PART is special- 
ized to HAS-SWITCH by the classifier, since the 
range of the HAS-PART relation is already re- 
stricted to SWITCH (P1). Since the classifier ag- 
gressively pushes hypothesizing to be maximally 
specific, the disjunctive concept referred to in 
the domain restriction of the role HAS-SWITCH 
is split into four distinct hypotheses, two of 
which are sketched below. Hence, we assume 
Itoh-Ci-8 to denote either a STORAGEDEvice 
or an OUTPUTDEvice or an INPUTDEvice or a 
COMPUTER (note that we also include parts of 
the IS-A hierarchy in the example below). 
(Itoh-Ci-8 : STORAGEDEV)_h1,
(Itoh-Ci-8 : DEVICE)_h1, ...,
(Itoh-Ci-8 HAS-SWITCH switch.1)_h1

(Itoh-Ci-8 : OUTPUTDEV)_h2,
(Itoh-Ci-8 : DEVICE)_h2, ...,
(Itoh-Ci-8 HAS-SWITCH switch.1)_h2, ...
2.3 Hypothesis Annotation Rules 
In this section, we will focus on the quality as- 
sessment of concept hypotheses which occurs at 
the knowledge base level only; it is due to the 
operation of hypothesis annotation rules which 
continuously evaluate the hypotheses that have 
been derived from linguistic evidence. 
The M-Deduction rule (see Table 6) is trig- 
gered for any repetitive assignment of the same 
role filler to one specific conceptual relation that 
occurs in different hypothesis spaces. This rule 
captures the assumption that a role filler which 
has been multiply derived at different occasions 
must be granted more strength than one which 
has been derived at a single occasion only. 
EXISTS o1, o2, R, h1, h2 :
  (o1 R o2)_h1 ∧ (o1 R o2)_h2 ∧ h1 ≠ h2
  ⟹ TELL (o1 R o2)_h1 : M-DEDUCTION

Table 6: The Rule M-Deduction 
Considering our example at the end of subsec- 
tion 2.2, for 'Itoh-Ci-8' the concept hypotheses 
STORAGEDEV and OUTPUTDEV were derived 
independently of each other in different hypoth- 
esis spaces. Hence, DEVICE as their common 
superconcept has been multiply derived by the 
classifier in each of these spaces as a result of 
transitive closure computations, too. Accord- 
ingly, this hypothesis is assigned a high degree 
of confidence by the classifier which derives the 
conceptual quality label M-DEDUCTION: 
(Itoh-Ci-8 : DEVICE)_h1 ∧ (Itoh-Ci-8 : DEVICE)_h2
  ⟹ (Itoh-Ci-8 : DEVICE)_h1 : M-DEDUCTION
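The M-Deduction check can be sketched over the hypothesis-space representation used above (illustrative data structures, not the classifier's internals):

```python
# Sketch of M-Deduction (Table 6): an axiom derived in two different
# hypothesis spaces earns the conceptual quality label M-DEDUCTION.

def m_deduction(spaces, labels):
    hs = list(spaces)
    for i, h1 in enumerate(hs):
        for h2 in hs[i + 1:]:
            for axiom in spaces[h1] & spaces[h2]:   # same (o1 R o2) in both
                labels.setdefault((axiom, h1), []).append("M-DEDUCTION")

spaces = {
    "h1": {("Itoh-Ci-8", "isa", "DEVICE"), ("Itoh-Ci-8", "isa", "STORAGEDEV")},
    "h2": {("Itoh-Ci-8", "isa", "DEVICE"), ("Itoh-Ci-8", "isa", "OUTPUTDEV")},
}
labels = {}
m_deduction(spaces, labels)
print(labels)  # {(('Itoh-Ci-8', 'isa', 'DEVICE'), 'h1'): ['M-DEDUCTION']}
```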
The C-Support rule (see Table 7) is triggered 
whenever, within the same hypothesis space, 
a hypothetical relation, RI, between two in- 
stances can be justified by another relation, R2, 
involving the same two instances, but where the 
role fillers occur in 'inverted' order (R1 and R2 
need not necessarily be semantically inverse relations, as with 'buy' and 'sell'). This causes 
the generation of the quality label C-SUPPORT 
which captures the inherent symmetry between 
concepts related via quasi-inverse relations. 
EXISTS o1, o2, R1, R2, h :
  (o1 R1 o2)_h ∧ (o2 R2 o1)_h ∧ R1 ≠ R2
  ⟹ TELL (o1 R1 o2)_h : C-SUPPORT

Table 7: The Rule C-Support 
Example: 
(Itoh SELLS Itoh-Ci-8)_h ∧
(Itoh-Ci-8 DEVELOPED-BY Itoh)_h
  ⟹ (Itoh SELLS Itoh-Ci-8)_h : C-SUPPORT
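C-Support admits an equally small sketch; the quadratic scan over axioms is an illustrative simplification:

```python
# Sketch of C-Support (Table 7): within one hypothesis space, a relation
# justified by a quasi-inverse relation over the same two instances is
# rewarded with the C-SUPPORT label.

def c_support(space_axioms, h, labels):
    for (o1, r1, o2) in space_axioms:
        for (a, r2, b) in space_axioms:
            if (a, b) == (o2, o1) and r1 != r2:      # inverted filler order
                labels.setdefault(((o1, r1, o2), h), []).append("C-SUPPORT")

axioms = {("Itoh", "SELLS", "Itoh-Ci-8"),
          ("Itoh-Ci-8", "DEVELOPED-BY", "Itoh")}
labels = {}
c_support(axioms, "h", labels)
print(sorted(labels))  # both axioms support each other
```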
Whenever an already filled conceptual rela- 
tion receives an additional, yet different role 
filler in the same hypothesis space, the Add- 
Filler rule is triggered (see Table 8). This 
application-specific rule is particularly suited to 
our natural language understanding task and 
has its roots in the distinction between mandatory and optional case roles for (ACTION) verbs. 
Roughly, it yields a negative assessment in 
terms of the quality label ADDFILLER for any 
attempt to fill the same mandatory case role 
more than once (unless coordinations are involved). In contradistinction, when the same 
role of a non-ACTION concept (typically de- 
noted by nouns) is multiply filled we assign the 
positive quality label SUPPORT, since it reflects 
the conceptual proximity a relation induces on 
its component fillers, provided that they share 
a common, non-ACTION concept class. 
EXISTS o1, o2, o3, R, h :
  (o1 R o2)_h ∧ (o1 R o3)_h ∧ o2 ≠ o3 ∧ (o1 : ACTION)_h
  ⟹ TELL (o1 R o2)_h : ADDFILLER

Table 8: The Rule AddFiller 
We give examples both for the assignment of 
an ADDFILLER as well as for a SUPPORT label: 
Examples: 
(produces.1 : ACTION)_h ∧
(produces.1 AGENT Itoh)_h ∧
(produces.1 AGENT IBM)_h
  ⟹ (produces.1 AGENT Itoh)_h : ADDFILLER

(Itoh-Ci-8 : PRINTER)_h ∧ (Itoh-Ct : PRINTER)_h ∧
(Itoh SELLS Itoh-Ci-8)_h ∧ (Itoh SELLS Itoh-Ct)_h ∧
(Itoh : ¬ACTION)_h
  ⟹ (Itoh-Ci-8 : PRINTER)_h : SUPPORT
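A simplified sketch of the AddFiller/SUPPORT distinction; unlike the paper's rule, which in the SUPPORT case labels the fillers' class assertions, this version labels the relation axiom itself, and the `actions` set stands in for classifier membership in ACTION:

```python
# Sketch: a mandatory role of an ACTION instance filled twice is penalized
# (ADDFILLER); multiple fillers on a non-ACTION role are rewarded (SUPPORT).

def add_filler(space_axioms, actions, h, labels):
    for (o1, r, o2) in space_axioms:
        for (a, r2, o3) in space_axioms:
            if a == o1 and r2 == r and o3 != o2:     # same role, different filler
                label = "ADDFILLER" if o1 in actions else "SUPPORT"
                labels.setdefault(((o1, r, o2), h), []).append(label)

axioms = {("produces.1", "AGENT", "Itoh"), ("produces.1", "AGENT", "IBM")}
labels = {}
add_filler(axioms, actions={"produces.1"}, h="h", labels=labels)
print(labels[(("produces.1", "AGENT", "Itoh"), "h")])  # ['ADDFILLER']
```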
2.4 Quality Dimensions 
The criteria from which concept hypotheses 
are derived differ in the dimension from which 
they are drawn (grammatical vs. conceptual ev- 
idence), as well as the strength by which they 
lend support to the corresponding hypotheses 
(e.g., apposition vs. genitive, multiple deduc- 
tion vs. additional role filling, etc.). In order 
to make these distinctions explicit we have de- 
veloped a "quality calculus" at the core of which 
lie the definition of and inference rules for qual- 
ity labels (cf. Schnattinger and Hahn (1998) for 
more details). A design methodology for specific 
quality calculi may proceed along the follow- 
ing lines: (1) Define the dimensions from which 
quality labels can be drawn. In our application, 
we chose the set LQ := {l_1, ..., l_m} of linguistic 
quality labels and CQ := {c_1, ..., c_n} of conceptual quality labels. (2) Determine a partial 
ordering p among the quality labels from one di- 
mension reflecting different degrees of strength 
among the quality labels. (3) Determine a total 
ordering among the dimensions. 
In our application, we have empirical evi- 
dence to grant linguistic criteria priority over 
conceptual ones. Hence, we state the following 
constraint: ∀l ∈ LQ, ∀c ∈ CQ : l >_p c. 
The dimension LQ. Linguistic quality labels 
reflect structural properties of phrasal patterns 
or discourse contexts in which unknown lexi- 
cal items occur 2 -- we here assume that the 
type of grammatical construction exercises a 
particular interpretative force on the unknown 
item and, at the same time, yields a particu- 
lar level of credibility for the hypotheses being 
derived. Taking the considerations from Sub- 
section 2.2 into account, concrete examples of 
high-quality labels are given by APPOSITION or 
NCOMPOUND labels. Still of good quality but 
already less constraining are occurrences of the 
unknown item in a CASEFRAME construction. 
Finally, in a PPATTACH or GENITIVENP con- 
struction the unknown lexical item is still less 
constrained. Hence, at the quality level, these 
latter two labels (just as the first two labels we 
considered) form an equivalence class whose el- 
ements cannot be further discriminated. So we 
end up with the following quality orderings: 
2In the future, we intend to integrate additional types 
of constraints, e.g., quality criteria reflecting the degree 
of completeness vs. partiality of the parse. 
NCOMPOUND =_p APPOSITION
NCOMPOUND >_p CASEFRAME
APPOSITION >_p CASEFRAME
CASEFRAME >_p GENITIVENP
CASEFRAME >_p PPATTACH
GENITIVENP =_p PPATTACH
The dimension CQ. Conceptual quality labels 
result from comparing the conceptual represen- 
tation structures of a concept hypothesis with 
already existing representation structures in the 
underlying domain knowledge base or other con- 
cept hypotheses from the viewpoint of struc- 
tural similarity, compatibility, etc. The closer 
the match, the more credit is lent to a hypoth- 
esis. A very positive conceptual quality label, 
e.g., is M-DEDUCTION, whereas ADDFILLER is 
a negative one. Still positive strength is expressed by SUPPORT or C-SUPPORT, both being 
indistinguishable, however, from a quality point 
of view. Accordingly, we may state: 
M-DEDUCTION >_p SUPPORT
M-DEDUCTION >_p C-SUPPORT
SUPPORT =_p C-SUPPORT
SUPPORT >_p ADDFILLER
C-SUPPORT >_p ADDFILLER
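The two orderings plus the global LQ-over-CQ constraint can be encoded compactly; the numeric ranks below are an assumption that merely realizes the stated partial orders, not values from the paper:

```python
# Sketch of the quality-label ordering: within-dimension ranks plus the
# constraint that any linguistic label outranks any conceptual one.

LQ = {"NCOMPOUND": 2, "APPOSITION": 2, "CASEFRAME": 1,
      "GENITIVENP": 0, "PPATTACH": 0}
CQ = {"M-DEDUCTION": 2, "SUPPORT": 1, "C-SUPPORT": 1, "ADDFILLER": 0}

def stronger(l1, l2):
    """Is l1 >_p l2? Linguistic labels dominate conceptual ones."""
    if l1 in LQ and l2 in CQ:
        return True
    if l1 in CQ and l2 in LQ:
        return False
    ranks = LQ if l1 in LQ else CQ
    return ranks[l1] > ranks[l2]

print(stronger("APPOSITION", "CASEFRAME"))   # True
print(stronger("CASEFRAME", "M-DEDUCTION"))  # True  (LQ dominates CQ)
print(stronger("SUPPORT", "C-SUPPORT"))      # False (equivalent strength)
```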
2.5 Hypothesis Ranking 
Each new clue available for a target concept to 
be learned results in the generation of additional 
linguistic or conceptual quality labels. So hy- 
pothesis spaces get incrementally augmented by 
quality statements. In order to select the most 
credible one(s) among them we apply a two-step 
procedure (the details of which are explained 
in Schnattinger and Hahn (1998)). First, those 
concept hypotheses are chosen which have ac- 
cumulated the greatest amount of high-quality 
labels according to the linguistic dimension LQ. 
Second, further hypotheses are selected from 
this linguistically plausible candidate set based 
on the quality ordering underlying CQ. 
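The two-step selection can be sketched as follows; the additive label-counting score is an assumption made for illustration (the actual evaluation metric is given in Schnattinger and Hahn (1998)):

```python
# Sketch of two-step hypothesis ranking: first keep the hypotheses with the
# best linguistic-label record, then order the finalists by conceptual labels.

LQ_RANK = {"NCOMPOUND": 2, "APPOSITION": 2, "CASEFRAME": 1,
           "GENITIVENP": 0, "PPATTACH": 0}
CQ_RANK = {"M-DEDUCTION": 2, "SUPPORT": 1, "C-SUPPORT": 1, "ADDFILLER": 0}

def score(labels, ranks):
    """Assumed scoring: sum of the ranks of the labels in one dimension."""
    return sum(ranks[l] for l in labels if l in ranks)

def rank_hypotheses(hypo_labels):
    """hypo_labels: {hypothesis_id: [quality labels]} -> best-first list."""
    best_lq = max(score(ls, LQ_RANK) for ls in hypo_labels.values())
    finalists = {h: ls for h, ls in hypo_labels.items()
                 if score(ls, LQ_RANK) == best_lq}
    return sorted(finalists, key=lambda h: score(finalists[h], CQ_RANK),
                  reverse=True)

hypos = {"h1": ["APPOSITION", "M-DEDUCTION"],
         "h2": ["APPOSITION", "ADDFILLER"],
         "h3": ["PPATTACH", "M-DEDUCTION"]}
print(rank_hypotheses(hypos))  # ['h1', 'h2'] -- h3 loses at the linguistic step
```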
We have also made considerable efforts to 
evaluate the performance of the text learner 
based on the quality calculus. In order to ac- 
count for the incrementality of the learning pro- 
cess, a new evaluation measure capturing the 
system's on-line learning accuracy was defined, 
which is sensitive to taxonomic hierarchies. The 
results we got were consistently favorable, as 
our system outperformed those closest in spirit, 
CAMILLE (Hastings, 1996) and SCISOR (Rau et 
al., 1989), by a gain in accuracy on the or- 
der of 8%. Also, the system requires relatively 
few hypothesis spaces (2 to 6 on average) and 
prunes the concept search space radically, re- 
quiring only a few examples (for evaluation de- 
tails, cf. Hahn and Schnattinger (1998)). 
3 Related Work 
We are not concerned with lexical acquisition 
from very large corpora using surface-level collo- 
cational data as proposed by Zernik and Jacobs 
(1990) and Velardi et al. (1991), or with hy- 
ponym extraction based on entirely syntactic 
criteria as in Hearst (1992) or lexico-semantic 
associations (e.g., Resnik (1992) or Sekine et al. 
(1994)). This is mainly due to the fact that 
these studies aim at a shallower level of learn- 
ing (e.g., selectional restrictions or thematic re- 
lations of verbs), while our focus is on much 
more fine-grained conceptual knowledge (roles, 
role filler constraints, integrity conditions). 
Our approach bears a close relationship, how- 
ever, to the work of Mooney (1987), Berwick 
(1989), Rau et al. (1989), Gomez and Segami 
(1990), and Hastings (1996), who all aim at the 
automated learning of word meanings from con- 
text using a knowledge-intensive approach. But 
our work differs from theirs in that the need to 
cope with several competing concept hypotheses 
and to aim at a reason-based selection in terms 
of the quality of arguments is not an issue in 
these studies. Learning from real-world texts 
usually provides the learner with only sparse 
and fragmentary evidence, such that multiple 
hypotheses are likely to be derived and a need 
for a hypothesis evaluation arises. 
4 Conclusion 
We have introduced a solution for the semantic 
acquisition problem on the basis of the auto- 
matic processing of expository texts. The learn- 
ing methodology we propose is based on the 
incremental assignment and evaluation of the 
quality of linguistic and conceptual evidence for 
emerging concept hypotheses. No specialized 
learning algorithm is needed, since learning is 
a reasoning task carried out by the classifier 
of a terminological reasoning system. However, 
strong heuristic guidance for selecting between 
plausible hypotheses comes from linguistic and 
conceptual quality criteria. 
Acknowledgements. We would like to thank 
our colleagues in the CLIF group for fruitful discus- 
sions, in particular Joe Bush who polished the text 
as a native speaker. K. Schnattinger is supported by 
a grant from DFG (Ha 2097/3-1). 

References 
R. Berwick. 1989. Learning word meanings from 
examples. In D. Waltz, editor, Semantic Struc- 
tures., pages 89-124. Lawrence Erlbaum. 
N. Bröker, U. Hahn, and S. Schacht. 1994. 
Concurrent lexicalized dependency parsing: the 
PARSETALK model. In Proc. of the COLING'94. 
Vol. I, pages 379-385. 
F. Gomez and C. Segami. 1990. Knowledge acqui- 
sition from natural language for expert systems 
based on classification problem-solving methods. 
Knowledge Acquisition, 2(2):107-128. 
U. Hahn and K. Schnattinger. 1998. Towards text 
knowledge engineering. In Proc. of the AAAI'98. 
P. Hastings. 1996. Implications of an automatic lex- 
ical acquisition system. In S. Wermter, E. Riloff, 
and G. Scheler, editors, Connectionist, Statistical 
and Symbolic Approaches to Learning for Natural 
Language Processing, pages 261-274. Springer. 
M. Hearst. 1992. Automatic acquisition of hy- 
ponyms from large text corpora. In Proc. of the 
COLING'92. Vol.2, pages 539-545. 
D. Hindle. 1989. Acquiring disambiguation rules 
from text. In Proc. of the ACL'89, pages 26-29. 
C. Manning. 1993. Automatic acquisition of large 
subcategorization dictionary from corpora. In 
Proc. of the ACL'93, pages 235-242. 
R. Mooney. 1987. Integrated learning of words 
and their underlying concepts. In Proc. of the 
CogSci'87, pages 974-978. 
L. Rau, P. Jacobs, and U. Zernik. 1989. Information 
extraction and text summarization using linguis- 
tic knowledge acquisition. Information Processing 
& Management, 25(4):419-428. 
P. Resnik. 1992. A class-based approach to lexical 
discovery. In Proc. of the ACL'92, pages 327-329. 
K. Schnattinger and U. Hahn. 1998. Quality-based 
learning. In Proc. of the ECAI'98, pages 160-164. 
S. Sekine, J. Carroll, S. Ananiadou, and J. Tsujii. 
1994. Automatic learning for semantic colloca- 
tion. In Proc. of the ANLP'94, pages 104-110. 
P. Velardi, M. Pazienza, and M. Fasolo. 1991. 
How to encode semantic knowledge: a method for 
meaning representation and computer-aided ac- 
quisition. Computational Linguistics, 17:153-170. 
W. Woods and J. Schmolze. 1992. The KL-ONE 
family. Computers & Mathematics with Applications, 23(2/5):133-177. 
U. Zernik and P. Jacobs. 1990. Tagging for learn- 
ing: collecting thematic relations from corpus. In 
Proc. of the COLING'90. Vol. 1, pages 34-39. 
