In: Proceedings of CoNLL-2000 and LLL-2000, pages 176-183, Lisbon, Portugal, 2000. 
Learning from a Substructural Perspective 
Pieter Adriaans and Erik de Haas 
Syllogic, 
P.O. Box 2729, 3800GG Amersfoort, The Netherlands, 
and 
University of Amsterdam, Fac. of Mathematics, Computer Science, Physics and Astronomy, 
Plantage Muidergracht 24, 1018TV Amsterdam, The Netherlands 
pieter, adriaans@ps.net, erik@propersolution.nl 
Abstract 
In this paper we study learning from a logical 
perspective. We show that there is a strong re- 
lationship between a learning strategy, its for- 
mal learning framework and its logical represen- 
tational theory. This relationship enables one 
to translate learnability results from one theory 
to another. Moreover if we go from a classi- 
cal logic theory to a substructural logic theory, 
we can transform learnability results of logical 
concepts to results for string s. In this 
paper we will demonstrate such a translation by 
transforming the Valiant learnability result for 
boolean concepts to a learnability :result for a 
class of string pattern s. 
1 Introduction 
There is a strong relation between a learn- 
ing strategy, its formal learning framework and 
its representational theory. Such a representa- 
tional theory typically is (equivalent to) a logic. 
As an example for this strong relationship as- 
sume that the implication A ~ B is a given 
fact, and you observe A; then you can deduce 
B, which means that you can learn B from A 
based on the underlying representational the- 
ory. The learning strategy is very tightly con- 
nected to its underlying logic. Continuing the 
above example, suppose you observe -~B. In a 
representational theory based on classical logic 
you may deduce ~A given the fact A ~ B. 
In intuitionistic logic however, this deduction 
is not valid. This example shows that the char- 
acter of the representational theory is essential 
for your learning strategy, in terms of what can 
be learned from the facts and examples. 
In the science of the representational theo- 
ries, i.e. logic, it is a common approach to 
connect different representational theories, and 
transform results of one representational theory 
to results in an other representational theory. 
Interesting is now whether we can transform 
learnability results of learning strategies within 
one representational theory to others. Observe 
that to get from a first order calculus to a string 
calculus one needs to eliminate structural rules 
from the calculus. Imagine now that we do the 
same transformation to the learning strategies, 
we would come up with a learning strategy for 
the substructural string calculus starting from a 
learning strategy for the full first order calculus. 
The observation that learning categorial 
grammars translates to the task of learning 
derivations in a substructural logic theory moti- 
vates a research program that investigates learn- 
ing strategies from a logical point of view (Adri- 
aans and de Haas, 1999). Many domains for 
learning tasks can be embedded in a formal 
learning framework based on a logical repre- 
sentational theory. In Adriaans and de Haas 
(1999) we presented two examples of substruc- 
tural logics, that were suitable representational 
theories for different learning tasks; The first 
example was the Lambek calculus for learning 
categorial grammars, the second example dealt 
with a substructural logic that was designed to 
study modern Object Oriented modeling lan- 
guages like UML (OMG, 1997), (Fowler, 1997). 
In the first case the representation theory is first 
order logic without structural rules, the formal 
learning theory from a logical point of view is 
inductive substructural logic programming and 
an example of a learning strategy in this frame- 
work is EMILE, a learning algorithm that learns 
categorial grammars (Adriaans, 1992). 
In this paper we concentrate on the trans- 
formation of classical logic to substructural 
logic and show that Valiant's proof of PAC- 
176 
learnability of boolean concepts can be trans- 
formed to a PAC learnability proof for learning 
a class of finite s. We discuss the ex- 
tension of this learnability approach to the full 
range of substructural logics. Our strategy in 
exploring the concept of learning is to look at 
the logical structure of a learning algorithm, and 
by this reveal the inner working of the learning 
strategy. 
In Valiant (1984) the principle of Probably 
Approximately Correct learning (PAC learning) 
was introduced. There it has been shown that 
k-CNF (k-length Conjunctive Normal Form) 
boolean concepts can be learned efficiently in 
the model of PAC learning. For the proof 
that shows that these boolean concepts can be 
learned efficiently Valiant presents a learning al- 
gorithm and shows by probabilistic arguments 
that boolean concept can be PAC learned in 
polynomial time. In this paper we investigate 
the logical mechanism behind the learning al- 
gorithm. By revealing the logical mechanism 
behind this learning algorithm we are able to 
study PAC learnability of various other logics in 
the substructural landscape of first order propo- 
sitional logic. 
In this paper we will first briefly introduce 
substructural logic in section 2. Consequently 
we will reconstruct in section 3 Valiant's result 
on learnability of boolean concepts in terms of 
logic. Then in section 4 we will show that the 
learnability result of Valiant for k-CNF boolean 
concepts can be transformed to a learnability re- 
sult for a grammar of string patterns denoted by 
a substructural variant of the k-CNF formulas. 
We will conclude this paper with a discussion 
an indicate how this result could be extended 
to learnability results for categorial grammars. 
2 Substructural logic 
In Gentzen style sequential formalisms a sub- 
structural logic shows itself by the absence of 
(some of) the so-called structural rules. Exam- 
ples of such logics are relevance logic (Dunn, 
1986), linear logic (Girard, 1987) and BCK logic 
(Grishin, 1974). Notable is the substructural 
behavior of categorial logic, which in its proto- 
type form is the Lambek calculus. Categorial 
logics are motivated by its use as grammar for 
natural s. The absence of the struc- 
tural rules degrades the abstraction of sets in 
the semantic domain to strings, where elements 
in a string have position and arity, while they 
do not have that in a set. As we will see further 
on in this paper the elimination of the struc- 
tural rules in the learning context of the boolean 
concepts will transform the learning framework 
from sets of valuated variables to strings of val- 
uated variables. 
Example 2.1 In a domain of sets the following 
'expressions' are equivalent, while they are not 
in the domain of strings: 
a, a, b, a ~ a, b, b 
In a calculus with all the structural rules the fea- 
tures 'position' and 'arity' are irrelevant in the 
semantic domain, because aggregates that differ 
in these features can be proved equivalent with 
the structural rules. To see this observe that 
the left side of the above equation can be trans- 
formed to the right side by performing the fol- 
lowing operation: 
a, a, b, a 
a, b, a 
a, a, b 
a, b 
a, b, b 
contract a, a in .first two positions 
to a 
exchange b, a in last to positions to 
a,b 
contract again a, a in first two 
positions to a 
weaken expression b in last position 
to b, b 
In figure 2 we list the axiomatics of the first 
order propositional sequent calculus 1, with the 
axioms , the cut rule, rules for the connectives 
and the structural rules for exchange, weakening 
and contraction. 
3 PAC Boolean concept learning 
revisited 
In this section we describe the principle of Prob- 
ably Approximately Correct Learning (PAC 
learning) of Boolean concepts. We will reveal 
1Note that in the variant we use here we have a special 
case of the RA rule. 
177 
representational 
theory 
First order 
propositional )~ 
logic j I 
formal learning 
framework 
learn ing strategy 
Boolean \ ~ PAC learning , 
-4 concepts ~ k-CNF ) 
1 
Substructural proposition= ) 411 
1 1 
String . PAC learning , 
,~ s 
Figure 1: Relation between learning strategy, learning framework and representational theory 
(Ax) A ~ A (Cut) 
(LA) F,A,B~A (RA) 
F, AAB~A 
(LV) F,A~A F,B~A F,A V B ~ A (RV) 
F =~ A,A F~,A,~ A 
F', F ~ A', A 
F~A,A F t~B,A 
F,F t =~ AAB, A 
F ~ A,A F ~ B,A 
F~AVB, A F~AVB, A 
(Ex) F'AAB'F~=-~ A 
F,B A A,F ~ ~ A 
F~A 
(Weak) F, A ~ A 
(Contr) F, A, A ~ A F,A~A 
Figure 2: First order propositional sequent calculus 
the logical deduction process behind the learn- 
ing algorithm. 
Consider the sample space for boolean con- 
cepts. An example is a vector denoting the 
truth (presence,l) or falsehood (absence,0) of 
propositional variables. Such an example vec- 
tor can be described by a formula consisting of 
the conjunction of all propositional variables or 
negations of propositional variables, depending 
on the fact whether there is a 1 or a 0 in the 
position of the propositional variable name in 
the vector. A collection of vectors, i.e. a con- 
cept, in its turn can be denoted by a formula 
too, being the disjunction of all the formula's of 
the vectors. 
Example 3.1 Let universe U = {a,b} and let 
concept f = {(0, 1)}, then the following formula 
exactly describes f: 
~Ab 
178 
A little more extensive: Let uni- 
verse \[.j, = {a,b,c} and let concept 
f' = {(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)} 
Then the following formula exactly describes fl 
(with a clear translation): 
(~AbAa) V (~ AbA c) V (~A bA c) V (aAbAc) 
Note that these formulas are in Disjunctive nor- 
mal form (DNF). 
An interesting observation now is that the 
learning algorithm of Valiant that learns k-CNF 
formulas actually is trying to prove the equiv- 
alence between a DNF formula and a k-CNF 
formula. 
Example 3.2 Let universe U = {a,b} and let 
concept f = {(0, 1)}, then the following sequent 
should be 'learned' by a 2-CNF learning algo- 
rithm 2: 
~ A b ,¢:,. (aVb) A (~Vb) A (~Vb) 
A little more extensive: Let U' = 
{a, b, c} and let concept f' = 
{(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)} Then 
the following sequent should be 'learned' by a 
2-CNF learning algorithm: 
(~ Ab A ~) V (HAhA c) V (~A bA c) V (aAbAc) 
(~V b) A (~V b) A (a V b) 
The above observation says in logical terms 
that the learning algorithm needs to implement 
an inductive procedure to find this desired proof 
and the concluding concept description (2-CNF 
formula) from examples. In the search space for 
this proof the learning algorithm can use the ax- 
ioms and rules from the representational theory. 
In the framework of boolean concept learning 
this means that the learning algorithm may use 
all the rules and axioms from the representa- 
tional theory of classical propositional logic. 
Example 3.3 Let IJ = {a, b} and let concept 
f = {(0, 1)} and assume f can be represented 
by a 2-CNF formula, to learn the 2-CNF de- 
scription of concept f the learning algorithm 
needs to find the proof for a sequent starting 
2i.e. an algorithm that can learn 2-CNF boolean con- 
cepts. 
from the DNF formula ~ A b to a 2-CNF for- 
mula and vice versa (¢~.) and to do so it may 
use all the rules and axioms from the first or- 
der propositional calculus including the struc- 
tural rules. The proof for one side of such a 
sequent is spelled out in figure 3. 
In general an inductive logic programming al- 
gorithm for the underlying representational the- 
ory can do the job of learning the concept; i.e. 
from the examples (DNF formulas) one can in- 
duce possible sequents, targeting on a 2-CNF 
sequent on the righthand side. The learning al- 
gorithm we present here is more specific and 
simply shows that an efficient algorithm for the 
proof search exists. 
The steps: 
1. Form the collection G of all 2-CNF 
clauses (p V q) 
2. do l times 
(a) 
(b) 
pick an example al A.-. Aam 
form the collection of all 
2-CNF clauses deducible from 
al A ... A am and intersect this 
collection with G resulting in 
a new C 
Correctness proof (outline): By (Ax), 
(RV), (Weak), (LA) and (Ex) we can proof 
that for any conjunction (i.e. example vector) 
al A ... A am we have for all 1 _< i < m and 
any b a clause of a 2-CNF in which ai occurs 
with b, hence having all clauses deducible from 
the vector proven individually enabling one to 
form the collection of all clauses deducible from 
a vector; i.e. 
al A ... Aam ~ ai Vb 
al A ... A am :::*" b V ai 
By (RA) and (Contr) we can proof the conjunc- 
tion of an arbitrary subset of all the clauses de- 
ducible from the vector, in particular all those 
clauses that happen to be common to all the 
vectors for each individual vector we have seen 
so far, hence proving the 2-CNF for every indi- 
vidual vector; i.e. 
al A .. • A am ~ clause1 A .. • A clausep 
179 
b ~ b (Ax) b =*- b (Ax) (av) (av) 
b ~ E V b (Weak) b =~ a V b (Weak) 
gg=t-E(Ax) (Rv) b,g~ggVb (L^) b,g~aVb (LA) 
EE=~,,.EVb:_ (Weak) bAE=*,ggVb bAg=~aVb (Sx) 
E,b~EVb_ (L^) ggAb~EVb (Ex) EAb=~aVb (RA) 
EAb~EVb (E A b), (E A b) =-~ (E V b) A (a V b) (a^) 
(E A b), (E A b), (E A b) ~ (E V b) A (E V b) A (a V b) (Contr) 
(EAb),(EAb)~(EVb)A(EVb)A(aVb) 
(~A b) =* (~Vb) A (EV b) A (a V b) (Contr) 
Figure 3: Proof to be found for boolean concept learning 
Now by (LV) we can prove the complete DNF 
to 2-CNF sequent; i.e. 
vector1 V • • • V vector/ ~ clause1 A • • • A clausep 
It is easy to see that for the above algorithm 
the same complexity analysis holds as for the 
Valiant algorithm, because we have the same 
progression in l steps, an the individual steps 
have constant overhead. 
4 PAC learning substructural logic 
When we transform the representational theory 
of the boolean concept learning framework to a 
substructural logic, we do the following: 
• eliminate the structural rules from the cal- 
culus of first order propositional logic 
When we want to translate the learnability re- 
sult of k-CNF expressible boolean concepts we 
need to do the same with the formal learning 
framework and the strategy (algorithm). In 
other words: 
• the learning framework will contain con- 
cepts that are sensitive to the features 
which were before abstracted by the struc- 
tural rules ('position' and 'arity' ) 
• the learning algorithm from above is no 
longer allowed to use the structural rules 
in its inductive steps. 
Below we present a learning algorithm for 
the substructural logic representational theory. 
Suppose again the universe U = {al,... ,an}, 
and the concept f is a CNF expressible concept 
for vectors of length m. 
1. start with m empty clauses (i.e. disjunction 
of zero literals) clause1,..., clausem 
2. do l times 
(a) pick an example al A... A am 
(b) for all 1 < i < m add ai to clause/ if 
ai does not occur in clause/. 
Correctness proof (outline): By (Ax) and 
(RV) we can proof for any ai that the sequent 
ai =-~ clause/for any clause/containing ai as one 
of its disjuncts, especially for a clause/contain- 
ing next to ai all the a~ from the former exam- 
ples. Then by (RA) and (LA) we can position 
all the vectors and clauses in the right-hand po- 
sition; i.e. 
al A ... A am ~ clause1 A -.. A clausem 
Hence justifying the adding of the literal ai of 
a vector in clausei. Now (LV) completes the 
sequent for all the example vectors; i.e. 
(al A... A am) V (a i A ... A aim ) V . . . 
clause1 A .-. A clausem 
For the algorithmic complexity in terms of 
PAC learning, suppose we want present exam- 
ples of concept f and that the algorithm learned 
concept ff in l steps. Concept ff then de- 
scribes a subset of concept f because on every 
position in the CNF formula contains a sub- 
set of the allowed variables; i.e. those vari- 
ables that have encountered in the examples 3. 
anote that the CNF formula's can only describe par- 
ticular sets of n-strings; namely those sets that are com- 
plete for varying symbols locally on the different posi- 
tions in the string. 
180 
~ ~ (Ax) 
~vb 
b ~ b (Ax) b ~ b (Ax) (RV) (RV) 
b~Vb b~aVb (at) 
b, b =* (~V) A (a V b) (RV) (LA) 
bAb~ (gVD) A(aVb) (at) 
~,b Ab ~ (~V b) A (~V b) A (a V b) 
gAbAb ~ (~Vb) A (~Vb) A (aVb) 
(LA) 
(EAEA a) V (gAEA b) V (gA bAa) V (EA bA b) V (bAEA a) 
V(bAEA b) V (bA bA a) V (bA bA b) ~ (gVb) A (gV b) A (a V b) 
(LV) 
Figure 4: Proof to be found for string pattern learning 
Now let e = P(fAf ~) be the error then again 
5 = (1 - e) TM is the confidence parameter as we 
have m positions in the string. By the same 
argument as for the Valiant algorithm we may 
conclude that e and 5 decrease exponentially in 
the number of examples l, meaning that we have 
an efficient polynomial time learning algorithm 
for arbitrary e and 5. 
5 Discussion 
We showed that the learnability result of 
Valiant for learning boolean concepts can be 
transformed to a learnability result for pat- 
tern s by looking at the transforma- 
tion of the underlying representational theories; 
i.e. looking at the transformation from clas- 
sical first order propositional logic (underlying 
the boolean concepts) to substructural first or- 
der propositional logic (underlying the pattern 
s). An interesting extension would be 
to look at the substructural concept  
that includes implication (instead of the CNF 
formula's only). A  that allows impli- 
cation coincides with the full Lambek calculus, 
and a learning algorithm and learnability result 
for this framework amounts to results for all lan- 
guages that can be described by context free 
grammars. This is subject to future research. 

References
P. Adriaans and E. de Haas. 1999. Grammar in- 
duction as substructural inductive logic program- 
ming. In Proceedings of the workshop on Learning 
Language in Logic (LLL99), pages 117-126, Bled, 
Slovenia, jun. 
P. Adriaans. 1992. Language Learning from a Cate- 
gorial Perspective. Ph.D. thesis, Universiteit van 
Amsterdam. Academisch proefschrift. 
J. Dunn. 1986. Relevance logic and entailment. In 
F. Guenthner D. Gabbay, editor, Handbook of 
Philosophical Logic III, pages 117-224. D. Reidel 
Publishing Company. 
M. Fowler. 1997. UML Distilled: Applying the Stan- 
dard Object Modeling Language. Addison Wesley 
Longman. 
J.-Y. Girard. 1987. Linear logic. Theoretical Com- 
purer Science, 50:1-102. 
V.N. Grishin. 1974. A non-standard logic, and its 
applications to set theory. In Studies in formal- 
ized s and nonclassical logics, pages 135- 
171. Nanka. 
Object Management Group OMG. 1997. Uml 1.1 
specification. OMG documents ad970802-ad0809. 
L.G. Valiant. 1984. Theory of the learnable. Comm. 
o/the ACM, 27:1134-1142. 
