ANTICIPATION-FREE DIAGNOSIS OF STRUCTURAL FAULTS 
Wolfgang Menzel 
Zentralinstitut für Sprachwissenschaft
Akademie der Wissenschaften der DDR
Prenzlauer Promenade 149-152
Berlin, 1100, DDR
Current attempts to diagnose grammatical
faults in natural language utterances are,
except for agreement errors and certain cases
of overgeneralization and interference,
strongly based on the principle of error
anticipation (cf. Yazdani 1988, Schwind 1988,
Catt 1988): rather tiny context-free grammars
are enhanced by some additional rules which
describe selected faulty structures and
invoke error messages if they are needed for
a successful parse. The effort required to
compile an at least approximately
comprehensive rule set even for simple
domains of grammar is considerable. Besides
this, it is the student who risks falling
into the remaining gap of neglected
possibilities, a gap which seems to be
difficult to avoid. Hopefully, this situation
can be improved by applying model-based
reasoning procedures, where an internal model
(of language correctness) is used to simulate
and evaluate error hypotheses by
investigating their consequences for other
parts of the model. To a certain degree the
diagnostic results are logically determined
by the correct remainder of the utterance,
and useful results require a balanced ratio
between correct and incorrect language use
within the student's solution.
Provided a correct and covering model can
be supplied for a limited domain, diagnosis
is guaranteed to be precise and robust enough,
and error anticipation may eventually be
renounced completely. For an efficient
practical implementation of the idea, a
predominantly data-driven procedure seems
preferable to a strictly hypothesis-driven
one. A procedure of this kind has been
pursued successfully in an earlier paper on
the diagnosis of agreement errors in fixed
syntactic environments (Menzel 1988). Quite
naturally, this success raises the question of
how much of the experience gathered can be
transferred to other types of grammatical
regularities, such as linear ordering
principles or dominance regularities.
Up to now the only notable exception to
the one-sided orientation towards error
anticipation has been a fail-soft technique
implemented in the error-sensitive parsing
system Linger (Barchan et al. 1986), an
approach which its authors later named "word
soup heuristics": whenever the normal parsing
process, based on a principally
anticipation-oriented context-free grammar,
fails, the system attempts to achieve a
successful parse by trying single word form
substitutions, insertions, deletions or
displacements. Although often very useful in
detecting simple flaws of the student, this
heuristic not infrequently produces rather
surprising and sometimes even funny
interpretations of the input data. Its main
drawback is the basic limitation to single
word form errors. Any extension to the
handling of complete constituents, desirable
as it may be, seems to be condemned to
failure for efficiency reasons: the whole
approach is basically expectation driven and
it opens up too vast a search space of
possible error hypotheses, where the
verification of even a single one is not a
trivial task.
I. MODEL-BASED DIAGNOSIS 
The intrinsic problem with the diagnosis
of structural errors is that it does not fit
easily into the standard paradigm of
model-based reasoning, which essentially
relies on two basic assumptions (Reiter
1987):
(1) A model always has an a priori given
number of elementary model components.
(2) The intercomponent connections of the
model are invariant and, likewise, given
a priori.
Accordingly, model-based diagnosis should
primarily be applicable to domains with a
fixed and known structure, which is typical
of e.g. electronic troubleshooting, the
origin of the approach. Provision can be made
for these premises to be fulfilled in
artificially limited domains of natural
language, e.g. for agreement errors (cf.
Menzel 1988). In more natural environments of
language production, however, they do not
hold. Parsing a natural language sentence is
first of all solving the task of structural
identification. Therefore, diagnosing
arbitrary syntactic errors in arbitrary
utterances may, from this point of view,
perhaps be compared with electronic
troubleshooting in a circuitry of obscure
function with at least partially unknown
components and partially invisible wiring,
under the additional assumption that there is
no possibility to vary the conditions of
measurement! It should go without saying that
such a task can only be solved in very
limited domains, relying on the strongest
possible (semantic) support from the
situational and sentential context. The final
goal, of course, should be the most
far-reaching possible integration of
structural identification and diagnosis.
Model-based diagnosis, especially for 
teaching purposes where comprehensible error 
explanations are desired, poses two addition- 
al constraints on the kind of model informa- 
tion to be used. Both conditions, if compared 
against usual parsing grammars, certainly are 
not a matter of course: 
(3) The model has to provide an extremely
reliable correct/incorrect distinction,
whereas traditional grammars, in the hope
that ungrammatical sentences will not
appear as input, massively rely on
overgeneration.
(4) An explicit representation based on com- 
prehensible generalizations has to be 
attempted for a maximum of regularities 
in the domain, in order to allow this 
information to be used immediately for 
explanation purposes. 
The latter condition in most cases definitely
rules out simple lists of alternative
solutions as a proper means of representing
model information. To code, for instance, word
order regularities as a list of possible 
permutations gives no sensible error explana- 
tion besides, say, "Your constituent order is 
not contained in the list of admissible con- 
stituent sequences". What is desired instead 
of this would be an explanation, based on 
explicit generalizations as in "The verbal 
group of German subordinate clauses has 
always to be placed in final position". 
A first attempt to make word order
regularities explicit has been made by using
the ID/LP format of GPSG (Gazdar et al. 1985).
For diagnostic purposes such explicitness is
necessary not only with respect to linear
order principles but also with respect to the
omissibility and combinability of categories.
It should result in a clear distinction
between a rather simple notion of dominance
rules and a comparatively rich set of various
constraints over dominance structures.
II. DOMINANCE STRUCTURES 
In its simplest case a dominance rule
A --> B1, B2, ..., Bn
states the ability of category A to dominate
all sequences of categories which are
arbitrary permutations of the list
L = (B1, B2, ..., Bn) or of any non-empty
sublist of L. According to this definition, a
dominance rule can easily be interpreted as a
disjunction of elementary and independent
(local) dominance relations dom[X, Y], and
each category in list L represents an
optional constituent:
or[dom[A, B1], dom[A, B2], ..., dom[A, Bn]],
or in a shorthand notation:
or[B1, B2, ..., Bn].
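
The set of sequences admitted by such an unconstrained rule can be
made concrete with a small sketch. It is only an illustration of the
definition above; the function name and the example rule are
assumptions chosen for this purpose, not part of the described system.

```python
from itertools import combinations, permutations


def licensed_sequences(daughters):
    """Yield every category sequence admitted by A --> B1, ..., Bn:
    all permutations of every non-empty sublist of the daughters."""
    for size in range(1, len(daughters) + 1):
        for sublist in combinations(daughters, size):
            yield from permutations(sublist)


# Hypothetical example rule: NP --> Det, Adj, Noun
np_sequences = set(licensed_sequences(["Det", "Adj", "Noun"]))
print(len(np_sequences))  # 3 + 6 + 6 = 15 sequences
```

Even for three daughters the unconstrained rule admits sequences such
as Noun Det, which shows why additional constraints are needed.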
If a highly precise and explicit
representation of dominance regularities
(according to conditions (3) and (4)) is aimed
at, this simple rule format is obviously not
sufficient. It does not even allow the usual
distinction between optional and obligatory 
elements in the list of dominated nodes, and 
especially for the purpose of model-based 
diagnosis a further refinement is inevitable. 
Obviously, a minimal formal base should
contain at least an explicit description of
the sometimes rather intricate compatibility
conditions between elementary dominance
relations, e.g. by means of propositional
expressions. To keep the diagnosis procedure
simple, the complexity of admissible
expressions has to be carefully restricted.
For a good number of cases a conjunctive
combination of elementary (usually binary)
expressions is already sufficient. Such
elementary expressions can then be
interpreted as additional constraints on the
simultaneous appearance of categories within
a constituent, in much the same way as
agreement and word order constraints restrict
the compatibility of inflected forms or the
sequencing of categories.
Most often needed are compatibility
constraints to describe an alternative (exor),
an implication (if) or an equivalence (iff)
of dominance relations. Additionally, a
dominance relation can be made obligatory if
it is simply specified as a single element in
the conjunction of constraints. Hence, by
choosing a sensible specification of
constraints, optionality or obligatoriness
can easily be expressed as special cases. In
the simple noun phrase
NP --> Det, Adj, Noun
the determiner and the noun can be indicated
as obligatory by adding the constraint
and[Det, Noun]
whereby the adjective remains optional.
A more ambitious example could be the
German local prepositional phrase PP
PP --> Prep-3, Prep-3-Det, Det, Adv, Adj, Noun
which allows, in addition to the usual dative
prepositions (Prep-3), the fusion of
preposition and determiner (Prep-3-Det), which
is very common not only in spoken German
("an" + "dem" = "am", "in" + "das" = "ins",
etc.). The additional constraints
and[exor[Prep-3, Prep-3-Det], or[Adj, Noun],
iff[Prep-3, Det], if[Adv, Adj]]
provide for a prepositional phrase to contain
one and only one preposition and exactly one
determiner, regardless of whether the two are
fused or not. Adjective and noun are each
optional (but not both at once), and the
admissibility of an attributive adverb
depends on the existence of the modified
adjective.
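
The constraint notation can be read as ordinary propositional logic
over the set of categories found in a constituent. The following
sketch evaluates the PP constraints in that way. The nested-tuple
encoding and the names `holds` and `PP_CONSTRAINTS` are assumptions
made for illustration; the source does not specify a data structure.

```python
def holds(constraint, present):
    """Evaluate a constraint against the set of categories present."""
    if isinstance(constraint, str):        # elementary dominance relation
        return constraint in present
    op, *args = constraint
    values = [holds(arg, present) for arg in args]
    if op == "and":
        return all(values)
    if op == "or":
        return any(values)
    if op == "exor":                       # exactly one alternative
        return sum(values) == 1
    if op == "if":                         # first implies second
        return (not values[0]) or values[1]
    if op == "iff":                        # both or neither
        return values[0] == values[1]
    raise ValueError(f"unknown operator: {op}")


PP_CONSTRAINTS = (
    "and",
    ("exor", "Prep-3", "Prep-3-Det"),
    ("or", "Adj", "Noun"),
    ("iff", "Prep-3", "Det"),
    ("if", "Adv", "Adj"),
)

print(holds(PP_CONSTRAINTS, {"Prep-3", "Det", "Noun"}))          # True
print(holds(PP_CONSTRAINTS, {"Prep-3-Det", "Noun"}))             # True
print(holds(PP_CONSTRAINTS, {"Prep-3-Det", "Det", "Noun"}))      # False
```

The third call fails on iff[Prep-3, Det]: a fused form already
contains the determiner, so a separate one is not admitted.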
To describe the omissibility of dominated
nodes (e.g. for the determiner), arbitrary
elementary logical conditions (e.g. for the
presence or absence of certain semantic
features) may be included in the set of
constraints.
This simple formal framework certainly is 
not sufficient to write complex grammars. 
Nevertheless, it can serve to build tiny 
(but non-trivial) specialized grammars cover- 
ing e.g. simple types of main or subordinate 
clauses, extended noun phrases with left 
and/or right attributes etc. which then meet 
the rather strong preconditions for an appli- 
cation of model-based diagnosis techniques. 
III. DIAGNOSIS 
According to the strong bias within the 
descriptive framework towards consistency 
constraints, the bulk of student errors will 
have to be diagnosed as consistency viola- 
tions. For that purpose a constraint propaga- 
tion procedure based on constraint retraction 
or, logically stronger, constraint negation 
can be used. It is this kind of procedure by 
which agreement errors earlier have been 
tackled successfully. Now it turns out that 
linear ordering principles can be handled in 
a quite similar way. The only serious dis- 
tinction is the origin of factual informa- 
tion: Whereas for agreement it is taken from 
the dictionary (morpho-syntactic features), 
for word order it is given as a position 
number in the input sequence chosen by the 
student. 
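
How position numbers can serve as factual information for word order
constraints may be sketched as follows. The encoding of ordering
principles as pairwise precedence statements is an assumption in the
spirit of the ID/LP format, not the procedure actually used; the
sketch further assumes that each category occurs at most once.

```python
def order_violations(sequence, precedence):
    """Return the precedence pairs (a, b), read as 'a before b',
    that the student's sequence violates."""
    # Position number of each category in the input sequence.
    position = {cat: i for i, cat in enumerate(sequence, start=1)}
    return [
        (a, b)
        for a, b in precedence
        if a in position and b in position and position[a] > position[b]
    ]


# Hypothetical ordering principles for a simple noun phrase.
NP_ORDER = [("Det", "Adj"), ("Adj", "Noun"), ("Det", "Noun")]

print(order_violations(["Adj", "Det", "Noun"], NP_ORDER))
# [('Det', 'Adj')]
```

Each violated pair names an explicit generalization and can therefore
be turned directly into an explanation for the student.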
Mutual constituent incompatibility and the
omission of obligatory constituents are
diagnosed as violations of the above
mentioned constraints. Constraint negation as
the basic diagnostic technique is a
comparatively simple procedure in the case of
e.g. disjunction and implication. In both
cases the reason for the constraint violation
is unique. More attention is required by e.g.
the violation of an alternative, where two
cases (which result in two different
explanation variants) have to be properly
distinguished: none of the required
categories has been detected vs. both
categories appear simultaneously.
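
The difference between the unique violation of a disjunction or an
implication and the two violation cases of an alternative can be
sketched as follows. The function name, the tuple encoding and the
wording of the messages are assumptions for illustration; only
elementary binary constraints over categories are covered.

```python
def explain(constraint, present):
    """Return an explanation if a binary constraint is violated,
    otherwise None."""
    op, a, b = constraint
    has_a, has_b = a in present, b in present
    if op == "or" and not (has_a or has_b):
        return f"At least one of {a} and {b} is required."
    if op == "if" and has_a and not has_b:
        return f"{a} may only be used together with {b}."
    # An alternative can be violated in two different ways.
    if op == "exor" and not (has_a or has_b):
        return f"Either {a} or {b} is required, but neither was found."
    if op == "exor" and has_a and has_b:
        return f"{a} and {b} exclude each other, but both were found."
    return None


alternative = ("exor", "Prep-3", "Prep-3-Det")
print(explain(alternative, {"Det", "Noun"}))
print(explain(alternative, {"Prep-3", "Prep-3-Det", "Noun"}))
```

The two calls yield the two explanation variants for the same
constraint, depending on which way it has been violated.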
Combinatorial problems arise out of the
transitivity property of some constraints.
This is typical for agreement constraints and 
in most cases it rules out a local decision 
upon a particular error hypothesis. Addition- 
al difficulties arise out of (legal or ille- 
gal) constituent omissions, where constraint 
propagation has to take into consideration a 
(locally limited) transitive closure of con- 
straint relations. 
IV. PARSING AS CONSTRAINT SATISFACTION? 
Shifting information from traditional
syntactic rules into additional constraints
makes parsing an increasingly difficult
enterprise. Valuable information usually used
to reduce the search space has been lost.
With the basic definition of dominance as
a disjunctive combination of potential
dominance relations, a grammar can be
interpreted as an OR-tree, and structural
identification (parsing) becomes a procedure
of attaching all the categories occurring in
the input sequence to corresponding leaves of
the grammar tree. This, of course, restricts
practical solutions to finite trees, i.e.
nonrecursive dominance relations. For
nonrecursive relations the search space
becomes finite but nevertheless remains
extremely large. Even for very small grammars
many combinations of category attachments
exist, each of which stands for a separate
constraint satisfaction problem, the solution
of which, under normal circumstances, again
requires a combinatorial procedure.
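
The growth of the number of attachments can be illustrated with a
small sketch. It assumes that the leaves of a finite grammar tree are
given as (constituent, category) pairs and that each leaf can take at
most one input category; the names and the example grammar are
hypothetical.

```python
from itertools import product


def attachments(tokens, leaves):
    """Yield every way of attaching the input categories to distinct
    leaves of the grammar tree that carry the same category."""
    options = [[leaf for leaf in leaves if leaf[1] == tok] for tok in tokens]
    for choice in product(*options):
        if len(set(choice)) == len(choice):  # each leaf used at most once
            yield choice


# Hypothetical clause with a subject NP and an object NP.
LEAVES = [
    ("SubjNP", "Det"), ("SubjNP", "Adj"), ("SubjNP", "Noun"),
    ("ObjNP", "Det"), ("ObjNP", "Adj"), ("ObjNP", "Noun"),
]

readings = list(attachments(["Det", "Noun", "Det", "Noun"], LEAVES))
print(len(readings))  # 4
```

Four input categories already yield four candidate attachments, each
of which would have to be passed to the consistency check.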
Hence, a further drastic reduction of the
search space has to be achieved by means of
different heuristics:
1. Certain compatibility constraints which
are unlikely to be violated by the student
(e.g. two prepositions in a single PP) can
be made implicit. In that case, they
cannot be violated and consequently cannot
be used for explanatory purposes. They can,
however, well be used to exclude senseless
category attachments.
Since this heuristic often applies to
clearly alternative dominance relations,
the tree of dominance possibilities gets
an implicit or/exor-structure. A very
similar technique can be applied for
optional subordinated constituents, where
the subordinated constituent should be
accepted only if at least some part of
the dominating constituent has been
identified.
2. Exclusion of useless permutations during
category attachment to the grammar tree by
maximising a locality measure. This
heuristic fails systematically in the case
of certain embedded constituents, where it
prefers to attach e.g. a determiner to the
embedded noun phrase instead of assigning
it to the more distant noun.
3. Reuse of partial results guided by a 
clustering of functionally different con- 
stituents according to their structural 
equivalence, e.g. NPs, PPs etc. 
4. Data driven best first analysis. 
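
The second and fourth heuristics can be illustrated together by
ranking candidate attachments according to a locality measure. The
measure used here, the summed distance between the first and last
input position attached to each constituent, is only one conceivable
choice and an assumption of this sketch; the source does not define
its measure.

```python
def spread(attachment):
    """Sum, over all constituents, of the distance between the first
    and last input position attached to that constituent.
    A small spread corresponds to a high locality."""
    positions = {}
    for index, (constituent, _category) in enumerate(attachment):
        positions.setdefault(constituent, []).append(index)
    return sum(max(p) - min(p) for p in positions.values())


def best_first(candidates):
    """Order candidate attachments so that the most local come first."""
    return sorted(candidates, key=spread)


local = (("SubjNP", "Det"), ("SubjNP", "Noun"),
         ("ObjNP", "Det"), ("ObjNP", "Noun"))
crossing = (("SubjNP", "Det"), ("ObjNP", "Noun"),
            ("ObjNP", "Det"), ("SubjNP", "Noun"))

print(spread(local), spread(crossing))  # 2 4
print(best_first([crossing, local])[0] == local)  # True
```

A measure of this kind shares the weakness noted for the second
heuristic: with embedded constituents the most local attachment is
not necessarily the intended one.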
For simple grammars (of the above mentioned 
complexity) these heuristics usually reduce 
the ambiguity of attachment to only a few 
readings which remain to be passed to the 
consistency check. Since even very simple
grammars are often quite sufficient for
language learning purposes, a kind of
restricted but nevertheless useful parsing
system, mainly based on constraint
satisfaction techniques, can indeed be
devised. This at least makes it possible to
soften considerably the strong
limitations of the approach in Menzel (1988),
which are imposed by the restriction to fixed 
syntactic environments. Considering however 
the rapidly growing search space required for 
more complex models, an extension of the 
approach to the level of a universal grammar 
obviously is not feasible. 
As a result of the limits of a simple
model, a few types of errors cannot be
diagnosed as constraint violations but have to
be detected already during the procedure of
category attachment. In particular this
concerns the detection of superfluous forms,
e.g. the use of two finite verbs within a
single sentence. But generally the preference
for an insertion as error hypothesis is
rather low. The suspicion of having
misinterpreted the student's intentions
should be much more justified. Such
substitutions of categories result in a
combination of a category insertion and
omission at the same place in the utterance.
If a single word form is concerned, this
constellation sometimes indicates a wrong
application of word formation rules or
inflectional schemes which results in an
unintended category. E.g. "kochen" (to cook)
is a finite verb, but in the sentence "*Die
Kochen gehen nach Hause" (approximately: *The
cookings are going home) it has to be
interpreted as a mistaken plural of "Koch"
(the cook). Diagnosis infers this from the
missing noun of the subject and the
superfluous verb, supported by the unusual
capitalization of the supposed verb.

References

Barchan, J., Woodmansee, B. and Yazdani, M.
(1986) A PROLOG-based Tool for French
Grammar Analysis. In: Instructional
Science, vol. 14, p. 21-48.

Catt, M. E. (1988) Intelligent Diagnosis of
Ungrammaticality in Computer-Assisted 
Language Instruction, Technical Report 
CSRI-218, Computer Systems Research 
Institute, University of Toronto. 

Gazdar, G., Klein, E., Pullum, G. K., Sag, I.
A. (1985) Generalized Phrase Structure 
Grammar, Oxford. 

Menzel, W. (1988) Error Diagnosing and
Selection in a Training System for Second
Language Learning, Proc. 12th Coling 88,
Budapest: 414-419.

Reiter, R. (1987) A Theory of Diagnosis from
First Principles, Artificial
Intelligence, vol. 32, no. 1: 57-95.

Schwind, C. B. (1988) Sensitive Parsing:
Error Analysis and Explanation in an
Intelligent Language Tutoring System, Proc.
Coling 88, Budapest: 608-613.

Yazdani, M. (1988) Language Tutoring with 
PROLOG, Papers of the International 
Workshop on Intelligent Tutoring Systems 
for Second Language Learning, Trieste: 150-155. 
