Proceedings of the Ninth International Workshop on Parsing Technologies (IWPT), pages 188–189,
Vancouver, October 2005. c©2005 Association for Computational Linguistics
The Quick Check Pre-unification Filter for Typed Grammars: Extensions
Liviu Ciortuz
CS Department, University of Ias¸i, Romania
ciortuz@infoiasi.ro
The so called quick check (henceforth QC) pre-
unification filter for feature structure (FS) unifica-
tion was introduced by (Kiefer et al., 1999). QC is
considered the most important speed-up technique
in the framework of non-compiled FS unification.
We present two potential ways in which the design
of the quick check can be further extended: con-
sistency sort check on coreferenced paths, and pre-
unification type-checking. We analyse the effect of
these extensions on LinGO, the large-scale HPSG
grammar for English (Flickinger et al., 2000) using
the compiler system LIGHT (Ciortuz, 2002).
1 Coreferenced Based Quick Check
Suppose that the FS ϕ is going to be unified with
ψ, and that ϕ contains the coreference ϕ.pi .= ϕ.piprime.
In this setup, if for a certain path η it happens that
sort(ϕ.(piη)) ∧ sort(ψ.(piη)) ∧ sort(ψ.(piprimeη)) = ⊥,
then certainly ϕ and ψ are not unifiable. There is
no a priori reason why, on certain typed grammars,
coreference-based sort inconsistency would not be
more effective in ruling out FS unification than sort
inconsistency on mutual paths. Moreover, the in-
tegration of the two forms of QC is not compli-
cated. However, up to our knowledge no system
parsing LinGO-like grammars included the above
newly presented form of (coreference-based) pre-
unification QC test.
On the GR-reduced form LinGO (Ciortuz, 2004)
we identified 12 pairs of non-cross argument coref-
erences inside rule arguments (at LinGO’s source
level). Interestingly enough, all these coreferences
occur inside key arguments, belonging to only 8 (out
of the total of 61) rules in LinGO.
To perform coreference-based QC, we computed
the closure of this set Λ of coreference paths. The
closure of Λ will be denoted ¯Λ. If the pair pi1 and pi2
is in Λ, then together with it will be included in ¯Λ all
pairs of QC-paths such that pi1η and pi2η, where η
is a feature path (a common suffix to the two newly
selected paths). For the GR-reduced form of LinGO,
the closure of Λ defined as above amounted to 38
pairs. It is on these pairs of paths that we performed
the coreference-based QC test.
Using all these coreference paths pairs, 70,581
unification failures (out of a total of 2,912,623 at-
tempted unifications) were detected on the CSLI test
suite. Only 364 of these failures were not detectable
through classical QC. When measuring the “sensi-
tivity” of coreferenced-based QC to individual rule
arguments, we found that out of a total of 91 rule
arguments in LinGO only for 4 rule arguments the
coreference-based QC detects inconsistencies, and
the number of these inconsistencies is far lower than
those detected by the classical QC on the same ar-
guments. None of the pairs of coreference paths ex-
hibited a higher failure detection rate than the first
ranked 32 QC-paths. If one would work with 42 QC-
paths, then only 4 of the pairs of coreference paths
would score failure detection frequencies that would
qualify them to be taken into consideration for the
(extended form of) QC-test.
As a conclusion, it is clear that for LinGO, run-
ning the coreference-based QC test is virtually of
no use. For other grammars (or other applications
involving FS unification), one may come to a dif-
ferent conclusion, if the use of non-cross argument
coreferences balances (or outnumbers) that of cross-
188
argument coreferences.
2 Type Checking Based Quick Check
Failure of run-time type checking — the third po-
tential source of inconsistency when unifying two
typed FSs — is in general not so easily/efficiently
detectable at pre-unification time, because this check
requires calling a type consistency check routine
which is much more expensive than the simple sort
consistency operation.
While exploring the possibility to filter unifica-
tion failures due to type-checking, the measurements
we did using LinGO (the GR-reduced form) on the
CSLI test suite resulted in the following facts:
1. Only 137 types out of all 5235 (non-rule and non-
lexical) types in LinGO were involved in (either suc-
cessful or failed) type-checks.1 Of these types, only
29 types were leading to type checking failure.2
2. Without using QC, 449,779 unification fail-
ures were due to type-checking on abstract instruc-
tions, namely on intersects sort; type-checking on
test feature acts in fact as type unfolding. When
the first 32 QC-paths (from the GR-set of paths)
were used (as standard), that number of failures went
down to 92,447. And when using all 132 QC-paths
(which have been detected on the non GR-reduced
form of LinGO), it remained close to the preceding
figure: 86,841.
3. For QC on 32 paths, we counted that failed type-
checking at intersect sort occurs only on 14 GR-
paths. Of these paths, only 9 produced more than
1000 failures, only 4 produced more than 10,000
failures and finally, for only one GR-path the num-
ber of failed type-checks exceeded 20,000.
The numbers given at the above second point sug-
gest that when trying to extend the ‘classical’ form
of QC towards finding all/most of failures, a consid-
erably high number of type inconsistencies will re-
main in the FSs produced during parsing, even when
we use all (GR-paths as) QC-paths. Most of these in-
consistencies are due to failed type-checking. And
as shown, neither the classical QC nor its exten-
sion to (non-cross argument) coreference-based QC
is able to detect these inconsistencies.
1For 32 QC-paths: 122 types, for 132 QC-paths: also 122.
2For 32 QC-paths and 132 QC-paths: 24 and 22 respectively.
s = GRϕ[ i ] ∧ GRψ[ i ];
if snegationslash= GRϕ[ i ] and snegationslash= GRψ[ i ] and
ϕ.pij or ψ.pij is defined for
a certain non-empty path pij = piipi,
an extension of pii,
such that pij is a QC-path,
then
if Ψ(s).pi∧ GRψ[i] = ⊥, where
a Ψ(s) is the type corresponding to s,
or type-checking ψ.pii with Ψ(s) fails
then ϕ and ψ do not unify.
Figure 1: The core of a type-checking specialised
compiled QC sub-procedure.
The first and third points from above say that in
parsing the CSLI test suite with LinGO, the failures
due to type checking tend to agglomerate on cer-
tain paths. But due to the fact that type-checking
is usually time-costly, our conclusion, like in the
case of non-cross argument coreference-based QC,
is that extending the classical QC by doing a cer-
tain amount of type-checking at pre-unification time
is not likely to improve significantly the unification
(and parsing) performances on LinGO.
For another type-unification grammar one can ex-
tend (or replace) the classical QC test with a type-
check QC filter procedure. Basically, after identify-
ing the set of paths (and types) which most probably
cause failure during type-checking, that procedure
works as shown in Figure 1.

References
L. Ciortuz. 2002. LIGHT — a constraint language and
compiler system for typed-unification grammars. KI-
2002: Advances in Artificial Intelligence. M. Jarke, J.
Koehler and G. Lakemeyer (eds.), pp. 3–17. Springer-
Verlag, vol. 2479.
L. Ciortuz. 2004. On two classes of feature paths in
large-scale unification grammars. Recent Advances in
Parsing Technologies. H. Bunt, J. carroll and G. Satta
(eds.). Kluwer Academic Publishers.
D. Flickinger, A. Copestake and I. Sag. 2000. HPSG
analysis of English. Verbmobil: Foundations of
speech-to-speech translation. Wolfgang Wahlster
(ed.), pp. 254–263. Springer-Verlag.
B. Kiefer, H-U. Krieger, J. Carroll and R. Malouf. 1999.
A bag of useful techniques for efficient and robust
parsing. Proceedings of the 37th annual meeting of the
Association for Computational Linguistics, pp. 473–
480.
