Deep Processing of Honorification Phenomena in a Typed Feature
Structure Grammar
Jong-Bok Kim
School of English
Kyung Hee U.
Seoul, 130-701
jongbok@khu.ac.kr
Peter Sells
Dept. of Linguistics
Stanford U.
Stanford, CA 94305
sells@stanford.edu
Jaehyung Yang
School of Computer Eng.
Kangnam U.
Kyunggi, 449-702, Korea
jhyang@kangnam.ac.kr
Abstract
Honorific agreement is one of the main
properties of languages like Korean or
Japanese, playing an important role
in appropriate communication. This
makes the deep processing of honorific
information crucial in various computa-
tional applications such as spoken lan-
guage translation and generation. We
argue that, contrary to the previous lit-
erature, an adequate analysis of Ko-
rean honorification involves a system
that has access not only to morpho-
syntax but to semantics and pragmatics
as well. Along these lines, we have de-
veloped a typed feature structure gram-
mar of Korean (based on the frame-
work of HPSG), and implemented it
in the Linguistic Knowledge Builder
System (LKB). The results of parsing
our experimental test suites show that
our grammar provides us with enriched
grammatical information that can lead
to the development of a robust dialogue
system for the language.
1 Basic Properties of Honorific
Agreement
Honorification, one of the main features of spoken
language in Korean, plays a key role in proper and
successful verbal communication (Chang 1996,
Lee 1998, Sohn 1999). The Korean honorific sys-
tem basically requires that when the subject is in
the honorific form (usually with the marker -nim),
the predicate also be inflected with the honorific
form -(u)si as in (1):1
(1) a. sensayng-nim-i wus-usi-ess-e.
teacher-HON-NOM laugh-HON-PST-DECL
‘The teacher laughed.’
b.#sensayng-nim-i wus-ess-e.
This type of agreement is often assumed to be
purely pragmatic, mainly because certain contexts
allow disagreeing cases between the subject and
the verb: the utterance of (1)b can be felicitous
when the speaker does not honor the referent of
the subject (marked by #). The possibility of hav-
ing such disagreement has often to an assumption
in the literature that using the -nim and -si form
of verbs is a matter of gradience and appropriate-
ness rather than grammaticality (cf. Chang 1996,
Pollard and Sag 1994, Lee 1998).
However, one often neglected fact is that this
agreement constraint must be observed when the
subject is non-human as in (2) (cf. Sohn 1999):
(2) a. cha-ka o-(*si)-ess-e.
cha-NOM come-HON-PST-DECL
‘The car came.’
b. kwukhoy-ka pepan-ul simuy-ha-(*si)-ess-e.
congress bill review-HON-PST-DECL
‘The congress reviewed the bill.’
If we rely only on pragmatic information, we
would have difficulty understanding why, in con-
trast to the disagreement in (1)b, disagreement
like that in (2) are rarely found in real language
usages.
1Abbreviations we use in the paper include ARG
(ARGUMENT), ACC (Accusative), BAKGR (BACK-
GROUND), COMP (Complementizer), CTXT (CON-
TEXT), DECL (Declarative), HON (Honorific), IMPER
(Imperative), NOM (Nominative), ORTH (ORTHOGRA-
PHY), PST (Past), SYN (SYNTAX), SEM (SEMANTICS),
RELS (RELATIONS), and POS (part of speech).
97
In addition, there exist agreement-sensitive
syntactic phenomena such as auxiliary verb con-
structions:
(3) a. sensayng-nim-i nolay-lul
teacher-HON-NOM song-ACC
pwulu-si-ci anh-(usi)-ess-e.
sing-HON-COMP not-HON-PST-DECL
‘The teacher did not sing a song.’
b. sensayng-nim-i ton-ul mo-(*si)-e
teacher-NOM money-ACC save-HON-COMP
twu-si-ess-e.
hold-HON-PST-DECL
‘The teacher saved money (for rainy days).’
c. sensayng-nim-i nolay-lul
teacher-HON-NOM song-ACC
pwulu-si-na po-(*si)-e.
sing-HON-COMP seem-HON-DECL
‘The teacher seems to sing a song.’
As noted here, even though the subject is hon-
ored in each case, the honorific marker on the
main predicate in (3)a is optional with the aux-
iliary anh- ‘not’; in (3)b the marker must appear
only on the auxiliary verb twu- ‘hold’; meanwhile
in (3)c the marker cannot appear on the auxiliary
po ‘seem’. Such clear contrasts, we can hardly
attribute to pragmatic factors.2
2 Honorification in a Typed Feature
Structure Grammar
A closer look at the honorific phenomena of the
language in the previous section suggests that an
adequate theory of honorification aiming for inte-
gration into a proper communication system re-
quires not just complex pragmatic information
but also morpho-syntactic information. The ba-
sic framework of the grammar we adopt for mod-
elling the language is the type-feature structure
grammar of HPSG. HPSG seeks to model human
languages as systems of constraints on typed fea-
ture structures. In particular, the grammar adopts
the mechanism of a type hierarchy in which ev-
ery linguistic sign is typed with appropriate con-
straints and hierarchically organized. This system
then allows us to express cross-classifying gen-
eralizations about linguistic entities such as lex-
emes, stems, words, and phrases in the language
(cf. Kim and Yang 2004, Kim 2004).
2In addition to this subject-verb agreement, the language
employs addressee agreement marked on a verbal suffix de-
pending on the honoring relationship between speaker and
addressee. Such agreement, though implemented in our
grammar, is not presented here because of space limit here.
2.1 Lexicon and Subject Agreement
Our grammar, named KPSG (Korean Phrase
Structure Grammar), first assumes that a nominal
with -nim and a verbal with -si bear the head fea-
ture specification [HON +]. This is supported by
the contrast in the following:
(4) a. [[haksayng-i manna-n] sensayng-nim-i]
student-NOM meet-MOD teacher-HON-NOM
o-si-ess-e.
come-HON-PST-DECL
‘The teacher that the student met came.’
b. [[sensayng-nim-i manna-si-n]
teacher-NOM-NOM meet-HON-MOD
haksayng-i] o-(*si)-ess-e.
student-NOM come-HON-PST-DECL
‘The student that the teacher met came.’
As seen here, it is the honorific information on the
head noun sensayng-nim in (4)a that agrees with
that of the verb.
With this head feature information, the gram-
mar builds the honorific nominal type (n-hon)
from the basic lexeme (n-lxm) as represented in
the following feature structures:3
(5) a.2
66
66
66
66
66
66
4
n-lxm
ORTH hsensayng i ‘teacher’
SYNjHEAD
•POS noun
HON boolean
‚
SEM 2
2
66
4
INDEX i
RELS
*"
PRED teacher-rel
INSTANCE i
#+
3
77
5
3
77
77
77
77
77
77
5
b.
2
66
66
64
n-hon
ORTH hsensayng-nim i ‘teacher-HON’
SYNjHEAD
•POS noun
HON +
‚
SEM 2
3
77
77
75
As seen in (5)a, a nominal lexeme with no hon-
orific marker -nim is underspecified for the HON
feature.4
Meanwhile, the subject of an honorific verbal
element carries the feature [HON +] in addition
to the relevant pragmatic information:
3The information our grammar encodes for such lexeme
entries is only the shaded part: all the other information is
inherited from its supertypes defined in the grammar. For
a more comprehensive system of morphology built within
such a system, see Kim (2004) and Kim and Yang (2004).
4The boxed number here is used as a way of showing that
semantic value of the lexeme, n-lxm is identical with that of
the honorific noun n-hon.
98
(6)
a.
2
66
66
66
66
4
v-lxm
ORTH 1
SYNjHEAD
•POS verb
HON boolean
‚
ARG-ST
D
NP£INDEX i⁄, . . .
E
SEM 2
3
77
77
77
77
5
b.
2
66
66
66
66
66
66
66
66
66
64
v-hon
ORTH 1 + si
SYNjHEAD
"
POS verb
HON +
#
ARG-ST
*
NP
"
HON +
INDEX i
#
, . . .
+
CTXT
2
66
64
C-INDICESjSPEAKER p
BAKGR
*2
4
PRED honoring
ARG1 p
ARG2 i
3
5
+
3
77
75
3
77
77
77
77
77
77
77
77
77
75
The basic verbal lexeme type v-lxm in (6)a does
not carry any restriction on its subject. However,
as given in (6)b, the v-hon type with the -(u)si suf-
fix adds the information that its subject (the first
element in the ARG-ST (argument structure)) is
[HON +], in addition to the information that the
speaker is honoring the subject referent as given
in the CTXT value.
One of the key points in this system is that even
though the [HON +] verb selects a [HON +] sub-
ject, the subject of a nonhonorific verb can be ei-
ther in the honorific or nonhonorific form since its
value is underspecified with respect to the verb.
This then correctly allows disagreeing examples
like (1)b where the subject is [HON +] and the
verb’s HON value is ‘boolean’:
(7) sensayng-nim-i wuc-ess-e. ‘The teacher laughed.’
The nonhonorific verb combines with the hon-
orific subject with no honoring intention from the
speaker since the nonhonorific verb does not bear
the pragmatic constraint that the speaker honors
the referent of the subject.
Yet the grammar blocks disagreeing cases like
(2) where an honorific verb combines a non-
honorific subject:
(2) a. *cha-ka o-si-ess-ta. ‘The car came.’
b. *kwukhoy-ka ku pepan-ul simuy-ha-si-ess-e.
‘The congress reviewed the bill.’
These are simply not parsed since the honorific
verb here would combine with the [HON ¡] sub-
ject, violating the constraint in (6)b. A noun
like sensayng ‘teacher’ is [HON boolean], while
sensayng-nim is [HON +], and most nouns are
[HON ¡].
2.2 Object and Oblique Agreement
While subject honorification has a productive suf-
fixal expression, there are some lexically sup-
pletive forms like poyp-e ‘see.HON-DECL’ and
mosi-e ‘take.HON-DECL’, which require their
object to be in the honorific form:
(8) a. *John-i Mary-lul poyp-ess-e.
John-NOM Mary-ACC see.HON-PST-DECL
‘John honorably saw Mary.’
b. John-i sensayng-nim-ul poyp-ess-e.
John-NOM teacher-HON-ACC
‘John honorably saw the teacher.’
Our grammar lexically specifies that these supple-
tive verbs require the object to be [HON +] to-
gether with the pragmatic honoring relation. The
following is the lexical information that a supple-
tive verb like this accumulates from the inheri-
tance hierarchy:
(9) 2
66
66
66
66
66
66
66
64
v-lxm
ORTH hpoyp-i ‘HON.see’
SYNjHEAD 1 [HON +]
ARG-ST
*
NP[INDEX i], NP
"
HON +
INDEX j
#+
SEM see-rel
CTXT
2
64BAKGR
*2
4
PRED honoring
ARG1 i
ARG2 j
3
5
+3
75
3
77
77
77
77
77
77
77
75
Such lexical information can easily block exam-
ples like (8)a where the object is [HON ¡].
Lexically suppletive forms like tuli-e
‘give.HON-DECL’ and yeccup-e ‘ask.HON-
DECL’ require their oblique argument to be in
the HON form (nonhonorific forms are cwu-e
and mwut-e, respectively):
(10) a. John-i sensayng-nim-eykey senmwul-ul
John-NOM teacher-HON-DAT present-ACC
tuli-ess-e.
give.HON-PST-DECL
‘John gave the present to the teacher.’
b. *John-i haksayng-eykey senmwul-ul tuli-ess-e.
Just like object agreement, our grammar assigns
the HON restriction on its dative argument to-
gether with the pragmatic honoring constraint:
99
(11) 2
66
66
66
66
66
64
v-lxm
SYNjHEADjHON +
ARG-ST
*
[INDEX i], [ ],
"
HON +
INDEX k
#+
CTXTjBAKGR
*2
4
PRED honoring
ARG1 i
ARG2 k
3
5
+
3
77
77
77
77
77
75
Once again the grammar rules out examples like
(10)b in which the dative argument haksayng-
eykey ‘student-DAT’ is nonhonorific. How-
ever, nothing blocks the grammar from gener-
ating examples like (12) where the dative argu-
ment sensayng-nim-eykey ‘teacher-HON-DAT’ is
[HON +] even if the verb cwu- ‘give’ is in the
nonhonorific (unspecified) form:
(12) John-i sensayng-nim-eykey senmwul-ul cwu-ess-e.
2.3 Multiple Honorification
Given this system, we can easily predict that it
is possible to have multiple honorific examples
in which subject agreement cooccurs with object
agreement:
(13) ape-nim-i sensayng-nim-ul
father-HON-NOM teacher-HON-ACC
poyp-(usi)-ess-e.
HON.see-HON-PST-DECL
‘The father saw the teacher.’
The honorific suffix -si on the verb here requires
the subject to be [HON +] whereas the supple-
tive verb stem asks its object to be [HON +]. In
such examples, the honorific marker in the verb
can be optional or the verb can even be replaced
by the nonsuppletive form po- ‘seem’. However,
the grammar does not generate cases like the fol-
lowing:
(14) a. *John-i sensayng-nim-ul
John-NOM teacher-HON-ACC
poyp-usi-ess-e.
HON.see-HON-PST-DECL
‘John saw the teacher.’
b. *ape-nim-i John-ul
HON.see-HON-PST-DECL
poyp-ess-e.
HON.see-HON-PST-DECL
‘The father saw John.’
(14)a is ruled out since the HON form -(u)si re-
quires the subject to be [HON +] whereas (14)b is
ruled out since the suppletive form poyp- selects
a [HON +] object.
We also can see that oblique agreement can oc-
cur together with subject agreement:
(15) a. eme-nim-i sensayng-nim-eykey
mother-HON-NOM teacher-HON-DAT
senmwul-ul tuli-si-ess-e.
present-ACC give.HON-PST-DECL
‘Mother gave the teacher a present.
b.#eme-nim-i sensayng-nim-eykey senmwul-ul
tuli-ess-e.
c.#eme-nim-i sensayng-nim-eykey senmwul-ul
cwu-(si)-ess-e.
d. *John-i sensayng-nim-eykey senmwul-ul tuli-si-
ess-e.
e. *eme-nim-i John-eykey senmwul-ul tuli-si-ess-e.
Since the nonhonorific verb places no restriction
on the subject, the grammar allows the disagree-
ment in (15)b and c. However, (15)d and (15)e
cannot be generated: the former violates subject
agreement and the latter violates object agree-
ment.
2.4 Agreement in Auxiliary Constructions
The present honorification system in the KPSG
can offer us a streamlined way of explaining
the agreement in auxiliary verb constructions we
noted in section 1.1. Basically there are three
types of auxiliaries with respect to agreement (see
Sells 1998):
Type I: In the construction with auxiliary verbs
like anh- ‘not’, when the subject is in the hon-
orific form, the honorific suffix -si can optionally
appear either on the preceding main verb or on the
auxiliary verb or on both:
(16) a. sensayng-nim-i o-si-ci
teacher-NOM come-HON-COMP
anh-usi-ess-e.
not.HON-PST-DECL
‘The teacher did not come.’
b. sensayng-nim-i o-si-ci anh-ess-e.
c. sensayng-nim-i o-ci anh-usi-ess-e.
d.#sensayng-nim-i o-ci anh-ess-e .
Type II: When the head auxiliary verb is one
like po- ‘try’, twu- ‘hold’, and ci- ‘become’, sub-
ject honorification occurs only on the auxiliary
verb. That is, the preceding main verb with the
specific COMP suffix form -a/e cannot have the
honorific suffix -si:
(17) a. *sensayng-nim-i John-ul cap-usi-e
teacher-NOM John-ACC catch-HON-COMP
twu-si-ess-e.
do.for.the.future
‘The teacher hold John for future.’
b. sensayng-nim-i John-ul cap-a twu-si-ess-e.
c. *sensayng-nim-i John-ul cap-usi-e twu-ass-e.
d. sensayng-nim-i John-ul cap-a twu-ass-e.
100
Type III: Unlike Type II, auxiliary verbs like
po- ‘see’ and kath- ‘seem’ cannot have the hon-
orific suffix -si even if the subject is in the hon-
orific form:
(18) a. *sensayng-nim-i chayk-ul ilk-usi-na
teacher-NOM book-ACC read-HON-COMP
po-si-ta.
seem-DECL
‘The teacher seems to read a book.’
b. sensayng-nim-i chayk-ul ilk-usi-na po-ta.
c.#sensayng-nim-i chayk-ul ilk-na po-ta.
d. *sensayng-nim-i chayk-ul ilk-usi-na po-si-ta.
First, the agreement in Type I simply follows
from the general assumption that this kind of aux-
iliary verbs acts like a raising verb whose subject
is identical with that of the main verb:
(19) a. 2
66
66
66
66
4
aux-v
ORTH hanh-ai ‘not-DECL’
SYNjHEAD jAUX +
ARG-ST
*
1 , 2
"LEX +
ARG-ST
D
1 , . . .
E
#+
SEM not-rel
3
77
77
77
77
5
b. 2
66
66
66
66
4
aux-hon-v
ORTH hanh-usi-ei ‘not-HON-DECL’
SYNjHEAD
•AUX +
HON +
‚
ARG-ST
D
1 [HON +] , 2
E
SEM not-rel
3
77
77
77
77
5
The negative auxiliary verb with or without the
-(u)si suffix selects as its arguments a subject and
a lexical complement whose subject is identical
with the auxiliary’s subject. This means when ei-
ther one of the verbs requires an HON subject,
then the combination of the main verb as a com-
plex predicate will also require an HON subject.5
The absence of the HON on the main verb for
the Type II AUX is due to the language’s mor-
phological constraints. Such an auxiliary verb
forms a verbal complex together with a main
verb that bears the COMP suffix -a/e: this suf-
fix morphologically requires its verb stem to have
no honorific -(u)si (cf. Kim and Yang 2004).
This morphological constraint can be attested by
the fact that suppletive honorific form with no
5This treatment assumes that the auxiliary verb combines
with the preceding (main or auxiliary) verb and forms a com-
plex predicate. See Kim and Yang (2004) for this line of
treatment.
productively-formed -si marking can occur in the
Type II construction:6
(20) a. sensayng-nim-i sakwa-lul tusi-e
teacher-NOM apple-ACC HON.eat-COMP
po-si-ess-e.
try-HON-PST-DECL
‘The teacher tried to eat the apple.’
b. sensayng-nim-i chayk-ul ilk-(*usi)-e
teacher-NOM book-ACC read-HON-COMP
po-si-ess-e.
try-HON-PST-DECL
‘The teacher tried to read the book.’
Within the grammar we developed where each
specific verb stem has its own type constraint, the
stem value of the COMP suffix -a/e must be a verb
lexeme with no suffix -si.
As for the Type III AUX, the grammar needs
to rely on semantics: AUX verbs like po- ‘seem’
and kath- ‘seem’ select propositions as their se-
mantic argument:
(21) 2
66
66
66
66
66
66
4
hpo-i ‘see’
SYNjHEAD
•AUX +
HON ¡
‚
ARG-ST hS[INDEX s2]i
SEM
2
66
64
INDEX s1
RELS
*2
4
PRED seem-rel
ARG0 s1
ARG1 s2
3
5
+
3
77
75
3
77
77
77
77
77
77
5
The honoring relation applies not to a proposition
but to a human individual: it is such a seman-
tic property that places a restriction on the HON
value of the auxiliary verb.
3 Testing the Feasibility of the Analysis
In testing the performance and feasibility of the
grammar, we implemented our grammar in the
LKB (Linguistic Knowledge Building) system
(cf. Copestake 2002). The test suites we used con-
sist of the SERI Test Suites ’97 (Sung and Jang
1997), the Sejong Corpus, and sentences from
the literature on honorification. The SERI Test
Suites (Sung and Jang 1997), designed to evalu-
ate the performance of Korean syntactic parsers,
6The verb in Korean cannot be an independent word
without inflectional suffixes. The suffixes cannot be attached
arbitrarily to a stem or word, but need to observe a regular
fixed order. Reflecting this, the verbal morphology has tra-
ditionally been assumed to be templatic:
(i) V-base + (Passive/Causative) + (HON) + (TENSE)
+ MOOD
101
consists of total 472 sentences (292 test sentences
representing the core phenomena of the language
and 180 sentences representing different types of
predicate). Meanwhile, the Sejong Corpus has
179,082 sentences with about 2 million words.
We randomly selected 200 simple sentences (the
average number of words in each sentence is
about 5) from the corpus. These sentences are
classified according to their honorification types
(agreement target £ predicate) and the ratio of
parsed sentences:7
(22) (target) # of # Parsed
(predicate) Sentences Sentences
nonHON (tgt)
£ 514 (76.4%) 455 (88.5%)
nonHON (pred)
HON (tgt)
£ 64 (9.5%) 58 (90%)
HON (pred)
HON (tgt)
£ 90 (13.3%) 82 (91%)
nonHON (pred)
nonHON (tgt)
£ 4 (0.05%) 0 (0%)
HON (pred)
Total 672 595 (88.5%)
In addition to these sentences, we selected 100
sentences (including the ones given in the paper)
from the literature on Korean honorification: 51
sentences with -si marked verbs, 31 with auxiliary
verb constructions, and 18 with suppletive verb
forms. We obtained similar results: the grammar
parsed a total of 96 sentences.
Among the total of 691 parsed sentences, we
checked the meaning representations (minimal re-
cursion semantics: MRS) and the pragmatic rep-
resentations of 100 randomly selected sentences,
and could see that the representations contain the
correct information that the grammar is designed
for. We believe that the enriched deep process-
ing of grammatical honorific information that the
grammar successfully composed in the parsing
process can well function for the proper under-
standing of natural data.
4 Conclusion
Honorification, one of the most salient features of
the language, involves various grammatical levels
7The four nonHON £ HON sentences are cases where
the nominals are not in the honorific form. One way to ac-
cept such examples is to remove the [HON +] restriction on
the object of such verbs while keeping the pragmatic honor-
ing relationship between the subject and object.
of information: morphology, syntax, semantics,
and pragmatics. It is thus necessary for a parser
to have not only shallow but also deep process-
ing of the honorific information, so that we can
check that a given sentence is felicitous. Such
deep processing is a prerequisite to the success of
dialogue processing, zero pronominal/anaphoric
resolution, and so forth.
The grammatical architecture we adopt is a
typed feature structure grammar, couched upon
HPSG, that allows us to handle morpho-syntactic,
semantic, and also pragmatic information. The
implementation of this grammar in the LKB sys-
tem proves that a type-feature structure grammar
can provide us with a proper deep processing
mechanism for Korean honorification that opens
doors for promising applications in such areas as
machine translation and dialogue systems.
References
Chang, Suk-Jin. 1996. Korean. Amsterdam:
John Benjamins.
Copestake, Ann. 2002. Implementing Typed Fea-
ture Structure Grammars. Stanford: CSLI
Publications.
Kim, Jong-Bok and Jaehyung Yang. 2004. Pro-
jections from Morphology to Syntax in the
Korean Resource Grammar: Implementing
Typed Feature Structures. In Lecture Notes
in Computer Science Vol. 2945: 13–24.
Springer-Verlag.
Kim, Jong-Bok. 2004. Korean Phrase Struc-
ture Grammar (In Korean). Seoul: Hankwuk
Publishing.
Lee, Dong-Young. 1998. Information-based pro-
cessing of Korean dialogue with reference to
English. Seoul: Thaehak Publishing.
Pollard, Carl and Sag, Ivan A. 1994. Head-
Driven Phrase Structure Grammar.
Chicago: University of Chicago Press.
Sells, Peter. 1998. Structural Relationships
within complex predicates. In B.-S. Park and
J. Yoon (eds.), The 11th International Con-
ference on Korean Linguistics. Seoul: Han-
kwuk Publishing, 115–147.
Sohn, Ho-Min. 1999. The Korean Language.
Cambridge: Cambridge University Press.
Sung, Won-Kyung and Myung-Gil Jang. 1997.
SERI Test Suites ‘95. In Proceedings of
the Conference on Hanguel and Korean Lan-
guage Information Processing.
102
