Two-level Description of Turkish Morphology 
Kemal Oflazer 
Department of Computer Engineering and Information Science 
Bilkent University, Ankara, Turkey 
Fax: (90-4) 266 4127 e-maih ko@trbilun.bitnet 
1 Introduction 
This poster paper describes a full scale two-level mor- 
phological description (Karttunen, 1983, Kosken- 
niemi, 1983) of Turkish word structures. The 
description has been implemented using the PC- 
KIMMO environment (Antworth, 1990) and is based 
on a root word lexicon of about 23,000 roots words. 
Almost all the special cases of and exceptions to 
phonological and morphological rules have been im- 
plemented. 
Turkish is an agglutinative language with word 
structures formed by productive affixations of deriva- 
tional and inflectional suffixes to root words. Turkish 
has finite-state but nevertheless rather complex mor- 
photactics. Morphemes added to a root word or a 
stem can convert the word from a nominal to a ver- 
bal structure or vice-versa, or can create adverbial 
constructs. The surface realizations of morphologi- 
cal constructions are constrained and modified by a 
number of phonetic rules such as vowel harmony. 
2 Two-level description of Turkish 
morphology 
The phonetic rules of contemporary Turkish have 
been encoded using 22 two-level rules while the mor- 
photactics of the agglutinative word structures has 
been encoded as finite-state machines for verbal, 
nominal paradigms. Our lexicons are based on the 
comprehensive word list that we have compiled for 
our spelling checker developed earlier (Solak and 
Oflazer, 1992). We have lexicons for nouns, ad- 
jectives verbs, compound nouns, proper nouns, pro- 
nouns, adverbs, connectives, exclamations, postposi- 
tions, acronyms, technical words, special cases, There 
are total of 18,500 nominal (nouns + adjectives) 
roots and about 2,450 verbal roots. There are about 
100 lexicons for suffixes. 
3 Example Output 
Here we provide a sample output from our imple- 
mentation (slightly edited for proper orthography): 
Input 
Morpheme Struct. 
cah~manm cah~-I-mA+Hn 
+nHn 
~:a h,~-I-mA-I-nH n 
G lOSS 
English meaning 
V(¢ah~)+VtoN(ma)+2PS-POS+GEN ot your =ork(mg) 
V(¢aI,~)-t-VtoN(ma)+GEN 
o/the work(ing) 
N(¢ocuk)+3PS-POS 
his/her child 
~OCU~U ~ocuk+sH 
¢ocuk+yH 
ahnml~ 
al+Hn+ymH~ 
al+nHn+ymH,~ 
al$m+ymH~ 
al-t-Hn-l-mH~ 
al-I-Hn-l-mH~ 
ahn-t-mH~ 
ahn-l-mH~ 
boynu boy$un+sH 
boy$un+yH 
N(¢ocuk)+ACC 
child (accusative) 
N(al)+2PS-POS-I-NtoVO-I-NARR+3PS 
(it) was your red (one) N(al)+GEN+NtoVO+NARR+3PS 
(it) belongs to the red (one) N(al,n)+NtoVO+NARR+3PS 
(it) was a forehead V(al)+PASS+VtoAdj(mis) 
(a) taken (object) V(al)+PASS+NARR+3PS 
it was taken V(ahn)+VtoAdj(mis) 
Can) offended (person) V(ahn)+NARR+3PS 
s/he was offended 
N(boyun)+3PS-POS 
(his/her) neck N(boyun)+ACC 
neck (accusative) 
4 Conclusions 
This poster has presented a summary of the first 
full scale implementation of two-level description of 
Turkish morphology. We have been using this de- 
scription as a morphological parsing module in a 
number of applications like LFG parsing, ATN pars- 
ing and semantics analysis of Turkish sentences. 
References 
\[Antworth, 1990\] Evan L. Antworth. PC-KIMMO: 
A two-level processor for Morphological Analysis. 
Summer Institute of Linguistics, Dallas, Texas, 
1990. 
\[Karttunen, 1983\] Lauri Karttunen. KIMMO: A 
general morphological processor. Texas Linguis- 
tic Forum, 22:163- 186, 1983. 
\[Koskenniemi, 1983\] Kimmo Koskenniemi. Two- 
level morphology: A general computational model 
for word form recognition and production. Publi- 
cation No: 11, Department of General Linguistics, 
University of Helsin, 1983. 
\[Solak and Oflazer, 1992\] Ay§m Solak and Kemal 
Oflazer. Parsing agglutinative word structures and 
its application to spelling checking for Turkish. In 
Proceedings of the 15 th International Conference 
on Computational Linguistics, volume 1, pages 39 
- 45, Nantes, France, 1992. International Commi- 
tee on Computational Linguistics. 
472 
