Identifying Syntactic Role of Antecedent in Korean Relative 
Clause Using Corpus and Thesaurus Information 
Hui-Feng Li, Jong-Hyeok Lee, Geunbae Lee 
Department of Computer Science and Engineering 
Pohang University of Science and Technology 
San 31 Hyoja-dong, Nam-gu, Pohang 790-784, Republic of Korea 
hflee@madonna.postech.ac.kr, {jhlee, gblee)@postech.ac.kr 
Abstract 
This paper describes an approach to identify- 
ing the syntactic role of an antecedent in a Ko- 
rean relative clause, which is essential to struc- 
tural disambiguation and semantic analysis. In 
a learning phase, linguistic knowledge such as 
conceptual co-occurrence patterns and syntac- 
tic role distribution of antecedents is extracted 
from a large-scale corpus. Then, in an appli- 
cation phase, the extracted knowledge is ap- 
plied in determining the correct syntactic role 
of an antecedent in relative clauses. Unlike pre- 
vious research based on co-occurrence patterns 
at the lexical level, we represent co-occurrence 
patterns with concept types in a thesaurus. In 
an experiment, the proposed method showed a 
high accuracy rate of 90.4% in resolving am- 
biguitie s of syntactic role determination of an- 
tecedents. 
1 Introduction 
A relative clause is the one that modifies an an- 
tecedent in a sentence. To determine the syn- 
tactic role of the antecedent in a verb argu- 
ment structure of relative clause is important in 
parsing and structural disambiguation(Li et al., 
1998). While applying case frames of a verb for 
structural disambiguation, identifying the role 
of antecedent will affect the correctness of struc- 
tural disambiguation impressively. 
In this paper, we will describe a method of 
identifying the syntactic role of antecedents, 
which consists of two phases. First, in the 
learning phase, conceptual patterns (CPs) and 
syntactic role distribution of antecedents are 
extracted from a corpus of 6 million words, 
the Korean Language Information Base (KLIB). 
The conceptual patterns reflect the possible case 
restriction of a verb with concept types, while 
the syntactic role distribution shows the prefer- 
ence of syntactic role of antecedents of a verb. 
Second, in the application phase, the syntactic 
role of an antecedent is decided using CPs and 
the syntactic role distribution. 
In regards to the rest of this paper, Section 
2 will review the problems and related work. 
Section 3 will describe a statistical approach 
of conceptual pattern extraction from a large 
corpus as knowledge for determining syntactic 
roles. Section 4 will describe how to identify 
syntactic roles using conceptual patterns and 
syntactic role distribution of antecedents in the 
corpus. Section 5 will then present an experi- 
mental evaluation of the method. The last sec- 
tion makes a conclusion with some discussion. 
The Yale Romanization is used to represent Ko- 
rean expressions. 
2 Problems and Related Work 
In English, it is possible to recognize the syntac- 
tic role of antecedents by their position (trace) 
in relative clauses and the valency information 
of verbs. For example, the syntactic role of an 
antecedent man can be recognized as subject of 
the relative clause in a sentence "He is the man 
who lives next door" and as object in a sen- 
tence "He is the man whom I met." The rela- 
tive pronouns such as who, whom, that, whose, 
and which can also be used in identifying the 
role of antecedents in relative clauses. 
However, it is not a trivial work to identify 
the syntactic role of antecedents in Korean rel- 
ative clauses. Korean is such a head final lan- 
guage that the antecedent comes after the rel- 
ative clause. The rest of this section will de- 
scribe three main characteristics of Korean rel- 
ative clauses that make it difficult to determine 
the syntactic role of their antecedents. The first 
characteristic is that unlike English, Korean 
lacks relative words corresponding to English 
756 
SOT." 
-.,...°°,.=,..,°.°. °o°o,o.op 
Figure 1: Syntactic dependency tree for (1) 
relative pronouns. Instead, an adnominal verb 
ending follows its verb stem of a relative clause 
modifying an antecedent. The adnominal verb 
ending does not provide any information about 
the syntactic role of antecedent. For example, 
the relative clause kang-eyse hulu- (flow in a 
river) in sentence (1) modifies the antecedent 
mwul- (water), while adnominal verb ending - 
nun provides no clue about the syntactic role of 
the antecedent mwul (water). Figure 1 shows 
the syntactic dependency tree (SDT) of sen- 
tence (1). We need to decide the syntactic role 
of the antecedent mwul- (water) in the argu- 
ment structure of the verb hulu- (flow) when 
applying case frames of the verb for structural 
disambiguation. The dependency parser (Lee, 
1995) only gives the syntactic relation mod be- 
tween them, which should be regarded as subject 
in the relative clause. 
(1) nanun kang-eyse hulu-nun mwul-lul poatt- 
ta. 
(I saw water that flowed in a river.) 
As the second characteristic, the syntac- 
tic role of an antecedent cannot be determined 
by word order. This is because Korean is a rel- 
atively free word-order language like Japanese, 
Russian, or Finnish, and also because some ar- 
guments of a verb may be frequently omitted. 
In sentence (2), for example, the verb of rela- 
tive clause nolay-lul pwulless-ten (where \[I\] sang 
a song \[at the place\]) have two arguments \[I\] 
and \[place\] omitted. Thus, the antecedent kos- 
(place) might be identified as subject or adver- 
bial in the relative clause. 
~B 
I' I 
Figure 2: System architecture 
(2) nolay-lul pwulless-ten kos-ey na-nun kass- 
ta. 
(I went to the place where \[I\] sang a song 
\[at the place\].) 
The third characteristic of Korean relative 
clauses is that the case particle of an antecedent, 
that indicates the syntactic role in the relative 
clause, is omitted during relativization. In fact, 
in a relatively free-word order language, the case 
particles are very important to the syntactic role 
determination. 
Due to lack of syntactic clues, it is very dif- 
ficult to construct general rules for identify- 
ing the syntactic role of antecendents. Thus, 
the corpus-based method has been prefered 
to the rule-based one in solving the prob- 
lem of syntactic role determination in Korean 
relative clauses. Yang and Kim (1993) pro- 
posed a corpus-based method, where, for each 
noun/verb pair, its word co-occurrence and sub- 
categorization scores are extracted at lexical 
level. Park and Kim (1997) described a method 
of semantic role determination of antecedents 
using verbal patterns and statistic information 
from a corpus. These word co-occurrence pat- 
terns are all at lexical-level, so we have to con- 
struct a large amount of word co-occurrence 
patterns and statistical information before ap- 
plying to a real large-scale problem. Actually, 
the system performance mainly relies on the do- 
main of application, the number of word co- 
occurrence patterns extracted, and the size of 
corpus. 
757 
In the following sections, we will describe 
an approach to acquiring statistical information 
at conceptual level rather than at lexical level 
from a corpus using conceptual hierarchy in the 
Kadokawa thesaurus titled New Synonym Dic- 
tionary (Ohno and Hamanishi, 1981), and also 
describe a method of syntactic role determina- 
tion using the extracted knowledge. The system 
architecture is shown in Figure 2. 
3 Extraction of Statistic Information 
from Corpus 
First, for each of 100 verbs selected by order of 
frequency in the KLIB (Korean Language In- 
formation Base) corpus of 6 million words, its 
syntactic relational patterns (SRPs) of the form 
(Noun, Syntactic relation, Verb) are extracted 
from the corpus. Then, the nominal words in 
the SRPs are substituted with their correspond- 
ing concept codes at level 4 of the Kadokawa 
thesaurus. A nominal word may have multi- 
ple meanings such as C1,C2, ..., Cn. However, 
since we cannot determine which meaning of 
the nominal word is used in a SRP, we uni- 1 
formly add n to the frequency of each concept 
code. Through this processing, the syntactic 
relational pattern (SRP) changes into the con- 
ceptual frequency pattern (CFP), ({< C1, fl > 
,< C2, f2 >,...,< Crn, fm >},SRj,Vk), where 
Ci represents a concept code at level four of the 
Kadokawa thesaurus, fi indicates the frequency 
of the code Ci, and SRj shows a syntactic rela- 
tion between these concept codes and verb Vk. 
These patterns are then generalized by a con- 
cept type filter into more abstract conceptual 
patterns (CPs), {({el, C2, ..., Cn}, SRj, Vk)ll < 
j < 5, 1 _< k < 100}. Unlike in CFPs, the con- 
cept code in the more generalized CPs may be 
not only at level four (denoted as L4), but also 
at level three (L3) and two (L2). In addition 
to the CPs, we also extract the syntactic role 
distributiion of antecedents. 
3.1 Retrieving Syntactic Relational 
Patterns from Corpus 
Unlike the conventional parsing problem whose 
main goal is to completely analyze a whole sen- 
tence, the extraction of syntactic relational pat- 
terns (SRPs) aims to partially analyze sentences 
and thus to get the syntactic relations between 
nominals and verbs. For this, we designed a 
partial parser, the analysis result of which is 
obviously not as precise as that of a full-parser. 
However, it can provide much useful informa- 
tion. For the set of 100 verbs, a total of 282,216 
syntactic relational patterns (SRPs) was ex- 
tracted from the KLIB corpus. During the gen- 
eralization step, the problematic patterns are 
filtered out. 
In Korean, the syntactic relation of nominal 
words toward a verb is mainly determined by 
case particles. During the extraction of SRPs 
(Ni, SRj,Vk), we only consider the syntactic 
relation SRjs determined by 5 types of case 
particles: nominative (-i/ka/kkeyse), accusative 
(-ul/lul), and three adverbial (-ey/eynun, se/eyse/eysenun, -to/ulo/ulonun). 
3.2 Conceptual Pattern Extraction 
3.2.1 Thesaurus Hierarchy 
For the purpose of type generalization of nom- 
inal words in SRPs, the Kadokawa thesaurus 
titled New Synonym Dictionary (Ohno and 
Hamanishi, 1981) is used, which has a four-level 
hierarchy with about 1,000 semantic classes. 
Each class of upper three levels is further di- 
vided into 10 subclasses, and is encoded with a 
unique number. For example, the class 'station- 
ary' at level three is encoded with the number 
96 and classified into ten subclasses, Figure 3 
shows the structure of the Kadokawa thesaurus. 
To assign the concept code of Kadokawa 
thesaurus to Korean words, we take advan- 
tage of the existing Japanese-Korean bilingual 
dictionary (JKBD) that was developed for a 
Japanese-Korean MT system called COBALT- 
J/K. The bilingual dictionary contains more 
than 120,000 words, the meaning of which is en- 
coded with the concept codes that are at level 
four in the Kadokawa thesaurus. Thus, Korean 
words in the SRPs are automatically assigned 
their corresponding concept codes of level four 
through JKBD. 
3.2.2 Principle of Generalization 
We encoded the nouns in SRPs extracted by the 
parser with concept codes from the Kadokawa 
thesaurus, and examined histograms of the fre- 
quency of concept codes. We observed that the 
frequency of codes for different syntactic rela- 
tions of a verb showed very different distribution 
shapes. This means that we could use the dis- 
tribution of concept codes, together with their 
frequencies as clues for conceptual pattern ex- 
758 
concept 
I I I i I I I I I i I 
• I : ;J ~ s 6 ~ I • 
' I I I t I I I I I 1 I "i I I I I I I I I 
o~ (~1 e~z ~ u~ oss qt~6 w'9 isl O~9 ~o 9~1 9~1 9~ ~4 I ~S6 ~ 9Sa 9~ 
Figure 3: Concept hierarchy of Kadokawa the- 
saurus 
traction. From the histograms of codes of both 
subject and object relational patterns for the 
verb ttena-ta (leave), we observed that concept 
codes about human (codes from 500 to 599) ap- 
pear most frequently in the role of subject, and 
codes of position (from 100 to 109), codes of 
place (from 700 to 709) and codes of building 
(from 940 to 949) appear most often in the role 
of object. 
For each verb Vk, we first analyzed the co- 
occurrence frequencies fi of concept codes Ci 
of noun N, and then computed an average fre- 
quency fave,t and standard deviation at around 
lave,t, at level g (denoted as Lt) of the con- 
cept hierarchy. We then replaced fi with its 
associated z-score k$,e. k$,e is the strength of 
code frequency f at Lt, and represents the 
standard deviation above the average of fre- 
quency fave,t. Referring to Smadja's definition 
(Smadja, 1993), the standard deviation at at 
Lt and strength kf,t of the code frequencies are 
defined as shown in formulas 1 and 2. 
nt 2 :_fow,t) 
at = V nt - 1 (1) 
k$,,,,t = fi,t - fave,t (2) 
at 
where fi,t is the frequency of concept code Ci at 
Lt of Kadokawa thesaurus, fave,t is the average 
frequency of codes at Lt, nt is the number of 
concept codes at Lt. 
3.2.3 Code Generalization 
The standard deviation at at Lt characterizes 
the shape of the distribution of code frequen- 
Level Threshold of standard deviation O'OT l Threshold of 
subj obj advl adv2 adv3 Strength ko,t 
L4 2.0 8.0 0.5 0.1 0.9 k0,4=4.0 
L3 6.0 16.0 1.5 2.0 2.0 k0,3=l.0 
L2 30.0 50.0 15.0 4.0 10.0 ko,2=-0.60 
Table 1: Thresholds of the filter 
cies. If al is small, then the shape of the his- 
togram will tend to be flat, which means that 
each concept code can be used equally as an ar- 
gument of a verb with syntactic role SRi. If 
at is large, it means that there is one or more 
codes that tend to be peaks in the histogram, 
and the corresponding nouns for these concept 
codes are likely to be used as arguments of a 
verb. The filter in our system selects the pat- 
terns that have a variation larger than threshold 
a0,t, and pulls out the concept codes that have a 
strength of frequency larger than threshold k0,l. 
If the value of the variation is small, than we 
can assume there is no peak frequency for the 
nouns. The patterns that are produced by the 
filter should represent the concept types of ex- 
tracted words that appear most frequently as 
syntactic role SRi with verb Vk. 
We later analyzed the distribution of fre- 
quency f/ in CFPjs to produce an aver- 
age frequency fave,t and standard deviation 
at. Through experimentation, we decided 
the threshold of standard deviation a0,t and 
strength of frequency k0,t as shown in Table 1. 
The lower the value of threshold k0,t is assigned, 
the more concept codes can be extracted as 
conceptual patterns from the CFPs. We main- 
tained a balance between extracting conceptual 
codes at low levels of the conceptual hierar- 
chy for the specific usage of concept type and 
extracting general concept types for enhancing 
overall system performance. These values may 
be variable in different application. 
In Table 2, we enlist the concept types that 
have more than 5 appearances in the CFP of 
verb ttena-ta (leave). The strength of frequen- 
cies for generalization is calculated with formula 
2. 
1 - 0.932 
kl,4 = 2.82513 = 0.024 
759 
code code code code code l code l (freq.) (freq.) (freq.) (freq.) (freq.) (freq.) 
061(10) 086(7) 117(5) 118(7) 158(5) 160(5) 
179(5) 324(5) 410(12) 411(14) 430(16) 436(5) 
480(7) 481(8) 482(9) 500(23) 501(31) 503(31) 
507(35 508(30) 511(11) 513(8) 514(8) 515(5) 
516(5) 519(6) 521(15) 522(19) 523(10) 525(7) 
530(5) 535(6) 540(15) 550(7) 572(8) 576(9) 
580(7) 581(7) 590(8) 591(5) 595(12) 814(9) 
822(5) 828(5) 830(5) 833(7) 941(8) 997(7) 
998(6) other(427) 
* No. of codes: n 4 = 932 
* Average freq.: fa,.,e,4 = 932/1000 = 0.932 
* Standard deviation: a t = 2.821530 
* 'other' in the table means the total freq. of nouns less than 5 
* The numbers in brackets are the frequencies of code appearance 
Table 2: Concept types and frequencies in CFP 
({< Ci, fi >},subj,ttena-ta) 
12 - 0.932 
k12,4 -- 2.82513 - 3.9176 
14 - 0.932 
k14,4 - 2.82513 - 4.626 
Since the value of k0,4 is set at 4.0, as shown 
in Table 1, the concept codes with frequencies 
of more than 13, as the equation for k14,4 shows, 
are selected as generalized concept types at L4. 
After abstraction at L4, the system performs 
generalization at L3. It removes selected fre- 
quencies, such as frequency 14 of code 411 in 
Table 2, and sums up the frequencies of the re- 
maining concept codes to form the frequency 
of higher level group. For example, the system 
removes the frequency for code 411 from the 
group {410(12), 411(14), 412(3), 413(0), 414(0), 
415(0), 416(1), 417(0), 415(0), 419(0)}, then 
sums up the frequencies of the remaining codes 
for a more abstract code of 41. The frequency 
of code 41 then becomes 16. Through this pro- 
cess, the system performs a generalization at L3 
for the more abstract types of the concept. The 
system calculates ae and strength Kf,e, selects 
the most promising codes, and stores concep- 
tual patterns ({C1, C2, C3, ...}, SRj, Vk) as the 
knowledge source for syntactic role determina- 
tion in real texts, where concept type Ci is cre- 
ated by the generalization procedure. After gen- 
eralization of the CFP patterns for the subject 
role of the verb ttena-ta (leave), the produced 
conceptual patterns are: ({411,430, 500, ..., 06, 
11, ..., 99, 1}, subj, ttena-ta). 
3.3 Syntactic Role Distribution of 
Antecedents 
In (Yang et al., 1993), they defined subcatego- 
rization score (SS) of a verb considering the verb 
argument structure in a corpus. They asserted 
that the SS of a verb represents how likely a verb 
might have a specific grammatical complement. 
We observed from analyzing the corpus that 
we cannot infer the syntactic roles of an- 
tecedents from subcategorization scores since 
the syntactic role distribution of verb arguments 
in a corpus is so different from the syntactic role 
distribution of antecedents due to the property 
of free word language. In Korean, an argument 
of a verb could be omitted, and so the subcat- 
egorization score don't provide possible trend 
of the role of antecedent in many cases. For 
example, 26.8% of arguments of the verb ttena- 
ta (leave) are used as subjects, and 54.4% are 
used as objects, but 74.41% of antecedents of 
the verb are of subject role, and 6.9% are of 
object role. 
Although the distribution of antecedents is 
necessary to our task, we cannot automatically 
retrieve the syntactic role distribution of them 
from the corpus. We extracted relative clauses 
for specific verbs from the corpus, and then 
counted the number of syntactic roles of the 
antecedents manually by language trained peo- 
ple. Since there are about 200 to 500 relative 
clauses for each verb in the corpus, it is possible 
to check this information. This information is 
represented by relative score RSk(SRi) of syn- 
tactic role SRi for antecedents of verb Vk as is 
shown bellow and is used in syntactic role de- 
termination as described in section 4: 
RSk(SRi)- freqk(SRi) (3) 
freq(Vk) 
where freq(Vk) are the frequency of verb Vk 
of relative clauses, and freqk(SRi) is the fre- 
quency of syntactic role SRi of antecedents in 
relative clauses including verb Vk in the corpus. 
4 Identifying Deep Syntactic 
Relation 
While determining syntactic relation for an- 
tecedents of relative clauses, the system checks 
the argument structure of the verb in a rela- 
tive clause first, and then records the empty 
(or omitted) arguments of the verb in relative 
760 
2*2 is-a 2*2 is-a 2* I is-a 
4+2 penalty(l.O) 2+3 penalty(0.5) 4+2 penahy(0.5) 
Figure 4: Conceptual similarity computation 
Syntactic No. of Percentage Accuracy 
relation appearances (%) (%) 
subject 1,087 
object 
adverb(-ey) 
adverb(-eyse) 
adverb(do) 
total 
431 
121 
19 
114 
1,772 
61.34% 
24.32% 
6.82% 
1.08% 
6.44% 
100% 
90% 
92% 
89% 
92% 
89% 
90.4% 
Table 3: The test results of syntactic role deter- 
mination for antecedents 
clause referring to the verb valency information. 
The antecedent that the verb phrase is modify- 
ing can be one of these empty arguments. 
An antecedent (a noun) usually has one 
or more meanings, which causes ambigu- 
ity in determining the correct syntactic re- 
lation between the antecedent and a verb. 
We assume that an antecedent has meanings 
C1, C2, C3, ..., Cn, and that CPi is a conceptual 
pattern ({P1, P2, ..., Pro}, SRi, Vk) correspond- 
ing to syntactic relation SP~ of verb Vk. The 
evaluation score SIMi (Np, Vk) of an antecedent 
Np that can be syntactic role SRi with verb Vk 
is defined as formula 4, and conceptual similar- 
ity Csim(Cw, Pj) between concept Cw and Pj 
as formula 5. 
SIMI(Np, Vk) = rnax(Csirn(Cw,Pj)) 1 < w < n, 1 ~ j ~_ m 
(4) 
Csim(Cw, Pj ) 2 * level(MSCA(Cw, Pj )) = • ispenalty (5) 
level( Cw ) + level( Pj ) 
where MSCA(Cw, Pj) in Csim(Cw, Pj) rep- 
resents the most specific common ancestor 
(MSCA) of concepts Cw and Pj in the 
Kadokawa concept hierarchy. Level(Cw) refers 
to the depth of concept Cw from the root node in 
the concept hierarchy. Is_a Penalty is a weight 
factor reflecting that Cw as a descendant of Pj 
is preferable to other cases. Conceptual simi- 
larity computation with formula 5 is shown in 
Figure 4. 
Based on these definitions, the syntactic re- 
lation SRj between antecedent Np and verb Vk 
can be calculated as follows: 
1. Let R = {SP~\[SRi is a syntactic relation 
of an empty (or omitted) argument in the 
relative clause of Irk, 1 < i < 5}. 
2. For each conceptual pattern CPi of verb Vk 
of which SRi is in R, and for each concept 
code Pi in CPi, compute SIMi(Np, Vk). 
3. Determine the syntactic relation of an- 
tecedent Np to SRj on the condition that 
SIMj(Np, Vk) has the largest value in 
{SIMi(Np, Vk)\[1 < i < 5} and SRj in R. 
If two or more SIMi(Np, Vk) have the same 
value, decide syntactic role referring to the 
higher relative score RSk(SRi) of the syn- 
tactic role of the verb Vk. 
Here, syntactic relation can be one of subj, 
obj, advl, adv2, and adv3. The symbols advl, 
adv2, and adv3 represent adverbs with case par- 
ticles -ey, -eyse, and -lo, respectively. 
5 Experimental Evaluation 
An informal way to evaluate the correctness of 
syntactic relation determination is to have an 
expert examine the test patterns and source 
sentences that the patterns appears, and give 
his/her judgment about the correctness of the 
results produced by the system. In our exper- 
iment, the correctness of syntactic and concep- 
tual relation determination was evaluated man- 
ually by humans who were well trained in de- 
pendency syntax. 
As a test set, we extracted 1,772 sentences 
that included relative clauses for the 100 verbs 
from 1.5 million word corpora of integrated Ko- 
rean information base and test books of primary 
school. The distribution of syntactic relation of 
antecedents among them and the test results 
were shown in Table 3. There were 1,087 an- 
tecedents (61.34%) that were of subject role. 
The baseline accuracy of the problem is 61.34%. 
That is, if we always select subject role for an- 
tecedents, the accuracy will reach 61.34%. 
761 
Our system showed 90.4% of accuracy on av- 
erage in syntactic relation identification, which 
shows that the conceptual patterns and relative 
score of syntactic relation produced in the first 
phase can be a good source for determining the 
syntactic relation of an antecedent. 
Through experiment, we observed several fac- 
tors that affect the performance of the system. 
First, the multiple meanings of a noun will af- 
fect the frequency distribution of concept codes. 
In our system, we cope with this problem by 
adjusting the threshold of standard deviation 
and strength value. The second problem is the 
sparseness of corpus domain. If the corpus for 
learning is specified as a certain domain, it will 
greatly increase the validity of conceptual pat- 
terns. If we use a sense tagged corpus in the 
learning stage, we can achieve high accuracy in 
syntactic relation determination. 
6 Concluding Remarks 
This paper describes an approach for syntac- 
tic role determination between an antecedent 
and a verb in relative clause for semantic anal- 
ysis. This method consists of two phases. In 
the first phase, the system extracts conceptual 
patterns and syntactic role distribution of an- 
tecedents from a large corpus. In the second 
phase, the system applies the extracted con- 
ceptual patterns as knowledge in determining 
correct syntactic relations for structural disam- 
biguation and semantic analysis in MT system 
for CG generation. 
Unlike previous research that calculates sta- 
tistical information at a lexical level for every 
pair of words, which may require a lot of space 
to store resulting patterns, we represent those 
co-occurrence patterns with concept types of 
Kadokawa thesaurus. The problematic concept 
types are filtered out by the type generaliza- 
tion procedure. We used a corpus of 6 mil- 
lion words for conceptual pattern extraction. 
Our method can cope with the general scope 
of texts. In the experiment evaluation, the pro- 
posed method showed a high accuracy rate of 
90.4% in identifying the syntactic role of an- 
tecedents. 
The method described in this paper can be 
used in resolving syntactic role of antecedents 
in relative clauses of other free word order lan- 
guages, and can also be used in generating se- 
lectional restrictions of case frames of verbs. 

References 
Lee, J. H. and G. Lee. 1995. A Depen- 
dency Parser of Korean based on Connec- 
tionist/Symbolic Techniques. Lecture Notes 
on Artificial Intelligence 990, pages 95-106. 
Springer-Verlag, Berlin. 
Li, H. F., J. H. Lee and G. Lee. 1998. Con- 
ceptual Graph Generation from Syntactic De- 
pendency Structures in an MT Environment. 
(to be published by Computer Processing of 
Oriental Languages in 1998). 
Ohno, S. and M. Hamanishi. 1981. New Syn- 
onym Dictionary, Kadokawa Shoten, Tokyo 
(written in Japanese). 
Park, S. B. and Y. T. Kim. 1997. Semantic Role 
Determination in Korean Relative Clauses 
Using Idiomatic Patterns. In Proceedings of 
17th International Conference on Computer 
Processing of Oriental Languages, pages 1-6. 
Hong Kong. 
Smadja, F. 1993. Retrieving Collocations from 
Text: Xtract, Computational Linguistics, 
19(1):143-177. 
Yang, J. and Y. T. Kim. 1993. Identifying Deep 
Grammatical Relations in Korean Relative 
Clauses Using Corpus Information. In Pro- 
ceedings of Natural Language Processing Pa- 
cific Rim Symposium '93, pages 337-344. Tae- 
Jon, Korea. 
