Hypothesis Scoring over Theta Grids Information 
in Parsing Chinese Sentences with Serial Verb Constructions 
Koong H. C. Lin and Von-Wun Soo 
Department of Computer Science, National Tsing-Hua University, 
Hsinchu, 30043, Taiwan, R.O.C. 
E-Mail: soo@cs.nthu.edu.tw 
Abstract 
Serial verb constructions (SVCs) in Chinese are common 
structural ambiguities which make parsing difficult. In this 
paper, we propose a quantitative model, the S-model, based on 
theta grids information, that can systematically resolve 
ambiguities of SVCs by arbitrating the competition between verbs in 
parsing SVC sentences. S-model has three major 
characteristics: (1) it can resolve SVCs without relying on 
specific types of SVCs classified by linguists; (2) it can handle 
long SVCs, i.e., SVCs with more than two verbs; (3) it can 
simultaneously determine whether a verb candidate really 
acts as a verb in the sentence. 
1 Introduction 
In Mandarin Chinese, it is common for a sentence to contain 
two or more verbs without any marker 
indicating the relationships between them. Such a peculiar 
construct is called a serial verb construction (SVC) [Li 
and Thompson 1981]. For example, in the sentence glossed 
"the defendant hope the plaintiff forgive" (The defendant hoped that the plaintiff could 
forgive him.), there are two verbs, "hope" and "forgive"; 
however, there are no markers such as 
subordination markers, conjunctions, prepositions, or 
other morphological cues to indicate the 
relationship between them. In developing a parser, 
SVCs cause considerable problems. We have designed a 
modified chart parser using theta grids information. In 
parsing sentences with SVCs, different verbs will 
compete in searching the chart for their own theta roles. 
Thus, some mechanism for arbitrating among the 
competing verbs for the ownership of each constituent in 
the chart must be designed. The theta grid chart parser is 
described in the next section. 
The study of SVCs is still primitive. Most previous 
work [Chang and Krulee 1991] [Yeh and Lee 1992] 
was based on Li and Thompson's classification of SVCs 
[Li and Thompson 1981]. Surveying this work, we find 
some limitations. Yang [1987] and Chang et al. 
[Chang and Krulee 1991] dealt with only subsets of 
SVCs. Moreover, it is not clear how the implementations 
of Yang [1987], Chang et al. [Chang and Krulee 1991], 
and Yeh et al. [Yeh and Lee 1992] can be extended to 
handle long SVCs, i.e., sentences containing more 
than two occurrences of verbs. This is because their work 
was based on the classification of SVCs, and the 
classification was based on two-verb cases only. Pun 
[1991] claimed that his work could handle long SVCs; 
however, he did not report how to systematically extend his 
method to SVCs with three or more verbs. Our model 
has three characteristics. First, instead of 
classifying SVCs into several types, we make use of a 
numerical scoring function to determine a preferred 
structure; this is an attempt to make the SVC handling 
process more systematic. The information encoded in 
theta grids is used as the basis for scoring. Second, it can 
handle long SVCs. Third, category ambiguities can be 
taken into consideration at the same time; namely, we 
can simultaneously determine whether a verb candidate 
actually acts as a verb or not. In previous work, by contrast, 
the actual verbs in the sentence must be determined 
before the SVC handling process is triggered. 
This work is part of our long-term research on 
building a natural language front-end for a verdict 
understanding system. Thus, the corpora we use are 
judicial verdict documents from the Kaohsiung district 
court [Taiwan 1990a][Taiwan 1990b], which were 
written in a special official-document style. Our 
analysis is therefore based on this kind of sub-language. 
2 A Theta-grid Chart Parser 
Since the mechanism we propose works within the 
framework of a theta-grid chart parser, in this section 
we introduce the parser briefly. Thematic information is 
one of the information sources that can bridge the gap 
between the syntactic and semantic processing phases. In 
theta-grid theory [Tang 1992], rich thematic information 
is incorporated for the analysis of human languages. The 
idea of theta-grid theory is as follows: we use a predicate, 
say, a verb, as the center of a "grid" and, by finding the 
theta roles registered in the lexical entries of this 
predicate, we can construct a grid formed by this 
predicate and then construe the sentence (or clause) 
spanned by it. We think the theta-grid 
representation is suitable for processing Chinese. This 
viewpoint is shared with other work on designing 
Chinese parsers which use thematic information, such as 
the ICG parser [Chen and Huang 1990]. To 
computationalize theta-grid theory, some control 
strategies for parsing must be implemented. 
The well-known chart parser [Kay 1980], which 
utilizes the data structure "chart" to record partial 
parsing results, is suitable for our work. Since it keeps 
all possible combinations of constituents, it can accept 
sentences with missing theta roles. Thus, we designed a 
modified chart parser called the TG-Chart parser [Lin and 
Soo 1993] by combining theta-grid theory with a chart 
parser. Note that currently in our work, only the theta 
grids for verbs are considered. For each verb, 
two kinds of theta roles are registered: the obligatory roles, 
which must be found for this verb to construct a legal 
"grid", and the optional roles, whose appearance is 
optional. Take the verb "hope" for example: its theta roles are 
registered as +[Th (Pd) Ag]; thus, two NPs must be 
found in the chart for the construction of a legal grid 
(from syntactic clues, both "Ag" and "Th" are always 
played by NPs [Liu and Soo 1993]), while the 
appearance of a clause to serve as the "Pd" role is optional. 
A brief description of our parsing algorithm is as follows: 
[Step 1] Search the sentence for the positions of all "verb candidates". (Verb candidates are those 
words that have the verb category as one of their syntactic categories in the dictionary.) 
[Step 2] By considering all possible combinations, the chart parser groups the words into syntactic constituents. 
Syntactic knowledge is used in this step. 
[Step 3] If only one verb candidate is found in [Step 1], search the chart for constituents which can play the 
theta roles of this verb. 
[Step 4] If more than one verb candidate is found, call S-model to determine the most preferred structure. 
S-model is described in section 3. 
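The four steps above form a small control loop. The following Python sketch is our own illustration, not the authors' implementation: the mini-lexicon and word names are hypothetical, and the chart construction of [Step 2] and the theta-role search are reduced to labels.

```python
# Hypothetical mini-lexicon: syntactic categories per word (illustrative
# entries only, not the paper's actual dictionary).
LEXICON = {'defendant': {'noun'}, 'hope': {'verb'},
           'plaintiff': {'noun'}, 'forgive': {'verb', 'noun'}}

def verb_candidates(sentence, lexicon):
    """Step 1: a word is a verb candidate if 'verb' is among its categories."""
    return [w for w in sentence if 'verb' in lexicon[w]]

def parse(sentence, lexicon):
    """Steps 3-4 of the control loop (the chart building of Step 2 is elided)."""
    cands = verb_candidates(sentence, lexicon)
    if len(cands) == 1:
        return ('theta-search', cands[0])   # Step 3: fill this verb's theta grid
    return ('s-model', cands)               # Step 4: let S-model arbitrate

print(parse(['defendant', 'hope', 'plaintiff', 'forgive'], LEXICON))
```

With two candidates found, the dispatch falls through to [Step 4], which is exactly the situation S-model is built for.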
3 The S-model 
We design a model which utilizes scoring functions 
and theta-grid theory to handle the SVC problem. This 
model, called S-model (an abbreviation of "SVCs 
handling model"), consists of four modules: a 
combination generator, a combination filter, a score 
evaluator, and a structure selector, as shown in figure 1. 
We now describe these modules as follows: 
Figure 1. Modules of S-model. Sentences with SVCs, the constituents from the chart parser, and the theta grids for each verb candidate are fed into the Combination Generator; its output passes through the Combination Filter to the Score Evaluator, which computes scores for every verb candidate; finally, the Structure Selector returns a most preferred structure to the TG-chart parser. 
As we know, all verb candidates compete to act as 
verbs. The questions are: "which candidates can actually 
act as verbs?" and "what is their correlation?". If we can 
enumerate all possible combinations and evaluate their 
scores respectively, we can determine the most preferred 
construction. Take the two-verb-candidates case as an 
example: let the two verb candidates be v1 and v2; there are 
five combinations: (1) only v1 is a verb while v2 is not; (2) 
only v2 is a verb; (3) both v1 and v2 are verbs, with 
no subordination relation between them; (4) 
both are verbs, and v1 is subordinate to v2; (5) both are 
verbs, and v2 is subordinate to v1. 
3.1 Combination Generator 
Combination Generator consists of two submodules: 
Verb-string Generator and Subordination-relation 
Tagger. Verb-string Generator generates all possible verb 
strings. We illustrate a case with three verb candidates 
by sequentially enumerating the binary strings: 001, 010, 
011, 100, 101, 110, 111. The verb string "101" 
represents the situation where v1 and v3 act as verbs, 
while v2 doesn't. Then, Subordination-relation 
Tagger tags these verb strings with possible 
subordination relations. It divides these strings into three 
classes according to the occurrences of 1's in the string, 
that is, the number of candidates acting as verbs in the sentence. 
These three classes are: (1) For the one-1 class (i.e., 001, 
010, 100), there is obviously no subordination relation. 
That is, there is only one possible case to consider: this 
candidate acts as the only verb in the sentence. (2) For 
the two-1 class (i.e., 011, 101, 110), there are three 
possibilities to consider: v1=v2, v1<v2, and v1>v2. We 
follow the notations used by Pun [Pun 1991], where 
"v1>v2" means v2 is subordinate to v1, and "v1=v2" means no 
subordination relation exists between the two verbs. (3) 
For the three-1 class (i.e., 111), there are seventeen cases. 
We use abbreviated notations to represent them, where 
"><" with the second pair bracketed is the abbreviation of "v1 > [v2<v3]", 
the square brackets being represented by underlines in print, 
meaning that locally v2 is subordinate to v3, and they together form a 
clause, which then plays a propositional role for v1; 
for another example, "=<" with the first pair bracketed is the abbreviation of "[v1=v2] 
< v3". The seventeen cases are "==" together with the two 
bracketings (first pair or second pair) of each of the other eight 
symbol pairs: =<, =>, <=, <<, <>, >=, ><, and >>. 
These cases are generated simply by enumerating 
possible combinations of the three symbols =, <, and >. 
For each pair of symbols S1 and S2, two bracketings are 
possible: [S1]S2 and S1[S2]. Note that "[=]=" and "=[=]" 
represent the same case; thus, only a single "==" is 
generated. Therefore, 3 x 3 x 2 - 1 = 17 cases are 
possible. By summarizing classes (1), (2), and (3), 
Combination Generator generates 
C(3,1) x 1 + C(3,2) x 3 + C(3,3) x 17 = 29 cases. It is easy to 
design a routine which systematically enumerates these 
possibilities. 
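Such a routine is indeed short. The sketch below is our own illustration, not the authors' code: it enumerates the verb strings and, for up to three verbs, the relation taggings, with a `grouping` value standing in for which adjacent pair the underline (square brackets) covers.

```python
from itertools import product

def verb_strings(n):
    """All non-empty binary strings of length n: which candidates act as verbs."""
    return [bits for bits in product((0, 1), repeat=n) if any(bits)]

def tag_relations(bits):
    """Possible subordination-relation taggings for one verb string (n <= 3)."""
    k = sum(bits)
    if k == 1:
        return [()]                          # a single verb: nothing to tag
    if k == 2:
        return [('=',), ('<',), ('>',)]      # v1=v2, v1<v2, v1>v2
    # k == 3: two symbols plus a bracketing, except that "==" bracketed either
    # way is the same case, so only one "==" is kept: 3*3*2 - 1 = 17 cases.
    cases = []
    for s1, s2 in product('=<>', repeat=2):
        for grouping in ('first', 'second'):
            if (s1, s2) == ('=', '=') and grouping == 'second':
                continue
            cases.append((s1, s2, grouping))
    return cases

all_cases = [(bits, rel) for bits in verb_strings(3)
             for rel in tag_relations(bits)]
print(len(all_cases))   # C(3,1)*1 + C(3,2)*3 + C(3,3)*17 = 29
```

Running it reproduces the count in the text: 3 one-verb cases, 9 two-verb cases, and 17 three-verb cases, 29 in all.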
3.2 Combination Filter 
The Combination Generator above does not take 
linguistic knowledge into consideration. Actually, there 
are some cases which will never happen in a real 
sentence according to syntactic constraints, so it is 
not necessary to pass them to the Score Evaluator. 
Combination Filter is responsible for filtering out such 
impossible cases. We illustrate three circumstances. 
Firstly, for "v1 > v2", the Combination Filter will check 
the theta grid of v1; the case is possible only if a Pd or Pe role is registered 
for v1, since v2 can be subordinate to v1 
only if v1 also expects a propositional role; otherwise, 
the case is filtered out. The second circumstance is that 
when v1 has only a single syntactic category, verb, it 
must act as a verb in the sentence; thus, any case in which v2 
acts as a verb while v1 doesn't is removed. The third 
circumstance regards the three-candidate situations. 
Combination Generator generates seventeen cases; 
however, under some circumstances, four of these cases 
are impossible: <<, <>, <>, and >>. These 
circumstances happen when the main verb of the 
propositional part (i.e., the part marked by an underline) 
expects an animate agent. In such circumstances, a VP 
cannot be subordinate to an "event". Thus, these four 
will be filtered out by Combination Filter. For example, 
the following sentence, with the relation "<>" (i.e., thunder < 
[hope > attend]), is impossible: "the thunder hope attend the labor 
insurance" (Thundering hoped to attend the labor insurance.). It is 
because "hope" expects an animate NP to act as its Ag; 
the VP "thunder" thus cannot fill the Ag role. 
There is still much linguistic knowledge, and there are many 
constraints, that Combination Filter could use. 
However, some of them, such as the third circumstance 
mentioned above, are too specific and thus must be used 
carefully to avoid over-constraining. Therefore, how to 
collect and select those constraints and knowledge which 
are general enough is still a future concern. 
The main function of Combination Filter is to 
improve the performance of the S-model. Note that in 
this paper, for the sake of brevity, Combination 
Generator and Combination Filter are designed as two 
separate modules. However, Combination Filter could 
behave as an embedded module of Combination 
Generator so that it cuts off impossible generating branches 
as early as possible. This is also a future 
concern. 
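The first two circumstances are plain lexical checks and can be sketched directly; the animacy-based third circumstance is omitted here. The lexical entries and candidate names below are hypothetical, for illustration only.

```python
# Hypothetical lexical entries: syntactic categories and registered theta roles.
LEXICON = {
    'v1': {'cats': {'verb'},         'roles': {'Ag', 'Th', 'Pd'}},
    'v2': {'cats': {'verb', 'noun'}, 'roles': {'Ag', 'Th'}},
}

def passes_filter(bits, relation, candidates, lexicon):
    """Return False for cases violating the first two syntactic constraints."""
    # Circumstance 2: a candidate whose only category is 'verb' must be a verb.
    for cand, is_verb in zip(candidates, bits):
        if not is_verb and lexicon[cand]['cats'] == {'verb'}:
            return False
    # Circumstance 1: "v1 > v2" requires a Pd or Pe role registered for the
    # superordinate verb, since v2 must fill a propositional role of v1.
    if relation == ('>',):
        head = candidates[0]
        if not (lexicon[head]['roles'] & {'Pd', 'Pe'}):
            return False
    return True

print(passes_filter((1, 1), ('>',), ['v1', 'v2'], LEXICON))   # True: v1 has Pd
print(passes_filter((0, 1), (), ['v1', 'v2'], LEXICON))       # False: v1 is verb-only
```

Embedding such checks inside the generator, as the text suggests, would simply mean calling them before a branch is expanded rather than after.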
3.3 Score Evaluator 
Whenever Combination Filter passes a feasible case 
to the Score Evaluator, the Score Evaluator utilizes a 
scoring function to compute the score of the input case 
and then passes the evaluated score to the Structure 
Selector. We now describe the scoring function: 
3.3.1 The S-function 
In our legal-domain corpora, there are many 
occurrences of SVCs. Since our parser is based on 
theta grids, in the case of SVCs, different verbs will compete 
in finding their own theta roles. Thus, some mechanism 
for arbitrating among verbs for the ownership of each 
constituent in the chart must be designed. Just as 
Yorick Wilks said, language does not always allow the 
formation of "100%-correct" theories [Hirst 1981]; 
therefore, we attempt to find a more flexible method for 
recognizing SVCs. We propose a scoring function to 
select a "preferable" construction for the sentence with 
SVCs. (For the "preference" notion, see [Wilks 1975] 
[Fass and Wilks 1983].) The scoring function is called S-function, 
an abbreviation of "SVCs scoring function". 
S-function is defined as in figure 2, where RWR is the 
abbreviation of "Ratio of Words included in some phrase 
with Roles assigned"; RRF, "Ratio of Roles Found"; 
OBR, "OBligatory Role"; and OPR, "OPtional Role". 
(Note that OBR and OPR indicate those roles registered 
in theta grids.) 
Score = (sum of Score-Per-Verb) / (number of verbs)    (1) 

Score-Per-Verb = RRF x RWR    (2) 

RRF = [(number of OBR found) x k + (number of OPR found)] / Base    (3) 

RWR = (number of words included in some phrase with roles assigned) / (number of words in the clause)    (4) 

Base = k x (number of OBR) + (number of OPR)    (5) 

Figure 2. The S-function 
The score is calculated as the average of the scores 
obtained by each verb in the sentence (as in equation (1)). 
For each verb, the score is estimated by two factors: first, 
the ratio of theta roles found, i.e., RRF, and, second, the 
ratio of words with roles assigned, i.e., RWR; for the 
detailed formula, see equation (2). The relative 
significance of obligatory roles versus optional roles 
is heuristically weighted 2:1, as in (3) and (5); thus, 
the value of k is set to 2. In some cases, the verb finds 
many theta roles in the clause it constructs, but the words 
in this clause are not all assigned roles. We consider that 
such an assignment does not construe the real construction of 
the sentence. Thus, to reflect such cases, we calculate 
RWR by dividing the number of words which are 
included in some phrase with a role assigned by the total 
number of words in the clause (see equation (4)). 
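Equations (1)-(5) translate directly into code. The sketch below is our own restatement of the S-function, with the role and word counts passed in as plain numbers; the last lines recompute two Score-Per-Verb values that appear in the worked examples of section 3.3.2.

```python
def score_per_verb(obr_found, opr_found, n_obr, n_opr,
                   words_assigned, words_in_clause, k=2):
    """Score-Per-Verb = RRF * RWR, obligatory roles weighted k=2 (eqs. 2-5)."""
    base = k * n_obr + n_opr                          # equation (5)
    rrf = (k * obr_found + opr_found) / base          # equation (3)
    rwr = words_assigned / words_in_clause            # equation (4)
    return rrf * rwr

def s_function(per_verb_scores):
    """Equation (1): the average Score-Per-Verb over the verbs in the case."""
    return sum(per_verb_scores) / len(per_verb_scores)

# Case (1) of the first example below: grid +[Th Ag], both obligatory roles
# found, all 4 words in the clause assigned roles.
print(score_per_verb(2, 0, 2, 0, 4, 4))        # 1.0
# Case (2) of the second example: grid +[Ag (Ag)], one OBR found, RWR = 3/4.
print(score_per_verb(1, 0, 1, 1, 3, 4))        # 0.5
```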
3.3.2 Illustration of S-function 
Now, let us illustrate the calculation of the S-function with the 
following examples. 
In this example, we demonstrate how to determine 
whether a verb candidate can actually act as a verb. In 
[Step 1], two verb candidates are found, glossed "file" 
and "tell". Here "tell" has two syntactic 
categories registered in its lexical entry, the verb and the 
noun, while "file" has only one category, the verb. 
The theta grid for "file" is +[Th Ag], and that for "tell" is +[Th 
(Pd) Ag]. So, to decide whether "tell" is treated as a 
verb or a noun, there are four cases to be considered: 
(1) "file" is treated as a verb, while "tell" a noun. 
[Diagram: the role assignments for case (1); a word enclosed in a box acts as a verb.] 
When "file" searches for theta roles, two constituents 
are respectively found as its Ag and Th, the two 
obligatory theta roles registered in its lexical entry. The 
score is calculated as follows. For "file", there are two 
obligatory roles, so Base = 2 x 2 = 4. Moreover, in 
this sentence, all four words are 
assigned some roles; thus, RWR = 4/4 = 1. Then, 
Score-Per-Verb = {[(number of OBR found) x 2 + 
(number of OPR found)]/Base} x RWR = {[2 x 2 + 0]/4} 
x 1 = 1. Finally, Score = 1/1 = 1.00. 
(2) "file" and "tell" are both treated as verbs. Score 
= (0.5+0.4)/2 = 0.45. 
(3) "file" and "tell" are both treated as verbs, with 
"tell" subordinate to "file". Score = (0.375 + 0)/2 = 
0.1875. 
(4) "file" and "tell" are both treated as verbs, with 
"file" subordinate to "tell". Score = (0.5+0.2)/2 
= 0.35. 
From the above discussion, case (1) apparently gets the 
highest score (1.00). So, the parsed structure in case (1) 
is preferable to those in the other cases. That is, in this 
sentence, "file" plays the only verb, while "tell" 
plays a noun. Therefore, the right syntactic category for 
"tell" in this sentence is determined. 
In this example, we will demonstrate how to determine 
the relationship between verbs. In [Step 1], "request" 
and "divorce" are both found as "verb 
candidates". Here "request" and "divorce" both have two 
syntactic categories registered in their lexical entries: the 
verb and the noun. The theta grid for "request" is +[(Th) 
Pe Ag], and that for "divorce" is +[Ag (Ag)]. So, there are five cases to 
be considered: 
(1) "request" is treated as the only verb, while "divorce" a 
noun. Score = 0.15/1 = 0.15. 
(2) "divorce" is treated as a verb, while "request" a noun. 
[Diagram: the role assignments for case (2).] 
For "divorce", Base = 3. Note that although "request" here heads an 
NP, it cannot play Ag for "divorce": it 
doesn't satisfy the constraint for playing Ag, namely that an Ag 
must have the feature "+animate", according to Gruber's 
theory that an agent must be an entity with intentionality 
[Gruber 1976]. The situation in which a verb cannot 
find a theta role is represented by an empty box. So, 
RWR = 3/4 = 0.75, and Score-Per-Verb = 
{[1 x 2+0]/3} x 0.75 = 0.5. Score = 0.5/1 = 0.5. 
(3) "request" and "divorce" are both treated as verbs. Score 
= (0.134+0.67)/2 = 0.402. 
(4) "request" and "divorce" are both treated as verbs, with 
"divorce" being subordinate to "request". 
[Diagram: the role assignments for case (4).] 
For "request", Base = 5 and RWR = 4/4 = 1, so Score-Per-Verb = 
{[1 x 2+0]/5} x 1 = 0.4. For "divorce", Base = 3 and 
RWR = 3/3 = 1, so Score-Per-Verb = {[1 x 2+0]/3} = 0.67. 
Score = (0.4+0.67)/2 = 0.535. 
(5) "request" and "divorce" are both treated as verbs, with 
"request" being subordinate to "divorce". Score = 
(0.134+0)/2 = 0.067. 
From the above discussion, case (4) apparently gets the 
highest score (0.535). So, the parsed structure in case (4) 
is preferable to those in the other cases. That is, in this 
sentence, "request" and "divorce" are both treated as verbs, 
with "divorce" subordinate to "request". The clause 
constructed by "divorce" is assigned the Pe role of "request". 
Thus, this is an SVC sentence; moreover, this kind of 
SVC is commonly called a "sentential object". 
3.4 Structure Selector 
Structure Selector acts as the final arbitrator. It collects 
all feasible cases and their scores. After the scores of all 
cases are evaluated, the competition among the cases is 
arbitrated by Structure Selector, which 
selects the case with the highest score as the most 
preferred one. The final result is returned to the parser. 
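This arbitration is a plain arg-max over the scored cases. The sketch below is our own illustration; it keeps every case tied for the top score, so that an ambiguous sentence can yield more than one preferred structure, and it replays the case scores of the second worked example of section 3.3.2.

```python
def select_structures(scored_cases):
    """Return all cases tied for the highest score (ties = ambiguous readings)."""
    best = max(score for _, score in scored_cases)
    return [case for case, score in scored_cases if score == best]

# Scores of the five cases in the second example of section 3.3.2.
cases = [('(1)', 0.15), ('(2)', 0.5), ('(3)', 0.402),
         ('(4)', 0.535), ('(5)', 0.067)]
print(select_structures(cases))   # ['(4)']
```

Returning all top-scoring cases rather than a single one matches the behavior described in section 4.2, where both readings of an ambiguous sentence score 1.0.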
4 Experimental Results 
4.1 Results of More Sample Sentences 
In table 2, we show the results for more sentences with 
SVCs in the legal documents which are parsed by this 
scheme in our TG-Chart parser. The sample sentences 
are shown in table 1: 
Table 1. Some sample sentences with SVCs 
S1: (the plaintiff petition the defendant give three hundred thousand dollars) 
The plaintiff petitioned the defendant to give him three hundred thousand dollars. 
S2: (the plaintiff request the defendant repay) 
The plaintiff requested the defendant to repay his debts. 
S3: (the defendant didn't arrive the court argue) 
The defendant didn't arrive at the court to argue. 
S4: (the defendant suddenly left home desert his family) 
The defendant deserted his family suddenly and causelessly. 
S5: (the defendant didn't return home with the plaintiff cohabit) 
The defendant didn't return home to cohabit with the plaintiff. 
S6: (the defendant petition interrogate the witness) 
The defendant petitioned to interrogate the witness. 
S7: (the defendant hope the plaintiff can forgive) 
The defendant hoped that the plaintiff could forgive him. 
S8: (the defendant apply attend the labor insurance) 
The defendant applied to attend the labor insurance. 
S9: (the plaintiff ordinarily treat people friendly) 
Ordinarily, the plaintiff treats people friendly. 
S10: (the plaintiff break a vase valuable) 
The plaintiff broke a vase that was valuable. 
Table 2. Results of S-function calculation for sample sentences 

Sentence   Verbs     Relation   Score 
S1         v1, v2    v1>v2      1.00 
S2         v1, v2    v1>v2      1.00 
S3         v1, v2    v1=v2      0.84 
S4         v1, v2    v1=v2      1.00 
S5         v1, v2    v1=v2      0.83 
S6         v1, v2    v1>v2      0.70 
S7         v1, v2    v1>v2      0.84 
S8         v1, v2    v1>v2      0.75 
S9         v1, v2    v1<v2      1.00 
S10        v1, v2    v1=v2      1.00 
4.2 Demonstrating How to Handle 
Three-Verb SVCs 
Let's consider the following three-verb sentence: 
(the plaintiff return home remind his wife pay fees) (The plaintiff 
returned home to remind his wife to pay the fees.). There are 
three verbs in this sentence: "return", "remind", 
and "pay". At the first stage, Combination Generator 
generates the 29 possible combinations; then, 
Combination Filter filters out 26 of them, and only three 
cases remain to be considered: "return = remind = pay", 
"return = [remind > pay]", and "[return = remind] > pay". Thus, Score 
Evaluator only needs to calculate the scores for these 
three remaining cases. At the final stage, Structure 
Selector accepts the evaluated scores for these cases and 
selects the one with the highest score. In this example, 
the structure "=>" gets the highest score, 0.94; it is the 
correct structure for this sentence. 
Consider another interesting example, glossed 
"he think I mock he is wrong" 
[Pun 1991]. This sentence is ambiguous to native 
speakers, since there are two possible readings: (1) 
"His thinking that I mocked 
him is wrong.", and (2) 
"He thinks that I mocked him for being wrong." In S-model, 
both of these readings get the highest score, 1.0, 
and thus both are selected by Structure Selector as the 
final output. S-model doesn't attempt to select a 
"uniquely-correct" structure, but just selects what is 
preferred. This matches human behavior, since even a 
human may not be able to tell which of the two readings is 
better. 
5 Conclusion 
In this paper, we propose a systematic method for 
analyzing SVCs. The method is based on the 
information offered by theta grids. Many possible 
correlation relations may exist between verbs; we use a 
numerical scoring function to determine the most 
preferred one. To utilize the S-function defined, we 
design the S-model, which consists of four modules: a 
combination generator, a combination filter, a score 
evaluator, and a structure selector. For the 
examples we have tested so far, taken from the legal 
documents [Taiwan 1990a][Taiwan 1990b], our mechanism 
always produces the correct reading. 
Li and Thompson [1981] classified SVCs into four 
types: (1) two or more separate events; (2) a VP or a 
clause plays the subject or direct object of another verb; (3) 
pivotal constructions; (4) descriptive clauses. We usually 
split type (2) into two sub-types: (2)-1 sentential subjects 
and (2)-2 sentential objects. Most work on handling 
SVCs is based on this classification. In our design of the S-function, 
information about this classification is not used. 
However, on our testing sentences, it turns out that these 
five types are actually covered by the S-model, which 
selects a preferred structure based only on scoring 
functions. For example, S5 in table 1 belongs to type (1); 
S9, type (2)-1; S6, type (2)-2; S2, type (3); and S10, type 
(4). The reason why S-model may cover the 
classification is the rich information encoded in 
theta grids. As an example, consider the sentence 
S6 (The defendant petitioned to 
interrogate the witness.). By Li and Thompson's 
classification, it belongs to the "sentential objects" type. 
If we can classify the sentence into the correct type, the 
structure "petition > interrogate" will be 
determined; this is the idea used in most previous work. 
However, in S-model, we achieve the same result without 
relying on the classification. In S-model, since "petition" 
needs a "Pe" role, which implies that it expects an "event", 
i.e., a "sentential object", to play the theta role, after 
calculating the S-function, the structure where "interrogate" is 
subordinate to "petition" naturally gets the highest score 
and thus becomes the "winner". As in the previous example 
in section 4.2, for an ambiguous sentence S-model also 
yields more than one highest score. We conclude 
that S-model can be a very general and sound 
mechanism for handling SVC sentences. 
Acknowledgment 
This research is supported in part by the National Science 
Council of R.O.C. under grant NSC83-0408-E-007-008. 
References 
[Chang and Krulee 1991] Chao-Huang Chang and 
Gilbert K. Krulee, Prediction Ambiguity in Chinese and 
Its Resolution. In Proc. of ICCPCOL 1991, pp. 109-114. 
[Chen and Huang 1990] Keh-jiann Chen and Chu-Ren 
Huang, Information-based Case Grammar. In Proc. of 
COLING-90. 
[Fass and Wilks 1983] Dan Fass and Yorick Wilks, 
Preference Semantics, Ill-formedness, and Metaphor. 
American Journal of Computational Linguistics, Vol. 9 
(3-4), July-December 1983, pp. 178-187. 
[Gruber 1976] J. S. Gruber, Lexical Structures in 
Syntax and Semantics. North-Holland Publishing 
Company, 1976. 
[Hirst 1981] Graeme Hirst, Anaphora in Natural 
Language Understanding: A Survey. Lecture Notes in 
Computer Science, Springer-Verlag, Berlin 
Heidelberg, 1981. 
[Kay 1980] Martin Kay, Algorithm Schemata and Data 
Structures in Syntactic Processing. In Proc. of the Nobel 
Symposium on Text Processing, Gothenburg, 1980. 
[Li and Thompson 1981] C. N. Li and S. Thompson, 
Mandarin Chinese: A Functional Reference Grammar. 
University of California Press, Berkeley, 1981. 
[Lin and Soo 1993] Koong H. C. Lin and Von-Wun Soo, 
Toward Discourse-guided Chart Parsing for Mandarin 
Chinese: A Preliminary Report. In Proc. of ROCLING VI, 1993. 
[Liu and Soo 1993] Rey-Long Liu and Von-Wun Soo, 
An Empirical Study of Thematic Knowledge Acquisition 
Based on Syntactic Clues and Heuristics. In Proceedings 
of ACL 1993. 
[Pun 1991] K. H. Pun, Analysis of Serial Verb 
Constructions in Chinese. In Proc. of ICCPCOL 1991, pp. 170-175. 
[Taiwan 1990a] Taiwan Kaohsiung district court, 
Summary of Kaohsiung District Court Criminal Verdict 
Documents, Vol. 1, 1990. 
[Taiwan 1990b] Taiwan Kaohsiung district court, 
Summary of Kaohsiung District Court Civil Verdict 
Documents, Vol. 1, 1990. 
[Tang 1992] Ting-Chi Tang, Syntax Theory and 
Machine Translation: Principle and Parameter Theory. 
In Proc. of ROCLING V, pp. 53-83, 1992. 
[Yang 1987] Yiming Yang, Combining Prediction, 
Syntactic Analysis and Semantic Analysis in Chinese 
Sentence Analysis. In Proc. of IJCAI 1987, pp. 679-681. 
[Yeh and Lee 1992] Ching-Long Yeh and Hsi-Jian Lee, 
A Lexicon-Driven Analysis of Chinese Serial Verb 
Constructions. In Proc. of ROCLING V, pp. 195-214, 1992. 
[Wilks 1975] Yorick Wilks, An Intelligent Analyzer and 
Understander of English. In B. J. Grosz, K. S. Jones, and 
B. L. Webber (eds.), Readings in Natural Language 
Processing, 1975. 
