Local context templates for Chinese constituent boundary 
prediction 
Qiang Zhou 
The State Key Laboratory of Intelligent Technology and Systems 
Dept. o1' Computer Science and technology, 
Tsinghua University, Beijing 100084 
zhouq @ s 1000e.cs.t singhu a.ed u.cn 
Abstract: 
in this paper, we proposed a shallow 
syntactic knowledge description: 
constituent boundary representation and its 
simple and efficient prediction algorithm, 
based on different local context templates 
learned fiom the annotated corpus. An open 
test on 2780 Chinese real text sentences 
showed the satisfying results: 94%(92%) 
precision for the words with multiple 
(single) boundary tag output. 
llt simplified the complex constituent levels in 
parse trees and only kept the boundary 
information of every word in different 
constituents. Then, we developed a simple and 
efficient constituent boundary prediction 
algorithm, based on different local context 
templates learned flom the annotated corpus. An 
open test on 2780 Chinese real text sentences 
showed the satisfying results: 94%(92%) 
precision for lhe words with multiple (single) 
boundary lag output. 
I. Introduction 
Research on syntactic parsing has been a focus 
in natural  processing for a long lime. As 
the developlnent of corpus linguistics, many 
statistics-based parsers were proposed, such as 
Magerman(1995)'s statistical decision tree parser, 
Collins(1996)'s bigram dependency model parser, 
1;/atnaparkhi(1997)'s maximum entropy model 
parser. All of lhem fried to get the complete parse 
trees of the input sentences, based on the 
statistical data extracted l'rom an annotated corpus. 
The besl parsing accuracy of these parsers was 
about 87%. 
Realizing the difficulties o1' complete parsing, 
many researches turned to explore the partial 
parsing techniques. Church(1988) proposed a 
silnple stochastic technique for lecognizing the 
non-recursive base noun phrases in English. 
\;outilaimen(1993) designed an English noun 
phrase recognition tool --~ NPTbol. Abney(1997) 
applied both rule-based and statistics-based 
approaches for parsing chunks in English. Due to 
the advantages of simplicity and robustness, these 
systems can be acted as good preprocessors for 
the further colnplete parsing. 
In tiffs paper, we will introduce our partial 
parsing aPl)roach for the Chinese . We 
first proposed a shallow syntactic knowledge 
description: constituent boundary representation. 
2. Constituent boundary description 
The constituent boundary representation 
comes fl'om the simplification of the complete 
parse Irees of the senlences. It omits the 
constituenfl levels in parse trees and only keeps 
the boundary information of every word in 
different constituents, i.e. it is at the left boundary, 
right boundary or middle position of a 
constituent. 
l~,vidently, if the input sentence has only one 
parse lree, i.e. without syntactic ambiguity, the 
constituent boundary position of every word in 
the sentence is clear and definite. In the sense, the 
constituent boundary tag indicates the basic 
syntactic structure information in the sentence. 
Separating them frolll the constituent smlcture 
tree and assigning them Io every word in the 
sentence, we can form a special syntactic unit: 
word botmdao, block (WBB). 
Definition: A word boumla O, block is lhe 
combination o1' the word(including part-of-speech 
information) and its constituent boundary tag, i.e. 
wbb~=<%, b~>, where % is the ith word in the 
sentence, b~ can value 0,1,2, which means % is at 
I llereafler, 'constiluent' represents all internal or root 
nodes in a parse tree, i.e. phrase oF sentence tags. In 
our syslem, each consliluen( must consist of two or 
more words{leaf node in parser tree). 
975 
the middle, left-most, or right--most position of a 
constituent respectively. 
In the view of syntactic description capability, 
the WBBs defined above, the chunks defined by 
Abney(1991) and the phrases(i.e, constituents) 
defined in a parse tree have the following 
tealtions: WBBs < chunks < phrases 
Here is an example: 
® The input sentence (10 words): 
(My brother gives him a book.) 
• Its parse tree representation (7 phrases): 
1,,, \[,,~ \[,,~ @~ t'('J ~'} \] \[,,4 l,,~ ~(t T \] 
41J~ \[1'6 \[,'~ -- $ \] -I~ \]1\] O \] 
•lts chunk representation (5 chunks): 
• Its constituent boundary tepresentation 
(10 WBBs): <~f~,l> <l'l<J,0> <~'~'A\],2> < 
)(\],1> <T,2> <41g,0> <-,1> </l<,2> < 
:1~,2> <o ,2> 
The goal of the constituent boundary 
prediction is to assign a suitable boundary tag for 
every word in the sentence. It can provide basic 
information for further syntactic parsing research. 
The following lists some application examples: 
• To develop a statistics-based Chinese 
parser(Zhou 1997) based on tile bracket 
matching principle(Zhou and Huang, 1997). 
• To develop a Chinese maxinmm noun 
phrase identifier(Zhou,Sun and Huang, 1999). 
• The automatic inference of Chinese 
probabilistic context-five grammar(PCFG) 
(Zhou and throng 1998). 
3. Local context templates 
The linguistic intuitions tell us that many local 
contexts may be useful for constituent boundary 
prediction. For example, many function words in 
Chinese have their certain constituent boundary 
position in the sentences, such as, most 
prepositions are at the left boundaries, and the 
aspectual particles ("le", "zhe", "guo") ate at the 
right boundaries. Moreover, seine content words 
also show their pteferential constituent boundary 
positions in a special local context, such as most 
adjectives arc at the right boundary in local 
context: "adverb + adjective". 
A tentative idea is how to use such simple 
local context information(including the part-of- 
speech(POS) tags and the number of Chinese 
characters(CN)) to develop an elTicient automatic 
boundary prediction algorithm. Therefore, we 
defined the following local context templates 
(LCTs): 
1) Unigram POS template: t~, BPFL, 
2) Bigram POS templates: 
, Left restriction: t~_, t~, BI'I;L~ 
® Right restriction:L t~+,, BPFL~ 
3) Trigram POS template: t~_~t~t~+~, BPFL, 
4) Trigram POS+CN template: t~_~+cn~_~ 
t~+cn,~ t~+~+cn~+~, BPFL~ 
In tile above LCTs, t~ is tile POS tag of the ith 
word in the sentence, cn~ is its character number, 
and BPFL~ is the frequency distribution list of its 
diffetent BP(boundary prediction) value(0,1,2) 
under the local context restrictions(I.CR)(the left 
and right word). 
Table 1 Some examples of the local context 
templates 
I~ Token 
Unigram p, 
39 849 476 
....................... 
bigram a n, 
(left) 5 164 2007 
bigram a n, 
(right) 4 2012 160 
Meaning 
A preposition is prior to 
at the constituent left 
boundary in Chinese. 
A noun is prior to at the 
right boundary if its 
previous word is an 
adjective. 
An adjective is prior to 
at llle left boundary if 
its nexl word is a noun. 
Trigranl n n u.Jl)l'~, A noun is priorlo at tile 
POS 1 18 1496 right boundary if its 
previous word is a norm 
and its next one is a 
.......... /ra_r_tial(De). 
Table shows some examples of LCTs. All 
these templates can be easily acquired fiom the 
Chinese treebanks or (.hme:e ~ " s corpus annotated 
with constituent boundary tags. 
Among these templates, some special ones 
have the following properties: 
a) TIV~ = ~ BPI;L, \[bl),\] > (z, 
b) 3bl),e 10,21, P(bPALCR)=BPFL, \[1%11 7T, > f3 
where the total frequency threshokl o~ and the BP 
probability threshold f~ are set to 3 and 0.95, 
respectively. They are called the proiectcd 
templates (PTs) (i.e. the local context template 
with a projecting BP wflue). 
Based on the different PTs, we can design a 
thtee-stage training procedure to overcome tile 
problem of data sparseness: 
976 
Slage 1 • I,carn the unigram and bigram 
tClllplales Oil the whole inslallCOS in annolated COl|mS. 
Slago 2 : I,carn lhc Irigraln P()S lonlphltos Oil the 
nonq~rojocled unigraili and bigraiil illSlallCCs (see i1o×l 
section for illOl'O dclailod). 
Slago 3: \[,carll lho lfigfaill PO,q-I-CN Ioinphiles 
oil Iho non-projcclt;d Irigrani P()~ iliSlanccs. 
Therefore, only the useful trigranl templates 
Call be lear|led. 
4. Automatic prediction algoritlnn 
After getting lhc LCTs, the auloillatic 
prediction algorithm becomes very simple: 1) to 
sot the protecting BPs based on the projected 
LCTs, 2) to select the best l:IPs based on tile lion- 
pro|coted LCTs. Sonic detailed inl~rmation will 
be discussed in the 12fllowing sections. 
4.1 Set tile projecting lips 
In this stage, tile refercllce seqlielice lo the 
LCTs is : unigfanl ~ I)igralil ~ higranl POS 
tiigranl I>OS+CN, i.e. l:ronl the rough roslriclion 
Lcrrs to tile tight restriction L(\]Ts. This sequence 
is same with the LCT training procedure. 
The detailed algoritlnn is as follows: 
Input: the position of the/ill word in the sentence. 
Background: the LCTs learned |'1o111 corpus. 
Output: the pro|coting BP of the word - if' fo/lnd; 
- 1 - otherwise. 
Procedure: 
• Gel lhc local context of the iih word. 
Ill If its unigran\] tonal)late is a PT, thor| return 
its projecting BP. 
• If its left and right bigram template satisfy 
the following conditions: 
> rE,+ TF,~- Z SmFL, Ijl +Z BPFL, Ijl > a 
> p0,/,,I Lc#O : (BPFL, Ijl + nS'FL. Ijl) / 
(77:, + 77=,< ) > I~ 
thor returll this combined projecting l~P(l)l@ 
• If its trigram POS template is a PT, then 
retul'u its protecting BP. 
• If its trigram POS+CN template is a PT, 
then return its projecting BP. 
4.2 Select the best liPs 
In this stage, tile reference sequence to the 
LCTs is : trigram POS+CN --> trigram POS "-> 
bigram ---7 unigram. It's a backing-off model 
(K.atz, 1987), just like the approach of Collins and 
Brooks(1995) for the prepositional phrase 
allachineilt plot)loin in English. The detailed 
alger|finn is as follows: 
Input: the position of the ith word in lho sentence. 
Background: lho LCTs learned from corpus. 
Output: tile best BP of the word. 
Procedure: 
• Get the local context of tile ith word. 
o For tile kth nlatched lrigram POS+CN 
tonlplatos, if 77,'x > CZ, lhen rolHrll SelectBestlll > 
UHJFL,). 
• |Sor the ruth niatchod loft bigranl and nth 
matched righl: bigrain, 
)~ Gel lho Combined BI'FL = Blqq, + IHq<L,, 
lJ" TFc<,,,,I,i,,,',l ,<',,,H ..... > 0, then rol!.lril 
SelectlJestBP(C<mibined Blqq O. 
I For lhe kth nlatched unigram templates, if 
7"/~ > 0, lhen relurn SelectlJestBl~(17PlrLk). 
I 17,olurn l(dol\]nilt is at the loft t)oundary). 
The internal function SelectBeslBP() tries to 
select the best BP based on the fi'equency 
distribution list of different BP wllue in LCTs. It 
has two output modes: I) single-output mode: 
only output the best t3t > with the highest 
fiequency in the LCT; 2) nmltiple-outlmt mode: 
outpul the BPs salisfying the conditions: 
\[/',,,,,-Pt,,.,,\[ < 7, where 7 = 0.2 
5. Experimental results 
5.1 Training and test data 
The training data were extracted fl'Olll two 
ditTerent parts of annotated Chinese corpus: 
1) The small Chinese treebank developed 
in Peking University(Zhou, 1996b), which 
consists of the sentences extracted fiom 
two parts of Chinese texts: (a) test set for 
Chinese-Englisla machine transhltion 
systems, (b) Singapore priiaary school 
textbooks. 
2) The test suite treebank being 
developed in Tsinghua University(Zhou 
and Sun, 1999), which consists of about 
10,000 representative Chinese sentences 
extracted from a large-scale Chinese 
bahmced corpus with about 2,000,000 
Chinese characters. 
The test data were extracted from the articles 
of People's Daily and manually annotated with 
977 
correct constituent boundary tags. It was also 
divided into two parts: 
1) The ordinary sentences. 
2) The sentences with keywords for 
conjunction structures (such as the 
conjunctions or special punctuation 
'DunHao'). They can be used to test the 
performance of our prediction algorithm 
on complex conjunction structures. 
Table 2 shows some basic statistics of these 
training and test data. Only the sentences with 
more than one word were used for training and 
testing. 
Table 2 The basic statistics of training and 
test data. (ASL = Average sentence length) 
Trainl 
Train2 
Testl 
Test2 
Sent. Word Char. ASL 
Num. Num. Num. (w/s) 
5573 64426 89492 11.56 
7774 108542 173334 13.96 
2780 68986 108218 24.82 
1071 32358 51169 30.21 
5.2 The learned templates 
After the three-stage learning procedure, we 
got four kinds of local context templates. Table 3 
shows their different distribution data, where the 
section 'Type' lists the distribution of different 
kinds of LCTs and the section 'Token' lists the 
distribution of total words(i.e, tokens) covered by 
the LCTs. In the colmnn 'PTs' and 'Ratio', the 
slash '/' was used to separate the PTs with total 
frequency threshold 0 and 3. 
More than 66% words in the training corpus 
can be covered by the unigram and bigram POS 
projected templates. Then only about 1/3 tokens 
will be used for training the trigram templates. 
Although the type distribution of the trigram 
templates shows the tendency of data sparseness 
(more than 70% trigram projected templates with 
total fi'equency less than 3), the useful trigram 
templates (TF>3) still covers about 70% tokens 
learned. Therefore, we can expect that them can 
play an important role during constituent 
boundary prediction in open test set. 
5.3 Prediction results 
In order to evaluate the performance of the 
constituent boundary prediction algorithm, the 
followiug measures were used: 
1) The cost time(CT) of the kernal 
functions(CPU: Celeron TM 366, RAM: 64M). 
2) Prediction precision(PP) = 
number of words with correct BPs(CortBP) 
total word number (TWN) 
For the words with single BP output, the 
correct condition is: 
Annotated BP = Predicted BP 
For the words with nmltiple BP outputs, the 
correct conditiou is: 
Annotated BP ~ Predicted BP set 
Tile prediction results of the two test sets were 
shown in Table 4 and Table 5, whose first 
columns list the different template combinations 
using in the algorithm. In the columns 'CortBP' 
and 'PP', the slash '/' was used to list the 
different results of the single and multiple BP 
outputs. 
After analyzing the experimental results, we 
found: 
1) The POS information in local context is 
very important for constituent boundary 
prediction. After using the bigram and trigram 
POS templates, the prediction accuracy was 
increased by about 9% and 3% respectively. 
But the chmacter number information shows 
lower boundary restriction capability. Their 
application only results in a slight increase of 
precision in single-output mode but a slight 
decrease in lnultiple-output lnode. 
Table 3 Distribution data of different learned LCTs 
LCTs 
l -gram 
2-gram(Left) 
2-gram(Right) 
3-gram (POS) 
3-graln(P+CN) 
Total 
59 
1448 
Type 
PTs(c~=0/3) 
24 
1030/591 
Ratio(tz=0/3) 
40.68 
71.13/40.81 
Total 
171705 
171705 
Token 
PTs(o~=0/3) 
53932 
87027/86339 
Ratio0x=0/3) 
31.41 
50.68 / 50.28 
1440 1008/567 70.00/39.38 171705 99443/98754 57.92/57.51 
3105 2324 ! 713 74.85 / 22.96 50333 24280 / 21982 48.24 / 43.07 
2553 1677 / 287 65.69 / I1.24 19098 5978 / 4079 31.30 / 21.36 
978 
Table 4 Experimental results of the test set 1 
Set the Projecting BPs templates 
used 
I -gram 22408 
+2-gram 46167 
TWN \[ CortBP 
22143 
45285 
53969 
55866 
PP(%) 
98.82 
98.09 
97.56 
97.40 
TWN CortBP 
46555 32876/ 
24374 
22796 16188/ 
17678 
13642 9946/ 
10986 
11603 8168/ 
8955 
+3-gram 55321 
POS 
+3-gram 57360 
P+CN 
Select the best BPs 
PP(%) 
70.62/ 
73.84 
71.01/ 
77.55 
72.90/ 
80.53 
70.40/ 
77.18 
TWN 
Total 
CortBP PP(%) CT 
68963 
68963 
68963 
55019/ 
56517 
61473/ 
62963 
63915/ 
64955 
79.78/ 14/16 
81.95 
89.14/ 11/15 
91.30 
92.68/ 13/I 1 
94.19 
68963 64034/ 92.85/ 
64821 93.99 
11/14 
2) Most of the prediction errors can be 
attributed to the special structures in the 
sentences, such as col!iunction structures (CSs) 
or collocation structures. I)ue to the long 
distance dependencies among them, it's very 
difficult to assign the conect botmdary lags to 
the words in these structures only according to 
the local context templates. The lower overall 
precision of the test set 2 (about 2% lower 
than tesl set 1) also indicates the boundary 
prediction difficulties of the conjunction 
structures, because there are more CSs in test 
set 2 than in test set I. 
3) The accuracy of the multiple outlml 
results is about 2% better than the single 
OUtlmt results. But the words with multiple 
boundary tags constitute only about 10% of 
the tolal words predicted. Therefore, the 
multil)le-output mode shows a good trade-off 
between precision and redundancy. It can be 
used as the best preprocessing data for the 
further syntactic parser. 
4) The maximal ratio of the words set by 
prqjected templates can reach 80%. It 
guarantees the higher overall pl'ecisioiL 
Table 5 Experimental results of the test set 2 
5) Tile algoritlml shows high efficiency. It 
can process about 6,000 words per second 
(CPU: Celeron TM 366, RAM: 64M). 
5.4 Compare with other work 
Zhou(1996) proposed a constituent boundary 
prediction algorithm based on hidden Marcov 
model(HMM). The Viterbi algorithm was used to 
find the best boundary path B': 
B' = arg max P(W, T \[ B)P(B) 
II 
= arg max I'(CT, Ib,)P(t, iII,,-,) 
i-I 
where the local POS probability l'(C~ \[ b) was 
computed by backing-off model and the bigram 
parameters:./(/~, t, , b) and.l(b~ , t,, \[i+l)" 
To compare its 19erformance with our 
algorithm, the trigram (POS and POS+CN) 
information was added up to its backing-off 
model. Table 6 and Table 7 show the prediction 
results of lhc HMM-based algorithm, based on the 
same parameters learned from training set 1 and 
2. 
Table 6. Prediction results of the HMM-based 
__I 
Templates Set tile Projecti~ 
Used L TWN CoriBP 
I -gram 9873 
+2-gram 20454 
+3-D'am 24079 
POS 
+3-gram 24866 
P+CN 
PP(%) 
98.57 
97.00 
96.42 
Select tile best BPs 
fWN CortgP I )P(%)II 
22342 15737/1 70.44/ 
6593 74.27 
11273 7856/ 70.44/ 
8607 74.27 
7384 5225/ 70.76/ 
5777 78.24 
6519 4525/ 69.40/ 
4958 76.05 
96.23 
Total 
TWN 
32358 
32358 
32358 
32358 
CortBP I 
25610/ 
26466 
28310/ 
29061 
29304/ 
29856 
29390/ 
29824 
PP(%) 
79. t 5/ 
81.79 
87.49/ 
89.81 
90.56/ 
92.27 
90.83/ 
92.17 
\[\[ CT 
6/5 
4/4 
3/6 
8/6 
979 
algorithm(test set 1) 
TWN CortBP PP(%) Ctime 
2-gram 68963 60908 88.32 144 
+ 3g-POS 68963 63397 91.93 138 
+3g-P+CN 68963 63649 92.29 139 
Table 7. Prediction results of the HMM-based 
algorithm(test set 2) 
+2-gram 
+3g-POS 
+3g-P+CN 
TWN CortBP PP(%) 
32358 27792 85.89 
32358 28918 89.37 
32358 29030 89.72 
Ctime 
68 
7O 
68 
The performance of the LCT-based algorithm 
surpassed the HMM-based algorithm in 
accuracy(about 1%) and efficiency (about 10 
times). 
Another similar work is Sun(1999). The 
difference lies in the definition of the constituent 
boundary tags: he defined them between word 
pair: w; /_? w;~;, not for the word. By using the 
HMM and Viterbi model, his algorithm showed 
the similar performance with Zhou(1996) (using 
bigram POS parameters): 
• Training data : 3051 sentences extracted 
from People's Daily. 
• Test data: I000 sentences. 
• Best precision:86.3% 
6. Conclusions 
The paper proposed a constituent boundary 
prediction algorithm based on local context 
templates. Its characteristics can be summarized 
as follows: 
• The simple definition of the local context 
templates made the training procedure very 
easy. 
• The three-stage training procedure 
guarantees that only the useful trigram 
templates can be learned. Thus, the data 
sparseness problem was partially overcome. 
• The high coverage of different types of 
projected templates assures a higher overall 
prediction accuracy. 
• The multiple output mode provides the 
possibility to describe different boundary 
ambiguities. 
• The algorithm runs very fast, surpasses 
the HMM-based algorithm in accuracy and 
efficiency. 
There are a few possible improvement which 
may raise performance flwther. Firstly, some 
lexical-based templates, such as prepositions as 
left restriction, may improve performance further 
- this needs to be investigated. The introduction 
of the automatic identifiers for some special 
structures, such as conjunction structures or 
collocation structures, may reduce the prediction 
errors due to the long distance dependency 
problem. Finally, more training data is ahnost 
certain to improve results. 
Acknowledgements 
The research was supported by National 
Natural Science Foundation of China (NSFC) 
(Grant No. 69903007). 

References 

Abney S. (1991). "Parsing by Chunks", In Robert 
Berwick, Steven Abney and Carol Tenny (eds.) 
Principle-Based Parsing, Kluwer Academic 
Publishers. 

Abney S. (1997). "Part-of-speech Tagging and Partial 
Parsing", Ill Young S. Bloothooft G. (eds.) Corpus- 
based ntethods in  and speech processings, 
118-136. 

Collins M. and Brooks J. (1995) "Prepositional Phrase 
Attachment through a Backing-OIT Model", In 
David Yarowsky & Ken Church(eds.) Proceedings 
of the third workshop on very large corpora, MIT. 
27-38. 

Church K. (1988). "A Stochastic Parts Program and 
Noun Phrase Parser for Unrestricted Text." In: 
Proceedings of Second Conference on Applied 
Natural Language Processing, Austin, Texas, 136- 
143. 

Collins M. J. (1996). "A New Statistical Parser Based 
on Bigram Lexical Dependencies." In Proc. of ACL- 
34, 184-191. 

Katz S. (1987). "Estimation of Probabilities from 
sparse data for the  model component of a 
speech recogniser". IEEE Transactions on ASSP, 
Vol .35, No. 3. 

Magerman. D.M. (1995). "Statistical Decision-Tree 
Models for Parsing", In Proc. o1' ACL-95,276-303. 

Ratnaparkhi A.(1997). '% linear observed time 
statistical parser based on maximum entropy 
models". In Claire Cardie and Ralph 
Weischedel(eds.), Second Conference on EmpMcal 
Methods in Natural Language Proeessing(EMNLP- 
2), Somerset, New Jersey, ACL. 

Sun H. L., Lu Q. and Yu S. W.(1999). "Two-level 
shallow parser for t, nrestricted Chinese text", In 
Changning Huang and Zhendong l)ong (eds.) 
Proceedings of Computalional linguistics, Beijing: 
Tsinghua University press, 280-286. 

Votllilanmn A. (1993). "NPTool, a delector of English 
Noun Phrases." In: Ken Church (ed.) t'roceedings of 
lhe Workshop on Very La,ge Corpora: Academic 
and lnduslrial Perspeelives. Columbus, Ohio, USA, 
48-57. 

Zhou Q. (1996a). '% Model for Automalic Prediction 
of (;hinese l'hrase Boundary I,ocation", 

Zhou Q. (1996b). Phrase Bracketing and Annotating 
on Chinese l,anguage Corpus . l~h.l), l)issertalion, 
Peking University. 

Zhou Q. (1997) "A Statistics-Based Chinese Parser", 
In l;roc, of lhe Fiflh Wol'kshol~ on Very I,arge 
Corpora, 4-15. 

Zhou Q. and l\]uang C.N. (1997) "A Chincse syntaclic 
parser based on bracket malehing principle", 
Communication of COIdPS, 7(2), #97008. 

Zhotl Q. alld Htlang C.N. (1998). "An hll'crence 
Approach for Chinese Probabilistic Con/exl-Free 
Gramnmr", Chinese .Iournal of Computers, 21(5), 
385-392. 

Zhou Q. and Sun M.S. (1999). "P, uiM a Chinese 
Trecbank as the lest suite for Chinese parser", In 
Key-Sun Choi & Young-Soog Chae(cds.) 
Proceedings of the workshop MAI,'99, Beijing. 32- 
37. 

Zhou (,)., Sun M.S. and ltuallg C.N.(1999) 
"Attlonmlically Identify Chinese Maximal Noun 
Phrases", Technical Report 99001, Slate Key l~ab. o1' 
Intelligent Technology and Syslcm,% l)epl, of 
Comlmter Science and Technology, Tsinghua 
University. 
