<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1711"> <Title>A Chinese Efficient Analyser Integrating Word Segmentation, Part-Of-Speech Tagging, Partial Parsing and Full Parsing</Title> <Section position="4" start_page="0" end_page="3" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Traditionally, a text parser outputs a complete parse tree for each input sentence, at a speed on the order of 10 words per second (wps) (Abney 1997). However, for many applications such as text mining, a complete parse tree is not necessary, and a speed of 10 wps is unacceptable when millions of words in thousands of documents must be processed in a reasonable time (Feldman 1997). Many applications therefore require a compromise between speed and performance.</Paragraph> <Paragraph position="1"> * is the head word of c and is the POS tag of w when</Paragraph> <Paragraph position="3"> In this case, we call node c a normal chunk node.</Paragraph> <Paragraph position="5"> rd -level 3tuple sequence * is just the word linked with c and is the POS tag of when (c is a special chunk). In this case, we call node c a special chunk or POS node.</Paragraph> <Paragraph position="7"> Figure 1 shows that, sequentially from bottom to top, [...]-level 3tuple sequence is then chunked into 1st-level 3tuple sequence (NP(NN, Guo Jia) .(ADV, Ye) .(VB, Cun Zai) NP(NN, Wen Ti)) via 1st-level partial parsing, while POS nodes .(ADJ,</Paragraph> <Paragraph position="9"> [...]-level 3tuple sequence is chunked into 3rd-level 3tuple sequence (S(VB, Cun Zai)) via 3rd-level partial parsing, while normal chunk nodes NP(NN, Guo Jia) and VP(VB, Cun Zai) are chunked into a normal chunk node S(VB, Cun Zai). 5) In this way, full parsing is completed with a fully parsed tree after several levels (3 in the example of Figure 1) of cascaded partial parsing.</Paragraph> </Section> </Paper>