<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1120">
  <Title>A New Chinese Natural Language Understanding Architecture Based on Multilayer Search Mechanism</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> At present, a classical Chinese NLU architecture usually includes several components, such as Word Segmentation (Word-Seg), POS Tagging, Phrase Analysis, Parsing, and Word Sense Disambiguation (WSD). These components are executed one by one, from lower layers (such as Word-Seg and POS Tagging) to higher layers (such as Parsing and WSD), forming a cascade. When an NLU system is built on such a chain of complex analyses, however, a serious problem arises: the errors of the individual layers are multiplied.</Paragraph>
    <Paragraph position="1"> As more analysis components are added, the final result becomes too poor to be usable.</Paragraph>
    <Paragraph position="2"> Another problem is that the components affect each other when people build a practical, rather than toy, NLU system. Here a toy system means one in which each component performs well enough given perfect input. In practice, however, the lower-layer components need information from the higher-layer components, while incorrect analysis in the lower layers inevitably reduces the accuracy of the higher layers. In the Chinese Word-Seg component, many segmentation ambiguities cannot be resolved using lexical information alone. To improve Word-Seg performance, we have to use syntactic and even semantic information. Without correct Word-Seg results, however, the syntactic and semantic parsers cannot produce a correct analysis. This is a chain-debt problem: each layer depends on results that another layer has yet to deliver.</Paragraph>
    <Paragraph position="3"> People have tried to solve the error-multiplication problem by integrating multiple layers into a single unified model (Gao et al., 2001; Nagata, 1994).</Paragraph>
    <Paragraph position="4"> But as the number of integrated layers increases, the model becomes too complex to build or solve.</Paragraph>
    <Paragraph position="5"> The feedback mechanism (Wu and Jiang, 1998) uses information from higher layers to control the final result. If the analysis fails the check at a feedback point, the whole analysis is rejected. This mechanism places too much of the burden on the feedback point, so a correct lower-layer result may be rejected or an erroneous result may be accepted.</Paragraph>
    <Paragraph position="6"> We propose a new Multilayer Search Mechanism (MSM) to solve the problems mentioned above. Based on this mechanism, we build a practical Chinese NLU platform, CUP (Chinese Understanding Platform). Section 2 introduces the background and architecture of the new mechanism and describes how to build it. Experimental results with CUP are given in Section 3. In Section 4, we discuss why the new mechanism gets better results than the old ones. Conclusions and some future work follow in Section 5.</Paragraph>
  </Section>
</Paper>