<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2169"> <Title>Thesaurus-based Efficient Example Retrieval</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Since a model of machine translation (MT) called Translation by Analogy was first proposed in Nagao (1984), much work has been undertaken in example-based NLP (e.g. Sato and Nagao (1990) and Kurohashi and Nagao (1993)). The basic idea of the example-based approach to NLP is to accomplish a task by imitating a similar previous example, instead of using rules written by humans. The major processing steps of the example-based approach are: 1) collect examples and the results of performing the task in a database, 2) given an input, retrieve similar examples from the database, 3) adapt the results of the similar examples to the current input and obtain the output.</Paragraph> <Paragraph position="1"> Compared with the traditional rule-based approach, the example-based approach has advantages such as: 1) the implemented system is easier to maintain, since once the system is constructed, its performance can be improved just by adding new examples, 2) finer-grained syntactic and semantic discrimination can be expected just by adopting a finer-grained similarity measure between the input and the example.</Paragraph> <Paragraph position="2"> In almost all the previous frameworks of example-based NLP, it is necessary to calculate similarity values for all the examples in the database in order to find the most similar one; this is called full retrieval. The computational cost of example retrieval usually becomes a severe problem, because the retrieval time increases in proportion to the number of examples in the database.</Paragraph> <Paragraph position="3"> This paper proposes a novel method for avoiding full retrieval.
The proposed method, which we call query generation retrieval, has the following three features:</Paragraph> <Paragraph position="4"> 1) it generates retrieval queries from similarities, 2) it retrieves examples efficiently through the tree structure of a thesaurus, 3) it performs binary search along the subsumption ordering of retrieval queries. In this paper, we focus on retrieval of example surface case structures of Japanese sentences. The similarity value between the input and the example is calculated using an existing hand-compiled thesaurus. In the following, section 2 defines the similarity measure of surface case structures, and section 3 describes the framework of query generation retrieval. *The authors would like to thank Prof. Y. Matsumoto of Nara Institute of Science and Technology, Dr. Y. Den and Dr. E. Sumita of ATR, and Mr. M. Shimbo of Kyoto University, for valuable comments on the draft of the paper.</Paragraph> </Section> </Paper>