<?xml version="1.0" standalone="yes"?>
<Paper uid="A94-1026">
  <Title>Handling Japanese Homophone Errors in Revision Support System for Japanese Texts; REVISE</Title>
  <Section position="2" start_page="0" end_page="156" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> We have been using morphological analysis to develop REVISE, a revision support system that corrects Japanese input errors (Ikehara, Yasuda, Shimazaki, and Takagi, 1987; Ohara, Takagi, Hayashi, and Takeishi, 1991).</Paragraph>
    <Paragraph position="1"> REVISE can detect and correct various types of errors, such as character deletion, character insertion and some grammatical errors, using knowledge bases that describe the characteristics of each error type (see figure 1).</Paragraph>
    <Paragraph position="2"> Homophone errors are one of the error types that can be detected and corrected in REVISE.</Paragraph>
    <Paragraph position="3"> Most Japanese texts are produced with Japanese word processors. Because Japanese texts consist of phonograms (KANA) and ideograms (KANJI), Japanese word processors always use KANA-KANJI conversion, in which KANA sequences (i.e. readings) input through the keyboard are converted into KANA-KANJI sequences. Japanese texts therefore suffer from homophone errors caused by erroneous KANA-KANJI conversion. A homophone error occurs when a KANA sequence is converted into a wrong word that has the same KANA sequence (i.e. the same reading).</Paragraph>
    <Paragraph position="4"> Therefore, detecting and correcting homophone errors is an important topic.</Paragraph>
    <Paragraph position="5"> Previous research into detecting homophone errors with revision support systems used two approaches: (a) using correct-wrong word pairs (Kuga, 1986), and (b) using KWIC (Key Word In Context) lists (Fukushima, Ohtake, Ohyama, and Shutoh, 1986; Suzuki and Takeda, 1989).</Paragraph>
    <Paragraph position="6"> Previous research into correct homophone selection in KANA-KANJI conversion used the following two methods: (c) using collocation of words (Nakano, 1982; Tanaka, Mizutani, and Yoshida, 1984; Makino and Kizawa, 1981).</Paragraph>
    <Paragraph position="7"> (d) using case frame grammar (Oshima, Abe, Yuura, and Takeichi, 1986).</Paragraph>
    <Paragraph position="8"> Method (a) has a drawback in that only the pre-defined wrong words in correct-wrong word pairs are detected. Method (b) only indicates which words are in the KWIC list, so it cannot automatically detect whether a word is misused. Method (c) demands the creation of a huge dictionary that must describe all possible word collocations. Method (d) can select the correct homophone by using the semantic restriction between a verb and its cases based on case frame grammar. It is difficult, however, to use method (d) for detecting homophone errors in compound nouns, because it mainly depends on JOSHI (i.e. Japanese postpositions), which are absent in compound nouns. Furthermore, it is difficult, if not impossible, for the existing methods (a) through (d) to correct homophone errors.</Paragraph>
    <Paragraph position="9"> This paper describes the method used in REVISE for detecting and correcting homophone errors in compound nouns. The idea underlying this method is that a compound noun component semantically restricts the semantic categories of adjoining words. Using semantic categories reduces dictionary size; moreover, this method needs no syntactic information such as case frames. Also described are the results of experiments conducted to verify the validity of this method.</Paragraph>
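    <Paragraph position="10"> The underlying idea can be illustrated with a minimal Python sketch. It is not the REVISE implementation: the dictionaries, the category labels (MACHINE, OCCASION, ACTIVITY), and the function name check_pair are all hypothetical and chosen only for illustration. The sketch assumes that a compound noun component lists the semantic categories it accepts for the following word, flags a following word whose category is not accepted, and proposes homophones of that word whose categories are accepted.

```python
# Toy data only: every entry below is a hypothetical illustration,
# not taken from the REVISE knowledge bases.

# Reading (KANA) -> words that share the reading (homophones).
HOMOPHONES = {
    "きかい": ["機械", "機会"],  # "machine", "opportunity"
}

# Word -> its reading.
READING = {
    "機械": "きかい",
    "機会": "きかい",
}

# Word -> semantic category (assumed labels).
CATEGORY = {
    "機械": "MACHINE",
    "機会": "OCCASION",
}

# Component -> semantic categories it accepts for the word that follows it.
ALLOWED_NEXT = {
    "工作": {"MACHINE", "ACTIVITY"},  # as in "工作機械" (machine tool)
}


def check_pair(left, right):
    """Check the two-component compound noun left+right.

    Returns (is_error, suggestions). A pair is flagged only when the
    toy knowledge covers both components and the category of `right`
    is not accepted by `left`.
    """
    allowed = ALLOWED_NEXT.get(left)
    if allowed is None or right not in CATEGORY:
        return False, []  # no knowledge about this pair: do not flag
    if CATEGORY[right] in allowed:
        return False, []  # the restriction is satisfied
    # Restriction violated: look for homophones that satisfy it.
    candidates = HOMOPHONES.get(READING.get(right, ""), [])
    suggestions = [
        word for word in candidates
        if word != right and CATEGORY.get(word) in allowed
    ]
    return True, suggestions


if __name__ == "__main__":
    print(check_pair("工作", "機会"))  # erroneous compound
    print(check_pair("工作", "機械"))  # correct compound
```

    Because the restriction is stated over semantic categories rather than over individual word pairs, one entry for a component covers every following word in an accepted category, which is the sense in which such a dictionary can stay smaller than a full collocation dictionary.</Paragraph>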
  </Section>
</Paper>