<?xml version="1.0" standalone="yes"?>
<Paper uid="P86-1024">
  <Title>A SENTENCE ANALYSIS METHOD FOR A JAPANESE BOOK READING MACHINE FOR THE BLIND</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
A SENTENCE ANALYSIS METHOD FOR A JAPANESE
BOOK READING MACHINE FOR THE BLIND
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="169" type="metho">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> This paper proposes a Japanese sentence analysis method for use in a Japanese book reading machine. The method is designed to accept several candidates for each character when character recognition is ambiguous. Each sentence is analyzed to compose a data structure that defines the relationships between words and phrases.</Paragraph>
    <Paragraph position="1"> This structure ( named network structure ) involves all possible combinations of syntactically correct phrases.</Paragraph>
    <Paragraph position="2"> After the network structure has been completed, heuristic rules are applied in order to determine the most probable way to arrange the phrases and thus organize the best sentence. All information about each sentence ( the pronunciation of each word with its accent, and the structure of phrases ) is used during speech synthesis. Experimental results show that 99.1% of all characters were given their correct pronunciation. Using several recognized character candidates is more effective than using only the first-ranked characters as the input for sentence analysis. This facility also makes the book reading machine more useful, because it enables the user to select other ways to organize sentences.</Paragraph>
    <Paragraph position="3"> 1. Introduction English text-to-speech conversion technology has progressed substantially through extensive research ( e.g., Allen 1973, 1976, 1986; Klatt 1982, 1986 ). A book reading machine for the blind is a typical use of text-to-speech technology in the welfare field ( Allen 1973 ). According to the Kurzweil Reading Machine Update ( 1985 ), the Machine is in use by thousands of people in over 500 locations worldwide.</Paragraph>
    <Paragraph position="4"> In the case of Japanese, however, due to the complexities of the language, text-to-speech conversion technology has not progressed as fast as that of English. Recently a Japanese text-to-speech synthesizer has been introduced ( Kabeya et al. 1985 ). However, this synthesizer accepts only Japanese character code strings and does not include a character recognition facility.</Paragraph>
    <Paragraph position="5"> Since 1982, the authors have been engaged in the research and development of a Japanese sentence analysis method to be used in a book reading machine for the blind. The first version of the Japanese book reading machine, which was intended for examining the algorithms and their performance, was developed in 1984 ( Tsuji and Asai 1985; Tsukumo and Asai 1985; Fukushima et al. 1985; Mitome and Fushikida 1985, 1986 ). Figure 1 shows the book reading process of the machine. A pocket-size book is first scanned, then each character on the page is detected and recognized. Sentence analysis ( parsing ) is carried out using the character recognition results. Finally, synthesized speech is generated. The speech can be recorded for future use. The pages are turned automatically.</Paragraph>
    <Paragraph position="6"> The Japanese sentence analysis method that the authors have developed has two functions. The first is to choose an appropriate character among several input character candidates when the character recognition result is ambiguous. The second is to convert the written character strings into phonetic symbols. The written character strings are made up of Kanji ( Chinese ) characters and kana ( Japanese consonant-vowel combination ) characters. The phonetic symbols depict both the pronunciation and the accent of each word. The structure of the phrases is also obtained in order to determine the pause positions and intonation.</Paragraph>
    <Paragraph position="7"> After briefly describing the difficulty of Japanese sentence analysis technology compared to that of English, this paper will outline the Japanese sentence analysis method, as well as experimental results.</Paragraph>
    <Paragraph position="8"> 2. Comparison of Japanese and English as Input for a Book Reading Machine In this section, the difficulty of Japanese sentence analysis is described by comparison with that of English.</Paragraph>
    <Section position="1" start_page="165" end_page="165" type="sub_section">
      <SectionTitle>
2.1 Conversion from Written Characters to
Phonetic Symbols
</SectionTitle>
      <Paragraph position="0"> In English, text-to-speech conversion can be achieved by applying general rules. For exceptional words which fall outside the rules, an exceptional word dictionary is used. Accentuation can also be achieved by rules and an exceptional word dictionary.</Paragraph>
      <Paragraph position="1"> Roughly speaking, Japanese text-to-speech conversion is similar to that of English. However, in the case of Japanese, more detailed analysis is required. Japanese sentences are written using Kanji characters and kana characters. Thousands of different Kanji characters are in general use in Japanese sentences, and most Kanji characters have several readings ( Figure 2 (a) ).</Paragraph>
      <Paragraph position="2"> On the other hand, the number of kana characters is less than one hundred. Each kana character corresponds to a certain monosyllable. Therefore, kana-to-phoneme conversion rules would seem to be sufficient for converting kana characters. However, in two cases, when the kana characters は and へ are used as Kaku-Joshi ( a Japanese postposition which follows a noun to form a noun phrase ), the pronunciation changes ( Figure 2 (b) ).</Paragraph>
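      <Paragraph> The kana exception described above can be illustrated with a minimal Python sketch. The table contents, the function name read_kana, and the is_particle flag are all hypothetical and chosen only for illustration; the paper does not show how its system encodes this rule. The sketch assumes that the caller already knows whether the character is being used as Kaku-Joshi, which in practice is what sentence analysis has to decide.

```python
# Toy reading table (an assumption for illustration); a real system
# would cover every kana character.
KANA_READING = {"は": "ha", "へ": "he", "な": "na", "ま": "ma"}

# Readings that apply only when the kana is used as a particle.
PARTICLE_READING = {"は": "wa", "へ": "e"}


def read_kana(char: str, is_particle: bool = False) -> str:
    """Return a romanized reading for a single kana character."""
    if is_particle and char in PARTICLE_READING:
        return PARTICLE_READING[char]
    return KANA_READING[char]
```

The same character therefore maps to two readings, and only the grammatical role selects between them.
      </Paragraph>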
      <Paragraph position="3"> Similarly, the reading of numerical words also changes ( Figure 2 (c) ).</Paragraph>
      <Paragraph position="4"> As described above, the pronunciation of each character in a Japanese sentence is determined by the neighboring characters with which it combines to form a word.</Paragraph>
      <Paragraph position="5"> There are too many exceptions in Japanese to create general rules. Therefore, a large word dictionary which covers all commonly used words is generally used to analyze Japanese sentences.</Paragraph>
    </Section>
    <Section position="2" start_page="165" end_page="166" type="sub_section">
      <SectionTitle>
2.2 Required Sentence Analysis Level
</SectionTitle>
      <Paragraph position="0"> In English sentences, the boundaries between words are indicated by spaces and punctuation marks. This is quite helpful in detecting phrase structure, which is used to determine pause positions and intonation.</Paragraph>
      <Paragraph position="1"> In contrast, Japanese sentences have only punctuation marks. They do not have any spaces to indicate word boundaries. Therefore, more precise analysis is required in order to detect word boundaries first. The structure of the sentence is analyzed after the words have been detected.</Paragraph>
      <Paragraph position="3"/>
    </Section>
    <Section position="3" start_page="166" end_page="167" type="sub_section">
      <SectionTitle>
2.3 Character Recognition Accuracy
</SectionTitle>
      <Paragraph position="0"> English sentences consist of the twenty-six letters of the alphabet and other characters, such as numbers and punctuation marks. Because the English alphabet has so few characters, they can be recognized accurately.</Paragraph>
      <Paragraph position="1"> Japanese sentences consist of thousands of Kanji characters, more than one hundred different kana characters ( two kana character sets, Hiragana and Katakana, are used in Japanese sentences ) and alphanumeric characters. Because of the variety of characters, even when a well-established character recognition method is used, the result is sometimes ambiguous.  3. Characteristics of Sentence Analysis Method  The Japanese sentence analysis method has the following characteristics.</Paragraph>
      <Paragraph position="2"> 1. Mixed Kanji-kana strings are analyzed through both word extraction and syntactical examination. An internal data structure ( named network structure in this paper ), which defines the relationships among all possible words and phrases, is composed through this process. After the network structure has been completed, heuristic rules are applied in order to determine the most probable way to arrange the phrases and thus organize a sentence.</Paragraph>
      <Paragraph position="3"> 2. When a character recognition result is ambiguous, several candidates per character are accepted. Unsuitable character candidates are eliminated through sentence analysis.</Paragraph>
      <Paragraph position="4"> 3. Each punctuation mark is used as a delimiter. Japanese sentence analysis proceeds from back to front between punctuation marks. For example, the analysis starts from the position of the first punctuation mark and works toward the beginning of the sentence. Accordingly, the word dictionaries and their indexes have been organized so that they can be used in this order.</Paragraph>
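      <Paragraph> The back-to-front scan of characteristic 3 can be sketched as follows. This is only a toy illustration under stated assumptions: the delimiter set PUNCTUATION, the function names, and the plain set used as a dictionary are hypothetical, and the exhaustive recursion stands in for the indexed dictionary lookup and syntactical examination of the actual system.

```python
PUNCTUATION = {"、", "。"}  # assumed delimiter set, for illustration only


def split_regions(text: str) -> list[str]:
    """Split text into regions, each one ending at a punctuation mark."""
    regions, current = [], []
    for ch in text:
        if ch in PUNCTUATION:
            if current:
                regions.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        regions.append("".join(current))
    return regions


def segmentations(region: str, dictionary: set[str]) -> list[list[str]]:
    """List every way to cover the region with dictionary words.

    Words are matched at the end of the region first, and the remaining
    front part is then analyzed the same way (back to front).
    """
    if not region:
        return [[]]
    results = []
    for start in range(len(region) - 1, -1, -1):
        word = region[start:]
        if word in dictionary:
            for front in segmentations(region[:start], dictionary):
                results.append(front + [word])
    return results
```

For instance, with the toy dictionary {"本", "を", "読む", "読", "む"}, the region 本を読む has two segmentations, one ending in 読む and one ending in 読 followed by む. The recursion can grow exponentially on long regions, so it is suitable only as an illustration.
      </Paragraph>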
      <Paragraph position="5"> 4. The sentence analysis method must analyze unrestricted Japanese text in a short computing time. Therefore, it has been designed not to analyze deep sentence structure, such as semantic or pragmatic correlates.</Paragraph>
      <Paragraph position="6"> 5. At the user's request, the book reading machine can read the same sentence again and again. If the user wants to change the way of reading ( e.g. in the case that there are homographs ), the machine can also create other ways of reading. To achieve this, the sentence analysis results for several pages are kept while the machine is in use.</Paragraph>
      <Paragraph position="7"> 4. Outline of Sentence Analysis System  As shown in Figure 3, the Japanese sentence analysis system consists of two subsystems and word dictionaries. The two subsystems are named &quot;network structure composition subsystem&quot; and &quot;speech information organization subsystem&quot;, respectively. These subsystems work asynchronously.</Paragraph>
    </Section>
    <Section position="4" start_page="167" end_page="168" type="sub_section">
      <SectionTitle>
4.1 Network Structure Composition Subsystem
</SectionTitle>
      <Paragraph position="0"> As its input, the network structure composition subsystem receives character recognition results. When the character recognition result is ambiguous, several character candidates appear. During character recognition, the probability of each character candidate is also obtained. Figure 4 is an example of a character recognition result. In Figure 4, the first character of the sentence has three character candidates, and the fifth and seventh characters have two candidates each.</Paragraph>
      <Paragraph position="1"> Except for the fifth character, all of the first-ranking character candidates are correct. For the fifth character, the second-ranking character candidate is the desired character.</Paragraph>
      <Paragraph position="2"> With the recognized result, the network structure composition subsystem is activated. Figure 5 describes how the recognition result ( shown in Figure 4 ) is analyzed.</Paragraph>
      <Paragraph position="3"> By detecting punctuation marks in the input sentence ( recognition result ), the subsystem determines the region to be analyzed. After one region has been analyzed, the next punctuation mark, which determines the next region, is detected. In the case of Figure 5, for example, the whole input is analyzed at once, because the first punctuation mark is located at the end of the sentence.</Paragraph>
      <Paragraph position="4"> Characters in the region are analyzed from the detected punctuation mark to the beginning of the sentence.</Paragraph>
      <Paragraph position="5"> The analysis is accomplished by both word extraction and syntactical examination. Words in the dictionaries are extracted by using character strings which are obtained by combining character candidates. The type of the characters ( kana, Kanji etc. ) determines which index for the dictionaries will be used.</Paragraph>
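      <Paragraph> Word extraction from combined character candidates might look like the following sketch. Everything here is an assumption made for illustration: the input layout ( one list of candidate and probability pairs per character position ), the function names, the set used as a dictionary, and in particular the choice to multiply the per-character probabilities, which the paper does not specify.

```python
from itertools import product


def candidate_strings(candidates):
    """Yield (string, probability) for every combination of candidates.

    candidates holds, for each character position, a list of
    (character, probability) pairs from character recognition.
    """
    for combo in product(*candidates):
        text = "".join(ch for ch, _ in combo)
        probability = 1.0
        for _, p in combo:
            probability *= p  # assumed way to combine probabilities
        yield text, probability


def extract_words(candidates, dictionary):
    """Keep only the combinations that are dictionary words, best first."""
    found = [(t, p) for t, p in candidate_strings(candidates) if t in dictionary]
    return sorted(found, key=lambda pair: -pair[1])
```

With three candidates for the first character and two for the second, six strings are generated, and only those found in the dictionary survive. This is how unsuitable character candidates can drop out during analysis.
      </Paragraph>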
      <Paragraph position="6">  After extracting the words, phrases are composed by combining the words. Using syntactical rules ( i.e.</Paragraph>
      <Paragraph position="7"> conjugation rules ), only syntactically correct phrases are composed.</Paragraph>
      <Paragraph position="8"> Finally, by using these phrases, network structure is composed. Network structure obtained through the analysis described in Figure 5 is shown in Figure 6. This structure involves the following information.</Paragraph>
      <Paragraph position="9">
 * hierarchical relationship between sentence, phrases and words
 * syntactical meaning of each word
 * pointers to the pronunciation and accent information for each word in the dictionaries
 * pointers between phrases, which are used when the user selects other ways of reading
 Some features of the Japanese language are utilized in the network structure composition subsystem. Some examples are as follows.</Paragraph>
      <Paragraph position="10"> 1. In general, a Japanese phrase consists of an independent word and dependent words. A prefix word and/or a suffix word are sometimes adjoined. The number of dependent words is small compared with the number of independent words, so it seems efficient to analyze dependent words first. Thus, the analysis proceeds from the end of the region to the beginning.</Paragraph>
      <Paragraph position="11"> 2. Independent words mostly include non-kana characters, whereas dependent words are written in kana characters. Therefore, higher priority is given both to independent words which include a non-kana character and to dependent words which consist only of kana characters.</Paragraph>
      <Paragraph position="12"> 3. The number of Kanji characters is far greater than that of kana characters. Therefore, it seems efficient to use a Kanji character as the search key to scan the dictionary indexes. These indexes are designed so that the search key must be a non-kana character whenever the word contains one or more non-kana characters.</Paragraph>
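      <Paragraph> The kinds of information held in the network structure can be pictured with a small data-structure sketch. The class and field names are hypothetical, and the rule that two phrases are alternatives when they cover the same character span is an assumption made here for illustration; the paper states only that pointers between phrases exist for selecting other ways of reading.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    surface: str             # written form
    syntactic_category: str  # syntactical meaning of the word
    dictionary_entry: int    # pointer to pronunciation and accent data


@dataclass(eq=False)  # identity comparison, since alternatives form cycles
class Phrase:
    words: list[Word]
    start: int  # first character position covered
    end: int    # position just past the last character covered
    alternatives: list["Phrase"] = field(default_factory=list)


@dataclass
class Network:
    phrases: list[Phrase] = field(default_factory=list)

    def add(self, phrase: Phrase) -> None:
        """Store a phrase and link it to phrases over the same span."""
        for other in self.phrases:
            if (other.start, other.end) == (phrase.start, phrase.end):
                other.alternatives.append(phrase)
                phrase.alternatives.append(other)
        self.phrases.append(phrase)
```

Keeping the links symmetric means that any phrase chosen for the current reading can lead directly to its competitors when the user asks for another reading.
      </Paragraph>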
    </Section>
    <Section position="5" start_page="168" end_page="169" type="sub_section">
      <SectionTitle>
4.2 Speech Information Organization Subsystem
</SectionTitle>
      <Paragraph position="0"> With the user's request for speech synthesis, the speech information organization subsystem is activated.</Paragraph>
      <Paragraph position="1"> This subsystem determines the best sentence ( a combination of phrases ) by examining the phrases in network structure. After organizing the sentence, the information for speech synthesis is then organized. The pronunciation and accent of each word are determined by using the dictionaries. The structure of the sentence is obtained by analyzing the relationship between phrases.</Paragraph>
      <Paragraph position="2"> In the case of numerical words, such as 1,234.56, a special procedure is activated to generate the reading. When the user requests other ways of reading the sentence, the subsystem chooses other phrases in the network structure and reorganizes the speech synthesis information accordingly.</Paragraph>
      <Paragraph position="3"> In order to determine the most probable phrase combination in the network structure, heuristic rules are applied. The rules have been obtained mainly through experiments. Some of them are as follows.</Paragraph>
      <Paragraph position="4"> [1] Number of Phrases in a Sentence: The sentence which contains the fewest phrases will be given the highest priority.</Paragraph>
      <Paragraph position="5"> [2] Probabilities of Characters: The phrase which contains more probable character candidates will be given higher priority.</Paragraph>
      <Paragraph position="6"> This probability is obtained as the result of character recognition.</Paragraph>
      <Paragraph position="7"> [3] Written Format of Words: Independent words written in kana characters will be given lower priority.</Paragraph>
      <Paragraph position="8"> Independent words written in one character will also be given lower priority.</Paragraph>
    </Section>
    <Section position="6" start_page="169" end_page="169" type="sub_section">
      <SectionTitle>
[4] Syntactical Combination Appearance Frequency
</SectionTitle>
      <Paragraph position="0"> The frequently used syntactical combination will be given higher priority.</Paragraph>
      <Paragraph position="1"> ( e.g. noun-preposition combination )</Paragraph>
    </Section>
    <Section position="7" start_page="169" end_page="169" type="sub_section">
      <SectionTitle>
[5] Selected Phrases
</SectionTitle>
      <Paragraph position="0"> The phrase which once has been selected by a user will be given higher priority.</Paragraph>
      <Paragraph position="1"> In the case of Figure 3, the best way of arranging phrases is determined by applying the heuristic rule [1].</Paragraph>
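      <Paragraph> Heuristic rules [1] to [3] can be combined in a ranking function along the following lines. This is a sketch under several assumptions, not the procedure of the actual system: the paper does not say how the rules are weighted against each other, so a strict lexicographic order is assumed here, character probabilities are multiplied, and kana are detected with the Unicode range U+3040 to U+30FF. All names are hypothetical, and rules [4] and [5] are left out.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class PhraseCandidate:
    text: str
    char_probs: list[float]  # recognition probability of each character
    independent_word: str    # the independent word inside the phrase


def is_kana(ch: str) -> bool:
    """True for characters in the Hiragana and Katakana blocks."""
    return "\u30ff" >= ch >= "\u3040"


def penalty(phrase: PhraseCandidate) -> int:
    """Rule [3]: kana-only or one-character independent words."""
    word = phrase.independent_word
    kana_only = all(is_kana(ch) for ch in word)
    return int(kana_only) + int(len(word) == 1)


def sort_key(sentence: list[PhraseCandidate]):
    """Smaller keys are better: rule [1], then rule [2], then rule [3]."""
    probability = prod(p for ph in sentence for p in ph.char_probs)
    penalties = sum(penalty(ph) for ph in sentence)
    return (len(sentence), -probability, penalties)


def best_sentence(candidates: list[list[PhraseCandidate]]):
    """Pick the phrase arrangement with the smallest sort key."""
    return min(candidates, key=sort_key)
```

Under this ordering, an arrangement with fewer phrases always wins, and character probabilities matter only among arrangements with the same number of phrases.
      </Paragraph>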
    </Section>
    <Section position="8" start_page="169" end_page="169" type="sub_section">
      <SectionTitle>
4.3 Word Dictionaries
</SectionTitle>
      <Paragraph position="0"> their usage.</Paragraph>
      <Paragraph position="1"> 560 words (4) Prefix Word Dictionary 153 words (5) Suffix Word Dictionary 725 words Each word stored in these dictionaries has the following information. (a) written mixed Kanji-kana string (first-choice) (b) syntactical meaning (c) pronunciation (d) accent position Items (a) and (b) of all words are gathered to form the  These indexes are used by the network structure composition subsystem. Items (c) and (d) are used by the speech information organization subsystem.</Paragraph>
    </Section>
  </Section>
  <Section position="3" start_page="169" end_page="171" type="metho">
    <SectionTitle>
5. Experimental Results
</SectionTitle>
    <Paragraph position="0"> Several experiments were conducted in order to evaluate the sentence analysis method. In this section, the experimental results are described.</Paragraph>
    <Section position="1" start_page="169" end_page="170" type="sub_section">
      <SectionTitle>
5.1 Pronunciation Accuracy
</SectionTitle>
      <Paragraph position="0"> The accuracy of pronunciation has been evaluated by counting correctly pronounced characters. In this experiment, character code strings were used as the input data. The following two whole books were analyzed.</Paragraph>
      <Paragraph position="1"> The major causes of mispronunciation are as follows.  (1) Unregistered words in dictionaries (1-a) uncommon words (1-b) proper nouns (1-c) uncommon written style (2) Pronunciation changes in the case of compound words (3) Homographs (4) Word segmentation ambiguities (5) Syntactically incorrect Japanese usage</Paragraph>
    </Section>
    <Section position="2" start_page="170" end_page="171" type="sub_section">
      <SectionTitle>
5.2 Efficiency as the Postprocessing Role for
Character Recognition
</SectionTitle>
      <Paragraph position="0"> The efficiency of the method in its postprocessing role for character recognition has been evaluated by comparing the characters used for speech synthesis with the character recognition result. Twelve pages of character recognition results ( four pages from each of three books ) have been analyzed.</Paragraph>
      <Paragraph position="1"> The books used as the input data are as follows.</Paragraph>
      <Paragraph position="2">  Table 3 shows the score for characters which are chosen as correct characters by the sentence analysis method, as well as the score for correctly pronounced characters.</Paragraph>
      <Paragraph position="3">  As shown in Tables 2 and 3, the score for correct characters obtained after the sentence analysis was 99.7%, while the score for the 1st ranking characters obtained in the character recognition result was 99.5%. This experimental result reveals that the sentence analysis method is effective in its postprocessing role for character recognition. The state of errors found during the experiment is shown in Table 4. The difference between (b') and (b3) in Table 4 indicates the effectiveness of the sentence analysis method. The score 99.0% in Table 3 indicates the efficiency of the sentence analysis method in the book reading machine.</Paragraph>
    </Section>
    <Section position="3" start_page="171" end_page="171" type="sub_section">
      <SectionTitle>
5.3 Efficiency of Selection by Manual
</SectionTitle>
      <Paragraph position="0"> To examine the efficiency of manual selection, an experiment was conducted in which sentences were read both automatically and with the help of manual manipulation. The same text used in Section 5.2 was used in this experiment. Table 5 shows scores for the correctly pronounced characters. As shown in Table 5, 99.9% and 99.8% of all characters were given correct pronunciation after the manual selection, while 99.3% and 99.0% of all characters had been given their correct pronunciation before the manual selection, respectively. These scores reveal that most mispronunciations could be corrected by manual selection, so that an almost entirely accurate reading can be recorded.</Paragraph>
    </Section>
  </Section>
</Paper>