<?xml version="1.0" standalone="yes"?>
<Paper uid="C82-2059">
  <Title>CHINESE INPUT SYSTEM WITH ARTIFICIAL INTELLIGENCE</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
CHINESE INPUT SYSTEM WITH ARTIFICIAL INTELLIGENCE
</SectionTitle>
    <Paragraph position="0"> ional Technology, East China Normal Univ., Shanghai, China. There are already a variety of Chinese Language Processing Systems commercially available around the world, and they differ mainly in their approaches to inputting Chinese characters. However, none of the current approaches can be taken as a universally acknowledged scheme, for two reasons: first, the total number of Chinese characters is immense; second, and perhaps more critically, the topological structures of Chinese characters are rather sophisticated.</Paragraph>
    <Paragraph position="1"> Considering that the Chinese Phonetic Alphabet (CPA) is gaining ground among Chinese pupils and students (including foreign students studying in China), and considering also that CPA is supposed to be used as an international transcription for Chinese characters, the author has proposed a new approach to inputting Chinese into a computer in the light of AI theory and practice. Its prototype is just being implemented on a microcomputer system in our laboratories. This new Chinese Input System will serve as one of the subsystems of a comprehensive Chinese Language Computer-Aided Instruction System for teaching foreign students on our campus, the HUAHAN system.</Paragraph>
    <Paragraph position="2"> Working with our Chinese Input System, people can input Chinese text by typing the corresponding phonetic symbols of each Chinese word on an ordinary keyboard. Eventually the original Chinese text is obtained on the Chinese CRT screen or as hardcopy from a Chinese printer.</Paragraph>
    <Paragraph position="3"> The bottleneck of this kind of approach is generally attributed to the unreasonably large number of homophones in the Chinese language. As a matter of fact, a single Chinese word may consist of more than one character. In CPA a multi-character word is often represented as one unit. For instance, in the phonetic system the string "xingshi" is a word formed out of two Chinese characters, i.e. the character "xing" and the character "shi". Some individual Chinese characters, such as "xing" and "shi", may also be words in their own right, hence they may have one or more homophones. Furthermore, a multi-character word is still subject to homophony: the phonetic symbol "xingshi" represents both the Chinese words "situation" and "form", among others, though the number of homophones is much reduced.</Paragraph>
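The reduction in homophones at the word level can be illustrated with a minimal Python sketch. The toy lexicon is an assumption made purely for illustration (it is not the system's dictionary), tones are ignored, and the names LEXICON, candidates and character_by_character are hypothetical.

```python
# Toy lexicon keyed by toneless phonetic strings. The entries are
# illustrative assumptions, not the dictionary of the system described here.
LEXICON = {
    "xing": ["形", "行", "性", "星"],
    "shi": ["是", "事", "式", "势", "时", "十"],
    "xingshi": ["形势", "形式"],  # "situation", "form"
}


def candidates(phonetic):
    """Return every entry in the toy lexicon written with this phonetic string."""
    return LEXICON.get(phonetic, [])


def character_by_character(syllables):
    """Count the character strings possible when each syllable is resolved alone."""
    total = 1
    for syllable in syllables:
        total *= len(candidates(syllable))
    return total


if __name__ == "__main__":
    # Word-level lookup versus syllable-by-syllable combinations.
    print(len(candidates("xingshi")))
    print(character_by_character(["xing", "shi"]))
```

In this toy lexicon, treating "xingshi" as one unit leaves 2 candidates, whereas resolving "xing" and "shi" separately would allow 24 combinations.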
    <Paragraph position="4"> The key to the question is the software of this Chinese Input System, which must be developed so as to identify different Chinese words properly on the basis of the same phonetic symbol, i.e., different strings of Chinese characters must be generated from the same phonetic symbol occurring in different contexts.</Paragraph>
    <Paragraph position="5"> We argue that the differentiation of homophones can be realized in a way similar to the disambiguation of the same word occurring in different contexts in Natural Language Understanding (NLU), a field which is making rapid progress.</Paragraph>
    <Paragraph position="6"> Considered in isolation from other words, a Chinese word with one or more homophones sharing the same phonetic symbol is indeed troublesome. But when we try to grasp the proper word not merely by itself but in connection with its context, with background knowledge and/or with the topic of the whole text or the corresponding paragraph, we find that, as a rule, the phonetic ambiguity (i.e. the different homophones which cause language ambiguities) dissolves. So the most important task is to extract the above-mentioned linguistic conceptualizations as Chinese text input is going on.</Paragraph>
    <Paragraph position="7"> The lexical analyzer separates the words which have one or more homophones from those without homophones.</Paragraph>
    <Paragraph position="8"> The context analyzer tries to extract the contextual meaning and/or the topic of the text or the paragraph wherein the word occurs by analyzing the context.</Paragraph>
    <Paragraph position="9"> The inference mechanism draws, when necessary, inferences from the contextual meaning in order to obtain the proper concept behind the troublesome phonetic symbols and, consequently, the proper Chinese words.</Paragraph>
    <Paragraph position="10"> The knowledge/concept/topic base acts as a driver for the whole system: it communicates with each of the other three subsystems, gathers the processed results of one subsystem as input data for another, and provides them with the additional material necessary for further processing.</Paragraph>
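The interplay of the four subsystems described above can be sketched in a few lines of Python. This is only a hedged illustration: every name and all data below are hypothetical, topic cues are given as English glosses, the selection rule (largest overlap with stored cues) is a stand-in for the inference mechanism, and the components run sequentially rather than concurrently.

```python
# Hypothetical sketch of the four subsystems. All names and data are
# illustrative assumptions, and the components run sequentially here.

# Knowledge/concept/topic base: candidates per phonetic word together
# with the topic cues that favour each candidate (toy data).
KNOWLEDGE_BASE = {
    "xingshi": {
        "形势": {"politics", "economy", "international"},  # "situation"
        "形式": {"grammar", "logic", "art"},  # "form"
    },
    "dianhua": {"电话": set()},  # "telephone"
}


def lexical_analyzer(words):
    """Split phonetic words into those with homophones and those without."""
    ambiguous, unambiguous = [], []
    for word in words:
        entries = KNOWLEDGE_BASE.get(word, {})
        (ambiguous if len(entries) > 1 else unambiguous).append(word)
    return ambiguous, unambiguous


def context_analyzer(context_cues):
    """Reduce the surrounding text to a set of topic cues (given directly here)."""
    return {cue.lower() for cue in context_cues}


def inference_mechanism(phonetic, topics):
    """Choose the candidate whose stored cues overlap most with the topics."""
    entries = KNOWLEDGE_BASE[phonetic]
    return max(entries, key=lambda word: len(entries[word].intersection(topics)))


def convert(phonetic_words, context_cues):
    """Map phonetic words to Chinese words, using context only when needed."""
    topics = context_analyzer(context_cues)
    ambiguous, _ = lexical_analyzer(phonetic_words)
    output = []
    for phonetic in phonetic_words:
        entries = KNOWLEDGE_BASE.get(phonetic, {})
        if not entries:
            output.append(phonetic)  # unknown word: leave it unconverted
        elif phonetic in ambiguous:
            output.append(inference_mechanism(phonetic, topics))
        else:
            output.append(next(iter(entries)))
    return output


if __name__ == "__main__":
    print(convert(["xingshi"], ["economy"]))
    print(convert(["xingshi"], ["grammar"]))
```

With the toy data, the same phonetic symbol "xingshi" yields "situation" in an economic context and "form" in a grammatical one, while the unambiguous "dianhua" never reaches the inference step.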
    <Paragraph position="11"> Thus one can easily see that what our system really does is essentially what a Chinese Language Understanding System tries to do. What distinguishes our system from other NLU systems is that we have tried hard to make sure that the knowledge/concept/topic base and the three other subsystems operate concurrently, reflecting our notion of the actual process of human language understanding.</Paragraph>
    <Paragraph position="12"> Conventionally, Chinese Language Processing is placed in the area of Computer Science. In working out our system we have benefited a lot from theories and practices in Computational Linguistics and from advanced research in AI, especially the research activities at the University of Pennsylvania (under the direction of Prof. A. K. Joshi) and at Yale University (under the direction of Prof. R. C. Schank), all of which bear a strong linguistic flavor.</Paragraph>
    <Paragraph position="13"> As a result, the author is inclined to suggest that a new interdisciplinary research field be put forward, tentatively termed Linguistic Engineering. Our work is taken as a humble start of its practice.</Paragraph>
    <Paragraph position="14"> The implementation of the system is now just under way with a prototype. For our special purpose, the BASIC dialects of our CROMEMCO and CBSEC microcomputers have been extended by the author to include some LISP features, which would meet the needs of implementing such a heavy linguistic system on a microcomputer.</Paragraph>
  </Section>
</Paper>