<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1076">
  <Title>NARA: A Two-way Simultaneous Interpretation System</Title>
  <Section position="3" start_page="0" end_page="326" type="metho">
    <SectionTitle>
3. Linguistic Data Structure and Computing Model
</SectionTitle>
    <Paragraph position="0"> In order to investigate the correspondence between the two languages, we partition a grammar into independent components: segmented words, word order, morphology, syntax, and semantics. This partition is an important step in the modular decomposition of the system into interpretation subsystems.</Paragraph>
    <Section position="1" start_page="0" end_page="325" type="sub_section">
      <SectionTitle>
3.1 Interpretation strategy of segmented word component
</SectionTitle>
      <Paragraph position="0"> In comparison with other symbol systems, every human language has a remarkable characteristic, namely the structure of segmented words. An utterance as a segmented word conveys a message regarding some matter, and communicates the information concerning the matter. A segmented word is a word or an ordered pair of words. Using criteria such as positional transformation, substitution, and insertion, we can specify a segmented word of Korean or Japanese.</Paragraph>
      <Paragraph position="1">  Between Korean and Japanese, some common properties are observed, such as an agglutinative language structure and the identical word order (SOV). In addition, we note three corresponding word order properties of segmented words between the two languages. For some (k1, k2) ∈ Sk and (j1, j2) ∈ Sj, where Sk and Sj are a set of Korean segmented words and a set of Japanese segmented words, respectively, and I is a binary relation [statement of the three properties illegible in the source], e.g. Bako X &lt;I&gt; [illegible] (a Japanese). Among the above properties, Property 3 depends upon Korean pragmatic information.</Paragraph>
      <Paragraph position="2">  The production form of a segmented word of Korean or Japanese can be described by the rules of a regular grammar, and it is right linear. Since a language L generated by a right linear grammar G is regular, there exists a finite automaton which accepts L. If L is a context-free language and s is a substitution map such that for every a ∈ V (a fixed vocabulary), s(a) is a context-free language, then s(L) is a context-free language. A type of substitution map that is of special interest is a homomorphism. If L is a regular language and h is a homomorphism, then the inverse homomorphic image h^{-1}(L) is also a regular language. And, for two given regular grammars G and G', if L(G) = L(G'), there is a sequence equivalence: the two sequences generate the same word order in increasing order of length.</Paragraph>
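      <Paragraph><![CDATA[The step from a right linear grammar to a finite automaton can be sketched as follows. This is a minimal illustration, not the NARA rule set: the nonterminals W, P, E and the terminal categories noun, particle, verb, ending are hypothetical placeholders chosen only to show the construction, in which each nonterminal becomes a state and each rule A -> a B becomes a transition.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical right-linear rules (assumed for illustration only):
# nonterminal -> list of (terminal, next nonterminal or None).
# None marks a rule of the form A -> a, which ends the derivation.
RULES: Dict[str, List[Tuple[str, Optional[str]]]] = {
    "W": [("noun", "P"), ("noun", None), ("verb", "E")],
    "P": [("particle", None)],
    "E": [("ending", "E"), ("ending", None)],
}

FINAL = "#final"  # extra accepting state added by the construction


def accepts(tokens: List[str], start: str = "W") -> bool:
    """Simulate the NFA whose states are the nonterminals plus FINAL."""
    states: Set[str] = {start}
    for token in tokens:
        next_states: Set[str] = set()
        for state in states:
            for terminal, target in RULES.get(state, []):
                if terminal == token:
                    # A -> a B moves to B; A -> a moves to the accepting state.
                    next_states.add(FINAL if target is None else target)
        states = next_states
        if not states:
            return False  # no rule can continue the derivation
    return FINAL in states
```

Calling accepts(["noun", "particle"]) follows W -> noun P and P -> particle, so the token sequence is accepted after one transition per token.]]></Paragraph>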
    </Section>
    <Section position="2" start_page="325" end_page="325" type="sub_section">
      <SectionTitle>
3.2 Interpretation strategy of Morphological component
</SectionTitle>
      <Paragraph position="0"> The study of the structure of words occupies an important place within linguistics, sandwiched between phonology and syntax. Morphemes may also be partitioned into lexical and grammatical classes. Lexical morphemes are generally free, while many of the grammatical morphemes are bound.</Paragraph>
      <Paragraph position="1">  In a given Korean-Japanese (or Japanese-Korean) dictionary, let Dk be the set of morphemes of Korean, and Dj be the set of morphemes of Japanese. A mapping I between the sets is defined as follows.</Paragraph>
      <Paragraph position="3"> implying that the image of Dk is Dj; taking the in-</Paragraph>
      <Paragraph position="5"> By generalizing the relation and the mapping between the two sets, we may consider the set of Korean words to be a domain, and the set of Japanese words a range.</Paragraph>
      <Paragraph position="6"> Assuming the same cardinality for both, Dk and Dj may be partitioned as shown below. Here we suppose {k1, k2, ..., kn} ∈ Dk, {j1, j2, ..., jm} ∈ Dj:  (1) one-to-one: (ki, ji) ∈ Dk × Dj.</Paragraph>
      <Paragraph position="7"> (2) one-to-many: (ki, {ji1, ji2, ..., jim(i)}) ∈ Dk × 2^Dj. (3) many-to-many: ({ki1, ki2, ..., kin(i)}, {ji1, ji2, ..., jim(i)}) ∈ 2^Dk × 2^Dj,  where A × B is the Cartesian product of the two sets A and B, and 2^A is the power set of a set A.</Paragraph>
      <Paragraph position="8"> Obviously, a one-to-one correspondence is isomorphic. Naturally, our attention will be focused on the one-to-many and many-to-many relations. Interpretation of these relations depends on various factors: allomorphy, synonymy, and homonymy. Thus, for interpretation that depends on synonymy or polysemy, we characterize the interpretation by specifying the canonical form or the semantic feature instantiation, respectively.</Paragraph>
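      <Paragraph><![CDATA[The three-way partition of the dictionary can be sketched in Python. This is a hedged illustration under assumptions of our own: the dictionary is given as a list of (Korean, Japanese) morpheme pairs, the identifiers such as k1 and j1 are placeholders rather than real entries, and any block with more than one Korean morpheme (including many-to-one) is labelled many-to-many, since it falls in 2^Dk × 2^Dj.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def classify(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[Set[str], Set[str], str]]:
    """Group dictionary pairs into connected blocks and label each block."""
    k_to_j: Dict[str, Set[str]] = defaultdict(set)
    j_to_k: Dict[str, Set[str]] = defaultdict(set)
    for k, j in pairs:
        k_to_j[k].add(j)
        j_to_k[j].add(k)

    seen: Set[str] = set()
    blocks: List[Tuple[Set[str], Set[str], str]] = []
    for start in sorted(k_to_j):
        if start in seen:
            continue
        ks: Set[str] = {start}
        js: Set[str] = set()
        frontier = [start]
        while frontier:  # walk the bipartite component containing `start`
            k = frontier.pop()
            for j in k_to_j[k]:
                js.add(j)
                for k2 in j_to_k[j]:
                    if k2 not in ks:
                        ks.add(k2)
                        frontier.append(k2)
        seen |= ks
        if len(ks) == 1 and len(js) == 1:
            label = "one-to-one"
        elif len(ks) == 1:
            label = "one-to-many"
        else:
            label = "many-to-many"
        blocks.append((ks, js, label))
    return blocks
```

Only the blocks labelled one-to-many or many-to-many would need the canonical-form or semantic-feature treatment described above; one-to-one blocks can be looked up directly.]]></Paragraph>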
    </Section>
    <Section position="3" start_page="325" end_page="325" type="sub_section">
      <SectionTitle>
3.3 Syntax level interpretation strategy
</SectionTitle>
      <Paragraph position="0"> We examine the syntactic structure of the two languages. From the correspondence in segmented words and word order, it is seen intuitively that they are strongly equivalent, and there is sufficient linguistic evidence for this from the study of experimental comparative linguistics [2]. A phrase structure preserves each lexical semantic feature of a constituent structure, and a parse tree describes the construction of the syntactic representation of a sentence. Moreover, a partial tree in the whole parse tree plays the role of adjusting semantic and syntactic interpretation. Let us compare the examples of two parse tree constructions (Fig. 1):</Paragraph>
      <Paragraph position="2"> It is obvious that the parse trees coincide with each other in one-to-one fashion, but the syntactic categories do not. This implies that the two given languages, Korean and Japanese, do not generate the same set of sentential forms. Furthermore, there is no algorithm for deciding whether or not two given context-free grammars generate the same sentential forms. This is the reason why we adopt the covering grammar technique to parse the source language for interpretation.</Paragraph>
    </Section>
    <Section position="4" start_page="325" end_page="326" type="sub_section">
      <SectionTitle>
3.4 Semantics, pragmatics and ambiguity
</SectionTitle>
      <Paragraph position="0"> Semantics and pragmatics also play an important role in generating the well-formed target language. In the interpretation between Korean and Japanese, there exist several kinds of inherently ambiguous sentences which are generated only by the ambiguous grammars of both languages (see Section 5, Fragments of Interpretation).</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="326" end_page="326" type="metho">
    <SectionTitle>
4. K-J Grammar
</SectionTitle>
    <Paragraph position="0"> We design the K-J (or J-K) grammar, which eliminates the syntactic and semantic ambiguity of both languages for interpretation. This grammar corresponds to the communicative competence for the interpretation between Korean and Japanese. The K-J (J-K) grammar is motivated by grammar modification and the covering grammar.</Paragraph>
    <Paragraph position="1"> ALGORITHM: removal or adjustment of irregular categories and insertion of semantic features.</Paragraph>
    <Paragraph position="2"> Input: a 5-tuple phrase structure grammar G = (N, Tk, Tj, P, S).</Paragraph>
    <Paragraph position="3"> Output: an equivalent 5-tuple phrase structure grammar G' = (N', Tk'[semj], Tj', P', S').</Paragraph>
    <Paragraph position="4"> Method: empirical and heuristic method. Here N and N' are nonterminals, Tk, Tj, Tk' and Tj' are terminals, semj is the set of semantic features, P and P' are production rules, and S and S' are the start symbols. The J-K grammar is designed analogously. In the framework of generalized phrase structure grammar, the semantic features are accepted by a special phrase structure rule, a linking rule, which causes the relevant information about the phrase to be passed down the tree as a feature on the syntactic nodes.</Paragraph>
    <Paragraph position="5"> Therefore, the interpretation procedure is constructed as a succinct algorithm founded on the K-J (J-K) grammar.</Paragraph>
  </Section>
  <Section position="5" start_page="326" end_page="327" type="metho">
    <SectionTitle>
5. Fragments of Interpretation
</SectionTitle>
    <Paragraph position="0"> In this section, we exhibit fragments of our interpretation system: how phrase structure rules and semantic features interact in the interpretation procedure according to the K-J (J-K) grammar.</Paragraph>
    <Section position="1" start_page="326" end_page="326" type="sub_section">
      <SectionTitle>
5.1 Homonymous construction
</SectionTitle>
      <Paragraph position="0"> There are several construction types determined by the syntactic relations among constituents. Among them, modification is a construction type relating a Head and its Attributes. Coordination implies that two or more subconstituents stand in a syntactic coordination relation.</Paragraph>
      <Paragraph position="1"> Let us consider the following Japanese utterances: 1) [illegible] [te] [illegible] (modification) "(Someone) goes to school, and eats bread."</Paragraph>
      <Paragraph position="2"> 2) [illegible] [te] [illegible] (coordination) "(Someone) eats bread and goes to school." The two utterances imply the semantic notions of modification and coordination, respectively, but have the same conjunction morpheme [te]. Semantically, they are represented in Korean by the outcome of interpretation as follows: 1) [Korean text illegible in source] (modification) 2) [Korean text illegible in source] (coordination). All such morpheme ambiguities induce not only lexical semantic ambiguity but also sentential ambiguity. In order to interpret such ambiguous utterances, we employ semantic feature specification as the discipline of the semantic conjunction schemata. The following rules account immediately for the sentences in the example.</Paragraph>
      <Paragraph position="3"> Here we use the GPSG notations:</Paragraph>
    </Section>
    <Section position="2" start_page="326" end_page="327" type="sub_section">
      <SectionTitle>
5.2 Missing construction
</SectionTitle>
      <Paragraph position="0"> Korean and Japanese allow one of the constituents of a sentence not to be explicitly stated when it is understandable from the context. In the GPSG framework, this kind of difference can be expressed by the FOOT feature SLASH [3]. The SLASH feature indicates that something is missing in the structure dominated by the category specified. In this subsection, we exhibit a semantically ambiguous utterance involving both a homonymous construction and a missing construction. Consider the following Korean utterance. This utterance also has inherent syntactic and semantic ambiguity.</Paragraph>
      <Paragraph position="1">  In the above example, homonymous construction does not arise in Japanese, but missing construction remains.</Paragraph>
      <Paragraph position="2"> We employ a parse tree (2) for semantic adjustment, and fill the gap in the local environment with syntactically and semantically agreeable vocabulary; then such an utterance of Korean and Japanese is interpretable without ambiguity. Consequently, the Korean utterance 1) is interpreted as follows: [[seoul-… [kazi-ga okitato] [… [renraku ga kita]]].</Paragraph>
      <Paragraph position="3">  In order to define a two-way interpretation system more formally, we formulate the internal interface (the K-J system) for the interpretation. This interface corresponds to the transducer of interpretation. We can define the K-J (J-K) system as a 3-tuple grammar G = (wj, k (or j), wk), where wk and wj are Korean words and Japanese words, respectively, and k : Wj → Wk (j : Wk → Wj) is a homomorphism. The K-J (J-K) system G defines the following sequence preserving the word order: wk1 = k(wj1), wk1 wk2 = k(wj1) k(wj2), ...</Paragraph>
      <Paragraph position="4"> It also defines the language L(Gk) = {ki(wi) | i &gt; 0}.</Paragraph>
      <Paragraph position="5"> As mentioned above, the K-J (J-K) system constitutes a simple device for interpretation. A language defined by the K-J (J-K) system corresponds to the target language. Inversely, the mapping j of wk into wj is such that the inverse homomorphism</Paragraph>
      <Paragraph position="7"> exists. Thus, we define the two-way simultaneous interpretation system by:</Paragraph>
      <Paragraph position="9"> We can define our system using an extended notion: the inverse homomorphism can be replaced by the direct operation of a finite substitution. Consider a grammar (e.g. Korean) Gk = (Nk, Tk, Pk, Sk) and let j be a finite substitution, defined on the vocabulary (Nk ∪ Tk)*, such that j(w) is a finite (possibly empty) set of words for each word w. We denote j(Nk) = Nj, j(Tk) = Tj, Pj ⊆ j(Pk), Sj ∈ j(Sk). Then the grammar (e.g. Japanese) Gj = (Nj, Tj, Pj, Sj) is an interpretation of Gk. If I(Gk) and I(Gj) are the sets of all interpretations of Gk and Gj, respectively, then I(Gk) = I(Gj), and I is an invariant for Gk and Gj.</Paragraph>
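      <Paragraph><![CDATA[The interpretation of one grammar by another under a finite substitution can be sketched as follows. This is a minimal sketch, assuming that productions are stored as (left-hand side, right-hand side tuple) pairs and that the substitution j is a dictionary from each symbol to its finite set of images; the function names and this representation are hypothetical and are not taken from the NARA implementation. The check mirrors the two conditions Pj ⊆ j(Pk) and Sj ∈ j(Sk).

```python
from itertools import product
from typing import Dict, Iterable, Set, Tuple

# A production is (left-hand side, tuple of right-hand side symbols).
Production = Tuple[str, Tuple[str, ...]]
Substitution = Dict[str, Set[str]]


def substitute(production: Production, j: Substitution) -> Set[Production]:
    """Return j(production): every production obtained by replacing each
    symbol with one of its images under the finite substitution j."""
    lhs, rhs = production
    choices = [sorted(j.get(symbol, set())) for symbol in rhs]
    return {
        (new_lhs, tuple(new_rhs))
        for new_lhs in j.get(lhs, set())
        for new_rhs in product(*choices)
    }


def is_interpretation(
    pk: Iterable[Production],
    sk: str,
    pj: Iterable[Production],
    sj: str,
    j: Substitution,
) -> bool:
    """Check that Pj is a subset of j(Pk) and that Sj is in j(Sk)."""
    image: Set[Production] = set()
    for production in pk:
        image |= substitute(production, j)
    return set(pj) <= image and sj in j.get(sk, set())
```

A symbol whose image set is empty contributes no productions, which matches the remark that j(w) may be empty.]]></Paragraph>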
      <Paragraph position="10"> 7. Complexity of System NARA The complexity of an algorithm is usually measured by the growth rate of its time and space requirements as a function of the size of its input (or the length of the input string) to which the algorithm is applied. We adopt a finite state transducer as the computing model which governs the fundamental interpretation control. Since we do not count the time it takes to read the input, finite state languages have zero complexity. If reading the input is counted, then finite state languages have time complexity of exactly n (the length of the input string). Such languages are interpretable in exactly time n, and are then called real-time languages. Interpretation which is accompanied by co-occurrence dependency cannot be done in general without relying on arbitrary look-ahead or rescanning of the output.</Paragraph>
      <Paragraph position="11"> However, the nature of on-line interpretation is unchangeable. Consequently, our system NARA interprets in real time.</Paragraph>
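      <Paragraph><![CDATA[The real-time property of a finite state transducer can be illustrated with a short Python sketch. It assumes a deterministic transducer given as a transition table; the table format, the state name q0, and the function name are hypothetical choices for illustration, not the NARA control model itself. The point is only that the loop performs exactly one step per input symbol, so an input of length n is processed in n steps with no look-ahead or rescanning.

```python
from typing import Dict, List, Tuple

# Hypothetical transition table:
# (state, input symbol) -> (next state, output symbols emitted on this step).
Transitions = Dict[Tuple[str, str], Tuple[str, List[str]]]


def transduce(
    tokens: List[str], table: Transitions, start: str = "q0"
) -> Tuple[List[str], int]:
    """Run a deterministic finite state transducer; return output and step count."""
    state = start
    output: List[str] = []
    steps = 0
    for token in tokens:
        # A missing entry raises KeyError, meaning the input is rejected.
        state, emitted = table[(state, token)]
        output.extend(emitted)
        steps += 1  # exactly one step per input symbol
    return output, steps
```

Because each output fragment is emitted as soon as its input symbol is read, the returned step count always equals the length of the input.]]></Paragraph>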
      <Paragraph position="12"> 8. Concluding Remarks Our approach to constructing this system has both a logical view and an experimental view; the former is given by mathematical formalization, the latter by the correspondence of the two languages. From the viewpoint of computational linguistics, we separated the mechanism of our two-way simultaneous interpretation system into the levels of abstract theory, algorithm, and implementation, to carve out the results at each level in a more independent fashion. In order to do so, we specified four important levels of description: the lowest level is morphology, the second level is the segmented word, the third level is syntax and semantics, and the top level controls the computing model of each level.</Paragraph>
      <Paragraph position="13"> Hence, we could determine the range of correspondence between the internal representations of both grammars, and the basic architecture of the machinery actually instantiates the algorithm. Consequently, our model gains extra power from the proposed theory, with multiple levels of representation and systematic mapping between the corresponding levels of the two languages, because interpretation efficiency requires both functional and mathematical discussions. Nevertheless, complete pragmatic interpretation still remains quite obscure. Finally, we confront the problem of whether it is possible to construct a two-way simultaneous interpretation system between two other, different language systems such as Japanese and English. We presuppose that the key to solving this problem lies in the study of universality and individuality between two given languages.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="327" end_page="327" type="metho">
    <SectionTitle>
Acknowledgements
</SectionTitle>
    <Paragraph position="0"> We are deeply grateful to Prof. H. YAMADA for his encouragement. We would like to thank Dr. A. ADACHI and Dr. K. HASHIDA for many stimulating discussions and detailed comments, and Mr. Y. SHIRAI and Mr. I.</Paragraph>
    <Paragraph position="1"> FUJISHIRO for suggestions to improve the paper.</Paragraph>
  </Section>
</Paper>