<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-2113">
  <Title>A Method of Utilizing Domain and Language specific Constraints in Dialogue Translation</Title>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
:STATUS :COMPLEMENT
:PREVIOUS-IFT :QUESTIONREF
</SectionTitle>
    <Paragraph position="0"/>
    <Paragraph position="2"> if type of Input.Obje.Pred is :action then set ?output to [[IFT REQUEST]]</Paragraph>
    <Paragraph position="4"> Rule-3. Transfer rule for IFT. A concise description of the notation of rewriting (transfer) rules: The first line of a rule indicates the target feature path of rewriting, followed by Application Constraints given as combinations of parameters and their values, e.g. :Type :General. The patterns in = ... and out = ... indicate the input and the output (sub)feature structure respectively. Additional conditions can be described using if sentences. To refer to a feature value, a feature path in a top-down direction can be used, as in Input.Obje.Pred. Note that the above rules are partly modified for explanation, using PSs instead of FSs.</Paragraph>
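The dotted feature-path notation and the if condition can be illustrated with a minimal Python sketch. It assumes that feature structures are modelled as nested dictionaries; the helper names get_path and rule_3_ift, the Type and Lex keys, and the sample input are hypothetical and are not taken from the original system.

```python
# Hypothetical sketch: feature structures modelled as nested dicts.
# get_path and rule_3_ift are illustrative names, not the original system's API.

def get_path(structure, path):
    """Follow a dotted feature path such as 'Obje.Pred' top-down; None if absent."""
    node = structure
    for feature in path.split("."):
        if not isinstance(node, dict) or feature not in node:
            return None
        node = node[feature]
    return node


def rule_3_ift(input_fs):
    """Return a copy whose IFT is REQUEST when Input.Obje.Pred has type :action."""
    pred = get_path(input_fs, "Obje.Pred")
    output = dict(input_fs)  # shallow copy, the input is left unchanged
    is_action = isinstance(pred, dict) and pred.get("Type") == ":action"
    if is_action and input_fs.get("IFT") == "INFORM":
        output["IFT"] = "REQUEST"
    return output


# Assumed sample input: an INFORM utterance whose object predicate is an action.
example = {
    "IFT": "INFORM",
    "Obje": {"Pred": {"Type": ":action", "Lex": "send"}},
}
result = rule_3_ift(example)
print(result["IFT"])
```

The sketch only checks the single condition quoted above; the application constraints of a real rule would be tested before the patterns are matched.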
    <Paragraph position="5"> The rules are explained below, although space limitations preclude a detailed account.</Paragraph>
    <Paragraph position="6"> The whole transfer process is composed of several sub-procedures according to the Rewriting Environments designated by the main rule (the top level rule). The general framework is as follows.</Paragraph>
    <Paragraph position="7"> First, the rewriting in the ellipsis resolution process supplies the missing zero-pronouns referring to the speaker or the hearer. Then an Illocutionary Force Type is given to the top level of the feature structure. After this, a kind of normalization is performed (so-called Japanese-to-Japanese transfer) in order to make the (Japanese-to-English) transfer easier. These sub-procedures are regarded as a pre-transfer phase.</Paragraph>
    <Paragraph position="8"> The main transfer phase contains three sub-procedures: idiomatic, general and default. Rule-1 is an example of a simple general transfer rule.</Paragraph>
    <Paragraph position="9"> After the main transfer phase, transfer within the English feature structures is performed. Rule-2 and Rule-3 are applied in this phase.</Paragraph>
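The ordering of the phases described above can be summarised in a short Python sketch. Every sub-procedure is a stub that only records its name; the function names, the label "post-transfer" for the English-internal phase, and the assumption that the three main sub-procedures run in the order listed are illustrative choices, not details confirmed by the text.

```python
# Hypothetical sketch of the phase ordering only; every step is a stub.

PRE_TRANSFER = [
    "ellipsis_resolution",
    "ift_assignment",
    "japanese_to_japanese_normalization",
]
MAIN_TRANSFER = ["idiomatic", "general", "default"]
POST_TRANSFER = ["english_internal_transfer"]  # Rule-2 and Rule-3 apply here


def run_phase(name, steps, feature_structure, trace):
    """Apply each stub step in order, recording (phase, step) in trace."""
    for step in steps:
        trace.append((name, step))
        # A real step would rewrite feature_structure here.
    return feature_structure


def transfer(feature_structure):
    """Run the three phases in sequence and return the result with a trace."""
    trace = []
    for name, steps in (
        ("pre-transfer", PRE_TRANSFER),
        ("main transfer", MAIN_TRANSFER),
        ("post-transfer", POST_TRANSFER),
    ):
        feature_structure = run_phase(name, steps, feature_structure, trace)
    return feature_structure, trace


_, trace = transfer({"IFT": None})
for phase, step in trace:
    print(phase, "-", step)
```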
    <Paragraph position="10"> Using Rule-2, a copula predicate structure is transferred to another, substantial predicate structure. When this rule is applied, a local parameter is set in the Rewriting Environment (RE). After this, under the new RE, the transfer of cases (e.g. Iden → Mann) is carried out by another rewriting rule that includes domain knowledge.</Paragraph>
    <Paragraph position="11"> Rule-3 designates a rewriting of IFT from INFORM to REQUEST under certain conditions. As mentioned in the previous section, such a transfer yields a more natural utterance.</Paragraph>
    <Paragraph position="12"> At present the flexibility of the system is still insufficient from the viewpoint of context processing. However, it is possible to control the application of rules to a certain extent by means of local parameter setting (like :status :complement).</Paragraph>
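How a local parameter can gate later rules may be sketched as follows, assuming the Rewriting Environment is a plain dictionary of parameters. The function names, the placeholder predicate SUBSTANTIAL-PRED, and the case fillers X and Y are hypothetical; only the parameter :status :complement and the case transfer mentioned above come from the text.

```python
# Hypothetical sketch: a Rewriting Environment (RE) as a dict of parameters.

def rule_2_copula(fs, env, substantial_pred):
    """Replace a copula predicate and set the local parameter :status :complement."""
    if fs.get("Pred") != "copula":
        return fs, env
    new_fs = {**fs, "Pred": substantial_pred}
    new_env = {**env, ":status": ":complement"}
    return new_fs, new_env


def case_rule(fs, env):
    """Rename the Iden case to Mann, but only under :status :complement."""
    if env.get(":status") != ":complement" or "Iden" not in fs:
        return fs
    return {("Mann" if case == "Iden" else case): filler for case, filler in fs.items()}


fs = {"Pred": "copula", "Obje": "X", "Iden": "Y"}

# Without the local parameter the case rule does not fire.
blocked = case_rule(fs, {})

# After Rule-2 has set the parameter, the same rule applies.
fs2, env2 = rule_2_copula(fs, {}, "SUBSTANTIAL-PRED")
applied = case_rule(fs2, env2)

print(blocked)
print(applied)
```

Keeping the parameter in the environment rather than in the feature structure means the case rule stays inactive wherever Rule-2 has not applied.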
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.4 Other Examples and the current status
</SectionTitle>
      <Paragraph position="0"> The following examples were described as domain- and language-specific knowledge for translating typical &quot;da-expressions&quot; that appear in our target domain. The frequency of &quot;da-expressions&quot; in the ATR Dialogue Database is as follows. This investigation (by Tomokiyo) recognized about 200 different word sequences as da-expressions in predicate parts of sentences in the conference registration dialogues.</Paragraph>
      <Paragraph position="1"> The occurrence of da-expressions: 1,845. The occurrence of all predicates: 5,200 (approximately). The numbers of sentences and words appearing in the corpus are 1,666 and 38,258 respectively. The rate of da-expressions among all predicates is roughly 35%. Though the exact percentage of copula da-expressions is not yet calculated, it is estimated at 150 ~ 200. Besides, we envisage some copula expressions which are not included in the above investigation, like &quot;to natte orimasu&quot; (mentioned in subsection 3.2). The current task is to classify the types of copula expressions which require certain complementation of substantial words. (Actes de COLING-92, Nantes, 23-28 août 1992, p. 760; Proc. of COLING-92, Nantes, Aug. 23-28, 1992.)</Paragraph>
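The reported rate can be checked with a short calculation. It uses only the two counts given above, and the predicate total is itself approximate, so the result is indicative.

```python
# Counts as reported in the text; the predicate total is approximate.
da_expressions = 1845
all_predicates = 5200

rate = da_expressions / all_predicates
print(f"{rate:.1%}")  # share of da-expressions among all predicates
```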
      <Paragraph position="2"> Among these types of copula expressions, two typical examples are shown as follows.</Paragraph>
      <Paragraph position="3"> J3: Tsugi no iinkai wa ikkagetsu go desu.</Paragraph>
      <Paragraph position="4"> E3: The next committee will be held after one month.</Paragraph>
      <Paragraph position="5"> J4: XXX-hoteru wa ZZZZ-yen to natte orimasu.</Paragraph>
      <Paragraph position="6"> E4: As for XXX-hotel, it (the room charge) costs ZZZZ yen.</Paragraph>
      <Paragraph position="7"> Both of the above Japanese sentences lack substantial predicates, e.g. those corresponding to &quot;will be held&quot; or &quot;costs&quot;. For the translation of J3, associative knowledge (a kind of common sense) is required: committee, time, location → be held. In this example, J3 is the answer to a question asking for the date of the next committee. Whether or not a substantial predicate indicating the event led by the committee and the date (interrogation) appears in the previous utterances, that kind of associative knowledge (relatively specific to the target domain) is applicable.</Paragraph>
      <Paragraph position="8"> As for J4, an implicit comparison underlies the utterance (the local topic of the dialogue is actually &quot;the expense of hotel rooms&quot;). In this case, the key to complementation can be obtained from the preceding utterances. It implies that XXX-hoteru with the topic marker &quot;wa&quot; (which appears to be the subject of the sentence, as in J3) only designates the field of the copula equation. In our current framework of sentence-by-sentence analysis, it is impossible to distinguish between J3 and J4. Therefore certain domain knowledge is required. To achieve a suitable translation, it should be connected with the language-specific constraint on producing (discourse) utterances. The input PS-J4 (corresponding to the analysis result of J4) could be rewritten into PS-E4, as shown below.</Paragraph>
      <Paragraph position="9"> PS-E4 for translation of J4. As the lexicalization from PS-E4, we could give several variations for the case Field: as for, in the case of, ... If we adopt the generating strategy of placing the theme in prior position (equivalent to the input), the resulting output may be E4.</Paragraph>
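The two kinds of knowledge used for J3 and J4 can be sketched as simple lookups. This is a minimal Python illustration under stated assumptions: the table layout, the function name complement_predicate, the complement-type labels, and the choice to consult the local dialogue topic first are all hypothetical; only the associations committee with time or location to "be held", and the hotel-expense topic to "cost", are drawn from the examples above.

```python
# Hypothetical sketch: domain knowledge as small lookup tables.

# Associative knowledge (J3): topic noun plus complement type gives a predicate.
ASSOCIATIVE_KNOWLEDGE = {
    ("committee", "time"): "be held",
    ("committee", "location"): "be held",
}

# Assumed mapping from a local dialogue topic to a predicate (J4).
TOPIC_PREDICATES = {"expense of hotel rooms": "cost"}


def complement_predicate(topic_noun, complement_type, local_topic=None):
    """Pick a substantial predicate for a copula sentence, or None if unknown."""
    if local_topic in TOPIC_PREDICATES:
        return TOPIC_PREDICATES[local_topic]
    return ASSOCIATIVE_KNOWLEDGE.get((topic_noun, complement_type))


# J3: no helpful local topic is needed, associative knowledge suffices.
print(complement_predicate("committee", "time"))

# J4: the key comes from the preceding utterances (the local topic).
print(complement_predicate("hotel", "price", local_topic="expense of hotel rooms"))
```

A sentence-by-sentence analysis sees the same surface pattern in both cases, so the difference in this sketch lies entirely in which table supplies the predicate.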
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Discussion
4.1 Related Issues
</SectionTitle>
    <Paragraph position="0"> Ellipsis is one of the prominent characteristics of Japanese spoken dialogue. There is previous work on the issue of identifying Japanese zero pronouns. A theoretical foundation was given by Yoshimoto [15] and an implementation was carried out by Dohsaka [1], in which zero pronouns referring to dialogue participants (speaker/hearer) are identified based on the usage of honorifics and the speaker's territory within a sentence. As such ellipses occur almost obligatorily in dialogue, the formalization seems to be relatively simple. Of course, the resolution of some phenomena requires more complex information from the context.</Paragraph>
    <Paragraph position="1"> Kudo [5] showed that another kind of ellipsis, indicating objects in the previous sentence, could be resolved with local cohesive knowledge extracted from actual corpora. This knowledge consists of pair template patterns of two successive sentences and enables certain complementation of elliptical objects. The value of this work lies in having proposed a method of semi-automatic acquisition of such knowledge from real dialogue corpora [6]. The primary objective of these approaches was to resolve ellipses. Therefore, problems of translation have not been sufficiently addressed. Hereafter we have to pay attention to the insight suggested in the previous sections.</Paragraph>
    <Paragraph position="2"> As approaches from the other viewpoint, that of knowledge-based translation, we find some representative works in which semantic networks are used for representing meaning structure including context (and sometimes world knowledge) information [10] [4]. Mel'čuk's Meaning-Text Theory is remarkable in considering the communicative structure of text. The attempt at knowledge-based generation of multilingual text at CMU is also notable, although it does not seem to have clearly addressed the relationships between its interlingua and language-specific communicative strategies.</Paragraph>
    <Paragraph position="3"> Stanwood and Suzuki suggested that communicative structures sometimes differ between languages and presented a concept of repartitioning the given network configuration. In this study, a semantic network is assumed to have been divided into contrastive partitions: Theme vs. Rheme, Old- vs. New-information, etc. An input utterance in the source language is represented as a part of the network. From this starting point, a target-language utterance is produced by repartitioning the network, if necessary [11] [13]. This processing model motivated the current issue of utilizing domain- and language-specific constraints in our dialogue translation system.</Paragraph>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4.2 Future Directions
</SectionTitle>
    <Paragraph position="0"> According to Narita [9], we can assume two kinds of syntactic systems for any language. The first is a core syntactic structure that is generally recognized as a universal system of syntax. The second syntactic structure is rather language dependent and peripheral. However, this does not mean that the second syntactic system is unimportant. Though it is difficult to translate into other languages, the second syntactic system precisely reflects the characteristics of a certain language. It includes many favorite expressions in the language. This issue is also quite interesting from the standpoint of sociolinguistics and cross-language communication.</Paragraph>
    <Paragraph position="1"> From the viewpoint of translating dialogues, if an expression of a source language is peripheral and there is no corresponding structure in a target language, the source structure could be transformed into a universal structure before translation. To realize this idea, it must be possible to formalize such a transformation. Furthermore, certain implicit (domain- and language-specific) knowledge might be needed in some cases.</Paragraph>
    <Paragraph position="2"> The target expression in this article, a certain kind of &quot;da-expression&quot;, is regarded as a typical instance of the second syntactic structure described above. Our future efforts will be directed to investigating various structures and to refining and extending the methodology proposed here.</Paragraph>
  </Section>
</Paper>