<?xml version="1.0" standalone="yes"?> <Paper uid="P84-1105"> <Title>LANGUAGE GENERATION FROM CONCEPTUAL STRUCTURE: SYNTHESIS OF GERMAN IN A JAPANESE/GERMAN MT PROJECT</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> LANGUAGE GENERATION FROM CONCEPTUAL STRUCTURE: SYNTHESIS OF GERMAN IN A JAPANESE/GERMAN MT PROJECT </SectionTitle> <Paragraph position="0"> J. Laubsch, D. Roesner, K. Hanakata, A. Lesniewski. Projekt SEMSYN, Institut fuer Informatik, Universitaet Stuttgart, Herdweg 51, D-7000 Stuttgart 1, West Germany. This paper describes the current state of the SEMSYN project, whose goal is to develop a module for the generation of German from a semantic representation. The first application of this module is within the framework of a Japanese/German machine translation project. The generation process is organized into three stages that use distinct knowledge sources. The first stage is conceptually oriented and language independent, and exploits case and concept schemata. The second stage employs realization schemata which specify the choices for mapping meaning structures into German linguistic constructs. The last stage constructs the surface string using knowledge about syntax, morphology, and style. This paper describes the first two stages.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> INTRODUCTION </SectionTitle> <Paragraph position="0"> SEMSYN's generation module is being developed within a German/Japanese MT project. Fujitsu Research Labs.</Paragraph> <Paragraph position="1"> provide semantic representations that are produced as an interim data structure of their Japanese/English MT system ATLAS/II (Uchida &amp; Sugiyama, 1980). The feasibility of using a semantic representation as an interlingua in a practical application will be investigated and demonstrated by translating titles of Japanese papers from the field of &quot;Information Technology&quot;.
This material comes from Japanese documentation data bases and contains, in addition to the titles, their respective abstracts. Our design of the generation component is not limited to titles, but takes extensibility to abstracts and full texts into account. The envisioned future application of a Japanese/German translation system is to provide natural language access to Japanese documentation data bases.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> OVERALL DESIGN OF SEMSYN </SectionTitle> <Paragraph position="0"> Fig. 1 shows the stages of generation. The Japanese text is processed by the analysis part of FUJITSU's ATLAS/II system. Its output is a semantic net which serves as the input for our system.</Paragraph> <Paragraph position="1"> 1 SEMSYN is an acronym for semantic synthesis. The project is funded by the &quot;Informationslinguistik&quot; program of the Ministry for Research and Technology (BMFT), FRG, and is carried out in cooperation with</Paragraph> </Section> <Section position="4" start_page="0" end_page="491" type="metho"> <SectionTitle> FUJITSU Research Laboratories, Japan. </SectionTitle> <Paragraph position="0"/> </Section> <Section position="5" start_page="491" end_page="491" type="metho"> <SectionTitle> CONCEPTUAL STRUCTURE </SectionTitle> <Paragraph position="0"> ATLAS/II's semantic networks (see Fig. 2) are directed graphs with named nodes and labelled arcs. The names of the nodes are called &quot;semantic symbols&quot; and are associated with Japanese and English dictionary entries.</Paragraph> <Paragraph position="1"> The labelled arcs are used in two ways: a) Binary arcs either express case relations between connected symbols or combine sub-structures. b) Unary arcs serve as modifying tags of various kinds (logical junctors, syntactic features, stylistics, ...)
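The net structure just described can be sketched as a small data structure. This is only an illustration under stated assumptions, not ATLAS/II's actual format: the class name SemanticNet, its methods, and the arc labels OBJECT and PAST are hypothetical; only the node names WANT and ACHIEVE are taken from the example discussed later in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticNet:
    """Directed graph with named nodes and labelled arcs (illustrative only)."""
    binary_arcs: list = field(default_factory=list)  # (label, source, target)
    unary_arcs: list = field(default_factory=list)   # (label, node)

    def add_binary(self, label, source, target):
        # A binary arc expresses a case relation or combines sub-structures.
        self.binary_arcs.append((label, source, target))

    def add_unary(self, label, node):
        # A unary arc is a modifying tag attached to a single node.
        self.unary_arcs.append((label, node))

    def nodes(self):
        # Collect every semantic symbol mentioned by any arc.
        names = set()
        for _, source, target in self.binary_arcs:
            names.update((source, target))
        for _, node in self.unary_arcs:
            names.add(node)
        return names

    def outgoing(self, node):
        # Map each binary arc label leaving `node` to its target symbol.
        return {label: target
                for label, source, target in self.binary_arcs
                if source == node}

    def tags(self, node):
        # List the unary tags attached to `node`.
        return [label for label, tagged in self.unary_arcs if tagged == node]


net = SemanticNet()
net.add_binary("OBJECT", "WANT", "ACHIEVE")  # hypothetical case relation
net.add_unary("PAST", "ACHIEVE")             # hypothetical syntactic feature
```

Keeping the two kinds of arcs in separate lists mirrors the distinction drawn in a) and b); a real net would also have to link each semantic symbol to its dictionary entries.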
The first stage of generation is conceptually oriented and should be target language independent. We use frame structures in a KRL-like notation. Our representation distinguishes between case schemata (used to carry the meaning of actions) and concept schemata (used to represent &quot;things&quot; or &quot;qualities&quot;). Each semantic symbol points to such a schema. These schemata have three parts: (1) roles: For action schemata, these are the usual cases of Fillmore (e.g. AGENT, OBJECT, ...); for concept schemata, roles describe how the concept may be further specified by other concepts.</Paragraph> <Paragraph position="2"> (2) transformation rules: These are condition-action pairs that specify which schema is to be applied, and how its roles are to be filled from the ATLAS/II net.</Paragraph> <Paragraph position="3"> (3) choices: These describe possible syntactic patterns for realization.</Paragraph> <Paragraph position="4"> (choices ...))).</Paragraph> <Paragraph position="5"> i) Retrieval of the lexical entry of a German verb and its associated case frame corresponding to the IKBS.</Paragraph> <Paragraph position="6"> ii) Selection of lexical entries for the other semantic symbols.</Paragraph> <Paragraph position="7"> iii) Selection of a realization schema (RS), mapping of IKBS roles to RS functional roles, and inference of syntactic features.</Paragraph> <Paragraph position="8"> In i) a simple retrieval may not suffice.</Paragraph> <Paragraph position="9"> In order to choose the most adequate German verb, it may be necessary, for example, to check the role fillers of an IKBS. The semantic symbol REALISE may translate to &quot;realisieren&quot;, &quot;implementieren&quot;, etc. If the Instrument role of REALISE were filled with an instance of the PROGRAM concept, we would choose the more adequate word sense &quot;implementieren&quot;.
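The filler check described for REALISE can be sketched as follows. This is a minimal illustration, not the project's implementation: the table VERB_CANDIDATES, the function choose_verb, the dictionary layout of an IKBS, the SYSTEM concept, and the use of &quot;realisieren&quot; as the default word sense are all assumptions; fillers are simplified to concept names rather than schema instances.

```python
# Hypothetical lexicon: each semantic symbol maps to candidate German verbs,
# each guarded by an optional (role, required concept) condition.
VERB_CANDIDATES = {
    "REALISE": [
        ("implementieren", ("Instrument", "PROGRAM")),
        ("realisieren", None),  # assumed default word sense
    ],
}


def choose_verb(ikbs):
    """Return the first candidate verb whose condition matches the role fillers."""
    for verb, condition in VERB_CANDIDATES[ikbs["symbol"]]:
        if condition is None:
            return verb
        role, concept = condition
        if ikbs["roles"].get(role) == concept:
            return verb
    raise LookupError("no verb candidate for " + ikbs["symbol"])


with_program = {"symbol": "REALISE", "roles": {"Instrument": "PROGRAM"}}
without_program = {"symbol": "REALISE", "roles": {"Object": "SYSTEM"}}

print(choose_verb(with_program))     # implementieren
print(choose_verb(without_program))  # realisieren
```

Ordering the candidates from most to least specific keeps the selection a simple first-match scan.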
In ii) similar problems sometimes arise.</Paragraph> <Paragraph position="10"> For example, the semantic symbol ACCIDENT may translate to the German equivalent of &quot;accident&quot;, &quot;error&quot;, &quot;failure&quot; or &quot;bug&quot;. The actual choice depends here on the filler of ACCIDENT's semantic role for &quot;where it occurred&quot;.</Paragraph> <Paragraph position="11"> iii) The choices aspect of a schema describes the different ways in which an instance may be realized and specifies the conditions for selection. (This idea is due to McDonald (1983) and his MUMBLE system.) The factors determining the choice include: (a) Which roles are filled? (b) What are their respective fillers? (c) Which type of text are we going to generate? For example, if the Agent role of a case frame is unfilled, we may choose either passivization or a German verb which maps the semantic object into the syntactic subject. If neither agent nor object is filled, nominalization is forced.</Paragraph> <Paragraph position="12"> A realization schema (RS) is a structure which identifies a syntactic category (e.g. CLAUSE, NP) and describes its functional roles (e.g. HEAD, MODIFIER, ...). We employ Winograd's terminology for functional grammar (Winograd, 1983). In general, case schemata are mapped into CLAUSE-RS and concept schemata are mapped into NP-RS. A CLAUSE-RS has a features description and slots for verb, subject, direct object, and indirect objects. A features description may include information about voice, modality, idiomatic realization, etc. There are realization schemata for discourse as well as titles.
The latter are special cases of the former, forcing nominalized constructions.</Paragraph> </Section> <Section position="6" start_page="491" end_page="491" type="metho"> <SectionTitle> FROM CONCEPTS TO LANGUAGE </SectionTitle> <Paragraph position="0"> In the target language oriented stage 2, the following decisions have to be made:</Paragraph> </Section> <Section position="7" start_page="491" end_page="493" type="metho"> <SectionTitle> REFERENCING AND FOCUSSING </SectionTitle> <Paragraph position="0"> For referencing and other phenomena like focussing, the simple approach of only allowing a schema instance as a filler is not sufficient. We therefore included in our knowledge representation a way to have descriptors as fillers. Such descriptors are references to parts of a schema. In the following example the filler of USE's Object slot is a reference descriptor to SYNTHESIZE's X could be realized as: &quot;Using functions that are synthesized by dynamic programming for data-base access.&quot; In general, descriptors have the form: (the &lt;path&gt; from &lt;IKBS&gt;) &lt;path&gt; = &lt;slot&gt;...</Paragraph> <Paragraph position="1"> A descriptor can be realized by a relative clause.</Paragraph> <Paragraph position="2"> The same technique of referring to a sub-structure may also be used for focussing. For example, embedding X into (the Purpose from X) expresses that the focus is on X's Purpose slot, which would yield the realization: &quot;Database access using functions that are synthesized by dynamic programming.&quot; A WALK WITH SEMSYN Let us look at the first sentence from an abstract.
Figure 2 contains the Japanese input and the semantic net corresponding to ATLAS/II's analysis.</Paragraph> <Paragraph position="3"> In stage 1, we first examine those semantic symbols which have an attached case schema and instantiate them according to their transformation rules.</Paragraph> <Paragraph position="4"> In this example the WANT and ACHIEVE nodes (flagged by a PRED arc) are case schemata. Applying their transformation rules results in the following IKBS: In stage 2, we will derive a description of how this structure will be realized as German text.</Paragraph> <Paragraph position="5"> First, consider the outer WANT act. There ... passive voice. [Fig. 2: Japanese input for FUJITSU's ATLAS/II system.] Next, we observe that WANT's object is itself an act with several filled roles and could be realized as a clause. One of the choices of WANT fits this situation. Its condition is that there is no Agent and that the Object will be realized as a clause. Its realization schema is an idiomatic phrase named *Es-Part*: &quot;Es ist erwuenscht, dass &lt;CLAUSE&gt;&quot; (&quot;It is wanted that &lt;CLAUSE&gt;&quot;) Now consider the embedded &lt;CLAUSE&gt;. An ACHIEVE act can be realized in German as a clause by the following realization schema: This schema is not particular to ACHIEVE. It is shared by other verbs and will therefore be found via general choices which ACHIEVE inherits.</Paragraph> <Paragraph position="6"> The Agent of ACHIEVE's IKBS maps to the Subject and the Method is realized as an indirect object. Within the scope of the chosen German verb &quot;erreichen&quot; (for &quot;achieve&quot;), a Method role maps into a PP with one of the prepositions &quot;durch&quot;, &quot;mit&quot;, &quot;mittels&quot; (corresponding to &quot;by means of&quot;). This leads to the following IRS: Such an instantiated realization schema (IRS) will be the input of the generation front end that takes care of a syntactically and morphologically correct German surface structure (see Fig.
2).</Paragraph> </Section> </Paper>