<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-1028">
  <Title>THE RUMORS SYSTEM OF RUSSIAN SYNTHESIS</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
THE RUMORS SYSTEM OF RUSSIAN SYNTHESIS
</SectionTitle>
    <Paragraph position="0"> Max I. Kanovich, Zoya M. Shalyapina.</Paragraph>
    <Paragraph position="1"> Institute of Oriental Studies, Russian Academy of Sciences, Rozhdestvenka str., 12, 103753 Moscow, Russia</Paragraph>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> The RUMORS synthesizer of Russian is an integral part of the JaRAP experimental system of Japanese-Russian automatic translation, although it can also have other applications. Morphologically, it is based, primarily, on A.A. Zaliznyak's model of Russian inflexion. Syntactical functions of RUMORS rely on word-order and dependency data as input information. The synthesizer is implemented on IBM PC, MS DOS, in Turbo Pascal.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 General information
</SectionTitle>
    <Paragraph position="0"> The RUMORS system of RUssian MORphological and Syntactical synthesis has been developed as part of the JaRAP experimental system of Japanese-Russian automatic translation, described in (Modina, Shalyapina, 1994). Its operation is, however, completely independent of the other components of the JaRAP system, so that RUMORS could be used in any other AT system irrespective of its source language. Basically, RUMORS constitutes a system in its own right, which can also be used for purposes other than translation, e.g., as a computerized reference book of Russian morphology and the simplest phenomena of syntactic government and agreement (for students and teachers of Russian), a part of a spell-checking system, etc.</Paragraph>
    <Paragraph position="1"> The RUMORS synthesizer has two major modes of operation: the QUERY mode and the TASK mode.</Paragraph>
    <Paragraph position="2"> In the QUERY mode of operation, RUMORS accepts, as its input, a separate syntactico-morphological query (entered from the keyboard) which represents the lexeme to be processed and the syntactical and morphological characteristics specifying the word-form to be obtained by the processing. The output is, primarily, the desired word-form of the input lexeme or, if necessary, a periphrastic substitute for this word-form.</Paragraph>
    <Paragraph position="3"> After obtaining this output, the user can switch at will to the FULL PARADIGM submode of the QUERY mode. In this submode, RUMORS generates all synthetic word-forms of the input lexeme (or of the last lexeme processed while obtaining a periphrastic word-combination). It also offers a set of menus allowing the user to modify his initial query by choosing additional morphological categories from these menus.</Paragraph>
    <Paragraph position="4"> In the TASK mode, the input data is a sequence</Paragraph>
    <Paragraph position="5"> of queries fed from the special TASK file. Apart from lexical, morphological, and syntactical information contained in each query, the TASK mode of operation makes considerable use of word order data which is essential, among other things, for processing prepositional, adjectival and noun phrases. The output is the sequence of word-forms manifesting the phrases or sentences specified by the input sequence of queries.</Paragraph>
    <Paragraph position="6"> Both in the QUERY and in the TASK modes, the output is displayed on the screen and written simultaneously in the special SOLVE file. If required by the user, it may also include the alterations made in the queries processed and the database information used in their processing. Inasmuch as the simulation by RUMORS of the linguistic processes involved in Russian synthesis is faithful enough, this auxiliary data could be valuable by itself (e.g., for learning or teaching Russian), aside from its significance for debugging and controlling purposes.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="178" type="metho">
    <SectionTitle>
2 Synthesis functions envisaged by RUMORS
</SectionTitle>
    <Section position="1" start_page="0" end_page="177" type="sub_section">
      <SectionTitle>
2.1 Morphological functions
</SectionTitle>
      <Paragraph position="0"> The morphological functions of RUMORS cover all aspects of Russian inflexion, as well as some semantically basic lexico-morphological relationships.</Paragraph>
      <Paragraph position="1"> The inflexional functions are initiated after the input query has already been subjected to syntactical and lexico-morphological operations, which may have modified its initial form. At this stage, it contains nothing but the lexeme and the inflexional categories specifying its desired word-form. If some categories needed for complete specification of this word-form are not explicitly stated in the query, they are settled by default. E.g., a query containing nothing but the lexeme of a verb is taken to describe the finite form, indicative mood, present tense, active voice, 3rd person singular of this verb, so that the query, say, брать produces the form берет.</Paragraph>
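      <Paragraph>The default-settling step described above can be illustrated with a minimal Python sketch. It is not the Turbo Pascal implementation of RUMORS: the category names, the dictionary layout and the function name fill_defaults are assumptions made for illustration only. Only the default values themselves (finite form, indicative mood, present tense, active voice, 3rd person singular) are taken from the text.</Paragraph>
      <Paragraph><![CDATA[
```python
# Hypothetical category names; only the default values come from the paper.
VERB_DEFAULTS = {
    "finiteness": "finite",
    "mood": "indicative",
    "tense": "present",
    "voice": "active",
    "person": "3",
    "number": "singular",
}


def fill_defaults(query: dict, defaults: dict) -> dict:
    """Return a copy of the query with unspecified categories set by default."""
    completed = dict(defaults)
    completed.update(query)  # categories stated in the query win over defaults
    return completed


# A query containing nothing but the verb lexeme.
bare = {"lexeme": "брать"}
full = fill_defaults(bare, VERB_DEFAULTS)
```
]]></Paragraph>
      <Paragraph>In this sketch a category stated explicitly in the query, such as a past tense, overrides the corresponding default, while the remaining categories are still filled in.</Paragraph>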
      <Paragraph position="2"> The inflexional model of Russian implemented by RUMORS is the one proposed and detailed by Prof.</Paragraph>
      <Paragraph position="3"> A.A. Zaliznyak (1977). Its important virtue is that the generation procedures it envisages represent those to be expected of human speakers of Russian more faithfully than any other known model, while the requisite database information is very compact.</Paragraph>
      <Paragraph position="4"> Our version of Zaliznyak's inflexional model differs from its description in (Zaliznyak, 1977) in two respects. On the one hand, we have reduced the scope of Zaliznyak's original model, implementing it only in so far as written Russian is concerned. As a result, quite a number of the particulars of Russian accentuation registered in (Zaliznyak, 1977), namely, all those that are relevant for oral speech only, have been ignored.</Paragraph>
      <Paragraph position="5"> On the other hand, we have extended the model to cover analytical word-forms. Moreover, we have introduced a new type of morphological functions, the periphrastic functions allowing RUMORS to produce output that makes sense even if the required word-form is non-existent (e.g., due to the lexeme having a defective paradigm or to the combination of categories in the query being beyond the scope of Russian inflexional morphology). E.g., the future tense 1st person singular of the verb победить 'win' (which does not have this form) is paraphrased смогу победить '&lt;I&gt; shall be able to win'.</Paragraph>
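      <Paragraph>The periphrastic fallback can be sketched as follows. This is a toy Python illustration under stated assumptions, not the RUMORS procedure: the FORMS table, the category label "future-1sg", the choice of the auxiliary смочь 'be able' and the function name synthesize are all hypothetical. The only data taken from the text are the missing form of победить and its substitute смогу победить.</Paragraph>
      <Paragraph><![CDATA[
```python
# Hypothetical toy table of synthetic forms; None marks a gap in a
# defective paradigm.
FORMS = {
    ("победить", "future-1sg"): None,   # this form does not exist
    ("смочь", "future-1sg"): "смогу",
}

AUXILIARY = "смочь"  # assumed auxiliary used to build the substitute


def synthesize(lexeme: str, categories: str) -> str:
    """Return the synthetic form, or a periphrastic substitute if it is missing."""
    form = FORMS.get((lexeme, categories))
    if form is not None:
        return form
    # Substitute: the auxiliary in the requested form plus the infinitive,
    # which here coincides with the citation form of the lexeme.
    auxiliary_form = FORMS[(AUXILIARY, categories)]
    return f"{auxiliary_form} {lexeme}"


result = synthesize("победить", "future-1sg")
```
]]></Paragraph>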
      <Paragraph position="6"> The lexico-morphological functions of RUMORS are limited so far to conversion between lexemes having essentially similar semantics, but differing in their part-of-speech or (for verbs only) aspectual characteristics. Thus, the aspectual or part-of-speech markers in the following three queries paaocrn.~arn~:coo, Snarn~:lI cause the lexemes in these queries to be replaced, resp., by the required perfective verb, noun, and adjective: paeemu.aarat, ~l, ne~lue t u$o ecmubtl~I.</Paragraph>
      <Paragraph position="7"> The implementation of aspectual lexico-morphological relations is based, principally, on their description in (Zaliznyak, 1977). For part-of-speech relations, we have adopted, though in a very limited sense, the concept of lexical substitutions (Zholkovskij, Mel'chuk, 1970).</Paragraph>
      <Paragraph position="8"> If the database contains no information necessary for switching to a lexeme of the desired aspect or part of speech, RUMORS resorts to its periphrastic functions or else makes modifications in the query. E.g., the query 'чиновник : 1' aimed at forming the verb corresponding to the noun чиновник 'bureaucrat', will be processed to produce the phrase: Делает то, что характерно для чиновника '&lt;He&gt; is doing what is typical of a bureaucrat'.</Paragraph>
    </Section>
    <Section position="2" start_page="177" end_page="178" type="sub_section">
      <SectionTitle>
2.2 Syntactical functions
</SectionTitle>
      <Paragraph position="0"> RUMORS has two major types of syntactical functions: relational and word-string ones. There is also a third group of prepositional functions.</Paragraph>
      <Paragraph position="1"> Relational functions may be called both in the QUERY and in the TASK mode of operation to modify the input query with regard to the relational references it may include. There may be references to dependency relations, where the node specified by the query acts either as dependent (D-references) or as governor (G-references), and to anaphoric relations (F-references). D- and G-references may contain embedded relational references, so that in the general case each reference present in the input query corresponds to a more or less complex fragment of the dependency and anaphoric structure this query is part of.</Paragraph>
      <Paragraph position="2"> The job of the relational functions is to ensure fulfilment of the requirements for syntactical government and agreement which may be imposed on the word-form specified by the query by the dependency and anaphoric relations this query has references to.</Paragraph>
      <Paragraph position="3"> This involves extracting such requirements from the references in the query, reconciling them with each other (if there are two or more references dictating conflicting requirements), and then modifying the initial query to fit them: choosing the correct preposition or conjunction (the empty one, if needs be) to accompany the goal word-form, and altering, as required, the inflexional and part-of-speech categories within the query. E.g., the query решение R D(зависеть) describing the noun решение 'decision' as the syntactical object of the verb зависеть 'depend' will produce the prepositional combination: от решения '&lt;depend&gt; on &lt;the&gt; decision'.</Paragraph>
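      <Paragraph>How a D-reference might be turned into government requirements can be sketched in Python. The GOVERNMENT table, the key name d_reference and the function apply_d_reference are hypothetical and do not reproduce the RUMORS query format. The sketch covers a single reference only and leaves out the reconciliation of conflicting references.</Paragraph>
      <Paragraph><![CDATA[
```python
# Hypothetical government lexicon: what a governor demands of its object.
GOVERNMENT = {
    "зависеть": {"preposition": "от", "case": "genitive"},
}


def apply_d_reference(query: dict) -> dict:
    """Impose the governor's requirements on a query holding a D-reference."""
    governor = query.get("d_reference")
    if governor is None:
        return dict(query)  # no dependency reference: nothing to impose
    requirements = GOVERNMENT.get(governor, {})
    # Drop the reference itself and record the preposition and case it implies.
    updated = {k: v for k, v in query.items() if k != "d_reference"}
    updated.update(requirements)
    return updated


# The noun 'decision' as the object of the verb 'depend'.
modified = apply_d_reference({"lexeme": "решение", "d_reference": "зависеть"})
```
]]></Paragraph>
      <Paragraph>The modified query now names the preposition от and the genitive case, which the inflexional functions would need in order to produce the word-form after the preposition.</Paragraph>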
      <Paragraph position="4"> Word-string functions are specific to the TASK mode of operation only. Their peculiarity is that they include some analysis-like operations making it possible to locate and process simple prepositional, adjectival and noun phrases, even if the input sequence of queries has no syntactical marking.</Paragraph>
      <Paragraph position="5"> To be more particular, word-string processing consists in examining the queries of the input sequence one by one until the query under examination is found to answer our definition of the end of a word-string. During this examination, each query is checked for information relevant to agreement and prepositional government, and the inflexional and part-of-speech categories pertaining to such information are integrated into a special word-string query (w-query). After the end of the word-string has been located and examined, the w-query obtained is, in standard cases, made common to all of the individual queries within this word-string. Thus, the sequence of morphologically empty queries for lexemes в, весь, наш, галактика 'in, all, our, galaxy' will be processed to produce the prepositional phrase: во всей нашей галактике 'in the whole of our galaxy'.</Paragraph>
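      <Paragraph>The scan-and-share procedure for word-strings can be sketched in Python as below. The paper does not state its definition of the end of a word-string, so the sketch assumes, purely for illustration, that a string ends at the first noun. The LEXICON table, its category names and the function process_word_string are likewise hypothetical.</Paragraph>
      <Paragraph><![CDATA[
```python
# Hypothetical toy lexicon: part of speech of each lexeme and the
# agreement/government categories it contributes to the w-query.
LEXICON = {
    "в": {"pos": "prep", "contributes": {"case": "prepositional"}},
    "весь": {"pos": "adj", "contributes": {}},
    "наш": {"pos": "adj", "contributes": {}},
    "галактика": {
        "pos": "noun",
        "contributes": {"gender": "feminine", "number": "singular"},
    },
}


def process_word_string(queries: list) -> list:
    """Scan queries up to the end of the word-string, then share the w-query."""
    w_query = {}
    word_string = []
    for query in queries:
        entry = LEXICON[query["lexeme"]]
        w_query.update(entry["contributes"])  # integrate relevant categories
        word_string.append(query)
        if entry["pos"] == "noun":  # assumed end-of-string criterion
            break
    # Make the w-query common to every individual query in the word-string.
    return [{**query, **w_query} for query in word_string]


lexemes = ["в", "весь", "наш", "галактика"]
shared = process_word_string([{"lexeme": lexeme} for lexeme in lexemes])
```
]]></Paragraph>
      <Paragraph>After this step each of the four queries carries the prepositional case, feminine gender and singular number, which is the information needed to inflect the two modifiers and the noun in agreement.</Paragraph>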
      <Paragraph position="6"> Some types of word-strings, e.g. those containing cardinal numerals, have to be subjected to more elaborate operations.</Paragraph>
      <Paragraph position="7"> If a query within a word-string contains relational references, the requirements imposed by these are given priority over the requirements extracted by word-string functions, so that the latter provide a sort of default.</Paragraph>
      <Paragraph position="8"> Prepositional functions are employed in both modes of operation, if the word-form or word-string being processed is to be preceded by a preposition.</Paragraph>
      <Paragraph position="9"> Thus, if the preposition in question denotes location, direction or source, the noun it is meant to accompany is checked for having lexical preferences in this respect. This helps to account, e.g., for such idiomatic usages as на улице 'in the street' vs.</Paragraph>
      <Paragraph position="10"> в переулке 'in the side-street'.</Paragraph>
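      <Paragraph>The lexical-preference check can be pictured as a simple lookup, sketched here in Python. The table LOCATION_PREFERENCE, the fallback preposition and the function name location_preposition are assumptions introduced for illustration. Only the two noun and preposition pairings come from the example above.</Paragraph>
      <Paragraph><![CDATA[
```python
# Hypothetical table of nouns with a lexical preference for the
# preposition that expresses location.
LOCATION_PREFERENCE = {
    "улица": "на",     # 'street'
    "переулок": "в",   # 'side-street'
}

# Assumed fallback for nouns that have no recorded preference.
DEFAULT_LOCATION_PREPOSITION = "в"


def location_preposition(noun: str) -> str:
    """Pick the locative preposition, honouring the noun's preference if any."""
    return LOCATION_PREFERENCE.get(noun, DEFAULT_LOCATION_PREPOSITION)


street = location_preposition("улица")
side_street = location_preposition("переулок")
```
]]></Paragraph>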
      <Paragraph position="11"> Other prepositional functions serve to add the prothetic н to personal pronouns after prepositions imposing this requirement, to choose the contextual form of the preposition if it has more than one of them, etc.</Paragraph>
      <Paragraph position="12"> The database has thus been reduced to less than one fifth of (Zaliznyak, 1977), still affording correct morphological processing of all of the 100 000 lexemes listed in (Zaliznyak, 1977).</Paragraph>
      <Paragraph position="13"> Moreover, so far as lexemes with standard morphological characteristics go, they can now be processed correctly, even if they are newly-coined or occasional (and therefore do not have dictionary entries of their own). As (Zaliznyak, 1977) may be trusted to contain all non-standard lexemes, the inflexional and aspectual information in the resulting database very nearly covers the whole of the Russian vocabulary.</Paragraph>
      <Paragraph position="14"> The situation is different with information to be used in part-of-speech conversion and syntactic processing, for it is not provided in (Zaliznyak, 1977). This information is now also being added, but in this respect, the database is far from completed and has as yet only experimental value.</Paragraph>
    </Section>
  </Section>
</Paper>