<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1004"> <Title>INTERPRETING COMPOUNDS FOR MACHINE TRANSLATION</Title> <Section position="3" start_page="49" end_page="49" type="concl"> <SectionTitle> 3. CONCLUSIONS AND IMPLICATIONS FOR FURTHER RESEARCH </SectionTitle> <Paragraph position="0"> 3.1. Remaining problems The method proposed here has so far led to good translation results. However, the problem lies not only in interpreting a compound, but also in identifying an English word sequence as a compound. For the time being, we use a parsing procedure based on a combination of dependency grammar and categorial grammar. The main parsing difficulty, when dealing with English input, is to decide whether a lexical stem functions as a finite predicate or as a nominal. We try to remove the ambiguity by beginning the parse with a procedure called 'verbfinder', which searches for possible candidates for the predicate function. The function of ambiguous items, such as result, control, etc., may often be identified on the basis of their environment: if the word in question is immediately preceded by a preposition and/or an article, it can easily be identified as a nominal element. The parsing procedure may still be made more efficient by utilizing the results of statistical investigations of the corpus (Steier &amp; Below 1991, Johansson 1993).</Paragraph> <Paragraph position="1"> 3.2. Future plans The advantage of the model outlined here lies in the fact that the general approach to the grammar underlying the translation system may be adapted to different domains without violating any theoretical assumptions. However, the theory alone does not guarantee a high-quality translation. 
The preliminary system outlined above is to be developed and improved along the following lines: (1) statistical methods will be used in order to reduce ambiguities and to discover co-occurrence patterns on the basis of larger corpora; (2) the medical vocabulary will be enlarged by using large computational medical databases (e.g. MEDLINE) and by consulting specialists who are native speakers of the languages involved in the system; (3) the interactive procedures will be evaluated and refined by testing their usefulness in experiments with non-linguists.</Paragraph> <Paragraph position="2"> The results of the corpus investigations and the experiments with translation of abstracts are to be used in a system for automatic abstracting and multilingual abstract generation.</Paragraph> </Section> </Paper>