File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/06/p06-2095_concl.xml
Size: 4,492 bytes
Last Modified: 2025-10-06 13:55:23
<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2095"> <Title>Using comparable corpora to solve problems difficult for human translators</Title> <Section position="6" start_page="744" end_page="745" type="concl"> <SectionTitle> 4 Conclusions and future work </SectionTitle> <Paragraph position="0"> The evaluation results show that the tool succeeds in finding translation equivalents for a range of examples. Moreover, in cases where the problem is genuinely difficult, ASSIST consistently obtains scores around 4 (&quot;minor adaptations needed&quot;). The precision of the tool is low: it suggests 50-100 examples, of which only 2-4 are useful for the current context. However, recall is more relevant than precision here, because translators typically need just one solution for their problem and often have to look through reasonably large lists of dictionary translations and examples to find something suitable for a problematic expression. Even if the list of suggestions contains no immediately suitable translation, it frequently offers a hint for solving the problem in the absence of adequate dictionary information.</Paragraph> <Paragraph position="1"> The current implementation of the model is restricted in several respects. First, the majority of target language constructions mirror the syntactic structure of the source language example. Although the procedure for producing similarity classes imposes no restrictions on POS properties, words in a similarity class tend to follow the POS of the original word, because their contexts of use are similar. Furthermore, dictionaries also tend to translate words using the same POS. This means that the existing method mostly finds NPs for NPs, verb-object pairs for verb-object pairs, etc., even if the most natural translation uses a different syntactic structure, e.g. 
&quot;I like doing X&quot; instead of &quot;I do X gladly&quot; (when translating from German &quot;ich mache X gerne&quot;).</Paragraph> <Paragraph position="2"> Second, suggestions are generated for the query expression independently of the context in which it is used. For instance, the words judicial, military and religious are in the similarity class of political, just as reform is in the similarity class of upheaval. So the following example The plan will protect EC-based investors in Russia from political upheavals damaging their business.</Paragraph> <Paragraph position="3"> creates a list of &quot;possible translations&quot; evoking various reforms and transformations.</Paragraph> <Paragraph position="4"> These issues can be addressed by introducing a model of the semantic context of situation, e.g. 'changes in business practice' as in the example above, or 'unpleasant situation' as in the case of daunting experience. This will allow less restrictive identification of possible translation equivalents, as well as a reduction in suggestions that are irrelevant to the context of the current example. Currently we are working on an option to identify semantic contexts by means of 'semantic signatures' obtained from a broad-coverage semantic parser, such as USAS (Rayson et al., 2004).</Paragraph> <Paragraph position="5"> The semantic tagset used by USAS is a language-independent multi-tier structure with 21 major discourse fields, subdivided into 232 sub-categories (such as I1.1- = Money: lack; A5.1- = Evaluation: bad), which can be used to detect the semantic context. 
Identification of semantically similar situations can also be improved by the use of segment-matching algorithms as employed in Example-Based MT (EBMT) and translation memories (Planas and Furuse, 2000; Carl and Way, 2003).</Paragraph> <Paragraph position="6"> The proposed model is similar to some implementations of statistical machine translation (SMT), which typically uses a parallel corpus for its translation model and then finds the best possible recombination that fits the target language model (Och and Ney, 2003). Just like an MT system, our tool can find translation equivalents for queries that are not explicitly coded as entries in system dictionaries. However, from the user's perspective it resembles a dynamic dictionary or thesaurus: it translates difficult words and phrases, not entire sentences. The main strength of our system is its ability to find translation equivalents for difficult contexts where dictionary solutions do not exist or are questionable or inappropriate.</Paragraph> </Section></Paper>