<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1408">
  <Title>Bilingual concordancers and translation memories: A comparative evaluation</Title>
  <Section position="3" start_page="5" end_page="5" type="metho">
    <SectionTitle>
4 Comparative analysis of BCs and TMs
</SectionTitle>
    <Paragraph position="0"> On the surface, it may seem obvious that a translator should choose a TM over a BC, since a TM includes the basic functions of a BC as well as a number of additional features (e.g. automated searching, segment-level matching, fuzzy matching).</Paragraph>
    <Paragraph position="1"> However, a closer look suggests that while TMs may be preferable in some circumstances, there are other situations where a BC may be the better tool. In the following sections, we examine the strengths and weaknesses of BCs and TMs, using ParaConc and Trados as representative examples of these respective categories of tools.</Paragraph>
    <Section position="1" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
4.1 Automation
</SectionTitle>
      <Paragraph position="0"> Automation is an oft-touted advantage of TMs.</Paragraph>
      <Paragraph position="1"> In principle, automating the search feature should speed up the process; however, this may not always be the case. As pointed out by Bedard (1995:28), it is possible to approach automation in one of two ways: 1) an ambitious or high-tech approach, using very sophisticated and highly automated tools, such as TMs, or 2) a more modest or low-tech approach, where the tools (e.g. BCs) are simpler and require more user input.</Paragraph>
      <Paragraph position="2"> In the case of the highly-automated approach, there can be hidden costs. Because the tools are more sophisticated, they may require a greater investment of time and effort in learning how to use them, which may prompt users to ask &quot;What have I got myself into?&quot;. The pre-processing steps (e.g. alignment) may also be more demanding because an automated system depends more heavily on correct alignment. As noted in section 2.2, in the case of Trados, if a translator wishes to ensure that the alignment is absolutely correct in order to prevent misaligned TUs being presented, he must manually verify, and if necessary correct, the alignment - a process that can be extremely labour-intensive if the database is large. In contrast, since the data generated by BCs is designed for consultation by a human user, not a computer, the alignment requirements are somewhat less stringent. A certain number of alignment errors can be tolerated in a BC because the danger of &quot;automatically&quot; retrieving misaligned segments does not exist, and if an error does occur, the translator can simply look to the preceding or following text to find the corresponding segment because a BC does not extract the segment from its surrounding text.</Paragraph>
      <Paragraph position="3"> Because BCs can tolerate a certain margin of error, the translator need not bother to manually verify every alignment segment prior to beginning to use the tool, which can represent a significant time saving.</Paragraph>
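      <Paragraph> The tolerance for alignment errors described above can be illustrated with a minimal sketch. It is not ParaConc's implementation: the corpus, the function name concordance and the window parameter are all hypothetical, and the two French segments are deliberately swapped to simulate a misalignment. Because each hit is shown with its neighbouring target segments, the correct equivalent is still visible to a human reader.

```python
# Hypothetical aligned corpus: source and target segments are paired by index.
SOURCE = [
    "Open the file.",
    "Enter your password.",
    "Click OK to continue.",
]
TARGET = [
    "Ouvrez le fichier.",
    "Cliquez sur OK pour continuer.",  # misaligned: belongs to source segment 2
    "Entrez votre mot de passe.",      # misaligned: belongs to source segment 1
]


def concordance(term, source, target, window=1):
    """Return each source hit with `window` target segments on either side."""
    hits = []
    for i, segment in enumerate(source):
        if term.lower() in segment.lower():
            start = max(0, i - window)
            end = min(len(target), i + window + 1)
            hits.append((segment, target[start:end]))
    return hits


hits = concordance("password", SOURCE, TARGET)
for segment, context in hits:
    print(segment)
    for line in context:
        print("    " + line)
```

      A lookup that returned only the target segment at the same index would present the wrong sentence here, which is the risk the text associates with automatic retrieval of isolated TUs.</Paragraph>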
      <Paragraph position="4"> Another potential drawback of automation is that the system searches for all matches, even in cases where the translator may not need help with a particular passage. For example, if the auto-concordance feature in Trados is activated, it may retrieve and display matches for phrases such as &quot;because of the&quot; or &quot;in order to&quot;, for which an experienced translator is unlikely to need assistance. This can be distracting because the fact that information has been retrieved means that the translator will probably at least have a brief look at what the system has proposed, which takes time and is disruptive to the translation process. And the return on investment is bound to be low for time spent looking at matches for segments for which no translation assistance was required in the first place. In contrast, when working with a BC, the translator initiates the searches and therefore only looks for passages for which he requires help.</Paragraph>
      <Paragraph position="5"> In addition, the fact that many TMs, including Trados, automatically copy and paste fuzzy matches or term matches directly into the target text can sometimes be a hindrance. Depending on the amount of editing required to produce a desirable target segment, it may actually be faster for the translator to type the translation from scratch rather than editing the proposed segment.</Paragraph>
      <Paragraph position="6"> In contrast, a BC does not automatically paste any text directly into the target document, which can be a good thing or a bad thing depending on the quality of the match retrieved.</Paragraph>
      <Paragraph position="7"> A small point, but one that is worth mentioning nonetheless, is that TMs often require a great deal of user-initiated clicking in order to view or use the &quot;automatically&quot; retrieved information. For example, in Trados, when working in interactive mode, the user must click in order to instruct the system to conduct a search for each new segment.</Paragraph>
      <Paragraph position="8"> Once the search has been conducted, only the highest-ranked match is automatically presented to the user, but depending on the translator's needs, this is not necessarily the match that will be the most helpful. There are extra clicks involved in pulling up and viewing additional matches. Lastly, when the auto-concordance feature is activated, if the system does not find any sentence-level matches for the current segment, it automatically opens the concordance window and displays the results; however, in so doing, it makes the concordance window the active window, so the translator has to make a point of clicking back in the target field before starting to type, otherwise the text will be inadvertently written to the search field of the concordance window. It is true that there is also typing and clicking to be done when using a BC, but the point we want to make here is that BCs such as ParaConc do not profess to use automation as a time-saver. Moreover, the lack of automation may actually save time in some cases.</Paragraph>
      <Paragraph position="9"> For example, in ParaConc, all the matches are displayed at once and the user can peruse them at a glance instead of having to click through them.</Paragraph>
      <Paragraph position="10"> Finally, it should be noted that not all features of TMs are in fact automated. In Trados, for example, the termbase that is used to identify term matches must be manually pre-stocked with term records by the translator prior to beginning a translation job.</Paragraph>
      <Paragraph position="11"> However, as pointed out by Arrouart and Bedard (2001:30), when a translator consults a parallel bilingual corpus using a BC, he has at his disposal a sort of &quot;full-text glossary&quot; which, by its very nature, contains countless &quot;term records&quot; that the translator has not yet had the time to formalize.</Paragraph>
      <Paragraph position="12"> Arrouart and Bedard go on to observe that one day, such resources may well supplant carefully managed collections of term records.</Paragraph>
      <Paragraph position="13"> In summary, while less-automated tools such as BCs appear to achieve less, they may be quicker to provide translators with results they can actually use, and they are likely to be more tolerant of unexpected situations. Of course, using such tools may call for a higher level of inventiveness or creativity on the part of the user, but thankfully, these are qualities that translators typically possess.</Paragraph>
    </Section>
    <Section position="2" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
4.2 Search flexibility
</SectionTitle>
      <Paragraph position="0"> It was noted in section 2.1.1 that one of the perceived limitations of BCs is the nature of the searches that can be conducted. Typically, BCs search for occurrences in the corpus that precisely match the search pattern entered by the user. In contrast, TMs can make use of a fuzzy matching technique that can identify patterns that are similar to, but do not precisely match, the source segment.</Paragraph>
      <Paragraph position="1"> However, a fuzzy match is not a panacea. When using fuzzy matching techniques, the translator can set the sensitivity threshold of the match; in other words, the translator can decide how similar the two segments must be in order for a TU to be retrieved and displayed. Setting the appropriate sensitivity threshold can actually be quite tricky: if the threshold is set too high (e.g., 95% similarity), then potentially useful matches may be overlooked and the translator will be forced to do unnecessary independent research. But if it is set too low (e.g., 30% similarity), then irrelevant segments may be erroneously retrieved and the translator will waste time weeding through the non-pertinent data. In addition, as noted in section 2.2, even if a fuzzy match has a high percentage of similarity, it may not be that useful to the translator since the matching is based on surface structure similarities rather than semantic similarities. For instance, the following would be retrieved as a good match in a TM since the two segments strongly resemble each other on the surface, differing by only two characters: File the form. / Fill the dorm.</Paragraph>
      <Paragraph position="2"> In contrast, the following pair would not be retrieved because they are not superficially similar, though they are closely linked semantically: File the form. / He is re-filing those forms.</Paragraph>
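      <Paragraph> The behaviour described for these two pairs can be reproduced with a small sketch of surface-level matching. It uses the character-based ratio from Python's standard difflib module as a stand-in; this is an assumption made for illustration and not the algorithm used by Trados or any other TM. The function names and the 0.75 threshold are likewise hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_lookup(query, memory, threshold):
    """Return stored segments whose similarity to `query` meets the threshold."""
    return [seg for seg in memory if similarity(query, seg) >= threshold]


query = "File the form."
memory = ["Fill the dorm.", "He is re-filing those forms."]

# Print the surface similarity of each stored segment to the query.
for seg in memory:
    print(f"{similarity(query, seg):.2f}  {seg}")

retrieved = fuzzy_lookup(query, memory, threshold=0.75)
print(retrieved)
```

      Under this measure the superficially similar segment clears the threshold while the semantically related one does not. Raising or lowering the threshold argument shows the trade-off discussed above between missing useful matches and retrieving irrelevant ones.</Paragraph>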
      <Paragraph position="3"> A translator who is looking for an equivalent of a given segment would find the translation of a semantically-related segment to be more useful than that of a segment which bears only a superficial resemblance to the source text segment.</Paragraph>
      <Paragraph position="4"> With a BC, a translator could use his own knowledge of semantics to try to formulate more relevant queries, but with a TM, the translator has no input into the search patterns used.</Paragraph>
      <Paragraph position="5"> Moreover, as mentioned in section 2.1, many BCs have developed a number of additional flexible searching techniques which, though still manually initiated, can approximate to some extent the results of a fuzzy match. For example, ParaConc offers the possibility of using operators such as wildcards as part of a search. If used properly, these operators can increase the flexibility of a search (e.g. by finding inflected forms). However, as was the case with fuzzy matching, they can also lead to problems if they are not used rigorously. For instance, in an effort to retrieve examples of all forms of the verb &quot;to enter&quot;, a translator may input a pattern such as &quot;enter*&quot; where the * can be used to represent any string of characters. However, this pattern will also retrieve occurrences of all other words beginning with the string &quot;enter&quot; (e.g., &quot;enterprise&quot;, &quot;entertain&quot;). As a result, the translator may inadvertently be presented with irrelevant data.</Paragraph>
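      <Paragraph> The over-retrieval caused by a loose wildcard can be shown with a short sketch. It relies on Python's fnmatch and re modules, whose pattern syntax is only assumed to resemble that of a concordancer; it is not ParaConc's query language, and the word list is hypothetical.

```python
import fnmatch
import re

words = ["enter", "enters", "entered", "entering", "enterprise", "entertain"]

# Shell-style wildcard: * stands for any string of characters.
broad = [w for w in words if fnmatch.fnmatchcase(w, "enter*")]

# Tighter pattern that lists only the inflectional endings of the verb.
verb_forms = re.compile(r"enter(s|ed|ing)?")
narrow = [w for w in words if verb_forms.fullmatch(w)]

print(broad)
print(narrow)
```

      The broad pattern returns every word in the list, including the two unrelated ones, whereas the narrower pattern keeps only the four verb forms. The price is that the user must know and spell out the endings in advance.</Paragraph>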
      <Paragraph position="6"> The nice thing about working with a BC, however, is that the translator does have control over the search pattern that is entered, so by learning the proper search syntax and by gaining some experience, translators can learn which types of patterns are likely to produce valuable information and which are likely to waste time.</Paragraph>
      <Paragraph position="7"> When working with a TM, however, the translator has no control over the search pattern that is used.</Paragraph>
      <Paragraph position="8"> For example, as mentioned in section 2.1, the parallel search offered by ParaConc allows a translator to limit a search to a given word sense, whereas this cannot be achieved using a TM.</Paragraph>
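      <Paragraph> One way to picture a parallel search is as a filter on both sides of each aligned pair: the source segment must contain the search term, and the target segment must contain a chosen equivalent that signals the intended sense. The sketch below rests on that assumption; the function name, the data and the plain substring matching are hypothetical and are not taken from ParaConc.

```python
# Hypothetical aligned English-French segment pairs.
PAIRS = [
    ("The bank opened a new branch.",
     "La banque a ouvert une nouvelle succursale."),
    ("They walked along the bank of the river.",
     "Ils se promenaient le long de la rive du fleuve."),
    ("The bank sent a letter.",
     "La banque a fait parvenir une lettre."),
]


def parallel_search(pairs, source_term, target_term):
    """Keep pairs whose source contains source_term and whose target contains target_term."""
    return [
        (src, tgt)
        for src, tgt in pairs
        if source_term.lower() in src.lower() and target_term.lower() in tgt.lower()
    ]


river_sense = parallel_search(PAIRS, "bank", "rive")
financial_sense = parallel_search(PAIRS, "bank", "banque")

print(river_sense)
print(financial_sense)
```

      Searching for the source term alone would return all three pairs. Adding the target-language constraint separates the riverbank sense from the financial one, which is the kind of user control the text contrasts with the fixed search pattern of a TM.</Paragraph>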
    </Section>
    <Section position="3" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
4.3 Consistency
</SectionTitle>
      <Paragraph position="0"> Another highly advertised feature of TMs is that they promote consistency in translation. The question that has been raised by some translators, however, is whether this is always desirable.</Paragraph>
      <Paragraph position="1"> Merkel (1998:143) conducted a survey of 13 translators using TMs to carry out the translation of software manuals. One of the questions asked was whether they preferred consistent translations of a given source segment in two different contexts.</Paragraph>
      <Paragraph position="2"> The choice of answer was either &quot;yes&quot; or &quot;no&quot;, with space for the respondent to elaborate on the motivations for his/her choice. Upon examining the completed questionnaires, Merkel noted that &quot;it became apparent that there was a need for a third response, in between 'yes' and 'no', namely a response which we can call 'doesn't matter'. This applies when the translator in the justification for the choice has indicated that the translation could be consistent, but that it would not matter whether the source segment was also translated differently.&quot; This raises an interesting point: in contrast to what many TM vendors would have us believe, while consistency may sometimes be desirable, it may not always be strictly necessary.</Paragraph>
      <Paragraph position="3"> Furthermore, there may even be cases where consistency is not at all appropriate. For instance, the translators consulted as part of Merkel's survey warn that there is a need to evaluate a proposed match within the new context, and that it may not always be automatically acceptable. This is particularly true in the case of different structural contexts (e.g. sentence vs heading vs table cell), where caution should be used in applying consistent translations (Merkel 1998:145).</Paragraph>
    </Section>
    <Section position="4" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
4.4 Other quality-related issues
</SectionTitle>
      <Paragraph position="0"> In addition to the question of consistency, other quality-related issues have been raised by translators working with TMs. One of the most significant, which was briefly introduced in section 2.2, is the fact that TM databases store isolated segment pairs, rather than complete texts. In the words of Arrouart and Bedard (2001:30), a TM is actually a memory of sentences out of context.</Paragraph>
      <Paragraph position="1"> This can be problematic because the sentences in a text generally depend on each other in various ways. For example, when we read/write the third sentence in a text, we can refer back to information already presented in the first two sentences, which means that it is possible to use pronouns, deictic and anaphoric references, etc. However, if we take that third sentence in isolation, it may not be clear what the antecedents of such references are.</Paragraph>
      <Paragraph position="2"> In addition, because languages do not have a one-for-one correspondence or the same stylistic requirements, translators who are trying to convey the overall message of a text may map the information to the sentences in the target text in a way that differs from how that information was originally dispersed among the source text sentences. The result is that even if the two texts are considered to be equivalent when taken as a whole, the sentences in a translation may not depend on each other in precisely the same way in which the source text sentences do (Bedard 2000).</Paragraph>
      <Paragraph position="3"> In order to maximize the &quot;recyclability&quot; of a text, a translator working with a TM may choose to structure the sentences in the target text to match those in the source text, and he may choose to avoid using pronouns or other references.</Paragraph>
      <Paragraph position="4"> According to Heyn (1998:135), the result may be a text that is inherently less coherent or readable, and of a lesser overall quality. Bedard (2000) describes this as a &quot;sentence salad&quot; rather than a text. The sentence salad effect is exacerbated when the sentences in a TM come from a variety of different texts that have been translated by different translators. Each text and translator will have a different style, and when sentences from each are brought together, the resulting text will be a stylistic hodgepodge. It is highly unlikely that the source text has been created in such a fashion (i.e., by asking a variety of authors to contribute individual sentences), so it is questionable whether this approach should be used to produce a translation, which is also a text in and of itself.</Paragraph>
      <Paragraph position="5"> Another quality-related problem is that errors contained in TMs may come back to haunt a translator if the database is not scrupulously maintained in order to correct such errors. Lanctot (2001:30) provides the following account of a translator who carefully stores all his translations in a TM, but who does not update the contents to reflect corrections made by the client to the final document. When the client sends a document that closely resembles one translated the previous year, the translator uses the TM and blithely reproduces the same errors in the new translation. The client is irritated because the same passages that were corrected last year need to be corrected again. This is not the kind of added value the client was looking for.</Paragraph>
      <Paragraph position="6"> It is worth pointing out that a BC will also produce less-than-satisfactory results if the contents of the corpus are not of high quality. The main advantage offered by a BC in this regard is that it is much more straightforward to update the corpus with a corrected text than it is to fix erroneous TUs in a TM.</Paragraph>
    </Section>
    <Section position="5" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
4.5 Translators' attitudes and satisfaction
</SectionTitle>
      <Paragraph position="0"> An important point to consider with regard to any tool is whether or not the intended users enjoy working with it. In the case of TMs, Merkel (1998:140) observes that some translators &quot;fear that translation work will become more tedious and boring, and that some of the creative aspects of the job will disappear with the increasing use of translation memory tools.&quot; Merkel (1998:141) goes on to note that there is concern that a translator who works with a TM may be reduced to somebody who simply has to press the OK button.</Paragraph>
      <Paragraph position="1"> In a similar vein, Bedard (2000) expresses concern that translators may lose motivation when working with a TM because they risk becoming &quot;translators of sentences&quot; rather than &quot;translators of texts&quot;. In order to maximize recyclability when working with a TM, translators are encouraged to translate one source text sentence by one target text sentence. However, as noted in section 4.4, the aim of most translators is not to translate sentences, but rather to translate a message. To do this effectively, translators often need to work outside the artificial boundaries of end-of-sentence markers, and they may therefore feel constrained by the sentence-by-sentence approach imposed by TMs. In contrast, Arrouart and Bedard (2001:30) have observed that when working with a BC, few constraints are imposed by the tool and translators are therefore freer to work as they wish.</Paragraph>
      <Paragraph position="2"> Another difficulty that may be faced by translators working with TMs is that they may be biased by what the system presents. In other words, after a translator has seen a suggestion from the database, it may be difficult to think of another way of expressing that thought, so he may use the suggested translation even if it does not fit very coherently into the text as a whole. When using a BC, however, a translator is more likely to be seeking inspiration for handling a shorter term or expression, rather than a complete segment match, so he is less likely to feel unduly influenced by the overall structure of the sentence contained in the corpus. He is also more likely to find examples of that term used in a variety of ways, so he can pick the usage that is most suitable for integration into the text as a whole. In this way, the translator feels that he is making his own decisions, rather than having someone else's decisions forced upon him.</Paragraph>
      <Paragraph position="3"> The very fact that there are multiple ways to render a given passage in another language may also be a reason why some translators are unhappy about using a TM. Merkel (1998:148) notes that as part of his survey, translators were presented with several different options as translations of a given passage. The choice of &quot;best translation option&quot; varied widely among translators, which leads him to believe that it may be difficult to encourage translators to accept suggestions from TMs.</Paragraph>
      <Paragraph position="4"> A related problem that has to do with different working styles of translators is described by Lanctot (2001:30). When multiple translators are sharing a single TM over a network, it may be that translator A, for example, works by ploughing through a text to complete a full rough draft, and he then goes back over the text a second and third time to clean up any outstanding problems (e.g. terminological, stylistic).</Paragraph>
      <Paragraph position="5"> In contrast, translator B's approach is to go more slowly, doing terminological research and addressing stylistic concerns as he goes along. In Lanctot's scenario, translator B is frustrated by the suggestions proposed by the TM - many of which were produced as part of translator A's first rough draft.</Paragraph>
    </Section>
  </Section>
</Paper>