Proceedings of HLT/EMNLP 2005 Demonstration Abstracts, pages 16–17,
Vancouver, October 2005.
 
Translation Exercise Assistant: 
Automated Generation of Translation Exercises  
for Native-Arabic Speakers Learning English 
 
Jill Burstein 
Educational Testing Service 
Princeton, NJ 08541 
jburstein@ets.org 
Daniel Marcu 
Language Weaver, Inc 
Marina del Rey, CA 90292 
dmarcu@languageweaver.com 
 
1. Introduction 
 
Machine translation has clearly entered into 
the marketplace as a helpful technology. 
Commercial applications are used on the internet 
for automatic translation of web pages and news 
articles. In the business environment, companies 
offer software that performs automatic 
translations of web sites for localization 
purposes, and translations of business 
documents (e.g., memo and e-mails).  With 
regard to education, research using machine 
translation for language learning tools has been 
of interest since the early 1990’s (Anderson, 
1993, Richmond, 1994, and Yasuda, 2004), 
though little has been developed. Very recently, 
Microsoft introduced a product called Writing 
Wizard that uses machine translation to assist 
with business writing for native Chinese 
speakers. To our knowledge, this is currently the 
only deployed education-based tool that uses 
machine translation. 
Currently, all writing-based English 
language learning (ELL) writing-based products 
and services at Educational Testing Service rely 
on e-rater automated essay scoring and the 
Critique writing analysis tool capabilities 
(Burstein, Chodorow, and Leacock, 2004).  In 
trying to build on a portfolio of innovative 
products and services, we have explored using 
machine translation toward the development of 
new ELL-based capabilities. We have developed 
a prototype system for automatically generating 
translation exercises in Arabic --- the 
Translation Exercise Assistant.   
Translation exercises are one kind of task 
that teachers can offer to give students practice 
with specific grammatical structures in English. 
Our hypothesis is that teachers could use such a 
tool to help them create exercises for the 
classroom, homework, or quizzes. The idea 
behind our prototype is a capability that can be 
used either by classroom teachers to help them 
generate sentence-based translation exercises 
from an infinite number of Arabic language texts 
of their choice. The capability might be 
integrated into a larger English language 
learning application. In this latter application, 
these translation exercises could be created by 
classroom teachers for the class or for 
individuals who may need extra help with 
particular grammatical structures in English. 
Another potential use of this system that has 
been discussed is to use it in ESL classrooms in 
the United States, to allow teachers to offer 
exercises in students’ native language, especially 
for students who are competent in their own 
language, but only beginners in English. 
We had two primary goals in mind in 
developing our prototype. First, we wanted to 
evaluate how well the machine translation 
capability itself would work with this 
application.  In other words, how useful were the 
system outputs that are based on the machine 
translations? We also wanted to know to what 
extent this kind of tool facilitated the task of 
creating translation exercise items.  So, how 
much time is involved for a teacher to manually 
create these kinds of items versus using the 
exercise assistant tool to create them? Manually 
creating such an item involves searching through 
numerous reference sources (e.g., paper or web-
based version of newspapers), finding sentences 
with the relevant grammatical structure in the 
source language (Arabic), and then manually 
producing an English translation that can be 
used as an answer key.   
To evaluate these aspects, we implemented a 
graphical user interface that offered our two 
users the ability to create sets of translation 
 
16
 
exercise items for six pre-selected, grammatical 
structures. For each structure the system 
automatically identified and offered a set of 200 
system-selected potential sentences per category. 
For the exercise creation task, we collected 
timing information that told us how long it took 
users to create 3 exercises of 10 sentences each, 
for each category. In addition, users rated a set 
of up to 200 Arabic sentences with regard to if 
they were usable as translation exercise items, so 
that we could gauge the proportion of sentences 
selected by the application. These were the 
sentences that remained in the set of 200 
because they were not selected for an exercise. 
Two teachers participated in the evaluation of 
our prototype. One of the users also did the task 
manually. 
 
2. Translation Exercise Selection 
 
2.1     Data Sets 
 
The source of the data was Arabic English 
Parallel News Part 1 and the Multiple 
Translation Arabic Part 1 corpus from the 
Linguistic Data Consortium.
1
   Across these data 
sets we had access to about 45,000 Arabic 
sentences from Arabic journalistic texts taken 
from Ummah Press Service, Xinhua News and 
the AFP News Service available for this 
research. We used approximately 10,000 of 
these Arabic sentences for system development, 
and selected sentences from the remaining 
Arabic sentences for use with the interface.
2
  
 
2.2 System Description 
 
We used Language Weaver’s
3
 Arabic-to-English 
system to translate the Arabic sentences in the 
data sets. We built a module to find the relevant 
grammatical structures in the English 
translations. This module first passes the English 
                                                 
1
 The LDC reference numbers for these corpora are: 
LDC2004T18 and LDC2003T18. 
2
 To avoid producing sentences with overly 
complicated structures, we applied two constraints to 
the English translation: 1) it contained 20 words or 
less, and 2) it contained only a single sentence.  
3
 See http://www.languageweaver.com. 
 
translation to a part-of-speech tagger that assigns 
a part-of-speech to each word in the sentence. 
Another module identifies regular expressions 
for the relevant part-of-speech sequences in the 
sentences, corresponding to one of these six 
grammatical structures: a) subject-verb 
agreement, b) complex verbs, c) phrasal verbs, 
d) nominal compounds, e) prepositions, and f) 
adjective modifier phrases.  When the 
appropriate pattern was found in the English 
translation, the well-formed Arabic sentence that 
corresponds to that translation is added to the set 
of potential translation exercise sentence 
candidates in the interface.   
 
2.3 Results 
 
The outcome of the evaluation indicated 
that between 98% and 100% of automatically-
generated sentence-based translation items were 
selected by both users as usable for translation 
items.  In addition, the time involved to create 
the exercises using the tool was 2.6 times faster 
than doing the task manually. 
 
References 
 
Anderson, Don D. (1995) “Machine Translation as a 
Tool in Second Language Learning”, CALICO 
Journal 13.1, 68–97.  
 
Burstein, J., Chodorow, M., & Leacock, C. (2004). 
Automated essay evaluation: The Criterion online 
writing service. AI Magazine, 25(3), 27-36. 
 
Johnson, Rod (1993) “MT Technology and 
Computer-Aided Language Learning”, in Sergei 
Nirenburg (ed.) Progress in Machine Translation, 
Amsterdam: IOS and Tokyo: Ohmsha, pages 286–
287. 
 
Richmond, Ian M. (1994) “Doing it backwards: 
Using translation software to teach target-language 
grammaticality”, Computer Assisted Language 
Learning 7, 65–78. 
 
Yasuda, K. Sugaya F., Sumita E, Takezawa T.,  
Kikui G., Yamamoto, S. (2004). Automatic 
Measuring of English Language Proficiency using 
MT Evaluation Technology. Proceedings of e-
Learning workshop, COLING 2004, Geneva, 
Switzerland. 
 
17
