Phrasal Cohesion and Statistical Machine Translation
Heidi J. Fox
Brown Laboratory for Linguistic Information Processing
Department of Computer Science
Brown University, Box 1910, Providence, RI 02912
hjf@cs.brown.edu
Abstract
There has been much interest in us-
ing phrasal movement to improve statis-
tical machine translation. We explore
how well phrases cohere across two lan-
guages, specifically English and French,
and examine the particular conditions un-
der which they do not. We demonstrate
that while there are cases where coherence
is poor, there are many regularities which
can be exploited by a statistical machine
translation system. We also compare three
variant syntactic representations to deter-
mine which one has the best properties
with respect to cohesion.
1 Introduction
Statistical machine translation (SMT) seeks to de-
velop mathematical models of the translation pro-
cess whose parameters can be automatically esti-
mated from a parallel corpus. The first work in
SMT, done at IBM (Brown et al., 1993), developed
a noisy-channel model, factoring the translation pro-
cess into two portions: the translation model and the
language model. The translation model captures the
translation of source language words into the target
language and the reordering of those words. The
language model ranks the outputs of the translation
model by how well they adhere to the syntactic con-
straints of the target language.1
The prime deficiency of the IBM model is the re-
ordering component. Even in the most complex of
1Though usually a simple word n-gram model is used for the
language model.
the five IBM models, the reordering operation pays
little attention to context and none at all to higher-
level syntactic structures. Many attempts have been
made to remedy this by incorporating syntactic in-
formation into translation models. These have taken
several different forms, but all share the basic as-
sumption that phrases in one language tend to stay
together (i.e. cohere) during translation and thus the
word-reordering operation can move entire phrases,
rather than moving each word independently.
(Yarowsky et al., 2001) states that during their
work on noun phrase bracketing they found a strong
cohesion among noun phrases, even when compar-
ing English to Czech, a relatively free word or-
der language. Other than this, there is little in the
SMT literature to validate the coherence assump-
tion. Several studies have reported alignment or
translation performance for syntactically augmented
translation models (Wu, 1997; Wang, 1998; Alshawi
et al., 2000; Yamada and Knight, 2001; Jones and
Havrilla, 1998) and these results have been promis-
ing. However, without a focused study of the be-
havior of phrases across languages, we cannot know
how far these models can take us and what specific
pitfalls they face.
The particulars of cohesion will clearly depend
upon the pair of languages being compared. Intu-
itively, we expect that while French and Spanish will
have a high degree of cohesion, French and Japanese
may not. It is also clear that if the cohesion between
two closely related languages is not high enough
to be useful, then there is no hope for these meth-
ods when applied to distantly related languages. For
this reason, we have examined phrasal cohesion for
French and English, two languages which are fairly
close syntactically but have enough differences to be
                                            Association for Computational Linguistics.
                    Language Processing (EMNLP), Philadelphia, July 2002, pp. 304-311.
                         Proceedings of the Conference on Empirical Methods in Natural
interesting.
2 Alignments, Spans and Crossings
An alignment is a mapping between the words in a
string in one language and the translations of those
words in a string in another language. Given an En-
glish string, a0a2a1a4a3a6a5a7a9a8a10a3 a7 a3a12a11a14a13a15a13a15a13a16a3 a5 , and a French
string, a17 a1a19a18a21a20a7 a8a19a18 a7 a18 a11 a13a15a13a15a13a22a18 a20 , an alignment a can
be represented by a23 a1a2a24a25a5a7a26a8a27a24 a7 a24 a11 a13a15a13a15a13a28a24 a5 . Each a24a30a29 is
a set of indices into a0 where a31a33a32 a24a25a29a35a34a12a36a38a37 a31 a37a9a39a40a34a42a41a43a37
a44
a37a46a45 indicates that word
a31 in the French sentence is
aligned with word a44 in the English sentence. a24 a29 a1a48a47
indicates that English word a44 has no corresponding
French word.
Given an alignment a23 and an English phrase cov-
ering words a3a6a29a25a13a15a13a15a13a22a3a28a49 , the span is a pair where the
first element is a50a43a51a53a52a55a54 a24 a29a57a56 a13a15a13a15a13 a56 a24 a49a59a58 and the second el-
ement is a50a43a60a62a61a63a54 a24a64a29 a56 a13a15a13a15a13 a56 a24a62a49 a58 . Thus, the span includes
all words between the two extrema of the alignment,
whether or not they too are part of the translation. If
phrases cohere perfectly across languages, the span
of one phrase will never overlap the span of another.
If two spans do overlap, we call this a crossing.
Figure 1 shows an example of an English parse
along with the alignment between the English and
French words (shown with dotted lines). The En-
glish word “not” is aligned to the two French words
“ne” and “pas” and thus has a span of [1,3]. The
main English verb “change” is aligned to the French
“modifie” and has a span of [2,2]. The two spans
overlap and thus there is a crossing. This definition
is asymmetric (i.e. what is a crossing when mov-
ing from English to French is not guaranteed to be
a crossing when moving from French to English).
However, we only pursue translation direction since
that is the one for which we have parsed data.
3 Experiments
3.1 Data
To calculate spans, we need aligned pairs of En-
glish and French sentences along with parses for the
English sentences. Our aligned data comes from
a corpus described in (Och and Ney, 2000) which
contains 500 sentence pairs randomly selected from
the Canadian Hansard corpus and manually aligned.
The alignments are of two types: sure (S) and pos-
sible (P). S alignments are those which are unam-
situation .[ne not vraiment la0 1 2 3 4 5 6
7
pas][modifie]on change
PRP AUX RB RB VB DT NN .
do not thethey really change status .
NP VP
ADVP NP
VP
S
Figure 1: Alignment Example with Crossing
biguous while P alignments are those which are less
certain. P alignments often appear when a phrase in
one language translates as a unit into a phrase in the
other language (e.g. idioms, free translations, miss-
ing function words) but can also be the result of gen-
uine ambiguity. When two annotators disagree, the
union of the P alignments produced by each anno-
tator is recorded as the P alignment in the corpus.
When an S alignment exists, there will always also
exist a P alignment such that P a65 S. The English sen-
tences were parsed using a state-of-the-art statistical
parser (Charniak, 2000) trained on the University of
Pennsylvania Treebank (Marcus et al., 1993).
3.2 Phrasal Translation Filtering
je invoque le R`eglement
I orderofpointon arise
PRP NNNNDTVBP IN IN
NPNPNP
PP
NP
PP
VP
S
Figure 2: Phrasal Translation Example
Since P alignments often align phrasal transla-
Phrasal Filter Off Phrasal Filter On
Alignment Type S Sa0 P P S Sa0 P P
Head Crossings 0.236 4.790 5.284 0.172 2.772 2.492
Modifier Crossings 0.056 0.880 0.988 0.048 0.516 0.362
Phrasal Translations – – – 0.072 2.382 3.418
Table 1: Average Number of Crossings per Sentence
tions, the number of crossings when P alignments
are used will be artificially inflated. For example, in
Figure 2 note that every pair of English and French
words under the verb phrase is aligned. This will
generate five crossings, one each between the pairs
VBP-PP, IN-NPa1 , NPa11 -PP, NN-DT, and IN-NPa7 .
However, what is really happening is that the whole
verb phrase is first being moved without crossing
anything else and then being translated as a unit. For
our purposes we want to count this example as pro-
ducing zero crossings. To accomplish this, we de-
fined a simple heuristic to detect phrasal translations
so we can filter them if desired.
3.3 Calculating Crossings
After calculating the French spans from the English
parses and alignment information, we counted cross-
ings for all pairs of child constituents in each con-
stituent in the sentence, maintaining separate counts
for those involving the head constituent of the phrase
and for crossings involving modifiers only. We did
this while varying conditions along two axes: align-
ment type and phrasal translation filtering. Recalling
the two different types of alignments, S and P, we
examined three different conditions: S alignments
only, P alignments only, or S alignments where
present falling back to P alignments (Sa0 P). For
each of these conditions, we counted crossings both
with and without using the phrasal translation filter.
For a given alignment type a2 a32a4a3 S, S a0 P,Pa5 , let
a6a8a7
a54a10a9
a7 a56
a9
a11 a58 a1a27a36 if phrases
a9
a7 and
a9
a11 cross each other
and a41 otherwise. Let a11 a7 a54a10a9 a7 a56 a9 a11 a58 a8a19a36 if the phrasal
translation filter is turned off. If the filter is on,
a11
a7
a54a10a9
a7 a56
a9
a11 a58 a1
a12a13
a13
a13
a14
a13
a13
a13a15
a41 if
a9
a7 and
a9
a11 are part
of a phrasal translation
in alignmenta2
a36 otherwise
Then, for a given phrase a9 with head constituent
a16 , modifier constituents
a17 , and child constituents
a6
a1
a17a19a18a20a3
a16
a5 and for a particular alignment type
a2 , the number of head crossings a6a22a21
a7 and modifier
crossings a6 a20a7 can be calculated recursively:
a6 a21
a7
a54a10a9
a58 a1a24a23
a25a27a26a29a28
a6 a21
a30 a54a32a31
a58a34a33 a23
a20
a26a29a35
a6a8a7
a54
a16
a56 a39 a58
a11
a7
a54
a16
a56 a39 a58
a6
a20
a7
a54a10a9
a58 a1 a23
a25a27a26a29a28
a6
a20
a30 a54a32a31
a58a34a33 a23
a36a38a37a40a39
a36a42a41a43a37a40a39a45a44 a36a46a41a48a47a49 a36
a6 a7
a54
a39a24a50 a56 a39 a58
a11
a7
a54
a39a24a50 a56 a39 a58
4 Results
4.1 Average Crossings
Table 1 shows the average number of crossings per
sentence. The table is split into two sections, one
for results when the phrasal filter was used and one
for when it was not. “Alignment Type” refers to
whether we used S, P or Sa0 P as the alignment
data. The “Head Crossings” line shows the results
when comparing the span of the head constituent of
a phrase with the spans of its modifier constituents,
and “Modifier Crossings” refers to the case where
we compare the spans of pairs of modifiers. The
“Phrasal Translations” line shows the average num-
ber of phrasal translations detected per sentence.
For S alignments, the results are quite promising,
with an average of only 0.236 head crossings per
sentence and an even smaller average for modifier
crossings (0.056). However, these results are overly
optimistic since often many words in a sentence will
not have an S alignment at all, such as “coming”,
“in”, and “before” in following example:
le rapport complet sera de ici le automne prochaind´epos´e
the full report will be coming in before the fall
When we use P alignments for these unaligned
words (the Sa0 P case), we get a more meaningful
result. Both types of crossings are much more fre-
quent (4.790 for heads and 0.88 for modifiers) and
phrasal translation filtering has a much larger effect
(reducing head average to 2.772 and modifier aver-
age to 0.516). Phrasal translations account for al-
most half of all crossings, on average. This effect is
even more pronounced in the case where we use P
alignments only. This reinforces the importance of
phrasal translation in the development of any trans-
lation system.
Even after filtering, the number of crossings in
the Sa0 P case is quite large. This is discouraging,
however there are reasons why this result should be
looked on as more of an upper bound than anything
precise. For one thing, there are cases of phrasal
translation which our heuristic fails to recognize, an
example of which is shown in Figure 3. The align-
ment of “explorer” with “this” and “matter” seems
to indicate that the intention of the annotator was to
align the phrase “work this matter out”, as a unit, to
“de explorer la question”. However, possibly due to
an error during the coding of the alignment, “work”
and “out” align with “de” (indicated by the solid
lines) while “this” and “matter” do not. This causes
the phrasal translation heuristic to fail resulting in a
crossing where there should be none.
questionlaexplorerde
VB RPNNDT
work outthis matter
PRTNP
VP
Figure 3: Phrasal Translation Heuristic Failure
Also, due to the annotation guidelines, P align-
ments are not as consistent as would be ideal. Re-
call that in cases of annotator disagreement, the P
alignment is taken to be the union of the P align-
ments of both annotators. Thus, it is possible for
the P alignment to contain two mutually conflict-
ing alignments. These composite alignments will
likely generate crossings even where the alignments
of each individual annotator would not. While re-
flecting genuine ambiguity, an SMT system would
likely pursue only one of the alternatives and only a
portion of the crossings would come into play.
4.2 Percentage Crossings
Our results show a significantly larger number of
head crossings than modifier crossings. One possi-
bility is that this is due to most phrases having a head
and modifier pair to test, while many do not have
multiple modifiers and therefore there are fewer op-
portunities for modifier crossings. Thus, it is infor-
mative to examine how many potential crossings ac-
tually turn out to be crossings. Table 2 provides this
result in the form of the percentage of crossing tests
which result in detection of a crossing.
To calculate this, we kept totals for the number
of head (a0 a21a7 ) and modifier (a0 a20a7 ) crossing tests per-
formed as well as the number of phrasal translations
detected (a0a2a1 a21a7 ). Note that when the phrasal transla-
tion filter is turned on, these totals differ for each of
the different alignment types (S, Sa0 P, and P).
a0
a21
a7
a54a10a9
a58 a1 a23
a25 a26 a28
a0
a21
a7
a54a32a31
a58a34a33 a23
a20
a26 a35
a11
a7
a54
a16
a56 a39 a58
a0
a20
a7
a54a10a9
a58 a1 a23
a25 a26 a28
a0
a20
a7
a54a32a31
a58 a33 a23
a36a34a37 a39
a36a46a41 a37 a39a45a44 a36a42a41a32a47a49 a36
a11
a7
a54
a39a24a50 a56 a39 a58
a0 a54a10a9
a58 a1
a3a5a4
a6
a4
a6a8a7
a33 a23
a25 a26a29a28
a0 a54a32a31
a58
a0a2a1
a21
a7
a54a10a9
a58 a1
a0 a54a10a9
a58a10a9
a0
a21
a7
a54a10a9
a58a10a9
a0
a20
a7
a54a10a9
a58
The percentages are calculated after summing over
all sentences a11 in the corpus:
a12 a21
a7
a1a27a36a15a41 a41a14a13a16a15a18a17
a28a20a19
a21a23a22a25a24a27a26
a15 a17a29a28
a19
a21a30a22a25a24a27a26
a12
a20
a7
a1 a36a15a41 a41a31a13a32a15a18a17
a28
a36
a21a33a22a25a24a27a26
a15 a17a29a28
a36
a21a34a22a35a24a36a26
a12
a1
a21
a7
a1a27a36a15a41 a41a31a13a16a15 a17 a28a38a37
a19
a21 a22a35a24a36a26
a15 a17 a28
a22a35a24a36a26
There are still many more crossings in the Sa0 P
and P alignments than in the S alignments. The S
alignment has 1.58% head crossings while the Sa0 P
and P alignments have 32.16% and 35.47% respec-
tively, with similar relative percentages for modi-
fier crossings. Also as before, half to two-thirds of
crossings in the Sa0 P and P alignments are due to
phrasal translations. More interestingly, we see that
modifier crossings remain significantly less preva-
lent than head crossings (e.g. 14.45% vs. 32.16% for
the Sa0 P case) and that this is true uniformly across
all parameter settings. This indicates that heads are
more intimately involved with their modifiers than
Phrasal Filter Off Phrasal Filter On
Alignment Type S Sa0 P P S Sa0 P P
Head Crossings 1.58% 32.16% 35.47% 1.15% 18.61% 16.73%
Modifier Crossings 0.92% 14.45% 16.23% 0.78% 8.47% 5.94%
Phrasal Translations – – – 0.34% 11.35% 16.29%
Table 2: Percent Crossings per Chance
Cause Count
Ne Pas 13
Modal 9
Adverb 8
Possessive 2
Pronoun 2
Adjective 1
Parser Error 16
Reword 16
Reorder 13
Translation Error 5
Total 86
Table 3: Causes of Head Crossings
modifiers are with each other and therefore are more
likely to be involved in semi-phrasal constructions.
5 Analysis of Causes
Since it is clear that crossings are too prevalent to
ignore, it is informative to try to understand exactly
what constructions give rise to them. To that end, we
examined by hand all of the head crossings produced
using the S alignments with phrasal filtering. Table 3
shows the results of this analysis.
The first thing to note is that by far most of
the crossings do not reflect lack of phrasal cohe-
sion between the two languages. Instead, they are
caused either by errors in the syntactic analysis or
the fact that translation as done by humans is a much
richer process than just replication of the source sen-
tence in another language. Sentences are reworded,
clauses are reordered, and sometimes human trans-
lators even make mistakes.
Errors in syntactic analysis consist mostly of at-
tachment errors. Rewording and reordering ac-
counted for a large number of crossings as well. In
most of the cases of rewording (see Figure 4) or re-
aura de les effets destructifsplus que positifsen fait , elle
there
EX
will
MD
be
AUX
more
JJR
divisiveness
NN
than
IN
positive
JJ NNS
effects
NP NP
PP
ADJP
VP
VP
S
ADJP
,RB
ADVP
indeed ,
Figure 4: Crossing Due to Rewording
encelaprisavonsnous,budgetleavonsnouslorsque pr´epar´e consid´eration
VP
NP
VBN WRB
NP
RBNNSDTAUXPRP NN PRP VBD DT NNIN
NP
VP
S
SBAR
RB
NP WHADVP
PP
ADVP
NP
VP
S
ADVP
we have taken these account when we designed the Budgetmuchveryconsiderations into
Figure 5: Crossing Due to Reordering of Clauses
ordering (see Figure 5) a more “parallel” translation
would also be valid. Thus, while it would be difficult
for a statistical model to learn from these examples,
there is nothing to preclude production of a valid
translation from a system using phrasal movement
in the reordering phase. The rewording and reorder-
ing examples were so varied that we were unable to
find any regularities which might be exploited by a
translation model.
Among the cases which do result from language
differences, the most common is the “ne . . . pas”
construction (e.g. Figure 1). Fifteen percent of the
86 total crossings are due to this construction. Be-
cause “ne . . . pas” wraps around the verb, it will al-
ways result in a crossing. However, the types of syn-
tactic structures (categorized as context-free gram-
mar rules) which are present in cases of negation are
rather restricted. Of the 47 total distinct syntactic
structures which resulted in crossings, only three of
them involved negation. In addition, the crossings
associated with these particular structures were un-
ambiguously caused by negation (i.e. for each struc-
ture, only negation-related crossings were present).
Next most common is the case where the En-
glish contains a modal verb which is aligned with
the main verb in the French. In the example in Fig-
ure 6, “will be” is aligned to “sera” (indicated by
the solid lines) and because of the constituent struc-
ture of the English parse there is a crossing. As with
negation, this type of crossing is quite regular, re-
sulting uniquely from only two different syntactic
structures.
le rapport complet sera de ici le automne prochaind´epos´e
DT NN
the full report will be in before
DT JJ NN MD AUX VBG RB IN
the fallcoming
NPADVPNP
PP
VP
VP
VP
S
Figure 6: Crossing Due to Modal
Adverbs are a third common cause, as they typ-
ically follow the verb in French while preceding it
in English. Figure 7 shows an example where the
span of “simplement” overlaps with the span of the
verb phrase beginning with “tells” (indicated by the
solid lines). Unlike negation and modals, this case
is far less regular. It arises from six different syntac-
tic constructions and two of those constructions are
implicated in other types of crossings as well.
les bontout simplement gens ce est pour euxle gouvernement dit qui`a
simply
RB
that Government themforgoodiswhatpeoplethetells
PRPINJJAUXWPNNSDTVBZDT NN
NP ADVP NP WHNP NP
PP
ADJP
VP
S
SBAR
NP
VP
S
Figure 7: Crossing Due to Adverb
6 Further Experiments
6.1 Flattening Verb Phrases
Many of the causes listed above are related to verb
phrases. In particular, some of the adverb-related
crossings (e.g. Figure 1) and all of the modal-related
crossings (e.g. Figure 6) are artifacts of the nested
verb phrase structure of our parser. This nesting usu-
ally does not provide any extra information beyond
what could be gleaned from word order. Therefore,
we surmised that flattening verb phrases would elim-
inate some types of crossings without reducing the
utility of the parse.
The flattening operation consists of identifying all
nested verb phrases and splicing the children of the
nested phrase into the parent phrase in its place. This
procedure is applied recursively until there are no
nested verb phrases. An example is shown in Fig-
ure 8. Crossings can be calculated as before.
NP
PP
VP
will be before the fallcoming
MD VBG IN DT NNAUX
NP
VP
PP
VP
VP
will be before the fallcoming
MD AUX VBG IN DT NN
Figure 8: Verb Phrase Flattening
Alignment Type S Sa0 P P
Baseline 0.172 2.772 2.492
Flattened VPs 0.136 2.252 1.91
Dependencies 0.078 1.88 1.476
Table 4: Average Head Crossings per Sentence
(Phrasal Filter On)
Alignment Type S Sa0 P P
Baseline 0.048 0.516 0.362
Flattened VPs 0.06 0.86 0.694
Dependencies 0.1 1.498 1.238
Table 5: Average Modifier Crossings per Sentence
(Phrasal Filter On)
Flattening reduces the number of potential head
crossings while increasing the number of potential
modifier crossings. Therefore, we would expect to
see a comparable change to the number of cross-
ings measured, and this is exactly what we find, as
shown in Tables 4 and 5. For example, for Sa0 P
alignments, the average number of head crossings
decreases from 2.772 to 2.252, while the average
number of modifier crossings increases from 0.516
to 0.86. We see similar behavior when we look at the
percentage of crossings per chance (Tables 6 and 7).
For the same alignment type, the percentage of head
crossings decreases from 18.61% to 15.12%, while
the percentage of modifier crossings increases from
8.47% to 10.59%. One thing to note, however, is that
the total number of crossings of both types detected
in the corpus decreases as compared to the baseline,
and thus the benefits to head crossings outweigh the
detriments to modifier crossings.
Alignment Type S Sa0 P P
Baseline 1.15% 18.61% 16.73%
Flattened VPs 0.91% 15.12% 12.82%
Dependencies 0.52% 12.62% 9.91%
Table 6: Percent Head Crossings per Chance
(Phrasal Filter On)
Alignment Type S Sa0 P P
Baseline 0.78% 8.47% 5.94%
Flattened VPs 0.73% 10.59% 8.55%
Dependencies 0.61% 9.22% 7.62%
Table 7: Percent Modifier Crossings per Chance
(Phrasal Filter On)
6.2 Dependencies
Our intuitions about the cohesion of syntactic struc-
tures follow from the notion that translation, as a
meaning-preserving operation, preserves the depen-
dencies between words, and that syntactic structures
encode these dependencies. Therefore, dependency
structures should cohere as well as, or better than,
their corresponding syntactic structures. To exam-
ine the validity of this, we extracted dependency
structures from the parse trees (with flattened verb
phrases) and calculated crossings for them. Figure 9
shows a parse tree and its corresponding dependency
structure.
The procedure for counting modifier crossings in
a dependency structure is identical to the procedure
for parse trees. For head crossings, the only differ-
ence is that rather than comparing spans of two sib-
lings, we compare the spans of a child and its parent.
bewill before thecoming fall the
will be before
fall
coming
VBGAUXMD IN DT NN
NP
PP
VP
Figure 9: Extracting Dependencies
Again focusing on the Sa0 P alignment case, we
see that the average number of head crossings (see
Table 4) continues to decrease compared to the pre-
vious case (from 2.252 to 1.88), and that the aver-
age number of modifier crossings (see Table 5) con-
tinues to increase (from 0.86 to 1.498). This time,
however, the percentages for both types of crossings
(see Tables 6 and 7) decrease relative to the case
of flattened verb phrases (from 15.12% to 12.62%
for heads and from 10.59% to 9.22% for modifiers).
The percentage of modifier crossings is still higher
than in the base case (9.22% vs. 8.47%). Overall,
however, the dependency representation has the best
cohesion properties.
7 Conclusions
We have examined the issue of phrasal cohesion be-
tween English and French and discovered that while
there is less cohesion than we might desire, there is
still a large amount of regularity in the constructions
where breakdowns occur. This reassures us that re-
ordering words by phrasal movement is a reasonable
strategy. Many of the initially daunting number of
crossings were due to non-linguistic reasons, such
as rewording during translation or errors in syntactic
analysis. Among the rest, there are a small number
of syntactic constructions which result in the major-
ity of the crossings examined in our analysis. One
practical result of this skewed distribution is that
one could hope to discover the major problem ar-
eas for a new language pair by manually aligning a
small number of sentences. This information could
be used to filter a training corpus to remove sen-
tences which would cause problems in training the
translation model, or for identifying areas to focus
on when working to improve the model itself. We
are interested in examining different language pairs
as the opportunity arises.
We have also examined the differences in cohe-
sion between Treebank-style parse trees, trees with
flattened verb phrases, and dependency structures.
Our results indicate that the highest degree of co-
hesion is present in dependency structures. There-
fore, in an SMT system which is using some type
of phrasal movement during reordering, dependency
structures should produce better results than raw
parse trees. In the future, we plan to explore this
hypothesis in an actual translation system.
8 Acknowledgments
The work reported here was supported in part by the
Defense Advanced Research Projects Agency under
contract number N66001-00-C-8008. The views and
conclusions contained in this document are those of
the author and should not be interpreted as neces-
sarily representing the official policies, either ex-
pressed or implied, of the Defense Advanced Re-
search Projects Agency or the United States Gov-
ernment.
We would like to thank Franz Och for providing
us with the manually annotated data used in these
experiments.

References
Hiyan Alshawi, Srinivas Bangalore, and Shona Douglas.
2000. Learning dependency translation models as col-
lections of finite-state head transducers. Computa-
tional Linguistics, 26(1):45–60, March.
Peter F. Brown, Stephen A. Della Pietra, Vincent J.
Della Pietra, and Robert L. Mercer. 1993. The math-
ematics of machine translation: Parameter estimation.
Computational Linguistics, 19(2):263–311, June.
Eugene Charniak. 2000. A maximum-entropy-inspired
parser. In Proceedings of the 1st Meeting of the North
American Chapter of the Association for Computa-
tional Linguistics.
Douglas Jones and Rick Havrilla. 1998. Twisted pair
grammar: Support for rapid development of machine
translation for low density languages. In Proceed-
ings of the Conference of the Association for Machine
Translation in the Americas.
Mitchell P. Marcus, Beatrice Santorini, and Mary Ann
Marcinkiewicz. 1993. Building a large annotated cor-
pus of English: The Penn Treebank. Computational
Linguistics, 13(2):313–330, June.
Franz Josef Och and Hermann Ney. 2000. Improved sta-
tistical alignment models. In Proceedings of the 38th
Annual Meeting of the Association for Computational
Linguistics, pages 440–447.
Ye-Yi Wang. 1998. Grammar Inference and Statistical
Machine Translation. Ph.D. thesis, Carnegie Mellon
University.
Dekai Wu. 1997. Stochastic inversion transduction
grammars and bilingual parsing of parallel corpora.
Computational Linguistics, 23(3):377–403, Septem-
ber.
Kenji Yamada and Kevin Knight. 2001. A syntax-based
statistical translation model. In Proceedings of the
39th Annual Meeting of the Association for Compu-
tational Linguistics.
David Yarowsky, Grace Ngai, and Richard Wicentowski.
2001. Inducing multilingual text analysis tools via ro-
bust projection across aligned corpora. In Proceedings
of the ARPA Human Language Technology Workshop,
pages 109–116.
