<?xml version="1.0" standalone="yes"?> <Paper uid="P95-1033"> <Title>An Algorithm for Simultaneously Bracketing Parallel Texts by Aligning Words</Title> <Section position="8" start_page="249" end_page="250" type="evalu"> <SectionTitle> 6 Experiments </SectionTitle> <Paragraph position="0"> Evaluation methodology for bracketing is controversial because of varying perspectives on what the &quot;gold standard&quot; should be. We identify two prototypical positions, and give results for both. One position uses a linguistic evaluation criterion, where accuracy is measured against some theoretic notion of constituent structure. The other position uses a functional evaluation criterion, where the &quot;correctness&quot; of a bracketing depends on its utility with respect to the application task at hand. For example, here we consider a bracket-pair functionally useful if it correctly identifies phrasal translations---especially where the phrases in the two languages are not compositionally derivable solely from obvious word translations. Notice that in contrast, the linguistic evaluation criterion is insensitive to whether the bracketings of the two sentences match each other in any semantic way, as long as the monolingual bracketings in each sentence are correct. In either case, the bracket precision gives the proportion of found brackets that agree with the chosen correctness criterion.</Paragraph> <Paragraph position="1"> All experiments reported in this paper were performed on sentence-pairs from the HKUST English-Chinese Parallel Bilingual Corpus, which consists of governmental transcripts (Wu 1994). The translation lexicon was automatically learned from the same corpus via statistical sentence alignment (Wu 1994) and statistical Chinese word and collocation extraction (Fung &amp; Wu 1994; Wu &amp; Fung 1994), followed by an EM word-translation learning procedure (Wu &amp; Xia 1994). 
The translation lexicon contains an English vocabulary of approximately 6,500 words and a Chinese vocabulary of approximately 5,500 words. The mapping is many-to-many, with an average of 2.25 Chinese translations per English word.</Paragraph> <Paragraph position="2"> The translation accuracy is imperfect (about 86% weighted precision), which turns out to cause many of the bracketing errors.</Paragraph> <Paragraph position="3"> Approximately 2,000 sentence-pairs with both English and Chinese lengths of 30 words or less were extracted from our corpus and bracketed using the algorithm described. Several additional criteria were used to filter out unsuitable sentence-pairs. If the lengths of the pair of sentences differed by more than a 2:1 ratio, the pair was rejected; such a difference usually arises as the result of an earlier error in automatic sentence alignment.</Paragraph> <Paragraph position="4"> Sentences containing more than one word absent from the translation lexicon were also rejected; the bracketing method is not intended to be robust against lexicon inadequacies. We also rejected sentence pairs with fewer than two matching words, since this gives the bracketing algorithm no discriminative leverage; such pairs accounted for less than 2% of the input data. A random sample of the bracketed sentence pairs was then drawn, and the bracket precision was computed under each criterion for correctness. Additional examples are shown in Figure 5.</Paragraph> <Paragraph position="5"> Under the linguistic criterion, the monolingual bracket precision was 80.4% for the English sentences, and 78.4% for the Chinese sentences. 
Of course, monolingual grammar-based bracketing methods can achieve higher precision, but such tools assume grammar resources that may not be available, such as good Chinese grammars.</Paragraph> <Paragraph position="6"> Moreover, if a good monolingual bracketer is available, its output can easily be incorporated in much the same way as punctuation constraints, thereby combining the best of both worlds. Under the functional criterion, the parallel bracket precision was 72.5%, lower than the monolingual precision since brackets can be correct in one language but not the other. Grammar-based bracketing methods cannot directly produce results of a comparable nature.</Paragraph> </Section> </Paper>