File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/95/w95-0106_abstr.xml

Size: 1,130 bytes

Last Modified: 2025-10-06 13:48:29

<?xml version="1.0" standalone="yes"?>
<Paper uid="W95-0106">
  <Title>Trainable Coarse Bilingual Grammars for Parallel Text Bracketing</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> We describe two new strategies for automatic bracketing of parallel corpora, with particular application to languages where prior grammar resources are scarce: (1) coarse bilingual grammars, and (2) unsupervised training of such grammars via EM (expectation-maximization). Both methods build upon stochastic inversion transduction grammars, a formalism we recently introduced. The first approach borrows a coarse monolingual grammar into our bilingual formalism, in order to transfer knowledge of one language's constraints to the task of bracketing the texts in both languages. The second approach generalizes the inside-outside algorithm to adjust the grammar parameters so as to improve the likelihood of a training corpus. Preliminary experiments on parallel English-Chinese text support these strategies.</Paragraph>
  </Section>
</Paper>