<?xml version="1.0" standalone="yes"?> <Paper uid="W05-1508"> <Title>Vancouver, October 2005. (c)2005 Association for Computational Linguistics Treebank Transfer</Title> <Section position="5" start_page="80" end_page="81" type="concl"> <SectionTitle> 5 Conclusions </SectionTitle> <Paragraph position="0"> The approach described in this paper was illustrated using very simple examples. The simplicity of the exposition should not obscure the full generality of our approach: it is applicable in the following situations:
* A prior over latent trees is defined in terms of stochastic finite automata.</Paragraph> <Paragraph position="1"> We have described the special case of bigram models, and pointed out how our approach generalizes to higher-order n-gram models.</Paragraph> <Paragraph position="2"> However, priors are not in general constrained to be n-gram models; in fact, any stochastic finite automaton can be employed as a prior, since the intersection of context-free grammars and finite automata is well-defined. That said, the intersection construction that appears to be necessary for sampling from the posterior distribution over latent trees may be rather cumbersome when higher-order n-gram models or more complex finite automata are used as priors.
* The inverse image of an observed tree under the mapping from latent trees to observed trees can be expressed in terms of a finite context-free language, or equivalently, a packed forest.</Paragraph> <Paragraph position="3"> The purpose of Gibbs sampling is to simulate the posterior distribution of the unobserved variables in the model. As the sampling procedure converges, knowledge contained in the informative but structurally weak prior L is effectively passed to the syntactic transfer model Ks. Once the sampling procedure has converged to a stationary distribution, we can run it for as many additional iterations as we want and sample the imputed target-language trees.
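The two ingredients just described, a bigram prior realized as a stochastic finite automaton and post-burn-in sample collection, can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the label set, the transition probabilities, and the helper names (`BIGRAM`, `prior_prob`, `gibbs_labels`) are all invented for exposition, and the latent tree is reduced to a flat sequence of node labels so that the sampler fits in a few lines.

```python
import random

# Hypothetical bigram prior, viewed as a stochastic finite automaton:
# the automaton state is the previous label, and each transition
# carries the conditional probability P(label | previous label).
# These labels and numbers are illustrative only.
BIGRAM = {
    ("BOS", "S"): 0.9, ("BOS", "NP"): 0.1,
    ("S", "NP"): 0.6, ("S", "VP"): 0.4,
    ("NP", "VP"): 0.7, ("NP", "EOS"): 0.3,
    ("VP", "EOS"): 1.0,
}

LABELS = ["S", "NP", "VP"]

def prior_prob(labels):
    """Probability of a full label sequence under the bigram prior."""
    p = 1.0
    prev = "BOS"
    for lab in labels + ["EOS"]:
        p *= BIGRAM.get((prev, lab), 0.0)
        prev = lab
    return p

def gibbs_labels(length, iterations=2000, burn_in=1000):
    """Toy Gibbs sampler: resample each position of a latent label
    sequence conditioned on its neighbours under the bigram prior,
    and collect samples only after the burn-in period, i.e. once the
    chain is assumed to have reached its stationary distribution."""
    seq = [random.choice(LABELS) for _ in range(length)]
    samples = []
    for t in range(iterations):
        for i in range(length):
            prev = "BOS" if i == 0 else seq[i - 1]
            nxt = "EOS" if i == length - 1 else seq[i + 1]
            # Conditional weight of each candidate label given its
            # left and right neighbours under the bigram prior.
            weights = [BIGRAM.get((prev, lab), 0.0) * BIGRAM.get((lab, nxt), 0.0)
                       for lab in LABELS]
            if sum(weights) > 0.0:
                seq[i] = random.choices(LABELS, weights=weights, k=1)[0]
        if t >= burn_in:
            samples.append(tuple(seq))
    return samples
```

In the paper's actual setting the resampled objects are trees drawn from a packed forest rather than flat label sequences, but the control flow is the same: iterate to convergence, then keep drawing and collecting imputed trees for as long as desired.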
Those trees can then be collected in a treebank, thus creating novel syntactically annotated data in the target language, which can be used for further processing in syntax-based NLP tasks.</Paragraph> </Section> </Paper>