<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1054">
  <Title>A fast finite-state relaxation method for enforcing global constraints on sequence decoding</Title>
  <Section position="4" start_page="423" end_page="423" type="intro">
    <SectionTitle>
2 Finite-state constraints
</SectionTitle>
    <Paragraph position="0"> Previous approaches to global sequence labeling--Gibbs sampling, ILP, and reranking--seem motivated by the idea that standard sequence methods are incapable of considering global constraints at all.</Paragraph>
    <Paragraph position="1"> In fact, finite-state automata (FSAs) are powerful enough to express many long-distance constraints.</Paragraph>
    <Paragraph position="2"> Since all finite languages are regular, any constraint over label sequences of bounded length is finitestate. FSAs are more powerful than n-gram models. For example, the regular expression S[?]XS[?]YS[?] matches only sequences of labels that contain an X before a Y. Similarly, the regular expression !(O[?]) requires at least one non-O label; it compiles into the FSA of Figure 1.</Paragraph>
    <Paragraph position="3"> Note that this FSA is in one or the other of its two states according to whether it has encountered a non-O label yet. In general, the current state of an FSA records properties of the label sequence prefix read so far. The FSA needs enough states to keep track of whether the label sequence as a whole satisfies the global constraint in question.</Paragraph>
    <Paragraph position="4"> FSAs are a flexible approach to constraints because they are closed under logical operations such as disjunction (union) and conjunction (intersection). They may be specified by regular expressions (Karttunen et al., 1996), in a logical language (Vaillette, 2004), or directly as FSAs. They may also be weighted to express soft constraints.</Paragraph>
    <Paragraph position="5"> Formally, we pose the decoding problem in terms of an observation sequence x [?] X[?] and possible label sequences y [?] Y[?]. In many NLP tasks, X is the set of words, and Y the tags. A lattice L: Y[?] mapsto- R maps label sequences to weights, and is encoded as a weighted FSA. Constraints are formally the same-any function C: Y[?] mapsto- R is a constraint, including weighted features from a classifier or probabilistic model. In this paper we will consider only constraints that are weighted in particular ways.</Paragraph>
    <Paragraph position="6"> Given a lattice L and constraints C, we seek</Paragraph>
    <Paragraph position="8"> We assume the lattice L is generated by a model M: X[?] mapsto- (Y[?] mapsto- R). For a given observation sequence x, we put L = M(x). One possible model is a finite-state transducer, where M(x) is an FSA found by composing the transducer with x. Another is a CRF, where M(x) is a lattice with sums of logpotentials for arc weights.1</Paragraph>
  </Section>
class="xml-element"></Paper>