<?xml version="1.0" standalone="yes"?> <Paper uid="W06-0403"> <Title>Numbat: Abolishing Privileges when Licensing New Constituents in Constraint-oriented Parsing</Title> <Section position="4" start_page="0" end_page="18" type="metho"> <SectionTitle> 2 Constraint-oriented Approaches </SectionTitle> <Paragraph position="0"> The main feature common to all Constraint-oriented approaches is that parsing is modelled as a Constraint Satisfaction Problem (CSP).</Paragraph> <Section position="1" start_page="0" end_page="17" type="sub_section"> <SectionTitle> Maruyama's Constraint Dependency Grammar </SectionTitle> <Paragraph position="0"> (CDG) (Maruyama, 1990) is the first formalism to model the parsing process as a CSP.</Paragraph> <Paragraph position="1"> Several extensions of CDG have since been proposed (Heinecke et al., 1998; Duchier, 1999; Foth et al., 2004).</Paragraph> <Paragraph position="2"> Menzel and colleagues (Heinecke et al., 1998; Foth et al., 2004) developed a weighted (or &quot;graded&quot;) version of CDG. Their parsing strategies are explored in the context of robust parsing, and are based on an over-generation of candidate solutions. In this approach the CSP is turned into an optimisation problem, where sub-optimal solutions are filtered out according to a function of the weights associated with the violated constraints, and the notion of well-formedness is replaced by one of optimality. Indeed, the over-generation introduces inconsistencies into the constraint system, which prevents the use of the constraint system as a set of well-formedness conditions, since even a well-formed utterance violates a subset of constraints.
Consequently it is not possible to distinguish an optimal structure of an ill-formed utterance from an optimal structure of a well-formed utterance.</Paragraph> <Paragraph position="3"> Duchier (1999) relies on set constraints and selection constraints to axiomatise syntactic well-formedness and provides a concurrent constraint programming account of the parsing process. With the eXtended Dependency Grammar (XDG) (Debusmann et al., 2004) the notion of dependency tree is further extended to a &quot;multi-dimensional&quot; dependency graph, where each dimension (e.g. Immediate Dominance and Linear Precedence) is associated with its own set of well-formedness conditions (called principles). Duchier (2000) sees dependency parsing as a configuration problem, where given a finite set of components (nodes in a graph) and a set of constraints specifying how these components may be connected, the task consists of finding a solution tree.</Paragraph> <Paragraph position="4"> To the best of our knowledge, none of these works on XDG attempts to account for ill-formedness.</Paragraph> <Paragraph position="5"> Property Grammars (PG), introduced by Blache (Blache, 2001; Blache, 2005), step back from Dependency Grammar. Solving the constraint system no longer results in a dependency structure but in a phrase structure, whose granularity may be tailored from a shallow one (i.e. a collection of disconnected components) to a deep one (i.e. a single hierarchical structure of constituents) according to application requirements.
This feature makes the formalism well suited for accounting for both ill-formedness and well-formedness, which is a key requirement for our experimental platform.</Paragraph> <Paragraph position="6"> Introducing degrees of acceptability for an utterance does not mean that it should be done at the expense of well-formedness: we want our model to account for ill-formedness and yet also to be able to recognise and acknowledge when an utterance is well-formed. This requirement rules out Optimality-theoretic frameworks as well as the ones based on Maruyama's CDG.</Paragraph> <Paragraph position="7"> (Although they are referred to with the same name by their respective authors, Duchier's notion of selection constraint is not to be confused with Dahl's selection constraints (Dahl and Blache, 2004); the two notions are significantly different.)</Paragraph> <Paragraph position="8"> (For more about granularity, see VanRullen (2005).)</Paragraph> <Paragraph position="9"> Note that this is not to say that the task could not be achieved in a CDG-based framework; simply, at this stage there is no work based on CDG which combines an account of both well-formedness and optimality. A CO framework based on PG therefore seems best suited for our purpose. Meanwhile, though different parsing strategies have been proposed for PG (Morawietz and Blache, 2002; Balfourier et al., 2002; Dahl and Blache, 2004; VanRullen, 2005), none of these strategies implements the possibility afforded by the theory to rely on any type of constraint in order to license a (possibly ill-formed) constituent.</Paragraph> <Paragraph position="10"> We will see in this paper how the parsing strategy implemented in Numbat overcomes this problem.</Paragraph> </Section> <Section position="2" start_page="17" end_page="18" type="sub_section"> <SectionTitle> 2.1 The Property Grammars Formalism 2.1.1 Terminology </SectionTitle> <Paragraph position="0"> Construction.
In PG a construction can be a lexical item's Part-of-Speech, a phrase, or a top-level construction such as the Caused-motion or Subject-auxiliary Inversion constructions. The notion of construction is similar to the one in Construction Grammar (CxG), as in (Goldberg, 1995), where: Cx is a construction iff Cx is a form-meaning pair ⟨Fi, Si⟩ such that some aspect of Fi or some aspect of Si is not strictly predictable from Cx's component parts or from other previously established constructions.</Paragraph> <Paragraph position="1"> In this paper we focus on syntax only. For us, at the syntactic level, a construction is defined by a form, where a form is specified as a list of properties. When building a traditional phrase structure (i.e. a hierarchical structure of constituents), a construction can simply be seen as a non-terminal. Property. A property is a constraint that models a relationship among constructions. PG pre-defines several types of properties, which are specified according to their semantics. Moreover, the framework allows for new types to be defined.</Paragraph> <Paragraph position="2"> In Numbat, a property type is also called a relation. Section 2.1.2 briefly presents some of the pre-defined property types and their semantics.</Paragraph> <Paragraph position="3"> Assignment. In PG an assignment is a list of constituents. Consider, for example, the three constituents DET, ADJ and N; the following lists are possible assignments: [DET], [ADJ], [DET, ADJ], [ADJ, N], [DET, N], [DET, ADJ, N], etc.</Paragraph> <Paragraph position="4"> Here are some property types pre-defined in PG.</Paragraph> <Paragraph position="5"> See (Blache, 2005) for more types and more detailed definitions.</Paragraph> <Paragraph position="6"> Notation. We note: * K a set of constructions, with {C, C1, C2} ⊆
K; * C a set of constituents, with {c, c1, c2} ⊆ C; * A an assignment; * ind a function such that ind(c, A) is the index of c in A; * cx a function such that cx(c) is the construction of c; * P(C1, C2)[c1, c2, A], or (C1 P C2)[c1, c2, A], the constraint such that the relation P, parametered with (C1, C2), applies to [c1, c2, A].</Paragraph> <Paragraph position="7"> Linear Precedence (≺). By definition, (C1 ≺ C2)[c1, c2, A] holds iff cx(c1) = C1, and cx(c2) = C2, and {c1, c2} ⊆ A, and ind(c2, A) > ind(c1, A).</Paragraph> <Paragraph position="8"> Uniqueness (Uniq). By definition, Uniq(C)[c, A] holds iff cx(c) = C, and c ∈ A, and ∀c′ ∈ A\{c}, cx(c′) ≠ C.</Paragraph> </Section> <Section position="3" start_page="18" end_page="18" type="sub_section"> <SectionTitle> 2.2 Related Problems </SectionTitle> <Paragraph position="0"> CO parsing with PG lies at the intersection of different classes of constraint-related problems, each of which is listed below.</Paragraph> <Paragraph position="1"> Configuration problem. Given a set of components and a set of constraints specifying how these components can be connected, a configuration problem consists of finding a solution tree which connects the components together. Deep parsing with PG is a configuration problem where the components are constituents, and the resulting structure is a phrase structure. By extension, a solution to such a problem is called a configuration. A configuration problem can be modelled with a (static) CSP.</Paragraph> <Paragraph position="2"> Dynamic CSP. In our case the problem is actually dynamic, in that the set of constraints to be solved evolves through the addition of new constraints. As we will see later, new constituents are inferred during the parsing process, and new constraints are subsequently added to the system dynamically.</Paragraph> <Paragraph position="3"> When dealing with deep parsing, i.e.
with well-formedness only, the problem can be tackled as a Dynamic CSP, and solving techniques such as Local Search (Verfaillie and Schiex, 1994) can be applied.</Paragraph> <Paragraph position="4"> Optimisation problem. In order to account for ill-formedness as well as well-formedness, we need to allow constraint relaxation, which turns the problem into an optimisation problem. The expected outcome is thus an optimal configuration with respect to some valuation function. Should the input be well-formed, no constraints are relaxed and the expected outcome is a full parse.</Paragraph> <Paragraph position="5"> Should the input be ill-formed, constraints are relaxed and the expected outcome is either an optimal full parse or a set of (optimal) partial parses.</Paragraph> </Section> </Section> <Section position="5" start_page="18" end_page="21" type="metho"> <SectionTitle> 3 Numbat Architecture </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="18" end_page="20" type="sub_section"> <SectionTitle> 3.1 The Parsing Strategy in Numbat </SectionTitle> <Paragraph position="0"> Relying on a design pattern used in various optimisation techniques, such as dynamic programming, the top-level strategy adopted in Numbat consists of three main steps: 1. splitting the problem into overlapping sub-problems; 2. solving the sub-problems, i.e. building optimal sub-solutions; 3. building an optimal global solution, using the sub-solutions.</Paragraph> <Paragraph position="1"> More specifically, the strategy proceeds by successive generate-and-test: the possible models of the local constraint systems are generated, then their satisfiability is tested against the grammar. The partial solutions are re-injected into the process dynamically, and the basic process is iterated.
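A rough sketch of one such generate-and-test round, in Python with invented names and toy property definitions (Numbat's actual solver is CHR-based, see section 3.1.6):

```python
from itertools import combinations

# Toy sketch of one generate-and-test round (all names invented here).
# A constituent is a (construction, position) pair; an assignment is a
# combination of constituents; properties are predicates on assignments.

def uniq(C):
    """Uniqueness: at most one constituent of construction C."""
    return lambda a: 2 > sum(1 for cx, _ in a if cx == C)

def prec(C1, C2):
    """Linear precedence: every C1 is positioned before every C2."""
    return lambda a: all(j > i for cx1, i in a for cx2, j in a
                         if cx1 == C1 and cx2 == C2)

def generate_and_test(constituents, properties):
    # Generate: every possible assignment over the constituents ...
    assignments = [a for r in range(1, len(constituents) + 1)
                     for a in combinations(constituents, r)]
    # ... then test each local constraint system against the grammar.
    return [a for a in assignments if all(p(a) for p in properties)]

# An ill-formed input with a duplicated determiner: assignments that
# contain both Det constituents violate Uniqueness and are rejected.
words = [("Det", 0), ("Det", 1), ("N", 2)]
good = generate_and_test(words, [prec("Det", "N"), uniq("Det")])
```

In a full iteration the surviving assignments would license candidate constituents, which are then re-injected before the next round.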
Note that the generate-and-test method is not compulsory; it is chosen here because it allows us to conveniently control and then filter the assignments.</Paragraph> <Paragraph position="2"> Given an input utterance, the parsing process iterates the following basic steps: 1. Building Site. Build a set of constituents; 2. Assignation. Build all the possible assignments, i.e. all the possible combinations of one or more constituents; 3. Checkpoint Alpha. Filter out illegal assignments; 4. Appropriation. For every assignment, identify and build all the relevant properties among its elements, which leaves us with a property store, i.e. a constraint system; 5. Checkpoint Bravo. Filter out illegal assignments and irrelevant properties; 6. Satisfaction. Solve the constraint system; 7. Formation. Identify forms of construction, i.e. subsets of properties from the property store, and nominate the corresponding candidate constructions; 8. Polling Booth. Decide which of the candidate constructions are licensed and carried over to the next iteration. The process stops when no new constituent can be built.</Paragraph> <Paragraph position="3"> Each of these steps is defined in the following sections.</Paragraph> <Paragraph position="4"> During the first iteration, the Building Site phase builds one constituent for each Part-of-Speech (POS) associated with an input word. From the second iteration onwards, new constituents are built from the candidate assignments output by the previous round.</Paragraph> <Paragraph position="5"> From one iteration to the next, new assignments are built, involving at least one of the new constituents. These constituents result from the previous iteration. Notice that the number of new assignments created at each iteration grows exponentially with the number of constituents (both the 'old' ones and the new ones). Fortunately, the next step will filter out a large proportion of them.
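To make this growth concrete, a small Python illustration (the figures, not the implementation, are the point): n constituents yield 2^n − 1 non-empty combinations, whereas a contiguity restriction leaves only the n(n+1)/2 non-empty spans:

```python
from itertools import combinations

# Number of candidate assignments over n constituents: all non-empty
# combinations, i.e. 2**n - 1, so exponential growth per iteration.
def n_assignments(n):
    return sum(1 for r in range(1, n + 1)
                 for _ in combinations(range(n), r))

# If only gap-free spans survive (cf. the Contiguity heuristic), the
# count drops to n * (n + 1) // 2, i.e. quadratic rather than exponential.
def n_contiguous(n):
    return n * (n + 1) // 2

growth = [(n, n_assignments(n), n_contiguous(n)) for n in (3, 5, 10)]
```

For ten constituents this is 1023 assignments against only 55 contiguous spans, which is why the filtering step matters.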
This phase of assignation is essential to the process, and distinguishes Numbat from any other parsing strategy for PG. The difference will be made clear in the Satisfaction phase.</Paragraph> <Paragraph position="6"> In Numbat we use a filtering profile to specify which combination of heuristics applies during the parsing process. This feature proves very useful when performing experiments, as it allows an incremental approach: the relative importance of each criterion for gradience can be determined by turning individual heuristics on and off. The heuristics play different roles. They are primarily used to prune the search space as early as possible in the process. Meanwhile, most of them capture language-specific aspects (e.g. Contiguity, see below). These language-specific heuristics are already present in previous works on PG in one form or another. We are working in the same framework and accept these restrictions, which might be relaxed by future work on the formal side.</Paragraph> <Paragraph position="7"> During Checkpoint Alpha the following heuristics may apply.</Paragraph> <Paragraph position="8"> Heuristic 1 (Distinct Constituents) An assignment may contain no pairwise intersecting constituents.</Paragraph> <Paragraph position="9"> That is, no two constituents in an assignment may have any constituent in common. For example, the constituents {DET1, ADJ2} and {ADJ2, NOUN3} may not belong to the same assignment, since they have one constituent (ADJ2) in common.</Paragraph> <Paragraph position="10"> Heuristic 2 (Contiguity) An assignment is a set of contiguous elements.</Paragraph> <Paragraph position="11"> This heuristic rules out crossing-over elements. Although it has little consequence when dealing with languages such as French or English, it may have to be turned off for languages with cross-serial dependencies, such as Dutch.
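Assuming, purely for illustration, that each constituent carries the set of word positions it covers (a representation invented here, not Numbat's), the two heuristics might be sketched as:

```python
# Sketch of Checkpoint Alpha (heuristics 1 and 2); representing each
# constituent by its set of word positions is an assumption made here.

def distinct(assignment):
    """Heuristic 1: no two constituents share any element."""
    seen = set()
    for positions in assignment:
        if seen.intersection(positions):   # overlaps an earlier constituent
            return False
        seen |= positions
    return True

def contiguous(assignment):
    """Heuristic 2: the assignment covers a gap-free span."""
    covered = set().union(*assignment)
    return covered == set(range(min(covered), max(covered) + 1))

def checkpoint_alpha(assignments):
    return [a for a in assignments if distinct(a) and contiguous(a)]

# {DET1, ADJ2} and {ADJ2, NOUN3} overlap on position 2: ruled out.
# {1} plus {3} leaves a gap at position 2: ruled out by contiguity.
kept = checkpoint_alpha([[{1, 2}, {2, 3}], [{1}, {2, 3}], [{1}, {3}]])
```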
If it is turned off, however, an additional problem arises: the semantics of the pre-defined property types must be re-defined. Linear Precedence, for instance, would need to account for the order between two crossing-over phrases, which is not the case with the current definition. On the other hand, notice that long-distance dependencies are not ruled out by heuristic 2, since nested constituents are still legal.</Paragraph> </Section> <Section position="2" start_page="20" end_page="20" type="sub_section"> <SectionTitle> 3.1.4 Appropriation </SectionTitle> <Paragraph position="0"> This step gathers from the grammar all the properties relevant to every assignment. The operation is made easier by pre-processing the grammar at an initialisation step. During this preliminary phase, a lookup table is created for the grammar, in which all the properties are indexed by their operands. Every property is also linked directly to the constructions whose definition it participates in, i.e. the constructions for which the property is a member of the form. This table is actually a hash table, where the keys are the constructions on which the properties hold. For example, the property (Det ≺ Noun) is indexed by the pair of constructions (Det, Noun), and the property ({Pronoun, Adv} ⇎ V) is indexed by the triple of constructions (Pronoun, Adv, V). Thus, given an assignment, i.e. a set of constituents, all we have to do here is retrieve all the relevant properties from the lookup table, using all the (relevant) combinations of constituents as keys.</Paragraph> <Paragraph position="1"> Filters apply here, which aim to further prune the search space. The following heuristics may apply.</Paragraph> <Paragraph position="2"> Heuristic 3 (Full Coverage) Every element of an assignment must be involved in at least one constraint.
That is, for each element in an assignment there must be at least one constraint defined over this element.</Paragraph> <Paragraph position="3"> Example 1 Consider the assignment A = ⟨Det, N, V⟩, and the grammar made up of the following properties:</Paragraph> <Paragraph position="4"/> <Paragraph position="5"> According to heuristic 3, A is ruled out, since the V element is not covered by any constraint, whether we build an NP or a VP.</Paragraph> <Paragraph position="6"> Notice that this heuristic is semantically equivalent to the Constituency property present in early versions of PG. The Constituency property used to specify which types of constituent (i.e. constructions) were legal ones (for a construction).</Paragraph> <Paragraph position="7"> Such a constraint is unnecessary, since the information can be retrieved by simply listing all the types of constituents used in the definitions of properties. In example 1, for instance, the set of legal constituents for the NP construction is [Det, N, Adj].</Paragraph> <Paragraph position="8"> A main reason for dealing with constituency as a filter rather than as a constraint is to improve efficiency by reducing the number of constraints in the system. Indeed, a filter aims to rule out constraints, which are subsequently removed from the constraint system. If dealt with as a constraint itself, Constituency would only make the constraint system more complex.</Paragraph> <Paragraph position="9"> Heuristic 3 raises the issue of ruling out assignments with &quot;free&quot; constituents, i.e. constituents which are not connected to the rest of the assignment. Such a situation may occur, for example, in the case of an unknown word, either because it is absent from the lexicon or because it is misspelled. We choose to leave it up to the grammar writer to design their own ad hoc solutions for handling such cases.
It may be done, for instance, through the definition of a &quot;wildcard construction&quot;, and perhaps also a &quot;wildcard property type&quot;, which will be used appropriately in the grammar.</Paragraph> </Section> <Section position="3" start_page="20" end_page="21" type="sub_section"> <SectionTitle> 3.1.6 Satisfaction </SectionTitle> <Paragraph position="0"> At this stage, only legal assignments and relevant properties are kept in the system. All the information required for evaluating the properties is thus available, and all we have to do now is solve the constraint system.</Paragraph> <Paragraph position="1"> The solver we use is implemented in Constraint Handling Rules (CHR) (Frühwirth, 1994). Unlike other CHR implementations of PG (Morawietz and Blache, 2002; Dahl and Blache, 2004), where the semantics of the property types are encoded in the handlers (and therefore each type of property requires a different handler), the approach we have adopted allows us to externalise the semantics and to generalise property evaluation with one single handler. The algorithm underlying this handler can be expressed as follows:

for each (list of n constituents, assignment, property):
    if the list of n constituents and the assignment match the property's:
        if the property is satisfied:
            tick the property as SATISFIED
        else:
            tick the property as VIOLATED

The CHR handler takes the following form:</Paragraph> <Paragraph position="3"/> </Section> </Section> <Section position="6" start_page="21" end_page="22" type="metho"> <SectionTitle> 3.1.7 Formation </SectionTitle> <Paragraph position="0"> This phase is concerned with identifying the constructions in the grammar which can be triggered (i.e. licensed) by the properties present in the property store. A construction is triggered by any of the properties which are used to define it.
This task can be performed easily by accessing the triggered constructions directly in the lookup table (see section 3.1.4), using a property's operands as the key. The constructions which are triggered are called target constructions. We then build a constituent for each of these target constructions. Such a constituent is called a candidate constituent.</Paragraph> <Paragraph position="1"> This phase basically builds constituent structures. During the next iteration these candidates may in turn be used as constituents. The process thus accounts for recursive structures as well as non-recursive ones. Meanwhile, it is interesting to emphasise that building such a constituent structure is not necessary when parsing with PG. We could, for instance, deal with the whole sentence at once as a sequence of word order constraints. This way no constituent structure would be needed to license infinite sets of strings. In this case, the efficiency of such a process is something that has been worked on extensively within the CSP field. What we are contributing is merely a representation and translation to CSP, which allows us to take advantage of the efficiencies that decades of other work have produced.</Paragraph> <Paragraph position="4"> Monotonic and Non-monotonic Constraints.</Paragraph> <Paragraph position="5"> The notions of Selection Constraint in (Dahl and Blache, 2004) and of non-Lacunar Constraint in (VanRullen, 2005) are equivalent, and denote a class of constraint types whose semantics is monotonic, in that their satisfiability does not change when new elements are added to the assignment. Constraint types such as Linear Precedence or Obligation, for example, are monotonic.</Paragraph> <Paragraph position="6"> On the other hand, the constraint Uniq(C)[c,A] (see 2.1.2), for example, is non-monotonic: if the contextual assignment A grows--i.e.
if new constituents are added to it--the constraint needs to be re-evaluated. In parsing strategies where the assignments are built dynamically by successive additions of new constituents, the evaluation of the relevant constraints is performed on the fly, which means that the non-monotonic constraints need to be re-evaluated every time the assignment grows.</Paragraph> <Paragraph position="7"> This problem is tackled in different ways, depending on the implementation. But we observe that in all cases the decision to trigger new candidate constituents relies only on the evaluation of the monotonic constraints; the decision process usually simply ignores the non-monotonic ones. Numbat, by fixing the assignments prior to evaluating the local constraint systems, includes both the monotonic and the non-monotonic constraints in the licensing process (i.e. in the Formation phase).</Paragraph> <Paragraph position="8"> The Polling Booth phase is concerned with the election process, which leads to choosing the candidates that will make it to the next iteration.</Paragraph> <Paragraph position="9"> The following heuristics may apply.</Paragraph> <Paragraph position="10"> Heuristic 4 (Minimum Satisfaction) An assignment is valid only if at least one constraint holds on any of its constituents.</Paragraph> <Paragraph position="11"> Notice that in all other implementations of PG this heuristic is much more restrictive and requires that a monotonic constraint hold.</Paragraph> <Paragraph position="12"> Heuristic 5 (Full Input Span) A valid (partial or final) solution to the parsing problem is either a single constituent which spans exactly the input utterance, or a combination of constituents (i.e. a combination of partial parses) which spans exactly the input utterance.</Paragraph> <Paragraph position="14"> In theory, we want the Polling Booth to build all the candidate constituents we have identified, and re-inject them into the system for new
iterations. In practice, different strategies may apply in order to prune the search space, such as strategies based on a ranking function. In our case, every iteration of the parsing process only propagates one valid combination of constituents to the next iteration (e.g. the best one according to a valuation function). Such a strategy effectively amounts to always providing the main process with a &quot;disambiguated&quot; set of input constituents from one iteration to another. This heuristic may also be used as a termination rule.</Paragraph> <Paragraph position="15"> A question then arises regarding the relaxation policy: do all constraint types carry the same importance with respect to relaxation? This question addresses the relative importance of different constraint types with respect to acceptability. Does, for instance, the violation of a Linear Precedence constraint between a Determiner and a Noun in a Noun Phrase have the same impact on the overall acceptability of the Noun Phrase as the violation of Uniqueness of the Noun (still within a Noun Phrase)? From a linguistic point of view, the answer to that question is not straightforward, and requires a number of empirical studies. Some work has been carried out (Gibson, 2000; Keller, 2000) which aims to provide elements of an answer in very targeted syntactic contexts.</Paragraph> <Paragraph position="16"> The impact that the relaxation of different constraint types has on acceptability should not be biased by a particular parsing strategy. Thus, the framework provides the linguist (and the grammar writer) with maximum flexibility when it comes to deciding the cost of relaxing different types of constraint on acceptability, since any type may be relaxed. Intuitively, one can clearly relax (in French) a constraint of Agreement in gender between determiner and noun; on the other hand, one could not as easily relax constraints of type Obligation, which are often used to specify heads.
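One way to leave that choice in the grammar writer's hands is a per-type cost table feeding a valuation function; the constraint types and weights below are placeholders invented for this sketch, not values proposed by this work:

```python
# Hypothetical relaxation costs per constraint type; the figures are
# placeholders to be calibrated empirically, not values from the paper.
COSTS = {
    "agreement": 1.0,            # e.g. gender agreement: cheap to relax
    "linear_precedence": 2.0,
    "uniqueness": 3.0,
    "obligation": float("inf"),  # heads: effectively non-relaxable
}

def valuation(violated_types):
    """Cost of a candidate parse, given its violated constraint types."""
    return sum(COSTS[t] for t in violated_types)

# A parse violating only agreement outranks one missing an obligatory head.
```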
A complete breakdown of constraints into relaxable and non-relaxable is future work. Ultimately, however, the parser simply produces sets of satisfied and violated constraints, regardless of how important they are.</Paragraph> <Paragraph position="17"> There will then be a separate process for predicting gradience, in which the relative importance of particular constraints in determining acceptability will be decided experimentally.</Paragraph> </Section> </Paper>