<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1026">
  <Title>A Relational Syntax-Semantics Interface Based on Dependency Grammar</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Extensible Dependency Grammar
</SectionTitle>
    <Paragraph position="0"> This section presents Extensible Dependency Grammar (XDG), a description-based formalism for dependency grammar. XDG generalizes previous work on Topological Dependency Grammar (Duchier and Debusmann, 2001), which focussed on word order phenomena in German.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 XDG in a Nutshell
</SectionTitle>
      <Paragraph position="0"> XDG is a description language over finite labelled graphs. It is able to talk about two kinds of constraints on these structures: The lexicon of an XDG grammar describes properties local to individual nodes, such as valency. The grammar's principles express constraints global to the graph as a whole, such as treeness. Well-formed analyses are graphs that satisfy all constraints.</Paragraph>
      <Paragraph position="1"> An XDG grammar allows the characterisation of linguistic structure along several dimensions of description. Each dimension contains a separate graph, but all these graphs share the same set of nodes. Lexicon entries synchronise dimensions by specifying the properties of a node on all dimensions at once. Principles can either apply to a single dimension (one-dimensional), or constrain the relation of several dimensions (multi-dimensional).</Paragraph>
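      <Paragraph> The shared-node design can be sketched as follows (a minimal Python illustration with invented names; the actual system is implemented in Mozart/Oz, cf. Section 2.3):

        # Hypothetical encoding, for exposition only: one node set shared by
        # all dimensions, one labelled edge set per dimension.
        analysis = {
            "nodes": [1, 2, 3],                       # one node per word
            "ID": {(2, 1, "subj"), (2, 3, "vbse")},   # edges: (head, dependent, label)
            "LP": {(2, 1, "sf"), (2, 3, "vcf")},      # same nodes, different graph
        }
      </Paragraph>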
      <Paragraph position="2"> Consider the example in Fig. 1, which shows an analysis for a sentence of English along two dimensions of description, immediate dominance (ID) and linear precedence (LP). The principles of the underlying grammar require both dimensions to be trees, and the LP tree to be a "flattened" version of the ID tree, in the sense that whenever a node v is a transitive successor of a node u in the LP tree, it must also be a transitive successor of u in the ID tree. The given lexicon specifies the potential incoming and required outgoing edges for each word on both dimensions. The word does, for example, accepts no incoming edges on either dimension and must therefore be at the root of both the ID and the LP tree. It is required to have outgoing edges to a subject (subj) and a verb base form (vbse) in the ID tree, needs fillers for a subject (sf) and a verb complement field (vcf) in the LP tree, and offers an optional field for topicalised material (tf). All these constraints are satisfied by the analysis, which is thus well-formed.</Paragraph>
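      <Paragraph> The valency specification for does described above could be encoded like this (a hedged sketch anticipating the !/? notation of Section 3.2; the feature names are invented):

        # Lexicon entry for "does": no incoming edges on either dimension;
        # required (!) and optional (?) outgoing edges per dimension.
        lex_does = {
            "ID": {"in": set(), "out": {("subj", "!"), ("vbse", "!")}},
            "LP": {"in": set(), "out": {("sf", "!"), ("vcf", "!"), ("tf", "?")}},
        }
      </Paragraph>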
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Formalisation
</SectionTitle>
      <Paragraph position="0"> Formally, an XDG grammar is built up of dimensions, principles, and a lexicon, and characterises a set of well-formed analyses.</Paragraph>
      <Paragraph position="1"> A dimension is a tuple D = (Lab, Fea, Val, Pri) of a set Lab of edge labels, a set Fea of features, a set Val of feature values, and a set Pri of one-dimensional principles.</Paragraph>
      <Paragraph position="3"> A D-structure, or analysis on dimension D, is a triple (V, E, F) of a set V of nodes, a set E ⊆ V × V × Lab of directed labelled edges, and an assignment F : V → (Fea → Val) of lexical entries to nodes. V and E form a graph. We write Str_D for the set of all possible D-structures.</Paragraph>
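      <Paragraph> In code, a D-structure is just such a triple (again an illustrative sketch, not the paper's implementation):

        # A D-structure (V, E, F): E is a subset of V x V x Lab, and F assigns
        # a feature record (Fea to Val) to every node.
        V = {1, 2, 3}
        E = {(2, 1, "subj"), (2, 3, "vbse")}
        F = {v: {"in": set(), "out": set()} for v in V}
        d_structure = (V, E, F)
      </Paragraph>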
      <Paragraph position="4"> The principles characterise subsets of StrD that have further dimension-specific properties, such as being a tree, satisfying assigned valencies, etc. We assume that the elements of Pri are finite representations of such subsets, but do not go into details here; some examples are shown in Section 3.2.</Paragraph>
      <Paragraph position="5"> An XDG grammar ((Lab_i, Fea_i, Val_i, Pri_i)_{i=1..n}, Pri, Lex) consists of n dimensions, multi-dimensional principles Pri, and a lexicon Lex. An XDG analysis (V, E_i, F_i)_{i=1..n} is an element of Ana = Str_1 × ··· × Str_n where all dimensions share the same set of nodes V.</Paragraph>
      <Paragraph position="6"> Multi-dimensional principles work just like one-dimensional principles, except that they specify subsets of Ana, i.e. couplings between dimensions (e.g. the flattening principle between ID and LP in Section 2.1). The lexicon Lex ⊆ Lex_1 × ··· × Lex_n constrains all dimensions at once. An XDG analysis is licensed by Lex iff (F_1(w), ..., F_n(w)) ∈ Lex for every node w ∈ V.</Paragraph>
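      <Paragraph> The licensing condition is directly executable (a sketch; lexical entries must be hashable here, e.g. tuples):

        def licensed(nodes, F, lexicon):
            # F[d][w]: lexical entry of node w on dimension d; lexicon is the
            # set Lex of entry tuples with one component per dimension.
            dims = sorted(F)
            return all(tuple(F[d][w] for d in dims) in lexicon for w in nodes)
      </Paragraph>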
      <Paragraph position="7"> In order to compute analyses for a given input, we model it as a set of input constraints (Inp), which again specify a subset of Ana. The parsing problem for XDG is then to find elements of Ana that are licensed by Lex and consistent with Inp and Pri. Note that the term "parsing problem" is traditionally used only for inputs that are sequences of words, but we can easily represent surface realisation as a "parsing" problem in which Inp specifies a semantic dimension; in this case, a "parser" would compute analyses that contain syntactic dimensions from which we can read off a surface sentence.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 Constraint Solver
</SectionTitle>
      <Paragraph position="0"> The parsing problem of XDG has a natural reading as a constraint satisfaction problem (CSP) (Apt, 2003) on finite sets of integers; well-formed analyses correspond to the solutions of this problem.</Paragraph>
      <Paragraph position="1"> The transformation, whose details we omit due to lack of space, closely follows previous work on axiomatising dependency parsing (Duchier, 2003) and includes the use of the selection constraint to efficiently handle lexical ambiguity.</Paragraph>
      <Paragraph position="2"> We have implemented a constraint solver for this CSP using the Mozart/Oz programming system (Smolka, 1995; Mozart Consortium, 2004). This solver searches for a satisfying variable assignment. After each case distinction (distribution), it performs simple inferences that restrict the ranges of the finite set variables and thus reduce the size of the search tree (propagation). The successful leaves of the search tree correspond to XDG analyses, whereas the inner nodes correspond to partial analyses. In these cases, the current constraints are too weak to specify a complete analysis, but they already express that some edges or feature values must be present, and that others are excluded. Partial analyses will play an important role in Section 3.3.</Paragraph>
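      <Paragraph> The propagate-distribute loop can be sketched as follows (a sequential Python simplification of the concurrent Mozart/Oz solver; propagate, branch, and is_total are assumed, problem-specific procedures):

        import copy

        class Failure(Exception):
            """Raised when propagation empties a variable's domain."""

        def solve(state, propagate, branch, is_total):
            try:
                propagate(state)             # propagation: narrow finite-set domains
            except Failure:
                return                       # failed branch of the search tree
            if is_total(state):
                yield state                  # successful leaf: a complete analysis
                return
            for alt in branch(state):        # distribution: one case distinction
                # inner nodes of the search tree are partial analyses
                yield from solve(copy.deepcopy(alt), propagate, branch, is_total)
      </Paragraph>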
      <Paragraph position="3"> Because propagation operates on all dimensions concurrently, the constraint solver can frequently infer information about one dimension from information on another, if there is a multi-dimensional principle linking the two dimensions. These inferences take place while the constraint problem is being solved, and they can often be drawn before the solver commits to any single solution.</Paragraph>
      <Paragraph position="4"> Because XDG allows us to write grammars with completely free word order, XDG solving is an NP-complete problem (Koller and Striegnitz, 2002).</Paragraph>
      <Paragraph position="5"> This means that the worst-case complexity of the solver is exponential, but the average-case complexity for the hand-crafted grammars we experimented with is often better than this result suggests. We hope there are useful fragments of XDG that would guarantee polynomial worst-case complexity.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 A Relational Syntax-Semantics Interface
</SectionTitle>
    <Paragraph position="0"> Now that we have the formal and processing frameworks in place, we can define a relational syntax-semantics interface for XDG. We will first show how we encode semantics within the XDG framework. Then we will present an example grammar (including some principle definitions), and finally go through an example that shows how the relationality of the interface, combined with the concurrency of the constraint solver, supports the flow of information between different dimensions.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Representing Meaning
</SectionTitle>
      <Paragraph position="0"> We represent meaning within XDG on two dimensions: one for predicate-argument structure (PA) and one for scope (SC). The function of the PA dimension is to abstract over syntactic idiosyncrasies such as active-passive alternations or dative shifts, and to make certain semantic dependencies explicit, e.g. in control constructions; it deals with concepts such as agent and patient, rather than subject and object. The purpose of the SC dimension is to reflect the structure of a logical formula that would represent the semantics, in terms of scope and restriction. We will make this connection explicit in Section 4.</Paragraph>
      <Paragraph position="1"> In addition, we assume an ID dimension as above.</Paragraph>
      <Paragraph position="2"> We do not include an LP dimension only for ease of presentation; it could be added completely orthogonally to the three dimensions we consider here. While one ID structure will typically correspond to one PA structure, each PA structure will typically be consistent with multiple SC structures, because of scope ambiguities. For instance, Fig. 2 shows the unique ID and PA structures for the sentence "Every student reads a book." These structures (and the input sentence) are consistent with the two possible SC-structures shown in (iii). Assuming a Davidsonian event semantics, the two SC trees (together with the PA-structure) represent the two readings of the sentence:</Paragraph>
      <Paragraph position="4"/>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 A Grammar for a Fragment of English
</SectionTitle>
      <Paragraph position="0"> The lexicon for an XDG grammar for a small fragment of English using the ID, PA, and SC dimensions is shown in Fig. 3. Each row in the table specifies a (unique) lexical entry for each part of speech (determiner, common noun, proper noun, transitive verb and preposition); there is no lexical ambiguity in this grammar. Each column specifies a feature. The meaning of the features will be explained together  with the principles that use them.</Paragraph>
      <Paragraph position="1"> The ID dimension uses the edge labels Lab_ID = {det, subj, obj, prep, pcomp}, resp. for determined common noun,1 subject, object, preposition, and complement of a preposition. The PA dimension uses Lab_PA = {ag, pat, arg, quant, mod, instr}, resp. for agent, patient, argument of a modifier, common noun pertaining to a quantifier, modifier, and instrument; and SC uses Lab_SC = {r, s, a}, resp. for restriction and scope of a quantifier, and for an argument. The grammar also contains three one-dimensional principles (tree, dag, and valency), and three multi-dimensional principles (linking, co-dominance, and contra-dominance).</Paragraph>
      <Paragraph position="3"> Tree and dag principles. The tree principle restricts ID and SC structures to be trees, and the dag principle restricts PA structures to be directed acyclic graphs.</Paragraph>
      <Paragraph position="4"> Valency principle. The valency principle, which we use on all dimensions, states that the incoming and outgoing edges of each node must obey the specifications of the in and out features. The possible values for each feature in_d and out_d are subsets of Lab_d × {!, ?, *}: ℓ! specifies a mandatory edge with label ℓ, ℓ? an optional one, and ℓ* zero or more. Linking principle. The linking principle for dimensions d1, d2 constrains how dependents on d1 may be realised on d2. It assumes a feature link_{d1,d2} whose values are functions that map labels from Lab_{d1} to sets of labels from Lab_{d2}, and is specified by the following implication: v −ℓ→ v′ in d1 ⇒ ∃ℓ′ ∈ link_{d1,d2}(v)(ℓ) : v −ℓ′→ v′ in d2. Our grammar uses this principle with the link feature to constrain the realisations of PA-dependents in the ID dimension. In Fig. 2, the agent (ag) of reads must be realised as the subject (subj), i.e.</Paragraph>
      <Paragraph position="5"> 1We assume on all dimensions that determiners are the heads of common nouns. This makes for a simpler relationship between the syntactic and semantic dimensions.</Paragraph>
      <Paragraph position="6"> reads −ag→ every in PA ⇒ reads −subj→ every in ID. Similarly for the patient and the object. There is no instrument dependent in the example, so this part of the link feature is not used. An ergative verb would use a link feature where the subject realises the patient; control and raising phenomena can also be modelled, but we cannot present this here.</Paragraph>
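      <Paragraph> As checks on complete structures, the valency and linking principles can be sketched like this (illustrative Python; edges are (head, dependent, label) triples, and labels without a link entry are treated as unconstrained, which is an assumption of this sketch):

        from collections import Counter

        def valency_ok(out_edges, out_spec):
            # out_spec: set of (label, card) pairs, card in {"!", "?", "*"}.
            spec = dict(out_spec)
            counts = Counter(label for (_, _, label) in out_edges)
            if not set(counts).issubset(spec):
                return False                 # an edge label the node does not offer
            return all(counts[l] == 1 if c == "!" else
                       counts[l] in (0, 1) if c == "?" else True
                       for l, c in spec.items())

        def linking_ok(edges_d1, edges_d2, link):
            # v −ℓ→ v′ in d1 requires some v −ℓ′→ v′ in d2 with ℓ′ in link[v][ℓ].
            return all(any((v, w, l2) in edges_d2 for l2 in link[v][l])
                       for (v, w, l) in edges_d1 if l in link.get(v, {}))
      </Paragraph>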
      <Paragraph position="7"> Co-dominance principle. The co-dominance principle for d1, d2 relates edges in d1 to dominance relations in the same direction in d2. It assumes a feature codom_{d1,d2} mapping labels in Lab_{d1} to sets of labels in Lab_{d2} and is specified as v −ℓ→ v′ in d1 ⇒ ∃ℓ′ ∈ codom_{d1,d2}(v)(ℓ) : v −ℓ′→* v′ in d2, where v −ℓ′→* v′ means that an ℓ′-edge from v starts a path to v′ (ℓ′-dominance). Our grammar uses the co-dominance principle on dimensions PA and SC to express, e.g., that the propositional contribution of a noun must end up in the restriction of its determiner. For example, for the determiner every of Fig. 2 we have: every −quant→ student in PA ⇒ every −r→* student in SC. Contra-dominance principle. The contra-dominance principle is symmetric to the co-dominance principle, and relates edges in d1 to dominance edges in the opposite direction in d2. It assumes a feature contradom_{d1,d2} mapping labels of Lab_{d1} to sets of labels from Lab_{d2} and is specified as v −ℓ→ v′ in d1 ⇒ ∃ℓ′ ∈ contradom_{d1,d2}(v)(ℓ) : v′ −ℓ′→* v in d2. Our grammar uses the contra-dominance principle on dimensions PA and SC to express, e.g., that predicates must end up in the scope of the quantifiers whose variables they refer to. Thus, for the transitive verb reads of Fig. 2, we have: reads −ag→ every in PA ⇒ every −s→* reads in SC, and reads −pat→ a in PA ⇒ a −s→* reads in SC.</Paragraph>
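      <Paragraph> Both dominance principles reduce to a reachability test (a sketch with invented names; co-dominance shown, contra-dominance swaps the two nodes on the second dimension):

        def l_dominates(edges, top, label, bottom):
            # True iff an edge top −label→ u starts a path leading down to bottom.
            frontier = [u for (h, u, l) in edges if h == top and l == label]
            seen = set(frontier)
            while frontier:
                node = frontier.pop()
                if node == bottom:
                    return True
                for (h, u, l) in edges:
                    if h == node and u not in seen:
                        seen.add(u)
                        frontier.append(u)
            return False

        def codominance_ok(edges_d1, edges_d2, codom):
            return all(any(l_dominates(edges_d2, v, l2, w) for l2 in codom[v][l])
                       for (v, w, l) in edges_d1 if l in codom.get(v, {}))
      </Paragraph>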
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Syntax-Semantics Interaction
</SectionTitle>
      <Paragraph position="0"> It is important to note at this point that the syntax-semantics interface we have defined is indeed relational. Each principle declaratively specifies a set of admissible analyses, i.e. a relation between the structures for the different dimensions, and the analyses that the complete grammar judges grammatical are simply those that satisfy all principles. The role of the lexicon is to provide the feature values which parameterise the principles defined above.</Paragraph>
      <Paragraph position="1"> The constraint solver complements this relationality by supporting the use of the principles to move information between any two dimensions. If, say, the left-hand side of the linking principle is found to be satisfied for dimension d1, a propagator will infer the right-hand side and add it to dimension d2. Conversely, if the solver finds that the right-hand side must be false for d2, the negation of the left-hand side is inferred for d1. By letting principles interact concurrently, we can make some very powerful inferences, as we will demonstrate with the example sentence "Mary saw a student with a book," some partial analyses for which are shown in Fig. 4.</Paragraph>
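      <Paragraph> One direction of such a propagator can be sketched over explicit sets of excluded edges (a hedged illustration; the real solver works on finite-set constraint variables instead):

        def exclude_by_linking(undecided_d1, absent_d2, link):
            # Backward inference: if every admissible d2-realisation of a
            # candidate d1-edge is already excluded, the d1-edge itself can
            # be excluded.
            newly_absent = set()
            for (v, w, l) in undecided_d1:
                if l in link.get(v, {}) and all((v, w, l2) in absent_d2
                                                for l2 in link[v][l]):
                    newly_absent.add((v, w, l))
            return newly_absent              # the forward direction is analogous
      </Paragraph>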
      <Paragraph position="2"> Column (i) in the figure shows the state after the constraint solver finishes its initial propagation, at the root of the search tree. Even at this point, the valency and treeness principles have conspired to establish an almost complete ID-structure. By the linking principle, the PA-structure has been determined similarly closely. The SC-structure is still mostly undetermined, but by the co- and contra-dominance principles, the solver has already established that some nodes must dominate others: A dotted edge with label s in the picture means that the solver knows there must be a path between these two nodes which starts with an s-edge. In other words, the solver has computed a large amount of semantic information from an incomplete syntactic analysis.</Paragraph>
      <Paragraph position="3"> Now imagine some external source tells us that with is a mod-child of student on PA, i.e. the analysis in (iii). This information could come e.g. from a statistical model of selectional preferences, which will judge this edge much more probable than an instr-edge from the verb to the preposition (ii).</Paragraph>
      <Paragraph position="4"> Adding this edge will trigger additional inferences through the linking principle, which can now infer that with is a prep-child of student on ID. In the other direction, the solver will infer more dominances on SC. This means that semantic information can be used to disambiguate syntactic ambiguities, and semantic information such as selectional preferences can be stated on their natural level of representation, rather than be forced into the ID dimension directly.</Paragraph>
      <Paragraph position="5"> Similarly, the introduction of new edges on SC could trigger a similar reasoning process which would infer new PA-edges, and thus indirectly also new ID-edges. Such new edges on SC could come from inferences with world or discourse knowledge (Koller and Niehren, 2000), scope preferences, or interactions with information structure (Duchier and Kruijff, 2003).</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Traditional Semantics
</SectionTitle>
    <Paragraph position="0"> Our syntax-semantics interface represents semantic information as graphs on the PA and SC dimensions. While this looks like a radical departure from traditional semantic formalisms, we consider these graphs simply an alternative way of presenting more traditional representations. We devote the rest of the paper to demonstrating that a pair of a PA structure and an SC structure can be interpreted as a Montague-style formula, and that a partial analysis on these two dimensions can be seen as an underspecified semantic description.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 Montague-style Interpretation
</SectionTitle>
      <Paragraph position="0"> In order to extract a standard type-theoretic expression from an XDG analysis, we assign each node v two semantic values: a lexical value L(v) representing the semantics of v itself, and a phrasal value P(v) representing the semantics of the entire SC-subtree rooted at v. We use the SC-structure to determine functor-argument relationships, and the PA-structure to establish variable binding.</Paragraph>
      <Paragraph position="1"> We assume that nodes for determiners and proper names introduce unique individual variables ("indices"). Below we will write ⟨⟨v⟩⟩ to refer to the index of the node v, and we write |ℓ to refer to the node which is the ℓ-child of the current node in the appropriate dimension (PA or SC). The semantic lexicon is defined as follows; "L(w)" should be read as "L(v), where v is a node for the word w".</Paragraph>
      <Paragraph position="3"> Lexical values for other determiners, common nouns, and proper names are defined analogously.</Paragraph>
      <Paragraph position="4"> Note that we do not formally distinguish event variables from individual variables. In particular, L(with) can be applied to either nouns or verbs, which both have type ⟨e,t⟩.</Paragraph>
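      <Paragraph> The table of lexical values itself is not reproduced here; as a hedged illustration only, entries of the following standard Montague-style shape are consistent with the derivation below (their exact formulation, in particular the ⟨⟨|ℓ⟩⟩ arguments of read′, is our assumption):

        L(every)   = λP.λQ.∀x.(P(x) → Q(x))
        L(a)       = λP.λQ.∃x.(P(x) ∧ Q(x))
        L(student) = student′        L(book) = book′
        L(reads)   = read′(⟨⟨|ag⟩⟩)(⟨⟨|pat⟩⟩)
      </Paragraph>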
      <Paragraph position="5"> We assume that no node in the SC-tree has more than one child with the same edge label (which our grammar guarantees), and write n(ℓ1,...,ℓk) to indicate that the node n has SC-children over the edge labels ℓ1,...,ℓk. The phrasal value for n is defined (in the most complex case) as follows:</Paragraph>
      <Paragraph position="6"> P(n(ℓ1,...,ℓk,s)) = L(n)(P(|ℓ1)) ··· (P(|ℓk))(λ⟨⟨n⟩⟩.P(|s))</Paragraph>
      <Paragraph position="7"> This rule implements Montague's rule of quantification (Montague, 1974); note that λ⟨⟨n⟩⟩ is a binder for the variable ⟨⟨n⟩⟩. Nodes that have no s-children are simply functionally applied to the phrasal semantics of their children (if any).</Paragraph>
      <Paragraph position="8"> By way of example, consider the left-hand SC-structure in Fig. 2. If we identify each node by the word it stands for, we get the following phrasal value for the root of the tree: L(a)(L(book))(λx.L(every)(L(student))(λy.read′(y)(x))), where we write x for ⟨⟨a⟩⟩ and y for ⟨⟨every⟩⟩. The arguments of read′ are y and x because every and a are the ag- and pat-children of reads on the PA-structure. After replacing the lexical values by their definitions and beta-reduction, we obtain the familiar representation for this semantic reading, as shown in Section 3.1.</Paragraph>
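      <Paragraph> The extraction of the phrasal value can be reproduced mechanically (a self-contained sketch restricted to the restriction/scope case of this example; all names are invented, and the reading of the left-hand SC-tree off the figure is our assumption):

        def phrasal(node, sc_children, L, index):
            # Quantification rule: apply L(node) to the restriction's phrasal
            # value, then to the lambda-abstracted phrasal value of the scope.
            kids = sc_children.get(node, {})
            if "s" in kids:
                restriction = phrasal(kids["r"], sc_children, L, index)
                scope = phrasal(kids["s"], sc_children, L, index)
                return "{}({})(λ{}.{})".format(L[node], restriction,
                                               index[node], scope)
            return L[node]                   # leaves: lexical value only

        # Left-hand SC-tree of Fig. 2: a(r: book, s: every(r: student, s: reads))
        sc = {"a": {"r": "book", "s": "every"},
              "every": {"r": "student", "s": "reads"}}
        L = {"a": "L(a)", "every": "L(every)", "book": "L(book)",
             "student": "L(student)", "reads": "read′(y)(x)"}
        index = {"a": "x", "every": "y"}
        print(phrasal("a", sc, L, index))
        # L(a)(L(book))(λx.L(every)(L(student))(λy.read′(y)(x)))
      </Paragraph>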
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 Underspecification
</SectionTitle>
      <Paragraph position="0"> It is straightforward to extend this extraction of type-theoretic formulas from fully specified XDG analyses to an extraction of underspecified semantic descriptions from partial XDG analyses. We will briefly demonstrate this here for descriptions in the CLLS framework (Egg et al., 2001), which supports this most easily. Other underspecification formalisms could be used too.</Paragraph>
      <Paragraph position="1"> Consider the partial SC-structure in Fig. 5, which could be derived by the constraint solver for the sentence from Fig. 2. We can obtain a CLLS constraint from it by first assigning to each node of the SC-structure a lexical value, which is now a part of the CLLS constraint (indicated by the dotted ellipses). Because student and book are known to be r-daughters of every and a on SC, we plug their CLLS constraints into the r-holes of their mothers' constraints. Because we know that reads must be dominated by the s-children of the determiners, we add the two (dotted) dominance edges to the constraint.</Paragraph>
      <Paragraph position="2"> Finally, variable binding is represented by the binding constraints drawn as dashed arrows, and can be derived from PA exactly as above.</Paragraph>
    </Section>
  </Section>
</Paper>