<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1043">
  <Title>Unificational Combinatory Categorial Grammar: Combining Information Structure and Discourse Representations</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Unificational CCG
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Signs
</SectionTitle>
      <Paragraph position="0"> UCCG makes use of feature structures called signs in its linguistic description. There are two types of signs: basic and complex signs. A basic sign is a list of attributes or features describing the syntactic and semantic characteristics of a lexical expression, in the spirit of UCG.</Paragraph>
      <Paragraph position="1"> We deviate from UCG in the way we define complex signs, which is done recursively: If X and Y are signs then X/Y is a complex sign.</Paragraph>
      <Paragraph position="2"> If X and Y are signs then X\Y is a complex sign.</Paragraph>
      <Paragraph position="3"> All basic and complex signs are signs.</Paragraph>
      <Paragraph position="4"> A basic sign can have a varying number of features, depending on the syntactic category of the lexical expression the sign characterises. There are three obligatory features any sign must have, namely pho, cat and drs. pho stands for the phonological form, cat for the syntactic category of the lexical expression, and drs for its semantic representation. Besides these three, a sign can also have the following features: agr to mark the inflectional characteristics of categories; var for discourse referents ranging over individuals; sit for discourse referents ranging over eventualities (events or states).</Paragraph>
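The recursive definition of signs and the obligatory and optional features can be pictured as a small data structure. The following is only a minimal sketch, not the authors' implementation: the class names, the field names left and right, the helper count_basic, and all feature values are assumptions made for illustration, and the DRS is reduced to a plain string.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class BasicSign:
    # The three obligatory features.
    pho: str
    cat: str                     # "s", "n" or "vp"
    drs: str                     # stand-in for a DRS (a plain string in this sketch)
    # Optional features.
    agr: Optional[str] = None
    var: Optional[str] = None    # discourse referent ranging over individuals
    sit: Optional[str] = None    # discourse referent ranging over eventualities


@dataclass(frozen=True)
class ComplexSign:
    left: "Sign"                 # X in X/Y or X\Y
    slash: str                   # "/" or "\\"
    right: "Sign"                # Y in X/Y or X\Y


# All basic and complex signs are signs.
Sign = Union[BasicSign, ComplexSign]


def count_basic(sign: Sign) -> int:
    """Count the basic signs a sign is recursively built from."""
    if isinstance(sign, BasicSign):
        return 1
    return count_basic(sign.left) + count_basic(sign.right)


# Hypothetical feature values, for illustration only.
verb_phrase = BasicSign(pho="walks", cat="vp", drs="D1", agr="fin", var="X", sit="E")
sentence = BasicSign(pho="W", cat="s", drs="D")
noun_phrase = ComplexSign(sentence, "/", verb_phrase)   # a sign of the shape X/Y
```

Because ComplexSign accepts any sign on either side of the slash, arbitrarily nested signs can be built from the same two classes.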
      <Paragraph position="5"> In our notation inside the feature structures we use the following convention: constants start with a lower-case letter, and variables start with an upper-case letter. The feature names are written in small capitals. To make the feature structures more readable, we restrict the choice of possible variable names for each type of variable:  (pho) variables: W, W1, W2, etc.</Paragraph>
      <Paragraph position="6"> (agr) variables: A, A1, A2, etc.</Paragraph>
      <Paragraph position="7"> (drs) variables: D, D1, D2, etc.</Paragraph>
      <Paragraph position="8"> (sit) variables: E, E1, E2, etc.</Paragraph>
      <Paragraph position="9">  Discourse referents (var) use any other capital letter, with a preference for letters towards the end of the alphabet. There are three kinds of basic signs in UCCG, corresponding to the basic categories: those with cat feature sentence (s), those with cat feature noun (n), and those with cat feature verb phrase (vp). A basic sign for verb phrases is shown in (1), and a complex sign for noun phrases is shown in (2).</Paragraph>
      <Paragraph position="10">  language for which a UCCG grammar is constructed many more features could be introduced in basic signs. The above examples illustrate the role of unification in creating a link between syntax and semantics. UCCG exploits the fact that the same variables can be used at several different levels. For example, the variables standing for discourse referents serve as a link between syntax and semantics: the variable in the var feature of the feature structure fits into its corresponding slot in the DRS in the drs feature. We use this technique to integrate information structure as well.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Categories
</SectionTitle>
      <Paragraph position="0"> Each sign corresponds to a related CCG category. The category of a basic sign is the value of its cat feature. The category of a complex sign is made up of the cat feature values of all the component parts of the complex sign, separated by the slashes and brackets used in the complex sign, resulting in a complex category. For instance, the syntactic category of the sign in (1) is vp, and in (2) the category is s/vp. The three basic categories used in UCCG are thus s, n and vp, while all other categories are formed by combining these three using backward and forward slashes.</Paragraph>
      <Paragraph position="1"> Note that noun phrase is not among the basic categories. In UCCG we use its 'type-raised' variant s/vp (corresponding to the CCG category s/(s\np)). This choice is motivated by the need to determine quantifier scope in the semantics of quantified noun phrases. The somewhat unconventional basic category vp is a by-product of this choice.</Paragraph>
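Reading a category off a sign, as described in this subsection, can be sketched as a short recursive function. The representation is an assumption made for illustration only: a basic sign is a dict with a cat key, a complex sign is a (result, slash, argument) tuple, and the function name category and the choice to bracket every complex sub-part are hypothetical.

```python
def category(sign):
    """Return the CCG category of a sign as a string."""
    if isinstance(sign, dict):
        # Basic sign: the category is the value of its cat feature.
        return sign["cat"]
    result, slash, argument = sign

    def part(sub):
        # Complex sub-parts are bracketed; basic ones are written bare.
        text = category(sub)
        return text if isinstance(sub, dict) else "(" + text + ")"

    return part(result) + slash + part(argument)


# Hypothetical signs reduced to their cat feature.
s = {"cat": "s"}
n = {"cat": "n"}
vp = {"cat": "vp"}

noun_phrase = (s, "/", vp)            # the type-raised noun phrase
determiner = (noun_phrase, "/", n)    # takes a noun, yields a noun phrase
```

With these definitions, category(noun_phrase) yields s/vp and category(determiner) yields (s/vp)/n.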
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Feature Values
</SectionTitle>
      <Paragraph position="0"> In order to make it easier to refer to parts of complex signs later, we introduce the following terminology: X is the result of a sign X/Y or X\Y.</Paragraph>
      <Paragraph position="1"> Y is the argument of a sign X/Y or X\Y.</Paragraph>
      <Paragraph position="2"> The value of the var and the sit features is always a variable, while other features can have a number of constant values. The pho feature holds the string value of the linguistic expression represented by the given feature structure. Presently, we use the orthographic form of words. In basic signs the pho feature is filled by lexical items; in complex signs it also contains variables, which get constant values when the complex sign is combined with its argument signs. The pho feature in result parts of complex signs is of the form: ... + W1 + word + W2 + ...</Paragraph>
      <Paragraph position="3"> where word is a lexical item, and W1 and W2 are variables that get values through unification in the categorial combination process. The item unifying with W1 precedes and the one unifying with W2 follows the lexical item word. The exact number and order of the variables the pho feature contains depend on the category of the given sign.</Paragraph>
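How the variables in a pho feature receive their values can be illustrated with a small substitution routine. This is a sketch under stated assumptions, not the paper's unification procedure: the pho value is modelled as a Python list, the Var class, the instantiate function and the bindings dictionary are hypothetical names, and the example words are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A pho variable such as W1 or W2."""
    name: str


def instantiate(pho, bindings):
    """Replace bound variables in a pho list by the word lists bound to them."""
    out = []
    for item in pho:
        if isinstance(item, Var) and item.name in bindings:
            out.extend(bindings[item.name])   # splice in the words for this variable
        else:
            out.append(item)                  # lexical item or still-unbound variable
    return out


# A pho value of the form W1 + word + W2.
pho = [Var("W1"), "loves", Var("W2")]

partly = instantiate(pho, {"W2": ["mary"]})
fully = instantiate(partly, {"W1": ["every", "man"]})
```

After the first call only W2 has been filled and W1 is still a variable; after the second call the list contains only lexical items, with the material bound to W1 preceding the word and the material bound to W2 following it.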
      <Paragraph position="4"> In the present implementation the agr feature is only used in connection with verb phrases and can take constant values fin (finite) or nonfin (non-finite).</Paragraph>
      <Paragraph position="5"> The drs feature, if it is not a variable itself, holds a DRS corresponding to the semantics of the lexical item(s) characterised by the given sign. DRSs are constructed in a compositional way using the var and sit features of the sign to take care of predicate argument structure, and the merge operator (;) to construct larger DRSs from smaller ones. Merge-reduction is used to eliminate merge operators introduced in the composition process. This is also the stage where discourse referents are renamed to avoid accidental clashes of variables introduced by unification (Blackburn and Bos, 2003).</Paragraph>
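Merge-reduction can be sketched as follows, assuming the usual reading in which the merge of two DRSs reduces to a single DRS holding the referents and the conditions of both. The class names DRS and Merge, the function reduce_merge, and the encoding of conditions as plain strings are assumptions made for illustration; the renaming of discourse referents mentioned above is deliberately left out of this sketch.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class DRS:
    referents: Tuple[str, ...]
    conditions: Tuple[str, ...]


@dataclass(frozen=True)
class Merge:
    """An unreduced merge (left ; right)."""
    left: "Expr"
    right: "Expr"


Expr = Union[DRS, Merge]


def reduce_merge(expr: Expr) -> DRS:
    """Eliminate merge operators, bottom-up, yielding a single DRS."""
    if isinstance(expr, DRS):
        return expr
    left = reduce_merge(expr.left)
    right = reduce_merge(expr.right)
    # Keep referent order, adding right-hand referents not already present.
    extra = tuple(r for r in right.referents if r not in left.referents)
    return DRS(left.referents + extra, left.conditions + right.conditions)


# Hypothetical DRSs for a noun and a verb phrase sharing the referent X.
noun = DRS(("X",), ("man(X)",))
verb = DRS(("E",), ("walk(E)", "agent(E,X)"))
reduced = reduce_merge(Merge(noun, verb))
```

The shared variable X is what ties the two partial DRSs together, which mirrors the role of the var feature described in Section 3.1.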
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.4 The Combinatory Rules
</SectionTitle>
      <Paragraph position="0"> Presently we have introduced the following four CCG combinatory rules in UCCG: forward application, backward application, forward composition, and backward composition.</Paragraph>
      <Paragraph position="1"> Other CCG combinatory rules could be introduced equally easily should the need arise.</Paragraph>
      <Paragraph position="3"> The rule boxes above are to be interpreted in the following way: the first row contains the rule; the second row gives the name of the rule on the left and, on the right, the marking used for it in the derivations. The variables X, Y and Z in the rules above stand for (basic or complex) signs.</Paragraph>
      <Paragraph position="4"> Some of the combinatory rules can be seen in action on UCCG signs in Figures 1 to 3 below.</Paragraph>
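The four rules named in this subsection can be sketched at the level of categories alone, using their standard CCG statements (given in the comments). This is a simplification and an assumption, not the UCCG rules themselves: UCCG applies the rules to full signs and unifies their features, whereas the sketch below compares categories by plain equality. Basic categories are strings, complex ones are (result, slash, argument) tuples, and all function names are hypothetical.

```python
def forward_application(left, right):
    # X/Y  Y  =>  X
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    return None


def backward_application(left, right):
    # Y  X\Y  =>  X
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None


def forward_composition(left, right):
    # X/Y  Y/Z  =>  X/Z
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "/" and right[1] == "/" and left[2] == right[0]):
        return (left[0], "/", right[2])
    return None


def backward_composition(left, right):
    # Y\Z  X\Y  =>  X\Z
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "\\" and right[1] == "\\" and right[2] == left[0]):
        return (right[0], "\\", left[2])
    return None


noun_phrase = ("s", "/", "vp")           # type-raised noun phrase
determiner = (noun_phrase, "/", "n")     # (s/vp)/n
transitive = ("vp", "/", noun_phrase)    # hypothetical verb category vp/(s/vp)

subject = forward_application(determiner, "n")
clause = forward_application(noun_phrase, "vp")
partial = forward_composition(noun_phrase, transitive)
```

Each function returns None when the rule does not apply, so a parser built on this sketch could simply try the rules in turn on adjacent categories.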
    </Section>
  </Section>
</Paper>