<?xml version="1.0" standalone="yes"?>
<Paper uid="J96-2002">
  <Title>DATR: A Language for Lexical Knowledge Representation</Title>
  <Section position="4" start_page="168" end_page="169" type="intro">
    <SectionTitle>
3 The syntax of DATR, like its name and its minimalist philosophy, owes more than a little to that of the
</SectionTitle>
    <Paragraph position="0"> unification grammar language PATR (Shieber 1986). With hindsight this may have been a bad design decision, since similarity of syntax tends to imply a similarity of semantics. And, as we shall see in Section 4.7 below, and elsewhere, there is a subtle but important semantic difference. 4 Node names and atoms are distinct, but essentially arbitrary, classes of tokens in DATR. In this paper we shall distinguish them by a simple case convention: node names start with an uppercase letter, atoms do not.</Paragraph>
    <Paragraph position="1"> Computational Linguistics, Volume 22, Number 2. [...] is exactly what we want: it represents the actual information we generally wish to access from the description. So in a sense, we do want all the above statements to be present in our description; what we want to avoid is repeated specification of the common elements.</Paragraph>
    <Paragraph position="2"> This problem is overcome in DATR in the following way: such exhaustively listed path/value statements are indeed present in a description, but typically only implicitly present. Their presence is a logical consequence of a second set of statements, which have the concise, generalization-capturing properties we expect. To make the distinction sharp, we call the first type of statement extensional and the second type definitional. Syntactically, the distinction is made with the equality operator: for extensional statements (as above), we use =, while for definitional statements we use ==. And, although our first example of DATR consisted entirely of extensional statements, almost all the remaining examples will be definitional. The semantics of the DATR language binds the two together in a declarative fashion, allowing us to concentrate on concise definitions of the network structure from which the extensional &quot;results&quot; can be read off.</Paragraph>
    <Paragraph position="3"> Our first step towards a more concise account of Word1 and Word2 is simply to change the extensional statements to definitional ones:  This is possible because DATR respects the unsurprising condition that if at some node a value is specifically defined for a path with a definitional statement, then the corresponding extensional statement also holds. So the statements we previously made concerning Word1 and Word2 remain true, but now only implicitly true.</Paragraph>
    <Paragraph position="4"> Although this change does not itself make the description more concise, it allows us to introduce other ways of describing values in definitional statements, in addition to simply specifying them. Such value descriptors will include inheritance specifications that allow us to gather together the properties that Word1 and Word2 have solely by virtue of being verbs. We start by introducing a VERB node: VERB: &lt;syn cat&gt; == verb &lt;syn type&gt; == main.</Paragraph>
    <Paragraph position="5"> and then redefine Word1 and Word2 to inherit their verb properties from it. A direct encoding for this is as follows:  In these revised definitions the right-hand side of the &lt;syn cat&gt; statement is not a direct value specification, but instead an inheritance descriptor. This is the simplest form of DATR inheritance: it just specifies a new node and path from which to obtain the required value. It can be glossed roughly as &quot;the value associated with &lt;syn cat&gt; at Word1 is the same as the value associated with &lt;syn cat&gt; at VERB.&quot; Thus from VERB:&lt;syn cat&gt; == verb it now follows that Word1:&lt;syn cat&gt; == verb. However, this modification to our analysis seems to make it less concise, rather than more. It can be improved in two ways. The first is really just a syntactic trick: if the path on the right-hand side is the same as the path on the left-hand side, it can be omitted. So we can replace VERB:&lt;syn type&gt;, in the example above, with just VERB. We can also extend this abbreviation strategy to cover cases like the following, where the path on the right-hand side is different but the node is the same: Come: &lt;mor root&gt; == come &lt;mor past participle&gt; == Come:&lt;mor root&gt;.</Paragraph>
    <Paragraph position="6"> In this case we can simply omit the node: Come: &lt;mor root&gt; == come &lt;mor past participle&gt; == &lt;mor root&gt;.</Paragraph>
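    <Paragraph> The inheritance descriptors and the two abbreviations just described can be illustrated with a small Python sketch. It is not DATR and not the authors' implementation: the names Inherit, DESCRIPTION, and lookup are hypothetical, paths are modelled as tuples of attribute strings, values are simplified to single atoms, and the Word1 entries are assumed for illustration only. An omitted node or path on the right-hand side is taken from the left-hand side of the statement being evaluated.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple, Union

Path = Tuple[str, ...]


@dataclass(frozen=True)
class Inherit:
    """Hypothetical inheritance descriptor.

    A missing node or path stands for the abbreviated forms: it is
    filled in from the node and path currently being evaluated.
    """
    node: Optional[str] = None
    path: Optional[Path] = None


Descriptor = Union[str, Inherit]

# Assumed toy description: node name, then path, then descriptor.
DESCRIPTION: Dict[str, Dict[Path, Descriptor]] = {
    "VERB": {
        ("syn", "cat"): "verb",
        ("syn", "type"): "main",
    },
    "Word1": {
        # Full form: both node and path are given.
        ("syn", "cat"): Inherit(node="VERB", path=("syn", "cat")),
        # Path omitted: same path, looked up at VERB.
        ("syn", "type"): Inherit(node="VERB"),
    },
    "Come": {
        ("mor", "root"): "come",
        # Node omitted: different path, same node.
        ("mor", "past", "participle"): Inherit(path=("mor", "root")),
    },
}


def lookup(node: str, path: Path,
           seen: Optional[Set[Tuple[str, Path]]] = None) -> str:
    """Follow inheritance descriptors until a direct value is reached."""
    seen = set() if seen is None else seen
    if (node, path) in seen:
        raise ValueError("cyclic definition")
    seen.add((node, path))
    descriptor = DESCRIPTION[node][path]  # KeyError if nothing is defined
    if isinstance(descriptor, str):
        return descriptor
    target_node = descriptor.node if descriptor.node is not None else node
    target_path = descriptor.path if descriptor.path is not None else path
    return lookup(target_node, target_path, seen)
```

    Under these assumptions, lookup("Word1", ("syn", "type")) follows the node-only descriptor to VERB and yields "main", while lookup("Come", ("mor", "past", "participle")) follows the path-only descriptor within Come and yields "come".</Paragraph>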
    <Paragraph position="7"> The other improvement introduces one of the most important features of DATR: specification by default. Recall that paths are sequences of attributes. If we understand paths to start at their left-hand end, we can construct a notion of path extension: a path P2 extends a path P1 if and only if all the attributes of P1 occur in the same order at the left-hand end of P2 (so &lt;a1 a2 a3&gt; extends &lt;&gt;, &lt;a1&gt;, &lt;a1 a2&gt;, and &lt;a1 a2 a3&gt;, but not &lt;a2&gt;, &lt;a1 a3&gt;, etc.). If we now consider the (finite) set of paths occurring in definitional statements associated with some node, that set will not include all possible paths (of which there are infinitely many). So the question arises of what we can say about paths for which there is no specific definition. For some path P1 not defined at node N, there are two cases to consider: either P1 is the extension of some path defined at N or it is not. The latter case is easiest: there is simply no definition for P1 at N (hence N can be a partial function, as already noted above). But in the former case, where P1 extends some P2 which is defined at N, P1 assumes a definition &quot;by default.&quot; If P2 is the only path defined at N which P1 extends, then P1 takes its definition from the definition of P2. If P1 extends several paths defined at N, it takes its definition from the most specific (i.e., the longest) of the paths that it extends.</Paragraph>
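    <Paragraph> The notion of path extension and the choice of the most specific defined path can be made concrete with a short Python sketch. The function names extends and defining_path are hypothetical, and paths are again modelled as tuples of attribute strings. The sketch only selects which defined path supplies the definition; it does not model how that definition is subsequently evaluated.

```python
from typing import Iterable, Optional, Tuple

Path = Tuple[str, ...]


def extends(p2: Path, p1: Path) -> bool:
    """True if p2 extends p1, i.e. p1 is a leading subsequence of p2."""
    return p2[:len(p1)] == p1


def defining_path(defined: Iterable[Path], query: Path) -> Optional[Path]:
    """Return the longest defined path that the query extends.

    Returns None when the query extends no defined path, which is the
    case where the node simply has no definition for it.
    """
    candidates = [p for p in defined if extends(query, p)]
    if not candidates:
        return None
    # Two different prefixes of one path never have the same length,
    # so the longest candidate is unique.
    return max(candidates, key=len)
```

    For example, with defined paths ("a1",) and ("a1", "a2"), the query ("a1", "a2", "a3") takes its definition from ("a1", "a2"), the query ("a1", "b") takes it from ("a1",), and the query ("b",) has no definition at all.</Paragraph>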
    <Paragraph position="8"> In the present example, this mode of default specification can be applied as follows:</Paragraph>
  </Section>
</Paper>