File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/90/c90-2065_metho.xml
Size: 23,159 bytes
Last Modified: 2025-10-06 14:12:24
<?xml version="1.0" standalone="yes"?> <Paper uid="C90-2065"> <Title>AN IMPLEMENTATION OF FORMAL SEMANTICS IN THE FORMALISM OF RELATIONAL DATABASES.</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 0. INTRODUCTION. </SectionTitle> <Paragraph position="0"> In extensional semantics, each denotation corresponds to an object of the world. The world is the set of all the denotations. In the implementation that I shall present in this paper, the world will be represented by means of a database, more precisely a relational database.</Paragraph> <Paragraph position="1"> The database is structured so that it makes explicit the semantic type of each denotation.</Paragraph> <Paragraph position="2"> Although I will not always stick to the standard version of formal semantics when assigning semantic types to syntactic categories, I aim to account for the same range of phenomena that formal semantics deals with.</Paragraph> <Paragraph position="3"> The paper is divided into five parts. First, I review earlier research to which my contribution relates. Next, I describe the database. Then, I explain how the principles used to design the database meet the requirements of formal semantics. The fourth part is concerned with entailment, while the last part mainly shows how sentences are interpreted.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. BACKGROUND. </SectionTitle> <Paragraph position="0"> The topics with which this work is concerned have mainly been studied from three points of view.</Paragraph> <Paragraph position="1"> A first class of studies covers the problems encountered in trying to translate NL into a formal language. On the one hand, there is theoretical research aiming at such a translation, like PATR [7]. 
On the other hand, various kinds of inaccuracies in translating NL into logical form for the purpose of accessing databases have been discussed; see [6] for example.</Paragraph> <Paragraph position="2"> A second field of research that needs to be mentioned is concerned with NL interfaces. Famous systems are described in [9] and [1]. There are important differences between these systems and my work, since I am not aiming at accessing a knowledge base at all. The database that I use encodes NL meanings, and it does so according to linguistic constraints. Traditionally, the database instead encodes knowledge that is independent of the language used to talk about it. Problems specific to NL interfaces can be found in [3] and [8].</Paragraph> <Paragraph position="3"> From another point of view, there are works concerned with the organisation of the knowledge base constituted by NL meanings; see [2]. The difference between my approach and ones like [2] is that I stick to the theory of formal semantics. Consequently, I do not (yet) address questions about the structure of the lexicon, nor do I treat pragmatic phenomena like common sense inferences.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. THE DATABASE. </SectionTitle> <Paragraph position="0"> The structure of the database depends on the semantic properties of the denotations. More specifically, it depends on the fact that denotations are classified into different types and specifically recognized as the denotations of such and such syntactic categories.</Paragraph> <Paragraph position="1"> Each denotation of each constituent is a value in the database. Some of the denotations result from the composition of other denotations. Which denotations can be composed with which others is a property of their type. These properties are not encoded as such. 
The overall structure of the database shows how the semantic types combine with each other. Consequently, complex denotations (denotations of complex expressions) are represented by atomic values, but the fact that they are complex is deduced from the structure of the database.</Paragraph> <Paragraph position="2"> Consider the case of noun phrase denotations.</Paragraph> <Paragraph position="3"> The denotation of a determiner combines with the denotation of a common noun. This combination yields the denotation of a noun phrase, i.e., an atomic value in the database. The representation of this denotation is connected (in the sense of relational databases) to the representations of the denotations of the noun and of the determiner. Therefore, it can be recognized as a complex denotation.</Paragraph> <Paragraph position="4"> The design of the database depends on the fact that we need an explicit means to recognize the type of each denotation represented in it. Within the formalism of relational databases, defining types of denotations amounts to defining a relation for each such type.</Paragraph> <Paragraph position="5"> A relation is formally defined as an n-tuple of formal attributes. A formal attribute consists of a way to identify the attribute (a position in the relation or a name) and the definition of the set of its possible values. The extension of a relation is the set of all well-formed n-tuples of attribute values for the corresponding formal attributes. (An ill-formed n-tuple has at least one impossible value for a formal attribute.) Each relation represents a type. Each of them has (at least) one attribute whose domain is the set of denotations of the corresponding type. For example, the relation which corresponds to the type of noun phrases has an attribute whose values are noun phrase denotations. 
Each denotation is an atomic value of the attribute of the relation.</Paragraph> <Paragraph position="6"> Furthermore, each such value actually belongs to an n-tuple belonging to the extension of the relation. (This is due to the fact that the set-theoretical model of the world, i.e. the database, contains all the denotations built on an ontology constituted by a set of entities, U, and the truth values.) The structure of the database captures the (degree of) complexity of a denotation by connecting the relation which represents the semantic type assigned to the corresponding syntactic category with the relations which represent the semantic types assigned to the constituents of a complex expression of the same syntactic category. For example, a proper name has a complex denotation because it belongs to the syntactic category of noun phrases. Therefore, its denotation belongs to the extension of the relation representing the type of noun phrases. Since there are noun phrases constituted by a determiner and a common noun, the relation representing the type of noun phrases actually connects to the relations representing respectively the types of determiners and of common nouns. Hence, the structure of the relation associated with the type of noun phrases and, in particular, of proper names, shows that they are complex expressions.</Paragraph> <Paragraph position="7"> To the extent that we need to define the connections which show the respective complexity of each type of denotation, a relation is actually defined for each semantic type. For example, we shall define relations like Tn, Tdet, Tnp, Tvp standing, respectively, for the types of denotations of common nouns, determiners, noun phrases and verb phrases.</Paragraph> <Paragraph position="8"> Still, there is a problem in defining connections. 
The problem is that, in formal semantics, expressions of different syntactic categories can have the same semantic type.</Paragraph> <Paragraph position="9"> For example, common nouns and intransitive verbs share the same type. Now, the complexity of the denotation of a noun phrase is encoded in the fact that it is connected to the denotation of a common noun. By contrast, the denotation of a verb phrase must be connected to the denotation of a simple verb and to the denotations of complements. In general, we not only need to define relations as counterparts of semantic types, but, where expressions of different syntactic categories collapse into the same type, their types must nevertheless correspond to different relations.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> With respect to the example, the relation Tv (the type of simple verbs) cannot be the same as the relation Tn, because Tvp connects to Tv while Tnp connects to Tn.</Paragraph> <Paragraph position="1"> Notice that it is true of all the syntactic categories that they have one and only one relation as semantic counterpart. Sometimes, however, the relation could be defined as the sum of several relations. For example, the complex relation Tvp has several mutually exclusive sets of connections. It connects to the relation Tv and the relation Tnp, or to the relation Tv and the relation Tpp, or to the relation Tvp and the relation Tpp, etc.</Paragraph> <Paragraph position="2"> Let me illustrate these principles by showing the definition of two relations. The Tnp relation is defined as a triple of formal attributes: Tnp = &lt;[np],Tn,Tdet&gt; where it is understood that the possible values of the first attribute are noun phrase denotations, the possible values of the second attribute are pointers to noun denotations, and the possible values of the third attribute are pointers to determiner denotations. 
Notice that proper names have dummy values for Tn and Tdet.</Paragraph> <Paragraph position="3"> Relations which encode simple types, i.e. types of lexical categories, cannot be encoded the same way as relations corresponding to complex expressions: they have no connections since they do not have any constituents. Instead, they are generally defined by pairs of attributes: the first one is instantiated to a denotation of the type in question and the second one to the symbolic expression which is the item. For example, Tn, which represents the type of simple common nouns, will be defined by the pair: Tn = &lt;[n],&quot;n&quot;&gt; where it is understood that the possible values of the first attribute are common noun denotations while the possible values of the second attribute are the nouns themselves considered as symbolic expressions. As expected, the role of the second attribute in a relation such as Tn is to anchor the denotation of simple expressions in the lexicon.</Paragraph> <Paragraph position="4"> In summary, the design of the database meets the two following principles: - i) relations that correspond to types of lexical categories have two attributes: the first one has as domain the set of denotations of all the lexical items which belong to the lexical category in question, and the second one has as domain those items themselves regarded as symbolic expressions.</Paragraph> <Paragraph position="5"> - ii) relations that correspond to types of non-lexical categories have one attribute whose domain is the set consisting of all the denotations of all these expressions.</Paragraph> <Paragraph position="6"> Moreover, they have other attributes, one for each of their constituents. 
These attributes have as domain the extension of the relations corresponding to the types of these constituents.</Paragraph> <Paragraph position="7"> These principles ensure that denotations are subject to the principle of compositionality, which states that the denotation of a complex expression is a compound of the denotations of its constituents. (It is important to understand that we want to be able to check that denotations are subject to compositionality.) How compositionality constrains the definitions of the relations will now be illustrated with Tn and Tnp.</Paragraph> <Paragraph position="8"> In the extension of a relation like Tn, all the pairs of values have the property that the first value is the denotation of the second one. We augment the schema of the database with the constraint on Tn that: [[&quot;n&quot;]] = [n]. In the extension of relations like Tnp, all the n-tuples are required to satisfy the constraint that the value of the first attribute, i.e. the np denotation, is the denotation of an np whose constituents, i.e. the determiner and the noun, have as respective denotations the ones connected to by the remaining attributes of the n-tuple. For Tnp, we augment the schema of the database with the constraint that: C(Tn,Tdet) = [np], where C composes its arguments.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 3. THEORETICAL PRINCIPLES MET BY THE DATABASE. </SectionTitle> <Paragraph position="0"> Let us summarize how principles of formal semantics are taken into account in designing the database: i) we restrict ourselves to extensional semantics. Therefore, every denotation must be represented and must correspond to one object of the (unique) world.</Paragraph> <Paragraph position="1"> ii) the word is the smallest unit that receives a denotation, the sentence is the biggest one. 
iii) all the expressions that are well-formed syntactic constituents have a denotation.</Paragraph> <Paragraph position="2"> iv) all the denotations are encoded in the extension of a specific relation, that is, all the denotations have a type defined in the database. v) denotations of complex expressions are connected to the denotations of the constituents that are contained in those expressions.</Paragraph> <Paragraph position="3"> The theory of databases states that relational databases are logically equivalent to a first-order language whose predicates are the relations. In this first-order language, the extensions of the predicates are the sets of tuples of argument values on which the &quot;relation-predicates&quot; evaluate to true. Therefore, the fact that the database is interpreted as a first-order language ensures that all the denotations have a type and that their type is explicitly attached to them.</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> 4. ENTAILMENT. </SectionTitle> <Paragraph position="0"> To define what it means for an expression to entail another expression (of the same category), it is indispensable to know what kinds of things the denotations are.</Paragraph> <Paragraph position="1"> Let us distinguish between attributes that point to other relations, attributes that are instantiated to symbolic expressions and attributes that take as values the denotations of the type represented by the relation they are attributes of. Only the last kind of attribute is concerned with entailment. According to formal semantics, attribute values that represent denotations are sets. (Do not forget that they are atomic from the point of view of the structure of the database.) 
Some (primitive) entities are implicitly defined.</Paragraph> <Paragraph position="2"> Then, all the denotations (except for the denotations of sentences) are sets of entities, or sets of sets of entities, or functions whose domain and range are such kinds of sets. Let me use the meta-variable X to range over the sets of entities, and Y to range over sets of sets of entities. The set structure of the world is the following:</Paragraph> <Paragraph position="4"> What is entailment in the implementation? First, for expressions which denote functions, the fact that one expression entails another is given by the fact that the larger expression in which the first appears (with other constituents) entails the corresponding expression in which the second appears. For example, we will not say that &quot;most&quot; entails &quot;some&quot;, but rather that &quot;most Xs&quot; entails &quot;some Xs&quot;. This being so, functions can be represented by symbols, either names (the lexical items) or connections.</Paragraph> <Paragraph position="5"> Now, representing functions by symbols rather than by sets of argument-result pairs implies that entailment cannot be defined on the relations representing functional types.</Paragraph> <Paragraph position="6"> For relations not corresponding to functional types, i.e. Tn, Tnp, Tv and Tvp, entailment is defined by means of set inclusion.</Paragraph> <Paragraph position="7"> Let t1 and t2 be two tuples of Tnp; t1 entails t2 if the attribute value which is the np denotation in t1 is a subset of the attribute value which is the np denotation in t2. Take another example.</Paragraph> <Paragraph position="8"> Assume that Tvp has four attributes: the first one is the denotation of the vp, e.g. eat an apple, the third one is the denotation of the verb without the complement whose denotation is the value of the fourth attribute, i.e. 
eat : Tvp([eat an apple],W,[eat],[an apple]). Now, anything that has the property of eating an apple necessarily has the property of eating. So, for Tvp = (y1,W,y2,Z), we set the general constraint that y2 ⊇ y1. We will say that y2 ⊇ y1 is an axiom that belongs to the definition of the Tvp relation. Crucially, objects that would not meet the axioms characterizing the extension of the relation to which they belong cannot correspond to objects of the world. In the example, if eating an apple does not entail eating, then the two expressions fail to have acceptable denotations.</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> 5. THE SKETCH OF A SEMANTIC INTERPRETOR AND SENTENCE DENOTATIONS. </SectionTitle> <Paragraph position="1"> The denotations of sentences are truth values. I have not dwelt on the way sentence denotations are encoded, but one might expect that there is a relation Ts = &lt;[s],Tnp,Tvp&gt;. Although this agrees with the principles of the database, it is not the solution that I have adopted.</Paragraph> <Paragraph position="2"> Assume that there is no Ts relation. We must nevertheless ensure that the interpretor will provide sentences with the truth values they denote. Furthermore, we would like to show how these truth values depend on the denotations of the constituents of the sentence because we want to represent how compositionality is respected.</Paragraph> <Paragraph position="3"> The basic operation for interpreting an expression, that is for assigning it its denotation, is the selection of values in the database. How is the interpretor meant to select denotations actually encoded in the database? The interpretor proceeds in parallel with a syntactic parser which yields (at least) the constituent structure of the expressions. 
Imagine that the information that the parser sends to the interpretor is the context-free rule used by the parser in parsing the expression to be interpreted. For example, suppose that the rule: np -> det, n parses the given expression. And suppose that the interpretor knows that the category np is the category of an expression of the type of noun phrases, hence of the relation Tnp. Likewise, n corresponds to Tn and det to Tdet.</Paragraph> <Paragraph position="4"> Knowing that the above context-free rule applies, the interpretor can perform the selection of a tuple in the Tnp relation. The schema of the selection to perform will be written: Σ Tn,Tdet (Tnp). The specific selection to perform in order to interpret a specific noun phrase requires the interpretor to instantiate the parameters of the selection, Tn and Tdet, to the denotations of the noun and the determiner, respectively. The output of the instantiated selection:</Paragraph> <Paragraph position="6"> Notice that I use a logical notation for the output of the interpretor. This output is, itself, a relational database containing one relation, the extension of which is constituted by one tuple which satisfies the second conjunct of the formula. In the example, the tuple consists of attribute values such that [most]([men]) = [most men] is true.</Paragraph> <Paragraph position="7"> The way the interpretor selects the denotation of a noun phrase can be easily generalized to the other types of expressions. I immediately turn to the case of sentences.</Paragraph> <Paragraph position="8"> Let us assume that there is only one rule to parse sentences, namely: S -> np,vp. Since there is no Ts type, there is no selection. Nevertheless, the pseudo-schema of selection corresponding to this rule is defined by Σ Tvp,Tnp () (to use a notation consistent with the one used for the other selections). 
The denotation that such a &quot;selection&quot; will yield is a formula interpretable as the truth value denoted by the sentence. I shall represent this formula by Φ. The way in which Φ is assumed to yield either truth value conforms to the account of standard formal semantics: it consists in checking whether the property, i.e. the set, denoted by the verb phrase is a member of the set of properties denoted by the noun phrase. The outputs of Σ Tvp,Tnp () are more than just Φ. Indeed, in order to show that compositionality is respected, we must show explicitly which denotations of constituents combine to yield the truth value. The latter denotations involve denotations of their own constituents.</Paragraph> <Paragraph position="9"> Therefore, the denotation of a sentence will be logically represented by an assertion. This assertion is the logical conjunction of the denotations of all the constituents of the sentence. For example, the sentence: Most men eat an apple denotes:</Paragraph> <Paragraph position="11"> It is easy to predict that sentences having the same constituent structure as Most men eat an apple will each be interpreted by an assertion of the same form as this one.</Paragraph> <Paragraph position="12"> The computational counterpart of such an assertion is a database contained in the original database. (In computational terminology, such a database is called a view.) In summary, all the sentences that share the same syntactic structure denote assertions equivalent to databases having the same structure (but different extensions, of course). Thus, the denotation of a sentence has iconic properties and its structure is of the same kind as that of the representation of the world. 
We shall say that it is a possible fact, where &quot;fact&quot; means that the denotation of the sentence is a part of the world, while &quot;possible&quot; means that its structure conforms to that of the world.</Paragraph> <Paragraph position="13"> Since Φ has been defined independently from any relation of the database, false sentences can have the same kind of denotation as do true sentences. By this, I want to emphasize the fact that, when Φ does not yield the value true, it does not follow that the assertion is ill-formed. On the contrary, the fact that a false sentence fails to denote the actual state of the world does not prevent it from denoting a possible fact as long as its denotation is a well-formed assertion.</Paragraph> </Section> </Paper>
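<!--
The relations of Section 2, the entailment test of Section 4 and the sentence check of Section 5 can be sketched as follows. This is only an illustrative Python sketch under stated assumptions, not the author's implementation: the three-entity universe, the determiner table DET, the None placeholder for the attribute W (which the text does not explain), the symbolic stand-in for the complement denotation, and all function names are hypothetical.

```python
from itertools import combinations

# Toy universe of entities: an assumption made only for this sketch.
U = frozenset({"al", "bo", "cy"})


def powerset(s):
    """Return every subset of s as a frozenset."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Determiners denote functions; following Section 4 they are stored as
# symbols. DET (hypothetical) maps each symbol to the function it names:
# a noun denotation (set of entities) goes to a set of sets of entities.
DET = {
    "most": lambda n: frozenset(x for x in powerset(U) if 2 * len(n & x) > len(n)),
    "some": lambda n: frozenset(x for x in powerset(U) if n & x),
}

# Lexical relations hold pairs (denotation, lexical item).
Tn = [(frozenset({"al", "bo", "cy"}), "men")]
Tdet = [("most", "most"), ("some", "some")]


def compose(n_row, det_row):
    """C(Tn, Tdet): apply the determiner function to the noun denotation."""
    return DET[det_row[0]](n_row[0])


# Tnp rows: (np denotation, pointer into Tn, pointer into Tdet).
Tnp = [(compose(Tn[0], Tdet[i]), 0, i) for i in range(len(Tdet))]

# Tvp rows: (vp denotation, W, verb denotation, complement denotation).
# None stands in for W; "an apple" is a symbolic stand-in only.
EAT = frozenset({"al", "bo", "cy"})
EAT_AN_APPLE = frozenset({"al", "bo"})
Tvp = [(EAT_AN_APPLE, None, EAT, "an apple")]


def np_is_compositional(row):
    """Check the schema constraint C(Tn, Tdet) = [np] for one Tnp row."""
    den, n_ptr, det_ptr = row
    return den == compose(Tn[n_ptr], Tdet[det_ptr])


def entails(row1, row2):
    """Entailment for non-functional types: inclusion of denotations."""
    return row1[0] <= row2[0]


def vp_axiom_holds(row):
    """Tvp axiom: the verb denotation includes the vp denotation."""
    return row[2] >= row[0]


def sentence_true(np_row, vp_row):
    """S -> np,vp: is the vp set a member of the np set of sets?"""
    return vp_row[0] in np_row[0]
```

In this toy world, the row for "most men" entails the row for "some men" but not conversely, and sentence_true(Tnp[0], Tvp[0]) holds because two of the three men eat an apple. A false sentence still receives a well-formed check; it simply fails the membership test.
-->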