<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0804">
  <Title>Defining and Representing Preposition Senses: a preliminary analysis</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Most prepositions are highly polysemous and are often involved in a large number of derived, unexpected or metaphorical uses. Analyzing and representing the semantics of prepositions is a rather delicate and risky task, but one of great importance for any application that requires even a simple form of understanding. Spatial and temporal prepositions have received relatively in-depth study for a number of languages (e.g. (Boguraev et al. 87), (Verkuyl et al. 92)). The semantics of other types of prepositions, describing manner, instrument, amount or accompaniment, remains largely unexplored (with a few exceptions, however, such as avec (with) (Mari 00)).</Paragraph>
    <Paragraph position="1"> Our general application framework is knowledge extraction using linguistic and symbolic techniques. In this framework, the treatment of predicative forms is crucial to characterize actions, processes and states.</Paragraph>
    <Paragraph position="2"> Predicative forms include verbs, but also prepositions, which have a heavy semantic weight by themselves. Also of much interest are the verb-preposition-NP interactions. This short document is a brief analysis of how preposition uses (as arguments or adjuncts) and senses, in standard utterances, can be organized and characterized. The method presented here, applied to French, seems general and applicable to many languages. Our proposal is a feasibility study and the elements of a working method, with some results that still require, e.g., a lot of lexical tuning, rather than a set of firmly established results. We propose an organization of preposition senses into families where basic usages as well as metaphorical ones are identified and contrasted. A semantic representation in Lexical Conceptual Structure (LCS) is proposed, where great attention is devoted to the economy and expressivity of the primitives used. An evaluation of the accuracy and relevance of the sense distinctions concludes this paper.</Paragraph>
    <Paragraph position="3"> Prepositions are mainly studied in isolation. We think this first step is necessary before studying their interactions with verbs. These interactions are indeed very diverse, from standard composition (the most frequent case), to facet activation and to complex situations of mutual influence, involving non-monotonic forms of semantic composition.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Defining a preposition semantics for
French: Methodological issues
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Delimiting preposition senses
</SectionTitle>
      <Paragraph position="0"> Before looking in more depth at the semantic representation of preposition senses, let us investigate a few elements for delimiting senses that settle our theoretical and practical perspective. We prefer to use the term 'strategy' for delimiting senses, since there is obviously no theory or even 'formal' procedure. This task is extremely difficult, but necessary for any real NLP application with fairly wide coverage. Very informally, in our perspective, we assume that a sense (more or less large and constrained) of a lexeme has a basic form and basic expressions or usages (surface forms reflecting the basic sense). The basic sense gives rise to derived usages, which are more or less constrained and limited, via metonymy, metaphor, slight sense shifts or co-composition (Pustejovsky, 1991, 1995). One of the difficulties is, given a set of usages, to partition them into semantically coherent sets, each set corresponding to a sense.</Paragraph>
      <Paragraph position="1"> Sense delimitation is largely an open problem. It is indeed almost impossible to state precise and general principles that characterize the boundaries of the different senses of a lexeme and, finally, what a sense exactly is. The difficulty is then to elaborate a coherent system for sense delimitation and for characterizing sense and usage variations. Solutions have been proposed, which are not totally satisfactory. For example, WordNet (Fellbaum, 1997) tends to introduce one sense for each group of very closely related usages: it has 27 different senses for the verb give. [Proceedings of the SIGLEX/SENSEVAL Workshop on Word Sense Disambiguation: Recent Successes and Future Directions, Philadelphia, July 2002, pp. 25-31. Association for Computational Linguistics.]</Paragraph>
      <Paragraph position="2"> Distinctions between senses are often very subtle and somewhat hard to represent in a semantic representation. This approach is very useful in the sense that it provides a very detailed description of the usages of a large number of words in English, but we think it lacks generalizations about language, which are often useful for NLP systems to work efficiently. On the other side are AI-based perspectives, which tend to postulate a unique sense for a lexeme and very complex derivation procedures, involving complex logical systems, to produce different senses and possibly sub-senses.</Paragraph>
      <Paragraph position="3"> The approach taken in WordNet is close to that taken by a number of paper dictionaries, where sense distinctions are very numerous and essentially based on usage. These distinctions are, in large part, based on the semantic nature of the arguments. There are confusions between what we view as 'basic' senses and derived ones. Indeed, a number of situations that would be analyzed as metaphors or metonymies are identified as original senses. Consequently, dictionaries are certainly a very good tool to identify the different usages and senses of a lexeme, but they cannot be used directly in our framework. There are however a few very welcome exceptions, such as the German-French Harrap's dictionary, which has a very relevant and sound approach to multilingualism based on a conceptual analysis of language and of translation.</Paragraph>
      <Paragraph position="4"> Our perspective is between these two 'extremes', lexicography and AI. We think that the different usages of a word should be organized around a small number of relatively generic senses. From these senses, generative procedures must produce or recognize derived usages. These procedures must obviously be sound, and must not over-generate (e.g. the rules claimed to be general in (Lakoff et al. 99) must certainly not be taken for granted).</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 A few criteria for delimiting preposition
senses
</SectionTitle>
      <Paragraph position="0"> The identification of a preposition sense is essentially based on the observation of groups of usages. It is then confirmed by two criteria: (a) the nature and the stability within a certain semantic domain of the type of the head noun of the PP, that confirms the ontological basis of the sense and, concomitantly, (b) the restrictions required by the verb on the nature of the PP, if it is an argument. Dictionary definitions and multilingual considerations may also help. Pragmatic factors may also interfere, but this is outside the scope of this study (e.g. Busa 96).</Paragraph>
      <Paragraph position="1"> Although prepositions have some idiosyncratic usages (probably much less in French than in English), most senses are relatively generic and can be characterized using relatively consensual and high-level ontology labels.</Paragraph>
      <Paragraph position="2"> Let us consider the case of par (see footnote 1). The following 6 senses can be quite easily identified and characterized.</Paragraph>
      <Paragraph position="3"> They come from very diverse ontological domains but they all seem to be at approximately the same level of abstraction:
• proportion or distribution: il gagne 1500 Euros par mois (he earns 1500 Euros per month),
• causality: as in passives but also e.g. in par mauvais temps, je ne sors pas (in bad weather I don't go out),
• origin: je le sais par des amis (I know it from friends),
• via: je passe par ce chemin (I go via this path),
• tool or means: je voyage par le train (I travel by train),
• approximation of a value: nous marchons par 3500m d'altitude (we hike at an altitude of 3500m).</Paragraph>
      <Paragraph position="4"> An important point is that uses with par do not necessarily cover all the conceptual field associated with each sense. For example, the expression of the idea of approximation using par is rather restricted to localization, speed or movement, not e.g. to amounts of money. One of the tasks is then to characterize, for each sense, what is the portion of the conceptual field which is covered. This is done via two means: (1) by a semantic characterization of the NP dominated by the preposition and (2) by the analysis of the restrictions imposed by the verb of the clause on the PP, or, conversely, by the type or family of the verb the preposition can be combined with, for that particular sense. Let us now examine the basic restrictions for 3 senses of par. The 'VIA' sense is basically subcategorized by movement verbs; it is a path, subcategorizing for a noun of type 'way' or 'route' or, by a kind of metonymic extension, any object which can define a trajectory, e.g. an aperture (by the window). It has numerous metaphors in the psychological and epistemic domains (e.g. Il passe par des moments difficiles (He experiences difficult moments)).</Paragraph>
      <Paragraph position="5"> Footnote 1: This is obviously one possible characterization of the different meanings of par, which is very much dependent on the theory of meaning one considers.</Paragraph>
      <Paragraph position="6"> The 'ORIGIN' sense is narrower: it is essentially used in conjunction with communication or epistemic verbs, the argument is usually of type place, and the head noun is of type 'human'. Il transite par Paris (he commutes in Paris). We consider that nouns of type e.g. 'object with an informational content' introduce a metonymic extension, as in, e.g. par la radio / la presse (I know the news from the radio / the newspapers). Finally, note that there is a kind of continuum between Origin and Causality, as in: I know she wears bracelets from the noise she makes when she moves.</Paragraph>
      <Paragraph position="7"> Finally, the 'TOOLS or MEANS' sense is used with verbs describing concrete actions (e.g. creation and movement verbs, if we refer to verb class systems (e.g. (Levin 93), (Fellbaum 93))). In general it is an adjunct. It is typed as a means, and the head noun of the PP must be e.g. a tool, or, more generally, an object that allows the action to be realized. This object could be found e.g. in the encyclopedic knowledge associated with the verb, or via a functional relation in a thesaurus. It also has numerous metaphoric extensions (e.g. je traite ce phénomène par la logique temporelle (I deal with this phenomenon 'by' temporal logic)).</Paragraph>
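      <Paragraph position="8"> The restrictions given above for three senses of par can be pictured as a small lookup table. The following Python sketch is only an illustration under stated assumptions: the names SenseRestriction, PAR_SENSES and candidate_senses, and the type labels (in particular "informational_object" and "vehicle"), are hypothetical and are not part of the lexicon described in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SenseRestriction:
    """Selectional restrictions attached to one sense of a preposition."""
    sense: str
    verb_classes: frozenset  # verb families the sense combines with
    noun_types: frozenset    # semantic types accepted for the PP head noun


# Restrictions paraphrased from the text; every label is an illustrative assumption.
PAR_SENSES = (
    SenseRestriction("via", frozenset({"movement"}),
                     frozenset({"way", "route", "aperture"})),
    SenseRestriction("origin", frozenset({"communication", "epistemic"}),
                     frozenset({"human", "informational_object"})),
    SenseRestriction("means", frozenset({"creation", "movement"}),
                     frozenset({"tool", "vehicle"})),
)


def candidate_senses(verb_class, noun_type):
    """Return the senses whose restrictions accept both the verb class and the noun type."""
    return [r.sense for r in PAR_SENSES
            if verb_class in r.verb_classes and noun_type in r.noun_types]


print(candidate_senses("movement", "route"))    # ['via']
print(candidate_senses("epistemic", "human"))   # ['origin']
print(candidate_senses("movement", "vehicle"))  # ['means']
```

 Both criteria are checked together, so a movement verb alone does not decide between 'via' and 'means': the type of the head noun does.</Paragraph>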
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 Some difficulties
</SectionTitle>
      <Paragraph position="0"> However, there are many well-known difficulties inherent to the selectional restriction approach, where additional, non-trivial, world knowledge is required to make sense distinctions. Consider the usage: 'Dans followed by an NP of type location' (e.g. to be in a drawer).</Paragraph>
      <Paragraph position="1"> Location is obviously too general a restriction (*to be in the shelf). It is then necessary to enter into more complex descriptions, specifying that the location has a (salient) 'inside', that it is not just a surface, etc. However, as far as only elementary spatial properties are concerned, this remains feasible.</Paragraph>
      <Paragraph position="2"> More complex is the case of boire dans un verre (literally: drink in a glass). This example highlights the complex interactions between the verb and its PP. The preposition is part of the PP, not part of a complex verb form, this latter construction being quite unusual in French. The container is not neutral: while verre, tasse, bol,... are acceptable arguments, bouteille, robinet (bottle, faucet) are not, probably because of their narrow neck, which prevents the drinker from having his mouth in the container. This characterization becomes more complex and, probably, an interpretation for example in terms of Euclidean geometry could be necessary.</Paragraph>
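      <Paragraph position="3"> The two cases above (dans with a location, and boire dans with a container) can be sketched as feature checks on the head noun. This is a deliberately simplified Python illustration: the feature names has_inside and wide_opening, the noun table and both functions are assumptions made for the example, and a real treatment would need the richer geometric knowledge mentioned above.

```python
# Hypothetical noun features; the values are illustrative, not corpus-derived.
NOUN_FEATURES = {
    "drawer": {"location": True, "has_inside": True, "wide_opening": True},
    "shelf": {"location": True, "has_inside": False, "wide_opening": False},
    "glass": {"location": True, "has_inside": True, "wide_opening": True},
    "bottle": {"location": True, "has_inside": True, "wide_opening": False},
}


def accepts_dans(noun):
    """'dans' + NP: being a location is not enough, a salient inside is required."""
    features = NOUN_FEATURES[noun]
    return features["location"] and features["has_inside"]


def accepts_boire_dans(noun):
    """'boire dans' + NP: the container must also have a wide opening."""
    return accepts_dans(noun) and NOUN_FEATURES[noun]["wide_opening"]


print(accepts_dans("drawer"), accepts_dans("shelf"))              # True False
print(accepts_boire_dans("glass"), accepts_boire_dans("bottle"))  # True False
```

 The second check depends on the verb as well as on the noun, which is the point of the boire dans example: the restriction cannot be stated on the preposition alone.</Paragraph>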
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 A preliminary semantic structure for
French prepositions
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 A general classification
</SectionTitle>
      <Paragraph position="0"> Here is a proposed organization of the different senses that French prepositions may have. Senses are organized on three levels:
1. a first level characterizes a semantic family, of a level roughly comparable to thematic roles: localization, manner, quantity, accompaniment, etc.,
2. a second level accounts for the different facets of the semantic family, e.g. source, destination, via, fixed position for the localization family,
3. a third level characterizes, roughly speaking, the modalities of a facet when appropriate. For example, the facet manner and attitudes is decomposed into 3 modalities: basic manner, manner by comparison and manner with a reference point. Due to space limitations, this latter level will not be developed in this document.</Paragraph>
      <Paragraph position="1"> It is also important to note that each preposition sense is considered from the point of view of its basic usage and as the source of numerous metaphors. For example, origin is basically spatial, but has numerous metaphorical transpositions into the temporal, psychological and epistemic domains, to cite just a few generic cases.</Paragraph>
      <Paragraph position="2"> Here is our classification; one or more examples follow each family to illustrate the definitions, which cannot be given here in extenso due to space limitations:
• Localization with subsenses: - source, - destination, - via/passage, - fixed position.</Paragraph>
      <Paragraph position="3"> Destination may be decomposed into destination reached or not (possibly vague), but this is often contextual. From an ontological point of view, all of these senses can, a priori, apply to spatial, temporal or more abstract arguments.</Paragraph>
      <Paragraph position="4">
• Quantity with subsenses: - numerical or referential quantity, - frequency and iterativity, - proportion or ratio.</Paragraph>
      <Paragraph position="5"> Quantity can be either precise (temperature is 5 degrees above 0) or vague. Frequency and iterativity, e.g.: he comes several times per week.
• Manner with subsenses: - manners and attitudes, - means (instrument or abstract), - imitation or analogy.</Paragraph>
      <Paragraph position="6"> Imitation: he walks like a robot; he behaves according to the law.
• Accompaniment with subsenses: - adjunction, - simultaneity of events (co-events), - inclusion, - exclusion.</Paragraph>
      <Paragraph position="7"> Adjunction: flat with terrace / steak with French fries / tea with milk. Exclusion: they all came except Paul.</Paragraph>
      <Paragraph position="8">
• Choice and exchange with subsenses: - exchange, - choice or alternative, - substitution.</Paragraph>
      <Paragraph position="9"> Substitution: sign for your child. Choice: among all my friends, he is the funniest one.
• Causality with subsenses: - cause, - goal or consequence, - intention.</Paragraph>
      <Paragraph position="10"> Cause: the rock fell under the action of frost.
• Opposition with two ontological distinctions: physical opposition and psychological or epistemic opposition. Opposition: to act contrary to one's interests.</Paragraph>
      <Paragraph position="11">
• Ordering with subsenses: - priority, - subordination, - hierarchy, - ranking, - degree of importance.</Paragraph>
      <Paragraph position="12"> Ranking: at school, she is ahead of me.
• Minor groups: - about, - in spite of, - comparison.</Paragraph>
      <Paragraph position="13"> About: a book concerning dinosaurs.</Paragraph>
      <Paragraph position="14"> Each of the facets described above is associated with a number of prepositions. Here is a brief description of the Ordering family, with its 2 subsequent levels: Fig. 1 - prepositions of the Ordering family (columns: facet, modality, preposition sense).</Paragraph>
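      <Paragraph position="15"> The first two levels of the classification listed in this section can be held in a plain mapping from family to facets. The Python sketch below copies the family and facet names from the list above; the variable and function names are hypothetical, and treating the two ontological distinctions of Opposition as facets is a simplifying assumption.

```python
# Levels 1 and 2 of the classification, copied from the list in this section.
PREPOSITION_FAMILIES = {
    "localization": ["source", "destination", "via/passage", "fixed position"],
    "quantity": ["numerical or referential quantity",
                 "frequency and iterativity", "proportion or ratio"],
    "manner": ["manners and attitudes", "means", "imitation or analogy"],
    "accompaniment": ["adjunction", "simultaneity of events",
                      "inclusion", "exclusion"],
    "choice and exchange": ["exchange", "choice or alternative", "substitution"],
    "causality": ["cause", "goal or consequence", "intention"],
    # Assumption: the two ontological distinctions are stored like facets.
    "opposition": ["physical opposition",
                   "psychological or epistemic opposition"],
    "ordering": ["priority", "subordination", "hierarchy",
                 "ranking", "degree of importance"],
    "minor groups": ["about", "in spite of", "comparison"],
}

# Level 3 (modalities) is only spelled out for one facet in the text.
MODALITIES = {
    ("manner", "manners and attitudes"): [
        "basic manner", "manner by comparison", "manner with a reference point"],
}


def families_of(facet):
    """Return every family that lists the given facet."""
    return [family for family, facets in PREPOSITION_FAMILIES.items()
            if facet in facets]


def sense_path(family, facet, modality=None):
    """Build a 'family - facet - modality' label, checking each level."""
    if facet not in PREPOSITION_FAMILIES[family]:
        raise ValueError(f"{facet!r} is not a facet of {family!r}")
    parts = [family, facet]
    if modality is not None:
        if modality not in MODALITIES.get((family, facet), []):
            raise ValueError(f"{modality!r} is not a modality of {facet!r}")
        parts.append(modality)
    return " - ".join(parts)


print(families_of("ranking"))                         # ['ordering']
print(sense_path("localization", "fixed position"))   # localization - fixed position
```

 The path format mirrors the way senses are located in the hierarchy later in the paper (e.g. 'localization - fixed position').</Paragraph>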
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Representing the meaning of preposition
senses
</SectionTitle>
      <Paragraph position="0"> Each sense is associated with a semantic representation, often largely underspecified. Lower levels in the hierarchy receive a more precise representation, constructed in a monotonic way. Senses are described at two levels: (1) by means of a thematic grid characterizing the 'standard' function of each argument, using the system of 21 thematic roles we have defined, and, mainly, (2) by means of the Lexical Conceptual Structure (LCS) (Jackendoff 90, 97), which seems to be sufficiently expressive for that purpose. Compared to the description in LCS of verbs, representing prepositions in LCS is rather straightforward and much more adequate. A few principles guide this description: (1) the representation of generic senses (e.g. family level) subsumes the representation of their daughters, (2) different senses of a given preposition receive substantially different semantic representations, (3) metaphoric uses are characterized in part by semantic field substitution in the LCS, not by a different representation with different primitives, and (4) the number of primitives representing prepositions must be as limited as possible. These primitives are lower in the LCS primitive hierarchy than e.g. the GO, CAUSE or BE primitives. Points 1 to 3 are studied formally in (Saint-Dizier and Vazquez 2000). To summarize, LCS representations are associated with (1) a typed λ-calculus and (2) logical devices to represent and constrain under-specification (e.g. defaults, choices).</Paragraph>
      <Paragraph position="1"> We have identified 68 primitives to cover all the senses we have defined. To give a flavor of their descriptive level, here are a few of them, the definitions in English being quite informal:
Fig. 2 - a few LCS primitives for prepositions
primitive | short definition
ABOUT | concerning, theme of verb
ABOVE | fixed position above something, no contact
ON | same as ABOVE but with contact
(they opened the door with a knife). This is, in fact, a generic representation for most prepositions introducing instruments (realized as: à, à l'aide de, au moyen de, avec, par).</Paragraph>
      <Paragraph position="2"> Note that both senses are contrasted by different selectional restrictions on the NP, represented by the variable I.</Paragraph>
      <Paragraph position="3"> More subtle is the representation of contre (approximately 'against'), for which we give the comprehensive representation of its 5 senses:
• A first sense describes a physical object positioned against another one (in the hierarchy above: localization - fixed position - spatial): λI [place NEXT-TO +loc,c:+ ([thing I])] where NEXT-TO indicates a physical (+loc) proximity; contact is encoded by c:+ (Jackendoff 90) (see footnote 2), between two objects, I and K, where I is against K. It is important to note that the idea of movement, if any (as in: push the chair against the wall), comes from the verb, not from the preposition.</Paragraph>
      <Paragraph position="4"> Footnote 2: In French, our analysis is that contre describes a position, not a path.</Paragraph>
      <Paragraph position="5">
• Contre is also used to express opposition: to swim against the current or, metaphorically, in the epistemic or psychological domains: to argue against a theory / a practice. The primitive OPPOSITE is used to capture the fundamental idea of antagonistic forces: λI [place OPPOSITE +loc/+psy/+epist,c:-,ta:+ ([thing I])].</Paragraph>
      <Paragraph position="6"> In that case, the physical contact is not relevant (c:-), while the agonist / antagonist force is present (noted ta:+, (Jackendoff 90), slightly simplified here).</Paragraph>
      <Paragraph position="7">
• Contre can also be used to express notions like providing a certain protection or defense, in the hierarchy 'causality - goal': medicine for cough.</Paragraph>
      <Paragraph position="8"> It is represented as follows: λK [event/state FOR ([event/thing K])]
• The fourth sense captures the notion of exchange (in the hierarchy 'choice and exchange'): lit.: I substitute my hors d'oeuvre against a dessert; the representation is as follows:</Paragraph>

      <Paragraph position="10">
• The last sense is related to the expression of a ratio or proportion (hierarchy 'quantity - proportion or ratio'): lit. 9 votes against 12: λK [amount AGAINST +quant ([amount K])]. As can be seen, the representations are all substantially different. Substitutions on basic fields, in particular semantic fields, allow numerous metaphorical uses to be taken into account within a sense.</Paragraph>
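      <Paragraph position="11"> Principle (3) of this section, that metaphoric uses are obtained by substituting the semantic field rather than by changing the primitive, can be sketched with a small record type. This Python fragment is an informal illustration only: the class LCS, its flat layout, the function substitute_field, the category label "place" and the field label "+psy" are assumptions made for the example; only the primitives NEXT-TO and OPPOSITE and the features +loc, c:+, c:- and ta:+ are taken from the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LCS:
    """A flattened, illustrative view of one preposition sense in LCS."""
    category: str          # ontological category of the whole structure
    primitive: str         # e.g. "NEXT-TO", "OPPOSITE"
    fields: tuple          # semantic fields, read as a disjunction
    features: tuple = ()   # e.g. contact (c) or agonist/antagonist force (ta)
    arg_type: str = "thing"
    var: str = "I"

    def render(self):
        """Linearize the structure as a lambda term over the NP variable."""
        subscript = ",".join(("/".join(self.fields),) + tuple(self.features))
        return (f"\u03bb{self.var} [{self.category} {self.primitive} {subscript} "
                f"([{self.arg_type} {self.var}])]")


CONTRE_POSITION = LCS("place", "NEXT-TO", ("+loc",), ("c:+",))
CONTRE_OPPOSITION = LCS("place", "OPPOSITE", ("+loc",), ("c:-", "ta:+"))


def substitute_field(lcs, field):
    """Derive a metaphorical use: same primitive and features, new semantic field."""
    return replace(lcs, fields=(field,))


# 'to argue against a theory': the spatial field is replaced, nothing else changes.
CONTRE_PSY = substitute_field(CONTRE_OPPOSITION, "+psy")
```

 Because the record is immutable, the basic sense is left untouched and the metaphorical use is a separate value that shares its primitive and features.</Paragraph>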
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Related work
</SectionTitle>
    <Paragraph position="0"> The closest work to ours was developed about 6 years ago by Bonnie Dorr; it is accessible at: http://www.umiacs.umd.edu/~bonnie/AZ-preps-English.lcs. This is a very large database of preposition semantic representations, characterized by their LCS representation and, sometimes, by a thematic grid. There are about 500 entries (compared to our 170 entries), for probably all English prepositions. Although it is not easy to go into such a huge work dedicated to a different language and to make comparisons, we outline below some differences we feel have some importance.</Paragraph>
    <Paragraph position="1"> Our perspective was first to organize preposition senses from usages, according to a certain theoretical view on what a sense is. The next goal was to evaluate the results in order to confirm or invalidate our perspective. Then came the semantic representations, with an analysis of the adequacy of the LCS system. We also took care of the complex interactions with verbs in order to identify as clearly as possible the semantic contribution of each element.</Paragraph>
    <Paragraph position="2"> Each preposition sense in Bonnie Dorr's work receives a comprehensive semantic representation in LCS. Senses are paraphrased by an example, in a way close to synsets in WordNet. Some restrictions are added, and syntactic positions are made explicit.</Paragraph>
    <Paragraph position="3"> Let us now compare some elements of these two systems. In our approach, we introduced disjunctions of semantic fields in order to account e.g. for metaphors. This limits the number of entries. For example, for behind, B. Dorr has 3 senses (locative, temporal and with movement) whereas we have just one, with a disjunction for the first 2 cases.</Paragraph>
    <Paragraph position="4"> We also tried to be compositional: in Bonnie Dorr's work, there is e.g. a primitive called AWAY-FROM, in addition to AWAY and FROM. We tend to consider that these two primitives can be combined compositionally and that the composite AWAY-FROM is not motivated.</Paragraph>
    <Paragraph position="5"> Another difference is that we have considered a kind of 'minimal' semantics for prepositions, without considering potential combinations with verbs. As a result, in B. Dorr's work there is, e.g. for against, a sense describing a fixed position and another one describing a movement where the moved object reaches a position against another object. For this latter case, we think that the movement is only in the semantics of the verb and is compositionally induced at the level of the proposition. The same remark holds for most prepositions expressing positions (north, west, inside, etc.). We have only one representation for the fixed position.</Paragraph>
    <Paragraph position="6"> Finally, depending on the fact that the source is given or not, into is represented by a combination of TOWARD(IN) or TO(IN). We do not see any reason for this distinction and think that origin and destination should be treated apart.</Paragraph>
    <Paragraph position="7"> These relatively minor differences indicate that, probably, Bonnie Dorr had a more 'lexicographic' view than we had in the sense descriptions. One of her motivations was to efficiently use her work in a machine translation system, where senses need to be relatively narrow and explicit to allow e.g. for a simpler multi-lingual treatment of prepositions.</Paragraph>
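    <Paragraph position="8"> The effect of field disjunctions on the number of entries, discussed above for behind, can be sketched as a grouping step. The Python fragment below is a hedged illustration: the entry tuples, the primitive name "BEHIND", the field labels and the function merge_fields are all hypothetical and do not reproduce either lexicon.

```python
from collections import defaultdict

# Hypothetical one-field-per-entry lexicon for 'behind'.
ENTRIES = [
    ("behind", "BEHIND", "+loc", "position"),
    ("behind", "BEHIND", "+temp", "position"),
    ("behind", "BEHIND", "+loc", "movement"),
]


def merge_fields(entries):
    """Collapse entries that differ only in semantic field into one entry
    carrying a disjunction (here a sorted tuple) of fields."""
    grouped = defaultdict(list)
    for preposition, primitive, field, kind in entries:
        grouped[(preposition, primitive, kind)].append(field)
    return {key: tuple(sorted(fields)) for key, fields in grouped.items()}


merged = merge_fields(ENTRIES)
print(len(ENTRIES), len(merged))                  # 3 2
print(merged[("behind", "BEHIND", "position")])   # ('+loc', '+temp')
```

 In the approach defended here, the remaining 'movement' entry would not be kept either, since the movement is attributed to the verb rather than to the preposition.</Paragraph>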
  </Section>
</Paper>