<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2105">
  <Title>Semantic Interpretation of Prepositions for NLP Applications</Title>
  <Section position="3" start_page="29" end_page="30" type="metho">
    <SectionTitle>
2 General Classes of PP Interpretation
</SectionTitle>
    <Paragraph position="0"> The semantic interpretation of prepositions in NLP has to deal with the following two orthogonal phenomena: regular (or compositional or productive) vs. irregular (or non-compositional or collocative) uses of prepositions, and uses in complements vs. uses in adjuncts (putting aside the well-known issue of borderline cases).2 There are thus the four cases indicated in Table 1. In German, PPs can also occur as invariant syntagmas in light verb constructions ('Funktionsverbgefüge') such as in Beschlag nehmen ('to occupy'), to which the complement-adjunct distinction does not apply. In the following, we set aside the interpretation of prepositions in fixed phrases, which is the case for the light verb constructions just mentioned and also for the irregular adjunct interpretation shown at the lower right of Table 1. This leaves us with three types of PP interpretation: (regular) adjunct interpretation, regular complement interpretation, and irregular complement interpretation.</Paragraph>
    <Paragraph position="1"> The standard examples for regular adjunct interpretation are local (or directional) and temporal PPs, which can be attached to verbs, nouns, and adjectives:  (1) a. He walked to the museum.</Paragraph>
    <Paragraph position="2"> b. They met in August.</Paragraph>
    <Paragraph position="3"> (2) a. the building on the hill b. the debate on Wednesday 2 See also (Rauh, 1993).</Paragraph>
    <Paragraph position="4"> (3) a. das  'the house that is dark in winter' Notice that German, unlike English, allows adjectives with PP adjuncts (or complements) as attributes of nouns, witness (3).</Paragraph>
    <Paragraph position="5"> It is characteristic of regular adjunct interpretation that the preposition has a meaning of its own and expresses some sort of relationship. Besides local and temporal specifications, there are of course many other relationships expressed by regular PP adjuncts, such as instrumental (4-a), comitative (4-b), and part-whole (4-c) interpretations.3 (4) a. John cleaned the floor with his shirt. b. Mary visited London with her sister.</Paragraph>
    <Paragraph position="6"> c. a building with large windows  The examples in (4) furthermore illustrate the well-known fact that prepositions are highly polysemous in general. Within our approach described in Sect. 4, prepositions currently have up to sixteen readings.4 We speak of a regular complement interpretation if the PP is subcategorized and its interpretation is identical to the (correct) interpretation of the PP when analyzed as an adjunct. These adjunct interpretations of PPs are defined by the set of PP rules, which are explained in Sect. 4.2. Examples of regular complements are wohnen in/auf/... ('to live in/on/...'), schicken nach/in/... ('to send to/into/...'), mitkommen mit ('to come along with'), and Einstieg in ('getting in'). Here, the choice of the preposition in the PP complement of the lexeme is determined by the semantic characterization of the complement.</Paragraph>
    <Paragraph position="7"> In the case of irregular complement interpretation, in contrast, the selection of the preposition is an idiosyncratic property of the subcategorizing lexical entry. The preposition alone can be viewed as semantically empty; only the combination of the lexeme and the preposition bears semantics. Examples of verbs, adjectives, and nouns of this sort are glauben an ('to believe in'), sich verlassen auf ('to depend on'), gut in ('good at'), and Wut auf ('anger at').</Paragraph>
    <Paragraph position="8"> 3 In our approach, 91 different relations occur, often in combinations; see Sect. 4.2.</Paragraph>
    <Paragraph position="9"> 4 The average number of readings is 2.44, which is slightly higher than the polysemy degree 2.27 (= 847/373) reported by Litkowski and Hargraves (2005) for English. But our lexicon currently contains only a few phrasal (or complex) prepositions like in Anbetracht ('in view of'), which often have only one reading.</Paragraph>
    <Paragraph position="10"> It should be noted, however, that there is a whole spectrum of subregular phenomena within what we called "irregular" complement semantics.5 Consider, for instance, the verbs ernennen zu, bestimmen zu ('appoint', 'designate'), küren zu ('elect'), and weihen zu ('ordain'). Even if the preposition zu ('to') could be said to semantically express some sort of abstract goal in these cases, an adequate interpretation rule associated with that preposition would have to make reference to a rather restricted semantic class of verbs. We therefore regard it as a matter of lexical semantic organization to capture such subregularities within the interpretation of prepositional complements by means of appropriate semantic verb templates in the lexicon; see (Osswald et al., 2006) for details. In the context of the present paper, an interpretation of a prepositional complement is called irregular if the interpretation is not covered by one of our PP interpretation rules.</Paragraph>
    <Paragraph position="11"> Table 2 shows the frequency of adjunct and complement interpretations in different corpora. The numbers in the token rows are derived from automatic corpus parses (see below), so some noise is to be expected, but the trends should be valid.</Paragraph>
  </Section>
  <Section position="4" start_page="30" end_page="30" type="metho">
    <SectionTitle>
3 The Semantic Formalism MultiNet
</SectionTitle>
    <Paragraph position="0"> MultiNet is one of the few knowledge representation paradigms which have also been used as a semantic interlingua in real-life NLP applications (Leveling and Helbig, 2002). The MultiNet formalism represents meanings of natural language expressions by means of (partial) semantic networks. A semantic network consists of nodes representing concepts and edges representing relations between concepts. Every node is additionally labeled by a sort arising from an ontologically or epistemically motivated classification of concepts (see Appendix, Table 4). Apart from that, every node is embedded in a system of layer attributes and their values expressing the extension type, facticity, genericity, referential determination, quantification, and others. [...]tional verbs in English.</Paragraph>
    <Paragraph position="1"> The relations connecting the concepts in a semantic network have to be taken from a predefined set of expressional means, which are systematically described and formally characterized (Helbig, 2006).</Paragraph>
    <Paragraph position="2"> A strongly abbreviated description of all MultiNet relations used in this paper can be found in Table 5 of the Appendix.</Paragraph>
    <Paragraph position="3"> For the semantic characterization of the selectional restrictions (i.e. valencies) of lexemes, an additional set of 16 binary semantic features (such as animate, human, artificial, movable, and institution) is provided, which can be combined with the above-mentioned sorts to yield a rich repertoire of semantic characterizations for the description of the slots and fillers corresponding to the valencies. These expressional means have been used in the computational lexicon HaGenLex (see Sect. 4.1).</Paragraph>
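    <Paragraph position="4"> As a rough illustration of the network structure described in this section, the following Python sketch models concept nodes labeled with a sort and edges labeled with a relation drawn from a predefined inventory. It is not the MultiNet implementation: the class names, the node names, and the small relation inventory are assumptions made for illustration, and layer attributes are left out entirely. The sorts co and s and the relation ORIGM are borrowed from the aus.origm rule example (eine Platte aus Kupfer) given with the interpretation rules.</Paragraph>
    <Paragraph position="5">
```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A concept node labeled with a sort (hypothetical minimal model)."""
    name: str
    sort: str


@dataclass(frozen=True)
class Edge:
    """A directed edge labeled with a relation from the predefined set."""
    relation: str
    source: Node
    target: Node


@dataclass
class SemanticNetwork:
    """A partial semantic network: concept nodes plus relation edges."""
    allowed_relations: frozenset
    edges: list = field(default_factory=list)

    def add_edge(self, relation, source, target):
        # Relations must come from the predefined inventory.
        if relation not in self.allowed_relations:
            raise ValueError(f"unknown relation: {relation}")
        edge = Edge(relation, source, target)
        self.edges.append(edge)
        return edge

    def nodes(self):
        # Collect every node that occurs at either end of an edge.
        found = []
        for edge in self.edges:
            for node in (edge.source, edge.target):
                if node not in found:
                    found.append(node)
        return found


# 'eine Platte aus Kupfer' ('a plate out of copper'); the inventory below is
# a small illustrative subset, not the full MultiNet relation set.
RELATIONS = frozenset({"LOC", "TEMP", "INSTR", "ORIGM"})
plate = Node("platte", "co")
copper = Node("kupfer", "s")
net = SemanticNetwork(RELATIONS)
net.add_edge("ORIGM", plate, copper)
```
    </Paragraph>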
  </Section>
  <Section position="5" start_page="30" end_page="32" type="metho">
    <SectionTitle>
4 Resources for PP Interpretation
</SectionTitle>
    <Paragraph position="0"> Three sources of preposition information are available to the syntactico-semantic parser used in our NLP applications:6 subcategorization information in the lexicon, context-dependent PP interpretation rules, and an annotated PP corpus.</Paragraph>
    <Section position="1" start_page="30" end_page="31" type="sub_section">
      <SectionTitle>
4.1 Selection of PPs in the Lexicon
</SectionTitle>
      <Paragraph position="0"> Our parser makes use of the computational lexicon HaGenLex (Hagen German Lexicon, see (Hartrumpf et al., 2003)), which is a general domain lexicon for German with about 25,000 entries (including 136 prepositions). Each entry contains detailed morpho-syntactic and semantic information. In particular, the lexicon provides valency frames for nouns, verbs, and adjectives (in the lexical feature SELECT). This includes complements that are syntactically realized by a PP.</Paragraph>
      <Paragraph position="1"> Each complement is characterized by one or more syntactic specifications and its semantic contribution to the head word. This contribution can be a MultiNet relation (case role) or a more complex MultiNet expression directly or indirectly connecting the representation of the complement and of the head, which typically involves other complements. In order to capture semantic constraints on possible adjuncts, the set of semantic relations compatible with a given lexeme is specified in the lexicon (under the lexical feature COMPAT-R). This information is inherited from the semantic class of the lexeme so that the set of all possible adjunct readings for a PP (see next section) can be filtered. Lexical entries exemplifying both aspects are listed in Fig. 1.</Paragraph>
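      <Paragraph position="2"> The filtering role of COMPAT-R can be sketched in a few lines of Python. Everything concrete here is hypothetical: the semantic class names, the lexeme IDs, the rule names in.loc and in.temp, and the dictionary layout are invented for illustration and do not reproduce HaGenLex entries. The sketch also assumes that a reading is admissible only if every relation it introduces is listed in COMPAT-R, which is one plausible reading of the text rather than a documented criterion.</Paragraph>
      <Paragraph position="3">
```python
# Hypothetical semantic classes with the relations they are compatible with.
CLASS_COMPAT_R = {
    "motion-verb": {"LOC", "DIRCL", "TEMP", "INSTR"},
    "state-noun": {"TEMP"},
}

# Hypothetical lexicon entries; each lexeme inherits COMPAT-R from its class.
LEXICON = {
    "gehen.1.1": {"class": "motion-verb"},
    "ruhe.1.1": {"class": "state-noun"},
}


def compat_r(lexeme_id):
    """Return the relations compatible with a lexeme (inherited from its class)."""
    return CLASS_COMPAT_R[LEXICON[lexeme_id]["class"]]


def filter_adjunct_readings(lexeme_id, readings):
    """Keep only readings whose relations are all licensed by COMPAT-R."""
    allowed = compat_r(lexeme_id)
    return [r for r in readings if set(r["relations"]).issubset(allowed)]


# Candidate adjunct readings for a PP, as they might come from the PP rules.
candidate_readings = [
    {"rule": "in.loc", "relations": ["LOC"]},
    {"rule": "in.temp", "relations": ["TEMP"]},
]
```
      </Paragraph>
      <Paragraph position="4"> With these toy entries, both readings survive for the motion verb, whereas only the temporal reading survives for the state noun.</Paragraph>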
    </Section>
    <Section position="2" start_page="31" end_page="31" type="sub_section">
      <SectionTitle>
4.2 Interpretation Rules for Prepositions
</SectionTitle>
      <Paragraph position="0"> The second knowledge source for PP interpretation is a set of symbolic PP interpretation rules developed for adjunct interpretations. The premise of such a rule encodes the semantic and syntactic constraints under which a specific preposition interpretation is possible; the conclusion specifies a semantic network representing the PP semantics.</Paragraph>
      <Paragraph position="1"> Two simplified interpretation rules are shown in Fig. 2. [...]tions; if several rule premises can be unified with a given pair of a preposition's complement and a candidate mother, the PP disambiguation module resorts to a statistical back-off model to resolve this ambiguity. Currently, we have 332 rules.</Paragraph>
      <Paragraph position="2"> The interpretation rules can be viewed as a declarative part of the corresponding preposition entry in the lexicon. For maintenance reasons, the rules are stored and manipulated separately. They are linked to the lexicon by lexeme IDs.</Paragraph>
      <Paragraph position="3"> The rules show that PP semantics involves many areas of semantics. For example, MultiNet defines around 150 relations, and 91 of them are used in the conclusion of PP rules. The 10 most frequent ones are: LOC, VAL, TEMP, ATTR, ELMT, ATTCH, DIRCL, INSTR, SUBM, ORIGM (see Table 5). As exemplified by the second rule of Fig. 2, the semantic network specified in the conclusion of a rule often consists of more than one network edge; on average, an interpretation has 1.69 edges.</Paragraph>
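      <Paragraph position="4"> The premise/conclusion structure of such rules can be sketched as follows. This is an illustrative Python model, not the rule formalism used by the parser: the classes Concept and Rule and the function interpret are hypothetical, and treating c1 as the candidate mother and c2 as the preposition's complement is an assumption read off the example eine Platte aus Kupfer. The first rule mirrors the aus.origm example; the second rule, with its placeholder relations REL1 and REL2, is invented solely to show a multi-edge conclusion and a case where several premises match.</Paragraph>
      <Paragraph position="5">
```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Concept:
    name: str
    sort: str


@dataclass(frozen=True)
class Rule:
    name: str
    preposition: str
    premise: Callable      # takes (c1, c2), returns True if the rule applies
    conclusion: Callable   # takes (c1, c2), returns a list of network edges


RULES = [
    # Modeled on aus.origm: sort(c1)=co and sort(c2)=s yield (origm c1 c2).
    Rule(
        name="aus.origm",
        preposition="aus",
        premise=lambda c1, c2: c1.sort == "co" and c2.sort == "s",
        conclusion=lambda c1, c2: [("ORIGM", c1.name, c2.name)],
    ),
    # Hypothetical competing rule with a two-edge conclusion.
    Rule(
        name="aus.example",
        preposition="aus",
        premise=lambda c1, c2: c2.sort == "s",
        conclusion=lambda c1, c2: [
            ("REL1", c1.name, c2.name),
            ("REL2", c2.name, c1.name),
        ],
    ),
]


def interpret(preposition, mother, complement):
    """Return (rule name, edges) for every rule whose premise holds."""
    readings = []
    for rule in RULES:
        if rule.preposition == preposition and rule.premise(mother, complement):
            readings.append((rule.name, rule.conclusion(mother, complement)))
    return readings


readings = interpret("aus", Concept("platte", "co"), Concept("kupfer", "s"))
```
      </Paragraph>
      <Paragraph position="6"> Here both premises hold, so two readings are returned; choosing between them is the job of the statistical disambiguation step, which this sketch does not cover.</Paragraph>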
    </Section>
    <Section position="3" start_page="31" end_page="32" type="sub_section">
      <SectionTitle>
4.3 Annotated PP Corpus
</SectionTitle>
      <Paragraph position="0"> A third source of preposition information is an annotated PP corpus and statistics derived from it.</Paragraph>
      <Paragraph position="1"> The occurrences of six frequent prepositions in [...]tation.7 This knowledge acts as the training set for a machine learning component that disambiguates attachment and interpretation of PPs (see Sect. 5).</Paragraph>
      <Paragraph position="3"> [Fig. 2: two simplified PP interpretation rules] aus.origm; examples: eine Platte aus Kupfer ('a plate out of copper'), ...; rule: sort(c1)=co ∧ sort(c2)=s → (origm c1 c2). auf.attr language; examples: ein Artikel auf Englisch ('an article in English'), ...</Paragraph>
    </Section>
    <Section position="4" start_page="32" end_page="32" type="sub_section">
      <SectionTitle>
5 Preposition Interpretation within Semantic Parsing
</SectionTitle>
      <Paragraph position="0"> All the knowledge resources described in Sect. 4 are used by the parser to determine the correct interpretation of prepositions. Furthermore, PP attachment ambiguities are resolved on the basis of possible interpretations. The complement information (valency frames) in the lexicon licenses possible complement interpretations; the PP interpretation rules (combined with the adjunct information in the lexicon) license possible adjunct interpretations. In case of alternatives, they are disambiguated using statistics derived from the annotated PP corpus and a whole range of preference scores.8 The statistical data is represented in the form of a multidimensional back-off model. Each alternative is described by the rule name, the semantics of the preposition's complement, and the semantics of the possible syntactic head. If no exact match is found in the disambiguation statistics, the number of considered alternatives and the granularity of the description of an alternative are reduced by backing off in these two orthogonal dimensions; see (Hartrumpf, 2003; Hartrumpf, 1999) for details.</Paragraph>
      <Paragraph position="1"> 7 In case of so-called systematic ambiguities, both attachments have been classified as valid. Moreover, two readings were considered as equally likely in some cases.</Paragraph>
      <Paragraph position="2"> 8 Classical rule-based approaches often apply some sort of decision algorithm to disambiguate such cases; see e.g. (Hirst, 1987).</Paragraph>
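      <Paragraph position="3"> To make the back-off idea more concrete, the following Python sketch shows a strongly simplified variant that backs off along one dimension only, the granularity of the description of an alternative. It is not the model of (Hartrumpf, 2003): the counts, the rule name aus.loc, the sort labels used as semantic descriptions, and the three granularity levels are all assumptions, and the second dimension (the number of considered alternatives) is not modeled.</Paragraph>
      <Paragraph position="4">
```python
from collections import Counter

# Hypothetical counts from an annotated corpus, keyed by
# (rule name, complement semantics, head semantics).
COUNTS = Counter({
    ("aus.origm", "s", "co"): 8,
    ("aus.loc", "s", "co"): 1,
    ("aus.loc", "d", "co"): 5,
})

# Granularity levels, finest first: (keep complement, keep head).
LEVELS = [
    (True, True),    # rule name, complement semantics, head semantics
    (True, False),   # rule name and complement semantics
    (False, False),  # rule name only
]


def count_at_level(rule, complement, head, level):
    """Sum the counts that agree with the description at one granularity level."""
    keep_complement, keep_head = level
    total = 0
    for (r, c, h), n in COUNTS.items():
        if r != rule:
            continue
        if keep_complement and c != complement:
            continue
        if keep_head and h != head:
            continue
        total += n
    return total


def choose_reading(rules, complement, head):
    """Pick the most frequent rule at the finest level that has any evidence."""
    for level in LEVELS:
        scores = {r: count_at_level(r, complement, head, level) for r in rules}
        if any(scores.values()):
            return max(scores, key=scores.get), level
    return None, None
```
      </Paragraph>
      <Paragraph position="5"> For an exact match such as complement s and head co, the finest level decides; for an unseen head, the sketch drops the head semantics and retries, and only as a last resort compares the rules by their overall frequency.</Paragraph>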
    </Section>
  </Section>
</Paper>