<?xml version="1.0" standalone="yes"?>
<Paper uid="E93-1043">
  <Title>Coping With Derivation in a Morphological Component</Title>
  <Section position="3" start_page="368" end_page="368" type="metho">
    <SectionTitle>
2 Inheritance Lexica
</SectionTitle>
    <Paragraph position="0"> Research directed at reducing redundancy in the lexicon has come up with the idea of organizing the information hierarchically, making use of inheritance (see, e.g., [Daelemans et al., 1992; Russell et al., 1992]).</Paragraph>
    <Paragraph position="1"> Various formalisms supporting inheritance have been proposed; they can be classified into two major approaches. One uses defaults, i.e., inherited data may be overridden by more specific data. The default mechanism handles exceptions, which are an inherent phenomenon of the lexicon. A well-known formalism following this approach is DATR [Evans and Gazdar, 1989].</Paragraph>
    <Paragraph position="2"> The major advantage of defaults is the rather natural hierarchy formation they support: classes can be organized in a tree instead of a multiple-inheritance hierarchy. The drawbacks are that defaults are computationally costly and that one needs an interface to the sentence grammar, which is usually written in default-free feature descriptions.</Paragraph>
    <Paragraph position="3"> Although the term default is taken from knowledge representation, one should be aware that it is used quite differently here. In knowledge representation, defaults are used to describe uncertain facts which may or may not become explicitly known later on. (Footnote 2: An example of the use of defaults in knowledge representation is an inference rule like Birds typically can fly. In the absence of more detailed knowledge this allows me to conclude that Tweety, which I only know to be a bird, can fly. Should I later on get the additional information that Tweety is a penguin, I must revoke that conclusion.) Exceptions in the lexicon are of a different nature because they form a set that is known a priori: for any word it is known whether it is regular or an exception (footnote 3). The only motivation to use defaults in the lexicon is that they allow for a more concise and natural representation.</Paragraph>
    <Paragraph position="4"> The alternative approach organizes classes in a multiple-inheritance hierarchy without defaults.</Paragraph>
    <Paragraph position="5"> This means that lexical items can be described as standard feature terms organized in a type hierarchy (see, e.g., [Smolka, 1988; Carpenter et al., 1991]).</Paragraph>
    <Paragraph position="6"> The advantages are clear. There is no need for an interface to the grammar and computational complexity is lower.</Paragraph>
    <Paragraph position="7"> At the moment it is an open question which of the two approaches is the more appropriate. In our system we decided against introducing a new formalism. Most current natural language systems are based on feature formalisms and we see no obvious reason why the lexicon should not be feature-based (see also [Nerbonne, 1992]).</Paragraph>
    <Paragraph position="8"> While inheritance lexica, which are concerned with the syntactic word, have mainly been used to express generalizations over classes of words, the idea can also be used for the explicit representation of derivation. In [Nerbonne, 1992] we find such a proposal. What the proposal shares with most of the other schemes is that not much consideration is given to morphophonology. Some authors acknowledge the problem by using a function morphologically append instead of pure concatenation of morphs, but it remains unclear how this function should be implemented. The approach presented here follows this line of research in complementing an extended two-level morphology with a hierarchical lexicon that contains as entries not only words but also morphs. This way morphophonology can be treated in a principled way while retaining the advantages of hierarchical lexica.</Paragraph>
  </Section>
  <Section position="4" start_page="368" end_page="369" type="metho">
    <SectionTitle>
3 Two-Level Morphology
</SectionTitle>
    <Paragraph position="0"> For dealing with a compositional syntax and semantics of derivatives one needs a component that is capable of constructing arbitrary words from a finite set of morphs according to morphotactic rules.</Paragraph>
    <Paragraph position="1"> Very successful in the domain of morphological analysis/generation are finite-state approaches, notably two-level morphology [Koskenniemi, 1984]. Two-level morphology deals with two aspects of word formation, morphotactics and morphophonology. Morphotactics: The combination rules that govern which morphs may be combined in what order to produce morphologically correct words.</Paragraph>
    <Paragraph position="2"> Morphophonology: Phonological alterations occurring in the process of combination.</Paragraph>
    <Paragraph position="3"> Morphotactics is dealt with by a so-called continuation lexicon. In expressive power this is equivalent to a finite-state automaton consuming morphs.</Paragraph>
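    <Paragraph> The idea of a continuation lexicon as an automaton that consumes morphs can be illustrated with a minimal sketch. The class names (Root, VerbEnding, End), the tiny morph inventory and the function accepts are assumptions made for this illustration only; they are not taken from any actual two-level lexicon.

```python
# Hypothetical continuation lexicon: every morph names the class
# (i.e., the automaton state) that may follow it.
LEXICON = {
    "Root": {"arbeit": "VerbEnding", "sag": "VerbEnding"},
    "VerbEnding": {"st": "End", "t": "End", "en": "End"},
}


def accepts(morphs, state="Root"):
    """Return True if the morph sequence is licensed by the lexicon."""
    for morph in morphs:
        entries = LEXICON.get(state, {})
        if morph not in entries:
            return False          # no transition for this morph
        state = entries[morph]    # move to the continuation class
    return state == "End"         # a legal word must end in a final class
```

    Under these assumptions accepts(["arbeit", "st"]) holds, whereas a bare stem or an ending without a stem is rejected. The sketch covers morphotactics only; it says nothing about morphophonological alterations.</Paragraph>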
    <Paragraph position="4"> Footnote 3: We do not consider language acquisition here.</Paragraph>
    <Paragraph position="5">  Morphophonology is treated by assuming two distinct levels, namely a lexical and a surface level. The lexical level consists of a sequence of morphs as found in the lexicon; the surface level is the form found in the actual text/utterance. The mapping between these two levels is constrained by so-called two-level rules describing the contexts for certain phonological alterations.</Paragraph>
    <Paragraph position="6"> An example of a morphophonological alteration in German is the insertion of e between a stem ending in t or d and a suffix starting with s or t; e.g., the 2nd person singular of the verb arbeiten (to work) is arbeitest. In two-level morphology this means that the lexical form arbeit+st has to be mapped to the surface form arbeitest. The following rule enforces just that mapping: (1) +:e ⇔ {d, t} _ {s, t}; A detailed description of two-level morphology can be found in [Sproat, 1992, chapter 3].</Paragraph>
    <Paragraph position="7"> In its basic form two-level morphology is not well suited for our task because all the morphosyntactic information is encoded in the lexical form. When connected to a syntactic/semantic component one needs an interface to mediate between the morphological and the syntactic word. We will show in section 5 how our version of two-level morphology is extended to provide such an interface.</Paragraph>
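    <Paragraph> The effect of rule (1) in the generation direction can be sketched as follows. This is only an illustration, not a two-level transducer: the function name realize is hypothetical, and the assumption that a morph boundary outside the stated context is realized as nothing is ours.

```python
LEFT_CONTEXT = {"d", "t"}    # stem-final characters named in rule (1)
RIGHT_CONTEXT = {"s", "t"}   # suffix-initial characters named in rule (1)


def realize(lexical: str) -> str:
    """Map a lexical form with '+' morph boundaries to a surface form."""
    surface = []
    for i, ch in enumerate(lexical):
        if ch != "+":
            surface.append(ch)
            continue
        left = lexical[i - 1:i]       # empty string at the word start
        right = lexical[i + 1:i + 2]  # empty string at the word end
        if left in LEFT_CONTEXT and right in RIGHT_CONTEXT:
            surface.append("e")       # +:e in the context of rule (1)
        # otherwise the boundary has no surface realization (assumption)
    return "".join(surface)
```

    With these definitions realize("arbeit+st") yields arbeitest, while realize("sag+st") yields sagst because the left context is not met. A real two-level rule is declarative and also constrains analysis, which this one-directional sketch does not attempt.</Paragraph>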
  </Section>
  <Section position="5" start_page="369" end_page="369" type="metho">
    <SectionTitle>
4 Derivation in German
</SectionTitle>
    <Paragraph position="0"> Usually, derived words in German are morphologically regular (footnote 4). Morphophonological alterations are the same as for inflection; only the occurrence of umlaut is less regular. Syntax and semantics, on the other hand, are very often irregular with respect to compositional rules for derivation.</Paragraph>
    <Paragraph position="1"> As an example we will look at the German derivational prefix be-. This prefix is both very productive and considered to be rather regular. The prefix be- produces transitive verbs, mostly from (intransitive) verbs but also from other word categories. We will restrict ourselves here to those cases where the new verb is formed from a verb. In the new verb the direct object role is filled by a modifier role of the original verb, while the original meaning is basically preserved. One regularly formed example is bearbeiten, derived from the intransitive verb arbeiten (to work).  (2) [Maria]SUBJ arbeitet [an dem Papier]POBJ.</Paragraph>
    <Paragraph position="2"> Mary works on the paper.</Paragraph>
    <Paragraph position="3"> (3) [Maria]SUBJ bearbeitet [das Papier]OBJ.</Paragraph>
    <Paragraph position="4"> Footnote 4: Most exceptions are regularly inflecting compound verbs derived from an irregular verb, e.g., handhaben (to manipulate), a regular verb derived from the irregular verb haben (to have).</Paragraph>
    <Paragraph position="5"> Skimming through [Wahrig, 1978] we find 238 entries starting with the prefix be-. Of these, 91 can be excluded because they cannot be explained as being derived from verbs. Of the remaining 147 words, about 60 have no meaning that can be interpreted compositionally. The remaining ones do have at least one compositional meaning.</Paragraph>
    <Paragraph position="6"> Even with those the situation is difficult. In some cases the derived word takes just one of the meanings of the original word as its semantic basis, e.g., befolgen (to obey) is derived from folgen in the meaning to obey, but not to follow or to ensue:  (4) Der Soldat folgt [dem Befehl]IOBJ.</Paragraph>
    <Paragraph position="7"> The soldier obeys the order.</Paragraph>
    <Paragraph position="8"> (5) Der Soldat befolgt [den Befehl]OBJ.</Paragraph>
    <Paragraph position="9"> (6) Der Soldat folgt [dem Offizier]IOBJ.</Paragraph>
    <Paragraph position="10">  The soldier follows the officer.</Paragraph>
    <Paragraph position="11"> (7) *Der Soldat befolgt [den Offizier]OBJ.</Paragraph>
    <Paragraph position="12"> In other cases we have a compositional as well as a non-compositional reading, e.g., besetzen derived from setzen (to set) may either mean to set or to occupy.</Paragraph>
    <Paragraph position="13"> What is needed is a flexible system where regularities can be expressed to reduce redundancy while irregularities can still easily be handled.</Paragraph>
  </Section>
  <Section position="6" start_page="369" end_page="371" type="metho">
    <SectionTitle>
5 The Morphological Component
</SectionTitle>
    <Paragraph position="0"/>
    <Paragraph position="2"> The basis of our system is X2MORF, a morphological component based on two-level morphology. X2MORF extends the standard model in two ways which are crucial for our task. A feature-based word grammar replaces the continuation class approach, thus providing a natural interface to the syntax/semantics component. Two-level rules are provided with a morphological filter restricting their application to certain morphological classes.</Paragraph>
    <Section position="1" start_page="369" end_page="370" type="sub_section">
      <SectionTitle>
5.1 Feature-Based Grammar and Lexicon
</SectionTitle>
      <Paragraph position="0"> In X2MORF morphotactics are described by a feature-based grammar. As a result, the representation of a word form is a feature description. The word grammar employs a functor argument structure with binary branching.</Paragraph>
      <Paragraph position="1"> Let us look at a specific example. The (simplified) entry for the noun stem Hand (hand) is given in fig.1. To form a legal word that stem must combine with an inflectional ending. Fig.2 shows the (simplified) entry for the plural ending. Note that plural for- Combining the above two lexical entries in the appropriate way leads to the feature structure described in fig.3.</Paragraph>
    </Section>
    <Section position="2" start_page="370" end_page="371" type="sub_section">
      <SectionTitle>
5.2 Extending Two-level Rules with
Morphological Contexts
</SectionTitle>
      <Paragraph position="0"> X2MORF employs an extended version of two-level rules. Besides the standard phonological context they also have a morphological context in the form of a feature structure. This morphological context is unified with the feature structure of the morph to which the character pair belongs. It serves two purposes. One is to restrict the application of morphophonological rules to suitable morphological contexts. The other is to enable the transmission of information from the phonological to the morphological level.</Paragraph>
      <Paragraph position="1"> We can now show how umlaut is treated in X2MORF. A two-level rule constrains the mapping of A to ä to the appropriate contexts, namely where the inflection suffix requires umlaut: (8) A:ä ⇔ _ ; [MORPH: [HEAD: [UMLAUT: +]]]  The occurrence of the umlaut ä in the surface form is now coupled to the feature UMLAUT taking the value +. As we can see in fig.3, the plural ending has already forced the feature to take that value, which means that the morphological context of the rule is valid.</Paragraph>
      <Paragraph position="2"> Reinhard [Reinhard, 1991] argues that a purely feature-based approach is not well suited for the treatment of umlaut in derivation because of its idiosyncrasy. One example is the different derivations from Hand (hand), which takes umlaut for the plural (Hände) and some derivations (händisch) but not for others (handlich). There are also words like Tag (day) where the plural takes no umlaut (Tage) but derivations do (täglich). Reinhard maintains that a default mechanism like DATR is more appropriate to deal with umlaut.</Paragraph>
      <Paragraph position="3"> We disagree, since the facts can be described in X2MORF in a fairly natural manner. Once the equivalence classes with respect to umlaut are known, we can describe the data using a complex feature UMLAUT (footnote 6) instead of the simple binary one. This complex feature UMLAUT consists of one feature for each class, which takes + or - as its value, and one feature, VALUE, for recording the actual occurrence of umlaut. The value of the feature UMLAUT|VALUE is set by the morphological filter of the two-level rule triggering umlaut, i.e., if an umlaut is found it is set to +, otherwise to -. The entries of those affixes requiring umlaut set the value of their equivalence class to +. Therefore the relevant parts of the entries for -lich and -isch look like [UMLAUT: [LICH-UML: +]] and [UMLAUT: [ISCH-UML: +]] because both these endings normally require umlaut.</Paragraph>
      <Paragraph position="4"> As we have seen above, the noun Hand comes with umlaut in the plural (Hände) and in the derived adjective händisch (manual), but (irregularly) without umlaut in the adjective handlich (handy). In fig.4 we show the relevant part of the entry for Hand that produces the correct results. The regular cases are taken care of by the first disjunct while the exceptions are captured by the second.</Paragraph>
      <Paragraph position="5"> Footnote 6: In our simplified example we assume just 3 classes (for plural, derivation with -lich and -isch). In reality the number of classes is larger but still fairly small.</Paragraph>
      <Paragraph position="6"> The first disjunct in this feature structure takes care of all cases but the derivation with -lich. The entries for plural (see fig.5) and -isch come with the value +, forcing the VALUE feature also to have a + value. The entry for -lich also comes with a + value and therefore fails to unify with the first disjunct. Suffixes that do not trigger umlaut come with the VALUE feature set to -.</Paragraph>
      <Paragraph position="7"> The second disjunct captures the exception for the -lich derivation of Hand. Because it requires a - value, it fails to unify with the entries for plural and -isch. The + value for -lich succeeds, forcing at the same time the VALUE feature to be -.</Paragraph>
      <Paragraph position="8">  This mechanism allows us to describe the umlaut phenomenon in a very general way while at the same time being able to deal with exceptions to the rule in a simple and straightforward manner.</Paragraph>
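      <Paragraph> The interplay of the two disjuncts can be approximated with flat feature structures. The sketch below is a reconstruction under stated assumptions, since the figures with the actual entries are not reproduced here: the feature name PL-UML, the exact content of both disjuncts, and the helper names unify, regular and umlaut_value are hypothetical. The sharing of values inside the first disjunct is emulated by listing both instantiations of the shared value.

```python
from typing import Optional


def unify(a: dict, b: dict) -> Optional[dict]:
    """Unify two flat feature structures; None signals a clash."""
    out = dict(a)
    for feat, val in b.items():
        if out.setdefault(feat, val) != val:
            return None
    return out


def regular(x: str) -> dict:
    # First disjunct (assumed): plural, -isch and VALUE share one value,
    # and the -lich class is excluded.
    return {"PL-UML": x, "ISCH-UML": x, "LICH-UML": "-", "VALUE": x}


# Entry for Hand: the regular disjunct in both instantiations, followed
# by the second disjunct, which captures the exception handlich.
HAND = [
    regular("+"),
    regular("-"),
    {"PL-UML": "-", "ISCH-UML": "-", "LICH-UML": "+", "VALUE": "-"},
]

PLURAL = {"PL-UML": "+"}
ISCH = {"ISCH-UML": "+"}
LICH = {"LICH-UML": "+"}
NO_UMLAUT = {"VALUE": "-"}   # a suffix that does not trigger umlaut


def umlaut_value(affix: dict) -> list:
    """Collect the VALUE settings of all disjuncts that unify with affix."""
    results = [unify(d, affix) for d in HAND]
    return sorted({r["VALUE"] for r in results if r is not None})
```

      Under these assumptions umlaut_value(PLURAL) and umlaut_value(ISCH) give ["+"], while umlaut_value(LICH) gives ["-"], matching Hände, händisch and handlich. NO_UMLAUT unifies with more than one alternative in this flat approximation, but every surviving alternative sets VALUE to -.</Paragraph>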
    </Section>
    <Section position="3" start_page="371" end_page="371" type="sub_section">
      <SectionTitle>
5.3 Using X2MORF directly for derivation
</SectionTitle>
      <Paragraph position="0"> Regarding morphotactics and morphophonology there is basically no difference between inflection and derivation. So one could use X2MORF as it is to cope with derivation. Derivation particles are word-forming heads [di Sciullo and Williams, 1989] that have to be complemented with the appropriate (simple or complex) stems. Words that can no longer be interpreted compositionally have to be regarded as monomorphemic and must be stored in the morph lexicon.</Paragraph>
      <Paragraph position="1"> Such an approach is possible but it poses some problems: * The morphological structure of words is no longer available to succeeding processing stages. For some phenomena, though, just this structural information is necessary. Take as an example the partial deletion of words in phrases with conjunction (Ein- und Verkauf).</Paragraph>
      <Paragraph position="2"> * The compositional reading of a derived word cannot be suppressed (footnote 7); even worse, it is indistinguishable from the correct reading (remember the befolgen example).</Paragraph>
      <Paragraph position="3"> * Partial regularities cannot be used anymore to reduce redundancy.</Paragraph>
      <Paragraph position="4"> Therefore we have chosen instead to augment X2MORF with a lexeme lexicon and an explicit interface between morphological and syntactic word.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="371" end_page="373" type="metho">
    <SectionTitle>
6 System Architecture
</SectionTitle>
    <Paragraph position="0"> Logically, the system uses two different lexica.</Paragraph>
    <Paragraph position="1"> A morph lexicon contains all the morphs, i.e., monomorphemic stems, inflectional and derivational affixes. This lexicon is used by X2MORF. A lexeme lexicon contains the lexemes, i.e., stem morphs and derivational endings (because of their word-forming capacity). The lexical entries contain the lexeme-specific syntactic and semantic information under the feature SYNSEM.</Paragraph>
    <Paragraph position="2"> These two lexica can be merged into a single type hierarchy (see fig.6) where the morph lexicon entries are of type morph and lexeme lexicon entries of type lexeme. Single-stems and deriv-morphs share the properties of both lexica.</Paragraph>
    <Paragraph position="3"> Footnote 7: One could argue that the idea of preemption is incorrect anyway and that only syntactic or semantic restrictions block derivation. While this may be true in theory, at least for practical considerations we will need to be able to block derivation in the lexicon.</Paragraph>
    <Paragraph position="4">  Since we have organized our lexica in a type hierarchy we have already succeeded in establishing an inheritance hierarchy. We can now impose any of the structures proposed in the literature (e.g., [Krieger and Nerbonne, 1991; Russell et al., 1992]) for hierarchical lexica on it, as long as they observe the same functor-argument structure of words crucial to our morphotactics.</Paragraph>
    <Paragraph position="5"> Why are we now in a better situation than when using X2MORF directly? Because complex stems are not morphs and are therefore inaccessible to X2MORF. They are only used in a second processing stage, where complex words can be given a non-compositional reading. To make this possible, the assignment of compositional readings must also be postponed to this second stage. This is achieved by giving derivation morphs in the lexicon no feature SYNSEM but stating the information under FUNCTOR|SYNSEM instead.</Paragraph>
    <Paragraph position="6"> In the first stage X2MORF processes the morphotactic information, including the word-form-specific morphosyntactic information, making use of the morph lexicon. The result is a feature description containing the morphotactic structure and the morphosyntactic information of the processed word form. What has also been constructed is a value for the STEM feature, which is used as an index to the lexeme lexicon in the second processing stage (footnote 8). In the second stage we have to discriminate between the following cases: * The stem is found in the lexeme lexicon. In the case of a monomorphemic stem, processing is completed because the relevant syntactic/semantic information has already been constructed during the first stage. In the case of a polymorphemic stem, the retrieved lexical entry is unified with the result of the first stage, delivering the lexicalized interpretation.</Paragraph>
    <Paragraph position="7"> * The stem is not found in the lexeme lexicon. In that case a compositional interpretation is required. This is achieved by unifying the result of stage one with the feature structure shown in fig.7. This activates the SYNSEM information of the functor, which must be either an inflection or a derivation morph. In the case of an inflection morph nothing really happens. But for derivation morphs the syntactic/semantic information which has already been constructed is bound to the feature SYNSEM. Then the process must be applied recursively to the argument of the structure. Since all monomorphemic stems and all derivational affixes are stored in the lexeme lexicon, this search is bound to terminate.</Paragraph>
    <Paragraph position="8"> Footnote 8: Inflectional endings do not contribute to the stem. Also, allomorphs like irregular verb forms share a common stem.</Paragraph>
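    <Paragraph> The control flow of the second stage can be sketched as a small recursive lookup. This is a strong simplification under stated assumptions: readings are plain strings rather than feature structures, unification with the stage-one result is reduced to returning the stored readings, and all names (Analysis, LEXEME_LEXICON, DERIVATION, interpret) as well as the sample entries are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Analysis:
    """Stage-one result: a stem index plus functor-argument structure."""
    stem: str
    functor: Optional[str] = None          # affix morph, None for a bare stem
    argument: Optional["Analysis"] = None  # structure the functor applies to


# Hypothetical lexeme lexicon: stem -> readings.
LEXEME_LEXICON = {
    "arbeit": ["work"],
    "folg": ["obey", "follow"],
    "befolg": ["obey"],   # explicit entry, blocks the reading 'follow'
}

# Hypothetical compositional contribution of derivation morphs.
DERIVATION = {"be-": lambda reading: "be(" + reading + ")"}


def interpret(analysis: Analysis) -> list:
    """Return the readings of a word according to the two cases above."""
    if analysis.stem in LEXEME_LEXICON:
        # Case 1: stem found, use the stored (possibly lexicalized) entry.
        return list(LEXEME_LEXICON[analysis.stem])
    if analysis.functor is None or analysis.argument is None:
        raise KeyError(analysis.stem)   # unknown monomorphemic stem
    # Case 2: stem not found, interpret the argument recursively.
    inner = interpret(analysis.argument)
    rule = DERIVATION.get(analysis.functor)
    if rule is None:
        return inner                    # inflection morph: nothing happens
    return [rule(reading) for reading in inner]
```

    For instance, an analysis of bearbeit- with functor be- and argument arbeit has no entry of its own and therefore receives the compositional reading, whereas befolg- is found directly and only its stored reading is returned.</Paragraph>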
    <Paragraph position="9">  How does this procedure account for the flexibility demanded in section 4? By keeping the compositional syntactic/semantic interpretation local to the functor during morphological interpretation, the decision is postponed to the second stage. In case no explicit entry is found, this compositional interpretation is simply made available.</Paragraph>
    <Paragraph position="10"> In case of an explicit entry in the lexeme lexicon there are a number of different possibilities, among them: * There are just lexicalized interpretations.</Paragraph>
    <Paragraph position="11"> * There is a compositional as well as a lexicalized interpretation.</Paragraph>
    <Paragraph position="12"> * The compositional interpretation is restricted to a subset of the possible semantics of the root.</Paragraph>
    <Paragraph position="13"> The entries in the lexeme lexicon can easily be tailor-made to fit any of these possibilities.</Paragraph>
  </Section>
  <Section position="8" start_page="373" end_page="373" type="metho">
    <SectionTitle>
7 A Detailed Example
</SectionTitle>
    <Paragraph position="0"> We will now illustrate the workings of the system using a few examples from section 4. The first example describes the purely compositional case. The verb betreten (to enter) can be regularly derived from treten (to enter) and the prefix be-. The sentences (9) Die Frau tritt [in das Zimmer]POBJ.</Paragraph>
    <Paragraph position="1"> The woman enters the room.</Paragraph>
    <Paragraph position="2"> (10) Die Frau betritt [das Zimmer]OBJ.</Paragraph>
    <Paragraph position="3"> are semantically equivalent. The prepositional object of the intransitive verb treten is transformed into a direct object, making betreten a transitive verb. A number of verbs derived by using the particle be- follow this general pattern. Figure 8 shows a simplified version of the lexical entry for be-.</Paragraph>
    <Paragraph position="4"> The SYNSEM feature of the functor contains the modified syntactic/semantic description. Note that the lexical entry itself contains no SYNSEM feature.</Paragraph>
    <Paragraph position="5"> When analyzing a surface form of the word betreten this functor is combined with the feature structure for treten (shown in fig.9) as argument.</Paragraph>
    <Paragraph position="6"> At that stage the FUNCTOR|SYNSEM feature of be- is unified with the SYNSEM feature of treten. But there is still no value set for the SYNSEM feature.</Paragraph>
    <Paragraph position="7"> This is intended because it allows us to disregard the composition in favour of a direct interpretation of the derived word. In our example, though, we will find no entry for the stem betreten. We therefore have to take the default approach, which means unifying the result with the structure shown in fig.7.</Paragraph>
    <Paragraph position="8"> Up to now our example was overly simplified because it did not take into account that treten has a second reading, namely to kick. The final lexical entry for treten is shown in fig.10.</Paragraph>
    <Paragraph position="9"> But this second reading of treten cannot be used for deriving a second meaning of betreten: (11) Die Frau tritt [den Hund]OBJ.</Paragraph>
    <Paragraph position="10"> The woman kicks the dog.</Paragraph>
    <Paragraph position="11"> (12) *Die Frau betritt [den Hund]OBJ.</Paragraph>
    <Paragraph position="12"> We therefore need to block the second compositional interpretation. This is achieved by an explicit entry for betreten in the lexeme lexicon, which is shown in fig.11.</Paragraph>
    <Paragraph position="13"> We now get the desired results. While both readings of treten produce a syntactic/semantic interpretation in the first stage, the incorrect one is filtered out by applying the lexeme lexicon entry for betreten in the second stage.</Paragraph>
  </Section>
</Paper>