<?xml version="1.0" standalone="yes"?>
<Paper uid="J92-2005">
  <Title>Incremental Processing and the Hierarchical Lexicon</Title>
  <Section position="4" start_page="228" end_page="233" type="intro">
    <SectionTitle>
5. Lexical Preferences and the Hierarchical Lexicon
</SectionTitle>
    <Paragraph position="0"> Besides windowing, lexical preferences are an equally important source of information that may be exploited to make the interpretation process more efficient in cases of ambiguity. To indicate their importance, the present section opens with a short discussion of preferences as they have been proposed in the literature.</Paragraph>
    <Paragraph position="1"> Next, lexical preferences are modeled. They follow from the structure of the lexicon, which was independently motivated to capture linguistic generalizations. Inference rules model the behavior of the parser in this respect. Heuristic information is thus integrated in a principled and formal way into the interpretation process. The behavior of idiomatic expressions will be discussed as an example. (Computational Linguistics Volume 18, Number 2)</Paragraph>
    <Section position="1" start_page="229" end_page="230" type="sub_section">
      <SectionTitle>
5.1 Preference Strategies
</SectionTitle>
      <Paragraph position="0"> Several preference strategies have been proposed for guiding parsers. Among these are structural, syntactic preferences like Right Association (Kimball 1973), which entails that a modifier should preferably be attached to the rightmost verb (phrase) or noun (phrase) it can modify; and Minimal Attachment (Frazier and Fodor 1988), which states that the analysis that assumes the minimal number of nodes in the syntactic tree should be preferred. 12 Semantic preferences are illustrated in Examples 9 and 10. The modifiers in both cases are preferably attached contrary to expectations on the basis of syntactic preferences (see Schubert 1984, 1986; Wilks, Huang, and Fass 1985).</Paragraph>
      <Paragraph position="1"> Example 9 John met the girl that he married at the dance.</Paragraph>
      <Paragraph position="2"> Example 10 John saw the bird with the red beak.</Paragraph>
      <Paragraph position="3"> Evidence for the existence of preferences based upon contextual information has been provided by Marslen-Wilson and Tyler (1980), who have shown in a number of psycholinguistic experiments that contextual information influences word recognition (see also Crain and Steedman [1982] and Taraban and McClelland [1988]). Lexical preferencing (Ford, Bresnan, and Kaplan 1982) refers to the preference functor categories have for certain arguments. For instance, the verb to go can either occur as an intransitive verb that can be modified by a pp with the prosodic form to + X, or it can take this pp as an argument. The second frame is the preferred frame. The prepositional phrase should preferably be considered as an argument to the verb and not as a vp-modifier.</Paragraph>
      <Paragraph position="4"> Although the existence of all of these preferences should thus be acknowledged, there are two arguments in favor of lexical preferences. Firstly, from empirical, corpus-based studies it may be concluded that lexical preferences are successful heuristics for resolving ambiguity (Whittemore, Ferrara, and Brunner 1990; Hobbs and Bear 1990). Secondly, although ambiguities may be resolved at any level of processing, lexical processing takes place on a lower level, since higher levels depend upon lexical information. Resolution of ambiguity on a low level ensures that higher levels of processing are not bothered with ambiguities occurring on lower levels. Therefore, if it is equally possible to model the behavior of the parser as a lexically guided or as, for instance, a contextually guided process, the former should be preferred. For instance, in the case of an idiomatic expression, it is more efficient to decide that the idiom should be interpreted on the basis of the mere fact that it is an idiom than on the basis of consultation of, for instance, some model of the context. Since lexical preferences are successful heuristics that operate on a low level, there is sufficient reason to model them in a principled and formal way.</Paragraph>
      <Paragraph position="5">  Erik-Jan van der Linden Incremental Processing and the Hierarchical Lexicon</Paragraph>
    </Section>
    <Section position="2" start_page="230" end_page="230" type="sub_section">
      <SectionTitle>
5.2 Formalization of Lexical Preferences
</SectionTitle>
      <Paragraph position="0"> The formalization of lexical preferences proposed here is another application of the principle of priority to the instance (Hudson 1984): the parser prefers information lower in the hierarchical structure of the lexicon over information on higher levels in the hierarchy. If two subcategorization frames of, for instance, go maintain an inheritance relation [np\s ≻ (np\s)/(pp, to + _)], and both apply, the more specific frame is preferred. The difference between windowing and lexical preferencing is that windowing applies to the choice during incremental processing among a number of frames of which only one applies eventually, whereas lexical preferencing applies to a choice among frames all of which apply. Lexical preferences do not follow as some statistically motivated preference, but as a linguistically motivated one: they follow from applying the principle of priority to the instance to the structured lexicon.</Paragraph>
      <Paragraph position="1"> As was the case with windowing, lexical preferencing can be modeled by means of the inference rules that operate upon inheritance connectives. The implementation of this preference is quite simple. The rules for elimination of the ≻-operator are ordered in such a way that the inference engine first uses the category as a functor, and next as the argument of a modifier (see Definition 8; A &lt;&lt; B denotes that A should be applied before B).</Paragraph>
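      <Paragraph> The preference for the more specific frame can be illustrated with a minimal sketch. It is not the inference engine described here: the class Frame, the function preferred, and the use of depth in the hierarchy as a measure of specificity are assumptions made for illustration, and the two frames for go are the ones discussed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """A subcategorization frame; `mother` is the frame it inherits from."""
    category: str
    mother: Optional["Frame"] = None

    def depth(self) -> int:
        # Number of inheritance steps up to the root of the hierarchy.
        return 0 if self.mother is None else 1 + self.mother.depth()


def preferred(applicable: list[Frame]) -> Frame:
    """Priority to the instance: pick the most specific applicable frame."""
    return max(applicable, key=Frame.depth)


# Hypothetical two-level hierarchy for "go".
go_intransitive = Frame("np\\s")
go_with_pp = Frame("(np\\s)/(pp, to + _)", mother=go_intransitive)

choice = preferred([go_intransitive, go_with_pp])
print(choice.category)
```

      When both frames apply, the daughter frame that takes the pp as an argument is selected; when only the intransitive frame applies, it is returned unchanged. No frequency information is consulted at any point.</Paragraph>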
      <Paragraph position="2"> Note that the boolean operator ∧ does not enable the implementation of this kind of preference. It is, of course, possible to order the categories ((np\s)/(pp, to + _) ∧ (np\s)) and to order the rules that eliminate boolean connectives (first category first). However, the order of these categories must be stipulated, whereas in the case of the hierarchical lexicon structure presented here, the relation between the categories is linguistically motivated. Frequency of occurrence, that is, giving forms with higher frequency precedence over those with lower frequency, is not an alternative either: more specific forms do not necessarily appear more frequently than the forms they inherit from.</Paragraph>
      <Paragraph position="3"> Examples. Schubert (1984; 1986) presents a number of sentences that, he claims, show an attachment preference that cannot be explained on the basis of structural, syntactic preferences. The preference to attach, for example, (pp, from + _) to disappearance can, however, be modeled as a lexical preference if disappearance (as well as disappear) (optionally) subcategorizes for this prepositional phrase. The form with the pp then prevails over the form without the pp. The same argument applies to Examples 12-15 (daughter categories are fully specified).</Paragraph>
    </Section>
    <Section position="3" start_page="230" end_page="233" type="sub_section">
      <SectionTitle>
5.3 Idioms and Parsing Preferences
</SectionTitle>
      <Paragraph position="0"> 5.3.1 Conventionality and Idiom Processing. Idiomatic expressions can in most cases be interpreted nonidiomatically as well. 13 It has, however, frequently been observed that an idiomatic phrase should very rarely be interpreted nonidiomatically (Koller 1977, p. 13; Chafe 1968, p. 123; Gross 1984, p. 278; Swinney 1981, p. 208). Also, psycholinguistic research indicates that in case of ambiguity there is a clear preference for the idiomatic reading (Gibbs 1980; Schraw et al. 1988; Schweigert 1986; Schweigert and Moates 1988). The phenomenon that phrases should be interpreted according to their idiomatic, noncompositional, lexical, conventional meaning will be referred to as the 'conventionality' principle (Gibbs 1980). The application of this principle is not limited to idioms. For instance, compounds are not interpreted compositionally, but according to the lexical, conventional meaning (Swinney 1981). Words are formed by regular rules, but their meaning will undergo 'semantic drift,' obscuring the compositional nature of the complex word.</Paragraph>
      <Paragraph position="1"> If this principle could be modeled in an appropriate way, this would be of considerable help in dealing with idioms. As soon as the idiom has been identified, the ambiguity can be resolved and 'higher' knowledge sources do not have to be used to solve the ambiguity. In Stock's (1989) approach to ambiguity resolution the idiomatic and the nonidiomatic analyses are processed in parallel. An external scheduling function gives priority to one of these analyses. Higher knowledge sources are thus necessary to decide upon the interpretation. In PHRAN (Wilensky and Arens 1980), specificity plays a role, but only in suggesting patterns that match the input: evaluation takes place on the basis of length and order of the patterns. Zernik and Dyer (1987) present lexical representations for idioms, but do not discuss ambiguity. Van der Linden and Kraaij (1990) discuss two alternative formalizations for conventionality. One extends the notion continuation class from two-level morphology. The other is a simple localist connectionist model. Here, another model based upon the specificity of information in the hierarchical structure of the lexicon will be presented.</Paragraph>
      <Paragraph position="2"> encountering a situation where the ≻-operator should be removed, the specific information, the daughter, takes precedence over the more general information, the mother.
        if ({{syn_mother, sem_mother} ≻ {syn_daughter, sem_daughter}, sem}, mother) ↦ type and type, V ⇒* Z
        {{syn_mother, sem_mother} ≻ {syn_daughter, sem_daughter}, sem}, V ⇒* Z [≻ E-daughter]
        if ({{syn_mother, sem_mother} ≻ {syn_daughter, sem_daughter}, sem}, daughter) ↦ aux and ({{aux}, sem}, daughter) ↦ type and type, V ⇒* Z
      As was stated in Section 5.2, the boolean operator ∧ does not enable the implementation of this kind of preference. Neither is it possible to model this kind of preference with the use of frequency of occurrence of these forms. On the contrary, since verbs occur within all idioms they are part of, and also occur independently of idioms, their frequency will always be higher than that of the idiomatic expression. Therefore, verbs would always be preferred over idioms, exactly the reverse of what is desired. Even if the occurrences of the verb within the idiom are not counted as occurrences of the verb proper, it is unlikely that the frequency criterion will prefer the idiom over the verb in all cases.</Paragraph>
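      <Paragraph> The frequency argument can be checked on a small example. The sketch below counts a verb form and an idiom containing it in a toy corpus; the corpus and the helper count_occurrences are invented for illustration and do not come from the studies cited.

```python
def count_occurrences(tokens: list[str], pattern: list[str]) -> int:
    """Count the positions at which `pattern` occurs contiguously in `tokens`."""
    n = len(pattern)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == pattern)


# Invented toy corpus, for illustration only.
corpus = (
    "john kicked the bucket . mary kicked the ball . "
    "the farmer kicked the bucket ."
).split()

verb_count = count_occurrences(corpus, ["kicked"])
idiom_count = count_occurrences(corpus, ["kicked", "the", "bucket"])
print(verb_count, idiom_count)  # prints: 3 2
```

      Every occurrence of the idiom contains an occurrence of the verb, so the verb count can never fall below the idiom count, whatever the corpus. A preference based on raw frequency would therefore never select the idiom over the verb.</Paragraph>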
      <Paragraph position="3"> An example of the parser's operation will now be presented to illustrate how windowing, incremental processing, and lexical preferences interact in the case of an idiomatic expression. The sign that represents the idiom is abbreviated as k_t_b (compare Example 2).</Paragraph>
      <Paragraph position="4"> (1) After the lexicalization of 'John' and 'kicked', it becomes possible to form a flexible constituent on the basis of these two words. The result of this step is that, semantically, 'John' is considered the subject of any of the verbs in the kick hierarchy.</Paragraph>
      <Paragraph position="5"> (2) Upon encountering 'bucket', first 'the' and 'bucket' are reduced to an np with a prosodic representation the+bucket. Now it becomes possible to descend in the kick hierarchy.</Paragraph>
      <Paragraph position="6"> (3) First the choice between the transitive and the intransitive form is made.</Paragraph>
      <Paragraph position="7"> (4) Next the choice between the nonidiomatic and the idiomatic form is made.</Paragraph>
      <Paragraph position="8"> The derivation results in the assignment of the meaning die(john) to this sentence. 14 In case a verb occurs in more than one idiomatic expression, for instance kick the bucket and kick one's heels, only the idiomatic expression that is possible on the basis of the input is used.</Paragraph>
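      <Paragraph> The descent through the kick hierarchy can be sketched as follows. This is a simplified illustration, not the categorial derivation itself: the class Entry, the functions descend and interpret, the shape of the hierarchy, and the meaning label idle(...) for kick one's heels are all assumptions; only the result die(john) is taken from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Entry:
    """A lexical entry with a test on the object np and its more specific daughters."""
    name: str
    meaning: Callable[[str, Optional[str]], str]
    accepts: Callable[[Optional[str]], bool]
    daughters: list["Entry"] = field(default_factory=list)


def descend(entry: Entry, obj: Optional[str]) -> Entry:
    """Move to a daughter whenever one is compatible with the object seen."""
    for daughter in entry.daughters:
        if daughter.accepts(obj):
            return descend(daughter, obj)
    return entry


# Hypothetical hierarchy: intransitive kick, then transitive kick, then the idioms.
k_t_b = Entry("kick the bucket",
              lambda subj, obj: f"die({subj})",
              lambda obj: obj == "the+bucket")
k_o_h = Entry("kick one's heels",
              lambda subj, obj: f"idle({subj})",
              lambda obj: obj == "his+heels")
transitive = Entry("kick (transitive)",
                   lambda subj, obj: f"kick({subj}, {obj})",
                   lambda obj: obj is not None,
                   [k_t_b, k_o_h])
intransitive = Entry("kick (intransitive)",
                     lambda subj, obj: f"kick({subj})",
                     lambda obj: True,
                     [transitive])


def interpret(subject: str, obj: Optional[str]) -> str:
    """Pick the most specific entry compatible with the input and build its meaning."""
    return descend(intransitive, obj).meaning(subject, obj)


print(interpret("john", "the+bucket"))
print(interpret("john", "the+ball"))
```

      With the object the+bucket the descent ends at the idiom and yields die(john); with any other object it stops at the transitive form, and an idiom whose prosodic form does not match the input is never entered.</Paragraph>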
      <Paragraph position="10"/>
    </Section>
    <Section position="4" start_page="233" end_page="233" type="sub_section">
      <SectionTitle>
5.4 Determinism
</SectionTitle>
      <Paragraph position="0"> Windowing and Lexical Preferencing are nondeterministic processes. Although the parser commits itself only to information it is certain of and leaves other choices implicit in the structure of the lexicon until it is able to choose (windowing), it can mistake a vp-modifier for an argument. Lexical Preferencing is also a nondeterministic process in that backtracking is necessary when interpretations do not fit in the context.</Paragraph>
      <Paragraph position="1"> Although it is a linguistically motivated strategy, it does not guarantee that the correct choice is made in all cases. In Example 17 the idiomatic reading is preferred, but later on in the input it turns out that this is not the correct interpretation. Yet, Marcus' Determinism Hypothesis states that &quot;(...) all sentences which people can parse without conscious difficulty can be parsed strictly deterministically&quot; (Marcus 1980, p. 6). It remains to be seen whether people do not garden-path in Example 17. Note also that backtracking is modeled very easily: it amounts to making a different choice between two items that maintain an inheritance relation.</Paragraph>
      <Paragraph position="2"> Example 17 John kicked the bucket and Mary the small pail.</Paragraph>
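      <Paragraph> Backtracking as a different choice between two items in an inheritance relation can be sketched in a few lines. The function interpret, the list of candidate readings, and the check fits_gapping are hypothetical stand-ins: in particular, fits_gapping only imitates the observation that the continuation of Example 17 is incompatible with the idiomatic reading.

```python
from typing import Callable, Optional


def interpret(readings: list[str], fits_context: Callable[[str], bool]) -> Optional[str]:
    """Try readings from most specific to most general; backtrack on failure."""
    for reading in readings:
        if fits_context(reading):
            return reading
    return None


# Hypothetical readings for "John kicked the bucket", daughter (idiom) first.
candidates = ["die(john)", "kick(john, the+bucket)"]


def fits_gapping(reading: str) -> bool:
    # Stand-in for the context check: "and Mary the small pail" needs a
    # transitive verb to share, which the idiomatic reading cannot supply.
    return reading.startswith("kick(")


print(interpret(candidates, fits_gapping))
```

      The idiomatic reading is tried first, in line with the lexical preference, and is abandoned for the mother reading only once the context rejects it.</Paragraph>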
    </Section>
  </Section>
</Paper>