<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-2125">
  <Title>ALGORITHM FOR AUTOMATIC INTERPRETATION OF NOUN SEQUENCES</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
ALGORITHM FOR AUTOMATIC INTERPRETATION OF NOUN SEQUENCES
Lucy Vanderwende
Microsoft Research
lucyv@microsoft.com
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> This paper describes an algorithm for automatically interpreting noun sequences in unrestricted text. This system uses broad-coverage semantic information which has been acquired automatically by analyzing the definitions in an on-line dictionary. Previously, computational studies of noun sequences made use of hand-coded semantic information, and they applied the analysis rules sequentially. In contrast, the task of analyzing noun sequences in unrestricted text strongly favors an algorithm according to which the rules are applied in parallel and the best interpretation is determined by weights associated with rule applications.</Paragraph>
  </Section>
  <Section position="2" start_page="0" end_page="783" type="metho">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> The interpretation of noun sequences (henceforth NSs, and also known as noun compounds or complex nominals) has long been a topic of research in natural language processing (NLP) (Finin, 1980; Sparck Jones, 1983; Leonard, 1984; Isabelle, 1984; Lehnert, 1988; and Riloff, 1989). The challenge in analyzing NSs derives from the semantic nature of the problem: their interpretation is, at best, only partially recoverable from a syntactic or a morphological analysis of NSs. To arrive at an interpretation of plum sauce which specifies that plum is the Ingredient of sauce, or of knowledge representation, specifying that knowledge is the Object of representation, requires semantic information for both the first noun (the modifier) and the second noun (the head).</Paragraph>
    <Paragraph position="1"> In this paper, we are concerned with interpreting NSs which are composed of two nouns, in absence of the context in which the NS appears; this scope is similar to most of the studies mentioned above. The algorithm for interpreting a sequence of two nouns is intended to be basic to the algorithm for interpreting sequences of more than two nouns: each pair of NSs will be interpreted in turn, and the best interpretation forms a constituent which can modify, or be modified by, another noun or NS (see also Finin, 1980). There is no doubt that context, both intra- and inter-sentential, plays a role in determining the correct interpretation of a NS, since the most plausible interpretation in isolation might not be the most plausible in context. It is, however, a premise of the present system that, whatever the context is, the interpretation of a NS is always available in the list of possible interpretations. A NS that is already listed in an on-line dictionary needs no interpretation because the meaning can be derived from its definition.</Paragraph>
    <Paragraph position="2"> In the studies of NSs mentioned above, the systems for interpreting NSs have relied on hand-coded semantic information, which is limited to a specific domain by the sheer effort involved in creating such a semantic knowledge base. The level of detail made possible by hand-coding has led to the development of two main algorithms for the automatic interpretation of NSs: concept dependent and sequential rule application.</Paragraph>
    <Paragraph position="3"> The concept dependent algorithm (Finin, 1980) requires each lexical item to contain an index to the rule(s) which should be applied when that item is part of a NS; it has the advantage that only those rules are applied for which the conditions are met and each noun potentially suggests a unique interpretation. Whenever the result of the analysis is a set of possible interpretations, the most plausible one is determined on the basis of the weight which is associated with a role fitting procedure. The disadvantage of this approach is that this level of lexical information cannot be acquired automatically, and so this approach cannot be used to process unrestricted text.</Paragraph>
    <Paragraph position="4"> The algorithm for sequential rule application (Leonard, 1984) focuses on the process of determining which interpretation is the most plausible; the fixed set of rules is applied in a fixed order and the first rule for which the conditions are met results in the most plausible interpretation. This algorithm has the advantage that no weights are associated with the rules. The disadvantage of this approach is that the degree to which the rules are satisfied cannot be expressed, and so, in some cases, the most plausible interpretation of an NS will not be produced.</Paragraph>
    <Paragraph position="5"> Also, section 4 will show that this algorithm is suitable only when the sense of each noun is a given, a situation which is not true for processing unrestricted text.</Paragraph>
    <Paragraph position="6"> This paper introduces an algorithm which is specifically designed for analyzing NSs in unrestricted text. The task of processing unrestricted text has two consequences: firstly, hand-coded semantic information, and therefore a concept dependent algorithm, is no longer feasible; and secondly, the intended sense of each noun can no longer be taken as a given. The algorithm described here, therefore, relies on semantic information which has been extracted automatically from an on-line dictionary (see Montemagni and Vanderwende, 1992; Dolan et al., 1993). This algorithm manipulates a set of general rules, each of which has an associated weight, and a general procedure for matching words. The result of this algorithm is an ordered set of interpretations and partial sense-disambiguation of the nouns by taking note of which noun senses were most relevant in each of the possible interpretations.</Paragraph>
    <Paragraph position="7"> We will begin by reviewing the classification schema for NSs described in Vanderwende (1993) and the type of general rules which this algorithm is designed to handle. The matching procedure will be described; by introducing a separate matching procedure, the rules in Vanderwende (1993) can be organized in such a way as to make the algorithm more efficient. We will then show the algorithm for rule application in detail. This algorithm differs from Leonard (1984) by applying all of the rules before determining which interpretation is the most plausible (effectively, a parallel rule application), rather than determining the best interpretations by the order in which the rules are applied (a serial rule application). In section 4, we will provide examples which illustrate that a parallel algorithm is required when processing unrestricted, undisambiguated text. Finally, the results of applying this algorithm to a training and a test corpus of NSs will be presented, along with a discussion of these results and further directions for research in NS analysis.</Paragraph>
    <Section position="1" start_page="782" end_page="783" type="sub_section">
      <SectionTitle>
1.1 NS interpretations
</SectionTitle>
      <Paragraph position="0"> Table 1 shows a classification schema for NSs (Vanderwende, 1993) which accounts for most of the NS classes studied previously in theoretical linguistics (Downing, 1977; Jespersen, 1954; Lees, 1960; and Levi, 1978). The relation which holds between the nouns in a NS has conventionally been given names such as Purpose or Location. The classification schema that is used in this system has been formulated as wh-questions. A NS 'can be classified according to which wh-question the modifier (first noun) best answers' (Vanderwende, 1993). Deciding how a NS should be classified is not at all clear and we need criteria for judging whether a NS has been classified appropriately. The formulation of NS classes as wh-questions is intended to provide at least one criterion for judging NS classification; other criteria are provided in Vanderwende (1993).</Paragraph>
      <Paragraph position="1"> 1.2 General rules for NS analysis. Each general rule can be considered to be a description of the configuration of semantic and syntactic attributes which provide evidence for a particular NS interpretation, i.e., a NS classification. Exactly how these rules are applied is the topic of this paper. Typically, the general rules correspond in a many-to-one relation to the number of classes in the classification schema because more than one combination of semantic attributes can identify the NS as a member of a particular class. This is illustrated in Table 2, which presents two of the rules for establishing a 'What for?' interpretation.</Paragraph>
      <Paragraph position="2"> The first rule (H1) tests whether the definition of the head contains a PURPOSE or INSTRUMENT-FOR attribute which matches (i.e., has the same lemma as) the modifier. When this rule is applied to the NS bird sanctuary, the rule finds that a PURPOSE attribute has been identified automatically in the definition of the head: sanctuary (L n,3) 'an area for birds or other kinds of animals where they may not be hunted and their animal enemies are controlled'. (All examples are from the Longman Dictionary of Contemporary English, Longman Group, 1978.) The values of this PURPOSE attribute are bird and animal. The rule H1 verifies that the definition of sanctuary contains a PURPOSE attribute, and that one of its values, namely bird, has the same lemma as the modifier, the first noun.</Paragraph>
      <Paragraph position="3"> The second rule (H2) tests a different configuration, namely, whether the definition of the head contains a LOCATION-OF attribute which matches the modifier; the example bird cage will be presented in section 2.</Paragraph>
      <Paragraph position="4"> These rules are in a notation modified from Vanderwende (1993, pp. 166-7). Firstly, the rules have been divided into those that test attributes on the head, as rules H1 and H2 do, and those that test attributes on the modifier. Secondly, associated with each rule is a weight. Unlike in Vanderwende (1993), this rule weight is only part of the final score of a rule application; the final score of a rule application is composed of both the rule weight and the weight returned from the matching procedure, which will be described in the next section.</Paragraph>
    </Section>
  </Section>
  <Section position="3" start_page="783" end_page="784" type="metho">
    <SectionTitle>
2. THE MATCHING PROCEDURE
</SectionTitle>
    <Paragraph position="0"> Matching is a general procedure which returns a weight to reflect how closely related two words are, in this case how related the value of an attribute is to a given lemma. The weight returned by the matching procedure is added to the weight of the rule to arrive at the score of the rule as a whole. In the best case, the matching procedure finds that the lemma is the same as the value of the attribute being tested. We saw above that in the NS bird sanctuary, the modifier bird has the same lemma as the value of a PURPOSE attribute which can be identified in the definition of the head, sanctuary. The weight associated with such an exact match is 0.5. Applying rule H1 in Table 2 to the NS bird sanctuary has an overall score of 1: the match weight 0.5 added to the rule weight 0.5.</Paragraph>
    <Paragraph position="1"> When an exact match cannot be found between the lemma and the attribute value, the matching procedure can investigate a match given semantic information for each of the senses of the lemma. (Only in the worst case would this be equivalent to applying each rule to each combination of modifier and head senses.) Of course the HYPERNYM attribute will be useful to find a match. Applying rule H1 to the NS owl sanctuary, a match is found between the PURPOSE attribute in the definition of sanctuary and the modifier owl, because the definition of owl (L n,1): 'any of several types of night bird with large eyes, supposed to be very wise', identifies bird (one of the values of sanctuary's PURPOSE attribute) as the HYPERNYM of owl.</Paragraph>
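    <Paragraph> The arithmetic for bird sanctuary can be written out as a minimal Python sketch. The function and variable names are hypothetical; only the two weights of 0.5 and the PURPOSE values bird and animal come from the text, and returning 0.0 when no exact match exists is an assumption made here so that the example is complete.

```python
EXACT_MATCH_WEIGHT = 0.5  # weight of an exact match, as stated in the text


def exact_match_weight(attribute_values, lemma):
    """Return 0.5 if the lemma equals one of the attribute values, else 0.0.

    The 0.0 fallback is an assumption of this sketch, not part of the paper.
    """
    return EXACT_MATCH_WEIGHT if lemma in attribute_values else 0.0


def rule_score(rule_weight, match_weight):
    """Overall score of a rule application: rule weight plus match weight."""
    return rule_weight + match_weight


# PURPOSE values identified in the definition of sanctuary (from the text).
purpose_values = ["bird", "animal"]
h1_weight = 0.5  # weight of rule H1, as stated in the text

score = rule_score(h1_weight, exact_match_weight(purpose_values, "bird"))
print(score)
```
    </Paragraph>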
    <Paragraph position="2"> Whenever the HYPERNYM attribute is used, the weight returned by the matching procedure is [...]. Other semantic attributes are also relevant for finding a match. Fig. 1 shows graphically how the attribute HAS-PART can be used to establish a match. One of the 'Who/What?' rules tests whether any of the verbal senses of the head has a BY-MEANS-OF attribute which matches the modifier. In the verb definition of scratch (L v,1): 'to rub and tear or mark (a surface) with something pointed or rough, as with claws or fingernails', a BY-MEANS-OF attribute can be identified with claw and fingernail as its values, neither of which match the modifier noun cat.</Paragraph>
    <Paragraph position="3"> Now the matching procedure investigates the senses of cat attempting to find a match. The definition of cat (L n,1): 'a small animal with soft fur and sharp teeth and claws (nails) ....' identifies claw (one of scratch's BY-MEANS-OF attributes) as one of the values of HAS-PART, thus establishing the match shown in Fig. 1. The weight associated with a match using HAS-PART [...] [Fig. 1: match using HAS-PART, with cat (L n,1) and scratch (L v,1).] Fig. 2 shows how the attributes HAS-OBJECT and HAS-SUBJECT can also be used; this type of match is required when a rule calls for a match between a lemma (which is a noun) and an attribute which typically has a verb as its value, since we can expect no link between a noun and a verb according to hypernymy or any part relation. In the definition of cage (L n,1): 'a framework of wires or bars in which animals or birds may be kept or carried', a LOCATION-OF attribute can be identified, with the verbs keep and carry as its value and a nested HAS-OBJECT attribute with animal and bird as its value; it is the HAS-OBJECT attribute which can match the modifier noun bird. A match using the HAS-OBJECT or HAS-SUBJECT attribute carries a weight of 0.2.</Paragraph>
    <Paragraph position="5"> Even when alternate matches are being investigated, such as a match using HAS-OBJECT, the senses of the lemma can still be examined. In this way, a 'What for?' interpretation can also be determined for the NS canary cage, shown in Fig. 3; the weight for this type of link is 0.1.</Paragraph>
    <Paragraph position="6"> In Vanderwende (1993), the rules themselves specified how to find the indirect matches described above. By separating the matching information from the information relevant to each rule, the matching can be applied more consistently; but equally important, the rules specify only those semantic attributes that indicate a specific interpretation.</Paragraph>
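    <Paragraph> The matching procedure of this section might be sketched in Python as follows. This is an illustration under stated assumptions, not the system's implementation: the data layout, the function names, the canary entry in the toy lexicon, and the choice to return the largest applicable weight are all hypothetical. The weights 0.5, 0.2 and 0.1 are the ones quoted above; links whose weight is not supplied by the caller (HYPERNYM and HAS-PART here) are simply not tried.

```python
# Weights quoted in the text. Weights for HYPERNYM and HAS-PART matches are
# deliberately left out; callers may add "hypernym" and "has_part" keys.
KNOWN_WEIGHTS = {"exact": 0.5, "nested": 0.2, "nested_via_sense": 0.1}

# Toy sense inventory (hypothetical format). The owl entry follows the
# definition quoted in the text; the canary entry is an assumption.
SENSES = {
    "owl": [{"HYPERNYM": ["bird"]}],
    "canary": [{"HYPERNYM": ["bird"]}],
}


def related(lemma, values, link):
    """True if some sense of the lemma lists one of the values under the link."""
    return any(v in sense.get(link, [])
               for sense in SENSES.get(lemma, [])
               for v in values)


def match(attribute, lemma, weights):
    """Return the best match weight between an attribute and a lemma (0.0 if none)."""
    values = attribute.get("values", [])
    nested = attribute.get("HAS-OBJECT", []) + attribute.get("HAS-SUBJECT", [])
    candidates = [0.0]
    if lemma in values:                      # exact match
        candidates.append(weights["exact"])
    for link, key in (("HYPERNYM", "hypernym"), ("HAS-PART", "has_part")):
        if key in weights and related(lemma, values, link):
            candidates.append(weights[key])  # match through a sense of the lemma
    if lemma in nested:                      # match on nested HAS-OBJECT or HAS-SUBJECT
        candidates.append(weights["nested"])
    if related(lemma, nested, "HYPERNYM"):   # nested match through a sense of the lemma
        candidates.append(weights["nested_via_sense"])
    return max(candidates)


sanctuary_purpose = {"values": ["bird", "animal"]}
cage_location_of = {"values": ["keep", "carry"], "HAS-OBJECT": ["animal", "bird"]}

print(match(sanctuary_purpose, "bird", KNOWN_WEIGHTS))   # bird sanctuary
print(match(cage_location_of, "bird", KNOWN_WEIGHTS))    # bird cage
print(match(cage_location_of, "canary", KNOWN_WEIGHTS))  # canary cage
```
    </Paragraph>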
  </Section>
  <Section position="4" start_page="784" end_page="785" type="metho">
    <SectionTitle>
3. ALGORITHM FOR APPLYING RULES
</SectionTitle>
    <Paragraph position="0"> The algorithm controls how the set of general rules will be applied in order to interpret NSs in unrestricted text. Given that a separate procedure for matching exists, the rules are naturally formulated as conditions, in the form of a semantic attribute(s) to be satisfied, on either the modifier or head, but not necessarily on both at the same time. This allows the rules to be divided into groups: modifier-based, head-based, and deverbal-head based. NSs with a deverbal head require additional conditions in the rules; if deverbal-head based rules were applied on par with the head-based rules, the deverbal-head rules would apply far too often, leading to spurious interpretations, because in English nouns and verbs are often homographs.</Paragraph>
    <Paragraph position="1"> The algorithm for interpreting NSs has four steps: (1) apply the head-based rules to each of the noun senses of the head and the lemma of the modifier; (2) apply the modifier-based rules to each of the noun senses of the modifier and the lemma of the head; (3) if no interpretation has received a weight above a certain threshold, then apply the deverbal-head rules to each of the verb senses of the head and the lemma of the modifier; (4) order the possible interpretations by comparing the weights assigned by the rule applications and return the list in order of likelihood. The semantic attributes which are found in the head-based conditions are: LOCATED-AT, [...] In Vanderwende (1993), it was suggested that each rule is applied to each combination of head sense and modifier sense. If the modifier has three noun senses and the head has four noun senses, then each of the 34 general rules would apply to each of the (3x4) possible combinations, for a total of 408 rule applications. With the current algorithm, if the modifier has three noun senses and the head has four noun senses, then first the eleven modifier rules apply (3x11), then the sixteen head rules apply (4x16), and if the head can be analyzed as a deverbal noun, then also the seven deverbal-head rules apply (4x7), for a total of 125 rule applications. Only after all of the rules have applied are the possible interpretations ordered according to their scores. It may seem that we have made the task of interpreting NSs artificially difficult by taking into consideration each noun sense in the modifier and head; one might argue that it is reasonable to assume that these nouns could be sense-disambiguated before NS analysis. We are not aware of any study which describes sense-disambiguation of the nouns in a NS.
On the contrary, Braden-Harder (1992) suggests that the results of disambiguation can be improved when relations such as verb-object, purpose, and location, are available; these relations are the result of our NS analysis, not the input.</Paragraph>
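    <Paragraph> The four steps and the rule-application counts can be sketched in Python. This is only an outline under assumptions: rules are modelled as hypothetical callables that take a sense and a lemma and return a list of (score, label) pairs, the toy rule and the threshold value 0.8 are invented for the example, and only the counts 34, 11, 16, 7 and the sense numbers 3 and 4 come from the text.

```python
def interpret(head_noun_senses, modifier_noun_senses, head_verb_senses,
              head_lemma, modifier_lemma,
              head_rules, modifier_rules, deverbal_rules, threshold):
    """Apply every rule group, then rank; returns (score, label) pairs."""
    results = []
    # Step 1: head-based rules, each noun sense of the head vs. the modifier lemma.
    for sense in head_noun_senses:
        for rule in head_rules:
            results.extend(rule(sense, modifier_lemma))
    # Step 2: modifier-based rules, each noun sense of the modifier vs. the head lemma.
    for sense in modifier_noun_senses:
        for rule in modifier_rules:
            results.extend(rule(sense, head_lemma))
    # Step 3: deverbal-head rules only if nothing scored above the threshold.
    if not any(score > threshold for score, _ in results):
        for sense in head_verb_senses:
            for rule in deverbal_rules:
                results.extend(rule(sense, modifier_lemma))
    # Step 4: order the interpretations by score, best first.
    return sorted(results, key=lambda pair: pair[0], reverse=True)


def toy_purpose_rule(sense, lemma):
    """Hypothetical head-based rule: rule weight 0.5 plus exact-match weight 0.5."""
    if lemma in sense.get("PURPOSE", []):
        return [(0.5 + 0.5, "What for?")]
    return []


sanctuary_senses = [{"PURPOSE": ["bird", "animal"]}]
ranked = interpret(sanctuary_senses, [{}], [], "sanctuary", "bird",
                   [toy_purpose_rule], [], [], threshold=0.8)
print(ranked)

# Rule-application counts from the text: 3 modifier senses, 4 head senses.
exhaustive = 34 * (3 * 4)            # every rule on every sense pair
grouped = 3 * 11 + 4 * 16 + 4 * 7    # modifier, head, and deverbal-head groups
print(exhaustive, grouped)
```
    </Paragraph>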
  </Section>
  <Section position="5" start_page="785" end_page="786" type="metho">
    <SectionTitle>
4. PARALLEL VERSUS SERIAL RULE
APPLICATION
</SectionTitle>
    <Paragraph position="0"> As we have seen above, the overall score for each possible interpretation is a combination of the weight of a rule and the weight returned by the matching procedure. A rule with a relatively high weight may have a low score overall if the match weight is very low, and a rule with a relatively low weight could have a high overall score if the match weight is particularly high. It is therefore impossible to order the rules a priori according to their weight.</Paragraph>
    <Paragraph position="1"> In Leonard (1984), the most plausible interpretation is determined by the order in which the rules are applied. By ordering the 'search for a material modifier' ahead of the 'search for a related verb', the interpretations of both silver pen and ink pen will be the same, given that both silver and ink are materials. In fact, only silver pen is correctly analyzed by the 'search for a material modifier' rule, while the correct interpretation of ink pen would have used the 'search for a related verb'.</Paragraph>
    <Paragraph position="2"> The problem with rule ordering is compounded when more than one sense of each noun is considered. In Leonard's lexicon, pen[1] is the writing implement and pen[2] is the enclosure for keeping animals in. By ordering a 'search for a related verb' ahead of a 'search for a locative', the interpretation of the NS bull pen is incorrect: 'a pen[1] that a bull or bulls writes something with'. Less likely is the correct locative interpretation 'a pen[2] for or containing a bull or bulls'.</Paragraph>
    <Paragraph position="3"> In our system, the most likely interpretations of bull pen are ordered correctly because, for the locative interpretation, we find meaningful matches in the definitions of bull and pen: the definition of pen (L n,1): 'a small piece of land enclosed by a fence, used esp. for keeping animals in', identifies a PURPOSE attribute, with the verb keep and a nested HAS-OBJECT animal as its values. The HAS-OBJECT animal can be matched with the modifier lemma bull, because one of the HYPERNYMs of bull (L n,2) is animal. For the related verb interpretation, however, we find no match between the typical subjects of the verb related to pen, namely write, and the modifier bull; a 'Who/What?' interpretation is only possible because bull is an animate, and, by default, animates can be the subject of a verb.</Paragraph>
    <Paragraph position="4"> We must conclude that what is important is the degree to which there is a match between the values of these attributes and the lemma, and not merely the presence or absence of semantic attributes. Only after all of the rules have been applied can the most plausible interpretation be determined.</Paragraph>
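    <Paragraph> The contrast between serial and parallel rule application for bull pen can be illustrated with a small Python sketch. The function names and the tuple layout are hypothetical, and all four weights in the example are illustrative assumptions rather than values from the paper; the point is only that a fixed rule order returns the first applicable rule, while scoring every application lets a better match win.

```python
def serial_choice(applications):
    """Return the first rule, in the given fixed order, whose conditions are met."""
    for name, _rule_weight, match_weight in applications:
        if match_weight is not None:
            return name
    return None


def parallel_choice(applications):
    """Score every applicable rule (rule weight + match weight) and keep the best."""
    scored = [(rule_weight + match_weight, name)
              for name, rule_weight, match_weight in applications
              if match_weight is not None]
    return max(scored)[1] if scored else None


# bull pen, with the related-verb rule ordered ahead of the locative rule.
# All four numbers below are illustrative assumptions, not values from the paper.
bull_pen = [
    ("related verb", 0.5, 0.05),  # weak default match: bull is animate
    ("locative", 0.5, 0.1),       # match via HAS-OBJECT animal and a hypernym of bull
]

print(serial_choice(bull_pen))
print(parallel_choice(bull_pen))
```
    </Paragraph>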
  </Section>
</Paper>