
<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-1003">
  <Title>Language and Computation</Title>
  <Section position="4" start_page="18" end_page="19" type="metho">
    <SectionTitle>
3 Attribute Extraction and Classification
</SectionTitle>
    <Paragraph position="0"> The goal of this work is to identify genuine attributes by classifying candidate attributes collected using text patterns as discussed in (Almuhareb and Poesio, 2004) according to a scheme inspired by those proposed by Guarino and Pustejovsky.</Paragraph>
    <Paragraph position="1"> The scheme we used to classify the training data in the experiment discussed below consists of six categories:  * Qualities: Analogous to Guarino's qualities and Pustejovsky's formal 'role'. (E.g., &amp;quot;the color of the car&amp;quot;.) * Parts: Related to Guarino's non-relational attributes and Pustejovsky's constitutive 'roles'. (E.g., &amp;quot;the hood of the car&amp;quot;). * Related-Objects: A new category intro null duced to cover the numerous physical objects which are 'related' to an object but are not part of it--e.g., &amp;quot;the track of the deer&amp;quot;.  E.g., http://plato.stanford.edu/entries/substance. Thanks to one of the referees for drawing our attention to this.  'Facets' would be perhaps a more appropriate term to avoid confusions with the use of the term 'role' in Knowledge Representation. null  * Activities: These include both the types of activities which are part of Pustejovsky's telic 'role' and those which would be included in his agentive 'role'. (E.g., &amp;quot;the repairing of the car&amp;quot;.) * Related-Agents: For the activities in which the concept in question is acted upon, the agent of the activity: e.g., &amp;quot;the writer of the book&amp;quot;, &amp;quot;the driver of the car&amp;quot;. * Non-Attributes: This category covers the cases in which the construction &amp;quot;the N of the N&amp;quot; expresses other semantic relations, as in: &amp;quot;the last of the deer&amp;quot;, &amp;quot;the majority of the deer,&amp;quot; &amp;quot;the lake of the deer,&amp;quot; and &amp;quot;in the case of the deer&amp;quot;.</Paragraph>
    <Paragraph position="2"> We will quickly add that (i) we do not view this classification as definitive--in fact, we already collapsed the classes 'part' and 'related objects' in the experiments discussed below--and (ii) not all of these distinctions are very easy even for human judges to do. For example, design, as an attribute of a car, can be judged to be a quality if we think of it as taking values such as modern and standard; on the other hand, design might also be viewed as an activity in other contexts discussing the designing process. Another type of difficulty is that a given attribute may express different things for different objects. For example, introduction is a part of a book, and an activity for a product. An additional difficulty results from the strong similarity between parts and related-objects. For example, &amp;quot;key&amp;quot; is a related-object to a car but it is not part of it. We will return to this issue and to agreement on this classification scheme when discussing the experiment.</Paragraph>
    <Paragraph position="3"> One difference from previous work is that we use additional linguistic constructions to extract candidate attributes. The construction &amp;quot;the X of the Y is&amp;quot; used in our previous work is only one example of genitive construction. Quirk et al (1985) list eight types of genitives in English, four of which are useful for our purposes:  * Possessive Genitive: used to express qualities, parts, related-objects, and relatedagents. null * Genitive of Measure: used to express qualities. null * Subjective &amp; Objective Genitives: used to  express activities.</Paragraph>
    <Paragraph position="4"> We used all of these constructions in the work discussed here.</Paragraph>
  </Section>
  <Section position="5" start_page="19" end_page="22" type="metho">
    <SectionTitle>
4 Information Used to Classify Attributes
</SectionTitle>
    <Paragraph position="0"> Our attribute classifier uses four types of information: morphological information, an attribute model, a question model, and an attributive-usage model. In this section we discuss how this information is automatically computed.</Paragraph>
    <Section position="1" start_page="19" end_page="20" type="sub_section">
      <SectionTitle>
4.1 Morphological Information
</SectionTitle>
      <Paragraph position="0"> Our use of morphological information is based on the noun classification scheme proposed by Dixon (1991). According to Dixon, derivational morphology provides some information about attribute type. Parts are concrete objects and almost all of them are expressed using basic noun roots (i.e., not derived from adjectives or verbs). Most of qualities and properties are either basic noun roots or derived from adjectives. Finally, activities are mostly nouns derived from verbs. Although these rules only have a heuristic value, we found that morphologically based heuristics did provide useful cues when used in combination with the other types of information discussed below.</Paragraph>
      <Paragraph position="1"> As we are not aware of any publicly available software performing automatic derivational morphology, we developed our own (and very basic) heuristic methods. The techniques we used involve using information from WordNet, suffix-checking, and a POS tagger.</Paragraph>
      <Paragraph position="2"> WordNet was used to find nouns that are derived from verbs and to filter out words that are not in the noun database. Nouns in WordNet are linked to their derivationally related verbs, but there is no indication about which is derived from which. We use a heuristic based on length to decide this: the system checks if the noun contains more letters than the most similar related verb. If this is the case, then the noun is judged to be derived from the verb. If the same word is used both as a noun and as a verb, then we check the usage familiarity of the word, which can also be found in WordNet.</Paragraph>
      <Paragraph position="3"> If the word is used more as a verb and the verbal usage is not rare, then again the system treats the noun as derived from the verb.</Paragraph>
      <Paragraph position="4">  To find nouns that are derived from adjectives we used simple heuristics based on suffixchecking. (This was also done by Berland and Charniak (1999).) All words that end with &amp;quot;ity&amp;quot; or &amp;quot;ness&amp;quot; are considered to be derived from adjectives. A noun not found to be derived from a verb or an adjective is assumed to be a basic noun root.</Paragraph>
      <Paragraph position="5"> In addition to derivational morphology, we used the Brill tagger (Brill, 1995) to filter out adjectives and other types of words that can occasionally be used as nouns such as better, first, and whole before training. Only nouns, base form verbs, and gerund form verbs were kept in the candidate attribute list.</Paragraph>
    </Section>
    <Section position="2" start_page="20" end_page="21" type="sub_section">
      <SectionTitle>
4.2 Clustering Attributes
</SectionTitle>
      <Paragraph position="0"> Attributes are themselves concepts, at least in the sense that they have their own attributes: for example, a part of a car, such as a wheel, has its own parts (the tyre) its qualities (weight, diameter) etc.</Paragraph>
      <Paragraph position="1"> This observation suggests that it should be possible to find similar attributes in an unsupervised fashion by looking at their attributes, just as we did earlier for concepts (Almuhareb and Poesio, 2004). In order to do this, we used our text patterns for finding attributes to collect from the Web up to 500 pattern instances for each of the candidate attributes. The collected data were used to build a vectorial representation of attributes as done in (Almuhareb and Poesio, 2004). We then used CLUTO (Karypis, 2002) to cluster attributes using these vectorial representations. In a first round of experiments we found that the classes 'parts' and 'related objects' were difficult to differentiate, and therefore we merged them. The final model clusters candidate attributes into five classes: activities, parts &amp; related-objects, qualities, related-agents, and non-attributes. This classification was used as one of the input features in our supervised classifier for attributes.</Paragraph>
      <Paragraph position="2"> We also developed a measure to identify particularly distinctive 'attributes of attributes'-attributes which have a strong tendency to occur primarily with attributes (or any concept) of a given class--which has proven to work pretty well.</Paragraph>
      <Paragraph position="3"> This measure, which we call Uniqueness, actually is the product of two factors: the degree of uniqueness proper, i.e., the probability P(class</Paragraph>
      <Paragraph position="5"> ) that an attribute (or, in fact, any other noun) will belong to class i given than it has attribute j; and a measure of 'definitional power' -the prob- null ability P(attribute j  |class i ) that a concept belonging to a given class will have a certain attribute. Using MLE to estimate these probabilities, the degree of uniqueness of attributes j of class</Paragraph>
      <Paragraph position="7"> count function that counts concepts that are associated with the given attribute. Uniqueness ranges from 0 to 1.</Paragraph>
      <Paragraph position="8"> Table 1 shows the 10 most distinctive attributes for each of the five attribute classes, as determined by the Uniqueness measure just introduced, for the 1,155 candidate attributes in the training data for the experiment discussed below.</Paragraph>
      <Paragraph position="9">  measure, basis, determination, question, extent, issue, measurement, light, result, increase Non-Attribute (0.18) content, value, rest, nature, meaning, format, interpretation, essence, size, source  classes of candidate attributes. Average distinctiveness (uniqueness) for the top 10 attributes is shown between parentheses Most of the top 10 attributes of related-agents, parts &amp; related-objects, and activities are genuinely distinctive attributes for such classes. Thus, attributes of related-agents reflect the 'intentionality' aspect typical of members of this class: identity, duty, and responsibility. Attributes of parts are common attributes of physical objects (e.g., inside, shape). Most attributes of activities have to do with temporal properties and causal structure: e.g., beginning, cause. The 'distinctive' attributes of the  quality class are less distinctive, but four such attributes (measure, extent, measurement, and increase) are related to values since many of the qualities can have different values (e.g., small and large for the quality size). There are however several attributes in common between these classes of attributes, emphasizing yet again how some of these distinctions at least are not completely clear cut: e.g., result, in common between activities and qualities (two classes which are sometimes difficult to distinguish). Finally, as one would expect, the attributes of the non-attribute class are not really distinctive: their average uniqueness score is the lowest. This is because 'non-attribute' is a heterogeneous class.</Paragraph>
    </Section>
    <Section position="3" start_page="21" end_page="21" type="sub_section">
      <SectionTitle>
4.3 The Question Model
</SectionTitle>
      <Paragraph position="0"> Certain types of attributes can only be used when asking certain types of questions. For example, it is possible to ask &amp;quot;What is the color of the car?&amp;quot; but not &amp;quot;[?]When is the color of the car?&amp;quot;.</Paragraph>
      <Paragraph position="1"> We created a text pattern for each type of question and used these patterns to search the Web and collect counts of occurrences of particular questions. An example of such patterns would be: * &amp;quot;what is|are the A of the&amp;quot; where A is the candidate attribute under investigation. Patterns for who, when, where, and how are similar.</Paragraph>
      <Paragraph position="2"> After collecting occurrence frequencies for all the candidate attributes, we transform these counts into weights using the t-test weighting function as done for all of our counts, using the following formula from Manning and Schuetze (1999):</Paragraph>
      <Paragraph position="4"> where N is the total number of relations, and C is a count function.</Paragraph>
      <Paragraph position="5"> Table 2 shows the 10 most frequent attributes for each question type. This data was collected using a more restricted form of the question patterns and a varying number of instances for each type of questions. The restricted form includes a question mark at the end of the phrase and was used to improve the precision. For example, the what-pattern would be &amp;quot;what is the * of the *?&amp;quot;.</Paragraph>
      <Paragraph position="6"> Question Top 10 Attributes what purpose, name, nature, role, cost, function, significance, size, source, status who author, owner, head, leader, president, sponsor, god, lord, father, king where rest, location, house, fury, word, edge, center, end, ark, voice how quality, rest, pace, level, length, morale, performance, content, organization, cleanliness when end, day, time, beginning, date, onset, running, birthday, fast, opening  Instances of the what-pattern are frequent in the Web: the Google count was more than 2,000,000 for a query issued in mid 2004. The who-pattern is next in terms of occurrence, with about 350,000 instances. The when-pattern is the most infrequent pattern, about 5,300 instances.</Paragraph>
      <Paragraph position="7"> The counts broadly reflected our intuitions about the use of such questions. What-questions are mainly used with qualities, whereas whoquestions are used with related-agents. Attributes occurring with when-questions have some temporal aspects; attributes occurring with how-questions are mostly qualities and activities, and attributes in where-questions are of different types but some are related to locations. Parts usually do not occur with these types of questions.</Paragraph>
    </Section>
    <Section position="4" start_page="21" end_page="22" type="sub_section">
      <SectionTitle>
4.4 Attributive Use
</SectionTitle>
      <Paragraph position="0"> Finally, we exploited the fact that certain types of attributes are used more in language as concepts rather than as attributes. For instance, it is more common to encounter the phrase &amp;quot;the size of the [?]&amp;quot; than &amp;quot;the [?] of the size&amp;quot;. On the other hand, it is more common to encounter the phrase &amp;quot;the * of the window&amp;quot; than &amp;quot;the window of the *&amp;quot;. Generally speaking, parts, related-objects, and related-agents are more likely to have more attributes than qualities and activities. We used the two patterns &amp;quot;the * of the A&amp;quot; and &amp;quot;the A of the *&amp;quot; to collect Google counts for all of the candidate attributes.</Paragraph>
      <Paragraph position="1"> These counts were also weighted using the t-test as in the question model.</Paragraph>
      <Paragraph position="2"> Table 3 illustrates the attributive and conceptual usage for each attribute class using a training data of 1,155 attributes. The usage averages confirm the initial assumption.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="22" end_page="22" type="metho">
    <SectionTitle>
5 The Experiment
</SectionTitle>
    <Paragraph position="0"> We trained two classifiers: a 2-way classifier that simply classifies candidate attributes into attributes and non-attributes, and a 5-way classifier that classifies candidate attributes into activities, parts &amp; related-objects, qualities, related-agents, and nonattributes. These classifiers were trained using decision trees algorithm (J48) from WEKA (Witten and Frank, 1999).</Paragraph>
    <Paragraph position="2"> values for morph are as follows: DV: derived from verb; BN: basic noun; DA: derived from adjective Our training and testing material was acquired as follows. We started from the 24,178 candidate attributes collected for the concepts in the balanced concept dataset we recently developed (Almuhareb and Poesio, 2005). We threw out every candidate attribute with a Google frequency less than 20; this reduced the number of candidate attributes to 4,728. We then removed words other than nouns and gerunds as discussed above, obtaining 4,296 candidate attributes.</Paragraph>
    <Paragraph position="3"> The four types of input features for this filtered set of candidate attributes were computed as discussed in the previous section. The best results were obtained using all of these features. A training set of 1,155 candidate attributes was selected and hand-classified (see below for agreement figures). We tried to include enough samples for each attribute class in the training set. Table 4 shows the input features for five different training examples, one for each attribute class.</Paragraph>
  </Section>
class="xml-element"></Paper>