<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0505">
  <Title>Efficient Hierarchical Entity Classifier Using Conditional Random Fields</Title>
  <Section position="4" start_page="0" end_page="33" type="intro">
    <SectionTitle>
2 WordNet
</SectionTitle>
    <Paragraph position="0"> WordNet (Fellbaum et al., 1998) is a lexical database whose design is inspired by psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized in synsets. A synset is a collection of words that are close in meaning and that represent a single underlying concept. An example of such a synset is &quot;person, individual, someone, somebody, mortal, soul&quot;. All these words refer to a human being.</Paragraph>
    <Paragraph position="1"> WordNet (v2.1) contains 155,327 words, which are organized in 117,597 synsets. WordNet defines a number of relations between synsets. For nouns, the most important relation is the hypernym/hyponym relation. A noun X is a hypernym of a noun Y if Y is a subtype or instance of X. For example, &quot;bird&quot; is a hypernym of &quot;penguin&quot; (and &quot;penguin&quot; is a hyponym of &quot;bird&quot;). This relation organizes the synsets in a hierarchical tree (Hayes, 1999), a fragment of which is pictured in fig. 1.</Paragraph>
    <Paragraph position="2"> This tree has a depth of 18 levels and a maximum width of 17,837 synsets (fig. 2).</Paragraph>
    <Paragraph position="3"> We will build a classifier using CRFs that tags noun phrases in a text with their WordNet synset.</Paragraph>
    <Paragraph position="4"> This will enable us to recognize entities and to classify them into groups. Moreover, it allows us to learn the context patterns associated with a particular meaning of a word. Take, for example, the sentence &quot;The ambulance took the remains of the bomber to the morgue.&quot; Having every noun phrase tagged with its WordNet synset reveals that in this sentence, &quot;bomber&quot; is &quot;a person who plants bombs&quot; (and not &quot;a military aircraft that drops bombs during flight&quot;). Using the hypernym/hyponym relations from WordNet, we can also easily find out that &quot;ambulance&quot; is a kind of &quot;car&quot;, which in turn is a kind of &quot;conveyance, transport&quot;, which in turn is a &quot;physical object&quot;.</Paragraph>
  </Section>
</Paper>