<?xml version="1.0" standalone="yes"?> <Paper uid="W06-0505"> <Title>Efficient Hierarchical Entity Classifier Using Conditional Random Fields</Title> <Section position="4" start_page="0" end_page="33" type="intro"> <SectionTitle> 2 WordNet </SectionTitle> <Paragraph position="0"> WordNet (Fellbaum et al., 1998) is a lexical database whose design is inspired by psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized in synsets. A synset is a collection of words that are close in meaning and that represent an underlying concept. An example of such a synset is &quot;person, individual, someone, somebody, mortal, soul&quot;. All these words refer to a human being.</Paragraph> <Paragraph position="1"> WordNet (v2.1) contains 155,327 words, which are organized in 117,597 synsets. WordNet defines a number of relations between synsets. For nouns the most important relation is the hypernym/hyponym relation. A noun X is a hypernym of a noun Y if Y is a subtype or instance of X. For example, &quot;bird&quot; is a hypernym of &quot;penguin&quot; (and &quot;penguin&quot; is a hyponym of &quot;bird&quot;). This relation organizes the synsets in a hierarchical tree (Hayes, 1999), of which a fragment is pictured in fig. 1.</Paragraph> <Paragraph position="2"> This tree has a depth of 18 levels and a maximum width of 17,837 synsets (fig. 2).</Paragraph> <Paragraph position="3"> We will build a classifier using CRFs that tags noun phrases in a text with their WordNet synset.</Paragraph> <Paragraph position="4"> This will enable us to recognize entities, and to classify the entities into certain groups. Moreover, it allows us to learn the context patterns associated with a particular meaning of a word. 
Take for example the sentence &quot;The ambulance took the remains of the bomber to the morgue.&quot; Having every noun phrase tagged with its WordNet synset reveals that in this sentence, &quot;bomber&quot; is &quot;a person who plants bombs&quot; (and not &quot;a military aircraft that drops bombs during flight&quot;). Using the hypernym/hyponym relations from WordNet, we can also easily find out that &quot;ambulance&quot; is a kind of &quot;car&quot;, which in turn is a kind of &quot;conveyance, transport&quot;, which in turn is a &quot;physical object&quot;.</Paragraph> </Section> </Paper>
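The hypernym lookup described above can be illustrated with a minimal sketch. It does not use the WordNet database itself: the dictionary `HYPERNYM_OF` and the functions `hypernym_chain` and `is_hypernym` are hypothetical names introduced only for illustration, and the toy hierarchy contains just the examples mentioned in the text. It also assumes that every concept has at most one direct hypernym, in line with the tree view of the hierarchy taken in this section.

```python
from typing import Dict, List

# Hypothetical toy hierarchy (not the real WordNet data): each key maps to
# its direct hypernym. Only the examples from the text are included, and
# intermediate WordNet synsets are deliberately left out.
HYPERNYM_OF: Dict[str, str] = {
    "penguin": "bird",
    "ambulance": "car",
    "car": "conveyance, transport",
    "conveyance, transport": "physical object",
}


def hypernym_chain(concept: str,
                   hypernym_of: Dict[str, str] = HYPERNYM_OF) -> List[str]:
    """Return all hypernyms of `concept`, from most to least specific."""
    chain: List[str] = []
    seen = {concept}
    current = concept
    # Walk upwards until a concept without a recorded hypernym is reached.
    while current in hypernym_of:
        current = hypernym_of[current]
        if current in seen:
            # Guard against malformed input that contains a cycle.
            raise ValueError("cycle detected at " + repr(current))
        seen.add(current)
        chain.append(current)
    return chain


def is_hypernym(x: str, y: str,
                hypernym_of: Dict[str, str] = HYPERNYM_OF) -> bool:
    """True if x is a (direct or indirect) hypernym of y."""
    return x in hypernym_chain(y, hypernym_of)


if __name__ == "__main__":
    print(hypernym_chain("ambulance"))
    print(is_hypernym("bird", "penguin"))
```

Walking the chain upwards is what makes it possible to group entities at different levels of granularity: the same noun phrase can be reported as an "ambulance", a "car", or a "physical object", depending on how far up the hierarchy one goes.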