File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/04/n04-4030_evalu.xml
Size: 1,457 bytes
Last Modified: 2025-10-06 13:59:11
<?xml version="1.0" standalone="yes"?> <Paper uid="N04-4030"> <Title>Nearly-Automated Metadata Hierarchy Creation</Title> <Section position="5" start_page="0" end_page="0" type="evalu"> <SectionTitle> 4 Results </SectionTitle> <Paragraph position="0"> We experimented with a collection of descriptions of stands beneath the tree.</Paragraph> <Paragraph position="1"> A Greek trellis with Ionic columns, meander crossing diagonally; few vines; trees background; trellis is in a circle.</Paragraph> <Paragraph position="2"> The descriptions are preprocessed by removing frequent words that appear on a stop list. Information gain is then used to select target words, in this case yielding 849 words. Figure 2 shows partial results obtained using the WordNet algorithm (where compression reduced the number of nodes by a43 a14a45a44 ) and Word Space (Schutze, 1993). Note that the WordNet-based organization is intuitive; if it is not exactly what the designer wants, it should be easy to adjust. For example, a designer may prefer a &quot;nature&quot; category that combines the subcategories &quot;geological formation,&quot; &quot;body of water,&quot; and &quot;vascular plant.&quot; Some terminology may also need renaming, but WordNet also provides thesaurus terms that can be used in an underlying search engine. Word Space, by contrast, produces associationally related terms.</Paragraph> </Section> </Paper>