<?xml version="1.0" standalone="yes"?>
<Paper uid="E99-1023">
<Title>Representing Text Chunks</Title>
<Section position="6" start_page="177" end_page="177" type="concl">
<SectionTitle> 5 Concluding remarks </SectionTitle>
<Paragraph position="0"> We have compared seven different data formats for the recognition of baseNPs with memory-based learning (IB1-IG). The IOB1 format, introduced in (Ramshaw and Marcus, 1995), consistently came out as the best format. However, the differences with the other formats were not significant.</Paragraph>
<Paragraph position="1"> Some representation formats achieved better precision rates, others better recall rates. This information is useful for tasks that require chunking structures, because some tasks may be more interested in high precision rates while others may be more interested in high recall rates.</Paragraph>
<Paragraph position="2"> The IB1-IG algorithm has been able to improve the best reported Fβ=1 rates for a standard data set (92.37 versus (Ramshaw and Marcus, 1995)'s 92.03). This result was aided by using non-standard parameter values (k=3), and the algorithm was sensitive to redundant input features. This means that finding an optimal performance for this task requires searching a large parameter/feature configuration space. An interesting topic for future research would be to embed IB1-IG in a standard search algorithm, such as hill-climbing, and explore this parameter space. Further room for improved performance lies in computing the POS tags in the data with a better tagger than the one presently used.</Paragraph>
</Section>
</Paper>
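As an illustration of the kind of representation formats the paper compares (this sketch is not from the paper itself; the token sequence, chunk spans, and function names are invented for the example), the snippet below converts baseNP chunk spans into two of the tag formats discussed: IOB2, where every chunk starts with a B tag, and IOB1, where B is used only when a chunk immediately follows another chunk.

```python
# Illustrative sketch of two chunk tag formats (IOB1 and IOB2).
# A chunk is an (start, end) span over the token list, end exclusive.

def to_iob2(n_tokens, chunks):
    """IOB2: every chunk begins with B, continues with I, non-chunk tokens are O."""
    tags = ["O"] * n_tokens
    for start, end in chunks:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags

def to_iob1(n_tokens, chunks):
    """IOB1: B is used only for a chunk that directly follows another chunk."""
    tags = to_iob2(n_tokens, chunks)
    for i, tag in enumerate(tags):
        # Demote B to I when the chunk is not preceded by another chunk.
        if tag == "B" and (i == 0 or tags[i - 1] == "O"):
            tags[i] = "I"
    return tags

# Toy example: baseNP chunks [early trading] [Hong Kong] [Monday] [gold]
tokens = ["In", "early", "trading", "in", "Hong", "Kong",
          "Monday", ",", "gold", "was", "quoted"]
chunks = [(1, 3), (4, 6), (6, 7), (8, 9)]
print(to_iob1(len(tokens), chunks))
print(to_iob2(len(tokens), chunks))
```

Note how the two formats differ only at "Monday": in IOB1 it gets a B tag because it directly follows the chunk "Hong Kong", while in IOB2 every chunk-initial token is tagged B.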