<?xml version="1.0" standalone="yes"?> <Paper uid="N06-2038"> <Title>A Comparison of Tagging Strategies for Statistical Information Extraction</Title> <Section position="6" start_page="151" end_page="151" type="concl"> <SectionTitle> 5 Conclusion </SectionTitle> <Paragraph position="0"> Previously, classification-based approaches to IE have combined a specific tagging strategy with a specific classification algorithm and specific other parameter settings, making it hard to detect how each of these choices influences the results. We have designed a generalized IE system that allows exploring each of these choices in isolation. For this paper, we have tested the tagging strategies that can be found in the literature. We have also introduced a new tagging strategy, BIA (Begin/After tagging).</Paragraph> <Paragraph position="1"> Our results indicate that the choice of a tagging strategy, while not crucial, should not be neglected when implementing a statistical IE system. The IOB2 strategy, which is very popular, having been used in public challenges such as those of CoNLL (Tjong Kim Sang and De Meulder, 2003) and JNLPBA (Kim et al., 2004), has indeed been found to be the best of all established tagging strategies. It is rivaled by the new BIA strategy. In typical situations, using one of those strategies should be a good choice. Since BIA requires more classes, it makes sense to prefer IOB2 when in doubt.</Paragraph> <Paragraph position="2"> Considering that it is not much worse, the Triv strategy, which requires only a single class per slot type, might be useful in situations where the number of available classes is limited or the space or time overhead of additional classes is high. 
The two-classifier BE strategy is still interesting if used as part of a more refined approach, as done by the ELIE system (Finn and Kushmerick, 2004). Future work will examine how well these results generalize to other classifiers and other corpora.</Paragraph> <Paragraph position="3"> To combine the strengths of different tagging strategies, ensemble meta-strategies utilizing the results of multiple strategies could be explored.</Paragraph> </Section> </Paper>