<?xml version="1.0" standalone="yes"?> <Paper uid="P02-1062"> <Title>Ranking Algorithms for Named-Entity Extraction: Boosting and the Voted Perceptron</Title> <Section position="8" start_page="0" end_page="0" type="concl"> <SectionTitle> 6 Discussion </SectionTitle> <Paragraph position="0"> A natural question about the approaches in this paper is whether the features we have described could be incorporated into a maximum-entropy tagger, giving similar improvements in accuracy. This section discusses why this is unlikely to be the case. The problem described here is closely related to the label bias problem described in (Lafferty et al. 2001).</Paragraph> <Paragraph position="1"> One straightforward way to incorporate global features into the maximum-entropy model would be to introduce new features f(h, t) which indicate whether the tagging decision t in the history h creates a particular global feature. For example, we could introduce a feature that takes the value 1 when the tagging decision creates an entity whose last word is not capitalized (LWLC=1). This feature would take the value 1 if its was tagged as N in the following context, She/N praised/N the/N University/S for/C its/? efforts to ... because tagging its as N in this context would create an entity whose last word was not capitalized, i.e., University for. Similar features could be created for all of the global features introduced in this paper.</Paragraph> <Paragraph position="2"> This example also illustrates why this approach is unlikely to improve the performance of the maximum-entropy tagger. The parameter associated with this new feature can only affect the score for a proposed sequence by modifying P(t | h) at the point at which the feature takes the value 1. In the example, this means that the LWLC=1 feature can only lower the score for the segmentation by lowering the probability of tagging its as N. 
But its has a probability of almost 1 of not appearing as part of an entity, so the probability of tagging its as N should remain close to 1 in this context!</Paragraph> <Paragraph position="4"> The decision which effectively created the entity University for was the decision to tag for as C, and this has already been made. The independence assumptions in maximum-entropy taggers of this form often lead points of local ambiguity (in this example the tag for the word for) to create globally implausible structures with unreasonably high scores. See (Collins 1999) section 8.4.2 for a discussion of this problem in the context of parsing.</Paragraph> <Paragraph position="5"> Acknowledgements Many thanks to Jack Minisi for annotating the named-entity data used in the experiments. Thanks also to Nigel Duffy, Rob Schapire and Yoram Singer for several useful discussions.</Paragraph> </Section> </Paper>