File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/04/w04-3233_concl.xml

Size: 2,776 bytes

Last Modified: 2025-10-06 13:54:31

<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-3233">
  <Title>NP Bracketing by Maximum Entropy Tagging and SVM Reranking</Title>
  <Section position="6" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6 Conclusion
</SectionTitle>
    <Paragraph position="0"> We have presented a method for noun phrase bracketing that outperforms competing methods in terms of both f-score and recall. The system is based on two separate components: a maximum entropy-based tagging system and a support vector machine reranking system. The key feature of the tagging system is that it produces underspecified tags that are resolved only at decoding time by bracketing constraints. The system operates quickly: it can tag and rerank at a rate of approximately two sentences per second.</Paragraph>
    <Paragraph position="1"> The tagger alone achieves an f-score of 83.4. This score is only 0.4% lower (absolute) than the best reported result to date of 83.8.</Paragraph>
    <Paragraph position="2"> After tagging, we feed 100-best lists into a support vector reranking system, which performs global optimization to choose a good bracketing.</Paragraph>
    <Paragraph position="3"> Our reranking system increases the f-score of our bracketing approach from 83.4 to 86.1, improving our performance beyond that of the best reported system to date.</Paragraph>
    <Paragraph position="4"> As we can see from Table 1, by comparing the output of our system to that of COL00Full, there is much in the way of recall to be gained by using a full syntactic parser. However, this gain comes at two costs. First, full syntactic parsers are computationally more expensive to run. Second, the performance of Collins' parser degrades significantly (from 87.9 to 68.7 in f-score) when it cannot take advantage of other constituent information. This has a strong influence when one is faced with the task of moving to a new domain. Our system (as well as the other bracketing systems cited) requires data to be annotated only at the NP level in order to achieve high performance. Conversely, without full parses, using a parser for learning NPs is inadequate.</Paragraph>
    <Paragraph position="5"> Despite these successes, there is still much that can be improved upon. While the reranking is very efficient in the classification phase, training a support vector reranking system is computationally very expensive. Other well-grounded statistical learning systems might allow us to train this component on more data and with more features. We also hope to improve our system's f-score from its current level of 86.1 (on official data) and 87.4 (on all data) closer to the n-best optimal, depicted in Figure 3.</Paragraph>
  </Section>
</Paper>