File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/00/w00-0728_intro.xml
Size: 2,661 bytes
Last Modified: 2025-10-06 14:01:00
<?xml version="1.0" standalone="yes"?> <Paper uid="W00-0728"> <Title>A Context Sensitive Maximum Likelihood Approach to Chunking</Title> <Section position="3" start_page="136" end_page="136" type="intro"> <SectionTitle> 3 Results </SectionTitle> <Paragraph position="0"> The evaluation program shows that this simple procedure reaches its best result for 5-contexts (table 1), with 92.46% label accuracy and phrase correctness measured by Fβ=1 = 87.23. However, the improvement from 3-contexts to 5-contexts is insignificant, as 3-contexts reached 92.41% accuracy and Fβ=1 = 87.09. The results for 7-contexts are almost identical to those for 5-contexts (92.44% and Fβ=1 = 87.21). This is taken as the performance limit imposed by the size of the training corpus.</Paragraph> <Paragraph position="1"> In a larger training corpus, the most common longer contexts are likely to be useful, but in a small set the longer contexts may occur with very low frequencies, making it hard to determine whether the label of such a context is the best guess for unseen samples.</Paragraph> <Paragraph position="2"> These results are the best that could be expected without generalization. To do better, the method has to generalize to unseen contexts, e.g., by using some notion of closely matching contexts (instances), so that longer contexts can be used even when part of the context has not been previously recorded. In addition, the tag structure could be productively utilized. The presented method has treated all labels as arbitrary, atomic and independent symbols.</Paragraph> <Section position="1" start_page="136" end_page="136" type="sub_section"> <SectionTitle> 3.1 Computational complexity </SectionTitle> <Paragraph position="0"> Using rule 2 from section 2.1, 45 patterns 'survived' for 1-contexts, and 3225, 71022 and 38541 for 3-, 5- and 7-contexts respectively, i.e., a total of 45, 3270, 74292 and 109563 using all contexts up to and including 1-, 3-, 5- and 7-contexts. 
Each unique context can be retrieved in one logical step (i.e., a hash-table look-up). There are obviously many patterns in the database, but the complexity of the task is limited by the number of look-ups necessary.</Paragraph> <Paragraph position="1"> There is a maximum of four hash-table look-ups for each tag (i.e., when the 7-, 5- and 3-contexts do not exist in the database, the most likely label of the current tag is used).</Paragraph> <Paragraph position="2"> Good performance can be obtained within a maximum of 2 look-ups for each label (i.e., using only 1- and 3-contexts), and the best results were obtained with a maximum of 3 look-ups per label.</Paragraph> </Section> </Section> </Paper>
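<!--
Editorial sketch (not part of the original paper). The back-off look-up
described in section 3.1 can be illustrated roughly as follows. This is a
minimal sketch under stated assumptions: an n-context is taken to be a window
of n part-of-speech tags centred on the current tag, the database is a plain
Python dict from context tuples to chunk labels, and the names PAD, context,
label_with_backoff, the default label "O" and the toy database entries are all
hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Context = Tuple[str, ...]
PAD = "#"  # hypothetical boundary symbol for positions outside the sentence


def context(tags: List[str], i: int, width: int) -> Context:
    """Return the window of `width` tags centred on position i (assumed layout)."""
    half = width // 2
    padded = [PAD] * half + tags + [PAD] * half
    return tuple(padded[i:i + width])


def label_with_backoff(
    tags: List[str],
    i: int,
    db: Dict[Context, str],
    widths: Sequence[int] = (7, 5, 3, 1),
    default: str = "O",  # assumption: fallback when even the 1-context is unseen
) -> Tuple[str, int]:
    """Try the longest context first; return (label, number of look-ups used)."""
    lookups = 0
    for width in widths:
        lookups += 1
        label = db.get(context(tags, i, width))  # one hash-table look-up
        if label is not None:
            return label, lookups
    return default, lookups


# Toy database (invented entries, not taken from the paper).
toy_db: Dict[Context, str] = {
    ("DT",): "B-NP",
    ("NN",): "I-NP",
    ("VBD",): "B-VP",
    ("DT", "NN", "VBD"): "I-NP",
}
sentence = ["DT", "NN", "VBD"]

# Position 1 is resolved by the 3-context after the 7- and 5-contexts miss.
print(label_with_backoff(sentence, 1, toy_db))
# Position 0 falls through to the 1-context, the fourth and last look-up.
print(label_with_backoff(sentence, 0, toy_db))
```

Passing widths=(3, 1) corresponds to the two-look-up configuration mentioned
in the text, and widths=(5, 3, 1) to the three-look-up configuration; in every
case the cost per tag is bounded by the number of widths tried.
-->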