<?xml version="1.0" standalone="yes"?> <Paper uid="W06-2808"> <Title>Anomaly Detecting within Dynamic Chinese Chat Text</Title> <Section position="8" start_page="52" end_page="52" type="evalu"> <SectionTitle> 5.3.2 Results </SectionTitle> <Paragraph position="0"> The experiment results for the approaches using the standard Chinese corpora on test set #2 are presented in Table 6.</Paragraph> <Paragraph position="1"> Table 6 shows that, in most cases, the entropy-based approach slightly outperforms the confidence-based approach. It can thus be concluded that the entropy-based approach is more effective in anomaly detection.</Paragraph> <Paragraph position="2"> It is also revealed that both approaches perform better with word trigrams than with POS tag trigrams. This is natural for a class-based trigram model when the number of classes is small.</Paragraph> <Paragraph position="3"> Thirty-nine classes are used in ICTCLAS for POS tagging Chinese words.</Paragraph> <Paragraph position="4"> When the three Chinese corpora are compared, CNGIGA performs best in the confidence-based approach with the word trigram model. However, this is not the case with the POS tag trigram model, where both approaches produce their best results on CNTB. Although we can conclude that a bigger corpus yields better performance with word trigrams, the same conclusion does not hold for POS tag trigrams. This is very interesting. A plausible explanation is that CNTB provides the highest-quality POS tag trigrams, while the other corpora contain more noisy POS tag trigrams, which eventually decreases performance. An observation on the word/POS tag lists for the three Chinese corpora supports this claim.
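The entropy-based scoring compared above can be illustrated with a minimal sketch. This is not the paper's implementation: the add-alpha smoothing, the BOS padding tokens and the per-token cross-entropy score are all illustrative assumptions.

```python
import math
from collections import defaultdict

def train_trigram_lm(corpus, alpha=1.0):
    # Count word trigrams and their bigram histories over tokenised
    # sentences, and return an add-alpha-smoothed conditional probability.
    tri, bi, vocab = defaultdict(int), defaultdict(int), set()
    for sent in corpus:
        toks = ["BOS", "BOS"] + sent
        vocab.update(sent)
        for i in range(2, len(toks)):
            tri[(toks[i-2], toks[i-1], toks[i])] += 1
            bi[(toks[i-2], toks[i-1])] += 1
    V = len(vocab) + 1  # +1 reserves mass for unseen words

    def p(w1, w2, w3):
        # Smoothed P(w3 | w1, w2).
        return (tri[(w1, w2, w3)] + alpha) / (bi[(w1, w2)] + alpha * V)
    return p

def entropy_score(sent, p):
    # Average negative log-probability (cross-entropy) of the sentence
    # under the trigram model; higher scores indicate anomalous text.
    toks = ["BOS", "BOS"] + sent
    logps = [math.log2(p(toks[i-2], toks[i-1], toks[i]))
             for i in range(2, len(toks))]
    return -sum(logps) / len(logps)
```

A term or sentence would then be flagged as anomalous when its score exceeds an estimated threshold.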
Text in CNTB is the best-edited amongst the three.</Paragraph> <Section position="1" start_page="52" end_page="52" type="sub_section"> <SectionTitle> 5.4 Experiment III: Anomaly Detection with NIL Corpus Incorporated </SectionTitle> <Paragraph position="0"> In this experiment, we incorporate a chat text corpus, the NIL corpus, into the two approaches. We run them on test sets #2, #3 and #4 with the estimated threshold values. We again use precision, recall and F measure to evaluate the performance of the two approaches.</Paragraph> </Section> </Section> <Section position="9" start_page="52" end_page="53" type="evalu"> <SectionTitle> 5.4.2 Results </SectionTitle> <Paragraph position="0"> The experiment results on test sets #2, #3 and #4 are presented in Tables 7~9, respectively.</Paragraph> <Paragraph position="1"> We first compare the two approaches under different running configurations. All conclusions drawn in experiment II still hold for experiment III, namely: i) the entropy-based approach slightly outperforms the confidence-based approach in most cases; ii) both approaches perform better with word trigrams than with POS tag trigrams; iii) both approaches perform best on CNGIGA with the word trigram model, but with the POS tag trigram model CNTB produces the best results.</Paragraph> <Paragraph position="2"> An interesting comparison of the F measure between the approaches in experiment II and experiment III on test set #2 is shown in Figure 1 (the left two columns). Generally, the F measure of anomaly detection with both approaches under the word trigram model improves when the NIL corpus is incorporated. Tables 7~9 reveal that the same observation holds with the POS tag trigram model.</Paragraph> <Paragraph position="5"> We compare the F measure of the approaches with the word trigram model in experiment III on test sets #2, #3 and #4 in Figure 1 (the right three columns).
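The precision, recall and F measure used throughout these comparisons can be computed as in the following minimal sketch; the set-based evaluation over detected anomalous terms is an assumption for illustration, not the paper's exact protocol.

```python
def prf(gold, predicted):
    # Precision, recall and F1 over a gold set of anomalous terms
    # and the set of terms the detector flagged.
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f
```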
The graph in Figure 1 shows that the F measures on the three test sets are very close to each other.</Paragraph> <Paragraph position="7"> This is also true for the approaches with the POS tag trigram model, as shown in Tables 7~9. This provides evidence for the argument that the approaches can produce stable performance with the NIL corpus. In contrast, as reported in (Xia et al., 2005a), the performance achieved with the SVM classifier is rather unstable: it performs poorly with training set C#1, which contains BBS text posted several months earlier, but much better with training set C#5, which contains the latest chat text.</Paragraph> <Paragraph position="8"> Figure 1: F measure of the approaches with the word trigram model on test sets #2, #3 and #4 in experiments II and III.</Paragraph> <Paragraph position="9"> We finally compare the performance of our approaches against the one described in (Xia et al., 2005a). The best F measure achieved in our work, i.e. 0.853, is close to the best one in their work, i.e. 0.871 with training corpus C#5. This supports the argument that our approaches can produce performance equivalent to the best achieved by existing approaches.</Paragraph> </Section> </Paper>