<?xml version="1.0" standalone="yes"?> <Paper uid="H01-1057"> <Title>Non-Dictionary-Based Thai Word Segmentation Using Decision Trees</Title> <Section position="6" start_page="1" end_page="1" type="evalu"> <SectionTitle> 4. EXPERIMENT RESULTS </SectionTitle> <Paragraph position="0"> In our experiments, the TCC corpus is divided into five sets, four for training and one for testing. On this basis, five-fold cross-validation is performed. To test the accuracy, we trained the decision trees and tested them several times at six different levels of merging permission, defined by the certainty factor (CF). Each level is the starting level of merging permission for the strings in the second and third blocks of the buffer. Recall, precision, and accuracy for certainty factors between 50% and 100% are shown in Figure 6.</Paragraph> <Paragraph position="1"> From the result, we observed that our m satisfactory in the percentage of accuracy an and recall compared to those numbers of performance. The TCC corpus has 100% re precision, and 44.93% accuracy. Using the d from a Thai corpus, the precision improves up to 94.11and the accuracy increases up to 85.51-87 recall drops to 63.72-94.52%. For a high CF drops a little because there are few cases to m precision and accuracy improve dominantly to respectively. For a lower CF, say 50%, reca but precision and accuracy dramatically imp 85.51% respectively.</Paragraph> <Paragraph position="2"> However, from 50 to 100% CF, at approxim accuracy had declined. The reason to this decl very high level of merging permission, there are removing '|' because of the %CF at those leaves are l this permission level. Therefore, there are mo word segmentation, which lead to decrease accu conclusion, the appropriate level of merging pe used in order to achieve high accuracy. 
The best permission level is approximately 70%, where the recall is 96.13%, the precision is 91.92%, and the accuracy is 87.41%.</Paragraph> </Section> </Paper>