<?xml version="1.0" standalone="yes"?> <Paper uid="E99-1018"> <Title>POS Disambiguation and Unknown Word Guessing with Decision Trees</Title> <Section position="7" start_page="2638" end_page="2638" type="evalu"> <SectionTitle> 5 Evaluation </SectionTitle> <Paragraph position="0"> To evaluate our approach, we first partitioned the datasets described in Section 3 into training and testing sets according to the 10-fold cross-validation method. Then, (a) we found the most frequent POS in each training set and (b) we induced a decision tree from each training set.</Paragraph> <Paragraph position="1"> We then resolved the ambiguity in the testing sets with two methods: (a) we assigned the most frequent POS acquired from the corresponding training set and (b) we applied the induced decision tree.</Paragraph> <Paragraph position="2"> Table 3 summarizes the results of our experiments. In detail: Column (1) shows the percentage of word-tokens in the corpus accounted for by each ambiguity scheme and by unknown words. In total, problematic word-tokens make up 23.38% of the corpus. Column (2) shows the percentage that each ambiguity scheme contributes to the total POS ambiguity. Column (3) shows the error rates of method (a). Column (4) shows the error rates of method (b).</Paragraph> <Paragraph position="3"> To compute the total POS disambiguation error rates of the two methods (24.1% and 5.48% respectively), we weighted the per-scheme error rates by the contribution percentages shown in column (2).</Paragraph> </Section> </Paper>