<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2214"> <Title>General-to-Specific Model Selection for Subcategorization Preference*</Title> <Section position="6" start_page="1318" end_page="1320" type="evalu"> <SectionTitle> 5.3 Results </SectionTitle> <Paragraph position="0"> Table 1 shows the performance on the subcategorization preference test described in section 4.2.2 for the approximately optimal models selected by the procedure in section 4.4 (the &quot;Optimal&quot; model of the &quot;General-to-Specific&quot; method), as well as for several other models, including baseline models.</Paragraph> <Paragraph position="1"> Coverage is the rate of test instances which satisfy the case covering constraint of section 4.1.</Paragraph> <Paragraph position="2"> Accuracy is measured with the following heuristics: i) verb-noun collocations which satisfy the case covering constraint are preferred; ii) even those verb-noun collocations which do not satisfy the case covering constraint are assigned the conditional probabilities in (15) by neglecting cases which are not covered by the model. With these heuristics, subcategorization preference can be judged for all the test instances, and test set coverage becomes 100%.</Paragraph> <Paragraph position="3"> &quot;Agaru (rise)&quot;, &quot;kau (buy)&quot;, &quot;motoduku (base)&quot;, &quot;oujiru (respond)&quot;, &quot;sumu (live)&quot;, &quot;tigau (differ)&quot;, and &quot;tsunagaru (connect)&quot;.</Paragraph> <Paragraph position="4"> In Table 1, the &quot;Initial&quot; model is the one constructed according to the description in section 4.1, in which cases are independent of each other and the sense restriction of each case is (one of) the most general class(es). 
The &quot;Independent Cases&quot; model is the one obtained by removing all the case dependencies from the &quot;Optimal&quot; model, while the &quot;General Classes&quot; model is the one obtained by generalizing all the sense restrictions of the &quot;Optimal&quot; model to the most general classes. The &quot;MDL&quot; model is the one with the minimum description length; it is included to evaluate the effect of the MDL principle in the task of learning subcategorization preference models. The &quot;Optimal&quot; model of the &quot;One-by-one Feature Adding&quot; method is the one selected from the sequence of one-by-one feature adding in section 3.1 by the procedure in section 4.4.</Paragraph> <Paragraph position="5"> The &quot;Optimal&quot; model of the &quot;General-to-Specific&quot; method performs best among all the models in Table 1. In particular, it outperforms the &quot;Optimal&quot; model of the &quot;One-by-one Feature Adding&quot; method in both coverage and accuracy. As for the size of the optimal model, the average number of active features is 126 for the &quot;General-to-Specific&quot; method and 800 for the &quot;One-by-one Feature Adding&quot; method. Therefore, the general-to-specific feature selection algorithm achieves significant improvements over the one-by-one feature adding algorithm with a much smaller number of active features. The &quot;Optimal&quot; model of the &quot;General-to-Specific&quot; method also outperforms both the &quot;Independent Cases&quot; and &quot;General Classes&quot; models, and thus both the case dependencies and the specific sense restrictions selected by the proposed method contribute substantially to improving the performance in the subcategorization preference test. 
The &quot;MDL&quot; model performs worse than the &quot;Optimal&quot; model because the features of the &quot;MDL&quot; model have much more specific sense restrictions than those of the &quot;Optimal&quot; model, and the coverage of the &quot;MDL&quot; model is much lower than that of the &quot;Optimal&quot; model.</Paragraph> </Section> </Paper>
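The coverage and accuracy measures described in section 5.3 can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each test instance is a pair of candidate verb-noun collocations, one correct and one incorrect, and that a model has already assigned each candidate a flag for the case covering constraint and a conditional probability computed by neglecting uncovered cases. The names `Candidate`, `prefers_first`, and `evaluate` are hypothetical, as are the choices to count an instance as covered when its correct collocation satisfies the constraint and to count probability ties as errors.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """One verb-noun collocation as scored by a model (hypothetical structure)."""
    covered: bool  # True if it satisfies the case covering constraint
    prob: float    # conditional probability, with uncovered cases neglected


def prefers_first(a: Candidate, b: Candidate) -> bool:
    """Return True if candidate a is strictly preferred over candidate b.

    Heuristic i): a covered candidate is preferred over an uncovered one.
    Heuristic ii): otherwise the conditional probabilities are compared.
    """
    if a.covered != b.covered:
        return a.covered
    return a.prob > b.prob  # assumption: a tie is not a preference


def evaluate(pairs: Sequence[Tuple[Candidate, Candidate]]) -> Tuple[float, float]:
    """Compute (coverage, accuracy) over (correct, incorrect) candidate pairs."""
    if not pairs:
        raise ValueError("no test instances")
    n = len(pairs)
    # Assumption: an instance is covered when its correct collocation is covered.
    coverage = sum(good.covered for good, _ in pairs) / n
    # Every instance receives a judgment, so accuracy is taken over all of them.
    accuracy = sum(prefers_first(good, bad) for good, bad in pairs) / n
    return coverage, accuracy


if __name__ == "__main__":
    pairs = [
        (Candidate(True, 0.20), Candidate(True, 0.10)),   # both covered, correct wins
        (Candidate(True, 0.05), Candidate(False, 0.90)),  # covered beats uncovered
        (Candidate(False, 0.30), Candidate(False, 0.40)), # both uncovered, wrong wins
        (Candidate(False, 0.50), Candidate(True, 0.10)),  # incorrect one is covered
    ]
    print(evaluate(pairs))  # (0.5, 0.5)
```

Because heuristic ii) still assigns a probability to candidates that fail the constraint, every instance receives a judgment, which is why the text can report accuracy at 100% test set coverage.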