<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2016"> <Title>Markov model</Title> <Section position="8" start_page="124" end_page="126" type="evalu"> <SectionTitle> 5 Evaluation </SectionTitle> <Paragraph position="0"> The first evaluation presents preliminary evidence that the merged hierarchical hidden Markov model (MHHMM) produces more accurate results than either a plain HHMM or an HMM on the text chunking task. The results suggest that the partial flattening process can improve model accuracy when the input data contains complex hierarchical structures. The evaluation analyses results over two data sets. The first is a selection of data from CoNLL-2004 and contains 8936 sentences. The second is part of the Lancaster Treebank corpus and contains 1473 sentences. Each sentence contains hand-labeled syntactic roles for natural language text.</Paragraph> <Paragraph position="1"> [Figure 6: micro-average F-measure against the number of training sentences during text chunking (A: MHHMM, B: HHMM and C: HMM)] The first finding is that the size of the training data dramatically affects prediction accuracy. A model trained on an insufficient number of observations typically has poor accuracy. In the text chunking task the number of observation symbols depends on the number of part-of-speech tags in the training data. Figure 6 plots the micro-average F-measure of the three model types (A: MHHMM, B: HHMM and C: HMM) under 10-fold cross validation, with the number of training sentences ranging from 200 to 1400. The results show that the MHHMM outperforms both the HHMM and the HMM in accuracy, although the difference is less marked for the latter.</Paragraph> <Paragraph position="2"> [Table: time taken for testing (in seconds) for the 10-fold cross validation.] 
The tests were carried out on a dual P4-D computer running at 3 GHz with 1 GB of RAM.</Paragraph> <Paragraph position="3"> The results indicate that the MHHMM gains efficiency, in terms of computation cost, by merging repeated sub-models, resulting in fewer states in the model. In contrast, the HMM has lower efficiency because it must identify every single path, leading to more states within the model and higher computation cost. The extra cost of constructing an HHMM, which has the same number of production states as the HMM, makes it the least efficient.</Paragraph> <Paragraph position="4"> The second evaluation presents preliminary evidence that the partially flattened hierarchical hidden Markov model (PFHHMM) can assign propositions to language texts (grammar parsing) at least as accurately as the HMM. This is a task to which HHMMs are generally not well suited. Table 2 shows the F1-measures of the semantic roles identified by each model on the Lancaster Treebank data set. The models used in this evaluation were trained with observation data from the Lancaster Treebank training set. The training set and testing set were sub-divided from the corpus in proportions of 2/3 and 1/3. The PFHHMMs had extra training conditions as follows: PFHHMM obs 2000 made use of the partial flattening process, with the high-dependency parameter determined by taking the highest 2000 dependency values from the observation sequences of the corpus. PFHHMM state 150 again uses partial flattening, but this time the highest 150 dependency values from the state sequences were used to determine the high-dependency threshold. 
The n values of 2000 and 150 were determined to be optimal when applied to the training set.</Paragraph> <Paragraph position="5"> The results show that applying the partial flattening process to a model, using observation sequences to determine high dependency values, reduces the complexity of the model's hierarchy and consequently improves the model's accuracy. The state-dependency method is shown to be less favorable for this particular task, but its micro-average result is still comparable with the HMM's performance.</Paragraph> <Paragraph position="6"> The results also show no significant relationship between the occurrence count of a state and the prediction accuracy of the various models.</Paragraph> </Section> </Paper>
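Both evaluations report micro-averaged F-measures, in which counts are pooled across all label types before precision and recall are computed, so frequent labels dominate the score. The sketch below illustrates this metric over labelled spans; it is not code from the paper, and the (start, end, label) span representation is an assumption made for illustration.

```python
def micro_f1(gold_spans, pred_spans):
    """Micro-averaged F1 over labelled spans.

    True positives, false positives and false negatives are pooled
    across all label types before computing precision and recall.
    Spans are (start, end, label) tuples -- an illustrative
    representation, not the paper's own data format.
    """
    gold, pred = set(gold_spans), set(pred_spans)
    tp = len(gold & pred)   # spans predicted exactly right
    fp = len(pred) - tp     # predicted spans not in the gold standard
    fn = len(gold) - tp     # gold spans that were missed
    precision = tp / (tp + fp) if pred else 0.0
    recall = tp / (tp + fn) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical chunking output: 2 of 3 spans match exactly on each side,
# so precision = recall = 2/3 and micro-F1 = 2/3.
gold = [(0, 2, "NP"), (3, 5, "VP"), (6, 8, "PP")]
pred = [(0, 2, "NP"), (3, 4, "VP"), (6, 8, "PP")]
print(round(micro_f1(gold, pred), 4))  # 0.6667
```

A macro-averaged variant would instead compute F1 per label type and average the results, giving rare labels equal weight; the paper's tables report the micro-averaged form.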