<?xml version="1.0" standalone="yes"?>
<Paper uid="E99-1016">
  <Title>Cascaded Markov Models</Title>
  <Section position="7" start_page="123" end_page="124" type="concl">
    <SectionTitle>
5 Conclusion and Future Work
</SectionTitle>
    <Paragraph position="0"> We have presented a new parsing model for shallow processing. The model parses by representing each layer of the resulting structure as a separate Markov Model. States represent categories of words and phrases, outputs consist of partial parse trees. Starting with the layer for part-of-speech tags, the output of lower layers is passed as input to higher layers. This type of model is restricted to a fixed maximum number of layers in the parsed structure, since the number of Markov Models is determined before parsing. While the effects of these restrictions on the parsing of sentences and VPs are still to be investigated, we obtain excellent results for the chunking task, i.e., the recognition of kernel NPs and PPs.</Paragraph>
    <Paragraph position="1"> It will be interesting to see in future work if Cascaded Markov Models can be extended to parsing sentences and VPs. The average number of layers per sentence in the NEGRA corpus is only 5; 99.9% of all sentences have 10 or less layers, thus a very limited number of Markov Models would be sufficient.</Paragraph>
    <Paragraph position="2"> Cascaded Markov Models add left-to-right context-information to context-free parsing. This contextualization is orthogonal to another important trend in language processing: lexicalization. We expect that the combination of these techniques results in improved models.</Paragraph>
    <Paragraph position="3"> We presented the generation of parameters from annotated corpora and used linear interpolation for smoothing. While we do not expect ira- null Proceedings of EACL '99 provements by re-estimation on raw data, other smoothing methods may result in better accuracies, e.g. the maximum entropy framework. Yet, the high complexity of maximum entropy parameter estimation requires careful pre-selection of relevant linguistic features.</Paragraph>
    <Paragraph position="4"> The presented Markov Models act as filters.</Paragraph>
    <Paragraph position="5"> The probability of the resulting structure is determined only based on a stochastic context-free grammar. While building the structure bottom up, parses that are unlikely according to the Markov Models are pruned. We think that a combined probability measure would improve the model. For this, a mathematically motivated combination needs to be determined.</Paragraph>
  </Section>
class="xml-element"></Paper>