File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/97/w97-0121_concl.xml

Size: 2,150 bytes

Last Modified: 2025-10-06 13:57:51

<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0121">
  <Title>Collocation Lattices and Maximum Entropy Models</Title>
  <Section position="8" start_page="229" end_page="229" type="concl">
    <SectionTitle>
8 Conclusion
</SectionTitle>
    <Paragraph position="0"> In this paper we presented a novel approach for building maximum entropy models. Our approach uses a feature collocation lattice and selects the atomic features without resorting to iterative scaling. After the atomic features have been selected, we use iterative scaling to compute a fully saturated model for the maximal constraint space and then start to eliminate the most specific constraints. Since we have a fully fitted maximum entropy model at every point during constraint deselection, we rank the constraints on the basis of their weights in the model. Therefore we do not need iterative scaling for constraint ranking and apply it only for linear model regression.</Paragraph>
    <Paragraph position="1"> Another important improvement is that, since the smaller model deviates from the previous larger model only in a small number of constraints, we use the parameters of the old model as the initial values of the parameters for the iterative scaling of the new one. This reduced the number of required iterations roughly tenfold. We applied the described method to several language modelling tasks and demonstrated its feasibility for selecting and building models with a complexity of tens of thousands of constraints. A potential drawback of our approach is that it requires building a maximum entropy model for the whole observed feature space, which might not be feasible for applications with hundreds of thousands of features. One direction for our future work is therefore to find efficient ways of decomposing the feature lattice into non-overlapping sub-lattices, which can then be handled by our method. Another avenue for further improvement is to introduce the &quot;or&quot; operation on the nodes of the lattice. This can provide a further generalization over the features employed by the model.</Paragraph>
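    <Paragraph position="2"> The reuse of old parameters as starting values can be illustrated with a small sketch. It is not the authors' implementation: it assumes a joint model over three binary atomic features a, b, c fitted with Generalized Iterative Scaling, and the function name fit_gis, the toy counts and the tolerance are hypothetical choices made only for illustration. The larger model contains a, b, c and the conjunction of a and b; the smaller model drops that most specific constraint and is fitted once from zero weights and once from the weights of the larger model.

```python
import math
from itertools import product


def fit_gis(features, empirical, init=None, tol=1e-8, max_iter=100000):
    """Fit a joint maximum entropy model with Generalized Iterative Scaling.

    features[x][j] is the 0/1 value of feature j on event x,
    empirical[x] is the observed probability of event x,
    init optionally gives starting weights (slack weight last).
    Returns (weights, distribution, number_of_updates).
    """
    # GIS needs the features of every event to sum to one constant C,
    # so a slack feature is appended to each row.
    C = max(sum(row) for row in features) + 1
    rows = [list(row) + [C - sum(row)] for row in features]
    n_feat = len(rows[0])
    # Empirical expectation of each feature: the constraints to satisfy.
    target = [sum(p * row[j] for p, row in zip(empirical, rows))
              for j in range(n_feat)]
    weights = list(init) if init is not None else [0.0] * n_feat

    def distribution(w):
        scores = [math.exp(sum(wj * fj for wj, fj in zip(w, row)))
                  for row in rows]
        z = sum(scores)
        return [s / z for s in scores]

    for iteration in range(1, max_iter + 1):
        probs = distribution(weights)
        expected = [sum(p * row[j] for p, row in zip(probs, rows))
                    for j in range(n_feat)]
        # Stop once every constraint is met to within tol.
        if tol >= max(abs(t - e) for t, e in zip(target, expected)):
            return weights, probs, iteration - 1
        # Standard GIS update of every weight.
        weights = [w + math.log(t / e) / C
                   for w, t, e in zip(weights, target, expected)]
    return weights, distribution(weights), max_iter


# Toy event space: all value combinations of three binary atoms (a, b, c).
events = list(product([0, 1], repeat=3))
counts = [30, 10, 12, 8, 14, 6, 12, 8]  # hypothetical observation counts
total = sum(counts)
empirical = [c / total for c in counts]

# Larger model: atoms plus the conjunction of a and b.
large = [[a, b, c, a * b] for a, b, c in events]
# Smaller model: the most specific constraint (a and b) is removed.
small = [[a, b, c] for a, b, c in events]

w_large, _, _ = fit_gis(large, empirical)
# Warm start: keep the old weights of a, b, c and the slack feature,
# dropping only the weight of the removed conjunction (index 3).
warm_init = w_large[:3] + w_large[4:]

_, p_cold, cold_iters = fit_gis(small, empirical)
_, p_warm, warm_iters = fit_gis(small, empirical, init=warm_init)
print("cold-start updates:", cold_iters)
print("warm-start updates:", warm_iters)
```

Both runs reach the same distribution on this toy problem. Any saving in updates from the warm start depends on the data, and the sketch makes no attempt to reproduce the reduction reported above.</Paragraph>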
  </Section>
</Paper>