<?xml version="1.0" standalone="yes"?>
<Paper uid="J00-1002">
  <Title>Stoyan Mihov t Bulgarian Academy of Sciences</Title>
  <Section position="6" start_page="14" end_page="14" type="relat">
    <SectionTitle>
5. Related Work
</SectionTitle>
    <Paragraph position="0"> An algorithm described by Revuz (1991) also constructs a dictionary from sorted data while performing a partial minimization on the fly. The data is sorted in reverse order, and that property is used to compress the endings of words within the dictionary as it is being built. This is called pseudominimization, and it must be followed by a true minimization phase. That phase still involves finding an equivalence relation over all of the states of the pseudominimal dictionary, although its time complexity can be reduced somewhat by using knowledge of the pseudominimization process. Unsorted data can also be used, but it produces a much bigger dictionary in the first stage of processing. Although this pseudominimization technique is more economical in its use of memory than traditional techniques, it still leaves a subminimal dictionary that can be 8 times larger than the equivalent minimal dictionary (Revuz [1991, page 33], reporting on the DELAF dictionary).</Paragraph>
    <Paragraph position="1"> Recently, a semi-incremental algorithm was described by Watson (1998) at the Workshop on Implementing Automata. That algorithm requires the words to be sorted in any order of decreasing length (this sorting process can be done in linear time), and takes advantage of automata properties similar to those presented in this paper.</Paragraph>
    <Paragraph position="2"> In addition, the algorithm requires a final minimization phase after all words have been added. For this reason, it is only semi-incremental: it does not maintain full minimality while adding words, although it usually keeps the automaton close enough to minimality for practical applications.</Paragraph>
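    <Paragraph position="3"> The linear-time sort by decreasing length required by the algorithm of Watson (1998) can be illustrated with a bucket sort on word length. The sketch below is only an illustration and is not taken from Watson (1998): the function name sort_by_decreasing_length is hypothetical, and the sketch assumes that the maximum word length is small relative to the number of words, so that the running time is linear in the size of the input.

```python
from typing import Iterable, List


def sort_by_decreasing_length(words: Iterable[str]) -> List[str]:
    """Bucket words by length, then emit the longest buckets first.

    Runs in O(n + L) time for n words with maximum length L. Words of
    equal length keep their input order, which is acceptable because
    any order of decreasing length is allowed.
    """
    words = list(words)
    if not words:
        return []
    longest = max(len(w) for w in words)
    # buckets[k] collects every word of length k.
    buckets: List[List[str]] = [[] for _ in range(longest + 1)]
    for w in words:
        buckets[len(w)].append(w)
    ordered: List[str] = []
    # Walk the buckets from the longest length down to zero.
    for length in range(longest, -1, -1):
        ordered.extend(buckets[length])
    return ordered
```

No character comparisons are performed, so the words need not be in lexicographic order within a given length.</Paragraph>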
  </Section>
</Paper>