<?xml version="1.0" standalone="yes"?>
<Paper uid="J04-2005">
  <Title>(c) 2004 Association for Computational Linguistics Squibs and Discussions Comments on &quot;Incremental Construction and Maintenance of Minimal Finite-State Automata,&quot; by Rafael C. Carrasco and Mikel</Title>
  <Section position="5" start_page="234" end_page="234" type="metho">
    <SectionTitle>
5. Evaluation
</SectionTitle>
    <Paragraph position="0"> Two experiments were performed to compare the new algorithm with the algorithm for adding strings to a minimal, deterministic, cyclic automaton presented in Carrasco and Forcada (2002). Each experiment used two sets of words to build a cyclic automaton that recognized any sequence of words from the first set and any single word from the second set. The first set was used to construct an initial cyclic automaton recognizing any sequence of words from that set. The second set was then used to measure the relative speed of the algorithms being compared. In the first experiment, the first set consisted of German words beginning with Latin letters from A to M, and the second set consisted of German words beginning with letters from N to Z. This was the &quot;easier&quot; task, since only the initial state of the automaton had to be cloned. In the second experiment, odd-numbered German words beginning with letters A to Z formed the first set, and even-numbered ones formed the second set. In this task, many paths in the automaton were shared between words from both sets. A total of 69,669 German words were used in the experiments.</Paragraph>
    <Paragraph position="1"> In the first experiment, the new algorithm was 4.96 times faster than the algorithm of Carrasco and Forcada (2002), and in the second experiment, 2.53 times faster. Most of the speedup was not the result of using an algorithm optimized for sorted data: an improved version of the algorithm for adding strings in Carrasco and Forcada (2002) that avoids unnecessary cloning of prefix states (as described in section 3.2 and mentioned on page 215 of Carrasco and Forcada [2002] as a suggestion from one of their reviewers) was 3.12 and 2.35 times faster than the original algorithm in the first and second experiments, respectively. However, the new algorithm is still the fastest.</Paragraph>
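    <Paragraph position="2"> Since every factor reported above is measured against the same baseline (the original algorithm of Carrasco and Forcada [2002]), dividing one factor by another gives the speed of the new algorithm relative to the improved original algorithm. The following is a minimal sketch of that arithmetic only; the names REPORTED, relative_speedup, and summarize are hypothetical and do not come from the paper, and no timing is performed.

```python
# Speedup factors reported in the text. All of them are relative to the
# original string-addition algorithm of Carrasco and Forcada (2002).
# The dictionary layout and key names are illustrative assumptions.
REPORTED = {
    "experiment 1": {"new": 4.96, "improved_original": 3.12},
    "experiment 2": {"new": 2.53, "improved_original": 2.35},
}


def relative_speedup(new, improved):
    """Return the speedup of the new algorithm over the improved original.

    Both arguments are speedups over the same baseline, so their quotient
    is the speedup of one algorithm over the other.
    """
    return new / improved


def summarize(reported):
    """Map each experiment name to its relative speedup, rounded to 2 places."""
    return {
        name: round(relative_speedup(f["new"], f["improved_original"]), 2)
        for name, f in reported.items()
    }


if __name__ == "__main__":
    for name, ratio in summarize(REPORTED).items():
        print(f"{name}: new algorithm is {ratio:.2f} times faster "
              "than the improved original algorithm")
```

 On these figures, the new algorithm is about 1.59 times faster than the improved original algorithm in the first experiment and about 1.08 times faster in the second.</Paragraph>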
  </Section>
</Paper>