<?xml version="1.0" standalone="yes"?>
<Paper uid="J00-1003">
  <Title>Practical Experiments with Regular Approximation of Context-Free Languages</Title>
  <Section position="7" start_page="39" end_page="42" type="concl">
    <SectionTitle>
7. Conclusions
</SectionTitle>
    <Paragraph position="0"> If we apply the finite automata with the intention of filtering out incorrect sentences, for example from the output from a speech recognizer, then it is allowed that a  certain percentage of ungrammatical input is recognized. Recognizing ungrammatical input merely makes filtering less effective; it does not affect the functionality of the system as a whole, provided we assume that the grammar specifies exactly the set of sentences that can be successfully handled by a subsequent phase of pro- null Number of recognized sentences divided by number of grammatical sentences. cessing. Also allowed is that &amp;quot;pathological&amp;quot; grammatical sentences are rejected that seldom occur in practice; an example are sentences requiring multiple levels of selfembedding. null Of the methods we considered that may lead to rejection of grammatical sentences, i.e., the subset approximations, none seems of much practical value. The most serious problem is the complexity of the construction of automata from the compact representation for large grammars. Since the tools we used for obtaining the minimal  Nederhof Experiments with Regular Approximation deterministic automata are considered to be of high quality, it seems unlikely that alternative implementations could succeed on much larger grammars, especially considering the sharp increases in the sizes of the automata for small increases in the size of the grammar. Only LC2 could be applied with relatively few resources, but this is a very crude approximation, which leads to rejection of many more sentences than just those requiring self-embedding.</Paragraph>
    <Paragraph position="1"> Similarly, some of the superset approximations are not applicable to large grammars because of the high costs of obtaining the minimal deterministic automata. Some others provide rather large languages, and therefore do not allow very effective illtering of ungrammatical input. One method, however, seems to be excellently suited for large grammars, namely, the RTN method, considering both the unparameterized version and the parameterized version with d = 2. In both cases, the size of the automaton grows moderately in the grammar size. For the unparameterized version, the compact representation also grows moderately. Furthermore, the percentage of recognized sentences remains close to the percentage of grammatical sentences. It seems therefore that, under the conditions of our experiments, this method is the most suitable regular approximation that is presently available.</Paragraph>
  </Section>
class="xml-element"></Paper>