<?xml version="1.0" standalone="yes"?>
<Paper uid="A97-1023">
<Title>Techniques for Accelerating a Grammar-Checker</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> This paper describes an efficiency-supporting tool for one of the two grammar-checker technologies developed in the framework of the PECO2824 Joint Research Project sponsored by the European Union.</Paragraph>
<Paragraph position="1"> The project, covering Bulgarian and Czech, two free-word-order languages from the Slavic family, was performed between January 1993 and mid 1996 by a consortium consisting of both academic and industrial partners.</Paragraph>
<Paragraph position="2"> The basic philosophy of the technology discussed in this paper (as for the alternative technology, cf. Holan, Kuboň, and Plátek, 1997) is that of linguistic-theoretically sound grammar-and-parsing-based machinery able to detect, by constraint relaxation, errors from a predefined set (as opposed to pattern-matching approaches, which do not seem promising for a free-word-order language). The core of the system (broad-coverage HPSG-based grammars of Bulgarian and Czech, and a single language-independent parser) was developed in the first three years of the project and was then passed to the industrial partners Bulgarian Business System IMC Sofia and Macron Prague, Ltd. While the Bulgarian system remained in more or less a demonstrator stage only, the Czech one satisfied Macron's requirements as to syntactic coverage. However, Macron expressed serious worries about the speed of the system, should this be really introduced to the market. Following this, several possibilities of using finite-state automata (FSA) as means for speeding up the performance of the system were designed, developed and implemented, in particular:
* for detecting sentences where none of the predefined errors can occur (thus ruling out such sentences from the procedure of error-search proper)
* for detecting which one(s) of the predefined error types might possibly occur in a particular sentence (hence, cutting down the search space of the error-search proper)
* for detecting errors which are of such a nature that their occurrence might be discovered by a machinery simpler than full-fledged parsing with constraint relaxation
* for splitting (certain cases of) complex sentences into independent clauses, allowing thus for the error-detection to be performed on shorter strings.</Paragraph>
</Section>
</Paper>
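<!--
The first two uses of finite-state automata listed in the introduction can be
illustrated with a minimal sketch. It is not the system described in the paper:
the tag names (ADJ, NOUN, VERB, ADV), the two error types and the rule that an
error is possible only when all of its trigger tags are present are assumptions
made purely for illustration. Each error type gets a small deterministic
automaton whose states are the subsets of trigger tags seen so far, so word
order plays no role.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical error types and the tag classes that must all be present
# before the error can occur at all (illustrative, not taken from the paper).
ERROR_TRIGGERS: Dict[str, FrozenSet[str]] = {
    "adjective_noun_agreement": frozenset({"ADJ", "NOUN"}),
    "subject_verb_agreement": frozenset({"NOUN", "VERB"}),
}


def run_automaton(tags: List[str], required: FrozenSet[str]) -> bool:
    """Run a DFA whose states are the subsets of `required` seen so far.

    The start state is the empty set; the only accepting state is
    `required` itself. The order of the input tags is irrelevant.
    """
    state: FrozenSet[str] = frozenset()
    for tag in tags:
        if tag in required:
            state = state | {tag}  # move to the larger subset
        if state == required:
            return True  # accepting state is absorbing, stop early
    return state == required


def possible_errors(tags: List[str]) -> Set[str]:
    """Return the error types that the full error search still has to check."""
    return {
        name
        for name, required in ERROR_TRIGGERS.items()
        if run_automaton(tags, required)
    }


if __name__ == "__main__":
    # No trigger set is complete: the sentence could skip the error search.
    print(possible_errors(["ADV", "VERB"]))
    # Only one error type remains: the search space is reduced.
    print(possible_errors(["NOUN", "ADV", "VERB"]))
```

Under these assumptions, an empty result corresponds to the first use (the
sentence is ruled out of the error search), and a non-empty result to the
second (only the listed error types are passed on to the parser).
-->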