<?xml version="1.0" standalone="yes"?>
<Paper uid="P99-1061">
<Title>A Bag of Useful Techniques for Efficient and Robust Parsing</Title>
<Section position="12" start_page="478" end_page="478" type="concl">
<SectionTitle>10 Conclusions and Further Work</SectionTitle>
<Paragraph position="0"> The collection of methods described in this paper has enabled us to unite deep linguistic analysis with speech processing. The overall speed-up compared to the original system is a factor of about 10 to 25. Below we present some absolute timings to give an impression of the current system's performance.</Paragraph>
<Paragraph position="1"> [Table only partially recoverable: the surviving row, time overall, lists mean per-sentence CPU times of 4.53 s, 1.38 s, and 4.42 s for the three grammars.] In the table, the last six rows are average values per sentence; time first and time overall are the mean CPU times to compute the first result and the whole search space, respectively. # lex. entries and # chart items give an impression of the lexical and syntactic ambiguity of the respective grammars.4 The German and Japanese corpora and half of the English corpus consist of transliterations of spoken dialogues used in the VERBMOBIL project. These are real-world dialogues about appointment scheduling and vacation planning, and they contain a variety of syntactic as well as spontaneous-speech phenomena. The remaining half of the English corpus is taken from a manually constructed test suite, which may explain some of the differences in absolute parse time. [Footnote 4: The computations were made using a 300 MHz Sun UltraSparc 2 under Solaris 2.5. The whole system is programmed in Franz Allegro Common Lisp.]</Paragraph>
<Paragraph position="2"> Most of the methods are corpus-independent. The exceptions are the quick-check filter, which requires a training corpus, and the use of a purely conjunctive grammar, which will do worse on massively ambiguous input because there is currently no ambiguity packing in the parser. For the quick check, we have observed that a random subset of the corpora of about one to two hundred sentences is enough to obtain a filter with a nearly optimal filter rate. Although the actual efficiency gain will vary for differently implemented grammars, we are certain that these techniques will lead to substantial improvements in almost every unification-based system.</Paragraph>
<Paragraph position="3"> It is, for example, quite unlikely that unification failures are equally distributed over the different nodes of a grammar's feature structures; such a skewed failure distribution is exactly the prerequisite for the quick-check filter to work. Avoiding disjunctions usually requires a reworking of the grammar, but this effort will pay off in the end.</Paragraph>
<Paragraph position="4"> We have shown that the combination of algorithmic methods with some discipline in grammar writing can lead to a practical high-performance analysis system, even with large general grammars for different languages.</Paragraph>
<Paragraph position="5"> There is, however, room for further improvement. We intend to generalize the technique for removing unnecessary lexical items to other cases. A detailed investigation of the quick-check method and its interaction with the rule application filter is planned for the near future.
Since almost all failing unifications are avoided through the use of filtering techniques, we will now focus on methods to reduce the number of chart items that do not contribute to any analysis, for instance by computing context-free or regular approximations of the HPSG grammars (e.g., Nederhof, 1997).</Paragraph>
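To make the quick-check mechanism concrete, here is a minimal sketch of such a pre-unification filter in Python, assuming a simple dict-based encoding of typed feature structures and a precomputed greatest-lower-bound table for the type hierarchy. All names (value_at, quick_check, glb_table, collect_qc_paths) are illustrative, not taken from the actual system, which is implemented in Common Lisp.

    # Sketch of a quick-check filter: before attempting a full unification,
    # compare the types found at a small set of paths that most frequently
    # caused unification failures on a training corpus.

    from collections import Counter

    def value_at(fs, path):
        """Follow a feature path (a tuple of feature names) into a nested
        dict-based feature structure; return the type at its end, or None
        if the structure does not contain the path."""
        node = fs
        for feature in path:
            node = node.get(feature)
            if node is None:
                return None
        return node.get("TYPE")

    def quick_check(fs1, fs2, qc_paths, glb_table):
        """Return False as soon as some checked path carries two types with
        no greatest lower bound; only if every check passes do we pay the
        cost of a full unification."""
        for path in qc_paths:
            t1 = value_at(fs1, path)
            t2 = value_at(fs2, path)
            # A missing path constrains nothing and cannot cause a failure.
            if t1 is not None and t2 is not None:
                if (t1, t2) not in glb_table:
                    return False
        return True

    def collect_qc_paths(failure_paths, k):
        """Training step: given the paths at which unifications failed while
        parsing a small sample (one to two hundred sentences suffice, per
        the observation above), keep the k most frequent failure paths."""
        counts = Counter(failure_paths)
        return [path for path, _ in counts.most_common(k)]

In a parser built along these lines, quick_check would gate every candidate unification during rule application; the achievable filter rate then depends on how well the trained qc_paths match the grammar's failure distribution, which is why the skewed distribution noted above is a prerequisite.
</Section>
</Paper>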