<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1024"> <Title>Efficient Parsing of Highly Ambiguous Context-Free Grammars with Bit Vectors</Title> <Section position="9" start_page="0" end_page="0" type="evalu"> <SectionTitle> 8 Experiments </SectionTitle> <Paragraph position="0"> The parser was tested with a grammar containing 65,855 rules and 4,444 different categories. The grammar was extracted from a version of the Penn treebank that was annotated with additional features similar to those of Klein and Manning (2003b). The average rule length is 3.7 (excluding the parent category). The experiments were conducted on a Sun Blade 1000 Model 2750 server with 750 MHz CPUs and 4 GB memory.</Paragraph> <Paragraph position="1"> In a first experiment, 1000 randomly selected sentences from the Penn treebank containing 24,595 tokens were parsed. Viterbi parsing of these sentences took 27,596 seconds (1.14 seconds per word). The generation of parse forests for the same sentences took 26,840 seconds (1.09 seconds per word).</Paragraph> <Paragraph position="2"> In another experiment, we examined how parse times increase with sentence length. Figure 9 shows the average Viterbi parse times of BitPar for randomly selected sentences of different lengths. For comparison, the average parse times of the LoPar parser (Schmid, 2000) on the same data are also shown. LoPar is a 1-pass left-corner chart parser which computes the Viterbi parse from the parse forest. BitPar is faster for all sentence lengths, and its parse times grow more slowly with sentence length than those of LoPar. Although the asymptotic runtime complexity of BitPar is cubic, Figure 9 shows that the exponent of the actual growth function for sentence lengths between 4 and 50 is about 2.6. This can be explained by the fact that the bit-vector operations become more effective as the length of the sentence, and therefore the length of the bit-vectors, increases.</Paragraph> <Paragraph position="3"> ... by a high processor load. The experiment will be repeated for the final version of the paper.</Paragraph> <Paragraph position="4"> The memory requirements of BitPar are far lower than those of LoPar. LoPar needs about 1 GB of memory to parse sentences of length 22, whereas BitPar allocates 180 MB during parse forest generation and 55 MB during Viterbi parsing. For the longest sentence in our 1000-sentence test corpus, which has length 55, BitPar needed 113 MB to generate the Viterbi parse and 3,185 MB to compute the parse forest.</Paragraph> <Paragraph position="5"> LoPar was unable to parse sentences of this length. We are planning to evaluate the influence of the different optimisations presented in the paper on parsing speed and to compare BitPar with parsers other than LoPar.</Paragraph> </Section> </Paper>