File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/97/j97-3004_metho.xml
Size: 64,514 bytes
Last Modified: 2025-10-06 14:14:29
<?xml version="1.0" standalone="yes"?> <Paper uid="J97-3004"> <Title>An Efficient Implementation of the Head-Corner Parser</Title> <Section position="3" start_page="428" end_page="440" type="metho"> <SectionTitle> 2. A Specification of the Head-Corner Parser </SectionTitle> <Paragraph position="0"> Head-corner parsing is a radical approach to head-driven parsing in that it gives up the idea that parsing should proceed from left to right. Rather, processing in a head-corner parser is bidirectional, starting from a head outward (island-driven). A head-corner parser can be thought of as a generalization of the left-corner parser (Rosenkrantz and Lewis 1970; Matsumoto et al. 1983; Pereira and Shieber 1987). As in the left-corner parser, the flow of information in a head-corner parser is both bottom-up and topdown. null In order to explain the parser, I first introduce some terminology. I assume that grammars are defined in the Definite Clause Grammar formalism (Pereira and Warren 1980). Without any loss of generality I assume that no external Prolog calls (the ones that are defined within { and }) are used, and that all lexical material is introduced in rules that have no other right-hand-side members (these rules are called lexical Computational Linguistics Volume 23, Number 3 goal .</Paragraph> <Paragraph position="1"> lex Figure 1 The head-corner parser.</Paragraph> <Paragraph position="2"> g? l goal entries). The grammar thus consists of a set of rules and a set of lexical entries* For each rule an element of the right-hand side is identified as the head of that rule. The head-relation of two categories h, m holds with respect to a grammar iff the grammar contains a rule with left-hand side m and head daughter h. The relation head-corner is the reflexive and transitive closure of the head relation.</Paragraph> <Paragraph position="3"> The basic idea of the head-corner parser is illustrated in Figure 1. The parser selects a word (1), and proves that the category associated with this word is the head-corner of the goal. To this end, a rule is selected of which this category is the head daughter* Then the other daughters of the rule are parsed recursively in a bidirectional fashion: the daughters left of the head are parsed from right to left (starting from the head), and the daughters right of the head are parsed from left to right (starting from the head). The result is a slightly larger head-corner (2). This process repeats itself until a head-corner is constructed that dominates the whole string (3).</Paragraph> <Paragraph position="4"> Note that a rule is triggered only with a fully instantiated head daughter. The generate-and-test behavior discussed in the previous section (examples 1 and 2) is avoided in a head-corner parser, because in the cases discussed there, the rule would be applied only if the VP is found, and hence arg is instantiated. For example if arg = np(sg3, \[\] ,Subj), the parser continues to search for a singular NP, and need not consider other categories* To make the definition of the parser easier, and to make sure that rules are indexed appropriately, grammar rules are represented by the predicate headod_rulo/4 in which the first argument is the head of the rule, the second argument is the mother node of the rule, the third argument is the reversed list of daughters left of the head, and the fourth argument is the list of the daughters right of the head. 
1 This representation of a grammar will in practice be compiled from a friendlier notation* As an example, the DCG rule x(A,E) --> a(A), b(B,A), x(C,B), d(C,D), e(D,E).</Paragraph> <Paragraph position="5"> of which the third daughter constitutes the head, is represented now as: headed_rule(x(C,B), x(A,E), \[b(B,A), a(A)\], \[d(C,D), e(D,E)\]).</Paragraph> <Paragraph position="6"> It is assumed furthermore that lexical lookup has been performed already by another module. This module has asserted clauses for the predicate lexieal_analysis/3 where the first two arguments are the string positions and the third argument is the</Paragraph> <Paragraph position="8"> Small from QO-Q is a head-corner of Cat from PO-P where PO-P occurs within EO-E</Paragraph> <Paragraph position="10"> % there are categories LeftDs from QO to Q s.t. RevLeftDs is reverse of LeftDs, and E0=<~0.</Paragraph> <Paragraph position="11"> parse left_ds(\[\],Q,Q,_).</Paragraph> <Paragraph position="12"> parse left_ds(\[HIT\],QO,Q,EO) &quot;parse(H,QI,Q,E0,Q), parse_left_ds(T,Q0,QI,E0). % parse_right_ds(+RightDs,+Q0,-Q,+E) % there are categories RightDs from Q0 to Q s.t. Q =< E. parse_right_ds(\[\],Q,Q,_).</Paragraph> <Paragraph position="13"> parse right_ds(\[HIT\],Q0,Q,E) &quot;parse(H,QO,Ql,@0,E), parse_right ds(T,QI,Q,E).</Paragraph> <Paragraph position="15"> smaller_equal(EO,QO), smaller_equal(Q,E).</Paragraph> <Paragraph position="16"> Figure 2 Definite clause specification of the head-corner parser. (lexical) category. For an input sentence Timeyqies like an arrow this module may produce the following set of clauses: lexical_analysis (0, I, verb). (5) lexical_analysis (0,2 ,noun).</Paragraph> <Paragraph position="17"> lexieal_analysis (i, 2, verb).</Paragraph> <Paragraph position="18"> lexical_analysis (2,3, verb).</Paragraph> <Paragraph position="19"> lexical_analysis (4,5, noun).</Paragraph> <Paragraph position="20"> A simple definite-clause specification of the head-corner parser is given in Figure 2. The predicate visible to the rest of the world will be the predicate parse/3. This lexical_analysis(O,l,noun).</Paragraph> <Paragraph position="21"> lexical_analysis(1,2,noun).</Paragraph> <Paragraph position="22"> lexical_analysis(2,3,prep).</Paragraph> <Paragraph position="23"> lexical_analysis(3,4,det).</Paragraph> <Paragraph position="24"> Computational Linguistics Volume 23, Number 3 predicate is defined in terms of the parse/5 predicate. The extra arguments introduce a pair of indices representing the extreme positions between which a parse should be found. This will be explained in more detail below. A goal category can be parsed if a predicted lexical category can be shown to be a head-corner of that goal. The head-corner predicate constructs (in a bottom-up fashion) larger and larger head-corners. To parse a list of daughter categories, each daughter category is parsed in turn. A predicted category must be a lexical category that lies somewhere between the extreme positions. The predicate smaller_equal is true if the first argument is a smaller or equal integer than the second. The use of the predicates head_link and lex_head_link is explained below.</Paragraph> <Paragraph position="25"> Note that unlike the left-corner parser, the head-corner parser may need to consider alternative words as a possible head-corner of a phrase, for example, when parsing a sentence that contains several verbs. 
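For reference, the definite clause specification of Figure 2 can be spelled out as follows. This is a reconstruction rather than a verbatim copy of the figure: the two-place head_link/2 and lex_head_link/2 relations are the ones explained below, and argument order may differ in detail from the published code.

parse(Cat, P0, P) :-
    parse(Cat, P0, P, P0, P).

% parse(?Cat, ?P0, ?P, +E0, +E)
% Cat can be found from P0 to P, where P0-P lies within E0-E.
parse(Cat, P0, P, E0, E) :-
    predict(Cat, P0, P, E0, E, Small, Q0, Q),
    head_corner(Small, Q0, Q, Cat, P0, P, E0, E).

% predict(+Cat, ?P0, ?P, +E0, +E, -Small, -Q0, -Q)
% a lexical category Small from Q0 to Q, lying within E0-E,
% is a possible head-corner of Cat.
predict(Cat, _P0, _P, E0, E, Small, Q0, Q) :-
    lex_head_link(Cat, Small),
    lexical_analysis(Q0, Q, Small),
    smaller_equal(E0, Q0),
    smaller_equal(Q, E).

% head_corner(+Small, +Q0, +Q, ?Cat, ?P0, ?P, +E0, +E)
% Small from Q0-Q is a head-corner of Cat from P0-P,
% where P0-P occurs within E0-E.
head_corner(Cat, Q0, Q, Cat, Q0, Q, _, _).
head_corner(Small, Q0, Q, Cat, P0, P, E0, E) :-
    headed_rule(Small, Mid, RevLeftDs, RightDs),
    head_link(Cat, Mid),
    parse_left_ds(RevLeftDs, QL, Q0, E0),
    parse_right_ds(RightDs, Q, QR, E),
    head_corner(Mid, QL, QR, Cat, P0, P, E0, E).

% parse_left_ds(+RevLeftDs, -Q0, +Q, +E0)
% there are categories LeftDs from Q0 to Q such that RevLeftDs is the
% reverse of LeftDs, and E0 =< Q0.
parse_left_ds([], Q, Q, _).
parse_left_ds([H|T], Q0, Q, E0) :-
    parse(H, Q1, Q, E0, Q),
    parse_left_ds(T, Q0, Q1, E0).

% parse_right_ds(+RightDs, +Q0, -Q, +E)
% there are categories RightDs from Q0 to Q such that Q =< E.
parse_right_ds([], Q, Q, _).
parse_right_ds([H|T], Q0, Q, E) :-
    parse(H, Q0, Q1, Q0, E),
    parse_right_ds(T, Q1, Q, E).

smaller_equal(P0, P) :- P0 =< P.

In particular, the call to lexical_analysis/3 in predict may succeed for several words lying within the extreme positions.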
This is a source of inefficiency if it is difficult to determine what the appropriate lexical head for a given goal category is. This problem is somewhat reduced because of: * the use of extremes * the use of top-down information</Paragraph> <Section position="1" start_page="431" end_page="431" type="sub_section"> <SectionTitle> 2.1 The Use of Extremes </SectionTitle> <Paragraph position="0"> The main difference between the head-corner parser in the previous paragraph and the left-corner parser is--apart from the head-driven selection of rules--the use of two pairs of indices, to implement the bidirectional way in which the parser proceeds through the string.</Paragraph> <Paragraph position="1"> Observe that each parse-goal in the left-corner parser is provided with a category and a left-most position. In the head-corner parser, a parse-goal is provided either with a begin or end position (depending on whether we parse from the head to the left or to the right) but also with the extreme positions between which the category should be found. In general, the parse predicate is thus provided with a category and two pairs of indices. The first pair indicates the begin and end position of the category, the second pair indicates the extreme positions between which the first pair should lie. In Figure 3 the motivation for this technique is illustrated with an example.</Paragraph> </Section> <Section position="2" start_page="431" end_page="434" type="sub_section"> <SectionTitle> 2.2 Adding Top-Down Filtering </SectionTitle> <Paragraph position="0"> 2.2.1 Category Information. As in the left-corner parser, a linking table is maintained, which represents important aspects of the head-corner relation. For some grammars, this table simply represents the fact that the HEAD features of a category and its head-corner are shared. Typically, such a table makes it possible to predict that in order to parse a finite sentence, the parser should start with a finite verb; to parse a singular noun-phrase the parser should start with a singular noun, etc.</Paragraph> <Paragraph position="1"> The table is defined by a number of clauses for the predicate head_link/2 where the first argument is a category for which the second argument is a possible headcorner. A sample linking table may be: head_link(s,verb).</Paragraph> <Paragraph position="2"> head_link( s, vp).</Paragraph> <Paragraph position="3"> head_link(pp,prep).</Paragraph> <Paragraph position="4"> head_link( X, X).</Paragraph> <Paragraph position="5"> head_link( vp, verb).</Paragraph> <Paragraph position="6"> head_link( np, noun).</Paragraph> <Paragraph position="7"> head_link(sbar, comp).</Paragraph> <Paragraph position="9"> This example illustrates how the use of two pairs of string positions reduces the number of possible lexical head-corners for a given goal. Suppose the parser predicted (for a goal category s) a category v from position 5 to 6. In order to construct a complete tree s for this head-corner, a rule is selected that dictates that a category np should be parsed to the right, starting from position 6. To parse np, the category n from 7 to 8 is predicted. Suppose furthermore that in order to connect n to np a rule is selected that requires a category adjp to the left of n. It will be clear that this category adjp should end in position 7, but can never start before position 6. 
Hence the only candidate head-corner of this phrase is to be found between 6 and 7.</Paragraph> <Paragraph position="10"> about begin and end positions, following an idea in Sikkel (1993). For example, if the goal is to parse a phrase with category sbar from position 7, and within positions 7 and 12, then for some grammars it can be concluded that the only possible lexical head-corner for this goal should be a complementizer starting at position 7. Such information is represented in the table as well. This can be done by defining the head relation as a relation between two triples, where each triple consists of a category and two indices (representing the begin and end position). The head relation ((Cm, pm, qm), (Ch, ph, qh)) holds iff there is a grammar rule with mother Cm and head Ch. Moreover, if the list of daughters left of the head of that rule is empty, then the begin positions are identical, i.e., Ph = Pro. Similarly, if the list of daughters right of the head is empty, then qh = qm. As before, the head-corner relation is the reflexive and transitive closure of the head relation.</Paragraph> <Paragraph position="11"> The previous example now becomes: head_link( s ..... verb .... ).</Paragraph> <Paragraph position="12"> head_link( s,_,P, vp,_,P).</Paragraph> <Paragraph position="13"> head_link(pp,P,_, prep,P,_).</Paragraph> <Paragraph position="14"> head_link( X,P,Q, X,P,Q).</Paragraph> <Paragraph position="15"> head_link( vp,P,_, verb,P,_).</Paragraph> <Paragraph position="16"> head_link( np ..... noun .... ).</Paragraph> <Paragraph position="17"> head_link(sbar,P,_, comp,P,_).</Paragraph> <Paragraph position="19"> Obviously, the nature of the grammar determines whether it is useful to represent such information. In order to be able to run a head-corner parser in left-corner mode, this technique is crucial. On the other hand, for grammars in which this technique does not provide any useful top-down information no extra costs are introduced either.</Paragraph> <Paragraph position="20"> Computational Linguistics Volume 23, Number 3 2.2.3 Integrating the Head-Corner Table. The linking table information is used to restrict which lexical entries are examined as candidate heads during prediction, and to check whether a rule that is selected can in fact be used to reach the current goal. To distinguish the two uses, we use the relation lex_head_link, which is a subset of the head_link relation in which the head category is a possible lexical category. An example might be the following (where we assume that the category vp is never assigned to a lexical entry), which is a subset of the table in 7.</Paragraph> <Paragraph position="21"> lex_head_link( s .....</Paragraph> <Paragraph position="22"> lex_head link( np .....</Paragraph> <Paragraph position="23"> lex_head link(sbar,P,_, verb .... ).</Paragraph> <Paragraph position="24"> noun,_,_).</Paragraph> <Paragraph position="25"> comp,P,_).</Paragraph> <Paragraph position="26"> lex_head_link(vp,P,_, verb,P,_).</Paragraph> <Paragraph position="27"> lex_head_link(pp,P,_, prep,P,_).</Paragraph> <Paragraph position="28"> lex_head_link( X,P,Q, X,P,Q).</Paragraph> <Paragraph position="29"> (8) A few potential problems arise in connection with the use of linking tables. Firstly, for constraint-based grammars of the type assumed here the number of possible non-terminals is infinite. Therefore, we generally cannot use all information available in the grammar but rather we should compute a &quot;weakened&quot; version of the linking table. 
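Recall that the head-corner relation is the reflexive and transitive closure of the direct head relation. Ignoring string positions, a table like the one above could in principle be obtained from headed_rule/4 along the following lines (a sketch only: categories are assumed to be atomic, the extension of the relation would in practice be tabulated off-line, and a cyclic head relation would require a fixpoint computation rather than this naive recursion):

% direct_head_link(?Mother, ?Head): Head is the head daughter of a
% rule whose left-hand side is Mother.
direct_head_link(Mother, Head) :-
    headed_rule(Head, Mother, _RevLeftDs, _RightDs).

% head_link(?Cat, ?Head): Head is a head-corner of Cat.
head_link(Cat, Cat).                      % reflexive closure
head_link(Cat, Head) :-                   % transitive closure
    direct_head_link(Cat, Mid),
    head_link(Mid, Head).

For the constraint-based grammars assumed here such a closure can only be computed over a weakened, finite abstraction of the categories.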
This can be accomplished, for example, by replacing all terms beyond a certain depth by anonymous variables, or by other restrictors (Shieber 1985). Secondly, the use of a linking table may give rise to spurious ambiguities. Consider the case in which the category we are trying to parse can be matched against two different items in the linking table, but in which case the predicted head-category may turn out to be the same.</Paragraph> <Paragraph position="30"> Fortunately, the memorization technique discussed in Section 3 takes care of this problem. Another possibility is to use the linking table only as a check, but not as a source of information, by encapsulating the call within a double negation. 2 The solution implemented in the head-corner parser is to use, for each pair of functors of categories, the generalization of the head-corner relation. Such functors typically are major and minor syntactic category labels such as NP, VP, S, S-bar, verb, .... As a result there will always be at most one matching clause in the linking table for a given goal category and a given head category (thus there is no risk of obtaining spurious ambiguities). Moreover, this approach allows a very efficient implementation technique, as described below.</Paragraph> <Paragraph position="31"> 2.2.4 Indexing of the Head-Corner Table. In the implementation of the head-corner parser, we use an efficient implementation of the head-corner relation by exploiting Prolog's first argument indexing. This technique ensures that the lookup of the head-corner table can be done in (essentially) constant time. The implementation consists of two steps. In the first step, the head-corner table is weakened such that for a given goal category and a given head category at most a single matching clause exists. In the second step, this table is encoded in such a way that first argument indexing ensures that table lookup is efficient.</Paragraph> <Paragraph position="32"> As a first step we modify the head-corner relation to make sure that for all pairs of functors of categories, there will be at most one matching clause in the head-corner table. This is illustrated with an example. Suppose a hypothetical head-corner table 2 This approach also solves another potential problem: the linking table may give rise to (undesired) cyclic terms due to the absence of the occur check. The double negation also takes care of this potential problem. van Noord Efficient Head-Corner Parsing contains the following two clauses relating categories with functor x/4 and y/4: head_link (x (A, B .... ) ..... y(A,B .... ) .... ).</Paragraph> <Paragraph position="33"> head_link(x(_,B,C,_) ..... y(_,B,C,_) .... ).</Paragraph> <Paragraph position="34"> In this case, the modified head-corner relation table will consist of a single clause relating x/4 and y/4 by taking the generalization (or &quot;anti-unification&quot;) of the two clauses: head_link(x(_ ,B .... ) ..... y(_ ,B .... ) .... ).</Paragraph> <Paragraph position="35"> As a result, for a given goal and head category, table lookup is deterministic. In the second and final step of the modification we re-arrange the information in the table such that for each possible goal category functor g/n, there will be a clause: head_link(g(Al..An) ,Pg,Qg,Head,Ph,Qh) :head_link_G_N (Head, Ph, Qh, g (AI.. An), Pg, Qg).</Paragraph> <Paragraph position="36"> Moreover, all the relations head_link_G_N now contain the relevant information from the head-comer table. Thus, for clauses of the form: head_link (x (_, B .... 
) ..... y(_,B .... ) .... ) .</Paragraph> <Paragraph position="37"> we now have: head_link_x 4(y(_,B .... ) ..... x(_,B .... ) .... ).</Paragraph> <Paragraph position="38"> First argument indexing now ensures that table lookup is efficient.</Paragraph> <Paragraph position="39"> The same technique is applied for the lex_head_link relation. This technique significantly improves the practical time efficiency of the parser (especially if the resulting code is compiled).</Paragraph> </Section> <Section position="3" start_page="434" end_page="436" type="sub_section"> <SectionTitle> 2.3 Dealing with Epsilon Rules </SectionTitle> <Paragraph position="0"> In the preceding paragraphs we have said nothing about empty productions (epsilon rules). A possible approach is to compile the grammar into an equivalent grammar in which no such epsilon rules are defined. It is also possible to deal with epsilon rules in the head-corner parser directly. For example, we could assert empty productions as possible lexical analyses. In such an approach, the result of lexical analysis may contain clauses such as those in (9), in case there is a rule np/np --+ \[\].</Paragraph> <Paragraph position="1"> lexical_analysis (0, O, np/np), lexical_analysis (i, I, np/np) . (9) lexical_analysis (2,2, np/np), lexical_analysis (3,3,np/np).</Paragraph> <Paragraph position="2"> lexical_analysis (4,4, np/np).</Paragraph> <Paragraph position="3"> There are two objections to this approach. The first objection may be that this is a task that can hardly be expected from a lexical lookup procedure. The second, more important, objection is that empty categories are hypothesized essentially everywhere. In the general version of the head-corner parser, gaps are inserted by a special clause for the predict/8 predicate (10), where shared variables are used to indicate the corresponding string positions. The gap_head_link relation is a subset of the head_link relation in which the head category is a possible gap.</Paragraph> <Paragraph position="4"> predict (Cat, PO, P, _EO, _E, Small, Q, Q) * - (10) gap_head_link (Cat, PO, P, Small, Q, 6)), gap(Small).</Paragraph> <Paragraph position="5"> Computational Linguistics Volume 23, Number 3 For this approach to work, other predicates must expect string positions that are not instantiated. For example, Prolog's built-in comparison operator cannot be used, since that operator requires that its arguments are ground. The definition of the smaller_equal predicate therefore reflects the possibility that a string position is a variable (in which case, calls to this predicate should succeed).</Paragraph> <Paragraph position="6"> For some grammars it turns out that a simplification is possible. If it is never possible that a gap can be used as the head of a rule, then we can omit this new clause for the predict predicate, and instead use a new clause for the parse/S predicate, as follows: parse (Small, Q, Q, _EO, _E) :gap(Small). null (11) This will typically be much more efficient because in this case gaps are hypothesized in a purely top-down manner.</Paragraph> <Paragraph position="7"> It should be noted that the general version of the head-corner parser is not guaranteed to terminate, even if the grammar defines only a finite number of derivations for all input sentences. Thus, even though the head-corner parser proceeds in a bottom-up direction, it can run into left-recursion problems (just as the left-corner parser can). 
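A minimal definition of the smaller_equal predicate that tolerates uninstantiated positions, as required by clause (10), might be (a sketch):

% smaller_equal(?P0, ?P): succeeds if P0 precedes or equals P; if either
% position is still uninstantiated (as for a hypothesized gap), the
% comparison must simply succeed.
smaller_equal(P0, P) :-
    (   ( var(P0) ; var(P) )
    ->  true
    ;   P0 =< P
    ).

The possible non-termination noted above is a separate matter.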
This is because it may be possible that an empty category is predicted as the head, after which trying to construct a larger projection of this head gives rise to a parse-goal for which a similar empty category is a possible candidate head .... This problem is sometimes called &quot;hidden left-recursion&quot; in the context of left-corner parsers. This problem can be solved in some cases by a good (but relatively expensive) implementation of the memorization technique, e.g., along the lines of Warren (1992) or Johnson and DOrre (1995). The simplified (and more efficient) memorization technique that I use (see Section 3), however, does not solve this problem.</Paragraph> <Paragraph position="8"> A quite different solution, which is often applied for the same problem if a left-corner parser is used, is to compile the grammar into an equivalent grammar without gaps. For left-corner parsers, this can be achieved by partially evaluating all rules that can take gap(s) as their left-most daughter(s). Therefore, the parser only needs to consider gaps in non-left-most position, by a clause similar to the clause in (11). Obviously, the same compilation technique can be applied for the head-corner parser. However, there is a problem: it will be unclear what the heads of the newly created rules will be. Moreover, and more importantly, the head-corner relation will typically become much less predictive. For example, if there is a rule vp --> np verb where verb can be realized as a gap, then after compilation, a rule of the form vp --> np will exist. Therefore, an np will be a possible head-corner of vp. The effect will be that head-corners are difficult to predict, and hence efficiency will decrease.</Paragraph> <Paragraph position="9"> Fortunately, experience suggests that grammars exhibiting hidden head-recursion can often be avoided. For example, in the Alvey NL Tools grammar in only 3 rules (out of more than 700) the head of the rule could be gapped. These rules are of the form x --> not x. Arguably, in such rules the second daughter should not be gapped.</Paragraph> <Paragraph position="10"> In the MiMo2 grammar of English, no heads can be gapped. Finally, in the Dutch OVIS grammar (in which verb-second is implemented by gap-threading) no hidden head-recursion occurs, as long as the head-corner table includes information about the feature vslash, which encodes whether or not a v-gap is expected.</Paragraph> <Paragraph position="11"> van Noord Efficient Head-Corner Parsing 3. Selective Memorization and Goal-Weakening</Paragraph> </Section> <Section position="4" start_page="436" end_page="437" type="sub_section"> <SectionTitle> 3.1 Selective Memorization </SectionTitle> <Paragraph position="0"> The basic idea behind memorization is simple: do not compute things twice. In Prolog, we can keep track of each goal that has already been searched and keep a list of the corresponding solution(s). If the same goal needs to be solved later, then we can skip the computation and simply do a table lookup. Maintaining a table and doing the table lookup is rather expensive. Therefore, we should modify the slogan &quot;do not compute things twice&quot; to do not compute expensive things twice.</Paragraph> <Paragraph position="1"> In the head-corner parser it turns out that the parse/5 predicate is a very good candidate for memorization. The other predicates are not. 
This implies that each maximal projection is computed only once; partial projections of a head can be constructed during a parse any number of times, as can sequences of categories (considered as sisters to a head). Active chart parsers memo everything (including sequences of categories); inactive chart parsers only memo categories, but not sequences of categories.</Paragraph> <Paragraph position="2"> In our proposal, we memo only those categories that are maximal projections, i.e., projections of a head that unify with the top category (start symbol) or with a nonhead daughter of a rule.</Paragraph> <Paragraph position="3"> The implementation of memorization uses Prolog's internal database to store the tables. The advantage of this technique is that we use Prolog's first argument indexing for such tables. Moreover, during the consultation of the table we need not worry about modifications to it (in contrast to an approach in which the table would be maintained as the value of a Prolog variable). On the other hand, the use of the internal database brings about a certain overhead. Therefore, it may be worthwhile to experiment with a meta-interpreter along the lines of the XOLDT system (Warren 1992) in which the table is maintained dynamically.</Paragraph> <Paragraph position="4"> Memorization is implemented by two different tables. The first table encodes which goals have already been searched. Items in the first table are called goal items.</Paragraph> <Paragraph position="5"> The second table represents all solved (i.e., instantiated) goals. Items in this second table are called result items. It might be tempting to use only the second table, but in that case, it would not be possible to tell the difference between a goal that has already been searched, but did not result in a solution (&quot;fail-goal&quot;) and a goal that has not been searched at all. If we have two tables, then we can also immediately stop working on branches in the search-space for which it has already been shown that there is no solution. The distinction between these two kinds of item is inherited from BUP (Matsumoto et al. 1983). The memorized version of the parse predicate can be defined as in (12).</Paragraph> <Paragraph position="7"> Computational Linguistics Volume 23, Number 3 The first table is represented by the predicate ' GOAL_ITEM'. This predicate simply consists of a number of unit-clauses indicating all goals that have been searched completely. Thus, before we try to attempt to solve Goal, we first check whether a goal item for that goal already exists. Given the fact that Goal may contain variables, we should be a bit careful here. Unification is clearly not appropriate, since it may result in a situation in which a more general goal is not searched because a more specific variant of that goal had been solved. We want exactly the opposite: if a more general version of Goal is included in the goal table, then we can continue to look for a solution in the result table. It is useful to consider the fact that if we had previously solved, for example, the goal parse (s, 3, X, 3,12), then if we later encounter the goal parse(s,3,Y,3,10), we can also use the second table immediately: the way in which the extreme positions are used ensures that the former is more general than the latter. 
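In outline, the memoized parse/5 of (12) first consults the goal table; if the goal is new, it is recorded there and the result table is filled by exhaustive, failure-driven search; finally a solution is picked up from the result table. A sketch, in which parse_unmemoized/5 stands for the original parse/5 clause of Figure 2 and in_table2/5 for lookup in the result table (these two names are illustrative, not the paper's):

parse(Cat, P0, P, E0, E) :-
    (   in_table1(Cat, P0, P, E0, E)          % goal searched before?
    ->  true                                   % then reuse its results
    ;   assert_table1(Cat, P0, P, E0, E),      % record the goal item
        (   parse_unmemoized(Cat, P0, P, E0, E),
            assert_table2(Cat, P0, P),         % record each solution found
            fail                               % ... exhaustively
        ;   true
        )
    ),
    in_table2(Cat, P0, P, E0, E).              % pick up a result item

The calls to assert_table2 and in_table2 are subject to the subsumption checks on result items described below.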
The predicates for the maintenance of the goal table are defined in (13).</Paragraph> <Paragraph position="8"> in_tablel (Cat, P0, P, E0, E) * - (13) 'GOAL_ITEM'(Cat_d,P0_d,P_d,E0_d,E_d), Z goal exists which is subsumes chk((Cat_d,P0_d,P_d), (Cat,P0,P)), Z more general and within smaller_equal(E0_d,E0), degh a larger interval smaller_equal (E, E_d).</Paragraph> <Paragraph position="9"> assert_tablel(Cat,PO,P,EO,E) :- assertz('GOAL_ITEM'(Cat,PO,P,EO,E)).</Paragraph> <Paragraph position="10"> The second table is represented by the predicate 'RESULT_ITEM'. It is defined by unit-clauses that each represent an instantiated goal (i.e., a solution). Each time a result is found, the table is checked to see whether that result is already available. If it is, the newer result is ignored. If no (more general version of the) result exists, then the result is added to the table. Moreover, more specific results that may have been put on the table previously are marked. These results need not be used anymore. 3 This is not strictly necessary but is often useful because it decreases the size of the tables; in this approach, tables are redundancy free and hence minimal. Moreover, no further work will be done based on those results. Note that result items do not keep track of the extreme positions. This implies that in order to see whether a result item is applicable, we check whether the interval covered by the result item lies within the extreme positions of the current goal. The predicates dealing with the result table are defined in (14).</Paragraph> <Paragraph position="11"> and it's more specific then mark it % do this for all such % items The implementation uses a faster implementation of memorizating in which both goal items and result items are indexed by the functor of the category and the string positions.</Paragraph> <Paragraph position="12"> In the head-corner parser, parse-goals are memorized. Note that nothing would prevent us from memoing other predicates as well, but experience suggests that the cost of maintaining tables for the head_corner relation, for example, is (much) higher than the associated profit. The use of memorization for only the parse/5 goals implies that the memory requirements of the head-corner parser in terms of the number of items being recorded is much smaller than in ordinary chart parsers. Not only do we refrain from asserting so-called active items, but we also refrain from asserting inactive items for nonmaximal projections of heads. In practice the difference in space requirements can be enormous. This difference is a significant reason for the practical efficiency of the head-corner parser.</Paragraph> </Section> <Section position="5" start_page="437" end_page="439" type="sub_section"> <SectionTitle> 3.2 The Occur Check </SectionTitle> <Paragraph position="0"> It turns out that the use of tables defined in the previous subsection can lead to a problem with cyclic unifications. If we assume that Prolog's unification includes the occur check then no problem would arise. But since most versions of Prolog do not implement the occur check it is worthwhile investigating this potential problem.</Paragraph> <Paragraph position="1"> The problem arises because cyclic solutions can be constructed that would not have been constructed by ordinary SLD-resolution. Furthermore, these cyclic structures lead to practical problems because items containing such a cyclic structure may have to be put in the table. 
In SICStus Prolog, this results in a crash.</Paragraph> <Paragraph position="2"> An example may clarify the problem. Suppose we have a very simple program containing the following unit clause: x(A,B).</Paragraph> <Paragraph position="3"> Suppose that at some point a goal such as ?- x(X,f(X)) is attempted. This clearly succeeds. Furthermore an item of that form is added to the table. Later on it may be the case that a goal of the form ?- x(Y,Y) is attempted. Clearly this is not a more specific goal than we solved before, so we need to solve this goal afresh. This succeeds too. Now we can continue by picking up a solution from the second table. However, if we pick the first solution then a cyclic term results.</Paragraph> <Paragraph position="4"> A possible approach to deal with this situation is to index the items of the second table with the item of the first table from which the solution was obtained. In other words, if you want to select a solution from the second table, it must not only be the case that the solution matches your goal, but also that the corresponding goal of the solution is more general than your current goal. This strategy works, but turns out to be considerably slower than the original version given above. The reason seems to be that the size of the second table is increased quite drastically, because solutions may now be added to the table more than once (for all goals that could give rise to that solution).</Paragraph> <Paragraph position="5"> An improvement of the head-corner parser using a goal-weakening technique often eliminates this occur check problem. Goal weakening is discussed in the following subsection.</Paragraph> </Section> <Section position="6" start_page="439" end_page="440" type="sub_section"> <SectionTitle> 3.3 Goal Weakening </SectionTitle> <Paragraph position="0"> The insight behind goal weakening (or abstraction \[Johnson and Dörre 1995\]) in the context of memorization is that we may combine a number of slightly different goals into a single, more general, goal. Very often it is much cheaper to solve this single (but more general) goal than to solve each of the specific goals in turn. Moreover, the goal table will be smaller (both in terms of number of items, and the size of individual items), which can have a positive effect on the amount of memory and CPU-time required for the administration of the table. Clearly, one must be careful not to remove essential information from the goal (in the worst case, this may even lead to nontermination of otherwise well-behaved programs).</Paragraph> <Paragraph position="1"> Depending on the properties of a particular grammar, it may, for example, be worthwhile to restrict a given category to its syntactic features before attempting to solve the parse-goal of that category. Shieber's (1985) restriction operator can be used here. Thus we essentially throw some information away before an attempt is made to solve a (memorized) goal. For example, the category x(A,B,f(A,B),g(A,h(B,i(C)))) may be weakened into: x(A,B,f(_,_),g(_,_)). If we assume that the predicate weaken/2 relates a term t to a weakened version tw, such that tw subsumes t, then (15) is the improved version of the parse predicate:</Paragraph> <Paragraph position="3"> Note that goal weakening is sound. An answer a to a weakened goal g is only considered if a and g unify.
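For concreteness, a depth-bounded weaken/2 of the kind just described might be defined as follows (the depth bound of 2 and the helper names are purely illustrative). In the improved parse predicate of (15), the goal category is presumably passed through weaken/2 before the goal and result tables are consulted and maintained, with each answer subsequently unified with the original, unweakened goal; that final unification is what makes the procedure sound.

% weaken(+Term, -Weak): copy Term down to a fixed depth, replacing all
% deeper subterms by fresh variables. With a depth bound of 2, the
% category x(A,B,f(A,B),g(A,h(B,i(C)))) weakens to x(A,B,f(_,_),g(_,_)).
weaken(Term, Weak) :-
    weaken(Term, 2, Weak).

weaken(_Term, 0, _Fresh) :- !.          % depth exhausted: fresh variable
weaken(Term, _Depth, Term) :-
    var(Term), !.                       % variables are kept (and shared)
weaken(Term, Depth, Weak) :-
    Term =.. [F|Args],
    Depth1 is Depth - 1,
    weaken_args(Args, Depth1, WArgs),
    Weak =.. [F|WArgs].

weaken_args([], _, []).
weaken_args([A|As], Depth, [W|Ws]) :-
    weaken(A, Depth, W),
    weaken_args(As, Depth, Ws).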
Also note that goal weakening is complete in the sense that for an answer a to a goal g there will always be an answer a t to the weakening of g such that a t subsumes a.</Paragraph> <Paragraph position="4"> For practical implementations the use of goal weakening can be extremely important. It is my experience that a well-chosen goal-weakening operator may reduce parsing times by an order of magnitude.</Paragraph> <Paragraph position="5"> The goal-weakening technique can also be used to eliminate typical instances of the problems concerning the occur check (discussed in the previous subsection). Coming back to the example in the previous subsection, if our first goal</Paragraph> <Paragraph position="7"> then the problem would not occur. If we want to guarantee that no cyclic structures can be formed, then we would need to define goal weakening in such a way that no variable sharing occurs in the weakened goal.</Paragraph> <Paragraph position="8"> An important question is how to come up with a good goal-weakening operator.</Paragraph> <Paragraph position="9"> For the experiments discussed in the final section all goal-weakening operators were chosen by hand, based on small experiments and inspection of the goal table and item table. Even if goal weakening is reminiscent of Shieber's (1985) restriction operator, the rules of the game are quite different: in the case of goal weakening, as much information as possible is removed without risking nontermination of the parser, whereas in the case of Shieber's restriction operator, information is removed until the resulting parser terminates. For the current version of the grammar of OVIS, weakening the goal category in such a way that all information below a depth of 6 is replaced by fresh variables eliminates the problem caused by the absence of the occur check; moreover, this goal-weakening operator reduces parsing times substantially. In the latest version, we use different goal-weakening operators for each different functor.</Paragraph> <Paragraph position="10"> An interesting special case of goal weakening is constituted by a goal-weakening operator that ignores all feature constraints, and hence only leaves the functor for each goal category. In this case the administration of the goal table can be simplified considerably (the table consists of ground facts, hence no subsumption checks are required). This technique is used in the MiMo2 grammar and the Alvey NL Tools grammar, both discussed in Section 7.</Paragraph> </Section> </Section> <Section position="4" start_page="440" end_page="441" type="metho"> <SectionTitle> 4. Compact Representation of Parse Trees </SectionTitle> <Paragraph position="0"> Often a distinction is made between recognition and parsing. Recognition checks whether a given sentence can be generated by a grammar. Usually recognizers can be adapted to be able to recover the possible parse trees of that sentence (if any).</Paragraph> <Paragraph position="1"> In the context of Definite Clause Grammar this distinction is often blurred because it is possible to build up the parse tree as part of the complex nonterminal symbols.</Paragraph> <Paragraph position="2"> Thus the parse tree of a sentence may be constructed as a side effect of the recognition phase. If we are interested in logical forms rather than in parse trees, a similar trick may be used. 
The result of this, however, is that as early as the recognition phase, ambiguities will result in a (possibly exponential) increase of processing time.</Paragraph> <Paragraph position="3"> For this reason we will assume that parse trees are not built by the grammar, but rather are the responsibility of the parser. This allows the use of efficient packing Example of a partial derivation tree projected by a history item.</Paragraph> <Paragraph position="4"> techniques. The result of the parser will be a parse forest: a compact representation of all possible parse trees rather than an enumeration of all parse trees.</Paragraph> <Paragraph position="5"> The structure of the parse forest in the head-corner parser is rather unusual, and therefore we will take some time to explain it. Because the head-corner parser uses selective memorization, conventional approaches to constructing parse forests (Billot and Lang 1989) are not applicable. The head-corner parser maintains a table of partial derivation trees, each of which represents a successful path from a lexical head (or gap) up to a goal category. The table consisting of such partial parse trees is called the history table; its items are history items.</Paragraph> <Paragraph position="6"> More specifically, each history item is a triple consisting of a result item reference, a rule name, and a list of triples. The rule name is always the name of a rule without daughters (i.e., a lexical entry or a gap): the (lexical) head. Each triple in the list of triples represents a local tree. It consists of the rule name, and two lists of result item references (representing the list of daughters left of the head in reverse, and the list of daughters right of the head). An example will clarify this. Suppose we have a history item:</Paragraph> <Paragraph position="8"> \[rule (vp_v, \[\] , \[\] ) , rule (s_np_vp, \[87\] , \[\] ) , rule (vp_vp_np_pp, \[\] , \[121,125\] ), rule (s_adv_s, \[46\] , \[\] )\] ).</Paragraph> <Paragraph position="9"> (16) This item indicates that there is a possible derivation of the category defined in result item 112 of the form illustrated in Figure 4. In this figure, the labels of the interior nodes are rule names, and the labels of the leaves are references to result items. The head-corner leaf is special: it is a reference to either a lexical entry or an epsilon rule. The root node is special too: it has both an associated rule name and a reference to a result item. The latter indicates how this partial derivation tree combines with other partial trees.</Paragraph> <Paragraph position="10"> The history table is a lexicalized tree substitution grammar, in which all nodes (except substitution nodes) are associated with a rule identifier (of the original grammar). This grammar derives exactly all derivation trees of the input. 4 As an example,</Paragraph> </Section> <Section position="5" start_page="441" end_page="443" type="metho"> <SectionTitle> 4 The tree substitution grammar is lexicalized in the sense that each of the trees has an associated anchor, </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> Tree substitution grammar that derives each of the two derivation trees of the sentence I see a man at home, for the grammar of Billot and Lang (1989). The start symbol of this grammar is nt6. Note that all nodes, except for substitution nodes, are associated with a rule (or lexical entry) of the original grammar. 
Root nodes have a nonterminal symbol before the colon, and the corresponding rule identifier after the colon. The set of derived trees for this tree substitution grammar equals the set of derivation trees of the parse (ignoring the nonterminal symbols of the tree substitution grammar).</Paragraph> <Paragraph position="3"> consider the grammar used by Tomita (1987) and Billot and Lang (1989), given here in (17) and (18).</Paragraph> <Paragraph position="4"> (I) s --> np, vp. (2) s --> s, pp. (3) np --> n.</Paragraph> <Paragraph position="5"> (4) np --> det, n. (5) np --> rip, pp. (6) pp --> prep, rip.</Paragraph> <Paragraph position="6"> (7) vp --> v, rip.</Paragraph> <Paragraph position="8"> prep--> \[at\]. det--> \[a\]. n--> \[home\].</Paragraph> <Paragraph position="9"> The sentence I see a man at home has two derivations, according to this grammar. The lexicalized tree substitution grammar in Figure 5, which is constructed by the head-corner parser, derives exactly these two derivations.</Paragraph> <Paragraph position="10"> Note that the item references are used in the same manner as the computer generated names of nonterminals in the approach of Billot and Lang (1989). Because we use chunks of parse trees, less packing is possible than in their approach. Correspondingly, the theoretical worst-case space requirements are also worse. In practice, however, this does not seem to be problematic: in our experiments, the size of the history table is always much smaller than the size of the other tables (this is expected because the latter tables have to record complex category information).</Paragraph> <Paragraph position="11"> Let us now look at how the parser of the previous section can be adapted to be able to assert history items. First, we add an (output-) argument to the parse predicate. This sixth argument is the reference to the result item that was actually used. The predicates to parse a list of daughters are augmented with a list of such references. This enables the construction of a term for each local tree in the head_corner predicate consisting of the name of the rule that was applied and the list of references of the result items which is a pointer to either a lexical entry or a gap.</Paragraph> <Paragraph position="12"> Computational Linguistics Volume 23, Number 3 used for the left and right daughters of that rule. Such a local tree representation is an element of a list that is maintained for each lexical head upward to its goal. Such a list thus represents in a bottom-up fashion all rules and result items that were used to show that that lexical entry indeed was a head-corner of the goal. If a parse goal has been solved then this list containing the history information is asserted in a new kind of table: the 'HISTORY_ITEM'/3 table. 5 We already argued above that parse trees should not be explicitly defined in the grammar. Logical forms often implicitly represent the derivational history of a category. Therefore, the common use of logical forms as part of the categories will imply that you will hardly ever find two different analyses for a single category, because two different analyses will also have two different logical forms. Therefore, no packing is possible and the recognizer will behave as if it is enumerating all parse trees. The solution to this problem is to delay the evaluation of semantic constraints. During the first phase, all constraints referring to logical forms are ignored. 
Only if a parse tree is recovered from the parse forest we add the logical form constraints. This is similar to the approach worked out in CLE (Alshawi 1992).</Paragraph> <Paragraph position="13"> This approach may lead to a situation in which the second phase actually filters out some otherwise possible derivations, in case the construction of logical forms is not compositional in the appropriate sense. In such cases, the first phase may be said to be unsound in that it allows ungrammatical derivations. The first phase combined with the second phase is of course still sound. Furthermore, if this situation arose very often, then the first phase would tend to be useless, and all work would have to be done during the recovery phase. The present architecture of the head-corner parser embodies the assumption that such cases are rare, and that the construction of logical forms is (grosso modo) compositional.</Paragraph> <Paragraph position="14"> The distinction between semantic and syntactic information is compiled into the grammar rules on the basis of a user declaration. We simply assume that in the first phase the parser only refers to syntactic information, whereas in the second phase both syntactic and semantic information is taken into account.</Paragraph> <Paragraph position="15"> If we assume that the grammar constructs logical forms, then it is not clear that we are interested in parse trees at all. A simplified version of the recover predicate may be defined in which we only recover the semantic information of the root category, but in which we do not build parse trees. The simplified version may be regarded as the run-time version, whereas parse trees will still be very useful for grammar development.</Paragraph> </Section> <Section position="6" start_page="443" end_page="445" type="metho"> <SectionTitle> 5. Parsing Word-Graphs with Probabilities </SectionTitle> <Paragraph position="0"> The head-corner parser is one of the parsers developed within the NWO Priority Programme on Language and Speech Technology. In this program a spoken dialog system is developed for public transportation information (Boves et al. 1995).</Paragraph> <Paragraph position="1"> In this system the input for the parser is not a simple list of words, as we have assumed up to now, but rather a word-graph: a directed, acyclic graph where the states are points in time and the edges are labeled with word hypotheses and their corresponding acoustic score. Thus, such word-graphs are acyclic weighted finite-state automata.</Paragraph> <Paragraph position="2"> In Lang (1989) a framework for processing ill-formed input is described in which 5 A complication is needed for those cases where items are removed later because a more general item has been found.</Paragraph> <Paragraph position="3"> van Noord Efficient Head-Corner Parsing certain common errors are modeled as (weighted) finite-state transducers. The composition of an input sentence with these transducers produces a (weighted) finite-state automaton, which is then input for the parser. In such an approach, the need to generalize from input strings to input finite-state automata is also clear. 
The generalization from strings to weighted acyclic finite-state automata introduces essentially two complications: we cannot use string indices anymore and we need to keep track of the acoustic scores of the words used in a certain derivation.</Paragraph> <Section position="1" start_page="444" end_page="445" type="sub_section"> <SectionTitle> 5.1 From String Positions to State Names </SectionTitle> <Paragraph position="0"> Parsing on the basis of a finite-state automaton can be seen as the computation of the intersection of that automaton with the grammar. If the definite clause grammar is off-line parsable, and if the finite-state automaton is acyclic, then this computation can be guaranteed to terminate (van Noord 1995). This is obvious because an acyclic finite-state automaton defines a finite number of strings. More importantly, existing techniques for parsing based on strings can be generalized easily by using the names of states in the automaton instead of the usual string indices.</Paragraph> <Paragraph position="1"> In the head-corner parser, this leads to an alternative to the predicate smaller_ equal/2. Rather than a simple integer comparison, we now need to check that a derivation from P0 to P can be extended to a derivation from E0 to E by checking that there are paths in the word-graph from E0 to P0 and from P to E.</Paragraph> <Paragraph position="2"> The predicate connection/2 is true if there is a path in the word-graph from the first argument to the second argument. It is assumed that state names are integers; to rule out cyclic word-graphs we also require that, for all transitions from P0 to P, it is the case that P0 < P. Transitions in the word-graph are represented by clauses of the form wordgraph:trans (P0, Sym, P, Score), which indicate that there is a transition from state P0 to P with symbol Sym and acoustic score Score. The connection predicate can be specified simply as the reflexive and transitive closure of the transition relation between states:</Paragraph> <Paragraph position="4"> wordgraph : trans (AO, _, A i, _), connection (AI, A).</Paragraph> <Paragraph position="5"> The implementation allows for the possibility that state names are not instantiated (as required by the treatment of gaps). Moreover it uses memorization, and it ensures that the predicate succeeds at most once:</Paragraph> <Paragraph position="7"> Computational Linguistics Volume 23, Number 3 ; assertz(fail_conn(A,B)), fail .</Paragraph> <Paragraph position="8"> A somewhat different approach that may turn out to be more efficient is to use the ordinary comparison operator that we used in the original definition of the head-corner parser. The possible extra cost of allowing impossible partial analyses is worthwhile if the more precise check would be more expensive. If, for typical input word-graphs, the number of transitions per state is high (such that almost all pairs of states are connected), then this may be an option.</Paragraph> </Section> <Section position="2" start_page="445" end_page="445" type="sub_section"> <SectionTitle> 5.2 Accounting for Word-Graph Scores </SectionTitle> <Paragraph position="0"> To account for the acoustic score of a derivation (defined as the sum of the acoustic scores associated with all transitions from the word-graph involved in the derivation), we assume that the predicate lexical_analysis represents the acoustic score of the piece of the word-graph that it covers by an extra argument. During the first phase, acoustic scores are ignored. 
During the second phase (when a particular derivation is constructed), the acoustic scores are combined.</Paragraph> </Section> </Section> <Section position="7" start_page="445" end_page="447" type="metho"> <SectionTitle> 6. Head-Corner Parsing and Robustness </SectionTitle> <Paragraph position="0"> Certain approaches towards robust parsing use the partial results of the parser. It is assumed in such approaches that even if no full parse for the input could be constructed, the discovery of other phrases in the input might still be useful. It is also often assumed that a bottom-up parser is essential for such approaches to work: parsers that use top-down information (such as the head-corner parser) may fail to recognize relevant subparses in the context of an ungrammaticality.</Paragraph> <Paragraph position="1"> In the application for which the head-corner parser was developed, robust processing is essential. In a spoken dialogue system it is often impossible to parse a full sentence, but in such cases the recognition of other phrases, such as temporal expressions, might still be very useful. Therefore, a robust processing technique that collects the remnants of the parsing process in a meaningful way seems desirable.</Paragraph> <Paragraph position="2"> In this subsection, we show how the head-corner parser can be used in such circumstances. The parser is modified in such a way that it finds all derivations of the start symbol anywhere in the input. Furthermore, the start symbol should be defined in such a way that it includes all categories considered useful for the application.</Paragraph> <Section position="1" start_page="445" end_page="446" type="sub_section"> <SectionTitle> 6.1 Underspecification of the Positions </SectionTitle> <Paragraph position="0"> Normally the head-corner parser will be called as follows, for example: ?- parse(s(Sem),0,12).</Paragraph> <Paragraph position="1"> indicating that we want to parse a sentence from position zero to twelve with category s(Sem) (a sentence with a semantic representation that is yet to be discovered). Suppose, however, that a specific robustness module is interested in all maximal projections anywhere in the sentence. Such a maximal projection may be represented by a term xp(Sem). Furthermore there may be unary grammar rules rewriting such an xp into appropriate categories, for example:</Paragraph> <Paragraph position="3"> If we want to recognize all maximal projections at all positions in the input, then we can simply give the following parse-goal: ?- parse(xp(Sem),_,_). (22) Now one might expect that such an underspecified goal will dramatically slow down the head-corner parser, but this turns out to be false. In actual fact we have experienced an increase of efficiency using underspecification. This can only be understood in the light of the use of memorization. Even though we now have a much more general goal, the number of different goals that we need to solve is much smaller. Also note that even though the first call to the parse predicate has variable extreme positions, this does not imply that all power of top-down prediction is lost by this move; recursive calls to the parse predicate may still have instantiated left and/or right extreme positions.
The same applies with even more force for top-down information on categories.</Paragraph> </Section> <Section position="2" start_page="446" end_page="447" type="sub_section"> <SectionTitle> 6.2 The Robustness Component in OVIS </SectionTitle> <Paragraph position="0"> In an attempt to obtain a robust natural language understanding component, we have experimented in OVIS with the techniques mentioned in the preceding paragraph. The top category (start symbol) of the OVIS grammar is defined as the category max (gem).</Paragraph> <Paragraph position="1"> Moreover there are unary rules such as max(gem) --* np(Sem,.. ) for NP, S, PP, AdvP.</Paragraph> <Paragraph position="2"> In the first phase, the parser finds all occurrences of the top category in the input word-graph. Thus, we obtain items for all possible maximal projections anywhere in the input graph. In the second phase, the robustness component selects a sequence of such maximal projections. The robustness procedure consists of a best-first search from the beginning of the graph to the end of the graph. A path in the input graph can be constructed by taking steps of two types. To move from position P to Q you can either: * use a maximal projection from P to Q (as constructed by the parser), or * use a transition from P to Q. In this case we say that we skip that transition.</Paragraph> <Paragraph position="3"> In order to compare paths in the best-first search method, we have experimented with score functions that include some or all of the following factors: * the number of skips. We prefer paths with a smaller number of such skips.</Paragraph> <Paragraph position="4"> * the number of maximal projections. We prefer paths with a smaller number of such projections.</Paragraph> <Paragraph position="5"> * the combined acoustic score as defined in the word-graph.</Paragraph> <Paragraph position="6"> * the appropriateness of the semantic representation given the dialogue context * the bigram score.</Paragraph> <Paragraph position="7"> If bigram scores are not included, then this best-first search method can be implemented efficiently because for each state in the word-graph we only have to keep track of the best path to that state.</Paragraph> <Paragraph position="8"> Computational Linguistics Volume 23, Number 3 The resulting best path in general consists of a number of maximal projections. In the OVIS application, these are often simple time or place expressions. The pragmatic module is able to deal with such unconnected pieces of information and will perform better if given such partial parse results.</Paragraph> <Paragraph position="9"> To evaluate the appropriate combination of the factors determining the scoring function, and to evaluate this approach with respect to other approaches, we use a corpus of word-graphs for which we know the corresponding actual utterances. We compare the sentence associated with the best path in the word-graph with the sentence that was actually spoken. Clearly, the more often the robustness component uses the information that was actually uttered, the more confidence we have in that component. This notion of word accuracy is an approximation of semantic accuracy (or &quot;concept accuracy&quot;). The string comparison is defined by the minimal number of deletions and insertions that is required to turn the first string into the second (Levenshtein distance), although it may be worthwhile to investigate other measures. 
<Paragraph position="9"> To evaluate the appropriate combination of the factors determining the scoring function, and to evaluate this approach with respect to other approaches, we use a corpus of word-graphs for which we know the corresponding actual utterances. We compare the sentence associated with the best path in the word-graph with the sentence that was actually spoken. Clearly, the more often the robustness component uses the information that was actually uttered, the more confidence we have in that component. This notion of word accuracy is an approximation of semantic accuracy (or &quot;concept accuracy&quot;). The string comparison is defined by the minimal number of deletions and insertions that is required to turn the first string into the second (Levenshtein distance), although it may be worthwhile to investigate other measures. For example, it seems likely that for our application it is much less problematic to &quot;miss&quot; information than to &quot;hallucinate&quot;. This could be formalized by a scoring function in which insertion (into the analysis result) is cheaper than deletion.</Paragraph>
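The following is a minimal sketch of such an asymmetric string comparison, restricted to insertions and deletions as in the definition above; the cost values are invented for illustration and the predicate names are not those of the evaluation software.

:- use_module(library(lists)).

% A word of the spoken sentence that the analysis missed
% (it has to be inserted into the analysis result).
ins_cost(1).
% A word 'hallucinated' by the analysis (it has to be deleted
% from the analysis result); deliberately more expensive.
del_cost(3).

% distance(+Analysis, +Spoken, -Cost): cost of one alignment.
distance([], [], 0).
distance([W|Ws], [W|Vs], Cost) :-      % matching words cost nothing
    distance(Ws, Vs, Cost).
distance(Ws, [_|Vs], Cost) :-          % word missed by the analysis
    distance(Ws, Vs, C0),
    ins_cost(I), Cost is C0 + I.
distance([_|Ws], Vs, Cost) :-          % word hallucinated by the analysis
    distance(Ws, Vs, C0),
    del_cost(D), Cost is C0 + D.

% best_distance(+Analysis, +Spoken, -Cost): minimal cost over all alignments.
% (Enumerating alignments is exponential; a dynamic-programming version
% would be used for anything but short sentences.)
best_distance(Ws, Vs, Cost) :-
    findall(C, distance(Ws, Vs, C), Cs),
    min_list(Cs, Cost).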
<Paragraph position="10"> Currently, the best results are obtained with a scoring function in which bigram scores, acoustic scores, and the number of skips are included. We have also implemented a version of the system in which acoustic scores and bigram scores are used to select the best path through the word-graph. This path is then sent to the parser and the robustness component. In this &quot;best-1-mode&quot; the system performs somewhat worse in terms of word accuracy, but much faster, as seen in the experiments in the next section.</Paragraph> </Section> </Section> <Section position="8" start_page="447" end_page="448" type="metho"> <SectionTitle> 7. Practical Experience </SectionTitle> <Paragraph position="0"> There is no generally agreed-upon method for measuring the efficiency of parsers for grammars of the kind we assume here, i.e., constraint-based grammars for natural language understanding. Therefore, I will present the results of the parser for the current version of the OVIS grammar in comparison with a number of other parsers that have been developed in the same project (by my colleagues and myself). Moreover, a similar experiment was performed with two other grammars: the English MiMo2 grammar (van Noord et al. 1991) and the English Alvey NL Tools grammar (Grover, Carroll, and Briscoe 1993). 6 It should be clear that the results to be presented should not be taken as a formal evaluation, but are presented solely to give an impression of the practical feasibility of the parser, at least for its present purpose. The following results should be understood with these reservations in mind.</Paragraph> <Section position="1" start_page="447" end_page="448" type="sub_section"> <SectionTitle> 7.1 Other Parsers </SectionTitle> <Paragraph position="0"> The head-corner parser was compared with a number of other parsers. The parsers are described in further detail in van Noord, Bouma, Koeling, and Nederhof (1996) and van Noord, Nederhof, Koeling, and Bouma (1996). The last two parsers of the following list were implemented by Mark-Jan Nederhof.</Paragraph> <Paragraph position="1"> 6 The material used to perform the experiments with the MiMo2 grammar and the Alvey NL Tools grammar, including several versions of the head-corner parser, is available via anonymous ftp at ftp://ftp.let.rug.nl/pub/prolog-app/CL97/ and on the World Wide Web at http://www.let.rug.nl/~vannoord/CL97/. The material is ready to be plugged into the Hdrug environment available from the same site.</Paragraph> <Paragraph position="2"> * lc. Left-corner parser. This parser is derived from the head-corner parser. It therefore uses many of the ideas presented above. Most importantly, it uses selective memoization with goal weakening and packing. The parser is closely related to the BUP parser (Matsumoto et al. 1983).</Paragraph> <Paragraph position="3"> * bu-inactive. Inactive chart parser. This is a bottom-up parser that records only inactive edges. It uses packing. It uses a precompiled version of the grammar in which no empty productions are present.</Paragraph> <Paragraph position="4"> * bu-earley. Bottom-up Earley parser. This is a bottom-up chart parser that records both active and inactive items. It operates in two phases and uses packing. It uses a precompiled version of the grammar in which no empty productions are present.</Paragraph> <Paragraph position="5"> * bu-active. Bottom-up Earley parser without packing. This is a chart parser that constructs only active items (except for categories that unify with the top category). It uses a precompiled version of the grammar in which no empty productions are present.</Paragraph> <Paragraph position="6"> * lr. LR parser. This is an experimental implementation of a generalization for Definite Clause Grammars of the parser described in Nederhof and Satta (1996). It proceeds in a single phase and does not use packing. It uses a table to maintain partial analyses. It was not possible to perform all the experiments with this parser due to memory problems during the construction of the LR table.</Paragraph> <Paragraph position="7"> Note that we have experimented with a number of different versions of each of these parsers. We will report only on the most efficient version. The experiments were performed on a 125 MHz HP-UX 735 machine with 240 megabytes of memory. Timings measure CPU-time and should be independent of the load on the machine. 7</Paragraph> </Section> <Section position="2" start_page="448" end_page="448" type="sub_section"> <SectionTitle> 7.2 Experiment 1: OVIS </SectionTitle> <Paragraph position="0"> The OVIS grammar (for Dutch) contains about 1,400 lexical entries (many of which are station and city names) and 66 rules (a substantial fraction of which are concerned with time and date expressions), including 7 epsilon rules. The most important epsilon rule is part of a gap-threading implementation of verb-second. The grammar is documented in detail in van Noord, Nederhof, Koeling, and Bouma (1996). The head-corner table contains 128 pairs, the lexical head-corner table 93 pairs, and the gap-head-corner table 14 pairs. The left-corner table contains 156 pairs, the lexical left-corner table 114 pairs, and the gap-left-corner table 20 pairs. The precompiled grammar, which is used by the chart parsers, contains 92 rules.</Paragraph> <Paragraph position="1"> The input for the parser consists of a test set of 5,000 word-graphs, randomly taken from a corpus of more than 25,000 word-graphs. These word-graphs are the latest word-graphs that were available to us; they are &quot;real&quot; output of the current version of the speech recognizer as developed by our project partners. In this application, typical</Paragraph> </Section> </Section> </Paper>