<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2127"> <Title>Bottom-Up Earley Deduction</Title> <Section position="3" start_page="796" end_page="799" type="metho"> <SectionTitle> 2 Bottom-up Earley Deduction </SectionTitle> <Paragraph position="0"> Earley deduction \[10\] is based on grammars encoded as definite clauses. The instantiation (prediction) rule of top-down Earley deduction is not needed in bottom-up Earley deduction, because there is no prediction. There is only one inference rule, namely the reduction rule (1)3 In (1), X, G and G t are literals, ~ is a (possibly empty) sequence of literals, and a is the most general unifier of G and G'. The leftmost literal in the. lmdy of a non:unit clause is Mways the selected literal.</Paragraph> <Paragraph position="2"> In 1)riuciple, this rule can be applied to any pair of unit clanses and non:unit clauses of the program to derive any consequences of the pro: gram. In order to reduce this search space and achieve a more goal-directed behaviour, the rule is not applied to any pair of clauses, but clauses are on\]y selected if they can contribute to a proof of the goal. The set of selected clauses is (;ailed the chart. 3 The selection of clauses is guided by a scanning step (section 2.1) an(l indexing of clauses (section 2.2).</Paragraph> <Section position="1" start_page="796" end_page="796" type="sub_section"> <SectionTitle> 2.1 Scanning </SectionTitle> <Paragraph position="0"> The purpose of the scanning step, whic:h corresponds to lexical lookup in chart parsers, is to look up base cases of recursive definitions to serve as a starting point for bottom-up processing. The scanning step selects clauses that can appear as leaves in the proof tree lbr a given goal C.</Paragraph> <Paragraph position="1"> Consider the following simple definition of an HPSG, with the recursive definition of the predicate sign/I. 4 2This rule is called combine by Earley, and is also referred to as the flmdamental rule in the literature on chart parsing.</Paragraph> <Paragraph position="2"> aThc chart differs from the state of \[10\] in that clauses in the chart arc indexed (cf. section 2.2).</Paragraph> <Paragraph position="3"> 4 We use feature terms in dcfinitc clauses in addition to Prolog terms, f:X denotes a feature structure where X is the value of h:ature f, and X ~ Y denotes the conjunction</Paragraph> <Paragraph position="5"> sequence_union(CD_Ph,HD_Ph,X_Ph).</Paragraph> <Paragraph position="6"> The predicate sign/1 is defined recursively, and the base case is the predicate lexical_sign/1. But, clearly it is not restrictive enough to find only the predicate name of the base case for a given goal. The base cases must also be instantiated in order to find those that are useful for proving a given goal. In the case of parsing, the lookup of base cases (lexical items) will depend on the words that are present in the input string. 
<Section position="1" start_page="796" end_page="796" type="sub_section"> <SectionTitle> 2.1 Scanning </SectionTitle> <Paragraph position="0"> The purpose of the scanning step, which corresponds to lexical lookup in chart parsers, is to look up base cases of recursive definitions to serve as a starting point for bottom-up processing. The scanning step selects clauses that can appear as leaves in the proof tree for a given goal G.</Paragraph> <Paragraph position="1"> Consider the following simple definition of an HPSG, with the recursive definition of the predicate sign/1.4</Paragraph> <Paragraph position="2">

    sequence_union(CD_Ph,HD_Ph,X_Ph).

</Paragraph> <Paragraph position="3"> 4 We use feature terms in definite clauses in addition to Prolog terms. f:X denotes a feature structure where X is the value of feature f, and X & Y denotes the conjunction of the feature terms X and Y.</Paragraph> <Paragraph position="4"> The predicate sign/1 is defined recursively, and the base case is the predicate lexical_sign/1. But it is clearly not restrictive enough to find only the predicate name of the base case for a given goal. The base cases must also be instantiated in order to find those that are useful for proving a given goal. In the case of parsing, the lookup of base cases (lexical items) will depend on the words that are present in the input string. This is implied by the first goal of the predicate principles/3, the constituent order principle, which determines how the PHON value of a constituent is constructed from the PHON values of its daughters.</Paragraph> <Paragraph position="5"> In general, we assume that the constituent order principle makes use of a linear and non-erasing operation for combining strings.5 If this is the case, then all the words contained in the PHON value of the goal can have their lexical items selected as unit clauses to start bottom-up processing.</Paragraph> <Paragraph position="6"> For generation, an analogous condition on logical forms has been proposed by Shieber [13] as the "semantic monotonicity condition," which requires that the logical form of every base case must subsume some portion of the goal's logical form.</Paragraph> <Paragraph position="7"> Base case lookup must be defined specifically for different grammatical theories and directions of processing by the predicate lookup/2, whose first argument is the goal and whose second argument is the selected base case. The following clause defines the lookup relation for parsing with HPSG.</Paragraph> <Paragraph position="8"> 5 There is an obvious connection to Linear Context-Free Rewriting Systems (LCFRS) [15, 16].</Paragraph> <Paragraph position="9">

    lexicon(Word,X).

</Paragraph> <Paragraph position="10"> Note that the base case clauses can become further instantiated in this step. If concatenation (of difference lists) is used as the operation on strings, then each base case clause can be instantiated with the string that follows it. This avoids the combination of items that are not adjacent in the input string.</Paragraph> <Paragraph position="11">

    lookup(phon:PhonList,
           lexical_sign(phon:[Word|Suf]-Suf
                        & synsem:Synsem)) :-
        append(_, [Word|Suf], PhonList),
        lexicon(Word, Synsem).

</Paragraph> <Paragraph position="12"> In bottom-up Earley deduction, the first step towards proving a goal is to perform lookup for the goal and to add all the resulting (unit) clauses to the chart. Also, all non-unit clauses of the program, which can appear as internal nodes in the proof tree of the goal, are added to the chart.</Paragraph> <Paragraph position="13"> The scanning step achieves a certain degree of goal-directedness for bottom-up algorithms because only those clauses which can appear as leaves in the proof tree of the goal are added to the chart.</Paragraph>
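<Paragraph position="14"> The lookup clause given above can be run as follows under standard Prolog operator priorities; the operator declaration for &, the toy lexicon entries, and the extra parentheses around the difference list (which ensure the intended parse of the PHON value) are illustrative assumptions, not part of the paper.

    :- use_module(library(lists)).   % append/3
    :- op(300, xfy, &).              % feature-term conjunction (cf. footnote 4)

    lexicon(john,   np).             % invented toy entries
    lexicon(sleeps, vp).

    % The lookup clause from above, repeated with explicit parentheses
    % so that the PHON value parses as the difference list [Word|Suf]-Suf.
    lookup(phon:PhonList,
           lexical_sign(phon:([Word|Suf]-Suf) & synsem:Synsem)) :-
        append(_, [Word|Suf], PhonList),
        lexicon(Word, Synsem).

The query lookup(phon:[john,sleeps], Item) returns one item per word, each instantiated with the string that follows it: Item = lexical_sign(phon:([john,sleeps]-[sleeps]) & synsem:np) and Item = lexical_sign(phon:([sleeps]-[]) & synsem:vp).</Paragraph>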
</Section> <Section position="3" start_page="797" end_page="798" type="sub_section"> <SectionTitle> 2.2 Indexing </SectionTitle> <Paragraph position="0"> An item in normal context-free chart parsing can be regarded as a pair (R,S) consisting of a dotted rule R and the substring S that the item covers (a pair of starting and ending positions). The fundamental rule of chart parsing makes use of these string positions to ensure that only adjacent substrings are combined and that the result is the concatenation of the substrings.</Paragraph> <Paragraph position="1"> In grammar formalisms like DCG or HPSG, the complex nonterminals have an argument or a feature (PHON) that represents the covered substring explicitly. The combination of the substrings is explicit in the rules of the grammar. As a consequence, Earley deduction does not need to make use of string positions for its clauses, as Pereira and Warren [10] point out.</Paragraph> <Paragraph position="2"> Moreover, the use of string positions known from chart parsing is too inflexible because it allows only the concatenation of adjacent contiguous substrings. In linguistic theory, the interest has shifted from phrase structure rules that combine adjacent and contiguous constituents to
* principle-based approaches to grammar that state general well-formedness conditions instead of describing particular constructions (e.g. HPSG)
* operations on strings that go beyond concatenation (head wrapping [11], tree adjoining [15], sequence union [12]).</Paragraph> <Paragraph position="3"> The string positions known from chart parsing are also inadequate for generation, as pointed out by Shieber [13], in whose generator all items go from position 0 to 0, so that any item can be combined with any item.</Paragraph> <Paragraph position="4"> However, the string positions are useful as an indexing of the items, so that it can easily be detected whether their combination can contribute to a proof of the goal. This is especially important for a bottom-up algorithm, which is not goal-directed like top-down processing. Without indexing, there are too many combinations of items which are useless for a proof of the goal; in fact, there may be infinitely many items, so that termination problems can arise.</Paragraph> <Paragraph position="5"> For example, in an order-monotonic grammar formalism that uses sequence union as the operation for combining strings, a combination of items is useless if it results in a sign in which the words are not in the same order as in the input string [14].</Paragraph> <Paragraph position="6"> We generalize the indexing scheme from chart parsing in order to allow different operations for the combination of strings. Indexing improves efficiency by detecting combinations that would fail anyway and by avoiding combinations of items that are useless for a proof of the goal. We define an item as a pair of a clause Cl and an index Idx, written as ⟨Cl, Idx⟩.</Paragraph> <Paragraph position="7"> Below, we give some examples of possible indexing schemes. Other indexing schemes can be used if they are needed.
1. Non-reuse of Items: This is useful for LCFRS, where no word of the input string can be used twice in a proof, or for generation, where no part of the goal logical form should be verbalized twice in a derivation.
...
5. Free Items: These can be used several times in a proof, for example the non-unit clauses of the program, which are represented as items of the form ⟨X ← G1 ∧ ... ∧ Gn, free⟩.</Paragraph> <Paragraph position="8"> The following table summarizes the properties of these five combination schemes. Index 1 (I1) is the index associated with the non-unit clause, Index 2 (I2) is associated with the unit clause, and I is the index of the derived item.

         I1       I2       I
    ...
    4.   X-Y      Y-Z      X-Z
    5.   free     ...      ...

</Paragraph> <Paragraph position="9"> In case 2 ("non-adjacent combination"), the indices X and Y consist of a set of string positions, and the combination operation is the union of these string positions, provided that no two string positions from X and Y overlap.</Paragraph> <Paragraph position="10"> In (2), the reduction rule is augmented to handle indices; I1 · I2 denotes the combination of the indices I1 and I2.

    ⟨X ← G ∧ Γ, I1⟩        ⟨G′ ←, I2⟩
    ----------------------------------  (2)   where σ = mgu(G, G′)
    ⟨σ(X ← Γ), I1 · I2⟩

</Paragraph>
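<Paragraph position="11"> As an illustration, the two schemes that are fully described above can be sketched in Prolog as follows; the representations span(From,To) for scheme 4 (string positions as in a chart parser) and set(Spans) for scheme 2, as well as the name combine_index/3, are assumptions for this sketch. A combination "exists" exactly when the call succeeds.

    :- use_module(library(lists)).   % member/2, append/3

    % Scheme 4 (adjacent combination): X-Y combined with Y-Z gives X-Z.
    combine_index(span(X,Y), span(Y,Z), span(X,Z)).

    % Scheme 2 (non-adjacent combination): the union of two sets of
    % string positions, provided no two positions from X and Y overlap.
    combine_index(set(Xs), set(Ys), set(Zs)) :-
        \+ (member(A, Xs), member(B, Ys), overlap(A, B)),
        append(Xs, Ys, Zs0),
        msort(Zs0, Zs).              % canonical order for the result index

    % Half-open integer spans [F1,T1) and [F2,T2) overlap iff each
    % starts before the other ends.
    overlap(span(F1,T1), span(F2,T2)) :- F1 < T2, F2 < T1.

For instance, combine_index(set([span(0,1)]), set([span(2,3)]), I) succeeds with I = set([span(0,1), span(2,3)]), while combining overlapping spans fails.</Paragraph>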
<Paragraph position="12"> With the use of indices, the lookup relation becomes a relation between goals and items. The following specification of the lookup relation provides indexing according to string positions as in a chart parser (usable for combination schemes 2, 3, and 4).</Paragraph> <Paragraph position="13"/> </Section> <Section position="4" start_page="798" end_page="799" type="sub_section"> <SectionTitle> 2.3 Goal Types </SectionTitle> <Paragraph position="0"> In constraint-based grammars there are some predicates that are not adequately dealt with by bottom-up Earley deduction, for example the Head Feature Principle and the Subcategorization Principle of HPSG. The Head Feature Principle just unifies two variables, so that it can be executed at compile time and need not be called as a goal at runtime. The Subcategorization Principle involves an operation on lists (append/3 or delete/3 in different formalizations) that does not need bottom-up processing, but can better be evaluated by top-down resolution if its arguments are sufficiently instantiated. Creating and managing items for these proofs is too much of a computational overhead, and, moreover, a proof may not terminate in the bottom-up case because infinitely many consequences may be derived from the base case of a recursively defined relation.</Paragraph> <Paragraph position="1"> In order to deal with such goals, we associate the goals in the body of a clause with goal types. The goals that are relevant for bottom-up Earley deduction are called waiting goals because they wait until they are activated by a unit clause that unifies with the goal.6 Whenever a unit clause is combined with a non-unit clause, all goals up to the first waiting goal of the resulting clause are proved according to their goal type, and then a new clause is added whose selected goal is the first waiting goal.</Paragraph> <Paragraph position="2"> In the following inference rule (3) for clauses with mixed goal types, Σ is a (possibly empty) sequence of goals without any waiting goals, and Ψ is a (possibly empty) sequence of goals starting with a waiting goal. σ is the most general unifier of G and G′, and the substitution τ is the solution which results from proving the sequence of goals σ(Σ).

    ⟨X ← G ∧ Σ ∧ Ψ, I1⟩        ⟨G′ ←, I2⟩
    --------------------------------------  (3)   where σ = mgu(G, G′)
    ⟨τσ(X ← Ψ), I1 · I2⟩

</Paragraph>
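<Paragraph position="3"> A minimal sketch of this mixed processing, assuming body goals are tagged prolog(G) (executed directly) or waiting(G); the tags and prove_prefix/2 are illustrative names, not GeLD's actual goal-type syntax.

    % Prove goals up to the first waiting goal; the remainder starts
    % with the new selected (waiting) goal.
    prove_prefix([], []).
    prove_prefix([waiting(G)|Rest], [waiting(G)|Rest]).
    prove_prefix([prolog(G)|Rest], Remainder) :-
        call(G),              % direct execution; its bindings act as tau
        prove_prefix(Rest, Remainder).

For example, prove_prefix([prolog(append([a],[b],X)), waiting(sign(X))], R) executes the append/3 goal and stops at the waiting goal, yielding R = [waiting(sign([a,b]))].</Paragraph>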
</Section> <Section position="5" start_page="799" end_page="799" type="sub_section"> <SectionTitle> 2.4 Correctness and Completeness </SectionTitle> <Paragraph position="0"> In order to show the correctness of the system, we must show that the scanning step only adds consequences of the program to the chart, and that any items derived by the inference rule are consequences of the program clauses. The former is easy to show because all clauses added by the scanning step are instances of program clauses, and the inference rule performs a resolution step, whose correctness is well known in logic programming. The other goal types are also proved by resolution.</Paragraph> <Paragraph position="1"> There are two potential sources of incompleteness in the algorithm. One is that the scanning step may not add all the program clauses to the chart that are needed for proving a goal; the other is that the indexing may prevent the derivation of a clause that is needed to prove the goal. In order to avoid incompleteness, the scanning step must add all program clauses that are needed for a proof of the goal to the chart, and the combination of indices may only fail for inference steps which are useless for a proof of the goal. That the lookup relation and the indexing scheme satisfy this property must be shown for particular grammar formalisms.</Paragraph> <Paragraph position="2"> In order to keep the search space small (and finite, to ensure termination), the scanning step should (ideally) add only those items to the chart that are needed for proving the goal, and the indexing should be chosen in such a way that it excludes derived items that are useless for a proof of the goal.</Paragraph> <Paragraph position="3"> 6 Other goal types include top-down goals (which are proved by top-down resolution with depth-first search), x-corner goals (which combine bottom-up and top-down processing like left-corner or head-corner algorithms), Prolog goals (which are directly executed by Prolog for efficiency or side-effects), and chart goals, which create a new, independent chart for the proof of the goal. Dörre [3] proposes a system with two goal types, namely trigger goals, which lead to the creation of items, and other goals, which don't.</Paragraph> </Section> </Section> <Section position="4" start_page="799" end_page="800" type="metho"> <SectionTitle> 3 Best-First Search </SectionTitle> <Paragraph position="0"> For practical NL applications, it is desirable to have a best-first search strategy which follows the most promising paths in the search space first and finds preferred solutions before the less preferred ones.</Paragraph> <Paragraph position="1"> There are often situations where the criteria to guide the search are available only for the base cases, for example
* weighted word hypotheses from a speech recognizer
* readings for ambiguous words with probabilities, possibly assigned by a stochastic tagger (cf. [2])
* hypotheses for the correction of string errors, which should be delayed [5].</Paragraph> <Paragraph position="2"> Goals and clauses are associated with preference values that are intended to model the degree of confidence that a particular solution is the 'correct' one. Unit clauses are associated with a numerical preference value, and non-unit clauses with a formula that determines how the clause's preference value is computed from the preference values of the goals in the body of the clause. Preference values can (but need not) be interpreted as probabilities.7</Paragraph> <Paragraph position="3"> The preference values are the basis for assigning priorities to items. For unit clauses, the priority is identified with the preference value. For non-unit clauses, where the preference formula may contain uninstantiated variables, the priority is the value of the formula with the free variables instantiated to the highest possible preference value (in the case of an interpretation as probabilities: 1), so that the priority is equal to the maximal possible preference value for the clause.8</Paragraph> <Paragraph position="4"> The implementation of best-first search does not combine new items with the chart immediately, but makes use of an agenda [8], on which new items are ordered by descending priority.</Paragraph>
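<Paragraph position="5"> A small sketch of such an agenda (an assumption for illustration: items as Priority-Item pairs in a list kept sorted by descending priority; a realistic implementation would use a proper priority queue such as a heap).

    % Insert an item into the agenda, keeping descending priority order.
    add_to_agenda(P-I, [], [P-I]).
    add_to_agenda(P-I, [P1-I1|Rest], [P-I, P1-I1|Rest]) :-
        P >= P1.
    add_to_agenda(P-I, [P1-I1|Rest0], [P1-I1|Rest]) :-
        P < P1,
        add_to_agenda(P-I, Rest0, Rest).

    % The item with the highest priority is the head of the agenda.
    next_item([Item|Agenda], Item, Agenda).

Adding items with priorities 0.7, 0.9 and 0.4 to an empty agenda, in that order, yields an agenda of the form [0.9-_, 0.7-_, 0.4-_].</Paragraph>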
<Paragraph position="6"> The following is the algorithm for bottom-up best-first Earley deduction.</Paragraph> <Paragraph position="7">

    procedure prove(Goal):
      - initialize-agenda(Goal)
      - consume-agenda
      - for any item ⟨G, I⟩ in the chart,
        return mgu(Goal, G) as a solution if it exists

    procedure initialize-agenda(Goal):
      - for every unit clause UC in lookup(Goal, UC)
        - create the index I for UC
        - add item ⟨UC, I⟩ to agenda
      - for every non-unit program clause H ← Body
        - add item ⟨H ← Body, free⟩ to agenda

    procedure add item I to agenda:
      - compute the priority of I
      - agenda := agenda ∪ {I}

    procedure consume-agenda:
      - while agenda is not empty
        - remove item I with highest priority from agenda
        - add item I to chart

    procedure add item ⟨C, I1⟩ to chart:
      - chart := chart ∪ {⟨C, I1⟩}
      - if C is a unit clause
        - for all items ⟨H ← G ∧ Σ ∧ Ψ, I2⟩ in the chart
          - if I = I2 · I1 exists and σ = mgu(C, G) exists
            and the goals σ(Σ) are provable with solution τ,
            then add item ⟨τσ(H ← Ψ), I⟩ to agenda
      - if C = H ← G ∧ Σ ∧ Ψ is a non-unit clause
        - for all items ⟨G′ ←, I2⟩ in the chart
          - if I = I1 · I2 exists and σ = mgu(G′, G) exists
            and the goals σ(Σ) are provable with solution τ,
            then add item ⟨τσ(H ← Ψ), I⟩ to agenda

</Paragraph> <Paragraph position="8"> The algorithm is parametrized with respect to the relation lookup/2 and the choice of the indexing scheme, which are specific for different grammatical theories and directions of processing.</Paragraph> <Paragraph position="9"> 8 There are also other methods for assigning priorities to items.</Paragraph> </Section> <Section position="5" start_page="800" end_page="800" type="metho"> <SectionTitle> 4 Implementation </SectionTitle> <Paragraph position="0"> The bottom-up Earley deduction algorithm described here has been implemented in Quintus Prolog as part of the GeLD system. GeLD (Generalized Linguistic Deduction) is an extension of Prolog which provides typed feature descriptions and preference values as additions to the expressivity of the language, and partial evaluation, top-down, head-driven, and bottom-up Earley deduction as processing strategies. Tests of the system with small grammars have shown promising results, and a medium-scale HPSG for German is presently being implemented in GeLD. The lookup relation and the choice of an indexing scheme must be specified by the user of the system.</Paragraph> </Section> </Paper>