File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/99/e99-1022_metho.xml
Size: 13,305 bytes
Last Modified: 2025-10-06 14:15:20
<?xml version="1.0" standalone="yes"?> <Paper uid="E99-1022"> <Title>Selective Magic HPSG Parsing</Title> <Section position="3" start_page="167" end_page="169" type="metho"> <SectionTitle> 3 Selective Magic HPSG Parsing </SectionTitle> <Paragraph position="0"> In the case of large grammars, the huge space requirements of dynamic processing often nullify the benefit of tabling intermediate results. By combining control strategies and allowing the user to specify how to process particular constraints in the grammar, the selective magic parser avoids this problem. This solution is based on the observation that there are sub-computations that are relatively cheap and as a result do not need tabling (Johnson and Dörre, 1995; van Noord, 1997).</Paragraph> <Section position="1" start_page="167" end_page="168" type="sub_section"> <SectionTitle> 3.1 Parse Type Specification </SectionTitle> <Paragraph position="0"> Combining control strategies depends on a way to differentiate between types of constraints. For example, the ALE parser (Carpenter and Penn, 1994) presupposes a phrase structure backbone which can be used to determine whether a constraint is to be interpreted bottom-up or top-down. In the case of selective magic parsing we use so-called parse types which allow the user to specify how constraints in the grammar are to be interpreted. A literal (goal) is considered a parse type literal (goal) if it has as its single argument a typed feature structure of a type specified as a parse type. 10 All types in the type hierarchy can be used as parse types. This way parse type specification supports a flexible filtering component which allows us to experiment with the role of filtering. However, in the remainder we will concentrate on a specific class of parse types: We assume the specification of type sign and its sub-types as parse types. 
11 This choice is based on the observation that the constraints on type sign and its sub-types play an important guiding role in the parsing process and are best interpreted bottom-up given the lexical orientation of HPSG. The parsing process corresponding to such a parse type specification is represented schematically in [Figure: ... magic parsing process]. [...] the TFL definite clauses that specify the word objects in the grammar, phrases are built bottom-up by matching the parse type literals of the definite clauses in the grammar against the edges in the table. The non-parse type literals are processed according to the top-down control strategy described in section 3.3.</Paragraph> <Paragraph position="1"> 10 The notion of a parse type literal is closely related to that of a memo literal as in (Johnson and Dörre, 1995). 11 When a type is specified as a parse type, all its sub-types are considered as parse types as well. This is necessary as otherwise there may exist magic variants of definite clauses defining a parse type goal for which no magic facts can be derived, which means that the magic literal of these clauses can be interpreted neither top-down nor bottom-up.</Paragraph> </Section> <Section position="2" start_page="168" end_page="168" type="sub_section"> <SectionTitle> 3.2 Selective Magic Compilation </SectionTitle> <Paragraph position="0"> In order to process parse type goals according to a semi-naive magic control strategy, we apply magic compilation selectively. Only the TFL definite clauses in a typed feature grammar which define parse type goals are subject to magic compilation.</Paragraph> <Paragraph position="1"> The compilation applied to these clauses is identical to the magic compilation illustrated in section 2.1 except that we derive magic rules only for the right-hand side literals in a clause which are of a parse type. 
The definite clauses in the grammar defining non-parse type goals are not compiled as they will be processed using the top-down interpreter described in the next section.</Paragraph> </Section> <Section position="3" start_page="168" end_page="169" type="sub_section"> <SectionTitle> 3.3 Advanced Top-down Control </SectionTitle> <Paragraph position="0"> Non-parse type goals are interpreted using the standard interpreter of the ConTroll grammar development system (Götz and Meurers, 1997b) as developed and implemented by Thilo Götz. This advanced top-down interpreter uses a search function that allows the user to specify the information on which the definite clauses in the grammar are indexed. An important advantage of deep multiple indexing is that the linguist does not have to take processing criteria into account with respect to the organization of her/his data, as is the case with a standard Prolog search function which indexes on the functor of the first argument.</Paragraph> <Paragraph position="1"> Another important feature of the top-down interpreter is its use of a selection function that interprets deterministic goals, i.e., goals which unify with the left-hand side literal of exactly one definite clause in the grammar, prior to non-deterministic goals. This is often referred to as incorporating deterministic closure (Dörre, 1993).</Paragraph> <Paragraph position="2"> Deterministic closure reduces the number of choice points that need to be set during processing to a minimum. Furthermore, it leads to earlier failure detection.</Paragraph> <Paragraph position="3"> Finally, the top-down interpreter used implements a powerful coroutining mechanism: 12 At run time the processing of a goal is postponed in case it is insufficiently instantiated. Whether or not a goal is sufficiently instantiated is determined on the basis of so-called delay patterns. 
13 These are specifications provided by the user that indicate which restricting information has to be available before a goal is processed.</Paragraph> <Paragraph position="4"> 12 Coroutining appears under many different guises, such as suspension, residuation, (goal) freezing, and blocking. See also (Colmerauer, 1982; Naish, 1986). 13 In the literature delay patterns are sometimes also referred to as wait declarations or block statements.</Paragraph> </Section> <Section position="4" start_page="169" end_page="169" type="sub_section"> <SectionTitle> 3.4 Adapted Semi-naive Bottom-up Interpretation </SectionTitle> <Paragraph position="0"> The definite clauses resulting from selective magic transformation are interpreted using a semi-naive bottom-up interpreter that is adapted in two respects. It ensures that non-parse type goals are interpreted using the advanced top-down interpreter, and it allows non-parse type goals that remain delayed locally to be passed in and out of sub-computations in a similar fashion as proposed by (Johnson and Dörre, 1995). In order to accommodate these changes the adapted semi-naive interpreter enables the use of edges which specify delayed goals.</Paragraph> <Paragraph position="1"> [...] definite clause under consideration to the advanced top-down interpreter via the call to advanced_td_interpret/2 as the list of goals TopDown. 14 The second defining clause of match/3 is added to ensure all right-hand side literals are directly passed to the advanced top-down interpreter if none of them are of a parse type.</Paragraph> <Paragraph position="2"> Allowing edges which specify delayed goals necessitates the adaptation of the definition of edges/3. When a parse type literal is matched against an edge in the table, the delayed goals specified by that edge need to be passed to the top-down interpreter. Consider the definition of the predicate edges in figure 11. The third argument of the definition of edges/4 is used to collect delayed goals. 
When there are no more parse type literals in the right-hand side of the definite clause under consideration, the second defining clause of edges/4 appends the collected delayed goals to the remaining non-parse type literals. Subsequently, the resulting list of literals is passed up again for advanced top-down interpretation.</Paragraph> <Paragraph position="3"> edges([Lit|Lits],Table,Delayed0,TopDown) :- parse_type(Lit), member(edge(Lit,Delayed1),Table), append(Delayed0,Delayed1,Delayed), edges(Lits,Table,Delayed,TopDown).</Paragraph> <Paragraph position="4"> edges(Lits,_,Delayed,TopDown) :- append(Delayed,Lits,TopDown).</Paragraph> <Paragraph position="5"> Figure 11: Adapted definition of edges/4. 14 The definition of match/3 assumes that there exists a strict ordering of the right-hand side literals in the definite clauses in the grammar, i.e., parse type literals always precede non-parse type literals.</Paragraph> </Section> </Section> <Section position="4" start_page="169" end_page="169" type="metho"> <SectionTitle> 4 Implementation </SectionTitle> <Paragraph position="0"> The described parser was implemented as part of the ConTroll grammar development system (Götz and Meurers, 1997b). Figure 10 shows the overall setup of the ConTroll magic component. The ConTroll magic component presupposes a parse type specification and a set of delay patterns to determine when non-parse type constraints are to be interpreted. At run-time the goal-directedness of the selective magic parser is further increased by using the phonology of the natural language expression to be parsed, as specified by the initial goal, to restrict the number of facts that are added to the table during initialization. 
Only those facts in the grammar corresponding to lexical entries that have a value for their phonology feature that appears as part of the input string are used to initialize the table.</Paragraph> <Paragraph position="1"> The ConTroll magic component was tested with a larger (> 5000 lines) HPSG grammar of a sizeable fragment of German. This grammar provides an analysis for simple and complex verb-second, verb-first and verb-last sentences with scrambling in the mittelfeld, extraposition phenomena, wh-movement and topicalization, integrated verb-first parentheticals, and an interface to an illocution theory, as well as the three kinds of infinitive constructions, nominal phrases, and adverbials (Hinrichs et al., 1997).</Paragraph> <Paragraph position="2"> As the test grammar combines sub-strings in a non-concatenative fashion, a preprocessor is used that chunks the input string into linearization domains. This way the standard ConTroll interpreter (as described in section 3.3) achieves parsing times of around 1-5 seconds for 5 word sentences and 10-60 seconds for 12 word sentences. 15 The use of magic compilation on all grammar constraints, i.e., tabling of all sub-computations, [...] 15 Parsing with such a grammar is difficult in any system as it neither has nor allows the extraction of a phrase structure backbone.</Paragraph> <Paragraph position="3"> [...] significant speedup in many cases. For example, parsing with the module of the grammar implementing the analysis of nominal phrases is up to nine times faster. At the same time, though, selective magic HPSG parsing is sometimes significantly slower. For example, parsing of particular sentences exhibiting adverbial subordinate clauses and long extraction is sometimes more than nine times slower. 
We conjecture that these ambiguous results are due to the use of coroutining: As the test grammar was implemented using the standard ConTroll interpreter, the delay patterns used presuppose a data-flow corresponding to advanced top-down control and are not fine-tuned with respect to the data-flow corresponding to the selective magic parser.</Paragraph> <Paragraph position="4"> Coroutining is a flexible and powerful facility used in many grammar development systems and it will probably remain indispensable in dealing with many control problems despite its various disadvantages. 16 The test results discussed above indicate that the comparison of parsing strategies can be seriously hampered by fine-tuning parsing using delay patterns. We believe therefore that further research into the systematics underlying coroutining would be desirable.</Paragraph> </Section> <Section position="5" start_page="169" end_page="171" type="metho"> <SectionTitle> 5 Concluding Remarks </SectionTitle> <Paragraph position="0"> We described a selective magic parser for typed feature grammars implementing HPSG that combines the advantages of dynamic bottom-up and advanced top-down control. As a result the parser avoids the efficiency problems resulting from the huge space requirements of storing intermediate results in parsing with large grammars. The parser allows the user to apply magic compilation to specific constraints in a grammar which as a result can be processed dynamically in a bottom-up and goal-directed fashion. 16 Coroutining has a significant run-time overhead caused by the necessity to check the instantiation status of a literal/goal. In addition, it demands the procedural annotation of an otherwise declarative grammar. Finally, coroutining presupposes that a grammar writer possesses substantial processing expertise. Proceedings of EACL '99</Paragraph> <Paragraph position="1"> State of the art top-down processing techniques are used to deal with the remaining constraints. 
We discussed various aspects concerning the implementation of the parser which was developed as part of the grammar development system ConTroll.</Paragraph> </Section> </Paper>