<?xml version="1.0" standalone="yes"?>
<Paper uid="P95-1036">
  <Title>Some Novel Applications of Explanation-Based Learning to Parsing Lexicalized Tree-Adjoining Grammars</Title>
  <Section position="4" start_page="268" end_page="271" type="metho">
    <SectionTitle>
3 Overview of our approach to using
EBL
</SectionTitle>
    <Paragraph position="0"> We are pursuing the EBL approach in the context of a wide-coverage grammar development system called XTAG (Doran et al., 1994). The XTAG system consists of a morphological analyzer, a part-of-speech tagger, a wide-coverage LTAG English grammar, a predictive left-to-right Earley-style parser for LTAG (Schabes, 1990) and an X-windows interface for grammar development (Paroubek et al., 1992).</Paragraph>
    <Paragraph position="1"> Figure 3 shows a flowchart of the XTAG system.</Paragraph>
    <Paragraph position="2"> The input sentence is subjected to morphological analysis and is part-of-speech tagged before being sent to the parser. The parser retrieves the elementary trees that the words of the sentence anchor and combines them by adjunction and substitution operations to derive a parse of the sentence.</Paragraph>
    <Paragraph position="3"> Given this context, the training phase of the EBL process involves generalizing the derivation trees generated by XTAG for a training sentence and storing these generalized parses in the generalized parse (Footnote 2: There are some differences between derivation trees and conventional dependency trees. However, we will not discuss these differences in this paper as they are not relevant to the present work.)</Paragraph>
    <Paragraph position="5"> the sentence: show me the flights from Boston to Philadelphia.</Paragraph>
    <Section position="1" start_page="270" end_page="270" type="sub_section">
      <SectionTitle>
Input Sentence
</SectionTitle>
      <Paragraph position="0"/>
      <Paragraph position="2"> the EBL component database under an index computed from the morphological features of the sentence. The application phase of EBL is shown in the flowchart in Figure 4. An index using the morphological features of the words in the input sentence is computed. Using this index, a set of generalized parses is retrieved from the generalized parse database created in the training phase. If the retrieval fails to yield any generalized parse, then the input sentence is parsed using the full parser. However, if the retrieval succeeds, then the generalized parses are input to the &amp;quot;stapler&amp;quot;. Section 5 provides a description of the &amp;quot;stapler&amp;quot;.</Paragraph>
    </Section>
    <Section position="2" start_page="270" end_page="271" type="sub_section">
      <SectionTitle>
3.1 Implications of LTAG representation
for EBL
</SectionTitle>
      <Paragraph position="0"> An LTAG parse of a sentence can be seen as a sequence of elementary trees associated with the lexical items of the sentence along with substitution and adjunction links among the elementary trees. Also, the feature values in the feature structures of each node of every elementary tree are instantiated by the parsing process. Given an LTAG parse, the generalization of the parse is truly immediate in that a generalized parse is obtained by (a) uninstantiating the particular lexical items that anchor the individual elementary trees in the parse and (b) uninstantiating the feature values contributed by the morphology of the anchor and the derivation process. This type of generalization is called feature-generalization.</Paragraph>
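As a rough illustration, feature-generalization can be sketched as stripping the lexical anchor and the morphology-derived feature values from each elementary tree in the parse. The data layout, field names, and tree names below are our own assumptions for the sketch, not XTAG's actual structures.

```python
def generalize(parse):
    """Feature-generalization sketch: uninstantiate the lexical anchor
    and the feature values of each elementary tree in an LTAG parse.
    `parse` is a hypothetical list of dicts; field names are assumptions."""
    return [
        {"tree": node["tree"],
         "anchor": None,  # (a) uninstantiate the lexical anchor
         # (b) uninstantiate the feature values, keeping the feature names
         "features": {feat: None for feat in node["features"]}}
        for node in parse
    ]

# Hypothetical parse fragment; tree names are invented for illustration.
parse = [
    {"tree": "alpha_nx0Vnx1", "anchor": "show",
     "features": {"mode": "imp", "agr": "2"}},
    {"tree": "alpha_NXN", "anchor": "me", "features": {"case": "acc"}},
]
print(generalize(parse))
```

Because each elementary tree is already the unit of generalization, no tree-walking is needed: the transformation is a single pass over the trees of the parse.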
      <Paragraph position="1"> In other EBL approaches (Rayner, 1988; Neumann, 1994; Samuelsson, 1994) it is necessary to walk up and down the parse tree to determine the appropriate subtrees to generalize on and to suppress the feature values. In our approach, the process of generalization is immediate, once we have the output of the parser, since the elementary trees anchored by the words of the sentence define the sub-trees of the parse for generalization. Replacing the elementary trees with uninstantiated feature values is all that is needed to achieve this generalization.</Paragraph>
      <Paragraph position="2"> The generalized parse of a sentence is stored indexed on the part-of-speech (POS) sequence of the training sentence. In the application phase, the POS sequence of the input sentence is used to retrieve a generalized parse (or parses), which is then instantiated with the features of the sentence. This method of retrieving a generalized parse allows for parsing of sentences of the same length and the same POS sequence as those in the training corpus. However, in our approach there is another generalization that falls out of the LTAG representation which allows for flexible matching of the index, so that the system can parse sentences that are not necessarily of the same length as any sentence in the training corpus.</Paragraph>
      <Paragraph position="3"> Auxiliary trees in LTAG represent recursive structures. So if there is an auxiliary tree that is used in an LTAG parse, then that tree with the trees for its arguments can be repeated any number of times, or possibly omitted altogether, to get parses of sentences that differ from the sentences of the training corpus only in the number of modifiers. This type of generalization is called modifier-generalization. This type of generalization is not possible in other EBL approaches.</Paragraph>
      <Paragraph position="4"> This implies that the POS sequence covered by the auxiliary tree and its arguments can be repeated zero or more times. As a result, the index of a generalized parse of a sentence with modifiers is no longer a string but a regular expression pattern on the POS sequence and retrieval of a generalized parse involves regular expression pattern matching on the indices.</Paragraph>
      <Paragraph position="5"> If, for example, the training example was</Paragraph>
      <Paragraph position="7"> then, the index of this sentence is (2) VNDN(PN)* since the two prepositions in the parse of this sentence would anchor (the same) auxiliary trees.  The most efficient method of performing regular expression pattern matching is to construct a finite state machine for each of the stored patterns and then traverse the machine using the given test pattern. If the machine reaches the final state, then the test pattern matches one of the stored patterns. Given that the index of a test sentence matches one of the indices from the training phase, the generalized parse retrieved will be a parse of the test sentence, modulo the modifiers. For example, if the test sentence, tagged appropriately, is</Paragraph>
      <Paragraph position="9"> then, although the index of the test sentence matches the index of the training sentence, the generalized parse retrieved needs to be augmented to accommodate the additional modifier.</Paragraph>
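The lookup just described can be sketched as regular expression matching over POS-tag strings. The pattern below corresponds to index (2), V N D N (P N)*, from the text; the stored parse identifier is a placeholder, and a production system would compile all stored patterns into a single finite state machine rather than trying them one by one.

```python
import re

# Sketch of the generalized-parse database built in the training phase:
# each index is a regular expression over space-separated POS tags.
TRAINING_INDEX = {
    r"V N D N( P N)*": "generalized-parse-1",  # placeholder identifier
}

def lookup(pos_tags):
    """Return the stored generalized parse whose index matches the POS
    sequence of the test sentence, or None to fall back to the full parser."""
    key = " ".join(pos_tags)
    for pattern, parse in TRAINING_INDEX.items():
        if re.fullmatch(pattern, key):
            return parse
    return None

# A test sentence with one extra PP modifier still matches the index:
print(lookup(["V", "N", "D", "N", "P", "N", "P", "N"]))  # generalized-parse-1
```

The Kleene star on the (P N) group is exactly the modifier-generalization: zero, one, or several PP modifiers all retrieve the same generalized parse.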
      <Paragraph position="10"> To accommodate the additional modifiers that may be present in the test sentences, we need to provide a mechanism that assigns the additional modifiers and their arguments the following: 1. the elementary trees that they anchor, and 2. the substitution and adjunction links to the trees they substitute or adjoin into.</Paragraph>
      <Paragraph position="11"> We assume that the additional modifiers along with their arguments would be assigned the same elementary trees and the same substitution and adjunction links as were assigned to the modifier and its arguments of the training example. This, of course, means that we may not get all the possible attachments of the modifiers at this time (but see the discussion of the &amp;quot;stapler&amp;quot; in Section 5).</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="271" end_page="272" type="metho">
    <SectionTitle>
4 FST Representation
</SectionTitle>
    <Paragraph position="0"> The representation in Figure 6 combines the generalized parse with the POS sequence (regular expression) that it is indexed by. The idea is to annotate each of the finite state arcs of the regular expression matcher with the elementary tree associated with that POS and also indicate which elementary tree it would be adjoined or substituted into. This results in a Finite State Transducer (FST) representation, illustrated by the example below. Consider the sentence (4) with the derivation tree in Figure 5.</Paragraph>
    <Paragraph position="1"> (4) show me the flights from Boston to Philadelphia.</Paragraph>
    <Paragraph position="2"> An alternate representation of the derivation tree, similar to the dependency representation, is to associate with each word a tuple (this_tree, head_word, head_tree, number). The description of the tuple components is given in Table 1.</Paragraph>
    <Paragraph position="3"> Following this notation, the derivation tree in Figure 5 (without the addresses of operations) is represented as in (5).</Paragraph>
    <Paragraph position="5"> head_word : the word on which the current word depends; &amp;quot;-&amp;quot; if the current word does not depend on any other word.</Paragraph>
    <Paragraph position="6"> head_tree : the tree anchored by the head word; &amp;quot;-&amp;quot; if the current word does not depend on any other word.</Paragraph>
    <Paragraph position="7"> number : a signed number that indicates the direction and the ordinal position of the particular head elementary tree from the position of the current word OR : an unsigned number that indicates the Gorn-address (i.e., the node address) in the derivation tree to which the word attaches OR : &amp;quot;-&amp;quot; if the current word does not depend on any other word.</Paragraph>
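The tuple representation might be encoded as follows. The tree names and the particular dependencies are invented for illustration and are not taken from the paper's Figure 5; the sign convention (signed number for a complement dependency, unsigned for a modifier dependency, "-" for the root) follows the description in the text.

```python
from collections import namedtuple

# One tuple per word: (this_tree, head_word, head_tree, number).
Dep = namedtuple("Dep", "this_tree head_word head_tree number")

# Hypothetical almost parse for the fragment "the flights from Boston";
# tree names are made up for the sketch.
almost_parse = {
    "flights": Dep("alpha_NXN", "-", "-", "-"),            # root: no head
    "the":     Dep("beta_Dnx", "flights", "alpha_NXN", 1),   # modifier: unsigned
    "from":    Dep("beta_nxPnx", "flights", "alpha_NXN", 1), # modifier: unsigned
    "Boston":  Dep("alpha_NXN", "from", "beta_nxPnx", -1),   # complement: signed
}
print(almost_parse["Boston"])
```

Keeping the head as a word/tree pair rather than a tree address is what lets the same tuple survive modifier-generalization: repeating a modifier only repeats its tuple.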
    <Paragraph position="9"> Generalization of this derivation tree results in the representation in (6).</Paragraph>
    <Paragraph position="11"> After generalization, the trees β1 and β2 are no longer distinct, so we denote them by β. The trees α5 and α6 are also no longer distinct, so we denote them by α. With this change in notation, the two Kleene star regular expressions in (6) can be merged into one, and the resulting representation is (7)</Paragraph>
    <Paragraph position="13"/>
    <Paragraph position="15"> which can be seen as a path in an FST as in Figure 6.</Paragraph>
    <Paragraph position="16"> This FST representation is possible due to the lexicalized nature of the elementary trees. This representation makes a distinction between dependencies between modifiers and complements. The number in the tuple associated with each word is a signed number if a complement dependency is being expressed and is an unsigned number if a modifier dependency is being expressed (see footnote 3).</Paragraph>
  </Section>
  <Section position="6" start_page="272" end_page="272" type="metho">
    <SectionTitle>
5 Stapler
</SectionTitle>
    <Paragraph position="0"> In this section, we introduce a device called &amp;quot;stapler&amp;quot;, a very impoverished parser that takes as input the result of the EBL lookup and returns the parse(s) for the sentence. The output of the EBL lookup is a sequence of elementary trees annotated with dependency links - an almost parse. To construct a complete parse, the &amp;quot;stapler&amp;quot; performs the following tasks: * Identify the nature of link: The dependency links in the almost parse are to be distinguished as either substitution links or adjunction links.</Paragraph>
    <Paragraph position="1"> This task is extremely straightforward since the types (initial or auxiliary) of the elementary trees a dependency link connects identifies the nature of the link.</Paragraph>
    <Paragraph position="2"> * Modifier Attachment: The EBL lookup is not guaranteed to output all possible modifier-head dependencies for a given input, since the modifier-generalization assigns the same modifier-head link as in the training example to all the additional modifiers. So it is the task of the stapler to compute all the alternate attachments for modifiers.</Paragraph>
    <Paragraph position="3"> * Address of Operation: The substitution and adjunction links are to be assigned a node address to indicate the location of the operation.</Paragraph>
    <Paragraph position="4"> The &amp;quot;staPler&amp;quot; assigns this using the structure of 3In a complement auxiliary tree the anchor subcategorizes for the foot node, which is not the case for a modifier auxiliaxy tree.</Paragraph>
    <Paragraph position="5"> the elementary trees that the words anchor and their linear order in the sentence.</Paragraph>
    <Paragraph position="6"> * Feature Instantiation: The values of the features on the nodes of the elementary trees are to be instantiated by a process of unification.</Paragraph>
    <Paragraph position="7"> Since the features in LTAGs are finite-valued and only features within an elementary tree can be co-indexed, the &amp;quot;stapler&amp;quot; performs term unification to instantiate the features.</Paragraph>
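Because the feature values are finite and atomic, the unification the stapler needs is much simpler than general feature-structure unification. The sketch below treats a feature set as a flat dictionary of atomic values; this is our simplification for illustration, since the real features live on tree nodes and may be co-indexed within an elementary tree.

```python
def unify(f1, f2):
    """Minimal term-unification sketch for flat, finite-valued feature
    sets: merge two dictionaries, failing on any value clash."""
    merged = dict(f1)
    for feat, val in f2.items():
        if feat in merged and merged[feat] != val:
            return None  # clash: unification fails
        merged[feat] = val
    return merged

print(unify({"agr": "3sg"}, {"tense": "pres"}))  # {'agr': '3sg', 'tense': 'pres'}
print(unify({"agr": "3sg"}, {"agr": "3pl"}))     # None
```

Finite-valued, atomic features are what make this terminating and cheap: there are no variables to bind across trees and no recursive structures to occur-check.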
  </Section>
  <Section position="7" start_page="272" end_page="274" type="metho">
    <SectionTitle>
6 Experiments and Results
</SectionTitle>
    <Paragraph position="0"> We now present experimental results from two different sets of experiments performed to show the effectiveness of our approach. The first set of experiments (Experiments 1(a) through 1(c)) is intended to measure the coverage of the FST representation of the parses of sentences from a range of corpora (ATIS, IBM-Manual and Alvey). The results of these experiments provide a measure of the repetitiveness of patterns, as described in this paper, at the sentence level in each of these corpora.</Paragraph>
    <Paragraph position="1"> Experiment 1(a): The details of the experiment with the ATIS corpus are as follows. A total of 465 sentences, average length of 10 words per sentence, which had been completely parsed by the XTAG system, were randomly divided into two sets, a training set of 365 sentences and a test set of 100 sentences, using a random number generator. For each of the training sentences, the parses were ranked using heuristics (Srinivas et al., 1994) and the top three derivations were generalized and stored as an FST. The FST was tested for retrieval of a generalized parse for each of the test sentences, which were pretagged with the correct POS sequence (in Experiment 2, we make use of the POS tagger to do the tagging). When a match is found, the output of the EBL component is a generalized parse that associates with each word the elementary tree that it anchors and the elementary tree into which it adjoins or substitutes - an almost parse. Experiments 1(b) and 1(c): Similar experiments were conducted using the IBM-manual corpus and a set of noun definitions from the LDOCE dictionary that were used as the Alvey test set (Carroll, 1993).</Paragraph>
    <Paragraph position="2"> Results of these experiments are summarized in Table 2. For each of the corpora, the coverage of the FST and the traversal time per input are shown in this table. The coverage of the FST is the number of inputs that were assigned a correct generalized parse among the parses retrieved by traversing the FST.</Paragraph>
    <Paragraph position="3"> Since these experiments measure the performance of the EBL component on various corpora we will refer to these results as the 'EBL-Lookup times'.</Paragraph>
    <Paragraph position="4"> The second set of experiments measures the performance improvement obtained by using EBL within the XTAG system on the ATIS corpus. The performance was measured on the same set of 100 sentences that was used as test data in Experiment 1(a). The FST constructed from the generalized parses of the 365 ATIS sentences used in Experiment 1(a) has been used in this experiment as well.</Paragraph>
    <Paragraph position="5"> Experiment 2(a): The performance of XTAG on the 100 sentences is shown in the first row of Table 3. The coverage represents the percentage of sentences that were assigned a parse.</Paragraph>
    <Paragraph position="6"> Experiment 2(b): This experiment is similar to Experiment 1(a). It attempts to measure the coverage and response times for retrieving a generalized parse from the FST. The results are shown in the second row of Table 3. The difference in the response times between this experiment and Experiment 1(a) is due to the fact that we have included here the times for morphological analysis and POS tagging of the test sentence. As before, 80% of the sentences were assigned a generalized parse.</Paragraph>
    <Paragraph position="7"> However, the speedup when compared to the XTAG system is a factor of about 60.</Paragraph>
    <Paragraph position="8"> Experiment 2(c): The setup for this experiment is shown in Figure 7. The almost parse from the EBL lookup is input to the full parser of the XTAG system. The full parser does not take advantage of the dependency information present in the almost parse, however it benefits from the elementary tree assignment to the words in it. This information helps the full parser, by reducing the ambiguity of assigning a correct elementary tree sequence for the words of the sentence. The speed up shown in the third row of Table 3 is entirely due to this ambiguity reduction. If the EBL lookup fails to retrieve a parse, which happens for 20% of the sentences, then the  tree assignment ambiguity is not reduced and the full parser parses with all the trees for the words of the sentence. The drop in coverage is due to the fact that for 10% of the sentences, the generalized parse retrieved could not be instantiated to the features of the sentence.</Paragraph>
    <Paragraph position="9"> Experiment 2(d): The setup for this experiment is shown in Figure 4. In this experiment, the almost parse resulting from the EBL lookup is input to the &amp;quot;stapler&amp;quot;, which generates all possible modifier attachments and performs term unification, thus generating all the derivation trees. The &amp;quot;stapler&amp;quot; uses both the elementary tree assignment information and the dependency information present in the almost parse and speeds up the performance even further, by a factor of about 15, with a further decrease in coverage of 10% for the same reason as in Experiment 2(c). However, the coverage of this system is limited by the coverage of the EBL lookup. The results of this experiment are shown in the fourth row of Table 3.</Paragraph>
  </Section>
  <Section position="8" start_page="274" end_page="274" type="metho">
    <SectionTitle>
7 Relevance to other lexicalized grammars
</SectionTitle>
    <Paragraph position="0"> Some aspects of our approach can be extended to other lexicalized grammars, in particular to categorial grammars (e.g. Combinatory Categorial Grammar (CCG) (Steedman, 1987)). Since in a categorial grammar the category for a lexical item includes its arguments, the process of generalization of the parse can also be immediate in the same sense as in our approach. The generalization over recursive structures in a categorial grammar, however, will require further annotation of the proof trees in order to identify the 'anchor' of a recursive structure. If a lexical item corresponds to a potential recursive structure, then it will be necessary to encode this information by making the result part of the functor be X → X. Further annotation of the proof tree will be required to keep track of dependencies in order to represent the generalized parse as an FST.</Paragraph>
  </Section>
</Paper>