<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1100">
  <Title>Lenient Default Unification for Robust Processing within Unification Based Grammar Formalisms</Title>
  <Section position="4" start_page="0" end_page="77" type="metho">
    <SectionTitle>
3 Ideal Lenient Default Unification
</SectionTitle>
    <Paragraph position="0"> In this section, we define our first variant of default unification, ideal lenient default unification. Ideal lenient default unification maximizes the amount of information in the result while subsuming the result of forced unification. In other words, it generates a result as similar as possible to the result of forced unification, subject to the constraint that the result is defined in the type hierarchy without the top type. Formally, we have:</Paragraph>
    <Paragraph position="2"> where ⊑f is the subsumption relation over the type hierarchy in which the top type is defined.</Paragraph>
    <Paragraph position="3"> From the definition of skeptical default unification, ideal lenient default unification is equivalent to F ⊔s (F ⊔f G), assuming that skeptical default unification does not add default information containing the top type to the strict information.</Paragraph>
    <Paragraph position="4"> Consider the following feature structures.</Paragraph>
    <Paragraph position="6"> In the case of Carpenter's default unification, the results of skeptical and credulous default unification become: F ⊔s G = F and F ⊔c G = {F}. This is because G is generalized to the bottom feature structure, and hence the result is equivalent to the strict feature structure.</Paragraph>
    <Paragraph position="7"> With ideal lenient default unification, the result becomes as follows.</Paragraph>
    <Paragraph position="8">  Note that the result of ideal lenient default unification subsumes the result of forced unification. As the example shows, ideal lenient default unification keeps as much of the structure-sharing information as possible: it succeeds in preserving the structure-sharings tagged as 1 and 2, which skeptical and credulous default unification fail to capture.</Paragraph>
  </Section>
  <Section position="5" start_page="77" end_page="77" type="metho">
    <SectionTitle>
4 Lenient Default Unification
</SectionTitle>
    <Paragraph position="0"> The optimal answer for ideal lenient default unification can be found by calculating F ⊔s (F ⊔f G). As Copestake (1993) mentioned, the time complexity of skeptical default unification is exponential, and therefore the time complexity of ideal lenient default unification is also exponential.</Paragraph>
    <Paragraph position="1"> Following other researchers who pursued efficient default unification (Bouma, 1990; Russell et al., 1991; Copestake, 1993), we propose another definition of default unification, which we call lenient default unification, together with an algorithm that computes its result efficiently.</Paragraph>
    <Paragraph position="2"> Given a strict feature structure F and a default feature structure G, let H be the result of forced unification, i.e., H = F ⊔f G. We define topnode(H) as a function that returns the fail points (the nodes that are assigned the top type in H), fpnode(H) as a function that returns the fail path nodes (the nodes from which a fail point can be reached), and fpchild(H) as a function that returns all the nodes that are not fail path nodes but are immediate children of fail path nodes.</Paragraph>
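The three functions above can be sketched in Python. This is a minimal illustration under a toy graph encoding of our own (an assumption, not the paper's data structure): "arcs" maps a node to a {feature: child} dict and "types" maps a node to its type name.

```python
# Toy encoding (an assumption): arcs maps node -> {feature: child},
# types maps node -> type name.
TOP = "top"  # marker for the inconsistent type produced by forced unification

def topnodes(types):
    """Fail points: nodes assigned the top type."""
    return {q for q, t in types.items() if t == TOP}

def fpnodes(arcs, types):
    """Fail path nodes: nodes from which a fail point can be reached
    (the fail points themselves are returned by topnodes)."""
    parents = {}
    for q, feats in arcs.items():
        for child in feats.values():
            parents.setdefault(child, set()).add(q)
    frontier, result = list(topnodes(types)), set()
    while frontier:                      # walk upward from every fail point
        q = frontier.pop()
        for p in parents.get(q, set()):
            if p not in result:
                result.add(p)
                frontier.append(p)
    return result

def fpchildren(arcs, types):
    """Nodes that are not fail path nodes but are immediate children
    of fail path nodes."""
    fp = fpnodes(arcs, types) | topnodes(types)
    return {c for q in fp for c in arcs.get(q, {}).values() if c not in fp}
```

For instance, if H has root r whose F arc leads through a node a to a fail point f, while its G arc leads to a clean node c, then topnodes gives {f}, fpnodes gives {r, a}, and fpchildren gives {c}. Each function does constant work per node and arc, matching the linear-time claim made later for the full algorithm.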
    <Paragraph position="3"> Consider the following feature structures.</Paragraph>
    <Paragraph position="5"> Figure 1 shows F, G and H in the graph notation.</Paragraph>
    <Paragraph position="6">This figure also shows the nodes that correspond to topnode(H), fpnode(H) and fpchild(H).</Paragraph>
    <Paragraph position="7">  F ⊔ H fails because some of the path values in H conflict with F, or some of the path equivalences in H cause inconsistencies. The basic ideas are that i) inconsistencies caused by path value specifications can be removed by generalizing the types assigned to the fail points in H, and ii) inconsistencies caused by path equivalence specifications can be removed by unfolding the structure-sharing of fail path nodes in H.</Paragraph>
    <Paragraph position="8"> Let H be ⟨QH, q̄H, θH, δH⟩, where Q is the set of a feature structure's nodes, q̄ is the root node, θ(q) is a total node typing function, and δ(π, q) is a partial function that returns the node reached by following path π from q. We first give several definitions needed to define a generalized feature structure.</Paragraph>
    <Paragraph position="10"> The auxiliary definitions construct, for each path π, the least feature structure whose path value at π is s. Let I (= ⟨QI, q̄I, θI, δI⟩) be I′′(H). The definition of the generalized feature structure then creates new states q′ ∈ QH and assigns each of them a type s that is appropriate for all the arcs that reach qH.</Paragraph>
    <Paragraph position="16"> Finally, lenient default unification F ⊔l G is defined as follows.  Both ideal and non-ideal lenient default unification satisfy the following desiderata: 1) it is always defined and produces a unique result; 2) all strict information is preserved, that is, F ⊑ (F ⊔l G) ⊑f (F ⊔f G); 3) F ⊔l G is defined without the top type; 4) it reduces to standard unification when F and G are consistent (unifiable), that is, F ⊔l G = F ⊔ G whenever F ⊔ G is defined.</Paragraph>
    <Paragraph position="17"> Algorithm Our algorithm for lenient default unification proceeds in the following steps: 1) calculate the forced unification of F and G (let H be F ⊔f G); 2) find the fail points and fail path nodes in H; 3) generalize H so that F ⊔ H succeeds.</Paragraph>
    <Paragraph position="18"> Figure 2 describes the algorithm that generalizes the result of forced unification.1 The time complexity of the algorithm for finding F ⊔l G is linear in the size of the feature structures F and G, because the time complexity of each subalgorithm (finding fail points, finding fail path nodes, and generalization) is linear in their size.</Paragraph>
    <Paragraph position="19"> 1 In this paper, we assume acyclic feature structures; our algorithm does not terminate if a cyclic part of a feature structure must be generalized. Acyclicity is easily enforced by requiring that no path has a proper extension that leads to the same node. We also assume that the outputs of default unification are not necessarily totally well-typed, since the appropriateness conditions of types can propagate constraints from a node to its subnodes, which would complicate the definitions of default unification. Total well-typedness can instead be enforced by the total type inference function.</Paragraph>
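The generalization step (step 3) can be sketched as follows. This is a simplified illustration under a toy encoding of our own ("arcs" maps node to {feature: child}, "types" maps node to type); it assigns the bottom type to fail points (cutting their outgoing arcs) and unfolds structure-sharing on fail path nodes by copying, which is the behavior the text describes, not the paper's exact Figure 2.

```python
# Sketch of step 3: generalize H so that F and the result unify.
# Encoding (our own assumption): arcs maps node -> {feature: child},
# types maps node -> type name; fail = fail points, fpath = fail path nodes.
import itertools

BOTTOM = "bot"

def generalize(arcs, types, root, fail, fpath):
    """Return (new_root, new_arcs, new_types) for the generalized graph."""
    fresh = itertools.count()
    new_arcs, new_types, keep = {}, {}, {}

    def copy(q):
        if q in fail:                     # fail point: bottom type, arcs cut
            q2 = "g%d" % next(fresh)
            new_types[q2] = BOTTOM
            return q2
        if q in fpath:                    # fresh copy per incoming path:
            q2 = "g%d" % next(fresh)      # this unfolds structure-sharing
        elif q in keep:                   # below the fail paths, sharing is kept
            return keep[q]
        else:
            q2 = "g%d" % next(fresh)
            keep[q] = q2
        new_types[q2] = types[q]
        new_arcs[q2] = {f: copy(c) for f, c in arcs.get(q, {}).items()}
        return q2

    return copy(root), new_arcs, new_types
```

Under the acyclicity assumption of footnote 1 the traversal terminates, and each node is copied at most once per incoming fail path arc, in line with the linear-time claim.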
    <Paragraph position="20"> Comparison The difference between ideal lenient default unification and lenient default unification is illustrated by the following example.</Paragraph>
    <Paragraph position="21">  In the example above, the results of ideal lenient default unification and skeptical default unification are the same. In the case of lenient default unification, all the information embedded in the default is removed, because every structure-sharing tagged as 1 lies on a path that leads to a fail point in the result of forced unification. Lenient default unification is thus far more aggressive than the ideal version in removing information from the default: it may remove structure-sharings that are irrelevant to the unification failure. Another defect of lenient default unification is that the bottom type is assigned to the nodes that correspond to the fail points in the result of forced unification. The type assigned to these nodes should be more specific than the bottom type; since the bottom type has no features, all the arcs that go out from a fail point are cut.</Paragraph>
    <Paragraph position="22"> Though lenient default unification has these defects, it has the advantage of efficiency. As we intend to use default unification for practical robust processing, efficiency is of great importance. Furthermore, the result of lenient default unification can be more informative than that of skeptical default unification in many practical applications. For example, suppose that the grammar rule R and the daughters DTR are as follows.  Suppose also that the type head has PHON:, CASE:, INV: and TENSE: as its features, and that the type sign has HEAD: and VAL:. The result of skeptical default unification DTR ⊔s R becomes DTR, because all structure-sharings embedded in R are relevant to the unification failure. However, the result of lenient default unification is more informative.</Paragraph>
    <Paragraph position="23">  The information of structure-sharing is preserved as much as possible. In the example above, the structure-sharing tagged as 2 in the original grammar rule R is decomposed into the structure-sharings 3, 4, 5 and 6. That is, the structure-sharing tagged as 2 is preserved everywhere except at HEAD:CASE:.</Paragraph>
  </Section>
  <Section position="6" start_page="77" end_page="77" type="metho">
    <SectionTitle>
5 Offline Robust Parsing and Grammar
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="77" end_page="77" type="sub_section">
      <SectionTitle>
Extraction
</SectionTitle>
      <Paragraph position="0"> This section describes a new approach to robust parsing using default unification. Given an HPSG grammar, our approach takes two steps: i) extraction of grammar rules from the results of offline robust parsing, in which the HPSG grammar rules are applied by default unification, and ii) runtime parsing with the HPSG grammar augmented with the extracted rules. Offline parsing is a training phase that extracts grammar rules; runtime parsing applies the extracted rules in practice. The extracted rules work robustly on corpora other than the training corpus because they reflect the effects of the default unifications applied during offline parsing. Given an annotated corpus, our algorithm extracts grammar rules that widen the coverage of the HPSG grammar.</Paragraph>
      <Paragraph position="1"> In offline parsing, constituents are generated by default unification of daughters and grammar rules of the HPSG grammar, where a head daughter and a grammar rule are strict feature structures and a non-head daughter is a default feature structure. (In HPSG, both constituents and grammar rules are represented by feature structures.) With this construction, the information in a grammar rule and a head daughter is strictly preserved, while the information in a non-head daughter is partially lost, but as little as possible.</Paragraph>
      <Paragraph position="2"> The ideas behind this construction are that (i) it is better to construct a mother node without the information of the non-head daughter than to construct nothing (i.e., better to construct a mother node by unifying only a head daughter and a grammar rule), and (ii) it is better to construct a mother node with the maximal information of the non-head daughter than to add no information from the non-head daughter at all. Parse trees can thus be derived even when no parse tree can be derived by normal unification.</Paragraph>
      <Paragraph position="3"> Offline robust parsing is based on the A* algorithm, but we generate only parse trees that meet the following conditions: 1) a generated parse tree must be consistent with an existing bracketed corpus, and 2) the parsing cost of a generated parse tree must be minimal. This means that i) we can limit the search space, and ii) the parsing result is valid in the sense that it is consistent with the existing corpus. The cost of a parse tree is calculated by summing the costs of the lenient default unifications it involves, where the cost of a lenient default unification is the amount of information it loses. We measure this as the difference between the number of path values and structure-sharings in the result of lenient default unification and in the result of forced unification.</Paragraph>
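The cost measure can be sketched as follows. This is our own reading of the counting (an assumption): information is taken to be the number of nodes (path values) plus the number of structure-sharings (arcs into an already-visited node), again under a toy arcs-dict encoding.

```python
# Rough sketch of the information measure (our own counting assumption):
# arcs maps node -> {feature: child}.
def info_count(arcs, root):
    """Number of nodes (path values) plus structure-sharings."""
    seen, sharings, stack = {root}, 0, [root]
    while stack:
        q = stack.pop()
        for c in arcs.get(q, {}).values():
            if c in seen:
                sharings += 1          # an arc into a shared (re-entrant) node
            else:
                seen.add(c)
                stack.append(c)
    return len(seen) + sharings

def cost(forced_arcs, forced_root, lenient_arcs, lenient_root):
    """Amount of information lost by lenient default unification,
    relative to the result of forced unification."""
    return info_count(forced_arcs, forced_root) - info_count(lenient_arcs, lenient_root)
```

For example, a graph whose two features share one value counts 2 nodes plus 1 sharing; dropping one feature drops the count by 1, giving a cost of 1.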
      <Paragraph position="4"> Grammar extraction itself is very simple. When we find a mother M in the result of offline parsing that cannot be derived by unification but can be derived by default unification, we regard M → L, R as a new rule, where L and R are the daughters of the mother. Rules extracted in this way can reconstruct the mothers just as default unification does, and they reflect the conditions that triggered default unification; i.e., the extracted rules are triggered infrequently, because each applies only to feature structures that are exactly equivalent to its daughter part. By collecting a number of such rules,3 the grammar becomes wide-coverage, with some overgeneration. The rules can be regarded as exceptions in the grammar, which are difficult to capture only by propagating information from daughters to a mother.</Paragraph>
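The hash-based lookup of extracted rules mentioned in footnote 3 can be sketched as follows. All names and the encoding here are hypothetical: extracted rules M → L, R are indexed by a canonical serialization of the daughter pair, so matching at runtime is a single dictionary probe.

```python
# Hypothetical sketch of hash-based rule lookup (cf. footnote 3):
# arcs maps node -> {feature: child}, types maps node -> type name.
def canonical_key(arcs, types, root):
    """Deterministically serialize a feature-structure graph,
    recording structure-sharing via back-references."""
    names = {}
    def walk(q):
        if q in names:
            return "#%d" % names[q]        # back-reference marks sharing
        names[q] = len(names)
        feats = sorted(arcs.get(q, {}).items())
        inner = ",".join("%s:%s" % (f, walk(c)) for f, c in feats)
        return "%s[%s]" % (types[q], inner)
    return walk(root)

rule_table = {}

def add_rule(left, right, mother):
    """left/right are (arcs, types, root) triples for the two daughters;
    mother can be any representation (here just a label for illustration)."""
    rule_table[(canonical_key(*left), canonical_key(*right))] = mother

def lookup(left, right):
    """Return the stored mother, or None if no extracted rule matches."""
    return rule_table.get((canonical_key(*left), canonical_key(*right)))
```

Because each rule applies only to daughters exactly equivalent to the ones seen offline, exact-key hashing suffices and keeps lookup fast even when the extracted grammar is very large.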
      <Paragraph position="5"> This approach can be regarded as a kind of explanation-based learning (Samuelsson and Rayner, 1991). Explanation-based learning methods have recently attracted researchers' attention (Xia, 1999; Chiang, 2000) because the resulting parsers are comparable to state-of-the-art parsers in terms of precision and recall. In the context of unification-based grammars, Neumann (1994) developed a parser running with an HPSG grammar learned by explanation-based learning. It should also be noted that Kiyono and Tsujii (1993) exemplified the grammar extraction approach using offline parsing in the context of explanation-based learning.3</Paragraph>
      <Paragraph position="6"> 3 Although the size of the grammar becomes very large, the extracted rules can be found very efficiently by hashing. This tractability makes the approach usable in practical applications.</Paragraph>
      <Paragraph position="7"> Finally, we need to remove some values in the extracted rules because they contain overly specific information. For instance, a value of PHONOLOGY: represents the list of phoneme strings of a phrasal structure. Without removing such values, an extracted rule cannot be triggered unless exactly the same strings appear in a text.4</Paragraph>
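This pruning step can be sketched in a few lines. The encoding and the feature names to drop are our own assumptions for illustration.

```python
# Minimal sketch (our own encoding and feature names): drop arcs whose
# features carry values too specific for an extracted rule to generalize,
# e.g. phoneme-string lists under PHON/PHONOLOGY.
def strip_features(arcs, too_specific=("PHON", "PHONOLOGY")):
    """Return a copy of the graph without the overly specific arcs."""
    return {q: {f: c for f, c in feats.items() if f not in too_specific}
            for q, feats in arcs.items()}
```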
    </Section>
  </Section>
</Paper>