<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-2024">
  <Title>An Indexing Scheme for Typed Feature Structures</Title>
  <Section position="3" start_page="0" end_page="5" type="intro">
    <SectionTitle>
2 Algorithm
</SectionTitle>
    <Paragraph position="0"> Briefly, the algorithm for the ISTFS proceeds according to the following steps.</Paragraph>
    <Paragraph position="1"> 1. When a set of data TFSs is given, the ISTFS prepares a path value table and a unifiability checking table in advance.</Paragraph>
    <Paragraph position="2"> 2. When a query TFS is given, the ISTFS retrieves TFSs which are unifiable with the query from the set of data TFSs by performing the following steps.</Paragraph>
    <Paragraph position="3"> (a) The ISTFS finds the index paths by using the unifiability checking table. The index paths are the most restrictive paths in the query, in the sense that they narrow the set of data TFSs to the smallest candidate set.</Paragraph>
    <Paragraph position="4"> (b) The ISTFS filters out TFSs that are non-unifiable by referring to the values of the index paths in the path value table.</Paragraph>
    <Paragraph position="5"> (c) The ISTFS finds the exactly unifiable TFSs by unifying the query with each TFS that survives filtering, one by one.</Paragraph>
    <Paragraph position="6"> This algorithm can also find the TFSs that stand in a subsumption relation to the query, i.e., those that are more specific or more general, by preparing subsumption checking tables in the same way as the unifiability checking table.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Preparing Path Value Table and
Unifiability Checking Table
</SectionTitle>
      <Paragraph position="0"> Let $D = \{F_1, F_2, \ldots, F_n\}$ be the set of data TFSs.</Paragraph>
      <Paragraph position="1"> When $D$ is given, the ISTFS prepares two tables, a path value table $D_{\pi,\sigma}$ and a unifiability checking table $U_{\pi,\sigma}$, for all $\pi \in Path_D$ and $\sigma \in Type$. (Footnote 2: A TFS might have a cycle in its graph structure, in which case its set of paths is infinite. Fortunately, our algorithm works correctly even if the set of paths is only a subset of all existing paths, so paths that would make the set infinite can be removed from the path set.) We define the path value table and the unifiability checking table as follows:</Paragraph>
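      <Paragraph position="2"> One reading of these definitions that is consistent with the Filtering description later in this section is sketched here, with $\pi$ a path and $\sigma$, $\tau$ types. The notation $\mathrm{type}(\pi, F)$, for the type of the node reached by following $\pi$ from the root of $F$, is an assumption introduced for this sketch.
```latex
\[
D_{\pi,\sigma} = \{\, F \in D \mid \mathrm{type}(\pi, F) = \sigma \,\}
\]
\[
U_{\pi,\sigma} = \sum_{\tau \in Type,\ \tau \text{ unifiable with } \sigma} |D_{\pi,\tau}|
\]
```
      </Paragraph>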
      <Paragraph position="3"> Assuming that $\sigma$ is the type of the node reached by following $\pi$ in a query TFS, we can limit $D$ to a smaller set by filtering out 'non-unifiable' TFSs. We have the smaller set:</Paragraph>
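      <Paragraph position="4"> Following the Filtering description, in which this set is collected as the union of the path value table entries whose type is unifiable with the query type, the smaller set can be written as below. This is a sketch inferred from the surrounding text. For a fixed path each TFS has a single type at that path, so the union is disjoint and its size is the sum of the sizes of the entries.
```latex
\[
U'_{\pi,\sigma} = \bigcup_{\tau \in Type,\ \tau \text{ unifiable with } \sigma} D_{\pi,\tau}
\]
```
      </Paragraph>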
      <Paragraph position="5"> $U_{\pi,\sigma}$ corresponds to the size of $U'_{\pi,\sigma}$. Note that the ISTFS does not prepare a table of $U'_{\pi,\sigma}$ statically; it prepares only a table of $U_{\pi,\sigma}$, whose elements are integers. This is because the system's memory would easily be exhausted if we actually built a table of $U'_{\pi,\sigma}$. Instead, the ISTFS finds the best paths by referring to $U_{\pi,\sigma}$ and calculates $U'_{\pi,\sigma}$ only where $\pi$ is a best index path.</Paragraph>
      <Paragraph position="6"> Suppose the type hierarchy and $D$ depicted in Figure 1 are given. The tables in Figure 2 show $D_{\pi,\sigma}$ and $U_{\pi,\sigma}$ calculated from Figure 1.</Paragraph>
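      <Paragraph position="7"> The table preparation can be sketched in Python under simplifying assumptions. The type hierarchy, the data TFSs and all function names below are hypothetical and are not taken from Figure 1. Each TFS is flattened to a mapping from paths to types, and the hierarchy is assumed to be a tree, so that two types are unifiable exactly when one subsumes the other.</Paragraph>
      <Paragraph position="8"><![CDATA[
```python
from collections import defaultdict

# Hypothetical tree-shaped type hierarchy, child -> parent ("top" is the root).
PARENT = {"list": "top", "cons": "list", "nil": "list",
          "int": "top", "one": "int", "two": "int"}
TYPES = ["top", *PARENT]


def ancestors(t):
    """Return t followed by all of its supertypes up to the root."""
    chain = [t]
    while t in PARENT:
        t = PARENT[t]
        chain.append(t)
    return chain


def unifiable(s, t):
    """Assumption: in a tree hierarchy, types unify iff one subsumes the other."""
    return s in ancestors(t) or t in ancestors(s)


# Toy data TFSs flattened to {path: type}; real TFSs are graphs with sharing.
DATA = {
    "F1": {"CAR": "one", "CDR": "nil"},
    "F2": {"CAR": "two", "CDR": "nil"},
    "F3": {"CAR": "int", "CDR": "list"},
    "F4": {"CAR": "two", "CDR": "cons"},
    "F5": {"CAR": "one", "CDR": "nil"},
}


def build_tables(data):
    """Build the path value table and the unifiability checking table."""
    # Path value table: (path, type) -> names of TFSs with that type at that path.
    path_value = defaultdict(set)
    for name, tfs in data.items():
        for path, typ in tfs.items():
            path_value[(path, typ)].add(name)
    path_value = dict(path_value)

    # Unifiability checking table: (path, type) -> number of TFSs whose type
    # at that path is unifiable with the given type. Only integers are stored.
    paths = {path for tfs in data.values() for path in tfs}
    unif_count = {
        (path, s): sum(len(path_value.get((path, t), ()))
                       for t in TYPES if unifiable(s, t))
        for path in paths for s in TYPES
    }
    return path_value, unif_count


path_value, unif_count = build_tables(DATA)
print(sorted(path_value[("CAR", "two")]))                      # ['F2', 'F4']
print(unif_count[("CAR", "two")], unif_count[("CDR", "nil")])  # 3 4
```
]]></Paragraph>
      <Paragraph position="9"> In this sketch only the integer table grows with the number of types per path, which mirrors the memory argument above: the sets of unifiable TFSs are never stored for every path and type pair.</Paragraph>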
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Retrieval
</SectionTitle>
      <Paragraph position="0"> In what follows, we suppose that $D$ was given, and we have already calculated $D_{\pi,\sigma}$ and $U_{\pi,\sigma}$.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Finding Index Paths
</SectionTitle>
      <Paragraph position="0"> The best index path is the most restrictive path in the query, in the sense that $D$ can be limited to the smallest set by referring to the type of the node reached by following the index path in the query.</Paragraph>
      <Paragraph position="1"> Suppose a query TFS $X$ and a constant $k$, the maximum number of index paths, are given. The best index path in $Path_X$ is the path $\pi$ such that $U_{\pi,\sigma}$ is minimum, where $\sigma$ is the type of the node reached by following $\pi$ from the root node of $X$. We can also find the second best index path by finding the path $\pi$ such that $U_{\pi,\sigma}$ is the second smallest. In the same way, we can find the $i$-th best index path for each $i \le k$.</Paragraph>
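      <Paragraph position="2"> Selecting the k best index paths amounts to taking the k smallest table entries among the path and type pairs that occur in the query. A minimal Python sketch follows. The table excerpt, the flattened query and the function name are hypothetical, and skipping query paths that are absent from the table is an assumption of this sketch.</Paragraph>
      <Paragraph position="3"><![CDATA[
```python
import heapq

# Hypothetical excerpt of a unifiability checking table: (path, type) -> count.
U = {("CAR", "two"): 3, ("CDR", "nil"): 4, ("", "cons"): 5}


def best_index_paths(query, table, k):
    """Return up to k query paths in ascending order of their table count."""
    scored = [(table[(path, typ)], path)
              for path, typ in query.items()
              if (path, typ) in table]  # assumption: unknown pairs are skipped
    return [path for _, path in heapq.nsmallest(k, scored)]


# Query flattened to {path: type}; "" stands for the root node.
query = {"": "cons", "CAR": "two", "CDR": "nil"}
print(best_index_paths(query, U, 2))  # ['CAR', 'CDR']
```
]]></Paragraph>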
    </Section>
    <Section position="4" start_page="0" end_page="5" type="sub_section">
      <SectionTitle>
Filtering
</SectionTitle>
      <Paragraph position="0"> Suppose the $k$ best index paths have already been calculated. Given an index path $\pi$, let $\sigma$ be the type of the node reached by following $\pi$ in the query. An element of $D$ that is unifiable with the query must have a node that can be reached by following $\pi$ and whose type is unifiable with $\sigma$. Such TFSs ($= U'_{\pi,\sigma}$) can be collected by taking the union of $D_{\pi,\tau}$ over all $\tau$ unifiable with $\sigma$. $U'_{\pi,\sigma}$ can be calculated for each index path, and $D$ can be narrowed to a smaller set by taking the intersection of these sets. After filtering, the ISTFS can find the exactly unifiable TFSs by unifying the query with the remaining TFSs one by one.</Paragraph>
      <Paragraph position="1"> Suppose the type hierarchy and $D$ in Figure 1 are given, and the following query $X$ is given: In Figure 2, each $U_{\pi,\sigma}$ whose pair of $\pi$ and $\sigma$ exists in the query is indicated with an asterisk. The best index paths are determined in ascending order of the $U_{\pi,\sigma}$ values marked with an asterisk in the figure. In this example, the best index path is CDR:CAR: and its corresponding type in the query is 6. Therefore the unifiable TFS can be found by referring to $D_{\mathrm{CDR:CAR:},6}$, and this is $\{F_3\}$.</Paragraph>
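      <Paragraph position="2"> The filtering step and the final check can be sketched in Python as follows. All data, the tree-shaped type hierarchy and the function names are hypothetical and do not reproduce Figure 1. The final check is only a stand-in for real unification: it compares types path by path on flattened TFSs and ignores structure sharing.</Paragraph>
      <Paragraph position="3"><![CDATA[
```python
# Hypothetical tree-shaped type hierarchy, child -> parent ("top" is the root).
PARENT = {"list": "top", "cons": "list", "nil": "list",
          "int": "top", "one": "int", "two": "int"}


def supertypes(t):
    """Return the set containing t and all of its supertypes."""
    chain = {t}
    while t in PARENT:
        t = PARENT[t]
        chain.add(t)
    return chain


def unifiable(s, t):
    """Assumption: in a tree hierarchy, types unify iff one subsumes the other."""
    return s in supertypes(t) or t in supertypes(s)


# Toy data TFSs flattened to {path: type}.
DATA = {
    "F1": {"CAR": "one", "CDR": "nil"},
    "F2": {"CAR": "two", "CDR": "nil"},
    "F3": {"CAR": "int", "CDR": "list"},
    "F4": {"CAR": "two", "CDR": "cons"},
    "F5": {"CAR": "one", "CDR": "nil"},
}


def path_value_table(data):
    """(path, type) -> names of TFSs having that type at that path."""
    table = {}
    for name, tfs in data.items():
        for path, typ in tfs.items():
            table.setdefault((path, typ), set()).add(name)
    return table


def filter_candidates(query, index_paths, table, all_names):
    """Intersect, over the index paths, the unions of unifiable table entries."""
    candidates = set(all_names)
    for path in index_paths:
        s = query[path]
        allowed = set()
        for (p, t), names in table.items():
            if p == path and unifiable(s, t):
                allowed |= names  # union of the entries unifiable with s
        candidates &= allowed
    return candidates


def flat_unify_ok(query, tfs):
    """Stand-in for unification: each query path exists with a unifiable type."""
    return all(p in tfs and unifiable(s, tfs[p]) for p, s in query.items())


QUERY = {"CAR": "two", "CDR": "nil"}
TABLE = path_value_table(DATA)
candidates = filter_candidates(QUERY, ["CAR"], TABLE, DATA)
result = {n for n in candidates if flat_unify_ok(QUERY, DATA[n])}
print(sorted(candidates))  # ['F2', 'F3', 'F4']
print(sorted(result))      # ['F2', 'F3']
```
]]></Paragraph>
      <Paragraph position="4"> With a single index path the filter leaves one false candidate (F4), which the final check rejects. Passing both paths as index paths removes it during filtering, at the cost of computing one more union.</Paragraph>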
    </Section>
  </Section>
</Paper>