<?xml version="1.0" standalone="yes"?>
<Paper uid="A97-1015">
  <Title>The Domain Dependence of Parsing</Title>
  <Section position="5" start_page="96" end_page="96" type="metho">
    <SectionTitle>
3 Domain Dependence of Structures
</SectionTitle>
    <Paragraph position="0"> First, we investigate the syntactic structure of each domain of the Brown corpus and compare the structures across domains. To represent the syntactic structure of each domain, we use the distribution of partial trees. A partial tree is a part of a syntactic tree with a depth of one, and it corresponds to a production rule. Note that this partial tree definition is not the same as the structure definition used in the parsing experiments described later.</Paragraph>
    <Paragraph position="1"> We accumulate these partial trees for each domain and compute their distribution: the frequency of each partial tree divided by the total number of partial trees in the domain. For example, Figure 1 shows the five most frequent partial trees (in the form of production rules) in domain A (Press: Reportage) and domain P (Romance and love story).</Paragraph>
    <Paragraph position="2"> For each domain, we compute the probabilities of partial trees like this. Then, for each pair of domains, cross entropy is computed using the probability data.</Paragraph>
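    <Paragraph>The computation described above can be sketched as follows. This is a minimal illustration, not the original implementation: the toy rule lists, the function names and the floor probability for rules unseen in the model domain are all assumptions, since the text does not say how unseen rules are handled.

```python
import math
from collections import Counter

def rule_distribution(rules):
    """Relative frequency of each production rule in one domain."""
    counts = Counter(rules)
    total = sum(counts.values())
    return {rule: n / total for rule, n in counts.items()}

def cross_entropy(test_dist, model_dist, floor=1e-6):
    """Cross entropy in bits of test_dist under model_dist.

    floor is a hypothetical probability assigned to rules that the
    model domain never saw; it is an assumption of this sketch.
    """
    return -sum(p * math.log2(model_dist.get(rule, floor))
                for rule, p in test_dist.items())

# Toy rule lists standing in for the partial trees of two domains.
domain_a = ["S -> NP VP", "NP -> DT NN", "NP -> DT NN", "PP -> IN NP"]
domain_p = ["S -> NP VP", "S -> NP VP", "NP -> PRP", "PP -> IN NP"]

dist_a = rule_distribution(domain_a)
dist_p = rule_distribution(domain_p)

# Testing on A with A itself as the model gives the entropy of A;
# modeling by P and testing on A can only be equal or higher.
self_entropy = cross_entropy(dist_a, dist_a)
cross = cross_entropy(dist_a, dist_p)
```

In this sketch the first argument plays the role of the test domain and the second the role of the model domain, matching the reading of the matrix described for Figure 2.</Paragraph>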
    <Paragraph position="3"> Figure 2 shows a part of the cross entropy data. For example, 5.41 in column A, row E shows the cross entropy of modeling by domain E and testing on domain A. From the matrix, we can tell that some pairs of domains have lower cross entropy than others. This means that the degree of similarity differs among pairs of domains. In particular, the differences among fiction domains are relatively small.</Paragraph>
    <Paragraph position="4"> In order to make the observation easier, we clustered the domains based on the cross entropy data.</Paragraph>
    <Paragraph position="5"> The distance between two domains is calculated as the average of the two cross entropies in both directions. We use non-overlapping, average-distance clustering. Figure 3 shows the clustering result based on the grammar cross entropy data. From the results, we can clearly see that the fiction domains, in particular domains K, L and N, are close, which is intuitively understandable.</Paragraph>
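    <Paragraph>A minimal sketch of this clustering step is given below, assuming a plain agglomerative procedure with average linkage. The cross entropy values are invented for illustration and are not taken from Figure 2; the function names and the (model, test) key layout are likewise assumptions.

```python
from itertools import combinations

def symmetric_distance(ce, a, b):
    """Average of the two directed cross entropies between domains a and b."""
    return (ce[(a, b)] + ce[(b, a)]) / 2

def average_linkage(domains, ce):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(d,) for d in domains]
    merges = []
    while len(clusters) > 1:
        def cluster_distance(pair):
            # Mean pairwise distance between members of the two clusters.
            left, right = pair
            total = sum(symmetric_distance(ce, a, b)
                        for a in left for b in right)
            return total / (len(left) * len(right))

        left, right = min(combinations(clusters, 2), key=cluster_distance)
        merges.append((left, right, cluster_distance((left, right))))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
    return merges

# Hypothetical cross entropies keyed by (model, test); not the paper's data.
ce = {
    ("K", "L"): 5.0, ("L", "K"): 5.2,
    ("K", "A"): 5.8, ("A", "K"): 6.0,
    ("L", "A"): 5.9, ("A", "L"): 6.1,
}
merges = average_linkage(["A", "K", "L"], ce)
```

With these toy values the two fiction-like domains K and L are merged first, and A joins the resulting cluster last.</Paragraph>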
  </Section>
  <Section position="6" start_page="96" end_page="97" type="metho">
    <SectionTitle>
4 Domain Specific Structures
</SectionTitle>
    <Paragraph position="0"> Secondly, in contrast to the global analysis reported in the previous section, we investigate the structural idiosyncrasies of each domain in the Brown corpus.</Paragraph>
    <Paragraph position="1"> For each domain, we create a list of the partial trees that are relatively frequent in that domain. We select the partial trees which satisfy both of the following conditions:
1. The frequency of the partial tree in the domain should be 5 times greater than its frequency in the entire corpus.
2. The partial tree should occur more than 5 times in the domain.
The second condition is used to remove noise, because low frequency partial trees which satisfy the first condition have very low frequency in the entire corpus. The list is too large to show in this paper; a part of the list is shown in Appendix B. It clearly demonstrates that each domain has many idiosyncratic structures. Many of them are interesting to see and can be easily explained by our linguistic intuition. (Some examples are listed under the corresponding partial tree.) This supports the idea of domain dependent grammar, because these idiosyncratic structures are useful only in that domain.</Paragraph>
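    <Paragraph>The two selection conditions can be sketched as a simple filter. This sketch assumes that "frequency" in the first condition means relative frequency (count divided by the total number of partial trees), which the text does not state explicitly; the counts and rule names below are invented for illustration.

```python
from collections import Counter

def domain_specific_rules(domain_counts, corpus_counts, ratio=5.0, min_count=5):
    """Return the rules that are over-represented in one domain.

    Assumes relative frequencies are compared; corpus_counts must
    include the counts of the domain itself.
    """
    domain_total = sum(domain_counts.values())
    corpus_total = sum(corpus_counts.values())
    selected = []
    for rule, count in domain_counts.items():
        domain_freq = count / domain_total
        corpus_freq = corpus_counts[rule] / corpus_total
        # Condition 1: more than `ratio` times as frequent as in the corpus.
        # Condition 2: more than `min_count` occurrences in the domain.
        if domain_freq > ratio * corpus_freq and count > min_count:
            selected.append(rule)
    return sorted(selected)

# Hypothetical counts for one domain and for the whole corpus.
domain_counts = Counter({"NP -> CD NNS": 8, "S -> NP VP": 90, "NP -> NNP NNP": 2})
corpus_counts = Counter({"NP -> CD NNS": 10, "S -> NP VP": 950,
                         "NP -> NNP NNP": 2, "NP -> DT NN": 38})
selected = domain_specific_rules(domain_counts, corpus_counts)
```

Here the rule with only 2 occurrences passes the ratio test but is dropped by the second condition, which is the noise-removal effect described above.</Paragraph>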
  </Section>
  <Section position="7" start_page="97" end_page="99" type="metho">
    <SectionTitle>
5 Parsing Results
</SectionTitle>
    <Paragraph position="0"> In this section, the parsing experiments are described. There are two subsections. The first is the individual experiment, where texts from 8 domains are parsed with 4 different types of grammars. These are grammars acquired from the same size corpus of the same domain, all domains, non-fiction domains and fiction domains.</Paragraph>
    <Paragraph position="1"> The other parsing experiment is the intensive experiment, where we try to find the most suitable grammar for a particular domain of text and to see how performance relates to the size of the training corpus. We use the domains of 'Press Reportage' and 'Romance and Love Story' in this intensive experiment. In order to measure the accuracy of parsing, recall and precision measures are used (Black et al., 1991).</Paragraph>
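    <Paragraph>As a rough illustration of bracket-based recall and precision, the sketch below scores a predicted parse against a reference parse, each given as a list of constituent spans. It is a simplification under stated assumptions: spans are unlabeled (start, end) word positions, duplicate spans are collapsed, and the normalization steps of the cited evaluation scheme are omitted. The spans themselves are made up.

```python
def bracket_scores(gold, predicted):
    """Recall and precision over constituent spans given as (start, end)."""
    gold_set = set(gold)
    predicted_set = set(predicted)
    matched = len(gold_set.intersection(predicted_set))
    recall = matched / len(gold_set)            # share of gold brackets found
    precision = matched / len(predicted_set)    # share of predicted brackets correct
    return recall, precision

# Hypothetical spans for one five-word sentence.
gold = [(0, 5), (0, 2), (2, 5), (3, 5)]
predicted = [(0, 5), (0, 2), (2, 4)]
recall, precision = bracket_scores(gold, predicted)
```

In this toy case two of the four reference brackets are recovered and two of the three predicted brackets are correct.</Paragraph>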
    <Section position="1" start_page="97" end_page="98" type="sub_section">
      <SectionTitle>
5.1 Individual Experiment
</SectionTitle>
      <Paragraph position="0"> Figure 4 shows the parsing performance for domains A, B, E, J, K, L, N and P with four types of grammars. In the table, results are shown in the form of 'recall/precision'. Each grammar is acquired from a corpus of roughly the same size (24 samples, except L with 21 samples). For example, the grammar of all domains is created using a corpus of 3 samples from each of the 8 domains. The grammars of the non-fiction and fiction domains are each created from a corpus of 6 samples from each of 4 domains. Then the text of each domain is parsed by the four types of grammar. There is no overlap between the training corpus and the test corpus.</Paragraph>
      <Paragraph position="1"> We can see that the result is always the best when the grammar acquired from either the same domain or the same class (fiction or non-fiction) is used. We will call the division into fiction and non-fiction 'class'. It is interesting to see that the grammar acquired from all domains is not the best grammar in any test. In other words, if the size of the training corpus is the same, using a training corpus drawn from a wide variety of domains does not help to achieve better parsing performance.</Paragraph>
      <Paragraph position="2"> For non-fiction domain texts (A, B, E and J), the performance of the fiction grammar is notably worse than that of the same domain grammar or the same class grammar. In contrast, the performance on some fiction domain texts (K and L) with the non-fiction grammar is not so different from that of the same domain grammar. Here, we can find a relationship between these results and the cross entropy observations. The cross entropies obtained when a fiction domain is the model and a non-fiction domain is the test are the highest figures in the table. This means that the fiction domains are not suitable for modeling the syntactic structure of the non-fiction domains. On the other hand, the cross entropies obtained when a non-fiction domain is the model and a fiction domain (except P) is the test include some lower figures. Except for the case of N with the non-fiction grammar, these observations explain the parsing results very well: the higher the cross entropy, the worse the parsing performance.</Paragraph>
      <Paragraph position="3"> It is not easy to argue why, for some domains, the result is better with the grammar of the same class than with that of the same domain. One rationale we can think of is based on the comparison described in section 3. For example, in the cross comparison experiment, we have seen that domains K, L and N are very close. So it may be plausible to say that the grammar of the fiction domains mainly represents K, L and N and, because it covers a wide range of syntactic structures, it gives better performance for each of these domains. This could explain why the grammar of the fiction domains is superior to each domain's own grammar for these three domains. In other words, it is a small sampling problem, which can be seen in the next experiment, too. Because only 24 samples are used, a single domain grammar tends to cover a relatively small part of the language phenomena. On the other hand, a corpus of similar domains could provide wider coverage for the grammar. The assumption that the fiction domain grammar represents domains K, L and N may explain why the parsing result of domain P strongly favors the grammar of the same domain over that of the fiction class.</Paragraph>
    </Section>
    <Section position="2" start_page="98" end_page="99" type="sub_section">
      <SectionTitle>
5.2 Intensive Experiments
</SectionTitle>
      <Paragraph position="0"> In this section, the parsing experiments on texts of two domains are reported. The texts of the two domains are parsed with several grammars, e.g. grammars acquired from different domains or classes, and different sizes of the training corpus. The size of the training corpus is an interesting and important issue.</Paragraph>
      <Paragraph position="1"> We can easily imagine that the smaller the training corpus, the poorer the parsing performance. However, we don't know which of the following two types of grammar produces better performance: a grammar trained on a smaller corpus of the same domain, or a grammar trained on a larger corpus including different domains.</Paragraph>
      <Paragraph position="2"> Figure 5 and Figure 6 show recall and precision of the parsing result for the Press Reportage text. The same text is parsed with 5 different types of grammars, each with several variations of training corpus size. Because of corpus availability, we cannot make single domain grammars from a large training corpus, as  text. This text is also parsed with 5 different types of grammars.</Paragraph>
      <Paragraph position="3"> The graph of accuracy against the size of the training corpus is generally an increasing curve with the slope gradually flattening as the size of the corpus increases. Note that the small declines of some graphs at large numbers of samples are mainly due to the memory limitation for parsing. Parsing is carried out with the same memory size, but when the training corpus grows and the grammar becomes large, some long sentences can't be parsed because of the data area limitation. When the data area is exhausted during parsing, a fitted parsing technique is used to build the most plausible parse tree from the partially parsed trees. These trees are generally worse than the completely parsed trees.</Paragraph>
      <Paragraph position="4"> It is very interesting to see that the saturation point of any graph is about 10 to 30 samples. That is about 20,000 to 60,000 words, or about 1,000 to 3,000 sentences. In the romance and love story domain, the precision of the grammar acquired from 8 samples of the same domain is only about 2% lower than the precision of the grammar trained on 26 samples of the same domain. We believe that the reason why the performance in this domain saturates with such a small corpus is that there is relatively little variety in the syntactical structure of this domain.</Paragraph>
      <Paragraph position="5"> The order of the performance is generally the following: the same domain (best), the same class, all domains, the other class and the other domain (worst). The performance of the last two grammars is very close in many cases. In the romance and love story domain, the grammar acquired from the same domain was the sole best performer. The difference in accuracy between the grammars of the same domain and the other domain is quite large.</Paragraph>
      <Paragraph position="6"> The results for the press reportage domain are not so clear-cut, but the same tendencies can be observed.</Paragraph>
      <Paragraph position="7"> In terms of the relationship between the size of the training corpus and domain dependency, we compare the performance of the grammar acquired from 24 samples of the same domain (we will call it the 'baseline grammar') with that of the other grammars. In the press reportage domain, one needs a three to four times bigger corpus of all domains or non-fiction domains to catch up with the performance of the baseline grammar. It should be noted that a quarter of the non-fiction domain corpus and one eighth of the all domain corpus consist of the press reportage domain corpus. In other words, the fact that the performance of the baseline grammar is about the same as that of 92 samples of the non-fiction domains means that, in the latter grammar, the rest of the corpus neither improves nor harms the parsing performance. In the romance and love story domain, the wide variety grammars, in particular the fiction domain grammar, quickly catch up with the performance of the baseline grammar. A fiction domain corpus less than twice the size is enough to achieve the performance of the baseline grammar.</Paragraph>
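      <Paragraph>The arithmetic behind the press reportage comparison can be made explicit. The short sketch below only restates the proportions given above; the function name is an assumption, and the equal split across domains follows from the statement that a quarter of the non-fiction corpus is press reportage.

```python
def same_domain_share(total_samples, domains_in_mix):
    """Samples contributed by one domain when the mix draws equally from each."""
    return total_samples / domains_in_mix

baseline_samples = 24  # size of the same-domain baseline corpus

# 92 non-fiction samples drawn from 4 domains: a quarter is press reportage.
press_in_nonfiction = same_domain_share(92, 4)

# The remaining samples come from the three other non-fiction domains.
other_nonfiction = 92 - press_in_nonfiction
```

So the 92-sample non-fiction corpus contains 23 press reportage samples, almost the same as the 24-sample baseline, and the other 69 samples are the part that neither improves nor harms performance.</Paragraph>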
      <Paragraph position="8"> These two results and the evidence that fiction domains are close in terms of structure indicate that if you have a corpus consisting of similar domains, it is worthwhile to include the corpus in grammar acquisition; otherwise, it is not so useful. We need to further quantify these trade-offs in terms of the syntactic diversity of individual domains and the difference between domains.</Paragraph>
      <Paragraph position="9"> We also find the small sampling problem in this experiment. In the press reportage experiment, the grammar acquired from the same domain does not give the best performance when the size of the training corpus is small. We observed the same phenomenon in the previous experiment.</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="99" end_page="100" type="metho">
    <SectionTitle>
6 Discussion
</SectionTitle>
    <Paragraph position="0"> One of our basic claims is the following. When we try to parse a text in a particular domain, we should prepare a grammar which suits that domain.</Paragraph>
    <Paragraph position="1"> This idea naturally contrasts with the idea of robust broad-coverage parsing (Carroll and Briscoe, 1996), in which a single grammar should be prepared for parsing any kind of text. Obviously, the latter idea has the great advantage that you do not have to create a number of grammars for different domains and do not need to consider which grammar should be used for a given text. On the other hand, it is plausible that a domain specific grammar can produce better results than a domain independent grammar. Practically, the increasing availability of corpora makes it possible to create domain dependent grammars. Also, it should be noted that we don't need a very large corpus to achieve relatively good parsing quality.</Paragraph>
    <Paragraph position="2"> To summarize our observations and experiments: * There are domain dependencies on syntactic structure distribution.</Paragraph>
    <Paragraph position="3"> * Fiction domains in the Brown corpus are very similar in terms of syntactic structure.</Paragraph>
    <Paragraph position="4"> * We found many idiosyncratic structures in each domain by a simple method.</Paragraph>
    <Paragraph position="5"> * For 8 different domains, the domain dependent grammar or the grammar of the same class provides the best performance, if the size of the training corpus is the same.</Paragraph>
    <Paragraph position="6"> * The parsing performance saturates at a very small training corpus size. This is the case, in particular, for the romance and love story domain.
 * The order of the parsing performance is generally the following: the same domain (best), the same class, all domains, the other class and the other domain (worst).</Paragraph>
    <Paragraph position="7"> * Sometimes, a training corpus from similar domains is useful for grammar acquisition.</Paragraph>
    <Paragraph position="8"> * It may not be so useful to use a corpus from a different domain, even if the size of the corpus is relatively large.</Paragraph>
    <Paragraph position="9"> Undoubtedly these conclusions depend on the parser, the corpus and the evaluation methods. Also, our experiments don't cover all domains and possible combinations. However, the observations and the experiments suggest the significance of the notion of domain in parsing. The results would be useful for deciding what strategy should be taken in developing a grammar for a 'domain dependent' NLP application system.</Paragraph>
  </Section>
</Paper>