<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-2008">
  <Title>Improving the Accuracy of Subcategorizations Acquired from Corpora</Title>
  <Section position="6" start_page="4" end_page="8" type="evalu">
    <SectionTitle>
4 Experiments
</SectionTitle>
    <Paragraph position="0"> I applied my method to SCFs acquired from 135,902 sentences of mobile phone newsgroup postings archived by Google.com, which is the same data used by Carroll and Fang (2004). The number of acquired SCFs was 14,783 for 3,864 word stems, while the number of SCF types in the data was 97. I then translated the 163 SCF types into the SCF types of the XTAG English grammar (XTAG Research Group, 2001) and the LinGO ERG (Copestake, 2002) using translation mappings built by Ted Briscoe and Dan Flickinger. These mappings take 23 of the SCF types into 13 (out of 57 possible) XTAG SCF types, and 129 into 54 (out of 216 possible) ERG SCF types.</Paragraph>
    <Paragraph position="1"> To evaluate my method, I split each lexicon of the two grammars into training SCFs and testing SCFs. The words in the testing SCFs were included in the acquired SCFs. By applying my method to the acquired SCFs using the training SCFs and evaluating the resulting SCFs against the testing SCFs, we can estimate to what extent my method can preserve reliable SCFs for words unknown to the grammar. (Footnote: I used the same version of the LinGO ERG as Carroll and Fang (2004) (1.4; April 2003), but the map is updated.) [Figure panels: XTAG English grammar (left), the LinGO ERG (right).]</Paragraph>
    <Paragraph position="2">  The XTAG lexicon was split into 9,437 SCFs for 8,399 word stems as training and 423 SCFs for 280 word stems as testing, while the ERG lexicon was split into 1,608 SCFs for 1,062 word stems as training and 292 SCFs for 179 word stems as testing. I extracted SCF confidence-value vectors from the training SCFs and from the acquired SCFs for the words in the testing SCFs. The number of resulting data objects was 8,679 for XTAG and 1,241 for ERG. The number of initial centroids extracted from the training SCFs was 49 for XTAG and 53 for ERG. I then clustered the 8,679 data objects into 49 clusters and the 1,241 data objects into 53 clusters, and evaluated the resulting SCFs by comparing them to the testing SCFs. (Footnote: I here assume that the existing SCFs for the words in the lexicon are more reliable than the other SCFs for those words.) (Footnote: As initial centroids, I used the vectors that appeared for more than one word.)</Paragraph>
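    As a rough illustration of the clustering step described above, the sketch below runs a k-means-style loop whose initial centroids are supplied by the caller (standing in for vectors taken from the training SCFs) rather than chosen at random. This section does not spell out the clustering algorithm or distance measure, so the squared Euclidean distance, the names seeded_kmeans and nearest, and the toy vectors are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (an assumed, illustrative distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Vector, centroids: List[List[float]]) -> int:
    """Index of the centroid closest to point (ties go to the lowest index)."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def seeded_kmeans(points: List[Vector],
                  initial_centroids: List[Vector],
                  max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Cluster points, starting from caller-supplied centroids."""
    centroids = [list(c) for c in initial_centroids]
    assignment = [-1] * len(points)
    for _ in range(max_iter):
        new_assignment = [nearest(p, centroids) for p in points]
        if new_assignment == assignment:
            break  # assignments are stable, so the loop has converged
        assignment = new_assignment
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical data: two seed vectors and four confidence-value vectors.
seeds = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.9, 0.8, 0.0], [0.7, 0.9, 0.2]]
labels, final_centroids = seeded_kmeans(vectors, seeds)
```

    Fixing the seeds makes the result deterministic, and the number of clusters is simply the number of seeds, which mirrors the 49 and 53 initial centroids reported above.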
    <Paragraph position="3"> I first compare confidence cut-off with frequency cut-off to observe the effects of Bayesian estimation. Figure 4 shows precision and recall of the SCFs obtained using frequency cut-off and confidence cut-off with recognition thresholds 0.01, 0.03, and 0.05, varying the threshold for the confidence values and for the relative frequencies from 0 to 1.</Paragraph>
    <Paragraph position="4">  The graph indicates that the confidence cut-offs achieved higher recall than the frequency cut-off, thanks to the a priori distributions. When we compare the three confidence cut-offs, we can improve precision using higher recognition thresholds while we can improve recall using lower recognition thresholds. This is quite consistent with our expectations.</Paragraph>
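    The contrast between the two cut-offs can be sketched as follows. This section does not give the form of the confidence value, so the sketch assumes it is the posterior probability that an SCF's relative frequency is at least the recognition threshold, under a Beta prior with integer pseudo-counts. The function names, the prior parameters, and the counts are hypothetical.

```python
from math import comb


def beta_tail(a: int, b: int, t: float) -> float:
    """P(theta >= t) for theta ~ Beta(a, b) with integer a, b >= 1.

    Uses the identity between the Beta tail and a binomial sum,
    which is exact for integer parameters.
    """
    n = a + b - 1
    return sum(comb(n, j) * t ** j * (1 - t) ** (n - j) for j in range(a))


def confidence(count: int, total: int, recognition_threshold: float,
               prior_a: int = 1, prior_b: int = 1) -> float:
    """Assumed confidence value: posterior tail mass above the recognition threshold."""
    return beta_tail(prior_a + count, prior_b + total - count,
                     recognition_threshold)


def keep_by_frequency(count: int, total: int, threshold: float) -> bool:
    """Frequency cut-off: keep the SCF if its relative frequency reaches the threshold."""
    return count / total >= threshold


def keep_by_confidence(count: int, total: int, recognition_threshold: float,
                       threshold: float, prior_a: int = 1, prior_b: int = 1) -> bool:
    """Confidence cut-off: keep the SCF if its confidence reaches the threshold."""
    return confidence(count, total, recognition_threshold,
                      prior_a, prior_b) >= threshold


# Hypothetical SCF seen once in 200 occurrences of a verb.
flat = confidence(1, 200, 0.01)                             # uniform prior
informed = confidence(1, 200, 0.01, prior_a=2, prior_b=20)  # prior favouring the SCF
```

    With these toy numbers the relative frequency is 0.005, so a frequency cut-off at 0.01 drops the SCF. The confidence is about 0.40 under the uniform prior and about 0.62 under the informative prior, so a confidence cut-off at 0.5 keeps it only in the second case. This is one way a prior can raise recall for rarely observed SCFs.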
    <Paragraph position="5">  Precision and recall are defined as $\mathrm{Precision} = \frac{\text{correct SCFs for the words in the resulting SCFs}}{\text{all SCFs for the words in the resulting SCFs}}$ and $\mathrm{Recall} = \frac{\text{correct SCFs for the words in the resulting SCFs}}{\text{all SCFs for the words in the test SCFs}}$. I then compare centroid cut-off with confidence cut-off to observe the effects of clustering. Figure 5 shows precision and recall of the resulting SCFs using centroid cut-off 0.05 and confidence cut-off 0.05, varying the threshold for the confidence values. To show the effects of using the training SCFs, I also clustered the SCF confidence-value vectors in the acquired SCFs with random initialization (k = 49 for XTAG and 53 for ERG; shown as centroid cut-off 0.05*). The graph shows that clustering is meaningful only when we make use of the reliable SCFs in the manually-coded lexicon. The centroid cut-off using the lexicon of the grammar boosted precision compared to the confidence cut-off. The difference between the effects of my method on XTAG and ERG is probably due to the finer-grained SCF types of ERG. These resulted in lower precision of the acquired SCFs for ERG, which prevented us from distinguishing infrequent (correct) SCFs from SCFs acquired in error. However, since unusual SCFs tend to be included in the lexicon, we will be able to obtain accurate clusters for unknown words with smaller SCF variations, as we achieved in the experiments with XTAG.</Paragraph>
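    The precision and recall measures used in this evaluation can be computed along the following lines. This is a minimal sketch that reads "all SCFs for the words in the resulting SCFs" as every SCF the method proposes for those words. The dictionary layout, the function name precision_recall, and the SCF labels are hypothetical.

```python
from typing import Dict, Set, Tuple

Lexicon = Dict[str, Set[str]]  # word stem -> set of SCF labels


def precision_recall(resulting: Lexicon, testing: Lexicon) -> Tuple[float, float]:
    """Precision and recall of the resulting SCFs against the testing SCFs."""
    # An SCF is correct if the testing lexicon lists it for the same word.
    correct = sum(len(scfs.intersection(testing.get(word, set())))
                  for word, scfs in resulting.items())
    proposed = sum(len(scfs) for scfs in resulting.values())
    gold = sum(len(scfs) for scfs in testing.values())
    precision = correct / proposed if proposed else 0.0
    recall = correct / gold if gold else 0.0
    return precision, recall


# Hypothetical toy lexicons with made-up SCF labels.
resulting_scfs = {"call": {"NP", "NP_PP"}, "send": {"NP_NP"}}
testing_scfs = {"call": {"NP", "S"}, "send": {"NP_NP", "NP_PP"}}
precision, recall = precision_recall(resulting_scfs, testing_scfs)
```

    In this toy case two of the three proposed SCFs are correct and two of the four testing SCFs are recovered, giving precision 2/3 and recall 1/2. Repeating the computation while the cut-off threshold moves from 0 to 1 would trace a precision-recall curve of the kind plotted in Figures 4 and 5.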
  </Section>
</Paper>