<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1017">
  <Title>Constructing Semantic Space Models from Parsed Corpora</Title>
  <Section position="6" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
3.3.3 Results
</SectionTitle>
    <Paragraph position="0"> We carried out an ANOVA with the lexical relation as factor and the distance as dependent variable. The lexical relation factor had six levels, namely the relations detailed in Section 3.3.1. We found no effect of semantic distance for the traditional semantic space model (F(5, 141) = 1.481, p = .200). The η² statistic revealed that only 5.2% of the variance was accounted for. On the other hand, a reliable effect of distance was observed for all dependency-based models (p &lt; .001). Model 7 (wide context specification and plain path value function) accounted for the highest amount of variance in our data (20.3%).</Paragraph>
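As a rough illustration of the analysis described above, the following Python sketch computes a one-way ANOVA F statistic and the η² effect size (the share of total variance attributable to the factor) from grouped distances. The helper name `one_way_anova` and the toy distances are hypothetical and chosen only for illustration; they are not the code or data used in the experiment.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a dict of samples.

    `groups` maps a factor level (here: a lexical relation) to a list of
    observations (here: semantic distances). Hypothetical helper.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = fmean(all_values)

    # Between-group sum of squares: spread of the group means.
    ss_between = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-group sum of squares: spread inside each group.
    ss_within = sum(
        (x - fmean(values)) ** 2
        for values in groups.values()
        for x in values
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared


# Toy distances for three relations (invented numbers, not the paper's data).
toy = {
    "SYN": [1.0, 2.0, 3.0],
    "ANT": [2.0, 3.0, 4.0],
    "CA": [6.0, 7.0, 8.0],
}
f_stat, df1, df2, eta2 = one_way_anova(toy)
print(f"F({df1}, {df2}) = {f_stat:.2f}, eta squared = {eta2:.3f}")
```

The sketch stops at the F statistic and η²; obtaining a p-value additionally requires the F distribution, which is not part of the Python standard library.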
    <Paragraph position="1"> Our results can be seen in Figure 4.</Paragraph>
    <Paragraph position="2"> We examined whether there are any significant differences among the six relations using post-hoc Tukey tests. The pairwise comparisons for model 7 are given in Table 3. The mean distances for conceptual associates (CA), phrasal associates (PA), superordinates/subordinates (SUP), category coordinates (CO), antonyms (ANT), and synonyms (SYN) are also shown in Table 3. There is no significant difference between PA and CA, although SUP, CO, ANT, and SYN are all significantly different from CA (see Table 3, which marks statistically significant differences, α = .05). Furthermore, ANT and SYN are significantly different from PA.</Paragraph>
    <Paragraph position="3"> Kilgarriff and Yallop (2000) point out that manually constructed taxonomies or thesauri are typically organised according to synonymy and hyponymy for nouns and verbs and antonymy for adjectives. They further argue that, for automatically constructed thesauri, similar words are words that either co-occur with each other or with the same words. The relations SYN, SUP, CO, and ANT can be thought of as representing taxonomy-related knowledge, whereas CA and PA correspond to the word clusters found in automatically constructed thesauri.</Paragraph>
    <Paragraph position="4"> In fact, an ANOVA reveals that the distinction between these two classes of relations can be made reliably (F(1, 136) = 15.347, p &lt; .001), after collapsing SYN, SUP, CO, and ANT into one class and CA and PA into another.</Paragraph>
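The collapsing step can be sketched as follows: each of the six relations is mapped to one of two classes, and a one-way ANOVA with a single degree of freedom for the factor is run on the regrouped distances. The class labels `taxonomic` and `associative`, the helper names, and the toy distances are assumptions made for this illustration, not the original analysis code or data.

```python
from statistics import fmean

# Two-way grouping described in the text: taxonomy-related relations
# versus association-based relations. The class labels are hypothetical.
RELATION_CLASS = {
    "SYN": "taxonomic", "SUP": "taxonomic", "CO": "taxonomic", "ANT": "taxonomic",
    "CA": "associative", "PA": "associative",
}


def collapse(pairs):
    """Group (relation, distance) pairs by their collapsed class."""
    classes = {}
    for relation, distance in pairs:
        classes.setdefault(RELATION_CLASS[relation], []).append(distance)
    return classes


def f_statistic(groups):
    """One-way ANOVA F statistic together with its degrees of freedom."""
    values = [x for g in groups.values() for x in g]
    grand = fmean(values)
    ss_between = sum(len(g) * (fmean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((x - fmean(g)) ** 2 for g in groups.values() for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


# Invented distances, one per word pair; not the paper's data.
toy_pairs = [
    ("SYN", 1.0), ("SUP", 2.0), ("CO", 3.0), ("ANT", 2.0),
    ("CA", 5.0), ("PA", 7.0),
]
collapsed = collapse(toy_pairs)
f_stat, df1, df2 = f_statistic(collapsed)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

With only two classes the between-group degrees of freedom drop to 1, which matches the form of the F statistic reported in the text.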
    <Paragraph position="5"> Our results suggest that dependency-based vector space models can, at least to a certain degree, distinguish among different types of lexical relations, while this seems to be more difficult for traditional semantic space models. The Tukey test revealed that category coordination is reliably distinguished from all other relations and that phrasal association is reliably different from antonymy and synonymy. Taxonomy-related relations (e.g., synonymy, antonymy, hyponymy) can be reliably distinguished from conceptual and phrasal association. However, no reliable differences were found between closely associated relations such as antonymy and synonymy.</Paragraph>
    <Paragraph position="6"> Our results further indicate that context encoding plays an important role in discriminating lexical relations. As in Experiment 1, our best results were obtained with the wide context specification. Also, weighting schemes such as the obliqueness hierarchy length again decreased the model's performance (see conditions 2, 5, 9, and 13 in Figure 4), showing that dependency relations contribute equally to the representation of a word's meaning. This suggests that rich context encodings with a wide range of dependency relations are promising for capturing lexical semantic distinctions. However, the performance for maximum context specification was lower, which indicates that collapsing all dependency relations is not the optimal method, at least for the tasks attempted here.</Paragraph>
  </Section>
</Paper>