<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1017">
<Title>Constructing Semantic Space Models from Parsed Corpora</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> Vector-based models of word co-occurrence have proved a useful representational framework for a variety of natural language processing (NLP) tasks such as word sense discrimination (Schütze, 1998), text segmentation (Choi et al., 2001), contextual spelling correction (Jones and Martin, 1997), automatic thesaurus extraction (Grefenstette, 1994), and notably information retrieval (Salton et al., 1975).</Paragraph>
<Paragraph position="1"> Vector-based representations of lexical meaning have also been popular in cognitive science and figure prominently in a variety of modelling studies ranging from similarity judgements (McDonald, 2000) to semantic priming (Lund and Burgess, 1996; Lowe and McDonald, 2000) and text comprehension (Landauer and Dumais, 1997).</Paragraph>
<Paragraph position="2"> In this approach, semantic information is extracted from large bodies of text under the assumption that the context surrounding a given word provides important information about its meaning. The semantic properties of words are represented by vectors constructed from the observed distributional patterns of co-occurrence with their neighbouring words.</Paragraph>
<Paragraph position="3"> Co-occurrence information is typically collected in a frequency matrix, where each row corresponds to a unique target word and each column represents its linguistic context.</Paragraph>
<Paragraph position="4"> Contexts are defined either as a small number of words surrounding the target word (Lund and Burgess, 1996; Lowe and McDonald, 2000) or as entire paragraphs or even documents (Landauer and Dumais, 1997). Context is typically treated as a set of unordered words, although in some cases syntactic information is taken into account (Lin, 1998; Grefenstette, 1994; Lee, 1999). A word can thus be viewed as a point in an n-dimensional semantic space. The semantic similarity between words can then be computed by measuring the distance between points in this space, using a metric such as the cosine or Euclidean distance.</Paragraph>
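[Editorial sketch, not part of the paper: the following code illustrates the kind of word-based semantic space described above, building a window-based co-occurrence matrix and comparing the resulting word vectors with the cosine metric. The window size, the toy sentence, and the raw-frequency weighting are simplifying assumptions.]

    # Sketch of a window-based co-occurrence space (simplifying assumptions).
    from collections import defaultdict
    from math import sqrt

    def cooccurrence_matrix(tokens, window=2):
        """Count how often each context word occurs within +/- `window`
        positions of each target word (rows = targets, columns = contexts)."""
        counts = defaultdict(lambda: defaultdict(int))
        for i, target in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[target][tokens[j]] += 1
        return counts

    def cosine(u, v):
        """Cosine similarity between two sparse vectors stored as dicts."""
        dot = sum(u[k] * v[k] for k in u if k in v)
        norm_u = sqrt(sum(x * x for x in u.values()))
        norm_v = sqrt(sum(x * x for x in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    tokens = "the cat drank the milk and the dog drank the water".split()
    space = cooccurrence_matrix(tokens, window=2)
    print(cosine(space["cat"], space["dog"]))  # high: the two nouns share contexts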
<Paragraph position="5"> In the variants of vector-based models where no linguistic knowledge is used, differences among parts of speech for the same word (e.g., to drink vs. a drink) are not taken into account in the construction of the semantic space, although in some cases word lexemes are used rather than word surface forms (Lowe and McDonald, 2000; McDonald, 2000). Minimal assumptions are made with respect to syntactic dependencies among words: in fact, it is assumed that all context words within a certain distance of the target word are semantically relevant. The lack of syntactic information makes the building of semantic space models relatively straightforward and language-independent (all that is needed is a corpus of written or spoken text). However, it also entails that contextual information contributes indiscriminately to a word's meaning.</Paragraph>
<Paragraph position="6"> Some studies have tried to incorporate syntactic information into vector-based models. In this view, the semantic space is constructed from words that bear a syntactic relationship to the target word of interest. This makes semantic spaces more flexible: different types of contexts can be selected, and words do not have to physically co-occur to be considered contextually relevant. However, existing models either concentrate on specific relations for constructing the semantic space, such as objects (e.g., Lee, 1999), or collapse all types of syntactic relations available for a given target word (Grefenstette, 1994; Lin, 1998). Although syntactic information is now used to select a word's appropriate contexts, it is not explicitly captured in the contexts themselves (which are still represented by words) and is therefore not amenable to further processing.</Paragraph>
<Paragraph position="7"> A commonly raised criticism of both types of semantic space models (i.e., word-based and syntax-based) concerns the notion of semantic similarity. Proximity between two words in the semantic space cannot indicate the nature of the lexical relation between them: distributionally similar words can be antonyms, synonyms, hyponyms, or in some cases semantically unrelated. This limits the application of semantic space models to NLP tasks that require distinguishing between lexical relations.</Paragraph>
<Paragraph position="8"> In this paper we generalise semantic space models by proposing a flexible conceptualisation of context which is parametrisable in terms of syntactic relations. We develop a general framework for vector-based models which can be optimised for different tasks. Our framework allows the construction of the semantic space to take place over words or over syntactic relations, thus bridging the gap between word-based and syntax-based models. Furthermore, we show how our model can incorporate well-defined, informative contexts in a principled way that retains information about the syntactic relations available for a given target word.</Paragraph>
<Paragraph position="9"> We first evaluate our model on semantic priming, a phenomenon that has received much attention in computational psycholinguistics and is typically modelled using word-based semantic spaces. We then conduct a study showing that our model is sensitive to different types of lexical relations.</Paragraph>
</Section>
</Paper>
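[Editorial sketch of the contrast between word-based and syntax-based contexts drawn in this introduction: the code below builds a semantic space whose dimensions are parametrised by syntactic relations. It is not the framework proposed in the paper; the hand-written dependency triples, the relation_space function, and its relations and typed parameters are hypothetical, and in practice the triples would be extracted from a parsed corpus.]

    # Sketch of contexts parametrised by syntactic relations (assumptions only).
    from collections import defaultdict

    # Toy (head, relation, dependent) triples standing in for parser output.
    TRIPLES = [
        ("drink", "subj", "man"), ("drink", "obj", "water"),
        ("drink", "subj", "dog"), ("drink", "obj", "milk"),
        ("see",   "subj", "man"), ("see",   "obj", "dog"),
    ]

    def relation_space(triples, relations=("subj", "obj"), typed=True):
        """Build vectors for head words from their dependents.  `relations`
        selects which syntactic relations count as context; `typed` keeps
        the relation label in the dimension (syntax-based space) or drops
        it (word-based space)."""
        space = defaultdict(lambda: defaultdict(int))
        for head, rel, dep in triples:
            if rel in relations:
                dim = (rel, dep) if typed else dep
                space[head][dim] += 1
        return space

    # The same triples yield two different spaces depending on the parameters.
    print(dict(relation_space(TRIPLES, typed=True)["drink"]))
    print(dict(relation_space(TRIPLES, typed=False)["drink"]))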