XML Viewer - p06-1040

File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/06/p06-1040_intro.xml
Size: 2,556 bytes
Last Modified: 2025-10-06 14:03:38
<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1040">
  <Title>Expressing Implicit Semantic Relations without Supervision</Title>
  <Section position="4" start_page="313" end_page="313" type="intro">
    <SectionTitle>
2 Pertinence
</SectionTitle>
    <Paragraph position="0"> The relational similarity between two pairs of words, 11 :YX and 22 :YX , is the degree to which their semantic relations are analogous. For example, mason:stone and carpenter:wood have a high degree of relational similarity. Measuring relational similarity will be discussed in Section 4. For now, assume that we have a measure of the relational similarity between pairs of words, R[?]):,:(sim 2211r YXYX .</Paragraph>
    <Paragraph position="1"> Let }:,,:{ 11 nn YXYXW = be a set of word pairs and let },,{ 1 mPPP = be a set of patterns. The pertinence of pattern iP to a word pair jj YX : is the expected relational similarity between a word pair kk YX : , randomly selected from W according to the probability distribution ):(p ikk PYX , and the word pair jj YX : :</Paragraph>
    <Paragraph position="3"> The conditional probability ):(p ikk PYX can be interpreted as the degree to which the pair kk YX : is representative (i.e., typical) of pairs that fit the pattern iP . That is, iP is pertinent to jj YX : if highly typical word pairs kk YX : for the pattern iP tend to be relationally similar to</Paragraph>
    <Paragraph position="5"> Pertinence tends to be highest with patterns that are unambiguous. The maximum value of ),:(pertinence ijj PYX is attained when the pair jj YX : belongs to a cluster of highly similar pairs and the conditional probability distribution ):(p ikk PYX is concentrated on the cluster. An ambiguous pattern, with its probability spread over multiple clusters, will have less pertinence.</Paragraph>
    <Paragraph position="6"> If a pattern with high pertinence is used for text mining, it will tend to produce word pairs that are very similar to the given word pair; this follows from the definition of pertinence. We believe this definition is the first formal measure of quality for text mining patterns.</Paragraph>
    <Paragraph position="7"> Let ikf , be the number of occurrences in a corpus of the word pair kk YX : with the pattern</Paragraph>
    <Paragraph position="9"> Instead, we first estimate ):(p kki YXP :</Paragraph>
    <Paragraph position="11"> The use of Bayes' Theorem and the assumption that nYX jj 1):p( = for all word pairs is a way of smoothing the probability ):(p ikk PYX , similar to Laplace smoothing.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML