<?xml version="1.0" standalone="yes"?>
<Paper uid="J05-2004">
  <Title>A Mathematical Model of Historical Semantics and the Grouping of Word Meanings into Concepts</Title>
  <Section position="12" start_page="245" end_page="245" type="concl">
    <SectionTitle>
9. Conclusion
</SectionTitle>
    <Paragraph position="0"> Empirical evidence indicates that the number of senses per word in a dictionary has an approximately exponential distribution. We have shown that a stochastic model of historical semantics not only can explain this near-exponential phenomenon but can also use the distance from an exponential distribution to estimate the average number of distinct concepts per word as well as the concept creation factor (the percentage of new senses for existing words which can be considered genuinely new concepts).</Paragraph>
    <Paragraph position="1"> Further research is required to determine whether refinements to the mathematical model presented in this article can produce a more accurate model by, for example, distinguishing among different parts of speech, distinguishing between everyday and technical terms, or introducing word frequency as an extra parameter. The introduction to the OED states that prepositions generally have more senses than verbs and adjectives, which in turn have more senses than nouns. Pagel (2000) emphasizes that the evolution of language is not identical for all words. Fundamental vocabulary, including body parts, seasons, and cosmological terms, is more stable than less basic vocabulary (Swadesh 1952). Zipf (1949) was the first to notice a correlation between word frequency and the number of senses listed in a dictionary, in the form of a hyperbolic distribution. This can be summarized by saying that a word with frequency rank r has on average twice as many meanings as a word with frequency rank 10r. A more sophisticated stochastic model along the lines of the one presented in this article, but taking frequency into account, would require information on the relative frequency of each sense of each word. Unfortunately, this is beyond the scope of this preliminary article, whose aim is simply to show that a simple stochastic model (based on grouping senses into distinct concepts liable to give rise to different new senses) can explain a universal property of monolingual dictionaries.</Paragraph>
    <Paragraph position="2"> A more intriguing avenue of future research is to investigate whether statistical analysis of dictionaries can be used to model synonymy rather than (or as well as) polysemy. Indeed, it is an open question whether an approach similar to the one followed in this work can be used to group together senses of different words that correspond to the same concept.</Paragraph>
  </Section>
</Paper>