<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0609">
  <Title>Grounding Word Meanings in Sensor Data: Dealing with Referential Uncertainty</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> We are interested in how robots might learn language given qualitatively the same inputs available to children: natural language utterances paired with sensory access to the environment. This paper focuses on the sub-problem of learning word meanings. Suppose a robot has acquired a set of sound patterns that may or may not correspond to words. How is it possible to separate the words from the non-words, and to learn the meanings of the words? We assume the robot's sensory access to its environment is through a collection of primitive sensors organized into sensor groups, where each sensor group is a set of related sensors. For example, the sensor group $S_I$ might return a single value representing the mean grayscale intensity of a set of pixels corresponding to an object in the visual field. The sensor group $S_{HW}$ might return two values representing the height and width of the bounding box around the object.</Paragraph>
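    <Paragraph>The two example sensor groups described above can be sketched as follows. This is only an illustration under stated assumptions: the pixel representation (a list of row, column, gray triples) and the function names intensity_group and height_width_group are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# Hypothetical representation: an object is a list of (row, col, gray) pixels.
Pixel = Tuple[int, int, int]


def intensity_group(pixels: List[Pixel]) -> Tuple[float]:
    """One-dimensional sensor group: mean grayscale intensity of the object."""
    total = sum(gray for _, _, gray in pixels)
    return (total / len(pixels),)


def height_width_group(pixels: List[Pixel]) -> Tuple[float, float]:
    """Two-dimensional sensor group: height and width of the bounding box."""
    rows = [r for r, _, _ in pixels]
    cols = [c for _, c, _ in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return (float(height), float(width))


# A small made-up object covering rows 2..4 and columns 3..4.
obj = [(2, 3, 120), (2, 4, 130), (3, 3, 126), (3, 4, 136), (4, 3, 128), (4, 4, 128)]
x_intensity = intensity_group(obj)   # a point in a 1-dimensional sensor group
x_size = height_width_group(obj)     # a point in a 2-dimensional sensor group
```

    Each function returns a tuple, so a reading from any sensor group can be treated uniformly as a vector whose length is the dimensionality of that group.</Paragraph>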
    <Paragraph position="1"> Learning the meanings of words requires a representation for meaning. We use a representation that we call a conditional probability field (CPF), which is a type of scalar field. A scalar field is a map of the following form: $f : \mathbb{R}^n \to \mathbb{R}$. The mapping assigns to each vector $x \in \mathbb{R}^n$, which corresponds to a point in an $n$-dimensional sensor group, a conditional probability of the form $p(E \mid x)$, where $E$ denotes the occurrence of some event. Let $E(S, x)$ denote the CPF defined over sensor group $S$ for event $E$.</Paragraph>
    <Paragraph position="2"> The semantics of a CPF clearly depend on the nature of $E$. Two events that will be of particular importance in learning the meanings of words are: (1) utter-W, the event that word $W$ is uttered, perhaps as part of an utterance that refers to some feature of the world denoted by $W$; and (2) hear-W, the event that word $W$ is heard. The corresponding conditional probability fields are:</Paragraph>
    <Paragraph position="4"> utter-W$(S, x)$: the probability that word $W$ will be uttered by a competent speaker of the language to denote the feature of the physical world that $S$ is currently sensing (i.e. that results in the current</Paragraph>
    <Paragraph position="6"> hear-W$(S, x)$: the probability that word $W$ will be heard given that $x \in S$ is observed. In this framework, the meaning of word $W$ is simply utter-W$(S, x)$. The last plot in figure 3 shows a CPF defined over $S_I$ that might represent the meaning of the word "gray". Grayscale intensities near 128 will be called gray with probability almost one, whereas intensities near 0 and 255 will never be called gray. Rather, they are "black" and "white" respectively.</Paragraph>
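    <Paragraph>To make the idea of a CPF over a one-dimensional intensity sensor group concrete, the following minimal sketch defines a function with the qualitative shape described for the word gray. The Gaussian-shaped bump, the parameters center and width, and the name utter_gray are assumptions chosen for illustration; the paper does not specify the functional form of the field.

```python
import math


def utter_gray(intensity: float, center: float = 128.0, width: float = 30.0) -> float:
    """Illustrative CPF over a 1-D grayscale intensity sensor group.

    Returns a value in [0, 1]: equal to 1 at `center` and close to 0
    at the extremes 0 and 255. The bump shape is an assumption.
    """
    z = (intensity - center) / width
    return math.exp(-0.5 * z * z)


# Evaluate the field at a mid-gray value and at the two extremes.
p_mid = utter_gray(128.0)
p_black = utter_gray(0.0)
p_white = utter_gray(255.0)
```

    Under these assumed parameters the value at 128 is exactly one, while the values at 0 and 255 are very small but not exactly zero, so this smooth form only approximates the statement that extreme intensities are never called gray.</Paragraph>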
    <Paragraph position="7"> Learning the denotation of $W$ involves determining the identity of $S$ and then recovering utter-W$(S, x)$. The learner does not have direct access to utter-W$(S, x)$. Rather, the learner must gain information about utter-W$(S, x)$ indirectly, by noticing the sensory contexts in which $W$ is used and those in which it is not, i.e. via hear-W$(S, x)$.</Paragraph>
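    <Paragraph>One simple way a learner could summarize the sensory contexts in which a word is and is not heard is a binned frequency estimate of the hear-W field. The sketch below is not the algorithm of this paper; it is a naive illustration under assumed inputs, and the function name estimate_hear_w, the bin width, and the sample data are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def estimate_hear_w(
    observations: Iterable[Tuple[float, bool]], bin_width: float = 32.0
) -> Dict[int, float]:
    """Estimate the probability of hearing W per bin of a 1-D sensor value.

    Each observation pairs a sensor value x with a flag saying whether
    the word W was heard while x was observed.
    """
    heard = defaultdict(int)
    total = defaultdict(int)
    for x, was_heard in observations:
        b = int(x // bin_width)  # index of the bin containing x
        total[b] += 1
        if was_heard:
            heard[b] += 1
    # Relative frequency of hearing W in each bin that was observed.
    return {b: heard[b] / total[b] for b in total}


# Made-up observations: (grayscale intensity, was the word heard?).
data = [
    (120.0, True), (125.0, True), (110.0, False),
    (10.0, False), (250.0, False), (20.0, False),
]
estimate = estimate_hear_w(data)
```

    Such an estimate describes only when the word is heard. As the text notes, it is indirect evidence about the utter-W field, because a word may go unheard in a context where it would still be an appropriate description.</Paragraph>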
    <Paragraph position="8"> This problem is difficult due to referential uncertainty.</Paragraph>
    <Paragraph position="9"> Even if the utterances the learner hears are true statements about aspects of its environment that are perceptually available, there are usually many aspects of the environment that might be a given word's referent. This is Quine's "gavagai" problem (Quine, 1960). The algorithm described in this paper solves a restricted version of the gavagai problem, one in which the denotation of a word must be representable as a CPF defined over one of a set of pre-defined sensor groups.</Paragraph>
  </Section>
</Paper>