ENDORSED BY:
SIGSEM, the ACL Special Interest Group in Computational Semantics
SIGGEN, the ACL Special Interest Group in Generation
SIGLEX, the ACL Special Interest Group on the Lexicon
ORGANISERS:
Regina Barzilay, Cornell University
Ehud Reiter, University of Aberdeen
Jeffrey Mark Siskind, Purdue University
PROGRAM COMMITTEE:
Kobus Barnard, University of Arizona
Paul Cohen, UMass Amherst
Peter Dominey, CNRS
Phil Edmonds, Sharp Laboratories of Europe
Allen Gorin, AT&T Research Labs
Graeme Hirst, University of Toronto
Lillian Lee, Cornell University
Tim Oates, University of Maryland Baltimore County
Terry Regier, University of Chicago
Deb Roy, MIT Media Lab
CONFERENCE WEBSITE:
http://www.cs.cornell.edu/~regina/lwm03
INTRODUCTION
One of the grand challenges of NLP, AI, and Cognitive Science is to develop models of what
words mean (lexical semantics) in terms of the non-linguistic world. Recently there has been
growing interest in using corpus- and data-based techniques for this task: that is, in trying to
learn what words mean by analysing a ‘parallel corpus’ of (A) non-linguistic data and (B) linguistic
texts that describe or are otherwise based on the non-linguistic data. Recent examples of such work
include learning verb semantics from visual-image sequences; learning the meaning of time phrases
from a collection of weather forecasts based on numerical weather simulations; and learning the
meaning of mathematical predicates from human verbalisations of theorem-prover output.
We felt that while the enterprise of learning semantic information from conventional text-only
corpora is well established, work on learning word meanings from non-linguistic data was being
undertaken by researchers in many diverse fields. We needed a venue for these researchers to meet,
exchange ideas, and become familiar with each other’s work.
Our intention from the start was to make this an interdisciplinary workshop, attracting papers
and attendees from the NLP, AI, Cognitive Science, and Machine Vision communities. While we
needed to choose a particular venue that is home to one of these communities, in this case NAACL-
HLT, to host this workshop, we hoped for participation from other areas of AI and Cognitive Science
including Vision and Robotics researchers with interest in learning how to relate sensor data to
words, and psychologists with interest in cognitive models of how people learn to relate words to
the non-linguistic world. We assembled a program committee that included representation from
all these communities and circulated the call for papers among them. We were unsure whether
we would succeed in attracting such interdisciplinary participation, and were pleasantly
surprised by the response to our call for papers. Indeed, our
program has papers from members of all of the above communities. These are divided into sessions
that (roughly) focus on learning from image, video, robotics, and other types of data.
For logistical reasons, we got off to a late start in organising this workshop. We would like
to thank the authors for writing their papers, and the program committee members for reviewing
them, with such fast turnaround. A number of researchers told us that they would have liked
to submit papers but were unable to do so given the tight deadlines. They expressed the desire
that this workshop be the beginning of a new research community that would hold future meetings.
We share that desire. Our hope in organising this workshop is that it will help ‘gel’ this new
and exciting research area, by bringing together interested people from different communities and
different perspectives who bring different approaches and methods to bear on the same problem of
learning word meanings from non-linguistic data.
Regina Barzilay
Ehud Reiter
Jeffrey Mark Siskind
April 2003
