<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1411">
  <Title>Referring to Displays in Multimodal Interfaces</Title>
  <Section position="4" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Projects which have attempted to' integrate natural language (NL) with graphical displays (B~s and Guillotin, 1992; Neal and Shapiro, 1991; Pineda,  1989) have mainly focussed on one of two problems: 1. How can output text be coordinated with graphical information displayed on the screen? 2. How can pointing gestures be coordinated with NL input?  We are interested in a slightly different issue, namely: How can NL terms be used, in a relatively uniform way, to refer to visual objects on the screen as well as the objects (for example, database items) which they may denote? The situation we have in mind is where the computer system has some stored knowledge base, database or model, and is able to graphically display selected items from that store. The user wishes to interact with the system, and may wish to ask questions which either allude to visual features of the display (e.g. Is the blue zone inside the city boundary?) or are directly about the meaning of the display (e.g. What does the blue marking represent?). Such queries require that the system have access to some representation of what is represented on the screen, and that this representation be amenable to NL or multimodal (MM) querying. 1</Paragraph>
  </Section>
class="xml-element"></Paper>