<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2115"> <Title>Drawing Pictures with Natural Language and Direct Manipulation</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> This paper describes an experimental implementation of a multimodal interface. Specifically, the authors have developed a multimodal drawing tool that allows users to draw pictures by combining multiple modalities (mouse, keyboard, and voice input) in various ways.</Paragraph> <Paragraph position="1"> Recently, most user interfaces have been based on direct manipulation. However, direct manipulation is not always better than other methods. It is poorly suited to specifying several operations together or to operating on an object that is not displayed. It also compels the user to point accurately at a target object with a pointing device. Voice input, on the other hand, has some advantages: the user can speak at any time and can use voice input while simultaneously using other devices. A combination of such different modalities offers an interface that is easy for the user to use.</Paragraph> <Paragraph position="2"> Many multimodal systems that integrate natural language input and pointing input have been developed [2][1][5][4]. In those systems, the user mainly uses natural language, supported by pointing input.
However, when the user has to communicate with the computer frequently, as in a drawing tool, it is not effective for the user to always speak while working.</Paragraph> <Paragraph position="3"> A prototype system for a multimodal drawing tool has been developed in which the user can use voice input freely and effectively; that is, the user can freely choose a modality and can use voice input only when the user wants to do so. In such a system, input data arrive at random from multiple modalities. The multimodal system must be able to handle these several kinds of input data.</Paragraph> </Section> </Paper>