<?xml version="1.0" standalone="yes"?>
<Paper uid="C82-1002">
  <Title>COGNITIVE MODELS FOR COMPUTER VISION</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> A common approach to the problem of scene interpretation is to generate hypotheses about the position and size of objects and to use these expectations to guide the search for picture areas that exhibit the expected features [4,8,15]. But where do these expectations come from? If a robot operates in a known environment, expectations can be self-generated on the basis of built-in knowledge and previously experienced situations. Another very common source of information is some kind of external input, often based on natural language communication. A piece of conversation such as &quot;look for the pencil&quot;, &quot;where?&quot;, &quot;on the table&quot; conveys a lot of information about the presence of a reference object (the table) and the characteristics of a surface (the table top) that must be located in order to restrict the search for the target object (the pencil). To take advantage of these linguistic information sources we must be able to extract from a qualitative expression like &quot;on the table&quot; all those quantitative constraints that are relevant from a geometric modelling point of view [2]. These problems might seem more closely related to the generation of visual analog representations than to the understanding of a scene; but what exactly does it mean to &quot;understand&quot; a scene? When we analyze a scene, we use a lot of non-geometric knowledge: we are not surprised to find smoked cigarettes in an ashtray, and a glance is enough to classify them, but we could have some trouble recognizing that it contains a group of goldfish, and surely not only because of geometric constraints! Therefore the processing of visual knowledge must be based on cognitive models that are able to handle different kinds and sources of information, and in this sense we feel that there is no clear-cut boundary between scene analysis and scene generation [14,16].</Paragraph>
    <Paragraph position="1"> In the following we deal mainly with the representation of objects and the formalization of spatial relationships, trying to point out how linguistic information can be related to visual information.</Paragraph>
    <Paragraph position="2"> * Work supported by the Italian National Research Council under grant 80.01142.07. G. ADORNI, A. BOCCALATTE and M. DI MANZO</Paragraph>
  </Section>
</Paper>