<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1096"> <Title>Generation of Relative Referring Expressions based on Perceptual Grouping</Title> <Section position="2" start_page="0" end_page="2" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> In the last two decades, many researchers have studied the generation of referring expressions to enable computers to communicate with humans about concrete objects in the world.</Paragraph> <Paragraph position="1"> For that purpose, most past work (Appelt, 1985; Dale and Haddock, 1991; Dale, 1992; Dale and Reiter, 1995; Heeman and Hirst, 1995; Horacek, 1997; Krahmer and Theune, 2002; van Deemter, 2002; Krahmer et al., 2003) makes use of attributes of an intended object (the target) and binary relations between the target and others (distractors) to distinguish the target from distractors. Therefore, these methods cannot generate proper referring expressions in situations where no significant surface difference exists between the target and distractors, and no binary relation is useful to distinguish the target. Here, a proper referring expression means a concise and natural linguistic expression enabling hearers to distinguish the target from distractors.</Paragraph> <Paragraph position="2"> For example, consider indicating object b to person P in the situation shown in Figure 1. Note that person P does not share the label information such as a and b with the speaker. Because object b is not distinguishable from objects a or c by means of their appearance, one would try to use a binary relation between object b and the table, i.e., &quot;A ball to the right of the table&quot;.</Paragraph> <Paragraph position="3"> However, &quot;to the right of&quot; is not a discriminatory relation, for objects a and c are also located to the right of the table.
Using a or c as the reference object instead of the table does not help either, since a and c cannot be uniquely identified for the same reason that b cannot. Such situations have drawn little attention, but they can occur easily and frequently in domains such as object arrangement (Tanaka et al., 2004).</Paragraph> <Paragraph position="4"> van der Sluis and Krahmer (2000) proposed using gestures such as pointing in situations like those shown in Figure 1. However, pointing and gazing are not always available, depending on the positional relation between the speaker and the hearer.</Paragraph> <Paragraph position="5"> In the situation shown in Figure 1, a speaker can indicate object b to person P with the simple expression &quot;the front ball&quot; without using any gesture. In order to generate such an expression, one must be able to recognize the salient perceptual group of the objects and use the n-ary relative relations in the group.</Paragraph> <Paragraph position="6"> In this paper, we simply assume that all participants share the appropriate reference frame (Levinson, 2003). We mention this issue in the last section.</Paragraph> <Paragraph position="7"> Although Krahmer et al. claim that their method can handle n-ary relations (Krahmer et al., 2003), they provide no details. We think their method cannot directly handle the situations we discuss here.</Paragraph> <Paragraph position="8"> In this paper, we propose a method of generating referring expressions that utilizes n-ary relations among members of a group. Our method recognizes groups by using Thórisson's algorithm (Thórisson, 1994). As the first step of our research project, we deal with limited situations in which only homogeneous objects are randomly arranged (see Figure 2).
Therefore, we handle only positional n-ary relations; other types of n-ary relations, such as those based on size (e.g., &quot;the biggest one&quot;), are not addressed.</Paragraph> <Paragraph position="9"> Speakers often refer to multiple groups in the course of referring to the target. In these cases, we can observe two types of relations: the intra-group relation, such as &quot;the front two among the five near the desk&quot;, and the inter-group relation, such as &quot;the two to the right of the five&quot;. We treat a subsumption relation between two groups as an intra-group relation.</Paragraph> <Paragraph position="10"> In what follows, Section 2 explains the experiments conducted to collect expressions in which perceptual groups are used. The proposed method is described and evaluated in Section 3. In Section 4, we examine the possibility of predicting the adequacy of an expression in terms of perceptual grouping. Finally, we conclude the paper in Section 5.</Paragraph> </Section> </Paper>