<?xml version="1.0" standalone="yes"?>
<Paper uid="W99-0203">
  <Title>Identification of Coreference Between Names and Faces</Title>
  <Section position="6" start_page="22" end_page="23" type="evalu">
    <SectionTitle>
6 Experiments
</SectionTitle>
    <Paragraph position="0"> We experimentally evaluated the proposed system by comparing it with simple systems that contain only the language module or only the image module, in order to confirm the effect of the combining process. In the simple systems, the language module and the image module also work under three kinds of hypotheses. Thus, as the baseline of the evaluation, we use the result of the system that has the minimum distance between the output of the media and the hypothesis, as defined by formulas (6) and (7). In our experiments, we use the photograph news on the web page called &quot;AULOS&quot;, distributed by The Mainichi Newspapers (Mai, 1997). The average length of an article's text is about 300 characters, or 100 words. Almost all of the images are full color, and their average size is about 250 x 200 pixels. Moreover, the images are not accompanied by captions. In this evaluation, we use articles with full-color images published in May and June 1997. For common name extraction, we performed four-fold cross-validation on the 228 articles from this period that contain common human names. For common face extraction, we performed three-fold cross-validation on the set of color photograph images contained in the articles used by the language module. To evaluate how accurately the system identifies a given person as a common person, we calculated the recall and precision rates of the system's decisions about whether a person is common. Since the outputs of our system are certainties, the recall and precision rates are defined as follows.</Paragraph>
    <Paragraph position="1">  where W(i) is the certainty of person i, and cc denotes the set of all correctly identified persons.</Paragraph>
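The defining formulas are not reproduced in this extract, so the following is only a minimal sketch of one plausible certainty-weighted reading, not the paper's confirmed definition. It assumes that recall divides the certainty mass of the correctly identified persons (the set cc) by the number of truly common persons, and that precision divides the same mass by the total certainty the system output. The function name `soft_recall_precision` and the parameter `gold_count` are hypothetical.

```python
def soft_recall_precision(certainty, correct, gold_count):
    """Certainty-weighted recall and precision (assumed definitions).

    certainty:  dict mapping each output person i to its certainty W(i)
    correct:    the set cc of persons the system identified correctly
    gold_count: number of persons that are truly common (hypothetical input)
    """
    # Certainty mass assigned to correctly identified persons.
    hit_mass = sum(w for person, w in certainty.items() if person in correct)
    # Total certainty mass over everything the system output.
    total_mass = sum(certainty.values())
    recall = hit_mass / gold_count if gold_count else 0.0
    precision = hit_mass / total_mass if total_mass else 0.0
    return recall, precision


# Toy values chosen for illustration only; they are not results from the paper.
certainty = {"A": 0.5, "B": 0.25, "C": 0.25}
recall, precision = soft_recall_precision(certainty, correct={"A", "B"}, gold_count=3)
```

Under these assumptions a confident correct decision contributes more to both rates than a hesitant one, which is the point of scoring certainties rather than hard yes/no outputs.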
    <Paragraph position="2">  The evaluation results of each module are shown in Table 2.</Paragraph>
    <Paragraph position="3"> For the language module and the combining module, we evaluate the names and their certainties.</Paragraph>
    <Paragraph position="4"> On the other hand, for the image module, we evaluate only the certainties, under the assumption that the human name of the face assigned the higher certainty is correct, because the image module does not output human names.</Paragraph>
    <Paragraph position="5"> The effect of combining appears as the difference between the results of the combining module and the results of the language module or the image module. The combining module has two variations. The module based on (10) improved both the recall and precision rates by combining. The reason for the high recall rate is that one module picks up persons whom the other module fails to pick up. Since a high precision rate is maintained, this compensation is really effective. On the other hand, the combining module based on (9) improves the precision rate more than the module based on (10). The reason for this phenomenon is that the module is able to cancel noise that appears in the content of one medium by using the content of the other medium. However, the recall rate decreased, as expected from (9).</Paragraph>
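Formulas (9) and (10) are not part of this section, so the sketch below does not reproduce them. It only illustrates the trade-off described above with two stand-in rules chosen as assumptions: a conjunctive rule (product of certainties), which behaves like the (9)-style module, and a disjunctive rule (maximum of certainties), which behaves like the (10)-style module. All names and values are hypothetical.

```python
def combine_conjunctive(lang, image):
    """Keep a person only if both media support them (product of certainties)."""
    people = set(lang) | set(image)
    return {p: lang.get(p, 0.0) * image.get(p, 0.0) for p in people}


def combine_disjunctive(lang, image):
    """Keep a person if either medium supports them (maximum of certainties)."""
    people = set(lang) | set(image)
    return {p: max(lang.get(p, 0.0), image.get(p, 0.0)) for p in people}


# Toy certainties for illustration only; they are not data from the paper.
lang = {"A": 0.75, "noise": 0.5}   # language module: one spurious name
image = {"A": 0.5, "B": 0.75}      # image module: finds B, which the text missed

strict = combine_conjunctive(lang, image)
lenient = combine_disjunctive(lang, image)
```

In this toy case the conjunctive rule cancels the spurious name but also drops B, which only one medium found, so precision rises while recall falls. The disjunctive rule keeps B, matching the compensation effect, at the cost of also keeping the noise.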
  </Section>
</Paper>