<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1106">
  <Title>Lower and higher estimates of the number of &quot;true analogies&quot; between sentences contained in a large multilingual corpus</Title>
  <Section position="6" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6 Conclusion
</SectionTitle>
    <Paragraph position="0"> In this paper, we reported experiments on counting the number of &quot;true analogies,&quot; i.e., analogies of form and meaning, between sentences contained in a large multilingual corpus, under the assumption that translation preserves meaning. We computed a lower and a higher estimate.</Paragraph>
    <Paragraph position="1"> Using an English corpus of almost 100,000 different sentences, we obtained a lower estimate of almost 70,000 &quot;true analogies&quot; involving almost 14,000 sentences by intersecting analogies of form between Chinese, English and Japanese.</Paragraph>
    <Paragraph position="2"> A higher estimate was obtained by enforcing analogies of form, i.e., generating new sentences to fulfil analogies of form, so as to increase the number of paraphrases. More than a million and a half &quot;true analogies&quot; were found. They involve almost 50,000 sentences, i.e., half of the sentences of the corpus. This is consistent with our impression that almost all analogies of form between the English sentences of our corpus are also analogies of meaning.</Paragraph>
    <Paragraph position="3"> Although we do not claim that analogy can explain everything about language, this work shows that, even when considering only the lower estimate, the number of &quot;true analogies&quot; that can be found in a corpus is far from negligible. Further research should focus on the way analogies are distributed over sentences, i.e., on the characterisation of sentences involved in analogies.</Paragraph>
    <Paragraph position="4"> Finally, as a speculative remark, counts similar to those reported above could contribute to the debate about &quot;the argument from the poverty of the stimulus&quot; if it were possible to reproduce them on corpora such as the CHILDES corpus.</Paragraph>
  </Section>
</Paper>