<?xml version="1.0" standalone="yes"?> <Paper uid="H94-1071"> <Title>Learning from Relevant Documents in Large Scale Routing Retrieval</Title> <Section position="6" start_page="361" end_page="361" type="concl"> <SectionTitle> 5. CONCLUSION </SectionTitle> <Paragraph position="0"> We explore several strategies for selecting relevant documents, or portions of them, for query training in the TREC-2 routing retrieval experiment. The results confirm that using all relevants for training is not a good strategy, because irrelevant, noisy portions of documents would be included. Short relevants are the quality documents. Simple methods, such as using only short documents together with the beginning portions of longer documents for training, perform well and are also efficient. For this TREC-2 routing task, an average of about 200-300 subdocuments per query appears adequate, which is about 1/5-1/4 of all known relevant subdocuments available in this experiment. Selecting the best n ranked relevants (as in relevance feedback) is not as effective as just selecting the top-ranked unit of every document. This investigation also shows that breaking documents into subdocuments is useful for query training.</Paragraph> </Section> </Paper>