<?xml version="1.0" standalone="yes"?> <Paper uid="H90-1057"> <Title>Representation Quality in Text Classification:</Title> <Section position="6" start_page="292" end_page="292" type="concl"> <SectionTitle> 4 Summary </SectionTitle> <Paragraph position="0"> Text-based systems of all kinds are an area of increasing research interest, as evidenced by the recent AAAI Symposium on Text-Based Intelligent Systems, by funding initiatives such as the TIPSTER portion of the DARPA Strategic Computing Program, and by an increase in the number of research papers proposing the application of various artificial intelligence techniques to text classification problems. This interest is driven both by an undeniable need to cope with large amounts of data in the form of online text and by the resource that this text represents for intelligent systems.</Paragraph> <Paragraph position="1"> In this paper we have discussed the nature of text classification, which is the central task of most current text-based systems and an important component of most proposed text comprehension systems as well. We introduced a theoretical model of how text representation affects the performance of text classification systems, and described how the performance of these systems is typically evaluated.</Paragraph> <Paragraph position="2"> We also summarized the results of our ongoing research on syntactic phrase clustering. Perhaps the most important point to stress about this work is the complexity of evaluating text classification systems, particularly those involving natural language processing or machine learning techniques, and the need to examine results carefully.</Paragraph> <Paragraph position="3"> This should not discourage evaluation, however. If it is difficult to verify good text classification techniques through controlled experiments, it is impossible to do so purely through intuition or theoretical arguments. 
The history of IR is full of plausible techniques which experiment has shown to be ineffective. Only through careful evaluation will progress be likely.</Paragraph> </Section> </Paper>