<?xml version="1.0" standalone="yes"?>
<Paper uid="J02-4004">
<Title>Efficiently Computed Lexical Chains as an Intermediate Representation for Automatic Text Summarization</Title>
<Section position="4" start_page="492" end_page="495" type="intro">
<SectionTitle> 4. Evaluation Design </SectionTitle>
<Paragraph position="0"> Our algorithm now makes it feasible to use lexical chains as the method for identifying important concepts in a document, and thus they may now form the basis of an intermediate representation for summary generation, as proposed by Barzilay and Elhadad. An important consequence is that Barzilay and Elhadad's proposal can now be evaluated on documents of substantial size. We propose an evaluation of this intermediate stage that is independent of the generation phase of summarization.</Paragraph>
<Paragraph position="1"> This said, we make no attempt to claim that a summary can actually be generated from this representation; we do attempt, however, to show that the concepts found in a human-generated summary are indeed the concepts identified by our lexical chain algorithm.</Paragraph>
<Paragraph position="2"> The basis of our evaluation is the premise that if lexical chains are a good intermediate representation for summary generation, then each noun in a given summary should be used in the same sense as some word instance grouped into a strong chain in the original document on which the summary is based. Moreover, we would expect that all (or at least most) strong chains in the document should be represented in the summary.</Paragraph>
<Paragraph position="3"> For this analysis, a corpus of documents with their human-generated summaries is required. Although there are many examples of document and summary types, for the purposes of this experiment we focus on two general categories of summaries that are readily available. The first, scientific documents with abstracts, is a class of summaries often discussed in the literature (Marcu 1999). The second consists of chapters from university-level textbooks that contain chapter summaries. To prevent bias, textbooks from several fields were chosen.</Paragraph>
<Paragraph position="4"> In this analysis, we use the term concept to denote a noun in a particular sense (a given sense number in the WordNet database). Note that different nouns with the same sense number are considered to be the same concept, and that for the purposes of this analysis, when we refer to the &quot;sense&quot; of a word, we mean the sense as determined by our lexical chain analysis. The basic idea of our experiment is to determine whether the concepts represented by (strong) lexical chains in an original document appear in the summary of that document, and whether the concepts appearing in the summary (as determined by the lexical chain analysis of the summary) come from strong chains in the document. If both of these measures gave 100% coverage, this would mean that all and only the concepts identified by strong lexical chains in the document occur in the summary. Thus, the higher these numbers turn out to be, the more likely it is that lexical chains are a good intermediate representation for the text summarization task.</Paragraph>
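To make this notion of concept identity concrete, the following minimal sketch (our own illustration, assuming Python with NLTK's WordNet interface as a stand-in for the authors' WordNet database; sense numbering differs across WordNet versions) shows two distinct nouns resolving to the same sense and therefore counting as a single concept:

    # Sketch only: two different nouns that share a WordNet sense are the
    # same "concept" in the terminology of this evaluation.
    # Assumes NLTK with the WordNet corpus installed (nltk.download('wordnet')).
    from nltk.corpus import wordnet as wn

    car = wn.synsets('car', pos=wn.NOUN)[0]                # Synset('car.n.01')
    automobile = wn.synsets('automobile', pos=wn.NOUN)[0]  # also Synset('car.n.01')
    print(car == automobile)  # True: two nouns, one concept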
<Paragraph position="5"> A corpus was compiled containing these two types of documents, ranging in length from 2,247 to 26,320 words each. These documents were selected at random, with no screening by the authors. The scientific corpus consisted of 10 scientific articles (5 computer science, 3 anthropology, and 2 biology) along with their abstracts. The textbook corpus consisted of 14 chapters from 10 university-level textbooks in various subjects (4 computer science, 6 anthropology, 2 history, and 2 economics), including chapter summaries.</Paragraph>
<Paragraph position="6"> For each document in the corpus, the document and its summary were analyzed separately to produce lexical chains. In both cases we output the sense numbers specified for each word instance as well as the overriding sense number for each chain. By comparing the sense numbers of (words in) each chain in the document with the computed sense of each noun instance in the summary, we can determine whether the summary indeed contains the same &quot;concepts&quot; as indicated by the lexical chains. For the analysis, the specific metrics we are interested in are:
* The number and percentage of strong chains in the document such that some word occurs in the summary in the same sense as in the document strong chain. (Analogous to recall)
* The number and percentage of noun instances in the summary that represent strong chains in the document. (Analogous to precision)
By analyzing these two metrics, we can determine how well lexical chains represent the information that appears in these types of human-generated summaries. We will loosely use the terms recall and precision to describe these two metrics.</Paragraph>
<Section position="1" start_page="494" end_page="495" type="sub_section">
<SectionTitle> 4.1 Experimental Results </SectionTitle>
<Paragraph position="0"> Each document in the corpus was analyzed by running our lexical chain algorithm and collecting the overriding sense number of each strong lexical chain computed. Each summary in the corpus was analyzed by our algorithm, and the disambiguated sense (i.e., the sense of the noun instance that was selected in order to insert it into a chain) of each noun was collected. Table 5 shows the results of this analysis. The number of strong chains computed for the document is shown in column 2. Column 3 shows the total number of noun instances found in the summary. Column 4 shows the number, and percentage overall, of strong chains from the document that are represented by noun instances in the summary (recall). Column 5 shows the number, and percentage overall, of nouns of a given sense from the summary that have a corresponding strong chain with the same overriding sense number (representing the chain) in the original text (precision). Summary statistics are also presented.</Paragraph>
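The two metrics can be made concrete with a short sketch (our own illustration, not the authors' implementation; the function name and the flat lists of WordNet sense identifiers used as input are assumptions):

    # Sketch of the two coverage metrics.
    #   chain_senses   - overriding sense of each strong chain in the document
    #   summary_senses - disambiguated sense of each noun instance in the summary
    def coverage_metrics(chain_senses, summary_senses):
        chains = set(chain_senses)
        nouns = list(summary_senses)
        noun_set = set(nouns)
        # Recall analogue: strong chains whose sense occurs on some summary noun.
        covered_chains = sum(1 for c in chains if c in noun_set)
        # Precision analogue: summary nouns whose sense matches some strong chain.
        covered_nouns = sum(1 for s in nouns if s in chains)
        recall = covered_chains / len(chains) if chains else 0.0
        precision = covered_nouns / len(nouns) if nouns else 0.0
        return recall, precision

    # Example: coverage_metrics(['car.n.01', 'engine.n.01'],
    #                           ['car.n.01', 'car.n.01', 'wheel.n.01'])
    # returns (0.5, 2/3): one of two chains is covered by the summary, and
    # two of the three summary nouns are predicted by a strong chain.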
<Paragraph position="1"> In 79.12% of the cases, lexical chains appropriately represent the nouns in the summary. In 80.83% of the cases, nouns in the summary would have been predicted by the lexical chains. The algorithm performs poorly on two documents, anthropology paper 3 and computer science chapter 4, under this analysis. Possible reasons for this will be discussed below, but our preliminary analysis suggests that these documents contain a greater number of pronouns and other anaphoric expressions, which need to be resolved for lexical chains to be computed properly; these potential causes must be examined further before we can say why the algorithm performs so poorly on these documents. Excluding these two documents, our algorithm has an average recall of 83.39% and an average precision of 84.63%. It is important to note that strong chains represent only between 5% and 15% of the total chains computed for any document.</Paragraph>
<Paragraph position="2"> The evaluation presented here would be enhanced by a baseline for comparison. It is not clear, however, what that baseline should be. One possibility would be to use straight frequency counts of nouns as the indicator of important concepts and compare the summaries against those counts, as sketched below.</Paragraph>
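One way such a frequency-count baseline might look (an illustrative sketch under our own assumptions; the cutoff top_k and the use of raw noun-lemma counts in place of chain membership are hypothetical choices, not part of the evaluation described here):

    # Hypothetical frequency-count baseline: treat the k most frequent noun
    # lemmas in the document as the "important concepts" and score the
    # summary's nouns against them, mirroring the chain-based metrics above.
    from collections import Counter

    def frequency_baseline(doc_nouns, summary_nouns, top_k=20):
        top = {lemma for lemma, _ in Counter(doc_nouns).most_common(top_k)}
        summary = list(summary_nouns)
        covered_top = sum(1 for lemma in top if lemma in set(summary))
        covered_summary = sum(1 for lemma in summary if lemma in top)
        recall = covered_top / len(top) if top else 0.0
        precision = covered_summary / len(summary) if summary else 0.0
        return recall, precision

</Section> </Section> </Paper>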