<?xml version="1.0" standalone="yes"?> <Paper uid="W03-0506"> <Title>A Study for Documents Summarization based on Personal Annotation</Title> <Section position="7" start_page="1" end_page="3" type="concl"> <SectionTitle> 5 Experiments </SectionTitle> <Paragraph position="0"> Since our approaches are based on annotated documents, we need to collect users' annotations for a set of documents. In addition, users are required to supply a summary of several sentences that reflects the main ideas of each document. We supply an annotating tool called &quot;Annotation&quot; that can be used to highlight words, phrases, or sentences the user is interested in and to select important sentences into the summary.</Paragraph> <Paragraph position="1"> The original data comes from Yahoo news, which is of interest to most users. The data covers eight categories: business, sports, entertainment, world, science, technology, politics, and oddly enough, with about 6,000 documents in total. Graphics and tags were removed from the documents in preprocessing before the experiments. We hired five students for 10 days to annotate the documents; each student was supplied with 20 percent of the documents and was allowed to choose the articles they found interesting. Users were told to make annotations freely and to write summaries that reflect the main ideas of the text. Annotating and summarizing were done independently by each user, that is, at different times. In the end, we collected 1,098 distinct annotated documents, each consisting of a set of annotations and a human-made summary.</Paragraph> <Paragraph position="2"> The statistics for the five annotators are presented in Table 1, which shows that the average summary length (number of sentences) is 6.11 and the average number of annotations is 11.86.</Paragraph> <Section position="1" start_page="1" end_page="3" type="sub_section"> <SectionTitle> 5.1 Summarization Evaluation </SectionTitle> <Paragraph position="0"> Since different users may have different annotation styles, we run the experiments separately on each individual's data.</Paragraph> <Paragraph position="1"> In the experiments, the keyword threshold a is set to 2, which is reasonable since most keywords occur at least twice.</Paragraph> <Paragraph position="2"> The threshold for summary similarity, which relates to sentence precision, is 0.3 (taking its square root, this in fact means that about 55% of the sentences are correct). The summarizer produces the same number of sentences as the corresponding manual summary, as in (Kupiec et al., 1995); therefore, precision and recall are identical for the sentence-level comparison of summaries.</Paragraph> <Paragraph position="3"> First, we experiment with different annotation and context weights. The results are presented in Table 2 (the meanings of the columns are explained below). It is clear that considering context improves summarization performance over ignoring it, so in our later experiments we set the context weight g = 1 and the annotation weight b = 1.</Paragraph>
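As a rough illustration of how the annotation weight b and the context weight g could enter the sentence scoring, the following minimal Python sketch assumes a simple linear combination of a base relevance score, an annotation-overlap score, and a context-overlap score. The actual scoring function is the one defined earlier in the paper; the combination below and the helper names (keyword_overlap, score_sentence) are illustrative assumptions only.

    # Sketch only: the linear combination below is an assumption for illustration,
    # not the paper's actual scoring function.
    def keyword_overlap(sentence_words, keywords):
        """Fraction of the sentence's words that belong to the given keyword set."""
        if not sentence_words:
            return 0.0
        return sum(1 for w in sentence_words if w in keywords) / len(sentence_words)

    def score_sentence(sentence_words, doc_keywords, annotated_keywords, context_keywords, b=1.0, g=1.0):
        base = keyword_overlap(sentence_words, doc_keywords)          # generic relevance
        annot = keyword_overlap(sentence_words, annotated_keywords)   # overlap with the user's annotations
        ctx = keyword_overlap(sentence_words, context_keywords)       # overlap with annotation contexts
        return base + b * annot + g * ctx

    def summarize(sentences, doc_keywords, annotated_keywords, context_keywords, summary_length, b=1.0, g=1.0):
        """Pick the top-scoring sentences; summary_length matches the manual summary."""
        ranked = sorted(sentences,
                        key=lambda s: score_sentence(s.split(), doc_keywords,
                                                     annotated_keywords, context_keywords, b, g),
                        reverse=True)
        return ranked[:summary_length]

Setting b = 0 and g = 0 in this sketch reduces it to a purely generic summarizer, mirroring the generic baseline used in the comparisons below.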
<Paragraph position="4"> For ease of presentation in Table 1, &quot;Pi&quot; denotes the different users; &quot;DN&quot; is the document number; &quot;ASN&quot; is the average number of sentences; &quot;ASL&quot; is the average summary length; and &quot;ANN&quot; is the average number of annotations.</Paragraph> <Paragraph position="5"> For two compared summaries, &quot;SS&quot; means summary similarity; &quot;SP&quot; means sentence precision for conditional match; &quot;PP&quot; means sentence precision for perfect match; &quot;KP&quot; means keyword precision; and &quot;KR&quot; means keyword recall (a small computational sketch of these measures is given at the end of this subsection). From the above table and figure, we find that annotation-based summarization is much better than generic summarization, and the improvements are encouraging. In the case of user P4, cosine similarity increases by 10.1%; sentence precision for conditional match by 13.57%; precision for perfect match by 17.59%; keyword precision by 11.6%; and keyword recall by 13.18%, which shows that annotations can help a great deal to improve summarization performance.</Paragraph> <Paragraph position="6"> (Table 3: comparison between generic and annotated summaries for the 5 users' data.)</Paragraph> <Paragraph position="7"> Figure 1 shows the average performance comparison over all 1,098 documents. Compared with generic summarization, the cosine similarity of annotation-based summarization increases by 6.47%; sentence precision for conditional match by 7.99%; precision for perfect match by 10.62%; keyword precision by 7.14%; and keyword recall by 8.61%. Most notable is that the improvement in sentence precision for perfect match exceeds 10%; since a perfect match requires the two sentences to be identical, this gain is particularly significant and shows that, in general, annotations contribute substantially to summarization performance. In Figure 1, &quot;BA&quot; means the 'first Best' Annotated summary, which is explained in the next subsection.</Paragraph> <Paragraph position="8"> In Table 3 and later figures, &quot;G&quot; means the comparison of the manual summary with the generic summary, and &quot;A&quot; means the comparison of the manual summary with the annotated summary.</Paragraph> </Section>
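To make the measures used above concrete, the following minimal Python sketch shows one plausible reading of summary similarity (SS) as cosine similarity over term-frequency vectors, sentence precision for perfect match (PP), and keyword precision and recall (KP, KR). The conditional-match criterion behind SP and any stemming or weighting details are defined elsewhere in the paper, so the exact definitions used here are assumptions.

    import math
    from collections import Counter

    def summary_similarity(summary_a, summary_b):
        """SS: cosine similarity between term-frequency vectors of two summaries."""
        va, vb = Counter(summary_a.split()), Counter(summary_b.split())
        dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
        norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
        return dot / norm if norm else 0.0

    def perfect_match_precision(system_sents, manual_sents):
        """PP: fraction of system sentences appearing verbatim in the manual summary.
        Both summaries have the same length, so this value also equals recall."""
        if not system_sents:
            return 0.0
        manual = set(manual_sents)
        return sum(1 for s in system_sents if s in manual) / len(system_sents)

    def keyword_precision_recall(system_keywords, manual_keywords):
        """KP and KR over the keyword sets of the two summaries."""
        sys_kw, man_kw = set(system_keywords), set(manual_keywords)
        overlap = len(sys_kw & man_kw)
        kp = overlap / len(sys_kw) if sys_kw else 0.0
        kr = overlap / len(man_kw) if man_kw else 0.0
        return kp, kr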
<Section position="2" start_page="3" end_page="3" type="sub_section"> <SectionTitle> 5.2 Annotation Evaluation </SectionTitle> <Paragraph position="0"> In the previous experiments we found that the average number of annotations was 11.86, much higher than the summary length. We wondered whether there is a relation between the number of annotations and summarization performance, so we conducted the following annotation evaluations to study how the number of annotations affects summarization performance.</Paragraph> <Paragraph position="1"> The first experiment finds the best summary obtainable by selecting the first k annotations (k ≤ n, where n is the total number of annotations), which we call the &quot;first Best Annotated summary&quot;. From Figure 1 we can see that when all annotations are used, performance falls roughly midway between the generic summary and the first Best Annotated summary (labeled &quot;BA&quot;). However, we found that in some annotated documents certain annotations are beyond the scope of the manual summary. This means that some annotations are noise for summarization and that using all of them cannot reach the best performance; performance must therefore change as more annotations are considered, which motivates us to explore the relationship between the number of annotations and summarization performance.</Paragraph> <Paragraph position="2"> The next experiment observes how the average summarization performance evolves as we select any k of the annotations. That is, for each annotation number k ≤ n, we average the summarization performance over all possible combinations of k annotations (a small sketch of this enumeration is given at the end of this subsection).</Paragraph> <Paragraph position="3"> Figure 2 presents such a plot for the document &quot;sports_78.html&quot;, indicating how the number of annotations affects summarization performance. The document has 15 annotations in total; &quot;0&quot; stands for the generic summary performance. When only one annotation is considered, performance drops slightly (in this document some single annotations are far from the corresponding summary), but as more annotations are considered, performance begins to increase slowly, reaches its best at 12 annotations, and then begins to drop again. For other documents we find a similar pattern: at first performance increases with the number of annotations, but after a certain point it fluctuates and sometimes drops slightly. This holds for all five evaluation measures, which is consistent and reasonable, since too many annotations introduce a degree of noise that biases the summarization process. We conclude from this evolution that not all annotations are valuable for summarization. In Figure 2 the best and worst performance points can be identified; for example, the best point is at 12 annotations and the worst at 1. For a subset of 10 documents, summary similarity comparisons among generic, best, worst, and all-annotation-based summarization are shown in Figure 3.</Paragraph> <Paragraph position="4"> Different documents have different annotations, which influence summarization differently. In most cases the best summaries are better than the all-annotation-based summaries, which in turn are better than the generic and worst summaries. There is an exception in Figure 3 at Document 5: in this document some of the annotations are irrelevant to the summary. For example, the proportion of annotated sentences that appear in the summary is 28.57%, and the proportion of summary keywords that are annotated is only 17.19%, whereas for Document 6 the corresponding values are 50.00% and 32.73%. This indicates that the user's interests drifted as he read the document.</Paragraph> <Paragraph position="5"> When annotations are consistent with users' summaries, they help to improve the personalization of summarization, and the more annotations the better. Conversely, when annotations are far from users' summaries, their influence may be negative; for example, some subsets of annotations make performance worse. In general, however, the former case accounts for a much larger proportion than the latter. Across the documents in Figure 3, the main trend is that annotation-based summaries are better than summaries produced without considering annotations; the average improvement over the 12 documents is 13.49%.</Paragraph> </Section>
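The subset-averaging experiment in this subsection can be sketched as follows: for each k ≤ n, enumerate every k-combination of the annotations, summarize with that subset, and average the resulting score. In this Python sketch the summarize and evaluate callables stand in for the paper's summarizer and for whichever of the five measures is being averaged; they are placeholders rather than the actual implementation.

    from itertools import combinations

    def average_performance_by_k(annotations, summarize, evaluate):
        """For each subset size k (k = 0 corresponds to the generic summary),
        average the evaluation score over all k-combinations of annotations.
        summarize(subset) returns a summary; evaluate(summary) scores it against
        the manual summary. Exhaustive enumeration is feasible because a document
        carries only a handful of annotations (about 12 on average)."""
        results = {}
        for k in range(len(annotations) + 1):
            scores = [evaluate(summarize(list(subset)))
                      for subset in combinations(annotations, k)]
            results[k] = sum(scores) / len(scores)
        return results

Plotting results[k] against k yields curves of the kind described above for &quot;sports_78.html&quot; in Figure 2.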
<Section position="3" start_page="3" end_page="3" type="sub_section"> <SectionTitle> 5.3 Collaborative Filtering </SectionTitle> <Paragraph position="0"> Another part of our experiments concerns collaborative filtering, which measures how much one user's annotations affect other users' summaries. To do this, we additionally invited the same 5 users to annotate and summarize 100 common documents; after removing some bad data, 91 common documents remained. For each user's data, we test whether his annotations are helpful when applied to the other users' summarization. Figure 4 presents the contribution of user P3's annotations to all five users' summarization. As expected, P3's annotations are most helpful to P3's own summaries; they also contribute to P2's, P4's, and P1's summaries, in decreasing order of importance, but have a negative effect on P5's summarization. This indicates that P3's interests are closest to P2's and farthest from P5's. This is understandable, since different users have different preferences within a document, so their annotation styles may vary significantly.</Paragraph> <Paragraph position="1"> For validation, we also make the reverse plot in Figure 5, which presents the contributions of all users' annotations to P3's summarization. The results are similar: among the other four users, P2's annotations contribute the most to P3's summarization. In fact, we found that most of P2's and P3's annotations are similar for most documents. For example, in the document &quot;business_642.txt&quot;, titled &quot;Stocks Sag; EDS Warnings Whacks IBM Lower&quot;, both P2 and P3 annotated &quot;drooped, Computer makers, Homebuilders, Federal Reserve&quot;, which accounts for 27% of P2's annotations and 36% of P3's annotations, while in another document the corresponding values are 36% and 50%.</Paragraph> <Paragraph position="2"> In Table 4 we calculate the cosine similarity of annotations for every pair of users and find that, on average, 37% of P3's and P2's annotations are consistent, but only 29% of P3's and P5's. This confirms that P2 and P3 fall into one interest group, while P5 belongs to another.</Paragraph>
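The pairwise values in Table 4 can be read as cosine similarities between users' annotation-keyword profiles. The Python sketch below is one plausible construction (a bag of annotated keywords per user, pooled over the 91 common documents); the paper does not spell out the exact vector construction, so this representation is an assumption.

    import math
    from collections import Counter

    def annotation_profile(annotations_per_doc):
        """Pool a user's annotated keywords over all common documents into one frequency vector."""
        profile = Counter()
        for keywords in annotations_per_doc:   # one list of annotated keywords per document
            profile.update(keywords)
        return profile

    def annotation_similarity(profile_a, profile_b):
        """Cosine similarity between two users' annotation profiles, as reported in Table 4."""
        dot = sum(profile_a[w] * profile_b[w] for w in profile_a.keys() & profile_b.keys())
        norm_a = math.sqrt(sum(c * c for c in profile_a.values()))
        norm_b = math.sqrt(sum(c * c for c in profile_b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

User pairs with high profile similarity (such as P2 and P3 here) would then fall into the same interest group for collaborative filtering.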
<Paragraph position="3"> This paper introduces a new document summarization approach based on users' annotations. Annotations are used to help generate a personalized summary for a particular document, and contexts are also considered to compensate for the incompleteness of annotations. Two types of summaries are produced for each document: a generic summary that does not consider annotations and an annotation-based summary. Performance comparisons between the two show experimentally that annotations are quite useful for improving the personalization of summarization compared with ignoring annotations.</Paragraph> <Paragraph position="4"> We carried out extensive evaluations of the relationship between annotations and summaries and concluded that annotation selection is necessary to obtain the best summarization. We will try to identify how to choose the subset of annotations that makes summarization best, which is challenging because users' interests vary across documents.</Paragraph> <Paragraph position="5"> We also applied collaborative filtering to determine whether one user's annotations are helpful for another user's summarization. The answer is positive: summarization performance can be improved using similar users' annotations. As a next step, we believe that &quot;similar users&quot; needs a more precise definition to improve the performance of collaborative filtering.</Paragraph> <Paragraph position="6"> As an extension of collaborative filtering, more work will be done on multi-document summarization based on annotations of similar documents, which will help users obtain a global personal view of a set of documents.</Paragraph> </Section> </Section></Paper>