<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1112"> <Title>by NSF award #IRI-9618797 STIMULATE: Generating Coherent Summaries of On-Line Documents: Combining</Title> <Section position="3" start_page="681" end_page="681" type="intro"> <SectionTitle> 2 Initial Observations </SectionTitle> <Paragraph position="0"> We initially considered two specific categories of verbs in the corpus: communication verbs and support verbs. In the WSJ corpus, the two most common main verbs are say, a communication verb, and be, a support verb. In addition to say, other high-frequency communication verbs include report, announce, and state. In journalistic prose, as the statistics in Table 1 show, at least 20% of the sentences contain communication verbs such as say and announce; these sentences report a point of view or indicate an attributed comment. In these cases, the subordinated complement represents the main event; e.g., in &quot;Advisors announced that IBM stock rose 36 points over a three-year period,&quot; there are two actions: announce and rise. In sentences with a communication verb as the main verb, we considered both the main and the subordinate verb; this decision augmented our verb count by an additional 20% and, even more importantly, captured information on the actual event in an article, not just the communication event. As shown in Table 1, support verbs, such as go (&quot;go out of business&quot;) or get (&quot;get along&quot;), constitute 30% of the verbs, and other content verbs, such as fall, adapt, recognize, or vow, make up the remaining 50%. If we exclude all support-type verbs, the remaining 70% of the verbs yield information that answers the question &quot;what happened?&quot; or &quot;what did X do?&quot;</Paragraph> </Section> </Paper>