<?xml version="1.0" standalone="yes"?> <Paper uid="I05-2011"> <Title>Automatic Detection of Opinion Bearing Words and Sentences</Title> <Section position="2" start_page="0" end_page="1" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The growing sophistication of language processing in recent years has made it possible to address increasingly complex challenges in text analysis. One such challenge is recognizing, classifying, and understanding opinionated text. This ability is desirable for various tasks, including filtering advertisements, separating the arguments in online debates or discussions, and ranking web documents cited as authorities on contentious topics.</Paragraph> <Paragraph position="1"> The challenge is made very difficult by a general inability to define opinion. Our preliminary reading of a small selection of the available literature (Aristotle, 1954; Toulmin et al., 1979; Perelman, 1970; Wallace, 1975), as well as our own text analysis, indicates that a profitable approach to opinion requires a system to know and/or identify at least the following elements: the topic (T), the opinion holder (H), the belief (B), and the opinion valence (V). For the purposes of the various interested communities, neutral-valence opinions (such as &quot;we believe the sun will rise tomorrow&quot; or &quot;Susan believes that John has three children&quot;) are of less interest; more relevant are opinions in which the valence is positive or negative. Such valence often falls together with the actual belief, as in &quot;going to Mars is a waste of money&quot;, in which the word &quot;waste&quot; signifies both the belief (a lot [of money]) and the valence (bad/undesirable), but it need not always do so: &quot;Smith [the holder] believes that abortion should be permissible [the topic] although he thinks that is a bad thing [the valence]&quot;.
As the core first step of our research, we would like an automated system to identify, given an opinionated text, all instances of the [Holder/Topic/Valence] opinion triads it contains. Exploratory manual work has shown this to be a difficult task. We therefore simplify the task as follows. We build a classifier that simply identifies in a text all the sentences expressing a valence. Such a two-way classification is simple to set up and evaluate, since enough testing data has been created.</Paragraph> <Paragraph position="2"> As primary indicators, we note from newspaper editorials and online exhortatory text that certain modal verbs (should, must) and adjectives and adverbs (better, best, unfair, ugly, nice, desirable, nicely, luckily) are strong markers of opinion. Section 3 describes our construction of a series of increasingly large collections of such marker words. Section 4 describes our methods for organizing and combining them and using them to identify valence-bearing sentences. The evaluation is reported in Section 5.</Paragraph> </Section> </Paper>