<?xml version="1.0" standalone="yes"?> <Paper uid="W00-0905"> <Title>Verb Subcategorization Frequency Differences between Business-News and Balanced Corpora: The Role of Verb Sense Douglas Roland, Daniel Jurafsky, Lise Menn, Susanne Gahl, Elizabeth Elder and Chris</Title> <Section position="4" start_page="28" end_page="32" type="metho"> <SectionTitle> 3 Subcategorization Frequency </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="28" end_page="30" type="sub_section"> <SectionTitle> 3.1 Methodology: </SectionTitle> <Paragraph position="0"> For the second experiment, we coded the examples of the 64 verbs from each of the three corpora for transitivity. We counted any use with a direct object as transitive, and any other use, such as with a prepositional phrase, as intransitive. Passive uses were also counted as transitive. Examples ( 1 ) and ( 2 ) illustrate intransitive uses, example ( 3 ) illustrates transitive (and active) while examples ... verb sense can affect verb subcategorization.</Paragraph> <Paragraph position="1"> We therefore controlled for verb sense by including in our counts only sentences that used the majority sense of the verb. For example, we did not include instances of drop that were phrasal verbs with distinct senses like &quot;drop in&quot; or &quot;drop off&quot;. We did, however, include metaphorical extensions of the main sense, such as a company &quot;dropping a product line&quot;. We thus used a broadly defined notion of sense rather than the more narrowly defined word senses used in some on-line word sense resources such as WordNet. This was partly for logistical reasons, since such fine-grained senses are very hard to code, and partly because we suspected that very narrowly defined senses frequently have only one possible subcategorization. 
Coding for such senses would thus have biased our experiment toward finding a strong link between sense and subcategorization bias.</Paragraph> <Paragraph position="2"> We calculated transitivity biases for each of the 64 verbs in each of the three corpora. We classed the verbs as high transitivity if more than 2/3 of the tokens of the major sense were transitive, low transitivity if more than 2/3 of the tokens of the major sense were intransitive, and as mixed otherwise. We removed from consideration any token of the verb that was not used in its major sense. If subcategorization biases are related to verb sense, we would expect the transitivity biases to be stable across corpora once secondary senses are removed from consideration.</Paragraph> </Section> <Section position="2" start_page="30" end_page="30" type="sub_section"> <SectionTitle> 3.2 Results: </SectionTitle> <Paragraph position="0"> Nine of the 64 verbs, shown in Table 5, had a significant shift in transitivity bias. These verbs had a different high/mixed/low transitivity bias in at least one of the three corpora.</Paragraph> </Section> <Section position="3" start_page="30" end_page="32" type="sub_section"> <SectionTitle> 3.3 Discussion: </SectionTitle> <Paragraph position="0"> In general, these shifts in transitivity arose because the verbs differed in sense between the corpora: the senses had different subcategorizations, but were still within our broadly defined 'main sense' for that verb.</Paragraph> <Paragraph position="1"> For seven of the nine verbs, the shifts in transitivity result from differences between the WSJ data and the other data, which in turn result from the WSJ being biased towards business-specific uses of these verbs. 
For example, in the BNC and Brown data, 'advance' is a mixture of transitive and intransitive uses, shown in ( 6 ) and ( 7 ), while intransitive uses describing share price changes ( 8 ) dominate in the WSJ data.</Paragraph> <Paragraph position="2"> ( 6 ) BNC intransitive: In films, they advance in droves of armour across open</Paragraph> <Paragraph position="4"> &quot;moral careers&quot; as another useful concept ... ( 8 ) WSJ intransitive: Of the 4,345 stocks that changed hands, 1,174 declined and 1,040 advanced.</Paragraph> <Paragraph position="5"> 'Crack' is used to mean 'make a sound' ( 9 ) or 'break' ( 10 ) in the Brown and BNC data (both of which have transitive and intransitive uses), while it is more likely to be used to mean 'enter or dominate a group/market' (transitive use) in ... haven't yet been able to crack Saatchi's clubby inner circle, or to have significant influence on company strategy.</Paragraph> <Paragraph position="6"> ( 12 ) WSJ transitive: ... big investments in &quot;domestic&quot; industries such as beer will make it even tougher for foreign competitors to crack the Japanese market.</Paragraph> <Paragraph position="7"> 'Float' is generally used as an intransitive verb ( 13 ), but must be used transitively when used in a financial sense ( 14 ).</Paragraph> <Paragraph position="9"> its big paper and British retailing businesses via share issues to existing holders.</Paragraph> <Paragraph position="10"> 'Relax' is generally used intransitively ( 15 ), but is used transitively in the WSJ data when discussing the relaxation of rules and credit ( 16 ).</Paragraph> <Paragraph position="11"> ( 15 ) BNC intransitive: The moment Joseph stepped out onto the terrace the worried faces of Tran Van Hien and his wife relaxed with relief.</Paragraph> <Paragraph position="12"> ( 16 ) WSJ transitive: Ford is willing to bid for 100% of Jaguar's shares if both the government and Jaguar shareholders agree to relax the anti-takeover barrier prematurely. 
'Soften' is generally used transitively ( 17 ), but is used intransitively in the WSJ data when discussing the softening of prices ( 18 ) and ( 19 ).</Paragraph> <Paragraph position="13"> ( 17 ) Brown transitive: Hardy would not allow sentiment to soften his sense of the irredeemable pastness of the past, and the eternal deadness of the dead.</Paragraph> <Paragraph position="14"> ( 18 ) WSJ intransitive: A spokesman for Scott says that assuming the price of pulp continues to soften, &quot;We should do well.&quot; ( 19 ) WSJ intransitive: The stock has since softened, trading around $25 a share last week and closing yesterday at $23.00 in national over-the-counter trading.</Paragraph> <Paragraph position="15"> 'Surrender' is used both transitively ( 20 ) and intransitively ( 21 ), but must be used transitively when discussing the surrender of particular items such as 'stocks' ( 22 ) and ( 23 ). ( 20 ) BNC transitive: In 1475 Stanley surrendered his share to the crown...</Paragraph> <Paragraph position="17"> to save bloodshed, surrendered under the promise that they would be treated as neighbors ( 22 ) WSJ transitive: Holders can... surrender their shares at the per-share price of $1,000, plus accumulated dividends of $6.71 a share.</Paragraph> <Paragraph position="18"> ( 23 ) WSJ transitive: ... Nelson Peltz and Peter W. May surrendered warrants and preferred stock in exchange for a larger stake in Avery's common shares.</Paragraph> <Paragraph position="19"> The verb 'fight' is the only verb that has a different transitivity bias in each of the three corpora; with all other verbs, at least two corpora share the same bias. In the WSJ, 'fight' tends to be used transitively, describing action against a specific entity or concept ( 24 ). In the other two corpora, there are more descriptions of actions for or against more abstract concepts ( 25 ) and ( 26 ). 
In addition, the WSJ differences may be further influenced by a journalistic practice of dropping the preposition 'against' in the phrase 'fight against'.</Paragraph> <Paragraph position="20"> ( 24 ) WSJ transitive: Los Angeles County Supervisor Kenneth Hahn yesterday vowed to fight the introduction of double-decking in the area.</Paragraph> <Paragraph position="21"> ( 25 ) BNC intransitive: He fought against the United Nations troops in the attempted Katangese secession of nineteen sixty to sixty-two. ( 26 ) Brown intransitive: But he would fight for his own liberty rather than for any abstract principle connected with it -- such as &quot;cause&quot;. The verb 'study' is generally transitive ( 27 ), except in the Brown data, where 'study' is frequently used with a prepositional phrase ( 28 ) or to generically describe the act of studying ( 29 ). We are currently investigating what might be causing this difference; possible candidates include language change (since Brown is much older than BNC and WSJ), British-American differences, or micro-sense differences.</Paragraph> <Paragraph position="22"> ( 27 ) BNC transitive: A much more useful and realistic approach is to study recordings of different speakers' natural, spontaneous ... ( 28 ) Brown intransitive: In addition, Dr.</Paragraph> <Paragraph position="23"> Clark has studied at Rhode Island State College and Massachusetts Institute of Technology.</Paragraph> <Paragraph position="24"> ( 29 ) Brown intransitive: She discussed in her letters to Winslow some of the questions that came to her as she studied alone.</Paragraph> <Paragraph position="25"> The verb 'flood' is used intransitively more often in the BNC than in the other corpora.</Paragraph> <Paragraph position="26"> The Brown and WSJ uses tend to be transitive non-weather uses of the verb 'flood' ( 30 ) and ( 31 ), while the BNC uses include more weather uses, which are more likely to be intransitive ( 32 ). 
We are investigating whether this is a result of the BNC discussing weather more often, or a result of the particular grammatical structures used to describe weather-related floods in British and American English.</Paragraph> <Paragraph position="27"> ( 30 ) WSJ transitive: Lawsuits over the harm caused by DES have flooded federal and state courts in the past decade.</Paragraph> <Paragraph position="28"> ( 31 ) Brown transitive: The terrible vision of the ghetto streets flooded his mind.</Paragraph> <Paragraph position="29"> ( 32 ) BNC intransitive: ... should the river flood, as he'd observed it did after heavy rain, the house was safe upon its hill.</Paragraph> <Paragraph position="30"> Conclusion The goal of the work performed in this paper was to find a stable set of transitivity biases for 64 verbs to provide norming data for psychological experiments.</Paragraph> <Paragraph position="31"> The first result is that 55 out of the 64 single-sense verbs analyzed did not change in transitivity bias across corpora. This suggests that for our goal of providing transitivity biases for single-sense verbs, the influence of American vs. British English and of broad-based vs. narrow corpora may not be large. We would, however, expect larger cross-corpus differences for verbs that are more polysemous than our particular set of verbs.</Paragraph> <Paragraph position="32"> The second result is that for the 9 out of 64 verbs that did change in transitivity bias, the shift was largely a result of subtle shifts in verb sense between the genres present in each corpus. 
These two results suggest that when verb sense is adequately controlled for, verbs have stable subcategorization probabilities across corpora.</Paragraph> <Paragraph position="33"> One possible future application of our work is that the verb frequencies and subcategorization probabilities of multi-sense verbs might be used to measure the degree of difference between corpora.</Paragraph> </Section> </Section> </Paper>