File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/06/w06-2804_intro.xml
Size: 14,370 bytes
Last Modified: 2025-10-06 14:04:03
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-2804"> <Title>An analysis of Wikipedia digital writing</Title> <Section position="3" start_page="0" end_page="18" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> The encyclopaedia's structure, whether hierarchical or alphabetically ordered, together with its evolving nature, is particularly adaptable to a disk-based or online format. All major printed encyclopaedias have moved to this method of delivery. Online e-ncyclopedias can include multimedia (such as video, sound clips and animated illustrations) unavailable in the printed format. They can make use of hypertext cross-references between conceptually related items and, furthermore, they offer the additional advantage of being dynamic: new and frequently updated information can be presented almost immediately, rather than waiting for the next release of a static format (as with a paper or disk publication).</Paragraph> <Paragraph position="1"> This research is based in particular on a contrastive linguistic analysis of Wikipedia and Encyclopaedia Britannica Online. The latter is considered one of the greatest examples of a general encyclopaedia in the English-speaking world. It contains 120,000 articles which are commonly considered accurate, reliable and well-written. Brief article summaries can be viewed for free on the net, while the full text is available only to individuals with a monthly or yearly subscription.</Paragraph> <Paragraph position="2"> On the other hand, Wikipedia is a collaborative authoring project on the web, a repository of encyclopaedic knowledge, an example of a collaborative hypermedium focused on a common project. It is one of the most popular reference websites, receiving around 50 million hits per day. It is a social e-democracy environment, designed with the goal of creating a free encyclopedia containing information on all subjects, written collaboratively by volunteers. 
At the time of writing this paper the project has produced over two and a half million articles and has been officially recognized as the largest international online community. It consists of 200 independent language editions, and the English version is the biggest one with more than 962,995 articles (up to January 2006).</Paragraph> <Paragraph position="3"> 2. Wiki as new textual genre With reference to the extensive empirical studies of Susan Herring on computer-mediated communication (CMC), wikis and blogs, considered as spaces belonging to the second web generation, can be regarded as adding new peculiarities to the existing synchronous and asynchronous tools of the first CMC generation (such as e-mail, mailing lists, forums and chat). It is well known in media studies that &quot;the medium is the message&quot;, as McLuhan (1964) pointed out in the sixties, and in fact the medium adds unique properties to the web genre in terms of production, function, and reception which cannot be ignored. Wikis are co-authoring tools which allow collective collaboration. They can be, simultaneously, a repository of information and an asynchronous tool of communication and discussion across the web (see Wikipedia). All wikis have integrated search engines for locating content and are open to anyone since they are considered a public space, even though they can be protected against unauthenticated users.</Paragraph> <Paragraph position="4"> Their main aim is to create documents.</Paragraph> <Paragraph position="5"> Wikis, unlike traditionally designed web sites, encourage &quot;topical writing&quot; by using wiki links and creating a wide network of interconnected pages. Interlinking is simple to type: the writer just puts the word(s) in square brackets. 
This simultaneously creates a new topic title (a WikiWord), a new writing space for that topic and a link to that space.</Paragraph> <Paragraph position="6"> Once created, a topic is available anywhere on the wiki: whenever the WikiWord is typed, it links to the writing space of that topic (Morgan, 2006).</Paragraph> <Paragraph position="7"> The writer, the supreme authority in print, is considered the one who transmits content through paper pages to passive readers, whose role is merely to decode and interpret the message. The electronic writing space, being hypertextual and extremely flexible, changes the landscape. Writers can create multiple structures from the same topics (hierarchy, web, spiral, etc.) and readers can enter, browse and leave the text at many points. In the hypertext, the author creates different paths for the reader, although there is neither a canonical path nor a defined page order to follow. The new active readers, by making their choices, become co-authors of the hypertext (Bolter, 1991). This idea is more pronounced on a wiki than elsewhere, because in an open wiki the reader can (if allowed) actually intervene in the process, rewriting, changing, erasing and modifying the original text or creating new topics.</Paragraph> <Paragraph position="8"> Traditional writing creates a gap between writer and reader. Wiki technology mediates the gap because the two actors assume interchangeable roles in this new open e-environment. To conclude, wiki text is never static, as it is always revisable, and it is a-temporal, as nodes continually change through the collaborative writing process, creating a never-ending, evolving network of topics. Thus, knowledge becomes webbed and contextualized, though it remains temporary as it can always be changed or vandalized. Luckily, the original version can always, and easily, be recovered (Morgan, 2006).</Paragraph> <Paragraph position="9"> Wikis offer two different writing modes. 
The first one is known as &quot;document mode&quot;. When it is used, contributors create documents collaboratively and can leave their additions to articles. Multiple authors can edit and update the content of documents, which gradually become representations of the contributors' shared knowledge (Leuf and Cunningham, 2001).</Paragraph> <Paragraph position="10"> Wikis have two states, &quot;Read&quot; and &quot;Edit&quot;. &quot;Read state&quot; is the default. In this case, wiki pages look just like normal webpages. When the user wants to edit a page, he/she need only access the &quot;edit state&quot;.</Paragraph> <Paragraph position="11"> &quot;Document mode&quot; is expository, extensive, monological, formal, refined and less creative than &quot;thread mode&quot;. It is written in the third person and is unsigned. &quot;Document mode&quot; demonstrates that knowledge is collective and that the ideas, not the writers, are the main focus. Writers contribute to &quot;document mode&quot; by refactoring, reorganizing, incorporating and synthesizing &quot;thread mode&quot; comments into encyclopaedic articles and changing the first person to the third (Morgan, 2006).</Paragraph> <Paragraph position="12"> The second wiki writing mode is &quot;thread mode&quot;. Contributors carry out discussions by posting signed messages on the discussion page connected to the main article. Others reply to the original message and so a group of threaded messages evolves (Morgan, 2006).</Paragraph> <Paragraph position="13"> &quot;Thread mode&quot; is dialogical, open, collective, dynamic and informal. It develops organically, without a predetermined structure. It expresses public thinking, presents multiple positions and is exploratory. Entries are phrased in the first person and are signed. 
Rather than replying to a discussion entry, the writer can refactor the page to incorporate the suggestions made, then delete the comment.</Paragraph> <Paragraph position="14"> &quot;Thread mode&quot; demonstrates that knowledge is the result of constructivist collaboration and not a solitary production.</Paragraph> <Paragraph position="15"> SysOp is the abbreviation for &quot;systems operator&quot;, and is a commonly used term for the administrator of a special-interest area of an online service.</Paragraph> <Paragraph position="16"> The page history, covering all previous versions of a page, is available on Wikipedia. It consists of text, date, time and editing authors.</Paragraph> <Paragraph position="17"> 3. Research objectives and methodology 3.1. Wikipedia vs Britannica The first objective of this research has been the investigation of Wikipedia articles and of what has been defined in this paper as &quot;WikiLanguage&quot;: the formal, neutral and impersonal language used in the official encyclopedic articles. In this phase, an analysis of randomly selected sample articles has been carried out. The data for this research in progress are based on two corpora. Up to now, they include a collection of txt files made up of one hundred articles representing topics taken from the Wiki Folksonomy's eight categories (culture, geography, history, life, mathematics, science, society, technology) and, for contrastive analysis, the same articles found in Encyclopaedia Britannica Online.</Paragraph> <Paragraph position="18"> The purpose of the quantitative research has been the empirical measurement of some linguistic features in order to define the degree of formality in the WikiLanguage. The sample articles have been analyzed through the ConcApp Concordancer Program. Different factors have been taken into consideration in order to define the formality of Britannica vs Wikipedia. 
The first aspect has been article length (total words), as conciseness was found to be a feature of formal written discourse (Chafe, 1982). The second has been average word length (in letters), as short words have been considered a characteristic of informal genres (Biber, 1988). A high level of lexical density (Halliday, 1985) has been found in formal academic writing. It has been considered the main stylistic difference between speech and writing (Biber, 1988).</Paragraph> <Paragraph position="19"> Subsequently, the number of unique lexical items in the two corpora has been measured.</Paragraph> <Paragraph position="20"> With reference to the findings of Heylighen and Dewaele (1999), the frequencies of word suffixes typical of formal genres (such as -age, -ment, -ance/ence, -ion, -ity, -ism) and of impersonal pronouns (it/they) have been calculated. A contrastive frequency of meaningful keywords has also been investigated. Folksonomy is a neologism which indicates a practice of collaborative categorization which makes use of freely chosen keywords. Taxonomy derives from Greek &quot;taxis&quot; and &quot;nomos&quot;. &quot;Taxis&quot; means classification, &quot;nomos&quot; (or nomia) management and &quot;folk&quot; people; so folksonomy means people's classification management.</Paragraph> <Paragraph position="21"> The informality of the language has been measured through the frequency of abbreviations, acronyms, contractions (I'm, don't, he's, etc.) and personal pronouns (I, we, you, he/she, they), which have been found to be typical of informal genres, such as face-to-face and phone conversations (Biber, 1988). 
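Several of the surface measures described above (total words, average word length, unique lexical items, formal suffixes, contractions and personal pronouns) can be approximated in a few lines of Python. The following is a minimal sketch and not the procedure used in this study, which relied on the ConcApp Concordancer Program. The function names, the tokenizer, the minimum stem length and the list of contraction endings are all assumptions made for illustration; only the suffix and pronoun lists come from the text.

```python
import re
from collections import Counter

# Suffixes and pronouns are the ones listed in the text; everything else
# (names, thresholds, tokenizer) is a hypothetical choice for illustration.
FORMAL_SUFFIXES = ("age", "ment", "ance", "ence", "ion", "ity", "ism")
PERSONAL_PRONOUNS = {"i", "we", "you", "he", "she", "they"}
# Assumed contraction endings; 's is left out because it is ambiguous
# between a contraction (he's) and a possessive (Britannica's).
CONTRACTION_ENDINGS = ("n't", "'m", "'re", "'ve", "'ll", "'d")
MIN_STEM = 3  # assumed minimum stem length, so "age" or "city" do not count


def tokenize(text):
    """Lowercase the text and return word tokens, keeping forms like don't."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def formality_profile(text):
    """Return rough surface counts for one article."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        raise ValueError("text contains no word tokens")

    suffix_counts = Counter()
    for token in tokens:
        for suffix in FORMAL_SUFFIXES:
            if token.endswith(suffix) and len(token) >= len(suffix) + MIN_STEM:
                suffix_counts[suffix] += 1
                break  # count each token at most once

    letters = sum(len(token.replace("'", "")) for token in tokens)
    return {
        "total_words": total,
        "avg_word_length": letters / total,
        "unique_words": len(set(tokens)),
        "formal_suffixes_per_100": 100 * sum(suffix_counts.values()) / total,
        "contractions": sum(t.endswith(CONTRACTION_ENDINGS) for t in tokens),
        # "we're" counts as "we": compare only the part before the apostrophe.
        "personal_pronouns": sum(
            t.split("'")[0] in PERSONAL_PRONOUNS for t in tokens
        ),
    }


profile = formality_profile("We don't doubt the information. The nation is a nation.")
print(profile)
```

Lexical density is deliberately left out of this sketch: separating lexical from grammatical words would require a part-of-speech tagger or a function-word list, and the choice of either would affect the resulting percentages.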
As shown in Appendix A (Fig. 1), the first results of this research, conducted on one hundred articles, have highlighted a number of differences and similarities between Wikipedia and Britannica.</Paragraph> <Paragraph position="22"> Articles in Britannica have proven to be shorter than those in Wikipedia (average length: 1728 vs 3510 words) and they have shown a higher lexical density (44.9% vs 31.4%). Although the level of total formality is clearly higher in Britannica (50.2% vs 36.6%), the frequency of formal nouns and impersonal pronouns typical of formal discourse (5.3 vs 5.2) and the average word length (in letters: 5.4 vs 5.2) have proven to be very similar. The divergent value is related to lexical density, but if text length varies widely (as happens in the two e-ncyclopedias) the proportion of different lexical items will appear much higher in the shorter text, as the relationship between text length and unique words is not linear. Each additional one hundred words of text adds fewer and fewer additional unique words (Biber, 1988). Thus, an interpretation of the collected data seems to suggest that, thanks to collective editorial control, the WikiLanguage of the co-authored articles shows a formal and standardized style similar to that found in Britannica. A table representing part of the collected data, and their graphical representation, has been provided in Appendix A (Fig. 2, 3, 4).</Paragraph> <Section position="1" start_page="17" end_page="18" type="sub_section"> <SectionTitle> 3.2 Web analysis </SectionTitle> <Paragraph position="0"> Particular attention has been devoted to Wikipedia's digital style due to the importance of the interplay between genre and medium when dealing with web-mediated texts. 
The layout of sample articles has been investigated (table of contents, extent of sections and sub-sections) as well as multimodality (tables, graphs, images, audio recordings and videos) and hypertextuality [explicative (internal bookmarks), associative (wikilinks) and explorative links (external weblinks)]. At present Wikipedia does not seem to fully exploit the potential offered by multimodality (and Britannica even less), showing few audio recordings and videos. This is probably due to the nature of Open Source software, in keeping with hackers' simple and essential style (e.g. Slashdot and Everything2), to the contributors' average technical skills and to the philosophical choice which privileges information and content over appearance. One of the prominent properties of Wikipedia is its highly dense hypertextuality when compared to Britannica. The analysis of the articles clearly reveals the abundant interlinking of Wikipedia's nodes and its dynamism, made possible by wiki software, and, by contrast, the isolation, linearity (page structure) and static nature of the corresponding Britannica articles. In this case, using Finnemann's (1999) concept of &quot;modal shifts&quot; with reference to &quot;reading mode&quot; and &quot;navigating mode&quot;, it is evident that Wikipedia articles actively stimulate the latter, allowing the reader to construct his/her own personal pathway, browsing inside and outside the website.</Paragraph> </Section> </Section> </Paper>