<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2806">
  <Title>Interpreting Genre Evolution on the Web: Preliminary Results</Title>
  <Section position="5" start_page="35" end_page="36" type="evalu">
    <SectionTitle>
3.4 Results
</SectionTitle>
    <Paragraph position="0"> Currently, there is no widely accepted standard test for experiments in which subjects choose from a large number of categories (23 labels) for a large number of objects (25 web pages). The following paragraphs present several views and interpretations of the data, namely raw counts and percentages, Fisher's exact test, and adjusted residuals.</Paragraph>
    <Paragraph position="1"> Raw Counts and Percentages. Table 2 offers a first view of the data: it shows the number of subjects assigning a particular label to a particular web page, together with the percentage obtained by the most voted label. For example, the label eshop (8th row) was assigned to WP1 (first column) by 119 subjects (highlighted cell), which corresponds to 88.15% (bottom row). Four subjects thought that WP1 was a corporate home page (around 2.9%), seven selected net ad (around 5%), one chose front page (around 0.7%), one chose hotlist, one did not know, and two added a new label for it (around 1.4%).</Paragraph>
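    The percentages quoted for WP1 can be recomputed from the counts given in the paragraph above. The following is a minimal sketch, assuming that the seven counts listed there account for every response to WP1; the dictionary and function names are illustrative and do not come from the paper.

```python
# Vote counts for WP1 as reported in the text (label -> number of subjects).
wp1_counts = {
    "eshop": 119,
    "net ad": 7,
    "corporate home page": 4,
    "new label": 2,
    "front page": 1,
    "hotlist": 1,
    "don't know": 1,
}


def label_percentages(counts):
    """Return each label's share of all votes for one page, in percent."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def most_voted(counts):
    """Return (label, percentage) for the label with the most votes."""
    shares = label_percentages(counts)
    label = max(shares, key=shares.get)
    return label, shares[label]


total_subjects = sum(wp1_counts.values())
top_label, top_share = most_voted(wp1_counts)
print(total_subjects, top_label, round(top_share, 2))
```

    Under that assumption the counts sum to 135 respondents, which is consistent with 119 votes corresponding to 88.15%.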
    <Paragraph position="2"> Three ranges of agreement can be identified in this table. The top range includes web pages with a percentage of agreement above 80%; the middle range groups web pages with an agreement between 79% and 50%; finally, the bottom range contains web pages with an agreement between 49% and 20%. Table 1 lists the web pages by percentage of agreement.</Paragraph>
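    The three ranges can be expressed as a small binning function. This is a sketch under two assumptions that the text does not state explicitly: the boundaries are treated as 80, 50 and 20 percent inclusive, and every page was rated by the same 135 subjects (a total inferred from 119 votes corresponding to 88.15%). The vote counts are those reported in this section; the function name is illustrative.

```python
TOTAL_SUBJECTS = 135  # assumption: inferred from 119 votes = 88.15% for WP1


def agreement_range(percentage):
    """Map a page's top-label agreement (in percent) to a range name."""
    if percentage >= 80.0:
        return "top"
    if percentage >= 50.0:
        return "middle"
    if percentage >= 20.0:
        return "bottom"
    return "below bottom"  # not a range named in the text


# Votes for the most voted label, as reported in the text for three pages.
top_votes = {"WP1": 119, "webpage_type_03": 66, "webpage_type_17": 57}

ranges = {
    page: agreement_range(100.0 * votes / TOTAL_SUBJECTS)
    for page, votes in top_votes.items()
}
print(ranges)
```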
    <Paragraph position="3"> From these ranges a first conclusion can be drawn. According to the ranges shown in Table 1, participants show the highest agreement on what we selected as &quot;easy web genres&quot;, except in three cases: front page, net ad and splash screen, which seem to be among the least agreed upon (see bottom range). The middle range includes most of the ambiguous web genres together with ezine, which was deemed to be difficult by the author. The bottom range includes the rest of the ambiguous genres, together with other difficult web pages and three web pages that were expected to fall in the top range: webpage_type_04 (front page), webpage_type_24 (splash screen) and webpage_type_25 (net ad).</Paragraph>
    <Paragraph position="4"> We now have a first picture of users' perception of some web pages in relation to some web genre labels. The hypothesized genre recognition pattern was mostly confirmed in the top range, but slightly reshuffled in the middle and bottom ranges. Figure 1 shows the charted percentages.</Paragraph>
    <Paragraph position="5"> Footnote 1: WP1, WP2, WP3, etc. are short forms of webpage_01, webpage_02, webpage_03, etc.</Paragraph>
    <Paragraph position="6"> Fisher's Exact Test. The percentages in the bottom row of Table 2 can be interpreted as the conditional distribution of the most voted label (response variable) given the web page type (explanatory variable). In other words, they refer to the sample distribution of most voted labels, conditional on the web page type. In terms of association, this means that if the two variables are related, the distribution of the response variable (the label) changes with the value of the explanatory variable (the web page type). Table 2 suggests the existence of an association between the label and the web page to which this label was assigned. But as Table 2 refers to the sample rather than the population, it provides evidence, not a final answer, on whether labels and web page types are associated in the way suggested by the percentages. To see whether it is plausible that labels and web page types are associated in the population, Fisher's exact test can be calculated. The value returned for this test by SPSS is 9292.275, which is large enough to reject the hypothesis that labels and web page types are independent. This statistically significant association shows that the web pages chosen by the author to represent some web genres mostly match the subjects' perception of these web pages. It also shows that many genre labels are acknowledged by the users and are consistently associated with web pages.</Paragraph>
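    To make the logic of an exact test of independence concrete, the sketch below implements the two-sided Fisher's exact test for a 2x2 table only, using the hypergeometric distribution. It is not the computation SPSS performed on the full table of 23 labels by 25 web pages, and it does not reproduce the reported value; the example counts are hypothetical and chosen only for illustration.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        prob(x) for x in range(lo, hi + 1) if p_obs * (1 + 1e-9) >= prob(x)
    )


# Hypothetical toy table: rows are two pages, columns are two labels.
p_value = fisher_exact_2x2(3, 1, 1, 3)
print(round(p_value, 4))
```

    A small p-value would count as evidence against independence of rows and columns; for larger tables the same idea applies, but the enumeration is much more expensive and is usually left to a statistics package.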
    <Paragraph position="7"> Adjusted Residuals. A test statistic, such as Fisher's exact test, and its statistical significance summarize the strength of evidence against the null hypothesis of independence, but do not indicate how many and which cells deviate greatly from this hypothesis. Residuals, i.e. the differences between expected and observed cell frequencies, can help in this task. In particular, adjusted residuals can indicate whether the cell counts are significantly different from what independence predicts. A large adjusted residual provides evidence against independence in that cell. As Table 3 mostly matches Table 2, a significant association between genre labels and web page types in the cells containing the most voted labels is confirmed.</Paragraph>
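    Adjusted residuals can be computed directly from a contingency table. The sketch below uses the standard definition, (observed - expected) / sqrt(expected * (1 - row share) * (1 - column share)), where the expected count is the one implied by independence. The table is a hypothetical toy example, not the data of Table 2 or Table 3, and the function name is illustrative.

```python
from math import sqrt


def adjusted_residuals(table):
    """Return the adjusted residual of every cell of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            scale = sqrt(
                expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            out_row.append((observed - expected) / scale)
        result.append(out_row)
    return result


# Hypothetical toy table: rows are labels, columns are web pages.
toy_table = [[30, 10], [10, 30]]
residuals = adjusted_residuals(toy_table)
print([[round(r, 3) for r in row] for row in residuals])
```

    As a common rule of thumb, cells whose adjusted residual exceeds about 2 in absolute value are read as departing from independence.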
    <Section position="1" start_page="36" end_page="36" type="sub_section">
      <SectionTitle>
3.5 Discussion
</SectionTitle>
      <Paragraph position="0"> The original impression that there are different degrees of perception of the genres of web pages was confirmed by these preliminary results. The rough distinction into three levels of genre awareness (easy, ambiguous and difficult) was also confirmed. Three ranges of perception emerged clearly from the percentages, but the distribution of the web pages into these three ranges is slightly different from what was expected.</Paragraph>
      <Paragraph position="1"> The general view of the results (Fisher's test) reveals that there is a significant association between the 25 web pages and the 23 labels. The analysis of adjusted residuals supports this interpretation.</Paragraph>
      <Paragraph position="2"> The agreement among subjects on the label to assign to a particular web page can be divided into three levels.</Paragraph>
      <Paragraph position="3"> At the first level, which can be interpreted as the highest perception of web genres, there are web pages labelled as personal home page (webpage_type_02), eshop (webpage_type_01), corporate home page (webpage_type_11), FAQs (webpage_type_12), and search pages (webpage_type_05). We can define these labels as stable web genres.</Paragraph>
      <Paragraph position="4"> At a middle level of perception, there are web genres still emerging. Most of the labels are fairly novel (ezine, clog, blog, about, how to), sometimes not entirely transparent, and some of them are specialized (academic home page, organizational home page, online tutorial).</Paragraph>
      <Paragraph position="5"> The textual conventions of these genres are probably not entirely standardized yet and can cause oscillation in users' perception. This level offers the most interesting view of a genre repertoire that is moving and evolving and is not yet consolidated.</Paragraph>
      <Paragraph position="6"> The bottom range shows a blurred level of perception, for different reasons. For some genres, such as email and newsletter, the presentation in the form of screenshots was not ideal. Subjects could not navigate through the web page and could not resolve the level of granularity. For instance, for webpage_type_03 (the web page selected by the author to represent an email), 66 subjects chose email, but 34 subjects preferred to add a new label for it and 20 thought it was an about page. Surprisingly, the labels front page and splash screen for webpage_type_04 and webpage_type_24, respectively, were not favoured by the respondents, who preferred to add their own labels in many cases. For webpage_type_06, subjects preferred the label search page to sitemap. Another interesting case is net ad (webpage_type_25), which was often assessed as eshop, probably because the concepts of advertising and selling are closely related. The most opaque label seems to be hotlist (webpage_type_15), because most subjects preferred to add their own label. Three of the four web pages that were classified by the author as &quot;I don't know&quot; belong to this level of perception. While webpage_type_21 fell into the middle range because most of the subjects perceived it as a search page, the genre perception or interpretation of webpage_type_17, webpage_type_18, and webpage_type_23 is not so straightforward. For instance, webpage_type_17 was assessed as an online form (57 subjects), a search page (26 subjects) and an eshop (26 subjects), and it probably has all these functions at the same time.</Paragraph>
    </Section>
  </Section>
</Paper>