Academic studies about - Misplaced Pages

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

This is an old revision of this page, as edited by David Fuchs (talk | contribs) at 19:19, 3 March 2010 (this is a massive morass of unsourced synthesis, undue weight (one study?), and unnecessary quotations; trimming the worst of it). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In recent years there have been numerous academic studies about Misplaced Pages in peer-reviewed publications. This research can be grouped into two categories. The first analyzed the production and reliability of the encyclopedia content, while the second investigated social aspects, such as usage and administration.

Content

Production

A minority of editors produce the majority of persistent content

In a landmark peer-reviewed paper, which was also mentioned in The Guardian, a team of six researchers from the University of Minnesota measured the relationship between editors' edit counts and their ability to convey their writing to Misplaced Pages readers, measured in persistent word views (PWV): the number of times a word introduced by an edit is viewed. The accounting method is best described in the authors' own words: "each time an article is viewed, each of its words is also viewed. When a word written by editor X is viewed, he or she is credited with one PWV." The number of times an article was viewed was estimated from the web server logs.
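The accounting rule quoted above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a single static revision per article (the study tracked words across revisions over time), and the function name, data layout, and toy data are all hypothetical.

```python
from collections import Counter


def persistent_word_views(articles, view_counts):
    """Credit each editor with one PWV per word they wrote, per article view.

    articles:    {title: [author of word 1, author of word 2, ...]}
    view_counts: {title: estimated number of views}
    Both structures are assumptions made for this illustration.
    """
    pwv = Counter()
    for title, word_authors in articles.items():
        views = view_counts.get(title, 0)
        for author in word_authors:
            # Every view of the article is also a view of this word.
            pwv[author] += views
    return pwv


# Hypothetical toy data: each list entry names the author of one word.
articles = {
    "Apple": ["alice", "alice", "bob"],
    "Pear": ["bob", "carol"],
}
view_counts = {"Apple": 10, "Pear": 4}

pwv = persistent_word_views(articles, view_counts)
total = sum(pwv.values())
shares = {editor: count / total for editor, count in pwv.items()}
```

In this toy case the editor who wrote two words of the more frequently viewed article ends up with the largest PWV share, which is the effect the metric is designed to capture.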

The researchers analyzed 25 trillion PWVs attributable to registered users between September 1, 2002 and October 31, 2006. At the end of this period, the top 10% of editors (by edit count) were credited with 86% of PWVs, the top 1% with about 70%, and the top 0.1% (4200 users) with 44% of PWVs, i.e. nearly half of Misplaced Pages's "value" as measured in this study. The top 10 editors (by PWV) contributed only 2.6% of PWVs, and only three of them were in the top 50 by edit count. From the data, the study authors derived the following relationship:

Growth of PWV share increases super-exponentially by edit count rank; in other words, elite editors (those who edit the most times) account for more value than they would given a power-law relationship.

The study also analyzed the impact of bots on content. By edit count, bots dominate Misplaced Pages; 9 of the top 10 and 20 of the top 50 are bots. In contrast, in the PWV ranking only two bots appear in the top 50, and none in the top 10.

Based on the steady growth in the influence of those top 0.1% of editors, as measured by PWV, the study concluded unequivocally:

Frequent editors dominate what people see when they visit Misplaced Pages and this domination is increasing.

Work distribution and social strata

A peer-reviewed paper noted the "social stratification in the Misplaced Pages society" due to the "admins class". The paper suggested that such stratification could be beneficial in some respects but recognized a "clear subsequent shift in power among levels of stratification" due to the "status and power differentials" between administrators and other editors.

Analyzing the entire edit history of Misplaced Pages up to July 2006, the same study determined that the influence of administrator edits on content had steadily diminished from 2003, when administrators performed roughly 50% of all edits, to 2006, when they performed only 10%. This happened even though the average number of edits per administrator had increased more than fivefold during the same period. The authors of the paper labeled this phenomenon the "rise of the crowd". An analysis that used the number of words edited as its metric, instead of the number of edit actions, showed a similar pattern. Because the admin class is somewhat arbitrary with respect to the number of edits, the study also broke users down into categories based on the number of edits performed. The results for "elite users", i.e. users with more than 10,000 edits, were broadly in line with those obtained for administrators, except that "the number of words changed by elite users has kept up with the changes made by novice users, even though the number of edits made by novice users has grown proportionally faster". The elite users were credited with about 30% of the changes for 2006. The study concludes:

Thus though their influence may have waned in recent years, elite users appear to continue to contribute a sizeable portion of the work done in Misplaced Pages. Furthermore, edits made by elite users appear to be substantial in nature. An analysis removing revert edits does not substantially change the findings.

Reliability

Main article: Reliability of Misplaced Pages § Academia

Geography

Misplaced Pages articles cover about half a million places on Earth. However, research conducted by the Oxford Internet Institute has shown that the geographic distribution of articles is highly uneven. Most articles are written about North America, Europe, and East Asia, with very little coverage of large parts of the developing world, including most of Africa.

Social aspects

Demographics

A 2007 study by Hitwise, reproduced in Time magazine, found that visitors to Misplaced Pages are split almost evenly between men and women, but that 60% of edits are made by male editors.

Policies and guidelines

A descriptive study that analyzed Misplaced Pages's policies and guidelines up to September 2007 identified a number of key statistics:

  • 44 official policies
  • 248 guidelines

Even a short policy like Ignore All Rules was found to have generated a lot of discussion and clarifications:

While the "Ignore all rules" policy itself is only sixteen words long, the page explaining what the policy means contains over 500 words, refers readers to seven other documents, has generated over 8,000 words of discussion, and has been changed over 100 times in less than a year.

The study also sampled the expansion of some key policies since their inception. The figure for Deletion was considered inconclusive, however, because the policy had been split into several sub-policies.

Power plays

See also: Criticism of Misplaced Pages, Editorial process

(Copyright notice: This section makes use of extensive quotations from a paper, but most of the quotations are excerpts from Misplaced Pages itself, with user accounts anonymized.)

A 2007 joint peer-reviewed study conducted by researchers from the University of Washington and HP Labs examined how policies are employed and how contributors work towards consensus by quantitatively analyzing a sample of active talk pages. Using a November 2006 database dump, the study focused on 250 talk pages in the tail of the distribution: 0.3% of all talk pages, but containing 28.4% of all talk page revisions, and more significantly, containing 51.1% of all links to policies. From the sampled pages' histories, the study examined only the months with high activity, called critical sections—sets of consecutive months where both article and talk page revisions were significant in number.

The study defined and calculated a measure of policy prevalence. A critical section was considered policy-laden if its policy factor was at least twice the average. Articles were tagged with 3 indicator variables:

  • controversial
  • featured
  • policy-laden

All possible levels of these three factors yielded 8 sampling categories. The study intended to analyze 9 critical sections from each sampling category, but only 69 critical sections could be selected because only 6 articles (histories) were simultaneously featured, controversial, and policy laden.
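The sampling arithmetic above can be checked with a short sketch. It assumes that each of the 6 articles in the fully tagged category contributed one critical section and that every other category had at least 9 available; these assumptions are consistent with the total of 69 but are not stated in this form by the source. All names are hypothetical.

```python
from itertools import product

# The three indicator variables used to tag articles.
FACTORS = ("controversial", "featured", "policy_laden")
TARGET_PER_CATEGORY = 9

# Every True/False combination of the three factors: 2**3 = 8 categories.
categories = list(product((False, True), repeat=len(FACTORS)))


def is_policy_laden(policy_factor, average_policy_factor):
    """Threshold from the study: at least twice the average policy factor."""
    return policy_factor >= 2 * average_policy_factor


# Assumed availability: 9 or more everywhere except the category in which
# all three indicators are set, where only 6 were available.
available = {category: TARGET_PER_CATEGORY for category in categories}
available[(True, True, True)] = 6

selected = sum(min(count, TARGET_PER_CATEGORY) for count in available.values())
```

Here `len(categories)` is 8 and `selected` is 7 × 9 + 6 = 69, matching the figures reported in the text.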

The study found that policies were by no means consistently applied. Claiming that such ambiguities easily give rise to power plays, the study identified, using the methods of grounded theory (Strauss), 7 types of power plays:

  • article scope (what is off-topic in an article)
  • prior consensus (past decisions presented as absolute and uncontested)
  • power of interpretation (a sub-community claiming greater interpretive authority than another)
  • legitimacy of contributor (his/her expertise)
  • threat of sanction (blocking etc.)
  • practice on other pages (other pages being considered models to follow)
  • legitimacy of source (authority of references being disputed)

Due to lack of space, the study detailed only the first 4 types of power plays, those exercised merely by interpreting policy. A fifth power play category was also analyzed; it consisted of blatant violations of policy that were forgiven because the contributor was valued for his contributions despite his lack of respect for the rules.

Obtaining administratorship

Researchers from Carnegie Mellon University devised a probit model of editors who have successfully passed the peer review process to become admins. Using only Misplaced Pages metadata, including the text of edit summaries, their model is 74.8% accurate in predicting successful candidates.
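For readers unfamiliar with probit models, the following minimal sketch shows the general technique: a weighted sum of features is passed through the standard normal cumulative distribution function to give a probability. The coefficients, intercept, and feature names below are purely hypothetical illustrations; they are not the values fitted in the paper.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(features, coefficients, intercept):
    """P(success) = Phi(intercept + sum of coefficient * feature value)."""
    z = intercept + sum(coefficients[name] * value
                        for name, value in features.items())
    return normal_cdf(z)


# HYPOTHETICAL coefficients, chosen only to make the example concrete.
HYPOTHETICAL_COEFFICIENTS = {
    "policy_edits_thousands": 0.5,
    "previous_rfa_attempts": -0.4,
}
HYPOTHETICAL_INTERCEPT = 0.0

candidate = {"policy_edits_thousands": 1.0, "previous_rfa_attempts": 2.0}
p = probit_probability(candidate, HYPOTHETICAL_COEFFICIENTS,
                       HYPOTHETICAL_INTERCEPT)

# One common convention: predict success when the probability is at least 0.5.
predicted_success = p >= 0.5
```

A model's accuracy, such as the 74.8% reported above, is then the share of candidates for whom such a prediction matches the actual outcome.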

The paper observed that despite protestations to the contrary, "in many ways election to admin is a promotion, distinguishing an elite core group from the large mass of editors." Consequently, the paper used policy capture, a method that compares nominally important attributes to those that actually lead to promotion in a work environment.

The overall success rate for promotion was 53%, dropping from 75% in 2005 to 42% in 2006 and 2007. This sudden increase in the failure rate was attributed to a higher standard that recently promoted administrators had to meet, an explanation supported by anecdotal evidence from another recent study, which quoted some early admins who doubted that they would pass muster if their election (RfA) were held at that time. In light of these developments the study argued that:

The process once called "no big deal" by the founder of Misplaced Pages has become a fairly big deal.

Factors affecting RfA outcome (numbers in parentheses are not statistically significant at p < .05):

Factor                                   2006–2007   pre–2006
number of previous RfA attempts          -14.8%      -11.1%
months since first edit                  0.4%        (0.2%)
every 1000 article edits                 1.8%        (1.1%)
every 1000 Misplaced Pages policy edits  19.6%       (0.4%)
every 1000 WikiProject edits             17.1%       (7.2%)
every 1000 article talk edits            6.3%        15.4%
each Arb/mediation/wikiquette edit       -0.1%       -0.2%
diversity score (see text)               2.8%        3.7%
minor edits percentage                   0.2%        0.2%
edit summaries percentage                0.5%        0.4%
"thanks" in edit summaries               0.3%        (0.0%)
"POV" in edit summaries                  0.1%        (0.0%)
Admin attention/noticeboard edits        -0.1%       (0.2%)

Perhaps contrary to expectations, "running" for administrator multiple times is detrimental to the candidate's chance of success. Each subsequent attempt has a 14.8% lower chance of success than the previous one. Length of participation in the project makes only a small contribution to the chance of RfA success.

Another significant finding of the paper is that one Misplaced Pages policy edit or WikiProject edit is worth roughly ten article edits. A related observation is that candidates with experience in multiple areas of the site stood a better chance of election. This was measured by the diversity score, a simple count of the number of areas in which the editor has participated. The paper divided Misplaced Pages into 16 areas: article, article talk, articles/categories/templates for deletion (XfD), (un)deletion review, etc. (see paper for full list). For instance, a user who has edited articles and her own user page, and posted once at (un)deletion review, would have a diversity score of 3. Making a single edit in any additional region of Misplaced Pages correlated with a 2.8% increased likelihood of success in gaining administratorship.
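Both observations can be sketched in a few lines, using the 2006–2007 figures from the table above. The area names and edit counts in the example are hypothetical, and the paper's full list of 16 areas is not reproduced here.

```python
def diversity_score(edit_counts_by_area):
    """Count the areas in which the editor has made at least one edit."""
    return sum(1 for count in edit_counts_by_area.values() if count > 0)


# Hypothetical candidate matching the example in the text: article edits,
# her own user page, and a single post at (un)deletion review.
candidate = {
    "article": 1200,
    "user page": 15,
    "(un)deletion review": 1,
    "article talk": 0,  # no edits here, so this area does not count
}
score = diversity_score(candidate)

# Effect per 1000 edits, 2006-2007 column of the table above (in percent).
EFFECT_PER_1000 = {"article": 1.8, "policy": 19.6, "wikiproject": 17.1}

policy_vs_article = EFFECT_PER_1000["policy"] / EFFECT_PER_1000["article"]
wikiproject_vs_article = (EFFECT_PER_1000["wikiproject"]
                          / EFFECT_PER_1000["article"])
```

The candidate's diversity score is 3, and both ratios come out close to ten (about 10.9 and 9.5), which is where the "worth ten article edits" summary comes from.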

Making minor edits also helped, although the study authors consider that this may be so because minor edits correlate with experience. In contrast, each edit to an Arbitration or Mediation committee page, or a Wikiquette notice, all of which are venues for dispute resolution, decreases the likelihood of success by 0.1%. Posting messages to administrator noticeboards (ANI) had a similarly deleterious effect. The study interpreted this as evidence that editors involved in escalating or protracted conflicts lower their chances of becoming administrators.

Saying "thanks" or variations thereof in edit summaries, and pointing out point of view ("POV") issues (also only in edit summaries, because the study analyzed only metadata), were of minor benefit, adding 0.3% and 0.1% respectively to a candidate's chances in 2006–2007, but neither reached statistical significance before 2006.

A few factors that were found to be irrelevant or marginal at best:

  • Editing user pages (including one's own) does not help. Somewhat surprisingly, user talk page edits also do not affect the likelihood of administratorship.
  • Welcoming newcomers or saying "please" in edit summaries had no effect.
  • Participating in consensus-building, such as RfA votes or the village pump, does not increase the likelihood of becoming an admin. The study admits, however, that participation in consensus was measured quantitatively but not qualitatively.
  • Vandal-fighting, as measured by the number of edits to the vandalism noticeboard, had no effect. Every thousand edits containing variations of "revert" was positively correlated (7%) with adminship for 2006–2007, but did not attain statistical significance unless one is willing to lower the threshold to p<.1. More confusingly, before 2006 the number of reverts was negatively correlated (-6.8%) with adminship success, again without attaining statistical significance even at p<.1. This may be because of the introduction of a policy known as 3RR in 2006 to reduce reverts.

The study suggests that some of the 25% unexplained variability in outcomes may be due to factors that were not measured, such as quality of edits or participation in off-site coordination, for example the (explicitly cited) secret mailing list reported in The Register. The paper concludes:

Merely performing a lot of production work is insufficient for “promotion” in Misplaced Pages. Candidates’ article edits were weak predictors of success. They also have to demonstrate more managerial behavior. Diverse experience and contributions to the development of policies and WikiProjects were stronger predictors of RfA success. This is consistent with the findings that Misplaced Pages is a bureaucracy and that coordination work has increased substantially. Participation in Misplaced Pages policy and WikiProjects was not predictive of adminship prior to 2006, suggesting the community as a whole is beginning to prioritize policymaking and organization experience over simple article-level coordination.

References

  1. Reid Priedhorsky, Jilin Chen, Shyong (Tony) K. Lam, Katherine Panciera, Loren Terveen, John Riedl, "Creating, destroying, and restoring value in Misplaced Pages", Proc. GROUP 2007, doi: http://doi.acm.org/10.1145/1316624.1316663
  2. Nicholson Baker (April 10, 2008). "How I fell in love with Misplaced Pages". The Guardian.
  3. Aniket Kittur, Ed Chi, Bryan Pendleton, Bongwon Suh and Todd Mytkowicz. "Power of the Few vs. Wisdom of the Crowd: Misplaced Pages and the Rise of the Bourgeoisie" (PDF). Proc. alt.chi 2007. Retrieved 2007-10-27.
  4. "Mapping the Geographies of Misplaced Pages Content". Mark Graham Oxford Internet Institute. ZeroGeography. Retrieved 2009-11-16.
  5. Bill Tancer (2007-04-25). "Who's Really Participating in Web 2.0". Time Magazine. Retrieved 2007-04-30.
  6. Butler et al., "Don't look now, but we've created a bureaucracy: the nature and roles of policies and rules in wikipedia", Proc. CHI 2008, doi: http://doi.acm.org/10.1145/1357054.1357227
  7. Travis Kriplean, Ivan Beschastnikh, David W. McDonald, Scott A. Golder, "Community, Consensus, Coercion, Control: CS*W or How Policy Mediates Mass Participation", Proc. GROUP 2007, doi: http://doi.acm.org/10.1145/1316624.1316648
  8. Moira Burke and Robert Kraut, "Taking up the mop: identifying future Misplaced Pages administrators", Pages 3441–3446, doi: http://doi.acm.org/10.1145/1358628.1358871
  9. Stumpf, S. A., & London, M. (1981). Capturing rater policies in evaluating candidates for promotion. The Academy of Management Journal, 24(4), 752–766.
  10. Forte, A., and Bruckman, A. Scaling consensus: Increasing decentralization in Misplaced Pages governance. Proc. HICSS 2008.
  11. WP:3RR and WP:EW, policies which prevent repetitive reverting.
  12. http://www.theregister.co.uk/2007/12/04/wikipedia_secret_mailing
  13. Kittur, A., Suh, B., Pendleton, B. A., Chi., E. "He says, she says: Conflict and coordination in Misplaced Pages". Proc CHI 2007, ACM Press (2007), 453–462.
  14. Viegas, F., Wattenberg, M., Kriss, J., and van Ham, F. "Talk before you type: Coordination in Misplaced Pages". Proc HICSS 2007, 575–582.

External links

  • Adler, B. T., & L. de Alfaro (2007). "A content-driven reputation system for the wikipedia." Proceedings of the 16th international conference on World Wide Web, http://doi.acm.org/10.1145/1242572.1242608
  • Amichai–Hamburger, Y., N. Lamdan, R. Madiel, & T. Hayat (2008). "Personality characteristics of Misplaced Pages members." Cyberpsychology & Behavior, 11(6), http://dx.doi.org/10.1089/cpb.2007.0225
  • Blumenstock, J. E. (2008). "Size matters: word count as a measure of quality on Misplaced Pages." Proceeding of the 17th international conference on World Wide Web, http://doi.acm.org/10.1145/1367497.1367673
  • Bryant, S. L., A. Forte, & A. Bruckman (2005). "Becoming Wikipedian: transformation of participation in a collaborative online encyclopedia." Proceedings of the 2005 international ACM SIGGROUP conference on Supporting group work, http://doi.acm.org/10.1145/1099203.1099205
  • Hu, M., E. P. Lim, A. Sun, H. W. Lauw, & B.-Q. Vuong (2007). "Measuring article quality in wikipedia: models and evaluation." Proceedings of the sixteenth ACM conference on Conference on information and knowledge management, http://doi.acm.org/10.1145/1321440.1321476
  • Kittur, A., B. Suh, B. A. Pendleton, & E. H. Chi (2007). "He says, she says: conflict and coordination in Misplaced Pages." Proceedings of the SIGCHI conference on Human factors in computing systems, http://doi.acm.org/10.1145/1240624.1240698
  • Kuznetsov, S. (2006). "Motivations of contributors to Misplaced Pages." ACM SIGCAS Computers and Society, http://doi.acm.org/10.1145/1215942.1215943
  • Luyt, B., T. C. H. Aaron, L. H. Thian, & C. K. Hong (2007). "Improving Misplaced Pages's accuracy: Is edit age a solution?" Journal of the American Society for Information Science and Technology, http://dx.doi.org/10.1002/asi.v59:2
  • Medelyan, O., C. Legg, D. Milne, & I. H. Witten (2008). "Mining meaning from Misplaced Pages." arXiv, http://arxiv.org/abs/0809.4530
  • Shachaf, P. (2009). "The paradox of expertise: Is the Misplaced Pages reference desk as good as your library?" Journal of Documentation, 65(6), 977-996, http://www.slis.indiana.edu/news/story.php?story_id=2064
  • Stein, K., & C. Hess (2007). "Does it matter who contributes: a study on featured articles in the German Misplaced Pages." Proceedings of the eighteenth conference on Hypertext and hypermedia, http://doi.acm.org/10.1145/1286240.1286290
  • Suh, B., E. H. Chi, A. Kittur, & B. A. Pendleton (2008). "Lifting the veil: improving accountability and social transparency in Misplaced Pages with wikidashboard." Proceeding of the twenty-sixth annual SIGCHI conference on Human factors in computing systems, http://doi.acm.org/10.1145/1357054.1357214
  • Viegas, F. B., M. Wattenberg, J. Kriss, & F. van Ham (2007). "Talk before you type: coordination in Misplaced Pages." IEEE Xplore, http://dx.doi.org/10.1109/HICSS.2007.511
  • Vuong, B.-Q., E. P. Lim, A. Sun, M.-T. Le, & H. W. Lauw (2008). "On ranking controversies in Misplaced Pages: models and evaluation." Proceedings of the international conference on Web search and web data mining, http://doi.acm.org/10.1145/1341531.1341556
  • Farrell, Henry (2008-12-30). "Norms, Minorities, and Collective Choice Online". Ethics & International Affairs. 22 (4). Carnegie Council for Ethics in International Affairs. Retrieved 2009-02-03.
  • Urdaneta, G., Pierre, G., van Steen, M. (2009). "Misplaced Pages Workload Analysis for Decentralized Hosting." Elsevier Computer Networks 53(11), pp. 1830–1845, July 2009. http://www.globule.org/publi/WWADH_comnet2009.html