Wikipedia Dynamics - with open problems
Mingli Yuan
August 27, 2006
Jimmy Wales at Wikimania 2006
- According to Alexa data, Wikipedia passed CNN on page views.
- Nature article on Wikipedia and Britannica.
- The One Laptop Per Child project will integrate Wikipedia content.
- For English Wikipedia, turn our attention away from growth to quality.
- Images: limiting fair use and encouraging free images
- Stable versions
Overview
- Wikipedia: mission and characteristics
- Criticisms on Wikipedia
- Wikipedia: a dynamic system
- Growth of Wikipedia
- Featured articles and good articles
- Stub percentages
- Words and Revisions per Article
- Articles per Wikipedian
- Wikipedian
- How do Wikipedians create featured articles?
- My personal conclusions
- References
Wikipedia: mission and characteristics
Mission
- "Imagine a world in which every single person is given free access to the sum of all human knowledge." by Jimmy Wales [1]
Characteristics [2]
- free to redistribute and reproduce
- constant and plentiful updates
- comprehensive, diverse coverage
- free of advertising
- versions in numerous languages
Criticisms on Wikipedia
Criticisms [2]
- susceptibility to vandalism
- uneven quality
- inconsistency
- systemic bias
- preference for consensus or popularity over credentials
Wikipedia: a dynamic system
Wikipedia: a complex social phenomenon
Problems:
- How will Wikipedia evolve?
- Is it feasible to establish a high-quality knowledge repository in the Wikipedia way?
Growth of Wikipedia
- Quasi-exponential growth pattern
- 2003 model: using 2003 data to set up a model
- The 2003 model underestimated the actual growth
- Problem: how long will the exponential growth continue, or is it in reality merely the early phase of a logistic curve? [3]
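The open problem above is hard to settle from early data, because a logistic curve looks almost exactly like an exponential one while the article count is far below its ceiling. The following is a minimal sketch of that effect, not the model from [3]; the starting count, growth rate and carrying capacity are hypothetical values chosen only for illustration.

```python
import math

def exponential(t, n0, r):
    """Article count after t years under pure exponential growth."""
    return n0 * math.exp(r * t)

def logistic(t, n0, r, k):
    """Article count after t years under logistic growth with ceiling k."""
    return k / (1 + (k / n0 - 1) * math.exp(-r * t))

# Hypothetical parameters, not fitted to any Wikipedia data.
N0 = 20_000       # article count at t = 0
R = 0.8           # growth rate per year
K = 10_000_000    # carrying capacity of the logistic curve

def relative_gap(t):
    """Fraction by which the exponential curve exceeds the logistic one."""
    e = exponential(t, N0, R)
    return (e - logistic(t, N0, R, K)) / e

# The gap is negligible early on and only becomes large near the ceiling.
for year in range(0, 11, 2):
    print(year, round(exponential(year, N0, R)),
          round(logistic(year, N0, R, K)), f"{relative_gap(year):.1%}")
```

With these assumed values the two curves differ by less than one percent after two years, so data from that window alone cannot tell the two models apart.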
Featured articles and good articles
Statistics on featured articles and good articles [4]
- The number of good articles has been rising faster than the number of featured articles.
- The proportion of good articles is rising.
- The proportion of featured articles is declining.
Problems:
- Will the proportion of good articles keep rising?
- Will the proportion of featured articles drop to near zero?
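Whether the featured proportion drops to near zero depends on how the total article count grows. The sketch below compares two toy scenarios under the assumption that featured articles are promoted at a steady rate: the total grows exponentially, or the total grows linearly. All rates and counts are hypothetical and are not taken from [4].

```python
import math

FEATURED_PER_MONTH = 30      # hypothetical steady promotion rate
START_TOTAL = 100_000        # hypothetical article count at month 0

def featured(t):
    """Featured articles after t months, assuming linear growth."""
    return FEATURED_PER_MONTH * t

def total_exponential(t):
    """Total articles after t months, assuming 5% continuous monthly growth."""
    return START_TOTAL * math.exp(0.05 * t)

def total_linear(t):
    """Total articles after t months, assuming 20,000 new articles per month."""
    return START_TOTAL + 20_000 * t

def share(t, total):
    """Proportion of featured articles at month t for a given total model."""
    return featured(t) / total(t)

for month in (12, 60, 240, 600):
    print(month, f"{share(month, total_exponential):.2e}",
          f"{share(month, total_linear):.2e}")
```

Under these assumptions the proportion goes to zero only while the total keeps growing exponentially. If the total settles into linear growth, the proportion approaches the ratio of the two rates instead.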
Stub percentages
Stubs are Wikipedia entries that have not yet received substantial attention from the editors of Wikipedia, and do not yet contain sufficient information on their subject matter. [5]
Statistical result: [6]
- Stubs still comprise an increasingly large percentage of the articles on Wikipedia.
- The rate of stub increase is slowing.
- The need to focus on stub expansion, not just article creation. [6]
Problem: Does the stub percentage increase forever?
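One way to think about this problem is a toy balance model: each month the article count grows by a fixed fraction, some share of the new articles start as stubs, and some share of the existing stubs are expanded into full articles. This is a sketch under those assumptions, not the analysis in [6]; the function names and every parameter value are hypothetical.

```python
def stub_share_step(p, new_stub_fraction, growth, expansion):
    """Advance the stub share p by one month.

    growth: monthly growth of the total article count (0.05 means 5%)
    new_stub_fraction: share of newly created articles that are stubs
    expansion: share of existing stubs expanded into full articles per month
    """
    return (p * (1 - expansion) + new_stub_fraction * growth) / (1 + growth)

def stub_share_limit(new_stub_fraction, growth, expansion):
    """Fixed point of stub_share_step: the share the model settles at."""
    return new_stub_fraction * growth / (growth + expansion)

def simulate(p0, months, new_stub_fraction, growth, expansion):
    """Return the stub share for month 0 through the given number of months."""
    shares = [p0]
    for _ in range(months):
        shares.append(stub_share_step(shares[-1], new_stub_fraction,
                                      growth, expansion))
    return shares

# Hypothetical values: 30% stubs at the start, 70% of new articles are stubs,
# 5% monthly growth, 2% of stubs expanded each month.
shares = simulate(0.30, 120, 0.7, 0.05, 0.02)
print(f"after 10 years: {shares[-1]:.3f}, "
      f"limit: {stub_share_limit(0.7, 0.05, 0.02):.3f}")
```

In this model the stub share keeps rising, but more slowly each month, and levels off at a constant rather than increasing forever. A higher expansion rate lowers that constant, which is the case for focusing on stub expansion.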
Words and revisions per Article
- The average number of words per page will increase at no more than a snail's pace. [7]
- It is even possible that the gradient might flatten or fall slightly if the rate of new stub addition eclipses the rate of expansion of existing articles. [7]
Articles per Wikipedian
Statistical result: [8]
- Articles per Wikipedian is decaying along some function that appears either logarithmic or polynomial.
- It's uncertain whether this will level off somewhere above 20 articles per contributor.
Wikipedian
Statistical result: [8]
- Community membership has, like article count, been growing exponentially.
- The ratio of very active to active Wikipedians is trending downward.
How do Wikipedians create featured articles?
[Figure: Length vs Time. Series: Length; y-axis 0 to 14000; x-axis monthly from 2/6/2004 to 7/6/2006.]
[Figure: Contribution. Series1; y-axis -500 to 3500; x-axis 1 to 61.]
- Leaps in the Length vs. Time graph.
- Few Wikipedians contribute most of the content of the article.
- A pro-am game
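The observation that a few Wikipedians contribute most of an article's content can be checked from a revision history by attributing each change in article length to the editor who made it. The sketch below assumes the history is available as chronological (editor, length after edit) pairs; the function names and the sample history are hypothetical, and the numbers are not the data behind the charts above.

```python
from collections import defaultdict

def contributions_by_editor(revisions):
    """Sum the net change in article length made by each editor.

    revisions: chronological list of (editor, article_length_after_edit).
    """
    totals = defaultdict(int)
    previous_length = 0
    for editor, length in revisions:
        totals[editor] += length - previous_length
        previous_length = length
    return dict(totals)

def top_share(totals, k):
    """Share of all added content that came from the k largest contributors.

    Editors with a net negative contribution (net removals) are ignored.
    """
    added = sorted((v for v in totals.values() if v > 0), reverse=True)
    total_added = sum(added)
    return sum(added[:k]) / total_added if total_added else 0.0

# Hypothetical revision history for one article.
history = [("A", 1200), ("B", 1250), ("A", 4800), ("C", 4700),
           ("D", 4900), ("A", 9000), ("B", 9100)]

totals = contributions_by_editor(history)
print(totals)
print(f"top contributor share: {top_share(totals, 1):.1%}")
```

Net length change is a crude measure: it rewards bulk additions and treats copy-editing and vandalism reverts as small or negative contributions, so it is only a first approximation of who wrote the content.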
My personal conclusions
- The Wikipedia project is still in its early stage, with exponential growth in contributors and articles, but such growth will eventually decline to a constant rate.
- Articles per Wikipedian and edits per Wikipedian will be constant.
- The numbers of very active users and active users will approach certain constants, while the number of ordinary users will grow linearly.
- Featured articles, good articles and stubs will grow linearly, but stubs will grow faster than the other two; the stub percentage will approach a constant in the future.
- As a long-term project, Wikipedia may last for hundreds of years and become another Britannica in the Internet era.
References
[1] Jimmy Wales Kicks off Wikimania. http://ross.typepad.com/blog/2006/08/jimmy_wales_kic.html
[2] Wikipedia. http://en.wikipedia.org/wiki/Wikipedia
[3] Wikipedia: Modeling Wikipedia's growth. http://en.wikipedia.org/wiki/Wikipedia:Modelling_Wikipedia%27s_growth
[4] Wikipedia: Good articles/Statistics. http://en.wikipedia.org/wiki/Wikipedia:Good_articles/Statistics
[5] Wikipedia: Stub. http://en.wikipedia.org/wiki/WP:Stub
[6] User: Dantheox/Stub percentages. http://en.wikipedia.org/wiki/User:Dantheox/Stub_percentages
[7] Wikipedia: Words per article. http://en.wikipedia.org/wiki/Wikipedia:Words_per_article
[8] Wikipedia: Xiong's stats. http://en.wikipedia.org/wiki/Wikipedia:Xiong%27s_stats
Thanks!