Wikipedia as an engine for scientific communication and
collaboration at massive scale
Andrew Su, Ph.D.
@andrewsu
[email protected]
http://sulab.org
ScienceWriters2012
October 27, 2012
The biomedical literature is growing rapidly
[Figure: number of PubMed-indexed articles per year, 1979 to 2009; y-axis from 0 to 1,000,000]
[Figure: average capacity of a human scientist (number of articles read by a typical scientist)]
High-throughput molecular profiling is powerful
~20,000 genes → 100+ candidates → 10+ experiments → Testable hypothesis
Filtering, extracting, and summarizing PubMed
Documents
Concepts
Review article
10k gene “stubs” within Wikipedia ≈ “Gene Wiki”
Protein structure
Symbols and identifiers
Tissue expression pattern
Gene Ontology annotations
Links to structured databases
Gene summary
Protein interactions
Linked references
Huss, PLoS Biol, 2008
Gene Wiki has a critical mass of readers
Rank 1-10: Laypeople
Insulin, Titin, Human chorionic gonadotropin, Vasopressin, ANKH, CLOCK, Catalase, Erythropoietin, Glucagon, Parathyroid hormone
Rank 1001-1010: Specialists
CSDA, CNTNAP2, IGSF8, Adenosine A3 receptor, RYR1, ETV6, Small heterodimer partner, 5-HT1D receptor, TRPC6, Interleukin-6 receptor
Rank 101-110: Scientists
Tau protein, Interleukin 10, APC, C-Met, Factor V, Interleukin 8, CD44, Histamine H1 receptor, Kappa Opioid receptor, Dihydrofolate reductase
Total: 4.0 million views / month
Huss, PLoS Biol, 2008; Huss, NAR, 2010; Good, NAR, 2011
Gene Wiki has a critical mass of editors
Increase of ~10,000 words / month from >1,000 edits
Currently 1.42 million words
Approximately equal to 230 full-length articles
[Figure: editor count and edit count over time (series: Editors, Edits)]
Huss, NAR, 2010; Good, NAR, 2011
A review article for every gene is powerful
References to the literature
Hyperlinks to related concepts
Reelin: 98 editors, 703 edits since July 2002
Heparin: 358 editors, 654 edits since June 2003
AMPK: 109 editors, 203 edits since March 2004
RNAi: 394 editors, 994 edits since October 2002
The Gene Wiki is timely and current
Manny Ramirez suspended for doping
Catalase linked to premature gray hair
Also, MGAT2 (obesity), ALDH2 (heart attack), SOX21 (hair loss), SATB1 (breast cancer), TSLP (asthma), CCR5 (HIV), …
Huss, NAR, 2010
The Gene Wiki is (reasonably) reliable
[Figure: cumulative edits by date, good edits vs. vandalism]
                        Good edits    Vandalism
Per edit probability    98.9%         1.1%
Average lifetime        115.4 d       3.4 d
Probability by time     99.968%       0.032% (0.63% for WP overall)
Good, NAR, 2011
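One plausible reading of the "probability by time" figures is the share of total time that an article spends in a good or a vandalized state, weighting each kind of edit by how long it survives. The sketch below works through that arithmetic with the per-edit probabilities and average lifetimes from the slide. The function name and the weighting formula are assumptions made for illustration; they are not taken from Good, NAR, 2011.

```python
def time_weighted_share(p_edit, lifetime_days, p_other, other_lifetime_days):
    """Share of total article-time spent in versions of one kind.

    Assumption: each edit contributes (probability x average lifetime)
    days of exposure; the share is that exposure over the total.
    """
    own = p_edit * lifetime_days
    other = p_other * other_lifetime_days
    return own / (own + other)


# Figures from the slide
P_VANDAL, LIFE_VANDAL = 0.011, 3.4   # per-edit probability, lifetime in days
P_GOOD, LIFE_GOOD = 0.989, 115.4

vandal_share = time_weighted_share(P_VANDAL, LIFE_VANDAL, P_GOOD, LIFE_GOOD)
good_share = 1.0 - vandal_share

print(f"vandalized: {vandal_share:.3%}, good: {good_share:.3%}")
```

Under this assumption the vandalized share comes out at roughly 0.033%, close to the 0.032% on the slide; the small gap may reflect rounding in the published inputs.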
Making the Gene Wiki more reliable
The company name is derived from old Greek, and means
"destroyer of birds".
Novartis is a multinational pharmaceutical company
based in Basel, Switzerland that manufactures drugs such
as clozapine (Clozaril), diclofenac (Voltaren), …
http://www.wikitrust.net/
Good, NAR, 2011
High-trust author: 36211 total edits
Low-trust author: 36 total edits
Good, NAR, 2011
Partnering with traditional scientific publishing
Collaborators
Doug Howe, ZFIN
John Hogenesch, U Penn
Jon Huss, GNF
Luca de Alfaro, UCSC
Angel Pizzaro, U Penn
Faramarz Valafar, SDSU
Pierre Lindenbaum, Fondation Jean Dausset
Michael Martone, Rush
Konrad Koehler, Karo Bio
Warren Kibbe, Simon Lim, Northwestern
Many Wikipedia editors
WP:MCB Project
Group members
Ben Good
Salvatore Loguercio
Ian Macleod
Max Nanis
Chunlei Wu
Funding and Support
(BioGPS: GM83924, Gene Wiki: GM089820)
Contact
http://sulab.org
[email protected]
@andrewsu
+Andrew Su
http://slideshare.com/andrewsu