
Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Date post: 10-May-2015
Upload: orcid-0000-0002-2668-4821
Description:
ChemSpider is one of the internet’s primary resources for chemists. It is a structure-centric platform that hosts over 26 million unique chemical entities sourced from over 400 different data sources and delivers information including commercial availability, associated publications, patents, analytical data, and experimental and predicted properties. ChemSpider serves a rather unique role in the community in that any chemist can deposit, curate and annotate data. In this manner they can contribute their skills and data to any chemist using the system. A number of parallel projects have been developed from the initial platform, including ChemSpider SyntheticPages, a community-generated database of reaction syntheses, and the Learn Chemistry wiki, an educational wiki for secondary school students. This presentation will provide an overview of the project in terms of our success in engaging scientists to contribute to crowdsourcing chemistry. We will also discuss some of our plans to encourage future participation and engagement in this and related projects.
Crowdsourcing Chemistry for the Community – 5 Years of Experiences Antony Williams NFAIS, February 28th 2012
Transcript
Page 1: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Antony Williams

NFAIS, February 28th 2012

Page 2: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

The World of Online Chemistry

Safety data

Toxicity data

Blogs and Wikis

Property databases

Experimental results

Scientific publications

Compound aggregators

Open Notebook Science

Metabolic pathway databases

Encyclopedic articles (Wikipedia)

Page 3: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

If it was not just about me…

Page 4: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

If it was not just about me…

We might have a community built encyclopedia

I might know where the best restaurants are

I might get good advice on books to read

I might know which movies to watch

I might know which plumber to call

Data might just be Open

Page 6: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Collaborative Knowledge Management

Page 7: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

QUESTION

Are you involved with assisting chemists, pharmaceutical scientists, etc. in sourcing information about Chemistry?

1. Yes

2. No

Page 8: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Chemistry Databases on the Internet

Public databases are “trusted” as primary sources

Trust is granted without investigation of the content

Online data vary dramatically in quality!

Examples…

Page 9: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

With Great Fanfare…

Page 10: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

NPC Browser http://tripod.nih.gov/npc/

Page 11: Crowdsourcing Chemistry for the Community – 5 Years of Experiences
Page 12: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

NPC Browser http://tripod.nih.gov/npc/

Page 13: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

How many contribute to clean-up?

Fewer than a dozen contributors to data

The majority are project members

The crowd is small…

Page 14: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

What you might not know about Chemistry Databases on the Internet

Data-sharing between the databases is cyclic – proliferating errors – “Linked Data”

Page 15: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

What is the Structure of Vitamin K?

Page 16: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

MeSH

A lipid cofactor that is required for normal blood clotting.

Several forms of vitamin K have been identified: VITAMIN K 1 (phytomenadione) derived from plants, VITAMIN K 2 (menaquinone) from bacteria, and synthetic naphthoquinone provitamins, VITAMIN K 3 (menadione).

Page 17: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

What is the Structure of Vitamin K1?

Page 18: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

QUESTION

Have you heard of ChemSpider as a chemistry database?

1. Yes

2. No

Page 19: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

ChemSpider

Page 20: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

We Want to Answer Questions

Questions a chemist might ask…

What is the melting point of n-heptanol?

What is the chemical structure of Xanax?

Chemically, what is phenolphthalein?

What are the stereocenters of cholesterol?

Where can I find publications about xylene?

What are the different trade names for Ketoconazole?

What is the NMR spectrum of Aspirin?

What are the safety handling issues for Thymol Blue?

Page 21: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Available Information…

Linked to vendors, safety data, toxicity, metabolism

Page 22: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Available Information…

Page 23: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Crowdsourced “Annotations”

Users can add

Descriptions/Syntheses/Commentaries

Links to PubMed articles

Links to articles via DOIs

Add spectral data

Add Crystallographic Information Files

Add photos

Add MP3 files

Add Videos

Page 24: Crowdsourcing Chemistry for the Community – 5 Years of Experiences
Page 25: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

QUESTION

Did you know that ChemSpider was OWNED by the Royal Society of Chemistry?

1. Yes

2. No

Page 26: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Public Domain Databases

Our databases are a mess…

Non-curated databases are proliferating errors

We source and deposit data between databases

Original sources of errors hard to determine

Curation is time-consuming and challenging

Page 27: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Stop Whining – Fix it

Page 28: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Crowdsourced Curation

Crowdsourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate

Page 29: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Search “Vitamin H”

Page 30: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

“Curate” Identifiers

Page 31: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

“Curate” Identifiers

Page 32: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Validated Name-Structure Dictionaries

Chemical name dictionaries are used for:

Text-mining (publications, patents)

Used to index PubMed and link to Google Patents

Linking to other databases – think Biology!

When structures are not available drug names link

Searching the web

Names link to structures link to InChIs
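
The chain on this slide (names link to structures, structures link to InChIs) can be sketched as a two-step lookup. This is a minimal illustration, not ChemSpider's implementation: the record IDs, the placeholder InChI strings, and the function name `find_records` are all assumptions made for the example. The only chemistry it relies on is that "aspirin" and "acetylsalicylic acid" name the same compound, as do "biotin" and "vitamin H".

```python
import re

# Hypothetical validated dictionary: many names, one structure record.
# Record IDs and InChI placeholders are illustrative, not real database values.
NAME_TO_RECORD = {
    "aspirin": "record-1",
    "acetylsalicylic acid": "record-1",
    "biotin": "record-2",
    "vitamin h": "record-2",
}
RECORD_TO_INCHI = {
    "record-1": "InChI=<placeholder for record-1>",
    "record-2": "InChI=<placeholder for record-2>",
}


def find_records(text):
    """Return {record_id: sorted list of dictionary names found in text}."""
    lowered = text.lower()
    hits = {}
    for name, record in NAME_TO_RECORD.items():
        # Match whole names only, so "biotin" does not match "biotinylated".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, lowered):
            hits.setdefault(record, []).append(name)
    return {record: sorted(names) for record, names in hits.items()}


if __name__ == "__main__":
    found = find_records("Vitamin H, also known as biotin, and Aspirin.")
    for record, names in found.items():
        print(record, names, RECORD_TO_INCHI[record])
```

A wrong name in the dictionary sends every text-mined document containing that name to the wrong structure, which is why the validation step matters.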

Page 33: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Why are Dictionaries important?

Page 34: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

The Final Search Strategy

Page 35: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Many Names, One Structure

Page 36: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

I want to know about “Vincristine”

Page 37: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Vincristine: Identifiers and Properties

Page 38: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Vincristine: Patents Linked by Name

Page 39: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Text-Mining Depends on Dictionaries

Page 40: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Curated Dictionaries Matter

Page 41: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Originally 15 compounds “called” Yohimbine

54 Skeletons for Yohimbine

Page 42: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Sharing ChemSpider curation

Page 43: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Data Curation Sharing - Proof of Concept

Page 44: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Identifier Dictionaries

Reciprocal curation processes…share curation

A series of “added” and “removed” synonyms against structures for matching.

Announced 9 months ago – only one consumer

Who will participate???
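
One way a consumer database might apply such a series of “added” and “removed” synonyms is sketched below. The slides do not describe the actual exchange format, so the tuple layout, the structure ID, and the function name `apply_curation` are assumptions for illustration only.

```python
def apply_curation(synonyms, delta):
    """Return a new synonym dictionary with a curation delta applied.

    synonyms: {structure_id: set of names}
    delta: iterable of (action, structure_id, name), where action is
           "added" or "removed" (hypothetical format, not a published one).
    """
    # Copy first so the consumer's original data is left untouched.
    updated = {sid: set(names) for sid, names in synonyms.items()}
    for action, sid, name in delta:
        names = updated.setdefault(sid, set())
        if action == "added":
            names.add(name)
        elif action == "removed":
            names.discard(name)  # removing an absent name is a no-op
        else:
            raise ValueError(f"unknown action: {action!r}")
    return updated


if __name__ == "__main__":
    # Illustrative record: a wrong synonym attached to the biotin structure.
    consumer = {"structure-1": {"biotin", "vitamin K"}}
    delta = [
        ("removed", "structure-1", "vitamin K"),
        ("added", "structure-1", "vitamin H"),
    ]
    print(sorted(apply_curation(consumer, delta)["structure-1"]))
```

Shipping only the changes, rather than the full dictionary, lets a consumer apply another database's curation without overwriting its own.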

Page 45: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Community Contribution to ChemSpider

Page 46: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

www.SpectralGame.com

http://www.jcheminf.com/content/1/1/9

Page 47: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Curation through “gaming”

Page 48: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Data Curation

Page 49: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Reversed Spectrum

Page 50: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

True Curation of Data

Page 51: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

ChemSpider SyntheticPages

Page 52: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

ChemSpider SyntheticPages

Page 53: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Submission Process

Simple template-based submission process

Submissions reviewed by editorial board.

Online Peer Review process

Crowdsourced expansion?

A few regular dedicated authors only

Online peer review and feedback small but useful

Page 54: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Crowdsourcing – does it work?

Only 192 people have EVER deposited or curated data

ChemSpider SyntheticPages: small group of authors

Database hosts make the largest contributions

ChemSpider staff tend to do the most curation

Page 55: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Contributions

Page 56: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Curations

2009 – 8255 curations by 43 people

2010 – 10014 curations by 66 people

2011 – 16025 curations by 116 people

“Crowdsourcing” – the crowd is small!
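
The yearly figures above can be turned into an average number of curations per contributor. The short sketch below uses only the numbers on this slide; the function name `curations_per_person` is just a label for the example.

```python
# Figures copied from the "Curations" slide: year -> (curations, people).
CURATIONS = {
    2009: (8255, 43),
    2010: (10014, 66),
    2011: (16025, 116),
}


def curations_per_person(stats):
    """Average number of curations per contributor for each year."""
    return {year: count / people for year, (count, people) in stats.items()}


if __name__ == "__main__":
    averages = curations_per_person(CURATIONS)
    for year in sorted(averages):
        print(year, round(averages[year], 1))
```

The averages come out at roughly 192, 152 and 138 curations per person for 2009, 2010 and 2011. The number of contributors grew faster than the number of curations. An average hides how the work is distributed, though, and an earlier slide notes that ChemSpider staff tend to do the most curation.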

Page 57: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

www.SciMobileApps.com

8 contributors only…in 7 months

Page 58: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

www.SciDBs.com

7 contributors only…in 6 months

Page 59: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

www.ScientistsDB.com

38 contributors …in 6 weeks

Page 60: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

What encourages participation?

“Interested” parties contribute

Marketing and self-promotion are primary reasons for participation

There are very few “selfless” participants

Relationships garner contributions…

Page 61: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Crowdsourcing across drug discovery

Open PHACTS: partnership between European Community and European Pharma Companies

Freely accessible for knowledge discovery and verification.

Data on chemistry and biology

Pharmacological profiles

Proprietary and public data sources.

Page 62: Crowdsourcing Chemistry for the Community – 5 Years of Experiences
Page 63: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

How will it improve?

Participation and contribution

Page 64: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Conclusions

For chemistry, crowdsourced deposition, annotation, and curation work, but engagement has been low to date

Primary challenge – engaging the community to help create what they want. Rewards and recognition?

MORE collaboration can benefit us all

Indicators are good for small but continued growth

Page 65: Crowdsourcing Chemistry for the Community – 5 Years of Experiences

Thank you

Email: [email protected]

Twitter: ChemConnector

Personal Blog: www.chemconnector.com

SLIDES: www.slideshare.net/AntonyWilliams

