Transcript
Page 1: ChemInfo 2011 class1

Chemical Information Retrieval 2011

Jean-Claude Bradley

September 23, 2011

First Class

Associate Professor of Chemistry, Drexel University

CHEM367/767 Drexel University

Page 2: ChemInfo 2011 class1

Finding reliable chemical information

can be really hard

Page 3: ChemInfo 2011 class1

After this class, you should feel that

you can never blindly trust

chemical data sources again

Page 4: ChemInfo 2011 class1

But… you will learn how to do the best you can

with imperfect information

Page 5: ChemInfo 2011 class1

The Chemical Information Validation Sheet

567 curated and referenced measurements from the Fall 2010 Chemical Information Retrieval course

Page 6: ChemInfo 2011 class1

Discovering outliers for melting points (stdev/average)
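
The stdev/average screen named on this slide can be sketched in a few lines. This is a minimal illustration, not the course's actual spreadsheet formula: the compound names, the sample values, the 0.05 threshold, and the function names are all hypothetical, and taking the absolute value of the average (so that sub-zero Celsius averages do not flip the sign) is an added assumption.

```python
from statistics import mean, stdev

# Hypothetical melting points in C, keyed by compound (illustrative values only).
reported = {
    "compound A": [80.0, 81.0, 80.5],
    "compound B": [40.0, 95.0, 41.0],
}


def relative_spread(values):
    """Return stdev/average for a list of measurements, or None if undefined."""
    if len(values) < 2:
        return None  # a single source cannot show a spread
    avg = mean(values)
    if avg == 0:
        return None  # ratio is undefined when the average is zero
    # Assumption: divide by |average| so negative Celsius averages stay comparable.
    return stdev(values) / abs(avg)


def flag_outliers(data, threshold=0.05):
    """List (compound, ratio) pairs whose spread exceeds the threshold, largest first."""
    flagged = []
    for name, values in data.items():
        ratio = relative_spread(values)
        if ratio is not None and ratio > threshold:
            flagged.append((name, ratio))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, ratio in flag_outliers(reported):
        print(f"{name}: stdev/average = {ratio:.2f}")
```

A compound flagged this way is only a candidate for investigation; the ratio says the sources disagree, not which source is wrong.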

Page 7: ChemInfo 2011 class1

Investigating the m.p. inconsistencies of EGCG

Page 8: ChemInfo 2011 class1

Investigating the m.p. inconsistencies of cyclohexanone

Page 9: ChemInfo 2011 class1

Most popular data sources

Page 10: ChemInfo 2011 class1

Alfa Aesar donates melting points to the public

Page 11: ChemInfo 2011 class1

Open Melting Point Explorer

(Andrew Lang)

Page 12: ChemInfo 2011 class1

Outliers

MDPI dataset

EPI (donated all data to public also)

Page 13: ChemInfo 2011 class1

Outliers for ethanol: Alfa Aesar and Oxford MSDS

Page 14: ChemInfo 2011 class1

Inconsistencies and SMILES problems within MDPI dataset

Page 15: ChemInfo 2011 class1

MDPI Dataset labeled with High Trust Level

Page 16: ChemInfo 2011 class1

Open Melting Point Datasets

Currently 20,000 compounds with Open MPs

Page 17: ChemInfo 2011 class1

What is the melting point of 4-benzyltoluene?

American Petroleum Institute: 5 C
PHYSPROP: -30 C
PHYSPROP: 125 C
peer reviewed journal (2008): 97.5 C
government database: -30 C
government database: 4.58 C

Page 18: ChemInfo 2011 class1

The quest to resolve the melting point of 4-benzyltoluene: liquid at room temp

and can be frozen <-30C

Page 19: ChemInfo 2011 class1

Open Lab Notebook page measuring the melting point of 4-benzyltoluene

Page 20: ChemInfo 2011 class1

Motivation: Faster Science, Better Science

Page 21: ChemInfo 2011 class1

Ruling out all melting points above -15C?

Page 22: ChemInfo 2011 class1

Oops – 4-benzyltoluene freezes after 16 days at -15C!

Page 23: ChemInfo 2011 class1

Measuring the melting point by slowly heating from -15 C gives 5 C
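
The reasoning across these slides can be replayed as a small check of the reported values against the observations. This is a sketch under stated assumptions: the six values are the ones listed on the earlier 4-benzyltoluene slide, while the 20 C figure for room temperature, the 2 C tolerance, and the function names are assumptions introduced here for illustration.

```python
# Reported melting points for 4-benzyltoluene in C, from the earlier slide.
reported = [
    ("American Petroleum Institute", 5.0),
    ("PHYSPROP", -30.0),
    ("PHYSPROP", 125.0),
    ("peer reviewed journal (2008)", 97.5),
    ("government database", -30.0),
    ("government database", 4.58),
]

SOLID_AT = -15.0    # sample froze after 16 days at -15 C
LIQUID_AT = 20.0    # assumption: "room temp" taken as 20 C
MEASURED_MP = 5.0   # slow heating from -15 C gave 5 C
TOLERANCE = 2.0     # assumption: acceptable disagreement in C


def ruled_out(values, solid_at, liquid_at):
    """Values contradicted by observation: at or below a temperature where the
    sample is solid, or at or above a temperature where it is liquid."""
    return [(src, mp) for src, mp in values if mp <= solid_at or mp >= liquid_at]


def consistent(values, measured, tolerance):
    """Values within the tolerance of the measured melting point."""
    return [(src, mp) for src, mp in values if abs(mp - measured) <= tolerance]


if __name__ == "__main__":
    print("Ruled out by observation:")
    for src, mp in ruled_out(reported, SOLID_AT, LIQUID_AT):
        print(f"  {src}: {mp} C")
    print("Consistent with the measurement:")
    for src, mp in consistent(reported, MEASURED_MP, TOLERANCE):
        print(f"  {src}: {mp} C")
```

Note that the bounds depend on how long the sample is observed: before the 16-day result, the same check with "liquid at -15 C" would have wrongly excluded the two values near 5 C.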

Page 24: ChemInfo 2011 class1

There are NO FACTS, only measurements embedded

within assumptions

Open Notebook Science maintains the integrity of data

provenance by making assumptions explicit

Page 25: ChemInfo 2011 class1

Open Random Forest modeling of Open Melting Point data using CDK descriptors

(Andrew Lang)

R² = 0.78, TPSA and nHdon most important

Page 26: ChemInfo 2011 class1

Melting point prediction service

Page 27: ChemInfo 2011 class1

Melting point predictions and measurements on iPhone/iPad (Andrew Lang and Alex Clark)

Page 28: ChemInfo 2011 class1

Using melting point for temperature dependent solubility prediction

Page 29: ChemInfo 2011 class1

Web services for summary data

(Andrew Lang)

Page 30: ChemInfo 2011 class1

Web service calls from within a Google Spreadsheet for solubility measurement

and prediction

(Andrew Lang)

Page 31: ChemInfo 2011 class1

Integration of Multiple Web Services to Recommend Solvents

for Reactions

(Andrew Lang)

Page 32: ChemInfo 2011 class1

Publication of double+ validated melting point dataset to Nature

Precedings and LuLu

Page 33: ChemInfo 2011 class1
Page 34: ChemInfo 2011 class1
Page 35: ChemInfo 2011 class1

Reaction Attempts Book

Page 36: ChemInfo 2011 class1

Reaction Attempts Book: Reactants listed Alphabetically

Page 37: ChemInfo 2011 class1
Page 38: ChemInfo 2011 class1

All ONS web services

Page 39: ChemInfo 2011 class1

Google Apps Scripts web services

Page 40: ChemInfo 2011 class1

Google Apps Scripts for conveniently exploring melting

point data

Page 41: ChemInfo 2011 class1

Straight chain carboxylic acids from 1 to 10 carbons

Straight chain alcohols from 1 to 10 carbons

Comparison of model with triple validated measurements

Page 42: ChemInfo 2011 class1

Cyclic primary amines from 3 to 6 carbons (cyclobutylamine flagged for validation – only single source available)

Page 43: ChemInfo 2011 class1

Google Apps Scripts for planning reactions and creating schemes

Page 44: ChemInfo 2011 class1

Open Melting Points in Supplementary Data Pages of Wikipedia (Martin Walker)

Page 45: ChemInfo 2011 class1

Web services from data collected in this class will be added here

Page 46: ChemInfo 2011 class1

In this class you will learn

How to search Science1.0 resources

•Peer-Reviewed journals
•Commercial databases
•Patents
•Conference Proceedings

Page 47: ChemInfo 2011 class1

In this class you will learn

How to participate in Science2.0

•wikis (Wikipedia, class wiki)
•blogs
•interactive databases (ChemSpider)
•social software (Twitter, FriendFeed)

Page 48: ChemInfo 2011 class1

In this class you will learn

How to leverage Science3.0

(via collaboration with Andrew Lang)

•machine readable web-services

Page 49: ChemInfo 2011 class1

Now let's take a look at the class wiki

