
Identification of “Known Unknowns” Using Accurate Mass Data and Large “Spectraless” Databases

James Little, Eastman Chemical Company

Accurate Mass and Novel Applications of Mass Spectrometry for Unknown Environmental Analyses Symposium

Session No. 1520, 3/14/2012

Origin of “Known Unknowns” Term

“. . . there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know. And if one looks throughout the history of our country and other free countries, it is the latter category that tend to be the difficult ones.”

Donald Rumsfeld, news briefing on weapons of mass destruction in Iraq, Feb. 12, 2002

“Known Unknowns”: unknown to the investigator, but known in the chemical literature, on the internet, or in reference databases

“Unknown Unknowns”: not previously cited in the literature, on the internet, or in reference databases

Non-Targeted Species:

First Approach: Search Computer Mass Spectral Databases

Obtain accurate or nominal mass data by GC-MS or LC-MS

Technique       Commercial Spectra Purchased    Eastman-Created Spectra
EI GC-MS        ~1.1 M                          ~55K
MS/MS LC-MS     ~65K                            ~4K

Search electron ionization (EI) or collision-induced dissociation (“MS/MS”) databases

Usually much more successful for EI than for collision-induced dissociation searches

NIST Search Used for EI and MS/MS Searches

Data Processing:

• Thermo Xcalibur

• Agilent ChemStation

• Agilent MassHunter

• Waters MassLynx

• NIST AMDIS

• Cambridge ChemDraw

NIST Library Search:

Search by spectrum, structure (model compounds), electronic notebook, etc.

All new spectra with structures added to Eastman libraries

Updated and distributed automatically nightly to 40 systems

Search of “Spectraless” Databases: If Not Found in Spectral Search

Obtain accurate mass EI or MS/MS data

Database        No. of Entries
CAS Registry    ~65 M
ChemSpider      ~26 M

Molecular formula (MF) determined using isotopic pattern and other approaches

Search large “spectraless” databases to find candidate structures (refs. 1, 2)

Rank candidates by number of associated references (both databases) or association with key words (CAS only)

Confirm with fragmentation patterns, exchangeable protons, sample history, UV-VIS spectra, relative retention times, purchased standard, etc.
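The ranking step described above can be sketched in a few lines of Python. This is only an illustration: the `Candidate` class, the candidate labels, and the reference counts are hypothetical placeholders, not data from the talk, and a real workflow would pull the counts from CAS Registry or ChemSpider.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str          # placeholder label, not a real compound name
    n_references: int  # number of associated literature references


def rank_by_references(candidates):
    """Return candidates sorted so the most-cited structure comes first."""
    return sorted(candidates, key=lambda c: c.n_references, reverse=True)


# Hypothetical hit list for one molecular formula (illustrative counts only)
hits = [
    Candidate("candidate A", 3),
    Candidate("candidate B", 412),
    Candidate("candidate C", 0),
    Candidate("candidate D", 27),
]

ranked = rank_by_references(hits)
for rank, c in enumerate(ranked, start=1):
    print(f"#{rank}: {c.name} ({c.n_references} references)")
```

The top-ranked candidate is only a starting hypothesis; it still has to be confirmed by the orthogonal checks listed above.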

Search CAS Registry with SciFinder by MF: UV Additive in Polymer

[Flow diagram: CAS Registry (~65,000,000) → Get Substances, C22H29N3O → 2,256 → Sort by Number of References → 2,059 → #1: 1,252 Ref., #2: 20, #3: 19; #1 Correct]

Request to identify UV additive in polymer

Found in LC chromatogram by characteristic UV spectrum

MF by accurate mass and isotope pattern

MW 351
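One simple isotope-pattern check is to compare the observed M+1 intensity with the value predicted for a proposed MF. The sketch below does this for C22H29N3O; the abundance table uses standard natural abundances, and the helper names `parse_formula` and `m_plus_1_ratio` are illustrative, not part of any vendor software mentioned in the talk.

```python
import re

# Natural isotopic abundances: (light isotope, isotope one mass unit heavier)
ABUNDANCE = {
    "C": (0.9893, 0.0107),      # 12C, 13C
    "H": (0.999885, 0.000115),  # 1H, 2H
    "N": (0.99636, 0.00364),    # 14N, 15N
    "O": (0.99757, 0.00038),    # 16O, 17O
}


def parse_formula(formula):
    """Turn a simple formula such as 'C22H29N3O' into {'C': 22, 'H': 29, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def m_plus_1_ratio(formula):
    """Intensity of the M+1 peak relative to the monoisotopic peak."""
    return sum(
        n * ABUNDANCE[el][1] / ABUNDANCE[el][0]
        for el, n in parse_formula(formula).items()
    )


ratio = m_plus_1_ratio("C22H29N3O")
print(f"Predicted (M+1)/M for C22H29N3O: {ratio:.1%}")
```

For this formula the predicted M+1 peak is roughly a quarter of the monoisotopic peak, dominated by the 22 carbons. A measured ratio far from that would argue against the proposed MF.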

Interpretation of MS/MS Spectrum

[Figure: MS/MS spectrum with annotated ions and neutral losses: M+H+, -C5H10, -2 C5H10, C5H11+, -C2H4, -2 C2H4]
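The annotated C5H10 losses can be turned into expected m/z values for comparison with the measured spectrum. The sketch below assumes the candidate formula C22H29N3O from the MF search and assumes the C5H10 losses come from the protonated molecule; the C2H4 losses are left out because the slide text does not say which ion they originate from.

```python
# Monoisotopic atomic masses in Da
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858


def mono_mass(counts):
    """Monoisotopic mass of a neutral species given element counts."""
    return sum(MASS[el] * n for el, n in counts.items())


molecule = {"C": 22, "H": 29, "N": 3, "O": 1}  # C22H29N3O candidate formula
pentene = {"C": 5, "H": 10}                    # neutral loss C5H10

# [M+H]+ : add a hydrogen atom, then remove one electron
mh = mono_mass(molecule) + MASS["H"] - ELECTRON
loss = mono_mass(pentene)

expected = {
    "[M+H]+": mh,
    "[M+H - C5H10]+": mh - loss,
    "[M+H - 2 C5H10]+": mh - 2 * loss,
    "C5H11+": mono_mass({"C": 5, "H": 11}) - ELECTRON,
}
for label, mz in expected.items():
    print(f"{label:18s} m/z {mz:.4f}")
```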

Confirmation of Candidate Structure

CH3CN + C5H11+

UV spectrum characteristic of UV Absorber

Confirmation of Candidate Structure (continued)

[Figure: diode array UV spectrum, AU vs. nm; labeled wavelengths 204, 302, and 340 nm]

One exchangeable proton by deuterium exchange via infusion in solvent mix with minimal D2O

Ultimately confirmed by purchased standard
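The arithmetic behind counting exchangeable protons can be sketched as follows. This is a simplified model, not the procedure used in the talk: it assumes complete H/D exchange and that the charge-carrying proton is also replaced by a deuteron, so a compound with n exchangeable protons shifts by n + 1 mass units. With only a small amount of D2O the real spectrum would show a distribution of partially exchanged species.

```python
# Mass difference between deuterium and protium (Da)
D_MINUS_H = 2.01410178 - 1.00782503


def mz_after_exchange(mz_protonated, n_exchangeable):
    """m/z of [M(d_n)+D]+ given the m/z of [M+H]+.

    Assumes complete exchange and a deuteron as charge carrier,
    i.e. n_exchangeable + 1 substitutions in total.
    """
    return mz_protonated + (n_exchangeable + 1) * D_MINUS_H


def count_exchangeable(mz_protonated, mz_deuterated):
    """Invert the relation above to estimate the number of exchangeable protons."""
    return round((mz_deuterated - mz_protonated) / D_MINUS_H) - 1


mz_h = 352.2383                    # calculated [M+H]+ for C22H29N3O
mz_d = mz_after_exchange(mz_h, 1)  # one exchangeable proton, as reported
print(f"{mz_h:.4f} -> {mz_d:.4f}")
print("exchangeable protons:", count_exchangeable(mz_h, mz_d))
```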

Results for Searching Classes of Compounds by MF in ChemSpider and CAS

MF Search Refined with Key Word for More Obscure Compounds

Identification of greenish yellow species in baby diaper adhesive

[Flow diagram: CAS Registry (~65,000,000) → Get Substances, C30H42O2 → 264 → Get all references for all substances → 482 → Refined all references with “greenish yellow” → 1 Ref.]

[Reaction scheme: “BHT” → (O2, [Ox]) → “BHT dimer”]

Ironically, BHT, included as a stabilizer, led to the color through oxidation

Problem Solved!

ChemSpider Search by Monoisotopic Mass Instead of Molecular Formula (MF)

Problem:

The number of MFs increases dramatically with molecular weight, e.g., at 1000 Da for the eleven most common elements, >350 million MFs (ref. 3)

Often difficult to get a unique MF at molecular weights >600 Da

Approach:

In practice, the number of known compounds decreases with molecular weight

Thus, initially search by monoisotopic mass, not molecular formula, at >600 Da

Then compare isotope abundance of all candidates

Search by Monoisotopic Mass: Antioxidant in Polymer

Identification of additive in polymer

Search ChemSpider by monoisotopic mass

[Flow diagram: ChemSpider (~26,000,000) → Search by 783.520 +/- 15 ppm → 16 → Sort by Number of References → 51 → #1: 29 Ref., #2: 5, #3: 2; #1 Correct]
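A ppm tolerance translates into an absolute mass window that scales with the mass itself. The short sketch below converts the 783.520 +/- 15 ppm search from this example into explicit limits; the function names are illustrative and are not part of the ChemSpider interface.

```python
def ppm_window(mass, ppm):
    """Return the (low, high) mass limits for a +/- ppm tolerance."""
    delta = mass * ppm * 1e-6
    return mass - delta, mass + delta


def within_ppm(measured, candidate, ppm):
    """True if a candidate mass falls inside the window around the measured mass."""
    low, high = ppm_window(measured, ppm)
    return low <= candidate <= high


low, high = ppm_window(783.520, 15)
print(f"Search range: {low:.4f} to {high:.4f} Da")
```

At this mass, 15 ppm corresponds to about +/- 0.012 Da, so every database entry whose monoisotopic mass lies in that range is returned as a candidate regardless of its molecular formula.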

Confirmation of Antioxidant Candidate Structure

Initially confirmed by MS/MS fragmentation pattern

3 exchangeable protons by infusion in solvent mix with D2O

Ultimately confirmed by purchased standard

Twelve Examples from Literature by Monoisotopic Mass vs. MF, Ranked by Number of References

Species             MF                  Monoisotopic Mass   Rank by MF   Rank by Monoisotopic Mass (+/- 5 ppm window)
Moxidectin          C37H53NO8           639.3771            1 of 5       1 of 39
Erythromycin        C37H67NO13          733.4612            1 of 42      1 of 53
Digoxin             C41H64O14           780.4296            1 of 47      1 of 65
Rifampicin          C43H58N4O12         822.4051            1 of 29      1 of 96
Rapamycin           C51H79N1O13         913.5551            1 of 43      1 of 51
Amphotericin B      C47H73N1O17         923.4878            1 of 33      1 of 42
Gramicidin S        C60H92N12O10        1140.7059           1 of 5       1 of 13
Cereulide           C57H96N6O18         1152.6781           1 of 3       2 of 8
Cyclosporin A       C62H111N11O12       1201.8414           1 of 36      1 of 38
Vancomycin          C66H75Cl2N9O24      1447.4302           1 of 24      1 of 26
perfluorotriazine   C30H18N3O6P3F48     1520.9642           1 of 1       1 of 1
Thiostrepton        C72H85N19O18S5      1663.4924           1 of 5       1 of 5
Average                                                     1 of 23      1 of 36
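The monoisotopic masses in the table can be reproduced from the molecular formulas by summing the mass of the most abundant isotope of each element. The sketch below recalculates three of the rows; the simple regular-expression parser is an assumption of this example and only handles flat formulas without parentheses or charges.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element
MONO = {
    "C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146,
    "F": 18.9984032, "P": 30.9737616, "S": 31.9720710, "Cl": 34.9688527,
}


def monoisotopic_mass(formula):
    """Sum monoisotopic atomic masses for a flat formula such as 'C37H53NO8'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO[element] * int(count or 1)
    return total


# Three rows from the table: (species, MF, tabulated monoisotopic mass)
rows = [
    ("Moxidectin", "C37H53NO8", 639.3771),
    ("Vancomycin", "C66H75Cl2N9O24", 1447.4302),
    ("Thiostrepton", "C72H85N19O18S5", 1663.4924),
]
for species, mf, tabulated in rows:
    calc = monoisotopic_mass(mf)
    print(f"{species:13s} {mf:16s} calc {calc:.4f}  table {tabulated:.4f}")
```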

Searching CAS Registry with STN Express by Molecular Weight

CAS Registry only searched by molecular weight, not monoisotopic mass

Only searched with STN Express, a command-based interface

[Figure: [M+H]+ isotope cluster with peaks labeled m/z 442.2844, 443.2863, 444.2911, 445.3012]

Monoisotopic mass = 442.2844 – 1.0073 = 441.2771

$$\text{Molecular weight} = \frac{\sum (m/z \times \text{intensity})}{\sum \text{intensity}} - 1.0074 = 441.600$$

Error for molecular weight measurement normally 4-5 times greater than for monoisotopic mass

=> file registry

=> s 441.57-441.63/mw

Two significant figures to right of decimal
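The molecular weight calculation above is an intensity-weighted mean of the isotope cluster, corrected for the added proton. The sketch below shows the arithmetic and builds a range query in the same form as the STN command on the slide. The m/z values are taken from the slide, but the relative intensities are made-up placeholders because the slide text does not list them, so the computed value will not exactly match the reported 441.600. The +/- 0.03 half-width is inferred from the 441.57-441.63 range shown.

```python
def average_molecular_weight(peaks, h_plus_average=1.0074):
    """Intensity-weighted mean m/z of the [M+H]+ cluster minus the average mass of H+."""
    total_intensity = sum(intensity for _, intensity in peaks)
    centroid = sum(mz * intensity for mz, intensity in peaks) / total_intensity
    return centroid - h_plus_average


def stn_mw_query(mw, half_width=0.03):
    """Build a molecular-weight range search string in the form shown on the slide."""
    return f"s {mw - half_width:.2f}-{mw + half_width:.2f}/mw"


# m/z values from the slide; relative intensities are hypothetical placeholders
cluster = [(442.2844, 100.0), (443.2863, 28.0), (444.2911, 5.0), (445.3012, 1.0)]

mw = average_molecular_weight(cluster)
print(f"average molecular weight: {mw:.3f}")
print(stn_mw_query(441.600))  # uses the value reported on the slide
```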

Comparison of Four Approaches for 90 Test Compounds

Search Approach #1 #2 #3 #4 #5 >#5

CAS Registry/molecular formula 84 4 1 1

ChemSpider/molecular formula 81 4 3 1 1

CAS Registry/average molecular weight 66 13 4 1 3 3

ChemSpider/monoisotopic mass 77 4 4 2 3

Search by molecular formula is the best approach with either CAS Registry or ChemSpider

Monoisotopic mass by ChemSpider very useful for compounds with MW >600

Monoisotopic mass and molecular weight also useful for compounds at lower molecular weight
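To compare the four approaches on a common scale, the #1 counts from the table can be expressed as a percentage of the 90 test compounds. The snippet below is only this arithmetic; the counts are the first value in each table row.

```python
# Number of the 90 test compounds ranked #1 by each approach (from the table)
TOTAL = 90
ranked_first = {
    "CAS Registry/molecular formula": 84,
    "ChemSpider/molecular formula": 81,
    "CAS Registry/average molecular weight": 66,
    "ChemSpider/monoisotopic mass": 77,
}

percent_first = {k: 100.0 * v / TOTAL for k, v in ranked_first.items()}
for approach, pct in sorted(percent_first.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{approach:40s} {pct:5.1f}% ranked #1")
```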

Summary of ChemSpider and CAS Registry Capabilities

ChemSpider

Pros:
-free via internet
-good user interface
-automation by instrument manufacturer using Web API (Application Programming Interface)
-ability to search by monoisotopic mass

Cons:
-smaller No. of entries (~26 M) and references
-can’t refine by key word

CAS Registry with SciFinder or STN Express

Pros:
-larger No. of entries (~65 M) and references
-refine by key word
-good SciFinder user interface for MF searches on internet

Cons:
-fee charged
-no API available for instrument manufacturer automation
-no ability to search by monoisotopic mass, only MW with complicated STN Express interface

Conclusions and Future Plans

Computer searches of mass spectral databases easiest approach for identifying “known unknowns”

Automation of latter approach needed for more complex samples

Searching “spectraless” databases very powerful alternative approach

Attempting to persuade CAS to add monoisotopic mass

Ranking and visual comparison of computer-generated CID spectra of candidate structures to observed spectrum

References

1. “Identification of ‘known unknowns’ utilizing accurate mass data and ChemSpider,” J. Little, A. Williams, A. Pshenichnov, V. Tkachenko, J. Am. Soc. Mass Spectrom. 2012, Vol. 23, No. 1, p 179-185.

2. “Identification of ‘known unknowns’ utilizing accurate mass data and Chemical Abstracts Service databases,” J. Little, C. Cleven, S. Brown, J. Am. Soc. Mass Spectrom. 2011, Vol. 22, No. 2, p 348-359.

3. “Metabolomic database annotations via query of elemental compositions: mass accuracy is insufficient even at less than 1 ppm,” T. Kind, O. Fiehn, BMC Bioinformatics 2006, 7:234.

Additional Information on Internet with Screenshots

Search “Little Mass Spec,” top hit in Google

Acknowledgements

NIST: Steve Stein, David Sparkman, Dmitrii Tchekhovskoi, Anzor Mikaya

ChemSpider: Tony Williams, Alexey Pshenichnov, Valery Tkachenko

Eastman: Bill Tindall, Kent Morrill, Curt Cleven, Adam Howard, Jean Coffman, Mike Ramsey, Sen Li

ETSU School of Pharmacy: Stacy Brown

CAS: Anthony Machosky

Waters: Jim Lekander

Agilent Technologies: Mike Scott

Art Work for Journal Cover: Minta Fannon
