Date post: 15-Jan-2016
PHAR 201 Lecture 3 2012 1

Know the Limitations of your Data – X-ray, NMR, EM

PHAR 201/Bioinformatics I

Philip E. Bourne

SSPPS, UCSD

Prerequisite Reading: Structural Bioinformatics Chapters 4-6

When You Grab a PDB File, What Are You Starting With?

[Figure: PDB data processing workflow in four steps (Step 1 to Step 4). Diagram labels: Depositor; Deposit, Annotate, Validate; Validation Report; Corrections; Depositor Approval; PDB ID; PDB Entry; Core DB; Archival Data; Distribution Site.]

Data Views

• Depositor/Annotator

• Type of experiment: X-ray, NMR, EM

• Type of molecule: protein, nucleic acid, or protein-nucleic acid complex

Annotation

• Resolve nomenclature and format problems

• Add missing required data items

• Add higher level classifications

• Review validation report and summary letter to the depositor

• Produce and check final mmCIF and PDB files

• Update status and load database

• Check data consistency across archive

Annotation – More Specifics

• Make sure entry is complete (mandatory items from mmCIF dictionary)

• Format exchange

– Converts between PDB and mmCIF formats

– Recognizes most variants of PDB format

• Check nomenclature

– Residue

– Polymer atoms

– Hydrogen atoms

– Ligand atoms

Validation

• Covalent geometry

– Comparison with standard values (Engh and Huber [1]; Clowney et al. [2]; Gelbin et al. [3])

– Identify outliers

• Stereochemistry – check chiral centers

• Close contacts in asymmetric unit and unit cell

• Occupancy

• Sequence in SEQRES and coordinates

• Distant waters

• Experimental (SFCHECK [4])

[1] R.A. Engh & R. Huber. Acta Cryst. A47 (1991): 392-400
[2] L. Clowney et al. J. Am. Chem. Soc. 118 (1996): 509-518
[3] A. Gelbin et al. J. Am. Chem. Soc. 118 (1996): 519-529
[4] A.A. Vaguine, J. Richelle, and S.J. Wodak. Acta Cryst. D55 (1999): 191-205
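The covalent geometry check listed on this slide can be pictured as a z-score test: measure each bond, compare it with a reference mean and standard deviation, and flag large deviations. The sketch below is only an illustration under stated assumptions. The reference values are approximate figures in the style of Engh and Huber and should be checked against the paper, the 4-sigma cutoff is an arbitrary choice, and the function names are hypothetical; this is not the PDB's validation code.

```python
import math

# ASSUMPTION: approximate (mean, sigma) bond lengths in Angstroms for the
# peptide backbone, in the style of Engh & Huber; verify before real use.
REFERENCE_BONDS = {
    ("N", "CA"): (1.458, 0.019),
    ("CA", "C"): (1.525, 0.021),
    ("C", "O"): (1.231, 0.020),
    ("C", "N"): (1.329, 0.014),
}


def bond_z_score(name1, name2, xyz1, xyz2):
    """Return (length, z) for a bond, using the reference table above."""
    mean, sigma = REFERENCE_BONDS[(name1, name2)]
    length = math.dist(xyz1, xyz2)  # Euclidean distance, Python 3.8+
    return length, (length - mean) / sigma


def find_outliers(bonds, cutoff=4.0):
    """Flag bonds whose length deviates by more than `cutoff` sigmas.

    `bonds` is a list of (name1, name2, xyz1, xyz2) tuples.
    """
    flagged = []
    for name1, name2, xyz1, xyz2 in bonds:
        length, z = bond_z_score(name1, name2, xyz1, xyz2)
        if abs(z) > cutoff:
            flagged.append((name1, name2, length, z))
    return flagged


good = ("N", "CA", (0.0, 0.0, 0.0), (1.458, 0.0, 0.0))
stretched = ("C", "O", (0.0, 0.0, 0.0), (1.431, 0.0, 0.0))
for name1, name2, length, z in find_outliers([good, stretched]):
    print(f"{name1}-{name2}: {length:.3f} A, z = {z:.1f}")
```

A real validator would also cover bond angles, chirality and close contacts, and would use residue-specific reference values.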

The process by which biological data in a database are annotated and validated changes over time – this introduces a temporal inconsistency

Summary Thus Far

• The biocurators (annotators) are the unsung heroes of modern biology

– International Society for Biocuration

• As a resource developer - start right and the need for data remediation in years to come will be less likely

• As a resource user - be aware of the process used to provide the data and hence the limitations of the data you are using

P.E. Bourne and J. McEntyre 2006 Biocurators: Contributors to the World of Science. PLoS Comp. Biol. (Editorial) 2(10): e142

The quality of the data you use in a bioinformatics experiment is a function of the method used to collect these data – understand the method

[Figure: chart dated "As of Oct 5, 2011"; surviving label: EM 254]

X-ray Crystallography

• Oldest technique

• Majority of the depositions

• A number of Nobel prizes

• International Union of Crystallography (IUCr) .. Acta ..

• Method based on scattering from electrons – hydrogen atoms usually not seen (sometimes modeled in)

• In fact, modeling hydrogens in is itself an issue

• Atoms of similar atomic number (electron count) are not distinguishable, e.g. O, N, C

• Influence of crystal packing, e.g. malate dehydrogenase (4MDH)

• Environment in crystal highly aqueous

• Produces similar structures to NMR, e.g. thioredoxin (3TRX vs 1SRX)

The X-ray Crystallography Pipeline – Basic Steps

• Target Selection

• Crystallomics: isolation, expression, purification, crystallization

• Data Collection

• Structure Solution

• Structure Refinement

• Functional Annotation

• Publish

Limitations - Crystallization

• Crystallization:

– Non-soluble

– Twinning

– Micro heterogeneity

– Disorder

Limitations – Data Collection

Limitations - Refinement

Limitations – Map Fitting

• In an intricate study the only way to be sure that the work is correct is to make your own judgment from the electron density – this is never done.

• It can be done at http://eds.bmc.uu.se/eds/

• It requires that the experimental data (the structure factors) be available

[Figure: example from PDB entry 100D]

Limitations – Non-crystallographic Symmetry (NCS)

Limitations – Refinement

• Introduces restraints/constraints that may or may not be realistic

• Water has been used unnecessarily

• Resolution quoted wrongly

• Standards have helped

• See for example: H. Weissig and P.E. Bourne 1999 Bioinformatics 15(10) 807-831. An Analysis of the Protein Data Bank in Search of Temporal and Global Trends

Limitations – Interpretation of the Biologically Active Molecule

http://www.pdb.org/pdb/101/static101.do?p=education_discussion/Looking-at-Structures/bioassembly_tutorial.html

1QQP

Limitations – Functional Annotation

• Functional annotation is ONLY in the publication, NOT in the PDB

• Attempt to address this with GO assignments

• Attempt to address this with literature integration

• Structural genomics – function unknown

• One structure – one to many functions (power law)

– functions may be unrecognized since the PDB is relatively static

• Many efforts at functional annotation

Why Is Understanding Limitations Important?

• Later we will study reductionism – a key process in the use of biological data

• As a result of reductionism you will need to choose a representative structure for the task at hand

• Understanding the limitations of the experiment will help us do this

Summary of Important Features in using Structure Data Determined by X-ray Crystallography

• Resolution is a key indicator – think about it relative to atomic resolution, i.e. 1.54 Å for a C-C single bond

• Disorder (i.e. undetermined or alternative atomic coordinates) is a natural part of many structures

• R factor (all) describes the agreement of the model with the experimental data. It should be lower (better) than 0.20 (Rfree lower than 0.26)
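As a reminder of what the R factor measures, the conventional definition is R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, summed over reflections. Rfree is the same quantity computed on a small set of reflections held out of refinement. The sketch below is a minimal illustration with made-up amplitudes. It assumes the calculated amplitudes are already on the same scale as the observed ones, and the function name is hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R factor for paired structure factor amplitudes.

    ASSUMPTION: f_calc has already been scaled to f_obs; refinement
    programs normally determine that scale factor themselves.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Made-up amplitudes, for illustration only.
observed = [100.0, 200.0, 300.0]
calculated = [90.0, 210.0, 300.0]
print(f"R = {r_factor(observed, calculated):.3f}")
```

Passing only the held-out reflections to the same function gives Rfree. A large gap between R and Rfree is a common warning sign of over-fitting.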

Summary of Important Features in using Structure Data Determined by X-ray Crystallography Cont.

• B (aka temperature) factors offer indicators both to the accuracy of a structure and the most mobile regions

• At right is 5EBX drawn with QuickPDB
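One way to act on the B-factor point is to read the temperature factor column directly from the ATOM records of a PDB-format file and average it per residue. The sketch below assumes the standard fixed-column PDB layout (chain ID in column 22, residue number in columns 23-26, B factor in columns 61-66). The helper that builds sample records and the sample values are hypothetical, chosen only so the example is self-contained.

```python
from collections import defaultdict


def make_atom_line(serial, name, resname, chain, resseq, x, y, z, occ, b):
    """Build a fixed-column ATOM record (hypothetical helper for the demo)."""
    return (f"ATOM  {serial:5d}  {name:<3s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}")


def mean_b_by_residue(lines):
    """Average B factor per (chain, residue number, residue name)."""
    totals = defaultdict(lambda: [0.0, 0])
    for line in lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        key = (line[21], int(line[22:26]), line[17:20].strip())
        totals[key][0] += float(line[60:66])  # B factor, columns 61-66
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def most_mobile(lines, top=3):
    """Residues with the highest mean B factor, highest first."""
    means = mean_b_by_residue(lines)
    return sorted(means, key=means.get, reverse=True)[:top]


# Made-up records, not taken from 5EBX or any real entry.
sample = [
    make_atom_line(1, "N", "GLY", "A", 1, 0.0, 0.0, 0.0, 1.0, 10.0),
    make_atom_line(2, "CA", "GLY", "A", 1, 1.5, 0.0, 0.0, 1.0, 12.0),
    make_atom_line(3, "N", "LYS", "A", 2, 3.0, 0.0, 0.0, 1.0, 40.0),
    make_atom_line(4, "CA", "LYS", "A", 2, 4.5, 0.0, 0.0, 1.0, 44.0),
]
print(most_mobile(sample, top=1))
```

High values can reflect genuine mobility or simply a poorly determined region, so they are best read together with resolution and occupancy.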

NMR

Features of NMR

• Limited in size (25-100 kDa) – provided labeled samples are obtainable

• Selected information on proteins to ~150 kDa

• Solution study – small sample needed for soluble proteins

• Only a few solid state studies

• Reveals hydrogen positions

• Leads to an ensemble of dynamical structures – these are rarely used in bioinformatics studies

• Useful in high throughput screens to determine protein-ligand interactions

• Used for phasing of X-ray structures, i.e. the methods are synergistic

• Until recently not applicable to membrane proteins

NMR - Methodology

• Molecules are tumbling and vibrating with thermal motion

• Usually labeled with 1H, 13C, 15N, 31P – in an external magnetic field these nuclei have two spin states, one aligned with and one opposed to the external magnetic field

• Detects and assigns chemical shifts of atomic nuclei with non-zero spin

• The shifts depend on their electronic environments, i.e. identities and distances of nearby atoms

• The system can be tuned to look at specific features of the characteristic spin moments

• 1H-1H interactions provide NOE constraints

• Better resolution is obtained when the molecule is tumbling fast – size slows this – offset by higher magnetic field strengths

• Protein must be soluble at high concentration and stable without aggregation – high throughput can show this and folded vs unfolded very quickly

NMR – Methodology cont.

• Result is a set of distance constraints between pairs of atoms either bonded or non-bonded

• If there are sufficient constraints then an ensemble of possibilities results

• Often this ensemble is averaged and constraints adjusted to conform to normal bond lengths and distances

• Usually left with 15-30 members of the ensemble

• Ideally less than 1 Å RMSD between models (backbone only)

• Portions of the molecule with high motion have tell-tale signals, e.g. apo calmodulin
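The "less than 1 Å RMSD between models" guideline can be checked by computing the RMSD over matching backbone atoms for every pair of ensemble members. The sketch below assumes the models are already superimposed and list the same atoms in the same order, so no fitting is performed. A full analysis would first apply an optimal superposition such as the Kabsch algorithm. The coordinates are made up and the function names are hypothetical.

```python
import math
from itertools import combinations


def rmsd(model_a, model_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    ASSUMPTION: the models are pre-aligned; no superposition is done here.
    """
    if len(model_a) != len(model_b):
        raise ValueError("models must contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(model_a, model_b)
    )
    return math.sqrt(squared / len(model_a))


def mean_pairwise_rmsd(models):
    """Average RMSD over all pairs of ensemble members."""
    pairs = list(combinations(models, 2))
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


# Three made-up two-atom "models"; model_2 is shifted by 1 Angstrom in z.
model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
model_3 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(f"{mean_pairwise_rmsd([model_1, model_2, model_3]):.3f}")
```

Computing the same quantity per residue rather than per model would highlight the high-motion regions mentioned in the last bullet.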

BMRB - http://www.bmrb.wisc.edu/

NMR Terms

• COSY/NOESY spectra: Allow the spatial interactions between atoms to be measured, from which a 3D structure of the protein is generated (what we have discussed).

• TROSY (Transverse Relaxation Optimized Spectroscopy): Invented about 1997. First described by Professor Kurt Wüthrich. Useful for analyzing larger protein systems. TROSY is a method for getting sharper peaks on large proteins. TROSY is best at higher fields. If the aim is to study a large complex or a chemical shift perturbation when a protein binds to a receptor using NMR, it is better to use a 900 MHz machine than a more standard lower-field machine.

• Solid state NMR: Requires wider-bore (63 or even 89 mm diameter) magnets than solution state NMR. The higher stored energy of these wide-bore magnets means that they are significantly more difficult to build, and as a result high-field solid state NMR lags behind liquid state in terms of available field strength.

• Multidimensional (three- and four-dimensional) NMR: Introduced about 12-15 years ago. This technology has the advantage of resolving the severe overlap in 2D spectra.
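Spectrometers are named by their 1H resonance frequency, which is tied to field strength through the Larmor relation ν = (γ/2π)·B. The short sketch below converts the "900 MHz" figure into tesla and gives the corresponding frequencies of the other nuclei mentioned in this lecture. The γ/2π values are rounded magnitudes in MHz per tesla, and the function names are hypothetical.

```python
# Rounded magnitudes of gamma / (2 * pi) in MHz per tesla.
GAMMA_OVER_2PI = {
    "1H": 42.577,
    "13C": 10.708,
    "15N": 4.316,
    "31P": 17.235,
}


def field_for_proton_frequency(proton_mhz):
    """Magnetic field (tesla) of a spectrometer named by its 1H frequency."""
    return proton_mhz / GAMMA_OVER_2PI["1H"]


def larmor_frequency(nucleus, field_tesla):
    """Resonance frequency (MHz) of a nucleus at the given field."""
    return GAMMA_OVER_2PI[nucleus] * field_tesla


field = field_for_proton_frequency(900.0)
print(f"900 MHz corresponds to about {field:.1f} T")
for nucleus in ("13C", "15N", "31P"):
    print(f"{nucleus}: {larmor_frequency(nucleus, field):.1f} MHz")
```

This is the sense in which a 900 MHz instrument is a higher-field machine than, say, a 500 or 600 MHz one.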

In both X-ray crystallography and NMR there is the danger that the final structure reflects the model it was computed against

Additional Validation Checks

• Stereochemical quality

– Ramachandran plot outliers

– Dihedrals, bond lengths and angles

– Fold Deviation Score (FDS)

– Validation Server http://deposit.rcsb.org/validate/

Use the PDB Geometry Data

Electron Microscopy

• Able to look at large molecular assemblies

• Resolution now 30 Å to below 4 Å

• Cryo-EM preserves aqueous environment (no staining)

• Experimentally more tractable

• Can resolve images (direct measurement of phases) or diffraction patterns

• Can provide a 3D volumetric reconstruction

• Suitable for the study of membrane proteins, e.g. bacteriorhodopsin (1990)

[Figure: 1KVP: STRUCTURAL ANALYSIS OF THE SPIROPLASMA VIRUS, SPV4, IMPLICATIONS FOR EVOLUTIONARY VARIATION TO OBTAIN HOST DIVERSITY AMONG THE MICROVIRIDAE]

1P85 Real space refined coordinates of the 50S subunit fitted into the low resolution cryo-EM map of the EF-G.GTP state of E. coli 70S ribosome

• Single particle reconstruction – multiple orientations of the same particle found in the specimen (viruses, ribosome…)

• Electron tomography – 3D reconstruction of a single particle (organelles, whole cells)

Example EM Result

• Example for a hybrid study that combines elements of electron crystallography and helical reconstruction with homology modeling and molecular docking approaches in order to elucidate the structure of an actin-fimbrin crosslink (Volkmann et al., 2001b). Fimbrin is a member of a large superfamily of actin-binding proteins and is responsible for crosslinking of actin filaments into ordered, tightly packed networks such as actin bundles in microvilli or stereocilia of the inner ear. The diffraction patterns of ordered paracrystalline actin-fimbrin arrays (background) were used to deduce the spatial relationship between the actin filaments (white surface representation) and the various domains of the crosslinker (the two actin-binding domains of fimbrin are pink and blue, the regulatory domain cyan). Combination of this data with homology modeling and data from docking the crystal structure of fimbrin’s N-terminal actin-binding domain into helical reconstructions (Hanein et al., 1998), allowed us to build a complete atomic model of the crosslinking molecule (foreground, color scheme as in surface representation of the array).

• From Structural Bioinformatics 2005 p124

Example EM Result

• Example for a combination of high-resolution structural information from X-ray crystallography and medium-resolution information from electron cryomicroscopy (here 2.1 nm). Actin and myosin were docked into helical reconstructions of actin decorated with smooth-muscle myosin (Volkmann et al., 2000). Interaction of myosin with filamentous actin has been implicated in a variety of biological activities including muscle contraction, cytokinesis, cell movement, membrane transport, and certain signal transduction pathways. Attempts to crystallize actomyosin failed due to the tendency of actin to polymerize. Docking was performed using a global search with a density correlation measure (Volkmann and Hanein, 1999). The estimated accuracy of the fit is 0.22 nm in the myosin portion and 0.18 nm in the actin portion. One actin molecule is shown on the left as a molecular surface representation. The yellow area denotes the largest hydrophobic patch on the exposed surface of the filament, a region expected to participate in actomyosin interactions. The fitted atomic model of myosin is shown on the right. The transparent envelope represents the density corresponding to myosin in the 3D reconstruction. The solution set concept (see text) was used to evaluate the results and to assign probabilities for residues to take part in the interaction. The tone of red on the myosin model is proportional to this statistically evaluated probability (the more red, the higher the probability).

• From Structural Bioinformatics 2005 p127

Small-angle X-ray Scattering (SAXS)

• Reveals shape and size of macromolecules in the range 5-25nm

• Handles partially ordered systems

• No need for crystalline sample; larger molecules than NMR, but at lower resolution

• Leading to hybrid techniques

http://en.wikipedia.org/wiki/Small-angle_X-ray_scattering

Summary Regarding Data Limitations

• Pay attention to the method, its pluses and minuses

• Be aware of models

• Be aware of the general limitations of each method

• For NMR be aware of an ensemble of structures

• Be aware of hybrid models

• For all methods be aware of the parameters that govern the accuracy

• You will need to know these limitations for just about any bioinformatics study since it will be necessary to choose a non-redundant (NR) set – we will visit ASTRAL and PISCES, which are tools for defining an NR set
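To make the idea of a non-redundant set concrete, the sketch below shows one simple greedy strategy: rank chains by resolution, best first, and keep a chain only if its sequence identity to every chain already kept is at or below a cutoff. This is an illustration only and is not a description of how ASTRAL or PISCES actually work. The chain identifiers, resolutions, identities, the 30% cutoff and the function name are all hypothetical.

```python
def select_non_redundant(resolutions, identities, max_identity=0.30):
    """Greedy non-redundant selection (illustrative, not a tool's algorithm).

    resolutions: dict of chain id -> resolution in Angstroms (lower is better)
    identities:  dict of frozenset({id1, id2}) -> fractional sequence identity;
                 pairs missing from the dict are treated as unrelated (0.0)
    """
    chosen = []
    # Best (lowest) resolution first; the id breaks ties deterministically.
    for chain in sorted(resolutions, key=lambda c: (resolutions[c], c)):
        redundant = any(
            identities.get(frozenset((chain, kept)), 0.0) > max_identity
            for kept in chosen
        )
        if not redundant:
            chosen.append(chain)
    return chosen


# Hypothetical chains and values, for illustration only.
resolutions = {"chainA": 1.5, "chainB": 2.8, "chainC": 2.0}
identities = {
    frozenset(("chainA", "chainB")): 0.95,
    frozenset(("chainA", "chainC")): 0.10,
    frozenset(("chainB", "chainC")): 0.12,
}
print(select_non_redundant(resolutions, identities))
```

Ranking by resolution alone is the simplest choice. The other quality indicators covered in this lecture, such as R factor or experimental method, could be folded into the sort key in the same way.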

