Find a Homolog in Protein Structure Database?

Post on 13-Jan-2016

Find a Homolog in Protein Structure Database?

YES: Homology Modeling

NO: Secondary Structure Prediction

Homology Modeling from Swiss-Model

Malate dehydrogenase (14MDH) sequence

SEPIRVLVTGAAGQIAYSLLYSIGNGSVFGKDQPIILVLLDITPMMGVLD

GVLMELQDCALPLLKDVIATDKEEIAFKDLDVAILVGSMPRRDGMERKDL

LKANVKIFKCQGAALDKYAKKSVKVIVVGNPANTNCLTASKSAPSIPKEN

FSCLTRLDHNRAKAQIALKLGVTSDDVKNVIIWGNHSSTQYPDVNHAKVK

LQAKEVGVYEAVKDDSWLKGEFITTVQQRGAAVIKARKLSSAMSAAKAIC

DHVRDIWFGTPEGEFVSMGIISDGNSYGVPDDLLYSFPVTIKDKTWKIVE

GLPINDFSREKMDLTAKELAEEKETAFEFLSSA

Sequences Producing High-scoring Segment Pairs:

            High    Smallest Poisson
Sequence    Score   Probability P(N)    N
14MDH       1692    2.9e-230            1
11BMD        610    1.9e-81             1
21BDM        604    1.2e-80             1
11BDM        295    1.2e-70             4
11LLC         68    0.0012              3
15LDH         79    0.0014              1

Finding Appropriate Template from Structure Database

11BMD: malate dehydrogenase from Thermus flavus
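The template choice above can be sketched in a few lines of Python. This is only an illustration, assuming the template is the hit with the smallest P(N) once the query's own structure is excluded. The `HITS` list transcribes the search results listed above, and `pick_template` is a hypothetical helper, not part of BLAST or Swiss-Model.

```python
# Search hits transcribed from the results above:
# (identifier, high score, smallest Poisson probability P(N), N)
HITS = [
    ("14MDH", 1692, 2.9e-230, 1),
    ("11BMD", 610, 1.9e-81, 1),
    ("21BDM", 604, 1.2e-80, 1),
    ("11BDM", 295, 1.2e-70, 4),
    ("11LLC", 68, 0.0012, 3),
    ("15LDH", 79, 0.0014, 1),
]


def pick_template(hits, query_id):
    """Return the hit with the smallest P(N), skipping the query itself.

    Assumption: a lower P(N) means a better template. A real selection
    would also weigh resolution, coverage and bound ligands.
    """
    candidates = [hit for hit in hits if hit[0] != query_id]
    return min(candidates, key=lambda hit: hit[2])


template = pick_template(HITS, "14MDH")
print(template[0])  # 11BMD
```

The query's own entry is excluded because a model built on the known structure of the target would not test the method.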

Using Magic Fit to Align Two Sequences

14MDH   1 SEPIRVLVTG AAGQIAYSLL YSIGNGSVFG KDQPIILVLL DITPMMGVLD
11BMD   1 KAPVRVAVTG AAGQIGYSLL FRIAAGEMLG KDQPVILQLL EIPQAMKALE

14MDH  51 GVLMELQDCA LPLLKDVIAT DKEEIAFKDL DVAILVGSMP RRDGMERKDL
11BMD  51 GVVMELEDCA FPLLAGLEAT DDPDVAFKDA DYALLVGAAP RKAGMERRDL

14MDH 101 LKANVKIFKC QGAALDKYAK KSVKVIVVGN PANTNCLTAS KSAPSIPKEN
11BMD 101 LQVNGKIFTE QGRALAEVAK KDVKVLVVGN PANTNALIAY KNAPGLNPRN

14MDH 151 FSCLTRLDHN RAKAQIALKL GVTSDDVKNV IIWGNHSSTQ YPDVNHAKVK
11BMD 151 FTAMTRLDHN RAKAQLAKKT GTGVDRIRRM TVWGNHSSTM FPDLFHAEVD

14MDH 201 LQAKEVGVYE AVKDDSWLKG EFITTVQQRG AAVIKARKLS SAMSAAKAIC
11BMD 201 GRP----ALE LVDME-WYEK VFIPTVAQRG AAIIQARGAS SAASAANAAI

14MDH 251 DHVRDIW-FG TPEGEFVSMG IISDGNSYGV PDDLLYSFPV TIKDKTWKIV
11BMD 246 EHIRD-WALG TPEGDWVSMA VPSQGE-YGI PEGIVYSFPV TAKDGAYRVV

14MDH 300 EGLPINDFSR EKMDLTAKEL AEEKETAFEF LSSA
11BMD 294 EGLEINEFAR KRMEITAQEL LDEMEQVKAL GLI

Length = 326, Score = 610 (278.7 bits), Expect = 1.9e-81, P = 1.9e-81, Identities = 178 / 326 (54.6%)
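The identity figure above is a simple ratio: 178 identical positions out of 326, or 54.6%. As a rough sketch, the same kind of count can be made on any pair of aligned rows. `count_identities` is a hypothetical helper. It assumes both rows have equal length, uses `-` for gaps, treats the spaces between blocks of ten as layout, and skips any column where either row has a gap. The example rows are the first pair (residues 1 to 50) of the alignment above.

```python
def count_identities(row_a, row_b, gap="-"):
    """Return (identical, compared) for two aligned rows.

    Spaces are removed first. Columns with a gap in either row are
    skipped; other tools may count them in the denominator instead.
    """
    a = row_a.replace(" ", "")
    b = row_b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    identical = 0
    compared = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    return identical, compared


# Residues 1-50 of the target (14MDH) and the template (11BMD).
target = "SEPIRVLVTG AAGQIAYSLL YSIGNGSVFG KDQPIILVLL DITPMMGVLD"
template = "KAPVRVAVTG AAGQIGYSLL FRIAAGEMLG KDQPVILQLL EIPQAMKALE"

identical, compared = count_identities(target, template)
print(f"{identical}/{compared} = {100 * identical / compared:.1f}%")

# The reported summary: 178 identities out of 326.
print(f"{100 * 178 / 326:.1f}%")
```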

Modifying Sequence Alignment

14MDH   1 SEPIRVLVTG AAGQIAYSLL YSIGNGSVFG KDQPIILVLL DITPMMGVLD
11BMD   1 KAPVRVAVTG AAGQIGYSLL FRIAAGEMLG KDQPVILQLL EIPQAMKALE

14MDH  51 GVLMELQDCA LPLLKDVIAT DKEEIAFKDL DVAILVGSMP RRDGMERKDL
11BMD  51 GVVMELEDCA FPLLAGLEAT DDPDVAFKDA DYALLVGAAP RKAGMERRDL

14MDH 101 LKANVKIFKC QGAALDKYAK KSVKVIVVGN PANTNCLTAS KSAPSIPKEN
11BMD 101 LQVNGKIFTE QGRALAEVAK KDVKVLVVGN PANTNALIAY KNAPGLNPRN

14MDH 151 FSCLTRLDHN RAKAQIALKL GVTSDDVKNV IIWGNHSSTQ YPDVNHAKVK
11BMD 151 FTAMTRLDHN RAKAQLAKKT GTGVDRIRRM TVWGNHSSTM FPDLFHAEVD

14MDH 201 LQAKEVGVYE AVKDDSWLKG EFITTVQQRG AAVIKARKLS SAMSAAKAIC
11BMD 201 GRP----ALE LVDME-WYEK VFIPTVAQRG AAIIQARGAS SAASAANAAI

14MDH 251 DHVRDIWFGT PEGEFVSMGI ISDGNSYGVP DDLLYSFPVT IKDKTWKIVE
11BMD 246 EHIRDWALGT PEGDWVSMAV PSQGE-YGIP EGIVYSFPVT AKDGAYRVVE

14MDH 301 GLPINDFSRE KMDLTAKELA EEKETAFEFL SSA
11BMD 295 GLEINEFARK RMEITAQELL DEMEQVKALG LI

ATOM    1  C    ACE A   0    11.590   2.938  35.017  1.00  45.90  14B    5
ATOM    2  O    ACE A   0    12.581   2.371  35.517  1.00  28.75  14B    6
ATOM    3  CH3  ACE A   0    10.179   2.477  35.417  1.00  36.75  14B    7
ATOM    4  N    SER A   1    11.648   3.946  34.081  1.00  49.10  14   341
ATOM    5  CA   SER A   1    12.901   4.557  33.573  1.00  52.42  14   342
ATOM    6  C    SER A   1    12.733   5.624  32.482  1.00  48.48  14   343
ATOM    7  O    SER A   1    13.238   5.432  31.363  1.00  57.03  14   344
ATOM    8  CB   SER A   1    13.990   3.553  33.162  1.00  41.45  14   345
ATOM    9  OG   SER A   1    15.105   3.679  34.039  1.00  42.59  14   346
ATOM   10  N    GLU A   2    12.073   6.774  32.772  1.00  37.72  14   347
ATOM   11  CA   GLU A   2    11.948   7.788  31.721  1.00  20.88  14   348
ATOM   12  C    GLU A   2    12.042   9.235  32.169  1.00  28.31  14   349

Obtaining Atomic Coordinates of The Model
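As a rough illustration of how such coordinate records can be read, the sketch below splits two of the ATOM lines above on whitespace and measures the distance between the alpha carbons of SER 1 and GLU 2. Splitting on whitespace is an assumption that suits this listing; real PDB files are fixed-column and are better read by column position or with a dedicated parser. `parse_atom` is a hypothetical helper.

```python
import math


def parse_atom(line):
    """Parse one whitespace-separated ATOM line from the listing above.

    Assumed field order: record, serial, atom name, residue name, chain,
    residue number, x, y, z, occupancy, B-factor. Trailing fields are
    ignored.
    """
    fields = line.split()
    if fields[0] != "ATOM":
        raise ValueError("not an ATOM record")
    return {
        "serial": int(fields[1]),
        "atom": fields[2],
        "residue": fields[3],
        "chain": fields[4],
        "resnum": int(fields[5]),
        "xyz": tuple(float(v) for v in fields[6:9]),
        "occupancy": float(fields[9]),
        "bfactor": float(fields[10]),
    }


ca_ser = parse_atom("ATOM 5 CA SER A 1 12.901 4.557 33.573 1.00 52.42 14 342")
ca_glu = parse_atom("ATOM 11 CA GLU A 2 11.948 7.788 31.721 1.00 20.88 14 348")

# Distance between consecutive alpha carbons, in angstroms.
distance = math.dist(ca_ser["xyz"], ca_glu["xyz"])
print(f"{distance:.2f}")
```

The result is close to 3.8 angstroms, the usual spacing between alpha carbons of adjacent residues, which is a quick sanity check on the coordinates.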

Building The Model

The First Model

14MDH

11BMD

Refining The Model

14MDH, 11BMD

The Refined Model

First model; refined model; real 14MDH structure

Models & Real Structure

Yellow: real 14MDH structure; blue: refined model; green: 11BMD (template)

Comparison of Backbone Structures

14MDH; 11BMD (template)

3-D Structure Docked with Substrate

In presence of reduced NAD (NADH)

In presence of oxidized NAD (NAD+)

• Deduces the most likely position of alpha-helices and beta-strands

• Confirms structural or functional relationships when sequence similarity is weak

• Determines guidelines for rational selection of specific mutants for further laboratory study

Secondary Structure Prediction Attributes

Alpha helices have a periodicity of 3.6 residues per turn. In a helix with one face buried in the protein core and the other exposed to solvent, residues at positions i, i+3, i+4 and i+7 therefore lie on the same face of the helix.

Beta strands that are half buried in the protein core tend to have hydrophobic residues at positions i, i+2, i+4, i+6, etc., and polar residues at positions i+1, i+3, i+5, etc.

Beta strands that are completely buried usually contain a run of hydrophobic residues, since both faces are buried in the protein core.
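The position patterns above follow from simple rotation arithmetic, sketched below. A helix with 3.6 residues per turn rotates 360 / 3.6 = 100 degrees per residue, and successive side chains of a beta strand point to opposite sides, taken here as 180 degrees per residue. The function `same_face_offsets` and the 60-degree tolerance that defines "the same face" are assumptions made for this illustration.

```python
def same_face_offsets(degrees_per_residue, max_offset, tolerance=60.0):
    """List offsets k whose side chain points within `tolerance` degrees
    of residue i, given the rotation per residue.

    The tolerance is an illustrative choice, not a standard value.
    """
    offsets = []
    for k in range(max_offset + 1):
        angle = (k * degrees_per_residue) % 360.0
        deviation = min(angle, 360.0 - angle)
        if deviation <= tolerance:
            offsets.append(k)
    return offsets


# Alpha helix: 3.6 residues per turn, so 100 degrees per residue.
print(same_face_offsets(100.0, 7))  # [0, 3, 4, 7]

# Beta strand: side chains alternate sides, 180 degrees per residue.
print(same_face_offsets(180.0, 7))  # [0, 2, 4, 6]
```

The helix result reproduces the i, i+3, i+4, i+7 pattern, and the strand result the alternating i, i+2, i+4, i+6 pattern.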

Other Important Secondary Structures

Loop regions

– Often join combinations of α-helices and β-sheets

– May participate in forming active sites/binding sites

– Usually found on exterior of proteins (H-bond with solvent, H2O)

– Rich in charged and polar hydrophilic residues

– Usually have irregular structure

– Insertions and deletions are most likely to occur in these regions

Hairpin

– Generally 2 to 5 residues long

– 70% are shorter than 7 residues

– Type I: residue 2 is always G

– Type II: residue 1 is always G

Flavodoxin Chain A (FCA) Sequence

KIGLFYGTQTGVTQTIAESIQQEFGGESIVDLNDIANADASDLNA

YDYLIIGCPTWNVGELQSDWEGIYDDLDSVNFQGKKVAYFGAG

DQVGYSDNFQDAMGILEEKISSLGSQTVGYWPIEGYDFNESKAV

RNNQFVGLAIDEDNQPDLTKNRIKTWVSQLKSEFGL

Example for Secondary Structure Prediction

  1 AKIGLFYGTQ TGVTQTIAES IQQEFGGESI VDLNDIANAD ASDLNAYDYL
      EEEEEE S SSHHHHHHHH HHHHHTTTTT EEEEEGGGTT GGGGGGSEE

 51 IIGCPTWNVG ELQSDWEGIY DDLDSVNFQG KKVAYFGAGD QVGYSDNFQD
    EEEE EETTT EE HHHHHHH GGGGGS  TT  EEEEEEE   TTTTTTTTTH

101 AMGILEEKIS SLGSQTVGYW PIEGYDFNES KAVRNNQFVG LAIDEDNQPD
    HHHHHHHHHH HTT EE   E ESTT   S    TTEETTEESS EEE TTTTHH

151 LTKNRIKTWV SQLKSEFGL
    HHTHHHHHHH HHHH HHTTT

FCA Secondary Structure

The assignments are:

• Helix: H = helix; G = 3-10 helix; I = pi helix

• Beta: B = residue in isolated beta bridge; E = extended beta strand

• Turns and Bends: T = hydrogen-bonded turn; S = bend
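To show how the one-letter assignments are read, the sketch below collapses a per-residue assignment string into runs of the same code. It is a minimal illustration: `segments` and `CODES` are hypothetical names, a space is taken to mean "no assignment", and the example string is the ten-residue block shown for residues 51 to 60 of FCA.

```python
from itertools import groupby

# One-letter codes from the legend above; a space means no assignment.
CODES = {
    "H": "helix",
    "G": "3-10 helix",
    "I": "pi helix",
    "B": "residue in isolated beta bridge",
    "E": "extended beta strand",
    "T": "hydrogen-bonded turn",
    "S": "bend",
    " ": "no assignment",
}


def segments(assignment, first_residue=1):
    """Collapse a per-residue string into (code, start, end) runs."""
    runs = []
    position = first_residue
    for code, group in groupby(assignment):
        length = len(list(group))
        runs.append((code, position, position + length - 1))
        position += length
    return runs


# Assignment block for residues 51-60 (IIGCPTWNVG).
for code, start, end in segments("EEEE EETTT", first_residue=51):
    print(f"{start}-{end}: {CODES[code]}")
```

Counting the runs whose code is H, G or I gives the number of helical segments, and the runs coded E give the strands.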

Diagram of FCA Secondary Structure

Summary: 3 sheets, 11 strands, 8 helices, 20 beta turns, 2 beta hairpins

3-D Structure of FCA Docked with Substrate (flavin)