Computational Method for Predicting Amyloidogenic Sequences Bill Welsh UMDNJ- Robert Wood Johnson...

Computational Method for Predicting Amyloidogenic Sequences

Bill Welsh

UMDNJ-Robert Wood Johnson Medical School


Amyloid Fibril Formation

A Common Mechanism for Protein Misfolding Diseases

• Numerous amyloid & misfolding diseases

• All of them are incurable at present

• Short list of more familiar examples

– Alzheimer’s disease

– Parkinson’s disease

– Huntington’s disease

– Creutzfeldt-Jakob disease (“Mad Cow”)

– Familial Amyloidosis

– Type II Diabetes

• Triggered by short sequences that convert from native α-helix or coil to β-strand

• We call this trait ‘hidden β-strand propensity’

http://emboj.oupjournals.org/content/vol19/issue7/images/large/e061308.jpeg

Problems

1. No sequence specificities

2. Absence of detailed structural information on misfolded proteins (amyloid fibrils)

Our Solution

1. Misfolding process is triggered by short (5-7 residue) sequences

2. Redefine sequence-structure relationships in terms of tertiary context

3. Identify short sequences that exhibit non-native (hidden) β-strand propensity [HP].

Relative Occurrence of Secondary Structure Elements in Different Tertiary Contact States

Tertiary contacts   Coil    α-helix   β-strand   Total sequences
Low                 38 %    59 %      3 %        191,300
Medium              47 %    37 %      16 %       112,199
High                39 %    11 %      50 %       150,288
All                 41 %    38 %      21 %       453,787

Based on SCOP20 v1.57

Tertiary Contact (TC): two non-H atoms within 4 Å of each other, separated by more than 4 residues in sequence
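The TC definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the input format (a list of residue index, element, coordinate tuples), the function name, and the choice to count atom pairs with distance at or below the cutoff are all assumptions. The slides do not say how raw contact counts are normalized into the TC values reported elsewhere in the deck.

```python
from math import dist  # Python 3.8+

def count_tertiary_contacts(atoms, cutoff=4.0, min_separation=4):
    """Count pairs of non-hydrogen atoms within `cutoff` angstroms whose
    residues are more than `min_separation` positions apart in sequence.

    atoms: list of (residue_index, element, (x, y, z)) -- hypothetical format.
    """
    heavy = [(res, xyz) for res, element, xyz in atoms if element != "H"]
    contacts = 0
    for i in range(len(heavy)):
        res_i, xyz_i = heavy[i]
        for j in range(i + 1, len(heavy)):
            res_j, xyz_j = heavy[j]
            far_in_sequence = abs(res_i - res_j) > min_separation
            if far_in_sequence and dist(xyz_i, xyz_j) <= cutoff:
                contacts += 1
    return contacts

# Toy coordinates (made up for illustration, in angstroms).
example_atoms = [
    (1, "C", (0.0, 0.0, 0.0)),
    (2, "C", (3.8, 0.0, 0.0)),    # close in space but adjacent in sequence
    (10, "C", (0.0, 3.5, 0.0)),   # close in space and far in sequence
    (10, "H", (0.0, 2.5, 0.0)),   # hydrogen, ignored
    (20, "C", (20.0, 0.0, 0.0)),  # far in space
]
n_contacts = count_tertiary_contacts(example_atoms)
```

Only the pair formed by residues 1 and 10 satisfies both conditions in this toy input.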

Intriguing Relationship Between Tertiary Contacts and Secondary Structure

Striking Conclusion

α-helix dominates in low-TC regions

β-sheet dominates in high-TC regions

TC Influence on Secondary Structure Propensity

Average Tertiary Contacts (TCs) in SCOP20

Residue  L     S     A     Q     E     K     D     N     R     I     F     T     M     P     G     W     C     V     Y     H
Avg TC   9.9   8.3   6.4   10.2  8.6   8.0   9.1   10.6  15.9  11.0  18.5  9.3   11.9  7.0   6.5   25.9  12.7  10.1  21.7  16.2
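As a quick way to read the per-residue averages, the sketch below pairs each one-letter code with its value and ranks the residues. It assumes the letters and numbers on the slide are listed in the same order; the variable names are illustrative only.

```python
# One-letter residue codes and average TCs, copied from the slide in order.
residues = "L S A Q E K D N R I F T M P G W C V Y H".split()
average_tc = [9.9, 8.3, 6.4, 10.2, 8.6, 8.0, 9.1, 10.6, 15.9, 11.0,
              18.5, 9.3, 11.9, 7.0, 6.5, 25.9, 12.7, 10.1, 21.7, 16.2]

# Assumption: the i-th value belongs to the i-th residue.
tc_by_residue = dict(zip(residues, average_tc))

# Rank residues from most to fewest average tertiary contacts.
ranked = sorted(tc_by_residue, key=tc_by_residue.get, reverse=True)
highest_three = ranked[:3]   # the aromatic residues W, Y, F
lowest_three = ranked[-3:]   # the small residues P, G, A
```

Under that pairing, the aromatic residues Trp, Tyr, and Phe carry the most tertiary contacts on average.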

[Figure: fraction of fragments predicted as helix (α-helix propensity, 0 to 0.9) plotted against TC (0 to 2) for β-strand and coil fragments. α-helix propensity of β-strands increases sharply at low TCs.]

[Figure: fraction of fragments predicted as β-strand (β-strand propensity, 0 to 0.9) plotted against TC (0 to 2) for helix and coil fragments. β-strand propensity of helices increases sharply at high TCs.]

Amyloid fibrils from myoglobin (Fandrich et al., 2001 Nature)

PHD prediction of secondary structure

Amino acid | … AGHGQEVLIRLFTGHPETL … |
PHD        | … HHHHHHHHHHHHHH HHH … |
P(α)       | … 7756899999999623469 … |
P(β)       | … 0000000000000000000 … |

The CSSP Algorithm: Locating Sequences Exhibiting HP

[Figure: CSSP workflow. SCOP20 sequences and their 3D structures from PDB provide tertiary contacts (TC) and DSSP secondary structure, giving a database of >450,000 7-residue sequences with secondary structure & TCs, split into low-TC and high-TC sets. A sliding 7-residue window (e.g. -Q-E-V-L-I-R-L-) moves along the query sequence AGHGQEVLIRLFTGHPETL; similar sequences in the database yield P(β|low) and P(β|high), which locate the sequence of hidden β-propensity.]
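The sliding-window step can be illustrated with a short sketch. The slides do not specify how CSSP scores a window against the database, so the identity-count similarity, the `min_identity` threshold, the "low"/"high" class labels, and the toy database entries below are all stand-in assumptions, not the published method.

```python
from collections import Counter

def sliding_windows(sequence, width=7):
    """Yield (start_index, fragment) for every full-width window."""
    for start in range(len(sequence) - width + 1):
        yield start, sequence[start:start + width]

def identity(a, b):
    """Number of positions at which two equal-length fragments agree."""
    return sum(x == y for x, y in zip(a, b))

def strand_fraction(query, database, tc_class, min_identity=5):
    """Fraction of similar fragments in `tc_class` labelled strand ('E').

    Stand-in for P(beta | TC class); returns None if nothing is similar.
    """
    labels = Counter(
        label for fragment, label, tc in database
        if tc == tc_class and identity(query, fragment) >= min_identity
    )
    total = sum(labels.values())
    return labels["E"] / total if total else None

# Made-up entries: (7-residue fragment, DSSP-style label, TC class).
toy_database = [
    ("QEVLIRL", "H", "low"),
    ("QEVLVRL", "E", "high"),
    ("QEALIRL", "E", "high"),
    ("QEVLIKL", "H", "high"),
]

windows = list(sliding_windows("AGHGQEVLIRLFTGHPETL"))
p_beta_high = strand_fraction("QEVLIRL", toy_database, "high")
p_beta_low = strand_fraction("QEVLIRL", toy_database, "low")
```

A window whose strand fraction is low in the low-TC set but high in the high-TC set is the kind of fragment the slides describe as having hidden β-propensity.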

Sensitivity of the CSSP Method

Chameleon Sequences

ASVKQVS in β-sheet: 1AMP (Aminopeptidase)

ASVKQVS in α-helix: 1GKY (Guanylate kinase)

HP prediction (0-10 scale)

Query local sequence   PDB ID   Resident protein   Native secondary structure   Tertiary contacts (TC)   P(α)   P(β)   P(Coil)
ASVKQVS                1AMP     Aminopeptidase     strand                       1.3                      2      7      1
ASVKQVS                1GKY     Guanylate kinase   helix                        0.4                      8      1      1

[Figure legend: helix, beta, and coil propensity, graded strong / moderate / weak / very weak]

Hidden β-propensity in Alzheimer’s Disease

[Figure panels: amyloidogenic wild type Aβ fragment; non-amyloidogenic mutant Aβ fragment]

KLVFF are key residues in amyloid fibril polymerization (Tjernberg et al., JBC 1996)

Yoon and Welsh, Protein Science (2004); Proteins (2005)

hIAPP sequence (Type 2 Diabetes)

hIAPP sequence (4-34) associated with type II diabetes

-NFLVH- -FLVHS- Mazor et al., JMB (2002)

-NFGAIL- Zanuy, Nussinov, et al., Biophysical Journal (2003)

NAC sequence (Parkinson’s disease)

NAC sequence of α-synuclein associated with Parkinson’s disease

VTNVGGAVVTGVTAVA, VTGVTAVAQKTV, GAVVTGVTAVA Bodles et al., J Neurochem (2001)

Beta propensity of acetylcholinesterase (AChE) and its homolog butyrylcholinesterase (BuChE)

Amyloidogenic AChE 586-599 fragment

Non-amyloidogenic BuChE 573-586 fragment

Cottingham et al., Biochemistry (2002); ibid. (2003): AChE 586-599 and BuChE 573-586

Amyloid Formation by G334V

Mutant p53 Associated with Lung Cancer

Higashimoto et al, Biochemistry 45, 1608-1619 (2006)

Amyloidogenic Sequence Knowledge Base (ASKB)

http://askb.umdnj.edu/askb/welcome.html

• CSSP algorithm that predicts “hidden” β-strand propensity in proteins & polypeptides

• Searchable peptide database

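Estimating Free Energies

[Figure: free energy diagram linking the unfolded state (random coil), α-helix, and β-strand to the partially folded state and β-rich amyloid, with free energy differences ΔG_α, ΔG_coil, ΔG_hidden, and ΔG_amyloid]

\[ \Delta G_{hidden} = -RT \log K = -RT\,(\log K_{\alpha} + \log K_{coil}) = \Delta G_{\alpha} + \Delta G_{coil} \]

\[ \Delta G_{hidden} = -RT \left( \log\frac{P_{\beta}}{P_{\alpha}} + \log\frac{P_{\beta}}{P_{coil}} \right) = -RT \log\frac{P_{\beta}^{2}}{P_{\alpha}\,P_{coil}} \]

\[ \Delta G_{amyloid} \sim \Delta G_{hidden} \]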
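A minimal numerical sketch of the free energy estimate follows. It assumes one reading of this slide, namely ΔG_hidden = -RT log(P_β² / (P_α · P_coil)), with a natural logarithm and RT in kcal/mol; the function name, units, and temperature default are illustrative choices. The inputs are the 0-10 scale HP scores for ASVKQVS from the chameleon sequence table, and because only ratios enter, the scale cancels.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def delta_g_hidden(p_alpha, p_beta, p_coil, temperature=298.15):
    """Assumed form: -RT * ln(P_beta**2 / (P_alpha * P_coil)).

    Negative values mean the strand state is favoured over helix and coil.
    """
    if min(p_alpha, p_beta, p_coil) <= 0:
        raise ValueError("all propensities must be positive")
    ratio = p_beta ** 2 / (p_alpha * p_coil)
    return -R_KCAL * temperature * math.log(ratio)

# HP scores for ASVKQVS in its two structural contexts.
dg_1amp = delta_g_hidden(p_alpha=2, p_beta=7, p_coil=1)  # native strand
dg_1gky = delta_g_hidden(p_alpha=8, p_beta=1, p_coil=1)  # native helix
```

Under this assumed form the 1AMP context gives a negative value and the 1GKY context a positive one, matching the native strand and helix assignments. A score of zero, as in the P(β) row of the myoglobin example, makes the logarithm undefined, so the sketch rejects it rather than guessing a floor value.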

Predicted vs. Expt’l β-Sheet Structure of Prion Protein Peptide

• Decatur and coworkers employed FTIR spectroscopy to determine % β-sheet structure for peptides based on residues 109-122 of the Syrian hamster prion protein (H1) substituted at position 117.

• We plotted our calculated HP metrics for the sequences H1, A117G, A117V, A117L, and A117I vs. Decatur’s expt’l values.

• Strong correlation (R² = 0.96) suggests that calculated HP profiles are strong predictors of β-sheet content in this peptide series.

SA Petty, T Adalsteinsson, & SM Decatur, Biochemistry 44:4720-4726 (2005)
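For reference, an R² of this kind can be computed as the squared Pearson correlation between the calculated HP metric and the measured % β-sheet. The sketch below shows the arithmetic only. The numbers in it are placeholders invented for illustration; they are not the HP metrics or FTIR values behind the reported 0.96, which are not given on the slide.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Placeholder values for five peptides (NOT experimental data).
hp_placeholder = [1.0, 2.0, 3.0, 4.0, 5.0]
sheet_placeholder = [12.0, 19.0, 33.0, 38.0, 52.0]

r2_placeholder = r_squared(hp_placeholder, sheet_placeholder)
```

With only five peptides in the series, a high R² should be read alongside the small sample size.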

General Observations and Implications

• The CSSP algorithm successfully pinpoints amyloidogenic sequences in numerous examples where expt’l data are available

• These sequences possess hidden β-strand propensity

– generally short sequences (4-7 residues) that serve as ‘core nucleation motifs’ to trigger amyloid fibril formation

– adopt α-helix in low contact regions (low TC) and β-strand in high contact regions (high TC)

• These sequences are conformationally ambivalent

– interconvertible between α-helix and β-strand

– highly sensitive to tertiary environment

– generally contain hydrophobic, aromatic residues (Phe, Trp, Tyr), consistent with recent findings: Rojas Quijano et al., Biochemistry (2006)

• Ability to form amyloid is a generic trait of all proteins

Thank you

Date post:	20-Jan-2016