Date post: 15-Jan-2016
Page 1: Olga V. Kalinina Pavel S. Novichkov Andrey A. Mironov Mikhail S. Gelfand

SDPpred: a method for identification of amino acid residues that determine differences in functional specificity of homologous proteins and application thereof to the MIP family of membrane transporters

Olga V. Kalinina

Pavel S. Novichkov

Andrey A. Mironov

Mikhail S. Gelfand

Aleksandra B. Rakhmaninova

Page 2

Large families of proteins: generally similar biochemical function but many different specificities… Example: ~800 transcription factors of the LacI family.

Average sequence identity 30%. Bind different effectors and operators.

Some effectors:

• lactose (LacI)
• D-fructose-6-phosphate (FruR)
• guanine, hypoxanthine (PurR)
• cytidine, adenosine (CytR)
• trehalose-6-phosphate (TreR)
• D-gluconate (GntR)
• D-galactose (GalR)
• D-ribose (RbsR)
• maltose (MalR)
• raffinose (RafR)
• …
• X??

Page 3

[Flow diagram]

Input 1: alignment

Q9KDW9      ----------MSPFLGEVIGTMILIILGGGVVAGVVLKGTK
Q8Y6Z1      ----MIDTSLATQFLGEVIGTAILIILGAGVVAGVSLKRSK
Q97JG6      ----------MTIFFAELVGTLLLILLGDGVVANVVLKNSK
GLPF_ECOLI  MSQT---STLKGQCIAEFLGTGLLIFFGVGCVA--ALKVAG
Q8ZJK5      MSQTA-SSTLKGQCIAEFLGTGLLIFFGAGCVA--ALKLAG
GLPF_HAEIN  MDKS-----LKANCIGEFLGTALLIFFGVGCVA--ALKVAG
GLPF_PSEAE  MTTAAPTPSLFGQCLAEFLGTALLIFFGTGCVA--ALKVAG
AQPZ_BRUME  ---------MLNKLSAEFFGTFWLVFGGCGSAILAA--AFP
Q92NM3      ---------MFRKLSVEFLGTFWLVLGGCGSAVLAA--AFP
Q8UJW4      ---------MGRKLLAEFFGTFWLVFGGCGSAVFAA--AFP
AQPZ_ECOLI  ---------MFRKLAAECFGTFWLVFGGCGSAVLAA--GFP

Input 2: description of specificity groups

Group A: No. 1-10, 13…
Group B: No. 12, 14-16…
Group C: No. 17-45…
…

SDPpred

Output: positions that account for specificity (Experiment)

Output: assignment of specificity to new proteins (Experiment)

Testing on families that include proteins with resolved 3D structure

Page 4

What are SDPs? (SDP = Specificity Determining Position)

• Specificity group = group of proteins that have the same specificity (experimental data, genome analysis, etc.)

• SDP = alignment position that is conserved within specificity groups but differs between them

An SDP is not equivalent to a functionally important position!
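The definition above can be illustrated with a minimal sketch of the idealized case: a column that is perfectly conserved inside every specificity group and carries a different residue in each group. The function name and the toy columns are hypothetical; real SDPs are scored statistically rather than by this all-or-nothing test.

```python
def is_ideal_sdp(column, groups):
    """Return True if the column is conserved within each group
    and the conserved residue differs between all groups.

    column: residues of one alignment position, one per sequence
    groups: specificity-group label of each sequence, same order
    """
    residues_by_group = {}
    for residue, group in zip(column, groups):
        residues_by_group.setdefault(group, set()).add(residue)

    # Conserved within groups: exactly one residue type per group.
    if any(len(residues) != 1 for residues in residues_by_group.values()):
        return False

    # Differs between groups: no two groups share their residue.
    consensus = [next(iter(residues)) for residues in residues_by_group.values()]
    return len(set(consensus)) == len(consensus)


# Toy labels: three glycerol facilitators followed by three aquaporins.
labels = ["GLP", "GLP", "GLP", "AQP", "AQP", "AQP"]
print(is_ideal_sdp("CCCSSS", labels))  # conserved within, differs between
print(is_ideal_sdp("GGGGGG", labels))  # conserved everywhere, not an SDP
```

A fully conserved column fails the test, which matches the remark that an SDP is not the same thing as a functionally important position.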

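Page 5

Algorithm

• Mutual information $I_p$ reflects the extent to which an alignment position $p$ tends to be an SDP.

$$I_p = \sum_{i=1}^{N} \sum_{\alpha=1}^{20} f_p(\alpha, i) \log \frac{f_p(\alpha, i)}{f_p(\alpha)\, f(i)}$$

$N$ is the number of groups; $f(i)$ is the fraction of proteins in group $i$; $f_p(\alpha, i)$ is the ratio of the number of occurrences of amino acid $\alpha$ in group $i$ in position $p$ to the length of the whole alignment column; $f_p(\alpha)$ is the frequency of amino acid $\alpha$ in the whole alignment column in position $p$.

• Statistical significance of $I_p$: expected mutual information $I_p^{\exp}$ of an alignment column, Z-score (Mirny & Gelfand, 2002, J Mol Biol, 321(1)).

$$Z_p = \frac{I_p - \langle I_p^{\exp} \rangle}{\sigma(I_p^{\exp})}$$

• Smoothed amino acid frequencies: a leucine is more a methionine than a valine, and any arginine has a dash of lysine… The raw frequency $f(\alpha, i) = n(\alpha, i) / n(i)$ is replaced by a smoothed $\tilde{f}(\alpha, i)$, in which the count $n(\alpha, i)$ is supplemented by the substitution-weighted counts of the other residues, $\sum_{\beta=1}^{20} n(\beta, i)\, m(\beta \to \alpha)$ [exact weighting not recoverable from the transcript].

• Are 5 SDPs with Z-score > 10.5 better than 10 SDPs with Z-score > 9.0? Bernoulli estimator for selection of the proper number of SDPs, with the Z-scores sorted in decreasing order, $Z_1 \ge Z_2 \ge \dots \ge Z_n$:

$$k^* = \arg\min_k P(\text{there are at least } k \text{ observed Z-scores} \ge Z_k) = \arg\min_k \left[ 1 - \sum_{i=n-k+1}^{n} C_n^i\, q^i\, p^{n-i} \right]$$

$$p = P(Z \ge Z_k) = \frac{1}{\sqrt{2\pi}} \int_{Z_k}^{\infty} \exp\left(-\frac{Z^2}{2}\right) dZ, \qquad q = 1 - p$$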
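As a minimal sketch of the mutual-information score and its Z-score, the code below makes several assumptions that are not on the slide: gap characters are counted like any other symbol, no frequency smoothing is applied, and the expected value and standard deviation are estimated by randomly shuffling the column (the slide does not say how they are obtained). All function names are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(column, groups):
    """I_p = sum over (residue, group) of f(a,i) * log(f(a,i) / (f(a) * f(i)))."""
    n = len(column)
    joint = Counter(zip(column, groups))   # counts of (residue, group) pairs
    residue_counts = Counter(column)       # counts per residue in the column
    group_counts = Counter(groups)         # counts of sequences per group
    total = 0.0
    for (residue, group), count in joint.items():
        f_joint = count / n
        f_residue = residue_counts[residue] / n
        f_group = group_counts[group] / n
        total += f_joint * math.log(f_joint / (f_residue * f_group))
    return total


def z_score(column, groups, n_shuffles=1000, seed=0):
    """Z_p = (I_p - mean of shuffled I_p) / sd of shuffled I_p (shuffling is an assumption)."""
    rng = random.Random(seed)
    observed = mutual_information(column, groups)
    shuffled = list(column)
    samples = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        samples.append(mutual_information(shuffled, groups))
    mean = sum(samples) / len(samples)
    variance = sum((s - mean) ** 2 for s in samples) / len(samples)
    sd = math.sqrt(variance)
    # A fully conserved column has zero spread; report 0 rather than divide by zero.
    return (observed - mean) / sd if sd > 0 else 0.0


labels = ["GLP"] * 3 + ["AQP"] * 3
print(mutual_information("CCCSSS", labels))  # equals log(2) for two equal groups
print(z_score("CCCSSS", labels))
```

For two equally sized groups, a column that separates them perfectly reaches the maximum $I_p = \log 2$, while a fully conserved column scores zero.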

Page 6

• Kalinina OV, Mironov AA, Gelfand MS, Rakhmaninova AB. (2004) Automated selection of positions determining functional specificity of proteins by comparative analysis of orthologous groups in protein families. Protein Sci 13(2): 443-56.

• http://math.belozersky.msu.ru/~psn/

• Kalinina OV, Novichkov PS, Mironov AA, Gelfand MS, Rakhmaninova AB. (2004) SDPpred: a tool for prediction of amino acid residues that determine differences in functional specificity of homologous proteins. Nucl Acids Res 32(Web Server issue): W424-8.

Page 7

Web interface

Input: multiple alignment of proteins divided into specificity groups

=== AQP ===
%sp|Q9L772|AQPZ_BRUME
-------------------------------------mlnklsaeffgtfwlvfggcgsailaa--afp-------elgigflgvalafgltvltmayavggisg--ghfnpavslgltviiilgsts------------------------------slap------------------qlwlfwvaplvgavigaiiwkgllgrd---------------------------------------
%sp|P48838|AQPZ_ECOLI
-------------------------------------mfrklaaecfgtfwlvfggcgsavlaa--gfp-------elgigfagvalafgltvltmafavghisg--ghfnpavtiglwalvihgatd------------------------------kfap------------------qlwffwvvpivggiiggliyrtllekrd--------------------------------------
%tr|Q92ZW9
-------------------------------------mfkklcaeflgtcwlvlggcgsavlas--afp-------qvgigllgvsfafgltvltmaytvggisg--ghfnpavslglaviiilgsth------------------------------rrvp------------------qlwlfwiaplfgaaiagivwksvgeefrpvd-----------------------------------
=== GLP ===
%sp|P11244|GLPF_ECOLI
----------------------------msqt---stlkgqciaeflgtglliffgvgcvaalkvag---------a-sfgqweisviwglgvamaiyltagvsg--ahlnpavtialwlglilaltd------------------------------dgn--------------g-vpr-flvplfgpivgaivgafayrkligrhlpcdicvveek--etttpseqkasl--------------
%sp|P44826|GLPF_HAEIN
----------------------------mdks-----lkancigeflgtalliffgvgcv
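The input layout shown on this slide can be read with a few lines of code. The sketch below assumes, from the example alone, that a `=== NAME ===` line opens a specificity group, a line starting with `%` names a sequence, and all following lines up to the next marker are its aligned residues; the function name and the shortened toy fragments are hypothetical.

```python
def parse_grouped_alignment(text):
    """Parse the grouped-alignment text into {group: {sequence_id: aligned_sequence}}."""
    groups = {}
    current_group = None
    current_id = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("===") and line.endswith("==="):
            # Group header, e.g. "=== AQP ===".
            current_group = line.strip("=").strip()
            groups[current_group] = {}
            current_id = None
        elif line.startswith("%"):
            # Sequence header, e.g. "%sp|P48838|AQPZ_ECOLI".
            if current_group is None:
                raise ValueError("sequence header before any group header")
            current_id = line[1:]
            groups[current_group][current_id] = ""
        else:
            # Aligned residues; may span several lines.
            if current_id is None:
                raise ValueError("sequence data before any sequence header")
            groups[current_group][current_id] += line
    return groups


# Shortened toy fragments in the same layout as the slide example.
example = """=== AQP ===
%sp|P48838|AQPZ_ECOLI
--mfrkl
aaecf
=== GLP ===
%sp|P11244|GLPF_ECOLI
msqt---
"""
parsed = parse_grouped_alignment(example)
for group, sequences in parsed.items():
    print(group, list(sequences))
```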

Page 8

Web interface

Output:

• Alignment of the family with the SDPs highlighted (Alignment view)

• Detailed description of each SDP (List of SDPs)

• Plot of probabilities, used by the Bernoulli estimator to set the cutoff (Probability plot view)
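The probabilities behind such a plot can be sketched as follows: for each rank k, compute the chance that at least k of n independent standard-normal scores reach the k-th highest observed Z-score, then take the k that minimizes it. This is a minimal illustration under the assumption that Z-scores are standard normal under the null; the function names and the toy scores are hypothetical, and the direct binomial sum is only suitable for small n.

```python
import math


def normal_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def prob_at_least_k(n, k, p):
    """P(at least k successes in n Bernoulli trials with success probability p)."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))


def bernoulli_cutoff(z_scores):
    """Return (k_star, probabilities), where probabilities[k-1] is the value for rank k."""
    ranked = sorted(z_scores, reverse=True)
    n = len(ranked)
    probabilities = [prob_at_least_k(n, k, normal_tail(z_k))
                     for k, z_k in enumerate(ranked, start=1)]
    k_star = min(range(1, n + 1), key=lambda k: probabilities[k - 1])
    return k_star, probabilities


# Toy scores: three clear outliers followed by background values.
scores = [12.0, 11.0, 10.5, 2.0, 1.0, 0.5, 0.0, -0.5, -1.0, -2.0]
k_star, probabilities = bernoulli_cutoff(scores)
print(k_star)  # number of positions kept as SDPs
```

With these toy scores the minimum falls at the third rank, so the three outliers would be reported; the probability rises sharply once background-level scores are included.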

Page 9

Examples: the LacI family of bacterial transcription factors

• Training set: 459 sequences, average length 338 amino acids, 85 specificity groups

– 44 SDPs

10 residues contact NPF (analog of the effector)

6 residues make up intersubunit contacts

7 residues contact the operator sequence

7 residues in the effector contact zone (5 Å < dmin < 10 Å)

5 residues in the intersubunit contact zone (5 Å < dmin < 10 Å)

6 residues in the operator contact zone (5 Å < dmin < 10 Å)

LacI from E. coli

Page 10

Examples: bacterial membrane channels of the MIP family

• Training set: 17 sequences, average length 280 amino acids, 2 specificity groups: aquaporins and glyceroaquaporins

– 21 SDPs

8 residues contact glycerol (substrate) (dmin < 5 Å)

8 residues oriented to the channel

5 residues make up contacts with other subunits

GlpF from E. coli

Page 11

Why does the prediction make sense? LacI from E. coli

• Total 348 amino acids

• 44 SDPs

Non-contacting residues (distance to the DNA, effector, or the other subunit > 10 Å)

Contact zone (may be functional)

Contacting residues (distance to the DNA, effector, or the other subunit < 5 Å)
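The three distance classes used here can be sketched in code. The thresholds (5 Å and 10 Å) come from the slides; the treatment of values exactly at a boundary, the function names, and the toy coordinates are assumptions made for illustration.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest distance (in Å) between any atom of one set and any atom of the other.

    Each atom is an (x, y, z) coordinate tuple.
    """
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def classify_residue(dmin):
    """Assign a residue to a class by its minimal distance to the ligand, DNA, or other subunit."""
    if dmin < 5.0:
        return "contacting"
    if dmin < 10.0:
        # Boundary values (exactly 5 Å or 10 Å) are not specified on the slides.
        return "contact zone"
    return "non-contacting"


# Toy coordinates, not taken from any real structure.
residue_atoms = [(0.0, 0.0, 0.0)]
ligand_atoms = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
d = min_distance(residue_atoms, ligand_atoms)
print(d, classify_residue(d))
```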

Page 12

Why does the prediction make sense? GlpF from E. coli

• Total 281 amino acids

• 21 SDPs

Contacting residues (distance to the substrate or another subunit < 5 Å)

Non-contacting residues (distance to the substrate or another subunit > 10 Å)

Contact zone (may be functional)

Page 13

GlpF from E. coli, a membrane channel from the MIP family: SDPs either interact with the substrate or are located on the outer surface of the monomer

[Figure: structure of the GlpF monomer; labels: predicted SDPs, glycerol]

Page 14

SDPs located on the outer surface of the GlpF monomer form subunit contacts

Glu43 from all four subunits

20Leu, 24Phe, 108Tyr of one subunit, 193Ser from another subunit

Page 15

SDPs located on the outer surface of the GlpF monomer (continued)

Subunit I          Subunit II / Subunit IV
Residue  Atom      Residue  Atom      Distance (Å)
Glu43    OE1       Ser38    O         4.8
Glu43    OE2       Glu43    OE2       4.1
Glu43    CG        Trp42    CD1       3.7
Glu43    OE2       Glu43    OE2       4.1

Subunit I          Subunit II
Residue  Atom      Residue  Atom      Distance (Å)
Leu20    CD2       Ile158   CD1       4.3
Leu20    CD1       Leu162   CD2       4.5
Phe24    CZ        Ile158   CG2       3.9
Phe24    CZ        Leu186   CD1       3.9
Phe24    CE2       Val189   CG2       3.8
Phe24    CE2       Ile190   CG1       3.7
Phe24    CA        Ser193   CB        3.9
Phe24    O         Ser193   OG        4.2
Phe24    O         Ser193   CB        3.3
Gly27    O         Ser193   O         3.2
Cys28    CA        Ser193   CA        3.8
Tyr108   OH        Ser193   O         2.6
Tyr108   CE1       Met194   CE        3.7
Tyr108   CE1       Leu197   CD1       3.9

Page 16

SDPs located on the outer surface of the GlpF monomer (continued)

Structure of contacts in the type A cluster

Structure of contacts in the type B cluster

Page 17

Conclusions I. SDPpred: the SDP prediction method

• A method for identification of amino acid residues that account for differences in protein functional specificity
– Does not rely on the protein 3D structure
– Automatically determines the number of significant positions
– Considers substitutions according to the chemical properties of substituted amino acids

• Results agree with available structural and experimental data

• Applicable to any protein family in a standard way

Kalinina OV, Mironov AA, Gelfand MS, Rakhmaninova AB. (2004) Automated selection of positions determining functional specificity of proteins by comparative analysis of orthologous groups in protein families. Protein Sci 13(2): 443-56.

http://math.belozersky.msu.ru/~psn/

Kalinina OV, Novichkov PS, Mironov AA, Gelfand MS, Rakhmaninova AB. (2004) SDPpred: a tool for prediction of amino acid residues that determine differences in functional specificity of homologous proteins. Nucl Acids Res 32(Web Server issue): W424-8.

Page 18

Conclusions II. SDPs for GlpF from E. coli

• In protein families whose members function as oligomers, predicted SDPs are often localized on the contact surface between subunits

• 5 "surface" SDPs in GlpF: 20Leu, 24Phe, 43Glu, 108Tyr, 193Ser. All of them participate in forming the quaternary structure

Evolutionary pressure on amino acids that establish intersubunit contacts correlates with evolutionary pressure on amino acids that account for the correct recognition of the substrate

• These residues form compact spatial clusters, "structural clasps" for recognition of proper subunits

Page 19

• Olga V. Kalinina
• Pavel S. Novichkov
• Andrey A. Mironov
• Mikhail S. Gelfand
• Aleksandra B. Rakhmaninova

– Department of Bioengineering and Bioinformatics, Moscow State University, Moscow, Russia

– Institute for Information Transmission Problems RAS, Moscow, Russia

– State Scientific Center GosNIIGenetika, Moscow, Russia

• Acknowledgements
– Leonid A. Mirny
– Olga Laikova
– Vsevolod Makeev
– Roman Sutormin
– Shamil Sunyaev
– Aleksey Finkelstein

