Page 1: Evolutionary trace report by report maker November 28, 2009mammoth.bcm.tmc.edu/report_maker/reports/pdb... · Pages 1–13 2uxn Evolutionary trace report by report maker November

Pages 1–13

2uxn

Evolutionary trace report by report maker

November 28, 2009

CONTENTS

1 Introduction

2 Chain 2uxnA
  2.1 O60341 overview
  2.2 Multiple sequence alignment for 2uxnA
  2.3 Residue ranking in 2uxnA
  2.4 Top ranking residues in 2uxnA and their position on the structure
    2.4.1 Clustering of residues at 25% coverage
    2.4.2 Overlap with known functional surfaces at 25% coverage

3 Chain 2uxnB
  3.1 Q9UKL0 overview
  3.2 Multiple sequence alignment for 2uxnB
  3.3 Residue ranking in 2uxnB
  3.4 Top ranking residues in 2uxnB and their position on the structure
    3.4.1 Clustering of residues at 25% coverage
    3.4.2 Overlap with known functional surfaces at 25% coverage
    3.4.3 Possible novel functional surfaces at 25% coverage

4 Notes on using trace results
  4.1 Coverage
  4.2 Known substitutions
  4.3 Surface
  4.4 Number of contacts
  4.5 Annotation
  4.6 Mutation suggestions

5 Appendix
  5.1 File formats
  5.2 Color schemes used
  5.3 Credits
    5.3.1 Alistat
    5.3.2 CE
    5.3.3 DSSP
    5.3.4 HSSP
    5.3.5 LaTex
    5.3.6 Muscle
    5.3.7 Pymol
  5.4 Note about ET Viewer
  5.5 Citing this work
  5.6 About report maker
  5.7 Attachments

1 INTRODUCTION

From the original Protein Data Bank entry (PDB id 2uxn):

Title: Structural basis of histone demethylation by lsd1 revealed by suicide inactivation

Compound: Mol id: 1; molecule: lysine-specific histone demethylase 1; chain: a; fragment: swirm domain, amine oxidase domain and linker, residues 171-836; synonym: flavin-containing amine oxidase domain-containing protein 2, braf35-hdac complex protein bhc110, lysine-specific demethylase 1; ec: 1.-.-.-; engineered: yes; mol id: 2; molecule: rest corepressor 1; chain: b; fragment: fragment of sant1, linker region and sant2 domain, residues 286-482; synonym: protein corest; engineered: yes; mol id: 3; molecule: histone h3.1; chain: e; fragment: histone h3-derived suicide inhibitor, residues 2-22; synonym: h3/a, h3/b, h3/c, h3/d, h3/f, h3/h, h3/i, h3/j, h3/k, h3/l; engineered: yes; other details: histone h3 n-terminal peptide (residues 1-21) with a propargyl moiety at lysine 4 covalently attached to fad and subsequently reduced using sodium borohydride

Organism, scientific name: Homo sapiens

2uxn contains unique chains 2uxnA (664 residues) and 2uxnB (133 residues). Chain 2uxnE is too short (7 residues) to permit statistically significant analysis, and was treated as a peptide ligand.

2 CHAIN 2UXNA

2.1 O60341 overview

From SwissProt, id O60341, 93% identical to 2uxnA:

Lichtarge lab 2006

Description: Lysine-specific histone demethylase 1 (EC 1.-.-.-) (Amine oxidase flavin containing domain protein 2) (AOF2 protein) (BRAF35-HDAC complex protein BHC110).
Organism, scientific name: Homo sapiens (Human).
Taxonomy: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; Homo.
Function: Histone demethylase that specifically demethylates Lys-4 of histone H3, a specific tag for epigenetic transcriptional activation, thereby acting as a corepressor. Demethylates both mono- and tri-methylated Lys-4 of histone H3. May play a role in the repression of neuronal genes.
Cofactor: FAD.
Subunit: Part of a histone deacetylase complex that contains HDAC1, HDAC2, AOF2, RCOR1, ZNF261, ZNF198, KIAA0182 and GTF2I.
Subcellular location: Nuclear.
Alternative products: Event=Alternative splicing; Named isoforms=2; Name=1; IsoId=O60341-1; Sequence=Displayed; Name=2; IsoId=O60341-2; Sequence=VSP 011198, VSP 011199; Note=No experimental confirmation available;
Similarity: Belongs to the flavin monoamine oxidase family.
Similarity: Contains 1 SWIRM domain.
Caution: Ref.2 sequence differs from that shown due to erroneous gene model prediction.
About: This Swiss-Prot entry is copyright. It is produced through a collaboration between the Swiss Institute of Bioinformatics and the EMBL outstation - the European Bioinformatics Institute. There are no restrictions on its use as long as its content is in no way modified and this statement is not removed.

2.2 Multiple sequence alignment for 2uxnA

For the chain 2uxnA, the alignment 2uxnA.msf (attached) with 57 sequences was used. The alignment was downloaded from the HSSP database, and fragments shorter than 75% of the query as well as duplicate sequences were removed. It can be found in the attachment to this report, under the name of 2uxnA.msf. Its statistics, from the alistat program are the following:

Format: MSF
Number of sequences: 57
Total number of residues: 34015
Smallest: 522
Largest: 664
Average length: 596.8
Alignment length: 664
Average identity: 39%
Most related pair: 99%
Most unrelated pair: 24%
Most distant seq: 36%
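As a quick consistency check on the alistat summary, the reported average length should equal the total number of residues divided by the number of sequences. The snippet below is only a sanity check using the figures quoted above; the variable names are illustrative and it does not reproduce how alistat itself computes its statistics.

```python
# Figures copied from the alistat summary for 2uxnA.msf.
num_sequences = 57
total_residues = 34015

# Average ungapped sequence length, rounded to one decimal as in the report.
average_length = round(total_residues / num_sequences, 1)
print(average_length)
```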

Furthermore, 3% of residues show as conserved in this alignment.

The alignment consists of 15% eukaryotic (1% vertebrata, 1% arthropoda, 7% fungi, 3% plantae) sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 2uxnA.descr.

Fig. 1. Residues 173-504 in 2uxnA colored by their relative importance. (See Appendix, Fig. 15, for the coloring scheme.)

Fig. 2. Residues 505-836 in 2uxnA colored by their relative importance. (See Appendix, Fig. 15, for the coloring scheme.)

2.3 Residue ranking in 2uxnA

The 2uxnA sequence is shown in Figs. 1–2, with each residue colored according to its estimated importance. The full listing of residues in 2uxnA can be found in the file called 2uxnA.rankssorted in the attachment.

2.4 Top ranking residues in 2uxnA and their position on the structure

In the following we consider residues ranking among top 25% of residues in the protein. Figure 3 shows residues in 2uxnA colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 3. Residues in 2uxnA, colored by their relative importance. Clockwise:front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage. Fig. 4 shows the top 25% of all residues, this time colored according to clusters they belong to. The clusters in Fig. 4 are composed of the residues listed in Table 1.

Fig. 4. Residues in 2uxnA, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 1.
cluster color  size  member residues
red            167   178,183,192,210,211,214,215,
                     218,221,256,260,262,263,265,
                     285,286,287,290,291,293,294,
                     296,297,304,308,309,310,312,
                     314,315,316,317,319,327,328,
                     329,330,331,332,333,335,336,
                     339,340,341,348,363,367,375,
                     379,382,383,386,387,531,532,
                     534,535,536,537,538,539,540,
                     541,547,548,549,552,553,554,
                     555,558,559,560,562,564,569,
                     570,571,578,582,590,593,595,
                     624,626,627,628,630,631,639,
                     641,643,644,647,650,651,653,
                     654,655,656,657,659,660,661,
                     662,664,666,670,671,676,678,
                     679,681,688,689,690,691,692,
                     693,694,695,696,701,703,705,
                     706,709,711,712,716,731,732,
                     735,743,749,751,754,759,761,
                     762,764,770,772,773,774,776,
                     777,798,799,800,801,802,803,
                     807,808,809,810,811,812,813,
                     814,817,818,820,821,822

Table 1. Clusters of top ranking residues in 2uxnA.

2.4.2 Overlap with known functional surfaces at 25% coverage. The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

FDA binding site. Table 2 lists the top 25% of residues at the interface with 2uxnAFDA900 (fda). The following table (Table 3) suggests possible disruptive replacements for these residues (see Section 4.6).

Table 2.
res  type  subst's (%)                 cvg   noc/bb  dist (Å)  antn
285  G     G(100)                      0.03  19/19   3.34      site
287  G     G(100)                      0.03  23/23   3.47      site
290  G     G(100)                      0.03  1/1     4.31
308  E     E(100)                      0.03  43/10   2.49      site
314  G     G(100)                      0.03  5/5     3.51
315  G     G(100)                      0.03  20/20   3.15      site
316  R     R(100)                      0.03  84/22   2.75      site
330  G     G(100)                      0.03  30/30   3.33
661  K     K(100)                      0.03  9/0     3.99
624  T     T(98)A(1)                   0.04  15/10   3.50
751  W     W(98).(1)                   0.07  19/0    3.51      site
761  Y     Y(98).(1)                   0.07  30/0    3.50      site
800  G     G(98).(1)                   0.07  6/6     3.18      site
801  E     E(98).(1)                   0.07  15/5    2.95
814  A     A(98).(1)                   0.07  9/2     3.29      site
590  V     V(94)I(1)L(1)A(1)           0.08  20/14   2.96
310  R     R(87)Q(1)K(8)S(1)           0.10  17/1    3.06      site
810  T     T(96).(1)S(1)               0.12  39/32   3.03      site
659  L     L(73)I(8)M(15)V(1)          0.13  11/2    3.76
809  A     A(75)Q(8).(1)D(12)S(1)      0.13  17/17   3.67      site
286  S     A(91)G(1)S(7)               0.14  11/10   4.03
333  V     V(47)I(50)L(1)              0.14  21/17   2.78
812  H     H(77)A(3).(1)G(12)T(5)      0.15  3/1     4.42
811  V     V(70)M(28).(1)              0.17  30/18   2.90
571  Y     Y(68)F(3)A(3)N(22)T(1)      0.18  2/0     4.67
335  T     T(71)V(19)N(3)S(3)M(1)      0.19  1/0     4.88
309  A     A(47)G(49)S(3)              0.20  26/15   3.27      site
332  M     M(36)Q(24)S(36)H(1)         0.20  13/12   2.99      site
331  A     A(63)G(35)P(1)              0.21  54/37   3.31      site
626  P     T(12)P(71)S(15)             0.21  20/1    3.42      site
329  L     V(3)M(29)K(3)L(57)R(3)S(1)  0.24  6/6     3.29      site
317  V     I(29)V(64)M(3)L(1)          0.25  8/4     3.56

Table 2. The top 25% of residues in 2uxnA at the interface with FDA. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in brackets; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 3.
res  type  disruptive mutations
285  G     (KER)(FQMWHD)(NYLPI)(SVA)
287  G     (KER)(FQMWHD)(NYLPI)(SVA)
290  G     (KER)(FQMWHD)(NYLPI)(SVA)
308  E     (FWH)(YVCARG)(T)(SNKLPI)
314  G     (KER)(FQMWHD)(NYLPI)(SVA)
315  G     (KER)(FQMWHD)(NYLPI)(SVA)
316  R     (TD)(SYEVCLAPIG)(FMW)(N)
330  G     (KER)(FQMWHD)(NYLPI)(SVA)
661  K     (Y)(FTW)(SVCAG)(HD)
624  T     (KR)(QH)(FEMW)(N)
751  W     (KE)(TQD)(SNCG)(R)
761  Y     (K)(QM)(NVLAPI)(ER)
800  G     (KER)(FQMWHD)(NLPI)(Y)
801  E     (FWH)(VCAG)(YR)(T)
814  A     (KYER)(QHD)(N)(FTMW)
590  V     (YR)(KE)(H)(QD)
310  R     (TY)(D)(FVCAWG)(SELPI)
810  T     (KR)(FMWH)(Q)(LPI)
659  L     (Y)(R)(H)(T)
809  A     (Y)(R)(H)(K)
286  S     (KR)(QH)(FMW)(E)
333  V     (YR)(KE)(H)(QD)
812  H     (E)(Q)(M)(D)
811  V     (Y)(R)(KE)(H)
571  Y     (K)(Q)(R)(EM)
335  T     (R)(K)(H)(FW)
309  A     (KR)(E)(Y)(QH)
332  M     (Y)(T)(H)(CRG)
331  A     (R)(Y)(KE)(H)
626  P     (R)(Y)(H)(K)
329  L     (Y)(R)(T)(H)
317  V     (Y)(R)(KEH)(D)

Table 3. List of disruptive mutations for the top 25% of residues in 2uxnA, that are at the interface with FDA.

Fig. 5. Residues in 2uxnA, at the interface with FDA, colored by their relative importance. The ligand (FDA) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 2uxnA.)

Figure 5 shows residues in 2uxnA colored by their importance, at the interface with 2uxnAFDA900.

Interface with 2uxnB. Table 4 lists the top 25% of residues at the interface with 2uxnB. The following table (Table 5) suggests possible disruptive replacements for these residues (see Section 4.6).

Table 4.
res  type  subst's (%)                 cvg   noc/bb  dist (Å)
387  E     E(35)K(1)D(59)N(1)A(1)      0.22  15/5    3.62

Table 4. The top 25% of residues in 2uxnA at the interface with 2uxnB. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in brackets; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 5.
res  type  disruptive mutations
387  E     (FWH)(Y)(R)(CG)

Table 5. List of disruptive mutations for the top 25% of residues in 2uxnA, that are at the interface with 2uxnB.

Fig. 6. Residues in 2uxnA, at the interface with 2uxnB, colored by their relative importance. 2uxnB is shown in backbone representation. (See Appendix for the coloring scheme for the protein chain 2uxnA.)

Figure 6 shows residues in 2uxnA colored by their importance, at the interface with 2uxnB.

Chloride ion binding site. Table 6 lists the top 25% of residues at the interface with 2uxnACL1839 (chloride ion). The following table (Table 7) suggests possible disruptive replacements for these residues (see Section 4.6).

Table 6.
res  type  subst's (%)                 cvg   noc/bb  dist (Å)  antn
751  W     W(98).(1)                   0.07  9/1     4.13      site
754  D     D(92)N(1).(1)E(3)           0.20  3/0     3.63
749  S     T(84)S(14).(1)              0.21  2/1     4.62

Table 6. The top 25% of residues in 2uxnA at the interface with chloride ion. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in brackets; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 7.
res  type  disruptive mutations
751  W     (KE)(TQD)(SNCG)(R)
754  D     (R)(FWH)(YVCAG)(TK)
749  S     (KR)(FWH)(QM)(LPI)

Table 7. List of disruptive mutations for the top 25% of residues in 2uxnA, that are at the interface with chloride ion.

Figure 7 shows residues in 2uxnA colored by their importance, at the interface with 2uxnACL1839.

Glycerol binding site. Table 8 lists the top 25% of residues at the interface with 2uxnAGOL1837 (glycerol). The following table (Table 9) suggests possible disruptive replacements for these residues (see Section 4.6).

Table 8.
res  type  subst's (%)                 cvg   noc/bb  dist (Å)  antn
688  R     R(100)                      0.03  14/0    2.86      site
537  E     E(98)L(1)                   0.04  1/0     4.83

Table 8. The top 25% of residues in 2uxnA at the interface with glycerol. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in brackets; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Fig. 7. Residues in 2uxnA, at the interface with chloride ion, colored by their relative importance. The ligand (chloride ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 2uxnA.)

Table 9.
res  type  disruptive mutations
688  R     (TD)(SYEVCLAPIG)(FMW)(N)
537  E     (H)(FYWR)(CG)(TVA)

Table 9. List of disruptive mutations for the top 25% of residues in 2uxnA, that are at the interface with glycerol.

Figure 8 shows residues in 2uxnA colored by their importance, at the interface with 2uxnAGOL1837.

Interface with the peptide 2uxnE. Table 10 lists the top 25% of residues at the interface with 2uxnE. The following table (Table 11) suggests possible disruptive replacements for these residues (see Section 4.6).

Fig. 8. Residues in 2uxnA, at the interface with glycerol, colored by their relative importance. The ligand (glycerol) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 2uxnA.)

Table 10.
res  type  subst's (%)                 cvg   noc/bb  dist (Å)  antn
564  H     H(100)                      0.03  36/1    2.69
562  G     G(98)C(1)                   0.04  5/5     4.09
555  D     D(89)N(8)E(1)               0.07  10/1    2.70
761  Y     Y(98).(1)                   0.07  25/0    3.36      site
540  N     N(77)C(22)                  0.09  10/0    3.27
535  N     N(91)H(7)L(1)               0.10  19/1    3.15
552  W     W(94)Y(3)C(1)               0.10  18/3    3.45
553  D     D(84)N(14)R(1)              0.10  3/2     4.43
531  W     W(91)F(7)L(1)               0.11  1/0     4.75
810  T     T(96).(1)S(1)               0.12  1/0     4.60      site
659  L     L(73)I(8)M(15)V(1)          0.13  1/0     4.45
809  A     A(75)Q(8).(1)D(12)S(1)      0.13  6/5     3.40      site
695  W     W(57)Y(31)F(10)             0.15  2/0     4.97
536  L     L(92)V(1)T(3)M(1)           0.17  7/0     3.69
538  F     F(36)Y(61)C(1)              0.18  11/0    3.75
559  E     E(68)N(3)G(12)D(8)Q(5)L(1)  0.18  9/0     4.18
539  A     A(68)S(17)G(12)R(1)         0.20  26/14   3.20
382  F     F(59)D(1)Y(33)R(3)L(1)      0.21  2/0     4.19
693  L     L(61)M(21)I(3)Q(10)V(3)     0.23  1/0     4.20

Table 10. The top 25% of residues in 2uxnA at the interface with 2uxnE. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in brackets; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 11.
res  type  disruptive mutations
564  H     (E)(TQMD)(SNKVCLAPIG)(YR)
562  G     (KER)(FQMWHD)(NYLPI)(SVA)
555  D     (R)(FWH)(Y)(VCAG)
761  Y     (K)(QM)(NVLAPI)(ER)
540  N     (Y)(FWH)(ER)(T)
535  N     (Y)(T)(E)(FWH)
552  W     (K)(E)(Q)(D)
553  D     (FW)(R)(YH)(VCAG)
531  W     (KE)(T)(QD)(R)
810  T     (KR)(FMWH)(Q)(LPI)
659  L     (Y)(R)(H)(T)
809  A     (Y)(R)(H)(K)
695  W     (K)(E)(Q)(D)
536  L     (R)(Y)(H)(K)
538  F     (K)(E)(Q)(D)
559  E     (H)(FW)(Y)(R)
539  A     (E)(KY)(R)(D)
382  F     (K)(E)(T)(Q)
693  L     (Y)(R)(H)(T)

Table 11. List of disruptive mutations for the top 25% of residues in 2uxnA, that are at the interface with 2uxnE.

Fig. 9. Residues in 2uxnA, at the interface with 2uxnE, colored by their relative importance. 2uxnE is shown in backbone representation. (See Appendix for the coloring scheme for the protein chain 2uxnA.)

Figure 9 shows residues in 2uxnA colored by their importance, at the interface with 2uxnE.

3 CHAIN 2UXNB

3.1 Q9UKL0 overview

From SwissProt, id Q9UKL0, 89% identical to 2uxnB:

Description: CoREST protein.
Organism, scientific name: Homo sapiens (Human).
Taxonomy: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; Homo.
Subcellular location: Nuclear (By similarity).

Fig. 10. Residues 308-440 in 2uxnB colored by their relative importance. (See Appendix, Fig. 15, for the coloring scheme.)

3.2 Multiple sequence alignment for 2uxnB

For the chain 2uxnB, the alignment 2uxnB.msf (attached) with 27 sequences was used. The alignment was downloaded from the HSSP database, and fragments shorter than 75% of the query as well as duplicate sequences were removed. It can be found in the attachment to this report, under the name of 2uxnB.msf. Its statistics, from the alistat program are the following:

Format: MSF
Number of sequences: 27
Total number of residues: 3395
Smallest: 103
Largest: 133
Average length: 125.7
Alignment length: 133
Average identity: 42%
Most related pair: 95%
Most unrelated pair: 9%
Most distant seq: 30%

Furthermore, <1% of residues show as conserved in this alignment.

The alignment consists of 22% eukaryotic (18% vertebrata, 3% arthropoda) sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 2uxnB.descr.

3.3 Residue ranking in 2uxnB

The 2uxnB sequence is shown in Fig. 10, with each residue colored according to its estimated importance. The full listing of residues in 2uxnB can be found in the file called 2uxnB.rankssorted in the attachment.

3.4 Top ranking residues in 2uxnB and their position on the structure

In the following we consider residues ranking among top 25% of residues in the protein. Figure 11 shows residues in 2uxnB colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

3.4.1 Clustering of residues at 25% coverage. Fig. 12 shows the top 25% of all residues, this time colored according to clusters they belong to. The clusters in Fig. 12 are composed of the residues listed in Table 12.

Fig. 11. Residues in 2uxnB, colored by their relative importance. Clockwise:front, back, top and bottom views.

Fig. 12. Residues in 2uxnB, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 12.
cluster color  size  member residues
red            24    378,380,382,383,387,389,391,
                     392,393,396,399,401,404,405,
                     412,416,420,423,425,427,429,
                     434,436,439
blue           6     310,311,312,313,314,316

Table 12. Clusters of top ranking residues in 2uxnB.

3.4.2 Overlap with known functional surfaces at 25% coverage. The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

Interface with 2uxnA. Table 13 lists the top 25% of residues at the interface with 2uxnA. The following table (Table 14) suggests possible disruptive replacements for these residues (see Section 4.6).

Table 13.
res  type  subst's (%)                      cvg   noc/bb  dist (Å)
313  G     .(3)G(96)                        0.01  11/11   3.90
310  P     .(7)P(85)R(7)                    0.04  34/14   3.48
393  Q     K(11)Q(66)L(14)E(7)              0.07  65/12   3.37
314  M     .(3)I(7)M(77)R(7)A(3)            0.10  52/11   3.25
312  K     .(7)K(85)N(3)R(3)                0.14  10/9    2.49
311  P     .(7)P(85)R(3)G(3)                0.19  31/2    3.62
401  D     N(25)D(62)Q(3)T(3)S(3)           0.20  1/0     4.58
404  A     A(74)T(11)M(11)E(3)              0.20  20/14   3.37
389  L     E(11)L(59)Q(14)N(3)M(3)R(3)P(3)  0.22  14/3    3.67
316  L     .(3)L(59)I(25)T(7)M(3)           0.25  40/11   3.78

Table 13. The top 25% of residues in 2uxnB at the interface with 2uxnA. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in brackets; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 14.
res  type  disruptive mutations
313  G     (KER)(FQMWHD)(NLPI)(Y)
310  P     (Y)(T)(CRG)(SH)
393  Q     (Y)(FTWH)(CG)(SVA)
314  M     (Y)(T)(H)(S)
312  K     (Y)(T)(FW)(SVCAG)
311  P     (Y)(R)(TH)(E)
401  D     (R)(FWH)(Y)(K)
404  A     (R)(Y)(KH)(E)
389  L     (Y)(T)(HR)(CG)
316  L     (R)(Y)(H)(T)

Table 14. List of disruptive mutations for the top 25% of residues in 2uxnB, that are at the interface with 2uxnA.

Figure 13 shows residues in 2uxnB colored by their importance, at the interface with 2uxnA.

3.4.3 Possible novel functional surfaces at 25% coverage. One group of residues is conserved on the 2uxnB surface, away from (or substantially larger than) other functional sites and interfaces recognizable in PDB entry 2uxn. It is shown in Fig. 14. The right panel shows (in blue) the rest of the larger cluster this surface belongs to. The residues belonging to this surface “patch” are listed in Table 15, while Table 16 suggests possible disruptive replacements for these residues (see Section 4.6).

Fig. 13. Residues in 2uxnB, at the interface with 2uxnA, colored by their relative importance. 2uxnA is shown in backbone representation. (See Appendix for the coloring scheme for the protein chain 2uxnB.)

Fig. 14. A possible active surface on the chain 2uxnB. The larger cluster it belongs to is shown in blue.

Table 15.
res  type  substitutions (%)                cvg
383  W     W(92).(3)R(3)                    0.03
387  E     E(92)K(3)P(3)                    0.03
382  R     R(88)K(3).(3)P(3)                0.04
429  N     N(88).(11)                       0.06
393  Q     K(11)Q(66)L(14)E(7)              0.07
412  K     K(85)A(7)P(3)S(3)                0.08
392  V     K(11)V(66)L(14)R(3)P(3)          0.09
436  E     Q(11).(14)E(74)                  0.10
427  R     R(85)Q(3).(7)T(3)                0.11
425  R     K(11)S(3)R(77).(7)               0.13
416  Q     Q(74)H(14)A(3).(3)S(3)           0.14
378  K     N(3)K(74)R(7).(11)T(3)           0.15
380  N     S(18)N(66)T(7).(3)Q(3)           0.16
420  F     F(88)T(3)N(3)R(3)                0.17
434  L     L(66).(14)V(18)                  0.18
389  L     E(11)L(59)Q(14)N(3)M(3)R(3)P(3)  0.22
423  N     N(70)S(18).(7)Q(3)               0.23
439  A     .(18)A(70)Q(7)V(3)               0.23
396  R     L(3)R(77)A(3)V(7)P(3)K(3)        0.24

Table 15. Residues forming surface “patch” in 2uxnB.

Table 16.
res  type  disruptive mutations
383  W     (E)(TD)(K)(Q)
387  E     (FYWH)(CG)(TVAR)(S)
382  R     (T)(YD)(SCG)(VA)
429  N     (Y)(FTWH)(SVCAG)(ER)
393  Q     (Y)(FTWH)(CG)(SVA)
412  K     (Y)(FW)(T)(H)
392  V     (Y)(E)(R)(D)
436  E     (FWH)(YVCAG)(TR)(S)
427  R     (D)(T)(YVLAPI)(SFECWG)
425  R     (TD)(Y)(VCLAPIG)(SFEW)
416  Q     (Y)(FTWH)(CG)(SVAD)
378  K     (Y)(FW)(T)(VA)
380  N     (Y)(FWH)(R)(T)
420  F     (E)(K)(D)(T)
434  L     (Y)(R)(H)(T)
389  L     (Y)(T)(HR)(CG)
423  N     (Y)(FWH)(T)(VCARG)
439  A     (Y)(ER)(KH)(D)
396  R     (Y)(T)(D)(E)

Table 16. Disruptive mutations for the surface patch in 2uxnB.

4 NOTES ON USING TRACE RESULTS

4.1 Coverage

Trace results are commonly expressed in terms of coverage: the residue is important if its “coverage” is small - that is, if it belongs to some small top percentage of residues [100% is all of the residues in a chain], according to trace. The ET results are presented in the form of a table, usually limited to the top 25% of residues (or to some nearby percentage), sorted by the strength of the presumed evolutionary pressure. (I.e., the smaller the coverage, the stronger the pressure on the residue.) Starting from the top of that list, mutating a couple of residues should affect the protein somehow, with the exact effects to be determined experimentally.
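The relation between a ranking and coverage can be sketched as follows. This is a minimal illustration under an assumption: coverage is taken to be the fraction of residues whose trace score is at least as good as the residue's own, so tied residues share one coverage value (as the identical cvg values for fully conserved positions in the tables suggest). The function names and the example scores are made up for illustration and are not taken from the rankssorted file.

```python
def coverage_from_scores(scores):
    """Map residue id -> coverage, given residue id -> score (smaller = more important)."""
    n = len(scores)
    return {
        res: sum(1 for other in scores.values() if other <= s) / n
        for res, s in scores.items()
    }


def top_residues(scores, cutoff=0.25):
    """Residues whose coverage does not exceed the cutoff, most important first."""
    cov = coverage_from_scores(scores)
    return sorted((r for r, c in cov.items() if c <= cutoff), key=cov.get)


# Hypothetical scores for eight residues (not real trace output).
example_scores = {285: 1.0, 316: 1.0, 624: 1.5, 590: 2.0,
                  317: 3.2, 571: 4.0, 335: 5.0, 329: 6.0}
example_coverage = coverage_from_scores(example_scores)
example_top = top_residues(example_scores, cutoff=0.25)
```

With these made-up scores, the two tied residues each get coverage 0.25 and are the only ones selected at the 25% cutoff.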

4.2 Known substitutions

One of the table columns is “substitutions” - other amino acid types seen at the same position in the alignment. These amino acid types may be interchangeable at that position in the protein, so if one wants to affect the protein by a point mutation, they should be avoided. For example, if the substitutions are “RVK” and the original protein has an R at that position, it is advisable to try anything but R, V, or K. Conversely, when looking for substitutions which will not affect the protein, one may try replacing R with K, or (perhaps more surprisingly) with V. The percentage of times the substitution appears in the alignment is given in the immediately following bracket. No percentage is given in the cases when it is smaller than 1%. This is meant to be a rough guide - due to rounding errors these percentages often do not add up to 100%.
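A substitution field such as R(87)Q(1)K(8)S(1) can be parsed mechanically to list the amino acid types that were not seen at a position, which are the ones Section 4.2 suggests trying when the goal is to affect the protein. The sketch below assumes the field layout shown in the tables of this report (one-letter code, optional percentage in brackets, "." for a gap); the function names are hypothetical.

```python
import re

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_substitutions(field):
    """Parse e.g. 'R(87)Q(1)K(8)S(1)' into {'R': 87, 'Q': 1, 'K': 8, 'S': 1}.

    A type listed without a bracket (below 1%) is recorded with 0.
    """
    pairs = re.findall(r"([A-Z.])(?:\((\d+)\))?", field)
    return {aa: int(pct) if pct else 0 for aa, pct in pairs}


def unseen_types(field):
    """Amino acid types absent from the alignment column, sorted alphabetically."""
    return sorted(AMINO_ACIDS - set(parse_substitutions(field)))


seen_310 = parse_substitutions("R(87)Q(1)K(8)S(1)")  # residue 310 in Table 2
candidates_310 = unseen_types("R(87)Q(1)K(8)S(1)")
```

The unseen list is only a starting point; Section 4.6 describes how the report orders candidates by presumed disruptiveness.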

4.3 Surface

To detect candidates for novel functional interfaces, first we look for residues that are solvent accessible (according to the DSSP program) by at least 10 Å², which is roughly the area needed for one water molecule to come in contact with the residue. Furthermore, we require that these residues form a “cluster” of residues which have a neighbor within 5 Å of any of their heavy atoms.

Note, however, that, if our picture of protein evolution is correct, the neighboring residues which are not surface accessible might be equally important in maintaining the interaction specificity - they should not be automatically dropped from consideration when choosing the set for mutagenesis. (Especially if they form a cluster with the surface residues.)
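The two criteria in Section 4.3 (accessibility of at least 10 Å² and a neighbor within 5 Å) can be illustrated with a small single-linkage clustering sketch. It assumes accessibility values and heavy-atom coordinates have already been extracted (for example from DSSP output and the PDB file); the data structures, function names, and toy coordinates are illustrative assumptions, not the report's own implementation.

```python
import math
from itertools import combinations


def surface_candidates(accessibility, min_area=10.0):
    """Residue ids with solvent accessible area of at least min_area (in square angstroms)."""
    return {res for res, area in accessibility.items() if area >= min_area}


def cluster_residues(residue_atoms, cutoff=5.0):
    """Group residues into clusters; two residues are linked when any pair of
    their heavy atoms lies closer than cutoff (in angstroms).

    residue_atoms maps residue id -> list of (x, y, z) heavy-atom coordinates.
    """
    neighbors = {res: set() for res in residue_atoms}
    for a, b in combinations(residue_atoms, 2):
        if any(math.dist(p, q) < cutoff
               for p in residue_atoms[a] for q in residue_atoms[b]):
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters, visited = [], set()
    for start in residue_atoms:
        if start in visited:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first walk over the contact graph
            res = stack.pop()
            if res not in component:
                component.add(res)
                stack.extend(neighbors[res] - component)
        visited |= component
        clusters.append(component)
    return sorted(clusters, key=len, reverse=True)


# Toy data: one atom per residue, placed along the x axis.
toy_atoms = {1: [(0.0, 0.0, 0.0)], 2: [(3.0, 0.0, 0.0)],
             3: [(6.0, 0.0, 0.0)], 4: [(20.0, 0.0, 0.0)]}
toy_clusters = cluster_residues(toy_atoms)
toy_surface = surface_candidates({1: 35.2, 2: 4.0, 3: 10.0, 4: 0.0})
```

In the toy data, residues 1 and 3 are not in direct contact but end up in one cluster through residue 2, which is the behavior expected of a neighbor-based cluster definition.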

4.4 Number of contacts

Another column worth noting is denoted “noc/bb”; it tells the number of contacts heavy atoms of the residue in question make across the interface, as well as how many of them are realized through the backbone atoms (if all or most contacts are through the backbone, mutation presumably won’t have strong impact). Two heavy atoms are considered to be “in contact” if their centers are closer than 5 Å.
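A possible reading of the noc/bb column is sketched below: every pair of heavy atoms (one from the residue, one from the partner) closer than 5 Å counts as one contact, and the contact is attributed to the backbone when the residue's atom is N, CA, C, or O. Whether the report counts atom pairs or distinct atoms is not stated, so the pair counting here is an assumption, as are the function name and the toy coordinates.

```python
import math

BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def count_contacts(residue_atoms, partner_atoms, cutoff=5.0):
    """Return (noc, bb) for one residue.

    residue_atoms: list of (atom_name, (x, y, z)) for the residue's heavy atoms.
    partner_atoms: list of (x, y, z) for heavy atoms of the ligand or partner chain.
    """
    noc = bb = 0
    for name, position in residue_atoms:
        for other in partner_atoms:
            if math.dist(position, other) < cutoff:
                noc += 1
                if name in BACKBONE_ATOMS:
                    bb += 1
    return noc, bb


# Toy residue with one backbone atom and two side-chain atoms.
toy_residue = [("CA", (0.0, 0.0, 0.0)),
               ("CB", (1.5, 0.0, 0.0)),
               ("CG", (10.0, 0.0, 0.0))]
toy_partner = [(4.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
toy_noc, toy_bb = count_contacts(toy_residue, toy_partner)
```

For the toy residue this gives three contacts, one of them through the backbone, which would be written 3/1 in the notation of the tables.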

4.5 Annotation
If the residue annotation is available (either from the pdb file or from other sources), another column, with the header “annotation”, appears. Annotations carried over from PDB are the following: site (indicating existence of related site record in PDB), S-S (disulfide bond forming residue), hb (hydrogen bond forming residue), jb (james bond forming residue), and sb (for salt bridge forming residue).

4.6 Mutation suggestions
Mutation suggestions are completely heuristic and based on complementarity with the substitutions found in the alignment. Note that they are meant to be disruptive to the interaction of the protein with its ligand. The attempt is made to complement the following properties: small [AVGSTC], medium [LPNQDEMIK], large [WFYHR], hydrophobic [LPVAMWFI], polar [GTCY]; positively [KHR] or negatively [DE] charged, aromatic [WFYH], long aliphatic chain [EKRQM], OH-group possession [SDETY], and NH2 group possession [NQRK]. The suggestions are listed according to how different they appear to be from the original amino acid, and they are grouped in round brackets if they appear equally disruptive. From left to right, each bracketed group of amino acid types resembles more strongly the original (i.e. is, presumably, less disruptive). These suggestions are tentative - they might prove disruptive to the fold rather than to the interaction. Many researchers will choose, however, the straightforward alanine mutations, especially in the beginning stages of their investigation.

Fig. 15. Coloring scheme used to color residues by their relative importance (coverage scale: 5%, 30%, 50%, 100%).
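The property classes of Section 4.6 can be used to illustrate how such a ranking might look. The sketch below is not the report maker heuristic, which the text does not spell out; it assumes a simple distance, namely the number of property classes in which a candidate and the original residue differ, and the name `suggest_mutations` is hypothetical.

```python
# Property classes exactly as listed in Section 4.6.
PROPERTY_CLASSES = {
    "small": "AVGSTC",
    "medium": "LPNQDEMIK",
    "large": "WFYHR",
    "hydrophobic": "LPVAMWFI",
    "polar": "GTCY",
    "positive": "KHR",
    "negative": "DE",
    "aromatic": "WFYH",
    "long_aliphatic": "EKRQM",
    "oh_group": "SDETY",
    "nh2_group": "NQRK",
}
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def properties(aa):
    """Names of all property classes the amino acid type belongs to."""
    return {name for name, members in PROPERTY_CLASSES.items() if aa in members}


def suggest_mutations(original, substitutions=""):
    """Bracketed groups of candidates, most different from the original first.

    Assumed distance: size of the symmetric difference of property sets.
    Types already seen in the alignment (substitutions) are left out.
    """
    base = properties(original)
    groups = {}
    for aa in AMINO_ACIDS:
        if aa == original or aa in substitutions:
            continue
        distance = len(base ^ properties(aa))
        groups.setdefault(distance, []).append(aa)
    # Equally distant candidates share one pair of round brackets.
    return ["(" + "".join(groups[d]) + ")" for d in sorted(groups, reverse=True)]


suggestions = suggest_mutations("R", "RVK")
```

As in the report, groups further to the right resemble the original residue more closely; whether any suggestion disrupts the interaction rather than the fold still has to be tested experimentally.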

5 APPENDIX

5.1 File formats
Files with extension “rankssorted” are the actual trace results. The fields in the table in this file:

• alignment# number of the position in the alignment

• residue# residue number in the PDB file

• type amino acid type

• rank rank of the position according to older version of ET

• variability has two subfields:
  1. number of different amino acids appearing in this column of the alignment
  2. their type

• rho ET score - the smaller this value, the lower the variability of this position across the branches of the tree (and, presumably, the greater the importance for the protein)

• cvg coverage - percentage of the residues on the structure which have this rho or smaller

• gaps percentage of gaps in this column
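The field list above suggests a simple reader for “rankssorted” files. The sketch below assumes whitespace-separated columns in the order listed, header lines starting with “%”, and cvg and gaps given as percentages; the exact file layout is not given in this report, so these details, the record class, and the sample rows are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RankRecord:
    alignment_pos: int   # alignment#
    residue_number: str  # residue#, kept as text (PDB numbers may carry letters)
    aa_type: str         # type
    rank: float          # rank according to the older version of ET
    n_types: int         # variability, subfield 1
    types_seen: str      # variability, subfield 2
    rho: float           # ET score
    cvg: float           # coverage
    gaps: float          # percentage of gaps


def parse_rankssorted(lines):
    """Parse rows of the assumed nine-column layout, skipping '%' headers."""
    records = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("%"):
            continue
        f = stripped.split()
        if len(f) != 9:
            continue  # row does not match the assumed layout
        records.append(RankRecord(int(f[0]), f[1], f[2], float(f[3]),
                                  int(f[4]), f[5], float(f[6]),
                                  float(f[7]), float(f[8])))
    return records


def residues_within_coverage(records, max_cvg):
    """Residue numbers whose coverage value does not exceed max_cvg."""
    return [r.residue_number for r in records if r.cvg <= max_cvg]


# Made-up rows, for illustration only.
sample = """\
% alignment# residue# type rank variability(2) rho cvg gaps
1 171 P 12.0 3 PAS 2.10 40.0 0.0
2 172 S 1.0 1 S 1.00 5.0 0.0
3 173 G 30.0 7 GASTNDE 3.75 90.0 12.5
""".splitlines()
records = parse_rankssorted(sample)
top = residues_within_coverage(records, 25.0)
```

Selecting by `cvg` rather than by `rho` mirrors the way the report itself presents results at a fixed coverage, such as 25%.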

5.2 Color schemes used
The following color scheme is used in figures with residues colored by cluster size: black is a single-residue cluster; clusters composed of more than one residue are colored according to this hierarchy (ordered by descending size): red, blue, yellow, green, purple, azure, turquoise, brown, coral, magenta, LightSalmon, SkyBlue, violet, gold, bisque, LightSlateBlue, orchid, RosyBrown, MediumAquamarine, DarkOliveGreen, CornflowerBlue, grey55, burlywood, LimeGreen, tan, DarkOrange, DeepPink, maroon, BlanchedAlmond.

The colors used to distinguish the residues by the estimated evolutionary pressure they experience can be seen in Fig. 15.

5.3 Credits
5.3.1 Alistat alistat reads a multiple sequence alignment from the file and shows a number of simple statistics about it. These statistics include the format, the number of sequences, the total number of residues, the average and range of the sequence lengths, and the alignment length (e.g. including gap characters). Also shown are some percent identities. A percent pairwise alignment identity is defined as (idents / MIN(len1, len2)) where idents is the number of exact identities and len1, len2 are the unaligned lengths of the two sequences. The “average percent identity”, “most related pair”, and “most unrelated pair” of the alignment are the average, maximum, and minimum of all (N)(N-1)/2 pairs, respectively. The “most distant seq” is calculated by finding the maximum pairwise identity (best relative) for all N sequences, then finding the minimum of these N numbers (hence, the most outlying sequence). alistat is copyrighted by HHMI/Washington University School of Medicine, 1992-2001, and freely distributed under the GNU General Public License.
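The identity statistics described above can be reproduced with a short Python sketch. It follows the definitions quoted in the text, but it is not the alistat source code: the function names, the treatment of “-” and “.” as gap characters, and the toy alignment are assumptions.

```python
from itertools import combinations

GAP_CHARS = "-."  # assumed gap symbols


def ungapped_length(seq):
    return sum(1 for c in seq if c not in GAP_CHARS)


def percent_identity(seq_a, seq_b):
    """100 * idents / MIN(len1, len2), with len1, len2 the unaligned lengths."""
    idents = sum(1 for a, b in zip(seq_a, seq_b)
                 if a == b and a not in GAP_CHARS)
    shorter = min(ungapped_length(seq_a), ungapped_length(seq_b))
    return 100.0 * idents / shorter if shorter else 0.0


def identity_summary(alignment):
    """alignment -- {name: aligned_sequence}, all sequences of equal length."""
    pairs = {(a, b): percent_identity(alignment[a], alignment[b])
             for a, b in combinations(alignment, 2)}  # all N(N-1)/2 pairs
    # Best relative of each sequence, then the sequence whose best is lowest.
    best = {name: max(v for pair, v in pairs.items() if name in pair)
            for name in alignment}
    most_distant = min(best, key=best.get)
    return {
        "average": sum(pairs.values()) / len(pairs),
        "most_related_pair": max(pairs, key=pairs.get),
        "most_unrelated_pair": min(pairs, key=pairs.get),
        "most_distant_seq": (most_distant, best[most_distant]),
    }


# Made-up alignment, for illustration only.
toy_alignment = {"s1": "ACDE-F", "s2": "ACDEGF", "s3": "AC--GW"}
summary = identity_summary(toy_alignment)
```

Dividing by the shorter unaligned length means that a fragment fully contained in a longer sequence scores 100%, which is worth keeping in mind when reading the “most related pair” value.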

5.3.2 CE To map ligand binding sites from different source structures, report maker uses the CE program: http://cl.sdsc.edu/. Shindyalov IN, Bourne PE (1998) “Protein structure alignment by incremental combinatorial extension (CE) of the optimal path.” Protein Engineering 11(9) 739-747.

5.3.3 DSSP In this work a residue is considered solvent accessible if the DSSP program finds it exposed to water by at least 10 Å², which is roughly the area needed for one water molecule to come in contact with the residue. DSSP is copyrighted by W. Kabsch, C. Sander and MPI-MF, 1983, 1985, 1988, 1994, 1995, CMBI version by [email protected] November 18, 2002,

http://www.cmbi.kun.nl/gv/dssp/descrip.html.

5.3.4 HSSP Whenever available, report maker uses HSSP alignment as a starting point for the analysis (sequences shorter than 75% of the query are taken out, however); R. Schneider, A. de Daruvar, and C. Sander. “The HSSP database of protein structure-sequence alignments.” Nucleic Acids Res., 25:226–230, 1997.

http://swift.cmbi.kun.nl/swift/hssp/

5.3.5 LaTeX The text for this report was processed using LaTeX; Leslie Lamport, “LaTeX: A Document Preparation System,” Addison-Wesley, Reading, Mass. (1986).

5.3.6 Muscle When making alignments “from scratch”, report maker uses the Muscle alignment program: Edgar, Robert C. (2004), “MUSCLE: multiple sequence alignment with high accuracy and high throughput.” Nucleic Acids Research 32(5), 1792-97.

http://www.drive5.com/muscle/

5.3.7 Pymol The figures in this report were produced using Pymol. The scripts can be found in the attachment. Pymol is an open-source application copyrighted by DeLano Scientific LLC (2005). For more information about Pymol see http://pymol.sourceforge.net/. (Note for Windows users: the attached package needs to be unzipped for Pymol to read the scripts and launch the viewer.)

5.4 Note about ET Viewer
Dan Morgan from the Lichtarge lab has developed a visualization tool specifically for viewing trace results. If you are interested, please visit:

http://mammoth.bcm.tmc.edu/traceview/

The viewer is self-unpacking and self-installing. Input files to be used with ETV (extension .etvx) can be found in the attachment to the main report.

5.5 Citing this work
The method used to rank residues and make predictions in this report can be found in Mihalek, I., I. Res, O. Lichtarge. (2004). “A Family of Evolution-Entropy Hybrid Methods for Ranking of Protein Residues by Importance” J. Mol. Bio. 336: 1265-82. For the original version of ET see O. Lichtarge, H. Bourne and F. Cohen (1996). “An Evolutionary Trace Method Defines Binding Surfaces Common to Protein Families” J. Mol. Bio. 257: 342-358.

report maker itself is described in Mihalek I., I. Res and O. Lichtarge (2006). “Evolutionary Trace Report Maker: a new type of service for comparative analysis of proteins.” Bioinformatics 22: 1656-7.

5.6 About report maker
report maker was written in 2006 by Ivana Mihalek. The 1D ranking visualization program was written by Ivica Res. report maker is copyrighted by Lichtarge Lab, Baylor College of Medicine, Houston.

5.7 Attachments
The following files should accompany this report:

• 2uxnA.complex.pdb - coordinates of 2uxnA with all of its interacting partners

• 2uxnA.etvx - ET viewer input file for 2uxnA

• 2uxnA.clusterreport.summary - Cluster report summary for2uxnA

• 2uxnA.ranks - Ranks file in sequence order for 2uxnA

• 2uxnA.clusters - Cluster descriptions for 2uxnA

• 2uxnA.msf - the multiple sequence alignment used for the chain 2uxnA

• 2uxnA.descr - description of sequences used in 2uxnA msf

• 2uxnA.rankssorted - full listing of residues and their ranking for 2uxnA

• 2uxnA.2uxnAFDA900.if.pml - Pymol script for Figure 5

• 2uxnA.cbcvg - used by other 2uxnA-related pymol scripts

• 2uxnA.2uxnB.if.pml - Pymol script for Figure 6

• 2uxnA.2uxnACL1839.if.pml - Pymol script for Figure 7

• 2uxnA.2uxnAGOL1837.if.pml - Pymol script for Figure 8

• 2uxnA.2uxnE.if.pml - Pymol script for Figure 9

• 2uxnB.complex.pdb - coordinates of 2uxnB with all of its interacting partners

• 2uxnB.etvx - ET viewer input file for 2uxnB

• 2uxnB.clusterreport.summary - Cluster report summary for 2uxnB

• 2uxnB.ranks - Ranks file in sequence order for 2uxnB

• 2uxnB.clusters - Cluster descriptions for 2uxnB

• 2uxnB.msf - the multiple sequence alignment used for the chain 2uxnB

• 2uxnB.descr - description of sequences used in 2uxnB msf

• 2uxnB.rankssorted - full listing of residues and their ranking for 2uxnB

• 2uxnB.2uxnA.if.pml - Pymol script for Figure 13

• 2uxnB.cbcvg - used by other 2uxnB-related pymol scripts
