Date post: | 19-Feb-2017 |
Category: | Science |
Upload: | shwetaamoni |
Modelling Assignment
Submitted to: Dr. Durg Vijay Singh
Submitted by: Shweta Kumari
Roll- 21, M.Sc. Bioinformatics, 2nd sem
CONTENT
Objective
Structure prediction
Threading
Ab initio
Phyre2 and result
Dali structure-structure alignment and its result
Robetta and its result
Validation
Result and discussion
Conclusion
References
OBJECTIVE
To build a model of the given amino acid sequence and validate the generated model.
>gi|407259499|gb|AFT91383.1| EcdL [Emericella rugulosa]
MDDSPWPQCDIRVQDTFGPQVSGCYEDFDFTLLFEESILYLPPLLIAASVALLRIWQL
RSTENLLKRSGLLSILKPTSTTRLSNAAIAIGFVASPIFAWLSFWEHARSLRPSTILNVYLLGTIPMDAARARTLFRMPGNSAIASIFATIVVCKVVLLVVEAMEKQRLLLDRGWAPEETAGILNRSFLWWFNPLLLSGYKQALTVDKLLAVDEDIGVEKSKDEIRRRWAQAVKQNASSLQDVLLAVYRTELWGGFLPRLCLIGVNYAQPFLVNRVVTFLGQPDTSTSRGVASGLIAAYAIVYMGIAVATAAFHHRSYRMVMMVRGGLILLIYDHTLTLNALSPSKNDSYTLITADIERIVSGLRSLHETWASLIEIALSLWLLETKIRVSAVAAAMVVLVCLLVSGALSGLLGVHQNLWLEAMQKRLNATLATIGSIKGIKATGRTNTLYETILQLRRTEIQKSLKFRELLVALVTLSYLSTTMAPTFAFGTYSILAKIRNMTPLLAAPAFSSLTIMTLLGQAVSGFVESLMGLRQAMASLERIRQYLVGKEAPEPSPNKPGVASTEGLVAWSASLDEPGLDPRVEMRRMSSLQHRFYNLGELQD
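Before a sequence like the one above is submitted to a prediction server, it can help to confirm that the FASTA record parses cleanly and to check its length. The following is a minimal sketch in Python, assuming a single-record FASTA string; the function names and the short toy record are illustrative only and are not part of the assignment.

```python
# The 20 standard amino acid one-letter codes.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def read_fasta_record(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("a FASTA record must start with '>'")
    header = lines[0][1:]
    # Sequence lines may be wrapped, so join them into one string.
    sequence = "".join(lines[1:]).upper()
    return header, sequence


def invalid_residues(sequence):
    """List any characters that are not standard amino acid codes."""
    return sorted(set(sequence) - STANDARD_RESIDUES)


# Hypothetical toy record, used here only to exercise the functions.
toy_record = ">toy_record hypothetical example\nMDDSPWPQ\nCDIRV\n"
header, sequence = read_fasta_record(toy_record)
print(header, len(sequence), invalid_residues(sequence))
```

Running the same two functions on the full record gives the residue count to compare against the expected sequence length.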
Structure Prediction
Protein structure prediction is the prediction of the three-dimensional structure of a protein from its amino acid sequence, i.e., the prediction of its folding and of its secondary, tertiary, and quaternary structure from its primary structure.
Knowledge of the 3D structure is useful for rational drug design, protein engineering, detailed study of protein-biomolecule interactions, and the study of evolutionary relationships between proteins or protein families.
METHOD OF STRUCTURE PREDICTION
Structure prediction
Experimental Method: X-Ray, NMR, EM
Computational Method: Template based, Template free
Template based: Homology, Threading
Template free: Ab initio
We have to build a model of the given sequence, the 604 amino acid residues of EcdL (Emericella rugulosa).
The given protein sequence did not show significant alignment with any solved structure.
Hence, we cannot use homology modelling to build the model.
The only alternatives are the THREADING and AB INITIO methods.
Threading
A “remote homology” method of protein modelling, used for proteins that share a fold with proteins of known structure but have no homologous protein of known structure.
Software used for fold recognition:
PHYRE2
I-TASSER
MUSTER
RaptorX
GenThreader
LOMETS
Ab initio method
Predicting the 3D structure without any “prior knowledge”.
If structural homologues (occasionally analogues) do not exist, or exist but cannot be identified, models have to be constructed from scratch. This procedure is called ab initio modelling.
Software used for ab initio structure prediction:
Robetta
PHYRE2 (Protein Homology/analogY Recognition Engine V 2.0)
Developed by Dr. Kelley and released on 14th Feb 2011. A widely used structure prediction server, cited over 1500 times. Ranked best for function prediction in CASP9. The basic working principle of PHYRE2 is:
Finding a sequence alignment to a known structure.
Copying the coordinates and relabelling the residues according to our sequence, based on the alignment.
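The two steps above can be sketched in a few lines of Python. This is only an illustration of the idea of copying template coordinates along an alignment, not PHYRE2's implementation; the function name, the gapped-string alignment format and the toy coordinates are all assumptions made for the example.

```python
def thread_onto_template(query_aln, template_aln, template_coords):
    """Copy template C-alpha coordinates onto aligned query residues.

    query_aln, template_aln: equal-length gapped strings ('-' marks a gap).
    template_coords: one (x, y, z) tuple per template residue, in order.
    Returns a list of (query_position, query_residue, coords). Query
    residues aligned to a gap have coords None (left unmodelled here).
    """
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned strings must have the same length")
    if len(template_coords) != len(template_aln.replace("-", "")):
        raise ValueError("need one coordinate per template residue")

    model = []
    query_pos = 0       # 1-based numbering of the query sequence
    template_idx = 0    # 0-based index into template_coords
    for q, t in zip(query_aln, template_aln):
        if q != "-":
            query_pos += 1
            # Relabel: keep the template position, use the query residue.
            coords = template_coords[template_idx] if t != "-" else None
            model.append((query_pos, q, coords))
        if t != "-":
            template_idx += 1
    return model


# Toy alignment and coordinates (hypothetical values).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = thread_onto_template("MK-LV", "MRAL-", coords)
for position, residue, xyz in model:
    print(position, residue, xyz)
```

Residues that come back with no coordinates correspond to insertions in the query, which a real pipeline would have to build by loop modelling.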
Features of PHYRE2:
Domain analysis
Motif highlighting
Transmembrane helices are coloured
The algorithm used to predict the 3D structure combines LOCAL ALIGNMENT and HMM: our sequence is locally aligned against the fold library, and an HMM of our sequence is matched against the sequences of known structure. The server returns confident predictions for subsequences of our sequence; these confident segments are cut out and resubmitted so that they can be joined during assembly.
DALI (Distance mAtrix aLIgnment)
A method for structure-structure alignment.
It uses the 3D Cartesian coordinates of the C-alpha atoms of each protein to calculate a residue-residue distance matrix.
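The residue-residue distance matrix described above can be computed directly from C-alpha coordinates. A minimal sketch, assuming the coordinates have already been read from a PDB file into a list of (x, y, z) tuples; the function name and the toy coordinates are hypothetical, and the comparison of two such matrices (the actual alignment step) is not shown.

```python
import math


def ca_distance_matrix(ca_coords):
    """Return the symmetric matrix of pairwise C-alpha distances."""
    n = len(ca_coords)
    return [
        [math.dist(ca_coords[i], ca_coords[j]) for j in range(n)]
        for i in range(n)
    ]


# Three toy C-alpha positions (hypothetical values, in angstroms).
toy_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
matrix = ca_distance_matrix(toy_coords)
for row in matrix:
    print(["%.1f" % d for d in row])
```

Because the matrix depends only on distances within one structure, it is unchanged by rotating or translating the protein, which is what makes it convenient for comparing two structures.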
Output generated:
Rank and PDB identifier
Z-score
RMSD
Lali (number of aligned positions)
Nres (number of residues in the matched structure)
%ID
PDB description
DALI result analysis
A low RMSD together with a high lali indicates a better alignment. If both values are high, or both are low, the alignments cannot be ranked from these two values alone.
RMSD: the root-mean-square deviation in distance between aligned alpha carbons (i.e., how far one structure diverges from the other).
Z-score: a measure of the quality of the structural alignment (higher values indicate a more significant match).
Note: the DALI package is written in Fortran with Perl scripts.
“The model shows the best alignment with 4f4c_A, with a low RMSD of 0.6 and a high lali of 403.”
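For reference, the RMSD reported by a structural alignment can be sketched as follows. The sketch assumes the two structures are already superimposed and that the lists hold the aligned C-alpha pairs in the same order; the function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation over paired, superimposed coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Two aligned C-alpha pairs, each displaced by 1 angstrom (toy values).
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(structure_a, structure_b))
```

The number of pairs passed in plays the role of lali, which is why an RMSD is only meaningful when read together with the number of aligned positions.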
Ab initio through ROBETTA
Non-template-based modelling of the query
Robetta secured the best position in CASP (Critical Assessment of Techniques for Protein Structure Prediction) 4, 5, 6, 7 and 8.
Robetta prediction types:
1. Ginzu: domain prediction
2. Structure: 3D model (available per domain from the result page after Ginzu completes)
Domain prediction by the GINZU protocol
Robetta produces several models.
When it detects more than one domain, Robetta breaks up the query sequence into putative domains and models each of them separately.
It then assembles all the domain models into a contiguous chain.
Robetta result analysis
Robetta shows alignment with these three proteins for domain prediction:
Sl. no. | Protein ID | Description
1 | 4p79 | Crystal structure of claudin provides insight into the architecture of tight junctions; ion channel regulator; alpha-helical membrane protein
2 | 1ni0 | Hydrolase; restriction endonuclease PvuII from Proteus vulgaris; class alpha/beta protein; EC 3.1.21.4
3 | 4m1m | Multidrug resistance protein; ATP-binding cassette transporter; Pgp
ANOLEA
Atomic Non-LOcal Environment Assessment
Performs energy calculations on a protein chain, evaluating the non-local environment of each heavy atom in the molecule.
Steps:
1. Open the ANOLEA server.
2. Browse to the model structure file.
3. Fill in the job title and submit to the server.
PROSA
PROtein Structure Analysis
Developed by Sippl, 1993.
Calculates a quality score from the C-alpha atoms of the input structure.
OUTPUT:
Z-score
Plot of residue scores
3D structure of the input protein
PROSA
1. Z-score: indicates the overall quality of the model. The value is displayed against the Z-scores of all experimentally determined protein chains in the PDB.
“The more negative the value, the more accurate the structure.”
2. Plot of residue scores: shows the local quality of the model by plotting energy as a function of amino acid sequence position i (window size 40).
Positive values correspond to problematic or erroneous parts of the structure.
3. ProSA-web visualizes the 3D structure of the input protein using the molecular viewer Jmol.
Residues are coloured from blue to red in order of increasing residue energy.
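The windowed residue-score plot can be illustrated with a simple moving average. This is a sketch under stated assumptions, not ProSA's actual procedure: the per-residue energies are made-up numbers, the window is centred on each position and truncated at the chain ends, and the function names are hypothetical.

```python
def windowed_mean(energies, window=40):
    """Average per-residue energies over a window around each position."""
    half = window // 2
    n = len(energies)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)      # truncate the window at the N-terminus
        hi = min(n, i + half)      # and at the C-terminus
        chunk = energies[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def problematic_positions(smoothed):
    """Return 1-based positions whose windowed energy is positive."""
    return [i + 1 for i, energy in enumerate(smoothed) if energy > 0]


# Toy energies with a small window so the effect is easy to follow.
toy_energies = [-1.0, -1.0, 3.0, 3.0, -1.0]
smoothed = windowed_mean(toy_energies, window=2)
print(smoothed)
print(problematic_positions(smoothed))
```

A larger window, such as the 40 residues mentioned above, smooths out single-residue fluctuations so that only extended high-energy stretches stand out.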
PROCHECK (PDBsum)
PDBsum is a pictorial database that provides an at-a-glance overview of the contents of each 3D structure deposited in the Protein Data Bank (PDB).
The PROCHECK analyses give an idea of the stereochemical quality of all protein chains in a given PDB structure.
They highlight regions of the proteins that appear to have unusual geometry and provide an overall assessment of the structure as a whole.
PDBsum uses version 3.6.2 of PROCHECK.
PROCHECK ANALYSIS
• G-factor: a log-odds score based on the observed distributions of stereochemical parameters.
• A low G-factor indicates that the property corresponds to a low-probability conformation.
• The stereochemical properties assessed are:
1. planarity
2. chirality
3. phi/psi preferences
4. chi angles
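To make the idea of a log-odds score concrete, the sketch below scores conformational bins by how often they occur in a reference set compared with a uniform expectation. It is a generic illustration only: the bin names, the counts, the pseudocount and the uniform baseline are assumptions for this example and are not PROCHECK's reference data or its exact formula.

```python
import math


def log_odds_scores(bin_counts, pseudocount=1.0):
    """Score each bin as log(observed frequency / uniform frequency)."""
    n_bins = len(bin_counts)
    total = sum(bin_counts.values()) + pseudocount * n_bins
    uniform = 1.0 / n_bins
    return {
        name: math.log(((count + pseudocount) / total) / uniform)
        for name, count in bin_counts.items()
    }


# Hypothetical counts of phi/psi observations per region.
reference_counts = {"alpha": 700, "beta": 250, "left": 48, "disallowed": 2}
scores = log_odds_scores(reference_counts)
for region, score in scores.items():
    print(region, round(score, 2))
```

Regions that are rarely observed receive strongly negative scores, which mirrors the statement above that a low G-factor marks a low-probability conformation.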
Result and discussion
Fold recognition was carried out with the PHYRE2 server for fold assessment. The ab initio prediction was analysed with the Robetta server, which gives information about domains.
After the model was built, it was validated with the ANOLEA and PROSA servers.
The Ramachandran plot of the model was analysed using PDBsum PROCHECK, with a description of the allowed regions.
The comparative and combined study of PHYRE2 and Robetta shows:
Sl. no. | Structure prediction method | Protein ID | Description
1 | Fold recognition by PHYRE2 | 4F4C | Crystal structure of the multidrug transporter P-glycoprotein from C. elegans
2 | Ab initio by Robetta | 4p79 | Membrane protein
3 | Ab initio by Robetta | 1ni0 | Hydrolase
4 | Ab initio by Robetta | 4m1m | Multidrug resistance protein, ATP-binding cassette transporter, Pgp
Conclusion
The models generated for the given amino acid sequence by PHYRE2 (fold recognition method) and Robetta (ab initio prediction) suggest that the given protein is:
P-glycoprotein: a multidrug-resistance protein belonging to a superfamily of membrane-associated transport proteins
ABC (ATP binding cassette) transporter
Transmembrane protein (alpha helical structure)
References
http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index
http://www.sbg.bio.ic.ac.uk/phyre2/phyre2_output/7330b2b464c1ea64/summary.html
http://robetta.bakerlab.org/
http://melolab.org/anolea/
https://prosa.services.came.sbg.ac.at/prosa.php
http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/Generate.html
http://ekhidna.biocenter.helsinki.fi/dali_server/results/20150324-0049-69ef51112579617192cac4dcad7075f2/index.html