Date post: | 26-Dec-2015 |
Upload: | morris-cole |
Use of bioinformatics in drug development and diagnostics
Bringing a New Drug to Market
Review and approval by Food & Drug Administration
1 compound approved
Phase III: Confirms effectiveness and monitors adverse reactions from long-term use in 1,000 to 5,000 patient volunteers.
Phase II: Assesses effectiveness and looks for side effects in 100 to 500 patient volunteers.
Phase I: Evaluates safety and dosage in 20 to 100 healthy human volunteers.
5 compounds enter clinical trials
Discovery and preclinical testing: Compounds are identified and evaluated in laboratory and animal studies for safety, biological activity, and formulation.
5,000 compounds evaluated
(Timeline axis: 0 to 16 years)
Source: Tufts Center for the Study of Drug Development
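The attrition implied by the funnel can be worked out directly. The sketch below only restates the three counts quoted on the slide (5,000 evaluated, 5 entering trials, 1 approved); the function name is illustrative.

```python
# Compound counts quoted in the funnel above.
EVALUATED = 5000
ENTERING_TRIALS = 5
APPROVED = 1


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


pct_reaching_trials = percent(ENTERING_TRIALS, EVALUATED)   # preclinical survivors
pct_trial_success = percent(APPROVED, ENTERING_TRIALS)      # clinical survivors
pct_overall = percent(APPROVED, EVALUATED)                  # end-to-end yield

print(f"{pct_reaching_trials:.2f}% of evaluated compounds reach clinical trials")
print(f"{pct_trial_success:.0f}% of trial compounds are approved")
print(f"{pct_overall:.2f}% of evaluated compounds are approved")
```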
Biological Research in 21st Century
“The new paradigm, now emerging, is that all the 'genes' will be known (in the sense of being resident in databases available electronically), and that the starting point of a biological investigation will be theoretical.”
- Walter Gilbert
Rational Approach to Drug Discovery
• Identify target
• Clone gene encoding target
• Express target in recombinant form
• Crystal structures of target and target/inhibitor complexes
• Screen recombinant target with available inhibitors
• Identify lead compounds
• Synthesize modifications of lead compounds
• Toxicity & pharmacokinetic studies
• Preclinical trials
An Ideal Target
• Is generally an enzyme/receptor in a pathway, and its inhibition leads either to killing a pathogenic organism (e.g. the malarial parasite) or to modifying some aspect of the body's metabolism that is functioning abnormally.
• An ideal target…
– Is essential for the survival of the organism.
– Is located at a critical step in the metabolic pathway.
– Makes the organism vulnerable.
– Has a low concentration of the target gene product.
– Is an enzyme amenable to simple HTS assays.
How Can Bioinformatics Help in Target Identification?
• Homologous & Orthologous genes
• Gene Order
• Gene Clusters
• Molecular Pathways & Wire diagrams
• Gene Ontology
Identification of Unique Genes of Parasite as potential drug target.
Comparative Genomics of Malarial Parasites: Source for identification of new target molecules.
• Genome comparisons of malarial parasites of human.
• Genome comparisons of malarial parasites of human and rodent.
• Comparison of genomes of –
– Human
– Malarial parasite
– Mosquito
What should one look for?
(Venn diagram of three proteomes: Human, P.f., Mosquito)
Proteins that are shared by –
• All genomes
• Exclusively by Human & P.f.
• Exclusively by Human & Mosquito
• Exclusively by P.f. & Mosquito
Unique proteins in –
• Human
• P.f.: targets for anti-malarial drugs
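The three-way comparison above maps naturally onto set operations. The following is a minimal sketch, assuming each proteome has already been reduced to a set of ortholog-group identifiers by some upstream clustering step; the identifiers and the function name are hypothetical and are not taken from the slides.

```python
def compare_proteomes(human, pf, mosquito):
    """Partition ortholog-group IDs into the Venn regions named on the slide."""
    return {
        "all_three": human & pf & mosquito,
        "human_pf_only": (human & pf) - mosquito,
        "human_mosquito_only": (human & mosquito) - pf,
        "pf_mosquito_only": (pf & mosquito) - human,
        "human_unique": human - pf - mosquito,
        # Parasite-only groups are the candidate anti-malarial targets.
        "pf_unique": pf - human - mosquito,
    }


# Toy identifiers, purely illustrative.
human = {"OG1", "OG2", "OG3", "OG6"}
pf = {"OG1", "OG2", "OG4", "OG7"}
mosquito = {"OG1", "OG3", "OG4", "OG5"}

regions = compare_proteomes(human, pf, mosquito)
print(sorted(regions["pf_unique"]))
```

Groups in `pf_unique` have no counterpart in the host or the vector, which is why the slide singles them out as candidate targets.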
Drug Development Flowchart
• Check if structure is known
• If unknown, model it using the KNOWLEDGE-BASED HOMOLOGY MODELING APPROACH.
• Search for small molecules/inhibitors
• Structure-based Drug Design
• Drug-Protein Interactions
• Docking
Why Modeling?
• Experimental determination of structure is still a time-consuming and expensive process.
• The number of known sequences exceeds the number of known structures.
• Structure information is essential in understanding function.
Sequence identities & Molecular Modeling methods
Method | Sequence identity with known structures
ab initio | 0-20%
Fold recognition | 20-35%
Homology modeling | >35%
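The identity ranges above amount to a simple decision rule. A minimal sketch follows; the treatment of the exact boundary values (20% and 35%) is an assumption, since the slide does not say which method applies at the edges.

```python
def suggest_modeling_method(percent_identity):
    """Map sequence identity to a known structure onto the slide's three methods.

    Assumption: exactly 20% and exactly 35% fall under fold recognition.
    """
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity > 35:
        return "homology modeling"
    if percent_identity >= 20:
        return "fold recognition"
    return "ab initio"


for identity in (10, 28, 50):
    print(identity, suggest_modeling_method(identity))
```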
STRUCTURE-BASED DRUG DESIGN
(Flowchart)
• Compound sources: compound databases, microbial broths, plant extracts, combinatorial libraries; 3-D ligand databases
• Target enzyme OR receptor: 3-D structure by crystallography, NMR, electron microscopy OR homology modeling
• Docking, linking or binding: receptor-ligand complex
• Random screening, synthesis
• Lead molecule
• Testing
• Redesign to improve affinity, specificity etc.
Binding Site Analysis
• In the absence of a structure of the target-ligand complex, it is not a trivial exercise to locate the binding site!
• This is followed by Lead optimization.
Compounds which are weak inhibitors may be modified by combinatorial chemistry in silico if the target structure (3-dimensional!) is known, minimizing the number of potential test compounds.
[Figure: scaffold (atoms N, H, C) with variable substituents X, Y, Z, shown with the target structure]
Factors Affecting The Affinity Of A Small Molecule For A Target Protein
$\mathrm{LIGAND{\cdot}wat}_n + \mathrm{PROTEIN{\cdot}wat}_m \rightleftharpoons \mathrm{LIGAND{\cdot}PROTEIN{\cdot}wat}_p + (n+m-p)\,\mathrm{wat}$
• HYDROGEN BONDING
• HYDROPHOBIC EFFECT
• ELECTROSTATIC INTERACTIONS
• VAN DER WAALS INTERACTIONS
DIFFERENCE BETWEEN AN INHIBITOR AND DRUG
Extra requirements of a drug compared to an inhibitor
• Selectivity
• Less toxicity
• Bioavailability
• Slow clearance
• Reach the target
• Ease of synthesis
• Low price
• Slow or no development of resistance
• Stability upon storage as tablet or solution
• Pharmacokinetic parameters
• No allergies
LIPINSKI’S RULE OF FIVE
Poor absorption or permeation are more likely when:
- There are more than five H-bond donors
- The mol. wt. is over 500 Da
- The MLogP is over 4.15 (or CLogP > 5)
- The sum of N’s and O’s is over 10
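The four Rule of Five criteria can be checked mechanically once the descriptors are known. The sketch below assumes the caller supplies precomputed descriptor values (it does not compute them from a structure), uses the CLogP > 5 variant of the lipophilicity cutoff, and uses an illustrative function name; the example molecules are hypothetical.

```python
def lipinski_violations(h_bond_donors, mol_weight_da, clogp, n_plus_o_count):
    """Return the list of Rule of Five criteria that a compound exceeds."""
    violations = []
    if h_bond_donors > 5:
        violations.append("more than five H-bond donors")
    if mol_weight_da > 500:
        violations.append("molecular weight over 500 Da")
    if clogp > 5:
        violations.append("CLogP over 5")
    if n_plus_o_count > 10:
        violations.append("more than 10 N and O atoms")
    return violations


# Hypothetical descriptor values for two made-up compounds.
drug_like = lipinski_violations(2, 350.0, 3.1, 6)
poorly_absorbed = lipinski_violations(7, 620.0, 6.2, 12)
print(drug_like)
print(poorly_absorbed)
```

An empty list means none of the four warning signs is present; it does not by itself make an inhibitor a drug, since the other requirements listed on the slide still apply.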
THERMODYNAMICS OF RECEPTOR-LIGAND BINDING
•Proteins that interact with drugs are typically enzymes or receptors.
•Drugs may be classified as substrates/inhibitors (for enzymes) or agonists/antagonists (for receptors).
•Ligands for receptors normally bind non-covalently and reversibly.
•Enzyme inhibitors have a wide range of modes: non-covalent reversible, covalent reversible/irreversible, or suicide inhibition.
•Inhibitors are designed to bind with higher affinity: their affinities often exceed the corresponding substrate affinities by several orders of magnitude!
•Agonists are analogous to enzyme substrates: part of the binding energy may be used for signal transduction, inducing a conformation or aggregation shift.
•To understand what forces are responsible for ligands binding to receptors/enzymes, it helps to recall how proteins are structured:
•The observed structure of a protein is generally a consequence of the hydrophobic effect!
•Proteins generally bury hydrophobic residues inside the core, while exposing hydrophilic residues to the exterior. Salt bridges inside.
•Ligand-binding clefts in proteins often expose hydrophobic residues to solvent and may contain partially desolvated hydrophilic groups that are not paired.
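To put "several orders of magnitude" in energetic terms, affinity can be converted to a standard binding free energy with the textbook relation ΔG° = RT ln(Kd). That relation is not stated on the slides; it is added here as a worked illustration, and the Kd values used are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


# Free energy gained per tenfold improvement in affinity at 298.15 K.
per_decade = R_KCAL * 298.15 * math.log(10)

dg_nanomolar = binding_free_energy(1e-9)    # hypothetical tight-binding inhibitor
dg_millimolar = binding_free_energy(1e-3)   # hypothetical weakly bound substrate
print(f"{dg_nanomolar:.2f} kcal/mol vs {dg_millimolar:.2f} kcal/mol")
print(f"{per_decade:.2f} kcal/mol per order of magnitude")
```

Each order of magnitude in affinity corresponds to roughly 1.4 kcal/mol at room temperature, about the strength of a single weak non-covalent contact.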
Docking Methods
• Docking of ligands to proteins is a formidable problem, since even rigid-body docking entails optimization over 6 positional degrees of freedom (3 translational and 3 rotational).
• Rigid vs Flexible
• Manual Interactive Docking
Automated Docking Methods
• Speed vs Reliability
• Basic idea is to fill the active site of the target protein with a set of spheres.
• Match the centres of these spheres as closely as possible with the atoms in the database of small molecules with known 3-D structures.
• Examples:
– DOCK, CAVEAT, AUTODOCK, LEGEND, ADAM, LINKOR, LUDI.
GRID Based Docking Methods
• Grid-based methods
– GRID (Goodford, 1985, J. Med. Chem. 28:849)
– GREEN (Tomioka & Itai, 1994, J. Comput. Aided Mol. Des. 8:347)
– MCSS (Miranker & Karplus, 1991, Proteins 11:29)
• Functional groups are placed at regularly spaced (0.3-0.5 Å) lattice points in the active site and their interaction energies are evaluated.
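The lattice idea can be sketched in a few lines. This is not the GRID, GREEN or MCSS implementation: it places a single spherical probe on a regular lattice and scores it with a plain Lennard-Jones 12-6 term, and the well depth, optimal distance and coordinates are hypothetical values chosen only for illustration.

```python
import itertools
import math


def lattice(min_corner, max_corner, spacing):
    """Regularly spaced points filling the box between two corners (in Å)."""
    axes = []
    for lo, hi in zip(min_corner, max_corner):
        n = int(math.floor((hi - lo) / spacing + 1e-9)) + 1
        axes.append([lo + k * spacing for k in range(n)])
    return list(itertools.product(*axes))


def probe_energy(point, site_atoms, well_depth=0.1, r_min=3.5):
    """Lennard-Jones 12-6 energy of a probe at `point` (hypothetical parameters)."""
    energy = 0.0
    for atom in site_atoms:
        r = max(math.dist(point, atom), 0.5)  # clamp to avoid overflow at contact
        ratio = r_min / r
        energy += well_depth * (ratio ** 12 - 2.0 * ratio ** 6)
    return energy


def score_grid(site_atoms, min_corner, max_corner, spacing=0.5):
    """Evaluate the probe energy at every lattice point."""
    return {p: probe_energy(p, site_atoms)
            for p in lattice(min_corner, max_corner, spacing)}


# One site atom at the origin, probe scanned along the x-axis.
grid = score_grid([(0.0, 0.0, 0.0)], (3.0, 0.0, 0.0), (4.0, 0.0, 0.0), spacing=0.5)
best_point = min(grid, key=grid.get)
print(best_point, round(grid[best_point], 3))
```

Real grid methods repeat this for several probe types (functional groups) and keep the favourable lattice points as candidate positions for ligand atoms.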
CLUSTAL W (1.81) multiple sequence alignment

chabaudi        -----------------------E--KAGCFSNKTFKGLGNEGGLPWKCNSVDMKHFSSV 35
vinckei         -----------AICACCKVLNSNE--KASCFSNKTFKGLGNAGGLPWKCNSVDMKHFVSV 47
berghei         MEDLSETFDIYAICACCKVLNDDE--KVRCFNNKTFKGIGNAGVLPWKCNLIDMKYFSSV 58
yoelii          -----------AICACCKVINNNE--KSGSFNNKTFNGLGNAGMLPWKYNLVDMNYFSSV 47
vivax           MEDLSDVFDIYAICACCKVAPTSEGTKNEPFSPRTFRGLGNKGTLPWKCNSVDMKYFSSV 60
falciparum      -------------------------KKNEVFNNYTFRGLGNKGVLPWKCNSLDMKYFCAV 35
                                          *   *.  **.*:** * **** * :**::* :*

chabaudi        TSYVNETNYMRLKWKRDRYMEK---------NNVKLNTDGIPSVDKLQNIVVMGKASWES 86
vinckei         TSYVNENNYIRLKWKRDKYIKE---------NNVKVNTDGIPSIDKLQNIVVMGKTSWES 98
berghei         TSYINENNYIRLKWKRDKYMEKHNLK-----NNVELNTNIISSTNNLQNIVVMGKKSWES 113
yoelii          TSYVNENNYIRLQWKRDKYMGKNNLK-----NNAELNNGELN--NNLQNVVVMGKRNWDS 100
vivax           TTYVDESKYEKLKWKRERYLRMEASQGGGDNTSGGDNTHGGDNADKLQNVVVMGRSSWES 120
falciparum      TTYVNESKYEKLKYKRCKYLNKET----------VDNVNDMPNSKKLQNVVVMGRTNWES 85
                *:*::*.:* :*::** :*:                *       .:***:****: .*:*

chabaudi        IPSKFKPLQNRINIILSRTLKKEDLAKEYN------NVIIINSVDDLFPILKCIKYYKCF 140
vinckei         IPSKFKPLENRINIILSRTLKKENLAKEYS------NVIIIKSVDELFPILKCIKYYKCF 152
berghei         IPKKFKPLQNRINIILSRTLKKEDIVNENN--NENNNVIIIKSVDDLFPILKCTKYYKCF 171
yoelii          IPPKFKPLQNRINIILSRTLKKEDIANEDNKNNENGTVMIIKSVDDLFPILKAIKYYKCF 160
vivax           IPKQYKPLPNRINVVLSKTLTKEDVK---------EKVFIIDSIDDLLLLLKKLKYYKCF 171
falciparum      IPKKFKPLSNRINVILSRTLKKEDFD---------EDVYIINKVEDLIVLLGKLNYYKCF 136
                ** ::*** ****::**:**.**:.            * **..:::*: :*   :*****

chabaudi        I----------------------------------------------------------- 141
vinckei         IIGGASVYKEFLDRNLIKKIYFTRINNAYT------------------------------ 182
berghei         IIGGSSVYKEFLDRNLIKKIYFTRINNSYNCDVLFPEINENLFKITSISDVYYSNNTTLD 231
yoelii          IIGGSYVYKEFLDRNLIKKIYFTRINNSYN------------------------------ 190
vivax           IIGGAQVYRECLSRNLIKQIYFTRINGAYPCDVFFPEFDESQFRVTSVSEVYNSKGTTLD 231
falciparum      I----------------------------------------------------------- 137
                *

chabaudi        ---------
vinckei         ---------
berghei         FIIYSKTKE 240
yoelii          ---------
vivax           FLVYSKVGG 240
falciparum      ---------
Multiple alignment of DHFR of Plasmodium species
Antifolate drugs in the active site of L. casei DHFR, showing hydrogen bonding with surrounding residues (panels: MTX, TMP, PYR, SO3)
How molecular modeling could be used in identifying new leads
• These two compounds, a triazinobenzimidazole and a pyridoindole, were found to be active, with high Ki, against recombinant wild-type DHFR.
• This demonstrates the use of molecular modeling in malarial drug design.
How Bioinformatics Aids in Vaccine Development
Peptide Vaccine Development Using Bioinformatics Approaches
Emerging and re-emerging infectious disease threats, 1980-2001
• Viral
- Bolivian hemorrhagic fever: 1994, Latin America
- Bovine spongiform encephalopathy: 1986, United Kingdom
- Creutzfeldt-Jakob disease (a new variant, v-CJD)/mad cow disease: 1995-96, UK/France
- Dengue fever: 1994-97, Africa/Asia/Latin America/USA
- Ebola virus: 1994, Gabon; 1995, Zaire; 1996, United States (monkey)
- Hantavirus: 1993, United States; 1997, Argentina
- HIV subtype O: 1994, Africa
- Influenza A/Beijing/32/92, A/Wuhan/359/95, H5N1: 1993, United States; 1995, China; 1997, Hong Kong
- Japanese encephalitis: 1995, Australia
- Lassa fever: 1992, Nigeria
- Measles: 1997, Brazil
- Monkeypox: 1997, Congo
- Morbillivirus: 1994, Australia
- O’nyong-nyong fever: 1996, Uganda
- Polio: 1996, Albania
- Rift Valley fever: 1993, Sudan
- Venezuelan equine encephalitis: 1995-96, Venezuela/Colombia
- West Nile virus: 1996, Romania
- Yellow fever: 1993, Kenya; 1995, Peru
Emerging and re-emerging infectious disease threats, contd.
• Parasitic
- African trypanosomiasis: 1997, Sudan
- Ancylostoma caninum (eosinophilic enteritis): 1990s, Australia
- Cryptosporidiosis: 1993+, United States
- Malaria: 1995-97, Africa/Asia/Latin America/United States
- Metorchis: 1996, Canada
- Microsporidiosis: worldwide
• Fungal
- Coccidioidomycosis: 1993, United States
- Penicillium marneffei
Emerging and re-emerging infectious disease threats, contd.
• Bacterial
– Anthrax: 1993, Caribbean
– Cat scratch disease/bacillary angiomatosis (Bartonella henselae): 1990s, USA
– Chlamydia pneumoniae (pneumonia/coronary artery disease?): 1990s, USA (discovered 1983)
– Cholera: 1991, Latin America
– Diphtheria: 1993, former Soviet Union
– Ehrlichia chaffeensis, human monocytic ehrlichiosis (HME): United States
– Ehrlichia phagocytophila, human granulocytic ehrlichiosis (HGE): United States
– Escherichia coli O157: 1982-1997, United States; 1996, Japan
– Gonorrhea (drug resistant): 1995, United States
– Helicobacter pylori (ulcers/cancer): worldwide (discovered 1983)
– Leptospirosis: 1995, Nicaragua
– Lyme disease (Borrelia burgdorferi): 1990s, United States
– Meningococcal meningitis (serogroup A): 1995-1997, West Africa
– Pertussis: 1994, UK/Netherlands; 1996, USA
– Plague: 1994, India
– Salmonella typhimurium DT104 (drug resistant): 1995, USA
– Staphylococcus aureus (drug resistant): 1997, United States/Japan
– Toxic strep: United States
– Trench fever (Bartonella quintana): 1990s, United States
– Tuberculosis (highly transmissible): 1995, United States
– Vibrio cholerae O139: 1992, Southern Asia
Types of Vaccines
• Killed virus vaccines
• Live-attenuated vaccines
• Recombinant DNA vaccines
• Genetic vaccines
• Subunit vaccines
• Polytope/multi-epitope vaccines
• Synthetic peptide vaccines
Systems with potential use as T-cell vaccines
CD4+ T-cell vaccines | CD8+ T-cell vaccines
Killed microbe | Live attenuated microbe
Live attenuated microbe | -
Synthetic peptide coupled to protein | Synthetic peptide delivered in liposomes or ISCOMs
Recombinant microbial protein bearing CD4+ T-cell epitope | -
Chimeric virus expressing CD4+ T-cell epitope | Chimeric virus expressing CD8+ T-cell epitope
Chimeric Ig | Self-molecule expressing CD8+ T-cell epitope
Chimeric peptide-MHC class II complex | Chimeric peptide-MHC class I complex
Receptor-linked peptide | -
Naked DNA expressing CD4+ T-cell epitope | Naked DNA expressing CD8+ T-cell epitope
Abbreviations: Ig, immunoglobulin; ISCOM, immune-stimulating complex; MHC, major histocompatibility complex.
Why Synthetic Peptide Vaccines?
Chemically well defined, selective and safe.
Stable at ambient temperature.
No cold chain requirement hence cost effective in tropical countries.
Simple and standardised production facility.
What Are Epitopes?
Antigenic determinants or Epitopes are the portions of the antigen molecules which are responsible for specificity of the antigens in antigen-antibody (Ag-Ab) reactions and that combine with the antigen binding site of Ab, to which they are complementary.
Epitopes could be -
contiguous (when Ab binds to a contiguous sequence of amino acids)
non-contiguous (when Ab binds to non-contiguous residues, brought together by folding).
Sequential epitopes are contiguous epitopes.
Conformational epitopes are non-contiguous antigenic determinants.
Properties of Amino Acids: predictors for Epitopes
Sequential epitope prediction methods
• Theoretical methods are based on properties of amino acids and their propensity scales.
– Hopp & Woods, 1981
– Parker et al., 1986
– Kolaskar & Tongaonkar, 1990
• The accuracy of prediction: 50-75%.
Conformational epitope prediction method
– Kolaskar & Kulkarni-Kale, 1999
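Propensity-scale methods of this kind share one mechanism: average a per-residue value over a sliding window and report the highest-scoring stretches. The sketch below illustrates that mechanism with the Hopp & Woods hydrophilicity values as they are commonly tabulated (they should be checked against the original paper before serious use) and a window of six residues; the function names and the test sequence are illustrative.

```python
# Hopp & Woods hydrophilicity values as commonly tabulated (verify before use).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def window_profile(sequence, scale=HOPP_WOODS, window=6):
    """Average propensity of every window of `window` consecutive residues."""
    return [
        sum(scale[aa] for aa in sequence[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


def top_window(sequence, scale=HOPP_WOODS, window=6):
    """Return (0-based start, peptide, score) of the highest-scoring window."""
    profile = window_profile(sequence, scale, window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Made-up sequence: a charged stretch flanked by leucines.
start, peptide, score = top_window("LLLLLLKDEKRDLLLLLL")
print(start, peptide, score)
```

Swapping in a different scale (for example an antigenic propensity scale) changes the prediction without changing the procedure.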
Identified antigens must be checked for strain-varying polymorphisms; these polymorphisms must be represented in an anti-blood-stage vaccine.
[Figure: candidate protein X, its variants in strains A, B, C and D, and the protective epitope]
Peptide vaccines to be launched in near future
• Foot & Mouth Disease Virus (FMDV)
• Human Immuno Deficiency Virus (HIV)
• Metastatic Breast Cancer
• Pancreatic Cancer
• Melanoma
• Malaria
• T. solium cysticercosis
Reverse Vaccinology
• Advantages
– Fast access to virtually every antigen
– Non-cultivable organisms can be approached
– Non-abundant antigens can be identified
– Antigens not expressed in vitro can be identified
– Non-structural proteins can be used
• Disadvantages
– Non-protein antigens such as polysaccharides and glycolipids cannot be used.
Global genomic approach to identify new vaccine candidates
• Genome sequence
• Proteomics technologies; in silico analysis; IVET, STM, DNA microarrays
• High-throughput cloning and expression
• In vitro and in vivo assays for vaccine candidate identification
In Silico Analysis
Gene/Protein Sequence Database
Disease related protein DB
Candidate Epitope DB
VACCINOME
Peptide multitope vaccines
Epitope prediction
Synthetic Peptide Vaccine
Design and Development of Synthetic Peptide vaccine
against Japanese encephalitis virus
Egp of JEV as an Antigen
Is a major structural antigen.
Responsible for viral haemagglutination.
Elicits neutralising antibodies.
~ 500 amino acids long.
Structure of the extracellular domain (399 residues) was predicted using a knowledge-based homology modeling approach.
Model Refinement
PARAMETERS USED
• Force field: AMBER all atom
• Dielectric constant: distance dependent
• Optimisation: Steepest Descents & Conjugate Gradients
• rms derivative 0.1 kcal/mol/Å for SD
• rms derivative 0.001 kcal/mol/Å for CG
• Software: InsightII from Biosym/MSI and modules therein
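The two-stage protocol (steepest descents down to a coarse rms gradient, then conjugate gradients down to a tight one) can be illustrated on a toy function. This is a sketch only: the two-variable "energy" below is a made-up quadratic, not the AMBER force field, the line search is a simple backtracking rule, and the conjugate-gradient variant (Polak-Ribière with restarts) is an assumption, since the slides do not say which variant was used. Only the two thresholds, 0.1 and 0.001, come from the slide.

```python
import math


def rms(vector):
    """Root-mean-square of the gradient components."""
    return math.sqrt(sum(c * c for c in vector) / len(vector))


def backtrack(f, x, direction, grad, step=1.0, shrink=0.5, c=1e-4):
    """Shrink the step until the energy decreases sufficiently (Armijo rule)."""
    fx = f(x)
    slope = sum(g * d for g, d in zip(grad, direction))
    while step > 1e-12:
        trial = [xi + step * di for xi, di in zip(x, direction)]
        if f(trial) <= fx + c * step * slope:
            return trial
        step *= shrink
    return x


def steepest_descent(f, grad_f, x, tol, max_iter=10000):
    for _ in range(max_iter):
        g = grad_f(x)
        if rms(g) < tol:
            break
        x = backtrack(f, x, [-gi for gi in g], g)
    return x


def conjugate_gradient(f, grad_f, x, tol, max_iter=10000):
    g = grad_f(x)
    d = [-gi for gi in g]
    for _ in range(max_iter):
        if rms(g) < tol:
            break
        x_new = backtrack(f, x, d, g)
        g_new = grad_f(x_new)
        # Polak-Ribiere coefficient, clipped at zero (automatic restart).
        beta = max(0.0, sum(a * (a - b) for a, b in zip(g_new, g))
                   / sum(b * b for b in g))
        d = [-a + beta * di for a, di in zip(g_new, d)]
        if sum(a * di for a, di in zip(g_new, d)) >= 0:
            d = [-a for a in g_new]  # fall back to steepest descent direction
        x, g = x_new, g_new
    return x


def energy(p):
    """Toy energy with its minimum at (1, -2); purely illustrative."""
    x, y = p
    return (x - 1) ** 2 + 10 * (y + 2) ** 2 + (x - y - 3) ** 2


def gradient(p):
    x, y = p
    return [2 * (x - 1) + 2 * (x - y - 3), 20 * (y + 2) - 2 * (x - y - 3)]


coarse = steepest_descent(energy, gradient, [0.0, 0.0], tol=0.1)   # SD stage
refined = conjugate_gradient(energy, gradient, coarse, tol=0.001)  # CG stage
print(refined)
```

Running the robust but slow method first and the faster-converging one second is the usual reason for pairing the two, which matches the looser threshold given for SD and the tighter one for CG.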
Model For Solvated Protein
Egp of JEV molecule was soaked in a 10 Å layer of water.
4867 water molecules were added.
The system size was increased to 20,648 atoms from 6047.
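The atom counts quoted above are consistent with three atoms per added water molecule, as a quick check shows (the three-atom water model is an assumption, though it is what the numbers imply):

```python
PROTEIN_ATOMS = 6047      # Egp model before solvation (from the slide)
WATER_MOLECULES = 4867    # waters added in the 10 Å layer (from the slide)
ATOMS_PER_WATER = 3       # assumed: one oxygen and two hydrogens per water

water_atoms = WATER_MOLECULES * ATOMS_PER_WATER
solvated_atoms = PROTEIN_ATOMS + water_atoms
print(water_atoms, solvated_atoms)
```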
An Algorithm to Identify Conformational Epitopes
Calculate the percent accessible surface area (ASA) of the amino acid residues.
If ASA ≥ 30%, the residue was termed an accessible residue.
A contiguous stretch of more than three accessible residues was termed an antigenic determinant.
A determinant is extended towards the N- and C-termini only if accessible amino acid(s) are present after an inaccessible amino acid residue.
A list of sequential antigenic determinants was prepared.
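The steps above can be sketched as follows. This is one reading of the slide, not the published implementation: the threshold is taken as ASA of at least 30%, "more than three" as a run of four or more residues, and the extension rule as bridging a single inaccessible residue when accessible residues follow it. The ASA values in the example are made up.

```python
def find_determinants(percent_asa, threshold=30.0, min_run=4):
    """Return (start, end) index pairs, 0-based inclusive, of antigenic determinants."""
    n = len(percent_asa)
    accessible = [value >= threshold for value in percent_asa]

    # Step 1: contiguous runs of accessible residues long enough to seed a determinant.
    seeds = []
    i = 0
    while i < n:
        if accessible[i]:
            j = i
            while j + 1 < n and accessible[j + 1]:
                j += 1
            if j - i + 1 >= min_run:
                seeds.append((i, j))
            i = j + 1
        else:
            i += 1

    # Step 2: extend each seed across single inaccessible residues in both directions.
    extended = []
    for start, end in seeds:
        while end + 2 < n and not accessible[end + 1] and accessible[end + 2]:
            end += 2
            while end + 1 < n and accessible[end + 1]:
                end += 1
        while start - 2 >= 0 and not accessible[start - 1] and accessible[start - 2]:
            start -= 2
            while start - 1 >= 0 and accessible[start - 1]:
                start -= 1
        extended.append((start, end))

    # Step 3: merge determinants that grew into each other.
    merged = []
    for start, end in sorted(extended):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up percent ASA values for an 11-residue stretch.
example_asa = [10, 40, 50, 60, 35, 5, 45, 10, 10, 50, 20]
print(find_determinants(example_asa))
```

In the example, residues 1 to 4 form the seed; the buried residue at index 5 is bridged because index 6 is accessible, and the extension stops at the two consecutive buried residues that follow.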
Peptide Modeling
• Initial random conformation
• Force field: Amber
• Distance-dependent dielectric constant 4rij
• Geometry optimization: Steepest descents & Conjugate gradients
• Molecular dynamics at 400 K for 1 ns
Peptides are:
SENHGNYSAQVGASQ
NHGNYSAQVGASQ
YSAQVGASQ
YSAQVGASQAAKFT
NHGNYSAQVGASQAAKFT
SENHGNYSAQVGASQAAKFT (residues 149-168)
Lowest-energy allowed conformations were obtained using multiple MD simulations:
– Initial conformation: random, allowed
– Amber force field with distance dependent dielectric constant of 4*rij
– Geometry optimization using Steepest descents & Conjugate gradient
– 10 cycles of molecular dynamics at 400 K; each of 1ns duration, with an equilibration for 500 ps
– Conformations captured at 10ps intervals, followed by energy minimization of each
– Analysis of resulting conformations to identify the lowest energy, geometrically and stereochemically allowed conformations
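A distance-dependent dielectric of 4·rij replaces explicit solvent screening by letting the dielectric grow with separation, so the Coulomb term falls off as 1/r² rather than 1/r. The sketch below shows that arithmetic only; the charges and distance are hypothetical, and 332.0637 is the usual conversion factor giving kcal/mol for charges in electron units and distances in Å.

```python
COULOMB_KCAL = 332.0637  # kcal*Å/(mol*e^2)


def coulomb_energy(q_i, q_j, r_ij, dielectric_factor=4.0):
    """Electrostatic energy (kcal/mol) with dielectric = dielectric_factor * r_ij."""
    epsilon = dielectric_factor * r_ij
    return COULOMB_KCAL * q_i * q_j / (epsilon * r_ij)


def coulomb_energy_vacuum(q_i, q_j, r_ij):
    """Same pair with a constant dielectric of 1, for comparison."""
    return COULOMB_KCAL * q_i * q_j / r_ij


# Hypothetical unit charges of opposite sign, 4 Å apart.
screened = coulomb_energy(+1.0, -1.0, 4.0)
unscreened = coulomb_energy_vacuum(+1.0, -1.0, 4.0)
print(round(screened, 3), round(unscreened, 3))
```

At 4 Å the screened interaction is sixteen times weaker than in vacuum, which keeps charged side chains from collapsing onto each other in a simulation without explicit water.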
Prediction of conformations of the antigenic peptides
MD simulations of the following peptides were carried out:
B Cell Epitopes:
SENHGNYSAQVGASQ
NHGNYSAQVGASQ
YSAQVGASQ
YSAQVGASQAAKFT
NHGNYSAQVGASQAAKFT
SENHGNYSAQVGASQAAKFT (residues 149-168)
T-helper Cell Epitope:
SIGKAVHQVF (residues 436-445)
Chimeric B+Th Cell Epitope With Spacer:
SENHGNYSAQVGASQAAKFTSIGKAVHQVF
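All of the B-cell peptides are nested fragments of the 20-residue stretch numbered 149 to 168, so their residue ranges can be recovered by substring search. A small sketch, with an illustrative function name:

```python
PARENT = "SENHGNYSAQVGASQAAKFT"  # residues 149-168 of Egp, as labelled on the slide
PARENT_START = 149


def locate(peptide, parent=PARENT, parent_start=PARENT_START):
    """Return (first, last) residue numbers of `peptide` within the parent stretch."""
    index = parent.find(peptide)
    if index < 0:
        raise ValueError(f"{peptide} is not a fragment of the parent peptide")
    first = parent_start + index
    return first, first + len(peptide) - 1


b_cell_peptides = [
    "SENHGNYSAQVGASQ",
    "NHGNYSAQVGASQ",
    "YSAQVGASQ",
    "YSAQVGASQAAKFT",
    "NHGNYSAQVGASQAAKFT",
    "SENHGNYSAQVGASQAAKFT",
]
for peptide in b_cell_peptides:
    print(peptide, locate(peptide))
```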
Structural comparison of Egps of Nakayama and Sri Lanka strains of JEV.
Single amino acid differences are highlighted.
[Figure: six panels, each plotting A650 (y-axis, 0.0 to 1.6) against x-axis values 1 to 19]
Ts18 epitope mapping
13-mer window, skipping 3 amino acids
Ts18 MHC I and MHC II consensus profile
[Figure: consensus profile; y-axis 0 to 45, x-axis positions 1 to 73 in steps of 4]
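The epitope-mapping scan can be sketched as a sliding window over the protein sequence. "Skipping 3 amino acids" is read here as advancing the window start by 4 residues, which matches axis positions 1, 5, 9 and so on; that reading is an assumption, and the step is left as a parameter. The Ts18 sequence is not given in the slides, so the example uses a made-up string.

```python
def peptide_windows(sequence, size=13, step=4):
    """Return (1-based start position, peptide) for each window of `size` residues."""
    return [
        (start + 1, sequence[start:start + size])
        for start in range(0, len(sequence) - size + 1, step)
    ]


# Placeholder 21-residue sequence, not the real Ts18 sequence.
toy_sequence = "ACDEFGHIKLMNPQRSTVWYA"
for position, peptide in peptide_windows(toy_sequence):
    print(position, peptide)
```

Each 13-mer would then be scored against MHC I and MHC II predictors, and the per-position results combined into a consensus profile.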