Project report-on-bio-informatics

Date post: 10-May-2015
Upload: daniela-rotariu
Transcript
Page 1: Project report-on-bio-informatics

Bioinformatics: A Brief Overview

Page 2: Project report-on-bio-informatics

What is bioinformatics?

Application of information technology to the storage, management and analysis of biological information

Facilitated by the use of computers

Page 3: Project report-on-bio-informatics

Publicly available genomes (April 1998)

COMPLETE/PUBLIC
– Aquifex aeolicus
– Pyrococcus horikoshii
– Bacillus subtilis
– Treponema pallidum
– Borrelia burgdorferi
– Helicobacter pylori
– Escherichia coli
– Mycoplasma pneumoniae
– Saccharomyces cerevisiae
– Mycoplasma genitalium
– Haemophilus influenzae

COMPLETE/PENDING PUBLICATION
– Rickettsia prowazekii
– Pseudomonas aeruginosa
– Pyrococcus abyssi
– Bacillus sp. C-125
– Ureaplasma urealyticum
– Pyrobaculum aerophilum

ALMOST/PUBLIC
– Pyrococcus furiosus
– Mycobacterium tuberculosis H37Rv
– Mycobacterium tuberculosis CSU93
– Neisseria gonorrhoeae
– Neisseria meningitidis
– Streptococcus pyogenes

Page 4: Project report-on-bio-informatics

Promises of genomics and bioinformatics

Medicine
– Knowledge of protein structure facilitates drug design
– Understanding of genomic variation allows the tailoring of medical treatment to the individual’s genetic make-up
– Genome analysis allows the targeting of genetic diseases
– The effect of a disease or of a therapeutic on RNA and protein levels can be elucidated

The same techniques can be applied to biotechnology, crop and livestock improvement, etc.

Page 5: Project report-on-bio-informatics

The need for bioinformaticians

The number of entries in databases of gene sequences is increasing exponentially. Bioinformaticians are needed to understand and use this information.

[Figure: GenBank growth, 1982 to 1999. Two series: residues (axis from 0 to 3×10^9) and records (axis from 0 to 4×10^6).]

Page 6: Project report-on-bio-informatics

What can be done using bioinformatics?

Sequence analysis
– Geneticists/molecular biologists analyse genome sequence information to understand disease processes

Molecular modeling
– Crystallographers/biochemists design drugs using computer-aided tools

Phylogeny/evolution
– Geneticists obtain information about the evolution of organisms by looking for similarities in gene sequences

Ecology and population studies
– Bioinformatics is used to handle large amounts of data obtained in population studies

Medical informatics
– Personalised medicine

Page 7: Project report-on-bio-informatics

NCBI (National Center for Biotechnology Information)

www.ncbi.nlm.nih.gov

[Diagram labels: Entrez; SRS; DNA: EMBL, DDBJ, GenBank; Protein: PIR, Swiss-Prot; PDB; Genome; PubMed; Medline; Annotation]

Page 8: Project report-on-bio-informatics

What can be discovered about a gene by a database search?

A little or a lot, depending on the gene
– Evolutionary information: homologous genes, taxonomic distributions, allele frequencies, synteny, etc.
– Genomic information: chromosomal location, introns, UTRs, regulatory regions, shared domains, etc.
– Structural information: associated protein structures, fold types, structural domains
– Expression information: expression specific to particular tissues, developmental stages, phenotypes, diseases, etc.
– Functional information: enzymatic/molecular function, pathway/cellular role, localization, role in diseases

Page 9: Project report-on-bio-informatics

Databases

Three types of databases
– Primary: sequence database
– Secondary: annotation
– Tertiary: structure database

Two other types
– DNA databases: GenBank, DDBJ, EMBL
– Protein databases: PIR, Swiss-Prot, MIPS

Page 10: Project report-on-bio-informatics

Biological databanks and databases

Very fast growth of biological data

Diversity of biological data:
– primary sequences
– 3D structures
– functional data

Database entry usually required for publication
– Sequences
– Structures

Database entry may replace primary publication
– genomic approaches

Page 11: Project report-on-bio-informatics

PubMed

Page 12: Project report-on-bio-informatics
Page 13: Project report-on-bio-informatics

Sequence analysis: overview

[Flowchart; box labels listed by panel]

Sequence entry
– Manual sequence entry
– Sequence database browsing
– Sequencing project management

Nucleotide sequence analysis
– Nucleotide sequence file
– Search databases for similar sequences
– Sequence comparison
– Multiple sequence analysis
– Design further experiments: restriction mapping, PCR planning
– Search for protein coding regions (branches: coding, non-coding)
– Translate into protein
– RNA structure prediction
– Search for known motifs

Protein sequence analysis
– Protein sequence file
– Search databases for similar sequences
– Sequence comparison
– Search for known motifs
– Predict secondary structure
– Predict tertiary structure
– Create a multiple sequence alignment
– Edit the alignment
– Format the alignment for publication
– Molecular phylogeny
– Protein family analysis
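
The "Search for protein coding regions" and "Translate into protein" steps of the flowchart can be illustrated with a minimal Python sketch. It assumes the standard genetic code, scans only the forward strand, and treats any ATG-to-stop stretch as a candidate; the function names `translate` and `find_orfs` are hypothetical and do not come from the slides or from any particular package.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def find_orfs(dna, min_codons=2):
    """Return (start, end, protein) for ATG...stop stretches on the forward strand.

    start and end are 0-based, end-exclusive offsets; end includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                protein = translate(dna[i:])
                stop = protein.find("*")
                if stop >= min_codons:
                    orfs.append((i, i + 3 * (stop + 1), protein[:stop]))
                    i += 3 * (stop + 1)  # continue after this ORF
                    continue
            i += 3
    return orfs


print(translate("ATGGCCTAA"))
print(find_orfs("CCATGGCCTAAGG"))
```

Real gene finders also scan the reverse strand and model splice sites and codon usage; this sketch only shows the basic idea.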

Page 14: Project report-on-bio-informatics

Sequence comparison

Pairwise sequence alignment
– BLAST: BlastP, BlastN, nBlastP

Multiple sequence alignment
– ClustalW, ClustalX

User interface
– BioEdit
– Biology Workbench
– CLC Workbench

Page 15: Project report-on-bio-informatics

Click on:

Page 16: Project report-on-bio-informatics

Database Search

Page 17: Project report-on-bio-informatics
Page 18: Project report-on-bio-informatics

Multiple Sequence Alignment: Approaches

Optimal global alignments: dynamic programming
– Generalization of Needleman-Wunsch
– Find alignment that maximizes a score function
– Computationally expensive: time grows as the product of the sequence lengths

Global progressive alignments: match closely related sequences first using a guide tree

Global iterative alignments: multiple re-building attempts to find the best alignment

Local alignments
– Profiles, blocks, patterns
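
The dynamic-programming approach is easiest to see in the pairwise case that the multiple-alignment method generalizes. The following is a minimal Needleman-Wunsch sketch in Python; the scoring values (match +1, mismatch -1, linear gap -2) are illustrative assumptions, not values taken from the slides. The table has (len(a)+1) × (len(b)+1) cells, which is where the "product of the sequence lengths" cost comes from; with k sequences the table gains one dimension per sequence.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of diagonal, up, and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

Progressive tools such as ClustalW avoid the full multi-dimensional table by aligning pairs and profiles along a guide tree.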

Page 19: Project report-on-bio-informatics

CLUSTALW MSA

Page 20: Project report-on-bio-informatics

Phylogeny inference: Analysis of sequences allows evolutionary relationships to be determined

[Figure: phylogenetic tree constructed using the Phylip package. Taxa: E. coli, C. botulinum, C. cadaveris, C. butyricum, B. subtilis, B. cereus]
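
Distance-based tree building starts from a matrix of pairwise distances between aligned sequences. The sketch below computes the simplest such measure, the proportion of differing sites (p-distance), in Python. It is not how the Phylip package is implemented; the function names and the toy sequences `s1` to `s3` are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def distance_matrix(aligned):
    """Map each unordered pair of sequence names to its p-distance."""
    return {(n1, n2): p_distance(aligned[n1], aligned[n2])
            for n1, n2 in combinations(sorted(aligned), 2)}


# Toy aligned sequences (made up for illustration only).
toy = {"s1": "ACGT", "s2": "ACGA", "s3": "TCGA"}
for pair, d in distance_matrix(toy).items():
    print(pair, d)
```

A clustering method such as neighbor joining would then turn a matrix like this into a tree.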

Page 21: Project report-on-bio-informatics

Gene prediction software

Similarity-based or comparative
– BLAST
– SGP2 (extension of GeneID)

Ab initio = “from the beginning”
– GeneID
– GENSCAN
– GeneMark

Combined “evidence-based”
– GeneSeqer (Brendel et al., ISU)

Best: GENSCAN, GeneMark.hmm, GeneSeqer, but depends on organism and specific task

Page 22: Project report-on-bio-informatics

PCR Primer Design: Oligonucleotides for use in the polymerase chain reaction can be designed using computer-based programs

OPTIMAL primer length --> 20
MINIMUM primer length --> 18
MAXIMUM primer length --> 22
OPTIMAL primer melting temperature --> 60.000
MINIMUM acceptable melting temp --> 57.000
MAXIMUM acceptable melting temp --> 63.000
MINIMUM acceptable primer GC% --> 20.000
MAXIMUM acceptable primer GC% --> 80.000
Salt concentration (mM) --> 50.000
DNA concentration (nM) --> 50.000
MAX no. unknown bases (Ns) allowed --> 0
MAX acceptable self-complementarity --> 12
MAXIMUM 3' end self-complementarity --> 8
GC clamp how many 3' bases --> 0
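
As a rough illustration of how such parameters are applied, the Python sketch below checks a candidate primer against the length, melting temperature, GC% and unknown-base limits listed above. It makes one loud assumption: melting temperature is estimated with the simple Wallace rule, Tm = 2(A+T) + 4(G+C), which ignores the salt and DNA concentrations that the program behind the slide evidently takes into account. Function names are hypothetical, and the self-complementarity checks are not covered.

```python
def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in deg C by the Wallace rule (an assumption here)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def check_primer(primer, min_len=18, max_len=22, min_tm=57.0, max_tm=63.0,
                 min_gc=20.0, max_gc=80.0, max_n=0):
    """Return a list of problems; an empty list means all checks passed."""
    primer = primer.upper()
    if not primer:
        return ["empty primer"]
    problems = []
    if not min_len <= len(primer) <= max_len:
        problems.append(f"length {len(primer)} outside {min_len}-{max_len}")
    tm = wallace_tm(primer)
    if not min_tm <= tm <= max_tm:
        problems.append(f"Tm {tm} outside {min_tm}-{max_tm}")
    gc = gc_percent(primer)
    if not min_gc <= gc <= max_gc:
        problems.append(f"GC% {gc:.1f} outside {min_gc}-{max_gc}")
    if primer.count("N") > max_n:
        problems.append("too many unknown bases (N)")
    return problems


print(check_primer("ATGCATGCATGCATGCATGC"))
print(check_primer("ATGC"))
```

The default arguments mirror the minimum and maximum values from the parameter list; the candidate sequences are made up.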

Page 23: Project report-on-bio-informatics

Restriction mapping: Genes can be analysed to detect gene sequences that can be cleaved with restriction enzymes

Enzyme     Sites  Recognition sequence
AceIII     1      CAGCTCnnnnnnn’nnn...
AluI       2      AG’CT
AlwI       1      GGATCnnnn’n_
ApoI       2      r’AATT_y
BanII      1      G_rGCy’C
BfaI       2      C’TA_G
BfiI       1      ACTGGG
BsaXI      1      ACnnnnnCTCC
BsgI       1      GTGCAGnnnnnnnnnnn...
BsiHKAI    1      G_wGCw’C
Bsp1286I   1      G_dGCh’C
BsrI       2      ACTG_Gn’
BsrFI      1      r’CCGG_y
CjeI       2      CCAnnnnnnGTnnnnnn...
CviJI      4      rG’Cy
CviRI      1      TG’CA
DdeI       2      C’TnA_G
DpnI       2      GA’TC
EcoRI      1      G’AATT_C
HinfI      2      G’AnT_C
MaeIII     1      ’GTnAC_
MnlI       1      CCTCnnnnnn_n’
MseI       2      T’TA_A
MspI       1      C’CG_G
NdeI       1      CA’TA_TG
Sau3AI     2      ’GATC_
SstI       1      G_AGCT’C
TfiI       2      G’AwT_C
Tsp45I     1      ’GTsAC_
Tsp509I    3      ’AATT_
TspRI      1      CAGTGnn’

[Figure: restriction map; sequence position axis marked at 50, 100, 150, 200, 250]
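
A site search of this kind can be sketched in a few lines of Python. The sketch assumes that the ’ and _ characters in the recognition sequences mark cut positions and that lowercase letters are IUPAC ambiguity codes (r = A/G, y = C/T, w = A/T, s = C/G, d = A/G/T, h = A/C/T, n = any base). It searches the forward strand only, which is sufficient for palindromic sites such as EcoRI; non-palindromic sites would also need a reverse-complement search. The function names and the test sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes used in the enzyme list, as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
         "D": "[AGT]", "H": "[ACT]", "N": "[ACGT]"}


def site_to_regex(site):
    """Drop the cut markers (' or ’ and _) and expand ambiguity codes."""
    cleaned = site.replace("'", "").replace("’", "").replace("_", "").upper()
    return "".join(IUPAC[base] for base in cleaned)


def find_sites(dna, site):
    """Return 1-based start positions of every (possibly overlapping) match."""
    # A lookahead lets the regex engine report overlapping occurrences.
    pattern = re.compile("(?=(" + site_to_regex(site) + "))")
    return [m.start() + 1 for m in pattern.finditer(dna.upper())]


dna = "GGGAATTCCCAGCTAATTGG"  # made-up sequence for illustration
for name, site in [("EcoRI", "G’AATT_C"), ("AluI", "AG’CT"),
                   ("Tsp509I", "’AATT_"), ("ApoI", "r’AATT_y")]:
    print(name, find_sites(dna, site))
```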

Page 24: Project report-on-bio-informatics

RNA structure prediction: Structural features of RNA can be predicted

[Figure: predicted RNA secondary structure drawn as a stem-loop diagram; the nucleotide labels are not recoverable as a sequence]

Page 25: Project report-on-bio-informatics

Protein structure: the 3-D structure of proteins is used to understand protein function and design new drugs

Page 26: Project report-on-bio-informatics

Gene sequencing: Automated chemical sequencing methods allow rapid generation of large data banks of gene sequences

Page 27: Project report-on-bio-informatics

Structural Bioinformatics

Page 28: Project report-on-bio-informatics

Structural Bioinformatics

Prediction of structure from sequence
– secondary structure
– homology modelling, threading
– ab initio 3D prediction

Analysis of 3D structure
– structure comparison/alignment
– prediction of function from structure
– molecular mechanics/molecular dynamics
– prediction of molecular interactions, docking

Structure databases (RCSB)

Page 29: Project report-on-bio-informatics
Page 30: Project report-on-bio-informatics

Bioinformatics key areas

organisation of knowledge (sequences, structures, functional data)

e.g. homology searches

Page 31: Project report-on-bio-informatics

Molecular modeling

Homology model

Comparative modeling

– Modeller
– Swiss-PdbViewer
– GenTHREADER
– MOLMOD

Page 32: Project report-on-bio-informatics

Molecular visualization

– RasMol
– Cn3D
– Jmol
– PyMOL

Page 33: Project report-on-bio-informatics

SECONDARY STRUCTURE PREDICTION

Jpred, GOR, SOPMA

Page 34: Project report-on-bio-informatics

Tertiary structure prediction

CPHmodels

Page 35: Project report-on-bio-informatics

Active Site Prediction

