Genomics_bioinformatics2

Date post: 30-May-2018
Upload: brijesh-singh-yadav
Transcript
  • 8/14/2019 Genomics_bioinformatics2

    1/26

    06/10/09 URC, Allahabad 1

    Genomics and Bioinformatics

    The "new" biology

    Brijesh Singh Yadav

    Bioinformatics Research Cell, United Research Center

    Allahabad, India.

  • 8/14/2019 Genomics_bioinformatics2

    4/26

    Genome sequencing chronology

    Year  Organism                   Significance                Genome size (bp)  Number of genes
    1977  Bacteriophage φX174        First genome ever!          5,386             11
    1981  Human mitochondria         First organelle             16,500            37
    1995  Haemophilus influenzae Rd  First free-living organism  1,830,137         ~3,500
    1996  Saccharomyces cerevisiae   First eukaryote             12,086,000        ~6,000

    http://www.ncbi.nlm.nih.gov/ICTVdb/Images/Ackerman/Phages/Microvir/238-27_1.jpg
  • 8/14/2019 Genomics_bioinformatics2

    5/26

    Genome sequencing chronology

    Year  Organism                Significance                  Genome size (bp)  Number of genes
    1998  Caenorhabditis elegans  First multi-cellular organism 97,000,000        ~19,000
    1999  Human chromosome 22     First human chromosome        49,000,000        673
    2000  Arabidopsis thaliana    First plant genome            150,000,000       ~25,000
    2001  Human                   First human genome            3,000,000,000     ~30,000

    http://shiulab.plantbiology.msu.edu/wiki/images/8/8a/DSCN3107.jpg
  • 8/14/2019 Genomics_bioinformatics2

    6/26

    Genome sequencing projects (as of 1/26/2007)

  • 8/14/2019 Genomics_bioinformatics2

    8/26

    Genome sequencing helps in:

    Identifying new genes (gene discovery)

    Looking at chromosome organization and structure

    Finding gene regulatory sequences

    Comparative genomics

    These in turn lead to advances in:

    Medicine

    Agriculture

    Biotechnology

    Understanding evolution and other basic science questions

  • 8/14/2019 Genomics_bioinformatics2

    9/26

    Information contents in a genome

    Gene

    Protein coding genes

    RNA genes

    Regulatory elements

    Gene expression control

    Chromatin remodeling

    Matrix attachment sites

    Non-functional elements

    Selfish elements

    Junk DNA

    ??

  • 8/14/2019 Genomics_bioinformatics2

    10/26

    The central dogma of molecular biology

    Central dogma: DNA (copied by replication) → transcription → RNA → translation → protein

    http://upload.wikimedia.org/wikipedia/commons/7/74/Rubisco.png
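
The transcription and translation steps of the central dogma can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: it assumes the input is the coding (sense) strand read 5' to 3', uses the standard genetic code, and ignores splicing, alternative start codons, and reading-frame detection. The function names `transcribe` and `translate` are chosen here for illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each RNA codon (U instead of T) to its one-letter amino acid; '*' marks a stop.
CODON_TABLE = {
    "".join(codon).replace("T", "U"): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate an mRNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGAAATGGTGA")
print(mrna, translate(mrna))
```

Here the made-up 12-base input encodes the codons ATG, AAA, TGG followed by a stop codon, so the sketch yields a three-residue peptide.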
  • 8/14/2019 Genomics_bioinformatics2

    11/26

    Expanded central dogma of molecular biology

    A more comprehensive view: DNA (copied by replication) → transcription → RNA → translation → protein → metabolite → phenotype

    http://upload.wikimedia.org/wikipedia/commons/7/74/Rubisco.png
  • 8/14/2019 Genomics_bioinformatics2

    12/26

    New disciplines due to the advance in genomics

    Omics

    Levels: DNA (copied by replication) → transcription → RNA → translation → protein → metabolite → phenotype

    Disciplines: structural genomics, transcriptomics, proteomics, metabolomics

    Data types:

    Genomic DNA sequences

    Transcript sequences, microarray data, cis-elements, TF binding sites, epigenetic regulation

    Shotgun protein sequencing, subcellular location, post-translational modification, protein interaction, protein structure

    Metabolite concentration, metabolic flux

    Genetic interactions, systematic knockouts (KO), disease information

  • 8/14/2019 Genomics_bioinformatics2

    14/26

    Nature omics gateway

  • 8/14/2019 Genomics_bioinformatics2

    15/26

    Three perspectives of our biological world

    The cellular level, the individual, the tree of life

    ~10^14 cells per individual; 2-100 x 10^6 species; ~3 x 10^4 genes

    http://www.olympusfluoview.com/gallery/cells/hela/helacells.html
    http://www.tolweb.org/tree/home.pages/aboutoverview.html
  • 8/14/2019 Genomics_bioinformatics2

    16/26

    Further complications

    Cell-cell interactions

    Cell types

    Environmental conditions

    Developmental programming

    Interactions at the organismal level

    Interactions at the population and ecosystem level

  • 8/14/2019 Genomics_bioinformatics2

    17/26

    Impact of Genomics on Medicine

    How do we characterize new diseases?

    What new treatments can be discovered?

    How do we treat individual patients? Can treatments be tailored?

  • 8/14/2019 Genomics_bioinformatics2

    18/26

    Bioinformatics

    Conceptualizing biology in terms of molecules and then applying informatics techniques from math, computer science, and statistics to understand and organize the information associated with these molecules on a large scale

  • 8/14/2019 Genomics_bioinformatics2

    19/26

    How do we use Bioinformatics?

    Store/retrieve biological information (databases)

    Retrieve/compare gene sequences

    Predict function of unknown genes/proteins

    Search for previously known functions of a gene

    Compare data with other researchers

    Compile/distribute data for other researchers

  • 8/14/2019 Genomics_bioinformatics2

    20/26

    Example: Sequence alignment

    Align retinol-binding protein (RBP) and β-lactoglobulin

    1 MKWVWALLLLAAWAAAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEG 50 RBP

    . ||| | . |. . . | : .||||.:| :

    1 ...MKCLLLALALTCGAQALIVT..QTMKGLDIQKVAGTWYSLAMAASD. 44 lactoglobulin

    51 LFLQDNIVAEFSVDETGQMSATAKGRVR.LLNNWD..VCADMVGTFTDTE 97 RBP

    : | | | | :: | .| . || |: || |.

    45 ISLLDAQSAPLRV.YVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTK 93 lactoglobulin

    98 DPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAV...........QYSC 136 RBP

    || ||. | :.|||| | . .|

    94 IPAVFKIDALNENKVL........VLDTDYKKYLLFCMENSAEPEQSLAC 135 lactoglobulin

    137 RLLNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQ.EELCLARQYRLIV 185 RBP

    . | | | : || . | || |

    136 QCLVRTPEVDDEALEKFDKALKALPMHIRLSFNPTQLEEQCHI....... 178 lactoglobulin

    >RBP

    MKWVWALLLLAAWAAAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVAEFSVDETGQMSATAKGRVRL

    LNNWDVCADMVGTFTDTEDPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAVQYSCRLLNLDGTCADSYSFVFSRDPN

    GLPPEAQKIVRQRQEELCLARQYRLIV

    >lactoglobulin

    MKCLLLALALTCGAQALIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPLRVYVEELKPTPEGDLEILLQKWEN

    GECAQKKIIAEKTKIPAVFKIDALNENKVLVLDTDYKKYLLFCMENSAEPEQSLACQCLVRTPEVDDEALEKFDKALKA

    LPMHIRLSFNPTQLEEQCHI
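
A pairwise global alignment like the one above can be computed with dynamic programming. The sketch below is a minimal Needleman-Wunsch implementation, offered as an illustration rather than the method used to produce the slide: it assumes a flat scoring scheme (match +1, mismatch -1, gap -1), whereas protein alignments in practice normally use a substitution matrix and separate gap-open and gap-extend penalties. The function name and its parameters are chosen here for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Short fragments taken from the RBP and lactoglobulin sequences above.
best, top, bottom = needleman_wunsch("GTWYAMAKKDPEG", "GTWYSLAMAASD")
print(best)
print(top)
print(bottom)
```

With this toy scoring the result will not reproduce the gaps shown on the slide; it only demonstrates how the score matrix is filled and traced back.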

  • 8/14/2019 Genomics_bioinformatics2

    21/26

    Microarray data analysis

    A simplified pipeline

  • 8/14/2019 Genomics_bioinformatics2

    22/26

    Example: Microarray

    A solid support (e.g. a membrane or glass slide) on which DNA of

    known sequence is deposited in a grid-like fashion

  • 8/14/2019 Genomics_bioinformatics2

    23/26

    Example: Identification of cis-elements

    Cis-elements are the on-off switches and rheostats of a cell, operating at the gene level.

    They control whether, and how vigorously, genes are transcribed into RNA.

  • 8/14/2019 Genomics_bioinformatics2

    24/26

    Motif model: Position Frequency Matrix (PFM)

    f_{b,i}: frequency of base b at the i-th position of the motif

    D'haeseleer (2006) Nature Biotech. 24:4
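
A position frequency matrix can be built directly from a set of aligned binding sites of equal length: for each position i, count how often each base b occurs and divide by the number of sites. The sketch below is a minimal illustration under those assumptions; the function name `position_frequency_matrix` and the example sites are hypothetical, and no pseudocounts or background correction are applied.

```python
from collections import Counter


def position_frequency_matrix(sites, alphabet="ACGT"):
    """Return {base: [f_{b,0}, f_{b,1}, ...]} for equal-length aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    pfm = {base: [0.0] * length for base in alphabet}
    for i in range(length):
        # Count the bases observed in column i across all sites.
        counts = Counter(site[i] for site in sites)
        for base in alphabet:
            pfm[base][i] = counts[base] / len(sites)
    return pfm


# Hypothetical aligned sites, used only to exercise the function.
example_sites = ["ACGT", "ACGA", "ATGT", "ACGT"]
for base, freqs in position_frequency_matrix(example_sites).items():
    print(base, freqs)
```

Each column of the resulting matrix sums to 1 as long as every site contains only characters from the alphabet.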

  • 8/14/2019 Genomics_bioinformatics2

    25/26

    Final example: Relationships between sequences

    Sanger and colleagues (1950s): first protein sequence

    Insulin, sequenced from various mammals

  • 8/14/2019 Genomics_bioinformatics2

    26/26

    The END

    ...