Page 1: Michael Schroeder BioTechnological Center TU Dresden Biotec Genome Lesk, Introduction to Bioinformatics, Chapter 2.

Michael Schroeder BioTechnological Center TU Dresden Biotec

Genome

Lesk, Introduction to Bioinformatics, Chapter 2

By Michael Schroeder, Biotec, 2004

Organisms and cells

• All organisms consist of small cells

• The human body has approx. 6 x 10^13 cells of about 320 different types

• Cell size can vary greatly: a human red blood cell measures 5 microns (0.005 mm), while a neuron from the spinal cord can be 1 m long

• Types of organisms:
  Prokaryotes: bacteria, for example
  Eukaryotes: most other organisms
  Archaea: few organisms living in hostile environments

Genomes and Genes: Not all DNA codes for genes

Organism                    Number of bp    Genes     Note
ΦX-174                      5386            10        Virus infecting E. coli
Human mitochondrion         16569           37        Subcellular organelle
Mycoplasma pneumoniae       816394          680       Pneumonia
Haemophilus influenzae      1830138         1738      Middle ear infection
E. coli                     4639221         4406
Saccharomyces cerevisiae    12.1 x 10^6     5885      Yeast
C. elegans                  95.5 x 10^6     19099     Worm
Drosophila melanogaster     1.8 x 10^8      13601     Fruit fly
Human                       3.2 x 10^9      22000?
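One way to read the table is to divide genome size by gene count. The sketch below does this with the values from the table; the names `GENOMES` and `bp_per_gene` are chosen for illustration only, and the human gene count carries the same question mark as in the table.

```python
# Genome size (bp) and gene count, copied from the table above.
# The human gene count is the uncertain "22000?" estimate from the slide.
GENOMES = {
    "PhiX-174": (5386, 10),
    "Human mitochondrion": (16569, 37),
    "Mycoplasma pneumoniae": (816394, 680),
    "Haemophilus influenzae": (1830138, 1738),
    "E. coli": (4639221, 4406),
    "Saccharomyces cerevisiae": (12.1e6, 5885),
    "C. elegans": (95.5e6, 19099),
    "Drosophila melanogaster": (1.8e8, 13601),
    "Human": (3.2e9, 22000),
}


def bp_per_gene(genome_size_bp, gene_count):
    """Average number of base pairs available per gene."""
    return genome_size_bp / gene_count


if __name__ == "__main__":
    for name, (size, genes) in GENOMES.items():
        print(f"{name:26s} {bp_per_gene(size, genes):12.0f} bp per gene")
```

With these numbers the small genomes come out at roughly 500 to 1200 bp per gene, while the human genome has well over 100000 bp per gene, which is consistent with the slide title: much of the DNA in large genomes does not code for genes.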

Genetic information

• Genes as discovered by Mendel: entirely abstract entities

• Chromosomes are physical entities, and their banding patterns are their landmarks

• Chromosomes are numbered by size (1 = largest)

• Human chromosome: p (petit = short) arm and q (queue) arm, e.g. 15q11.1

• DNA sequences = hereditary information in physical form
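A location such as 15q11.1 can be split into its parts mechanically. The following is a minimal sketch, assuming the usual reading of the notation (chromosome, arm, then one digit each for region and band, and an optional sub-band after the dot); the function name `parse_band` and the field names are illustrative and not from the slides.

```python
import re
from typing import Optional

# Assumed layout: chromosome (1-22, X or Y), arm (p or q),
# region digit, band digit, optional ".sub-band".
BAND_PATTERN = re.compile(
    r"^(?P<chromosome>[1-9]|1[0-9]|2[0-2]|X|Y)"
    r"(?P<arm>[pq])"
    r"(?P<region>\d)(?P<band>\d)"
    r"(?:\.(?P<sub_band>\d+))?$"
)


def parse_band(location: str) -> Optional[dict]:
    """Split a band label like '15q11.1' into its parts, or return None."""
    match = BAND_PATTERN.match(location.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # p is the short arm, q the long arm.
    fields["arm_name"] = "short" if fields["arm"] == "p" else "long"
    return fields


if __name__ == "__main__":
    print(parse_band("15q11.1"))
```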

Locating genes

• The disease cystic fibrosis has been known since the Middle Ages; the relevant protein was not

• Folklore: "Children with excessive salt in their sweat - noticeable when kissing them on the forehead - were short-lived"

• Implication: chloride channel in epithelial tissues

• Search in family pedigrees identified various genetic markers (Variable Number Tandem Repeats), which narrowed the genomic region from 1-2 million bp to 300 kb

• Finally, the deletion of Phe508 in the CFTR gene was identified as the cause

Chromosome

Chromosome banding pattern map

2 Types of Maps: Physical Map

Genome sequencing projects supply the DNA sequence of each chromosome

The physical distance is the number of base pairs that separate two genes

…ACTGTATGACTGGCATGGCACTGGGGCAAATGTGCACTC…

[Figure: chromosome with a physical scale in Mbp (tick labels 50, 100, 110, 180 Mbp) and the positions of Gene A and Gene B]

C. Voigt, S. Ibrahim, S. Möller, P. Serrano Fernández. Non-linear map conversions. German Conference on Bioinformatics, 2003

2 Types of Maps: Genetic Map

• Chromosomes are carriers of genetic information

• Genetic information is linked and linearly arranged inside the chromosome

• This linkage is sometimes broken: recombination (crossing-over)

2 Types of Maps: Genetic Maps

• Genes located far from each other are more likely to be uncoupled during a crossing-over

• A Morgan is the genetic distance over which 1 crossing-over is expected to occur (1 Morgan = 100 centiMorgan, cM)

[Figure: chromosome with a genetic scale in cM]
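The slide defines the Morgan through the expected number of crossing-overs. One common way to turn such a distance into the probability that two genes are uncoupled is Haldane's map function. It is not part of the slides and assumes that crossing-overs occur independently of each other. A minimal sketch (the function name is illustrative):

```python
import math


def haldane_recombination_fraction(distance_cm: float) -> float:
    """Recombination fraction for a genetic distance given in centiMorgan.

    Uses Haldane's map function r = 0.5 * (1 - exp(-2d)), d in Morgan,
    which assumes crossing-overs occur independently (no interference).
    """
    if distance_cm < 0:
        raise ValueError("genetic distance must be non-negative")
    distance_morgan = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgan))


if __name__ == "__main__":
    for cm in (1, 10, 50, 100, 300):
        r = haldane_recombination_fraction(cm)
        print(f"{cm:4d} cM -> recombination fraction {r:.4f}")
```

For short distances the recombination fraction is close to the distance itself (1 cM gives about 1%), while for distant genes it levels off towards 0.5, i.e. the genes behave as if unlinked.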

Why 2 Types of Maps?

• Historical background

• Different systems provide us with complementary information (not completely redundant)

• Genetic markers may be mapped in only one system (conversions needed)

• Genetic markers may be ambiguous

Expected Map Conversion

Linear relationship

[Figure: plot relating physical position (bps) to genetic position (cM)]

Observed Map Conversion

Non-linear relationship (Yu A, et al. 2001. Nature, 409:951-3)

• Outliers
• Marker ambiguity
• Local marker density
• Inversions

[Figure: Human chromosome 12; axes bps and cM (panel labels: bps / cM / cR); linear relationship shown for reference]
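The difference between the expected linear conversion and the observed non-linear one can be sketched in a few lines. The example below is only an illustration: the anchor markers are made-up values, the factor of 1 cM per 1 Mbp is a rough rule of thumb rather than a figure from the slides, and piecewise linear interpolation is just one simple way to convert between maps, not necessarily the method of the cited work.

```python
from bisect import bisect_right

# Hypothetical markers mapped in both systems: (position in bp, position in cM).
# These values are invented for illustration and sorted by bp position.
ANCHORS = [
    (0, 0.0),
    (20_000_000, 35.0),
    (60_000_000, 60.0),
    (130_000_000, 170.0),
]


def bp_to_cm_linear(bp, bp_per_cm=1_000_000):
    """Expected conversion: a single constant factor (assumed 1 cM per Mbp)."""
    return bp / bp_per_cm


def bp_to_cm_interpolated(bp, anchors=ANCHORS):
    """Observed conversion: interpolate linearly between neighbouring markers."""
    positions = [a[0] for a in anchors]
    if not positions[0] <= bp <= positions[-1]:
        raise ValueError("position lies outside the mapped region")
    # Index of the marker to the right of bp (clamped to the last marker).
    i = min(bisect_right(positions, bp), len(positions) - 1)
    (x0, y0), (x1, y1) = anchors[i - 1], anchors[i]
    return y0 + (y1 - y0) * (bp - x0) / (x1 - x0)


if __name__ == "__main__":
    for bp in (10_000_000, 40_000_000, 100_000_000):
        print(bp, bp_to_cm_linear(bp), bp_to_cm_interpolated(bp))
```

Because the slope differs between marker pairs, the same physical distance corresponds to different genetic distances in different regions, which is what the non-linear curve expresses.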

General Properties

Gene density and recombination

Recombination is mostly higher in areas with a high gene density.

[Figure: Human chromosome 12; axes bps and cM; regions of high recombination and high gene density marked]

Yao, et al. (2002) Proc Natl Acad Sci 99(9):6157

Tool

http://tp12.pzr.uni-rostock.de/qtl/cartographer.php

How to Detect Genes?

• Detection of regions similar to known coding regions from other organisms
  Gene expressed (in another organism) -> mRNA -> cDNA = EST (Expressed Sequence Tag); search for the start of the EST

• Ab initio: derive the gene from the sequence itself
  Bacteria: easy, as genes are contiguous
  Eukaryotes: problem of alternative splicing
  Initial exon: search for a TATA box ~30 bp upstream; no in-frame stop codon; ends before a GT splice signal
  Internal exon: starts after an AG splice signal; no in-frame stop codons; ends before a GT splice signal
  Final exon: followed by polyadenylation
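The rule for internal exons (AG before, GT after, no in-frame stop codon) can be written down as a naive scan. This is a deliberately simple sketch under stated assumptions: it matches the signals exactly, tries all three reading frames, and uses an arbitrary minimum length; real gene finders rely on statistical models rather than exact signal matching. All names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_in_frame_stop(seq, frame):
    """True if reading seq in the given frame (0, 1 or 2) hits a stop codon."""
    return any(seq[i:i + 3] in STOP_CODONS for i in range(frame, len(seq) - 2, 3))


def internal_exon_candidates(dna, min_len=30):
    """List (start, end, open_frames) for stretches between an AG and a GT.

    start is the first base after an AG signal, end is the position of the
    G of a GT signal (exclusive end), open_frames are the frames without
    an in-frame stop codon. min_len is an arbitrary illustrative cutoff.
    """
    dna = dna.upper()
    acceptors = [i + 2 for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    donors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    candidates = []
    for start in acceptors:
        for end in donors:
            if end - start < min_len:
                continue
            exon = dna[start:end]
            open_frames = [f for f in range(3) if not has_in_frame_stop(exon, f)]
            if open_frames:
                candidates.append((start, end, open_frames))
    return candidates


if __name__ == "__main__":
    # Toy sequence: AG signal, 12 bases, GT signal.
    print(internal_exon_candidates("CCAGATGACCAAACCCGTAA", min_len=12))
```

In the toy sequence the stretch ATGACCAAACCC is reported as a candidate that is open in frames 0 and 2; frame 1 is rejected because it starts with the stop codon TGA.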

Brent, Nat Biotech, 2007

How to detect genes: De novo prediction

• GenScan (late 90s): predicts 10% of ORFs in the human genome; overprediction of 45,000 genes (current estimate: 20,000-21,000)

• TwinScan (early 2000s): uses the alignment between the target and a related genome; ca. 30% of ORFs in the human genome

• N-SCAN: includes pseudogene detection; predicts 20,138 genes

Applications

Genetic diversity and anthropology

• Cheetahs are very closely related to each other, pointing to a population bottleneck 10,000 years ago

• Humans: mitochondrial DNA is passed on through the maternal line, the Y chromosome from father to son

• Variation in mitochondrial DNA in humans suggests a single maternal ancestor 140,000-200,000 years ago

• The population of Iceland (first inhabited 1100 years ago) descended from Scandinavian males and from females from Scandinavia and the British Isles

• Basques are linguistically and genetically isolated

Evolution of Genomes

Phylogenetic profiles

• What genes do different phyla share?

• What homologous proteins do different phyla share?

• What functions do different phyla share?
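A phylogenetic profile can be thought of as a presence/absence pattern across groups of organisms. The sketch below uses a small invented data set (not measured data) to show the idea; the names `FUNCTIONS_BY_DOMAIN`, `phylogenetic_profile` and `shared_by_all` are illustrative.

```python
DOMAINS = ("bacteria", "archaea", "eucarya")

# Toy data for illustration only: which functions are present in which group.
FUNCTIONS_BY_DOMAIN = {
    "bacteria": {"replication", "transcription", "translation", "cell wall synthesis"},
    "archaea": {"replication", "transcription", "translation", "methanogenesis"},
    "eucarya": {"replication", "transcription", "translation", "mRNA splicing"},
}


def phylogenetic_profile(function):
    """Presence (1) or absence (0) of a function in each domain, in DOMAINS order."""
    return tuple(int(function in FUNCTIONS_BY_DOMAIN[d]) for d in DOMAINS)


def shared_by_all():
    """Functions present in every domain."""
    return set.intersection(*FUNCTIONS_BY_DOMAIN.values())


if __name__ == "__main__":
    print(phylogenetic_profile("translation"))     # present everywhere
    print(phylogenetic_profile("methanogenesis"))  # present in one group only
    print(sorted(shared_by_all()))
```

Functions with the profile (1, 1, 1) are the ones shared by all groups, which is the kind of question the slide asks.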

Shared functions of bacteria, archaea, and eucarya

Functions shared by Haemophilus influenzae (bacteria), Methanococcus jannaschii (archaea), and Saccharomyces cerevisiae (eucarya):

• Energy: biosynthesis of cofactors and amino acids; central and intermediary metabolism; energy metabolism; fatty acids and phospholipids; nucleotide biosynthesis; transport

• Information: replication; transcription; translation

• Communication and regulation: regulatory functions; cell envelope/cell wall; cellular processes

Can we construct a minimal organism?

Summary

• Relation of DNA, genes and chromosomes

• Relationship of distance in Morgan and base pairs

• How to find genes in DNA:
  By similarity
  Ab initio, with introns, exons, alternative splicing

Read Lesk, chapter 2

