BB30055: Genes and genomes Genomes - Dr. MV Hejmadi (bssmvh) 3 broad areas (A) Genomes...

Date post: 13-Dec-2015
Upload: joseph-quinn

BB30055: Genes and genomes

Genomes - Dr. MV Hejmadi (bssmvh)

3 broad areas

(A) Genomes

(B) Applications of genome projects

(C) Genome evolution

Why sequence the genome? 3 main reasons

• A description of the sequence of every gene is valuable. This includes the regulatory regions, which help in understanding not only the molecular activities of the cell but also the ways in which they are controlled.

• To identify & characterise important heritable disease genes or bacterial genes (for industrial use)

• To understand the role of intergenic sequences, e.g. satellites, intronic regions etc.

History of Human Genome Project (HGP)

1953 – DNA structure (Watson & Crick)
1972 – Recombinant DNA (Paul Berg)
1977 – DNA sequencing (Maxam, Gilbert and Sanger)
1985 – PCR technology (Kary Mullis)
1986 – Automated sequencing (Leroy Hood & Lloyd Smith)
1988 – IHGSC established (NIH, DOE); Watson leads
1990 – IHGSC scaled up; BLAST published (Lipman + Myers)
1992 – Watson quits; Venter sets up TIGR
1993 – F Collins heads IHGSC; Sanger Centre (Sulston)
1995 – cDNA microarray
1998 – Celera Genomics (J Craig Venter)
2001 – Working draft of human genome sequence published
2003 – Finished sequence announced

Human Genome Project (HGP)

Goal: Obtain the entire DNA sequence of the human genome

Players:

(A) International Human Genome Sequencing Consortium (IHGSC)
- public funding, free access to all, started earlier
- used the method of mapping overlapping clones

(B) Celera Genomics
- private funding, pay to view
- started in 1998
- used the whole genome shotgun strategy
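
The idea behind the whole genome shotgun strategy can be illustrated with a toy example. The sketch below is not Celera's assembler; it is a minimal greedy overlap-merge that assumes error-free reads from a single strand and no repeats. The function names `overlap` and `greedy_assemble` and the example reads are invented for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of contigs with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:  # nothing left to merge
            return contigs
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (i, j)]
        contigs.append(merged)


# Three made-up overlapping reads from a 13-base "genome"
reads = ["ATGGCGT", "GCGTACC", "TACCTTA"]
print(greedy_assemble(reads))  # ['ATGGCGTACCTTA']
```

Real genomes are full of repeats, which is why the public project first mapped overlapping clones and only then sequenced each clone; a greedy merge like this one cannot tell two copies of a repeat apart.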

Whose genome is it anyway?

(A) International Human Genome Sequencing Consortium (IHGSC)
- a composite of several different people, generated from 10-20 primary samples taken from numerous anonymous donors across racial and ethnic groups

(B) Celera Genomics
- 5 different donors (one of whom was J Craig Venter himself!)

Strategies for sequencing the human genome

Sequencing larger genomes

Mapping phase

Sequencing phase

Result…

~30,000-40,000 protein-coding genes estimated, based on known genes and predictions

                  IHGSC     Celera
Definite genes    24,500    26,383
Possible genes    5,000     12,000

Organisation of human genome

Nuclear genome (3.2 Gbp): 24 types of chromosomes, e.g. Y – 51 Mb and chr 1 – 279 Mb

Mitochondrial genome

General organisation of human genome

Polypeptide-coding regions

Gene organisation

Rare bicistronic transcription units, e.g. UBA52 transcription generates ubiquitin and a ribosomal protein (L40; the ubiquitin fusion with ribosomal protein S27a comes from the related UBA80/RPS27A gene)

General organisation of human genome

Non polypeptide–coding: RNA encoding

Pseudogenes (ψ)

Non-functional copies of the exonic sequences of an active gene.

Thought to arise by genomic insertion of a cDNA as a result of retroposition.

Contribute to overall repetitive elements (<1%)

Processed pseudogenes

Pseudogenes in globin gene cluster

Gene fragments or truncated genes

Gene fragments: small segments of a gene (e.g. a single exon from a multiexon gene)

Truncated genes: short components of functional genes (e.g. the 5’ or 3’ end)

Thought to arise due to unequal crossover or exchange

General organisation of human genome

Repetitive elements

Main classes based on origin

Tandem repeats

Interspersed repeats

Segmental duplications

1) Tandem repeats

Blocks of tandem repeats are found at:
- subtelomeres
- pericentromeres
- short arms of acrocentric chromosomes
- ribosomal gene clusters

Tandem / clustered repeats

Broadly divided into 4 types based on size

Class            Size of repeat   Repeat block   Major chromosomal location
Satellite        5-171 bp         > 100 kb       Centromeric heterochromatin
Minisatellite    9-64 bp          0.1–20 kb      Telomeres
Microsatellite   1-13 bp          < 150 bp       Dispersed

HMG3 by Strachan and Read pp 265-268

Satellites

Large arrays of repeats

Some examples:
- Satellite 1, 2 & 3
- α satellite (alphoid DNA) - found in all chromosomes
- satellite

HMG3 by Strachan and Read pp 265-268

Minisatellites

Moderate sized arrays of repeats

Some examples: hypervariable minisatellite DNA
- core of GGGCAGGAXG
- found in telomeric regions
- used in the original DNA fingerprinting technique by Alec Jeffreys

HMG3 by Strachan and Read pp 265-268

Microsatellites

VNTRs - Variable Number of Tandem Repeats; SSRs - Simple Sequence Repeats

1-13 bp repeats, e.g. (A)n ; (AC)n

2% of genome (dinucleotides - 0.5%)

Used as genetic markers (especially for disease mapping)

Individual genotype

HMG3 by Strachan and Read pp 265-268
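
To make the (A)n and (AC)n notation concrete, here is a minimal sketch that scans a sequence for short tandem repeats with a regular expression. The thresholds (`max_unit`, `min_repeats`), the function name and the test sequence are assumptions chosen for illustration, not values from the lecture.

```python
import re


def find_microsatellites(seq: str, max_unit: int = 6, min_repeats: int = 4):
    """Return (start, unit, copies) for each tandem run of a short unit."""
    # Lazy quantifier: the shortest repeating unit is tried first,
    # then the backreference \1 requires further identical copies.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# Made-up sequence containing (AC)5 and (A)6
seq = "GGTACACACACACTTGAAAAAAGC"
print(find_microsatellites(seq))  # [(3, 'AC', 5), (16, 'A', 6)]
```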

Microsatellite genotyping

- Design PCR primers unique to one locus in the genome
- A single pair of PCR primers will produce different sized products for each of the different length microsatellites
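
The relationship between repeat number and PCR product size can be written down directly. This is a sketch under a simplifying assumption: the product is the fixed flanking sequence (primers included) plus the repeat array. The flank length and allele sizes below are hypothetical.

```python
def pcr_product_size(flank_bp: int, unit: str, copies: int) -> int:
    """Product length = constant flanks + repeat unit length x copy number."""
    return flank_bp + len(unit) * copies


def genotype_from_products(sizes, flank_bp: int, unit: str):
    """Convert observed product sizes back into repeat copy numbers."""
    return sorted((size - flank_bp) // len(unit) for size in sizes)


# Hypothetical locus: 100 bp of flanking sequence around an (AC)n array
print(pcr_product_size(100, "AC", 12))               # 124
print(genotype_from_products([130, 124], 100, "AC"))  # [12, 15]
```

A heterozygous individual therefore gives two bands on a gel, one per allele, which is what makes these loci useful as genetic markers.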

2) Interspersed repeats

A.k.a. Transposon-derived repeats

45% of genome

Arise mainly as a result of transposition, through either a DNA or an RNA intermediate

Interspersed repeats (transposon-derived): major types

Class            Family            Size       Copy number   % genome
*LINE            L1 (Kpn family)   ~6.4 kb    0.5 x 10^6    16.9
                 L2                           0.3 x 10^6    3.2
SINE             Alu               ~0.3 kb    1.1 x 10^6    10.6
LTR              e.g. HERV         ~1.3 kb    0.3 x 10^6    8.3
DNA transposon   mariner           ~0.25 kb   1-2 x 10^4    2.8

* Updated from HGP publications
HMG3 by Strachan & Read pp268-272

LINEs (long interspersed elements)

- Among the most ancient elements of eukaryotic genomes
- Autonomous transposition (reverse transcriptase)
- ~6-8 kb long
- Internal polymerase II promoter and 2 ORFs
- 3 related LINE families in humans: LINE-1, LINE-2, LINE-3
- Believed to be responsible for retrotransposition of SINEs and creation of processed pseudogenes

LINEs (long interspersed elements)

Nature (2001) pp879-880 HMG3 by Strachan & Read pp268-272

SINEs (short interspersed elements)

- Non-autonomous (successful freeloaders! They ‘borrow’ RT from other sources such as LINEs)
- ~100-300 bp long
- Internal polymerase III promoter
- No proteins
- Share 3’ ends with LINEs
- 3 related SINE families in humans: active Alu, inactive MIR and Ther2/MIR3

LINEs and SINEs have preferred insertion sites

• In this example, yellow represents the distribution of mys (a type of LINE) over a mouse genome where chromosomes are orange. There are more mys inserted in the sex (X) chromosomes.

Try the link below to do an online experiment which shows how an Alu insertion polymorphism has been used as a tool to reconstruct the human lineage

http://www.geneticorigins.org/geneticorigins/pv92/intro.html

Long Terminal Repeats (LTR)

Repeats in the same orientation on both sides of the element, e.g. ATATATNNNNNNNATATAT

• Contain sequences that serve as transcription promoters as well as terminators.
• These sequences allow the element to code for an mRNA molecule that is processed and polyadenylated.
• At least two genes are coded within the element to supply essential activities for the retrotransposition mechanism.
• The RNA contains a specific primer binding site (PBS) for initiating reverse transcription.
• A hallmark of almost all mobile elements is that they form small direct repeats at the site of integration.

Long Terminal Repeats (LTR)

- Autonomous or non-autonomous
- Autonomous retroposons contain gag and pol genes, which encode the protease, reverse transcriptase, RNase H and integrase

Nature (2001) pp879-880; HMG3 by Strachan & Read pp268-272

DNA transposons (lateral transfer?)

Inverted repeats on both sides of the element, e.g. ATGCNNNNNNNNNNNCGTA

Nature (2001) pp879-880; from Genes VII by Lewin
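
The difference between the direct repeats of LTR elements and the inverted repeats of DNA transposons can be checked mechanically. The sketch below compares the two ends of an element; the function names are invented for illustration. Note one assumption made explicit in the code: the slide's example draws the inverted repeat as a simple mirror image, whereas on a single DNA strand an inverted repeat reads as the reverse complement, so both cases are reported separately.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def terminal_repeat_type(element: str, k: int) -> str:
    """Classify the relationship between the first and last k bases."""
    left, right = element[:k], element[-k:]
    if right == left:
        return "direct"
    if right == left[::-1]:
        return "inverted-mirror"   # as drawn in the slide's example
    if right == reverse_complement(left):
        return "inverted-revcomp"  # how it reads on one strand of real DNA
    return "none"


print(terminal_repeat_type("ATATATNNNNNNNATATAT", 6))  # direct
print(terminal_repeat_type("ATGCNNNNNNNNNNNCGTA", 4))  # inverted-mirror
print(terminal_repeat_type("ATGCNNNNNNNGCAT", 4))      # inverted-revcomp
```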

3) Segmental duplications

Closely related sequence blocks at different genomic loci

Transfer of 1-200kb blocks of genomic sequence

Segmental duplications can occur on homologous chromosomes (intrachromosomal) or non homologous chromosomes (interchromosomal)

Not always tandemly arranged

Relatively recent

Segmental duplications

Interchromosomal: segments duplicated among non-homologous chromosomes

Intrachromosomal: duplications occur within a chromosome / arm

Nature Reviews Genetics 2, 791-800 (2001)

Segmental duplications in chromosome 22

Segmental duplications - chromosome 7.

Nature Reviews Genetics 2, 791-800 (2001)

Major insights from the HGP

Nature (2001) 15th Feb Vol 409 special issue; pgs 814 & 875-914.

1) Gene size, content and distribution
2) Proteome content
3) SNP identification
4) Distribution of GC content
5) CpG islands
6) Recombination rates
7) Repeat content

1) Gene size

Gene content…

More genes: twice as many as Drosophila / C. elegans

Uneven gene distribution: gene-rich and gene-poor regions

More paralogs: some gene families have expanded the number of paralogs, e.g. the olfactory gene family has 1000 genes

More alternative transcripts: increased RNA splice variants are produced, thereby expanding the primary proteins by 5 fold (e.g. neurexin genes)

Gene distribution

Genes within genes, e.g. NF1 gene

Overlapping genes (transcribed from 2 DNA strands) - Rare

Genes generally dispersed (~1 gene per 100kb)

Class III complex at HLA 6p21.3

HMG3 Fig 9.8

Uneven gene distribution

Gene-rich regions, e.g. MHC on chromosome 6 has 60 genes with a GC content of 54%

Gene-poor regions: 82 gene deserts identified (large or unidentified genes?)

What is the functional significance of these variations?

2) Proteome content

Proteome more complex than invertebrates

Protein domains (sections with identifiable shape/function)

Domain arrangements in humans:
- largest total number of domains is 130
- largest number of domain types per protein is 9
- mostly identical arrangement of domains

Figure: arrangement of domains A, B and C in Protein X

Proteome more complex than invertebrates…

No huge difference in domain number in humans, BUT the frequency of domain sharing is very high in human proteins (structural proteins and proteins involved in signal transduction and immune function)

However, there are only 3 cases where a combination of 3 domain types is shared by human & yeast proteins.

e.g. carbamoyl-phosphate synthase (involved in the first 3 steps of de novo pyrimidine biosynthesis) has 7 domain types, which occurs once in human and yeast but twice in Drosophila

3) SNPs (single nucleotide polymorphisms)

Sites that result from point mutations in individual base pairs

- Biallelic
- More than 1.4 million SNPs identified
- One every 1.9 kb on average
- Densities vary over regions and chromosomes, e.g. the HLA region has a high SNP density, reflecting maintenance of diverse haplotypes over many millions of years
- ~60,000 SNPs lie within exons and untranslated regions (85% of exons lie within 5 kb of a SNP)
- May or may not affect the ORF
- Most SNPs may be regulatory

Nature (2001) 15th Feb Vol 409 special issue; pgs 821-823 & 928

How does one distinguish sequence errors from polymorphisms?

Sequence errors
- Each piece of genome is sequenced at least 10 times to reduce the error rate (0.01%)

Polymorphisms
- Sequence variation between individuals is 0.1%
- To be defined as a polymorphism, the altered sequence must be present in a significant proportion of the population
- Rate of polymorphisms in the diploid human genome is about 1 in 500 bp

Nature (2001) 15th Feb Vol 409 special issue; pgs 821-823 & 928
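
Why repeated sequencing lowers the error rate can be shown with a majority vote. This is only a sketch: it assumes the reads are already aligned and of equal length, and the reads themselves are made up. It is not the consensus method used by either genome project.

```python
from collections import Counter


def consensus(aligned_reads: list[str]) -> str:
    """Take the most common base in each column of the aligned reads."""
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*aligned_reads)
    )


# Five made-up reads of the same region; two carry a single random error
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAC", "ACGTTC"]
print(consensus(reads))  # ACGTAC
```

A random sequencing error is outvoted because it rarely recurs at the same position in independent reads, whereas a true polymorphism shows up consistently in the DNA of the individuals who carry it.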

SNPs and disease

3) SNPs……and risk of disease

N(291)S

3) SNPs……and risk of disease

3 major alleles (APO E2, E3, and E4)

APO E2: Cys112 / Cys158
APO E3: Cys112 / Arg158
APO E4: Arg112 / Arg158

Late-onset Alzheimer's disease (LOAD): the apolipoprotein E4 haplotype is a genetic risk factor
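
The three APO E alleles are defined by the residues at positions 112 and 158, so calling the allele is a simple lookup. The sketch below encodes only the table on the slide; the function names and the example genotype are hypothetical, and it says nothing about the size of the disease risk.

```python
# Residues at positions 112 and 158, as listed on the slide
APOE_ALLELES = {
    ("Cys", "Cys"): "E2",
    ("Cys", "Arg"): "E3",
    ("Arg", "Arg"): "E4",
}


def apoe_allele(res112: str, res158: str) -> str:
    """Name the allele for one chromosome, or 'unknown' if not in the table."""
    return APOE_ALLELES.get((res112, res158), "unknown")


def e4_copies(genotype) -> int:
    """Count E4 alleles in a genotype given as two (res112, res158) pairs."""
    return sum(apoe_allele(*haplotype) == "E4" for haplotype in genotype)


print(apoe_allele("Cys", "Arg"))                     # E3
print(e4_copies([("Arg", "Arg"), ("Cys", "Arg")]))   # 1
```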

3) SNPs……and pharmacogenomics

4) Distribution of GC content

Genome wide average of 41%

Huge regional variations exist, e.g. the distal 48 Mb of chromosome 1p has 47% but chromosome 13 has only 36%

Confirms cytogenetic staining with G-bands (Giemsa):
- dark G-bands – low GC content (37%)
- light G-bands – high GC content (45%)

Nature (2001) 15th Feb Vol 409 special issue; pg 876-877
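
GC content is simply the percentage of bases that are G or C, and regional variation appears when it is computed window by window. A minimal sketch follows; the window size and the example sequence are arbitrary choices for illustration.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int) -> list[float]:
    """GC percentage of consecutive, non-overlapping windows."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


example = "ATATATATGCGCGCGC"    # made-up: AT-rich half, then GC-rich half
print(gc_content(example))      # 50.0
print(windowed_gc(example, 8))  # [0.0, 100.0]
```

The whole-sequence average hides the contrast between the two halves, which is the same effect as a 41% genome-wide average hiding regions at 36% and 47%.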

5) CpG islands

Significance of CpG islands
1) Non-methylated CpG islands are associated with the 5’ ends of genes
2) Aberrant methylation of CpG islands is one mechanism of inactivating tumor suppressor genes (TSGs) in neoplasia

http://www.sanger.ac.uk/HGP/cgi.shtml

CpG → methyl-CpG (methylated at C) → TpG (deamination)

CpG islands show no methylation
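
The consequence of methylation followed by deamination can be measured as the ratio of observed to expected CpG dinucleotides. The sketch below uses the common definition of the expected count (number of C x number of G / sequence length); it models deamination on one strand only, and the function names and sequences are invented for illustration.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed CpG count divided by the count expected from base composition."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    observed = seq.count("CG")
    expected = c * g / len(seq)
    return observed / expected


def deaminate_methylated_cpg(seq: str) -> str:
    """Treat every CpG as methylated and deaminated: CpG becomes TpG."""
    return seq.upper().replace("CG", "TG")


print(cpg_obs_exp("CGCGCG"))                # 2.0
print(deaminate_methylated_cpg("ACGTCG"))   # ATGTTG
```

Unmethylated regions such as CpG islands escape this decay and keep a ratio close to the expected value, while the bulk of the genome drifts well below it.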

CpG islands

CpG dinucleotides are greatly under-represented in the human genome

• ~28,890 CpG islands in number
• Variable density, e.g. Y – 2.9/Mb but chromosomes 16, 17 & 22 have 19-22/Mb
• Average is 10.5/Mb

Nature (2001) 15th Feb Vol 409 special issue; pg 877-888

6) Recombination rates

2 main observations
• Recombination rate increases with decreasing arm length
• Recombination rate is suppressed near the centromeres and increases towards the distal 20-35 Mb

7) Repeat content

a) Age distribution

b) Comparison with other genomes

c) Variation in distribution of repeats

d) Distribution by GC content

e) Y chromosome

Nature (2001) 409: pp 881-891

Repeat content…

a) Age distribution

- Most interspersed repeats predate the eutherian radiation (confirms the slow rate of clearance of nonfunctional sequence from vertebrate genomes)
- LINEs and SINEs have extremely long lives
- 2 major peaks of transposon activity
- No DNA transposition in the past 50 MYr
- LTR retroposons teetering on the brink of extinction

c) Variation in distribution of repeats

Some regions show either:

High repeat density, e.g. chromosome Xp11 – a 525 kb region shows 89% repeat density

Low repeat density, e.g. HOX homeobox gene cluster (<2% repeats) (indicative of regulatory elements which have low tolerance for insertions)

d) Distribution by GC content

High GC – gene rich; High AT – gene poor

LINEs abundant in AT-rich regions

SINEs lower in AT-rich regions

Alu repeats in particular are retained in actively transcribed GC-rich regions, e.g. chromosome 19 has 5% Alus compared to Y chromosome

e) The Y chromosome!

Unusually young genome (high tolerance to gaining insertions)

Mutation rate is 2.1X higher in male germline

Possibly due to cell division rates or different repair mechanisms

• Working draft published – Feb 2001
• Finished sequence – April 2003
• Annotation of genes going on (refer: International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature 21 October 2004 (doi: 10.1038/nature03001))

Other genomes sequenced

2002: Mus musculus, 36,000 genes
Sept 2003: Canis, 18,473 human orthologs
1997: 4,200 genes
1998: 19,099 genes
2002: 38,000 genes
31 Aug 2005: Pan troglodytes, 28% identical human orthologs

Science (26 Sep 2003) Vol 301(5641) pp1854-1855

References

1) Chapter 9 pp 265-268, HMG3 by Strachan and Read
2) Chapter 10 pp 339-348, Genetics: From Genes to Genomes by Hartwell et al (2/e)
3) Nature (2001) 409: pp 879-891

