Transcriptional Regulation: Mechanisms and Methods of Analysis
Jean Imbert
For a European Research Initiative: http://fer.apinc.org/
15 April 2009
IBiSA Platform - Inserm
I.1. Eukaryotic Promoter Classes
• Pol I (< 1%): 3 of the 4 rRNAs (28S, 18S, 5.8S)
• Pol II (mRNAs):
  - with TATA-box: > 70%
  - without TATA-box: ~ 20%
• Pol III (small RNAs, tRNAs, 5S RNAs):
  - internal: ~ 5%
  - upstream with TATA-box: < 1%
  - upstream without TATA-box: < 1%
MODEL OF TYPICAL GENE PROMOTER AND REGULATORY REGIONS
[Figure: enhancer regulatory elements, the regulatory promoter, and the core promoter (TATA box, Inr) drawn around the +1 start site; axis marks at -4000, -500, -40, +1, +50, and +2000.]
The molecular mechanisms of transcriptional regulation involve an intricate hierarchy of factors acting sequentially at three levels:
1. Binding of sequence-specific DNA-binding transcription factors to specific regulatory elements linked to their target genes.
2. Recruitment of non-DNA-binding proteins capable of modifying the general repressive context of chromatin and acting as signal-regulated scaffolds to bridge interactions between the sequence-specific DNA-binding proteins and the basal transcription machinery.
3. Recruitment of RNA polymerase II and its basal transcription factors.
DNA compaction in a human nucleus

Level | Structure | Compact size | DNA content | DNA length | Compaction
nucleus (human) | 2 x 23 = 46 chromosomes, 92 DNA molecules | 10 μm ball | 12,000 Mbp | 4 m | 400,000 x
mitotic chromosome | 2 chromatids, 1 μm thick, 2 DNA molecules | 10 μm long X | 2x 130 Mbp | 2x 43 mm | 10,000 x
DNA domain | anchored DNA loop, 1 replicon ? | 60 nm x 0.5 μm | 60 kbp | 20 μm | 35 x
chromatin fiber | approx. 6 nucleosomes per 'turn' of 11 nm | 30 nm diameter | 1200 bp | 400 nm | 35 x
nucleosome | disk, 1 ¾ turn of DNA (146 bp) + linker DNA | 6 x 11 nm | 200 bp | 66 nm | 6 - 11 x
base pair | - | 0.33 x 1.1 nm | 1 bp | 0.33 nm | 1 x

[Figure: scale of compaction from 1 bp (0.3 nm) through the 11 nm and 30 nm fibers up to 10,000 nm; compaction of DNA by histones, then compaction by the chromosome scaffold / nuclear matrix.]
From Jakob H. Waterborg - School of Biological Sciences - University of Missouri-Kansas City
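The compaction column can be checked by hand: the extended DNA length is the number of base pairs times 0.33 nm, and the compaction ratio is that length divided by the size of the compact structure. The sketch below redoes this arithmetic for a few rows of the table; the function names are illustrative, and the choice of which dimension counts as the "compact size" for each row is an assumption.

```python
BP_RISE_NM = 0.33  # DNA length per base pair in nm, as given in the table


def dna_length_nm(base_pairs: float) -> float:
    """Extended length of a DNA molecule of the given size, in nm."""
    return base_pairs * BP_RISE_NM


def compaction_ratio(base_pairs: float, compact_size_nm: float) -> float:
    """Extended DNA length divided by the size of the compact structure."""
    return dna_length_nm(base_pairs) / compact_size_nm


if __name__ == "__main__":
    # (level, DNA content in bp, compact size in nm); sizes are read from the table,
    # and the dimension used for each level is an assumption of this sketch.
    levels = [
        ("nucleus (human), 10 um ball", 12_000e6, 10_000.0),
        ("chromatin fiber, one 11 nm turn", 1200, 11.0),
        ("nucleosome, 11 nm disk diameter", 200, 11.0),
        ("nucleosome, 6 nm disk height", 200, 6.0),
    ]
    for name, bp, size_nm in levels:
        print(f"{name}: {dna_length_nm(bp):.3g} nm of DNA, "
              f"compaction ~{compaction_ratio(bp, size_nm):,.0f} x")
```

With these inputs the nucleus comes out close to 400,000 x and the nucleosome between 6 x and 11 x, in line with the table; the fiber value lands near the quoted 35 x.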
I.3. Major Nucleosomal Histone Modification Mapping
From G. Felsenfeld and M. Groudine. Controlling the double helix. Nature 421 (6921):448-453, 2003.
• Acetylation at lysine residues is highly associated with transcriptional activation (H2AK5, H2BK12, etc.)
• Methylation at lysine or arginine residues is associated with either transcriptional activation (H3K4, H4R3, etc.) or repression (H3K9, H3K27, etc.)
• Phosphorylation at serine or threonine residues is associated with either transcriptional activation (H3S28, etc.) or repression (H2AS1, etc.)
Functional Effects of Histone Modifications on Transcription: A Few Examples in Mammals
• Ubiquitylation and sumoylation have been associated with mitosis, meiosis, etc.
From C. L. Peterson and M. A. Laniel. Histones and histone modifications. Curr.Biol. 14 (14):R546-R551, 2004.
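Sequence-specific transcription factors are modular proteins
• DD: Dimerization Domain
• DBD: DNA Binding Domain
• TD: Transactivation Domain
• NLS: Nuclear Localization Signal
• NES: Nuclear Export Signal
• P: Phosphorylation Site
[Figure: schematic of a modular transcription factor showing the DBD, TD, and DD domains, the NLS and NES signals, and phosphorylation sites (P).]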
From Tupler R, Perini G, Green MR. Nature 409: 832-833, 2001.
Genome-wide comparison of transcriptional activator families in eukaryotes
C2H2 zinc fingers are found in 2% of all human genes, and they are by far the most abundant class of DNA-binding domains found in human transcription factors.
STRATEGIES FOR ANALYZING REGULATORY SEQUENCES AND TRANSCRIPTION FACTORS
1. Identification and characterization of regulatory sequences
- Promoter, enhancer, and repressor regions
- Reporter gene assays: transient or stable (CAT, luciferase, SEAP, β-gal, …)
- Sequencing and searches in consensus-motif databases (web sites: TESS, Euk. Pr. Database, TRANSFAC, JASPAR, MATINSPECTOR, TFSEARCH, etc.)
2. Identification and characterization of the protein factors specific to the regulatory sequences
- Gel shift and derivatives: UV cross-linking, biotinylated oligonucleotides
- Footprinting:
  * in vitro: nucleases (DNase I hypersensitivity, S1 nuclease, MNase), chemical methods
  * in vivo: genomic footprinting, ChIP (Chromatin ImmunoPrecipitation), enhancer knock-in, minichromosomes, transgenesis
3. Characterization of physical and functional interactions
- Transfection and biochemistry
- ChIP, ChIP-on-chip, ChIP-seq
4. Integration into the chromatin context (insulator, MAR/SAR, LCR, etc.)
- Artificial minichromosome
- 3C, 4C, 5C
III. Example of the Functional Organization of Regulatory Regions:
Control of IL2RA Gene Transcription
Major Roles of the IL2/IL2R System
Major T cell growth factor controlling antigen-mediated cell growth
Antigen-induced cell death (AICD)
Essential for the generation in the thymus and the peripheral maintenance of CD4+CD25+ Treg
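[Figure: primary activation of a CD4+ T cell by an APC (Ag/MHC with CD3/TcR, B7 with CD28) induces IL-2, which has autocrine effects on the CD4+ T cell (α/β/γc receptor) and paracrine effects on B, CD8+ T, and NK cells (β/γc or α/β/γc receptors).]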
IL-2Rα CHAIN EXPRESSION IS CONTROLLED AT:
- TRANSCRIPTIONAL LEVEL
- POST-TRANSCRIPTIONAL LEVEL (mRNA half-life)
CD25/IL-2Rα GENE TRANSCRIPTION DURING T CELL ACTIVATION
[Figure: log IL2RA mRNA level across G0, G1, and S. Primary activation (CD3/TCR, CD28) is followed by IL-2/IL-2R (α, β, γc) signaling, cell cycle progression, proliferation, and differentiation; three phases of IL-2Rα mRNA accumulation (I, II, III) are marked.]

IL2RA GENE REGULATORY REGIONS
[Figure: map of the IL2RA locus from about -9000 to beyond +1, with the TATA box and exon 1, in a CD4+ T cell contacting an APC.]
• Signals: Ag/MHC/TCR/CD3 (signal 1), B7/CD28 (signal 2), IL-2/IL-2R (signal 3), TGF-β/TGF-βR (other signal)
• Regions: PRRI [-276,-244]; PRRII [-137,-64]; PRRIII [-3780,-3703] (IL-2rE); PRRIV [+3389,+3596] (IL-2rE); PRRV [-7664,-7566]; PRRVI [-8689,-8483] (CD28rE); SBS [+7822,+8199]
• Factors shown: Elf-1, HMG-I(Y), NF-κB, SRF, Stat5a,b, GATA-1-like, Ets-1/2, CREB/ATF, AP-1, NFAT, Smad3, SATB1 (?)
• Primary activation acts on PRRI and PRRII (NF-κB, SRF, Elf-1 ?)
PipMaker
http://pipmaker.bx.psu.edu/pipmaker/
Schwartz, S., Zhang, Z., Frazer, K.A., Smit, A., Riemer, C., Bouck, J., Gibbs, R., Hardison, R., and Miller, W. (2000). PipMaker: A Web Server for Aligning Two Genomic DNA Sequences. Genome Res. 10, 577-586.
Homo Sapiens/Mus Musculus IL-2Rα locus dotplot comparison
[Figure: PipMaker dot plot of the Homo sapiens IL-2Rα locus (x-axis, in kb, exons 1-7) against the Mus musculus IL-2Rα locus (y-axis, exons 1-7, with a gap marked in intron A); conserved regulatory regions I+II, III, IV, V, and VI are marked. Update JI 17/10/05.]
PRRIII / IL-2rE: human/mouse alignment

Homo sapiens  -3772 TTTCTTCTAGGAAGTACCAAACATTTCTGATAATAGAATTGAGCAATTTCCTGAT -3718
                    ||||||||  ||||||||| ||||||||||||| ||| |||||||| ||||||||
Mus musculus  -1369 TTTCTTCTGAGAAGTACCAGACATTTCTGATAAGAGAGTTGAGCAACTTCCTGAT -1315

Sites, 5' to 3': Site I (GASd/EBSd), Site II (GASp/GATA), Site III (EBSp)
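The degree of conservation in this alignment can be quantified directly. The short sketch below counts mismatches between the two 55-nt sequences shown on the slide; the helper names are illustrative, and the mapping of the first human base to -3772 is taken from the slide labels and assumed to hold base by base.

```python
HUMAN = "TTTCTTCTAGGAAGTACCAAACATTTCTGATAATAGAATTGAGCAATTTCCTGAT"
MOUSE = "TTTCTTCTGAGAAGTACCAGACATTTCTGATAAGAGAGTTGAGCAACTTCCTGAT"
HUMAN_START = -3772  # coordinate of the first human base, as labeled on the slide


def mismatch_positions(a: str, b: str) -> list[int]:
    """0-based positions at which two equal-length, ungapped sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in an ungapped alignment."""
    return 100.0 * (len(a) - len(mismatch_positions(a, b))) / len(a)


if __name__ == "__main__":
    diffs = mismatch_positions(HUMAN, MOUSE)
    print(f"{len(HUMAN)} aligned positions, {len(diffs)} mismatches")
    print(f"identity: {percent_identity(HUMAN, MOUSE):.1f}%")
    print("human coordinates of mismatches:", [HUMAN_START + i for i in diffs])
```

Two of the six mismatches fall in the spacer of the first TTCNNNGAA-like motif, so the flanking TTC and GAA half-sites are identical in both species.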
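[Figure: CAT reporter assay. A 250XbaI fragment carrying PRRIII (around -3800 to -3700) was placed upstream of the TK promoter in pTK4.CAT (pTK4/250XbaI). CAT (pg/ml, axis 0-80) was measured in non-stimulated (ns) and IL-2-treated cells. Induction folds of 4.2 and 1.33 are reported for the two constructs (pTK4.CAT and pTK4/250XbaI).]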
In Vivo Footprinting
• 1) Methylation of guanines (major groove) and, to a lesser extent, of adenines (minor groove) by DMS on living cells. The level of methylation is affected by protein binding to DNA.
• 2) Genomic DNA extraction
• 3) Cleavage of methylated residues by piperidine
• 4) LM-PCR (ligation-mediated PCR) amplification of the region to be analyzed. The last amplification cycles are performed with a 32P-labeled primer.
• 5) Analysis of the PCR products on a sequencing gel
[Figure: in vivo footprinting gel of PRRIII (-3819 to -3700) in primary T lymphocytes; lanes 1-4: non-stimulated (ns), in vitro, and CD2+CD28 for 24 h and 48 h. The GASp/GATA, EBSp, GASd, and EBSd motifs are marked, with protection between -3769 and -3757.]
The GASd/EBSd site is the only motif occupied in vivo in response to CD2+CD28 in purified human primary T lymphocytes
-3772 5'-TTTCTTCTAGGAAGTACCAAACATTTCTGATAATAGAATTGAGCAATTTCCTGAT-3' -3718
      3'-AAAGAAGATCCTTCATGGTTTGTAAAGACTATTATCTTAACTCGTTAAAGGACTA-5'
Sites, 5' to 3': GASd/EBSd, GASp/GATA, EBSp
The GASd/EBSd motif is the only putative regulatory element within PRRIII modified in vivo in response to an IL-2-dependent induction in human T lymphocytes
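Candidate sites such as these are usually located first by scanning the sequence for consensus motifs. The sketch below scans the 55-nt PRRIII segment for a GAS pattern (TTCNNNGAA) and an Ets-binding core (GGAA) on both strands. Both consensus strings are the commonly used textbook forms and are assumptions here, not patterns stated on the slide; the function names are illustrative.

```python
import re

TOP = "TTTCTTCTAGGAAGTACCAAACATTTCTGATAATAGAATTGAGCAATTTCCTGAT"
GAS = "TTCNNNGAA"     # assumed consensus for STAT-binding GAS elements
EBS_CORE = "GGAA"     # assumed core of Ets-binding sites

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq: str, motif: str) -> list[tuple[int, str]]:
    """All (0-based start, matched site) pairs, including overlapping matches."""
    body = "".join(IUPAC[base] for base in motif)
    pattern = re.compile(f"(?=({body}))")  # lookahead keeps overlapping hits
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


def scan_both_strands(seq: str, motif: str) -> list[tuple[int, str, str]]:
    """Hits on both strands, reported as top-strand start, strand, site."""
    hits = [(start, "+", site) for start, site in find_motif(seq, motif)]
    n, k = len(seq), len(motif)
    for start, site in find_motif(reverse_complement(seq), motif):
        hits.append((n - start - k, "-", site))
    return sorted(hits)


if __name__ == "__main__":
    print("GAS, top strand:", find_motif(TOP, GAS))
    print("EBS core, both strands:", scan_both_strands(TOP, EBS_CORE))
```

With this strict GAS pattern only one hit is reported in the segment, overlapping a GGAA core on the top strand; a weaker GAS-like site would need a degenerate pattern or a position weight matrix to be detected.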
INDUCIBLE
CONSTITUTIVE
Lecine, P., Algarte, M., Rameil, P., Beadling, C., Bucher, P., Nabholz, M. and Imbert, J. Elf-1 and Stat5 bind to critical element in a new enhancer of the human interleukin-2 receptor alpha gene. Mol.Cell.Biol. 16: 6829-6840; 1996.
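[Figure: IL-2 signaling through the JAK/STAT pathway. IL-2 binds the α/β/γc receptor at the membrane; JAK1 and JAK3 act on STAT5 in the cytoplasm; STAT5 enters the nucleus and binds GAS elements of target genes (CIS, OSM, IL-2Rα).]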
[Figure: EMSA with the GASd/EBSd, Site I, and FcγRI-GAS probes, without (-) or with (+) IL-2, lanes 1-6; complexes C1, C2, and C3 are marked.]
CD25/IL-2Rα GASd/EBSd EMSA probe: 5'-TTTCTTCTAGGAAGTACC-3' / 3'-AAAGAAGATCCTTCATGG-5'
Mouse site I: 5'-TTTCTTCTGAGAAGTACC-3' / 3'-AAAGAAGACTCTTCATGG-5'
[Figure: competition EMSA with the GASd/EBSd probe, without or with IL-2, lanes 1-6; 100X competitors: none, none, WT, mGASd, mEBSd, mGASd/mEBSd; complexes C1, C2, and C3 are marked.]
The constitutive C1 complex is EBS-specific and the inducible C2 and C3 complexes are GAS-specific
[Figure: supershift EMSA with the GASd/EBSd probe, extracts without or with IL-2, lanes 1-6; antibodies added in single lanes: anti-Stat5b, anti-Ets-1/2, anti-Ets-1, anti-Ets-2; complexes C1, C2, and C3 are marked.]
In Kit-225 cells the inducible C3 complex contains Stat5b, Ets-1 and Ets-2
Lécine et al., MCB, 1996
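[Figure: the overlapping GAS (TTCTAGGAA) and EBS (AGGAA) motifs, and a co-immunoprecipitation experiment, lanes 1-9, without (-) or with (+) IL-2. IP: Ets-1/2, Ets-1, Ets-2, c-Rel; TL: total lysate. WB: Stat5b (size marker 83).]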
Stat5b, Ets-1 and Ets-2 interact in vivo in response to IL-2
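[Figure: co-transfection assay with the GASd/EBSd reporter in pTK4.CAT and Stat5b, Ets-1, and Ets-2 effector plasmids in various combinations, non-stimulated (NS) or stimulated with IL-2 + PMA + ionomycin (IL-2+P+I); CAT in pg/ml (axis 0-80). Induction values, in the order shown: 27.3 +/- 7.7, 11.8 +/- 4.5, 8.3 +/- 3.3, 9.1 +/- 4.1, 8.8 +/- 3.4, 4.6 +/- 2.1, 1.0 +/- 0.02.]
Functional cooperation between Stat5b, Ets-1 and Ets-2 in response to IL-2 + PMA + ionomycin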
Chromatin immunoprecipitation (ChIP)
1. Formaldehyde cross-linking
2. Sonication to shear chromatin
3. Immunoprecipitation with specific antibodies (bound fraction, unbound fraction)
4. Reversing cross-linking
5. Semi-quantitative or quantitative PCR assays
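When the final step is a quantitative PCR, the result is commonly expressed as a percentage of the input chromatin or as fold enrichment over a no-antibody control. The sketch below shows that calculation under two stated assumptions: perfect PCR efficiency (a doubling per cycle) and an input sample that represents a known fraction of the chromatin used per immunoprecipitation. The function names and the Ct values in the example are hypothetical and do not come from the slides.

```python
import math


def percent_input(ct_input: float, ct_ip: float, input_fraction: float) -> float:
    """Percent-of-input for a ChIP-qPCR sample.

    ct_input: Ct of the diluted input sample
    ct_ip: Ct of the immunoprecipitated sample
    input_fraction: fraction of the chromatin saved as input (0.01 for 1%)
    Assumes 100% amplification efficiency.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to what it would be for 100% of the chromatin.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def fold_enrichment(ct_ip: float, ct_mock: float) -> float:
    """Enrichment of a specific IP over a no-antibody (mock) IP."""
    return 2.0 ** (ct_mock - ct_ip)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    print(f"percent input: {percent_input(25.0, 24.0, 0.01):.2f}%")
    print(f"fold over no-Ab control: {fold_enrichment(24.0, 29.0):.0f}x")
```

A semi-quantitative gel readout, as in the lanes labeled input, no Ab, and specific antibody, follows the same logic without the explicit Ct arithmetic.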
[Figure: ChIP PCR gel, marker (M) and lanes 1-9, from cells treated with IL-2 for 16 h or 3 h; samples: input, no antibody (no Ab), Stat5b, p-Stat5b, and Ets-1/2 immunoprecipitates.]
Stat5b and Ets-1/2 bind to IL2RA gene within human IL-2rE in vivo
PRRIII ChIP primer design

  1 TTCTGCCCTTAGCTTCTACCCCTCTCTACTTCTGGTTAACTATGGACCACACTCTGCTTC
 61 CTCAGGAACCACCTACCAAGGCCGTATCCATCCTTCAAGGACAATACGTGGGCCTTTCCT
121 GATCACATCAGCTCAACAACTTTTCCCTCCTACATTTCAATTGCTCTTCTTACCATAATC
181 ATTAGTATTCACCCCACTGTACGTCTAGAAAGAAAGTGGTCTTAAACCTAAGGGAAGGCA
241 GTCTAGGTCAGAAATTTGTTGTCCGCTGTTCTGAGCAGTTTCTTCTAGGAAGTACCAAAC
301 ATTTCTGATAATAGAATTGAGCAATTTCCTGATGAAGTGAGACTCAGCTTGCACTGTTGA
361 CCGGCTGTCCTGGATGAACCTAGTTACTTTTAACCAAATGTTCCTTTCTTGAACTTGTTC
421 CTTTCTTGAACTTAATCTATC

Forward primers: BZ3-1 (positions 62-81) and BZ3-2 (positions 104-123). Reverse primer: BZ3-3 (positions 343-362).

OLIGO  start  len  tm     gc%    3' seq
BZ3-1  62     20   59.96  55.00  TCAGGAACCACCTACCAAGG
BZ3-3  362    20   59.62  55.00  GGTCAACAGTGCAAGCTGAG
BZ3-2  104    20   60.71  50.00  ATACGTGGGCCTTTCCTGAT
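The primer table can be cross-checked against the template. The sketch below recomputes the GC content of each oligo, locates the primers on the 441-nt sequence, and derives the expected amplicon sizes. It assumes the convention that a reverse primer's "start" is the template position of its 5' end (its rightmost base); the helper names are illustrative, and melting temperatures are not recomputed.

```python
TEMPLATE = (
    "TTCTGCCCTTAGCTTCTACCCCTCTCTACTTCTGGTTAACTATGGACCACACTCTGCTTC"
    "CTCAGGAACCACCTACCAAGGCCGTATCCATCCTTCAAGGACAATACGTGGGCCTTTCCT"
    "GATCACATCAGCTCAACAACTTTTCCCTCCTACATTTCAATTGCTCTTCTTACCATAATC"
    "ATTAGTATTCACCCCACTGTACGTCTAGAAAGAAAGTGGTCTTAAACCTAAGGGAAGGCA"
    "GTCTAGGTCAGAAATTTGTTGTCCGCTGTTCTGAGCAGTTTCTTCTAGGAAGTACCAAAC"
    "ATTTCTGATAATAGAATTGAGCAATTTCCTGATGAAGTGAGACTCAGCTTGCACTGTTGA"
    "CCGGCTGTCCTGGATGAACCTAGTTACTTTTAACCAAATGTTCCTTTCTTGAACTTGTTC"
    "CTTTCTTGAACTTAATCTATC"
)
PRIMERS = {
    "BZ3-1": "TCAGGAACCACCTACCAAGG",  # forward
    "BZ3-2": "ATACGTGGGCCTTTCCTGAT",  # forward
    "BZ3-3": "GGTCAACAGTGCAAGCTGAG",  # reverse
}


def gc_percent(seq: str) -> float:
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def forward_start(template: str, primer: str):
    """1-based position of the 5' end of a forward primer, or None."""
    idx = template.find(primer)
    return idx + 1 if idx >= 0 else None


def reverse_start(template: str, primer: str):
    """1-based template position of the 5' end of a reverse primer, or None."""
    idx = template.find(reverse_complement(primer))
    return idx + len(primer) if idx >= 0 else None


def product_size(fwd_start: int, rev_start: int) -> int:
    return rev_start - fwd_start + 1


if __name__ == "__main__":
    rev = reverse_start(TEMPLATE, PRIMERS["BZ3-3"])
    for name in ("BZ3-1", "BZ3-2"):
        fwd = forward_start(TEMPLATE, PRIMERS[name])
        print(f"{name}/BZ3-3: start {fwd}, GC {gc_percent(PRIMERS[name]):.2f}%, "
              f"product {product_size(fwd, rev)} bp")
```

Both amplicons span the GASd/EBSd site, which lies in the line beginning at position 241 of the template.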
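[Figure: model of the IL-2rE in T lymphocytes, contrasting the resting state (transcription OFF) with IL-2 stimulation (transcription ON); Stat5 dimers, Ets-1/2, and an unidentified factor X are drawn at the GAS element upstream of +1.]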
Rameil et al., Oncogene, 2000
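CD25/IL-2Rα
[Figure: regulatory regions upstream and downstream of +1 and the signals acting on them.]
• Signals: TCR/CD3 (signal 1), CD28 (signal 2, target unknown "?"), IL-2R (signal 3)
• Regions: PRRI (-276~-244), PRRII (-137~-64), PRRIII (-3780~-3703), IL-2rE (+3389~+3596)
• Factors shown: Elf-1, NF-κB, SRF, HMG-I(Y), Stat5a,b, GATA-1-like, Ets-1/2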
DNase I Hypersensitive assay
DNase I Hypersensitive (DH) sites are created by the structure of chromatin
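Chromatin structure:
[Figure, panel A: Homo sapiens/Mus musculus CD25/IL-2Rα gene comparison.]
[Figure, panel B: DNase I hypersensitive sites. DH sites 1-4 detected with increasing DNase I in non-stimulated (ns), CD3, CD28, and CD3+CD28 conditions; size markers 2.0, 2.3, 4.4, 6.6, and 9.4 kb.]
[Figure, panel C: restriction map of the IL-2Rα locus (scale bar 1 kb; sites E, Bc, S, B, H, P, Bg) showing DH sites 1-4, PRRI+II, PRRIII, the probe, and the fragments ES, BcS, EB, BB, BH, 481, and 8942.]
[Figure: gene reporter assays, CAT induction fold (axis 1-26) for pTK3, pTK3/ES, pTK3/BcS, pTK3/EB, pTK3/BB, pTK3/BH, 481.IIR, 481.IIR/BcS, and 8942.IIR.]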
[Figure: luciferase activity (axis 0-15) for pGL3 and pGL3/PRRIV co-transfected with wild-type CD28 (CD28 wt) or the CD28 Δ30 mutant, with (+) or without (-) stimulation; sample labels 161, 172, 181, 192.]
CD28 can specifically induce PRRIV transcriptional activity
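[Figure: luciferase reporter constructs pGL3/PRRIV and pGL3/PRRIV(mCRE/TRE), with PRRIV upstream of pSV40 and luciferase; induction fold (luciferase, axis 1-13) after CD3, CD28, or CD3+CD28 stimulation. CRE/TRE and mCRE/TRE sequences shown: ACTCCTCTAGAATTAT, ACTCCTGACGAATTAT.]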
The CRE/TRE within PRRIV is essential for the response of PRRIV to TCR-CD3 and CD28 signals
Control regions of the IL2RA gene
[Figure: map of the control regions relative to +1 and exon 1, with the signals acting on them.]
• Signals: Ag/TCR/CD3 (signal 1), B7/CD28 (signal 2), IL-2/IL-2R (signal 3)
• Regions as labeled on this slide: PRRI (-276~-244), PRRII (-137~-64), PRRIII/IL-2rE (-3780~-3703), PRRV/IL-2rE (+3389~+3596), PRRIV/CD28rE (-8689~-8483), SBS700
• Factors shown: Elf-1, HMG-I(Y), Stat5a,b, GATA-1-like, Ets-1/2, NF-κB, SRF, CREB/ATF, AP-1, NFAT, SATB1
IV. The New Tools
Large-scale analysis of chromatin modifications and of interactions between genomes and transcription factors
IV.1. Computational Approaches
• Phylogenetic Footprinting
• Regulatory Network Modeling
IV.2. Experimental Approaches
• ChIP-on-chip, ChIP-SAGE, ChIP-seq
• Chromosome Conformation Capture
• Circular Chromosome Conformation Capture
• Chromosome Conformation Capture Carbon-Copy
Methodology used for an analysis
A. Chromosomal coordinates of interest in human
B. Retrieval of the nucleotide sequences (Galaxy, MultiZ)
C. Sequence alignment (ClustalX)
D. Visualization and expert review of the alignment (Seaview)
E. Phylogenetic reconstruction (Phylip NJ, PhyML)
Final output: visualization of TFBS conservation
Step 1: ECR Browser
• Chromosomal position in human
• Choice of species
Step 2: Mulan
Step 3: MultiTF
• All TF families in MultiTF
• Final graphical view
Overview of the steps carried out for the preliminary study
CD25/IL2RA Gene
ECR Genome Browser on Human Mar. 2006 Assembly
position/search: chr10:6,080,616-6,159,203
PRRIII/IL-2rE phylogeny
[Figure: phylogenetic tree (scale bar 0.02) of the PRRIII/IL-2rE region in 11 mammals: elephant, cow, horse, cat, dog, guinea pig, mouse, rat, macaque, chimpanzee, and human; support values shown on the branches: 63, 55, 93, 96, 54, 70, 40, 85. The Stat5, Stat5, and Elf-1 sites are drawn for each of the 11 species.]
Modules along the IL2RA locus, relative to the TSS:
• PRRVI [-8689, -8483]
• PRRV [-7664, -7566]
• A [-7160, -6997]
• PRRIII [-3780, -3703]
• PRRI and PRRII [-276, -64]
• PRRIV [+3389, +3596]
• B1 and B2: [+9468, +9507] and [+10956, +11139]
• C [+23969, +24023]
PRRVI/CD28RE phylogeny
[Figure: phylogenetic tree (scale bar 0.05) of the PRRVI/CD28RE region in 13 mammals: armadillo, elephant, tenrec, cow, horse, dog, rabbit, guinea pig, mouse, rat, macaque, chimpanzee, and human; support values shown on the branches: 45, 79, 63, 82, 49, 75, 44, 56, 68, 76. AP1 and/or CREB sites are annotated for seven of the species.]
Summary table of the conservation of functional and potential modules of the IL2RA gene in 15 mammalian species
[Table not recovered; its columns follow the module map: PRRVI, PRRV, A, PRRIII, PRRI, PRRII, PRRIV, B1, B2, C.]
IV.2. Experimental Approaches
• ChIP-on-chip, ChIP-SAGE, ChIP-seq
• Chromosome Conformation Capture
• Circular Chromosome Conformation Capture
• Chromosome Conformation Capture Carbon-Copy
Identification of the Genomic Targets of Specific Transcription Factors Using Chromatin Immunoprecipitation
• Gene-specific approaches (low throughput): gene-specific PCR; ChIP cloning & sequencing
• Genome-wide approaches (high throughput): SAGE-like (GMAT/SACO); ChIP-seq; microarray hybridization (ChIP-on-chip)
From T. Y. Roh, S. Cuddapah, and K. Zhao. Active chromatin domains are defined by acetylation islands revealed by genome-wide mapping. Genes Dev. 19 (5):542-552, 2005.
Histone H3 K9/K14 acetylation islands colocalize with all identified PRRs (including V and VI) within the IL2RA locus in resting human primary T cells and suggest the existence of other cis-acting elements not yet identified
The genome-wide distributions of K9/K14 di-acetylated histone H3 (H3Ac2), K4 trimethylated histone H3 (H3K4me3), and K27 trimethylated histone H3 (H3K27me3) in resting human T lymphocytes were mapped by Genome-wide MApping Technique (GMAT), a combination of chromatin immunoprecipitation and SAGE technique.
The level of the histone modification at a genetic locus is positively correlated with the detection frequency of a 21-bp sequence tag identified by the GMAT analysis. The detection frequency (y-axis) is plotted against the chromosome coordinate (x-axis).
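The plot described here is, in essence, a histogram of tag positions along the chromosome. The sketch below bins tag coordinates into fixed-size windows over a region; the window size, the function name, and the tag positions are hypothetical and serve only to illustrate the counting step, not the published GMAT pipeline.

```python
from collections import Counter


def tag_frequency(tag_positions, region_start, region_end, window):
    """Count tags per window; returns a list of (window start, tag count)."""
    counts = Counter()
    for pos in tag_positions:
        if region_start <= pos <= region_end:
            counts[(pos - region_start) // window] += 1
    n_windows = (region_end - region_start) // window + 1
    return [(region_start + i * window, counts.get(i, 0)) for i in range(n_windows)]


if __name__ == "__main__":
    # Hypothetical tag coordinates within a 1 kb region, for illustration only.
    tags = [100, 105, 250, 999, 1200]
    for start, count in tag_frequency(tags, 100, 1099, 500):
        print(f"window starting at {start}: {count} tag(s)")
```

Windows with many tags correspond to the "islands" of a given histone modification; the appropriate window size depends on the tag density of the experiment.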
Activating marks on histone H3:
- Di-acetylation on lysines 9 and 14 (H3K9acK14ac)
- Tri-methylation on lysines 4 (H3K4me3), 36, and 79
Repressive marks on histone H3:
- Tri-methylation on lysines 9 and 27 (H3K9me3, H3K27me3)
3C: Chromosome Conformation Capture
J. Dekker, K. Rippe, M. Dekker, and N. Kleckner. Capturing chromosome conformation. Science 295 (5558):1306-1311, 2002.
4C: Circular Chromosome Conformation Capture
Z. Zhao, G. Tavoosidana, M. Sjolinder, A. Gondor, P. Mariano, S. Wang, C. Kanduri, M. Lezcano, K. S. Sandhu, U. Singh, V. Pant, V. Tiwari, S. Kurukuti, and R. Ohlsson. Circular chromosome conformation capture (4C) uncovers extensive networks of epigenetically regulated intra- and interchromosomal interactions. Nat Genet 38 (11):1341-1347, 2006.
5C: Chromosome Conformation Capture Carbon Copy
J. Dostie, T. A. Richmond, R. A. Arnaout, R. R. Selzer, W. L. Lee, T. A. Honan, E. D. Rubio, A. Krumm, J. Lamb, C. Nusbaum, R. D. Green, and J. Dekker. Chromosome Conformation Capture Carbon Copy (5C): a massively parallel solution for mapping interactions between genomic elements. Genome Res. 16 (10):1299-1309, 2006.
• 3C requires prior knowledge of both interacting regions.
• 4C requires prior knowledge of only one interacting region.
Summary of activation-dependent looping events and a model of transcriptionally active chromatin.
Cai S, Lee CC, Kohwi-Shigematsu T. SATB1 packages densely looped, transcriptionally active chromatin for coordinated expression of cytokine genes.
Nat Genet.38:1278-88, 2006.
A few published examples of physical and functional evidence for nonallelic interactions between chromosomes:
• Interchromosomal interactions and olfactory receptor choice (Lomvardas et al., Cell 126:403-413, 2006).
• An LCR in IFN-γ locus associated with IL-4 locus on a different chromosome in committed naive T (Spilianakis et al., Nature, 435:637-645, 2005).
• The imprinting control region of the Igf2/H19 locus and the Wsb1/Nf1 gene (Ling et al., Science, 312:269-272, 2006).
A review: F. Savarese and R. Grosschedl. Blurring cis and trans in gene regulation. Cell 126 (2):248-250, 2006.
These nonallelic interchromosomal interactions appear relatively infrequent and transient, and their biological role is still somewhat unclear...
Overview
• The Basic Advantages/Disadvantages
• The Technologies at a glance
• The Application of massively parallel sequencing
From Elaine Mardis, Ph.D., Genome Sequencing Center, St Louis, MO, USA
Advantages of 2nd-Gen Platforms
• No sub-cloning, no use of E. coli as host.
  - cloning bias abolished
  - making libraries is more straightforward
• Each sequence is from a unique DNA molecule.
  - quantitation is possible through “counting”
  - enhanced dynamic range
  - detection of rare variants
• Provides exquisite resolution for many types of (input) experiments.
• Revolutionary (disruptive) improvements in cost and speed of data generation.
• Requires (much) less automation at front end.
From Elaine Mardis, Ph.D., Genome Sequencing Center, St Louis, MO, USA
Disadvantages of Next-Gen Platforms
• Shorter read length sequences are produced.
  - relative to capillary sequencers
  - re-parameterization of base calling accuracy
  - challenges bioinformatics-based analyses
• File sizes traumatize IT infrastructures.
  - (up to) several Tb of raw data are produced per run
  - read processing pipelines require off-instrument CPU
  - decision of what to save vs re-run
• Instrument amortization paradigm shift.
• Require (much) less automation at front end.
From Elaine Mardis, Ph.D., Genome Sequencing Center, St Louis, MO, USA
NGS: Next Generation Sequencing
Target for 2013: 1X coverage of the human genome at $1,000

Gold standard until 2006:
• Sanger sequencing with ABI 3730XL x1 Genetic Analyzer (1 Kb read length, 2.1 Mbp per day; 6x coverage of 1 human genome, 18 Gb = ~18 years)

NGS since 2006:
• Solexa 2G: ~20-30 Gb/run, ~3-9 days run length, ~500 Mb/day, ~50-75 bp read length, 99.99% accuracy
• ABI SOLiD3: ~20-30 Gb/run, ~4-10 days run length, ~500 Mb/day, ~50-75 bp read length, 99.94% accuracy
• 454/Roche GS FLX: ~100 MB/run, ~7.5 hours run length, ~400 Mb/week, ~400 bp read length, 99.5% accuracy
• Heliscope: ~1 Gb/hour, ~25-45 bp read length, 99% accuracy

In development:
• Oxford Nanopore Technologies: single molecule sequencing, ~nGB/Run?, ~2-3 Kb read length, 99.8% accuracy
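The time-to-genome comparison behind these figures is a one-line calculation: bases required (coverage times genome size) divided by daily throughput. The sketch below applies it to the throughputs quoted above, assuming a 3 Gb haploid human genome; the function name and that genome size are assumptions of this sketch. With the numbers as printed, the Sanger estimate comes out at roughly 23 years, the same order as the ~18 years quoted, which presumably rests on a slightly different daily throughput.

```python
GENOME_SIZE_BP = 3e9  # assumed haploid human genome size


def days_to_coverage(throughput_bp_per_day: float, coverage: float,
                     genome_size_bp: float = GENOME_SIZE_BP) -> float:
    """Instrument-days needed to reach the requested fold coverage."""
    return coverage * genome_size_bp / throughput_bp_per_day


if __name__ == "__main__":
    # Daily throughputs as quoted on the slide (454 converted from ~400 Mb/week).
    platforms = {
        "Sanger ABI 3730XL": 2.1e6,
        "Solexa 2G": 500e6,
        "ABI SOLiD3": 500e6,
        "454/Roche GS FLX": 400e6 / 7,
    }
    for name, per_day in platforms.items():
        days = days_to_coverage(per_day, coverage=6)
        print(f"{name}: {days:,.0f} days ({days / 365:.1f} years) for 6x coverage")
```

The calculation ignores run set-up time, failed runs, and the extra coverage that short reads need in practice, so it is a lower bound rather than a schedule.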
ABI Next Gen Sequencing: SOLiD
SOLiD: Sequencing by Oligonucleotide Ligation and Detection
The SOLiD™ System is a genetic analyzer capable of massively parallel sequencing.
The SOLiD™ 3 System
Applied Biosystems SOLiD
• custom adapter library
• emPCR on magnetic beads
• sequencing by ligation using fluorescent probes from a common primer
• sequential rounds of ligation from a series of primers
• fixed/known nucleotides for each probe set identify two bases per sequence read, for “two base encoding”
Platform Statistics

Current | 454 Frags | 454 Pairs | Solexa Frags | Solexa Pairs | SOLiD Frags | SOLiD Pairs
Read Length | 250 bases | 250 bases | 36 bp | 36 bp | 25-35 bp | 2 x 25 bp
Days Per Run | 0.3125 | 0.3125 | 3 | 6 | 7 | 10
Number of Reads | 400,000 | 400,000 | 60 Million | 60 Million | 80 Million | 80 Million
Gb Per Run (Filtered) | 100 MB | 100 MB | >1 Gbp | >2 Gbp | 3 Gb | 6 Gb
Average Insert Size | - | 2-3 kb | - | 200 bp | - | 3 kb

Improvements | 454 Frags | 454 Pairs | Solexa Frags | Solexa Pairs | SOLiD Frags | SOLiD Pairs
Read Length | 400 bases | 400 bases | 36 - 50 bp | 36 - 50 bp | 25 - 35 bp | 2 x 25 bp
Days Per Run | 0.42 | 0.42 | 2.5 | 5 | 5 | 10
Number of Reads | 400,000 | 400,000 | >90 Million | >90 Million | 80 Million | 80 Million
Gb Per Run (Filtered) | 100 MB | 100 MB | >1.5 Gbp | >3 Gbp | 4 Gb | 8 Gb
Average Insert Size | - | 2-3 kb | - | 200 bp | - | 3 kb
Main NGS applications
• Human gene mapping
• Qualitative (SNP) and quantitative (amplification) genetic variations
• De novo sequencing of model organisms and pathogens
• Sequencing complex mixtures of microbial populations (gastrointestinal tract or water monitoring)
• Mapping of epigenetic marks and identification of sequences regulating gene expression (ChIP-seq)
• Identification and analysis of non-coding RNAs (miRNA, etc.)
• Monitoring gene expression, covering all the alternative transcripts of a given locus, in a variety of contexts
Chromatin ImmunoPrecipitation (ChIP)-seq
• genome-wide identification of protein binding sites
• transcription factor binding sites can indicate genes activated for transcription
• repressor binding sites can indicate genes repressed from transcription
• histone binding also can identify sequences available/not available for transcription
• co-investigation of transcribed genes can provide correlative data
[Figure (Yijun Ruan): paired-end tag (PET) applications, illustrated at chr21:42,653,000-42,673,000 (TFF1, TMPRSS3) and at chr1:2,466,948-2,497,767 (TNFRSF14; MCF7 genome inversion compared with the hg18 reference genome). GIS-PET: transcriptome, gene discovery, fusion transcripts. ChIP-PET: TF binding sites, epigenetic sites. SOLiD-PET: genome structural variants (inversion, insertion, deletion, translocation) and genome assembly. ChIA-PET: long-range chromatin interactions. The schematic also labels the chromosome, euchromatin, heterochromatin, chromatin loops, nucleosomes, histone modifications, SNPs, the tss and alternative tss, the RNAPII complex, and mRNA.]
Example of ChIP-seq (1)
N. D. Heintzman, R. K. Stuart, G. Hon, Y. Fu, C. W. Ching, R. D. Hawkins, L. O. Barrera, S. Van Calcar, C. Qu, K. A. Ching, W. Wang, Z. Weng, R. D. Green, G. E. Crawford, and B. Ren. Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39 (3):311-318, 2007.
Example of ChIP-seq (2)
A. Barski, S. Cuddapah, K. Cui, T. Y. Roh, D. E. Schones, Z. Wang, G. Wei, I. Chepelev, and K. Zhao. High-resolution profiling of histone methylations in the human genome. Cell 129 (4):823-837, 2007.
Z. Wang, C. Zang, J. A. Rosenfeld, D. E. Schones, A. Barski, S. Cuddapah, K. Cui, T. Y. Roh, W. Peng, M. Q. Zhang, and K. Zhao. Combinatorial patterns of histone acetylations and methylations in the human genome. Nat Genet. 40 (7):897-903, 2008.
Active gene promoters are characterized by having active modification marks both surrounding and downstream of the TSSs
From Wang,Z., Schones,D.E., and Zhao,K. (2009). Characterization of human epigenomes. Curr. Opin. Genet. Dev.
The Paired End diTag (PET) strategy for sequencing
• Dimerized PET sequencing: 454 GS FLX
• Single PET sequencing: Solexa GAII or ABI SOLiD II
• Concatenated PET sequencing: ABI 3730
• Final step: mapping PETs to the reference genome
Engineered α-hemolysin protein (shown in blue) is introduced into a planar lipid bilayer, which acts as an artificial biological membrane.
The lipid bilayer has a high electrical resistance and so when an electrical potential is applied across this membrane, a current flows only through the nanopore, carried by the ions in salt solutions that bathe both sides of the bilayer.
The lipid bilayer and nanopore are placed in a well that contains two electrodes on either side of the bilayer.
DNA sample is introduced into the top layer. As the exonuclease (shown here in green) directs individual DNA bases, in sequence, through the nanopore, each base transiently binds at the binding site (cyclodextrin, shown here in red).
During the binding event, the current through the nanopore is disturbed, creating a characteristic signal for each type of base. The signal for each base can be easily distinguished.
The electrical current trace provides a record of the sequence of bases passing through the nanopore.
TGS: Third-Generation Sequencing (1)
From Oxford Nanopore Technologies 2009 (http://www.nanoporetech.com/)
TGS: Third-Generation Sequencing (2)
To achieve high-throughput sequencing, this system will be run in parallel in an array chip.
Multiple microwells are arrayed on silicon, with individual nanopore sequencing units in each well.
Fragmented DNA is sequenced in parallel in multiple wells. Long read lengths are possible.
Data gathered from each well is combined for data reassembly.
From Oxford Nanopore Technologies 2009 (http://www.nanoporetech.com/)
TGS: Third-Generation Sequencing (3)
The instrumentation required to operate the array chip and record the resulting electrical signals does not require optics.
Direct electrical detection and potential long read lengths promise simpler bioinformatics.
From Oxford Nanopore Technologies 2009 (http://www.nanoporetech.com/)
TGS: Third-Generation Sequencing (4)
α-hemolysin nanopore (ribbon diagram) with covalently attached cyclodextrin (teal blue) transiently binds a base (red) traversing the pore.
From Clarke et al. Nat. Nanotechnol. 4:265-270, 2009
TGS: Third-Generation Sequencing (5)
Structures of haemolysin mutants
From Clarke et al. Nat. Nanotechnol. 4:265-270, 2009
TGS: Third-Generation Sequencing (6)
Nucleotide event distributions
Challenges of short read re-sequencing
• Many short reads cannot be uniquely aligned because they map to multiple regions in the genome.
• RepeatMasker does not identify many of these 30-50 bp “micro-repeats”.
• The size and complexity of the human genome require extra caution to ensure that variant-containing reads are accurately placed and that multiply-placed reads are not further considered.
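The mappability problem can be illustrated on a toy sequence: a read of length k can be placed uniquely only if its k-mer occurs once in the reference. The sketch below computes the fraction of positions whose k-mer is unique. It considers the forward strand only and uses a made-up 10-nt "genome", both simplifications chosen for illustration; the function name is hypothetical.

```python
from collections import Counter


def unique_kmer_fraction(genome: str, k: int) -> float:
    """Fraction of start positions whose k-mer occurs exactly once (forward strand)."""
    if k <= 0 or k > len(genome):
        raise ValueError("k must be between 1 and the genome length")
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    unique = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique / len(kmers)


if __name__ == "__main__":
    toy_genome = "ACGTACGTTT"  # made-up sequence containing a short repeat
    for k in (4, 6):
        print(f"k={k}: {unique_kmer_fraction(toy_genome, k):.2f} of positions are unique")
```

Longer reads resolve the repeat in this toy case, which mirrors the general point: the shorter the read, the larger the share of the genome in which it cannot be placed unambiguously.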
From LIMS to AIMS
• Laboratory Information Management System
  - Track samples
  - Track laboratory processes in the database
  - Generate reports
• All Information Management System
  - Track analysis
  - Track disk space
  - Schedule batch processes
• The sheer number and variety of programs, command line options, work flows, versions, platforms, runs, etc. make it infeasible, even undesirable, to settle on a single solution at present.
• As such, we are tracking everything in a detailed way.
Conclusions
• Second-generation sequencing can address human genomes either by whole-genome sequencing or by targeted approaches. The cost gap between these two approaches is narrowing. It will most probably be closed shortly by nano-sequencing, the third generation of sequencing.
• Many challenges remain for accurate bioinformatics-based analysis pipelines that map reads, discover mutations and indels, and correlate data across samples.
Planned NGS Applications at the Hotel Express Transcriptome platform
http://tagc.univ-mrs.fr
IBiSA Platform - Inserm
From Myers and Wold, Nat Methods, 5:19-21, 2008
Matrix attachment region (MAR)/scaffold attachment region (SAR): DNA sequence that binds the nuclear scaffold and can affect transcription. These elements form higher-order looped structures within chromosomes and influence gene expression by separating chromosomes into regulatory domains.
Silencer: Control element that suppresses gene expression independent of orientation or distance.
Insulator (also boundary element): Insulator elements affect gene expression by preventing the spread of heterochromatin and by restricting transcriptional enhancers from activating unrelated promoters. In vertebrates, insulator function requires association with the CCCTC-binding factor (CTCF), a protein that recognizes long and diverse nucleotide sequences.
Locus control region (LCR): Confers tissue-specific temporally regulated expression of linked genes. LCRs function independently of position, but they are copy number dependent and open the nucleosome structure so that other factors can bind. LCRs affect replication timing and origin usage.
Enhancer: Control element that elevates the levels of transcription from a promoter, independent of orientation or distance.
Promoter: Sequence of DNA near the 5' end of a gene that acts as a binding site for RNA polymerase and from which transcription is initiated.
cis-acting regulatory elements
Components of Eukaryotic Promoters and Regulatory Regions
• Site selector elements: TATA-box, Initiator
• Common upstream elements: CCAAT-box, GC-box
• Regulatory elements: HSE, SRE, GRE, etc.
• Enhancers / Silencers
• Locus control regions (LCRs)
• Scaffold / Matrix attachment sites (SARs / MARs)
• Insulator (CTCF)
• CpG islands
Promoter Regulatory Elements: Features and Facts
• Degenerate sequence motifs
• Length: 6 to 20 bp
• Low complexity (8-12 bits)
• Binding sites of transcription factors
• Excess of binding sites over binding proteins in the nucleus
• Most in vitro binding sites are not functional in vivo
• Some in vivo binding sites are also not functional
• Regulatory potential depends on cooperative effects between multiple elements
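As an illustration of the "8-12 bits" figure, here is a minimal sketch that computes the Shannon information content of a motif from per-position base counts. It assumes that the slide's "bits" refers to information content summed over motif positions with a uniform background (2 bits maximum per position); the function names and the example counts are hypothetical, and no small-sample correction is applied.

```python
import math

def column_information(counts):
    """Information content (bits) of one motif position.

    counts: dict mapping each base (A, C, G, T) to its observed count.
    Assumes a uniform background, so a fully conserved position gives 2 bits.
    """
    total = sum(counts.values())
    information = 2.0  # log2(4) for a four-letter alphabet
    for count in counts.values():
        if count > 0:
            p = count / total
            information += p * math.log2(p)  # subtracts the position entropy
    return information

def motif_information(columns):
    """Total information content of a motif: the sum over its positions."""
    return sum(column_information(column) for column in columns)

# Hypothetical 6-position motif: four conserved positions, two unconstrained ones.
conserved = {"A": 20, "C": 0, "G": 0, "T": 0}
unconstrained = {"A": 5, "C": 5, "G": 5, "T": 5}
example_motif = [conserved] * 4 + [unconstrained] * 2
print(motif_information(example_motif))
```

Under these assumptions, a motif in the 8-12 bit range is expected to match by chance roughly once every 2^8 to 2^12 positions of random sequence, which is consistent with the excess of binding sites over binding proteins noted above.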
Strategies for the Design of Microarrays for the Human Genome
• Single gene or selected regions
  Overlapping PCR products: Horak et al. (2002) PNAS 99:2924-2929
  Tiling oligonucleotides: Cawley et al. (2004) Cell 116:499-509
• Gene collection (ex. Refseq)
  1 kb PCR products: Blais et al. (2005) Genes & Dev 19:1-17
  60-mer covering -8 kb, +2 kb for 17,917 annotated human genes: Boyer et al. (2005) Cell 122:947-956
• CpG dinucleotides are present at 20% of their predicted frequency
• CpG islands: >200 bp long, >50% G+C, observed/predicted CpG ratio >0.6
• CpG islands account for 1% of the genome
• 29,000 CpG islands are predicted in the human genome
• ~60% of known genes have a CpG island near the 5' end
• CpG island microarrays are promoter- and regulatory region-enriched arrays
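The CpG island criteria listed above (>200 bp, >50% G+C, CpG above 0.6 of the predicted frequency) can be sketched as a simple test on a DNA sequence. This is a minimal illustration, not a genome-scale island finder: it assumes the predicted CpG count is (number of C × number of G) / length, and the function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G+C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def cpg_observed_over_predicted(seq):
    """Observed CpG count divided by the count predicted from base composition.

    Assumption: predicted count = (#C * #G) / sequence length.
    """
    seq = seq.upper()
    predicted = seq.count("C") * seq.count("G") / len(seq)
    if predicted == 0:
        return 0.0
    return seq.count("CG") / predicted

def looks_like_cpg_island(seq, min_length=200, min_gc=0.5, min_ratio=0.6):
    """Apply the three thresholds from the slide to a single sequence."""
    if len(seq) <= min_length:
        return False
    return gc_fraction(seq) > min_gc and cpg_observed_over_predicted(seq) > min_ratio

# Toy sequences (not real genomic data).
print(looks_like_cpg_island("CG" * 150))  # CpG-rich, 300 bp
print(looks_like_cpg_island("AT" * 150))  # AT-only, 300 bp
```

In practice the same test would be applied in sliding windows along a chromosome; the thresholds are kept as parameters because published definitions differ slightly.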
CpG islands: Weinmann et al. (2002) Genes & Dev. 16:235-244; Oberley et al. (2004) Methods Enzymol. 376:315-334
From Rémi Houlgatte – Inserm UMR915, Nantes, France
From Boyer, L. A., T. I. Lee, M. F. Cole, S. E. Johnstone, S. S. Levine, J. P. Zucker, M. G. Guenther, R. M. Kumar, H. L. Murray, R. G. Jenner, D. K. Gifford, D. A. Melton, R. Jaenisch, and R. A. Young. 2005. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122:947-956.
From T. Y. Roh, W. C. Ngau, K. Cui, D. Landsman, and K. Zhao. High-resolution genome-wide mapping of histone modifications. Nat Biotechnol. 22 (8):1013-1016, 2004.
Genome-Wide Mapping Technique (GMAT) or Serial Analysis of Chromatin Occupancy (SACO)
RNA Sequencing
[Workflow figure: RNA isolate; size selection for ncRNA classes / fragmentation, RT with random primers / polyA priming, RT, ds DNA, SAGE tags; adapter-ligated fragments for 2nd-gen sequencing; alignment to reference database & discovery]
Proof of concept in yeast:
Ren, B., F. Robert, J. J. Wyrick, O. Aparicio, E. G. Jennings, I. Simon, J. Zeitlinger, J. Schreiber, N. Hannett, E. Kanin, T. L. Volkert, C. J. Wilson, S. P. Bell, and R. A. Young. 2000. Genome-wide location and function of DNA binding proteins. Science 290:2306-2309.
Wells, J., K. E. Boyd, C. J. Fry, S. M. Bartley, and P. J. Farnham. 2000. Target gene specificity of E2F and pocket protein family members in living cells. Mol. Cell. Biol. 20:5797-5807.
Studies in mammals (including Homo sapiens):
Horak, C. E., M. C. Mahajan, N. M. Luscombe, M. Gerstein, S. M. Weissman, and M. Snyder. 2002. GATA-1 binding sites mapped in the beta-globin locus by using mammalian chIp-chip analysis. Proc. Natl. Acad. Sci. U. S. A. 99:2924-2929.
Weinmann, A. S., P. S. Yan, M. J. Oberley, T. H. Huang, and P. J. Farnham. 2002. Isolating human transcription factor targets by coupling chromatin immunoprecipitation and CpG island microarray analysis. Genes Dev. 16:235-244.
Kirmizis, A., S. M. Bartley, A. Kuzmichev, R. Margueron, D. Reinberg, R. Green, and P. J. Farnham. 2004. Silencing of human polycomb target genes is associated with methylation of histone H3 Lys 27. Genes Dev. 18:1592-1605.
Heisler, L. E., D. Torti, P. C. Boutros, J. Watson, C. Chan, N. Winegarden, M. Takahashi, P. Yau, T. H. Huang, P. J. Farnham, I. Jurisica, J. R. Woodgett, R. Bremner, L. Z. Penn, and S. D. Der. 2005. CpG Island microarray probe sequences derived from a physical library are representative of CpG Islands annotated on the human genome. Nucleic Acids Res. 33:2952-2961.
Kim, T. H., L. O. Barrera, M. Zheng, C. Qu, M. A. Singer, T. A. Richmond, Y. Wu, R. D. Green, and B. Ren. 2005. A high-resolution map of active promoters in the human genome. Nature 436:876-880.
Transcriptional regulatory networks:
Yeast: Lee, T. I., N. J. Rinaldi, F. Robert, D. T. Odom, Z. Bar-Joseph, G. K. Gerber, N. M. Hannett, C. T. Harbison, C. M. Thompson, I. Simon, J. Zeitlinger, E. G. Jennings, H. L. Murray, D. B. Gordon, B. Ren, J. J. Wyrick, J. B. Tagne, T. L. Volkert, E. Fraenkel, D. K. Gifford, and R. A. Young. 2002. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298:799-804.
Homo sapiens: Boyer, L. A., T. I. Lee, M. F. Cole, S. E. Johnstone, S. S. Levine, J. P. Zucker, M. G. Guenther, R. M. Kumar, H. L. Murray, R. G. Jenner, D. K. Gifford, D. A. Melton, R. Jaenisch, and R. A. Young. 2005. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122:947-956.
Reviews and methods:
Oberley, M. J., J. Tsao, P. Yau, and P. J. Farnham. 2004. High-throughput screening of chromatin immunoprecipitates using CpG-island microarrays. Methods Enzymol. 376:315-334.
Ren, B. and B. D. Dynlacht. 2004. Use of chromatin immunoprecipitation assays in genome-wide location analysis of mammalian transcription factors. Methods Enzymol. 376:304-315.
Bioinformatics of promoters and regulatory regions:
Liu, Y., L. Wei, S. Batzoglou, D. L. Brutlag, J. S. Liu, and X. S. Liu. 2004. A suite of web-based programs to search for transcriptional regulatory motifs. Nucleic Acids Res. 32:W204-W207.
Some references...
Some references
Kiriakidou, M., P. T. Nelson, A. Kouranov, P. Fitziev, C. Bouyioukos, Z. Mourelatos, and A. Hatzigeorgiou. 2004. A combined computational-experimental approach predicts human microRNA targets. Genes Dev. 18:1165-1178.
Barski, A., S. Cuddapah, K. Cui, T. Y. Roh, D. E. Schones, Z. Wang, G. Wei, I. Chepelev, and K. Zhao. 2007. High-resolution profiling of histone methylations in the human genome. Cell 129:823-837.
Dahl, F., J. Stenberg, S. Fredriksson, K. Welch, M. Zhang, M. Nilsson, D. Bicknell, W. F. Bodmer, R. W. Davis, and H. Ji. 2007. Multigene amplification and massively parallel sequencing for cancer mutation discovery. Proc. Natl. Acad. Sci. U. S. A. 104:9387-9392.
Euskirchen, G. M., J. S. Rozowsky, C. L. Wei, W. H. Lee, Z. D. Zhang, S. Hartman, O. Emanuelsson, V. Stolc, S. Weissman, M. B. Gerstein, Y. Ruan, and M. Snyder. 2007. Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies. Genome Res. 17:898-909.
Johnson, D. S., A. Mortazavi, R. M. Myers, and B. Wold. 2007. Genome-wide mapping of in vivo protein-DNA interactions. Science 316:1497-1502.
Lin, C. Y., V. B. Vega, J. S. Thomsen, T. Zhang, S. L. Kong, M. Xie, K. P. Chiu, L. Lipovich, D. H. Barnett, F. Stossi, A. Yeo, J. George, V. A. Kuznetsov, Y. K. Lee, T. H. Charn, N. Palanisamy, L. D. Miller, E. Cheung, B. S. Katzenellenbogen, Y. Ruan, G. Bourque, C. L. Wei, and E. T. Liu. 2007. Whole-genome cartography of estrogen receptor alpha binding sites. PLoS Genet. 3:e87.
Lu, C., B. C. Meyers, and P. J. Green. 2007. Construction of small RNA cDNA libraries for deep sequencing. Methods 43:110-117.
Mardis, E. R. 2007. ChIP-seq: welcome to the new frontier. Nat. Methods 4:613-614.
Porreca, G. J., K. Zhang, J. B. Li, B. Xie, D. Austin, S. L. Vassallo, E. M. LeProust, B. J. Peck, C. J. Emig, F. Dahl, Y. Gao, G. M. Church, and J. Shendure. 2007. Multiplex amplification of large sets of human exons. Nat. Methods 4:931-936.
Schmid, C. D. and P. Bucher. 2007. ChIP-Seq data reveal nucleosome architecture of human promoters. Cell 131:831-832.
Zhao, X. D., X. Han, J. L. Chew, J. Liu, K. P. Chiu, A. Choo, Y. L. Orlov, W. K. Sung, A. Shahab, V. A. Kuznetsov, G. Bourque, S. Oh, Y. Ruan, H. H. Ng, and C. L. Wei. 2007. Whole-genome mapping of histone h3 lys4 and 27 trimethylations reveals distinct genomic compartments in human embryonic stem cells. Cell Stem Cell 1:286-298.
Chi, K. R. 2008. The year of sequencing. Nat. Methods 5:11-14.
Mardis, E. R. 2008. The impact of next-generation sequencing technology on genetics. Trends Genet. 24:133-141.
Pop, M. and S. L. Salzberg. 2008. Bioinformatics challenges of new sequencing technology. Trends Genet. 24:142-149.
Schones, D. E., K. Cui, S. Cuddapah, T. Y. Roh, A. Barski, Z. Wang, G. Wei, and K. Zhao. 2008. Dynamic regulation of nucleosome positioning in the human genome. Cell 132:887-898.
Schuster, S. C. 2008. Next-generation sequencing transforms today's biology. Nat. Methods 5:16-18.
Shendure, J. A., G. J. Porreca, and G. M. Church. 2008. Overview of DNA sequencing strategies. Curr. Protoc. Mol. Biol. Chapter 7:Unit.
von Bubnoff, A. 2008. Next-generation sequencing: the race is on. Cell 132:721-723.
Wold, B. and R. M. Myers. 2008. Sequence census methods for functional genomics. Nat. Methods 5:19-21.
Regulatory region databases
PromoSer: The mammalian promoter service: http://biowulf.bu.edu/zlab/PromoSer
ORegAnno Open Regulatory Annotation: http://www.oreganno.org
PAZAR: A public database of transcription factor and regulatory sequence annotation: http://www.pazar.info
Regulatory region analysis
PIPMaker: http://pipmaker.bx.psu.edu/pipmaker
VISTA Tools: mVISTA and rVISTA: http://genome.lbl.gov/vista/
DCODE.org Comparative Genomics Center: Comparing genomes to decipher the code of gene regulation: http://www.dcode.org
Genomatix software GmbH: http://www.genomatix.de
General databases and tools
UCSC Genome Browser: http://genome.ucsc.edu
Galaxy website: http://www.bx.psu.edu/cgi-bin/trac.cgi
IdConvert: http://idconverter.bioinfo.cnio.es
The SOLiD-PET Approach
[Workflow figure: cancer genomic DNA; DNA fractionation (1 kb, 5 kb, 10 kb, 20 kb; size ladder 20, 10, 5, 2, 1, 0.5 kb); circularization; EcoP15I cut; purification & PCR; SOLiD sequencing; mapping PETs to reference genome]
[PET mapping figure: paired-end tags (PETs) aligned to the reference genome sequence]
Yijun RUAN
Characteristics of the SOLiD™ System: Scalability, Throughput & Flexibility
• Scalability: (1) # of samples per slide; (2) sample multiplexing (20 barcodes)
• Throughput: # of beads per slide (increasing bead density, from 55,000 to 120,000 beads / panel); # of slides
• Flexibility: 2 independent flow cells
SOLiD™ Sequencing Workflow: Open, Flexible & Standardized
• Sample collection: your DNA or RNA sample collection and purification method
• Library construction: experiment-specific prep, fragmentation
• ePCR & deposition: emulsion PCR, deposition
• Sequencing reaction: imaging
• Data analysis: image analysis, color calling, alignment, results
Applications: ChIP Seq; CGH Seq (CNV); Methylation Studies; Meta-Genomics; Whole Transcriptome; Small RNA Profiling; Single Cell Transcriptome; 5'-SAGE / 3'-SAGE / CAGE; Whole Genome Re-sequencing; Targeted Re-sequencing; Deep Sequencing; De Novo Sequencing
SOLiD, Applied Biosystems
Examples for Coverage, Throughput & Multiplexing
Estimation of # of samples based on current throughput (10 Gb/slide) and # of tags
• 1 compartment (10 Gb / 200 M beads): human whole genome (5 kb mate-pair library, 2x50 bp, ~3x coverage) OR whole transcriptome (50 bp fragment library, 1-4 samples)
• 4 compartments (8 Gb / 160 M beads; 40 M mapped tags / compartment):
  Bacterial Whole GEX Profiling (Gene Exp): Sample 1 - Normal
  Small RNA Profiling (Gene Exp): Sample 2 - Tumor
  Deep Re-Sequencing (e.g. 100 kb target, 1,000X coverage/sample): Samples 3 - 13 (pool)
  Whole Microbial Genome Sequencing (5 Mb, 100X coverage): Samples 14 - 18
  Figure annotations: 1-2M tags per sample; e.g. ~10 Mb; 10M tags
• 8 compartments (6.6 Gb / 132 M beads; 16.5 M mapped tags / compartment): ChIP 1-8; ChIP 9-16; 3'-SAGE profiling 1-8; CAGE profiling 9-16; SAGE profiling 11-20; bacterial re-sequencing (4 Mb, 100x); SNP discovery (10 Mb, 28x); global methylation, sample A; global methylation, sample B
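The coverage figures in these examples follow from simple arithmetic: fold coverage is the number of sequenced bases divided by the target size. The sketch below reproduces the "~3x" human genome estimate and the sequencing budget needed for a 5 Mb microbial genome at 100X. The haploid human genome size of 3.0 Gb is an approximation introduced here, and the function names are hypothetical.

```python
def fold_coverage(sequenced_bases, target_size_bp):
    """Average fold coverage: sequenced bases divided by target size."""
    return sequenced_bases / target_size_bp

def bases_required(target_size_bp, coverage):
    """Bases that must be sequenced to reach a given fold coverage."""
    return target_size_bp * coverage

def samples_per_budget(budget_bases, target_size_bp, coverage):
    """How many samples of one target fit into a sequencing budget."""
    return int(budget_bases // bases_required(target_size_bp, coverage))

HUMAN_GENOME_BP = 3.0e9  # assumption: approximate haploid human genome size

# One slide at 10 Gb on a human genome gives roughly 3x coverage.
print(round(fold_coverage(10e9, HUMAN_GENOME_BP), 1))

# A 5 Mb microbial genome at 100X needs 0.5 Gb of sequence.
print(bases_required(5e6, 100))

# One of 4 compartments on an 8 Gb slide holds 2 Gb of sequence.
print(samples_per_budget(8e9 / 4, 5e6, 100))
```

These are upper bounds on raw yield: they ignore unmappable tags, uneven coverage, and barcode overhead, so real experiments would plan with some margin.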
Developing Solutions that Take You from Sample to Results
(Application: Sample Prep / Library Prep / Sequencing / Analysis)
ChIP
• Sample Prep: NA
• Library Prep: Small Sample Protocol
• Sequencing: SOLiD™ Analyzer; SOLiD™ Fragment Library Sequencing Kit
• Analysis: Third Party Tools: Softgenetics (NextGENe)
Resequencing
• Sample Prep: BloodPrep® DNA Chemistry; Agilent Array Enrichment; LR PCR
• Library Prep: SOLiD™ Mate-Paired Library Oligos Kit; Sample Multiplex Analysis
• Sequencing: SOLiD™ Analyzer; SOLiD™ Mate-Paired Library Sequencing Kit
• Analysis: AB tools: Corona, Map. GFF, SRF
de Novo
• Sample Prep: BloodPrep® DNA Chemistry
• Library Prep: SOLiD™ Mate-Paired Library Oligos Kit
• Sequencing: SOLiD™ Analyzer; SOLiD™ Mate-Paired Library Sequencing Kit
• Analysis: Academic Tools: Shrimp, Velvet; Third Party Tools: Softgenetics (NextGENe)
Whole Transcriptome
• Sample Prep: MagMax™ Total RNA Isolation Kits
• Library Prep: SOLiD™ Whole Transcriptome Analysis kit (coming soon)
• Sequencing: SOLiD™ Analyzer; SOLiD™ Mate-Paired Library Sequencing Kit
• Analysis: AB Tools: WT Analysis tools (coming soon)
Small RNA
• Sample Prep: mirVana™ miRNA Isolation Kit
• Library Prep: SOLiD™ Small RNA Expression kit; Sample Multiplex Analysis
• Sequencing: SOLiD™ Analyzer; SOLiD™ Fragment Library Sequencing Kit
• Analysis: AB tools: RNA2 Map. GFF, SRF; Third Party tools: InterRNA
SOLiD™ 3 System Specification Summary
Throughput / Run
• Fragment: 10-15 GB / 200-300 M tags (mappable)
• Mate-paired: 20-30 GB / 400-600 M tags (mappable)
Accuracy
• Overall sequence 99.94%
• Consensus 99.999% @15X
Run Time
• 35 bp Fragment 3.5 days, 50 bp Fragment 5-6 days
• 2 x 50 bp Mate-Pair Run 12-14 days
Applications (Protocols, kits)
• Whole Genome and Targeted Resequencing, Gene Expression, Transcriptome Analysis, ChIP Seq, Methylation Analysis, Structural Variation and de novo sequencing
Read Length
• Fragment: up to 50 bases (R&D demonstrated: 75 bp)
• Mate-paired: up to 2x50 bases (insert sizes: 0.6-10 kb)
Scalable / Flexible
• Two independent flow cells process two slides per run
• Open slide format with 1-8 samples / slide
• 20 barcodes: 160 samples / slide, 320 samples / run
Service and Support
• 800 Dedicated Service and Support
• Large SOLiD™ Support Team incl. Bioinformatics Spec.
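The throughput and multiplexing figures in this summary are mutually consistent, which can be checked with a short calculation. The sketch assumes that each "tag" corresponds to one 50-base read (so a mate-paired run counts both reads as tags) and that the "GB" in the slide means gigabases; the function names are hypothetical.

```python
def run_yield_bases(mappable_tags, read_length_bp):
    """Total sequenced bases: number of tags times bases read per tag."""
    return mappable_tags * read_length_bp

def samples_per_run(barcodes, samples_per_slide, slides_per_run):
    """Maximum multiplexed samples per run with barcoding."""
    return barcodes * samples_per_slide * slides_per_run

# Fragment run: 200-300 M tags of 50 bases give 10-15 gigabases.
print(run_yield_bases(200_000_000, 50), run_yield_bases(300_000_000, 50))

# Mate-paired run: 400-600 M tags of 50 bases give 20-30 gigabases.
print(run_yield_bases(400_000_000, 50), run_yield_bases(600_000_000, 50))

# 20 barcodes x 8 samples per slide = 160 per slide; 2 slides per run = 320.
print(samples_per_run(20, 8, 1), samples_per_run(20, 8, 2))
```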
Selected Genome-Wide Studies (NGS, etc.) 2006-2009
Hawkins,R.D. and Ren,B. (2006). Genome-wide location analysis: insights on transcriptional regulation. Hum. Mol. Genet 15 Spec No 1, R1-R7.
Kim,T.H. and Ren,B. (2006). Genome-Wide Analysis of Protein-DNA Interactions. Annu. Rev Genomics Hum. Genet 7, 81-102.
Kim,T.H. and Ren,B. (2006). An all-round view of eukaryotic transcription. Genome Biol. 7, 323.
Loh,Y.H., Wu,Q., Chew,J.L., Vega,V.B., Zhang,W., Chen,X., Bourque,G., George,J., Leong,B., Liu,J., Wong,K.Y., Sung,K.W., Lee,C.W., Zhao,X.D., Chiu,K.P., Lipovich,L., Kuznetsov,V.A., Robson,P., Stanton,L.W., Wei,C.L., Ruan,Y., Lim,B., and Ng,H.H. (2006). The Oct4 and Nanog transcription network regulates pluripotency in mouse embryonic stem cells. Nat Genet 38, 431-440.
Roh,T.Y., Cuddapah,S., Cui,K., and Zhao,K. (2006). The genomic landscape of histone modifications in human T cells. Proceedings of the National Academy of Sciences 103, 15782-15787.
Zeller,K.I., Zhao,X., Lee,C.W., Chiu,K.P., Yao,F., Yustein,J.T., Ooi,H.S., Orlov,Y.L., Shahab,A., Yong,H.C., Fu,Y., Weng,Z., Kuznetsov,V.A., Sung,W.K., Ruan,Y., Dang,C.V., and Wei,C.L. (2006). Global mapping of c-Myc binding sites and target gene networks in human B cells. Proc. Natl. Acad. Sci. U. S. A 103, 17834-17839.
Barski,A., Cuddapah,S., Cui,K., Roh,T.Y., Schones,D.E., Wang,Z., Wei,G., Chepelev,I., and Zhao,K. (2007). High-resolution profiling of histone methylations in the human genome. Cell 129, 823-837.
Birney,E., Stamatoyannopoulos,J.A., Dutta,A., Guigo,R., Gingeras,T.R., Margulies,E.H., Weng,Z., Snyder,M., Dermitzakis,E.T., Thurman,R.E., Kuehn,M.S., Taylor,C.M., Neph,S., Koch,C.M., Asthana,S., Malhotra,A., Adzhubei,I., Greenbaum,J.A., Andrews,R.M., Flicek,P., Boyle,P.J., Cao,H., Carter,N.P., Clelland,G.K., Davis,S., Day,N., Dhami,P., Dillon,S.C., Dorschner,M.O., Fiegler,H., Giresi,P.G., Goldy,J., Hawrylycz,M., Haydock,A., Humbert,R., James,K.D., Johnson,B.E., Johnson,E.M., Frum,T.T., Rosenzweig,E.R., Karnani,N., Lee,K., Lefebvre,G.C., Navas,P.A., Neri,F., Parker,S.C., Sabo,P.J., Sandstrom,R., Shafer,A., Vetrie,D., Weaver,M., Wilcox,S., Yu,M., Collins,F.S., Dekker,J., Lieb,J.D., Tullius,T.D., Crawford,G.E., Sunyaev,S., Noble,W.S., Dunham,I., Denoeud,F., Reymond,A., Kapranov,P., Rozowsky,J., Zheng,D., Castelo,R., Frankish,A., Harrow,J., Ghosh,S., Sandelin,A., Hofacker,I.L., Baertsch,R., Keefe,D., Dike,S., Cheng,J., Hirsch,H.A., Sekinger,E.A., Lagarde,J., Abril,J.F., Shahab,A., Flamm,C., Fried,C., Hackermuller,J., Hertel,J., Lindemeyer,M., Missal,K., Tanzer,A., Washietl,S., Korbel,J., Emanuelsson,O., Pedersen,J.S., Holroyd,N., Taylor,R., Swarbreck,D., Matthews,N., Dickson,M.C., Thomas,D.J., Weirauch,M.T., Gilbert,J., Drenkow,J., Bell,I., Zhao,X., Srinivasan,K.G., Sung,W.K., Ooi,H.S., Chiu,K.P., Foissac,S., Alioto,T., Brent,M., Pachter,L., Tress,M.L., Valencia,A., Choo,S.W., Choo,C.Y., Ucla,C., Manzano,C., Wyss,C., Cheung,E., Clark,T.G., Brown,J.B., Ganesh,M., Patel,S., Tammana,H., Chrast,J., Henrichsen,C.N., Kai,C., Kawai,J., Nagalakshmi,U., Wu,J., Lian,Z., Lian,J., Newburger,P., Zhang,X., Bickel,P., Mattick,J.S., Carninci,P., Hayashizaki,Y., Weissman,S., Hubbard,T., Myers,R.M., Rogers,J., Stadler,P.F., Lowe,T.M., Wei,C.L., Ruan,Y., Struhl,K., Gerstein,M., Antonarakis,S.E., Fu,Y., Green,E.D., Karaoz,U., Siepel,A., Taylor,J., Liefer,L.A., Wetterstrand,K.A., Good,P.J., Feingold,E.A., Guyer,M.S., Cooper,G.M., Asimenos,G., Dewey,C.N., Hou,M., Nikolaev,S., 
Montoya-Burgos,J.I., Loytynoja,A., Whelan,S., Pardi,F., Massingham,T., Huang,H., Zhang,N.R., Holmes,I., Mullikin,J.C., Ureta-Vidal,A., Paten,B., Seringhaus,M., Church,D., Rosenbloom,K., Kent,W.J., Stone,E.A., Batzoglou,S., Goldman,N., Hardison,R.C., Haussler,D., Miller,W.,
Sidow,A., Trinklein,N.D., Zhang,Z.D., Barrera,L., Stuart,R., King,D.C., Ameur,A., Enroth,S., Bieda,M.C., Kim,J., Bhinge,A.A., Jiang,N., Liu,J., Yao,F., Vega,V.B., Lee,C.W., Ng,P., Shahab,A., Yang,A., Moqtaderi,Z., Zhu,Z., Xu,X., Squazzo,S., Oberley,M.J., Inman,D., Singer,M.A., Richmond,T.A., Munn,K.J., Rada-Iglesias,A., Wallerman,O., Komorowski,J., Fowler,J.C., Couttet,P., Bruce,A.W., Dovey,O.M., Ellis,P.D., Langford,C.F., Nix,D.A., Euskirchen,G., Hartman,S., Urban,A.E., Kraus,P., Van,C.S., Heintzman,N., Kim,T.H., Wang,K., Qu,C., Hon,G., Luna,R., Glass,C.K., Rosenfeld,M.G., Aldred,S.F., Cooper,S.J., Halees,A., Lin,J.M., Shulha,H.P., Zhang,X., Xu,M., Haidar,J.N., Yu,Y., Ruan,Y., Iyer,V.R., Green,R.D., Wadelius,C., Farnham,P.J., Ren,B., Harte,R.A., Hinrichs,A.S., Trumbower,H., and Clawson,H. (2007). Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799-816.
Chiu,K.P., Ariyaratne,P., Xu,H., Tan,A., Ng,P., Liu,E.T., Ruan,Y., Wei,C.L., and Sung,W.K. (2007). Pathway aberrations of murine melanoma cells observed in Paired-End diTag transcriptomes. BMC. Cancer 7, 109.
Collins,P.J., Kobayashi,Y., Nguyen,L., Trinklein,N.D., and Myers,R.M. (2007). The ets-Related Transcription Factor GABP Directs Bidirectional Transcription. PLoS. Genet. 3, e208.
Cooper,S.J., Trinklein,N.D., Nguyen,L., and Myers,R.M. (2007). Serum response factor binding sites differ in three human cell types. Genome Res. 17, 136-144.
Denoeud,F., Kapranov,P., Ucla,C., Frankish,A., Castelo,R., Drenkow,J., Lagarde,J., Alioto,T., Manzano,C., Chrast,J., Dike,S., Wyss,C., Henrichsen,C.N., Holroyd,N., Dickson,M.C., Taylor,R., Hance,Z., Foissac,S., Myers,R.M., Rogers,J., Hubbard,T., Harrow,J., Guigo,R., Gingeras,T.R., Antonarakis,S.E., and Reymond,A. (2007). Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions. Genome Res. 17, 746-759.
Euskirchen,G.M., Rozowsky,J.S., Wei,C.L., Lee,W.H., Zhang,Z.D., Hartman,S., Emanuelsson,O., Stolc,V., Weissman,S., Gerstein,M.B., Ruan,Y., and Snyder,M. (2007). Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies. Genome Res. 17, 898-909.
Heintzman,N.D., Stuart,R.K., Hon,G., Fu,Y., Ching,C.W., Hawkins,R.D., Barrera,L.O., Van,C.S., Qu,C., Ching,K.A., Wang,W., Weng,Z., Green,R.D., Crawford,G.E., and Ren,B. (2007). Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39, 311-318.
Jeck,W.R., Reinhardt,J.A., Baltrus,D.A., Hickenbotham,M.T., Magrini,V., Mardis,E.R., Dangl,J.L., and Jones,C.D. (2007). Extending assembly of short DNA sequences to handle error. Bioinformatics. 23, 2942-2944.
Johnson,D.S., Mortazavi,A., Myers,R.M., and Wold,B. (2007). Genome-wide mapping of in vivo protein-DNA interactions. Science 316, 1497-1502.
Kim,T.H., Abdullaev,Z.K., Smith,A.D., Ching,K.A., Loukinov,D.I., Green,R.D., Zhang,M.Q., Lobanenkov,V.V., and Ren,B. (2007). Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell 128, 1231-1245.
Kuznetsov,V.A., Orlov,Y.L., Wei,C.L., and Ruan,Y. (2007). Computational analysis and modeling of genome-scale avidity distribution of transcription factor binding sites in chip-pet experiments. Genome Inform. 19, 83-94.
Lim,C.A., Yao,F., Wong,J.J., George,J., Xu,H., Chiu,K.P., Sung,W.K., Lipovich,L., Vega,V.B., Chen,J., Shahab,A., Zhao,X.D., Hibberd,M., Wei,C.L., Lim,B., Ng,H.H., Ruan,Y., and Chin,K.C.
(2007). Genome-wide mapping of RELA(p65) binding identifies E2F1 as a transcriptional activator recruited by NF-kappaB upon TLR4 activation. Mol. Cell 27, 622-635.
Lin,C.Y., Vega,V.B., Thomsen,J.S., Zhang,T., Kong,S.L., Xie,M., Chiu,K.P., Lipovich,L., Barnett,D.H., Stossi,F., Yeo,A., George,J., Kuznetsov,V.A., Lee,Y.K., Charn,T.H., Palanisamy,N., Miller,L.D., Cheung,E., Katzenellenbogen,B.S., Ruan,Y., Bourque,G., Wei,C.L., and Liu,E.T. (2007). Whole-genome cartography of estrogen receptor alpha binding sites. PLoS. Genet. 3, e87.
Lin,J.M., Collins,P.J., Trinklein,N.D., Fu,Y., Xi,H., Myers,R.M., and Weng,Z. (2007). Transcription factor binding and modified histones in human bidirectional promoters. Genome Res. 17, 818-827.
Mardis,E.R. (2007). ChIP-seq: welcome to the new frontier. Nat. Methods 4, 613-614.
Ruan,Y., Ooi,H.S., Choo,S.W., Chiu,K.P., Zhao,X.D., Srinivasan,K.G., Yao,F., Choo,C.Y., Liu,J., Ariyaratne,P., Bin,W.G., Kuznetsov,V.A., Shahab,A., Sung,W.K., Bourque,G., Palanisamy,N., and Wei,C.L. (2007). Fusion transcripts and transcribed retrotransposed loci discovered through comprehensive transcriptome analysis using Paired-End diTags (PETs). Genome Res. 17, 828-838.
Xi,H., Shulha,H.P., Lin,J.M., Vales,T.R., Fu,Y., Bodine,D.M., McKay,R.D., Chenoweth,J.G., Tesar,P.J., Furey,T.S., Ren,B., Weng,Z., and Crawford,G.E. (2007). Identification and characterization of cell type-specific and ubiquitous chromatin regulatory structures in the human genome. PLoS Genet. 3, e136.
Zhao,X.D., Han,X., Chew,J.L., Liu,J., Chiu,K.P., Choo,A., Orlov,Y.L., Sung,W.K., Shahab,A., Kuznetsov,V.A., Bourque,G., Oh,S., Ruan,Y., Ng,H.H., and Wei,C.L. (2007). Whole-genome mapping of histone h3 lys4 and 27 trimethylations reveals distinct genomic compartments in human embryonic stem cells. Cell Stem Cell 1, 286-298.
Zheng,D., Frankish,A., Baertsch,R., Kapranov,P., Reymond,A., Choo,S.W., Lu,Y., Denoeud,F., Antonarakis,S.E., Snyder,M., Ruan,Y., Wei,C.L., Gingeras,T.R., Guigo,R., Harrow,J., and Gerstein,M.B. (2007). Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution. Genome Res. 17, 839-851.
Bourque,G., Leong,B., Vega,V.B., Chen,X., Lee,Y.L., Srinivasan,K.G., Chew,J.L., Ruan,Y., Wei,C.L., Ng,H.H., and Liu,E.T. (2008). Evolution of the mammalian transcription factor binding repertoire via transposable elements. Genome Res. 18, 1752-1762.
Chen,X., Xu,H., Yuan,P., Fang,F., Huss,M., Vega,V.B., Wong,E., Orlov,Y.L., Zhang,W., Jiang,J., Loh,Y.H., Yeo,H.C., Yeo,Z.X., Narang,V., Govindarajan,K.R., Leong,B., Shahab,A., Ruan,Y., Bourque,G., Sung,W.K., Clarke,N.D., Wei,C.L., and Ng,H.H. (2008). Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell 133, 1106-1117.
Fullwood,M.J., Tan,J.J., Ng,P.W., Chiu,K.P., Liu,J., Wei,C.L., and Ruan,Y. (2008). The use of multiple displacement amplification to amplify complex DNA libraries. Nucleic Acids Res. 36, e32.
Hillier,L.W., Marth,G.T., Quinlan,A.R., Dooling,D., Fewell,G., Barnett,D., Fox,P., Glasscock,J.I., Hickenbotham,M., Huang,W., Magrini,V.J., Richt,R.J., Sander,S.N., Stewart,D.A., Stromberg,M., Tsung,E.F., Wylie,T., Schedl,T., Wilson,R.K., and Mardis,E.R. (2008). Whole-genome sequencing and variant discovery in C. elegans. Nat. Methods 5, 183-188.
Johnson,D.S., Li,W., Gordon,D.B., Bhattacharjee,A., Curry,B., Ghosh,J., Brizuela,L., Carroll,J.S., Brown,M., Flicek,P., Koch,C.M., Dunham,I., Bieda,M., Xu,X., Farnham,P.J., Kapranov,P., Nix,D.A., Gingeras,T.R., Zhang,X., Holster,H., Jiang,N., Green,R.D., Song,J.S., McCuine,S.A., Anton,E., Nguyen,L., Trinklein,N.D., Ye,Z., Ching,K., Hawkins,D., Ren,B., Scacheri,P.C., Rozowsky,J., Karpikov,A., Euskirchen,G., Weissman,S., Gerstein,M., Snyder,M., Yang,A., Moqtaderi,Z., Hirsch,H., Shulha,H.P., Fu,Y., Weng,Z., Struhl,K., Myers,R.M., Lieb,J.D., and Liu,X.S. (2008). Systematic evaluation of variability in ChIP-chip experiments using predefined DNA targets. Genome Res. 18, 393-403.
Liu,X., Wang,L., Zhao,K., Thompson,P.R., Hwang,Y., Marmorstein,R., and Cole,P.A. (2008). The structural basis of protein acetylation by the p300/CBP transcriptional coactivator. Nature 451, 846-850.
Mardis,E.R. (2008). The impact of next-generation sequencing technology on genetics. Trends Genet. 24, 133-141.
Oleksyk,T.K., Zhao,K., De,L., V, Gilbert,D.A., O'Brien,S.J., and Smith,M.W. (2008). Identifying selected regions from heterozygosity and divergence using a light-coverage genomic dataset from two human populations. PLoS. ONE. 3, e1712.
Roh,T.Y. and Zhao,K. (2008). High-resolution, genome-wide mapping of chromatin modifications by GMAT. Methods Mol. Biol. 387, 95-108.
Schones,D.E., Cui,K., Cuddapah,S., Roh,T.Y., Barski,A., Wang,Z., Wei,G., and Zhao,K. (2008). Dynamic regulation of nucleosome positioning in the human genome. Cell 132, 887-898.
Schones,D.E. and Zhao,K. (2008). Genome-wide approaches to studying chromatin modifications. Nat Rev. Genet. 9, 179-191.
Wang,Z., Zang,C., Rosenfeld,J.A., Schones,D.E., Barski,A., Cuddapah,S., Cui,K., Roh,T.Y., Peng,W., Zhang,M.Q., and Zhao,K. (2008). Combinatorial patterns of histone acetylations and methylations in the human genome. Nat Genet. 40, 897-903.
Wold,B. and Myers,R.M. (2008). Sequence census methods for functional genomics. Nat. Methods 5, 19-21.
Zhao,X., Ruan,Y., and Wei,C.L. (2008). Tackling the epigenome in the pluripotent stem cells. J. Genet. Genomics 35, 403-412.
Barski,A. and Zhao,K. (2009). Genomic location analysis by ChIP-Seq. J. Cell Biochem.
Cui,K., Zang,C., Roh,T.Y., Schones,D.E., Childs,R.W., Peng,W., and Zhao,K. (2009). Chromatin signatures in multipotent human hematopoietic stem cells indicate the fate of bivalent genes during differentiation. Cell Stem Cell 4, 80-93.
Ho,L., Jothi,R., Ronan,J.L., Cui,K., Zhao,K., and Crabtree,G.R. (2009). An embryonic stem cell chromatin remodeling complex, esBAF, is an essential component of the core pluripotency transcriptional network. Proc. Natl. Acad. Sci U. S. A 106, 5187-5191.
Milne,T.A., Zhao,K., and Hess,J.L. (2009). Chromatin Immunoprecipitation (ChIP) for Analysis of Histone Modifications and Chromatin-Associated Proteins. Methods Mol. Biol. 538, 1-15.
Rosenfeld,J.A., Wang,Z., Schones,D.E., Zhao,K., DeSalle,R., and Zhang,M.Q. (2009). Determination of enriched histone modifications in non-genic portions of the human genome. BMC. Genomics 10, 143.
Wang,Z., Schones,D.E., and Zhao,K. (2009). Characterization of human epigenomes. Curr. Opin. Genet. Dev.
Wei,G., Wei,L., Zhu,J., Zang,C., Hu-Li,J., Yao,Z., Cui,K., Kanno,Y., Roh,T.Y., Watford,W.T., Schones,D.E., Peng,W., Sun,H.W., Paul,W.E., O'Shea,J.J., and Zhao,K. (2009). Global mapping of H3K4me3 and H3K27me3 reveals specificity and plasticity in lineage fate determination of differentiating CD4+ T cells. Immunity. 30, 155-167.