Molecular Cell, Volume 41

Supplemental Information

A Comprehensive Genomic Binding Map of Gene and Chromatin Regulatory Proteins in Saccharomyces

Bryan J. Venters, Shinichiro Wachi, Travis N. Mavrich, Barbara E. Andersen, Peony Jena, Andrew J. Sinnamon, Priyanka Jain, Noah S. Rolleri, Cizhong Jiang, Christine Hemeryck-Walsh, and B. Franklin Pugh

Inventory of Supplemental Information

Supplemental Figures

Figure S1. High-resolution (600dpi) version of Figure 4.
Figure S2. Co-occupancy of Figure 4A cluster “a” and “b” factors verified by gene-based clustering.
Figure S3. Cytoscape network of regulator co-occupancy, related to Figure 4.
Figure S4. Clustering of over 2,300 relationships between 25°C promoter occupancy data (this study) and public data, related to Figure 4.
Figure S5. Affymetrix consensus binding locations compared to low-density oligo arrays, related to Figure 5.
Figure S6. ChIP-seq data for 20 elongation factors presented as composite plots, related to Figure 5.
Figure S7. Assembly and disassembly of the transcription machinery in response to heat shock.

Supplemental Tables

Table S1. List of factors covered in this study and confidence values.
Table S2. Processed microarray data: Occupancy levels in YPD at 25°C.
Table S3. Processed microarray data: Occupancy levels in YPD at 37°C.
Table S4. Processed microarray data: Maximal promoter occupancy in YPD at 25°C & 37°C. Related to Figure 2.
Table S5. Complexes occupying individual gene promoters at 25°C (5% FDR), related to Figure 2.
Table S6. Gene properties relevant to Figs. 2, 3, 5, 6, & S3C.
Table S7. Complete set of co-occupancy analysis data presented in Figure 4A-D, and relationships presented in Figure S4.
Table S8. Interpolated factor distance to TSS, related to Figure 5.
Table S9. Processed microarray data: Changes in occupancy (37°C/25°C), related to Figure 6.
Table S10. Complexes increasing or decreasing in promoter occupancy in response to heat shock (37°C/25°C occupancy changes), related to Figure 6.

SUPPLEMENTAL EXPERIMENTAL PROCEDURES

Microarray data analysis

Data were filtered, analyzed, and occupancy levels determined as previously described (Zanton and Pugh, 2006). Normalized occupancy levels are available in Tables S2-S4. Pearson correlation values between biological replicates are available in Table S1.

False discovery rates were determined using a method described previously (Li et al., 2008), and the minimum log2 occupancy values meeting a 5% FDR are reported in row 12 of Tables S2-S4. Briefly, based on four untagged control ChIP-chip occupancy data sets, we generated a frequency distribution plot of the ChIP values. Plotting the data on a log2 scale generally produced a "normal" (symmetrical) distribution for untagged control (background) data. Based on the assumption that the untagged control produces no true positives, the false discovery rate (FDR) at a given threshold was estimated as the ratio of the number of data points passing that threshold in the untagged control to the number passing it in the regulator ChIP. If a 5% FDR could not be reached, then no threshold value was reported, meaning that no values met the 5% FDR cutoff. As a cautionary disclaimer, this error model assumes that all samples have a similar actual background distribution. While we have no evidence against this assumption, day-to-day variability in sample preparation might alter the background distribution and give rise to a different false positive level than indicated.
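
The threshold search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of Li et al. (2008): the function names are hypothetical, the control and regulator data sets are assumed to contain the same number of probes (otherwise the control count would need rescaling), and candidate thresholds are taken from the observed regulator values.

```python
from typing import Optional, Sequence


def fdr_at(threshold: float,
           control: Sequence[float],
           chip: Sequence[float]) -> Optional[float]:
    """Estimate the FDR at a log2 occupancy threshold.

    Assumes the untagged control contains no true positives, so every
    control value at or above the threshold counts as a false positive.
    Returns None when no regulator ChIP value passes the threshold.
    """
    n_control = sum(1 for v in control if v >= threshold)
    n_chip = sum(1 for v in chip if v >= threshold)
    if n_chip == 0:
        return None
    return n_control / n_chip


def min_threshold_for_fdr(control: Sequence[float],
                          chip: Sequence[float],
                          target: float = 0.05) -> Optional[float]:
    """Return the lowest observed ChIP value whose estimated FDR is at
    or below the target, or None if the target is never reached."""
    for threshold in sorted(set(chip)):
        fdr = fdr_at(threshold, control, chip)
        if fdr is not None and fdr <= target:
            return threshold
    return None


# Made-up log2 values, for illustration only.
control = [-0.4, -0.2, 0.0, 0.2, 0.4, 0.1, -0.1, 0.3, -0.3, 0.5]
chip = [-0.2, 0.0, 0.1, 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1]
threshold = min_threshold_for_fdr(control, chip)
```

Returning `None` mirrors the case in which no threshold is reported because a 5% FDR cannot be reached.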

As previously described (Venters and Pugh, 2009), fractional distribution analysis of normalized 25°C occupancy values was used to determine the transcription regulator binding location at each promoter. Briefly, assuming a normal signal value distribution for the binding location of a transcription regulator, fractional distribution analysis interpolates the binding location by using the fractional occupancy of adjacent UAS (-320 to -260 bp) and TSS (-90 to -30 bp) probes. Interpolated factor binding locations relative to the TSS are available in Table S8. Such locations should be considered approximate, and assume that the actual location resides between the two probes, which in many cases might be incorrect.
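
To make the idea of interpolating between the two probes concrete, the following Python sketch weights the probe midpoints by each probe's share of the total signal. It is a simplification under stated assumptions: the linear weighting is a stand-in for the published procedure (Venters and Pugh, 2009), the function names are hypothetical, and the inputs are assumed to be non-negative, linear-scale occupancy values.

```python
from typing import Optional, Tuple

# Probe coordinates relative to the TSS, as given in the text (bp).
UAS_PROBE: Tuple[int, int] = (-320, -260)
TSS_PROBE: Tuple[int, int] = (-90, -30)


def probe_midpoint(probe: Tuple[int, int]) -> float:
    """Midpoint of a probe interval, in bp relative to the TSS."""
    start, end = probe
    return (start + end) / 2.0


def interpolate_location(occ_uas: float, occ_tss: float) -> Optional[float]:
    """Interpolate a binding location between the UAS and TSS probes.

    Each probe midpoint is weighted by that probe's fraction of the
    summed occupancy. Returns None when there is no usable signal.
    """
    total = occ_uas + occ_tss
    if occ_uas < 0 or occ_tss < 0 or total <= 0:
        return None
    frac_uas = occ_uas / total
    return (frac_uas * probe_midpoint(UAS_PROBE)
            + (1.0 - frac_uas) * probe_midpoint(TSS_PROBE))


# Equal signal at both probes places the factor halfway between them.
halfway = interpolate_location(1.0, 1.0)
```

By construction the result always falls between -290 and -60 bp, which illustrates the caveat above: a factor bound outside that interval cannot be placed correctly.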

SUPPLEMENTAL REFERENCES

Li, X.Y., MacArthur, S., Bourgon, R., Nix, D., Pollard, D.A., Iyer, V.N., Hechmer, A., Simirenko, L., Stapleton, M., Luengo Hendriks, C.L., et al. (2008). Transcription factors bind thousands of active and inactive regions in the Drosophila blastoderm. PLoS Biol. 6, e27.

Shannon, P., Markiel, A., Ozier, O., Baliga, N.S., Wang, J.T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498-2504.

Venters, B.J., and Pugh, B.F. (2009). A canonical promoter organization of the transcription machinery and its regulators in the Saccharomyces genome. Genome Res. 19, 360-371.

Zanton, S.J., and Pugh, B.F. (2006). Full and partial genome-wide assembly and disassembly of the yeast transcription machinery in response to heat shock. Genes Dev. 20, 2250-2265.

Figure S1. High-resolution (600dpi) version of Figure 4.

Figure S2. Co-occupancy of Figure 4A cluster “a” and “b” factors verified by gene-based clustering. (A) Figure 4A, cluster “a” validation. Rows representing promoter regions were clustered by K-means (K = 5). 25°C occupancy data were filtered to include promoters having >75% data present and at least 10 observations with >2-fold occupancy level (fold over background), which resulted in 921 genes. Columns represent ChIP-chip data for a given regulator, and were clustered hierarchically. (B) Figure 4A, cluster “b” validation. Rows representing promoter regions were clustered by K-means (K = 3). 25°C occupancy data were filtered to include promoters having >75% data present and at least 10 observations with >2-fold occupancy level (fold over background), which resulted in 1418 genes. Columns represent ChIP-chip data for a given regulator, and were clustered hierarchically.
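
The promoter filter described in this legend (more than 75% of data present and at least 10 observations above 2-fold occupancy) can be sketched as below. The function name, the dictionary layout, and the use of `None` for missing values are assumptions made for illustration; the clustering steps themselves are not shown.

```python
from typing import Dict, List, Optional, Sequence


def filter_promoters(occupancy: Dict[str, Sequence[Optional[float]]],
                     min_present_fraction: float = 0.75,
                     min_hits: int = 10,
                     fold_cutoff: float = 2.0) -> List[str]:
    """Return the promoters that pass both filters.

    occupancy maps a promoter (gene) name to its fold-over-background
    values across regulators; None marks a missing measurement.
    """
    kept = []
    for gene, values in occupancy.items():
        if not values:
            continue
        present = [v for v in values if v is not None]
        # Strictly more than the required fraction must be present.
        if len(present) / len(values) <= min_present_fraction:
            continue
        # Count observations strictly above the fold cutoff.
        hits = sum(1 for v in present if v > fold_cutoff)
        if hits >= min_hits:
            kept.append(gene)
    return kept


# Made-up values for four hypothetical promoters and five regulators.
example = {
    "A": [3.0, 2.5, 1.0, 1.1, None],
    "B": [3.0, None, None, 2.5, 4.0],
    "C": [1.0, 1.2, 2.1, 0.9, 1.0],
    "D": [2.0, 2.0, 3.0, 4.0, 1.0],
}
# min_hits is lowered to 2 only because the toy matrix is small.
kept = filter_promoters(example, min_hits=2)
```

With the defaults, the same function expresses the criteria used for panels (A) and (B); the looser criteria quoted for Figure S3C would correspond to `min_present_fraction=0.80` and `min_hits=1`.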

Figure S3. Cytoscape network of regulator co-occupancy, related to Figure 4. (A) Nodes represent individual regulators that are colored by stage (see Figure 1A). Edges or lines are colored based on the clustered groups of regulators from Figure 4A (clusters a, b, and c have the following edge colors: cyan, green, and blue, respectively). Regulator pairs whose statistical significance for co-occupancy was P < 10^-5 were connected in the network and arranged using an edge-weighted, spring-embedded network layout algorithm (Shannon et al., 2003). Thus, regulator pairs with the most significant co-occupancy (lowest P-value) are closest together. (B) A subnetwork of regulator co-occupancy whose statistical significance had P < 10^-60. Linker histone H1 and Asf1 are linked in the network to the Ume6 repressor of meiotic genes. (C) A hierarchically and K-means clustered plot of Ume6, H1, Asf1, and other regulators. Rows representing upstream promoter regions were clustered by K-means (K = 5). 25°C occupancy data were filtered to include promoters having >80% data present and at least 1 observation with >2-fold occupancy, which resulted in 3,848 genes. The intensity of red indicates ChIP-chip occupancy levels at 25°C.
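
As a rough illustration of how such a network could be assembled before layout, the sketch below keeps only regulator pairs under a P-value cutoff and assigns each edge a weight of -log10(P). The function name, the choice of edge weight, and the P-values are all hypothetical; the layout itself was done in Cytoscape and is not reproduced here.

```python
import math
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]


def build_edges(pair_pvalues: Dict[Tuple[str, str], float],
                cutoff: float = 1e-5) -> List[Edge]:
    """Keep regulator pairs with P below the cutoff.

    Each edge carries -log10(P) as its weight (an assumed choice), and
    edges are returned from most to least significant.
    """
    edges = []
    for (reg_a, reg_b), p in pair_pvalues.items():
        if p >= cutoff:
            continue
        # A P-value that underflows to zero gets an infinite weight.
        weight = math.inf if p <= 0.0 else -math.log10(p)
        edges.append((reg_a, reg_b, weight))
    edges.sort(key=lambda edge: edge[2], reverse=True)
    return edges


# Made-up P-values, for illustration only.
pairs = {
    ("Ume6", "Asf1"): 1e-70,
    ("Ume6", "H1"): 1e-62,
    ("Rap1", "Sfp1"): 1e-8,
    ("Gal4", "Rsc"): 0.01,
}
network = build_edges(pairs)             # cutoff as in panel (A)
subnetwork = build_edges(pairs, 1e-60)   # cutoff as in panel (B)
```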

Figure S4. Clustering of over 2,300 relationships between 25°C promoter occupancy data (this study) and public data, related to Figure 4. Columns represent regulators and rows are public data sets. The overlap of those genes having significant occupancy of a particular regulator, and those genes that are in either the top or bottom tenth percentile (evaluated separately) of the indicated public datasets is represented by a color-coded P-value. Each pixel represents a -log10 P-value (Chi test) of the overlap. Brighter yellow means a stronger overlap. These P-values should not be interpreted strictly, as they can be skewed depending on the data types and underlying assumptions. The public datasets are a mixture of expression profiling data, ChIP-occupancy, motif enrichments, and others. Consequently, the P-values should be interpreted relative to others in the same data class (by rank ordering). See Table S7 for the corresponding values.
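
For readers who want to see how one pixel of such a heatmap might be computed, the sketch below scores the overlap between two gene sets with a chi-square test on a 2x2 contingency table and reports -log10 of the P-value. The exact form of the test used for the figure is not specified here, so the uncorrected Pearson chi-square with one degree of freedom is an assumption, as are the function names and the example counts.

```python
import math


def chi_square_2x2(both: int, reg_only: int,
                   set_only: int, neither: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for a
    2x2 table of gene counts: regulator-bound vs. in the public set."""
    n = both + reg_only + set_only + neither
    denom = ((both + reg_only) * (set_only + neither)
             * (both + set_only) * (reg_only + neither))
    if denom == 0:
        return 0.0  # an empty row or column carries no information
    return n * (both * neither - reg_only * set_only) ** 2 / denom


def neglog10_overlap_p(both: int, reg_only: int,
                       set_only: int, neither: int) -> float:
    """-log10 of the chi-square P-value (1 degree of freedom)."""
    chi2 = chi_square_2x2(both, reg_only, set_only, neither)
    # For 1 df, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    if p == 0.0:
        return math.inf  # P underflowed; report an unbounded score
    return -math.log10(p)


# Made-up counts: 30 genes in both sets, 10 and 10 in only one set,
# and 50 in neither.
score = neglog10_overlap_p(30, 10, 10, 50)
```

Because the statistic is two-sided, depletion and enrichment both give high scores, which is one reason such values are better compared by rank within a data class than read as absolute evidence.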

Figure S5. Affymetrix consensus binding locations compared to low-density oligo arrays, related to Figure 5. Composite distributions of regulators across promoter regions are displayed as a heatmap in the same manner as Figure 5. Affymetrix high-density (~5 bp probe spacing) data meeting a 5% FDR (Venters and Pugh, 2009) were used to generate the frequency distributions for each factor shown. Base pair distances from the TSS, nucleosome locations, and the consensus TATA box location (green bar) are shown above the heatmap.

Figure S6. ChIP-seq data for 20 elongation factors presented as composite plots, related to Figure 5. (A-F) For elongation factors, ChIP-seq data were plotted as composite frequency distributions relative to the TSS. The ChIP-seq data for each factor were normalized to an untagged control and centered to the lowest read quartile for a 25 bp bin. Genes with >2-fold normalized occupancy were plotted. Elongation factor traces were grouped based on belonging to the same complex (A and D) or based on similar distribution patterns (B, C, E, and F).

Figure S7. Assembly and disassembly of the transcription machinery in response to heat shock. (A) Log2 ratios of 37°C/25°C occupancy changes for promoter regions that increased (red), decreased (green), did not change (black), or did not have data (black), the latter comprising <10% of the data. Rows represent promoter regions (A) or 3’ ends of genes (B), and are ordered identically in the two panels. See main Figure 6 for details.

Supplemental Tables

Table S1. List of factors covered in this study and confidence values. For each ChIP-chip experiment, the Pearson correlation (R-values) between biological replicates and high confidence data sets are indicated. Criteria for high confidence data include: 1) statistically significant overlap with published data (noted as comment in each cell), 2) statistically significant overlap between factors within a complex and/or 3) high ChIP efficiency. See actual file for data.

Table S2. Processed microarray data: Occupancy levels in YPD at 25°C. See Excel file for complete data.

Table S3. Processed microarray data: Occupancy levels in YPD at 37°C. See Excel file for complete data.

Table S4. Processed microarray data: Maximal promoter occupancy in YPD at 25°C & 37°C. Related to Figure 2. See Excel file for complete data.

Table S5. Complexes occupying individual gene promoters at 25°C (5% FDR), related to Figure 2.

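Columns: gene, mRNA/hr, and the complexes listed under each stage: 1) Orchestration, 2) Access, 3) Initiation, 4) Elongation.

Highly expressed, TATA-less genes

RPL14A (106 mRNA/hr)
1) Orchestration: RAP1; SFP1
2) Access: ADA; ASF1; ISWI; NHP; NuA4; RPD3; RSC; SSN6-TUP1; SWI-SNF
3) Initiation: NC2; SAGA; TFIIB; TFIID; TFIIF; TFIIH
4) Elongation: PCF11; Pol II; RAT1; SPT6

Highly expressed, TATA genes

PGK1 (111 mRNA/hr)
1) Orchestration: DIG1; GAL4*; GCN4; GLN3; INO2; LEU3; PHO2,4; SFP1; Spt23; STP1
2) Access: COMPASS; Condensin; HDA1; INO80; ISWI; NHP; NuA4; RPD3; RPH1; SAS; SET2; SIN1; SIR; SSN6/TUP1; SWI/SNF
3) Initiation: Mediator; SAGA; TFIIB; TFIID; TFIIE; TFIIF; TFIIH
4) Elongation: BUR1,2; CCR4-NOT; CTK; ESS1; FACT; PAF1; SPT6; TFIIS; THO

Lowly expressed, TATA-less genes

SPO13 (mRNA/hr: N.D.)
1) Orchestration: UME6
2) Access: ASF1; Histone H1
3) Initiation: none listed
4) Elongation: TFIIS

Lowly expressed, TATA genes

GAL10** (mRNA/hr: N.D.)
1) Orchestration: SFP1; STP1; UME6
2) Access: ASF1; Histone H1; ISWI; NHP; NuA3
3) Initiation: NC2
4) Elongation: Rat1; TFIIS

*Based on other studies, Gal4 is likely a false positive.
**In YPD media, GAL10 gene expression is essentially off.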

Table S6. Gene properties relevant to Figs. 2, 3, 5, 6, & S3C. See Excel file for complete data.

Table S7. Complete set of co-occupancy analysis data presented in Figure 4A-D, and relationships presented in Figure S4. This workbook contains the values presented in Figure 4 and S4 (see individual tabs in the actual Excel file, accompanying this manuscript).

Table S8. Interpolated factor distance to TSS, related to Figure 5. See Excel file for complete data.

Table S9. Processed microarray data: Changes in occupancy (37°C/25°C), related to Figure 6. See Excel file for complete data.

Table S10. Complexes increasing or decreasing in promoter occupancy in response to heat shock (37°C/25°C occupancy changes), related to Figure 6.

Columns: Stage (Orchestration, Access, Initiation, Elongation), given first for complexes coming and then for complexes leaving.

Heat shock induced

TFIID-dominated REB1 SWR-C TFIID RFX1 ISW1b TFIIH CTK

XBP1 RPD3

SAGA-dominated FHL1 CHZ Mediator FACT APIS

MSN2,4 INO80 NC2 Pol II

UME6 RPD3 SAGA SPT6

SKN7 CYC8/SSN6 TFIIE THO

STP1 SWI-SNF TFIIF

YAP1 TFIIH

Heat shock repressed

FHL1 HDA1 THO MSN2,4 ADA Mediator CCR4-NOT

FKH1 HOS1 DOT1 NC2 FACT

REB1 ISW1a/b SWI/SNF SAGA Pol II

RFX1 JHDM1 TFIIB SPT6

UME6 NuA4 TFIID

STP1 nucleosome TFIIE

XBP1 RPD3 TFIIF

YAP1 RSC TFIIH

YAP6 SET2

SWR-C

