Supplementary Data
Supplementary Table S1. Identifiers and references of datasets analyzed
Name              Identifier, citation or accession number  Number of replicates
Ash1C             GSE30820                                  1
Ash1N             GSE30820                                  1
Fsh-L             GSE30820                                  1
Fsh-SL            GSE30820                                  1
INPUT             (Enderle et al., 2011)                    2
Pc                (Enderle et al., 2011)                    2
Ph                (Enderle et al., 2011)                    2
Psc               (Enderle et al., 2011)                    2
Trx-C             (Enderle et al., 2011)                    2
BEAF-70           modENCODE_922                             2
BEAF-HB           modENCODE_274                             2
BRE1_Q2539        modENCODE_923                             2
Chro(Chriz)BR     modENCODE_278                             2
Chro(Chriz)WR     modENCODE_279                             2
CP190-HB          modENCODE_925                             2
CP190-VC          modENCODE_280                             2
CTCF-VC           modENCODE_283                             2
dMi-2_Q2626       modENCODE_926                             2
dRing Q3200       modENCODE_928                             2
E(z)              modENCODE_284                             3
GAF               modENCODE_285                             2
HP1a_wa184        modENCODE_2668                            2
HP1b (Henikoff)   modENCODE_941                             2
HP1c (MO 462)     modENCODE_943                             2
HP2 (Ab2-90)      modENCODE_944                             2
JIL-1_Q3433       modENCODE_945                             2
MBD-R2_Q2567      modENCODE_946                             2
mod2.2-VC         modENCODE_2674                            2
NURF301_Q2602     modENCODE_947                             2
RNA pol II (ALG)  modENCODE_329                             2
SU(HW)-HB         modENCODE_330                             2
Su(Hw)-VC         modENCODE_331                             2
Su(var)3-7-Q3448  modENCODE_2672                            2
Su(var)3-9        modENCODE_2673                            3
RNA-seq           (Enderle et al., 2011)                    1
RNA-seq           modENCODE_983                             3
RNA-seq           modENCODE_2305                            1
shortRNA-seq      (Nechaev et al., 2010)                    2
SUPPLEMENTARY FIG. S1. Comparison of all analyzed RNA-seq datasets; datasets no. 2, 3 and 4 are technical
replicates; axes are on logarithmic scales.
SUPPLEMENTARY FIG. S2. Cross-validation curves of the Lasso-regularized linear model on single features (cyan),
pairs (blue) and triplets (red); the x-axis shows the amount of regularization on a log scale; the vertical lines mark
the minimum of the cross-validated mean squared error.
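For orientation, the quantities behind such curves can be written as follows, assuming the standard Lasso formulation with K-fold cross-validation. The symbols (regularization strength \lambda, number of folds K, held-out fold F_k, and n_{-k} training samples outside fold k) are notation introduced here for illustration and are not taken from the source.

```latex
\hat{\beta}^{(-k)}(\lambda)
  = \arg\min_{\beta}\;
    \frac{1}{2\,n_{-k}} \sum_{i \notin F_k} \bigl(y_i - x_i^{\top}\beta\bigr)^2
    + \lambda \lVert \beta \rVert_1 ,
\qquad
\mathrm{CV}(\lambda)
  = \frac{1}{K} \sum_{k=1}^{K} \frac{1}{|F_k|}
    \sum_{i \in F_k} \bigl(y_i - x_i^{\top}\hat{\beta}^{(-k)}(\lambda)\bigr)^2 ,
\qquad
\hat{\lambda} = \arg\min_{\lambda}\, \mathrm{CV}(\lambda).
```

Under this reading, each curve plots CV(\lambda) against log \lambda, and each vertical line sits at the corresponding \hat{\lambda}.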
SUPPLEMENTARY FIG. S3. Histograms of pairwise correlations of features ranked high (top 35) by Random
Forest (A) and the Bootstrap-Lasso approach (B).
SUPPLEMENTARY FIG. S5. Histogram of Bootstrap-Lasso stability scores; no distinctive candidate cut-off point,
such as a shoulder or a second peak, was observed; the scores are therefore used for ranking the features and interaction
candidates.
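One common way to define such a stability score is the fraction of bootstrap resamples in which the Lasso assigns a feature a non-zero coefficient. This section does not give the exact definition used for the figure, so the sketch below is an assumption. The solver (a plain cyclic coordinate descent), the function names, the penalty `lam`, and the toy data are all hypothetical.

```python
import random

def soft_threshold(value, lam):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 at the start
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant-zero column: coefficient stays at zero
            # covariance of column j with the partial residual (feature j added back)
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - v * delta for v, r in zip(col, resid)]
                beta[j] = new
    return beta

def stability_scores(X, y, lam, n_boot=30, seed=0):
    """Fraction of bootstrap resamples in which each coefficient is non-zero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]

# Hypothetical toy data: feature 0 drives the response, feature 1 is pure noise.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
y = [2.0 * row[0] + 0.1 * data_rng.gauss(0, 1) for row in X]
scores = stability_scores(X, y, lam=0.1)
```

Sorting features by `scores` in descending order gives a ranking of the kind described in the caption, without having to choose a cut-off.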
SUPPLEMENTARY FIG. S4. Comparison of gene transcription levels (RNA-seq) and promoter-proximal pausing
levels (shortRNA-seq). The Pearson correlation over all genes is 0.713; restricted to genes where at least one of the
two values is non-zero, it is 0.582.
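The two correlations differ because genes with zero signal in both assays agree trivially and so raise the overall coefficient. The sketch below shows both computations on hypothetical toy values, not on the data behind the figure. The function names are illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlations(rna, short_rna):
    """Return (r over all genes, r over genes with at least one non-zero value)."""
    r_all = pearson(rna, short_rna)
    kept = [(x, y) for x, y in zip(rna, short_rna) if x != 0 or y != 0]
    r_nonzero = pearson([x for x, _ in kept], [y for _, y in kept])
    return r_all, r_nonzero

# Hypothetical toy values: four genes silent in both assays, four with signal.
rna = [0.0, 0.0, 0.0, 0.0, 3.0, 5.0, 4.0, 6.0]
short_rna = [0.0, 0.0, 0.0, 0.0, 4.0, 5.0, 3.0, 6.0]
r_all, r_nonzero = correlations(rna, short_rna)
```

On these toy values the correlation over all genes is higher than the correlation over the filtered subset, which is the same direction of change as reported in the caption.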
SUPPLEMENTARY FIG. S6. Random Forest node and split enrichments. The diagonal elements represent the
enrichments of the corresponding nodes down to a depth of four. The off-diagonal elements represent the relative
frequencies, i.e. enrichments, of the specific splits down to a depth of four.