www.sciencetranslationalmedicine.org/cgi/content/full/4/156/156ra140/DC1
Supplementary Materials for
Methylation Subtypes and Large-Scale Epigenetic Alterations in Gastric Cancer
Hermioni Zouridis, Niantao Deng, Tatiana Ivanova, Yansong Zhu, Bernice Wong, Dan
Huang, Yong Hui Wu, Yingting Wu, Iain Beehuat Tan, Natalia Liem, Veena Gopalakrishnan, Qin Luo, Jeanie Wu, Minghui Lee, Wei Peng Yong, Liang Kee Goh,
Bin Tean Teh, Steve Rozen, Patrick Tan*
*To whom correspondence should be addressed. E-mail: [email protected]
Published 17 October 2012, Sci. Transl. Med. 4, 156ra140 (2012) DOI: 10.1126/scitranslmed.3004504
The PDF file includes:
Materials and Methods
Fig. S1. Correlation between Infinium and GoldenGate methylation profiles.
Fig. S2. Unsupervised clustering of gastric tumors and nonmalignant gastric normals using DNA methylation patterns.
Fig. S3. Intersection of genes with DNA methylation and gene expression information.
Fig. S4. Overlap of H3K27ac and H3K4me3 ChIP-Seq binding peaks with CpG sites exhibiting tumor hypermethylation and increased expression.
Fig. S5. Tandem gene body and promoter methylation alterations in genes.
Fig. S6. Clustering of tumor samples based on DNA methylation data.
Fig. S7. Clustering of gastric cancers by gene expression or DNA methylation.
Fig. S8. High-resolution methylation patterns in LRES5.
Fig. S9. Hypo-LRRs: long-range regions (LRRs) of hypomethylation.
Fig. S10. Box plots relating copy number alterations (CNAs) with individual hypomethylated regions (hypo-LRRs 1 to 24).
Table S1. Clinical characteristics of gastric cancer patients analyzed in this study.
Table S2. Correlation between Infinium and GoldenGate methylation.
Table S3. GSEA of hypermethylated genes underexpressed in tumors.
Table S4. GSEA of hypomethylated genes overexpressed in tumors.
Table S5. Association of positively correlated CpGs to H3K27ac and H3K4me3 peaks.
Table S6. Genes exhibiting correlations between gene body methylation and expression.
Table S7. Genes exhibiting tandem promoter and gene body methylation.
Table S8. Clinicopathologic associations of CIMP and non-CIMP tumors.
Table S9. Multivariate analysis of survival associations of CIMP tumors.
Table S10. GSEA of genes hypermethylated in CIMP gastric tumors.
Table S11. LRESs in CIMP gastric cancers.
Table S12. Association of LRESs with CIMP gastric tumors.
Table S13. CpG methylation correlation coefficients with respect to expression of genes within LRES5.
Table S14. CpG methylation correlation coefficients with respect to expression of genes within LRES3.
Table S15. Hypo-LRRs: long-range regions of hypomethylation.
Table S16. Association of hypo-LRRs to non-CIMP tumors.
Table S17. Hypo-LRR CpGs and repeat sequences.
Supplementary Text S1. Materials and Methods
Primary Gastric Tissue Samples and Cell Lines
Primary gastric tissues were obtained from the Singhealth Tissue Repository or National
University Hospital Singapore Tissue Repository, with approvals from institutional Research
Ethics Review Committees, and with patient informed consent. Gastric cancer samples were
histologically confirmed to have cancer cells present, with an average tumor cellularity of 40%.
In this project, we did not use tumor cell content as a strict inclusion criterion for genomic
profiling, because diffuse-type gastric cancers are known to possess a greater degree of
tumor stroma than intestinal-type gastric cancers. Choosing only samples of high tumor
cellularity would therefore bias our cohort towards intestinal-type gastric cancers, which
would run counter to our overall goal of identifying the most prevalent alterations in the overall
gastric cancer population. Histopathological data and patient characteristics are provided in
Supplementary Table S1. Patients were categorized into separate disease stages according to the
American Joint Committee on Cancer (6th Edition). Gastric cancer cell lines were cultured as
previously described [20].
DNA and RNA Extraction
Genomic DNA was extracted from flash-frozen primary tissues using a Qiagen genomic DNA
extraction kit. Total RNAs were extracted using Trizol (Invitrogen), digested with RNase-free
DNase (RQ1 DNase, Promega), and subsequently purified using an RNeasy Mini kit (Qiagen).
CpG Methylation Profiling
Infinium assays were performed according to the manufacturer’s instructions (Illumina), using
arrays interrogating 27,578 CpGs located mainly in promoter regions (-1,500 to +1,000 bp of
TSSs) of over 14,000 genes. 500 ng of high-quality genomic DNA was used for bisulphite
conversion. After hybridizing bisulphite and non-bisulphite treated samples onto the arrays, the
stained arrays were scanned using an Illumina Bead Array Reader. Individual image files were
quality controlled using Illumina Genome Studio software, examining factors such as
sample-dependent controls (staining, hybridization, target removal, and extension) and
sample-independent controls (bisulphite conversion, specificity, negative, and non-polymorphism). Only
samples satisfying these QC measures were used in downstream analysis. Methylation analysis
using Infinium 450K arrays was also performed on selected CIMP tumors and matched normals,
according to the manufacturer's instructions. YCC-11 gastric cell line DNA, two fully unmethylated
controls, and one fully methylated control were used as technical replicates across array batches
to verify reproducibility between batches. We also included replicate samples within
each batch as an indicative measure of within-batch robustness. DNA methylation levels for each
CpG site were computed using Illumina’s Genome Studio software as the ratio of methylated
signal intensity to the sum of methylated and unmethylated signal intensities, with a value
between 0 (unmethylated) and 1 (fully methylated). The methylation array data have been
deposited into the Gene Expression Omnibus under accession number GSE30601.
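As an illustration of the methylation level definition above, the following minimal Python sketch computes the ratio of methylated signal to total signal for a single probe. The function name, probe identifiers, and intensities are hypothetical. The sketch follows the ratio as stated in the text and does not model any offset or normalization that the vendor software may apply.

```python
def beta_value(methylated, unmethylated):
    """Return methylated / (methylated + unmethylated), a value between 0 and 1."""
    total = methylated + unmethylated
    if total <= 0:
        raise ValueError("total signal intensity must be positive")
    return methylated / total


# Hypothetical (methylated, unmethylated) intensities, for illustration only.
probes = {"cg_example_1": (9000.0, 1000.0), "cg_example_2": (500.0, 4500.0)}
betas = {name: beta_value(m, u) for name, (m, u) in probes.items()}
```

With these made-up intensities, the first probe is mostly methylated and the second mostly unmethylated.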
Gene Expression Profiling and Copy Number Alterations
mRNAs were hybridized to Affymetrix Human Genome U133 plus Genechips (HG-U133 Plus
2.0, Affymetrix). Raw Affymetrix expression array data were normalized using the MAS5.0
algorithm (Affymetrix). Gene expression array data are available at
http://www.ncbi.nlm.nih.gov/projects/geo/ (Accession: GSE15460). Copy number data for this
data set were analyzed as described in [38]. The copy number data are available under Accession
number GSE31168. Briefly, tumor genomic DNAs (190) and paired normal tissue (98) were
hybridized to Affymetrix Genomewide SNP 6.0 arrays according to the manufacturer's protocol.
Raw fluorescence signals were normalized by Affymetrix genotyping console software.
Segmentation was performed using Circular Binary Segmentation (CBS) [52].
CpG Methylation and Gene Expression Data Analysis
CpG gene associations, distances from TSSs, and presence within CGIs were based on vendor-
provided annotation files using NCBI Genome Build 36 (hg18). Probes targeting the X and Y
chromosomes were removed from analysis. A CpG was considered to be located within a gene
body if the distance from the TSS of the associated gene was > +1,000 bp but less than the length
of the structural gene as defined at http://www.genecards.org/. Differential methylation analyses
were performed on CpGs between two groups of samples using two-sided t-tests, and p-values
were corrected using the Bonferroni method. Corrected p-values of < 0.05 were considered
significant. A CpG was considered to be hypermethylated (or hypomethylated) in one group
relative to another if the mean DNA methylation level in that group was higher (or lower) than
the mean DNA methylation level of the other group. A similar method was used for differential
expression analyses, with p-values corrected using the Benjamini and Hochberg method [53].
Correlation analyses were performed by calculating Pearson correlation coefficients and
associated p-values using Matlab 7.10.0 (MathWorks). p-values were corrected using the
Benjamini and Hochberg method [53], and corrected p-values of < 0.05 were considered
significant. Expression data were filtered so that only genes exhibiting a) expression intensity >
100 in > 25% of samples, and b) a median expression intensity of > 300 across all samples were
considered for downstream analysis (10,587 genes).
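The two multiple-testing corrections and the expression filter described above can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' Matlab or R code: the function names and example values are hypothetical, and the raw p-values are assumed to have been computed beforehand (for example by t-tests or Pearson correlation tests).

```python
from statistics import median


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    n = len(pvalues)
    return [min(1.0, p * n) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini and Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def passes_expression_filter(intensities):
    """Filter from the text: intensity > 100 in > 25% of samples, median > 300."""
    fraction_above = sum(x > 100 for x in intensities) / len(intensities)
    return fraction_above > 0.25 and median(intensities) > 300


raw = [0.001, 0.008, 0.039, 0.041, 0.20]  # hypothetical raw p-values
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
significant_bh = [p < 0.05 for p in bh]
keep_gene = passes_expression_filter([50, 400, 500, 350])  # hypothetical gene
```

On this toy input both corrections call the first two tests significant at 0.05. On larger sets of tests the Bonferroni method is the more conservative of the two.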
Clustering and Pathway Analysis
Unsupervised hierarchical clustering was performed using a Euclidean distance metric and
complete linkage clustering. Prior to clustering, the tumor methylation data were subjected to a
log2 transformation relative to the mean of the corresponding nonmalignant sample probe. The
tumors were clustered based on a) 1,653 CpGs with significant methylation – expression
correlation, and b) 4,739 autosomal tumor probes with transformed DNA methylation levels
exhibiting standard deviation > 0.8. Cell lines were clustered based on 1,631 autosomal probes
with at least 5 samples having methylation levels < 0.1 and at least 5 samples having methylation
levels > 0.9. Gene Set Enrichment Analysis (GSEA, [23]) was performed using Molecular
Signatures Database version 2.5.1 (http://www.broadinstitute.org/gsea/msigdb/index.jsp). Gene
sets associated with hypermethylation/underexpression, hypomethylation/overexpression, and
genes hypermethylated in CIMP tumors were queried against all 5,452 MSigDB gene sets.
Genes exhibiting tumor hypermethylation/overexpression, tumor
hypomethylation/underexpression, and genes exhibiting DNA methylation levels with high
standard deviation were queried against the 2,392 chemical and genetic perturbations gene sets.
Nominal overlap p-values were corrected using the Bonferroni method. Corrected p-values of <
0.05 were considered significant. Permutation analyses were performed by permuting the array
data CpG probe labels and repeating analyses for the same genes over 100 randomization runs.
All statistical tests were done with the R statistical computing language (R version 2.8.1)
(http://www.r-project.org), except where noted.
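The pre-clustering transformation and the variability-based probe selection described above might look like the following Python sketch. The probe names and values are hypothetical. The small floor that guards against taking the logarithm of zero and the use of the sample standard deviation are assumptions, since the text does not specify either. The hierarchical clustering step itself is not shown.

```python
import math
from statistics import mean, stdev


def log2_relative(tumor_betas, normal_betas, floor=1e-3):
    """log2 of each tumor value relative to the mean nonmalignant value of one probe.

    `floor` is an assumption of this sketch to avoid log2(0).
    """
    reference = max(mean(normal_betas), floor)
    return [math.log2(max(b, floor) / reference) for b in tumor_betas]


def select_variable_probes(tumor, normal, sd_cutoff=0.8):
    """Keep probes whose transformed tumor values have standard deviation > cutoff."""
    selected = {}
    for probe, values in tumor.items():
        transformed = log2_relative(values, normal[probe])
        if stdev(transformed) > sd_cutoff:  # sample standard deviation (assumption)
            selected[probe] = transformed
    return selected


# Hypothetical methylation levels for two probes.
tumor = {"probe_1": [0.1, 0.4, 0.8], "probe_2": [0.5, 0.5, 0.55]}
normal = {"probe_1": [0.2, 0.2], "probe_2": [0.5, 0.5]}
selected = select_variable_probes(tumor, normal)
```

Only the first probe varies enough across the toy tumors to pass the 0.8 cut-off.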
Tandem Promoter and Gene Body Methylation Analysis
To identify genes significantly regulated by gene body and promoter methylation, we compared
genes harboring positively correlated gene body methylation-expression relationships against all
genes with promoter CpGs exhibiting significant relationships to gene expression, in terms of
their promoter CpGs exhibiting positive, negative, and no methylation-expression correlations.
We used the multinomial distribution to compare the two distributions, considering a p-value of
< 0.05 to indicate that the distribution of the former differs significantly from the latter. In this
analysis, the BCL2 promoter CpG site, while not strictly located in the defined promoter region
of -1,500 to +1,000 bp from the TSS, was still included due to its close proximity to the TSS
(+1,233 bp). Methylation of the RUNX3 promoter CpG, which was negatively correlated
with expression at borderline significance (p=0.058), was also considered for analysis, due to its
causative role in gastric cancer [54].
Clinico-Pathologic Associations and Statistical Analysis
Associations between gastric cancer molecular subtypes and gender, tumor grade, Lauren
classification, EBV status, MSI status, and disease stage were evaluated using Fisher’s exact test.
Association with patient age was determined by comparing ages of patients associated with
CIMP and non-CIMP tumors using the Wilcoxon rank sum test. p-values of < 0.05 were
considered significant. Kaplan-Meier analysis (SPSS) was performed to compare survival
outcomes for different patient cohorts. Cox regression models were used for computing hazard
ratios in univariate and multivariate analyses (variables considered: subtype, age, gender,
pathology, stage, margins, grade). Multivariate models were generated only using factors found
significant in univariate analysis. Associations of LRESs to CIMP tumors and hypo-LRRs to
non-CIMP tumors were calculated using chi-square tests (1 degree of freedom), with p<0.05
being considered significant. Associations between non-CIMP tumors and amplified regions
were evaluated using Fisher’s exact test. p-values were corrected using the Benjamini and
Hochberg method [53], and corrected p-values of <0.05 were considered significant.
LRES and Hypo-LRR Analysis
LRESs and hypo-LRRs were detected using a 1 Mb sliding window shifted along the genomic
coordinate in 50 kb increments. To identify LRESs, the number of hypermethylated CpGs and
the total number of array CpGs within each 1 Mb window were counted. If a) the observed
number of hypermethylated CpGs was greater than the expected number of hypermethylated
CpGs based on the chromosome average, b) the expected number of hypermethylated CpGs was
greater than 5, and c) the expected number of non-hypermethylated CpGs was greater than 5, a
chi-square test was applied and a p-value calculated. After scanning all windows, obtained p-
values were Bonferroni corrected, and windows associated with corrected p-values < 0.05 were
considered significantly enriched for hypermethylated CpGs. Overlapping significant windows
were grouped into single LRESs. We confirmed LRES hypermethylation in individual samples
using two methods. First we calculated, for each LRES, an "activity score" corresponding to the
fraction of observed hypermethylated CpGs exhibiting higher DNA methylation levels in each
tumor relative to the mean methylation levels of the same CpGs across the nonmalignant
samples. Second, we confirmed coordinated DNA methylation patterns within the LRES by
computing across-cohort CpG-CpG methylation correlation coefficients within the LRES; the
number of significant correlations per CpG was computed for each LRR and compared
against a null distribution of randomly selected windows of comparable length across the
genome. Positive correlation coefficients > 0.4 were considered significant. To determine FDRs
associated with LRESs, we generated null distributions of randomly selected windows of
comparable length across the genome. A false discovery was counted if the number of
significantly overlapping gene sets for a randomly generated region was greater than or equal to
the number of significantly overlapping gene sets for the original region. An FDR cut-off of
<10% was used. A similar process was also used to identify and analyze hypo-LRRs, but
analyzing hypomethylated CpGs rather than hypermethylated CpGs. Regional gene expression
values for each LRES in each tumor were computed by first log2 transforming expression values
of each LRES gene in each tumor relative to the mean expression level of the same gene in
nonmalignant samples, and then averaging the transformed values within the LRES.
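The window scan and the "activity score" described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it handles a single chromosome, applies the Bonferroni correction over the windows tested on that chromosome only, and uses a two-category chi-square goodness-of-fit statistic (hypermethylated versus not) with 1 degree of freedom, a form the text does not spell out. All function names and the synthetic data are hypothetical.

```python
import math

WINDOW = 1_000_000  # 1 Mb window, as in the text
STEP = 50_000       # 50 kb increment, as in the text


def chi2_sf_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def scan_chromosome(cpgs, window=WINDOW, step=STEP):
    """cpgs: list of (position, is_hypermethylated) for one chromosome.

    Returns (start, end, p_value) for every window that met the three
    conditions in the text and was therefore tested.
    """
    if not cpgs:
        return []
    rate = sum(flag for _, flag in cpgs) / len(cpgs)  # chromosome average
    last = max(pos for pos, _ in cpgs)
    tested = []
    for start in range(0, last + 1, step):
        end = start + window
        inside = [flag for pos, flag in cpgs if start <= pos < end]
        total, observed = len(inside), sum(inside)
        exp_hyper = total * rate
        exp_other = total - exp_hyper
        if observed > exp_hyper and exp_hyper > 5 and exp_other > 5:
            stat = ((observed - exp_hyper) ** 2 / exp_hyper
                    + ((total - observed) - exp_other) ** 2 / exp_other)
            tested.append((start, end, chi2_sf_1df(stat)))
    return tested


def merge_significant(tested, alpha=0.05):
    """Bonferroni-correct the tested windows and merge overlapping significant ones."""
    n = len(tested)
    hits = sorted((s, e) for s, e, p in tested if min(1.0, p * n) < alpha)
    merged = []
    for s, e in hits:
        if merged and s < merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])
    return [tuple(region) for region in merged]


def activity_score(tumor_betas, normal_means):
    """Fraction of region CpGs with higher methylation in one tumor than the normal mean."""
    higher = sum(t > n for t, n in zip(tumor_betas, normal_means))
    return higher / len(tumor_betas)


# Synthetic chromosome: one CpG every 10 kb, hypermethylated between 2 Mb and 3 Mb.
cpgs = [(pos, 2_000_000 <= pos < 3_000_000) for pos in range(0, 5_000_000, 10_000)]
tested = scan_chromosome(cpgs)
regions = merge_significant(tested)
score = activity_score([0.9, 0.8, 0.1, 0.7], [0.2, 0.3, 0.2, 0.1])
```

On the synthetic chromosome the scan recovers a single merged region that spans the planted hypermethylated block. For hypo-LRRs the same scan would be run on hypomethylation flags, and the activity score would count CpGs with lower rather than higher methylation.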
Drug Treatments and Murine Xenograft Assays
CIMP and non-CIMP cell lines were treated with 5 µmol/l 5-aza-dC for 72 hours. Proliferation
measurements were determined using a tetrazolium compound-based colorimetric method (MTS
kit, Promega) on an EnVision 2104 multilabel plate reader (Perkin Elmer). Representative CIMP
cell lines with specific LRES were identified by selecting cell lines having both a) >80% of the
LRES CpGs hypermethylated and b) the majority of LRES genes being underexpressed relative
to the least methylated line (SNU16). RNA was extracted from control and 5-aza-dC treated
lines, and mRNAs were hybridized to Illumina Human WG-6 Beadchips (Illumina) to identify
reactivated genes. All drug treatment and expression array profiles were performed in triplicate
for each cell line. Animal studies were performed in compliance with Institutional Animal Care
and Use Committee (IACUC) policies at VARI. Six-week-old female BALB/c nu/nu nude mice
(Charles River) were injected subcutaneously with 2 × 10^6 AZ-521 cells in the right flank. Tumor
growth was measured three times per week using digital calipers (Mitutoyo) and tumor volume
was calculated as length × width × height × 0.5. Drug treatments were initiated once the tumors
grew to 200-250 cubic millimeters. Drug treatments were performed for 6 days and the treatment
groups were: Group 1: vehicle control; Group 2: 5-aza-dC alone, 3 intraperitoneal injections at 5
mg/kg daily at 10:00, 13:00, and 16:00 (total dose 15 mg/kg per mouse); Group 3: cisplatin alone,
one intraperitoneal injection at 6 mg/kg; Group 4: 5-aza-dC and cisplatin combination, 5-aza-dC
as 3 intraperitoneal injections at 5 mg/kg at 10:00, 13:00, and 16:00 (total dose 15 mg/kg per
mouse) and cisplatin as one intraperitoneal injection on day 0 at 6 mg/kg. Cisplatin was
administered intraperitoneally in 0.1 ml 0.9% saline solution, and 5-aza-dC (Sigma, Cat #
A3656) was delivered in 0.1 ml phosphate-buffered saline (PBS) intraperitoneally. IP injections
were delivered using a 25G 5/8 inch needle. Tumor volume was measured 3 times per week
during the therapeutic procedure.
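The tumor volume formula and the dosing arithmetic above can be checked with a short Python sketch. The caliper readings are hypothetical, and the function names are introduced here for illustration only.

```python
def tumor_volume_mm3(length_mm, width_mm, height_mm):
    """Tumor volume as length x width x height x 0.5, as stated in the text."""
    return length_mm * width_mm * height_mm * 0.5


def in_treatment_start_range(volume_mm3, low=200.0, high=250.0):
    """True when the volume lies in the 200-250 cubic millimeter range from the text."""
    return low <= volume_mm3 <= high


def total_daily_dose_mg_per_kg(dose_per_injection, injections_per_day):
    """Total daily dose in mg/kg for repeated injections of the same dose."""
    return dose_per_injection * injections_per_day


volume = tumor_volume_mm3(10.0, 8.0, 6.0)      # hypothetical caliper readings in mm
start = in_treatment_start_range(volume)
daily_aza = total_daily_dose_mg_per_kg(5.0, 3)  # three injections at 5 mg/kg
```

The last line reproduces the total 5-aza-dC dose of 15 mg/kg per day given in the text.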
SUPPLEMENTARY FIGURES
[Fig. S1 plot: scatter plot of Infinium (x-axis) against GoldenGate (y-axis) methylation levels, both axes from 0 to 1; correlation coefficient = 0.95.]
Fig. S1. Correlation between Infinium and GoldenGate methylation profiles
The graph shows the level of correlation between CpG probes measured using either the Infinium methylation array platform (x-axis) or the GoldenGate methylation array platform (y-axis). Each point represents a CpG probe, of which 112 are present on both platforms. This graph shows the correlation of Infinium and GoldenGate data for the YCC-11 cell line.
Fig. S2. Unsupervised clustering of gastric tumors and nonmalignant gastric normals using DNA methylation patterns
The heat-map represents 203 gastric cancers and 94 matched nonmalignant gastric samples clustered using the top 1000 CpG probes demonstrating the highest β-value standard deviations across the entire cohort of 297 primary samples. The majority of tumors (orange, 89%) cluster together, as do the nonmalignant samples (yellow, 93%). The blue bar represents a set of tumors with “normal-like” methylation patterns. Even in this situation, however, the tumors cluster together, indicating that they are still distinct from nonmalignant gastric samples.
[Fig. S3 Venn diagram: expression array only, 2,817 genes; methylation array only, 6,725 genes; overlap without differential methylation, 2,867 genes; central set, 4,903 genes (6,745 differentially methylated CpGs, T vs. N analysis) used in global methylation – expression correlation analysis.]
Fig. S3. Intersection of genes with DNA methylation and gene expression information
The Venn diagram shows the degree of overlap between genes probed on Infinium Methylation and Affymetrix U133 expression arrays. A total of 10,587 (2,817 + 2,867 + 4,903) genes profiled on the expression array fulfilled the expression intensity filtering criteria, and a total of 14,495 (6,725 + 2,867 + 4,903) genes were profiled on the methylation array, of which 7,770 genes overlap between the two platforms. The central set of 4,903 genes, which a) possess both methylation and gene expression information and b) exhibit differential methylation between tumors and nonmalignant tissues, was used in the global methylation – expression correlation analysis. Probes targeting the X and Y chromosomes were removed prior to performing the differential methylation analysis, so all 4,903 genes used in the global correlation analysis are located on autosomes.
Fig. S4. Overlap of H3K27ac and H3K4me3 ChIP-Seq binding peaks with CpG sites exhibiting tumor hypermethylation and increased expression
To test whether the positively correlated CpG sites might correspond to enhancer or cryptic promoter regions, we intersected the locations of the CpG sites with a separate list of regions corresponding to either high H3K27ac histone signals (active promoters and enhancers) or high H3K4me3 signals (gene TSSs) in primary gastric tumors (generated from in-house ChIP-sequencing data). Interestingly, many of the positively correlated CpGs were found to lie close to high H3K4me3 signals (e.g., C9orf78, RANBP1, ZFYVE16, HDAC3). This result suggests that some of these positively correlated CpGs may occupy cryptic promoters, rather than distal enhancers.
H3K27ac (promoters and enhancers) and H3K4me3 (transcription start sites of genes) profiles were from primary gastric cancers. Each dot represents a ChIP-seq binding peak, and CpG sites of interest are indicated with the black dashed line. Black solid lines and arrows indicate annotated transcription start sites of genes and the direction of transcription. X-axis: genomic coordinate. Y-axis: ChIP-seq binding height.
[Fig. S5 plots. Panels (a) to (c): scatter plots for CHFR and MEST of expression intensity versus gene body DNA methylation level, expression intensity versus promoter DNA methylation level, and gene body versus promoter DNA methylation level; methylation axes run from 0 to 1. Correlation coefficients printed on the panels (in extraction order): 0.44, -0.62, -0.66, 0.46, -0.51, -0.61, each with corrected p < 1×10^-3. Panel (d): schematic of the CHFR and MEST genes (promoter, gene body, CGI) in tumors and non-malignant samples, with legend entries Hypermethylated, Hypomethylated, Overexpressed, and Underexpressed (T vs. N).]
Fig. S5. Tandem gene body and promoter methylation alterations in genes
CHFR (left) and MEST (right) are shown as examples. For each gene, data are drawn from the individual gene body and promoter CpGs exhibiting the strongest absolute correlations with expression. (a) Relationship between expression intensity and gene body DNA methylation level. (b) Relationship between expression intensity and promoter DNA methylation level. (c) Relationship between gene body and promoter DNA methylation levels, confirming occurrence in the same samples. (d) Schematic representation of promoter and gene body methylation relationships to gene expression alterations in tumor samples relative to nonmalignant samples. CGI locations are shown as blue bars.
Fig. S6. Clustering of tumor samples based on DNA methylation data
[Fig. S6 heatmap: samples grouped as CIMP and non-CIMP; color scale from hypomethylation to hypermethylation; row labels CDH1, MLH1/CDKN2A, MGMT, and CDKN2A.]
Heatmap representation of tumor clustering using CpG methylation data selected based on an across-tumor standard deviation probe set. Approximate rows corresponding to CpGs associated with MLH1, CDKN2A, MGMT, and CDH1 genes are indicated.
Fig. S7. Clustering of gastric cancers by gene expression or DNA methylation
[Fig. S7 heatmaps, panels (a) and (b): samples grouped as CIMP and non-CIMP.]
161 tumors were analyzed, corresponding to samples for which gene expression and methylation information was available. Samples were clustered based on (a) gene expression levels of 1324 genes showing significant methylation-expression correlation relationships, or (b) methylation levels of the 1653 CpGs corresponding to these genes. Gastric CIMP tumors (blue) are more readily discernible from non-CIMP tumors (pink) by DNA methylation clustering. A differential expression analysis between CIMP and non-CIMP tumors also identified 9 genes (DEDD2, MEN1, ZNF581, GPR108, LUZP1, KIAA0319L, CALM3, NAPA, PROM1) as significantly differentially expressed between the CIMP and non-CIMP subgroups (p<0.05, corrected for multiple testing).
Fig. S8. High-resolution methylation patterns in LRES5
Gastric CIMP tumors exhibiting evidence of cancer-specific LRES5 hypermethylation were profiled on high resolution Infinium 450K methylation arrays. Compared to the 27K array that carries 34 CpG probes in this region, the 450K array carries 411 CpG probes. Y-axis: Differences in DNA methylation levels with respect to genomic coordinate for tumors relative to normal samples across all LRES5 CpGs profiled with the Illumina 450K array. The blue line indicates the locally estimated scatterplot smoothing (LOESS) line, with the semi-transparent ribbon indicating the standard error. The bottom tracks represent the locations of the CpG site, CGIs and LRES5 genes.
Fig. S9. Hypo-LRRs: long-range regions (LRRs) of hypomethylation
(a) Boxplots showing the numbers of significant CpG-CpG methylation correlations per CpG, for either random genomic regions or hypomethylated LRRs (hypo-LRRs). The p-value was computed using the two-sided Mann-Whitney test. (b) Heatmap representation of gastric tumors clustered by hypo-LRR methylation status. The color bar represents the extent of hypo-LRR ("Activity"; see Methods), from 0 to 1. The orange bar indicates samples exhibiting low methylation in the hypo-LRRs. The bottom bar confirms that tumors with hypo-LRRs were significantly associated with non-CIMP tumors.
[Fig. S9 plots. Panel (a): box plots of the number of significant CpG-CpG methylation correlations per CpG (y-axis) for hypomethylated LRRs and random genomic regions; p = 6.2×10^-14. Panel (b): heatmap of hypomethylated LRRs across CIMP and non-CIMP tumors; color bar: LRR methylation from 0 to 1; annotation bars: samples exhibiting low methylation levels in the hypo-LRRs, G-CIMP samples, and G-nonCIMP samples; p = 0.02.]
Fig. S10. Box plots relating copy number alterations (CNAs) with individual hypomethylated regions (Hypo-LRRs 1 to 24)
Hypo-LRRs were identified using an "Activity" score similar to that used to define LRESs, but analyzing hypomethylated CpGs rather than hypermethylated CpGs. Each dot represents one hypo-LRR in one tumor. X-axis: Level of hypo-activity from 0 to 1, marking increasing hypomethylation from left to right. Y-axis: CNA level, represented by the LRR (log relative ratio). p-values were calculated using a correlation test of standard deviation between CNA LRR and hypo-LRR activity.
SUPPLEMENTARY TABLES
Table S1. Clinical characteristics of gastric cancer patients analyzed in this study
This Table provides clinical information for 188 out of 203 patients profiled in this study. Clinical information was not available for 15 patients. Age information was not available for an additional 23 patients. Stage categories were based on the American Joint Committee on Cancer (AJCC) 6th edition classification. Median follow-up for patients who are still alive is 47.23 months.
Clinical characteristics          Gastric Cancers (N = 188)
Age
  Range                           23-92
  Mean, S.D.                      63, 12.6
  Unknown (No. of patients)       23
Gender
  Male                            121
  Female                          67
Lauren classification
  Intestinal                      97
  Diffuse                         71
  Mixed                           20
Grade
  Undifferentiated                2
  Poorly differentiated           114
  Moderately differentiated       67
  Well differentiated             5
Stage
  1                               28
  2                               29
  3                               70
  4                               61
Table S2. Correlation between Infinium and GoldenGate methylation
Correlation coefficients between Infinium and GoldenGate DNA methylation profiles for 26 gastric cancer cell lines. Correlation values are based on 112 CpG sites probed on both arrays. The average correlation value is 0.91.
Cell line name      Correlation coefficient
YCC-11              0.95
YCC-10              0.94
NCI-N87             0.94
YCC-16              0.94
AZ-521              0.94
MKN45               0.93
AGS                 0.93
Takigawa            0.93
MKN7                0.93
SNU1                0.92
YCC-3               0.92
YCC-6               0.91
Hs 738.St/Int       0.91
FU97                0.91
YCC-1               0.91
Ist-1               0.91
SNU5                0.91
SCH                 0.90
YCC-7               0.90
Hs 1.Int            0.90
Hs 746T             0.89
TMK1                0.89
IM95                0.88
MKN1-NCC            0.88
KATOIII             0.86
SNU16               0.80
Table S3. GSEA of hypermethylated genes underexpressed in tumors
Gene sets mentioned in the Results are highlighted in the shaded boxes.
Rank  Signature name                                       Genes in signature  Overlap genes  Corrected p-value  Signature source
1     STEIN_ESRRA_TARGETS                                  538                 69             0×10^0             Stein et al., 2008
2     FLECHNER_BIOPSY_KIDNEY_TRANSPLANT_REJECTED_VS_OK_DN  557                 74             0×10^0             Flechner et al., 2004
3     MOOTHA_MITOCHONDRIA                                  455                 62             0×10^0             [24]
4     MOOTHA_HUMAN_MITODB_6_2002                           440                 65             0×10^0             [24]
5     MOOTHA_PGC                                           331                 49             0×10^0             [24]
6     STEIN_ESRRA_TARGETS_UP                               391                 63             0×10^0             Stein et al., 2008
7     WONG_MITOCHONDRIA_GENE_MODULE                        209                 40             0×10^0             Wong et al., 2008
8     CREIGHTON_ENDOCRINE_THERAPY_RESISTANCE_3             723                 71             1.66×10^-13        Creighton et al., 2008
9     VECCHI_GASTRIC_CANCER_EARLY_DN                       375                 49             1.66×10^-13        [25]
10    KAAB_HEART_ATRIUM_VS_VENTRICLE_DN                    267                 40             2.12×10^-12        Kaab et al., 2004
Table S4. GSEA of hypomethylated genes overexpressed in tumors
Gene sets mentioned in the Results are highlighted in the shaded boxes.
Rank  Signature name                             Genes in signature  Overlap genes  Corrected p-value  Signature source
1     HOSHIDA_LIVER_CANCER_SUBCLASS_S1           237                 14             4.11×10^-6         Hoshida et al., 2009
2     VECCHI_GASTRIC_CANCER_EARLY_UP             434                 18             5.31×10^-6         [25]
3     WU_CELL_MIGRATION                          186                 12             2.33×10^-5         [26]
4     DELYS_THYROID_CANCER_UP                    400                 16             7.22×10^-5         Delys et al., 2007
5     CREIGHTON_ENDOCRINE_THERAPY_RESISTANCE_3   723                 20             5.57×10^-4         Creighton et al., 2008
6     SMID_BREAST_CANCER_LUMINAL_B_DN            599                 17             3.40×10^-3         Smid et al., 2008
7     CHARAFE_BREAST_CANCER_LUMINAL_VS_BASAL_DN  456                 14             1.24×10^-2         Charafe-Jauffret et al., 2006
8     ENK_UV_RESPONSE_KERATINOCYTE_UP            537                 15             1.76×10^-2         Enk et al., 2006
9     SCHUETZ_BREAST_CANCER_DUCTAL_INVASIVE_UP   355                 12             2.31×10^-2         Schuetz et al., 2006
10    KONDO_EZH2_TARGETS                         146                 8              2.46×10^-2         Kondo et al., 2008
Table S5. Association of positively correlated CpGs to H3K27ac and H3K4me3 peaks
CpG row columns: Chr; CpG coordinate; Gene; CGI(a); CpG distance from TSS (bp)(b); Methylation - expression correlation coefficient; Corrected p-value; Difference in DNA methylation (tumors vs. normal samples)(c); Difference in gene expression (tumors vs. normal samples)(d)
Peak row columns (indented, listed under each CpG): Histone mark; Peak location; Peak height(e)
5  140997957  HDAC3  TRUE  -1361  0.303  0  0.034  0.003
    H3K27ac  140993816  17
    H3K27ac  140993835  18
    H3K27ac  140994725  22
    H3K27ac  140994728  22
    H3K27ac  140995942  38
    H3K27ac  140995942  33
    H3K27ac  140995965  45
    H3K27ac  140996425  20
    H3K27ac  140996951  11
    H3K27ac  140997015  12
    H3K27ac  140997434  17
    H3K27ac  140997507  22
    H3K27ac  140997635  23
    H3K27ac  140998945  8
    H3K27ac  140999000  9
    H3K27ac  141000575  9
    H3K27ac  141001267  17
    H3K27ac  141001275  18
    H3K27ac  141002185  7
    H3K27ac  141002925  10
    H3K4me3  140994740  9
    H3K4me3  140994765  9
    H3K4me3  140996099  67
    H3K4me3  140996152  57
    H3K4me3  140996445  71
    H3K4me3  140996935  70
    H3K4me3  140997785  87
    H3K4me3  140997786  64
    H3K4me3  140999857  7
    H3K4me3  140999875  7
    H3K4me3  141001290  15
    H3K4me3  141001335  16
16  83410410  CRISPLD2  TRUE  -703  0.3394  0  0.072  0.067
    H3K27ac  83409095  8
    H3K27ac  83410165  8
    H3K27ac  83411685  18
    H3K27ac  83411774  19
    H3K27ac  83412588  31
    H3K27ac  83412625  41
    H3K27ac  83413356  22
    H3K27ac  83413375  25
    H3K27ac  83413981  21
    H3K27ac  83414085  23
    H3K27ac  83414685  7
    H3K4me3  83405655  5
    H3K4me3  83409825  5
    H3K4me3  83411265  38
    H3K4me3  83411725  52
    H3K4me3  83411779  58
    H3K4me3  83412522  76
    H3K4me3  83412537  59
    H3K4me3  83412565  82
    H3K4me3  83413209  25
    H3K4me3  83413255  34
    H3K4me3  83413273  20
    H3K4me3  83414695  8
    H3K4me3  83414715  8
1  3557410  WDR8  TRUE  -913  0.3352  0  0.155  0.075
    H3K27ac  3554975  6
    H3K27ac  3555808  23
    H3K27ac  3555842  21
    H3K27ac  3555845  24
    H3K27ac  3559735  12
    H3K27ac  3559775  15
    H3K27ac  3561295  27
    H3K27ac  3561304  24
    H3K4me3  3552965  4
    H3K4me3  3555834  66
    H3K4me3  3555852  55
    H3K4me3  3555885  77
    H3K4me3  3556705  4
    H3K4me3  3558270  11
    H3K4me3  3558271  11
    H3K4me3  3558285  12
    H3K4me3  3559730  19
    H3K4me3  3559775  29
    H3K4me3  3559786  23
    H3K4me3  3561322  19
    H3K4me3  3561385  21
5  79739149  ZFYVE16  TRUE  -487  0.3472  0  0.062  0.305
    H3K27ac  79738649  36
    H3K27ac  79738665  38
    H3K27ac  79739322  44
    H3K27ac  79739324  54
    H3K27ac  79739355  59
    H3K27ac  79740242  19
    H3K27ac  79740269  14
    H3K27ac  79740305  22
    H3K27ac  79741435  10
    H3K4me3  79738575  6
    H3K4me3  79739314  61
    H3K4me3  79739318  68
    H3K4me3  79739345  73
    H3K4me3  79740155  92
    H3K4me3  79740161  89
    H3K4me3  79740185  100
1  89870192  LRRC8C  FALSE  -1158  0.3063  0  0.112  0.389
    H3K27ac  89867535  8
    H3K27ac  89869915  5
    H3K27ac  89870597  26
    H3K27ac  89870605  25
    H3K27ac  89870625  27
    H3K27ac  89872177  29
    H3K27ac  89872193  37
    H3K27ac  89872235  44
    H3K27ac  89872625  26
    H3K27ac  89873035  13
    H3K27ac  89874765  24
    H3K27ac  89874767  23
    H3K4me3  89866435  4
    H3K4me3  89870907  34
    H3K4me3  89870908  36
    H3K4me3  89870915  37
    H3K4me3  89872191  86
    H3K4me3  89872193  67
    H3K4me3  89872225  101
6  44342489  NFKBIE  FALSE  -986  0.3501  0  0.105  0.629
    H3K27ac  44337715  17
    H3K27ac  44338630  101
    H3K27ac  44338634  75
    H3K27ac  44338655  111
    H3K27ac  44339298  119
    H3K27ac  44339300  136
    H3K27ac  44339375  155
    H3K27ac  44339795  127
    H3K27ac  44340225  100
    H3K27ac  44340665  80
    H3K27ac  44341505  6
    H3K27ac  44342265  32
    H3K27ac  44342308  24
    H3K27ac  44342312  31
    H3K27ac  44343025  18
    H3K27ac  44343029  16
    H3K27ac  44343995  14
    H3K27ac  44344045  13
    H3K27ac  44344759  12
    H3K27ac  44344875  10
    H3K27ac  44345487  12
    H3K27ac  44345505  12
    H3K27ac  44347146  23
    H3K27ac  44347195  28
    H3K4me3  44338105  10
    H3K4me3  44338618  47
    H3K4me3  44338624  40
    H3K4me3  44338655  56
    H3K4me3  44339301  115
    H3K4me3  44339315  127
    H3K4me3  44340235  109
    H3K4me3  44340650  143
    H3K4me3  44340668  107
    H3K4me3  44340685  162
    H3K4me3  44342320  12
    H3K4me3  44342335  12
    H3K4me3  44343104  9
    H3K4me3  44343125  9
    H3K4me3  44344085  6
    H3K4me3  44344762  12
    H3K4me3  44344875  10
    H3K4me3  44345485  7
    H3K4me3  44347155  15
    H3K4me3  44347205  21
9  131638595  C9orf78(1)  FALSE  -1214  0.3627  0  0.082  0.631
    H3K27ac  131634115  5
    H3K27ac  131636709  55
    H3K27ac  131636747  49
    H3K27ac  131636755  64
    H3K27ac  131638112  33
    H3K27ac  131638129  41
    H3K27ac  131638135  48
    H3K27ac  131638615  6
    H3K27ac  131639535  8
    H3K27ac  131641824  21
    H3K27ac  131641845  22
    H3K27ac  131642545  5
    H3K4me3  131634045  6
    H3K4me3  131635748  14
    H3K4me3  131635765  14
H3K4me3 131637083 123 H3K4me3 131637094 102 H3K4me3 131637115 148 H3K4me3 131638108 84 H3K4me3 131638124 103 H3K4me3 131638155 118 H3K4me3 131638602 16 H3K4me3 131638635 15 H3K4me3 131639654 13 H3K4me3 131639665 14 H3K4me3 131640565 3 H3K4me3 131641833 11 H3K4me3 131641855 11 H3K4me3 131642501 7 H3K4me3 131642555 9 H3K4me3 131643105 6
9 131638707 C9orf78(2) FALSE -1326 0.4385 0 0.112 0.631 H3K27ac 131634115 5 H3K27ac 131636709 55 H3K27ac 131636747 49 H3K27ac 131636755 64 H3K27ac 131638112 33 H3K27ac 131638129 41 H3K27ac 131638135 48 H3K27ac 131638615 6 H3K27ac 131639535 8 H3K27ac 131641824 21 H3K27ac 131641845 22 H3K27ac 131642545 5 H3K4me3 131634045 6 H3K4me3 131635748 14 H3K4me3 131635765 14 H3K4me3 131637083 123 H3K4me3 131637094 102 H3K4me3 131637115 148 H3K4me3 131638108 84 H3K4me3 131638124 103
H3K4me3 131638155 118 H3K4me3 131638602 16 H3K4me3 131638635 15 H3K4me3 131639654 13 H3K4me3 131639665 14 H3K4me3 131640565 3 H3K4me3 131641833 11 H3K4me3 131641855 11 H3K4me3 131642501 7 H3K4me3 131642555 9 H3K4me3 131643105 6
22 18483541 RANBP1(1) FALSE -1483 0.3233 0 0.147 1.083 H3K27ac 18478828 18 H3K27ac 18478865 21 H3K27ac 18479511 12 H3K27ac 18479595 14 H3K27ac 18481385 9 H3K27ac 18481424 9 H3K27ac 18482615 25 H3K27ac 18482622 24 H3K27ac 18483293 21 H3K27ac 18483305 21 H3K27ac 18484202 45 H3K27ac 18484215 46 H3K27ac 18484242 42 H3K27ac 18485725 19 H3K27ac 18485750 17 H3K27ac 18485774 15 H3K27ac 18486427 17 H3K27ac 18486499 18 H3K27ac 18486515 18 H3K27ac 18487015 5 H3K4me3 18479202 7 H3K4me3 18481285 7 H3K4me3 18482554 17 H3K4me3 18482565 17 H3K4me3 18483319 18
H3K4me3 18483333 18 H3K4me3 18483335 18 H3K4me3 18484232 122 H3K4me3 18484241 106 H3K4me3 18484265 131 H3K4me3 18484995 3 H3K4me3 18485734 61 H3K4me3 18485738 66 H3K4me3 18485765 76 H3K4me3 18486495 11 H3K4me3 18487163 7 H3K4me3 18487165 7
22 18483649 RANBP1(2) TRUE -1375 0.3654 0 0.148 1.083 H3K27ac 18478828 18 H3K27ac 18478865 21 H3K27ac 18479511 12 H3K27ac 18479595 14 H3K27ac 18481385 9 H3K27ac 18481424 9 H3K27ac 18482615 25 H3K27ac 18482622 24 H3K27ac 18483293 21 H3K27ac 18483305 21 H3K27ac 18484202 45 H3K27ac 18484215 46 H3K27ac 18484242 42 H3K27ac 18485725 19 H3K27ac 18485750 17 H3K27ac 18485774 15 H3K27ac 18486427 17 H3K27ac 18486499 18 H3K27ac 18486515 18 H3K27ac 18487015 5 H3K4me3 18479202 7 H3K4me3 18481285 7 H3K4me3 18482554 17 H3K4me3 18482565 17
H3K4me3 18483319 18 H3K4me3 18483333 18 H3K4me3 18483335 18 H3K4me3 18484232 122 H3K4me3 18484241 106 H3K4me3 18484265 131 H3K4me3 18484995 3 H3K4me3 18485734 61 H3K4me3 18485738 66 H3K4me3 18485765 76 H3K4me3 18486495 11 H3K4me3 18487163 7 H3K4me3 18487165 7
(a) CpG location within CGI denoted by “TRUE”.
(b) Distances upstream and downstream of TSSs are represented by negative and positive values, respectively.
(c) Difference in DNA methylation (tumors vs. normal samples) is the mean normal methylation level subtracted from the mean tumor methylation level.
(d) Difference in expression (tumors vs. normal samples) is the log2 transformed ratio of the mean tumor expression intensity to the mean normal expression intensity.
(e) Peak height represents the extent of binding.
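The two difference measures defined in footnotes (c) and (d) can be sketched in a few lines. This is a minimal illustration of the definitions only; the function names and the sample values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean


def methylation_difference(tumor_levels, normal_levels):
    """Mean tumor methylation minus mean normal methylation (footnote c)."""
    return mean(tumor_levels) - mean(normal_levels)


def expression_log2_ratio(tumor_intensities, normal_intensities):
    """log2 of mean tumor intensity over mean normal intensity (footnote d)."""
    return math.log2(mean(tumor_intensities) / mean(normal_intensities))


# Hypothetical values for illustration only (not data from the paper).
tumor_methylation = [0.40, 0.50, 0.60]
normal_methylation = [0.30, 0.40, 0.50]
tumor_expression = [200.0, 300.0, 400.0]
normal_expression = [100.0, 150.0, 200.0]

delta_methylation = methylation_difference(tumor_methylation, normal_methylation)
log2_expression = expression_log2_ratio(tumor_expression, normal_expression)
print(round(delta_methylation, 3), round(log2_expression, 3))
```

A positive value in either measure indicates a higher level in tumors than in normal samples, which is how the positive entries in the two difference columns of Table S5 should be read.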
Table S6. Genes exhibiting correlations between gene body methylation and expression
Genes in this table all exhibit significant positive gene body expression – methylation correlations.
Gene name | Distance from TSS (kb) | Correlation coefficient | DNA methylation (ave. T – ave. N) | Gene expression (log2(ave. T/ave. N))
MEST | 5.21 | 0.46 | 0.06 | 1.70
PTPRO | 25.85 | 0.23 | -0.09 | -1.36
CDKN2A | 26.12 | 0.35 | 0.14 | 1.20
SERPINB5 | 13.40 | 0.24 | 0.07 | 1.18
SEMA3B | 9.25 | 0.21 | -0.06 | -0.41
CHFR | 28.32 | 0.44 | -0.12 | -0.33
RUNX3 | 33.14 | 0.27 | 0.14 | 0.28
RB1 | 17.57 | 0.31 | 0.08 | 0.21
ITPR2 | 163.26 | 0.24 | -0.06 | -0.19
BCL2 | 82.20 | 0.20 | -0.06 | 0.05
ATM | 38.05 | 0.22 | -0.04 | 0.01
Genes are sorted by decreasing difference in gene expression magnitude (last column). The difference in expression is the log2 transformed ratio of the mean tumor expression intensity to the mean normal expression intensity. The difference in DNA methylation is the mean normal methylation level subtracted from the mean tumor methylation level. Data are shown from CpGs exhibiting highest correlation coefficient magnitude for each gene.
Table S7. Genes exhibiting tandem promoter and gene body methylation
Genes harboring gene body CpGs with positive methylation – expression relationships and promoter CpGs with negative methylation – expression relationships.
Gene name | Promoter CpG distance from TSS (bp) (a) | Promoter CGI (b) | Promoter methylation - expression correlation coefficient | Gene body CpG distance from TSS (kb) (a) | Gene body CGI (b) | Gene body methylation - expression correlation coefficient | Promoter - intragenic methylation correlation coefficient (c)
CHFR | 490 | TRUE | -0.62 | 28.32 | TRUE | 0.44 | -0.66 (p=2.58×10^-28)
MEST | -210 | TRUE | -0.51 | 5.21 | TRUE | 0.46 | -0.61 (p=5.76×10^-24)
PTPRO | -168 | TRUE | -0.28 | 25.85 | TRUE | 0.23 | -0.61 (p=1.48×10^-23)
CDKN2A | 267 | TRUE | -0.17 | 26.12 | TRUE | 0.35 | 0.45 (p=3.33×10^-12)
BCL2 | 1,233 (d) | TRUE | -0.24 | 82.20 | TRUE | 0.20 | -0.29 (p<2.58×10^-28)
RUNX3 | -597 | FALSE | -0.16 (e) | 33.14 | TRUE | 0.27 | -0.23 (p=6.00×10^-4)
(a) Distances upstream and downstream of TSSs are represented by negative and positive values, respectively.
(b) CpG location within CGI denoted by “TRUE”.
(c) Corrected correlation coefficient p-values are shown in parentheses.
(d) The BCL2 promoter CpG is located outside of promoter region boundaries defined in the main text (-1,500 to 1,000 bp from the TSS), but was included in downstream analysis due to its close proximity to the TSS (+1,233 bp).
(e) The associated corrected p-value from the global expression – methylation correlation analysis for this CpG is slightly above the significance threshold of 0.05 (p=0.058).
Data correspond to individual promoter and gene body CpGs exhibiting strongest absolute correlation with expression.
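The promoter-window rule quoted in the BCL2 footnote of Table S7 (-1,500 to 1,000 bp from the TSS) can be illustrated with a small check over the promoter CpG distances listed in the table. This is only a sketch: the function name is hypothetical, and treating both boundaries as inclusive is an assumption, since the supplement does not say how boundary positions are handled.

```python
# Promoter window quoted in the Table S7 footnote, in bp relative to the TSS.
PROMOTER_START_BP = -1500
PROMOTER_END_BP = 1000


def in_promoter_window(distance_bp, start=PROMOTER_START_BP, end=PROMOTER_END_BP):
    """Return True if a CpG lies inside the promoter window.

    Inclusive boundaries are an assumption made for this illustration.
    """
    return start <= distance_bp <= end


# Promoter CpG distances from the TSS (bp), transcribed from Table S7.
promoter_cpg_distance_bp = {
    "CHFR": 490,
    "MEST": -210,
    "PTPRO": -168,
    "CDKN2A": 267,
    "BCL2": 1233,
    "RUNX3": -597,
}

outside_window = [
    gene
    for gene, distance in promoter_cpg_distance_bp.items()
    if not in_promoter_window(distance)
]
print(outside_window)  # ['BCL2']
```

Only the BCL2 CpG falls outside the window, consistent with the footnote explaining why it was nevertheless retained.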
Table S8. Clinicopathologic associations of CIMP and non-CIMP tumors
Clinical characteristic | CIMP samples (N = 68) | non-CIMP samples (N = 135) | p-value | Test | Conclusion
Age | | | 0.008 | Wilcoxon rank sum | CIMP patients are younger than non-CIMP patients
Range | 28-88 | 23-92
Mean, S.D. | 59, 13.4 | 65, 11.6
Unknown | 21 | 17
Gender | | | 0.41 | Fisher’s exact | No gender association
Male | 40 | 81
Female | 18 | 49
Unknown | 10 | 5
Grade | | | 0.052 | Fisher’s exact | CIMP: undifferentiated/poorly differentiated; non-CIMP: moderately/well differentiated
Undifferentiated | 0 | 2
Poorly differentiated | 42 | 72
Moderately differentiated | 15 | 52
Well differentiated | 1 | 4
Unknown | 10 | 5
EBV Status* | | | 1 | Fisher’s exact | No association with EBV
Positive | 5 | 2
Negative | 80 | 16
MSI Status** | | | 0.82 | Fisher’s exact | No association with MSI
Positive | 1 | 5
Negative | 31 | 61
Unknown | 36 | 69
Lauren classification | | | 0.15 | Fisher’s exact (b) | CIMP, diffuse and mixed; non-CIMP, intestinal (not significant)
Intestinal | 25 | 72
Diffuse | 26 | 45
Mixed/Others | 7 | 13
Unknown | 10 | 5
Stage (a) | | | 0.73 | Fisher’s exact | No association with stage
1 | 7 | 21
2 | 9 | 20
3 | 20 | 50
4 | 22 | 39
Unknown | 10 | 5
(a) AJCC 6th edition classification.
(b) Diffuse and mixed samples grouped for analysis.
* 103 tumors were typed for EBV status.
** 98 tumors were typed for MSI status.
Significant p-values are highlighted in bold.
Table S9. Multivariate analysis of survival associations of CIMP tumors
Factors in the univariate and multivariate models were compared using Cox regression analysis. Significant p-values are highlighted in bold (p<0.05). Only factors found significant in univariate analysis were considered in the multivariate analysis.
Covariate | Univariate HR (95.0% CI) | Univariate P | Multivariate HR (95.0% CI) | Multivariate P
Subtype
non-CIMP | 1 | | |
CIMP | 1.641 (1.115 to 2.415) | 0.012 | 1.644 (1.077 to 2.509) | 0.021
Age
Continuous Variable | 1.012 (0.997 to 1.027) | 0.118 | - | -
Gender
Female | | | - | -
Male | 1.085 (0.805 to 1.462) | 0.593 | - | -
Pathology
Intestinal | | 0.254 | - | -
Diffuse | 1.254 (0.917 to 1.714) | 0.157 | - | -
Unclassifiable | 0.874 (0.515 to 1.483) | 0.617 | - | -
AJCC6 staging
Stage I | | < 0.001 | 1 | < 0.001
Stage II | 1.666 (0.904 to 3.068) | 0.101 | 2.300 (0.867 to 6.098) | 0.094
Stage III | 4.208 (2.447 to 7.236) | < 0.001 | 5.609 (2.347 to 13.408) | < 0.001
Stage IV | 7.573 (4.325 to 13.258) | < 0.001 | 9.498 (3.929 to 22.959) | < 0.001
Margins
Negative | 1 | | |
Positive | 1.918 (1.323 to 2.780) | 0.001 | 1.724 (1.077 to 2.759) | 0.023
Grade
Well differentiated | | 0.128 | 1 | 0.913
Moderately differentiated | 3.444 (0.844 to 14.056) | 0.085 | 2.051 (0.269 to 15.645) | 0.489
Poorly differentiated | 4.221 (1.041 to 17.115) | 0.044 | 1.938 (0.253 to 14.857) | 0.524
Undifferentiated | 2.278 (0.206 to 25.143) | 0.502 | 1.825 (0.107 to 31.225) | 0.678
Table S10. GSEA of genes hypermethylated in CIMP gastric tumors
Table S10a. This table depicts GSEA results after querying significantly hypermethylated CpGs against all gene sets in MSigDB. Gene sets mentioned in the Results are highlighted in the shaded boxes.
Rank | Signature Name | Number of genes in signature | Number of overlap genes | Corrected p-value | Signature source / GO term
1 | GGGCGGR_V$SP1_Q6 | 3053 | 393 | 3.22×10^-61 | Broad Institute
2 | STEMCELL_NEURAL_UP | 1838 | 284 | 9.86×10^-59 | [33]
3 | CYTOPLASM | 2137 | 282 | 4.50×10^-44 | GO:0005737
4 | CAGGTG_V$E12_Q6 | 2571 | 259 | 4.91×10^-21 | Broad Institute
5 | BIOPOLYMER_METABOLIC_PROCESS | 1687 | 251 | 3.99×10^-48 | GO:0043283
6 | GGGAGGRR_V$MAZ_Q6 | 2332 | 250 | 5.07×10^-24 | Broad Institute
7 | STEMCELL_HEMATOPOIETIC_UP | 1452 | 230 | 1.86×10^-48 | [33]
8 | NUCLEUS | 1433 | 225 | 1.22×10^-46 | GO:0005634
9 | TTGTTT_V$FOXO4_01 | 2149 | 218 | 1.55×10^-17 | Broad Institute
10 | CTTTGT_V$LEF1_Q2 | 2036 | 214 | 5.28×10^-19 | Broad Institute
11 | AACTTT_UNKNOWN | 1963 | 210 | 1.63×10^-19 | Broad Institute
12 | STEMCELL_EMBRYONIC_UP | 1344 | 205 | 3.75×10^-40 | [33]
Table S10b. This table depicts GSEA results after querying high across-tumor standard deviation CpGs against the chemical and genetic perturbation gene sets in the MSigDB. Gene sets mentioned in the Results are highlighted in the shaded boxes.
Rank | Signature Name | Number of genes in signature | Number of overlap genes | Corrected p-value | Signature source
1 | BENPORATH_ES_WITH_H3K27ME3 | 1117 | 508 | 0×10^0 | [34]
2 | BENPORATH_SUZ12_TARGETS | 1037 | 472 | 0×10^0 | [34]
3 | BENPORATH_EED_TARGETS | 1062 | 463 | 0×10^0 | [34]
4 | BLALOCK_ALZHEIMERS_DISEASE_UP | 1720 | 408 | 0×10^0 | Blalock et al., 2004
5 | BLALOCK_ALZHEIMERS_DISEASE_DN | 1259 | 329 | 0×10^0 | Blalock et al., 2004
6 | DIAZ_CHRONIC_MEYLOGENOUS_LEUKEMIA_UP | 1398 | 317 | 1.26×10^-7 | Diaz-Blanco et al., 2007
7 | BENPORATH_PRC2_TARGETS | 652 | 306 | 0×10^0 | [34]
8 | PEREZ_TP53_TARGETS | 1191 | 273 | 9.64×10^-7 | Perez et al., 2007
9 | NUYTTEN_EZH2_TARGETS_UP | 974 | 267 | 0×10^0 | [28]
10 | DANG_BOUND_BY_MYC | 1053 | 259 | 0×10^0 | Zeller et al., 2003
Table S10c. This table depicts GSEA results after querying 316 CpGs exhibiting differential methylation between CIMP and non-CIMP tumors against the chemical and genetic perturbation gene sets in the MSigDB. Gene sets mentioned in the Results are highlighted in the shaded boxes.
Rank | Signature Name | Number of genes in signature | Number of overlap genes | Corrected p-value | Signature source
1 | BENPORATH_SUZ12_TARGETS | 1037 | 48 | 1.68×10^-9 | [34]
2 | BENPORATH_ES_WITH_H3K27ME3 | 1117 | 50 | 1.82×10^-9 | [34]
3 | BENPORATH_EED_TARGETS | 1062 | 44 | 6.15×10^-7 | [34]
Table S11. LRESs (long-range regions of epigenetic silencing) in CIMP gastric cancers
LRES | Chr | Genomic coordinates | Size (Mb) | Number of genes | Genes
1 | 2 | 175754166 - 177654166 | 1.90 | 11 | ATP5G3, HOXD13, HOXD12, HOXD11, HOXD10, HOXD9, HOXD8, HOXD4, HOXD3, HOXD1, MTX2
2 | 9 | 122768786 - 123818786 | 1.05 | 6 | C5, CEP1, RAB14, GSN, STOM, DAB2IP
3 | 10 | 22535127 - 23635127 | 1.10 | 7 | COMMD3, PCGF4, SPAG6, PIP5K2A, ARMC3, MSRB2, PTF1A
4 | 11 | 31835572 - 33385572 | 1.55 | 8 | RCN1, WT1, WIT-1, hfl-B5, PRRG4, TCP11L1, CSTF3, HIPK3
5 | 11 | 43335572 - 44835572 | 1.50 | 6 | HSD17B12, ALKBH3, PHACS, EXT2, ALX4, CD82
6 | 12 | 25992511 - 26992511 | 1.00 | 5 | RASSF8, BHLHB3, SSPN, ITPR2, C12orf11
7 | 19 | 41693070 - 42993070 | 1.30 | 13 | ZNF567, MGC62100, ZNF345, ZNF420, ZNF585A, ZNF585B, HKR1, ZNF569, ZNF570, ZNF540, ZNF571, FLJ37549, ZNF573
Table S12. Association of LRESs with CIMP gastric tumors
Association | LRES-positive samples (N = 95) | LRES-negative samples (N = 108) | p-value | Test | Conclusion
CIMP samples | 68 | 0 | 3.7×10^-15 | chi-square, 1 degree of freedom | LRES-positive samples enriched for CIMP samples
non-CIMP samples | 27 | 108
Table S13. CpG methylation correlation coefficients with respect to expression of genes within LRES5
CpG genomic coordinate (Mb) | Gene name | HSD17B12 expression correlation | Corrected p-value | ALKBH3 expression correlation | Corrected p-value | EXT2 expression correlation | Corrected p-value | ALX4 expression correlation | Corrected p-value | CD82 expression correlation | Corrected p-value
43.7 | HSD17B12 | -0.39 | 3.48×10^-8 | -0.10 | 0.144 | 0.12 | 0.152 | -0.05 | 0.867 | -0.06 | 0.642
43.7 | HSD17B12 | -0.39 | 3.48×10^-8 | -0.11 | 0.121 | 0.11 | 0.166 | -0.06 | 0.824 | -0.08 | 0.642
43.7 | HSD17B12 | -0.38 | 3.48×10^-8 | -0.15 | 0.047 | 0.04 | 0.664 | -0.03 | 0.867 | -0.09 | 0.642
43.7 | HSD17B12 | -0.33 | 4.20×10^-6 | -0.02 | 0.761 | 0.05 | 0.566 | -0.03 | 0.867 | -0.02 | 0.814
43.9 | ALKBH3 | -0.01 | 0.920 | -0.40 | 1.02×10^-8 | 0.20 | 0.022 | 0.00 | 0.959 | 0.04 | 0.724
43.9 | ALKBH3 | -0.09 | 0.303 | -0.45 | 7.99×10^-11 | 0.16 | 0.078 | -0.01 | 0.936 | 0.00 | 0.946
44.0 | PHACS | -0.32 | 9.52×10^-6 | -0.19 | 0.021 | 0.14 | 0.092 | -0.06 | 0.809 | -0.05 | 0.642
44.0 | PHACS | 0.00 | 0.991 | -0.08 | 0.262 | -0.03 | 0.707 | 0.10 | 0.505 | -0.11 | 0.642
44.1 | EXT2 | 0.08 | 0.342 | 0.12 | 0.110 | -0.09 | 0.258 | 0.10 | 0.505 | 0.08 | 0.642
44.1 | EXT2 | -0.20 | 0.012 | -0.16 | 0.038 | -0.01 | 0.917 | -0.03 | 0.867 | -0.05 | 0.642
44.3 | ALX4 | -0.09 | 0.277 | -0.20 | 0.019 | 0.05 | 0.566 | -0.03 | 0.867 | 0.06 | 0.642
44.3 | ALX4 | -0.11 | 0.219 | -0.10 | 0.149 | 0.13 | 0.114 | -0.10 | 0.505 | 0.06 | 0.642
44.3 | ALX4 | -0.11 | 0.210 | -0.17 | 0.035 | 0.22 | 0.014 | -0.09 | 0.537 | 0.07 | 0.642
44.3 | ALX4 | -0.12 | 0.152 | -0.18 | 0.024 | 0.20 | 0.022 | -0.11 | 0.505 | 0.05 | 0.653
44.3 | ALX4 | -0.13 | 0.152 | -0.18 | 0.025 | 0.17 | 0.046 | -0.16 | 0.261 | 0.05 | 0.642
44.3 | ALX4 | -0.08 | 0.326 | -0.19 | 0.021 | 0.15 | 0.079 | 0.01 | 0.936 | 0.09 | 0.642
44.3 | ALX4 | -0.08 | 0.326 | -0.16 | 0.039 | 0.22 | 0.014 | -0.04 | 0.867 | 0.10 | 0.642
44.3 | ALX4 | -0.06 | 0.431 | -0.17 | 0.033 | 0.15 | 0.079 | -0.02 | 0.895 | 0.13 | 0.642
44.3 | ALX4 | -0.09 | 0.277 | -0.19 | 0.021 | 0.08 | 0.338 | -0.01 | 0.936 | 0.02 | 0.852
44.3 | ALX4 | 0.00 | 0.991 | -0.10 | 0.149 | 0.07 | 0.399 | 0.06 | 0.809 | 0.03 | 0.802
44.3 | ALX4 | -0.06 | 0.466 | -0.18 | 0.023 | 0.02 | 0.806 | 0.03 | 0.867 | 0.06 | 0.642
44.3 | ALX4 | -0.12 | 0.194 | -0.12 | 0.113 | -0.07 | 0.468 | 0.16 | 0.261 | -0.08 | 0.642
44.3 | ALX4 | -0.07 | 0.387 | -0.08 | 0.230 | -0.05 | 0.565 | 0.14 | 0.488 | -0.03 | 0.809
44.3 | ALX4 | -0.17 | 0.037 | -0.14 | 0.064 | 0.12 | 0.152 | 0.08 | 0.727 | -0.05 | 0.642
44.3 | ALX4 | -0.17 | 0.037 | -0.20 | 0.019 | 0.10 | 0.207 | 0.11 | 0.505 | -0.06 | 0.642
44.3 | ALX4 | -0.06 | 0.466 | -0.15 | 0.043 | 0.04 | 0.621 | -0.03 | 0.867 | -0.01 | 0.852
44.3 | ALX4 | -0.19 | 0.020 | -0.24 | 0.003 | 0.12 | 0.156 | -0.12 | 0.505 | 0.06 | 0.642
44.3 | ALX4 | -0.16 | 0.049 | -0.12 | 0.114 | 0.13 | 0.109 | -0.10 | 0.505 | 0.02 | 0.842
44.3 | ALX4 | -0.11 | 0.219 | -0.14 | 0.052 | 0.18 | 0.035 | -0.03 | 0.867 | 0.05 | 0.642
44.3 | ALX4 | -0.13 | 0.152 | -0.16 | 0.040 | 0.17 | 0.057 | -0.02 | 0.895 | 0.04 | 0.667
44.3 | ALX4 | -0.06 | 0.425 | -0.11 | 0.136 | 0.21 | 0.021 | 0.02 | 0.895 | 0.07 | 0.642
44.3 | ALX4 | -0.07 | 0.387 | -0.12 | 0.113 | 0.13 | 0.109 | -0.03 | 0.867 | 0.11 | 0.642
44.3 | ALX4 | 0.03 | 0.767 | -0.07 | 0.329 | 0.05 | 0.566 | -0.03 | 0.867 | 0.08 | 0.642
44.5 | CD82 | -0.25 | 0.001 | -0.15 | 0.043 | 0.01 | 0.917 | -0.07 | 0.804 | -0.05 | 0.642
Significant p-values associated with negative correlation coefficients are highlighted in bold. ALKBH3 exhibited the strongest negative correlations and the greatest number of significant negative correlations (19 out of 34 CpGs), with a False Discovery Rate of 0.37%. In comparison, HSD17B12 exhibited fewer significant negative correlations (11 out of 34 CpGs), with a False Discovery Rate of 20%.
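The counts quoted in the note to Table S13 (19 of 34 CpGs for ALKBH3, 11 of 34 for HSD17B12) can be re-derived from the table itself. The sketch below transcribes the ALKBH3 and HSD17B12 columns in row order and counts CpGs with a negative coefficient and a corrected p-value below 0.05. The 0.05 cut-off and the helper name are assumptions made for illustration; the False Discovery Rate figures in the note are not recomputed here.

```python
ALPHA = 0.05  # assumed significance threshold for the corrected p-values

# (correlation coefficient, corrected p-value) per CpG, in Table S13 row order.
HSD17B12 = [
    (-0.39, 3.48e-8), (-0.39, 3.48e-8), (-0.38, 3.48e-8), (-0.33, 4.20e-6),
    (-0.01, 0.920), (-0.09, 0.303), (-0.32, 9.52e-6), (0.00, 0.991),
    (0.08, 0.342), (-0.20, 0.012), (-0.09, 0.277), (-0.11, 0.219),
    (-0.11, 0.210), (-0.12, 0.152), (-0.13, 0.152), (-0.08, 0.326),
    (-0.08, 0.326), (-0.06, 0.431), (-0.09, 0.277), (0.00, 0.991),
    (-0.06, 0.466), (-0.12, 0.194), (-0.07, 0.387), (-0.17, 0.037),
    (-0.17, 0.037), (-0.06, 0.466), (-0.19, 0.020), (-0.16, 0.049),
    (-0.11, 0.219), (-0.13, 0.152), (-0.06, 0.425), (-0.07, 0.387),
    (0.03, 0.767), (-0.25, 0.001),
]
ALKBH3 = [
    (-0.10, 0.144), (-0.11, 0.121), (-0.15, 0.047), (-0.02, 0.761),
    (-0.40, 1.02e-8), (-0.45, 7.99e-11), (-0.19, 0.021), (-0.08, 0.262),
    (0.12, 0.110), (-0.16, 0.038), (-0.20, 0.019), (-0.10, 0.149),
    (-0.17, 0.035), (-0.18, 0.024), (-0.18, 0.025), (-0.19, 0.021),
    (-0.16, 0.039), (-0.17, 0.033), (-0.19, 0.021), (-0.10, 0.149),
    (-0.18, 0.023), (-0.12, 0.113), (-0.08, 0.230), (-0.14, 0.064),
    (-0.20, 0.019), (-0.15, 0.043), (-0.24, 0.003), (-0.12, 0.114),
    (-0.14, 0.052), (-0.16, 0.040), (-0.11, 0.136), (-0.12, 0.113),
    (-0.07, 0.329), (-0.15, 0.043),
]


def count_significant_negative(pairs, alpha=ALPHA):
    """Count CpGs whose correlation is negative and whose p-value is below alpha."""
    return sum(1 for r, p in pairs if r < 0 and p < alpha)


alkbh3_count = count_significant_negative(ALKBH3)
hsd17b12_count = count_significant_negative(HSD17B12)
print(alkbh3_count, len(ALKBH3))      # 19 34
print(hsd17b12_count, len(HSD17B12))  # 11 34
```

Under these assumptions the tallies agree with the counts stated in the note.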
Table S14. CpG methylation correlation coefficients with respect to expression of genes within LRES3
CpG genomic coordinate (Mb) | Gene name | COMMD3 expression correlation | Corrected p-value | PCGF4 expression correlation | Corrected p-value | SPAG6 expression correlation | Corrected p-value | ARMC3 expression correlation | Corrected p-value | MSRB2 expression correlation | Corrected p-value
22.6 | COMMD3 | -0.17 | 0.066 | 0.01 | 0.859 | 0.04 | 0.809 | 0.04 | 0.934 | 0.08 | 0.348
22.6 | COMMD3 | -0.15 | 0.066 | 0.03 | 0.859 | 0.05 | 0.809 | -0.03 | 0.934 | 0.09 | 0.348
22.6 | PCGF4 | -0.15 | 0.066 | 0.01 | 0.859 | 0.04 | 0.809 | -0.01 | 0.934 | 0.05 | 0.505
22.7 | PCGF4 | -0.14 | 0.079 | 0.02 | 0.859 | 0.02 | 0.809 | -0.02 | 0.934 | 0.10 | 0.299
22.7 | SPAG6 | 0.06 | 0.433 | 0.04 | 0.859 | -0.14 | 0.167 | 0.04 | 0.934 | 0.11 | 0.249
22.7 | SPAG6 | 0.03 | 0.632 | 0.07 | 0.859 | -0.13 | 0.167 | 0.01 | 0.934 | 0.16 | 0.094
23.0 | PIP5K2A | -0.16 | 0.066 | 0.03 | 0.859 | 0.04 | 0.809 | -0.02 | 0.934 | 0.08 | 0.348
23.0 | PIP5K2A | -0.10 | 0.211 | -0.06 | 0.859 | -0.02 | 0.809 | -0.07 | 0.934 | 0.13 | 0.195
23.3 | ARMC3 | -0.05 | 0.469 | -0.05 | 0.859 | -0.06 | 0.809 | 0.03 | 0.934 | 0.08 | 0.348
23.3 | ARMC3 | -0.09 | 0.274 | -0.10 | 0.859 | -0.02 | 0.809 | 0.05 | 0.934 | 0.01 | 0.886
23.4 | MSRB2 | -0.14 | 0.087 | -0.11 | 0.859 | -0.02 | 0.809 | -0.09 | 0.934 | -0.05 | 0.505
23.4 | MSRB2 | -0.19 | 0.066 | -0.06 | 0.859 | -0.02 | 0.809 | 0.09 | 0.934 | -0.02 | 0.808
23.5 | PTF1A | 0.06 | 0.433 | 0.05 | 0.859 | -0.16 | 0.099 | -0.01 | 0.934 | 0.17 | 0.094
23.5 | PTF1A | 0.13 | 0.104 | 0.04 | 0.859 | -0.20 | 0.043 | 0.07 | 0.934 | 0.16 | 0.094
Negative correlation coefficients are highlighted in bold. COMMD3 exhibited the strongest negative correlations and the greatest number of negative correlations (10 out of 14 CpGs).
Table S15. Hypo-LRRs: long-range regions of hypomethylation
Hypomethylated LRR | Chr | Genomic coordinates | Size (Mb) | Number of genes | Genes
1 6 31286909 - 32636909 1.35 57
HLA-C, HLA-B, MICA, HCP5, MICB, BAT1, ATP6V1G2, NFKBIL1, LTA, TNF, LTB, LST1, AIF1, BAT2, BAT3, APOM, C6orf47, CSNK2B, BAT5, LY6G6C, C6orf25, DDAH2, CLIC1, MSH5, C6orf27, VARS, LSM2, HSPA1L, HSPA1A, HSPA1B, C6orf48, NEU1, SLC44A4, EHMT2, ZBTB12, C2, CFB, SKIV2L, RDBP, STK19, DOM3Z, C4B, CYP21A2, TNXB, CREBL1, FKBPL, PRRT1, PPT2, EGFL8, AGPAT1, AGER, PBX2, GPSM3, NOTCH4, BTNL2, HLA-DRA, HLA-DRB5
2 8 144721733 - 145721733 1.00 33
NAPRT1, EEF1D, TIGD5, PYCRL, TSTA3, ZNF623, MAPK15, FLJ46072, SCRIB, SIAHBP1, PLEC1, SPATC1, EXOSC4, GPAA1, CYC1, MAF1, BOP1, DGAT1, SCRT1, GPR172A, FBXL6, CPSF1, SLC39A4, VPS28, NFKBIL2, CYHR1, KIFC2, FOXH1, PPP1R16A, GPT, MFSD3, LRRC14, RECQL4
3 9 138268786 - 139418786 1.15 42
GPSM1, CARD9, SNAPC4, SDCCAG3, INPP5E, FLJ36779, NOTCH1, EGFL7, AGPAT2, FAM69B, LCN6, UNQ2541, MGC14141, KIAA1984, PHPT1, MAMDC4, EDF1, TRAF2, FBXW5, C8G, LCN12, PTGDS, C9orf142, CLIC3, FLJ36268, ABCA2, FUT7, NPDC1, ENTPD2, C9orf140, UAP1L1, MAN1B1, DPP7, GRIN1, SSNA1, ANAPC2, C9orf75, NDOR1, SLC34A3, TUBB2C, COBRA1, FLJ20245, NOXA1, ENTPD8, NELF, C9orf111, MRPL41, C9orf112, ZMYND19
4 11 1435572 - 3485572 2.05 28
HCCA2, DUSP8, CTSD, SYT8, TNNI2, LSP1, TNNT3, MRPL23, H19, IGF2, IGF2AS, INS, TH, ASCL2, TSPAN32, CD81, TSSC4, TRPM5, KCNQ1, KCNQ1DN, CDKN1C, SLC22A18, SLC22A18AS, SLC22A18, PHLDA2, NAP1L4, CARS, OSBPL5
5 11 61285572 - 62685572 1.40 38
FEN1, C11orf10, FADS1, FADS2, FADS3, RAB3IL1, FTH1, SCGB1D1, SCGB2A1, SCGB1D2, SCGB2A2, ASRGL1, SCGB1A1, AHNAK, EEF1G, RBM21, MTA2, EML3, ROM1, B3GAT3, GANAB, KIAA1698, C11orf48, LOC51035, LOC221091, GNG3, BSCL2, TTC9C, ZBTB3, POLR2G, TAF6L, LOC374395, NXF1, STX5A, SLC3A2, CHRM1, SLC22A6, SLC22A8
6 11 63685572 - 66485572 2.80 99
LRP16, STIP1, URP2, DNAJC4, NUDT22, VEGFB, FKBP2, PLCB3, BAD, GPR137, KCNK4, ESRRA, HSPC152, FLJ37970, RPS6KA4, SLC22A11, SLC22A12, NRXN2, RASGRP2, PYGM, SF1, MAP4K2, MEN1, EHD1, PPP2R5B, GPHA2, BATF2, ARL2, SNX15, SAC3D1, ZFPL1, CDCA5, C11orf2, TM7SF2, ZNHIT2, MRPL49, FAU, SYVN1, CAPN1, POLA2, CDC42EP2, DPF2, TIGD3, FKSG44, SCYL1, LTBP3, FAM89B, SSSCA1, KCNK7, MAP3K11, SIPA1, RELA, HTATIP, AYP1, OVOL1, FLJ30934, CFL1, MUS81, EFEMP2, CTSW, FIBP, DIPA, FOSL1, DRAP1, Bles03, SART1, BANF1, MGC11102, CST6, CATSPER1, GAL3ST3, SF3B2, PACS1, KLC2, CNIH2, YIF1A, MGC33486, CD248, RIN1, BRMS1, B3GNT6, SLC29A2, NPAS4, MRPL11, DPP3, BBS1, ZDHHC24, ACTN3, CTSF, CCS, FLJ10786, RBM14, MGC15912, RBM4, RBM4B, SPTBN2, RCE1, LRFN4, PC
7 12 5942511 - 7592511 1.65 47
VWF, CD9, PLEKHG6, TNFRSF1A, SCNN1A, LTBR, TNFRSF7, TAPBPL, VAMP1, MRPL51, CNAP1, GAPDH, HOM-TES-103, NOL1, CHD4, GPR92, ACRBP, ING4, ZNF384, COPS7A, MLF2, PTMS, LAG3, CD4, GPR162, LEPREL2, GNB3, CDCA3, USP5, TPI1, SPSB2, LRRC23, ENO2, ATN1, C12orf57, PTPN6, EMG1, PHB2, OACT5, C1S, C1R, C1RL, RBP5, CLSTN3, PEX5, M160, CD163
8 12 50942511 - 52192511 1.25 42
KRTHB1, KRTHB6, KRTHB3, KRTHB5, KRTHB4, KRTHB2, K6HF, KRT6B, KRT6E, KRT6C, KRT6A, KRT5, KRT6IRS, K6IRS4, K6IRS2, K6IRS3, KRT2A, KRT1, KRT1B, KRT2B, KRT4, KRT6L, K5B, KRT8, KRT18, EIF4B, TENC1, SPRYD3, IGFBP6, SOAT2, CSAD, ITGB7, RARG, MFSD5, ESPL1, PFDN5, AAAS, SP1, AMHR2, PCBP2, TARBP2, NPFF
9 12 54042511 - 56742511 2.70 76
METTL7B, ITGA7, BLOC1S1, RDH5, CD63, GDF11, CIP29, ORMDL2, DNAJC14, MMP19, DGKA, CDK2, RAB5B, SUOX, RPS26, ERBB3, PA2G4, ZC3H10, FAM62A, MLC1SA, MYL6, SMARCC2, RNF41, OBFC2B, SLC39A5, CS, TMEM4, USP52, IL23A, STAT2, TIMELESS, MIP, LOC283377, GLS2, RBMS2, BAZ2A, ATP5B, PTGES3, NACA, PRIM1, HSD17B6, SDR-O, ADMR, TAC3, MYO1A, NAB2, STAT6, LRP1, NXPH4, SHMT2, LOC56901, STAC3, R3HDM2, INHBC, INHBE, GLI1, MARS, ARHGAP9, DDIT3, MBD6, DCTN2, KIF5A, PIP5K2C, GEFT, SLC26A10, GALGT, OS9, CENTG1, TSPAN31, CDK4, CYP27B1, DKFZP586D0919, METTL1, TSFM, AVIL, CTDSP2
10 16 91686 - 3741686 3.65 121
C16orf35, HBZ, HBA2, HBA1, HBQ1, LUC7L, C16orf9, PDIA2, RGS11, ARHGDIG, AXIN1, MRPL28, TMEM8, NME4, DECR2, SOLH, PIGQ, RAB40C, WFIKKN1, MGC15416, RHOT2, RHBDL1, STUB1, WDR24, FBXL16, METRN, C16orf24, C16orf25, NARFL, MSLN, RPUSD1, GNG13, SOX8, SSTR5, CACNA1H, TPSG1, TPSB2, TPSAB1, TPSD1, UBE2I, BAIAP3, GNPTG, MGC24381, C16orf28, UNKL, CLCN7, C16orf30, IFT140, CRAMP1L, C16orf34, MAPK8IP3, NME3, MRPS34, NUBP2, IGFALS, FAHD1, HAGH, MGC35212, SEPX1, RPL3L, NDUFB10, RPS2, TBL3, NOXO1, GFER, SYNGR3, SLC9A3R2, NTHL1, TSC2, PKD1, RAB26, TRAF7, GBL, MGC21830, E4F1, DNASE1L2, DCI, RNPS1, ABCA3, CCNF, FLJ13909, NTN2L, ATP6V0C, AMDHD2, PDPK1, LOC124216, KCTD5, PRSS27, SRRM2, TCEB2, TESSP1, PRSS21, LOC124220, MGC52282, PRSS22, LOC114984, PAQR4, KREMEN2, PKMYT1, CLDN9, CLDN6, TNFRSF12A, WDR58, HCFC1R1, MMP25, IL32, ZNF206, ZNF205, ZNF213, OR1F1, ZNF200, MEFV, ZNF263, TIGD7, ZNF75A, OR2C1, ZNF434, ZNF597, BTBD12, DNASE1, TRAP1
11 16 29491686 - 31491686 2.00 68
LAT1-3TM, SPN, QPRT, C16orf54, KIF22, MAZ, PRRT2, MVP, C16orf53, CDIPT, SEZ6L2, ASPHD1, KCTD13, TAOK2, FLJ90652, DOC2A, FAM57B, ALDOA, PPP4C, TBX6, GDPD3, MAPK3, CORO1A, SULT1A3, CD2BP2, TBC1D10B, MYLPF, SEPT1, ZNF553, XTP3TPA, SEPHS2, ITGAL, FLJ23436, MGC2474, MGC13138, ZNF688, FLJ32130, ZNF689, MGC3121, FBS1, SRCAP, PHKG2, RNF40, BCL7C, CTF1, LOC283932, FBXL19, MGC13024, HSD3B7, STX1B2, STX4A, ZNF646, ZNF668, VKORC1, BCKDK, MYST1, PRSS8, FUS, PYCARD, PYDC1, ITGAM, ITGAX, ITGAD, COX6A2, TGFB1I1, SLC5A2, C16orf58, ERAF
12 16 65491686 - 67391686 1.90 48
CDH16, RRAD, CES2, FLJ21736, CBFB, LIN10, FBXL8, TRADD, HSF4, NOL3, LOC283849, E2F4, ELMO3, FHOD1, CGI-38, ZDHHC1, HSD11B2, ATP6V0D1, AGRP, FAM65A, CTCF, PARD6A, ACD, C16orf48, MGC11335, TSNAXIP1, THAP11, NUTF2, RCD-8, UNQ2446, PSKH1, CTRL, PSMB10, LCAT, SLC12A4, DPEP3, DPEP2, DUS2L, DDX28, NFATC3, RBM35B, LYPLA3, SLC7A6, PRMT7, SLC7A6OS, SMPD3, CDH3, CDH1
13 17 6351675 - 8351675 2.00 70
PITPNM3, TXNL5, KIAA0753, MED31, SLC13A5, BIRC4BP, FBXO39, TEKT1, ALOX12, SLC16A13, SLC16A11, CLEC10A, ASGR2, ASGR1, ACADVL, DLG4, DVL2, GABARAP, C17orf81, DULLARD, CLDN7, SLC2A4, YBX2, EIF5A, GPS2, CENTB1, TNK1, C17orf61, NLGN2, TMEM102, FGF11, CHRNB1, POLR2A, TNFSF12, TNFSF12-TNFSF13, TNFSF13, EIF4A1, CD68, MPDU1, SOX15, FXR2, SAT2, SHBG, ATP1B2, TP53, WDR79, EFNB3, TMEM88, CYB5D1, LSMD1, KCNAB3, CNTROB, TRAPPC1, GUCY2D, ALOX15B, ALOX12B, ALOXE3, PER1, VAMP2, TMEM107, C17orf59, AURKB, C17orf44, PFAS, RANGNRF, SLC25A35, ARHGEF15, LOC124751, RPL26, NDEL1
14 17 36601675 - 38751675 2.15 70
KRTAP9-3, KRTAP9-4, KRTAP17-1, KRTHA3A, KRTHA3B, KRTHA4, KRTHA1, KRTHA8, KRTHA2, KRTHA5, KRTHA6, KRT13, KRT15, KRT19, KRT9, KRT14, KRT16, KRT17, EIF1, GAST, HAP1, JUP, SC65, FKBP10, NT5C3L, KLHL11, ACLY, CNP, NKIRAS2, DNAJC7, LGP2, GCN5L2, HSPB9, RAB5C, KCNH4, HCRT, LGP1, STAT5B, STAT5A, STAT3, PTRF, ATP6V0A1, NAGLU, HSD17B1, COASY, MLX, TBPIP, LOC162427, PLEKHH3, CCR10, EZH1, RAMP2, VPS25, WNK4, CNTD, BECN1, PSME3, AOC2, AOC3, FLJ31222, G6PC, AARSD1, RPL27, IFI35, VAT1, RND2, BRCA1, NBR2, TMEM106A
15 19 293070 - 1843070 1.55 52
PRG2, THEG, C19orf19, MADCAM1, CDC34, GZMM, BSG, HCN2, POLRMT, FGF22, FLJ45684, RNF126, FSTL3, PRSSL1, PALM, C19orf21, PTBP1, PRG2, AZU1, PRTN3, ELA2, CFD, THRAP5, C19orf22, KISS1R, ARID3A, WDR18, C19orf6, CNN2, ABCA7, HMHA1, POLR2E, GPX4, KIAA0963, STK11, C19orf26, ATP5D, CIRBP, C19orf24, EFNA2, MUM1, NDUFS7, GAMT, DAZAP1, RPS15, APC2, PCSK4, REEP6, ADAMTSL5, MBD3, UQCR, TCF3, KLF16
16 19 9643070 - 11193070 1.55 38
ZNF562, FBXL12, UBL5, PIN1, OLFM2, COL5A3, RDH8, ANGPTL6, PPAN, P2RY11, EIF3S4, DNMT1, EDG5, MRPL4, ICAM1, ICAM4, ICAM5, MGC19604, ICAM3, TYK2, CDC37, PDE4A, KEAP1, EDG8, ATG4D, FLJ12949, CDKN2D, SLC44A2, ILF3, QTRT1, DNM2, TMED1, CARM1, LOC90580, YIPF2, SMARCA4, LDLR, ANKRD25
17 19 12093070 - 13293070 1.20 32
ZNF625, ZNF136, ZNF44, ZNF563, ZNF442, FLJ90396, MAN2B1, MORG1, PTD008, DHPS, FBXW9, MGC2803, ASNA1, JUNB, PRDX2, RNASEH2A, RTBDN, MAST1, DNASE2, KLF1, GCDH, FARSLA, CALR, RAD23A, GADD45GIP1, DAND5, NFIX, LYL1, TRMT1, BTBD14B, IER2, STX10
18 19 39943070 - 41693070 1.75 52
FLJ38451, SCN1B, HPN, FXYD3, LGI4, FXYD1, FXYD7, FXYD5, FLJ25660, LSR, USF2, HAMP, MAG, CD22, FFAR1, FFAR3, GPR42, FFAR2, UNQ467, ZD52F10, SBSN, GAPDHS, NIFIE14, ATP4A, MGC10433, COX6B1, UPK1A, ZBTB32, MLL4, U2AF1L4, PSENEN, U2AF1L3, F25965, HSPB6, LOC148137, SNX26, PRODH2, KIRREL2, APLP1, TYROBP, LRFN3, FLJ36445, ALKBH6, CLIPR-59, CKAP1, POLR2I, CAPNS1, COX7A1, ZNF565, ZNF146, ZNF545, ZNF566
19 19 49743070 - 51343070 1.60 45
PVR, CEACAM19, BCL3, CBLC, BCAM, PVRL2, TOMM40, APOE, APOC1, APOC4, APOC2, CLPTM1, RELB, SFRS16, ZNF342, GEMIN7, TRAPPC6A, BLOC1S3, XTP7, MARK4, CKM, KLC3, ERCC2, PPP1R13L, CD3EAP, ERCC1, FOSB, RTN2, FLJ40125, VASP, OPA3, GPR4, EML2, GIPR, QPCTL, SNRPD2, SIX5, DMPK, DMWD, RSHL1, SYMPK, FOXA3, IRF2BP1, NOVA2, PGLYRP1
20 19 53093070 - 55643070 2.55 84
ELSPBP1, CABP5, PLA2G4C, LIG1, CARD8, ZNF114, FLJ32926, EMP3, FLJ10922, KDELR1, GRIN2D, GRWD1, KCNJ14, PSCD2, SULT2B1, SPACA4, SPHK2, RPL18, DBP, CA11, FUT2, FLJ36070, RASIP1, IZUMO1, FUT1, FGF21, BCAT2, DHRS10, PLEKHA4, PPP1R15A, NUCB1, TULP2, DHDH, BAX, GYS1, LHB, CGB, CGB2, CGB1, CGB5, CGB8, NTF5, KCNA7, SNRP70, LIN7B, PPFIA3, HRC, TRPM4, CD37, TEAD2, DKKL1, TIP39, SLC17A7, FLJ20643, ALDH16A1, FLT3LG, RPL13A, RPS11, FCGRT, RCN3, NOSIP, PRRG2, RRAS, BCL2L12, IRF3, PRMT1, CPT1C, TSKS, FLJ22688, PTOV1, PNKP, TBC1D17, AKT1S1, ATF5, IL4I1, SIGLEC11, VRK3, SCRL, KCNC3, NAPSA, NR1H2, POLD1, SPIB, MYBPC2
21 19 59893070 - 61093070 1.20 35
KIR3DL3, KIR2DL1, KIR3DL1, KIR3DL2, FCAR, NCR1, NALP7, NALP2, EPS8L1, PPP1R12C, TNNT1, TNNI3, LOC352909, SYT5, PTPRH, TMEM86B, HSPBP1, BRSK1, SUV420H2, IL11, RPL28, UBE2S, ISOC2, KLP1, ZNF579, FLJ14768, ZNF524, LOC147808, ZNF580, ZNF581, HSU79303, U2AF2, NALP9, NALP11, NALP4
22 20 43316170 - 44816170 1.50 40
SLPI, MATN4, RBPSUHL, SDC4, C20orf35, C20orf10, PIGT, WFDC2, WFDC6, SPINLW1, WFDC8, WFDC10A, WFDC9, WFDC11, WFDC13, WFDC10B, DNTTIP1, WFDC3, UBE2C, TNNC2, C20orf161, ACOT8, ZSWIM1, C20orf165, PPGB, NEURL2, PLTP, C20orf67, ZNF335, MMP9, SLC12A5, NCOA5, CD40, CDH22, SLC35C2, ELMO2, ZNF663, SLC13A3, TP53RK, SLC2A10
23 20 56016170 - 57816170 1.80 14
C20orf85, C20orf86, RAB22A, VAPB, FLJ90166, STX16, GNAS, TH1L, CTSZ, TUBB1, ATP5E, C20orf45, EDN3, PHACTR3
24 20 60866170 - 62316170 1.45 39
C20orf20, OGFR, COL9A3, TCFL5, DIDO1, YTHDF1, BIRC7, C20orf58, ARFGAP1, CHRNA4, KCNQ2, EEF1A2, C20orf149, PTK6, SRMS, C20orf195, PRIC285, GMEB2, STMN3, RTEL1, ZGPAT, ARFRP1, LIME1, SLC2A4RG, BTBD4, C20orf135, TPD52L2, DNAJC5, UCKL1, GM632, SAMD10, C20orf14, LOC284739, SOX18, TCEA2, RGS19, OPRL1, NPBWR2, MYT1
Table S16. Association of hypo-LRRs to non-CIMP tumors
Association | hypo-LRR-positive samples (N = 91) | hypo-LRR-negative samples (N = 112) | p-value | Test | Conclusion
non-CIMP samples | 71 | 64 | 0.02 | chi-square, 1 degree of freedom | hypo-LRR-positive samples enriched for non-CIMP samples
CIMP samples | 20 | 48
Table S17. Hypo-LRR CpGs and repeat sequences
CpG sites in the hypo-LRR regions were mapped to the RepeatMasker track from the UCSC Genome Browser to identify the number of hypo-LRR CpGs overlapping with specific repeats. Out of the 2562 CpGs in the hypo-LRR regions, 261 (~10.2%) CpGs overlapped with a repeat sequence.
RepClass | nCounts | Remark
SINE | 137 | Short interspersed nuclear elements (SINE), which include Alu elements
LINE | 44 | Long interspersed nuclear elements (LINE)
LTR | 24 | Long terminal repeat elements (LTR), which include retroposons
Low complexity | 23 | Low complexity repeats
DNA | 14 | DNA repeat elements (DNA)
Simple repeat | 14 | Simple repeats (micro-satellites)
Satellite | 2 | Satellite repeats
tRNA | 2 | RNA repeats (transfer RNAs)
snRNA | 1 | RNA repeats (small nuclear RNAs)
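The summary figures in the Table S17 caption can be checked directly against the per-class counts. The short sketch below sums the nCounts column and expresses the total as a percentage of the 2562 hypo-LRR CpGs; the variable names are illustrative only.

```python
# Per-class counts of hypo-LRR CpGs overlapping repeats, from Table S17.
repeat_counts = {
    "SINE": 137,
    "LINE": 44,
    "LTR": 24,
    "Low complexity": 23,
    "DNA": 14,
    "Simple repeat": 14,
    "Satellite": 2,
    "tRNA": 2,
    "snRNA": 1,
}
TOTAL_HYPO_LRR_CPGS = 2562  # total CpGs in the hypo-LRR regions (table caption)

overlapping_cpgs = sum(repeat_counts.values())
percent_overlapping = 100 * overlapping_cpgs / TOTAL_HYPO_LRR_CPGS
print(overlapping_cpgs, round(percent_overlapping, 1))  # 261 10.2
```

The class counts sum to the 261 overlapping CpGs stated in the caption, and SINE elements account for just over half of them (137 of 261).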