ORIGINAL RESEARCH ARTICLES

Human Epigenome Data Reveal Increased CpG Methylation in Alternatively Spliced Sites and Putative Exonic Splicing Enhancers

Christina Anastasiadou,1 Andigoni Malousi,1 Nicos Maglaveras,1 and Sofia Kouidou2

The role of gene body methylation, which represents a major part of methylation in DNA, remains mostly unknown. Evidence based on the CpG distribution associates its presence with nucleosome positioning and alternative splicing. Recently, it was also shown that cytosine methylation influences splicing. However, to date, there is no methylation-based data on the association of methylation with alternative splicing and the distribution in exonic splicing enhancers (ESEs). We presently report that, based on the computational analysis of the Human Epigenome Project data, CpG hypermethylation (>80%) is frequent in alternatively spliced sites (particularly in noncanonical) but not in alternate promoters. The methylation frequency increases in sequences containing multiple putative ESEs. However, significant differences in the extent of methylation are observed among different ESEs. Specifically, moderate levels of methylation, ranging from 20% to 80%, are frequent in SRp55-binding elements, which are associated with response to extracellular conditions, but not in SF2/ASF, primarily responsible for alternative splicing, or in CpG islands. Finally, methylation is more frequent in the presence of AT repeats and CpGs separated by 10 nucleotides and lower in adjacent CpGs, probably indicating its dependence on helical formations and on the presence of nucleosome positioning-related sequences. In conclusion, our results show the regulation of methylation in ESEs and support its involvement in alternative splicing.

Introduction

The participation of DNA methylation in the regulation of gene expression was proposed several years ago, on account of the dramatic changes observed during X chromosome inactivation (Constancia et al., 1998), ontogenesis, and differentiation (Razin and Riggs, 1980). Nevertheless, although the role of methylation in CpG islands (CIs) and first introns is presently well understood, the impact of methylation on the binding of splicing factors and splicing regulatory elements and in CpG-dense gene body sequences (Malousi et al., 2008) remains unknown. Recently, it was verified that DNA methylation is one of the principal mechanisms that exert a regulatory role on nucleosomal packaging and positioning in gene promoters and thus modulates DNA accessibility to transcription factors and gene expression (Choi and Kim, 2009). This process negatively affects transcription and is known to lead to gene silencing (Jones and Laird, 1999). Modifications of DNA packaging and nucleosome positioning by DNA methylation could also affect other aspects of the regulation of gene expression, such as splicing (Robertson, 2002). This is particularly probable in view of the fact that spliceosome assembly is a cotranscriptional process (Pandya-Jones and Black, 2009). During the preparation of this work, it was also shown that the factors that control CpG methylation at selected Hox genes affect splicing, probably by Pol II stalling (Tao et al., 2010).

Nucleosomal packaging and its dependence on conformational changes introduced by the presence of specific dinucleotides such as TG have been extensively investigated (Adams et al., 1987; Liu et al., 2008; Caserta et al., 2009). Some dinucleotides reducing the conformational strain (AA/TT, TA, etc.) were found to facilitate nucleosome formation (Caserta et al., 2009), and the TATAAACGCC repeat sequence was identified as the optimal oligonucleotide sequence associated with nucleosome formation (Widlund et al., 1999). The CpG dinucleotide that is included in the optimal repeat sequence is considered responsible for introducing particular conformation characteristics (conformational distortion) in model oligonucleotide sequences (Arnott et al., 1983; Svozil et al., 2008) and has been investigated with respect to the nucleosome assembly (Adams et al., 1987; Schwartz et al., 2009).

1Laboratory of Medical Informatics and 2Department of Biological Chemistry, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece.

DNA AND CELL BIOLOGY, Volume 30, Number 5, 2011. © Mary Ann Liebert, Inc. Pp. 267–275. DOI: 10.1089/dna.2010.1094


CpG methylation is also known to have a strong impact on the conformational characteristics of the DNA sequence (Fang et al., 1995). Moreover, evidence for the strong impact of DNA methylation on splicing comes from the fact that methyl-binding protein mutants (MeCP2) affect gene splicing (Moretti and Zoghbi, 2006). DNA methylation most likely introduces an additional parameter influencing nucleosome positioning and binding of splicing regulatory factors (SR proteins) recognizing CpG-containing sites of exonic splicing enhancers (ESEs). SR proteins are Serine-Arginine–rich proteins, which bind to specific RNA domains and are critical factors for the splicing process. Studies using artificial splicing models such as minigenes reveal that splice site selection depends on the concerted action of these proteins (Goncalves et al., 2009). Among these, SF2/ASF has been mostly reported to act as a splicing enhancer, whereas others (SRp20) act as silencers. In addition, SF2/ASF plays a regulatory role in mRNA export (Li and Manley, 2005).

Several recent studies address the issue of nonpromoter DNA methylation and its possible involvement in transcriptional regulation and nucleosomal positioning. None of these systems refer to the human epigenome of differentiated cells. The complete epigenome analysis of Arabidopsis thaliana (Chodavarapu et al., 2010) provides valuable information concerning the plant epigenome, and the study by Lister et al. (2009) involves the analysis of human stem cells. Although these cells are of human origin, it is clearly shown by these authors that their epigenetic state is distinctly different from that of mature differentiated cells from different tissues, because they exhibit epigenetic characteristics such as a high frequency of non-CpG methylation, which has been previously reported in lung carcinoma cells (Kouidou et al., 2005).

Nevertheless, to date, there is no information regarding the presence of methylation in splicing regulation and nucleosome formation inducing sequences in cells from human tissues. The methylation data in nonpromoter sequences available through the Epigenome Project (Rakyan et al., 2004; Eckhardt et al., 2006) can provide useful information for evaluating the role of gene body and nonpromoter methylation.

In this study, we investigated the extent of DNA methylation in CpGs for sequences that contain ESEs and those which can give potential alternative splicing events or can be identified as part of CIs. Moreover, we evaluated the sequence specificity of methylation in short sequence motifs (di- and trinucleotides), some of which are considered critical for nucleosome positioning.

Materials and Methods

The source data used in this study are a compilation of methylation sites at specific loci of the human chromosomes 6, 20, and 22, which are available from the Human Epigenome Project (HEP) (Eckhardt et al., 2006) (released June 26, 2006), as well as the corresponding genomic sequences from the Human Genome Project (HGP) (Build 34, hg16, Jul. 2003). We examined 33,352 methylated sites from multiple tissue types, and for each one of these sites, we extracted the methylation levels as well as the adjacent genomic sequences from HEP and HGP, respectively. Methylation sites with no available methylation levels or with known data not analyzed were ignored. The methylation level was used as a criterion for the classification of methylated sites. Sites with methylation levels greater than or equal to 80% are characterized as hypermethylated, whereas those with methylation levels lower than or equal to 20% are classified as hypomethylated. From the set of 33,352 methylated sites, 9012 CpG sites were found to be hypermethylated and a total of 15,296 sites are hypomethylated. The remaining CpG sites are categorized to a third class containing intermediate methylation levels.
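The three-way classification described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds (≥80% and ≤20%); the function names are hypothetical, and the original analysis was carried out with Matlab and Perl scripts that are not reproduced here.

```python
def classify_methylation(level_percent):
    """Return the methylation class for a level given in percent (0-100).

    Thresholds follow the text: >= 80 is hypermethylated,
    <= 20 is hypomethylated, anything in between is intermediate.
    """
    if not 0 <= level_percent <= 100:
        raise ValueError("methylation level must lie between 0 and 100")
    if level_percent >= 80:
        return "hypermethylated"
    if level_percent <= 20:
        return "hypomethylated"
    return "intermediate"


def count_classes(levels):
    """Count how many sites fall into each of the three classes."""
    counts = {"hypermethylated": 0, "hypomethylated": 0, "intermediate": 0}
    for level in levels:
        counts[classify_methylation(level)] += 1
    return counts
```

With the totals reported above, the intermediate class would contain 33,352 − 9012 − 15,296 = 9044 sites.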

Putative ESEs responsive to the human SR proteins SF2/ASF, SC35, SRp40, and SRp55 were identified in the studied sequences using ESEfinder (Cartegni et al., 2003). The default thresholds for the computational detection of the ESE consensus motifs were used in this analysis. For the comparative analyses of methylation and alternative splicing, we performed searches for different types of alternative splicing events in the studied human genomic regions (NCBI36/hg18 assembly) using the UCSC Table browser (Karolchik et al., 2004). The exon coordinates in chromosomes 6, 20, and 22 were extracted using the Martview interface of Biomart for the same human assembly (Haider et al., 2009).

In addition, we identified CIs using two types of analyses. First, we extracted all actual CIs in the studied chromosomes using UCSC Table browser for the same human genome assembly and visualized their distribution. Then, to compare the actual CIs with those neighboring the analyzed CpG methylation sites, we identified CIs in the 400-nt region centered at the hypermethylated and hypomethylated cytosines using the Takai and Jones algorithm (Takai and Jones, 2003). The search criteria follow the definition of CIs proposed by Gardiner-Garden and Frommer (1987) (i.e., 200 nt sequence length with >50% G+C content and observed CpG/expected CpG ratio [ObsCpG/ExpCpG] ≥0.6).
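As an illustration of the Gardiner-Garden and Frommer criteria quoted above, the following Python sketch tests a single sequence window. It is not the Takai and Jones search procedure, which slides and extends windows along a sequence; it only checks the three thresholds on one window. The function names are hypothetical, and the observed/expected ratio uses the commonly cited form (number of CpGs × length) / (number of C × number of G).

```python
def gc_content(seq):
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def obs_exp_cpg(seq):
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0  # no CpG can occur without both C and G
    return seq.count("CG") * len(seq) / (n_c * n_g)


def meets_cpg_island_criteria(seq, min_length=200, min_gc=0.5, min_obs_exp=0.6):
    """Check one window against the three thresholds quoted in the text."""
    return (
        len(seq) >= min_length
        and gc_content(seq) > min_gc
        and obs_exp_cpg(seq) >= min_obs_exp
    )
```

A window that passes this check is only a candidate; a full CI search would still need to scan the 400-nt region and merge overlapping windows.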

The expressed sequence tags (ESTs) were extracted using the UCSC Table browser of the human genome (hg19, Feb 2009) for chromosomes 6, 20, and 22. ESTs were classified into CpG rich (number of CpGs/EST length: >0.1) and CpG poor (no CpGs in whole EST length) and the putative ESEs were identified for each dataset using ESEfinder.
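The EST partition described above can be written as a small Python helper. This is a sketch under the assumption that "number of CpGs/EST length" means CG dinucleotide occurrences divided by the EST length in nucleotides; the function name and the "unclassified" label for ESTs that fall between the two classes are hypothetical.

```python
def classify_est(seq, rich_threshold=0.1):
    """Label an EST as CpG-rich, CpG-poor, or unclassified.

    CpG-rich: CpG count / length > rich_threshold (0.1 in the text).
    CpG-poor: no CpG dinucleotide anywhere in the EST.
    """
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    n_cpg = seq.count("CG")
    if n_cpg == 0:
        return "CpG-poor"
    if n_cpg / len(seq) > rich_threshold:
        return "CpG-rich"
    return "unclassified"
```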

Chi-square tests were used to estimate the statistical significance for all bivariate tabular analyses and Pearson product–moment correlation coefficients provided a measure of the linear relationship between the distributions of specific features in different chromosomes. Matlab and Perl scripts were built to delineate the source dataset, to extract the genomic sequences, and to analyze their features.
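For readers unfamiliar with the test, a chi-square statistic for a 2 × 2 table can be computed as in the Python sketch below. This is a generic textbook formula without continuity correction, not a reconstruction of the Matlab scripts used in this study, and the exact contingency tables the authors tested are an assumption. The example arranges the "strange intron ends" counts from Table 1 (52 of 9012 hypermethylated and 31 of 15,296 hypomethylated sequences) as event versus no event.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for the table

        [[a, b],
         [c, d]]

    computed without continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a nonzero total")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical arrangement of the Table 1 counts: event vs. no event
hyper_events, hyper_total = 52, 9012
hypo_events, hypo_total = 31, 15296
statistic = chi_square_2x2(
    hyper_events, hyper_total - hyper_events,
    hypo_events, hypo_total - hypo_events,
)
# 3.841 is the 5% critical value of the chi-square distribution with 1 df
is_significant_at_5_percent = statistic > 3.841
```

Under this arrangement the statistic lies far above the 5% critical value, which is in line with the small p-value reported for that row.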

Results

CpG methylation in experimentally detected alternative splice sites

The association of methylation with alternative splicing was examined by defining the level of CpG methylation close to sites where alternative splicing events have been reported. The frequency of alternative splicing sites near hyper- and hypomethylated CpGs (methylation frequencies ≥80% and ≤20%, respectively) was evaluated in ±200-nt sequences. It is evident from Table 1 that this frequency is significantly higher for hypermethylated than for hypomethylated sequences. It is, however, interesting to note that such differences are not observed with regard to the extent of methylation in alternative promoters (Table 1). Actually, when alternative promoters are excluded from the statistical correlation, the association of hypermethylated CpGs with alternative splicing events is strengthened. Particularly evident is also the association of hypermethylation with atypical splice sites, that is, strange splice sites and atypical intron splice ends (AT/AC).

ESE density in proximity with CpGs

To classify the methylation data for the 33,352 CpGs in chromosomes 6, 20, and 22 in different genetic regulatory regions, we first evaluated the frequency of the identified ESEs in ±50-nt sequences adjacent to each CpG, as a measure of their splicing regulatory complexity (Table 2A). It is evident from Table 2A that the average regulatory complexity obtained by the computational analysis of the ±50-nt sequences differs considerably among chromosomes. Specifically, in chromosome 20, there is a higher average ESE frequency for each CpG-adjacent sequence compared with chromosomes 6 and 22. Given that the ESEs' length usually ranges from 5 to 7 nt, we also investigated the number of ESEs in shorter sequences of ±30 nt length adjacent to CpGs. Decreasing the size of the analyzed sequences by 40% introduces considerable increase of the ESEs density, indicating that ESEs are clustered adjacent to the CpGs. This finding is probably unrelated to the presence of multiple ESEs in a single CpG, because the frequency of ESEs containing two CpGs is very low [e.g., 3/238 ESE hexamers identified according to Rescue-ESE (Fairbrother et al., 2004)]. It is, however, probable that the increase of CpG density could be related to the presence of more dense or overlapping ESEs. The most considerable increase is observed in chromosome 20.
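The density comparison between the two window sizes can be made explicit with a short calculation. The Python sketch below assumes that the parenthesized values in Table 2A are ESE counts per CpG normalized to 100 nt, so that a ±50-nt window counts as 100 nt and a ±30-nt window as 60 nt; this reading is inferred from the numbers, and the function name is hypothetical.

```python
def ese_per_100nt(n_eses, n_cpgs, flank_nt):
    """ESEs per CpG, normalized to a 100-nt window.

    flank_nt is the number of nucleotides analyzed on each side of the
    CpG, so the window length is 2 * flank_nt.
    """
    window_nt = 2 * flank_nt
    return n_eses / n_cpgs * 100 / window_nt


# Chromosome 6 counts from Table 2A: 10,683 CpGs
density_50 = ese_per_100nt(39271, 10683, 50)  # +/-50-nt window
density_30 = ese_per_100nt(33581, 10683, 30)  # +/-30-nt window
```

Under this assumption, a higher normalized value for the shorter window means that most ESEs found in the ±50-nt window already lie within ±30 nt of the CpG.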

Table 1. Alternative Splicing Events Reported in 400-nt Sequences Centered at the Analyzed CpGs, Relative to Their Level of Methylation

| Alternative splicing event | Hypermethylated (9012) | Hypomethylated (15,296) | p-Value |
| --- | --- | --- | --- |
| Cassette exon | 20 | 20 | 0.091 |
| Overlapping exon | 15 | 16 | 0.192 |
| Retained intron | 17 | 13 | 0.026 |
| Alternate 3′ intron ends | 9 | 5 | 0.035 |
| Strange intron ends (other than GT/AG, GC/AG, or AT/AC) | 52 | 31 | <0.0001 |
| AT/AC intron ends | 4 | 1 | |
| Alternate transcription ends | 1 | 0 | |
| Alternate 5′ intron ends | 3 | 7 | |
| Alternate promoter | 15 | 31 | 0.531 |
| Alternative splicing events for all sequences | 136 | 124 | <0.0001 |

Number of total sequences in each class is shown within parentheses.

Table 2A. CpG-Containing Sequences and Corresponding Exonic Splicing Enhancers in ±50-nt and ±30-nt Regions

| Chromosome | CpGs | ESEs in ±50 nt | ESEs in ±30 nt |
| --- | --- | --- | --- |
| CHR6 | 10,683 | 39,271 (3.68) | 33,581 (5.24) |
| CHR20 | 5486 | 29,942 (5.46) | 23,973 (7.28) |
| CHR22 | 17,183 | 67,764 (3.94) | 58,665 (5.69) |

The corresponding ESE frequencies in 100 nt centered at CpGs are shown within parentheses. ESEs, exonic splicing enhancers.

Table 2B. Exonic Splicing Enhancers in CpG-Dense and CpG-Poor Expressed Sequence Tags for Chromosomes 6, 20, and 22

| | CHR6 | CHR20 | CHR22 |
| --- | --- | --- | --- |
| Total ESTs | 446,330 | 199,686 | 191,500 |
| CpG-rich ESTs | 8576 | 3565 | 4072 |
| Average length | 452.4 | 384.5 | 437.5 |
| Average number (standard deviation) of ESEs per EST | 149.2 (53.9) | 124.5 (69) | 143.5 (62.9) |
| CpG-poor ESTs | 5912 | 3308 | 502 |
| Average length | 452.7 | 383.1 | 436.5 |
| Average number (standard deviation) of ESEs per EST | 52.8 (17.2) | 50.4 (23) | 67.8 (21.6) |

ESTs, expressed sequence tags.

To further verify that the splicing regulatory complexity in CpG-proximal sequences is higher than in the remaining sequences, we also evaluated the frequency of ESEs in CpG-dense and CpG-poor ESTs for the same chromosomes (Table 2B). It is evident from Table 2B that, although there is considerable variation among the studied chromosomes, the number of ESEs per EST is higher in CpG-rich ESTs than in CpG-poor ESTs, whereas the corresponding average EST length is nearly the same for the two classes in each of the three chromosomes. These results reinforce the previous evidence that intragenic CpGs might act as an additional regulatory factor in expressed sequences of high regulatory complexity.

Moreover, we investigated the spatial relationship of CpGs with respect to splicing regulatory factors, by evaluating the ESE positioning relative to the CpG sequence. As shown in Figure 1, it is evident that there is a stringently defined spatial relationship between the CpG and ESE distribution and that the ESEs are clustered around CpGs. Provided that the average ESE size is six nucleotides, it is evident that ESEs are evenly distributed around these sites of tentative epigenetic modification.

FIG. 1. Correlation of the ESE and CpG densities. The x-axis represents the average distance of the putative ESEs from a certain CpG and the y-axis shows the number of CpGs at the corresponding distances for chromosomes 6, 20, and 22. ESEs, exonic splicing enhancers.

Comparison of the ESE content and exonic distribution in the analyzed CpG-containing sequences

To further verify that the analyzed CpGs represent a useful sample for estimating the impact of methylation with respect to the regulation of splicing and ESEs, we evaluated the ESE frequency relative to the analyzed CpG density in chromosomal increments of 3.5 × 10^6 nt length (Fig. 2). The scatter plots shown in Figure 2 reveal a linear correlation between the CpG and ESE densities that is mostly observed in areas of moderate CpG density in all chromosomes. On the contrary, in sequences with very high CpG densities [>400 CpGs, probably corresponding to CIs and mini-CIs (Malousi et al., 2008)], the correlation coefficient decreases (from r = 0.971 for fewer than 400 CpGs to r = 0.901 for more than 400 CpGs in chromosome 6). The same analysis in chromosome 20 shows that the correlation coefficient is also reduced at high CpG densities (from r = 0.983 for fewer than 400 CpGs to r = 0.824 for more than 400 CpGs). In chromosome 20, the density of the analyzed CpGs is generally lower compared with that in chromosomes 6 and 22.

The ESE frequency with respect to the CpG density was also investigated for chromosomal increments of 1 × 10^6 nt length. The results indicate a similar average correlation between the CpG and ESE densities for all chromosomes (r = 0.965, r = 0.969, and r = 0.984 for chromosomes 6, 20, and 22, respectively).
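The binning and correlation steps described in this section can be sketched in Python as follows. The code assumes that CpG and ESE locations are available as lists of chromosomal coordinates; the function names are hypothetical and the sketch stands in for, rather than reproduces, the Matlab and Perl scripts mentioned in Materials and Methods.

```python
import math


def bin_counts(positions, bin_size):
    """Count features per chromosomal increment of bin_size nucleotides."""
    counts = {}
    for position in positions:
        index = position // bin_size
        counts[index] = counts.get(index, 0) + 1
    return counts


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def density_correlation(cpg_positions, ese_positions, bin_size):
    """Correlate per-bin CpG and ESE counts over bins that contain CpGs."""
    cpg_bins = bin_counts(cpg_positions, bin_size)
    ese_bins = bin_counts(ese_positions, bin_size)
    bins = sorted(cpg_bins)
    return pearson_r(
        [cpg_bins[b] for b in bins],
        [ese_bins.get(b, 0) for b in bins],
    )
```

Calling `density_correlation` with `bin_size=3_500_000` or `bin_size=1_000_000` would correspond to the two increment sizes used above; splitting the bins at 400 CpGs before calling `pearson_r` would give the separate coefficients for moderate and high CpG densities.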

Methylation at different CpG-containing putative ESEs

Following the classification of CpG sequences as hypermethylated and hypomethylated CpGs and intermediate methylation CpGs, we then investigated the total ESE frequency and that of the different splicing regulatory elements categories identified by ESEfinder. The data in Table 3 show the frequency of CpG methylation with respect to the number of putative ESE elements identified in ±50-nt sequences, adjacent to each CpG. This analysis reveals that, as expected, regardless of the chromosome tested, methylation is frequent in putative ESEs, which are exonic sequences. However, it also shows that the extent of methylation increases relative to the putative ESE density surrounding the CpG dinucleotide, and as different putative ESE types are frequently clustered or overlapping, methylation appears to be related to the putative ESE complexity.

In addition, the results in Tables 4A and 4B show that the methylation frequency in the different SR protein-binding ESEs varies according to the ESE type (SF2/ASF, SRp40, SRp55, and SC35) and that its frequency is invariable, regardless of the chromosome analyzed. Thus, the percentage of CpGs that show intermediate methylation levels (20%–80%) is higher in SRp55-binding elements compared with all the remaining SR protein-binding elements, and the methylation frequency in this putative ESE type shows slight variation among the three different chromosomes analyzed (Fig. 3). On the contrary, intermediate methylation is less frequent in SF2/ASF-binding motifs (Fig. 3). These results reveal that methylation is not only more frequent in the sequences corresponding to putative ESEs but also most probably ESE-type dependent.

FIG. 2. Linear regression analysis of the ESE density relative to the number of analyzed CpGs (diamonds for >400 CpGs and triangles for ≤400 CpGs). Each chromosome is divided in 3.5 × 10^6 nt length increments. (A) Chromosome 6; (B) chromosome 20; (C) chromosome 22.

Table 3. Exonic Splicing Enhancer Frequency Relative to the Extent of CpG Methylation in ±50-nt Nucleotide Sequences, Adjacent to Each CpG

| Chromosome | Hypermethylated CpGs (O) | ESEs | ESE/O | Hypomethylated CpGs (U) | ESEs | ESE/U | p-Value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 6 | 1580 | 10,736 | 6.795 (3.68a) | 6987 | 22,827 | 3.271 | <0.0001 |
| 20 | 1613 | 12,397 | 7.686 (5.45a) | 1350 | 7229 | 5.355 | <0.0001 |
| 22 | 5819 | 32,818 | 5.640 (3.94a) | 6959 | 24,156 | 3.471 | <0.0001 |

The ESE frequency was estimated in hyper- (O) and hypomethylated (U) CpG sequences.
aESE frequency in 100-nt sequences centered at analyzed CpGs, relative to the total CpGs in each chromosome (10,683, 5486, and 17,183 for chromosomes 6, 20, and 22, respectively).

Statistical analysis of the above results concerning the frequency of intermediate methylation reveals that it is significantly different in SF2/ASF- and SRp55-binding sequences compared with methylation in the remaining ESEs (SC35 and SRp40) (Table 4B). On the contrary, there is a stronger correlation between the levels of methylation in SC35- and SRp40-binding sequences.

Methylation in CIs

Using the same approach, we also evaluated the frequency of hyper- and hypomethylated CpGs in computationally identified CIs (Table 5). As expected, most of the CpGs in CIs were either hyper- or hypomethylated and intermediate methylation was infrequent in CIs compared with ESEs. CI CpGs represent the majority of all CpGs, at least for chromosomes 6 and 22 (85.5% and 80.86%, respectively). It is also noteworthy that in CIs there is a higher variation of the methylation frequency than in ESEs and that intermediate levels of methylation are more frequent in CIs in chromosome 20 (ratio: 0.325) compared with chromosomes 6 and 22 (ratio: 0.143 and 0.195, respectively).

Table 4A. Number of Different Splicing Regulatory Protein-Binding Elements in ±50-nt Sequences Centered at the Analyzed CpGs in Each Chromosome

| ESE type | Total | Hypermethylated (O), Chr6 | Hypermethylated (O), Chr20 | Hypermethylated (O), Chr22 | Hypomethylated (U), Chr6 | Hypomethylated (U), Chr20 | Hypomethylated (U), Chr22 | (O + U) ESEs | Intermediate methylation ESEs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SF2/ASF | 41,734 | 3088 (0.288) | 3425 (0.276) | 10,619 (0.323) | 7625 (0.334) | 2255 (0.312) | 8348 (0.346) | 35,360 | 6374 |
| SC35 | 38,661 | 3008 (0.280) | 3346 (0.270) | 9095 (0.277) | 6288 (0.275) | 2046 (0.283) | 6820 (0.282) | 30,603 | 8058 |
| SRp40 | 35,333 | 3059 (0.285) | 3646 (0.294) | 8373 (0.255) | 5543 (0.243) | 1941 (0.268) | 5653 (0.234) | 28,215 | 7118 |
| SRp55 | 21,249 | 1581 (0.147) | 1980 (0.160) | 4731 (0.144) | 3371 (0.148) | 987 (0.136) | 3335 (0.138) | 15,985 | 5264 |
| Total | 136,977 | 10,736 | 12,397 | 32,818 | 22,827 | 7229 | 24,156 | 110,163 | 26,814 |

Within parentheses, the number of hypermethylated CpGs in each ESE category versus total number of hyper- and hypomethylated ESEs (O and U) in each chromosome, respectively, is given.

Table 4B. Statistical Correlation of the Data for Intermediate Methylation Levels

| | SRp55 | SC35 | SRp40 |
| --- | --- | --- | --- |
| SF2/ASF | p < 0.0001 | p < 0.0001 | p < 0.0001 |
| SRp55 | - | p < 0.0001 | p < 0.0001 |
| SC35 | - | - | p = 0.019 |

FIG. 3. Relative frequency of methylation in different ESE types for all chromosomes. The frequency of hypermethylated and hypomethylated CpGs and CpGs with intermediate methylation levels was estimated for each type of ESE relative to the total number of each ESE type in all three chromosomes. Data were obtained from Table 4.

CpG methylation adjacent to different dinucleotide repeats

As the presence of specific repetitive CpG sequence elements is supposed to influence significantly DNA winding and nucleosome positioning and could potentially influence splicing, we determined the levels of methylation in CpGs with respect to their distance from the next CpG dinucleotide. It is evident from Figure 4A that sequences with higher CpG density are frequently hypomethylated. However, methylation is slightly more frequent in CpGs that are divided by 10 nucleotides from the following CpG, compared with adjacent CpGs or those separated by <10 nucleotides. These data could indicate that although local helical distortions introduced by successive methylated CpGs might not favor methylation, this limitation is less stringent in 12-nt CpG repeats, possibly because of the formation of specific, alternative helical structures.
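The spacing measure used here can be illustrated with a short Python sketch. It assumes that "separated by N nucleotides" refers to the number of nucleotides lying between one CG dinucleotide and the next, so that adjacent CpGs (CGCG) have a gap of 0 and a gap of 10 corresponds to a 12-nt repeat unit; the function names are hypothetical.

```python
def cpg_positions(seq):
    """Return the 0-based start index of every CG dinucleotide in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find("CG")
    while start != -1:
        positions.append(start)
        start = seq.find("CG", start + 1)
    return positions


def gaps_to_next_cpg(seq):
    """Number of nucleotides between each CpG and the following CpG."""
    positions = cpg_positions(seq)
    # Subtract 2 so that the CG dinucleotide itself is not counted in the gap
    return [nxt - cur - 2 for cur, nxt in zip(positions, positions[1:])]
```

Grouping the methylation class of each CpG by its gap value would then give the counts needed for a hyper/hypo ratio per spacing, as plotted in Figure 4A.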

In addition, we investigated the level of methylation in CpGs adjacent to other repetitive sequence elements, some of which contribute to nucleosome positioning (Fig. 4B). The TA-repeat sequence appears to contribute considerably to CpG methylation, whereas other dinucleotide repeats, for example, TC, do not introduce similar preferences with respect to the methylation levels. Again, this analysis shows that the nucleotide distance separating these dinucleotide repeats is evidently a critical parameter in the process of methylation.

Discussion

Evidence regarding the involvement of methylation in splicing is given in very recent publications (Laurent et al., 2010; Tao et al., 2010). Moreover, clinical data regarding p53 splicing-deregulating mutants in cancer-related syndromes (Kouidou et al., 2009) and cellular deregulation in the aging cell (Bork et al., 2010) clearly show the critical role of methylation changes in human disease, the use of stem cells and its association with splicing. However, the mechanism by which epigenetic processes might be involved in splicing remained unresolved.

Our results, which are based on the analysis of a large set of methylation data selected from three chromosomes and provided by the HEP, reveal that CpG hypermethylation is strongly associated with alternative and noncanonical splicing events and with atypical splice sites, that is, strange splice sites and atypical intron splice ends (AT/AC), which are related to the minor spliceosome (Levine and Durbin, 2001). The above data could provide an explanation for the influence of MeCP2 in the moderation of the splicing process (Moretti and Zoghbi, 2006).

Table 5. Methylation in CpGs of CpG Islands in Chromosomes 6, 20, and 22

| Chromosome | Total CpGs (O + U + Int) | O + U [(O + U)/total CpGs] | CI (O + U) [CI (O + U)/(O + U)] | CIint [CIint/CI] |
| --- | --- | --- | --- | --- |
| Chr6 | 10,683 | 8567 [0.80] | 7411 [0.85] | 1233 [0.143] |
| Chr20 | 5486 | 2963 [0.54] | 1445 [0.49] | 696 [0.325] |
| Chr22 | 17,183 | 12,778 [0.74] | 10,333 [0.81] | 2503 [0.195] |

The 400-nt sequences centered at CpGs were analyzed for CpG islands. CI(O + U), hyper- (O) and hypomethylated (U) CpGs in CIs; CIint, intermediate methylation (Int) in CI; CI, CpG islands.

FIG. 4. (A) The frequency ratio [m(C) >80]/[m(C) <20] of hypermethylated cytosines (m(C) >80) and hypomethylated cytosines (m(C) <20) in repetitive CpGs separated by different number of nucleotides in the ±50-nt region adjacent to the methylated CpG. (B) The corresponding frequency ratio in other repetitive dinucleotide motifs.

In addition, our results show the strong association ofcomplex splicing regulatory sites (containing a large numberof ESEs) with CpGs, as well as the nonrandom distribution ofmethylation among different putative splicing regulatoryelements. SF2/ASF-binding sites, which are involved inconstitutive/alternative pre-mRNA splicing selection (Kra-lovicova and Vorechovsky, 2007; Sanford et al., 2008), areshown to be mostly hyper- or hypomethylated, similar toCpGs in CIs. On the contrary, methylation in SRp55-bindingmotifs, which are involved in the regulation of cellular re-sponse to intra- and extracellular conditions (Tran andRoesser, 2003; Filippov et al., 2008), mostly assumes inter-mediate values.

Finally, it has recently become evident that the presence of CpG sequences in minigenes contributes to the efficiency of their translation (Bauer et al., 2010) and that these sequences are associated with nucleosomal positioning (Widlund et al., 1999). In view of the high frequency of methylation in TA repeats, which also strongly favor nucleosome positioning (Adams et al., 1987; Caserta et al., 2009) and are present in the TATAAACGCC sequence (Widlund et al., 1999), it is probable that nucleosome positioning and assembly are associated with the presence of methylated cytosine and that DNA-binding proteins might play a positive role in this process.

Conclusion

In conclusion, our results show that there is a correlation between the distribution of DNA methylation and the presence of sequence motifs affecting splicing and nucleosome positioning. These data support an additional role for DNA methylation, namely, that of a molecular switch contributing to the regulation of the splicing process, probably related to splicing regulatory factor binding and nucleosome positioning. Further studies on these processes will improve our understanding of the mechanisms controlling splicing and differentiation.

Disclosure Statement

No competing financial interests exist.

References

Adams, R.L., Davis, T., Rinaldi, A., and Eason, R. (1987). CpG deficiency, dinucleotide distributions and nucleosome positioning. Eur J Biochem 165, 107–115.

Arnott, S., Chandrasekaran, R., Puigjaner, L.C., Walker, J.K., Hall, I.H., Birdsall, D.L., et al. (1983). Wrinkled DNA. Nucleic Acids Res 11, 1457–1474.

Bauer, A.P., Leikam, D., Krinner, S., Notka, F., Ludwig, C., Langst, G., et al. (2010). The impact of intragenic CpG content on gene expression. Nucleic Acids Res 38, 3891–3908.

Bork, S., Pfister, S., Witt, H., Horn, P., Korn, B., Ho, A.D., et al. (2010). DNA methylation pattern changes upon long-term culture and aging of human mesenchymal stromal cells. Aging Cell 9, 54–63.

Cartegni, L., Wang, J., Zhu, Z., Zhang, M.Q., and Krainer, A.R. (2003). ESEfinder: a web resource to identify exonic splicing enhancers. Nucleic Acids Res 31, 3568–3571.

Caserta, M., Agricola, E., Churcher, M., Hiriart, E., Verdone, L., Di Mauro, E., et al. (2009). A translational signature for nucleosome positioning in vivo. Nucleic Acids Res 37, 5309–5321.

Chodavarapu, R.K., Feng, S., Bernatavichute, Y.V., Chen, P.Y., Stroud, H., Yu, Y., et al. (2010). Relationship between nucleosome positioning and DNA methylation. Nature 466, 388–392.

Choi, J.K., and Kim, Y.J. (2009). Intrinsic variability of gene expression encoded in nucleosome positioning sequences. Nat Genet 41, 498–503.

Constancia, M., Pickard, B., Kelsey, G., and Reik, W. (1998). Imprinting mechanisms. Genome Res 8, 881–900.

Eckhardt, F., Lewin, J., Cortese, R., Rakyan, V.K., Attwood, J., Burger, M., et al. (2006). DNA methylation profiling of human chromosomes 6, 20 and 22. Nat Genet 38, 1378–1385.

Fairbrother, W.G., Yeo, G.W., Yeh, R., Goldstein, P., Mawson, M., Sharp, P.A., et al. (2004). RESCUE-ESE identifies candidate exonic splicing enhancers in vertebrate exons. Nucleic Acids Res 32, W187–W190.

Fang, Y., Bai, C., Wei, Y., Lin, S.B., and Kan, L. (1995). Effect of selective cytosine methylation and hydration on the conformations of DNA triple helices containing a TTTT loop structure by FT-IR spectroscopy. J Biomol Struct Dyn 13, 471–482.

Filippov, V., Schmidt, E.L., Filippova, M., and Duerksen-Hughes, P.J. (2008). Splicing and splice factor SRp55 participate in the response to DNA damage by changing isoform ratios of target genes. Gene 420, 34–41.

Gardiner-Garden, M., and Frommer, M. (1987). CpG islands in vertebrate genomes. J Mol Biol 196, 261–282.

Goncalves, V., Matos, P., and Jordan, P. (2009). Antagonistic SR proteins regulate alternative splicing of tumor-related Rac1b downstream of the PI3-kinase and Wnt pathways. Hum Mol Genet 18, 3696–3707.

Haider, S., Ballester, B., Smedley, D., Zhang, J., Rice, P., and Kasprzyk, A. (2009). BioMart Central Portal: unified access to biological data. Nucleic Acids Res 37, W23–W27.

Jones, P.A., and Laird, P.W. (1999). Cancer epigenetics comes of age. Nat Genet 21, 163–167.

Karolchik, D., Hinrichs, A.S., Furey, T.S., Roskin, K.M., Sugnet, C.W., Haussler, D., et al. (2004). The UCSC Table Browser data retrieval tool. Nucleic Acids Res 32, D493–D496.

Kouidou, S., Agidou, T., Kyrkou, A., Andreou, A., Katopodi, T., Georgiou, E., et al. (2005). Non-CpG cytosine methylation of p53 exon 5 in non-small cell lung carcinoma. Lung Cancer 50, 299–307.

Kouidou, S., Malousi, A., and Maglaveras, N. (2009). Li-Fraumeni and Li-Fraumeni-like syndrome mutations in p53 are associated with exonic methylation and splicing regulatory elements. Mol Carcinog 48, 895–902.

Kralovicova, J., and Vorechovsky, I. (2007). Global control of aberrant splice-site activation by auxiliary splicing sequences: evidence for a gradient in exon and intron definition. Nucleic Acids Res 35, 6399–6413.

Laurent, L., Wong, E., Li, G., Huynh, T., Tsirigos, A., Ong, C.T., et al. (2010). Dynamic changes in the human methylome during differentiation. Genome Res 20, 320–331.

Levine, A., and Durbin, R. (2001). A computational scan for U12-dependent introns in the human genome sequence. Nucleic Acids Res 29, 4006–4013.

Li, X., and Manley, J.L. (2005). Inactivation of the SR protein splicing factor ASF/SF2 results in genomic instability. Cell 122, 365–378.

Lister, R., Pelizzola, M., Dowen, R.H., Hawkins, R.D., Hon, G., Tonti-Filippini, J., et al. (2009). Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315–322.


Liu, H., Wu, J., Xie, J., Yang, X., Lu, Z., and Sun, X. (2008). Characteristics of nucleosome core DNA and their applications in predicting nucleosome positions. Biophys J 94, 4597–4604.

Malousi, A., Maglaveras, N., and Kouidou, S. (2008). Intronic CpG content and alternative splicing in human genes containing a single cassette exon. Epigenetics 3, 69–73.

Moretti, P., and Zoghbi, H.Y. (2006). MeCP2 dysfunction in Rett syndrome and related disorders. Curr Opin Genet Dev 16, 276–281.

Pandya-Jones, A., and Black, D.L. (2009). Co-transcriptional splicing of constitutive and alternative exons. RNA 15, 1896–1908.

Rakyan, V.K., Hildmann, T., Novik, K.L., Lewin, J., Tost, J., Cox, A.V., et al. (2004). DNA methylation profiling of the human major histocompatibility complex: a pilot study for the human epigenome project. PLoS Biol 2, e405.

Razin, A., and Riggs, A.D. (1980). DNA methylation and gene function. Science 210, 604–610.

Robertson, K.D. (2002). DNA methylation and chromatin—unraveling the tangled web. Oncogene 21, 5361–5379.

Sanford, J.R., Coutinho, P., Hackett, J.A., Wang, X., Ranahan, W., and Caceres, J.F. (2008). Identification of nuclear and cytoplasmic mRNA targets for the shuttling protein SF2/ASF. PLoS One 3, e3369.

Schwartz, S., Meshorer, E., and Ast, G. (2009). Chromatin organization marks exon-intron structure. Nat Struct Mol Biol 16, 990–995.

Svozil, D., Kalina, J., Omelka, M., and Schneider, B. (2008). DNA conformations and their sequence preferences. Nucleic Acids Res 36, 3690–3706.

Takai, D., and Jones, P.A. (2003). The CpG island searcher: a new WWW resource. In Silico Biol 3, 235–240.

Tao, Y., Xi, S., Briones, V., and Muegge, K. (2010). Lsh mediated RNA polymerase II stalling at HoxC6 and HoxC8 involves DNA methylation. PLoS One 5, e9163.

Tran, Q., and Roesser, J.R. (2003). SRp55 is a regulator of calcitonin/CGRP alternative RNA splicing. Biochemistry 42, 951–957.

Widlund, H.R., Kuduvalli, P.N., Bengtsson, M., Cao, H., Tullius, T.D., and Kubista, M. (1999). Nucleosome structural features and intrinsic properties of the TATAAACGCC repeat sequence. J Biol Chem 274, 31847–31852.

Address correspondence to:
Prof. Sofia Kouidou
Department of Biological Chemistry
School of Medicine
Aristotle University
Thessaloniki 54124
Greece

E-mail: [email protected]

Received for publication June 02, 2010; received in revised form November 01, 2010; accepted November 14, 2010.


This article has been cited by:

1. Alika K Maunakea, Iouri Chepelev, Kairong Cui, Keji Zhao. 2013. Intragenic DNA methylation modulates alternative splicing by recruiting MeCP2 to promote exon recognition. Cell Research.

2. Marta Kulis, Ana C. Queirós, Renée Beekman, José I. Martín-Subero. 2013. Intragenic DNA methylation in transcriptional regulation, normal differentiation and cancer. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms.

3. J. Wan, V. F. Oliver, H. Zhu, D. J. Zack, J. Qian, S. L. Merbs. 2013. Integrative analysis of tissue-specific methylation and alternative splicing identifies conserved transcription factor binding motifs. Nucleic Acids Research.

4. M. T. Eadon, H. E. Wheeler, A. L. Stark, X. Zhang, E. L. Moen, S. M. Delaney, H. K. Im, P. N. Cunningham, W. Zhang, M. E. Dolan. 2013. Genetic and epigenetic variants contributing to clofarabine cytotoxicity. Human Molecular Genetics.

5. Marisol Resendiz, Yuanyuan Chen, Nail C Öztürk, Feng C Zhou. 2013. Epigenetic medicine and fetal alcohol spectrum disorders. Epigenomics 5:1, 73-86.

6. Sylvain Guibert, Michael Weber. Functions of DNA Methylation and Hydroxymethylation in Mammalian Development 104, 47-83.

7. T.-J. Chuang, F.-C. Chen, Y.-Z. Chen. 2012. Position-dependent correlations between DNA methylation and the evolutionary rates of mammalian coding exons. Proceedings of the National Academy of Sciences 109:39, 15841-15846.

8. Andigoni Malousi, Sofia Kouidou. 2012. DNA hypermethylation of alternatively spliced and repeat sequences in humans. Molecular Genetics and Genomics 287:8, 631-642.

9. Michelle E. Schober, Xingrao Ke, Bohan Xing, Benjamin P. Block, Daniela F. Requena, Robert McKnight, Robert H. Lane. 2012. Traumatic Brain Injury Increased IGF-1B mRNA and Altered IGF-1 Exon 5 and Promoter Region Epigenetic Characteristics in the Rat Pup Hippocampus. Journal of Neurotrauma 29:11, 2075-2085.

10. Susan A. Weiner, Amy L. Toth. 2012. Epigenetics in Social Insects: A New Direction for Understanding the Evolution of Castes. Genetics Research International 2012, 1-11.

