Genetic-epigenetic interactions: Sequence-dependent and
independent DNA methylation
by
Carolyn Ptak
A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy
Graduate Department of Pharmacology and Toxicology
University of Toronto
© Copyright by Carolyn Ptak 2013
Abstract
The field of human epigenetics has become widely accepted, yet many basic principles remain
unclear. It is important to determine the extent to which DNA methylation profiles are influenced
by DNA sequence; we addressed this question using tissues from human monozygotic (MZ) and
dizygotic (DZ) twins, post-mortem brain and germline samples. First, we analyzed white blood
cells (WBC), buccal epithelium, and rectal biopsies from MZ twins, and annotated the epigenetic
metastability of ~6,000 unique genomic regions. Our study was the first to utilize epigenome-
wide profiling to document DNA methylation differences in MZ twins. We also found that DZ
twins exhibited more epigenetic differences compared to MZ twins. Two competing hypotheses
were tested: 1) DNA sequence differences caused the additional DZ epigenetic variation, or 2)
additional epigenetic differences are present in the zygotes of DZ co-twins. Our animal and in
silico studies supported hypothesis 2, providing the first evidence for twin-based epigenetic
heritability. Still, genetic impact on epigenetic variation cannot be excluded. To explore DNA–
epigenetic interactions and their role in disease, we mapped allele-specific methylation (ASM) in
brain and sperm DNA from individuals affected with major psychosis and controls. We found
that ~2.5% of brain SNPs show ASM, although genomic distribution of these “epiSNPs” varies
between cohorts. EpiSNPs were generally enriched in untranslated regions (UTRs) and regions
surrounding genes, and depleted in exons. The schizophrenia cohort contained twice as many
epiSNPs as the control and bipolar disorder cohorts, although the sets largely overlapped. Most epiSNP Gene
Ontology categories were related to brain development and function; differences between cohorts
were observed in glutamate and insulin pathways. Tissue-specific epiSNPs were also detected in
sperm DNA. Deep sequencing analysis revealed that any SNP could potentially demonstrate
low-level ASM. This work describes various aspects of genetic–epigenetic interactions, while
supporting epigenetic heritability and the role of genetic-epigenetic interactions in major
psychiatric disease.
Acknowledgments
First, I would like to thank my supervisor, Art Petronis, who has offered an incredible amount of
support, guidance and inspiration over the years. I would also like to thank the members of my
committee, Albert Wong and Young Kim, for their advice, creativity, and even for grilling me at
our annual meetings. There have been many excellent people in the lab, all of whom contributed
to the completion of my work, but I would especially like to thank Zach Kaminsky for training
me on all basic lab techniques and for involving me in every step of the twin project, Gabriel Oh
for sharing the entire PhD process with me, Jon Mill and Ian Weaver for being entertaining
British fellows, and Miki Susic for keeping everyone in line and the lab running smoothly. I
would also like to thank everyone who directly contributed to the work included in this thesis, in
particular, Paul Boutros and Denise Mak, the bioinformaticians who developed a number of
analytical tools for many of these analyses. Additionally, I am thankful for the departmental
fellowships, Ontario Graduate Scholarships, and CIHR Master’s and Doctoral awards that I
received to fund this research. Finally, I want to thank my husband, Greg, and our lizard for
listening to my insane rants about “the lab” every night for the last 5.5 years.
Table of Contents
Acknowledgments
Table of Contents
List of Tables
List of Figures
List of Appendices
List of Abbreviations
Chapter 1 Introduction
Statement of problem
Study objectives and rationale
Review of the literature
Epigenetics
Twin studies and the separation of genetic and epigenetic factors in disease
The putative role of genetic-epigenetic interactions in complex disease
Genetic and epigenetic studies of major psychosis
ASM: relevance to studies of complex disease
Emergence of epigenetic treatments
Chapter 2 Materials and Methods
Contributions: DNA dependent and independent DNA methylation in twins
Twin sample
DNA methylation profiling
Animal studies
Data analysis
Test for association of epigenetic difference with cellular heterogeneity
Biological and technical variation
Spot-wise epigenetic variation
Cross tissue comparison
Investigation of genomic element class
Gene ontology analysis
Validation of the microarray findings
Contributions: ASM and its putative role in complex disease
Sample preparation
EpiSNP identification
Verification of microarray results
Examination of linkage disequilibrium effects
Deep sequencing analysis of non-epiSNPs
Chapter 3 Results and Discussion
Comparison of MZ versus DZ epigenetic profiles
Genomic frequency and distribution of ASM
Bisulfite verification of selected SNPs
Linkage disequilibrium does not cause false-positives
ASM in the major psychosis cohort
Tissue specificity of ASM
Sensitivity Analysis
Low level ASM across the genome
Chapter 4 General Discussion and Conclusions
Changing concepts of epigenetic regulation
Genetic-epigenetic interplay in complex disease
Future directions
Appendices
Appendix 1. Twin Study Supplementary Notes
Appendix 2. Allele-Specific Methylation Study Supplementary Notes
Copyright Acknowledgements
References
List of Tables
Table 2.1 Sodium bisulfite treated loci and primers
Table 2.2 Sodium bisulfite treated SNP loci and primers
Table 2.3 Forward 454 sequencing reads per amplicon
Table 3.1 GO analysis of loci with high MZ co-twin epigenetic similarity
Table 3.2 GO analysis of loci with low MZ co-twin epigenetic similarity
Table 3.3 Differences in epiSNP chromosomal distribution in brain and sperm
Table 3.4 Differences in epiSNP functional class distribution in brain and sperm
Table 3.5 Average methylation across loci
Table 3.6 Direction of methylation and gene information for common epiSNPs
Table 3.7 Top five GO categories per brain cohort
Table 3.8 Top 5 enriched GO categories per brain cohort
Table 3.9 Glutamate- and insulin-related GO categories per brain cohort
Table 3.10 Summary of mitochondria-related epiSNPs per brain cohort
Table 3.11 Minor epiSNPs showing significant ASM effects in brain
Table A2.1 Stanley sample demographics
Table A2.2 Harvard sample demographics
Table A2.3 CAMH sample demographics
Table A2.4 Methylation levels at all CpG sites
Table A2.5 EpiSNPs and associated gene information
Table A2.6 454 analysis sample genotypes
List of Figures
Figure 1.1 Hypothetical mechanisms of epiSNP action
Figure 2.1 Twin study workflow
Figure 2.2 EpiSNP study workflow
Figure 3.1 Biological vs. technical variation
Figure 3.2 Correlations between microarray and sodium bisulfite sequencing data
Figure 3.3 Pyrosequencing correlations as a function of distance
Figure 3.4 Karyogram of MZ co-twin epigenetic similarity in WBCs
Figure 3.5 Raw binding intensities of MC and DC MZ twin hybridizations
Figure 3.6 MZ and DZ ICC distributions in buccal cells
Figure 3.7 Karyogram of MZICC-DZICC values in buccal cells of DC MZ twins
Figure 3.8 Technical variation volcano plots of HpaII and MspI based enrichments
Figure 3.9 Distributions of inbred and outbred epigenetic variation
Figure 3.10 Enrichment of unmethylated DNA fraction
Figure 3.11 Chromosomal distribution of brain epiSNPs
Figure 3.12 Functional class distribution of brain epiSNPs
Figure 3.13 Methylation levels observed for an epiSNP
Figure 3.14 Methylation levels observed for a non-epiSNP
Figure 3.15 Distances between SNPs and MSRE SNPs
Figure 3.16 Linkage disequilibrium scores in brain and sperm
Figure 3.17 Total epiSNPs per cohort in brain
Figure 3.18 GO categories per cohort in brain
Figure 3.19 EpiSNPs detected in sperm DNA
Figure 3.20 Overlapping epiSNPs between brain and sperm
Figure 3.21 Chromosomal distribution of sperm epiSNPs
Figure 3.22 Functional class distribution of sperm epiSNPs
Figure 3.23 Sensitivity analysis
Figure 3.24 Sensitivity analysis stratified by strength of associations
Figure 3.25 Deep sequencing workflow
Figure 3.26 CpG count per SNP
Figure 3.27 Methylation levels per group for a sample minor epiSNP
Figure A1.1 Karyogram of MZ co-twin epigenetic similarity in buccal cells
Figure A1.2 Karyogram of MZ co-twin epigenetic similarity in gut
Figure A1.3 Karyogram of MZICC-DZICC values in WBCs
Figure A1.4 Karyogram of MZICC-DZICC values in buccal cells of MC MZ twins
List of Appendices
Appendix 1. Twin Study Supplementary Notes
Appendix 2. Allele-Specific Methylation Study Supplementary Notes
List of Abbreviations
5-hydroxymethylcytosine 5-hmC
5-methylcytosine 5-mC
3’ untranslated region 3’ UTR
5,10-methylenetetrahydrofolate reductase MTHFR
Acute lymphoblastic leukemia ALL
Allele-specific expression ASE
Allele-specific methylation ASM
Alpha-ketoglutarate-dependent dioxygenase FTO
Base-pair bp
Bipolar disorder BD
Breast cancer type 1 BRCA1
Caenorhabditis elegans C. elegans
Copy number variants CNV
CpG islands CGI
Cyclin-dependent kinase inhibitor 2A p16INK4a
Differentially methylated regions DMR
Dichorionic DC
Dizygotic DZ
DNA methyltransferase DNMT
Endoplasmic reticulum ER
Epigenome-wide association studies EWAS
Expression quantitative trait loci eQTL
False discovery rate FDR
Fragile X mental retardation 1 FMR1
GABA plasma membrane transporter-1 GAT-1
Gene ontology GO
Genome-wide association studies GWAS
Glutamic acid decarboxylase 67 GAD67
Glutathione transferase GST
G protein-coupled inwardly rectifying potassium channel KCNJ6
Head and neck squamous cell carcinoma HNSCC
Heterozygosity quotient HQ
Histone acetyltransferase HAT
Histone deacetylase HDAC
HLA complex group 9 HCG9
Human embryonic stem cell HESC
Human leukocyte antigen HLA
Intraclass correlation coefficient ICC
Imprinted differentially methylated regions iDMR
International Human Epigenome Consortium IHEC
Jak and microtubule interacting protein MARLIN-1
Kilobase Kb
Linkage disequilibrium LD
Long interspersed nucleotide element LINE
Major depressive disorder MDD
Melanin-concentrating hormone receptor 1 MCHR1
Methylation-sensitive representational difference analysis MS-RDA
Methylation-sensitive restriction enzyme MSRE
Methyl-CpG binding domains MBD
Methyl-CpG-binding protein 2 MeCP2
Micro RNA miRNA
Monochorionic MC
Monozygotic MZ
MutL homolog 1 hMLH1
Nei endonuclease VIII-like 1 NEIL1
Non-coding RNA ncRNA
O(6)-methylguanine DNA methyltransferase MGMT
Peripheral blood leukocyte PBL
Polymerase chain reaction PCR
Potassium chloride co-transporter 3 SLC12A6
Quantitative trait loci QTL
Reelin RELN
RNA-induced silencing complex RISC
S-adenosyl-methionine SAM
Schizophrenia SZ
Secretogranin II SCG2
Serotonin receptor 1A 5HT1A
Single nucleotide polymorphism SNP
Small interfering RNA siRNA
Transcription factor TF
Ten-eleven translocation TET
Toronto Centre for Applied Genomics TCAG
Type 2 diabetes T2D
Vitamin D receptor VDR
White blood cells WBC
Chapter 1 Introduction
Statement of problem
DNA sequence has long been regarded as the means for encoding information in the mammalian
cell, and when the extent of human genetic diversity began to emerge, a few years after the
release of the complete human genome, Science declared the discovery to be the “breakthrough
of the year” [1], as it was hoped that these sequence differences would explain inter-individual
phenotypic variation. Unfortunately, there was still a considerable disconnect between DNA
sequence and phenotypic outcomes, but epigenetic factors were quickly nominated as the
connection between environment and genetics that should be added to this “first draft” of the
genome [2]. Although associations between genetic and epigenetic factors are starting to
materialize, their complex relationship remains somewhat nebulous. It is critical to understand
the influence of genetics on epigenetics and vice versa, and the twin study design provides an
elegant strategy for teasing apart their effects.
Twin research has been of fundamental importance in human studies for two main reasons.
First, phenotypic discordance in monozygotic (MZ) co-twins has traditionally indicated a role of
environment, and twin studies offer a means to measure the relative contributions of genes and
environmental factors. Countless twin studies have been performed over the last century on
almost any trait imaginable, but primarily on human disease [3], although an acceptable
mechanistic explanation for MZ discordance has yet to be presented. In the last decade,
evidence has been accumulating that epigenetic modifications of DNA and histones can have a
primary role in phenotypic outcomes, including human disease [4]. DNA methylation shows
only partial stability, which could be caused by a wide variety of factors, including
developmental programs, environment, hormones, and stochastic events [5-8]. Such epigenetic
metastability may result in substantial epigenetic differences across genetically identical
organisms [9]. Several studies have identified epigenetic differences, either at selected genes of
MZ twins [10-13] or in the overall epigenome [14]. It has become evident that the MZ study
design is quite useful for the investigation of methylation differences that are not sequence-
dependent. Despite this promising start, no epigenome-wide studies have yet been conducted to
catalogue the extent of this phenomenon, and of the targeted studies, few have been done in
tissues other than peripheral blood cells.
The second major benefit of the twin design is that comparison of phenotypic concordance rates
in MZ twins versus dizygotic (DZ) twins is a powerful strategy to estimate heritability. Nearly
universally, MZ twins show some degree of discordance, although generally less than is seen
in DZ twins. These observations provided the basis for the current paradigm of
human normal and morbid biology, which focuses on DNA sequence variation and
environmental differences. The extent to which DZ twins are different remains unknown, as
does the degree to which DNA sequence variants can influence local methylation levels, and
how this factors into the differences observed between MZ and DZ twins. Our twin study
demonstrated that DZ twins exhibited a larger degree of epigenetic variation in comparison to
MZ twins, which is most likely an outcome of epigenetic differences in the zygotes. At the same
time, we could not fully exclude the putative role of DNA sequence variants on epigenetic
variation, and subsequently dedicated significant effort to mapping of DNA-epigenetic
interactions at common single nucleotide polymorphisms (SNPs).
SNPs have been investigated in many diseases and conditions [15-17], but their actual
contributions remain largely unknown, and knowledge of individual SNP risk factors fails to
fully explain the heritability estimates for complex traits [18]. In genome-wide association
studies (GWAS), SNPs frequently associate with certain phenotypes, but many SNPs detected
with this approach do not reach significance. We aimed to demonstrate that there is
heterogeneity within the A and B alleles at many SNPs, and that stratification based on
epigenetic properties would greatly enhance the ability of GWAS to detect strong
markers of disease and reach more meaningful conclusions.
It has been found that SNPs may exhibit allele-specific methylation (ASM); however, this
evidence is derived from a limited number of studies that did not include a sufficient number of
samples [19-21]. ASM may play an important role in phenotypic diversity and disease etiology,
yet its occurrence and link to genetic polymorphism remain unknown. Despite an intense
interest in single-locus ASM associated with various cancer subtypes [22-24], this phenomenon
has not been investigated in the context of other complex non-Mendelian diseases. Major
psychosis, a term that describes both schizophrenia (SZ) and bipolar disorder (BD), is a perfect
example of a disease that exhibits substantial heritability, yet the patterns of inheritance and the
exact genes or SNPs involved remain largely undiscovered. In the case of SZ, GWAS have
identified a number of associated SNPs, yet only a small minority of those SNPs has actually
been found to be functionally related to the phenotypes of affected individuals, and none of
them are classified as biomarkers [25]. While the natural response seems to be intensification of
the same research strategy (larger sample sizes, higher-resolution mapping, etc.), we should also
continue to search for new disease mechanisms. DNA sequence variants with the ability to
interact with epigenetic modifications are a promising new field of study, as these “epiSNPs”
could alter the expression of genes in cis and, potentially, in trans.
It is essential to understand the interaction between genetics and epigenetics in both normal
individuals and those in a disease state, but overall, our knowledge of sequence-dependent and
independent DNA methylation is severely lacking. Using microarrays, a twin study design and a
genome-wide interrogation of ASM effects in cases and controls, we will attempt to solve some
of these fundamental molecular mysteries. Ultimately, we hope that our findings can be utilized
in the identification of molecular targets for new pharmaceuticals, which will be applied in the
treatment of SZ, BD and other complex diseases.
Study objectives and rationale
Many questions remain concerning the relationship between DNA sequence and epigenetic
factors; we will never see a complete picture of genomic activity until these interactions are
elucidated. One goal is to determine the relative amount of sequence-independent epigenetic
variation, and the MZ twin model provides an effective way to study it. In genetically identical
organisms, such as human MZ twins, epigenetic patterns may drift due to the partial stability of
DNA methylation and other epigenetic modifications. Previous studies have identified
epigenetic metastability in MZ twins, but they have failed to examine these differences in a
large-scale, genome-wide manner; thus, the first objective was to map the DNA methylation
differences between MZ co-twins using WBC, gut, and buccal epithelial tissue.
The comparison of concordance rates between MZ and DZ twins is one method to estimate the
contributions of genetic (heritability) and non-genetic factors (environment) for any given trait;
we wished to apply this same model to the investigation of epigenetic variation. Very few
studies have touched upon the concept of epigenetic or “soft” inheritance, which refers to the
transmission of epigenetic marks from parent to offspring through the germ cells. The epigenetic
marks established in the majority of cells over the lifetime of an organism are mostly irrelevant
to the next generation, with the exception of those occurring in the mature gametes [26]. While
it is known that a large-scale erasure of methylation marks occurs during early mammalian
development, presumably to restore all cell lineages to a common ground state, several
examples of meiotically-transmitted epi-alleles have been discovered in a variety of organisms,
including humans [27]. In order to investigate the extent of this phenomenon, our second
objective was to compare the DNA methylation variation between MZ and DZ twins, using
white blood cells (WBC) and buccal epithelial cells.
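
As an illustration of how such an MZ versus DZ comparison can be quantified, the following minimal Python sketch computes a one-way intraclass correlation coefficient (ICC) for a set of twin pairs at a single locus and contrasts the MZ and DZ values. The function names, the choice of the one-way ICC estimator, and the methylation values are assumptions made for illustration only and are not taken from the analyses in this thesis. Falconer's classical estimate, 2 x (r_MZ - r_DZ), is included only as a familiar reference point.

```python
from statistics import mean


def twin_icc(pairs):
    """One-way random-effects ICC for a list of (twin_a, twin_b) values."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two twin pairs")
    grand = mean(v for pair in pairs for v in pair)
    pair_means = [(a + b) / 2 for a, b in pairs]
    # Between-pair mean square (k = 2 members per pair, n - 1 degrees of freedom).
    msb = 2 * sum((m - grand) ** 2 for m in pair_means) / (n - 1)
    # Within-pair mean square (n * (k - 1) = n degrees of freedom).
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for (a, b), m in zip(pairs, pair_means)) / n
    if msb + msw == 0:
        raise ValueError("no variation in the data; ICC is undefined")
    return (msb - msw) / (msb + msw)


def falconer_h2(r_mz, r_dz):
    """Falconer's classical heritability estimate from twin correlations."""
    return 2 * (r_mz - r_dz)


# Hypothetical methylation values (arbitrary units) at one locus.
mz_pairs = [(0.62, 0.60), (0.35, 0.38), (0.80, 0.77), (0.51, 0.55)]
dz_pairs = [(0.62, 0.48), (0.35, 0.50), (0.80, 0.66), (0.51, 0.40)]

icc_mz = twin_icc(mz_pairs)
icc_dz = twin_icc(dz_pairs)
print("MZ ICC:", round(icc_mz, 3))
print("DZ ICC:", round(icc_dz, 3))
print("MZ ICC - DZ ICC:", round(icc_mz - icc_dz, 3))
```

A positive difference between the MZ and DZ ICC values indicates that MZ co-twins are more similar than DZ co-twins at that locus; on its own, it does not distinguish between the two explanations considered below.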
Unlike MZ twins, DZ twins only share 50% of segregating DNA polymorphisms [28]; thus, any
additional epigenetic variation between DZ twins can potentially be explained by: 1) DNA
sequence effects on epigenetic variation or 2) epigenomic individuality of zygotes. We explored
both hypotheses and determined that it was necessary to systematically document genetic–
epigenetic interactions. We addressed this issue in our third objective by performing an in silico
SNP analysis, as well as conducting an animal study, in which DNA methylation variation was
compared between inbred (genetically identical) and outbred (genetically non-identical) mice.
Although we went on to determine that the epigenetic variation between MZ and DZ twins was
independent of DNA sequence, it was clear that our experimental design limited the amount of
observable sequence-dependent methylation effects. The twin experiment did not allow us to
make conclusions about the portion of methylation that may be controlled by DNA sequence
variants, although this interaction is potentially an important step in the etiopathogenesis of
complex diseases. A large-scale, unbiased mapping of these events has never been
accomplished; thus, our fourth objective was to estimate the percentage of SNPs that
demonstrate ASM, investigate the distribution of these epiSNPs throughout the genome, and
then determine the potential of any given SNP to display ASM effects. Many molecular findings
show some sort of tissue-specificity, and this is especially true for epigenetic effects [29], as
different tissues are subjected to different hormone levels, environmental stressors, and other
xenobiotics, all of which may impact the epigenome to some degree [30, 31]. We examined a
second DNA source, sperm cells from control and BD subjects, to determine if ASM effects
would be present and, if so, how they would compare to effects seen in the brain.
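
To make the notion of an ASM effect at a single SNP concrete, the sketch below compares the fraction of methylated reads between the two alleles of a heterozygous SNP and tests the difference with a two-sided Fisher's exact test. This is a generic, read-count-based illustration and is not the microarray-based procedure used in this thesis; the function names and the read counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def asm_test(meth_a, unmeth_a, meth_b, unmeth_b):
    """Return (methylation difference A - B, p-value) for one heterozygous SNP."""
    reads_a, reads_b = meth_a + unmeth_a, meth_b + unmeth_b
    if reads_a == 0 or reads_b == 0:
        raise ValueError("each allele needs at least one read")
    diff = meth_a / reads_a - meth_b / reads_b
    return diff, fisher_exact_two_sided(meth_a, unmeth_a, meth_b, unmeth_b)


# Hypothetical read counts: allele A mostly methylated, allele B mostly not.
diff, p_value = asm_test(meth_a=18, unmeth_a=2, meth_b=4, unmeth_b=16)
print("methylation difference (A - B):", round(diff, 2))
print("Fisher's exact p-value:", p_value)
```

In a genome-wide setting, p-values obtained in this way would still need to be corrected for multiple testing, for example by controlling the false discovery rate.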
Our lab has previously determined that epigenetic factors are involved in the etiology of
psychosis [32]; however, this study only examined the contribution of epigenetic factors without
considering an interaction with DNA sequence. In this experiment, our fifth objective was to
identify epiSNPs in a subset of individuals affected with psychosis, and then compare them to
those detected in the control set. The over-arching hypothesis of this portion of the study is that
a specific epigenetic state is required for a SNP to be classified as a true risk factor for
psychosis, and that genetic-epigenetic interactions should be considered when searching for
predictive or causative elements associated with complex diseases.
Review of the literature
Epigenetics
Epigenetics refers to the regulation of various genomic functions, including gene expression, that
is brought about by mitotically heritable, but potentially reversible, changes in DNA
methylation and various modifications of histones (acetylation, methylation, phosphorylation,
etc.) [33]. The two epigenetic mechanisms work in concert, with alterations in DNA
modification affecting histone modifications and vice versa. In humans and animals,
methylation of DNA occurs at the C5 position of cytosines (5-mC), primarily within
cytosine/guanine dinucleotides (CpG), which is established and maintained by the DNA-
methyltransferase (DNMT) family of enzymes. DNA is wrapped around octamers of basic
histone proteins (H2A, H2B, H3 and H4), forming higher order nucleosome structures.
Modifications of these proteins, such as acetylation, methylation, phosphorylation, and
ubiquitination, control chromatin states, which can be open (transcriptionally active) or
closed (inactive). Among numerous other histone modification enzymes, histone
acetyltransferases (HATs) acetylate lysine residues on the N-terminal tail of histone proteins.
This neutralizes the positive charge of the protein, decreasing its affinity for DNA and leading to
a looser interaction [34] that creates an open chromatin structure and increases accessibility for
the transcription machinery. In contrast, human histone deacetylases (HDACs) remove acetyl
groups, which results in condensed chromatin and gene inactivation [35]. Proteins with
N-terminal methyl-CpG binding domains (MBD), such as methyl-CpG-binding protein 2
(MeCP2), can bind to methylated sites on DNA and complex with HDACs and the corepressor
Sin3a. This leads to histone deacetylation and the silencing of genes downstream from the
methylated CpG site. The effects of histone methylation depend on the specific lysine or
arginine that is modified, and can also result in either gene activation or repression [36].
In addition to the well-known modified pyrimidine base, 5-mC, a second modified cytosine has
recently been established as an important epigenetic factor. In mammals, 5-
hydroxymethylcytosine (5-hmC) is generated through oxidation of 5-methylcytosine by the ten-
eleven translocation (TET) family of enzymes [37]; the TET1 protein also binds a large number
of the Polycomb group target genes and colocalizes with the SIN3A co-repressor complex,
indicating that it plays a role in regulating transcription and preventing excessive 5-mC
accumulation at CpG-rich sequences [38]. 5-hmC was originally discovered in bacteriophage in
the early 1950s [39], but it was not until 2009 that it was identified in Purkinje neurons and
human stem cells [37, 40]. In human and mouse brains, 5-hmC is surprisingly abundant [40],
although it can occur in any cell type and tends to be enriched in the bodies of highly expressed
genes [41]. Unlike 5-mC, 5-hmC is also enriched in CpG-rich transcription start sites [38],
suggesting that conversion of 5-mC to 5-hmC is a way to reverse the transcriptional repression
that results from methylation [42], although its exact function in the genome remains unknown.
Epigenetic studies of various species – from E. coli and yeast to animals and humans – have
demonstrated that epigenetic regulation is critically important in the normal functioning of
genomes [43-45]. Cells can only operate normally if both the DNA sequence and epigenetic
components of the genome function properly; epigenetically dysregulated genes, despite
impeccable DNA sequences, can be harmful and cause disease [46, 47]. To date, the role of
epigenetic factors has been thoroughly investigated in rare paediatric syndromes [48] and
malignant transformation of cells in cancer [49-51]. More importantly, epigenetics can be highly
relevant to various complex non-Mendelian diseases, as epigenetic mechanisms allow for the
integration of a variety of apparently unrelated clinical, epidemiological, and molecular data into
a new theoretical framework [18].
Twin studies and the separation of genetic and epigenetic factors in disease
Discordance of identical (MZ) twins is one of the hallmarks of complex non-Mendelian disease;
concordance of monozygotic twins reaches only ~15% in breast cancer, 20% in ulcerative colitis,
25-30% in multiple sclerosis, 25-45% in diabetes, 50% in schizophrenia, and 40-70% in Alzheimer’s
disease [52]. The discordance of MZ twins has traditionally been attributed to the differential
effect of environmental factors, which supposedly produce disease in one of the two genetically
predisposed co-twins [53]. Identification of such factors has been very difficult and, so far, only
a limited number of environmental disease risk factors have been identified (e.g. smoking in
lung cancer, diet in cardiovascular diseases) [54, 55].
The epigenetic explanation for MZ twin discordance is that, due to the partial stability of
epigenetic factors, a substantial degree of disease-relevant epigenetic dissimilarity can be
accumulated in genetically identical twins [10, 14]. Epigenetic differences in MZ twins may
reflect differential exposure to a wide variety of environmental factors. For example, intake of
folic acid affects both the global methylation level in the genome and regulation of imprinted
genes [56, 57]. It is also generally accepted that a sufficient level of methyl donor molecules is
necessary for normal mammalian neural tube development [58], and it has recently been
determined that the increased risk of neural tube defects is associated with hypomethylation of
long interspersed nucleotide element-1 (LINE-1) [59]. During pregnancy, maternal dietary
methyl supplements increase DNA methylation and change methylation-dependent epigenetic
phenotypes in mammalian offspring [60, 61]. One important cellular methyl donor, S-adenosyl-
methionine (SAM), has been found to mediate the activity of glutathione transferase (GST),
which is an enzyme involved in toxicant metabolism as well as neuronal stability. A study that
utilized a mouse model of Alzheimer’s disease illustrated the link between SAM and GST levels
– the mice originally had reduced levels of both SAM and GST, but SAM supplementation
restored GST activity, which is a promising development in the field of Alzheimer’s research
[62]. Overall, there could be numerous environmental stressors, including alcohol consumption
[63], asbestos and arsenic exposure [64, 65], and even maternal behaviour [66], that cause some
epigenetic “trace.”
MZ twin study designs are especially suited for the investigation of environmental epigenetics,
because there is no confounding effect from DNA sequence differences [67]. At the time of
commencement of our twin study, a number of studies had detected epigenetic differences between
MZ twins at individual loci. For example, in MZ twins discordant for Beckwith-Wiedemann
syndrome, a methylation difference at KCNQ1OT1 was detected between affected and unaffected
co-twins, representing an imprinting defect [68]. Methylation differences were also found to
occur between MZ twins in the regulatory regions of the catechol-O-methyltransferase [69] and
dopamine D2 receptor [10] genes. Isogenic organisms, such as inbred animals, are also useful
for molecular epigenetic studies, as they have identical genomes [70]. Famously, the inheritance
of an epigenetic modification upstream of the agouti locus was documented in isogenic mice:
variation in the agouti phenotype - which can be visually detected as a fur colour continuum
from yellow to full agouti - was found to be the result of incomplete erasure of an epigenetic
modification that was then inherited through the female germline [71].
As a rule, epigenetic profiles are much more dynamic compared to DNA sequence, and the
epigenetic differences that occur between MZ twins may stem from many causes. DNA
methylation levels, for example, are not rigidly fixed in place and may become altered as a
result of environmental stressors, developmental programs, or even stochastically [67]. Some
mechanisms of stochasticity in epigenetic regulation are well understood. For example, the
enzyme DNA methyltransferase 1 (DNMT1), which acts as a maintenance enzyme and replaces
the methyl group at hemi-methylated sites [72], does not work with 100% accuracy. In mice,
the fidelity of DNMT1 was found to be approximately 95% [73], while other studies have
estimated the value to be 99.85-99.92% [6], although this second estimate was believed to take
into account the contribution of the de novo methyltransferases, DNMT3a and DNMT3b.
Another feature of DNMT1 is its ability to randomly methylate unmethylated cytosines, and this
activity is the main cause of most methylation errors, even in CGIs [6]. It is evident that
epigenetic marks have the potential to be gained or lost at every mitotic replication, and that this
occurrence can have important cumulative downstream effects, making it possible for MZ twins
to differ epigenetically without any specific causal factor.
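
As a rough illustration of how imperfect maintenance fidelity could compound over successive cell divisions, the following Python sketch computes the expected fraction of originally methylated sites that remain methylated after a given number of replications. It is a simplified model rather than a description of the cited experiments: it assumes that every site is maintained independently with the same per-division probability, and that a lost mark is never restored (i.e. no de novo methylation). The function name retained_fraction and the numbers of divisions are hypothetical choices made for this example; only the fidelity values are taken from the estimates quoted above.

```python
def retained_fraction(fidelity: float, divisions: int) -> float:
    """Expected fraction of initially methylated sites still methylated.

    Assumes (for illustration only) that each site is maintained
    independently with probability `fidelity` at every division and that
    a lost methyl group is never replaced by de novo methylation.
    """
    if not 0.0 <= fidelity <= 1.0:
        raise ValueError("fidelity must lie between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    # Independent per-division survival multiplies across divisions.
    return fidelity ** divisions


if __name__ == "__main__":
    # Fidelity values quoted in the text; division counts are arbitrary.
    for fidelity in (0.95, 0.9985, 0.9992):
        for divisions in (10, 50):
            fraction = retained_fraction(fidelity, divisions)
            print(f"fidelity={fidelity:.4f} divisions={divisions:3d} "
                  f"retained={fraction:.3f}")
```

Under these assumptions, even a small per-division error rate accumulates geometrically, which is the sense in which stochastic maintenance errors alone could allow MZ co-twins to drift apart epigenetically.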
The putative role of genetic-epigenetic interactions in complex disease
There are three fundamental points that enable us to consider epigenetic factors as etiological
candidates in complex disease. First, the epigenetic status of genes is more dynamic in
comparison to DNA sequence, and can be altered by developmental programs and the
environment of the organism [66]; furthermore, epigenetic changes may occur even in the
absence of obvious environmental differences, i.e. due to stochastic reasons [5]. Second, some
epigenetic signals can be transmitted along with DNA sequence across the germline generations,
i.e. such signals exhibit partial meiotic stability [27]. Third, epigenetic regulation is critical for
normal genomic function, such as segregation of chromosomes in mitosis, inactivation of
parasitic DNA elements, and regulation of gene activity [74, 75].
Partial epigenetic stability and the primary role of epigenetics in controlling the activities of
DNA sequences can shed a new light on various non-Mendelian irregularities of complex
diseases, such as MZ twin discordance (described above), sexual dimorphism, parent-of-origin
effects, familiality and sporadicity. One of the important peculiarities of complex disease is
sexual dimorphism - differential susceptibility to a disease in males and females. In psychiatric
conditions such as Alzheimer's disease, schizophrenia, alcoholism, and mood and anxiety
disorders, psychopathology exhibits a number of differences between the sexes in rates of illness as
well as the course of illness [76]. It is important to note that sex effects in complex diseases cannot
be explained by sex chromosome-linked genes, and that these effects are also observed on
autosomes [77]. While hormones cannot change DNA sequence, they can be potent modifiers of
epigenetic status, which controls genomic activities; thus, sex effects may be mediated by
hormone-induced epigenetic alterations [77].
In some complex diseases, risk to offspring depends on the sex of the affected parent. For
example, asthma, bipolar disorder, and epilepsy are more often transmitted from the mother, while
type 1 diabetes seems to be more often transmitted from the affected father [52]. Parent-of-origin-
dependent clinical differences have also been detected in schizophrenia [78]. Molecular genetic
studies, although rarely performed in a sex-specific fashion, have discovered parental origin
effects in a wide variety of phenotypes, such as obesity [79], Alzheimer’s disease [80], atopy
and asthma [81], autism [82], autoimmunity [83], and major psychosis [84]. One of the most
common mechanisms of parent-of-origin effects is genomic imprinting [85], where differential
epigenetic modification of genes occurs based on their parental origin, resulting in expression of
genes from only one of the two parental copies [86]. Disruption of the normal imprinting pattern
often causes diseases that affect cell growth, development, and behaviour [87], with severe
disruptions potentially causing recurrent molar pregnancy, miscarriage or infertility [88].
To date, imprinting is the only form of ASM that is moderately well understood, although the search
for new imprinted domains continues.
The epigenetic model of complex disease could be imagined as a chain of aberrant epigenetic
events that begins with a pre-epimutation, a primary epigenetic problem that takes place during
the maturation of the germline; pre-epimutation increases the risk for the disease but is not
necessarily sufficient to cause the disease. The dysregulation can be tolerated to some extent,
and age of disease onset may depend on the effects of tissue differentiation, stochastic factors,
hormones, and probably some external environmental factors (nutrition, infections, medications,
addictions, etc.) [7, 89, 90]. It may take decades to reach a critical threshold, beyond which the
genome, cell, or tissue is no longer able to function normally – this may be the case for many
adult-onset diseases [91] – and only some predisposed individuals will reach the “threshold” of
epigenetic dysregulation and acquire phenotypic changes that meet the diagnostic criteria for a
clinical disorder. Severity of epigenetic dysregulation may fluctuate over time, and in clinical
terms this is known as remission and relapse. In some cases, “aging” epimutations may slowly
regress back to the norm. For example, in psychosis, this is seen as fading psychopathology or
even partial recovery, which is consistent with age-dependent epigenetic changes in the genome
[92]. The same principle applies to other diseases, such as asthma [93] and attention deficit and
hyperactivity syndrome [94]. It should be noted that an epimutation could represent a
sequence-independent change, such as a stochastic gain or loss of DNA methylation, or a
sequence-dependent event, for example, genetic disruption of an imprinted domain.
Although a wider variety of studies are beginning to appear, to date, epigenetic factors in
complex disease have not been intensely investigated, with the exception of cancer. Genes
involved in various cellular pathways may become misregulated, but epigenetic silencing of
tumor suppressor genes, such as the gene encoding the cell cycle inhibitor, cyclin-dependent
kinase inhibitor 2A (p16INK4a), the DNA repair genes, breast cancer type 1 (BRCA1) and MutL
homolog 1 (hMLH1), has been studied the most extensively. Current estimates suggest that the
average tumor will contain approximately 100-400 hypermethylated promoter regions [95].
Global hypomethylation is also observed in cancer cells [96, 97], and it is believed to cause a
decrease in genomic stability and the formation of abnormal chromosome structures. Not
surprisingly, in addition to aberrant DNA methylation changes, histone modification changes
have also been detected in malignant cells [98, 99]. For example, the actions of histone H2A.Z
in cancer cells depend on both its level of acetylation and its location within the promoter region
[100]. Despite the regular occurrence of epigenetic changes in cancer, it is not clear which
epimutations are primary causes of early stage malignant transformation, versus the ones that
simply represent downstream effects of these primary causes [101, 102]. Until we are able to
differentiate between these subtypes, effective etiological treatment of cancer is not possible via
epigenetic approaches.
In addition to cancer, some epigenetic studies of psychiatric diseases have been completed or
are underway. The maintenance DNA methyltransferase, DNMT1, was shown to be
upregulated in GABAergic medium spiny neurons in layers I and II of the cerebral prefrontal
cortex in schizophrenia and bipolar disorder patients. An increase in DNMT1 levels, along with
a decrease in reelin (RELN) and glutamic acid decarboxylase 67 (GAD67), also occurs in
GABAergic medium spiny neurons of the caudate nucleus and putamen in schizophrenia
patients [103]. In autism studies, a substantial proportion of post-mortem brain samples from
autistic individuals revealed monoallelic or highly skewed allelic expression of GABA receptor
subunit genes, while such genes were biallelically expressed in control brain samples [104]. Rett
syndrome, an X-linked neurodevelopmental disorder, has been shown to result from a mutation
in MeCP2, whose protein product represses gene transcription by binding to 5-
methylcytosine residues [105]. Fragile X syndrome has been linked to epigenetic silencing and
loss of expression of the fragile X mental retardation 1 (FMR1) gene, due to expansion of a
CGG repeat in its 5’-untranslated region [106]. Several studies have focused on the epigenetics
of psychosis, and this topic will be discussed in the next section of the review. Although the
underlying epimutations remain unknown in most complex diseases, many epigenetic
therapeutic agents have already been developed. Several of these compounds are progressing
through the clinical trial stage, or have even become approved treatments for particular
conditions.
Genetic and epigenetic studies of major psychosis
Psychiatric diseases place a tremendous burden on affected individuals, their caregivers and the
healthcare system. Although evidence exists for a strong inherited component to many of these
conditions, dedicated efforts to identify DNA sequence-based causes have not been
exceptionally productive, and very few pharmacologic treatment options are clinically available.
Major psychosis is a classification that encompasses both schizophrenia (SZ) and bipolar
disorder (BD) - two conditions that seem to be related etiologically [107]. SZ is a multifactorial
disease characterized by disordered thinking and concentration that results in psychotic thoughts
(delusions and hallucinations), inappropriate emotional responses, erratic behaviour, as well as
social and occupational deterioration [108], while BD represents a category of mood disorders,
in which affected individuals experience episodes of mania or hypomania interspersed with
periods of depression, and may also suffer from delusions and hallucinations.
A variety of theories on the origin of psychosis have been proposed, many of which focus on
disturbances in brain circuitry and neurotransmitter levels. A prevalent opinion is that a genetic
predisposition paired with psychosocial and environmental elements is ultimately responsible,
but identification of any of these factors has been daunting. Popular theories of psychosis have
involved dopamine [109], serotonin [110] and glutamate [111] pathways, and first-line
pharmacological therapies mainly focus on these systems [112]; receptors for all of these
neurotransmitters appear to be dysregulated in the frontal cortex of psychosis subjects [113].
Many social and environmental contributing factors have been suggested, such as obstetric
complications [114], maternal malnutrition [115], hypoxia during neurodevelopment [116], viral
infection [117], identification as an ethnic minority and perception of disadvantage [118],
autoimmune reactions [119], and substance abuse [120]. The wide range of findings that support
different hypotheses combined with the spectrum of phenotypes observed in both diseases
suggest that the underlying causes of SZ and BD vary between individuals and likely involve
multiple neural pathways.
To date, traditional gene- and environment-based approaches have not been very productive in
deciphering the clinical, molecular and epidemiological aspects of psychosis, such as MZ twin
discordance (41-65% for SZ [121], ~60% for BD [122]), sexual dimorphism, parent-of-origin
effects, fluctuating disease course with periods of remission and relapse, and peaks of
susceptibility to the disease that correspond to periods of major hormonal changes in the
organism [90]. Classically, psychosis research was aimed at defining genetic and environmental
risk factors, but despite significant evidence of a heritable component derived from twin and
adoption studies [123, 124], many molecular genetics findings have not been replicated, and
significant heterogeneity and small effect sizes are thought to plague genetic association studies
[125]. A large study examined 789 SNPs within 14 top candidate genes in 1,870 SZ cases and
2,002 controls and found that the association results for all SNPs previously reported as associated with SZ were
consistent with chance expectation, and four other previously identified SNPs were not
significantly associated with the disease [126].
More recent GWAS have also provided some disappointing results. A 2011 study of BD
included 1000 cases and 1034 controls, and utilized the Affymetrix SNP 6.0 platform to search
for genetic risk factors in each subset of the disorder. Only two SNPs reached significance – one
in the vicinity of the gene phosphodiesterase 10A (PDE10A) and another located between
contactin-4 precursor (BIG-2) and contactin 6 (CNTN6) [127]. Another study from Spain
examined the genomes of 476 SZ patients and 447 controls with the aim of studying only non-
synonymous SNPs to increase the probability of finding functional risk factors. One SNP
located at the metal ion transporter gene, SLC39A8, was found to be significant, although it is
rare in non-European populations [128]. These are just a few examples of purely genetic
approaches that have failed to explain a substantial portion of the heritable element of psychosis,
and it has been noted that all GWAS of SZ performed to date have found that the most
significant genetic risk factors do not have odds ratios (OR) greater than 1.15–1.20. A German
study managed to find a region on chromosome 11 (containing the candidate genes AMBRA1,
DGKZ, CHRM4 and MDK) that had an OR of 1.25 and was significantly associated with SZ in a
sample of 1169 cases and 3714 controls; however, when the sample was expanded to include an
additional 2569 cases and 4088 controls, the OR dropped to 1.11 [129]. On the topic of GWAS
of psychosis, one group of reviewers has recently stated that, “The validation of any genetic
signal is likely confounded by genetic and phenotypic heterogeneities which are influenced by
epistatic, epigenetic and gene-environment interactions.” They go on to highlight the
importance of integrating multiple platforms in order to better understand the biological basis of
these diseases [130].
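
To make the scale of these effect sizes concrete, the sketch below shows one conventional way of computing an allelic odds ratio and an approximate 95% confidence interval from a 2x2 table of allele counts. It is offered only as an illustration: the Wald interval on the log scale is one common choice and is not claimed to be the method used in the cited studies, the function name odds_ratio is hypothetical, and the allele counts in the usage example are invented for demonstration and do not come from any of the studies discussed.

```python
import math


def odds_ratio(case_risk: int, case_other: int,
               control_risk: int, control_other: int):
    """Return (OR, lower, upper) for a 2x2 table of allele counts.

    The interval is an approximate 95% Wald interval computed on the
    log-odds scale; all four counts must be positive.
    """
    counts = (case_risk, case_other, control_risk, control_other)
    if min(counts) <= 0:
        raise ValueError("all four counts must be positive")
    estimate = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of log(OR) from the reciprocal cell counts.
    std_err = math.sqrt(sum(1.0 / count for count in counts))
    z = 1.959963984540054  # two-sided 95% normal quantile
    lower = math.exp(math.log(estimate) - z * std_err)
    upper = math.exp(math.log(estimate) + z * std_err)
    return estimate, lower, upper


if __name__ == "__main__":
    # Invented allele counts, chosen only to give an OR near 1.2.
    estimate, lower, upper = odds_ratio(460, 1540, 400, 1600)
    print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With odds ratios this close to 1, the confidence interval in a sample of modest size can easily approach or include 1, which is consistent with the difficulty of replication described above.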
Recently, the first epigenomic study of major psychosis utilizing CpG-island microarrays was
released by Mill et al (2008), providing a large-scale overview of DNA methylation differences
in the brain associated with SZ and BD. DNA extracted from the frontal cortex (n=35 each for
SZ, BD and control) was subjected to enrichment of the unmethylated fraction using
methylation-sensitive restriction enzymes, and adaptor ligation coupled with PCR amplification.
The amplicons (multiple copies of the unmethylated genomic DNA) were interrogated on
12,192 feature CpG-island microarrays. The data was normalized, assigned raw p values based
on a t statistic, and then converted to false discovery rates (FDR). In cortex, they
discovered differences at loci involved in glutamatergic and GABAergic neurotransmission,
brain development, mitochondrial function, stress response, and other disease-related functions,
many of which correspond to psychosis-related changes in steady-state mRNA. Network and
gene ontology (GO) analyses were performed in order to determine relationships between the
functionally linked pathways from the microarray dataset. The network analysis revealed a
lower degree of modularity of DNA methylation “nodes” in the major psychosis samples,
indicating that there is some degree of systemic epigenetic dysregulation involved in the
disorder. From the GO analysis, several categories were highlighted, including those involved in
epigenetic processes, transcription, and development, as well as those related to brain development
in female BD and SZ samples and to stress response in male BD samples [32]. The data
presented in that study supports the idea that epigenetic mechanisms underlie the broader hypotheses of
major psychosis, and the study uncovers some new avenues for future exploration.
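
The conversion of raw p values to false discovery rates mentioned above can be sketched as follows. The text does not state which FDR procedure was applied, so the Benjamini-Hochberg step-up adjustment is used here purely as an assumption, being a widely used default for microarray data. The function name benjamini_hochberg and the example p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Each adjusted value is the smallest FDR level at which the
    corresponding test would be declared significant.
    """
    if any(p < 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p values must lie between 0 and 1")
    total = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(total), key=lambda i: p_values[i])
    adjusted = [0.0] * total
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(total, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * total / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p values for four array features.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p={p:.3f} adjusted={q:.3f}")
```

Features whose adjusted value falls below a chosen threshold would then be reported as differentially methylated at that false discovery rate.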
A second epigenomic study of psychosis has since been performed by Dempster et al, using the
Illumina Infinium HumanMethylation27 BeadChip platform to compare methylation levels
between co-twins in a sample set comprised of 22 MZ twin pairs discordant for either SZ or BD.
The DNA source for the original experiment was whole blood, and the results were validated
using the Sequenom EpiTYPER platform, and then tested separately on 45 post-mortem brain
samples from cases and controls. Methylation levels differed between co-twins at numerous
loci, and there was significant heterogeneity between twin pairs, but this is understandable given
the clinical differences observed between cases of psychosis. The top differentially methylated
site across all MZ pairs was within the promoter of the gene encoding alpha-N-acetylgalactos-
aminide alpha-2,6-sialyltransferase 1 (ST6GALNAC1), which was unmethylated in affected
subjects; this gene is involved in protein glycosylation and cell–cell interactions, and it is
differentially regulated during neurodevelopment. A pathway analysis revealed an enrichment
of epigenetic changes in biological networks that were relevant to psychiatric disorders and
neurodevelopment, such as “nervous system development and function” in the SZ group, and
“developmental, genetic and neurological disorder” in the BD group. It was interesting to note
that CpG sites located within CpG islands were under-represented among the 100 top-ranked
psychosis-associated, differentially methylated sites [131]. In the past, it has been common for
methylation studies to focus on promoter regions, so it is not surprising that many of the top loci
discovered here have not been previously identified, and this underscores the need to investigate
the genome without bias when searching for epigenetic effects.
Both SZ and BD have also been examined using the candidate gene approach in an epigenetic
context, as epigenetic down-regulation of genes is emerging as a possible underlying
mechanism of the GABAergic neuronal dysfunction in SZ. One of the more intensively
investigated SZ-related genes is RELN, which is involved in neuronal development and cell
signalling, and has been found to be hypermethylated in cases of SZ [132]. However, no
differences were observed at this locus in replication attempts [32, 133], and the focus seems to
be shifting to other candidate genes, namely the 67 kDa glutamate decarboxylase (GAD67, a.k.a.
GAD1) and DNMT1. GAD67 catalyzes the conversion of glutamic acid to GABA. In cases of
SZ, this enzyme and several other proteins involved in GABAergic neurotransmission,
such as GAD65 and GABA plasma membrane transporter-1 (GAT-1), display decreased mRNA
levels, as determined by real-time quantitative polymerase chain reaction (qPCR) and in situ
hybridization [134-137]. In addition to aberrant methylation at this locus, an analysis of the
microarray collection of the National Brain Databank (USA) has shown that decreased GAD67
mRNA levels strongly correlated with upregulated HDAC1 in the prefrontal cortices of SZ
subjects [138]. Oddly enough, at the GAD67 promoter, SZ patients have been shown to display
an approximately 8-fold deficit in repressive chromatin-associated DNA methylation [137].
Currently, the general opinion on SZ seems to be that disturbances in the cortico-striato-pallido-
thalamic circuitry and in early brain maturation can result in a loss of cells and normal
connectivity in a wide variety of brain regions. This theory is consistent with the epigenetic
model of complex disease, and it is likely that genetic and epigenetic factors are disrupted at
many different loci, with each affected individual displaying a unique profile [139, 140].
Less information is available on BD, possibly because of the large degree of overlap between
BD-related genes and those associated with other mental disorders; genomic imprinting has
been suggested by statistical genetics, but molecular approaches have not yielded any imprinted
disease genes [141]. A recent study applied methylation-sensitive representational difference
analysis (MS-RDA) to lymphoblastoid cells derived from twins discordant for BD [11]. One
detected gene, peptidylprolyl isomerase E-like (PPIEL), was unmethylated in BD affected
twins, while a region of the spermine synthase (SMS) gene was hypermethylated versus
unaffected twins; it has yet to be determined whether either of these regions is biologically and
functionally significant. An analysis by Kaminsky et al (2011) mapped DNA methylation
differences at the human leukocyte antigen (HLA) complex group 9 gene (HCG9) using post-
mortem brains, peripheral blood cells and germline from BD subjects and controls, and found
consistent epigenetic differences at this locus in all tissues studied. Two brain tissue cohorts
exhibited lower DNA methylation in BD patients versus controls at an extended HCG9 region,
and sperm DNA had a significant association with BD at one of the regions that displayed
epigenetic changes in brain and blood; thus, the HCG9 locus appears to have a causal
association with BD [142].
Copy number variants (CNVs) – the occurrence of abnormal numbers of copies of a given gene
or region of DNA, including duplications and deletions that can range in size from 1 Kb to
several megabases [143] – have been implicated in many complex diseases, including major
psychosis [144]. It has been demonstrated that CNVs play a critical role in human evolution and
genetic diversity, and it is estimated that CNVs make up ~12% of the human genome [143],
with around 0.4% of the genome differing between unrelated individuals with respect to copy
number [145]. Diseases such as SZ show a large degree of phenotypic heterogeneity, and it is
becoming apparent that a small percentage of SZ patients carry a number of specific CNVs
[146]. It has been reported that the overall genome-wide CNV burden does not differ between
SZ and unaffected subjects [146], although a significant increase in singleton deletions has been
observed in SZ and BD subjects versus controls, and very large CNVs (> 500 Kb) have also
shown enrichment in SZ subjects [147]. The well-known 15q11.2-q13.1 duplication associated
with autism has also been associated with SZ [148], while several large CNVs have been found
to increase the risk for SZ and a number of other disorders, such as autism, attention-deficit
hyperactivity disorder, learning difficulties and epilepsy – these CNVs are not enriched in
subjects with non-psychiatric diseases [149]. As more is learned about the nature of psychiatric
diseases, it seems that rare variants contribute significantly to their etiology; while the presence
of certain alleles can obviously influence phenotype, expressivity can be quite variable, resulting
in a spectrum of outcomes. Future studies will have to take into consideration the highly
individual molecular basis of complex disease [150].
In combined studies of epigenetics and DNA sequence, some interesting developments have
been observed. It has been shown that rare G variants of a G/A polymorphism in the potassium
chloride co-transporter 3 gene (SLC12A6) may represent risk factors for BD [151]. Eventually,
it was discovered that variants containing the G allele were methylated at the adjacent cytosine,
and this accompanied a decrease in gene expression in human lymphocytes [152]. This hints at
a functional link between epigenetics and genetic variation, and the association with BD is
believable, as SLC12A6 mutations underlie another psychiatric disorder, Andermann syndrome,
which is an autosomal recessive motor-sensory neuropathy associated with developmental and
neurodegenerative defects [153]. Unfortunately, studies that consider both genetics and
epigenetics (even smaller, targeted ones) are incredibly rare, and, to date, no one has explored the
genetic-epigenetic interactions associated with psychosis.
ASM: relevance to studies of complex disease
Allele-specific methylation refers to DNA methylation that is present on only one of the two
alleles that exist in a cell. ASM can arise from several causes, such as genomic imprinting, X
chromosome inactivation, stochastic methylation of a single allele, or as a direct result of DNA
sequence variants. Genomic imprinting, in which the inactive imprinted allele is significantly
more methylated than the actively expressed allele, has been intensely studied and a large
candidate list of imprinted genes is available [154], although most still require validation and are
believed to only represent a fraction of the total number. X inactivation, a large-scale case of
ASM, occurs as one copy of the two X chromosomes in a mammalian female is methylated,
thereby silenced, and packaged into heterochromatin [155]. ASM may also arise stochastically,
appearing in all or many cells of an organism if the methylation event occurs at a very early
stage of development [71, 156], or only in select tissues if the event is postnatal or the result of
some environmental factor, such as smoking or diet [157].
Recently, evidence for DNA sequence-influenced ASM has been building, and it has proven to
be an area of extreme interest. The first suggestion of this phenomenon appeared in 2002, when
Yan et al discovered allele-specific expression (ASE) occurring at a small subset of SNPs,
although the mechanism of action was unknown at the time [20]. Several years later, Kerkel et
al examined a collection of tissues, including WBC, brain, buccal cells, lung, kidney and
placenta, and made the first estimate of sequence-dependent ASM. Using methylation-sensitive
SNP analysis, they surveyed the genomes of 12 and 5 individuals at 50K and 250K resolution,
respectively, and determined that at least 0.16% of the informative SNP-tagged loci queried
showed ASM [19]. Several other studies followed, and a range of ASM estimates were
presented. A recent study by Schalkwyk et al, in which blood DNA from five pairs of MZ twins
was interrogated on Affymetrix SNP 6.0 microarrays, stated that 1.5% of their 183,605 SNPs
displayed ASM, and approximately 90% of the ASM was cis in nature. These results were
validated with bisulfite-mapping and gene-expression analyses, and then subsequently tested in
a second tissue from the same individuals and replicated in DNA obtained from 30 parent-child
trios [21]. In contrast, some very high estimates have also been proposed: 10% by Zhang et al
[158], 10% by Hellman and Chess [159], and a staggering 23-37% was suggested by Shoemaker
et al [160]. It should be noted that Hellman and Chess' finding was entirely based on in silico
simulations, and the percent provided by Shoemaker et al was an estimate based on the findings
in a few thousand isolated regions, only in pluripotent cell lines. Another study that used three
human embryonic stem cell (HESC) lines estimated that 14% of all CG sites will show ASM,
and they also identified 1,020 genes that show ASE, but again, these were cell lines and HESC
in particular have significantly higher non-CG methylation than differentiated cells [161].
Although it is not entirely understood how ASM might exert an effect, one somewhat obvious
explanation involves ASE, where the presence or absence of a particular allele is required for
expression of a given gene. The paper by Kerkel et al confirmed ASM at 16 SNP-tagged loci,
and then identified two cases of ASE at the vanin and CYP2A6-CYP2A7 gene clusters [19]. Ten
cases of SNP-methylation-expression three-way associations were detected by Zhang et al
[162]. Schalkwyk et al reported that 16.3% of the possible SNP-expression associations (a SNP
located within 5Kb of a gene expressed at detectable levels in blood) provided evidence for a
significant linear association between the allelic variant present at an ASM-SNP and mRNA
level, confirming that ASM effects often correlate with allelic expression differences, and that
these are likely to be cis in nature [21]. Other hypothetical mechanisms of ASM action, such as
interference with gene splicing, protein binding, micro RNA (miRNA) binding and RNA
structural alterations, are summarized in Figure 1.1. As some SNPs have been found to operate
in these ways [163-171], local methylation differences surrounding the SNP could potentially
act to increase any of these actions.
Figure 1.1. Hypothetical mechanisms of epiSNP action
The basic structure of a gene is presented, with letters representing potential locations of epiSNPs. A) intergenic
epiSNPs may interact with RNA genes or ncRNA themselves [164], or they could disrupt transcription factor
binding sites. B) promoter epiSNPs may interfere with the binding of numerous proteins, such as transcription
factors, RNA polymerases, activators, repressors and any other protein whose upstream binding can influence
transcription. C) epiSNPs in the 5’UTR may act as riboSNitches, which affect the shape of the mRNA transcript
[170]. D) introns have been shown to affect splicing [167], occasionally encode proteins or ncRNA [169, 172], and
they may also act as transposons [168], so an intronic epiSNP may alter any of these activities. E) exonic epiSNPs
may directly affect the proper transcription of exonic sequences [163]. F) epiSNPs in the 3’UTR may interfere with
the binding of many miRNAs [171], which may have consequences for polyA signal and stability of the transcript
[166].
Experimental design is a critical element to consider when investigating ASM, as DNA
methylation is tissue- [173], developmentally- [174], and temporally-specific [175]; thus, in
addition to the inclusion of a large number of samples, studies should ideally utilize complex
genome-scanning tools, such as microarrays, and interrogate as many SNPs as possible without
bias to any particular region of the genome. The aforementioned studies did not fully satisfy all
of these requirements, as many relied upon very small sample sets, older arrays that interrogated
a small number of SNPs, and in many cases they tended to focus only on areas that had been
previously identified in other studies, despite the fact that recent evidence suggests that ASM
studies should not exclusively consider core promoters, CpG islands (CGIs) and imprinted
differentially methylated regions (iDMRs); a 2009 study found that most methylation alterations
in colon cancer are not localized in promoter regions or CGIs, but in 'CpG island shores,' which
are sequences up to 2Kb away [176]. While ASM is believed to predominantly act in cis, trans
effects (where a SNP is correlated with methylation at a site several megabases away, or even
on a different chromosome altogether) have been reported in a few instances [21, 162].
As previously mentioned, the SNPs detected by GWAS do not account for all of the
heritability associated with complex diseases, especially psychiatric ones. Integration of the
epigenetic aspect, to form Epigenome-Wide Association Studies (EWAS), is a promising
method to pinpoint the truly causative variants that differ only in methylation status from those
that are non-causative. These SNPs displaying ASM, or "epi-alleles," may also act as risk
factors or predictors of disease type, treatment outcome or disease course, and would be of great
value to pre-screening applications and general diagnostics. With the International Human
Epigenome Consortium (IHEC) underway [177], which attempts to map 1000 reference
epigenomes for various human tissues and cell types, our ability to conduct EWAS will be
significantly improved in the near future, and as array technology continues to advance, large-
scale EWAS for complex disease may become an attractive option.
Currently, ASM has only been studied in a small number of complex diseases, mainly various
cancers, but the findings are quite intriguing. Milani et al discovered that 16% of the genes they
analyzed displayed ASE in multiple acute lymphoblastic leukemia (ALL) cell samples, with the
level of ASE varying largely between the samples. Of these genes exhibiting ASE, 55%
displayed what the authors called “bidirectional” ASE, in which either of the two SNP alleles
could become the one that was overexpressed. ASE and ASM are not the same process,
although they can be related, but in this particular experiment, the bidirectional ASE correlated
with methylation level at the site [22]. A more direct finding was made by Hawkins et al, who
found that the T allele at SNP rs16906252 is a key determinant in the onset of O(6)-
methylguanine DNA methyltransferase (MGMT) methylation in colorectal cancer; MGMT is a
DNA repair protein that restores mutated guanine, and its methylation is often detected in
sporadic colorectal cancer [23]. It has been suggested that screening for methylation of the T
allele at this SNP in the peripheral blood of unaffected individuals could identify those
predisposed to colorectal cancer, lung cancer, lymphoma, and glioblastoma [24]. In lung
adenocarcinomas and sputum samples from smokers, another study has found that the A allele
of an MGMT promoter-enhancer SNP is a key determinant for MGMT methylation in lung
carcinogenesis, as this allele was selectively methylated in primary lung tumors and cell lines
heterozygous at that SNP [178].
Outside of cancer, very few complex diseases have been studied in the context of ASM. The
vitamin D receptor (VDR) gene encodes a transcription factor that modulates several processes,
such as calcium homeostasis and immune function. Large differences in allele frequency
between populations have been observed at the VDR, and it has previously been associated with
susceptibility to tuberculosis and autoimmunity. In tuberculosis cases and controls, as well as
lymphoblastoid cell lines from two ethnically distinct populations (Yoruba and Caucasian), it
was found that there were methylation-variable positions in the 3' end of VDR that significantly
distinguish ethnicity and tuberculosis status. It was also shown that methylation status
demonstrated a complex association with a VDR SNP known as TaqI (rs731236), with several
local CpG sites showing disease- and ethnicity-specific methylation; thus, it is recommended
that epigenetic and genetic factors should be investigated together in the case of VDR-associated
disease [179]. In a study of obesity, the melanin-concentrating hormone receptor 1 (MCHR1),
which regulates energy balance, food intake, physical activity and body weight in humans and
rodents, was found to have ASM at two SNPs in its first exon that was age-dependent, BMI-
associated and that also affects transcription [180]. Recently, DNA methylation was examined
in 60 females stratified by type 2 diabetes (T2D) susceptibility haplotype, using previously
identified association loci. After noticing increased DNA methylation on the alpha-
ketoglutarate-dependent dioxygenase (FTO) obesity susceptibility haplotype, it was then
determined that the methylation difference was due to the co-ordinated phase of CpG-creating
SNPs across the risk haplotype. Essentially, they had found a 7.7Kb example of haplotype-
specific methylation that can act as a long-range enhancer, supported by the histone H3K4me1
enhancer signature [181]. One message to take away from this last study is that genetic and
epigenetic mechanisms can be intertwined, and that diseases may be caused by their combined
actions, in ways that we have not yet envisioned. The vast majority of complex diseases have
not been examined from an ASM-perspective, especially one that utilizes EWAS, mainly
because the technology for such an endeavor simply was not available in the past.
Emergence of epigenetic treatments
Epigenetic drug strategies are currently employed to treat a collection of cancer subtypes, and
these medications are now being considered in the treatment of psychiatric disease, as well. The
DNMT inhibitor, Doxorubicin, has been used to increase reelin and GAD67 expression in
neuronal precursor cells, and it was shown that reelin gene expression correlated with the
dissociation of DNMT1 and MeCP2 from its promoter, as well as an increased level of histone
H3 acetylation [182]. Other studies have shown that HDAC inhibition enhances learning and
memory following neurodegeneration induced by traumatic brain injury [183], and also shows
some therapeutic efficacy in rodent models of neurodegenerative conditions, such as
Huntington’s disease [184], multiple sclerosis [185], and Parkinson’s disease [186]. One of the
downstream effects of HDAC inhibition is upregulation of p21 [187], a cyclin-dependent kinase
inhibitor that appears to play an important protective role against oxidative stress and DNA
damage [188]. Valproate, a compound utilized for its anticonvulsant and mood stabilizing
properties, also exhibits HDAC-inhibitory activity and has been successfully implemented as a
treatment for epilepsy [189], BD [190] and, less commonly, SZ [191]. As with valproate, several
other drugs have been found to have previously unknown epigenetic modifying properties, and the
list continues to grow. While such medications are promising, their pleiotropy, transient effects,
and non-specific alterations to the entire epigenome limit them for the time being.
The studies presented below include a detailed analysis of sequence-dependent and sequence-
independent DNA methylation in MZ and DZ twins, plus an unbiased, large-scale evaluation of
ASM in psychosis cases and controls that avoids the shortcomings of previous studies and
utilizes the Affymetrix SNP 6.0 platform in a novel manner. Our findings highlight the variable
influence of genetics on epigenetics, as well as the importance of genetic-epigenetic interactions
in both normal and pathological phenotypes. We stress that an epigenetic element must be added
to genetic studies in order to fully understand the molecular functions of the genome, and that
epigenetic drug therapies are promising options for the treatment of complex diseases.
Chapter 2
Materials and Methods
Contributions: DNA dependent and independent DNA methylation in twins
I was responsible for the animal experiment, including DNA extractions, restriction enzyme
digestions, adaptor ligations, PCR amplifications, microarray preparation, hybridization and
scanning, plus some basic analysis. I also performed the bulk of the pyrosequencing and
cloning validation experiments for the human microarray data, a portion of the bisulfite
modification, and I was involved in the writing and editing of the manuscript.
All human microarray laboratory experiments, experimental design, the majority of the analysis
and writing of the paper were performed by Zach Kaminsky. Carl Virtanen created the
karyograms. The Gene Ontology analysis script in Bioconductor was provided by Thomas Tang.
Consultation with bioinformaticians Thomas Tang, Sun-Chong Wang, and Allan McRae helped
to direct the analyses performed. Animal sacrifice was performed by Laura Feldcamp under the
direction of Albert Wong. A portion of the bisulfite modification, pyrosequencing and cloning
was performed by Zach Kaminsky, Gabriel Oh, and Sigrid Ziegler.
Twin sample
We investigated three cohorts of twins representing various tissues. WBC of 19 dichorionic
(DC) MZ and 20 DZ twin pairs matched for age, sex and WBC count plus buccal epithelial cells
from 10 monochorionic (MC) MZ, 10 DC MZ, and 20 DZ age- and sex-matched twin pairs
were obtained from the Brisbane Adolescent Twin Study [192]. WBCs and buccal cells were
obtained from the same individual for 10 DC MZ and 10 DZ pairs. WBC samples were from
twins 13.2 ± 1 y old (mean ± s.d.) and consisted of 20 females and 18 males. MC and DC buccal
epithelial cells both consisted of 10 males (aged 14 ± 0.77 y) and 10 females (all 14 y old); all
were of European ancestry (mainly northern European ancestry). MZ and DZ twins in the WBC
group were selected from several thousand sets of twins of the Australian Twin Registry using
hematology report data. The percentage difference between cell subfraction counts for the whole
WBC count, neutrophil and lymphocyte counts did not exceed 10%. The mean percentage
difference in selected DZ twins was smaller than that of MZ twins to bias against the alternative
hypothesis of more epigenetic variation in the DZ twin group. We determined zygosity by
comparisons of nine microsatellite markers, which gave a probability of incorrect assignment of
a DZ as an MZ of less than 0.0001. Gut biopsies from 18 pairs of MZ twins were obtained from
a Swedish twin population with inflammatory bowel disease described previously [193].
Although all twin pairs had at least one twin affected with inflammatory bowel disease, we
investigated biopsies from rectal mucosa, which were macroscopically not inflamed in any of
the twins investigated. Written informed consent was obtained from all participants, and studies
were approved by the local institutional review boards at participating institutions. The
workflow for the twin study is presented in Figure 2.1.
Figure 2.1. Twin study workflow
Human WBC, buccal epithelium cells and gut biopsies, plus whole mouse brain samples were processed
identically. Several downstream analyses were conducted, which differed between cohorts.
DNA methylation profiling
The unmethylated fraction of genomic DNA was enriched using the methylation-sensitive
restriction enzyme (MSRE) HpaII [194] and interrogated on Human 12K CpG island
microarrays [195]. Enrichment of the unmethylated genome of MZ and DZ twin pairs and
hybridization to the microarrays were carried out in a randomized fashion. We performed two
technical replicates of each enrichment and hybridization, then averaged the log ratios across
replicates to produce one value per individual per locus. All samples were hybridized
against a common reference (reference 1) with the exception of 9 MZ and 10 DZ pairs in WBC,
which were originally hybridized against a different common reference (reference 2) and later
transformed to match reference pattern 1. Transformation was achieved by first obtaining a spot-
wise log ratio of reference 2 relative to reference 1 through a comparison of two dye-swapped
reference 1 versus reference 2 hybridizations. Log ratios from the 9 MZ and 10 DZ pairs
originally hybridized with reference 2 were multiplied by the log ratio values of reference 1
versus reference 2 to obtain log ratio values relative to reference 1. This transformation was
followed by between-array normalization using the Limma package in Bioconductor. We
created the reference pools by addition of equal quantities of the enriched unmethylated WBC
DNA fraction from 10 MZ and 10 DZ pairs.
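The re-referencing step can be illustrated with a minimal Python sketch. It is not the code used in the study. It assumes that all values are log2 ratios, so that combining (sample/reference 2) with (reference 2/reference 1) on the linear scale corresponds to adding the two log ratios, and that a dye-swapped hybridization reports the same log ratio with the opposite sign. The function names are hypothetical.

```python
def dye_swap_average(forward, swapped):
    """Average a hybridization with its dye swap.

    Assumption: the swapped array measures the same log ratio with the
    opposite sign, so the swap is subtracted before halving.
    """
    return [(f - s) / 2.0 for f, s in zip(forward, swapped)]


def rereference(sample_vs_ref2, ref2_vs_ref1):
    """Re-express log ratios measured against reference 2 relative to reference 1.

    Assumption: values are log2 ratios, so the linear-scale product
    (sample/ref2) * (ref2/ref1) becomes a sum on the log scale.
    """
    if len(sample_vs_ref2) != len(ref2_vs_ref1):
        raise ValueError("both inputs must cover the same spots")
    return [s + r for s, r in zip(sample_vs_ref2, ref2_vs_ref1)]


# Spot-wise log2(reference 2 / reference 1) from two dye-swapped hybridizations
ref2_vs_ref1 = dye_swap_average([1.0, -0.5], [-1.0, 0.5])
# Log ratios of one sample originally hybridized against reference 2
transformed = rereference([0.5, 0.5], ref2_vs_ref1)
```

Between-array normalization would still follow this step and is not shown here.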
Animal studies
We extracted genomic DNA using standard phenol and chloroform methods from whole-brain
tissue of four strains of mice: C57BL/6 and FVB inbred strains and CF-1–1 and CD-1 outbred
strains, all obtained from Charles River Laboratories International. Three litters consisting of
three male mice per litter were kept in uniform environments and killed at postnatal day 43. We
enriched the unmethylated fraction of genomic DNA and created the common reference pool in
an identical manner to the human reference design studies. The microarrays used were mouse
4.6K CpG island microarrays, all produced during a single printing at the microarray facility of
the University Health Network, Toronto. Hybridizations were carried out in batches of 18
microarrays consisting of one amplification set from one inbred and one outbred strain per day
for a total of four hybridization days. We determined selection and order of hybridization at
random through sorting on a random number generator.
Data analysis
All microarrays were scanned on the Axon 4000A scanner and cross-referenced to annotated
GAL files using Genepix 6.0 Software. Microarray GAL annotation was made available from
the manufacturer and downloaded at www.microarrays.ca. We carried out normalization
procedures in Bioconductor using the Limma package. All arrays underwent log ratio-based
normalization, background correction, print tip loess normalization and scale normalization
between blocks. We removed low quality flagged loci identified by Genepix. Microarray data
were trimmed on the basis of the annotation information such that spot IDs containing
mitochondrial DNA, translocation hot spots and repetitive elements, and those located on the X
and Y chromosomes were removed. After trimming and removal of flagged loci, 6,405 (WBC),
5,918 (buccal cells), and 5,941 (gut biopsies) unique DNA sequences in humans and 2,176 DNA
sequences in mice were used for subsequent statistical analyses.
All statistical tests were done in R (http://www.r-project.org/). Using an Anderson-Darling test
from the nortest package, we found that all distributions derived from microarray data rejected
the null hypothesis of normality, and we subsequently evaluated them with non-parametric tests.
All statistical tests were two-tailed, and P < 0.05 was considered significant. Unless
otherwise specified, ± denotes the standard error of the mean.
Test for association of epigenetic difference with cellular heterogeneity
WBC counts were available for all twin blood samples, allowing us to investigate any
association between pairwise co-twin differences in cell counts and the fold difference in DNA
methylation at each locus. A spot-wise correlation between the difference in log fold change value
per twin pair and the log2 of the ratio of the cell count per twin pair was calculated with the
Spearman method and subjected to correction for multiple testing using the qvalue package
[196]. Three separate comparisons were performed, one for each of the cell fractions
accounting for the highest proportion of cells: the whole white blood cell count, total
neutrophil count, and total lymphocyte count.
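For one locus, the correlation described above can be sketched as follows. This is an illustrative Python version using only the standard library, not the R code used in the study; the function names are hypothetical, and the multiple-testing correction with the qvalue package is not reproduced.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; inputs must not be constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def spot_correlation(meth_twin1, meth_twin2, count_twin1, count_twin2):
    """Correlate co-twin log-ratio differences with log2 cell-count ratios.

    Each list holds one value per twin pair for a single locus.
    """
    meth_diff = [a - b for a, b in zip(meth_twin1, meth_twin2)]
    count_ratio = [math.log2(a / b) for a, b in zip(count_twin1, count_twin2)]
    return spearman(meth_diff, count_ratio)
```

Applying `spot_correlation` to every locus would yield one coefficient per spot, to which a multiple-testing correction could then be applied.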
Biological and technical variation
Levels of biological variation and technical variation for individual twin sets produced by twin
versus co-twin methylation profile comparisons and self versus self methylation profile
comparisons, respectively, were measured according to the variance (σ²) over all ~6,000 loci.
Non-parametric comparisons between matched biological and technical variation for all sets
were carried out by the Ansari-Bradley test. Differences between the degrees of biological and
technical variation in 4 MZ twin sets were evaluated with the Kruskal-Wallis test. Technical
variation produced by MspI-based DNA enrichment was tested by 4 self versus self
hybridizations and compared to HpaII technical variation levels by the Ansari-Bradley test. For
the common reference design data, we addressed the null hypothesis that the difference between
co-twins was not significantly larger than that between replicate hybridizations. For each tissue,
the median absolute value of the fold change difference between the two technical replicate
enrichments/hybridizations performed per individual was determined and compared to that
between co-twin hybridizations with a paired Wilcoxon Signed Rank test for MZ twins.
For animal data, assessment of technical variation was performed in the following way. For all
mice, a spot-wise correlation between replicate hybridizations was produced at 2,176 unique
genomic regions. To ensure that biological variation was detectably higher than technical
variation, a Monte Carlo procedure was performed to test the null hypothesis that the spot-wise
correlation between technical replicates would be no higher than that produced from the random
pairing of biological replicates from different mice. A simulated distribution was created by
randomly shuffling the replicates and re-calculating a spot-wise correlation distribution for
10,000 permutations. For each permutation, the original distribution of technical replicate
correlations was compared to each randomly created distribution with a paired Wilcoxon Signed
Rank test. The proportion of times the correlation distribution of original technical replicates
was higher than the randomly sorted distribution was tabulated and divided by the total number
of permutations to obtain the quantile and relative P value.
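The permutation procedure can be sketched in Python as follows. This is a simplified illustration rather than the analysis code used in the study: it compares the medians of the spot-wise correlation distributions instead of applying a paired Wilcoxon Signed Rank test at each permutation, and the function names, the default number of permutations and the seed are assumptions made for the example.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation; inputs must not be constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spotwise_correlations(rep1, rep2):
    """rep1[m][s] and rep2[m][s] hold replicate values for mouse m at spot s.

    Returns one correlation per spot, computed across mice.
    """
    n_spots = len(rep1[0])
    return [
        pearson([row[s] for row in rep1], [row[s] for row in rep2])
        for s in range(n_spots)
    ]


def monte_carlo(rep1, rep2, n_perm=1000, seed=0):
    """Return (proportion, p) for technical versus randomly paired replicates.

    proportion: fraction of permutations in which the median correlation of
    the true replicate pairs exceeds that of the shuffled pairs.
    p: 1 - proportion.
    """
    rng = random.Random(seed)
    observed = statistics.median(spotwise_correlations(rep1, rep2))
    higher = 0
    for _ in range(n_perm):
        shuffled = rep2[:]
        rng.shuffle(shuffled)  # pair each first replicate with a random mouse
        null = statistics.median(spotwise_correlations(rep1, shuffled))
        if observed > null:
            higher += 1
    proportion = higher / n_perm
    return proportion, 1.0 - proportion
```

A small P value under this sketch indicates that replicate hybridizations of the same mouse agree more closely than hybridizations paired at random across mice.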
Spot-wise epigenetic variation
We calculated a spot-wise intraclass correlation coefficient (ICC) according to the one-way
consistency model using the irr package, designating co-twin pairs as a class. The ICC formula
is ICC = (MSb – MSw)/(MSb + MSw). Here MSb stands for the between-pair mean square and
MSw represents the within-pair mean square of the specified class. As the ICC approaches 1,
the co-twins are more similar to each other than unrelated twin pairs are to each other, whereas
as it approaches –1, the within–co-twin difference across the group is consistently larger in
comparison to unrelated twin pairs. Each unique DNA region investigated by the microarray
was treated as an independent measurement. To address the null hypothesis that there are no
differences in the amount of DNA methylation variability between MZ and DZ twins, we
evaluated the distributions of unique locus ICC between MZ and DZ twins in WBCs with a
paired Wilcoxon signed rank test. For buccal epithelial cells, the same hypothesis for MC and
DC twins was evaluated in a similar manner. For inbred and outbred mice, separately, a spot-
wise distribution of within sibship epigenetic variation was created by taking the average of the
variance produced by the three mice per sibship. To address the null hypothesis that there are no
differences in the degrees of epigenetic variation between inbred and outbred mice, we
compared these spot-wise distributions with a paired Wilcoxon signed rank test.
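As a worked illustration of the ICC formula given above, the following Python sketch computes the coefficient for a single locus from a list of co-twin pairs. It is not the irr implementation used in the study; the function name is hypothetical, and it assumes exactly two members per class, which is the case in which the one-way ICC reduces to (MSb – MSw)/(MSb + MSw).

```python
def icc_pairs(pairs):
    """One-way ICC for classes of size two (co-twin pairs) at one locus.

    pairs: list of (twin, co_twin) methylation values.
    Raises ZeroDivisionError if all values are identical.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    grand_mean = sum(a + b for a, b in pairs) / (2.0 * n)
    pair_means = [(a + b) / 2.0 for a, b in pairs]
    # Between-pair mean square: class size (2) times the squared deviations
    # of the pair means, divided by n - 1 degrees of freedom.
    ms_between = 2.0 * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)
    # Within-pair mean square: n * (2 - 1) = n degrees of freedom.
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, pair_means)
    ) / n
    return (ms_between - ms_within) / (ms_between + ms_within)
```

Pairs whose members are identical give an ICC of 1, whereas pairs that differ internally but share the same pair mean give an ICC of –1, matching the interpretation of the two limits described above.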
Cross tissue comparison
Ten WBC samples were obtained from the same individuals as the 10 DC MZ twins used in the
buccal cell analysis. Separate spot-wise ICC distributions were calculated for these 10 DC MZ
twins in the WBC sample and from the remaining 9 unrelated DC MZ twin WBC samples. Each
distribution was compared to the buccal cell derived ICC distribution at 5919 loci overlapping
between datasets by linear regression.
Investigation of genomic element class
The list of microarray probes residing within CpG islands was obtained from the annotation
data (www.microarrays.ca). A list of probes residing within 1 Kb of gene promoters was created
by cross referencing the chromosomal coordinates of each microarray probe with the genome
locations of transcription start sites located within the Transcription Start Site database
(http://dbtss.hgc.jp/) using an in-house Perl algorithm. For each tissue cohort, the spot-wise ICC
distribution of probes residing within CpG islands was compared to that of non-CpG island probes
with a Wilcoxon Rank Sum test: N_CGI = 2,542, N_non-CGI = 3,863 in WBC; N_CGI = 2,343,
N_non-CGI = 3,575 in the buccal cells; N_CGI = 2,352, N_non-CGI = 3,590 in the gut. The same
analysis was performed for promoter-associated loci: N_Promoter = 1,341, N_non-Promoter = 5,064
in WBC; N_Promoter = 1,248, N_non-Promoter = 4,670 in the buccal cells; and N_Promoter = 1,253,
N_non-Promoter = 4,688 in the gut. P values were corrected for multiple testing using the
Bonferroni method.
Gene ontology analysis
Over-representation of gene ontology categories within the top and bottom 5th percentiles of
unique promoter loci was tested using the GOhyperG [197] function of the GOstats package in
Bioconductor for WBC, buccal, and gut. The top and bottom 5th percentiles of unique CGI-
associated loci were interrogated in an identical manner for each tissue. GOhyperG does not
correct for multiple testing. Mappings were based on data provided by: Gene Ontology
(ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest) on 2007/08.
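The kind of over-representation test involved can be illustrated with a one-sided hypergeometric calculation. The Python sketch below is a generic example and is not the GOstats implementation; the function and parameter names are hypothetical, and, as in the analysis described above, no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_upper_tail(n_universe, n_category, n_selected, n_hits):
    """P(X >= n_hits) for a hypergeometric draw without replacement.

    n_universe: all loci with an annotated gene
    n_category: loci in the universe annotated to the category of interest
    n_selected: loci in the selected set (for example, one 5th percentile)
    n_hits:     selected loci annotated to the category
    """
    total = comb(n_universe, n_selected)
    upper = min(n_category, n_selected)
    tail = sum(
        comb(n_category, k) * comb(n_universe - n_category, n_selected - k)
        for k in range(n_hits, upper + 1)
    )
    return tail / total
```

For example, with 10 loci of which 3 belong to a category, drawing 2 loci gives a probability of 24/45 of observing at least one category member.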
Validation of the microarray findings
We validated the microarray findings using sodium bisulfite modification as done previously in
our laboratory [32]. Sodium bisulfite modification was followed by interrogation of specific
CpG sites by pyrosequencing [198] or direct cloning and sequencing. PCR amplicon,
pyrosequencing, and sequencing primers are provided in Table 2.1. PCR conditions included
0.5 μM primers, 10 μl of Qiagen HotStar Taq Master Mix, and double-distilled H2O to a final
reaction volume of 20 μl. Cycling conditions were as follows: 95°C for 15 min; 40 cycles of
95°C for 30 sec, 50°C for 45 sec, and 72°C for 30 sec; 72°C for 5 min; cool to 4°C. PCR
amplicons were
pyrosequenced at EpigenDX Inc (http://www.epigendx.com). A representative CpG dense probe
residing within the 3’ end of the Complement C1q tumor necrosis factor-related protein 8
precursor (C1QTNF8) gene containing 18 CpG positions in a 367 bp fragment was selected for
in depth analysis by cloning and sequencing in WBC DNA from 18 twin pairs. On average, 1 μl
of PCR amplicon was ligated into 50 ng of pGEM-T Easy plasmid vector (Promega) with 5 μl of
2X Rapid Ligation buffer and 3 Weiss units of T4 DNA ligase in a 10 μl reaction volume, and
incubated overnight at 4°C. 2 μl of ligation product was transformed into 50 μl JM109 high
efficiency competent cells and plated on LB agar plates containing 0.1 mg/ml ampicillin, 50 µM
isopropyl β-D-1-thiogalactopyranoside, and 80 µg/ml X-gal for white colony selection. For each
individual, 36 clones were grown overnight in 1 ml lysogeny broth medium, pelleted and
sequenced at Functional Biosciences (http://www.functionalbio.com), after which the ratio of C
to T was calculated at each CpG position per individual. The methylation difference at a CpG at
position 9, located within a HpaII restriction site, as well as the mean methylation difference
between co-twins was compared to the microarray log ratio differences by linear regression.
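The per-position methylation estimate from cloned sequences can be sketched as follows. This Python example is illustrative only: it assumes that each clone has already been reduced to one base call per CpG position, and it reports the proportion of clones retaining C (methylated) among clones showing C or T at that position, which is one common way of expressing the C and T counts. The function name and input format are hypothetical.

```python
def methylation_fraction(clone_calls):
    """Fraction of clones with C at each CpG position.

    clone_calls: one string per clone, one character per CpG position.
    'C' marks a cytosine protected from bisulfite conversion (methylated),
    'T' a converted cytosine (unmethylated); other characters are ignored.
    """
    n_positions = len(clone_calls[0])
    fractions = []
    for pos in range(n_positions):
        calls = [clone[pos] for clone in clone_calls if clone[pos] in "CT"]
        if calls:
            fractions.append(calls.count("C") / len(calls))
        else:
            fractions.append(float("nan"))  # no informative clones
    return fractions


# Four clones, two CpG positions
example = methylation_fraction(["CT", "CC", "TT", "CT"])
```

The co-twin difference at a given position is then the difference between the two individuals' fractions at that position.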
Pyrosequenced Loci  # Pairs  Direction  PCR Primers                                 Pyrosequencing Primer
UHNhscpg0004390     15       F-B        5'-ACACACTATTTGTTGTAATTTTTTTTAGTTTTTT-3'    5'-AAACCCAACAACACA-3'
                             R          5'-CTACTCATCAATAAAAAAACC-3'
UHNhscpg0008483     10       F-B        5'-GATTATGTTTTATTATTGGGGGTA-3'              5'-CAACTAAAACAAAAAAAACATCCC-3'
                             R          5'-CAACTAAAACAAAAAAAACATCCC-3'
UHNhscpg0004556     19       F          5'-GGTTGGTAGTTTAAGTTTGAGTTAG-3'             5'-GGTTGGTAGTTTAAGTTTGAGTTAG-3'
                             R-B        5'-CAACTATACCATCTTTCACTATTTTAAC-3'
UHNhscpg0000193     18       F-B        5'-GGGAGGTGTTYGAGAGGATT-3'                  5'-TCTACCCCCTTTTCCATCTAAA-3'
                             R          5'-TCTACCCCCTTTTCCATCTAAA-3'
UHNhscpg0004262     11       F-B        5'-TAGGAATTAAAAGGATGTTGAAGAT-3'             5'-AAAACTATACCCTATCCCCTAAA-3'
                             R          5'-AAAACTATACCCTATCCCCTAAAAC-3'

Sequenced Loci      # Pairs  Direction  PCR Amplicon Primers                        Sequencing Primer
C1QTNF8             18       F          5'-GTTTGGAATGTTATAGGGATGTTTT-3'             M13 Reverse
                             R          5'-AACCTCAAACAACAAAACCTACATCC-3'
Table 2.1. Sodium bisulfite treated loci and primers
Column 1: microarray probe IDs for loci subjected to sodium bisulfite modification. Column 2: the number of twin
pairs interrogated per locus. Column 3: primer orientation. “F” and “R” denote the forward and reverse primer
sequence. “B” denotes the addition of a biotin modification for downstream pyrosequencing applications. Column
4: Primer sequences for amplifying the respective regions from post sodium bisulfite modified DNA for
pyrosequencing and cloning and sequencing strategies. Column 5: pyrosequencing and sequencing primers are
provided in the far right.
In silico SNP analysis
SNP and allele frequencies were initially obtained from the October 2005 release of dbSNP
database (http://www.ncbi.nlm.nih.gov/projects/SNP/) and updated with information from the
March 2007 release #22 of the HapMap (http://www.hapmap.org/) database. For each locus, a
heterozygosity quotient (HQ) was calculated for two scenarios. The first was for only those
SNPs residing within HpaII positions and the second was for all SNPs residing within the probe
sequence and 1Kb upstream and downstream. An HQ was calculated by summing the quantity
of 1 minus the sum of the squared allele frequencies for all SNPs located within the interrogated
region. The relationship between HQ value and ICCMZ-ICCDZ difference was evaluated through
linear regression.
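The HQ definition above translates directly into a short calculation. The following Python sketch is illustrative only; the function name and the input format (one tuple of allele frequencies per SNP in the interrogated region) are assumptions.

```python
def heterozygosity_quotient(snp_allele_freqs):
    """Sum over SNPs of (1 - sum of squared allele frequencies).

    snp_allele_freqs: iterable with one sequence of allele frequencies per
    SNP located in the interrogated region; each sequence should sum to 1.
    """
    return sum(1.0 - sum(p * p for p in freqs) for freqs in snp_allele_freqs)


# Two SNPs: one with allele frequencies 0.5/0.5, one with 0.9/0.1
hq = heterozygosity_quotient([(0.5, 0.5), (0.9, 0.1)])
```

In this example the first SNP contributes 0.5 and the second 0.18, so a region with more common variants, or more SNPs, receives a higher HQ.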
Contributions: ASM and its putative role in complex disease
I was responsible for experimental design and most wetlab activities involved in this project,
including development, optimization and initial testing of our epiSNP detection technique, all
DNA extractions, quantifications, MSRE digestions, adaptor ligations, PCR amplifications,
fragmentation and labeling for the enriched brain and sperm samples, and all of these tasks plus
hybridization, fluidics and array scanning for all sperm samples. The brain genotyping arrays
were run as a service at the Toronto Centre for Applied Genomics (TCAG). For the deep
sequencing experiment using the 454 platform, I selected the target loci, designed primers, and
performed all bisulfite modification, amplifications, gel extractions, purifications and pooling of
the 800 samples. The 454 sequencing was also run as a service at TCAG. For the bisulfite
verification of epiSNPs and non-epiSNPs, I selected the loci, designed all PCR and
pyrosequencing assays, bisulfite-modified the DNA, and then prepared the amplicons, ran the
pyrosequencing reaction and performed a portion of the data analysis. Analysis of results was
performed by Denise Mak and Paul Boutros, with additional analyses by Natalie Freeman,
Michal Grzadkowski and Ying Wu. I provided biological consultation to direct the analyses.
Sample preparation
Frozen prefrontal cortex (Brodmann area 10) tissues from post-mortem control subjects (n=76),
BD (n=67) and SZ (n=65) patients were obtained from the Stanley Medical Research Institute
and the Harvard Brain Tissue Resource Center. All demographic data provided by the brain
banks is summarized in Appendix 2, Tables A2.1 and A2.2. For the Stanley samples,
pathologists compiled reports on potential donors that include family interviews and medical
records, and these reports are reviewed independently by two senior psychiatrists who make the
diagnoses [199]. The Harvard samples were collected via community donations initiated by the
families of the donors, and classifications were also made through the use of family interviews
and medical records. Every case received a complete neuropathological examination that
included detailed gross and microscopic analysis [200].
Sperm samples from BD (n=24) and control samples (n=24) were collected at the Centre for
Addiction and Mental Health (CAMH, Toronto), and demographic data is presented in
Appendix 2, Table A2.3. The CAMH Research Ethics Board approved the use of all brain and
sperm samples in this study. Germ cells were isolated from the semen using two-layer
discontinuous gradient separation. At 37°C, the two-layer gradient was formed using 2mL
ISolate (lower layer) and 2mL Modified HTF Medium (upper layer) in a 15mL tube. Semen
(2mL) was gently added onto the upper layer, and the tube was centrifuged for 20 min at 300 x
g. The top layer was removed by aspiration until only 0.5 mL of lower layer remained. Sperm
Washing Medium (3mL) was added, the tube was centrifuged for 10 min at 300 x g, and then all
but the lower 0.5 mL was removed with a pipette. This washing step was repeated once, then the
supernatant was removed and 0.5mL Sperm Washing Medium was added to the pellet, which
was then stored at -80°C in a cryo tube. The cells were re-pelleted and the storage solution was
removed prior to DNA extraction.
Genomic DNA was extracted using phenol and chloroform. The unmethylated fraction of the
genome was enriched for each sample in the following manner: 500ng of genomic DNA was
separately digested with three MSREs, HpaII, HinP1I, and HpyCH4IV (New England Biolabs),
and the three digests per sample were then pooled in equivalent amounts, adaptors were ligated
onto the ends, and the ligation products were digested with McrBC (New England Biolabs).
Samples were then PCR-amplified using primers complementary to the adaptor sequences,
fragmented with DNase I (EpiCentre), labelled (GeneChip DNA labelling reagent, Affymetrix)
and hybridized to Affymetrix SNP 6.0 microarrays, which interrogate 906 600 SNPs at 3000 bp
resolution. For each sample, purified genomic DNA was prepared following the manufacturer’s
instructions and hybridized onto a second SNP 6.0 array for standard genotyping. As cases and
controls were run separately on two batches of arrays, a subset of 10 cases and 10 controls was
re-run in the second batch to ensure comparability. These technical replicates were enriched
separately, at a later date versus the original cases and controls, which were enriched together.
The workflow for the epiSNP study is presented in Figure 2.2.
Figure 2.2. EpiSNP study workflow
Post mortem prefrontal cortex and sperm samples were processed identically. Identified epiSNPs were examined in
several downstream analyses, which differed between cohorts.
EpiSNP identification
We used R v1.12.1 and the R package, oligo v1.14.0, to background correct, normalize and
summarize (RMA) the SNP probes, and crlmm to make genotype calls. Datasets were
normalized separately, as were genotyping and methylation arrays. For each SNP, we obtained a
pair of values: a genotype call and a methylation level. To determine the most appropriate
analysis strategy, we performed an empirical study of five different methods: Pearson’s
correlation, Spearman’s rank-order correlation, mutual information, piecewise linear regression
(PWL) and analysis of variance (ANOVA). We found that PWL and ANOVA were the top two
performers in terms of sensitivity and specificity. PWL is a two-step linear regression model
that has an advantage over ANOVA: it provides a pattern of directionality between the
genotypes AA and AB (slope 1) and between AB and BB (slope 2). PWL was therefore used to
examine the pattern of dependence between the AA/AB and AB/BB genotype calls and their
respective methylation levels. SNPs with statistically significant allelic DNA methylation
differences were identified as epiSNPs, i.e. an epiSNP has at least one significantly non-zero
slope. An FDR correction was applied to correct for multiple testing. Identified epiSNPs had
q-values < 0.01.
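As an illustration only, the two-slope idea and the FDR step can be sketched in Python. This is not the analysis code used in the study. The sketch assumes genotypes coded as the strings AA, AB and BB, takes each slope as the difference between adjacent genotype means (which is what a two-segment least-squares fit with its breakpoint at AB reduces to), and assumes a Benjamini-Hochberg style FDR correction. The function names are hypothetical, and the significance test for each slope is not shown.

```python
from statistics import mean

GENOTYPE_ORDER = ("AA", "AB", "BB")


def pwl_slopes(genotypes, methylation):
    """Return (slope1, slope2) for the AA->AB and AB->BB segments.

    With genotypes placed at x = 0, 1, 2 and a breakpoint at AB, the fitted
    two-segment line passes through the three genotype means, so each slope
    is the difference between adjacent means. A slope is None when one of
    its two genotypes is absent.
    """
    groups = {g: [] for g in GENOTYPE_ORDER}
    for genotype, level in zip(genotypes, methylation):
        groups[genotype].append(level)
    means = {g: (mean(v) if v else None) for g, v in groups.items()}

    def step(left, right):
        if means[left] is None or means[right] is None:
            return None
        return means[right] - means[left]

    return step("AA", "AB"), step("AB", "BB")


def benjamini_hochberg(pvalues):
    """Convert p-values to q-values (assumed Benjamini-Hochberg procedure)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues
```

In this sketch a SNP would be called an epiSNP if the q-value attached to either slope fell below 0.01.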
We designed a random sampling procedure to test the sensitivity of our epiSNP identification
method, where Ci represents one of five sets of cohorts (Control, Case, BD, SZ, Control+Case)
in our study. For each cohort Ci and each sample size N (ranging from 2 to Ci-1), we obtained X
groups of identified epiSNPs, dependent on the randomization of chosen samples. The sampled
data gave us a range (minimum and maximum) of identified epiSNPs at each sample size, from
which we extrapolated the number of identified epiSNPs expected at larger sample sizes. For
the extrapolation, we tested three models (linear, quadratic and logistic) and used AIC (Akaike
Information Criterion) values to evaluate each one, with the smallest AIC value representing the
best model. The relative measure, weighted AIC, gives the probability of each model being
the best, given the data and the set of candidate models. In order to gain a clear view of the
overall trend, we filtered the data with an increasing slope threshold. Both slope values for
each SNP were combined and made positive.
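The model comparison can be illustrated with a short Python sketch. It assumes the common least-squares form of AIC (defined up to an additive constant) and the standard Akaike weights. The text does not specify how the AIC values were obtained, so both functions are hypothetical helpers and not the code used in the study.

```python
import math


def aic_least_squares(rss, n_points, n_params):
    """AIC of a least-squares fit with Gaussian errors, up to a constant."""
    return n_points * math.log(rss / n_points) + 2 * n_params


def akaike_weights(aic_values):
    """Relative probability of each model being the best of the set."""
    best = min(aic_values)
    # Relative likelihood of each model compared with the best one.
    relative = [math.exp(-(aic - best) / 2) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]
```

For three candidate models with AIC values of 100, 102 and 110, `akaike_weights([100.0, 102.0, 110.0])` assigns the largest weight to the first model, and the weights sum to 1.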
Identified epiSNPs from both the brain and sperm samples were closely examined for any
chromosomal and functional class bias. Chromosome annotation was taken from the R
annotation package pd.genomewidesnp.6 (v1.1.0) and functional class information was taken
from dbSNP (build 135). We separated functional classes into five main categories: exon,
intron, UTR, locus and intergenic. Note that intergenic regions are not currently categorized in
dbSNP, and it was the absence of information that was used to label SNPs as “intergenic.” All
other possible functional classes did not apply to our group of epiSNPs. The term “locus” is
used by dbSNP to identify intergenic SNPs that have close associations with a gene, existing
either within 2Kb upstream or 500bp downstream, but do not appear in the transcript. The term
“UTR” includes both 5’ and 3’ UTRs, where the 3′ UTR is the portion of an mRNA from the
position of the last codon that is used in translation to the 3′ end, and the 5′ UTR is the portion
of an mRNA from the 5′ end to the position of the first codon used in translation. We examined
each cohort using the hypergeometric distribution (phyper function in R) to compare the
proportion of epiSNPs in each chromosome/functional class against the proportion of all
genetically diverse SNPs on the SNP 6.0 array in the same chromosome/functional class, and
determined whether epiSNPs were under-represented or over-represented. Multiple testing
correction using the FDR and a q-value of 0.01 was applied to the chromosomal results, as there
were 24 tests per cohort. No correction for multiple testing was applied to the functional class
results: only five tests were run per cohort, so with a stringent p-value cut-off of 0.01 we would
expect, per cohort, 0.01 x 5 = 0.05 false positive functional class biases by chance alone.
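The under- and over-representation test can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the R code used in the study. `hypergeom_cdf` mirrors the argument order of R's `phyper(q, m, n, k)`, the helper names are hypothetical, and the example counts are invented. Exact binomial coefficients keep the sketch simple but would be slow at the scale of the full array.

```python
from math import comb


def hypergeom_cdf(q, m, n, k):
    """P(X <= q) when k SNPs are drawn from m in-class and n out-of-class SNPs."""
    lowest = max(0, k - n)
    highest = min(q, k, m)
    favourable = sum(comb(m, x) * comb(n, k - x) for x in range(lowest, highest + 1))
    return favourable / comb(m + n, k)


def representation_pvalues(hits_in_class, class_size, array_size, n_hits):
    """Return (p_under, p_over) for epiSNPs falling in one class."""
    m, n, k = class_size, array_size - class_size, n_hits
    p_under = hypergeom_cdf(hits_in_class, m, n, k)           # P(X <= observed)
    p_over = 1.0 - hypergeom_cdf(hits_in_class - 1, m, n, k)  # P(X >= observed)
    return p_under, p_over


# Invented example: 20 SNPs on the array, 5 of them in the class of interest,
# 4 epiSNPs in total, 3 of which fall in that class.
p_under, p_over = representation_pvalues(3, 5, 20, 4)

# Expected false positives for five uncorrected tests at p < 0.01.
expected_false_positives = 0.01 * 5
```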
We used the web interface of the GoMiner program to identify enriched GO categories
associated with our lists of identified epiSNPs. GoMiner requires two lists of genes as input: the
total set of genes and a subset of interesting genes. dbSNP (build 135) was used for mapping
SNPs to gene symbols. Three GoMiner runs were completed for each cohort. For each cohort,
GoMiner returned a list of GO categories that are statistically enriched for those cohort genes
that belong to the GO category after correction for multiple testing (FDR q-value < 0.02). An
enrichment score is given for each GO category, representing the proportion of cohort genes
relative to the total number of genes on the SNP 6.0 array, which can be concisely described as:
Verification of microarray results
EpiSNPs that occurred in both cases and controls were chosen from a list of top hits, where one
allele was associated with methylation much more strongly than the alternate allele, and
non-epiSNPs were randomly selected (n=3 for each). For each locus, we chose AA and BB
homozygous samples from the original experiment, and then performed bisulfite modification
on the genomic DNA, as done previously in our laboratory [32]. The amplicons were designed
to cover as many CpG sites surrounding the SNP as possible; PCR amplicon, pyrosequencing,
and sequencing primers are provided in Table 2.2. PCR conditions included 5µl of 0.12μM
primer mix, 2.5 μl of Qiagen HotStar Taq buffer, 0.5µl of 10mM dNTPs, 1.3µl of HotStarTaq,
and double-distilled H2O to a final reaction volume of 25 μl. Cycling conditions were as
follows: 95 °C for 15 min; 40 cycles of 95 °C for 1 min, locus-specific annealing temperature
for 45 sec, and 72 °C for 1 min; 72 °C for 10 min; cool to 10 °C and hold. Prepared amplicons were
pyrosequenced in house on the PyroMark Q24 machine (Qiagen), using 0.3µM sequencing
primers. Results were analyzed using the methylation analysis function in the PyroMark Q24
software. Independent CpG percentages were compared between genotypes (AA vs. BB) using
the Wilcoxon signed rank test, using a p-value threshold of <0.06.
Table 2.2. Sodium bisulfite treated SNP loci and primers
Column 1: SNP 6.0 probe IDs for loci subjected to sodium bisulfite modification. Column 2: the number of brain
DNA samples interrogated per locus. Column 3: primer orientation. “F” and “R” denote the forward and reverse
primer sequence. “B” denotes the addition of a biotin modification for downstream pyrosequencing applications.
Column 4: Primer sequences for amplifying the respective regions from bisulfite modified DNA for
pyrosequencing. Column 5: pyrosequencing primer sequences.
Locus       #Samples  Direction  PCR primers                         Pyrosequencing primer
rs649951    10        F          AGTTTTTGTTAGTTTGAAGATATTTTGA        AGATTTATATGTAGTTGTA
                      R-B        (BIO)AATATAATCCCAAATCATAAAATCACAA
rs9936944   10        F          TGTTGTATTTTTAGTAGAGAGAGGGT          TGTTGGTTAAGTTGGT
                      R-B        (BIO)TCCTAATCCTAAAATCAACCATTCCT
rs1485474   10        F          TGTGGTAGTATATGGTTGTGGT              AGGATGGAGGTTTGT
                      R-B        (BIO)AACCAACTAATCTTCAACAAAACAAA
rs5950206   8         F          TTGGAAGATGTATTGTTTATAGTGTT          TTATTAGTGTTAGAGTTT
                      R-B        (BIO)ACCATATACACAAATCAACTCACAA
rs10875310  10        F          ATAGGAGGATGTGTGTAGATTATAT           TGTGTAGATTATATGGT
                      R-B        (BIO)ACCCACATAACCCAATCACCT
rs10962372  6         F          TTAAGGTGATTGGATGATTTGAGTA           TGAGGATTAAAGTATGA
                      R-B        (BIO)ACTAATTCAACTTACCTCCACCT
Examination of linkage disequilibrium effects
We investigated possible linkage disequilibrium (LD) effects between identified epiSNPs and
SNPs occurring within nearby MSRE recognition sites, as it is possible that any of the 4 bases in
the recognition site could be a SNP that creates or disrupts the site, leading to false positive
epiSNP associations. Unless specified, "SNPs" includes both epiSNPs and non-epiSNPs and
"MSRE SNPs" refers to any SNP that creates or disrupts an MSRE recognition site. The human
genome (build hg19) and SNPs from dbSNP (build 135) were used to determine the location of
MSRE SNPs. Two different analyses were utilized (p<0.05 is considered significant):
1. Distance analysis: Examining the distance between SNPs and the nearest MSRE SNP. A
Wilcoxon rank sum test was applied to test for differences between epiSNP and non-epiSNP
distances in each cohort.
2. LD analysis: Examining LD values between SNPs and all MSRE SNPs within 2Kb. The
European ancestry in Utah (CEU) and British from England and Scotland, UK (GBR)
populations were chosen as most representative of the samples used in the epiSNP analysis.
Genotype information was taken from the 1000 genomes project [201]. PLINK analysis for LD
calculations used a maximum distance of 10Kb to reduce computation time.
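The definition of an MSRE SNP and the distance measure can be illustrated with a short Python sketch. It is not the pipeline used in the study. The recognition sequences (CCGG for HpaII, GCGC for HinP1I, ACGT for HpyCH4IV) are taken from the enzymes' published specificities rather than from this text, and the function names and toy sequences are hypothetical. Because all three sites are palindromic, scanning one strand is sufficient.

```python
MSRE_SITES = {"HpaII": "CCGG", "HinP1I": "GCGC", "HpyCH4IV": "ACGT"}


def sites_overlapping(seq, pos):
    """Return {(enzyme, start)} for recognition sites in seq covering index pos."""
    found = set()
    for enzyme, site in MSRE_SITES.items():
        first = max(0, pos - len(site) + 1)
        last = min(pos, len(seq) - len(site))
        for start in range(first, last + 1):
            if seq[start:start + len(site)] == site:
                found.add((enzyme, start))
    return found


def msre_effect(ref_seq, pos, alt_base):
    """Report sites created or disrupted by substituting alt_base at pos."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    ref_sites = sites_overlapping(ref_seq, pos)
    alt_sites = sites_overlapping(alt_seq, pos)
    return {"disrupted": ref_sites - alt_sites, "created": alt_sites - ref_sites}


def nearest_distance(snp_pos, msre_snp_positions):
    """Distance in bp from a SNP to the closest MSRE SNP."""
    return min(abs(snp_pos - p) for p in msre_snp_positions)
```

For example, `msre_effect("AACCGGTT", 3, "T")` reports the HpaII site starting at index 2 as disrupted, while `msre_effect("TTACATCC", 4, "G")` reports a newly created HpyCH4IV site.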
Deep sequencing analysis of non-epiSNPs
We detected a large number of epiSNPs using the microarray approach and PWL analysis, but
microarrays are not as sensitive as deep sequencing technologies and epiSNPs demonstrating
lower level associations are likely overlooked by this method. Also, the PWL analysis is a two-
step regression model that requires at least two of the three possible genotypes (AA, AB and
BB) to be present in order to detect the methylation intensity slopes; as a result, SNPs that have
rare genotypes may be excluded from our analysis. In order to examine the methylation
association of SNPs in greater detail, we conducted an experiment using the 454 deep
sequencing platform, which allows us to use single-base resolution to search for associations
that may have been missed by the microarrays. Eleven SNPs that did not demonstrate ASM
were chosen. For each SNP, we aimed to choose 10 samples of each alternative homozygote
from cases and controls, i.e. 10 AA (case), 10 BB (case), 10 AA (ctrl) and 10 BB (ctrl), to be
bisulfite-modified [32] and sequenced using the 454 platform (40 samples x 11 SNPs = 440
amplicons produced). In some cases, 10 samples of each type were unavailable, but overall
there were approximately equal numbers of each type submitted. Each sample underwent
bisulfite modification, which converts unmethylated cytosines to uracils, and then finally to
thymines after PCR amplification, while methylated cytosines remain intact. For each SNP, an
amplicon was created that surrounded the SNP and contained as many CCGG sites as possible;
to ensure purity, these amplicons were cut from an agarose gel and purified before
quantification. We split the 454 sequencing plate into quadrants (AA (case), BB (case), AA
(ctrl) and BB (ctrl)) and sequenced from both A and B tails using Titanium plates and
reagents. We generated a total of 422 114 forward sequence reads (summarized in Table
2.3) and compared the number of unconverted cytosines at each CpG site between groups per
SNP. Our analysis focused on 4 main questions:
1. Is the association between methylation and genotype in case the same as the association
between methylation and genotype in control?
2. Is the association between methylation and genotype AA in case the same as the association
between methylation and genotype BB in case?
3. Is the association between methylation and genotype AA in control the same as the
association between methylation and genotype BB in control?
4. Is the association between methylation and case the same as the association between
methylation and control?
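The quantity compared in these questions, the number of unconverted cytosines at each CpG site, can be sketched in Python. This is an illustration only: it assumes forward-strand reads that are already aligned to the reference amplicon without insertions or deletions, and the function names and toy sequences are hypothetical.

```python
def cpg_positions(reference):
    """Indices of the cytosine in every CpG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def count_unconverted(reference, reads):
    """Per CpG position, return (unconverted reads, informative reads).

    After bisulfite treatment and PCR, a C at a CpG position indicates a
    methylated (unconverted) cytosine and a T an unmethylated one. Any
    other base, or a read too short to cover the site, is ignored.
    """
    counts = {pos: [0, 0] for pos in cpg_positions(reference)}
    for read in reads:
        for pos, tally in counts.items():
            base = read[pos] if pos < len(read) else None
            if base == "C":
                tally[0] += 1
                tally[1] += 1
            elif base == "T":
                tally[1] += 1
    return {pos: tuple(tally) for pos, tally in counts.items()}


# Toy amplicon with CpG sites at indices 1 and 5, covered by three reads.
example = count_unconverted("ACGTTCGA", ["ACGTTCGA", "ATGTTCGA", "ATGTTTGA"])
```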
For question 1, we tested the null hypothesis (H0) per SNP, assuming that the difference
between the odds ratios follows a normal distribution under H0. The method fitted the data with a
logistic regression model (Binomial Family with logit link) for each SNP. Because this test uses
the information from all the CpGs for each SNP, it does not yield results for individual
CpGs.
For questions 2-4, we performed a multiple-CpG association analysis per SNP. A 2x2
contingency table was constructed for each question tested, and Fisher’s exact test was used to
examine the association for each table. The CpG counts were pooled for each SNP. Each test
was performed 20 times (per SNP), and significant q-values were recorded.
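A two-sided Fisher's exact test on a single 2x2 table can be sketched with the Python standard library. The table layout assumed here (rows for the two groups being compared, columns for pooled unconverted and converted cytosine counts) is an illustration rather than a description of the tables actually constructed, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalized hypergeometric probability that the top-left cell is x.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Sum over all tables that are no more probable than the observed one.
    extreme = sum(weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed)
    return extreme / comb(row1 + row2, col1)
```

Working with integer weights until the final division avoids floating-point ties when deciding which tables count as at least as extreme as the observed one.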
Table 2.3. Forward 454 sequencing reads per amplicon
The number of forward 454 sequencing reads generated for 11 non-epiSNPs.
SNP     dbSNP ID  # reads
SNP 1   10975882  9792
SNP 2   5943127   17255
SNP 7   11658063  110312
SNP 11  3762352   114447
SNP 12  219815    34831
SNP 15  17551103  30815
SNP 18  2859011   5629
SNP 19  2059697   53360
SNP 21  720080    26628
SNP 23  2581651   5372
SNP 25  1902675   13673
Chapter 3
Results and Discussion
Sequence-independent DNA methylation differences in MZ twins
In this study, we mapped MZ twin DNA methylation differences in white blood cells (WBC)
(N=19 pairs), buccal epithelial cells (N=20 pairs), and gut (rectum) biopsies (N=18 pairs), by
interrogation of the unmethylated genome on the 12K CpG island microarray [195]. We first
ensured that the microarray technology identifies actual DNA methylation differences between
MZ co-twins rather than artifactual differences due to technical variation. For this, 4 parallel
enrichments of the unmethylated fraction of genomic WBC DNA were performed from the
DNA stock of the same individual. DNA samples from 8 MZ twins (4 pairs) were compared
against themselves (to measure technical variation) or the respective co-twin (to measure
biological variation). The biological variation significantly exceeded the technical variation in
all cases (P=1.4x10^-238, P=1.1x10^-202, P=2.1x10^-7, P=2.6x10^-39), indicating that the detected MZ
co-twin differences are genuine (Fig. 3.1). The technical variance (σ²) was consistent between
all self-self hybridizations, while the degree of biological variation varied significantly between
twin pairs (Fig. 3.1). For all MZ twins per tissue cohort, the mean absolute log fold change
between MZ co-twins was significantly larger than that between technical replicates (WBC
mean difference=0.013 ± 4.5.6x10^-4, P=3.6x10^-173; buccal mean difference=0.017 ± 6.8x10^-4,
P=4.9x10^-132; gut mean difference=0.0053 ± 5.6x10^-4, P=1.02x10^-14), signifying that biological
variation was detectably higher than technical variation in all tissues. Furthermore, microarray
validation performed by sodium bisulfite sequencing and pyrosequencing (Fig. 3.2 and Fig. 3.3)
indicated that the microarray signals detected reflect the actual DNA methylation status in the
tested samples. For WBC-based analyses, we also performed a spot-wise correlation with
cell sub-fraction counts and confirmed that differences observed in WBC samples did not
result from cell sub-fraction differences.
Figure 3.1. Biological vs. technical variation
Volcano plots of 4 MZ twin vs. co-twin WBC DNA methylome comparisons (black) overlaid with 4 matched
twin DNA vs. self comparisons (green) for each set of MZ twins. The x-axis represents the mean fold change across
the 4 replicates. The y-axis represents the -log10 of the p value from a paired t test. Higher significance denotes a
higher consistency between replicates. Significant variation in the spread of detected biological difference exists
between twin pairs (Kruskal-Wallis χ2= 16.3, df = 3, P=0.001) with a symmetrical large (A and B), symmetrical
small (C), and asymmetrical (D) variation of the DNA methylome between co-twins. For each twin pair, a
non-parametric Ansari-Bradley test demonstrated that levels of variance (σ²) in the MZ twin - co-twin comparison
were significantly larger than σ² in the self-self comparisons (twin set A: variance ratio=2.91, P=1.4x10^-238;
set B: 2.14, P=1.1x10^-202; set C: 1.12, P=2.1x10^-7; set D: 2.63, P=2.6x10^-39). Levels of technical variation
were not significantly different between groups (Kruskal-Wallis χ2= 1.81, df = 3, P= 0.62).
Figure 3.2. Correlations between microarray and sodium bisulfite sequencing data
Using DNA from WBC, microarray data were validated by sodium bisulfite modification based mapping of
methylated cytosines in 18 CpGs at a locus displaying a range of co-twin variability, UHNhscpg0003195, which
maps to the 3’ end of C1QTNF8. Over 1,300 clones representing 18 MZ twin pairs (36 clones on average per
individual) were sequenced. Twin differences in the density of methylated cytosines revealed by bisulfite
sequencing (x-axis) correlated significantly with the log2 DNA methylation differences produced by the microarray
data (y-axis) (mean density across all 18 CpGs: R=0.65, P=0.0036) (A). Similarly, the density of methylated
cytosines in the HpaII restriction site at the 9th CpG position correlated significantly with the microarray data
(R=0.58, P=0.01) (B). In both A and B, the x-axis values represent the sodium bisulfite based co-twin DNA
methylation difference and y-axis values represent the log2 fold difference between co-twins generated by
microarray data.
Figure 3.3. Pyrosequencing correlations as a function of distance
Bisulfite pyrosequencing of the total amplicon without cloning was performed in WBC DNA samples at 5 loci
showing a range of co-twin variation: UHNhscpg0008483 (15 pairs of twins, CA2 gene), UHNhscpg0004390
(10 pairs, RAX gene), UHNhscpg0004556 (19 pairs, IL1A gene), UHNhscpg0000193 (18 pairs, RNF110 gene),
and UHNhscpg0004262 (11 pairs, DLX1 gene); these results also correlated positively with the microarray data.
The bar graph displays the strength of correlation between the log2 DNA methylation differences between
co-twins at the bisulfite pyrosequenced loci and those from the microarray data. Correlations between the
microarray data and the HpaII position only are depicted in red, while blue represents the correlation derived
from the average methylation density over 5, 6, 4, 7 and 3 CpGs, respectively. Interrogated CpG sites located
within the probe sequence (represented by a rectangle) showed the strongest correlation with microarray data.
X-axis values (-141, -30, 221, 267, and 637) depict the position in bp of the interrogated HpaII site relative to
the 3’ end of each clone marked as zero. The y-axis depicts Pearson’s correlation (R) between microarray and
pyrosequencing data.
Similarly to Eckhardt et al. [202], we noticed that the strength of the correlation between microarray signal and
bisulfite based-mapping of methylated cytosines located outside the probe sequence was a function of the distance
of the interrogated CpG site from the probe.
In the microarray-based studies, we detected a large degree of MZ co-twin DNA methylation
variation in all tissues investigated, despite their identical DNA sequences. We used an
intraclass correlation coefficient (ICC) to measure MZ co-twin variation for each unique
genomic region, where an ICC range from +1 to –1 denotes high to low epigenetic similarity
between co-twins relative to the variation between unrelated pairs. For each tissue, we
generated an ICC- based annotation of MZ co-twin DNA methylation variation across ~6,000
unique DNA loci (Fig. 3.4 – WBC; Appendix 1 Fig A.1 and Fig A.2 contain annotations for the
other tissues). Notably, DNA methylation profiles in the buccal epithelial cells from MC MZ
twins were significantly more variable within pairs than those from DC MZ twins (median
difference=0.37 ± 0.0057, P<9.9x10^-324), which cannot be explained by technical differences
between the hybridization batches of each group (Fig 3.5). Chorionicity information was only
available for the buccal and WBC samples; all WBC of MZ twins were DC to avoid in utero
twin blood transfusion effects. DC MZ twins are believed to result from a splitting of the
cleavage-stage embryo within the first four days following fertilization, whereas MC MZ twins
arise after this point [203]. The varying degrees of epigenetic dissimilarity detected between
these groups may reflect differences in epigenetic divergence among embryonic cells at the time
the twin embryos separated.
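For readers unfamiliar with the statistic, a pairwise ICC at a single locus can be sketched in Python. The estimator below is the standard one-way ANOVA form for groups of size two and is an assumption: this section does not state which ICC estimator was used. The function name and example values are hypothetical.

```python
def pairwise_icc(pairs):
    """One-way ANOVA intraclass correlation for co-twin pairs at one locus.

    pairs is a list of (twin1, twin2) methylation values. The result
    approaches +1 when co-twins are far more alike than unrelated
    individuals and -1 when all variation lies within pairs. The function
    is undefined (division by zero) if every value is identical.
    """
    n = len(pairs)
    grand_mean = sum(x + y for x, y in pairs) / (2 * n)
    # Between-pair mean square: spread of the pair means around the grand mean.
    ms_between = 2 * sum(((x + y) / 2 - grand_mean) ** 2 for x, y in pairs) / (n - 1)
    # Within-pair mean square: spread of co-twins around their own pair mean.
    ms_within = sum((x - y) ** 2 / 2 for x, y in pairs) / n
    return (ms_between - ms_within) / (ms_between + ms_within)
```

As a check on the two extremes, `pairwise_icc([(1, 1), (2, 2), (3, 3)])` returns 1.0 and `pairwise_icc([(1, -1), (-1, 1), (2, -2)])` returns -1.0.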
Figure 3.4. Karyogram of MZ co-twin epigenetic similarity in WBCs
A chromosomal karyogram depicting degree of MZ co-twin DNA methylation similarity per interrogated locus in
the WBC sample. Dark-to-light bars on the chromosomes represent chromosomal banding patterns as revealed by
Giemsa staining, and red bars indicate regions of high microarray probe density. Bars to the right of each
chromosome represent locus-specific ICCs depicting degrees of MZ co-twin epigenetic similarity. P values
associated with the ICC statistic per locus were subjected to false discovery rate (FDR) correction for multiple
testing. FDR-corrected P values below the level of P < 0.05 are depicted in green, and those with greater P values
are shown in gray.
Figure 3.5. Raw binding intensities of MC and DC MZ twin hybridizations
Box plots of raw green (A) and red (B) signal intensities for 40 DC (1) and 40 MC (2) buccal MZ twin microarrays.
Green and red center lines separate the two batches of samples. As MC and dichorionic MZ twin buccal samples
were processed in different batches, we evaluated whether batch effects in sample binding could be influencing
this result. No batch effects are observed that could account for the significant differences in MZ co-twin
epigenetic variation between dichorionic and MC twins.
The spot-wise ICC values across the 5,919 loci that overlapped between data sets were
compared between the buccal cells and WBC from the same set of DC MZ twins, and WBC
samples from different individuals, by linear regression. A small but significant correlation was
observed between WBC and buccal-derived ICCs from the same individuals (R=0.046,
P=4.08x10^-4) but not between buccal cells and WBCs from unrelated individuals (R=-0.0025,
P=0.84), which suggests that tissues in genetically identical individuals are more epigenetically
similar than those in unrelated individuals.
Using locus-specific DNA methylation information, we investigated whether the degree of co-
twin epigenetic similarity is associated with functional genomic elements. In each tissue, we
compared the distribution of ICCs of the CpG islands (CGIs) to that of all non-CGI loci.
Promoters were investigated in an identical manner. We carried out six tests and corrected p
values for multiple testing using the Bonferroni method. Both CGIs and promoters were less
epigenetically variable than non-CGI and non-promoter regions, respectively, in WBC-derived
DNA (Wilcoxon Rank Sum test, meanCGI=0.43 ± 0.0065, meanNon-CGI=0.39 ± 0.0053,
P=1.5x10^-4 and meanPromoter=0.43 ± 0.0085, meanNon-Promoter=0.4 ± 0.0048, P=0.0077;
Bonferroni corrected P=8.7x10^-4 and P=0.047, respectively). Promoters also showed a trend
towards being less epigenetically variable in gut tissue (Wilcoxon Rank Sum test,
meanPromoter=0.11 ± 0.0065, meanNon-Promoter=0.09 ± 0.0037, P=0.057; Bonferroni corrected
P=0.34). No statistically
significant differences in the degree of DNA methylation variation were detected in the buccal
epithelial cells. The promoter and CGI probes were also subjected to Gene Ontology (GO)-based
analysis [197]. Most of the identified GO categories associated with epigenetically similar
loci between co-twins (top 5th percentile of ICCs) had direct functional relevance to the tissue
investigated (Table 3.1). The most apparent connections were observed in WBC, where
categories such as T cell proliferation (GO:0042098) and activation of immune response
(GO:0002253) were identified. In buccal cells, the proteinaceous extracellular matrix
(GO:0005578) and the metalloendopeptidase activity (GO:0004222) categories were identified;
genes in these categories interact and are expressed in oral fibroblast cells [204, 205]. A portion
of GO categories in gut appeared to be associated with regulation of cell proliferation
(GO:0042127) and epithelial to mesenchymal transition (GO:0001837), which is an intrinsic
step of formation of the smooth muscle cells of the gut blood vessels [206]. Our observations
are consistent with an earlier study [6] where the fidelity of CpG methylation patterns was twice
as high in promoter as opposed to non-promoter regions. Taken together, greater epigenetic
similarity between MZ co-twins at functionally important regions (versus loci without clearly
defined regulatory function) suggests that the epigenome is functionally stratified based on the
locations of critical genes.
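The Bonferroni step applied to the six tests above can be written out as a brief Python sketch (the function name is hypothetical). Multiplying the rounded raw P values reported above by six gives values close to the reported corrected ones; the small discrepancies presumably reflect rounding of the raw P values in the text.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Six tests: CGI and promoter comparisons in each of three tissues.
corrected_wbc_promoter = bonferroni(0.0077, 6)  # about 0.046; 0.047 is reported
corrected_gut_promoter = bonferroni(0.057, 6)   # about 0.34, as reported
```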
Cohort            GO ID       Pvalue  OddsRatio  ExpCount    Count  Size  Term
WBC Promoters     GO:0002274  0.0051  Inf        0.14341085  2      2     myeloid leukocyte activation
                  GO:0042098  0.0277  13.277778  0.28682171  2      4     T cell proliferation
                  GO:0006909  0.044   8.8425926  0.35852713  2      5     phagocytosis
                  GO:0009615  0.044   8.8425926  0.35852713  2      5     response to virus
                  GO:0002253  0.044   8.8425926  0.35852713  2      5     activation of immune response
WBC CGIs          GO:0030183  0.0314  12.325301  0.30630631  2      4     B cell differentiation
                  GO:0002253  0.0498  8.2088353  0.38288288  2      5     activation of immune response
                  GO:0002443  0.0498  8.2088353  0.38288288  2      5     leukocyte mediated immunity
Buccal Promoters  GO:0030518  0.0047  Inf        0.13835198  2      2     steroid hormone receptor signaling pathway
                  GO:0019222  0.0066  2.0263338  15.8413021  25     229   regulation of metabolic process
                  GO:0006350  0.0183  1.8618926  14.4577823  22     209   transcription
Buccal CGIs       GO:0005578  0.0287  3.3724236  1.79654511  5      26    proteinaceous extracellular matrix
                  GO:0004222  0.0324  5.3868243  0.73780488  3      11    metalloendopeptidase activity
Gut Promoters     GO:0042127  0.0365  3.8139535  1.31905465  4      19    regulation of cell proliferation
Gut CGIs          GO:0001837  0.0138  27.388889  0.20923657  2      3     epithelial to mesenchymal transition
                  GO:0007179  0.0263  13.680556  0.27898209  2      4     transforming growth factor beta receptor signaling pathway
Table 3.1. GO analysis of loci with high MZ co-twin epigenetic similarity
Significantly over-represented gene ontology categories in the positive 5th percentile of the ICC distribution of
promoter and CGI associated loci in each tissue cohort.
Epigenetically variable loci (bottom 5th percentile of ICCs) were associated with cell division
processes, which may reflect an early developmental epigenetic discordance as one of the
hypothetical reasons for twin formation [6] (Table 3.2).
Cohort            GO ID       Pvalue  OddsRatio  ExpCount    Count  Size  Term
WBC Promoters     GO:0000074  0.0032  3.5637066  3.20930233  9      46    regulation of progression through cell cycle
                  GO:0022402  0.019   2.4193548  4.88372093  10     70    cell cycle process
WBC CGIs          GO:0019882  0.0435  8.9004329  0.35585586  2      5     antigen processing and presentation
                  GO:0045786  0.048   3.3833333  1.42342342  4      20    negative regulation of progression through cell cycle
Buccal Promoters  GO:0000279  0.0075  3.6943284  2.40895219  7      32    M phase
                  GO:0007067  0.0398  3.0641822  1.95727365  5      26    mitosis
                  GO:0000776  0.0276  13.308824  0.28659161  2      4     kinetochore
                  GO:0005876  0.0276  13.308824  0.28659161  2      4     spindle microtubule
                  GO:0005768  0.0383  5.0317164  0.78812692  3      11    endosome
                  GO:0000922  0.0439  8.8627451  0.35823951  2      5     spindle pole
Buccal CGIs       GO:0051656  0.0143  26.767123  0.21367521  2      3     establishment of organelle localization
Gut Promoters     GO:0006732  0.0151  5.4577778  1.03692762  4      13    coenzyme metabolic process
                  GO:0019867  0.0313  12.489796  0.30676692  2      4     outer membrane
Gut CGIs          GO:0051186  0.0065  4.4351852  1.80395853  6      22    cofactor metabolic process
                  GO:0022610  0.0189  2.5604396  4.18190386  9      51    biological adhesion
                  GO:0016337  0.0285  3.4325681  1.80395853  5      22    cell-cell adhesion
Table 3.2. GO analysis of loci with low MZ co-twin epigenetic similarity
Significantly over-represented gene ontology categories in the negative 5th percentile of the ICC distribution of
promoter and CGI associated loci in each tissue cohort.
Cases of DNA sequence variation in MZ twins have been documented [207], but these are
uncommon and unlikely to account for even a fraction of the MZ co-twin differences identified
in our experiments. Further studies may include a more detailed annotation of epigenetic
differences in MZ co-twins, a search for disease-specific epigenetic changes in discordant MZ
twins, and a dissection of environment-induced versus stochastic epigenetic differences. As MZ
twins reared apart are generally quite similar to MZ twins reared together according to an array
of traits (electroencephalogram, IQ, personality, social attitudes) [208], we speculate that
stochastic events are much more important than environmental effects at loci where phenotype
is highly determined by epigenetics in MZ co-twins.
Comparison of MZ versus DZ epigenetic profiles
While the first part of this study investigated epigenetic differences in MZ twins, the next
section focuses on comparisons of epigenetic similarities in MZ versus DZ twins - the same
design that has been used in heritability studies. Here, we aim to describe the contributions of
genetic and non-genetic factors to epigenetic variation. DNA methylation differences in buccal
epithelial cells from 20 sets of MZ co-twins (described above) were significantly lower in
comparison to 20 sets of DZ co-twins matched for age and sex (ICCMZ-ICCDZ=0.15 ± 0.0039,
P=1.2x10^-294) (Fig. 3.6A). The entire effect was attributable to the ten sets of dichorionic
MZ twins (mean ICCMZ-ICCDZ=0.35 ± 0.0057, P<9.9x10^-324) (Fig. 3.6B), whereas the mean
ICC of MC MZ twins was close to 0 (Fig. 3.6C). In WBC from 19 sets of MZ twins (described
above) and 20 sets of DZ twins matched for age, sex, and blood cell count (total WBC count,
neutrophil and lymphocyte fractions), MZ-DZ differences were much more subtle, but still
significant (mean ICCMZ-ICCDZ=0.0073 ± 0.0034, P=0.044). The observed effect may have
been diminished by our conservative efforts to bias against larger epigenetic MZ – DZ
differences by selecting matched DZ twins with smaller co-twin cell sub-fraction differences as
compared to the MZ twins. For buccal tissue, a locus-specific annotation of ICCMZ-ICCDZ
values representing dichorionic MZ co-twin similarity relative to DZ co-twin similarity is
provided (Fig. 3.7, and Fig. A1.3 and A1.4 in Appendix 1 for WBC and MC buccal samples).
Figure 3.6. MZ and DZ ICC distributions in buccal cells
ICC distributions in buccal epithelial cells of MZ and DZ twins. A) all MZ twins (N= 20 sets, red) and DZ twins
(N=20 sets, blue); B) dichorionic MZ twins (N=10 sets, red) and matched DZ twins ( N=10 sets, blue); C) MC MZ
buccal samples (N=10 sets, red) with matched DZ twins ( N=10 sets, blue).
Figure 3.7. Karyogram of MZICC-DZICC values in buccal cells of DC MZ twins
A chromosomal karyogram depicting levels of dichorionic MZ co-twin similarity relative to DZ co-twin similarity
per interrogated locus in the buccal sample. Blue bars to the right of each chromosome represent locus specific
ICCMZ-ICCDZ values.
All techniques for enrichment of differentially-methylated DNA sequences for microarray-based
DNA methylation profiling can potentially be confounded by DNA sequence variation. In our
experiments, SNPs within HpaII restriction sites may have caused enrichment differences,
which would then result in larger variation in DZ twins. In addition, DNA sequence variants
may influence epigenetic status; the literature contains several examples of DNA alleles
or haplotypes associated with specific epigenetic profiles [12, 209, 210].
Alternatively, DZ twins may show more epigenetic differences than MZ twins because the
former originate from different zygotes carrying two different epigenetic profiles, while the
latter develop from the same zygote, and therefore should possess similar epigenomes at the
time of blastocyst splitting. Although the experiments described below do not unequivocally
prove this second hypothesis, we favour the idea of these zygotic epigenetic effects for several
reasons.
First, epigenetic profiles are not fully determined by DNA sequence; if that were the case, MZ
twins would show no epigenetic differences. Therefore, the observed major, epigenome-wide
differences in the buccal epithelial cells from MZ twins versus DZ twins are highly unlikely to
be caused exclusively by DNA sequence differences in DZ twins. Furthermore, ICCMZ-ICCDZ
differences were tissue-specific, as the buccal epithelial cells from dichorionic MZ twins
showed much larger MZ-DZ epigenetic differences than a subset of WBC
obtained from the same individuals at the same time. As the DNA sequences should be identical
(or nearly identical) between the tissues of the same organism, the tissue specific ICCMZ-ICCDZ
differences argue against DNA sequence as a major controlling factor of epigenetic profiles.
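The ICCMZ-ICCDZ comparison used in this argument can be illustrated with a minimal sketch. The estimator used in the study is not restated in this section, so the code assumes a one-way random-effects intraclass correlation computed over co-twin pairs at a single locus; the function name and the methylation values are hypothetical and serve only as an illustration.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC for twin pairs at one locus.

    pairs: list of (twin1, twin2) methylation values.
    Assumption: this estimator is illustrative and may differ from
    the one used in the study.
    """
    n = len(pairs)
    k = 2  # two co-twins per pair
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    # Between-pair mean square
    msb = k * sum(((a + b) / 2 - grand_mean) ** 2 for a, b in pairs) / (n - 1)
    # Within-pair mean square
    msw = sum(
        (a - (a + b) / 2) ** 2 + (b - (a + b) / 2) ** 2 for a, b in pairs
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical methylation values for three MZ and three DZ pairs
mz_pairs = [(0.10, 0.12), (0.50, 0.48), (0.90, 0.88)]
dz_pairs = [(0.10, 0.40), (0.50, 0.20), (0.90, 0.70)]

icc_mz = icc_oneway(mz_pairs)
icc_dz = icc_oneway(dz_pairs)
icc_difference = icc_mz - icc_dz  # positive when MZ co-twins are more alike
```

A positive difference at a locus indicates that MZ co-twins resemble each other more than DZ co-twins do at that locus.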
Second, to address the putative effects of differential digestion of polymorphic HpaII restriction
sites in DZ twins, we tried to perform a comparative analysis between HpaII and its
isoschizomer, MspI, as has been suggested in the HELP assay [211]; however, degrees of
technical variation produced in MspI-based experiments were markedly larger than those of
HpaII experiments (ratio of HpaII/MspI variance = 0.37, P<9.9x10^-324) (Fig. 3.8). This is not
surprising, given that digestion of genomic DNA with MspI generates at least an order of
magnitude more short restriction fragments, which will affect the dynamics of subsequent steps
(adaptor ligation, PCR, hybridization) in comparison to the HpaII-based enrichment of the
unmethylated DNA fraction. As a result, the two experiments were not directly comparable.
Alternatively, we carried out an in silico analysis in which SNP and allele frequency
information from the dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/) and HapMap
(http://www.hapmap.org/) databases was used to calculate heterozygosity quotients that
represent the probability that a given probe would have a restriction site disrupted by a SNP.
From the 6,405 and 5,917 unique sequences within the WBC and buccal data sets, 109 and 98
loci containing HpaII SNPs were identified, respectively. For both data sets, there was no
correlation of locus heterozygosity value with ICCMZ-ICCDZ value (R=-0.0032 and P=0.97 for
WBC; R=0.024 and P=0.81 for buccal cells). A similar analysis was done to address the
epigenetic effects of SNPs in cis by extending the interrogated region to include all SNPs within
1 Kb proximal to and including the probe sequence. Again, correlation analysis of
heterozygosity values at 1,369 (WBC) and 1,284 (buccal) SNP containing loci showed no
correlation with ICCMZ-ICCDZ value (R=-0.019, P=0.47 for WBC; R=0.033, P=0.23 for buccal cells).
These results are in agreement with our subsequent study, which demonstrates that strong
genetic effects on epigenetic status occur relatively infrequently throughout the genome.
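The in silico analysis above can be sketched as follows. The exact definition of the heterozygosity quotient is not given here, so this sketch assumes Hardy-Weinberg heterozygosity (2pq) per SNP and independence between SNPs at the same locus; the function names, allele frequencies and ICC differences are hypothetical.

```python
from math import sqrt


def snp_heterozygosity(allele_freq):
    """Expected heterozygosity 2pq of a biallelic SNP (Hardy-Weinberg assumed)."""
    return 2.0 * allele_freq * (1.0 - allele_freq)


def locus_heterozygosity(allele_freqs):
    """Probability that at least one SNP at a locus is heterozygous.

    Assumption: SNPs are treated as independent, which ignores LD.
    """
    prob_none = 1.0
    for freq in allele_freqs:
        prob_none *= 1.0 - snp_heterozygosity(freq)
    return 1.0 - prob_none


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical loci: allele frequencies of SNPs in HpaII sites, and
# hypothetical ICCMZ-ICCDZ values for the same loci
locus_allele_freqs = [[0.5], [0.1], [0.2, 0.3], [0.05]]
icc_differences = [0.12, 0.30, 0.08, 0.25]

heterozygosity = [locus_heterozygosity(f) for f in locus_allele_freqs]
r = pearson_r(heterozygosity, icc_differences)
```

If polymorphic restriction sites were driving the additional DZ variation, loci with higher heterozygosity would be expected to show larger ICCMZ-ICCDZ values, giving a clearly positive correlation.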
Figure 3.8. Technical variation volcano plots of HpaII and MspI based enrichments
Volcano plots measuring technical variation produced by HpaII (red) and MspI (blue) enrichments. Each plot is
produced from 4 parallel self vs. self enrichments and hybridizations at 5,997 overlapping loci between the two data
sets. MspI enriched samples produce significantly more technical variance (σ²) than that of HpaII as measured by a
non-parametric Ansari-Bradley test (ratio of HpaII/MspI variance = 0.37, P<9.9x10^-324).
Third, we investigated whether DNA sequence variation may influence DNA methylation in cis
and in trans by methylation analysis of two strains of inbred (that is, nearly genetically
identical) mice as compared to two strains of outbred (genetically non-identical) mice. Mouse
brains were subjected to 4.6K CpG island microarray-based DNA methylation profiling. First,
we determined that the detected biological variance is significantly larger than technical
variance in the mouse experiments (P<9.9x10^-324). We then compared the spot-wise distribution
of within sibship DNA methylation variation (σ²) between inbred and outbred mice at 2,176
unique genomic regions and did not detect any significant difference (mean difference = 2.1x10^-5
± 3x10^-4, P=0.68) (Fig. 3.9). Although it is not completely clear to what extent mouse brain
results can be extrapolated to human buccal cells, despite their shared ectodermal origin, and
although DNA variation in the outbred mice is less than that of unrelated humans (based on the
Wellcome Trust study, http://www.well.ox.ac.uk/mouse/INBREDS, our estimate is that in
general, outbred mouse DNA heterozygosity is 2-4 times lower in comparison to unrelated
humans), the impact of DNA polymorphisms on DNA methylation does not seem to be
common.
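The inbred versus outbred comparison relies on a paired Wilcoxon signed rank test over spot-wise variances. A minimal sketch of that test follows, assuming a normal approximation without tie or continuity correction (a simplification that is only reasonable for large numbers of loci, such as the 2,176 regions analysed here). The function names are hypothetical, and in practice an established statistics package would be preferable.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of positions i..j, 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(xs, ys):
    """Paired Wilcoxon signed rank test.

    Returns (W_plus, W_minus, two-sided p). The p-value uses a normal
    approximation with no tie or continuity correction (assumption).
    """
    diffs = [x - y for x, y in zip(xs, ys) if x != y]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean_w = n * (n + 1) / 4.0
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (min(w_plus, w_minus) - mean_w) / sd_w
    p_value = erfc(abs(z) / sqrt(2.0))
    return w_plus, w_minus, p_value
```

Here `xs` and `ys` would hold the within-sibship variance of each locus in the inbred and outbred groups, respectively, paired by locus.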
Figure 3.9. Distributions of inbred and outbred epigenetic variation
The spot-wise distributions of the within sibship variance for both inbred (red) and outbred (blue) mice. A non-
parametric comparison of the distributions with a paired Wilcoxon signed rank test did not identify any significant
epigenetic difference between groups, despite the genetic variation within the outbred group (mean
difference = 2.1x10^-5 ± 3x10^-4, P=0.68).
In classical twin studies, greater phenotypic similarity among MZ twin pairs compared to
DZ twins has been traditionally attributed to the degree of DNA sequence similarity. Our twin
studies suggest that in addition to identical DNA, epigenetic similarity at the time of blastocyst
splitting may also contribute to phenotypic similarities in MZ co-twins. By the same argument,
DZ co-twins are more different from each other than MZ co-twins not only because they possess
some DNA sequence differences (on average ~0.05%), but also because they originated from
epigenomically different zygotes. In addition, epigenomic inheritance may explain the
“intangible variance”, the concept that originated from the observation that regular (polyzygotic)
inbred mice were much more different from each other than the MZ inbred mice of the same
strain [212]. In conjunction with such findings, our data suggest that the phenotypic effects of
the individual epigenomes of each zygote could be substantial.
Although our in silico and mouse experiments indicated that ASM did not significantly affect
our findings in this study, it does not mean that genetics do not exert any influence on epigenetic
modifications. Here we used a single MSRE, HpaII, in the enrichment of the unmethylated
fraction, and the microarrays only contained 109 HpaII recognition sites (1.7% of total sites)
that could potentially be interrupted by the presence of a SNP – this is a control feature for our
epigenetic study, as it does not leave much opportunity for an ASM effect to become visible. If
we were to utilize multiple MSREs and a genome-wide approach, the incidence of ASM would
become more apparent, and this complicated molecular interaction may be relevant in the study
of complex human diseases, such as psychosis.
Genomic frequency and distribution of ASM
In response to the questions raised in the twin study, regarding the contribution of DNA
sequence to epigenetic factors, we conducted a large-scale, genome-wide analysis of ASM in
psychosis cases (SZ and BD) and controls using human tissues and the most complex
microarrays currently available. Our large sample size, volume of interrogated SNPs, and use of
human brain and sperm, as opposed to cell lines, set this study apart from all other studies to
date. The incorporation of a disease element in an ASM study of this magnitude is completely
novel and, unlike some other groups, we were not confined to any particular region of the
genome - promoters, exons, CGIs or other - allowing us to examine a tremendous number of
SNPs without bias. Each sample was genotyped using the Affymetrix SNP 6.0 array, and then a
second array was hybridized with the enriched unmethylated fraction from that same subject, in
order to determine the methylation differences between alternative DNA alleles at every SNP.
Our enrichment strategy is presented in Figure 3.10.
Figure 3.10. Enrichment of unmethylated DNA fraction
To enrich the unmethylated fraction, genomic DNA was first digested separately with HpaII, HinP1I and
HpyCH4IV, which cut their unmethylated recognition sites. The digested DNA was pooled per sample and non-
human adaptor sequences were ligated to the sticky ends. McrBC was then used to cut all internally methylated
CpG sites, leaving only unmethylated fragments to be amplified, fragmented, labeled and hybridized to SNP 6.0
arrays.
We examined our total numbers of epiSNPs occurring per cohort to determine their distribution
across the chromosomes in brain tissue (Fig 3.11). Chromosomes range in size, so the
number of total SNPs occurring on each one will differ. Taking this into account, we observed
several significant differences in the chromosomal distribution of epiSNPs between the cohorts
(illustrated in Table 3.3). In general, epiSNPs are distributed evenly across the genome in the
control cohort, with the exception of chr7 and chr16, which appear to have under- and over-
representation of epiSNPs across all cohorts, respectively. As power increases with sample size,
we were able to detect more epiSNPs when the cohorts were pooled; 9082 epiSNPs were
identified in the combined case cohort (BD plus SZ) and 13480 were detected when all case and
control samples were analyzed together. The distribution differences observed in the case group
encompassed all those included in the separate BD and SZ analyses, plus three additional
differences that likely became significant with the increased power. Oddly, when the largest
cohort (combined cases and controls) was analyzed, fewer differences in distribution were
observed versus the combined case group, suggesting that the inclusion of the somewhat
uniformly-distributed controls cancelled out some of the variation introduced by the cases.
Figure 3.11. Chromosomal distribution of brain epiSNPs
The distribution of brain epiSNPs per cohort across all chromosomes, where the total number of detected epiSNPs
is displayed as a proportion of the total number of SNPs occurring on each chromosome (on the Affymetrix SNP
6.0 array), with a p-value cut-off of <0.01. No epiSNPs in any cohort were located on the mitochondrial
chromosome.
Table 3.3. Differences in epiSNP chromosomal distribution in brain and sperm
The differences in epiSNP distribution across the chromosomes are shown per cohort, with q-values displayed
wherever a difference occurs. Column 1: tissue source. Column 2: cohorts, with “case” representing the combined
analysis of SZ and BD cohorts. Column 3: chromosome number (where a difference was observed). Column 4:
results of distribution test, showing over- or under-representation of epiSNPs on that chromosome. Column 5: q-
values, where q<0.05 is considered significant.
For each cohort, we consulted dbSNP to determine if the epiSNPs occurred on sequences that
were associated with a specific “functional class.” It should be noted that this database does not
classify intergenic SNPs, so we used the absence of classification to group them. Other studies
have suggested that a large percentage of epiSNPs appear in intergenic regions [213] and,
indeed, we did find that 1308 (61.17%) of control, 1070 (59.94%) of BD and 2356 (58.75%) of
SZ epiSNPs existed outside of the dbSNP functional classes in brain. Of all known SNPs, the
majority map to noncoding and intergenic regions of the genome; it is interesting to note that
intergenic SNPs are frequently detected by GWAS as showing strong associations with human
disease [214, 215]. For example, SNPs in an intergenic region on chromosome 4, as well as
those upstream from SLC2A9 are strongly associated with Alzheimer’s disease (plus psychosis)
[216]; an intergenic SNP was found to be significantly associated with systemic lupus
erythematosus, and a positive correlation between this SNP and expression of ATG5 has
nominated this nearby gene as a candidate locus [217]; finally, an intergenic SNP has been
suggested to predispose to papillary thyroid carcinoma via interactions with a long intergenic
noncoding RNA gene (papillary thyroid carcinoma susceptibility candidate 3, PTCSC3), which
is believed to be a tumor suppressor [164].
The binding of transcription factors to their specific sites is correlated with expression levels of
related genes, and some transcription factors, such as CTCF, are known to bind predominantly
within intergenic regions [218]. EpiSNPs within these sites would provide alternate local
methylation patterns, which may affect the binding and subsequent downstream actions of
CTCF and related proteins. Additionally, many non-coding RNAs (ncRNA) can be transcribed
from intergenic regions of the genome [219]. It is estimated that there are thousands of ncRNAs
[219, 220], some of which are highly conserved and required for crucial biological processes
[221], although the functions of many of these molecules still require validation [222].
Advances in sequencing technology and analysis techniques have encouraged the notion that
much of the genome is actually transcribed, and that non-coding sequences are as interesting
and important as those that code proteins [223, 224].
Our functional class analysis revealed that control, SZ and BD cohorts each have a unique
functional distribution, whereas the combined case cohort profile is predictably influenced by
the SZ and BD profiles (Fig 3.12). The exact meaning of each individual functional class
distribution is unclear, but three major trends were observed across most cohorts: epiSNPs are
enriched in locus and UTR regions, and depleted in exons (p values are displayed in Table 3.4).
When viewing Figure 3.12 B, it should be noted that enrichment was calculated as the spread of
functional classes per cohort against the spread of classes on the SNP 6.0 array, so comparisons
should only be made within a given cohort – enrichment represents the occurrence of
epiSNPs in a functional class versus the number we would predict to see in that class, given the
representation on the array. Some of the most detrimental SNPs are those that are structurally
functional and exhibit pleiotropy – they are associated with certain diseases and tend to occur in
exonic regions [165]. It is also true that SNPs generally tend to occur more frequently in non-
coding versus coding regions, as there is a negative selective force acting at sites of amino acid
altering mutations [163]; thus, our observation that epiSNPs are generally depleted at exons is
quite logical.
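The enrichment calculation described above, observed epiSNPs in a class against the number expected from that class's representation on the array, can be sketched as a simple ratio. The counts below are hypothetical and chosen only for illustration; the significance test behind the reported p-values is not reproduced here.

```python
def enrichment_ratios(episnp_counts, array_counts):
    """Observed/expected epiSNP ratio per functional class for one cohort.

    expected = (total epiSNPs in cohort) * (class share of array SNPs).
    A ratio above 1 suggests enrichment, below 1 suggests depletion.
    """
    total_episnps = sum(episnp_counts.values())
    total_array = sum(array_counts.values())
    ratios = {}
    for functional_class, n_on_array in array_counts.items():
        expected = total_episnps * n_on_array / total_array
        observed = episnp_counts.get(functional_class, 0)
        ratios[functional_class] = observed / expected
    return ratios


# Hypothetical counts, not taken from the study
array_counts = {"intron": 400, "utr": 50, "coding": 50, "intergenic": 500}
cohort_episnps = {"intron": 40, "utr": 10, "coding": 2, "intergenic": 48}

ratios = enrichment_ratios(cohort_episnps, array_counts)
```

Because each ratio is scaled by the cohort's own epiSNP total, the values are comparable between classes within a cohort but not between cohorts.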
The locus region is defined by dbSNP as being “within 2 Kb 5′ or 500 bp 3′ of a gene feature
(on either strand), but the variation is not in the transcript for the gene.” The UTRs are
considered to be located on the mRNA, but make up the regions before the first codon and after
the last codon used in translation, so they are located adjacent to the locus regions. Given that
the locus and UTR classes show similar, significant enrichment patterns, we can generalize that
the areas immediately flanking the 5’ and 3’ ends of genes are favourable epiSNP locations.
Many significant hits detected in GWAS studies of complex diseases do not actually affect
protein structure directly – it is believed that most variants act via regulatory changes in mRNA
expression [171]. Even single-base alterations to RNA can have profound effects on its structure
[225], and accessibility of particular regions affects the binding affinity at target sites for the
RNA-induced silencing complex (RISC) [226], as well as miRNAs, which bind to their many
target sites in the 3’UTRs of nearly every human transcript and exert a powerful regulatory
effect [171]. Aside from miRNA target sites, there are other functional elements within the
3’UTR that are known to affect miRNA activity. For example, loss of poly-adenylation (polyA)
can lead to various disease states via non-specific degradation of mRNA [227], and miRNA-
mediated repression has been correlated with polyA signal efficiency [166]. In humans, mRNA
is often targeted by multiple miRNAs, so the loss of a single binding site may not be deleterious
on its own; however, alterations may be cumulative [228]. In the developing brain, many
miRNAs are expressed, and they act to regulate neurogenesis, dendritogenesis, and synapse
formation. A functional analysis has shown that the 3' UTR of brain-derived neurotrophic factor
(BDNF) mRNA can be targeted by several miRNAs that are aberrantly up-regulated in the
absence of MeCP2, and this dysregulation may contribute to the development of Rett syndrome
[229].
Some of the 5’UTR SNPs detected by GWAS are also predicted to alter mRNA structure, and
evidence for this is accumulating: the 5’UTR in the human ferritin light chain gene has been
termed a “RiboSNitch,” meaning that its RNA changes structure if a particular disease-
associated SNP is present and, like the bacterial “Riboswitch,” the structural change is believed
to regulate translation [170]. Of the SNPs in high LD that formed RNA structure-stabilizing
haplotypes (SSH) in humans, SNP pairs in 8 of the 10 SSH-containing transcripts were shown
to stabilize RNA protein binding sites [170]. The methylation status of these SSH SNPs was not
investigated, but it is reasonable to hypothesize that regulation of local methylation could be
part of these SNPs’ mechanism of action. An algorithm has been devised for the detection of
RiboSNitches, and multiple SNPs have been detected in UTRs (particularly at the 5’ end) that
alter the mRNA structural ensemble of associated genes in six disease-states: hyperferritinemia
cataract syndrome, beta-thalassemia, cartilage-hair hypoplasia, retinoblastoma, chronic
obstructive pulmonary disease, and hypertension [230].
Table 3.4. Differences in epiSNP functional class distribution in brain and sperm
The differences in epiSNP distribution across dbSNP functional classes are shown per cohort, with p-values
displayed wherever a difference occurs. Column 1: tissue source. Column 2: cohorts, with “case” representing the
combined analysis of SZ and BD cohorts. Column 3: functional class (where a difference was observed).
Column 4: results of distribution test, showing over- or under-representation of epiSNPs in that class.
Column 5: p-values, where p<0.05 is considered significant.
Many of our detected epiSNPs (approximately 40% of total epiSNPs) exist in introns of genes
that are related to brain function and development, as the GO analysis will reveal in a later
section, which is logical considering that we investigated DNA from brain tissue. One possible
function for intronic ASM has recently come to light: these epiSNPs may be involved in the
splicing of RNA transcripts, possibly playing a role in self-splicing. Deep sequencing studies
have revealed that over 90% of human genes undergo alternative splicing, and SNPs located at
splice sites can alter mRNA translation efficiency or influence exon configuration, ultimately
affecting disease susceptibility [167]; this is evident from the fact that exon splice sites show
high conservation and very low SNP rates [231]. In a study of vascular dementia, it has been
suggested that the minor allele, A, of a PHLDB2 intronic SNP may induce delayed splicing and
increase susceptibility to the disease [232]. Relevant to SZ and BD, there are three intronic
SNPs near exon 10 of the GABRB2 gene (an “alternative splicing hotspot” that codes the β-2
subunit of the GABA A receptor) that are responsible for two novel isoforms of the subunit, and
these SNPs are significantly correlated with SZ and BD, with altered expression of these
isoforms occurring in both diseases; β (2S1) expression was increased and β (2S2) expression
was decreased [233]. Originally, introns were considered to be “junk” DNA sequences that
were simply removed from pre-mRNA, but they have since been found to occasionally encode
proteins, undergo further processing to form ncRNAs [169, 172], and may also represent mobile
genetic elements, such as transposons, in humans [168]. Alternative splicing is an important
process that is necessary for the creation of diverse, complex products, and interference with
normal methylation patterns at critical sites may also result in improper splicing events, with
damaging downstream consequences. Increased or decreased methylation associated with one
allele of an intronic epiSNP may damage an organism if splicing is disrupted, and the particular
epiSNPs that are specific to each disease cohort may contribute to its etiology. Alternatively,
the methylation differences associated with an intronic epiSNP may be beneficial to an
organism – epiSNPs of this variety are likely conserved across populations.
Figure 3.12. Functional class distribution of brain epiSNPs
Brain epiSNPs were stratified by their dbSNP functional class tags. Coding SNPs occur in exons of genes, where
one variant introduces either a non-synonymous or synonymous change; Intron SNPs are located within intronic
regions of genes; Locus SNPs are located 2Kb upstream or 0.5Kb downstream from a gene; and UTR SNPs occur
in either 3’ or 5’ untranslated regions. A) The number of epiSNPs per cohort detected in each functional class. B)
The number of epiSNPs per cohort in each class, given as a proportion of the total number of SNP 6.0 SNPs in that
class. Red asterisks mark classes that are enriched per cohort, and blue asterisks mark classes that are depleted per
cohort (p<0.05).
Bisulfite verification of selected SNPs
Several epiSNPs that a) were common to case and control groups and b) showed large array
intensity differences between alternate homozygotes were chosen, in addition to some randomly
selected non-epiSNPs. A portion of the surrounding sequence was amplified using bisulfite
modified genomic DNA from the same brain tissue used in the array experiment. The amplicons
were then pyrosequenced and analyzed using the methylation assay feature of the Pyromark
software, and then each individual CpG was subjected to the Wilcoxon signed rank test. For
each locus, we investigated samples that were homozygous for each allele in order to maximize
the potential methylation differences; the average methylation across each locus is listed in
Table 3.5 and the methylation for each individual at each specific CpG site is listed in Appendix
2, Table A2.4.
SNP          Type         % methylation A allele   % methylation B allele   # ASM CpGs
rs649951     epiSNP       80.2                     68.4                     1
rs9936944    epiSNP       97.4                     97.2                     1
rs1485474    epiSNP       8.9                      59.8                     3
rs5950206    non-epiSNP   63.6                     64.3                     0
rs10875310   non-epiSNP   99.8                     100                      0
rs10962372   non-epiSNP   100                      99.6                     0
Table 3.5. Average methylation across loci.
Column 1: SNP 6.0 probe ID for investigated loci. Column 2: designation of locus, where epiSNPs demonstrate
ASM and non-epiSNPs do not. Column 3: average methylation of all AA homozygotes across all CpG sites for a
locus. Column 4: average methylation of all BB homozygotes across all CpG sites for a locus. Column 5: number
of CpG sites on the amplicon that displayed significant methylation differences between AA and BB.
For three out of three epiSNPs, a methylation difference was observed in the direction predicted
by the microarrays, whereas all of the non-epiSNPs had no detectable significant differences
(p>0.42 for each). In two of the epiSNPs, not all of the surrounding CpG sites displayed
significant differences and, for rs9936944, the methylation difference at the ASM CpG was not
large enough to create an overall difference when all sites on the amplicon were analyzed
together. It is not necessary for ASM to occur at all nearby CpG sites for an epiSNP to be
detected, and it is also possible for a pyrosequencing experiment to miss contributing CpGs, as
our enrichment strategy allows us to detect epiSNPs that result from fragments that are several
Kb in length. It is possible that our small pyrosequencing region may not capture the exact CpG
sites that displayed the ASM associated with a particular SNP, and this is one limitation of the
technology. The methylation densities observed on the arrays for each possible genotype are
presented for one epiSNP in Figure 3.13, along with examples of the pyrograms generated for
the locus. Note that the unmethylated DNA fraction was measured on the microarray, as this is
what we specifically enriched and hybridized; a higher intensity corresponds to an overall lower
methylation level associated with a particular genotype.
Figure 3.13. Methylation levels observed for an epiSNP
A) Violin plot of unmethylation signal (microarray signal intensity generated by unmethylated fragments) for each
genotype of rs1485474, showing a decrease in unmethylation (increase in methylation) associated with the B allele.
Width of violin represents sample density at that position. B) Pyrogram for an AA genotype sample. C) Pyrogram
for a BB genotype sample, with higher levels of methylation.
None of the non-epiSNPs showed differing methylation levels between alternate
homozygotes, as predicted by the microarray analysis. It should be noted that the pyrosequencer
is by no means a sensitive instrument, and that tiny fluctuations below 2% are considered to be
“noise” from the pyrosequencing reaction (as stated by Qiagen technical service); thus, any
differences below 2% are not reliably detected and are not considered to be significant. Figure
3.14 shows the hypomethylation intensities and pyrograms for a typical non-epiSNP. Although
this verification experiment was small-scale, it provides sufficient evidence that the ASM
detected by the microarrays is not simply an artefact.
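As an illustration of the 2% noise threshold, the locus-averaged values from Table 3.5 can be screened as follows. This is only a sketch: the variable and function names are hypothetical, and the per-CpG Wilcoxon tests that define ASM in the study are not reproduced, which is why a locus such as rs9936944, with a single significant CpG, falls below the threshold when averaged across the amplicon.

```python
NOISE_FLOOR = 2.0  # percentage points; differences below this are treated as noise

# (SNP, type, % methylation A allele, % methylation B allele), from Table 3.5
table_3_5 = [
    ("rs649951", "epiSNP", 80.2, 68.4),
    ("rs9936944", "epiSNP", 97.4, 97.2),
    ("rs1485474", "epiSNP", 8.9, 59.8),
    ("rs5950206", "non-epiSNP", 63.6, 64.3),
    ("rs10875310", "non-epiSNP", 99.8, 100.0),
    ("rs10962372", "non-epiSNP", 100.0, 99.6),
]


def exceeds_noise(meth_a, meth_b, floor=NOISE_FLOOR):
    """True when the allele difference in average methylation reaches the floor."""
    return abs(meth_a - meth_b) >= floor


# Loci whose locus-averaged allele difference is above the noise floor
above_noise = [snp for snp, _, a, b in table_3_5 if exceeds_noise(a, b)]
```

With the Table 3.5 averages, only rs649951 and rs1485474 exceed the noise floor, and no non-epiSNP does.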
Figure 3.14. Methylation levels observed for a non-epiSNP
A) Violin plot of unmethylation signal for each genotype of rs10875310, showing no change in
methylation level associated with the A or B allele. Width of violin represents sample density at that position. B)
Pyrogram for an AA genotype sample. C) Pyrogram for a BB genotype sample.
Linkage disequilibrium does not cause false-positives
Linkage disequilibrium (LD), the non-random association of alleles, could potentially confound
our results, as LD between two SNPs where one is located within a MSRE recognition site
could lead to the identification of false epiSNPs. We investigated possible LD effects between
identified epiSNPs and SNPs occurring within nearby MSRE recognition sites (MSRE SNPs) in
both brain and sperm samples across all cohorts. The human genome (build hg19) and SNPs
from dbSNP (build 135) were used to determine the location of MSRE SNPs for each MSRE
used in our unmethylated fraction enrichment (HpaII, HinP1I, and HpyCH4IV). We also
examined possible de novo MSRE SNPs that occur when an alternative allele results in the
creation of an MSRE recognition site, where one previously did not exist.
The distance between SNPs and the nearest MSRE SNP within 2Kb was closely examined, as
LD effects are more likely to occur between nearby SNPs; also, the unmethylated fragments
generated in our enrichment were between 400bp and 2Kb, on average, so LD between SNPs
that are separated by a great distance would not affect our findings. Using a Wilcoxon rank sum
test, we found no significant difference between epiSNPs and SNPs in relation to the physical
distance to the nearest MSRE SNP (including de novo MSRE SNPs) within 2Kb (Fig 3.15).
Figure 3.15. Distances between SNPs and MSRE SNPs
Distances between SNPs and SNPs within MSRE recognition sites within 2Kb for all cohorts in DNA from A)
brain and B) sperm. No significant differences were detected.
We also examined LD values between every SNP and all nearby (within 2Kb) MSRE SNPs. We
used an LD threshold value of 0.8 to distinguish an LD effect (> 0.8) from no LD effect (≤ 0.8).
Once more, we found no significant difference in LD values between epiSNPs and non-epiSNPs
across all cohorts in brain and sperm samples (Fig 3.16). After a rigorous analysis into possible
LD effects, examining the physical distance between SNPs and the nearest MSRE SNP and
comparing LD values between SNPs and all nearby MSRE SNPs, we found no evidence that
LD with MSRE SNPs produces false-positive epiSNPs.
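The LD thresholding step can be sketched as below. The text does not state which LD measure was used, so this sketch assumes the squared correlation coefficient r² computed from haplotype and allele frequencies; the function names and frequencies are hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    Assumption: r^2 is used as the LD measure; the study may have used another.
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


def has_ld_effect(ld_value, threshold=0.8):
    """Apply the cut-off from the text: > 0.8 is an LD effect, <= 0.8 is not."""
    return ld_value > threshold


# Hypothetical example: an epiSNP and a nearby MSRE SNP inherited independently
r2_independent = ld_r_squared(p_ab=0.25, p_a=0.5, p_b=0.5)
flag = has_ld_effect(r2_independent)
```

A candidate epiSNP would only be suspected as an LD artefact if a SNP inside an MSRE recognition site within 2Kb exceeded the threshold.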
Figure 3.16. Linkage disequilibrium scores in brain and sperm
Linkage disequilibrium scores between SNPs and SNPs within MSRE recognition sites within 2Kb for all cohorts
in DNA from A) brain and B) sperm. The values range from 0 to 1, with those close to 1 indicating SNPs most
likely in LD and those closer to 0 indicating SNPs least likely in LD. The variations in distribution tip shape are
meaningless, as none of the groups demonstrated statistically significant differences.
ASM in the major psychosis cohort
The total number of epiSNPs per cohort and the overlap between cohorts are summarized in a
Venn diagram (Fig. 3.17). Each cohort had many unique epiSNPs (a list of the epiSNPs and
associated genes discussed here and in the tissue-specificity section is provided in Appendix 2,
Table A2.5), but there was also a significant amount of overlap – 529 epiSNPs were common to
all three cohorts, and many overlapped two out of the three. When we further dissect our data,
the SZ group contained more than double the epiSNPs of the control or BD groups, and this may
reflect its complex disease etiology, where small alterations occur in a large number of
pathological pathways. Notably, it has been hypothesized that multiple rare variants are
responsible for SZ [234-236], and this hypothesis may be extended to include epiSNPs.
Although BD and SZ are similar psychiatric diseases from the epidemiological and, to some
extent, clinical point of view, the BD group did not demonstrate this increase in ASM, and
actually contained slightly fewer epiSNPs versus the controls. We do not believe that the
number of SZ epiSNPs was artificially inflated by one or two outliers, because the PWL method
requires a group effect in order to identify a significant epiSNP, and our q-value cut-off was
very stringent. The contribution of cohort-specific CNVs cannot be fully ruled out; however,
singleton deletions are believed to occur frequently in both SZ and BD, yet no epiSNP increase
was observed in BD [147]. Also, the total genome-wide CNV burden does not appear to differ
between controls and psychosis subjects [146], so it could be hypothesized that controls would
also have some unique epiSNP inflation, but the massive increase only seems to occur in the SZ
group. Other studies have not investigated the occurrence of epiSNPs in disease cohorts, so the
finding that epiSNP representation shows such a fluctuation between cases and controls is novel.
It also sheds some light on the importance of ASM in diseases, where a gain or loss of epiSNPs
may play an etiological role, although we have yet to definitively prove or disprove this scenario
in the case of psychosis.
We investigated the 529 epiSNPs that are common to all cohorts and found that the same allele
was associated with methylation in all but 35 of them. Additionally, of the epiSNPs that were
common to two out of three groups, only ~5% of them had methylation associated with the
opposite allele. This finding supports the concept that some epiSNPs may function in a way that
is critical to the organism, thus, their directionality is conserved between individuals.
Schalkwyk et al. also noticed that the direction of methylation is not always uniform at an
epiSNP. They found that individuals varied in the direction of ASM at 10% of the loci that they
examined in blood DNA [21]. The 35 common epiSNPs that did not show uniform
directionality of ASM are listed in Table 3.6. For the vast majority of these loci, there is either
no gene located within 50Kb from the SNP, no MIM title associated with the gene, or both of the
above, so it is difficult to speculate on the meaning of many directionality differences. One
common epiSNP (rs684669) is located within an intron of the DSCAML1 gene, which encodes
the Down syndrome cell adhesion molecule-like 1 transmembrane receptor in many vertebrates
and invertebrates; DSCAML1 and the related DSCAM proteins are all involved in aspects of
neurodevelopment, such as axon guidance, bifurcation and segregation, as well as dendritic
patterning and synapse formation [237]. In this case, the SZ cohort showed ASM on the
opposite allele versus the control and BD cohorts, and there were also two SZ-unique epiSNPs
(rs7106294 and rs665406) located within introns of DSCAML1.
At another common epiSNP (rs2622769), the BD cohort differed in ASM direction versus SZ
and controls. This epiSNP was found within an intron of the secretogranin 3 gene (SCG3),
which codes a neuroendocrine secretory protein belonging to the chromogranin/secretogranin
family. Secretogranins are evolutionarily conserved in vertebrates and are responsible for
controlled delivery of peptides, hormones, neurotransmitters, and growth factors, and their
processed peptides are involved in metabolism, glucose homeostasis, emotional behavior, pain
pathways, and blood pressure modulation [238]. Due to their high level of conservation and
functionally crucial roles, a change in ASM in an intronic region of SCG3 may result in altered
splicing or other changes to the final protein product that could be damaging to the organism. In
the prefrontal cortex of patients with major depressive disorder (MDD), psychotic subjects
exhibited higher steady-state levels of SCG3 mRNA than controls [239]. SCG3 is not as
well characterized as other members of the family; it has mainly been reported to be over-
expressed in neuroendocrine tumours [240]. The mRNA of a similar granin, secretogranin 2
(SCG2), was upregulated in the brains of mice treated with lithium versus the levels observed in
controls [241], suggesting that the actions of lithium, the gold-standard mood stabilizing
medication prescribed to individuals with BD, may involve regulation of disturbances in the
secretogranin family. In line with this theory, it has previously been shown that lithium-treated
cells develop an altered secretory phenotype involving increased cell content and secretion of
the SCG2 protein [242].
The third interesting directionality difference involved an epiSNP (rs3792174) located within an
intron of the phospholipase A2 receptor 1 (PLA2R1) gene, which codes a type I transmembrane
glycoprotein that is believed to contribute to the clearance of phospholipase A2 (PLA2), thereby
inhibiting its action [243]; again, the BD cohort differed from SZ and controls. Unique epiSNPs
at PLA2R1 were also identified in the BD and SZ groups (BD = rs2715950, BD and SZ =
rs949753), showing that this locus is a common target for ASM. The PLA2 enzyme hydrolyses
glycerophospholipids to release arachidonic acid and lysophospholipids, which are then modified into
inflammatory mediators (eicosanoids), and several recent studies have linked the enzyme to psychosis.
One family of eicosanoids, the prostaglandins, has been directly implicated in the disease. The
'prostaglandin deficiency' hypothesis postulates that synaptic transmission is affected when
defective enzyme systems fail to convert essential fatty acids to prostaglandins, resulting in
diminished prostaglandin levels. Lysophosphatidylcholine, a byproduct of PLA2-catalyzed
phospholipid hydrolysis, is the main carrier of polyunsaturated fatty acids across the blood-brain
barrier, and its level decreased in SZ subjects in association with decreases in cognitive speed
[244]. Levels of PLA2 were measured in hippocampal tissues from anterior temporal
lobectomies of subjects with temporal lobe epilepsy who either showed psychotic symptoms or
did not, and increased PLA2 was only observed in the psychotic cohort [245]. In a subset of BD
patients with a history of psychosis, elevated calcium-independent PLA2 activity was detected
in the blood serum, but was absent in healthy controls [246]. Many years ago, increased plasma
PLA2 activity was discovered in SZ subjects, but could be reduced to control levels after 3
weeks of neuroleptic treatment [247]. Given that the activity of this enzyme is elevated in both
SZ and BD and that PLA2 is involved in regulation of norepinephrine receptor density, axon
regeneration, and presynaptic neurotransmitter release [248], increased rates of phospholipid
turnover may represent a shared biochemical feature of psychosis. Additionally, several other
phospholipase genes contained epiSNPs. Unique to the SZ cohort were one epiSNP (rs2020887)
within the coding region of the phospholipase A2 group 5 gene (PLA2G5) and two in introns of
phospholipase C epsilon 1 (PLCE1, rs2797990 and rs11187815). Introns of phospholipase C
beta 1 (PLCB1) contained a BD-unique epiSNP (rs6086496), an epiSNP common to both SZ and
BD (rs11908460), and two epiSNPs common to control and SZ (rs6055601 and rs6055739).
These are only a selection of phospholipase-related epiSNPs, and this amount of variation
indicates that ASM in these pathways is affected at multiple levels in psychosis.
The fact that DSCAML1, SCG3 and PLA2R1 introns contained epiSNPs across all study groups
suggests that these loci are regularly destined for ASM; the differences in methylated alleles
may indicate a malfunction in establishment of ASM that takes place in certain individuals as
part of a disease phenotype. These malfunctions could result in altered regulation of local genes,
perhaps at the transcriptional or splicing level. As psychosis is a complex disease and it is likely
that many different genotypes and epigenotypes can contribute to this diagnosis, these single
ASM changes are probably just one part of an intricate mosaic that leads to the overall psychosis
phenotype.
Figure 3.17. Total epiSNPs per cohort in brain
The total number of epiSNPs detected in control, BD and SZ cohorts are depicted in A) a Venn diagram and B) a
chart (p < 2.2 × 10^-16).
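The cohort overlaps summarized in Figure 3.17 amount to partitioning epiSNP identifiers by the set of cohorts in which each one was detected. The following is a minimal sketch of that bookkeeping, not the analysis code used for the figure; the function name `venn_regions` is hypothetical, and the example uses only a handful of epiSNPs whose cohort membership is stated in this section.

```python
from collections import defaultdict


def venn_regions(cohorts):
    """Group SNP IDs by the exact set of cohorts in which they occur.

    `cohorts` maps a cohort name to a set of SNP IDs. The result maps a
    frozenset of cohort names (one Venn region) to the SNP IDs found in
    exactly those cohorts. Hypothetical helper for illustration only.
    """
    regions = defaultdict(set)
    for snp in set().union(*cohorts.values()):
        members = frozenset(name for name, ids in cohorts.items() if snp in ids)
        regions[members].add(snp)
    return dict(regions)


# A few epiSNPs with the cohort membership described in the text
# (an illustrative subset, not the full data set).
cohorts = {
    "Control": {"rs684669", "rs6055601", "rs9497449"},
    "BD": {"rs684669", "rs11908460", "rs6086496"},
    "SZ": {"rs684669", "rs11908460", "rs6055601", "rs7106294", "rs665406"},
}

regions = venn_regions(cohorts)
for members, snps in sorted(regions.items(), key=lambda kv: (-len(kv[0]), sorted(kv[0]))):
    print(", ".join(sorted(members)), "->", sorted(snps))
```

Keying each region by a frozenset of cohort names keeps "unique to SZ" and "shared by SZ and BD" as distinct, non-overlapping groups, which is how the counts in a Venn diagram are read.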
Table 3.6. Direction of methylation and gene information for common epiSNPs
Column 1: SNP 6.0 probe ID for investigated loci. Columns 2-4: allele associated with increased methylation for
control, BD and SZ cohorts. Column 5: closest gene within 50 Kb. Column 6: OMIM title associated with closest
gene.
SNP Control BD SZ Closest gene MIM_title
rs16829083 A B B MYOM3
rs11207702 B B A NFIA Nuclear factor I/A
rs10920871 A B A NA
rs6664930 B A A SLC30A1 Solute carrier family 30, member 1 and 10
rs9428514 A B A WDR64
rs2362590 A B B SMYD3
rs7098116 A B B NA
rs12252906 A B B NA
rs17436486 B A A GALNTL4
rs684669 B B A DSCAML1 Down syndrome cell adhesion molecule-like 1
rs414161 B A A NA
rs10848091 B A B PIWIL1 Piwi, Drosophila, homolog of
rs176343 B A B NA
rs9573719 B A A NA
rs2269304 B A B SPTB
rs2622769 B A B SCG3 Sarcoglycan, gamma;Secretogranin III
rs4553646 B A A AHSP
rs4889048 B B A NA
rs1125244 A A B NA
rs12944274 B A A PPM1E
rs7342966 A B B RGS9 Regulator of G protein signaling 9 and RGPS9-binding protein
rs2377391 B B A RBFOX3
rs3792174 A B A PLA2R1 Phospholipase A2 receptor 1
rs17003416 B A B NA
rs11703071 A B B TFIP11 Tuftelin-interacting protein 11
rs12635765 A B B NA
rs901812 A A B NA
rs1853261 A B B NA
rs10034491 B B A NA
rs4478136 B A A NA
rs6876638 B B A NA
rs2546963 B A A PWWP2A
rs1379326 A B A CSMD1 Cub and Sushi multiple domains 1
rs4909472 B A B LOC286094
rs10975894 B A B KDM4C
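For each row of Table 3.6, the cohort whose methylated allele differs from the other two can be read off mechanically. The sketch below illustrates this for four rows copied from the table; the function name `discordant_cohort` is hypothetical, and the code assumes biallelic A/B calls, so that exactly one cohort can disagree with the other two.

```python
from collections import Counter


def discordant_cohort(control, bd, sz):
    """Return the cohort whose methylated allele differs from the other two.

    Returns None when all three cohorts agree. Assumes biallelic calls
    ("A" or "B"), so at most one cohort can be the odd one out.
    """
    calls = {"Control": control, "BD": bd, "SZ": sz}
    counts = Counter(calls.values())
    if len(counts) == 1:
        return None
    minority_allele = min(counts, key=counts.get)
    return next(name for name, allele in calls.items() if allele == minority_allele)


# Rows taken from Table 3.6: (SNP, Control, BD, SZ, closest gene)
rows = [
    ("rs684669", "B", "B", "A", "DSCAML1"),
    ("rs2622769", "B", "A", "B", "SCG3"),
    ("rs3792174", "A", "B", "A", "PLA2R1"),
    ("rs16829083", "A", "B", "B", "MYOM3"),
]

odd_one_out = {snp: discordant_cohort(c, b, s) for snp, c, b, s, _gene in rows}
for snp, cohort in odd_one_out.items():
    print(snp, cohort)
```

Applied to the three loci discussed above, this reproduces the pattern described in the text: SZ differs at the DSCAML1 epiSNP, while BD differs at the SCG3 and PLA2R1 epiSNPs.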
Differences aside, many of the epiSNPs common to all cohorts were methylated on the same
allele. Upon examination, many of these epiSNPs are associated with genes that are critical for
various developmental and functional features of the brain. An epiSNP (rs12251692) appears in
an intron of neuregulin 3 (NRG3), which is thought to influence neuroblast proliferation,
migration and differentiation. NRG3 has been identified as a susceptibility locus for SZ [249]
and, although ASM at this particular SNP does not vary between cohorts, three other NRG3
intronic epiSNPs (rs2144468, rs7100526, and rs10787027) exist exclusively in the SZ cohort –
these may represent causative or compensatory ASM, or perhaps they are regions that are less
tightly-regulated, and there is room for fluctuation without serious detriment. There is also an
epiSNP (rs10519568) within an intron of the gamma-aminobutyric acid A receptor gamma-3
(GABRG3) for which the direction of ASM is consistent between cohorts. GABA is the major
inhibitory neurotransmitter in the mammalian brain and, although mice express GABRG3
biallelically, there is still some confusion surrounding the imprinting status of this gene in
humans [104], making the location of this epiSNP, as well as the two SZ-specific epiSNPs
(rs6422909 and rs8024723), particularly interesting. Another epiSNP (rs7016161) lies within an intron of a
syntrophin gene (SNTG1), which encodes a member of a family of cytoplasmic peripheral
membrane proteins. Syntrophins have multiple protein interaction domains that link signaling
proteins, such as kinases and neuronal nitric oxide synthase, to dystrophins, and they are highly
expressed in the brain, where they are required for signaling and trafficking [250]. Also among
the conserved, common epiSNPs are those associated with a number of kinases, cation channels,
voltage-gated channels, proteins related to zinc fingers, phospholipases, and phosphatases; all of
these proteins, while they function in an assortment of diverse cascades, are essential to the
organism and their conservation at the ASM level is understandable.
One peculiar discovery involved a synonymous epiSNP (rs4343) located within the coding
region of angiotensin I converting enzyme (ACE) that was common to all cohorts and displayed
ASM on the same allele. ACE is a circulating enzyme that catalyzes the conversion of
angiotensin I to angiotensin II as part of the renin-angiotensin system that regulates extracellular
volume and arterial vasoconstriction, and it also hydrolyses a peptide (N-AcSer-Asp-Lys-Pro)
that acts as a negative regulator of hematopoietic stem cell proliferation [251]. What makes this
epiSNP interesting is the fact that there is an extensively-studied insertion/deletion
polymorphism at this locus that is associated with a number of conditions, including psychosis.
It has been hypothesized that the deletion allele may be responsible for clustering of psychotic
symptoms in BD, whereas the insertion allele may be protective against psychosis [252]. In
contrast, another study found that the deletion allele was actually protective and reduced the risk
of developing SZ by 50% [253]. It has also been found that elevation of ACE in cerebrospinal
fluid correlates with the duration of illness in SZ, although it is not clear whether the increases
were the result of treatment or deterioration due to the disease [254]. Regardless of the
conflicting theories, it is apparent that dysregulation at this locus is involved in psychosis, either
as a causal or compensatory factor, or simply as a biomarker, and epigenetic studies of the locus
have been recommended [255]. In our study, the genotypes at this epiSNP were distributed
somewhat differently in each cohort, although a trend proved difficult to measure statistically.
Future studies should further investigate the genotype and methylation levels of SZ subjects and
controls at this epiSNP and, if possible, attempt to correlate these data with ACE protein or
mRNA levels, as well as mRNA conformation.
When the epiSNPs unique to each cohort were examined, many of these loci did not have any
genes located within 50 Kb of the SNP, as approximately 60% of ASM occurred in intergenic
regions; however, a number of cohort-specific epiSNPs immediately stood out. The control
cohort has a unique epiSNP (rs9497449) in an intron of the metabotropic glutamate receptor 1
(GRM1), a receptor at which deleterious mutations are detected in BD and SZ [256] – it should
be noted that the BD group has its own unique epiSNPs (rs7531813, rs12295113) in introns of
other glutamate receptors, the ionotropic, kainate 3 and 4 (GRIK3 and GRIK4). The BD group
has an epiSNP ~1.5Kb upstream from the gene encoding brain-specific angiogenesis inhibitor 2
(BAI2, rs7543090); BAI2 may play a role in depressed mood, as BAI2-deficient mice show
significant antidepressant-like behavior in a variety of tests versus wild-type mice [257]. In BD
and SZ, there are common epiSNPs (rs2820291, rs7102028) in the neuron navigator 1 and 2
(NAV1 and NAV2) genes that are not shared with the controls. Navigators act to reorganize the
cytoskeleton to guide cell shape changes and have a role in neurite outgrowth [258]; expression
of mRNAs associated with NAV1 has been shown to be reduced in SZ patients [259]. Another
gene that affects synapse development and function is neurexin 3 (NRXN3), which codes
neuronal adhesion proteins. Deletions at this locus are associated with autism spectrum disorder
[260], and there is an epiSNP (rs1022434) common to SZ and BD located within an intron of
this gene. Also, it is well known that epilepsy and psychosis share a complex relationship [261],
with epilepsy patients developing SZ at a higher rate than expected, and SZ patients being more
prone to seizures than the general population [262]. We detected two epiSNPs in the SZ cohort
(rs4738014 and rs7018199, the latter shared with BD) and one epiSNP common to all cohorts
(rs7002461) within introns of the carboxypeptidase A6 gene (CPA6), mutations in which result
in several forms of epilepsy [263], indicating that ASM is a possible connection between
psychosis and seizures.
There are several intriguing epiSNPs that are unique to the SZ cohort; for example, rs1111050
lies in an intron of the gene that codes methylguanine-DNA methyltransferase (MGMT), an
enzyme that repairs inappropriately methylated guanine residues in DNA. ASM at SNPs
associated with MGMT has received quite a bit of attention in relation to colorectal cancer [264],
but those implicated epiSNPs are different from the one we have observed, most likely due to
tissue-specific ASM effects. An epiSNP (rs956451) exists in an intron of the gene coding
voltage-dependent, L type calcium channel alpha 1C subunit (CACNA1C) – a SZ risk gene that
has recently been shown to be a target of the miR-137 microRNA as part of a pathway that
contributes to disease progression [265]. There is an epiSNP (rs1705107) in an intron of the
reelin gene (RELN), which has famously been identified as a SZ risk factor. SNPs in the
promoter region of this gene have not been significantly associated with SZ [266], but perhaps
addition of the ASM element or consideration of intronic SNPs and epiSNPs would be more
fruitful in the search for factors predisposing to SZ. Two other epiSNPs (rs7205673 and
rs9937169) are located within the coding region of PKD1L2, the gene encoding polycystin 1-like 2
(PC1L2), a protein that interacts with G protein-coupled receptors [267]; mutations of polycystins
result in polycystic kidney disease, although multiple mechanisms are thought to contribute to
its pathogenesis [268]. There has been evidence that polycystic kidney disease is significantly
elevated in SZ families [269] and, several decades ago, dialysis was suggested as a treatment for
SZ as it was believed that malfunctioning kidneys were responsible for circulation of waste
products, ultimately leading to hallucinations. This treatment was not validated in placebo-
controlled double-blind trials [270], indicating that any possible role of the kidneys in SZ was
not strong enough that the disease could be cured simply by improving kidney function. That
being said, this does not rule out the contribution of renal activity in a multifactorial model of
SZ, where smaller, cumulative alterations, such as SZ-specific ASM effects, may play a role in
the development of SZ in certain individuals. Another epiSNP (rs4894637) in an intron of the
neuroligin 1 (NLGN1) gene is also interesting, as the resulting protein is involved in a sort of
pruning of low-efficiency synapses, a process that enhances neural network function in control
individuals, but can lead to large synapse loss in cases of SZ [271]. Neuroligins also bind
neurexins to promote the development and differentiation of glutamate synapses [272]; we have
found that epiSNPs are highly enriched in glutamate pathways (discussed further in this
chapter), illustrating the potential for interconnection of epiSNP functions. Finally, the SZ
cohort has several unique epiSNPs located in introns of cub and sushi multiple domains 1 and 3
(CSMD1 and CSMD3) on chromosome 8: eight in CSMD1 (rs1182739, rs1393845, rs11997565,
rs41350751, rs17069006, rs1566860, rs11136748, rs2725068), plus one that is shared with BD
(rs17063261) and one that is shared with BD and control (rs1379326), and four in CSMD3
(rs11778262, rs17640016, rs1857719, rs1382469), one of which is shared with the BD cohort.
The CSMD proteins are tumor suppressors that may also function as receptors or co-receptors in
signal transduction processes [273]; however, recent research has also found SNPs within
CSMD1 and its homologue, CSMD2, to be significantly associated with SZ [274]. The large
number of SZ-unique epiSNPs occurring at these loci supports further investigation of their role
in affected individuals.
At the dopamine receptor D2 (DRD2) gene, control and BD groups both have a unique epiSNP
(rs7125415, rs7131056) in an intron, while SZ does not, although it has one at the 5’UTR of the
D1 receptor locus (DRD1, rs4532). As previously mentioned, the dopamine hypothesis of
psychosis has received the most scientific attention and, even after 50 years of drug discovery,
the majority of current treatments rely on dopamine D2 receptor blockade [275]. It is obvious
that this system has some merit as a factor in psychosis, and disease-specific ASM within the
receptors is one possible mechanism. All of these cohort-specific epiSNPs reveal interesting
connections between epigenetic factors and genetic elements, which act together to produce an
intricate disease profile that could vary between individuals.
Following the identification of total epiSNPs per cohort, we explored their associated functional
pathways. As many epiSNPs are located within introns, exons and UTRs, while others are in
the vicinity of genes, we assigned each epiSNP the GO category of the closest gene. The
distribution of GO categories per cohort is shown in Figure 3.18 and, predictably, the majority
of GO categories associated with brain epiSNPs were related to brain development, cellular
activity and brain function. Some GO categories encompass a very large number of SNPs, so it
is likely that epiSNPs will fall into these categories simply because they are so broad; for
example, there are 4316 SNPs included in the “multicellular organismal process” category.
Other categories contain considerably fewer SNPs, such as the “behavioural fear response”
category (12 SNPs), so it is much less likely that epiSNPs would receive those GO terms. Table
3.7 lists the top 5th percentile of significant GO categories that appeared most frequently per
cohort, while Table 3.8 lists the top 5th percentile of significant GO categories that were enriched
per cohort, i.e. appearing more frequently than expected, given the total number of SNPs in the
category; notably, there is no overlap between tables. Most epiSNPs occurred in categories
related to signaling and development, with all cohorts sharing the top four GO IDs, but differing
in the fifth position. All of these categories are very large, encompassing hundreds to thousands
of SNPs, and are related to broad, basic tissue functions, so it is not surprising that they are the
most frequently observed.
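The distinction between "most frequent" and "enriched" categories can be made concrete with a small calculation. The sketch below assumes that enrichment is the ratio of the observed epiSNP fraction in a category to the epiSNP fraction among all annotated SNPs, and it uses a one-sided hypergeometric tail as one common over-representation test; whether this matches the exact statistic applied in this analysis is an assumption. The function names and the totals `K` and `N` are hypothetical placeholders; only the category size (12 SNPs) and the count of five epiSNPs come from the text and Table 3.8.

```python
from math import comb


def enrichment_ratio(k, n, K, N):
    """Observed/expected ratio for one category.

    k: epiSNPs in the category, n: all SNPs in the category,
    K: epiSNPs overall, N: all annotated SNPs overall.
    """
    return (k / n) / (K / N)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n SNPs are drawn without replacement from N, K of which are epiSNPs."""
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return favourable / total


# Small category: 5 epiSNPs among 12 SNPs ("behavioural fear response").
# K and N are hypothetical round numbers chosen only to illustrate the arithmetic.
K, N = 500, 10_000
small = enrichment_ratio(5, 12, K, N)
small_p = hypergeom_upper_tail(5, 12, K, N)

# Broad category: 4316 SNPs; 264 epiSNPs is the control count from Table 3.7.
broad = enrichment_ratio(264, 4316, K, N)

print(f"small category: ratio={small:.2f}, p={small_p:.2e}")
print(f"broad category: ratio={broad:.2f}")
```

Under these placeholder totals the broad category holds far more epiSNPs but has a ratio close to 1, whereas the 12-SNP category holds only five epiSNPs yet is several-fold over-represented, which is why the two tables need not overlap.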
Figure 3.18. GO categories per cohort in brain
The GO categories associated with epiSNPs from the control, BD and SZ cohorts are depicted in a Venn diagram
(p<0.01).
Table 3.7. Top five GO categories per brain cohort
The GO categories containing the largest numbers of epiSNPs per study cohort in the brain, where FDR <0.02.
Cohort GO Term epiSNPs
Control GO:0032501_multicellular_organismal_process 264
GO:0023052_signaling 222
GO:0032502_developmental_process 205
GO:0007275_multicellular_organismal_development 192
GO:0051179_localization 190
SZ GO:0032501_multicellular_organismal_process 410
GO:0023052_signaling 364
GO:0032502_developmental_process 322
GO:0007275_multicellular_organismal_development 301
GO:0048856_anatomical_structure_development 282
BD GO:0032501_multicellular_organismal_process 216
GO:0023052_signaling 186
GO:0032502_developmental_process 157
GO:0007275_multicellular_organismal_development 151
GO:0023060_signal_transmission 138
Table 3.8. Top 5 enriched GO categories per brain cohort
The GO categories showing the highest levels of enrichment per study cohort in the brain, where FDR <0.02.
Enrichment levels refer to an over-representation of a GO category, and the number displayed in column four
represents the exact number of epiSNPs that were given that specific GO classification.
Cohort GO Term Enrichment epiSNPs
Control GO:0001662_behavioral_fear_response 8.993056 5
GO:0002209_behavioral_defense_response 8.301282 5
GO:0007172_signal_complex_assembly 8.301282 5
GO:0021955_central_nervous_system_neuron_axonogenesis 8.301282 5
GO:0001964_startle_response 7.708333 5
SZ GO:0045634_regulation_of_melanocyte_differentiation 8.431333 4
GO:0050932_regulation_of_pigment_cell_differentiation 8.431333 4
GO:0001662_behavioral_fear_response 6.323499 6
GO:0002209_behavioral_defense_response 5.837076 6
GO:0001964_startle_response 5.420142 6
BD GO:0051552_flavone_metabolic_process 25.527721 5
GO:0009812_flavonoid_metabolic_process 19.145791 6
GO:0009698_phenylpropanoid_metabolic_process 17.01848 6
GO:0032469_endoplasmic_reticulum_calcium_ion_homeostasis 17.01848 4
GO:0051967_negative_regulation_of_synaptic_transmission__glutamatergic 12.76386 3
GOminer was used to determine the enrichment levels of each GO category per cohort. When
these enriched categories are examined, the profiles begin to diverge, especially in the case of
BD, where the top categories were incredibly enriched and none overlapped with the other
cohorts. The top three bipolar categories are related to flavonoids, as flavones are a class of
flavonoids, and phenylpropanoids are flavonoid precursors. At first glance, these categories
seem to be completely out of place but, in rats, the flavonoid quercetin and its glycoside rutin
have been found to exert antidepressive and neuroprotective effects, possibly due to inhibition
of monoamine oxidase [276]. Flavones are emerging as potential treatment options; in several
human cancer cell lines, flavones restored the function of a tumor suppressor gene that was
silenced via hypermethylation [277], although their exact activity remains unknown and this
particular study did not investigate any potential demethylation actions. An intriguing new
finding is that the flavonoid prodrug, baicalin, has the ability to cross the blood-brain barrier and
inhibit prolyl oligopeptidase, which is a cytosolic serine peptidase that is associated with BD,
SZ and other neuropsychiatric disorders. Baicalin is a natural compound derived from
Scutellaria baicalensis root extracts that has been safely administered to humans for many
years, and has potential as a new therapeutic option for psychosis and related conditions [278].
Altogether, it appears that flavonoid pathways are a new target that requires further exploration
in regards to psychosis, particularly BD, as recent literature is only beginning to touch on this
subject. The numerous flavonoid-related epiSNPs may represent a dysregulation of these
pathways, although this requires more in-depth investigation. The significance of the
endoplasmic reticulum- and glutamate-related BD enriched GO categories will be discussed
further in the thesis.
Three of the top five enriched categories in the control cohort deal with fear and behavioural
responses, and these were also present among the most enriched SZ categories. The genes
related to the epiSNPs were very similar between groups, with one major difference: the DRD1
gene appears in all of these categories in the SZ cohort, but does not appear in the control cohort;
however, the DRD2 gene appears in the control “startle response” pathway. While it is not clear
why fear-related categories would be so enriched in the control group, it has been shown that
prepulse-inhibition of startle is linked to glutamatergic pathways [279], and we have found that
these pathways are also largely affected by ASM; perhaps the regulation of these pathways is
somewhat dependent on the regulation of the glutamatergic ones, as glutamate receptor genes do
make up a significant portion of the fear-related GO categories. Still, lack of prepulse-inhibition
is a characteristic of SZ [280], so we should not be too quick to dismiss the enrichment of these
categories as circumstantial.
Outside of the fear pathway-related categories, the other control categories in the top five
seemed quite logical, dealing with signal complex assembly and CNS axonogenesis – processes
required for normal brain activity. In contrast, the top two categories in the SZ cohort seem
rather unintuitive, as they are related to differentiation of melanocytes. The same four genes
appeared in both categories: BCL-2, GNA11, ADAMTS9 and ADAMTS20. The BCL-2 protein is
an apoptotic regulator (anti-apoptotic protein) that has been implicated in both melanoma [281]
and SZ, showing decreased levels in the cortex of SZ subjects, leaving cortical cells vulnerable
to apoptosis [282]; current literature does not indicate a known connection between GNA11 or
the ADAMTS family of proteins and SZ. This is not the first association of melanocyte-related
pathways and SZ, as one group has found that two combined complex genotypes involving
melanotropin showed a stronger association with SZ than any single locus [283]. Some
evidence has been presented that SZ is protective against melanoma, but this has yet to be
independently verified [284]. Again, like the connection emerging between BD and flavonoid
pathways, the relationship between SZ and melanocyte pathways provides an interesting yet
ambiguous target for future research.
Interestingly, two general themes that appeared in our GO analysis involved pathways that have
been proposed to play a role in the etiopathogenesis of psychosis. A number of epiSNPs fell
into GO categories related to the neurotransmitter glutamate, such as GO:0007215 (glutamate
signaling pathway) in all cohorts, GO:0051967 (negative regulation of synaptic transmission
glutamatergic) in BD, and GO:0035249 (synaptic transmission glutamatergic) in control and SZ;
over the last several years, the glutamatergic theory of psychosis has been gaining ground on the
popular dopamine theory. We also noted the presence of two insulin-related categories in the
SZ cohort alone: GO:0030073 (insulin secretion) and GO:0050796 (regulation of insulin
secretion). The majority of the genes associated with these epiSNPs are expressed in the human
brain – only one (RFX6) was expressed elsewhere. This discovery is particularly interesting, as a
link between SZ and insulin was proposed in the 1930s, but fell out of favour for decades until,
just recently, a collection of studies presented new evidence in support of this theory [244, 285,
286]. The GO categories are presented in Table 3.9, and the connection between psychosis,
glutamate and insulin will be addressed in detail in the “Genetic-epigenetic interplay in complex
disease” section of the thesis discussion.
Table 3.9. Glutamate- and insulin-related GO categories per brain cohort
GO categories related to glutamate and insulin pathways per study cohort in the brain, where FDR <0.05. The
number of epiSNPs within each category is displayed.
Cohort GO Term epiSNPs
Control GO:0035249_synaptic_transmission__glutamatergic 8
GO:0007215_glutamate_signaling_pathway 6
SZ GO:0007215_glutamate_signaling_pathway 7
GO:0035249_synaptic_transmission__glutamatergic 8
GO:0030073_insulin_secretion 19
GO:0050796_regulation_of_insulin_secretion 17
BD GO:0051967_negative_regulation_of_synaptic_transmission__glutamatergic 3
GO:0007215_glutamate_signaling_pathway 8
Various mitochondrial malfunctions have also been connected to the development of psychosis,
and our study uncovered several pieces of evidence that support this theory. Each brain cohort
had a unique profile of epiSNPs occurring within genes related to mitochondrial function
(summarized in Table 3.10). It should be noted that the SNP 6.0 array does cover SNPs in the
mitochondrial genome; however, we did not detect any epiSNPs of that variety, and the genes
discussed here are mentioned because they interact with the mitochondria. This does not
guarantee that ASM does not occur in mitochondrial DNA, as emerging evidence suggests that
epigenetic modification of mitochondrial DNA is possible [287, 288].
Minimal overlap of epiSNPs was observed and, in all cases, the common epiSNPs were
methylated on the same allele. Only one mitochondria-related epiSNP was common to all three
cohorts, and it was within an intron of a carrier that transports oxodicarboxylates across the
inner mitochondrial membrane; epiSNPs occurring in a peptidase and a variety of mitochondrial
ribosomal proteins were also shared between combinations of cohorts. Most of the unique
epiSNPs belonged to the SZ cohort and the associated genes mainly consisted of assembly
factors for mitochondrial complexes and solute carriers. The different profiles are evidence of
regulatory alterations taking place in the case cohorts versus the control and, as previously
mentioned, much of the ASM is present in intronic regions or UTRs – sequences that may affect
splicing or translation of the mRNA. In their epigenomic profiling study, Mill et al also detected
some epigenetic dysregulation occurring in mitochondrial pathways in cases of psychosis [32].
We also discovered a SNP in an intron of ERMP1 (rs4612399) that was common to all cohorts
in brain and in sperm, as well as a significant enrichment of an endoplasmic reticulum-related
GO category in the BD cohort, plus another enriched (enrichment score = 4.94) BD GO
category, GO:0034976: response to endoplasmic reticulum stress. It has been hypothesized that
BD and SZ can result from impaired brain energy metabolism marked by abnormal glucose
metabolism and mitochondrial dysfunction, and that detachment of enzymes from mitochondrial
membranes may lead to these diseases by increasing oxidative stress and limiting brain growth
and development [289]. Some of the affected genes in our ER-related GO categories (highly
enriched in BD) interact with mitochondrial membranes and proteins, for example, APP
produces the precursor to the amyloid beta (A-beta) protein, which is known to form deposits in
the brains of individuals with Alzheimer’s disease [290], and it also possesses a mitochondrial
targeting sequence allowing both APP and A-beta to accumulate inside the organelles and
disrupt their activities [291]. The general control nonderepressible-2 (GCN2) kinase is also related
to mitochondria, as its activation can mark changes in translational control of mitochondrial
proteins, leading to organelle depression [292]. This is further evidence that ASM-mediated
disturbances in mitochondrial interactions could play a role in this particular theory of
psychosis.
One final interesting mitochondrial connection was our discovery of many different epiSNPs in
all cohorts and in both DNA sources that were located in introns of the gene encoding inositol 1,4,5-
trisphosphate 3-kinase B, the enzyme that controls 1,4,5-inositol trisphosphate (IP3) levels via
phosphorylation to IP4 [293]. The reduction of intracellular IP3 levels stimulates autophagy,
whereas increases in IP3 suppress it. The IP3 receptor (IP3R) is located in the ER membranes
and in ER-mitochondrial contact sites, and blockade of this receptor leads to autophagy of both
ER and mitochondria [294]. Cohort-specific epiSNPs may bestow varying IP3 kinase B activity
upon each group, resulting in alterations to IP3 metabolism and, ultimately, dysfunctions of ER
and mitochondrial activities or autophagy – events that are thought to contribute to psychiatric
diseases.
Table 3.10. Summary of mitochondria-related epiSNPs per brain cohort
EpiSNPs occurring within genes are listed with the gene name, dbSNP functional class tag, full name of the gene,
and cohorts in which the epiSNP appears.
EpiSNP | Gene | dbSNP class | Name | Cohort
rs2181411 | ATPAF1 | UTR-3 | ATP synthase mitochondrial F1 complex assembly factor 1 | Ctrl
rs654509 | ATPAF1 | intron | ATP synthase mitochondrial F1 complex assembly factor 1 | SZ
rs1025806 | ATPAF1-AS1 | cds-ref | ATP synthase mitochondrial F1 complex assembly factor 1 | SZ
rs2109862 | IMMP2L | intron | IMP2 inner mitochondrial membrane peptidase-like (S. cerevisiae) | SZ
rs13025568 | MFF | intron | mitochondrial fission factor | SZ
rs11551114 | MIPEP | missense | mitochondrial intermediate peptidase | Ctrl, BD
rs286633 | MRPL22 | intron | mitochondrial ribosomal protein L22 | BD, SZ
rs3785489 | MRPS23 | UTR-3 | mitochondrial ribosomal protein S23 | Ctrl, SZ
rs11760722 | LOC100507421 | intron | mitochondrial ribosomal protein S33 | BD, SZ
rs10058726 | NDUFAF2 | intron | NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, assembly factor 2 | SZ
rs11967279 | FARS2 | intron | phenylalanyl-tRNA synthetase 2, mitochondrial | SZ
rs10134334 | SLC25A21 | intron | solute carrier family 25 (mitochondrial oxodicarboxylate carrier), member 21 | Ctrl, BD, SZ
rs17105237 | SLC25A21 | intron | solute carrier family 25 (mitochondrial oxodicarboxylate carrier), member 21 | BD
rs17105036 | SLC25A21 | intron | solute carrier family 25 (mitochondrial oxodicarboxylate carrier), member 21 | SZ
rs3772197 | SLC25A26 | UTR-5 | solute carrier family 25 (mitochondrial carrier, phosphate carrier) | SZ
Tissue specificity of ASM
In addition to the post-mortem brain prefrontal cortex samples, we also attempted to identify
epiSNPs in sperm DNA from controls and BD subjects (n=24 each), using the same
experimental design and analysis employed on the brain samples. We detected a number of
epiSNPs and there was overlap between the control and BD cohorts (Fig. 3.19); however,
considerably fewer were found than in the brain. This is most likely due to the lower
sample size and power to detect, but it may also reflect tissue-specific differences. In the brain
sample set, BD and control cohorts had approximately equal numbers of epiSNPs, but in the
sperm sample set, the control epiSNPs were more than double the number identified in the BD
cohort. As sample sizes were equal between sperm cohorts, this variation is not due to
differences in detection power, although its meaning is difficult to interpret.
Of the 27 epiSNPs that overlapped the control and BD groups in the sperm sample set, only one
differed in the directionality of ASM, and it was located in an intergenic region with no known
gene association within 50Kb. There were 13 genes associated with the sperm epiSNPs common
to control and BD, and they were usually intronic, except for one that was exonic and coded a
missense mutation in the proteasome subunit beta type-4 gene. One of these common epiSNPs
(rs3754387) was found in an intron of the inositol 1,4,5-trisphosphate 3-kinase B gene (ITPKB),
another (rs979278) was located in an intron of the endoplasmic reticulum metallopeptidase 1
gene (ERMP1), and a third (rs851823) was in an intron of the contactin-associated protein-like 2 gene
(CNTNAP2) – the recurrence of these genes in brain and sperm will be addressed shortly. Other
associated genes code various proteins, including carbohydrate sulfotransferase 11, a
hypothetical protein (KIAA1609), protein bicaudal D homolog 1, a centrosomal protein, and
others with no particular link to BD.
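As a rough check on the significance of the overlap between the sperm cohorts, the following minimal sketch computes a hypergeometric upper-tail probability for the 27 shared epiSNPs. It is an illustration only: the test actually used is not stated in this section, the cohort totals of 246 (control) and 84 (BD) are back-calculated from the intergenic counts and percentages reported in the functional class analysis, and treating all 906,600 array SNPs as the background is an assumption. The function names are hypothetical.

```python
from math import exp, lgamma, log


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def overlap_log10_p(total, size_a, size_b, overlap):
    """log10 of P(X >= overlap) for a hypergeometric overlap X.

    total   -- background number of SNPs (assumed: every SNP on the array)
    size_a  -- epiSNPs in the first cohort
    size_b  -- epiSNPs in the second cohort
    overlap -- epiSNPs observed in both cohorts
    """
    log_denominator = log_choose(total, size_b)
    terms = [
        log_choose(size_a, k)
        + log_choose(total - size_a, size_b - k)
        - log_denominator
        for k in range(overlap, min(size_a, size_b) + 1)
    ]
    # Sum the tail in log space so very small probabilities do not underflow.
    peak = max(terms)
    log_p = peak + log(sum(exp(t - peak) for t in terms))
    return log_p / log(10)


# Cohort totals are derived, not reported directly (see the lead-in).
sperm_overlap_log10_p = overlap_log10_p(906_600, 246, 84, 27)
```

Under these assumptions the expected chance overlap is well below one SNP, so an observed overlap of 27 gives a tail probability many orders of magnitude smaller than the bound quoted in the caption of Figure 3.19.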
Figure 3.19. EpiSNPs detected in sperm DNA
The total numbers of epiSNPs detected in control and BD cohorts are depicted in a Venn diagram (p<2.2x10^-16).
There is a moderate amount of epiSNP overlap between sperm and brain cohorts, as illustrated
in Figure 3.20. The variation is likely the result of tissue-specific ASM effects, as the
overlapping epiSNPs are not obviously related to brain or sperm function and could be common
to any number of tissues. It is true that epiSNPs, in general, seem to occur in intergenic regions
and do not appear to be associated with genes whose functions are obvious – further pathway
investigation is required to determine the functional relationships of the associated genes.
Sample size issues prohibited us from running a GO analysis for the sperm epiSNPs. When
brain cases and sperm cases were compared, only one out of 41 epiSNPs showed a difference in
ASM directionality, and it was located in an intergenic region with no known gene association
within 50Kb. Among the overlapping case epiSNPs, one (rs4971695) occurred in an intron of a
gene that was obviously related to brain function (neurexin 1, NRXN1), and another
(rs17067095) occurred in an exon of a gene that was obviously related to sperm function
(spermatid associated, SPERT); in general, the epiSNPs that were identified as “overlapping”
any brain and sperm cohorts tended not to be associated with brain- or sperm-specific genes, i.e.
their functions seemed to be more generally applicable.
There were 57 epiSNPs that appeared in both brain and sperm control cohorts, and all but one
displayed the same ASM directionality; as in the BD/control overlap, the one that differed
between brain and sperm was located in an intergenic region with no known gene association
within 50Kb. The shared epiSNPs were either intergenic or intronic, and they were not
significantly closer to MSRE sites than any other SNPs, meaning that they were not likely to
simply result from LD. They also had very few MIM titles associated with them, with the
exception of “inositol 1,4,5-trisphosphate 3-kinase B,” “endoplasmic reticulum metallopeptidase
1” and “partitioning-defective protein 3, C. elegans, homolog of.” Oddly, epiSNPs associated
with the “inositol 1,4,5-trisphosphate 3-kinase B” and “endoplasmic reticulum metallopeptidase
1” MIM titles were present in every cohort in brain and sperm, and the “contactin-associated
protein-like 2” title was present in all cohorts except for BD in brain, although different epiSNPs
appeared in this group; one SNP (rs3754387) in an intron of ITPKB was consistently present, as
was another (rs4612399) in an intron of endoplasmic reticulum (ER) metallopeptidase 1
(ERMP1). Mitochondrial ER dysfunction has been linked to bipolar disorder [295] and
valproate, a drug often prescribed for BD and epilepsy, has been shown to protect against ER
stress-induced apoptosis in rats [296]. Additionally, the inositol 1,4,5-trisphosphate receptor
calcium channel interacts with the Bcl-2 apoptosis-inhibiting protein to regulate calcium release
from the ER, and BD patients with the AA genotype at a Bcl-2 gene SNP showed reduced Bcl-2
mRNA and protein levels [297]. This particular SNP, rs956572, did not show ASM, and the
epiSNP related to the ER pathway did not differ in ASM directionality between cohorts or DNA
sources, but it is still interesting to note that, out of the nearly 1 million SNPs on the array, this
collection of SNPs was among the few thousand that were identified as epiSNPs, and a GO
category related to the ER was present in the top five enriched categories in the BD cohort.
Perhaps in a different population, for example, patients who have a long history of valproate
use, some differences in ASM may be observed.
In the case of the CNTNAP2 epiSNPs, one was conserved between BD and control groups in
sperm (rs851823), while a unique one was found in brain controls (rs17434745), and four
unique ones (rs4726833, rs10233374, rs41481753, rs12673933) were found in brain SZ cases,
all within introns of the gene. It is interesting to note that, while a CNTNAP2 epiSNP was
common to control and BD in sperm, there was an absence of similar epiSNPs in the BD cohort
in brain. The CNTNAP2 protein functions in neuronal cell adhesion, synaptic formation and cell
signaling as part of the cell adhesion molecule (CAM) pathway, which has been associated with
SZ and BD, as well as being previously implicated in specific language disorder and autism
[298]. A recent meta-analysis has identified variants within this gene as being significantly
associated with SZ and BD [299], although the detected SNPs were not epiSNPs. In sperm,
there is no difference in the CNTNAP2-associated epiSNP or directionality of ASM. In the
brain, the target tissue for this gene, cohorts differ considerably with an absence of ASM at this
locus in BD, an excess in SZ, and a single, different epiSNP in controls. The meaning of this
finding is not clear, but it supports the theory that brain-specific regulation of this gene can be
altered in cases of psychosis.
Again, the smaller sample size in the sperm study was a limitation that likely contributed to the
lower number of identified epiSNPs, overlaps between cohorts and overlaps between sample
types. A previous study by Schalkwyk et al listed their top 21 strongest examples of ASM
detected in blood DNA [21], none of which were common to our brain epiSNPs. Once again,
this may be the result of tissue-specific effects, or perhaps it is due to their different analysis
strategy and enrichment protocol; they filtered out most of the SNPs on the SNP 6.0 array
before analyzing and used a cocktail of three enzymes (HpaII, HhaI and AciI) to produce a
variety of smaller fragments that differed from ours, likely interrogating some different SNPs. It
should be noted that Schalkwyk et al detected tissue-specific ASM effects between their own
blood and buccal samples, and recent research suggests that tissue-specificity of gene expression
is common, with one study estimating that 69-80% of regulatory variants are cell type-specific
[300]. Using whole-genome bisulfite sequencing, Li et al mapped the methylome from the
blood of one individual, and found tissue-specific DMRs at 240 856 regions when they
compared blood and lung tissue [301]. Taking the above information into account, and given
that methylation profiles are highly variable between tissues [29], we suspect that our evidence
for tissue-specific ASM effects is valid, despite the sample size limitations.
Figure 3.20. Overlapping epiSNPs between brain and sperm
The numbers of overlapping epiSNPs between control, total case, and BD cohorts in brain and sperm are depicted in
Venn diagrams (p<2.2x10^-16).
As in the brain samples, we investigated the chromosomal distribution and functional class
profiles of the epiSNPs detected in sperm. In both sperm cohorts, the epiSNP distribution was
uniform across the chromosomes and did not show a single significant difference in
proportion (Table 3.3 and Fig 3.21). The absence of BD epiSNPs on chromosomes 4, 11 and 20, and
of control or BD epiSNPs on Y, did not represent significant depletions, owing to the smaller
overall number of epiSNPs identified in sperm. It is quite possible that the varying distribution profiles
reflect different distributions of tissue-specific genes that would be susceptible to ASM, while it
is equally possible that the flat epiSNP landscape in sperm DNA could indicate that ASM is
somewhat random or less influential in these cells, as it does not tend to localize to any
individual chromosome. The smaller sample size and lack of a SZ cohort in sperm are other
factors that should be considered, as we noticed that more enrichments and depletions became
significant when we pooled the BD and SZ cohorts in brain.
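Because one test is run per chromosome, significance in the chromosomal distribution figures is reported against a q-value threshold. The sketch below shows one common way to turn a set of per-chromosome p-values into q-values, the Benjamini-Hochberg procedure. Whether this exact procedure was used is an assumption, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values for four chromosomes (not taken from the thesis).
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in example_q]
```

With few epiSNPs per chromosome, as in the sperm data, the raw p-values are rarely small enough to survive this kind of correction, which is consistent with the flat profile described above.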
In the functional class analysis (Figure 3.22), the sperm profile was similar to that observed in
brain, in that 147 (59.76%) of control and 50 (59.52%) of BD epiSNPs were located in
intergenic regions; however, the profiles did differ in some areas. There were no significant
enrichments or depletions observed in the control cohort (p>0.208 and 0.131, respectively),
whereas the brain control cohort epiSNPs were enriched in UTRs. The sperm BD cohort
differed from the brain profile, as the exon class was significantly enriched (p=0.046), in
comparison to the UTR enrichment observed in BD brain (p=0.011); enrichment of exons
contrasts sharply with the general depletion of this class that occurred in the brain. It should be noted
that, although there were no BD epiSNPs in the UTR class, it was not a significant depletion,
once again due to the small number of identified epiSNPs in sperm and the relatively few UTR
SNPs on the SNP 6.0 array (a factor that was considered when enrichment was calculated). As
we have already observed the presence of tissue-specific ASM effects between brain and sperm
samples (in relation to the MIM titles of associated genes), it is plausible that the overall
activities of epiSNPs may also differ between them. In both sample types, there was a large
number of intronic and intergenic epiSNPs, but this was due to the fact that most SNPs occur in
these classes, and no enrichments or depletions were documented.
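The point that an empty class need not be a significant depletion can be illustrated with a simple binomial model, in which each epiSNP falls into a functional class with probability equal to that class's share of SNPs on the array. This is a sketch under stated assumptions rather than the test used in the thesis: the 84 BD sperm epiSNPs are back-calculated from the reported 50 (59.52%), the 1% UTR share of array SNPs is a hypothetical placeholder, and the function name is invented for illustration.

```python
from math import comb


def binomial_tails(n_epi, k_in_class, class_fraction):
    """Return (P(X >= k), P(X <= k)) for X ~ Binomial(n_epi, class_fraction).

    The first value is an enrichment p-value and the second a depletion
    p-value for observing k_in_class epiSNPs in a functional class.
    """
    pmf = [
        comb(n_epi, i) * class_fraction**i * (1 - class_fraction) ** (n_epi - i)
        for i in range(n_epi + 1)
    ]
    enrichment_p = sum(pmf[k_in_class:])
    depletion_p = sum(pmf[: k_in_class + 1])
    return enrichment_p, depletion_p


# Hypothetical background: 1% of array SNPs are assumed to lie in UTRs.
_, utr_depletion_p = binomial_tails(n_epi=84, k_in_class=0, class_fraction=0.01)
utr_is_depleted = utr_depletion_p < 0.05
```

With so few epiSNPs and a rare class, observing zero is not unlikely by chance, so the depletion p-value stays above 0.05 under these assumed inputs.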
Figure 3.21. Chromosomal distribution of sperm epiSNPs
The distribution of sperm epiSNPs per cohort across all chromosomes, where the total number of detected epiSNPs
is displayed as a proportion of the total number of SNPs occurring on each chromosome. No chromosome was
significantly enriched (q<0.05).
Figure 3.22. Functional class distribution of sperm epiSNPs
Sperm epiSNPs were stratified by their dbSNP functional class tags. Coding SNPs occur in exons of genes, where
one variant introduces either a non-synonymous or synonymous change; Intron SNPs are located within intronic
regions of genes; Locus SNPs are located 2Kb upstream or 0.5Kb downstream from a gene; and UTR SNPs occur
in either 3’ or 5’ untranslated regions. The number of epiSNPs per cohort in each class is given as a proportion of
the total number of genomic SNPs in that class. Red asterisks mark classes that are enriched per cohort (p<0.05).
Sensitivity Analysis
As the number of detected epiSNPs decreased when the datasets were stratified into control, SZ
and BD groups, a sensitivity analysis that exhaustively interrogated all possible sample sizes
was conducted to determine the effect size for biologically meaningful results, and to ensure that
our method and sample size were sufficient to detect all significant epiSNPs. The results of this
analysis are summarized in Figure 3.23. We randomly sampled our 5 different cohorts (all
samples, control, all cases, BD and SZ) 5 times each at every sample size, and ran our PWL
analysis for every possible set that resulted from the sampling. We then used a non-linear least
squares regression to extrapolate the findings for each cohort to one million samples, and saw
our number of detectable epiSNPs rapidly plateau. This indicates that our sample sizes were
sufficient to detect the majority of significant epiSNPs.
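The resampling and extrapolation steps can be sketched as follows. This is a minimal illustration, not the analysis code used here: `count_epi_snps` stands in for the full PWL analysis and must be supplied by the caller, the saturating model count = plateau * n / (half_size + n) is an assumed functional form (the model fitted in the thesis is not specified in this section), and the fit uses a grid search over `half_size` with a closed-form plateau rather than a general non-linear least squares solver. Starting at a subset size of 2 is also an assumption.

```python
import random


def subsample_counts(samples, count_epi_snps, replicates=5, seed=0):
    """Count detections on random subsets of every size (replicates per size)."""
    rng = random.Random(seed)
    points = []
    for size in range(2, len(samples) + 1):
        for _ in range(replicates):
            subset = rng.sample(samples, size)
            points.append((size, count_epi_snps(subset)))
    return points


def fit_saturating_curve(points, half_sizes):
    """Least-squares fit of count = plateau * n / (half_size + n).

    For a fixed half_size the best plateau has a closed form, so only
    half_size is searched over the supplied grid.
    """
    best = None
    for half_size in half_sizes:
        basis = [n / (half_size + n) for n, _ in points]
        plateau = sum(f * y for f, (_, y) in zip(basis, points)) / sum(
            f * f for f in basis
        )
        sse = sum((y - plateau * f) ** 2 for f, (_, y) in zip(basis, points))
        if best is None or sse < best[0]:
            best = (sse, plateau, half_size)
    return best[1], best[2]


def predict(plateau, half_size, n):
    """Expected number of detections at sample size n under the fitted model."""
    return plateau * n / (half_size + n)
```

Evaluating `predict` at one million samples and comparing it with the count at the real cohort size gives a rough estimate of the fraction of detectable epiSNPs already recovered.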
Figure 3.23. Sensitivity analysis
The polygons represent the results of random sampling and recalculation of the total number of brain epiSNPs
identified in each cohort (control+case, control, all cases, BD and SZ). The dotted lines represent the extrapolation
of each data set to include one million samples. In all cohorts, the total number of detectable epiSNPs plateaus
almost immediately.
The epiSNPs considered in the sensitivity analysis were all significant, but ranged from very
weak to very strong associations between one allele and methylation. When we stratify this
same analysis by association strength, it becomes apparent that the only epiSNPs we may miss
are the ones with the weaker associations (Fig 3.24). Increasing the sample size permits the
detection of more epiSNPs, up to the point of plateau, but their associations between allelic
variants and methylation are weaker.
Figure 3.24. Sensitivity analysis stratified by strength of associations
The sensitivity results separated by their PWL slope values. The slope can be interpreted as the strength of the
association between genotype and methylation levels, where a high slope value (≥3) is a strong association. Slope
values are absolute, i.e., -3 and +3 are binned in the same group. As the slope value increases, the curves taper off.
Extrapolating from the asymptote for our collective brain DNA sample set, we determined that
~2.5% of the 906,600 SNPs on the Affymetrix SNP 6.0 array display ASM. This prediction is
the most comparable to those presented by other groups, as our combined sample size was quite
large (n=208); however, the estimate is slightly conservative, as we have just demonstrated that
a small fraction of the weakest epiSNPs can be missed by our detection and analysis methods.
Compared to the estimates presented by other groups, ours is somewhat intermediate, leaning
towards the lower values. On the lower end, Kerkel et al estimated that at least 0.16% of SNPs
show ASM [19], and Schalkwyk et al suggested a value of 1.5% [21]. Higher estimates included:
8.1% by Gertz et al [213], 10% by Zhang et al [158], 10% by Hellman and Chess [159], and
23-37% by Shoemaker et al [160]. The variable estimates are likely the result of many factors,
such as sample size, detection method, DNA source and the use of simulation studies; the
impact of these factors will be scrutinized in the General Discussion and Conclusions section.
Low level ASM across the genome
We detected a large number of epiSNPs using the microarray approach and PWL analysis, but
there are two main limitations to this strategy. First of all, microarrays are not as sensitive as
deep sequencing technologies and, as we have already addressed in the sensitivity analysis,
epiSNPs demonstrating lower level associations are likely overlooked by this method.
Secondly, the PWL analysis is a two-step regression model that requires at least two of the three
possible genotypes (AA, AB and BB) to be present in order to detect the methylation intensity
slopes from AA to AB and from AB to BB; as a result, SNPs that have rare genotypes may be
immediately excluded from our analysis if the study population does not contain the required
combinations. In order to examine the methylation association of SNPs in greater detail, we
conducted an experiment using the 454 deep sequencing platform, which allows us to use
single-base resolution to search for associations that may have been missed by the microarrays.
Our experimental design is depicted in Figure 3.25.
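The genotype requirement of the two-step model can be made concrete with a simplified sketch that computes slopes between the mean methylation signals of consecutive observed genotypes. This is not the PWL implementation used in the study: it works on group means rather than a fitted regression, and the handling of a SNP with only AA and BB carriers (a single slope per allele dose) is an assumption. All names are hypothetical.

```python
from statistics import mean

# Number of B alleles carried by each genotype.
DOSAGE = {"AA": 0, "AB": 1, "BB": 2}


def piecewise_slopes(genotypes, signals):
    """Slopes of mean methylation signal between consecutive observed genotypes.

    Returns an empty dict when fewer than two genotype groups are present,
    mirroring the exclusion of SNPs with rare genotypes described above.
    """
    groups = {}
    for genotype, signal in zip(genotypes, signals):
        groups.setdefault(genotype, []).append(signal)
    present = sorted(groups, key=DOSAGE.get)
    if len(present) < 2:
        return {}
    slopes = {}
    for left, right in zip(present, present[1:]):
        rise = mean(groups[right]) - mean(groups[left])
        run = DOSAGE[right] - DOSAGE[left]
        slopes[(left, right)] = rise / run
    return slopes


# Toy signals (not real data): all three genotypes are observed.
example = piecewise_slopes(
    ["AA", "AA", "AB", "BB", "BB"], [1.0, 3.0, 5.0, 6.0, 10.0]
)
```

A SNP for which every subject carries the same genotype returns no slopes at all, which is the situation that removes rare-genotype SNPs from the microarray analysis.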
Figure 3.25. Deep sequencing workflow
Preparation of fragments for 454 deep sequencing. Genomic DNA is treated with sodium bisulfite, which converts
unmethylated cytosines to uracils (read as thymines after amplification). Next, 400bp sequences surrounding the target SNPs are amplified using
PCR. The amplicons are gel-extracted and purified, and then 4 pools are created using equal amounts of each
amplicon. These pools are applied to a 4-gasket 454 plate (as shown above) and sequenced.
We selected 11 SNPs that were not found to show ASM in the microarray study and bisulfite
modified the genomic DNA for 40 subjects per SNP, thereby converting all unmethylated
cytosines to thymines. The 40 subjects were divided into four groups of alternate homozygotes:
case AA, case BB, control AA and control BB. In some cases, we did not have 10 samples of a
particular genotype available per group – in those instances, additional samples from the
alternate homozygote group for the same disease status were added in an effort to include 20
samples per disease status (sample information is listed in Appendix 2, Table A2.6). Homozygous
samples were selected, as they would provide the greatest potential difference in methylation
levels, should an ASM effect be present. We then amplified an area surrounding the SNP,
including as many CpG sites as possible, purified our amplicons and sequenced them using the
454 next generation sequencing instrument. We generated 422,114 forward sequence reads and
compared the number of unconverted cytosines at each CpG site between groups per SNP. Our
analysis focused on 4 main questions:
1. Is the association between methylation and genotype in the case group the same as the association
between methylation and genotype in the control group?
2. Is the association between methylation and genotype AA in the case group the same as the
association between methylation and genotype BB in the case group?
3. Is the association between methylation and genotype AA in the control group the same as the
association between methylation and genotype BB in the control group?
4. Is the association between methylation and the case group the same as the association between
methylation and the control group?
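The measurement underlying all four questions is the number of unconverted cytosines at each CpG site. A minimal sketch of that counting step is given below. It assumes forward-strand reads that are already aligned to the unconverted reference amplicon and have the same length as it; real 454 reads would need alignment and quality filtering first. The function names and toy sequences are hypothetical.

```python
def cpg_positions(reference):
    """0-based positions of the C in every CpG of the unconverted reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_by_site(reference, reads):
    """Fraction of reads retaining C (i.e. methylated) at each CpG position.

    After bisulfite treatment a methylated cytosine still reads as C and an
    unmethylated one reads as T. Any other base at a CpG is treated as
    uninformative and ignored.
    """
    result = {}
    for pos in cpg_positions(reference):
        methylated = sum(1 for read in reads if read[pos] == "C")
        unmethylated = sum(1 for read in reads if read[pos] == "T")
        informative = methylated + unmethylated
        result[pos] = methylated / informative if informative else None
    return result


# Toy amplicon with two CpG sites and four aligned reads (not real data).
toy_reference = "ACGTTCGA"
toy_reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ACGTTNGA"]
toy_levels = methylation_by_site(toy_reference, toy_reads)
```

Running this separately for the case AA, case BB, control AA and control BB read sets gives the per-site proportions that would then be compared between groups.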
Our null hypothesis for each question was that there should be no significant difference in
methylation level between any of the 4 alternate homozygous groups at any of the chosen SNPs;
surprisingly, this was not what we observed. Each SNP was associated with between 1 and 27
CpG sites, with most having around five on their amplicons, as shown in Figure 3.26. For
question 1, we found that there was a significant difference in methylation-genotype association
between case and control groups in 11/11 SNPs (Table 3.11). This indicates that a disease
effect on the association between methylation and genotype becomes evident between case and
control groups when analyzed at high resolution, even for SNPs that were not detected by the
microarray analysis. This was the most general of the four questions, but the findings were
recapitulated in the following three analyses, which provided a more detailed examination of the
possible methylation-genotype relationships.
Figure 3.26. CpG count per SNP
The number of CpG sites present on the sequenced amplicons for each SNP.
When we investigated questions 2-4, nearly all SNPs demonstrated significant
differences between groups when all CpG sites surrounding a SNP were analyzed (Table 3.11).
It is apparent that, when all local CpG sites are considered, differences can be observed between
genotypes, between cohorts, or as a combination of both possibilities. Figure 3.27 illustrates the
variation occurring between and within cohorts at the four CpG sites surrounding a single SNP;
our null hypotheses would predict that the methylation proportions would be the same for all
genotypes and all cohorts. This figure serves to visualize the fluctuations at all CpG sites
between cohorts at a SNP, although we pooled all CpG sites per SNP in our analyses to obtain a
single measurement for each SNP.
It should be stressed that the ASM we observed in this experiment is nowhere near as robust as
what was detected in the microarray or even the pyrosequencing experiments, as the microarrays
did not suggest that these were epiSNPs and the pyrosequencing experiment failed to find any
significant ASM at other SNPs that were not identified as epiSNPs. Still, this finding is
intriguing, because it introduces a novel, fundamental quality of ASM: its common occurrence
at smaller effect sizes, at SNPs that we term “minor epiSNPs.” It is not clear why this phenomenon
would exist, but perhaps it represents the remnants of epiSNPs that have faded away over one
generation or the span of many. At this point, we can only hypothesize on the biological
relevance of this finding. We can conclude that cases and controls have an overall, subtle
difference in their total ASM profiles. It also appears that DNA sequence exerts a genome-wide,
low-level effect on methylation status that can only be detected using extremely sensitive
methods, aside from certain points that become more pronounced (epiSNPs).
Table 3.11. Minor epiSNPs showing significant ASM effects in brain
The analysis results for each of questions 1-4. Significant p values (p<0.05) appear in bold.
Figure 3.27. Methylation levels per group for a sample minor epiSNP
Deep sequencing results for minor epiSNP1, an amplicon containing 4 methylatable CpG sites. The Y-axis
represents the level of methylation, based on the number of sequences that retained a cytosine at that site. For each
site, the X-axis is divided into the 4 cohorts.
Chapter 4
General Discussion and Conclusions
In this thesis, we defined some important features of genetic-epigenetic interactions, beginning
with an investigation of methylation differences within MZ twins, continuing on to examine
methylation differences between MZ and DZ twins, followed by a study of the occurrence and
properties of ASM. We then applied some of these principles in a study of major psychosis, a
non-Mendelian complex disease, in order to determine how the relationship between genetics
and epigenetics can impact human diseases.
Our twin study was the first to utilize epigenome-wide profiling to document sequence-
independent DNA methylation differences in several tissues from MZ twins, and the results
support the theory that epigenetic metastability and divergence are responsible for a portion of
co-twin phenotypic discordance. We analyzed the WBC, buccal epithelium, and rectal biopsies of
MZ twins, and annotated the epigenetic metastability of ~6,000 unique genomic regions. We
also examined WBC and buccal epithelium of DZ twins and found that DZ co-twins exhibited
significantly greater epigenetic differences than MZ co-twins in buccal cells. Our in silico SNP
analysis and our comparison of methylomes in inbred vs. outbred mice favour the hypothesis
that epigenomic differences in the zygotes are responsible for the greater discordance observed
in DZ twins. This is evidence that epigenetic inheritance exists, and that it is a separate entity
from DNA sequence-dependent methylation.
Delving into the topic of ASM, we discovered that approximately 2.5% of SNPs show ASM in
brain tissue, although genomic distribution varies between cases and controls, with controls
having a nearly uniform occurrence of epiSNPs across the chromosomes. The majority of
epiSNPs occur in intronic and intergenic regions, although UTRs and regions in the vicinity of
genes (2Kb upstream or 500bp downstream) were generally enriched, while exons were
generally depleted in the brain. The SZ cohort contained twice as many epiSNPs as controls
and BD, although there was a large degree of overlap. Many of the epiSNP GO categories were
related to brain development and function; intriguing differences between cohorts were
observed in the glutamate and insulin pathways. EpiSNPs were also detected in sperm DNA
and were largely tissue-specific, showing little conservation between sample types, as well as
differing in chromosome and functional distribution patterns. Deep sequencing analysis
revealed that any SNP could potentially demonstrate low-level ASM, and that this subtle
phenomenon is easily overlooked in microarray-based studies. This work offers some insight into
the actions of sequence-dependent DNA methylation in normal and affected individuals, and
provides evidence that genetic studies should be stratified to include an epigenetic component.
Changing concepts of epigenetic regulation
The classical view of epigenetics, which dominated for several decades, is somewhat simplistic
– DNA methylation at a promoter silences transcription of that gene in cis, and epigenetic marks
are completely erased and re-established during maturation of germ cells and post-zygotically,
with the exception of imprinted genes [302] – however, recent technological advances have
allowed us to study molecular epigenetic mechanisms in much greater detail and expand upon
this basic scenario. Concepts such as epigenetic inheritance and ASM are still controversial,
mainly because previous studies have not had the power or scope to adequately support or refute
any particular hypotheses [303]. In our twin study, the observation that DZ co-twins were more
epigenetically variable than MZ co-twins provided strong evidence for epigenetic inheritance,
which argues for some degree of persistence of epigenetic marks through the reprogramming
event that occurs after fertilization, ultimately resulting in the presence of these marks in the
somatic cells of the offspring. It differs from transgenerational epigenetic inheritance, in that the
marks are erased in the primordial germ cells, whereas transgenerational marks survive the more
intense erasure/re-establishment process and may be transmitted to subsequent generations
[304]. The mechanisms underlying transgenerational epigenetic inheritance are not fully
understood, but the number of documented cases has been increasing; evidence for
transgenerational inheritance has been detected in several species, such as yeast [305],
Drosophila [306] and mice [156, 307]. Whether the signals are passed to a single offspring or to
multiple generations, the transmission of epigenetic marks will significantly impact the study of
complex non-Mendelian disease, as well as the traditional twin study design.
Our work highlights the existence and importance of epigenetic drift and inheritance, as well as
the role of epigenetics in development and disease, where inherited or stochastic epimutations
can predispose an individual to a phenotypic outcome. Epimutations may arise spontaneously or
as a result of external factors during germ cell reprogramming, and differential germline
epigenetic modification has been documented. An epigenetic study of sperm cells discovered
significant intra- and inter-individual differences in DNA methylation, where unique DNA
methylation profiles were present in a large percentage of the sperm cells collected from a single
subject. Of the identified positions, promoter CpG islands and peri-centromeric satellite repeats
showed the highest degree of variation, and the variation occurred in a number of genes, not
simply those related to sperm function and development [210]. Some of this variation likely
persists after the reprogramming at the time of fertilization, but the degree of inheritance is not
known, nor is the impact on associated histone modifications, which have also been proposed to
show some level of inheritance [308, 309].
At all stages of life, from developing embryo to adult organism, numerous environmental
factors can impact epigenetic status, although the potential for damage is much greater during
the former time period. It has long been understood that epigenetic factors are dynamic in
nature, but they were believed to be relatively mitotically stable; recent research indicates
that they fluctuate at a much higher rate than previously thought. For example, it has
been demonstrated that short periods of exercise are sufficient to cause transient changes in
DNA methylation. DNA methylation levels in skeletal muscle biopsies from healthy men and
women before and after acute exercise were quantified using MeDIP, followed by qPCR and
bisulfite verification. Not only were methylation levels decreased in the promoters of genes
involved in metabolic functions, but the corresponding mRNA levels were also increased,
indicating that DNA methylation can be dynamically altered to affect acute gene expression
[310]. While non-CpG cytosines were the most frequently altered in this experiment, it has also
been shown that methylation-demethylation cycling can occur at CpGs, as demonstrated in five
selected human promoters, including the oestrogen (E2)-responsive pS2 gene, which cycles with
a periodicity of 100 minutes [311]. Taken together, these findings indicate that epigenetic marks are not
simply stable markers of gene transcriptional events, but rather are actively involved in the
fine-tuning of genomic activity.
One issue in the study of MZ variation and epigenetic inheritance is that it is extremely difficult
to pinpoint the source of the variation. Variation may be introduced purely by the environment
in a stochastic manner. Differences may also be stochastic, taking place at the molecular
epigenetic level, namely due to the actions of DNMT1, which does not carry methylation status
through mitotic divisions with complete fidelity. The third option is that environmental stimuli
may affect an individual via interactions with epigenetic factors. The combined actions of these
three mechanisms can be illustrated by food allergies: peanut allergies, for example,
are associated with increased immunoglobulin E (IgE) [312], and vitamin D deficiency (a purely
environmental cause) has been linked to increased IgE sensitization [313]. It has also been
found that allergy-related gene transcription is activated in response to CpG demethylation of
particular sites [314]. These alterations could possibly be the result of epimutations that took
place early in development, precipitating the allergic phenotype when they reached critical
mass. Alternatively, there may have been some environmental factor that caused the epigenetic
response, and it is known that a variety of exposures, such as tobacco smoke, particulates, diesel
exhaust, polyaromatic hydrocarbons, ozone, and endotoxin, can cause allergic phenotypes
by enhancing inflammatory cytokine expression via epigenetic mechanisms [314-316]. In
human epigenetic studies, it is incredibly difficult to control for any of these effects, particularly
when post mortem tissue is used. Future studies are required to determine exactly how
environmental and epigenetic factors may interact in order to eventually make their separation
possible.
ASM represents the interaction between DNA sequence variants and DNA methylation, and it
allows us to consider a new level of regulation in the genome. Although our in silico and mouse
experiments indicated that ASM did not significantly affect our findings in the twin study, it
should be noted that the twin experiment had a different objective and was not designed to
detect ASM effects. The aim of our epiSNP study was to develop an unbiased approach to
examine ASM, as many previous studies were seriously limited by an assortment of biases. First
of all, the main issue with many studies was that they simply included too few samples or
investigated too few SNPs/CpG sites. The first study that focused on allele-specific effects of
SNPs, by Yan et al, only examined 13 loci in 37 samples [20], and the following studies did not
improve upon this issue: Kerkel et al examined 15 057 SNPs in 12 samples (not all representing
the same tissue) [19], Schalkwyk et al examined 183 605 SNPs in 10 samples (5 twin pairs)[21],
and Hellman and Chess examined 110 883 SNPs in 28 samples [159], just to name a few. The
majority of these studies filtered out a tremendous number of SNPs that they felt were either a)
within an MSRE and thus could potentially cause LD effects, or b) uninformative. For example,
Gertz et al used reduced representation bisulfite sequencing to detect ASM and eliminated all
C/T SNPs because they were impossible to differentiate from unmethylated CpG after bisulfite
modification [213]. We felt that this filtering was unnecessary, and that it was more productive
to include all SNPs in the analysis and then filter afterwards if LD effects were detected, rather
than filtering on anticipated effects at the outset and drastically decreasing our number of SNPs.
Fortunately, we did not detect an LD effect, using an analysis that considered the same factors
used by previous studies to design their filters, and this allowed us to examine around ten times
more sites.
Several other studies used cell lines as a source of DNA, which may be problematic as the
methylation profiles of immortalized cells may not reflect those of normal cells; for example, it
has been shown that the overall CpG methylation level is lower in WBC vs hESC [301]. Again,
Hellman and Chess used Epstein-Barr virus transformed B lymphocyte lines, Shoemaker et al
used 16 pluripotent and adult cell lines [160], and Chen et al used 3 hESC lines [161].
Additionally, some studies focused on very specific regions of the genome and isolated target
genes; for example, Shoemaker et al only studied CpGs on chromosomes 2 and 20, regions
surrounding the TSS of 26 developmental genes, and 237 ENCODE promoter regions. Finally,
Hellman and Chess relied heavily on in silico extrapolations and simulations of their actual
wetlab data when they estimated that 10% of SNPs would demonstrate ASM. This value is
considerably higher than our prediction and the predictions of many other groups.
Another limitation facing all studies, including ours, is that microarrays do not tend to capture
rare variants. By definition, the minor allele frequency (MAF) of a SNP must be greater than 1% in a
population, and the SNPs included on the SNP 6.0 microarray have an average MAF of 19.6%, 18.2% and
20.6% in the HapMap Caucasians, Asians and Africans, respectively [317], so it is quite
possible that we are not detecting a number of epiSNPs simply because they do not appear on
the arrays. Furthermore, our enrichment strategy utilized three MSREs and, although it provides
a great deal of coverage, it is not enough to interrogate every differentially methylated region of
the genome, as some fragments would be created that are too large to be efficiently amplified
and subsequently hybridized to the arrays. Also, the high stringency of our data analysis
probably caused us to miss some additional epiSNPs with significances that were just around the
cut-off – our sensitivity analysis showed that we had a sufficient sample size to detect all of the
SNPs with a strong, medium and weak association with ASM, but that we would miss those
with very weak associations and that our genome-wide ASM prediction is a slight
underestimate. Additionally, we and other studies were limited by the biases present in the
human genome reference sequences, as they are built from a small number of genomes and
likely omit some CpGs within common polymorphic DNA regions.
Tissue heterogeneity is another potential confounding issue experienced in any epigenetic study,
as epigenetic marks vary between cell types [318], and tissues contain a variety of cells. Even
techniques such as laser-capture microdissection or fluorescence-activated cell sorting cannot
fully remove this confounder, although they would certainly decrease its effect. Our post
mortem brain samples did come from the same general area (BA10), but we cannot rule out the
possibility that tissue heterogeneity could have affected our findings if the cohorts differed
significantly in cellular composition. The occurrence of CNVs may also influence the findings
of an ASM study, as additional variants may inflate the array intensity measurements at a SNP.
Ideally, all subjects should be screened for CNVs prior to the study. Although the SNP 6.0
array is equipped for CNV analysis, we did not perform one due to a shortage of bioinformatics
resources; however, this analysis is now underway and will be provided in the final publication.
We were also unable to stratify our samples based on smoking, alcohol or drug use, as this
information was not provided in the study demographics, except in the case of the sperm
samples, although this sample set was too small to achieve adequate power with stratification.
Finally, due to technical limitations, no study to date has differentiated between 5mC and 5hmC,
as they are indistinguishable to MSREs and perform identically in bisulfite modification
reactions.
All current studies agree that ASM is a much more common event than genomic imprinting, and
that DNA sequence plays an important role in the establishment of methylation marks in
specific regions; however, one particular study by Gertz et al takes a rather extreme stance on
the effect size of ASM. They state that:
“The strong association between genotype and DNA methylation indicates that genetics plays a
prominent role in the establishment of DNA methylation patterns. Our data supports a non-
Lamarckian model of evolution, where genetic variants, as opposed to environment, shape
epigenetics. These genetic variants may not lead directly to phenotypic differences, but may
cause phenotypic variability through changes in epigenetic states,” and claim that “the majority
of variation in DNA methylation can be explained by genotype [213].”
We strongly disagree with these statements – they place too much emphasis on the ability of
DNA sequence to affect methylation status, and this directly conflicts with our findings in the
twin study, where a great deal of epigenetic variation exists between individuals who are
genetically identical. It is difficult to understand how Gertz et al can reach this conclusion,
considering that they estimated that only 8% of SNPs show ASM in a small, three-generation family
(n=6) and eliminated all C/T SNPs from their analysis. Our estimate was lower (2.5%), and it
was generated by examining a tremendous number of SNPs without bias in a very large sample
set (n=208 brain, 48 sperm samples); thus, we feel it is a more accurate representation of the
true number. In either case, neither 2.5% nor 8% is a high enough value to conclude that DNA
sequence determines “the majority” of DNA methylation.
They also found a strong overlap in epiSNPs between tissues, whereas we saw a weaker overlap
between our brain and sperm epiSNPs, as well as a moderate overlap between different cohorts
in the same sample type. They rationalize their strong ASM concordance as being due to,
“shared gene regulatory events that occur early in development or an inherent property of DNA
sequence that directly affects the propensity of DNA methylation [213].” Again, our study was
much larger and had higher power to detect ASM, and their tissues consisted of blood and cell lines, so it
is difficult to determine the biological relevance of the conclusions they reached about
overlapping epiSNPs. Still, we do agree that the common epiSNPs likely share some kind of
developmental origin or similarities in the surrounding sequence that increases their ability to
direct local methylation. We also agree with their statement that the “data are consistent with
both the re-establishment of allelic methylation during development and the direct transmission
of DNA methylation in the germline [213],” and that our experimental designs do not allow for
separation of these methods of epiSNP establishment or propagation.
A probabilistic model that identifies ASM based on bisulfite sequencing data has been presented
by Fang et al [319]; it operates by ignoring the actual genotype and simply focuses on detecting
any region where methylation levels differ between alleles. Although it was developed for the
identification of new imprinted loci, the model provided some insight into the characteristics of
genomic, non-imprinted ASM as well. They analyzed 22 publicly available methylomes from
several tissue types, five of which were uncultured primary blood cell types, while the rest
represented ESC or induced pluripotent stem cells. In agreement with our data, they found that
the total number of ASM loci varies between tissues, also noting that DNA methylation is
altered in immortalized cells – a shortcoming of many other ASM studies. Their predicted
regions of ASM that were common to multiple tissues often marked the promoter regions for
various ncRNA [319], which complements our finding that epiSNPs are enriched in locus
regions and UTRs; as previously mentioned, these regions are often targeted by ncRNA [171].
On a related note, we determined that a large percentage of epiSNPs occurred in introns and
intergenic sequences, both of which can code products that are processed into ncRNAs [169,
172], although those epiSNPs were so numerous simply because SNPs are numerous in those
regions. Fang et al did not analyze any brain methylome data, nor did they produce a list of
epiSNPs, but their results support the tissue-specificity of epiSNPs and strengthen the
connection between ASM and ncRNA.
Humans are not the only species in which ASM effects have been documented. Xie et al [320]
have recently published a genome-wide, base-resolution map of ASM occurring in mouse
frontal cortex that was generated with MethylC-Seq. They examined ~20 million SNPs in two
mice that were the F1 progeny of reciprocal crosses between two distantly-related inbred strains
and detected ASM in 131 765 CpG sites (approximately 0.7%), finding that CG and non-CG
methylation could occur in an allele-specific manner. As in our study, they found that ASM
sites were isolated and scattered across the genome, typically appearing in intronic and
intergenic regions, although they noted a relative depletion in proximal promoters, whereas we
found an enrichment in regions within 2Kb upstream of the TSS; additionally, they found that
imprinted ASM sites preferentially occurred in proximal promoters. Taken together, the results
of our studies indicate that parent-of-origin-dependent and sequence-dependent ASM do not
appear to share the same molecular basis, but each specific type of ASM may function similarly
between humans and mice. They also characterized the mouse ASM and came to several
conclusions. First, ASM is more likely to be depleted in sequences coding homeobox proteins,
transcription factors, developmental regulators, histones and ribosomal proteins, suggesting that
the regulation of methylation levels in key developmental regions and housekeeping proteins is
very stringent. This is in line with our findings, as we also noticed that very few epiSNPs
occurred in the coding regions of genes related to the aforementioned functions. Second, they
found that only a small fraction of ASM sites are clustered (at 94 genes) and few of these sites
(21.3%) are associated with ASE. Estimates of ASE differ quite a bit between studies, although
they are not directly comparable due to variations in method and tissue source. For example, Li
et al utilized whole-genome bisulfite sequencing to examine the methylome of one human
subject, focusing on DNA from WBC; they found that, when ASM occurred within 2Kb of a
TSS, more than 80% of genes demonstrated ASE [301]. Unfortunately, we were unable to
examine epiSNP-associated expression differences in our particular brain samples, as the RNA
quality was no longer high enough for use with expression arrays, despite the impeccable DNA
quality. There are several published expression datasets available that utilized many of our
brainbank samples; however, upon further inspection it was apparent that the number of samples
and the number of epiSNP-associated genes included on the arrays were too low to accurately
estimate ASE with sufficient power. Third, when Xie et al examined the neighbouring SNPs of
the ASM CG-sites, they found an over-representation of several motifs on the hyper- and
unmethylated alleles, respectively, but these motifs also existed in hyper- and unmethylated
sequences on a genome-wide scale. They cross-referenced the motifs to recently-published
human methylomes and found that some had very strong correlations with methylation indexes
between mice and humans [320].
One key question that most studies have not addressed is the potential for epiSNPs to actually
represent “epi-haplotypes.” Any epiSNP that we detect may be part of a haplotype block with
another SNP (or several SNPs) where only one of them is truly associated with the methylation
level, and this complicates our selection of the SNPs that should have their surrounding
sequence examined. Other studies also seem to have this issue, as they only consider SNPs
associated with differential methylation at a given CpG site, and do not examine the
contribution of epi-haplotypes. When a single, known haplotype is in question, it is much easier
to investigate ASM related to that block. Bell et al studied a 46Kb LD block of the FTO obesity
susceptibility haplotype and discovered a 7.7 Kb epi-haplotype region encapsulating a highly
conserved non-coding element that represents a validated long-range enhancer, supported by the
histone H3K4me1 enhancer signature [181]. This study was quite feasible, as the target was
already defined and only a small number of CpG sites were interrogated in 60 individuals, but it
highlights the importance of combined genetic-epigenetic studies, as well as the ability of
epiSNPs to form epi-haplotypes. We investigated the possibility that any one SNP was in LD
with another SNP that caused or disrupted an MSRE site and found no evidence for this
association; however, we did not take into account the LD between SNPs outside of MSRE
sites, as this sort of association still requires the presence of an actual epiSNP and could not be
created by false positives alone. In order to distinguish between these two scenarios, an
exhaustive analysis would need to be performed that grouped all nearby epiSNPs (the maximum
group size is unknown, so many permutations would have to be run), utilized LD values to
predict haplotypes, and then tested methylation levels for each group size. Exact haplotypes
could have been determined without using LD-based predictions if we had attempted a deep
sequencing approach instead of microarrays, but that would be prohibitively expensive or would
severely limit our number of samples, which would lead to a loss of SNP variation and decrease
in power. Due to these irresolvable issues and time/computational constraints, an epi-haplotype
analysis would require an enormous effort and was not within the scope of the current study.
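The grouping step outlined above can be illustrated with a minimal Python sketch. It is not the analysis performed in this thesis: it assumes phased haplotypes coded as 0/1 alleles (toy data), and the function names, the r^2 threshold of 0.8 and the restriction to adjacent SNPs are all hypothetical choices made for illustration. A full epi-haplotype analysis would additionally test methylation levels for each candidate group.

```python
from typing import List, Sequence


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """Pairwise LD (r^2) between two biallelic SNPs from phased 0/1 alleles."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("allele vectors must be non-empty and of equal length")
    p_a = sum(a) / n                               # frequency of allele 1 at SNP A
    p_b = sum(b) / n                               # frequency of allele 1 at SNP B
    p_ab = sum(x * y for x, y in zip(a, b)) / n    # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                                 # monomorphic SNP: treat as no LD
    d = p_ab - p_a * p_b                           # disequilibrium coefficient D
    return d * d / denom


def group_adjacent_snps(haplotypes: List[Sequence[int]],
                        threshold: float = 0.8) -> List[List[int]]:
    """Merge position-ordered neighbouring SNPs into blocks when r^2 >= threshold."""
    if not haplotypes:
        return []
    blocks = [[0]]
    for i in range(1, len(haplotypes)):
        if r_squared(haplotypes[i - 1], haplotypes[i]) >= threshold:
            blocks[-1].append(i)                   # extend the current candidate block
        else:
            blocks.append([i])                     # start a new candidate block
    return blocks


# Hypothetical toy data: four SNPs observed on eight phased chromosomes.
toy_haplotypes = [
    [0, 0, 1, 1, 0, 1, 1, 0],
    [0, 0, 1, 1, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
print(group_adjacent_snps(toy_haplotypes))  # [[0, 1], [2, 3]]
```

In this toy example the first two SNPs are in complete LD with each other, as are the last two, while the two pairs are independent, so two candidate blocks are returned.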
Genetic-epigenetic interplay in complex disease
The concepts of epigenetic inheritance and ASM have major ramifications for the study of
complex diseases, as we are no longer considering DNA sequence to be the single mode of
inheritance and are learning that genetics and epigenetics interact in many complex ways. Here,
we have looked at major psychosis from a novel perspective and discovered evidence for a
disturbance in the pattern of SNPs that control their local methylation. Since this finding
involves thousands of SNPs, likely working in concert, it is impossible to explain the connection
between all of these loci and development of the disease with complete certainty; however, we
can make some hypotheses based on key findings and current literature.
The majority of previously-documented epiSNPs appear to operate in cis rather than in trans,
although the actual definitions of these mechanisms are not entirely clear. Two
scenarios exist in the literature: one discusses the distance between an epiSNP and an affected
gene, whereas the other involves an epiSNP and a nearby methylated CpG site. There does not
seem to be a standard distance between an epiSNP and affected gene or site within which
interactions are labeled “cis”; one study used a distance of >1Mb between a SNP and CpG site
to delineate trans interactions [162]. By this definition, our study was only designed to detect
cis effects, as our interrogated unmethylated sequences were less than ~2Kb; thus, all epiSNPs
must be within the immediate vicinity of an affected CpG site. A number of trans effects have
been reported under this definition [21], meaning that our study may have missed ASM
involving a SNP that is far away from its target CpG site. When we consider the distance
between an epiSNP and a nearby gene, arbitrary values are usually assigned; for example, ASM
occurring in introns or up to 43Kb upstream of a gene has been classified as cis [21], but a
2Mb window centred on a gene has also been used as the boundary for cis interactions [300].
The terms “cis” and “trans” are somewhat outdated, and perhaps it is more useful to think of
them as “effects on the same chromosome” and “effects on other chromosomes,” respectively.
Multiple studies in Drosophila [321], mice [322] and humans [21, 323] have concluded that only
about 10% or less of regulatory variants are trans in nature. Again, our study was geared
towards detecting cis interactions, but our finding that about half of our epiSNPs occur in introns
or within 2Kb of genes supports the idea that cis-actions are common. We also discovered a
large amount of intergenic ASM, and these epiSNPs could potentially support either mode of
action, depending on the definition. A 2009 Science paper [300] focused on functional variants
that affect gene expression in cis and the tissue-specificity of their effects, concluding that a)
single SNPs can affect the transcription of multiple genes, b) tissue-specific SNPs are usually
located farther away from the gene, and c) 69 - 80% of regulatory variants are cell type-specific
[300]. Combining their findings with our own, a hypothesis of epiSNP action begins to
materialize, where epiSNPs are frequently in cis, often affecting gene transcription via
modulation of alternative splicing events from within the intronic regions of genes. The
epiSNPs that act in trans potentially affect either the production or binding of a transcription
factor, enhancer or silencer that exerts its effect further downstream, or perhaps the 3D
organization of DNA in the nucleus places the epiSNP in proximity to an effector gene. In the
latter case, histone changes that result from differential methylation caused by the epiSNP may
affect local chromatin arrangement, resulting in indirect trans regulation.
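The two competing definitions discussed above can be made concrete with a short Python sketch. The function name, argument names and example coordinates are hypothetical; the 1 Mb cut-off follows the distance-based definition used in [162], and treating SNP-CpG pairs on different chromosomes as trans under that definition is an assumption.

```python
def classify_interaction(snp_chrom: str, snp_pos: int,
                         cpg_chrom: str, cpg_pos: int,
                         max_cis_distance: int = 1_000_000,
                         chromosome_only: bool = False) -> str:
    """Label a SNP-CpG pair as 'cis' or 'trans'.

    chromosome_only=False: distance-based definition (trans if > max_cis_distance).
    chromosome_only=True:  'same chromosome' versus 'other chromosome' definition.
    """
    if snp_chrom != cpg_chrom:
        return "trans"                  # different chromosomes (assumed trans)
    if chromosome_only:
        return "cis"                    # same chromosome, distance ignored
    distance = abs(snp_pos - cpg_pos)
    return "cis" if distance <= max_cis_distance else "trans"


# Hypothetical coordinates, for illustration only.
print(classify_interaction("chr1", 10_000, "chr1", 11_500))       # cis
print(classify_interaction("chr1", 10_000, "chr1", 5_000_000))    # trans
print(classify_interaction("chr1", 10_000, "chr1", 5_000_000,
                           chromosome_only=True))                  # cis
```

The second and third calls show that the same SNP-CpG pair can be labeled differently depending on which definition is adopted, which is why the choice of definition should be stated explicitly.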
In our GO analysis, we found several categories involved in two pathways that have been
proposed to play a role in the etiopathogenesis of psychosis: one related to glutamate and one
related to insulin. For many years, dysfunction of dopamine within the mesolimbic pathway
has been thought to be the major cause of psychosis, but there is evidence that other
neurotransmitters may be equally important; one likely candidate is glutamate, as the majority of
neurons in the brain use it for neurotransmission [324]. Two drugs that intensify psychotic
symptoms, ketamine and phencyclidine (PCP), both act via blockade of glutamate receptors, but
show very few dopaminergic effects [325]. In the glutamate hypothesis of SZ, N-methyl-D-
aspartate (NMDA) receptors, the major subtype of glutamate receptors, are believed to be
dysfunctional. Animal models have demonstrated that various environmental stressors increase
glutamate release/transmission in limbic/cortical areas and can cause structural changes, such as
dendritic remodeling, reduction of synapses and possibly volumetric reductions in areas where
glutamate is primarily involved [324]. In psychotic or pre-psychotic individuals, levels of
glutamate are altered: low levels of glutamate are observed in the thalami of individuals who are
at risk for psychosis [326], higher levels occur in the associative striatum (precommissural
dorsal-caudate) of high-risk and first-episode psychotic subjects [327], and plasma glutamate
levels were decreased in first-episode SZ and BD subjects, but were restored after treatment
[328]. Selective agonists of group II metabotropic glutamate (mGlu) receptors, such as the
positive allosteric modulator biphenyl-indanone A (BINA), have recently demonstrated efficacy
in treating the positive and negative symptoms of SZ, as well as modulating the activity of
psychotomimetic drugs and reducing the increased glutamatergic transmission associated with
psychotomimetic hallucinogens. Increased excitation of the medial prefrontal cortex is believed
to contribute to the development of SZ, so the ability of BINA and other related compounds to
treat SZ symptoms reinforces the glutamate theory and highlights the importance of the mGlu
receptor, in addition to the NMDA receptor [329]. Several glutamatergic gene and pathway
targets have been identified in GWAS of SZ and, while there are fewer findings for BD, there is
evidence to suggest that the candidates will be complementary to SZ rather than completely
overlapping [330]. This idea mirrors our comparison of SZ versus BD ASM, where there is
some overlap of GO categories and epiSNPs, yet the diseases seem to maintain similar but
distinct epiSNP distributions.
We detected a number of epiSNPs in three GO categories related to glutamatergic signaling
(summarized in Table 3.7), some of which were shared between groups. One category,
GO:0051967, was unique to BD, whereas GO:0035249 was common to SZ and controls, and
GO:0007215 was common to all three cohorts. In the case of this common category, broadly
defined as “glutamate signaling pathway,” the majority of epiSNPs were associated with the
same genes across all cohorts, and these genes coded either glutamate receptors or receptor
subunits. Three of the genes unique to BD also encoded glutamate receptors, but one gene
(APP) codes amyloid beta (A4) precursor protein, which is known to be involved in the
pathogenesis of Alzheimer’s disease [331]. The control- and SZ-unique genes were also related
to glutamate receptors, including genes coding proteins that interact with the receptors, such as
GNAQ, which couples cell surface, 7-transmembrane domain receptors to intracellular signaling
pathways [332], and HOMER2, a protein belonging to a family that regulates glutamate receptor
function [333], but is also implicated in addiction and drug induced neuroplasticity [334]. The
considerable amount of ASM associated with several different receptors in glutamate pathways
supports the glutamate theory of psychosis, as BD and SZ subjects deviate from the epigenetic
regulation patterns observed in control individuals, which could potentially result in alteration
and dysfunction of various glutamate-related pathways.
This same situation exists with our SZ-specific insulin-related GO categories, which were both
involved with insulin secretion. In the 1930s, it was believed that shocking a subject with doses
of insulin high enough to cause a violent reaction and an eventual comatose state, followed by
rescue with glucose, would cure or greatly improve the symptoms of SZ [335]. This therapy
came about after it was noticed that, when insulin was administered to SZ patients to encourage
weight gain, their mental state was positively affected [336]. With time, significant doubts
arose concerning the validity/ethics of the approach, and it was eventually replaced with newer
treatments, but recent research has been reviving the relationship between insulin and SZ. In a
study of serum samples from 19 twin pairs discordant for SZ plus 34 age- and gender-matched
healthy control twins, it was found that the SZ subjects had elevated triglycerides and were more
insulin resistant than their healthy co-twins [244]. Increased serum concentration of insulin in
SZ subjects has been documented at the onset of the disease [286], as well as in antipsychotic-
naïve individuals [285]. This second finding is especially important, as antipsychotic treatment
has been associated with insulin resistance or impaired glucose tolerance, but this and other
studies have shown that these changes are occurring prior to initiation of drug treatment.
Another group investigated antipsychotic-naïve, first-onset SZ subjects and also found increased
serum levels of insulin. Additionally, they used liquid-chromatography mass spectrometry
proteomic profiling to analyse the proteome of stimulated and unstimulated peripheral blood
mononuclear cells from SZ subjects and controls, and identified 18 proteins that were
differentially expressed between first onset cases and controls, 8 of which belonged to the
glycolytic pathway. Differences included increased levels of lactate and glucose transporter-1,
and decreased levels of the insulin receptor; none of these were altered in
antipsychotic-treated patients [337]. Finally, while the expression of energy metabolism genes is
decreased in the brains of SZ patients, stimulation of insulin and insulin-like growth factor (IGF-1)
receptors leads to a reciprocal alteration of genes associated with metabolism and synaptic function.
Pharmacologic interventions that activate these receptors are a promising therapy for SZ, as they
would counter an etiological genomic disturbance [338]. Our GO analysis detected 36 epiSNPs
involved in insulin secretion that were unique to the SZ cohort; these epiSNPs were associated
with a variety of genes, ranging from potassium and calcium channels (KCNB1, CACNA1C) to
regulatory transcription factors (RFX6, RFX3), protein kinase C (PRKCA), a cholinergic
receptor (CHRM3), and a phosphodiesterase (PDE8B). While none of these genes are
exclusively related to insulin secretion, they are all critical pathway components, and these GO
categories were significantly enriched in the SZ group alone. Once more, it is not likely that a
complex disease, such as SZ, will be caused by a small number of “obvious” genes, and perhaps
the actual contributing factors have been so elusive because they represent different profiles of
seemingly unrelated genes that are discretely dysregulated (i.e. epigenetic dysregulation versus
genetic mutation), all of which are only responsible for a small portion of the pathology.
Although they seem to be completely separate pathways, there is a link between insulin
resistance and glutamate in psychosis. Glutamate decarboxylase (GAD) is the enzyme that catalyzes the decarboxylation of
glutamate to produce GABA [339], and expression of one isoform, GAD65, is decreased within
the axon boutons of interneurons in SZ patients; the decrease correlates with decreases in
GAD65 protein levels and dendritic spines [340]. There is also evidence that autoantibodies to
GAD isoforms can contribute to the development of chronic psychotic disorders [341, 342],
type 1 diabetes and latent autoimmune diabetes in adults [343, 344]. When young non-obese
diabetic (NOD) mice were injected with anti-idiotypic antibodies directed to the GAD65Ab, the
incidence rate of type 1 diabetes and time of onset significantly decreased [345]. The
connection between GAD autoantibodies and these conditions is relatively recent, and their
actual role in pathogenesis has yet to be determined. It is evident, however, that fluctuations in
one of these systems could result in disturbances in the other, meaning that epiSNP-induced
dysregulation of one pathway may also exert an effect via the other, and it is possible that these
effects could be additive.
The role of epigenetic mechanisms in psychiatric diseases is only beginning to solidify, but it is
already evident in major psychosis, Alzheimer’s disease, autism spectrum disorder, fragile X
syndrome [346], and several other conditions not previously mentioned in this thesis, such as
Rubinstein-Taybi syndrome [347], addiction [348, 349], and Huntington’s disease [350].
Maintenance of DNA methylation and histone modifications is crucial for normal
neurodevelopment and functioning of the brain – dysregulation of these components is
deleterious to the subject and can predispose to any of the aforementioned disease phenotypes.
Previous studies of psychiatric conditions have concentrated on the contributions of genetic and
environmental factors, but the impact of epigenetic mechanisms on neural function and gene
regulation cannot be ignored. While DNA sequence and external influences likely play an
important role in disease etiology, it is the interplay between epigenetics, DNA sequences and
environment that should become the focus of future work, with epigenetics bridging the gap
between genes and environment. New discoveries related to epigenetic inheritance and ASM are
evidence that we are not even aware of all the fundamental epigenetic mechanisms, and that a
considerable effort must be devoted to this field of research.
Future directions
As we gain a better understanding of genetic-epigenetic interactions, such as those described
here, the next step is to utilize these findings to improve upon our current strategies for studying
genomic activity and understanding normal molecular interactions. Here, we have explored the
epigenetic differences between MZ co-twins using CpG island microarrays, which was a large-
scale approach at the time of its undertaking. Since then, technology has advanced, and much
larger-scale tools are now available, such as tiling arrays and next generation sequencing
platforms. Future studies should utilize these technologies to further study the epigenetics of
MZ twin discordance and to investigate the contributions of DNA methylation and histone
modification in the context of phenotypic discordance.
The first goal would be to identify epigenetic differences that correlate with phenotypic
differences in disease states, as well as in normal, non-pathological traits. Integrative
approaches should be used for the simultaneous study of epigenetic and genetic factors, i.e. we
should begin to use DNA methylation level and histone modification information to stratify
GWAS into EWAS. The use of this additional level of biological regulation will facilitate the
identification of risk epi-alleles, which may be more informative than the purely genetic risk
alleles that are currently being discovered, as certain DNA risk factors may only become
detectable when their methylation state is taken into consideration. In the past several decades,
thousands of quantitative trait loci (QTL), stretches of DNA associated with the gene for a
given trait, have been mapped using linkage and association analyses [351]; however, small
effect sizes and the small proportion of the variance attached to individual QTLs have greatly
slowed the connecting of QTLs to genes [352]. The introduction of an epigenetic element may
facilitate the mapping of QTL that influence complex traits, and we have already observed
instances where epigenetic factors are responsible for the phenotype. If the DNA sequence of
the agouti locus (previously mentioned in the literature review) had been studied alone, with no
consideration of methylation status, the effect of this QTL would not be fully understood, as the
methylation within the 5’ transposon is the true predictor of the phenotype [353].
Presently, the EWAS approach is being utilized in many small-scale or targeted studies, mainly
those focused on cancer, but with our increasing knowledge of epigenetic mechanisms, EWAS
methods should become the standard for all risk factor screening endeavours. It has recently
been suggested that simply knowing the DNA sequence of an individual is not sufficient to
predict whether or not they will develop a disease in their lifetime, as personal choices, lifestyle
and random events can cause or prevent nearly every disease [354]. This is discouraging news
for genetics, but highlights the importance of epigenetics and combined genetic-epigenetic
studies in personal medicine, as epigenetic mechanisms can be affected by the environment and
may mediate a portion of its effects. A comprehensive re-analysis of GWAS-derived candidate
genes, haplotypes and individual SNPs should be attempted wherever possible and, hopefully,
technology will advance to the point where these screens can be completed simultaneously in a
cost-effective manner, perhaps through the design of dual-purpose microarrays.
The next step would be to determine the origin of these epigenetic differences, and the
mechanism by which they exert their effects – are they influenced by specific environmental
stimuli or do they arise stochastically? Once the causes have been identified, the development of
treatments and preventative strategies can begin for the pathological traits; in the case of non-
pathological phenotypic traits, we simply wish to understand the underlying mechanisms
responsible for their variability. The relationships between environmental exposures, genetic
states and epigenetic factors will become more comprehensive when examined using integrative
approaches and large sample sizes, and this should facilitate the discovery of epigenetic
mechanisms in discordant MZ twins.
Another goal of future work will be to expand upon our knowledge of epigenetic inheritance, as
much remains unknown regarding this critical biological phenomenon. As mentioned in the first
goal, subsequent studies should incorporate new technologies and drastically increase sample
sizes to study epigenetic inheritance in humans, as well as model systems. Experiments should
focus on the detection of epigenetically-heritable signals in the context of heritable traits and
diseases, taking into account both DNA methylation and histone modification. It has been
shown that histone modifications are heritable and can potentially affect the regulation of
transcription in germ and somatic cells of subsequent generations. For example, when a
Caenorhabditis elegans (C. elegans) ortholog of the H3K4me2 demethylase LSD1/KDM1 is
mutated, the worms show increasing sterility with each generation, and this sterility correlates
with misregulation of spermatogenesis-expressed genes and transgenerational accumulation of
dimethylation of histone H3 on lysine 4. It was hypothesized that erasure of H3K4me2 by
LSD1/KDM1 in the germline prevents the “epigenetic memory” from being inappropriately
transmitted between generations [355]. Furuhashi et al. have suggested that the use of “elegant
model systems” is crucial for the understanding of transgenerational histone modification
effects, namely the manipulation of C. elegans, as this species does not exhibit any DNA
methylation and encodes all of its epigenetic information using histones [309]. Examination of
histone patterns was not within the scope of our study, but future studies should definitely
explore this topic.
Regarding ASM, a third goal should be to investigate epiSNP 3D interactions and search their
surrounding sequences for motifs and other clues about their origins, but we must first
determine if we are dealing with epiSNPs, epi-haplotypes or both because, as previously
mentioned, a pure sequence analysis would be somewhat futile in an epi-haplotype scenario.
We must also study their stability and the factors that influence it, considering the possibility
that epiSNPs are not static and are subject to change – can their methylation levels fluctuate as
rapidly as those observed in the Barres study [310]? If so, are fluctuations long- or short-acting,
and can they be stimulated by medications, chemicals in food, water or air, stress and emotional
state, physical activity or any number of other factors? While the few known
imprinted genes display strong, parent-of-origin-specific monoallelic gene expression, it has
been documented that about 20% of autosomal genes may also show some differences in allelic
expression [356, 357]. The idea of a dynamic adaptation of genomic regulation using epiSNPs
is intriguing, although it does not explain the conservation that we are seeing at many loci.
Perhaps there is a selective pressure at play in the establishment and maintenance of epiSNPs.
In order to study ASM mechanisms, it is necessary to select some target regions and focus on
them, rather than looking at the entire genome; the purpose of our study was to identify instances of
ASM and observe trends, but the design is not suited to the exploration of exact mechanisms.
Ideally, deep-sequencing techniques would be employed, providing single-base resolution and
allowing for the convenient, reliable detection of epi-haplotypes. Analysis of epi-haplotypes
offers the extra benefits of reducing the number of tests required, as we are no longer
considering individual SNPs, and it would also permit the undertaking of evolutionary studies.
Also related to the study of epiSNP function are the questions of downstream interactions with
genes or regulatory elements and the relationship between ASM loci and imprinted loci. It has
been hypothesized that imprinting provides a means for offspring to adapt to the environment
before birth, and perhaps epiSNPs also serve this purpose. It is already known that ASM exists
in mice [320], so an analysis of epiSNP establishment in inbred mice could be very informative,
if different developmental time-points and maternal exposures are considered; this sort of
experiment should focus on previously-identified ASM hotspots, and could be extended to
include a study of gametic ASM. Our current studies have only considered 5-mC, but
technology is advancing and it will soon be possible to examine 5-mC and 5-hmC separately.
This is recommended for future experiments, especially in brain, liver, kidney and colorectal
tissues, where 5-hmC is relatively abundant [358], as consideration of both forms of methylation
may provide more information on the function of these modifications. On the subject of
downstream interactions, the influence of epiSNPs on transcription factors and their binding
sites, enhancers, silencers, ncRNA and miRNA should be studied, as should the impact of
intronic and UTR epiSNPs on the splicing and conformation of mRNA.
A fourth and final goal is the development of effective epigenetic pharmacotherapies that can be
used to correct dysregulated methylation levels and histone modifications. A number of
treatments based on epigenetic principles already exist, but they are severely limited by their
non-specificity, lack of efficacy and impermanent nature. Many of these drugs affect enzymes
that modify DNA and histones, but targeting etiological disease epimutations may be even more
promising, and compounds with higher specificity would become attractive choices for the
treatment of diseases other than cancer. One way to target specific sequences involves the use of
aptamers, which are small RNA/DNA molecules that form secondary and tertiary structures that
specifically bind proteins or other targets, much like a synthetic antibody. Aptamers are
chemically synthesized and easily conjugated with siRNA and nanoparticles, and preclinical
studies have shown great potential of these molecules in mouse models of cancer and HIV
[359]. It may also be possible to exploit the properties of zinc-finger proteins (ZFPs) and RNA
interference (RNAi) for the development of future epigenetic therapeutics. Zinc-finger proteins
specifically recognize and bind short stretches of DNA sequence (typically 9-18 base pairs)
[360], and they can be used to carry out a variety of cellular activities when they are combined
with different domains. In theory, an epimutation could be resolved if an epigenetically
dysregulated gene is treated with a corresponding histone or DNA modification enzyme
attached to a gene specific ZFP; the ZFP will specifically bind to the epimutation locus, while
the modification enzyme permanently repairs the damage. Another promising technology is
based on RNAi, which involves double-stranded RNA-induced destruction of homologous
mRNA, thus disabling protein production. Small interfering RNAs (siRNA) are endogenously
produced and incorporated into an RNA-induced silencing complex (RISC), which then targets
and cleaves mRNA transcripts [361]. It is believed that RNAi may have an impact on local
chromatin structure, heterochromatin assembly, and gene silencing, although mechanistic details
as to how the RNA and chromatin connect remain unclear [362]. Several siRNAs have recently
been created, including ones to knock down β-secretase (BACE1) in Huntington's and
Alzheimer's disease, SCA1 in spinocerebellar ataxia, superoxide dismutase (SOD1) in
amyotrophic lateral sclerosis [363], and Toll-like receptor 4 (TLR4) in a rat model of acute lung
injury [364]. A clinical trial has also been submitted to the FDA, proposing the use of siRNA
against vascular endothelial growth factor in cases of age-related macular degeneration [365].
Great potential exists for the therapeutic use of siRNA to knock down mutated proteins in
various disease states, although issues of nonspecific silencing of partially homologous genes,
safe delivery and inhibition of microRNA (miRNA) must first be resolved.
The number of known epigenetic target genes and sequences has been steadily increasing,
especially in a variety of cancers, and their clinical utility is beginning to be noticed. MGMT
promoter methylation can be used to stratify elderly glioblastoma patients for treatment, as only
those with this methylation are sensitive to alkylating agent chemotherapy [366]. In patients
with early stage ER-negative breast cancer, classification of the methylation levels of
tumor-specific and tumor-related genes is an independent prognostic factor [367], although these
classifications are not yet linked to preferable treatment strategies. Some biomarkers have been
officially approved for use in the monitoring of cancer, such as CA125 and HE4, which were
approved by the FDA as biomarkers of ovarian cancer [368]. Ovarian cancer has shown considerable
resistance to drug therapy (namely cisplatin and carboplatin); however, the ability of
epigenetic treatments to re-sensitize the tissue is under investigation [369, 370]. Natural
epigenetic modulator compounds, such as epigallocatechin-3-gallate (EGCG), a catechin
(flavonoid) from green tea, sulforaphane (SFN), an organosulfur from cruciferous vegetables,
and genistein, an isoflavonoid from soybean, have also demonstrated an ability to inhibit
ovarian cancer cell proliferation, while offering a safer adverse effect profile [51]. Other
promising plant-derived epigenetic therapeutics are also emerging, including alkaloids,
terpenoids and many polyphenol compounds [371]. Perhaps, in the next several years, further
identification of epigenetic biomarkers and operationalization of new, effective diagnostics and
treatments will become feasible for psychiatric and various other complex diseases.
Appendices
Appendix I. Twin Study Supplementary Notes
Correlation of MZ co-twin epigenetic variation with WB cell counts
The spot-wise correlation between twin pair loess M log ratio values and WB cell counts did
not yield any significant loci after correction for multiple testing. The numbers of genes
associated with loci showing an uncorrected significance value of P<0.001 in the whole WBC,
neutrophil, and lymphocyte fractions were 6, 10, and 8, respectively. Of the genes associated
with identified microarray probes beyond this threshold, 3 genes (EOMES, PDCD2, and
PTPN9) are related to immune system function [372-374]. While there is
a possibility that these loci have surfaced by chance, the correlation between DNA
methylation status and various immune system related genes suggests that some of the
differences detected in this tissue could be a result of cellular sub-fraction differences between
these twins. However, the proportion of seemingly relevant correlations is less than 0.04% of the
total number of unique loci, which may be a testament to the effectiveness of matching WBC
cellular sub-fractions prior to epigenomic profiling.
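The effect of correcting for multiple testing at this scale can be illustrated with a short sketch. It assumes roughly 6,000 tested loci and uses the Benjamini-Hochberg false discovery rate procedure purely as an example; the function name `bh_adjust` and all P values are hypothetical, and the sketch does not reproduce the correction actually applied in the study.

```python
def bh_adjust(pvals):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest P first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical scenario: 10 loci with raw P = 0.0009 among 6,000 tests,
# with every other locus showing no signal at all (P = 1.0).
n_loci = 6000
raw = [0.0009] * 10 + [1.0] * (n_loci - 10)
adjusted = bh_adjust(raw)
print(round(adjusted[0], 2))  # 0.0009 * 6000 / 10
```

Under these assumed values, loci that pass an uncorrected threshold of P<0.001 end up far above a corrected threshold of 0.05, which is the pattern described above.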
Figure A.1. Karyogram of MZ co-twin epigenetic similarity in buccal cells
A chromosomal karyogram depicting levels of MZ co-twin similarity per interrogated locus in the buccal sample.
Black and grey bars on the chromosomes represent chromosomal banding patterns while red bars are indicative of
regions of high microarray probe density. Bars to the right of each chromosome represent locus specific ICCs
depicting levels of MZ co-twin epigenetic similarity. FDR corrected P values below the level of P<0.05 are depicted
in green while those with greater P values are depicted in grey.
Figure A.2. Karyogram of MZ co-twin epigenetic similarity in gut
A chromosomal karyogram depicting levels of MZ co-twin similarity per interrogated locus in the gut sample. Black
bars on the chromosomes represent chromosomal banding patterns while red bars are indicative of regions of high
microarray probe density. Bars to the right of each chromosome represent locus specific ICCs depicting levels of
MZ co-twin epigenetic similarity. Raw P values below the level of P<0.05 are depicted in green while those with
greater P values are depicted in grey.
Figure A.3. Karyogram of MZICC-DZICC values in WBCs
A chromosomal karyogram depicting levels of MZ co-twin similarity relative to DZ co-twin similarity per
interrogated locus in the WBC sample. Blue bars to the right of each chromosome represent locus specific ICCMZ-
ICCDZ values.
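The ICCMZ-ICCDZ quantity plotted in this figure can be sketched as follows. The sketch assumes a one-way random-effects intraclass correlation computed over co-twin pairs at a single locus, which may differ from the estimator used in the study; the function names and the log-ratio values are hypothetical.

```python
def pair_icc(pairs):
    """One-way random-effects ICC for a list of (twin1, twin2) values."""
    n = len(pairs)
    k = 2  # two co-twins per pair
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    pair_means = [(a + b) / k for a, b in pairs]
    # Between-pair and within-pair mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, pair_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def icc_difference(mz_pairs, dz_pairs):
    """ICC of MZ pairs minus ICC of DZ pairs for one locus."""
    return pair_icc(mz_pairs) - pair_icc(dz_pairs)


# Hypothetical log-ratio values at one locus (one tuple per twin pair).
mz = [(0.10, 0.12), (0.50, 0.48), (0.90, 0.93), (0.30, 0.29)]
dz = [(0.10, 0.30), (0.50, 0.35), (0.90, 0.70), (0.30, 0.45)]
print(round(icc_difference(mz, dz), 3))
```

A positive difference indicates that, at this locus, the MZ co-twins in the toy data resemble each other more closely than the DZ co-twins do.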
Figure A.4. Karyogram of MZICC-DZICC values in buccal cells of MC MZ twins
A chromosomal karyogram depicting levels of MZ co-twin similarity relative to DZ co-twin similarity per
interrogated locus in the MC buccal sample. Blue bars to the right of each chromosome represent locus specific
ICCMZ-ICCDZ values.
Appendix 2. Allele-Specific Methylation Study Supplementary Notes
ID Status Age Sex Race COD PMI Brain pH Brain Wt Age of Onset
2 BD 29 M white SUIC:CO 60 6.7 1430 17
3 SZ 43 M white PNEUMONIA 26 6.42 1480 22
4 BD 45 M white CARDIAC 28 6.35 1480 35
5 BD 41 M nat amer SUIC:OD 70 6.71 1625 22
6 BD 29 F white OD 62 6.74 1330 18
8 BD 44 M white SUIC:HANGING 19 6.74 1660 33
9 SZ 45 F white SUIC:JUMPED 52 6.51 1510 34
10 SZ 40 M white PNEUMONIA 34 6.18 1480 21
11 SZ 51 M white CARDIAC 43 6.63 1390 23
12 SZ 19 M white OD 28 6.73 1465 18
13 BD 49 F white SUIC:MVA 19 5.87 1380 22
14 BD 48 F white CARDIAC 18 6.5 1205 33
15 C 44 F white CARDIAC 28 6.59 1330 NA
16 BD 42 M white DROWNING 32 6.65 1470 18
17 SZ 53 F white CARDIAC 13 6.49 1345 29
18 BD 35 M white CARDIAC 35 6.3 1490 19
19 C 49 M white CARDIAC 46 6.5 1605 NA
20 BD 59 F white SUIC:OD 53 6.2 1410 48
21 BD 54 M white SUIC:OD 44 6.5 1510 45
22 SZ 37 M white CARDIAC 30 6.8 1550 13
23 BD 35 F white SUIC:CO 17 6.1 1250 21
24 C 53 M white CARDIAC 9 6.4 1500 NA
26 SZ 24 M white SUIC:OD 15 6.2 1505 20
27 C 37 M white CARDIAC 13 6.5 1600 NA
28 BD 45 M black KETOACIDOSIS 35 6.03 1300 16
31 SZ 34 M white EXHAUSTIVE MANIA/NMS 9 5.9 1415 19
32 BD 42 F white OD 49 6.65 1335 20
33 C 38 F white CARDIAC 33 6 1120 NA
36 BD 41 M white OD 39 6.6 1375 21
37 SZ 39 M white MVA 80 6.6 1355 17
38 C 60 M white CARDIAC 47 6.8 1460 NA
39 SZ 33 M white CARDIAC 29 6.5 1470 19
40 SZ 50 M white CARDIAC 9 6.2 1400 31
41 SZ 43 M white CIRRHOSIS 18 6.3 1520 18
42 BD 64 M white PNEUMONIA 16 6.1 1340 19
43 C 35 M white MYOCARDITIS 52 6.7 1700 NA
44 SZ 32 F white SUIC:JUMPED 36 6.8 1340 29
45 SZ 35 M white CARDIAC 47 6.4 1370 14
46 BD 59 M white SLEEP APNEA 84 6.65 1300 25
47 SZ 44 M white CARDIAC 32 6.67 1560 9
48 BD 55 F white CARDIAC 41 5.76 1270 40
49 C 34 M white CARDIAC 22 6.48 1480 NA
50 BD 51 F white SUIC:BLEEDING 77 6.42 1120 35
51 C 47 M white CARDIAC 21 6.81 1550 NA
52 C 45 M white CARDIAC 29 6.94 1405 NA
53 C 34 F white CARDIAC 24 6.87 1255 NA
54 C 42 M white CARDIAC 37 6.91 1340 NA
55 SZ 47 M white ACUTE PANCREAT 13 6.3 1310 20
56 C 44 F white CARDIAC 10 6.2 1305 NA
58 BD 63 F white CARDIAC 32 6.97 1290 43
59 C 57 M white CANCER 26 6.4 1470 NA
61 BD 44 F white MYOCARDITIS 37 6.37 1200 26
62 BD 56 F white DROWNING 26 6.58 1170 14
63 BD 43 F white SUIC:OD 39 6.74 1505 25
64 BD 35 M white DROWNING 22 6.58 1390 14
65 C 49 M white CARDIAC 23 6.93 1390 NA
66 SZ 45 M white CARDIAC 35 6.66 1390 15
67 C 35 M white CARDIAC 24 7.03 1415 NA
69 BD 50 F white SUIC:OD 62 6.51 1400 25
70 C 55 M white CARDIAC 31 6.7 1515 NA
71 C 49 F white CARDIAC 45 6.72 1435 NA
72 BD 49 F white OD 38 6.39 1190 20
73 SZ 53 M white CARDIAC 38 6.17 1400 23
74 BD 33 F white SUIC:HANGING 24 6.51 1450 15
75 SZ 54 F white PNEUMONIA 42 6.65 1170 17
76 BD 41 F white CARDIAC 28 6.44 1360 14
77 C 33 F white ASTHMA 29 6.52 1360 NA
78 SZ 44 F white POSS PULM THROMB 26 6.58 1490 16
79 C 48 M white CARDIAC 31 6.86 1580 NA
80 C 50 M white CARDIAC 49 6.75 1645 NA
81 SZ 47 F white OD 30 6.47 1430 23
82 SZ 39 M white SUIC:HANGING 26 6.8 1470 34
83 C 32 M white CARDIAC 13 6.57 1410 NA
85 SZ 38 M hispanic OD 35 6.68 1210 17
86 C 46 M white CARDIAC 31 6.67 1360 NA
87 SZ 41 M white CARDIAC 54 6.18 1629 20
88 SZ 43 M white SUIC:HANGING 65 6.67 1490 25
89 BD 43 F white OD 57 5.92 1340 29
90 C 40 M white CARDIAC 38 6.67 1498 NA
93 SZ 47 F white CARDIAC 35 6.5 1575 20
94 SZ 42 M white CARDIAC 19 6.48 1310 18
95 C 31 M white PULM EMBOL 11 6.13 1335 NA
97 SZ 46 M white PNEUMONIA 30 6.72 1630 22
98 BD 56 M white SUIC:OD 23 6.07 1670 28
99 C 39 F white CARDIAC 58 6.46 1260 NA
100 SZ 59 F white CARDIAC 38 6.93 1515 14
101 SZ 52 M white PNEUMONIA 16 6.52 1340 19
102 BD 48 M white SUIC:HANGING 23 6.9 1466 31
103 C 47 M white CARDIAC 36 6.57 1535 NA
104 BD 19 M white OD 12 5.97 1484 17
105 C 41 F white CARDIAC 50 6.17 1290 NA
Table A2.1. Stanley sample demographics
Demographic information for brain samples obtained from the Stanley Medical Research Institute brain collection.
Code COD
1 heart attack/disease
10 Trauma
11 pneumonia/resp infection
12 Sepsis
13 Other
14 dehydration, starvation
15 Hanging
16 Seizures
17 chronic obs. Pulmonary
19 Choking
2 Cancer
20 Asphyxia
21 GI bleed
22 renal disease
23 smoke inhalation
3 stroke, cerebrovasc dis
4 general arteriosclerosis
5 Complications
6 CO poisoning
7 Drowning
8 shooting/stabbing
9 drug OD
a Accidental
aaa abdominal aortic aneurysm
asp Aspiration
chf congestive heart failure
E Emphysema
fro frozen to death
gan gangrene, infection
gs gunshot wound
inf Infection
l&r liver and renal failure
MI myocardial infarction
mva motor vehicle accident
PE pulmonary embolism
pgl pontine glioma
ref respiratory failure
s Suicide
ska ski accident
sys systemic failure
u Unknown
vd vascular dementia
SZ = schizophrenia
SA = schizoaffective
BD = bipolar disorder
INR = insufficient records
ID Sex Status Age (decade) PMI Race Age of Onset COD
1003 F C 51-60 24h unknown na 2
1005 F C 71-80 12.5 unknown na l&r
1008 F C 61-70 22.5 white na 2
1011 M C 61-70 22.33 unknown na 1
1013 M C 31-40 18.75 white na a
1014 M C 31-40 20 unknown na MI
1020 M C 71-80 20.53 unknown na u
1021 M C 31-40 25.67 unknown na u
1022 M C 81-90 7.42 white na 2
1024 M C 71-80 20.92 white na 2
1025 F C 71-80 23.91 unknown na 2
1026 M C 31-40 28.83 unknown na MI
1028 F C 61-70 24.25 unknown na u
1029 F C 61-70 7.42 white na 2
1030 M C 41-50 18.33 unknown na MI
1032 M C 41-50 24.13 unknown na u
1034 M C 31-40 16.6 unknown na MI
1046 F C 71-80 14.1 white na MI
1047 M C 61-70 15.3 white na ref
1049 F C 61-70 15 unknown na 1
1055 M C 61-70 18.7 unknown na 1
1070 M C 61-70 16.05 unknown na 1
1071 F C 71-80 22.75 unknown na 22
1072 F C 71-80 18.5 unknown na u
1074 M C 41-50 27.23 unknown na u
1075 M C 41-50 20.61 unknown na MI
1078 F C 61-70 22.55 unknown na MI
1079 F C 71-80 22.67 unknown na MI
1080 M C 41-50 24.32 unknown na MI
1086 M C 51-60 21.83 unknown na u
1087 F C 51-60 23.08 unknown na u
1088 M C 41-50 30.4 unknown na 1
1093 M C 61-70 29.36 unknown na u
1099 M C 71-80 21.25 unknown na u
1107 F C 71-80 20.3 unknown na u
1110 M C 41-50 27.13 unknown na 1
1111 F C 51-60 23.78 unknown na u
1117 M C 61-70 20.97 unknown na 1
1118 M C 41-50 14.68 unknown na u
1123 F C 71-80 26.67 unknown na u
1124 M C 51-60 24.42 unknown na 1
1125 M C 41-50 19.88 unknown na MI
1127 F C 51-60 24.25 unknown na 1
1128 M C 71-80 25.23 unknown na u
1129 M C 11-20 19.83 unknown na 1
1132 F C 31-40 18.08 unknown na u
1135 F C 81-90 17.42 white na 2
1137 M C 51-60 18.15 unknown na MI
1139 M C 51-60 21.88 unknown na 1
1141 F C 41-50 20.25 unknown na 1
1004 M SZ 61-70 19.9 unknown na u
1009 F SZ 71-80 24 white na 2
1010 F BD 71-80 20.83 white na 12
1012 M BD 71-80 14.25 white 18 11
1015 F BD 71-80 17 white 35 u
1016 M SZ 61-70 22.35 white na 1
1017 F SA 51-60 18 white na 2
1018 M BD 31-40 30.75 white na s
1019 M BD 31-40 22 white na 6
1027 M BD 71-80 27.66 white 20 u
1031 M BD 81-90 5.02 white na u
1036 M SZ 41-50 19 white na 11
1037 M SZ 31-40 28 white na 1
1038 M SZ 41-50 18.1 white na 6
1039 F SA 71-80 13.4 white na u
1041 M BD 71-80 30.2 white na u
1043 M SZ 41-50 27.1 white na 2
1045 F BD 41-50 15.8 white na u
1048 F BD 71-80 11.6 white na 3
1051 F BD 61-70 11 white na E
1052 F SZ 81-90 23.25 white na u
1053 M BD 31-40 41.5 white na gs
1054 F SA 81-90 25.75 white na chf
1056 M BD 51-60 31 white na u
1059 F BD 71-80 22.8 white na 1
1060 F SZ 71-80 21.75 white na 2
1061 F SA 41-50 33.78 white na 1
1064 M BD 71-80 24.8 white 52 11
1065 M SZ 41-50 19.08 white 19 s
1069 M SZ 41-50 24.5 white na MI
1073 F SZ 81-90 15.67 white na u
1076 M SZ 61-70 16.47 white 20 11
1077 M SZ 41-50 29.06 white na 2
1081 M SZ 61-70 15.95 white na E
1083 F BD 71-80 19.86 white 20 11
1084 F BD 81-90 14.08 white 35 u
1085 F BD 21-30 24.17 white 14 9
1089 F SZ 61-70 27.8 white 10 MI
1090 F BD 71-80 33.33 unknown na u
1092 F BD 41-50 16.25 white 18 u
1094 M SZ 41-50 17.67 white 2 MI
1095 M SZ 61-70 25.33 white na 12
1096 M SZ 51-60 32.38 white 20 chf
1097 M SZ 41-50 17.75 white 20 2
1098 F SZ 51-60 16.12 white na 2
1100 M SZ 51-60 24.53 white 18 19
1101 F BD 71-80 22.62 white 22 2
1102 F SZ 71-80 28.8 white na 2
1103 M BD 61-70 27.17 unknown na ref
1104 M SZ 61-70 21.43 white 15 MI
1105 M INR 11-20 17.5 unknown na mva
1106 M SZ 51-60 20.08 white 21 MI
1109 F BD 61-70 25.3 white 23 11
1112 M BD 61-70 29.48 white 56 11
1113 F SZ 61-70 11 white 21 MI
1114 F BD 71-80 21.63 white 20 2
1115 M SZ 41-50 33.25 white 19 u
1116 F BD 61-70 13.37 unknown 50 u
1119 M BD 61-70 17.25 white 27 22
1120 M SZ 51-60 38.25 unknown 19 u
1121 F BD 31-40 21.92 white 22 u
1122 F BD 51-60 17.22 white 16 l&r
1126 F SZ 51-60 18.72 white na 2
1130 M BD 21-30 19.83 white 19 s
1131 F BD 71-80 22.92 white 50 u
1134 F BD 71-80 24.75 white 20 14
1136 F BD 51-60 30.1 white na 1
1140 M SZ 41-50 32.5 white 18 s
1142 F BD 71-80 21.46 unknown na u
Table A2.2. Harvard sample demographics
Demographic information for brain samples obtained from the Harvard Brain Tissue Resource Center.
ID Status Age Additional Conditions Smoke Medication Ethnicity
C15 control 29 none no none East Indian
C16 control 34 none no none Caucasian
C28 control 40 cleft palate surgery x3 no none Caucasian
C44 control 42 none no none Asian
C45 control 29 none no none East Indian
C46 control 30 childhood heart murmur yes none Caucasian
C55 control 27 none no none Caucasian
C57 control 63 osteoarthritis, GERD no Losec, Pregabalin Caucasian
C58 control 26 none no none Asian
C63 control 64 none no none Caucasian
C64 control 35 none no none Caucasian
C71 control 27 none no none Caucasian
C77 control 34 none no none Caucasian
C78 control 47 none no none Caucasian
C79 control 35 none no none Caucasian
C93 control 50 none no none Caucasian
C94 control 44 chronic back pain, injury no NSAIDs, morphine, muscle relaxants Caucasian
C95 control 38 tremor in hands yes propranolol Caucasian
C102 control 50 hepatitis C, alcoholic yes none Native
C103 control 26 none no none Caucasian
C104 control 41 none yes Cephalexin Mixed
C110 control 33 none no none Mixed
C122 control 26 none no none East Indian
C125 control 53 none no none Caucasian
S1 bipolar 35 none no Seroquel, Tegretol, lithium Caucasian
S4 bipolar 44 penicillin allergy no Lithium, Epival, Zyprexa Caucasian
S22 bipolar 55 alcoholic no none Caucasian
S23 bipolar 24 learning disability, migraine no Epival, Seroquel Caucasian
S27 bipolar 42 Seizures - post-concussion no Epival Caucasian
S37 bipolar 22 none no Epival, Zoloft, clonazepam Caucasian
S41 bipolar 59 none no none Caucasian
S45 bipolar 32 none no Lamictal Caucasian
S46 bipolar 27 depression, suicidal yes previously took Zyprexa, Cipralex Caucasian
S47 bipolar 38 spinal disc herniation no Abilify, Lyrica, Lithium, Cymbalta, Seroquel, Topamax Caucasian
S53 bipolar 46 none no none Caucasian
S55 bipolar 36 panic disorder, learning disability, IBS, hemorrhoid no Zyprexa, Clonazepam Caucasian
S60 bipolar 63 gambling, sleep apnea, hearing loss no Wellbutrin, Seroquel, Lithium Caucasian
S61 bipolar 46 none no Risperdal, Lithium, Cogentin Mixed
S62 bipolar 35 cholesterol, sleep apnea, COPD yes Seroquel Caucasian
S68 bipolar 21 ADD, OCD, GAD, mild asthma no Invega, Adderall, previously used Seroquel and Paxil Caucasian
S72 bipolar 54 diabetes II, anxiety, hypertension no Lamictal, Ativan, Cozaar, Lipitor, Januvia, Diamicron, Imovane, Levemir Caucasian
S73 bipolar 42 none yes Epival, Risperdal injections, Zyprexa, Seroquel Caucasian
S85 bipolar 24 depression yes none, previously used Seroquel Caucasian
S86 bipolar 23 none yes Epival Caucasian
S89 bipolar 44 none yes methadone Caucasian
S90 bipolar 45 migraine, varicose veins no Ibuprofen Hispanic
S92 bipolar 46 none yes Loxapine, Celexa, lorazepam Caucasian
S95 bipolar 22 none yes Olanzapine, acetaminophen Mixed
Table A2.3. CAMH sample demographics
Demographic information for sperm cell samples obtained from the Centre for Addiction and Mental Health.
Locus Sample genotype CpG 1 CpG 2 CpG 3 CpG 4 Average
SNP_A-4222947 88 BB 87 49 68
100 BB 87 50 68.5
1009 BB 87 49 68
1049 BB 87 50 68.5
1111 BB 87 51 69
33 AA 85 73 79
1012 AA 87 74 80.5
1065 AA 86 74 80
1093 AA 87 74 80.5
1099 AA 88 74 81
SNP_A-8623123 80 AA 97 99 95 93 96
1009 AA 97 100 100 95 98
1061 AA 97 99 98 94 97
1088 AA 98 99 99 95 97.75
1093 AA 97 100 100 96 98.25
3 BB 96 99 99 95 97.25
1027 BB 97 100 97 93 96.75
1094 BB 96 100 99 95 97.5
1099 BB 97 100 99 92 97
1132 BB 95 100 99 95 97.25
SNP_A-8697241 90 AA 4 14 8 8.67
1009 AA 4 15 8 9.00
1101 AA 4 15 8 9.00
1117 AA 3 15 8 8.67
1127 AA 4 15 8 9.00
103 BB 24 95 59 59.33
1037 BB 25 97 59 60.33
1064 BB 25 98 60 61.00
1088 BB 24 97 57 59.33
1132 BB 24 96 57 59.00
SNP_A-1878011 1049 AA 63 63
1099 AA 62 62
1037 AA 64 64
1060 AA 65 65
1120 AA 64 64
28 AB 64 64
87 AB 66 66
100 AB 63 63
SNP_A-8529885 94 AA 100 100
1017 AA 100 100
1056 AA 100 100
1117 AA 99 99
1126 AA 100 100
1093 BB 100 100
1099 BB 100 100
1111 BB 100 100
1119 BB 100 100
1120 BB 100 100
SNP_A-4259932 70 AB 100 100
67 BB 98 98
73 BB 100 100
1061 BB 100 100
1077 BB 100 100
1117 BB 100 100
Table A2.4. Methylation levels at all CpG sites.
The results from the bisulfite modification and pyrosequencing validation of epiSNPs and non-epiSNPs. Three
highly significant epiSNPs displaying large differences between AA and BB variants were chosen; three non-
epiSNPs were chosen randomly. For each locus, the individual sample codes are listed with their respective
genotypes. A and B designations were assigned arbitrarily during analysis. Methylation percentages returned from
the pyrosequencing are listed for each CpG site; samples had up to 4 sites available for interrogation on the
pyrosequencing amplicon. The far right column lists the average methylation percentage for each sample across the
locus.
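The "Average" column can be reproduced from the per-site values. The sketch below uses the SNP_A-8697241 rows of the table; the function names are hypothetical, and the genotype-level difference it prints is only a descriptive summary, not the statistical test used in the study.

```python
# CpG methylation percentages for SNP_A-8697241, copied from Table A2.4.
cpg_percent = {
    "AA": {"90": [4, 14, 8], "1009": [4, 15, 8], "1101": [4, 15, 8],
           "1117": [3, 15, 8], "1127": [4, 15, 8]},
    "BB": {"103": [24, 95, 59], "1037": [25, 97, 59], "1064": [25, 98, 60],
           "1088": [24, 97, 57], "1132": [24, 96, 57]},
}


def sample_average(sites):
    """Average methylation percentage across the CpG sites of one sample."""
    return sum(sites) / len(sites)


def genotype_average(samples):
    """Mean of the per-sample averages within one genotype group."""
    averages = [sample_average(sites) for sites in samples.values()]
    return sum(averages) / len(averages)


# Per-sample averages, matching the far right column of the table.
for genotype, samples in cpg_percent.items():
    for sample_id, sites in samples.items():
        print(genotype, sample_id, f"{sample_average(sites):.2f}")

# Descriptive difference between the two homozygous groups.
difference = genotype_average(cpg_percent["BB"]) - genotype_average(cpg_percent["AA"])
print(f"BB - AA: {difference:.2f}")
```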
Table A2.5. EpiSNPs and associated gene information
Column 1: epiSNPs discussed in the Results section. Column 2: brain cohorts displaying the epiSNP. Column 3:
sperm cohorts displaying the epiSNP. Column 4: functional class tag of the epiSNP. Column 5: associated gene
name. Column 6: full gene name and function.
Table A2.6. 454 analysis sample genotypes
The number of samples per genotype and cohort is listed per investigated SNP. Random non-epiSNPs were labeled
from 1 to 25, 11 of which were chosen for sequencing based on primer set performance.
SNP dbSNP ID Case AA Case BB Control AA Control BB
SNP 1 10975882 10 10 9 11
SNP 2 5943127 14 6 14 4
SNP 7 11658063 10 10 10 10
SNP 11 3762352 9 9 10 11
SNP 12 219815 10 10 10 10
SNP 15 17551103 15 5 15 5
SNP 18 2859011 10 10 10 10
SNP 19 2059697 9 10 10 10
SNP 21 720080 13 7 14 7
SNP 23 2581651 16 4 15 4
SNP 25 1902675 10 10 10 10
Copyright Acknowledgements
The twin study was published by Nature [375], and their statement regarding use of published
material by an author is listed below. Permission to use the paper has also been granted by Zach
Kaminsky (first author), and by Art Petronis (last author).
Portions of the literature review were taken from review articles that I have previously published
with Annual Review of Pharmacology and Toxicology [376] and Dialogues in Clinical
Neuroscience [377] (Copyright © Les Laboratoires Servier 2010). Permission to use the material
has been provided by Art Petronis (co-author of both articles). Copyright statements from the
journals are provided below.
From: http://www.nature.com/reprints/permission-requests.html
The authors of articles published by Nature Publishing Group, or the authors' designated agents, do not usually need to seek permission
for re-use of their material as long as the journal is credited with initial publication. For further information about the terms of re-use for
authors please see below.
Author Requests
If you are the author of this content (or his/her designated agent) please read the following. Since 2003, ownership of copyright in
original research articles remains with the Authors*, and provided that, when reproducing the Contribution or extracts from it, the
Authors acknowledge first and reference publication in the Journal, the Authors retain the following non-exclusive rights:
1. To reproduce the Contribution in whole or in part in any printed volume (book or thesis) of which they are the author(s).
2. They and any academic institution where they work at the time may reproduce the Contribution for the purpose of course
teaching.
3. To reuse figures or tables created by them and contained in the Contribution in other works created by them.
4. To post a copy of the Contribution as accepted for publication after peer review (in Word or Tex format) on the Author's own
web site, or the Author's institutional repository, or the Author's funding body's archive, six months after publication of the
printed or online edition of the Journal, provided that they also link to the Journal article on NPG's web site (eg through the
DOI).
NPG encourages the self-archiving of the accepted version of your manuscript in your funding agency's or institution's repository, six
months after publication. This policy complements the recently announced policies of the US National Institutes of Health, Wellcome
Trust and other research funding bodies around the world. NPG recognizes the efforts of funding bodies to increase access to the
research they fund, and we strongly encourage authors to participate in such efforts.
From: http://www.annualreviews.org/page/about/copyright-and-permissions
Annual Reviews Authors: There is no need to obtain permission from Annual Reviews
for the use of your own work(s). Our copyright transfer agreement provides you with all
the necessary permissions.
Our copyright transfer agreement provides: “...The nonexclusive
right to use, reproduce, distribute, perform, update, create derivatives, and make copies of
the work (electronically or in print) in connection with the author’s teaching, conference
presentations, lectures, and publications, provided proper attribution is given...”
A note of permission (email) was received from Dialogues in Clinical Neuroscience:
From: [email protected] [[email protected]]
Sent: Thursday, February 23, 2012 5:03 AM
To: Carolyn Ptak
Cc: [email protected]
Subject: RE: Permission to use copyrighted material in a doctoral thesis (by article author)
Hi Carolyn,
Yes, that will be fine. You may have permission, provided that you include the complete citation of the work, with Copyright © Les Laboratoires Servier 2010, and the annotation that parts of the text have previously appeared in the publication mentioned.
Once you have completed the work, could you please send us the link to the online version? Thanks.
Best wishes,
Catriona
References
1. Kennedy, D., Breakthrough of the year. Science, 2007. 318(5858): p. 1833.
2. Esteller, M., The necessity of a human epigenome project. Carcinogenesis, 2006. 27(6):
p. 1121-5.
3. Martin, N., D. Boomsma, and G. Machin, A twin-pronged attack on complex traits. Nat
Genet, 1997. 17(4): p. 387-92.
4. Robertson, K.D. and A.P. Wolffe, DNA methylation in health and disease. Nat Rev
Genet, 2000. 1(1): p. 11-9.
5. Riggs, A.D., et al., Methylation dynamics, epigenetic fidelity and X chromosome
structure. Novartis Found Symp, 1998. 214: p. 214-25; discussion 225-32.
6. Ushijima, T., et al., Fidelity of the methylation pattern and its variation in the genome.
Genome Res, 2003. 13(5): p. 868-74.
7. Jaenisch, R. and A. Bird, Epigenetic regulation of gene expression: how the genome
integrates intrinsic and environmental signals. Nat Genet, 2003. 33 Suppl: p. 245-54.
8. Jirtle, R.L. and M.K. Skinner, Environmental epigenomics and disease susceptibility. Nat
Rev Genet, 2007. 8(4): p. 253-62.
9. Wong, A.H., I.I. Gottesman, and A. Petronis, Phenotypic differences in genetically
identical organisms: the epigenetic perspective. Hum Mol Genet, 2005. 14 Spec No 1: p.
R11-8.
10. Petronis, A., et al., Monozygotic twins exhibit numerous epigenetic differences: clues to
twin discordance? Schizophr Bull, 2003. 29(1): p. 169-78.
11. Kuratomi, G., et al., Aberrant DNA methylation associated with bipolar disorder
identified from discordant monozygotic twins. Mol Psychiatry, 2008. 13(4): p. 429-41.
12. Heijmans, B.T., et al., Heritable rather than age-related environmental and stochastic
factors dominate variation in DNA methylation of the human IGF2/H19 locus. Hum Mol
Genet, 2007. 16(5): p. 547-54.
13. Oates, N.A., et al., Increased DNA methylation at the AXIN1 gene in a monozygotic twin
from a pair discordant for a caudal duplication anomaly. Am J Hum Genet, 2006. 79(1):
p. 155-62.
14. Fraga, M.F., et al., Epigenetic differences arise during the lifetime of monozygotic twins.
Proc Natl Acad Sci U S A, 2005. 102(30): p. 10604-9.
15. Schellenberg, G.D. and T.J. Montine, The genetics and neuropathology of Alzheimer's
disease. Acta Neuropathol, 2012.
16. Hebert-Schuster, M., E.E. Fabre, and V. Nivet-Antoine, Catalase polymorphisms and
metabolic diseases. Curr Opin Clin Nutr Metab Care, 2012.
17. Lu, Y., et al., TGFB1 genetic polymorphisms and coronary heart disease risk: a meta-
analysis. BMC Med Genet, 2012. 13(1): p. 39.
18. Petronis, A., Epigenetics as a unifying principle in the aetiology of complex traits and
diseases. Nature, 2010. 465(7299): p. 721-7.
19. Kerkel, K., et al., Genomic surveys by methylation-sensitive SNP analysis identify
sequence-dependent allele-specific DNA methylation. Nat Genet, 2008. 40(7): p. 904-8.
20. Yan, H., et al., Allelic variation in human gene expression. Science, 2002. 297(5584): p.
1143.
21. Schalkwyk, L.C., et al., Allelic skewing of DNA methylation is widespread across the
genome. Am J Hum Genet, 2010. 86(2): p. 196-212.
22. Milani, L., et al., Allele-specific gene expression patterns in primary leukemic cells
reveal regulation of gene expression by CpG site methylation. Genome Res, 2009. 19(1):
p. 1-11.
23. Hawkins, N.J., et al., MGMT methylation is associated primarily with the germline C>T
SNP (rs16906252) in colorectal cancer and normal colonic mucosa. Mod Pathol, 2009.
22(12): p. 1588-99.
24. Candiloro, I.L. and A. Dobrovic, Detection of MGMT promoter methylation in normal
individuals is strongly associated with the T allele of the rs16906252 MGMT promoter
single nucleotide polymorphism. Cancer Prev Res (Phila), 2009. 2(10): p. 862-7.
25. Vawter, M.P., F. Mamdani, and F. Macciardi, An integrative functional genomics
approach for discovering biomarkers in schizophrenia. Brief Funct Genomics, 2011.
10(6): p. 387-99.
26. Daxinger, L. and E. Whitelaw, Understanding transgenerational epigenetic inheritance
via the gametes in mammals. Nat Rev Genet, 2012. 13(3): p. 153-62.
27. Richards, E.J., Inherited epigenetic variation--revisiting soft inheritance. Nat Rev Genet,
2006. 7(5): p. 395-401.
28. Boomsma, D., A. Busjahn, and L. Peltonen, Classical twin studies and beyond. Nat Rev
Genet, 2002. 3(11): p. 872-82.
29. Cedar, H. and Y. Bergman, Programming of DNA Methylation Patterns. Annu Rev
Biochem, 2012.
30. Perera, F. and J. Herbstman, Prenatal environmental exposures, epigenetics, and disease.
Reprod Toxicol, 2011. 31(3): p. 363-73.
31. Guo, S.W., The endometrial epigenome and its response to steroid hormones. Mol Cell
Endocrinol, 2012. 358(2): p. 185-96.
32. Mill, J., et al., Epigenomic profiling reveals DNA-methylation changes associated with
major psychosis. Am J Hum Genet, 2008. 82(3): p. 696-711.
33. Henikoff, S. and M.A. Matzke, Exploring and explaining epigenetic effects. Trends
Genet, 1997. 13(8): p. 293-5.
34. Margueron, R., P. Trojer, and D. Reinberg, The key to development: interpreting the
histone code? Curr Opin Genet Dev, 2005. 15(2): p. 163-76.
35. Thiagalingam, S., et al., Histone deacetylases: unique players in shaping the epigenetic
histone code. Ann N Y Acad Sci, 2003. 983: p. 84-100.
36. Klose, R.J. and Y. Zhang, Regulation of histone methylation by demethylimination and
demethylation. Nat Rev Mol Cell Biol, 2007. 8(4): p. 307-18.
37. Tahiliani, M., et al., Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in
mammalian DNA by MLL partner TET1. Science, 2009. 324(5929): p. 930-5.
38. Williams, K., et al., TET1 and hydroxymethylcytosine in transcription and DNA
methylation fidelity. Nature, 2011. 473(7347): p. 343-8.
39. Hershey, A.D., J. Dixon, and M. Chase, Nucleic acid economy in bacteria infected with
bacteriophage T2. I. Purine and pyrimidine composition. J Gen Physiol, 1953. 36(6): p.
777-89.
40. Kriaucionis, S. and N. Heintz, The nuclear DNA base 5-hydroxymethylcytosine is present
in Purkinje neurons and the brain. Science, 2009. 324(5929): p. 929-30.
41. Song, C.X., et al., Selective chemical labeling reveals the genome-wide distribution of 5-
hydroxymethylcytosine. Nat Biotechnol, 2011. 29(1): p. 68-72.
42. Wu, S.C. and Y. Zhang, Active DNA demethylation: many roads lead to Rome. Nat Rev
Mol Cell Biol, 2010. 11(9): p. 607-20.
43. Andersen, I.S., et al., Epigenetic complexity during the zebrafish mid-blastula transition.
Biochem Biophys Res Commun, 2012. 417(4): p. 1139-44.
44. Ishikawa, K., E. Fukuda, and I. Kobayashi, Conflicts targeting epigenetic systems and
their resolution by cell death: novel concepts for methyl-specific and other restriction
systems. DNA Res, 2010. 17(6): p. 325-42.
45. Nakanishi, M.O., et al., Trophoblast-specific DNA methylation occurs after the
segregation of the trophectoderm and inner cell mass in the mouse periimplantation
embryo. Epigenetics, 2012. 7(2): p. 173-82.
46. Alabert, C. and A. Groth, Chromatin replication and epigenome maintenance. Nat Rev
Mol Cell Biol, 2012. 13(3): p. 153-67.
47. Schroeder, J.W., et al., Neonatal DNA methylation patterns associate with gestational
age. Epigenetics, 2011. 6(12): p. 1498-504.
48. Weksberg, R., et al., Beckwith-Wiedemann syndrome demonstrates a role for epigenetic
control of normal development. Hum Mol Genet, 2003. 12 Spec No 1: p. R61-8.
49. Baylin, S.B. and J.G. Herman, DNA hypermethylation in tumorigenesis: epigenetics joins
genetics. Trends Genet, 2000. 16(4): p. 168-74.
50. Jones, P.A. and P.W. Laird, Cancer epigenetics comes of age. Nat Genet, 1999. 21(2): p.
163-7.
51. Chen, H., T.M. Hardy, and T.O. Tollefsbol, Epigenomics of ovarian cancer and its
chemoprevention. Front Genet, 2011. 2: p. 67.
52. Petronis, A., Human morbid genetics revisited: relevance of epigenetics. Trends Genet,
2001. 17(3): p. 142-6.
53. Reiss, D., R. Plomin, and E.M. Hetherington, Genetics and psychiatry: an unheralded
window on the environment. Am J Psychiatry, 1991. 148(3): p. 283-91.
54. Kaprio, J. and M. Koskenvuo, Cigarette smoking as a cause of lung cancer and coronary
heart disease. A study of smoking-discordant twin pairs. Acta Genet Med Gemellol
(Roma), 1990. 39(1): p. 25-34.
55. Chen, C.J., et al., Environmental effects on cardiovascular risk factors in Chinese
adolescent monozygotic twins. Acta Genet Med Gemellol (Roma), 1984. 33(3): p. 375-
81.
56. Ingrosso, D., et al., Folate treatment and unbalanced methylation and changes of allelic
expression induced by hyperhomocysteinaemia in patients with uraemia. Lancet, 2003.
361(9370): p. 1693-9.
57. Wolff, G.L., et al., Maternal epigenetics and methyl supplements affect agouti gene
expression in Avy/a mice. FASEB J, 1998. 12(11): p. 949-57.
58. Dunlevy, L.P., et al., Integrity of the methylation cycle is essential for mammalian neural
tube closure. Birth Defects Res A Clin Mol Teratol, 2006. 76(7): p. 544-52.
59. Wang, L., et al., Relation between hypomethylation of long interspersed nucleotide
elements and risk of neural tube defects. Am J Clin Nutr, 2010. 91(5): p. 1359-67.
60. Cooney, C.A., A.A. Dave, and G.L. Wolff, Maternal methyl supplements in mice affect
epigenetic variation and DNA methylation of offspring. J Nutr, 2002. 132(8 Suppl): p.
2393S-2400S.
61. Waterland, R.A. and R.L. Jirtle, Transposable elements: targets for early nutritional
effects on epigenetic gene regulation. Mol Cell Biol, 2003. 23(15): p. 5293-300.
62. Tchantchou, F., et al., S-adenosylmethionine mediates glutathione efficacy by increasing
glutathione S-transferase activity: implications for S-adenosyl methionine as a
neuroprotective dietary supplement. J Alzheimers Dis, 2008. 14(3): p. 323-8.
63. Kharbanda, K.K., Alcoholic liver disease and methionine metabolism. Semin Liver Dis,
2009. 29(2): p. 155-65.
64. Christensen, B.C., et al., Epigenetic profiles distinguish pleural mesothelioma from
normal pleura and predict lung asbestos burden and clinical outcome. Cancer Res, 2009.
69(1): p. 227-34.
65. Majumdar, S., et al., Arsenic exposure induces genomic hypermethylation. Environ
Toxicol, 2010. 25(3): p. 315-8.
66. Weaver, I.C., et al., Epigenetic programming by maternal behavior. Nat Neurosci, 2004.
7(8): p. 847-54.
67. Petronis, A., Epigenetics and twins: three variations on the theme. Trends Genet, 2006.
22(7): p. 347-50.
68. Weksberg, R., et al., Discordant KCNQ1OT1 imprinting in sets of monozygotic twins
discordant for Beckwith-Wiedemann syndrome. Hum Mol Genet, 2002. 11(11): p. 1317-
25.
69. Mill, J., et al., Evidence for monozygotic twin (MZ) discordance in methylation level at
two CpG sites in the promoter region of the catechol-O-methyltransferase (COMT) gene.
Am J Med Genet B Neuropsychiatr Genet, 2006. 141B(4): p. 421-5.
70. Matzke, M.A. and A.J. Matzke, Cloning problems don't surprise plant biologists.
Science, 2000. 288(5475): p. 2318b.
71. Morgan, H.D., et al., Epigenetic inheritance at the agouti locus in the mouse. Nat Genet,
1999. 23(3): p. 314-8.
72. Iida, T., et al., PCNA clamp facilitates action of DNA cytosine methyltransferase 1 on
hemimethylated DNA. Genes Cells, 2002. 7(10): p. 997-1007.
73. Vilkaitis, G., et al., Processive methylation of hemimethylated CpG sites by mouse Dnmt1
DNA methyltransferase. J Biol Chem, 2005. 280(1): p. 64-72.
74. Hassler, M.R. and G. Egger, Epigenomics of cancer - emerging new concepts. Biochimie,
2012.
75. Faria, C.M., et al., Epigenetic mechanisms regulating neural development and pediatric
brain tumor formation. J Neurosurg Pediatr, 2011. 8(2): p. 119-32.
76. Seeman, M.V., Psychopathology in women and men: focus on female hormones. Am J
Psychiatry, 1997. 154(12): p. 1641-7.
77. Kaminsky, Z., S.C. Wang, and A. Petronis, Complex disease, gender and epigenetics.
Ann Med, 2006. 38(8): p. 530-44.
78. Ohara, K., et al., Anticipation and imprinting in schizophrenia. Biol Psychiatry, 1997.
42(9): p. 760-6.
79. Guo, Y.F., et al., Assessment of genetic linkage and parent-of-origin effects on obesity. J
Clin Endocrinol Metab, 2006. 91(10): p. 4001-5.
80. Bassett, S.S., D. Avramopoulos, and D. Fallin, Evidence for parent of origin effect in
late-onset Alzheimer disease. Am J Med Genet, 2002. 114(6): p. 679-86.
81. Demenais, F., V. Chaudru, and M. Martinez, Detection of parent-of-origin effects for
atopy by model-free and model-based linkage analyses. Genet Epidemiol, 2001. 21
Suppl 1: p. S186-91.
82. Lamb, J.A., et al., Analysis of IMGSAC autism susceptibility loci: evidence for sex limited
and parent of origin specific effects. J Med Genet, 2005. 42(2): p. 132-7.
83. Camprubi, C. and D. Monk, Does genomic imprinting play a role in autoimmunity? Adv
Exp Med Biol, 2011. 711: p. 103-16.
84. Schulze, T.G., et al., Additional, physically ordered markers increase linkage signal for
bipolar disorder on chromosome 18q22. Biol Psychiatry, 2003. 53(3): p. 239-43.
85. Hall, J.G., Genomic imprinting: review and relevance to human diseases. Am J Hum
Genet, 1990. 46(5): p. 857-73.
86. Barlow, D.P., Gametic imprinting in mammals. Science, 1995. 270(5242): p. 1610-3.
87. Delaval, K., A. Wagschal, and R. Feil, Epigenetic deregulation of imprinting in
congenital diseases of aberrant growth. Bioessays, 2006. 28(5): p. 453-9.
88. Tomizawa, S.I. and H. Sasaki, Genomic imprinting and its relevance to congenital
disease, infertility, molar pregnancy and induced pluripotent stem cell. J Hum Genet,
2012.
89. Sutherland, J.E. and M. Costa, Epigenetics and the environment. Ann N Y Acad Sci,
2003. 983: p. 151-60.
90. Petronis, A., The origin of schizophrenia: genetic thesis, epigenetic antithesis, and
resolving synthesis. Biol Psychiatry, 2004. 55(10): p. 965-70.
91. O'Sullivan, L., et al., Epigenetics and developmental programming of adult onset
diseases. Pediatr Nephrol, 2012.
92. Fuke, C., et al., Age related changes in 5-methylcytosine content in human peripheral
leukocytes and placentas: an HPLC-based study. Ann Hum Genet, 2004. 68(Pt 3): p.
196-204.
93. van den Toorn, L.M., et al., Asthma remission: does it exist? Curr Opin Pulm Med, 2003.
9(1): p. 15-20.
94. Faraone, S.V., J. Biederman, and E. Mick, The age-dependent decline of attention deficit
hyperactivity disorder: a meta-analysis of follow-up studies. Psychol Med, 2006. 36(2):
p. 159-65.
95. Esteller, M., Cancer epigenomics: DNA methylomes and histone-modification maps. Nat
Rev Genet, 2007. 8(4): p. 286-98.
96. Ting Hsiung, D., et al., Global DNA methylation level in whole blood as a biomarker in
head and neck squamous cell carcinoma. Cancer Epidemiol Biomarkers Prev, 2007.
16(1): p. 108-14.
97. Ehrlich, M., DNA methylation in cancer: too much, but also too little. Oncogene, 2002.
21(35): p. 5400-13.
98. Fraga, M.F., et al., Loss of acetylation at Lys16 and trimethylation at Lys20 of histone H4
is a common hallmark of human cancer. Nat Genet, 2005. 37(4): p. 391-400.
99. Pogribny, I.P., et al., Histone H3 lysine 9 and H4 lysine 20 trimethylation and the
expression of Suv4-20h2 and Suv-39h1 histone methyltransferases in
hepatocarcinogenesis induced by methyl deficiency in rats. Carcinogenesis, 2006. 27(6):
p. 1180-6.
100. Valdes-Mora, F., et al., Acetylation of H2A.Z is a key epigenetic modification associated
with gene deregulation and epigenetic remodeling in cancer. Genome Res, 2012. 22(2):
p. 307-21.
101. Laird, P.W., Cancer epigenetics. Hum Mol Genet, 2005. 14 Spec No 1: p. R65-76.
102. Baylin, S. and T.H. Bestor, Altered methylation patterns in cancer cell genomes: cause or
consequence? Cancer Cell, 2002. 1(4): p. 299-305.
103. Veldic, M., et al., Epigenetic mechanisms expressed in basal ganglia GABAergic neurons
differentiate schizophrenia from bipolar disorder. Schizophr Res, 2007. 91(1-3): p. 51-
61.
104. Hogart, A., et al., 15q11-13 GABAA receptor genes are normally biallelically expressed
in brain yet are subject to epigenetic dysregulation in autism-spectrum disorders. Hum
Mol Genet, 2007. 16(6): p. 691-703.
105. Deng, V., et al., FXYD1 is a MeCP2 target gene overexpressed in the brains of Rett
syndrome patients and Mecp2-null mice. Hum Mol Genet, 2007.
106. Hagerman, R.J., M.Y. Ono, and P.J. Hagerman, Recent advances in fragile X: a model
for autism and neurodegeneration. Curr Opin Psychiatry, 2005. 18(5): p. 490-6.
107. Ivleva, E., G. Thaker, and C.A. Tamminga, Comparing genes and phenomenology in the
major psychoses: schizophrenia and bipolar 1 disorder. Schizophr Bull, 2008. 34(4): p.
734-42.
108. Andreasen, N.C., Symptoms, signs, and diagnosis of schizophrenia. Lancet, 1995.
346(8973): p. 477-81.
109. Bauer, M., S. Kasper, and M. Willeit, Is dopamine neurotransmission altered in
prodromal schizophrenia? A review of the evidence. Curr Pharm Des, 2012.
110. Akhondzadeh, S., The 5-HT hypothesis of schizophrenia. IDrugs, 2001. 4(3): p. 295-300.
111. Kantrowitz, J. and D.C. Javitt, Glutamatergic transmission in schizophrenia: from basic
research to clinical practice. Curr Opin Psychiatry, 2012. 25(2): p. 96-102.
112. Kuroki, T., N. Nagao, and T. Nakahara, Neuropharmacology of second-generation
antipsychotic drugs: a validity of the serotonin-dopamine hypothesis. Prog Brain Res,
2008. 172: p. 199-212.
113. Rao, J.S., et al., Dysregulated glutamate and dopamine transporters in postmortem
frontal cortex from bipolar and schizophrenic patients. J Affect Disord, 2012. 136(1-2):
p. 63-71.
114. Haukvik, U.K., et al., Cortical folding in Broca's area relates to obstetric complications
in schizophrenia patients and healthy controls. Psychol Med, 2011: p. 1-9.
115. Roseboom, T.J., et al., Hungry in the womb: what are the consequences? Lessons from
the Dutch famine. Maturitas, 2011. 70(2): p. 141-5.
116. Schmidt-Kastner, R., et al., An environmental analysis of genes associated with
schizophrenia: hypoxia and vascular factors as interacting elements in the
neurodevelopmental model. Mol Psychiatry, 2012.
117. Kneeland, R.E. and S.H. Fatemi, Viral infection, inflammation and schizophrenia. Prog
Neuropsychopharmacol Biol Psychiatry, 2012.
118. Reininghaus, U., et al., Ethnic identity, perceptions of disadvantage, and psychosis:
findings from the AESOP study. Schizophr Res, 2010. 124(1-3): p. 43-8.
119. Benros, M.E., et al., Autoimmune diseases and severe infections as risk factors for
schizophrenia: a 30-year population-based register study. Am J Psychiatry, 2011.
168(12): p. 1303-10.
120. Fiorentini, A., et al., Substance-induced psychoses: a critical review of the literature.
Curr Drug Abuse Rev, 2011. 4(4): p. 228-40.
121. Craddock, N., M.C. O'Donovan, and M.J. Owen, The genetics of schizophrenia and
bipolar disorder: dissecting psychosis. J Med Genet, 2005. 42(3): p. 193-204.
122. Craddock, N. and I. Jones, Genetics of bipolar disorder. J Med Genet, 1999. 36(8): p.
585-94.
123. Bertelsen, A. and I.I. Gottesman, Schizoaffective psychoses: genetical clues to
classification. Am J Med Genet, 1995. 60(1): p. 7-11.
124. Cardno, A.G. and I.I. Gottesman, Twin studies of schizophrenia: from bow-and-arrow
concordances to star wars Mx and functional genomics. Am J Med Genet, 2000. 97(1): p.
12-7.
125. O'Donovan, M.C., N.J. Craddock, and M.J. Owen, Genetics of psychosis; insights from
views across the genome. Hum Genet, 2009. 126(1): p. 3-12.
126. Sanders, A.R., et al., No significant association of 14 candidate genes with schizophrenia
in a large European ancestry sample: implications for psychiatric genetics. Am J
Psychiatry, 2008. 165(4): p. 497-506.
127. Kerner, B., C.G. Lambert, and B.O. Muthen, Genome-wide association study in bipolar
patients stratified by co-morbidity. PLoS One, 2011. 6(12): p. e28477.
128. Carrera, N., et al., Association study of nonsynonymous single nucleotide polymorphisms
in schizophrenia. Biol Psychiatry, 2012. 71(2): p. 169-77.
129. Rietschel, M., et al., Association between genetic variation in a region on chromosome
11 and schizophrenia in large samples from Europe. Mol Psychiatry, 2011.
130. Lee, K.W., et al., Genome wide association studies (GWAS) and copy number variation
(CNV) studies of the major psychoses: what have we learnt? Neurosci Biobehav Rev,
2012. 36(1): p. 556-71.
131. Dempster, E.L., et al., Disease-associated epigenetic changes in monozygotic twins
discordant for schizophrenia and bipolar disorder. Hum Mol Genet, 2011. 20(24): p.
4786-96.
132. Grayson, D.R., et al., Reelin promoter hypermethylation in schizophrenia. Proc Natl
Acad Sci U S A, 2005. 102(26): p. 9341-6.
133. Tochigi, M., et al., Methylation status of the reelin promoter region in the brain of
schizophrenic patients. Biol Psychiatry, 2008. 63(5): p. 530-3.
134. Volk, D.W. and D.A. Lewis, Impaired prefrontal inhibition in schizophrenia: relevance
for cognitive dysfunction. Physiol Behav, 2002. 77(4-5): p. 501-5.
135. Hashimoto, T., et al., Gene expression deficits in a subclass of GABA neurons in the
prefrontal cortex of subjects with schizophrenia. J Neurosci, 2003. 23(15): p. 6315-26.
136. Bullock, W.M., et al., Altered expression of genes involved in GABAergic transmission
and neuromodulation of granule cell activity in the cerebellum of schizophrenia patients.
Am J Psychiatry, 2008. 165(12): p. 1594-603.
137. Huang, H.S. and S. Akbarian, GAD1 mRNA expression and DNA methylation in
prefrontal cortex of subjects with schizophrenia. PLoS One, 2007. 2(8): p. e809.
138. Sharma, R.P., D.R. Grayson, and D.P. Gavin, Histone deactylase 1 expression is
increased in the prefrontal cortex of schizophrenia subjects: analysis of the National
Brain Databank microarray collection. Schizophr Res, 2008. 98(1-3): p. 111-7.
139. Swerdlow, N.R., Are we studying and treating schizophrenia correctly? Schizophr Res,
2011. 130(1-3): p. 1-10.
140. Van Winkel, R., et al., REVIEW: Genome-wide findings in schizophrenia and the role of
gene-environment interplay. CNS Neurosci Ther, 2010. 16(5): p. e185-92.
141. McGowan, P.O. and T. Kato, Epigenetics in mood disorders. Environ Health Prev Med,
2008. 13(1): p. 16-24.
142. Kaminsky, Z., et al., A multi-tissue analysis identifies HLA complex group 9 gene
methylation differences in bipolar disorder. Mol Psychiatry, 2011.
143. Stankiewicz, P. and J.R. Lupski, Structural variation in the human genome and its role in
disease. Annu Rev Med, 2010. 61: p. 437-55.
144. Van de Kerkhof, N.W., et al., Copy number variants in a sample of patients with
psychotic disorders: is standard screening relevant for actual clinical practice?
Neuropsychiatr Dis Treat, 2012. 8: p. 295-300.
145. Kidd, J.M., et al., Mapping and sequencing of structural variation from eight human
genomes. Nature, 2008. 453(7191): p. 56-64.
146. Ye, T., et al., Analysis of Copy Number Variations in Brain DNA from Patients with
Schizophrenia and Other Psychiatric Disorders. Biol Psychiatry, 2012.
147. Bergen, S.E., et al., Genome-wide association study in a Swedish population yields
support for greater CNV and MHC involvement in schizophrenia compared with bipolar
disorder. Mol Psychiatry, 2012.
148. Liao, H.M., et al., Identification and characterization of three inherited genomic copy
number variations associated with familial schizophrenia. Schizophr Res, 2012. 139(1-
3): p. 229-36.
149. Grozeva, D., et al., Independent estimation of the frequency of rare CNVs in the UK
population confirms their role in schizophrenia. Schizophr Res, 2012. 135(1-3): p. 1-7.
150. Malhotra, D. and J. Sebat, CNVs: harbingers of a rare variant revolution in psychiatric
genetics. Cell, 2012. 148(6): p. 1223-41.
151. Meyer, J., et al., Rare variants of the gene encoding the potassium chloride co-
transporter 3 are associated with bipolar disorder. Int J Neuropsychopharmacol, 2005.
8(4): p. 495-504.
152. Moser, D., et al., Functional analysis of a potassium-chloride co-transporter 3
(SLC12A6) promoter polymorphism leading to an additional DNA methylation site.
Neuropsychopharmacology, 2009. 34(2): p. 458-67.
153. Uyanik, G., et al., Novel truncating and missense mutations of the KCC3 gene associated
with Andermann syndrome. Neurology, 2006. 66(7): p. 1044-8.
154. Luedi, P.P., et al., Computational and experimental identification of novel human
imprinted genes. Genome Res, 2007. 17(12): p. 1723-30.
155. Monk, M., Changes in DNA methylation during mouse embryonic development in
relation to X-chromosome activity and imprinting. Philos Trans R Soc Lond B Biol Sci,
1990. 326(1235): p. 299-312.
156. Rakyan, V. and E. Whitelaw, Transgenerational epigenetic inheritance. Curr Biol, 2003.
13(1): p. R6.
157. Christensen, B.C., et al., Aging and environmental exposures alter tissue-specific DNA
methylation dependent upon CpG island context. PLoS Genet, 2009. 5(8): p. e1000602.
158. Zhang, Y., et al., Non-imprinted allele-specific DNA methylation on human autosomes.
Genome Biol, 2009. 10(12): p. R138.
159. Hellman, A. and A. Chess, Extensive sequence-influenced DNA methylation
polymorphism in the human genome. Epigenetics Chromatin, 2010. 3(1): p. 11.
160. Shoemaker, R., et al., Allele-specific methylation is prevalent and is contributed by CpG-
SNPs in the human genome. Genome Res, 2010. 20(7): p. 883-9.
161. Chen, P.Y., et al., A comparative analysis of DNA methylation across human embryonic
stem cell lines. Genome Biol, 2011. 12(7): p. R62.
162. Zhang, D., et al., Genetic control of individual differences in gene-specific methylation in
human brain. Am J Hum Genet, 2010. 86(3): p. 411-9.
163. Barreiro, L.B., et al., Natural selection has driven population differentiation in modern
humans. Nat Genet, 2008. 40(3): p. 340-5.
164. Jendrzejewski, J., et al., The polymorphism rs944289 predisposes to papillary thyroid
carcinoma through a large intergenic noncoding RNA gene of tumor suppressor type.
Proc Natl Acad Sci U S A, 2012. 109(22): p. 8646-51.
165. Sivakumaran, S., et al., Abundant pleiotropy in human complex diseases and traits. Am J
Hum Genet, 2011. 89(5): p. 607-18.
166. Walters, R.W., S.S. Bradrick, and M. Gromeier, Poly(A)-binding protein modulates
mRNA susceptibility to cap-dependent miRNA-mediated repression. RNA, 2010. 16(1): p.
239-50.
167. Yang, J.O., W.Y. Kim, and J. Bhak, ssSNPTarget: genome-wide splice-site Single
Nucleotide Polymorphism database. Hum Mutat, 2009. 30(12): p. E1010-20.
168. Wimmer, K., et al., The NF1 gene contains hotspots for L1 endonuclease-dependent de
novo insertion. PLoS Genet, 2011. 7(11): p. e1002371.
169. Medvedeva, Y.A., et al., Intergenic, gene terminal, and intragenic CpG islands in the
human genome. BMC Genomics, 2010. 11: p. 48.
170. Martin, J.S., et al., Structural effects of linkage disequilibrium on the transcriptome. RNA,
2012. 18(1): p. 77-87.
171. Arnold, M., et al., Cis-Acting Polymorphisms Affect Complex Traits through
Modifications of MicroRNA Regulation Pathways. PLoS One, 2012. 7(5): p. e36694.
172. Rearick, D., et al., Critical association of ncRNA with introns. Nucleic Acids Res, 2011.
39(6): p. 2357-66.
173. An, J.H., et al., DNA methylation-specific multiplex assays for body fluid identification.
Int J Legal Med, 2012.
174. Wang, D., et al., Individual variation and longitudinal pattern of genome-wide DNA
methylation from birth to the first two years of life. Epigenetics, 2012. 7(6).
175. Hou, Y., et al., DNA Demethylation and USF Regulate the Meiosis-Specific Expression of
the Mouse Miwi. PLoS Genet, 2012. 8(5): p. e1002716.
176. Irizarry, R.A., et al., The human colon cancer methylome shows similar hypo- and
hypermethylation at conserved tissue-specific CpG island shores. Nat Genet, 2009. 41(2):
p. 178-86.
177. Hudson, T.J., et al., International network of cancer genome projects. Nature, 2010.
464(7291): p. 993-8.
178. Leng, S., et al., The A/G allele of rs16906252 predicts for MGMT methylation and is
selectively silenced in premalignant lesions from smokers and in lung adenocarcinomas.
Clin Cancer Res, 2011. 17(7): p. 2014-23.
179. Andraos, C., et al., Vitamin D receptor gene methylation is associated with ethnicity,
tuberculosis, and TaqI polymorphism. Hum Immunol, 2010. 72(3): p. 262-8.
180. Stepanow, S., et al., Allele-specific, age-dependent and BMI-associated DNA methylation
of human MCHR1. PLoS One, 2011. 6(5): p. e17711.
181. Bell, C.G., et al., Integrated genetic and epigenetic analysis identifies haplotype-specific
methylation in the FTO type 2 diabetes and obesity susceptibility locus. PLoS One, 2010.
5(11): p. e14040.
182. Kundakovic, M., et al., DNA methyltransferase inhibitors coordinately induce expression
of the human reelin and glutamic acid decarboxylase 67 genes. Mol Pharmacol, 2007.
71(3): p. 644-53.
183. Dash, P.K., S.A. Orsi, and A.N. Moore, Histone deactylase inhibition combined with
behavioral therapy enhances learning and memory following traumatic brain injury.
Neuroscience, 2009.
184. Hockly, E., et al., Suberoylanilide hydroxamic acid, a histone deacetylase inhibitor,
ameliorates motor deficits in a mouse model of Huntington's disease. Proc Natl Acad Sci
U S A, 2003. 100(4): p. 2041-6.
185. Camelo, S., et al., Transcriptional therapy with the histone deacetylase inhibitor
trichostatin A ameliorates experimental autoimmune encephalomyelitis. J
Neuroimmunol, 2005. 164(1-2): p. 10-21.
186. Chen, P.S., et al., Valproate protects dopaminergic neurons in midbrain neuron/glia
cultures by stimulating the release of neurotrophic factors from astrocytes. Mol
Psychiatry, 2006. 11(12): p. 1116-25.
187. Dagtas, A.S., E.R. Edens, and K.M. Gilbert, Histone deacetylase inhibitor uses
p21(Cip1) to maintain anergy in CD4(+) T cells. Int Immunopharmacol, 2009.
188. Langley, B., et al., Pulse inhibition of histone deacetylases induces complete resistance to
oxidative death in cortical neurons without toxicity and reveals a role for cytoplasmic
p21(waf1/cip1) in cell cycle-independent neuroprotection. J Neurosci, 2008. 28(1): p.
163-76.
189. Gilad, R., et al., Treatment of status epilepticus and acute repetitive seizures with i.v.
valproic acid vs phenytoin. Acta Neurol Scand, 2008. 118(5): p. 296-300.
190. Bowden, C.L., Spectrum of effectiveness of valproate in neuropsychiatry. Expert Rev
Neurother, 2007. 7(1): p. 9-16.
191. Sajatovic, M., et al., Adjunct extended-release valproate semisodium in late life
schizophrenia. Int J Geriatr Psychiatry, 2008. 23(2): p. 142-7.
192. Wright, M. and N. Martin, Brisbane Adolescent Twin Study: outline of study methods and
research projects. Australian Journal of Psychology, 2004. 56: p. 65-78.
193. Halfvarson, J., et al., Inflammatory bowel disease in a Swedish twin cohort: a long-term
follow-up of concordance and clinical characteristics. Gastroenterology, 2003. 124(7): p.
1767-73.
194. Schumacher, A., et al., Microarray-based DNA methylation profiling: technology and
applications. Nucleic Acids Res, 2006. 34(2): p. 528-42.
195. Heisler, L.E., et al., CpG Island microarray probe sequences derived from a physical
library are representative of CpG Islands annotated on the human genome. Nucleic
Acids Res, 2005. 33(9): p. 2952-61.
196. Storey, J.D. and R. Tibshirani, Statistical significance for genomewide studies. Proc Natl
Acad Sci U S A, 2003. 100(16): p. 9440-5.
197. Falcon, S. and R. Gentleman, Using GOstats to test gene lists for GO term association.
Bioinformatics, 2007. 23(2): p. 257-8.
198. Tost, J., H. El abdalaoui, and I.G. Gut, Serial pyrosequencing for quantitative DNA
methylation analysis. Biotechniques, 2006. 40(6): p. 721-2, 724, 726.
199. Torrey, E.F., et al., The Stanley Foundation brain collection and Neuropathology
Consortium. Schizophr Res, 2000. 44(2): p. 151-5.
200. Deep-Soboslay, A., et al., Psychiatric brain banking: three perspectives on current trends
and future directions. Biol Psychiatry, 2011. 69(2): p. 104-12.
201. 1000 Genomes Project Consortium, A map of human genome variation from population-scale sequencing.
Nature, 2010. 467(7319): p. 1061-73.
202. Eckhardt, F., et al., DNA methylation profiling of human chromosomes 6, 20 and 22. Nat
Genet, 2006. 38(12): p. 1378-85.
203. Hall, J.G., Twinning. Lancet, 2003. 362(9385): p. 735-43.
204. Andrian, E., et al., Regulation of matrix metalloproteinases and tissue inhibitors of
matrix metalloproteinases by Porphyromonas gingivalis in an engineered human oral
mucosa model. J Cell Physiol, 2007. 211(1): p. 56-62.
205. Choi, B.K., et al., Activation of matrix metalloproteinase-2 by a novel oral spirochetal
species Treponema lecithinolyticum. J Periodontol, 2001. 72(11): p. 1594-600.
206. Wilm, B., et al., The serosal mesothelium is a major source of smooth muscle cells of the
gut vasculature. Development, 2005. 132(23): p. 5317-28.
207. Bruder, C.E., et al., Phenotypically concordant and discordant monozygotic twins display
different DNA copy-number-variation profiles. Am J Hum Genet, 2008. 82(3): p. 763-71.
208. Bouchard, T.J., Jr., et al., Sources of human psychological differences: the Minnesota
Study of Twins Reared Apart. Science, 1990. 250(4978): p. 223-8.
209. Murrell, A., et al., An association between variants in the IGF2 gene and Beckwith-
Wiedemann syndrome: interaction between genotype and epigenotype. Hum Mol Genet,
2004. 13(2): p. 247-55.
210. Flanagan, J.M., et al., Intra- and interindividual epigenetic variation in human germ
cells. Am J Hum Genet, 2006. 79(1): p. 67-84.
211. Khulan, B., et al., Comparative isoschizomer profiling of cytosine methylation: the HELP
assay. Genome Res, 2006. 16(8): p. 1046-55.
212. Gartner, K. and E. Baunack, Is the similarity of monozygotic twins due to genetic factors
alone? Nature, 1981. 292(5824): p. 646-7.
213. Gertz, J., et al., Analysis of DNA methylation in a three-generation family reveals
widespread genetic influence on epigenetic regulation. PLoS Genet, 2011. 7(8): p.
e1002228.
214. Hsu, J. and J.D. Smith, Genome-wide studies of gene expression relevant to coronary
artery disease. Curr Opin Cardiol, 2012. 27(3): p. 210-3.
215. Pardini, B., et al., Gene expression variations: potentialities of master regulator
polymorphisms in colorectal cancer risk. Mutagenesis, 2012. 27(2): p. 161-7.
216. Hollingworth, P., et al., Genome-wide association study of Alzheimer's disease with
psychotic symptoms. Mol Psychiatry, 2011.
217. Zhou, X.J., et al., Genetic association of PRDM1-ATG5 intergenic region and autophagy
with systemic lupus erythematosus in a Chinese population. Ann Rheum Dis, 2011.
70(7): p. 1330-7.
218. Lee, B.K., et al., Cell-type specific and combinatorial usage of diverse transcription
factors revealed by genome-wide binding studies in multiple human cells. Genome Res,
2012. 22(1): p. 9-24.
219. Washietl, S., et al., Mapping of conserved RNA secondary structures predicts thousands
of functional noncoding RNAs in the human genome. Nat Biotechnol, 2005. 23(11): p.
1383-90.
220. Pheasant, M. and J.S. Mattick, Raising the estimate of functional human sequences.
Genome Res, 2007. 17(9): p. 1245-53.
221. Witzany, G., Noncoding RNAs: persistent viral agents as modular tools for cellular
needs. Ann N Y Acad Sci, 2009. 1178: p. 244-67.
222. Huttenhofer, A., P. Schattner, and N. Polacek, Non-coding RNAs: hope or hype? Trends
Genet, 2005. 21(5): p. 289-97.
223. Ponting, C.P. and R.C. Hardison, What fraction of the human genome is functional?
Genome Res, 2011. 21(11): p. 1769-76.
224. Kapranov, P., A.T. Willingham, and T.R. Gingeras, Genome-wide transcription and the
implications for genomic organization. Nat Rev Genet, 2007. 8(6): p. 413-23.
225. Waldispuhl, J. and P. Clote, Computing the partition function and sampling for saturated
secondary structures of RNA, with respect to the Turner energy model. J Comput Biol,
2007. 14(2): p. 190-215.
226. Li, X., et al., Predicting in vivo binding sites of RNA-binding proteins using mRNA
secondary structure. RNA, 2010. 16(6): p. 1096-107.
227. Bennett, C.L., et al., A rare polyadenylation signal mutation of the FOXP3 gene
(AAUAAA-->AAUGAA) leads to the IPEX syndrome. Immunogenetics, 2001. 53(6): p.
435-9.
228. Guo, H., et al., Mammalian microRNAs predominantly act to decrease target mRNA
levels. Nature, 2010. 466(7308): p. 835-40.
229. Wu, H., et al., Genome-wide analysis reveals methyl-CpG-binding protein 2-dependent
regulation of microRNAs in a mouse model of Rett syndrome. Proc Natl Acad Sci U S A,
2010. 107(42): p. 18161-6.
230. Halvorsen, M., et al., Disease-associated mutations that alter the RNA structural
ensemble. PLoS Genet, 2010. 6(8): p. e1001074.
231. Castle, J.C., SNPs occur in regions with less genomic sequence conservation. PLoS One,
2011. 6(6): p. e20660.
232. Eom, S. and C. Lee, Functions of intronic nucleotide variants in the gene encoding
pleckstrin homology like domain beta 2 (PHLDB2) on susceptibility to vascular
dementia. World J Biol Psychiatry, 2011.
233. Zhao, C., et al., Alternative-splicing in the exon-10 region of GABA(A) receptor beta(2)
subunit gene: relationships between novel isoforms and psychotic disorders. PLoS One,
2009. 4(9): p. e6977.
234. Itokawa, M., et al., [Studies on pathophysiology of schizophrenia with a rare variant as a
clue]. Brain Nerve, 2011. 63(3): p. 223-31.
235. Shen, Y.C., et al., Genetic and functional analysis of the gene encoding neurogranin in
schizophrenia. Schizophr Res, 2012. 137(1-3): p. 7-13.
236. Kushima, I., et al., Resequencing and association analysis of the KALRN and EPHB1
genes and their contribution to schizophrenia susceptibility. Schizophr Bull, 2012. 38(3):
p. 552-60.
237. Huang, J., et al., Human Down syndrome cell adhesion molecules (DSCAMs) are
functionally conserved with Drosophila Dscam[TM1] isoforms in controlling
neurodevelopment. Insect Biochem Mol Biol, 2011. 41(10): p. 778-87.
238. Bartolomucci, A., et al., The extended granin family: structure, function, and biomedical
implications. Endocr Rev, 2011. 32(6): p. 755-97.
239. Teyssier, J.R., et al., Activation of a DeltaFOSB dependent gene expression pattern in the
dorsolateral prefrontal cortex of patients with major depressive disorder. J Affect
Disord, 2011. 133(1-2): p. 174-8.
240. Portela-Gomes, G.M., L. Grimelius, and M. Stridsberg, Secretogranin III in human
neuroendocrine tumours: a comparative immunohistochemical study with chromogranins
A and B and secretogranin II. Regul Pept, 2010. 165(1): p. 30-5.
241. McQuillin, A., M. Rizig, and H.M. Gurling, A microarray gene expression study of the
molecular pharmacology of lithium carbonate on mouse brain mRNA to understand the
neurobiology of mood stabilization and treatment of bipolar affective disorder.
Pharmacogenet Genomics, 2007. 17(8): p. 605-17.
242. Umbach, J.A., Y. Zhao, and C.B. Gundersen, Lithium enhances secretion from large
dense-core vesicles in nerve growth factor-differentiated PC12 cells. J Neurochem, 2005.
94(5): p. 1306-14.
243. Hanasaki, K., Mammalian phospholipase A2: phospholipase A2 receptor. Biol Pharm
Bull, 2004. 27(8): p. 1165-7.
244. Oresic, M., et al., Phospholipids and insulin resistance in psychosis: a lipidomics study of
twin pairs discordant for schizophrenia. Genome Med, 2012. 4(1): p. 1.
245. Gattaz, W.F., et al., Increased PLA2 activity in the hippocampus of patients with
temporal lobe epilepsy and psychosis. J Psychiatr Res, 2011. 45(12): p. 1617-20.
246. Ross, B.M., et al., Serum calcium-independent phospholipase A2 activity in bipolar
affective disorder. Bipolar Disord, 2006. 8(3): p. 265-70.
247. Gattaz, W.F., et al., Increased plasma phospholipase-A2 activity in schizophrenic
patients: reduction after neuroleptic therapy. Biol Psychiatry, 1987. 22(4): p. 421-6.
248. Folley, B.S., M.L. Doop, and S. Park, Psychoses and creativity: is the missing link a
biological mechanism related to phospholipids turnover? Prostaglandins Leukot Essent
Fatty Acids, 2003. 69(6): p. 467-76.
249. Kao, W.T., et al., Common genetic variation in Neuregulin 3 (NRG3) influences risk for
schizophrenia and impacts NRG3 expression in human brain. Proc Natl Acad Sci U S A,
2010. 107(35): p. 15619-24.
250. Alessi, A., et al., gamma-Syntrophin scaffolding is spatially and functionally distinct from
that of the alpha/beta syntrophins. Exp Cell Res, 2006. 312(16): p. 3084-95.
251. Kugaevskaia, E.V., [Angiotensin converting enzyme domain structure and properties].
Biomed Khim, 2005. 51(6): p. 567-80.
252. Kucukali, C.I., et al., Angiotensin-converting enzyme polymorphism in schizophrenia,
bipolar disorders, and their first-degree relatives. Psychiatr Genet, 2010. 20(1): p. 14-9.
253. Crescenti, A., et al., Insertion/deletion polymorphism of the angiotensin-converting
enzyme gene is associated with schizophrenia in a Spanish population. Psychiatry Res,
2009. 165(1-2): p. 175-80.
254. Wahlbeck, K., et al., Cerebrospinal fluid angiotensin-converting enzyme (ACE)
correlates with length of illness in schizophrenia. Schizophr Res, 2000. 41(2): p. 335-40.
255. Danser, A.H., et al., Commentaries on Viewpoint: Epigenetic regulation of the ACE gene
might be more relevant to endurance physiology than the I/D polymorphism. J Appl
Physiol, 2012. 112(6): p. 1084-5.
256. Ayoub, M.A., et al., Deleterious GRM1 Mutations in Schizophrenia. PLoS One, 2012.
7(3): p. e32849.
257. Okajima, D., G. Kudo, and H. Yokota, Antidepressant-like behavior in brain-specific
angiogenesis inhibitor 2-deficient mice. J Physiol Sci, 2012. 61(1): p. 47-54.
258. van Haren, J., et al., Mammalian Navigators are microtubule plus-end tracking proteins
that can reorganize the cytoskeleton to induce neurite-like extensions. Cell Motil
Cytoskeleton, 2009. 66(10): p. 824-38.
259. Fung, S.J., S. Sivagnanasundaram, and C.S. Weickert, Lack of change in markers of
presynaptic terminal abundance alongside subtle reductions in markers of presynaptic
terminal plasticity in prefrontal cortex of schizophrenia patients. Biol Psychiatry, 2011.
69(1): p. 71-9.
260. Vaags, A.K., et al., Rare deletions at the neurexin 3 locus in autism spectrum disorder.
Am J Hum Genet, 2012. 90(1): p. 133-41.
261. Sachdev, P., Schizophrenia-like psychosis and epilepsy: the status of the association. Am
J Psychiatry, 1998. 155(3): p. 325-36.
262. Hyde, T.M. and D.R. Weinberger, Seizures and schizophrenia. Schizophr Bull, 1997.
23(4): p. 611-22.
263. Salzmann, A., et al., Carboxypeptidase A6 gene (CPA6) mutations in a recessive familial
form of febrile seizures and temporal lobe epilepsy and in sporadic temporal lobe
epilepsy. Hum Mutat, 2012. 33(1): p. 124-35.
264. Ogino, S., et al., MGMT germline polymorphism is associated with somatic MGMT
promoter methylation and gene silencing in colorectal cancer. Carcinogenesis, 2007.
28(9): p. 1985-90.
265. Kwon, E., W. Wang, and L.H. Tsai, Validation of schizophrenia-associated genes
CSMD1, C10orf26, CACNA1C and TCF4 as miR-137 targets. Mol Psychiatry, 2011.
266. Chang, L.H., et al., Association of RELN promoter SNPs with schizophrenia in the
Chinese population. Dongwuxue Yanjiu, 2011. 32(5): p. 504-8.
267. Yuasa, T., et al., Polycystin-1L2 is a novel G-protein-binding protein. Genomics, 2004.
84(1): p. 126-38.
268. Park, E.Y., Y.M. Woo, and J.H. Park, Polycystic kidney disease and therapeutic
approaches. BMB Rep, 2011. 44(6): p. 359-68.
269. Goodman, A.B., A family history study of schizophrenia spectrum disorders suggests new
candidate genes in schizophrenia and autism. Psychiatr Q, 1994. 65(4): p. 287-97.
270. Wagemaker, H., J.L. Rogers, and R. Cade, Schizophrenia, hemodialysis, and the placebo
effect. Results and issues. Arch Gen Psychiatry, 1984. 41(8): p. 805-10.
271. Bennett, A.O.M., Dual constraints on synapse formation and regression in
schizophrenia: neuregulin, neuroligin, dysbindin, DISC1, MuSK and agrin. Aust N Z J
Psychiatry, 2008. 42(8): p. 662-77.
272. Siddiqui, T.J., et al., LRRTMs and neuroligins bind neurexins with a differential code to
cooperate in glutamate synapse development. J Neurosci, 2010. 30(22): p. 7495-506.
273. Kamal, M., et al., Loss of CSMD1 expression is associated with high tumour grade and
poor survival in invasive ductal breast carcinoma. Breast Cancer Res Treat, 2010.
121(3): p. 555-63.
274. Havik, B., et al., The complement control-related genes CSMD1 and CSMD2 associate to
schizophrenia. Biol Psychiatry, 2011. 70(1): p. 35-42.
275. Howes, O.D., et al., The Nature of Dopamine Dysfunction in Schizophrenia and What
This Means for Treatment: Meta-analysis of Imaging Studies. Arch Gen Psychiatry,
2012.
276. Dimpfel, W., Rat electropharmacograms of the flavonoids rutin and quercetin in
comparison to those of moclobemide and clinically used reference drugs suggest
antidepressive and/or neuroprotective action. Phytomedicine, 2009. 16(4): p. 287-94.
277. Ullmannova, V. and N.C. Popescu, Inhibition of cell proliferation, induction of apoptosis,
reactivation of DLC1, and modulation of other gene expression by dietary flavone in
breast cancer cell lines. Cancer Detect Prev, 2007. 31(2): p. 110-8.
278. Tarrago, T., et al., Baicalin, a prodrug able to reach the CNS, is a prolyl oligopeptidase
inhibitor. Bioorg Med Chem, 2008. 16(15): p. 7516-24.
279. Du, Y., X. Wu, and L. Li, Differentially organized top-down modulation of prepulse
inhibition of startle. J Neurosci, 2011. 31(38): p. 13644-53.
280. Meincke, U., E. Gouzoulis-Mayfrank, and H. Sass, [The startle reflex in schizophrenia
research]. Nervenarzt, 2001. 72(11): p. 844-52.
281. Furusato, E., et al., WT1 and Bcl2 expression in melanocytic lesions of the conjunctiva:
an immunohistochemical study of 123 cases. Arch Ophthalmol, 2009. 127(8): p. 964-9.
282. Jarskog, L.F., et al., Apoptotic proteins in the temporal cortex in schizophrenia: high
Bax/Bcl-2 ratio without caspase-3 activation. Am J Psychiatry, 2004. 161(1): p. 109-15.
283. Miller, C.L., et al., Two complex genotypes relevant to the kynurenine pathway and
melanotropin function show association with schizophrenia and bipolar disorder.
Schizophr Res, 2009. 113(2-3): p. 259-67.
284. Goldacre, M.J., et al., Schizophrenia and cancer: an epidemiological study. Br J
Psychiatry, 2005. 187: p. 334-8.
285. Dasgupta, A., et al., Insulin resistance and metabolic profile in antipsychotic naive
schizophrenia patients. Prog Neuropsychopharmacol Biol Psychiatry, 2010. 34(7): p.
1202-7.
286. Guest, P.C., et al., Altered levels of circulating insulin and other neuroendocrine
hormones associated with the onset of schizophrenia. Psychoneuroendocrinology, 2011.
36(7): p. 1092-6.
287. Chinnery, P.F., et al., Epigenetics, epidemiology and mitochondrial DNA diseases. Int J
Epidemiol, 2012. 41(1): p. 177-87.
288. Shock, L.S., et al., DNA methyltransferase 1, cytosine methylation, and cytosine
hydroxymethylation in mammalian mitochondria. Proc Natl Acad Sci U S A, 2011.
108(9): p. 3630-5.
289. Regenold, W.T., et al., Mitochondrial detachment of hexokinase 1 in mood and psychotic
disorders: implications for brain energy metabolism and neurotrophic signaling. J
Psychiatr Res, 2012. 46(1): p. 95-104.
290. Crouch, P.J., et al., Mechanisms of A beta mediated neurodegeneration in Alzheimer's
disease. Int J Biochem Cell Biol, 2008. 40(2): p. 181-98.
291. Anandatheerthavarada, H.K., et al., Mitochondrial targeting and a novel transmembrane
arrest of Alzheimer's amyloid precursor protein impairs mitochondrial function in
neuronal cells. J Cell Biol, 2003. 161(1): p. 41-54.
292. Martinez-Reyes, I., M. Sanchez-Arago, and J.M. Cuezva, AMPK and GCN2-ATF4 signal
the repression of mitochondria in colon cancer cells. Biochem J, 2012.
293. Nalaskowski, M.M., et al., Human inositol 1,4,5-trisphosphate 3-kinase isoform B
(IP3KB) is a nucleocytoplasmic shuttling protein specifically enriched at cortical actin
filaments and at invaginations of the nuclear envelope. J Biol Chem, 2011. 286(6): p.
4500-10.
294. Criollo, A., et al., Regulation of autophagy by the inositol trisphosphate receptor. Cell
Death Differ, 2007. 14(5): p. 1029-39.
295. Holland, J. and M. Agius, Neurobiology of bipolar disorder - lessons from migraine
disorders. Psychiatr Danub, 2011. 23 Suppl 1: p. S162-5.
296. Zhang, Z., et al., Valproate protects the retina from endoplasmic reticulum stress-induced
apoptosis after ischemia-reperfusion injury. Neurosci Lett, 2011. 504(2): p. 88-92.
297. Machado-Vieira, R., et al., The Bcl-2 gene polymorphism rs956572AA increases inositol
1,4,5-trisphosphate receptor-mediated endoplasmic reticulum calcium release in subjects
with bipolar disorder. Biol Psychiatry, 2011. 69(4): p. 344-52.
298. O'Dushlaine, C., et al., Molecular pathways involved in neuronal cell adhesion and
membrane scaffolding contribute to schizophrenia and bipolar disorder susceptibility.
Mol Psychiatry, 2011. 16(3): p. 286-92.
299. Wang, K.S., X.F. Liu, and N. Aragam, A genome-wide meta-analysis identifies novel loci
associated with schizophrenia and bipolar disorder. Schizophr Res, 2010. 124(1-3): p.
192-9.
300. Dimas, A.S., et al., Common regulatory variation impacts gene expression in a cell type-
dependent manner. Science, 2009. 325(5945): p. 1246-50.
301. Li, Y., et al., The DNA methylome of human peripheral blood mononuclear cells. PLoS
Biol, 2010. 8(11): p. e1000533.
302. Constancia, M., et al., Imprinting mechanisms. Genome Res, 1998. 8(9): p. 881-900.
303. Whitelaw, N.C. and E. Whitelaw, Transgenerational epigenetic inheritance in health and
disease. Curr Opin Genet Dev, 2008. 18(3): p. 273-9.
304. Manikkam, M., et al., Transgenerational actions of environmental compounds on
reproductive disease and identification of epigenetic biomarkers of ancestral exposures.
PLoS One, 2012. 7(2): p. e31901.
305. Klar, A.J., Propagating epigenetic states through meiosis: where Mendel's gene is more
than a DNA moiety. Trends Genet, 1998. 14(8): p. 299-301.
306. Ruden, D.M. and X. Lu, Hsp90 affecting chromatin remodeling might explain
transgenerational epigenetic inheritance in Drosophila. Curr Genomics, 2008. 9(7): p.
500-8.
307. Pentinat, T., et al., Transgenerational inheritance of glucose intolerance in a mouse
model of neonatal overnutrition. Endocrinology, 2010. 151(12): p. 5617-23.
308. Arai, J.A. and L.A. Feig, Long-lasting and transgenerational effects of an environmental
enrichment on memory formation. Brain Res Bull, 2011. 85(1-2): p. 30-5.
309. Furuhashi, H. and W.G. Kelly, The epigenetics of germ-line immortality: lessons from an
elegant model system. Dev Growth Differ, 2010. 52(6): p. 527-32.
310. Barres, R., et al., Acute exercise remodels promoter methylation in human skeletal
muscle. Cell Metab, 2012. 15(3): p. 405-11.
311. Kangaspeska, S., et al., Transient cyclical methylation of promoter DNA. Nature, 2008.
452(7183): p. 112-5.
312. Ebisawa, M., et al., Measurement of Ara h 1-, 2-, and 3-specific IgE antibodies is useful
in diagnosis of peanut allergy in Japanese children. Pediatr Allergy Immunol, 2012.
313. Sharief, S., et al., Vitamin D levels and food and environmental allergies in the United
States: results from the National Health and Nutrition Examination Survey 2005-2006. J
Allergy Clin Immunol, 2011. 127(5): p. 1195-202.
314. Pacheco, K.A., Epigenetics mediate environment: gene effects on occupational
sensitization. Curr Opin Allergy Clin Immunol, 2012. 12(2): p. 111-8.
315. Breton, C.V., et al., Prenatal tobacco smoke exposure affects global and gene-specific
DNA methylation. Am J Respir Crit Care Med, 2009. 180(5): p. 462-7.
316. Perera, F., et al., Relation of DNA methylation of 5'-CpG island of ACSL3 to
transplacental exposure to airborne polycyclic aromatic hydrocarbons and childhood
asthma. PLoS One, 2009. 4(2): p. e4488.
317. Nishida, N., et al., Evaluating the performance of Affymetrix SNP Array 6.0 platform with
400 Japanese individuals. BMC Genomics, 2008. 9: p. 431.
318. Bettscheider, M., et al., Optimized Analysis of DNA Methylation and Gene Expression
from Small, Anatomically-defined Areas of the Brain. J Vis Exp, 2012(65).
319. Fang, F., et al., Genomic landscape of human allele-specific DNA methylation. Proc Natl
Acad Sci U S A, 2012.
320. Xie, W., et al., Base-resolution analyses of sequence and parent-of-origin dependent
DNA methylation in the mouse genome. Cell, 2012. 148(4): p. 816-31.
321. Genissel, A., et al., Cis and trans regulatory effects contribute to natural variation in
transcriptome of Drosophila melanogaster. Mol Biol Evol, 2008. 25(1): p. 101-10.
322. Schilling, E., C. El Chartouni, and M. Rehli, Allele-specific DNA methylation in mouse
strains is mainly determined by cis-acting sequences. Genome Res, 2009.
323. Bell, J.T., et al., DNA methylation patterns associate with genetic and gene expression
variation in HapMap cell lines. Genome Biol, 2011. 12(1): p. R10.
324. Sanacora, G., G. Treccani, and M. Popoli, Towards a glutamate hypothesis of depression:
an emerging frontier of neuropsychopharmacology for mood disorders.
Neuropharmacology, 2012. 62(1): p. 63-77.
325. Kantrowitz, J.T. and D.C. Javitt, Thinking glutamatergically: changing concepts of
schizophrenia based upon changing neurochemical models. Clin Schizophr Relat
Psychoses, 2010. 4(3): p. 189-200.
326. Fusar-Poli, P., et al., Thalamic glutamate levels as a predictor of cortical response during
executive functioning in subjects at high risk for psychosis. Arch Gen Psychiatry, 2011.
68(9): p. 881-90.
327. de la Fuente-Sandoval, C., et al., Higher levels of glutamate in the associative-striatum of
subjects with prodromal symptoms of schizophrenia and patients with first-episode
psychosis. Neuropsychopharmacology, 2011. 36(9): p. 1781-91.
328. Palomino, A., et al., Decreased levels of plasma glutamate in patients with first-episode
schizophrenia and bipolar disorder. Schizophr Res, 2007. 95(1-3): p. 174-8.
329. Benneyworth, M.A., et al., A selective positive allosteric modulator of metabotropic
glutamate receptor subtype 2 blocks a hallucinogenic drug model of psychosis. Mol
Pharmacol, 2007. 72(2): p. 477-84.
330. Ginsberg, S.D., S.E. Hemby, and J.F. Smiley, Expression profiling in neuropsychiatric
disorders: emphasis on glutamate receptors in bipolar disorder. Pharmacol Biochem
Behav, 2012. 100(4): p. 705-11.
331. Muller, U.C. and H. Zheng, Physiological Functions of APP Family Proteins. Cold
Spring Harb Perspect Med, 2012. 2(2): p. a006288.
332. Dong, Q., et al., Molecular cloning of human G alpha q cDNA and chromosomal
localization of the G alpha q gene (GNAQ) and a processed pseudogene. Genomics,
1995. 30(3): p. 470-75.
333. Ohashi, T., et al., Long-term follow-up of electrocochleogram in Meniere's disease. ORL
J Otorhinolaryngol Relat Spec, 1991. 53(3): p. 131-6.
334. Szumlinski, K.K., A.W. Ary, and K.D. Lominac, Homers regulate drug-induced
neuroplasticity: implications for addiction. Biochem Pharmacol, 2008. 75(1): p. 112-33.
335. Larkin, E.H., Insulin Shock Treatment of Schizophrenia. Br Med J, 1937. 1(3979): p.
745-7.
336. Ziskind, E., et al., The Mechanism of Insulin Therapy in Schizophrenia. Cal West Med,
1938. 48(5): p. 310-1.
337. Herberth, M., et al., Impaired glycolytic response in peripheral blood mononuclear cells
of first-onset antipsychotic-naive schizophrenia patients. Mol Psychiatry, 2011. 16(8): p.
848-59.
338. Altar, C.A., et al., Insulin, IGF-1, and muscarinic agonists modulate schizophrenia-
associated genes in human neuroblastoma cells. Biol Psychiatry, 2008. 64(12): p. 1077-
87.
339. Erlander, M.G., et al., Two genes encode distinct glutamate decarboxylases. Neuron,
1991. 7(1): p. 91-100.
340. Moyer, C.E., et al., Reduced Glutamate Decarboxylase 65 Protein Within Primary
Auditory Cortex Inhibitory Boutons in Schizophrenia. Biol Psychiatry, 2012.
341. Najjar, S., et al., Glutamic Acid decarboxylase autoantibody syndrome presenting as
schizophrenia. Neurologist, 2012. 18(2): p. 88-91.
342. Yarlagadda, A., et al., Blood Brain Barrier: The Role of GAD Antibodies in Psychiatry.
Psychiatry (Edgmont), 2007. 4(6): p. 57-9.
343. Hampe, C.S., et al., Species-specific autoantibodies in type 1 diabetes. J Clin Endocrinol
Metab, 1999. 84(2): p. 643-8.
344. Andrade Lima Gabbay, M., et al., Serum titres of anti-glutamic acid decarboxylase-65
and anti-IA-2 autoantibodies are associated with different immunoregulatory milieu in
newly diagnosed type 1 diabetes patients. Clin Exp Immunol, 2012. 168(1): p. 60-7.
345. Wang, X., et al., Anti-idiotypic antibody specific to GAD65 autoantibody prevents type 1
diabetes in the NOD mouse. PLoS One, 2012. 7(2): p. e32515.
346. Warren, S.T., The Epigenetics of Fragile X Syndrome. Cell Stem Cell, 2007. 1(5): p. 488-
489.
347. Roelfsema, J.H., et al., Genetic heterogeneity in Rubinstein-Taybi syndrome: mutations in
both the CBP and EP300 genes cause disease. Am J Hum Genet, 2005. 76(4): p. 572-80.
348. Philibert, R.A., et al., MAOA methylation is associated with nicotine and alcohol
dependence in women. Am J Med Genet B Neuropsychiatr Genet, 2008. 147B(5): p. 565-
70.
349. Bonsch, D., et al., Lowered DNA methyltransferase (DNMT-3b) mRNA expression is
associated with genomic DNA hypermethylation in patients with chronic alcoholism. J
Neural Transm, 2006. 113(9): p. 1299-304.
350. Stack, E.C., et al., Modulation of nucleosome dynamics in Huntington's disease. Hum
Mol Genet, 2007. 16(10): p. 1164-75.
351. Devoto, M. and M. Falchi, Genetic mapping of quantitative trait loci for disease-related
phenotypes. Methods Mol Biol, 2012. 871: p. 281-311.
352. Rohrwasser, A., et al., From genetics to mechanism of disease liability. Adv Genet, 2008.
60: p. 701-26.
353. Argeson, A.C., K.K. Nelson, and L.D. Siracusa, Molecular basis of the pleiotropic
phenotype of mice carrying the hypervariable yellow (Ahvy) mutation at the agouti locus.
Genetics, 1996. 142(2): p. 557-67.
354. Roberts, N.J., et al., The Predictive Capacity of Personal Genome Sequencing. Sci Transl
Med, 2012.
355. Katz, D.J., et al., A C. elegans LSD1 demethylase contributes to germline immortality by
reprogramming epigenetic memory. Cell, 2009. 137(2): p. 308-20.
356. Knight, J.C., Resolving the variable genome and epigenome in human disease. J Intern
Med, 2012. 271(4): p. 379-91.
357. Zhang, K., et al., Digital RNA allelotyping reveals tissue-specific and allele-specific gene
expression in human. Nat Methods, 2009. 6(8): p. 613-8.
358. Li, W. and M. Liu, Distribution of 5-hydroxymethylcytosine in different human tissues. J
Nucleic Acids, 2011. 2011: p. 870726.
359. Ni, X., et al., Nucleic acid aptamers: clinical applications and promising new horizons.
Curr Med Chem, 2011. 18(27): p. 4206-14.
360. Jamieson, A.C., J.C. Miller, and C.O. Pabo, Drug discovery with engineered zinc-finger
proteins. Nat Rev Drug Discov, 2003. 2(5): p. 361-8.
361. Waggoner, D., Mechanisms of disease: epigenesis. Semin Pediatr Neurol, 2007. 14(1): p.
7-14.
362. Bernstein, E. and C.D. Allis, RNA meets chromatin. Genes Dev, 2005. 19(14): p. 1635-
55.
363. Pushparaj, P.N. and A.J. Melendez, Short interfering RNA (siRNA) as a novel
therapeutic. Clin Exp Pharmacol Physiol, 2006. 33(5-6): p. 504-10.
364. Wu, F., et al., Small Interference RNA Targeting TLR4 Gene Effectively Attenuates
Pulmonary Inflammation in a Rat Model. J Biomed Biotechnol, 2012. 2012: p. 406435.
365. Reich, S.J., et al., Small interfering RNA (siRNA) targeting VEGF effectively inhibits
ocular neovascularization in a mouse model. Mol Vis, 2003. 9: p. 210-6.
366. Reifenberger, G., et al., Predictive impact of MGMT promoter methylation in
glioblastoma of the elderly. Int J Cancer, 2011.
367. van Hoesel, A.Q., et al., Primary tumor classification according to methylation pattern is
prognostic in patients with early stage ER-negative breast cancer. Breast Cancer Res
Treat, 2012. 131(3): p. 859-69.
368. Moore, R.G., S. MacLaughlan, and R.C. Bast, Jr., Current state of biomarker
development for clinical application in epithelial ovarian cancer. Gynecol Oncol, 2010.
116(2): p. 240-5.
369. Steele, N., et al., Combined inhibition of DNA methylation and histone acetylation
enhances gene re-expression and drug sensitivity in vivo. Br J Cancer, 2009. 100(5): p.
758-63.
370. Liang, D., et al., Genetic variants in MicroRNA biosynthesis pathways and binding sites
modify ovarian cancer risk, survival, and treatment response. Cancer Res, 2010. 70(23):
p. 9765-76.
371. Schneider-Stock, R., et al., Epigenetic mechanisms of plant-derived anticancer drugs.
Front Biosci, 2012. 17: p. 129-73.
372. Terrazas, L.I., et al., Role of the programmed Death-1 pathway in the suppressive activity
of alternatively activated macrophages in experimental cysticercosis. Int J Parasitol,
2005. 35(13): p. 1349-58.
373. Wang, X., et al., Enlargement of secretory vesicles by protein tyrosine phosphatase PTP-
MEG2 in rat basophilic leukemia mast cells and Jurkat T cells. J Immunol, 2002. 168(9):
p. 4612-9.
374. Pearce, E.L., et al., Control of effector CD8+ T cell function by the transcription factor
Eomesodermin. Science, 2003. 302(5647): p. 1041-3.
375. Kaminsky, Z.A., et al., DNA methylation profiles in monozygotic and dizygotic twins. Nat
Genet, 2009. 41(2): p. 240-5.
376. Ptak, C. and A. Petronis, Epigenetics and complex disease: from etiology to new
therapeutics. Annu Rev Pharmacol Toxicol, 2008. 48: p. 257-76.
377. Ptak, C. and A. Petronis, Epigenetic approaches to psychiatric disorders. Dialogues Clin
Neurosci, 2010. 12(1): p. 25-35.