Genetic-epigenetic interactions: Sequence-dependent and

independent DNA methylation

by

Carolyn Ptak

A thesis submitted in conformity with the requirements

for the degree of Doctor of Philosophy

Graduate Department of Pharmacology and Toxicology

University of Toronto

© Copyright by Carolyn Ptak 2013

Abstract

The field of human epigenetics has become widely accepted, yet many basic principles remain

unclear. It is important to determine the extent to which DNA methylation profiles are influenced

by DNA sequence; we addressed this question using tissues from human monozygotic (MZ) and

dizygotic (DZ) twins, post-mortem brain and germline samples. First, we analyzed white blood

cells (WBC), buccal epithelium, and rectal biopsies from MZ twins, and annotated the epigenetic

metastability of ~6,000 unique genomic regions. Our study was the first to utilize epigenome-

wide profiling to document DNA methylation differences in MZ twins. We also found that DZ

twins exhibited more epigenetic differences than MZ twins. Two competing hypotheses

were tested: 1) DNA sequence differences caused the additional DZ epigenetic variation, or 2)

additional epigenetic differences are present in the zygotes of DZ co-twins. Our animal and in

silico studies supported hypothesis 2, providing the first evidence for twin-based epigenetic

heritability. Still, genetic impact on epigenetic variation cannot be excluded. To explore DNA–

epigenetic interactions and their role in disease, we mapped allele-specific methylation (ASM) in

brain and sperm DNA from individuals affected with major psychosis and controls. We found

that ~2.5% of brain SNPs show ASM, although genomic distribution of these “epiSNPs” varies

between cohorts. EpiSNPs were generally enriched in untranslated regions (UTRs) and regions

surrounding genes, and depleted in exons. The schizophrenia cohort contained twice as many

epiSNPs as the control and bipolar disorder cohorts, although they largely overlapped. Most epiSNP Gene

Ontology categories were related to brain development and function; differences between cohorts

were observed in glutamate and insulin pathways. Tissue-specific epiSNPs were also detected in

sperm DNA. Deep sequencing analysis revealed that any SNP could potentially demonstrate

low-level ASM. This work describes various aspects of genetic–epigenetic interactions, while

supporting epigenetic heritability and the role of genetic-epigenetic interactions in major

psychiatric disease.

Acknowledgments

First, I would like to thank my supervisor, Art Petronis, who has offered an incredible amount of

support, guidance and inspiration over the years. I would also like to thank the members of my

committee, Albert Wong and Young Kim, for their advice, creativity, and even for grilling me at

our annual meetings. There have been many excellent people in the lab, all of whom contributed

to the completion of my work, but I would especially like to thank Zach Kaminsky for training

me on all basic lab techniques and for involving me in every step of the twin project, Gabriel Oh

for sharing the entire PhD process with me, Jon Mill and Ian Weaver for being entertaining

British fellows, and Miki Susic for keeping everyone in line and the lab running smoothly. I

would also like to thank everyone who directly contributed to the work included in this thesis, in

particular, Paul Boutros and Denise Mak, the bioinformaticians who developed a number of

analytical tools for many of these analyses. Additionally, I am thankful for the departmental

fellowships, Ontario Graduate Scholarships, and CIHR Master’s and Doctoral awards that I

received to fund this research. Finally, I want to thank my husband, Greg, and our lizard for

listening to my insane rants about “the lab” every night for the last 5.5 years.

Table of Contents

Acknowledgments

Table of Contents

List of Tables

List of Figures

List of Appendices

List of Abbreviations

Chapter 1 Introduction

Statement of problem

Study objectives and rationale

Review of the literature

Epigenetics

Twin studies and the separation of genetic and epigenetic factors in disease

The putative role of genetic-epigenetic interactions in complex disease

Genetic and epigenetic studies of major psychosis

ASM: relevance to studies of complex disease

Emergence of epigenetic treatments

Chapter 2 Materials and Methods

Contributions: DNA dependent and independent DNA methylation in twins

Twin sample

DNA methylation profiling

Animal studies

Data analysis

Test for association of epigenetic difference with cellular heterogeneity

Biological and technical variation

Spot-wise epigenetic variation

Cross tissue comparison

Investigation of genomic element class

Gene ontology analysis

Validation of the microarray findings

Contributions: ASM and its putative role in complex disease

Sample preparation

EpiSNP identification

Verification of microarray results

Examination of linkage disequilibrium effects

Deep sequencing analysis of non-epiSNPs

Chapter 3 Results and Discussion

Comparison of MZ versus DZ epigenetic profiles

Genomic frequency and distribution of ASM

Bisulfite verification of selected SNPs

Linkage disequilibrium does not cause false-positives

ASM in the major psychosis cohort

Tissue specificity of ASM

Sensitivity Analysis

Low level ASM across the genome

Chapter 4 General Discussion and Conclusions

Changing concepts of epigenetic regulation

Genetic-epigenetic interplay in complex disease

Future directions

Appendices

Appendix 1. Twin Study Supplementary Notes

Appendix 2. Allele-Specific Methylation Study Supplementary Notes

Copyright Acknowledgements

References

List of Tables

Table 2.1 Sodium bisulfite treated loci and primers

Table 2.2 Sodium bisulfite treated SNP loci and primers

Table 2.3 Forward 454 sequencing reads per amplicon

Table 3.1 GO analysis of loci with high MZ co-twin epigenetic similarity

Table 3.2 GO analysis of loci with low MZ co-twin epigenetic similarity

Table 3.3 Differences in epiSNP chromosomal distribution in brain and sperm

Table 3.4 Differences in epiSNP functional class distribution in brain and sperm

Table 3.5 Average methylation across loci

Table 3.6 Direction of methylation and gene information for common epiSNPs

Table 3.7 Top five GO categories per brain cohort

Table 3.8 Top 5 enriched GO categories per brain cohort

Table 3.9 Glutamate- and insulin-related GO categories per brain cohort

Table 3.10 Summary of mitochondria-related epiSNPs per brain cohort

Table 3.11 Minor epiSNPs showing significant ASM effects in brain

Table A2.1 Stanley sample demographics

Table A2.2 Harvard sample demographics

Table A2.3 CAMH sample demographics

Table A2.4 Methylation levels at all CpG sites

Table A2.5 EpiSNPs and associated gene information

Table A2.6 454 analysis sample genotypes

List of Figures

Figure 1.1 Hypothetical mechanisms of epiSNP action

Figure 2.1 Twin study workflow

Figure 2.2 EpiSNP study workflow

Figure 3.1 Biological vs. technical variation

Figure 3.2 Correlations between microarray and sodium bisulfite sequencing data

Figure 3.3 Pyrosequencing correlations as a function of distance

Figure 3.4 Karyogram of MZ co-twin epigenetic similarity in WBCs

Figure 3.5 Raw binding intensities of MC and DC MZ twin hybridizations

Figure 3.6 MZ and DZ ICC distributions in buccal cells

Figure 3.7 Karyogram of MZICC-DZICC values in buccal cells of DC MZ twins

Figure 3.8 Technical variation volcano plots of HpaII and MspI based enrichments

Figure 3.9 Distributions of inbred and outbred epigenetic variation

Figure 3.10 Enrichment of unmethylated DNA fraction

Figure 3.11 Chromosomal distribution of brain epiSNPs

Figure 3.12 Functional class distribution of brain epiSNPs

Figure 3.13 Methylation levels observed for an epiSNP

Figure 3.14 Methylation levels observed for a non-epiSNP

Figure 3.15 Distances between SNPs and MSRE SNPs

Figure 3.16 Linkage disequilibrium scores in brain and sperm

Figure 3.17 Total epiSNPs per cohort in brain

Figure 3.18 GO categories per cohort in brain

Figure 3.19 EpiSNPs detected in sperm DNA

Figure 3.20 Overlapping epiSNPs between brain and sperm

Figure 3.21 Chromosomal distribution of sperm epiSNPs

Figure 3.22 Functional class distribution of sperm epiSNPs

Figure 3.23 Sensitivity analysis

Figure 3.24 Sensitivity analysis stratified by strength of associations

Figure 3.25 Deep sequencing workflow

Figure 3.26 CpG count per SNP

Figure 3.27 Methylation levels per group for a sample minor epiSNP

Figure A1.1 Karyogram of MZ co-twin epigenetic similarity in buccal cells

Figure A1.2 Karyogram of MZ co-twin epigenetic similarity in gut

Figure A1.3 Karyogram of MZICC-DZICC values in WBCs

Figure A1.4 Karyogram of MZICC-DZICC values in buccal cells of MC MZ twins

List of Appendices

Appendix 1. Twin Study Supplementary Notes

Appendix 2. Allele-Specific Methylation Study Supplementary Notes

List of Abbreviations

5-hydroxymethylcytosine 5-hmC

5-methylcytosine 5-mC

3’ untranslated region 3’ UTR

5,10-methylenetetrahydrofolate reductase MTHFR

Acute lymphoblastic leukemia ALL

Allele-specific expression ASE

Allele-specific methylation ASM

Alpha-ketoglutarate-dependent dioxygenase FTO

Base-pair bp

Bipolar disorder BD

Breast cancer type 1 BRCA1

Caenorhabditis elegans C. elegans

Copy number variants CNV

CpG islands CGI

Cyclin-dependent kinase inhibitor 2A p16INK4a

Differentially methylated regions DMR

Dichorionic DC

Dizygotic DZ

DNA methyltransferase DNMT

Endoplasmic reticulum ER

Epigenome-wide association studies EWAS

Expression quantitative trait loci eQTL

False discovery rate FDR

Fragile X mental retardation 1 FMR1

GABA plasma membrane transporter-1 GAT-1

Gene ontology GO

Genome-wide association studies GWAS

Glutamic acid decarboxylase 67 GAD67

Glutathione transferase GST

G protein-coupled inwardly rectifying potassium channel KCNJ6

Head and neck squamous cell carcinoma HNSCC

Heterozygosity quotient HQ

Histone acetyltransferase HAT

Histone deacetylase HDAC

HLA complex group 9 HCG9

Human embryonic stem cell HESC

Human leukocyte antigen HLA

Intraclass correlation coefficient ICC

Imprinted differentially methylated regions iDMR

International Human Epigenome Consortium IHEC

Jak and microtubule interacting protein MARLIN-1

Kilobase Kb

Linkage disequilibrium LD

Long interspersed nucleotide element LINE

Major depressive disorder MDD

Melanin-concentrating hormone receptor 1 MCHR1

Methylation-sensitive representational difference analysis MS-RDA

Methylation-sensitive restriction enzyme MSRE

Methyl-CpG binding domains MBD

Methyl-CpG-binding protein 2 MeCP2

Micro RNA miRNA

Monochorionic MC

Monozygotic MZ

MutL homolog 1 hMLH1

Nei endonuclease VIII-like 1 NEIL1

Non-coding RNA ncRNA

O(6)-methylguanine DNA methyltransferase MGMT

Peripheral blood leukocyte PBL

Polymerase chain reaction PCR

Potassium chloride co-transporter 3 SLC12A6

Quantitative trait loci QTL

Reelin RELN

RNA-induced silencing complex RISC

S-adenosyl-methionine SAM

Schizophrenia SZ

Secretogranin II SCG2

Serotonin receptor 1A 5HT1A

Single nucleotide polymorphism SNP

Small interfering RNA siRNA

Transcription factor TF

Ten-eleven translocation TET

Toronto Centre for Applied Genomics TCAG

Type 2 diabetes T2D

Vitamin D receptor VDR

White blood cells WBC

Chapter 1 Introduction

Statement of problem

DNA sequence has long been regarded as the means for encoding information in the mammalian

cell, and when the extent of human genetic diversity began to emerge, a few years after the

release of the complete human genome, Science declared the discovery to be the “breakthrough

of the year [1],” as it was hoped that these sequence differences would explain inter-individual

phenotypic variation. Unfortunately, there was still a considerable disconnect between DNA

sequence and phenotypic outcomes, but epigenetic factors were quickly nominated as the

connection between environment and genetics that should be added to this “first draft” of the

genome [2]. Although associations between genetic and epigenetic factors are starting to

materialize, their complex relationship remains somewhat nebulous. It is critical to understand

the influence of genetics on epigenetics and vice versa, and the twin study design provides an

elegant strategy for teasing apart their effects.

Twin research has been of fundamental importance in human studies for two main reasons.

First, phenotypic discordance in monozygotic (MZ) co-twins has traditionally indicated a role of

environment, and twin studies offer a means to measure the relative contributions of genes and

environmental factors. Countless twin studies have been performed over the last century on

almost any trait imaginable, but primarily on human disease [3], although an acceptable

mechanistic explanation for MZ discordance has yet to be presented. In the last decade,

evidence has been accumulating that epigenetic modifications of DNA and histones can have a

primary role in phenotypic outcomes, including human disease [4]. DNA methylation shows

only partial stability, which could be caused by a wide variety of factors, including

developmental programs, environment, hormones, and stochastic events [5-8]. Such epigenetic

metastability may result in substantial epigenetic differences across genetically identical

organisms [9]. Several studies have identified epigenetic differences, either at selected genes of

MZ twins [10-13] or in the overall epigenome [14]. It has become evident that the MZ study

design is quite useful for the investigation of methylation differences that are not sequence-

dependent. Despite this promising start, no epigenome-wide studies have yet been conducted to

catalogue the extent of this phenomenon, and of the targeted studies, few have been done in

tissues other than peripheral blood cells.

The second major benefit of the twin design is that comparison of phenotypic concordance rates

in MZ twins versus dizygotic (DZ) twins is a powerful strategy to estimate heritability. Nearly

universally, MZ twins show various degrees of discordance, generally lower in comparison to

discordance in DZ twins. These observations provided the basis for the current paradigm of

human normal and morbid biology, which focuses on DNA sequence variation and

environmental differences. The extent to which DZ twins are different remains unknown, as

does the degree to which DNA sequence variants can influence local methylation levels, and

how this factors into the differences observed between MZ and DZ twins. Our twin study

demonstrated that DZ twins exhibited a larger degree of epigenetic variation in comparison to

MZ twins, which is most likely an outcome of epigenetic differences in the zygotes. At the same

time, we could not fully exclude the putative role of DNA sequence variants on epigenetic

variation, and subsequently dedicated significant effort to mapping of DNA-epigenetic

interactions at common single nucleotide polymorphisms (SNPs).

SNPs have been investigated in many diseases and conditions [15-17], but their actual

contributions remain largely unknown, and knowledge of individual SNP risk factors fails to

fully explain the heritability estimates for complex traits [18]. In genome-wide association

studies (GWAS), SNPs frequently associate with certain phenotypes, but many SNPs detected

with this approach do not reach significance. We aimed to demonstrate that there is

heterogeneity within the A and B alleles at many SNPs, and that stratification based on

epigenetic properties would greatly enhance the ability of GWAS to detect strong

markers of disease and reach more meaningful conclusions.

It has been found that SNPs may exhibit allele-specific methylation (ASM); however, this

evidence is derived from a limited number of studies that did not include a sufficient number of

samples [19-21]. ASM may play an important role in phenotypic diversity and disease etiology,

yet its occurrence and link to genetic polymorphism remain unknown. Despite an intense

interest in single-locus ASM associated with various cancer subtypes [22-24], this phenomenon

has not been investigated in the context of other complex non-Mendelian diseases. Major

psychosis, a term that describes both schizophrenia (SZ) and bipolar disorder (BD), is a perfect

example of a disease that exhibits substantial heritability, yet the patterns of inheritance and the

exact genes or SNPs involved remain largely undiscovered. In the case of SZ, GWAS have

identified a number of associated SNPs, yet only a small minority of those SNPs have actually

been found to be functionally related to the phenotypes of affected individuals, and none of

them are classified as biomarkers [25]. While the natural response seems to be intensification of

the same research strategy (larger sample sizes, higher-resolution mapping, etc.), we should also

continue to search for new disease mechanisms. DNA sequence variants with the ability to

interact with epigenetic modifications are a promising new field of study, as these “epiSNPs”

could potentially alter the expression of genes in cis and, potentially, in trans.

It is essential to understand the interaction between genetics and epigenetics in both normal

individuals and those in a disease state, but overall, our knowledge of sequence-dependent and

independent DNA methylation is severely lacking. Using microarrays, a twin study design and a

genome-wide interrogation of ASM effects in cases and controls, we will attempt to solve some

of these fundamental molecular mysteries. Ultimately, we hope that our findings can be utilized

in the identification of molecular targets for new pharmaceuticals, which will be applied in the

treatment of SZ, BD and other complex diseases.

Study objectives and rationale

Many questions remain concerning the relationship between DNA sequence and epigenetic

factors; we will never see a complete picture of genomic activity until these interactions are

elucidated. One goal is to determine the relative amount of sequence-independent epigenetic

variation, and the MZ twin model provides an effective way to study it. In genetically identical

organisms, such as human MZ twins, epigenetic patterns may drift due to the partial stability of

DNA methylation and other epigenetic modifications. Previous studies have identified

epigenetic metastability in MZ twins, but they have failed to examine these differences in a

large-scale, genome-wide manner; thus, the first objective was to map the DNA methylation

differences between MZ co-twins using WBC, gut and buccal epithelial cell tissue.

The comparison of concordance rates between MZ and DZ twins is one method to estimate the

contributions of genetic (heritability) and non-genetic factors (environment) for any given trait;

we wished to apply this same model to the investigation of epigenetic variation. Very few

studies have touched upon the concept of epigenetic or “soft” inheritance, which refers to the

transmission of epigenetic marks from parent to offspring through the germ cells. The epigenetic

marks established in the majority of cells over the lifetime of an organism are mostly irrelevant

to the next generation, with the exception of those occurring in the mature gametes [26]. While

it is known that a large-scale erasure of methylation marks occurs during early mammalian

development, presumably to restore all cell lineages to a common ground state, several

examples of meiotically transmitted epi-alleles have been discovered in a variety of organisms,

including humans [27]. In order to investigate the extent of this phenomenon, our second

objective was to compare the DNA methylation variation between MZ and DZ twins, using

white blood cells (WBC) and buccal epithelial cells.
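
As a purely illustrative aside, the classical logic of the MZ versus DZ comparison described above can be sketched with Falconer's approximation, which converts twin-pair correlations into rough variance components. This is a textbook simplification and not the analysis performed in this thesis; the function name and the example correlations are hypothetical, and the formulas assume that MZ twins share all, and DZ twins on average half, of their segregating variants.

```python
def falconer_estimates(r_mz, r_dz):
    """Rough variance components from twin-pair correlations.

    r_mz, r_dz: trait correlations within MZ and DZ pairs (assumed inputs).
    Returns a dict with:
      h2 - additive genetic share        = 2 * (r_mz - r_dz)
      c2 - shared-environment share      = 2 * r_dz - r_mz
      e2 - non-shared share (incl. noise) = 1 - r_mz
    """
    for name, r in (("r_mz", r_mz), ("r_dz", r_dz)):
        if not -1.0 <= r <= 1.0:
            raise ValueError(f"{name} must lie in [-1, 1], got {r}")
    h2 = 2.0 * (r_mz - r_dz)
    c2 = 2.0 * r_dz - r_mz
    e2 = 1.0 - r_mz
    return {"h2": h2, "c2": c2, "e2": e2}


if __name__ == "__main__":
    # Hypothetical correlations chosen only to show the arithmetic.
    for component, value in falconer_estimates(r_mz=0.8, r_dz=0.5).items():
        print(f"{component}: {value:.2f}")
```

By construction the three components sum to one, and an estimate outside the range 0 to 1 is a sign that the simple model does not fit the data. Applying the same arithmetic to DNA methylation levels would additionally treat any zygote-level epigenetic difference between DZ co-twins as if it were genetic, which is exactly the ambiguity the second and third objectives are designed to address.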

Unlike MZ twins, DZ twins only share 50% of segregating DNA polymorphisms [28]; thus, any

additional epigenetic variation between DZ twins can potentially be explained by: 1) DNA

sequence effects on epigenetic variation or 2) epigenomic individuality of zygotes. We explored

both hypotheses and determined that it was necessary to systematically document genetic–

epigenetic interactions. We addressed this issue in our third objective by performing an in silico

SNP analysis, as well as conducting an animal study, in which DNA methylation variation was

compared between inbred (genetically identical) and outbred mice (non-identical).

Although we went on to determine that the epigenetic variation between MZ and DZ twins was

independent of DNA sequence, it was clear that our experimental design limited the amount of

observable sequence-dependent methylation effects. The twin experiment did not allow us to

make conclusions about the portion of methylation that may be controlled by DNA sequence

variants, although this interaction is potentially an important step in the etiopathogenesis of

complex diseases. A large-scale, unbiased mapping of these events has never been

accomplished; thus, our fourth objective was to estimate the percentage of SNPs that

demonstrate ASM, investigate the distribution of these epiSNPs throughout the genome, and

then determine the potential of any given SNP to display ASM effects. Many molecular findings

show some sort of tissue-specificity, and this is especially true for epigenetic effects [29], as

different tissues are subjected to different hormone levels, environmental stressors, and other

xenobiotics, all of which may impact the epigenome to some degree [30, 31]. We examined a

second DNA source (sperm cells from control and BD subjects) to determine if ASM effects

would be present and, if so, how they would compare to effects seen in the brain.

Our lab has previously determined that epigenetic factors are involved in the etiology of

psychosis [32]; however, this study only examined the contribution of epigenetic factors without

considering an interaction with DNA sequence. In this experiment, our fifth objective was to

identify epiSNPs in a subset of individuals affected with psychosis, and then compare them to

those detected in the control set. The over-arching hypothesis of this portion of the study is that

a specific epigenetic state is required for a SNP to be classified as a true risk factor for

psychosis, and that genetic-epigenetic interactions should be considered when searching for

predictive or causative elements associated with complex diseases.

Review of the literature

Epigenetics

Epigenetics refers to regulation of various genomic functions, including gene expression, that

is brought about by mitotically heritable, but potentially reversible, changes in DNA

methylation and various modifications of histones (acetylation, methylation, phosphorylation,

etc.) [33]. The two epigenetic mechanisms work in concert, with alterations in DNA

modification affecting histone modifications and vice versa. In humans and animals,

methylation of DNA occurs at the C5 position of cytosines (5-mC), primarily within

cytosine/guanine dinucleotides (CpG), which is established and maintained by the DNA-

methyltransferase (DNMT) family of enzymes. DNA is wrapped around octamers of basic

histone proteins (H2A, H2B, H3 and H4), forming higher order nucleosome structures.

Modifications of these proteins, such as acetylation, methylation, phosphorylation,

and ubiquitination, control chromatin states, which can be open (transcriptionally active) or

closed (inactive). Among numerous other histone modification enzymes, histone

acetyltransferases (HATs) acetylate lysine residues on the N terminal tail of histone proteins.

This neutralizes the positive charge of the protein, decreasing its affinity for DNA and leading to

a looser interaction [34] that creates an open chromatin structure and increases accessibility for

the transcription machinery. In contrast, human histone deacetylases (HDACs) remove acetyl

groups, which results in condensed chromatin and gene inactivation [35]. Proteins with N

terminal methyl-CpG binding domains (MBD), such as methyl-CpG-binding protein 2

(MeCP2), can bind to methylated sites on DNA and complex with HDACs and the corepressor

Sin3a. This leads to histone deacetylation and the silencing of genes downstream from the

methylated CpG site. The effects of histone methylation depend on the specific lysine or

arginine that is modified, and can also result in either gene activation or repression [36].

In addition to the well-known modified pyrimidine base, 5-mC, a second modified cytosine has

recently been established as an important epigenetic factor. In mammals, 5-

hydroxymethylcytosine (5-hmC) is generated through oxidation of 5-methylcytosine by the ten-

eleven translocation (TET) family of enzymes [37]; the TET1 protein also binds a large number

of the Polycomb group target genes and colocalizes with the SIN3A co-repressor complex,

indicating that it plays a role in regulating transcription and preventing excessive 5-mC

accumulation at CpG-rich sequences [38]. 5-hmC was originally discovered in bacteriophage in

the early 1950s [39], but it was not until 2009 that it was discovered in Purkinje neurons and

human stem cells [37, 40]. In human and mouse brains, 5-hmC is surprisingly abundant [40],

although it can occur in any cell type and tends to be enriched in the bodies of highly expressed

genes [41]. Unlike 5-mC, 5-hmC is also enriched in CpG-rich transcription start sites [38],

suggesting that conversion of 5-mC to 5-hmC is a way to reverse the transcriptional repression

that results from methylation [42], although its exact function in the genome remains unknown.

Epigenetic studies of various species – from E. coli and yeast to animals and humans – have

demonstrated that epigenetic regulation is critically important in the normal functioning of

genomes [43-45]. Cells can only operate normally if both the DNA sequence and epigenetic

components of the genome function properly; epigenetically dysregulated genes, despite

impeccable DNA sequences, can be harmful and cause disease [46, 47]. To date, the role of

epigenetic factors has been thoroughly investigated in rare paediatric syndromes [48] and

malignant transformation of cells in cancer [49-51]. More importantly, epigenetics can be highly

relevant to various complex non-Mendelian diseases, as epigenetic mechanisms allow for the

integration of a variety of apparently unrelated clinical, epidemiological, and molecular data into

a new theoretical framework [18].

Twin studies and the separation of genetic and epigenetic factors in disease

Discordance of identical (MZ) twins is one of the hallmarks of complex non-Mendelian disease;

concordance of monozygotic twins reaches only ~15% in breast cancer, 20% in ulcerative colitis,

25-30% in multiple sclerosis, 25-45% in diabetes, 50% in schizophrenia, and 40-70% in Alzheimer’s

disease [52]. The discordance of MZ twins has traditionally been attributed to the differential

effect of environmental factors, which supposedly produce disease in one of the two genetically

predisposed co-twins [53]. Identification of such factors has been very difficult and, so far, only

a limited number of environmental disease risk factors have been identified (e.g. smoking in

lung cancer, diet in cardiovascular diseases) [54, 55].

The epigenetic explanation for MZ twin discordance is that, due to the partial stability of

epigenetic factors, a substantial degree of disease-relevant epigenetic dissimilarity can be

accumulated in genetically identical twins [10, 14]. Epigenetic differences in MZ twins may

reflect differential exposure to a wide variety of environmental factors. For example, intake of

folic acid affects both the global methylation level in the genome and regulation of imprinted

genes [56, 57]. It is also generally accepted that a sufficient level of methyl donor molecules is

necessary for normal mammalian neural tube development [58], and it has recently been

determined that the increased risk of neural tube defects is associated with hypomethylation of

long interspersed nucleotide element-1 (LINE-1) [59]. During pregnancy, maternal dietary

methyl supplements increase DNA methylation and change methylation-dependent epigenetic

phenotypes in mammalian offspring [60, 61]. One important cellular methyl donor, S-adenosyl-

methionine (SAM), has been found to mediate the activity of glutathione transferase (GST),

which is an enzyme involved in toxicant metabolism as well as neuronal stability. A study that

utilized a mouse model of Alzheimer’s disease illustrated the link between SAM and GST levels

– the mice originally had reduced levels of both SAM and GST, but SAM supplementation

restored GST activity, which is a promising development in the field of Alzheimer’s research

[62]. Overall, there could be numerous environmental stressors, including alcohol consumption

[63], asbestos and arsenic exposure [64, 65], and even maternal behaviour [66], that cause some

epigenetic “trace.”

MZ twin study designs are especially suited for the investigation of environmental epigenetics,

because there is no confounding effect from DNA sequence differences [67]. At the time of

commencement of our twin study, a number of studies had detected epigenetic differences between

MZ twins at individual loci. For example, in MZ twins discordant for Beckwith-Wiedemann

syndrome, a methylation difference at KCNQ1OT1 was detected between affected and unaffected

co-twins, representing an imprinting defect [68]. Methylation differences were also found to

occur between MZ twins in the regulatory regions of the catechol-o-methyltransferase [69] and

dopamine D2 receptor [10] genes. Isogenic organisms, such as inbred animals, are also useful

for molecular epigenetic studies, as they have identical genomes [70]. Famously, the inheritance

of an epigenetic modification upstream of the agouti locus was documented in isogenic mice:

variation in the agouti phenotype - which can be visually detected as a fur colour continuum

from yellow to full agouti - was found to be the result of incomplete erasure of an epigenetic

modification that was then inherited through the female germline [71].

As a rule, epigenetic profiles are much more dynamic compared to DNA sequence, and the

epigenetic differences that occur between MZ twins may stem from many causes. DNA

methylation levels, for example, are not rigidly fixed in place and may become altered as a

result of environmental stressors, developmental programs, or even stochastically [67]. Some

mechanisms of stochasticity in epigenetic regulation are well understood. For example, the

enzyme DNA methyltransferase I (DNMT1), which acts as a maintenance enzyme and replaces

the methyl group at hemi-methylated sites [72], does not work with 100% accuracy. In mice,

the fidelity of DNMT1 was found to be approximately 95% [73], while other studies have

estimated the value to be 99.85-99.92% [6], although this second estimate was believed to take

into account the contribution of the de novo methyltransferases, DNMT3a and DNMT3b.

Another feature of DNMT1 is its ability to randomly methylate unmethylated cytosines, and this

activity is the main cause of most methylation errors, even in CGIs [6]. It is evident that

epigenetic marks have the potential to be gained or lost at every mitotic replication, and that this

occurrence can have important cumulative downstream effects, making it possible for MZ twins

to differ epigenetically without any specific causal factor.
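
The cumulative effect of imperfect maintenance can be illustrated with a short calculation. The sketch below is a deliberate simplification and not a model taken from the cited studies: it assumes that each division retains a methylated site independently with probability equal to the per-division fidelity, that lost marks are never restored by de novo methylation, and that spurious gains are ignored. The function name and the division counts are hypothetical; only the fidelity values come from the text.

```python
def retained_fraction(fidelity: float, divisions: int) -> float:
    """Expected fraction of methylated CpG sites that remain methylated.

    Assumes each division keeps a methylated site with probability
    `fidelity`, independently of other divisions, and that lost marks
    are never restored (no de novo methylation).
    """
    if not 0.0 <= fidelity <= 1.0:
        raise ValueError("fidelity must lie between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return fidelity ** divisions


if __name__ == "__main__":
    # Fidelity values quoted in the text; division counts are hypothetical.
    for fidelity in (0.95, 0.9985, 0.9992):
        for divisions in (10, 50):
            fraction = retained_fraction(fidelity, divisions)
            print(f"fidelity={fidelity}, divisions={divisions}: {fraction:.3f}")
```

Under these assumptions, a fidelity of 95% would leave roughly 60% of the original marks after 10 divisions, whereas the higher estimates leave more than 90% after 50 divisions, which shows how strongly the expected degree of epigenetic drift depends on the fidelity estimate that is adopted.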

The putative role of genetic-epigenetic interactions in complex disease

There are three fundamental points that enable us to consider epigenetic factors as etiological

candidates in complex disease. First, the epigenetic status of genes is more dynamic in

comparison to DNA sequence, and can be altered by developmental programs and the

environment of the organism [66]; furthermore, epigenetic changes may occur even in the

absence of obvious environmental differences, i.e. due to stochastic reasons [5]. Second, some

epigenetic signals can be transmitted along with DNA sequence across the germline generations,

i.e. such signals exhibit partial meiotic stability [27]. Third, epigenetic regulation is critical for

normal genomic function, such as segregation of chromosomes in mitosis, inactivation of

parasitic DNA elements, and regulation of gene activity [74, 75].

Partial epigenetic stability and the primary role of epigenetics in controlling the activities of

DNA sequences can shed a new light on various non-Mendelian irregularities of complex

diseases, such as MZ twin discordance (described above), sexual dimorphism, parent-of-origin

effects, familiality and sporadicity. One of the important peculiarities of complex disease is

sexual dimorphism - differential susceptibility to a disease in males and females. In psychiatric

conditions such as Alzheimer's disease, schizophrenia, alcoholism, and mood and anxiety

disorders, psychopathology exhibits a number of differences between the sexes in rates of illness as

well as the course of illness [76]. It is important to note that sex effects in complex diseases cannot

be explained by sex chromosome-linked genes, and that these effects are also observed on

autosomes [77]. While hormones cannot change DNA sequence, they can be potent modifiers of

epigenetic status, which controls genomic activities; thus, sex effects may be mediated by

hormone-induced epigenetic alterations [77].

In some complex diseases, risk to offspring depends on the sex of the affected parent. For

example, asthma, bipolar disorder, and epilepsy are more often transmitted from the mother, while

type 1 diabetes seems to be more often transmitted from the affected father [52]. Parent-of-origin-dependent

clinical differences have also been detected in schizophrenia [78]. Molecular genetic

studies, although rarely performed in a sex-specific fashion, have discovered parental origin

effects in a wide variety of phenotypes, such as obesity [79], Alzheimer’s disease [80], atopy

and asthma [81], autism [82], autoimmunity [83], and major psychosis [84]. One of the most

common mechanisms of parent-of-origin effects is genomic imprinting [85], where differential

epigenetic modification of genes occurs based on their parental origin, resulting in expression of

genes from only one of the two parental copies [86]. Disruption of the normal imprinting pattern

often causes diseases that affect cell growth, development, and behaviour [87], with severe

disruptions potentially causing recurrent molar pregnancy, miscarriage or infertility [88].

To date, imprinting is the only form of ASM that is moderately well understood, although the search

for new imprinted domains continues constantly.

The epigenetic model of complex disease could be imagined as a chain of aberrant epigenetic

events that begins with a pre-epimutation, a primary epigenetic problem that takes place during

the maturation of the germline; pre-epimutation increases the risk for the disease but is not

necessarily sufficient to cause the disease. The dysregulation can be tolerated to some extent,

and age of disease onset may depend on the effects of tissue differentiation, stochastic factors,

hormones, and probably some external environmental factors (nutrition, infections, medications,

addictions, etc) [7, 89, 90]. It may take decades to reach a critical threshold, beyond which the

genome, cell, or tissue is no longer able to function normally – this may be the case for many

adult-onset diseases [91] – and only some predisposed individuals will reach the “threshold” of

epigenetic dysregulation and acquire phenotypic changes that meet the diagnostic criteria for a

clinical disorder. Severity of epigenetic dysregulation may fluctuate over time, and in clinical

terms this is known as remission and relapse. In some cases, “aging” epimutations may slowly

regress back to the norm. For example, in psychosis, this is seen as fading psychopathology or

even partial recovery, which is consistent with age-dependent epigenetic changes in the genome

[92]. The same principle applies to other diseases, such as asthma [93] and attention deficit and

hyperactivity syndrome [94]. It should be noted that an epimutation could represent a

sequence-independent change, such as a stochastic gain or loss of DNA methylation, or a

sequence-dependent event, for example, genetic disruption of an imprinted domain.

Although a wider variety of studies are beginning to appear, to date, epigenetic factors in

complex disease have not been intensely investigated, with the exception of cancer. Genes

involved in various cellular pathways may become misregulated, but epigenetic silencing of

tumor suppressor genes, such as the gene encoding the cell cycle inhibitor, cyclin-dependent

kinase inhibitor 2A (p16INK4a), the DNA repair genes, breast cancer type 1 (BRCA1) and MutL

homolog 1 (hMLH1), has been studied the most extensively. Current estimates suggest that the

average tumor will contain approximately 100-400 hypermethylated promoter regions [95].

Global hypomethylation is also observed in cancer cells [96, 97], and it is believed to cause a

decrease in genomic stability and the formation of abnormal chromosome structures. Not

surprisingly, in addition to aberrant DNA methylation changes, histone modification changes

have also been detected in malignant cells [98, 99]. For example, the actions of histone H2A.Z

in cancer cells depend on both its level of acetylation and its location within the promoter region

[100]. Despite the regular occurrence of epigenetic changes in cancer, it is not clear which

epimutations are primary causes of early stage malignant transformation, versus the ones that

simply represent downstream effects of these primary causes [101, 102]. Until we are able to

differentiate between these subtypes, effective etiological treatment of cancer is not possible via

epigenetic approaches.

In addition to cancer, some epigenetic studies of psychiatric diseases have been completed or

are underway. The maintenance DNA methyltransferase, DNMT1, was shown to be

upregulated in GABAergic medium spiny neurons in layers I and II of the cerebral prefrontal

cortex in schizophrenia and bipolar disorder patients. An increase in DNMT1 levels, along with

a decrease in reelin (RELN) and glutamic acid decarboxylase 67 (GAD67), also occurs in

GABAergic medium spiny neurons of the caudate nucleus and putamen in schizophrenia

patients [103]. In autism studies, a substantial proportion of post-mortem brain samples from

autistic individuals revealed monoallelic or highly skewed allelic expression of GABA receptor

subunit genes, while such genes were biallelically expressed in control brain samples [104]. Rett

syndrome, an X-linked neurodevelopmental disorder, has been shown to result from a mutation

in MeCP2, of which the protein product represses gene transcription by binding to 5-

methylcytosine residues [105]. Fragile X syndrome has been linked to epigenetic silencing and

loss of expression of the fragile X mental retardation 1 (FMR1) gene, due to expansion of a

CGG repeat in its 5’-untranslated region [106]. Several studies have focused on the epigenetics

of psychosis, and this topic will be discussed in the next section of the review. Although the

underlying epimutations remain unknown in most complex diseases, many epigenetic

therapeutic agents have already been developed. Several of these compounds are progressing

through the clinical trial stage, or have even become approved treatments for particular

conditions.

Genetic and epigenetic studies of major psychosis

Psychiatric diseases place a tremendous burden on affected individuals, their caregivers and the

healthcare system. Although evidence exists for a strong inherited component to many of these

conditions, dedicated efforts to identify DNA sequence-based causes have not been

exceptionally productive, and very few pharmacologic treatment options are clinically available.

Major psychosis is a classification that encompasses both schizophrenia (SZ) and bipolar

disorder (BD) - two conditions that seem to be related etiologically [107]. SZ is a multifactorial

disease characterized by disordered thinking and concentration that results in psychotic thoughts

(delusions and hallucinations), inappropriate emotional responses, erratic behaviour, as well as

social and occupational deterioration [108], while BD represents a category of mood disorders,

in which affected individuals experience episodes of mania or hypomania interspersed with

periods of depression, and may also suffer from delusions and hallucinations.

A variety of theories on the origin of psychosis have been proposed, many of which focus on

disturbances in brain circuitry and neurotransmitter levels. A prevalent opinion is that a genetic

predisposition paired with psychosocial and environmental elements is ultimately responsible,

but identification of any of these factors has been daunting. Popular theories of psychosis have

involved dopamine [109], serotonin [110] and glutamate [111] pathways, and first-line

pharmacological therapies mainly focus on these systems [112]; receptors for all of these

neurotransmitters appear to be dysregulated in the frontal cortex of psychosis subjects [113].

Many social and environmental contributing factors have been suggested, such as obstetric

complications [114], maternal malnutrition [115], hypoxia during neurodevelopment [116], viral

infection [117], identification as an ethnic minority and perception of disadvantage [118],

autoimmune reactions [119], and substance abuse [120]. The wide range of findings that support

different hypotheses combined with the spectrum of phenotypes observed in both diseases

suggest that the underlying causes of SZ and BD vary between individuals and likely involve

multiple neural pathways.

To date, traditional gene- and environment-based approaches have not been very productive in

deciphering the clinical, molecular and epidemiological aspects of psychosis, such as MZ twin

discordance (41-65% for SZ [121], ~60% for BD [122]), sexual dimorphism, parent-of-origin

effects, fluctuating disease course with periods of remission and relapse, and peaks of

susceptibility to the disease that correspond to periods of major hormonal changes in the

organism [90]. Classically, psychosis research was aimed at defining genetic and environmental

risk factors, but despite significant evidence of a heritable component derived from twin and

adoption studies [123, 124], many molecular genetics findings have not been replicated, and

significant heterogeneity and small effect sizes are thought to plague genetic association studies

[125]. A large study examined 789 SNPs within 14 top candidate genes in 1,870 SZ cases and

2,002 controls and found that all SNPs previously reported as associated with SZ were

consistent with chance expectation, and 4 other previously-identified SNPs were not

significantly associated with the disease [126].

More recent GWAS have also provided some disappointing results. A 2011 study of BD

included 1000 cases and 1034 controls, and utilized the Affymetrix SNP 6.0 platform to search

for genetic risk factors in each subset of the disorder. Only two SNPs reached significance – one

in the vicinity of the gene phosphodiesterase 10A (PDE10A) and another located between

contactin-4 precursor (BIG-2) and contactin 6 (CNTN6) [127]. Another study from Spain

examined the genomes of 476 SZ patients and 447 controls with the aim of studying only non-

synonymous SNPs to increase the probability of finding functional risk factors. One SNP

located at the metal ions transporter gene, SLC39A8, was found to be significant, although it is

rare in non-European populations [128]. These are just a few examples in which a purely genetic

approach has failed to explain a substantial portion of the heritable element of psychosis,

and it has been noted that all GWAS of SZ performed to date have found that the most

significant genetic risk factors do not have odds ratios (OR) greater than 1.15–1.20. A German

study managed to find a region on chromosome 11 (containing the candidate genes AMBRA1,

DGKZ, CHRM4 and MDK) that had an OR of 1.25 and was significantly associated with SZ in a

sample of 1169 cases and 3714 controls; however, when the sample was expanded to include an

additional 2569 cases and 4088 controls, the OR dropped to 1.11 [129]. On the topic of GWAS

of psychosis, one group of reviewers has recently stated that, “The validation of any genetic

signal is likely confounded by genetic and phenotypic heterogeneities which are influenced by

epistatic, epigenetic and gene-environment interactions.” They go on to highlight the

importance of integrating multiple platforms in order to better understand the biological basis of

these diseases [130].
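
To make the effect sizes quoted above more concrete, the following sketch computes an allelic odds ratio and an approximate 95% confidence interval from a 2x2 table of allele counts. It is an illustration only: the cited studies do not report the counts used here, the numbers are invented, the function name is hypothetical, and Woolf's log-based interval is assumed as one common choice among several.

```python
import math


def odds_ratio_with_ci(case_risk, case_other, control_risk, control_other, z=1.96):
    """Allelic odds ratio with an approximate confidence interval.

    Uses Woolf's method: the standard error of log(OR) is the square
    root of the summed reciprocals of the four cell counts.
    All four counts must be positive.
    """
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Invented allele counts: the same allele frequencies at two sample sizes.
    for scale in (1, 10):
        counts = (230 * scale, 770 * scale, 200 * scale, 800 * scale)
        estimate, lower, upper = odds_ratio_with_ci(*counts)
        print(f"scale x{scale}: OR={estimate:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

With identical allele frequencies, the point estimate is unchanged but the interval narrows as the sample grows, which is one reason why modest odds ratios from smaller discovery samples may shift when larger replication cohorts are added.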

Recently, the first epigenomic study of major psychosis utilizing CpG-island microarrays was

released by Mill et al (2008), providing a large-scale overview of DNA methylation differences

in the brain associated with SZ and BD. DNA extracted from the frontal cortex (n=35 each for

SZ, BD and control) was subjected to enrichment of the unmethylated fraction using

methylation-sensitive restriction enzymes, and adaptor ligation coupled with PCR amplification.

The amplicons (multiple copies of the unmethylated genomic DNA) were interrogated on

12,192-feature CpG-island microarrays. The data was normalized, assigned raw p values based

on a t statistic, and then converted to false discovery rates (FDR). Indeed, in cortex they

discovered differences at loci involved in glutamatergic and GABAergic neurotransmission,

brain development, mitochondrial function, stress response, and other disease-related functions,

many of which correspond to psychosis-related changes in steady-state mRNA. Network and

gene ontology (GO) analyses were performed in order to determine relationships between the

functionally linked pathways from the microarray dataset. The network analysis revealed a

lower degree of modularity of DNA methylation “nodes” in the major psychosis samples,

indicating that there is some degree of systemic epigenetic dysregulation involved in the

disorder. From the GO analysis, several categories were highlighted, including those involved in

epigenetic processes, transcription, and development, as well as brain development in female

BD and SZ samples, and in those related to stress response in male BD samples [32]. The data

presented here supports the idea that epigenetic mechanisms underlie the broader hypotheses of

major psychosis, and the study uncovers some new avenues for future exploration.
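
The conversion of raw p values to false discovery rates mentioned above can be sketched as follows. The text does not state which FDR procedure was applied in the study; the Benjamini-Hochberg step-up adjustment is assumed here purely for illustration, the function name is hypothetical, and the p values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Each adjusted value is p * m / rank, made monotone by taking the
    running minimum from the largest p value down to the smallest.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented p values standing in for per-probe t-test results.
    example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
    for p, q in zip(example_p, benjamini_hochberg(example_p)):
        print(f"p={p:.3f} -> FDR={q:.5f}")
```

A probe would then be reported as differentially methylated if its adjusted value falls below a chosen FDR threshold, rather than relying on the raw p value alone.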

A second epigenomic study of psychosis has since been performed by Dempster et al, using the

Illumina Infinium HumanMethylation27 BeadChip platform to compare methylation levels

between cotwins in a sample set comprised of 22 MZ twin pairs discordant for either SZ or BD.

The DNA source for the original experiment was whole blood, and the results were validated

using the Sequenom EpiTYPER platform, and then tested separately on 45 post-mortem brain

samples from cases and controls. Methylation levels differed between co-twins at numerous

loci, and there was significant heterogeneity between twin pairs, but this is understandable given

the clinical differences observed between cases of psychosis. The top differentially methylated

site across all MZ pairs was within the promoter of the gene encoding alpha-N-acetylgalactos-

aminide alpha-2,6-sialyltransferase 1 (ST6GALNAC1), which was unmethylated in affected

subjects; this gene is involved in protein glycosylation and cell–cell interactions, and it is

differentially regulated during neurodevelopment. A pathway analysis revealed an enrichment

of epigenetic changes in biological networks that were relevant to psychiatric disorders and

neurodevelopment, such as “nervous system development and function” in the SZ group, and

“developmental, genetic and neurological disorder” in the BD group. It was interesting to note

that CpG sites located within CpG islands were under-represented among the 100 top-ranked

psychosis-associated, differentially methylated sites [131]. In the past, it has been common for

methylation studies to focus on promoter regions, so it is not surprising that many of the top loci

discovered here have not been previously identified, and this underscores the need to investigate

the genome without bias when searching for epigenetic effects.

Both SZ and BD have also been examined using the candidate gene approach in an epigenetic

context, as epigenetic down-regulation of genes is emerging as a possible underlying

mechanism of the GABAergic neuronal dysfunction in SZ. One of the more intensively

investigated SZ-related genes is RELN, which is involved in neuronal development and cell

signalling, and has been found to be hypermethylated in cases of SZ [132]. However, no

differences were observed at this locus in a replication attempt [32, 133], and the focus seems to

be shifting to other candidate genes, namely the 67 kDa glutamate decarboxylase (GAD67, a.k.a.

GAD1) and DNMT1. GAD67 catalyzes the conversion of glutamic acid to GABA. In cases of

SZ, this enzyme and several others involved in GABAergic neurotransmission,

such as GAD65 and GABA plasma membrane transporter-1 (GAT-1), display decreased mRNA

levels, as determined by real-time quantitative polymerase chain reaction (qPCR) and in situ

hybridization [134-137]. In addition to aberrant methylation at this locus, an analysis of the

microarray collection of the National Brain Databank (USA) has shown that decreased GAD67

mRNA levels strongly correlated with upregulated HDAC1 in the prefrontal cortices of SZ

subjects [138]. Oddly enough, at the GAD67 promoter, SZ patients have been shown to display

an approximately 8-fold deficit in repressive chromatin-associated DNA methylation [137].

Currently, the general opinion on SZ seems to be that disturbances in the cortico-striato-pallido-

thalamic circuitry and in early brain maturation can result in a loss of cells and normal

connectivity in a wide variety of brain regions. This theory is consistent with the epigenetic

model of complex disease, and it is likely that genetic and epigenetic factors are disrupted at

many different loci, with each affected individual displaying a unique profile [139, 140].

Less information is available on BD, possibly because of the large degree of overlap between

BD-related genes and those associated with other mental disorders; genomic imprinting has

been suggested by statistical genetics, but molecular approaches have not yielded any imprinted

disease genes [141]. A recent study applied methylation-sensitive representational difference

analysis (MS-RDA) to lymphoblastoid cells derived from twins discordant for BD [11]. One

detected gene, peptidylprolyl isomerase E-like (PPIEL), was unmethylated in BD affected

twins, while a region of the spermine synthase (SMS) gene was hypermethylated versus

unaffected twins; it has yet to be determined if either of these regions is biologically and

functionally significant. An analysis by Kaminsky et al (2011) mapped DNA methylation

differences at the human leukocyte antigen (HLA) complex group 9 gene (HCG9) using post-

mortem brains, peripheral blood cells and germline from BD subjects and controls, and found

consistent epigenetic differences at this locus in all tissues studied. Two brain tissue cohorts

exhibited lower DNA methylation in BD patients versus controls at an extended HCG9 region,

and sperm DNA had a significant association with BD at one of the regions that displayed

epigenetic changes in brain and blood; thus, the HCG9 locus appears to have a causal

association with BD [142].

Copy number variants (CNV) – the occurrence of abnormal numbers of copies of a given gene

or region of DNA, including duplications and deletions that can range in size from 1 Kb to

several megabases [143] – have been implicated in many complex diseases, including major

psychosis [144]. It has been demonstrated that CNVs play a critical role in human evolution and

genetic diversity, and it is estimated that CNVs make up ~12% of the human genome [143],

with around 0.4% of genomic variation between unrelated individuals differing due to copy

number [145]. Diseases such as SZ show a large degree of phenotypic heterogeneity, and it is

becoming apparent that a small percentage of SZ patients carry a number of specific CNVs

[146]. It has been reported that the overall genome-wide CNV burden does not differ between

SZ and unaffected subjects [146], although a significant increase in singleton deletions has been

observed in SZ and BD subjects versus controls, and very large CNVs (> 500 Kb) have also

shown enrichment in SZ subjects [147]. The well-known 15q11.2-q13.1 duplication associated

with autism has also been associated with SZ [148], while several large CNVs have been found

to increase the risk for SZ and a number of other disorders, such as autism, attention-deficit

hyperactivity disorder, learning difficulties and epilepsy – these CNVs are not enriched in

subjects with non-psychiatric diseases [149]. As more is learned about the nature of psychiatric

diseases, it seems that rare variants contribute significantly to their etiology; while the presence

of certain alleles can obviously influence phenotype, expressivity can be quite variable, resulting

in a spectrum of outcomes. Future studies will have to take into consideration the highly

individual molecular basis of complex disease [150].

In combined studies of epigenetics and DNA sequence, some interesting developments have

been observed. It has been shown that rare G variants of a G/A polymorphism in the potassium

chloride co-transporter 3 gene (SLC12A6) may represent risk factors for BD [151]. Eventually,

it was discovered that variants containing the G allele were methylated at the adjacent cytosine,

and this accompanied a decrease in gene expression in human lymphocytes [152]. This hints at

a functional link between epigenetics and genetic variation, and the association with BD is

believable, as SLC12A6 mutations underlie another psychiatric disorder, Andermann syndrome,

which is an autosomal recessive motor-sensory neuropathy associated with developmental and

neurodegenerative defects [153]. Unfortunately, studies that consider both genetics and

epigenetics (even smaller, targeted ones) are incredibly rare, and no one has explored the

genetic-epigenetic interactions associated with psychosis, to date.

ASM: relevance to studies of complex disease

Allele-specific methylation refers to DNA methylation that is present on only one of the two

alleles that exist in a cell. ASM can arise from several causes, such as genomic imprinting, X

chromosome inactivation, stochastic methylation of a single allele, or as a direct result of DNA

sequence variants. Genomic imprinting, in which the inactive imprinted allele is significantly

more methylated than the actively expressed allele, has been intensely studied and a large

candidate list of imprinted genes is available [154], although most still require validation and are

believed to only represent a fraction of the total number. X inactivation, a large-scale case of

ASM, occurs as one copy of the two X chromosomes in a mammalian female is methylated,

thereby silenced, and packaged into heterochromatin [155]. ASM may also arise stochastically,

appearing in all or many cells of an organism if the methylation event occurs at a very early

stage of development [71, 156], or only in select tissues if the event is postnatal or the result of

some environmental factor, such as smoking or diet [157].

Recently, evidence for DNA sequence-influenced ASM has been building, and it has proven to

be an area of extreme interest. The first suggestion of this phenomenon appeared in 2002, when

Yan et al discovered allele-specific expression (ASE) occurring at a small subset of SNPs,

although the mechanism of action was unknown at the time [20]. Several years later, Kerkel et

al examined a collection of tissues, including WBC, brain, buccal cells, lung, kidney and

placenta, and made the first estimate of sequence-dependent ASM. Using methylation-sensitive

SNP analysis, they surveyed the genomes of 12 and 5 individuals at 50K and 250K resolution,

respectively, and determined that at least 0.16% of the informative SNP-tagged loci queried

showed ASM [19]. Several other studies followed, and a range of ASM estimates were

presented. A recent study by Schalkwyk et al, in which blood DNA from five pairs of MZ twins

was interrogated on Affymetrix SNP 6.0 microarrays, stated that 1.5% of their 183,605 SNPs

displayed ASM, and approximately 90% of the ASM was cis in nature. These results were

validated with bisulfite-mapping and gene-expression analyses, and then subsequently tested in

a second tissue from the same individuals and replicated in DNA obtained from 30 parent-child

trios [21]. In contrast, some very high estimates have also been proposed: 10% by Zhang et al

[158], 10% by Hellman and Chess [159], and a staggering 23-37% was suggested by Shoemaker

et al [160]. It should be noted that Hellman and Chess' finding was entirely based on in silico

simulations, and the percent provided by Shoemaker et al was an estimate based on the findings

in a few thousand isolated regions, only in pluripotent cell lines. Another study that used three

human embryonic stem cell (HESC) lines estimated that 14% of all CG sites will show ASM,

and they also identified 1,020 genes that show ASE, but again, these were cell lines and HESC

in particular have significantly higher non-CG methylation than differentiated cells [161].

Although it is not entirely understood how ASM might exert an effect, one somewhat obvious

explanation involves ASE, where the presence or absence of a particular allele is required for

expression of a given gene. The paper by Kerkel et al confirmed ASM at 16 SNP-tagged loci,

and then identified two cases of ASE at the vanin and CYP2A6-CYP2A7 gene clusters [19]. Ten

cases of SNP-methylation-expression three-way associations were detected by Zhang et al

[162]. Schalkwyk et al reported that 16.3% of the possible SNP-expression associations (a SNP

located within 5Kb of a gene expressed at detectable levels in blood) provided evidence for a

significant linear association between the allelic variant present at an ASM-SNP and mRNA

level, confirming that ASM effects often correlate with allelic expression differences, and that

these are likely to be cis in nature [21]. Other hypothetical mechanisms of ASM action, such as

interference with gene splicing, protein binding, micro RNA (miRNA) binding and RNA

structural alterations, are summarized in Figure 1.1. As some SNPs have been found to operate

in these ways [163-171], local methylation differences surrounding the SNP could potentially

act to increase any of these actions.

Figure 1.1. Hypothetical mechanisms of epiSNP action

The basic structure of a gene is presented, with letters representing potential locations of epiSNPs. A) intergenic

epiSNPs may interact with RNA genes or ncRNA themselves [164], or they could disrupt transcription factor

binding sites. B) promoter epiSNPs may interfere with the binding of numerous proteins, such as transcription

factors, RNA polymerases, activators, repressors and any other protein whose upstream binding can influence

transcription. C) epiSNPs in the 5’UTR may act as riboSNitches, which affect the shape of the mRNA transcript

[170]. D) introns have been shown to affect splicing [167], occasionally encode proteins or ncRNA [169, 172], and

they may also act as transposons [168], so an intronic epiSNP may alter any of these activities. E) exonic epiSNPs

may directly affect the proper transcription of exonic sequences [163]. F) epiSNPs in the 3’UTR may interfere with

the binding of many miRNAs [171], which may have consequences for polyA signal and stability of the transcript

[166].

Experimental design is a critical element to consider when investigating ASM, as DNA

methylation is tissue- [173], developmentally- [174], and temporally-specific [175]. Thus, in

addition to including a large number of samples, studies should ideally use comprehensive

genome-scanning tools, such as microarrays, and interrogate as many SNPs as possible without

bias toward any particular region of the genome. The aforementioned studies did not fully satisfy

these requirements: many relied upon very small sample sets or older arrays that interrogated

a small number of SNPs, and many focused only on regions that had been

previously identified in other studies. Recent evidence suggests that ASM

studies should not exclusively consider core promoters, CpG islands (CGIs) and imprinted

differentially methylated regions (iDMRs); a 2009 study found that most methylation alterations

in colon cancer are localized not in promoter regions or CGIs but in 'CpG island shores,' which

are sequences up to 2Kb away [176]. While ASM is believed to act predominantly in cis, trans

effects (where a SNP is correlated with methylation at a site several megabases away, or even

on a different chromosome altogether) have been reported in a few instances [21, 162].

As previously mentioned, the SNPs detected by GWAS do not account for all of the

heritability associated with complex diseases, especially psychiatric ones. Integration of the

epigenetic aspect, to form Epigenome-Wide Association Studies (EWAS), is a promising

method to pinpoint the truly causative variants that differ only in methylation status from those

that are non-causative. These SNPs displaying ASM, or "epi-alleles," may also act as risk

factors or predictors of disease type, treatment outcome or disease course, and would be of great

value to pre-screening applications and general diagnostics. With the International Human

Epigenome Consortium (IHEC) underway [177], which attempts to map 1000 reference

epigenomes for various human tissues and cell types, our ability to conduct EWAS will be

significantly improved in the near future, and as array technology continues to advance, large-

scale EWAS for complex disease may become an attractive option.

Currently, ASM has only been studied in a small number of complex diseases, mainly various

cancers, but the findings are quite intriguing. Milani et al discovered that 16% of the genes they

analyzed displayed ASE in multiple acute lymphoblastic leukemia (ALL) cell samples, with the

level of ASE varying largely between the samples. Of these genes exhibiting ASE, 55%

displayed what the authors called “bidirectional” ASE, in which either of the two SNP alleles

could become the one that was overexpressed. ASE and ASM are not the same process,

although they can be related, but in this particular experiment, the bidirectional ASE correlated

with methylation level at the site [22]. A more direct finding was made by Hawkins et al, who

found that the T allele at SNP rs16906252 is a key determinant in the onset of O(6)-

methylguanine DNA methyltransferase (MGMT) methylation in colorectal cancer; MGMT is a

DNA repair protein that removes alkyl adducts from the O6 position of guanine, and its methylation is often detected in

sporadic colorectal cancer [23]. It has been suggested that screening for methylation of the T

allele at this SNP in the peripheral blood of unaffected individuals could identify those

predisposed to colorectal cancer, lung cancer, lymphoma, and glioblastoma [24]. In lung

adenocarcinomas and sputum samples from smokers, another study found that the A allele

of an MGMT promoter-enhancer SNP is a key determinant for MGMT methylation in lung

carcinogenesis, as this allele was selectively methylated in primary lung tumors and cell lines

heterozygous at that SNP [178].

Outside of cancer, very few complex diseases have been studied in the context of ASM. The

vitamin D receptor (VDR) gene encodes a transcription factor that modulates several processes,

such as calcium homeostasis and immune function. Large differences in allele frequency

between populations have been observed at the VDR, and it has previously been associated with

susceptibility to tuberculosis and autoimmunity. In tuberculosis cases and controls, as well as

lymphoblastoid cell lines from two ethnically distinct populations (Yoruba and Caucasian), it

was found that there were methylation-variable positions in the 3' end of VDR that significantly

distinguish ethnicity and tuberculosis status. It was also shown that methylation status

demonstrated a complex association with a VDR SNP known as TaqI (rs731236), with several

local CpG sites showing disease- and ethnicity-specific methylation; thus, it is recommended

that epigenetic and genetic factors should be investigated together in the case of VDR-associated

disease [179]. In a study of obesity, the melanin-concentrating hormone receptor 1 (MCHR1),

which regulates energy balance, food intake, physical activity and body weight in humans and

rodents, was found to have ASM at two SNPs in its first exon; this ASM was age-dependent, BMI-

associated, and also affected transcription [180]. Recently, DNA methylation was examined

in 60 females stratified by type 2 diabetes (T2D) susceptibility haplotype, using previously

identified association loci. Increased DNA methylation was observed on the alpha-

ketoglutarate-dependent dioxygenase (FTO) obesity susceptibility haplotype, and it was then

determined that the methylation difference was due to the co-ordinated phase of CpG-creating

SNPs across the risk haplotype. Essentially, the authors had found a 7.7Kb example of haplotype-

specific methylation that can act as a long-range enhancer, supported by the histone H3K4me1

enhancer signature [181]. One message to take away from this last study is that genetic and

epigenetic mechanisms can be intertwined, and that diseases may be caused by their combined

actions, in ways that we have not yet envisioned. The vast majority of complex diseases have

not been examined from an ASM-perspective, especially one that utilizes EWAS, mainly

because the technology for such an endeavor simply was not available in the past.

Emergence of epigenetic treatments

Epigenetic drug strategies are currently employed to treat a collection of cancer subtypes, and

these medications are now being considered in the treatment of psychiatric disease as well. The

DNMT inhibitor doxorubicin has been used to increase reelin and GAD67 expression in

neuronal precursor cells, and it was shown that reelin gene expression correlated with the

dissociation of DNMT1 and MeCP2 from its promoter, as well as an increased level of histone

H3 acetylation [182]. Other studies have shown that HDAC inhibition enhances learning and

memory following neurodegeneration induced by traumatic brain injury [183], and also shows

some therapeutic efficacy in rodent models of neurodegenerative conditions, such as

Huntington’s disease [184], multiple sclerosis [185], and Parkinson’s disease [186]. One of the

downstream effects of HDAC inhibition is upregulation of p21 [187], a cyclin-dependent kinase

inhibitor that appears to play an important protective role against oxidative stress and DNA

damage [188]. Valproate, a compound utilized for its anticonvulsant and mood stabilizing

properties, also exhibits HDAC-inhibitory activity and has been successfully implemented as a

treatment for epilepsy [189], BD [190] and, less commonly, SZ [191]. As with valproate, several

other drugs have been found to have previously unknown epigenetic modifying properties, and the

list continues to grow. While such medications are promising, their pleiotropy, transient effects,

and non-specific alterations to the entire epigenome limit their use for the time being.

The studies presented below include a detailed analysis of sequence-dependent and sequence-

independent DNA methylation in MZ and DZ twins, plus an unbiased, large-scale evaluation of

ASM in psychosis cases and controls that avoids the shortcomings of previous studies and

utilizes the Affymetrix SNP 6.0 platform in a novel manner. Our findings highlight the variable

influence of genetics on epigenetics, as well as the importance of genetic-epigenetic interactions

in both normal and pathological phenotypes. We stress that an epigenetic element must be added

to genetic studies in order to fully understand the molecular functions of the genome, and that

epigenetic drug therapies are promising options for the treatment of complex diseases.

Chapter 2

Materials and Methods

Contributions: DNA dependent and independent DNA methylation in twins

I was responsible for the animal experiment, including DNA extractions, restriction enzyme

digestions, adaptor ligations, PCR amplifications, microarray preparation, hybridization and

scanning, plus some basic analysis. I also performed the bulk of the pyrosequencing and

cloning validation experiments for the human microarray data, a portion of the bisulfite

modification, and I was involved in the writing and editing of the manuscript.

All human microarray laboratory experiments, experimental design, the majority of the analysis

and writing of the paper were performed by Zach Kaminsky. Carl Virtanen created the

karyograms. The Gene Ontology analysis script in Bioconductor was provided by Thomas Tang.

Consultation with bioinformaticians Thomas Tang, Sun-Chong Wang, and Allan McRae helped

to direct the analyses performed. Animal sacrifice was performed by Laura Feldcamp under the

direction of Albert Wong. A portion of the bisulfite modification, pyrosequencing and cloning

was performed by Zach Kaminsky, Gabriel Oh, and Sigrid Ziegler.

Twin sample

We investigated three cohorts of twins representing various tissues. WBC of 19 dichorionic

(DC) MZ and 20 DZ twin pairs matched for age, sex and WBC count plus buccal epithelial cells

from 10 monochorionic (MC) MZ, 10 DC MZ, and 20 DZ age- and sex-matched twin pairs

were obtained from the Brisbane Adolescent Twin Study [192]. WBCs and buccal cells were

obtained from the same individual for 10 DC MZ and 10 DZ pairs. WBC samples were from

twins 13.2 ± 1 y old (mean ± s.d.) and consisted of 20 females and 18 males. MC and DC buccal

epithelial cells both consisted of 10 males (aged 14 ± 0.77 y) and 10 females (all 14 y old); all

were of European ancestry (mainly northern European ancestry). MZ and DZ twins in the WBC

group were selected from several thousand sets of twins of the Australian Twin Registry using

hematology report data. The percentage difference between cell subfraction counts for the whole

WBC count, neutrophil and lymphocyte counts did not exceed 10%. The mean percentage

difference in selected DZ twins was smaller than that of MZ twins to bias against the alternative

hypothesis of more epigenetic variation in the DZ twin group. We determined zygosity by

comparisons of nine microsatellite markers, which gave a probability of incorrect assignment of

a DZ as an MZ of less than 0.0001. Gut biopsies from 18 pairs of MZ twins were obtained from

a Swedish twin population with inflammatory bowel disease described previously [193].

Although all twin pairs had at least one twin affected with inflammatory bowel disease, we

investigated biopsies from rectal mucosa, which were macroscopically not inflamed in any of

the twins investigated. Written informed consent was obtained from all participants, and studies

were approved by the local institutional review boards at participating institutions. The

workflow for the twin study is presented in Figure 2.1.

Figure 2.1. Twin study workflow

Human WBC, buccal epithelium cells and gut biopsies, plus whole mouse brain samples were processed

identically. Several downstream analyses were conducted, which differed between cohorts.

DNA methylation profiling

The unmethylated fraction of genomic DNA was enriched using the methylation-sensitive

restriction enzyme (MSRE) HpaII [194] and interrogated on Human 12K CpG island

microarrays [195]. Enrichment of the unmethylated genome of MZ and DZ twin pairs and

hybridization to the microarrays was carried out in a randomized fashion. We performed two technical

replicates for each enrichment and hybridization, after which we averaged the log ratios across

replicates to produce one value per individual per locus. All samples were hybridized

against a common reference (reference 1) with the exception of 9 MZ and 10 DZ pairs in WBC,

which were originally hybridized against a different common reference (reference 2) and later

transformed to match reference pattern 1. Transformation was achieved by first obtaining a spot-

wise log ratio of reference 2 relative to reference 1 through a comparison of two dye-swapped

reference 1 versus reference 2 hybridizations. Log ratios from the 9 MZ and 10 DZ pairs

originally hybridized with reference 2 were multiplied by the log ratio values of reference 1

versus reference 2 to obtain log ratio values relative to reference 1. This transformation was

followed by between-array normalization using the Limma package in Bioconductor. We

created the reference pools by addition of equal quantities of the enriched unmethylated WBC

DNA fraction from 10 MZ and 10 DZ pairs.

Animal studies

We extracted genomic DNA using standard phenol and chloroform methods from whole-brain

tissue of four strains of mice: C57BL/6 and FVB inbred strains and CF-1 and CD-1 outbred

strains, all obtained from Charles River Laboratories International. Three litters consisting of

three male mice per litter were kept in uniform environments and killed at postnatal day 43. We

enriched the unmethylated fraction of genomic DNA and created the common reference pool in

an identical manner to the human reference design studies. The microarrays used were mouse

4.6K CpG island microarrays, all produced during a single printing at the microarray facility of

the University Health Network, Toronto. Hybridizations were carried out in batches of 18

microarrays consisting of one amplification set from one inbred and one outbred strain per day

for a total of four hybridization days. We determined selection and order of hybridization at

random through sorting on a random number generator.
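
The randomization of hybridization order described above can be sketched as follows. This is a minimal illustration only: the sample labels and the seed are hypothetical, and the actual random number generator used is not specified in the text.

```python
import random


def randomized_order(samples, seed=None):
    """Order samples by sorting on independently drawn random keys."""
    rng = random.Random(seed)
    keyed = [(rng.random(), sample) for sample in samples]
    keyed.sort(key=lambda pair: pair[0])  # sort on the random key only
    return [sample for _, sample in keyed]


# Hypothetical labels: 3 litters x 3 mice for one inbred and one outbred strain.
samples = [
    f"{strain}_litter{litter}_mouse{mouse}"
    for strain in ("inbred", "outbred")
    for litter in (1, 2, 3)
    for mouse in (1, 2, 3)
]
order = randomized_order(samples, seed=7)
```

Fixing the seed makes the assignment reproducible, which can be convenient for record keeping.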

Data analysis

All microarrays were scanned on the Axon 4000A scanner and cross-referenced to annotated

GAL files using Genepix 6.0 Software. Microarray GAL annotation was made available from

the manufacturer and downloaded at www.microarrays.ca. We carried out normalization

procedures in Bioconductor using the Limma package. All arrays underwent log ratio-based

normalization, background correction, print tip loess normalization and scale normalization

between blocks. We removed low quality flagged loci identified by Genepix. Microarray data

were trimmed on the basis of the annotation information such that spot IDs containing

mitochondrial DNA, translocation hot spots and repetitive elements, and those located on the X

and Y chromosomes were removed. After trimming and removal of flagged loci, 6,405 (WBC),

5,918 (buccal cells), and 5,941 (gut biopsies) unique DNA sequences in humans and 2,176 DNA

sequences in mice were used for subsequent statistical analyses.
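
The trimming step can be sketched as a simple annotation filter. This is a minimal illustration, not the pipeline that was actually run: the record fields (`spot_id`, `chrom`, `category`, `flagged`) and the category labels are hypothetical stand-ins for the contents of the GAL annotation and the Genepix output.

```python
# Hypothetical annotation labels for excluded probe classes (assumptions).
EXCLUDED_CATEGORIES = {"mitochondrial", "translocation_hotspot", "repetitive"}
EXCLUDED_CHROMOSOMES = {"X", "Y"}


def trim_spots(spots):
    """Return the spots that pass the quality-flag and annotation filters.

    Each spot is a dict with the assumed keys 'spot_id', 'chrom',
    'category' and 'flagged' (True when the spot was marked low quality).
    """
    kept = []
    for spot in spots:
        if spot["flagged"]:
            continue  # low-quality flagged locus
        if spot["chrom"] in EXCLUDED_CHROMOSOMES:
            continue  # X and Y chromosomes are removed
        if spot["category"] in EXCLUDED_CATEGORIES:
            continue  # mitochondrial DNA, hot spots, repetitive elements
        kept.append(spot)
    return kept


example_spots = [
    {"spot_id": "s1", "chrom": "1", "category": "unique", "flagged": False},
    {"spot_id": "s2", "chrom": "X", "category": "unique", "flagged": False},
    {"spot_id": "s3", "chrom": "7", "category": "repetitive", "flagged": False},
    {"spot_id": "s4", "chrom": "12", "category": "unique", "flagged": True},
]
kept_ids = [spot["spot_id"] for spot in trim_spots(example_spots)]
```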

All statistical tests were done in R (http://www.r-project.org/). Using an Anderson-Darling test

from the nortest package, we found that all distributions derived from microarray data rejected

the null hypothesis of normality, and we subsequently evaluated them with non-parametric tests.

All statistical tests were two-tailed, and P < 0.05 was considered significant. Unless

otherwise specified, ± denotes the standard error of the mean.

Test for association of epigenetic difference with cellular heterogeneity

WBC counts were available for all twin blood samples, allowing us to investigate any

association between within-pair differences in cell count and within-pair differences in DNA

methylation at each locus. A spot-wise correlation between the difference in log fold change value

per twin pair and the log2 of the ratio of the cell count per twin pair was calculated with the

Spearman method and subjected to correction for multiple testing using the qvalue package

[196]. Three separate comparisons were performed, using the measures that account for the highest

proportion of cells: the whole white blood cell count, the total neutrophil count, and the

total lymphocyte count.
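
For a single locus, the correlation described above can be sketched as follows. The function names and the toy values are hypothetical, and the q-value correction applied across loci is not reproduced here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def locus_cell_count_correlation(methylation_diff, count_twin1, count_twin2):
    """Correlate per-pair methylation differences with log2 cell-count ratios."""
    log2_ratio = [math.log2(a / b) for a, b in zip(count_twin1, count_twin2)]
    return spearman(methylation_diff, log2_ratio)


# Toy data for four twin pairs at one locus (illustrative values only).
rho = locus_cell_count_correlation(
    [0.1, -0.3, 0.5, 0.0], [6.0, 5.0, 8.0, 7.0], [5.5, 6.0, 6.0, 7.0]
)
```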

Biological and technical variation

Levels of biological variation and technical variation for individual twin sets produced by twin

versus co-twin methylation profile comparisons and self versus self methylation profile

comparisons, respectively, were measured according to the variance (σ²) over all ~6,000 loci.

Non-parametric comparisons between matched biological and technical variation for all sets

were carried out by the Ansari-Bradley test. Differences between the degrees of biological and

technical variation in 4 MZ twin sets were evaluated with the Kruskal-Wallis test. Technical

variation produced by MspI-based DNA enrichment was tested by 4 self versus self

hybridizations and compared to HpaII technical variation levels by the Ansari-Bradley test. For

the common reference design data, we addressed the null hypothesis that the difference between

co-twins was not significantly larger than that between replicate hybridizations. For each tissue,

the median absolute value of the fold change difference between the two technical replicate

enrichments/hybridizations performed per individual was determined and compared to that

between co-twin hybridizations with a paired Wilcoxon Signed Rank test for MZ twins.

For animal data, assessment of technical variation was performed in the following way. For all

mice, a spot-wise correlation between replicate hybridizations was produced at 2,176 unique

genomic regions. To ensure that biological variation was detectably higher than technical

variation, a Monte Carlo procedure was performed to test the null hypothesis that the spot-wise

correlation between technical replicates would not be higher than that produced from the random

pairing of biological replicates from different mice. A simulated distribution was created by

randomly shuffling the replicates and re-calculating a spot-wise correlation distribution for

10,000 permutations. For each permutation, the original distribution of technical replicate

correlations was compared to each randomly created distribution with a paired Wilcoxon Signed

Rank test. The proportion of times the correlation distribution of original technical replicates

was higher than the randomly sorted distribution was tabulated and divided by the total number

of permutations to obtain the quantile and relative P value.
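
The permutation procedure can be sketched as follows. This is a simplified illustration under stated assumptions: each permutation is compared by the median of the spot-wise correlations rather than by the paired Wilcoxon Signed Rank test described above, the data are simulated, and all names are hypothetical.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spotwise_correlations(rep1, rep2):
    """rep1, rep2: one list of per-spot log ratios per mouse."""
    n_spots = len(rep1[0])
    return [
        pearson([mouse[s] for mouse in rep1], [mouse[s] for mouse in rep2])
        for s in range(n_spots)
    ]


def permutation_proportion(rep1, rep2, n_perm=1000, seed=0):
    """Proportion of shuffles in which the true pairing correlates better."""
    rng = random.Random(seed)
    observed = statistics.median(spotwise_correlations(rep1, rep2))
    higher = 0
    for _ in range(n_perm):
        shuffled = rep2[:]
        rng.shuffle(shuffled)  # pair each mouse with a random mouse's replicate
        if observed > statistics.median(spotwise_correlations(rep1, shuffled)):
            higher += 1
    return higher / n_perm


# Simulated example: 8 mice, 20 spots, replicates differ by small noise.
sim = random.Random(42)
rep1 = [[sim.gauss(0, 1) for _ in range(20)] for _ in range(8)]
rep2 = [[value + sim.gauss(0, 0.1) for value in mouse] for mouse in rep1]
proportion = permutation_proportion(rep1, rep2, n_perm=200, seed=1)
```

A proportion close to 1 indicates that replicates of the same animal agree better than randomly paired animals.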

Spot-wise epigenetic variation

We calculated a spot-wise intraclass correlation coefficient (ICC) according to the one-way

consistency model using the irr package, designating co-twin pairs as a class. The ICC formula

is ICC = (MSb − MSw)/(MSb + MSw). Here MSb stands for the between-pair mean square and

MSw represents the within-pair mean square of the specified class. As the ICC approaches 1,

the co-twins are more similar to each other than unrelated twin pairs are to each other, whereas

as it approaches –1, the within–co-twin difference across the group is consistently larger in

comparison to unrelated twin pairs. Each unique DNA region investigated by the microarray

was treated as an independent measurement. To address the null hypothesis that there are no

differences in the amount of DNA methylation variability between MZ and DZ twins, we

evaluated the distributions of unique locus ICC between MZ and DZ twins in WBCs with a

paired Wilcoxon signed rank test. For buccal epithelial cells, the same hypothesis for MC and

DC twins was evaluated in a similar manner. For inbred and outbred mice, separately, a spot-

wise distribution of within sibship epigenetic variation was created by taking the average of the

variance produced by the three mice per sibship. To address the null hypothesis that there are no

differences in the degrees of epigenetic variation between inbred and outbred mice, we

compared these spot-wise distributions with a paired Wilcoxon signed rank test.
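
As a worked illustration of the ICC formula, the following sketch computes the coefficient at a single locus from a one-way analysis of variance with two observations per class. It is not the irr implementation; the function name and inputs are hypothetical, and it assumes the two mean squares are not both zero.

```python
def pairwise_icc(pairs):
    """One-way ICC for classes of size two: (MSb - MSw) / (MSb + MSw).

    pairs: list of (twin1, twin2) log ratio values at a single locus.
    """
    n = len(pairs)
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    pair_means = [(a + b) / 2 for a, b in pairs]
    # Between-pair mean square: two observations per pair, n - 1 degrees of freedom.
    ms_between = 2 * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)
    # Within-pair mean square: one degree of freedom per pair.
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, pair_means)
    ) / n
    return (ms_between - ms_within) / (ms_between + ms_within)


identical_twins = pairwise_icc([(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])
opposed_twins = pairwise_icc([(1.0, -1.0), (-2.0, 2.0), (3.0, -3.0)])
```

The two toy inputs mark the extremes of the scale: co-twins with identical values give an ICC of 1, and co-twins that differ while every pair shares the same mean give an ICC of -1.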

Cross tissue comparison

Ten WBC samples were obtained from the same individuals as the 10 DC MZ twins used in the

buccal cell analysis. Separate spot-wise ICC distributions were calculated for these 10 DC MZ

twins in the WBC sample and from the remaining 9 unrelated DC MZ twin WBC samples. Each

distribution was compared to the buccal cell derived ICC distribution at 5919 loci overlapping

between datasets by linear regression.
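
The cross-tissue comparison amounts to an ordinary least-squares fit of one spot-wise ICC distribution on another. A minimal sketch, with hypothetical names and toy values, is:

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x: slope, intercept, R squared."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Toy values standing in for per-locus ICCs in two tissues.
slope, intercept, r_squared = linear_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```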

Investigation of genomic element class

The list of microarray probes residing within CpG islands was obtained from the annotation

data (www.microarrays.ca). A list of probes residing within 1 Kb of gene promoters was created

by cross referencing the chromosomal coordinates of each microarray probe with the genome

locations of transcription start sites located within the Transcription Start Site database

(http://dbtss.hgc.jp/) using an in-house Perl script. For each tissue cohort, the spot-wise ICC

distribution of probes residing within CpG islands was compared to that of non-CpG island probes with

a Wilcoxon Rank Sum test: N_CGI = 2,542, N_non-CGI = 3,863 in WBC; N_CGI = 2,343, N_non-CGI =

3,575 in the buccal cells; N_CGI = 2,352, N_non-CGI = 3,590 in the gut. The same analysis was

performed for promoter-associated loci: N_Promoter = 1,341, N_non-Promoter = 5,064 in WBC; N_Promoter

= 1,248, N_non-Promoter = 4,670 in the buccal cells; and N_Promoter = 1,253, N_non-Promoter = 4,688 in the

gut. P values were corrected for multiple testing using the Bonferroni method.
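
The Bonferroni correction multiplies each P value by the number of tests and caps the result at 1. A minimal sketch follows; the function name and P values are illustrative, and the number of tests actually corrected for is not stated in the text.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted P values: p * m, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


adjusted = bonferroni([0.01, 0.2, 0.5])
```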

Gene ontology analysis

Over-representation of gene ontology categories within the top and bottom 5th percentile of

unique promoter loci was tested using the GOhyperG [197] function of the GOstats package in

Bioconductor for WBC, buccal, and gut. The top and bottom 5th percentile of unique CGI-associated

loci was interrogated in an identical manner for each tissue. GOhyperG does not

correct for multiple testing. Mappings were based on data provided by: Gene Ontology

(ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest) on 2007/08.
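
Over-representation tests of this kind are commonly based on the hypergeometric distribution. The sketch below computes the upper-tail probability for a single category. It illustrates the general technique and is not the GOstats code; the argument names and counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, total_loci, annotated_loci, selected_loci):
    """P(X >= k) for X loci carrying a category among the selected loci.

    total_loci: all loci tested; annotated_loci: loci carrying the category;
    selected_loci: loci in the percentile of interest; k >= 0 observed hits.
    """
    upper = min(annotated_loci, selected_loci)
    numerator = sum(
        comb(annotated_loci, i)
        * comb(total_loci - annotated_loci, selected_loci - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(total_loci, selected_loci)


# Toy example: 10 loci, 5 annotated, 5 selected, all 5 selected are annotated.
p_all_hits = hypergeom_upper_tail(5, 10, 5, 5)
p_any = hypergeom_upper_tail(0, 10, 5, 5)
```

Because no multiple-testing correction is applied, P values from many categories should be interpreted with that in mind.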

Validation of the microarray findings

We validated the microarray findings using sodium bisulfite modification as done previously in

our laboratory [32]. Sodium bisulfite modification was followed by interrogation of specific

CpG sites by pyrosequencing [198] or direct cloning and sequencing. PCR amplicon,

pyrosequencing, and sequencing primers are provided in Table 2.1. PCR conditions included

0.5 μM primers, 10 μl of Qiagen HotStar Taq Master Mix, and double-distilled H2O to a final

reaction volume of 20 μl. Cycling conditions were as follows: 95°C for 15 min; 40 cycles of 95°C for

30 sec, 50°C for 45 sec, and 72°C for 30 sec; 72°C for 5 min; cool to 4°C. PCR amplicons were

pyrosequenced at EpigenDX Inc (http://www.epigendx.com). A representative CpG dense probe

residing within the 3’ end of the Complement C1q tumor necrosis factor-related protein 8

precursor (C1QTNF8) gene containing 18 CpG positions in a 367 bp fragment was selected for

in depth analysis by cloning and sequencing in WBC DNA from 18 twin pairs. On average, 1 μl

of PCR amplicon was ligated into 50 ng of pGEM-T Easy plasmid vector (Promega) with 5 μl of

2X Rapid Ligation buffer and 3 Weiss units of T4 DNA ligase in a 10 μl reaction volume, and

incubated overnight at 4°C. 2 μl of ligation product was transformed into 50 μl JM109 high

efficiency competent cells and plated on LB agar plates containing 0.1 mg/ml ampicillin, 50 µM

isopropyl β-D-1-thiogalactopyranoside, and 80 µg/ml X-gal for white colony selection. For each

individual, 36 clones were grown overnight in 1 ml lysogeny broth medium, pelleted and

sequenced at Functional Biosciences (http://www.functionalbio.com), after which the ratio of C

to T was calculated at each CpG position per individual. The methylation difference at a CpG at

position 9, located within a HpaII restriction site, as well as the mean methylation difference

between co-twins was compared to the microarray log ratio differences by linear regression.
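
The per-position calculation from cloned bisulfite sequences can be sketched as follows. After bisulfite conversion, a C at a CpG cytosine indicates methylation and a T indicates no methylation. The sketch reports the fraction C/(C + T), which is an assumption about how the C to T ratio was expressed; the toy sequences and names are hypothetical.

```python
def methylation_per_cpg(clones, cpg_positions):
    """Fraction of clones retaining C at each CpG cytosine.

    clones: aligned bisulfite-converted clone sequences of equal length.
    cpg_positions: 0-based indices of the cytosine of each CpG.
    """
    fractions = []
    for pos in cpg_positions:
        c_count = sum(1 for seq in clones if seq[pos] == "C")
        t_count = sum(1 for seq in clones if seq[pos] == "T")
        informative = c_count + t_count
        # Positions with no C or T calls carry no information.
        fractions.append(c_count / informative if informative else float("nan"))
    return fractions


# Four toy clones with CpG cytosines at indices 1 and 4.
toy_clones = ["ACGTCG", "ATGTCG", "ACGTTG", "ATGTCG"]
toy_methylation = methylation_per_cpg(toy_clones, [1, 4])
```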

| Pyrosequenced Loci | # Pairs | Direction | PCR Primers | Pyrosequencing Primer |
|---|---|---|---|---|
| UHNhscpg0004390 | 15 | F-B | 5'-ACACACTATTTGTTGTAATTTTTTTTAGTTTTTT-3' | 5'-AAACCCAACAACACA-3' |
| | | R | 5'-CTACTCATCAATAAAAAAACC-3' | |
| UHNhscpg0008483 | 10 | F-B | 5'-GATTATGTTTTATTATTGGGGGTA-3' | 5'-CAACTAAAACAAAAAAAACATCCC-3' |
| | | R | 5'-CAACTAAAACAAAAAAAACATCCC-3' | |
| UHNhscpg0004556 | 19 | F | 5'-GGTTGGTAGTTTAAGTTTGAGTTAG-3' | 5'-GGTTGGTAGTTTAAGTTTGAGTTAG-3' |
| | | R-B | 5'-CAACTATACCATCTTTCACTATTTTAAC-3' | |
| UHNhscpg0000193 | 18 | F-B | 5'-GGGAGGTGTTYGAGAGGATT-3' | 5'-TCTACCCCCTTTTCCATCTAAA-3' |
| | | R | 5'-TCTACCCCCTTTTCCATCTAAA-3' | |
| UHNhscpg0004262 | 11 | F-B | 5'-TAGGAATTAAAAGGATGTTGAAGAT-3' | 5'-AAAACTATACCCTATCCCCTAAA-3' |
| | | R | 5'-AAAACTATACCCTATCCCCTAAAAC-3' | |

| Sequenced Loci | # Pairs | Direction | PCR Amplicon Primers | Sequencing Primer |
|---|---|---|---|---|
| C1QTNF8 | 18 | F | 5'-GTTTGGAATGTTATAGGGATGTTTT-3' | M13 Reverse |
| | | R | 5'-AACCTCAAACAACAAAACCTACATCC-3' | |

Table 2.1. Sodium bisulfite treated loci and primers

Column 1: microarray probe IDs for loci subjected to sodium bisulfite modification. Column 2: the number of twin

pairs interrogated per locus. Column 3: primer orientation. “F” and “R” denote the forward and reverse primer

sequence. “B” denotes the addition of a biotin modification for downstream pyrosequencing applications. Column

4: Primer sequences for amplifying the respective regions from post sodium bisulfite modified DNA for

pyrosequencing and cloning and sequencing strategies. Column 5: pyrosequencing and sequencing primers are

provided in the far right.

In silico SNP analysis

SNP and allele frequencies were initially obtained from the October 2005 release of dbSNP

database (http://www.ncbi.nlm.nih.gov/projects/SNP/) and updated with information from the

March 2007 release #22 of the HapMap (http://www.hapmap.org/) database. For each locus, a

heterozygosity quotient (HQ) was calculated for two scenarios. The first was for only those

SNPs residing within HpaII positions and the second was for all SNPs residing within the probe

sequence and 1Kb upstream and downstream. An HQ was calculated by summing the quantity

of 1 minus the sum of the squared allele frequencies for all SNPs located within the interrogated

region. The relationship between HQ value and ICCMZ-ICCDZ difference was evaluated through

linear regression.
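
Following the verbal definition above, the HQ of a region is the sum, over its SNPs, of one minus the sum of squared allele frequencies. A minimal sketch with illustrative allele frequencies (the function name is hypothetical):

```python
def heterozygosity_quotient(snp_allele_freqs):
    """Sum over SNPs of (1 - sum of squared allele frequencies).

    snp_allele_freqs: one tuple of allele frequencies per SNP in the region.
    """
    return sum(1.0 - sum(p * p for p in freqs) for freqs in snp_allele_freqs)


# Two SNPs in a region: one with equal allele frequencies, one with a rare allele.
hq = heterozygosity_quotient([(0.5, 0.5), (0.9, 0.1)])
```

Here the first SNP contributes 0.5 and the second 0.18, so common variants dominate the quotient.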

Contributions: ASM and its putative role in complex disease

I was responsible for experimental design and most wetlab activities involved in this project,

including development, optimization and initial testing of our epiSNP detection technique, all

DNA extractions, quantifications, MSRE digestions, adaptor ligations, PCR amplifications,

fragmentation and labeling for the enriched brain and sperm samples, and all of these tasks plus

hybridization, fluidics and array scanning for all sperm samples. The brain genotyping arrays

were run as a service at the Toronto Centre for Applied Genomics (TCAG). For the deep

sequencing experiment using the 454 platform, I selected the target loci, designed primers, and

performed all bisulfite modification, amplifications, gel extractions, purifications and pooling of

the 800 samples. The 454 sequencing was also run as a service at TCAG. For the bisulfite

verification of epiSNPs and non-epiSNPs, I selected the loci, designed all PCR and

pyrosequencing assays, bisulfite-modified the DNA, and then prepared the amplicons, ran the

pyrosequencing reaction and performed a portion of the data analysis. Analysis of results was

performed by Denise Mak and Paul Boutros, with additional analyses by Natalie Freeman,

Michal Grzadkowski and Ying Wu. I provided biological consultation to direct the analyses.

Sample preparation

Frozen prefrontal cortex (Brodmann area 10) tissues from post-mortem control subjects (n=76),

BD (n=67) and SZ (n=65) patients were obtained from the Stanley Medical Research Institute

and the Harvard Brain Tissue Resource Center. All demographic data provided by the brain

banks is summarized in Appendix 2, Tables A2.1 and A2.2. For the Stanley samples,

pathologists compiled reports on potential donors that included family interviews and medical

records, and these reports were reviewed independently by two senior psychiatrists who made the

diagnoses [199]. The Harvard samples were collected via community donations initiated by the

families of the donors, and classifications were also made through the use of family interviews

and medical records. Every case received a complete neuropathological examination that

included detailed gross and microscopic analysis [200].

Sperm samples from BD (n=24) and control samples (n=24) were collected at the Centre for

Addiction and Mental Health (CAMH, Toronto), and demographic data is presented in

Appendix 2, Table A2.3. The CAMH Research Ethics Board approved the use of all brain and

sperm samples in this study. Germ cells were isolated from the semen using two-layer

discontinuous gradient separation. At 37°C, the two-layer gradient was formed using 2mL

ISolate (lower layer) and 2mL Modified HTF Medium (upper layer) in a 15mL tube. Semen

(2mL) was gently added onto the upper layer, and the tube was centrifuged for 20 min at 300 x

g. The top layer was removed by aspiration until only 0.5 mL of lower layer remained. Sperm

Washing Medium (3mL) was added, the tube was centrifuged for 10 min at 300 x g, and then all

but the lower 0.5 mL was removed with a pipette. This washing step was repeated once, then the

supernatant was removed and 0.5mL Sperm Washing Medium was added to the pellet, which

was then stored at -80°C in a cryo tube. The cells were re-pelleted and the storage solution was

removed prior to DNA extraction.

Genomic DNA was extracted using phenol and chloroform. The unmethylated fraction of the

genome was enriched for each sample in the following manner: 500ng of genomic DNA was

separately digested with three MSREs, HpaII, HinP1I, and HpyCH4IV (New England Biolabs),

and the three digests per sample were then pooled in equivalent amounts, adaptors were ligated

onto the ends, and the ligation products were digested with McrBC (New England Biolabs).

Samples were then PCR-amplified using primers complementary to the adaptor sequences,

fragmented with DNase I (EpiCentre), labelled (GeneChip DNA labelling reagent, Affymetrix)

and hybridized to Affymetrix SNP 6.0 microarrays, which interrogate 906 600 SNPs at 3000 bp

resolution. For each sample, purified genomic DNA was prepared following the manufacturer’s

instructions and hybridized onto a second SNP 6.0 array for standard genotyping. As cases and

controls were run separately on two batches of arrays, a subset of 10 cases and 10 controls was

re-run in the second batch to ensure comparability. These technical replicates were enriched

separately, at a later date versus the original cases and controls, which were enriched together.

The workflow for the epiSNP study is presented in Figure 2.2.

Figure 2.2. EpiSNP study workflow

Post mortem prefrontal cortex and sperm samples were processed identically. Identified epiSNPs were examined in

several downstream analyses, which differed between cohorts.

EpiSNP identification

We used R v1.12.1 and the R package, oligo v1.14.0, to background correct, normalize and

summarize (RMA) the SNP probes, and crlmm to make genotype calls. Datasets were

normalized separately, as were genotyping and methylation arrays. For each SNP, we obtained a

pair of values: a genotype call and a methylation level. To determine the most appropriate

analysis strategy, we performed an empirical study of five different methods: Pearson’s

correlation, Spearman’s rank-order correlation, mutual information, piecewise linear regression

(PWL) and analysis of variance (ANOVA). We found that PWL and ANOVA were the top two

performers in terms of sensitivity and specificity. PWL is a two-step linear regression model; its

advantage over ANOVA is that it provides the direction of the methylation change between

the genotypes AA and AB (slope 1) and between AB and BB (slope 2). PWL was then used to examine the

pattern of dependence between the AA/AB and AB/BB genotype calls and their respective

methylation levels. Statistically significant allelic DNA methylation differences are identified as

epiSNPs, i.e. an epiSNP will have at least one significant non-zero slope. An FDR correction

was applied to correct for multiple testing. Identified epiSNPs had q-values < 0.01.
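The two-slope idea and the multiple-testing step can be illustrated with a minimal sketch. It assumes genotypes are coded 0/1/2 for AA/AB/BB and takes each slope as the difference between adjacent genotype means, which is a simplification that does not reproduce the significance test used in this study. The FDR method is not named in the text; the Benjamini-Hochberg procedure is assumed here. All function names are hypothetical.

```python
from statistics import mean

def pwl_slopes(genotypes, methylation):
    """Return (slope AA->AB, slope AB->BB) from per-genotype mean methylation.

    genotypes: 0 = AA, 1 = AB, 2 = BB (assumed coding).
    A slope is None when one of the two genotypes it spans is absent.
    """
    groups = {0: [], 1: [], 2: []}
    for g, m in zip(genotypes, methylation):
        groups[g].append(m)
    means = {g: (mean(v) if v else None) for g, v in groups.items()}
    slope1 = None if means[0] is None or means[1] is None else means[1] - means[0]
    slope2 = None if means[1] is None or means[2] is None else means[2] - means[1]
    return slope1, slope2

def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    q = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        q[i] = running_min
    return q

def is_episnp(q_slope1, q_slope2, threshold=0.01):
    """An epiSNP needs at least one slope whose q-value is below the threshold."""
    return any(q is not None and q < threshold for q in (q_slope1, q_slope2))
```

For example, `pwl_slopes([0, 0, 1, 1, 2, 2], [1.0, 1.0, 2.0, 2.0, 2.0, 2.0])` describes a SNP where methylation rises between AA and AB and is flat between AB and BB.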

We designed a random sampling procedure to test the sensitivity of our epiSNP identification

method, where Ci represents one of five sets of cohorts (Control, Case, BD, SZ, Control+Case)

in our study. For each cohort Ci and each sample size N (ranging from 2 to one less than the size of Ci), we drew X

random subsets of N samples and obtained X sets of identified epiSNPs, which depend on the samples chosen. The sampled

data give a range (minimum and maximum) of identified epiSNPs at each sample size, from

which we extrapolated the number of identified epiSNPs expected at larger sample sizes. For

the extrapolation, we tested three models - linear, quadratic and logistic - and used AIC (Akaike

Information Criterion) values to evaluate each one, with the smallest AIC value representing the

best model. The relative measure, weighted AIC, calculates the probability of each model being

the best, given the data and set of possible models. In order to gain a clear view of the overall

trend, we filtered the data with an increasing slope threshold. Both slope values for each SNP

were combined together and made positive.
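The weighted AIC mentioned above is commonly computed as Akaike weights. The sketch below follows that standard definition; the AIC values in the usage note are made up for illustration and are not results from this study.

```python
import math

def akaike_weights(aic_by_model):
    """Convert AIC values into Akaike weights.

    Each weight is exp(-delta/2) normalised over all candidate models, where
    delta is the model's AIC minus the smallest AIC in the set. The weights sum
    to 1 and can be read as the probability that each model is the best of the
    candidates, given the data.
    """
    best = min(aic_by_model.values())
    relative = {name: math.exp(-0.5 * (aic - best)) for name, aic in aic_by_model.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}
```

Calling `akaike_weights({"linear": 210.0, "quadratic": 204.0, "logistic": 200.0})` with these hypothetical values gives the logistic model, which has the smallest AIC, the largest weight.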

Identified epiSNPs from both the brain and sperm samples were closely examined for any

chromosomal and functional class bias. Chromosome annotation was taken from the R

annotation package pd.genomewidesnp.6 (v1.1.0) and functional class information was taken

from dbSNP (build 135). We separated functional classes into five main categories: exon,

intron, UTR, locus and intergenic. Note that intergenic regions are not currently categorized in

dbSNP, and it was the absence of information that was used to label SNPs as “intergenic.” All

other possible functional classes did not apply to our group of epiSNPs. The term “locus” is

used by dbSNP to identify intergenic SNPs that have close associations with a gene, existing

either within 2Kb upstream or 500bp downstream, but do not appear in the transcript. The term

“UTR” includes both 5’ and 3’ UTRs, where the 3′ UTR is the portion of an mRNA from the

position of the last codon that is used in translation to the 3′ end, and the 5′ UTR is the portion

of an mRNA from the 5′ end to the position of the first codon used in translation. We examined

each cohort using the hypergeometric distribution (phyper function in R) to compare the

proportion between epiSNPs in each chromosome/functional class against all genetically diverse

SNPs on the SNP 6.0 array for the same chromosome/functional class. We determined if the

epiSNP proportion was an under-representation or an over-representation. Multiple testing

correction using the FDR and a q-value threshold of 0.01 was applied to the chromosomal results, as there

were 24 tests per cohort. No correction for multiple testing was applied to the functional class results,

because only five tests were run per cohort: with a stringent p-value cut-off of 0.01, we would expect, per cohort,

0.01 x 5 = 0.05 false positive functional class biases by chance alone.
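The over- and under-representation tests can be sketched with the hypergeometric distribution written out directly. This is an illustration, not the phyper-based code used in the study; the argument names are hypothetical, and exact binomial coefficients become slow at the scale of the full array, so the usage note uses toy numbers.

```python
from math import comb

def hypergeom_pmf(x, class_size, array_size, n_episnps):
    """P(exactly x of the n_episnps fall in a class holding class_size array SNPs)."""
    return (comb(class_size, x)
            * comb(array_size - class_size, n_episnps - x)
            / comb(array_size, n_episnps))

def _support(class_size, array_size, n_episnps):
    """Smallest and largest possible number of epiSNPs in the class."""
    lowest = max(0, n_episnps - (array_size - class_size))
    highest = min(class_size, n_episnps)
    return lowest, highest

def over_representation_p(x, class_size, array_size, n_episnps):
    """Upper tail: P(X >= x)."""
    lowest, highest = _support(class_size, array_size, n_episnps)
    return sum(hypergeom_pmf(i, class_size, array_size, n_episnps)
               for i in range(max(x, lowest), highest + 1))

def under_representation_p(x, class_size, array_size, n_episnps):
    """Lower tail: P(X <= x)."""
    lowest, highest = _support(class_size, array_size, n_episnps)
    return sum(hypergeom_pmf(i, class_size, array_size, n_episnps)
               for i in range(lowest, min(x, highest) + 1))

# Expected number of false positives when no correction is applied:
# five functional class tests at a p-value cut-off of 0.01.
expected_false_positives = 0.01 * 5
```

With a toy array of 10 SNPs, 4 of them in the class, and 3 epiSNPs of which 2 fall in the class, `over_representation_p(2, 4, 10, 3)` is 40/120.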

We used the web interface of the GoMiner program to identify enriched GO categories

associated with our lists of identified epiSNPs. GoMiner requires two lists of genes as input: the

total set of genes and a subset of interesting genes. dbSNP (build 135) was used for mapping

SNPs to gene symbols. Three GoMiner runs were completed for each cohort. For each cohort,

GoMiner returned a list of GO categories that are statistically enriched for those cohort genes

that belong to the GO category after correction for multiple testing (FDR q-value < 0.02). An

enrichment score is given for each GO category, representing the proportion of cohort genes

relative to the total number of genes on the SNP 6.0 array, which can be concisely described as:

Verification of microarray results

EpiSNPs that occurred in both cases and controls were chosen from a list of top hits, where one

allele was associated with methylation much more strongly versus the alternate allele, and non-

epiSNPs were randomly selected (n=3 for each). For each locus, we chose AA and BB

homozygous samples from the original experiment, and then performed bisulfite modification

on the genomic DNA, as done previously in our laboratory [32]. The amplicons were designed

to cover as many CpG sites surrounding the SNP as possible; PCR amplicon, pyrosequencing,

and sequencing primers are provided in Table 2.2. PCR conditions included 5 µl of 0.12 µM

primer mix, 2.5 µl of Qiagen HotStar Taq buffer, 0.5 µl of 10 mM dNTPs, 1.3 µl of HotStarTaq,

and double-distilled H2O to a final reaction volume of 25 µl. Cycling conditions were as

follows: 95°C for 15 min; 40 cycles of 95°C for 1 min, locus-specific annealing temperature for 45

sec, and 72°C for 1 min; 72°C for 10 min; cool to 10°C and hold. Prepared amplicons were

pyrosequenced in house on the PyroMark Q24 machine (Qiagen), using 0.3µM sequencing

primers. Results were analyzed using the methylation analysis function in the PyroMark Q24

software. Independent CpG percentages were compared between genotypes (AA vs. BB) using

the Wilcoxon signed rank test, using a p-value threshold of <0.06.

Table 2.2. Sodium bisulfite treated SNP loci and primers

Column 1: SNP 6.0 probe IDs for loci subjected to sodium bisulfite modification. Column 2: the number of brain

DNA samples interrogated per locus. Column 3: primer orientation. “F” and “R” denote the forward and reverse

primer sequence. “B” denotes the addition of a biotin modification for downstream pyrosequencing applications.

Column 4: Primer sequences for amplifying the respective regions from bisulfite modified DNA for

pyrosequencing. Column 5: pyrosequencing primers are provided in the far right.

Locus       #Samples  Direction  PCR primers                        Pyrosequencing primer
rs649951    10        F          AGTTTTTGTTAGTTTGAAGATATTTTGA       AGATTTATATGTAGTTGTA
                      R-B        (BIO)AATATAATCCCAAATCATAAAATCACAA
rs9936944   10        F          TGTTGTATTTTTAGTAGAGAGAGGGT         TGTTGGTTAAGTTGGT
                      R-B        (BIO)TCCTAATCCTAAAATCAACCATTCCT
rs1485474   10        F          TGTGGTAGTATATGGTTGTGGT             AGGATGGAGGTTTGT
                      R-B        (BIO)AACCAACTAATCTTCAACAAAACAAA
rs5950206   8         F          TTGGAAGATGTATTGTTTATAGTGTT         TTATTAGTGTTAGAGTTT
                      R-B        (BIO)ACCATATACACAAATCAACTCACAA
rs10875310  10        F          ATAGGAGGATGTGTGTAGATTATAT          TGTGTAGATTATATGGT
                      R-B        (BIO)ACCCACATAACCCAATCACCT
rs10962372  6         F          TTAAGGTGATTGGATGATTTGAGTA          TGAGGATTAAAGTATGA
                      R-B        (BIO)ACTAATTCAACTTACCTCCACCT

Examination of linkage disequilibrium effects

We investigated possible linkage disequilibrium (LD) effects between identified epiSNPs and

SNPs occurring within nearby MSRE recognition sites, as it is possible that any of the 4 bases in

the recognition site could be a SNP that creates or disrupts the site, leading to false positive

epiSNP associations. Unless specified, "SNPs" includes both epiSNPs and non-epiSNPs and

"MSRE SNPs" refers to any SNP that creates or disrupts an MSRE recognition site. The human

genome (build hg19) and SNPs from dbSNP (build 135) were used to determine the location of

MSRE SNPs. Two different analyses were utilized (p<0.05 is considered significant):

1. Distance analysis: Examining the distance between SNPs and the nearest MSRE SNP. A

Wilcoxon rank sum test was applied to test for differences between epiSNP and non-epiSNP

distances in each cohort.

2. LD analysis: Examining LD values between SNPs and all MSRE SNPs within 2Kb. The

European ancestry in Utah (CEU) and British from England and Scotland, UK (GBR)

populations were chosen as most representative of the samples used in the epiSNP analysis.

Genotype information was taken from the 1000 genomes project [201]. PLINK analysis for LD

calculations used a maximum distance of 10Kb to reduce computation time.
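As an illustration of the LD values examined in the second analysis, the sketch below computes the r² statistic for a pair of biallelic SNPs from phased haplotypes and applies the 2Kb window. It is a simplified stand-in for the PLINK calculation, which works from genotype data; the 0/1 allele coding and the function names are assumptions.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: list of (a, b) tuples, one per chromosome, where a and b are
    0 or 1 and mark which allele is carried at the first and second SNP.
    Returns None if either SNP is monomorphic in the sample.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom

def within_window(pos_snp, pos_msre_snp, max_bp=2000):
    """True when an MSRE SNP lies within max_bp of the SNP of interest."""
    return abs(pos_snp - pos_msre_snp) <= max_bp
```

Two SNPs whose alleles always travel together, such as `[(0, 0), (0, 0), (1, 1), (1, 1)]`, give r² = 1, while `[(0, 0), (0, 1), (1, 0), (1, 1)]` gives r² = 0.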

Deep sequencing analysis of non-epiSNPs

We detected a large number of epiSNPs using the microarray approach and PWL analysis, but

microarrays are not as sensitive as deep sequencing technologies and epiSNPs demonstrating

lower level associations are likely overlooked by this method. Also, the PWL analysis is a two-

step regression model that requires at least two of the three possible genotypes (AA, AB and

BB) to be present in order to detect the methylation intensity slopes; as a result, SNPs that have

rare genotypes may be excluded from our analysis. In order to examine the methylation

association of SNPs in greater detail, we conducted an experiment using the 454 deep

sequencing platform, which allows us to use single-base resolution to search for associations

that may have been missed by the microarrays. Eleven SNPs that did not demonstrate ASM

were chosen. For each SNP, we aimed to choose 10 samples of each alternative homozygote

from cases and controls, i.e. 10 AA (case), 10 BB (case), 10 AA (ctrl) and 10 BB (ctrl), to be

bisulfite-modified [32] and sequenced using the 454 platform (40 samples x 11 SNPs = 440

amplicons produced). In some cases, 10 samples of each type were unavailable, but overall

there were approximately equal numbers of each type submitted. Each sample underwent

bisulfite modification, which converts unmethylated cytosines to uracils, and then finally to

thymines after PCR amplification, while methylated cytosines remain intact. For each SNP, an

amplicon was created that surrounded the SNP and contained as many CCGG sites as possible –

to ensure purity, these amplicons were cut from an agarose gel and purified before

quantification. We split the 454 sequencing plate into quadrants (AA (case), BB (case), AA

(ctrl) and BB (ctrl)) and sequenced from both A and B tails using Titanium plates and

reagents. We generated approximately 422 114 forward sequence reads (summarized in Table

2.3) and compared the number of unconverted cytosines at each CpG site between groups per

SNP. Our analysis focused on 4 main questions:

1. Is the association between methylation and genotype in case the same as the association

between methylation and genotype in control?

2. Is the association between methylation and genotype AA in case the same as the association

between methylation and genotype BB in case?

3. Is the association between methylation and genotype AA in control the same as the

association between methylation and genotype BB in control?

4. Is the association between methylation and case the same as the association between

methylation and control?
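The quantity compared between groups, the number of unconverted cytosines at each CpG site, can be illustrated with a toy sketch. It assumes forward-strand reads that are already aligned to the reference and have the same length as it, which real 454 reads would not; the function names are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate the sequence read after bisulfite treatment and PCR.

    Unmethylated C is read as T; methylated C (index in methylated_positions)
    stays C. Positions are 0-based indices into seq.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )

def cpg_positions(reference):
    """0-based positions of the C in every CpG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]

def unconverted_counts(reference, reads):
    """For each reference CpG, count the reads that still show C (methylated)."""
    return {
        pos: sum(1 for read in reads if read[pos] == "C")
        for pos in cpg_positions(reference)
    }
```

For the toy reference `"ACGTCCGA"`, the CpG cytosines sit at positions 1 and 5, and a read from a molecule methylated only at position 1 would be `"ACGTTTGA"`.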

For question 1, we tested the null hypothesis (H0) per SNP, assuming that the difference

between the odds ratios follows a normal distribution under H0. The data were fitted with a

logistic regression model (binomial family with logit link) for each SNP. Because we used the

information from all the CpGs for each SNP, this analysis does not give results for each individual

CpG.

For questions 2-4, we performed a multiple-CpG association analysis per SNP. A 2x2

contingency table was constructed for each question tested, and Fisher’s exact test was used to

examine the association for each table. The CpG counts were pooled for each SNP. Each test

was performed 20 times (per SNP), and significant q-values were recorded.
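A two-sided Fisher's exact test on a 2x2 table can be written out as below. The table layout used here (rows for the two groups being compared, columns for unconverted versus converted cytosine calls pooled over CpGs) is an assumption, since the exact construction of each table is not given in the text. Exact binomial coefficients become slow for very large read counts.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are no more
    likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, p_value)
```

For instance, `fisher_exact_two_sided(3, 1, 1, 3)` evaluates a hypothetical table with 3 unconverted and 1 converted call in one group and the reverse in the other.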

Table 2.3. Forward 454 sequencing reads per amplicon

The number of forward 454 sequencing reads generated for 11 non-epiSNPs.

SNP      dbSNP ID   # reads
SNP 1    10975882   9792
SNP 2    5943127    17255
SNP 7    11658063   110312
SNP 11   3762352    114447
SNP 12   219815     34831
SNP 15   17551103   30815
SNP 18   2859011    5629
SNP 19   2059697    53360
SNP 21   720080     26628
SNP 23   2581651    5372
SNP 25   1902675    13673

Chapter 3

Results and Discussion

Sequence-independent DNA methylation differences in MZ twins

In this study, we mapped MZ twin DNA methylation differences in white blood cells (WBC)

(N=19 pairs), buccal epithelial cells (N=20 pairs), and gut (rectum) biopsies (N=18 pairs), by

interrogation of the unmethylated genome on the 12K CpG island microarray [195]. We first

ensured that the microarray technology identifies actual DNA methylation differences between

MZ co-twins rather than artifactual differences due to technical variation. For this, 4 parallel

enrichments of the unmethylated fraction of genomic WBC DNA were performed from the

DNA stock of the same individual. DNA samples from 8 MZ twins (4 pairs) were compared

against themselves (to measure technical variation) or the respective co-twin (to measure

biological variation). The biological variation significantly exceeded the technical variation in

all cases (P=1.4x10^-238, P=1.1x10^-202, P=2.1x10^-7, P=2.6x10^-39), indicating that the detected MZ

co-twin differences are genuine (Fig. 3.1). The technical variance (σ²) was consistent between

all self-self hybridizations, while the degree of biological variation varied significantly between

twin pairs (Fig. 3.1). For all MZ twins per tissue cohort, the mean absolute log fold change

between MZ co-twins was significantly larger than that between technical replicates (WBC

mean difference=0.013 ± 4.5.6x10^-4, P=3.6x10^-173; buccal mean difference=0.017 ± 6.8x10^-4,

P=4.9x10^-132; gut mean difference=0.0053 ± 5.6x10^-4, P=1.02x10^-14), signifying that biological

variation was detectably higher than technical variation in all tissues. Furthermore, microarray

validation performed by sodium bisulfite sequencing and pyrosequencing (Fig. 3.2 and Fig. 3.3)

indicated that the microarray signals detected reflect the actual DNA methylation status in the

tested samples. For WBC-based analyses, we also performed a spot-wise correlation between

cell sub-fraction counts and confirmed that differences observed in WBC samples did not

result from cell sub-fraction differences.

Figure 3.1. Biological vs. technical variation

Volcano plots of 4 MZ twin vs. co-twin WBC DNA methylome comparisons (black) overlaid with 4 matched

twin DNA vs. self comparisons (green) for each set of MZ twins. The x-axis represents the mean fold change across

the 4 replicates. The y-axis represents the –log10 of the p value from a paired t test. Higher significance denotes a

higher consistency between replicates. Significant variation in the spread of detected biological difference exists

between twin pairs (Kruskal-Wallis χ²=16.3, df=3, P=0.001) with a symmetrical large (A and B), symmetrical

small (C), and asymmetrical (D) variation of the DNA methylome between co-twins. For each twin pair, a

non-parametric Ansari-Bradley test demonstrated that levels of variance (σ²) in the MZ twin vs. co-twin comparison were

significantly larger than σ² in the self-self comparisons (twin set A: variance ratio=2.91, P=1.4x10^-238; set B: 2.14,

P=1.1x10^-202; set C: 1.12, P=2.1x10^-7; set D: 2.63, P=2.6x10^-39). Levels of technical variation were not significantly

different between groups (Kruskal-Wallis χ²=1.81, df=3, P=0.62).

Figure 3.2. Correlations between microarray and sodium bisulfite sequencing data

Using DNA from WBC, microarray data were validated by sodium bisulfite modification based mapping of

methylated cytosines in 18 CpGs at a locus displaying a range of co-twin variability, UHNhscpg0003195, which

maps to the 3’ end of C1QTNF8. Over 1,300 clones representing 18 MZ twin pairs (36 clones on average per

individual) were sequenced. Twin differences in the density of methylated cytosines revealed by bisulfite

sequencing (x-axis) correlated significantly with the log2 DNA methylation differences produced by the microarray

data (y-axis) (mean density across all 18 CpGs: R=0.65, P=0.0036) (A). Similarly, the density of methylated

cytosines in the HpaII restriction site at the 9th CpG position correlated significantly with the microarray data

(R=0.58, P=0.01) (B). In both A and B, the x-axis values represent the sodium bisulfite based co-twin DNA

methylation difference and y-axis values represent the log2 fold difference between co-twins generated by

microarray data.

Figure 3.3. Pyrosequencing correlations as a function of distance

Bisulfite pyrosequencing of the total amplicon without cloning was performed at 5 loci showing a range of co-twin

variation: UHNhscpg0008483 (15 pairs of twins, CA2 gene), UHNhscpg0004390 (10 pairs, RAX gene),

UHNhscpg0004556 (19 pairs, IL1A gene), UHNhscpg0000193 (18 pairs, RNF110 gene), and UHNhscpg0004262

(11 pairs, DLX1 gene), in WBC DNA samples, which also positively correlated with the microarray data. The bar

graph displays the strength of correlation between the log2 DNA methylation difference between co-twins in the

bisulfite pyrosequenced loci and that of the microarray data. Correlations between the microarray data and the

HpaII position only are depicted in red, while blue represents the correlation derived from the average

methylation density over 5, 6, 4, 7 and 3 CpGs, respectively. Interrogated CpG sites located within the probe

sequence (represented by a rectangle) showed the strongest correlation with microarray data. X-axis values

(-141, -30, 221, 267, and 637) depict the position in bp of the interrogated HpaII site relative to the 3’ end of each clone

marked as zero. The y-axis depicts Pearson’s correlation (R) between microarray and pyrosequencing data.

Similarly to Eckhardt et al. [202], we noticed that the strength of the correlation between microarray signal and

bisulfite based-mapping of methylated cytosines located outside the probe sequence was a function of the distance

of the interrogated CpG site from the probe.

In the microarray-based studies, we detected a large degree of MZ co-twin DNA methylation

variation in all tissues investigated, despite their identical DNA sequences. We used an

intraclass correlation coefficient (ICC) to measure MZ co-twin variation for each unique

genomic region, where an ICC range from +1 to –1 denotes high to low epigenetic similarity

between co-twins relative to the variation between unrelated pairs. For each tissue, we

generated an ICC-based annotation of MZ co-twin DNA methylation variation across ~6,000

unique DNA loci (Fig. 3.4 – WBC; Appendix 1 Fig A.1 and Fig A.2 contain annotations for the

other tissues). Notably, DNA methylation profiles in the buccal epithelial cells from MC MZ

twins were significantly more variable within pairs than those from DC MZ twins (median

difference=0.37 ± 0.0057, P<9.9x10^-324), which cannot be explained by technical differences

between the hybridization batches of each group (Fig 3.5). Chorionicity information was only

available for the buccal and WBC samples; all WBC of MZ twins were DC to avoid in utero

twin blood transfusion effects. DC MZ twins are believed to result from a splitting of the

cleavage-stage embryo within the first four days following fertilization, whereas MC MZ twins

arise after this point [203]. The varying degrees of epigenetic dissimilarity detected between

these groups may reflect differences in epigenetic divergence among embryonic cells at the time

the twin embryos separated.
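The per-locus ICC used above to measure co-twin similarity can be illustrated with a common one-way ANOVA form of the statistic for pairs. The exact ICC formula used in this study is not stated in the text, so the version below is an assumption; `icc_pairs` is a hypothetical name.

```python
def icc_pairs(pairs):
    """One-way ANOVA intraclass correlation for twin pairs at a single locus.

    pairs: list of (twin1, twin2) methylation values, one tuple per twin pair.
    Returns a value between +1 (co-twins identical relative to the spread
    between pairs) and -1 (co-twins differ more than unrelated pairs would).
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairs")
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    # Between-pair mean square: spread of the pair means around the grand mean.
    ms_between = 2 * sum(((a + b) / 2 - grand_mean) ** 2 for a, b in pairs) / (n - 1)
    # Within-pair mean square: for a pair, the within sum of squares is (a - b)^2 / 2.
    ms_within = sum((a - b) ** 2 / 2 for a, b in pairs) / n
    if ms_between + ms_within == 0:
        raise ValueError("no variation in the data")
    return (ms_between - ms_within) / (ms_between + ms_within)
```

With identical co-twins, for example `icc_pairs([(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])`, the within-pair term vanishes and the ICC is 1; with `[(1.0, -1.0), (-1.0, 1.0)]` all variation is within pairs and the ICC is -1.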

Figure 3.4. Karyogram of MZ co-twin epigenetic similarity in WBCs

A chromosomal karyogram depicting degree of MZ co-twin DNA methylation similarity per interrogated locus in

the WBC sample. Dark-to-light bars on the chromosomes represent chromosomal banding patterns as revealed by

Giemsa staining, and red bars indicate regions of high microarray probe density. Bars to the right of each

chromosome represent locus-specific ICCs depicting degrees of MZ co-twin epigenetic similarity. P values

associated with the ICC statistic per locus were subjected to false discovery rate (FDR) correction for multiple

testing. FDR-corrected P values below the level of P < 0.05 are depicted in green, and those with greater P values

are shown in gray.

Figure 3.5. Raw binding intensities of MC and DC MZ twin hybridizations

Box plots of raw green (A) and red (B) signal intensities for 40 DC (1) and 40 MC (2) buccal MZ twin microarrays.

Green and red center lines separate the two batches of samples. As MC and dichorionic MZ twin buccal samples

were performed in different batches, we wanted to evaluate if batch effects in sample binding could be influencing

this result. No batch effects are observed that could account for the significant differences in MZ co-twin

epigenetic variation between dichorionic and MC twins.

The spot-wise ICC values across the 5,919 loci that overlapped between data sets were

compared between the buccal cells and WBC from the same set of DC MZ twins, and WBC

samples from different individuals, by linear regression. A small but significant correlation was

observed between WBC and buccal-derived ICCs from the same individuals (R=0.046,

P=4.08x10^-4) but not between buccal cells and WBCs from unrelated individuals (R=-0.0025,

P=0.84), which suggests that tissues in genetically identical individuals are more epigenetically

similar versus those in unrelated individuals.

Using locus-specific DNA methylation information, we investigated whether the degree of co-

twin epigenetic similarity is associated with functional genomic elements. In each tissue, we

compared the distribution of ICCs of the CpG islands (CGIs) to that of all non-CGI loci.

Promoters were investigated in an identical manner. We carried out six tests and corrected p

values for multiple testing using the Bonferroni method. Both CGIs and promoters were less

epigenetically variable than non-CGI and non-promoter regions, respectively, in WBC-derived DNA (Wilcoxon Rank

Sum test, meanCGI=0.43 ± 0.0065, meanNon-CGI=0.39 ± 0.0053, P=1.5x10^-4, and

meanPromoter=0.43 ± 0.0085, meanNon-Promoter=0.4 ± 0.0048, P=0.0077; Bonferroni corrected P=8.7x10^-4

and P=0.047, respectively). Promoters also showed a trend towards being less

epigenetically variable in gut tissue (Wilcoxon Rank Sum test, meanPromoter=0.11 ± 0.0065,

meanNon-Promoter=0.09 ± 0.0037, P=0.057; Bonferroni corrected P=0.34). No statistically

significant differences in the degree of DNA methylation variation were detected in the buccal

epithelial cells. The promoter and CGI probes were also subjected to the Gene Ontology (GO)-

based analysis [197]. Most of the identified GO categories associated with epigenetically similar

loci between co-twins (top 5th percentile of ICCs) had direct functional relevance to the tissue

investigated (Table 3.1). The most apparent connections were observed in WBC, where

categories such as T cell proliferation (GO:0042098) and activation of immune response

(GO:0002253) were identified. In buccal cells, the proteinaceous extracellular matrix

(GO:0005578) and the metalloendopeptidase activity (GO:0004222) categories were identified;

genes in these categories interact and are expressed in oral fibroblast cells [204, 205]. A portion

of GO categories in gut appeared to be associated with regulation of cell proliferation

(GO:0042127) and epithelial to mesenchymal transition (GO:0001837), which is an intrinsic

step of formation of the smooth muscle cells of the gut blood vessels [206]. Our observations

are consistent with an earlier study [6] where the fidelity of CpG methylation patterns was twice

as high in promoter as opposed to non-promoter regions. Taken together, greater epigenetic

similarity between MZ co-twins at functionally important regions (versus loci without clearly

defined regulatory function) suggests that the epigenome is functionally stratified based on the

locations of critical genes.
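The Bonferroni correction applied to the six CGI and promoter tests above amounts to multiplying each p-value by the number of tests. The short sketch below applies it to the rounded p-values quoted in the text; because those inputs are rounded, the results can differ slightly from the corrected values reported above (for example 1.5x10^-4 x 6 = 9.0x10^-4 versus the reported 8.7x10^-4).

```python
def bonferroni(pvalues, n_tests):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    return [min(1.0, p * n_tests) for p in pvalues]

# Rounded p-values quoted in the text: WBC CGIs, WBC promoters, gut promoters.
reported = [1.5e-4, 0.0077, 0.057]
adjusted = bonferroni(reported, n_tests=6)
```

Only the two WBC comparisons remain below 0.05 after adjustment, which matches the interpretation given in the text.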

Table 3.1. GO analysis of loci with high MZ co-twin epigenetic similarity

Significantly over represented gene ontology categories in the positive 5th percentile of the ICC distribution of

promoter and CGI associated loci in each tissue cohort.

Cohort            GO ID       Pvalue  OddsRatio  ExpCount    Count  Size  Term
WBC Promoters     GO:0002274  0.0051  Inf        0.14341085  2      2     myeloid leukocyte activation
                  GO:0042098  0.0277  13.277778  0.28682171  2      4     T cell proliferation
                  GO:0006909  0.044   8.8425926  0.35852713  2      5     phagocytosis
                  GO:0009615  0.044   8.8425926  0.35852713  2      5     response to virus
                  GO:0002253  0.044   8.8425926  0.35852713  2      5     activation of immune response
WBC CGIs          GO:0030183  0.0314  12.325301  0.30630631  2      4     B cell differentiation
                  GO:0002253  0.0498  8.2088353  0.38288288  2      5     activation of immune response
                  GO:0002443  0.0498  8.2088353  0.38288288  2      5     leukocyte mediated immunity
Buccal Promoters  GO:0030518  0.0047  Inf        0.13835198  2      2     steroid hormone receptor signaling pathway
                  GO:0019222  0.0066  2.0263338  15.8413021  25     229   regulation of metabolic process
                  GO:0006350  0.0183  1.8618926  14.4577823  22     209   transcription
Buccal CGIs       GO:0005578  0.0287  3.3724236  1.79654511  5      26    proteinaceous extracellular matrix
                  GO:0004222  0.0324  5.3868243  0.73780488  3      11    metalloendopeptidase activity
Gut Promoters     GO:0042127  0.0365  3.8139535  1.31905465  4      19    regulation of cell proliferation
Gut CGIs          GO:0001837  0.0138  27.388889  0.20923657  2      3     epithelial to mesenchymal transition
                  GO:0007179  0.0263  13.680556  0.27898209  2      4     transforming growth factor beta receptor signaling pathway

Epigenetically variable loci (bottom 5th percentile of ICCs) were associated with cell division

processes, which may reflect an early developmental epigenetic discordance as one of the

hypothetical reasons for twin formation [6] (Table 3.2).

Table 3.2. GO analysis of loci with low MZ co-twin epigenetic similarity

Significantly over represented gene ontology categories in the negative 5th percentile of the ICC distribution of

promoter and CGI associated loci in each tissue cohort.

Cohort            GO ID       Pvalue  OddsRatio  ExpCount    Count  Size  Term
WBC Promoters     GO:0000074  0.0032  3.5637066  3.20930233  9      46    regulation of progression through cell cycle
                  GO:0022402  0.019   2.4193548  4.88372093  10     70    cell cycle process
WBC CGIs          GO:0019882  0.0435  8.9004329  0.35585586  2      5     antigen processing and presentation
                  GO:0045786  0.048   3.3833333  1.42342342  4      20    negative regulation of progression through cell cycle
Buccal Promoters  GO:0000279  0.0075  3.6943284  2.40895219  7      32    M phase
                  GO:0007067  0.0398  3.0641822  1.95727365  5      26    mitosis
                  GO:0000776  0.0276  13.308824  0.28659161  2      4     kinetochore
                  GO:0005876  0.0276  13.308824  0.28659161  2      4     spindle microtubule
                  GO:0005768  0.0383  5.0317164  0.78812692  3      11    endosome
                  GO:0000922  0.0439  8.8627451  0.35823951  2      5     spindle pole
Buccal CGIs       GO:0051656  0.0143  26.767123  0.21367521  2      3     establishment of organelle localization
Gut Promoters     GO:0006732  0.0151  5.4577778  1.03692762  4      13    coenzyme metabolic process
                  GO:0019867  0.0313  12.489796  0.30676692  2      4     outer membrane
Gut CGIs          GO:0051186  0.0065  4.4351852  1.80395853  6      22    cofactor metabolic process
                  GO:0022610  0.0189  2.5604396  4.18190386  9      51    biological adhesion
                  GO:0016337  0.0285  3.4325681  1.80395853  5      22    cell-cell adhesion

Cases of DNA sequence variation in MZ twins have been documented [207], but these are

uncommon and unlikely to account for even a fraction of the MZ co-twin differences identified

in our experiments. Further studies may include a more detailed annotation of epigenetic

differences in MZ co-twins, a search for disease-specific epigenetic changes in discordant MZ

twins, and a dissection of environment-induced versus stochastic epigenetic differences. As MZ

twins reared apart are generally quite similar to MZ twins reared together according to an array

of traits (electroencephalogram, IQ, personality, social attitudes) [208], we speculate that

stochastic events are much more important than environmental effects at loci where phenotype

is highly determined by epigenetics in MZ co-twins.

Comparison of MZ versus DZ epigenetic profiles

While the first part of this study investigated epigenetic differences in MZ twins, the next

section focuses on comparisons of epigenetic similarities in MZ versus DZ twins - the same

design that has been used in heritability studies. Here, we aim to describe the contributions of

genetic and non-genetic factors to epigenetic variation. DNA methylation differences in buccal

epithelial cells from 20 sets of MZ co-twins (described above) were significantly lower in

comparison to 20 sets of DZ co-twins matched for age and sex (ICCMZ-ICCDZ=0.15 ± 0.0039,

P=1.2x10^-294) (Fig. 3.6A). The entire observed effect was attributable to the ten sets of dichorionic

MZ twins (mean ICCMZ-ICCDZ=0.35 ± 0.0057, P<9.9x10^-324) (Fig. 3.6B), whereas the mean

ICC of MC MZ twins was close to 0 (Fig. 3.6C). In WBC from 19 sets of MZ twins (described

above) and 20 sets of DZ twins matched for age, sex, and blood cell count (total WBC count,

neutrophil and lymphocyte fractions), MZ-DZ differences were much more subtle, but still

significant (mean ICCMZ-ICCDZ=0.0073 ± 0.0034, P=0.044). The observed effect may have

been diminished by our conservative efforts to bias against larger epigenetic MZ – DZ

differences by selecting matched DZ twins with smaller co-twin cell sub-fraction differences as

compared to the MZ twins. For buccal tissue, a locus-specific annotation of ICCMZ-ICCDZ

values representing dichorionic MZ co-twin similarity relative to DZ co-twin similarity is

provided (Fig. 3.7, and Fig. A1.3 and A1.4 in Appendix 1 for WBC and MC buccal samples).
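
For readers unfamiliar with the statistic, the per-locus comparison of co-twin similarity can be sketched as follows. This is a minimal illustration only: it assumes a one-way random-effects intraclass correlation for pairs, which may differ from the exact ICC form used in this study, and the function names and methylation values are hypothetical.

```python
from statistics import mean


def icc_pairs(pairs):
    """One-way random-effects ICC for a list of (twin1, twin2) values at one locus.

    Assumption: ICC = (MSB - MSW) / (MSB + MSW) for groups of size 2.
    Undefined (division by zero) if every value in `pairs` is identical.
    """
    n = len(pairs)
    grand = mean(v for pair in pairs for v in pair)
    # Between-pair mean square (2 members per pair, n - 1 degrees of freedom).
    msb = 2 * sum((mean(p) - grand) ** 2 for p in pairs) / (n - 1)
    # Within-pair mean square: each pair contributes (a - b)^2 / 2, n degrees of freedom.
    msw = sum((a - b) ** 2 / 2 for a, b in pairs) / n
    return (msb - msw) / (msb + msw)


def icc_difference(mz_pairs, dz_pairs):
    """ICC(MZ) - ICC(DZ) at one locus; positive values mean MZ co-twins are more alike."""
    return icc_pairs(mz_pairs) - icc_pairs(dz_pairs)


# Hypothetical methylation signals for four MZ and four DZ pairs at a single locus.
mz = [(0.10, 0.12), (0.50, 0.47), (0.80, 0.83), (0.30, 0.31)]
dz = [(0.10, 0.30), (0.50, 0.35), (0.80, 0.60), (0.30, 0.45)]
print(round(icc_difference(mz, dz), 3))
```

Repeating the calculation at every interrogated locus would give a distribution of ICCMZ-ICCDZ values of the kind summarized in Fig. 3.6 and Fig. 3.7.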

Figure 3.6. MZ and DZ ICC distributions in buccal cells

ICC distributions in buccal epithelial cells of MZ and DZ twins. A) all MZ twins (N=20 sets, red) and DZ twins

(N=20 sets, blue); B) dichorionic MZ twins (N=10 sets, red) and matched DZ twins (N=10 sets, blue); C) MC MZ

buccal samples (N=10 sets, red) with matched DZ twins (N=10 sets, blue).

Figure 3.7. Karyogram of ICCMZ-ICCDZ values in buccal cells of DC MZ twins

A chromosomal karyogram depicting levels of dichorionic MZ co-twin similarity relative to DZ co-twin similarity

per interrogated locus in the buccal sample. Blue bars to the right of each chromosome represent locus specific

ICCMZ-ICCDZ values.

All techniques for enrichment of differentially-methylated DNA sequences for microarray-based

DNA methylation profiling can potentially be confounded by DNA sequence variation. In our

experiments, SNPs within HpaII restriction sites may have caused enrichment differences,

which would then result in larger variation in DZ twins. In addition, DNA sequence variants

may influence epigenetic status: the literature contains several examples of DNA alleles or

haplotypes associated with specific epigenetic profiles [12, 209, 210].

Alternatively, DZ twins may show more epigenetic differences than MZ twins because the

former originate from different zygotes carrying two different epigenetic profiles, while the

latter develop from the same zygote, and therefore should possess similar epigenomes at the

time of blastocyst splitting. Although the experiments described below do not unequivocally

prove this second hypothesis, we favour the idea of these zygotic epigenetic effects for several

reasons.

First, epigenetic profiles are not fully determined by DNA sequence; if that were the case, MZ

twins would show no epigenetic differences. Therefore, the observed major, epigenome-wide

differences in the buccal epithelial cells from MZ twins versus DZ twins are highly unlikely to

be caused exclusively by DNA sequence differences in DZ twins. Furthermore, ICCMZ-ICCDZ

differences were tissue-specific, as the buccal epithelial cells from dichorionic MZ twins

showed much larger MZ-DZ epigenetic differences in comparison to that of a subset of WBC

obtained from the same individuals at the same time. As the DNA sequences should be identical

(or nearly identical) between the tissues of the same organism, the tissue specific ICCMZ-ICCDZ

differences argue against DNA sequence as a major controlling factor of epigenetic profiles.

Second, to address the putative effects of differential digestion of polymorphic HpaII restriction

sites in DZ twins, we tried to perform a comparative analysis between HpaII and its

isoschizomer, MspI, as has been suggested in the HELP assay [211]; however, degrees of

technical variation produced in MspI-based experiments were markedly larger than those of

HpaII experiments (ratio of HpaII/MspI variance = 0.37, P < 9.9x10^-324) (Fig. 3.8). This is not

surprising, given that digestion of genomic DNA with MspI generates at least an order of

magnitude more short restriction fragments, which will affect the dynamics of subsequent steps

(adaptor ligation, PCR, hybridization) in comparison to the HpaII-based enrichment of the

unmethylated DNA fraction. As a result, the two experiments were not directly comparable.

Alternatively, we carried out an in silico analysis in which the SNP and allele frequency

information available in the dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/) and HapMap

(http://www.hapmap.org/) databases was used to calculate heterozygosity quotients that

represent the probability that a given probe would have a restriction site disrupted by a SNP.

From the 6,405 and 5,917 unique sequences within the WBC and buccal data sets, 109 and 98

loci containing HpaII SNPs were identified, respectively. For both data sets, there was no

correlation of locus heterozygosity value with ICCMZ-ICCDZ value (R=-0.0032 and P=0.97 for

WBC; R=0.024 and P=0.81 for buccal cells). A similar analysis was done to address the

epigenetic effects of SNPs in cis by extending the interrogated region to include all SNPs within

1 Kb proximal to and including the probe sequence. Again, correlation analysis of

heterozygosity values at 1,369 (WBC) and 1,284 (buccal) SNP containing loci showed no

correlation with ICCMZ-ICCDZ value (R=-0.019, P=0.47 (WBC), and R=0.033, P=0.23 (buccal)).

These results are in agreement with our subsequent study, which demonstrates that strong

genetic effects on epigenetic status occur relatively infrequently throughout the genome.
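
The logic of this in silico check can be illustrated with a short sketch. The exact definition of the heterozygosity quotient is not reproduced here, so the code assumes one plausible form: under Hardy-Weinberg equilibrium and independence between SNPs, the probability that at least one SNP overlapping a probe's restriction sites is heterozygous. The function names are hypothetical.

```python
from math import sqrt


def site_heterozygosity(allele_freqs):
    """Probability that at least one SNP is heterozygous.

    Assumption: each SNP is heterozygous with probability 2p(1-p)
    (Hardy-Weinberg) and SNPs are independent of one another.
    """
    prob_all_homozygous = 1.0
    for p in allele_freqs:
        prob_all_homozygous *= 1.0 - 2.0 * p * (1.0 - p)
    return 1.0 - prob_all_homozygous


def pearson_r(xs, ys):
    """Pearson correlation between per-locus heterozygosity and ICCMZ-ICCDZ values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# A probe with two SNPs in its restriction sites, allele frequencies 0.5 and 0.1.
print(round(site_heterozygosity([0.5, 0.1]), 3))
```

If polymorphic restriction sites were driving the MZ-DZ difference, loci with higher quotients would be expected to show larger ICCMZ-ICCDZ values, giving a clearly positive correlation rather than the near-zero values reported above.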

Figure 3.8. Technical variation volcano plots of HpaII and MspI based enrichments

Volcano plots measuring technical variation produced by HpaII (red) and MspI (blue) enrichments. Each plot is

produced from 4 parallel self vs. self enrichments and hybridizations at 5,997 overlapping loci between the two data

sets. MspI enriched samples produce significantly more technical variance (σ²) than that of HpaII as measured by a

non-parametric Ansari-Bradley test (ratio of HpaII/MspI variance = 0.37, P < 9.9x10^-324).

Third, we investigated whether DNA sequence variation may influence DNA methylation in cis

and in trans by methylation analysis of two strains of inbred (that is, nearly genetically

identical) mice as compared to two strains of outbred (genetically non-identical) mice. Mouse

brains were subjected to 4.6K CpG island microarray-based DNA methylation profiling. First,

we determined that the detected biological variance is significantly larger than technical

variance in the mouse experiments (P < 9.9x10^-324). We then compared the spot-wise distribution

of within-sibship DNA methylation variation (σ²) between inbred and outbred mice at 2,176

unique genomic regions and did not detect any significant difference (mean difference = 2.1x10^-5

± 3x10^-4, P = 0.68) (Fig. 3.9). Although it is not completely clear to what extent mouse brain

results can be extrapolated to human buccal cells, despite their shared ectodermal origin, and

although DNA variation in the outbred mice is less than that of unrelated humans (based on the

Wellcome Trust study (http://www.well.ox.ac.uk/mouse/INBREDS), our estimate is that, in

general, outbred mouse DNA heterozygosity is 2-4 times lower in comparison to unrelated

humans), the impact of DNA polymorphisms on DNA methylation does not seem to be

common.
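
The spot-wise comparison described above can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code used in this study: the variance is the ordinary sample variance across siblings, the signed-rank test uses the large-sample normal approximation without tie or continuity corrections, and all names are hypothetical.

```python
from math import erfc, sqrt
from statistics import variance


def spot_variances(sibship):
    """Per-spot sample variance; `sibship` holds one list of spot values per animal."""
    return [variance(values) for values in zip(*sibship)]


def wilcoxon_signed_rank(xs, ys):
    """Two-sided paired Wilcoxon signed-rank test (normal approximation).

    Returns (W+, p). Zero differences are dropped; tied ranks are averaged.
    """
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    return w_plus, erfc(abs(z) / sqrt(2))
```

In use, `spot_variances` would be applied to each sibship, the per-spot values averaged within the inbred and outbred groups, and the two resulting vectors passed to `wilcoxon_signed_rank`. With the 2,176 regions analysed here the normal approximation is reasonable; for small numbers of spots an exact test would be preferable.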

Figure 3.9. Distributions of inbred and outbred epigenetic variation

The spot-wise distributions of the within sibship variance for both inbred (red) and outbred (blue) mice. A non-

parametric comparison of the distributions with a paired Wilcoxon signed rank test did not identify any significant

epigenetic difference between groups, despite the genetic variation within the outbred group (mean

difference = 2.1x10^-5 ± 3x10^-4, P = 0.68).

In the classical twin studies, greater phenotypic similarity among MZ twin pairs compared to

DZ twins has been traditionally attributed to the degree of DNA sequence similarity. Our twin

studies suggest that in addition to identical DNA, epigenetic similarity at the time of blastocyst

splitting may also contribute to phenotypic similarities in MZ co-twins. By the same argument,

DZ co-twins are more different from each other than MZ co-twins not only because they possess

some DNA sequence differences (on average ~0.05%), but also because they originated from

epigenomically different zygotes. In addition, epigenomic inheritance may explain the

“intangible variance”, the concept that originated from the observation that regular (polyzygotic)

inbred mice were much more different from each other than the MZ inbred mice of the same

strain [212]. In conjunction with such findings, our data suggest that the phenotypic effects of

the individual epigenomes of each zygote could be substantial.

Although our in silico and mouse experiments indicated that ASM did not significantly affect

our findings in this study, it does not mean that genetics do not exert any influence on epigenetic

modifications. Here we used a single MSRE, HpaII, in the enrichment of the unmethylated

fraction, and the microarrays only contained 109 HpaII recognition sites (1.7% of total sites)

that could potentially be interrupted by the presence of a SNP – this is a control feature for our

epigenetic study, as it does not leave much opportunity for an ASM effect to become visible. If

we were to utilize multiple MSREs and a genome-wide approach, the incidence of ASM would

become more apparent, and this complicated molecular interaction may be relevant in the study

of complex human diseases, such as psychosis.

Genomic frequency and distribution of ASM

In response to the questions raised in the twin study, regarding the contribution of DNA

sequence to epigenetic factors, we conducted a large-scale, genome-wide analysis of ASM in

psychosis cases (SZ and BD) and controls using human tissues and the most complex

microarrays currently available. Our large sample size, volume of interrogated SNPs, and use of

human brain and sperm, as opposed to cell lines, set this study apart from all other studies to

date. The incorporation of a disease element in an ASM study of this magnitude is completely

novel and, unlike some other groups, we were not confined to any particular region of the

genome - promoters, exons, CGIs or other - allowing us to examine a tremendous number of

SNPs without bias. Each sample was genotyped using the Affymetrix SNP 6.0 array, and then a

second array was hybridized with the enriched unmethylated fraction from that same subject, in

order to determine the methylation differences between alternative DNA alleles at every SNP.

Our enrichment strategy is presented in Figure 3.10.

Figure 3.10. Enrichment of unmethylated DNA fraction

To enrich the unmethylated fraction, genomic DNA was first digested separately with HpaII, HinP1I and

HpyCH4IV, which cut their unmethylated recognition sites. The digested DNA was pooled per sample and non-

human adaptor sequences were ligated to the sticky ends. McrBC was then used to cut all internally methylated

CpG sites, leaving only unmethylated fragments to be amplified, fragmented, labeled and hybridized to SNP 6.0

arrays.

We examined our total numbers of epiSNPs occurring per cohort to determine their distribution

across the chromosomes in brain tissue (Fig 3.11). Chromosomes range in size; thus, the

number of total SNPs occurring on each one will differ. Taking this into account, we observed

several significant differences in the chromosomal distribution of epiSNPs between the cohorts

(illustrated in Table 3.3). In general, epiSNPs are distributed evenly across the genome in the

control cohort, with the exception of chr7 and chr16, which appear to have under- and over-

representation of epiSNPs across all cohorts, respectively. As power increases with sample size,

we were able to detect more epiSNPs when the cohorts were pooled; 9082 epiSNPs were

identified in the combined case cohort (BD plus SZ) and 13480 were detected when all case and

control samples were analyzed together. The distribution differences observed in the case group

encompassed all those included in the separate BD and SZ analyses, plus three additional

differences that likely became significant with the increased power. Oddly, when the largest

cohort (combined cases and controls) was analyzed, fewer differences in distribution were

observed versus the combined case group, suggesting that the inclusion of the somewhat

uniformly-distributed controls cancelled out some of the variation introduced by the cases.
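
To make the idea of size-adjusted over- and under-representation concrete, a per-chromosome test followed by a multiple-testing correction could look like the sketch below. The specific distribution test behind Table 3.3 is not restated in this section, so the code assumes a simple binomial z-test against the genome-wide epiSNP rate and Benjamini-Hochberg q-values; both choices, and all names, are illustrative assumptions.

```python
from math import erfc, sqrt


def chromosome_z_test(epi_on_chr, snps_on_chr, epi_total, snps_total):
    """Compare a chromosome's epiSNP count with the count expected from its SNP content.

    Assumption: binomial model with the genome-wide epiSNP rate as the null.
    Returns (z, two-sided p); z > 0 means over-representation.
    """
    rate = epi_total / snps_total
    expected = snps_on_chr * rate
    sd = sqrt(snps_on_chr * rate * (1 - rate))
    z = (epi_on_chr - expected) / sd
    return z, erfc(abs(z) / sqrt(2))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues
```

Each chromosome would contribute one p-value, and chromosomes with q < 0.05 would be reported as over- or under-represented according to the sign of z.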

Figure 3.11. Chromosomal distribution of brain epiSNPs

The distribution of brain epiSNPs per cohort across all chromosomes, where the total number of detected epiSNPs

is displayed as a proportion of the total number of SNPs occurring on each chromosome (on the Affymetrix SNP

6.0 array), with a p-value cut-off of <0.01. No epiSNPs in any cohort were located on the mitochondrial

chromosome.

Table 3.3. Differences in epiSNP chromosomal distribution in brain and sperm

The differences in epiSNP distribution across the chromosomes are shown per cohort, with q-values displayed

wherever a difference occurs. Column 1: tissue source. Column 2: cohorts, with “case” representing the combined

analysis of SZ and BD cohorts. Column 3: chromosome number (where a difference was observed). Column 4:

results of distribution test, showing over- or under-representation of epiSNPs on that chromosome. Column 5: q-

values, where q<0.05 is considered significant.

For each cohort, we consulted dbSNP to determine if the epiSNPs occurred on sequences that

were associated with a specific “functional class.” It should be noted that this database does not

classify intergenic SNPs, so we used the absence of classification to group them. Other studies

have suggested that a large percentage of epiSNPs appear in intergenic regions [213] and,

indeed, we did find that 1308 (61.17%) of control, 1070 (59.94%) of BD and 2356 (58.75%) of

SZ epiSNPs existed outside of the dbSNP functional classes in brain. Of all known SNPs, the

majority map to noncoding and intergenic regions of the genome; it is interesting to note that

intergenic SNPs are frequently detected by GWAS as showing strong associations with human

disease [214, 215]. For example, SNPs in an intergenic region on chromosome 4, as well as

those upstream from SLC2A9 are strongly associated with Alzheimer’s disease (plus psychosis)

[216]; an intergenic SNP was found to be significantly associated with systemic lupus

erythematosus, and a positive correlation between this SNP and expression of ATG5 has

nominated this nearby gene as a candidate locus [217]; finally, an intergenic SNP has been

suggested to predispose to papillary thyroid carcinoma via interactions with a long intergenic

noncoding RNA gene (papillary thyroid carcinoma susceptibility candidate 3, PTCSC3), which

is believed to be a tumor suppressor [164].

The binding of transcription factors to their specific sites is correlated with expression levels of

related genes, and some transcription factors, such as CTCF, are known to bind predominantly

within intergenic regions [218]. EpiSNPs within these sites would provide alternate local

methylation patterns, which may affect the binding and subsequent downstream actions of

CTCF and related proteins. Additionally, many non-coding RNAs (ncRNA) can be transcribed

from intergenic regions of the genome [219]. It is estimated that there are thousands of ncRNAs

[219, 220], some of which are highly conserved and required for crucial biological processes

[221], although the functions of many of these molecules still require validation [222].

Advances in sequencing technology and analysis techniques have encouraged the notion that

much of the genome is actually transcribed, and that non-coding sequences are as interesting

and important as those that code proteins [223, 224].

Our functional class analysis revealed that control, SZ and BD cohorts each have a unique

functional distribution, whereas the combined case cohort profile is predictably influenced by

the SZ and BD profiles (Fig 3.12). The exact meaning of each individual functional class

distribution is unclear, but three major trends were observed across most cohorts: epiSNPs are

enriched in locus and UTR regions, and depleted in exons (p values are displayed in Table 3.4).

When viewing Figure 3.12 B, it should be noted that enrichment was calculated as the spread of

functional classes per cohort against the spread of classes on the SNP 6.0 array, so comparisons

should only be made within a given cohort – enrichment represents the occurrence of

epiSNPs in a functional class versus the number we would predict to see in that class, given the

representation on the array. Some of the most detrimental SNPs are those that are structurally

functional and exhibit pleiotropy – they are associated with certain diseases and tend to occur in

exonic regions [165]. It is also true that SNPs generally tend to occur more frequently in non-

coding versus coding regions, as there is a negative selective force acting at sites of amino acid

altering mutations [163]; thus, our observation that epiSNPs are generally depleted at exons is

quite logical.
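
The enrichment measure described above, observed share of a functional class among epiSNPs relative to the share of that class on the array, can be written out as a short sketch. The counts below are hypothetical placeholders rather than values from this study, and the function name is an assumption.

```python
def class_enrichment(epi_counts, array_counts):
    """Observed/expected ratio per functional class.

    observed = share of a cohort's epiSNPs falling in the class
    expected = share of all array SNPs falling in the class
    Ratios above 1 indicate enrichment, below 1 depletion.
    """
    epi_total = sum(epi_counts.values())
    array_total = sum(array_counts.values())
    ratios = {}
    for name, array_n in array_counts.items():
        observed = epi_counts.get(name, 0) / epi_total
        expected = array_n / array_total
        ratios[name] = observed / expected
    return ratios


# Hypothetical counts for one cohort and for the array.
epi = {"intron": 40, "utr": 10, "coding": 2, "unclassified": 48}
array = {"intron": 400, "utr": 50, "coding": 50, "unclassified": 500}
print(class_enrichment(epi, array))
```

Because each ratio is normalized by that cohort's own epiSNP total, the values are comparable between classes within a cohort but not directly between cohorts, which is the caveat raised for Figure 3.12 B.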

The locus region is defined by dbSNP as being “within 2 Kb 5′ or 500 bp 3′ of a gene feature

(on either strand), but the variation is not in the transcript for the gene.” The UTRs are

considered to be located on the mRNA, but make up the regions before the first codon and after

the last codon used in translation, so they are located adjacent to the locus regions. Given that

the locus and UTR classes show similar, significant enrichment patterns, we can generalize that

the areas immediately flanking the 5’ and 3’ ends of genes are favourable epiSNP locations.

Many significant hits detected in GWAS studies of complex diseases do not actually affect

protein structure directly – it is believed that most variants act via regulatory changes in mRNA

expression [171]. Even single-base alterations to RNA can have profound effects on its structure

[225], and accessibility of particular regions affects the binding affinity at target sites for the

RNA-induced silencing complex (RISC) [226], as well as miRNAs, which bind to their many

target sites in the 3’UTRs of nearly every human transcript and exert a powerful regulatory

effect [171]. Aside from miRNA target sites, there are other functional elements within the

3’UTR that are known to affect miRNA activity. For example, loss of poly-adenylation (polyA)

can lead to various disease states via non-specific degradation of mRNA [227], and miRNA-

mediated repression has been correlated with polyA signal efficiency [166]. In humans, mRNA

is often targeted by multiple miRNAs, so the loss of a single binding site may not be deleterious

on its own; however, alterations may be cumulative [228]. In the developing brain, many

miRNAs are expressed, and they act to regulate neurogenesis, dendritogenesis, and synapse

formation. A functional analysis has shown that the 3' UTR of brain-derived neurotrophic factor

(BDNF) mRNA can be targeted by several miRNAs that are aberrantly up-regulated in the

absence of MeCP2, and this dysregulation may contribute to the development of Rett syndrome

[229].

Some of the 5’UTR SNPs detected by GWAS are also predicted to alter mRNA structure, and

evidence for this is accumulating: the 5’UTR in the human ferritin light chain gene has been

termed a “RiboSNitch,” meaning that its RNA changes structure if a particular disease-

associated SNP is present and, like the bacterial “Riboswitch,” the structural change is believed

to regulate translation [170]. Of the SNPs in high LD that formed RNA structure-stabilizing

haplotypes (SSH) in humans, SNP pairs in 8 of the 10 SSH-containing transcripts were shown

to stabilize RNA protein binding sites [170]. The methylation status of these SSH SNPs was not

investigated, but it is reasonable to hypothesize that regulation of local methylation could be

part of these SNPs’ mechanism of action. An algorithm has been devised for the detection of

RiboSNitches, and multiple SNPs have been detected in UTRs (particularly at the 5’ end) that

alter the mRNA structural ensemble of associated genes in six disease-states: hyperferritinemia

cataract syndrome, beta-thalassemia, cartilage-hair hypoplasia, retinoblastoma, chronic

obstructive pulmonary disease, and hypertension [230].

Table 3.4. Differences in epiSNP functional class distribution in brain and sperm

The differences in epiSNP distribution across dbSNP functional classes are shown per cohort, with p-values

displayed wherever a difference occurs. Column 1: tissue source. Column 2: cohorts, with “case” representing the

combined analysis of SZ and BD cohorts. Column 3: functional class (where a difference was observed).

Column 4: results of distribution test, showing over- or under-representation of epiSNPs in that class.

Column 5: p-values, where p<0.05 is considered significant.

Many of our detected epiSNPs (approximately 40% of total epiSNPs) exist in introns of genes

that are related to brain function and development, as the GO analysis will reveal in a later

section, which is logical considering that we investigated DNA from brain tissue. One possible

function for intronic ASM has recently come to light: these epiSNPs may be involved in the

splicing of RNA transcripts, possibly playing a role in self-splicing. Deep sequencing studies

have revealed that over 90% of human genes undergo alternative splicing, and SNPs located at

splice sites can alter mRNA translation efficiency or influence exon configuration, ultimately

affecting disease susceptibility [167]; this is evident from the fact that exon splice sites show

high conservation and very low SNP rates [231]. In a study of vascular dementia, it has been

suggested that the minor allele, A, of a PHLDB2 intronic SNP may induce delayed splicing and

increase susceptibility to the disease [232]. Relevant to SZ and BD, there are three intronic

SNPs near exon 10 of the GABRB2 gene (an “alternative splicing hotspot” that codes the β-2

subunit of the GABA A receptor) that are responsible for two novel isoforms of the subunit, and

these SNPs are significantly correlated with SZ and BD, with altered expression of these

isoforms occurring in both diseases; β (2S1) expression was increased and β (2S2) expression

was decreased [233]. Originally, introns were considered to be “junk” DNA sequences that

were simply removed from pre-mRNA, but they have since been found to occasionally encode

proteins, undergo further processing to form ncRNAs [169, 172], and may also represent mobile

genetic elements, such as transposons, in humans [168]. Alternative splicing is an important

process that is necessary for the creation of diverse, complex products, and interference with

normal methylation patterns at critical sites may also result in improper splicing events, with

damaging downstream consequences. Increased or decreased methylation associated with one

allele of an intronic epiSNP may damage an organism if splicing is disrupted, and the particular

epiSNPs that are specific to each disease cohort may contribute to its etiology. Alternatively,

the methylation differences associated with an intronic epiSNP may be beneficial to an

organism – epiSNPs of this variety are likely conserved across populations.

Figure 3.12. Functional class distribution of brain epiSNPs

Brain epiSNPs were stratified by their dbSNP functional class tags. Coding SNPs occur in exons of genes, where

one variant introduces either a non-synonymous or synonymous change; Intron SNPs are located within intronic

regions of genes; Locus SNPs are located 2Kb upstream or 0.5Kb downstream from a gene; and UTR SNPs occur

in either 3’ or 5’ untranslated regions. A) The number of epiSNPs per cohort detected in each functional class. B)

The number of epiSNPs per cohort in each class, given as a proportion of the total number of SNP 6.0 SNPs in that

class. Red asterisks mark classes that are enriched per cohort, and blue asterisks mark classes that are depleted per

cohort (p<0.05).

Bisulfite verification of selected SNPs

Several epiSNPs that were a) common to case and control groups and b) showed large array

intensity differences between alternate homozygotes were chosen, in addition to some randomly

selected non-epiSNPs. A portion of the surrounding sequence was amplified using bisulfite

modified genomic DNA from the same brain tissue used in the array experiment. The amplicons

were then pyrosequenced and analyzed using the methylation assay feature of the Pyromark

software, and then each individual CpG was subjected to the Wilcoxon signed rank test. For

each locus, we investigated samples that were homozygous for each allele in order to maximize

the potential methylation differences; the average methylation across each locus is listed in

Table 3.5 and the methylation for each individual at each specific CpG site is listed in Appendix

2, Table A2.4.

SNP          Type         % methylation A allele   % methylation B allele   # ASM CpGs
rs649951     epiSNP       80.2                     68.4                     1
rs9936944    epiSNP       97.4                     97.2                     1
rs1485474    epiSNP       8.9                      59.8                     3
rs5950206    non-epiSNP   63.6                     64.3                     0
rs10875310   non-epiSNP   99.8                     100                      0
rs10962372   non-epiSNP   100                      99.6                     0

Table 3.5. Average methylation across loci.

Column 1: SNP 6.0 probe ID for investigated loci. Column 2: designation of locus, where epiSNPs demonstrate

ASM and non-epiSNPs do not. Column 3: average methylation of all AA homozygotes across all CpG sites for a

locus. Column 4: average methylation of all BB homozygotes across all CpG sites for a locus. Column 5: number

of CpG sites on the amplicon that displayed significant methylation differences between AA and BB.

For three out of three epiSNPs, a methylation difference was observed in the direction predicted

by the microarrays, whereas all of the non-epiSNPs had no detectable significant differences

(p>0.42 for each). In two of the epiSNPs, not all of the surrounding CpG sites displayed

significant differences and, for rs9936944, the methylation difference at the ASM CpG was not

large enough to create an overall difference when all sites on the amplicon were analyzed

together. It is not necessary for ASM to occur at all nearby CpG sites for an epiSNP to be

detected, and it is also possible for a pyrosequencing experiment to miss contributing CpGs, as

our enrichment strategy allows us to detect epiSNPs that result from fragments that are several

Kb in length. It is possible that our small pyrosequencing region may not capture the exact CpG

sites that displayed the ASM associated with a particular SNP, and this is one limitation of the

technology. The methylation densities observed on the arrays for each possible genotype are

presented for one epiSNP in Figure 3.13, along with examples of the pyrograms generated for

the locus. Note that the unmethylated DNA fraction was measured on the microarray, as this is

what we specifically enriched and hybridized; a higher intensity corresponds to an overall lower

methylation level associated with a particular genotype.

Figure 3.13. Methylation levels observed for an epiSNP

A) Violin plot of unmethylation signal (microarray signal intensity generated by unmethylated fragments) for each

genotype of rs1485474, showing a decrease in unmethylation (increase in methylation) associated with the B allele.

Width of violin represents sample density at that position. B) Pyrogram for an AA genotype sample. C) Pyrogram

for a BB genotype sample, with higher levels of methylation.

None of the non-epiSNPs showed differing methylation levels between alternate

homozygotes, as predicted by the microarray analysis. It should be noted that the pyrosequencer

is by no means a sensitive instrument, and that tiny fluctuations below 2% are considered to be

“noise” from the pyrosequencing reaction (as stated by Qiagen technical service); thus, any

differences below 2% are not reliably detected and are not considered to be significant. Figure

3.14 shows the hypomethylation intensities and pyrograms for a typical non-epiSNP. Although

this verification experiment was small-scale, it provides sufficient evidence that the ASM

detected by the microarrays is not simply an artefact.
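
As a worked check of the locus-level averages, the values in Table 3.5 can be compared against the 2% noise floor mentioned above. The sketch only restates the table arithmetic; it does not reproduce the per-CpG Wilcoxon tests, and the names are hypothetical.

```python
NOISE_FLOOR = 2.0  # percent; pyrosequencing noise level quoted in the text

# (SNP, type, % methylation A allele, % methylation B allele), copied from Table 3.5
TABLE_3_5 = [
    ("rs649951", "epiSNP", 80.2, 68.4),
    ("rs9936944", "epiSNP", 97.4, 97.2),
    ("rs1485474", "epiSNP", 8.9, 59.8),
    ("rs5950206", "non-epiSNP", 63.6, 64.3),
    ("rs10875310", "non-epiSNP", 99.8, 100.0),
    ("rs10962372", "non-epiSNP", 100.0, 99.6),
]


def locus_differences(rows, noise_floor=NOISE_FLOOR):
    """Map each SNP to (A minus B difference in %, whether it reaches the noise floor)."""
    result = {}
    for snp, _kind, meth_a, meth_b in rows:
        diff = round(meth_a - meth_b, 1)
        result[snp] = (diff, abs(diff) >= noise_floor)
    return result


for snp, (diff, above_noise) in locus_differences(TABLE_3_5).items():
    print(snp, diff, above_noise)
```

At the level of whole-amplicon averages, only rs649951 and rs1485474 exceed the noise floor. The rs9936944 average does not, which matches the observation that its single ASM CpG is diluted when all sites on the amplicon are pooled.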

Figure 3.14. Methylation levels observed for a non-epiSNP

A) Violin plot of unmethylation signal for each genotype of rs10875310, showing no change in

methylation level associated with the A or B allele. Width of violin represents sample density at that position. B)

Pyrogram for an AA genotype sample. C) Pyrogram for a BB genotype sample.

Linkage disequilibrium does not cause false-positives

Linkage disequilibrium (LD), the non-random association of alleles, could potentially confound

our results, as LD between two SNPs where one is located within a MSRE recognition site

could lead to the identification of false epiSNPs. We investigated possible LD effects between

identified epiSNPs and SNPs occurring within nearby MSRE recognition sites (MSRE SNPs) in

both brain and sperm samples across all cohorts. The human genome (build hg19) and SNPs

from dbSNP (build 135) were used to determine the location of MSRE SNPs for each MSRE

used in our unmethylated fraction enrichment (HpaII, HinP1I, and HpyCH4IV). We also

examined possible de novo MSRE SNPs that occur when an alternative allele results in the

creation of an MSRE recognition site, where one previously did not exist.
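
The distinction between SNPs that disrupt an existing MSRE site and de novo MSRE SNPs can be illustrated with a small scan over the two alleles. This is a sketch rather than the pipeline used here: it works on a short local sequence instead of hg19 coordinates, and the function names are hypothetical. The recognition sequences (HpaII CCGG, HinP1I GCGC, HpyCH4IV ACGT) are palindromic, so scanning one strand is sufficient.

```python
MSRE_SITES = {"HpaII": "CCGG", "HinP1I": "GCGC", "HpyCH4IV": "ACGT"}


def find_sites(seq):
    """Return the set of (enzyme, 0-based start) for every MSRE site in `seq`."""
    seq = seq.upper()
    hits = set()
    for enzyme, motif in MSRE_SITES.items():
        start = seq.find(motif)
        while start != -1:
            hits.add((enzyme, start))
            start = seq.find(motif, start + 1)  # allow overlapping matches
    return hits


def classify_snp(ref_seq, pos, alt_base):
    """Compare MSRE sites on the reference and alternative alleles of a SNP at `pos`."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    ref_sites, alt_sites = find_sites(ref_seq), find_sites(alt_seq)
    return {
        "disrupted": ref_sites - alt_sites,  # site present only on the reference allele
        "created": alt_sites - ref_sites,    # de novo site on the alternative allele
    }


print(classify_snp("TTACCGGATT", 4, "T"))
```

A SNP is flagged as an MSRE SNP if either set is non-empty, which covers both cases considered in the text.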

The distance between SNPs and the nearest MSRE SNP within 2Kb was closely examined, as

LD effects are more likely to occur between nearby SNPs; also, the unmethylated fragments

generated in our enrichment were between 400bp and 2Kb, on average, so LD between SNPs

that are separated by a great distance would not affect our findings. Using a Wilcoxon rank sum

test, we found no significant difference between epiSNPs and SNPs in relation to the physical

distance to the nearest MSRE SNP (including de novo MSRE SNPs) within 2Kb (Fig 3.15).

Figure 3.15. Distances between SNPs and MSRE SNPs

Distances between SNPs and SNPs within MSRE recognition sites within 2Kb for all cohorts in DNA from A)

brain and B) sperm. No significant differences were detected.

We also examined LD values between every SNP and all nearby (within 2Kb) MSRE SNPs. We

used an LD threshold value of 0.8 to distinguish an LD effect (> 0.8) from no LD effect (≤ 0.8).

Once more, we found no significant difference in LD values between epiSNPs and non-epiSNPs

across all cohorts in brain and sperm samples (Fig 3.16). After a rigorous analysis into possible

LD effects, examining the physical distance between SNPs and the nearest MSRE SNP and

comparing LD values between SNPs and all nearby MSRE SNPs, there is no evidence that

MSRE SNPs result in LD relationships with identified epiSNPs.
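
For illustration, the thresholding step can be sketched as below. The LD measure is not named in this section, so the code assumes the squared correlation r² computed from phased two-SNP haplotypes; the names and the toy haplotypes are hypothetical.

```python
def r_squared(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes coded as (0/1, 0/1)."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("at least one SNP is monomorphic in this sample")
    return d * d / denom


def in_ld(haplotypes, threshold=0.8):
    """Apply the 0.8 cut-off used in the text: values above it count as an LD effect."""
    return r_squared(haplotypes) > threshold


perfect = [(0, 0), (1, 1), (0, 0), (1, 1)]
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(in_ld(perfect), in_ld(independent))
```

If MSRE SNPs were generating false epiSNPs, the share of SNP pairs above the threshold would be expected to be higher for epiSNPs than for non-epiSNPs, which is not what was observed.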

Figure 3.16. Linkage disequilibrium scores in brain and sperm

Linkage disequilibrium scores between SNPs and SNPs within MSRE recognition sites within 2Kb for all cohorts

in DNA from A) brain and B) sperm. The values range from 0 to 1, with those close to 1 indicating SNPs most

likely in LD and those closer to 0 indicating SNPs least likely in LD. The variations in distribution tip shape are

meaningless, as none of the groups demonstrated statistically significant differences.

ASM in the major psychosis cohort

The total number of epiSNPs per cohort and the overlap between cohorts are summarized in a

Venn diagram (Fig. 3.17). Each cohort had many unique epiSNPs (a list of the epiSNPs and

associated genes discussed here and in the tissue-specificity section is provided in Appendix 2,

Table A2.5), but there was also a significant amount of overlap – 529 epiSNPs were common to

all three cohorts, and many overlapped two out of the three. When we further dissect our data,

the SZ group contained more than double the epiSNPs of the control or BD groups, and this may

reflect its complex disease etiology, where small alterations occur in a large number of

pathological pathways. Notably, it has been hypothesized that multiple rare variants are

responsible for SZ [234-236], and this hypothesis may be extended to include epiSNPs.

Although BD and SZ are similar psychiatric diseases from the epidemiological and, to some

extent, clinical point of view, the BD group did not demonstrate this increase in ASM, and

actually contained slightly fewer epiSNPs than the controls. We do not believe that the

number of SZ epiSNPs was artificially inflated by one or two outliers, because the PWL method

requires a group effect in order to identify a significant epiSNP, and our q-value cut-off was

very stringent. The contribution of cohort-specific CNVs cannot be fully ruled out; however,

singleton deletions are believed to occur frequently in both SZ and BD, yet no epiSNP increase

was observed in BD [147]. Also, the total genome-wide CNV burden does not appear to differ

between controls and psychosis subjects [146], so it could be hypothesized that controls would

also have some unique epiSNP inflation, but the massive increase only seems to occur in the SZ

group. Other studies have not investigated the occurrence of epiSNPs in disease cohorts, so the

finding that epiSNP representation shows such a fluctuation between cases and controls is novel.

It also sheds some light on the importance of ASM in diseases, where a gain or loss of epiSNPs

may play an etiological role, although we have yet to definitively prove or disprove this scenario

in the case of psychosis.
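
The overlap counts summarized in a three-cohort Venn diagram can be derived with simple set operations. The sketch below uses made-up rsIDs and a hypothetical helper name, `venn_counts`; it only illustrates the bookkeeping and is not the procedure or data used here.

```python
# Illustrative sketch only: toy rsIDs, not the epiSNP lists from this study.
def venn_counts(control, bd, sz):
    """Count epiSNPs in each region of a three-set Venn diagram."""
    control, bd, sz = set(control), set(bd), set(sz)
    return {
        "control_only": len(control - bd - sz),
        "bd_only": len(bd - control - sz),
        "sz_only": len(sz - control - bd),
        "control_bd": len((control & bd) - sz),
        "control_sz": len((control & sz) - bd),
        "bd_sz": len((bd & sz) - control),
        "all_three": len(control & bd & sz),
    }


# Hypothetical epiSNP identifiers per cohort.
control = {"rs1", "rs2", "rs3", "rs7"}
bd = {"rs1", "rs2", "rs4"}
sz = {"rs1", "rs3", "rs4", "rs5", "rs6"}

counts = venn_counts(control, bd, sz)
print(counts)
```

Because the seven regions are mutually exclusive, their counts sum to the number of distinct epiSNPs across all cohorts, which gives a quick consistency check.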

We investigated the 529 epiSNPs that are common to all cohorts and found that the same allele

was associated with methylation in all but 35 of them. Additionally, of the epiSNPs that were

common to two out of three groups, only ~5% of them had methylation associated with the

opposite allele. This finding supports the concept that some epiSNPs may function in a way that

is critical to the organism; thus, their directionality is conserved between individuals.

Schalkwyk et al. also noticed that the direction of methylation is not always uniform at an

epiSNP. They found that individuals varied in the direction of ASM at 10% of the loci that they

examined in blood DNA [21]. The 35 common epiSNPs that did not show uniform

directionality of ASM are listed in Table 3.6. For the vast majority of these loci, there is either

no gene located within 50Kb of the SNP, no MIM title associated with the gene, or both,

so it is difficult to speculate on the meaning of many directionality differences. One

common epiSNP (rs684669) is located within an intron of the DSCAML1 gene, which encodes

the Down syndrome cell adhesion molecule-like 1 transmembrane receptor in many vertebrates

and invertebrates; DSCAML1 and the related DSCAM proteins are all involved in aspects of


neurodevelopment, such as axon guidance, bifurcation and segregation, as well as dendritic

patterning and synapse formation [237]. In this case, the SZ cohort showed ASM on the

opposite allele versus the control and BD cohorts, and there were also two SZ-unique epiSNPs

(rs7106294 and rs665406) located within introns of DSCAML1.

At another common epiSNP (rs2622769), the BD cohort differed in ASM direction versus SZ

and controls. This epiSNP was found within an intron of the secretogranin 3 gene (SCG3),

which codes a neuroendocrine secretory protein belonging to the chromogranin/secretogranin

family. Secretogranins are evolutionarily conserved in vertebrates and are responsible for

controlled delivery of peptides, hormones, neurotransmitters, and growth factors, and their

processed peptides are involved in metabolism, glucose homeostasis, emotional behavior, pain

pathways, and blood pressure modulation [238]. Due to their high level of conservation and

functionally crucial roles, a change in ASM in an intronic region of SCG3 may result in altered

splicing or other changes to the final protein product that could be damaging to the organism. In

the prefrontal cortex of patients with major depressive disorder (MDD), the psychotic subjects

exhibited a higher steady-state level of SCG3 mRNA than controls [239]. SCG3 is not as

well characterized as other members of the family; it has mainly been reported to be over-

expressed in neuroendocrine tumours [240]. The mRNA of a similar granin, secretogranin 2

(SCG2), was upregulated in the brains of mice treated with lithium versus the levels observed in

controls [241], suggesting that the actions of lithium, the gold-standard mood stabilizing

medication prescribed to individuals with BD, may involve regulation of disturbances in the

secretogranin family. In line with this theory, it has previously been shown that lithium-treated

cells develop an altered secretory phenotype involving increased cell content and secretion of

the SCG2 protein [242].

The third interesting directionality difference involved an epiSNP (rs3792174) located within an

intron of the phospholipase A2 receptor 1 (PLA2R1) gene, which codes a type I transmembrane

glycoprotein that is believed to contribute to the clearance of phospholipase A2 (PLA2), thereby

inhibiting its action [243]; again, the BD cohort differed from SZ and controls. Unique epiSNPs

at PLA2R1 were also identified in the BD and SZ groups (BD = rs2715950, BD and SZ =

rs949753), showing that this locus is a common target for ASM. The PLA2 enzyme hydrolyses

glycerophospholipids to release arachidonic acid and lysophospholipids, which are then modified into

inflammatory mediators (eicosanoids), and several recent studies have linked it to psychosis.


One family of eicosanoids, the prostaglandins, has been directly implicated in the disease. The

'prostaglandin deficiency' hypothesis postulates that synaptic transmission is affected when

defective enzyme systems fail to convert essential fatty acids to prostaglandins, resulting in

diminished prostaglandin levels. Lysophosphatidylcholine, a byproduct of PLA2-catalyzed

phospholipid hydrolysis, is the main carrier of polyunsaturated fatty acids across the blood-brain

barrier, and its level was decreased in SZ subjects in association with decreases in cognitive speed

[244]. Levels of PLA2 were measured in hippocampal tissues from anterior temporal

lobectomies of subjects with temporal lobe epilepsy who either showed psychotic symptoms or

did not, and increased PLA2 was only observed in the psychotic cohort [245]. In a subset of BD

patients with a history of psychosis, elevated calcium-independent PLA2 activity was detected

in the blood serum, but was absent in healthy controls [246]. Many years ago, increased plasma

PLA2 activity was discovered in SZ subjects, but could be reduced to control levels after 3

weeks of neuroleptic treatment [247]. Given that the activity of this enzyme is elevated in both

SZ and BD and that PLA2 is involved in regulation of norepinephrine receptor density, axon

regeneration, and presynaptic neurotransmitter release [248], increased rates of phospholipid

turnover may represent a shared biochemical feature of psychosis. Additionally, an epiSNP

(rs2020887) located within the coding region of the phospholipase A2 group 5 gene (PLA2G5)

and two in introns of phospholipase C epsilon 1 (PLCE1, rs2797990 and rs11187815) were

unique to the SZ cohort. Introns of phospholipase C beta 1 (PLCB1) contained a BD-unique

epiSNP (rs6086496), an epiSNP common to both SZ and BD (rs11908460), and two epiSNPs

common to control and SZ (rs6055601 and rs6055739). These are only a selection of

phospholipase-related epiSNPs, and this amount of variation suggests that ASM in these

pathways is affected at multiple levels in psychosis.

The fact that DSCAML1, SCG3 and PLA2R1 introns contained epiSNPs across all study groups

suggests that these loci are regularly destined for ASM – the differences in methylated-alleles

may indicate a malfunction in establishment of ASM that takes place in certain individuals as

part of a disease phenotype. These malfunctions could result in altered regulation of local genes,

perhaps at the transcriptional or splicing level. As psychosis is a complex disease and it is likely

that many different genotypes and epigenotypes can contribute to this diagnosis, these single


ASM changes are probably just one part of an intricate mosaic that leads to the overall psychosis

phenotype.

Figure 3.17. Total epiSNPs per cohort in brain

The total numbers of epiSNPs detected in the control, BD and SZ cohorts are depicted in A) a Venn diagram and B) a

chart (p < 2.2 × 10⁻¹⁶).


Table 3.6. Direction of methylation and gene information for common epiSNPs

Column 1: SNP 6.0 probe ID for investigated loci. Columns 2-4: allele associated with increased methylation for

control, BD and SZ cohorts. Column 5: closest gene within 50Kb. Column 6: OMIM title associated with closest

gene.

SNP Control BD SZ Closest gene MIM_title

rs16829083 A B B MYOM3

rs11207702 B B A NFIA Nuclear factor I/A

rs10920871 A B A NA

rs6664930 B A A SLC30A1 Solute carrier family 30, member 1 and 10

rs9428514 A B A WDR64

rs2362590 A B B SMYD3

rs7098116 A B B NA

rs12252906 A B B NA

rs17436486 B A A GALNTL4

rs684669 B B A DSCAML1 Down syndrome cell adhesion molecule-like 1

rs414161 B A A NA

rs10848091 B A B PIWIL1 Piwi, Drosophila, homolog of

rs176343 B A B NA

rs9573719 B A A NA

rs2269304 B A B SPTB

rs2622769 B A B SCG3 Sarcoglycan, gamma;Secretogranin III

rs4553646 B A A AHSP

rs4889048 B B A NA

rs1125244 A A B NA

rs12944274 B A A PPM1E

rs7342966 A B B RGS9 Regulator of G protein signaling 9 and RGPS9-binding protein

rs2377391 B B A RBFOX3

rs3792174 A B A PLA2R1 Phospholipase A2 receptor 1

rs17003416 B A B NA

rs11703071 A B B TFIP11 Tuftelin-interacting protein 11

rs12635765 A B B NA

rs901812 A A B NA

rs1853261 A B B NA

rs10034491 B B A NA

rs4478136 B A A NA

rs6876638 B B A NA

rs2546963 B A A PWWP2A

rs1379326 A B A CSMD1 Cub and Sushi multiple domains 1

rs4909472 B A B LOC286094

rs10975894 B A B KDM4C
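
The directionality comparison in Table 3.6 amounts to finding, for each common epiSNP, the cohort whose methylated allele differs from the other two. A minimal sketch follows, using four rows copied from the table; the helper name `discordant_cohort` is hypothetical, and the sketch assumes only two allele labels (A and B) occur per locus.

```python
# Illustrative sketch: rows copied from Table 3.6 (A/B = allele associated
# with increased methylation); the helper name is hypothetical.
from collections import Counter


def discordant_cohort(control, bd, sz):
    """Return the cohort whose methylated allele differs from the other two,
    or None when all three cohorts agree. Assumes two allele labels (A/B)."""
    calls = {"control": control, "BD": bd, "SZ": sz}
    tally = Counter(calls.values())
    if len(tally) == 1:
        return None
    minority_allele = min(tally, key=tally.get)
    return next(name for name, allele in calls.items()
                if allele == minority_allele)


# (SNP, control, BD, SZ) as listed in Table 3.6.
rows = [
    ("rs16829083", "A", "B", "B"),
    ("rs684669", "B", "B", "A"),
    ("rs2622769", "B", "A", "B"),
    ("rs3792174", "A", "B", "A"),
]

for snp, control, bd, sz in rows:
    print(snp, discordant_cohort(control, bd, sz))
```

For rs684669 (DSCAML1) the sketch returns SZ, and for rs2622769 (SCG3) and rs3792174 (PLA2R1) it returns BD, in agreement with the cohort differences described in the text.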


Differences aside, many of the epiSNPs common to all cohorts were methylated on the same

allele. Upon examination, many of these epiSNPs are associated with genes that are critical for

various developmental and functional features of the brain. An epiSNP (rs12251692) appears in

an intron of neuregulin 3 (NRG3), which is thought to influence neuroblast proliferation,

migration and differentiation. NRG3 has been identified as a susceptibility locus for SZ [249]

and, although ASM at this particular SNP does not vary between cohorts, three other NRG3

intronic epiSNPs (rs2144468, rs7100526, and rs10787027) exist exclusively in the SZ cohort –

these may represent causative or compensatory ASM, or perhaps they are regions that are less

tightly regulated, where there is room for fluctuation without serious detriment. There is also an

epiSNP (rs10519568) within an intron of the gamma-aminobutyric acid A receptor gamma-3

gene (GABRG3), for which the direction of ASM is consistent between cohorts. GABA is the major

inhibitory neurotransmitter in the mammalian brain and, although mice express GABRG3

biallelically, there is still some confusion surrounding the imprinting status of this gene in

humans [104], making the location of this epiSNP, as well as the two SZ-specific epiSNPs

(rs6422909 and rs8024723), particularly interesting. Another epiSNP lies within an intron of the

syntrophin gamma 1 (SNTG1, rs7016161) gene, which belongs to a family of cytoplasmic peripheral

membrane proteins. Syntrophins have multiple protein interaction domains that link signaling

proteins, such as kinases and neuronal nitric oxide synthase, to dystrophins, and they are highly

expressed in the brain, where they are required for signaling and trafficking [250]. Also among

the conserved, common epiSNPs are those associated with a number of kinases, cation channels,

voltage-gated channels, proteins related to zinc fingers, phospholipases, and phosphatases; all of

these proteins, while they function in an assortment of diverse cascades, are essential to the

organism and their conservation at the ASM level is understandable.

One peculiar discovery involved a synonymous epiSNP (rs4343) located within the coding

region of angiotensin I converting enzyme (ACE) that was common to all cohorts and displayed

ASM on the same allele. ACE is a circulating enzyme that catalyzes the conversion of

angiotensin I to angiotensin II as part of the renin-angiotensin system that regulates extracellular

volume and arterial vasoconstriction, and it also hydrolyses a peptide (N-AcSer-Asp-Lys-Pro)

that acts as a negative regulator of hematopoietic stem cell proliferation [251]. What makes this

epiSNP interesting is the fact that there is an extensively-studied insertion/deletion

polymorphism at this locus that is associated with a number of conditions, including psychosis.


It has been hypothesized that the deletion allele may be responsible for clustering of psychotic

symptoms in BD, whereas the insertion allele may be protective against psychosis [252]. In

contrast, another study found that the deletion allele was actually protective and reduced the risk

of developing SZ by 50% [253]. It has also been found that elevation of ACE in cerebrospinal

fluid correlates with the duration of illness in SZ, although it is not clear whether the increases

were the result of treatment or deterioration due to the disease [254]. Regardless of the

conflicting theories, it is apparent that dysregulation at this locus is involved in psychosis, either

as a causal or compensatory factor, or simply as a biomarker, and epigenetic studies of the locus

have been recommended [255]. In our study, the genotypes at this epiSNP were distributed

somewhat differently per cohort, although a trend proved difficult to measure statistically.

Future studies should further investigate the genotype and methylation levels of SZ subjects and

controls at this epiSNP and, if possible, attempt to correlate these data with ACE protein or

mRNA levels, as well as mRNA conformation.

When the epiSNPs unique to each cohort were examined, many of these loci did not have any

genes located within 50Kb of the SNP, as approximately 60% of ASM occurred in intergenic

regions; however, a number of cohort-specific epiSNPs immediately stood out. The control

cohort has a unique epiSNP (rs9497449) in an intron of the metabotropic glutamate receptor 1

(GRM1), a receptor at which deleterious mutations are detected in BD and SZ [256] – it should

be noted that the BD group has its own unique epiSNPs (rs7531813, rs12295113) in introns of

other glutamate receptors, the ionotropic, kainate 3 and 4 (GRIK3 and GRIK4). The BD group

has an epiSNP ~1.5Kb upstream from the gene encoding brain-specific angiogenesis inhibitor 2

(BAI2, rs7543090); BAI2 may play a role in depressed mood, as BAI2-deficient mice show

significant antidepressant-like behavior in a variety of tests versus wild-type mice [257]. In BD

and SZ, there are common epiSNPs (rs2820291, rs7102028) in the neuron navigator 1 and 2

(NAV1 and NAV2) genes that are not shared with the controls. Navigators act to reorganize the

cytoskeleton to guide cell shape changes and have a role in neurite outgrowth [258]; expression

of mRNAs associated with NAV1 has been shown to be reduced in SZ patients [259]. Another

gene that affects synapse development and function is neurexin 3 (NRXN3), which codes

neuronal adhesion proteins. Deletions at this locus are associated with autism spectrum disorder

[260], and there is an epiSNP (rs1022434) common to SZ and BD located within an intron of

this gene. Also, it is well known that epilepsy and psychosis share a complex relationship [261],


with epilepsy patients developing SZ at a higher rate than expected, and SZ patients being more

prone to seizures than the general population [262]. We detected two epiSNPs in the SZ cohort

(rs4738014 and rs7018199, the latter shared with BD) and one epiSNP common to all cohorts

(rs7002461) within introns of the carboxypeptidase A6 gene (CPA6), mutations in which result

in several forms of epilepsy [263], suggesting that ASM is a possible connection between

psychosis and seizures.

There are several intriguing epiSNPs that are unique to the SZ cohort; for example, rs1111050

lies in an intron of the gene that codes methylguanine-DNA methyltransferase (MGMT), an

enzyme that repairs inappropriately methylated guanine residues in DNA. ASM at SNPs

associated with MGMT has received quite a bit of attention in relation to colorectal cancer [264],

but those implicated epiSNPs are different from the one we have observed, most likely due to

tissue-specific ASM effects. An epiSNP (rs956451) exists in an intron of the gene encoding the

voltage-dependent L-type calcium channel alpha 1C subunit (CACNA1C) – a SZ risk gene that

has recently been shown to be a target of the miR-137 microRNA as part of a pathway that

contributes to disease progression [265]. There is an epiSNP (rs1705107) in an intron of the

reelin gene (RELN), which has famously been identified as a SZ risk factor. SNPs in the

promoter region of this gene have not been significantly associated with SZ [266], but perhaps

addition of the ASM element or consideration of intronic SNPs and epiSNPs would be more

fruitful in the search for factors predisposing to SZ. Two other epiSNPs (rs7205673 and

rs9937169) are located within the coding region of the polycystin 1-like 2 (PC1L2) gene

(PKD1L2), which interacts with G protein coupled receptors [267]; mutations of polycystins

result in polycystic kidney disease, although multiple mechanisms are thought to contribute to

its pathogenesis [268]. There has been evidence that polycystic kidney disease is significantly

elevated in SZ families [269] and, several decades ago, dialysis was suggested as a treatment for

SZ as it was believed that malfunctioning kidneys were responsible for circulation of waste

products, ultimately leading to hallucinations. This treatment was not validated in placebo-

controlled double-blind trials [270], indicating that any possible role of the kidneys in SZ was

not strong enough that the disease could be cured simply by improving kidney function. That

being said, this does not rule out the contribution of renal activity in a multifactorial model of

SZ, where smaller, cumulative alterations, such as SZ-specific ASM effects, may play a role in

the development of SZ in certain individuals. Another epiSNP (rs4894637) in an intron of the


neuroligin 1 (NLGN1) gene is also interesting, as the resulting protein is involved in the

pruning of low-efficiency synapses – a process that enhances neural network function in control

individuals, but can lead to large synapse loss in cases of SZ [271]. Neuroligins also bind

neurexins to promote the development and differentiation of glutamate synapses [272]; we have

found that epiSNPs are highly enriched in glutamate pathways (discussed further in this

chapter), illustrating the potential for interconnection of epiSNP functions. Finally, the SZ

cohort has several unique epiSNPs located in introns of cub and sushi multiple domains 1 and 3

(CSMD1 and CSMD3) on chromosome 8: eight in CSMD1 (rs1182739, rs1393845, rs11997565,

rs41350751, rs17069006, rs1566860, rs11136748, rs2725068), plus one that is shared with BD

(rs17063261) and one that is shared with BD and control (rs1379326), and four in CSMD3

(rs11778262, rs17640016, rs1857719, rs1382469), one of which is shared with the BD cohort.

The CSMD proteins are tumor suppressors that may also function as receptors or co-receptors in

signal transduction processes [273]; however, recent research has also identified SNPs within

CSMD1 and its homologue, CSMD2, to be significantly associated with SZ [274]. The large

number of SZ-unique epiSNPs occurring at these loci supports further investigation of their role

in affected individuals.

At the dopamine receptor D2 (DRD2) gene, control and BD groups both have a unique epiSNP

(rs7125415, rs7131056) in an intron, while SZ does not, although it has one at the 5’UTR of the

D1 receptor locus (DRD1, rs4532). As previously mentioned, the dopamine hypothesis of

psychosis has received the most scientific attention and, even after 50 years of drug discovery,

the majority of current treatments rely on dopamine D2 receptor blockade [275]. It is obvious

that this system has some merit as a factor in psychosis, and disease-specific ASM within the

receptors is one possible mechanism. All of these cohort-specific epiSNPs reveal interesting

connections between epigenetic factors and genetic elements, which act together to produce an

intricate disease profile that could vary between individuals.

Following the identification of total epiSNPs per cohort, we explored their associated functional

pathways. As many epiSNPs are located within introns, exons and UTRs, while others are in

the vicinity of genes, we assigned each epiSNP the GO category of the closest gene. The

distribution of GO categories per cohort is shown in Figure 3.18 and, predictably, the majority

of GO categories associated with brain epiSNPs were related to brain development, cellular

activity and brain function. Some GO categories encompass a very large number of SNPs, so it


is likely that epiSNPs will fall into these categories simply because they are so broad; for

example, there are 4316 SNPs included in the “multicellular organismal process” category.

Other categories contain considerably fewer SNPs, such as the “behavioural fear response”

category (12 SNPs), so it is much less likely that epiSNPs would receive those GO terms. Table

3.7 lists the top 5th percentile of significant GO categories that appeared most frequently per

cohort, while Table 3.8 lists the top 5th percentile of significant GO categories that were enriched

per cohort, i.e. appearing more frequently than expected, given the total number of SNPs in the

category; notably, there is no overlap between tables. Most epiSNPs occurred in categories

related to signaling and development, with all cohorts sharing the top four GO IDs, but differing

in the fifth position. All of these categories are very large, encompassing hundreds to thousands

of SNPs, and are related to broad, basic tissue functions, so it is not surprising that they are the

most frequently observed.
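
The distinction between frequency and enrichment can be made concrete with a small calculation. The sketch below assumes that enrichment is the ratio of the proportion of a category's SNPs that are epiSNPs to the proportion of all annotated SNPs that are epiSNPs, and it adds a one-sided hypergeometric tail probability as one plausible significance measure. Both the formula and the function names are assumptions made for illustration; only the category size of 12 SNPs comes from the text, and the background totals are hypothetical.

```python
# Illustrative sketch only: the background totals below are hypothetical.
from math import comb


def enrichment(k, K, n, N):
    """Observed/expected ratio: (k / K) / (n / N).

    k: epiSNPs in the category      K: all SNPs in the category
    n: epiSNPs overall              N: all annotated SNPs overall
    """
    return (k / K) / (n / N)


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n SNPs are drawn without replacement from N,
    of which K belong to the category (assumed significance measure)."""
    total = comb(N, n)
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / total


# A small category (12 SNPs, as for the fear response category in the text)
# with 5 epiSNPs, against a hypothetical background of 50 epiSNPs in 1000 SNPs.
score = enrichment(k=5, K=12, n=50, N=1000)
p_value = hypergeom_upper_tail(k=5, K=12, n=50, N=1000)
print(round(score, 3), p_value)
```

Under these assumptions a small category can reach a high enrichment score with only a handful of epiSNPs, whereas a very broad category accumulates many epiSNPs without being enriched, which is why the two rankings need not overlap.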

Figure 3.18. GO categories per cohort in brain

The GO categories associated with epiSNPs from the control, BD and SZ cohorts are depicted in a Venn diagram

(p<0.01).


Table 3.7. Top five GO categories per brain cohort

The GO categories containing the largest numbers of epiSNPs per study cohort in the brain, where FDR <0.02.

Cohort GO Term epiSNPs

Control GO:0032501_multicellular_organismal_process 264

GO:0023052_signaling 222

GO:0032502_developmental_process 205

GO:0007275_multicellular_organismal_development 192

GO:0051179_localization 190

SZ GO:0032501_multicellular_organismal_process 410




GO:0048856_anatomical_structure_development 282

BD GO:0032501_multicellular_organismal_process 216




GO:0023060_signal_transmission 138

Table 3.8. Top 5 enriched GO categories per brain cohort

The GO categories showing the highest levels of enrichment per study cohort in the brain, where FDR <0.02.

Enrichment levels refer to an over-representation of a GO category, and the number displayed in column four

represents the exact number of epiSNPs that were given that specific GO classification.

Cohort GO Term Enrichment epiSNPs

Control GO:0001662_behavioral_fear_response 8.993056 5

GO:0002209_behavioral_defense_response 8.301282 5

GO:0007172_signal_complex_assembly 8.301282 5

GO:0021955_central_nervous_system_neuron_axonogenesis 8.301282 5

GO:0001964_startle_response 7.708333 5

SZ GO:0045634_regulation_of_melanocyte_differentiation 8.431333 4

GO:0050932_regulation_of_pigment_cell_differentiation 8.431333 4

GO:0001662_behavioral_fear_response 6.323499 6

GO:0002209_behavioral_defense_response 5.837076 6

GO:0001964_startle_response 5.420142 6

BD GO:0051552_flavone_metabolic_process 25.527721 5

GO:0009812_flavonoid_metabolic_process 19.145791 6

GO:0009698_phenylpropanoid_metabolic_process 17.01848 6

GO:0032469_endoplasmic_reticulum_calcium_ion_homeostasis 17.01848 4

GO:0051967_negative_regulation_of_synaptic_transmission__glutamatergic 12.76386 3

GoMiner was used to determine the enrichment levels of each GO category per cohort. When

these enriched categories are examined, the profiles begin to diverge, especially in the case of

BD, where the top categories were highly enriched and none overlapped with the other

cohorts. The top three bipolar categories are related to flavonoids, as flavones are a class of

flavonoids, and phenylpropanoids are flavonoid precursors. At first glance, these categories

seem to be completely out of place but, in rats, the flavonoid quercetin and its glycoside rutin

have been found to exert antidepressive and neuroprotective effects, possibly due to inhibition

of monoamine oxidase [276]. Flavones are emerging as potential treatment options; in several

human cancer cell lines, flavones restored the function of a tumor suppressor gene that was

silenced via hypermethylation [277], although their exact activity remains unknown and this

particular study did not investigate any potential demethylation actions. An intriguing new

finding is that the flavonoid prodrug, baicalin, has the ability to cross the blood-brain barrier and

inhibit prolyl oligopeptidase, which is a cytosolic serine peptidase that is associated with BD,

SZ and other neuropsychiatric disorders. Baicalin is a natural compound derived from

Scutellaria baicalensis root extracts that has been safely administered to humans for many

years, and has potential as a new therapeutic option for psychosis and related conditions [278].

Altogether, it appears that flavonoid pathways are a new target that requires further exploration

with regard to psychosis, particularly BD, as recent literature is only beginning to touch on this

subject. The numerous flavonoid-related epiSNPs may represent a dysregulation of these

pathways, although this requires more in-depth investigation. The significance of the

endoplasmic reticulum- and glutamate-related BD enriched GO categories will be discussed

further in the thesis.

The top three enriched categories in the control cohort all deal with fear and behavioural

responses, and these were also present among the most enriched SZ categories. The genes

related to the epiSNPs were very similar between groups, with one major difference: the DRD1

gene appears in all categories in the SZ cohort, but does not appear in the control cohort;

however, the DRD2 gene appears in the control “startle response” pathway. While it is not clear


why fear-related categories would be so enriched in the control group, it has been shown that

prepulse-inhibition of startle is linked to glutamatergic pathways [279], and we have found that

these pathways are also largely affected by ASM; perhaps the regulation of these pathways is

somewhat dependent on the regulation of the glutamatergic ones, as glutamate receptor genes do

make up a significant portion of the fear-related GO categories. Still, lack of prepulse-inhibition

is a characteristic of SZ [280], so we should not be too quick to dismiss the enrichment of these

categories as circumstantial.

Outside of the fear pathway-related categories, the other control categories in the top five

seemed quite logical, dealing with signal complex assembly and CNS axonogenesis – processes

required for normal brain activity. In contrast, the top two categories in the SZ cohort seem

rather unintuitive, as they are related to differentiation of melanocytes. The same four genes

appeared in both categories: BCL-2, GNA11, ADAMTS9 and ADAMTS20. The BCL-2 protein is

an apoptotic regulator (anti-apoptotic protein) that has been implicated in both melanoma [281]

and SZ, showing decreased levels in the cortex of SZ subjects, leaving cortical cells vulnerable

to apoptosis [282]; current literature does not indicate a known connection between GNA11 or

the ADAMTS family of proteins and SZ. This is not the first association of melanocyte-related

pathways and SZ, as one group has found that two combined complex genotypes involving

melanotropin showed a stronger association with SZ than any single locus [283]. Some

evidence has been presented that SZ is protective against melanoma, but this has yet to be

independently verified [284]. Again, like the connection emerging between BD and flavonoid

pathways, the relationship between SZ and melanocyte pathways provides an interesting, yet

ambiguous, target for future research.

Interestingly, two general themes that appeared in our GO analysis involved pathways that have

been proposed to play a role in the etiopathogenesis of psychosis. A number of epiSNPs fell

into GO categories related to the neurotransmitter, glutamate, such as GO:0007215 (glutamate

signaling pathway) in all cohorts, GO:0051967 (negative regulation of synaptic transmission

glutamatergic) in BD, and GO:0035249 (synaptic transmission glutamatergic) in control and SZ;

over the last several years, the glutamatergic theory of psychosis has been gaining ground on the

popular dopamine theory. We also noted the presence of two insulin-related categories in the

SZ cohort alone: GO:0030073 (insulin secretion) and GO:0050796 (regulation of insulin

secretion). The majority of the genes associated with these epiSNPs are expressed in the human


brain – only one (RFX6) was expressed elsewhere. This discovery is particularly interesting, as a

link between SZ and insulin was proposed in the 1930s, but fell out of favour for decades until,

just recently, a collection of studies presented new evidence in support of this theory [244, 285,

286]. The GO categories are presented in Table 3.9, and the connection between psychosis,

glutamate and insulin will be addressed in detail in the “Genetic-epigenetic interplay in complex

disease” section of the thesis discussion.

Table 3.9. Glutamate- and insulin-related GO categories per brain cohort

GO categories related to glutamate and insulin pathways per study cohort in the brain, where FDR <0.05. The

number of epiSNPs within each category is displayed.

Cohort GO Term epiSNPs

Control GO:0035249_synaptic_transmission__glutamatergic 8

GO:0007215_glutamate_signaling_pathway 6

SZ GO:0007215_glutamate_signaling_pathway 7

GO:0035249_synaptic_transmission__glutamatergic 8

GO:0030073_insulin_secretion 19

GO:0050796_regulation_of_insulin_secretion 17

BD GO:0051967_negative_regulation_of_synaptic_transmission__glutamatergic 3

GO:0007215_glutamate_signaling_pathway 8

Various mitochondrial malfunctions have also been connected to the development of psychosis,

and our study uncovered several pieces of evidence that support this theory. Each brain cohort

had a unique profile of epiSNPs occurring within genes related to mitochondrial function

(summarized in Table 3.10). It should be noted that the SNP 6.0 array does cover SNPs in the

mitochondrial genome, however, we did not detect any epiSNPs of that variety, and the genes

discussed here are mentioned because they interact with the mitochondria. This does not

guarantee that ASM does not occur in mitochondrial DNA, as emerging evidence suggests that

epigenetic modification of mitochondrial DNA is possible [287, 288].

Minimal overlap of epiSNPs was observed and, in all cases, the common epiSNPs were

methylated on the same allele. Only one mitochondria-related epiSNP was common to all three

cohorts, and it was within an intron of a carrier that transports oxodicarboxylates across the

inner mitochondrial membrane; epiSNPs occurring in a peptidase and a variety of mitochondrial

ribosomal proteins were also shared between combinations of cohorts. Most of the unique

epiSNPs belonged to the SZ cohort and the associated genes mainly consisted of assembly

factors for mitochondrial complexes and solute carriers. The different profiles are evidence of

regulatory alterations taking place in the case cohorts versus the control and, as previously

mentioned, much of the ASM is present in intronic regions or UTRs – sequences that may affect

splicing or translation of the mRNA. In their epigenomic profiling study, Mill et al also detected

some epigenetic dysregulation occurring in mitochondrial pathways in cases of psychosis [32].

We also discovered a SNP in an intron of ERMP1 (rs4612399) that was common to all cohorts

in brain and in sperm, as well as a significant enrichment of an endoplasmic reticulum-related

GO category in the BD cohort, plus another enriched (enrichment score = 4.94) BD GO

category, GO:0034976: response to endoplasmic reticulum stress. It has been hypothesized that

BD and SZ can result from impaired brain energy metabolism marked by abnormal glucose

metabolism and mitochondrial dysfunction, and that detachment of enzymes from mitochondrial

membranes may lead to these diseases by increasing oxidative stress and limiting brain growth

and development [289]. Some of the affected genes in our ER-related GO categories (highly

enriched in BD) interact with mitochondrial membranes and proteins, for example, APP

produces the precursor to the amyloid beta (A-beta) protein, which is known to form deposits in

the brains of individuals with Alzheimer’s disease [290], and it also possesses a mitochondrial

targeting sequence allowing both APP and A-beta to accumulate inside the organelles and

disrupt their activities [291]. The general control nonderepressible-2 (GCN2) kinase is also related

to mitochondria, as its activation can mark changes in translational control of mitochondrial

proteins, leading to organelle depression [292]. This is further evidence that ASM-mediated

disturbances in mitochondrial interactions could play a role in this particular theory of

psychosis.

One final interesting mitochondrial connection was our discovery of many different epiSNPs in

all cohorts and in both DNA sources that were located in introns of the inositol 1,4,5-trisphosphate

3-kinase B gene (ITPKB), which encodes the enzyme that controls inositol 1,4,5-trisphosphate (IP3) levels via

phosphorylation to IP4 [293]. The reduction of intracellular IP3 levels stimulates autophagy,

whereas increases in IP3 inhibit it. The IP3 receptor (IP3R) is located in the ER membranes

and in ER-mitochondrial contact sites, and blockade of this receptor leads to autophagy of both

ER and mitochondria [294]. Cohort-specific epiSNPs may bestow varying IP3 kinase B activity

upon each group, resulting in alterations to IP3 metabolism and, ultimately, dysfunctions of ER

and mitochondrial activities or autophagy – events that are thought to contribute to psychiatric

diseases.

Table 3.10. Summary of mitochondria-related epiSNPs per brain cohort

EpiSNPs occurring within genes are listed with the gene name, dbSNP functional class tag, full name of the gene,

and cohorts in which the epiSNP appears.

EpiSNP      Gene          dbSNP class  Name                                                                          Cohort
rs2181411   ATPAF1        UTR-3        ATP synthase mitochondrial F1 complex assembly factor 1                       Ctrl
rs654509    ATPAF1        intron       ATP synthase mitochondrial F1 complex assembly factor 1                       SZ
rs1025806   ATPAF1-AS1    cds-ref      ATP synthase mitochondrial F1 complex assembly factor 1                       SZ
rs2109862   IMMP2L        intron       IMP2 inner mitochondrial membrane peptidase-like (S. cerevisiae)              SZ
rs13025568  MFF           intron       mitochondrial fission factor                                                  SZ
rs11551114  MIPEP         missense     mitochondrial intermediate peptidase                                          Ctrl, BD
rs286633    MRPL22        intron       mitochondrial ribosomal protein L22                                           BD, SZ
rs3785489   MRPS23        UTR-3        mitochondrial ribosomal protein S23                                           Ctrl, SZ
rs11760722  LOC100507421  intron       mitochondrial ribosomal protein S33                                           BD, SZ
rs10058726  NDUFAF2       intron       NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, assembly factor 2         SZ
rs11967279  FARS2         intron       phenylalanyl-tRNA synthetase 2, mitochondrial                                 SZ
rs10134334  SLC25A21      intron       solute carrier family 25 (mitochondrial oxodicarboxylate carrier), member 21  Ctrl, BD, SZ
rs17105237  SLC25A21      intron       solute carrier family 25 (mitochondrial oxodicarboxylate carrier), member 21  BD
rs17105036  SLC25A21      intron       solute carrier family 25 (mitochondrial oxodicarboxylate carrier), member 21  SZ
rs3772197   SLC25A26      UTR-5        solute carrier family 25 (mitochondrial carrier, phosphate carrier)           SZ

Tissue specificity of ASM

In addition to the post-mortem brain prefrontal cortex samples, we also attempted to identify

epiSNPs in sperm DNA from controls and BD subjects (n=24 each), using the same

experimental design and analysis employed on the brain samples. We detected a number of

epiSNPs and there was overlap between the control and BD cohorts (Fig. 3.19); however, there

were considerably fewer than were found in the brain. This is most likely due to the lower

sample size and power to detect, but it may also reflect tissue-specific differences. In the brain

sample set, BD and control cohorts had approximately equal numbers of epiSNPs, but in the

sperm sample set, the control epiSNPs were more than double the number identified in the BD

cohort. As sample sizes were equal between sperm cohorts, this variation is not due to

differences in detection power, although its meaning is difficult to interpret.

Of the 27 epiSNPs that overlapped the control and BD groups in the sperm sample set, only one

differed in the directionality of ASM, and it was located in an intergenic region with no known

gene association within 50Kb. There were 13 genes associated with the sperm epiSNPs common

to control and BD, and they were usually intronic, except for one that was exonic and coded a

missense mutation in the proteasome subunit beta type-4 gene. One of these common epiSNPs

(rs3754387) was found in an intron of the inositol 1,4,5-trisphosphate 3-kinase B gene (ITPKB),

another (rs979278) was located in an intron of the endoplasmic reticulum metallopeptidase 1

gene (ERMP1), and a third (rs851823) was in an intron of the contactin-associated protein-like 2 gene

(CNTNAP2) – the recurrence of these genes in brain and sperm will be addressed shortly. Other

associated genes code various proteins, including carbohydrate sulfotransferase 11, a

hypothetical protein (KIAA1609), protein bicaudal D homolog 1, a centrosomal protein, and

others with no particular link to BD.
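
Whether an overlap of 27 epiSNPs between two cohorts exceeds chance can be gauged with a hypergeometric test. The sketch below is illustrative only: the test actually used for the reported p-values is not stated here, the cohort totals (246 control and 84 BD epiSNPs) are inferred from the intergenic counts and percentages given in the functional class analysis later in this section, and treating all 906,600 array SNPs as the background is an assumption.

```python
from fractions import Fraction
from math import comb

def overlap_p_value(background, size_a, size_b, overlap):
    """Upper-tail hypergeometric probability of sharing >= `overlap` items."""
    total = comb(background, size_b)
    upper = min(size_a, size_b)
    tail = sum(comb(size_a, k) * comb(background - size_a, size_b - k)
               for k in range(overlap, upper + 1))
    # Exact rational arithmetic avoids overflow with very large binomials.
    return float(Fraction(tail, total))

ARRAY_SNPS = 906_600      # background set (assumption: every array SNP)
CONTROL_EPISNPS = 246     # inferred: 147 intergenic = 59.76% of the total
BD_EPISNPS = 84           # inferred: 50 intergenic = 59.52% of the total
SHARED = 27               # epiSNPs common to both sperm cohorts

p = overlap_p_value(ARRAY_SNPS, CONTROL_EPISNPS, BD_EPISNPS, SHARED)
print(f"P(overlap >= {SHARED}) = {p:.3g}")
```

Under these assumptions the expected chance overlap is far below one SNP, so any such test would return a very small probability; the exact value depends on the background chosen.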

Figure 3.19. EpiSNPs detected in sperm DNA

The total numbers of epiSNPs detected in the control and BD cohorts are depicted in a Venn diagram (p<2.2x10^-16).

There is a moderate amount of epiSNP overlap between sperm and brain cohorts, as illustrated

in Figure 3.20. The variation is likely the result of tissue-specific ASM effects, as the

overlapping epiSNPs are not obviously related to brain or sperm function and could be common

to any number of tissues. It is true that epiSNPs, in general, seem to occur in intergenic regions

and do not appear to be associated with genes whose functions are obvious – further pathway

investigation is required to determine the functional relationships of the associated genes.

The small sample size prevented us from running a GO analysis for the sperm epiSNPs. When

brain cases and sperm cases were compared, only one out of 41 epiSNPs showed a difference in

ASM directionality, and it was located in an intergenic region with no known gene association

within 50Kb. Among the overlapping case epiSNPs, one (rs4971695) occurred in an intron of a

gene that was obviously related to brain function (neurexin 1, NRXN1), and another

(rs17067095) occurred in an exon of a gene that was obviously related to sperm function

(spermatid associated, SPERT); in general, the epiSNPs that were identified as “overlapping”

any brain and sperm cohorts tended not to be associated with brain- or sperm-specific genes, i.e.

their functions seemed to be more generally applicable.
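
The overlap and directionality comparison described above can be sketched as follows. The rsIDs, allele labels and data layout are hypothetical; only the logic (intersect the two epiSNP sets, then check whether the same allele carries the methylation) follows the text.

```python
def compare_directionality(cohort_a, cohort_b):
    """Return (shared rsIDs, shared rsIDs whose methylated allele differs).

    Each cohort maps an rsID to the allele associated with higher
    methylation ("A" or "B").
    """
    shared = sorted(set(cohort_a) & set(cohort_b))
    discordant = [rs for rs in shared if cohort_a[rs] != cohort_b[rs]]
    return shared, discordant

# Hypothetical example data (not taken from the thesis).
brain_cases = {"rs0001": "A", "rs0002": "B", "rs0003": "A", "rs0004": "B"}
sperm_cases = {"rs0002": "B", "rs0003": "B", "rs0005": "A"}

shared, discordant = compare_directionality(brain_cases, sperm_cases)
print(len(shared), "shared;", len(discordant), "with opposite directionality")
```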

There were 57 epiSNPs that appeared in both brain and sperm control cohorts, and all but one

displayed the same ASM directionality; as in the BD/control overlap, the one that differed

between brain and sperm was located in an intergenic region with no known gene association

within 50Kb. The shared epiSNPs were either intergenic or intronic, and they were not

significantly closer to MSRE sites than any other SNPs, meaning that they were not likely to

simply result from LD. They also had very few MIM titles associated with them, with the

exception of “inositol 1,4,5-trisphosphate 3-kinase B,” “endoplasmic reticulum metallopeptidase

1” and “partitioning-defective protein 3, C. elegans, homolog of.” Oddly, epiSNPs associated

with the “inositol 1,4,5-trisphosphate 3-kinase B” and “endoplasmic reticulum metallopeptidase

1” MIM titles were present in every cohort in brain and sperm, and the “contactin-associated

protein-like 2” title was present in all cohorts except for BD in brain, although different epiSNPs

appeared in this group; one SNP (rs3754387) in an intron of ITPKB was consistently present, as

was another (rs4612399) in an intron of endoplasmic reticulum (ER) metallopeptidase 1

(ERMP1). Mitochondrial ER dysfunction has been linked to bipolar disorder [295] and

valproate, a drug often prescribed for BD and epilepsy, has been shown to protect against ER

stress-induced apoptosis in rats [296]. Additionally, the inositol 1,4,5-trisphosphate receptor

calcium channel interacts with the Bcl-2 apoptosis-inhibiting protein to regulate calcium release

from the ER, and BD patients with the AA genotype at a Bcl-2 gene SNP showed reduced Bcl-2

mRNA and protein levels [297]. This particular SNP, rs956572, did not show ASM, and the

epiSNP related to the ER pathway did not differ in ASM directionality between cohorts or DNA

sources, but it is still interesting to note that, out of the nearly 1 million SNPs on the array, this

collection of SNPs was among the few thousand that were identified as epiSNPs, and a GO

category related to the ER was present in the top five enriched categories in the BD cohort.

Perhaps in a different population, for example, patients who have a long history of valproate

use, some differences in ASM may be observed.

In the case of the CNTNAP2 epiSNPs, one was conserved between BD and control groups in

sperm (rs851823), while a unique one was found in brain controls (rs17434745), and four

unique ones (rs4726833, rs10233374, rs41481753, rs12673933) were found in brain SZ cases,

all within introns of the gene. It is interesting to note that, while a CNTNAP2 epiSNP was

common to control and BD in sperm, there was an absence of similar epiSNPs in the BD cohort

in brain. The CNTNAP2 protein functions in neuronal cell adhesion, synaptic formation and cell

signaling as part of the cell adhesion molecule (CAM) pathway, which has been associated with

SZ and BD, as well as being previously implicated in specific language disorder and autism

[298]. A recent meta-analysis has identified variants within this gene as being significantly

associated with SZ and BD [299], although the detected SNPs were not epiSNPs. In sperm,

there is no difference in the CNTNAP2-associated epiSNP or directionality of ASM. In the

brain, the target tissue for this gene, cohorts differ considerably with an absence of ASM at this

locus in BD, an excess in SZ, and a single, different epiSNP in controls. The meaning of this

finding is not clear, but it supports the theory that brain-specific regulation of this gene can be

altered in cases of psychosis.

Again, the smaller sample size in the sperm study was a limitation that likely contributed to the

lower number of identified epiSNPs, overlaps between cohorts and overlaps between sample

types. A previous study by Schalkwyk et al listed the 21 strongest examples of ASM

detected in blood DNA [21], none of which were common to our brain epiSNPs. Once again,

this may be the result of tissue-specific effects, or perhaps it is due to their different analysis

strategy and enrichment protocol; they filtered out most of the SNPs on the SNP 6.0 array

before analyzing and used a cocktail of three enzymes (HpaII, HhaI and AciI) to produce a

variety of smaller fragments that differed from ours, likely interrogating some different SNPs. It

should be noted that Schalkwyk et al detected tissue-specific ASM effects between their own

blood and buccal samples, and recent research suggests that tissue-specificity of gene expression

is common, with one study estimating that 69–80% of regulatory variants are cell type-specific

[300]. Using whole-genome bisulfite sequencing, Li et al mapped the methylome from the

blood of one individual, and found tissue-specific DMRs at 240,856 regions when they

compared blood and lung tissue [301]. Taking the above information into account and, given

that methylation profiles are highly variable between tissues [29], we suspect that our evidence

for tissue-specific ASM effects is valid, despite the sample size limitations.

Figure 3.20. Overlapping epiSNPs between brain and sperm

The numbers of overlapping epiSNPs between control, total case, and BD cohorts in brain and sperm are depicted in

Venn diagrams (p<2.2x10^-16).

As in the brain samples, we investigated the chromosomal distribution and functional class

profiles of the epiSNPs detected in sperm. In both sperm cohorts, the epiSNP distribution was

uniform across the chromosomes, with no significant difference in proportion for any chromosome

(Table 3.3 and Fig 3.21). The absence of BD epiSNPs on chromosomes 4, 11 and 20, and of

control or BD epiSNPs on Y, did not represent significant depletions, because so few epiSNPs

were identified in sperm overall. It is quite possible that the varying distribution profiles

reflect different distributions of tissue-specific genes that would be susceptible to ASM, while it

is equally possible that the flat epiSNP landscape in sperm DNA could indicate that ASM is

somewhat random or less influential in these cells, as it does not tend to localize to any

individual chromosome. The smaller sample size and lack of a SZ cohort in sperm are other

factors that should be considered, as we noticed that more enrichments and depletions became

significant when we pooled the BD and SZ cohorts in brain.
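
Because each chromosome is tested separately, significance is reported as q-values, which implies a multiple-testing correction. As a minimal sketch, assuming a Benjamini-Hochberg adjustment (the procedure is not named in this section) and using invented per-chromosome p-values:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), input order kept."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Hypothetical per-chromosome p-values (illustrative only).
p_by_chromosome = {"chr1": 0.001, "chr2": 0.04, "chr3": 0.03, "chrX": 0.8}
q = benjamini_hochberg(list(p_by_chromosome.values()))
for chrom, q_val in zip(p_by_chromosome, q):
    print(chrom, round(q_val, 4), "significant" if q_val < 0.05 else "n.s.")
```

With few epiSNPs per chromosome, as in sperm, raw p-values are rarely small enough to survive this kind of adjustment, which is consistent with the absence of significant enrichments or depletions.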

In the functional class analysis (Figure 3.22), the sperm profile was similar to that observed in

brain, in that 147 (59.76%) of control and 50 (59.52%) of BD epiSNPs were located in

intergenic regions; however, the two tissues did differ in some areas. There were no significant

enrichments or depletions observed in the control cohort (p>0.208 and 0.131, respectively),

whereas the brain control cohort epiSNPs were enriched in UTRs. The sperm BD cohort

differed from the brain profile, as the exon class was significantly enriched (p=0.046), in

comparison to the UTR enrichment observed in BD brain (p=0.011); enrichment of exons

contrasts sharply with the general depletion of this class that occurred in the brain. It should be noted

that, although there were no BD epiSNPs in the UTR class, it was not a significant depletion,

once again due to the small number of identified epiSNPs in sperm and the relatively few UTR

SNPs on the SNP 6.0 array (a factor that was considered when enrichment was calculated). As

we have already observed the presence of tissue-specific ASM effects between brain and sperm

samples (in relation to the MIM titles of associated genes), it is plausible that the overall

activities of epiSNPs may also differ between them. In both sample types, there was a large

number of intronic and intergenic epiSNPs, but this was due to the fact that most SNPs occur in

these classes, and no enrichments or depletions were documented.

Figure 3.21. Chromosomal distribution of sperm epiSNPs

The distribution of sperm epiSNPs per cohort across all chromosomes, where the total number of detected epiSNPs

is displayed as a proportion of the total number of SNPs occurring on each chromosome. No chromosome was

significantly enriched (q<0.05).

Figure 3.22. Functional class distribution of sperm epiSNPs

Sperm epiSNPs were stratified by their dbSNP functional class tags. Coding SNPs occur in exons of genes, where

one variant introduces either a non-synonymous or synonymous change; Intron SNPs are located within intronic

regions of genes; Locus SNPs are located 2Kb upstream or 0.5Kb downstream from a gene; and UTR SNPs occur

in either 3’ or 5’ untranslated regions. The number of epiSNPs per cohort in each class is given as a proportion of

the total number of genomic SNPs in that class. Red asterisks mark classes that are enriched per cohort (p<0.05).

Sensitivity Analysis

As the number of detected epiSNPs decreased when the datasets were stratified into control, SZ

and BD groups, a sensitivity analysis that exhaustively interrogated all possible sample sizes

was conducted to determine the effect size for biologically meaningful results, and to ensure that

our method and sample size were sufficient to detect all significant epiSNPs. The results of this

analysis are summarized in Figure 3.23. We randomly sampled our 5 different cohorts (all

samples, control, all cases, BD and SZ) 5 times each at every sample size, and ran our PWL

analysis for every possible set that resulted from the sampling. We then used a non-linear least

squares regression to extrapolate the findings for each cohort to one million samples, and saw

our number of detectable epiSNPs rapidly plateau. This indicates that our sample sizes were

sufficient to detect the majority of significant epiSNPs.
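
The subsample-and-extrapolate procedure can be sketched as below. This is a simplified illustration, not the analysis code used in this study: `count_episnps` is a hypothetical stand-in for the PWL analysis, the saturating curve y = a*n/(b + n) is an assumed model form, and the fit is a plain grid search over b with the closed-form least-squares value of a, rather than a dedicated non-linear least-squares routine.

```python
import random

def count_episnps(samples):
    """Hypothetical stand-in for the PWL analysis: a toy saturating count."""
    n = len(samples)
    return 2000 * n / (5 + n)

def subsample_counts(samples, repeats=5, seed=0):
    """Draw `repeats` random subsets at every size and record the epiSNP count."""
    rng = random.Random(seed)
    points = []
    for size in range(2, len(samples) + 1):
        for _ in range(repeats):
            subset = rng.sample(samples, size)
            points.append((size, count_episnps(subset)))
    return points

def fit_saturation(points, b_grid):
    """Least-squares fit of y = a * n / (b + n); a is solved exactly for each b."""
    best = None
    for b in b_grid:
        x = [n / (b + n) for n, _ in points]
        a = sum(xi * y for xi, (_, y) in zip(x, points)) / sum(xi * xi for xi in x)
        sse = sum((y - a * xi) ** 2 for xi, (_, y) in zip(x, points))
        if best is None or sse < best[0]:
            best = (sse, a, b)
    return best[1], best[2]

samples = list(range(40))                      # 40 hypothetical subjects
points = subsample_counts(samples)
a, b = fit_saturation(points, [i / 10 for i in range(1, 201)])
predicted = a * 1_000_000 / (b + 1_000_000)    # extrapolate to one million samples
print(f"plateau ~ {a:.0f} epiSNPs; predicted at n = 1e6: {predicted:.0f}")
```

In this model the parameter a is the plateau, so a prediction at one million samples that is barely larger than the count at the real sample size indicates that few additional epiSNPs remain to be found.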

Figure 3.23. Sensitivity analysis

The polygons represent the results of random sampling and recalculation of the total number of brain epiSNPs

identified in each cohort (control+case, control, all cases, BD and SZ). The dotted lines represent the extrapolation

of each data set to include one million samples. In all cohorts, the total number of detectable epiSNPs plateaus

almost immediately.

The epiSNPs considered in the sensitivity analysis were all significant, but ranged from very

weak to very strong associations between one allele and methylation. When we stratify this

same analysis by association strength, it becomes apparent that the only epiSNPs we may miss

are the ones with the weaker associations (Fig 3.24). Increasing the sample size permits the

detection of more epiSNPs, up to the point of plateau, but their associations between allelic

variants and methylation are weaker.

Figure 3.24. Sensitivity analysis stratified by strength of associations

The sensitivity results separated by their PWL slope values. The slope can be interpreted as the strength of the

association between genotype and methylation levels, where a high slope value (≥3) is a strong association. Slope

values are absolute, i.e. -3 and +3 are binned in the same group. As the slope value increases, the curves taper off.

Extrapolating from the asymptote for our collective brain DNA sample set, we determined that

~2.5% of the 906,600 SNPs on the Affymetrix SNP 6.0 array display ASM. This prediction is

the most comparable to those presented by other groups, as our combined sample size was quite

large (n=208); however, the estimate is slightly conservative, as we have just demonstrated that

a small fraction of the weakest epiSNPs can be missed by our detection and analysis methods.

Compared to the estimates presented by other groups, ours is somewhat intermediate, leaning

towards the lower values. On the lower end, Kerkel et al estimated that at least 0.16% of SNPs

show ASM [19], and Schalkwyk et al suggested a value of 1.5% [21]. Higher estimates included:

8.1% by Gertz et al [213], 10% by Zhang et al [158], 10% by Hellman and Chess [159], and

23–37% by Shoemaker et al [160]. The variable estimates are likely the result of many factors,

such as sample size, detection method, DNA source and the use of simulation studies; the

impact of these factors will be scrutinized in the General Discussion and Conclusions section.
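
For a sense of scale, the percentage estimates above can be converted into SNP counts on the array. The sketch below simply multiplies each quoted fraction by the 906,600 array SNPs; applying the other groups' estimates to this array is an illustrative assumption, since those studies interrogated different SNP sets.

```python
ARRAY_SNPS = 906_600

# Fractions of SNPs showing ASM, as quoted in the text (ranges as tuples).
estimates = {
    "Kerkel et al (lower bound)": (0.0016,),
    "Schalkwyk et al": (0.015,),
    "this study": (0.025,),
    "Gertz et al": (0.081,),
    "Zhang et al": (0.10,),
    "Hellman and Chess": (0.10,),
    "Shoemaker et al": (0.23, 0.37),
}

def expected_counts(fractions, total=ARRAY_SNPS):
    """Convert fractional ASM estimates into rounded SNP counts."""
    return tuple(round(f * total) for f in fractions)

for source, fractions in estimates.items():
    counts = expected_counts(fractions)
    print(f"{source}: " + " to ".join(f"{c:,}" for c in counts))
```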

Low level ASM across the genome

We detected a large number of epiSNPs using the microarray approach and PWL analysis, but

there are two main limitations to this strategy. First of all, microarrays are not as sensitive as

deep sequencing technologies and, as we have already addressed in the sensitivity analysis,

epiSNPs demonstrating lower level associations are likely overlooked by this method.

Secondly, the PWL analysis is a two-step regression model that requires at least two of the three

possible genotypes (AA, AB and BB) to be present in order to detect the methylation intensity

slopes from AA to AB and from AB to BB; as a result, SNPs that have rare genotypes may be

immediately excluded from our analysis if the study population does not contain the required

combinations. In order to examine the methylation association of SNPs in greater detail, we

conducted an experiment using the 454 deep sequencing platform, which allows us to use

single-base resolution to search for associations that may have been missed by the microarrays.

Our experimental design is depicted in Figure 3.25.
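
To make the genotype requirement of the microarray analysis concrete, here is a minimal sketch of a piecewise-linear summary of methylation intensity by genotype. It is an illustration under stated assumptions rather than the PWL implementation used in this study: each segment slope is taken as the difference between adjacent genotype means (which is what an unconstrained least-squares fit with a breakpoint at AB reduces to), and the intensity values are invented.

```python
from statistics import mean

def pwl_slopes(intensities_by_genotype):
    """Return (slope AA->AB, slope AB->BB); None where a genotype is missing."""
    means = {g: mean(v) for g, v in intensities_by_genotype.items() if v}

    def slope(left, right):
        if left in means and right in means:
            return means[right] - means[left]
        return None  # segment cannot be estimated without both genotypes

    return slope("AA", "AB"), slope("AB", "BB")

# Hypothetical methylation intensities (arbitrary units) for one SNP.
snp = {"AA": [1.0, 1.2, 0.8], "AB": [2.1, 1.9], "BB": [3.0, 3.2, 2.8]}
print(pwl_slopes(snp))

# A SNP with no BB carriers: only the AA->AB segment can be assessed.
rare = {"AA": [1.0, 1.1], "AB": [1.6], "BB": []}
print(pwl_slopes(rare))
```

A SNP for which only one genotype is observed yields no slope at all, which is why rare variants drop out of an analysis of this kind.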

Figure 3.25. Deep sequencing workflow

Preparation of fragments for 454 deep sequencing. Genomic DNA is treated with sodium bisulfite, which converts

all unmethylated cytosines to uracils (read as thymines after PCR). Next, 400bp sequences surrounding the target SNPs are amplified using

PCR. The amplicons are gel-extracted and purified, and then 4 pools are created using equal amounts of each

amplicon. These pools are applied to a 4-gasket 454 plate (as shown above) and sequenced.

We selected 11 SNPs that were not found to show ASM in the microarray study and

bisulfite-modified the genomic DNA for 40 subjects per SNP, thereby converting all unmethylated

cytosines to uracils, which are read as thymines after amplification. The 40 subjects were divided into four groups of alternate homozygotes:

case AA, case BB, control AA and control BB. In some cases, we did not have 10 samples of a

particular genotype available per group – in those instances, additional samples from the

alternate homozygote group for the same disease status were added in an effort to include 20

samples per disease status (case or control; sample information is listed in Appendix 2, Table A2.6). Homozygous

samples were selected, as they would provide the greatest potential difference in methylation

levels, should an ASM effect be present. We then amplified an area surrounding the SNP,

including as many CpG sites as possible, purified our amplicons and sequenced them using the

454 next generation sequencing instrument. We generated 422,114 forward sequence reads and

compared the number of unconverted cytosines at each CpG site between groups per SNP. Our

analysis focused on 4 main questions:

1. Is the association between methylation and genotype in the case group the same as the association

between methylation and genotype in the control group?

2. Is the association between methylation and genotype AA in the case group the same as the

association between methylation and genotype BB in the case group?

3. Is the association between methylation and genotype AA in the control group the same as the

association between methylation and genotype BB in the control group?

4. Is the association between methylation and the case group the same as the association between

methylation and the control group?
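
Each of these comparisons amounts to asking whether the proportion of unconverted cytosines differs between two groups of reads. A minimal sketch follows, assuming a pooled two-proportion z-test (the statistical test used is not specified in this section) and hypothetical reads and CpG positions.

```python
from math import erfc, sqrt

def methylation_calls(reads, cpg_positions):
    """Count (methylated, total) calls: C = retained cytosine, T = converted."""
    methylated = total = 0
    for read in reads:
        for pos in cpg_positions:
            base = read[pos]
            if base in "CT":          # ignore sequencing errors / other bases
                total += 1
                methylated += base == "C"
    return methylated, total

def two_proportion_p(meth_1, total_1, meth_2, total_2):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (meth_1 + meth_2) / (total_1 + total_2)
    se = sqrt(pooled * (1 - pooled) * (1 / total_1 + 1 / total_2))
    if se == 0:
        return 1.0                    # both groups fully (un)methylated
    z = (meth_1 / total_1 - meth_2 / total_2) / se
    return erfc(abs(z) / sqrt(2))

# Hypothetical bisulfite reads (forward strand) with CpG cytosines at 2 and 6.
case_aa = ["ATCGTTCG", "ATCGTTTG", "ATCGTTCG", "ATTGTTCG"]
case_bb = ["ATTGTTTG", "ATTGTTCG", "ATTGTTTG", "ATCGTTTG"]

m1, n1 = methylation_calls(case_aa, [2, 6])
m2, n2 = methylation_calls(case_bb, [2, 6])
print(m1, n1, m2, n2, round(two_proportion_p(m1, n1, m2, n2), 4))
```

With hundreds of thousands of reads, even very small differences in proportion reach significance in a test of this kind, so effect size should be considered alongside the p-value; reads from the same subject are also not independent, which a real analysis would need to account for.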

Our null hypothesis for each question was that there should be no significant difference in

methylation level between any of the 4 alternate homozygous groups at any of the chosen SNPs;

surprisingly, this was not what we observed. Each SNP was associated with between 1 and 27

CpG sites, with most having around five on their amplicons, as shown in Figure 3.26. For

question 1, we found that there was a significant difference in methylation-genotype association

between case and control groups in 11/11 SNPs (Table 3.11). This indicates that a disease

effect on the association between methylation and genotype becomes evident between case and

control groups, even for SNPs that were not detected by the microarray analysis, when analyzed

at high resolution. This was the most general of the four questions, but the findings were

recapitulated in the following three analyses, which provided a more detailed examination of the

possible methylation-genotype relationships.

Figure 3.26. CpG count per SNP

The number of CpG sites present on the sequenced amplicons for each SNP.

When we investigated questions 2-4, nearly all investigated SNPs demonstrated significant

differences between groups when all CpG sites surrounding a SNP were analyzed (Table 3.11).

It is apparent that, when all local CpG sites are considered, differences can be observed between

genotypes, between cohorts, or as a combination of both possibilities. Figure 3.27 illustrates the

variation occurring between and within cohorts at the four CpG sites surrounding a single SNP;

our null hypotheses would predict that the methylation proportions would be the same for all

genotypes and all cohorts. This figure serves to visualize the fluctuations at all CpG sites

between cohorts at a SNP, although we pooled all CpG sites per SNP in our analyses to obtain a

single measurement for each SNP.

It should be stressed that the ASM we observed in this experiment is nowhere near as robust as

what was detected in the microarray or even the pyrosequencing experiments, as the microarrays

did not suggest that these were epiSNPs and the pyrosequencing experiment failed to find any

significant ASM at other SNPs that were not identified as epiSNPs. Still, this finding is

intriguing, because it introduces a novel, fundamental quality of ASM: its common occurrence

at smaller effect sizes, which we term “minor epiSNPs.” It is not clear why this phenomenon

would exist, but perhaps it represents the remnants of epiSNPs that have faded away over one

generation or the span of many. At this point, we can only hypothesize on the biological

relevance of this finding. We can conclude that cases and controls have an overall, subtle

difference in their total ASM profiles. It also appears that DNA sequence exerts a genome-wide,

low-level effect on methylation status that can only be detected using extremely sensitive

methods, aside from certain points that become more pronounced (epiSNPs).

Table 3.11. Minor epiSNPs showing significant ASM effects in brain

The analysis results for each of questions 1-4. Significant p values (p<0.05) appear in bold.

Figure 3.27. Methylation levels per group for a sample minor epiSNP

Deep sequencing results for minor epiSNP1, an amplicon containing 4 methylatable CpG sites. The Y-axis

represents the level of methylation, based on the number of sequences that retained a cytosine at that site. For each

site, the X-axis is divided into the 4 cohorts.

Chapter 4

General Discussion and Conclusions

In this thesis, we defined some important features of genetic-epigenetic interactions, beginning

with an investigation of methylation differences within MZ twins, continuing on to examine

methylation differences between MZ and DZ twins, followed by a study of the occurrence and

properties of ASM. We then applied some of these principles in a study of major psychosis, a

non-Mendelian complex disease, in order to determine how the relationship between genetics

and epigenetics can impact human diseases.

Our twin study was the first to utilize epigenome-wide profiling to document sequence-

independent DNA methylation differences in several tissues from MZ twins, and the results

support the theory that epigenetic metastability and divergence are responsible for a portion of

co-twin phenotypic discordance. We analyzed the WBC, buccal epithelium, and rectal biopsies of

MZ twins, and annotated the epigenetic metastability of ~6,000 unique genomic regions. We

also examined WBC and buccal epithelium of DZ twins and found that DZ co-twins exhibited

significantly greater epigenetic differences than MZ co-twins in buccal cells. Our in silico SNP

analysis and our comparison of methylomes in inbred vs. outbred mice favour the hypothesis

that epigenomic differences in the zygotes are responsible for the greater discordance observed

in DZ twins. This is evidence that epigenetic inheritance exists, and that it is a separate entity

from DNA sequence-dependent methylation.

Delving into the topic of ASM, we discovered that approximately 2.5% of SNPs show ASM in

brain tissue, although genomic distribution varies between cases and controls, with controls

having a nearly-uniform occurrence of epiSNPs across the chromosomes. The majority of

epiSNPs occur in intronic and intergenic regions, although UTRs and regions in the vicinity of

genes (2Kb upstream or 500bp downstream) were generally enriched, while exons were

generally depleted in the brain. The SZ cohort contained twice as many epiSNPs as controls

and BD, although there was a large degree of overlap. Many of the epiSNP GO categories were

related to brain development and function; intriguing differences between cohorts were

observed in the glutamate and insulin pathways. EpiSNPs were also detected in sperm DNA

and were largely tissue-specific, showing little conservation between sample types, as well as

differing in chromosome and functional distribution patterns. Deep sequencing analysis

revealed that any SNP could potentially demonstrate low-level ASM, and that this subtle

phenomenon is easily overlooked in microarray-based studies. This work offers some insight on

the actions of sequence-dependent DNA methylation in normal and affected individuals, and

provides evidence that genetic studies should be stratified to include an epigenetic component.

Changing concepts of epigenetic regulation

The classical view of epigenetics, which dominated for several decades, is somewhat simplistic

– DNA methylation at a promoter silences transcription of that gene in cis, and epigenetic marks

are completely erased and re-established during maturation of germ cells and post-zygotically,

with the exception of imprinted genes [302] – however, recent technological advances have

allowed us to study molecular epigenetic mechanisms in much greater detail and expand upon

this basic scenario. Concepts such as epigenetic inheritance and ASM are still controversial,

mainly because previous studies have not had the power or scope to adequately support or refute

any particular hypotheses [303]. In our twin study, the observation that DZ co-twins were more

epigenetically variable than MZ co-twins provided strong evidence for epigenetic inheritance,

which argues for some degree of persistence of epigenetic marks through the reprogramming

event that occurs after fertilization, ultimately resulting in the presence of these marks in the

somatic cells of the offspring. It differs from transgenerational epigenetic inheritance, in that the

marks are erased in the primordial germ cells, whereas transgenerational marks survive the more

intense erasure/re-establishment process and may be transmitted to subsequent generations

[304]. The mechanisms underlying transgenerational epigenetic inheritance are not fully

understood, but the number of documented cases has been increasing; evidence for

transgenerational inheritance has been detected in several species, such as yeast [305],

Drosophila [306] and mice [156, 307]. Whether the signals are passed to a single offspring or to

multiple generations, the transmission of epigenetic marks will significantly impact the study of

complex non-Mendelian disease, as well as the traditional twin study design.

Our work highlights the existence and importance of epigenetic drift and inheritance, as well as

the role of epigenetics in development and disease, where inherited or stochastic epimutations

can predispose an individual to a phenotypic outcome. Epimutations may arise spontaneously or

as a result of external factors during germ cell reprogramming, and differential germline

epigenetic modification has been documented. An epigenetic study of sperm cells discovered

significant intra- and inter-individual differences in DNA methylation, where unique DNA

methylation profiles were present in a large percentage of the sperm cells collected from a single

subject. Of the identified positions, promoter CpG islands and peri-centromeric satellite repeats

showed the highest degree of variation, and the variation occurred in a number of genes, not

simply those related to sperm function and development [210]. Some of this variation likely

persists after the reprogramming at the time of fertilization, but the degree of inheritance is not

known, nor is the impact on associated histone modifications, which have also been proposed to

show some level of inheritance [308, 309].

At all stages of life, from developing embryo to adult organism, numerous environmental

factors can impact epigenetic status, although the potential for damage is much greater during

embryonic development. It has long been understood that epigenetic factors are dynamic in

nature, but they were believed to be relatively stable through mitosis; recent research indicates

that they fluctuate at a much higher rate than previously thought. For example, it has

been demonstrated that short periods of exercise are sufficient to cause transient changes in

DNA methylation. DNA methylation levels in skeletal muscle biopsies from healthy men and

women before and after acute exercise were quantified using MeDIP, followed by qPCR and

bisulfite verification. Not only were methylation levels decreased in the promoters of genes

involved in metabolic functions, but the corresponding mRNA levels were also increased,

indicating that DNA methylation can be dynamically altered to affect acute gene expression

[310]. While non-CpG cytosines were the most frequently altered in this experiment, it has also

been shown that methylation-demethylation cycling can occur at CpGs, as demonstrated in five

selected human promoters, including the oestrogen (E2)-responsive pS2 gene, which cycles with

a periodicity of 100 minutes [311]. Taken together, it is apparent that epigenetic marks are not

simply stable markers of gene transcriptional events, but rather, they are actively involved in the

fine-tuning of genomic activity.

One issue in the study of MZ variation and epigenetic inheritance is that it is extremely difficult

to pinpoint the source of the variation. Variation may be introduced purely by the environment

in a stochastic manner. Differences may also be stochastic, taking place at the molecular

epigenetic level, namely due to the actions of DNMT1, which does not carry methylation status

through mitotic divisions with complete fidelity. The third option is that environmental stimuli

may affect an individual via interactions with epigenetic factors. The combined actions of these

three methods can be described in the example of food allergies: Peanut allergies, for example,

are associated with increased immunoglobulin E (IgE) [312], and vitamin D deficiency (a purely

environmental cause) has been linked to increased IgE sensitization [313]. It has also been

found that allergy-related gene transcription is activated in response to CpG demethylation of

particular sites [314]. These alterations could possibly be the result of epimutations that took

place early in development, precipitating the allergic phenotype when they reached critical

mass. Alternatively, there may have been some environmental factor that caused the epigenetic

response; it is known that a variety of exposures, such as tobacco smoke, particulates, diesel

exhaust, polyaromatic hydrocarbons, ozone and endotoxin, can cause allergic phenotypes

by enhancing inflammatory cytokine expression via epigenetic mechanisms [314-316]. In

human epigenetic studies, it is incredibly difficult to control for any of these effects, particularly

when post mortem tissue is used. Future studies are required to determine exactly how

environmental and epigenetic factors may interact in order to eventually make their separation

possible.
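
The stochastic source of variation described above, imperfect maintenance of methylation through mitosis, can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: the per-division fidelity, the number of divisions and the number of sites are hypothetical values chosen for illustration, de novo methylation is ignored, and sites are treated as independent. None of these parameters come from our data.

```python
import random


def divide(sites, fidelity, rng):
    """Copy a methylation pattern through one mitotic division.

    Each methylated site (True) is retained with probability `fidelity`;
    unmethylated sites stay unmethylated (de novo methylation is ignored).
    """
    return [m and (rng.random() < fidelity) for m in sites]


def simulate_lineage(start, divisions, fidelity, seed):
    """Pass a starting pattern through a number of divisions."""
    rng = random.Random(seed)
    sites = list(start)
    for _ in range(divisions):
        sites = divide(sites, fidelity, rng)
    return sites


def discordance(a, b):
    """Fraction of sites whose methylation state differs between two cells."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    # Hypothetical values: 1,000 methylated sites, 50 divisions, 99.9% fidelity.
    zygote = [True] * 1000
    twin_a = simulate_lineage(zygote, divisions=50, fidelity=0.999, seed=1)
    twin_b = simulate_lineage(zygote, divisions=50, fidelity=0.999, seed=2)
    print(f"fraction of discordant sites: {discordance(twin_a, twin_b):.3f}")
```

Two lineages that start from an identical pattern diverge purely by chance in this sketch, which is why stochastic drift cannot be separated from environmental effects by looking at the end state alone.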

ASM represents the interaction between DNA sequence variants and DNA methylation, and it

allows us to consider a new level of regulation in the genome. Although our in silico and mouse

experiments indicated that ASM did not significantly affect our findings in the twin study, it

should be noted that the twin experiment had a different objective and was not designed to

detect ASM effects. The aim of our epiSNP study was to develop an unbiased approach to

examine ASM, as many previous studies were seriously limited by an assortment of biases. First

of all, the main issue with many studies was that they simply included too few samples or

investigated too few SNPs/CpG sites. The first study that focused on allele-specific effects of

SNPs, by Yan et al, only examined 13 loci in 37 samples [20], and the following studies did not

improve upon this issue: Kerkel et al examined 15 057 SNPs in 12 samples (not all representing

the same tissue) [19], Schalkwyk et al examined 183 605 SNPs in 10 samples (5 twin pairs) [21],

and Hellman and Chess examined 110 883 SNPs in 28 samples [159], just to name a few. The

majority of these studies filtered out a tremendous number of SNPs that they felt were either a)

within an MSRE and could thus potentially cause LD effects, or b) uninformative. For example,

Gertz et al used reduced representation bisulfite sequencing to detect ASM and eliminated all

C/T SNPs because they were impossible to differentiate from unmethylated CpG after bisulfite

modification [213]. We felt that this filtering was unnecessary, and that it was more productive

to include all SNPs in the analysis, and then filter afterwards if LD effects were detected, versus

estimating at the beginning and then drastically decreasing our number of SNPs later on.

Fortunately, we did not detect an LD effect, using an analysis that considered the same factors

used by previous studies to design their filters, and this allowed us to examine around ten times

more sites.

Several other studies used cell lines as a source of DNA, which may be problematic as the

methylation profiles of immortalized cells may not reflect those of normal cells; for example, it

has been shown that the overall CpG methylation level is lower in WBC vs hESC [301]. Again,

Hellman and Chess used Epstein-Barr virus transformed B lymphocyte lines, Shoemaker et al

used 16 pluripotent and adult cell lines [160], and Chen et al used 3 hESC lines [161].

Additionally, some studies focused on very specific regions of the genome and isolated target

genes; for example, Shoemaker et al only studied CpGs on chromosomes 2 and 20, regions

surrounding the TSS of 26 developmental genes, and 237 ENCODE promoter regions. Finally,

Hellman and Chess relied heavily on in silico extrapolations and simulations of their actual

wet-lab data when they estimated that 10% of SNPs would demonstrate ASM. This value is

considerably higher than our prediction and the predictions of many other groups.

Another limitation facing all studies, including ours, is that microarrays do not tend to capture

rare variants. The minor allele frequency of a SNP must be greater than 1% in a population, and

the SNPs included on the SNP 6.0 microarray have an average MAF of 19.6%, 18.2% and

20.6% in the HapMap Caucasians, Asians and Africans, respectively [317], so it is quite

possible that we are not detecting a number of epiSNPs simply because they do not appear on

the arrays. Furthermore, our enrichment strategy utilized three MSREs and, although it provides

a great deal of coverage, it is not enough to interrogate every differentially methylated region of

the genome, as some fragments would be created that are too large to be efficiently amplified

and subsequently hybridized to the arrays. Also, the high stringency of our data analysis

probably caused us to miss some additional epiSNPs with significances that were just around the

cut-off – our sensitivity analysis showed that we had a sufficient sample size to detect all of the

SNPs with a strong, medium and weak association with ASM, but that we would miss those

with very weak associations and that our genome-wide ASM prediction is a slight

underestimate. Additionally, we and other studies were limited by the biases present in the

human genome reference sequences, as they are built from a small number of genomes and

likely omit some CpGs within common polymorphic DNA regions.

Tissue heterogeneity is another potential confounding issue experienced in any epigenetic study,

as epigenetic marks vary between cell types [318], and tissues contain a variety of cells. Even

techniques such as laser-capture microdissection or fluorescence-activated cell sorting cannot

fully remove this confounder, although they would certainly decrease its effect. Our post

mortem brain samples did come from the same general area (BA10), but we cannot rule out the

possibility that tissue heterogeneity could have affected our findings if the cohorts differed

significantly in cellular composition. The occurrence of CNVs may also influence the findings

of an ASM study, as additional variants may inflate the array intensity measurements at a SNP.

Ideally, all subjects should be screened for CNVs prior to the study. Although the SNP 6.0

array is equipped for CNV analysis, we did not perform one due to a shortage of bioinformatics

resources; however, this analysis is now underway and will be provided in the final publication.

We were also unable to stratify our samples based on smoking, alcohol or drug use, as this

information was not provided in the study demographics, except in the case of the sperm

samples, although this sample set was too small to achieve adequate power with stratification.

Finally, due to technical limitations, no study to date has differentiated between 5mC and 5hmC,

as they are indistinguishable to MSREs and perform identically in bisulfite modification

reactions.

All current studies agree that ASM is a much more common event than genomic imprinting, and

that DNA sequence plays an important role in the establishment of methylation marks in

specific regions; however, one particular study by Gertz et al takes a rather extreme stance on

the effect size of ASM. They state that:

“The strong association between genotype and DNA methylation indicates that genetics plays a

prominent role in the establishment of DNA methylation patterns. Our data supports a non-

Lamarckian model of evolution, where genetic variants, as opposed to environment, shape

epigenetics. These genetic variants may not lead directly to phenotypic differences, but may

cause phenotypic variability through changes in epigenetic states,” and claim that, “the majority

of variation in DNA methylation can be explained by genotype [213].”

We strongly disagree with these statements – they place too much emphasis on the ability of

DNA sequence to affect methylation status, and this directly conflicts with our findings in the

twin study, where a great deal of epigenetic variation exists between individuals who are

genetically identical. It is difficult to understand how Gertz et al can reach this conclusion,

considering that they only estimated that 8% of SNPs show ASM in a small, three-generation family

(n = 6) and eliminated all C/T SNPs from their analysis. Our estimate was lower (2.5%), and it

was generated by examining a tremendous number of SNPs without bias in a very large sample

set (n = 208 brain, 48 sperm samples); thus, we feel it is a more accurate representation of the

true number. Either way, 2.5% or 8% certainly is not a high enough value to conclude that DNA

sequence determines “the majority” of DNA methylation.

They also found a strong overlap in epiSNPs between tissues, whereas we saw a weaker overlap

between our brain and sperm epiSNPs, as well as a moderate overlap between different cohorts

in the same sample type. They rationalize their strong ASM concordance as being due to,

“shared gene regulatory events that occur early in development or an inherent property of DNA

sequence that directly affects the propensity of DNA methylation [213].” Again, our study was

much larger, with higher power to detect ASM, and their tissues consisted of blood and cell lines, so it

is difficult to determine the biological relevance of the conclusions they reached about

overlapping epiSNPs. Still, we do agree that the common epiSNPs likely share some kind of

developmental origin or similarities in the surrounding sequence that increases their ability to

direct local methylation. We also agree with their statement that the “data are consistent with

both the re-establishment of allelic methylation during development and the direct transmission

of DNA methylation in the germline [213],” and that our experimental designs do not allow for

separation of these methods of epiSNP establishment or propagation.

A probabilistic model that identifies ASM based on bisulfite sequencing data has been presented

by Fang et al [319]; it operates by ignoring the actual genotype and simply focuses on detecting

any region where methylation levels differ between alleles. Although it was developed for the

identification of new imprinted loci, the model provided some insight on the characteristics of

genomic, non-imprinted ASM as well. They analyzed 22 publicly available methylomes from

several tissue types, five of which were uncultured primary blood cell types, while the rest

represented ESC or induced pluripotent stem cells. In agreement with our data, they found that

the total number of ASM loci varies between tissues, also noting that DNA methylation is

altered in immortalized cells – a shortcoming of many other ASM studies. Their predicted

regions of ASM that were common to multiple tissues often marked the promoter regions for

various ncRNA [319], which complements our finding that epiSNPs are enriched in locus

regions and UTRs; as previously mentioned, these regions are often targeted by ncRNA [171].

On a related note, we determined that a large percentage of epiSNPs occurred in introns and

intergenic sequences, both of which can code products that are processed into ncRNAs [169,

172], although those epiSNPs were so numerous simply because SNPs are numerous in those

regions. Fang et al did not analyze any brain methylome data, nor did they produce a list of

epiSNPs, but their results support the tissue-specificity of epiSNPs and strengthen the

connection between ASM and ncRNA.
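
The general idea of detecting a region where methylation levels differ between alleles can be sketched with a simple count-based test. This is not the probabilistic model of Fang et al, which does not require genotypes. It is a much simpler illustration that assumes bisulfite reads have already been assigned to one of two alleles and compares methylated and unmethylated read counts with a two-sided Fisher's exact test. The function names and the example read counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, total)


def allele_methylation_test(meth_ref, unmeth_ref, meth_alt, unmeth_alt):
    """Return (methylation difference between alleles, p-value).

    Assumes each allele is covered by at least one read.
    """
    frac_ref = meth_ref / (meth_ref + unmeth_ref)
    frac_alt = meth_alt / (meth_alt + unmeth_alt)
    p = fisher_exact_two_sided(meth_ref, unmeth_ref, meth_alt, unmeth_alt)
    return frac_ref - frac_alt, p


if __name__ == "__main__":
    # Hypothetical read counts at one CpG: allele 1 mostly methylated,
    # allele 2 mostly unmethylated.
    diff, p = allele_methylation_test(18, 2, 3, 17)
    print(f"methylation difference = {diff:.2f}, p = {p:.2e}")
```

In practice, a genome-wide application of any such test would also need a correction for multiple testing, which this sketch omits.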

Humans are not the only species in which ASM effects have been documented. Xie et al [320]

have recently published a genome-wide, base-resolution map of ASM occurring in mouse

frontal cortex that was generated with MethylC-Seq. They examined ~20 million SNPs in two

mice that were the F1 progeny of reciprocal crosses between two distantly-related inbred strains

and detected ASM in 131 765 CpG sites (approximately 0.7%), finding that CG and non-CG

methylation could occur in an allele-specific manner. As in our study, they found that ASM

sites were isolated and scattered across the genome, typically appearing in intronic and

intergenic regions, although they noted a relative depletion in proximal promoters, whereas we

found an enrichment in regions within 2Kb upstream of the TSS; additionally, they found that

imprinted ASM sites preferentially occurred in proximal promoters. Taken together, the results

of our studies indicate that parent-of-origin-dependent and sequence-dependent ASM do not

appear to share the same molecular basis, but each specific type of ASM may function similarly

between humans and mice. They also characterized the mouse ASM and came to several

conclusions. First, ASM is more likely to be depleted in sequences coding homeobox proteins,

transcription factors, developmental regulators, histones and ribosome proteins, suggesting that

the regulation of methylation levels in key developmental regions and housekeeping proteins is

very stringent. This is in line with our findings, as we also noticed that very few epiSNPs

occurred in the coding regions of genes related to the aforementioned functions. Second, they

found that only a small fraction of ASM sites are clustered (at 94 genes) and few of these sites

(21.3%) are associated with ASE. Estimates of ASE differ quite a bit between studies, although

they are not directly comparable due to variations in method and tissue source. For example, Li

et al utilized whole-genome bisulfite sequencing to examine the methylome of one human

subject, focusing on DNA from WBC; they found that, when ASM occurred within 2Kb of a

TSS, greater than 80% of genes demonstrated ASE [301]. Unfortunately, we were unable to

examine epiSNP-associated expression differences in our particular brain samples, as the RNA

quality was no longer high enough for use with expression arrays, despite the impeccable DNA

quality. There are several published expression datasets available that utilized many of our

brain bank samples; however, upon further inspection, it was apparent that the number of samples

and the number of epiSNP-associated genes included on the arrays were too low to accurately

estimate ASE with sufficient power. Third, when Xie et al examined the neighbouring SNPs of

the ASM CG-sites, they found an over-representation of several motifs on the hyper- and

unmethylated alleles, respectively, but these motifs also existed in hyper- and unmethylated

sequences on a genome-wide scale. They cross-referenced the motifs to recently-published

human methylomes and found that some had very strong correlations with methylation indexes

between mice and humans [320].

One key question that most studies have not addressed is the potential for epiSNPs to actually

represent “epi-haplotypes.” Any epiSNP that we detect may be part of a haplotype block with

another SNP (or several SNPs) where only one of them is truly associated with the methylation

level, and this complicates our selection of the SNPs that should have their surrounding

sequence examined. Other studies also seem to have this issue, as they only consider SNPs

associated with differential methylation at a given CpG site, and do not examine the

contribution of epi-haplotypes. When a single, known haplotype is in question, it is much easier

to investigate ASM related to that block. Bell et al studied a 46Kb LD block of the FTO obesity

susceptibility haplotype and discovered a 7.7 Kb epi-haplotype region encapsulating a highly

conserved non-coding element that represents a validated long-range enhancer, supported by the

histone H3K4me1 enhancer signature [181]. This study was quite feasible, as the target was

already defined and only a small number of CpG sites were interrogated in 60 individuals, but it

highlights the importance of combined genetic-epigenetic studies, as well as the ability of

epiSNPs to form epi-haplotypes. We investigated the possibility that any one SNP was in LD

with another SNP that caused or disrupted an MSRE site and found no evidence for this

association; however, we did not take into account the LD between SNPs outside of MSRE

sites, as this sort of association still requires the presence of an actual epiSNP and could not be

created by false positives alone. In order to distinguish between these two scenarios, an

exhaustive analysis would need to be performed that grouped all nearby epiSNPs (the maximum

group size is unknown, so many permutations would have to be run), utilized LD values to

predict haplotypes, and then tested methylation levels for each group size. Exact haplotypes

could have been determined without using LD-based predictions if we had attempted a deep

sequencing approach instead of microarrays, but that would be prohibitively expensive or would

severely limit our number of samples, which would lead to a loss of SNP variation and decrease

in power. Due to these irresolvable issues and time/computational constraints, an epi-haplotype

analysis would require an enormous effort and was not within the scope of the current study.
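
The first step of the exhaustive analysis outlined above, grouping nearby epiSNPs and enumerating candidate groups of every size, can be sketched as follows. This is a minimal illustration only: the maximum gap between neighbouring epiSNPs is a hypothetical parameter, the coordinates are invented, and the LD-based haplotype prediction and methylation testing steps are not implemented.

```python
def cluster_by_distance(snps, max_gap):
    """Group epiSNPs, given as (chromosome, position) tuples, into clusters.

    Two consecutive epiSNPs on the same chromosome fall in the same cluster
    when they are at most `max_gap` base pairs apart.
    """
    clusters = []
    for snp in sorted(snps):
        last = clusters[-1][-1] if clusters else None
        if last and last[0] == snp[0] and snp[1] - last[1] <= max_gap:
            clusters[-1].append(snp)
        else:
            clusters.append([snp])
    return clusters


def contiguous_windows(cluster, size):
    """All runs of `size` consecutive epiSNPs within one cluster."""
    return [tuple(cluster[i:i + size]) for i in range(len(cluster) - size + 1)]


def candidate_groups(snps, max_gap):
    """Every contiguous candidate group of two or more epiSNPs, for each size."""
    groups = []
    for cluster in cluster_by_distance(snps, max_gap):
        for size in range(2, len(cluster) + 1):
            groups.extend(contiguous_windows(cluster, size))
    return groups


if __name__ == "__main__":
    # Invented coordinates; max_gap of 1,000 bp is an arbitrary choice.
    epi_snps = [("chr1", 100), ("chr1", 600), ("chr1", 1200), ("chr2", 300)]
    for group in candidate_groups(epi_snps, max_gap=1000):
        print(group)
```

Even when restricted to contiguous groups, the number of candidates grows quadratically with cluster size, and each candidate would still require haplotype prediction and a methylation test, which illustrates why the full analysis is computationally demanding.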

Genetic-epigenetic interplay in complex disease

The concepts of epigenetic inheritance and ASM have major ramifications for the study of

complex diseases, as we are no longer considering DNA sequence to be the single mode of

inheritance and are learning that genetics and epigenetics interact in many complex ways. Here,

we have looked at major psychosis from a novel perspective and discovered evidence for a

disturbance in the pattern of SNPs that control their local methylation. Since this finding

involves thousands of SNPs, likely working in concert, it is impossible to explain the connection

between all of these loci and development of the disease with complete certainty; however, we

can make some hypotheses based on key findings and current literature.

The majority of previously-documented epiSNPs appear to operate in cis, as opposed to in trans,

although the actual definitions of these mechanisms are not entirely clear. Two

scenarios exist in the literature: one discusses the distance between an epiSNP and an affected

gene, whereas the other involves an epiSNP and a nearby methylated CpG site. There does not

seem to be a standard distance between an epiSNP and affected gene or site within which

interactions are labeled “cis”; one study used a distance of >1Mb between a SNP and CpG site

to delineate trans interactions [162]. By this definition, our study was only designed to detect

cis effects, as our interrogated unmethylated sequences were less than ~2Kb, thus, all epiSNPs

must be within the immediate vicinity of an affected CpG site. A number of trans effects have

been reported under this definition [21], meaning that our study may have missed ASM

involving a SNP that is far away from its target CpG site. When we consider the distance

between an epiSNP and a nearby gene, arbitrary values are usually assigned, for example, ASM

occurring in introns or up to 43Kb upstream of a gene have been classified as cis [21], but a

2Mb window centred on a gene has also been used as the boundary for cis interactions [300].

The terms “cis” and “trans” are somewhat outdated, and perhaps it is more useful to think of

them as “effects on the same chromosome” and “effects on other chromosomes,” respectively.

Multiple studies in Drosophila [321], mice [322] and humans [21, 323] have concluded that only

about 10% or less of regulatory variants are trans in nature. Again, our study was geared

towards detecting cis interactions, but our finding that about half of our epiSNPs occur in introns

or within 2Kb of genes supports the idea that cis-actions are common. We also discovered a

large amount of intergenic ASM, and these epiSNPs could potentially support either mode of

action, depending on the definition. A 2009 Science paper [300] focused on functional variants

that affect gene expression in cis and the tissue-specificity of their effects, concluding that a)

single SNPs can affect the transcription of multiple genes, b) tissue-specific SNPs are usually

located farther away from the gene, and c) 69–80% of regulatory variants are cell type-specific

[300]. Combining their findings with our own, a hypothesis of epiSNP action begins to

materialize, where epiSNPs are frequently in cis, often affecting gene transcription via

modulation of alternative splicing events from within the intronic regions of genes. The

epiSNPs that act in trans potentially affect either the production or binding of a transcription

factor, enhancer or silencer that exerts its effect further downstream, or perhaps the 3D

organization of DNA in the nucleus places the epiSNP in proximity to an effector gene. In the

latter case, histone changes that result from differential methylation caused by the epiSNP may

affect local chromatin arrangement, resulting in indirect trans regulation.
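
Because the cis/trans label depends entirely on the chosen definition, the same SNP–CpG pair can be classified differently under different thresholds. The following minimal sketch makes this explicit. It assumes the distance-based definition cited above, in which a separation of more than 1Mb marks a trans interaction, and additionally treats pairs on different chromosomes as trans. The function name, the alternative 2Mb threshold and the coordinates are hypothetical.

```python
def classify_interaction(snp, cpg, trans_distance_bp=1_000_000):
    """Label a SNP-CpG pair as 'cis' or 'trans'.

    `snp` and `cpg` are (chromosome, position) tuples. Pairs on different
    chromosomes, or farther apart than `trans_distance_bp`, are called trans.
    """
    snp_chrom, snp_pos = snp
    cpg_chrom, cpg_pos = cpg
    if snp_chrom != cpg_chrom:
        return "trans"
    return "trans" if abs(snp_pos - cpg_pos) > trans_distance_bp else "cis"


if __name__ == "__main__":
    pair = (("chr1", 0), ("chr1", 1_500_000))  # invented coordinates
    # Same pair, two definitions: >1Mb threshold vs. a hypothetical 2Mb one.
    print(classify_interaction(*pair))
    print(classify_interaction(*pair, trans_distance_bp=2_000_000))
```

Under this scheme, all epiSNPs detected in our study would be labeled cis, since the interrogated fragments were shorter than ~2Kb.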

In our GO analysis, we found several categories involved in two pathways that have been

proposed to play a role in the etiopathogenesis of psychosis: one related to glutamate and one

related to insulin. For many years, dopamine and its dysfunction within the mesolimbic pathway

has been thought to be the major cause of psychosis, but there is evidence that other

neurotransmitters may be equally important; one likely candidate is glutamate, as the majority of

neurons in the brain use it for neurotransmission [324]. Two drugs that intensify psychotic

symptoms, ketamine and phencyclidine (PCP), both act via blockade of glutamate receptors, but

show very little dopaminergic effect [325]. In the glutamate hypothesis of SZ, N-methyl-D-

aspartate (NMDA) receptors, the major subtype of glutamate receptors, are believed to be

dysfunctional. Animal models have demonstrated that various environmental stressors increase

glutamate release/transmission in limbic/cortical areas and can cause structural changes, such as

dendritic remodeling, reduction of synapses and possibly volumetric reductions in areas where

glutamate is primarily involved [324]. In psychotic or pre-psychotic individuals, levels of

glutamate are altered: low levels of glutamate are observed in the thalami of individuals who are

at risk for psychosis [326], higher levels occur in the associative striatum (precommissural

dorsal-caudate) of high-risk and first-episode psychotic subjects [327], and plasma glutamate

levels were decreased in first-episode SZ and BD subjects, but were restored after treatment

[328]. Selective agonists of group II metabotropic glutamate (mGlu) receptors, such as the

positive allosteric modulator biphenyl-indanone A (BINA), have recently demonstrated efficacy

in treating the positive and negative symptoms of SZ, as well as modulating the activity of

psychotomimetic drugs and reducing the increased glutamatergic transmission associated with

psychotomimetic hallucinogens. Increased excitation of the medial prefrontal cortex is believed

to contribute to the development of SZ, so the ability of BINA and other related compounds to

treat SZ symptoms reinforces the glutamate theory and highlights the importance of the mGlu

receptor, in addition to the NMDA receptor [329]. Several glutamatergic gene and pathway

targets have been identified in GWAS of SZ and, while there are fewer findings for BD, there is

evidence to suggest that the candidates will be complementary to SZ rather than completely

overlapping [330]. This idea mirrors our comparison of SZ versus BD ASM where there is

some overlap of GO categories and epiSNPs, yet the diseases seem to maintain similar, but

unique epiSNP distributions.

We detected a number of epiSNPs in three GO categories related to glutamatergic signaling

(summarized in Table 3.7), some of which were shared between groups. One category,

GO:0051967, was unique to BD, whereas GO:0035249 was common to SZ and controls, and

GO:0007215 was common to all three cohorts. In the case of this common category, broadly

defined as “glutamate signaling pathway,” the majority of epiSNPs were associated with the

same genes across all cohorts, and these genes coded either glutamate receptors or receptor

subunits. Three of the genes unique to BD also encoded glutamate receptors, but one gene

(APP) codes amyloid beta (A4) precursor protein, which is known to be involved in the

pathogenesis of Alzheimer’s disease [331]. The control- and SZ-unique genes were also related

to glutamate receptors, including genes coding proteins that interact with the receptors, such as

GNAQ, which couples cell surface, 7-transmembrane domain receptors to intracellular signaling

pathways [332], and HOMER2, a protein belonging to a family that regulates glutamate receptor

function [333], but is also implicated in addiction and drug induced neuroplasticity [334]. The

considerable amount of ASM associated with several different receptors in glutamate pathways

supports the glutamate theory of psychosis, as BD and SZ subjects deviate from the epigenetic

regulation patterns observed in control individuals, which could potentially result in alteration

and dysfunction of various glutamate-related pathways.

This same situation exists with our SZ-specific insulin-related GO categories, which were both

involved with insulin secretion. In the 1930s, it was believed that shocking a subject with doses

of insulin high enough to cause a violent reaction and an eventual comatose state, followed by

rescue with glucose, would cure or greatly improve the symptoms of SZ [335]. This therapy

came about after it was noticed that, when insulin was administered to SZ patients to encourage

weight gain, their mental state was positively affected [336]. With time, significant doubts

arose concerning the validity/ethics of the approach, and it was eventually replaced with newer

treatments, but recent research has been reviving the relationship between insulin and SZ. In a

study of serum samples from 19 twin pairs discordant for SZ plus 34 age- and gender-matched

healthy control twins, it was found that the SZ subjects had elevated triglycerides and were more

insulin resistant than their healthy co-twins [244]. Increased serum concentration of insulin in

SZ subjects has been documented at the onset of the disease [286], as well as in antipsychotic-

naïve individuals [285]. This second finding is especially important, as antipsychotic treatment

has been associated with insulin resistance or impaired glucose tolerance, but this and other

studies have shown that these changes are occurring prior to initiation of drug treatment.

Another group investigated anti-psychotic naïve first onset SZ subjects and also found increased

serum levels of insulin. Additionally, they used liquid-chromatography mass spectrometry

proteomic profiling to analyse the proteome of stimulated and unstimulated peripheral blood

mononuclear cells from SZ subjects and controls, and identified 18 proteins that were

differentially expressed between first onset cases and controls, 8 of which belonged to the

glycolytic pathway. Differences included increased levels of lactate and the glucose transporter-

1, and decreased levels of the insulin receptor; none of these proteins were altered in

antipsychotic treated patients [337]. Finally, while energy metabolism genes are decreased in

the brains of SZ patients, stimulation of insulin and insulin-like growth factor (IGF-1) receptors

leads to a reciprocal alteration of genes associated with metabolism and synaptic function.

Pharmacologic interventions that activate these receptors are a promising therapy for SZ, as they

would counter an etiological genomic disturbance [338]. Our GO analysis detected 36 epiSNPs

involved in insulin secretion that were unique to the SZ cohort; these epiSNPs were associated

with a variety of genes, ranging from potassium and calcium channels (KCNB1, CACNA1C) to

regulatory transcription factors (RFX6, RFX3), protein kinase C (PRKCA), a cholinergic

receptor (CHRM3), and a phosphodiesterase (PDE8B). While none of these genes are

exclusively related to insulin secretion, they are all critical pathway components, and these GO

categories were significantly enriched in the SZ group alone. Once more, it is not likely that a

complex disease, such as SZ, will be caused by a small number of “obvious” genes, and perhaps

the actual contributing factors have been so elusive because they represent different profiles of

seemingly unrelated genes that are discretely-dysregulated (i.e., epigenetic dysregulation versus

genetic mutation), all of which are only responsible for a small portion of the pathology.

Although they seem to be completely separate pathways, there is a link between insulin

resistance and glutamate in psychosis. GAD is the enzyme that catalyzes the decarboxylation of

glutamate to produce GABA [339], and expression of one isoform, GAD65, is decreased within

the axon boutons of interneurons in SZ patients; the decrease correlates with decreases in

GAD65 protein levels and dendritic spines [340]. There is also evidence that autoantibodies to

GAD isoforms can contribute to the development of chronic psychotic disorders [341, 342],

type 1 diabetes and latent autoimmune diabetes in adults [343, 344]. When young non-obese

diabetic (NOD) mice were injected with anti-idiotypic antibodies directed to the GAD65Ab, the

incidence rate of Type 1 diabetes and time of onset significantly decreased [345]. The

connection between GAD autoantibodies and these conditions is relatively recent, and their

actual role in pathogenesis has yet to be determined. It is evident, however, that fluctuations in

one of these systems could result in disturbances in the other, meaning that epiSNP-induced

dysregulation of one pathway may also exert an effect via the other, and it is possible that these

effects could be additive.

The role of epigenetic mechanisms in psychiatric diseases is only beginning to solidify, but it is

already evident in major psychosis, Alzheimer’s disease, autism spectrum disorder, fragile X

syndrome [346], and several other conditions not previously mentioned in this thesis, such as

Rubinstein-Taybi syndrome [347], addiction [348, 349], and Huntington’s disease [350].

Maintenance of DNA methylation and histone modifications is crucial for normal

neurodevelopment and functioning of the brain – dysregulation of these components is

deleterious to the subject and can predispose to any of the aforementioned disease phenotypes.

Previous studies of psychiatric conditions have concentrated on the contributions of genetic and

environmental factors, but the impact of epigenetic mechanisms on neural function and gene

regulation cannot be ignored. While DNA sequence and external influences likely play an

important role in disease etiology, it is the interplay between epigenetics, DNA sequences and

environment that should become the focus of future work, with epigenetics bridging the gap

between genes and environment. New discoveries related to epigenetic inheritance and ASM are

evidence that we are not even aware of all the fundamental epigenetic mechanisms, and that a

considerable effort must be devoted to this field of research.

Future directions

As we gain a better understanding of genetic-epigenetic interactions, such as those described

here, the next step is to utilize these findings to improve upon our current strategies for studying

genomic activity and understanding normal molecular interactions. Here, we have explored the

epigenetic differences between MZ co-twins using CpG island microarrays, which was a large-

scale approach at the time of its undertaking. Since then, technology has advanced, and much

larger-scale tools are now available, such as tiling arrays and next generation sequencing

platforms. Future studies should utilize these technologies to further study the epigenetics of

MZ twin discordance and to investigate the contributions of DNA methylation and histone

modification in the context of phenotypic discordance.

The first goal would be to identify epigenetic differences that correlate with phenotypic

differences in disease states, as well as in normal, non-pathological traits. Integrative

approaches should be used for the simultaneous study of epigenetic and genetic factors, i.e. we

should begin to use DNA methylation level and histone modification information to stratify

GWAS into EWAS. The use of this additional level of biological regulation will facilitate the

identification of risk epi-alleles, which may be more informative than the purely genetic risk

alleles that are currently being discovered, as certain DNA risk factors may only become

detectable when their methylation state is taken into consideration. In the past several decades,

thousands of quantitative trait loci (QTL) – stretches of DNA associated with the gene for a

given trait – have been mapped using linkage and association analyses [351]; however, small

effect sizes and the small proportion of the variance attached to individual QTLs have greatly

slowed the connecting of QTLs to genes [352]. The introduction of an epigenetic element may

facilitate the mapping of QTL that influence complex traits, and we have already observed

instances where epigenetic factors are responsible for the phenotype. If the DNA sequence of

the agouti locus (previously mentioned in the literature review) had been studied alone, with no

consideration of methylation status, the effect of this QTL would not be fully understood, as the

methylation within the 5’ transposon is the true predictor of the phenotype [353].
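The stratification argument above can be illustrated with a minimal sketch. All counts below are hypothetical and are not drawn from our data; they assume a risk allele that raises disease odds only when the surrounding region is unmethylated. Pooling the two methylation states dilutes the odds ratio toward 1, whereas stratifying by methylation state recovers it.

```python
def odds_ratio(case_carriers, control_carriers, case_noncarriers, control_noncarriers):
    """Odds ratio for carrying the risk allele, from a 2x2 table of counts."""
    return (case_carriers * control_noncarriers) / (control_carriers * case_noncarriers)


# Hypothetical counts, ordered as:
# (case carriers, control carriers, case non-carriers, control non-carriers)
unmethylated = (30, 10, 20, 20)  # assumed: allele is associated with disease
methylated = (20, 20, 20, 20)    # assumed: allele has no effect

# Ignoring methylation state is equivalent to summing the two strata.
pooled = tuple(u + m for u, m in zip(unmethylated, methylated))

or_unmethylated = odds_ratio(*unmethylated)
or_methylated = odds_ratio(*methylated)
or_pooled = odds_ratio(*pooled)

print(f"unmethylated stratum OR: {or_unmethylated:.2f}")
print(f"methylated stratum OR:   {or_methylated:.2f}")
print(f"pooled OR:               {or_pooled:.2f}")
```

In this toy setting the pooled estimate lies between the two stratum-specific estimates, which is the sense in which a purely genetic screen could understate a methylation-dependent risk factor.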

Presently, the EWAS approach is being utilized in many small-scale or targeted studies, mainly

those focused on cancer, but with our increasing knowledge of epigenetic mechanisms, EWAS

methods should become the standard for all risk factor screening endeavours. It has recently

been suggested that simply knowing the DNA sequence of an individual is not sufficient to

predict whether or not they will develop a disease in their lifetime, as personal choices, lifestyle

and random events can cause or prevent nearly every disease [354]. This is discouraging news

for genetics, but highlights the importance of epigenetics and combined genetic-epigenetic

studies in personal medicine, as epigenetic mechanisms can be affected by the environment and

may mediate a portion of its effects. A comprehensive re-analysis of GWAS-derived candidate

genes, haplotypes and individual SNPs should be attempted wherever possible and, hopefully,

technology will advance to the point where these screens can be completed simultaneously in a

cost-effective manner, perhaps through the design of dual-purpose microarrays.

The next step would be to determine the origin of these epigenetic differences, and the

mechanism by which they exert their effects – are they influenced by specific environmental

stimuli or do they arise stochastically? Once the causes have been identified, the development of

treatments and preventative strategies can begin for the pathological traits; in the case of non-

pathological phenotypic traits, we simply wish to understand the underlying mechanisms

responsible for their variability. The relationships between environmental exposures, genetic

states and epigenetic factors will become more comprehensive when examined using integrative

approaches and large sample sizes, and this should facilitate the discovery of epigenetic

mechanisms in discordant MZ twins.

Another goal of future work will be to expand upon our knowledge of epigenetic inheritance, as

much remains unknown regarding this critical biological phenomenon. As mentioned in the first

goal, subsequent studies should incorporate new technologies and drastically increase sample

sizes to study epigenetic inheritance in humans, as well as model systems. Experiments should

focus on the detection of epigenetically-heritable signals in the context of heritable traits and

diseases, taking into account both DNA methylation and histone modification. It has been

shown that histone modifications are heritable and can potentially affect the regulation of

transcription in germ and somatic cells of subsequent generations. For example, when a

Caenorhabditis elegans (C. elegans) ortholog of the H3K4me2 demethylase LSD1/KDM1 is

mutated, the worms show increasing sterility with each generation, and this sterility correlates

with misregulation of spermatogenesis-expressed genes and transgenerational accumulation of

dimethylation of histone H3 on lysine 4. It was hypothesized that erasure of H3K4me2 by

LSD1/KDM1 in the germline prevents the “epigenetic memory” from being inappropriately

transmitted between generations [355]. Furuhashi et al. have suggested that the use of “elegant

model systems” is crucial for the understanding of transgenerational histone modification

effects, namely the manipulation of C. elegans, as this species does not exhibit any DNA

methylation and encodes all of its epigenetic information using histones [309]. Examination of

histone patterns was not within the scope of our study, but future studies should definitely

explore this topic.

Regarding ASM, a third goal should be to investigate epiSNP 3D interactions and search their

surrounding sequences for motifs and other clues about their origins, but we must first

determine if we are dealing with epiSNPs, epi-haplotypes or both because, as previously

mentioned, a pure sequence analysis would be somewhat futile in an epi-haplotype scenario.

We must also study their stability and the factors that influence it, considering the possibility

that epiSNPs are not static and are subject to change – can their methylation levels fluctuate as

rapidly as those observed in the Barres study [310]? If so, are fluctuations long- or short-acting,

and can they be stimulated by medications, chemicals in food, water or air, stress and emotional

state, physical activity or any number of other factors? While the few known

imprinted genes display strong, parent-of-origin-specific monoallelic gene expression, it has

been documented that about 20% of autosomal genes may also show some differences in allelic

expression [356, 357]. The idea of a dynamic adaptation of genomic regulation using epiSNPs

is intriguing, although it does not explain the conservation that we are seeing at many loci.

Perhaps there is a selective pressure at play in the establishment and maintenance of epiSNPs.

In order to study ASM mechanisms, it is necessary to select some target regions and focus on

them, versus looking at the entire genome; the purpose of our study was to identify instances of

ASM and observe trends, but the design is not suited to the exploration of exact mechanisms.

Ideally, deep-sequencing techniques would be employed, providing single-base resolution and

allowing for the convenient, reliable detection of epi-haplotypes. Analysis of epi-haplotypes

offers the extra benefits of reducing the number of tests required, as we are no longer

considering individual SNPs, and it would also permit the undertaking of evolutionary studies.
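To make the testing burden concrete, the following sketch compares Bonferroni-corrected significance thresholds for SNP-wise and epi-haplotype-wise testing. The numbers of SNPs and epi-haplotype blocks are hypothetical, and Bonferroni correction is used only as a simple stand-in for whichever multiple-testing procedure is actually applied.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


n_snps = 1000            # hypothetical number of individually tested SNPs
n_epi_haplotypes = 200   # hypothetical number of epi-haplotype blocks covering them

snp_threshold = bonferroni_threshold(0.05, n_snps)
haplotype_threshold = bonferroni_threshold(0.05, n_epi_haplotypes)

print(f"per-SNP threshold:           {snp_threshold:.1e}")
print(f"per-epi-haplotype threshold: {haplotype_threshold:.1e}")
```

Fewer tests give a less stringent per-test threshold, so an effect of a given size is easier to detect at the epi-haplotype level.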

Also related to the study of epiSNP function are the questions of downstream interactions with

genes or regulatory elements and the relationship between ASM loci and imprinted loci. It has

been hypothesized that imprinting provides a means for offspring to adapt to the environment

before birth, and perhaps epiSNPs also serve this purpose. It is already known that ASM exists

in mice [320], so an analysis of epiSNP establishment in inbred mice could be very informative,

if different developmental time-points and maternal exposures are considered; this sort of

experiment should focus on previously-identified ASM hotspots, and could be extended to

include a study of gametic ASM. Our current studies have only considered 5-mC, but

technology is advancing and it will soon be possible to examine 5-mC and 5-hmC separately.

This is recommended for future experiments, especially in brain, liver, kidney and colorectal

tissues, where 5-hmC is relatively abundant [358], as consideration of both forms of methylation

may provide more information on the function of these modifications. On the subject of

downstream interactions, the influence of epiSNPs on transcription factors and their binding

sites, enhancers, silencers, ncRNA and miRNA should be studied, as should the impact of

intronic and UTR epiSNPs on the splicing and conformation of mRNA.

A fourth and final goal is the development of effective epigenetic pharmacotherapies that can be

used to correct dysregulated methylation levels and histone modifications. A number of

treatments based on epigenetic principles already exist, but they are severely limited by their

non-specificity, lack of efficacy and impermanent nature. Many of these drugs affect enzymes

that modify DNA and histones, but targeting etiological disease epimutations may be even more

promising, and compounds with higher specificity would become attractive choices for the

treatment of diseases other than cancer. One way to target specific sequences involves the use of

aptamers, which are small RNA/DNA molecules that form secondary and tertiary structures that

specifically bind proteins or other targets, much like a synthetic antibody. Aptamers are

chemically synthesized and easily conjugated with siRNA and nanoparticles, and preclinical

studies have shown great potential of these molecules in mouse models of cancer and HIV

[359]. It may also be possible to exploit the properties of zinc-finger proteins (ZFPs) and RNA

interference (RNAi) for the development of future epigenetic therapeutics. Zinc-finger proteins

specifically recognize and bind short stretches of DNA sequence (typically 9-18 base pairs)

[360], and they can be used to carry out a variety of cellular activities when they are combined

with different domains. In theory, an epimutation could be resolved if an epigenetically

dysregulated gene is treated with a corresponding histone or DNA modification enzyme

attached to a gene specific ZFP; the ZFP will specifically bind to the epimutation locus, while

the modification enzyme permanently repairs the damage. Another promising technology is

based on RNAi, which involves double-stranded RNA-induced destruction of homologous

mRNA, thus disabling protein production. Small interfering RNAs (siRNA) are endogenously

produced and incorporated into an RNA-induced silencing complex (RISC), which then targets

and cleaves mRNA transcripts [361]. It is believed that RNAi may have an impact on local

chromatin structure, heterochromatin assembly, and gene silencing, although mechanistic details

as to how the RNA and chromatin connect remain unclear [362]. Several siRNAs have recently

been created, including ones to knock down β-secretase (BACE1) in Huntington's and

Alzheimer's disease, SCA1 in spinocerebellar ataxia, superoxide dismutase (SOD1) in

amyotrophic lateral sclerosis [363], and Toll-like receptor 4 (TLR4) in a rat model of acute lung

injury [364]. A clinical trial has also been submitted to the FDA, proposing the use of siRNA

against vascular endothelial growth factor in cases of age-related macular degeneration [365].

Great potential exists for the therapeutic use of siRNA to knock down mutated proteins in

various disease states, although issues of nonspecific silencing of partially homologous genes,

safe delivery and inhibition of microRNA (miRNA) must first be resolved.

The number of known epigenetic target genes and sequences has been steadily increasing,

especially in a variety of cancers, and their clinical utility is beginning to be noticed. MGMT

promoter methylation can be used to stratify elderly glioblastoma patients for treatment, as only

those with this methylation are sensitive to alkylating agent chemotherapy [366]. In patients

with early stage ER-negative breast cancer, classification of the methylation levels of tumor-

specific and tumor-related genes is an independent prognostic factor [367], although these

classifications are not yet linked to preferable treatment strategies. Some biomarkers have been

officially approved for use in the monitoring of cancer, such as CA125 and HE4, which were

approved by the FDA as biomarkers of ovarian cancer [368]. Ovarian cancer has shown quite a

bit of resistance to drug therapy (namely cisplatin and carboplatin); however, the ability of

epigenetic treatments to re-sensitize the tissue is under investigation [369, 370]. Natural

epigenetic modulator compounds, such as epigallocatechin-3-gallate (EGCG), a catechin

(flavonoid) from green tea, sulforaphane (SFN), an organosulfur from cruciferous vegetables,

and genistein, an isoflavonoid from soybean, have also demonstrated an ability to inhibit

ovarian cancer cell proliferation, while offering a safer adverse effect profile [51]. Other

promising plant-derived epigenetic therapeutics are also emerging, including alkaloids,

terpenoids and many polyphenol compounds [371]. Perhaps, in the next several years, further

identification of epigenetic biomarkers and operationalization of new, effective diagnostics and

treatments will become feasible for psychiatric and various other complex diseases.

Appendices

Appendix I. Twin Study Supplementary Notes

Correlation of MZ co-twin epigenetic variation with WB cell counts

The spot-wise correlation between twin pair loess M log ratio values and WB cell counts did

not yield any significant loci after correction for multiple testing. The numbers of genes

associated with loci showing an uncorrected significance value of P<0.001 in the whole WBC,

neutrophil, and lymphocyte fractions were 6, 10, and 8, respectively. Of the genes associated

with identified microarray probes beyond this threshold, 3 genes (EOMES, PDCD2, and

PTPN9) are related to immune system function [372-374]. While there is

a possibility that these loci have surfaced by chance, the correlation between DNA

methylation status and various immune system related genes suggests that some of the

differences detected in this tissue could be a result of cellular sub-fraction differences between

these twins. However, the proportion of seemingly relevant correlations is less than 0.04% of the

total number of unique loci, which may be a testament to the effectiveness of matching WBC

cellular sub-fractions prior to epigenomic profiling.
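For illustration, the effect of multiple-testing correction on a locus with uncorrected P<0.001 can be sketched as follows. The sketch assumes a Benjamini-Hochberg false discovery rate procedure and uses hypothetical P values and a hypothetical number of loci; it is not the analysis code used in this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical scenario: one locus at P = 0.0009 among 6000 tested loci,
# with every other locus showing no evidence of correlation (P = 0.5).
n_loci = 6000
p_values = [0.0009] + [0.5] * (n_loci - 1)
adjusted = benjamini_hochberg(p_values)
survives = adjusted[0] < 0.05

print(f"adjusted P for the top locus: {adjusted[0]:.3f}; survives FDR 0.05: {survives}")
```

Under these assumptions an isolated locus at P<0.001 does not survive correction, which is consistent with treating such loci as suggestive rather than significant.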

Figure A.1. Karyogram of MZ co-twin epigenetic similarity in buccal cells

A chromosomal karyogram depicting levels of MZ co-twin similarity per interrogated locus in the buccal sample.

Black and grey bars on the chromosomes represent chromosomal banding patterns while red bars are indicative of

regions of high microarray probe density. Bars to the right of each chromosome represent locus specific ICCs

depicting levels of MZ co-twin epigenetic similarity. FDR corrected P values below the level of P<0.05 are depicted

in green while those with greater P values are depicted in grey.

Figure A.2. Karyogram of MZ co-twin epigenetic similarity in gut

A chromosomal karyogram depicting levels of MZ co-twin similarity per interrogated locus in the gut sample. Black

bars on the chromosomes represent chromosomal banding patterns while red bars are indicative of regions of high

microarray probe density. Bars to the right of each chromosome represent locus specific ICCs depicting levels of

MZ co-twin epigenetic similarity. Raw P values below the level of P<0.05 are depicted in green while those with

greater P values are depicted in grey.

Figure A.3. Karyogram of MZICC-DZICC values in WBCs

A chromosomal karyogram depicting levels of MZ co-twin similarity relative to DZ co-twin similarity per

interrogated locus in the WBC sample. Blue bars to the right of each chromosome represent locus specific ICCMZ-

ICCDZ values.
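As an illustration of the quantity plotted, the sketch below computes a one-way intraclass correlation coefficient (ICC) for co-twin pairs at a single locus and takes the MZ minus DZ difference. The one-way ICC estimator and the methylation values are assumptions chosen for the example; the exact estimator used for the figure is not restated here.

```python
def pair_icc(pairs):
    """One-way random-effects ICC for a list of (twin_a, twin_b) values at one locus."""
    n = len(pairs)
    k = 2  # two co-twins per pair
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    pair_means = [(a + b) / k for a, b in pairs]
    # Between-pair and within-pair mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, pair_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical methylation values for three MZ and three DZ pairs at one locus.
mz_pairs = [(0.1, 0.2), (0.5, 0.4), (0.9, 0.8)]
dz_pairs = [(0.1, 0.5), (0.5, 0.1), (0.9, 0.6)]

icc_mz = pair_icc(mz_pairs)
icc_dz = pair_icc(dz_pairs)
icc_difference = icc_mz - icc_dz

print(f"ICC(MZ) = {icc_mz:.3f}, ICC(DZ) = {icc_dz:.3f}, difference = {icc_difference:.3f}")
```

A positive difference at a locus indicates that MZ co-twins are more similar to each other than DZ co-twins at that locus.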

Figure A.4. Karyogram of MZICC-DZICC values in buccal cells of MC MZ twins

A chromosomal karyogram depicting levels of MZ co-twin similarity relative to DZ co-twin similarity per

interrogated locus in the MC buccal sample. Blue bars to the right of each chromosome represent locus specific

ICCMZ-ICCDZ values.

Appendix 2. Allele-Specific Methylation Study Supplementary Notes

ID Status Age Sex Race COD PMI Brain pH Brain Wt Age of Onset

2 BD 29 M white SUIC:CO 60 6.7 1430 17

3 SZ 43 M white PNEUMONIA 26 6.42 1480 22

4 BD 45 M white CARDIAC 28 6.35 1480 35

5 BD 41 M nat amer SUIC:OD 70 6.71 1625 22

6 BD 29 F white OD 62 6.74 1330 18

8 BD 44 M white SUIC:HANGING 19 6.74 1660 33

9 SZ 45 F white SUIC:JUMPED 52 6.51 1510 34


11 SZ 51 M white CARDIAC 43 6.63 1390 23

12 SZ 19 M white OD 28 6.73 1465 18

13 BD 49 F white SUIC:MVA 19 5.87 1380 22

14 BD 48 F white CARDIAC 18 6.5 1205 33

15 C 44 F white CARDIAC 28 6.59 1330 NA

16 BD 42 M white DROWNING 32 6.65 1470 18

17 SZ 53 F white CARDIAC 13 6.49 1345 29

18 BD 35 M white CARDIAC 35 6.3 1490 19

19 C 49 M white CARDIAC 46 6.5 1605 NA

20 BD 59 F white SUIC:OD 53 6.2 1410 48

21 BD 54 M white SUIC:OD 44 6.5 1510 45


23 BD 35 F white SUIC:CO 17 6.1 1250 21


26 SZ 24 M white SUIC:OD 15 6.2 1505 20


28 BD 45 M black KETOACIDOSIS 35 6.03 1300 16

31 SZ 34 M white EXHAUSTIVE MANIA/NMS 9 5.9 1415 19

32 BD 42 F white OD 49 6.65 1335 20

33 C 38 F white CARDIAC 33 6 1120 NA

36 BD 41 M white OD 39 6.6 1375 21

37 SZ 39 M white MVA 80 6.6 1355 17




41 SZ 43 M white CIRRHOSIS 18 6.3 1520 18

42 BD 64 M white PNEUMONIA 16 6.1 1340 19

43 C 35 M white MYOCARDITIS 52 6.7 1700 NA

44 SZ 32 F white SUIC:JUMPED 36 6.8 1340 29


46 BD 59 M white SLEEP APNEA 84 6.65 1300 25




50 BD 51 F white SUIC:BLEEDING 77 6.42 1120 35





55 SZ 47 M white ACUTE PANCREAT 13 6.3 1310 20



59 C 57 M white CANCER 26 6.4 1470 NA

61 BD 44 F white MYOCARDITIS 37 6.37 1200 26

62 BD 56 F white DROWNING 26 6.58 1170 14

63 BD 43 F white SUIC:OD 39 6.74 1505 25

64 BD 35 M white DROWNING 22 6.58 1390 14




69 BD 50 F white SUIC:OD 62 6.51 1400 25



72 BD 49 F white OD 38 6.39 1190 20


74 BD 33 F white SUIC:HANGING 24 6.51 1450 15

75 SZ 54 F white PNEUMONIA 42 6.65 1170 17


77 C 33 F white ASTHMA 29 6.52 1360 NA

78 SZ 44 F white POSS PULM THROMB 26 6.58 1490 16



81 SZ 47 F white OD 30 6.47 1430 23

82 SZ 39 M white SUIC:HANGING 26 6.8 1470 34


85 SZ 38 M hispanic OD 35 6.68 1210 17



88 SZ 43 M white SUIC:HANGING 65 6.67 1490 25

89 BD 43 F white OD 57 5.92 1340 29




95 C 31 M white PULM EMBOL 11 6.13 1335 NA


98 BD 56 M white SUIC:OD 23 6.07 1670 28




102 BD 48 M white SUIC:HANGING 23 6.9 1466 31


104 BD 19 M white OD 12 5.97 1484 17


Table A2.1. Stanley sample demographics

Demographic information for brain samples obtained from the Stanley Medical Research Institute brain collection.

COD codes: 1 = heart attack/disease; 2 = Cancer; 3 = stroke, cerebrovasc dis; 4 = general arteriosclerosis; 5 = Complications; 6 = CO poisoning; 7 = Drowning; 8 = shooting/stabbing; 9 = drug OD; 10 = Trauma; 11 = pneumonia/resp infection; 12 = Sepsis; 13 = Other; 14 = dehydration, starvation; 15 = Hanging; 16 = Seizures; 17 = chronic obs. Pulmonary; 19 = Choking; 20 = Asphyxia; 21 = GI bleed; 22 = renal disease; 23 = smoke inhalation; a = Accidental; aaa = abdominal aortic aneurysm; asp = Aspiration; chf = congestive heart failure; E = Emphysema; fro = frozen to death; gan = gangrene, infection; gs = gunshot wound; inf = Infection; l&r = liver and renal failure; MI = myocardial infarction; mva = motor vehicle accident; PE = pulmonary embolism; pgl = pontine glioma; ref = respiratory failure; s = Suicide; ska = ski accident; sys = systemic failure; u = Unknown; vd = vascular dementia

Status codes: SZ = schizophrenia; SA = schizoaffective; BD = bipolar disorder; INR = insufficient records

ID Sex Status Age (decade) PMI Race Age of Onset COD

1003 F C 51-60 24h unknown na 2

1005 F C 71-80 12.5 unknown na l&r

1008 F C 61-70 22.5 white na 2

1011 M C 61-70 22.33 unknown na 1

1013 M C 31-40 18.75 white na a

1014 M C 31-40 20 unknown na MI

1020 M C 71-80 20.53 unknown na u

1021 M C 31-40 25.67 unknown na u

1022 M C 81-90 7.42 white na 2

1024 M C 71-80 20.92 white na 2

1025 F C 71-80 23.91 unknown na 2

1026 M C 31-40 28.83 unknown na MI

1028 F C 61-70 24.25 unknown na u

1029 F C 61-70 7.42 white na 2

1030 M C 41-50 18.33 unknown na MI

1032 M C 41-50 24.13 unknown na u

1034 M C 31-40 16.6 unknown na MI

1046 F C 71-80 14.1 white na MI

1047 M C 61-70 15.3 white na ref

1049 F C 61-70 15 unknown na 1

1055 M C 61-70 18.7 unknown na 1

1070 M C 61-70 16.05 unknown na 1

1071 F C 71-80 22.75 unknown na 22

1072 F C 71-80 18.5 unknown na u

1074 M C 41-50 27.23 unknown na u

1075 M C 41-50 20.61 unknown na MI

1078 F C 61-70 22.55 unknown na MI

1079 F C 71-80 22.67 unknown na MI

1080 M C 41-50 24.32 unknown na MI

1086 M C 51-60 21.83 unknown na u

1087 F C 51-60 23.08 unknown na u

1088 M C 41-50 30.4 unknown na 1

1093 M C 61-70 29.36 unknown na u

1099 M C 71-80 21.25 unknown na u

1107 F C 71-80 20.3 unknown na u

1110 M C 41-50 27.13 unknown na 1

1111 F C 51-60 23.78 unknown na u

1117 M C 61-70 20.97 unknown na 1

1118 M C 41-50 14.68 unknown na u

1123 F C 71-80 26.67 unknown na u

1124 M C 51-60 24.42 unknown na 1

1125 M C 41-50 19.88 unknown na MI

1127 F C 51-60 24.25 unknown na 1

1128 M C 71-80 25.23 unknown na u

1129 M C 11-20 19.83 unknown na 1

1132 F C 31-40 18.08 unknown na u

1135 F C 81-90 17.42 white na 2

1137 M C 51-60 18.15 unknown na MI

1139 M C 51-60 21.88 unknown na 1

1141 F C 41-50 20.25 unknown na 1

1004 M SZ 61-70 19.9 unknown na u

1009 F SZ 71-80 24 white na 2

1010 F BD 71-80 20.83 white na 12

1012 M BD 71-80 14.25 white 18 11

1015 F BD 71-80 17 white 35 u

1016 M SZ 61-70 22.35 white na 1

1017 F SA 51-60 18 white na 2

1018 M BD 31-40 30.75 white na s

1019 M BD 31-40 22 white na 6

1027 M BD 71-80 27.66 white 20 u

1031 M BD 81-90 5.02 white na u

1036 M SZ 41-50 19 white na 11

1037 M SZ 31-40 28 white na 1

1038 M SZ 41-50 18.1 white na 6

1039 F SA 71-80 13.4 white na u

1041 M BD 71-80 30.2 white na u

1043 M SZ 41-50 27.1 white na 2

1045 F BD 41-50 15.8 white na u

1048 F BD 71-80 11.6 white na 3

1051 F BD 61-70 11 white na E

1052 F SZ 81-90 23.25 white na u

1053 M BD 31-40 41.5 white na gs

1054 F SA 81-90 25.75 white na chf

1056 M BD 51-60 31 white na u

1059 F BD 71-80 22.8 white na 1

1060 F SZ 71-80 21.75 white na 2

1061 F SA 41-50 33.78 white na 1

1064 M BD 71-80 24.8 white 52 11

1065 M SZ 41-50 19.08 white 19 s

1069 M SZ 41-50 24.5 white na MI

1073 F SZ 81-90 15.67 white na u

1076 M SZ 61-70 16.47 white 20 11

1077 M SZ 41-50 29.06 white na 2

1081 M SZ 61-70 15.95 white na E

1083 F BD 71-80 19.86 white 20 11

1084 F BD 81-90 14.08 white 35 u

1085 F BD 21-30 24.17 white 14 9

1089 F SZ 61-70 27.8 white 10 MI

1090 F BD 71-80 33.33 unknown na u

1092 F BD 41-50 16.25 white 18 u

1094 M SZ 41-50 17.67 white 2 MI

1095 M SZ 61-70 25.33 white na 12

1096 M SZ 51-60 32.38 white 20 chf

1097 M SZ 41-50 17.75 white 20 2

1098 F SZ 51-60 16.12 white na 2

1100 M SZ 51-60 24.53 white 18 19

1101 F BD 71-80 22.62 white 22 2

1102 F SZ 71-80 28.8 white na 2

1103 M BD 61-70 27.17 unknown na ref

1104 M SZ 61-70 21.43 white 15 MI

1105 M INR 11-20 17.5 unknown na mva

1106 M SZ 51-60 20.08 white 21 MI

1109 F BD 61-70 25.3 white 23 11

1112 M BD 61-70 29.48 white 56 11

1113 F SZ 61-70 11 white 21 MI

1114 F BD 71-80 21.63 white 20 2

1115 M SZ 41-50 33.25 white 19 u

1116 F BD 61-70 13.37 unknown 50 u

1119 M BD 61-70 17.25 white 27 22

1120 M SZ 51-60 38.25 unknown 19 u

1121 F BD 31-40 21.92 white 22 u

1122 F BD 51-60 17.22 white 16 l&r

1126 F SZ 51-60 18.72 white na 2

1130 M BD 21-30 19.83 white 19 s

1131 F BD 71-80 22.92 white 50 u

1134 F BD 71-80 24.75 white 20 14

1136 F BD 51-60 30.1 white na 1

1140 M SZ 41-50 32.5 white 18 s

1142 F BD 71-80 21.46 unknown na u

Table A2.2. Harvard sample demographics

Demographic information for brain samples obtained from the Harvard Brain Tissue Resource Center.

ID Status Age Additional Conditions Smoke Medication Ethnicity

C15 control 29 none no none East Indian

C16 control 34 none no none Caucasian

C28 control 40 cleft palate surgery x3 no none Caucasian

C44 control 42 none no none Asian


C46 control 30 childhood heart murmur yes none Caucasian


C57 control 63 osteoarthritis, GERD no Losec, Pregabalin Caucasian

C58 control 26 none no none Asian








C94 control 44 chronic back pain, injury no NSAIDs, morphine, muscle relaxants Caucasian

C95 control 38 tremor in hands yes propranolol Caucasian

C102 control 50 hepatitis C, alcoholic yes none Native


C104 control 41 none yes Cephalexin Mixed

C110 control 33 none no none Mixed



S1 bipolar 35 none no Seroquel, Tegretol, lithium Caucasian

S4 bipolar 44 penicillin allergy no Lithium, Epival, Zyprexa Caucasian

S22 bipolar 55 alcoholic no none Caucasian

S23 bipolar 24 learning disability, migraine no Epival, Seroquel Caucasian

S27 bipolar 42 Seizures - post-concussion no Epival Caucasian

S37 bipolar 22 none no Epival, Zoloft, clonazepam Caucasian

S41 bipolar 59 none no none Caucasian

S45 bipolar 32 none no Lamictal Caucasian

S46 bipolar 27 depression, suicidal yes previously took Zyprexa, Cipralex Caucasian

S47 bipolar 38 spinal disc herniation no Abilify, Lyrica, Lithium, Cymbalta, Seroquel, Topamax Caucasian

S53 bipolar 46 none no none Caucasian

S55 bipolar 36 panic disorder, learning disability, IBS, hemorrhoid no Zyprexa, Clonazepam Caucasian

S60 bipolar 63 gambling, sleep apnea, hearing loss no Wellbutrin, Seroquel, Lithium Caucasian

S61 bipolar 46 none no Risperdal, Lithium, Cogentin Mixed

S62 bipolar 35 cholesterol, sleep apnea, COPD yes Seroquel Caucasian

S68 bipolar 21 ADD, OCD, GAD, mild asthma no Invega, Adderall, previously used Seroquel and Paxil Caucasian

S72 bipolar 54 diabetes II, anxiety, hypertension no Lamictal, Ativan, Cozaar, Lipitor, Januvia, Diamicron, Imovane, Levemir Caucasian

S73 bipolar 42 none yes Epival, Risperdal injections, Zyprexa, Seroquel Caucasian

S85 bipolar 24 depression yes none, previously used Seroquel Caucasian

S86 bipolar 23 none yes Epival Caucasian

S89 bipolar 44 none yes methadone Caucasian

S90 bipolar 45 migraine, varicose veins no Ibuprofen Hispanic

S92 bipolar 46 none yes Loxapine, Celexa, lorazepam Caucasian

S95 bipolar 22 none yes Olanzapine, acetaminophen Mixed

Table A2.3. CAMH sample demographics

Demographic information for sperm cell samples obtained from the Centre for Addiction and Mental Health.

Locus Sample genotype CpG 1 CpG 2 CpG 3 CpG 4 Average

SNP_A-4222947 88 BB 87 49 68

100 BB 87 50 68.5

1009 BB 87 49 68

1049 BB 87 50 68.5

1111 BB 87 51 69

33 AA 85 73 79

1012 AA 87 74 80.5

1065 AA 86 74 80

1093 AA 87 74 80.5

1099 AA 88 74 81

SNP_A-8623123 80 AA 97 99 95 93 96

1009 AA 97 100 100 95 98

1061 AA 97 99 98 94 97

1088 AA 98 99 99 95 97.75

1093 AA 97 100 100 96 98.25

3 BB 96 99 99 95 97.25

1027 BB 97 100 97 93 96.75

1094 BB 96 100 99 95 97.5

1099 BB 97 100 99 92 97

1132 BB 95 100 99 95 97.25

SNP_A-8697241 90 AA 4 14 8 8.67

1009 AA 4 15 8 9.00

1101 AA 4 15 8 9.00

1117 AA 3 15 8 8.67

1127 AA 4 15 8 9.00

103 BB 24 95 59 59.33

1037 BB 25 97 59 60.33

1064 BB 25 98 60 61.00

1088 BB 24 97 57 59.33

1132 BB 24 96 57 59.00

SNP_A-1878011 1049 AA 63 63

1099 AA 62 62

1037 AA 64 64

1060 AA 65 65

1120 AA 64 64

28 AB 64 64

87 AB 66 66

100 AB 63 63

SNP_A-8529885 94 AA 100 100

1017 AA 100 100

1056 AA 100 100

1117 AA 99 99

1126 AA 100 100

1093 BB 100 100

1099 BB 100 100

1111 BB 100 100

1119 BB 100 100

1120 BB 100 100

SNP_A-4259932 70 AB 100 100

67 BB 98 98

73 BB 100 100

1061 BB 100 100

1077 BB 100 100

1117 BB 100 100

Table A2.4. Methylation levels at all CpG sites

The results from the bisulfite modification and pyrosequencing validation of epiSNPs and non-epiSNPs. Three

highly significant epiSNPs displaying large differences between AA and BB variants were chosen; three non-

epiSNPs were chosen randomly. For each locus, the individual sample codes are listed with their respective

genotypes. A and B designations were assigned arbitrarily during analysis. Methylation percentages returned from

the pyrosequencing are listed for each CpG site; samples had up to 4 sites available for interrogation on the

pyrosequencing amplicon. The far right column lists the average methylation percentage for each sample across the

locus.
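The Average column and the genotype contrast can be reproduced from the per-CpG values. The sketch below does this for SNP_A-4222947 using the percentages listed in the table; the function and variable names are illustrative only.

```python
# CpG methylation percentages for SNP_A-4222947, copied from Table A2.4.
# Keys are sample codes; values are (genotype, [CpG 1, CpG 2]).
samples = {
    "88": ("BB", [87, 49]),
    "100": ("BB", [87, 50]),
    "1009": ("BB", [87, 49]),
    "1049": ("BB", [87, 50]),
    "1111": ("BB", [87, 51]),
    "33": ("AA", [85, 73]),
    "1012": ("AA", [87, 74]),
    "1065": ("AA", [86, 74]),
    "1093": ("AA", [87, 74]),
    "1099": ("AA", [88, 74]),
}


def sample_average(cpg_values):
    """Average methylation percentage across the CpG sites of one sample."""
    return sum(cpg_values) / len(cpg_values)


def genotype_means(data):
    """Mean of the per-sample averages within each genotype group."""
    by_genotype = {}
    for genotype, cpgs in data.values():
        by_genotype.setdefault(genotype, []).append(sample_average(cpgs))
    return {g: sum(v) / len(v) for g, v in by_genotype.items()}


means = genotype_means(samples)
difference = means["AA"] - means["BB"]

print(f"mean AA = {means['AA']:.1f}%, mean BB = {means['BB']:.1f}%, "
      f"difference = {difference:.1f} percentage points")
```

The per-sample averages computed this way match the Average column of the table for this locus.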

Table A2.5. EpiSNPs and associated gene information

Column 1: epiSNPs discussed in the Results section. Column 2: brain cohorts displaying the epiSNP. Column 3:

sperm cohorts displaying the epiSNP. Column 4: functional class tag of the epiSNP. Column 5: associated gene

name. Column 6: full gene name and function.


Table A2.6. 454 analysis sample genotypes. Number of samples per genotype and cohort are listed per investigated SNP. Random non-epiSNPs were labeled from 1 to 25, 11 of which were chosen for sequencing based on primer set performance.

SNP     dbSNP ID  Case AA  Case BB  Control AA  Control BB
SNP 1   10975882  10       10       9           11
SNP 2   5943127   14       6        14          4
SNP 7   11658063  10       10       10          10
SNP 11  3762352   9        9        10          11
SNP 12  219815    10       10       10          10
SNP 15  17551103  15       5        15          5
SNP 18  2859011   10       10       10          10
SNP 19  2059697   9        10       10          10
SNP 21  720080    13       7        14          7
SNP 23  2581651   16       4        15          4
SNP 25  1902675   10       10       10          10


Copyright Acknowledgements

The twin study was published by Nature [375], and their statement regarding use of published material by an author is listed below. Permission to use the paper has also been granted by Zach Kaminsky (first author), and by Art Petronis (last author).

Portions of the literature review were taken from review articles that I have previously published with Annual Reviews of Pharmacology and Toxicology [376] and Dialogues in Clinical Neuroscience [377] Copyright © Les Laboratoires Servier 2010. Permission to use the material has been provided by Art Petronis (co-author of both articles). Copyright statements from the journals are provided below.

From: http://www.nature.com/reprints/permission-requests.html

The authors of articles published by Nature Publishing Group, or the authors' designated agents, do not usually need to seek permission for re-use of their material as long as the journal is credited with initial publication. For further information about the terms of re-use for authors please see below.

Author Requests

If you are the author of this content (or his/her designated agent) please read the following. Since 2003, ownership of copyright in original research articles remains with the Authors*, and provided that, when reproducing the Contribution or extracts from it, the Authors acknowledge first and reference publication in the Journal, the Authors retain the following non-exclusive rights:

1. To reproduce the Contribution in whole or in part in any printed volume (book or thesis) of which they are the author(s).

2. They and any academic institution where they work at the time may reproduce the Contribution for the purpose of course teaching.

3. To reuse figures or tables created by them and contained in the Contribution in other works created by them.

4. To post a copy of the Contribution as accepted for publication after peer review (in Word or Tex format) on the Author's own web site, or the Author's institutional repository, or the Author's funding body's archive, six months after publication of the printed or online edition of the Journal, provided that they also link to the Journal article on NPG's web site (eg through the DOI).

NPG encourages the self-archiving of the accepted version of your manuscript in your funding agency's or institution's repository, six months after publication. This policy complements the recently announced policies of the US National Institutes of Health, Wellcome Trust and other research funding bodies around the world. NPG recognizes the efforts of funding bodies to increase access to the research they fund, and we strongly encourage authors to participate in such efforts.


From: http://www.annualreviews.org/page/about/copyright-and-permissions

Annual Reviews Authors: There is no need to obtain permission from Annual Reviews for the use of your own work(s). Our copyright transfer agreement provides you with all the necessary permissions.

Our copyright transfer agreement provides: “..The nonexclusive right to use, reproduce, distribute, perform, update, create derivatives, and make copies of the work (electronically or in print) in connection with the author’s teaching, conference presentations, lectures, and publications, provided proper attribution is given...”

A note of permission (email) was received from Dialogues in Clinical Neuroscience:

From: [email protected] [[email protected]]

Sent: Thursday, February 23, 2012 5:03 AM

To: Carolyn Ptak

Cc: [email protected]

Subject: RE: Permission to use copyrighted material in a doctoral thesis (by article author)

Hi Carolyn,

Yes, that will be fine. You may have permission, provided that you include the complete citation of the work, with Copyright © Les Laboratoires Servier 2010, and the annotation that parts of the text have previously appeared in the publication mentioned.

Once you have completed the work, could you please send us the link to the online version? Thanks.

Best wishes,

Catriona



References

1. Kennedy, D., Breakthrough of the year. Science, 2007. 318(5858): p. 1833.

2. Esteller, M., The necessity of a human epigenome project. Carcinogenesis, 2006. 27(6):

p. 1121-5.

3. Martin, N., D. Boomsma, and G. Machin, A twin-pronged attack on complex traits. Nat

Genet, 1997. 17(4): p. 387-92.

4. Robertson, K.D. and A.P. Wolffe, DNA methylation in health and disease. Nat Rev

Genet, 2000. 1(1): p. 11-9.

5. Riggs, A.D., et al., Methylation dynamics, epigenetic fidelity and X chromosome

structure. Novartis Found Symp, 1998. 214: p. 214-25; discussion 225-32.

6. Ushijima, T., et al., Fidelity of the methylation pattern and its variation in the genome.

Genome Res, 2003. 13(5): p. 868-74.

7. Jaenisch, R. and A. Bird, Epigenetic regulation of gene expression: how the genome

integrates intrinsic and environmental signals. Nat Genet, 2003. 33 Suppl: p. 245-54.

8. Jirtle, R.L. and M.K. Skinner, Environmental epigenomics and disease susceptibility. Nat

Rev Genet, 2007. 8(4): p. 253-62.

9. Wong, A.H., Gottesman, II, and A. Petronis, Phenotypic differences in genetically

identical organisms: the epigenetic perspective. Hum Mol Genet, 2005. 14 Spec No 1: p.

R11-8.

10. Petronis, A., et al., Monozygotic twins exhibit numerous epigenetic differences: clues to

twin discordance? Schizophr Bull, 2003. 29(1): p. 169-78.

11. Kuratomi, G., et al., Aberrant DNA methylation associated with bipolar disorder

identified from discordant monozygotic twins. Mol Psychiatry, 2008. 13(4): p. 429-41.

12. Heijmans, B.T., et al., Heritable rather than age-related environmental and stochastic

factors dominate variation in DNA methylation of the human IGF2/H19 locus. Hum Mol

Genet, 2007. 16(5): p. 547-54.

13. Oates, N.A., et al., Increased DNA methylation at the AXIN1 gene in a monozygotic twin

from a pair discordant for a caudal duplication anomaly. Am J Hum Genet, 2006. 79(1):

p. 155-62.

14. Fraga, M.F., et al., Epigenetic differences arise during the lifetime of monozygotic twins.

Proc Natl Acad Sci U S A, 2005. 102(30): p. 10604-9.

15. Schellenberg, G.D. and T.J. Montine, The genetics and neuropathology of Alzheimer's

disease. Acta Neuropathol, 2012.

16. Hebert-Schuster, M., E.E. Fabre, and V. Nivet-Antoine, Catalase polymorphisms and

metabolic diseases. Curr Opin Clin Nutr Metab Care, 2012.

17. Lu, Y., et al., TGFB1 genetic polymorphisms and coronary heart disease risk: a

meta-analysis. BMC Med Genet, 2012. 13(1): p. 39.

18. Petronis, A., Epigenetics as a unifying principle in the aetiology of complex traits and

diseases. Nature, 2010. 465(7299): p. 721-7.

19. Kerkel, K., et al., Genomic surveys by methylation-sensitive SNP analysis identify

sequence-dependent allele-specific DNA methylation. Nat Genet, 2008. 40(7): p. 904-8.

20. Yan, H., et al., Allelic variation in human gene expression. Science, 2002. 297(5584): p.

1143.


21. Schalkwyk, L.C., et al., Allelic skewing of DNA methylation is widespread across the

genome. Am J Hum Genet, 2010. 86(2): p. 196-212.

22. Milani, L., et al., Allele-specific gene expression patterns in primary leukemic cells

reveal regulation of gene expression by CpG site methylation. Genome Res, 2009. 19(1):

p. 1-11.

23. Hawkins, N.J., et al., MGMT methylation is associated primarily with the germline C>T

SNP (rs16906252) in colorectal cancer and normal colonic mucosa. Mod Pathol, 2009.

22(12): p. 1588-99.

24. Candiloro, I.L. and A. Dobrovic, Detection of MGMT promoter methylation in normal

individuals is strongly associated with the T allele of the rs16906252 MGMT promoter

single nucleotide polymorphism. Cancer Prev Res (Phila), 2009. 2(10): p. 862-7.

25. Vawter, M.P., F. Mamdani, and F. Macciardi, An integrative functional genomics

approach for discovering biomarkers in schizophrenia. Brief Funct Genomics, 2011.

10(6): p. 387-99.

26. Daxinger, L. and E. Whitelaw, Understanding transgenerational epigenetic inheritance

via the gametes in mammals. Nat Rev Genet, 2012. 13(3): p. 153-62.

27. Richards, E.J., Inherited epigenetic variation--revisiting soft inheritance. Nat Rev Genet,

2006. 7(5): p. 395-401.

28. Boomsma, D., A. Busjahn, and L. Peltonen, Classical twin studies and beyond. Nat Rev

Genet, 2002. 3(11): p. 872-82.

29. Cedar, H. and Y. Bergman, Programming of DNA Methylation Patterns. Annu Rev

Biochem, 2012.

30. Perera, F. and J. Herbstman, Prenatal environmental exposures, epigenetics, and disease.

Reprod Toxicol, 2011. 31(3): p. 363-73.

31. Guo, S.W., The endometrial epigenome and its response to steroid hormones. Mol Cell

Endocrinol, 2012. 358(2): p. 185-96.

32. Mill, J., et al., Epigenomic profiling reveals DNA-methylation changes associated with

major psychosis. Am J Hum Genet, 2008. 82(3): p. 696-711.

33. Henikoff, S. and M.A. Matzke, Exploring and explaining epigenetic effects. Trends

Genet, 1997. 13(8): p. 293-5.

34. Margueron, R., P. Trojer, and D. Reinberg, The key to development: interpreting the

histone code? Curr Opin Genet Dev, 2005. 15(2): p. 163-76.

35. Thiagalingam, S., et al., Histone deacetylases: unique players in shaping the epigenetic

histone code. Ann N Y Acad Sci, 2003. 983: p. 84-100.

36. Klose, R.J. and Y. Zhang, Regulation of histone methylation by demethylimination and

demethylation. Nat Rev Mol Cell Biol, 2007. 8(4): p. 307-18.

37. Tahiliani, M., et al., Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in

mammalian DNA by MLL partner TET1. Science, 2009. 324(5929): p. 930-5.

38. Williams, K., et al., TET1 and hydroxymethylcytosine in transcription and DNA

methylation fidelity. Nature, 2011. 473(7347): p. 343-8.

39. Hershey, A.D., J. Dixon, and M. Chase, Nucleic acid economy in bacteria infected with

bacteriophage T2. I. Purine and pyrimidine composition. J Gen Physiol, 1953. 36(6): p.

777-89.

40. Kriaucionis, S. and N. Heintz, The nuclear DNA base 5-hydroxymethylcytosine is present

in Purkinje neurons and the brain. Science, 2009. 324(5929): p. 929-30.

41. Song, C.X., et al., Selective chemical labeling reveals the genome-wide distribution of

5-hydroxymethylcytosine. Nat Biotechnol, 2011. 29(1): p. 68-72.


42. Wu, S.C. and Y. Zhang, Active DNA demethylation: many roads lead to Rome. Nat Rev

Mol Cell Biol, 2010. 11(9): p. 607-20.

43. Andersen, I.S., et al., Epigenetic complexity during the zebrafish mid-blastula transition.

Biochem Biophys Res Commun, 2012. 417(4): p. 1139-44.

44. Ishikawa, K., E. Fukuda, and I. Kobayashi, Conflicts targeting epigenetic systems and

their resolution by cell death: novel concepts for methyl-specific and other restriction

systems. DNA Res, 2010. 17(6): p. 325-42.

45. Nakanishi, M.O., et al., Trophoblast-specific DNA methylation occurs after the

segregation of the trophectoderm and inner cell mass in the mouse periimplantation

embryo. Epigenetics, 2012. 7(2): p. 173-82.

46. Alabert, C. and A. Groth, Chromatin replication and epigenome maintenance. Nat Rev

Mol Cell Biol, 2012. 13(3): p. 153-67.

47. Schroeder, J.W., et al., Neonatal DNA methylation patterns associate with gestational

age. Epigenetics, 2011. 6(12): p. 1498-504.

48. Weksberg, R., et al., Beckwith-Wiedemann syndrome demonstrates a role for epigenetic

control of normal development. Hum Mol Genet, 2003. 12 Spec No 1: p. R61-8.

49. Baylin, S.B. and J.G. Herman, DNA hypermethylation in tumorigenesis: epigenetics joins

genetics. Trends Genet, 2000. 16(4): p. 168-74.

50. Jones, P.A. and P.W. Laird, Cancer epigenetics comes of age. Nat Genet, 1999. 21(2): p.

163-7.

51. Chen, H., T.M. Hardy, and T.O. Tollefsbol, Epigenomics of ovarian cancer and its

chemoprevention. Front Genet, 2011. 2: p. 67.

52. Petronis, A., Human morbid genetics revisited: relevance of epigenetics. Trends Genet,

2001. 17(3): p. 142-6.

53. Reiss, D., R. Plomin, and E.M. Hetherington, Genetics and psychiatry: an unheralded

window on the environment. Am J Psychiatry, 1991. 148(3): p. 283-91.

54. Kaprio, J. and M. Koskenvuo, Cigarette smoking as a cause of lung cancer and coronary

heart disease. A study of smoking-discordant twin pairs. Acta Genet Med Gemellol

(Roma), 1990. 39(1): p. 25-34.

55. Chen, C.J., et al., Environmental effects on cardiovascular risk factors in Chinese

adolescent monozygotic twins. Acta Genet Med Gemellol (Roma), 1984. 33(3): p. 375-81.

56. Ingrosso, D., et al., Folate treatment and unbalanced methylation and changes of allelic

expression induced by hyperhomocysteinaemia in patients with uraemia. Lancet, 2003.

361(9370): p. 1693-9.

57. Wolff, G.L., et al., Maternal epigenetics and methyl supplements affect agouti gene

expression in Avy/a mice. FASEB J, 1998. 12(11): p. 949-57.

58. Dunlevy, L.P., et al., Integrity of the methylation cycle is essential for mammalian neural

tube closure. Birth Defects Res A Clin Mol Teratol, 2006. 76(7): p. 544-52.

59. Wang, L., et al., Relation between hypomethylation of long interspersed nucleotide

elements and risk of neural tube defects. Am J Clin Nutr, 2010. 91(5): p. 1359-67.

60. Cooney, C.A., A.A. Dave, and G.L. Wolff, Maternal methyl supplements in mice affect

epigenetic variation and DNA methylation of offspring. J Nutr, 2002. 132(8 Suppl): p.

2393S-2400S.

61. Waterland, R.A. and R.L. Jirtle, Transposable elements: targets for early nutritional

effects on epigenetic gene regulation. Mol Cell Biol, 2003. 23(15): p. 5293-300.


62. Tchantchou, F., et al., S-adenosylmethionine mediates glutathione efficacy by increasing

glutathione S-transferase activity: implications for S-adenosyl methionine as a

neuroprotective dietary supplement. J Alzheimers Dis, 2008. 14(3): p. 323-8.

63. Kharbanda, K.K., Alcoholic liver disease and methionine metabolism. Semin Liver Dis,

2009. 29(2): p. 155-65.

64. Christensen, B.C., et al., Epigenetic profiles distinguish pleural mesothelioma from

normal pleura and predict lung asbestos burden and clinical outcome. Cancer Res, 2009.

69(1): p. 227-34.

65. Majumdar, S., et al., Arsenic exposure induces genomic hypermethylation. Environ

Toxicol, 2010. 25(3): p. 315-8.

66. Weaver, I.C., et al., Epigenetic programming by maternal behavior. Nat Neurosci, 2004.

7(8): p. 847-54.

67. Petronis, A., Epigenetics and twins: three variations on the theme. Trends Genet, 2006.

22(7): p. 347-50.

68. Weksberg, R., et al., Discordant KCNQ1OT1 imprinting in sets of monozygotic twins

discordant for Beckwith-Wiedemann syndrome. Hum Mol Genet, 2002. 11(11): p. 1317-25.

69. Mill, J., et al., Evidence for monozygotic twin (MZ) discordance in methylation level at

two CpG sites in the promoter region of the catechol-O-methyltransferase (COMT) gene.

Am J Med Genet B Neuropsychiatr Genet, 2006. 141B(4): p. 421-5.

70. Matzke, M.A. and A.J. Matzke, Cloning problems don't surprise plant biologists.

Science, 2000. 288(5475): p. 2318b.

71. Morgan, H.D., et al., Epigenetic inheritance at the agouti locus in the mouse. Nat Genet,

1999. 23(3): p. 314-8.

72. Iida, T., et al., PCNA clamp facilitates action of DNA cytosine methyltransferase 1 on

hemimethylated DNA. Genes Cells, 2002. 7(10): p. 997-1007.

73. Vilkaitis, G., et al., Processive methylation of hemimethylated CpG sites by mouse Dnmt1

DNA methyltransferase. J Biol Chem, 2005. 280(1): p. 64-72.

74. Hassler, M.R. and G. Egger, Epigenomics of cancer - emerging new concepts. Biochimie,

2012.

75. Faria, C.M., et al., Epigenetic mechanisms regulating neural development and pediatric

brain tumor formation. J Neurosurg Pediatr, 2011. 8(2): p. 119-32.

76. Seeman, M.V., Psychopathology in women and men: focus on female hormones. Am J

Psychiatry, 1997. 154(12): p. 1641-7.

77. Kaminsky, Z., S.C. Wang, and A. Petronis, Complex disease, gender and epigenetics.

Ann Med, 2006. 38(8): p. 530-44.

78. Ohara, K., et al., Anticipation and imprinting in schizophrenia. Biol Psychiatry, 1997.

42(9): p. 760-6.

79. Guo, Y.F., et al., Assessment of genetic linkage and parent-of-origin effects on obesity. J

Clin Endocrinol Metab, 2006. 91(10): p. 4001-5.

80. Bassett, S.S., D. Avramopoulos, and D. Fallin, Evidence for parent of origin effect in

late-onset Alzheimer disease. Am J Med Genet, 2002. 114(6): p. 679-86.

81. Demenais, F., V. Chaudru, and M. Martinez, Detection of parent-of-origin effects for

atopy by model-free and model-based linkage analyses. Genet Epidemiol, 2001. 21

Suppl 1: p. S186-91.

82. Lamb, J.A., et al., Analysis of IMGSAC autism susceptibility loci: evidence for sex limited

and parent of origin specific effects. J Med Genet, 2005. 42(2): p. 132-7.


83. Camprubi, C. and D. Monk, Does genomic imprinting play a role in autoimmunity? Adv

Exp Med Biol, 2011. 711: p. 103-16.

84. Schulze, T.G., et al., Additional, physically ordered markers increase linkage signal for

bipolar disorder on chromosome 18q22. Biol Psychiatry, 2003. 53(3): p. 239-43.

85. Hall, J.G., Genomic imprinting: review and relevance to human diseases. Am J Hum

Genet, 1990. 46(5): p. 857-73.

86. Barlow, D.P., Gametic imprinting in mammals. Science, 1995. 270(5242): p. 1610-3.

87. Delaval, K., A. Wagschal, and R. Feil, Epigenetic deregulation of imprinting in

congenital diseases of aberrant growth. Bioessays, 2006. 28(5): p. 453-9.

88. Tomizawa, S.I. and H. Sasaki, Genomic imprinting and its relevance to congenital

disease, infertility, molar pregnancy and induced pluripotent stem cell. J Hum Genet,

2012.

89. Sutherland, J.E. and M. Costa, Epigenetics and the environment. Ann N Y Acad Sci,

2003. 983: p. 151-60.

90. Petronis, A., The origin of schizophrenia: genetic thesis, epigenetic antithesis, and

resolving synthesis. Biol Psychiatry, 2004. 55(10): p. 965-70.

91. O'Sullivan, L., et al., Epigenetics and developmental programming of adult onset

diseases. Pediatr Nephrol, 2012.

92. Fuke, C., et al., Age related changes in 5-methylcytosine content in human peripheral

leukocytes and placentas: an HPLC-based study. Ann Hum Genet, 2004. 68(Pt 3): p.

196-204.

93. van den Toorn, L.M., et al., Asthma remission: does it exist? Curr Opin Pulm Med, 2003.

9(1): p. 15-20.

94. Faraone, S.V., J. Biederman, and E. Mick, The age-dependent decline of attention deficit

hyperactivity disorder: a meta-analysis of follow-up studies. Psychol Med, 2006. 36(2):

p. 159-65.

95. Esteller, M., Cancer epigenomics: DNA methylomes and histone-modification maps. Nat

Rev Genet, 2007. 8(4): p. 286-98.

96. Ting Hsiung, D., et al., Global DNA methylation level in whole blood as a biomarker in

head and neck squamous cell carcinoma. Cancer Epidemiol Biomarkers Prev, 2007.

16(1): p. 108-14.

97. Ehrlich, M., DNA methylation in cancer: too much, but also too little. Oncogene, 2002.

21(35): p. 5400-13.

98. Fraga, M.F., et al., Loss of acetylation at Lys16 and trimethylation at Lys20 of histone H4

is a common hallmark of human cancer. Nat Genet, 2005. 37(4): p. 391-400.

99. Pogribny, I.P., et al., Histone H3 lysine 9 and H4 lysine 20 trimethylation and the

expression of Suv4-20h2 and Suv-39h1 histone methyltransferases in

hepatocarcinogenesis induced by methyl deficiency in rats. Carcinogenesis, 2006. 27(6):

p. 1180-6.

100. Valdes-Mora, F., et al., Acetylation of H2A.Z is a key epigenetic modification associated

with gene deregulation and epigenetic remodeling in cancer. Genome Res, 2012. 22(2):

p. 307-21.

101. Laird, P.W., Cancer epigenetics. Hum Mol Genet, 2005. 14 Spec No 1: p. R65-76.

102. Baylin, S. and T.H. Bestor, Altered methylation patterns in cancer cell genomes: cause or

consequence? Cancer Cell, 2002. 1(4): p. 299-305.


103. Veldic, M., et al., Epigenetic mechanisms expressed in basal ganglia GABAergic neurons

differentiate schizophrenia from bipolar disorder. Schizophr Res, 2007. 91(1-3): p. 51-61.

104. Hogart, A., et al., 15q11-13 GABAA receptor genes are normally biallelically expressed

in brain yet are subject to epigenetic dysregulation in autism-spectrum disorders. Hum

Mol Genet, 2007. 16(6): p. 691-703.

105. Deng, V., et al., FXYD1 is a MeCP2 target gene overexpressed in the brains of Rett

syndrome patients and Mecp2-null mice. Hum Mol Genet, 2007.

106. Hagerman, R.J., M.Y. Ono, and P.J. Hagerman, Recent advances in fragile X: a model

for autism and neurodegeneration. Curr Opin Psychiatry, 2005. 18(5): p. 490-6.

107. Ivleva, E., G. Thaker, and C.A. Tamminga, Comparing genes and phenomenology in the

major psychoses: schizophrenia and bipolar 1 disorder. Schizophr Bull, 2008. 34(4): p.

734-42.

108. Andreasen, N.C., Symptoms, signs, and diagnosis of schizophrenia. Lancet, 1995.

346(8973): p. 477-81.

109. Bauer, M., S. Kasper, and M. Willeit, Is dopamine neurotransmission altered in

prodromal schizophrenia? A review of the evidence. Curr Pharm Des, 2012.

110. Akhondzadeh, S., The 5-HT hypothesis of schizophrenia. IDrugs, 2001. 4(3): p. 295-300.

111. Kantrowitz, J. and D.C. Javitt, Glutamatergic transmission in schizophrenia: from basic

research to clinical practice. Curr Opin Psychiatry, 2012. 25(2): p. 96-102.

112. Kuroki, T., N. Nagao, and T. Nakahara, Neuropharmacology of second-generation

antipsychotic drugs: a validity of the serotonin-dopamine hypothesis. Prog Brain Res,

2008. 172: p. 199-212.

113. Rao, J.S., et al., Dysregulated glutamate and dopamine transporters in postmortem

frontal cortex from bipolar and schizophrenic patients. J Affect Disord, 2012. 136(1-2):

p. 63-71.

114. Haukvik, U.K., et al., Cortical folding in Broca's area relates to obstetric complications

in schizophrenia patients and healthy controls. Psychol Med, 2011: p. 1-9.

115. Roseboom, T.J., et al., Hungry in the womb: what are the consequences? Lessons from

the Dutch famine. Maturitas, 2011. 70(2): p. 141-5.

116. Schmidt-Kastner, R., et al., An environmental analysis of genes associated with

schizophrenia: hypoxia and vascular factors as interacting elements in the

neurodevelopmental model. Mol Psychiatry, 2012.

117. Kneeland, R.E. and S.H. Fatemi, Viral infection, inflammation and schizophrenia. Prog

Neuropsychopharmacol Biol Psychiatry, 2012.

118. Reininghaus, U., et al., Ethnic identity, perceptions of disadvantage, and psychosis:

findings from the AESOP study. Schizophr Res, 2010. 124(1-3): p. 43-8.

119. Benros, M.E., et al., Autoimmune diseases and severe infections as risk factors for

schizophrenia: a 30-year population-based register study. Am J Psychiatry, 2011.

168(12): p. 1303-10.

120. Fiorentini, A., et al., Substance-induced psychoses: a critical review of the literature.

Curr Drug Abuse Rev, 2011. 4(4): p. 228-40.

121. Craddock, N., M.C. O'Donovan, and M.J. Owen, The genetics of schizophrenia and

bipolar disorder: dissecting psychosis. J Med Genet, 2005. 42(3): p. 193-204.

122. Craddock, N. and I. Jones, Genetics of bipolar disorder. J Med Genet, 1999. 36(8): p.

585-94.


123. Bertelsen, A. and Gottesman, II, Schizoaffective psychoses: genetical clues to

classification. Am J Med Genet, 1995. 60(1): p. 7-11.

124. Cardno, A.G. and Gottesman, II, Twin studies of schizophrenia: from bow-and-arrow

concordances to star wars Mx and functional genomics. Am J Med Genet, 2000. 97(1): p.

12-7.

125. O'Donovan, M.C., N.J. Craddock, and M.J. Owen, Genetics of psychosis; insights from

views across the genome. Hum Genet, 2009. 126(1): p. 3-12.

126. Sanders, A.R., et al., No significant association of 14 candidate genes with schizophrenia

in a large European ancestry sample: implications for psychiatric genetics. Am J

Psychiatry, 2008. 165(4): p. 497-506.

127. Kerner, B., C.G. Lambert, and B.O. Muthen, Genome-wide association study in bipolar

patients stratified by co-morbidity. PLoS One, 2011. 6(12): p. e28477.

128. Carrera, N., et al., Association study of nonsynonymous single nucleotide polymorphisms

in schizophrenia. Biol Psychiatry, 2012. 71(2): p. 169-77.

129. Rietschel, M., et al., Association between genetic variation in a region on chromosome

11 and schizophrenia in large samples from Europe. Mol Psychiatry, 2011.

130. Lee, K.W., et al., Genome wide association studies (GWAS) and copy number variation

(CNV) studies of the major psychoses: what have we learnt? Neurosci Biobehav Rev,

2012. 36(1): p. 556-71.

131. Dempster, E.L., et al., Disease-associated epigenetic changes in monozygotic twins

discordant for schizophrenia and bipolar disorder. Hum Mol Genet, 2011. 20(24): p.

4786-96.

132. Grayson, D.R., et al., Reelin promoter hypermethylation in schizophrenia. Proc Natl

Acad Sci U S A, 2005. 102(26): p. 9341-6.

133. Tochigi, M., et al., Methylation status of the reelin promoter region in the brain of

schizophrenic patients. Biol Psychiatry, 2008. 63(5): p. 530-3.

134. Volk, D.W. and D.A. Lewis, Impaired prefrontal inhibition in schizophrenia: relevance

for cognitive dysfunction. Physiol Behav, 2002. 77(4-5): p. 501-5.

135. Hashimoto, T., et al., Gene expression deficits in a subclass of GABA neurons in the

prefrontal cortex of subjects with schizophrenia. J Neurosci, 2003. 23(15): p. 6315-26.

136. Bullock, W.M., et al., Altered expression of genes involved in GABAergic transmission

and neuromodulation of granule cell activity in the cerebellum of schizophrenia patients.

Am J Psychiatry, 2008. 165(12): p. 1594-603.

137. Huang, H.S. and S. Akbarian, GAD1 mRNA expression and DNA methylation in

prefrontal cortex of subjects with schizophrenia. PLoS One, 2007. 2(8): p. e809.

138. Sharma, R.P., D.R. Grayson, and D.P. Gavin, Histone deactylase 1 expression is

increased in the prefrontal cortex of schizophrenia subjects: analysis of the National

Brain Databank microarray collection. Schizophr Res, 2008. 98(1-3): p. 111-7.

139. Swerdlow, N.R., Are we studying and treating schizophrenia correctly? Schizophr Res,

2011. 130(1-3): p. 1-10.

140. Van Winkel, R., et al., REVIEW: Genome-wide findings in schizophrenia and the role of

gene-environment interplay. CNS Neurosci Ther, 2010. 16(5): p. e185-92.

141. McGowan, P.O. and T. Kato, Epigenetics in mood disorders. Environ Health Prev Med,

2008. 13(1): p. 16-24.

142. Kaminsky, Z., et al., A multi-tissue analysis identifies HLA complex group 9 gene

methylation differences in bipolar disorder. Mol Psychiatry, 2011.


143. Stankiewicz, P. and J.R. Lupski, Structural variation in the human genome and its role in

disease. Annu Rev Med, 2010. 61: p. 437-55.

144. Van de Kerkhof, N.W., et al., Copy number variants in a sample of patients with

psychotic disorders: is standard screening relevant for actual clinical practice?

Neuropsychiatr Dis Treat, 2012. 8: p. 295-300.

145. Kidd, J.M., et al., Mapping and sequencing of structural variation from eight human

genomes. Nature, 2008. 453(7191): p. 56-64.

146. Ye, T., et al., Analysis of Copy Number Variations in Brain DNA from Patients with

Schizophrenia and Other Psychiatric Disorders. Biol Psychiatry, 2012.

147. Bergen, S.E., et al., Genome-wide association study in a Swedish population yields

support for greater CNV and MHC involvement in schizophrenia compared with bipolar

disorder. Mol Psychiatry, 2012.

148. Liao, H.M., et al., Identification and characterization of three inherited genomic copy

number variations associated with familial schizophrenia. Schizophr Res, 2012. 139(1-3): p. 229-36.

149. Grozeva, D., et al., Independent estimation of the frequency of rare CNVs in the UK

population confirms their role in schizophrenia. Schizophr Res, 2012. 135(1-3): p. 1-7.

150. Malhotra, D. and J. Sebat, CNVs: harbingers of a rare variant revolution in psychiatric

genetics. Cell, 2012. 148(6): p. 1223-41.

151. Meyer, J., et al., Rare variants of the gene encoding the potassium chloride

co-transporter 3 are associated with bipolar disorder. Int J Neuropsychopharmacol, 2005.

8(4): p. 495-504.

152. Moser, D., et al., Functional analysis of a potassium-chloride co-transporter 3

(SLC12A6) promoter polymorphism leading to an additional DNA methylation site.

Neuropsychopharmacology, 2009. 34(2): p. 458-67.

153. Uyanik, G., et al., Novel truncating and missense mutations of the KCC3 gene associated

with Andermann syndrome. Neurology, 2006. 66(7): p. 1044-8.

154. Luedi, P.P., et al., Computational and experimental identification of novel human

imprinted genes. Genome Res, 2007. 17(12): p. 1723-30.

155. Monk, M., Changes in DNA methylation during mouse embryonic development in

relation to X-chromosome activity and imprinting. Philos Trans R Soc Lond B Biol Sci,

1990. 326(1235): p. 299-312.

156. Rakyan, V. and E. Whitelaw, Transgenerational epigenetic inheritance. Curr Biol, 2003.

13(1): p. R6.

157. Christensen, B.C., et al., Aging and environmental exposures alter tissue-specific DNA

methylation dependent upon CpG island context. PLoS Genet, 2009. 5(8): p. e1000602.

158. Zhang, Y., et al., Non-imprinted allele-specific DNA methylation on human autosomes.

Genome Biol, 2009. 10(12): p. R138.

159. Hellman, A. and A. Chess, Extensive sequence-influenced DNA methylation

polymorphism in the human genome. Epigenetics Chromatin, 2010. 3(1): p. 11.

160. Shoemaker, R., et al., Allele-specific methylation is prevalent and is contributed by

CpG-SNPs in the human genome. Genome Res, 2010. 20(7): p. 883-9.

161. Chen, P.Y., et al., A comparative analysis of DNA methylation across human embryonic

stem cell lines. Genome Biol, 2011. 12(7): p. R62.

162. Zhang, D., et al., Genetic control of individual differences in gene-specific methylation in

human brain. Am J Hum Genet, 2010. 86(3): p. 411-9.


163. Barreiro, L.B., et al., Natural selection has driven population differentiation in modern

humans. Nat Genet, 2008. 40(3): p. 340-5.

164. Jendrzejewski, J., et al., The polymorphism rs944289 predisposes to papillary thyroid

carcinoma through a large intergenic noncoding RNA gene of tumor suppressor type.

Proc Natl Acad Sci U S A, 2012. 109(22): p. 8646-51.

165. Sivakumaran, S., et al., Abundant pleiotropy in human complex diseases and traits. Am J

Hum Genet, 2011. 89(5): p. 607-18.

166. Walters, R.W., S.S. Bradrick, and M. Gromeier, Poly(A)-binding protein modulates

mRNA susceptibility to cap-dependent miRNA-mediated repression. RNA, 2010. 16(1): p.

239-50.

167. Yang, J.O., W.Y. Kim, and J. Bhak, ssSNPTarget: genome-wide splice-site Single

Nucleotide Polymorphism database. Hum Mutat, 2009. 30(12): p. E1010-20.

168. Wimmer, K., et al., The NF1 gene contains hotspots for L1 endonuclease-dependent de

novo insertion. PLoS Genet, 2011. 7(11): p. e1002371.

169. Medvedeva, Y.A., et al., Intergenic, gene terminal, and intragenic CpG islands in the

human genome. BMC Genomics, 2010. 11: p. 48.

170. Martin, J.S., et al., Structural effects of linkage disequilibrium on the transcriptome. RNA,

2012. 18(1): p. 77-87.

171. Arnold, M., et al., Cis-Acting Polymorphisms Affect Complex Traits through

Modifications of MicroRNA Regulation Pathways. PLoS One, 2012. 7(5): p. e36694.

172. Rearick, D., et al., Critical association of ncRNA with introns. Nucleic Acids Res, 2011.

39(6): p. 2357-66.

173. An, J.H., et al., DNA methylation-specific multiplex assays for body fluid identification.

Int J Legal Med, 2012.

174. Wang, D., et al., Individual variation and longitudinal pattern of genome-wide DNA

methylation from birth to the first two years of life. Epigenetics, 2012. 7(6).

175. Hou, Y., et al., DNA Demethylation and USF Regulate the Meiosis-Specific Expression of

the Mouse Miwi. PLoS Genet, 2012. 8(5): p. e1002716.

176. Irizarry, R.A., et al., The human colon cancer methylome shows similar hypo- and

hypermethylation at conserved tissue-specific CpG island shores. Nat Genet, 2009. 41(2):

p. 178-86.

177. Hudson, T.J., et al., International network of cancer genome projects. Nature, 2010.

464(7291): p. 993-8.

178. Leng, S., et al., The A/G allele of rs16906252 predicts for MGMT methylation and is

selectively silenced in premalignant lesions from smokers and in lung adenocarcinomas.

Clin Cancer Res, 2011. 17(7): p. 2014-23.

179. Andraos, C., et al., Vitamin D receptor gene methylation is associated with ethnicity,

tuberculosis, and TaqI polymorphism. Hum Immunol, 2010. 72(3): p. 262-8.

180. Stepanow, S., et al., Allele-specific, age-dependent and BMI-associated DNA methylation

of human MCHR1. PLoS One, 2011. 6(5): p. e17711.

181. Bell, C.G., et al., Integrated genetic and epigenetic analysis identifies haplotype-specific

methylation in the FTO type 2 diabetes and obesity susceptibility locus. PLoS One, 2010.

5(11): p. e14040.

182. Kundakovic, M., et al., DNA methyltransferase inhibitors coordinately induce expression

of the human reelin and glutamic acid decarboxylase 67 genes. Mol Pharmacol, 2007.

71(3): p. 644-53.

183. Dash, P.K., S.A. Orsi, and A.N. Moore, Histone deactylase inhibition combined with

behavioral therapy enhances learning and memory following traumatic brain injury.

Neuroscience, 2009.

184. Hockly, E., et al., Suberoylanilide hydroxamic acid, a histone deacetylase inhibitor,

ameliorates motor deficits in a mouse model of Huntington's disease. Proc Natl Acad Sci

U S A, 2003. 100(4): p. 2041-6.

185. Camelo, S., et al., Transcriptional therapy with the histone deacetylase inhibitor

trichostatin A ameliorates experimental autoimmune encephalomyelitis. J

Neuroimmunol, 2005. 164(1-2): p. 10-21.

186. Chen, P.S., et al., Valproate protects dopaminergic neurons in midbrain neuron/glia

cultures by stimulating the release of neurotrophic factors from astrocytes. Mol

Psychiatry, 2006. 11(12): p. 1116-25.

187. Dagtas, A.S., E.R. Edens, and K.M. Gilbert, Histone deacetylase inhibitor uses

p21(Cip1) to maintain anergy in CD4(+) T cells. Int Immunopharmacol, 2009.

188. Langley, B., et al., Pulse inhibition of histone deacetylases induces complete resistance to

oxidative death in cortical neurons without toxicity and reveals a role for cytoplasmic

p21(waf1/cip1) in cell cycle-independent neuroprotection. J Neurosci, 2008. 28(1): p.

163-76.

189. Gilad, R., et al., Treatment of status epilepticus and acute repetitive seizures with i.v.

valproic acid vs phenytoin. Acta Neurol Scand, 2008. 118(5): p. 296-300.

190. Bowden, C.L., Spectrum of effectiveness of valproate in neuropsychiatry. Expert Rev

Neurother, 2007. 7(1): p. 9-16.

191. Sajatovic, M., et al., Adjunct extended-release valproate semisodium in late life

schizophrenia. Int J Geriatr Psychiatry, 2008. 23(2): p. 142-7.

192. Wright, M. and N. Martin, Brisbane Adolescent Twin Study: outline of study methods and

research projects. Australian Journal of Psychology, 2004. 56: p. 65-78.

193. Halfvarson, J., et al., Inflammatory bowel disease in a Swedish twin cohort: a long-term

follow-up of concordance and clinical characteristics. Gastroenterology, 2003. 124(7): p.

1767-73.

194. Schumacher, A., et al., Microarray-based DNA methylation profiling: technology and

applications. Nucleic Acids Res, 2006. 34(2): p. 528-42.

195. Heisler, L.E., et al., CpG Island microarray probe sequences derived from a physical

library are representative of CpG Islands annotated on the human genome. Nucleic

Acids Res, 2005. 33(9): p. 2952-61.

196. Storey, J.D. and R. Tibshirani, Statistical significance for genomewide studies. Proc Natl

Acad Sci U S A, 2003. 100(16): p. 9440-5.

197. Falcon, S. and R. Gentleman, Using GOstats to test gene lists for GO term association.

Bioinformatics, 2007. 23(2): p. 257-8.

198. Tost, J., H. El abdalaoui, and I.G. Gut, Serial pyrosequencing for quantitative DNA

methylation analysis. Biotechniques, 2006. 40(6): p. 721-2, 724, 726.

199. Torrey, E.F., et al., The Stanley Foundation brain collection and neuropathology

consortium. Schizophr Res, 2000. 44(2): p. 151-5.

200. Deep-Soboslay, A., et al., Psychiatric brain banking: three perspectives on current trends

and future directions. Biol Psychiatry, 2011. 69(2): p. 104-12.

201. 1000 Genomes Project Consortium, A map of human genome variation from population-scale sequencing.

Nature, 2010. 467(7319): p. 1061-73.

202. Eckhardt, F., et al., DNA methylation profiling of human chromosomes 6, 20 and 22. Nat

Genet, 2006. 38(12): p. 1378-85.

203. Hall, J.G., Twinning. Lancet, 2003. 362(9385): p. 735-43.

204. Andrian, E., et al., Regulation of matrix metalloproteinases and tissue inhibitors of

matrix metalloproteinases by Porphyromonas gingivalis in an engineered human oral

mucosa model. J Cell Physiol, 2007. 211(1): p. 56-62.

205. Choi, B.K., et al., Activation of matrix metalloproteinase-2 by a novel oral spirochetal

species Treponema lecithinolyticum. J Periodontol, 2001. 72(11): p. 1594-600.

206. Wilm, B., et al., The serosal mesothelium is a major source of smooth muscle cells of the

gut vasculature. Development, 2005. 132(23): p. 5317-28.

207. Bruder, C.E., et al., Phenotypically concordant and discordant monozygotic twins display

different DNA copy-number-variation profiles. Am J Hum Genet, 2008. 82(3): p. 763-71.

208. Bouchard, T.J., Jr., et al., Sources of human psychological differences: the Minnesota

Study of Twins Reared Apart. Science, 1990. 250(4978): p. 223-8.

209. Murrell, A., et al., An association between variants in the IGF2 gene and Beckwith-

Wiedemann syndrome: interaction between genotype and epigenotype. Hum Mol Genet,

2004. 13(2): p. 247-55.

210. Flanagan, J.M., et al., Intra- and interindividual epigenetic variation in human germ

cells. Am J Hum Genet, 2006. 79(1): p. 67-84.

211. Khulan, B., et al., Comparative isoschizomer profiling of cytosine methylation: the HELP

assay. Genome Res, 2006. 16(8): p. 1046-55.

212. Gartner, K. and E. Baunack, Is the similarity of monozygotic twins due to genetic factors

alone? Nature, 1981. 292(5824): p. 646-7.

213. Gertz, J., et al., Analysis of DNA methylation in a three-generation family reveals

widespread genetic influence on epigenetic regulation. PLoS Genet, 2011. 7(8): p.

e1002228.

214. Hsu, J. and J.D. Smith, Genome-wide studies of gene expression relevant to coronary

artery disease. Curr Opin Cardiol, 2012. 27(3): p. 210-3.

215. Pardini, B., et al., Gene expression variations: potentialities of master regulator

polymorphisms in colorectal cancer risk. Mutagenesis, 2012. 27(2): p. 161-7.

216. Hollingworth, P., et al., Genome-wide association study of Alzheimer's disease with

psychotic symptoms. Mol Psychiatry, 2011.

217. Zhou, X.J., et al., Genetic association of PRDM1-ATG5 intergenic region and autophagy

with systemic lupus erythematosus in a Chinese population. Ann Rheum Dis, 2011.

70(7): p. 1330-7.

218. Lee, B.K., et al., Cell-type specific and combinatorial usage of diverse transcription

factors revealed by genome-wide binding studies in multiple human cells. Genome Res,

2012. 22(1): p. 9-24.

219. Washietl, S., et al., Mapping of conserved RNA secondary structures predicts thousands

of functional noncoding RNAs in the human genome. Nat Biotechnol, 2005. 23(11): p.

1383-90.

220. Pheasant, M. and J.S. Mattick, Raising the estimate of functional human sequences.

Genome Res, 2007. 17(9): p. 1245-53.

221. Witzany, G., Noncoding RNAs: persistent viral agents as modular tools for cellular

needs. Ann N Y Acad Sci, 2009. 1178: p. 244-67.

222. Huttenhofer, A., P. Schattner, and N. Polacek, Non-coding RNAs: hope or hype? Trends

Genet, 2005. 21(5): p. 289-97.

223. Ponting, C.P. and R.C. Hardison, What fraction of the human genome is functional?

Genome Res, 2011. 21(11): p. 1769-76.

224. Kapranov, P., A.T. Willingham, and T.R. Gingeras, Genome-wide transcription and the

implications for genomic organization. Nat Rev Genet, 2007. 8(6): p. 413-23.

225. Waldispuhl, J. and P. Clote, Computing the partition function and sampling for saturated

secondary structures of RNA, with respect to the Turner energy model. J Comput Biol,

2007. 14(2): p. 190-215.

226. Li, X., et al., Predicting in vivo binding sites of RNA-binding proteins using mRNA

secondary structure. RNA, 2010. 16(6): p. 1096-107.

227. Bennett, C.L., et al., A rare polyadenylation signal mutation of the FOXP3 gene

(AAUAAA→AAUGAA) leads to the IPEX syndrome. Immunogenetics, 2001. 53(6): p.

435-9.

228. Guo, H., et al., Mammalian microRNAs predominantly act to decrease target mRNA

levels. Nature, 2010. 466(7308): p. 835-40.

229. Wu, H., et al., Genome-wide analysis reveals methyl-CpG-binding protein 2-dependent

regulation of microRNAs in a mouse model of Rett syndrome. Proc Natl Acad Sci U S A,

2010. 107(42): p. 18161-6.

230. Halvorsen, M., et al., Disease-associated mutations that alter the RNA structural

ensemble. PLoS Genet, 2010. 6(8): p. e1001074.

231. Castle, J.C., SNPs occur in regions with less genomic sequence conservation. PLoS One,

2011. 6(6): p. e20660.

232. Eom, S. and C. Lee, Functions of intronic nucleotide variants in the gene encoding

pleckstrin homology like domain beta 2 (PHLDB2) on susceptibility to vascular

dementia. World J Biol Psychiatry, 2011.

233. Zhao, C., et al., Alternative-splicing in the exon-10 region of GABA(A) receptor beta(2)

subunit gene: relationships between novel isoforms and psychotic disorders. PLoS One,

2009. 4(9): p. e6977.

234. Itokawa, M., et al., [Studies on pathophysiology of schizophrenia with a rare variant as a

clue]. Brain Nerve, 2011. 63(3): p. 223-31.

235. Shen, Y.C., et al., Genetic and functional analysis of the gene encoding neurogranin in

schizophrenia. Schizophr Res, 2012. 137(1-3): p. 7-13.

236. Kushima, I., et al., Resequencing and association analysis of the KALRN and EPHB1

genes and their contribution to schizophrenia susceptibility. Schizophr Bull, 2012. 38(3):

p. 552-60.

237. Huang, J., et al., Human Down syndrome cell adhesion molecules (DSCAMs) are

functionally conserved with Drosophila Dscam[TM1] isoforms in controlling

neurodevelopment. Insect Biochem Mol Biol, 2011. 41(10): p. 778-87.

238. Bartolomucci, A., et al., The extended granin family: structure, function, and biomedical

implications. Endocr Rev, 2011. 32(6): p. 755-97.

239. Teyssier, J.R., et al., Activation of a DeltaFOSB dependent gene expression pattern in the

dorsolateral prefrontal cortex of patients with major depressive disorder. J Affect

Disord, 2011. 133(1-2): p. 174-8.

240. Portela-Gomes, G.M., L. Grimelius, and M. Stridsberg, Secretogranin III in human

neuroendocrine tumours: a comparative immunohistochemical study with chromogranins

A and B and secretogranin II. Regul Pept, 2010. 165(1): p. 30-5.

241. McQuillin, A., M. Rizig, and H.M. Gurling, A microarray gene expression study of the

molecular pharmacology of lithium carbonate on mouse brain mRNA to understand the

neurobiology of mood stabilization and treatment of bipolar affective disorder.

Pharmacogenet Genomics, 2007. 17(8): p. 605-17.

242. Umbach, J.A., Y. Zhao, and C.B. Gundersen, Lithium enhances secretion from large

dense-core vesicles in nerve growth factor-differentiated PC12 cells. J Neurochem, 2005.

94(5): p. 1306-14.

243. Hanasaki, K., Mammalian phospholipase A2: phospholipase A2 receptor. Biol Pharm

Bull, 2004. 27(8): p. 1165-7.

244. Oresic, M., et al., Phospholipids and insulin resistance in psychosis: a lipidomics study of

twin pairs discordant for schizophrenia. Genome Med, 2012. 4(1): p. 1.

245. Gattaz, W.F., et al., Increased PLA2 activity in the hippocampus of patients with

temporal lobe epilepsy and psychosis. J Psychiatr Res, 2011. 45(12): p. 1617-20.

246. Ross, B.M., et al., Serum calcium-independent phospholipase A2 activity in bipolar

affective disorder. Bipolar Disord, 2006. 8(3): p. 265-70.

247. Gattaz, W.F., et al., Increased plasma phospholipase-A2 activity in schizophrenic

patients: reduction after neuroleptic therapy. Biol Psychiatry, 1987. 22(4): p. 421-6.

248. Folley, B.S., M.L. Doop, and S. Park, Psychoses and creativity: is the missing link a

biological mechanism related to phospholipids turnover? Prostaglandins Leukot Essent

Fatty Acids, 2003. 69(6): p. 467-76.

249. Kao, W.T., et al., Common genetic variation in Neuregulin 3 (NRG3) influences risk for

schizophrenia and impacts NRG3 expression in human brain. Proc Natl Acad Sci U S A,

2010. 107(35): p. 15619-24.

250. Alessi, A., et al., gamma-Syntrophin scaffolding is spatially and functionally distinct from

that of the alpha/beta syntrophins. Exp Cell Res, 2006. 312(16): p. 3084-95.

251. Kugaevskaia, E.V., [Angiotensin converting enzyme domain structure and properties].

Biomed Khim, 2005. 51(6): p. 567-80.

252. Kucukali, C.I., et al., Angiotensin-converting enzyme polymorphism in schizophrenia,

bipolar disorders, and their first-degree relatives. Psychiatr Genet, 2010. 20(1): p. 14-9.

253. Crescenti, A., et al., Insertion/deletion polymorphism of the angiotensin-converting

enzyme gene is associated with schizophrenia in a Spanish population. Psychiatry Res,

2009. 165(1-2): p. 175-80.

254. Wahlbeck, K., et al., Cerebrospinal fluid angiotensin-converting enzyme (ACE)

correlates with length of illness in schizophrenia. Schizophr Res, 2000. 41(2): p. 335-40.

255. Danser, A.H., et al., Commentaries on Viewpoint: Epigenetic regulation of the ACE gene

might be more relevant to endurance physiology than the I/D polymorphism. J Appl

Physiol, 2012. 112(6): p. 1084-5.

256. Ayoub, M.A., et al., Deleterious GRM1 Mutations in Schizophrenia. PLoS One, 2012.

7(3): p. e32849.

257. Okajima, D., G. Kudo, and H. Yokota, Antidepressant-like behavior in brain-specific

angiogenesis inhibitor 2-deficient mice. J Physiol Sci, 2012. 61(1): p. 47-54.

258. van Haren, J., et al., Mammalian Navigators are microtubule plus-end tracking proteins

that can reorganize the cytoskeleton to induce neurite-like extensions. Cell Motil

Cytoskeleton, 2009. 66(10): p. 824-38.

259. Fung, S.J., S. Sivagnanasundaram, and C.S. Weickert, Lack of change in markers of

presynaptic terminal abundance alongside subtle reductions in markers of presynaptic

terminal plasticity in prefrontal cortex of schizophrenia patients. Biol Psychiatry, 2011.

69(1): p. 71-9.

260. Vaags, A.K., et al., Rare deletions at the neurexin 3 locus in autism spectrum disorder.

Am J Hum Genet, 2012. 90(1): p. 133-41.

261. Sachdev, P., Schizophrenia-like psychosis and epilepsy: the status of the association. Am

J Psychiatry, 1998. 155(3): p. 325-36.

262. Hyde, T.M. and D.R. Weinberger, Seizures and schizophrenia. Schizophr Bull, 1997.

23(4): p. 611-22.

263. Salzmann, A., et al., Carboxypeptidase A6 gene (CPA6) mutations in a recessive familial

form of febrile seizures and temporal lobe epilepsy and in sporadic temporal lobe

epilepsy. Hum Mutat, 2012. 33(1): p. 124-35.

264. Ogino, S., et al., MGMT germline polymorphism is associated with somatic MGMT

promoter methylation and gene silencing in colorectal cancer. Carcinogenesis, 2007.

28(9): p. 1985-90.

265. Kwon, E., W. Wang, and L.H. Tsai, Validation of schizophrenia-associated genes

CSMD1, C10orf26, CACNA1C and TCF4 as miR-137 targets. Mol Psychiatry, 2011.

266. Chang, L.H., et al., Association of RELN promoter SNPs with schizophrenia in the

Chinese population. Dongwuxue Yanjiu, 2011. 32(5): p. 504-8.

267. Yuasa, T., et al., Polycystin-1L2 is a novel G-protein-binding protein. Genomics, 2004.

84(1): p. 126-38.

268. Park, E.Y., Y.M. Woo, and J.H. Park, Polycystic kidney disease and therapeutic

approaches. BMB Rep, 2011. 44(6): p. 359-68.

269. Goodman, A.B., A family history study of schizophrenia spectrum disorders suggests new

candidate genes in schizophrenia and autism. Psychiatr Q, 1994. 65(4): p. 287-97.

270. Wagemaker, H., J.L. Rogers, and R. Cade, Schizophrenia, hemodialysis, and the placebo

effect. Results and issues. Arch Gen Psychiatry, 1984. 41(8): p. 805-10.

271. Bennett, A.O.M., Dual constraints on synapse formation and regression in

schizophrenia: neuregulin, neuroligin, dysbindin, DISC1, MuSK and agrin. Aust N Z J

Psychiatry, 2008. 42(8): p. 662-77.

272. Siddiqui, T.J., et al., LRRTMs and neuroligins bind neurexins with a differential code to

cooperate in glutamate synapse development. J Neurosci, 2010. 30(22): p. 7495-506.

273. Kamal, M., et al., Loss of CSMD1 expression is associated with high tumour grade and

poor survival in invasive ductal breast carcinoma. Breast Cancer Res Treat, 2010.

121(3): p. 555-63.

274. Havik, B., et al., The complement control-related genes CSMD1 and CSMD2 associate to

schizophrenia. Biol Psychiatry, 2011. 70(1): p. 35-42.

275. Howes, O.D., et al., The Nature of Dopamine Dysfunction in Schizophrenia and What

This Means for Treatment: Meta-analysis of Imaging Studies. Arch Gen Psychiatry,

2012.

276. Dimpfel, W., Rat electropharmacograms of the flavonoids rutin and quercetin in

comparison to those of moclobemide and clinically used reference drugs suggest

antidepressive and/or neuroprotective action. Phytomedicine, 2009. 16(4): p. 287-94.

277. Ullmannova, V. and N.C. Popescu, Inhibition of cell proliferation, induction of apoptosis,

reactivation of DLC1, and modulation of other gene expression by dietary flavone in

breast cancer cell lines. Cancer Detect Prev, 2007. 31(2): p. 110-8.

278. Tarrago, T., et al., Baicalin, a prodrug able to reach the CNS, is a prolyl oligopeptidase

inhibitor. Bioorg Med Chem, 2008. 16(15): p. 7516-24.

279. Du, Y., X. Wu, and L. Li, Differentially organized top-down modulation of prepulse

inhibition of startle. J Neurosci, 2011. 31(38): p. 13644-53.

280. Meincke, U., E. Gouzoulis-Mayfrank, and H. Sass, [The startle reflex in schizophrenia

research]. Nervenarzt, 2001. 72(11): p. 844-52.

281. Furusato, E., et al., WT1 and Bcl2 expression in melanocytic lesions of the conjunctiva:

an immunohistochemical study of 123 cases. Arch Ophthalmol, 2009. 127(8): p. 964-9.

282. Jarskog, L.F., et al., Apoptotic proteins in the temporal cortex in schizophrenia: high

Bax/Bcl-2 ratio without caspase-3 activation. Am J Psychiatry, 2004. 161(1): p. 109-15.

283. Miller, C.L., et al., Two complex genotypes relevant to the kynurenine pathway and

melanotropin function show association with schizophrenia and bipolar disorder.

Schizophr Res, 2009. 113(2-3): p. 259-67.

284. Goldacre, M.J., et al., Schizophrenia and cancer: an epidemiological study. Br J

Psychiatry, 2005. 187: p. 334-8.

285. Dasgupta, A., et al., Insulin resistance and metabolic profile in antipsychotic naive

schizophrenia patients. Prog Neuropsychopharmacol Biol Psychiatry, 2010. 34(7): p.

1202-7.

286. Guest, P.C., et al., Altered levels of circulating insulin and other neuroendocrine

hormones associated with the onset of schizophrenia. Psychoneuroendocrinology, 2011.

36(7): p. 1092-6.

287. Chinnery, P.F., et al., Epigenetics, epidemiology and mitochondrial DNA diseases. Int J

Epidemiol, 2012. 41(1): p. 177-87.

288. Shock, L.S., et al., DNA methyltransferase 1, cytosine methylation, and cytosine

hydroxymethylation in mammalian mitochondria. Proc Natl Acad Sci U S A, 2011.

108(9): p. 3630-5.

289. Regenold, W.T., et al., Mitochondrial detachment of hexokinase 1 in mood and psychotic

disorders: implications for brain energy metabolism and neurotrophic signaling. J

Psychiatr Res, 2012. 46(1): p. 95-104.

290. Crouch, P.J., et al., Mechanisms of A beta mediated neurodegeneration in Alzheimer's

disease. Int J Biochem Cell Biol, 2008. 40(2): p. 181-98.

291. Anandatheerthavarada, H.K., et al., Mitochondrial targeting and a novel transmembrane

arrest of Alzheimer's amyloid precursor protein impairs mitochondrial function in

neuronal cells. J Cell Biol, 2003. 161(1): p. 41-54.

292. Martinez-Reyes, I., M. Sanchez-Arago, and J.M. Cuezva, AMPK and GCN2-ATF4 signal

the repression of mitochondria in colon cancer cells. Biochem J, 2012.

293. Nalaskowski, M.M., et al., Human inositol 1,4,5-trisphosphate 3-kinase isoform B

(IP3KB) is a nucleocytoplasmic shuttling protein specifically enriched at cortical actin

filaments and at invaginations of the nuclear envelope. J Biol Chem, 2011. 286(6): p.

4500-10.

294. Criollo, A., et al., Regulation of autophagy by the inositol trisphosphate receptor. Cell

Death Differ, 2007. 14(5): p. 1029-39.

295. Holland, J. and M. Agius, Neurobiology of bipolar disorder - lessons from migraine

disorders. Psychiatr Danub, 2011. 23 Suppl 1: p. S162-5.

296. Zhang, Z., et al., Valproate protects the retina from endoplasmic reticulum stress-induced

apoptosis after ischemia-reperfusion injury. Neurosci Lett, 2011. 504(2): p. 88-92.

297. Machado-Vieira, R., et al., The Bcl-2 gene polymorphism rs956572AA increases inositol

1,4,5-trisphosphate receptor-mediated endoplasmic reticulum calcium release in subjects

with bipolar disorder. Biol Psychiatry, 2011. 69(4): p. 344-52.

298. O'Dushlaine, C., et al., Molecular pathways involved in neuronal cell adhesion and

membrane scaffolding contribute to schizophrenia and bipolar disorder susceptibility.

Mol Psychiatry, 2011. 16(3): p. 286-92.

299. Wang, K.S., X.F. Liu, and N. Aragam, A genome-wide meta-analysis identifies novel loci

associated with schizophrenia and bipolar disorder. Schizophr Res, 2010. 124(1-3): p.

192-9.

300. Dimas, A.S., et al., Common regulatory variation impacts gene expression in a cell type-

dependent manner. Science, 2009. 325(5945): p. 1246-50.

301. Li, Y., et al., The DNA methylome of human peripheral blood mononuclear cells. PLoS

Biol, 2010. 8(11): p. e1000533.

302. Constancia, M., et al., Imprinting mechanisms. Genome Res, 1998. 8(9): p. 881-900.

303. Whitelaw, N.C. and E. Whitelaw, Transgenerational epigenetic inheritance in health and

disease. Curr Opin Genet Dev, 2008. 18(3): p. 273-9.

304. Manikkam, M., et al., Transgenerational actions of environmental compounds on

reproductive disease and identification of epigenetic biomarkers of ancestral exposures.

PLoS One, 2012. 7(2): p. e31901.

305. Klar, A.J., Propagating epigenetic states through meiosis: where Mendel's gene is more

than a DNA moiety. Trends Genet, 1998. 14(8): p. 299-301.

306. Ruden, D.M. and X. Lu, Hsp90 affecting chromatin remodeling might explain

transgenerational epigenetic inheritance in Drosophila. Curr Genomics, 2008. 9(7): p.

500-8.

307. Pentinat, T., et al., Transgenerational inheritance of glucose intolerance in a mouse

model of neonatal overnutrition. Endocrinology, 2010. 151(12): p. 5617-23.

308. Arai, J.A. and L.A. Feig, Long-lasting and transgenerational effects of an environmental

enrichment on memory formation. Brain Res Bull, 2011. 85(1-2): p. 30-5.

309. Furuhashi, H. and W.G. Kelly, The epigenetics of germ-line immortality: lessons from an

elegant model system. Dev Growth Differ, 2010. 52(6): p. 527-32.

310. Barres, R., et al., Acute exercise remodels promoter methylation in human skeletal

muscle. Cell Metab, 2012. 15(3): p. 405-11.

311. Kangaspeska, S., et al., Transient cyclical methylation of promoter DNA. Nature, 2008.

452(7183): p. 112-5.

312. Ebisawa, M., et al., Measurement of Ara h 1-, 2-, and 3-specific IgE antibodies is useful

in diagnosis of peanut allergy in Japanese children. Pediatr Allergy Immunol, 2012.

313. Sharief, S., et al., Vitamin D levels and food and environmental allergies in the United

States: results from the National Health and Nutrition Examination Survey 2005-2006. J

Allergy Clin Immunol, 2011. 127(5): p. 1195-202.

314. Pacheco, K.A., Epigenetics mediate environment : gene effects on occupational

sensitization. Curr Opin Allergy Clin Immunol, 2012. 12(2): p. 111-8.

315. Breton, C.V., et al., Prenatal tobacco smoke exposure affects global and gene-specific

DNA methylation. Am J Respir Crit Care Med, 2009. 180(5): p. 462-7.

316. Perera, F., et al., Relation of DNA methylation of 5'-CpG island of ACSL3 to

transplacental exposure to airborne polycyclic aromatic hydrocarbons and childhood

asthma. PLoS One, 2009. 4(2): p. e4488.

317. Nishida, N., et al., Evaluating the performance of Affymetrix SNP Array 6.0 platform with

400 Japanese individuals. BMC Genomics, 2008. 9: p. 431.

318. Bettscheider, M., et al., Optimized Analysis of DNA Methylation and Gene Expression

from Small, Anatomically-defined Areas of the Brain. J Vis Exp, 2012(65).

319. Fang, F., et al., Genomic landscape of human allele-specific DNA methylation. Proc Natl

Acad Sci U S A, 2012.

320. Xie, W., et al., Base-resolution analyses of sequence and parent-of-origin dependent

DNA methylation in the mouse genome. Cell, 2012. 148(4): p. 816-31.

321. Genissel, A., et al., Cis and trans regulatory effects contribute to natural variation in

transcriptome of Drosophila melanogaster. Mol Biol Evol, 2008. 25(1): p. 101-10.

322. Schilling, E., C. El Chartouni, and M. Rehli, Allele-specific DNA methylation in mouse

strains is mainly determined by cis-acting sequences. Genome Res, 2009.

323. Bell, J.T., et al., DNA methylation patterns associate with genetic and gene expression

variation in HapMap cell lines. Genome Biol, 2011. 12(1): p. R10.

324. Sanacora, G., G. Treccani, and M. Popoli, Towards a glutamate hypothesis of depression:

an emerging frontier of neuropsychopharmacology for mood disorders.

Neuropharmacology, 2012. 62(1): p. 63-77.

325. Kantrowitz, J.T. and D.C. Javitt, Thinking glutamatergically: changing concepts of

schizophrenia based upon changing neurochemical models. Clin Schizophr Relat

Psychoses, 2010. 4(3): p. 189-200.

326. Fusar-Poli, P., et al., Thalamic glutamate levels as a predictor of cortical response during

executive functioning in subjects at high risk for psychosis. Arch Gen Psychiatry, 2011.

68(9): p. 881-90.

327. de la Fuente-Sandoval, C., et al., Higher levels of glutamate in the associative-striatum of

subjects with prodromal symptoms of schizophrenia and patients with first-episode

psychosis. Neuropsychopharmacology, 2011. 36(9): p. 1781-91.

328. Palomino, A., et al., Decreased levels of plasma glutamate in patients with first-episode

schizophrenia and bipolar disorder. Schizophr Res, 2007. 95(1-3): p. 174-8.

329. Benneyworth, M.A., et al., A selective positive allosteric modulator of metabotropic

glutamate receptor subtype 2 blocks a hallucinogenic drug model of psychosis. Mol

Pharmacol, 2007. 72(2): p. 477-84.

330. Ginsberg, S.D., S.E. Hemby, and J.F. Smiley, Expression profiling in neuropsychiatric

disorders: emphasis on glutamate receptors in bipolar disorder. Pharmacol Biochem

Behav, 2012. 100(4): p. 705-11.

331. Muller, U.C. and H. Zheng, Physiological Functions of APP Family Proteins. Cold

Spring Harb Perspect Med, 2012. 2(2): p. a006288.

332. Dong, Q., et al., Molecular cloning of human G alpha q cDNA and chromosomal

localization of the G alpha q gene (GNAQ) and a processed pseudogene. Genomics,

1995. 30(3): p. 470-75.

333. Ohashi, T., et al., Long-term follow-up of electrocochleogram in Meniere's disease. ORL

J Otorhinolaryngol Relat Spec, 1991. 53(3): p. 131-6.

334. Szumlinski, K.K., A.W. Ary, and K.D. Lominac, Homers regulate drug-induced

neuroplasticity: implications for addiction. Biochem Pharmacol, 2008. 75(1): p. 112-33.

335. Larkin, E.H., Insulin Shock Treatment of Schizophrenia. Br Med J, 1937. 1(3979): p.

745-7.

336. Ziskind, E., et al., The Mechanism of Insulin Therapy in Schizophrenia. Cal West Med,

1938. 48(5): p. 310-1.

337. Herberth, M., et al., Impaired glycolytic response in peripheral blood mononuclear cells

of first-onset antipsychotic-naive schizophrenia patients. Mol Psychiatry, 2011. 16(8): p.

848-59.

338. Altar, C.A., et al., Insulin, IGF-1, and muscarinic agonists modulate schizophrenia-

associated genes in human neuroblastoma cells. Biol Psychiatry, 2008. 64(12): p. 1077-

87.

339. Erlander, M.G., et al., Two genes encode distinct glutamate decarboxylases. Neuron,

1991. 7(1): p. 91-100.

340. Moyer, C.E., et al., Reduced Glutamate Decarboxylase 65 Protein Within Primary

Auditory Cortex Inhibitory Boutons in Schizophrenia. Biol Psychiatry, 2012.

341. Najjar, S., et al., Glutamic Acid decarboxylase autoantibody syndrome presenting as

schizophrenia. Neurologist, 2012. 18(2): p. 88-91.

342. Yarlagadda, A., et al., Blood Brain Barrier: The Role of GAD Antibodies in Psychiatry.

Psychiatry (Edgmont), 2007. 4(6): p. 57-9.

343. Hampe, C.S., et al., Species-specific autoantibodies in type 1 diabetes. J Clin Endocrinol

Metab, 1999. 84(2): p. 643-8.

344. Andrade Lima Gabbay, M., et al., Serum titres of anti-glutamic acid decarboxylase-65

and anti-IA-2 autoantibodies are associated with different immunoregulatory milieu in

newly diagnosed type 1 diabetes patients. Clin Exp Immunol, 2012. 168(1): p. 60-7.

345. Wang, X., et al., Anti-idiotypic antibody specific to GAD65 autoantibody prevents type 1

diabetes in the NOD mouse. PLoS One, 2012. 7(2): p. e32515.

346. Warren, S.T., The Epigenetics of Fragile X Syndrome. Cell Stem Cell, 2007. 1(5): p. 488-

489.

347. Roelfsema, J.H., et al., Genetic heterogeneity in Rubinstein-Taybi syndrome: mutations in

both the CBP and EP300 genes cause disease. Am J Hum Genet, 2005. 76(4): p. 572-80.

348. Philibert, R.A., et al., MAOA methylation is associated with nicotine and alcohol

dependence in women. Am J Med Genet B Neuropsychiatr Genet, 2008. 147B(5): p. 565-

70.

349. Bonsch, D., et al., Lowered DNA methyltransferase (DNMT-3b) mRNA expression is

associated with genomic DNA hypermethylation in patients with chronic alcoholism. J

Neural Transm, 2006. 113(9): p. 1299-304.

350. Stack, E.C., et al., Modulation of nucleosome dynamics in Huntington's disease. Hum

Mol Genet, 2007. 16(10): p. 1164-75.

351. Devoto, M. and M. Falchi, Genetic mapping of quantitative trait loci for disease-related

phenotypes. Methods Mol Biol, 2012. 871: p. 281-311.

352. Rohrwasser, A., et al., From genetics to mechanism of disease liability. Adv Genet, 2008.

60: p. 701-26.

353. Argeson, A.C., K.K. Nelson, and L.D. Siracusa, Molecular basis of the pleiotropic

phenotype of mice carrying the hypervariable yellow (Ahvy) mutation at the agouti locus.

Genetics, 1996. 142(2): p. 557-67.

354. Roberts, N.J., et al., The Predictive Capacity of Personal Genome Sequencing. Sci Transl

Med, 2012.

355. Katz, D.J., et al., A C. elegans LSD1 demethylase contributes to germline immortality by

reprogramming epigenetic memory. Cell, 2009. 137(2): p. 308-20.

356. Knight, J.C., Resolving the variable genome and epigenome in human disease. J Intern

Med, 2012. 271(4): p. 379-91.

357. Zhang, K., et al., Digital RNA allelotyping reveals tissue-specific and allele-specific gene

expression in human. Nat Methods, 2009. 6(8): p. 613-8.

358. Li, W. and M. Liu, Distribution of 5-hydroxymethylcytosine in different human tissues. J

Nucleic Acids, 2011. 2011: p. 870726.

359. Ni, X., et al., Nucleic acid aptamers: clinical applications and promising new horizons.

Curr Med Chem, 2011. 18(27): p. 4206-14.

360. Jamieson, A.C., J.C. Miller, and C.O. Pabo, Drug discovery with engineered zinc-finger

proteins. Nat Rev Drug Discov, 2003. 2(5): p. 361-8.

361. Waggoner, D., Mechanisms of disease: epigenesis. Semin Pediatr Neurol, 2007. 14(1): p.

7-14.

362. Bernstein, E. and C.D. Allis, RNA meets chromatin. Genes Dev, 2005. 19(14): p. 1635-55.

363. Pushparaj, P.N. and A.J. Melendez, Short interfering RNA (siRNA) as a novel

therapeutic. Clin Exp Pharmacol Physiol, 2006. 33(5-6): p. 504-10.

364. Wu, F., et al., Small Interference RNA Targeting TLR4 Gene Effectively Attenuates

Pulmonary Inflammation in a Rat Model. J Biomed Biotechnol, 2012. 2012: p. 406435.

365. Reich, S.J., et al., Small interfering RNA (siRNA) targeting VEGF effectively inhibits

ocular neovascularization in a mouse model. Mol Vis, 2003. 9: p. 210-6.

366. Reifenberger, G., et al., Predictive impact of MGMT promoter methylation in

glioblastoma of the elderly. Int J Cancer, 2011.

367. van Hoesel, A.Q., et al., Primary tumor classification according to methylation pattern is

prognostic in patients with early stage ER-negative breast cancer. Breast Cancer Res

Treat, 2012. 131(3): p. 859-69.

368. Moore, R.G., S. MacLaughlan, and R.C. Bast, Jr., Current state of biomarker

development for clinical application in epithelial ovarian cancer. Gynecol Oncol, 2010.

116(2): p. 240-5.

369. Steele, N., et al., Combined inhibition of DNA methylation and histone acetylation

enhances gene re-expression and drug sensitivity in vivo. Br J Cancer, 2009. 100(5): p. 758-63.

370. Liang, D., et al., Genetic variants in MicroRNA biosynthesis pathways and binding sites

modify ovarian cancer risk, survival, and treatment response. Cancer Res, 2010. 70(23):

p. 9765-76.

371. Schneider-Stock, R., et al., Epigenetic mechanisms of plant-derived anticancer drugs.

Front Biosci, 2012. 17: p. 129-73.

372. Terrazas, L.I., et al., Role of the programmed Death-1 pathway in the suppressive activity

of alternatively activated macrophages in experimental cysticercosis. Int J Parasitol,

2005. 35(13): p. 1349-58.

373. Wang, X., et al., Enlargement of secretory vesicles by protein tyrosine phosphatase PTP-

MEG2 in rat basophilic leukemia mast cells and Jurkat T cells. J Immunol, 2002. 168(9):

p. 4612-9.

374. Pearce, E.L., et al., Control of effector CD8+ T cell function by the transcription factor

Eomesodermin. Science, 2003. 302(5647): p. 1041-3.

375. Kaminsky, Z.A., et al., DNA methylation profiles in monozygotic and dizygotic twins. Nat

Genet, 2009. 41(2): p. 240-5.

376. Ptak, C. and A. Petronis, Epigenetics and complex disease: from etiology to new

therapeutics. Annu Rev Pharmacol Toxicol, 2008. 48: p. 257-76.

377. Ptak, C. and A. Petronis, Epigenetic approaches to psychiatric disorders. Dialogues Clin

Neurosci, 2010. 12(1): p. 25-35.