LUND UNIVERSITY
PO Box 117221 00 Lund+46 46-222 00 00
A Possible Mechanism behind Autoimmune Disorders Discovered By Genome-WideLinkage and Association Analysis in Celiac Disease
Ostensson, Malin; Montén, Caroline; Bacelis, Jonas; Gudjonsdottir, Audur H.; Adamovic,Svetlana; Ek, Johan; Ascher, Henry; Pollak, Elisabet; Arnell, Henrik; Browaldh, Lars; Agardh,Daniel; Wahlstrom, Jan; Nilsson, Staffan; Torinsson-Naluai, AsaPublished in:PLoS ONE
DOI:10.1371/journal.pone.0070174
2013
Link to publication
Citation for published version (APA):Ostensson, M., Montén, C., Bacelis, J., Gudjonsdottir, A. H., Adamovic, S., Ek, J., Ascher, H., Pollak, E., Arnell,H., Browaldh, L., Agardh, D., Wahlstrom, J., Nilsson, S., & Torinsson-Naluai, A. (2013). A Possible Mechanismbehind Autoimmune Disorders Discovered By Genome-Wide Linkage and Association Analysis in CeliacDisease. PLoS ONE, 8(8), [e70174]. https://doi.org/10.1371/journal.pone.0070174
Total number of authors:14
General rightsUnless other specific re-use rights are stated the following general rights apply:Copyright and moral rights for the publications made accessible in the public portal are retained by the authorsand/or other copyright owners and it is a condition of accessing publications that users recognise and abide by thelegal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private studyor research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal
Read more about Creative commons licenses: https://creativecommons.org/licenses/Take down policyIf you believe that this document breaches copyright please contact us providing details, and we will removeaccess to the work immediately and investigate your claim.
Download date: 30. Jun. 2021
https://doi.org/10.1371/journal.pone.0070174https://portal.research.lu.se/portal/en/publications/a-possible-mechanism-behind-autoimmune-disorders-discovered-by-genomewide-linkage-and-association-analysis-in-celiac-disease(8dc2b002-ad5c-49ea-9c6d-2e735ee7b58b).htmlhttps://doi.org/10.1371/journal.pone.0070174
A Possible Mechanism behind Autoimmune DisordersDiscovered By Genome-Wide Linkage and AssociationAnalysis in Celiac DiseaseMalin Östensson1, Caroline Montén2, Jonas Bacelis3, Audur H. Gudjonsdottir4, Svetlana Adamovic3,
Johan Ek5, Henry Ascher6, Elisabet Pollak3, Henrik Arnell7, Lars Browaldh8, Daniel Agardh2,
Jan Wahlström3, Staffan Nilsson1, Åsa Torinsson-Naluai3,9*
1 Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden, 2 Diabetes and Celiac Disease Unit, Department of Clinical Sciences,
Lund University, Malmö, Sweden, 3 Institute of Biomedicine, Department of Medical and Clinical Genetics, Sahlgrenska Academy at the University of Gothenburg,
Gothenburg, Sweden, 4 Queen Silvia Children’s Hospital, Sahlgrenska Academy at the University of Gothenburg, Department of Pediatrics, Gothenburg, Sweden,
5 Buskerud Central Hospital, Department of Pediatrics, Drammen, Norway, 6 Sahlgrenska Academy at the University of Gothenburg, Department of Public Health and
Community Medicine, Unit of Social Medicine, Gothenburg, Sweden, 7 Department of Pediatric Gastroenterology, Hepatology and Nutrition, Karolinska University
Hospital and Division of Pediatrics, CLINTEC, Karolinska Institutet, Stockholm, Sweden, 8 Department of Clinical Science and Education, Karolinska Institutet
Sodersjukhuset, Stockholm, Sweden, 9 Systems Biology Research Centre, Tumor Biology, School of Life Sciences University of Skövde, Skövde, Sweden
Abstract
Celiac disease is a common autoimmune disorder characterized by an intestinal inflammation triggered by gluten, a storageprotein found in wheat, rye and barley. Similar to other autoimmune diseases such as type 1 diabetes, psoriasis andrheumatoid arthritis, celiac disease is the result of an immune response to self-antigens leading to tissue destruction andproduction of autoantibodies. Common diseases like celiac disease have a complex pattern of inheritance with inputs fromboth environmental as well as additive and non-additive genetic factors. In the past few years, Genome Wide AssociationStudies (GWAS) have been successful in finding genetic risk variants behind many common diseases and traits. Tocomplement and add to the previous findings, we performed a GWAS including 206 trios from 97 nuclear Swedish andNorwegian families affected with celiac disease. By stratifying for HLA-DQ, we identified a new genome-wide significant risklocus covering the DUSP10 gene. To further investigate the associations from the GWAS we performed pathway analysesand two-locus interaction analyses. These analyses showed an over-representation of genes involved in type 2 diabetes andidentified a set of candidate mechanisms and genes of which some were selected for mRNA expression analysis using smallintestinal biopsies from 98 patients. Several genes were expressed differently in the small intestinal mucosa from patientswith celiac autoimmunity compared to intestinal mucosa from control patients. From top-scoring regions we identifiedsusceptibility genes in several categories: 1) polarity and epithelial cell functionality; 2) intestinal smooth muscle; 3) growthand energy homeostasis, including proline and glutamine metabolism; and finally 4) innate and adaptive immune system.These genes and pathways, including specific functions of DUSP10, together reveal a new potential biological mechanismthat could influence the genesis of celiac disease, and possibly also other chronic disorders with an inflammatorycomponent.
Citation: Östensson M, Montén C, Bacelis J, Gudjonsdottir AH, Adamovic S, et al. (2013) A Possible Mechanism behind Autoimmune Disorders Discovered ByGenome-Wide Linkage and Association Analysis in Celiac Disease. PLoS ONE 8(8): e70174. doi:10.1371/journal.pone.0070174
Editor: Anna Carla Goldberg, Albert Einstein Institute for Research and Education, Brazil
Received March 18, 2013; Accepted June 14, 2013; Published August 2, 2013
Copyright: � 2013 Östensson et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permitsunrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The principal funding for this study was provided through the regional agreement on medical training and clinical research (ALF) between GothenburgCounty Council and Gothenburg University. We also thank Bengt Ihre’s, Claes Groschinsky’s, Magnus Bergvall’s, Nilsson-Ehle’s, Tore Nilson’s and Professor NannaSvartz Foundations, Kungliga fysiografiska sällskapet in Lund, Frimurare barnhusdirektionen, Ruth and Richard Julin Foundation and the Swedish Society ofMedicine. The Swedish Medical Research Council and the Celiac Disease Foundation supported family sample collections. The funders had no role in study design,data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
* E-mail: [email protected]
Introduction
Celiac disease (CD) is a common chronic disease and even
though most often diagnosed in early childhood, it can present
itself at any age. Most of the individuals with CD remain
undiagnosed and an estimated 2% of the Swedish population is
affected without having been diagnosed [1]. Ongoing disease will
increase the overall risk for developing other chronic inflammatory
diseases, neurological manifestations and malnutrition disorders.
CD is the only autoimmune disorder where the actual genes
responsible for the association in HLA are known (HLA-DQA1 and
HLA-DQB1) [2]. In the past few years Genome Wide Association
Studies (GWAS) have had tremendous success in identifying new
genes, or gene regions, that influence common diseases. These
studies use several hundreds of thousands of genetic markers
(single nucleotide polymorphisms, SNPs) across all human
chromosomes in order to pin down the chromosomal locations
of genes, which could influence the disease.
A large joint effort has been done, not the least in CD, and 40
new CD-associated genetic regions marked by SNPs have been
PLOS ONE | www.plosone.org 1 August 2013 | Volume 8 | Issue 8 | e70174
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 2 August 2013 | Volume 8 | Issue 8 | e70174
discovered [3–7]. However, these genes cannot account for all CD
heritability, and part of the genetic variance that influences disease
development is still unknown [8].
Most GWAS so far have been performed on case control
samples. A case control study design has some advantages
compared to using a family study design. For example, in a case
control design it is possible to select a perfectly matched set of
controls to increase the chance of discovering susceptibility genes,
and furthermore, cases and controls are usually easier to collect
than individuals from the same family. However, using a family
material can be a very good complement to a case control design.
First of all, families with several affected members are likely to
have a stronger genetic component compared to sporadic cases.
Familial cases tend to be enriched for disease-predisposing alleles
and there is an increased power especially for detecting rare
genetic variants [9]. Another important fact is that statistical
analyses based on family data are robust against population
stratification. Already in their paper from 1996, Risch and
Merikangas suggested that all sib-pair families collected for non-
parametric linkage analysis in complex diseases, should be re-run
‘‘Genome-Wide’’ using SNP markers and the potentially more
powerful Transmission Disequilibrium Test (TDT) [10]. The
TDT test in sib-pairs is a test of linkage in the presence of
association. Hereafter we refer to whole genome sibling TDT as
‘‘Linkage GWAS’’.
In this study, we aimed to uncover additional genetic factors in
CD by performing a Linkage GWAS using 206 affected children
(sib-pairs) within 97 nuclear families using the TDT test. In
addition to the Linkage GWAS we explored gene-gene interac-
tions and pathway analyses. We also performed a non-parametric
linkage (NPL) analysis and compared the results with the published
linkage analysis, with microsatellite markers, performed in the
same set of families previously [11]. Furthermore, quantitative
PCR was used to investigate levels of gene expression in small
intestinal biopsies from additional patients with CD autoimmunity
(CAI) and control patients. Finally, we stratified the TDT analysis
on HLA genotype. It has been shown that carrying DQB1*02 on
both chromosomes (i.e. being homozygous), confers higher risk of
developing CD as compared to heterozygote individuals [12]. It is
therefore conceivable that heterozygote individuals may require
more additional risk factors outside HLA, in order to accumulate
sufficient risk to develop CD, compared with homozygous
individuals. Based on this assumption we stratified the patient in
an HLA low-risk group and an HLA high-risk group. By
stratifying the Linkage GWAS, we expected to uncover even
more of the so-called ‘‘missing heritability’’ in CD. This strategy
could identify different risk factors all together or perhaps a more
likely scenario is that the same risk factors outside HLA would just
be more common in the HLA low-risk group.
Results
Genotyping and ImputationWe included single nucleotide polymorphism (SNP) markers
that had a call rate above 97%, which led to the exclusion of 1.3%
of the Omni Express and 0.6% of the 660W-Quad SNP markers.
Out of the 127,535,126 imputed genotypes, 88.3% had a posterior
probability of over 0.95. Approximately 90% of the 944,512 SNP
markers had a minor allele frequency of at least 0.01 after
imputation.
Transmission Disequilibrium Test (TDT)All markers from the TDT analysis are shown in Figure 1a. As
expected, the region around the CD associated HLA genes onchromosome 6 showed the strongest association with the most
significant p-value reaching 4.9610221 at marker rs424232. InTable 1, we present the 35 most significant associations found
outside of HLA (HLA defined as SNP markers located within 27–
34 Mb on chromosome 6). The most significant finding outside of
the HLA region was the marker rs12734338 on chromosome 1,
including the PPP1R12B gene.
HLA Stratified Transmission Disequilibrium Test (TDT)In Figure 1b and Table 2, we present results from the TDT
analysis stratified on the HLA-DQ risk factor. For this analysis 115affected offspring trios were included in the ‘‘low-risk’’ group and
88 trios were put in the ‘‘high-risk’’ group. A region including the
DUSP10 gene (also known as MKP5) reached genome-widesignificance (p-value = 3.861028) in the low-risk group. Figure 1cpresents this region including the most associated SNPs plotted on
the x-axis using SNAP.
Interaction AnalysesSince some markers just below genome-wide significance are
still expected to be true findings, we wanted to try and separate
these from the, in fact, true negative findings (those that show
linkage and association close to genome-wide significance just by
chance). In total, 603 SNP markers from 383 independent regions
and their surrounding genes were identified by three inclusion
criteria (Fig. 2 and Table S1). These genes were subsequently used
for pathway and two-locus interaction analyses.
Two-locus interaction analysis. Two-locus interaction
analysis, identified 582 SNP pairs with a p-value of less than
1.061024 for the test comparing the model M0 of no associationand the general two-locus model MG. Out of these, 101 pairs from
87 regions deviated significantly (p,0.05) from a purely multipli-cative model (MM), which is the best fitting model when at least
one of the SNP markers is false. Under the null hypothesis we
expect to find 29 such pairs. The 101 pairs showed either epistasis
(individuals carry both risk alleles) or evidence of heterogeneity
(individuals carry either the one or the other risk allele from the
two loci).
The results with a p-value ,1.061024 for epistasis and thosewith high p-value (.0.05), which represent pairs that did not showconvincing deviation from the heterogeneity model are listed in
Table 3 and 4. Several loci were in an epistatic relationship with
HLA; rs4899272 (ACTN1), rs1073933 (COX7C), rs10482751(TGFB2), rs571879 (APPL1) and rs7590305 (FABP1). Also,previously identified susceptibility loci for CD were involved in
Figure 1. Manhattanplot of the TDT p-values. a) The location of all genotyped SNPs on chromosomes 1–22 and X plotted on the x-axis. –log10(p-value) result for each SNP and all transmissions on the y-axis. b) The location of all genotyped SNPs on chromosomes 1–22 and X plotted onthe x-axis. –log10(p-value) result for each SNP and all transmissions, to children in the low risk group, on the y-axis. c) Regional plot of associationresults and recombination rates, within the region surrounding DUSP10, generated by SNAP (http://www.broadinstitute.org/mpg/snap/ldplot.php).The x-axis show 500 kb around the most associated SNP. Genomic locations of genes within the region of interest (NCBI Build 36 human assembly)were annotated from the UCSC Genome Browser (arrows). The left y-axis show –log10(p-value) and estimated recombination rates (cM/Mb) fromHapMap Project (NCBI Build 36) are shown in light blue lines.doi:10.1371/journal.pone.0070174.g001
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 3 August 2013 | Volume 8 | Issue 8 | e70174
Ta
ble
1.
Tra
nsm
issi
on
Dis
eq
uili
bri
um
Te
st(T
DT
).
TD
T(P
LIN
K)
ex
pT
DT
Ch
rS
NP
Ge
ne
sB
PA
1A
2T
Up
-va
lue
TU
T/U
p-v
alu
eN
PL
GW
AS
cata
log
1rs
12
73
43
38
PP
P1
R1
2B
SYT
2U
BE2
T2
00
73
63
46
CT
39
90
7.1
1E-
06
61
.44
13
2.4
50
.46
3.4
1E-
07
0,00
95
10
rs1
08
86
15
9EM
X2
OS
RA
B1
1FI
P2
EMX
21
19
60
36
00
CT
24
50
2.5
1E-
03
40
.16
98
.48
0.4
17
.30
E-0
70
,19
*
21
rs1
04
39
88
4B
AG
E2T
PT
EB
AG
E9
99
38
22
AG
10
23
2.3
6E-
02
15
.52
55
.78
0.2
81
.86
E-0
60
,22
*
14
rs1
95
85
89
EAP
PSN
X6
C1
4o
rf1
47
33
91
41
27
CT
15
36
3.2
8E-
03
27
.25
73
.64
0.3
73
.87
E-0
60
,32
*
17
rs1
77
60
26
8A
NK
FN1
NO
G5
19
66
29
0C
T2
71
19
.44
E-0
35
7.2
41
7.4
53
.28
4.1
3E-
06
0,3
3*
Can
nab
isd
ep
en
d./
he
igh
t
4rs
10
32
35
5R
G9
MT
D2
C4
orf
17
MT
TP
10
07
58
91
9C
T3
38
16
.94
E-0
63
5.0
78
5.2
80
.41
4.7
4E-
06
0,00
69*
22
rs4
91
16
42
CC
T8
L2p
siT
PT
E22
14
88
43
99
CT
20
38
1.8
1E-
02
34
.38
84
.15
0.4
14
.86
E-0
60
,21
*H
IV-1
vira
lse
tpo
int
20
rs1
57
64
0D
OK
55
28
47
94
6G
T7
11
35
8.2
3E-
06
72
13
80
.52
5.2
5E-
06
0,3
8Fu
nct
ion
alM
RI
1rs
20
68
82
4N
AV
11
99
86
12
88
CT
21
69
.67
E-0
46
.83
36
.77
0.1
95
.75
E-0
60
,01
4*
3rs
26
05
39
3ST
AC
36
38
46
05
GT
40
83
1.0
6E-
04
58
.52
11
8.8
80
.49
5.8
6E-
06
0,1
3*
19
rs2
66
41
56
KLK
2K
LK3
KLK
P1
KLK
4K
LK1
55
60
68
97
5C
T4
81
66
.33
E-0
59
1.9
64
0.0
12
.30
6.1
3E-
06
0,3
4*
Pro
stat
eca
nce
r
16
rs1
95
65
6H
YD
IN6
96
04
98
5A
G1
83
42
.65
E-0
22
9.7
97
6.0
90
.39
6.8
3E-
06
0,5
2*
11
rs4
93
01
44
IGF2
AS
TH
MR
PL2
3T
NN
T3
SYT
8A
SCL2
TN
NI2
LSP
1IG
F2IN
S-IG
F2IN
SH
19
20
05
06
4A
G6
23
21
.97
E-0
31
16
.06
57
.04
2.0
37
.28
E-0
60
,38
*P
rost
ate
can
cer/
Typ
e1
dia
be
tes
4rs
17
02
91
73
RG
9M
TD
2C
4o
rf1
7M
TT
P1
00
72
83
44
GT
27
70
1.2
7E-
05
27
.06
71
.49
0.3
87
.65
E-0
60,
0069
*
4rs
13
12
84
41
STK
32
B5
21
32
90
CT
32
77
1.6
3E-
05
32
79
0.4
18
.16
E-0
60
,74
*C
oro
nar
yh
ear
td
ise
ase
3rs
18
71
35
0ST
AC
36
34
87
69
CT
27
69
1.8
1E-
05
27
.06
71
.19
0.3
88
.49
E-0
60
,13
*
3rs
20
46
00
0ST
AC
36
32
73
68
AC
28
72
1.0
8E-
05
28
.05
72
.46
0.3
99
.42
E-0
60
,13
*
17
rs7
20
97
52
CC
DC
14
4C
LOC
28
41
94
SPEC
C1
AK
AP
10
19
90
99
89
AG
92
64
.06
E-0
31
2.6
14
6.4
60
.27
1.0
6E-
05
0,7
5*
1rs
37
95
27
7K
IAA
17
51
PR
KC
ZG
AB
RD
19
70
97
8A
C2
08
2.3
3E-
02
46
.62
12
.76
3.6
51
.11
E-0
50
,52
*R
eas
on
ing
/he
igh
t
2rs
10
20
37
48
TG
FBR
AP
1C
2o
rf4
9N
CK
2FH
L2G
PR
45
10
54
42
54
2C
T2
04
33
.76
E-0
32
3.9
36
5.3
50
.37
1.1
7E-
05
0,0
13
*A
IDS
11
rs3
18
96
6N
TM
13
08
71
34
8A
G1
54
23
.49
E-0
43
3.8
68
0.7
90
.42
1.1
7E-
05
0,0
15
*A
spe
rge
rd
iso
rde
r
6rs
94
02
23
4T
MEM
20
0A
SAM
D3
13
08
69
17
5C
T1
77
4.1
2E-
02
49
.86
14
.87
3.3
51
.37
E-0
50
,38
*h
eig
ht
9rs
15
36
68
9C
9o
rf9
3B
CN
2a
16
11
96
30
AG
42
18
1.9
5E-
03
97
.95
46
.04
2.1
31
.52
E-0
50
,09
7*
Hb
A1
c/g
luco
sele
cve
ls
4rs
68
38
03
6D
C2
AG
XT
2L1
RP
L34
10
96
30
52
8A
C1
06
57
1.2
4E-
04
12
7.7
26
7.4
71
.89
1.6
2E-
05
0,00
48*
3rs
17
28
38
13
LPP
19
01
22
38
9A
G1
12
61
.37
E-0
21
7.5
15
3.9
60
.32
1.6
2E-
05
0,0
82
*Ig
E/vi
tili
go
/Ce
liac
3rs
18
71
35
2ST
AC
36
32
95
41
AC
27
69
1.8
1E-
05
27
.99
70
.79
0.4
01
.66
E-0
50
,13
*
1rs
12
74
79
34
FOX
D3
63
54
01
85
AG
20
57
2.4
8E-
05
24
.88
65
.59
0.3
81
.87
E-0
50
,68
*
1rs
43
23
66
2LO
C1
00
28
80
79
IVN
S1A
BP
18
36
97
11
7G
T4
12
21
.67
E-0
29
8.4
14
6.8
52
.10
1.8
9E-
05
0,0
32
*
3rs
18
42
14
9ST
AC
36
36
67
14
GT
51
17
3.7
4E-
05
61
.29
22
.22
2.7
61
.91
E-0
50
,13
*
19
rs3
81
48
92
PA
LMH
CN
2C
19
orf
21
PO
LRM
TFS
TL3
PR
SSL1
RN
F12
6FG
F22
58
98
53
AG
11
31
2.0
3E-
03
23
.45
63
.03
0.3
72
.08
E-0
50
,88
*
3rs
12
63
17
57
TH
RB
24
61
85
77
CT
38
15
1.5
8E-
03
80
.09
34
.53
2.3
22
.09
E-0
50
,22
*H
em
ato
l.an
db
ioch
em
.tr
aits
10
rs7
09
73
80
SOR
CS1
10
86
71
65
9A
G1
18
60
1.3
8E-
05
11
96
21
.92
2.2
7E-
05
0,2
0*
Hb
A1
c/g
luco
sele
cve
ls
1rs
12
73
40
01
PP
P1
R1
2B
SYT
2U
BE2
T2
00
65
75
37
CT
75
40
1.1
0E-
03
12
9.0
96
9.4
61
.86
2.3
2E-
05
0,0
12
*
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 4 August 2013 | Volume 8 | Issue 8 | e70174
several interactions: rs4899272 (ACTN1), rs6741418 (STAT1, GLS),
rs13096142 (CCR1,2,3,5), rs10197319 (ICOS, CTLA4) and
rs870875 (CD247).
Pathway analysis. Biological functions clustered by Ingenu-
ity Pathway Analysis (IPA) and Genetrail [13] are shown in
Table 5, 6 and 7. Several clusters were significant after correction
for multiple comparisons. The most significant network implicated
by IPA included DUSP10 (Fig. 3 and Table 8). The second top
network included the MHC complex (HLA) and the third top
network included LPP, which is located within the most
significantly non-HLA associated region identified in CD so far
[3].
Gene ExpressionOut of the 34 selected target genes, three were from the top
associated SNPs (DUSP10, SVIL and PPP1R12B) and the
remaining were genes identified from the two-locus and pathway
analysis. Eight genes showed significant up- or down-regulation
after correction for multiple testing using Bonferroni correction
(Fig. 4). For the top associated genes, several transcript variants
were tested (Table 9). For the PPP1R12B gene, Isoform c and d
(transcript variants NM032103.2 and NM032104.2) also known as
the small subunit (sm-M20) of myosin light chain phosphatase,
show significant up-regulation in patients with CD autoimmunity
compared to control patients. An additional ten genes showed
nominally significant differences in expression (Table 9).
Non-parametric Linkage (NPL)The strongest linkage outside of HLA was detected in
chromosome regions 5q23.2-q33.1, and 1q32.1. In total, thirteen
regions with an NPL point wise p-value below 0.01 were detected
(Fig. 5 and Table 10). In our previous linkage-scan, using almost
the same set of families, we detected only one region (11q23-25)
with a point wise p-value below 0.01 [14]. The reason for the
improved results is mainly the almost perfect information content
achieved by a dense set of highly successful SNP markers
compared to a relatively sparse set of less successful microsatellite
markers. Also in the NPL analysis, the PPP1R12B gene was
located in one of the top regions (1q32.1).
Discussion
This study confirmed some previous GWAS findings and in
addition, it established a new genome-wide significant region
containing the DUSP10 gene. The top markers, rs12144971 and
rs4240931 showed a substantial effect size in the HLA low-risk
group with a transmitted versus non-transmitted allele ratio of 3.11
(Table 2).
DUSP10, TNF-a and Tissue Transglutaminase (TGM2)The protein product of DUSP10 preferentially binds to the
stress-activated p38 MAPK (mitogen-activated protein kinase) and
plays an important role in regulating chemokine induction after
infection by various pathogens [15], and in coordinating MAPK
activity in response to oxidative stress [16]. In previous studies,
both p38 MAPK and DUSP10 have been shown to activate TNF-
a [17,18], of which one also demonstrates that TNF-a up-regulates TGM2 (the gene encoding the main autoantigen in CD
[19]) in intestinal mucosa from untreated CD patients [17].
Whether this up-regulation of TGM2 is of importance for the
immune response leading to formation of IgA-tTG and IgG-tTG
autoantibodies, the serological markers for CD is still unresolved.
Ta
ble
1.
Co
nt.
TD
T(P
LIN
K)
ex
pT
DT
Ch
rS
NP
Ge
ne
sB
PA
1A
2T
Up
-va
lue
TU
T/U
p-v
alu
eN
PL
GW
AS
cata
log
10
rs1
70
94
08
3G
FRA
11
17
85
08
41
CT
27
68
2.5
9E-
05
30
.84
74
.17
0.4
22
.35
E-0
50
,27
*
3rs
12
63
27
71
CX
3C
R1
39
22
38
56
AG
36
82
.43
E-0
53
68
4.5
02
.43
E-0
50
,21
*
Th
eto
p3
5as
soci
ate
dSN
Ps
are
liste
dto
ge
the
rw
ith
the
surr
ou
nd
ing
ge
ne
sd
efi
ne
db
ye
ith
er
Gra
il(w
ww
.bro
adin
stit
ute
.org
/mp
g/g
rail/
)o
rth
eG
en
om
eB
row
ser
(htt
p:/
/ge
no
me
.ucs
c.e
du
/).
Th
ed
ise
ase
asso
ciat
ion
sar
eac
qu
ire
dfr
om
the
‘‘Cat
alo
go
fP
ub
lish
ed
Ge
no
me
-Wid
eA
sso
ciat
ion
Stu
die
s’’
(htt
p:/
/ww
w.g
en
om
e.g
ov
/gw
astu
die
s/).
For
PLI
NK
:g
en
oty
pe
sw
ere
imp
ute
dif
any
of
the
po
ste
rio
rp
rob
abili
tie
sw
ere
.0
.95
.Fo
re
xpT
DT
:T
and
Uar
eth
ee
xpe
cte
dtr
ansm
issi
on
cou
nts
(bas
ed
on
all
the
po
ste
rio
rim
pu
tati
on
pro
bab
iliti
es)
.N
PL
–th
em
ost
sig
nif
ican
tN
on
Par
ame
tric
Lin
kag
e(N
PL)
p-v
alu
efo
rth
esa
me
locu
sas
the
SNP
.P
-val
ue
sb
elo
w0
.05
are
mar
ked
init
alic
s.T
and
U–
the
nu
mb
er
of
he
tero
zyg
ou
sp
are
nts
wh
otr
ansm
itth
eal
lele
sA
1an
dA
2,
resp
ect
ive
ly.
T/U
–tr
ansm
issi
on
od
ds
bas
ed
on
the
exp
ect
ed
tran
smis
sio
nco
un
ts.
*th
em
arke
rin
the
set
of
SNP
sfr
om
the
linka
ge
anal
ysis
clo
sest
toth
em
arke
rin
the
SNP
colu
mn
(wh
en
this
mar
ker
was
no
tru
nin
the
linka
ge
anal
ysis
).acl
ose
stkn
ow
ng
en
e.
loca
ted
.5
00
kbfr
om
asso
ciat
ed
SNP
.d
oi:1
0.1
37
1/j
ou
rnal
.po
ne
.00
70
17
4.t
00
1
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 5 August 2013 | Volume 8 | Issue 8 | e70174
Ta
ble
2.
HLA
stra
tifi
ed
Tra
nsm
issi
on
Dis
eq
uili
bri
um
Te
st(T
DT
). Hig
hR
isk
Lo
wri
skA
ll
Ch
rS
NP
ge
ne
(s)
BP
A1
A2
TU
T/U
ch
isq
p-v
alu
eT
UT
/Uc
his
qp
-va
lue
we
igh
ted
ch
isq
TU
T/U
ch
isq
P-v
alu
e
1rs
1214
4971
DU
SP10
2200
9910
8C
T26
350.
741.
332.
49E-
0187
283.
1130
.27
3.76
E-08
20.2
411
664
1.81
15.0
21.
06E-
04
1rs
4240
931
DU
SP10
2201
0567
8T
C26
350.
741.
332.
49E-
0187
283.
1130
.27
3.76
E-08
20.2
411
664
1.81
15.0
21.
06E-
04
10
rs1
24
76
97
SVIL
29
90
13
47
CA
41
35
1.1
70
.47
4.9
1E-
01
22
69
0.3
22
4.2
78
.35
E-0
71
3.4
46
31
04
0.6
11
0.0
71
.51
E-0
3
1rs
48
46
73
4D
USP
10
22
01
39
62
1G
A2
02
50
.80
0.5
54
.56
E-0
15
31
53
.53
21
.24
4.0
6E-
06
13
.00
76
41
1.8
51
0.4
71
.21
E-0
3
10
rs7
09
73
80
SOR
CS1
10
86
71
65
9A
G4
63
41
.35
1.8
1.8
0E-
01
72
26
2.7
72
1.5
93
.37
E-0
61
2.7
01
18
60
1.9
71
8.9
01
.38
E-0
5
2rs
67
55
30
8P
RK
CE
46
08
37
71
AG
11
33
0.3
31
19
.11
E-0
43
28
4.0
01
4.4
01
.48
E-0
41
2.6
24
34
11
.05
0.0
58
.27
E-0
1
1rs
11
81
16
13
DU
SP1
02
20
12
20
26
GA
19
25
0.7
60
.82
3.6
6E-
01
54
17
3.1
81
9.2
81
.13
E-0
51
2.2
27
64
31
.77
9.1
52
.49
E-0
3
2rs
13
01
70
44
PR
KC
E4
60
86
85
3A
G1
23
90
.31
14
.29
1.5
6E-
04
43
18
2.3
91
0.2
51
.37
E-0
31
2.0
95
55
70
.96
0.0
48
.50
E-0
1
10
rs1
11
93
12
0SO
RC
S11
08
67
87
68
GA
50
35
1.4
32
.64
1.0
4E-
01
70
26
2.6
92
0.1
77
.10
E-0
61
1.9
41
20
61
1.9
71
9.2
31
.16
E-0
5
1rs
11
10
21
46
KC
NA
31
11
00
75
59
CT
13
17
0.7
60
.53
4.6
5E-
01
48
15
3.2
01
7.2
93
.22
E-0
51
1.8
86
13
41
.79
7.6
75
.60
E-0
3
10
rs4
74
84
17
STA
MT
MEM
23
61
78
19
81
2T
C2
0N
A2
1.5
7E-
01
18
29
.00
12
.80
3.4
7E-
04
11
.82
20
21
0.0
14
.73
1.2
4E-
04
2rs
49
72
81
0D
LX1
DLX
2P
DK
1M
AP
1D
ITG
A6
17
29
26
13
5A
G1
51
11
.36
0.6
14
.33
E-0
14
11
13
.73
17
.31
3.1
8E-
05
11
.74
56
22
2.5
51
4.8
21
.18
E-0
4
3rs
18
71
35
2ST
AC
36
32
95
41
AC
15
25
0.6
02
.51
.14
E-0
11
24
40
.27
18
.29
1.9
0E-
05
11
.71
27
69
0.3
91
8.3
81
.81
E-0
5
3rs
18
71
35
0ST
AC
36
34
87
69
CT
15
25
0.6
02
.51
.14
E-0
11
24
40
.27
18
.29
1.9
0E-
05
11
.71
27
69
0.3
91
8.3
81
.81
E-0
5
10
rs1
08
84
38
7SO
RC
S11
08
68
21
42
TC
43
33
1.3
01
.32
2.5
1E-
01
71
27
2.6
31
9.7
68
.80
E-0
61
1.7
01
14
60
1.9
01
6.7
64
.25
E-0
5
3rs
20
46
00
0ST
AC
36
32
73
68
AC
15
27
0.5
63
.43
6.4
1E-
02
13
45
0.2
91
7.6
62
.65
E-0
51
1.6
82
87
20
.39
19
.36
1.0
8E-
05
1rs
12
73
43
38
PP
P1
R1
2B
SYT
2U
BE2
T2
00
73
63
46
CT
20
36
0.5
64
.57
3.2
5E-
02
18
53
0.3
41
7.2
53
.27
E-0
51
1.6
63
99
00
.43
20
.16
7.1
1E-
06
22
rs1
29
68
26
BID
BC
L2L1
3SL
C2
5A
18
AT
P6
V1
E11
64
59
51
8C
T3
90
.33
38
.33
E-0
23
23
0.1
31
5.3
88
.77
E-0
51
1.4
76
32
0.1
91
7.7
92
.47
E-0
5
4rs
76
87
17
6IN
TS1
2G
STC
D1
06
74
65
55
TC
10
71
.43
0.5
34
.67
E-0
12
24
0.0
81
8.6
21
.60
E-0
51
1.4
71
23
10
.39
8.4
03
.76
E-0
3
2rs
49
72
80
9D
LX1
DLX
2P
DK
1M
AP
1D
ITG
A6
17
29
25
33
7A
G1
61
21
.33
0.5
74
.50
E-0
14
11
13
.73
17
.31
3.1
8E-
05
11
.45
57
23
2.4
81
4.4
51
.44
E-0
4
6rs
13
20
75
43
ELO
VL4
SH3
BG
RL2
TT
K8
05
92
60
1A
C1
32
60
.50
4.3
33
.74
E-0
25
21
92
.74
15
.34
8.9
9E-
05
11
.44
65
45
1.4
43
.64
5.6
5E-
02
4rs
10
32
35
5R
G9
MT
D2
C4
orf
17
MT
TP
10
07
58
91
9C
T9
42
0.2
12
1.3
53
.82
E-0
62
43
70
.65
2.7
79
.60
E-0
21
1.2
33
38
10
.41
20
.21
6.9
4E-
06
5rs
11
95
26
77
SCA
MP
1LH
FPL2
77
82
36
90
GA
27
34
0.7
90
.80
3.7
0E-
01
55
17
3.2
42
0.0
67
.52
E-0
61
1.2
38
45
11
.65
8.0
74
.51
E-0
3
1rs
12
74
35
21
DU
SP1
02
20
16
75
71
AG
21
24
0.8
80
.26
.55
E-0
15
01
53
.33
18
.85
1.4
2E-
05
11
.22
74
40
1.8
51
0.1
41
.45
E-0
3
12
rs1
10
68
31
5FB
XW
8H
RK
TES
CR
NFT
21
15
99
49
35
TC
86
1.3
30
.28
5.9
3E-
01
42
60
.15
16
.13
5.9
0E-
05
11
.09
12
32
0.3
89
.09
2.5
7E-
03
6rs
77
45
05
2FB
XL4
C6
orf
16
8U
SP4
5C
OQ
3P
OU
3F2
SFR
S18
99
74
73
31
AG
54
17
3.1
81
9.2
81
.13
E-0
53
22
51
.28
0.8
63
.54
E-0
11
1.0
88
64
22
.05
15
.12
1.0
1E-
04
12
rs1
72
45
50
1P
PFI
A2
80
26
04
70
CT
17
18
0.9
40
.02
88
.66
E-0
14
51
33
.46
17
.66
2.6
5E-
05
11
.02
62
31
2.0
01
0.3
31
.31
E-0
3
1rs
75
44
50
1D
USP
10
22
01
68
98
5T
C2
12
40
.88
0.2
6.5
5E-
01
51
16
3.1
91
8.2
81
.90
E-0
51
1.0
27
54
11
.83
9.9
71
.60
E-0
3
1rs
78
56
27
LPH
N2
82
52
11
65
TC
30
17
1.7
73
.60
5.7
9E-
02
25
61
0.4
11
5.0
71
.04
E-0
41
1.0
25
67
90
.71
3.9
24
.78
E-0
2
8rs
72
01
31
RA
D2
11
17
98
51
96
AG
35
40
0.8
80
.33
5.6
4E-
01
79
34
2.3
21
7.9
22
.30
E-0
51
0.9
01
15
75
1.5
38
.42
3.7
1E-
03
12
rs1
11
04
36
5M
GA
T4
C8
61
64
65
0C
T4
15
70
.72
2.6
11
.06
E-0
13
47
90
.43
17
.92
2.3
0E-
05
10
.81
77
13
60
.57
16
.34
5.2
9E-
05
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 6 August 2013 | Volume 8 | Issue 8 | e70174
Pathway AnalysesIn order to discover possible functional connections between
DUSP10 and other genes, we analyzed genes surrounding the top
603 markers. A total of 845 genes were used in the analysis.
Ingenuity pathway analysis (IPA) included DUSP10 within the
most significant network. Also part of this network were GLS and
RGS1, two genes previously identified within significant GWAS
loci [3], as well as the insulin (INS) gene, and the immune
regulatory nuclear factor kappa B (NF-Kb) complex (Fig. 3 andTable 8). The second top network included the MHC complex
(HLA) and also several genes within already identified GWAS loci:
ACTN1, CD247, CCR5, ICOS and STAT1 [3]. In addition, both
IPA and GeneTrail [13] identified T2D genes as the most
significantly overrepresented gene cluster after correction for
multiple testing (Table 5 and 6). Among this set of genes
surrounding the 603 markers, many genes belonged to growth
and nutrient signaling pathways, for example, INS, INSR, EGF,
POMC, TIPRL and PRR5L. There were also related genes directly
involved in energy metabolism; PDK1, COX7C, COQ3 and GLS.
Overlapping Results with Other GWAS FindingsSurprisingly, four out of six top loci identified by a GWAS for
anorexia nervosa [20] and two out of three loci involved in plasma
glucose levels in type 1 diabetic patients [21] were among our 603
and 35 best SNP markers respectively. One of the genes in
anorexia, namely AKAP6, is also associated to fasting insulin-related
traits as well as the autoimmune disease Ankylosing spondylitis
[22]. Of the 40 identified regions in CD, seven regions overlap
with our 603 SNP list (LPP, STAT4/GLS, RGS1, CCR1/CCR3,
PUS10, ICOS/CTLA4 and CD247). Out of the 69 regions reported
in the GWAS catalog for type 1 diabetes, eight overlap with the
regions reported in this study and out of those eight, CTLA4/ICOS
also overlap with the previously reported CD associations.
We compared minor allele frequencies between the previous
CD GWAS by Dubois et al. and our GWAS. In their top 42
associations, there was no SNP below a minor allele frequency of
0.08. In our top 42 associations, we identified five SNPs with a
minor allele frequency below 0.06. This observation could just be
a chance finding or perhaps an indication that rare variants are
easier to discover using families. We also identified a relatively rare
variant in the LPP gene region (rs17283813), with a minor allele
frequency of 0.075. This SNP was not at all significant in the
GWAS by Dubois et al. (Table S1).
Neither was there an association with the DUSP10 region in the
GWAS by Dubois and co-workers. The associated markers in the
DUSP10 region in our GWAS have a minor allele frequency
around 0.5 and are hence very common in the population. It is
difficult to say if this is a population specific effect or if DUSP10
could be detected in an HLA stratified population from another
ethnicity. Interestingly, the DUSP10 region has also been identified
as a risk factor for colon cancer by a meta-analysis of three GWAS
from the UK. This is an indication that colon cancer and CD
could share genetic risk factors.
Key Metabolic Regulators as well as the Top Associatedgene PPP1R12B were Differently Expressed in CD CasesCompared to Controls
Another important finding was the difference between cases and
controls and their gene expression patterns in the small intestine.
Eight of the 34 candidate genes selected for quantitative
measurements of gene expression, including PPP1R12B, PDK1,
GLS, PRR5L and the INSR, showed significant up or down
regulation of mRNA levels in cases compared to controls (Fig. 4).
Ta
ble
2.
Co
nt.
Hig
hR
isk
Lo
wri
skA
ll
Ch
rS
NP
ge
ne
(s)
BP
A1
A2
TU
T/U
ch
isq
p-v
alu
eT
UT
/Uc
his
qp
-va
lue
we
igh
ted
ch
isq
TU
T/U
ch
isq
P-v
alu
e
14
rs7
14
40
18
NA
T1
2EX
OC
5C
14
orf
10
85
66
76
82
0C
T2
93
9.6
72
1.1
24
.30
E-0
61
42
00
.70
1.0
63
.04
E-0
11
0.7
94
52
31
.96
7.1
27
.63
E-0
3
Th
eto
p3
2as
soci
ate
dSN
Ps
fro
mth
ere
sult
so
fth
eH
LAst
rati
fie
dan
alys
is.S
urr
ou
nd
ing
ge
ne
sar
ed
efi
ne
db
ye
ith
er
Gra
il(w
ww
.bro
adin
stit
ute
.org
/mp
g/g
rail/
)o
rth
eG
en
om
eB
row
ser
(htt
p:/
/ge
no
me
.ucs
c.e
du
/).T
he
low
risk
gro
up
con
sist
so
f1
15
trio
san
dth
eh
igh
risk
gro
up
of
88
trio
s.G
en
oty
pe
sw
ere
imp
ute
dif
any
of
the
po
ste
rio
rp
rob
abili
tie
sw
ere
.0
.95
.C
his
q–
the
valu
eo
fth
eT
DT
test
stat
isti
cfo
rth
etr
ansm
issi
on
cou
nts
.W
eig
hte
dC
his
q–
bas
ed
on
the
tran
smis
sio
nco
un
tso
fth
elo
wan
dh
igh
-ris
kg
rou
ps.
Th
isva
lue
isu
sed
for
ran
kin
gth
ere
sult
s.T
and
U–
the
nu
mb
er
of
he
tero
zyg
ou
sp
are
nts
wh
otr
ansm
itth
eal
lele
sA
1an
dA
2,
resp
ect
ive
ly.
T/U
–tr
ansm
issi
on
od
ds
bas
ed
on
the
exp
ect
ed
tran
smis
sio
nco
un
ts.
do
i:10
.13
71
/jo
urn
al.p
on
e.0
07
01
74
.t0
02
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 7 August 2013 | Volume 8 | Issue 8 | e70174
Figure 2. Illustration of the three inclusion criteria used for pathway and interaction analyses. The first criteria of p-values less than3.061024 in the linkage TDT analysis resulted in a total of 477 markers. The second criteria included a comparison of the results from this study withthe results from the study by Dubois et al. [3]. We included 118 SNPs that had a simple score based on a combined p-value less than 5.061025 and inthe same allelic direction in both datasets. The third criteria involved selecting markers with a large effect size. We included 65 markers which had aratio of transmitted versus not transmitted (T/NT) alleles of over 5 or below 0.2, combined with a p-value of less than 2.061023.doi:10.1371/journal.pone.0070174.g002
Table 3. The top epistasis interaction results from the 101 two-locus interaction analysis.
Snp 1 Genes chr Snp 2 Genes chr N P02 P12 PM2
rs2187668 HLADQ 6 rs4899272 ACTN1 14 95 4.0E-17 1.42E-13 4.E-02
rs204034 SHISA9 16 94 1.3E-14 1.09E-12 5.E-02
rs571879 APPL1 HESX1 IL17RD. DNHD2. ASB14 3 94 2.3E-15 5.21E-11 3.E-02
rs204999 HLA 6 rs1073933 COX7C 5 94 9.9E-14 9.27E-12 3.E-02
rs11836636 ATXN7L3B KCNC2 12 91 1.1E-12 8.15E-11 4.E-02
rs7745052 FBXL4. C6orf168. USP45. COQ3. POU3F2. SFRS18 6 92 2.3E-05 1.79E-05 4.E-02
rs10749738 FOXD3 1 rs1373649 BMPR1B 4 93 2.7E-05 1.78E-05 4.E-02
rs3860295 RASSF5 IKBKE 1 rs13096142 CCR5 CCR3 LTF CCR2 CCR1 3 95 1.1E-05 6.48E-06 1.E-02
rs9396802 KIF13A NUP153 FAM8A1 6 rs2194633 NETO1 18 95 3.8E-06 6.82E-06 2.E-02
rs9296204 MTCH1 PI16 6 rs4385459 LY96 JPH1 GDAP1 TMEM70 TCEB1 8 95 2.8E-05 9.91E-06 3.E-02
rs9397928 ARID1B* 6 rs2415836 FSCB* 14 93 2.8E-05 1.75E-05 3.E-03
rs1145212 APOA5 ZNF259 BUD13 11 rs10083673 MYO5A 15 95 6.6E-05 1.77E-05 2.E-03
rs7756191 DNAH8 6 rs1108001 NAV2 HTATIP2 DBX1 PRMT3 11 95 3.5E-05 2.60E-05 3.E-03
rs10197319 ICOS CTLA4 2 rs882820 SRL TFAP4 16 94 1.4E-05 3.03E-05 3.E-05
rs4899272 ACTN1 14 rs17703807 C15orf41 15 83 2.9E-05 8.68E-05 1.E-02
All SNP pairs which reached an interaction p-value of P12,1.061024, in addition to PM2,0.05.
*closest known gene. located .500 kb from associated SNP.P02– p-value for the test statistic comparing the models M0 (no association) and the general model MG.P12– p-value for the test test comparing the models MR (heterogeneity) and the general model MG.PM2– p-value for the test comparing the models MM (multiplicative) and the general model MG.doi:10.1371/journal.pone.0070174.t003
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 8 August 2013 | Volume 8 | Issue 8 | e70174
This could very well be a consequence of an ongoing inflammation
or possibly also indicate an underlying metabolic difference.
Glutamine is converted to glutamate by the enzyme glutaminase
(GLS). In turn, glutamate can be converted to proline and
subsequently catabolized by the enzyme proline dehydrogenase
(PRODH) resulting in the production of reactive oxygen species
and apoptosis [23]. In the present study, we show that the
expression of GLS is down-regulated and PDK1 is up-regulated in
cases. Interestingly, a previous study has shown that cell lines with
a known familiar mutation for amyotrophic lateral sclerosis (ALS)
have the same expression pattern, with up-regulated PDK1 and
down-regulated GLS, as compared to the wild-type cell line [24].
PRR5L (also called Protor-2) belong to the TOR signaling
pathway. Our results show an up-regulated PRR5L expression in
cases (Fig. 4). Like DUSP10, the protein product from PRR5L has
been shown to stimulate an increased TNF-a expression [25].Another gene, connected to the MAPK pathway and which was
identified both by our two-locus interaction analysis and in
significant biological functions implied by IPA, was the APPL1
gene. APPL1 is a binding partner of the protein kinase Akt2 and a
key regulator of insulin signaling [26]. It takes part in adiponectin
signaling to stimulate activity of p38 MAPK in muscle cells [27]
and is a critical regulator of the crosstalk between adiponectin
signaling and insulin signaling pathways [28]. We could detect
expression of both APPL1 and APPL2 in small intestinal biopsies
and a significantly lower expression of APPL2 was detected in the
CD autoimmunity cases as compared to controls (Fig. 4). Lower
expression of APPL2 levels lead to enhanced adiponectin
stimulated glucose uptake and fatty acid oxidation [29]. A SNP
(rs10861406) included in the top 603 list was located upstream of
the APPL2 gene, however the promotor of this gene was on the
opposite side of a recombination hotspot and therefore not
included in the gene list for pathway analyses.
The most significant finding from our non-stratified linkage
GWAS analysis was the association with the PPP1R12B gene
region. PPP1R12B is involved in smooth muscle contractibility and
mediates binding to myosin [30]. Myosin light chain phosphatase
from smooth muscle consists of a catalytic subunit (PP1c) and two
non-catalytic subunits, M130 and M20. The two non-catalytic
subunits are both encoded by the PPP1R12B gene. The M130
transcript was not differentially expressed between CD autoim-
munity and control patients while the small subunit ‘‘M20’’
showed a significantly higher expression in patients with CD
autoimmunity. (PPP1R12B_22 in Fig. 4) Several other genes
located close to top markers such as the PPP3CA, ACTN1, MYO1B,
MYO5A, MAPK1, PRKCH, PRKCQ, PRKACB, PRR5L and NTS
genes, are connected to smooth muscle when examining their
function by using KEGG [31] and Gene Ontology [32].
Table 4. The top heterogeneity results from the 101 two-locus interaction analysis.
SNP1 Genes chr SNP2 Genes chr N P02 P12 PM2
rs4899272 ACTN1 14 rs4820682 SRRD HPS4 TFIP11 ASPHD2 MIR548JTPST2 CRYBB1 CRYBA4
22 95 7.1E-06 6.97E-02 2.E-02
rs4426448 DOK6 18 94 9.5E-06 6.80E-01 1.E-02
rs870875 CD247 1 94 9.4E-05 7.19E-02 3.E-02
rs4842007 PAEP 9 95 8.6E-06 5.66E-01 4.E-02
rs571879 APPL1 HESX1 IL17RDDNHD2 ASB14
3 rs4385459 LY96 JPH1 GDAP1 TMEM70 TCEB1 8 94 4.1E-05 5.81E-01 5.E-02
rs7590305 FABP1 THNSL2 2 rs390495 MICAL3 22 93 7.0E-05 9.09E-01 3.E-03
rs7745052 FBXL4 C6orf168 USP45COQ3 POU3F2 SFRS18
6 rs4930144 IGF2AS TH MRPL23 TNNT3 SYT8 ASCL2TNNI2 LSP1 IGF2 INS-IGF2 INS H19
11 50 1.9E-05 5.30E-01 3.E-02
rs10749738 FOXD3 1 rs10498982 EPHA7* 6 93 2.0E-05 1.95E-01 4.E-02
rs2605393 STAC 3 63 7.3E-05 4.37E-01 4.E-02
rs2187668 HLADQ 6 rs11013804 KIAA1217 10 94 3.5E-14 8.40E-02 2.E-02
rs1676235 ESRRB ANGEL1 VASH1 14 43 2.0E-07 8.55E-02 3.E-02
rs958802 KANK4 L1TD1 INADL 1 rs2194633 NETO1 18 95 1.9E-05 5.55E-01 3.E-02
rs2345981 KHDRBS2 6 rs6495130 RYR3 15 94 6.1E-05 1.58E-01 3.E-02
rs11940562 PCDH7* 4 rs4905043 ITPK1 CHGA 14 44 4.6E-05 2.77E-01 2.E-02
rs4656538 POU2F1 1 rs2187668 HLADQ 6 94 3.0E-13 1.19E-01 5.E-02
rs3860295 RASSF5 IKBKE 1 rs7046385 SMC2 9 94 5.3E-05 1.07E-01 2.E-02
rs6741418 STAT1 GLS STAT4 2 rs10798004 C1orf25 C1orf26 IVNS1ABP RNF2 1 87 7.2E-05 7.68E-02 4.E-02
rs1571812 VLDLR 9 86 3.0E-05 9.19E-02 4.E-02
rs882820 SRL TFAP4 16 87 4.2E-05 3.52E-01 6.E-03
rs1470379 VIM 10 82 1.0E-05 3.70E-01 8.E-03
rs10946659 DCDC2 NRSN1 6 87 1.9E-06 6.64E-01 9.E-03
rs10482751 TGFB2 1 rs1571812 VLDLR 9 92 5.2E-05 1.86E-01 1.E-02
All SNP pairs which reached an interaction p-value of P12.0.05, in addition to PM2,0.05.*closest known gene located .500 kb from associated SNP.P02– p-value for the test statistic comparing the models M0 (no association) and the general model MG.P12– p-value for the test test comparing the models MR (heterogeneity) and the general model MG.PM2– p-value for the test comparing the models MM (multiplicative) and the general model MG.doi:10.1371/journal.pone.0070174.t004
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 9 August 2013 | Volume 8 | Issue 8 | e70174
The second most significant region in the HLA-stratified
analysis after DUSP10 contains the SVIL gene. The product of
this gene has been suggested to bind LPP [33]. In our two-locus
interaction analysis, the LPP locus and a locus containing KIF13A
was one of the 101 interaction pairs. KIF13A is a motor protein,
which shuttles vesicles containing AP-1 and the mannnose-6-
phosphate receptor [34]. KIF13A was significantly down-regulated
in intestinal biopsies from CD patients in our gene expression
analysis (Fig. 4). SVIL is associated with cell-focal adhesions
(substrate contacts), which are important for rapidly moving cells
such as for example immune cells but also for motility and polarity
of intestinal epithelial cells. SVIL mRNA was down-regulated in
our gene expression analysis, however, not significant after
correction for multiple testing.
Proline and Glutamine Metabolism - Part of a ‘‘DangerSignal’’
Amoebiasis was one of the nominally significant pathways in the
GeneTrail analysis of genes surrounding the two-locus interaction
SNPs (Table 7). Several of these genes were also present together
with DUSP10 and the MHC class II genes in the two most
significant IPA generated networks (marked in bold text in
Figure 3. Ingenuity network 1. The top network identified by the Ingenuity IPA software using genes surrounding all 603 most associated SNPsfrom the TDT analysis. Molecules in gray were present among the genes from our TDT analysis and molecules in white were added by the IPAsoftware. The DUSP10 gene is marked in yellow.doi:10.1371/journal.pone.0070174.g003
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 10 August 2013 | Volume 8 | Issue 8 | e70174
Table 5. Biological functions of genes surrounding the 603 top associated SNPs. Results from IPA.
FunctionAnnotation
p-value(Raw)
B-H p-value* Molecules Molecules
non-insulin-dependentdiabetesmellitus
0.0000057 0.025 ABCC8. ADRA1B. ADRA1D. AGT. APOA5. ATP10A. BCL2L11. CCR5. CD38. CNTNAP2. FOXP1. FTO. HFE.HFE2. INS. INSR. KCNJ11. KIRREL3. KLF10. mir-154. mir-448. MTTP. PBX3. PIEZO2. PPARA. PPP3CA. PRDM10.RGS5. VEGFA. ZMYM2
30
quantity ofmetal
0.0000082 0.025 ABCC8. ADRA1B. AGT. APLP2. ATP2B3. BCL2. BMP2. BTK. CAMLG. CCR5. CD247. CD38. CHGA. CX3CR1.CXCL13. DARC. DCN. DVL1. EGF (includes EG:13645). FBXL5. FCER1A. GNA14. GNB1. HFE. HFE2. IGF2. INS.INSR. KCNJ11. LTF. NTS. NUCB2. POMC. PRL. PRNP. PTGDR2. RGS1. RYR3. SELL. SOD1. TRPM8. TXNIP.VAV3. VEGFA
44
incorporationof thymidine
0.000010 0.025 AGT. AKAP13. BMP2. CD40. EGF (includes EG:13645). IGF2. INS. INSR. PRL. THBS2. TNFSF13B. VEGFA. WT1 13
quantity ofCa2+
0.000018 0.033 ABCC8. ADRA1B. AGT. ATP2B3. BCL2. BTK. CAMLG. CCR5. CD247. CD38. CHGA. CX3CR1. CXCL13. DARC.DCN. DVL1. EGF (includes EG:13645). FCER1A. GNA14. GNB1. IGF2. INS. INSR. KCNJ11. NTS. NUCB2. POMC.PRL. PRNP. PTGDR2. RGS1. RYR3. SELL. SOD1. TRPM8. VAV3. VEGFA
37
eyedevelopment
0.000022 0.033 BID. BMPR1B. CD247. CHD7. CRYBB2. CX3CR1. DLX1. DNMT3A. EBF3. EGF (includes EG:13645). FJX1. FTO.GJA3. H19. HESX1. IFT88. IGF2. IRX3. ITGA6. LUM. MITF. OGN. PAX5. PROM1. PRRX2. PYGO1. SEMA5A. SOD1.STAT1. TGFB2. TH. THBS2. THRB. TUB. USH2A. VEGFA. WT1
37
diabetesmellitus
0.000027 0.034 ABCC8. ABCG1. ABT1. ADRA1B. ADRA1D. AGT. APOA5. ATP10A. BCL2. BCL2L11. BTC. BTN2A1. BTN3A2. CBLB.CCR5. CD200. CD38. CD40. CNTNAP2. CYBA. E2F3. ENAH. FOXP1. FTO. GABRD. HFE. HFE2. HIST1H3A(includes others). HTR2C. ICOS. IGF2-AS1. INS. INSR. KCNJ11. KIRREL3. KLF10. mir-154. mir-448. MTTP. PBX3.PDE8A. PGM1. PIEZO2. PPARA. PPP3CA. PRDM10. PRSS16. PRUNE2. PXDNL. RGS1. RGS5. SELL. SOD1. TH.THRB. TSPO. VEGFA. ZMYM2
58
angiogenesisof bone
0.000032 0.034 BMP2. NOG. TGFB2. VEGFA 4
quantity ofmetal ion
0.000071 0.043 ABCC8. ADRA1B. AGT. ATP2B3. BCL2. BTK. CAMLG. CCR5. CD247. CD38. CHGA. CX3CR1. CXCL13. DARC.DCN. DVL1. EGF (includes EG:13645). FCER1A. GNA14. GNB1. IGF2. INS. INSR. KCNJ11. NTS. NUCB2. POMC.PRL. PRNP. PTGDR2. RGS1. RYR3. SELL. SOD1. TRPM8. TXNIP. VAV3. VEGFA
38
developmentof head
0.000069 0.043 BCL2. BCL2L11. BID. BMP2. BMPR1B. CD247. CHD7. CRYBB2. CX3CR1. DLX1. DNMT3A. EBF3. EGF (includesEG:13645). FJX1. FTO. GJA3. H19. HESX1. IFT88. IGF2. IRX3. ITGA6. LUM. MITF. MYO5A. NOG. OGN. PAX5.PROM1. PRRX2. PYGO1. SEMA5A. SOD1. STAT1. TGFB2. TH. THBS2. THRB. TUB. USH2A. VEGFA. WT1
42
migration ofcells
0.000057 0.043 ADI1 (includes EG:104923). AGT. APLP2. APPL1. ARHGAP5. B3GAT1. BCL2. BGN. BID. BMP2. BTC. BTK. CBLB.CCR5. CD200. CD247. CD36. CD38. CD40. CD99. CHGA. CMA1. CNTNAP2. CSF2RA. CTBP2. CTNNA2. CTSG.CX3CR1. CXCL13. DARC. DCDC2. DCN. DISC1. DLX1. DYX1C1. E2F3. EBF3. EGF (includes EG:13645). ELMO2.FCER1A. FH. FHL2. GFRA1. GNA12. GRIA2. GZMB. HTATIP2. ICOS. IGF2. INS. INSR. ITGA6. KIAA0319. LAMA2. LPP.LSP1 (includes EG:16985). LTF. LUM. LY96 (includes EG:17087). MAPK1. MNX1. MTAP. MYO10. MYO1F. NAV1.NOG. NPTX2. NTS. NUCB2. PAEP. PDPN (includes EG:10630). PEX11B. PEX13. POMC. POU3F2. PPARA. PPM1F.PRKCQ. PRKCZ. PRL. PRNP. PROK2. PTGDR2. PTGES. PTK2 (includes EG:14083). PVR. RASSF5. RGS1. SATB2. SELL.SEMA5A. SOD1. STAT1. TDP2. TGFB2. THBS2. TIAM1. TNFAIP8. TNFRSF18. TNFRSF4. TNFSF13B. TSPO. UNC5C.VASH1. VAV3. VEGFA. VIM. WWOX
108
cell movement0.000073 0.043 ADCY10. ADI1 (includes EG:104923). AGT. APLP2. APPL1. ARHGAP5. B3GAT1. BCL2. BGN. BID. BMP2. BTC.BTK. CATSPER3. CBLB. CCR5. CD200. CD247. CD36. CD38. CD40. CD99. CHGA. CMA1. CNTNAP2. CSF2RA.CTBP2. CTNNA2. CTSG. CX3CR1. CXCL13. DARC. DCDC2. DCN. DISC1. DLX1. DYX1C1. E2F3. EBF3. EGF (includesEG:13645). ELMO2. ENAH. FCER1A. FH. FHL2. GFRA1. GNA12. GNB1. GRIA2. GZMB. HTATIP2. ICOS. IFT88. IGF2.INS. INSR. ITGA6. KIAA0319. LAMA2. LPP. LSP1 (includes EG:16985). LTF. LUM. LY96 (includes EG:17087). MAPK1.MNX1. MTAP. MYO10. MYO1F. NAV1. NCK2. NOG. NPTX2. NTS. NUCB2. PAEP. PDPN (includes EG:10630). PEX11B.PEX13. POMC. POU3F2. PPARA. PPM1F. PRKCQ. PRKCZ. PRL. PRNP. PROK2. PTGDR2. PTGES. PTK2 (includesEG:14083). PVR. RASSF5. RGS1. RGS10. SATB2. SELL. SEMA5A. SOD1. SPAG16. STAT1. TAS1R3. TDP2. TGFB2. THBS2.THRB. TIAM1. TNFAIP8. TNFRSF18. TNFRSF4. TNFSF13B. TSPO. UNC5C. VASH1. VAV3. VEGFA. VIM. WWOX
118
apoptosis 0.000069 0.043 ABCG1. ADCY10. ADI1 (includes EG:104923). ADRA1B. ADRA1D. AGPAT2. AGT. APPL1. ATXN1. BCL2. BCL2L11.BCL2L13. BGN. BID. BIK. BMP2. BMPR1B. BTC. BTK. CACNA1A. CBLB. CCDC86. CCNI. CCNL2. CCR5. CD200.CD247. CD36. CD38. CD40. CD5L. CD99. CDK11A/CDK11B. CSF2RA. CTBP2. CTSG. CX3CR1. CYBA. DACH1. DCN.DLX1. DNMT3A. DUSP10. DVL1. E2F3. EGF (includes EG:13645). EPHA7. EPHX1. EPM2A. FABP1. FANCC. FBXL5.FCER1A. FHL2. FOXP1. FSTL3. GFRA1. GNA12. GRIA2. GZMB. HFE. HSF2. HTATIP2. ICOS. IFNE. IGF2. IKBKE.IL17RD. INS. INSR. IPPK. ITGA6. ITGB3BP. ITPK1. IVNS1ABP. KIFAP3. KLF10. LAMA2. LIG4. LSP1 (includes EG:16985).LTF. LUM. MAGED1. MAGEH1. MAPK1. mir-154. mir-506. MITF. MLLT3. MNAT1. MTCH1. NELL1. NOG. NPTX2. NTS.PAEP. PAWR. PAX5. PDCD6IP. PEX11B. PKN2. POLH. POMC. PPARA. PPM1F. PPP2R4. PPP3CA. PRAME. PRKCH.PRKCQ. PRKCZ. PRL. PRNP. PRPF19. PRUNE2. PTGES. PTK2 (includes EG:14083). PUS10. RASSF5. RGS5. RNASEH1.SELL. SGCG. SLC25A6. SMARCA2. SMOX. SOD1. SPAG16. ST14. STAT1. TFAP4. TGFB2. THBS2. TIAM1. TMEM109.TMEM132A. TNFAIP8. TNFRSF18. TNFRSF4. TNFSF13B. TREX2. TRPS1. TSPO. TUB. TXNIP. UNC5C. VEGFA. VIM.VPS13A. WT1. WWOX. XPO1. ZMYM2
153
quantity ofleukocytes
0.000076 0.043 AGT. BCL2. BCL2L11. BID. BIK. BST1 (includes EG:12182). BTK. CARD11. CBLB. CCR5. CD200. CD247. CD36.CD38. CD40. CD5L. CRLF2. CX3CR1. CXCL13. DCN. DMD. DUSP10. FABP1. FANCC. FCER1A. FOXP1. GNA12.HESX1. ICOS. IGF2. INS. ITGA6. KDM5A. KIFAP3. KLF10. LAMA2. LIG4. LSP1 (includes EG:16985). LUM. NOG.PAWR. PAX5. PRKCQ. PRL. PRNP. PROK2. PTGDR2. PTGES. PTK2 (includes EG:14083). RASSF5. RGS1. RGS10.SELL. SOD1. ST14. STAM. STAT1. TNFRSF4. TNFSF13B. TOX. TXNIP. VAV3. VEGFA. VPREB1. WWOX
65
A total of 823 genes surrounding the 603 top associated SNPs were put into the IPA software.Surrounding genes were defined by either Grail (www.broadinstitute.org/mpg/grail/) or the Genome Browser (http://genome.ucsc.edu/). Gene families located in thesame region were manually curated so that only one gene in each family remained in each region, based on a similar official gene symbol.*Hochberg Y, Benjamini Y. Statistics in medicine 1990; 9:811–8.doi:10.1371/journal.pone.0070174.t005
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 11 August 2013 | Volume 8 | Issue 8 | e70174
Ta
ble
6.
Bio
log
ical
fun
ctio
ns
of
ge
ne
ssu
rro
un
din
gth
e6
03
top
asso
ciat
ed
SNP
s.R
esu
lts
fro
mG
en
eT
rail.
Ca
teg
ory
ran
kS
ub
cate
go
rye
xp
ect
ed
ob
serv
ed
p-v
alu
e(r
aw
)G
en
es
KEG
G1
Typ
eII
dia
be
tes
me
llitu
s1
.91
70
.00
26
AB
CC
8C
AC
NA
1A
INS
INSR
KC
NJ1
1M
AP
K1
PR
KC
Z
KEG
G2
Saliv
ary
secr
eti
on
3.6
29
0.0
03
AD
RA
1B
AD
RA
1D
AM
Y1
BA
TP
2B
3B
ST1
CA
LML6
CD
38
CST
2R
YR
3
KEG
G3
Pat
hw
ays
inca
nce
r1
3.3
52
30
.00
7A
PP
L1B
CL2
BID
BM
P2
CB
LBC
SF2
RA
CT
BP
2C
TN
NA
2D
VL1
E2F3
EGF
FGF2
2FH
ITG
A6
LAM
A2
MA
PK
1M
ITF
PT
K2
RA
SSF5
STA
T1
TC
EB1
TG
FB2
VEG
FA
KEG
G4
Tce
llre
cep
tor
sig
nal
ing
pat
hw
ay4
.40
10
0.0
12
CA
RD
11
CB
LBC
D2
47
ICO
SM
AP
K1
NC
K2
PD
K1
PP
P3
CA
PR
KC
QV
AV
3
KEG
G5
TG
F-b
eta
sig
nal
ing
pat
hw
ay3
.46
80
.02
2B
MP
2B
MP
R1
BD
CN
ID4
MA
PK
1N
OG
TG
FB2
TH
BS2
KEG
G6
Cyt
oki
ne
-cyt
oki
ne
rece
pto
rin
tera
ctio
n1
0.7
91
80
.02
2B
MP
2B
MP
R1
BC
CR
5C
D4
0C
RLF
2C
SF2
RA
CX
3C
R1
CX
CL1
3EG
FIF
NA
6IF
NE
IL3
RA
PR
LT
GFB
2T
NFR
SF1
8T
NFR
SF4
TN
FSF1
3B
VEG
FA
KEG
G7
Arr
hyt
hm
og
en
icri
gh
tve
ntr
icu
lar
card
iom
yop
ath
y3
.09
70
.03
4A
CT
N1
CT
NN
A2
DM
DIT
GA
10
ITG
A6
LAM
A2
SGC
G
Ge
ne
On
tolo
gy
1n
eg
ativ
ere
gu
lati
on
of
ph
osp
hat
ase
acti
vity
0.2
13
0.0
00
6P
PP
2R
4T
GFB
2T
IPR
L
Ge
ne
On
tolo
gy
2p
osi
tive
reg
ula
tio
no
fap
op
tosi
s1
3.9
32
70
.00
08
AG
TA
KA
P1
3A
RH
GEF
18
BC
L2B
CL2
L11
BC
L2L1
3B
IDB
IKB
MP
2B
TK
CD
38
HT
AT
IP2
IKB
KE
ITG
B3
BP
MA
GED
1M
AP
K1
MT
CH
1P
AW
RP
PP
2R
4P
RU
NE2
PV
RSO
D1
TFA
P4
TG
FB2
TIA
M1
VA
V3
WT
1
Ge
ne
On
tolo
gy
3re
gu
lati
on
of
ph
osp
hat
ase
acti
vity
0.5
34
0.0
01
5B
MP
2P
PP
2R
4T
GFB
2T
IPR
L
Ge
ne
On
tolo
gy
5g
lom
eru
lar
ep
ith
eliu
md
eve
lop
me
nt
0.0
82
0.0
01
7B
ASP
1W
T1
Ge
ne
On
tolo
gy
6ve
sicl
e1
2.5
02
40
.00
17
AP
PL1
BG
NC
D3
6C
TSG
CU
ZD
1C
XX
C4
CY
BA
DV
L1EG
FG
RIA
2H
FEH
PS4
LTF
NR
SN1
PA
LMR
ASS
F9SE
C2
4A
SOD
1SY
T1
SYT
2T
GFB
2T
HT
HB
S2V
EGFA
Ge
ne
On
tolo
gy
7ce
llula
rd
efe
nse
resp
on
se2
.38
80
.00
24
CC
R5
CD
30
0C
CD
5L
CX
3C
R1
DC
DC
2LS
P1
LY9
6N
CR
2
Ge
ne
On
tolo
gy
8cy
top
lasm
icve
sicl
e1
2.1
72
30
.00
26
BG
NC
D3
6C
TSG
CU
ZD
1C
XX
C4
CY
BA
DV
L1EG
FG
RIA
2H
FEH
PS4
LTF
NR
SN1
PA
LMR
ASS
F9SE
C2
4A
SOD
1SY
T1
SYT
2T
GFB
2T
HT
HB
S2V
EGFA
Ge
ne
On
tolo
gy
9p
ho
sph
oin
osi
tid
e3
-kin
ase
casc
ade
0.3
33
0.0
03
3A
GT
INS
TG
FB2
Ge
ne
On
tolo
gy
10
hin
db
rain
de
velo
pm
en
t0
.37
30
.00
48
CT
NN
A2
MY
O1
6SD
F4
Ge
ne
On
tolo
gy
11
reg
ula
tio
no
fn
eu
ron
alsy
nap
tic
pla
stic
ity
0.3
73
0.0
04
8N
ETO
1SH
ISA
9SY
NG
R1
Ge
ne
On
tolo
gy
12
ne
uro
np
roje
ctio
nm
em
bra
ne
0.1
22
0.0
04
9C
NT
NA
P2
SHIS
A9
Ge
ne
On
tolo
gy
13
do
pam
ine
bio
syn
the
tic
pro
cess
0.1
22
0.0
04
9T
GFB
2T
H
Ge
ne
On
tolo
gy
14
hyd
rog
en
pe
roxi
de
bio
syn
the
tic
pro
cess
0.1
22
0.0
04
9C
YB
ASO
D1
Ge
ne
On
tolo
gy
15
po
siti
vere
gu
lati
on
of
resp
irat
ory
bu
rst
0.1
22
0.0
04
9IN
SIN
SR
Ge
ne
On
tolo
gy
16
card
iac
ep
ith
elia
lto
me
sen
chym
altr
ansi
tio
n0
.12
20
.00
49
BM
P2
TG
FB2
Ge
ne
On
tolo
gy
17
en
zym
eac
tiva
tor
acti
vity
7.1
11
50
.00
51
AG
TA
PO
A5
AR
HG
AP
5B
CL2
L13
BM
P2
EGF
MM
P1
7O
PH
N1
PIT
RM
1P
PP
1R
12
BP
PP
2R
4R
GS1
RG
S5T
BC
1D
15
VA
V3
Celiac Disease Genome-Wide Linkage and Association
PLOS ONE | www.plosone.org 12 August 2013 | Volume 8 | Issue 8 | e70174
Ta
ble
6.
Co
nt.
Ca
teg
ory
ran
kS
ub
cate
go
rye
xp
ect
ed
ob
serv
ed
p-v
alu
e(r
aw
)G
en
es
Ge
ne
On
tolo
gy
18
ep
ide
rmal
gro
wth
fact
or
rece
pto
rsi
gn
alin
g0
.74
40
.00
54
AG
TEG
FN
CK
2SN
X6
Ge
ne
On
tolo
gy
19
ext
race
llula
rm
atri
x7
.23
15
0.0
06
0A
SPN
BG
NC
MA
1C
PX
M2
CT
SGD
CN
ECM
2LA
MA
2LU
MM
MP
23
BO
GN
SOD
1T
GFB
2U
SH2
AV
EGFA
NIA
hu
man
dis
eas
e1
Dia
be
tes
Me
llitu
s.T
ype
21
0.4
92
20
.00
03
*A
BC
C8
AG
TA
KA
P1
0A
PO
A5
BT
CC
CR
5C
D3
6C
MA
1C
YB
AFA
BP
1FT
OIN
SIN
SRK
CN
J11
MT
TP
PP
AR
AP
RK
CZ
SELL
TH
TH
BS2
TX
NIP
VEG
FA
NIA
hu
man
dis
eas
e2
Hyp
erl
ipo
pro
tein
em
ias
0.2
63
0.0
01
2A
PO
A5
FAB
P1
PP
AR
A
NIA
hu
man
dis
eas
e3
Dia
be
tic
An
gio
pat
hie
s1
.94
70
.00
24
CD
40
CY
BA
INS
KC
NJ1
1P
PA
RA
TX
NIP
VEG
FA
NIA
hu
man
dis
eas
e4
Po
stm
ort
em
Ch
ang
es
0.1
02
0.0
02
6D
AO
AT
PH
2
NIA
hu
man
dis
eas
e5
Dis
eas
eP
rog
ress
ion
7.7
71
60
.00
30
AG
TB
CL2
CC
R5
CD
40
CM
A1
CX
3C
R1
DC
NEG
FH
FEK
CN
J11
PP
AR
AP
RN
PSE
LLSO
D1
VEG
FAW
T1
NIA
hu
man
dis
eas
e6
Bir
thW
eig
ht
1.6
96
0.0
05
4EG
FEP
HX
1FT
OH
19
INS
TH
NIA
hu
man
dis
eas
e7
Pat
ho
log
ical
Co
nd
itio
ns.
Sig
ns
and
Sym
pto
ms
23
.66
34
0.0
07
3A
GT
AP
OA
5B
CL2
CC
R5
CD
40
CM
A1
CX
3C
R1
CY
BA
DA
OA
DC
ND
ISC
1D
MD
EGF
EPH
X1
FCER
1A
FTO
H1
9H
FEH
TR
2C
INS
INSR
KC
NJ1
1LT
FM
TT
PP
LXN
A2
PO
MC
PP
AR
AP
RN
PSE
LLSO
D1
TH
TP
H2
VEG
FAW
T1
NIA
hu
man
dis
eas
e8
Bro
nch
iolit
is.
Vir
al0
.15
20
.00
75
CC
R5
CX
3C
R1
NIA
hu
man
dis
eas
e9
Kid
ne
yFa
ilure
.A
cute
0.1
52
0.0
07
5C
YB
AW
T1
NIA
hu
man
dis
eas
e1
0D
ise
ase
sin
Tw
ins
0.4
63
0.0
08
6D
ISC
1H
FEP
LXN
A2
NIA
hu
man
dis
eas
e1
1C
oro
nar
yA
rte
ryD
ise
ase
4.5
51
00
.01
27
AG
TA
PO
A5
CD
36
CD
40
CM
A1
CX
3C
R1
CY
BA
PP
AR
AT
HB
S2V
EGFA
NIA
hu
man
dis
eas
e1
2D
ysle
xia
0.2
62
0.0
23
3D
YX
1C
1K
IAA
03
19
NIA
hu
man
dis
eas
e1
3M
yoca
rdia
lIn
farc
tio
n6
.64
12
0.0
28
2A
GT
AK
AP
10
AP
OA
5C
CR
5C
TSG
CX
3C
R1
HFE
INSR
MT
TP
TH
BS2
TN
FRSF
4V
EGFA
NIA
hu
man
dis
eas
e1
4N
utr
itio
nal
and
Me
tab
olic
Dis
eas
es
18
.45
26
0.0
29
5A
BC
C8
AG
TA
KA
P1
0A
PO
A5
BT
CC
BLB
CC
R5
CD
36
CM
A1
CY
BA
DC
NFA
BP
1FT
OH
TR
2C
INS
INSR
KC
NJ1
1M
TT
PP
OM
CP
PA
RA
PR
KC
ZSE
LLT
HT
HB
S2T
XN
IPV
EGFA
NIA
hu
man
dis
eas
e1
5O
verw
eig
ht
0.3
12
0.0
33
8A
PO
A5
FTO
Ato
tal
of
82
3g
en
es
surr
ou
nd
ing
the
60
3to
pas
soci
ate
dSN
Ps
we
rep
ut
into
the
Ge
ne
Tra
ilso
ftw
are
.Su
rro
un
din
gg
en
es
we
red
efi
ne
db
ye
ith
er
Gra
il(h
ttp
://w
ww
.bro
adin
stit
ute
.org
/mp
g/g
rail/
)o
rth
eG
en
om
eB
row
ser
(htt
p:/
/g
en
om
e.u
csc.
ed
u/)
.G
en
efa
mili
es
loca
ted
inth
esa
me
reg
ion
we
rem
anu
ally
cura
ted
soth
ato
nly
on
eg
en
ein
eac
hfa
mily
rem
ain
ed
ine
ach
reg
ion
,b
ase
do
na
sim
ilar
off
icia
lg
en
esy
mb
ol.
*Sig
nif
ican
taf
ter
mu
ltip
lete
stin
gco
rre
ctio
nu
sin
gFD
Rad
just
me
nt.
(pc
orr
-val
ue
=0
.03
2).
Size
of
test
set:
82
3(7
68
kno
wn
).N
um
be
ro
fkn
ow
nre
f.ID
s:4
48
29
Ke
gg
:N
um
be
ro
fan
no
tate
dg
en
es
inte
stse
tw
as2
20
.N
um
be
ro
fan
no
tate
dg
en
es
inre
fse
tw
as5
40
5.
Ge
ne
On
tolo
gy:
Nu
mb
er
of
ann
ota
ted
ge
ne
sin
test
set
was
47
6.
Nu
mb
er
of
ann
ota
ted
ge
ne
sin
ref
set
was
11
58
0.
NIA
hu
man
ge
ne
sse
ts:
Nu
mb
er
of
ann
ota
ted
ge
ne
sin
test
set
was
76
.N
um
be
ro
fan
no
tate
dg
en
es
inre
fse
tw