STABILITY INDICATORS IN (BIOLOGICAL) NETWORK INFERENCE
Giuseppe Jurman1, Michele Filosi1,2, Roberto Visintainer1, Samantha Riccadonna3, Cesare Furlanello1
1 Fondazione Bruno Kessler, Trento, Italy; 2 University of Trento, Italy; 3 Fondazione Edmund Mach, San Michele all'Adige, Italy
We propose a way to quantify the variability of network inference with respect to data perturbation and, in particular, data subsampling. We introduce a set of four indicators that allow the researcher to quantitatively evaluate the reliability of the inferred and non-inferred links. For a given ratio of removed data and a given number of resamplings, we quantitatively assess the mutual distances among all inferred networks and their distances to the network generated from the whole dataset. The rationale is that the smaller the average distance, the more stable the network. We also provide a ranked list of the most stable links and nodes, where the rank is induced by the variability of the link weight and of the node degree across the generated networks, the least variable being ranked at the top.
As the network distance we employ the HIM distance, which represents a good compromise between local (link-based) and global (structure-based) measures of network comparison. As a first testbed in a controlled situation, the four indicators are computed on a synthetic dataset for different instances of a correlation network built with different measures, highlighting the impact of a False Discovery Rate filter on the network reconstruction method. Finally, we show the use of the stability measures in comparing the relevance networks inferred on a miRNA microarray dataset with paired tissues extracted from a cohort of 241 hepatocellular carcinoma patients.
STABILITY INDICATORS
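[Diagram: a dataset D with s samples and p features is fed to the inference algorithm ALG, which returns the network N_D with nodes {x_1^D, ..., x_p^D} and link weights w_hk^D (h, k = 1, ..., p). From D, r subsampled datasets D_1, ..., D_r are drawn, each with n < s samples and p features, with $r \le \binom{s}{n}$; ALG is applied to each D_i to obtain the network N_{D_i}.]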
Stability of the entire network
$I_1(n, r) = \{\mathrm{HIM}(N_D, N_{D_i}) : i = 1, \dots, r\}$
Distances between the network constructed on the whole dataset and the networks inferred from the different subsampling replicates.
$I_2(n, r) = \{\mathrm{HIM}(N_{D_i}, N_{D_j}) : i, j = 1, \dots, r,\ i \ne j\}$
Mutual distances among the networks inferred from the different subsampling replicates.
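As an illustration, the two sets I1 and I2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the inference algorithm ALG is replaced by an absolute Pearson correlation network, the HIM distance is replaced by its Hamming component only, and the names `infer_network`, `hamming` and `i1_i2` are hypothetical. Since the distance is symmetric, I2 is computed on unordered pairs.

```python
from itertools import combinations

import numpy as np


def hamming(a, b):
    """Normalized Hamming distance: mean absolute off-diagonal difference."""
    n = a.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    return float(np.abs(a - b)[off_diag].sum() / (n * (n - 1)))


def infer_network(data):
    """Stand-in for ALG (assumption): absolute Pearson correlation of the p features."""
    return np.abs(np.corrcoef(data, rowvar=False))


def i1_i2(net_full, nets_sub, dist=hamming):
    """I1: distances full vs. replicate; I2: distances between replicate pairs."""
    i1 = np.array([dist(net_full, net) for net in nets_sub])
    i2 = np.array([dist(a, b) for a, b in combinations(nets_sub, 2)])
    return i1, i2


# Toy data: s samples, p features, r subsamples of n samples each.
rng = np.random.default_rng(0)
s, p, n, r = 40, 6, 30, 10
data = rng.normal(size=(s, p))
net_full = infer_network(data)
nets_sub = [
    infer_network(data[rng.choice(s, size=n, replace=False)]) for _ in range(r)
]
i1, i2 = i1_i2(net_full, nets_sub)
print("mean I1:", i1.mean(), "mean I2:", i2.mean())
```

Smaller averages of `i1` and `i2` correspond to a more stable reconstruction; any other network distance with the same signature can be passed as `dist`.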
Stability (reliability) of single nodes and links
$I_3(n, r) = \{a^{D_i}_{hk}\}$ for $i = 1, \dots, r$ and $h, k = 1, \dots, p$
$I_4(n, r) = \{\partial(x^{D_i}_h)\}$ for $i = 1, \dots, r$ and $h = 1, \dots, p$, where $\partial$ is the degree function
Variability of the node degree and of the link weight of the networks inferred from the different subsampling replicates.
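The link and node rankings induced by I3 and I4 can be sketched as below. The poster does not specify the variability measure, so the standard deviation across replicates is an assumption here, as is the use of the weighted degree (sum of the link weights of a node) for the degree function; the function name `link_and_degree_stability` is hypothetical.

```python
import numpy as np


def link_and_degree_stability(nets):
    """Collect I3 (link weights) and I4 (node degrees) and rank by variability.

    nets: sequence of r symmetric p x p weighted adjacency matrices.
    Variability is measured by the standard deviation (an assumption).
    """
    nets = np.asarray(nets, dtype=float)            # shape (r, p, p)
    p = nets.shape[1]
    rows, cols = np.triu_indices(p, k=1)            # each undirected link once
    weights = nets[:, rows, cols]                   # I3, shape (r, p(p-1)/2)
    degrees = (nets * (1 - np.eye(p))).sum(axis=2)  # I4, shape (r, p)
    link_order = np.argsort(weights.std(axis=0), kind="stable")
    node_order = np.argsort(degrees.std(axis=0), kind="stable")
    links = [(int(rows[k]), int(cols[k])) for k in link_order]
    nodes = [int(h) for h in node_order]
    return weights, degrees, links, nodes


def sym(w01, w02, w12):
    """Helper for the toy example: symmetric 3-node weighted network."""
    return np.array([[0, w01, w02], [w01, 0, w12], [w02, w12, 0]], dtype=float)


# Three replicates: link (0, 1) never changes, link (1, 2) changes the most.
replicates = [sym(0.9, 0.5, 0.1), sym(0.9, 0.6, 0.5), sym(0.9, 0.4, 0.9)]
weights, degrees, links, nodes = link_and_degree_stability(replicates)
print("links, most stable first:", links)
print("nodes, most stable first:", nodes)
```

In the toy example the constant link (0, 1) comes first in the ranking, in line with the rule that the least variable element is ranked at the top.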
RESAMPLING SCHEMA:
- LOO (leave-one-out stability): $n = s - 1$, $r = 1$
- 20 × k-fold cross validation for k = 2, 4, 10 (k2, k4 and k10): $n = \lfloor s(k-1)/k \rfloor$ and $r = 20k$.
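The k-fold part of the resampling schema can be sketched with the standard library alone. This is an illustrative sketch, not the authors' code: the function name `kfold_subsamples`, the seed, and the way folds are cut are assumptions. When k does not divide s, the replicates have either n or n + 1 samples, with n the floor value given above.

```python
import random


def kfold_subsamples(s, k, repeats=20, seed=0):
    """Return repeats * k index lists, each leaving out one fold of s samples."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(repeats):
        idx = list(range(s))
        rng.shuffle(idx)
        # Cut the shuffled indices into k folds of nearly equal size.
        folds = [idx[i::k] for i in range(k)]
        for held_out in folds:
            removed = set(held_out)
            replicates.append(sorted(i for i in idx if i not in removed))
    return replicates


s, k = 30, 4
n = s * (k - 1) // k              # floor(s (k - 1) / k)
replicates = kfold_subsamples(s, k)
sizes = sorted({len(rep) for rep in replicates})
print("n =", n, "r =", len(replicates), "replicate sizes:", sizes)
```

Each index list selects the rows of the dataset used to infer one replicate network N_{D_i}.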
HIM NETWORK DISTANCE
HIM metric: product metric of
- Hamming: edit distance; focuses only on the local presence/absence of matching links.
- Ipsen-Mikhailov: spectral distance; evaluates the global structure of the topologies.
[Figure: the points P(A,B), P(A,F) and P(A,E) in the Hamming (H) / Ipsen-Mikhailov (IM) plane, with axis ticks from 0.2 to 0.8.]
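$$\mathrm{HIM}(G,H) = \frac{1}{\sqrt{2}} \sqrt{H(G,H)^2 + \mathrm{IM}(G,H)^2},$$
$$H(G,H) = \frac{1}{N(N-1)} \sum_{1 \le i \ne j \le N} \left| A^{(1)}_{ij} - A^{(2)}_{ij} \right|,$$
$$\mathrm{IM}(G,H) = \epsilon_\gamma(G,H) = \sqrt{\int_0^\infty \left[ \rho_G(\omega, \gamma) - \rho_H(\omega, \gamma) \right]^2 d\omega}.$$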
Representation of the HIM distance in the Ipsen-Mikhailov and Hamming distance space between network A and networks B, F and E, where F is the fully connected network and E is the empty one. P(G,H) represents the distance between two networks G and H, with coordinates x = H(G,H) and y = IM(G,H); the norm of P is √2 times the HIM distance HIM(G,H).
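- An N-node network is modelled as a system of N atoms connected by identical elastic strings,
  $$\ddot{x}_i + \sum_{j=1}^{N} A_{ij} (x_i - x_j) = 0 \quad \text{for } i = 0, \dots, N-1.$$
- The vibrational frequencies $\omega_i$ satisfy $\lambda_i = \omega_i^2$, where the $\lambda_i$ are the eigenvalues of the Laplacian of the network.
- $\rho(\omega) = K \sum_{i=1}^{N-1} \frac{\gamma}{(\omega - \omega_i)^2 + \gamma^2}$ is the spectral density, a sum of Lorentz distributions.
- $\gamma$ is the common width of the Lorentz distributions (half-width at half-maximum, HWHM, equal to half the interquartile range), fixed by the condition $\epsilon_\gamma(E, F) = 1$.
- $K$ is the normalization constant, solution of $\int_0^\infty \rho(\omega)\, d\omega = 1$.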
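A minimal numerical sketch of the formulas for H, IM and HIM follows. It assumes NumPy, undirected networks given as symmetric adjacency matrices, trapezoidal integration on a truncated grid in place of the exact integral, and a bisection search for the γ satisfying ε_γ(E, F) = 1. The function names, the grid truncation and the bisection bracket are choices made for this illustration, not taken from the poster.

```python
import numpy as np


def hamming(a, b):
    """Normalized Hamming distance between two N x N adjacency matrices."""
    n = a.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    return float(np.abs(a - b)[off_diag].sum() / (n * (n - 1)))


def laplacian_frequencies(a):
    """omega_i = sqrt(lambda_i), i = 1..N-1, dropping the smallest eigenvalue."""
    lap = np.diag(a.sum(axis=1)) - a
    lam = np.linalg.eigvalsh(lap)              # ascending order
    return np.sqrt(np.clip(lam[1:], 0.0, None))


def spectral_density(omegas, gamma, grid):
    """Sum of Lorentz distributions, normalized on [0, infinity)."""
    dens = np.zeros_like(grid)
    for w in omegas:
        dens += gamma / ((grid - w) ** 2 + gamma ** 2)
    # Closed form of the integral of each Lorentzian over [0, infinity).
    k = 1.0 / np.sum(np.pi / 2 + np.arctan(omegas / gamma))
    return k * dens


def ipsen_mikhailov(a, b, gamma):
    """Approximate epsilon_gamma by trapezoidal integration on a finite grid."""
    wa, wb = laplacian_frequencies(a), laplacian_frequencies(b)
    upper = max(wa.max(), wb.max()) + 50.0     # truncation of the integral
    step = min(gamma / 20.0, 0.01)
    grid = np.arange(0.0, upper, step)
    diff2 = (spectral_density(wa, gamma, grid)
             - spectral_density(wb, gamma, grid)) ** 2
    return float(np.sqrt(np.sum(diff2[1:] + diff2[:-1]) * step / 2.0))


def calibrate_gamma(n, lo=0.01, hi=2.0, iters=40):
    """Bisection for the gamma giving IM(empty, full) = 1 on n nodes."""
    empty = np.zeros((n, n))
    full = np.ones((n, n)) - np.eye(n)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if ipsen_mikhailov(empty, full, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def him(a, b, gamma):
    """HIM distance: normalized Euclidean combination of H and IM."""
    return float(np.sqrt(hamming(a, b) ** 2
                         + ipsen_mikhailov(a, b, gamma) ** 2) / np.sqrt(2))


n_nodes = 5
gamma = calibrate_gamma(n_nodes)
empty = np.zeros((n_nodes, n_nodes))
full = np.ones((n_nodes, n_nodes)) - np.eye(n_nodes)
ring = np.roll(np.eye(n_nodes), 1, axis=1) + np.roll(np.eye(n_nodes), -1, axis=1)
print("HIM(E, F):", him(empty, full, gamma))
print("HIM(ring, F):", him(ring, full, gamma))
```

With γ calibrated in this way, both components equal 1 for the pair (E, F), so the HIM distance between the empty and the fully connected network is 1.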
[Figure: an example network G, the spectrum spec(L_G) of its Laplacian, and the corresponding spectral density ρ_G(γ, ω).]
REFERENCES
Baralla et al. Inferring Gene Networks: Dream or Nightmare? Annals of the New York Academy of Sciences, 2009.
Budhu et al. Identification of Metastasis-Related MicroRNAs in Hepatocellular Carcinoma. Hepatology, 2008.
Faith et al. Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles. PLoS Biology, 2007.
Gillis and Pavlidis. The role of indirect connections in gene networks in predicting function. Bioinformatics, 2011.
Ipsen and Mikhailov. Evolutionary reconstruction of networks. Physical Review E, 2002.
Jurman et al. Stability Indicators in Network Reconstruction. arXiv, 2012.
Jurman et al. A glocal distance for network comparison. arXiv, 2012.
Meyer et al. Verification of systems biology research in the age of collaborative competition. Nature Biotechnology, 2011.
Miller et al. Identifying Biological Network Structure, Predicting Network Behavior, and Classifying Network State With High Dimensional Model Representation (HDMR). PLoS ONE, 2012.
Prill et al. Towards a Rigorous Assessment of Systems Biology Models: The DREAM3 Challenges. PLoS ONE, 2010.
Reshef et al. Detecting novel associations in large datasets. Science, 2011.
Volinia et al. Reprogramming of miRNA networks in cancer and leukemia. Genome Research, 2010.
FDR EFFECT ON CORRELATION NETWORKS
Goal: to assess the different levels of stability of a correlation network inferred from a set of synthetic high-throughput signals, when the inference is computed with or without False Discovery Rate control.
SYNTHETIC BENCHMARK.
[Figure: correlation heatmap of the 20 synthetic features f1, ..., f20, annotated with Corr(fi, fj) ≈ 0.9, 0.7, 0.4; colour scale from −0.2 to 1.0.]
- WGCNA [Langfelder et al., 2008]
- MIC [Reshef et al., 2011]
- WGCNA with FDR correction
$\mathrm{Adj} = \{a_{hk}\}$, where
$$a_{hk} = \begin{cases} |\mathrm{Corr}(x_h, x_k)| & \text{if } |F_z(h, k)| \ge 1 \\ 0 & \text{otherwise} \end{cases}$$
[Figure: I1 vs. I2 plot for the synthetic benchmark, with points for the LOO, k2, k4 and k10 resampling schemas and one series per method (MINE, WGCNA, WGCNA FDR 1e−2, WGCNA FDR 5e−3, WGCNA FDR 1e−4).]
MIRNA NETWORK ON A HEPATOCELLULAR CARCINOMA DATASET
- 482 tissue samples from 241 patients [Budhu et al., 2008, Volinia et al., 2010].
- For each patient, a sample from cancerous hepatic tissue and a sample from the surrounding non-cancerous hepatic tissue.
- Ohio State University CCC MicroRNA Microarray 2.0: 11520 probes, 250 non-redundant human and 200 mouse miRNA.
- After preprocessing, the dataset HCC of 240+240 paired samples (210 ♂ + 30 ♀) described by 210 human miRNA is analyzed.
- HCC is partitioned into four subsets combining the sex and disease status phenotypes (MT, FT, MnT, FnT).
INFERENCE ALGORITHMS COMPARISON
- I1 captures the robustness of the algorithms to subsampling.
- I2 tends to express the homogeneity of the dataset.
- The bigger the sample size used for reconstruction, the more stable the result.
[Figure: I1 vs. I2 plot comparing the inference algorithms ARACNE, CLR, TOM and WGCNA under the LOO, k2, k4 and k10 resampling schemas.]
[Figure: the networks inferred on the four HCC subsets MT, FT, MnT and FnT, with nodes labelled by miRNA probe identifiers (e.g. let.7a.1.prec, 021.prec.17No1, 321No1).]
BIOLOGICAL RESULTS
HIM    MnT      FT       FnT
MT     0.0412   0.0858   0.0235
MnT             0.1265   0.0618
FT                       0.0684
[Figure: two-dimensional plot (Coordinate 1 vs. Coordinate 2) of the four networks MT, MnT, FT and FnT.]
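The two-dimensional plot of the four networks looks like a low-dimensional embedding of the HIM distance table. The poster does not name the method used, so the classical multidimensional scaling below is an assumption, and the function name `classical_mds` is hypothetical. The distance values are those of the HIM table.

```python
import numpy as np

labels = ["MT", "MnT", "FT", "FnT"]
# Symmetric matrix built from the HIM distance table.
dist = np.array([
    [0.0,    0.0412, 0.0858, 0.0235],
    [0.0412, 0.0,    0.1265, 0.0618],
    [0.0858, 0.1265, 0.0,    0.0684],
    [0.0235, 0.0618, 0.0684, 0.0],
])


def classical_mds(d, dim=2):
    """Classical (Torgerson) MDS: coordinates from a distance matrix."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering   # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(gram)
    top = np.argsort(vals)[::-1][:dim]               # largest eigenvalues first
    scale = np.sqrt(np.clip(vals[top], 0.0, None))
    return vecs[:, top] * scale


coords = classical_mds(dist)
for name, (c1, c2) in zip(labels, coords):
    print(f"{name}: coordinate 1 = {c1:+.4f}, coordinate 2 = {c2:+.4f}")
```

The sign of each coordinate axis is arbitrary in this kind of embedding, so only the relative positions of the four networks are meaningful.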
Statistics on I4
id    hsa-mir idx1      hsa-mir idx2    MT     MnT     FT      FnT
(a)   321No1            321No2          1      1       9       2
(b)   016b.chr3         16.2No1         3      12      15      309
(c)   021.prec.17No1    21No1           27     5       2       921
(d)   219.1No1          321No2          2      6       1903    314
(e)   326No1            342No2          132    1017    3       -
(f)   192.2.3No1        215.precNo1     4      300     4       3340
(a) is top ranking in all four cases, as expected (hsa-mir 321No1 and hsa-mir 321No2 denote essentially the same miRNA).
(b) and (c) behave as (a), but with less stability in the FnT network due to noise.
(d) has different stability between the male and the female networks (the link is probably associated to sex rather than to HCC).
(e) is very stable for FT, while it is not even picked up as a link by CLR in the FnT network.
(f) is a very well known cancer-associated link, as confirmed by its high stability in MT and FT.
[Figure: I1 vs. I2 plot for CLR inferred networks in the 4 subgroups (FT, FnT, MT, MnT), with points for the k10, k2, k4 and LOO resampling schemas.]
- M much more stable than F
- MT stability similar to MnT
- F LOO worse than M 4/10-fold
- FnT much worse than FT
Authors acknowledge funding by the European Union FP7 Project HiperDART
NetSci 2013 - International School and Conference on Network Science http://mpba.fbk.eu {jurman, filosi, visintainer, furlan}@fbk.eu, [email protected]