Selection of the Regularization Parameter in Graphical Models using Network Characteristics

Natalia Bochkina

University of Edinburgh, Maxwell Institute and the Alan Turing Institute

Joint work with Adria Caballe Mestres (University of Edinburgh and BioSS) and Claus Mayer (BioMathematics and Statistics Scotland)

27 July 2016

Natalia Bochkina (University of Edinburgh) 27 July 2016 1 / 35

Outline

1 Sparse High Dimensional Gaussian Graphical Models

2 Network-based estimation of hyperparameter

3 Simulated data

4 Tumour gene expression data

5 Summary


Sparse High Dimensional Gaussian Graphical Models Model

Gaussian graphical models

Suppose we observe n replicates of p variables:

$Y_i = (Y_{1,i}, \ldots, Y_{p,i}) \sim N(\mu, \Omega^{-1})$ independently for $i = 1, \ldots, n$,

where Ω is the p × p precision matrix and µ is the vector of means (assume µ = 0).

The matrix Ω represents the conditional dependence structure among the variables, with zero entries representing conditional independence.

Aim: estimate the underlying graph of the conditional dependence structure determined by Ω.

Applications: networks in genetics and genomics, financial models . . .


Sparse High Dimensional Gaussian Graphical Models Estimation of precision matrix

Gaussian graphical models, p is large

For p small compared to n, the maximum likelihood estimate (MLE) is

$$\hat{\Omega}_{ML} = \arg\max_{\Omega \succ 0} \, [\log\det\Omega - \mathrm{tr}(S\Omega)], \qquad (1)$$

where S is the sample covariance matrix: $S = n^{-1} \sum_{i=1}^{n} Y_i Y_i^T$.

Problem: if p is so large that S is not of full rank, then $\hat{\Omega}_{ML}$ is not unique.

An additional assumption on Ω is sparsity.

The penalised maximum likelihood estimator (with a convex penalty) is

$$\hat{\Omega}_{pML} = \arg\max_{\Omega \succ 0} \, [\log\det\Omega - \mathrm{tr}(S\Omega) - \lambda \|\Omega\|_1], \qquad (2)$$

where $\|\Omega\|_1 = \sum_{i,j=1}^{p} |\Omega_{ij}|$ is the elementwise ℓ1 norm of the matrix Ω.

For λ large enough, the estimator $\hat{\Omega}_{pML}$ is sparse.
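To make the penalised objective concrete, the following minimal sketch evaluates log det Ω − tr(SΩ) − λ||Ω||1 for a given candidate Ω. It only evaluates the criterion and does not solve the optimisation problem. The function names and the pure-Python Cholesky routine are illustrative assumptions, not part of the talk or of any package.

```python
import math

def cholesky_logdet(M):
    """Return log det(M) for a symmetric positive definite matrix (list of lists)."""
    p = len(M)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    # det(M) = prod(L_ii)^2, so log det(M) = 2 * sum(log L_ii)
    return 2.0 * sum(math.log(L[i][i]) for i in range(p))

def penalised_objective(omega, S, lam):
    """Evaluate log det(Omega) - tr(S Omega) - lam * sum_ij |Omega_ij|."""
    p = len(omega)
    trace = sum(S[i][j] * omega[j][i] for i in range(p) for j in range(p))
    l1 = sum(abs(omega[i][j]) for i in range(p) for j in range(p))
    return cholesky_logdet(omega) - trace - lam * l1
```

Setting `lam=0` gives the unpenalised criterion in (1); increasing `lam` lowers the objective of any Ω with many non-zero entries, which is what pushes the maximiser towards sparsity.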


Sparse High Dimensional Gaussian Graphical Models Algorithms and consistency

Methods for estimating hyperparameter λ

Additional penalty term for λ:

$$\hat{\Omega}_{ppML} = \arg\max_{\Omega \succ 0,\ \lambda > 0} \, [L(\Omega) - \lambda \|\Omega\|_1 - \mathrm{pen}(\lambda)]$$

Methods such as AIC and BIC are suboptimal when p is large.

Bayesian model with a slab-and-spike prior for elements of Ω: computationally intensive for large p (≥ 10^3).

Two-step procedure:

Step 1: Use the pMLE estimator $\hat{\Omega}_{pML} = \hat{\Omega}_{pML}(\lambda)$ for given λ.
Step 2: Choose λ to minimise $R(\lambda, \hat{\Omega}_{pML}(\lambda))$, e.g. cross-validation for Ω. Overfits when p is large (Liu et al., 2011).

Stability selection by Meinshausen and Bühlmann (2010): controls FDR.

StARS (Stability Approach to Regularization Selection) by Liu et al. (2011): additional tuning parameter; can lead to overfitting in certain graph topologies.

. . .


Sparse High Dimensional Gaussian Graphical Models Algorithms and consistency

Instability of estimated graph structure

Small variation in the penalty (||Ω||1) can lead to a significant change in the estimated graph structure.

[Figure: estimated graphs for λ = 0.83 and λ = 0.85, and the ℓ1 norm (y-axis, 0 to 1200) plotted against λ (x-axis, 0.70 to 0.95).]


Network-based estimation of hyperparameter

Estimation of the hyperparameter in sparse graphical models using network characteristics


Network-based estimation of hyperparameter Network approach

Network-based approach to estimating λ

We propose to estimate λ using network characteristics of underlying graph.

Notation: graph G(V, E) with nodes V, edges E and adjacency matrix A.
In graphical models: $A_{ij} = I(\Omega_{ij} \neq 0)$ for $i \neq j$, and $A_{ii} = 0$.

Network-based estimation of λ

Given λ, estimate Ω by the penalised MLE $\hat{\Omega}_\lambda$:

$$\hat{\Omega}_\lambda = \arg\max_{\Omega \succ 0} \, [\log\det\Omega - \mathrm{tr}(S\Omega) - \lambda \|\Omega\|_1]$$

Choose $\hat{\lambda} = \arg\min_\lambda R(\lambda, \hat{A}_\lambda)$, where $\hat{A}_\lambda$ is the adjacency matrix of the conditional dependence graph of $\hat{\Omega}_\lambda$.

The loss function for estimating λ depends only on the adjacency matrix of the underlying conditional dependence graph.

Main a priori assumption: presence of weakly connected clusters.
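The mapping from a precision matrix to its adjacency matrix, Aij = I(Ωij ≠ 0) for i ≠ j and Aii = 0, can be sketched as follows. The tolerance `tol` is an assumption added here to treat numerically tiny entries as zero; it does not appear in the talk.

```python
def adjacency_from_precision(omega, tol=1e-10):
    """Return A with A_ij = 1 if |Omega_ij| > tol and i != j, else 0."""
    p = len(omega)
    return [[1 if i != j and abs(omega[i][j]) > tol else 0 for j in range(p)]
            for i in range(p)]
```

Because the risk depends on the estimate only through this matrix, two precision estimates with the same sparsity pattern give the same risk value.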


Network-based estimation of hyperparameter Network approach

Network characteristics

Correlation coefficient between nodes $V_i, V_j \in G(V, E)$:

$$\sigma_{ij} = \frac{|\mathrm{nei}(V_i) \cap \mathrm{nei}(V_j)|}{\sqrt{|\mathrm{nei}(V_i)| \, |\mathrm{nei}(V_j)|}},$$

where nei(V_i) is the set of neighbours of node V_i (Estrada, 2011).
Corresponding dissimilarity measure: $\delta_{ij} = 1 - \sigma_{ij}$.

Mean Geodesic Distance: a measure of connectivity between nodes,

$$H(\lambda) = \frac{1}{p(p-1)} \sum_{i<j} d_{ij} \, I(d_{ij} < \infty),$$

where $d_{ij}$ is the length of the shortest path between nodes i and j (Costa and Rodrigues, 2007).

. . .
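Both characteristics can be computed directly from an adjacency matrix. The sketch below uses breadth-first search for the shortest path lengths and sums only over connected pairs, as in the indicator I(dij < ∞). Returning 0 for the correlation when a node has no neighbours is an assumption made here to avoid dividing by zero; all function names are illustrative.

```python
import math
from collections import deque

def neighbours(A, i):
    """Set of nodes adjacent to node i."""
    return {j for j, a in enumerate(A[i]) if a}

def graph_correlation(A, i, j):
    """|nei(i) & nei(j)| / sqrt(|nei(i)| |nei(j)|); 0.0 if either set is empty (assumption)."""
    ni, nj = neighbours(A, i), neighbours(A, j)
    if not ni or not nj:
        return 0.0
    return len(ni & nj) / math.sqrt(len(ni) * len(nj))

def shortest_path_lengths(A, source):
    """Breadth-first search: finite distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbours(A, u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_geodesic_distance(A):
    """Sum of finite d_ij over pairs i < j, divided by p (p - 1)."""
    p = len(A)
    total = 0
    for i in range(p):
        dist = shortest_path_lengths(A, i)
        total += sum(d for j, d in dist.items() if j > i)
    return total / (p * (p - 1))
```

For a path 0-1-2 with a fourth isolated node, nodes 0 and 2 share their only neighbour, so their correlation is 1, while the isolated node contributes nothing to the distance sum.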


Network-based estimation of hyperparameter Novel approaches

General algorithm

Fix a sequence (grid) of values of λ: (λ_1, . . . , λ_N).

For each λ_ℓ, estimate Ω by penalised MLE, and hence the adjacency matrix A_ℓ of the corresponding graph.
Choose $\lambda_{\ell^\star}$ with $\ell^\star = \arg\min_\ell R(\lambda_\ell, A_\ell)$.

Can be interpreted as a point estimator of a modularised Bayesian model.

We propose two risk functions R(λ, A):

Path Connectivity: uses the λ corresponding to the biggest structural change in the complexity of the graph. Complexity of the graph is measured by the Mean Geodesic Distance.

Augmented MSE: mimics a cross-validation approach with the loss depending on the adjacency matrix of the graph.
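The grid search can be sketched generically, with the estimator and the risk passed in as functions. In the sketch below `threshold_adjacency` is a deliberately simple stand-in (thresholding the entries of S) and is not the penalised MLE; in practice a graphical lasso solver would take its place. All names are hypothetical.

```python
def threshold_adjacency(S, lam):
    """Toy stand-in for the penalised MLE: keep edge (i, j) when |S_ij| > lam."""
    p = len(S)
    return [[1 if i != j and abs(S[i][j]) > lam else 0 for j in range(p)]
            for i in range(p)]

def select_lambda(grid, fit_adjacency, risk):
    """Return (lambda, risk value, adjacency) minimising the risk over the grid."""
    best = None
    for lam in grid:
        A = fit_adjacency(lam)      # adjacency matrix of the graph estimated at lam
        r = risk(lam, A)            # risk depends on the estimate only through A
        if best is None or r < best[1]:
            best = (lam, r, A)
    return best
```

Keeping `fit_adjacency` and `risk` as arguments mirrors the structure of the algorithm: the same loop serves any risk that takes the adjacency matrix as its input.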


Network-based estimation of hyperparameter Path Connectivity

Path Connectivity

Consider the Mean Geodesic Distance

$$H(\lambda) = \frac{1}{p(p-1)} \sum_{i<j} d_{ij} \, I(d_{ij} < \infty),$$

where $d_{ij}$ is the length of the shortest path between nodes V_i and V_j.
Choose λ at the largest change in graph structure, as measured by H(λ).

[Figure: connectivity ("conn", y-axis, 0 to 30000) plotted against λ (x-axis, 0.3 to 0.6).]

Use finite differences with bandwidth h: H(λ + h) − H(λ).


Network-based estimation of hyperparameter Path Connectivity

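Path Connectivity: motivation

Use the normalised difference between mean geodesic distances:

$$R(\lambda_k, A) = \frac{H(\lambda_k + h) - H(\lambda_k)}{k^{-1} \sum_{j=1}^{k} [H(\lambda_j + h) - H(\lambda_j)]}, \qquad \lambda_k = \lambda_0 + (k-1)h$$

[Figure: estimated graphs at (a) the optimal $\lambda_{k^\star}$, with FP = 46 and TP = 116, and (b) $\lambda_{k^\star - 1}$, with FP = 54 and TP = 120.]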
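A minimal sketch of the normalised finite differences follows, assuming H has already been evaluated on an equally spaced grid so that H(λ_k + h) is simply the next grid value. Returning 0 when the running mean is zero is an assumption added here to keep the ratio defined, and the function name is hypothetical.

```python
def path_connectivity_scores(H):
    """H[k] is the mean geodesic distance at the k-th point of an equally spaced grid.

    Returns one score per grid step: the finite difference at that step divided
    by the running mean of all finite differences up to and including it.
    """
    diffs = [H[k + 1] - H[k] for k in range(len(H) - 1)]
    scores = []
    running = 0.0
    for k, d in enumerate(diffs, start=1):
        running += d
        mean_so_far = running / k
        scores.append(d / mean_so_far if mean_so_far != 0 else 0.0)
    return scores
```

For example, with `H = [10.0, 9.0, 8.0, 3.0, 2.5]` the third step is the only one whose change is large relative to the changes before it, and it receives the largest score. Whether the selected λ is read off as the arg max of this ratio or through a sign convention that turns it into an arg min is not settled by the formula alone.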


Network-based estimation of hyperparameter Path Connectivity

Path Connectivity: estimators of λ

[Figure: density (y-axis, 0 to 25) of the estimated λ (x-axis, 0.20 to 0.40) for clustered and non-clustered graphs.]


Network-based estimation of hyperparameter Augmented MSE

Augmented MSE

Ideally, we would like to use a cross-validation approach over some characteristic of the conditional dependence graph, one that gives an unbiased estimator of the corresponding oracle risk.
E.g. the MSE of estimating a characteristic (d_ij):

$$R(\lambda) = E \sum_{i<j} (d_{ij} - \hat{d}_{ij}(\lambda))^2,$$

where $\hat{d}_{ij}(\lambda)$ are based on the GLasso estimator $\hat{\Omega}(\lambda)$, with the corresponding oracle

$$\lambda_{oracle} = \arg\min_\lambda R(\lambda).$$

However, we do not observe the conditional dependence graph, i.e. we do not have unbiased estimators of d_ij.

A priori information

The network contains clusters (possibly overlapping) ⇒ use an algorithm that estimates global characteristics well (number of clusters, degrees, ...) to produce an original “estimate”.


Network-based estimation of hyperparameter Augmented MSE

Augmented MSE of graph correlations

Network characteristic: graph correlations,

$$\rho_{ij} = \frac{|\mathrm{nei}(V_i) \cap \mathrm{nei}(V_j)|}{\sqrt{|\mathrm{nei}(V_i)| \, |\mathrm{nei}(V_j)|}}$$

Original "estimate": output of a clustering algorithm.
AGNES (Kaufman and Rousseeuw, 2009): estimates well global characteristics such as the average degree of the graph, eigenvalues of A, etc.

A-MSE estimator of λ

Given $\hat{\Omega}_\lambda$ from GLasso and its adjacency matrix $\hat{A}_\lambda$, choose

$$\hat{\lambda}_{AMSE} = \arg\min_\lambda R(\lambda, \hat{A}_\lambda) = \arg\min_\lambda \bar{E}\Big(\sum_{i>j} |\tilde{\rho}_{ij} - \hat{\rho}^{\lambda}_{ij}|^q\Big), \quad q \ge 1,$$

where $\bar{E}$ is the average over subsamples and $(\tilde{\rho}_{ij})$ are the graph correlations in the “original graph estimate”.
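The A-MSE risk for one value of λ can be sketched as below. The sketch assumes that each subsample supplies a pair of adjacency matrices, one from the original (clustering-based) estimate and one from GLasso at the given λ; how the subsamples and the original estimate are actually paired is not specified on the slide, so this pairing, the zero value for nodes without neighbours, and the function names are all assumptions.

```python
import math

def graph_correlations(A):
    """Return {(i, j): rho_ij} for all pairs i > j of the graph with adjacency A."""
    p = len(A)
    nei = [{k for k in range(p) if A[i][k]} for i in range(p)]
    rho = {}
    for i in range(p):
        for j in range(i):
            denom = math.sqrt(len(nei[i]) * len(nei[j]))
            # Assumption: correlation is 0 when either node has no neighbours.
            rho[(i, j)] = len(nei[i] & nei[j]) / denom if denom > 0 else 0.0
    return rho

def amse_risk(reference_graphs, lambda_graphs, q=2):
    """Average over subsamples of sum_{i>j} |rho_reference - rho_lambda|^q."""
    total = 0.0
    for A_ref, A_lam in zip(reference_graphs, lambda_graphs):
        r_ref, r_lam = graph_correlations(A_ref), graph_correlations(A_lam)
        total += sum(abs(r_ref[key] - r_lam[key]) ** q for key in r_ref)
    return total / len(reference_graphs)
```

Evaluating `amse_risk` at each grid value of λ and taking the minimiser gives the A-MSE choice under these assumptions.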


Network-based estimation of hyperparameter Augmented MSE

Augmented MSE of graph order 2 connectivity

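Network characteristic: graph order 2 connectivity,

$$\delta_{ij} = I(|\mathrm{nei}(V_i) \cap \mathrm{nei}(V_j)| > 0) = I(\rho_{ij} \neq 0),$$

i.e. the indicator of whether nodes i and j are connected or share a connection.
Original "estimate": clustering algorithm (AGNES).

Risk:

$$R(\lambda, \hat{A}_\lambda) = E \sum_{i>j} (\delta_{ij} - \hat{\delta}^{\lambda}_{ij})^2 = C - E(TP(\lambda) - FP(\lambda)),$$

where $C = \sum_{i<j} \delta_{ij}$ does not depend on λ, since every mismatch is either a false positive or a missed true positive. Minimising the risk therefore maximises E(TP(λ) − FP(λ)), also known as the Youden index, where

$$FP(\lambda) = \sum_{i<j} I[\delta_{ij} = 0, \hat{\delta}_{ij}(\lambda) = 1], \qquad TP(\lambda) = \sum_{i<j} I[\delta_{ij} = 1, \hat{\delta}_{ij}(\lambda) = 1].$$

Similarly, λ can be estimated using this risk with $\delta_{ij}$ replaced by $\tilde{\delta}_{ij}$ from the original graph estimate.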
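The link between the squared loss and the TP and FP counts can be checked numerically. Since the indicators are binary, the squared loss counts mismatches, and each mismatch is either a false positive or a missed positive, so the loss equals the number of reference positives plus FP minus TP. The sketch below computes all four quantities; the function names are illustrative.

```python
def order2_connectivity(A):
    """Return {(i, j): 1 if nodes i and j share at least one neighbour else 0} for i > j."""
    p = len(A)
    nei = [{k for k in range(p) if A[i][k]} for i in range(p)]
    return {(i, j): int(len(nei[i] & nei[j]) > 0)
            for i in range(p) for j in range(i)}

def tp_fp_loss(delta_ref, delta_hat):
    """Return (TP, FP, squared loss, number of reference positives)."""
    tp = sum(1 for k in delta_ref if delta_ref[k] == 1 and delta_hat[k] == 1)
    fp = sum(1 for k in delta_ref if delta_ref[k] == 0 and delta_hat[k] == 1)
    loss = sum((delta_ref[k] - delta_hat[k]) ** 2 for k in delta_ref)
    positives = sum(delta_ref.values())
    return tp, fp, loss, positives
```

For any pair of graphs on the same nodes, `loss == positives - (tp - fp)` holds, which is why minimising the loss and maximising TP − FP select the same λ.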


Network-based estimation of hyperparameter Augmented MSE

A-MSE and oracle tuning parameter

[Figure: difference between the A-MSE estimate and the oracle value of λ (y-axis, −0.1 to 0.1) for n = 50, 100, 200, 500; panels (c) p = 50, (d) p = 170, (e) p = 290, (f) p = 500.]

The oracle value of λ is within the 95% confidence interval for the median of $\hat{\lambda}_{AMSE}$.


Simulated data

Comparison on simulated data

Compare 6 approaches:

StARS, AGNES, A-MSE (graph correlations), PC, AIC and BIC

Columns: method | penalized likelihood | uses network characteristics | subsampling | fully automatic | fast | very sparse graph estimates

PC     X X X X
A-MSE  X X X X
AGNES  X X X
StARS  X X
BIC    X X X X
AIC    X X X

Compare on 3 graph structure scenarios: hubs, power law and random networks.


Simulated data Graph topologies

Graph topologies in biological data

Networks with hubs. Typical in biological networks.

Power-law networks. The distribution of the number of connections ξ of each node is

$$P(\xi = k) = \frac{k^{-\alpha}}{\varsigma(\alpha)}, \quad k \ge 1,$$

for some constant α and the normalizing function ς(α).
Peng et al. (2009): α = 2.3 provides a distribution that is close to what is expected in biological networks.

Random networks:

$$P(\xi = k) = \binom{p}{k} \theta^k (1 - \theta)^{p-k},$$

where the parameter θ determines the proportion of edges (or sparsity) in the graph.
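The two degree distributions can be tabulated with a few lines of code. In this sketch the power-law normaliser is computed by truncating the sum at a hypothetical maximum degree `k_max`, which is an assumption made here so that the probabilities sum to one on a finite support; the binomial form follows the formula as stated.

```python
import math

def power_law_pmf(alpha, k_max):
    """P(xi = k) proportional to k^(-alpha) for k = 1..k_max (truncated normaliser)."""
    weights = [k ** (-alpha) for k in range(1, k_max + 1)]
    norm = sum(weights)  # stands in for the normalising function on the truncated support
    return [w / norm for w in weights]

def binomial_pmf(p, theta):
    """P(xi = k) = C(p, k) theta^k (1 - theta)^(p - k) for k = 0..p."""
    return [math.comb(p, k) * theta ** k * (1 - theta) ** (p - k)
            for k in range(p + 1)]
```

With α = 2.3 the power-law mass falls off quickly with k but leaves a heavier tail than the binomial, which is what allows a few highly connected nodes.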


Simulated data Graph topologies

Examples of simulated graphs

[Figure: examples of simulated graphs. Panels: (g) p = 50, hubs-based; (h) p = 170, hubs-based; (i) p = 290, hubs-based; (j) p = 50, power-law; (k) p = 170, power-law; (l) p = 290, power-law.]


Simulated data Performance

Average ranks for the MSE of the precision matrix

                 Hubs-based                  Power law
n          50    100   200   500       50    100   200   500

dimension p=50
AGNES      3.05  3.55  4.06  4.40      3.12  3.73  4.40  4.71
A-MSE      4.33  4.90  5.22  5.38      4.92  5.47  5.67  5.78
PC         5.23  5.80  5.58  5.15      4.58  5.13  4.85  4.49
StARS      1.27  1.49  1.18  1.28      1.17  1.43  1.04  1.07
BIC        5.38  3.73  3.14  3.06      5.33  3.66  3.08  3.02
AIC        1.73  1.52  1.82  1.73      1.90  1.58  1.96  1.92

dimension p=500
AGNES      2.13  3.00  3.92  4.30      2.11  3.00  3.62  4.11
A-MSE      4.28  4.78  5.13  5.35      4.81  5.25  5.27  5.47
PC         4.94  5.97  5.60  4.85      4.63  5.67  5.73  5.39
StARS      1.00  1.01  1.00  1.00      1.00  1.00  1.00  1.00
BIC        5.78  4.25  3.31  3.32      5.55  4.08  3.38  3.03
AIC        2.88  2.00  2.05  2.18      2.90  2.00  2.00  2.00


Simulated data Performance

Average ranks for the MSE of the dissimilarity matrix

                 Hubs-based                  Power law
n          50    100   200   500       50    100   200   500

dimension p=50
AGNES      2.88  2.70  2.23  2.09      3.60  3.02  2.38  2.09
A-MSE      2.83  2.47  1.65  1.44      2.12  1.65  1.20  1.41
PC         3.52  3.67  2.75  2.68      2.42  2.22  2.53  2.52
StARS      4.16  4.58  5.81  5.72      5.70  5.58  5.97  5.96
BIC        3.83  3.05  3.41  3.79      2.13  3.12  3.88  3.98
AIC        3.77  4.54  5.16  5.28      5.02  5.42  5.03  5.04

dimension p=170
AGNES      3.52  2.98  2.12  1.73      4.31  3.68  3.06  2.32
A-MSE      2.62  2.04  1.65  1.45      2.14  1.58  1.45  1.40
PC         2.46  2.49  3.11  3.83      2.12  1.62  1.73  2.32
StARS      6.00  5.78  6.00  6.00      6.00  6.00  6.00  6.00
BIC        2.14  2.52  3.14  3.34      1.80  3.12  3.77  3.97
AIC        4.26  5.18  4.98  4.65      4.62  5.00  5.00  5.00

dimension p=500
AGNES      4.83  3.25  2.06  1.60      4.89  4.00  3.38  2.56
A-MSE      2.51  1.80  1.94  2.00      2.09  1.72  1.33  1.42
PC         2.04  3.12  3.69  3.83      2.32  1.48  1.68  2.06
StARS      6.00  6.00  6.00  6.00      6.00  6.00  6.00  6.00
BIC        1.51  1.95  2.40  2.78      1.60  2.80  3.61  3.97
AIC        4.11  4.88  4.92  4.79      4.10  5.00  5.00  5.00


Simulated data Performance

True discovery rate TDR = TP/(TP + FP)

[Figure: TDR (y-axis, 0 to 1) against n = 50, 100, 200, 500, in panels for p = 50, 170, 290 and 500, for AGNES, A-MSE, PC, StARS, BIC and AIC.]

TDR increases with n for AGNES, A-MSE and PC, and decreases for AIC and BIC.
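The true discovery rate can be computed from a true and an estimated adjacency matrix by counting each undirected edge once. In the sketch below, returning 0 for an estimated graph with no edges is an assumption made here to avoid dividing by zero; the function names are illustrative.

```python
def edge_counts(A_true, A_hat):
    """Return (TP, FP): estimated edges that are, or are not, in the true graph."""
    p = len(A_true)
    tp = fp = 0
    for i in range(p):
        for j in range(i):          # each undirected pair counted once
            if A_hat[i][j]:
                if A_true[i][j]:
                    tp += 1
                else:
                    fp += 1
    return tp, fp

def true_discovery_rate(tp, fp):
    """TDR = TP / (TP + FP); 0.0 when no edges are estimated (assumption)."""
    return tp / (tp + fp) if tp + fp > 0 else 0.0
```

As an illustration with counts quoted elsewhere in the talk, TP = 116 and FP = 46 give a TDR of 116/162, roughly 0.72.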


Simulated data Performance

ROC curves

[Figure: six panels of ROC curves, TPR (y-axis, 0 to 0.6) against FPR (x-axis), with one curve per method.]

Dots: optimal graph selected by the corresponding method.


Simulated data Summary

Summary

AGNES is the best approach to recover global network characteristics (e.g. the proportion of edges, Mean Geodesic Distance) but generally leads to complex graphs that are difficult to interpret.

Augmented MSE: gives sparser graphs than AGNES and achieves better results in estimating the adjacency matrix A; more interpretable graphs.

Path Connectivity is computationally the fastest method and does only slightly worse than A-MSE in terms of MSE(A). It generally obtains simple graph structures which are easier to interpret.

The choice of method depends on the relative cost of False Positives compared to that of True Positives.


Tumour gene expression data

Tumour gene expression data

Gene expression data set from a colorectal tumour study (Hinoue et al., 2012).
25 patients.
Paired samples: gene expression profiling is obtained in each patient for a colorectal tumour sample and its healthy adjacent colonic tissue.
Total number of genes: 25,000.
7,579 genes were analysed (selected as differentially expressed between the conditions).


Tumour gene expression data

Dependence structure for tumour gene expression data: healthy

[Figure: estimated graphs for the healthy samples. Path Connectivity: clusters 1 to 10. A-MSE: clusters 1 to 17.]


Tumour gene expression data

Dependence structure for tumour gene expression data: tumour

[Figure: estimated graphs for the tumour samples. Path Connectivity: clusters 1 to 13. A-MSE: clusters 1 to 15.]


Tumour gene expression data

PC graph for gene expression data

10 clusters in the healthy samples
13 clusters in the tumour samples

Cluster 4 in the healthy samples (84 genes) overlaps with cluster 2 in the tumour samples (88 genes): they share 38 genes.

Overlap expected by chance: ∼ 4.45 genes.

Genes in Cluster 4 (normal) and Cluster 2 (tumour):
P53-signaling pathway (P53 being the classical cancer gene)
DNA replication
adaptive immune system


Summary

Summary and future work

Summary

Propose a network-based method to choose the hyperparameter in Gaussian graphical models
Estimation of the conditional dependence graph is more stable than with approaches which depend on Ω only via ||Ω||1
Estimated graphs are more interpretable
Choice of method should be determined by the relative cost of FP vs TP
Bayesian interpretation? A point estimator under a modularised DAG.

R package: "GMRPS"; the paper is on arXiv:1509.05326.

Current and future work

Asymptotic/non-asymptotic properties. In particular, given n and p, how large is the conditional dependence graph that can be estimated reliably.
Other risk functions, notably based on second (and other) eigenvalues of A
Test for the difference between the conditional dependence graphs in different groups of samples
“Differential” network: difference between networks in two conditions


Summary

References

Cai, T., W. Liu, and X. Luo (2011). A Constrained l1 Minimization Approach to Sparse Precision Matrix Estimation. Journal of the American Statistical Association 106(494), 594–607.

Costa, L. and F. Rodrigues (2007). Characterization of complex networks: A survey of measurements. Advances in Physics 56(1), 167–242.

Estrada, E. (2011). The structure of complex networks. New York: Oxford University Press.

Hinoue, T., D. J. Weisenberger, C. P. E. Lange, H. Shen, H.-M. Byun, D. Van Den Berg, S. Malik, F. Pan, H. Noushmehr, C. M. van Dijk, R. a. E. M. Tollenaar, and P. W. Laird (2012, February). Genome-scale analysis of aberrant DNA methylation in colorectal cancer. Genome Research 22(2), 271–82.

Kaufman, L. and P. Rousseeuw (2009). Finding groups in data: an introduction to cluster analysis. New Jersey: John Wiley & Sons.

Liu, H., K. Roeder, and L. Wasserman (2011). Stability approach to regularization selection (StARS) for high dimensional graphical models. Journal of Computational and Graphical Statistics, 1.

Meinshausen, N. and P. Bühlmann (2010). Stability Selection. Journal of the Royal Statistical Society, Series B 72, 417–473.

Peng, J., P. Wang, N. Zhou, and J. Zhu (2009, June). Partial Correlation Estimation by Joint Sparse Regression Models. Journal of the American Statistical Association 104(486), 735–746.


Summary

Thank you!


Summary

Simulated data

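$Y_i \sim N_p(0, \Omega^{-1})$, $i = 1, \ldots, n$

3 graph structure scenarios: hubs, power law and random networks.

Then

$$\Omega = \Omega^{(0)} + \delta I,$$

where the off-diagonal elements of Ω are (Cai et al., 2011)

$$\Omega^{(0)}_{ij} = \begin{cases} \mathrm{Unif}(0.5, 0.9) & \text{if } A_{ij} = 1 \text{ and } \mathrm{Bern}(0.5) = 1; \\ \mathrm{Unif}(-0.9, -0.5) & \text{if } A_{ij} = 1 \text{ and } \mathrm{Bern}(0.5) = 0; \\ 0 & \text{if } A_{ij} = 0, \end{cases}$$

with δ such that Ω is a positive definite matrix.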

Each simulation is repeated 50 times.
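The construction of Ω from an adjacency matrix can be sketched as follows. The slide only requires δ to make Ω positive definite; the sketch picks δ by strict diagonal dominance (largest absolute row sum plus a small `margin`), which is one sufficient choice and an assumption made here, as are the zero diagonal of Ω(0) and the function name.

```python
import random

def simulate_precision(A, rng, margin=0.1):
    """Build Omega = Omega0 + delta * I from a symmetric 0/1 adjacency matrix A."""
    p = len(A)
    omega0 = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i):
            if A[i][j]:
                magnitude = rng.uniform(0.5, 0.9)
                sign = 1.0 if rng.random() < 0.5 else -1.0   # Bern(0.5) sign
                omega0[i][j] = omega0[j][i] = sign * magnitude
    # A symmetric, strictly diagonally dominant matrix with a positive diagonal
    # is positive definite, so this delta is sufficient (assumed choice).
    delta = max(sum(abs(x) for x in row) for row in omega0) + margin
    return [[omega0[i][j] + (delta if i == j else 0.0) for j in range(p)]
            for i in range(p)]
```

Passing a seeded generator such as `random.Random(0)` makes a simulation run reproducible; a smaller δ, for instance one based on the smallest eigenvalue of Ω(0), would also satisfy the positive definiteness requirement.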


Summary

Path connectivity and 2nd eigenvalue of A

[Figure: two curves against λ (x-axis, 0.30 to 0.55): H(λ)100 (axis 0 to 150) and evalue2, the second eigenvalue of A (axis 3.0 to 4.0).]
