Liver Toxicity in rat
Causal models
Distributions living on a DAG
Estimation
Some results
How to search Graphical models?
Conclusions
For Further Reading
References
Acknowledgements
Directed Acyclic Graphs and the use of Linear Mixed Models
Prediction and Marker Selection
Siem Heisterkamp (1, 2)
1 Grünenthal, department CDB-Biometrics, Aachen, Germany
2 Groningen Bioinformatics Centre (GBIC), Groningen, The Netherlands
Bayes 2012, Aachen
Siem Heisterkamp DAG
Acknowledgements
- Joanna in ’t Hout (University of Nijmegen, dept. Medical Statistics and Epidemiology)
- Herman Vreuls (MSD, dept. BARDS)
- Jan Polman (formerly MSD, dept. MDI)
- Susanne Bauerschmidt (formerly MSD, dept. MDI)
- Geny Groothuis (University of Groningen, dept. Pharmacy)
- Marieke Elferink (University of Groningen, dept. Pharmacy)
- Peter Olinga (University of Groningen, dept. Pharmacy)
- Elisa van Leeuwen (University of Groningen, dept. Pharmacy)
Liver Toxicity in rat
- Part of a large study comparing expression in the liver of humans, rats and HEP cell lines (collaboration between the former Organon and the University of Groningen), Elferink et al. (2011)
In this presentation:
- Rats exposed to a range of liver-toxic compounds (among others Paracetamol), using liver slices
- Gene-expression arrays applied to liver tissue
- Q: How to find biologically meaningful associations between treatment and gene expression?
Usual approach
- Unstructured hunting for SNPs or bio-markers, either univariate or multi-variable
- Selection of variables is completely data-driven
- Subject of this presentation:
  - Causal models formulated from hypotheses using findings from databases (Ingenuity, GRAIL, etc.)
  - Test these by relatively simple means using linear causal models
  - Find smaller sub-graphs
Part of Ingenuity Pathway Analysis
Some DAG-Theory
Directed Acyclic Graph
[Figure: three panels on the nodes Lbp, Tnfr.1, Tnfr.2, Jun.1, Jun.2, Apoc4 and y: "Full model with common edges", "DAG" and "moralized DAG"]
Some DAG-Theory
[Figure: "Full DAG model" on the nodes Lbp, Tnfr.1, Tnfr.2, Jun.1, Jun.2, Apoc4 and y]
Adjacency matrix

          Lbp  Tnfr.1  Tnfr.2  Jun.1  Jun.2  Apoc4  Y
Lbp         0       1       1      0      0      0  0
Tnfr.1      0       0       1      1      0      0  0
Tnfr.2      0       0       0      1      1      0  0
Jun.1       0       0       0      0      1      1  0
Jun.2       0       0       0      0      0      1  0
Apoc4       0       0       0      0      0      0  1
Din         0       1       2      2      2      2  1

Din: in-degree of each node (column sums)
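As a minimal sketch, the adjacency matrix and its in-degrees can be reproduced with NumPy. The all-zero row for Y is an assumption (the slide shows no row for Y, and Y has no outgoing arrows in the figure); the variable names are illustrative only.

```python
import numpy as np

nodes = ["Lbp", "Tnfr.1", "Tnfr.2", "Jun.1", "Jun.2", "Apoc4", "Y"]

# A[i, j] = 1 means an arrow from nodes[i] to nodes[j].
A = np.array([
    [0, 1, 1, 0, 0, 0, 0],  # Lbp
    [0, 0, 1, 1, 0, 0, 0],  # Tnfr.1
    [0, 0, 0, 1, 1, 0, 0],  # Tnfr.2
    [0, 0, 0, 0, 1, 1, 0],  # Jun.1
    [0, 0, 0, 0, 0, 1, 0],  # Jun.2
    [0, 0, 0, 0, 0, 0, 1],  # Apoc4
    [0, 0, 0, 0, 0, 0, 0],  # Y: assumed row, not shown on the slide
])

# In-degree of each node = column sums (the "Din" row of the table).
d_in = A.sum(axis=0)

# With nodes listed in a topological order, a DAG has a strictly
# upper-triangular adjacency matrix.
is_acyclic = bool(np.array_equal(A, np.triu(A, k=1)))

print(dict(zip(nodes, d_in.tolist())), is_acyclic)
```

The column sums match the Din row of the table, which confirms that rows are the parents and columns the children.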
Some DAG-Theory
Markov equivalence of different DAGs
Example: DAGs with 3 nodes
[Figure: four panels, "model 1" to "model 4", each a three-node DAG on the nodes a, b and c]
Some DAG-Theory
Markov equivalence of different DAGs
Associated moralized graphs
[Figure: four panels, "moralized model 1" to "moralized model 4", the moralized graphs of the three-node DAGs on the nodes a, b and c]
Some DAG-Theory
Seemingly different DAGs may be equivalent
- Two DAGs are Markov equivalent iff
  1. their skeleton graphs are equal
  2. their 'amoralities' are the same
- In other words: the moralized graphs must be the same
- Causality can only be established in case of 'colliding' arrows
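A minimal sketch of the two-part criterion above: compare skeletons and the sets of unmarried colliders (a -> c <- b with a and b not adjacent). The three-node graphs used here (chain, reversed chain, fork, collider) are illustrative assumptions and are not claimed to be the four models shown in the figures.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of the graph: a set of unordered node pairs."""
    return {frozenset(e) for e in edges}

def amoralities(edges):
    """Colliders a -> c <- b whose parents a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for a, c in edges:
        parents.setdefault(c, set()).add(a)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(edges1, edges2):
    """Same skeleton and same unmarried colliders."""
    return (skeleton(edges1) == skeleton(edges2)
            and amoralities(edges1) == amoralities(edges2))

# Hypothetical three-node examples, edges given as (parent, child).
chain = [("a", "b"), ("b", "c")]
reversed_chain = [("c", "b"), ("b", "a")]
fork = [("b", "a"), ("b", "c")]
collider = [("a", "b"), ("c", "b")]

print(markov_equivalent(chain, reversed_chain))  # True
print(markov_equivalent(chain, fork))            # True
print(markov_equivalent(chain, collider))        # False
```

Only the collider differs from the other three, which is why the direction of arrows can be identified only where arrows collide.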
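Directed Acyclic Graph: Finding the distributions
- $\log \operatorname{lik}(\mathrm{DAG}) = \sum_{\nu} \log P(\mathrm{Child} \mid \nu)$
- Linear model; algorithm by Cox & Wermuth (1996):
  1. Moralize the graph
  2. Trace back from $Y$: $E[y \mid \nu_y] = \sum_j a_{jy}\, \beta_{y \mid \nu_j}$, with $\left(\sum_j a_{jy} - d_y\right) \neq 0$
  3. Conditional log-likelihood: $\propto \lambda \left(y - E[y \mid \nu_y]\right)^2$, with precision $\lambda \left(d_y - \sum_j a_{jy}\right)$
  4. Repeat step 2 until the last married grand-parents...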
Directed Acyclic Graph: Finding the distributions
- Algorithm by Cox & Wermuth (1996):
  1. Let $A$ be the adjacency matrix of a DAG
  2. Construct the moralized $A_m$; strip $Y$ and other barren variables
  3. Construct the Laplacian:
     $$L_A = I - D_m^{-1/2} A_m D_m^{-1/2}$$
- With $D_m$ the diagonal matrix of degrees for each node
- $L_A$ is the inverse of the prior covariance matrix (precision or concentration)
- May be weighted and of deficient rank
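The three steps can be sketched in NumPy as follows. This is an illustration under assumptions, not the authors' implementation: the edge list is read off the adjacency matrix of the full DAG model, the graph is unweighted, only Y is stripped, and the function names `moralize` and `normalized_laplacian` are hypothetical.

```python
import numpy as np

nodes = ["Lbp", "Tnfr.1", "Tnfr.2", "Jun.1", "Jun.2", "Apoc4", "Y"]
edges = [("Lbp", "Tnfr.1"), ("Lbp", "Tnfr.2"), ("Tnfr.1", "Tnfr.2"),
         ("Tnfr.1", "Jun.1"), ("Tnfr.2", "Jun.1"), ("Tnfr.2", "Jun.2"),
         ("Jun.1", "Jun.2"), ("Jun.1", "Apoc4"), ("Jun.2", "Apoc4"),
         ("Apoc4", "Y")]

# Step 1: adjacency matrix A, with A[parent, child] = 1.
idx = {name: i for i, name in enumerate(nodes)}
A = np.zeros((len(nodes), len(nodes)), dtype=int)
for parent, child in edges:
    A[idx[parent], idx[child]] = 1

def moralize(A):
    """Drop directions and connect ('marry') all parents of a common child."""
    A = np.asarray(A)
    M = ((A + A.T) > 0).astype(int)
    for child in range(A.shape[0]):
        parents = np.flatnonzero(A[:, child])
        for i in parents:
            for j in parents:
                if i != j:
                    M[i, j] = 1
    return M

def normalized_laplacian(Am):
    """L = I - D^{-1/2} Am D^{-1/2}; isolated nodes get a zero scaling."""
    deg = Am.sum(axis=1).astype(float)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    D = np.diag(inv_sqrt)
    return np.eye(len(deg)) - D @ Am @ D

# Step 2: moralize, then strip Y.
Am = moralize(A)
keep = [i for i, name in enumerate(nodes) if name != "Y"]
Am = Am[np.ix_(keep, keep)]

# Step 3: Laplacian and its rank.
L_A = normalized_laplacian(Am)
rank = int(np.linalg.matrix_rank(L_A, tol=1e-8))
print(rank)
```

For a connected moral graph the Laplacian has exactly one zero eigenvalue, so its rank is one less than the number of nodes; this is the rank deficiency mentioned above.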
Directed Acyclic Graph: Causality and Distributions
According to DAG theory:
- Causality can only be established in case of 'colliding' arrows
Corollary from the Normal graph theory, Cox & Wermuth (1996):
- A node cannot be causal with respect to other nodes if the correlation between the latter conditional on the former is ZERO
- Equivalent to a zero entry in the inverse of the covariance matrix
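The link between a zero entry in the inverse covariance matrix and a zero conditional correlation can be illustrated with the standard Gaussian identity for partial correlations. This is a sketch: the 3-node precision matrix below is a made-up example, and `partial_correlations` is a hypothetical helper name.

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlation of each pair given all remaining variables.

    Uses rho_ij = -K_ij / sqrt(K_ii * K_jj) with K the inverse covariance.
    """
    K = np.linalg.inv(cov)
    d = np.sqrt(np.diag(K))
    P = -K / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P

# Hypothetical precision matrix for an undirected chain a - b - c:
# no direct a-c edge, so the (a, c) entry is zero.
K = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
cov = np.linalg.inv(K)
P = partial_correlations(cov)

print(np.round(cov, 3))  # a and c are marginally correlated
print(np.round(P, 3))    # but their partial correlation given b is zero
```

In this example a and c have a nonzero marginal covariance, yet the zero in the precision matrix means they are uncorrelated once b is conditioned on.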
DAG and the linear mixed model
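- $X$ represents observations on nodes (for simplicity: no other variables)
$$Y \mid \beta \sim \mathrm{Nor}\left(X (I + A_D)\,\alpha + I\beta,\; \sigma^2 \Sigma_y\right)$$
$$\beta \sim \mathrm{Nor}\left(0,\; \lambda^{-1} L_A^{-1}\right)$$
- $\sigma^2 \Sigma_y$: the covariance matrix of the observations
- $\lambda L_A$: the precision
- $L_A$ of rank $r \le p$ ($\beta$ random effects)
- $A_D$: adjacency matrix of the dual graph of edges ($\alpha$ fixed effects)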
Linear Causal model
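$$\log \operatorname{lik}\left(\beta \mid Y, A, X, \lambda, \alpha, \overbrace{\ldots}^{\text{usual}}\right) = \underbrace{\log \operatorname{lik}\left(Y \mid \beta, A, X, \alpha, \overbrace{\ldots}^{\text{usual}}\right)}_{\text{usual log lik}} + \underbrace{-0.5\, n \left(\lambda\, \beta^{t} L_A\, \beta - r_{L_A} \log(\lambda) - \log(\det(L_A))\right)}_{\text{prior log lik}}$$
- $r_{L_A}$: rank of $L_A$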
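The prior term of this log-likelihood can be evaluated numerically as sketched below. One loud assumption: because $L_A$ may be rank deficient, $\det(L_A)$ is taken here as the pseudo-determinant (product of the nonzero eigenvalues); the slides do not say how the determinant is handled. The function name `prior_loglik` and the 3-node example are hypothetical.

```python
import numpy as np

def prior_loglik(beta, L, lam, n, tol=1e-8):
    """-0.5 * n * (lam * beta' L beta - r * log(lam) - log det(L)).

    ASSUMPTION: det(L) is the pseudo-determinant, i.e. the product of the
    eigenvalues above `tol`, and r is the number of such eigenvalues.
    """
    eig = np.linalg.eigvalsh(L)
    nonzero = eig[eig > tol]
    r = len(nonzero)
    log_pdet = float(np.sum(np.log(nonzero)))
    quad = lam * (beta @ L @ beta)
    return -0.5 * n * (quad - r * np.log(lam) - log_pdet)

# Hypothetical example: normalized Laplacian of the undirected path a - b - c.
Am = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
d = Am.sum(axis=1)
L = np.eye(3) - np.diag(d ** -0.5) @ Am @ np.diag(d ** -0.5)

beta_zero = np.zeros(3)
beta_null = np.array([1.0, np.sqrt(2.0), 1.0])  # lies in the null space of L

print(prior_loglik(beta_zero, L, lam=1.0, n=1))
print(prior_loglik(beta_null, L, lam=1.0, n=1))
```

Both calls give the same value, since a vector in the null space of $L_A$ is not penalized by the quadratic form; this is one practical consequence of the rank deficiency.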
Similarities with Linear mixed model
Similarities
- Identifiable (as in mixed models) for fixed $\lambda \cdot \sigma^2$; usually a suitable $\hat{\sigma}^2$ is plugged in, or $\sigma = 1$
- Standard software (e.g. lme in S-Plus or R) with a user-defined variance structure
- Different DAGs for different random levels, e.g. gender
- Common causes may be modeled as well
Differences with Linear mixed model
Differences
- Changing the adjacency matrix $A$ may change the Laplacian $L_A$ as well
- Consequences for estimation and testing:
  1. The Wald test cannot be used for individual interactions, as $L_A$ depends on $H_0$
  2. ML estimation must be used with general criteria (AIC, AICc, BIC, gBIC)
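For reference, the textbook forms of three of these criteria are sketched below. gBIC is left out because the slides do not define it. The log-likelihood, parameter count and sample size in the example are made-up numbers, not results from this study.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * loglik + 2.0 * k

def aicc(loglik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)

def bic(loglik, k, n):
    """Bayesian information criterion: -2 log L + k log n."""
    return -2.0 * loglik + k * math.log(n)

# Hypothetical fit: maximized log-likelihood -100, 3 parameters, 20 samples.
loglik, k, n = -100.0, 3, 20
print(aic(loglik, k), aicc(loglik, k, n), bic(loglik, k, n))
```

All three are computed from the maximized likelihood only, so candidate DAGs can be compared even though each one implies a different $L_A$.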
4 best DAGs using gBIC
[Figure: four panels showing the best DAGs. DAG(1), DAG(2) and DAG(3) are drawn on the nodes Lbp, Tnfr.1, Tnfr.2, Jun.1, Jun.2, Apoc4 and y; DAG(4) on Lbp, Tnfr.1, Tnfr.2, Jun.1, Apoc4 and y]
- DAG(1): gBIC -28071.45
- DAG(2): gBIC -14310.42
- DAG(3): gBIC -10965.17
- DAG(4): gBIC -10788.49
4 best moralized DAGs
[Figure: four panels, "moralized (1)" to "moralized (4)", the moralized graphs of the four best DAGs. Panels (1) to (3) are drawn on the nodes Lbp, Tnfr.1, Tnfr.2, Jun.1, Jun.2, Apoc4 and y; panel (4) on Lbp, Tnfr.1, Tnfr.2, Jun.1, Apoc4 and y]
In-vivo liver toxicity in rat: validation by leave-one-out
Table: Prediction for 4 models

            DAG(1)      DAG(2)      DAG(3)      DAG(4)
gBIC        -28071      -14310      -10965      -10788
            T    C      T    C      T    C      T    C
pred. T     6    0      6    0      6    0      6    0
pred. C     4   10      4   10      4   10      4   10
Total      10   10     10   10     10   10     10   10

Note: None of the models are Markov-equivalent
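The table can be summarized with the usual classification rates. This is a small worked calculation on the counts above, assuming T and C are the two true classes and treating T as the positive class; the class meaning itself is not spelled out on the slide.

```python
# Counts copied from the table (identical for all four DAGs).
# Columns are the true classes, rows the leave-one-out predictions.
pred_T_true_T, pred_T_true_C = 6, 0
pred_C_true_T, pred_C_true_C = 4, 10

total = pred_T_true_T + pred_T_true_C + pred_C_true_T + pred_C_true_C

# ASSUMPTION: T is treated as the positive class.
sensitivity = pred_T_true_T / (pred_T_true_T + pred_C_true_T)
specificity = pred_C_true_C / (pred_C_true_C + pred_T_true_C)
accuracy = (pred_T_true_T + pred_C_true_C) / total

print(sensitivity, specificity, accuracy)  # 0.6 1.0 0.8
```

Since the four models give identical predictions, leave-one-out validation does not separate them here; only the gBIC values do.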
How to search for plausible models?
- The number of candidate models is too large to search through
- With the lasso or the graphical lasso, edges can be deleted automatically (e.g. glasso by Friedman)
- But: this results in networks that are disconnected or biologically hard to explain (?)
- Proposal: an extra penalty function with L1 and L0 norms on connected edges, acting on the dual graph
How to search plausible models (2)?
- Double exponential prior on $\alpha \in$ Edges (Lasso, L1)
- Poisson prior on the number of edges (L0)
- Constraint: at least one connected path between Grandparent(s) and Final node(s)
- $\Rightarrow$ 1 - prob(no pathway of any length exists)
- Extra priors added (penalty functions):
$$\underbrace{-0.5\, n\, \lambda_1 \Big(\sum_{s \in E} |\alpha_s|\Big)}_{\text{prior log lik, L1 norm}} + \underbrace{n \log\Big(1 - \exp\Big(-\sum_k \sum_{f \in F,\, g \in G} \sum_s \pi(\lambda_1, \lambda_2, \alpha_s) \cdot w^{k}_{s,f,g}\Big)\Big)}_{\text{prior log lik, L0 norm}}$$
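The connectivity requirement can be illustrated with a plain breadth-first search. Note that this is a hard yes/no check, swapped in for illustration; the slides use a soft penalty based on the probability that no pathway exists. The edge list is read off the full DAG model, and `has_path` is a hypothetical helper name.

```python
from collections import deque

def has_path(edges, sources, targets):
    """True if some directed path leads from any source to any target."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    targets = set(targets)
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        if node in targets:
            return True
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return False

full_dag = [("Lbp", "Tnfr.1"), ("Lbp", "Tnfr.2"), ("Tnfr.1", "Tnfr.2"),
            ("Tnfr.1", "Jun.1"), ("Tnfr.2", "Jun.1"), ("Tnfr.2", "Jun.2"),
            ("Jun.1", "Jun.2"), ("Jun.1", "Apoc4"), ("Jun.2", "Apoc4"),
            ("Apoc4", "Y")]

# Deleting the only edge into Y disconnects the grandparent from the final node.
pruned = [e for e in full_dag if e != ("Apoc4", "Y")]

print(has_path(full_dag, ["Lbp"], ["Y"]))  # True
print(has_path(pruned, ["Lbp"], ["Y"]))    # False
```

A sparse model returned by an L1 penalty alone could look like `pruned`, which is the kind of disconnected network the extra penalty is meant to discourage.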
What have we learned?
- Unstructured univariate or multi-variable hunting for differentiating genes, SNPs, markers
- Selection of variables: no guarantee of biologically meaningful models
  - Extract hypotheses from knowledge bases (Ingenuity etc.)
  - Fit with causal models
  - Search smaller models
- Some problems to be solved practically:
  - How to compare alternative models?
For Further Reading I
Cox, D.R., Wermuth, N. (1996) Multivariate Dependencies: Models, Analysis and Interpretation. London: Chapman & Hall.
Pearl, J. (2000) Causality: Models, Reasoning and Inference. Cambridge: Cambridge University Press.
Raftery, A.E. (1996) Hypothesis testing and model selection. In: Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (eds.) Markov Chain Monte Carlo in Practice. London: Chapman and Hall.
Elferink, M.G.L., Olinga, P., van Leeuwen, E.M., Bauerschmidt, S., Polman, J., Schoonen, W.G., Heisterkamp, S.H., Groothuis, G.M.M. (2011) Gene expression analysis of precision-cut human liver slices indicates stable expression of ADME-Tox related genes. Toxicol Appl Pharmacol. PMID 21420995.
For Further Reading II
Dawid, A.P. (2004) Probability, Causality and the Empirical World: A Bayes-de Finetti-Popper-Borel Synthesis. Statistical Science, 19, 44-57.
Dawid, A.P. (2007) Fundamentals of Statistical Causality. Research Report no. 279, Department of Statistical Sciences, University College London, September 2007.