Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal.

Date post: 28-Mar-2015
Page 1

Systems analysis of innate immune mechanisms in infection – a role for HPC

Peter Ghazal

Page 2

What is Pathway Biology?

Pathway biology is…

A systems biology approach for understanding a biological process

- empirically, by functional association of multiple gene products & metabolites

- computationally, by defining networks of cause-effect relationships.

Pathway models link the molecular, cellular and whole-organism levels.

Formal models allow predicting the outcome of costly or intractable experiments.

Page 3

Focus and outline of talk

• High-throughput approaches to mapping and understanding the host response to infection.

• Targeting the host, NOT the “bug”, as an anti-infective strategy

• Making HPC more accessible: SPRINT, a new framework for high-dimensional biostatistical computation

The story starts at the bedside

Page 4

Differentially expressed genes in neonates: control vs infected (FDR p>1x10^-5, FC±4)

Page 5

Dealing with HTP data: Impact of data variability

• Model for introducing biological and technical variation:

$$y_{ij} = \mu_i + b_i + \varepsilon_{ij}$$

where i = patient, j = replicate,

$$b_i \sim N(0, \sigma_b^2), \qquad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$$

$$\underbrace{\sigma^2}_{\text{Total variation}} = \underbrace{\sigma_b^2}_{\text{Biological variation}} + \underbrace{\sigma_\varepsilon^2}_{\text{Technical variation}}$$
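
The variance decomposition on this slide can be checked numerically. The R sketch below simulates a two-level model in which every measurement is a mean plus a patient-level deviation (biological) plus a replicate-level error (technical), and compares the sample variances with the sum of the two components. The single common mean `mu`, the sample sizes and both standard deviations are illustrative assumptions, not values from the talk.

```r
set.seed(1)

n_patients   <- 2000   # hypothetical number of patients (index i)
n_replicates <- 3      # hypothetical replicates per patient (index j)
mu      <- 8           # assumed common mean log-expression
sigma_b <- 1.0         # assumed biological (between-patient) SD
sigma_e <- 0.5         # assumed technical (within-patient) SD

# One biological deviation per patient, one technical error per measurement.
b   <- rnorm(n_patients, mean = 0, sd = sigma_b)
eps <- matrix(rnorm(n_patients * n_replicates, mean = 0, sd = sigma_e),
              nrow = n_patients)
y   <- mu + b + eps    # b is recycled down the columns, so row i gets b[i]

total_var_theory <- sigma_b^2 + sigma_e^2
total_var_sample <- var(as.vector(y))

# Technical variance: average within-patient variance across replicates.
tech_var_sample <- mean(apply(y, 1, var))
# Biological variance: approximately what remains of the total.
bio_var_sample  <- total_var_sample - tech_var_sample

print(c(theory = total_var_theory, total = total_var_sample,
        biological = bio_var_sample, technical = tech_var_sample))
```

With replicates available, the technical component can be estimated directly from within-patient spread, which is why the number of replications appears later as a factor in the simulations.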

Page 6

Modelling patient variability and biomarkers for classification

Machine Learning methods:

Random Forest (RF)

Support Vector Machine (SVM)

Linear Discriminant Analysis (LDA)

K-Nearest Neighbour (K-NN)

How do different data characteristics affect the misclassification error?

Factors investigated:

• Data variability (biological and technical variation)

• Training set size

• Number of replications

• Correlation between RNA biomarkers

Mizanur Khondoker

Page 7

Error rate vs. (number of biomarkers, total variation)

An example of a simulation model to quantify the number of biomarkers and the level of patient variability

Page 8

Conclusions from simulations

• There is increased predictive value in using multiple markers, although there is no magic number that can be recommended as optimal in all situations.

• The optimal number greatly depends on the data under study.

• The important determining factors of the optimal number of biomarkers are:

• The degree of differential expression (fold-change, p-values etc.)

• The amount of biological and technical variation in the data.

• The size of the training set upon which the classifier is to be built.

• The number of replicates for each biomarker.

• The degree of correlation between biomarkers.

• It is now possible to predict the optimal number through simulation.
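
A minimal R sketch of this kind of simulation, assuming independent, normally distributed biomarkers and using only LDA (one of the four classifiers listed earlier), is given below. The effect size `delta`, the two levels of total variation and the training-set size are hypothetical choices for illustration; the talk's actual simulation design is not reproduced here.

```r
library(MASS)  # for lda(); MASS is a recommended package bundled with R

set.seed(42)

# Simulate one data set: n samples per class, p independent biomarkers,
# each shifted by `delta` in the infected class, with total SD `sigma`.
simulate_data <- function(n, p, delta, sigma) {
  x0 <- matrix(rnorm(n * p, mean = 0,     sd = sigma), nrow = n)
  x1 <- matrix(rnorm(n * p, mean = delta, sd = sigma), nrow = n)
  list(x = rbind(x0, x1),
       y = factor(rep(c("control", "infected"), each = n)))
}

# Train LDA on a small training set, estimate the error on a large test set.
lda_error <- function(p, n_train = 25, n_test = 2000, delta = 1, sigma = 1) {
  train <- simulate_data(n_train, p, delta, sigma)
  test  <- simulate_data(n_test,  p, delta, sigma)
  fit   <- lda(train$x, grouping = train$y)
  pred  <- predict(fit, test$x)$class
  mean(pred != test$y)
}

marker_counts  <- c(1, 2, 5, 10)
error_low_var  <- sapply(marker_counts, lda_error, sigma = 1)
error_high_var <- sapply(marker_counts, lda_error, sigma = 2)

print(data.frame(markers       = marker_counts,
                 error_sigma_1 = error_low_var,
                 error_sigma_2 = error_high_var))
```

Extending the grid over training-set size, replication and correlation would follow the same pattern, at a computational cost that grows with every added factor.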

Page 9

Rule of five: Criteria for pathogenesis-based biomarkers

• Readily accessible

• Multiple markers

• Appropriately powered statistical association

• Physiological relevance

• Causally linked to phenotype

Key challenge is mapping biomarkers into biological context and understanding

Requires an experimental model system

Page 10

[Figure: macrophage differentiation and activation. Labels: Pluripotent Stem Cell; Myeloid Stem Cell; Promonocyte; Monocyte?; Resident Macrophage (immature); Primed Macrophage; “Activated” Cytolytic Macrophage; Activated T-Lymphocyte; Lymphokines; Inflammation; (Primary Signal) IFN-gamma; (Secondary Signal) Endotoxin, IFN-gamma; ?]

Page 11

Transcriptional profile of MΦ activated by Ifng

Page 12

How do we tackle this?

[Figure: PATHWAY BIOLOGY, linked to: Literature / Data-mining; Modelling / Network analysis; Experimentation (genetic screens, microarrays, Y2H, mechanism-based studies).]

A sub-system study of cause-effect relationships with a defined start (input) and end (output).

Page 13

Page 14

Mapping new nodes

[Figure: PATHWAY BIOLOGY, linked to: Literature / Data-mining; Experimentation.]

Page 15

Transcriptional profile of MΦ infected with CMV

Page 16

Hypothesis generation

• Blue zone vs red zone

Page 17

Down-regulation of the sterol pathway

Page 18

BUT… recorded changes are small. Do they have any effect?

Next step: modelling

[Figure: PATHWAY BIOLOGY, linked to: Pure and applied modelling / Network inference analysis; Experimental data.]

Page 19

Workflow

[Figure: modelling workflow. Literature-derived model → ODE model, with known parameters and unknown parameters. Unknown parameters: order-of-magnitude estimation, then vary parameters by an order of magnitude → ensemble of ODE models → ensemble average → results.]

Page 20

Modelling

Where available, parameters were obtained from the BRENDA enzyme database

http://www.brenda-enzymes.info/

Cholesterol Synthesis

ODE model, Michaelis-Menten interactions

• 57 parameters

• 25 known parameters

• 32 unknown parameters

Algorithm

• Using the first three time points, calculate an equilibrium state

• Release the model from equilibrium and simulate using enzyme data

• For each unknown, consider this model across 3 orders of magnitude, holding the other unknown parameters fixed.
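
The one-at-a-time scan described in the algorithm can be sketched in R on a toy two-enzyme pathway. Everything concrete here is an assumption for illustration: the pathway, the parameter values (placeholders, not taken from BRENDA), the choice of `km2` as the "unknown", and the forward Euler integrator. The equilibrium-fitting step from the first three time points is omitted.

```r
# Toy pathway: constant influx -> S --(enzyme 1)--> I --(enzyme 2)--> product.
# Rates follow Michaelis-Menten kinetics: v = vmax * [X] / (km + [X]).
mm_rate <- function(x, vmax, km) vmax * x / (km + x)

simulate_pathway <- function(params, t_end = 50, dt = 0.01) {
  s <- 1; i <- 1; product <- 0            # hypothetical initial concentrations
  n_steps <- round(t_end / dt)
  for (step in seq_len(n_steps)) {        # forward Euler integration
    v1 <- mm_rate(s, params$vmax1, params$km1)
    v2 <- mm_rate(i, params$vmax2, params$km2)
    s <- s + dt * (params$influx - v1)
    i <- i + dt * (v1 - v2)
    product <- product + dt * v2
  }
  c(s = s, i = i, product = product)
}

# "Known" parameters plus an order-of-magnitude guess for the unknown km2.
base_params <- list(influx = 0.5, vmax1 = 1, km1 = 0.5, vmax2 = 1, km2 = 1)

# Scan the unknown across three orders of magnitude around its guess,
# holding every other parameter fixed.
scale_factors <- 10^seq(-1.5, 1.5, length.out = 7)
ensemble <- sapply(scale_factors, function(f) {
  p <- base_params
  p$km2 <- base_params$km2 * f
  simulate_pathway(p)
})
colnames(ensemble) <- signif(scale_factors, 2)

ensemble_average <- rowMeans(ensemble)
print(ensemble)
print(ensemble_average)
```

With 32 unknowns, each scanned independently, the number of model runs multiplies quickly, which is one route by which pathway modelling becomes a compute-bound problem.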

Page 21

Cholesterol (output of sterol pathway): results from simulation and experiments

[Figure panels: Predictions (cholesterol rate/flux); Experiments (cholesterol levels).]

Page 22

Lipidomic – mass spec results

Page 23

• Infection down-regulates the cholesterol biosynthesis pathway and free intracellular cholesterol.

• Can now predict the behaviour of the pathway.

• But?

• Just as good as UK (Met Office) weather predictions… because…

Page 24

Page 25

Scalability issues related to increased complexity

• Increasing complexity and size of biological data

• Solution: High Performance Computing (HPC)?

HPC for High Throughput Post-Genomic Data

Page 26

Problems with large biological data sets

– Volume of data

• Many research groups can now routinely generate high volumes of data

– Memory (RAM) handling:

• Input data size is too big

• Algorithms cause linear, exponential or other growth in data volume

– CPU performance:

• Routine analyses take too long

Page 27

Limitation examples: Clustering

• Gene clustering using R on a high-spec workstation:

– 16,000 genes, k=12 gene clusters runs for ~30 min

– 16,000 genes, k=40 gene clusters runs for ~10 hrs

[Figure: Partitioning-Around-Medoids, n genes, k=12 clusters requested; the memory fail limit is marked.]
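
One reason for a memory limit in Partitioning-Around-Medoids is that it works from all pairwise dissimilarities between genes, so storage grows quadratically with n. The R sketch below estimates that storage (about 0.95 GiB for 16,000 genes, before any working copies) and runs `pam()` from the `cluster` package on a small, made-up expression matrix. The matrix dimensions and group shifts are hypothetical.

```r
library(cluster)  # provides pam(); a recommended package bundled with R

# Memory needed just for the pairwise dissimilarities of n genes:
# n * (n - 1) / 2 double-precision values at 8 bytes each.
dissimilarity_gib <- function(n_genes) {
  n_genes * (n_genes - 1) / 2 * 8 / 2^30
}
sizes <- c(1000, 4000, 16000, 32000)
print(data.frame(genes = sizes, GiB = round(dissimilarity_gib(sizes), 2)))

# Small, hypothetical expression matrix: 300 genes x 20 arrays,
# with three groups of 100 genes shifted to different mean levels.
set.seed(7)
expr <- matrix(rnorm(300 * 20), nrow = 300) + rep(c(0, 3, 6), each = 100)
fit  <- pam(expr, k = 3)
print(table(fit$clustering))
```

Doubling the number of genes roughly quadruples the memory requirement, so a workstation that copes with one array platform can fail outright on the next.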

Page 28

Outcome: Adverse effect on research

• Arbitrary size reduction of input data

• Batch processing of data

• Analyses in smaller steps

• Avoidance of some algorithms

• Failure to analyse

Page 29

Solution: High Performance Computing

• HPC takes many forms:

– clusters, networks, supercomputers, grids, GPUs, “cloud”, ...

• Provides more computational power

• HPC is technically accessible for most:

– a department's own systems, Eddie, HECToR, ...

However!

Page 30

HPC Access Hurdles

• Cost of access

• Time to adapt

• Complex, requires specialist skills

• Consultancy (e.g. EPCC) only feasible on an ad-hoc basis, not routinely

Page 31

HPC Access Hurdles

HPC is (currently) optimal for:

- Specific problems that can be tackled as a project

- Individuals who are familiar with parallelisation and system architectures

HPC is not optimal for:

- Routine/casual analyses of high-throughput data

- Ad-hoc and ever-changing analysis algorithms

- Data analysts without time or knowledge to sidestep into parallelisation software/hardware.

Page 32

Need a step change (up!) to broaden HPC access to all biologists

The challenge is two-fold:

• Provide a generic solution

• Easy to use

Page 33

A solution for analyses using R: SPRINT (DPM & EPCC)

[Figure: three workflows. (1) Post-genomic data → R → biological results. (2) Very large post-genomic data → R. (3) Very large post-genomic data → R with SPRINT on HPC (Eddie) → biological results.]

Page 34

SPRINT

SPRINT has 2 components:

1. HPC harness: manages access to HPC

2. Library of parallel R functions, e.g. cor (correlation), pam (clustering), maxt (permutation)

Allows non-specialists to make use of HPC resources, with analysis functions parallelised by us or the R community.

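Page 35

Code comparison

Serial R (multtest):

    library("multtest")    # provides mt.maxT() and the golub data set
    data(golub)
    smallgd <- golub[1:100, ]
    classlabel <- golub.cl

    # permutation test on the first 100 genes
    resT <- mt.maxT(smallgd, classlabel, test = "t", side = "abs")

    quit(save = "no")

Parallel R (SPRINT):

    library("sprint")
    data(golub, package = "multtest")    # golub ships with multtest
    smallgd <- golub[1:100, ]
    classlabel <- golub.cl

    # same call, with mt.maxT() replaced by its parallel counterpart
    resT <- pmaxT(smallgd, classlabel, test = "t", side = "abs")

    pterminate()    # shut down SPRINT before leaving R
    quit(save = "no")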
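
For readers unfamiliar with what a max-T permutation test computes, here is a base-R sketch of a single-step version on made-up data. It is a simplification written for illustration: `mt.maxT()` implements a step-down procedure, and nothing below is taken from the multtest or SPRINT source code. The data dimensions, the shifted genes and the permutation count are assumptions.

```r
set.seed(123)

# Hypothetical data: 50 genes x 20 samples, two classes of 10;
# the first 5 genes are shifted upwards in class 1.
n_genes <- 50
labels  <- rep(c(0, 1), each = 10)
expr    <- matrix(rnorm(n_genes * 20), nrow = n_genes)
expr[1:5, labels == 1] <- expr[1:5, labels == 1] + 3

# Absolute Welch t-statistic for every gene (row).
abs_t <- function(x, cl) {
  a  <- x[, cl == 0, drop = FALSE]
  b  <- x[, cl == 1, drop = FALSE]
  se <- sqrt(apply(a, 1, var) / ncol(a) + apply(b, 1, var) / ncol(b))
  abs(rowMeans(b) - rowMeans(a)) / se
}

observed <- abs_t(expr, labels)

# Each permutation shuffles the class labels and records the largest
# statistic over all genes. Permutations are independent of one another,
# which is why they can be divided among many processors.
n_perm   <- 2000
max_null <- replicate(n_perm, max(abs_t(expr, sample(labels))))

# Single-step max-T adjusted p-value: how often the permutation maximum
# reaches the observed statistic of each gene.
adj_p <- sapply(observed, function(t_obs) mean(max_null >= t_obs))
print(head(round(adj_p, 3), 10))
```

The cost is dominated by the permutation loop, so run time scales with the permutation count and with the size of the expression matrix.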

Page 36

Permutation Benchmark

Input Array Data Size | Permutation Count | Estimated maxt, 1 CPU | pmaxT on 256 CPUs (s)
36,612 x 76           | 500,000           | 6 hrs                 | 73.18
36,612 x 76           | 1,000,000         | 12 hrs                | 46.64
73,224 x 76           | 500,000           | 10 hrs                | 148.46
100,000 x 320         | 1,000,000         | 20 hrs                | 294.61

Page 37

Correlation Benchmark

Input Array Data Size | Input Size | Output Array Data Size | pcor() on 256 CPUs (s)
11,000 x 320          | 27 MB      | 923 MB                 | 4.76
22,000 x 320          | 54 MB      | 3.6 GB                 | 13.87
35,000 x 320          | 85 MB      | 9.1 GB                 | 36.64
45,000 x 320          | 110 MB     | 15 GB                  | 42.18
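
The input and output sizes in this benchmark are consistent with dense matrices of 8-byte doubles: a genes x samples input and a genes x genes correlation output. The R sketch below reproduces them. Reading the slide's MB and GB as binary units (MiB, GiB) is an inference from the arithmetic, not something stated in the talk.

```r
# Size in binary megabytes (MiB) of a dense matrix of doubles (8 bytes each).
matrix_mib <- function(n_rows, n_cols) n_rows * n_cols * 8 / 2^20

genes   <- c(11000, 22000, 35000, 45000)
samples <- 320

sizes <- data.frame(
  genes      = genes,
  input_MiB  = round(matrix_mib(genes, samples)),          # genes x samples
  output_GiB = round(matrix_mib(genes, genes) / 1024, 2)   # genes x genes
)
print(sizes)
```

The output grows with the square of the number of genes while the input grows only linearly, so the result, not the data, is what exhausts memory on a single workstation.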

Page 38

Clustering Benchmarks

Page 39

Future

• Cloud (confidentiality issues)

• GPU (limitation is data size)

Page 40

New therapeutic and diagnostic opportunities

Viral Interaction Networks

Host Interaction Networks

Virus Antiviral Host Systemic Therapeutic

Bed-bench-models-almost back to bed

Page 41

THANK YOU & acknowledgements to our sponsors

Page 42

Mathieu Blanc

Steven Watterson

Mizanur Khondoker

Paul Dickinson

Thorsten Forster

Muriel Mewissen

Terry Sloan

Jon Hill

Michal Piotrowski

Arthur Trew

Acknowledgement

EPCC

