
Learning Causal Models of Multivariate Systems and the Value of it for the Performance Modeling of Computer Programs

Jan Lemeire, December 19th 2007

Supervisor: Prof. dr. ir. Erik Dirkx

2Causal Inference & Performance Analysis

Pag.Jan Lemeire / 49

Overview

Learning causal models for the performance analysis of programs executed on various computer systems.

Intermezzo I: Causal inference.

Practical deployment of the causal learning algorithms.

Philosophical and theoretical study of causal inference.

Intermezzo II: Kolmogorov Minimal Sufficient Statistics.

The importance of qualitative properties.


What is Parallel Processing?

[Figure: a parallel system of CPUs with memory (M); the computational work is shown over time.]

Ideally: Speedup = number of processors

Parallel Overhead

Speedup = 2.55

Overhead = time the processors are not spending on useful work = lost processor cycles

Overhead Analysis

Impact of overhead on speedup:

$$\text{Speedup} = \frac{\text{number of processors}}{1 + \sum \text{overhead ratios}}$$

$$\text{Overhead ratio} = \frac{\text{overhead time}}{\text{runtime}}$$
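As a rough illustration of how overhead limits speedup, the following Python sketch evaluates the relation Speedup = number of processors / (1 + sum of overhead ratios). It assumes that each overhead time is the total processor time lost to one overhead source and that the ratio is taken relative to the sequential runtime; the function name `speedup` and the timings are hypothetical and not taken from the slides.

```python
def speedup(num_processors, overhead_times, sequential_runtime):
    """Speedup = p / (1 + sum of overhead ratios).

    Assumption: each overhead time is the total processor time lost to one
    overhead source, and its ratio is taken relative to the sequential runtime.
    """
    overhead_ratios = [t / sequential_runtime for t in overhead_times]
    return num_processors / (1 + sum(overhead_ratios))


# Hypothetical measurements: 3 processors, 100 s of useful work,
# 10 s lost to communication and 5 s lost to idling.
ideal = speedup(3, [], 100.0)
actual = speedup(3, [10.0, 5.0], 100.0)
print(f"ideal speedup:  {ideal:.2f}")
print(f"actual speedup: {actual:.2f}")
```

Without overhead the speedup equals the number of processors; every overhead ratio added to the denominator pulls it below that ideal.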

Experimental Parallel Performance Analysis: Data Acquisition

[Figure: data acquisition with EPPA. Components shown: Parallel Program, EPPA instrumentation library, Executable, EPPA, EPPA Database.]

EPDA: Multivariate Analysis

[Figure: EPDA workflow. Labels shown: Database, user-defined variables, specify context, derivatives of variables, visualization, outlier identification, modeling, causal inference, Causal Model, curve fitting, Analytical Model, CPT compression, Augmented Model.]

Intermezzo I: Causal Inference

Experiments on the system under study yield data, from which a causal model is learned.

| Data         | A | B  | C    | D     | E     |
| experiment 1 | 2 | 12 | 0.42 | TRUE  | blue  |
| experiment 2 | 1 | 73 | 1.93 | FALSE | green |
| experiment 3 | 4 | 8  | 0.03 | TRUE  | red   |
| experiment 4 | 2 | 27 | 2.84 | TRUE  | black |

[Figure: causal model over the variables A, B, C, D, and E.]

Causal Inference for Performance Analysis

Utility based on the following properties:
1. Dependency analysis: how variables relate.
2. Markov property.
3. A causal model corresponds to a decomposition.

[Figure: causal model relating PROGRAM variables to PERFORMANCE. Variables shown: #op, #instrop, fclock, array size, element type, element size, memory, cache misses, Cinstr, Cmem, Tcomp. The position of element size in the model is marked with "??".]

Execution of program gives cache misses

[Figure: a PROGRAM operating on a data structure, which is characterized by its datatype (integer, float, double, …) and its data size in bytes.]

Markov Property

datatype → data size → cache misses

datatype and cache misses: correlated.

With information about the data size: datatype and cache misses are independent.

Provides explanations.

Differentiates direct from indirect relations.
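The Markov property for the chain datatype → data size → cache misses can be checked on a small, fully specified distribution. The sketch below is only an illustration: the byte sizes and the probabilities of many cache misses are invented, and `marginal` and `cond_mutual_info` are hypothetical helper names. It computes the (conditional) mutual information exactly instead of estimating it from measurements.

```python
from collections import defaultdict
from math import log2


def marginal(joint, idx):
    """Sum a joint {outcome tuple: probability} down to the positions in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return out


def cond_mutual_info(joint, x, y, z=()):
    """I(X;Y|Z) in bits; x and y are positions, z is a tuple of positions."""
    z = tuple(z)
    pxyz, pz = marginal(joint, (x, y) + z), marginal(joint, z)
    pxz, pyz = marginal(joint, (x,) + z), marginal(joint, (y,) + z)
    total = 0.0
    for (xv, yv, *zv), p in pxyz.items():
        if p > 0:
            zv = tuple(zv)
            total += p * log2(p * pz[zv] / (pxz[(xv, *zv)] * pyz[(yv, *zv)]))
    return total


# Hypothetical chain: datatype -> data size -> cache misses.
SIZE_OF = {"integer": 4, "float": 4, "double": 8}  # bytes, assumed
P_MANY_MISSES = {4: 0.2, 8: 0.7}                   # assumed probabilities

joint = {}
for datatype, size in SIZE_OF.items():
    p_many = P_MANY_MISSES[size]
    for misses, p in (("many", p_many), ("few", 1 - p_many)):
        joint[(datatype, size, misses)] = p / len(SIZE_OF)

DATATYPE, SIZE, MISSES = 0, 1, 2
unconditional = cond_mutual_info(joint, DATATYPE, MISSES)
given_size = cond_mutual_info(joint, DATATYPE, MISSES, (SIZE,))
print(f"I(datatype; misses)        = {unconditional:.3f} bits")
print(f"I(datatype; misses | size) = {given_size:.3f} bits")
```

The first value is positive (the variables are correlated), while the second is zero up to rounding: once the data size is known, the datatype carries no further information about the cache misses.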

Can we Observe Causal Relations?

cannon angle ~ distance: OK, but: cannon angle → distance or distance → cannon angle ???

What is Causality?

A causal relation denotes a mechanism: a variable is 'produced' by its causes. However, causal relations are not directly observable.

"Causality is a relic of a bygone age" (Bertrand Russell)

"Mmmh" (Judea Pearl)

But: we want to learn something about the underlying system (the goal of statistics).

gunpowder ~ distance

V-structure Property

cannon angle → distance ← gunpowder

The angle is independent of the gunpowder, but the two are dependent when the distance is known.
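The v-structure property can be reproduced on a toy version of the cannon example. In this sketch the angle and the amount of gunpowder are independent fair coin flips and the distance is simply their sum; these numbers, and the helper names `marginal` and `cond_mutual_info`, are assumptions made for illustration only.

```python
from collections import defaultdict
from math import log2


def marginal(joint, idx):
    """Sum a joint {outcome tuple: probability} down to the positions in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return out


def cond_mutual_info(joint, x, y, z=()):
    """I(X;Y|Z) in bits; x and y are positions, z is a tuple of positions."""
    z = tuple(z)
    pxyz, pz = marginal(joint, (x, y) + z), marginal(joint, z)
    pxz, pyz = marginal(joint, (x,) + z), marginal(joint, (y,) + z)
    total = 0.0
    for (xv, yv, *zv), p in pxyz.items():
        if p > 0:
            zv = tuple(zv)
            total += p * log2(p * pz[zv] / (pxz[(xv, *zv)] * pyz[(yv, *zv)]))
    return total


# Toy v-structure: angle -> distance <- gunpowder (all values assumed).
joint = {}
for angle in (0, 1):          # 0 = low, 1 = high
    for gunpowder in (0, 1):  # 0 = little, 1 = much
        distance = angle + gunpowder
        joint[(angle, gunpowder, distance)] = 0.25

ANGLE, GUNPOWDER, DISTANCE = 0, 1, 2
marginal_dep = cond_mutual_info(joint, ANGLE, GUNPOWDER)
conditional_dep = cond_mutual_info(joint, ANGLE, GUNPOWDER, (DISTANCE,))
print(f"I(angle; gunpowder)            = {marginal_dep:.2f} bits")
print(f"I(angle; gunpowder | distance) = {conditional_dep:.2f} bits")
```

The unconditional mutual information is zero, while conditioning on the common effect makes it positive (0.5 bits in this toy setting): knowing the distance and the angle tells us something about the gunpowder.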

Conditional Independencies Make Causal Inference Possible

From a causal structure follow conditional independencies, irrespective of the mechanisms.
– Markov
– V-structure

[Figure: causal graph over A, B, C, D, and E, in which C, D, and E are produced by the mechanisms mechC, mechD, and mechE.]

Graph is a Description of Independencies

Graphical criterion: d-separation
– Intuitive

Faithfulness property: independencies in graph ⇔ independencies in reality

[Figure: causal graph over A, B, C, D, and E.]

Causal Structure Learning

In two steps:
1. Undirected graph
2. Orientation

[Figure: panels (a) to (f) show the learning steps on an example graph over A, B, C, and D, annotated with the independencies used at each step.]
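The two steps can be sketched in a few lines of Python. This is a simplified illustration in the spirit of constraint-based learning, not the implementation behind the slides: the independence test is replaced by a hand-written oracle for the cannon example, only v-structures are oriented, and the names `learn_structure` and `oracle` are hypothetical.

```python
from itertools import combinations


def learn_structure(variables, independent):
    """Two-step sketch: (1) undirected graph, (2) orientation of v-structures.

    independent(x, y, cond) is an independence oracle supplied by the caller.
    """
    adj = {v: set(variables) - {v} for v in variables}
    sepset = {}
    size = 0
    # Step 1: start from the complete graph and remove an edge as soon as
    # some conditioning set of the current size separates its end points.
    while any(len(adj[x]) - 1 >= size for x in variables):
        for x in variables:
            for y in sorted(adj[x]):
                for cond in combinations(sorted(adj[x] - {y}), size):
                    if independent(x, y, frozenset(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        size += 1
    # Step 2: orient x -> z <- y when x and y are not adjacent and z was
    # not needed to separate them.
    arrows = set()
    for z in variables:
        for x, y in combinations(sorted(adj[z]), 2):
            if y not in adj[x] and z not in sepset.get(frozenset((x, y)), set()):
                arrows.add((x, z))
                arrows.add((y, z))
    return adj, arrows


def oracle(x, y, cond):
    """Assumed ground truth: angle and gunpowder are marginally independent."""
    return {x, y} == {"angle", "gunpowder"} and not cond


adj, arrows = learn_structure(["angle", "gunpowder", "distance"], oracle)
print(sorted(arrows))
```

With real data the oracle would be replaced by a statistical independence test, and further orientation rules would be applied after the v-structures.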

Result

Partially directed acyclic graph

"We know what parts are unknown."

Faithfulness assumption: all independencies follow from the causal structure

Experimental Results

(1) Automatic learning of accurate performance models
(2) Model validation
(3) Identification of unexpected dependencies
(4) Explanations for outliers

Contribution 1


Practical Causal Inference

The following limitations had to be overcome:
– Non-linear relations: form-free independence test
– Mixture of continuous, discrete and categorical data: general independence test
– Deterministic relations: augmented causal model and extended learning algorithms

Form-Free and General Dependency Test

Mutual information

Kernel density estimation

Example:

Pearson: Rxy = 0.083 => X and Y linearly independent

I(X;Y) = 0.90 bits => dependent

[Figure: scatter plot of Y against X and the estimated density P(X, Y).]
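The difference between a linear and a form-free dependency measure can be shown on a tiny example. The sketch below does not use kernel density estimation; it swaps in a plug-in estimate of mutual information on discrete samples, and the data (Y = X squared) and function names are assumptions chosen for illustration.

```python
from collections import Counter
from math import log2, sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


# Assumed toy data: Y is a deterministic but non-linear function of X.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]

r = pearson(xs, ys)
mi = mutual_information(xs, ys)
print(f"Pearson R = {r:.3f}")
print(f"I(X;Y)    = {mi:.3f} bits")
```

The correlation coefficient is zero because the relation is symmetric around X = 0, whereas the mutual information is clearly positive, so only the form-free test detects the dependency.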

Deterministic Relations

datatype → data size → cache misses

Data size and data type are information equivalent with respect to cache misses.

During learning, connect the least complex relation.

Complexity Criterion

Correct models are learned under the Complexity Increase Assumption.

Contribution 2a

X → Y → Z (through mech1 and mech2)

Complexity( X – Z ) ≥ Complexity( X – Y )

Complexity( X – Z ) ≥ Complexity( Y – Z )

Reestablishment of Faithfulness

Information equivalences

Consequences are considered

Independence and simplicity

D-separation extension

Faithful model: represents all independencies

Information is added to the model: basic information equivalences (X and Y eq. for A)

[Figure: example graphs over X, Y, Z, and A illustrating information equivalences.]

Contribution 2b

Extension of PC Learning Algorithm

– Detection of information equivalences
– Among information equivalent relations, the simplest one is chosen
– Orientation rules remain the same

Correct models are learned from data containing deterministic relations.

Contribution 2c


Inductive Inference

Occam's Razor: "Among equivalent models choose the simplest one." (William of Ockham)

BUT: Objective measure of complexity?

[Figure: example formulas of differing complexity, among them E = m·c².]

Kolmogorov Complexity

Andrey Kolmogorov

PROGRAM: REPEAT 11 TIMES PRINT "001"

Run on a Universal Turing Machine, the program outputs: 001001001001001001001001001001001

Kolmogorov Complexity of a binary string: the length of the shortest program that computes the string and halts.

Shortest Programs

PROGRAM: REPEAT 11 TIMES PRINT "001"

Regularity of repetition allows compression.

PROGRAM: PRINT "011000110101101010111001001101000"

Random information = incompressible.
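Kolmogorov complexity itself is not computable, but a general-purpose compressor gives a computable upper bound that shows the same contrast. The sketch below uses Python's `zlib` as such a stand-in; the strings are longer than on the slide (an assumption made because the compressor's fixed overhead dominates on 33 characters), and the irregular string comes from a seeded pseudo-random generator.

```python
import random
import zlib


def compressed_size(text):
    """Size in bytes of the zlib-compressed text (an upper-bound proxy only)."""
    return len(zlib.compress(text.encode("ascii"), 9))


regular = "001" * 1000   # the repetition pattern from the slide, repeated more often
rng = random.Random(0)   # fixed seed so the run is repeatable
irregular = "".join(rng.choice("01") for _ in range(len(regular)))

print(f"regular:   {len(regular)} chars -> {compressed_size(regular)} bytes")
print(f"irregular: {len(irregular)} chars -> {compressed_size(irregular)} bytes")
```

The repetitive string shrinks to a few dozen bytes, while the irregular one cannot go much below one bit per character.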

Randomness versus Regularity

001001001001001001001001001001001
Regularity: repetition 11 times, 001

011000110101101010111001001101000
Only random information (incompressible)

Kolmogorov Minimal Sufficient Statistics (KMSS): formal separation of
– Meaningful information: regularities
– Accidental information: randomness

Learning = finding regularities = maximal compression

Example: the structure of a diamond (regularities) versus its exact size (random).

Meaningful Information of Probability Distributions

Joint Probability Distribution:

P(A, B, C, D, E) = P(A) P(B) P(C|A, B) P(D|B) P(E|C, D)

The right-hand side consists of a GRAPH over A, B, C, D, and E together with the CPDs.

– Meaningful information (Theorem 1)
– Kolmogorov Minimal Sufficient Statistic if graph and CPDs are incompressible (Theorem 2)
– A graph with random CPDs is faithful (Theorem 4)

Contribution 3a
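One way to see why the factorization P(A)P(B)P(C|A, B)P(D|B)P(E|C, D) is a compressed description is to count the free parameters it needs. The sketch below does this under the assumption that all five variables are binary; the function name `free_parameters` is hypothetical.

```python
from math import prod

# Parents of each variable, read from P(A)P(B)P(C|A,B)P(D|B)P(E|C,D).
PARENTS = {"A": [], "B": [], "C": ["A", "B"], "D": ["B"], "E": ["C", "D"]}


def free_parameters(parents, states):
    """Number of independent probabilities needed by all CPDs together."""
    return sum(
        (states[v] - 1) * prod(states[p] for p in parents[v]) for v in parents
    )


states = {v: 2 for v in PARENTS}  # assumption: every variable is binary
factorized = free_parameters(PARENTS, states)
full_joint = prod(states.values()) - 1

print(f"factorized model: {factorized} parameters")
print(f"full joint table: {full_joint} parameters")
```

For binary variables the graph-based description needs 12 numbers instead of 31; the saving comes entirely from the independencies encoded in the graph.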

Causal Aspect of Causal Models = Decomposition

Canonical decomposition: quasi-unique and minimal decomposition into atomic and independent components (the CPDs).

Corresponds to reality (mechanisms).

[Figure: causal graph over A, B, C, D, and E with the mechanisms mechC, mechD, and mechE.]

Causal Component Relies on Reductionism

When the DAG of a Bayesian network is a complete graph: no meaningful information => holism.

The world can be studied in parts.

Or, even more: the world is made up of indivisible parts.

[Figure: causal graph over A, B, C, D, and E.]

Validity of Causal Inference

How OK is the learned causal model?

Do CPD components correspond to physical mechanisms?

Minimal model? Faithful? Other regularities?

[Figure: counterexamples from literature (numbered 1 to 8) classified under the labels conform reality, wrong decomposition, unfaithful, and other regularities; one group is marked "Causal model ≠ minimal model".]

Contribution 3b

Well-known Example of Unfaithfulness

[Figure: graph over A, B, C, and D in which A influences D along two paths, labeled 1 and 2.]

'Normally': A and D correlate.

A and D become independent if the influences along paths 1 and 2 cancel each other out.

Mechanisms are related: there is a regularity among them.
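The cancellation can be made concrete with a linear model, which is an assumption added here for illustration: B = a·A + noise, C = c·A + noise, D = b·B + d·C + noise, with independent noise terms. Under that assumption Cov(A, D) = (a·b + c·d)·Var(A), so the covariance vanishes exactly when the two path products cancel. The function name and coefficients below are hypothetical.

```python
from fractions import Fraction


def cov_a_d(a_to_b, b_to_d, a_to_c, c_to_d, var_a=Fraction(1)):
    """Cov(A, D) in the assumed linear model with independent noise terms."""
    path_1 = a_to_b * b_to_d  # influence along A -> B -> D
    path_2 = a_to_c * c_to_d  # influence along A -> C -> D
    return (path_1 + path_2) * var_a


# Generic coefficients: both paths contribute, A and D correlate.
generic = cov_a_d(Fraction(1, 2), Fraction(2), Fraction(1, 3), Fraction(3))

# Tuned coefficients: path 2 exactly cancels path 1.
cancelling = cov_a_d(Fraction(1, 2), Fraction(2), Fraction(1, 3), Fraction(-3))

print(f"generic:    Cov(A, D) = {generic}")
print(f"cancelling: Cov(A, D) = {cancelling}")
```

The cancelling case requires the coefficients of different mechanisms to be tuned to each other, which is exactly the kind of regularity among mechanisms that the slide points to.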


Regularities are Qualitative Properties

Different from quantitative information.

Allow for qualitative reasoning.

Qualitative properties determine behavior.

Communication Schemes on Network Topologies

[Figure: a communication scheme among nodes 1 to 8 and a network topology connecting nodes 1 to 8.]

Communication time?

Generic Performance Model

Good predictions for combinations of random schemes and random topologies

Contribution 4a

[Figure: a model takes a communication scheme (Scheme1, Scheme2, Scheme3) and a network topology (Topology1, Topology2, Topology3) and predicts the communication time Tcomm.]

Combinations of Patterns

Communication Schemes: broadcast, shift

Network Topologies: star, ring

Performance depends on match!

Contribution 4b
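Why the match between scheme and topology matters can be illustrated with a crude proxy. The sketch below counts the total number of links that all messages have to traverse; this hop count is an assumption standing in for communication time (it ignores contention and bandwidth) and is not the performance model of the slides. All function names are hypothetical.

```python
from collections import deque


def star(n):
    """Node 0 is the hub; every other node links only to the hub."""
    return {0: set(range(1, n)), **{i: {0} for i in range(1, n)}}


def ring(n):
    """Each node links to its two neighbours on a cycle."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def hops(topology, src, dst):
    """Shortest path length in links, found by breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in topology[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("destination not reachable")


def broadcast(n):
    """Node 0 sends one message to every other node."""
    return [(0, i) for i in range(1, n)]


def shift(n):
    """Every node sends one message to its successor."""
    return [(i, (i + 1) % n) for i in range(n)]


def total_hops(scheme, topology):
    return sum(hops(topology, src, dst) for src, dst in scheme)


n = 8
for scheme_name, scheme in (("broadcast", broadcast(n)), ("shift", shift(n))):
    for topo_name, topo in (("star", star(n)), ("ring", ring(n))):
        print(f"{scheme_name:9s} on {topo_name}: {total_hops(scheme, topo)} hops")
```

In this toy setting the broadcast is cheapest on the star and the shift is cheapest on the ring, so neither topology is better in general; the cost depends on the combination.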

Qualitative Properties

Faithfulness: "graph should describe all independencies"

KMSS: "model should describe all regularities"

Qualitative information: explicitly describes regularities.

Quantitative information: contains no more regularities.

Explicitly Mention Qualitative Properties!

[Figure: "Stone" example with the coordinate pairs (12,61), (9,41), (19,24), (2,12), and (5,21); some pairs appear several times and one position is marked "??".]

Conclusions

Contribution to performance analysis.
– Automatic causal analysis.
– Useful add-on in combination with other techniques.

The value of causal inference is underlined.

The importance of regularities or qualitative properties.

Future Work

– Application of the learned performance models for optimization.
– Is the failure of generic performance models only due to regularities?
– Augment models with qualitative properties.
– But: how to define, recognize and reason with regularities?