Date post: 16-Jul-2020
Comparing alternatives

Prof Anja Feldmann, based on slides by David J. Lilja

Copyright 2004 David J. Lilja
Comparing alternatives

ANOVA: Analysis of Variance

Partitions total variation in a set of measurements into:
• Variation due to real differences in alternatives
• Variation due to errors

Comparing two alternatives

1. Before-and-after
   Did a change to the system have a statistically significant impact on performance?
2. Non-corresponding measurements
   Is there a statistically significant difference between two different systems?

Before-and-after comparison

Assumptions
• Before-and-after measurements are not independent
• Variances in two sets of measurements may not be equal

→ Measurements are related

Use mean of differences

Before-and-after comparison

Measurement (i)   Before (b_i)   After (a_i)   Difference (d_i = b_i – a_i)
1                 85             86            -1
2                 83             88            -5
3                 94             90             4
4                 90             95            -5
5                 88             91            -3
6                 87             83             4

Before-and-after comparison

From mean of differences, appears that change reduced performance.
However, standard deviation is large.

Mean of differences: $\bar{d} = -1$
Standard deviation: $s_d = 4.15$

95% Confidence interval for mean of differences

$c_{1,2} = [-5.36, 3.36]$

Interval includes 0

→ With 95% confidence, there is no statistically significant difference between the two systems.
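
The before-and-after calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the measurements are the six pairs from the table, and the critical value `T_CRIT` (the 0.975 quantile of the t distribution with 5 degrees of freedom) is hard-coded from a t table rather than computed. The result lands at roughly [-5.35, 3.35]; the slide's [-5.36, 3.36] comes from rounding the standard deviation to 4.15 first.

```python
import math
import statistics

# Before/after measurements from the table above.
before = [85, 83, 94, 90, 88, 87]
after = [86, 88, 90, 95, 91, 83]

# Paired differences d_i = b_i - a_i.
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
d_mean = statistics.mean(diffs)   # mean of differences
s_d = statistics.stdev(diffs)     # sample standard deviation (n - 1 divisor)

# Assumption: t_{0.975; 5} taken from a t table (5 = n - 1 degrees of freedom).
T_CRIT = 2.571

half_width = T_CRIT * s_d / math.sqrt(n)
ci = (d_mean - half_width, d_mean + half_width)

print(f"mean = {d_mean:.2f}, s_d = {s_d:.2f}")
print(f"95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
print("significant" if ci[0] > 0 or ci[1] < 0 else "not significant (interval includes 0)")
```

Working on the differences rather than on the two columns separately is what accounts for the measurements being related.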

Noncorresponding measurements

• No direct correspondence between pairs of measurements
• Unpaired observations
• n1 measurements of system 1
• n2 measurements of system 2

Confidence interval for difference of means

1. Compute means
2. Compute difference of means
3. Compute standard deviation of difference of means
4. Find confidence interval for this difference
5. No statistically significant difference between systems if interval includes 0
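
The five steps can be sketched as follows. The slides do not give a formula for step 3, so this sketch assumes the usual unequal-variance form, s = sqrt(s1²/n1 + s2²/n2), with the Welch-Satterthwaite approximation for the degrees of freedom. The function name `diff_of_means_ci`, the sample data, and the hard-coded t value are all hypothetical and only serve the illustration.

```python
import math
import statistics


def diff_of_means_ci(x1, x2, t_crit):
    """Confidence interval for mean(x1) - mean(x2), unpaired measurements.

    t_crit must be looked up for the desired confidence level and the
    degrees of freedom returned alongside the interval.
    """
    n1, n2 = len(x1), len(x2)
    # Steps 1 and 2: means and their difference.
    diff = statistics.mean(x1) - statistics.mean(x2)
    # Step 3: standard deviation of the difference of means (assumed form).
    v1 = statistics.variance(x1) / n1
    v2 = statistics.variance(x2) / n2
    s_diff = math.sqrt(v1 + v2)
    # Approximate degrees of freedom (Welch-Satterthwaite formula).
    dof = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Step 4: the interval itself.
    half = t_crit * s_diff
    return diff - half, diff + half, dof


# Hypothetical measurements, not from the slides.
sys1 = [10, 12, 11, 13, 12]
sys2 = [14, 15, 13, 16, 15, 14]

# First pass only to obtain the degrees of freedom (about 8 here) ...
_, _, dof = diff_of_means_ci(sys1, sys2, t_crit=0.0)
# ... then t_{0.975; 8} = 2.306 from a t table gives the 95% interval.
low, high, _ = diff_of_means_ci(sys1, sys2, t_crit=2.306)

# Step 5: significant only if the interval excludes 0.
print(f"dof ~ {dof:.1f}, 95% CI = ({low:.2f}, {high:.2f})")
print("significant" if low > 0 or high < 0 else "not significant")
```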

OS example

Initial operating system
• n1 = 1,300,203 interrupts (3.5 hours)
• m1 = 142,892 interrupts occurred in OS code
• p1 = 0.1099, or 11% of time executing in OS

Upgrade OS
• n2 = 999,382
• m2 = 84,876
• p2 = 0.0849, or 8.5% of time executing in OS

Statistically significant improvement?

OS example (2.)

p = p1 – p2 = 0.0250
s_p = 0.0003911
90% confidence interval: (0.0242, 0.0257)

Statistically significant difference?
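
The OS example can be checked with a short script. It assumes the standard deviation of the difference is that of two independent proportions under the normal approximation, which is not spelled out on the slide but does reproduce s_p = 0.0003911. The interval endpoints depend on the critical value and on rounding, so they may differ from the slide's in the fourth decimal place. Either way the interval lies well above 0.

```python
import math
from statistics import NormalDist

# Counts from the OS example.
n1, m1 = 1_300_203, 142_892   # initial OS: total interrupts, interrupts in OS code
n2, m2 = 999_382, 84_876      # upgraded OS

p1, p2 = m1 / n1, m2 / n2
p = p1 - p2

# Standard deviation of the difference of two independent proportions
# (normal approximation to the binomial; an assumption that is reasonable
# for samples this large).
s_p = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Two-sided 90% interval: use the 0.95 quantile of the standard normal.
z = NormalDist().inv_cdf(0.95)
ci = (p - z * s_p, p + z * s_p)

print(f"p = {p:.4f}, s_p = {s_p:.7f}")
print(f"90% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print("interval excludes 0" if ci[0] > 0 or ci[1] < 0 else "interval includes 0")
```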

Important points

Use confidence intervals to determine if there are statistically significant differences

Before-and-after comparisons
• Find interval for mean of differences

Noncorresponding measurements
• Find interval for difference of means

If interval includes zero
→ No statistically significant difference

Comparing more than two alternatives

Naïve approach
• Compare confidence intervals

One-factor Analysis of Variance (ANOVA)

Very general technique
• Look at total variation in a set of measurements
• Divide into meaningful components

Also called
• One-way classification
• One-factor experimental design

One-factor Analysis of Variance (ANOVA)

Separates total variation observed in a set of measurements into:

1. Variation within one system
   Due to random measurement errors
2. Variation between systems
   Due to real differences + random error

Is variation (2) statistically greater than variation (1)?

ANOVA

• Make n measurements of k alternatives
• $y_{ij}$ = ith measurement on jth alternative
• Assumes errors are:
  • Independent
  • Gaussian (normal)

Measurements for all alternatives

                Alternatives
Measurements    1      2      …    j      …    k
1               y11    y12    …    y1j    …    y1k
2               y21    y22    …    y2j    …    y2k
…               …      …      …    …      …    …
i               yi1    yi2    …    yij    …    yik
…               …      …      …    …      …    …
n               yn1    yn2    …    ynj    …    ynk
Col mean        y.1    y.2    …    y.j    …    y.k
Effect          α1     α2     …    αj     …    αk

Column means: Average performance of one alternative

Error: Deviation from column mean

Overall mean: Average performance of all alternatives

Effect: Deviation from overall mean

Effects and errors

Effect is distance from overall mean
• Horizontally across alternatives

Error is distance from column mean
• Vertically within one alternative
• Error across alternatives, too

Individual measurements are then:

$y_{ij} = \bar{y}_{..} + \alpha_j + e_{ij}$

Sum of squares of differences

• SST = differences between each measurement and overall mean
• SSA = variation due to effects of alternatives
• SSE = variation due to errors in measurements

$SST = SSA + SSE$

Sum of squares of differences

$SSA = n \sum_{j=1}^{k} (\bar{y}_{.j} - \bar{y}_{..})^2$

$SSE = \sum_{j=1}^{k} \sum_{i=1}^{n} (y_{ij} - \bar{y}_{.j})^2$

$SST = \sum_{j=1}^{k} \sum_{i=1}^{n} (y_{ij} - \bar{y}_{..})^2$

ANOVA – Fundamental idea

Separates variation in measured values into:
1. Variation due to effects of alternatives
   • SSA – variation across columns
2. Variation due to errors
   • SSE – variation within a single column

If differences among alternatives are due to real differences,
• SSA should be statistically > SSE

Comparing SSE and SSA

Simple approach
• SSA / SST = fraction of total variation explained by differences among alternatives
• SSE / SST = fraction of total variation due to experimental error

But is it statistically significant?

Comparing variances

• Use F-test (statistics) to compare ratio of variances
• If $F_{computed} > F_{table}$
  → We have (1 – α) * 100% confidence that variation due to actual differences in alternatives, SSA, is statistically greater than variation due to errors, SSE.
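
The slide does not spell out how the computed F value is formed. In the standard one-factor setting, with k alternatives and n measurements per alternative, each sum of squares is divided by its degrees of freedom before the ratio is taken. The symbols $s_a^2$ and $s_e^2$ below are notation assumed for this sketch:

$$s_a^2 = \frac{SSA}{k - 1}, \qquad s_e^2 = \frac{SSE}{k(n - 1)}, \qquad F_{computed} = \frac{s_a^2}{s_e^2}$$

The tabulated value to compare against is then $F_{[1-\alpha;\, k-1,\, k(n-1)]}$.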

ANOVA example

                Alternatives
Measurements    1          2          3          Overall mean
1               0.0972     0.1382     0.7966
2               0.0971     0.1432     0.5300
3               0.0969     0.1382     0.5152
4               0.1954     0.1730     0.6675
5               0.0974     0.1383     0.5298
Column mean     0.1168     0.1462     0.6078     0.2903
Effects         -0.1735    -0.1441    0.3175

Conclusions from example

• SSA/SST = 0.7585/0.8270 = 0.917
  → 91.7% of total variation in measurements is due to differences among alternatives
• SSE/SST = 0.0685/0.8270 = 0.083
  → 8.3% of total variation in measurements is due to noise in measurements
• Computed F statistic > tabulated F statistic
  → 95% confidence that differences among alternatives are statistically significant.
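
The numbers in this example can be recomputed from the 15 measurements in the example table. The sketch below assumes the usual mean-square form of the F statistic (each sum of squares divided by its degrees of freedom). The tabulated value `F_TABLE` is hard-coded from an F table for (2, 12) degrees of freedom at the 95% level rather than computed.

```python
# Measurements from the example table: one list per alternative.
data = [
    [0.0972, 0.0971, 0.0969, 0.1954, 0.0974],  # alternative 1
    [0.1382, 0.1432, 0.1382, 0.1730, 0.1383],  # alternative 2
    [0.7966, 0.5300, 0.5152, 0.6675, 0.5298],  # alternative 3
]

k = len(data)        # number of alternatives
n = len(data[0])     # measurements per alternative

col_means = [sum(col) / n for col in data]
overall_mean = sum(sum(col) for col in data) / (k * n)
effects = [m - overall_mean for m in col_means]

# Sums of squares: between alternatives, within alternatives, total.
ssa = n * sum(a ** 2 for a in effects)
sse = sum((y - m) ** 2 for col, m in zip(data, col_means) for y in col)
sst = sum((y - overall_mean) ** 2 for col in data for y in col)

# Ratio of mean squares (assumed standard form of the F statistic).
f_computed = (ssa / (k - 1)) / (sse / (k * (n - 1)))

# Assumption: F_{0.95; 2, 12} read from an F table.
F_TABLE = 3.89

print(f"SSA = {ssa:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}")
print(f"SSA/SST = {ssa / sst:.3f}, SSE/SST = {sse / sst:.3f}")
print(f"F = {f_computed:.1f}, significant at 95%: {f_computed > F_TABLE}")
```

Computing SST directly, rather than as SSA + SSE, gives a cheap consistency check on the other two sums.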

Contrasts

• ANOVA tells us that there is a statistically significant difference among alternatives
• But it does not tell us where difference is
• Use method of contrasts to compare subsets of alternatives
  • A vs B
  • {A, B} vs {C}
  • Etc.

Contrast = linear combination of effects of alternatives
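
As a small illustration, a contrast can be evaluated as a weighted sum of effects whose weights sum to zero (the usual definition, which the slide does not state). The labels A, B, C are assumed to stand for alternatives 1, 2, 3 of the ANOVA example, and the helper `contrast` is hypothetical. Judging whether a contrast is significant would additionally need a confidence interval for it, which is not shown here.

```python
# Effects of the three alternatives from the ANOVA example
# (labels A, B, C are an assumption made for this sketch).
effects = {"A": -0.1735, "B": -0.1441, "C": 0.3175}


def contrast(effects, weights):
    """Return sum_j w_j * alpha_j; the weights of a contrast must sum to zero."""
    if abs(sum(weights.values())) > 1e-12:
        raise ValueError("contrast weights must sum to zero")
    return sum(weights[name] * effects[name] for name in effects)


# A vs B: C gets weight 0 and drops out.
c_ab = contrast(effects, {"A": 1, "B": -1, "C": 0})
# {A, B} vs {C}: average of A and B against C.
c_ab_c = contrast(effects, {"A": 0.5, "B": 0.5, "C": -1})

print(f"A vs B: {c_ab:.4f}")
print(f"{{A, B}} vs {{C}}: {c_ab_c:.4f}")
```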

Important Points

Use one-factor ANOVA to separate total variation into:
– Variation within one system
  Due to random errors
– Variation between systems
  Due to real differences (+ random error)

Is the variation due to real differences statistically greater than the variation due to errors?

Generalized design of experiments

Goals
• Isolate effects of each input variable.
• Determine effects of interactions.
• Determine magnitude of experimental error
• Obtain maximum information for given effort

Basic idea
• Expand 1-factor ANOVA to m factors

Terminology

Response variable
• Measured output value: e.g., total execution time

Factors
• Input variables that can be changed
• E.g.: cache size, clock rate, bytes transmitted

Levels
• Specific values of factors:
• Continuous (~bytes) or discrete (type of system)

Replication
• Completely re-run experiment with same input levels

Interaction
• Effect of input factor A depends on level of input factor B

Two-factor experiments

Two factors (inputs)
• A, B

Separate total variation in output values into:
• Effect due to A
• Effect due to B
• Effect due to interaction of A and B (AB)
• Experimental error

Example – User response time

• A = degree of multiprogramming
• B = memory size
• AB = interaction of memory size and degree of multiprogramming

        B (Mbytes)
A       32      64      128
1       0.25    0.21    0.15
2       0.52    0.45    0.36
3       0.81    0.66    0.50
4       1.50    1.45    0.70

Two-factor ANOVA

• Factor A – a input levels
• Factor B – b input levels
• n measurements for each input combination
• abn total measurements

Two-factor ANOVA

Each individual measurement is composition of
• Overall mean
• Effects
• Interactions
• Measurement errors
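
Written out by analogy with the one-factor model, this composition would read as follows. The symbols $\alpha_i$, $\beta_j$, $\gamma_{ij}$ and the index ranges are notation assumed for this sketch, with $i = 1 \ldots a$ levels of A, $j = 1 \ldots b$ levels of B, and $k = 1 \ldots n$ replications:

$$y_{ijk} = \bar{y}_{...} + \alpha_i + \beta_j + \gamma_{ij} + e_{ijk}$$

The total variation then splits into one term per component:

$$SST = SSA + SSB + SSAB + SSE$$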

Example

Output = user response time (seconds)

Want to separate effects due to
• A = degree of multiprogramming
• B = memory size
• AB = interaction
• Error

Need replications to separate error

        B (Mbytes)
A       32      64      128
1       0.25    0.21    0.15
2       0.52    0.45    0.36
3       0.81    0.66    0.50
4       1.50    1.45    0.70

Conclusions from example

• 77.6% (SSA/SST) of all variation in response time due to degree of multiprogramming
• 11.8% (SSB/SST) due to memory size
• 9.9% (SSAB/SST) due to interaction
• 0.7% due to measurement error
• 95% confident that all effects and interactions are statistically significant

A problem

Full factorial design with replication
• Measure system response with all possible input combinations
• Replicate each measurement n times to determine effect of measurement error

m factors, v levels, n replications
→ $n v^m$ experiments

m = 5 input factors, v = 4 levels, n = 3
→ $3(4^5) = 3{,}072$ experiments!
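
To make the count concrete, the runs of a full factorial design can be enumerated directly. This is only a sketch: the generator name `full_factorial` and the placeholder level values 0..3 are assumptions, not anything taken from the slides.

```python
from itertools import product


def full_factorial(levels_per_factor, replications):
    """Yield (replication, combination) for every run of a full factorial design."""
    for combo in product(*levels_per_factor):
        for rep in range(replications):
            yield rep, combo


# m = 5 factors with v = 4 levels each; the level values are placeholders.
m, v, n = 5, 4, 3
levels = [range(v)] * m

runs = list(full_factorial(levels, n))
print(len(runs))            # equals n * v**m
print(runs[0], runs[-1])
```

Every additional factor multiplies the number of runs by v, which is why the total grows so quickly.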

Fractional factorial designs: $n 2^m$ experiments

• Special case of generalized m-factor experiments
• Restrict each factor to two possible values
  • High, low
  • On, off
• Find factors that have largest impact
• Full factorial design with only those factors

Still too many experiments with $n 2^m$!

Plackett and Burman designs (1946)
• Multifactorial designs
• Effects of main factors only
  • Logically minimal number of experiments to estimate effects of m input parameters (factors)
  • Ignores interactions
• Requires O(m) experiments
  • Instead of $O(2^m)$ or $O(v^m)$
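
The difference in growth can be illustrated by counting runs for the three kinds of design. The function names are hypothetical. The Plackett-Burman count assumes the commonly used rule that the number of runs is the smallest multiple of 4 greater than m, without replication or foldover; the slides only state that it is O(m).

```python
def full_factorial_runs(m, v, n):
    """n * v**m runs: every combination of v levels for m factors, n replications."""
    return n * v ** m


def two_level_runs(m, n):
    """n * 2**m runs: each factor restricted to two levels."""
    return n * 2 ** m


def plackett_burman_runs(m):
    """Smallest multiple of 4 strictly greater than m (assumed run-count rule)."""
    return 4 * (m // 4 + 1)


for m in (5, 10, 20):
    print(m, full_factorial_runs(m, v=4, n=3), two_level_runs(m, n=3), plackett_burman_runs(m))
```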

