Performance Analysis of Computer Systems - TU Dresden fileHolger Brunst...


Holger Brunst (holger.brunst@tu-dresden.de)

Matthias S. Mueller (matthias.mueller@tu-dresden.de)

Performance Analysis of Computer Systems

Experimental Design

LARS: Experimental Design

Simple Linear Regression Model

n observation pairs: $\{(x_1, y_1), \ldots, (x_n, y_n)\}$

Predictor variable x and predicted response $\hat{y}$: $\hat{y}_i = b_0 + b_1 x_i$

Error: $e_i = y_i - \hat{y}_i$

[Figure: measured y versus estimated $\hat{y}$]

Error of Linear Regression

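Error: $e_i = y_i - (b_0 + b_1 x_i)$

Sum of Squared Errors (SSE): $\mathrm{SSE} = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$

Mean Error: $\bar{e} = \frac{1}{n} \sum_{i=1}^{n} e_i = \frac{1}{n} \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)$

[Figure: measured y versus estimated $\hat{y}$]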

Calculation of Linear Regression Parameters

Best linear model minimizes SSE and has a mean error of zero

$b_0 = \bar{y} - b_1 \bar{x}$

$b_1 = \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}$
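The closed-form parameters above can be sketched in a few lines of Python. The function names and the five sample points are illustrative assumptions, not data from the lecture; the sketch only shows that the fitted line has a mean error of (numerically) zero.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) of the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    b1 = (sum_xy - n * x_mean * y_mean) / (sum_xx - n * x_mean ** 2)
    b0 = y_mean - b1 * x_mean
    return b0, b1


def regression_errors(xs, ys, b0, b1):
    """Return (SSE, mean error) of the line b0 + b1 * x on the observations."""
    errors = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sse = sum(e * e for e in errors)
    mean_error = sum(errors) / len(errors)
    return sse, mean_error


# Hypothetical observation pairs, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

b0, b1 = fit_simple_linear_regression(xs, ys)
sse, mean_error = regression_errors(xs, ys, b0, b1)
print(f"b0={b0:.3f}, b1={b1:.3f}, SSE={sse:.4f}, mean error={mean_error:.2e}")
```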

Experimental Design

Terminology

Simple Designs

Full Factorial Designs

Fractional Factorial Designs

(Pitfalls and common mistakes)

Experimental Design

Obtain the maximum information with the minimum number of experiments

Experimental Design

– Isolate effects of each input variable.

– Determine effects of interactions.

– Determine magnitude of experimental error

– Obtain maximum information for given effort

Basic idea

– Expand 1-factor Analysis of Variance (ANOVA) to m factors

Experimental Design: Example Problem

Task: Design a personal workstation

Factors & Levels:

CPU selection: Processor A, B, C

Memory sizes: 2 GiB, 4 GiB, 16 GiB

Workload: Administrative, Creative/Artistic, Scientific

Operating Systems: Windows, Linux, Mac OS

Terminology: Response Variable, Factors, Levels

Response Variable (Zielgröße)

– The measured outcome of an experiment

– In example: response time for tasks

Factors (Einflussfaktoren)

– Changeable input variables that affect response variable

– In example: CPU type, memory size, workload…

– Also called predictor variables or independent variables

Levels (Stufen)

– Values a factor can assume (inputs)

– Each factor level is an alternative

– Qualitative (e.g. type of processor) or quantitative (e.g. memory sizes)

– Sometimes also referred to as treatment

Terminology: Replication and Design

Replication (Wiederholung)

– Re-run experiment with same input levels

– Determine impact of measurement error

Experiment Design (Versuchsplan)

– Specifies the number of experiments, the factor level combinations, and the number of replications of each experiment

– In workstation example: 3 x 3 x 3 x 3 = 81 experiments

– With 5 replications we have to perform 81 x 5 = 405 observations

Terminology: Interaction

Interaction (Wechselwirkung)

Effect of one input factor depends on level of another input factor

Example:

Cholesterol reduction clinic

Two diets A and B and one exercise regime

Exercise or diet alone are effective (reducing cholesterol levels)

For patients who did not exercise, the two diets worked equally well

Patients who followed diet A and exercised got the benefits of both

However, patients who followed diet B and exercised got the benefits of both plus a bonus, an interaction effect.

Analysis of Variance (ANOVA)

Separates total variation observed in a set of measurements

1. Variation within one system

• Due to random measurement errors

2. Variation between systems

• Due to real differences + random error

Is variation (2) statistically significantly larger than variation (1)?

One-factor experimental design

One-factor ANOVA

Each individual measurement is composition of

– Overall mean

– Effect of alternatives

– Measurement errors

$y_{ij} = \bar{y} + \alpha_i + e_{ij}$

$\bar{y}$ = overall mean

$\alpha_i$ = effect due to A

$e_{ij}$ = measurement error
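The decomposition into overall mean, effects, and errors can be sketched as follows. The measurements are hypothetical, and the F ratio (mean square between alternatives over mean square within alternatives) is the standard one-factor ANOVA statistic rather than something stated on the slide; deciding significance would additionally need an F-distribution table, which is outside this sketch.

```python
def one_factor_anova(measurements):
    """measurements maps each alternative to its list of replicated observations."""
    all_values = [y for ys in measurements.values() for y in ys]
    overall_mean = sum(all_values) / len(all_values)

    effects = {}
    ssa = 0.0  # variation between alternatives
    sse = 0.0  # variation within alternatives (measurement error)
    for name, ys in measurements.items():
        mean_i = sum(ys) / len(ys)
        effects[name] = mean_i - overall_mean  # alpha_i
        ssa += len(ys) * effects[name] ** 2
        sse += sum((y - mean_i) ** 2 for y in ys)  # sum of e_ij^2

    sst = sum((y - overall_mean) ** 2 for y in all_values)
    a = len(measurements)
    n_total = len(all_values)
    f_ratio = (ssa / (a - 1)) / (sse / (n_total - a))
    return overall_mean, effects, ssa, sse, sst, f_ratio


# Hypothetical response times of three alternatives, three replications each.
measurements = {
    "A": [10.0, 12.0, 11.0],
    "B": [14.0, 15.0, 16.0],
    "C": [20.0, 19.0, 21.0],
}
overall_mean, effects, ssa, sse, sst, f_ratio = one_factor_anova(measurements)
print(effects, ssa, sse, sst, f_ratio)
```

A large F ratio indicates that the variation between alternatives dominates the variation caused by measurement error.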

Two-factor ANOVA

Each individual measurement is composition of

– Overall mean

– Effects

– Interactions

– Measurement errors

$y_{ijk} = \bar{y} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}$

$\bar{y}$ = overall mean

$\alpha_i$ = effect due to A

$\beta_j$ = effect due to B

$(\alpha\beta)_{ij}$ = effect due to interaction of A and B

$e_{ijk}$ = measurement error

Experimental Designs

Three most frequently used experimental designs:

Simple designs (einfacher Versuchsplan)

Full factorial designs (vollständiger Versuchsplan)

Fractional factorial designs (Teilfaktorplan)

Simple Designs

Start with a typical configuration and vary one factor at a time

In workstation example: A typical workstation configuration might consist of Processor A, 2 GiB main memory, running administrative tasks under Windows

1. The performance of this configuration is measured

2. Vary the first factor to find the best CPU

3. Change the amount of main memory to 2, 4 and 16 GiB to find its optimal size

4. Proceed identically with other factors in a pre-defined order

Simple Designs

Given k factors with the i-th factor having $n_i$ levels, a simple design requires only n experiments:

$n = 1 + \sum_{i=1}^{k} (n_i - 1)$

This design does not make the best use of the effort spent

Interacting factors are ignored in this design. This may lead to wrong conclusions

In workstation example:

– The CPU performance depends on the size of memory

– Their optimal combination cannot be determined unless all possibilities are tried

Not recommended

Full Factorial Designs

Uses every possible combination at all levels of all factors, which requires n experiments, where

$n = \prod_{i=1}^{k} n_i$

In workstation example: 3 CPUs x 3 memory sizes x 3 workloads x 3 operating systems = 81 experiments

Advantage: Every possible factor combination is examined. This includes their interactions.

Disadvantage:

– Cost of the study regarding time and money

– Too many experiments to be conducted.

– Also consider replication!
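As a rough illustration, the experiment counts for the workstation example can be reproduced by enumerating the level combinations. The dictionary keys and helper names are assumptions made for this sketch; the levels are the ones listed in the example problem.

```python
from itertools import product
from math import prod

# Factors and levels of the workstation example.
factors = {
    "cpu": ["Processor A", "Processor B", "Processor C"],
    "memory_gib": [2, 4, 16],
    "workload": ["Administrative", "Creative/Artistic", "Scientific"],
    "os": ["Windows", "Linux", "Mac OS"],
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def simple_design_size(factors):
    """One base configuration plus (n_i - 1) extra runs per factor."""
    return 1 + sum(len(levels) - 1 for levels in factors.values())


design = full_factorial(factors)
n_full = len(design)                      # product of the n_i
n_simple = simple_design_size(factors)    # 1 + sum of (n_i - 1)
n_observations = n_full * 5               # with 5 replications

print(n_full, n_simple, n_observations)
```

The gap between the two counts is the price paid for being able to see interactions.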

Ways to reduce the number of experiments:

Reduce the number of levels for each factor

– Full factorial designs with k factors and just two levels each require $2^k$ experiments. Very popular and called $2^k$ design

– In workstation example: Start with $2^4 = 16$ experiments.

– After factor reduction one can try more levels if effects can be observed for the initial two levels

Reduce the number of factors

– Secondary factors often not known in the beginning

Use fractional factorial designs

Consider the following (simplified) full factorial $3^4$ experiment design:

n = 3 CPUs x 3 memory levels x 3 workloads x 3 educational levels = 81 experiments

The corresponding $3^{4-2}$ fractional factorial design consists of only nine experiments:

Experiment Number | CPU Level | Memory Level | Workload | Educational Level
1 | Cheapest | 2 GB | Administrative | High school
2 | Cheapest | 4 GB | Scientific | Postgraduate
3 | Cheapest | 16 GB | Creative | College
4 | Best price/perf. | 2 GB | Scientific | College
5 | Best price/perf. | 4 GB | Creative | High school
6 | Best price/perf. | 16 GB | Administrative | Postgraduate
7 | Most expensive | 2 GB | Creative | Postgraduate
8 | Most expensive | 4 GB | Administrative | College
9 | Most expensive | 16 GB | Scientific | High school
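The slides do not spell out how the nine rows were chosen. One property that can be checked directly from the table is pairwise balance: for any two factors, every combination of their levels appears exactly once. The following sketch (the function name is an assumption) tests this on the rows copied from the table.

```python
from itertools import combinations

# Rows copied from the nine-experiment table: (CPU, memory, workload, education).
design = [
    ("Cheapest", "2 GB", "Administrative", "High school"),
    ("Cheapest", "4 GB", "Scientific", "Postgraduate"),
    ("Cheapest", "16 GB", "Creative", "College"),
    ("Best price/perf.", "2 GB", "Scientific", "College"),
    ("Best price/perf.", "4 GB", "Creative", "High school"),
    ("Best price/perf.", "16 GB", "Administrative", "Postgraduate"),
    ("Most expensive", "2 GB", "Creative", "Postgraduate"),
    ("Most expensive", "4 GB", "Administrative", "College"),
    ("Most expensive", "16 GB", "Scientific", "High school"),
]


def is_pairwise_balanced(rows):
    """True if every pair of factors shows each level combination exactly once."""
    n_factors = len(rows[0])
    for a, b in combinations(range(n_factors), 2):
        pairs = [(row[a], row[b]) for row in rows]
        levels_a = {row[a] for row in rows}
        levels_b = {row[b] for row in rows}
        all_covered = len(set(pairs)) == len(levels_a) * len(levels_b)
        no_repeats = len(pairs) == len(set(pairs))
        if not (all_covered and no_repeats):
            return False
    return True


balanced = is_pairwise_balanced(design)
print(len(design), balanced)
```

Balance of this kind lets each main effect be estimated without bias from the other main effects, although interactions remain uncovered.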

Advantages vs. Disadvantages

– Fractional factorial designs save time and expense

– Information obtained is less than from a full factorial design

– Not all interactions between factors are covered

Sometimes, certain interactions are known to be negligible

$2^k$ Factorial Designs

Determines the effect of k factors with 2 levels each

Easy to analyze

Helps to sort performance factors in the order of impact

At beginning of performance study:

– Large number of factors and levels

– Full factorial design most likely not possible

– Reduce the number of factors by selecting the significant ones

Impact of unidirectional factors can be estimated for their minimum and maximum levels

Decide if performance difference is worth further examination (with more levels)

Explanation of the concept: Start with k=2, then generalize

$2^k$ Factorial Designs with k=2

Special case of $2^k$ with just two factors, each at two levels

Can be easily analyzed with the following regression model

Example: Performance in MFLOPS, Factors: Cache size and memory size

Define $x_A$ and $x_B$ as follows:

• $x_A$ = -1 if 4 GB memory and 1 if 16 GB memory

• $x_B$ = -1 if 0.5 MB cache and 1 if 4 MB cache

Regression of MFLOPS with nonlinear model of the form:

• $y = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B$

Performance in MFLOPS:

Cache Size in MB | 4 GB Memory | 16 GB Memory
0.5 | 300 | 900
4 | 500 | 1500

$2^k$ Factorial Designs with k=2

Inserting the four observations we get the following equations:

$300 = q_0 - q_A - q_B + q_{AB}$

$900 = q_0 + q_A - q_B - q_{AB}$

$500 = q_0 - q_A + q_B - q_{AB}$

$1500 = q_0 + q_A + q_B + q_{AB}$

Solving for the four unknowns results in the regression equation:

$y = 800 + 400 x_A + 200 x_B + 100 x_A x_B$

Interpretation:

– The mean performance is 800 MFLOPS

– Effect of memory is ±400 MFLOPS

– Effect of cache is ±200 MFLOPS

– Interaction (extra bonus) for cache and memory combination accounts for ±100 MFLOPS

$2^k$ Factorial Designs: Sign Method Table

Exp. A B y
1 -1 -1 $y_1$
2 1 -1 $y_2$
3 -1 1 $y_3$
4 1 1 $y_4$

Substituting the four observations with $y_i$ we get:

$y_1 = q_0 - q_A - q_B + q_{AB}$

$y_2 = q_0 + q_A - q_B - q_{AB}$

$y_3 = q_0 - q_A + q_B - q_{AB}$

$y_4 = q_0 + q_A + q_B + q_{AB}$

Solving the equations, we get:

$q_0 = \frac{1}{4} (y_1 + y_2 + y_3 + y_4)$

$q_A = \frac{1}{4} (-y_1 + y_2 - y_3 + y_4)$

$q_B = \frac{1}{4} (-y_1 - y_2 + y_3 + y_4)$

$q_{AB} = \frac{1}{4} (y_1 - y_2 - y_3 + y_4)$

Sum of coefficients of linear combinations $q_A$, $q_B$, and $q_{AB}$ is zero!

Such expressions are called contrasts

Coefficients of $y_i$ in $q_A$ are identical to levels of A

$2^k$ Factorial Designs with k=2: Sign Method

Coefficients of $y_i$ in $q_A$, $q_B$, and $q_{AB}$ correspond to columns A, B, and AB of the sign table

Multiply columns (vectors) I, A, B, and AB with column (vector) y.

Divide the four results by four in order to obtain the coefficients of the regression model

Experiment I A B AB y
1 1 -1 -1 1 300
2 1 1 -1 -1 900
3 1 -1 1 -1 500
4 1 1 1 1 1500
Total 3200 1600 800 400
q = Total/$2^2$ 800 400 200 100
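The sign method for the $2^2$ example can be sketched as plain column arithmetic. The variable and function names are assumptions; the sign columns and responses are taken from the table.

```python
# Sign columns of the 2^2 design, one entry per experiment.
sign_columns = {
    "I": [1, 1, 1, 1],
    "A": [-1, 1, -1, 1],   # memory: -1 = 4 GB, 1 = 16 GB
    "B": [-1, -1, 1, 1],   # cache:  -1 = 0.5 MB, 1 = 4 MB
}
# The interaction column is the element-wise product of A and B.
sign_columns["AB"] = [a * b for a, b in zip(sign_columns["A"], sign_columns["B"])]

y = [300, 900, 500, 1500]  # measured MFLOPS

# Multiply each sign column with y, then divide by the number of experiments.
totals = {name: sum(s * v for s, v in zip(col, y)) for name, col in sign_columns.items()}
effects = {name: total / len(y) for name, total in totals.items()}


def predict(x_a, x_b):
    """Evaluate y = q0 + qA*xA + qB*xB + qAB*xA*xB with the fitted coefficients."""
    return (effects["I"] + effects["A"] * x_a + effects["B"] * x_b
            + effects["AB"] * x_a * x_b)


print(totals)
print(effects)
```

Because the model has four coefficients and four observations, `predict` reproduces each measured value exactly.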

$2^k$ Factorial Designs with k=2: Variation

Importance of a performance factor: Variation of Factor / Total Variation

Sample variance of y: $s_y^2 = \frac{\sum_{i=1}^{2^2} (y_i - \bar{y})^2}{2^2 - 1}$

Total variation or Sum of Squares Total of y: $\mathrm{SST} = \sum_{i=1}^{2^2} (y_i - \bar{y})^2$

Variance != Total variation

SST can be transformed to: $\mathrm{SST} = 2^2 q_A^2 + 2^2 q_B^2 + 2^2 q_{AB}^2$

These three parts represent the portion of the total variation explained by the effects A, B, and interaction AB.

SST = SSA + SSB + SSAB

Fraction of variation explained by A = SSA / SST

$2^k$ Factorial Designs with k=2: Variation

$\mathrm{SST} = 2^2 q_A^2 + 2^2 q_B^2 + 2^2 q_{AB}^2 = 4 (400^2 + 200^2 + 100^2)$ = 840,000

SSA = $4 \cdot 400^2$ = 640,000

SSAB = $4 \cdot 100^2$ = 40,000

SSA / SST = 640,000 / 840,000 = 0.76

SSAB / SST = 40,000 / 840,000 = 0.05

Main effect A is responsible for 76% of the variation

The interaction between A and B accounts for 5% of the variation
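The allocation of variation can be cross-checked numerically. This sketch computes SST once from the measured responses and once from the effects, using the numbers of the memory/cache example; the variable names are assumptions.

```python
y = [300, 900, 500, 1500]                    # measured MFLOPS of the 2^2 example
effects = {"A": 400, "B": 200, "AB": 100}    # coefficients from the sign method

n = len(y)                                   # 2^2 experiments
y_mean = sum(y) / n

# Total variation directly from the observations.
sst = sum((value - y_mean) ** 2 for value in y)
# The sample variance divides the same sum by (n - 1), so it differs from SST.
sample_variance = sst / (n - 1)

# Variation explained by each effect: 2^2 * q^2.
ss = {name: n * q ** 2 for name, q in effects.items()}
fractions = {name: value / sst for name, value in ss.items()}

for name in effects:
    print(f"SS{name} = {ss[name]:,} ({fractions[name]:.0%} of SST)")
```

The same calculation also yields the share of factor B (cache), which the slide does not list explicitly.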

$2^k$ Factorial Designs: General

Now: Extend $2^2$ design to $2^k$, i.e. k factors with two levels each

Total of $2^k$ experiments with $2^k - 1$ effects: k main effects, $\binom{k}{2}$ two-factor interactions, $\binom{k}{3}$ three-factor interactions, etc.

Sign table method also valid

Experiment replication can be easily included into the formulas ($2^k r$ designs).

See textbook for further details

$2^k$ Factorial Designs: General

Exp I A B C AB AC BC ABC y
1 1 -1 -1 -1 1 1 1 -1 14
2 1 1 -1 -1 -1 -1 1 1 22
3 1 -1 1 -1 -1 1 -1 1 10
4 1 1 1 -1 1 -1 -1 -1 34
5 1 -1 -1 1 1 -1 -1 1 46
6 1 1 -1 1 -1 1 -1 -1 58
7 1 -1 1 1 -1 -1 1 -1 50
8 1 1 1 1 1 1 1 1 86
Total 320 80 40 160 40 16 24 8
q = Total/$2^3$ 40 10 5 20 5 2 3 1
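For general k the sign table can be generated instead of written by hand. The following is a minimal sketch, assuming factors are named by single letters and that experiments are ordered with the first factor alternating fastest, as in the table; the function names are illustrative.

```python
from itertools import combinations, product
from math import prod


def sign_table(factor_names):
    """Build the 2^k sign table: one dict per experiment, one key per effect."""
    k = len(factor_names)
    rows = []
    for levels in product((-1, 1), repeat=k):
        # Reverse so that the first factor alternates fastest.
        signs = dict(zip(factor_names, reversed(levels)))
        row = {"I": 1}
        for size in range(1, k + 1):
            for subset in combinations(factor_names, size):
                # An interaction column is the product of its factor columns.
                row["".join(subset)] = prod(signs[f] for f in subset)
        rows.append(row)
    return rows


def compute_effects(rows, responses):
    """Return (column totals, coefficients q) for the measured responses."""
    totals = {name: sum(row[name] * y for row, y in zip(rows, responses))
              for name in rows[0]}
    q = {name: total / len(rows) for name, total in totals.items()}
    return totals, q


rows = sign_table("ABC")
responses = [14, 22, 10, 34, 46, 58, 50, 86]   # column y of the 2^3 table
totals, q = compute_effects(rows, responses)
print(totals)
print(q)
```

With k factors the table has $2^k$ rows and $2^k$ columns including I, so the approach only stays practical for small k.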

$2^{k-p}$ Fractional Factorial Designs

A large number of factors requires very many experiments with a full factorial experiment design

Alternative: Fractional factorial design

A $2^{k-p}$ design allows the analysis of k two-level factors with just $2^{k-p}$ experiments, with p chosen suitably, i.e. p = 1 for half as many experiments, p = 2 for a quarter of the original number of experiments, etc.

A $2^{k-1}$ design is called half-replicate design

How do we create such designs, i.e. the sign tables?

Example: A $2^{7-4}$ experimental design

$2^{k-p}$ Fractional Factorial Designs: $2^{7-4}$ Example

Factor levels (signs in columns) need to be carefully chosen

Sign vectors must remain mutually orthogonal, as known from $2^k$ designs

– Sum of each column is zero

– Sum of the products of any two columns is zero

– The sum of the squares of each column is $2^{7-4} = 8$

Exp. A B C D E F G

1 -1 -1 -1 1 1 1 -1

2 1 -1 -1 -1 -1 1 1

3 -1 1 -1 -1 1 -1 1

4 1 1 -1 1 -1 -1 -1

5 -1 -1 1 1 -1 -1 1

6 1 -1 1 -1 1 -1 -1

7 -1 1 1 -1 -1 1 -1

8 1 1 1 1 1 1 1
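The three orthogonality conditions can be verified mechanically for the table above. The rows are copied from the table; the helper name is an assumption of this sketch.

```python
from itertools import combinations

# Rows of the 2^(7-4) sign table, columns A..G.
rows = [
    (-1, -1, -1,  1,  1,  1, -1),
    ( 1, -1, -1, -1, -1,  1,  1),
    (-1,  1, -1, -1,  1, -1,  1),
    ( 1,  1, -1,  1, -1, -1, -1),
    (-1, -1,  1,  1, -1, -1,  1),
    ( 1, -1,  1, -1,  1, -1, -1),
    (-1,  1,  1, -1, -1,  1, -1),
    ( 1,  1,  1,  1,  1,  1,  1),
]
# Transpose the rows into named columns.
columns = dict(zip("ABCDEFG", zip(*rows)))


def check_orthogonality(columns):
    """Return the three conditions as booleans: zero sums, zero products, squares."""
    n_experiments = len(next(iter(columns.values())))
    zero_sums = all(sum(col) == 0 for col in columns.values())
    zero_products = all(
        sum(a * b for a, b in zip(columns[p], columns[q])) == 0
        for p, q in combinations(columns, 2)
    )
    unit_squares = all(sum(v * v for v in col) == n_experiments
                       for col in columns.values())
    return zero_sums, zero_products, unit_squares


zero_sums, zero_products, unit_squares = check_orthogonality(columns)
print(zero_sums, zero_products, unit_squares)
```

Comparing columns also shows that D, E, F, and G in this table equal the products AB, AC, BC, and ABC of the first three columns.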

$2^{k-p}$ Fractional Factorial Designs: Preparation

Let’s do a $2^{4-1}$ sign table

– Start with full factorial sign table for a $2^3$ design

– Arbitrarily pick the rightmost column and mark it D instead of ABC

– The design computes the main effects $q_A$, $q_B$, $q_C$, and $q_D$ plus the interactions $q_{AB}$, $q_{AC}$, and $q_{BC}$

Confounding (Konfundierung): Some of the effects cannot be determined independently

Exp. A B C AB AC BC D

1 -1 -1 -1 1 1 1 -1

2 1 -1 -1 -1 -1 1 1

3 -1 1 -1 -1 1 -1 1

4 1 1 -1 1 -1 -1 -1

5 -1 -1 1 1 -1 -1 1

6 1 -1 1 -1 1 -1 -1

7 -1 1 1 -1 -1 1 -1

8 1 1 1 1 1 1 1

$2^{k-p}$ Fractional Factorial Designs: Confounding

In fractional factorial experiments some of the effects cannot be determined individually

Only the combined influence of two or more effects is available

This problem is known as confounding (Konfundierungseffekt)

Example: Consider the effects of A and D which can be computed as:

$q_A = \frac{1}{8} \sum_{i=1}^{8} y_i x_{Ai} = \frac{1}{8} (-y_1 + y_2 - y_3 + y_4 - y_5 + y_6 - y_7 + y_8)$

$q_D = \frac{1}{8} \sum_{i=1}^{8} y_i x_{Di} = \frac{1}{8} (-y_1 + y_2 + y_3 - y_4 + y_5 - y_6 - y_7 + y_8)$

The interaction ABC is obtained by multiplying the elements of columns A, B, C, and y which gives:

$q_{ABC} = \frac{1}{8} \sum_{i=1}^{8} y_i x_{Ai} x_{Bi} x_{Ci} = \frac{1}{8} (-y_1 + y_2 + y_3 - y_4 + y_5 - y_6 - y_7 + y_8)$

The expression for $q_D$ is identical to that for $q_{ABC}$!

$2^{k-p}$ Fractional Factorial Designs: Confounding

Actually, the previous expression is neither $q_{ABC}$ nor $q_D$. It’s the sum of both!

This is not a problem if the combined interaction of A, B, and C is small compared to the effect of D. Often this is known prior to the experiment.

Thus, the confounding in this example can be denoted as D = ABC, i.e. their computation uses the same linear combination of responses.

D and ABC are not the only confounded effects. A $2^{4-1}$ design has only eight experiments with eight results.

Complete list of confoundings: A = BCD, B = ACD, C = ABD, AB = CD, AC = BD, BC = AD, ABC = D, I = ABCD.

A fractional factorial design is not unique: $2^p$ possibilities

Assumption: Higher order interactions are smaller than lower order interactions

Design quality can vary. Typically it’s better if main effects are confounded with third or higher order interactions, e.g. A = BCD is better than A = BD.
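The list of confoundings can be derived mechanically. Multiplying D = ABC by D gives I = ABCD, because every sign column multiplied by itself yields I. Each effect is then confounded with its product with ABCD, where letters appearing twice cancel. The sketch below assumes single-letter factor names and a half-replicate design with one such defining word; the function names are illustrative.

```python
from itertools import combinations


def multiply(effect_a, effect_b):
    """Multiply two effects; a letter appearing twice cancels (column squared = I)."""
    letters = set(effect_a.replace("I", "")) ^ set(effect_b.replace("I", ""))
    return "".join(sorted(letters)) or "I"


def alias_pairs(factors, defining_word):
    """Pair every effect with the effect it is confounded with, given I = defining_word."""
    effects = ["I"] + ["".join(c)
                       for size in range(1, len(factors) + 1)
                       for c in combinations(factors, size)]
    pairs = {}
    seen = set()
    for effect in effects:
        if effect in seen:
            continue
        partner = multiply(effect, defining_word)
        pairs[effect] = partner
        seen.update((effect, partner))
    return pairs


# D = ABC, i.e. I = ABCD: main effects are confounded with three-factor interactions.
good_design = alias_pairs("ABCD", "ABCD")
# Alternative choice D = AB, i.e. I = ABD: main effect A is confounded with BD.
weaker_design = alias_pairs("ABCD", "ABD")

print(good_design)
print(weaker_design["A"])
```

The second call illustrates why the choice of the relabelled column matters: with D = AB a main effect can no longer be separated from a two-factor interaction.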

Common Mistakes

Variation due to experimental error ignored

– Measured values are random values. They can vary even if all controllable factors are kept constant

– Variation due to a factor must be compared to variation due to errors before making a decision about the effect

Important parameters are not controlled

– Parameters that affect performance are not selected as factors e.g. the user of a workstation

Effects of different factors are not isolated

– Simultaneous variation of factors -> performance change cannot be allocated to any particular factor

Simple one-factor-at-a-time designs are used

– Waste of resources. Requires too many experiments

Interactions are ignored

– Many performance effects depend on multiple factors at the same time e.g. data cache size, number of CPUs, and the problem size

Too many experiments are conducted

Holger Brunst (holger.brunst@tu-dresden.de)

Matthias S. Mueller (matthias.mueller@tu-dresden.de)

Thank You!
