IWSM 2014: Putnam revisited (Han Suelmann)

Putnam’s Effort-Duration Trade-Off Law: Is the Software Estimation Problem Really Solved?

Han Suelmann
October 7th, 2014

Putnam’s study – reference

L.H. Putnam, “A generic empirical solution to the macro software sizing and estimating problem,” IEEE Transactions on Software Engineering, vol. 4, pages 345–361, July 1978.

Agenda

• Putnam’s study: results and influence
• Putnam’s approach
• Intermezzo – A statistical pitfall
• Critical evaluation:
  o dataset is very limited
  o model and assumptions are unclear
  o analysis is incorrect
• Other studies provide no corroboration
• Simulation study demonstrates incorrectness

Putnam’s study: results and influence

Claims:
• Generic empirical equations that describe size – effort – duration relationships.
• Method will produce accurate estimates.
• Only a few quick reference tables and a pocket calculator needed.
• Trade-off law: $K \sim 1/T^4$.

Problem solved!

Putnam’s study is very influential

Influence:
• incorporated in estimation software
• many references
• sometimes cited as authoritative

Putnam’s approach

1) Gather data on effort (K), duration (T) and size (S).

2) Define difficulty: $D = K / T^2$

3) Define productivity: $P = S / K$

4) Find relationship between D and P.
Result: $P \sim D^{-0.67}$

5) Perform basic algebraic manipulations to find relationships between S, T, and K.
Result: $S = C \cdot K^{1/3} \cdot T^{4/3}$, and therefore: $K \sim S^3 / T^4$.
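The algebra in step 5 can be checked mechanically. The following is a minimal sketch, not Putnam's own procedure: the function name `putnam_exponents` and its return convention are illustrative assumptions. Given an exponent b in P ~ D^b, it returns the exponents a and c in S ~ K^a · T^c, and the trade-off exponent u in K ~ T^u at fixed size S.

```python
from fractions import Fraction


def putnam_exponents(b):
    """Turn the exponent b of P ~ D**b into size and trade-off exponents.

    Uses the definitions D = K / T**2 and P = S / K.
    Returns (a, c, u) with S ~ K**a * T**c and, at fixed S, K ~ T**u.
    Hypothetical helper for illustration only.
    """
    # S / K ~ (K / T**2)**b  =>  S ~ K**(1 + b) * T**(-2 * b)
    a = 1 + b
    c = -2 * b
    # At fixed S: K**a ~ T**(-c)  =>  K ~ T**(-c / a)
    u = -c / a
    return a, c, u


# Exact arithmetic with b = -2/3 (the rounded value -0.67 in step 4)
a, c, u = putnam_exponents(Fraction(-2, 3))
print(a, c, u)
```

With b = -2/3 the helper gives the exponents 1/3 and 4/3 of step 5 and the fourth-power trade-off u = -4. With the rounded value b = -0.67 it gives u close to -4.06, which shows how sensitive the trade-off exponent is to the fitted slope.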


The crucial relationship…

[Figure: productivity $P = S/K$ plotted against difficulty $D = K/T^2$.]


Intermezzo – A statistical pitfall

• Two researchers examine the relationship between S and K.
• Both assume a linear relationship.

• Researcher 1 writes K = aS + b

• Researcher 2 writes S = a’K + b’

[Figure: two scatter plots with axes S and K. Caption: Researcher 2 does his linear fit.]

Intermezzo – The results are quite different

• Researcher 1 writes K = aS + b and finds (1) $K = 1.01\,S - 0.02$.

• Researcher 2 writes S = a’K + b’ and finds (2) $S = 0.50\,K + 3.0$.

• Researcher 2 then derives (3) $K = 2.02\,S - 6.2$.

[Figure caption: Researcher 1 does her fit.]

Intermezzo – A statistical pitfall

• Researcher 1 writes K = aS + b and finds (1) $\hat{K} = 1.01\,S - 0.02$.

• Researcher 2 writes S = a’K + b’ and finds (2) $\hat{S} = 0.50\,K + 3.0$.

• Researcher 2 then derives (3) $K = 2.02\,\hat{S} - 6.2$.

Linear fit: minimising least squares

[Figure: scatter plot with axes S and K.]
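The pitfall can be reproduced with synthetic data. The sketch below is illustrative only: the data are randomly generated, and the means, standard deviations, sample size and seed are assumptions chosen so that the expected fits lie near equations (1) to (3). They are not the data behind the figures.

```python
import numpy as np

# Hypothetical data: K equals S plus independent noise (all values assumed)
rng = np.random.default_rng(0)
n = 1000
S = rng.normal(6.0, 2.0, n)
K = S + rng.normal(0.0, 2.0, n)

# Researcher 1: least-squares fit of K on S (minimises errors in K)
a1, b1 = np.polyfit(S, K, 1)

# Researcher 2: least-squares fit of S on K (minimises errors in S)
a2, b2 = np.polyfit(K, S, 1)

# Researcher 2 then inverts the fitted line algebraically
a3, b3 = 1.0 / a2, -b2 / a2

print(f"(1) K = {a1:.2f} S + {b1:.2f}")
print(f"(2) S = {a2:.2f} K + {b2:.2f}")
print(f"(3) K = {a3:.2f} S + {b3:.2f}")
```

Under these assumptions the slope of (1) is expected to be near 1 and the slope of (3) near 2, although both are computed from the same data. A regression line predicts the fitted variable; it is not an exact relation that can be inverted.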


Critical evaluation (1) – dataset is very limited

• only 13 projects
• all US Military
• 4 are left out => 9 projects remaining

Critical evaluation (2) – model is unclear

[Diagram: size, duration, effort.]

Putnam does not make clear and consistent choices regarding model structure.

Only one parameter to capture effort-duration interaction.

Critical evaluation (3) – analysis

$S = C \cdot K^{1/3} \cdot T^{4/3}$

$S^3 = C^3 \cdot K \cdot T^4$

$\hat{S} = C \cdot K^{1/3} \cdot T^{4/3}$

$\hat{S}^3 = C^3 \cdot K \cdot T^4$

$K = \hat{S}^3 / (C^3 \cdot T^4)$

Critical evaluation (4): Difficulty – Productivity relationship

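Putnam’s reasoning:

$P \sim D^{-2/3}$, that is $\frac{S}{K} \sim \left(\frac{K}{T^2}\right)^{-2/3}$, and hence $S \sim K^{1/3}\, T^{4/3}$.

More precisely notated:

$\hat{P} \sim D^{-2/3}$, that is $\widehat{S/K} \sim K^{-2/3}\, T^{4/3}$. Does this yield $\hat{S}$? $\hat{K}$?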

Other studies − Corroboration by Putnam et al.

Putnam & Putnam, “A data verification of the software fourth power trade-off law,” (Proc. of the Int. Soc. of Parametric Analysts – 6th Annu. Conf., vol. III(I), pp. 443–471, 1984.)

Putnam & Myers, “Measures for excellence – Reliable software on time, within budget”, (Englewood Cliffs: Yourdon, 1992.)

Confirmed that $K \sim 1/T^4$, but…

Found (Dunsmore et al., 1986) and admitted (Putnam & Myers, n.d.) to be based on circular reasoning.


Other studies – No corroboration from Jeffery

Jeffery (1987):
• 47 MIS in 4 large organisations
• Find P as a function of K and T.

Result:
• $P \sim K^{-0.47}\, T^{-0.05}$
• essentially no productivity – duration relationship
• comparison with Putnam’s $P \sim K^{-0.67}\, T^{1.33}$:
  o no confirmation
  o strictly speaking: no refutation either

Other studies – No corroboration from Barry, Mukhopadhyay, and Slaughter

Barry, Mukhopadhyay, and Slaughter (2002):

Ansatz: ln K = … + β1 T

Result: β1 = 0.000677 ± 0.000103, p = .031.

So – larger duration predicts larger effort.


Other studies – Team size affects effort, so…?

Putnam & Myers (n.d.): larger team size predicts larger effort:

Teams of 5 or less have better productivity than teams of 20 or more.

Supported by other studies. Example (Rodríguez et al.):

PDR ~ (average team size)^0.57

But…
• translation to effort-duration trade-off unclear
• interpretation in terms of causation dubious

Several interpretations are possible…

[Diagram: “larger team size” and “more effort”, linked by arrows marked “?”.]

Simulation (1)

Goal: check whether the analysis issues really lead to incorrect results.

Method:
• generate simulated data with known structure
• analyse simulated data, following Putnam’s approach
• check whether results are consistent with assumptions

Simulation (2)

Model assumptions:

• Size, effort, and duration are unrelated random numbers.
• Log-normal distributions.
• 1000 projects.

Simulation (3) – analysis

[Figure: productivity $P = S/K$ plotted against difficulty $D = K/T^2$ for the simulated projects.]

Simulation (4) – result

Fit yields: $P \sim D^{-0.67 \pm 0.02}$

After transformation: $\ln P = -0.67 \ln D + \text{constant}$

After some manipulations (same as Putnam’s): $K \sim 1 / T^{4.1 \pm 0.4}$

Yet, no relationship actually exists!

Simulation (5) – coincidence?

For convenience, write s = ln S, k = ln K, and t = ln T.

Difficulty and productivity:

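• ln D = k – 2t

• ln P = s – k

Derive the slope of P against D. Because s, k, and t are unrelated, $\operatorname{cov}(\ln D, \ln P) = -\sigma_k^2$ and $\operatorname{var}(\ln D) = \sigma_k^2 + 4\sigma_t^2$, so

$B(\ln P \mid \ln D) = \frac{\operatorname{cov}(\ln D, \ln P)}{\operatorname{var}(\ln D)} = -\frac{\sigma_k^2}{\sigma_k^2 + 4\sigma_t^2}.$

Follow Putnam closely, finding $K \sim T^u$, with

$u = -\frac{1}{2} \cdot \frac{\sigma_k^2}{\sigma_t^2},$

which yields $u = -4$ if $\sigma_k^2 / \sigma_t^2 = 8$.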
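The simulation can be sketched in a few lines. This is a minimal illustration, not the code behind the reported results: the standard deviations, the zero means of the logarithms, and the seed are assumptions. The variance ratio var(ln K) / var(ln T) = 8 is the condition named above for u = -4.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000  # number of simulated projects

# Assumed spreads of ln S, ln K, ln T; the ratio sigma_k^2 / sigma_t^2 is 8
sigma_s = 1.0
sigma_t = 1.0
sigma_k = np.sqrt(8.0) * sigma_t

# Independent normal logs, so S, K and T are unrelated log-normal variables
s = rng.normal(0.0, sigma_s, n)
k = rng.normal(0.0, sigma_k, n)
t = rng.normal(0.0, sigma_t, n)

# Putnam's derived quantities on the log scale
ln_D = k - 2.0 * t  # difficulty D = K / T^2
ln_P = s - k        # productivity P = S / K

# Least-squares slope of ln P against ln D
B, const = np.polyfit(ln_D, ln_P, 1)

# Same manipulation as in Putnam's approach: K ~ T^u at fixed S
u = 2.0 * B / (1.0 + B)

print(f"slope B = {B:.2f}, trade-off exponent u = {u:.1f}")
```

Under these assumptions the fitted slope is expected to be near -2/3 and u near -4, even though S, K and T were generated independently. The apparent relationship arises because K appears in both D and P.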


Simulation (6) – result

[Figure: plot with axes S and K.]

Conclusions

Claims:
• Generic equations that describe size – effort – duration relationships. => Limited dataset, no corroboration
• Method will produce accurate estimates. => Not addressed
• Trade-off law: $K \sim 1/T^4$. => Faulty analysis, no corroboration

Conclusion

Putnam’s original study was wrong

No corroboration

=> No credibility for Putnam’s result

The bad news

• Handling statistical relationships as if exact.

• Interpreting statistical relationships as causal relationships without sufficient support.

Both issues are rather common in the estimation / metrics literature.

Question time