Putnam’s Effort-Duration Trade-Off Law: Is the Software Estimation Problem Really Solved?
Han Suelmann
October 7th, 2014
Putnam’s study – reference
L.H. Putnam,
“A generic empirical solution to the macro software sizing and estimating problem,”
IEEE Transactions on Software Engineering,
vol. 4, pages 345–361,
July 1978.
Agenda
• Putnam’s study: results and influence
• Putnam’s approach
• Intermezzo – A statistical pitfall
• Critical evaluation:
  o dataset is very limited
  o model and assumptions are unclear
  o analysis is incorrect
• Other studies provide no corroboration
• Simulation study demonstrates incorrectness
Putnam’s study: results and influence
Claims:
• Generic empirical equations that describe size – effort – duration relationships.
• Method will produce accurate estimates.
• Only a few quick reference tables and a pocket calculator needed.
• Trade-off law: $K \sim 1 / T^4$.
Problem solved!
Putnam’s study is very influential
Influence:
• incorporated in estimation software
• many references
• sometimes cited as authoritative
Putnam’s approach
1) Gather data on effort (K), duration (T) and size (S).
2) Define difficulty: $D = K / T^2$
3) Define productivity: $P = S / K$
4) Find relationship between D and P.
   Result: $P \sim D^{-0.67}$
5) Perform basic algebraic manipulations to find relationships between S, T, and K.
   Result: $S = C \cdot K^{1/3} \cdot T^{4/3}$, and therefore: $K \sim S^3 / T^4$.
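To see what a fourth-power law would imply in practice, the relation between effort and duration can be evaluated for a fixed size and a changed schedule. This is only a minimal sketch: the function name and the sample duration ratios are illustrative and not part of Putnam's study.

```python
def effort_ratio(duration_ratio, exponent=-4.0):
    """Relative effort K/K0 for a relative duration T/T0 at fixed size,
    assuming K ~ T**exponent (exponent -4 is the claimed trade-off law)."""
    return duration_ratio ** exponent


# Hypothetical schedule changes: 20% shorter up to 20% longer.
for ratio in (0.8, 0.9, 1.0, 1.1, 1.2):
    print(f"T x {ratio:.1f} -> K x {effort_ratio(ratio):.2f}")
```

If the law held, a 10% shorter schedule would cost roughly 52% more effort, which is why the value of the exponent matters so much.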
The crucial relationship…
[Figure: productivity $P = S / K$ plotted against difficulty $D = K / T^2$]
Intermezzo – A statistical pitfall
• Two researchers examine the relationship between S and K.
• Both assume a linear relationship.
• Researcher 1 writes K = aS + b
• Researcher 2 writes S = a’K + b’
[Figure: scatter plot of S and K; Researcher 2 does his linear fit]
Intermezzo – The results are quite different
• Researcher 1 writes K = aS + b and finds (1) K = 1.01 S − 0.02.
• Researcher 2 writes S = a’K + b’ and finds (2) S = 0.50 K + 3.0.
• Researcher 2 then derives (3) K = 2.02 S − 6.2.
[Figure: scatter plot of S and K; Researcher 1 does her fit]
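The discrepancy between the two researchers can be reproduced with any noisy data set. The sketch below uses made-up data (the noise level, sample size, and seed are assumptions, not the data behind the slides) and fits both regressions with ordinary least squares. The two fitted slopes multiply to the squared correlation coefficient, so inverting the S-on-K fit gives a steeper line than the direct K-on-S fit whenever the correlation is imperfect.

```python
import numpy as np

# Hypothetical data: K equals S plus independent noise.
rng = np.random.default_rng(seed=0)
size = rng.uniform(0.0, 12.0, 200)            # S
effort = size + rng.normal(0.0, 3.0, 200)     # K

# Researcher 1: least-squares fit of K on S.
a1, b1 = np.polyfit(size, effort, 1)

# Researcher 2: least-squares fit of S on K, then algebraic inversion.
a2, b2 = np.polyfit(effort, size, 1)
a2_inv, b2_inv = 1.0 / a2, -b2 / a2

r = np.corrcoef(size, effort)[0, 1]

print(f"Researcher 1:            K = {a1:.2f} S + {b1:.2f}")
print(f"Researcher 2 (inverted): K = {a2_inv:.2f} S + {b2_inv:.2f}")
print(f"a1 * a2 = {a1 * a2:.3f}, r^2 = {r ** 2:.3f}")
```

Neither line is wrong as a fit: each minimises squared errors in a different variable. The error lies in treating the inverted fit as if it were a prediction of K from S.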
Intermezzo – A statistical pitfall
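• Researcher 1 writes K = aS + b and finds (1) $\hat{K} = 1.01\, S - 0.02$.
• Researcher 2 writes S = a’K + b’ and finds (2) $\hat{S} = 0.50\, K + 3.0$.
• Researcher 2 then derives (3) $K = 2.02\, \hat{S} - 6.2$.
The hats mark fitted values: (3) relates K to the fitted size, so it is not a prediction of K from the observed S.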
Linear fit: minimising least squares
[Figure: scatter plot of S and K]
Critical evaluation (1) – dataset is very limited
• only 13 projects
• all US Military
• 4 are left out => 9 projects remaining
Critical evaluation (2) – model is unclear
[Diagram: size, duration, effort]
Putnam does not make clear and consistent choices regarding model structure.
Only one parameter to capture effort-duration interaction
Critical evaluation (3) – analysis
$S = C \cdot K^{1/3} \cdot T^{4/3}$
$S^3 = C^3 \cdot K \cdot T^4$
$\hat{S} = C \cdot K^{1/3} \cdot T^{4/3}$
$\hat{S}^3 = C^3 \cdot K \cdot T^4$
$K = \hat{S}^3 / (C^3 \cdot T^4)$
Critical evaluation (4): Difficulty – Productivity relationship
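Putnam’s reasoning:
$P \sim D^{-2/3} \;\Rightarrow\; \frac{S}{K} \sim \left(\frac{K}{T^2}\right)^{-2/3} \;\Rightarrow\; S \sim K^{1/3}\, T^{4/3}$
More precisely notated:
$\hat{P} \sim D^{-2/3} \;\Rightarrow\; \widehat{S/K} \sim K^{-2/3}\, T^{4/3} \;\Rightarrow\; ??$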
Other studies − Corroboration by Putnam et al.
Putnam & Putnam, “A data verification of the software fourth power trade-off law,” (Proc. of the Int. Soc. of Parametric Analysts – 6th Annu. Conf., vol. III(I), pp. 443–471, 1984.)
Putnam & Myers, “Measures for excellence – Reliable software on time, within budget”, (Englewood Cliffs: Yourdon, 1992.)
Confirmed that $K \sim 1 / T^4$, but…
Found (Dunsmore et al., 1986) and admitted (Putnam & Myers, n.d.) to be based on circular reasoning.
Other studies – No corroboration from Jeffery
Jeffery (1987):
• 47 MIS in 4 large organisations
• Find P as a function of K and T.
Result:
• $P \sim K^{-0.47}\, T^{-0.05}$
• essentially no productivity – duration relationship
• comparison with Putnam’s $P \sim K^{-0.67}\, T^{1.33}$
• no confirmation
• strictly speaking: no refutation either
Other studies – No corroboration from Barry, Mukhopadhyay, and Slaughter
Barry, Mukhopadhyay, and Slaughter (2002):
Ansatz: ln K = … + β1 T
Result: β1 = 0.000677 ± 0.000103, p = .031.
So – larger duration predicts larger effort.
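To get a feel for the size of this coefficient: in a model that is linear in T on the log-effort scale, each additional unit of T multiplies predicted effort by exp(β1). The sketch below assumes the other terms of the model (elided above) are held fixed. The unit of T is not stated here, and the 100-unit horizon is an arbitrary illustration.

```python
import math

BETA_1 = 0.000677  # coefficient reported by Barry, Mukhopadhyay, and Slaughter


def effort_multiplier(extra_duration, beta=BETA_1):
    """Factor applied to predicted effort when T grows by extra_duration,
    with all other model terms held fixed (ln K = ... + beta * T)."""
    return math.exp(beta * extra_duration)


print(f"+1 unit of T:    effort x {effort_multiplier(1):.6f}")
print(f"+100 units of T: effort x {effort_multiplier(100):.3f}")
```

The sign is the relevant point: the multiplier exceeds 1 for any increase in duration, which is the opposite of what a trade-off law with a negative exponent predicts.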
Other studies – Team size affects effort, so…?
Putnam & Myers (n.d.): larger team size predicts larger effort:
Teams of 5 or less have better productivity than teams of 20 or more.
Supported by other studies. Example (Rodríguez et al.):
PDR ~ (average team size)^0.57
But…
• translation to effort-duration trade-off unclear
• interpretation in terms of causation dubious
Several interpretations are possible…
[Diagram: “larger team size” and “more effort”, linked by question marks]
Simulation (1)
Goal: check whether the analysis issues really lead to incorrect results.
Method:
• generate simulated data with known structure
• analyze simulated data, following Putnam’s approach
• check whether results are consistent with assumptions
Simulation (2)
Model assumptions:
• Size, effort, and duration are unrelated random numbers.
• Log-normal distributions.
• 1000 projects.
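The setup above can be sketched in a few lines. Several details are assumptions rather than values from the slides: the standard deviations (chosen so that the spread of log-effort is sqrt(8) times that of log-duration), the zero log-means, the seed, and the use of ordinary least squares via `numpy.polyfit`. Because the variables are log-normal, the sketch draws their logarithms from normal distributions directly.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n_projects = 1000

# Assumed spreads of the logarithms (not given on the slides).
sigma_s = 1.0
sigma_t = 0.5
sigma_k = np.sqrt(8.0) * sigma_t

# Independent draws: ln S, ln K, ln T are unrelated by construction.
s = rng.normal(0.0, sigma_s, n_projects)
k = rng.normal(0.0, sigma_k, n_projects)
t = rng.normal(0.0, sigma_t, n_projects)

# Putnam's derived quantities on the log scale.
ln_d = k - 2.0 * t   # difficulty   D = K / T^2
ln_p = s - k         # productivity P = S / K

# Log-log fit of productivity against difficulty.
slope, intercept = np.polyfit(ln_d, ln_p, 1)

# Slope expected purely from K appearing in both D and P.
expected = -sigma_k**2 / (sigma_k**2 + 4.0 * sigma_t**2)

print(f"fitted slope:   {slope:.3f}")
print(f"expected slope: {expected:.3f}")
```

The fitted slope is clearly negative even though size, effort, and duration were generated independently: the apparent relationship comes from effort entering both the difficulty and the productivity definitions.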
Simulation (3) – analysis
[Figure: productivity $P = S / K$ plotted against difficulty $D = K / T^2$ for the simulated data]
Simulation (4) – result
Fit yields: $\ln P = -0.67 \ln D + \text{constant}$
After transformation: $P \sim D^{-0.67 \pm 0.02}$
After some manipulations (same as Putnam’s): $K \sim 1 / T^{4.1 \pm 0.4}$
Yet, no relationship actually exists!
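The step from the fitted slope to the trade-off exponent can be checked numerically. The sketch assumes that the manipulation amounts to solving S/K ~ (K/T^2)^b for K at fixed S, which gives K ~ T^u with u = 2b / (1 + b). The function name and the first-order error propagation are illustrative choices, not taken from the study.

```python
def tradeoff_exponent(slope, slope_err):
    """Exponent u in K ~ T**u (size held fixed) implied by P ~ D**slope,
    with a first-order propagated uncertainty."""
    u = 2.0 * slope / (1.0 + slope)
    # du/db = 2 / (1 + b)^2, so a small error in b is strongly amplified
    # when b is close to -1.
    u_err = 2.0 / (1.0 + slope) ** 2 * slope_err
    return u, u_err


u, u_err = tradeoff_exponent(-0.67, 0.02)
print(f"K ~ T^({u:.1f} +/- {u_err:.1f})")
```

With the simulated slope of −0.67 ± 0.02 this reproduces an exponent of about −4.1 ± 0.4, matching the result quoted on the slide.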
Simulation (5) – coincidence?
For convenience, write s = ln S, k = ln K, and t = ln T.
Difficulty and productivity:
•ln D = k – 2t
•ln P = s – k
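Derive the slope of P against D:
$B(\ln P \mid \ln D) = \frac{\operatorname{cov}(\ln D, \ln P)}{\operatorname{var}(\ln D)} = -\frac{\sigma_k^2}{\sigma_k^2 + 4\sigma_t^2}.$
Follow Putnam closely, finding $K \sim T^u$, with
$u = -\frac{1}{2}\, \frac{\sigma_k^2}{\sigma_t^2},$
which yields $u = -4$ if $\sigma_k / \sigma_t = \sqrt{8}$.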
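Both results on this slide follow from the independence assumption in a few lines. The sketch below fills in the intermediate steps. The symbols $\sigma_k^2$ and $\sigma_t^2$ (variances of k and t) and b (the fitted slope) are notation assumed here, and the fourth line deliberately treats the fitted relationship as exact, as Putnam's manipulation does.

```latex
\begin{align*}
\operatorname{cov}(\ln D, \ln P)
  &= \operatorname{cov}(k - 2t,\; s - k) = -\sigma_k^2
  && \text{($s$, $k$, $t$ independent)} \\
\operatorname{var}(\ln D)
  &= \operatorname{var}(k - 2t) = \sigma_k^2 + 4\sigma_t^2 \\
b &= \frac{\operatorname{cov}(\ln D, \ln P)}{\operatorname{var}(\ln D)}
   = -\frac{\sigma_k^2}{\sigma_k^2 + 4\sigma_t^2} \\
s - k &= b\,(k - 2t) + \text{const}
  \;\Rightarrow\; (1 + b)\,k = 2b\,t + s - \text{const} \\
u &= \frac{2b}{1 + b}
   = \frac{-2\sigma_k^2}{4\sigma_t^2}
   = -\frac{\sigma_k^2}{2\sigma_t^2}
\end{align*}
```

The exponent therefore depends only on the ratio of the spreads of log-effort and log-duration in the data set, not on any trade-off between the two.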
Simulation (6) – result
[Figure: plot with axes S and K]
Conclusions
Claims:
• Generic equations that describe size – effort – duration relationships. → Limited dataset, no corroboration
• Method will produce accurate estimates. → Not addressed
• Trade-off law: $K \sim 1 / T^4$. → Faulty analysis, no corroboration
Conclusion
No credibility for Putnam’s result
No corroboration
Putnam’s original study was wrong
The bad news
• Handling statistical relationships as if exact.
• Interpreting statistical relationships as causal relationships without sufficient support.
Both issues are rather common in the estimation / metrics literature.
Question time