Date post: | 07-Aug-2018 |
Upload: | aparna-k-nayak |
8/21/2019 Predicting and Estimating Nov 06
Linda M. Laird 2004. All Rights Reserved
What will the reliability be?
Predicting and Estimating Software
Reliability
Linda M. Laird
SRE 689
System Reliability Tasks
[Figure: flow diagram of the system reliability tasks. It shows the parallel hardware track (requirements analysis, preliminary design, detailed design, fabrication, HWCI test) and software track (requirements analysis, preliminary design, detailed design, coding and unit test, CSC integration test, CSCI test), both fed by system requirements analysis and design and ending in system integration and test. Alongside them run the reliability activities: system reliability requirements, system HW/SW reliability model, system HW/SW reliability allocations, HW/SW reliability predictions, progress evaluation, HW/SW growth testing, and HW/SW demo test. Assessment reports go to the program manager and engineering manager; "not OK" results lead to design correction, redesign, reallocation, or reassigned resources.]
Source: Lakey and Neufelder
Projecting Reliability Agenda
Motivation
Prediction vs. Estimation
Model Overview
Predicting Defect Densities
Predicting Failures From Defect Densities
Estimating Failures from Testing
Execution Time vs. Calendar Time
Estimating Failure Models
Reliability Growth
Reliability Estimation
(The slide marks the earlier items "This Week" and the later items "Next Week".)
You need a view of the expected reliability throughout the project, so that you can tell whether or not you are going to hit the requirements.
And if you are going to miss.
What can you do about it?
Plenty
(So, quick: what are some ways you can improve the reliability?)
Fault Tolerance, Reduce Defects,
Increase Reviews, Reduce Complexity,
Redundancy (maybe), etc
Why else Predict the Reliability?
When to Ship
Objective statement of quality of product
Resource planning for maintenance
Prediction and Estimation
What's the difference?
http://www.bmc.riken.go.jp/sensor/Huang/Localization/estimation.gif
Prediction & Estimation
Predictions: based on historical reliability data and knowledge from other projects
Estimations: based on reliability data for this project and reliability models
Prediction: used in earlier stages
Estimation: used in later stages (when you have more data)
Prediction Semantics
The terms "prediction" and "estimation" are frequently misused (including in these lectures).
What is important is the concept: are you using historical data from other projects (predicting), or are you trending from this project (estimating)?
Reliability Prediction
Used to predict a product's reliability, or to predict the number of latent defects when the product becomes available to users
Like to predict:
Fault Density (per phase)
Fault Profile
Initial Failure Rate
Final Failure Rate
So how do we predict what the reliability will be?
Prediction Model steps
You can either make a prediction for each step, or use actual data if that step has already occurred.
There are multiple methodologies for each step.
The issue of predicting failures from faults remains. You need to be careful: it is a weak link.
Steps: Fault Profile & Defect Density -> Initial Failure Rate -> Delivered and On-going Failure Rate
Predicting Fault Density and Distribution
Typical Distribution of Faults
Defect Prediction Models
  Dynamic
    Rayleigh, Exponential, S-Curve Models
    (Fault Injection)
  Static
    Coqualmo Model
    Based on Process
    RL-TR-92-52
    Industry Data, such as SEI Delivered Fault Data
    Local Models: Historical Data
Note: Much of this material is from CS533; it is included as a review.
Typical Fault Distributions
Defect Dynamics and Behaviors
Defects have certain dynamics, behaviors, and patterns which are important to understand in order to understand the dynamics of software development
Projected Software Defects
In general, defect arrivals follow a Rayleigh distribution curve. Based upon project size and past defect densities, you can predict the curve, along with the upper and lower control bounds.
[Figure: defects plotted against time as a Rayleigh curve, with upper-limit and lower-limit curves.]
$$F(t) = 1 - e^{-(t/c)^2}$$
$$f(t) = \frac{2}{t}\left(\frac{t}{c}\right)^2 e^{-(t/c)^2}$$
Recall that F(t) is the cumulative distribution function, f(t) is the probability density function, t is time, and c is a constant.
Defects Detected Tends to Be Similar to Staffing Curves
[Figure: people and defects plotted against time.]
Source: Industrial Strength Software, Putnam & Myers, IEEE, 1997
Which is related to Code Production Rate
[Figure: people, defects, and code production rate plotted against time, with the test period marked.]
And all tend to follow Rayleigh curves.
Note: the period during test is similar to an exponential curve.
Source: Putnam & Myers
Defect Prediction/Estimation Models
Total number of defects
Distribution of defects over time
Defect Model Types: Static and Dynamic
Dynamic models are usually based on statistical distributions of faults found (aka estimated)
Two types:
One that models the entire development process: Rayleigh distributions
One that models the testing/deployment process: Exponential and S-Curve models
Work better in the large, on projects when you need to estimate when/if the project will fail.
Static models use attributes of the program to estimate the number of defects (aka predicted)
Typically of the form y = f(a, b, c, d, e), where y is the defect rate or number of defects, and a, b, c, ... are attributes of the product, process, and/or project
COQUALMO, the RL-TR-92-52 model, industry data, and local historical data are all static models
Usually work better at the module level, to indicate to engineers where to focus
Total Defects and Defect Distribution
If you have fault data, you can estimate the total number of faults and the distribution, via calculations or tools, using predictive models.
Method for the three primary distributions:
Rayleigh
Exponential
S-Curves
If you don't have fault data, you use historical data from other projects and static models
Development Phase Model Applicability
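[Figure: development timeline marked with "Start tracking defects" and "Start independent testing", showing the phases in which static models, the Rayleigh model, and the exponential (reliability growth) model and S-curves apply.]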
Exponential and S-Shaped Distributions
[Figure: cumulative failures found, e.g. F(t), against time, for an exponential curve and an S-shaped curve.]
Exponential and S-Shaped Distributions
[Figure: defects found (arrival distribution f(t)) against time, for an exponential curve and an S-shaped arrival curve.]
S curves: Overview
Resemble an S, with a slow start, then a much quicker discovery rate, and then a slow tail-off at the end
Based upon the view that the software defect removal process consists of defect detection, defect isolation, and defect correction, and all of them take time.
Multiple S-curve models, all based upon the non-homogeneous Poisson process for the arrival distribution
One equation:
$$M(t) = K\left[1 - (1 + \lambda t)\, e^{-\lambda t}\right]$$
where M(t) is the expected number of failures by time t, K is the total number of failures, and $\lambda$ is the failure detection rate
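The S-curve equation can be tabulated with a few lines of Python. This is a minimal sketch, assuming the delayed S-shaped form $M(t) = K[1 - (1 + \lambda t)e^{-\lambda t}]$; the function name and the values K = 100 and lambda = 0.5 are hypothetical and chosen only for illustration.

```python
import math


def s_curve_expected_failures(t, total_failures, detection_rate):
    """Expected cumulative failures M(t) = K * (1 - (1 + lam*t) * exp(-lam*t))."""
    lam = detection_rate
    return total_failures * (1.0 - (1.0 + lam * t) * math.exp(-lam * t))


# Hypothetical inputs: K = 100 total failures, lam = 0.5 per time unit.
curve = [s_curve_expected_failures(t, 100, 0.5) for t in range(0, 21)]

# Failures found in each time step: small at first, then larger, then tailing off.
increments = [later - earlier for earlier, later in zip(curve, curve[1:])]
```

Plotting `curve` gives the S shape, and `increments` gives the matching arrival curve.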
Rayleigh & Exponential Curves
In the family of Weibull curves, which have the form:
$$F(t) = 1 - e^{-(t/c)^m}$$
$$f(t) = \frac{m}{t}\left(\frac{t}{c}\right)^m e^{-(t/c)^m}$$
For m = 1: Exponential Distribution
For m = 2: Rayleigh Distribution
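As a quick check of the two special cases, the Weibull form can be coded directly and compared with the exponential and Rayleigh densities. This is a sketch; the function names are illustrative and t must be greater than zero.

```python
import math


def weibull_cdf(t, c, m):
    """F(t) = 1 - exp(-(t/c)^m)."""
    return 1.0 - math.exp(-((t / c) ** m))


def weibull_pdf(t, c, m):
    """f(t) = (m/t) * (t/c)^m * exp(-(t/c)^m), for t > 0."""
    return (m / t) * ((t / c) ** m) * math.exp(-((t / c) ** m))


def exponential_pdf(t, c):
    """The m = 1 case: (1/c) * exp(-t/c)."""
    return (1.0 / c) * math.exp(-t / c)


def rayleigh_pdf(t, c):
    """The m = 2 case: (2t/c^2) * exp(-(t/c)^2)."""
    return (2.0 * t / c**2) * math.exp(-((t / c) ** 2))
```

Evaluating `weibull_pdf(t, c, 1)` and `weibull_pdf(t, c, 2)` at any positive t should reproduce `exponential_pdf(t, c)` and `rayleigh_pdf(t, c)`.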
Rayleigh Model
Defect Arrival Rate (PDF), the number of defects to arrive at time t:
$$f(t) = K \cdot \frac{2}{c^2}\, t\, e^{-(t/c)^2}$$
Cumulative Defects (CDF), the total number of defects to arrive by time t:
$$F(t) = K\left(1 - e^{-(t/c)^2}\right)$$
Where:
K = total number of injected defects
c is a function of the time $t_{max}$ at which the curve reaches its peak: $c = t_{max}\sqrt{2}$
Note: at $t_{max}$, ~40% of the defects should have been found
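The "~40% at the peak" note follows from the formulas: with $c = t_{max}\sqrt{2}$, $F(t_{max})/K = 1 - e^{-1/2} \approx 0.39$. A small sketch (function names are illustrative, not from the slides) makes this easy to verify for any K and $t_{max}$.

```python
import math


def rayleigh_scale(t_max):
    """c = t_max * sqrt(2), where t_max is the time of peak defect arrival."""
    return t_max * math.sqrt(2.0)


def defect_arrival_rate(t, total_defects, t_max):
    """f(t) = K * (2/c^2) * t * exp(-(t/c)^2)."""
    c = rayleigh_scale(t_max)
    return total_defects * (2.0 / c**2) * t * math.exp(-((t / c) ** 2))


def cumulative_defects(t, total_defects, t_max):
    """F(t) = K * (1 - exp(-(t/c)^2))."""
    c = rayleigh_scale(t_max)
    return total_defects * (1.0 - math.exp(-((t / c) ** 2)))


# Fraction of all defects found by the peak; independent of K and t_max.
fraction_at_peak = cumulative_defects(5.0, 1.0, 5.0)
```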
Using Rayleigh Model
Simple extensions of the model provide
other useful information.
For example, defect priority classes can
be specified as percentages of the total
curve.
This allows the model to predict defects
by severity categories over time
Plotting the graphs/looking at the fxns
If K = 1, F(t) = probability of 1 defect arriving by time t
f(t) = probability of the defect arriving at time t
So... what do these charts mean?
[Figure: two charts of probability against time. "Rayleigh distribution, c = 2" plots F(t) and f(t) for c = 2 over t = 0 to 15. "Rayleigh distribution, c = 10" plots F(t) and f(t) for c = 10 over t = 0 to 20.]
Plotting the graphs/working with the fxns
These are all for K = 1.
For case 1, t_max ≈ 1.4, so c ≈ 1.4 * 1.4 = 1.96 (close to 2)
For example, the probability that the defect will arrive at time 2 is ~.37, and the probability that it has arrived by time 2 is ~.63
For case 2, t_max ≈ 7, so c ≈ 7 * 1.4 = 9.8 (almost 10)
[Figure: the two Rayleigh distribution charts (c = 2 and c = 10), each plotting F(t) and f(t) as probability against time.]
Predicting defects analytically
You can, assuming a distribution and with defect data collected from early in the project, mathematically determine the curve and the equation, as long as you've hit the maximum.
With Rayleigh, you need a maximum
With exponential, you need enough data to see the slope starting to change
Method for using the Rayleigh Distribution
Given n data points, plot them
Determine $t_m$ (the time t at which f(t) is at its maximum)
Then, since you have the formulae
$$F(t) = K\left[1 - e^{-t^2/(2 t_m^2)}\right]$$
$$f(t) = K\left(\frac{1}{t_m}\right)^2 t\, e^{-t^2/(2 t_m^2)}$$
where F(t) is the cumulative number of defects arrived, f(t) is the arrival rate for defects, and K is the total number of defects, you can use these to predict the later arrival of defects.
Example: 594 Faults found by day 9
[Figure: "Faults vs. Days" chart of faults found per day (axis 0 to 100) over days 0 to 10.]
What is the arrival function f(t)?
Need $t_{max}$ and K to determine f(t).
$t_{max}$ -> 7
$$f(t) = K\left(\frac{1}{7}\right)^2 t\, e^{-t^2/(2 \cdot 7^2)}$$
Then what would you do?
Solve for K (you can pick any point; I use t = 1, defects = 20 for simplicity) => $K = 20 \cdot 49 / e^{-1/98} \approx 990$
You now have an equation:
$$f(t) = \frac{990}{49}\, t\, e^{-t^2/98} \approx 20.2\, t\, e^{-0.0102\, t^2}$$
Then plot out the equation and use it to predict arrival rates (and also see how well it matches the data)
Note: this is an extremely simplistic way to solve for the equation. Using more than one point, or tools, would be a good idea.
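The single-point solve can be sketched in Python. The inputs (t = 1, 20 defects, $t_{max}$ = 7) come from the example; the function names are illustrative, and this remains the same simplistic one-point fit the note warns about.

```python
import math


def solve_total_defects(t, observed_rate, t_max):
    """Invert f(t) = K * t / t_max^2 * exp(-t^2 / (2 t_max^2)) for K."""
    return observed_rate * t_max**2 / (t * math.exp(-(t**2) / (2 * t_max**2)))


def arrival_rate(t, total_defects, t_max):
    """Predicted defect arrival rate at time t."""
    return total_defects * t / t_max**2 * math.exp(-(t**2) / (2 * t_max**2))


# One observed point from the example: 20 defects at t = 1, peak at day 7.
k = solve_total_defects(t=1, observed_rate=20, t_max=7)

# Projected arrivals for days 1 to 14, to compare against the plotted data.
predicted = [arrival_rate(day, k, 7) for day in range(1, 15)]
```

Here `k` comes out at roughly 990, matching the hand calculation, and `predicted` peaks on day 7 as the model requires.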
Using this data
Remember that K is the expected total number of faults to be found
You can determine the number of defects found so far by taking the sum of the points on the chart, which happens to equal 594.
This chart and analysis says that:
You expect ~990 faults
Therefore, you have found ~60% of the faults so far
If you wanted to predict when at least 95% had been found, you could either
Solve for (K - F(t))/K <= .05
Use the equations with an Excel model
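The first option has a closed form, because K cancels: setting $1 - e^{-t^2/(2 t_{max}^2)} = p$ gives $t = t_{max}\sqrt{-2\ln(1-p)}$. A minimal sketch follows; the function name and the value t_max = 10 are hypothetical, used only to exercise the formula.

```python
import math


def time_to_fraction_found(fraction, t_max):
    """Time at which the given fraction of all defects has arrived (Rayleigh model)."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return t_max * math.sqrt(-2.0 * math.log(1.0 - fraction))


# Hypothetical project whose defect arrivals peak at t_max = 10.
t95 = time_to_fraction_found(0.95, t_max=10)

# Cross-check: plug t95 back into the cumulative fraction F(t)/K.
fraction_check = 1.0 - math.exp(-(t95**2) / (2 * 10**2))
```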
What did we just do?
We figured out how to predict the total number of defects and the distribution, based upon the defects found to date and assuming a Rayleigh distribution
What would you do if you missed some of the initial data (for example, no one tracked defects found in requirements)? Would this method be useless?
NO: you'd use the maximum, and then project the faults found initially as well.
Rayleigh Model Implementation
SPSS (Regression Module)
SAS
SLIM (by Quantitative Software
Management)
STEER (by IBM)
Now let's look at an exponential distribution
$$F(t) = 1 - e^{-\lambda t}$$
$$f(t) = \lambda e^{-\lambda t}$$
Or, given K total defects,
$$F(t) = K\left(1 - e^{-\lambda t}\right)$$
$$f(t) = K \lambda e^{-\lambda t}$$
Exponential Distributions
[Figure: exponential curve of cumulative failures found against time, levelling off toward K, the total number of defects.]
Exponential Distributions: what is K?
[Figure: the same exponential curve of cumulative failures found against time, with K, the total number of defects, marked.]
Solve equations
Either
solve for a few points (ok)
draw in your own K (not so good)
let Excel figure it out for you with trendlines (better)
OK, now you try one by hand
Given the following data, what are K and $\lambda$?
t              1   2   3   4   5   6   7   8   9
defects found  17  16  15  14  14  13  12  12  12
Answer
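Since $f(t) = K \lambda e^{-\lambda t}$, then
$$\frac{f(a)}{f(b)} = \frac{K \lambda e^{-\lambda a}}{K \lambda e^{-\lambda b}} = e^{\lambda (b-a)}$$
and $K = f(t) / (\lambda e^{-\lambda t})$
Therefore, selecting a = 1 and b = 5, we have $f(1)/f(5) = e^{\lambda(5-1)}$
$17/14 = e^{4\lambda}$
$\ln(17/14) = 4\lambda$, so $\lambda = .048$
And then pick a few points to determine K; try 1 and 5 again
$K_1 = 17/(.048\, e^{-.048}) = 372$
$K_5 = 14/(.048\, e^{-.048 \cdot 5}) = 371$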
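The same two-point solve can be written as a short Python sketch (function names are illustrative). One caveat worth seeing in code: when the unrounded $\lambda$ is used, the two points that determined $\lambda$ necessarily give the same K (roughly 368); the 372 and 371 above differ only because $\lambda$ was rounded to .048. A third point gives an independent check.

```python
import math


def fit_rate(t_a, f_a, t_b, f_b):
    """Solve f(a)/f(b) = exp(lam * (b - a)) for lam."""
    return math.log(f_a / f_b) / (t_b - t_a)


def estimate_total(t, f_t, lam):
    """K = f(t) / (lam * exp(-lam * t))."""
    return f_t / (lam * math.exp(-lam * t))


# Points taken from the table: f(1) = 17 and f(5) = 14.
lam = fit_rate(1, 17, 5, 14)
k1 = estimate_total(1, 17, lam)
k5 = estimate_total(5, 14, lam)

# Independent check using a point that was not used to fit lam: f(9) = 12.
k9 = estimate_total(9, 12, lam)
```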
Using the Rayleigh Model instead of the
exponential distribution
Predicting arrival rates
If you have a projection of the total number of defects (using static models or historical data), you can also predict the arrival rates of defects using the Rayleigh model
Then use it as a plan to manage against
If there are significant deviations, this would cause the manager to investigate and potentially take remedial action
Using The Rayleigh Distribution
This model, given the total number of defects expected, spreads them out over the life cycle of the project in a Rayleigh curve.
Use: to compare projected with actual faults found, to determine project performance
Input is:
Td: total duration of the project (to operational delivery)
Er: total expected number of faults for the lifetime of the project
Errors for each time period t:
$$E_m = \frac{6 E_r}{T_d^2}\, t\, e^{-3 t^2 / T_d^2}$$
NOTE: this assumes ~95% of faults are found before delivery
Source: Putnam and Myers
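A short sketch of this formula in Python shows where the 95% figure comes from: the cumulative fraction implied by the rate is $1 - e^{-3t^2/T_d^2}$, which equals $1 - e^{-3} \approx 0.95$ at $t = T_d$. The function names are illustrative; the 26-week, 1000-fault inputs are the ones used in the example that follows in the deck.

```python
import math


def expected_defects(t, total_defects, duration):
    """E_m = (6 * Er / Td^2) * t * exp(-3 t^2 / Td^2)."""
    return (6.0 * total_defects / duration**2) * t * math.exp(-3.0 * t**2 / duration**2)


def fraction_found_by(t, duration):
    """Cumulative fraction of Er implied by the rate above: 1 - exp(-3 t^2 / Td^2)."""
    return 1.0 - math.exp(-3.0 * t**2 / duration**2)


# Planned defect discovery per week for a 26-week project with 1000 expected faults.
weekly_plan = [expected_defects(week, 1000, 26) for week in range(1, 27)]

# Fraction expected to be found by delivery (t = Td).
found_at_delivery = fraction_found_by(26, 26)
```

Actual weekly defect counts can then be compared against `weekly_plan` to spot significant deviations.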
Rayleigh Model - example
Given a 26-week project and 1000 expected faults, then, using the formula:
[Figure: "Defects Per Week" chart of defects found (axis 0 to 60) against week (0 to 30).]
Try another problem
If you expect to have 100 defects, you think that the time to shipment is 10 weeks, and you've found 60 defects by week 6, are you in good shape or not?
The errors you should be finding in each time period are
$$E_m = \frac{6 \cdot 100}{10^2}\, t\, e^{-3t^2/10^2} = 6t\, e^{-3t^2/100}$$
You should have found: total for weeks 1 to 6 = Sum($E_m$) for m = 1 to 6 = 71.45 (I used a spreadsheet to calculate).
So either your software is better than you expected, or you are behind in finding defects.
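The spreadsheet step can be reproduced in a few lines. This is a sketch with illustrative names; the inputs (100 defects, 10 weeks, 60 found by week 6) are those of the problem.

```python
import math


def expected_defects(t, total_defects, duration):
    """E_m = (6 * Er / Td^2) * t * exp(-3 t^2 / Td^2)."""
    return (6.0 * total_defects / duration**2) * t * math.exp(-3.0 * t**2 / duration**2)


# Planned total for weeks 1 through 6 with Er = 100 and Td = 10.
expected_by_week_6 = sum(expected_defects(week, 100, 10) for week in range(1, 7))

actual_by_week_6 = 60
shortfall = expected_by_week_6 - actual_by_week_6
```

A positive `shortfall` is what prompts the question of whether the software is better than expected or defect finding is behind plan.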
Other Tools
If you don't like the calculations, there
are tools (such as SLIM and STEER)
which, given the arrival rate data, will
help you predict the remaining defects
and arrival patterns.
Experiential Data and Recommendations:
Putnam and Myers (1992) found that total defects projected using Rayleigh curves were within 5% to 10%. Others were not as close, but may have had dirty data.
With small projects, you have a smaller number of data points and, therefore, less confidence.
Using their STEER software tool, IBM Federal Systems in Gaithersburg, MD estimated latent defects for 8 projects and compared the estimates with actual data collected for the first year in the field: very close.
Some data suggests that m = 1.8 for Weibull curves may be best
Kan's recommendation: use as many models as possible to predict, compare them with each other, track results, and see what works best.
Dynamic Model Distribution Summary
Formal parametric models for projecting latent software defects when development is complete and the project is ready to ship
Encompasses both defect prevention and early defect removal
Predicting Fault Density and Distribution
Typical Distribution of Faults
Defect Prediction Models
  Dynamic
    Rayleigh, Exponential, S-Curve Models
  Static
    Coqualmo Model
    Based on Process
    RL-TR-92-52
    Industry Data, such as SEI Delivered Fault Data
    Local Models: Historical Data
COQUALMO, by Chulani and Boehm
Defect Analysis Tool from USC
Extension to the COCOMO estimation model (Software Sizing Model developed by Boehm and others at USC)
Based on the Defect Insertion/Removal model
Tool/Paper available on our course website
Coqualmo is a model which predicts Delivered Defect Density (per KLOC or per FP)
[Figure: diagram labelled "Defects In", "Defects out", and "Delivered Defect Density".]
Based upon a variety of factors, and which you can tune based on your own experience.
Coqualmo Models
2 separate models
Defect Introduction model. Inputs: size; software platform, product, personnel, and project attributes. Output: number of non-trivial requirements, design, and coding defects introduced.
Defect Removal model. Inputs: the defects introduced; defect removal activities (automated analysis, reviews, testing and tools). Output: number of defects per KLOC.
Source: COCOMO II
Input Parameters
For defect introduction, it uses the COCOMO II project
descriptors (size, personnel capability and experience,platform characteristics, project practices, and product
characteristics such as complexity and required
reliability) to estimate the number of requirements,
design, and code defects introduced into the project.
For defect removal, it uses ratings of a project's level
of use of analysis tools, peer reviews, and execution
testing, to determine what fraction of the introduced
defects are removed. Its estimates to date are consistent
with general project experience and a small number of
detailed project data points.
COQUALMO: More detail
Quantitative model for defect introduction and removal
Acronym for Constructive Quality Model
Chulani and Boehm at USC, 1999
Consistent with the COCOMO model by Boehm
Current data is from the COCOMO clients and expert opinion
Needs additional data from more projects to tune the model
Defects Introduced (DI):
$$DI = \sum_{j=1}^{3} A_j \cdot (\text{Size})^{B_j} \cdot QAF_j$$
where, for each of the three phases j (rqmts, design, coding):
A is the multiplicative constant
B is initially set to 1 and accounts for economies of scale
QAF is the quality factor that takes into account 21 defect introduction factors (Platform, Product, Personnel, and Project)
DI equation, in English
What does that equation say?
That the number of defects introduced is the sum of the number of defects introduced in each of requirements, design, and coding
The number of defects introduced in a given phase = A * (size) ^ B * QAF, where
A is based upon which phase
B is based upon size
QAF is based upon the quality of the process, platform, etc.
Example (Using dummy data):
Assume that you calculated the QAF for each phase, that you have the following values, and that the model has given you the values for A as shown:

Phase   QAF  A
Rqmts   1.2  10
Design  1    20
Coding  0.5  30

This says that the Defects Introduced by phase would be:

Phase   QAF  A   DI
Rqmts   1.2  10  12
Design  1    20  20
Coding  0.5  30  15

Note that the QAFs imply a requirements activity worse than average and a coding activity better than average
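The dummy-data example can be reproduced with a small sketch of the DI formula. The function name and data layout are illustrative, and two assumptions are made that the table leaves implicit: size is 1 KSLOC and B = 1 for every phase.

```python
def defects_introduced(size_ksloc, phases):
    """Per-phase DI = A * size^B * QAF; phases maps name -> (A, B, QAF)."""
    return {
        name: a * size_ksloc**b * qaf
        for name, (a, b, qaf) in phases.items()
    }


# Dummy values from the example table; B = 1 and size = 1 KSLOC are assumptions.
dummy_phases = {
    "Rqmts": (10, 1.0, 1.2),
    "Design": (20, 1.0, 1.0),
    "Coding": (30, 1.0, 0.5),
}

by_phase = defects_introduced(1.0, dummy_phases)
total_introduced = sum(by_phase.values())
```

Changing a single QAF and rerunning gives a simple "what if" comparison.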
QAF: Quality Assessment Factor
The QAF is a factor which is the product of 21 defect introduction drivers, such as analyst capability, programmer capability, required reliability of the system, etc.
Defects Introduced
Nominal values, per KSLOC, are:
DI (requirements) = 10
DI (design) = 20
DI (coding) = 30
DI (total) = 60
E.g., for every 1K lines of code, the model predicts that, assuming a nominal situation, there would typically be 60 defects injected into the code, 10 of which were requirements defects, 20 design defects, and 30 coding defects.
Process maturity had the highest impact on defect introduction: with everything else held constant, it varies the result by a factor of 2.5, which says that if you have a very good process, you significantly reduce the number of defects introduced
COQUALMO: Defect Removal
Initial values determined by experts using the 2-Delphi technique
Looked at three different removal techniques: Automated Analysis, People Reviews, and Execution Testing and Tools
Rated %DRE for removing defects for 6 levels of effectiveness of each technique for each phase (rqmts, design, coding)
Computed residual defects as:
If all techniques Very Low Effectiveness = 60 defects per KSLOC
If all techniques Extra High Effectiveness = 1.57 defects per KSLOC
If all techniques Nominal = 14.3 defects per KSLOC
Summary on COQUALMO model
Mathematical model which takes as input
Your view of your defect injection drivers
Your view of your defect removal drivers
Gives you a projection of the number of defects remaining in your system at any phase
Can be used to estimate the impact of driver changes on defect density
"what if" analysis
"improvement investment" analysis
Other similar models are available
RL-TR-92-52
Seems to be a primary reference and model for both defect density and fault density
Could not obtain a copy of the report; I believe it is very similar to CoQualmo
Key fault parameters for predicting defect density are:
Application Type & Difficulty: 2 to 14
Development Organization: .5 to 2
Software Complexity: .8 to 1.5
Compliance with Design Rules: .75 to 1.5
Note: 1 adds them
Predicting Fault Density and Distribution
Typical Distribution of Faults
Defect Prediction Models
  Dynamic
    Rayleigh, Exponential, S-Curve Models
  Static
    Coqualmo Model
    Based on Process
    RL-TR-92-52
    Industry Data, such as SEI Delivered Fault Data
    Local Models: Historical Data
SEI Defect Removal
Cumulative % of defects removed thru
acceptance test:
SEI Level 2: 25.5%
SEI Level 3: 41.5%
SEI Level 4: 62.3%
SEI Level 5: 87.3%
Source: Diaz & King, 2002 (in Kan)
Industry data
CMM Approach
Measure: average defects per function point
Typical defect potential and delivered defects by SEI CMM level:

SEI CMM Level  Potential  Delivered
1              5.0        .75
2              4.0        .44
3              3.0        .27
4              2.0        .14
5              1.0        .05

Source: Capers Jones, 1995
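One way to read these figures is as an overall removal efficiency per level. The sketch below assumes removal efficiency = 1 - delivered / potential; that ratio is an interpretation added here for illustration, not something stated on the slide, and the names are hypothetical.

```python
# (potential, delivered) defects per function point, by SEI CMM level,
# as listed in the Capers Jones (1995) figures above.
CMM_DEFECTS_PER_FP = {
    1: (5.0, 0.75),
    2: (4.0, 0.44),
    3: (3.0, 0.27),
    4: (2.0, 0.14),
    5: (1.0, 0.05),
}


def removal_efficiency(potential, delivered):
    """Assumed definition: fraction of potential defects removed before delivery."""
    return 1.0 - delivered / potential


efficiency_by_level = {
    level: removal_efficiency(potential, delivered)
    for level, (potential, delivered) in CMM_DEFECTS_PER_FP.items()
}
```

Under that assumption the implied efficiency rises steadily with CMM level, from 85% at level 1 to 95% at level 5.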
Industry Data
Industry Approach
Measure: average defects per function point
Delivered defects per industry:
System Software - .4
Commercial Software - .5
Information Software - 1.2
Military Software - .3
Overall average - .65
Source: Capers Jones, 1995
Defect Data By Application Domain - Reifer
Application Domain | Number of Projects | Error Range (Errors/KESLOC) | Normative Error Rate (Errors/KESLOC) | Notes
Automation | 55 | 2 to 8 | 5 | Factory automation
Banking | 30 | 3 to 10 | 6 | Loan processing, ATM
Command & Control | 45 | 0.5 to 5 | 1 | Command centers
Data Processing | 35 | 2 to 14 | 8 | DB-intensive systems
Environment/Tools | 75 | 5 to 12 | 8 | CASE, compilers, etc.
Military - All | 125 | 0.2 to 3 | < 1.0 | See subcategories
Military - Airborne | 40 | 0.2 to 1.3 | 0.5 | Embedded sensors
Military - Ground | 52 | 0.5 to 4 | 0.8 | Combat center
Military - Missile | 15 | 0.3 to 1.5 | 0.5 | GNC system
Military - Space | 18 | 0.2 to 0.8 | 0.4 | Attitude control system
Scientific | 35 | 0.9 to 5 | 2 | Seismic processing
Telecom | 50 | 3 to 12 | 6 | Digital switches
Test | 35 | 3 to 15 | 7 | Test equipment, devices
Trainers/Simulations | 25 | 2 to 11 | 6 | Virtual reality simulator
Web Business | 65 | 4 to 18 | 11 | Client/server sites
Other | 25 | 2 to 15 | 7 | All others
Domain Data Comments
Defect rates in military systems are much smaller due to the safety requirements
Defect rates after delivery tend to be cyclical
with each version released. They initially are
high, and then stabilize around 1 to 2 defects
per KLOC in systems with longer lifecycles (>
5 years). Web Business systems tend to
have shorter lifecycles (
Local History
Simplest: Defect Densities and Defect Removal Efficiencies from other projects
Remember from 533 what DRE is: the % of defects removed in each development phase
Prediction Model 2nd Step
Now at 2nd step -- predicting failure rate from defects
Steps: Fault Profile & Defect Density -> Initial Failure Rate -> Delivered and On-going Failure Rate
Predicting Failure Rate from Fault Density
Issues
Musa Model
Using Past Projects' Data
Issue is that failures are a function of
Faults
Environment
System Usage & Mix
If you can make these the same as your target environment, then the projection should work out better
Typically you can't have those similar to the operational environment until operational testing, so before that time, use empirical data.
The Musa Prediction Method
For predicting failure rate given a fault
density -- developed for predicting
expected failure rate in system test
Caveat: This method seems like magic to me. But it does have an empirical basis.
Musa Model Underlying Concepts
Each fault is embodied in machine instructions
There is a probability that the faulty machine instructions will cause a failure
Therefore, if you know the number of faults remaining, the number of machine instructions for the program, the speed of the machine, and the probability, you can predict the arrival rate of failures.
Musa Prediction Model I/O
Input: Fault Density, Size of Program, Processor Speed, and a probability that a given faulty line of code will cause a failure when it is executed, i.e., a ratio of failures to faults. This can either come from past history, or you can use the default (4.2*10^-7)
Output: Expected Failure Rate
Musa Model for Failure Rate
Let w = number of faults
Let I = number of object code instructions
Let r = processor speed in instructions per sec
Let L = expected failure rate (i.e., lambda)
K = "magic" constant = 4.2*10^-7, the probability that a given faulty line of code will cause a failure when it is executed
Then L = r*K*w/I
The Key is obviously K
And where does it come from?
If you have other similar programs/projects, generate it from those (K = L*I/(r*w))
Interestingly, Musa's data across multiple projects has K varying only slightly, with a range of 1*10^-7 to 7.5*10^-7
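Calibrating K from a past project is a one-line rearrangement of L = r*K*w/I. The sketch below uses made-up numbers for the past project (they are hypothetical, not data from Musa or the slides) and checks the result against the range quoted above.

```python
def calibrate_k(failure_rate, instructions, speed, faults):
    """K = L * I / (r * w), from a past project's observed failure rate."""
    return failure_rate * instructions / (speed * faults)


def in_reported_range(k):
    """True if K falls within the 1e-7 to 7.5e-7 range quoted from Musa's data."""
    return 1e-7 <= k <= 7.5e-7


# Hypothetical past project: 0.02 failures per execution second observed,
# 1,000,000 object instructions, a 50 MIPS processor, 1000 known faults.
k_local = calibrate_k(failure_rate=0.02, instructions=1_000_000,
                      speed=50e6, faults=1000)
plausible = in_reported_range(k_local)
```

A locally calibrated K far outside the quoted range would be a reason to recheck the input data before using it for prediction.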
Example: Musa Model
Let w = number of faults
Let I = number of object code instructions
Let r = processor speed in instructions per sec
Let L = expected failure rate (i.e., lambda)
K = "magic" constant = 4.2*10^-7 failures per fault
Then L = r*K*w/I
Assume a 100 MIPS machine; 5 defects per KLOC; 100K source lines of C++
Then what is the expected number of failures per execution second?
Class Example: Musa Model
Let w = number of faults
Let I = number of object code instructions
Let r = processor speed in instructions per sec
K = "magic" constant = 4.2*10^-7 failures per fault
Then L = r*K*w/I
Assume a 100 MIPS machine; 5 defects per KLOC; 100K source lines of C++
Then w = 5*100 = 500 total faults
I = 100K*6 (from table in Rome Notebook) = 600K lines of object code
$$L = \frac{(100 \times 10^6)(4.2 \times 10^{-7})(500)}{6 \times 10^5} = 3.5 \times 10^{-2} = .035 \text{ failures per execution sec}$$
= 2.1 failures per minute
So this says that the initial failure rate is estimated to be 2.1 failures per EXECUTION minute.
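The class example can be checked with a short sketch of L = r*K*w/I. The function name is illustrative; the expansion ratio of 6 object instructions per C++ source line is the value the example takes from the Rome Notebook table.

```python
def musa_failure_rate(faults, instructions, speed, k=4.2e-7):
    """L = r * K * w / I, in failures per execution second."""
    return speed * k * faults / instructions


# Inputs from the class example.
faults = 5 * 100                # 5 defects per KLOC * 100 KLOC
instructions = 100_000 * 6      # 100K source lines * 6 object instructions per line
speed = 100e6                   # 100 MIPS

rate_per_second = musa_failure_rate(faults, instructions, speed)
rate_per_minute = rate_per_second * 60
```

Both values agree with the hand calculation: .035 failures per execution second, or 2.1 per execution minute.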
Musa Model Summary
The theory behind Musa's model is that the faults are embedded in the code, and that the probability of the faults becoming failures is dependent upon the fault density and the frequency of the code being executed.
Kan's Empirical Data
For system platforms to have > 99.9+%
availability, the defect level has to be
Prediction vs. Estimation
Model Overview
Predicting Defect Densities
Predicting Failures From Defect Densities
Estimating Reliability from Testing
Execution Time vs. Calendar Time
Estimating Failure Models
Reliability Growth
Reliability Estimation
Tools
Homework
For the Rayleigh curve example, solve for t such that 95% of the defects have been found.
Play with Coqualmo so you can actually use it (on the website, in tools)
Understand the parameters (you may need to look up the COCOMO model to understand them)
Read the articles on the website
Do the project on the website.