1
Software Dependability Assessment:
A Reality or A Dream?
Karama Kanoun
The Sixth IEEE International Conference on Software Security and Reliability, SERE, Washington D.C., USA, 20-22 June, 2012
2
Complexity
Economic pressure
[From J. Gray, 'Dependability in the Internet era']

Availability   Outage duration/yr
0.999999       32 s
0.99999        5 min 15 s
0.9999         52 min 34 s
0.999          8 h 46 min
0.99           3 d 16 h
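The outage durations follow directly from the availability figures; a minimal sketch of the arithmetic (assuming a 365-day year):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # non-leap year

def yearly_outage_seconds(availability):
    """Expected cumulative outage per year for a given steady-state availability."""
    return (1 - availability) * SECONDS_PER_YEAR

for a in (0.999999, 0.99999, 0.9999, 0.999, 0.99):
    print(a, round(yearly_outage_seconds(a)), "s")  # 32 s, 315 s, 3154 s, ...
```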
[Figure: availability trend from 1950 to 2010 (9% to 99.9999%) for computer systems, telephone systems, cellphones, and the Internet]
3
Examples of Historical Failures
June 1980: False alerts at the North American Air Defense Command (NORAD)
June 1985 - Jan. 1987: Excessive radiotherapy doses (Therac-25)
Aug. 1986 - 1987: The "Wily Hacker" penetrates several dozen sensitive computing facilities
15 January 1990: 9-hour outage of the long-distance phone network in the USA
February 1991: Patriot missile fails to intercept a Scud (Iraq, Gulf War)
Nov. 1992: Crash of the communication system of the London Ambulance Service
26-27 June 1993: Credit-card authorization denials, France
4 June 1996: Failure of the first Ariane 5 flight
Feb. 2000: Distributed denial-of-service attacks on large web sites: Yahoo, eBay, ...
Aug. 2003: Electricity blackout in the USA and Canada
Oct. 2006: 83,000 email addresses, credit-card details, and banking transaction files stolen in the UK
Aug. 2008: Air traffic control computing system failure (USA)
Sep. 2009: Third service interruption of Gmail, Google's email service, during 2009
[Figure: classification of these failures by fault cause (physical, development, interaction), impact (localized, distributed), and the dependability properties affected (availability/reliability, safety, confidentiality)]
4
Accidental faults: number of failures
[consequences and outage durations depend upon the application]

Faults               Dedicated computing systems           Controlled systems
                     (e.g., transaction processing,        (e.g., civil airplanes, phone network,
                     electronic switching,                 Internet frontend servers)
                     Internet backend servers)
                     Rank   Proportion                     Rank   Proportion
Physical internal    3      ~10%                           2      15-20%
Physical external    3      ~10%                           2      15-20%
Human interactions   2      ~20%                           1      40-50%
Development          1      ~60%                           2      15-20%
5
Loss of availability: top ten incidents (Global Information Security Survey 2004, Ernst & Young)
Percentage of respondents indicating that the following incidents resulted in an unexpected or unscheduled outage of their critical business systems:
Hardware failures
Major virus, Trojan horse, or Internet worms
Telecommunications failure
Software failure
Third-party failure, e.g., service provider
System capacity issues
Operational errors, e.g., wrong software loaded
Infrastructure failure, e.g., fire, blackout
Former or current employee misconduct
Distributed Denial of Service (DDoS) attacks
[Bar chart: percentages range from 0% to 80%]
Non-malicious: 76% / Malicious: 24%
6
Why Software Dependability Assessment?

User / customer
• Confidence in the product
• Acceptable failure rate

Developer / supplier
• During production: reduce the number of faults (zero defect), optimize development, increase operational dependability
• During operation: maintenance planning
• Long term: improve the software dependability of the next generations
7
Approaches to Software Dependability Assessment
Assessment based on software characteristics
• Language, complexity metrics, application domain, …
Assessment based on measurements
• Observation of the software behavior
Assessment based on controlled experiments
• Ad hoc vs standardized benchmarking
Assessment of the production process
• Maturity models
8
Outline of the Presentation
Assessment based on software characteristics
• Language, complexity metrics, application domain, …
Assessment based on measurements
• Observation of the software behavior
Assessment based on controlled experiments
• Ad hoc vs standardized benchmarking
Assessment of the production process
• Maturity models
9
Software Dependability Assessment — Difficulties

Corrections: a non-repetitive process
No simple relationship between failures and corrections
Continuous evolution of the usage profile
• According to the development phase
• Within a given phase
Overselling of early reliability "growth" models
Seen as a judgement on the quality of the software developers
What are software dependability measures?
• Number of faults, fault density, complexity?
• MTTF, failure intensity, failure rate?
10
Dependability Measures?
Static measures: complexity metrics, number of faults, fault density, ...
Dynamic measures, characterizing the occurrence of failures and corrections: failure intensity, failure rate, MTTF, restart time, recovery time, availability, ... (conditioned by the usage profile & environment)
11
Number of Faults vs MTTF

Percentage of faults and corresponding MTTF (published by IBM):

MTTF (years)   5000    1580    500    158    50     15.8   5      1.58
Product 1      34.2    28.8    17.8   10.3   5.0    2.1    1.2    0.7
Product 2      34.3    28.0    18.2   9.7    4.5    3.2    1.5    0.7
Product 3      33.7    28.5    18.0   8.7    6.5    2.8    1.4    0.4
Product 4      34.2    28.5    18.7   11.9   4.4    2.0    0.3    0.1
Product 5      34.2    28.5    18.4   9.4    4.4    2.9    1.4    0.7
Product 6      32.0    28.2    20.1   11.5   5.0    2.1    0.8    0.3
Product 7      34.0    28.5    18.5   9.9    4.5    2.7    1.4    0.6
Product 8      31.9    27.1    18.4   11.1   6.5    2.7    1.4    1.1
Product 9      31.2    27.6    20.4   12.8   5.6    1.9    0.5    0.0
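Read together, these proportions show why a fault count says little about dependability: the few short-MTTF faults dominate the failure intensity. A back-of-the-envelope check (a sketch only, assuming each fault in a class contributes 1/MTTF of that class to the overall failure intensity):

```python
# Product 1: percentage of faults per MTTF class (from the table above)
mttf_years = [5000, 1580, 500, 158, 50, 15.8, 5, 1.58]
fault_pct  = [34.2, 28.8, 17.8, 10.3, 5.0, 2.1, 1.2, 0.7]

# Relative contribution of each class to the failure intensity,
# assuming each fault manifests at rate 1/MTTF.
contrib = [p / m for p, m in zip(fault_pct, mttf_years)]
total = sum(contrib)
for m, p, c in zip(mttf_years, fault_pct, contrib):
    print(f"MTTF {m:>6} y: {p:>5.1f}% of faults -> {100 * c / total:4.1f}% of failures")
# The two shortest-MTTF classes (about 2% of the faults) account for
# roughly two thirds of the expected failures.
```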
12
Assessment Based on Measurements

Data collection: times to failures / number of failures, failure impact, failure origin, corrections
Data processing:
• Descriptive statistics
• Trend analysis
• Modelling / prediction (non-stationary processes, stochastic models, model validation)
Outputs:
• Trend evolution
• Failure modes
• MTTF / failure rate, MTTR, availability
• Correlations
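As a reminder of how the last outputs relate (a standard relation, not specific to this study): steady-state availability A = MTTF / (MTTF + MTTR).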
14
Why Trend Analysis?

[Figure: failure intensity vs. time; corrections are applied across successive versions V(i,1), V(i,2), ..., V(i,k); changes (usage profile, environment, specifications, ...) start a new series V(i+1,1), V(i+1,2), V(i+1,3), V(i+1,4)]
15
Example: Electronic Switching System

[Figure: failure intensity (0-25 per month) over 31 months, spanning validation then operation; a companion plot shows the number of systems in the field, from about 10 to 40, over months 11-31]
16
Electronic Switching System (Cont.)

[Figure: observed cumulative number of failures (0 to about 220) over 31 months, validation then operation]
17
Electronic Switching System (Cont.)

[Figure: observed cumulative number of failures (0 to about 220) over 31 months, validation then operation, together with the fitted hyperexponential model]

Application of the hyperexponential model → maintenance planning
Retrodictive and predictive assessment:
• Observed number of failures over months 20-32: 33
• Predicted number of failures over months 21-32: 37
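For context, a minimal sketch of the failure-rate shape usually associated with a hyperexponential reliability-growth model: the rate decreases from an initial value towards a non-zero residual rate. The parameterization and names below are illustrative, not the exact model calibrated in this study.

```python
import math

def hyperexp_failure_rate(t, omega, zeta_sup, zeta_inf):
    """Failure rate of a two-stage hyperexponential growth model.

    omega    : weight of the high-failure-rate stage (0 <= omega <= 1)
    zeta_sup : initial (upper) failure rate
    zeta_inf : residual (lower) failure rate, the asymptote as t grows
    """
    num = omega * zeta_sup * math.exp(-zeta_sup * t) + (1 - omega) * zeta_inf * math.exp(-zeta_inf * t)
    den = omega * math.exp(-zeta_sup * t) + (1 - omega) * math.exp(-zeta_inf * t)
    return num / den

# Illustrative parameters: the rate decays towards zeta_inf (e.g. ~5.7e-5 /h)
for t in (0, 1e3, 1e4, 1e5):
    print(t, hyperexp_failure_rate(t, omega=0.3, zeta_sup=1e-3, zeta_inf=5.7e-5))
```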
18
Electronic Switching System (Cont.)

Failure intensity and failure rate in operation (for an average system)

[Figure: failure intensity (0-2.5 per month), months 17-31: observed values and estimate by the hyperexponential model]

Residual failure rate: 5.7 × 10^-5 /h
19
Electronic Switching System (Cont.)

Failure intensity of the software components

[Figure: four plots (Telephony, Interface, Defense, Management) showing the observed failure intensity and the hyperexponential model fit, months 17-31]
20
Electronic Switching System (Cont.)

Failure intensity and failure rate in operation (for an average system)

Component     Residual failure rate
Telephony     1.2 × 10^-6 /h
Interface     2.9 × 10^-5 /h
Defense       1.4 × 10^-5 /h
Management    8.5 × 10^-6 /h
Sum           5.3 × 10^-5 /h

[Figure: failure intensity (0-2.5), months 17-31: observed values, the system-level hyperexponential estimate (residual failure rate 5.7 × 10^-5 /h), and the sum of the component failure intensities estimated by the hyperexponential model]
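The component figures are consistent with the system-level estimate; a one-line check of the sum:

```python
rates = {"Telephony": 1.2e-6, "Interface": 2.9e-5, "Defense": 1.4e-5, "Management": 8.5e-6}
print(sum(rates.values()))  # 5.27e-05 /h, i.e. ~5.3e-5, close to the 5.7e-5 system-level estimate
```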
21
Other Example: Operating System in Operation

Data = times to failure during operation

[Figure: trend factor u(i), between -2 and +2, plotted against the failure number (1 to ~181); a companion plot shows the mean time to failure (roughly 100,000 to 300,000) against the failure number]

Trend evolution → stable dependability
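A minimal sketch of how such a trend plot can be produced, assuming u(i) is the usual Laplace trend factor computed from the observed interfailure times (function and variable names are illustrative):

```python
import math

def laplace_factor(interfailure_times):
    """Laplace trend factor u(i) computed after each failure.

    interfailure_times: successive times between failures.
    Returns the u(i) values, defined from the second failure on.
    """
    u = []
    failure_instants = []
    t = 0.0
    for x in interfailure_times:
        t += x
        failure_instants.append(t)
        i = len(failure_instants)
        if i < 2:
            continue
        mean_instant = sum(failure_instants[:-1]) / (i - 1)
        u.append((mean_instant - t / 2) / (t * math.sqrt(1.0 / (12 * (i - 1)))))
    return u

# Example: increasing interfailure times give negative values (reliability growth)
print(laplace_factor([5, 8, 12, 20, 35, 60]))
```

Values close to 0 indicate stable reliability (as in the plot above); clearly negative values indicate reliability growth, clearly positive values reliability decrease.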
22
Validity of Results

Early validation: trend analysis → development follow-up
End of validation: trend analysis + assessment
• Operational profile?
• Enough data?
• Limits: 10^-3/h - 10^-4/h
Operation: trend analysis + assessment → high relevance
Examples:
• E10-B (Alcatel ESS): 1400 systems, 3 years
  λ = 5 × 10^-6 /h, λc = 10^-7 /h
• Nuclear I&C systems: 8000 systems, 4 years
  λ: 3 × 10^-7 /h - 10^-7 /h, λc = 4 × 10^-8 /h
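These examples show why operational data over large populations is needed to assess such low rates; a rough order-of-magnitude check, assuming roughly continuous operation of every system:

```python
systems, years = 1400, 3                  # E10-B example from the slide
cumulated_hours = systems * years * 8760  # ~3.7e7 h of cumulated operation
failure_rate = 5e-6                       # /h, the estimate quoted above
print(cumulated_hours, failure_rate * cumulated_hours)  # ~184 expected failures
```

With only a handful of systems or a few months of observation, rates around 10^-6/h could not be estimated with any confidence, which is consistent with the 10^-3/h - 10^-4/h limit quoted for the end of validation.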
23
Research Gaps

Applicability to new classes of systems
• Service-oriented systems
• Adaptive and dynamic software systems → on-line assessment
Industry involvement
• Confidentiality of real-life data
• Cost (perceptible overhead, invisible immediate benefits)
Case of Off-The-Shelf software components?
Applicability to safety-critical systems
• During development
• Accumulation of experience → software process improvement → assessment of the software process
24
Off-The-Shelf Software Components — Dependability Benchmarking

No information available from software development
Evaluation based on controlled experimentation: ad hoc vs. standard
• Ad hoc experimentation → internal purpose
• Standard dependability benchmarking → results available & reusable
Evaluation of dependability measures / features in a non-ambiguous way → comparison
Properties: reproducibility, repeatability, portability, representativeness, acceptable cost
25
Benchmarks of Operating Systems
Which OS for my computer system? (Windows, Linux, Mac)

Limited knowledge: functional description
Limited accessibility and observability
→ Black-box approach: robustness benchmark
26
Robustness Benchmarks

[Figure: layered view: application / API / operating system, device drivers / hardware; faults are injected at the API level and the OS outcomes are observed]

Faults = corrupted system calls
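As an illustration of the fault model (not the actual benchmark implementation, whose interception layer and parameter sets are not described here), a minimal sketch of corrupting system-call parameters and classifying the OS response:

```python
import errno
import os

# Illustrative corrupted parameter values for one system call (os.read)
INVALID_FDS = [-1, 10**6]     # out-of-range file descriptors
INVALID_SIZES = [-1, 2**62]   # negative or absurdly large buffer sizes

def outcome(fd, size):
    """Issue one (corrupted) call and classify the observed outcome."""
    try:
        os.read(fd, size)
        return "no error reported"
    except OSError as exc:
        return f"error code returned ({errno.errorcode.get(exc.errno, exc.errno)})"
    except Exception as exc:
        return f"rejected in user space ({type(exc).__name__})"

for fd in INVALID_FDS:
    for size in INVALID_SIZES:
        print(f"read(fd={fd}, size={size}) -> {outcome(fd, size)}")
```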
27
OS Response Time

[Figure: bar charts of OS response time (0-700) in the presence of corrupted system calls vs. without corruption, for Windows (NT4, 2000, XP, NT4 Server, 2000 Server, 2003 Server) and Linux (2.2.26, 2.4.5, 2.4.26, 2.6.6)]
28
Mean Restart Time

[Figure: bar charts of mean restart time (seconds, 0-120) in the presence of corrupted system calls vs. without corruption, for Windows (NT4, 2000, XP, NT4 Server, 2000 Server, 2003 Server) and Linux (2.2.26, 2.4.5, 2.4.26, 2.6.6)]
29
Detailed Restart Time

[Figure: restart time (seconds, 50-250) for each experiment (0-400), for Windows XP and Linux 2.2.26; some experiments trigger a disk check, visibly lengthening the restart]
30
More on Windows family

[Figure: restart time (seconds, 50-250) per experiment (0-400) for Windows NT4, 2000, and XP]

Impact of the application state after failure
31
Benchmark Characteristics and Limitations

A benchmark should not replace software test and validation
Non-intrusiveness of robustness benchmarks (faults are injected outside the benchmark target)
Use of the available inputs and outputs → impact on the measures
Balance between cost and degree of confidence
Number of dependability benchmark measures >> number of performance benchmark measures
32
Maturity of Dependability Benchmarks

Performance benchmarks ("competition" benchmarks):
• Mature domain
• Cooperative work
• Integrated into system development
• Accepted by all actors for competitive system comparison

Dependability benchmarks ("ad hoc" benchmarks):
• In their infancy
• Isolated work
• Not explicitly addressed
• Acceptability?
33
Software Process Improvement (SPI)

Measurements and controlled experiments → data collection → data processing → measures
Objectives of the analysis:
• Capitalize on experience
• Use data related to similar projects
• Feed results back to the software development process
34
Examples of Benefits from SPI Programs
AT&T (quality program):
• Customer-reported problems (maintenance program) divided by 10
• System test interval divided by 2
• New product introduction interval divided by 3
Fujitsu (concurrent development process):
• Release cycle reduction: 75%
Motorola (Arlington Heights), mix of methods:
• Fault density reduction: 50% within 3.5 years
Raytheon (Electronic Systems), CMM:
• Rework cost divided by 2 after two years of experience
• Productivity increase: 190%
• Product quality multiplied by 4
35
Process improvement → dependability improvement & cost reduction!

[Figure: cost vs. dependability: basic development cost, cost of scrap / rework, and total cost, with (solid line) and without (dashed line) process improvement approaches]