Transcript
Page 1:

Note content copyright © 2004 Ian Sommerville. NU-specific content copyright © 2004 M. E. Kabay. All rights reserved.

Critical Systems Development

IS301 – Software Engineering
Lecture #27 – 2004-11-03
M. E. Kabay, PhD, CISSP
Assoc. Prof. Information Assurance
Division of Business & Management, Norwich University
mailto:[email protected] V: 802.479.7937

Page 2:

First, take a deep breath. You are about to enter the fire-hose zone.

Page 3:

Objectives

To explain how fault tolerance and fault avoidance contribute to the development of dependable systems

To describe characteristics of dependable software processes

To introduce programming techniques for fault avoidance

To describe fault tolerance mechanisms and their use of diversity and redundancy

Page 4:

Topics covered

Dependable processes
Dependable programming
Fault tolerance
Fault-tolerant architectures

Page 5:

Dependable Software Development

Programming techniques for building dependable software systems.

Page 6:

Software Dependability

In general, software customers expect all software to be dependable

For non-critical applications, they may be willing to accept some system failures

Some applications, however, have very high dependability requirements; special programming techniques are required to meet them

Page 7:

Dependability Achievement

Fault avoidance
Software developed so human error avoided and system faults minimized
Development process organized so faults in software detected and repaired before delivery to customer

Fault tolerance
Software designed so faults in delivered software do not result in system failure

Page 8:

Diversity and Redundancy

Redundancy
Keep more than one version of a critical component available so that if one fails, a backup is available.

Diversity
Provide the same functionality in different ways so that the versions will not fail in the same way.

However, adding diversity and redundancy adds complexity, and this can increase the chances of error.

Some engineers advocate simplicity and extensive verification & validation (V&V) as a more effective route to software dependability.

Page 9:

Diversity and Redundancy Examples

Redundancy
Where availability is critical (e.g., in e-commerce systems), companies normally keep backup servers and switch to these automatically if failure occurs.

Diversity
To provide resilience against external attacks, different servers may be implemented using different operating systems (e.g., Windows and Linux).

Page 10:

Fault Minimization

Current methods of software engineering now allow for the production of fault-free software

Fault-free software means software that conforms to its specification

It does NOT mean software that will always perform correctly

Why not? Because of specification errors.

Page 11:

Cost of Producing Fault-Free Software (1)

Very high
Cost-effective only in exceptional situations. Which?

May be cheaper to accept software faults. But who will bear the costs?
Users? Manufacturers? Both?

Will the risk-sharing be with full knowledge?

Page 12:

Cost of Producing Fault-Free Software (2)

The Pareto Principle

[Figure: curve of Costs (y-axis) against Total % of Errors Fixed (x-axis), with marks at 20%, 80%, and 100%]

If the curve really is asymptotic to 100%, the cost may approach infinity.

Page 13:

Cost of Producing Fault-Free Software (3)

[Figure: curve of Cost per error detected (y-axis) against Number of residual errors (x-axis: Many, Few, Very few)]

Just a different way of looking at it.

Page 14:

Validation activities

Requirements inspections
Requirements management
Model checking
Design and code inspection
Static analysis
Test planning and management
Configuration management, discussed in Chapter 29, is also essential

Page 15:

Safe Programming

Faults in programs are usually a consequence of programmers making mistakes.

These mistakes occur because people lose track of the relationships among program variables.

Some programming constructs are more error-prone than others, so avoiding their use reduces programmer mistakes.

Page 16:

Fault-Free Software Development

Needs precise (preferably formal) specification

Requires organizational commitment to quality

Information hiding and encapsulation in software design essential

Use programming language with strict typing and run-time checking

Avoid error-prone constructs

Use a dependable and repeatable development process

Page 17:

Structured Programming

First discussed in the 1970s
Programming without goto
While loops and if statements as the only control statements
Top-down design
Important because it promoted thought and discussion about programming
Programs easier to read and understand than old spaghetti code

Page 18:

Error-Prone Constructs (1)

Floating-point numbers
Inherently imprecise – and machine-dependent
Imprecision may lead to invalid comparisons (see the sketch after this slide)

Pointers
Pointers referring to the wrong memory area can corrupt data
Aliasing can make programs difficult to understand and change

Dynamic memory allocation
Run-time allocation can cause memory overflow
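
To make the floating-point bullet concrete, here is a minimal Java sketch (the values and tolerance are illustrative, not from the lecture) showing how an exact equality test on doubles gives a surprising result and how a tolerance-based comparison avoids the invalid comparison:

// Exact equality on floating-point values is error-prone because the
// binary representation of 0.1 and 0.2 is inexact.
public class FloatComparison {
    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);          // false: sum is 0.30000000000000004

        // Safer: compare against an application-chosen tolerance (epsilon).
        final double EPSILON = 1e-9;
        System.out.println(Math.abs(sum - 0.3) < EPSILON);   // true
    }
}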

Page 19:

Error-Prone Constructs (2)

Parallelism
Can result in subtle timing errors (race conditions) because of unforeseen interactions between parallel processes

Recursion
Errors in recursion can cause memory overflow

Interrupts
Interrupts can cause a critical operation to be terminated and make a program difficult to understand
Similar to goto statements

Page 20:

Error-Prone Constructs (3)

Inheritance
Code not localized
Can result in unexpected behavior when changes are made
Can be hard to understand
Difficult to debug problems

These constructs do not all have to be eliminated absolutely, but they must be used with great care

Page 21:

Reliable Software Processes

A well-defined, repeatable software process:
Reduces software faults
Does not depend entirely on individual skills – can be enacted by different people

Process activities should include significant verification and validation

Page 22:

Process Validation Activities

Requirements inspections
Requirements management
Model checking
Design and code inspection
Static analysis
Test planning and management
Configuration management also essential

Page 23:

Fault Tolerance

Critical software systems must be fault tolerant
The system can continue operating in spite of software failure

Fault tolerance is required where
There are high availability requirements or
System failure costs are very high

Even “fault-free” systems need fault tolerance
There may be specification errors or
The validation may be incorrect

Page 24:

Fault Tolerance Actions

Fault detection
Detect that an incorrect system state has occurred

Damage assessment
Identify the parts of the system state affected by the fault

Fault recovery
Return to a known safe state

Fault repair
Prevent recurrence of the fault
Identify the underlying problem
If not transient*, then fix the errors of design, implementation, documentation or training that led to the error

* E.g., a transient hardware failure

Page 25:

Approaches to Fault Tolerance

Defensive programming
Programmers assume there are faults in the code
Check state after modifications to ensure consistency (see the sketch after this slide)

Fault-tolerant architectures
HW & SW system architectures support redundancy and fault tolerance
A controller detects problems and supports fault recovery

Complementary rather than opposing techniques
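
As a hedged illustration of the defensive-programming idea (the account class and its invariant are invented for this sketch, not part of the lecture), the code re-checks an invariant after every state change and signals a fault if the state has become inconsistent:

// Defensive programming: assume the code may be faulty and check the state
// after each modification; restore the known-good state and signal a fault
// if the check fails.
public class BankAccount {
    private long balanceCents;

    public void withdraw(long amountCents) {
        if (amountCents <= 0) {
            throw new IllegalArgumentException("withdrawal must be positive");
        }
        balanceCents -= amountCents;

        // Check after the modification: the invariant "balance never negative"
        // must still hold; if not, undo the change and report the fault.
        if (balanceCents < 0) {
            balanceCents += amountCents;
            throw new IllegalStateException("withdrawal would overdraw account");
        }
    }

    public static void main(String[] args) {
        BankAccount acct = new BankAccount();
        try {
            acct.withdraw(500);                   // fires the defensive check
        } catch (IllegalStateException e) {
            System.out.println("fault detected: " + e.getMessage());
        }
    }
}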

Page 26:

Fault Detection (1)

Strictly typed languages (e.g., Java and Ada)
Many errors trapped at compile time
Some classes of error can only be discovered at run time

Fault detection
Detecting an erroneous system state and throwing an exception to manage the detected fault

Page 27:

Fault Detection (2)

Preventative fault detection
Check conditions before making changes
If a bad state is detected, don’t make the change

Retrospective fault detection
Check validity after the system state has been changed
Used when
An incorrect sequence of correct actions leads to an erroneous state or
Preventative fault detection involves too much overhead
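
The two styles can be contrasted in a short Java sketch (the bounded queue and its capacity are hypothetical, chosen only to illustrate the difference):

import java.util.ArrayDeque;
import java.util.Deque;

public class BoundedQueue {
    private final Deque<String> items = new ArrayDeque<>();
    private final int capacity = 10;

    // Preventative fault detection: check the condition BEFORE changing state.
    public void addPreventative(String item) {
        if (items.size() >= capacity) {
            throw new IllegalStateException("queue full - change refused");
        }
        items.add(item);
    }

    // Retrospective fault detection: make the change, check validity AFTER,
    // and undo the change if the check fails.
    public void addRetrospective(String item) {
        items.add(item);
        if (items.size() > capacity) {
            items.removeLast();
            throw new IllegalStateException("queue full - change undone");
        }
    }

    public static void main(String[] args) {
        BoundedQueue q = new BoundedQueue();
        q.addPreventative("job-1");
        q.addRetrospective("job-2");   // both succeed while under capacity
    }
}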

Page 28:

Damage Assessment

Analyze the system state
Judge the extent of the corruption caused by the system failure
Assess what parts of the state space have been affected by the failure

Generally based on ‘validity functions’
Can be applied to state elements to assess whether their values are within the allowed range

Page 29:

Damage Assessment Techniques

Checksums
Used for damage assessment in data transmission
Verify integrity after transmission (sketched after this slide)

Redundant pointers
Check the integrity of data structures, e.g., databases

Watch-dog timers
Check for non-terminating processes
If there is no response after a certain time, there’s a problem
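
A minimal sketch of checksum-based damage assessment follows; the CRC32 class is from the Java standard library, while the message and the simulated corruption are invented for illustration:

import java.util.zip.CRC32;

public class ChecksumCheck {
    public static void main(String[] args) {
        byte[] message = "critical telemetry record".getBytes();

        CRC32 sender = new CRC32();
        sender.update(message);
        long checksumAtSender = sender.getValue();   // transmitted with the data

        byte[] received = message.clone();
        received[3] ^= 0x01;                         // simulate a corrupted bit in transit

        CRC32 receiver = new CRC32();
        receiver.update(received);
        boolean damaged = receiver.getValue() != checksumAtSender;
        System.out.println("damage detected: " + damaged);   // true
    }
}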

Page 30:

Fault Recovery

Forward recovery
Apply repairs to the corrupted system state
Domain knowledge is required to compute possible state corrections
Forward recovery is usually application-specific

Backward recovery
Restore the system state to a known safe state
Simpler than forward recovery
Details of the safe state are maintained and replace the corrupted system state

Page 31:

Forward Recovery

Data communications
Add redundancy to coded data
Use it to repair data corrupted during transmission

Redundant pointers
E.g., doubly-linked lists
A damaged list / file may be repaired if enough links are still valid
Often used for database and file system repair
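
A small sketch of forward recovery through redundant pointers (the Node class and the corruption scenario are illustrative, not from the lecture): a damaged forward link in a doubly-linked list is rebuilt by walking the still-valid backward links.

public class ListRepair {
    static class Node {
        int value;
        Node next, prev;
        Node(int value) { this.value = value; }
    }

    // Rebuild every next pointer from the prev chain (tail back to head).
    static void repairNextPointers(Node tail) {
        for (Node n = tail; n.prev != null; n = n.prev) {
            n.prev.next = n;      // forward link recomputed from the backward link
        }
    }

    public static void main(String[] args) {
        Node a = new Node(1), b = new Node(2), c = new Node(3);
        a.next = b; b.prev = a;
        b.next = c; c.prev = b;

        b.next = null;                 // simulate a corrupted forward pointer
        repairNextPointers(c);         // forward recovery using the redundancy
        System.out.println(a.next.next.value);   // 3 - the list is whole again
    }
}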

Page 32:

Backward Recovery

Transaction processing often uses conservative methods to avoid problems

Complete computations, then apply changes
Keep original data in buffers
Periodic checkpoints allow the system to ‘roll back’ to a correct state
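
A hedged sketch of backward recovery (the key/value "system state" and the validity rule are invented for illustration): the system records a checkpoint of a known safe state and rolls back to it when a later update is found to be invalid.

import java.util.HashMap;
import java.util.Map;

public class CheckpointedStore {
    private Map<String, Integer> state = new HashMap<>();
    private Map<String, Integer> checkpoint = new HashMap<>();

    // Record the current state as a known safe state.
    public void checkpoint() {
        checkpoint = new HashMap<>(state);
    }

    // Apply an update; if the resulting state is invalid, roll back.
    public void update(String key, int value) {
        state.put(key, value);
        if (value < 0) {                           // validity check fails...
            state = new HashMap<>(checkpoint);     // ...restore the safe state
            throw new IllegalStateException("invalid update - rolled back");
        }
    }

    public static void main(String[] args) {
        CheckpointedStore store = new CheckpointedStore();
        store.update("altitude", 100);
        store.checkpoint();                        // known safe state
        try {
            store.update("altitude", -5);          // invalid: triggers the rollback
        } catch (IllegalStateException e) {
            System.out.println("rolled back to last checkpoint");
        }
    }
}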

Page 33:

Recovery Blocks (1)

[Figure: recovery-block scheme – try algorithm 1, run the acceptance test, and re-try with algorithm 2 (then algorithm 3) if the test fails; continue execution if the acceptance test succeeds; signal an exception if all algorithms fail]

Page 34:

Recovery Blocks (2)

Force a different algorithm to be used for each version so as to reduce the probability of common errors

However, the design of the acceptance test is difficult, as it must be independent of the computation used

The approach is problematic for real-time systems because of the sequential operation of the redundant versions
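
The scheme in the diagram can be sketched in Java as follows (the two sorting versions and the acceptance test are invented for illustration; a real system would use independently developed implementations): each version is run from the saved input, its result is checked by the acceptance test, and an exception is signalled only if every version fails.

import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

public class RecoveryBlock {
    // Acceptance test: the result must be sorted and the same length as the input.
    static boolean acceptable(int[] input, int[] result) {
        if (result == null || result.length != input.length) return false;
        for (int i = 1; i < result.length; i++) {
            if (result[i - 1] > result[i]) return false;
        }
        return true;
    }

    static int[] sortWithRecovery(int[] input) {
        List<UnaryOperator<int[]>> versions = List.of(
            a -> { int[] b = a.clone(); Arrays.sort(b); return b; },   // primary version
            a -> insertionSort(a.clone())                              // diverse backup version
        );
        for (UnaryOperator<int[]> version : versions) {
            int[] result = version.apply(input);     // each try starts from the saved input
            if (acceptable(input, result)) {
                return result;                        // acceptance test succeeds: continue
            }
        }
        throw new IllegalStateException("all versions failed the acceptance test");
    }

    static int[] insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int v = a[i], j = i - 1;
            while (j >= 0 && a[j] > v) { a[j + 1] = a[j]; j--; }
            a[j + 1] = v;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortWithRecovery(new int[]{3, 1, 2})));   // [1, 2, 3]
    }
}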

Page 35:

Homework

Study Chapter 18 in detail using SQ3R

Required
By next Wed 10 Nov 2004
For 30 points
20.1, 20.2, 20.4 – 20.6, 20.9 (@5) – and pay attention to demands for examples

OPTIONAL
By Wed 17 Nov 2004
For up to 14 extra points, any or all of
20.10 (@3), 20.11 (@3) – details please
20.12 (@8) – detailed answers to all parts of this question

Page 36:

DISCUSSION

