CS 5150 Software Engineering

Date post: 21-Jan-2016
Upload: gwylan
Page 1: CS 5150  Software Engineering

CS 5150 1

CS 5150 Software Engineering

Lecture 21

Reliability 3

Page 2: CS 5150  Software Engineering

Administration

Final presentations

Sign up for your presentations now.

Page 3: CS 5150  Software Engineering

Failures and Faults

Failure: Software does not deliver the service expected by the user (e.g., mistake in requirements, confusing user interface)

Fault (BUG): Programming or design error whereby the delivered system does not conform to specification (e.g., coding error, interface error)

Page 4: CS 5150  Software Engineering

Failure of Requirements

An actual example

• The head of an organization is not paid his salary because it is greater than the maximum allowed by the program. (Requirements problem.)

Page 5: CS 5150  Software Engineering

Terminology

Fault avoidance

Build systems with the objective of creating fault-free (bug-free) software

Fault tolerance

Build systems that continue to operate when faults (bugs) occur

Fault detection (testing and validation)

Detect faults (bugs) before the system is put into operation, or after release as they are discovered.

Page 6: CS 5150  Software Engineering

Defensive Programming

Murphy's Law:

If anything can go wrong, it will.

Defensive Programming:

• Redundant code is incorporated to check system state after modifications.

• Implicit assumptions are tested explicitly.

• Risky programming constructs are avoided.
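
The three practices above can be illustrated with a minimal Python sketch. The `Account` class and its fields are hypothetical, invented here for illustration. Amounts are held as integer cents on the assumption that floating-point money arithmetic counts as a risky construct.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical account type, used only for illustration."""

    balance_cents: int = 0
    history: list[int] = field(default_factory=list)

    def check_invariant(self) -> None:
        # Redundant check: the balance must always equal the sum of the history.
        if self.balance_cents != sum(self.history):
            raise RuntimeError("account state is inconsistent")

    def deposit(self, amount_cents: int) -> None:
        # Test implicit assumptions explicitly instead of trusting the caller.
        if not isinstance(amount_cents, int) or isinstance(amount_cents, bool):
            raise TypeError("amount_cents must be an int")
        if amount_cents <= 0:
            raise ValueError("amount_cents must be positive")

        self.balance_cents += amount_cents
        self.history.append(amount_cents)

        # Check system state after the modification.
        self.check_invariant()
```

The invariant check is redundant by design: it costs a little time on every deposit but reports a corrupted state at the point where it first appears.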

Page 7: CS 5150  Software Engineering

Fault Tolerance

Aim: A system that continues to operate when problems occur.

Examples:

• Invalid input data (e.g., in a data processing application)

• Overload (e.g., in a networked system)

• Hardware failure (e.g., in a control system)

General Approach:

• Failure detection

• Damage assessment

• Fault recovery

• Fault repair
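
As a rough illustration of the general approach, applied to the invalid-input example, the sketch below keeps a batch job running when some records are bad. The function name `process_batch` and the one-integer-per-line format are assumptions made for this example.

```python
def process_batch(lines: list[str]) -> tuple[list[int], list[tuple[int, str]]]:
    """Parse one integer per line; keep going when a line is invalid."""
    values: list[int] = []
    rejected: list[tuple[int, str]] = []  # (line number, raw text)

    for line_number, raw in enumerate(lines, start=1):
        try:
            # Failure detection: int() raises ValueError on bad input.
            value = int(raw.strip())
        except ValueError:
            # Damage assessment: only this record is affected, so
            # recovery is to set it aside and continue with the rest.
            rejected.append((line_number, raw))
            continue
        values.append(value)

    # Fault repair happens later, outside this function: the rejected
    # records are corrected at their source and resubmitted.
    return values, rejected
```

Returning the rejected records with their line numbers, rather than silently dropping them, is what makes the later repair step possible.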

Page 8: CS 5150  Software Engineering

Fault Tolerance: Recovery

Backward Recovery

• Record system state at specific events (checkpoints). After failure, recreate state at last checkpoint.

• Combine checkpoints with system log (audit trail of transactions) that allows transactions from last checkpoint to be repeated automatically.
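
A minimal sketch of backward recovery with a checkpoint and a transaction log follows. The `RecoverableStore` class is hypothetical, and it keeps everything in memory for brevity; a real system would write both the checkpoint and the log to durable storage.

```python
import copy


class RecoverableStore:
    """Hypothetical in-memory key-value store with checkpoint and log."""

    def __init__(self) -> None:
        self.state: dict[str, int] = {}
        self._checkpoint: dict[str, int] = {}
        self._log: list[tuple[str, int]] = []  # transactions since checkpoint

    def apply(self, key: str, delta: int) -> None:
        # Log the transaction before applying it so it can be replayed.
        self._log.append((key, delta))
        self.state[key] = self.state.get(key, 0) + delta

    def checkpoint(self) -> None:
        # Record the full state; earlier log entries are no longer needed.
        self._checkpoint = copy.deepcopy(self.state)
        self._log.clear()

    def recover(self) -> None:
        # Recreate the state at the last checkpoint, then repeat the
        # logged transactions automatically.
        self.state = copy.deepcopy(self._checkpoint)
        for key, delta in self._log:
            self.state[key] = self.state.get(key, 0) + delta
```

For example, after `apply("a", 5)`, `checkpoint()`, `apply("a", 2)`, and `apply("b", 1)`, a call to `recover()` rebuilds the state `{"a": 7, "b": 1}` even if `state` was overwritten in between.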

Recovery Software is Difficult to Test

Example

After an entire network is hit by lightning, the restart crashes because of overload. (Problem of incremental growth.)

Page 9: CS 5150  Software Engineering

Fixing Bugs

Isolate the bug

• Intermittent --> repeatable

• Complex example --> simple example

Understand the bug and its context

• Root cause

• Dependencies

• Structural interactions

Fix the bug

• Design changes

• Documentation changes

• Code changes
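
One way to move from a complex failing example to a simple one is to remove pieces of the input while the failure still reproduces. The sketch below is a simple greedy reduction, not a full delta-debugging algorithm. The name `shrink_failing_input` and the `fails` predicate are assumptions for illustration, and the predicate must give the same answer each time, so the bug has to be repeatable first.

```python
from typing import Callable, Sequence


def shrink_failing_input(
    items: Sequence[int], fails: Callable[[list[int]], bool]
) -> list[int]:
    """Greedily drop elements while the failure still reproduces."""
    current = list(items)
    if not fails(current):
        raise ValueError("the starting input must reproduce the failure")

    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if fails(candidate):
                current = candidate
                changed = True
                break  # restart the scan on the smaller input
    return current
```

If the failure needs both a 7 and a 0 in the input, then `shrink_failing_input([3, 7, 1, 0, 9, 4], lambda xs: 7 in xs and 0 in xs)` returns `[7, 0]`, which is a much easier starting point for finding the root cause.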

Page 10: CS 5150  Software Engineering

Moving the Bugs Around

Fixing bugs is an error-prone process!

• When you fix a bug, fix its environment

• Bug fixes need static and dynamic testing

• Repeat all tests that have the slightest relevance (regression testing)

Bugs have a habit of returning!

• When a bug is fixed, add the failure case to the test suite for future regression testing.
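
As a sketch of this practice, the example below uses Python's standard `unittest` module. The function `parse_percentage` and the bug report it refers to are hypothetical; the point is only that the input that once failed becomes a permanent test.

```python
import unittest


def parse_percentage(text: str) -> float:
    """Hypothetical function that once crashed on input such as ' 50% '."""
    cleaned = text.strip()
    if cleaned.endswith("%"):
        cleaned = cleaned[:-1]
    return float(cleaned) / 100.0


class ParsePercentageRegressionTests(unittest.TestCase):
    def test_surrounding_whitespace(self) -> None:
        # The exact input from the (hypothetical) bug report stays in the
        # suite so the failure is caught if the bug ever returns.
        self.assertAlmostEqual(parse_percentage(" 50% "), 0.5)

    def test_plain_number_still_works(self) -> None:
        # Guard nearby behavior that the fix could have disturbed.
        self.assertAlmostEqual(parse_percentage("25"), 0.25)
```

Running `python -m unittest` on the file containing these tests repeats them along with the rest of the suite.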

Persistence

Most people work around problems. The best people track them down and fix them!

Page 11: CS 5150  Software Engineering

Difficult Bugs

Some bugs may be extremely difficult to track down and isolate. This is particularly true of intermittent failures.

• A large central computer stops a few times every month with no dump or other diagnostic.

• A database load dies after running for several days with no diagnostics.

• An image processing system runs correctly, but uses huge amounts of memory.

Such problems may require months of effort to track down.

For a fictional example, see: Ellen Ullman, The Bug: A Novel (Doubleday, 2003).

Page 12: CS 5150  Software Engineering

Tracking Down a Difficult Bug: The Heisenbug

Page 13: CS 5150  Software Engineering

Tracking Down a Difficult Bug: Make3D

[Figure: memory usage by function, with cv::fastmalloc labeled]

Page 14: CS 5150  Software Engineering

Bugs in System Software

Even system software from good manufacturers may contain bugs:

• Built-in function in Fortran run-time environment (e^0 = 0)

• The string-to-number function that was very slow with integers

• The preload system with the memory leak

Page 15: CS 5150  Software Engineering

Bugs in Hardware

Three times in my career I have encountered hardware bugs:

• The microfilm plotter with the missing byte (1:1023)

• Microcode for virtual memory management

• The Sun page fault that IBM paid to fix

Each problem was actually a bug in embedded software/firmware

Page 16: CS 5150  Software Engineering

Deciding whether to Fix a Bug: Creating a Problem for Customers

Sometimes customers will build applications that rely upon a bug. Fixing the bug will break the applications.

• An application crashes with an emulator, even though the emulator is bug free. (Compensating bug problem.)

• The graphics package with rotation about the Z-axis in the wrong direction.

• The 3-pixel rendering problem with Internet Explorer.

With each of these bugs the code was easy to fix, but releasing it would cause problems for existing programs.

Page 17: CS 5150  Software Engineering

Deciding whether to Fix a Bug: Bugs and Features

Validation: Are we building the right product?

Verification: Are we building the product right?

It is sometimes difficult to distinguish between the two.

That's not a bug. That's a feature!

Often users will report that a program behaves in a manner that they consider wrong, even though it is behaving as intended.

The decision whether this is a bug should be made by the client, not by the developers.

Page 18: CS 5150  Software Engineering

Reliability: Adapting Small Teams to Large Projects

Small teams and small projects have advantages for reliability:

• Small-group communication cuts the need for intermediate documentation, yet reduces misunderstanding.

• Small projects are easier to test and make reliable.

• Small projects have shorter development cycles. Mistakes in requirements are less likely and less expensive to fix.

• When one project is completed it is easier to plan for the next.

Improved reliability is one of the reasons that Agile development has become popular over the past few years.

Page 19: CS 5150  Software Engineering

An Old Question: Safety Critical Software

A software system fails and several lives are lost. An inquiry discovers that the test plan did not consider the case that caused the failure. Who is responsible?

(a) The testers for not noticing the missing cases?

(b) The test planners for not writing the complete test plan?

(c) The managers for not having checked the test plan?

(d) The client for not having done a thorough acceptance test?

Page 20: CS 5150  Software Engineering

Software Developers and Testers: Responsibilities

• Carrying out assigned tasks thoroughly and in a professional manner

• Being committed to the entire project -- not just tasks that have been assigned

• Resisting pressures to cut corners on vital tasks

• Alerting colleagues and management to potential problems early

Page 21: CS 5150  Software Engineering

Computing Management Responsibility

• Organization culture that expects quality

• Appointment of suitably qualified people to vital tasks (e.g., testing safety-critical software)

• Establishing and overseeing the software development process

• Providing time and incentives that encourage quality work

• Working closely with the client

Accepting responsibility for the work of the team

Page 22: CS 5150  Software Engineering

Client Responsibility

• Organization culture that expects quality

• Appointment of suitably qualified people to vital tasks (e.g., technical team that will build a critical system)

• Reviewing requirements and design carefully

• Establishing and overseeing the acceptance process

• Providing time and incentives that encourage quality work

• Working closely with the software team

Accepting responsibility for the resulting product

