CS 5150 1 CS 5150 Software Engineering Lecture 19 Reliability 1.

Date post: 22-Dec-2015
CS 5150 1

CS 5150 Software Engineering

Lecture 19

Reliability 1

CS 5150 2

Administration

Weekly progress reports

• Weekly progress reports are not required after this week's presentations and report.

CS 5150 3

Security Techniques: Barriers

Place barriers that separate parts of a complex system:

• Isolate components, e.g., do not connect a computer to a network

• Firewalls

• Require authentication to access certain systems or parts of systems

Every barrier imposes restrictions on permitted uses of the system

Barriers are most effective when the system can be divided into subsystems with simple boundaries

CS 5150 4

Barriers: Firewall

Public network

Private network

Firewall

A firewall is a computer at the junction of two network segments that:

• Inspects every packet that attempts to cross the boundary

• Rejects any packet that does not satisfy certain criteria, e.g.,

an incoming request to open a TCP connection

an unknown packet type

Firewalls provide security at a loss of flexibility and a cost of system administration.
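The two rejection criteria on this slide can be sketched as a small packet filter. This is a minimal illustration, not a real firewall: the `Packet` fields, the `KNOWN_PROTOCOLS` set, and the `allow` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical set of packet types the firewall understands.
KNOWN_PROTOCOLS = {"tcp", "udp", "icmp"}


@dataclass(frozen=True)
class Packet:
    protocol: str       # e.g. "tcp"
    inbound: bool       # True if crossing from the public to the private network
    syn: bool = False   # TCP SYN flag: asks to open a connection
    ack: bool = False   # TCP ACK flag: set on replies in an existing exchange


def allow(packet: Packet) -> bool:
    """Return True if the packet may cross the boundary."""
    if packet.protocol not in KNOWN_PROTOCOLS:
        return False  # unknown packet type
    if packet.protocol == "tcp" and packet.inbound and packet.syn and not packet.ack:
        return False  # incoming request to open a TCP connection
    return True
```

The loss of flexibility is visible in the rule itself: no machine on the private network can accept connections from outside, whether or not the request is legitimate.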

CS 5150 5

Security Techniques: Authentication & Authorization

Authentication establishes the identity of an agent:

• What does the agent know (e.g., password)?

• What does the agent possess (e.g., smart card)?

• Where does the agent have physical access (e.g., Ctrl-Alt-Del)?

• What are the physical properties of the agent (e.g., fingerprint)?

Authorization establishes what an authenticated agent may do:

• Access control lists

• Group membership
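A minimal sketch of authorization through access control lists and group membership. It assumes authentication has already established the user's identity; the users, groups, resource name, and the `is_authorized` function are all hypothetical.

```python
# Hypothetical group membership: group name -> set of authenticated user names.
GROUPS = {
    "staff": {"alice", "bob"},
    "admins": {"alice"},
}

# Hypothetical access control list: resource -> action -> groups allowed.
ACL = {
    "grades.db": {"read": {"staff"}, "write": {"admins"}},
}


def is_authorized(user: str, resource: str, action: str) -> bool:
    """Return True if any of the user's groups is allowed the action."""
    allowed_groups = ACL.get(resource, {}).get(action, set())
    return any(user in GROUPS.get(group, set()) for group in allowed_groups)
```

Anything not listed is denied, so a missing entry fails closed.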

CS 5150 6

Example: An Access Model for Digital Content

Digital material

Attributes

User

Roles

Actions

Operations

Access

Policies

CS 5150 7

Security Techniques: Encryption

Allows data to be stored and transmitted securely, even when the bits are viewed by unauthorized agents

• Private key and public key

• Digital signatures

Encryption: X → Y

Decryption: Y → X
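The round trip X → Y → X can be illustrated with a toy one-time pad. This is a swapped-in technique for illustration only: it uses a single shared secret key, not the private key and public key schemes listed above, and production systems should use a vetted cryptography library. The function name `xor_bytes` is hypothetical.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte."""
    if len(key) != len(data):
        raise ValueError("key must be the same length as the data")
    return bytes(d ^ k for d, k in zip(data, key))


x = b"transfer $100"                   # plaintext X
key = secrets.token_bytes(len(x))      # random key, to be used only once
y = xor_bytes(x, key)                  # encryption: X -> Y
recovered = xor_bytes(y, key)          # decryption: Y -> X
```

An agent who sees only the bits of `y` learns nothing about `x` without the key, which is the property the slide describes.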

CS 5150 8

Security and People

People are intrinsically insecure:

• Careless (e.g., leave computers logged on, leave passwords where others can read them)

• Dishonest (e.g., stealing from financial systems)

• Malicious (e.g., denial of service attack)

Many security problems come from inside the organization:

• In a large organization, there will be some disgruntled and dishonest employees

• Security relies on trusted individuals. What if they are dishonest?

CS 5150 9

Design for Security: People

• Make it easy for responsible people to use the system (e.g., make security procedures simple)

• Make it hard for dishonest or careless people to cause harm (e.g., password management)

• Train people in responsible behavior

• Test the security of the system thoroughly and repeatedly, particularly after changes

• Do not hide violations

CS 5150 10

Programming Secure Software

Programs that interface with the outside world (e.g., Web sites) need to be written in a manner that resists intrusion.

For the top 25 programming errors, see: Common Weakness Enumeration: A Community-Developed Dictionary of Software Weakness Types. http://cwe.mitre.org/top25/

• Insecure Interaction Between Components

• Risky Resource Management

• Porous Defenses

Project management must ensure that programs avoid these errors.
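As one example of an error in the first category, SQL injection arises when outside input is pasted into a query string. The sketch below shows the usual defense, a parameterized query, using Python's standard `sqlite3` module. The `users` table and the `find_user` function are hypothetical.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str):
    """Look up a user by name without building SQL from the input."""
    # The ? placeholder passes `name` to the database as data, never as SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
```

With string concatenation, an input such as `x' OR '1'='1` would match every row; with the placeholder it is treated as an ordinary (non-matching) name.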

CS 5150 11

Programming Secure Software

The following list is from the SANS Security Institute, Essential Skills for Secure Programmers Using Java/JavaEE, http://www.sans.org/

• Input Handling

• Authentication & Session Management

• Access Control (Authorization)

• Java Types & JVM Management

• Application Faults & Logging

• Encryption Services

• Concurrency and Threading

• Connection Patterns

CS 5150 12

Suggested Reading

Trust in Cyberspace, Committee on Information Systems Trustworthiness, National Research Council (1999) http://www.nap.edu/readingroom/books/trust/

Fred Schneider, Cornell Computer Science, was the chair of this study.

CS 5150 13

Dependable and Reliable Systems: The Royal Majesty

From the report of the National Transportation Safety Board:

"On June 10, 1995, the Panamanian passenger ship Royal Majesty grounded on Rose and Crown Shoal about 10 miles east of Nantucket Island, Massachusetts, and about 17 miles from where the watch officers thought the vessel was. The vessel, with 1,509 persons on board, was en route from St. George’s, Bermuda, to Boston, Massachusetts."

"The Raytheon GPS unit installed on the Royal Majesty had been designed as a standalone navigation device in the mid- to late 1980s, ... The Royal Majesty’s GPS was configured by Majesty Cruise Line to automatically default to the Dead Reckoning mode when satellite data were not available."

CS 5150 14

The Royal Majesty: Analysis

• The ship was steered by an autopilot that relied on position information from the Global Positioning System (GPS).

• If the GPS could not obtain a position from satellites, it provided an estimated position based on Dead Reckoning (distance and direction traveled from a known point).

• The GPS failed one hour after leaving Bermuda.

• The crew failed to see the warning message on the display (or to check the instruments).

• 34 hours and 600 miles later, the Dead Reckoning error was 17 miles.
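A rough sketch of dead reckoning and of how small the underlying error was. It assumes a flat-earth approximation and, purely for the arithmetic, treats the whole 17-mile error as a constant heading offset over the 600 miles traveled; the report quoted here does not say that was the cause. Function names are hypothetical.

```python
import math


def dead_reckon(x, y, heading_deg, speed, hours):
    """Advance a position (miles east, miles north) from a known point.

    Heading is measured clockwise from north; speed is in miles per hour.
    """
    distance = speed * hours
    theta = math.radians(heading_deg)
    return x + distance * math.sin(theta), y + distance * math.cos(theta)


def implied_heading_error_deg(cross_track_miles, distance_miles):
    """Constant heading offset that would produce the given sideways error."""
    return math.degrees(math.asin(cross_track_miles / distance_miles))


# 17 miles off after 600 miles corresponds to an offset of roughly 1.6 degrees.
offset = implied_heading_error_deg(17, 600)
```

An offset this small is hard to notice from the heading alone, which is why an independent position check matters.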

CS 5150 15

The Royal Majesty: Software Lessons

All the software worked as specified (no bugs), but ...

• After the GPS software was specified, the requirements changed (a standalone system became part of an integrated system).

• The manufacturers of the autopilot and GPS adopted different design philosophies about the communication of mode changes.

• The autopilot was not programmed to recognize valid/invalid status bits in messages from the GPS (NMEA 0183).

• The warnings provided by the user interface were not sufficiently conspicuous to alert the crew.

• The officers had not been properly trained on this equipment.
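A sketch of the kind of check the autopilot lacked, using the NMEA 0183 RMC sentence, whose status field is `A` for valid data and `V` for a navigation warning. This illustrates the idea only; the slides do not say which sentences the Royal Majesty's equipment exchanged, and the function names are hypothetical.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR of all characters between '$' and '*', as two hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def make_sentence(body: str) -> str:
    """Wrap a sentence body with '$', '*', and its checksum."""
    return f"${body}*{nmea_checksum(body)}"


def position_is_valid(sentence: str) -> bool:
    """Accept a position only from an intact RMC sentence with status 'A'."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, checksum = sentence[1:].partition("*")
    if checksum.upper() != nmea_checksum(body):
        return False  # corrupted in transit
    fields = body.split(",")
    if not fields[0].endswith("RMC") or len(fields) < 3:
        return False  # not a sentence this check understands
    return fields[2] == "A"  # 'V' means the receiver flags the data as invalid
```

A receiving system that ignores this field will treat an estimated position exactly like a satellite fix.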

CS 5150 16

Key Factors for Reliable Software

• Organization culture that expects quality

• Approach to software design and implementation that hides complexity (e.g., structured design, object-oriented programming)

• Precise, unambiguous specification

• Use of software tools that restrict or detect errors (e.g., strongly typed languages, source control systems, debuggers)

• Programming style that emphasizes simplicity, readability, and avoidance of dangerous constructs

• Incremental validation

CS 5150 17

Building Dependable Systems: Three Principles

For a software system to be dependable:

• Each stage of development must be done well.

• Changes should be incorporated into the structure as carefully as the original system development.

• Testing and correction do not ensure quality, but dependable systems are not possible without systematic testing.

CS 5150 18

Building Dependable Systems: Organizational Culture

Good organizations create good systems:

• Acceptance of the group's style of work (e.g., meetings, preparation, support for juniors)

• Visibility

• Completion of a task before moving to the next (e.g., documentation, comments in code)

CS 5150 19

Building Dependable Systems: Quality Management Processes

Assumption:

Good software is impossible without good processes

The importance of routine:

Standard terminology (requirements, specification, design, etc.)

Software standards (coding standards, naming conventions, etc.)

Regular builds of complete system

Internal and external documentation

Reporting procedures

CS 5150 20

Building Dependable Systems: Quality Management Processes

When time is short...

Pay extra attention to the early stages of the process: feasibility, requirements, design.

There will be little time to redo mistakes in the requirements.

Experience shows that taking extra time on the early stages will usually reduce the total time to release.

CS 5150 21

Building Dependable Systems: Specifications for the Client

Specifications are of no value if they do not meet the client's needs

• The client must understand and review the requirements specification in detail

• Appropriate members of the client's staff must review relevant areas of the design (including operations, training materials, system administration)

• The acceptance tests must belong to the client

CS 5150 22

Building Dependable Systems: Changes

[Diagram labeled "Changes" showing the process steps: Feasibility study, Requirements, System design, Program design, Implementation (coding), Testing, Acceptance & release, Operation & maintenance]

CS 5150 23

Building Dependable Systems: Change

Change management:

Source code management and version control

Tracking of change requests and bug reports

Procedures for changing requirements specifications, designs and other documentation

Regression testing

Release control

CS 5150 24

Building Dependable Systems: Complexity

The human mind can encompass only limited complexity:

• Comprehensibility

• Simplicity

• Partitioning of complexity

A simple component is easier to get right than a complex one.

CS 5150 25

Reliability Metrics

Reliability

Probability of a failure occurring in operational use.

Perceived reliability

Depends upon:

• user behavior

• set of inputs

• pain of failure

CS 5150 26

Reliability Metrics

Traditional measures for online systems

• Mean time between failures

• Availability (up time)

• Mean time to repair

Market measures

• Complaints

• Customer retention

User perception is influenced by

• Distribution of failures
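The traditional measures can be computed from a log of outages. The sketch below assumes one common convention (mean time between failures taken as total up time divided by the number of failures); the function name and the sample month are hypothetical.

```python
def reliability_metrics(outages, period_hours):
    """Compute MTBF, MTTR, and availability.

    outages: list of (start_hour, end_hour) intervals within the period.
    """
    if not outages:
        raise ValueError("no failures recorded; MTBF and MTTR are undefined")
    downtime = sum(end - start for start, end in outages)
    uptime = period_hours - downtime
    failures = len(outages)
    return {
        "mtbf_hours": uptime / failures,       # mean time between failures
        "mttr_hours": downtime / failures,     # mean time to repair
        "availability": uptime / period_hours, # fraction of time in service
    }


# Hypothetical 30-day month (720 hours) with three outages.
month = reliability_metrics([(100, 102), (400, 401), (650, 653)], 720)
```

Under this convention availability also equals MTBF / (MTBF + MTTR), so the three figures are not independent.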

CS 5150 27

Metrics: User Perception of Reliability

1. A personal computer that crashes frequently v. a machine that is out of service for two days.

2. A database system that crashes frequently but comes back quickly with no loss of data v. a system that fails once in three years but data has to be restored from backup.

3. A system that does not fail but has unpredictable periods when it runs very slowly.

CS 5150 28

Reliability Metrics for Distributed Systems

Traditional metrics are hard to apply in multi-component systems:

• A system that has excellent average reliability might give terrible service to certain users.

• In a big network, at any given moment something will be giving trouble, but very few users will see it.

• When there are many components, system administrators rely on automatic reporting systems to identify problem areas.
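A small worked example of the first point: an excellent average can hide terrible service to one user. The request counts and the function name are hypothetical.

```python
def failure_rates(requests_by_user):
    """Return (overall failure rate, per-user failure rates).

    requests_by_user: {user: (failed_requests, total_requests)}
    """
    failed = sum(f for f, _ in requests_by_user.values())
    total = sum(t for _, t in requests_by_user.values())
    per_user = {user: f / t for user, (f, t) in requests_by_user.items()}
    return failed / total, per_user


# Hypothetical network: 99 users see no failures, one user sees half fail.
traffic = {f"user{i}": (0, 1000) for i in range(99)}
traffic["unlucky"] = (500, 1000)
overall, per_user = failure_rates(traffic)
# overall is 0.5%, yet per_user["unlucky"] is 50%.
```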

CS 5150 29

Requirements Metrics for System Reliability

Example: ATM card reader

Failure class              Example                                           Metric (requirement)

Permanent non-corrupting   System fails to operate with any card -- reboot   1 per 1,000 days

Transient non-corrupting   System cannot read an undamaged card              1 in 1,000 transactions

Corrupting                 A pattern of transactions corrupts database       Never
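Requirements stated this way can be checked against operational data. The sketch below assumes each metric is an upper limit on the observed rate; the function name and the sample counts are hypothetical.

```python
def meets_requirements(days, transactions, permanent, transient, corrupting):
    """Compare observed failure counts with the ATM card reader requirements."""
    return {
        # Permanent, non-corrupting: at most 1 per 1,000 days.
        "permanent": permanent / days <= 1 / 1000,
        # Transient, non-corrupting: at most 1 in 1,000 transactions.
        "transient": transient / transactions <= 1 / 1000,
        # Corrupting: never.
        "corrupting": corrupting == 0,
    }


# Hypothetical observation period.
report = meets_requirements(
    days=2000, transactions=500_000, permanent=1, transient=600, corrupting=0
)
```

In this made-up sample the transient rate is 1.2 in 1,000 transactions, so that requirement is not met even though the other two are.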

CS 5150 30

Metrics: Cost of Improved Reliability

[Figure: time and $ plotted against the reliability metric, with 99% and 100% marked]

• Will you spend your money on new functionality or improved reliability?

• When do you ship?
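To make the trade-off concrete, the arithmetic below converts a reliability figure into hours of down time per year. Treating the reliability metric as availability is an assumption made for this example; the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Hours out of service per year at a given availability (0.0 to 1.0)."""
    return (1.0 - availability) * HOURS_PER_YEAR


# 99% leaves about 87.6 hours per year; each extra "nine" cuts that tenfold.
for level in (0.99, 0.999, 0.9999):
    print(f"{level:.2%} available: {downtime_hours_per_year(level):.2f} hours down per year")
```

Each step is a tenfold improvement, and the question on the slide is how much of the budget that improvement is worth.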

CS 5150 31

Example: Central Computing System

A central computer system (e.g., a server farm) is vital to an entire organization. Any failure is serious.

Step 1: Gather data on every failure

• Many years of data in a database

• Every failure analyzed:

hardware

software (default)

environment (e.g., power, air conditioning)

human (e.g., operator error)

CS 5150 32

Example: Central Computing System

Step 2: Analyze the data

• Weekly, monthly, and annual statistics

Number of failures and interruptions

Mean time to repair

• Graphs of trends by component, e.g.,

Failure rates of disk drives

Hardware failures after power failures

Crashes caused by software bugs in each component
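A minimal sketch of the monthly statistics in Step 2, computed from failure records of the kind gathered in Step 1. The record layout, the sample data, and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical failure records: (month, cause, hours to repair).
FAILURES = [
    ("2024-01", "hardware", 4.0),
    ("2024-01", "software", 1.0),
    ("2024-02", "environment", 6.0),
    ("2024-02", "software", 2.0),
    ("2024-02", "human", 0.5),
]


def monthly_stats(failures):
    """Return {month: {"failures": count, "mttr_hours": mean repair time}}."""
    repair_hours = defaultdict(list)
    for month, _cause, hours in failures:
        repair_hours[month].append(hours)
    return {
        month: {"failures": len(hours), "mttr_hours": sum(hours) / len(hours)}
        for month, hours in repair_hours.items()
    }


stats = monthly_stats(FAILURES)
```

The same grouping, keyed by component or cause instead of month, gives the data behind the trend graphs.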

CS 5150 33

Example: Central Computing System

Step 3: Invest resources where benefit will be maximum, e.g.,

• Priority order for software improvements

• Changed procedures for operators

• Replacement hardware

• Orderly shut down after power failure

Example: Supercomputers may average 10 hours of productive work per day.
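One way to decide where the benefit will be greatest is to rank failure causes by the service time they cost. This sketch assumes lost hours are the measure of benefit, which the slide does not state; the sample data and function name are hypothetical.

```python
from collections import Counter

# Hypothetical failure records: (cause, hours of lost service).
LOST_HOURS = [
    ("hardware", 4.0),
    ("software", 1.0),
    ("environment", 6.0),
    ("software", 2.0),
    ("human", 0.5),
]


def rank_causes(records):
    """Return causes ordered from most to least total lost hours."""
    totals = Counter()
    for cause, hours in records:
        totals[cause] += hours
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


priorities = rank_causes(LOST_HOURS)

# The supercomputer example on the slide: 10 productive hours out of 24.
productive_fraction = 10 / 24
```

In this made-up data, environment failures cost the most hours, which would argue for measures such as an orderly shut down after power failure.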

