Y2K: Perrow’s Normal Accident Theory (NAT) Tested and “Normal Accidents-Yesterday and Today”

Presented by Nathanael Paul
September 25, 2002

Slide 2: Some questions…

What is the most code you have ever written? What is the largest project (in lines of code) you have ever worked on?

Y2K – the ultimate system failure
Were you an optimist or a pessimist?

Slide 3: Society and Systems after 1984

The first non-negotiable deadline
180 billion lines of code needed inspecting (the underlying two-digit-year bug is sketched below)

Social Security: 30 million lines to fix
After 400 people and 5 years, only 6 million fixed

“organizations are always screwing up… uncertainty drives system accidents, and this is a hallmark of Y2K”

Failures and Social Incoherence
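The scale above comes from one small coding pattern repeated everywhere: years stored and compared as two digits. Below is a minimal C sketch of that pattern and of “windowing”, one common remediation; the function names are hypothetical and only for illustration.

```c
#include <stdio.h>

/* Hypothetical pre-Y2K record keeping: only the last two digits of the
 * year are stored, e.g. 99 for 1999 and 0 (i.e. "00") for 2000. */

/* The naive calculation Y2K audits had to hunt down: rolling over from
 * 99 to 00 makes the elapsed time go negative. */
int years_elapsed_naive(int opened_yy, int current_yy)
{
    return current_yy - opened_yy;              /* 0 - 99 == -99 */
}

/* "Windowing", a common fix: keep the two-digit field but interpret
 * 00-49 as 20xx and 50-99 as 19xx (the pivot varied by system). */
int expand_year(int yy)
{
    return (yy < 50) ? 2000 + yy : 1900 + yy;
}

int years_elapsed_windowed(int opened_yy, int current_yy)
{
    return expand_year(current_yy) - expand_year(opened_yy);
}

int main(void)
{
    /* A record opened in 1999, checked in 2000. */
    printf("naive:    %d\n", years_elapsed_naive(99, 0));    /* prints -99 */
    printf("windowed: %d\n", years_elapsed_windowed(99, 0)); /* prints 1 */
    return 0;
}
```

Windowing was popular because it avoided expanding every stored date field, but finding each such comparison across that much code is exactly the inspection problem the slide describes.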

Slide 4: Perrow’s take on Y2K

“I expect moderate failures but little social incoherence”

Slide 5: Ingredients

Unexpected interaction of multiple failures
Tightly coupled system
Incomprehensible

Slide 6: Interdependency

Slight inconvenience or isolated hardships
Not as interconnected
Handling of the problem was better than at first thought

Tight coupling: one way of doing things, without too much slack

“Web of connections” – the closest analogy to software yet…
Alternative paths
Testing

Slide 7: Optimists and Pessimists

Pessimists: biblical apocalypse; computer and financial experts

Optimists: industrial trade groups, government, and companies; late in their response

Slide 8: Key points to both sides

P: Everything is linked, so everything is “mission critical”; hard to prioritize

O: Past experience with failures will see us through

O: Testing results not announced because of liability
P: An accident becomes a catastrophic disaster (multiple failures with coupled single systems)

Slide 9: The Chip

Chips can’t be reprogrammed
7 billion programmable microcontrollers in 1996

Air Traffic Control’s problems with mainframes
People locked in a car plant, prisoners let loose
“We know something about unexpected interactions and are more prepared to look for them than ever before.”

“The Butterfly Effect”

Slide 10: Electric Power

Society’s lifeblood
Complex, interconnected “grids”
1998: Most of the 75% of North American electric power companies were in the awareness/assessment stage (same findings in Jan. 1999)

“Just-in-time production”
Nuclear facilities not “expecting” problems

Slide 11: Lack of Interest

Jan. 1998, at the industry’s premier technical conference: no sessions, no meetings on Y2K
One presentation was scheduled; people were mad, saying Y2K was a hoax and the presenter a profiteering consultant

March 1998, at the 3rd annual industry-wide meeting on Y2K: 70 of 8,000 companies were there

One summer’s power meeting was canceled because of lack of interest

Slide 12: More on Power

Interconnectivity: no telecom, no power; no power, no telecom
Available fuel supply and delivery

No service obligations to provide base load power to bulk power entities

Government intervention not wanted. Merge, but no fix

Slide 13: And last, but not least… Nuclear Power

Jan. 1999: only 31 percent ready
Harder to fix?

Not expecting problems
Hard to test all components
If not ready by 2000, shutdown
Provided 25% of power (40% in the Northeast)

Slide 14: Y2K going wrong… We give up.

Y2K compliance vs. Y2K readiness
“The Domino Effect”

Banking, shipping, farming and hybrid seeds

Just show them the software warranty; you’re probably not liable anyway

Slide 15: Conclusions about NAT and accidents today

Which of the characteristics of NAT does software normally exhibit? Tight/loose coupling? Interdependency? Linear/complex interactions? A “web of connections”?

What has been done in the past to help reduce software “accidents”? (reduction of tight coupling, complexity, and interdependencies – see the coupling sketch below)

Let’s see what Strauch has to say…
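As one illustrative answer to that question (a sketch under hypothetical names, not something taken from the readings): software loosens coupling by putting an interface between a caller and its dependency, which restores the slack, alternative paths, and testability mentioned on the Interdependency slide.

```c
#include <stdio.h>

/* Tightly coupled style: a caller that invokes one concrete clock
 * routine directly has a single path and no slack; the only way to
 * exercise the year-2000 rollover is to wait for it.
 *
 * Loosely coupled style (sketched below): the caller depends only on a
 * function-pointer interface, so real, patched, or simulated
 * implementations can be swapped in without touching the caller. */

typedef int (*year_source)(void);   /* the interface */

int real_year(void)          { return 1999; }  /* stand-in for a system clock call */
int simulated_y2k_year(void) { return 2000; }  /* test double for rollover testing */

/* The caller never names a concrete implementation. */
int years_since(int opened, year_source now)
{
    return now() - opened;
}

int main(void)
{
    /* Same caller, two interchangeable "alternative paths". */
    printf("production: %d\n", years_since(1995, real_year));          /* 4 */
    printf("y2k test:   %d\n", years_since(1995, simulated_y2k_year)); /* 5 */
    return 0;
}
```

The swap-in test clock is what makes the “Testing” bullet tractable: the coupled dependency can be exercised in isolation before the real deadline arrives.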

Slide 16: Other Accidents and Views

Challenger and Chernobyl: do these accidents support NAT?

Operator error: someone to blame? Is the blame justified?

Chemical refineries, nuclear power, and commercial aviation have all seen drops in accidents, or in the types of accidents Perrow described

Can Perrow’s assertion that more system accidents would happen be justified?

Slide 17: 1995 crash of an American Airlines Boeing 757 in Cali, Colombia

Saving time and expense by landing to the south (Miami to Cali)

Many tasks performed to get ready for the approach
The approach was named after the final approach fix (unusual; not named after the initial approach)
The initial approach beacon, Tulua, was deleted from the approach data
Flew to the final approach fix (not the initial one)

Slide 18: Factors involved in the Cali crash

“Hide the results of operator actions from operators themselves”

Navigational database design: abbreviations used and instrument approach procedures
A 1992 accident in Nepal was very similar to the Cali crash; the lesson was not learned.

Slide 19: Accident Frequency since ’84

Depends on country and particular system
Perrow’s assertions affected by:
Industry variables
Cultural variables
Hindsight of his work in helping others
And…

Slide 20: Technology

Airbus Industrie (A-320): introducing new technology takes time to familiarize; the fatality rate went from high to lower as time went on

Training: better able to emulate the real system; focus on what people need to see in training
Training-related accidents all but eliminated
Operator error reduced through training? Was Perrow still right?

Slide 21: Aviation Technology

CFIT (controlled flight into terrain): Ground Proximity Warning Systems (GPWS) not good at high speeds (the Cali crash is a good example)
Terrain Awareness and Warning System (TAWS): no TAWS-equipped aircraft in a CFIT crash, yet…

In-flight collision of two aircraft: Traffic Alert and Collision Avoidance System (TCAS); no two planes with TCAS in a collision, yet…

Slide 22: Organizational Accidents – Organizational Features in System Safety

ValuJet’s 1996 crash of a DC-9
Canisters of chemical oxygen generators
Non-traditional contracting out of work
Maintenance personnel were rushed to work on 2 aircraft to meet deadlines
Canisters not labeled correctly
Warehousing personnel returned the canisters to their rightful owner

Slide 23: What Happened?

Cost reductions over safety
Regulation (FAA) failed where the accident may have been prevented
Enron

Slide 24: Learning from our mistakes

Something done before 1984…
Shortcomings of navigational databases addressed (Cali accident)
FAA operational oversight addressed (ValuJet)
Financial system deficiencies addressed (Enron)
Rejected takeoffs decreased after better training
What about the Exxon Valdez oil spill (the vessel’s master and alcohol)?

Slide 25: Doomed to repeat, if there is no change

Airplane flaps and slats
’87 Northwest Airlines crash in Detroit (better training and aural warnings)
Dallas-Ft. Worth crash because of flaps and slats (made sure this didn’t happen again)

Concorde
Engine could eat tire debris; tires are in front of the engine
Nothing done until the 2000 Paris accident (the problem was cited much earlier by engineers)

Slide 26: Conclusions

Was NAT successful? Why?
“Features” can create deficiencies
Are systems any more comprehensible?
Operator error vs. design error
Why the reduction in system accidents?
Have we truly stopped accusing the operator and started looking at the systems?

Technology: did it help or hurt more in system accidents?

