Slide 1
Recovery Oriented Computing (ROC)
David Patterson
2002 Grad Visit Day
Slide 2
Berkeley’s Research Goals
• Have Impact, not just count Journal Papers
– Some universities have bad benchmarks
– Recently realized that when the goal is not impact, you rarely have impact (but lots of papers)
• Produce Great Students, not # Journal Papers
– Try to create projects that, if I were a student, I would almost kill myself to try to join
– Not all projects equally successful in research impact, but all can produce great students
– As you get further in your career, you realize that students are the coin of the academic realm
Slide 3
(One) Berkeley Approach to Systems
• Find an important problem crossing the HW/SW interface, with a HW/SW prototype at the end (usually started in a graduate course)
• Assemble a band of 3-6 faculty, 12-20 grad students, 1-3 staff to tackle it over 4 years
• Meet twice a year for 3-day retreats with invited outsiders
– Builds team spirit; advice on direction, change of course
– Offers milestones for project stages
– Grad students give 6 to 8 talks => great speakers
• Write papers, go to conferences, get PhDs, jobs
• End-of-project party, reshuffle faculty, go to step 1
Slide 4
Patterson’s Projects, Faculty, Commercial Impact
• Reduced Instruction Set Computer (RISC)
– What: simplified instructions to exploit VLSI: ’80-’84
– With: Sequin@UC, Hennessy@Stanford, Cocke@IBM
– Direct Impact: Sun; RISC in >90% of embedded MPUs
• Symbolic Processing Using RISCs (SPUR)
– What: desktop multiprocessor for AI: ’84-’89
– With: Fateman, Hilfinger, Hodges, Katz, Ousterhout
– Direct Impact: PLL => fast serial lines => Silicon Image
• Redundant Arrays of Inexpensive Disks (RAID)
– What: many PC disks for speed, reliability: ’88-’93
– With: Katz, Ousterhout, Stonebraker
– Direct Impact: $25B/yr (EMC); 80% of non-PC disks use RAID
Slide 5
Symbolic Processing Using RISCs: ‘85-’89
• Before commercial RISC chips
• Built workstation multiprocessor and operating system from scratch(!)
• Sprite Operating System
• 3 chips: Processor, Cache Controller, FPU
– Coined the term “snooping cache protocol”
– 3 C’s of cache misses: compulsory, capacity, conflict
Slide 6
SPUR 10 Year Reunion, January ‘99
• Everyone from North America came!
• 19 PhDs: 9 to Academia
– 8/9 got tenure, 4 full professors (already)
– 2 Romnes Fellows (3rd, 4th at Wisconsin)
– 3 NSF Presidential Young Investigator winners
– 2 ACM Dissertation Awards
– They in turn had produced 30 PhDs (1/99)
• 10 to Industry
– Founders of 6 startups (1 failed, 1 acquired, 1 public)
– 2 department heads (AT&T Bell Labs, Microsoft)
• Very successful group; SPUR Project “gave them a taste of success, lifelong friends”
• “Berkeley is on the lunatic fringe of multi-faculty projects”
Slide 7
Group Photo (in souvenir jackets)
• See www.cs.berkeley.edu/Projects/ARC to learn more about Berkeley Systems
[Photo: SPUR alumni in souvenir jackets: Garth Gibson (CMU, founder of Panasas); Dave Lee (founder, Silicon Image); Mendel Rosenblum (Stanford, founder of VMware); Ben Zorn (Colorado, Microsoft); David Wood (Wisconsin); Jim Larus (Wisconsin, Microsoft); Mark Hill (Wisconsin); Susan Eggers (Washington); Brent Welch (founder, Scriptics); George Taylor (founder, ?); Shing Kong (Silicon Image); John Ousterhout (founder, Scriptics and Electric Cloud)]
Slide 8
Patterson’s Projects, People, Impact
• Networks of Workstations (NOW)
– What: big server via switched network of workstations: ’94-’98
– With: Anderson, Brewer, Culler
– Direct Impact: Inktomi + many Internet companies
• Tertiary Disk (TD: a NOW subset project)
– What: big, cheap disk-NOW (for SF museum): ’96-’99
– Direct Impact: Scale8 (big, reliable Internet storage)
• Intelligent RAM (IRAM)
– What: media processor inside a DRAM chip: ’97-’02
– With: Yelick (and Wawrzynek)
• ISTORE/Recovery-Oriented Computing (ROC)
– What: available, maintainable servers: HW, SW, LW
– With: Yelick/Fox (and Kubiatowicz)
Slide 9
Network of Workstations (NOW) ‘94 -’98
• Construction of 2 HW/SW prototypes: NOW-1 with 32 SuperSPARCs and NOW-2 with 100 UltraSPARC 1s
• NOW-2 cluster held the world record for the fastest disk-to-disk sort for 2 years, 1997-1999
• NOW-2 cluster was 1st to crack the 40-bit key as part of a key-cracking challenge offered by RSA, 1997
• NOW-2 made the list of Top 200 supercomputers, 1997
• NOW technology is a foundation of the Virtual Interface (VI) Architecture, a proposed standard that allows fully protected, direct user-level access to the network interface, promoted by Compaq, Intel, & Microsoft
• NOW technology led directly to one Internet startup company (Inktomi), and many other Internet companies rely on clusters
Slide 10
Network of Workstations (NOW) ‘94 -’98
12 PhDs. Note that 3/4 of them went into academia, and that 1/3 are female:
• Andrea Arpaci-Dusseau, Asst. Professor, Wisconsin, Madison
• Remzi Arpaci-Dusseau, Asst. Professor, Wisconsin, Madison
• Mike Dahlin, Assoc. Professor, University of Texas, Austin
• Jeanna Neefe Matthews, Asst. Professor, Clarkson Univ.
• Douglas Ghormley, Researcher, Los Alamos National Labs
• Kim Keeton, Researcher, Hewlett-Packard Labs
• Steve Lumetta, Asst. Professor, U. Illinois, Urbana-Champaign
• Alan Mainwaring, Researcher, Intel Berkeley Labs
• Rich Martin, Asst. Professor, Rutgers University
• Nisha Talagala, Researcher, Network Storage, Sun Microsystems
• Amin Vahdat, Asst. Professor, Duke University
• Randy Wang, Asst. Professor, Princeton University
Slide 11
Research in Berkeley Courses
• RISC, SPUR, RAID, NOW, IRAM, ROC all started in advanced graduate courses
• Make the transition from undergraduate student to researcher in first-year graduate courses
– First-year architecture and operating systems courses: select topic, do research, write paper, give talk
– Prof meets each team 1-on-1 ~3 times, + TA help
– Some papers get submitted and published
– Same time to Ph.D. as places with no required courses
• Requires class size of 20-40 (e.g., Berkeley)
– If 100-200 students/course (school offers combined BS/MS or professional MS over TV broadcast) => cannot do research in grad courses
Slide 12
Retreat Research Style
• Project reviews with outsiders
– Twice a year: 3-day retreat @ Tahoe
– Faculty, students, staff + guests
– Key piece is feedback at the end
– Can change minds of faculty
– Breaks enable valuable discussion
– Builds team spirit (all play & work)
– Helps create deadlines
– Helps with technology transfer
– Always amazed at the value at the end
• By far the most important idea for running a 10-25 person project
– Cost ~ 1 grad student
– Visitors donate $ => funds 4 to 6 grads
Slide 13
Background: Tertiary Disk (part of NOW)
• Tertiary Disk (1997)
– Cluster of 20 PCs hosting 364 3.5” IBM disks (8.4 GB each) in 7 racks (19” x 33” x 84”), or 3 TB total. The 200 MHz, 96 MB P6 PCs run FreeBSD, and a switched 100 Mb/s Ethernet connects the hosts. Also 4 UPS units.
– Hosts the world’s largest art database: 72,000 images, in cooperation with the San Francisco Fine Arts Museum: try www.thinker.org
Slide 14
Tertiary Disk HW Failure Experience
Reliability of hardware components (20 months):
– 7 IBM SCSI disk failures (out of 364, or 2%)
– 6 IDE (internal) disk failures (out of 20, or 30%)
– 1 SCSI controller failure (out of 44, or 2%)
– 1 SCSI cable failure (out of 39, or 3%)
– 1 Ethernet card failure (out of 20, or 5%)
– 1 Ethernet switch failure (out of 2, or 50%)
– 3 enclosure power supply failures (out of 92, or 3%)
– 1 short power outage (covered by UPS)

Did not match expectations: SCSI disks more reliable than SCSI cables!
=> A difference between simulation and prototypes
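A quick sketch (Python) that tallies the failure rates above; the counts are from the slide, and the script itself is only illustrative:

```python
# Minimal sketch tallying the Tertiary Disk failure counts above.
failures = {                        # component: (failed, total)
    "IBM SCSI disk":          (7, 364),
    "IDE (internal) disk":    (6, 20),
    "SCSI controller":        (1, 44),
    "SCSI cable":             (1, 39),
    "Ethernet card":          (1, 20),
    "Ethernet switch":        (1, 2),
    "Enclosure power supply": (3, 92),
}
# Sort by failure rate, highest first.
for part, (failed, total) in sorted(failures.items(),
                                    key=lambda kv: kv[1][0] / kv[1][1],
                                    reverse=True):
    print(f"{part:24s} {failed:2d}/{total:<3d} = {failed / total:5.1%}")
# The surprise: SCSI cables (~2.6%) failed at a higher rate
# than the SCSI disks themselves (~1.9%).
```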
Slide 15
Lessons from Tertiary Disk Project
• Maintenance is hard on current systems
– Hard to know what is going on, who is to blame
• Everything can break
– It’s not what you expect in advance
– Follow the rule of no single point of failure
• Nothing fails fast
– Eventually behaves badly enough that the operator “fires” the poor performer, but it doesn’t “quit”
• Most failures may be predicted
Slide 16
The past: research goals and assumptions of the last 15 years
• Goal #1: Improve performance
• Goal #2: Improve performance
• Goal #3: Improve cost-performance
• Assumptions:
– Humans are perfect (they don’t make mistakes during installation, wiring, upgrade, maintenance, or repair)
– Software will eventually be bug-free (hire better programmers!)
– Hardware MTBF is already very large (~100 years between failures) and will continue to increase
– Maintenance costs are irrelevant vs. purchase price (maintenance is a function of price, so cheaper helps)
Slide 17
Learning from others: Bridges
• 1800s: 1/4 of iron truss railroad bridges failed!
• Safety is now part of Civil Engineering DNA
• Techniques invented since the 1800s:
– Learn from failures vs. successes
– Redundancy to survive some failures
– Margin of safety 3X-6X vs. calculated load
– (CS&E version of a safety margin?)
• What will people of the future think of our computers?
Slide 18
Recovery-Oriented Computing Philosophy
“If a problem has no solution, it may not be a problem, but a fact, not to be solved, but to be coped with over time”
— Shimon Peres (“Peres’s Law”)
• People/HW/SW failures are facts, not problems
• Improving recovery/repair improves availability
– UnAvailability ≈ MTTR / MTTF
– 1/10th the MTTR is just as valuable as 10X the MTBF (assuming MTTR is much less than MTTF); see the sketch after this list
• Recovery/repair is how we cope with the above facts
• Since a major sysadmin job is recovery after failure, ROC also helps with maintenance/TCO
• Since cost of ownership is 5-10X the HW/SW price, if necessary, use disk/DRAM space and processor performance for ACME
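A back-of-the-envelope sketch of the availability arithmetic above; the MTTR/MTTF numbers are made up for illustration, not taken from any project:

```python
# Minimal sketch: unavailability ~ MTTR / MTTF when MTTR << MTTF.
# Shows that cutting MTTR 10x helps about as much as raising MTTF 10x.

def unavailability(mttr_hours: float, mttf_hours: float) -> float:
    """Steady-state fraction of time the service is down."""
    return mttr_hours / (mttr_hours + mttf_hours)

baseline       = unavailability(mttr_hours=4.0, mttf_hours=10_000.0)
faster_repair  = unavailability(mttr_hours=0.4, mttf_hours=10_000.0)
fewer_failures = unavailability(mttr_hours=4.0, mttf_hours=100_000.0)

print(f"baseline:        {baseline:.6%}")
print(f"10x lower MTTR:  {faster_repair:.6%}")    # ~0.004%
print(f"10x higher MTTF: {fewer_failures:.6%}")   # ~0.004%, nearly identical
```

The two improved cases land within a rounding error of each other, which is the point: repair time is as much a lever on availability as time between failures.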
Slide 19
Approach to ROC
• Failure data collection: why do Internet services fail? What do failures look like?
– Collected data from 3 Internet sites: operator error > 50% of the time
• Recovery benchmarks
– Run recovery experiments that trigger faults and measure how long recovery takes (see the sketch below)
– SW RAID recovery: Solaris vs. Linux vs. Windows = 1:5:30
• Margin of safety: to recover from surprises
• Construct a clustered email service as an example
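A minimal sketch of what such a recovery experiment could look like, assuming a hypothetical service object with inject_fault() and is_healthy() probes; this is illustrative, not the project’s actual harness:

```python
# Sketch of a recovery benchmark: inject one fault, then time how
# long the system takes to return to healthy operation.
import time

def measure_recovery(service, fault, poll_interval=0.5, timeout=600.0):
    """Return seconds from fault injection until the service is healthy again."""
    service.inject_fault(fault)          # e.g., fail one disk in a SW RAID set
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        if service.is_healthy():         # e.g., reads/writes all succeed again
            return time.monotonic() - start
        time.sleep(poll_interval)
    raise TimeoutError(f"no recovery from {fault!r} within {timeout}s")
```

Repeating this for a matrix of fault types and systems is what lets you report comparisons like the 1:5:30 SW RAID result above.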
Slide 20
Approach to Email Service
• Recovery experiments while developing the code and after it is deployed
– E.g., glibc with a script to trigger errors (FIG)
• Automated diagnosis
– E.g., trace all modules used per request, log whether each request fails or succeeds, put the traces into a database, and use data mining to find the faulty module (Pinpoint: 60% to 90% accurate); see the sketch below
• Fast restart (with Fox)
– Partition software so only a subset of the system has to restart (5X reduction in the Mercury example)
• Reversible email service: Undo for operators
– Rewind, Repair, and Redo: remove a virus via time travel
• See http://roc.cs.berkeley.edu/294fall01/
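A minimal sketch of Pinpoint-style diagnosis under stated assumptions: request traces arrive as (modules, succeeded) pairs, and the scoring is a simple failure-correlation heuristic, not Pinpoint’s actual data-mining algorithm:

```python
# Rank modules by how much more often they appear in failed
# requests than in successful ones.
from collections import Counter

def suspect_modules(traces):
    """traces: iterable of (modules_touched, succeeded) pairs."""
    in_failed, in_ok = Counter(), Counter()
    for modules, succeeded in traces:
        bucket = in_ok if succeeded else in_failed
        bucket.update(set(modules))      # count each module once per request
    n_failed = sum(1 for _, ok in traces if not ok) or 1
    n_ok = sum(1 for _, ok in traces if ok) or 1
    score = {m: in_failed[m] / n_failed - in_ok[m] / n_ok
             for m in set(in_failed) | set(in_ok)}
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical traces from an email service:
traces = [(["frontend", "smtp", "spool"], True),
          (["frontend", "imap", "index"], False),
          (["frontend", "imap", "spool"], False),
          (["frontend", "smtp", "index"], True)]
print(suspect_modules(traces)[0])   # -> ('imap', 1.0): the most suspect module
```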
Slide 21
Interested in ROCing?
• Many research opportunities, low-hanging fruit
– Failure data collection, analysis, and publication
– Create/run recovery and maintainability benchmarks: compare (by vendor) databases, file systems, routers, …
– Invent and evaluate techniques to reduce MTTR and TCO in computation, storage, and network systems

“If it’s important, how can you say it’s impossible if you don’t try?”
— Jean Monnet, a founder of the European Union

http://ROC.cs.berkeley.edu