Bio-IT for Core Facility Managers

Description: This is a massive slide deck I used as the starting point for a 1.5-hour talk at the 2012 NERLSCD conference (www.nerlscd.org). A mixture of old and (some) new slides from my usual stuff.

Transcript
Page 1: Bio-IT for Core Facility Managers

Bio-IT For Core Facility Leaders: Tips, Tricks & Trends

2012 NERLSCD Meeting - www.nerlscd.org

Page 2: Bio-IT for Core Facility Managers

1. Intro
2. Meta-Issues (The Big Picture)
3. Infrastructure Tour
4. Compute & HPC
5. Storage
6. Cloud & Big Data

Page 3: Bio-IT for Core Facility Managers

I’m Chris.

I’m an infrastructure geek.

I work for the BioTeam.

@chris_dag

Page 4: Bio-IT for Core Facility Managers

BioTeam: Who, what & why

‣ Independent consulting shop
‣ Staffed by scientists forced to learn IT, SW & HPC to get our own research done
‣ 12+ years bridging the “gap” between science, IT & high performance computing
‣ www.bioteam.net

Page 5: Bio-IT for Core Facility Managers

Seriously. Listen to me at your own risk

‣ Clever people find multiple solutions to common issues
‣ I’m fairly blunt, burnt-out and cynical in my advanced age
‣ A significant portion of my work has been done in demanding production Biotech & Pharma environments
‣ Filter my words accordingly

Page 6: Bio-IT for Core Facility Managers

1. Intro
2. Meta-Issues (The Big Picture)
3. Infrastructure Tour
4. Compute & HPC
5. Storage
6. Cloud & Big Data

Page 7: Bio-IT for Core Facility Managers

Meta-Issues: Why you need to track this stuff ...

Page 8: Bio-IT for Core Facility Managers

Big Picture: Why this stuff matters ...

‣ HUGE revolution in the rate at which lab instruments are being redesigned, improved & refreshed

• Example: CCD sensor upgrade on that confocal microscopy rig just doubled your storage requirements

• Example: That 2D ultrasound imager is now a 3D imager

• Example: Illumina HiSeq upgrade just doubled the rate at which you can acquire genomes. Massive downstream increase in storage, compute & data movement needs

Page 9: Bio-IT for Core Facility Managers

The Central Problem Is ...

‣ Instrumentation & protocols are changing FAR FASTER than we can refresh our Research-IT & Scientific Computing infrastructure

• The science is changing month-to-month ...

• ... while our IT infrastructure only gets refreshed every 2-7 years

‣ We have to design systems TODAY that can support unknown research requirements & workflows over many years (gulp ...)

Page 10: Bio-IT for Core Facility Managers

The Central Problem Is ...

‣ The easy period is over
‣ 5 years ago you could toss inexpensive storage and servers at the problem; even in a nearby closet or under a lab bench if necessary
‣ That does not work any more; IT needs are too extreme
‣ 1000-CPU Linux clusters and petascale storage are the new normal; try fitting THAT in a closet!

Page 11: Bio-IT for Core Facility Managers

The Take Home Lesson: What core facility leadership needs to understand

‣ The incredible rate of cost decreases & capability gains seen in the lab instrumentation space is not mirrored everywhere

‣ As gear gets cheaper/faster, scientists will simply do more work and ask more questions. Nobody simply banks the financial savings when an instrument gets 50% cheaper -- they just buy two of them!

‣ IT technology is not improving at the same rate; we also can’t change our IT infrastructures all that rapidly

Page 12: Bio-IT for Core Facility Managers

If you get it wrong ...

‣ Lost opportunity
‣ Frustrated & very vocal researchers
‣ Problems in recruiting
‣ Publication problems

Page 13: Bio-IT for Core Facility Managers

1. Intro
2. Meta-Issues (The Big Picture)
3. Infrastructure Tour
4. Compute & HPC
5. Storage
6. Cloud & Big Data

Page 14: Bio-IT for Core Facility Managers

Infrastructure Tour: What does this stuff look like?

Page 15: Bio-IT for Core Facility Managers

Self-contained single-instrument infrastructure

Page 16: Bio-IT for Core Facility Managers

Illumina GA

Page 17: Bio-IT for Core Facility Managers

Instrument Control Workstation

Page 18: Bio-IT for Core Facility Managers

SOLiD Sequencer ...

Page 19: Bio-IT for Core Facility Managers

... sits on top of a 24U server rack ...

Page 20: Bio-IT for Core Facility Managers

Another lab-local HPC cluster + storage

Page 21: Bio-IT for Core Facility Managers

More lab-local servers & storage

Page 22: Bio-IT for Core Facility Managers

Small core w/ multiple instrument support

Page 23: Bio-IT for Core Facility Managers

Small cluster; large storage

Page 24: Bio-IT for Core Facility Managers

Mid-sized core facility

Page 25: Bio-IT for Core Facility Managers

Large Core Facility

Page 26: Bio-IT for Core Facility Managers

Large Core Facility

Page 27: Bio-IT for Core Facility Managers

Large Core Facility

Page 28: Bio-IT for Core Facility Managers

Colocation Cages

Page 29: Bio-IT for Core Facility Managers

Inside a colo cage

Page 30: Bio-IT for Core Facility Managers

Linux Cluster + In-row chillers (front)

Page 31: Bio-IT for Core Facility Managers

Linux Cluster + In-row chillers (rear)

Page 32: Bio-IT for Core Facility Managers

1U “Pizza Box” Style Server Chassis

Page 33: Bio-IT for Core Facility Managers

Pile of “pizza boxes”

Page 34: Bio-IT for Core Facility Managers

4U Rackmount Servers

Page 35: Bio-IT for Core Facility Managers

“Blade” Servers & Enclosure

Page 36: Bio-IT for Core Facility Managers

Hybrid Modular Server

Page 37: Bio-IT for Core Facility Managers

Integrated: Blades + Hypervisor + Storage

Page 38: Bio-IT for Core Facility Managers

Petabyte-scale Storage

Page 39: Bio-IT for Core Facility Managers

Yep. This counts.

16 monster compute nodes + 22 GPU nodes
Cost? 30 bucks an hour via the AWS Spot Market

Real-world screenshot from earlier this month

Page 40: Bio-IT for Core Facility Managers

Physical data movement station

Page 41: Bio-IT for Core Facility Managers

Physical data movement station

Page 42: Bio-IT for Core Facility Managers

“Naked” Data Movement

Page 43: Bio-IT for Core Facility Managers

“Naked” Data Archive

Page 44: Bio-IT for Core Facility Managers

The cliché image

Page 45: Bio-IT for Core Facility Managers

Backblaze Pod: 100 terabytes for $12,000

Page 46: Bio-IT for Core Facility Managers

1. Intro
2. Meta-Issues (The Big Picture)
3. Infrastructure Tour
4. Compute & HPC
5. Storage
6. Cloud & Big Data

Page 47: Bio-IT for Core Facility Managers

Compute: Actually the easy bit ...

Page 48: Bio-IT for Core Facility Managers

Compute Power: Not a big deal in 2012 ...

‣ Compute power is largely a solved problem
‣ It’s just a commodity
‣ Cheap, simple & very easy to acquire
‣ Let’s talk about what you need to know ...

Page 49: Bio-IT for Core Facility Managers

Compute Trends: Things you should be tracking ...

‣ Facility issues
‣ “Fat Nodes” replacing Linux clusters
‣ Increasing presence of serious “lab-local” IT

Page 50: Bio-IT for Core Facility Managers

Facility Stuff

‣ Compute & storage requirements are getting larger and larger

‣ We are packing more “stuff” into smaller spaces

‣ This radically increases electrical and cooling requirements

Page 51: Bio-IT for Core Facility Managers

Facility Stuff - Core issue

‣ Facility & power issues can take many months or years to address

‣ Sometimes they may be impossible to address (new building required ...)

‣ If your research IT footprint is growing fast, you must be well versed in your facility planning/upgrade process

Page 52: Bio-IT for Core Facility Managers

Facility Stuff - One more thing

‣ Sometimes central IT will begin facility upgrade efforts without consulting with research users

• This was the reason behind one of our more ‘interesting’ projects in 2012

‣ ... a client was weeks away from signing off on a $MM datacenter which would not have had enough electricity to support current research & faculty recruiting commitments

Page 53: Bio-IT for Core Facility Managers

“Fat” Nodes Replacing Clusters

Page 54: Bio-IT for Core Facility Managers

Fat Nodes - 1 box replacing a cluster

‣ This server has 64 CPU cores
‣ ... and up to 1TB of RAM
‣ Fantastic genomics/chemistry system
• A 256GB RAM version only costs $13,000
‣ These single systems are replacing small clusters in some environments

Page 55: Bio-IT for Core Facility Managers

Fat Nodes - Clever Scale-out Packaging

‣ This 2U chassis contains 4 individual servers

‣ Systems like this get near “blade” density without the price premium seen with proprietary blade packaging

‣ These “shrink” clusters in a major way or replace small ones

Page 56: Bio-IT for Core Facility Managers

The other trend ...

Page 57: Bio-IT for Core Facility Managers

“Serious” IT now in your wet lab ...

‣ Instruments used to ship with a Windows PC “instrument control workstation”

‣ As instruments get more powerful, the “companion” hardware is starting to scale up

‣ End result: very significant gear that used to live in your datacenter is now being rolled into lab environments

Page 58: Bio-IT for Core Facility Managers

“Serious” IT now in your wet lab ...

‣ You may be surprised what you find in your labs in ’12

‣ ... this can be problematic for a few reasons:

1. IT support & backup
2. Power & cooling
3. Noise
4. Security

Page 59: Bio-IT for Core Facility Managers

Networking: Also not particularly worrisome ...

Page 60: Bio-IT for Core Facility Managers

Networking

‣ Networking is also not super complicated
‣ It’s also fairly cheap & commoditized in ’12
‣ There are three core uses for networks:

1. Communication between servers & services

2. Message passing within a single application

3. Sharing files and data between many clients

Page 61: Bio-IT for Core Facility Managers

Networking 1 - Servers & Services

‣ Ethernet. Period. Enough said.
‣ Your only decision is between 10-Gig and 1-Gig Ethernet
‣ 1-Gig Ethernet is pervasive and dirt cheap
‣ 10-Gig Ethernet is getting cheaper and on its way to becoming pervasive

Page 62: Bio-IT for Core Facility Managers

Networking 1 - Ethernet

‣ Everything speaks Ethernet
‣ 1-Gig is still the common interconnect for most things
‣ 10-Gig is the standard now for the “core”
‣ 10-Gig is the standard for top-of-rack and “aggregation”
‣ 10-Gig connections to “special” servers are the norm

Page 63: Bio-IT for Core Facility Managers

Networking 2 - Message Passing

‣ Parallel applications can span many servers at once

‣ They communicate/coordinate via “message passing”

‣ Ethernet is fine for this but has somewhat high latency between message packets

‣ Many apps can tolerate Ethernet-level latency; some applications clearly benefit from a message passing network with lower latency

‣ There used to be many competing alternatives

‣ The clear 2012 winner is “Infiniband”

Page 64: Bio-IT for Core Facility Managers

Networking 2 - Message Passing

‣ The only things you need to know ...
‣ Infiniband is an expensive networking alternative that offers much lower latency than Ethernet
‣ You would only pay for and deploy an IB fabric if you had an application or use case that requires it
‣ No big deal. It’s just “another” network.
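Whether an app can tolerate Ethernet-level latency is measurable. Below is a minimal ping-pong latency sketch in Python using the mpi4py library (an assumption of convenience; any MPI binding works). The iteration count and message size are arbitrary illustration values, and you would launch it under your MPI stack with something like "mpirun -np 2 python pingpong.py":

    # pingpong.py -- crude one-way latency estimate between two MPI ranks
    from mpi4py import MPI
    import time

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    n_iters = 10000
    buf = bytearray(8)  # tiny message: latency, not bandwidth, dominates

    comm.Barrier()
    start = time.time()
    for _ in range(n_iters):
        if rank == 0:
            comm.Send(buf, dest=1)
            comm.Recv(buf, source=1)
        elif rank == 1:
            comm.Recv(buf, source=0)
            comm.Send(buf, dest=0)
    elapsed = time.time() - start

    if rank == 0:
        # each iteration is one round trip (two messages)
        print("approx one-way latency: %.1f microseconds"
              % (elapsed / n_iters / 2 * 1e6))

Run the same sketch over your Ethernet fabric and an IB fabric and the difference tells you whether your codes would actually benefit.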

Page 65: Bio-IT for Core Facility Managers

Networking 3 - File Sharing

‣ For ‘Omics this is the primary focus area
‣ There is an overwhelming need for shared read/write access to files and data between instruments, the HPC environment and researcher desktops
‣ In HPC environments you will often have a separate network just for file sharing traffic

Page 66: Bio-IT for Core Facility Managers

Networking 3 - File Sharing

‣ Generic file sharing uses familiar NFS or Windows fileshare protocols. No big deal

‣ It is almost always implemented over Ethernet, although often with a mixture of 10-Gig and 1-Gig connections

• 10-Gig connections to the file servers, storage and edge switches; 1-Gig connections to cluster nodes and user desktops

‣ Infiniband also has a presence here

• Many “parallel” or “cluster” filesystems may talk to the clients via NFS-over-Ethernet, but internally the distributed components may use a private Infiniband network for metadata and coordination
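As a concrete (hypothetical) sketch of the generic NFS case, a shared science filesystem reduces to a server-side export plus a client-side mount; the hostname, subnet and paths below are placeholders:

    # /etc/exports on the file server:
    /export/seqdata  10.10.0.0/16(rw,async,no_subtree_check)

    # On each cluster node or analysis desktop:
    mount -t nfs filer01:/export/seqdata /mnt/seqdata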

Page 67: Bio-IT for Core Facility Managers

Storage (the hard bit ...)

Page 68: Bio-IT for Core Facility Managers

Storage: Setting the stage ...

‣ Life science is generating torrents of data
‣ The size and volume often dwarf all other research areas - particularly with Bioinformatics & Genomics work
‣ Big/fast storage is not cheap and is not commodity
‣ There are many vendors and many ways to spectacularly waste tons of money
‣ And we still have an overwhelming need for storage that can be shared concurrently between many different users, systems and clients

Page 69: Bio-IT for Core Facility Managers

Life Science “Data Deluge”

‣ Scare stories and shocking graphs are getting tiresome
‣ We’ve been dealing with terabyte-scale lab instruments & data movement issues since 2004
• And somehow we’ve managed to survive ...
‣ Next few slides
• Try to explain why storage does not stress me out all that much in 2012 ...

Page 70: Bio-IT for Core Facility Managers

The sky is not falling.

1. You are not the Broad Institute or Sanger Center

‣ The overwhelming majority of us do not operate at Broad/Sanger levels

• These folks add 200+ TB a week in primary storage

‣ We still face challenges but the scale/scope is well within the bounds of what traditional IT technologies can handle

‣ We’ve been doing this for years

• Many vendors, best practices, “war stories”, proven methods and just plain “people to talk to ...”

Page 71: Bio-IT for Core Facility Managers

The sky is not falling.

2. Instrument Sanity Beckons

‣ Yesteryear: Terascale .TIFF Tsunami
‣ Yesterday: RTA, in-instrument data reduction
‣ Today: Basecalls, BAMs & Outsourcing
‣ Tomorrow: Write directly to the cloud

Page 72: Bio-IT for Core Facility Managers

The sky is not falling.

3. Peta-scale storage is not really exotic or unusual any more.

‣ Peta-scale storage has not been a risky exotic technology gamble for years now

• A few years ago you’d be betting your career

‣ Today it’s just an engineering & budget exercise

• Multiple vendors don’t find petascale requirements particularly troublesome and can deliver proven systems within weeks

• $1M (or less in ’12) will get you 1PB from several top vendors

‣ However, it is still HARD to do BIG, FAST & SAFE

• Hard but solvable; many resources & solutions out there

Page 73: Bio-IT for Core Facility Managers

On the other hand ...

Page 74: Bio-IT for Core Facility Managers

OMG! The Sky Is Falling! Maybe a little panic is appropriate ...

Page 75: Bio-IT for Core Facility Managers

The sky IS falling!

1. Those @!*#&^@ Scientists ...

‣ Even as instrument output declines ...

‣ ... downstream storage consumption by end-user researchers is increasing rapidly

‣ Each new genome generates new data mashups, experiments, data interchange conversions, etc.

‣ It is MUCH harder to do capacity planning against human beings vs. instruments

Page 76: Bio-IT for Core Facility Managers

The sky IS falling!

2. @!*#&^@ Scientific Leadership ...

‣ Sequencing is already a commodity

‣ NOBODY simply banks the savings

‣ EVERYBODY buys or does more

Page 77: Bio-IT for Core Facility Managers

The sky IS falling!

BIG SCARY GRAPH: Gigabases sequenced vs. Moore’s Law, 2007-2012. OMG!!

Page 78: Bio-IT for Core Facility Managers

The sky IS falling!

3. Uncomfortable truths

‣ The cost of acquiring data (genomes) is falling faster than the rate at which industry is increasing drive capacity

‣ Human researchers downstream of these datasets are also consuming more storage (and less predictably)

‣ High-scale labs must react or potentially face catastrophic issues in 2012-2013

Page 79: Bio-IT for Core Facility Managers

The sky IS falling!

5. Something will have to break ...

‣ This is not sustainable

• Downstream consumption exceeding instrument data reduction

• Commoditization yielding more platforms

• Chemistry moving faster than IT infrastructure

• What the heck are we doing with all this sequence?

Page 80: Bio-IT for Core Facility Managers

CRAM it.

Page 81: Bio-IT for Core Facility Managers

The sky IS falling!

CRAM it in 2012 ...

‣ Minor improvements are useless; order-of-magnitude gains are needed

‣ Some people are talking about radical new methods - compressing against reference sequences and only storing the diffs

• With a variable compression “quality budget” to spend on lossless techniques in the areas you care about

‣ http://biote.am/5v - Ewan Birney on “Compressing DNA”

‣ http://biote.am/5w - The actual CRAM paper

‣ If CRAM takes off, the storage landscape will change
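To make the “store only the diffs” idea concrete, here is a toy Python sketch. It is emphatically not the real CRAM codec (no quality scores, no quality budget), just the core intuition of compressing a read against a reference:

    # Store only the positions where a read differs from the reference,
    # plus the substituted base at each position.
    def compress_against_reference(reference, read, offset):
        """Return (offset, length, diffs) where diffs is [(pos, base), ...]."""
        diffs = [(i, b) for i, b in enumerate(read)
                 if reference[offset + i] != b]
        return (offset, len(read), diffs)

    def decompress(reference, record):
        offset, length, diffs = record
        read = list(reference[offset:offset + length])
        for pos, base in diffs:
            read[pos] = base
        return "".join(read)

    ref = "ACGTACGTACGT"
    read = "ACGAACG"        # one mismatch vs. the reference at offset 0
    rec = compress_against_reference(ref, read, 0)
    assert decompress(ref, rec) == read
    print(rec)              # (0, 7, [(3, 'A')]) -- store this, not the read

Most bases match the reference, so a typical record is tiny compared to the raw read; that is where the order-of-magnitude gains would come from.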

Page 82: Bio-IT for Core Facility Managers

What comes next? The next 18 months will be really fun ...

Page 83: Bio-IT for Core Facility Managers

What comes next: The same rules apply for 2012 and beyond ...

‣ Accept that science changes faster than IT infrastructure
‣ Be glad you are not Broad/Sanger
‣ Flexibility, scalability and agility become the key requirements of research informatics platforms
• Tiered storage is in your future ...
‣ Shared/concurrent access is still the overwhelming storage use case
• We’ll continue to use clustered, parallel and scale-out NAS solutions

Page 84: Bio-IT for Core Facility Managers

What comes next: In the following year ...

‣ Many peta-scale capable systems deployed
• Most will operate in the hundreds-of-TBs range
‣ Far more aggressive “data triage”
• “.BAM only!”
‣ Genome compression via CRAM
‣ Even more data will sit untouched & unloved
‣ Growing need for tiers, HSM & even tape

Page 85: Bio-IT for Core Facility Managers

What comes next: In the following year ...

‣ Broad, Sanger and others will pave the way with respect to metadata-aware & policy-driven storage frameworks

• And we’ll shamelessly copy a year or two later

‣ I’m still on my cloud storage kick

• The economics are inescapable; cloud will be built into storage platforms, gateways & VMs

• Amazon S3 is only an HTTP RESTful call away

• Cloud will become “just another tier”
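For a sense of how close that RESTful call really is, here is a minimal upload/download sketch using the boto Python library (the common AWS SDK in 2012). The bucket and file names are hypothetical, and credentials are assumed to come from the environment or ~/.boto:

    import boto

    conn = boto.connect_s3()
    bucket = conn.create_bucket("my-core-facility-archive")   # hypothetical bucket

    key = bucket.new_key("run42/sample.bam")
    key.set_contents_from_filename("/data/run42/sample.bam")  # push to the cloud tier
    key.get_contents_to_filename("/scratch/sample.bam")       # ... and pull it back

A storage gateway or HSM policy engine doing “cloud as a tier” is, at bottom, automating calls like these.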

Page 86: Bio-IT for Core Facility Managers

What comes next: Expect your storage to be smarter & more capable ...

‣ What do DDN, Panasas, Isilon, BlueArc, etc. have in common?

• Under the hood they all run Unix or Unix-like OS’s on x86_64 architectures

‣ Some storage arrays can already run applications natively

• More will follow

• Likely a big trend for 2012

Page 87: Bio-IT for Core Facility Managers

But what about today?

Page 88: Bio-IT for Core Facility Managers

Still trying to avoid this. (100TB of scientific data, no RAID, unsecured on lab benchtops)

Page 89: Bio-IT for Core Facility Managers

Flops, Failures & Freakouts: Common storage mistakes ...

Page 90: Bio-IT for Core Facility Managers

Flops, Failures & Freakouts

#1 - Unchecked Enterprise Storage Architects

‣ Scientist: “My work is priceless, I must be able to access it at all times”

‣ Corporate/Enterprise Storage Guru: “Hmmm ... you want high availability, huh?”

‣ System delivered:

• 40TB Enterprise SAN

• Asynchronous replication to remote site

• Can’t scale, can’t do NFS easily

• ~$500K per year in operational & maintenance costs

Page 91: Bio-IT for Core Facility Managers

Flops, Failures & Freakouts

#2 - Unchecked User Requirements

‣ Scientist: “I do bioinformatics, I am rate limited by the speed of file IO operations. Faster disk means faster science.”

‣ System delivered:

• Budget blown on a top-tier, fastest-possible ‘Cadillac’ system

‣ Outcome:

• System fills to capacity in 9 months; zero budget left.

Page 92: Bio-IT for Core Facility Managers

Flops, Failures & Freakouts

#3 - D.I.Y. Cluster & Parallel Filesystems

‣ A common source of storage unhappiness

‣ Root cause:

• Not enough pre-sales time spent on design and engineering

• Choosing Open Source over Common Sense

‣ System as built:

• Not enough metadata controllers

• Issues with the interconnect fabric

• Poor selection & configuration of key components

‣ End result:

• Poor performance or availability

• High administrative/operational burden

Page 93: Bio-IT for Core Facility Managers

Hard Lessons Learned: What these tales tell us ...

Page 94: Bio-IT for Core Facility Managers

Flops, Failures & Freakouts

Hard Lessons Learned

‣ End-users are not precise with storage terms

• “Extremely reliable” means no data loss; not millions spent on 99.99999% high availability

‣ When true costs are explained:

• Many research users will trade a small amount of uptime or availability for more capacity or capabilities

• ... and will also often trade some level of performance in exchange for a huge win in capacity or capability

Page 95: Bio-IT for Core Facility Managers

Flops, Failures & Freakouts

Hard Lessons Learned

‣ End-users demand the world but are willing to compromise

• It is necessary for IT staff to really talk to them and understand their work, needs and priorities

• It is also essential to explain the true costs involved

‣ People demanding the “fastest” storage often don’t have actual metrics to back their assertions

Page 96: Bio-IT for Core Facility Managers

Flops, Failures & Freakouts

Hard Lessons Learned

‣ Software-based parallel or clustered file systems are non-trivial to implement correctly

• It is essential to involve experts in the initial design phase

• Even if using the ‘open source’ version ...

‣ Commercial support is essential

• And I say this as an open source zealot ...

Page 97: Bio-IT for Core Facility Managers

The road ahead: My $.02 for 2012 ...

Page 98: Bio-IT for Core Facility Managers

The Road Ahead

Storage Trends & Tips for 2012

‣ Peta-capable platforms required

‣ Scale-out NAS is still the best fit

‣ Customers will no longer build one big scale-out NAS tier

‣ My ‘hack’ of using nearline-spec storage as the primary science tier is probably obsolete in ’12

‣ Not everything is worth backing up

‣ Expect disruptive stuff

Page 99: Bio-IT for Core Facility Managers

The Road Ahead

Trends & Tips for 2012

‣ Monolithic tiers no longer cut it

• Changing science & instrument output patterns are to blame

• We can’t get away with biasing towards capacity over performance any more

‣ pNFS should go mainstream in ’12

• { fantastic news }

‣ Tiered storage IS in your future

• Multiple vendors & types

Page 100: Bio-IT for Core Facility Managers

The Road Ahead

Trends & Tips for 2012

‣ Your storage will be able to run apps

• Dedupe, cloud gateways & replication

• ‘CRAM’ or similar compression

• Storage Resource Brokers (iRODS) & metadata servers

• HDFS/Hadoop hooks?

• Lab, data management & LIMS applications

Drobo appliance running BioTeam MiniLIMS internally ...

Page 101: Bio-IT for Core Facility Managers

The Road Ahead

Trends & Tips for 2012

‣ Hadoop / MapReduce / BigData

• Just like GRID and CLOUD back in the day, you’ll need a gas mask to survive the smog of hype and vendor press releases

• You still need to think about it

• ... and have a roadmap for doing it

• Deep, deep ties to your storage

• Your users want/need it

• My $.02? A fantastic cloud use case

Page 102: Bio-IT for Core Facility Managers

Disruptive Technology Example

Page 103: Bio-IT for Core Facility Managers

Backblaze Pod For Biotech

Page 104: Bio-IT for Core Facility Managers

Backblaze: 100TB for $12,000

Page 105: Bio-IT for Core Facility Managers

1. Intro
2. Meta-Issues (The Big Picture)
3. Infrastructure Tour
4. Compute & HPC
5. Storage
6. Cloud & Big Data

Page 106: Bio-IT for Core Facility Managers

The ‘C’ word: Does a Bio-IT talk exist if it does not mention “the cloud”?

Page 107: Bio-IT for Core Facility Managers

Defining the “C-word”

‣ Just like “Grid Computing”, the “cloud” word has been diluted to near uselessness thanks to hype, vendor FUD and lunatic marketing minions

‣ It is helpful to define terms before talking seriously
‣ There are three types of cloud:
‣ “IAAS”, “SAAS” & “PAAS”

Page 108: Bio-IT for Core Facility Managers

Cloud Stuff

‣ Before I get nasty ...
‣ I am not an Amazon shill
‣ I am a jaded, cynical, zero-loyalty consumer of IT services and products that let me get #%$^ done
‣ Because I only get paid when my #%$^ works, I am picky about what tools I keep in my toolkit
‣ Amazon AWS is an infinitely cool tool

Page 109: Bio-IT for Core Facility Managers

Cloud Stuff - SAAS

‣ SAAS = “Software as a Service”
‣ Think:
‣ gmail.com

Page 110: Bio-IT for Core Facility Managers

Cloud Stuff - PAAS

‣ PAAS = “Platform as a Service”
‣ Think:
‣ https://basespace.illumina.com/
‣ salesforce.com
‣ MS office365.com, Apple iCloud, etc.

Page 111: Bio-IT for Core Facility Managers

Cloud Stuff - IAAS

‣ IAAS = “Infrastructure as a Service”
‣ Think:
‣ Amazon Web Services
‣ Microsoft Azure

Page 112: Bio-IT for Core Facility Managers

Cloud Stuff - IAAS

‣ When I talk “cloud” I mean IAAS
‣ And right now in 2012, Amazon IS the IAAS cloud
‣ ... everyone else is a pretender

Page 113: Bio-IT for Core Facility Managers

Cloud Stuff - Why IAAS

‣ IAAS clouds are the focal point for life science informatics

• Although some vendors are now offering PAAS and SAAS options ...

‣ The “infrastructure” clouds give us the “building blocks” we can assemble into useful stuff

‣ Right now Amazon has the best & most powerful collection of “building blocks”

‣ The competition is years behind ...
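To make “building blocks” concrete: on an IAAS cloud, a new server is one API call. A hedged sketch using the boto Python library; the region, AMI ID, key pair and instance type below are placeholders, not real values:

    import boto.ec2

    conn = boto.ec2.connect_to_region("us-east-1")
    reservation = conn.run_instances(
        "ami-12345678",            # hypothetical machine image
        min_count=1, max_count=1,
        key_name="mykey",          # hypothetical SSH key pair
        instance_type="m1.large",
    )
    print(reservation.instances[0].id)  # the new server's instance ID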

Page 114: Bio-IT for Core Facility Managers

A message for the cloud pretenders ...

Page 115: Bio-IT for Core Facility Managers

No APIs? Not a cloud.

Page 116: Bio-IT for Core Facility Managers

No self-service? Not a cloud.

Page 117: Bio-IT for Core Facility Managers

Installing VMWare & excreting a press release? Not a cloud.

Page 118: Bio-IT for Core Facility Managers

I have to email a human? Not a cloud.

Page 119: Bio-IT for Core Facility Managers

~50% failure rate when launching new servers? Stupid cloud.

Page 120: Bio-IT for Core Facility Managers

Block storage and virtual servers only? (barely) a cloud.

Page 121: Bio-IT for Core Facility Managers

Private Clouds: My $.02

Page 122: Bio-IT for Core Facility Managers

Private Clouds in 2012:

‣ I’m no longer dismissing them as “utter crap”

‣ Usable & useful in certain situations

‣ The hype vs. reality ratio is still wacky

‣ Sensible only for certain shops

• Have you seen what you have to do to your networks & gear?

‣ There are easier ways

Page 123: Bio-IT for Core Facility Managers

Private Clouds: My Advice for ‘12

‣ Remain cynical (test vendor claims)
‣ Due diligence is still essential
‣ I personally would not deploy/buy anything that does not explicitly provide Amazon API compatibility

Page 124: Bio-IT for Core Facility Managers

Private Clouds: My Advice for ‘12

Most people are better off:

1. Adding VM platforms to existing HPC clusters & environments

2. Extending enterprise VM platforms to allow user self-service & server catalogs

Page 125: Bio-IT for Core Facility Managers

Cloud Advice: My $.02

Page 126: Bio-IT for Core Facility Managers

Cloud Advice: Don’t get left behind

‣ Research IT organizations need a cloud strategy today
‣ Those that don’t will be bypassed by frustrated users
‣ IaaS cloud services are only a departmental credit card away ... and some senior scientists are too big to be fired for violating IT policy :)

Page 127: Bio-IT for Core Facility Managers

Cloud Advice: Design Patterns

‣ You actually need three tested cloud design patterns:

‣ (1) To handle ‘legacy’ scientific apps & workflows
‣ (2) The special stuff that is worth re-architecting
‣ (3) Hadoop & big data analytics

Page 128: Bio-IT for Core Facility Managers

Cloud Advice: Legacy HPC on the Cloud

‣ MIT StarCluster
• http://web.mit.edu/star/cluster/
‣ This is your baseline; extend as needed (see the sketch below)
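As a rough sketch of what that baseline looks like in practice, here is a hypothetical minimal StarCluster setup. The field names follow my reading of the StarCluster docs, and the keys, key pair, cluster size and instance type are all placeholders, so verify against the documentation for your version:

    # Hypothetical minimal ~/.starcluster/config:
    [aws info]
    AWS_ACCESS_KEY_ID = <your-access-key>
    AWS_SECRET_ACCESS_KEY = <your-secret-key>

    [key mykey]
    KEY_LOCATION = ~/.ssh/mykey.rsa

    [cluster smallcluster]
    KEYNAME = mykey
    CLUSTER_SIZE = 4
    NODE_INSTANCE_TYPE = m1.large

    # Then, from the shell:
    #   starcluster start -c smallcluster mycluster
    #   starcluster sshmaster mycluster

A few minutes later you have a running Grid Engine cluster on EC2 that looks and feels like the one in your datacenter.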

Page 129: Bio-IT for Core Facility Managers

Cloud Advice: “Cloudy” HPC

‣ Some of our research workflows are important enough to be rewritten for “the cloud” and the advantages that a truly elastic & API-driven infrastructure can deliver

‣ This is where you have the most freedom
‣ There are many published best practices you can borrow
‣ Amazon Simple Workflow Service (SWF) looks sweet
‣ Good commercial options: Cycle Computing, etc.

Page 130: Bio-IT for Core Facility Managers

Hadoop & “Big Data”

‣ Hadoop and “big data” need to be on your radar
‣ Be careful though; you’ll need a gas mask to avoid the smog of marketing and vapid hype
‣ The utility is real and this does represent the “future path” for analysis of large data sets

Page 131: Bio-IT for Core Facility Managers

Cloud Advice - Hadoop & Big Data: Big Data HPC

‣ It’s gonna be a MapReduce world, get used to it
‣ Little need to roll your own Hadoop in 2012
‣ The ISV & commercial ecosystem is already healthy
‣ Multiple providers today, both onsite & cloud-based
‣ Often a slam-dunk cloud use case

Page 132: Bio-IT for Core Facility Managers

Hadoop & “Big Data”: What you need to know

‣ “Hadoop” and “Big Data” are now general terms
‣ You need to drill down to find out what people actually mean
‣ We are still in the period where senior mgmt. may demand “hadoop” or “big data” capability without any actual business or scientific need

Page 133: Bio-IT for Core Facility Managers

Hadoop & “Big Data”: What you need to know

‣ In broad terms you can break “Big Data” down into two very basic use cases:

1. Compute: Hadoop can be used as a very powerful platform for the analysis of very large data sets. The Google search term here is “map reduce” (a minimal sketch follows below)

2. Data stores: Hadoop is driving the development of very sophisticated “NoSQL”, non-relational databases and data query engines. The Google search terms include “nosql”, “couchdb”, “hive”, “pig”, “mongodb”, etc.

‣ Your job is to figure out which type applies for the groups requesting “hadoop” or “big data” capability
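As a minimal sketch of the Compute use case, here is the classic streaming pattern: a mapper and reducer written as ordinary Python filters, submitted via Hadoop Streaming. The input format and key choice are hypothetical; the jar path varies by Hadoop distribution:

    # mapper.py -- read tab-separated records on stdin, emit "key<TAB>1"
    import sys
    for line in sys.stdin:
        key = line.rstrip("\n").split("\t")[0]
        print("%s\t1" % key)

    # reducer.py -- streaming delivers keys sorted, so sum runs of equal keys
    import sys
    current, total = None, 0
    for line in sys.stdin:
        key, count = line.rstrip("\n").split("\t")
        if key != current and current is not None:
            print("%s\t%d" % (current, total))
            total = 0
        current = key
        total += int(count)
    if current is not None:
        print("%s\t%d" % (current, total))

    # Submitted via Hadoop Streaming, e.g.:
    #   hadoop jar hadoop-streaming.jar \
    #       -input /data/in -output /data/out \
    #       -mapper mapper.py -reducer reducer.py \
    #       -file mapper.py -file reducer.py

The framework handles partitioning, sorting, distribution and retries; your code only sees stdin and stdout.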

Page 134: Bio-IT for Core Facility Managers

High Throughput Science: Hadoop vs. traditional Linux clusters

‣ Hadoop is a very complex beast
‣ It’s also the way of the future, so you can’t ignore it
‣ It has a very tight dependency on moving the ‘compute’ as close as possible to the ‘data’
‣ Hadoop clusters are just different enough that they do not integrate cleanly with traditional Linux HPC systems
‣ Often treated as a separate silo or punted to the cloud

Page 135: Bio-IT for Core Facility Managers

Hadoop & “Big Data”: What you need to know

‣ Hadoop adoption is being driven by a small group of academics writing and releasing open source life science Hadoop applications

‣ Your people will want to run these codes
‣ In some academic environments you may find people wanting to develop on this platform

Page 136: Bio-IT for Core Facility Managers

Cloud Data Movement: My $.02

Page 137: Bio-IT for Core Facility Managers

Cloud Data Movement

‣ We’ve slung a ton of data in and out of the cloud
‣ We used to be big fans of physical media movement
‣ Remember these pictures?

Page 138: Bio-IT for Core Facility Managers

Physical data movement station 1

Page 139: Bio-IT for Core Facility Managers

Physical data movement station 2

Page 140: Bio-IT for Core Facility Managers

“Naked” Data Movement

Page 141: Bio-IT for Core Facility Managers

“Naked” Data Archive

Page 142: Bio-IT for Core Facility Managers

Cloud Data Movement

‣ We’ve got a new story for 2012
‣ And the next image shows why ...

Page 143: Bio-IT for Core Facility Managers

March 2012

Page 144: Bio-IT for Core Facility Managers

Cloud Data Movement: Wow!

‣ With a 1GbE internet connection ...
‣ and using Aspera software ...
‣ We sustained 700 Mbit/sec for more than 7 hours freighting genomes into Amazon Web Services (roughly 85 MB/sec, or on the order of 2TB over those 7 hours)
‣ This is fast enough for many use cases, including genome sequencing core facilities
‣ Chris Dwan’s webinar on this topic: http://biote.am/7e

Page 145: Bio-IT for Core Facility Managers

Cloud Data Movement: Wow!

‣ Results like this mean we now favor network-based data movement over physical media movement

‣ Large-scale physical data movement carries a high operational burden and consumes non-trivial staff time & resources

Page 146: Bio-IT for Core Facility Managers

Cloud Data Movement: There are three ways to do network data movement ...

‣ Buy software from Aspera and be done with it
‣ Attend the annual SuperComputing conference, see which student group wins the bandwidth challenge contest, and use their code
‣ Get GridFTP from the Globus folks
• Trend: At every single “data movement” talk I attended in 2011, it seemed that any speaker who was NOT using Aspera was a very happy user of GridFTP. #notCoincidence
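For the GridFTP route, the workhorse client is globus-url-copy. A single-command sketch with hypothetical endpoints; -p sets the number of parallel TCP streams, -tcp-bs the TCP buffer size, and -vb prints transfer performance as it runs:

    globus-url-copy -vb -p 8 -tcp-bs 16M \
        file:///data/run42.tar gsiftp://dtn.example.org/archive/run42.tar

The parallel streams and large buffers are what let it fill fat, high-latency WAN pipes that a single scp session never will.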

Page 147: Bio-IT for Core Facility Managers

Putting it all together

Page 148: Bio-IT for Core Facility Managers

Wrapping up

IT may just be a means to an end, but you need to get your head wrapped around it:

‣ (1) So you use/buy/request the correct ‘stuff’
‣ (2) So you don’t get cheated by a vendor
‣ (3) Because you need to understand your tools
‣ (4) Because trends in automation and orchestration are blurring the line between scientist & sysadmin

Page 149: Bio-IT for Core Facility Managers

Wrapping up - Compute & Servers

‣ Servers and compute power are pretty straightforward
‣ You just need to know roughly what your preferred compute building blocks look like
‣ ... and what special purpose resources you require (GPUs, large memory, high core count, etc.)
‣ Some of you may also have to deal with sizing, cost and facility (power, cooling, space) issues as well

Page 150: Bio-IT for Core Facility Managers

Wrapping up - Networking

‣ Networking is also not a hugely painful thing
‣ Ethernet rules the land; you might have to pick and choose between 1-Gig and 10-Gig Ethernet

‣ Understand that special networking technologies like Infiniband offer advantages but are expensive and need to be applied carefully (if at all)

‣ Knowing whether your MPI apps are latency sensitive will help

‣ And remember that networking is used for multiple things (server communication, application message passing & file and data sharing)

Page 151: Bio-IT for Core Facility Managers

Wrapping up - Storage

‣ If you are going to focus on one IT area, this is it

‣ It’s incredibly important for genomics and also incredibly complicated. There are many ways to waste money or buy the ‘wrong’ stuff

‣ You may only have one chance to get it correct and may have to live with your decision for years

‣ Budget is finite. You have to balance “speed” vs. “size” vs. “expansion capacity” vs. “high availability” and more ...

‣ “Petabyte-capable scale-out NAS” is usually the best starting point. You deviate away from NAS when scientific or technical requirements demand “something else”.

Page 152: Bio-IT for Core Facility Managers

Wrapping up - Hadoop / Big Data

‣ Probably the way of the future for big-data analytics. It’s worth spending time to study, especially if you intend to develop software in the future

‣ A popular target for current and emerging high-scale genomics tools. If you want to use those tools you need to deploy Hadoop

‣ It’s complicated and still changing rapidly. It can be difficult to integrate into existing setups

‣ Be cynical about hype & test vendor claims

Page 153: Bio-IT for Core Facility Managers

Wrapping up - Cloud

‣ Cloud is the future. The economics are inescapable and the advantages are compelling.

‣ The main obstacle holding back genomics is terabyte-scale data movement. The cloud is horrible if you have to move 2TB of data before you can run 2 hrs of compute!

‣ Your future core facility may involve a comp bio lab without a datacenter at all. Some organizations are already 100% virtual and 100% cloud-based

Page 154: Bio-IT for Core Facility Managers

The NGS cloud clincher.

700 Mbit/sec sustained for ~7 hours
West Coast to East Coast USA

Page 155: Bio-IT for Core Facility Managers

Wrapping up - Cloud, continued

‣ Understand that for the foreseeable future there are THREE distinct cloud architectures and design patterns.

‣ Vendors who push “100% hadoop” or “legacy free” solutions are idiots and should be shoved out the door. We will be running legacy codes and workflows for many years to come

‣ Your three design patterns on the cloud:

• Legacy HPC systems (replicate traditional clusters in the cloud)

• Hadoop

• Cloudy (when you rewrite something to fully leverage cloud capability)

Page 156: Bio-IT for Core Facility Managers

Thanks! Slides online at: http://slideshare.net/chrisdag/

