HPC for Biomed Applications Marcos Athanasoulis, Dr.PH Director, Information Technology Harvard...

Date post: 12-Jan-2016
Page 1: HPC for Biomed Applications Marcos Athanasoulis, Dr.PH Director, Information Technology Harvard Medical School.

HPC for Biomed Applications

Marcos Athanasoulis, Dr.PH
Director, Information Technology
Harvard Medical School

Page 2: Outline

About HMS
Why Biomed HPC is different
Context
Results from Biomed HPC 2007 Summit
Predictions
Recommendations for Fabric weavers

Page 3: About the Longwood Medical Area

213 acres, 37,000 employees, 15,000 students
21 institutions
2.15 million in- and outpatient visits
Forty-seven percent of all hospital-based outpatient clinical visits, and fifty-one percent of all inpatient admissions in Boston
Forty-seven percent of all staffed beds in Boston
15,016 births in the LMA

Page 4: HMS Affiliated Research – Longwood

Four of the top five independent hospital recipients of NIH funding nationwide
Massachusetts was the number two state recipient of National Institutes of Health (NIH) funding
Boston is ranked as the number one city in the nation for NIH support
If the LMA were ranked as a city, it would be number three for funding, after New York and before Philadelphia.
If the LMA were ranked as a state, it would be number eight, after North Carolina and before Washington.
NIH awards to the LMA institutions more than doubled, from $302 million to $722 million, over the decade between FY 1991 and FY 2001

Page 5: What makes Biomed HPC Different?

Larger problem space
◦ Whole genome processing
◦ Whole ‘Ome processing
◦ Image Processing
◦ Simulations
◦ Everything Else
Bursty Usage
◦ Processing power is not always the bottleneck
◦ Most work is “embarrassingly parallel”
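
To make the “embarrassingly parallel” point concrete, here is a minimal Python sketch that is not taken from the talk. Each input is processed independently, so the work can be spread across cores with no communication between tasks. The `gc_content` function, the sample reads, and the worker count are hypothetical stand-ins for whatever per-sample analysis a lab actually runs.

```python
from concurrent.futures import ProcessPoolExecutor


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in one sequence (hypothetical per-task work)."""
    if not sequence:
        return 0.0
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def run_independent_tasks(sequences, workers=2):
    """Map the same function over every input; no task depends on another."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(gc_content, sequences))


if __name__ == "__main__":
    reads = ["ACGT", "GGCC", "ATAT"]  # illustrative data only
    print(run_independent_tasks(reads))
```

With no inter-task communication, a workload of this shape places little demand on a low-latency interconnect; reading and writing the data is often the limiting factor instead, which is consistent with the slide's remark that processing power is not always the bottleneck.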

Page 6: Biomed HPC Differences (cont.)

Researchers
◦ Funding challenges
◦ Grant funding limitations and requirements
◦ Everyone is a CIO
Systems Diversity
◦ Plethora of small clusters
◦ General lack of centralization
◦ White boxes to Blue Genes

Page 7: About HPC @ HMS

Today:
◦ Modest shared cluster
◦ 1000 processor cores
◦ 100TB attached NAS storage
◦ Interconnect: Gigabit Ethernet
◦ Subsidized user contribution model
◦ BUT, MOST computing happens under the desk and behind the curtain!

Page 8: About HPC @ HMS (cont.)

Tomorrow:
◦ Mid-scale cluster and Harvard Grid
◦ 10-20K processor cores
◦ Petabyte of storage
◦ Parallel file system
◦ 10g Ethernet or InfiniBand
◦ More centralized

Page 9: Challenge: Natural Language Processing

HOSPITAL COURSE: ... It was recommended that she receive …We also added Lactinax, oral form of Lactobacillus acidophilus to attempt a repopulation of her gut.

SH: widow, lives alone, 2 children, no tob/alcohol.

BRIEF RESUME OF HOSPITAL COURSE: 63 yo woman with COPD, 50 pack-yr tobacco (quit 3 wks ago), spinal stenosis, ...

SOCIAL HISTORY: Negative for tobacco, alcohol, and IV drug abuse.

SOCIAL HISTORY: The patient is a nonsmoker. No alcohol.

SOCIAL HISTORY: The patient is married with four grown daughters, uses tobacco, has wine with dinner.

SOCIAL HISTORY: The patient lives in rehab, married. Unclear smoking history from the admission note…

Labels on the slide: Smoker; Non-Smoker; Past Smoker; Hard to pick (shown twice); ???
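
One way to see why these notes are hard to label is to try the naive approach. The sketch below is an illustrative assumption, not a method described in the talk: a handful of hand-written regular expressions that map a note to one of the slide's labels. The patterns, the rule order, and the function name `classify_smoking` are all hypothetical.

```python
import re

# Hypothetical keyword rules; real clinical text needs far more than this.
UNCERTAIN = re.compile(r"\bunclear\b|\bunknown\b", re.IGNORECASE)
PAST = re.compile(r"\bquit\b|\bformer smoker\b|\bex-smoker\b", re.IGNORECASE)
NEGATED = re.compile(
    r"\bnon-?smoker\b|\b(?:no|negative for|denies)\s+(?:tob\b|tobacco|smoking)",
    re.IGNORECASE,
)
MENTION = re.compile(r"\btob\b|\btobacco\b|\bsmok\w*", re.IGNORECASE)


def classify_smoking(note: str) -> str:
    """Return one of the slide's labels using first-match-wins rules."""
    if UNCERTAIN.search(note):
        return "Hard to pick"
    if PAST.search(note):
        return "Past Smoker"
    if NEGATED.search(note):
        return "Non-Smoker"
    if MENTION.search(note):
        return "Smoker"
    return "Hard to pick"


if __name__ == "__main__":
    notes = [
        "SH: widow,lives alone,2 children,no tob/alcohol.",
        "63 yo woman with COPD, 50 pack-yr tobacco (quit 3 wks ago)",
        "The patient is married with four grown daughters,uses tobacco.",
    ]
    for note in notes:
        print(classify_smoking(note), "<-", note)
```

Rules like these handle the excerpts above but break quickly: a note such as "Father was a heavy smoker" would be labelled Smoker, because nothing in the rules distinguishes the patient from a relative. Abbreviations, negation scope, and temporal phrases all need similar special handling, which is the difficulty the slide points at.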

Page 10: Challenge: Whole Omes

Current cost 100K
Working on <$1,000 whole genome
High Throughput Instrumentation
◦ $250-$500 for 500,000 SNPs
◦ $50-100K for good quality phenotyping of 100K++ individuals
◦ What about the samples (consented)
$650/patient
Dozens a week
Wait in clinic: $450+/patient

Page 11: Sequencing Equipment

[Figure: sequencing equipment with labeled components: HPLC autosampler (96 wells), syringe pump, microscope with xyz controls, flow-cell, temperature control]

Page 12: 2nd-generation sequencing

Harvard-model-F07: $106K incl. computer. $14K support.
Open-source software, hardware, wetware
Reduce reagent volume & per vol cost 100X each.

[Figure labels: E07 (Nikon), F07]

Page 13: [no text transcribed]

Page 14: Challenge: Everything to Everything

Page 15: Biomed HPC Leadership Summit

150 leaders in biomedical HPC
The tech guy is between you and a sale
2008 Summit to convene October 6th and 7th in Boston, MA
http://biomedhpc.med.harvard.edu

Page 16: Biomed HPC Audience Surveys

Audience response devices
N=60-100 Leaders in HPC
Questions asked over the two day event
And, survey says!

Page 17: Primary Network Fabric

[Chart: Primary Network Fabric. Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
Gig-Ethernet: 63
InfiniBand: 12
Myrinet: 5
10g Ethernet: 17
Other: 3

Page 18: Do you use virtualization?

[Chart: Do you use virtualization? Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
Yes, we do now: 47
No, we don't and don't have plans to: 14
No, but considering it for future: 39

Page 19: What are you using for virtualization?

[Chart: What are you using for virtualization in your environment? Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
VMWare: 66
Xen: 23
VMI: 2
HPVM: 9

Page 20: Use of parallel/distributed FS

[Chart: Use of parallel/distributed/network filesystem for production storage. Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
Yes, we do now: 50
No, we don't and don't have plans to: 5
No, but have plans to: 22
No, but considering for future: 23

Page 21: Which parallel filesystem?

[Chart: If using a distributed/network file system -- which one? Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
Lustre: 18
Microsoft Distributed File System: 15
Open AFS: 12
PVFS: 6
Brix: 8
Other: 41

Page 22: Which publication do you rely on?

[Chart: Most useful, relevant, and timely publication for Grid and HPC computing. Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
HPC Wire: 44
Bio IT World: 12
Grid World: 3
Grid Today: 5
Computerworld: 8
Other: 27

Page 23: Primary Storage Infrastructure

[Chart: Primary Storage Infrastructure. Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
NAS: 45
SAN: 30
Locally attached for storage only: 10
Distributed file system for production: 15

Page 24: Data center challenges

[Chart: Data Center Status. Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
Plenty of power, cooling, and space: 30
Plenty of space, but power/cooling constraints: 45
Short of physical space, plenty of power and cooling: 25

Page 25: Data center expansion plans

[Chart: Data Center Expansion (in next year). Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
Will build new data center space: 43
Will lease commercial data center space: 35
Will not expand data center: 19
Don't run any data centers: 4

Page 26: Job schedulers used

[Chart: Job Scheduler Used. Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
Platform LSF: 42
Sun Grid Engine: 19
Open PBS: 17
Other: 9
No Scheduler/Not Applicable: 13

Page 27: Primary drives being purchased

[Chart: Primary type of drive being bought for storage infrastructure in new HPC systems. Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
SATA: 55
SCSI/SAS: 16
Fibre Channel: 27
Other: 2

Page 28: Types of servers deployed

[Chart: Primarily Purchased New Computational Hardware (Current). Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
1U Nodes: 33
Blade Servers: 56
Larger Scale SMP Boxes (>16 CPU): 9
Other: 2

Page 29: Installed 10GbE today

[Chart: Installed 10GbE in Facility. Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
Yes: 53
Plans for 2008: 20
No: 27

Page 30: Installed 10GbE to endpoints

[Chart: Installed 10GbE to End Points (Servers). Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
Yes: 24
Plans for 2008: 17
No plans: 58

Page 31: Best use of 10GbE today

[Chart: Best Use for 10 Gigabit Ethernet Today. Axis: Percent. HMS Biomed HPC Leadership Summit 2007]
Connecting Storage to Core Network: 17
Connecting Switches Together: 24
Both: 58

Page 32: Prediction

Biomed HPC will continue double digit growth for the foreseeable future
The importance of the network fabric will increase dramatically
Biomedical HPC will become more centralized

Page 33: Recommendations for Open Fabric

User centered design
◦ End to end analysis of your products’ usability
Don’t ignore the small guys
Bring costs down
Continue your pursuit of enlightened self interest
Be involved in the community

Page 34: Thank you

Questions, comments:

[email protected]

