Data-Intensive Systems - Cornell University

Post on 31-Jul-2020

Data-Intensive Systems: Technology trends, Emerging challenges & opportunities

CS6453

Rachit Agarwal

Slides based on: many many discussions with Ion Stoica, his class, and many industry folks

Servers: Typical node

[Diagram: a typical server node with its memory bus, PCI, SATA, and Ethernet links]

Servers: Typical node

Link          Bandwidth   Capacity   Time to read
Memory bus    80 GB/s     100s GB    10s sec
PCI           1 GB/s      100s GB    10s min
SATA          600 MB/s               10s min
              100 MB/s    1s TB      hours
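The "time to read" column is capacity divided by bandwidth. A minimal sketch of that arithmetic follows. The bandwidths are the slide's figures. The capacities (800 GB and 2 TB) are assumptions standing in for "100s GB" and "1s TB". The slide gives no capacity for the SATA row and no name for the 100 MB/s row, so the SATA capacity and the last row's label are also assumptions made for this example.

```python
# Bandwidths are the slide's figures. Capacities are illustrative
# assumptions standing in for the slide's "100s GB" and "1s TB".
MB, GB, TB = 10**6, 10**9, 10**12


def time_to_read_seconds(capacity_bytes, bandwidth_bytes_per_s):
    """Seconds needed to stream the full capacity once at peak bandwidth."""
    return capacity_bytes / bandwidth_bytes_per_s


rows = [
    # (label, assumed capacity, bandwidth from the slide)
    ("memory bus, 80 GB/s", 800 * GB, 80 * GB),
    ("PCI, 1 GB/s",         800 * GB, 1 * GB),
    ("SATA, 600 MB/s",      800 * GB, 600 * MB),  # capacity assumed
    ("100 MB/s link",       2 * TB,   100 * MB),  # label assumed
]

for name, capacity, bandwidth in rows:
    seconds = time_to_read_seconds(capacity, bandwidth)
    print(f"{name:22s} {seconds:10.0f} s  ({seconds / 60:7.1f} min)")
```

With these assumed capacities the results fall in the ranges the slide quotes: seconds over the memory bus, tens of minutes over PCI and SATA, and hours at 100 MB/s.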

Trends: Moore’s law slowing down?

• Stated 50 years ago by Gordon Moore
• Number of transistors on a microchip doubles every ~2 years
• Why interesting for systems people?
• Brian Krzanich: today, closer to 2.5 years
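To compare a doubling period with the yearly percentages quoted on the later trend slides, it helps to convert it to an annual growth rate. A small sketch, where the function name `annual_growth` is chosen for this example:

```python
def annual_growth(doubling_period_years):
    """Yearly growth rate implied by doubling every `doubling_period_years`."""
    return 2 ** (1 / doubling_period_years) - 1


for period in (2.0, 2.5):
    print(f"doubling every {period} years -> +{annual_growth(period):.1%} per year")
```

Doubling every 2 years corresponds to roughly +41% per year, and doubling every 2.5 years to roughly +32% per year.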

Trends: CPU (# cores)

Today, +20% every year

Servers: Trends

[Diagram: server node (memory bus, PCI, SATA, Ethernet) annotated with yearly growth. CPU cores: +20%]

Trends: CPU (performance per core)

Today, +10% every year

Trends: CPU scaling

• Number of cores: +20%
• Performance per core: +10%
• Overall: +30-32%
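The "+30-32%" range appears to reflect two ways of combining the per-year figures: adding them gives +30%, while compounding them gives +32%. A short sketch of both, assuming core count and per-core performance scale independently:

```python
cores_growth = 0.20      # +20% cores per year (from the slide)
per_core_growth = 0.10   # +10% performance per core per year (from the slide)

# Simple sum of the two rates.
additive = cores_growth + per_core_growth

# Compounded: total throughput = cores x performance per core.
compounded = (1 + cores_growth) * (1 + per_core_growth) - 1

print(f"additive:   +{additive:.0%}")
print(f"compounded: +{compounded:.0%}")
```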

Servers: Trends

[Diagram: server node annotated with yearly growth. CPU: +30%]

Trends: Memory

+29% every year

Servers: Trends

[Diagram: server node annotated with yearly growth. CPU: +30%; memory capacity: +30%]

Trends: Memory bus

+15% every year

Servers: Trends

[Diagram: server node annotated with yearly growth. CPU: +30%; memory capacity: +30%; memory bus: +15%]

Trends: SSD

SSDs cheaper than HDD

Trends: SSD capacity scaling

• Following Moore’s law (late start)
• 3D technologies
• May even outpace Moore’s law

Servers: Trends

[Diagram: server node annotated with yearly growth. CPU: +30%; memory capacity: +30%; memory bus: +15%; SSD capacity: +>30%]

Trends: PCI bandwidth (and ~SATA)

+15-20% every year

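Servers: Trends

[Diagram: server node annotated with yearly growth. CPU: +30%; memory capacity: +30%; memory bus: +15%; SSD capacity: +>30%; PCI (and ~SATA) bandwidth: +15-20%]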

Trends: Ethernet bandwidth

+33-40% every year

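Servers: Trends

[Diagram: server node annotated with yearly growth]

• CPU: +30%
• Memory capacity: +30%
• Memory bus bandwidth: +15%
• SSD capacity: +>30%
• PCI (and ~SATA) bandwidth: +15-20%
• Ethernet bandwidth: +40%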

Trends: Implications?

• Intra-server bandwidth is an increasing bottleneck
• How could we overcome this?
• Reduce the size of the data?
• What does that mean for applications?
• Prefer remote over local?
• Challenges?
• Non-intuitive; we always prefer locality
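The bottleneck claim follows from compounding the yearly rates quoted on the trend slides: capacity grows faster than the buses that feed it, and Ethernet grows faster than PCI. A rough sketch of how quickly those gaps widen. The 5-year horizon and the name `relative_growth` are assumptions for this example, and the rates are taken as constant.

```python
def relative_growth(rate_a, rate_b, years):
    """Factor by which quantity A outgrows quantity B after `years`,
    given yearly growth rates such as 0.30 for +30%."""
    return ((1 + rate_a) / (1 + rate_b)) ** years


# Memory capacity (+30%) vs. memory bus bandwidth (+15%): the time to
# scan all of memory grows by this factor.
scan_time_factor = relative_growth(0.30, 0.15, years=5)

# Ethernet (+40%) vs. PCI (upper end of +15-20%): network bandwidth gains
# on local I/O bandwidth by this factor.
remote_vs_local = relative_growth(0.40, 0.20, years=5)

print(f"time to scan memory after 5 years: x{scan_time_factor:.2f}")
print(f"Ethernet vs. PCI bandwidth after 5 years: x{remote_vs_local:.2f}")
```

Under these assumptions the time to scan all of memory nearly doubles in five years, and Ethernet bandwidth roughly doubles relative to PCI. The second figure is what makes "prefer remote over local" worth asking.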

Trends: Emergence of new technologies

• Non-volatile memory
• 8-10x density of DRAM (close to SSD)
• 2-4x higher latency
• But who cares? Bandwidth is the bottleneck…

https://www.youtube.com/watch?v=IWsjbqbkqh8

Trends & Implications

• HDD is the new tape
• SSD/NVRAM is the new persistent storage
• But the increasing gap between capacity and bandwidth is concerning…
• Deeper storage hierarchy (L1, L2, L3, DRAM, NVRAM, SSD, HDD)
• Do CPU caches even matter?
• How do we design the software stack to work with a deeper hierarchy?
• CPU-storage “disaggregation” is going to be the norm
• Easier to overcome bandwidth bottlenecks
• Google and Microsoft have already realized this
• What happens to locality?
• Re-think software design?

Paper 1: Memory-centric design

• SSD/NVRAM is the new persistent storage (+ archival)
• Not just the persistent storage, THE storage
• + (private memory), deep storage hierarchy
• CPU-storage “disaggregation”
• NVRAM shared across CPUs
• Challenges?
• How to manage/share resources?
• NVM: accelerators and controllers
• Addressing? Flat virtual address space?
• NVM sharing in multi-tenant scenarios?
• NVM + CPU + Network: software-controlled?
• Storage-heavy vs. compute-heavy workloads?

Paper 1: Memory-centric design

• New failure modes? [very interesting direction!!]
• CPU and storage can fail independently
• Very different from today’s “servers”
• Good? Bad?
• Transparent failure mitigation…?
• How about the OS?
• Where should the OS sit?
• What functionalities should be implemented within the OS?
• Application-level semantics
• ?

Paper 2: Nanostores (An alternative view)

• DRAM is dead
• SSD/NVRAM is the new persistent storage (+ archival)
• Not just the persistent storage, THE storage
• No storage hierarchy
• CPU-storage “convergence” is going to be the norm
• CPU-storage hyper-convergence
• Berkeley IRAM project (late 90s)
• Challenges?
• Network? (topology, intra-nanostore latency, throughput)
• How does this bypass the trends discussed earlier?

Trends: The missing piece?

• Data volume increasing significantly faster than Moore’s law
• 56x increase in Google indexed data in 7 years
• 173% increase in enterprise data
• Uber, Airbnb, Orbitz, Hotels, …
• Data types
• Images, audio, videos, logs, logs, logs, genetics, astronomy, …
• YouTube: ~50 TB of data every day
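To check the "faster than Moore's law" claim against the one figure here that comes with a time span (56x in 7 years), the growth can be annualized and compared with a 2-year doubling. A sketch, with `cagr` as a name chosen for this example:

```python
def cagr(total_factor, years):
    """Compound annual growth rate implied by growing `total_factor`x in `years`."""
    return total_factor ** (1 / years) - 1


google_index = cagr(56, 7)   # 56x in 7 years (from the slide)
moores_law = cagr(2, 2)      # doubling every 2 years

print(f"Google indexed data: +{google_index:.0%} per year")
print(f"Moore's law:         +{moores_law:.0%} per year")
```

This works out to roughly +78% per year for the indexed data against roughly +41% per year for a 2-year doubling.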

Trends: Discussion

• Other missing pieces?
• Software overheads
• Application workloads
• Specialization vs. generalization?