Data-Intensive Systems: Technology trends, Emerging challenges & opportunities
CS6453
Rachit Agarwal
Slides based on: many, many discussions with Ion Stoica, his class, and many industry folks
Servers — Typical node
[Node diagram: Memory bus, PCI, SATA, Ethernet]
Servers — Typical node

              Bandwidth   Capacity     Time to read
Memory bus    80 GB/s     100s of GB   10s of sec
PCI           1 GB/s      100s of GB   10s of min
SATA          600 MB/s    100s of GB   10s of min
HDD           100 MB/s    1s of TB     hours
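The "Time to read" column is just capacity divided by sequential bandwidth. A minimal sketch of that arithmetic, using illustrative capacities (300 GB of DRAM/SSD, a 1 TB disk; round numbers I chose, not measurements):

```python
# Rough time-to-read = capacity / sequential bandwidth.
# Capacities and bandwidths are ballpark figures from the table above.
def time_to_read(capacity_bytes, bandwidth_bytes_per_s):
    return capacity_bytes / bandwidth_bytes_per_s

GB, TB = 1e9, 1e12

# Memory bus: ~300 GB of DRAM at 80 GB/s -> seconds
print(time_to_read(300 * GB, 80 * GB))      # 3.75 s
# PCI-attached SSD: ~300 GB at 1 GB/s -> minutes
print(time_to_read(300 * GB, 1 * GB) / 60)  # 5.0 min
# HDD: ~1 TB at 100 MB/s -> hours
print(time_to_read(1 * TB, 100e6) / 3600)   # ~2.8 hours
```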
Trends — Moore's law slowing down?
• Stated 50 years ago by Gordon Moore
• Number of transistors on a microchip doubles every ~2 years
• Why interesting for systems people?
• Brian Krzanich — Today, closer to 2.5 years
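What stretching the doubling period means in annual terms; a quick arithmetic sketch (my calculation, not from the slide):

```python
# Annual growth factor implied by a doubling period of d years: 2 ** (1/d)
two_year = 2 ** (1 / 2)    # classic Moore's law: ~1.41x per year
slower   = 2 ** (1 / 2.5)  # 2.5-year doubling:   ~1.32x per year
print(f"{two_year:.2f}x vs {slower:.2f}x per year")
```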
Trends — CPU (# cores)
Today, +20% every year
Servers — Trends
[Node diagram: Memory bus, PCI, SATA, Ethernet; annotated with CPU cores: +20%/year]
Trends — CPU (performance per core)
Today, +10% every year
• Number of cores: +20%
• Performance per core: +10%
• Overall: +30-32%
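The overall figure is just the two per-year factors compounded:

```python
cores_growth = 1.20     # number of cores: +20% per year
per_core_growth = 1.10  # performance per core: +10% per year
overall = cores_growth * per_core_growth
print(f"+{(overall - 1) * 100:.0f}% per year")  # +32% per year
```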
Trends — CPU scaling
Servers — Trends
[Node diagram: Memory bus, PCI, SATA, Ethernet; annotated with CPU: +30%/year]
Trends — Memory
+29% every year
Servers — Trends
[Node diagram: Memory bus, PCI, SATA, Ethernet; annotated with CPU: +30%/year, DRAM capacity: +30%/year]
Trends — Memory bus: +15% every year
Servers — Trends
[Node diagram: Memory bus, PCI, SATA, Ethernet; annotated with CPU: +30%/year, DRAM: +30%/year, Memory bus: +15%/year]
Trends — SSD
• SSDs cheaper than HDDs
• Following Moore's law (late start)
• 3D technologies
• May even outpace Moore's law
Trends — SSD capacity scaling
Servers — Trends
[Node diagram: Memory bus, PCI, SATA, Ethernet; annotated with CPU: +30%/year, DRAM: +30%/year, Memory bus: +15%/year, SSD: >+30%/year]
Trends — PCI bandwidth (and ~SATA): +15-20% every year
Servers — Trends
[Node diagram: Memory bus, PCI, SATA, Ethernet; annotated with CPU: +30%/year, DRAM: +30%/year, Memory bus: +15%/year, SSD: >+30%/year, PCI/SATA: +15-20%/year]
Trends — Ethernet bandwidth: +33-40% every year
Servers — Trends
[Node diagram: Memory bus, PCI, SATA, Ethernet; annotated with CPU: +30%/year, DRAM: +30%/year, Memory bus: +15%/year, SSD: >+30%/year, PCI/SATA: +15-20%/year, Ethernet: +40%/year]
Trends — Implications?
• Intra-server bandwidth an increasing bottleneck
• How could we overcome this?
  • Reduce the size of the data? What does that mean for applications?
  • Prefer remote over local?
    • Challenges? Non-intuitive; we always prefer locality
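A toy model of the "prefer remote over local" question: with a fast enough network, reading from a remote node's DRAM can beat reading from a local SSD. The bandwidths and latency below are illustrative assumptions of mine, not figures from the slides:

```python
# Toy comparison: local SSD read vs. remote DRAM read over the network.
# All numbers are illustrative assumptions.
def read_time(size_bytes, bandwidth_Bps, latency_s=0.0):
    return latency_s + size_bytes / bandwidth_Bps

size = 1e9        # read 1 GB
ssd_bw = 600e6    # local SATA SSD: ~600 MB/s
net_bw = 5e9      # 40 GbE: ~5 GB/s
net_lat = 10e-6   # assumed ~10 us round trip

local = read_time(size, ssd_bw)            # ~1.67 s
remote = read_time(size, net_bw, net_lat)  # ~0.2 s
print(remote < local)  # remote DRAM wins under these assumptions
```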
Trends — Emergence of new technologies
• Non-volatile memory
  • 8-10x density of DRAM (close to SSD)
  • 2-4x higher latency
  • But who cares? Bandwidth is the bottleneck…
https://www.youtube.com/watch?v=IWsjbqbkqh8
Trends — & Implications
• HDD is the new tape
• SSD/NVRAM is the new persistent storage
  • But, increasing gap between capacity and bandwidth is concerning…
• Deeper storage hierarchy (L1, L2, L3, DRAM, NVRAM, SSD, HDD)
  • Do CPU caches even matter?
  • How do we design the software stack to work with a deeper hierarchy?
• CPU-storage "disaggregation" is going to be a norm
  • Easier to overcome bandwidth bottlenecks
  • Google and Microsoft have already realized this
  • What happens to locality?
  • Re-think software design?
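One way to make the "deeper hierarchy" point concrete: rule-of-thumb access latencies per level. These are the usual ballpark figures (my assumptions, not numbers from the slides), with NVRAM slotted in at roughly 2-4x DRAM latency as estimated earlier:

```python
# Ballpark access latencies for a deepened storage hierarchy.
# Rule-of-thumb values; real hardware varies widely.
hierarchy_ns = {
    "L1 cache": 1,
    "L2 cache": 4,
    "L3 cache": 30,
    "DRAM":     100,
    "NVRAM":    300,         # ~2-4x DRAM latency (slide's estimate)
    "SSD":      100_000,     # ~100 us
    "HDD":      10_000_000,  # ~10 ms
}
for level, ns in hierarchy_ns.items():
    print(f"{level:8s} ~{ns:>12,} ns")
```

The 5-6 orders of magnitude between DRAM and HDD are why where the software stack places data matters so much.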
Paper 1 — Memory-centric design
• SSD/NVRAM is the new persistent storage (+ archival)
  • Not just the persistent storage, THE storage
  • + (private memory), deep storage hierarchy
• CPU-storage "disaggregation"
  • NVRAM shared across CPUs
• Challenges?
  • How to manage/share resources?
  • NVM: accelerators and controllers
  • Addressing? Flat virtual address space?
  • NVM sharing in multi-tenant scenarios?
  • NVM + CPU + Network: software-controlled?
  • Storage- vs. compute-heavy workloads?
Paper 1 — Memory-centric design
• New failure modes? [very interesting direction!!]
  • CPU and storage can fail independently
  • Very different from today's "servers"
  • Good? Bad?
  • Transparent failure mitigation…?
• How about the OS?
  • Where should the OS sit?
  • What functionalities should be implemented within the OS?
• Application-level semantics
• ?
Paper 2 — Nanostores (An alternative view)
• DRAM is dead
• SSD/NVRAM is the new persistent storage (+ archival)
  • Not just the persistent storage, THE storage
  • No storage hierarchy
• CPU-storage "convergence" is going to be a norm
  • CPU-storage hyper-convergence
  • Berkeley IRAM project (late 90s)
• Challenges?
  • Network? (topology, intra-nanostore latency, throughput)
  • How does this bypass the trends discussed earlier?
Trends — The missing piece?
• Data volume increasing significantly faster than Moore's law
  • 56x increase in Google indexed data in 7 years
  • 173% increase in enterprise data
  • Uber, Airbnb, Orbitz, Hotels, …
• Data types
  • Images, audio, videos, logs, logs, logs, genetics, astronomy, …
• YouTube: ~50TB of data every day
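The "faster than Moore's law" claim checks out on the 56x-in-7-years figure; a quick sketch of the implied annual growth rates (assuming a 2-year doubling period for Moore's law):

```python
data_annual = 56 ** (1 / 7)  # Google indexed data: ~1.78x per year
moore_annual = 2 ** (1 / 2)  # Moore's law:         ~1.41x per year
print(f"data: {data_annual:.2f}x/yr vs Moore: {moore_annual:.2f}x/yr")
```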
Trends — Discussion
• Other missing pieces?
  • Software overheads
  • Application workloads
  • Specialization vs. generalization?