Moving Data Over Networks: Network-Based Data Transfer at NERSC. Eli Dart, Network Engineer, ESnet Science Engagement, Lawrence Berkeley National Laboratory. NERSC Users Group Training, Berkeley, CA, February 24, 2016.
Transcript
Page 1

Moving Data Over Networks
Network-Based Data Transfer at NERSC

Eli Dart, Network Engineer
ESnet Science Engagement
Lawrence Berkeley National Laboratory

NERSC Users Group Training
Berkeley, CA
February 24, 2016

Page 2

Outline

•  Context
•  Science DMZ overview
•  Data Transfer Nodes
•  Handoff to Shreyas Cholia

2 – ESnet Science Engagement ([email protected]) - 2/24/17

Page 3

Science Networks for Science

•  The global Research & Education (R&E) network ecosystem is comprised of hundreds of international, national, regional and local-scale networks – each independently owned and operated.

•  These networks are part of and connected to the Internet, but are engineered specifically for high-performance scientific applications

Page 4

Effective High Performance Data Transfer

•  Data transfers between resources connected to R&E networks can do much better than data transfers which use the commodity Internet
    –  Terabytes are no problem
    –  Petabytes are feasible

•  Just need to make sure we do a couple of things
    –  Long distance portions work well in general
    –  Large-scale computing centers work well in general
    –  Local configuration is really important

•  NERSC has high-performance data resources
    –  Fast networks
    –  Fast systems and filesystems

•  This talk will describe what you can do to interface with NERSC effectively

©2015, Energy Sciences Network

Page 5

Motivation

•  Networks are an essential part of data-intensive science
    –  Connect data sources to data analysis
    –  Connect collaborators to each other
    –  Enable machine-consumable interfaces to data and analysis resources (e.g. portals), automation, scale

•  Performance is critical
    –  Exponential data growth
    –  Constant human factors
    –  Data movement and data analysis must keep up

•  Effective use of wide area (long-haul) networks by scientists has historically been difficult

•  Some of this is for your system administrator
    –  Point your sysadmin to http://fasterdata.es.net/ for more info
    –  Feel free to follow up with me later – [email protected]

Page 6

The Central Role of the Network

•  The very structure of modern science assumes science networks exist: high performance, feature rich, global scope

•  What is “The Network” anyway?
    –  “The Network” is the set of devices and applications involved in the use of a remote resource
        •  This is not about supercomputer interconnects
        •  This is about data flow from experiment to analysis, between facilities, etc.
    –  User interfaces for “The Network” – portal, data transfer tool, workflow engine
    –  Therefore, servers and applications must also be considered

•  What is important? Ordered list:
    1.  Correctness
    2.  Consistency
    3.  Performance

Page 7

TCP – Ubiquitous and Fragile

•  Networks provide connectivity between applications running on hosts
    –  From an application’s perspective, the interface to “the other end” is a socket
    –  Host operating system kernel provides socket interface, kernel implements TCP where the application can’t see
    –  Communication is between applications – mostly over TCP

•  TCP – the fragile workhorse
    –  TCP is (for very good reasons) timid – packet loss is interpreted as congestion
    –  Like it or not, TCP is used for the vast majority of data transfer applications (more than 95% of ESnet traffic is TCP)
    –  Packet loss in conjunction with latency is a performance killer
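The split between application and kernel described on this slide can be seen in a few lines. The following is a minimal sketch and not part of the original talk: it opens a TCP connection over the loopback interface (the address, the kernel-chosen port and the function name `loopback_roundtrip` are illustrative assumptions). The application only writes and reads bytes on a socket; segmentation, acknowledgements and retransmission stay inside the kernel.

```python
import socket

def loopback_roundtrip(message):
    """Send `message` over a loopback TCP connection.

    Returns the bytes that arrived and the receive buffer size the kernel
    chose for the receiving socket. Address and port are illustrative.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
        listener.listen(1)
        with socket.create_connection(listener.getsockname()) as sender:
            receiver, _addr = listener.accept()
            with receiver:
                sender.sendall(message)
                sender.shutdown(socket.SHUT_WR)   # tell the peer no more data follows
                chunks = []
                while True:
                    chunk = receiver.recv(4096)
                    if not chunk:                 # empty read: peer finished sending
                        break
                    chunks.append(chunk)
                # The application cannot see segments or ACKs. It can only ask
                # the kernel about settings such as the receive buffer size.
                rcvbuf = receiver.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return b"".join(chunks), rcvbuf
```

Calling `loopback_roundtrip(b"hello")` returns the same bytes plus a buffer size that depends on the host's kernel settings, which is why host tuning matters even when the application code is unchanged.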

Page 8

A small amount of packet loss makes a huge difference in TCP performance

[Figure: plot with distance labels Local (LAN), Metro Area, Regional, Continental and International. Series: Measured (TCP Reno), Measured (HTCP), Theoretical (TCP Reno), Measured (no loss).]

With loss, high performance beyond metro distances is essentially impossible
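To make the interaction between loss and latency concrete, the sketch below uses the widely cited Mathis et al. approximation for Reno-style TCP under random loss, rate ≤ (MSS / RTT) · (C / √p) with C = √(3/2). Whether the "Theoretical (TCP Reno)" series on the slide was computed exactly this way is an assumption, and the MSS, loss rate and per-distance round-trip times below are hypothetical values chosen only for illustration.

```python
import math

MATHIS_C = math.sqrt(3.0 / 2.0)  # constant from the Mathis et al. model

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate upper bound on steady-state TCP Reno throughput (bits/s)."""
    if rtt_s <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("rtt_s must be > 0 and loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (MATHIS_C / math.sqrt(loss_rate))

# Hypothetical round-trip times (milliseconds) standing in for the
# distance classes named on the slide. They are not measured values.
ASSUMED_RTTS_MS = {
    "Local (LAN)": 1,
    "Metro Area": 5,
    "Regional": 20,
    "Continental": 50,
    "International": 150,
}

def modeled_mbps(mss_bytes=1460, loss_rate=1e-4):
    """Map each distance class to its modeled throughput in Mbps."""
    return {
        name: mathis_throughput_bps(mss_bytes, ms / 1000, loss_rate) / 1e6
        for name, ms in ASSUMED_RTTS_MS.items()
    }
```

Under this model the bound falls in direct proportion to round-trip time and with the square root of the loss rate, so the same small loss rate that is tolerable on a LAN becomes the limiting factor on continental and international paths.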

Page 9

Working With TCP In Practice

•  Far easier to support TCP than to fix TCP
    –  People have been trying to fix TCP for years – limited success
    –  Like it or not we’re stuck with TCP in the general case

•  Pragmatically speaking, we must accommodate TCP
    –  Sufficient bandwidth to avoid congestion
    –  Zero packet loss
    –  Verifiable infrastructure
        •  Networks are complex
        •  Must be able to locate problems quickly
        •  Small footprint is a huge win – small number of devices so that problem isolation is tractable

Page 10

Putting A Solution Together

•  Effective support for TCP-based data transfer
    –  Design for correct, consistent, high-performance operation
    –  Design for ease of troubleshooting

•  Easy adoption is critical
    –  Large laboratories and universities have extensive IT deployments
    –  Drastic change is prohibitively difficult

•  Cybersecurity – defensible without compromising performance

•  Borrow ideas from traditional network security
    –  Traditional DMZ
        •  Separate enclave at network perimeter (“Demilitarized Zone”)
        •  Specific location for external-facing services
        •  Clean separation from internal network
    –  Do the same thing for science – Science DMZ

Page 11

The Science DMZ Design Pattern

Dedicated Systems for Data Transfer: Data Transfer Node
•  High performance
•  Configured specifically for data transfer
•  Proper tools

Network Architecture: Science DMZ
•  Dedicated network location for high-speed data resources
•  Appropriate security
•  Easy to deploy - no need to redesign the whole network

Performance Testing & Measurement: perfSONAR
•  Enables fault isolation
•  Verify correct operation
•  Widely deployed in ESnet and other networks, as well as sites and facilities

Page 12

Abstract or Prototype Deployment

•  (This section is for your system administrator – send them to me, [email protected])

•  Add-on to existing network infrastructure
    –  All that is required is a port on the border router
    –  Small footprint, pre-production commitment

•  Easy to experiment with components and technologies
    –  DTN prototyping
    –  perfSONAR testing

•  Limited scope makes security policy exceptions easy
    –  Only allow traffic from partners
    –  Add-on to production infrastructure – lower risk than rebuilding existing infrastructure
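The "only allow traffic from partners" policy is normally enforced in router or switch access control lists, not in application code. The sketch below only illustrates the decision logic, assuming a short list of partner prefixes. The prefixes are hypothetical and taken from the RFC 5737 documentation ranges, and `is_partner` is an illustrative name.

```python
import ipaddress

# Hypothetical partner prefixes (RFC 5737 documentation ranges, not real sites).
PARTNER_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]

def is_partner(source_ip):
    """Return True if source_ip falls inside any allowed partner prefix."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in PARTNER_PREFIXES)
```

Because a prototype Science DMZ serves only a few hosts and a few collaborators, the allow list stays short enough to audit by hand, which is the point the slide makes about limited scope.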

Page 13

Science DMZ Design Pattern (Abstract)

[Diagram. Labeled elements: WAN; Border Router; Science DMZ Switch/Router; Enterprise Border Router/Firewall; Site / Campus LAN; High performance Data Transfer Node with high-speed storage; three perfSONAR hosts; 10GE and 10G links. Annotations: Clean, High-bandwidth WAN path; Site / Campus access to Science DMZ resources; Per-service security policy control points.]

Page 14

Local And Wide Area Data Flows

[Diagram. Labeled elements: WAN; Border Router; Science DMZ Switch/Router; Enterprise Border Router/Firewall; Site / Campus LAN; High performance Data Transfer Node with high-speed storage; three perfSONAR hosts; 10GE and 10G links. Annotations: Clean, High-bandwidth WAN path; Site / Campus access to Science DMZ resources; Per-service security policy control points. Legend: High Latency WAN Path; Low Latency LAN Path.]

Page 15

Modular Architecture – Multiple Science DMZs

[Diagram. Labeled elements: WAN; Border Router; Science DMZ Switch/Routers; Enterprise Border Router/Firewall; Site / Campus LAN; Project A DTN (building A); Facility B DTN (building B); Cluster DTN (building C); Cluster (building C); four perfSONAR hosts; Dark Fiber, 10GE and 10G links. Annotation: Per-project security policy.]

Page 16

Supercomputer Center Deployment

•  High-performance networking is assumed in this environment
    –  Data flows between systems, between systems and storage, wide area, etc.
    –  Global filesystem often ties resources together
        •  Portions of this may not run over Ethernet (e.g. IB)
        •  Implications for Data Transfer Nodes

•  “Science DMZ” may not look like a discrete entity here
    –  By the time you get through interconnecting all the resources, you end up with most of the network in the Science DMZ
    –  This is as it should be – the point is appropriate deployment of tools, configuration, policy control, etc.

•  Office networks can look like an afterthought, but they aren’t
    –  Deployed with appropriate security controls
    –  Office infrastructure need not be sized for science traffic

Page 17

HPC Center

[Diagram. Labeled elements: WAN; Border Router; Core Switch/Router; Firewall; Offices; Supercomputer; Parallel Filesystem; Data Transfer Nodes; two Front end switches; three perfSONAR hosts. Annotation: Routed.]

©2014, Energy Sciences Network

Page 18

HPC Center Data Path

[Diagram. Labeled elements: WAN; Border Router; Core Switch/Router; Firewall; Offices; Supercomputer; Parallel Filesystem; Data Transfer Nodes; two Front end switches; three perfSONAR hosts. Annotation: Routed. Legend: High Latency WAN Path; Low Latency LAN Path.]

Page 19

Common Threads

•  Two common threads exist in all these examples

•  Accommodation of TCP
    –  Wide area portion of data transfers traverses purpose-built path
    –  High performance devices that don’t drop packets

•  Ability to test and verify
    –  When problems arise (and they always will), they can be solved if the infrastructure is built correctly
    –  Small device count makes it easier to find issues
    –  Multiple test and measurement hosts provide multiple views of the data path
        •  perfSONAR nodes at the site and in the WAN
        •  perfSONAR nodes at the remote site
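The value of "multiple views of the data path" can be sketched as simple set arithmetic. This is an illustration only, not how perfSONAR itself reports results: each test between two measurement hosts is reduced to the set of path segments it crosses and a flag saying whether loss was seen. The segment names, the test results and the function name `suspect_segments` are hypothetical, and the logic assumes a single faulty segment whose loss shows up in every test that crosses it.

```python
def suspect_segments(tests):
    """Narrow down which path segment is dropping packets.

    `tests` is an iterable of (segments_crossed, loss_seen) pairs.
    A segment is a suspect if every lossy test crosses it and
    no clean test crosses it.
    """
    tests = list(tests)
    lossy = [set(segs) for segs, loss_seen in tests if loss_seen]
    if not lossy:
        return set()          # nothing failed, nothing to suspect
    cleared = set().union(*(segs for segs, loss_seen in tests if not loss_seen))
    return set.intersection(*lossy) - cleared

# Hypothetical results from three tests between measurement hosts.
EXAMPLE_TESTS = [
    ({"local site"}, False),                       # local host to local border: clean
    ({"WAN"}, False),                              # border to border across the WAN: clean
    ({"local site", "WAN", "remote site"}, True),  # end to end: loss seen
]
```

With these example results, `suspect_segments(EXAMPLE_TESTS)` leaves only the remote site as a suspect. With a single end-to-end test and no intermediate measurement hosts, all three segments would remain suspects.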

Page 20

Dedicated Systems – Data Transfer Node

•  The DTN is dedicated to data transfer
•  Set up specifically for high-performance data movement
    –  System internals (BIOS, firmware, interrupts, etc.)
    –  Network stack
    –  Storage (global filesystem, Fibre Channel, local RAID, etc.)
    –  High performance tools
    –  No extraneous software

•  Limitation of scope and function is powerful
    –  No conflicts with configuration for other tasks
    –  Small application set makes cybersecurity easier – key point

©2015, The Regents of the University of California, through Lawrence Berkeley National Laboratory and is licensed under CC BY-NC-ND 4.0

Page 21

Data Transfer Tools For DTNs

•  Parallelism is important
    –  It is often easier to achieve a given performance level with four parallel connections than one connection
    –  Several tools offer parallel transfers, including Globus/GridFTP

•  Latency interaction is critical
    –  Wide area data transfers have much higher latency than LAN transfers
    –  Many tools and protocols assume a LAN

•  Workflow integration is important

•  Key tools: Globus Online, HPN-SSH
•  ESnet test DTNs: http://fasterdata.es.net/performance-testing/DTNs/
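One way to see why latency and parallelism matter is the bandwidth-delay product: the amount of data that must be in flight to keep a path full. The sketch below is a back-of-the-envelope calculation, not a tuning recommendation. The 10 Gbps capacity and 53 ms round-trip time are the Berkeley to Argonne figures quoted in the tool comparison in this talk. The 64 KiB window and the 0.2 ms LAN round-trip time in the usage note are hypothetical stand-ins for a tool that assumes a LAN.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to fill a path of this capacity and RTT."""
    return bandwidth_bps * rtt_s / 8

def per_stream_window_bytes(bandwidth_bps, rtt_s, streams):
    """Window each of `streams` parallel connections needs to fill the path."""
    return bdp_bytes(bandwidth_bps, rtt_s) / streams

def window_limited_bps(window_bytes, rtt_s):
    """Best-case throughput of one connection limited to a fixed window."""
    return window_bytes * 8 / rtt_s
```

For 10 Gbps at 53 ms, `bdp_bytes(10e9, 0.053)` is about 66 MB, and `per_stream_window_bytes(10e9, 0.053, 4)` is about 16.6 MB, so four connections each need a quarter of the window a single connection would. A connection held to a hypothetical 64 KiB window gives `window_limited_bps(65536, 0.0002)` of roughly 2.6 Gbps on a 0.2 ms LAN, but `window_limited_bps(65536, 0.053)` of under 10 Mbps across the wide area.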

Page 22

Data Transfer Tool Comparison

•  In addition to the network, using the right data transfer tool is critical
•  Data transfer test from Berkeley, CA to Argonne, IL (near Chicago). RTT = 53 ms, network capacity = 10 Gbps.

Tool                  Throughput
SCP                   140 Mbps
HPN patched SCP       1.2 Gbps
FTP                   1.4 Gbps
GridFTP, 4 streams    5.4 Gbps
GridFTP, 8 streams    6.6 Gbps

•  NERSC DTNs have both HPN-SSH and Globus
•  Key point – your local DTN and network connection significantly affect your ability to move data in and out of NERSC
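The throughput figures in the table translate directly into waiting time. As a rough sketch, the code below converts each measured rate into the hours needed to move a data set, assuming the rate is sustained for the whole transfer. The 1 TB (10^12 byte) data set size is a hypothetical example and not a figure from the talk.

```python
# Measured throughput from the table, in bits per second.
MEASURED_BPS = {
    "SCP": 140e6,
    "HPN patched SCP": 1.2e9,
    "FTP": 1.4e9,
    "GridFTP, 4 streams": 5.4e9,
    "GridFTP, 8 streams": 6.6e9,
}

def transfer_hours(size_bytes, rate_bps):
    """Hours to move size_bytes at a sustained rate of rate_bps."""
    return size_bytes * 8 / rate_bps / 3600

def hours_per_tool(size_bytes=1e12):
    """Transfer time in hours for each tool in the table (default: 1 TB)."""
    return {tool: transfer_hours(size_bytes, bps) for tool, bps in MEASURED_BPS.items()}
```

For the assumed 1 TB data set this gives roughly 15.9 hours with SCP and roughly 20 minutes with GridFTP using 8 streams, on the same network path.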

Page 23

Performance Between Computing Facilities

[Figure: transfer rates between DTNs at four facilities: alcf#dtn_mira (ALCF), nersc#dtn (NERSC), olcf#dtn_atlas (OLCF) and ncsa#BlueWaters (NCSA). Rates labeled in the figure: 8.9, 10.7, 7.5, 6.3, 7.6, 9.8, 11.4, 9.2, 21.9, 15.6, 26.1 and 2.8 Gbps. Caption: October 2016, L380 Data Set.]

Data set: L380
Files: 19260
Directories: 211
Other files: 0
Total bytes: 4442781786482 (4.4T bytes)
Smallest file: 0 bytes (0 bytes)
Largest file: 11313896248 bytes (11G bytes)
Size distribution:
    1 - 10 bytes: 7 files
    10 - 100 bytes: 1 files
    100 - 1K bytes: 59 files
    1K - 10K bytes: 3170 files
    10K - 100K bytes: 1560 files
    100K - 1M bytes: 2817 files
    1M - 10M bytes: 3901 files
    10M - 100M bytes: 3800 files
    100M - 1G bytes: 2295 files
    1G - 10G bytes: 1647 files
    10G - 100G bytes: 3 files
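The data set summary can be checked and condensed with a few lines of arithmetic. The sketch below uses only the numbers listed for the L380 data set and the lowest and highest rates labeled in the figure. Which facility pair each rate belongs to is not assumed, and the transfer times assume the rate is sustained for the whole data set.

```python
# File counts per size bucket, in the order listed for the L380 data set.
SIZE_BUCKETS = [
    ("1 - 10 bytes", 7), ("10 - 100 bytes", 1), ("100 - 1K bytes", 59),
    ("1K - 10K bytes", 3170), ("10K - 100K bytes", 1560),
    ("100K - 1M bytes", 2817), ("1M - 10M bytes", 3901),
    ("10M - 100M bytes", 3800), ("100M - 1G bytes", 2295),
    ("1G - 10G bytes", 1647), ("10G - 100G bytes", 3),
]
TOTAL_FILES = 19260
TOTAL_BYTES = 4442781786482

def bucket_total():
    """Sum of all bucket counts, for comparison with the listed file count."""
    return sum(count for _label, count in SIZE_BUCKETS)

def fraction_under_1m():
    """Fraction of files in the first six buckets (smaller than 1M bytes)."""
    return sum(count for _label, count in SIZE_BUCKETS[:6]) / TOTAL_FILES

def mean_file_bytes():
    """Average file size across the data set."""
    return TOTAL_BYTES / TOTAL_FILES

def hours_at(rate_gbps):
    """Hours to move the whole data set at a sustained rate in Gbps."""
    return TOTAL_BYTES * 8 / (rate_gbps * 1e9) / 3600
```

The bucket counts add up to the listed 19260 files, the mean file is roughly 230 MB, and about 40% of the files are smaller than 1M bytes, so the data set mixes many small files with a smaller number of multi-gigabyte ones. At the lowest labeled rate, `hours_at(2.8)` is about 3.5 hours; at the highest, `hours_at(26.1)` is under half an hour.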

Page 24

Handoff to Shreyas Cholia

•  Thanks!