Fat-tree Data Center Topology -...

Date post: 13-Sep-2018
Upload: trinhque
Transcript
Page 1: Fat-tree Data Center Topology - web.eecs.umich.eduweb.eecs.umich.edu/~sugih/courses/eecs589/f13/24-FatTree.pdf · Fat-tree Data Center Topology.pptx Author: sugih Created Date: 10/31/2013

A Scalable, Commodity Data Center Network Architecture

Al-Fares, Loukissas, Vahdat, "A Scalable, Commodity Data Center Network Architecture," Proc. of ACM SIGCOMM '08, 38(4):63-74, Oct. 2008.

Presenter: William Beyer

Paper Goals

•  Point out faults with current data center designs
•  Propose new architecture based on fat-tree
   –  Scalable interconnection bandwidth
   –  Economies of scale
   –  Backward compatibility

A Typical Data Center

•  Data center topology is typically 2-3 level tree of switches and routers

Oversubscription

•  Ratio of worst-case achievable aggregate bandwidth among end-hosts to the total bisection bandwidth of the network topology
   –  Ability of hosts to fully utilize their uplink capacity
•  1:1: All hosts can use full uplink capacity
•  5:1: Only 20% of host bandwidth may be available
•  Typical ratio is 2.5:1 (400 Mbps) to 8:1 (125 Mbps)
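The arithmetic behind these figures can be sketched as follows. This is a minimal illustration that assumes 1 Gbps host links (the link speed implied by the 400 Mbps and 125 Mbps figures); the function name is hypothetical and not taken from the paper.

```python
def worst_case_host_bandwidth_mbps(link_mbps: float, oversubscription: float) -> float:
    """Per-host bandwidth in the worst case for a given oversubscription ratio.

    `oversubscription` is the left-hand side of an N:1 ratio, e.g. 2.5 for 2.5:1.
    """
    if oversubscription < 1:
        raise ValueError("oversubscription ratio must be at least 1")
    return link_mbps / oversubscription


if __name__ == "__main__":
    # Assumed link speed: 1 Gbps (1000 Mbps) per host.
    for ratio in (1, 2.5, 5, 8):
        mbps = worst_case_host_bandwidth_mbps(1000, ratio)
        print(f"{ratio}:1 -> {mbps:.0f} Mbps per host")
```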


Multi-path Routing

•  “Multi-rooted” tree required to communicate at full bandwidth for large clusters
   –  Otherwise limited to max bandwidth of a single expensive switch (128-port 10 GigE)
•  Use multi-path routing technique such as ECMP
   –  Performs static load splitting, cannot account for flow sizes
   –  Routing tables become very large with multiple paths
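To make "static load splitting" concrete, here is a rough sketch of hash-based next-hop selection in the style of ECMP. It is an illustration only: real switches use hardware hash functions, SHA-256 is a stand-in, and the function name and flow tuple layout are assumptions. Note that the flow's size never enters the decision, which is the limitation the slide points out.

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Pick a next hop by hashing the flow identifier (e.g. its 5-tuple).

    The choice depends only on the header fields in `flow`, so every packet
    of a flow takes the same path, regardless of how large the flow is.
    """
    if not next_hops:
        raise ValueError("at least one next hop is required")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


if __name__ == "__main__":
    hops = ["core0", "core1", "core2", "core3"]  # hypothetical equal-cost next hops
    # (src IP, dst IP, protocol, src port, dst port)
    flow = ("10.0.1.2", "10.2.0.3", 6, 40000, 80)
    print(ecmp_next_hop(flow, hops))
```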

Cost Analysis (two slides; figures not captured in the transcript)

Fat-tree Architecture

•  k-ary fat-tree: three-layer topology (edge, aggregation, core)
   –  k pods, each consists of (k/2)^2 hosts and two layers (edge/aggregate) each with k/2 k-port switches
   –  Each edge switch connects to k/2 hosts and k/2 aggregate switches
   –  Each aggregate switch connects to k/2 edge and k/2 core switches
   –  (k/2)^2 core switches: each connects to k pods
   –  Supports k^3/4 hosts!
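The counts above follow directly from k. A small sketch that tabulates them is given below; the function and key names are illustrative, not from the paper.

```python
def fat_tree_sizes(k: int) -> dict:
    """Element counts for a k-ary fat-tree built from k-port switches."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k/2 per pod
        "aggregation_switches": k * half,  # k/2 per pod
        "core_switches": half * half,      # (k/2)^2
        "hosts_per_pod": half * half,      # (k/2)^2
        "hosts": k ** 3 // 4,              # k pods * (k/2)^2 hosts
    }


if __name__ == "__main__":
    for k in (4, 48):
        print(k, fat_tree_sizes(k))
```

For k = 4 this gives 16 hosts and 4 core switches; for k = 48 it gives 27,648 hosts and 576 core switches.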


Fat-tree Topology with k = 4 (figure not captured in the transcript)

Issues with Fat-tree Topologies

•  Backwards compatible with IP/Ethernet
   –  Good thing, but routing algorithms will naively choose a single shortest path to use between subnets
   –  Leads to bottlenecks quickly
   –  (k/2)^2 shortest paths available, should use them all equally
•  Complex wiring due to lack of high speed ports

Addressing in Fat-tree

•  Use 10.0.0.0/8 private addressing block
•  Pod switches have address 10.pod.switch.1
   –  Pod and switch in [0, k-1] based on position
•  Core switches have address 10.k.j.i
   –  i and j denote core position in (k/2)^2 core switches
•  Hosts have address 10.pod.switch.ID
   –  ID is host ID in switch subnet ([2, (k/2) + 1])
   –  k < 256, this scheme does not scale indefinitely
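The addressing scheme can be enumerated with a short sketch. Two details here are assumptions not stated on the slide: that i and j each run over [1, k/2], and that edge switches occupy switch numbers 0 to k/2 - 1 within a pod (so hosts only appear under those switch values). All function names are hypothetical.

```python
def _check_k(k: int) -> int:
    # k appears as an address octet (10.k.j.i), hence the k < 256 limit.
    if k < 2 or k % 2 != 0 or k >= 256:
        raise ValueError("k must be an even integer with 2 <= k < 256")
    return k // 2


def pod_switch_addresses(k: int) -> list:
    """10.pod.switch.1 for every pod and every switch position in [0, k-1]."""
    _check_k(k)
    return [f"10.{pod}.{switch}.1" for pod in range(k) for switch in range(k)]


def core_switch_addresses(k: int) -> list:
    """10.k.j.i; assumes j and i each range over [1, k/2]."""
    half = _check_k(k)
    return [f"10.{k}.{j}.{i}"
            for j in range(1, half + 1)
            for i in range(1, half + 1)]


def host_addresses(k: int) -> list:
    """10.pod.switch.ID with ID in [2, k/2 + 1]; assumes edge switches are 0..k/2-1."""
    half = _check_k(k)
    return [f"10.{pod}.{switch}.{host_id}"
            for pod in range(k)
            for switch in range(half)
            for host_id in range(2, half + 2)]


if __name__ == "__main__":
    print(core_switch_addresses(4))
    print(host_addresses(4)[:4])
```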

Two-Level Lookup Table

•  Prefixes used for forwarding intra-pod traffic
•  Suffixes used for forwarding inter-pod traffic


Two-Level Lookup Implementation

•  Implemented in hardware using a TCAM
   –  Can perform parallel lookups across table
   –  Stores don’t care bits, suitable for storing variable length prefixes
•  Prefixes preferred over suffixes
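A software sketch of the two-level lookup may help: the longest matching prefix wins, and if that prefix has no port of its own, the low-order bits of the address are matched against a suffix list. This is an illustrative model, not the TCAM implementation; the class, its methods, and the example table contents (for an aggregation switch in pod 2 of a k = 4 fat-tree) are assumptions.

```python
import ipaddress


class TwoLevelTable:
    """Prefix table whose entries either name a port or defer to suffixes."""

    def __init__(self):
        # Each entry: (network, port or None, list of (value, bits, port)).
        self.entries = []

    def add_prefix(self, cidr, port=None):
        self.entries.append((ipaddress.ip_network(cidr), port, []))

    def add_suffix(self, cidr, value, bits, port):
        """Attach a suffix rule (match the low `bits` bits) to an existing prefix."""
        target = ipaddress.ip_network(cidr)
        for network, _, suffixes in self.entries:
            if network == target:
                suffixes.append((value, bits, port))
                return
        raise KeyError(f"no prefix entry for {cidr}")

    def lookup(self, address):
        addr = ipaddress.ip_address(address)
        matches = [e for e in self.entries if addr in e[0]]
        if not matches:
            return None
        # Longest-prefix match first.
        _, port, suffixes = max(matches, key=lambda e: e[0].prefixlen)
        if port is not None:
            return port
        # Prefix did not terminate: fall through to the suffix rules.
        for value, bits, out_port in suffixes:
            if int(addr) & ((1 << bits) - 1) == value:
                return out_port
        return None


def example_aggregation_table():
    """Hypothetical table for an aggregation switch in pod 2 (k = 4)."""
    table = TwoLevelTable()
    table.add_prefix("10.2.0.0/24", port=0)   # intra-pod: edge switch 0
    table.add_prefix("10.2.1.0/24", port=1)   # intra-pod: edge switch 1
    table.add_prefix("0.0.0.0/0")             # everything else: use suffixes
    table.add_suffix("0.0.0.0/0", value=2, bits=8, port=2)  # host ID 2
    table.add_suffix("0.0.0.0/0", value=3, bits=8, port=3)  # host ID 3
    return table


if __name__ == "__main__":
    t = example_aggregation_table()
    for dst in ("10.2.0.3", "10.2.1.2", "10.3.0.2", "10.3.0.3"):
        print(dst, "->", t.lookup(dst))
```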

Routing Algorithm

•  Prefixes in two-level table prevent intra-pod traffic from leaving pod
•  Inter-pod traffic handled by suffix table
   –  Suffixes based off host IDs, ensures spread of traffic across core switches
   –  Prevents packet reordering by having static path
•  Each host-to-host communication has a single static path
   –  Better than having a single path between subnets

Routing Algorithm (cont.)

•  Core switches contain (10.pod.0.0/16, port) entries
   –  Statically forwards inter-pod traffic on specified port
•  Aggregate switches contain (10.pod.switch.0/24, port) entries
   –  Switch value is the edge switch number
•  Assumes a central entity with full knowledge of topology generates these routing tables
   –  Also responsible for detecting switch failures and re-routing traffic
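A central entity could generate the prefix entries described above along these lines. This is a sketch under one labeled assumption: the output port number equals the destination pod number (core switches) or the edge switch number (aggregate switches). The function names are hypothetical.

```python
def core_table(k: int) -> list:
    """(prefix, port) entries for a core switch: one /16 per pod.

    Assumption: port p on every core switch leads to pod p.
    """
    return [(f"10.{pod}.0.0/16", pod) for pod in range(k)]


def aggregation_prefixes(k: int, pod: int) -> list:
    """(prefix, port) entries for intra-pod traffic on an aggregate switch.

    Assumption: port e leads to edge switch e, with edge switches
    numbered 0 .. k/2 - 1 inside the pod.
    """
    if not 0 <= pod < k:
        raise ValueError("pod must be in [0, k-1]")
    return [(f"10.{pod}.{edge}.0/24", edge) for edge in range(k // 2)]


if __name__ == "__main__":
    print(core_table(4))
    print(aggregation_prefixes(4, pod=2))
```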

Routing Algorithm Example (figure not captured in the transcript)


Dynamic Routing Techniques

•  Alternatives to two-level routing table
   –  Attempt to classify and schedule flows rather than use static routing
•  Flow Classification
   –  Periodically reassigns flow output ports
   –  Prevents competition between flows for a single port
•  Flow Scheduling
   –  Identify large flows and establish reserved paths for them
   –  Requires communication between edge switches and a central flow scheduler
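The flow scheduling idea can be illustrated with a simple greedy placement. This is not the paper's scheduler: the size threshold, the least-loaded-path rule, and all names are assumptions chosen only to show how a central entity might separate large flows onto different paths.

```python
def schedule_large_flows(flows: dict, paths: list, threshold_bytes: int) -> dict:
    """Assign each large flow to the currently least-loaded path.

    `flows` maps a flow ID to its observed size in bytes. Flows below
    `threshold_bytes` are left to the default (static) routing.
    """
    if not paths:
        raise ValueError("at least one path is required")
    load = {path: 0 for path in paths}
    assignment = {}
    large = [(size, flow_id) for flow_id, size in flows.items()
             if size >= threshold_bytes]
    # Place the biggest flows first so they end up on different paths.
    for size, flow_id in sorted(large, reverse=True):
        best = min(paths, key=lambda p: load[p])
        assignment[flow_id] = best
        load[best] += size
    return assignment


if __name__ == "__main__":
    observed = {"a": 900, "b": 800, "c": 10}  # hypothetical flow sizes
    print(schedule_large_flows(observed, ["core0", "core1"], threshold_bytes=100))
```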

Fault Tolerance

•  Many possible paths between hosts lead to “easy” fault tolerance
•  Each switch maintains Bidirectional Forwarding Detection session with neighbors
   –  Allows switch to determine when neighbors fail
•  Two primary types of link failure
   –  Between lower and upper switches
   –  Between upper and core switches

Router Power and Heat Dissipation (figure not captured in the transcript)

Topology Power/Heat Dissipation (figure not captured in the transcript)


Figure credits for slides not captured in the transcript: Cafarella, 2013; Hamilton, 2008; Emerson Network Power, 2007; EPA, 2007

Software Implementation

•  Validated in software using Click
   –  Click is a modular software router architecture
   –  Implement routers on PCs, supports experimental router designs
•  Click modules called “elements”
   –  Each element performs a specified task
   –  Routing table lookup, decrement packet TTL, etc…
•  Implemented elements for two-level table, flow classifier, and flow scheduler


Evaluation Setup

•  Uses a 4-port fat-tree as seen previously
   –  Two-level table and flow-based schemes analyzed
   –  Compared against hierarchical tree with oversubscription ratio of 3.6:1
•  Both evaluated using Click
   –  Emulate switches and hosts on PCs
•  All hosts generate 96 Mbit/s of outgoing traffic
   –  This value prevents CPU from throttling test

Evaluation Results

•  Percentages indicate aggregate network bandwidth
   –  Measured as amount of incoming traffic received by hosts

Flow Scheduler Requirements

•  Minimal time and memory requirements for flow scheduler
•  Feasible to use at least until k grows extremely large

Packaging Problem

•  Fat-tree has significant cabling overhead
   –  1 GigE switches used to reduce cost
   –  Lack of 10 GigE ports leads to more cabling
•  Present a packaging solution for k=48
   –  Generalizes to other values of k


Packaging Solution (figure not captured in the transcript)

Strengths

•  Fat-tree architecture seems to outperform hierarchical solution
•  Excellent power and heat reductions over hierarchical approach
•  Evaluation methods were good overall with tests performed
•  Data centers can easily switch to this new method

Weaknesses

•  Language used in paper was confusing at times
   –  Referred to pod switches as “aggregate switch”, “upper-layer switch”, and “upper pod switch” at various points
•  Evaluation performed with small value of k=4
   –  Would have been nice to see higher values of k tested
   –  Academic project and resources were obviously a factor for evaluation

References

•  Al-Fares, Loukissas, Vahdat, "A Scalable, Commodity Data Center Network Architecture," Proc. of ACM SIGCOMM '08, 38(4):63-74, Oct. 2008.
•  Cafarella, M. (2013, April 20). Datacenters. EECS 485. Lecture conducted from University of Michigan, Ann Arbor.
•  "Energy Efficient Cooling Solutions for Data Centers." Emerson Network Power. 2007. Web. 28 Oct. 2013. <http://www.emersonnetworkpower.com/documents/en-us/latest-thinking/edc/documents/white%20paper/energy_efficient_cooling_solutions_for_data_centers.pdf>.
•  Hamilton, James. "Perspectives - Cost of Power in Large-Scale Data Centers." Perspectives @ James Hamilton's Blog. N.p., 28 Nov. 2008. Web. 28 Oct. 2013. <http://perspectives.mvdirona.com/2008/11/28/CostOfPowerInLargeScaleDataCenters.aspx>.
•  "Report to Congress on Server and Data Center Energy Efficiency." Energy Star. U.S. Environmental Protection Agency, 2 Aug. 2007. Web. 28 Oct. 2013. <http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf?db729bf5a>.

