100G networking within JASMIN

Date post: 09-Jan-2017
Upload: jisc
Jonathan Churchill, Campus network engineering workshop 19/10/2016 100G networking within JASMIN
Transcript
Page 1

100G networking within JASMIN

Jonathan Churchill, Campus network engineering workshop, 19/10/2016

Page 2

100G networking within JASMIN

Campus network engineering for data-intensive science workshop

October 19th 2016

Jonathan Churchill

JASMIN Infrastructure Manager

(STFC Scientific Computing Dept.)

Page 3

JASMIN is a world leading, unique hybrid of:
• 16PB high performance storage (~250GByte/s)
• High-performance computing (~4,000 cores)
• 35PB Archive and Elastic Tape
• Non-blocking Networking (> 3Tbit/sec), and Optical Private Network WANs
• Coupled with cloud hosting capabilities

Cloud is here !

17PB

40PB

5,000

Page 4

[Diagram labels: JASMIN 1; JASMIN 2; JASMIN 3; JASMIN 3.5; JASMIN 4,5 (2016–20 …); DMZ; LOTUS; “Cloud Lives here”; “100G lives here”]

Storage and Servers distributed over the fabric network.

Page 5

JASMIN “Fabric” Networking

Page 6

The need for speed

347Gbps = 34,700 Broadband connections
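The comparison above implies a per-connection broadband rate that the slide does not state. A quick check of that arithmetic, treating both figures as exact and using decimal units (the variable names are illustrative):

```python
# Figures taken from the slide; the per-connection rate is derived, not stated.
fabric_gbps = 347
broadband_connections = 34_700

# Convert Gbit/s to Mbit/s (decimal units) and share it across the connections.
per_connection_mbps = fabric_gbps * 1000 / broadband_connections

print(f"{per_connection_mbps:.0f} Mbit/s per broadband connection")
```

In other words, the slide appears to assume a broadband connection of roughly 10 Mbit/s.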

Page 7

[Diagram labels: JC2-LSW1 (repeated); JC2-SP1 (repeated)]

16 x MSX1024B-1BFS: 48 x 10GbE + 12 x 40GbE

48 x 16 = 768 10GbE, non-blocking

16 x 12 x 40GbE = 192 40GbE ports

S1036 = 32 x 40GbE

16 x 12 40GbE = 192 ports / 32 = 6

Total 192 40GbE cables

1,900 @ 10GbE Ports

• Non-Blocking. Zero Contention (48 x 10Gb = 12 x 40Gb uplinks)
• Low Latency (250ns L3 per switch/router)
• Cheap(er)
• But it's all layer 3 routed (ECMP OSPF)

954 Routes
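The port arithmetic on this slide can be checked in a few lines. This is a minimal sketch using only the counts quoted above; the variable names are illustrative, and the even spread of uplinks across the six 32-port switches is an assumption, not something the slide states.

```python
# Port counts as quoted on the slide.
edge_switches = 16            # MSX1024B-1BFS units
ports_10g_per_switch = 48     # 10GbE ports per unit
uplinks_40g_per_switch = 12   # 40GbE uplinks per unit
ports_per_core_switch = 32    # 40GbE ports per "S1036", per the slide

# Totals across the fabric.
total_10g_ports = edge_switches * ports_10g_per_switch        # 768
total_40g_uplinks = edge_switches * uplinks_40g_per_switch    # 192

# Number of 32-port switches needed to terminate every uplink.
core_switches = total_40g_uplinks // ports_per_core_switch    # 6

# Contention ratio: server-facing capacity over uplink capacity.
# 1.0 means non-blocking (48 x 10Gb equals 12 x 40Gb).
contention = (ports_10g_per_switch * 10) / (uplinks_40g_per_switch * 40)

# Assumption: uplinks are spread evenly, giving this many equal-cost
# links from each edge switch to each core switch.
links_per_core = uplinks_40g_per_switch // core_switches      # 2

print(total_10g_ports, total_40g_uplinks, core_switches, contention, links_per_core)
```

A contention ratio of exactly 1.0 is what the "Zero Contention" bullet refers to.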

Page 8

Bandwidth ??

Data via the DTZ

Through the IaaS firewall

DTZ Bandwidth 1:1 match to IaaS hypervisors.

Page 9

Data rates inside PaaS = IaaS ?
• How can we provide data access to IaaS Cloud tenants at rates similar to “inside” JASMIN (PaaS) ?
– aka 100Gbits/sec

Page 10

Non Blocking data access inside JASMIN

[Diagram labels: SP1, SP2, SP3, SP4; LSW1, LSW2, LSW3, LSWn; host001, host002, host024; host025, host026, host027; iSCSI; Underlay networks; 172.26.66.0/26; 172.26.66.64/26; 172.16.136.0/24; 172.16.137.0/24; ~10x 10-12Gbps per “Bladeset”; 24x 10Gbps; 12x 40Gb ECMP uplinks per switch/router]

A non-blocking path to the IaaS cloud needs to duplicate or fit into this fabric.

And still 1:1 using 10Gb servers

Page 11

Non Blocking data access to JASMIN IaaS via 100G ?

[Diagram labels: SP1, SP2, SP3, SP4; LSW1, LSW2, LSW3, LSW20, LSW21, LSWn; host001, host002, host024; host025, host026, host027; host-100G-1, host-100G-2; vmhost1, vmhost2, vmhost24; iSCSI; Underlay networks; 172.26.66.0/26; 172.26.66.64/26; 172.16.136.0/24; 172.16.137.0/24; ~10x 10-12Gbps per “Bladeset”; 24x 10Gbps (twice); 12x 40Gb ECMP uplinks per switch/router; 1:10 server to client; “Blessed” private subnet]
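The "1:10 server to client" label follows from the port speeds: one 100Gb server port can feed ten 10Gb clients at line rate. The sketch below works that through for the 24 hypervisors in the diagram. It assumes ideal line-rate throughput and ignores PCIe and protocol overheads; the variable names are illustrative.

```python
import math

server_port_gbps = 100   # one port on a host-100G-* server
client_port_gbps = 10    # one vmhost link
vmhosts = 24             # vmhost1 ... vmhost24 in the diagram

# How many 10Gb clients a single 100Gb port can serve at line rate.
clients_per_server_port = server_port_gbps // client_port_gbps

# Aggregate demand if every hypervisor link is driven at line rate.
aggregate_client_gbps = vmhosts * client_port_gbps

# 100Gb server ports needed to match that demand 1:1 (idealised).
server_ports_needed = math.ceil(aggregate_client_gbps / server_port_gbps)

print(clients_per_server_port, aggregate_client_gbps, server_ports_needed)
```

Under these idealised assumptions, 24 hypervisors at 10Gb each need three 100Gb server ports to stay non-blocking.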

Page 12

Hardware
• Mellanox ConnectX-4 Dual port 100Gb QSFP+ DA
– Dell R730XD servers.
– VXLAN/NVGRE and Erasure Coding offload in h/w
• Mellanox Dual MSN2100 16 port x 100G switch/routers

Page 13

Potential Issues
• Blocking “backdoor” access across the fabric
– Port ingress/egress ACLs ?
• …but trunked VLANs or VXLANs at hypervisor port(s)
• Performance impact of VXLAN terminations
• 100G on the server at all
– cf. 1->10Gb kernel tuning transition
– 2x 100Gb ports > PCIe 3 bandwidth (limited to 120Gb)
• Can the software keep up ?
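The PCIe limit mentioned above can be roughly estimated. This sketch assumes the adapter sits in a PCIe 3.0 x16 slot (the slides do not state the slot width) and uses the PCIe 3.0 figures of 8 GT/s per lane with 128b/130b encoding. The result is an upper bound before packet overheads, so usable throughput is lower.

```python
# Assumption: the dual-port adapter uses a PCIe 3.0 x16 slot.
lanes = 16
gigatransfers_per_lane = 8.0      # PCIe 3.0 signalling rate, GT/s per lane
encoding_efficiency = 128 / 130   # 128b/130b line encoding

# Raw slot bandwidth in Gbit/s, before TLP/DLLP protocol overhead.
pcie_gbps = lanes * gigatransfers_per_lane * encoding_efficiency

# Combined line rate of the two 100Gb ports.
nic_gbps = 2 * 100

print(f"PCIe 3.0 x16 upper bound: {pcie_gbps:.1f} Gbit/s")
print(f"Dual 100Gb ports:         {nic_gbps} Gbit/s")
print(f"Shortfall:                {nic_gbps - pcie_gbps:.1f} Gbit/s")
```

The upper bound comes out at about 126 Gbit/s, in line with the roughly 120Gb figure on the slide, so both ports cannot run at full rate through one slot.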

Page 14

100G server software ?

[Diagram labels: host-100G-1; tomcat-1, tomcat-2, tomcat-3, tomcat-4; apache/nginx Load Balancing]

• OPeNDAP
– Parallel servers and threads ?
– CPU and RAM implications
– JVM memory issues ?

Page 15

Summary
• Target: Provide “non-blocking” data access to JASMIN IaaS.
• Use of 100Gb Networking:
– Reduces server count
– Scalable for growing infrastructure
• Experimental. Many potential issues to resolve:
– Fabric routing egress/ingress ACLs
– 100G kernel tuning ?
– Can the software keep up ?

Page 16

Questions

