Confidential and Proprietary
THE HIGH-END VIRTUALIZATION COMPANY
SERVER AGGREGATION – CREATING THE POWER OF ONE
vSMP Foundation Introduction
Server Virtualization
• PARTITIONING: subset of the physical resources. A single hypervisor or VMM on one server hosts several virtual machines, each with its own App/OS.
• AGGREGATION: concatenation of physical resources. A hypervisor or VMM on each of several servers presents one virtual machine with a single App/OS.
ScaleMP Fact-sheet
• Founded in 2003
• Proven and mature technology
• Scalable Virtual Machine (VM): SMP replacement driving application capability and system manageability
• World's largest SMP: 32,768 CPUs & 256 TB RAM
• Processor & interconnect agnostic: Intel/AMD x86, Intel Xeon Phi (1Q13); Mellanox IB, Intel TrueScale IB (1Q13)
• Product shipping since 2006: over 4,000 systems in hundreds of deployments across 30 countries
• Channel-only business model
Creating the Power of One: N x Servers, N x OS aggregated into 1 VM, 1 OS
Deployment map: Israel, Qatar, India, China, Brazil, South Africa, Canada, US, Russia, Spain, France, Belgium, Denmark, United Kingdom, S. Korea, Hong Kong, Vietnam, Brunei, Taiwan, Poland, Italy, Germany, Netherlands, Australia, Norway, Japan, Malaysia, Singapore
11/29/2012‐ 3 ‐
Selected Customers and Applications
Life Sciences
– Comp. Chemistry: AMBER, CFOUR, DOCK, GAMESS, Gaussian, GOLD, NWChem, Octopus, OpenEye FRED, OpenEye OMEGA, Schrödinger Jaguar, Schrödinger Glide, SCM ADF, VASP
– Molecular Dynamics: GROMACS, MOLPRO, NAMD, OpenEye ROCS, Schrödinger Desmond, Turbomole
Finance: KX, Wombat
Manufacturing
– Structural Mechanics: ABAQUS/Explicit, ABAQUS/Standard, ALTAIR Radioss, ANSYS Mechanical, LSTC LS-DYNA, NASTRAN, TNO Diana
– Fluid Dynamics: ANSYS CFX, ANSYS Fluent, ANSYS TGrid, AVL FIRE, EXA PowerFlow, EZNSS, GeoDict, MHD3D, NASA Cart3D, STAR-CCM+, STAR-CD
– Other: Comsol, inTrace OpenRT
Energy: IMEX, Norsar 3D, Paradigm GeoDepth, Schlumberger ECLIPSE
EDA: Cadence, HSPICE, Mentor, Quartus, Silvaco SmartSpice, Synopsys
Bio-informatics: 454/Newbler, Abyss, Bowtie, CLC Bio, FASTA, HMMER, Illumina, mpiBLAST, SOAPDenovo, Velvet
Weather Forecasting: MITgcm, MM5 (MPI & OpenMP), MOM4, WRF
Numerical Simulations: Octave, R, MathWorks MATLAB, Wolfram Mathematica
RDBMS & Analytics: Actian VectorWise, Oracle, SAP HANA, Sybase
…and many more homegrown applications
It’s all about Agility
• Application: Small Cluster / Single System
• Infrastructure: Fixed Infrastructure / On-Demand
• Processing: Monolithic System / Modular System
vSMP Foundation ‐ Solutions
• vSMP Foundation for Cluster (Single Operating System): cluster management and server consolidation
• vSMP Foundation for SMP (Single (large) System): compute and memory demanding applications
• vSMP Foundation for Cloud (Single Infrastructure): cloud enabler, on-demand infrastructure
• Application Driven: Compute Intensive, Large Memory, I/O Intensive
• IT Driven: Consolidation
[Figure: for each use case, four nodes (each with two CPUs, I/O, and memory) run a single operating system. The three application-driven cases host one application; the consolidation case hosts eight.]
Server Consolidation Solution
Single OS
• Managing a day-to-day cluster is difficult:
  – Multiple OS installs, security updates, application patches
  – Fabric management
  – Storage management
  – True for both technical computing and enterprise computing
• Requires cluster management know-how:
  – Translates into OPEX investment and requires ongoing consultancy
Server Consolidation Solution
• Server consolidation platform for both enterprise and technical computing environments
• Simplified IT: one system to manage, instead of several
Solution Comparison
SGI Altix UV vs. ScaleMP vSMP Foundation with Intel Xeon Processors
Single System

Columns:
(A) SGI UV 2000
(B) Intel Xeon E5 Dual Processor System + vSMP Foundation Advanced Platform
(C) Intel Xeon E5 Quad Processor System + vSMP Foundation Advanced Platform

Processor
Sockets | (A) 256 | (B) 256 | (C) 512
Max. Core speed | (A) 2.9 GHz (6 cores/socket), 2.7 GHz (8 cores/socket) | (B) 3.3 GHz (4 cores/socket), 3.1 GHz (8 cores/socket) | (C) 2.9 GHz (6 cores/socket), 2.7 GHz (8 cores/socket)
Max. GFLOP/socket | (A) 172.8 (8 cores/socket) | (B) 198.4 (8 cores/socket) | (C) 172.8 (8 cores/socket)
Max. DIMMs/socket | (A) 8 | (B) 12 | (C) 12
Asymmetric processor support | (A) No | (B) Yes | (C) Yes

Backplane
Backplane Technology | (A) NUMAlink6 (proprietary) | (B) InfiniBand (off the shelf) | (C) InfiniBand (off the shelf)
Redundancy | (A) None | (B) Fully redundant (N+1) | (C) Fully redundant (N+1)
Interconnect Speed (unidirectional) | (A) 40 Gbps to 160 Gbps | (B) 56 Gbps to 224 Gbps | (C) 56 Gbps to 224 Gbps

System
Memory Architecture | (A) CC-NUMA (manual memory placement) | (B) CC-NUMA / COMA hybrid (automatic memory placement) | (C) CC-NUMA / COMA hybrid (automatic memory placement)
Max. Memory | (A) 64 TB | (B) 96 TB | (C) 192 TB
Partitioning Support | (A) No | (B) Yes | (C) Yes
Density (sockets/rack) | (A) 64 | (B) 80 to 160 | (C) 80
Density (RAM/rack) | (A) 512 DIMMs | (B) 960 to 1280 DIMMs | (C) 960 DIMMs
Upgrade | (A) Forklift | (B) Modular Architecture | (C) Modular Architecture
Choice of Hardware | (A) No | (B) Yes | (C) Yes

Source: ScaleMP and SGI (http://www.sgi.com/products/servers/altix/uv/specs.html)
Solution Comparison
SGI Altix UV vs. ScaleMP vSMP Foundation with AMD Opteron Processors
Single System

Columns:
(A) SGI UV 2000
(B) AMD Opteron Dual Processor System + vSMP Foundation Advanced Platform
(C) AMD Opteron Quad Processor System + vSMP Foundation Advanced Platform

Processor
Sockets | (A) 256 | (B) 256 | (C) 512
Max. Core speed | (A) 2.9 GHz (6 cores/socket), 2.7 GHz (8 cores/socket) | (B) 3.3 GHz (4 cores/socket), 3.0 GHz (8 cores/socket), 2.7 GHz (16 cores/socket) | (C) 3.3 GHz (4 cores/socket), 3.0 GHz (8 cores/socket), 2.7 GHz (16 cores/socket)
Max. GFLOP/socket | (A) 172.8 (8 cores/socket) | (B) 172.8 (16 cores/socket) | (C) 172.8 (16 cores/socket)
Max. DIMMs/socket | (A) 8 | (B) 8 | (C) 8
Asymmetric processor support | (A) No | (B) Yes | (C) Yes

Backplane
Backplane Technology | (A) NUMAlink6 (proprietary) | (B) InfiniBand (off the shelf) | (C) InfiniBand (off the shelf)
Redundancy | (A) None | (B) Fully redundant (N+1) | (C) Fully redundant (N+1)
Interconnect Speed (unidirectional) | (A) 40 Gbps to 160 Gbps | (B) 40 Gbps to 160 Gbps | (C) 40 Gbps to 160 Gbps

System
Memory Architecture | (A) CC-NUMA (manual memory placement) | (B) CC-NUMA / COMA hybrid (automatic memory placement) | (C) CC-NUMA / COMA hybrid (automatic memory placement)
Max. Memory | (A) 64 TB | (B) 64 TB | (C) 128 TB
Partitioning Support | (A) No | (B) Yes | (C) Yes
Density (sockets/rack) | (A) 64 | (B) 160 | (C) 160
Density (RAM/rack) | (A) 512 DIMMs | (B) 1280 DIMMs | (C) 1280 DIMMs
Upgrade | (A) Forklift | (B) Modular Architecture | (C) Modular Architecture
Choice of Hardware | (A) No | (B) Yes | (C) Yes

Source: ScaleMP and SGI (http://www.sgi.com/products/servers/altix/uv/specs.html)
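The Max. GFLOP/socket figures in the two comparison tables are consistent with the usual peak estimate: clock (GHz) x cores x double-precision FLOPs per cycle per core. The sketch below reproduces them under that reading. The FLOPs-per-cycle values (8 for the Intel Xeon E5 parts, 4 for the 16-core AMD Opteron parts) are assumptions that are not stated on the slides, and the function and constant names are illustrative only.

```python
# Theoretical peak GFLOPS per socket = clock (GHz) x cores x FLOPs/cycle/core.
# The FLOPs-per-cycle constants below are assumptions, not values from the slides.
INTEL_E5_FLOPS_PER_CYCLE = 8     # assumed: AVX, 4-wide add + 4-wide multiply per cycle
AMD_OPTERON_FLOPS_PER_CYCLE = 4  # assumed: one FMA unit shared by the two cores of a module


def peak_gflops_per_socket(clock_ghz, cores, flops_per_cycle):
    """Return the theoretical double-precision peak of one socket in GFLOPS."""
    return clock_ghz * cores * flops_per_cycle


# (label, clock in GHz, cores per socket, assumed FLOPs per cycle per core)
configurations = [
    ("Xeon E5, 8 cores @ 2.7 GHz", 2.7, 8, INTEL_E5_FLOPS_PER_CYCLE),
    ("Xeon E5, 8 cores @ 3.1 GHz", 3.1, 8, INTEL_E5_FLOPS_PER_CYCLE),
    ("Opteron, 16 cores @ 2.7 GHz", 2.7, 16, AMD_OPTERON_FLOPS_PER_CYCLE),
]

for label, clock_ghz, cores, flops in configurations:
    peak = peak_gflops_per_socket(clock_ghz, cores, flops)
    print(f"{label}: {peak:.1f} GFLOPS/socket")
```

Under these assumptions the three configurations come out at 172.8, 198.4 and 172.8 GFLOPS per socket, matching the table entries. These are theoretical peaks, not measured application performance.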
Big‐data: Bio‐informatics & Analytics
Single System
• Target analytics with integrated solution.
• Very large memory:
  – up to 7.5 TB RAM without InfiniBand switch
  – up to 48 TB RAM with InfiniBand switch
The Virtualized Datacenter
Single Infrastructure
• Single infrastructure, many workloads. Allows for big-data problems in a cloud environment.
• Integrated with leading provisioning software.
[Figure labels: Compute VM, Compute Resource, Big Data VMs, Big Data Resource]
PERFORMANCE
Gaussian Performance Comparison - 64 cores

Test case | vSMP Foundation runtime | SGI UV1 runtime | Performance Difference
Foxglove | 29,118 | 43,753 | -33%
Diporphyrin | 12,844 | 36,482 | -65%

Systems:
• vSMP Foundation: Intel Xeon E5-2670 @ 2.6GHz (64 cores), 64GB/socket, 8TB total
• SGI UV1: Intel Xeon X7560 @ 2.27GHz (64 cores), 64GB/socket, 16TB total
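The Performance Difference labels on the Gaussian chart match the relative runtime change of vSMP Foundation against SGI UV1. A minimal sketch of that arithmetic follows; the formula is inferred from the plotted numbers rather than stated on the slide, and the function name is illustrative.

```python
def performance_difference(vsmp_runtime, other_runtime):
    """Percent change of the vSMP Foundation runtime relative to the other system.

    A negative value means vSMP Foundation finished in less time.
    """
    return (vsmp_runtime - other_runtime) / other_runtime * 100.0


# Runtimes as read from the Gaussian chart: (vSMP Foundation, SGI UV1).
gaussian_runs = {
    "Foxglove": (29118, 43753),
    "Diporphyrin": (12844, 36482),
}

for name, (vsmp, sgi) in gaussian_runs.items():
    print(f"{name}: {performance_difference(vsmp, sgi):.0f}%")
```

With the chart values this gives -33% for Foxglove and -65% for Diporphyrin, the same figures shown on the slide.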
Velvet Performance Comparison (1/2) - 16 cores

brain.chr19.k25 (Hash=21GB, Graph=163GB)
System | Hash | Graph | Performance Difference
vSMP Foundation | 474 | 6,841 | -
Sun X4600M2 | 566 | 8,583 | -20%
SGI UV1 | 602 | 8,843 | -23%

blood.chr19.k25 (Hash=28GB, Graph=250GB)
System | Hash | Graph | Performance Difference
vSMP Foundation | 526 | 11,752 | -
Sun X4600M2 | 841 | 12,715 | -9%
SGI UV1 | 873 | 14,153 | -18%

Systems:
• vSMP Foundation: Intel Xeon E5-2670 @ 2.6GHz, 64GB/socket, 8TB total
• Sun X4600M2: AMD Opteron 8380 @ 2.5GHz, 64GB/socket, 512GB total
• SGI UV1: Intel Xeon X7560 @ 2.27GHz, 64GB/socket, 16TB total
Velvet Performance Comparison (2/2) - 16 cores

brain.chr1.k29 (Hash=118GB, Graph=344GB)
System | Hash | Graph | Performance Difference
vSMP Foundation | 2,410 | 18,000 | -
Sun X4600M2 | 3,096 | 21,778 | -18%
SGI UV1 | 3,675 | 28,229 | -36%

blood.chr1.k29 (Hash=164GB, Graph=459GB)
System | Hash | Graph | Performance Difference
vSMP Foundation | 3,670 | 25,380 | -
Sun X4600M2 | 4,328 | 29,006 | -13%
SGI UV1 | 7,965 | 37,501 | -36%

Systems:
• vSMP Foundation: Intel Xeon E5-2670 @ 2.6GHz, 64GB/socket, 8TB total
• Sun X4600M2: AMD Opteron 8380 @ 2.5GHz, 64GB/socket, 512GB total
• SGI UV1: Intel Xeon X7560 @ 2.27GHz, 64GB/socket, 16TB total
MHD3D

x-axis | Time/Step (sec.) | Efficiency
16 | 510 | 100%
32 | 288 | 89%
64 | 158 | 81%
128 | 87 | 73%
256 | 54 | 59%
512 | 39 | 41%

Lanczos

x-axis | Time (sec.) | Efficiency
128 | 10,021 | 100%
256 | 5,289 | 95%
384 | 4,058 | 82%
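The Efficiency series in the MHD3D and Lanczos charts agree with the standard strong-scaling definition, taking the smallest run as the baseline: efficiency(n) = (T_base x n_base) / (T_n x n). The sketch below recomputes the percentages from the plotted runtimes. It assumes the unlabeled x-axis is the processor (core) count, which the extracted slide does not state, and the function name is illustrative.

```python
def parallel_efficiency(scale, runtimes):
    """Strong-scaling efficiency in percent, relative to the first (smallest) run.

    scale    -- processor counts (assumed meaning of the chart's x-axis)
    runtimes -- measured time for each entry in scale
    """
    base_n, base_t = scale[0], runtimes[0]
    return [(base_t * base_n) / (t * n) * 100.0 for n, t in zip(scale, runtimes)]


# Values as read from the charts.
mhd3d = parallel_efficiency([16, 32, 64, 128, 256, 512], [510, 288, 158, 87, 54, 39])
lanczos = parallel_efficiency([128, 256, 384], [10021, 5289, 4058])

print("MHD3D:  ", [f"{e:.0f}%" for e in mhd3d])
print("Lanczos:", [f"{e:.0f}%" for e in lanczos])
```

Rounded to whole percentages, the results are 100, 89, 81, 73, 59, 41 for MHD3D and 100, 95, 82 for Lanczos, matching the plotted labels.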
STREAM

x-axis | TRIAD (GB/s) | Efficiency
8 | 32 | 100%
16 | 63 | 100%
32 | 125 | 99%
64 | 249 | 99%
128 | 496 | 98%
256 | 934 | 93%
512 | 1,741 | 86%

MKL (DGEMM), NDIM = 32,000

x-axis | GFLOPS
16 | 278
32 | 559
64 | 1,111
128 | 2,224
256 | 4,333
512 | 5,231

Efficiency labels recovered from the DGEMM chart: 100%, 100%, 100%, 97%, 59% (five labels for six data points).
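For throughput benchmarks such as STREAM and DGEMM, scaling efficiency is usually taken as the per-unit rate relative to the smallest run: efficiency(n) = (R_n / n) / (R_base / n_base). The sketch below applies that definition to the plotted DGEMM GFLOPS values. It assumes the unlabeled x-axis is the core count, and because the chart's data labels are rounded, the recomputed percentages can differ from the plotted ones by about a point. The function name is illustrative.

```python
def throughput_efficiency(scale, rates):
    """Scaling efficiency in percent for a throughput metric (GFLOPS, GB/s).

    scale -- processor counts (assumed meaning of the chart's x-axis)
    rates -- measured throughput for each entry in scale
    """
    base_per_unit = rates[0] / scale[0]
    return [(r / n) / base_per_unit * 100.0 for n, r in zip(scale, rates)]


# DGEMM GFLOPS as read from the chart (NDIM = 32,000).
dgemm_scale = [16, 32, 64, 128, 256, 512]
dgemm_gflops = [278, 559, 1111, 2224, 4333, 5231]

for n, eff in zip(dgemm_scale, throughput_efficiency(dgemm_scale, dgemm_gflops)):
    print(f"{n:>4}: {eff:.1f}%")
```

The last two points come out near 97% and 59%, in line with the plotted labels, while the smaller runs stay within about one point of 100%.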
See performance info at http://www.ScaleMP.com/performance
THE HIGH-END VIRTUALIZATION COMPANY
SERVER AGGREGATION – CREATING THE POWER OF ONE
Do More – Pay Less!