89 Fifth Avenue, 7th Floor
New York, NY 10003
www.TheEdison.com
@EdisonGroupInc
212.367.7400
IBM® Spectrum Scale™ vs EMC Isilon for IBM® Spectrum Protect™ Workloads
A Competitive Test and Evaluation Report
Printed in the United States of America
Copyright 2015 Edison Group, Inc. New York.
Edison Group offers no warranty either expressed or implied on the information contained
herein and shall be held harmless for errors resulting from its use.
The information contained in this document is based on IBM provided materials and
independent research and was aggregated and validated for Edison Group, Inc. by the Edison
Group Analyst team.
All products are trademarks of their respective owners.
First Publication: February, 2015
Produced by: Matthew Elkourie, Analyst; Manny Frishberg, Editor; Barry Cohen, Editor-in-Chief
Table of Contents
Executive Summary
Introduction to Spectrum Scale
Test Summary Overview and Results
Technology Review and Configuration
EMC Isilon for IBM Spectrum Protect
Spectrum Scale for IBM Spectrum Protect
Conclusions
Appendix A – IBM Spectrum Scale Architecture Overview
Appendix B – IBM Test Reference Architecture
Edison: IBM® Spectrum Scale™ vs EMC Isilon for IBM® Spectrum Protect™ Workloads Page 1
Executive Summary
When considering enterprise storage software options, IT managers constantly strive to
find the most efficient, scalable, and high performance solutions that solve today’s
storage performance and scalability challenges, while future-proofing their investment
to handle new workloads and data types. Enterprise backup solutions can be
particularly vulnerable to issues stemming from poor network performance to the
storage array(s), and are often not designed with the scalability demanded by rapidly
changing enterprise environments.
Data backup is critical to enterprise users, and often faces challenges in network
throughput (often viewed as the key factor in how long it takes to perform a backup of
data) as well as scalability (viewed as how much the system can grow before new
solutions would be required). Using published data, Edison compared a solution
based on EMC® Isilon® against an IBM® Spectrum Scale™ solution. (IBM Spectrum
Scale was formerly IBM® General Parallel File System™ or IBM® GPFS™, and was also
known by the code name Elastic Storage.) For both solutions, IBM® Spectrum Protect™
(formerly IBM Tivoli® Storage Manager or IBM® TSM®) serves as the common workload,
performing backups to the target storage systems evaluated.
IBM Spectrum Protect is a data protection platform that provides a single-pane-of-glass-
style approach to data backup and recovery, helping to protect a wide range of systems,
including virtual machines, file servers, email, databases, enterprise resource planning
(ERP) systems, mainframes and desktops through a single administration interface.
Available in a range of configurations for single and multiple site installations, as well as
providing a cloud-based Disaster Recovery (DR) service, the Spectrum Protect platform
is flexible, powerful, and yet intuitive to access and manage, making it a well-suited
platform for generating consistent performance data, such as the performance data
produced and evaluated in this paper.
As the reader will see in the benchmark results section of this paper, the Spectrum Scale
solution delivered up to 11x higher throughput, while demonstrating linear scalability
as additional resources were provided to the Spectrum Protect backup infrastructure.
Compared to the EMC Isilon solution, the Spectrum Scale solution reached up to 11x the
throughput when scaled out, an advantage attributed to its network design and its use
of InfiniBand. As a result, enterprise users can meet growing scaling demands with
significantly less additional funding and infrastructure when using the Spectrum Scale
solution.
Introduction to Spectrum Scale
To better understand the IBM Spectrum Scale technology, and why this solution
performs better than competing offerings, an introduction and brief overview of
Spectrum Scale and its features is provided here.
Spectrum Scale is a proven, highly scalable and high performance solution, formerly
known as GPFS or Elastic Storage. IBM Spectrum Scale has been in the market since
1998, starting as a solution for IBM AIX systems and evolving over time as a highly
available storage solution, available for both Linux (supported version available in 2001)
and Microsoft® Windows Server (supported version available in 2008).
Spectrum Scale, a member of the IBM Software Defined Storage family, is a global file
and object-based storage solution providing high performance and storage scalability
that readily meets the demands found in enterprise storage environments.
Spectrum Scale’s architecture allows organizations to manage ever increasing data pool
sizes, as well as scaling data throughput. By utilizing a single Spectrum Scale cluster
serving multiple IBM Spectrum Protect servers, operational simplicity keeps
management overhead low.
In addition to deployment as a high performance, scalable, and reliable backup solution,
Spectrum Scale is typically integrated as the storage infrastructure for a wide range of
products and applications. Some typical use cases with which readers of this white
paper will be familiar include integration of Spectrum Scale with Hadoop (and other
scale-out frameworks), integration with High Performance Computing (HPC)
applications, and integration with High Performance Throughput Computing (HPTC)
clustering solutions.
Spectrum Scale is also available as a service on the IBM SoftLayer Cloud, offering cloud
storage for analytics, mobile and social data. Cloud storage for these workloads must be
highly scalable and elastic in order to accommodate the dynamic requirements from
users and applications.
In 2012, IBM introduced the Active File Management feature in Spectrum Scale, which
enables asynchronous access and control of both local and remote files, enabling global
file access and use. Resiliency and performance are greatly enhanced by Spectrum Scale
Native RAID, a feature that significantly reduces RAID rebuild times (in some cases
making them up to twenty times faster¹), while at the same time reducing overhead on
infrastructure in the datacenter by significantly decreasing network and disk subsystem
usage and requirements. Spectrum Scale Native RAID implements sophisticated data
placement and error correction algorithms to deliver high levels of storage reliability,
availability, and performance.
The Elastic Storage Server is a bundled Spectrum Scale software and hardware-based
solution that incorporates Spectrum Scale Native RAID, providing a scalable storage
building block that can be used by many applications (such as IBM Spectrum Protect).
Elastic Storage Server addresses a variety of client needs and use cases, ranging from
managing traditional large data sets to Hadoop cluster storage infrastructure nodes to
providing the foundation for mission critical backups. The initial version of the Elastic
Storage Server was called the GPFS Storage Server, or GSS, based on x86 technology.
This initial version was used in these evaluations.
1 http://www.ibm.com/common/ssi/cgi-bin/ssialias?subtype=WH&infotype=SA&appname=STGE_TS_DS_USEN&htmlfid=XSC03148USEN&attachment=XSC03148USEN.PDF
Test Summary Overview and Results
Edison’s review centers on two key measures: storage throughput and storage
scalability as workloads increase. Using IBM Spectrum Protect as a baseline for
consistent workload generation, Edison reviewed how each storage architecture
performed in a variety of published benchmarks.
The first set of results illustrates the smallest IBM Spectrum Scale workload compared to
the EMC Isilon configuration utilizing multiple servers and multiple client threads.
Figure 1: Throughput Measurement (in MB/s) Platform Comparison
Even with just a single server and a single thread, the smallest workload possible,
Spectrum Scale demonstrates 3x the throughput of its competitor. While the Figure 1
results are quite impressive on their own, let’s examine what happens when we scale
the Spectrum Protect workload up and gauge the results on the Spectrum Scale solution.
Figure 2: Throughput Measurement (in MB/s) Platform Comparison
In Figure 2, Edison kept the same maximum performance figures from the Isilon
platform, while increasing the Spectrum Protect load on the Spectrum Scale solution by
running multiple clients on the same single server. Spectrum Scale was faster by more
than 6x when performing backups, and more than 8x faster when restoring data than
its competition.
Impressive as these results are, they do not yet show Spectrum Scale tested with a
Spectrum Protect environment similar to the one used with the competition. Let’s
examine that now in Figure 3.
Figure 3: Throughput Measurement (in MB/s) Platform Comparison
Figure 3 clearly illustrates that when scaled out in similar fashion to its competition, the
IBM Spectrum Scale solution dominates in performance, delivering more than 11x the
throughput of its competition.
When digesting the test results, note that apart from the increased IBM Spectrum
Protect loads presented to the systems under test, the underlying infrastructure of the
storage solutions being evaluated was kept the same. With this understanding, Figure 4
shows the superior scalability of IBM Spectrum Scale.
Figure 4: Throughput Measurement (in MB/s) Platform Comparison
Figure 4 shows a clear trend: as IBM Spectrum Scale faces increased loads, throughput
scales linearly and predictably.
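One way to make "linear" scaling concrete is to compare the measured throughput at each load level with a straight-line extrapolation from the smallest configuration. The following is a minimal sketch of that calculation. The function name `scaling_efficiency` and the sample values are hypothetical placeholders for illustration only; they are not figures from the benchmarks reviewed in this paper.

```python
def scaling_efficiency(samples):
    """Return {load: efficiency} relative to the smallest load level.

    samples maps a load level (for example, the number of parallel
    sessions) to the measured throughput in MB/s. An efficiency of 1.0
    means throughput grew in exact proportion to load; values below 1.0
    indicate sub-linear scaling.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    base_load = min(samples)
    base_rate = samples[base_load]
    if base_load <= 0 or base_rate <= 0:
        raise ValueError("load levels and throughput must be positive")
    result = {}
    for load, rate in sorted(samples.items()):
        # Ideal throughput if scaling from the baseline were perfectly linear.
        ideal = base_rate * (load / base_load)
        result[load] = rate / ideal
    return result


# Hypothetical placeholder measurements (NOT taken from the report).
example = {1: 100.0, 2: 200.0, 4: 380.0}
for load, eff in scaling_efficiency(example).items():
    print(f"load={load}: efficiency={eff:.2f}")
```

Applied to real measurements, efficiencies that stay close to 1.0 across all load levels would support a claim of linear scaling.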
The performance and scalability presented thus far are impressive for both storage
platforms presented. While it is easy to make assumptions based on graphs, Edison
encourages the reader to take a deeper look into the technology presented that yielded
such impressive results, as well as to understand how these results were achieved.
Technology Review and Configuration
EMC Isilon for IBM Spectrum Protect
First to be examined are the objectives of the EMC Isilon testing. In a published blog
post² by Stefan Radtke (Field CTO, EMEA at EMC), Isilon testing was performed to
illustrate why Isilon is the right choice as a backup target for Spectrum Protect. The
stated goal of the testing was to illustrate the backup and restore throughput an end
user could expect to achieve while running Spectrum Protect on EMC Isilon. The
Spectrum Protect server database cannot be stored on EMC Isilon for performance
reasons, so it should be noted that the performance testing focused on the actual
backup and restoration capabilities of the EMC Isilon platform.
Testing prior to the Isilon deployment Edison reviewed was conducted with four
Spectrum Protect server instances running on Microsoft Windows Server 2012,
connected to a pair of NetApp arrays. Running IBM Spectrum Protect, this configuration
sustained data throughput of approximately 150 MB/s, with peak data transfer rates of
around 400 MB/s. After the storage was moved to a 3 node EMC Isilon NL400 cluster with
the same Spectrum Protect configuration, both the sustained and peak data rates nearly
doubled. The IBM Spectrum Protect configuration was then modified to use more
threads, enabling the higher test scores displayed below in the performance chart.
The published EMC Isilon test harness consisted of a three node EMC Isilon NL400
cluster. While testing was performed both to a tape library and to hard disk, this
report focuses on the disk-based testing, as Edison’s evaluation centers on throughput
on storage systems utilizing disks, not tape. Available details include the following
configuration data:
3 Isilon NL400 storage nodes
432 TB raw capacity with 260 TB usable across the 3 shelves of Isilon storage
10 Gbit Ethernet network infrastructure
Microsoft Windows Server 2012, running 4 Spectrum Protect instances
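The configuration data above supports two quick back-of-the-envelope figures: the share of raw capacity that remains usable, and the nominal ceiling of a single 10 Gbit link expressed in MB/s. The sketch below is illustrative only. The helper names are hypothetical, decimal units (1 Gbit = 1000 Mbit) are assumed, and protocol overhead is ignored, so the link figure is an upper bound rather than an expected throughput.

```python
RAW_TB = 432      # raw capacity from the published configuration
USABLE_TB = 260   # usable capacity from the published configuration
LINK_GBIT = 10    # nominal Ethernet link speed in Gbit/s


def usable_fraction(usable_tb, raw_tb):
    """Share of raw capacity left after protection and other overhead."""
    return usable_tb / raw_tb


def link_ceiling_mb_per_s(gbit_per_s):
    """Nominal link rate in MB/s (decimal units, no protocol overhead)."""
    return gbit_per_s * 1000 / 8


print(f"usable fraction: {usable_fraction(USABLE_TB, RAW_TB):.1%}")
print(f"nominal ceiling per link: {link_ceiling_mb_per_s(LINK_GBIT):.0f} MB/s")
```

Roughly 60% of the raw capacity is usable in this configuration, and a single 10 Gbit link cannot carry more than about 1250 MB/s regardless of how the storage behind it performs.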
2 http://stefanradtke.blogspot.de/2014/06/isilon-as-tsm-backup-target-analyses-of.html
Connected to the EMC Isilon system were four Spectrum Protect servers running on
Windows Server 2012 and serving an undisclosed number of Spectrum Protect backup
clients. The Spectrum Protect servers were connected via a 10 Gbit Ethernet
infrastructure. According to the author of the blog post, runtimes with Isilon were
reduced by a factor of nearly 5x compared with NetApp, from 12 hours down to 2 ½
hours. With the platform shift to EMC Isilon, the author was also able to reduce
complexity in the data backup infrastructure, in addition to gaining the benefits of
increased data throughput rates and run time reductions.
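The "nearly 5x" figure follows directly from the two runtimes quoted above. A minimal sketch of the arithmetic, with hypothetical helper names:

```python
def speedup(before_hours, after_hours):
    """How many times faster the new runtime is than the old one."""
    return before_hours / after_hours


def runtime_reduction(before_hours, after_hours):
    """Fraction of the original runtime that was eliminated."""
    return 1 - after_hours / before_hours


BEFORE_H = 12.0  # backup runtime on the previous NetApp arrays
AFTER_H = 2.5    # backup runtime after the move to EMC Isilon

print(f"speedup: {speedup(BEFORE_H, AFTER_H):.1f}x")
print(f"runtime reduction: {runtime_reduction(BEFORE_H, AFTER_H):.1%}")
```

This works out to a 4.8x speedup, or about a 79% reduction in runtime.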
Results of the author’s testing are displayed in Figure 5, and show the performance
increases the author was able to achieve on the EMC Isilon infrastructure once it was
configured for maximum IBM Spectrum Protect performance.
Figure 5: EMC Isilon on IBM Spectrum Protect – Throughput IBM Spectrum Protect Instances with Multiple Sessions
Spectrum Scale for IBM Spectrum Protect
The IBM team’s testing regimen³ consisted of several test runs performed on a pair
of IBM x3650-M4 servers and a single IBM GSS26 storage system connected via
InfiniBand. The IBM team tested:
Peak backup and restore performance with multiple parallel client backup sessions directed to a single IBM Spectrum Protect Server
Peak backup performance with multiple parallel client backup sessions directed to two IBM Spectrum Protect Servers simultaneously
Peak backup performance using a single client backup session directed to a single IBM Spectrum Protect Server
Similar to the testing performed with the EMC Isilon configuration, the two IBM servers
running the Spectrum Protect software were directly connected to the IBM GSS servers
(with Spectrum Scale software running as the storage infrastructure engine on them),
eliminating complexity and keeping the configuration easy to maintain and scale. The
connection was based on InfiniBand.
The published IBM test harness consisted of IBM GSS, running Spectrum Scale software.
A breakdown of the tested solution is as follows:
IBM GSS26 – Comprising two server nodes, configured with 348 drives in 6 de-clustered arrays.
Each de-clustered array features 1 metadata vdisk and 1 data vdisk
Single File System presented to Spectrum Protect for the Spectrum Protect database and storage pool data
RAID arrays were established in an 8+2 array configuration for file system data, and a 3-way replicated configuration for file system metadata
The underlying Operating System is Red Hat Enterprise Linux (RHEL) 6.5, installed with GSS release version 2.0 for the storage software
56 Gbps InfiniBand cross connections between the GSS26 installation and the two Spectrum Protect servers utilized for testing
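The RAID layouts and link speed listed above translate into simple ratios. The sketch below computes the capacity efficiency of an 8+2 layout and of 3-way replication, and compares the nominal 56 Gbps InfiniBand link rate with the 10 Gbit Ethernet links used in the EMC Isilon configuration. The function names are hypothetical, and the calculation is deliberately simplified: it ignores spare space, encoding and protocol overhead, so the results are upper bounds rather than measured values.

```python
from fractions import Fraction


def parity_efficiency(data_strips, parity_strips):
    """Fraction of stored blocks holding user data in an N+P layout."""
    return Fraction(data_strips, data_strips + parity_strips)


def replication_efficiency(copies):
    """Fraction of stored blocks holding unique data with N-way replication."""
    return Fraction(1, copies)


def nominal_mb_per_s(gbit_per_s):
    """Nominal link rate in MB/s (decimal units, no overhead)."""
    return gbit_per_s * 1000 / 8


IB_GBIT = 56    # InfiniBand link speed from the GSS configuration
ETH_GBIT = 10   # Ethernet link speed from the Isilon configuration

print("8+2 data efficiency:", parity_efficiency(8, 2))
print("3-way replication efficiency:", replication_efficiency(3))
print(f"InfiniBand nominal rate: {nominal_mb_per_s(IB_GBIT):.0f} MB/s")
print(f"nominal link ratio: {IB_GBIT / ETH_GBIT:.1f}x")
```

An 8+2 layout keeps four fifths of the stored blocks for data, while 3-way replication keeps one third, which is why replication is reserved here for the much smaller metadata. The nominal per-link ratio of 5.6x describes the interconnects only; it is not a measured throughput ratio between the two solutions.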
3 https://www.ibm.com/developerworks/community/blogs/storageneers/entry/scale_out_backup_with_tsm_and_gss_performance_test_results?lang=en
Connected to the IBM GSS system were two IBM x3650-M4 servers, each hosting one
Spectrum Protect server and multiple Spectrum Protect client instances. The Spectrum
Protect servers, at version 7.1, ran on Red Hat Enterprise Linux 6.5 and were configured
as Spectrum Scale nodes along with the GSS storage server.
As with the EMC Isilon testing regimen, the performance gains appear as increased
throughput as additional workload was introduced, first with additional client threads
and finally with an environment similar to the EMC Isilon workload, in which multiple
servers and clients were utilized. This is shown in Figure 6.
Figure 6: IBM Spectrum Scale on IBM Spectrum Protect – Varied Servers and Sessions
Conclusions
Enterprise backup solutions, and the storage systems that provide the necessary
infrastructure to achieve tight deadlines for backup and recovery of data, can be a
challenging subject for users trying to decide what makes the most sense when
evaluating budgets against performance and scalability.
The evaluations in this white paper illustrate the comparison of two commercially
available storage solutions in a backup and retrieval scenario, and focus on the
throughput and scalability they both bring to the table. As retained data needs grow in
size, the ability to access that data and recover from loss becomes equally critical.
As shown throughout this white paper, the IBM Spectrum Scale platform outperforms
the competition, in some cases by up to 11x in throughput. It is especially notable that
the smallest, non-scaled implementation of IBM Spectrum Protect yields better than 3x
the performance of the competition when run on IBM Spectrum Scale. Scaling from 3x
the throughput to 11x the throughput of the competing system demonstrates the clear
workload scalability advantage IBM Spectrum Scale provides.
In conclusion, the benchmarks clearly show that IBM Spectrum Scale is the right
solution for companies requiring out-of-box system performance and future-proofed
platform scalability.
Appendix A – IBM Spectrum Scale Architecture
Overview
Figure 7: Basic Spectrum Scale Architecture
A Spectrum Scale cluster can be a single node, two nodes, or thousands of nodes used for
applications such as the modeling of weather patterns. The largest existing
configurations exceed 40,000 nodes, for example the Argonne National Laboratory
supercomputer cluster⁴. Nodes in a cluster are connected via a cluster network
providing communication among the cluster nodes. The file systems configured in
Spectrum Scale represent a global name space where files are accessible on all cluster
nodes. Files in the Spectrum Scale file systems are stored on one or more storage systems
connected to the cluster nodes via a storage network. Spectrum Scale nodes have access
to the entire storage. The storage system can provide RAID technology itself, whereas in
a GSS system the Spectrum Scale software performs the RAID operations (Spectrum
Scale Native RAID) on simple JBODs (just a bunch of disks).
4 http://www.alcf.anl.gov/mira
A single Spectrum Scale cluster can be scaled up in many ways. Each component of the
cluster – server (node), network, storage – can be individually scaled to match
individual and changing requirements.
DCL12398USEN-00
Appendix B – IBM Test Reference Architecture
For visual purposes, a diagram is provided, showing a typical IBM Spectrum Protect
and Spectrum Scale deployment stack. This is the same stack referenced in the testing
referred to in this review.
Figure 8: Typical IBM Spectrum Protect and IBM Spectrum Scale Deployment Stack
The components found in the stack are as follows:
2 x Spectrum Protect Server
  IBM x3650-M4 with Red Hat Enterprise Linux Server release 6.5
  IBM Tivoli Storage Manager 7.1
1 x IBM System x GPFS Storage Server - GSS26
  6 x 4U-60 drawers, each with 58 x 2 TB NL-SAS disks
  In total 348 disks
1 x Mellanox 32 Port InfiniBand FDR switch
  Each Spectrum Protect server is connected with a 56 Gbit/s link to the GSS system
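The totals in this component list can be cross-checked with a few lines of arithmetic. The sketch below derives the disk count, the raw capacity, and the combined nominal link bandwidth of the two Spectrum Protect servers. The helper names are hypothetical, and the bandwidth figure is a nominal upper bound that ignores encoding and protocol overhead.

```python
def total_disks(drawers, disks_per_drawer):
    """Number of disks across all drawers."""
    return drawers * disks_per_drawer


def raw_capacity_tb(disks, tb_per_disk):
    """Raw capacity in TB before RAID, replication and spare space."""
    return disks * tb_per_disk


def aggregate_link_gbit(servers, gbit_per_link):
    """Combined nominal link bandwidth, assuming one link per server."""
    return servers * gbit_per_link


disks = total_disks(6, 58)            # 6 drawers with 58 disks each
raw_tb = raw_capacity_tb(disks, 2)    # 2 TB NL-SAS disks
links = aggregate_link_gbit(2, 56)    # 2 servers, one 56 Gbit/s link each

print(f"disks: {disks}")
print(f"raw capacity: {raw_tb} TB")
print(f"aggregate nominal link bandwidth: {links} Gbit/s")
```

The disk count matches the 348 disks stated in the list, giving 696 TB of raw capacity and 112 Gbit/s of combined nominal bandwidth between the two Spectrum Protect servers and the GSS system.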