  • Delivering on the I/O bandwidth promise: over 10GB/s large sequential bandwidth on IBM x3850/x3950 X5

    October 2010

    IBM System x3850/x3950 X5 Combined with High IOPS SSD PCIe Adapters Demonstrate New Levels of I/O Scalability

    By Vinay Kulkarni, IBM Systems and Technology Group


    Executive Summary

    As the amount of data collected and stored grows exponentially, so does the need for faster retrieval of that data. Clients are faced with buying more storage, which in turn increases the data center costs of space, power, cooling and management. As the data grows, queries take longer to execute because there is so much more information to search. This makes SSD technology attractive because of the higher performance and much lower latencies offered by solid-state drives in general, and even more so by the IBM High IOPS SSD PCIe Adapters [1]:

    • IBM 160GB High IOPS SS Class SSD PCIe Adapter (x4 Gen 1)

    • IBM 320GB High IOPS SS Class SSD PCIe Adapter (x4 Gen 1)

    • IBM 320GB High IOPS MS Class SSD PCIe Adapter (x4 Gen 1)

    • IBM 320GB High IOPS SD Class SSD PCIe Adapter (x8 Gen 2)

    • IBM 640GB High IOPS MLC Duo Adapter (x8 Gen 2)

    This paper demonstrates how the combination of the IBM System x3850 X5 servers and these high-IOPS adapters makes it possible for IBM to deliver storage bandwidth of more than 10 GB/s from a single x86 server. (See Figure 1.) The initial investment in these adapters can be significantly higher than other technologies (e.g., around $10,000 for a 320GB high-IOPS adapter, compared to around $200 for a 300GB Fibre Channel HDD). However, the performance of each adapter is equivalent to dozens of HDDs for workloads with large sequential bandwidth, and even hundreds of drives for small random IOPS [2]. When the savings from space, energy, management, HBAs, PDUs, JBODs, and other no-longer-needed components are factored in, the adapters reduce TCO significantly over the long run.

    Figure 1. Over 10GB/s bandwidth, measured with Iometer

    [1] These adapters are OEMed from Fusion-io and offer the compatibility assurance of IBM ServerProven® testing.

    [2] According to internal IBM testing, the typical maximum IOPS of a 600GB 15K 3.5-inch hot-swap SAS HDD is ~400 (4K random reads), while the typical IOPS of a 640GB High IOPS SSD PCIe adapter is ~200,000, or ~500x. The typical bandwidth of the same drive is ~195MBps (64K sequential reads), while the typical bandwidth of a 640GB High IOPS MLC adapter is ~1.5GBps, or ~7.7x.
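The ratios in the footnote on HDD versus adapter performance can be rechecked with a few lines of arithmetic. The following is a minimal sketch using only the approximate figures quoted there; the constant and function names are illustrative and not part of the original paper.

```python
# Approximate figures quoted in the footnote (internal IBM testing).
HDD_IOPS = 400            # 600GB 15K SAS HDD, 4K random reads
HDD_MBPS = 195            # same HDD, 64K sequential reads
ADAPTER_IOPS = 200_000    # 640GB High IOPS adapter, 4K random reads
ADAPTER_MBPS = 1_500      # 640GB High IOPS MLC adapter, ~1.5 GB/s


def hdd_equivalents(adapter_value: float, hdd_value: float) -> float:
    """Number of HDDs needed to match one adapter on a given metric."""
    return adapter_value / hdd_value


iops_ratio = hdd_equivalents(ADAPTER_IOPS, HDD_IOPS)
bandwidth_ratio = hdd_equivalents(ADAPTER_MBPS, HDD_MBPS)

print(f"Random IOPS: one adapter ~ {iops_ratio:.0f} HDDs")
print(f"Sequential bandwidth: one adapter ~ {bandwidth_ratio:.1f} HDDs")
```

The same helper can be reused with any other drive figures, for example when comparing against a different HDD model.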


    Introduction

    The IBM System x3850 X5 and x3950 X5 servers are capable of delivering much higher I/O performance than previous generations of the IBM System x server. This higher performance is the result of several different factors. For example, the x3850/x3950 X5 have eight PCIe slots. One slot comes with a standard IBM ServeRAID® controller to support the internal drives, leaving seven PCIe 2.0 slots available [3]. One factor contributing to higher performance is the PCIe Gen 2, or 2.0, standard, compared to PCIe Gen 1, or 1.1, in the previous-generation x3850 M2 server. PCIe 2.0 doubles the bandwidth per lane compared to PCIe 1.1 and is capable of driving 500 MB/s throughput per lane in each direction. In practical terms, a Gen 2 PCIe adapter can drive 1.5 GB/s throughput when plugged into a PCIe 2.0 slot, compared to 750 MB/s when plugged into a PCIe 1.1 slot.

    Two workloads were used to test I/O performance: Data Warehouse (DW) and OnLine Transaction Processing (OLTP). A Data Warehouse is a type of database designed to archive data for the purpose of reporting and analysis. Data Warehouses are typically used as the back-end data storage for Decision Support Systems (DSS). The terms DW and DSS are used interchangeably in this paper to describe the workload. A DW environment is different from an OLTP database in that it is often subjected to ad hoc queries rather than the predefined queries typical of the OLTP database.

    High-speed sequential data access is typical of the DW workload and is important for bulk data operations typically found in multimedia, data mining, and scientific applications. Minimizing I/O overhead and maximizing bandwidth can free up processor resources to process the data. Queries that are run against DW databases often scan millions of rows, whereas typical OLTP queries generally scan a handful of records. The DSS I/O pattern is mostly large sequential reads, compared to small random reads with an OLTP workload.

    Small random data access with low response times is typical of the OLTP workload and is important for small database transactions (e.g., an order entry system or the stock market). The data accessed is small and random, and users expect subsecond response times.

    This technical report documents a series of tests that demonstrate the I/O capabilities of the IBM System x3850 X5 and x3950 X5, which are ideally suited to handle large-scale OLTP and DSS workloads. For this series of tests, an x3850 X5 server with 64 logical processors was populated with seven High IOPS SSD PCIe Adapters. Each of these cards appears as two different LUNs on the server.
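The 500 MB/s per-lane figure for PCIe 2.0 follows from the signalling rate and line coding. The sketch below works through that arithmetic; it assumes the commonly cited rates of 2.5 GT/s (Gen 1) and 5 GT/s (Gen 2) with 8b/10b encoding, which are not stated in the paper itself, and the function names are illustrative. The results are theoretical link ceilings, which sit above the practical per-adapter throughput quoted in the text.

```python
# Per-lane signalling rate in GT/s (assumed from the PCIe 1.x and 2.0 generations).
GIGATRANSFERS_PER_SEC = {1: 2.5, 2: 5.0}
# 8b/10b line coding carries 8 data bits in every 10 bits on the wire.
ENCODING_EFFICIENCY = 8 / 10


def lane_bandwidth_mb_s(generation: int) -> float:
    """Theoretical payload bandwidth of one lane, one direction, in MB/s."""
    bits_per_sec = GIGATRANSFERS_PER_SEC[generation] * 1e9 * ENCODING_EFFICIENCY
    return bits_per_sec / 8 / 1e6


def link_bandwidth_mb_s(generation: int, lanes: int) -> float:
    """Theoretical bandwidth of a whole link (lanes x per-lane rate), one direction."""
    return lane_bandwidth_mb_s(generation) * lanes


# Link widths of the adapter classes listed in the Executive Summary.
for label, gen, lanes in [("x4 Gen 1", 1, 4), ("x8 Gen 2", 2, 8)]:
    print(f"{label}: {link_bandwidth_mb_s(gen, lanes):.0f} MB/s per direction")
```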

    IBM System x3850 X5 and x3950 X5

    The x3850 X5 (Figure 2) is built on eX5, the fifth generation of IBM X-Architecture® chipset technology. This new system incorporates the latest in hardware acceleration and scalability to provide configurations that push the boundaries of x86/x64-based systems while providing industry-leading flexibility. With the release of new multicore processors from Intel®, the 4-socket IBM System x3850 X5 now offers up to 32 cores, or 64 logical processors, with Hyper-Threading Technology and up to 64 DIMM slots/1TB of PC3-10600 double data rate 3 (DDR3) memory per chassis, double that of the previous-generation x3850 M2. The memory is highly available, incorporating IBM OnForever™ features such as Memory ProteXion™, Chipkill™ error correction, memory rank sparing, memory scrubbing, and memory mirroring. Other high-availability features, such as hot-swap redundant fans and power supplies, and hot-swap HDDs and solid-state drives (SSDs), deliver high levels of reliability and availability to maximize uptime. These kinds of features provide the reliability and high availability needed for the most demanding virtualization requirements.

    [3] Most models of the x3850 X5 come with a standard Emulex 10Gb FCoE/CNA card. The model used for running the tests in this paper did not have that card and therefore had seven open PCIe slots.


    Figure 2. Front view of IBM System x3850 X5

    (Using the optional 1U MAX5 memory expansion unit, the x3850 X5 has access to an additional 32 DIMM slots, for a total of 1.5TB of memory in only 5U.) Of the seven available PCIe slots, one is x16, one is x4 and the remaining five are x8. This paper will demonstrate how the I/O capabilities and RAS features of the x3850 X5 can take the performance, efficiency and reliability of a database environment to the next level.

    If this doesn’t provide enough resources, you can upgrade an x3850 X5 to an x3950 X5 simply by installing a scalability kit. The x3950 X5 can then expand to a second identical chassis, doubling the processor sockets (8), memory (2TB/128 DIMMs), I/O slots (14), and HDD/SSD bays (16). For the purposes of this paper, the term “x3850 X5” refers to both the x3850 X5 and the x3950 X5 enterprise servers.

    For more information about the x3850/x3950 X5, visit: http://ibm.com/systems/x/hardware/enterprise/x3850x5/index.html

    IBM High-IOPS SSD PCIe Adapters

    Using the same award-winning ioMemory technology as the Fusion-io ioDrive, the High-IOPS SSD PCIe Adapters double the performance and capacity available on a single device, while greatly reducing the overall energy usage [4]. Multiple High-IOPS SSD PCIe Adapters (Figure 3) combined can easily achieve gigabytes per second of bandwidth and hundreds of thousands of IOPS with a single server. The High-IOPS SSD PCIe Adapters are packed with the following features:

    • Doubles the performance of the industry-leading ioDrive

    • Integrates with servers at the system bus and kernel levels, creating a new flash memory tier

    • Is not an SSD, but easily outperforms dozens of SSDs in a single server

    • At 200,000 IOPS and 1.5GB/s, the 640GB adapter delivers the performance of thousands of hard disk drives in a single server (up to 100,000 IOPS for the 160GB/320GB adapters)

    • Accelerates applications, improves response times, and boosts efficiency

    • Reduces storage latencies and eliminates I/O bottlenecks

    • Provides from 160GB to 640GB of enterprise-grade, solid-state flash storage

    • Extremely high reliability, due to advanced wear-leveling, 11-bit ECC, and N+1 chip-level redundancy

    [4] On a performance-per-watt basis, these adapters outperform HDDs by roughly 445x or more: 97,014 IOPS / 9W = 10,779 IOPS per watt (160GB/320GB High IOPS adapters). 196,000 IOPS / 12W = 16,333 IOPS per watt (640GB High IOPS adapter). 400 IOPS / 16.5W = 24.2 IOPS per watt (600GB 15K 3.5-inch hot-swap SAS HDD). The 445x figure corresponds to the 160GB/320GB adapters: 10,779 / 24.24 = 444.7. For the 640GB adapter, 16,333 / 24.24 is approximately 674.
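The performance-per-watt arithmetic in the footnote above can be rechecked from its inputs. This is a small sketch using only the IOPS and wattage values quoted there; the dictionary layout and names are illustrative. Worked through this way, the roughly 445x ratio comes out for the 160GB/320GB adapters, while the 640GB adapter works out higher.

```python
# (IOPS, watts) pairs as quoted in the footnote.
DEVICES = {
    "160GB/320GB High IOPS adapter": (97_014, 9.0),
    "640GB High IOPS adapter": (196_000, 12.0),
    "600GB 15K SAS HDD": (400, 16.5),
}


def iops_per_watt(iops: float, watts: float) -> float:
    """IOPS delivered per watt of power drawn."""
    return iops / watts


efficiency = {name: iops_per_watt(*spec) for name, spec in DEVICES.items()}
hdd = efficiency["600GB 15K SAS HDD"]

for name, value in efficiency.items():
    print(f"{name}: {value:,.0f} IOPS/W, {value / hdd:.0f}x the HDD")
```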


    Figure 3. IBM 640GB High IOPS MLC Duo PCIe Adapter

    Test Environment

    The test environment consisted of a single x3850 X5 server populated with seven High IOPS SSD PCIe Adapters [5] (Figure 4), making it a single-box environment with no external storage. The x3850 X5 incorporated four Xeon X7560 8-core processors. The tests were performed with five 320GB High IOPS SD Class PCIe Adapters and two 640GB High IOPS MLC Duo Adapters. The operating system used was Microsoft® Windows® Server 2008 R2 x64 Enterprise Edition.

    Data Warehouse workloads are host-based environments; OLTP workloads are client-server-based environments. Data Warehouse workloads are frequently very I/O-intensive because they scan huge tables in a database. OLTP workloads involve a lot of small transactions, and their I/O pattern is small random accesses with a strict response-time requirement.

    Figure 4. Rear view of a System x3850 X5 populated with seven High IOPS SSD PCIe Adapters

    Test Methodology

    The Iometer tool from http://iometer.org was used to conduct the I/O testing. A 20GB file was used for running the tests. Two types of I/O workloads were tested. Large sequential I/Os measuring I/O bandwidth are typical of Data Warehouse (DW)/Decision Support System (DSS) workloads. Small random I/Os measuring input/output operations per second (IOPS) are typical for transactional (OLTP) workloads.

    Runs were tried with one and multiple threads per adapter. After some experimentation, two worker threads per adapter were used for large sequential I/O (DW) tests. Four worker threads per adapter were used for small random I/O (IOPS) tests. I/Os were aligned on a 4K boundary. 64 outstanding I/Os were configured for these tests. All of the High IOPS SSD PCIe Adapters were prepared before tests were run: writes of the same size and type were done before doing the reads.

    [5] The assistance of Fusion-io in providing adapters for benchmarking is gratefully acknowledged.

    Test Results

    Measuring I/O Bandwidth with Large Sequential I/Os

    Table 1 and Figure 5 show the results of running large 512K 100% sequential reads, scaling from 1 adapter to 14 adapters. (The seven High IOPS SSD PCIe Adapters used, totaling 2.88TB, appear as 14 LUNs to the operating system [6], so each “adapter” in the table is one LUN.)

    Configuration: two 640GB adapters, five 320GB adapters

    Number of adapters    I/O bandwidth (MB/s), 512K sequential reads
    ------------------    --------------------------------------------
    1                     742
    2                     1481
    3                     2236
    4                     2971
    5                     3664
    6                     4398
    7                     5094
    8                     5846
    9                     6618
    10                    7324
    11                    8046
    12                    8717
    13                    9468
    14                    10262

    Table 1. Scaling large sequential I/Os with High IOPS SSD PCIe Adapters
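One way to read Table 1 is to compare each measurement against perfect linear scaling of the single-adapter result. The sketch below does that with the table's values; the function name and the choice of the first row as the baseline are assumptions made for illustration, not part of the original test methodology.

```python
# MB/s for 1..14 adapters (LUNs), copied from Table 1.
BANDWIDTH_MB_S = [742, 1481, 2236, 2971, 3664, 4398, 5094,
                  5846, 6618, 7324, 8046, 8717, 9468, 10262]


def scaling_efficiency(measured: list[int]) -> list[float]:
    """Measured throughput divided by perfect linear scaling of the first entry."""
    baseline = measured[0]
    return [value / (baseline * n) for n, value in enumerate(measured, start=1)]


efficiency = scaling_efficiency(BANDWIDTH_MB_S)
for n, (value, eff) in enumerate(zip(BANDWIDTH_MB_S, efficiency), start=1):
    print(f"{n:2d} adapters: {value:5d} MB/s, {eff:.1%} of linear")
```

With these numbers every row stays within a few percent of the linear projection, which is what a near-straight line in a scaling chart would reflect.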

    [6] Although we used only two 640GB adapters in these tests, up to seven of them are supported, totaling 4.48TB.
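The capacity and LUN counts quoted for the test configuration can be reproduced as follows. This is a minimal sketch assuming decimal units (1TB = 1,000GB), consistent with the capacity definition in the paper's legal notes; the variable names are illustrative.

```python
ADAPTERS_TESTED_GB = [320] * 5 + [640] * 2   # five 320GB + two 640GB adapters
LUNS_PER_ADAPTER = 2                          # each card appears as two LUNs

total_tb = sum(ADAPTERS_TESTED_GB) / 1000     # capacity of the tested configuration
total_luns = len(ADAPTERS_TESTED_GB) * LUNS_PER_ADAPTER
max_tb = 7 * 640 / 1000                       # seven 640GB adapters, the supported maximum

print(f"Tested: {total_tb:.2f}TB across {total_luns} LUNs")
print(f"Maximum with seven 640GB adapters: {max_tb:.2f}TB")
```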


    Figure 5. Large sequential read test results

    Note: The CPU utilization on the server for the above tests was in the 30% range, even with seven adapters installed, sparing enough processor resources for an application to make use of the bandwidth.

    Measuring IOPS with Small Random I/Os

    Table 2 and Figure 6 show the results of running small 4K 100% random reads, scaling from one to eight adapters. (Four High IOPS SSD PCIe Adapters appear as eight LUNs to the operating system.)

    Configuration: two 640GB adapters, two 320GB adapters

    Number of adapters    K IOPS, 4K random reads
    ------------------    -----------------------
    1                     154
    2                     271
    3                     336
    4                     397
    5                     486
    6                     580
    7                     644
    8                     677

    Table 2. Scaling small random IOPS with High IOPS SSD PCIe Adapters
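To see how much each additional adapter contributes in Table 2, the differences between consecutive rows can be computed. The sketch below uses only the table's values; the helper name is illustrative.

```python
# Thousands of IOPS for 1..8 adapters (LUNs), copied from Table 2.
KIOPS = [154, 271, 336, 397, 486, 580, 644, 677]


def incremental_gain(series: list[int]) -> list[int]:
    """Extra K IOPS contributed by each added adapter (first entry is the baseline)."""
    return [series[0]] + [b - a for a, b in zip(series, series[1:])]


gains = incremental_gain(KIOPS)
for n, gain in enumerate(gains, start=1):
    print(f"adapter {n}: +{gain}K IOPS (running total {KIOPS[n - 1]}K)")
```

Unlike the large sequential results, the gains here are uneven and every later increment is smaller than the first adapter's contribution.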


    Figure 6. Small random read test results

    Note: The CPU utilization on the server with four High IOPS SSD PCIe Adapters was at 50%, which does not leave much room for an application to run and make use of the IOPS. Adding more adapters to the configuration only added to the CPU utilization and did not increase the IOPS.

    Measuring IOPS with Small Random I/Os and 70/30 Read/Write Ratio

    The data shown in Table 2 and Figure 6 were collected at a 100% read rate to show the maximum IOPS that the configuration is capable of delivering. Most typical OLTP environments have a mix of reads and writes, typically 70% reads and 30% writes. The data depicted in Table 3 and Figure 7 were collected at a read/write ratio of 70/30.

    Configuration: two 640GB adapters, two 320GB adapters

    Number of adapters    K IOPS, 4K random 70/30 read/write
    ------------------    ----------------------------------
    1                     126
    2                     192
    3                     252
    4                     315
    5                     370
    6                     434
    7                     483
    8                     537

    Table 3. Scaling small random read/write IOPS with High IOPS SSD PCIe Adapters
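Comparing Table 3 against Table 2 row by row shows how much of the read-only IOPS is retained under the 70/30 mix. The following sketch uses the two tables' values as published; the variable names are illustrative.

```python
READ_ONLY_KIOPS = [154, 271, 336, 397, 486, 580, 644, 677]   # Table 2, 100% reads
MIXED_KIOPS = [126, 192, 252, 315, 370, 434, 483, 537]       # Table 3, 70/30 read/write

# Fraction of the read-only result achieved with the mixed workload, per adapter count.
retained = [mixed / read for mixed, read in zip(MIXED_KIOPS, READ_ONLY_KIOPS)]

for n, fraction in enumerate(retained, start=1):
    print(f"{n} adapters: {fraction:.0%} of read-only IOPS")
```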


    Figure 7. Small random read/write test results

    Note: Again, doing a large number of small I/Os uses a lot of CPU cycles, because there are a lot of interrupts and Deferred Procedure Calls (DPCs) for the operating system to handle. The CPU utilization on the server with four High IOPS SSD PCIe Adapters was at 50%, not leaving much room for an application to run and make use of the IOPS. Adding more adapters to the configuration only added to the CPU utilization and did not increase the IOPS.

    Measuring I/O Bandwidth with Large Sequential I/Os at 90/10 Read/Write Ratio

    DW workloads typically have a read/write ratio of 90% reads and 10% writes. Data was collected at a 90/10 read/write ratio but wasn’t significantly different from the data collected with 100% reads, and therefore the results aren’t shown.

    Measuring the I/O Bandwidth with a Real SQL Server Workload

    The results shown in Figures 5-7 above were obtained by running tests with Iometer, which is an ideal tool for measuring pure I/O performance. What it doesn’t do is measure I/O with processor, memory and network load on the system. To measure I/O performance along with the other components of the system, the x3850 X5 system with the High IOPS SSD PCIe Adapters was configured to run a DW workload using Microsoft SQL Server 2008 R2. The example DW workload was set up to run on a 500GB database with various queries run against the database. Some of these queries were processor-intensive and others I/O-intensive, doing sequential scans through large tables. The I/O-intensive queries were the ones that showed improvement in the tested configuration.

    As you can see from the results in Figure 8, read bandwidth during the query execution hit peaks of 10 GB/s, while average bandwidth was more than 7 GB/s. The average is lower because the reads are not continuous, due to processing time between reads. This query executed in 93 seconds on this configuration vs. 264 seconds on an older x3850 M2 configuration capable of delivering 3.5 GB/s. That is nearly a 3X improvement in query response time.
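The "nearly 3X" figure can be checked directly from the quoted timings, and set beside the ratio of the two configurations' bandwidth. This is a minimal sketch; it takes the Iometer peak from Table 1 as the X5 bandwidth, which is an assumption made here for comparison rather than a number measured during the SQL Server run.

```python
X5_QUERY_SECONDS = 93      # x3850 X5 with seven High IOPS adapters
M2_QUERY_SECONDS = 264     # older x3850 M2 configuration
X5_PEAK_MB_S = 10_262      # Iometer result from Table 1 (assumed comparison point)
M2_PEAK_MB_S = 3_500       # "capable of delivering 3.5 GB/s"

query_speedup = M2_QUERY_SECONDS / X5_QUERY_SECONDS
bandwidth_ratio = X5_PEAK_MB_S / M2_PEAK_MB_S

print(f"Query response time improvement: {query_speedup:.2f}x")
print(f"Peak bandwidth ratio: {bandwidth_ratio:.2f}x")
```

The two ratios land close to each other, which is consistent with the query being bound mostly by I/O bandwidth.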


    Figure 8. SQL query test results

    Conclusion

    The IBM System x3850 X5 populated with seven High IOPS SSD PCIe Adapters can deliver more than 10,000 MB/s, or 10 GB/s, of I/O bandwidth for largely read-intensive workloads. This is nearly three times the performance delivered by the previous-generation x3850 M2. The x3850 X5 can deliver close to 700,000 IOPS with just four High IOPS SSD PCIe Adapters. Advancements in processor, I/O and flash memory technologies have made this possible.

    I/O is the slowest component in a database configuration, which means that it inhibits productivity in large enterprise organizations with a huge I/O demand. HDD access is many orders of magnitude slower than memory access. In environments with a huge database and heavy I/O requirements, a large amount of storage is needed and can be very costly. In these environments, expensive flash/SSD technology can actually deliver savings if you factor in the savings from space/infrastructure footprint, cooling requirement, management, and maintenance costs. The results presented in this paper demonstrate the benefits of running your large I/O workload on an IBM System x3850 X5 with High IOPS SSD PCIe Adapters.

    Additional Information

    For more information about the Fusion-io ioDrive Duo, visit: http://www.fusionio.com/products/iodriveduo/

    For more information about the IBM High-IOPS SSD PCIe Adapters, visit:

    http://ibm.com/common/ssi/ShowDoc.jsp?docURL=/common/ssi/rep_ca/8/897/ENUS110-128/index.html&breadCrum=DET001PT022&url=buttonpressed=DET002PT005&specific_index=DET001PEF502&DET015PGL002=DET001PEF011&submit.x=7&submit.y=8&lang=en_US#h2-tcx

    http://ibm.com/common/ssi/ShowDoc.jsp?docURL=/common/ssi/rep_ca/4/877/ENUSZG10-0154/index.html&breadCrum=DET001PT022&url=buttonpressed=DET001PT116&page=1000&paneltext1=DET001PEF011&user+type=EXT&lang=en_US&InfoType=AN&InfoSubT


    For More Information

    IBM System x Servers http://ibm.com/systems/x

    IBM BladeCenter Server and options http://ibm.com/systems/bladecenter

    IBM Systems Director Service and Support Manager http://ibm.com/support/electronic

    IBM System x and BladeCenter Power Configurator http://ibm.com/systems/bladecenter/resources/powerconfig.html

    IBM Standalone Solutions Configuration Tool http://ibm.com/systems/x/hardware/configtools.html

    IBM Configuration and Options Guide http://ibm.com/systems/x/hardware/configtools.html

    IBM ServerProven Program http://ibm.com/systems/info/x86servers/serverproven/compat/us

    Technical Support http://ibm.com/server/support

    Other Technical Support Resources http://ibm.com/systems/support

    Legal Information

    © IBM Corporation 2010 IBM Systems and Technology Group Dept. U2SA 3039 Cornwallis Road Research Triangle Park, NC 27709

    Produced in the USA October 2010 All rights reserved.

    For a copy of applicable product warranties, write to: Warranty Information, P.O. Box 12195, RTP, NC 27709, Attn: Dept. JDJA/B203. IBM makes no representation or warranty regarding third-party products or services including those designated as ServerProven® or ClusterProven®. Telephone support may be subject to additional charges. For onsite labor, IBM will attempt to diagnose and resolve the problem remotely before sending a technician.

    IBM, the IBM logo, ibm.com, Chipkill, ClusterProven, Memory ProteXion, ServerProven, System x, and X-Architecture are trademarks of IBM Corporation in the United States and/or other countries. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. For a list of additional IBM trademarks, please see http://ibm.com/legal/copytrade.shtml.

    Intel, the Intel logo, and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

    Microsoft, the Windows logo, and Windows Server are trademarks or registered trademarks of Microsoft Corporation.

    Other company, product and service names may be trademarks or service marks of others.

    IBM reserves the right to change specifications or other product information without notice. References in this publication to IBM products or services do not imply that IBM intends to make them available in all countries in which IBM operates. IBM PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you.

    This publication may contain links to third party sites that are not under the control of or maintained by IBM. Access to any such third party site is at the user's own risk and IBM is not responsible for the accuracy or reliability of any information, data, opinions, advice or statements made on these sites. IBM provides these links merely as a convenience and the inclusion of such links does not imply an endorsement.

    Information in this presentation concerning non-IBM products was obtained from the suppliers of these products, published announcement material or other publicly available sources. IBM has not tested these products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

    MB, GB and TB = 1,000,000, 1,000,000,000 and 1,000,000,000,000 bytes, respectively, when referring to storage capacity. Accessible capacity is less; up to 3GB is used in service partition. Actual storage capacity will vary based upon many factors and may be less than stated.

    Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will depend on considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.

    Maximum internal hard disk and memory capacities may require the replacement of any standard hard drives and/or memory and the population of all hard disk bays and memory slots with the largest currently supported drives available. When referring to variable speed CD-ROMs, CD-Rs, CD-RWs and DVDs, actual playback speed will vary and is often less than the maximum possible.

    XSW03083-USEN-01

