Date post: | 22-Jan-2018 |
Upload: | david-pechon |
SPINNING BROWN DONUTS
Why Storage Still Counts
Presented by:
David Pechon, Jr.
MCSA, VCP5-DCV
WHO DOES THIS GUY THINK HE IS?
Started IT career with an enlistment in the US Army in 1997 as an Information Systems Operator/Analyst. Stationed at Fort Polk, LA; Yongsan Army Garrison in Seoul, South Korea; and Fort Bragg, NC (never airborne, instead a dirty nasty leg).
Worked for a loan servicing company and three different banks in SE Louisiana, as well as consulting businesses ranging from a small MSP in New Orleans to a large systems integrator based in Denver.
Started working at Sparkhound in February 2014, specializing in virtualization, storage, messaging and identity management.
Held certifications from Microsoft, VMware, NetApp, CommVault and SyncSort (now Catalogic Software).
Married to Clare for 8 years, with two children; currently reside in Ponchatoula, LA.
Avid Chicago Cubs fan; enjoys fish, fine beers and grilling outdoors.
Fun fact: My face was on the Today show in 1991 for a full five seconds when Joe Garagiola visited my school at Fort Stewart, GA.
@davidpechon
http://linkedin.com/in/davidpechonjr
… AND WHY IS HE SO EXCITED ABOUT STORAGE?
As workloads become increasingly virtualized, storage becomes more and more of a potential bottleneck. Many technologies have been produced to reduce the impact.
The amount of data generated has grown exponentially with no signs of slowing down.
Information is an asset to any organization. It needs to be secure and available at all times.
BRIEF HISTORY OF DATA STORAGE
1725-1940s – Punch Cards
1948 – Williams Tube
Early 1950s – Drum Memory
1951 – Uniservo
1956 – IBM 350 (first HDD)
1972 – Data Cassette
1976 – Floppy disks
1983 – ST-506 (first PC HDD)
1990s – Optical Media
2000s – USB Flash
2010s – Cloud
RAID IS NOT A BACKUP
Seriously…
Anyone who thinks RAID is a backup should be swatted on the nose with a rolled up newspaper….
…and laughed at too.
RAID is used to span storage load across spindles and/or survive a disk failure. RAID will not protect against rogue admins, stupid admins, stupid users, users in general, users looking to get out of congressional hearings, viruses, Decepticons, the guys the Go-Bots fought against, vampires, fire, earthquakes, nuclear apocalypse …. well you get the idea.
STUPID IS AS STUPID DOES
One official wrote me … “there are criminal penalties for destroying federal records, which makes sense, including liability for negligence for not taking the necessary steps to protect files, including a federal requirement to backup data. This doesn’t happen. All email servers are backed up with something called ‘RAID’ (Redundant Array Of Independent Disks), and it’s nearly impossible for something to delete the files, and that even if that were to happen they would not be gone forever.”
Source: D. Giordano (June 16, 2014) Attkisson On Missing IRS Documents: If The Emails Really Are Lost, ‘That’s Quite A Story In Itself’. Retrieved from http://philadelphia.cbslocal.com/2014/06/16/attkisson-on-missing-irs-documents-if-the-emails-really-are-lost-thats-quite-a-story-in-itself/
BREAK IT DOWN … BARNEY STYLE
RAID does not protect against deletion, be it accidental or intentional.
RAID does not protect against data corruption.
Some RAID levels (RAID 0, for example) will not protect you against disk loss, and no RAID level will protect you against other catastrophic failures.
Remember Kids!! RAID is not a backup!
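To make the point concrete, here is a minimal sketch of a toy single-stripe RAID 5 set in Python. The class name `ToyRaid5` and its methods are hypothetical, invented for illustration; real controllers work on many stripes and rotate parity. It shows that XOR parity rebuilds a failed disk, while a destructive write is striped just as faithfully as good data.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (the RAID 5 parity operation)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class ToyRaid5:
    """Hypothetical toy model: N data blocks plus one parity block."""

    def __init__(self, data_blocks):
        self.write(data_blocks)

    def write(self, data_blocks):
        # Every write, including an overwrite or a wipe, recomputes parity.
        self.blocks = list(data_blocks) + [xor_blocks(data_blocks)]

    def fail_disk(self, index):
        self.blocks[index] = None  # simulate a dead spindle

    def read(self):
        # At most one missing block can be rebuilt from the survivors.
        missing = [i for i, b in enumerate(self.blocks) if b is None]
        if len(missing) > 1:
            raise IOError("more than one disk lost; the stripe is gone")
        blocks = list(self.blocks)
        if missing:
            survivors = [b for b in blocks if b is not None]
            blocks[missing[0]] = xor_blocks(survivors)
        return blocks[:-1]  # drop parity, return the data blocks


# Disk failure: the lost block is rebuilt from parity.
array = ToyRaid5([b"mail", b"logs", b"docs"])
array.fail_disk(1)
recovered = array.read()

# "Deletion": the wipe is written to every disk, parity included.
array = ToyRaid5([b"mail", b"logs", b"docs"])
array.write([b"\0\0\0\0"] * 3)
after_delete = array.read()
```

After the disk failure `recovered` still holds the original three blocks. After the wipe, `after_delete` holds only zeros, and nothing in the array can bring the old data back. That is the job of a backup.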
TRADITIONAL DATACENTER STORAGE
Why SAN is not just NAS Spelled Backwards
IN THE BEGINNING...
DIRECT ATTACHED STORAGE
Direct attached storage is basically disks attached directly to a host via a storage controller card. While performance can be great, flexibility is low, and it creates islands of storage.
STORAGE AREA NETWORK
A SAN shares virtual disks from an array to a host. In this example, a fabric is being used. Storage is presented to a host as raw block storage.
[Diagram labels: Storage Device]
NETWORK ATTACHED STORAGE
A NAS hosts files over network shares. Storage is mapped to hosts. It was created to share information between computers over standard data networks.
[Diagram labels: Storage Device; File System (CIFS, NFS, etc.); Network; Storage]
NAS PROTOCOLS
Server Message Block (SMB), a.k.a. Common Internet File System (CIFS)
• Primarily used by Windows to share files over a network. Supported by MacOS. Can be used by UNIX/Linux distributions with third party tools like Samba.
• Latest version SMB 3.0 supports hardware acceleration and multipathing.
• Hyper-V 3.0 can use SMB 3.0 shares to store VMs in a cluster. Only NAS protocol supported to store Microsoft Exchange mailbox databases on virtual disks.
Network File System (NFS)
• Developed by Sun Microsystems to share files with other Solaris systems. Primarily used by UNIX and UNIX-like operating systems. Windows Server 2012 can act as an NFS server natively.
• Latest version NFS 4.1 supports multipathing and supports parallel writes for applications like high performance computing (pNFS).
• NFS 3 is supported by all vSphere versions. NFS 4.1 is supported by ESXi 6/vSphere 6.
SAN PROTOCOLS
Fibre Channel
• Requires special FC switching and cards called Host Bus Adaptors or HBAs. Configured in a fabric configuration to minimize failure points and increase data paths.
• Developed to go beyond the SCSI limits for disk devices and tape drives.
• Lossless protocol to minimize storage latency.
• Great scalability and can traverse greater distances by use of dark fiber. In some cases up to 100 kilometers (a tad over 62 miles).
iSCSI
• Uses existing Ethernet/IP infrastructure, can use either software initiators or HBAs.
• Developed as a lower cost alternative to Fibre Channel.
• Subject to the packet loss that can occur on an IP network.
• While it can go over IP networks, it is not recommended to go over wide area networks.
BEST OF BOTH WORLDS - FCOE
FCoE switches can carry both Fibre Channel and IP networking on the same switches, reducing complexity, cabling, and devices.
Converged Network Adaptors replace specific HBAs and can also carry IP and Fibre Channel protocols.
Popular in converged architecture sets such as FlexPod, Vblock, ActiveSystem, CloudMatrix, etc.
Uses the same architecture and networking practices as Fibre Channel. Ethernet replaces the physical layer.
SO WHAT'S THE DIFFERENCE?
• You would need a SAN if…
  • You need lower latency disk access over a lossless protocol.
  • You are using transaction-intensive systems like database management applications, enterprise resource planning, or email systems like Exchange Server.
  • You want to eliminate single points of failure by use of a fabric network and multiple data paths.
  • You need to traverse over a campus or even a metro area over fiber.
• You would need a NAS if…
  • You need lower administrative overhead without the need for special network configuration outside of setting up a VLAN or two.
  • You want to share files directly to users from the array, eliminating traditional file servers.
  • You want to cluster storage to scale out performance, not just capacity (scale-out NAS).
WHAT IF I WANT BOTH?
Unified storage systems can host both SAN and NAS protocols from the same array, simplifying management and allowing more flexibility.
NAS gateways are systems that use a LUN from a SAN to host file protocols. These can be systems that are built for that purpose or a general purpose operating system running on a server.
BEYOND ARRAYS
Software Defined, Hybrid, All-Flash, and Convergence
HYBRID ARRAYS
Hybrid arrays combine the use of traditional magnetic disk and solid state.
The idea came from the method of storage tiering, where blocks of “hot” data are moved to faster disks.
While effective for a while, it was basically trying to squeeze blood from a stone. Performance was still limited by mechanical disk speed and by the scheduling of when blocks were deemed hot or cold and moved between tiers.
[Diagram: blocks labeled HOT, WARM and COLD spread across three tiers: 15k RPM SAS RAID 10; 10k RPM SAS RAID 6; 7.2k RPM SATA RAID 6]
In most hybrid arrays, hot data is cached in SSDs or PCI flash in the storage array. Some arrays will use DRAM as a cache level. Data isn’t moved; instead, the array uses metadata to point reads to the cache. Reads served this way are known as “cache hits.”
Some hybrid arrays have the ability to use SSDs as a write cache, to ingest large amounts of data quickly, then move it to slower storage.
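The read-cache behavior can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's implementation: the class name `CachedArray` is hypothetical, and least-recently-used eviction is an assumed policy that the slides do not specify. Note that the data never leaves the slow tier; the cache only holds copies and a lookup table.

```python
from collections import OrderedDict


class CachedArray:
    """Hypothetical hybrid array: slow backing store plus a small flash read cache."""

    def __init__(self, backing, cache_slots):
        self.backing = backing        # block address -> data (spinning disk tier)
        self.cache = OrderedDict()    # block address -> cached copy (flash tier)
        self.cache_slots = cache_slots
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.cache:
            # Cache hit: served from flash, no disk seek needed.
            self.hits += 1
            self.cache.move_to_end(address)  # mark as most recently used
            return self.cache[address]
        # Cache miss: read from disk, then keep a copy in flash.
        self.misses += 1
        data = self.backing[address]
        self.cache[address] = data
        if len(self.cache) > self.cache_slots:
            self.cache.popitem(last=False)   # evict least recently used (assumed policy)
        return data


backing = {n: f"block-{n}" for n in range(100)}
array = CachedArray(backing, cache_slots=4)
for address in [1, 2, 1, 1, 3, 2, 50, 1]:
    array.read(address)
hit_rate = array.hits / (array.hits + array.misses)
```

In this access pattern half of the reads are hits, because blocks 1 and 2 are "hot". The more skewed the workload, the more a small cache pays off.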
[Diagram labels: STORAGE ARRAY with SSD; SERVER with SSD]
ALL FLASH ARRAYS
It's an array with all flash drives….
…duh.
On a serious note, what sets vendors apart are features.
Violin Memory is an example of one such array that doesn’t do any special space efficiency, but makes very dense solid state arrays.
Pure Storage sacrifices some raw performance for space efficiencies such as deduplication and compression.
Some traditional storage vendors, like HP and NetApp, added all-flash support to their existing storage arrays.
CONVERGED SYSTEMS
VMware brought virtualization to commodity x86 computing, bringing the benefits of mainframes to less expensive hardware.
Fibre Channel over Ethernet allowed datacenters to reduce the number of networking devices in the datacenter.
Cisco UCS platforms decoupled various hardware settings from systems, allowing you to move WWNs, MAC addresses, BIOS settings, etc. to a new node, either hot or cold.
Unified storage systems such as NetApp FAS and EMC VNX allowed for all storage protocols under one system.
This led to the concept of converged systems, where compute, network, storage and hypervisor systems were combined under a validated model, giving the customer one number to call for support, also known as “one throat to choke.”
SOFTWARE DEFINED STORAGE
Software Defined Storage is the ability to get the features of a storage array in a virtual appliance rather than dedicated hardware, or to run a storage OS on your own hardware.
This has given birth to two disruptive technologies…
HYPERCONVERGED
Hyperconverged systems cluster the local storage on virtual hosts by using a storage VM or by the hypervisor itself.
[Diagram: three hypervisor hosts, each with a VM sending I/O through a storage VM and SCSI controller to local disks (two SSD, four SATA); the three hosts together form a virtual storage cluster]
This technology has proven to be excellent for applications that linearly scale such as big data and virtual desktop infrastructure.
CLOUD STORAGE
Cloud storage can be in the form of user accessible storage such as OneDrive or Dropbox.
It can be a cold data tier, as used by Microsoft StorSimple
It can be used as a replication or backup target, similar to NetApp Cloud ONTAP.
Or it can replicate your entire infrastructure for disaster recovery with services like vCloud Air or Hyper-V/Azure replication.
FLASH! ….ahhh ahhhh….savior of the universe datacenter!
TYPES OF SOLID STATE DRIVES
Single level cell flash or SLC NAND* memory stores one bit per cell. It can endure more writes than any other flash memory available and is usually the most expensive.
Multi-level cell or MLC flash can do two to three bits per cell but has a shorter lifespan than SLC.
Enterprise MLC or eMLC will consist of chips of higher quality, much like how enterprise drives are more reliable than consumer grade. They cost more than consumer MLC SSD drives but less than SLC SSDs.
*NAND refers to a transistor logic gate that negates the AND operation. NOR logic gates, which negate the OR operation, are used in some SLC flash.
ALL FLASH OR HYBRID?
It all depends!
Use metrics such as $/GB or $/IOPS to determine cost.
Do you need sub millisecond latency?
Do you want the benefits of flash with a cost somewhat similar to disk?
Not all workloads need all flash arrays.
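A quick way to compare quotes on those two metrics is sketched below in Python. The prices, capacities and IOPS figures are made-up placeholders, not real vendor numbers, and the function name `cost_metrics` is hypothetical. Plug in the numbers from your own quotes and bake-off results.

```python
def cost_metrics(price_usd, usable_gb, iops):
    """Return cost per usable gigabyte and cost per IOPS for one quote."""
    return {
        "usd_per_gb": price_usd / usable_gb,
        "usd_per_iops": price_usd / iops,
    }


# Hypothetical quotes, for illustration only.
hybrid = cost_metrics(price_usd=60_000, usable_gb=40_000, iops=30_000)
all_flash = cost_metrics(price_usd=120_000, usable_gb=20_000, iops=300_000)

for name, metrics in (("hybrid", hybrid), ("all flash", all_flash)):
    print(f"{name}: ${metrics['usd_per_gb']:.2f}/GB, ${metrics['usd_per_iops']:.2f}/IOPS")
```

With these placeholder figures the hybrid array wins on $/GB and the all flash array wins on $/IOPS, which is why the answer depends on whether the workload is capacity-bound or performance-bound.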
BUZZWORDS AND MARKETING
Cutting through the BS
BIG DATA
Big Data is basically taking petabytes of unstructured and structured data and turning it into something useful.
Storage frameworks like Hadoop make this possible.
Hadoop requires an array of nodes that are usually only needed on demand.
Amazon EMR and Azure HDInsight are cloud services specifically for provisioning Hadoop clusters in seconds, and you only pay while a workload is running.
HERO NUMBERS
IOPS stands for Input/Output Operations Per Second. Most numbers quoted are based on a 4 kilobyte block size and sequential reads, which most drives and arrays can perform quite well. Applications like SQL Server will typically require a 64k block size. Readjusted for 64k sequential reads, this figure would be 12,500 IOPS.
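The readjustment is simple arithmetic if you assume throughput stays constant when the block size changes (a simplification; real arrays will not scale this cleanly). The sketch below uses a hypothetical helper, `iops_at_block_size`. The 200,000 IOPS starting point is inferred by working backwards from the 12,500 figure and is not stated in the text.

```python
def iops_at_block_size(quoted_iops, quoted_block_kb, target_block_kb):
    """Rescale a quoted IOPS figure to another block size.

    Assumes the array moves the same number of kilobytes per second
    regardless of block size, which is an idealization.
    """
    throughput_kb_per_s = quoted_iops * quoted_block_kb
    return throughput_kb_per_s / target_block_kb


# Inferred example: a 4k hero number rescaled to 64k blocks.
adjusted = iops_at_block_size(quoted_iops=200_000, quoted_block_kb=4, target_block_kb=64)
print(f"{adjusted:,.0f} IOPS at 64k")
```

A sixteen-fold larger block means one sixteenth of the operations for the same data moved, so always ask which block size and read/write mix a quoted number was measured with.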
Deduplication is the method of removing redundant blocks of storage to save space. Metadata is used to reconstruct data from the deduplicated blocks. The savings depend entirely on how much redundant data is being stored. If you have 10 desktops with nothing else installed, you pretty much have the same bits written 10 times, hence the 10:1 dedupe ratio. Databases, for instance, may only see 10-30% space savings with deduplication.
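The 10:1 desktop case can be reproduced with a minimal block-hashing sketch in Python. This assumes fixed-size 4 KB blocks and SHA-256 fingerprints, which are illustrative choices rather than how any particular array does it; the function name `dedupe_ratio` is hypothetical.

```python
import hashlib


def dedupe_ratio(volumes, block_size=4096):
    """Return logical blocks written divided by unique blocks stored."""
    unique_fingerprints = set()
    total_blocks = 0
    for data in volumes:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            unique_fingerprints.add(hashlib.sha256(block).digest())
            total_blocks += 1
    return total_blocks / len(unique_fingerprints)


# A stand-in "desktop image" made of 8 distinct 4 KB blocks.
desktop_image = b"".join(bytes([i]) * 4096 for i in range(8))

# Ten identical desktops: the same bits written ten times.
ratio = dedupe_ratio([desktop_image] * 10)
print(f"{ratio:.0f}:1")
```

Feed the same function data with little repetition, as a database would have, and the ratio drops toward 1:1. That is why a quoted dedupe ratio only means something for a workload that looks like yours.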
NL-SAS or Nearline SAS are SATA drives that can use SAS backplanes. They’re no faster per spindle than SATA drives.
I can imagine Paul Thurrott saying something like that.
IN CONCLUSION
Sizing storage properly can make or break your line-of-business applications. A lower cost hybrid array may be sufficient instead of all flash. You may want to consider cloud storage over an on-premises array for cold storage.
Never let a vendor tell you how you should run your systems on their storage. A good storage consultant should be able to size an environment based not only on your applications but on the entire lifecycle of the appliance.
Always ask for a “bake off”, meaning you can test your workloads on their gear before signing a purchase order.
Be wary of “hero numbers”; again, use a bake off to get a much better picture of how their system will work for you.