
Spectrum Scale Unified File and Object with WAN Caching

Date post: 13-Jan-2017
Upload: sandeep-patil
#ibmedge © 2016 IBM Corporation. Analytics with File and Object Access Plus Geographically Distributed Data. Sandeep Patil, STSM, Spectrum Scale; Trishali Nayar, AFM Development, Spectrum Scale; Smita Raut, Object Development, Spectrum Scale. 22 Sep 2016. Acknowledgement: Bill Owen, Dean Hildebrand, Sanjay Gandhi, Brian Nelson, Tomonori Kubota, Gyoh Ohsawa
Transcript
Page 1: Spectrum Scale Unified File and Object with WAN Caching

#ibmedge © 2016 IBM Corporation

Software Defined Analytics with File and Object Access Plus Geographically Distributed Data

Sandeep Patil, STSM, Spectrum Scale

Trishali Nayar, AFM Development, Spectrum Scale

Smita Raut, Object Development, Spectrum Scale

22 Sep 2016

Acknowledgement: Bill Owen, Dean Hildebrand, Sanjay Gandhi, Brian Nelson, Tomonori Kubota, Gyoh Ohsawa

Page 2: Spectrum Scale Unified File and Object with WAN Caching

Agenda

• Introduction to Spectrum Scale Active File Management (AFM)
• AFM Use Cases
• Spectrum Scale Protocol
• Unified File & Object Access (UFO) Feature Details
• AFM + Object: Unique WAN Caching for Object Store
• Deep Dive on Single Site & Multi-site Caching
• Configuration Commands with Demo
• Q & A

Page 3: Spectrum Scale Unified File and Object with WAN Caching


Spectrum Scale Active File Management (AFM)

Page 4: Spectrum Scale Unified File and Object with WAN Caching

Spectrum Scale – The Complete Data Management Solution

Page 5: Spectrum Scale Unified File and Object with WAN Caching

AFM Overview

• Active file management (AFM) uses a home-and-cache model in which a single home provides the primary storage of data, and exported data is cached in a local GPFS™ file system
• AFM is primarily suited for remote caching
• Users access files from the cache system
  – For read requests, when the file is not yet cached, AFM retrieves the file from the home site
  – For write requests, writes are allowed on the cache system and can be pushed back to the home system, depending on the cache type
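The home-and-cache model described on this slide can be illustrated with a small in-memory sketch. This is not AFM code: the `Home` and `Cache` classes, the `writes_pushed_home` flag and the file paths are all hypothetical names chosen for illustration. It only shows the two behaviours named above: a read miss is served from home and then kept locally, and a write lands in the cache and may be pushed back to home depending on the cache type.

```python
class Home:
    """Hypothetical stand-in for the home site: the primary copy of every file."""

    def __init__(self, files):
        self.files = dict(files)


class Cache:
    """Hypothetical stand-in for a cache fileset backed by one home site."""

    def __init__(self, home, writes_pushed_home=True):
        self.home = home
        self.local = {}                      # files currently held in the cache
        self.writes_pushed_home = writes_pushed_home
        self.fetches = 0                     # number of reads served from home

    def read(self, path):
        # Read miss: fetch the file from home once, then serve it locally.
        if path not in self.local:
            self.local[path] = self.home.files[path]
            self.fetches += 1
        return self.local[path]

    def write(self, path, data):
        # Writes always land in the cache; whether they reach home
        # depends on the (assumed) cache type flag.
        self.local[path] = data
        if self.writes_pushed_home:
            self.home.files[path] = data


home = Home({"/data/a.txt": b"hello"})
cache = Cache(home)
first = cache.read("/data/a.txt")    # miss: fetched from home
second = cache.read("/data/a.txt")   # hit: served from the cache
cache.write("/data/b.txt", b"new")   # pushed back to home

local_only = Cache(home, writes_pushed_home=False)
local_only.write("/data/c.txt", b"scratch")   # stays in the cache only
```

In this sketch the push to home happens synchronously inside `write`; that is a simplification made to keep the example short.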

Page 6: Spectrum Scale Unified File and Object with WAN Caching

AFM Caching Overview

[Figure: a home cluster and a cache cluster, each a Spectrum Scale cluster made up of storage nodes attached to a storage array.]

• Home cluster: nodes are made NFS servers
• Cache cluster: a few nodes are made gateway nodes
• Cache filesets are associated with the NFS export at home

Page 7: Spectrum Scale Unified File and Object with WAN Caching

Global Sharing with Spectrum Scale AFM

• Expands the GPFS global namespace across geographical distances
  – Caches local ‘copies’ of data distributed to one or more GPFS clusters
  – Low latency ‘local’ read and write performance
  – Automated namespace management
  – As data is written or modified at one location, all other locations see that same data
• Efficient data transfers over wide area network (WAN)
  – Works with unreliable, high latency connections
• Speeds data access to collaborators and resources around the world

[Figure: three sites, each labeled GPFS.]

Page 8: Spectrum Scale Unified File and Object with WAN Caching

AFM Caching Basics

• Sites – two sides for a cache relationship
• A single home cluster
  – Presents a fileset that can be cached (export with NFS)
  – Can be non-GPFS cluster/nodes
• One or more cache clusters
  – Associates a local fileset with the home export
• AFM Fileset
  – Independent fileset with per-inode state in xattrs
  – Data is fetched into the fileset on access (or prefetched on command)
  – Data written to the fileset is copied back to home
• Gateway Node (designation)
  – Maintains an in-memory queue of pending operations
  – Moves data between the cache and home clusters
  – Monitors connectivity to home, switches to disconnected mode on outage, triggers recovery on failure
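The gateway node role listed above (an in-memory queue of pending operations, a disconnected mode, and a catch-up once home is reachable again) can be sketched as follows. This is a toy model under stated assumptions, not the AFM implementation: the `Gateway` class, its method names and the `"write"`/`"remove"` operation tags are hypothetical, and home is represented by a plain dictionary.

```python
from collections import deque


class Gateway:
    """Hypothetical gateway: queues operations and replays them to home."""

    def __init__(self, home):
        self.home = home            # dict standing in for the home export
        self.queue = deque()        # in-memory queue of pending operations
        self.connected = True

    def enqueue(self, op, path, data=None):
        self.queue.append((op, path, data))
        if self.connected:
            self.flush()

    def flush(self):
        # Apply queued operations to home in the order they were issued.
        while self.queue:
            op, path, data = self.queue.popleft()
            if op == "write":
                self.home[path] = data
            elif op == "remove":
                self.home.pop(path, None)

    def set_connected(self, up):
        # Disconnected mode: operations simply accumulate in the queue.
        # On reconnect the backlog is replayed.
        self.connected = up
        if up:
            self.flush()


home = {}
gw = Gateway(home)
gw.enqueue("write", "/f1", b"v1")          # applied immediately
gw.set_connected(False)                    # outage: switch to disconnected mode
gw.enqueue("write", "/f2", b"v2")
gw.enqueue("remove", "/f1")
pending_during_outage = len(gw.queue)
home_during_outage = dict(home)
gw.set_connected(True)                     # home is back: replay the backlog
```

Because the queue here lives only in memory, losing the gateway would lose the backlog; the slide's "triggers recovery on failure" refers to a separate mechanism that this sketch does not model.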

Page 9: Spectrum Scale Unified File and Object with WAN Caching

Spectrum Scale AFM Use Cases

Global Namespace
• Provides common name space across globally distributed cloud
• Persistent scalable cache for remote file system

Content Distribution
• Central site is where data is created, maintained
• Branch/edge sites can periodically pre-fetch or pull on demand

Content Consolidation
• Branch offices work on local active data
• Master repository maintained centrally
• Advanced functions (backup etc.) on central site

Disaster Recovery
• Replication of data across WAN with consistency points
• Failover and failback support

Page 10: Spectrum Scale Unified File and Object with WAN Caching


Spectrum Scale Protocol

Page 11: Spectrum Scale Unified File and Object with WAN Caching

Enhanced Protocol Support from 4.1.1 release

The Challenge: How can I share my storage infrastructure across all of my legacy and new generation applications?

The Solution:

• The new IBM Spectrum Scale Protocol Node allows access to data stored in a Spectrum Scale filesystem, using additional access methods and protocols.
• The Protocol Node functions are clustered and can support transparent failover for NFS and Swift protocols as well as SMB protocols.
• Multiprotocol data access from other systems using the following protocols
  – NFS v3 and v4
  – SMB 2 and SMB 3.0 mandatory features / CIFS for Windows support
  – OpenStack Swift and S3 API support for object storage

Page 12: Spectrum Scale Unified File and Object with WAN Caching

Adding Protocol Support

[Figure: IBM Spectrum Scale cluster. Labels: Administrator (Command Line Interface); Users (NFS, SMB/CIFS, POSIX, OpenStack Swift); External TCP/IP or IB Network; Protocol Nodes PN1, PN2 ... PNn; TCP/IP or IB Network; Network Shared Disks NSD1, NSD2 ... NSDn; Physical Storage (Flash, Disk, Tape); Mgmt Nodes; Authentication Services (Keystone); OpenStack Cinder; Spectrum Scale Cluster Nodes; Elastic Storage Server.]

Page 13: Spectrum Scale Unified File and Object with WAN Caching

IBM Spectrum Scale Benefits

Better performance
• Eliminate hotspots with massively parallel access to files
• Sequential I/O with ES greater than 400 GB/s
• Throughput advantage for parallel streaming workloads, e.g. Tech Computing and Analytics

More Storage. More Files. Hyper Scale.

Simplified Management
• Easier management with one global namespace instead of managing islands of NAS arrays, e.g. no need to copy data between compute clusters
• Integrated policy driven automation
• Fewer storage administrators required

Lower Cost
• Optimizes storage tiers including flash, disk and tape
• Increased efficiency and more efficient provisioning due to parallelization and striping technology
• Remove duplicate copies of data, e.g. run analytics on one copy of data without having to set up a separate silo

Page 14: Spectrum Scale Unified File and Object with WAN Caching

IBM Spectrum Scale – Protocol Integration

• Software Offering - protocol support is added to GPFS
  – Can be configured on existing GPFS clusters or new cluster
  – Support for Intel and Power Systems
  – RHEL 7/7.1
      – Protocol node requirement
      – Remaining GPFS nodes can have any supported environment/platform
  – Use of installation”) also limited to RHEL 7/7.1
• Add support for the following protocols
  – SMB
  – NFS
  – Object (HTTP REST)
• Some cluster nodes are designated as “Protocol Nodes” (aka CES nodes)
  – Integrated management of the protocol services
  – Active-Active clustering
  – High availability through IP fail-over
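To make "high availability through IP fail-over" concrete, here is a minimal sketch of the idea: export IPs are spread over the protocol nodes, and when a node fails its IPs move to the survivors so clients keep using the same addresses. The function names, the round-robin placement and the least-loaded reassignment rule are assumptions made for illustration; the actual CES balancing and failback policies are not described in these slides.

```python
def distribute_ips(ips, nodes):
    """Spread export IPs over the given protocol nodes, round-robin."""
    if not nodes:
        raise ValueError("no healthy protocol nodes")
    assignment = {node: [] for node in nodes}
    for i, ip in enumerate(ips):
        assignment[nodes[i % len(nodes)]].append(ip)
    return assignment


def fail_over(assignment, failed_node):
    """Move the failed node's IPs to the surviving node with the fewest IPs."""
    survivors = {n: list(v) for n, v in assignment.items() if n != failed_node}
    if not survivors:
        raise ValueError("no surviving protocol nodes")
    for ip in assignment.get(failed_node, []):
        # Ties are broken by node name so the result is deterministic.
        target = min(survivors, key=lambda n: (len(survivors[n]), n))
        survivors[target].append(ip)
    return survivors


ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
before = distribute_ips(ips, ["pn1", "pn2", "pn3"])
after = fail_over(before, "pn1")
```

After the failover every export IP is still hosted somewhere, which is the property clients depend on; how evenly the load ends up spread is a policy decision.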

Page 15: Spectrum Scale Unified File and Object with WAN Caching


IBM Spectrum Scale – Protocol Support

Page 16: Spectrum Scale Unified File and Object with WAN Caching

Protocol Support Considerations

• Adding Protocol Nodes to GPFS Cluster:
  – All RHEL7/xServers or All RHEL7/pServers
  – Not NSD Servers
  – Protocol Export IPs distributed among the protocol nodes, with different policies for balancing and failback
• Management: GUI and CLI
• Deployment: Easy Automated Deployment
• Flexibility: customer choice of nodes/disks/storage options
• Scale: limits for capacity/performance based on GPFS; CES node limits based on protocols enabled
  – 16 nodes, 3000 connections/node and 20K connections/cluster for SMB
  – 32 nodes for only NFS or only Object or NFS+Object
• Security: root access is needed for cluster management, but sudo access is supported
• Roll your own or combine with Lab Services to meet expectations
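The CES limits quoted on this slide interact: with SMB enabled, the per-node limit (3000 connections) and the cluster-wide limit (20K connections) both apply, so from 7 nodes upward (7 × 3000 = 21,000) the cluster limit is the binding one. A short sketch of that arithmetic follows. The constants are taken from the slide; the function names are hypothetical, and treating any configuration that includes SMB as subject to the 16-node limit is an assumption.

```python
SMB_MAX_NODES = 16
SMB_MAX_CONN_PER_NODE = 3000
SMB_MAX_CONN_PER_CLUSTER = 20_000
NON_SMB_MAX_NODES = 32   # only NFS, only Object, or NFS+Object


def max_ces_nodes(protocols):
    """Node limit for a set of enabled protocols (assumption: SMB dominates)."""
    return SMB_MAX_NODES if "smb" in protocols else NON_SMB_MAX_NODES


def smb_connection_capacity(nodes):
    """Upper bound on SMB connections for a given number of CES nodes."""
    if not 1 <= nodes <= SMB_MAX_NODES:
        raise ValueError(f"SMB supports 1 to {SMB_MAX_NODES} CES nodes")
    return min(nodes * SMB_MAX_CONN_PER_NODE, SMB_MAX_CONN_PER_CLUSTER)


capacity_4_nodes = smb_connection_capacity(4)     # per-node limit binds
capacity_7_nodes = smb_connection_capacity(7)     # cluster limit binds
capacity_16_nodes = smb_connection_capacity(16)
```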

Page 17: Spectrum Scale Unified File and Object with WAN Caching


Spectrum Scale Object (Part of Spectrum Scale Protocol)

Page 18: Spectrum Scale Unified File and Object with WAN Caching

Spectrum Scale Object Storage

• Basic support added in 4.1.1 release & enhanced in 4.2 and 4.2.1 release
• Based on OpenStack Swift (Juno Release)
• REST-based data access
  – Growing number of clients due to extremely simple protocol
  – Applications can easily save & access data from anywhere using HTTP
  – Simple set of atomic operations:
      – PUT (upload)
      – POST (update metadata)
      – GET (download)
      – DELETE
• Amazon S3 Protocol support
• High Availability with CES Integration
• Simple and Automated Installation Process
• Integrated authentication (Keystone) support
• Native GPFS Command Line Interface to manage Object service (mmobj command)
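The four REST operations listed on this slide map directly onto HTTP requests against a Swift-style object URL of the form `/v1/{account}/{container}/{object}`. The sketch below only builds the requests with Python's standard library and does not send them. The endpoint, account, container, object name and token are placeholder values, and `swift_request` is a hypothetical helper, not part of any Swift client library.

```python
from urllib.parse import quote
from urllib.request import Request


def swift_request(method, endpoint, account, container, obj, token,
                  data=None, metadata=None):
    """Build (but do not send) a Swift-style object request."""
    url = f"{endpoint}/v1/{quote(account)}/{quote(container)}/{quote(obj)}"
    headers = {"X-Auth-Token": token}
    # Custom object metadata travels in X-Object-Meta-* headers.
    for key, value in (metadata or {}).items():
        headers[f"X-Object-Meta-{key}"] = value
    return Request(url, data=data, headers=headers, method=method)


# Placeholder values for illustration only.
common = dict(endpoint="https://swift.example.com:8080", account="AUTH_demo",
              container="photos", obj="cat.jpg", token="example-token")

put = swift_request("PUT", data=b"...image bytes...", **common)       # upload
post = swift_request("POST", metadata={"Color": "black"}, **common)   # update metadata
get = swift_request("GET", **common)                                  # download
delete = swift_request("DELETE", **common)                            # delete
```

Passing any of these to `urllib.request.urlopen` would issue the call against a real endpoint; a token would normally be obtained from Keystone first.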

Page 19: Spectrum Scale Unified File and Object with WAN Caching

Spectrum Scale Object Storage – Additional Features

• Unified file and object support with Hadoop connectors
• Support for Encryption
• Support for Compression
• Only Object Store with Tape support for Backup
• Object store with integrated transparent cloud tiering support
• Multi Region support
• AD/LDAP support for authentication
• ILM support for Object
• Movement of Object across storage tiers based on access heat
• Spectrum Scale Object with IBM DeepFlash becomes object store over all flash array for newer faster workloads
• Spectrum Scale Object with WAN caching support (AFM)

Page 20: Spectrum Scale Unified File and Object with WAN Caching


IBM Spectrum Scale: Unified File and Object Access Feature Overview

Page 21: Spectrum Scale Unified File and Object with WAN Caching

Unified File and Object (UFO Support)

Spectrum Scale: Redefining Unified Storage

• Challenge: The world is not converged/file/object/HDFS today! and never will be completely…
• Unified Scale-out Content Repository
  – File or object in. Object or file out.
  – Integrated big data analytics support
  – Native protocol support
  – High-performance that scales
  – Single Management Plane

[Figure: Spectrum Scale offering NFS, SMB, POSIX, Swift/S3 and HDFS access on top of SSD, fast disk, slow disk and tape.]

Page 22: Spectrum Scale Unified File and Object with WAN Caching

What is Unified File and Object Access?

• Accessing objects using file interfaces (SMB/NFS/POSIX) and accessing files using object interfaces (REST) helps legacy applications designed for file to seamlessly start integrating into the object world.
• It allows object data to be accessed using applications designed to process files. It allows file data to be published as objects.
• Multi-protocol access for file and object in the same namespace (with common User ID management capability) allows supporting and hosting data oceans of different types of data with multiple access options.
• Optimizes various use cases and solution architectures, resulting in better efficiency as well as cost savings.

[Figure: a container in a clustered file system, reached through Object (HTTP) via Swift (with Swift on File) and through NFS/SMB/POSIX. Numbered flows 1 to 4 include: data ingested as objects, data ingested as files, files accessed as objects.]
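One way to picture unified access is as a two-way mapping between an object name and a path in the file system: an object in a container appears as a file under a directory for that container, and a file placed there can be addressed as an object. The sketch below uses a simplified `<fileset root>/<account>/<container>/<object>` layout, which is an assumption for illustration; the real on-disk layout may include further path components, and the function names and example paths are hypothetical.

```python
from pathlib import PurePosixPath


def object_to_file_path(fileset_root, account, container, obj):
    """Path at which an object would be visible through NFS/SMB/POSIX."""
    return PurePosixPath(fileset_root) / account / container / obj


def file_path_to_object(fileset_root, path):
    """Object name under which a file would be visible through REST."""
    rel = PurePosixPath(path).relative_to(fileset_root)
    account, container, *rest = rel.parts
    if not rest:
        raise ValueError("path does not point below a container directory")
    return account, container, "/".join(rest)


root = "/gpfs/fs0/ufo"   # placeholder fileset path
as_file = object_to_file_path(root, "AUTH_demo", "logs", "2016/09/app.log")
as_object = file_path_to_object(root, "/gpfs/fs0/ufo/AUTH_demo/logs/2016/09/app.log")
```

The round trip is lossless in this simplified layout, which is what lets the same data be ingested through one interface and read through the other.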

Page 23: Spectrum Scale Unified File and Object with WAN Caching


IBM Spectrum Scale: AFM + Object (Unique Proposition)

Page 24: Spectrum Scale Unified File and Object with WAN Caching

The Need: Thin-Thick storage capacity site deployments for Object Data

[Figure: Sites 1, 2 and 3, each running applications on object data with limited storage, connected to a central site with unlimited storage that provides centralized analytics and centralized backup.]

• Geo-dispersed multiple sites with limited storage capacity
• Independent applications running on each site, accessing/generating object data
• Centralized home for consolidated object data – ability to grow storage capacity
  – Centralized backup for all sites via central location
  – Ability to run analytics for all sites in central location

Page 25: Spectrum Scale Unified File and Object with WAN Caching

Use Case Requirements

• There is an object store site that is closer to the end application but has limited storage capacity.
• To cater to the large storage capacity requirement, there is another object store set up at a geographically remote site, which has unlimited or expandable storage capacity and acts as a central archive.
• The relationship between these two object stores needs to be set up in such a way that applications can access all object data from the site closer to them for faster access, even though it has limited storage capacity.
• The central site should have the ability to do in-place analytics of data.
• The central site should have the ability to back up the data.
• If the cache goes down, the application should be able to fail over to the central site.

Page 26: Spectrum Scale Unified File and Object with WAN Caching

The Solution: Unique WAN caching for Object Store - available only with Spectrum Scale

[Figure: applications at Site 1 (limited storage) work on object data in a Spectrum Scale cluster with protocol nodes (object enabled). A Spectrum Scale AFM (IW) relationship, with cache eviction enabled on Site 1, links it to the central site (unlimited storage), also a Spectrum Scale cluster with protocol nodes (object enabled), which provides centralized analytics and centralized backup. Object data can be accessed as files using the Unified File and Object feature and used for analytics. Data can be centrally backed up to tape.]

Spectrum Scale Feature: Requirements Addressed
• AFM with Spectrum Scale Object: allows the object store to have a thin cache with eviction enabled and a thick home.
• AFM in IW mode: allows fail-over from the cache site to home and fail-back, useful during a disaster.
• Unified File and Object Access with HDFS connector: allows centralized and in-place analytics of data at the home site.
• Tape integration: centralized backup.

Page 27: Spectrum Scale Unified File and Object with WAN Caching

Thin Object Store Cache – Thick Object Store Archive

[Figure: existing services at Site #1 ingest and access objects through the Swift API on a fileset in the Spectrum Scale cache (Cache#1) in Region 1. An AFM Independent-Writer relationship replicates XX TB of data every day to a fileset on Spectrum Scale Home#1, the archive in Region 2, which also offers the Swift API for failover/failback.]

• Cache site in Region 1 with limited storage and home site in Region 2 with max storage per data center
• Object data to be archived from the cache site in Region 1 to the home site in Region 2 using AFM-IW
• On cache failure, the application will fail over to the home site for object access. The application will fail back when the cache comes up.
• Limited storage on the cache site is addressed by using Eviction along with AFM
• Key features used in the solution: Spectrum Scale Object, AFM (IW) with Eviction
• Available and documented in 4.2.1
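The thin-cache and thick-home behaviour described above can be sketched as a toy model: objects written at the cache are replicated to home, objects that are already safe at home may be evicted when the cache runs over its capacity, evicted objects are fetched again on access, and the application reads from home directly while the cache is down. Everything here is an assumption for illustration: the `ThinCache` class, the least-recently-used eviction order and the byte-count capacity are not taken from AFM, whose eviction is driven by fileset quotas.

```python
from collections import OrderedDict


class ThinCache:
    """Hypothetical thin cache in front of a thick home (a plain dict)."""

    def __init__(self, home, capacity_bytes):
        self.home = home
        self.capacity = capacity_bytes
        self.data = OrderedDict()   # cached objects, least recently used first
        self.dirty = set()          # objects not yet replicated to home
        self.up = True

    def _used(self):
        return sum(len(body) for body in self.data.values())

    def _evict(self):
        # Only objects already replicated to home are eligible for eviction.
        for name in list(self.data):
            if self._used() <= self.capacity:
                break
            if name not in self.dirty:
                del self.data[name]

    def put(self, name, body):
        self.data[name] = body
        self.data.move_to_end(name)
        self.dirty.add(name)

    def replicate(self):
        for name in sorted(self.dirty):
            self.home[name] = self.data[name]
        self.dirty.clear()
        self._evict()

    def get(self, name):
        if name not in self.data:           # evicted earlier: fetch from home
            self.data[name] = self.home[name]
        self.data.move_to_end(name)
        body = self.data[name]
        self._evict()
        return body


def app_get(cache, home, name):
    """Application-side failover: use the cache if it is up, else home."""
    return cache.get(name) if cache.up else home[name]


home = {}
cache = ThinCache(home, capacity_bytes=10)
for name in ("a", "b", "c"):
    cache.put(name, b"12345")
cache.replicate()                       # home now holds a, b, c; "a" is evicted
cached_after_replicate = list(cache.data)
refetched = app_get(cache, home, "a")   # fetched back from home, "b" is evicted
cached_after_refetch = list(cache.data)
cache.up = False                        # cache outage
from_home = app_get(cache, home, "b")   # application fails over to home
```

An object that is still dirty when the cache fails is not yet at home, so a failover read for it would fail in this sketch; that window is why replication lag matters in such a design.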

Page 28: Spectrum Scale Unified File and Object with WAN Caching

Multiple site Deployment

[Figure: existing services in Region #1 and Region #2 each use the Swift API against their own Spectrum Scale cache cluster. Each cache cluster has a corresponding home cluster (Home Cluster for Region 1, Home Cluster for Region 2) in Region 3, with Swift API failover/failback between cache and home.]

One can include multiple sites, where each site has its own home cluster at the central region, replicating the single-site setup shown on the previous slide.

Page 29: Spectrum Scale Unified File and Object with WAN Caching

Configuration Steps

• Detailed configuration steps are available in the 4.2.1 Knowledge Center: Using AFM with Spectrum Scale Object
  – http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1ins_usingafmwithobject.htm

Page 30: Spectrum Scale Unified File and Object with WAN Caching

Conclusion

• Spectrum Scale provides a rich set of features like
  – AFM
  – Protocols with POSIX, SMB, NFS and Object
  – Unified File and Object Access
  – In-place analytics using built-in Hadoop connectors
• Integrating AFM with Spectrum Scale Object delivers a unique solution required for many multi-site deployments wherein:
  – One can have a thin cache object store with auto eviction facility closer to the applications or users
  – A centralized thick home object store can act as the failback object store for the thin cache sites
  – Ability to do in-place analytics of all the data on the home site
  – Ability to do a central backup at the home site

Page 31: Spectrum Scale Unified File and Object with WAN Caching

Spectrum Scale User Group

• The Spectrum Scale User Group is free to join and open to all using, interested in using or integrating Spectrum Scale.
• Join the User Group activities to meet your peers and get access to experts from partners and IBM.
• Driven and owned by Customers
• Next meetings:
  – APAC: October 14, Melbourne
  – Global at SC16: November 13, 1pm to 5pm, Salt Lake City
• Web page: http://www.spectrumscale.org/
• Presentations: http://www.spectrumscale.org/presentations/
• Mailing list: http://www.spectrumscale.org/join/
• Contact: http://www.spectrumscale.org/committee/
• Meet Bob Oesterlin (US Co-Principal) at Edge2016: [email protected]

Page 32: Spectrum Scale Unified File and Object with WAN Caching

Session: How to apply Flash benefits to big data analytics and unstructured data

NDA & Customers ONLY

• Who: IBM Elastic Storage Server Offering Management
  – Alex Chen
• When: Thursday, September 22, 2016
  – 1:15pm to 2:15pm
• Where: Grand Garden Arena, Lower Level, MGM, Studio 10
• Contact (if any questions)
  – [email protected], [email protected]

Page 33: Spectrum Scale Unified File and Object with WAN Caching

Spectrum Scale Trial VM

• Download the IBM Spectrum Scale Trial VM from:
  – http://www-03.ibm.com/systems/storage/spectrum/scale/trial.html

Page 34: Spectrum Scale Unified File and Object with WAN Caching

References

• Spectrum Scale 4.2.1 Knowledge Center: Using AFM with object. http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1ins_usingafmwithobject.htm
• Spectrum Scale Object Store – Unified File and Object. http://www.slideshare.net/SandeepPatil154/spectrum-scaleexternalunifiedfile-object
• From Archive to Insight: Debunking Myths of Analytics on Object Stores – Dean Hildebrand, Bill Owen, Simon Lorenz, Luis Pabon, Rui Zhang. Vancouver Summit, Spring 2015. https://www.youtube.com/watch?v=brhEUptD3JQ
• Deploying Swift on a File System – Bill Owen, Thiago Da Silva. BrownBag at OpenStack Paris, Fall 2014. https://www.youtube.com/watch?v=vPn2uZF4yWo
• Breaking the Mold with OpenStack Swift and GlusterFS – John Dickinson, Luis Pabon. Atlanta Summit, Spring 2014. https://www.youtube.com/watch?v=pSWdzjA8WuA
• SNIA SDC 2015. http://www.snia.org/sites/default/files/SDC15_presentations/security/DeanHildebrand_Sasi__OpenStack%20SwiftOnFile.pdf

Page 35: Spectrum Scale Unified File and Object with WAN Caching

Notices and Disclaimers

Copyright © 2016 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM.

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.

Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided.

IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply.

Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.

Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.

References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business.

Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation.

It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law

Page 36: Spectrum Scale Unified File and Object with WAN Caching

Notices and Disclaimers (cont’d)

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right.

IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.

Page 37: Spectrum Scale Unified File and Object with WAN Caching

IBM Spectrum Scale Summary

• Avoid vendor lock-in with true Software Defined Storage and Open Standards
• Seamless performance & capacity scaling
• Automate data management at scale
• Enable global collaboration

Data management at scale: OpenStack and Spectrum Scale help clients manage data at scale

Needs
• Business: I need virtually unlimited storage
• Operations: I need a flexible infrastructure that supports both object and file based storage
• Operations: I need to minimize the time it takes to perform common storage management tasks
• Collaboration: I need to share data between people, departments and sites with low latency

Capabilities
• A single data plane that supports Cinder, Glance, Swift, Manila as well as NFS, SMB, et al.
• A fully automated policy based data placement and migration tool
• An open & scalable cloud platform
• Sharing with a variety of WAN caching modes

Results
• Converge File and Object based storage under one roof
• Employ enterprise features to protect data, e.g. Snapshots, Backup, and Disaster Recovery
• Support native file, block and object sharing to data

[Figure: Spectrum Scale offering NFS, SMB, POSIX, Swift, HDFS, Cinder, Glance and Manila access on top of SSD, fast disk, slow disk and tape.]

Page 38: Spectrum Scale Unified File and Object with WAN Caching


Thank You

