
AUTOMATING IBM SPECTRUM SCALE CLUSTER BUILDS IN AWS PROOF OF CONCEPT

By Joshua Kwedar | Sr. Systems Engineer
By Steve Horan | Cloud Architect
ATS Innovation Center, Malvern, PA
Dates: Oct – December 2017


INTRODUCTION

As an IBM Premier Business Partner, we are always looking for creative ways to meet our customers' needs. The demand for low-cost, quickly deployed solutions is ever increasing in today's "cloud first" climate.

Many of our customers have invested a significant amount of money in HPC solutions with node counts that can reach into the thousands, and have found themselves feeling locked in to a specific vendor with no quick way to elastically grow or shrink based on the demand of their workloads. In terms of hybrid growth, we've researched IBM Spectrum Scale's Transparent Cloud Tiering feature, which allows an S3 object store to serve as a storage tier within the same Scale namespace. While customers see the value in growing their storage footprint on demand, a subset of them view this feature as a stopgap to their ultimate goal of moving their storage as well as their compute into the cloud. We have built Scale clusters across multiple cloud providers (Google Cloud Platform, Amazon Web Services), but have focused on AWS after IBM's Spectrum Scale on AWS Quick Start trial evaluation was released in September 2017.

THE GOALS OF THE POC WERE THE FOLLOWING:

· Review the AWS Spectrum Scale architecture layout
· Create a Spectrum Scale cluster using IBM's Quick Start guide
· Review cloud formation templates and scripts to better understand the steps to automate a cluster build in AWS
· Execute failure scenarios and observe behavior
· Determine any restrictions in the trial release
· Improve cloud formation templates, scripts, and restrictions based on observations
· Create customized AMIs with updated Spectrum Scale versions


A Spectrum Scale cluster deployed in AWS consists of the following components:

1. A VPC (Virtual Private Cloud) is defined within AWS. All instances exist in the VPC. Deployment templates allow a cluster to be deployed within an existing VPC or for a new VPC to be created. The VPC defaults to spanning two availability zones.

2. A single Bastion host is created within a single availability zone in a public subnet. Think of a Bastion host as a jump server or an admin server. It serves no function within the Spectrum Scale cluster other than providing a means to SSH into cluster nodes from outside of the AWS VPC.

   a. Note that one of the Bastion hosts is greyed out in the image above. The Bastion stack has autoscaling configured by default to ensure there is always at least one Bastion host up and running. Without an accessible Bastion host, you would be unable to SSH to any cluster nodes.

3. NAT gateways are defined in the public subnet (Bastion stack) to allow outbound internet access for nodes in the private subnet (Server and Compute instances).

4. An AWS Identity and Access Management (IAM) role and security groups are automatically created to allow ports for SSH and the Spectrum Scale daemon (a sketch of roughly equivalent AWS CLI calls follows this list).


5. The IBM Quick Start allows users to specify EC2 instance types and quantities for the Server (NSD Server) and Compute nodes to be configured as part of the cluster.

   a. NSD Servers: minimum = 2, maximum = 64
   b. Compute nodes: minimum = 1, maximum = 64

6. One 100GB disk is allocated per EC2 instance to be used as the root volume. Users can specify the size, quantity, and type of EBS volumes to be used as NSDs (Network Shared Disks). The disk sizes currently supported by the template are 10GB-16384GB. EBS volume types for NSD use can only be allocated as gp2 (general purpose), io1 (high-performance SSD), or standard (HDD). If a user specifies 2 NSD servers and 1 compute node, a 5GB EBS volume is allocated to the compute node to account for quorum.

7. A filesystem name, block size (all supported Spectrum Scale block sizes can be specified), and number of replicas (max 2) must be provided within the template. The filesystem default number of replicas and NSD failure group definitions are automatically configured based on user inputs.
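To make item 4 above concrete, the following is a rough sketch of manual AWS CLI calls equivalent to what the templates automate: a security group permitting SSH and the GPFS daemon port (1191/tcp) between cluster members, and a gp2 EBS volume attached to an NSD server. All IDs, sizes, and the availability zone are placeholders; the Quick Start performs these steps through CloudFormation rather than the CLI.

```bash
# Security group for cluster traffic (VPC and group IDs are hypothetical).
aws ec2 create-security-group --group-name scale-cluster-sg \
    --description "Spectrum Scale cluster traffic" --vpc-id vpc-0123456789abcdef0

# Allow SSH from within the VPC and GPFS daemon traffic (port 1191) between cluster members.
aws ec2 authorize-security-group-ingress --group-id sg-0abc1234 \
    --protocol tcp --port 22 --cidr 10.0.0.0/16
aws ec2 authorize-security-group-ingress --group-id sg-0abc1234 \
    --protocol tcp --port 1191 --source-group sg-0abc1234

# Create a gp2 EBS volume and attach it to an NSD server instance for use as an NSD.
aws ec2 create-volume --availability-zone us-east-2a --size 10 --volume-type gp2
aws ec2 attach-volume --volume-id vol-0def5678 --instance-id i-0123abcd --device /dev/sdf
```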


Notes Regarding Architecture

· The template creates a synchronous, highly available Spectrum Scale cluster across two availability zones, but does not account for third-site quorum. A single site will always hold a majority quorum when an odd number of quorum nodes is specified using this architecture.

· Only a single EBS volume type can be specified for use. Spectrum Scale allows users to split metadata from data; metadata is often placed on faster volumes (io1) for quick response during metadata lookups. The cloud template places metadata and data on the same volumes and sets the maximum/default replicas to 1 or 2 based on user input (see the NSD stanza sketch after this list for how a split could be defined).

· The maximum number of replicas for Spectrum Scale filesystems is 3. The template only allows 2, as only two availability zones can be specified during cluster creation.

· Autoscaling groups are created for the Bastion, Server, and Compute stacks, but they need to be configured to take any action beyond satisfying the minimum number of nodes within each stack. For example, a CPU-utilization threshold needs to be user defined.

· There is no input for "GPFS cluster name".
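While the trial template keeps data and metadata on the same EBS volumes, Spectrum Scale itself can split them by NSD usage type. The sketch below shows how an NSD stanza file could place metadata on io1-backed devices and data on gp2-backed devices; the device names, NSD names, server hostnames, and pool name are hypothetical, and this is not what the Quick Start currently generates.

```bash
# Hypothetical NSD stanza file: metadata on io1 devices (system pool), data on gp2 devices.
cat > /tmp/nsd_stanzas.txt <<'EOF'
%nsd: device=/dev/xvdf nsd=meta_2a_1 servers=ip-10-0-1-208 usage=metadataOnly failureGroup=1 pool=system
%nsd: device=/dev/xvdg nsd=data_2a_1 servers=ip-10-0-1-208 usage=dataOnly failureGroup=1 pool=data1
%nsd: device=/dev/xvdf nsd=meta_2b_1 servers=ip-10-0-3-100 usage=metadataOnly failureGroup=2 pool=system
%nsd: device=/dev/xvdg nsd=data_2b_1 servers=ip-10-0-3-100 usage=dataOnly failureGroup=2 pool=data1
EOF

# Create the NSDs, then a filesystem with a 16M block size and two replicas of data and metadata.
/usr/lpp/mmfs/bin/mmcrnsd -F /tmp/nsd_stanzas.txt
/usr/lpp/mmfs/bin/mmcrfs fs1 -F /tmp/nsd_stanzas.txt -B 16M -m 2 -M 2 -r 2 -R 2 -T /gpfs/fs1
# A placement policy (mmchpolicy) would also be needed to direct file data into the data1 pool.
```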


The Following Inputs Were Provided to the Template in Order to Create a Test Cluster

A filesystem block size of 16M is selected. Allowable values are 256k, 512k, 1M, 2M, 4M, 8M, and 16M.

The minimum number of NSD Servers and two Compute nodes are selected for testing purposes.


VPC, Private, and Public CIDR block entries in the Network section of the form are prepopulated but can be changed if desired. A user must select at least two availability zones and define an External CIDR block.

For the purposes of testing, I have specified 0.0.0.0/0 to allow all public traffic. In an actual implementation, you would specify a corporate network CIDR block range.

A key pair name, S3 bucket, and operator email must be supplied. Other values can be modified but are prepopulated. While the Bastion instance type can be changed, it is simply a jump/admin server and does not need to be configured with any substantial amount of resources.
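The same inputs can also be supplied from the command line rather than through the console form. The sketch below uses hypothetical parameter keys, stack name, and template URL; the actual keys are defined by the Quick Start template and should be checked there before use.

```bash
# Launch the Quick Start with explicit parameters (names and values are placeholders).
aws cloudformation create-stack \
    --stack-name spectrum-scale-poc \
    --template-url https://s3.amazonaws.com/<quickstart-bucket>/ibm-spectrum-scale.template \
    --capabilities CAPABILITY_IAM \
    --parameters \
        ParameterKey=KeyPairName,ParameterValue=my-keypair \
        ParameterKey=OperatorEmail,ParameterValue=ops@example.com \
        ParameterKey=BlockSize,ParameterValue=16M
```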


The following default options were taken.

After review of the user-supplied inputs, the overall stack can be created. The progress of each individual stack can be monitored within the AWS console. See the progression below with timestamps.


Once the status for each stack listed above is "CREATE_COMPLETE", the EC2 instances containing a fully functioning Spectrum Scale filesystem are accessible. Creation time of each stack depends on the user-supplied inputs; in our example, the entire process took ~10 minutes. Instance/cluster creation time increases as the number of nodes and disks increases.
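Stack progress can also be watched from the CLI instead of the console; the stack name below matches the hypothetical one used in the earlier create-stack sketch.

```bash
# Block until the stack (and its nested stacks) finishes creating, then confirm the status.
aws cloudformation wait stack-create-complete --stack-name spectrum-scale-poc
aws cloudformation describe-stacks --stack-name spectrum-scale-poc \
    --query 'Stacks[0].StackStatus'

# Review the most recent stack events with their timestamps.
aws cloudformation describe-stack-events --stack-name spectrum-scale-poc --max-items 20
```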

EC2 instance information can be viewed by accessing the AWS console -> EC2 -> Instances. Our cluster consists of the following hosts:

From a cluster node we can view the output of 'mmlscluster' and 'mmlsnsd' to review the cluster name, repository type, node names, node designations, NSD names, NSD-to-filesystem allocations, and NSD servers per NSD.

A column titled "Public IP" shows that only the LinuxBastion host has been assigned a public address. The Bastion host can be accessed using the ec2-user account and passing your AWS key to the Bastion IP. From there, you can SSH to any Server/Compute node.
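As a minimal sketch of that access path (key file name, public IP, and private hostnames are placeholders):

```bash
# Load the AWS key into an agent and forward it through the Bastion host.
ssh-add ~/.ssh/my-aws-key.pem
ssh -A ec2-user@<bastion-public-ip>

# From the Bastion, hop to a private Server or Compute node.
ssh ec2-user@ip-10-0-1-208.us-east-2.compute.internal

# Alternatively, reach a private node in one step from the workstation with ProxyJump.
ssh -i ~/.ssh/my-aws-key.pem -J ec2-user@<bastion-public-ip> ec2-user@10.0.1.208

# On a cluster node, review the cluster and NSD layout (GPFS commands live in /usr/lpp/mmfs/bin).
sudo /usr/lpp/mmfs/bin/mmlscluster
sudo /usr/lpp/mmfs/bin/mmlsnsd -m
```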


Based on this output, we can determine that one NSD server and one Compute node are placed in the 10.0.1.x private network (Availability Zone us-east-2a) and the other NSD server and Compute node are placed in the 10.0.3.x private network (Availability Zone us-east-2b). Note that two NSD servers and a single Compute node are designated as quorum nodes. A single 5GB descOnly disk is allocated to Compute node ip-10-0-1-132.us-east-2.compute.internal to serve as a quorum disk; however, if availability zone us-east-2a were to go offline, the filesystem would be inaccessible due to loss of quorum.

The script defines NSDs with a naming convention that includes the availability zone, making it much easier for users to determine which zone an NSD resides in. In this example, nsd_2a_1_0 is served out by NSD server ip-10-0-1-208.us-east-2.compute.internal from availability zone us-east-2a.


Total filesystem size is 20GB (2 x 10GB disks, one per NSD Server) with a replication factor of 2 for data and metadata. The filesystem is created with the maximum number of replicas (3) for data and metadata that Spectrum Scale allows.
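These settings can be verified from a cluster node; 'fs1' below is a placeholder for whatever filesystem name was supplied to the template.

```bash
# Show block size, default replicas (-r/-m), and maximum replicas (-R/-M) for data and metadata.
/usr/lpp/mmfs/bin/mmlsfs fs1 -B -r -R -m -M

# Show total and free capacity per NSD and per storage pool.
/usr/lpp/mmfs/bin/mmdf fs1
```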

Autoscaling groups can be viewed by navigating to Auto Scaling -> Auto Scaling Groups within the EC2 console.

In our example, three autoscaling groups are created, one each for the Bastion, Server, and Compute stacks. Each stack has minimum, desired, and maximum definitions derived from the user inputs supplied in the Cloud Formation template. Other auto scaling rules, tied to CPU/memory utilization for example, can be manually defined.
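For instance, a target-tracking policy keyed to average CPU utilization could be attached to the Compute autoscaling group; the group name below is a placeholder. Note that, as the termination test that follows shows, any instance the group adds would still need to be joined to the GPFS cluster manually.

```bash
# Scale the Compute group to hold average CPU utilization near 70%.
aws autoscaling put-scaling-policy \
    --auto-scaling-group-name <compute-asg-name> \
    --policy-name compute-cpu-target \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 70.0
    }'
```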

A basic test of terminating a compute node demonstrates the functionality of the autoscaling groups that were created by the template.

A compute node is terminated.


A new compute instance spins up to satisfy the rules of the Compute stack autoscaling group.

The new instance is ready for use.

We can now SSH to the server and verify that it has just been built and that the AMI containing the Spectrum Scale packages was used to build the instance. However, the GPFS daemon is not running.


Attempts to run 'mmlscluster' on the newly created compute node show that it does not belong to a Spectrum Scale cluster. The same command run on an existing NSD Server node verifies that the new node has not been added to the cluster; the terminated compute node (ip-10-0-3-222.us-east-2.compute.internal) is still a member of the cluster configuration.

At this time there is no functionality built into the template/stacks to automatically add and remove newly generated instances from the cluster configuration. Nodes would need to be manually added and removed using the 'mmaddnode' and 'mmdelnode' commands. In the event of quorum node loss, new quorum nodes would need to be designated. In the event of NSD server loss, steps would need to be taken to reestablish optimal striping of data across newly created NSDs (mmrestripefs). Users may want to automate certain functions listed above (such as automatically adding new nodes to the cluster), while other administrative tasks (mmrestripefs) may be better run manually, during a maintenance window for example.
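A rough manual sequence for the compute-node case is sketched below, assuming the replacement instance is reachable over passwordless SSH and already carries the Spectrum Scale packages from the AMI; hostnames and the filesystem name are placeholders.

```bash
# Run from an existing cluster node as root (commands live in /usr/lpp/mmfs/bin).

# Drop the terminated node from the cluster configuration.
mmdelnode -N ip-10-0-3-222.us-east-2.compute.internal

# Add the replacement instance and accept its license designation.
mmaddnode -N <new-compute-hostname>
mmchlicense client --accept -N <new-compute-hostname>

# If a quorum node was lost, promote a replacement to quorum.
mmchnode --quorum -N <new-compute-hostname>

# Start GPFS on the new node so it mounts the filesystem.
mmstartup -N <new-compute-hostname>

# After replacing NSD server disks, rebalance data across the NSDs (ideally in a maintenance window).
mmrestripefs <filesystem> -b
```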

With Regard to the Current Version of the AWS Spectrum Scale Trial Cloud Formation Template, the Following Restrictions Exist:

· Protocol support, including the use of Cluster Export Services (CES) nodes and protocol access such as Network File System (NFS), Object, and Server Message Block (SMB)
· Active File Management (AFM)
· Transparent Cloud Tiering (TCT)
· Compression
· Encryption
· Data Management API (DMAPI) support, including Hierarchical Storage Management (HSM) to tape
· Hadoop Distributed File System (HDFS) connector support
· Multi-cluster support (exporting an IBM Spectrum Scale file system from one IBM Spectrum Scale cluster to another IBM Spectrum Scale cluster)
· GUI
· User name space management and quota management
· Snapshots and clones
· Replication is restricted to only 1X (IBM Spectrum Scale makes a single copy of all data and metadata) and 2X (IBM Spectrum Scale makes two copies of all data and metadata)

Additional Limitations

· Using EBS volume encryption for IBM Spectrum Scale file systems is not supported
· The archiving and restoring of IBM Spectrum Scale data through the use of AWS services is not supported

Many of the limitations above are a result of the edition packaged within the AMI (Standard). ATS has created custom cloud formation templates that reflect many of the design considerations called out in this document (third-site quorum, maximum data/metadata replicas, and splitting of metadata/data volumes, among others), in addition to creating our own AMI images using the Advanced and Data Management editions of Spectrum Scale; features such as encryption, compression, and AFM are included in these editions. As a next step, we are looking to implement Protocol nodes within AWS using Active Directory replication as a means for authentication. Implementing an S3 archive tier using Transparent Cloud Tiering would also serve as an attractive option for those looking to leverage cheaper storage for archive purposes, all within AWS. We look forward to partnering with our customers to come up with the next generation of Spectrum Scale cluster implementations in the cloud.


About the ATS Group

Since our founding in 2001, the ATS Group has consulted on thousands of system implementations, upgrades, backups, and recoveries. We also support customers by providing managed services, performance analysis, and capacity planning. With over 60 industry-certified professionals, we support SMBs, Fortune 500 companies, and government agencies. As experts in IBM, VMware, Oracle, and other top vendors, we are experienced in virtualization, storage area networks (SANs), high availability, performance tuning, SDS, enterprise backup, and other evolving technologies that operate mission-critical systems on premises, in the cloud, or in a hybrid environment.

THEATSGROUP.COM | COMPANY | CONTACT

  1. Button 2
  2. Button 3

01

INTRODUCTION

As an IBM premier business partner we are always looking for creative ways to meet our customerrsquos needs The demand for low cost quickly deployed solutions is ever increasing in todayrsquos ldquocloud firstrdquo climate

Many of our customers have invested a significant amount of money into HPC solutions with

node counts that can reach into the thousands and have found themselves feeling locked in to

a specific vendor with no quick way to elastically grow or shrink based on the demand of their

workloads In terms of hybrid growth wersquove researched IBM Spectrum Scalersquos Transparent Cloud

Tiering feature which allows for an S3 object store to serve as a storage tier within the same

Scale namespace While customers see the value in growing their storage footprint on demand

a subset of them view this feature as a stop gap to their ultimate goal of moving their storage as

well as their compute into the cloud We have built Scale clusters across multiple cloud providers

(Google Cloud Platform Amazon Web Services) but have focused in on AWS after IBMrsquos

Spectrum Scale on AWS Quick Start trial evaluation was released in September 2017

THE GOAL OF THE POC WAS THE FOLLOWING

middot Review the AWS Spectrum Scale architecture layout

middot Create a Spectrum Scale cluster using IBMrsquos Quick Start guide

middot Review cloud formation templates and scripts to better understand steps to automate cluster build in AWS

middot Execute failure scenarios and observe behavior

middot Determine any restrictions in the trial release

middot Improve cloud formation templates scripts restrictions based on observations

middot Create customized AMIs with updated Spectrum Scale versions

02

A Spectrum Scale cluster deployed in AWS consists of the following components

1 A VPC (Virtual Private Cloud) is defined within AWS All instances exist in the VPC

Deployment templates allow a cluster to be deployed within an existing VPC or for a

new VPC to be created The VPC defaults to span two availability zones

2 A single Bastion host is created within a single availability zone in a public subnet Think

of a Bastion host as a jump server or an admin server It serves no function within the

Spectrum Scale cluster other than a means to SSH into cluster nodes from outside of the

AWS VPC

a Note that one of the Bastion hosts is greyed out in the image above The Bastion stack

has autoscaling configured by default to ensure there is always at least one Bastion

host up and running Without an accessible Bastion host you would be unable to SSH

to any cluster nodes

3 NAT gateways are defined in the public subnet (Bastion stack) to allow outbound internet

access for nodes in the private subnet (Server and Compute instances)

4 AWS Identity and Access Management (IAM) role and Security groups are automatically

created to allow ports for SSH and the Spectrum Scale daemon

03

5 The IBM Quickstart allows users to specify EC2 types and quantity for the Server

(NSD Server) and Compute nodes to be configured as a part of the cluster

a NSD Servers

i Minimum = 2

ii Maxiumum = 64

b Compute nodes

i Minimum = 1

ii Maximum = 64

6 One 100GB disk is allocated per EC2 instance to be used as the root volume Users can

specify the size quantity and type of EBS volumes to be used as NSDs (Network Shared

Disks) The currently supported disk sizes the template provides are 10GB-16384GB EBS

Volume Types for NSD use can only be allocated as either gp2 (general purpose) io1 (high

performance SSD) or standard (HDD) If a user specifies 2 NSD servers and 1 compute

node a 5GB EBS volume is allocated to the compute node to account for quorum

7 A filesystem name block size (all supported Spectrum Scale block sizes can be specified)

and number of replicas (max 2) must be provided within the template The filesystem

default number of replicas and NSD failure group definitions are automatically configured

based on user inputs

04

Notes Regarding Architecturemiddot The template creates a synchronous highly available Spectrum Scale cluster across two

availability zones but does not account for third site quorum A single site will always have

a majority quorum definition when an odd number of quorum nodes are specified using

this architecture

middot Only a single EBS volume type can be specified for use Spectrum Scale allows users to

split metadata from data Metadata is often placed on faster volumes (io1) for high response

during metadata lookups The cloud template places metadata and data on the same

volumes and sets the maximumdefault replicas to 1 or 2 based on user input

middot The maximum number of replicas for Spectrum Scale filesystems is 3 The template only

allows for 2 as only two availability zones are able to be specified during cluster creation

middot Autoscaling groups are created for the Bastion Server and Compute stacks but need to be

configured to take any action beyond satisfying the minimum number of nodes within each

stack For example a CPU used threshold needs to be user defined

middot There is no input for ldquoGPFS cluster namerdquo

05

The Following Inputs Were Provided to The Template in Order to Create a Test Cluster

A filesystem block size of 16M is selected Allowable values are 256k 512k 1M 2M 4M 8M 16M

The minimum number of NSD Servers and two Compute nodes are selected for testing purposes

06

VPC Private and Public CIDR block entries in the Network section of the form are prepopulated but can be changed if desired A user must select at least 2 availability zones and defined an External CIDR block

For the purposes of testing I have specified 00000 to allow all public traffic In an actual implementation you would specify a corporate network CIDR Block range

A key pair name S3 bucket and operator email must be supplied Other values can be modified but are prepopulated While the Bastion instance type can be changed it is simply a jumpadmin server and does not need to be configured with any substantial amount of resources

07

The following default options were taken

After review of the user supplied inputs the overall stack can be created The progress of each individual stack can be monitored within the AWS console See progression below with timestamps

08

Once the status for each stack listed above is ldquoCREATE_COMPLETErdquo the EC2 instances containing a fully functioning Spectrum Scale filesystem are accessible Creation time of each stack depends on the user supplied inputs In our example the entire process took ~10 minutes Instancecluster creation time increases as nodesnumber of disks increase

EC2 instance information by accessing the AWS console -gt EC2 Instances Our cluster consists of the following hosts

From a cluster node we can view the output of lsquommlsclusterrsquo and lsquommlsnsdrsquo to review the Cluster Name Repository Type Node names Node designation NSD names NSD to Filesystem allocations and NSD servers per NSD

A column titled ldquoPublic IPrdquo shows that only the LinuxBastion host has been assigned a public

address The Bastion host can be accessed using the ec2-user account and passing your

AWS key to the Bastion IP From there you can SSH to any ServerCompute node

09

Based on this output we can determine that one NSD server and one Compute node are

placed in the 1001X (Availability Zone us-east-2a) and the other NSD server and compute

node are placed in the 1003X private network (Availability Zone us-east-2b) Note that two

NSD servers and a single compute node are designated as quorum nodes A single 5GB

descOnly disk is allocated to compute node ip-10-0-1-132us-east-2computeinternal to serve

as a quorum disk however if availability zone us-east-2a were to go offline the filesystem

would be inaccessible due to loss of quorum

The script defines NSDs with a naming convention that includes availability zone in order

to make it much easier for user to determine which zone an NSD resides In this example

nsd_2a_1_0 is served out by NSD server ip-10-0-1-208us-east-2computeinternal from

availability zone us-east-2a

10

Total filesystem size is 20G (2 x 10G disk ndash one per NSD Server) with a replication factor of 2

for data and metadata The filesystem is created with the maximum number of replicas (3)

for data and metadata allowed by Spectrum Scale

Autoscaling groups can be viewed by navigating to Auto Scaling -gt Auto Scaling Groups with

the EC2 instance view

In our example three autoscaling groups are created for the Bastion Server and Compute

stacks Each stack has a minimum desired and maximum definition derived from the user

inputs supplied in the Cloud Formation template Other auto scaling rules tied to CPUMemory

utilization for example can be manually defined

A basic test of terminating a compute node demonstrates the functionality of the autoscaling groups that were created by the template

A compute node is termined

11

A new compute instance spins up to satisfy the rules of the Compute stack autoscaling group

The new instance is ready for use

We can now SSH to the server verify that its just been built and that the AMI containing Spectrum Scale packages has been used to build the instance However the GPFS daemon is not running

12

Attempts to run lsquommlsclusterrsquo on the newly created compute node show that it does not belong to a Spectrum Scale cluster The same command run on an existing NSD Server node verifies that the new node has not been added to the cluster The terminated compute node (ip-10-0-3-222us-east-2computeinternal) is still a member of the cluster configuration

At this time there is no functionality built into the templatestacks to automatically add and

remove newly generated instances from the cluster configuration Nodes would need to be

manually added and removed using lsquommaddnodersquo and lsquommdelnodersquo commands In the event of

quorum node loss new quorum nodes would need to be designated In the event of NSD server

loss steps would need to be taken to reestablish optimal striping of data across newly created

NSDs (mmrestripefs) Users may want to automate certain functions listed above (automatic add

of new nodes to the cluster) while other administrative tasks (mmrestripefs) may be better off

being run manually during a maintenance window for example

With Regard to The Current Version of The AWS Spectrum Scale Trial Cloud Formation Template The Following Restrictions Exist

middot Protocol support including the use of Cluster Export Services (CES) nodes and protocol access such as Network File System (NFS) Object and Server Message Block (SMB)

middot Active File Management (AFM)

middot Transparent Cloud Tiering (TCT)

middot Compression

middot Encryption

middot Data Management API (DMAPI) support including Hierarchical Storage Management (HSM)

to tape

middot Hadoop Distributed File System (HDFS) connector support

middot Multi-cluster support (exporting an IBM Spectrum Scale file system from one Spectrum Scale cluster to another IBM Spectrum Scale cluster)

middot GUI

middot User name space management and quota management

middot Snapshots and clones

middot Replication is restricted to only 1X (IBM Spectrum Scale makes a single copy of all data and metadata) and 2X (IBM Spectrum Scale makes two copies of all data and metadata)

Additional Limitationsmiddot Using EBS volume encryption for IBM Spectrum Scale file systems is not supported

middot The archiving and restoring of IBM Spectrum Scale data through the use of AWS services is not supported

Many of the limitations above are a result of the edition packaged within the AMI (Standard) ATS has created custom cloud formation templates to reflect many of the design considerations called out in this document (third site quorum maximum datametadata replicas splitting of metadatadata volumes among others) in addition to creating our own AMI images using the AdvancedData Management edition of Spectrum Scale Features such as encryption compression and AFM are included in these editions As a next step we are looking to implement Protocol nodes within AWS using Active Directory replication as a means for authentication Implementing an S3 archive tier using Transparent Cloud Tiering would serve as an attractive option for those looking to leverage cheaper storage for archive purposes all within AWS We look forward to partnering with our customers to come up with the next generation of Spectrum Scale cluster implementations in the cloud

13

About the ATS Group

Since our founding in 2001 the ATS Group has consulted on thousands of system

implementations upgrades backups and recoveries We also support customers by

providing managed services performance analysis and capacity planning With over

60 industry-certified professionals we support SMBs Fortune 500 companies and

government agencies As experts in IBM VMware Oracle and other top vendors we are

experienced in virtualization storage area networks (SANs) high availability performance

tuning SDS enterprise backup and other evolving technologies that operate mission-critical

systems on premise in the cloud or in a hybrid environment

T H E AT S G R O U P C O M C O M PA N Y C O N TA C T

  1. Button 2
  2. Button 3

02

A Spectrum Scale cluster deployed in AWS consists of the following components

1 A VPC (Virtual Private Cloud) is defined within AWS All instances exist in the VPC

Deployment templates allow a cluster to be deployed within an existing VPC or for a

new VPC to be created The VPC defaults to span two availability zones

2 A single Bastion host is created within a single availability zone in a public subnet Think

of a Bastion host as a jump server or an admin server It serves no function within the

Spectrum Scale cluster other than a means to SSH into cluster nodes from outside of the

AWS VPC

a Note that one of the Bastion hosts is greyed out in the image above The Bastion stack

has autoscaling configured by default to ensure there is always at least one Bastion

host up and running Without an accessible Bastion host you would be unable to SSH

to any cluster nodes

3 NAT gateways are defined in the public subnet (Bastion stack) to allow outbound internet

access for nodes in the private subnet (Server and Compute instances)

4 AWS Identity and Access Management (IAM) role and Security groups are automatically

created to allow ports for SSH and the Spectrum Scale daemon

03

5 The IBM Quickstart allows users to specify EC2 types and quantity for the Server

(NSD Server) and Compute nodes to be configured as a part of the cluster

a NSD Servers

i Minimum = 2

ii Maxiumum = 64

b Compute nodes

i Minimum = 1

ii Maximum = 64

6 One 100GB disk is allocated per EC2 instance to be used as the root volume Users can

specify the size quantity and type of EBS volumes to be used as NSDs (Network Shared

Disks) The currently supported disk sizes the template provides are 10GB-16384GB EBS

Volume Types for NSD use can only be allocated as either gp2 (general purpose) io1 (high

performance SSD) or standard (HDD) If a user specifies 2 NSD servers and 1 compute

node a 5GB EBS volume is allocated to the compute node to account for quorum

7 A filesystem name block size (all supported Spectrum Scale block sizes can be specified)

and number of replicas (max 2) must be provided within the template The filesystem

default number of replicas and NSD failure group definitions are automatically configured

based on user inputs

04

Notes Regarding Architecturemiddot The template creates a synchronous highly available Spectrum Scale cluster across two

availability zones but does not account for third site quorum A single site will always have

a majority quorum definition when an odd number of quorum nodes are specified using

this architecture

middot Only a single EBS volume type can be specified for use Spectrum Scale allows users to

split metadata from data Metadata is often placed on faster volumes (io1) for high response

during metadata lookups The cloud template places metadata and data on the same

volumes and sets the maximumdefault replicas to 1 or 2 based on user input

middot The maximum number of replicas for Spectrum Scale filesystems is 3 The template only

allows for 2 as only two availability zones are able to be specified during cluster creation

middot Autoscaling groups are created for the Bastion Server and Compute stacks but need to be

configured to take any action beyond satisfying the minimum number of nodes within each

stack For example a CPU used threshold needs to be user defined

middot There is no input for ldquoGPFS cluster namerdquo

05

The Following Inputs Were Provided to The Template in Order to Create a Test Cluster

A filesystem block size of 16M is selected Allowable values are 256k 512k 1M 2M 4M 8M 16M

The minimum number of NSD Servers and two Compute nodes are selected for testing purposes

06

VPC Private and Public CIDR block entries in the Network section of the form are prepopulated but can be changed if desired A user must select at least 2 availability zones and defined an External CIDR block

For the purposes of testing I have specified 00000 to allow all public traffic In an actual implementation you would specify a corporate network CIDR Block range

A key pair name S3 bucket and operator email must be supplied Other values can be modified but are prepopulated While the Bastion instance type can be changed it is simply a jumpadmin server and does not need to be configured with any substantial amount of resources

07

The following default options were taken

After review of the user supplied inputs the overall stack can be created The progress of each individual stack can be monitored within the AWS console See progression below with timestamps

08

Once the status for each stack listed above is ldquoCREATE_COMPLETErdquo the EC2 instances containing a fully functioning Spectrum Scale filesystem are accessible Creation time of each stack depends on the user supplied inputs In our example the entire process took ~10 minutes Instancecluster creation time increases as nodesnumber of disks increase

EC2 instance information by accessing the AWS console -gt EC2 Instances Our cluster consists of the following hosts

From a cluster node we can view the output of lsquommlsclusterrsquo and lsquommlsnsdrsquo to review the Cluster Name Repository Type Node names Node designation NSD names NSD to Filesystem allocations and NSD servers per NSD

A column titled ldquoPublic IPrdquo shows that only the LinuxBastion host has been assigned a public

address The Bastion host can be accessed using the ec2-user account and passing your

AWS key to the Bastion IP From there you can SSH to any ServerCompute node

09

Based on this output we can determine that one NSD server and one Compute node are

placed in the 1001X (Availability Zone us-east-2a) and the other NSD server and compute

node are placed in the 1003X private network (Availability Zone us-east-2b) Note that two

NSD servers and a single compute node are designated as quorum nodes A single 5GB

descOnly disk is allocated to compute node ip-10-0-1-132us-east-2computeinternal to serve

as a quorum disk however if availability zone us-east-2a were to go offline the filesystem

would be inaccessible due to loss of quorum

The script defines NSDs with a naming convention that includes availability zone in order

to make it much easier for user to determine which zone an NSD resides In this example

nsd_2a_1_0 is served out by NSD server ip-10-0-1-208us-east-2computeinternal from

availability zone us-east-2a

10

Total filesystem size is 20G (2 x 10G disk ndash one per NSD Server) with a replication factor of 2

for data and metadata The filesystem is created with the maximum number of replicas (3)

for data and metadata allowed by Spectrum Scale

Autoscaling groups can be viewed by navigating to Auto Scaling -gt Auto Scaling Groups with

the EC2 instance view

In our example three autoscaling groups are created for the Bastion Server and Compute

stacks Each stack has a minimum desired and maximum definition derived from the user

inputs supplied in the Cloud Formation template Other auto scaling rules tied to CPUMemory

utilization for example can be manually defined

A basic test of terminating a compute node demonstrates the functionality of the autoscaling groups that were created by the template

A compute node is termined

11

A new compute instance spins up to satisfy the rules of the Compute stack autoscaling group

The new instance is ready for use

We can now SSH to the server verify that its just been built and that the AMI containing Spectrum Scale packages has been used to build the instance However the GPFS daemon is not running

12

Attempts to run lsquommlsclusterrsquo on the newly created compute node show that it does not belong to a Spectrum Scale cluster The same command run on an existing NSD Server node verifies that the new node has not been added to the cluster The terminated compute node (ip-10-0-3-222us-east-2computeinternal) is still a member of the cluster configuration

At this time there is no functionality built into the templatestacks to automatically add and

remove newly generated instances from the cluster configuration Nodes would need to be

manually added and removed using lsquommaddnodersquo and lsquommdelnodersquo commands In the event of

quorum node loss new quorum nodes would need to be designated In the event of NSD server

loss steps would need to be taken to reestablish optimal striping of data across newly created

NSDs (mmrestripefs) Users may want to automate certain functions listed above (automatic add

of new nodes to the cluster) while other administrative tasks (mmrestripefs) may be better off

being run manually during a maintenance window for example

With Regard to The Current Version of The AWS Spectrum Scale Trial Cloud Formation Template The Following Restrictions Exist

middot Protocol support including the use of Cluster Export Services (CES) nodes and protocol access such as Network File System (NFS) Object and Server Message Block (SMB)

middot Active File Management (AFM)

middot Transparent Cloud Tiering (TCT)

middot Compression

middot Encryption

middot Data Management API (DMAPI) support including Hierarchical Storage Management (HSM)

to tape

middot Hadoop Distributed File System (HDFS) connector support

middot Multi-cluster support (exporting an IBM Spectrum Scale file system from one Spectrum Scale cluster to another IBM Spectrum Scale cluster)

middot GUI

middot User name space management and quota management

middot Snapshots and clones

middot Replication is restricted to only 1X (IBM Spectrum Scale makes a single copy of all data and metadata) and 2X (IBM Spectrum Scale makes two copies of all data and metadata)

Additional Limitationsmiddot Using EBS volume encryption for IBM Spectrum Scale file systems is not supported

middot The archiving and restoring of IBM Spectrum Scale data through the use of AWS services is not supported

Many of the limitations above are a result of the edition packaged within the AMI (Standard) ATS has created custom cloud formation templates to reflect many of the design considerations called out in this document (third site quorum maximum datametadata replicas splitting of metadatadata volumes among others) in addition to creating our own AMI images using the AdvancedData Management edition of Spectrum Scale Features such as encryption compression and AFM are included in these editions As a next step we are looking to implement Protocol nodes within AWS using Active Directory replication as a means for authentication Implementing an S3 archive tier using Transparent Cloud Tiering would serve as an attractive option for those looking to leverage cheaper storage for archive purposes all within AWS We look forward to partnering with our customers to come up with the next generation of Spectrum Scale cluster implementations in the cloud

13

About the ATS Group

Since our founding in 2001 the ATS Group has consulted on thousands of system

implementations upgrades backups and recoveries We also support customers by

providing managed services performance analysis and capacity planning With over

60 industry-certified professionals we support SMBs Fortune 500 companies and

government agencies As experts in IBM VMware Oracle and other top vendors we are

experienced in virtualization storage area networks (SANs) high availability performance

tuning SDS enterprise backup and other evolving technologies that operate mission-critical

systems on premise in the cloud or in a hybrid environment

T H E AT S G R O U P C O M C O M PA N Y C O N TA C T

  1. Button 2
  2. Button 3

03

5 The IBM Quickstart allows users to specify EC2 types and quantity for the Server

(NSD Server) and Compute nodes to be configured as a part of the cluster

a NSD Servers

i Minimum = 2

ii Maxiumum = 64

b Compute nodes

i Minimum = 1

ii Maximum = 64

6 One 100GB disk is allocated per EC2 instance to be used as the root volume Users can

specify the size quantity and type of EBS volumes to be used as NSDs (Network Shared

Disks) The currently supported disk sizes the template provides are 10GB-16384GB EBS

Volume Types for NSD use can only be allocated as either gp2 (general purpose) io1 (high

performance SSD) or standard (HDD) If a user specifies 2 NSD servers and 1 compute

node a 5GB EBS volume is allocated to the compute node to account for quorum

7 A filesystem name block size (all supported Spectrum Scale block sizes can be specified)

and number of replicas (max 2) must be provided within the template The filesystem

default number of replicas and NSD failure group definitions are automatically configured

based on user inputs

04

Notes Regarding Architecturemiddot The template creates a synchronous highly available Spectrum Scale cluster across two

availability zones but does not account for third site quorum A single site will always have

a majority quorum definition when an odd number of quorum nodes are specified using

this architecture

middot Only a single EBS volume type can be specified for use Spectrum Scale allows users to

split metadata from data Metadata is often placed on faster volumes (io1) for high response

during metadata lookups The cloud template places metadata and data on the same

volumes and sets the maximumdefault replicas to 1 or 2 based on user input

middot The maximum number of replicas for Spectrum Scale filesystems is 3 The template only

allows for 2 as only two availability zones are able to be specified during cluster creation

middot Autoscaling groups are created for the Bastion Server and Compute stacks but need to be

configured to take any action beyond satisfying the minimum number of nodes within each

stack For example a CPU used threshold needs to be user defined

middot There is no input for ldquoGPFS cluster namerdquo

05

The Following Inputs Were Provided to The Template in Order to Create a Test Cluster

A filesystem block size of 16M is selected Allowable values are 256k 512k 1M 2M 4M 8M 16M

The minimum number of NSD Servers and two Compute nodes are selected for testing purposes

06

VPC Private and Public CIDR block entries in the Network section of the form are prepopulated but can be changed if desired A user must select at least 2 availability zones and defined an External CIDR block

For the purposes of testing I have specified 00000 to allow all public traffic In an actual implementation you would specify a corporate network CIDR Block range

A key pair name S3 bucket and operator email must be supplied Other values can be modified but are prepopulated While the Bastion instance type can be changed it is simply a jumpadmin server and does not need to be configured with any substantial amount of resources

07

The following default options were taken

After review of the user supplied inputs the overall stack can be created The progress of each individual stack can be monitored within the AWS console See progression below with timestamps

08

Once the status for each stack listed above is ldquoCREATE_COMPLETErdquo the EC2 instances containing a fully functioning Spectrum Scale filesystem are accessible Creation time of each stack depends on the user supplied inputs In our example the entire process took ~10 minutes Instancecluster creation time increases as nodesnumber of disks increase

EC2 instance information by accessing the AWS console -gt EC2 Instances Our cluster consists of the following hosts

From a cluster node we can view the output of lsquommlsclusterrsquo and lsquommlsnsdrsquo to review the Cluster Name Repository Type Node names Node designation NSD names NSD to Filesystem allocations and NSD servers per NSD

A column titled ldquoPublic IPrdquo shows that only the LinuxBastion host has been assigned a public

address The Bastion host can be accessed using the ec2-user account and passing your

AWS key to the Bastion IP From there you can SSH to any ServerCompute node

09

Based on this output we can determine that one NSD server and one Compute node are

placed in the 1001X (Availability Zone us-east-2a) and the other NSD server and compute

node are placed in the 1003X private network (Availability Zone us-east-2b) Note that two

NSD servers and a single compute node are designated as quorum nodes A single 5GB

descOnly disk is allocated to compute node ip-10-0-1-132us-east-2computeinternal to serve

as a quorum disk however if availability zone us-east-2a were to go offline the filesystem

would be inaccessible due to loss of quorum

The script defines NSDs with a naming convention that includes availability zone in order

to make it much easier for user to determine which zone an NSD resides In this example

nsd_2a_1_0 is served out by NSD server ip-10-0-1-208us-east-2computeinternal from

availability zone us-east-2a

10

Total filesystem size is 20G (2 x 10G disk ndash one per NSD Server) with a replication factor of 2

for data and metadata The filesystem is created with the maximum number of replicas (3)

for data and metadata allowed by Spectrum Scale

Autoscaling groups can be viewed by navigating to Auto Scaling -gt Auto Scaling Groups with

the EC2 instance view

In our example three autoscaling groups are created for the Bastion Server and Compute

stacks Each stack has a minimum desired and maximum definition derived from the user

inputs supplied in the Cloud Formation template Other auto scaling rules tied to CPUMemory

utilization for example can be manually defined

A basic test of terminating a compute node demonstrates the functionality of the autoscaling groups that were created by the template

A compute node is termined

11

A new compute instance spins up to satisfy the rules of the Compute stack autoscaling group

The new instance is ready for use

We can now SSH to the server verify that its just been built and that the AMI containing Spectrum Scale packages has been used to build the instance However the GPFS daemon is not running

12

Attempts to run lsquommlsclusterrsquo on the newly created compute node show that it does not belong to a Spectrum Scale cluster The same command run on an existing NSD Server node verifies that the new node has not been added to the cluster The terminated compute node (ip-10-0-3-222us-east-2computeinternal) is still a member of the cluster configuration

At this time there is no functionality built into the templatestacks to automatically add and

remove newly generated instances from the cluster configuration Nodes would need to be

manually added and removed using lsquommaddnodersquo and lsquommdelnodersquo commands In the event of

quorum node loss new quorum nodes would need to be designated In the event of NSD server

loss steps would need to be taken to reestablish optimal striping of data across newly created

NSDs (mmrestripefs) Users may want to automate certain functions listed above (automatic add

of new nodes to the cluster) while other administrative tasks (mmrestripefs) may be better off

being run manually during a maintenance window for example

With Regard to The Current Version of The AWS Spectrum Scale Trial Cloud Formation Template The Following Restrictions Exist

middot Protocol support including the use of Cluster Export Services (CES) nodes and protocol access such as Network File System (NFS) Object and Server Message Block (SMB)

middot Active File Management (AFM)

middot Transparent Cloud Tiering (TCT)

middot Compression

middot Encryption

middot Data Management API (DMAPI) support including Hierarchical Storage Management (HSM)

to tape

middot Hadoop Distributed File System (HDFS) connector support

middot Multi-cluster support (exporting an IBM Spectrum Scale file system from one Spectrum Scale cluster to another IBM Spectrum Scale cluster)

middot GUI

middot User name space management and quota management

middot Snapshots and clones

middot Replication is restricted to only 1X (IBM Spectrum Scale makes a single copy of all data and metadata) and 2X (IBM Spectrum Scale makes two copies of all data and metadata)

Additional Limitationsmiddot Using EBS volume encryption for IBM Spectrum Scale file systems is not supported

middot The archiving and restoring of IBM Spectrum Scale data through the use of AWS services is not supported

Many of the limitations above are a result of the edition packaged within the AMI (Standard) ATS has created custom cloud formation templates to reflect many of the design considerations called out in this document (third site quorum maximum datametadata replicas splitting of metadatadata volumes among others) in addition to creating our own AMI images using the AdvancedData Management edition of Spectrum Scale Features such as encryption compression and AFM are included in these editions As a next step we are looking to implement Protocol nodes within AWS using Active Directory replication as a means for authentication Implementing an S3 archive tier using Transparent Cloud Tiering would serve as an attractive option for those looking to leverage cheaper storage for archive purposes all within AWS We look forward to partnering with our customers to come up with the next generation of Spectrum Scale cluster implementations in the cloud

13

About the ATS Group

Since our founding in 2001 the ATS Group has consulted on thousands of system

implementations upgrades backups and recoveries We also support customers by

providing managed services performance analysis and capacity planning With over

60 industry-certified professionals we support SMBs Fortune 500 companies and

government agencies As experts in IBM VMware Oracle and other top vendors we are

experienced in virtualization storage area networks (SANs) high availability performance

tuning SDS enterprise backup and other evolving technologies that operate mission-critical

systems on premise in the cloud or in a hybrid environment

T H E AT S G R O U P C O M C O M PA N Y C O N TA C T

  1. Button 2
  2. Button 3

04

Notes Regarding Architecturemiddot The template creates a synchronous highly available Spectrum Scale cluster across two

availability zones but does not account for third site quorum A single site will always have

a majority quorum definition when an odd number of quorum nodes are specified using

this architecture

middot Only a single EBS volume type can be specified for use Spectrum Scale allows users to

split metadata from data Metadata is often placed on faster volumes (io1) for high response

during metadata lookups The cloud template places metadata and data on the same

volumes and sets the maximumdefault replicas to 1 or 2 based on user input

middot The maximum number of replicas for Spectrum Scale filesystems is 3 The template only

allows for 2 as only two availability zones are able to be specified during cluster creation

middot Autoscaling groups are created for the Bastion Server and Compute stacks but need to be

configured to take any action beyond satisfying the minimum number of nodes within each

stack For example a CPU used threshold needs to be user defined

middot There is no input for ldquoGPFS cluster namerdquo

05

The Following Inputs Were Provided to The Template in Order to Create a Test Cluster

A filesystem block size of 16M is selected Allowable values are 256k 512k 1M 2M 4M 8M 16M

The minimum number of NSD Servers and two Compute nodes are selected for testing purposes

06

VPC Private and Public CIDR block entries in the Network section of the form are prepopulated but can be changed if desired A user must select at least 2 availability zones and defined an External CIDR block

For the purposes of testing I have specified 00000 to allow all public traffic In an actual implementation you would specify a corporate network CIDR Block range

A key pair name S3 bucket and operator email must be supplied Other values can be modified but are prepopulated While the Bastion instance type can be changed it is simply a jumpadmin server and does not need to be configured with any substantial amount of resources

07

The following default options were taken

After review of the user supplied inputs the overall stack can be created The progress of each individual stack can be monitored within the AWS console See progression below with timestamps

08

Once the status for each stack listed above is ldquoCREATE_COMPLETErdquo the EC2 instances containing a fully functioning Spectrum Scale filesystem are accessible Creation time of each stack depends on the user supplied inputs In our example the entire process took ~10 minutes Instancecluster creation time increases as nodesnumber of disks increase

EC2 instance information by accessing the AWS console -gt EC2 Instances Our cluster consists of the following hosts

From a cluster node we can view the output of lsquommlsclusterrsquo and lsquommlsnsdrsquo to review the Cluster Name Repository Type Node names Node designation NSD names NSD to Filesystem allocations and NSD servers per NSD

A column titled ldquoPublic IPrdquo shows that only the LinuxBastion host has been assigned a public

address The Bastion host can be accessed using the ec2-user account and passing your

AWS key to the Bastion IP From there you can SSH to any ServerCompute node

09

Based on this output we can determine that one NSD server and one Compute node are

placed in the 1001X (Availability Zone us-east-2a) and the other NSD server and compute

node are placed in the 1003X private network (Availability Zone us-east-2b) Note that two

NSD servers and a single compute node are designated as quorum nodes A single 5GB

descOnly disk is allocated to compute node ip-10-0-1-132us-east-2computeinternal to serve

as a quorum disk however if availability zone us-east-2a were to go offline the filesystem

would be inaccessible due to loss of quorum

The script defines NSDs with a naming convention that includes availability zone in order

to make it much easier for user to determine which zone an NSD resides In this example

nsd_2a_1_0 is served out by NSD server ip-10-0-1-208us-east-2computeinternal from

availability zone us-east-2a

10

Total filesystem size is 20G (2 x 10G disk ndash one per NSD Server) with a replication factor of 2

for data and metadata The filesystem is created with the maximum number of replicas (3)

for data and metadata allowed by Spectrum Scale

Autoscaling groups can be viewed by navigating to Auto Scaling -gt Auto Scaling Groups with

the EC2 instance view

In our example three autoscaling groups are created for the Bastion Server and Compute

stacks Each stack has a minimum desired and maximum definition derived from the user

inputs supplied in the Cloud Formation template Other auto scaling rules tied to CPUMemory

utilization for example can be manually defined

A basic test of terminating a compute node demonstrates the functionality of the autoscaling groups that were created by the template

A compute node is termined

11

A new compute instance spins up to satisfy the rules of the Compute stack autoscaling group

The new instance is ready for use

We can now SSH to the server verify that its just been built and that the AMI containing Spectrum Scale packages has been used to build the instance However the GPFS daemon is not running

12

Attempts to run lsquommlsclusterrsquo on the newly created compute node show that it does not belong to a Spectrum Scale cluster The same command run on an existing NSD Server node verifies that the new node has not been added to the cluster The terminated compute node (ip-10-0-3-222us-east-2computeinternal) is still a member of the cluster configuration

At this time there is no functionality built into the templatestacks to automatically add and

remove newly generated instances from the cluster configuration Nodes would need to be

manually added and removed using lsquommaddnodersquo and lsquommdelnodersquo commands In the event of

quorum node loss new quorum nodes would need to be designated In the event of NSD server

loss steps would need to be taken to reestablish optimal striping of data across newly created

NSDs (mmrestripefs) Users may want to automate certain functions listed above (automatic add

of new nodes to the cluster) while other administrative tasks (mmrestripefs) may be better off

being run manually during a maintenance window for example

With Regard to The Current Version of The AWS Spectrum Scale Trial Cloud Formation Template The Following Restrictions Exist

middot Protocol support including the use of Cluster Export Services (CES) nodes and protocol access such as Network File System (NFS) Object and Server Message Block (SMB)

middot Active File Management (AFM)

middot Transparent Cloud Tiering (TCT)

middot Compression

middot Encryption

middot Data Management API (DMAPI) support including Hierarchical Storage Management (HSM)

to tape

middot Hadoop Distributed File System (HDFS) connector support

middot Multi-cluster support (exporting an IBM Spectrum Scale file system from one Spectrum Scale cluster to another IBM Spectrum Scale cluster)

middot GUI

middot User name space management and quota management

middot Snapshots and clones

middot Replication is restricted to only 1X (IBM Spectrum Scale makes a single copy of all data and metadata) and 2X (IBM Spectrum Scale makes two copies of all data and metadata)

Additional Limitationsmiddot Using EBS volume encryption for IBM Spectrum Scale file systems is not supported

middot The archiving and restoring of IBM Spectrum Scale data through the use of AWS services is not supported

Many of the limitations above are a result of the edition packaged within the AMI (Standard) ATS has created custom cloud formation templates to reflect many of the design considerations called out in this document (third site quorum maximum datametadata replicas splitting of metadatadata volumes among others) in addition to creating our own AMI images using the AdvancedData Management edition of Spectrum Scale Features such as encryption compression and AFM are included in these editions As a next step we are looking to implement Protocol nodes within AWS using Active Directory replication as a means for authentication Implementing an S3 archive tier using Transparent Cloud Tiering would serve as an attractive option for those looking to leverage cheaper storage for archive purposes all within AWS We look forward to partnering with our customers to come up with the next generation of Spectrum Scale cluster implementations in the cloud

13

About the ATS Group

Since our founding in 2001, the ATS Group has consulted on thousands of system implementations, upgrades, backups and recoveries. We also support customers by providing managed services, performance analysis and capacity planning. With over 60 industry-certified professionals, we support SMBs, Fortune 500 companies and government agencies. As experts in IBM, VMware, Oracle and other top vendors, we are experienced in virtualization, storage area networks (SANs), high availability, performance tuning, SDS, enterprise backup and other evolving technologies that operate mission-critical systems on premises, in the cloud or in a hybrid environment.

THEATSGROUP.COM | COMPANY | CONTACT


