1

Amazon Web Services Introduction

AWS IMMERSION DAY

Facilitators

Julia Knox, Baccarelli Lab Staff

Raj Chary, AWS Solutions Architect

Dana Koch, AWS Senior Account Executive - Education

January 31, 2019

2

AWS Immersion Day Schedule

Morning

● 10:00 am to 10:10 am - Welcome/Introduction
● Current setup/challenges
○ Current cluster setup
○ Ability to run jobs when needed
○ Backup and centralized storage
● 10:10 am to 10:40 am - Set the stage on AWS
● 10:40 am to 10:55 am - Audience Poll
○ Challenges and areas of focus
● 11:00 am to 12:00 pm - Demo (show and tell)
○ R computing
○ S3 for backup and storage

Afternoon

● 12:00 pm to 12:30 pm - Catered lunch
● 12:30 pm to 2:30 pm - Lab session: Demo/Show and Tell, 120 mins (Laptops Required)
■ R
■ S3
■ EC2
● 2:30 pm to 3:30 pm - Challenge (TBD)
● 3:30 pm to 4:00 pm - Final Wrap-up and Q&A

3

Welcome to AWS Immersion Day!

4

Current Set-up Challenges

5

HPC at Columbia

$ ssh user@login.c2b2.columbia.edu

$ qlogin -l mem=2G,time=:20:

$ cd /ifs/scratch/c2b2/my_lab/user

$ ./workhard

● Running a job on the cluster requires specifying memory and time

● Charged for compute time beyond 100K CPU hours and for storage

● Heavy dependence on the command line

7

AWS Workflow Examples

8

Introduction: AWS Computing Overview

AWS Computing Jobs:

● Run on virtual servers, known as instances, enabled by Amazon Elastic Compute Cloud (EC2)

● Approximately 4-minute launch time for instances

○ Instantly launch Amazon Machine Images (AMIs)

■ Images preconfigured to suit research needs

● Call objects from S3 (storage) on instances using RStudio

10

AWS Advantage: Time

Columbia Cluster Computing Jobs:

Compute the memory-factored CPU hours (MFCPU) using the formula N * max(1.0, M / 3G) * H.

Example: If you submit a job for 12G x 12 hours, and it runs for 10 hours, this would be:

(12G/3G) * 10 = 40 MFCPU

Monthly limit is 100K CPU hours
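The formula can be checked with a short calculation. This is only a sketch: the slide does not define N, M, or H, so the code assumes N is the number of slots (cores) requested, M is the requested memory in GB, and H is the number of hours the job actually ran, which matches the worked example. It also assumes the monthly limit is counted in MFCPU units. The function and variable names are made up for illustration.

```python
def mfcpu(n_slots, mem_gb, hours, mem_unit_gb=3.0):
    """Memory-factored CPU hours: N * max(1.0, M / 3G) * H.

    n_slots, mem_gb and hours are assumed meanings of N, M and H;
    the slide itself does not define them.
    """
    return n_slots * max(1.0, mem_gb / mem_unit_gb) * hours


MONTHLY_LIMIT = 100_000  # "100K CPU hours" from the slide

# Slide example: 12G requested, job runs for 10 hours, one slot assumed.
example = mfcpu(n_slots=1, mem_gb=12, hours=10)
print(example)  # 40.0

# A job under 3G is still charged a memory factor of 1.0.
print(mfcpu(n_slots=1, mem_gb=1, hours=10))  # 10.0

# Number of jobs like the slide example that fit under the monthly limit.
print(int(MONTHLY_LIMIT // example))  # 2500
```

Note that the charge depends on the hours the job actually ran (10), not the 12 hours requested.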

AWS Computing Jobs:

● Run on virtual servers, known as instances, enabled by Amazon Elastic Compute Cloud

● Instant launch of HPC clusters

● Scaling options

● Multiple payment choices

○ Enables you to scale by priority

■ E.g., with on-demand launching, you pay by the hour of compute time.

■ If you have a job that can tolerate interruptions, you can also bid on unused capacity, called “Spot Instances”, which normally cost 50% to 93% less than the on-demand price.
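To make the Spot discount concrete, the sketch below turns the quoted 50% to 93% range into a price range for a single job. The on-demand rate of $1.00 per hour and the 10-hour job are hypothetical numbers chosen for easy arithmetic, not actual AWS prices, and the function names are illustrative.

```python
def spot_price_range(on_demand_hourly, min_discount=0.50, max_discount=0.93):
    """Return (lowest, highest) hourly Spot price implied by the discount range."""
    lowest = on_demand_hourly * (1.0 - max_discount)
    highest = on_demand_hourly * (1.0 - min_discount)
    return lowest, highest


def job_cost(hourly_rate, hours):
    """Cost of a job billed by the hour at a fixed rate."""
    return hourly_rate * hours


ON_DEMAND_RATE = 1.00  # hypothetical USD per hour, not a real AWS price
HOURS = 10             # hypothetical job length

low, high = spot_price_range(ON_DEMAND_RATE)
print(f"On-demand: ${job_cost(ON_DEMAND_RATE, HOURS):.2f}")
print(f"Spot:      ${job_cost(low, HOURS):.2f} to ${job_cost(high, HOURS):.2f}")
```

The trade-off is that a Spot job can be interrupted, so this comparison only applies to work that can be restarted or resumed.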

11

AWS Advantage: Cost

Elasticity/Scalability Options

Pay for what you use

Lab member data needs (based on recent survey)

Estimated costs: approximately $1,000 per year or less

12

AWS Advantage: Data Transfers

Lab member data transfer needs (based on recent survey):

Amazon AWS

● Allow a researcher from a different organization to access your AWS environment while maintaining control of what data and systems they can see and what they can do within the AWS account.

● This cross-account access is enabled by AWS Identity and Access Management (IAM).

Columbia HPC

● Hard drive

● Does not enable API-enabled instance access for file sharing

14

Audience Poll

What is your biggest challenge with...

○ ...Data organization?

○ ...Sharing data?

○ ...Receiving data?

○ ...Performing large analyses?

15

AWS System

16

AWS Reference In Development - GitHub Page

18

S3 for Backup and Storage

Example S3 Bucket from Baccarelli Lab AWS Set-up

This is Allison’s bucket. Only she and the people written into her “bucket policy” have access to its objects.
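A bucket policy is a JSON document attached to the bucket. The sketch below builds a minimal read-only policy as a Python dictionary to show its general shape. The bucket name, the account ID 111122223333, and the user name are placeholders, and the lab's actual policy is not shown in the slides, so this is an illustration rather than the policy in use.

```python
import json


def read_only_bucket_policy(bucket, principal_arns):
    """Build a bucket policy letting the listed IAM principals read objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowListedPrincipalsToReadObjects",
                "Effect": "Allow",
                # IAM users or roles that are "written into" the policy.
                "Principal": {"AWS": principal_arns},
                "Action": ["s3:GetObject"],
                # Applies to every object inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Placeholder bucket and principal, not the lab's real names.
policy = read_only_bucket_policy(
    bucket="example-lab-bucket",
    principal_arns=["arn:aws:iam::111122223333:user/example-collaborator"],
)
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of text that would be pasted into the bucket's policy editor; anyone not listed (other than the bucket owner's own account) gets no access from this policy.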

20

Catered Lunch

22

Lab Session - Laptops Required

■ R

■ S3

■ EC2

24

Challenge

27

R and RStudio Computing

● Accessing code from S3 buckets in an EC2 instance

● Running an AMI-configured EC2 instance with RStudio
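As a rough sketch of the first item, the helper below downloads a script from an S3 bucket onto the instance's local disk. It is written in Python rather than R purely for illustration; the session itself used RStudio. The function name, bucket, and object key are hypothetical. The S3 client is passed in as an argument, and the only call made on it is `download_file(Bucket, Key, Filename)`, which the boto3 S3 client provides.

```python
from pathlib import Path


def fetch_script(s3_client, bucket, key, dest_dir="."):
    """Download one object from S3 and return the local path it was saved to."""
    # Save under dest_dir using the last component of the object key.
    dest = Path(dest_dir) / Path(key).name
    dest.parent.mkdir(parents=True, exist_ok=True)
    s3_client.download_file(bucket, key, str(dest))
    return dest


# Usage on an EC2 instance with boto3 installed and credentials available
# (for example through a role attached to the instance):
#
#   import boto3
#   s3 = boto3.client("s3")
#   path = fetch_script(s3, "example-lab-bucket", "scripts/analysis.R", "work")
#   print(path)
```

Once the file is on the instance it can be opened and run from RStudio like any local script.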

28

Q & A Session

Raj Chary, AWS Solutions Architect

Dana Koch, AWS Senior Account Executive - Education

29

Fun Activity

● Groups of 3-4

● Simple Questions

● Using Echo

30

Thanks, Everyone!