Date posted: 27-Jul-2015
Category: Data & Analytics
Uploaded by: alex-coqueiro
Alex Coqueiro, Solutions Architect, Amazon Web Services
Consumer Business
Millions of active customers
Global operations in many countries around the world
Seller Business
Sales on Amazon's websites
Technology built on Amazon's own retail network
Leverage of integrated fulfillment centers
Cloud Business
Cloud infrastructure for hosting enterprise applications
Hundreds of thousands of customers in more than 190 countries
A broad set of computing resources that lets companies move faster
CLOUD
Why do researchers love using AWS?
Time to Science: Access research infrastructure in minutes
Globally Accessible: Easily collaborate with researchers around the world
Low Cost: Pay-as-you-go pricing
Secure: A collection of tools to protect data and privacy
Elastic: Easily add or remove capacity
Scalable: Access to effectively limitless capacity
Popular HPC workloads on AWS
Genome processing
Modeling and Simulation
Government and Educational Research
Monte Carlo Simulations
Transcoding and Encoding
Computational Chemistry
A marketplace for software in the Cloud
Over 1,900 listings across 23 categories
Customers run over 70M hours of software per month
AWS Marketplace – HPC category
aws.amazon.com/marketplace
AWS Public Data Sets
aws.amazon.com/marketplace
Free for everyone
AWS Curriculum
http://aws.amazon.com/certification/
Over 1 million active customers across 190 countries
800+ government agencies
3,000+ educational institutions
11 regions
28 availability zones
52 edge locations
Every day, AWS adds enough new server capacity to support Amazon.com when it was a $7 billion global enterprise.
Availability Zone A
Availability Zone B
Availability Zone C
Region
Customer Decides Where Applications and Data Reside
Note: Conceptual drawing only. The number of Availability Zones may vary.
Enterprise Applications
  Virtual Desktops
  Collaboration and Sharing
Platform Services
  Databases: Caching, Relational, NoSQL
  Analytics: Hadoop, Real-time, Data Workflows, Data Warehouse
  App Services: Queuing, Orchestration, App Streaming, Transcoding, Search
  Deployment & Management: Containers, DevOps Tools, Resource Templates, Usage Tracking, Monitoring and Logs
  Mobile Services: Identity, Sync, Mobile Analytics, Notifications
Foundation Services
  Compute (VMs, Auto-scaling and Load Balancing)
  Storage (Object, Block and Archive)
  Security & Access Control
  Networking
Infrastructure
  Regions
  CDN and Points of Presence
  Availability Zones
Compute Analytics Databases Storage
Imaging data
Phenotypes & comparative analysis
Upstream analysis Data mining
Amazon EC2
• Resizable compute capacity in >25 instance types
• Reduces the time required to obtain and boot new server instances to minutes or seconds
• Scale capacity as your computing requirements change
• Pay only for capacity that you actually use
• Choose Linux or Windows
• Deploy across Regions and Availability Zones for reliability
• Support for virtual network interfaces that can be attached to EC2 instances in your VPC
General Purpose
(Burstable or Fixed Performance)
Compute Optimized
Memory Optimized
GPU Instances
Storage Optimized
Compute Optimized
Name        vCPU  Memory (GiB)  Network
c4.large    2     3.75          Moderate
c4.xlarge   4     7.5           Moderate
c4.2xlarge  8     15            High
c4.4xlarge  16    30            High
c4.8xlarge  36    60            10 Gbps
Storage Optimized
Name        vCPU  Memory (GiB)  Network   HDD (GB)
d2.xlarge   4     30.5          Moderate  3 x 2000
d2.2xlarge  8     61            High      6 x 2000
d2.4xlarge  16    122           High      12 x 2000
d2.8xlarge  36    244           10 Gbps   24 x 2000
GPU Optimized
Intel Xeon E5-2670 (Sandy Bridge)
15 GB or 60 GB RAM
NVIDIA GRID K520 GPU: 1,536 cores and 4 GB of memory per GPU
Name        GPU  vCPU  Memory (GiB)  Network  SSD (GB)
g2.2xlarge  1    8     15            High     1 x 60
g2.8xlarge  4    32    60            10 Gbps  2 x 120
Demo
Scale using Elastic Capacity
Time +00h: <10 cores
Time +24h: >1500 cores
Time +72h: <10 cores
Time +120h: >600 cores
Demo
Reserved: Make a low, one-time payment and receive a significant discount on the hourly charge. For committed utilization.
Free Tier: Get started on AWS with free usage and no commitment. For POCs and getting started.
On-Demand: Pay for compute capacity by the hour with no long-term commitments. For spiky workloads, or to define needs.
Spot: Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand. For time-insensitive or transient workloads.
Dedicated: Launch instances within Amazon VPC that run on hardware dedicated to a single customer. For highly sensitive or compliance-related workloads.
AWS Spot Market: Achieving economies of scale
[Chart: capacity over time, from 0% to 100%, split across Reserved capacity, On-Demand and Spot]
* Prices on April 17, 2015
aws autoscaling create-launch-configuration --launch-configuration-name spotlc-5cents --image-id ami-e565ba8c --instance-type d2.2xlarge --spot-price "0.25"
aws autoscaling create-auto-scaling-group --auto-scaling-group-name spotasg --launch-configuration-name spotlc-5cents --availability-zones us-east-1a us-east-1b --max-size 16 --min-size 1 --desired-capacity 3
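The Spot model described earlier (bid for unused capacity, pay the fluctuating Spot Price) can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the hourly price series is hypothetical, an instance is assumed to run for a whole hour whenever the Spot Price is at or below the bid, and interruption and relaunch details are ignored. The bid of 0.25 matches the `--spot-price` value in the launch configuration above.

```python
from decimal import Decimal


def simulate_spot(bid, hourly_spot_prices):
    """Return (hours_run, total_cost) for one instance under a fixed bid.

    Simplified model: the instance runs during every hour in which the
    Spot Price is at or below the bid, and each such hour is charged at
    the Spot Price (not at the bid).
    """
    hours_run = 0
    total_cost = Decimal("0")
    for price in hourly_spot_prices:
        if price <= bid:
            hours_run += 1
            total_cost += price
    return hours_run, total_cost


# Hypothetical hourly Spot Prices in USD, for illustration only.
example_prices = [Decimal(p) for p in ("0.10", "0.12", "0.30", "0.20", "0.26", "0.08")]

hours, cost = simulate_spot(Decimal("0.25"), example_prices)
print("hours run:", hours, "total cost:", cost)
```

Decimal is used instead of float so that summed prices stay exact.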
http://aws.amazon.com/cli/
Demo
[Diagram: an AWS region containing VPC 10.0.0.0/16 across AZ-A and AZ-B, with subnets 10.0.1.0/24 and 10.0.2.0/24, an Internet gateway, a VPC endpoint to a service, and nodes labeled M, E and S]
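The address plan in the diagram (a 10.0.0.0/16 VPC with subnets 10.0.1.0/24 and 10.0.2.0/24) can be sanity-checked before anything is created. The sketch below uses only Python's standard `ipaddress` module; the helper name `validate_subnets` is hypothetical and not part of any AWS tool.

```python
import ipaddress


def validate_subnets(vpc_cidr, subnet_cidrs):
    """Check that every subnet lies inside the VPC and that none overlap.

    Returns the parsed subnet networks, or raises ValueError on a bad plan.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            raise ValueError(f"{subnet} is outside VPC range {vpc}")
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                raise ValueError(f"{first} overlaps {second}")
    return subnets


# CIDR blocks taken from the diagram.
checked = validate_subnets("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"])
for subnet in checked:
    print(subnet, "holds", subnet.num_addresses, "addresses")
```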
Demo
Storage Database Compute
Cloud automation allows for security agility
• "Programmable infrastructure" allows you to automate every aspect of your environment.
• Security properties are "baked in," constantly checked via logging and auditing, and deviations / alarms are actionable via code.
• Change and speed of change become an asset, not a liability.
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 replace-route --route-table-id $ROUTE_TABLE_ID \
    --destination-cidr-block 0.0.0.0/0 \
    --instance-id $INSTANCE_ID
aws ec2 attach-network-interface --network-interface-id $ENI \
    --instance-id $INSTANCE_ID \
    --device-index 1
aws ec2 assign-private-ip-addresses --network-interface-id $ENI \
    --private-ip-addresses 10.0.0.100
• AWS CLI
#!/bin/sh
export AWS_DEFAULT_REGION="us-east-1"
# Create a VPC and one subnet, capturing the new IDs with --query
VPC_ID=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 --query 'Vpc.VpcId' --output text)
SUBNET_ID=$(aws ec2 create-subnet --vpc-id "$VPC_ID" --cidr-block 10.0.1.0/24 --query 'Subnet.SubnetId' --output text)
echo "Created $VPC_ID & $SUBNET_ID"
# Clean up
aws ec2 delete-subnet --subnet-id "$SUBNET_ID"
aws ec2 delete-vpc --vpc-id "$VPC_ID"
#!/usr/bin/python
import boto.vpc

# Connect to the VPC service in one region
region = "us-east-1"
conn = boto.vpc.connect_to_region(region)

# Create a VPC and one subnet inside it
vpc = conn.create_vpc("10.0.0.0/16")
subnet = conn.create_subnet(vpc.id, "10.0.1.0/24")
print("Created " + vpc.id + " & " + subnet.id)

# Clean up
conn.delete_subnet(subnet.id)
conn.delete_vpc(vpc.id)
• Amazon SDK
"Resources" : {
  "VPC" : {
    "Type" : "AWS::EC2::VPC",
    "Properties" : {
      "CidrBlock" : "10.0.0.0/16",
      "Tags" : [ { "Key" : "Name", "Value" : "VPCName" } ]
    }
  },
  "PublicSubnet" : {
    "Type" : "AWS::EC2::Subnet",
    "Properties" : {
      "VpcId" : { "Ref" : "VPC" },
      "CidrBlock" : "10.0.1.0/24",
      "Tags" : [ { "Key" : "Network", "Value" : "Public" } ]
    }
  }
}
• AWS CloudFormation
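Because a CloudFormation template is plain JSON, it can also be generated from code, which is one way to treat infrastructure as programmable. The sketch below rebuilds the VPC and subnet resources shown above with Python's standard `json` module. The function name `build_template` and its parameters are hypothetical; the `AWSTemplateFormatVersion` line is added here and is not part of the fragment above.

```python
import json


def build_template(vpc_cidr, subnet_cidr, vpc_name):
    """Build a CloudFormation template with one VPC and one subnet as a dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "VPC": {
                "Type": "AWS::EC2::VPC",
                "Properties": {
                    "CidrBlock": vpc_cidr,
                    "Tags": [{"Key": "Name", "Value": vpc_name}],
                },
            },
            "PublicSubnet": {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    # Ref ties the subnet to the VPC declared in this template.
                    "VpcId": {"Ref": "VPC"},
                    "CidrBlock": subnet_cidr,
                    "Tags": [{"Key": "Network", "Value": "Public"}],
                },
            },
        },
    }


template = build_template("10.0.0.0/16", "10.0.1.0/24", "VPCName")
template_json = json.dumps(template, indent=2)
print(template_json)
```

The printed JSON could then be saved to a file and passed to CloudFormation; that step is left out because it needs AWS credentials.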
Demo
Try out our HPC CloudFormation-based demo
CfnCluster (“CloudFormation cluster”)
Command Line Interface Tool
Deploy and demo an HPC cluster
For more info:
https://aws.amazon.com/hpc/cfncluster
Facilities
Physical security
Compute infrastructure
Storage infrastructure
Network infrastructure
Virtualization layer (EC2)
Hardened service endpoints
Rich IAM capabilities
Network configuration
Security groups
OS firewalls
Operating systems
Applications
Proper service configuration
Auth & acct management
Authorization policies
• Re-focus your security professionals on a subset of the problem
• Take advantage of high levels of uniformity and automation
Customer/Partner Audited
Web Tier
Application Tier
Database Tier
Ports 80 and 443
Engineering team with SSH
All other access blocked
Analytical data access
Amazon EC2 Security Group Firewall
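The security group behavior on this slide (allow ports 80 and 443, allow SSH for the engineering team, block everything else) amounts to an allow list with an implicit deny. The sketch below models that logic locally with the standard `ipaddress` module. It is not an AWS API call, and the engineering address range 203.0.113.0/24 is a hypothetical placeholder taken from the documentation address space.

```python
import ipaddress

# Inbound allow rules as (port, source CIDR). Anything not listed is denied.
WEB_TIER_RULES = [
    (80, "0.0.0.0/0"),        # HTTP from anywhere
    (443, "0.0.0.0/0"),       # HTTPS from anywhere
    (22, "203.0.113.0/24"),   # SSH from the engineering team (hypothetical range)
]


def is_allowed(rules, port, source_ip):
    """Return True if some allow rule matches the port and source address."""
    address = ipaddress.ip_address(source_ip)
    return any(
        port == rule_port and address in ipaddress.ip_network(cidr)
        for rule_port, cidr in rules
    )


print(is_allowed(WEB_TIER_RULES, 443, "198.51.100.7"))   # public HTTPS request
print(is_allowed(WEB_TIER_RULES, 22, "198.51.100.7"))    # SSH from outside the team range
print(is_allowed(WEB_TIER_RULES, 22, "203.0.113.10"))    # SSH from the team range
```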
Rich control with AWS’s powerful Identity & Access Management capabilities
Authentication:
• Multiple options including rich SAML federation capabilities, MFA, web identities
• Clean separation of identity from proof of identity
• Roles are powerful and flexible pseudo-principals that can be assumed by other identities
  • Federation scenarios
  • Cross-account access
Network isolation with Virtual Private Cloud
• Define your own address space as an extension of your private network
• Connect to your private network with a VPN tunnel or Direct Connect
• Configure Security Groups (virtual firewalls) for all EC2 instances; update fleet firewall rules with a single API call
• Configure Network Access Control Lists for subnet-level isolation and control
Enhanced isolation and control with encryption
• Automatic encryption with managed keys (Key Management Service)
• Dedicated hardware security modules (Cloud HSM)
• Bring and use your own keys
Encrypt your data prior to sending to AWS
Your applications in your data center
Your applications in Amazon EC2
Encrypted Data
AWS Services
S3 Glacier Redshift EBS
Encryption Primer
[Diagram: plaintext PHI is encrypted in hardware or software with a symmetric data key, producing encrypted PHI that is kept as encrypted data in storage. Key hierarchy: a master key encrypts the symmetric data key, producing an encrypted data key.]
S3 Client-Side Encryption
Amazon S3 Encryption Client with AWS SDKs
Your key management infrastructure
Your applications in your data center
Your key management infrastructure in EC2
Your Encrypted Data in Amazon S3
Your application in Amazon EC2
AWS SDK with S3 Encryption Client
S3 SSE with Customer Provided Keys Works
Plaintext PHI
Encrypted Data
Customer Provided Key S3 Web Server
HTTPS Customer
PHI
S3 Storage Fleet
• Key is used at S3 server, then deleted
• Customer must provide same key when downloading to allow S3 to decrypt data
Customer Provided Key
S3 SSE with AWS fully managed keys
Plaintext PHI
Encrypted PHI
Symmetric Data Key S3 Web Server
HTTPS Customer
PHI
Encrypted Data Key
Master Key Symmetric Data Key
S3 Storage Fleet
A master key managed by the S3 service and protected by systems internal to AWS in a distinct system
Amazon S3
• HTTPS
• AES-256 server-side encryption
• AWS or customer provided or customer managed keys
• Each object gets its own key
Amazon EBS
• End-to-end secure network traffic
• Whole volume encryption
• AWS or customer managed keys
• Encrypted incremental snapshots
• Minimal performance overhead (utilizes Intel AES-NI)
Integrated with AWS IAM Console
Integrated with Amazon EBS
How AWS Services Integrate with KMS
• 2-tiered key hierarchy using envelope encryption
• Data keys encrypt customer data
• KMS customer master keys encrypt data keys
• Benefits:
  • Limits blast radius of compromised resources and their keys
  • Better performance
  • Easier to manage a small number of master keys than billions of resource keys
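The two-tier hierarchy above can be sketched in a few lines: a fresh data key encrypts the data, and the master key encrypts only the data key. To stay runnable with the standard library alone, the sketch uses a toy XOR keystream built from SHA-256 as a stand-in cipher. That stand-in is an assumption for illustration and is NOT secure; a real system would use AES through KMS or a vetted cryptography library. All function names here are hypothetical.

```python
import hashlib
import os


def toy_cipher(key, data):
    """XOR data with a SHA-256 counter keystream.

    Illustrative stand-in for a real cipher such as AES; NOT secure.
    XOR is its own inverse, so the same call encrypts and decrypts.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def envelope_encrypt(master_key, plaintext):
    """Encrypt data with a new data key, then wrap that key with the master key."""
    data_key = os.urandom(32)
    ciphertext = toy_cipher(data_key, plaintext)
    encrypted_data_key = toy_cipher(master_key, data_key)
    # Only the wrapped data key is stored next to the ciphertext.
    return encrypted_data_key, ciphertext


def envelope_decrypt(master_key, encrypted_data_key, ciphertext):
    """Unwrap the data key with the master key, then decrypt the data."""
    data_key = toy_cipher(master_key, encrypted_data_key)
    return toy_cipher(data_key, ciphertext)


master_key = os.urandom(32)
message = b"example record"
encrypted_key, ciphertext = envelope_encrypt(master_key, message)
recovered = envelope_decrypt(master_key, encrypted_key, ciphertext)
print(recovered == message)
```

Each call to `envelope_encrypt` draws a new data key, so a leaked data key exposes only the one object it protected, which is the "blast radius" benefit listed above.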
Master Key(s)
Data Key 1
S3 Object EBS Volume
RDS Instance
Redshift Cluster
Data Key 2 Data Key 3 Data Key 4 Data Key 5
Your Application
Keys encrypted
Data encrypted
KMS
aws.amazon.com/hpc
http://bit.ly/aws-dbgap
Architecting for Genomic Data Security and Compliance in AWS