© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Puneet Agarwal – AWS Solutions Architect
Steve Abraham – AWS Solutions Architect
Mario Kostelac – Intercom Product Engineer
November 30, 2016
Amazon Aurora Best Practices: Getting the Best Out of Your Databases
DAT301
What to Expect from the Session
• Migration best practices
• Performance best practices
• Real-time reporting and analytics
• Concurrent event stores
• Integration with AWS services
• Welcome Intercom!
Amazon Aurora
• MySQL-compatible relational
database
• Performance and availability of
commercial databases
• Simplicity and cost effectiveness of
open source databases
• Delivered as a managed service
Fastest growing service in AWS history
Best practices: Migrations
Amazon Aurora migration options
• MySQL on RDS: console-based automated snapshot ingestion, with catch-up via binlog replication.
• MySQL on EC2 or on-premises: binary snapshot ingestion through S3, with catch-up via binlog replication.
• Oracle or SQL Server on EC2, on-premises, or RDS: schema conversion using SCT and data migration via DMS.
DB snapshot migration
• One-click migration from RDS MySQL 5.6 to Aurora
• Automatic conversion from MyISAM to InnoDB
• Most migrations take <1 hr; longer for large databases
• One-click replication from RDS MySQL to Amazon Aurora
[Diagram: RDS MySQL master/slave → DB snapshot → one-click migrate → new Aurora cluster]
Migrating from self-managed MySQL to Aurora
• Import MySQL snapshots into Aurora through S3:
1) Execute Percona XtraBackup
2) Upload the database snapshot to S3
3) Import the snapshot from S3 into the Aurora cluster
4) Set up logical replication to catch up
5) Transition the database workload to the Aurora cluster
• Faster migration for large databases (1+ TB)
• Import schema and data in one operation
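The import step of the flow above maps onto the RDS RestoreDBClusterFromS3 API. As a minimal sketch (not the exact console flow), the helper below assembles the parameters for boto3's `restore_db_cluster_from_s3` call; the cluster name, bucket, prefix, role ARN, and engine version used here are placeholder assumptions.

```python
def build_restore_params(cluster_id, bucket, prefix, role_arn,
                         master_user, master_password):
    """Assemble parameters for rds.restore_db_cluster_from_s3().

    All identifier values are illustrative placeholders; the IAM role
    must allow RDS to read the XtraBackup files in the S3 bucket.
    """
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": "aurora",                  # Aurora MySQL-compatible
        "MasterUsername": master_user,
        "MasterUserPassword": master_password,
        "SourceEngine": "mysql",             # snapshot was taken with XtraBackup
        "SourceEngineVersion": "5.6.29",     # assumed source version
        "S3BucketName": bucket,
        "S3Prefix": prefix,
        "S3IngestionRoleArn": role_arn,
    }
```

The resulting dict can be passed as keyword arguments to `boto3.client("rds").restore_db_cluster_from_s3(...)`; once the cluster is up, logical (binlog) replication handles the catch-up phase.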
Demo: Restore from S3
Best practices – binary snapshot ingestion
• File splitting and compression recommended for 1+ TB databases
• Supported file compression formats:
1. Gzip (.gz)
2. Percona xbstream (.xbstream)
• Sample compression and splitting command:
innobackupex --user=myuser --password=<password> --stream=tar \
  /mydata/s3-restore/backup | gzip | split -d --bytes=512000 \
  - /mydata/s3-restore/backup3/backup.tar.gz
• Import separately:
1. User accounts/passwords
2. Functions
3. Stored procedures
Oracle, SQL Server to Aurora Migration
AWS Database Migration Service:
• Data copy: Existing data is copied from source tables to tables on the target.
• Change data capture and apply: Changes to data on the source are captured while the tables are loaded. Once the load is complete, buffered changes are applied to the target. Additional changes captured on the source are applied to the target until the task is stopped or terminated.
AWS Schema Conversion Tool:
• Assessment report: SCT analyzes the source database and produces a report with a recommended target engine and information on automatic and manual conversions.
• Code browser and recommendation engine: Highlights places that require manual edits and provides architectural and design guidance.
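The load-then-capture behavior described above corresponds to a DMS replication task of type `full-load-and-cdc`. As an illustrative sketch (the endpoint/instance ARNs, schema name, and rule names are placeholders), the helper below builds the parameters for boto3's `dms.create_replication_task()`:

```python
import json

def build_dms_task_params(task_id, source_arn, target_arn, instance_arn,
                          schema="MYSCHEMA"):
    """Assemble parameters for dms.create_replication_task().

    "full-load-and-cdc" performs the initial data copy, then applies the
    changes captured during the load, matching the two phases above.
    """
    table_mappings = {
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all-tables",   # illustrative rule name
            "object-locator": {"schema-name": schema, "table-name": "%"},
            "rule-action": "include",
        }]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": "full-load-and-cdc",
        "TableMappings": json.dumps(table_mappings),  # DMS expects a JSON string
    }
```

The dict can be passed to `boto3.client("dms").create_replication_task(...)` once the source and target endpoints and the replication instance exist.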
Demo: Oracle to Aurora
migration
Migration best practices
• Use the right migration approach for your use case
• Test, migrate, test again!
• Consolidate shards on Aurora
• Schema conversion
• Schema optimization post conversion
• For tables with wide text columns, enable DYNAMIC row format
• Primary key implementation differences
Best practices: Performance
Aurora performance characteristics
Optimized for highly concurrent random access (OLTP)
Performance scales with number of connections
Consistent performance as number of
databases/schemas/tables increase
Consistent performance with increasing number of
Aurora read-replicas
Low replication lag
Performance testing
https://d0.awsstatic.com/product-marketing/Aurora/RDS_Aurora_Performance_Assessment_Benchmarking_v1-2.pdf
1. Create an Amazon VPC (or use an existing one).
2. Create four EC2 r3.8xlarge client instances to run the SysBench client. All four should be in the same AZ.
3. Enable enhanced networking on your clients.
4. Tune your Linux settings (see whitepaper).
5. Install SysBench version 0.5.
6. Launch an r3.8xlarge Amazon Aurora DB instance in the same VPC and AZ as your clients.
7. Start your benchmark!
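The last step boils down to running SysBench against the cluster endpoint. As a hedged sketch (the oltp.lua path and the table/thread counts are assumptions, not the whitepaper's exact values), a helper that assembles a SysBench 0.5 command line:

```python
def sysbench_oltp_cmd(host, user, password, table_size=250000, tables=250,
                      threads=1000, seconds=600, phase="run"):
    """Assemble a SysBench 0.5 OLTP command line as a list of argv tokens.

    The oltp.lua path and workload sizes are illustrative; adjust them to
    your SysBench installation and dataset.
    """
    return [
        "sysbench",
        "--test=/usr/share/doc/sysbench/tests/db/oltp.lua",  # path may vary
        "--mysql-host=%s" % host,
        "--mysql-user=%s" % user,
        "--mysql-password=%s" % password,
        "--oltp-table-size=%d" % table_size,
        "--oltp-tables-count=%d" % tables,
        "--num-threads=%d" % threads,
        "--max-time=%d" % seconds,
        "--max-requests=0",   # run for max-time rather than a request count
        phase,                # "prepare", "run", or "cleanup"
    ]
```

Run the `prepare` phase once to build the tables, then launch the `run` phase from all four client instances (e.g., via `subprocess.run`).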
Performance Best Practices
MySQL/RDBMS practices still apply:
• Choose the right tool for the right job (OLAP vs. OLTP)
• Create appropriate indexes
• Tune your SQL code; use explain plans and the performance schema
Leverage high concurrency:
• Aurora throughput increases with the number of connections
• Architect your applications to leverage high concurrency in Aurora
Read scaling:
• Aurora offers read replicas with virtually no replication lag
• Leverage multiple read replicas to distribute your reads
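One way to distribute reads is to route plain SELECTs to the cluster's reader endpoint and everything else to the cluster (writer) endpoint. The sketch below is a simplified illustration; the endpoint names are placeholders, and real applications typically do this in the connection layer rather than per statement:

```python
def pick_endpoint(sql,
                  writer="my-cluster.cluster-xyz.us-east-1.rds.amazonaws.com",
                  reader="my-cluster.cluster-ro-xyz.us-east-1.rds.amazonaws.com"):
    """Route plain SELECTs to the reader endpoint, everything else to the writer.

    Endpoint hostnames are placeholders. Aurora's reader endpoint
    load-balances connections across the available read replicas.
    """
    statement = sql.lstrip().lower()
    # SELECT ... FOR UPDATE takes locks, so it must go to the writer.
    is_read = statement.startswith("select") and " for update" not in statement
    return reader if is_read else writer
```

Because replication lag is minimal, read-after-write anomalies are rare, but latency-sensitive read-your-own-writes paths can still be pinned to the writer.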
Performance Best Practices
Parameter tuning:
• No need to migrate your performance-related MySQL parameters to Aurora
• Aurora parameter groups are pre-tuned and already optimal in most cases
Performance comparison:
• Don’t obsess over individual metrics (CPU, IOPS, IO throughput)
• Focus on what matters: application performance
Other best practices:
• Keep the query cache on
• Leverage CloudWatch metrics
Best Practices: Real-Time Reporting & Analytics
Scenario
• Travel & Booking Industry
• Live Contextual Product Recommendations
• Near Real-Time Reporting
• ~700+ Users
• ~8 TB Dataset
• Usage cycles over 24 hour period
• Cost Considerations
[Diagram: application users → database engine → storage backend]
Challenges – Original Design
[Diagram: application users → load-balanced DNS endpoint → Aurora cluster]
Solutions – New Design
1. A scheduled job reads instance load metrics and calls Lambda.
2. The Lambda function applies logic and calls the RDS API.
3. Cluster instances are added, removed, or resized.
4. The desired scale is achieved.
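The "applies logic" step in the flow above can be as simple as a threshold rule on replica load. A minimal sketch, assuming a CPU-based policy (the 70%/20% thresholds are illustrative, not values from the talk; the 15-replica ceiling is Aurora's documented maximum):

```python
def scaling_decision(avg_cpu_pct, replica_count, min_replicas=1,
                     max_replicas=15, high=70.0, low=20.0):
    """Decide whether to add or remove an Aurora read replica.

    avg_cpu_pct is the fleet-average CPU from CloudWatch; thresholds are
    illustrative. Aurora supports up to 15 read replicas per cluster.
    """
    if avg_cpu_pct > high and replica_count < max_replicas:
        return "add-replica"        # scale out via CreateDBInstance
    if avg_cpu_pct < low and replica_count > min_replicas:
        return "remove-replica"     # scale in via DeleteDBInstance
    return "no-change"
```

The returned action would then be translated into the corresponding RDS API call; in practice you would also add cooldown periods so the fleet does not flap between sizes.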
Solutions – Fleet Scaling
Best practices: Massively
Concurrent Event Stores
Scenario
• Gaming Industry
• Millions of RPS
• Consistent Latency
• Cost Considerations
NoSQL (performance, steady state)
[Diagram: user applications → partitioned NoSQL table]
Optimal performance under moderate load
NoSQL (performance, “hot” state)
[Diagram: user applications → partitioned NoSQL table]
Degraded performance under heavy load
Aurora implementation (performance)
[Diagram: user applications → Aurora cluster]
Consistent performance under load
NoSQL implementation (cost)
[Diagram: user applications → partitioned NoSQL table]
Each read/write billed separately
Aurora implementation (cost)
[Diagram: user applications → Aurora cluster]
Most operations served from memory; only a small portion of IO requests billed
Cost-efficient storage
Best practices: AWS Service
Integrations
Event Driven Data Pipeline
[Diagram: AWS services generating data → Amazon S3 data lake → Amazon Aurora (load from S3); arrows distinguish data flow from call/notification flow]
1. An S3 event triggers a Lambda call.
2. Lambda calls Aurora to load the new data from S3.
3. When the S3 load completes, a notification is delivered.
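In this pipeline the Lambda function issues the Aurora-specific LOAD DATA FROM S3 statement against the cluster (the cluster needs an IAM role, referenced by the `aurora_load_from_s3_role` cluster parameter, that can read the bucket). A small sketch that builds the statement; the bucket, key, and table names below are placeholders:

```python
def load_from_s3_sql(bucket, key, table, field_sep=",", line_sep="\\n"):
    """Build the Aurora MySQL LOAD DATA FROM S3 statement for a CSV object.

    Assumes a plain CSV layout; real pipelines would also handle quoting,
    character sets, and column lists.
    """
    return (
        "LOAD DATA FROM S3 's3://%s/%s' "
        "INTO TABLE %s "
        "FIELDS TERMINATED BY '%s' "
        "LINES TERMINATED BY '%s'" % (bucket, key, table, field_sep, line_sep)
    )
```

The Lambda handler would execute this statement over a normal MySQL connection to the cluster endpoint (e.g., with a MySQL driver such as PyMySQL).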
Event Driven Audit Notification
1. A user modifies a monitored table.
2. A table trigger invokes Lambda.
3. The Lambda function applies logic.
4. A security notification is delivered.
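The trigger-to-notification path above can be sketched as a tiny Lambda handler. The event shape (user/action/table keys) is an assumption about what the table trigger passes along, and `publish` stands in for an SNS publish call:

```python
def handler(event, context=None, publish=print):
    """Format and deliver a security notification for a table change.

    The event keys are assumed; `publish` is injectable so the SNS call
    can be swapped in (e.g., a partial over sns_client.publish).
    """
    message = "Audit: %s ran %s on table %s" % (
        event.get("user", "unknown"),
        event.get("action", "unknown"),
        event.get("table", "unknown"),
    )
    publish(message)       # e.g., SNS topic delivering to security on-call
    return message
```

Keeping the formatting logic separate from the delivery call makes the handler unit-testable without an AWS account.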
Demo: Lambda Integration
[Diagram: Aurora cluster → AWS Lambda → Amazon SQS, Amazon SNS, Amazon CloudWatch → AWS Lambda]
Mario Kostelac – Product Engineer
mariokostelac
How did we start using Aurora?
Our stack
<whatever>.js
Our first MySQL instance
Then we got bigger...
And bigger…
Eventually, two MySQL instances…
Why so slow?
2 billion rows problem
- Your table can’t fit in RAM
- Your table can’t fit in RAM!
- You can’t modify your table schema
⏰
Replace RDS MySQL with?
- DynamoDB
- Partitioned RDS MySQL
- Aurora?
Is Aurora good enough for us?
Test your load!
“The only testing that should matter to
you is testing against YOUR production
load!”
me, right now!
How to test your load?
1. Test your tools (why?)
2. Create an image of your load! (how?)
3. Test your load!
[Diagram: production load mirrored from MySQL prod to Aurora MySQL prod]
How did we migrate from RDS MySQL to Aurora?
Write a migration runbook!
1. Downtimes are stressful
2. Induced downtimes have to be
carefully planned
🚀 Migrated within 8 minutes of
downtime, no records lost!
How does it work for us?
Use secondaries when you can
Check your drivers
Build your tooling
What don’t we have to do anymore?
• Cluster monitoring got simpler
• Parameter tweaking
So you say it’s impossible to break it?
Test your load! Write a runbook! Use secondaries!
Check your drivers! Build your tooling!
Thank you!
Related Sessions
• DAT203 - Getting Started with Amazon Aurora
• DAT303 - Deep Dive on Amazon Aurora
• DAT302 - Best Practices for Migrating from Commercial
Database Engines to Amazon Aurora or PostgreSQL
Remember to complete
your evaluations!