Cloud Computing & WordPress - Scalability and High Availability - wpcampbo13

Post on 27-Jan-2015

description

Cloud Computing & WordPress - Scalability and High Availability @ WordPress WordCamp Bologna 2013 by Gabriele Mittica and Walter Dal Mut - www.corley.it - www.upcloo.com

transcript

WORDCAMP BOLOGNA - 9 FEB 2013 @WORDCAMPBOLOGNA # WPCAMPBO13

WordPress & Cloud Computing: Scalability and High Availability

Gabriele Mittica & Walter Dal Mut

CLOUD COMPUTING

Cloud computing refers to the delivery of computing and storage capacity as a service to a heterogeneous community of end-recipients.

Cloud computing entrusts services with a user's data, software and computation over a network.

It has considerable overlap with software as a service (SaaS).

CLOUD COMPUTING

1. Is the cloud just a fad? No. It's a rational evolution of IT architecture towards a more efficient way of managing resources and designing web apps.

2. Is the cloud cheap? No. The cloud lets you pay the right price for each service involved.

3. Is the cloud just a scalable VPS? No. The cloud is a set of services designed to meet specific computing needs.

CLOUD COMPUTING

- access to unlimited resources

- scalable architecture

- no hardware dependency

- pay as you go

- geographical redundancy

- high availability

- increased competitiveness for start-ups

AMAZON WEB SERVICES

AMAZON WEB SERVICES - SIGNIN

AMAZON WEB SERVICES - CONSOLE

AMAZON WEB SERVICES - LINKS

- Home page: http://aws.amazon.com/

- About AWS: https://aws.amazon.com/what-is-aws/

- All products: http://aws.amazon.com/products/

- Dev area: http://aws.amazon.com/resources/

- Documentation: http://aws.amazon.com/documentation/

- SDK: http://aws.amazon.com/code/

- Community: https://forums.aws.amazon.com/index.jspa

- AWS Blog: http://aws.typepad.com/

- Events: https://aws.amazon.com/about-aws/events/

- Services Health Dashboard: http://status.aws.amazon.com/

- Pricing Calculator: http://calculator.s3.amazonaws.com/calc5.html

WORDPRESS ON AMAZON WEB SERVICES

WORDPRESS ON AWS

BOOSTING WORDPRESS WITH:

S3, CF, CLOUDSEARCH, SES

PLUGINS FOR AWS – S3 & CLOUDFRONT

Download credentials:

PLUGINS FOR AWS – S3 & CLOUDFRONT

Grant full access to S3, CloudSearch, SES:

PLUGINS FOR AWS – S3 & CLOUDFRONT

Typical use of WordPress: HTML and media files are served by Apache (HTTP request):

PLUGINS FOR AWS – S3 & CLOUDFRONT

Uploads your WordPress attachments to S3, with an optional CloudFront distribution.

This WordPress plugin allows you to use Amazon's Simple Storage Service to host your media for your WordPress powered blog with an optional CloudFront distribution.

Plugin homepage: http://wordpress.org/extend/plugins/tantan-s3-cloudfront/

Services involved:

Simple Storage Service (S3): S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It is designed to provide 99.999999999% durability.

CloudFront: CF can be used to deliver your entire website, including dynamic, static and streaming content, using a global network of edge locations. Over 30 edge locations.

PLUGINS FOR AWS – S3 & CLOUDFRONT

Create a new bucket in S3 console:

PLUGINS FOR AWS

Edit the distribution, setting the bucket as the origin:

PLUGINS FOR AWS – S3 & CLOUDFRONT

Download, activate and customize the plugin:

PLUGINS FOR AWS – S3 & CLOUDFRONT

Static files hosted on S3 and served by CloudFront:

PLUGINS FOR AWS – S3 & CLOUDFRONT

Thanks to S3 and CloudFront, you can easily serve all your media files through the Content Delivery Network.
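The effect of serving media through CloudFront can be sketched as a URL rewrite: links to uploaded files point at the distribution, while dynamic pages stay on the web server. This is only an illustration of the idea, not the plugin's code; the host names and the uploads prefix below are hypothetical placeholders.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values: replace with your own site and CloudFront distribution.
SITE_HOST = "www.example.com"
CDN_HOST = "d1234abcd.cloudfront.net"
UPLOADS_PREFIX = "/wp-content/uploads/"


def to_cdn_url(url: str) -> str:
    """Return the CDN URL for an uploaded media file, or the URL unchanged."""
    parts = urlsplit(url)
    if parts.netloc == SITE_HOST and parts.path.startswith(UPLOADS_PREFIX):
        # Only the host changes; path, query and fragment are preserved.
        return urlunsplit((parts.scheme, CDN_HOST, parts.path, parts.query, parts.fragment))
    return url


print(to_cdn_url("http://www.example.com/wp-content/uploads/2013/02/logo.png"))
print(to_cdn_url("http://www.example.com/?p=42"))
```

Media requests then never reach Apache, which is left with the dynamic calls only.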

PLUGINS FOR AWS - CLOUDSEARCH

Create a scalable search engine for your content published on WordPress.

This WordPress plugin allows you to use Amazon's CloudSearch to provide a smart search engine to your final users.

Plugin homepage: http://wordpress.org/extend/plugins/lift-search/

Services involved:

CloudSearch: Amazon CloudSearch is a fully-managed search service in the cloud that allows customers to easily integrate fast and highly scalable search functionality into their applications. It supports over 8 million documents.

PLUGINS FOR AWS - CLOUDSEARCH

Starting a new engine server with CloudSearch:

PLUGINS FOR AWS - CLOUDSEARCH

Documentation at http://getliftsearch.com/documentation/

PLUGINS FOR AWS – SIMPLE EMAIL SERVICE

WP-SES is a plugin that redirects all outgoing WordPress emails through Amazon Simple Email Service (SES) for maximum email deliverability.

You can download this plugin from the official website: http://wp-ses.com/

Services involved:

Simple Email Service (SES): SES is a highly scalable and cost-effective bulk and transactional email-sending service for businesses and developers. Only $0.10 per thousand emails.

PLUGINS FOR AWS – SIMPLE EMAIL SERVICE

First, install like any other plugin:

- Upload and activate the plugin

- The settings are under Settings / WP SES

Then, proceed to the settings:

- Fill in the email address and name to use as the sender for all emails

- Fill in the Amazon API credentials

- Save changes (important!)

- Ask to add the email as a confirmed sender

- Click on the link you got by email from Amazon SES

- Refresh the plugin, send a test email

- If ok, ask Amazon to move you out of the sandbox into production mode

- Once in production mode, you can use the top button to activate the plugin.

PLUGINS FOR AWS – SIMPLE EMAIL SERVICE

All emails sent by Simple Email Service:

PLUGINS FOR AWS

BEFORE:

PLUGINS FOR AWS

AFTER:

SCALABLE WORDPRESS

HOW TO MAKE WORDPRESS SCALABLE ON THE CLOUD

A TYPICAL SCALABLE INFRASTRUCTURE

• Computation
  • The load balancer (LB) listens for incoming connections and routes requests to the web applications
  • Web applications use the RDBMS to get stored information
  • Sessions and performance improvements are handled by Memcached instances

• Static resource distribution
  • CDN – Content Distribution Network
    • Globally distributed endpoints
    • Serves static files (and dynamic ones if needed)
    • Native connection with S3 (Simple Storage Service)
  • It handles all static resources, so our web servers only have to handle dynamic calls

RDBMS – START FROM THE END

• MySQL
  • RDBMS: Relational Database Management System

• How does it scale?
  • Read replicas

• Pros (in terms of scalability)
  • Simple to set up
  • Simple to manage

• Cons
  • You can scale only read operations
  • The master instance has to handle all write operations (bottleneck on writes)

READ REPLICA ON AWS

• From the RDS tab of the AWS console, right-click a running instance and choose to create a Read Replica DB instance

• Configure the read replica and create it through the graphical console.

HOW TO PROMOTE A SLAVE TO MASTER?

Similar to master creation:
• Select a read replica
• Right-click and choose Promote Read Replica

Discover more on RDS:
• http://aws.typepad.com/aws/amazon-rds/

NOW HAVE A LOOK AT WEB INSTANCES

• All web instances scale out instead of scaling up
• Scale out? What does it mean?
  • Instead of increasing VM performance (more RAM, more CPU, more I/O, etc.), start new VMs and serve requests from these instances

• The load balancer routes incoming connections to the VMs using common algorithms
  • Round-robin techniques
  • Based on the VMs' average load
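The two routing strategies named above can be sketched in a few lines. This is a toy model under stated assumptions, not how ELB is implemented: the backend names and load figures are made up for illustration.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in a fixed rotation, one per incoming request."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = cycle(list(backends))

    def next_backend(self):
        return next(self._backends)


def least_loaded(loads):
    """Pick the backend with the lowest reported average load."""
    return min(loads, key=loads.get)


# Hypothetical instance names.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
order = [balancer.next_backend() for _ in range(5)]
print(order)
print(least_loaded({"web-1": 0.9, "web-2": 0.2, "web-3": 0.5}))
```

Round robin needs no feedback from the VMs; the load-based choice needs a metric reported by each instance, which is where monitoring comes in.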

PROBLEMS… WE NEVER TALK ABOUT…

• Session management
  • If we start and stop servers at runtime, we have to maintain PHP sessions in order to handle user logins and other session-related features

• Database connections
  • All MySQL connectors handle just one connection… not "x" database connections at the same time…

• Software and plugin maintenance
  • How can we keep the same version of WordPress and its plugins if VMs start and stop continuously? How can we handle software updates?

• What about logs? How can we centralize log management?

DELEGATE SESSION MANAGEMENT TO MEMCACHE

• Memcache(d) servers are not only useful as distributed in-RAM caching servers; they can also manage PHP sessions for us.
  • A Memcache infrastructure is simple to create and maintain
  • ElastiCache service on AWS

• No software modification
  • We just have to configure the PHP interpreter (compiled with memcache/memcached support)

session.save_handler = memcache
session.save_path = "tcp://1.cache.group.domain.tld:11211"

DELEGATE CONNECTIONS TO MYSQL NATIVE DRIVER

• MySQL native driver?
  • Available from PHP >= 5.3
  • Compile PHP with mysqlnd support
    • --with-mysqli=mysqlnd --with-pdo-mysql=mysqlnd --with-mysql=mysqlnd
    • WARNING: the mysql extension is deprecated as of PHP 5.5.0

• Delegate master/slave management to "mysqlnd_ms"
  • http://www.php.net/manual/en/book.mysqlnd-ms.php

DELEGATE CONNECTIONS TO MYSQL NATIVE DRIVER

{
  "myapp": {
    "master": {
      "master_0": { "host": "localhost", "port": "3306" }
    },
    "slave": {
      "slave_0": { "host": "192.168.2.27", "port": "3306" }
    }
  }
}

The simple JSON configuration is divided into two main sections:

• Master
• Slaves

"myapp" is the hostname that we use instead of the real MySQL host address.

E.g.
• mysql_connect("myapp", "user", "passwd");
• new mysqli("myapp", "user", "passwd");
• new PDO("mysql:dbname=testdb;host=myapp");
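To make the master/slave delegation concrete, here is a conceptual sketch of read/write splitting over the configuration shown on this slide. It is not mysqlnd_ms itself: the `pick_server` helper is hypothetical, and the rule it applies (statements starting with SELECT go to a slave, everything else to the master) is a simplification of what a real splitter does.

```python
import json

# Same structure as the JSON configuration on the slide.
CONFIG = json.loads("""
{ "myapp": { "master": { "master_0": { "host": "localhost", "port": "3306" } },
             "slave":  { "slave_0":  { "host": "192.168.2.27", "port": "3306" } } } }
""")


def pick_server(app: str, sql: str) -> str:
    """Route reads to the first slave and all other statements to the master."""
    section = CONFIG[app]
    is_read = sql.lstrip().upper().startswith("SELECT")
    role = "slave" if is_read else "master"
    server = next(iter(section[role].values()))
    return f'{server["host"]}:{server["port"]}'


print(pick_server("myapp", "SELECT * FROM wp_posts"))
print(pick_server("myapp", "UPDATE wp_options SET option_value = 'x'"))
```

The application keeps connecting to "myapp" and never needs to know which physical server answers.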

START TALKING ABOUT ELASTIC COMPUTE CLOUD

AUTOSCALING WITH ELB + EC2 + CLOUDWATCH

• ELB – Elastic Load Balancer
  • Distributed load balancer across AWS Availability Zones (e.g. eu-west-1a, 1b, 1c; you have to select the zones in which you are available)
  • Watches EC2 status with a ping strategy
    • Page check every x minutes/seconds

• Turn EC2 instances on/off automatically thanks to alarms (raised by CloudWatch)
  • Receive alarms from CloudWatch and trigger scaling operations
  • You can raise CPU alarms, network alarms, VM status alarms and many others in order to increase or decrease the current number of EC2 instances

• A scaling strategy is not simple, and you have to understand how your application works
  • CPU is the simplest metric, but remember that bandwidth is limited by the network interfaces; such bottlenecks can mask the CPU alarm and leave your application stuck in strange situations.
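A CPU-based scaling rule of the kind driven by CloudWatch alarms can be sketched as a pure function. The thresholds, the step of one instance and the minimum/maximum bounds are assumptions chosen for illustration, not values from the talk or AWS defaults.

```python
def desired_instances(current, avg_cpu, scale_out_at=70.0, scale_in_at=30.0,
                      minimum=2, maximum=10):
    """Return the instance count after applying a simple CPU threshold rule."""
    if avg_cpu > scale_out_at:
        target = current + 1      # high load: add one instance
    elif avg_cpu < scale_in_at:
        target = current - 1      # low load: remove one instance
    else:
        target = current          # inside the comfort band: do nothing
    # Never leave the configured bounds of the group.
    return max(minimum, min(maximum, target))


print(desired_instances(2, 85.0))
print(desired_instances(3, 10.0))
print(desired_instances(10, 95.0))
```

As the slide warns, a rule like this is blind to network saturation: the CPU can stay below the threshold while the instances are already unable to serve more traffic.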

THE PROBLEM WITH SOFTWARE MANAGEMENT

• If servers start and stop continuously, we have to find a way to keep the software fresh and up to date
  • When a server starts, it has to create a valid environment in order to serve web pages. Strategies?

• Compile and bundle all software in one instance image
  • It is very simple, but all software becomes old very quickly, and when you have to release an update you have to build a new image and update all load balancer configurations. It is a long and complex operation

• Use the EC2_USER_DATA feature provided by AWS
  • You can run a shell script when your instances bootstrap. It is more flexible because you can create a skeleton (PHP + libraries) and download all software at runtime during the boot operation

HOW CAN WE DOWNLOAD WP AND PLUGINS?

Use SVN (Subversion) to download the latest version of WordPress

Using the "trunk" is probably not a good idea, but you can use tags in order to stay aligned across all VMs:

svn checkout http://core.svn.wordpress.org/tags/3.5.1/ mywebsite

http://codex.wordpress.org/Installing/Updating_WordPress_with_Subversion

Use SVN externals to download your plugins:

cd mywebsite/wp-content/plugins/
svn propset svn:externals "akismet http://plugins.svn.wordpress.org/akismet/tags/2.5.7/" .
svn up

Create/download your WordPress configuration file during VM bootstrap
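The bootstrap idea can be sketched as a small generator for the shell script passed to an instance as user data. The `build_user_data` helper, the target directory and the final Apache reload are assumptions made for illustration; only the `svn checkout` of a tagged release comes from the slides.

```python
def build_user_data(wp_tag: str, target_dir: str = "/mnt/wordpress") -> str:
    """Build a boot-time shell script that checks out a tagged WordPress release."""
    lines = [
        "#!/bin/sh",
        "set -e",
        # Pin a tag rather than trunk so every VM runs the same version.
        f"svn checkout http://core.svn.wordpress.org/tags/{wp_tag}/ {target_dir}",
        # Hypothetical step: fetch wp-config.php from a private location here.
        "/etc/init.d/apache2 reload",
    ]
    return "\n".join(lines) + "\n"


script = build_user_data("3.5.1")
print(script)
```

Releasing a new version then means changing one tag in the script instead of rebuilding the machine image.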

HOW TO UPDATE CONFIGURATIONS AT RUNTIME?

• If you run 10 servers, executing commands by hand can be hard. You can use tools to run a command on a list of servers
  • Capistrano (Ruby)
    • https://github.com/capistrano/capistrano
  • Fabric (Python)
    • https://github.com/fabric/fabric
    • Use CLOTH for AWS EC2 instances
      • https://github.com/garethr/cloth

EXAMPLE OF FABRIC – USAGE WITH CLOTH

#! /usr/bin/env python

from __future__ import with_statement
from fabric.api import *
from fabric.contrib.console import confirm
from cloth.tasks import *

env.user = "root"
env.directory = '/mnt/wordpress'
env.key_filename = ['/home/walter/Amazon/wp-cms.pem']

@task
def reload():
    "Reload Apache configuration"
    run('/etc/init.d/apache2 reload')

@task
def tail():
    "Tail Apache logs"
    run('tail /var/log/syslog')

EC2 instances are dynamic and we don't know their addresses in advance; for that reason we can use the tagging system to execute commands on a group of instances

fab nodes:"^production.*" tail

Executes the "tail" command on all instances with a name that starts with "production."

E.g.
• production.web-1
• production.log
• production.mongodb
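The node selection can be pictured as a regular expression matched against instance names. This sketch assumes that is how the pattern is applied; the `select_nodes` helper and the "staging" instance are hypothetical, and no AWS call is made.

```python
import re


def select_nodes(names, pattern):
    """Keep only the instance names that match the pattern from their start."""
    regex = re.compile(pattern)
    return [name for name in names if regex.match(name)]


# Names from the slide plus one hypothetical non-production instance.
instances = ["production.web-1", "production.log", "production.mongodb", "staging.web-1"]
selected = select_nodes(instances, r"^production.*")
print(selected)
```

A consistent naming scheme therefore works as an addressing system: the command reaches whichever instances exist at that moment, without a hard-coded host list.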

ALSO LOG MANAGEMENT IS NOT SIMPLE…

• We create and destroy instances in response to alarms, but when we terminate an instance we immediately lose all its Apache logs (or equivalent)

• How can we manage logs?
  • The simplest way is to use Rsyslog clusters
    • Rsyslog is open source software that forwards log messages over an IP network
    • Rsyslog implements the basic syslog protocol
      • That means we can configure Apache to log to "syslog" instead of using normal text files.
      • In this way we can collect all logs in one group of VMs and work on these files later with other technologies.
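What travels over the network in the basic syslog protocol is a short text line prefixed by a priority value, computed as facility × 8 + severity. The sketch below builds such a line in simplified form (a real message also carries a timestamp and a hostname); the `syslog_line` helper and the choice of the local7 facility for Apache are assumptions for illustration.

```python
# Standard syslog numeric codes (subset).
FACILITIES = {"user": 1, "daemon": 3, "local7": 23}
SEVERITIES = {"err": 3, "warning": 4, "info": 6}


def syslog_line(facility, severity, tag, message):
    """Format a simplified syslog message with its <PRI> prefix."""
    pri = FACILITIES[facility] * 8 + SEVERITIES[severity]
    return f"<{pri}>{tag}: {message}"


line = syslog_line("local7", "info", "apache", "GET / 200")
print(line)
```

Because the facility is encoded in every message, the collecting servers can route web server logs to their own files regardless of which instance sent them.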

• Collecting logs is not the last step, because you have to analyse and reduce the information
  • Move logs to an S3 bucket – time based
  • Analyze logs with Hadoop
    • MapReduce on the cloud with the Elastic MapReduce service (EMR)
  • Use scripting languages on top of Hadoop in order to simplify the log analysis
    • Hive – data warehouse infrastructure (data summarization)
    • Pig – high-level platform for creating MapReduce programs
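The map and reduce steps can be illustrated on a single machine with a few access log lines. This is a toy sketch of the programming model, not Hadoop or EMR code; the sample lines are invented and assume the common Apache log layout, where the status code follows the quoted request.

```python
from collections import defaultdict

# Invented sample lines in the common Apache access log layout.
LOG_LINES = [
    '10.0.0.1 - - [09/Feb/2013:10:00:00 +0100] "GET / HTTP/1.1" 200 5120',
    '10.0.0.2 - - [09/Feb/2013:10:00:01 +0100] "GET /missing HTTP/1.1" 404 312',
    '10.0.0.1 - - [09/Feb/2013:10:00:02 +0100] "GET /feed HTTP/1.1" 200 2048',
]


def map_status(line):
    """Map step: emit (status code, 1) for one log line."""
    status = line.split('"')[2].split()[0]
    return status, 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


counts = reduce_counts(map_status(line) for line in LOG_LINES)
print(counts)
```

On a cluster the map step runs in parallel over chunks of the logs stored in S3, and the framework groups the emitted pairs by key before reducing them.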

The end? Do you really want the red pill?

Find out just how deep the rabbit hole goes.

SPEAKERS

Gabriele Mittica

Web: www.gabrielemittica.com

Twitter: @gabrielemittica

Walter Dal Mut

Web: www.walterdalmut.com

Twitter: @walterdalmut