
Dynamic AWS Server Usage Using Nagios Core or How to pay only for what you need Eric Loyd...

Date post: 14-Jan-2016
Page 1: Dynamic AWS Server Usage Using Nagios Core or How to pay only for what you need Eric Loyd eric@bitnetix.com 877.33.VOICE @Bitnetix@SmartVox.

Dynamic AWS Server Usage Using Nagios Core

or How to pay only for what you need

Eric Loyd eric@bitnetix.com

877.33.VOICE @Bitnetix @SmartVox


About Bitnetix


About Eric Loyd and Bitnetix

Founder and CEO of Bitnetix Incorporated

VoIP services and IT/network consulting

Over 25 Years in IT and management at places like

Eastman Kodak

Frontier Communications / Global Crossing

Rochester Institute of Technology

Bitnetix started its eighth year in July, 2013

Digital Rochester GREAT Award Finalist in:

2012 for Communications Technology

2013 for Rising Star

Using Nagios since 2004

© 2013 Bitnetix Incorporated


History of SmartVox: Bitnetix’s VoIP Platform


History of SmartVox, our VoIP Platform

Pre-2012 – not yet called SmartVox

Bitnetix primarily focused on IT consulting

VoIP service was ~10% of business with servers located primarily at client sites

Custom Asterisk-based servers running FreePBX

We ran the customer’s network, so we had control over VoIP

2012 – Focus switched to VoIP

Focused now on hosted VoIP solutions

Made use of Amazon Web Services EC2 VPS

One per customer with no proxies* or media servers

Network/bandwidth was the only customer responsibility


History of SmartVox, our VoIP Platform

2013 – SmartVox name born

Copyright, trademark, domain name, biz cards, etc.

Third generation born with multiple proxies, registrars, configuration servers, and media servers

June – Started Mission Matrix program & sales

AWS architecture leveraged for geography

Each customer gets own EC2 server

Proxies to closest zone, secondary “to the west”

Media servers located in zones based on number of simultaneous calls, conferences, etc.

VMs and CDRs stored in database


Brief Overview of AWS


AWS EC2 Concepts

AWS – Amazon Web Services

Collection of cloud-based services:

Storage (S3), DNS (Route 53), CDN, Server (EC2)

EC2 - Elastic Compute Cloud

Virtual servers in AWS datacenters (zones)

US (3 = VA, CA, OR), EU (1), Asia (3), SA (1)

Persistent storage & flexible IP address assignment

Pay by the hour that it’s up, plus storage and bandwidth

Spot instances – “temporary” EC2 servers

Bring online as needed, terminated when shut down


AWS EC2 Costs

LOTS of variables, but reasonable potential costs:

Reserved servers cost about $2.00 per day

Reserved instance pricing is contractual and static, based on size

Spot servers cost between $0.50 and $2.50 per day

Spot instance pricing is dynamic; we assume ~$0.10 per hour

We quantize concurrent calls into 50-call blocks

One media server = 50 calls = 1 spot instance

Two media servers = 100 calls = 2 spot instances

Bandwidth and storage will add ~10%

Reducing AWS usage reduces cost

We keep these savings for ourselves. Shhhh!!!
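The arithmetic on this slide can be sketched in a few lines. This is only an illustration using the slide's own round numbers (50 calls per media server, ~$0.10 per spot instance-hour, ~10% extra for bandwidth and storage); the function and constant names are made up for the example, and real spot prices vary.

```python
import math

CALLS_PER_SERVER = 50       # one media server = one 50-call block (from the slide)
SPOT_RATE_PER_HOUR = 0.10   # assumed spot price per instance-hour (from the slide)
OVERHEAD = 0.10             # bandwidth and storage add roughly 10% (from the slide)


def servers_needed(concurrent_calls):
    """Round concurrent calls up to whole 50-call blocks."""
    if concurrent_calls <= 0:
        return 0
    return math.ceil(concurrent_calls / CALLS_PER_SERVER)


def hourly_cost(concurrent_calls):
    """Estimated media-server cost per hour, overhead included."""
    return servers_needed(concurrent_calls) * SPOT_RATE_PER_HOUR * (1 + OVERHEAD)
```

With these assumptions, 100 concurrent calls need two spot instances, or about $0.22 per hour once overhead is included.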


Why Nagios?


Why Nagios?

Extensive experience using it for clients

Bitnetix is a Nagios reseller

Needed centralized monitoring software

Integrate with Twitter for notifications

Integrate with Eventum via email for trouble tickets

Zero cost

Framework

Leverage SSH, HTTP, check_mk and livestatus!!

Custom checks and notifications (very important)

Ability to “cookie cutter” installs for AWS


Initial Hurdles

Customer Premises Equipment (CPE)

No real control over CPE choices

Routers block some traffic, “help” other traffic incorrectly

Need to be able to remotely [re-]configure phones

Figure out how to “cookie-cutter” EC2 servers

Customer boxes and SIP endpoints

Proxies and media servers

Wanted to monitor upstream providers as well

How to separate apparent from actual failure

Something’s broken, but overall service functional


SmartVox Provisioning Process and Automation


SmartVox Network

DNS SRV records are key to redundant servers

[Diagram: Provider -> Border Proxy -> Customer Proxy -> phone/media server; two border proxies and three customer proxies shown]

Provider: sends incoming calls to one or more border proxies

Border Proxy: figures out which customer should receive the calls

Customer Proxy: sends the call on to the correct phone/media server (VM, etc.)
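To show why SRV records help with redundancy: each record carries a priority and a weight, and a client tries lower-priority values first (RFC 2782). The sketch below orders a set of records for failover. The host names are hypothetical, and ordering by weight within a priority is a deterministic simplification; the RFC describes a weighted random choice there.

```python
from collections import namedtuple

SrvRecord = namedtuple("SrvRecord", "priority weight port target")


def failover_order(records):
    """Lowest priority value first; within a priority, heaviest weight first.

    Simplification: RFC 2782 picks randomly in proportion to weight within
    a priority level. A fixed order is used here for readability.
    """
    return sorted(records, key=lambda r: (r.priority, -r.weight))


# Hypothetical records for _sip._udp.customer.example.com
records = [
    SrvRecord(20, 10, 5060, "proxy-west.example.com"),
    SrvRecord(10, 10, 5060, "proxy-east.example.com"),
    SrvRecord(10, 50, 5060, "proxy-east2.example.com"),
]
```

A phone that walks `failover_order(records)` falls back to the next proxy when one stops answering, with no configuration change on the phone.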


Provisioning Process

SmartVox AWS EC2 Provisioning Database

Customer information

Account (location/division/etc) information

Number of phones*, VM boxes, etc.

Computes how many proxies customer needs

DNS SRV records created for batch updates

Media server/VM entries created automatically

Phone provisioning info created automatically

Automatically places order for phones* (+some)

Phones drop-shipped to customer in about 3 days


AWS EC2 Automation: Spot Instance API

Create spot instance -> gives request ID

Instance created from a SmartVox-created base image

Wait a bit -> query request ID -> get instance ID

Query instance -> get IP address

Update DNS with server information and IP

Update Nagios with server information and IP

When spot instances shut down, they terminate

No more expense for “burstable resources”

This sounds like a Nagios event handler…
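The request -> poll -> instance ID -> IP sequence above might look like the following. It is a sketch, not the SmartVox code: `get_instance_id` and `get_public_ip` are hypothetical callables that would wrap whatever EC2 client is in use, and the polling interval and retry count are arbitrary.

```python
import time


def wait_for_instance(request_id, get_instance_id, get_public_ip,
                      poll_seconds=30, max_polls=20, sleep=time.sleep):
    """Poll a spot request until it is fulfilled, then return (instance_id, ip).

    get_instance_id(request_id) -> instance ID, or None while the request is open.
    get_public_ip(instance_id)  -> public IP, or None until one is assigned.
    Both are placeholders supplied by the caller.
    """
    for _ in range(max_polls):
        instance_id = get_instance_id(request_id)
        if instance_id:
            ip = get_public_ip(instance_id)
            if ip:
                return instance_id, ip
        sleep(poll_seconds)  # "wait a bit" before asking again
    raise TimeoutError(f"spot request {request_id} was not fulfilled in time")
```

The returned pair is what the last two steps need: one entry for the DNS update and one host definition for Nagios.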


AWS EC2 Automation: Our Custom Image

SmartVox media server image includes Asterisk

Asterisk told to exit after waiting for calls to terminate

Startup script shuts down system after Asterisk exits

Instant “spot instance”

Bring it online when needed, and terminate as required

Same basic idea for starting/stopping proxies

These tend to be more static than media servers

Platform can be adjusted automatically

COGS adjusts appropriately

Hey, let’s hook this up to Nagios!!
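A rough sketch of the "exit, then shut down" idea on the media server image. The slides do not show the actual startup script, so the commands here are assumptions: `asterisk -f` keeps Asterisk in the foreground, `core stop gracefully` asks it to exit once active calls have ended, and `shutdown -h now` powers the box off.

```python
import subprocess

ASTERISK_FOREGROUND = ["asterisk", "-f"]                      # run without forking
ASTERISK_DRAIN = ["asterisk", "-rx", "core stop gracefully"]  # exit once calls end
POWER_OFF = ["shutdown", "-h", "now"]                         # assumed shutdown command


def run_until_asterisk_exits(run=subprocess.call):
    """Boot-time wrapper: block while Asterisk runs, then power the box off."""
    status = run(ASTERISK_FOREGROUND)
    run(POWER_OFF)
    return status


def drain(run=subprocess.call):
    """Ask Asterisk to refuse new calls and exit when current calls finish."""
    return run(ASTERISK_DRAIN)
```

Calling `drain()` over SSH is then enough to retire a media server: Asterisk exits after the last call, the wrapper powers the box off, and the spot instance terminates.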


AWS EC2 Automation: More ideas

Quick aside about spot instances. Useful for:

Database dumps

Spot instance turned up to do MySQL copies

Run reports, dump, compress, purge, etc., then terminate

Distributing web server load

Pop up another server and add to DNS

Instant on-demand capacity

Anything you want to do repeatedly, but not for long, and only when you want to (or maybe have to)


Use Nagios for: Provisioning, Monitoring, Capacity Planning


Provisioning

Rather than create EC2s, we just update Nagios

Automatically regenerate SIP proxy and media server dynamic_hosts.cfg file as part of provisioning process

Nagios looks for host up, doesn’t find it, fires off handler

Event handler queries EC2 to see if it’s being turned up (~10 min) or just not running. If it’s not running, it starts it.

DNS is batch-updated every hour, with 59-minute TTLs

Phone provisioning handled via automatic extract from database to create HTTP served configuration files

Master/slave “config servers” (also in AWS) to send all this stuff to customers, with a URL to activate phones

Entire process from signature to functional < 1 week
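The host event handler's decision can be sketched as a small function. Nagios can pass the `$HOSTSTATE$` and `$HOSTSTATETYPE$` macros to a handler; the EC2 state strings and the action names below are assumptions for illustration, with `None` meaning no instance was found for the host.

```python
def provisioning_action(host_state, state_type, ec2_state):
    """Decide what the host event handler should do.

    host_state / state_type: values of $HOSTSTATE$ and $HOSTSTATETYPE$.
    ec2_state: what a (hypothetical) EC2 lookup reported for this host,
               e.g. "pending", "running", "terminated", or None.
    Returns "nothing", "wait", or "start" (illustrative action names).
    """
    if host_state != "DOWN" or state_type != "HARD":
        return "nothing"   # host is up, or Nagios is still retrying
    if ec2_state in ("pending", "running"):
        return "wait"      # being turned up (~10 min); check again later
    return "start"         # not running at all: request an instance
```

Acting only on HARD states avoids launching a second instance while Nagios is still re-checking a host that is in the middle of booting.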


Monitoring

Nagios looks for hosts (see previous slide)

Automatically creates them if needed

Note that SIP proxies are not spot instances

Dedicated to lifespan of customer/account, so they are only terminated as part of the de-provisioning process

Nagios looks at health of services

Determine if we have faults, outages, etc.

Can potentially reroute automatically (DNS SRV!)

Store performance info for capacity calculations

Notifications via Twitter and email

Come back tomorrow at 10:30 for how this works


Capacity Planning

Quantize by 50 simultaneous calls per server

Perf data used to calculate historical usage

Can use cron to automatically add/remove servers

Nagios figures out “deltac” in current usage

If deltac = 0, we are just right (OK)

If deltac < 0, we have too much capacity (WARN)

If deltac > 0, we need more capacity (CRITICAL)

Event handler looks at state and either does nothing, tells least used box to stop Asterisk, or adds another box to the mix (see provisioning)

Capacity (and costs) dynamically adjust with usage
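The mapping from deltac to Nagios states and handler actions can be sketched as below. Exit codes 0, 1 and 2 are the standard Nagios plugin codes for OK, WARNING and CRITICAL; the message text, the perfdata label and the action names are illustrative, not taken from the real module.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def deltac_check(deltac):
    """Map a deltac value to (exit_code, plugin_output)."""
    perfdata = f"deltac={deltac}"
    if deltac == 0:
        return OK, f"DELTAC OK - capacity matches usage | {perfdata}"
    if deltac < 0:
        return WARNING, f"DELTAC WARNING - {-deltac} block(s) of spare capacity | {perfdata}"
    return CRITICAL, f"DELTAC CRITICAL - need {deltac} more block(s) | {perfdata}"


def capacity_action(exit_code):
    """What the event handler does for each state (names are illustrative)."""
    return {OK: "nothing", WARNING: "stop_least_used", CRITICAL: "add_server"}[exit_code]
```

A plugin built on this would print the output string and exit with the code; Nagios then calls the event handler, which looks up the action.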


Capacity Planning: DeltaC

deltac – Custom Nagios module

Looks at the last three times it ran on particular host

Quantized by 50 calls = change in 50-call volumes

If deltac = 0 then we return an OK state

If deltac < 0 then we are dropping call volumes and can SSH to a box and tell Asterisk to stop

This will then stop the spot instance and reduce cost

If deltac > 0 then we are gaining call volumes and trigger provisioning process

This will start a spot instance and increase cost
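The slides do not say exactly how deltac is derived from the last three runs, so the following is one possible reading, labeled as an assumption: quantize each sample into 50-call blocks and report the change between the oldest and newest of the last three samples.

```python
import math

BLOCK = 50  # calls per media server (from the slides)


def blocks(calls):
    """Number of 50-call blocks needed for a given concurrent-call count."""
    return math.ceil(max(calls, 0) / BLOCK)


def deltac(samples):
    """Change in 50-call blocks across the last three samples (oldest first).

    Assumption: deltac = blocks(newest) - blocks(oldest of the last three).
    The real module may weigh the samples differently.
    """
    recent = samples[-3:]
    if len(recent) < 2:
        return 0  # not enough history to see a trend
    return blocks(recent[-1]) - blocks(recent[0])
```

Under this reading, samples of 40, 60 and 120 calls give deltac = 2 (grow by two servers), while 120, 90 and 40 give deltac = -2.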


Event Handler: DeltaC


How DeltaC Works

Let’s assume we’re creating a new host

ec2-request-spot-instances ami-58296831 -p 0.04 --key "BTC EC2" --group Asterisk --instance-type m1.medium -n 1 --type one-time

Get back a “spotInstanceRequestId” (sir-722f4e34)

ec2-describe-spot-instance-requests sir-722f4e34

Get back an “instanceId” (i-6488e31f)

ec2-describe-instances i-6488e31f

Get back public IP address (ipAddress) of this machine

Now we have IP address and (internal) name

Populate DNS batch update queue

Regenerate /usr/local/nagios/etc/objects/dynamic_hosts.cfg
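Regenerating the dynamic hosts file could be as simple as rendering one Nagios host definition per server. This is a sketch: the `generic-host` template name and the example host name are assumptions, and the real generator presumably reads from the provisioning database.

```python
HOST_TEMPLATE = """define host {{
    use        {template}
    host_name  {name}
    alias      {name}
    address    {address}
}}
"""


def render_dynamic_hosts(hosts, template="generic-host"):
    """Render Nagios host objects for (name, ip_address) pairs.

    The template name is an assumption; use whichever host template the
    local Nagios configuration defines.
    """
    return "\n".join(
        HOST_TEMPLATE.format(template=template, name=name, address=address)
        for name, address in sorted(hosts)
    )
```

The result would be written to the dynamic_hosts.cfg path above, followed by a Nagios reload so the new hosts are picked up.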


DeltaC Saves Lives Money

Small percentage changes in usage result in large changes in Cost Of Goods

For example:


100 calls: 2 boxes, $0.20/hour, ~$75/year

500 calls: 10 boxes, $1.00/hour, ~$375/year

1000 calls: 20 boxes, $2.00/hour, ~$750/year

2500 calls: 50 boxes, $5.00/hour, ~$2000/year


Questions?

Eric Loyd eric@bitnetix.com

877.33.VOICE @Bitnetix @SmartVox

