Date post: 06-Jun-2020
How Rackspace is using Private Cloud for Big Data Bryan Thompson Big Data Use Case May 8th, 2013
Page 1

How Rackspace is using Private Cloud for Big Data Bryan Thompson

Big Data Use Case

May 8th, 2013

Page 2

RACKSPACE® HOSTING | WWW.RACKSPACE.COM

Our Big Data Problem

• Consolidate all monitoring data for reporting and analytical purposes.

• Every device (server, switch, SAN, UPS, etc.) and product produces multiple events per second

• Monitoring tens of thousands of devices (both physical and virtual)

• This adds up to terabytes of data per day, and growing…
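
A back-of-the-envelope check of the "terabytes per day" figure can be sketched as follows. The slides give only "tens of thousands of devices" and "multiple events per second", so the device count, event rate, and event size below are illustrative assumptions, not Rackspace's numbers.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_bytes(devices: int, events_per_second: int, bytes_per_event: int) -> int:
    """Raw bytes produced per day by a fleet of monitored devices."""
    return devices * events_per_second * bytes_per_event * SECONDS_PER_DAY


# Hypothetical inputs, chosen only to illustrate the arithmetic.
volume = daily_volume_bytes(devices=50_000, events_per_second=5, bytes_per_event=200)
print(f"{volume / 1e12:.2f} TB/day")  # prints "4.32 TB/day"
```

Even with a modest assumed event size, the raw stream lands in the terabytes-per-day range before indexing or replication.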

Page 3

Current Environment

• Dedicated Relational Database systems
• Loaded nightly
• Multiple BI Tools
• 2450+ Users
• To scale would be cost and time prohibitive:
    • Cost of DB licenses
    • Cost of Hardware
    • Time to procure and configure servers
    • Concerns with performance
    • Heavy DBA work

Page 4

What our sponsors and end-users want…

• Plug in and start analyzing data
• Act at the speed of the business
• Maintain optimal query performance
• Costs to store and analyze Data Volumes
• Abstract technical nuances of multiple big data technologies
• Use your preferred BI tool
• High Availability

Page 5

To The Drawing Board!!!

• What we need is the ability to:
    • Host ever growing data volumes
    • Handle streaming data and hourly updates of metrics with sub-second performance
    • Rapid Scalability and High Availability
    • Leverage Open Source technologies
    • Ability to leverage multiple big data technologies

Page 6

The Analytic Compute Grid (ACG)

Key components of the ACG

OpenStack can provide elasticity capabilities

Big Data Technologies (v1 Cassandra)

Advanced Hashing to run parallel clusters

Rule-based elasticity engine integrated w/ OpenStack

ANSI-SQL API w/ Extensions – ability to “plug in”
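
The slides do not say how the "advanced hashing" spreads work across parallel clusters. One common way to do it, shown here purely as an assumed sketch, is to hash a stable key (a hypothetical device ID) and take it modulo the number of clusters, so the same device always lands on the same cluster. The function names and key format are illustrative only.

```python
import hashlib


def cluster_for(device_id: str, cluster_count: int) -> int:
    """Map a device ID to a cluster index using a stable hash."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    # The first 8 bytes are enough to spread keys evenly across clusters.
    return int.from_bytes(digest[:8], "big") % cluster_count


def partition(device_ids, cluster_count):
    """Group device IDs by the cluster that would hold their data."""
    groups = {index: [] for index in range(cluster_count)}
    for device_id in device_ids:
        groups[cluster_for(device_id, cluster_count)].append(device_id)
    return groups


# Hypothetical device names, for illustration only.
groups = partition([f"device-{n}" for n in range(1000)], cluster_count=4)
print({index: len(members) for index, members in groups.items()})
```

A plain modulo scheme reshuffles most keys whenever the cluster count changes; a consistent-hashing ring would limit that movement, at the cost of more bookkeeping.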

Page 7

ACG Architecture

[Diagram: ACG architecture]

Infrastructure Platform
• Rackspace Private Cloud
• Host OS & Hypervisors
• Commodity Hardware

Big Data Technologies
• Cassandra
• PostgreSQL
• Hadoop
• MongoDB
• Others in Future

ACG Engine
• Monitor capacity and auto-provision/de-provision infrastructure resources
• Route request to right analytics technology

Data Presentation
• APIs
• BI Tools, Data Mining, Data Integration, SQL
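
The engine's "route request to right analytics technology" role could look roughly like the sketch below. The slides do not describe the routing criteria, so the workload labels and the mapping to backends are assumptions made for illustration; only the backend names come from the architecture diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    kind: str  # hypothetical workload label, e.g. "time_series"
    sql: str   # query text submitted through the SQL API


# Assumed mapping from workload kind to backend; not taken from the slides.
ROUTES = {
    "time_series": "cassandra",
    "relational": "postgresql",
    "batch": "hadoop",
    "document": "mongodb",
}

# Cassandra is the v1 technology in the deck, so it is used as the fallback here.
DEFAULT_BACKEND = "cassandra"


def route(request: Request) -> str:
    """Pick the backend that should serve this request."""
    return ROUTES.get(request.kind, DEFAULT_BACKEND)


print(route(Request(kind="batch", sql="SELECT 1")))  # prints "hadoop"
```

Keeping the mapping in data rather than in code is what would let a BI tool "plug in" through one SQL endpoint while new backends are added behind it.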

Page 8

Rule-Based Elasticity

Page 9

Rule-Based Elasticity

Page 10

Our OpenStack Environment at Launch

• Deployed on Rackspace Private Cloud
• Can run multiple node configurations
• New node is provisioned in seconds!!!
• Operating System – Ubuntu
• Big Data Technology – Cassandra
• 32 Node Cluster – with capacity to grow

Page 11

Performance Comparison

• SQL Server Environment (Dedicated Environment)
    • 24 CPU, 256 GB RAM
    • Availability Calculation against 1.5 Billion row sample – 132 hours (5.5 days)

Page 12

Performance Comparison

• RPC OpenStack Environment (virtual machines)
    • 8 CPU, 32 GB RAM
    • Availability Calculation against 1.5 Billion row sample – 3.2 hours!!!
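
The two runtimes quoted for the same 1.5 billion row availability calculation can be compared directly. The snippet uses only the figures on the two Performance Comparison slides; it does not account for how many virtual machines took part, which the slides do not state.

```python
sql_server_hours = 132.0  # dedicated SQL Server environment
openstack_hours = 3.2     # RPC OpenStack environment (virtual machines)

speedup = sql_server_hours / openstack_hours

print(f"{sql_server_hours / 24:.1f} days vs {openstack_hours} hours")
print(f"speedup: about {speedup:.0f}x")
```

That is a wall-clock improvement of roughly 41 times, not a per-CPU comparison, since the 24 CPU / 256 GB and 8 CPU / 32 GB figures describe different kinds of machines.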

Page 13

ACG Features

• ACG is a Big Data Management System
• Parallel engine supports multiple clusters
• Highly configurable Rules Engine
    • Time based
    • System Based
• ANSI SQL Compliant API with extensions
• High Compression - Cassandra
• Reusable Bulk-Loader
• Can integrate with current ETL tool
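
The deck names two kinds of elasticity rules, time based and system based, without showing their format. The sketch below is one assumed way such rules could be evaluated: each rule proposes a node count, and the engine takes the largest proposal or falls back to a baseline. The class names, thresholds, and the "largest proposal wins" policy are all hypothetical.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass(frozen=True)
class TimeRule:
    """Time-based rule: hold a fixed node count inside a daily window."""
    start: time
    end: time
    nodes: int

    def target(self, now: time, cpu_load: float, current: int) -> Optional[int]:
        return self.nodes if self.start <= now < self.end else None


@dataclass(frozen=True)
class LoadRule:
    """System-based rule: add nodes when average CPU load exceeds a threshold."""
    cpu_above: float
    add_nodes: int

    def target(self, now: time, cpu_load: float, current: int) -> Optional[int]:
        return current + self.add_nodes if cpu_load > self.cpu_above else None


def desired_nodes(rules, now: time, cpu_load: float, current: int, baseline: int) -> int:
    """Return the node count the cluster should have right now."""
    targets = []
    for rule in rules:
        proposal = rule.target(now, cpu_load, current)
        if proposal is not None:
            targets.append(proposal)
    # No rule firing means the cluster shrinks back to its baseline size.
    return max(targets, default=baseline)


# Hypothetical rules: grow to 48 nodes for a nightly load, add 4 under CPU pressure.
rules = [TimeRule(time(1, 0), time(5, 0), nodes=48), LoadRule(cpu_above=0.8, add_nodes=4)]
print(desired_nodes(rules, time(2, 0), cpu_load=0.5, current=32, baseline=32))  # prints 48
```

The result would then be handed to the provisioning layer, which on the launch environment is described as bringing up a new node in seconds.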

Page 14

The Road Ahead

• PostgreSQL (launching this month)
• Hadoop
• Allow for seamless cross platform analysis
• Migrate off legacy environment
• Dev/QA Environments
• Next big “big data” technology?

Page 15

Questions?

Page 16

RACKSPACE® HOSTING | 5000 WALZEM ROAD | SAN ANTONIO, TX 78218 US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM

RACKSPACE® HOSTING | © RACKSPACE US, INC. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES. | WWW.RACKSPACE.COM

