
Engineering Big Data Infra with Openstack

Date post: 27-Jan-2015
Upload: debojyoti-dutta
Description:
This deck talks about, at a high level, how one can optimize Big Data analytics applications on Openstack.
Transcript
Page 1: Engineering Big Data Infra with Openstack

Cisco Confidential. © 2010 Cisco and/or its affiliates. All rights reserved.

Engineering Big Data Clouds
Debo Dutta – Principal Engineer, Cisco

Page 2: Engineering Big Data Infra with Openstack


Forward-looking Statements

This presentation contains projections and other forward-looking statements regarding future events or the future financial performance of Cisco, including future operating results. These projections and statements are only predictions. Actual events or results may differ materially from those in the projections or other forward-looking statements. Please see Cisco’s filings with the SEC, including its most recent filings on Form 10-K and 10-Q, for a discussion of important risk factors that could cause actual events or results to differ materially from those in the projections or other forward-looking statements.

Page 3: Engineering Big Data Infra with Openstack


Data Deluge Everywhere: Enterprises need Insights in a cost-effective manner

Volume

Variety

Velocity

Veracity

Mobile Data – Location, Presence, Device, Access, Customer

Video Growth – 65% of Mobile and 90% of Fixed traffic will be video by 2015 (Cisco VNI)

M2M – 225 Million connections by 2014 (ABI Research), from vending machines & ATMs to connected automobiles

Commerce – Mobile Payment platforms and local offers

Smart Converged Networks – B/W optimization, content placement, offload, SDN

Social Media – Consumer behavior, targeted advertisement, Social network platforms

Cloud (XaaS) & App stores – All data in the cloud

Adapted from PRIME deck

Page 4: Engineering Big Data Infra with Openstack


Big Data to Big Insights in the Cloud

IDC: “A new generation of technologies and architectures designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and/or analysis”

• Shift from technology for finding information to discovering insights

• Increased interest in real-time analytics of machine-generated data

• “Software defined” and converged technologies

• Open source software/platforms will play a pivotal role in big data infrastructure and gain greater commercial adoption

• 2013 will be the year of “Big Data in the Cloud”


Page 5: Engineering Big Data Infra with Openstack


Rise of semi-structured and unstructured data

Forces Driving The Growth of Big Unstructured Data:

Web Products, Commerce, Services
• Click streams
• Email
• AVI files
• User data / search data

Cloud Computing
• Network log files
• Event processing
• Impact analysis

Defense, Intelligence and Security
• Call data
• Online activity
• Travel data
• GPS data
• Satellite Feeds

Financial Services
• Fraud detection / risk analysis
• Transaction data warehousing
• PCI compliance
• Surveillance

Other
• Meteorology
• Disease / epidemiology
• Genomics
• RFID data
• Sensor data

Difficult to capture, store, search, share, and analyze with traditional tools

1) Source: Cisco Visual Networking Index, June 2011

Thanks to Corp Dev

Page 6: Engineering Big Data Infra with Openstack


BDaaS on public clouds

• Focus on Platforms

• Focus on Integration

• Mostly ETL

• Leverage Public Clouds

• Very little focus on Insights

• Insights are obtained by in-house Data Scientists

• For Viz, UX is not there yet

Exceptions: Tableau


@Netflix

Page 7: Engineering Big Data Infra with Openstack


(Private) Cloud Provider, Big-data Services

Big Data on Openstack

[Architecture diagram; labels recovered from the slide]

Big-data service categories: Batch, noSQL, Real time

Services, each exposed through its own API: Hive, Oozie, Hadoop, Cassandra, Pig, Mongo, MapR Drill, Yahoo S4, Truviso, Storm, Lucene

Actors and control: User and System Admin, Devops, Intelligent Scheduler

Layers:

• OpenStack Cloud Platform: bridges the virtual and physical layers. ComputeService (Servers), StorageService (Disks), NetworkService (Networks)

• Resource Virtualization/hypervisor Layer: creates and manages virtualized compute, storage and networking resources. Hypervisor: KVM, Xen, ESX - Nexus 1000v + Open vSwitch. Network Virtualization: VLAN, OpenFlow, LISP, VXLAN

• Physical Resource Layer: Networking, Storage and Compute resources

App Topology

The diagram repeats the following block several times:

Healthcare Big Data Application
• Virtual VPN, Virtual WAAS, Virtual Firewall
• VMs: App / OS, DataBase / OS, App / OS
• Single Instance Services

One block differs:

Healthcare Big Data Application
• Virtual VPN, DevopsServer, Virtual Load Balancer
• VMs: Dashboard / OS, DataBase / OS, Sensor / OS
• Single Instance Services

Tenant labels as extracted:

/tenant/industrial/tenant/healthcare

/tenant/industrial/tenant/finance

Page 8: Engineering Big Data Infra with Openstack


Scheduling Heterogeneous Resources is key


[Diagram; labels recovered from the slide]

Developer API

• ComputeService (VMs, Memory, Local Disk): Servers

• StorageService (Block, Massive Key-value store): Disks

• NetworkService (Virtual Networks, Services): Networks

Resource pool: a mix of VMs and bare metal

Map heavy workloads onto bare metal with more resources, and light workloads onto virtualized resources

Network (topology) and storage aware scheduling
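
The two scheduling ideas on this slide can be illustrated with a small sketch. This is not the scheduler from the deck or any OpenStack API; the `Host` and `Task` types, the `HEAVY_CORES` threshold, and the `pick_host` function are all hypothetical names chosen for illustration. It assumes a task is "heavy" when it asks for many cores, and that topology and storage awareness mean preferring a host in the same rack as the task's input data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    kind: str        # "metal" or "vm" (hypothetical labels)
    rack: str        # stand-in for network topology
    free_cores: int


@dataclass
class Task:
    name: str
    cores: int
    data_rack: str   # rack that stores the task's input data


HEAVY_CORES = 8  # assumed cut-off between light and heavy workloads


def pick_host(task: Task, hosts: List[Host]) -> Optional[Host]:
    """Pick a host for the task, or return None if nothing has capacity."""
    wanted = "metal" if task.cores >= HEAVY_CORES else "vm"
    fits = [h for h in hosts if h.free_cores >= task.cores]
    if not fits:
        return None
    # Rank: matching resource kind first, then same rack as the data,
    # then the host with the most free cores.
    best = min(
        fits,
        key=lambda h: (h.kind != wanted, h.rack != task.data_rack, -h.free_cores),
    )
    best.free_cores -= task.cores  # reserve the capacity
    return best


hosts = [
    Host("m1", "metal", "r1", 32),
    Host("v1", "vm", "r1", 4),
    Host("v2", "vm", "r2", 4),
    Host("m2", "metal", "r2", 32),
]
heavy = pick_host(Task("map-job", 16, "r2"), hosts)
light = pick_host(Task("etl", 2, "r1"), hosts)
print(heavy.name, light.name)
```

In this sketch the resource kind and data locality are soft preferences: a light task still lands on bare metal if no VM has capacity. A real scheduler might instead treat some of them as hard constraints.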

