Big Data

Date post: 27-Jan-2015
Upload: bc-technology-industry-association
Everything You Need to Know About ‘Big Data’, BI and Data Acceleration Adrian Westmoreland December, 2012
Transcript
Page 1: Big Data

Everything You Need to Know About

‘Big Data’, BI and Data Acceleration Adrian Westmoreland

December, 2012

Page 2: Big Data

© 2012 SAP AG. All rights reserved. 2

Safe harbor statement

The information in this presentation is confidential and proprietary to SAP and may not be disclosed without the permission of SAP. This presentation is not subject to your license agreement or any other service or subscription agreement with SAP. SAP has no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation and SAP's strategy and possible future developments, products and or platforms directions and functionality are all subject to change and may be changed by SAP at any time for any reason without notice. The information on this document is not a commitment, promise or legal obligation to deliver any material, code or functionality. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. This document is for informational purposes and may not be incorporated into a contract. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent.

All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.

Page 3: Big Data

Agenda

Over the next 30 minutes

You’ll gain an understanding of Big Data technologies, opportunities and challenges, including how technology innovations are changing the Big Data landscape.

You’ll see a brief overview of SAP’s Big Data architecture.

You’ll discover how other companies are utilizing Big Data for their benefit.

Page 4: Big Data

Introduction to “Big Data”

Page 5: Big Data

SOCIAL

Page 6: Big Data

Data sources (word cloud): CRM Data, GPS, Demand, Speed, Velocity, Transactions, Opportunities, Service Calls, Customer, Sales Orders, Inventory, Emails, Tweets, Planning, Things, Mobile, Instant Messages

VELOCITY Worldwide digital content will double in 18 months, and every 18 months thereafter.

VOLUME In 2005, humankind created 150 exabytes of information. In 2011, 1,200 exabytes will be created.

VARIETY 80% of enterprise data will be unstructured, spanning traditional and non-traditional sources.

Sources: Gartner, IDC, The Economist

VARIABILITY Configuring modern software can be extremely difficult since a good configuration depends (at least) on the hardware environment, the workload, the load intensity, and the target behavior. MassConf Paper, ACM

VALUE Empowerment of the End User is the goal of Enterprise Software. SAP

VALIDITY Ensure that the information was created in accordance with complete understanding of the use cases and includes all the other aspects of data quality. Gartner

Page 7: Big Data

IDC “Big Data” definition

IDC’s “Big Data” definition utilizes criteria and steps to determine whether a use case and associated technology and services should be included in the “Big Data” market sizing. These include the following scenarios:

• Deployments where the data collected is over 100TB (data collected, not stored, accounts for the use of in-memory technology where data may not be stored on a disk)

• Deployments of ultra-high-speed messaging technology for real-time, streaming data capture and monitoring

• Deployments where the data sets may not be very large today but are growing very rapidly, at a rate of 60% or more annually

Next, IDC evaluates whether, for each of the above scenarios, the technology is deployed on scale-out infrastructure. Finally, IDC evaluates whether the deployments include two or more data types or data sources, and/or include high-speed data sources such as click-stream tracking or monitoring of machine-generated data.
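The IDC steps above can be read as a small decision rule. The following is a minimal sketch of that reading, assuming all three steps must pass for a deployment to count; the `Deployment` class, its field names, and the function names are hypothetical and are not taken from IDC.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical description of a deployment; all field names are illustrative."""
    collected_tb: float        # data collected (not necessarily stored), in TB
    streaming_capture: bool    # ultra-high-speed messaging for real-time capture
    annual_growth: float       # yearly growth rate, 0.60 means 60%
    scale_out: bool            # deployed on scale-out infrastructure
    data_source_count: int     # number of distinct data types or data sources
    high_speed_sources: bool   # e.g. click-stream or machine-generated data


def matches_scenario(d: Deployment) -> bool:
    """Step 1: at least one of the three scenarios applies."""
    return d.collected_tb > 100 or d.streaming_capture or d.annual_growth >= 0.60


def is_big_data(d: Deployment) -> bool:
    """Steps 1 to 3 combined (assumption: every step must pass)."""
    varied_or_fast = d.data_source_count >= 2 or d.high_speed_sources
    return matches_scenario(d) and d.scale_out and varied_or_fast


if __name__ == "__main__":
    # A small but fast-growing, scale-out deployment with several sources.
    example = Deployment(5, False, 0.75, True, 3, False)
    print(is_big_data(example))
```

Under this reading a data set well below 100TB can still qualify if it grows by 60% or more per year, which matches the third scenario.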

Page 8: Big Data

Open Source Big Data – Even More Confused?

Collect: Kafka, Flume, Scribe

Process: Azkaban, Oozie, Pig, Hive, Hadoop MapReduce, S4, Storm

Store: Voldemort, Cassandra, HBase

Present: Analytics? Applications? Mobile?

Page 9: Big Data

New storage and processing techniques required

Columnar

Distributed

In-memory

Row

Real-time queries

High value data

Targeted data read

Batch queries

Flexible data sets

All data read
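The contrast between "Targeted data read" and "All data read" can be illustrated with a toy comparison of row and column layouts. This is only a sketch: the sample orders, the function names, and the pairing of "targeted" reads with the columnar layout are assumptions made for illustration, not something the slide states.

```python
# Three hypothetical sales orders, used only for illustration.
ORDERS = [
    {"order_id": 1, "customer": "A", "amount": 120.0},
    {"order_id": 2, "customer": "B", "amount": 80.0},
    {"order_id": 3, "customer": "A", "amount": 200.0},
]


def to_columns(rows):
    """Pivot a list of records into one list per attribute (column layout)."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def sum_amount_rows(rows):
    """Row layout: model a scan that reads whole records to total one attribute."""
    values_read = sum(len(row) for row in rows)
    return sum(row["amount"] for row in rows), values_read


def sum_amount_columns(columns):
    """Column layout: only the 'amount' column has to be read."""
    values_read = len(columns["amount"])
    return sum(columns["amount"]), values_read


if __name__ == "__main__":
    print(sum_amount_rows(ORDERS))
    print(sum_amount_columns(to_columns(ORDERS)))
```

Both functions return the same total; the second element of each result is the number of values the modelled scan touches, which grows with the number of attributes only in the row layout.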

Page 10: Big Data

In-Memory Computing Rethink

Yesterday (Disk + CPU): Software and data reside on HDD

• I/O constraint

• Support many platforms

• Optimized for none

Today (Memory + CPU): Single Optimized Platform

• Leverage latest advances in hardware

• Minimize I/O time

• Optimized for x86 platform

Memory: Partitioning, Insert Only on Delta, Compression, Row and Column Store, No aggregates

CPU: Multi-Core, Massively Parallel

64-bit address space supports 2TB RAM

100GB/s throughput

Logging and Backup – Solid State / Flash / HDD

Page 11: Big Data

The future of database technology

Chart: In-memory Computing Adoption and Traditional Database Adoption over Time

1990 – Cost per Terabyte: Disk $9,000,000, Memory $106,000,000

2012 – Cost per Terabyte: Disk $60, Memory $4,900

Falling prices move processing from Disk/SSD to In-Memory
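As a rough check on the scale of the price drop, the cost-per-terabyte figures quoted on this slide can be compared directly. This is a minimal sketch that uses only the four numbers from the slide; the dictionary and function names are hypothetical.

```python
# Cost per terabyte in US dollars, as quoted on the slide.
COST_PER_TB_USD = {
    1990: {"Disk": 9_000_000, "Memory": 106_000_000},
    2012: {"Disk": 60, "Memory": 4_900},
}


def decline_factor(medium):
    """How many times cheaper one terabyte was in 2012 than in 1990."""
    return COST_PER_TB_USD[1990][medium] / COST_PER_TB_USD[2012][medium]


def memory_premium(year):
    """Price of a terabyte of memory relative to a terabyte of disk."""
    costs = COST_PER_TB_USD[year]
    return costs["Memory"] / costs["Disk"]


if __name__ == "__main__":
    for medium in ("Disk", "Memory"):
        print(f"{medium}: {decline_factor(medium):,.0f}x cheaper per TB")
    for year in (1990, 2012):
        print(f"{year}: memory costs {memory_premium(year):.1f}x disk per TB")
```

On these figures memory is still the more expensive medium per terabyte in 2012; the slide's argument rests on the absolute price of memory having fallen, not on memory becoming cheaper than disk.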

Page 12: Big Data

Main memory reference 100 ns

Compress 1K bytes with Zippy 3,000 ns = 3 µs

Send 2K bytes over 1 Gbps network 20,000 ns = 20 µs

SSD random read 150,000 ns = 150 µs

Read 1 MB sequentially from memory 250,000 ns = 250 µs

Round trip within same datacenter 500,000 ns = 0.5 ms

Read 1 MB sequentially from SSD* 1,000,000 ns = 1 ms

Disk seek 10,000,000 ns = 10 ms

Read 1 MB sequentially from disk 20,000,000 ns = 20 ms

Send packet Canada->Europe->Canada 150,000,000 ns = 150 ms

*Assuming ~1GB/sec SSD

Data by [Jeff Dean](http://research.google.com/people/jeff/)

Originally by [Peter Norvig](http://norvig.com/21-days.html#answers)

The future of database technology

Page 13: Big Data

Let's multiply all these durations by a billion:

Hour:

Main memory reference 100 s Brushing your teeth

Compress 1K bytes with Zippy 50 min One episode of a TV show (including ad breaks)

Day:

Send 2K bytes over 1 Gbps network 5.5 hr From lunch to end of work day

Week

SSD random read 1.7 days A normal weekend

Read 1 MB sequentially from memory 2.9 days A long weekend

Round trip within same datacenter 5.8 days A vacation

Read 1 MB sequentially from SSD 11.6 days A European vacation

Year

Disk seek 16.5 weeks A semester in university

Read 1 MB sequentially from disk 7.8 months

The above two together 1 year

Decade

Send packet Canada-Europe-Canada 4.8 years Average time it takes to complete a bachelor's degree
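The scaling above can be reproduced in a few lines. Multiplying a duration in nanoseconds by a billion gives the same number in seconds, so the sketch below simply reinterprets each latency as seconds and formats it. The `humanize` helper and its choice of units are assumptions (it reports days where the slide uses weeks or months), and the last entry uses 150,000,000 ns, the value consistent with the 150 ms and 4.8 years quoted on the slides.

```python
# Latencies in nanoseconds, taken from the table on the previous slide.
LATENCIES_NS = {
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Send 2K bytes over 1 Gbps network": 20_000,
    "SSD random read": 150_000,
    "Read 1 MB sequentially from memory": 250_000,
    "Round trip within same datacenter": 500_000,
    "Read 1 MB sequentially from SSD": 1_000_000,
    "Disk seek": 10_000_000,
    "Read 1 MB sequentially from disk": 20_000_000,
    "Send packet Canada->Europe->Canada": 150_000_000,
}

# Unit sizes in seconds, largest first (a year is taken as 365 days).
UNITS = [("years", 365 * 86_400), ("days", 86_400), ("hr", 3_600), ("min", 60), ("s", 1)]


def scale_by_billion(nanoseconds):
    """x ns multiplied by 1e9 is x seconds, because 1e9 ns make one second."""
    return nanoseconds


def humanize(seconds):
    """Format a duration using the largest unit that fits at least once."""
    for name, size in UNITS:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:.1f} s"


if __name__ == "__main__":
    for operation, ns in LATENCIES_NS.items():
        print(f"{operation:40s} {humanize(scale_by_billion(ns))}")
```

Rounding differs slightly from the slide in places because the helper rounds to one decimal and only knows years, days, hours, minutes and seconds.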

The future of database technology

Page 14: Big Data

What is SAP HANA?

A flexible, data source agnostic in-memory analytic appliance to quickly process and analyze large volumes of transactional data in real-time

A modern platform that serves as the foundation to develop a new class of real-time applications

In-Memory Database that runs under SAP NetWeaver BW for a supercharged data warehouse

SAP HANA Studio

Real-Time Data Replication

SAP HANA™

SAP Applications

Non-SAP Data sources

SAP HANA Database

Calculation Engine

Row & Column In-Memory

SAP BusinessObjects Data Integrator

SAP Information Composer

SAP BusinessObjects BI Solutions

SAP Applications

SAP NetWeaver BW

SAP HANA – Overview

Page 15: Big Data

Next generation SAP Real-time Data Platform

Applications and clients: SAP Analytics, SAP Business Suite, SAP Big Data Applications, 3rd Party BI Clients, SAP Mobile, Custom Apps

On Premise / Cloud

Open Developer APIs and Protocols

Common Landscape Management

SAP Enterprise Information Management: SAP Sybase Replication Server, SAP Data Services, SAP MDG and MDM

SAP Real-time Data Platform: SAP HANA Platform, SAP Sybase IQ, SAP Sybase ASE, SAP Sybase SQLA, SAP Sybase ESP

Common Modeling Sybase PowerDesigner

MPP Scale-Out

SAP NW BW

Page 16: Big Data

Introducing SAP Big Data Processing Framework

Provide optimized data management across each phase of the information lifecycle process and deliver real-time, actionable insights

Collect

• Sybase Replication Server for real-time high value data replication

• Sybase ESP for collecting stream data

• SAP BusinessObjects Data Services with Hadoop Connectors for collecting data from disparate sources via batch

Store

• SAP HANA, ASE, or IQ for real-time data store

• Sybase IQ for near-time data store and multimedia data storage

• Hadoop for long-term, extended archive

Process

• SAP HANA and Sybase IQ for real-time high value data processing

• Sybase Event Stream Processor for real-time event data processing

• Sybase IQ for federated query w/ MapReduce API

• Hadoop/MapReduce for batch, explorative data processing

Present

• SAP BusinessObjects BI platform to display federated query results across Hadoop and HANA/IQ to provide deep insights (dashboards, visualization, data exploration, predictive analysis, analytic applications, and embedded BI in business applications)
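The four phases of the framework can be sketched as a simple lookup table from phase and workload to the products named on the slide. The product names come from the slide; the workload keys ("stream", "near-time" and so on), the `FRAMEWORK` dictionary and the `products_for` function are hypothetical labels chosen for this illustration.

```python
# Phase -> workload -> products, transcribed from the slide.
# The workload keys are illustrative labels, not SAP terminology.
FRAMEWORK = {
    "collect": {
        "real-time replication": ["Sybase Replication Server"],
        "stream": ["Sybase ESP"],
        "batch": ["SAP BusinessObjects Data Services with Hadoop Connectors"],
    },
    "store": {
        "real-time": ["SAP HANA", "ASE", "IQ"],
        "near-time": ["Sybase IQ"],
        "long-term archive": ["Hadoop"],
    },
    "process": {
        "real-time": ["SAP HANA", "Sybase IQ"],
        "event stream": ["Sybase Event Stream Processor"],
        "federated query": ["Sybase IQ"],
        "batch": ["Hadoop/MapReduce"],
    },
    "present": {
        "federated results": ["SAP BusinessObjects BI platform"],
    },
}


def products_for(phase, workload):
    """Return the products the slide lists for a phase and workload, or an empty list."""
    return FRAMEWORK.get(phase, {}).get(workload, [])


if __name__ == "__main__":
    # Trace one hypothetical pipeline: batch collection, archive storage, batch processing.
    for phase, workload in [("collect", "batch"), ("store", "long-term archive"), ("process", "batch")]:
        print(phase, workload, products_for(phase, workload))
```

Laying the slide out this way makes the division of labour visible: Hadoop appears only on the batch and archive paths, while HANA and IQ cover the real-time ones.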

Page 17: Big Data

Fresh Direct

“Our Food is Fresh.

Our Customers Are Spoiled”

Page 18: Big Data

Parking Ticket Optimization

Page 19: Big Data

“Transforming information into intelligence in real time is a cornerstone for McLaren’s winning formula – and increasingly critical for the future of every company,” Jim Hagemann Snabe, co-CEO, SAP AG

“Using HANA we can hopefully automate decision making. People have always made decisions based on the data, but we want to get to the point where the system can make the decision,” Stuart Birrell, McLaren CIO

McLaren Group Limited Automotive Industry (Formula One) – Predict and Transform the outcome of races

Product: Agile Datamart - POC

Business Challenges

Cut costs on the expensive data scientists who currently help the team's data analysts measure and predict the car’s performance

Better anticipate, accelerate and differentiate its business from competitors

Technical Challenges

Turbo-charge both the speed and depth of McLaren’s telemetry technology

Process Big Data and act on it rapidly to create the prescriptive intelligence that helps transform the outcome of races

Benefits

Real-time analysis of car sensor data – historical data and predictive models

Make immediate proactive corrections and avoid costly, dangerous incidents and win the race

Provide a technology engine that was integrated, scalable and delivered maximum performance

14,000x faster data analysis – from 5 hours to 1 second

99% predict the outcome of a race

Page 20: Big Data

SunGard Leading software and IT services company

“SAP Sybase IQ is simple to manage and operate and it’s enabling us to easily build really big systems in a way that is cost-effective, manageable and sustainable… It doesn’t matter what we throw at it, it seems to take it in stride and give us a great response… We feel like it’s a solution that will carry us forward into uncharted territory. We see no limit to how far we can go with it.” Product Architect, SunGard

Business Challenges

Enabling the building of newer and larger systems – allowing expansion into new markets and business areas.

Technical Challenges

Handle very large and continuously growing volumes of data without performance degradation.

Existing system began to experience performance deterioration that was unacceptable to end-users

Benefit

Slashes query response time regardless of data volumes

Enables analytics and reporting against virtually unlimited data

1 Trillion rows of data stored

80 TB of compressed data

Page 21: Big Data

SAP HANA + Hadoop + R

Benefits

Reduces time to detect variant DNA

In-memory accelerates predictive & correlation analysis

Optimized treatment plans based on DNA mutations

Long-term study of DNA-based cancer treatment

“Genomic DNA analysis in real-time will transform how we enable comprehensive patient care to fight against cancer. SAP HANA will be the mission critical and reliable data platform to make real-time cancer analytics into a reality. Separately, our internal technical comparison demonstrated that SAP HANA outperforms a traditional disk-based system by a factor of 408,000 when performing other types of data analysis.”

Yukihisa Kato, Director & Executive Officer, CTO, Research and Development Center, MITSUI KNOWLEDGE INDUSTRY CO., LTD.

408,000x faster than traditional disk-based systems in PoC

216x faster DNA analysis results - from 2-3 days to 20 minutes

Page 22: Big Data

SOCIAL ANALYTICS MOBILE BIG DATA CLOUD

HANA REAL-TIME PLATFORM

Page 23: Big Data

Thank You!

Adrian Westmoreland

SAP Canada

[email protected]

604 647 8343

Page 24: Big Data

SAP BusinessObjects BI 4.0 and SAP HANA

Page 25: Big Data

SAP HANA A platform for a new class of real-time analytics and applications

Real-time analytics

SAP Business Suite Third-party systems

SAP HANA

Microsoft Excel

SAP BusinessObjects solutions

Others… (Open)

Real-time replication services

Data services

Real-time apps

In-memory database

Planning and Calculation Engine

R & Hadoop integration

Predictive Analysis & Business Function Libraries

SAP NetWeaver Business Client

Information Composer & Modeling Studio

Text Search Application Services (e.g. HTML 5 Server)

Page 26: Big Data

Today's World

Data Warehouse / Marts (OLAP)

Transactional System (OLTP)

Real-time posting into Transactional System

Aggregation

Batch transfer to Data Warehouse

Limited flexibility due to pre-defined data structures

Long query run-times

Loss of detail

Long wait times for reports

Reporting

Challenges

Large Volumes

High Impact

'Real Life' Business Transaction

Analysis and Insight

Action

Page 27: Big Data

What if this all happened in real time?

No Aggregation / No Data Staging / No Data Marts

Real-time Loading into SAP HANA

High Performance Large Volume Data Processing

Fast, flexible and detailed analytics over large volumes

SAP HANA IN-MEMORY

'Real Life' Business Transaction

Analysis and Insight

Action

Page 28: Big Data

Accelerated BI with SAP BusinessObjects and SAP HANA

One Unified and Complete BI Suite Addressing the Full Spectrum of BI on SAP HANA

Discovery and Analysis: Discover. Predict. Create.

• Discover areas to optimize your business

• Adapt data to business needs

• Tell your story with beautiful visualizations

Dashboards and Apps: Build Engaging Experiences

• Deliver engaging information to users where they need it

• Track key performance indicators and summary data

• Build custom experiences so users get what they need quickly

Reporting: Share Information

• Securely distribute information across your organization

• Give users the ability to ask and answer their own questions

• Build printable reports for operational efficiency

Page 29: Big Data

Agility for business analysts and business users

Discover trends, outliers and areas of interest in your business

Adapt to business scenarios by combining, manipulating, and enriching data

Tell your story with self-service visualizations and analytics

Forecast and predict future outcomes

Discovery and Analysis Discover. Predict. Create.

Portfolio

Visual Intelligence

Explorer

Analysis

Predictive Analysis

Page 30: Big Data

Build engaging, visual dashboards

Powerful environment to build interactive and visually appealing analytics

Rich set of controls: buttons, list boxes, drop-down, crosstabs, charts…

Use custom code to extend and build workflows

Dashboards and Apps Build Engaging Experiences

Portfolio

Design Studio

Dashboards (aka Xcelsius®)

Page 31: Big Data

High productivity design for report designers

Quickly build formatted reports on any data source

Securely distribute reports both internally and externally

Minimize IT support costs by empowering end users to easily create and modify their own reports

Enhance custom applications with embedded reports

Reporting Share Information

Portfolio

Web Intelligence

Crystal Reports

Page 32: Big Data

BI 4 Platform: Open, Agnostic, and Unified

Access any data, consume information anywhere

Enterprise Portals

MS Office On Demand Services

Browsers Mobile Devices

ERP

Embedded Content

Personal

Universe Semantic Layer

Business Intelligence Platform

EDW

Discovery and Analysis Dashboards and Apps Reporting

Unstructured


