
Hortonworks kognitio webinar 10 dec 2013

Date post: 05-Dec-2014
Upload: michael-hiskey
Description:
Webinar with the Hortonworks team on Hadoop 2.0 and the need for in-memory acceleration over Hadoop.
Hadoop and the new BI: The Modern Data Architecture …for in memory Big Data Analytics 10 December 2013
Transcript
Page 1: Hortonworks kognitio webinar 10 dec 2013

Hadoop and the new BI: The Modern Data Architecture…for in memory Big Data Analytics 10 December 2013

Page 2: Hortonworks kognitio webinar 10 dec 2013

© Hortonworks Inc. 2013

Quick Housekeeping

Q&A box is available for your questions

Webinar will be recorded for future viewing

Thank you for joining!

Page 3: Hortonworks kognitio webinar 10 dec 2013

Modern Data Architecture…for in memory Big Data Analytics

Page 4: Hortonworks kognitio webinar 10 dec 2013

Your Presenters

• Paul Groom (@datagroom)
– Chief Innovation Officer
– 28 years buried in the big data of the data, guiding business users to value
– Two wheels are more fun than four

• John Kreisa (@marked_man)
– VP Strategic Marketing, Hortonworks
– Over 20 years in data management as a developer and a marketer
– Avid camper

Page 5: Hortonworks kognitio webinar 10 dec 2013

Today’s Topics

• Introduction
• Drivers for the Modern Data Architecture (MDA)
• Apache Hadoop in the MDA
• Kognitio’s role in the MDA
• Q&A

Page 6: Hortonworks kognitio webinar 10 dec 2013

Existing Data Architecture

[Diagram: layered architecture]
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs)

Source: IDC
• 2.8 ZB in 2012
• 85% from New Data Types
• 15x Machine Data by 2020
• 40 ZB by 2020

Page 7: Hortonworks kognitio webinar 10 dec 2013

Modern Data Architecture Enabled

[Diagram: layered architecture]
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DEV & DATA TOOLS: BUILD & TEST
DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
OPERATIONAL TOOLS: MANAGE & MONITOR
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)

Page 8: Hortonworks kognitio webinar 10 dec 2013

Hadoop Powers Modern Data Architecture

Apache Hadoop is an open source project governed by the Apache Software Foundation (ASF) that allows you to gain insight from massive amounts of structured and unstructured data quickly and without significant investment.

[Diagram: Hadoop Cluster, a row of nodes each providing compute & storage]

Hadoop clusters provide scale-out storage and distributed data processing on commodity hardware.
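As a rough illustration of the scale-out idea on this slide, the sketch below splits records into blocks (each standing in for one cluster node), runs the same map step on every block, and merges the partial results. It is a toy, single-process word count in plain Python and does not use any Hadoop API; the names `split_blocks`, `map_block` and `reduce_counts` and the sample log lines are hypothetical.

```python
from collections import Counter
from functools import reduce


def split_blocks(records, num_nodes):
    """Distribute records round-robin into num_nodes blocks (one per simulated node)."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def map_block(block):
    """Map step: a simulated node counts the words in its own block only."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-node partial counts into a single result."""
    return reduce(lambda a, b: a + b, partials, Counter())


# Hypothetical sample data standing in for log lines stored across the cluster.
logs = ["error disk full", "login ok", "error timeout", "login ok"]
partials = [map_block(b) for b in split_blocks(logs, num_nodes=2)]
totals = reduce_counts(partials)
print(totals["error"], totals["login"])
```

Adding nodes in this model only means more blocks and more partial results to merge, which is the sense in which storage and processing scale out together.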

Page 9: Hortonworks kognitio webinar 10 dec 2013

Drivers of Hadoop Adoption

From NEW types of Data (or existing types for longer)

New Business Applications

Page 10: Hortonworks kognitio webinar 10 dec 2013

Most Common NEW TYPES OF DATA

1. Sentiment: Understand how your customers feel about your brand and products – right now

2. Clickstream: Capture and analyze website visitors’ data trails and optimize your website

3. Sensor/Machine: Discover patterns in data streaming automatically from remote sensors and machines

4. Geographic: Analyze location-based data to manage operations where they occur

5. Server Logs: Research logs to diagnose process failures and prevent security breaches

6. Unstructured (txt, video, pictures, etc.): Understand patterns in files across millions of web pages, emails, and documents

[Diagram label: Value]

Page 11: Hortonworks kognitio webinar 10 dec 2013

Keep Existing Data Around Longer

• Online archive
– Data that was once moved to tape can now be queried to understand long term trends

• Compliance retention
– Industry specific requirements for retention of data

• Combine with external historical data sources
– Weather, survey, research, purchased, etc.

[Diagram label: Value]

Page 12: Hortonworks kognitio webinar 10 dec 2013

Drivers of Hadoop Adoption

Architectural: A Modern Data Architecture
Complement your existing data systems: the right workload in the right place

New Business Applications

Page 13: Hortonworks kognitio webinar 10 dec 2013

Requirements for Hadoop Adoption

Requirements for Hadoop’s Role in the Modern Data Architecture:
• Integrated: Interoperable with existing data center investments
• Key Services: Platform, operational and data services essential for the enterprise
• Skills: Leverage your existing skills: development, operations, analytics

Page 14: Hortonworks kognitio webinar 10 dec 2013

Requirements for Enterprise Hadoop

1. Key Services: Platform, Operational and Data services essential for the enterprise
2. Skills: Leverage your existing skills: development, analytics, operations
3. Integrated: Engineered with existing data center investments

[Diagram: HORTONWORKS DATA PLATFORM (HDP)]
Layer labels: PLATFORM SERVICES, CORE, OPERATIONAL SERVICES, DATA SERVICES, LOAD & EXTRACT
Components: HDFS, YARN, MAP REDUCE, TEZ, HIVE & HCATALOG, PIG, HBASE, SQOOP, FLUME, NFS, WebHDFS, KNOX*, OOZIE, AMBARI, FALCON*
Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots
Deployment options: OS/VM, Cloud, Appliance

Page 15: Hortonworks kognitio webinar 10 dec 2013

Requirements for Enterprise Hadoop

1. Key Services: Platform, operational and data services essential for the enterprise
2. Skills: Leverage your existing skills: development, analytics, operations
3. Integration: Engineered with existing data center investments

[Diagram: skills grid]
DEVELOP: COLLECT, PROCESS, BUILD
ANALYZE: EXPLORE, QUERY, DELIVER
OPERATE: PROVISION, MANAGE, MONITOR

Page 16: Hortonworks kognitio webinar 10 dec 2013

Familiar and Existing Tools

1. Key Services: Platform, operational and data services essential for the enterprise
2. Skills: Leverage your existing skills: development, analytics, operations
3. Integration: Engineered with existing data center investments

[Diagram: skills grid]
DEVELOP: COLLECT, PROCESS, BUILD
ANALYZE: EXPLORE, QUERY, DELIVER
OPERATE: PROVISION, MANAGE, MONITOR

Page 17: Hortonworks kognitio webinar 10 dec 2013

Requirements for Enterprise Hadoop

3. Integration: Engineered with existing data center investments

Integrated with:
• Applications: Business Intelligence, Developer IDEs, Data Integration
• Systems: Data Systems & Storage, Systems Management
• Platforms: Operating Systems, Virtualization, Cloud, Appliances

[Diagram: layered architecture]
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DEV & DATA TOOLS: BUILD & TEST
DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
OPERATIONAL TOOLS: MANAGE & MONITOR
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)

Page 18: Hortonworks kognitio webinar 10 dec 2013

A Modern Data Architecture Applied

• Complement data systems
• Right workload right place

[Diagram: layered architecture]
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)

Page 19: Hortonworks kognitio webinar 10 dec 2013

Kognitio in the Modern Data Architecture

[Diagram: layered architecture]
APPLICATIONS: Business Analytics, Business Intelligence Tools, OLAP Clients
In-memory MPP Accelerator
DEV & DATA TOOLS: BUILD & TEST
DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
OPERATIONAL TOOLS: MANAGE & MONITOR
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)

Page 20: Hortonworks kognitio webinar 10 dec 2013

Kognitio in the Modern Data Architecture

[Diagram: layered architecture]
APPLICATIONS
DATA SYSTEM: RDBMS, EDW, MPP
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
OPERATIONAL TOOLS
DEV & DATA TOOLS
INFRASTRUCTURE
Other labels: HANA, BusinessObjects BI, In-memory MPP Accelerator

Page 21: Hortonworks kognitio webinar 10 dec 2013

Today’s Topics

• Introduction
• Drivers for the Modern Data Architecture (MDA)
• Apache Hadoop’s role in the MDA
• Kognitio’s role in the MDA
• Q&A

Page 22: Hortonworks kognitio webinar 10 dec 2013

Hadoop and the new BI

Requirements for Hadoop’s Role in the Modern Data Architecture:
1. Key Services: Platform, operational and data services essential for the enterprise
2. Skills: Leverage your existing skills: development, operations, analytics
3. Integrated: Interoperable with existing data center investments

Page 23: Hortonworks kognitio webinar 10 dec 2013

Motivation

1. Key Services: Platform, Operational and Data services essential for the enterprise

• Historical architecture = Existing investment
[Logo shown: Cognos]

• Must plug-and-play with MDA
– Do not disrupt, enhance!

• Performance and behavior expectations
– Dynamic ad-hoc access
– Drill unlimited
– Report on-demand

Page 24: Hortonworks kognitio webinar 10 dec 2013

Business [Intelligence] Desires

• More timely
• Lower latency
• More granularity
• Better concurrency
• Richer data model
• Self service

Page 25: Hortonworks kognitio webinar 10 dec 2013

BI Activity

Insulate the Hadoop cluster

Page 26: Hortonworks kognitio webinar 10 dec 2013

In-memory analytical platform

2. Skills: Leverage your existing skills: development, analytics, operations
3. Integration: Engineered with existing data center investments

• Software only
– Easy to deploy alongside HDP
– Simple two stage install

• Commodity Hardware
– X86/64 Linux Platform with 10GbE network – same as HDP
– Biased to more RAM and less disk

• Scale-out MPP
– Same compute model as Hadoop
– Strong focus on 100% effective CPU utilization for any given query

• Exploits features of underlying persistent store
– Simple ‘Pull data’ access methods
– Parallelism – all HDP nodes intercommunicating with all Kognitio nodes

• ANSI 2011 SQL
– Mature fully featured
– Transaction processing capable

• Not-only-SQL
– Any script or binaries executed in-line within SQL queries
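To make the "pull data into memory, query with SQL, run custom logic in-line" idea concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module purely as a stand-in. It is not Kognitio's interface or syntax: the `sales` table, the sample rows and the `band` function are all hypothetical, and a single-process SQLite database says nothing about MPP behaviour.

```python
import sqlite3

# Hypothetical rows standing in for data pulled from the persistent store.
rows = [("uk", 120.0), ("us", 80.0), ("uk", 30.0), ("de", 55.5)]

conn = sqlite3.connect(":memory:")  # the whole database lives in RAM
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)


def band(amount):
    """Hypothetical custom logic that is called from inside the SQL statement."""
    return "high" if amount >= 100 else "low"


# Register the Python function so SQL can call it like a built-in.
conn.create_function("band", 1, band)

result = conn.execute(
    "SELECT region, SUM(amount), band(SUM(amount)) "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(result)
```

The point of the sketch is only the shape of the workflow: standard SQL does the aggregation, and user-supplied code participates in the same query rather than in a separate processing step.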

Page 27: Hortonworks kognitio webinar 10 dec 2013

Tight Integration

3. Integration: Engineered with existing data center investments

• HDFS Connector
– Low Latency access

• Map-reduce Connector
– Filtered access
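The difference between the two access styles named on this slide can be sketched in a few lines. This is a conceptual toy under stated assumptions, not either connector's actual API: `SOURCE`, `pull_all` and `pull_filtered` are hypothetical names, and the in-process list merely stands in for a file held in the Hadoop cluster.

```python
# Hypothetical records standing in for a file held in the Hadoop cluster.
SOURCE = [
    {"user": "a", "country": "UK", "spend": 10},
    {"user": "b", "country": "US", "spend": 25},
    {"user": "c", "country": "UK", "spend": 40},
]


def pull_all(source):
    """Direct-style access: every record is transferred; any filtering happens later."""
    return list(source)


def pull_filtered(source, predicate, columns):
    """Filtered-style access: rows and columns are reduced at the source before transfer."""
    return [{c: rec[c] for c in columns} for rec in source if predicate(rec)]


everything = pull_all(SOURCE)
uk_spend = pull_filtered(SOURCE, lambda r: r["country"] == "UK", ["user", "spend"])
print(len(everything), uk_spend)
```

In this model the filtered pull moves two narrow records instead of three full ones, which is the kind of trade-off that would favour filtering at the source when only a slice of a large data set is needed in memory.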

Page 28: Hortonworks kognitio webinar 10 dec 2013

So why In-memory?

• Exploit the ‘Dynamic’ access element of ‘D’-RAM
– Data placed in memory in structures best suited for CPUs, not for disks

[Graphic labels: INSTANT, WAIT]

Page 29: Hortonworks kognitio webinar 10 dec 2013

In-memory – getting work done

Page 30: Hortonworks kognitio webinar 10 dec 2013

Building Data Models

• Hadoop is a great repository
• Perfect to handle volume and variability without effort
• Perfect to ‘triage’ the data, to reshape, filter and project into…

• Data Virtualisation / Logical Data Warehouse
… but with the associated horsepower to dynamically analyse the data

• Plug standard tools straight in – not a Java programmer in sight!
• Central control and security

• Data model shelf life getting shorter – sandboxes and workbenches
– Build on-demand to meet today’s needs – just pull data from your HDP
– Lots of project based discovery and analytics
– World is changing rapidly
– Ever tighter feedback loops

Page 31: Hortonworks kognitio webinar 10 dec 2013

Increasing Computation

[Chart axes: Analytical Complexity, Technology/Automation]
Workloads plotted:
• Machine learning algorithms
• Dynamic Simulation
• Statistical Analysis
• Clustering
• Behaviour modelling
• Reporting & BPM
• Fraud detection
• Dynamic Interaction
• Campaign Management

Page 32: Hortonworks kognitio webinar 10 dec 2013

The Analytical Enterprise

[Diagram roles: Business Analyst, Systems Admin, Data Scientist]

Key: “Graduation”
• Projects will need to easily Graduate from the Data Science Lab and become part of Business as Usual

Page 33: Hortonworks kognitio webinar 10 dec 2013

Mature SQL atop Hadoop

Kognitio is an in-memory analytical platform that is tightly integrated with Hadoop for high-performance advanced analytics that make Big Data more consumable for enterprises, especially those with mature BI environments or engrained tools.

• Privately held
• Invented the in-memory analytical platform
• Labs in the UK - HQ in New York, NY

• Powering advanced analytics at organizations worldwide, such as:

Page 34: Hortonworks kognitio webinar 10 dec 2013

Kognitio in the Modern Data Architecture

[Diagram: layered architecture]
APPLICATIONS: Business Analytics, Business Intelligence Tools, OLAP Clients
In-memory MPP Accelerator
DEV & DATA TOOLS: BUILD & TEST
DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
OPERATIONAL TOOLS: MANAGE & MONITOR
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)

Page 35: Hortonworks kognitio webinar 10 dec 2013

Forrester Wave: a “strong performer”

© Forrester Corp. Used with permission.

• Kognitio’s entirely in-memory, distributed EDW is appealing for customers looking for fast performance on commodity hardware

• Kognitio’s EDW is a strong, cost-effective alternative to SAP HANA.

• Kognitio…was designed from the start as an MPP (distributed) in-memory RDBMS, making extensive use of RAM-based processing for maximum performance.

• Download a complimentary copy of the full report at www.kognitio.com/wave

Page 36: Hortonworks kognitio webinar 10 dec 2013

Question & Answer session will be conducted electronically, using the panel to the right of your screen

Today’s Slides available at: www.slideshare.net/kognitio

The Modern Data Architecture…for in memory Big Data Analytics

More about Kognitio and Hortonworks: http://hortonworks.com/partner/kognitio

Get started with Hortonworks Sandbox: http://hortonworks.com/hadoop-tutorial/

Follow us: @hortonworks @kognitio

