
Hitachi Data Systems Hadoop Solution

Date post: 19-Oct-2014
Description:
Hitachi Data Systems Hadoop Solution. Customers are seeing exponential growth of unstructured data, from their social media websites to operational sources. Their enterprise data warehouses are not designed to handle such high volumes and varieties of data. Hadoop, the latest software platform that scales to process massive volumes of unstructured and semi-structured data by distributing the workload through clusters of servers, is giving customers a new option to tackle data growth and deploy big data analysis to help better understand their business. Hitachi Data Systems is launching its latest Hadoop reference architecture, which is pre-tested with the Cloudera Hadoop distribution to provide a faster time to market for customers deploying Hadoop applications. HDS, Cloudera and Hitachi Consulting will present together and explain how to get you there. Attend this WebTech and learn how to: Solve big-data problems with Hadoop. Deploy Hadoop in your data warehouse environment to better manage your unstructured and structured data. Implement Hadoop using the HDS Hadoop reference architecture. For more information on the Hitachi Data Systems Hadoop Solution, please read our blog: http://blogs.hds.com/hdsblog/2012/07/a-series-on-hadoop-architecture.html
Transcript
Page 1: Hitachi Data Systems Hadoop Solution

HITACHI DATA SYSTEMS HADOOP SOLUTION

JUNE 12, 2012

Page 2: Hitachi Data Systems Hadoop Solution

Customers are seeing exponential growth of unstructured data, from their social media websites to operational sources. Their enterprise data warehouses are not designed to handle such high volumes and varieties of data.

Hadoop, the latest software platform that scales to process massive volumes of unstructured and semi-structured data by distributing the workload through clusters of servers, is giving customers a new option to tackle data growth and deploy big data analysis to help better understand their business.

Hitachi Data Systems is launching its latest Hadoop reference architecture, which is pre-tested with the Cloudera Hadoop distribution to provide a faster time to market for customers deploying Hadoop applications. HDS, Cloudera and Hitachi Consulting will present together and explain how to get you there.

Attend this WebTech and learn how to:

• Solve big-data problems with Hadoop.
• Deploy Hadoop in your data warehouse environment to better manage your unstructured and structured data.
• Implement Hadoop using the HDS Hadoop reference architecture.

HITACHI DATA SYSTEMS HADOOP SOLUTION

WEBTECH EDUCATIONAL SERIES

Page 3: Hitachi Data Systems Hadoop Solution

PRESENTERS

Shankar Radhakrishnan, Solutions Manager, Hitachi Data Systems

Sai Saiprabhu, Director, Specialized Services, Hitachi Consulting

Art Vancil, Big Data Senior Manager, Hitachi Consulting

Daniel Templeton, Partner Manager, Cloudera

Page 4: Hitachi Data Systems Hadoop Solution


ASK BIGGER QUESTIONS DANIEL TEMPLETON, PROGRAM MANAGER AT CLOUDERA

Page 5: Hitachi Data Systems Hadoop Solution

Enterprise Data Evolution

AMOUNT OF DATA

• Data collection & reporting

• Process data faster

• Store data more cost-effectively

• Simplify infrastructure

• Combine data from across the business

• Ask new questions immediately

• Enable new real-time applications

CREATE COMPETITIVE ADVANTAGE

IMPROVE OPERATIONAL EFFICIENCY

Page 6: Hitachi Data Systems Hadoop Solution

Data Has Changed in the Last 30 Years

DATA GROWTH

END-USER APPLICATIONS

THE INTERNET

MOBILE DEVICES

SOPHISTICATED MACHINES

STRUCTURED DATA – 10%

1980 2012

UNSTRUCTURED DATA – 90%

Page 7: Hitachi Data Systems Hadoop Solution

Data Management Strategies Have Stayed the Same

• Raw data on SAN, NAS and tape

• Data moved from storage to compute

• Relational models with predesigned schemas

Page 8: Hitachi Data Systems Hadoop Solution

Too Much Data, Too Many Sources

• Can’t ingest fast enough

Page 9: Hitachi Data Systems Hadoop Solution

Too Much Data, Too Many Sources


• Can’t ingest fast enough

• Costs too much to store

Page 10: Hitachi Data Systems Hadoop Solution

Too Much Data, Too Many Sources


• Can’t ingest fast enough

• Costs too much to store

• Exists in different places

Page 11: Hitachi Data Systems Hadoop Solution

Too Much Data, Too Many Sources

• Can’t ingest fast enough

• Costs too much to store

• Exists in different places

• Archived data is lost

Page 12: Hitachi Data Systems Hadoop Solution

Can’t Use It The Way You Want To

• Analysis and processing takes too long

Page 13: Hitachi Data Systems Hadoop Solution

Can’t Use It The Way You Want To

• Analysis and processing takes too long

• Data exists in silos

Page 14: Hitachi Data Systems Hadoop Solution

Can’t Use It The Way You Want To

• Analysis and processing takes too long

• Data exists in silos

• Can’t ask new questions

Page 15: Hitachi Data Systems Hadoop Solution

Can’t Use It The Way You Want To

• Analysis and processing takes too long

• Data exists in silos

• Can’t ask new questions

• Can’t analyze unstructured data

Page 16: Hitachi Data Systems Hadoop Solution


Transform The Way You Think About Data

Cloudera

Page 17: Hitachi Data Systems Hadoop Solution

The Cloudera Approach

Meet enterprise demands with a new way to think about data.

THE OLD WAY: Multiple platforms for multiple workloads

COMPLEX, FRAGMENTED, COSTLY

• Data silos by department or LOB
• Lots of data stored in expensive specialized systems
• Analysts pull select data into EDW
• No one has a complete view

THE CLOUDERA WAY: Single data platform to support BI, Reporting & App Serving

SIMPLIFIED, UNIFIED, EFFICIENT

• Bulk of data stored on scalable low cost platform
• Perform end-to-end workflows
• Specialized systems reserved for specialized workloads
• Provides data access across departments or LOB

Page 18: Hitachi Data Systems Hadoop Solution

Hadoop complements the Data Warehouse


OLTP

Enterprise Applications

Business Intelligence

Data Warehouse

Query (High $/Byte)

CLOUDERA

Store

Query Transform

ETL

Math

Load Archive

Operational BI

Archival Data, Exploration, Analytics

Page 19: Hitachi Data Systems Hadoop Solution

INGEST STORE EXPLORE PROCESS ANALYZE SERVE

CDH CLOUDERA MANAGER

CLOUDERA SUPPORT

Cloudera Enterprise: The Platform for Big Data


BRINGS STORAGE & COMPUTE TOGETHER

WORKS WITH EVERY TYPE OF DATA

CHANGES THE ECONOMICS OF DATA MANAGEMENT

A Revolutionary Solution Built on Apache Hadoop

CLOUDERA NAVIGATOR

Page 20: Hitachi Data Systems Hadoop Solution

CDH4

Big Data Storage, Processing & Analytics Based on Apache Hadoop

1. Store: Land structured and unstructured data in a scalable, cost-effective repository
2. Process & Analyze: Transform data in parallel and query at the speed of thought
3. Integrate: Interoperate with existing platforms, systems and applications

Page 21: Hitachi Data Systems Hadoop Solution

Cloudera Manager

End-to-End Administration for CDH

1. Deploy: Install, configure & start your cluster in 3 simple steps
2. Configure & Optimize: Ensure optimal settings for all hosts & services
3. Monitor, Diagnose & Report: Find & fix problems quickly, view current & historical activity & resource usage

Page 22: Hitachi Data Systems Hadoop Solution

Cloudera Navigator

Data Management Layer for Cloudera Enterprise

1. Audit & Access Control (AVAILABLE NOW): Ensuring appropriate permissions and reporting on data access for compliance
2. Exploration & Lineage (COMING SOON): Finding out what data is available, what it looks like and where it came from
3. Lifecycle Management (COMING SOON): Migration of data based on policies

Page 23: Hitachi Data Systems Hadoop Solution

Cloudera Support

Our Team of Experts on Call to Help You Meet Your SLAs

1. Extend Your Team: Get a dedicated team at your disposal to help you solve problems quickly
2. Leverage the Experts: Take advantage of our expertise to make sure your cluster operates at its best
3. Influence Roadmaps: Get advocacy with the open source community to build the features and functionality you need

Page 24: Hitachi Data Systems Hadoop Solution

Cloudera Enterprise

The Best Hadoop-Based Platform

CDH4

• The only solution with real time query (Impala)
• The only solution with HDFS high availability
• The most widely deployed & proven
• The broadest ecosystem of certified partners
• 100% open source & built for the enterprise

Cloudera Manager

• Management for the complete Hadoop system
• The most mature & functionally advanced
• The easiest to use w/built-in intelligence
• Integration w/enterprise monitoring tools

Cloudera Navigator

• The only data management tool for Hadoop
• Cloudera Navigator 1.0: Data audit & access control

Cloudera Support

• Dedicated team with a global presence
• Contributors and committers for every part of CDH
• Tens of thousands of nodes under management across industries

Page 25: Hitachi Data Systems Hadoop Solution

A Complete Solution


CLOUDERA UNIVERSITY

DEVELOPER TRAINING

ADMINISTRATOR TRAINING

DATA SCIENCE TRAINING

CERTIFICATION PROGRAMS

INGEST STORE EXPLORE PROCESS ANALYZE SERVE

CDH CLOUDERA MANAGER

CLOUDERA SUPPORT

CLOUDERA NAVIGATOR

Page 26: Hitachi Data Systems Hadoop Solution


CHOOSING THE RIGHT INFRASTRUCTURE FOR HADOOP SHANKAR RADHAKRISHNAN, SOLUTIONS PRODUCT MANAGER – ORACLE, SAP HANA AND BIG DATA SOLUTIONS

© Hitachi Data Systems Corporation 2013. All Rights Reserved.

Page 27: Hitachi Data Systems Hadoop Solution

HADOOP APPLICATION EXAMPLE: GENOME ANALYSIS

National Institute of Genomics – Japan

Challenge: Accelerate the speed of analysis for genome data from next-generation sequencers

4 PB of data

Solution

‒ 115-node Hadoop cluster using Hitachi Compute Rack servers

‒ Reliable and scalable solution

Page 28: Hitachi Data Systems Hadoop Solution

PROACTIVE MAINTENANCE AT HITACHI SERVER DIVISION

User Inquiry

Hardware Auditing Log

Callcenter Log

Maintenance Report

CRM Customer Data

Sales/Financial Data

Distribution/Stock Data

Location Information

Server Log

Operation History

BOM data

Production Data Of Business System

Challenge

・Proactive hardware maintenance from logs, call center data, and product information
・Leverage historical data for future product development

Solution: Hadoop + SAP HANA + SAP Visual Intelligence

Page 29: Hitachi Data Systems Hadoop Solution

• Cost-effective for low-fidelity data
• Increase efficiency and utilization of resources and meet required service levels
• Hardware less prone to failures
• Easy to manage
• Scale out to handle petabytes of unstructured and semi-structured data
• Keep data closer to CPU

DATA

GROWTH

COST

COMPLEXITY

INFRASTRUCTURE REQUIREMENTS FOR HADOOP
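
The requirements to scale out and to keep data closer to the CPU follow from the MapReduce model that Hadoop uses: each server processes the data blocks it already stores, and mainly intermediate results travel across the network. The following is a minimal single-process Python sketch of that idea, not Hadoop code. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical; on a real cluster each block would be mapped on the data node that holds it.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(block: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one block of text."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group the emitted values by key, as happens between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks: Iterable[str]) -> Dict[str, int]:
    """Count words across blocks; each block stands in for data on one node."""
    mapped: List[Tuple[str, int]] = []
    for block in blocks:
        # Assumption: in a cluster this call would run where the block is stored.
        mapped.extend(map_phase(block))
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    sample_blocks = ["big data", "Big cluster data data"]  # hypothetical input
    print(word_count(sample_blocks))
```

Because every block is mapped independently, adding servers adds both storage and processing capacity, which is the scale-out property listed above.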

Page 30: Hitachi Data Systems Hadoop Solution

HADOOP IN THE ENTERPRISE: ARCHITECTURE

One Platform for All Data, All Applications

[Architecture diagram: Business Apps; RDB; Other Big Data Sources (Email, Audio, Documents, etc.); Outside Services (Connect to Facebook for CRM, etc.); Data Connector; Real-Time Computer (Streaming); Hadoop; Data Warehouse; Business Intelligence Dashboard; CxOs, Data Scientist, Business Users / Customers. Legend: Hitachi Strength and Focus]

Page 31: Hitachi Data Systems Hadoop Solution

INTRODUCING HITACHI REFERENCE ARCHITECTURE FOR HADOOP

Pretested and validated for interoperability, performance, and scalability

Flexible − customize to fit application

Pre-validated using Cloudera, leading Hadoop distribution (certification in progress)

Complementary to existing Hitachi platforms for block, file, and object

Seamless management integration with other Hitachi solutions

ENTERPRISE-READY INFRASTRUCTURE FOR HADOOP

[Diagram: Management node, Name Node + Job Tracker, Secondary Name Node, and data nodes (DataNode - HDFS, TaskTracker) connected over a LAN]

Page 32: Hitachi Data Systems Hadoop Solution

REFERENCE ARCHITECTURE: HARDWARE COMPONENTS

Qty | Form factor | Component | Description
1 | 1U | Management node | Hitachi server CR 210H: 2 x quad-core E2600 series; 64GB main memory; 2 x GigE (onboard); 5 x 3.5-inch 3TB NL-SAS 7200 RPM
1 | 2U | HDFS master name node (name node, job tracker) | Hitachi server CR 220S: 2 x quad-core E2600 series; 64GB main memory; 2 x GigE (onboard); 12 x 3.5-inch 3TB NL-SAS 7200 RPM
1 | 2U | Secondary name node | Hitachi server CR 220S: 2 x quad-core E2600 series; 64GB main memory; 2 x GigE (onboard); 12 x 3.5-inch 3TB NL-SAS 7200 RPM
As needed | 2U | Data nodes (data node, task tracker) | Hitachi server CR 220S: 2 x quad-core E2600 series; 64GB main memory; 2 x GigE (onboard); 12 x 3.5-inch 3TB NL-SAS 7200 RPM
2 | 1U or 2U | Ethernet switches (10 GbE network) | Cisco Nexus 5548 (48 x GigE / 10GigE) or Brocade VDX 6720-60 (40 x GigE / 10GigE, form factor = 2U)

Why Compute Rack Servers?

• High density (2U), high processing power (2 CPU sockets), large data storage (12 HDD)
• Redundant power supplies
• Eco-friendly power saving capabilities

[Rack diagram: 42U rack with Switch-1, Switch-2, and 2U CR220S servers with internal HDD]
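
As a rough sizing aid, the drive counts in the hardware component list above can be turned into a capacity estimate. The sketch below is an illustration under stated assumptions, not part of the reference architecture: it assumes that all 12 drives in a data node hold HDFS data, that the replication factor is 3 (the HDFS default), and that a hypothetical 25% of raw space is reserved for temporary and intermediate data. The names `DataNodeSpec`, `usable_capacity_tb`, and `nodes_needed` are invented for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DataNodeSpec:
    drives: int = 12        # from the hardware list: 12 x 3.5-inch drives per data node
    drive_tb: float = 3.0   # from the hardware list: 3TB NL-SAS drives


def raw_capacity_tb(nodes: int, spec: DataNodeSpec = DataNodeSpec()) -> float:
    """Raw disk capacity of the data nodes, in TB."""
    return nodes * spec.drives * spec.drive_tb


def usable_capacity_tb(
    nodes: int,
    spec: DataNodeSpec = DataNodeSpec(),
    replication: int = 3,            # assumption: HDFS default replication factor
    reserve_fraction: float = 0.25,  # assumption: space kept for temporary data
) -> float:
    """Approximate capacity left for unique data after reserve and replication."""
    return raw_capacity_tb(nodes, spec) * (1 - reserve_fraction) / replication


def nodes_needed(target_usable_tb: float, spec: DataNodeSpec = DataNodeSpec()) -> int:
    """Smallest number of data nodes whose usable capacity covers the target."""
    per_node = usable_capacity_tb(1, spec)
    return math.ceil(target_usable_tb / per_node)


if __name__ == "__main__":
    print(raw_capacity_tb(10))     # raw TB across 10 data nodes
    print(usable_capacity_tb(10))  # usable TB under the assumptions above
    print(nodes_needed(100))       # data nodes for 100 TB of unique data
```

Changing `replication` or `reserve_fraction` shows how strongly those two choices drive the node count for a given data volume.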

Page 33: Hitachi Data Systems Hadoop Solution

REFERENCE ARCHITECTURE: SOFTWARE COMPONENTS

Tested Software

Component | Version | Description
Operating System | 6.3 | Redhat or CentOS 64-bit Linux distribution
Hadoop distribution | CDH4 | Cloudera Hadoop distribution
Hadoop management | 4.0.1 | Cloudera Manager
Management framework | n/a | Hitachi Compute Systems Manager

[Diagram: Management node, Name Node + Job Tracker, HA Name Node, and data nodes (DataNode - HDFS, TaskTracker) connected over a LAN]

Reference Architecture White Paper Targeted for June 2013

Page 34: Hitachi Data Systems Hadoop Solution

WHY HITACHI FOR HADOOP INFRASTRUCTURE

Enterprise-ready (RAS) for Hadoop

‒ Less worry about hardware failure, more focus on business value

Seamless management integration with Hitachi solutions

‒ Lower opex

Competitive pricing with commodity hardware

‒ Lower capex

One platform solution for all your data volumes, velocity and types

‒ Lower TCO, faster ROI for your big data initiatives

Page 35: Hitachi Data Systems Hadoop Solution


HITACHI CONSULTING

SAI SAIPRABHU, DIRECTOR, SPECIALIZED SERVICES

ART VANCIL, BIG DATA SENIOR MANAGER

Page 36: Hitachi Data Systems Hadoop Solution

HITACHI CONSULTING

As the global consulting company of Hitachi, Ltd., Hitachi Consulting brings business visions to life through in-depth industry expertise combined with innovative technology solutions and services.

From articulating strategy through deploying and maintaining applications, Hitachi Consulting helps clients quickly realize measurable business value and achieve sustainable ROI.

The Hitachi Consulting client base includes 35 percent of the Fortune 100 and 25 percent of the Fortune Global 100, along with many mid-market leaders. With offices in North America, Europe, the Middle East, and Asia, the company employs more than 5,000 professionals, with delivery centers in India and China for global delivery scale.

Page 37: Hitachi Data Systems Hadoop Solution

WHAT DO WE SEE WITH OUR CLIENTS?

Business Objectives Refinement

Technology Adoption Without Disruption

Data Science Practice Adoption

Business Intelligence Jump Start With Big Data Technologies

Emerging Businesses

Business Intelligence Practice Adoption

Page 38: Hitachi Data Systems Hadoop Solution

DO YOU NEED AN EXECUTIVE SPONSOR?

The Internet has driven most businesses, across almost every industry, to demand better information much faster than ever before.

Examples: Retailers can influence the next shopping visit based on analytics; Amazon can tailor a shopping visit on a variety of dimensions (personalization, price incentives, product combinations, etc.). How will similar dynamics impact your company?

Perhaps your company has not yet started using Hadoop for big data initiatives. Or perhaps you are stuck in "discovery mode," trying to find that golden-nugget big idea from big data. If your company is like mine, you will not be given permission to simply play with Hadoop for months on end.

In most companies, your time spent on a project needs to be backed by someone with a budget who wants to get something done. Let's look at successful methods to secure your big data executive sponsorship.

Page 39: Hitachi Data Systems Hadoop Solution

HOW DO I GET STARTED?

Award-winning luck #1: Your executive brings to you the justification for big data.

Award-winning luck #2: Your subject matter expert and your data scientist pore over the data until they find the “golden nugget” of justification.

If you have no budget for big data, then perhaps you are waiting for a stroke of luck?

Stop waiting, and begin now to collaborate with your business consultant to discover the data value and the “essence” of your big data business opportunity.

Page 40: Hitachi Data Systems Hadoop Solution

THE NITTY-GRITTY DETAILS

Hitachi helps you to choose your big data solution by targeting the message to your sponsor’s role and asking the BIG QUESTIONS

CEO/CSO
• Predict the Future

COO
• Optimize the Business Process

CMO
• Nurture the Customer Relationship

CFO/CTO
• Deliver Faster and Cheaper

Page 41: Hitachi Data Systems Hadoop Solution

FOR EXAMPLE

A high-end disk storage manufacturer collects daily performance data from its customers’ storage devices, but cannot effectively analyze it BECAUSE OF THE VOLUME.

The big questions to ask: If we stored the data in Hadoop, then

• Could we detect operational patterns that predict device failure worldwide?
• Could we anticipate the failure AND suggest a replacement without downtime?
• Could we sell the data analysis back to the customer for a fee?
• Could we reduce the support effort by delivering proactive notifications?
• How much revenue would we gain/costs would we eliminate?

Page 42: Hitachi Data Systems Hadoop Solution

SOLUTION SELECTION FRAMEWORK

The solution discovery and evaluation process is a top-down survey of organizational leadership followed by a prioritization and ranking, based upon business value and organizational priorities.

[Funnel diagram: All Possible Solutions and Purposes → Feasible Solutions → Prioritized Big Data Solution Selection]
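
The prioritization step can be pictured as filtering candidate solutions for feasibility and then ranking them by a weighted score. The sketch below is only one hypothetical way to do this, not the Hitachi Consulting method: the `Candidate` fields, the 1-to-5 scoring scale, the weights, and the example scores are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    name: str
    business_value: int  # hypothetical scale: 1 (low) to 5 (high)
    org_priority: int    # hypothetical scale: 1 (low) to 5 (high)
    feasible: bool       # outcome of the feasibility screen


def score(candidate: Candidate, value_weight: float, priority_weight: float) -> float:
    """Weighted sum of business value and organizational priority."""
    return (value_weight * candidate.business_value
            + priority_weight * candidate.org_priority)


def prioritize(
    candidates: Iterable[Candidate],
    value_weight: float = 0.6,     # assumption: business value counts slightly more
    priority_weight: float = 0.4,  # assumption
) -> List[Candidate]:
    """Drop infeasible candidates, then rank the rest from highest to lowest score."""
    feasible = [c for c in candidates if c.feasible]
    return sorted(
        feasible,
        key=lambda c: score(c, value_weight, priority_weight),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical candidates with made-up scores.
    survey = [
        Candidate("Predict device failure", business_value=5, org_priority=2, feasible=True),
        Candidate("Proactive notifications", business_value=3, org_priority=4, feasible=True),
        Candidate("Sell analysis to customers", business_value=5, org_priority=5, feasible=False),
    ]
    for rank, candidate in enumerate(prioritize(survey), start=1):
        print(rank, candidate.name)
```

In practice the scores would come from the leadership survey described above, and the weights would be agreed with the executive sponsor.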

Page 43: Hitachi Data Systems Hadoop Solution

SPONSOR CONVERSATIONS: ESTABLISHED BUSINESS INTELLIGENCE ENVIRONMENT

Specific use cases that address chosen pain points to be tackled using big data capabilities

Measures that show how the use cases alleviate current pain points

External expertise needed to augment your big data jump start

Action plan to implement prioritized use cases and evaluate larger adoption of big data capabilities

Executive sponsor buy-in

Executive sponsor oversight

Funding

Page 44: Hitachi Data Systems Hadoop Solution

LEVERAGE BIG DATA CAPABILITIES

Extends Existing Data Management Environment

Extend Historical Transactions Availability

Extend Data Staging, Volume Processing and Complex Data Processing

Extend Complex Data Processing

Introduces New Analytic Capabilities

Ability to Process Large Volumes

Flexibility and Complexity Management

Leverage Emerging Capabilities

Page 45: Hitachi Data Systems Hadoop Solution

BIG DATA TECHNOLOGIES: ADOPTION STRATEGY

Protect Existing Investments That Are Already in the Right Place. Introduce Big Data Technologies to Enable New and Evolving Business Needs

[Diagram labels: Existing Transactional Sources; Social Media Sources; Big Data Appliance; Existing Analytic Capabilities; Structured Data Management and Existing Data Management; Batch or Stream; Current Augmentation to Structured Data Management (Limited); Stream and Organize; Sporadic Analytic Capabilities; Big Volume Data Analyses; High Velocity Data Analyses; Unstructured Data Analyses; Enterprise Analytics]

[Numbered steps 1 to 4 in the diagram: Protect Investments as Needed; Streamline as the Environment Matures; Expand as Demand Grows; Introduce New Capabilities; Introduce, Consolidate and Expand New Capabilities]

Page 46: Hitachi Data Systems Hadoop Solution

SPONSOR CONVERSATIONS: EMERGING BUSINESS INTELLIGENCE ENVIRONMENT

Business intelligence competencies needed to attain and sustain competitive edge

Measures that help monitor business operations alignment with business strategies

External expertise needed to augment your big data and business intelligence jump start

Action plan to implement and evaluate larger adoption of big data business intelligence capabilities

Executive sponsor buy-in

Executive sponsor oversight

Funding

Page 48: Hitachi Data Systems Hadoop Solution

QUESTIONS AND DISCUSSION

Page 49: Hitachi Data Systems Hadoop Solution

UPCOMING WEBTECHS

WebTechs

‒ Take SAP HANA From Proof of Value Through Production Deployment, June 20, 9 a.m. PT, noon ET

‒ A Cloud You Can Trust–Improve Datacenter Efficiency and Agility, June 26, 9 a.m. PT, noon ET

Check www.hds.com/webtech for

‒ Links to the recording, the presentation, and Q&A (available next week)
‒ Schedule and registration for upcoming WebTech sessions

Page 50: Hitachi Data Systems Hadoop Solution

THANK YOU

