
Get Data, Build Apps and Analyze Data Using IBM Bluemix Data and Analytics...

Date post: 20-Jun-2018
Upload: hoangcong
Get Data, Build Apps and Analyze Data Using IBM Bluemix Data and Analytics (Session 6748)
Eric Cattoir - @CattoirEric
Yves Debeer - @yvesdebeer
Bert Waltniel - @BertWaltniel
© 2015 IBM Corporation

IBM Bluemix: The Digital Innovation Platform

Innovation is the new currency

“Two guys in a Starbucks can have access to the same computing power as a Fortune 500 company.”

Jim Deters, Founder, Galvanize

Anatomy of a Disruptive Idea


To really disrupt, a business should focus on building differentiation and rent the rest

Devs can quickly compose apps with new APIs and digital services to add features and increase engagement in areas like:

• Analytics, cognition
• Mobile, location
• Internet of Things
• Social engagement
• Identity
• Reviews
• Travel
• Messaging
• …
• His/her company’s private APIs and services

Bluemix works for disruptors.

Bluemix started as a public PaaS
Bluemix started with a major focus on developer productivity in the public cloud.

[Figure: two stacks on IBM SoftLayer, Infrastructure as a Service and Platform as a Service. Each shows the layers Code, Data, Runtime, Middleware, OS, Virtualization, Servers, Storage, and Networking, marked as either Customer Managed or Service Provider Managed.]

We listened. Now we’re evolving to become even more flexible.
Capabilities in Bluemix now span PaaS and IaaS and can be delivered as a public, dedicated, or on-premises* implementation.

[Figure: Infrastructure as a Service and Platform as a Service stacks on IBM SoftLayer. Each shows the layers Code, Data, Runtime, Middleware, OS, Virtualization, Servers, Storage, and Networking, marked as either Customer Managed or Service Provider Managed.]

Built on open technologies:

How does Bluemix work?
Bluemix is underpinned by three key open compute technologies: Cloud Foundry, Docker, and OpenStack. It extends each of these with a growing number of services, robust DevOps tooling, integration capabilities, and a seamless developer experience.

Flexible Compute Options to Run Apps / Services
• Instant Runtimes
• Containers
• Virtual Machines

Platform Deployment Options that Meet Your Workload Requirements
• Bluemix Public
• Bluemix Dedicated
• Bluemix Local*

Catalog of Services that Extend Apps’ Functionality
• Web
• Data
• Mobile
• Analytics
• Cognitive
• IoT
• Security
• Yours

Also shown: DevOps Tooling, Integration and API Mgmt, Your Own Hosted Apps / Services
Powered by IBM SoftLayer / In Your Data Center
Always focused on what’s next

*Bluemix Local coming Summer 2015

Bluemix is built on IBM SoftLayer

[Map of SoftLayer Data Centers and Bluemix Public Locations; labeled: Dallas, London (now)]

A different kind of data center
• Every location designed, built, and operated to the same standardized, “pod” based spec
• 24/7 on-site security and rigorous controls
• Expanding to 40 data centers worldwide

Global network of networks
• Public, private, and management networks all separate
• More than 2,000 Gbps between data centers and network points of presence (PoPs)
• Unmetered inbound public bandwidth and fully unmetered bandwidth between data centers

Entirely automated
• SoftLayer API controls everything: more than 3,000 documented methods and 180 distinct services
• Bare metal and virtualized servers in the same platform

The highest performing cloud infrastructure available.

Sign up in minutes. Pay for what you use.

Cloud based pricing models to serve developer needs.

Friction free adoption
• 30 day trial (no credit card required): designed to allow testing of an entire application on the platform
• Free tier for every service: encourages experimentation with new services for applications already running on Bluemix

Multiple Commitment Models
• Pay-as-you-go: optimized for flexibility, no term commitment
• Subscription: term based, optimized for cost, discounted from pay-as-you-go rates

Self Service
• Zero to coding in less than 5 minutes
• Credit card over the web in many countries, or through your IBM rep

Let’s see it! *click*

Business Agility through Data

Traditional Data Analytics
• Requirements Based
• Top-Down Design
• Integration and Reuse
• Competence Centers
• Better Decisions
• Enterprise Focus

Agile Data Analytics
• Opportunity-Oriented
• Experimentation
• Throwaway
• Hackathons
• Business Innovation
• Functional Focus

But How?


Elastic Provisioning

Pay-as-You-Go

Manage High Volume External Data Sources

Self-Service Through a Browser

SQL / NOSQL – Unstructured Data

Access Data Anywhere, Anytime

Leverage Current Cloud Apps

Agility and Elasticity through Cloud

Open for Data
A comprehensive portfolio of open source data services

IBM Data and Analytics Services

Work with Cloud Data Services in Bluemix

[Figure: Cloud Data Services in Bluemix arranged in four stages: Get Data, Put Data to Work, Analyze Data, and Interact / Gain Insight / Visualize, with an Iterate loop. Services shown: DataWorks, Apache Kafka, Streams, Cloudant, Redis, Postgres, DB2, MongoDB, RethinkDB, Object Storage, Graph DB, dashDB, Notebooks, Predictive Analytics, and Your Own Data & Analytics Applications. Internal & External Data Sources shown: Sensors, Internet, Social Media, Customer Conversations, Back Office Applications.]

Example: Health Management Platform

[Figure: the stages Get Data, Put Data to Work, Analyze Data, and Interact / Gain Insight / Visualize applied to a health management platform. Internal & External Data Sources shown: Clinical & Wearable Device Sensors; Fitbit, Jawbone Device Data; Lab Results; Patient Conversations; Health Results from RDBMS. Services shown: DataWorks, Streams, Cloudant, dashDB, Notebooks.]

Get Data from Own or Public Data

• Import Data into data services, e.g. dashDB, Cloudant, Mongo, …, through their respective load tools
• Create Connections to diverse data sources, on-premises or in the cloud, for use in analytics, e.g. Notebooks
• Load Data from diverse sources into cloud data services in context, powered by DataWorks

Get Data from Bluemix’s Analytics Exchange

• Explore available data sets
• Find interesting data
• Access data from Bluemix apps
• Analyze Data in
  ▪ Apache Spark & Notebooks
  ▪ dashDB
  ▪ Watson Analytics
  ▪ …

Build Applications using Bluemix Data Services

Connect your applications to use Data in Bluemix
• Select the Database Service Instance you want to use and pick your plan (upper right)
• Get service credentials to use in your code (lower right)
• Use the APIs, passing the credentials you obtained
  ▪ from Node.js, Liberty, or other apps on Bluemix
  ▪ or from apps running on other platforms or devices
• Manage from the context of Bluemix under Bluemix login
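
The "get service credentials to use in your code" step can be sketched as follows. This is a minimal illustration, not the deck's own code: it assumes the app runs on Cloud Foundry, which passes the credentials of bound services to the app as JSON in the `VCAP_SERVICES` environment variable. The object name `ServiceCredentials`, the helper `stringField`, and the sample JSON (including its user name and URL) are hypothetical, and the regex lookup only handles plain string fields; a real app would use a JSON library.

```scala
import scala.util.matching.Regex

object ServiceCredentials {
  // Hypothetical helper: returns the first string-valued occurrence of `field`
  // in a JSON text. Good enough for a sketch, not a general JSON parser.
  def stringField(json: String, field: String): Option[String] = {
    val pattern = ("\"" + Regex.quote(field) + "\"\\s*:\\s*\"([^\"]*)\"").r
    pattern.findFirstMatchIn(json).map(_.group(1))
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample so the sketch also runs outside the platform.
    val sample =
      """{"cloudantNoSQLDB":[{"name":"my-cloudant","credentials":{"username":"demo-user","password":"demo-pass","url":"https://demo-user.example.test"}}]}"""

    // Use the real variable when it is set, otherwise the sample above.
    val vcap = sys.env.getOrElse("VCAP_SERVICES", sample)

    // The URL (or user name and password) is what the app passes to the service API.
    println(stringField(vcap, "url").getOrElse("no url found"))
  }
}
```

Apps running on other platforms or devices do not get this variable, so they would need the same credentials supplied through their own configuration.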


Analyze Data with IBM Analytics for Spark

• Go to Work with Data -> Analytics and create a new service instance
• Interactive Notebooks
  ▪ Use Python, Scala with Spark
  ▪ Associate an Object Storage for accessing and uploading data
  ▪ Connect to Data Sources, e.g. Files, Cloudant, dashDB, on-premises DBs, …
• Spark Submit
  ▪ Download Apache Spark Submit
  ▪ Develop your own Spark Jobs
  ▪ Run and monitor your Jobs

Example: New York Accidents Analytics

• NY City Public Data
  ▪ Accident Data from NYPD
  ▪ Road Condition Data
  ▪ Weather History Data
• We created a set of Notebooks to
  ▪ Cleanse Data to get it in proper shape for visualization and analytics
  ▪ Visualize Data to better understand its content and structure
  ▪ Analyze Data to identify patterns and correlations in the data
  ▪ Predict future incident likelihoods from data
  ▪ Visualize Insights from descriptive and predictive analytics


Spark Technical Discussion


Resilient Distributed Datasets (RDDs)
• A collection of elements that Spark works on in parallel.
• May be kept in memory or on disk.
• Applications can also explicitly tell Spark to cache an RDD, which is great for iterative algorithms.
• An RDD contains the “raw data”, plus the function to compute it.
• Fault-tolerance: if any partition of an RDD is lost, it will automatically be recomputed using the transformations that originally created it.

[Figures: RDD built from a Java collection; RDD built from an external dataset (local FS, HDFS, HBase, …)]
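
The idea that an RDD carries "the function to compute it" can be illustrated with a toy model in plain Scala. This is a sketch of the concept only and not how Spark implements RDDs: the names `LineageSketch`, `Dataset`, and `fromCollection` are made up for this illustration, and there are no partitions or cluster involved. Each `Dataset` stores a recipe rather than results, so a result can always be rebuilt from the source data and the recorded steps.

```scala
object LineageSketch {
  // Toy stand-in for an RDD: it stores no results, only the recipe to produce them.
  final class Dataset[A](val compute: () => Vector[A]) {
    // Each step wraps the previous recipe in a new one; nothing runs yet.
    def map[B](f: A => B): Dataset[B] = new Dataset(() => compute().map(f))
    def filter(p: A => Boolean): Dataset[A] = new Dataset(() => compute().filter(p))
    // Runs the whole recipe from the source data.
    def collect(): Vector[A] = compute()
  }

  // Counterpart of "RDD built from a collection" in this toy model.
  def fromCollection[A](data: Seq[A]): Dataset[A] = new Dataset(() => data.toVector)

  def main(args: Array[String]): Unit = {
    val source = fromCollection(1 to 6)
    val evenSquares = source.filter(_ % 2 == 0).map(n => n * n)

    val first = evenSquares.collect()  // computed on demand
    val again = evenSquares.collect()  // rebuilt from the same recipe, as after a loss
    println(first)
    println(first == again)
  }
}
```

Because the recipe is deterministic, running it a second time gives the same elements, which is what makes recomputing a lost partition safe.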

Working with RDDs: Transformations and Actions
• Transformations are lazy: they do not compute their results right away. Instead, they are added to the RDD’s list of pending operations, which lets Spark
  ▪ optimize the required calculations
  ▪ recover from lost data partitions
• Examples: map(func), filter(func), union(), join(), groupByKey()
• Actions are executed immediately, and trigger execution of all prior transformations on an RDD
• Examples: reduce(func), collect(), saveAsSequenceFile()
• func is a Java/Scala/Python function that you write
• Call persist() on an RDD if you plan to reuse it later
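
The difference between lazy transformations, actions, and persisting can be observed without a cluster. The sketch below is not Spark code: it assumes a plain Scala collection view as a stand-in for an RDD, where `view.map` only records the step, `sum` plays the role of an action, and `toVector` plays the role of persist(). The names `LazyVersusAction` and `trace` are made up for this illustration.

```scala
object LazyVersusAction {
  // Counts how often the mapping function has run at four points in time:
  // after map, after one sum, after a second sum, and after materializing.
  def trace(numbers: Vector[Int]): List[Int] = {
    var calls = 0
    val doubled = numbers.view.map { n => calls += 1; n * 2 }
    val afterMap = calls          // the "transformation" has not run the function

    doubled.sum                   // "action": forces the pending map
    val afterSum = calls

    doubled.sum                   // nothing was kept, so the map runs again
    val afterSecondSum = calls

    val kept = doubled.toVector   // analogue of persist(): materialize once
    kept.sum; kept.sum            // reuse without running the map again
    List(afterMap, afterSum, afterSecondSum, calls)
  }

  def main(args: Array[String]): Unit =
    println(trace(Vector(1, 2, 3)))
}
```

The second `sum` repeating all the work is the cost that persist() avoids when an RDD is reused.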

Execution Model

[Figure: a Driver Program and three Executors; labels: spark-submit, notebook]

Spark in Action: Word Count in Scala

import org.apache.spark.{SparkConf, SparkContext}

// Lowercase the text, strip punctuation, and split on whitespace.
// The filter drops the empty token that split yields for blank or space-led lines.
def tokenize(text: String): Array[String] = {
  text.toLowerCase.replaceAll("[^a-zA-Z0-9\\s]", "").split("\\s+").filter(_.nonEmpty)
}

val conf = new SparkConf().setAppName("WordCount")
val sc = new SparkContext(conf)

// Each element of this RDD is one line of the document.
val file = sc.textFile("swift://fileContainer.spark/input.txt")

// Transformations (lazy): nothing is computed yet.
val words = file.flatMap(line => tokenize(line))
val wordMap = words.map(x => (x, 1))
val wordCounts = wordMap.reduceByKey(_ + _)

// Action: triggers the computation and writes the result.
wordCounts.saveAsTextFile("swift://fileContainer.spark/output.txt")

// Adapted from Word Count example on http://spark-project.org/examples/

DataFrames


Combine Services: Analytics of Twitter Data

[Figure: components Reader (Node.js), Topic (Kafka), Spark Streaming (Notebook), Watson Tone Analyzer, Results (Cloudant DB), REST API (Node.js), Insights App (Node.js)]

• Message Hub provides an elastic, high-velocity message queue
• Node.js Reader receives the Twitter data stream and writes to the Topic
• Algorithms in Scala detect Tweets of interest
• Watson enriches Tweets with tone & sentiment info
• Cloudant stores insight data with HADR at scale
• Insight App lets users explore and interact with results
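
The detection and enrichment steps of this pipeline can be sketched as plain Scala functions. This is an illustration under assumptions and not the code behind the demo: the types `Tweet` and `Insight`, the functions `ofInterest`, `enrich`, and `run`, and the sample data are all hypothetical. The tone lookup is passed in as a function so that no Watson API call is invented here, and the returned list stands in for what would be stored in Cloudant.

```scala
object TweetPipelineSketch {
  final case class Tweet(user: String, text: String)
  final case class Insight(tweet: Tweet, tone: String)

  // "Detect Tweets of interest": here simply a case-insensitive hashtag match.
  def ofInterest(tweets: Seq[Tweet], hashtag: String): Seq[Tweet] =
    tweets.filter(_.text.toLowerCase.contains(hashtag.toLowerCase))

  // "Enrich with tone": toneOf is a placeholder for the call to the tone service.
  def enrich(tweet: Tweet, toneOf: String => String): Insight =
    Insight(tweet, toneOf(tweet.text))

  // Filter first, then enrich only the tweets that passed the filter.
  def run(tweets: Seq[Tweet], hashtag: String, toneOf: String => String): Seq[Insight] =
    ofInterest(tweets, hashtag).map(t => enrich(t, toneOf))

  def main(args: Array[String]): Unit = {
    // Made-up sample data and a trivial stand-in for the tone service.
    val tweets = Seq(Tweet("ann", "Trying out #Bluemix today!"), Tweet("bob", "Lunch time"))
    val stubTone: String => String = text => if (text.contains("!")) "excited" else "neutral"
    run(tweets, "#bluemix", stubTone).foreach(println)
  }
}
```

Filtering before enrichment keeps the number of calls to the tone service proportional to the tweets of interest rather than to the full stream.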

Streaming Analytics using multiple Data Sources

[Figure: components Reader (Node.js) and Topic (Kafka) shown several times, Stock Quotes, Spark Streaming (Notebook), Watson Tone Analyzer, Results (Cloudant DB), REST API (Node.js), Insight App (Node.js), Alert Gen (Node.js), Predictive Analytics, Push Service]

Conclusion

• You can achieve greater Business Agility and faster Insights through Cloud based Innovation without upfront investment

• IBM Cloud Data Services provide open, cloud based data and analytics services that enable fast cloud based innovation

• Bluemix - Data & Analytics at https://bluemix.net/data features and integrates cloud data services, enabling you to
  ▪ Get Data from your own or public data sources
  ▪ Build Applications using cloud data & analytics services
  ▪ Analyze Data with Spark & Notebooks at https://bluemix.net/data/analytics, Hadoop, dashDB, …
• Combine and Integrate cloud data & analytics services with each other, as well as with other Bluemix services, e.g. through the new Message Hub based on Apache Kafka

Combine Services: Analytics of Twitter Data

• https://developer.ibm.com/clouddataservices/sentiment-analysis-of-twitter-hashtags/

Thank You

