Post on 13-Jan-2020

TIBCO Analytics Meetup

TIBCO Data Science Team

January 22nd 2019

The following information is confidential information of TIBCO Software Inc. Use, duplication, transmission, or republication for any purpose without the prior written consent of TIBCO is expressly prohibited.

CONFIDENTIALITY

© Copyright 2000-2019 TIBCO Software Inc.

This document (including, without limitation, any product roadmap or statement of direction data) illustrates the planned testing, release and availability dates for TIBCO products and services. This document is provided for informational purposes only and its contents are subject to change without notice. TIBCO makes no warranties, express or implied, in or relating to this document or any information in it, including, without limitation, that this document, or any information in it, is error-free or meets any conditions of merchantability or fitness for a particular purpose. This document may not be reproduced or transmitted in any form or by any means without our prior written permission.

The material provided is for informational purposes only, and should not be relied on in making a purchasing decision. The information is not a commitment, promise or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for our products remains at our sole discretion.

During the course of this presentation TIBCO or its representatives may make forward-looking statements regarding future events, TIBCO’s future results or our future financial performance. These statements are based on management’s current expectations. Although we believe that the expectations reflected in the forward-looking statements contained in this presentation are reasonable, these expectations or any of the forward-looking statements could prove to be incorrect and actual results or financial performance could differ materially from those stated herein. TIBCO does not undertake to update any forward-looking statement that may be made from time to time or on its behalf.

DISCLAIMER

TIBCO Analytics Meetup – opportunities to learn and network

Meetup group keeps growing and now has 670+ members!

Join this TIBCO Analytics Meetup group and receive automatic invites to future TIBCO Analytics Meetups

http://www.meetup.com/TIBCOSpotfireOnlineusergroup/

https://bit.ly/2Jm4iOn

• Welcome and TIBCO update by Michael O’Connell

• New Statistica Data Function in Spotfire by Tomáš Jurczyk

• Anomaly Detection with Deep Learning Autoencoder by David Katz

• TIBCO Community update by Heleen Snelting

• Live Q&A

Please submit your questions at any time via the Q&A option

We will answer them at the end or get back to you via email

Agenda

Connected Intelligence Portfolio Analytics & Data Science

Michael O’Connell, Chief Analytics Officer

Value = Find + Act on Critical Business Moments

Critical business moments occur in every facet of enterprise operations.

They drive competitive differentiation, customer satisfaction and business success.

smart cross-sell offers

predict impending equipment failure

anomaly detection and risk management

optimize routes

anticipate and handle disruptions

optimize pricing

prevent fraud

deliver proactive customer service

TIBCO Connected Intelligence

Data Visualization

Data Science

Data Management

Integration and API Management

Messaging and Events Processing

Digital Process Automation

Portfolio Approach

Best-in-class data, analytics & integration

Tightly integrated but loosely coupled

Available anywhere and everywhere

Closed loop, continuous learning apps

Visual Analytics
Data Virtualization
Data Science
Master Data
Streaming
Low-Code Dev
Data Catalog
Edge Integration

On-Prem, Cloud, Hybrid

Accelerators

Cloud Starters

Applications

Applications: Cloud Starters, Accelerators
Business-focused TIBCO Apps using Data Science
Configurable visuals, data science, low code component apps

Anomaly Detection Risk Management Customer Engagement

FUNCTION

Modeling
+ Visual composition
+ Notebook
+ Native ML/DL & OS integrations

Operations Deployment
+ Model lifecycle mgt
+ Visual analytics & BI
+ Batch automation
+ Real-time event processing

Data Ingest / Data Prep
+ Distributed compute
+ Dedicated host
+ Feature Engineering

Business Applications
+ Predictive maintenance
+ Engineering/IoT/IIoT
+ Customer Analytics
+ Supply Chain ...

BI, e.g. medics for epidemic monitoring
Engineer, e.g. yield optimization
Quant, e.g. trading desk reconciliation

USER

Data Engineer, Data Scientist, Citizen Data Scientist
Data Scientist, Citizen Data Scientist
Analytics Operations, IT / Software Engineer
Business User, IT / Administration

TIBCO Data Science

Comprehensive Data Access, Framework Integration
Native Cloud Authoring / Integration || AWS/SageMaker, GCP/TensorFlow, MSFT Azure Services || SAS, MatLab

TIBCO Data Science – Author
Visual | Notebook | Code | Automation | Recommendations

TIBCO Data Science (formerly Alpine) Statistica Unified Author

Licenses
Ecosystem, Core, Distributed

TIBCO Data Science – Operations
Model Management | Collaboration | Governance | Automation

Distributed compute, Model Management
Governance, Job Scheduler
APIs, Portable Format
Collaboration, Project Mgt

Audit Trail DB, Model Management
Governance, Job Scheduler
APIs, Portable Format
Scoring, Monitoring & Alerting

Federated Data Science Services
TIBCO R (TERR) Service, 3rd Party Engines (SAS, Matlab, OS R, Python)
Job Management, APIs

Unified Services Licenses – based on capacity

Swap & Expand as needs evolve

TIBCO Data Science – formerly Alpine

Connected Teams
• Collaborate on data science projects with business

Scalable Algorithms
• Transform and model across data sources without moving data
• In-database and in-lake data prep, analytics and machine learning
• Python Notebooks, PySpark
• Spark Auto-Tuning

Web Visual & Notebook Composition
• Rapid deployment in Cloud and on-premise
• Amazon, Microsoft Azure
• EMR, Amazon Redshift / HDInsights, Azure SQL

TIBCO Data Science – Statistica

Statistica – Data Science Workbench

• Data ingest, blending, in-db and in-lake processing

• 1000’s of stats, machine and deep learning

• Supervised Learning – models, ensembles

• Unsupervised Learning – anomaly detection

• Marketplaces – Azure ML, Algorithmia, Apervita

• Open source – R, Python, C#, H2O, CNTK Deep NN

Model & Rule Lifecycle Management

• Create workspace, manage, version, deploy, embed

• Repeatable, GXP validation, audit, version control

Models in Operations
Connect ML pipelines to business processes and applications

Impact daily decision making
• Embed predictive insights in business applications
• Visualize analysis results in Spotfire and provide access across the organization
• Create self-service web interfaces

Deploy models to production
• Push real-time engines to AWS, Azure, or Cloud Foundry with PFA model formats
• Connect to streaming data (e.g. StreamBase) with PMML model exports
• Schedule batch runs of Workflows, Python Notebooks and SQL files

SAP HANA, Java, Teradata

Models in Operations Streaming

TIBCO Data Science – AMS – StreamBase
TIBCO Data Science – PFA – StreamBase / TIBCO Cloud™ Live Apps

AWS and TIBCO Data Science

TIBCO Anomaly Detection

ML

ETL

Spotfire DS

In-DB ETL

In-DB ML

Data Science + Visual Analytics

TIBCO Data Science : REST API to Algos

Spotfire Data Function

Data : never moves

Visual Analytics

Spotfire X Analytics Experience

Augmented: Search & AI-Powered Insights
Start in seconds, instant insights

Automated: Automagical Dataflows
Author & audit with automatically recorded dataflow steps

Agile: Reimagined User Interface
Agile exploration made even easier

Accelerated: Real-time Insights
Real-time awareness and action

Spotfire Visual Analytics Apps

Spotfire X: AI Recommendations
Automated, Augmented insight discovery & display

Variable Relationships Algorithm

• User selects target variable

• AI algorithm discovers variable relationships to target

• Spotfire displays in order of strength - best practices graphics

• 4 clicks to brush-linked dashboard

Spotfire X: NLQ Search
Augmented NLQ search & display

Search = NLQ

• User searches (with text input) multiple data tables + relationships

• NLQ displays appropriate (best-practice) graphs

• Brush-linked dashboard constructed entirely from chat interaction

Data Mashup & Data Wrangling
Auto-magical Dataflows

Simple, powerful expressions & functions

• Author (point-click and code) with automatically recorded dataflow steps

• Edit from data canvas – including upstream

Automatic data lineage

Edit Transforms Upstream

Predictive & Machine Learning

Automation of Data Science and ML for business users

• Inbuilt menus: for regression, classification, trees, cluster analysis, forecast

• Data functions – running models from R, Python, Statistica

Many vertical specific apps

• Templates and data functions

• TIBCO Community

Spotfire Data Function

Spotfire Expression

Spotfire X: Data Streams
Real-Time Awareness and Action

Real Time Data Visualization
• User selects time window
• Data Streams shows live-update visualizations
• Calculations, e.g. forecast, available on live viz

Users and Data
• Many real-time data sources supported
• Simple, unified data source connector
• Brush-linked – like all Spotfire marking

Spotfire Differentiation

Visual Analytics Apps
• Brush-Linked Analytic Apps
• Data Mashup / Data Wrangling
• AI-Powered Insights + NLQ (*New* Spotfire X)
• Real-Time Data Streams (*New* Spotfire X)
• One click Web Deployment

Geo & Predictive Analytics
• Geo-Analytics
• Predictive Analytics and Machine Learning

Enterprise Class
• Server-side Scalability, Governance, Security
• Configurability and APIs

Visual & Geo Analytics, Data Science, Data Streams, Data Wrangling

Anomaly Detection

MANUFACTURING: Anomaly Detection

Applications of Anomaly Detection

IoT & Engineering
• Formula One
• Energy - Production Surveillance, Drilling Optimization
• Predictive Maintenance (PdM, CbM)
• Manufacturing - Yield Optimization

Financial Services
• Trade Surveillance
• Fraud Detection

Healthcare & Pharmaceutical
• Patient risk – cardiac arrest, sepsis, surgery infection

Customer Analytics
• Churn Prevention
• Cross-Sell, Up-Sell

Key Issue – Understand Variability

Anomaly Detection

Techniques

Deep Learning dimension reduction – non-linear
• Reconstruction error

Cluster Analysis / PCA
• Distance to closest centroid

Features show root cause
Models can be used for scoring new event stream
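The "distance to closest centroid" technique can be sketched in a few lines of plain Python. This is only an illustration, not the TIBCO template: the centroids are assumed to come from an already fitted cluster model, and the names `score_event` and `threshold`, the threshold value, and the sample events are all hypothetical.

```python
import math

def nearest_centroid(row, centroids):
    """Return (index, distance) of the centroid closest to row."""
    dists = [math.dist(row, c) for c in centroids]
    k = min(range(len(centroids)), key=dists.__getitem__)
    return k, dists[k]

def score_event(row, centroids, threshold):
    """Score one new event against fixed centroids.

    The per-feature squared deviation from the nearest centroid hints at
    which feature drives the anomaly (a rough root-cause indicator).
    """
    k, d = nearest_centroid(row, centroids)
    contrib = [(x - c) ** 2 for x, c in zip(row, centroids[k])]
    top_feature = max(range(len(row)), key=contrib.__getitem__)
    return {"cluster": k, "distance": d,
            "is_anomaly": d > threshold, "top_feature": top_feature}

# Hypothetical centroids (e.g. from k-means on historical data) and new events.
centroids = [[0.0, 0.0], [10.0, 10.0]]
events = [[0.5, -0.5], [9.0, 11.0], [10.0, 4.0]]
results = [score_event(e, centroids, threshold=3.0) for e in events]
for event, result in zip(events, results):
    print(event, result)
```

Because the centroids stay fixed, the same function can score each event as it arrives on a stream. In practice the threshold would be chosen from the distance distribution of known-good data rather than hard-coded.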

TIBCO Templates, Accelerators

Autoencoder
• Spotfire and TIBCO R (TERR) Template & Data Function*
• TIBCO Data Science – Teams

Cluster Analysis
• TIBCO Data Science – Statistica*

Data Science / TIBCO Combinations
• Risk Management Accelerator

• High-Tech Manufacturing Accelerator

* Demoing Today

Anomaly Detection Action: TIBCO + F1

Tomáš Jurczyk, Data Scientist

Statistica Data Function in Spotfire - Demo

Statistica Data function for Spotfire

https://community.tibco.com/wiki/statistica-data-function

Statistica

Spotfire

Objective: Demonstrate - via a build-from-scratch example - how this Spotfire-Statistica integration empowers Analysts and Citizen Data Scientists

Steps of demo:
1. Build a workspace in Statistica
2. Call this Statistica workspace as a data function in Spotfire on a different data source
3. Build visualisations in Spotfire based on the new information about assigned clusters and additional outputs
4. Optional: Add action controls to run the data function according to the user’s choice of parameters

Anomaly Detection with Deep Learning Autoencoder

David Katz, Principal Consultant

Topics to cover

Deep Learning Autoencoder
• Autoencoders and Anomaly Detection
• Software Tools

Demo

Notes on Setup

Autoencoders and Anomaly Detection

• Create an identity transformation with constraints

• Analogy to Principal Components – but much more flexible/accurate.

• Anomalies – the output is the reconstructed input, but it does not fully match the original input => Reconstruction Error

• Reconstruction Error:
  • Overall
  • By component
  • By sample
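To make the three views of reconstruction error concrete, here is a minimal sketch that uses the principal-components analogy from the slide rather than a real autoencoder. A single principal component acts as the constrained "identity transformation": it is fitted on normal data, and new rows are reconstructed from it. The function names and the toy sensor data are assumptions made for illustration. A deep autoencoder would replace `fit_first_component` and `reconstruct` with a non-linear network, but the error bookkeeping would look the same.

```python
import math

def fit_first_component(rows, iterations=50):
    """Mean and first principal direction via power iteration.

    Assumes the start vector is not orthogonal to the leading direction.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def reconstruct(row, mean, v):
    """Encode a row to one number (the bottleneck), then decode it."""
    centered = [x - m for x, m in zip(row, mean)]
    code = sum(c * vi for c, vi in zip(centered, v))
    return [m + code * vi for m, vi in zip(mean, v)]

def reconstruction_errors(rows, mean, v):
    """Mean squared reconstruction error: overall, by component, by sample."""
    sq = [[(x - xh) ** 2 for x, xh in zip(r, reconstruct(r, mean, v))]
          for r in rows]
    by_sample = [sum(e) / len(e) for e in sq]
    by_component = [sum(col) / len(col) for col in zip(*sq)]
    overall = sum(by_sample) / len(by_sample)
    return overall, by_component, by_sample

# Toy data: three sensors that normally move together.
train = [[float(t)] * 3 for t in range(1, 6)]
new = [[2.0, 2.0, 2.0], [4.0, 4.0, 4.0], [3.0, 3.0, 12.0]]

mean, v = fit_first_component(train)
overall, by_component, by_sample = reconstruction_errors(new, mean, v)
print("overall:", overall)
print("by component:", by_component)
print("by sample:", by_sample)
```

In this toy case the third row breaks the pattern, so it carries the largest per-sample error, and the third sensor carries the largest per-component error. Note that some error is smeared onto the other sensors, which is why per-component errors are a hint at root cause rather than proof.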

Deep Learning Software

H2O DeepLearning
• Simple structure of networks – just specify number of fully-connected layers (and optionally dropout)
• Settings for sparse data can outperform GPU
• H2O Deep Water Project
  • uses GPU but no longer being developed
  • H2O recommends Keras for new projects

Keras
• Front end for TensorFlow, CNTK, Theano, MXNet
• Specify complex network topologies
• Use different types of layers – CNN, RNN, …
• Can leverage GPU

TIBCO Interfaces to Deep Learning Software

Time Based Multivariate data in Spotfire
• TIBCO Community Exchange Template using R/TERR data functions with H2O – available now – showing in this presentation
• Python data functions with TensorFlow
• Without time based features – available now
• With time based features – coming soon

Time Based Multivariate data in TIBCO Data Science
• Available now on AWS Marketplace using TensorFlow / SageMaker
• Runs on Clusters
• Some post-processing features shown today not yet integrated

TIBCO Community page on Anomaly Detection: https://bit.ly/2SVUkY6

Industrial Plant: Raw Time series Data

Demo

Case Study - Manufacturing

Validation Error has clear minimum

Model Configuration & Evaluation

Note Problems in Convergence here.
Minimum Error looks like a random variation

Build & Evaluate Model
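The contrast between a validation error with a clear minimum and one whose minimum looks like random variation can be illustrated with a generic early-stopping rule. This is a sketch under stated assumptions, not the logic of the demo template: the function name, the `patience` and `min_delta` parameters, and both error curves are made up for illustration.

```python
def best_epoch_with_patience(val_errors, patience=3, min_delta=0.0):
    """Return (epoch, error) of the best validation error seen before stopping.

    An epoch counts as an improvement only if it beats the best error so far
    by more than min_delta. Training stops once `patience` epochs pass
    without such an improvement.
    """
    best_epoch, best_err = 0, float("inf")
    for epoch, err in enumerate(val_errors):
        if err < best_err - min_delta:
            best_epoch, best_err = epoch, err
        elif epoch - best_epoch >= patience:
            break
    return best_epoch, best_err

# Hypothetical curve with a clear minimum at epoch 3.
clear_curve = [1.0, 0.6, 0.4, 0.3, 0.35, 0.45, 0.6, 0.2]
# Hypothetical curve that only fluctuates; its minimum is within the noise.
noisy_curve = [0.52, 0.49, 0.51, 0.48, 0.50, 0.53, 0.49]

clear_result = best_epoch_with_patience(clear_curve)
noisy_result = best_epoch_with_patience(noisy_curve, min_delta=0.05)
print("clear curve:", clear_result)
print("noisy curve:", noisy_result)
```

With a non-zero `min_delta`, the small dips in the noisy curve are not treated as progress, so the rule reports that nothing improved on the first epoch. That is one simple signal that the model has not converged and the data deserves a closer look.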

Convergence Prevented by Severe Outlier in Validation Sample

Another way to spot these outliers – excessive variance for these variables
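A minimal sketch of this check, using only the Python standard library: compare each variable's variance with the typical (median) variance, then mark the rows responsible in any variable that stands out. The function names, the `ratio` and `k` cut-offs, and the sample readings are hypothetical, and comparing raw variances only makes sense when the variables are on comparable scales.

```python
import statistics

def flag_high_variance_columns(columns, ratio=10.0):
    """Names of variables whose variance exceeds ratio x the median variance."""
    variances = {name: statistics.pvariance(vals)
                 for name, vals in columns.items()}
    typical = statistics.median(variances.values())
    return [name for name, var in variances.items() if var > ratio * typical]

def rows_to_omit(values, k=5.0):
    """Row indices further than k median absolute deviations from the median."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    return [i for i, v in enumerate(values) if abs(v - med) > k * mad]

# Hypothetical plant readings; one pressure value is a severe outlier.
columns = {
    "temp": [20.0, 21.0, 19.0, 20.0, 22.0, 21.0],
    "pressure": [5.0, 5.2, 4.9, 5.1, 500.0, 5.0],
    "flow": [1.0, 1.2, 0.9, 1.1, 1.0, 1.3],
}

suspect = flag_high_variance_columns(columns)
omit = sorted({i for name in suspect for i in rows_to_omit(columns[name])})
print("suspect variables:", suspect)
print("rows to omit:", omit)
```

The median-based row check is used here because a single extreme value inflates the mean and standard deviation it would otherwise be judged against.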

Severe Outliers Can Cause Failure to Converge
Especially in Validation Sample
Here we Mark Rows to Omit from Analysis

Without outlier points, we get good convergence:

TIBCO Community Update

Heleen Snelting, Director Data Science

TIBCO Community: the platform for our users! community.tibco.com

Statistica and Python Data Functions - Spotfire

https://community.tibco.com/wiki/statistica-data-function

Statistica

Spotfire

https://bit.ly/2EPx4Z8

Spotfire Data Function Tips & Tricks section

https://bit.ly/2CQAI2B

Statistica Workspace - A Graphical UI

https://bit.ly/2Ua1M2b

Anomaly Detection and Autoencoder ML

https://bit.ly/2Walf4G

Video: https://youtu.be/24F_Rx5IlHM

Slides: https://www.slideshare.net/AmazonWebServices/tibco-ai-and-data-science-innovation-with-amazon-sagemaker-ant329s-aws-reinvent-2018

TIBCO Labs - Participate and Innovate in TIBCO Connected Intelligence Cloud

https://community.tibco.com/wiki/tibco-labs

Or start with TIBCO Community Exchange https://bit.ly/2sDvTmI

Spotfire X, Spotfire Data Streams - learn more
What’s New in Spotfire TIBCO Community Page and Spotfire X Webinar Series

https://community.tibco.com/wiki/whats-new-tibco-spotfire

https://www.tibco.com/events/series/spotfire-x-webinar-series

Exploring NYC Traffic Accidents with Spotfire X blog: https://bit.ly/2FPZVeP

Wikipedia Spotfire X and Spotfire Data Streams

Blog: https://bit.ly/2R6ajRT
Demo: https://bit.ly/2Wac2cj
TIBCO Community how-to info: https://bit.ly/2Wac2cj

TIBCO Community Spirit! Some tips

Also use search engines such as Google to easily find relevant content on the TIBCO Community

Add #DataScience as a tag to promote review by the TIBCO Data Science team

Help expedite answers by following these Tips on Asking and Answering Questions

https://community.tibco.com/wiki/tips-asking-and-answering-questions

Search Answers before posting a question - with 15,000 questions your question may have been answered already!

“We are using the TIBCO Community all the time - we have been answering sometimes even our own questions by referring to existing content and answers” Spotfire Customer, UK

Don’t forget to give feedback to answers

What’s new in… for example TIBCO Data Virtualization

https://community.tibco.com/wiki/whats-new-tibco-data-virtualization

Customer Orientation and Customer Success Center - ideal for on-boarding and staying up to date

https://community.tibco.com/wiki/tibco-analytics-new-customer-orientation - feedback appreciated!

https://community.tibco.com/wiki/tibco-spotfire-customer-success-center

TIBCO Geo-Analytics capabilities

https://community.tibco.com/wiki/tibco-spotfire-location-analytics-mapping-geoanalytics-and-spatial-statistics

TIBCO Analytics Meetup pages with recordings and presentations

https://community.tibco.com/wiki/tibco-analytics-meetup

Data Literacy

https://community.tibco.com/wiki/data-literacy

AI on Demand - TIBCO Data Science Meetup Tour 2019 - dates and locations to be published soon

https://community.tibco.com/wiki/ai-demand-data-science-operations

Live TIBCO Spotfire and other Meetups - next Feb 13 (London and Houston) and Feb 26 (Aberdeen)!

https://www.meetup.com/pro/tibco/

Top TIBCO Community links to bookmark

TIBCO NOW now.tibco.com

Questions & Contact

Thank you!

Michael O’Connell
moconnel@tibco.com
@MichOConnellH

Heleen Snelting
hsneltin@tibco.com
@HeleenSnelting

TIBCO Community: community.tibco.com

TIBCO Exchange: community.tibco.com/exchange

Spotfire Trial: spotfire.tibco.com/trial
