Synthesis - Inside Analysis · 7/7/2018

Date post: 09-Oct-2020
Page 1: Synthesis: A New Webinar Series

Page 2: The Line Up

ANALYST: Eric Kavanagh, CEO, Inside Analysis

ANALYST: Dave Wells, Research Analyst, Eckerson Group

GUEST: Amar Arsikere, Founder & CEO, Infoworks

Page 3: What are the Building Blocks of a Modern Data Pipeline?

Dave Wells

[email protected]

© Eckerson Group 2018 www.eckerson.com

Page 4: Pipeline Components

• Origin
• Dataflow (origin → destination)
• Destination
• Storage
• Processing
• Workflow
• Monitoring
• Technology

Page 5: Destination: Purpose and End Point

Origin

Sources:
• Legacy
• Transaction
• Web
• 3rd Party
• Social Media
• Machine
• Geospatial

Stores:
• Staging
• Warehouse
• Data Mart
• MDM
• ODS
• Data Lake
• Sandbox

Dataflow: origin → destination

Storage:
• temporary files
• staging tables
• data warehouse
• data mart
• operational data store
• master data repository

Processing:
• extract transform load
• map reduce
• extract load transform
• connect abstract publish
• sample blend format

Workflow:
• scheduling
• execution
• failover
• distribution
• verification

Monitoring:
• health check
• performance
• logging
• debugging

Technology

Destination

Stores:
• Staging
• Warehouse
• Data Mart
• MDM
• ODS
• Data Lake
• Sandbox

Applications:
• Reporting
• OLAP
• Scorecards
• Dashboards
• Exploration
• Analytics

Page 6: Destination: Timeliness

What requirements for real time data?

What criteria for right time data?

For which data is latency okay? And how much latency?

Page 7: Origin: Data Supply and Begin Point

Page 8: Origin: Data Type and Velocity

Which data is event based & which is entity data?

Is event data stored or streamed?

How quickly must data be gathered from sources?

How frequently must data be gathered from sources?

Page 9: Data Flow: Data in Motion

Page 10: Data Flow: Pipeline Boundaries

One pipeline:

orders → extract, A/R → extract → cleanse → load → data warehouse → extract → aggregate → load → A/R data mart

Two pipelines:

orders → extract, A/R → extract → cleanse → load → data warehouse

data warehouse → extract → aggregate → load → A/R data mart
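The boundary choice above can be sketched in code. This is only a minimal illustration, assuming in-memory lists stand in for the order and A/R sources, the data warehouse, and the A/R data mart. Every function name and record field below is hypothetical and does not come from the slides.

```python
from collections import defaultdict

# Hypothetical source records; the field names are assumptions for illustration.
ORDERS = [{"customer": "acme", "amount": 120.0}, {"customer": " Acme ", "amount": 80.0}]
AR = [{"customer": "zenith", "amount": 50.0}]


def extract(source):
    """Copy records out of a source system."""
    return [dict(row) for row in source]


def cleanse(rows):
    """Standardize the customer key so records from both sources conform."""
    return [{**row, "customer": row["customer"].strip().lower()} for row in rows]


def aggregate(rows):
    """Total the amount per customer for the data mart."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


def load_warehouse():
    """First boundary: sources -> extract -> cleanse -> load -> warehouse."""
    return cleanse(extract(ORDERS) + extract(AR))


def load_mart(warehouse):
    """Second boundary: warehouse -> extract -> aggregate -> load -> mart."""
    return aggregate(extract(warehouse))


def one_pipeline():
    """A single pipeline runs end to end, from the sources to the data mart."""
    return load_mart(load_warehouse())


# Two pipelines: the warehouse is persisted between them, so the mart
# pipeline can be scheduled, rerun, or monitored on its own.
warehouse = load_warehouse()
mart = load_mart(warehouse)
```

Both arrangements produce the same data mart here. What differs is where the pipeline is considered to end, and therefore which steps are scheduled and monitored together.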

Page 11: Data Storage: Data at Rest

Storage:
• temporary files
• staging tables
• data lake
• data warehouse
• master data repository
• analytics sandbox

Page 12: Data Storage: Which is the Right Data Store?

Volume of data?

Structure & format?

Duration & retention?

Query frequency & volume?

Other users and uses?

Governance constraints?

Privacy & security?

Disaster recovery?

Page 13: Processing: Adding Value and Creating Data Products

Page 14: Processing: Stages of the Data Lifecycle

Ingest:
• export
• extraction
• replication
• messaging
• streaming

Persist:
• databases
• files
• in-memory
• duration?
• access?

Transform (improve, enrich, format):
• standardize & conform
• cleanse & quality assure
• de-duplicate
• derive
• append
• aggregate
• sort, sequence, & pivot
• sample, select, filter, & mask
• assemble & construct

Deliver:
• publishing
• cataloging
• modeling
• visualization
• storytelling
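The four stages above can be illustrated with a small sketch. It assumes newline-delimited JSON as the ingested export and an in-memory list as the persisted store. The record fields and function names are hypothetical, and only three of the listed transformations are shown: standardize, de-duplicate, and aggregate.

```python
import json

# Hypothetical raw input: newline-delimited JSON, as an export might deliver it.
RAW = "\n".join([
    '{"id": 1, "region": "west ", "sales": 10}',
    '{"id": 1, "region": "west ", "sales": 10}',
    '{"id": 2, "region": "EAST", "sales": 5}',
    '{"id": 3, "region": "West", "sales": 7}',
])


def ingest(text):
    """Ingest: parse exported records into dictionaries."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def persist(rows, store):
    """Persist: keep the raw records (here in memory) before transforming."""
    store.extend(rows)
    return store


def transform(rows):
    """Transform: standardize, de-duplicate, then aggregate sales by region."""
    standardized = [{**r, "region": r["region"].strip().lower()} for r in rows]
    unique = {r["id"]: r for r in standardized}.values()  # de-duplicate on id
    totals = {}
    for r in unique:
        totals[r["region"]] = totals.get(r["region"], 0) + r["sales"]
    return totals


def deliver(totals):
    """Deliver: publish the result as sorted, report-ready lines."""
    return [f"{region}: {total}" for region, total in sorted(totals.items())]


staging = persist(ingest(RAW), [])
report = deliver(transform(staging))
```

Persisting the raw records before transforming them is one possible answer to the "duration?" and "access?" questions: the untransformed data stays available for reprocessing.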

Page 15: Workflow: Sequence and Dependencies

Upstream Dependencies
• Task Dependencies: Requires successful completion of one or more preceding tasks
• Job Dependencies: Requires successful completion of one or more preceding jobs

Downstream Dependencies
• Task Dependencies: Tasks later in execution sequence wait for successful completion
• Job Dependencies: Jobs later in execution sequence wait for successful completion

Synchronous Dependencies
• Task Dependencies: Parallel execution of multiple tasks requires all tasks to finish successfully
• Job Dependencies: Parallel execution of multiple jobs requires all jobs to finish successfully
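The dependency rules above can be sketched with Python's standard `graphlib` module. The task names and the `run` helper are hypothetical. For simplicity the sketch runs tasks one at a time in a valid order rather than truly in parallel, and it marks a task as skipped when any upstream task did not succeed.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the upstream tasks it depends on.
DEPENDENCIES = {
    "extract_orders": set(),
    "extract_ar": set(),
    "cleanse": {"extract_orders", "extract_ar"},  # synchronous: needs both
    "load_warehouse": {"cleanse"},
    "aggregate": {"load_warehouse"},
}


def run(tasks, dependencies):
    """Run tasks in dependency order; skip any task whose upstream failed."""
    status = {}
    for name in TopologicalSorter(dependencies).static_order():
        upstream_ok = all(status[dep] == "ok" for dep in dependencies[name])
        if not upstream_ok:
            status[name] = "skipped"  # downstream tasks wait for success
            continue
        try:
            tasks[name]()
            status[name] = "ok"
        except Exception:
            status[name] = "failed"
    return status


def fail():
    raise RuntimeError("source unavailable")


tasks = {name: (lambda: None) for name in DEPENDENCIES}
all_ok = run(tasks, DEPENDENCIES)

tasks["extract_ar"] = fail  # simulate one failed extract
with_failure = run(tasks, DEPENDENCIES)
```

In the failure case, `cleanse` and everything after it are skipped even though `extract_orders` succeeded, which is the synchronous rule: all of the tasks it waits on must finish successfully.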

Page 16: Monitoring: Pipeline Health

What to watch?

Who is watching?

Using what tools?

What thresholds & limits?

What actions & when?
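One way to make the threshold and action questions concrete is a table of limits checked against observed metrics. The sketch below is illustrative only: the metric names, limits, and actions are assumptions rather than recommendations from the slides.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """A limit for one metric and the action to take when it is exceeded."""
    metric: str
    limit: float
    action: str


# Hypothetical thresholds; the metric names, limits, and actions are assumptions.
THRESHOLDS = [
    Threshold("run_minutes", 30.0, "page on-call engineer"),
    Threshold("rejected_rows", 100.0, "open data quality ticket"),
    Threshold("minutes_since_last_load", 60.0, "notify pipeline owner"),
]


def health_check(metrics, thresholds):
    """Return (metric, value, action) for every metric over its limit."""
    alerts = []
    for t in thresholds:
        value = metrics.get(t.metric)
        if value is not None and value > t.limit:
            alerts.append((t.metric, value, t.action))
    return alerts


observed = {"run_minutes": 42.0, "rejected_rows": 3.0, "minutes_since_last_load": 15.0}
alerts = health_check(observed, THRESHOLDS)
```

Keeping the thresholds as data rather than as code means the people who are watching can adjust the limits and actions without changing the check itself.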

Page 17: Technology: Pipeline Tools

Technology: Hadoop, Databases, ETL, Automation, Virtualization, Analytics, Cataloging, Data Preparation …

Page 18: Design Summary: Scope and Complexity

Origin: Sources (Legacy, Transaction, Web, 3rd Party, Social Media, Machine, Geospatial); Stores (Staging, Warehouse, Data Mart, MDM, ODS, Data Lake, Sandbox)

Destination: Stores (Staging, Warehouse, Data Mart, MDM, ODS, Data Lake, Sandbox); Applications (Reporting, OLAP, Scorecards, Dashboards, Exploration, Analytics)

Dataflow: origin → destination

Storage: temporary files, staging tables, data lake, data warehouse, master data repository, analytics sandbox

Processing: extract transform load; map reduce; extract load transform; connect abstract publish; sample blend format

Workflow: scheduling, execution, failover, distribution, verification

Monitoring: health check, performance, logging, debugging

Technology: Hadoop, Databases, ETL, Automation, Virtualization, Analytics, Cataloging, Data Preparation …

Page 19: Infoworks Overview

The Automated Software Platform for Agile Data Engineering

July 18, 2018

© 2018 | Confidential

Page 20: What Will The Data & Analytics World Look Like In 3-5 Years?

Page 21: The Goal: Companies Want to Emulate Google, Facebook and Amazon

• Winners and losers will be determined by the data & analytics agility of the company: the ability to handle
– a large number of analytic use cases
– large amounts of data
– a large number of users
– rapid and frequent changes

• This requires companies to:
– Automate data engineering
– Have end-to-end functionality in a single place
– Design once and deploy anywhere, on-premise or cloud

Page 22: The Challenge: Data Engineering is “Death by 1000 Paper Cuts”

Data Ingestion
• Change data capture
• Parallelization of data load
• Slowly changing dimensions
• Conversion of source types to big data types

Data Synchronization
• Data Merge
• Data Synch
• History table creation

Data Transformation
• Building initial load data pipelines
• Building CDC pipelines
• Building SCD pipelines
• Pipeline change management
• End to end lineage creation

Data Models
• Building semantic models
• Building OLAP cubes
• Building in-memory models

Data Governance
• Data access control
• Change management tracking
• Enabling compliance reporting

Performance Optimization
• Tuning of data load
• Tuning of data transformation
• Tuning of cube generation
• Tuning of in memory models

Production Orchestration
• Scaling jobs
• Migration from dev to production
• Operationalizing data science models
• Monitoring operational environment
• Identifying and restarting failed jobs
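To give a sense of one of these "paper cuts", the sketch below hand-codes a simplified change data capture (CDC) merge with a history table. It is a generic illustration, not how Infoworks or any particular product implements it. The change-record layout (`op`, `key`, `row`) and the function name are assumptions, and it does not attempt a full slowly changing dimension design.

```python
def apply_changes(target, changes, history):
    """Merge captured changes into a keyed target table and log prior versions."""
    for change in changes:
        key, op = change["key"], change["op"]
        if op not in ("insert", "update", "delete"):
            raise ValueError(f"unknown operation: {op}")
        if key in target:
            # Keep the version being replaced or removed in the history table.
            history.append({"key": key, "row": target[key]})
        if op == "delete":
            target.pop(key, None)
        else:
            target[key] = change["row"]
    return target


# Hypothetical target table keyed by customer id, and a batch of captured changes.
customers = {
    1: {"name": "Acme", "city": "Austin"},
    2: {"name": "Zenith", "city": "Boise"},
}
changes = [
    {"op": "update", "key": 1, "row": {"name": "Acme", "city": "Dallas"}},
    {"op": "delete", "key": 2},
    {"op": "insert", "key": 3, "row": {"name": "Orbit", "city": "Reno"}},
]
history = []
apply_changes(customers, changes, history)
```

Even this toy version has to decide how to order changes, what to keep as history, and how to treat unknown operations. A real pipeline also needs change ordering across batches, schema changes, and restart handling.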

Page 23: The Solution: An Agile Data Engineering Platform

Three Pillars of an Agile Data Engineering Platform

• Automation
– Code-free automation of data engineering from data source to point of consumption

• Infrastructure Independence
– Portable between and across environments on premise or in the cloud

• Platform Extensibility
– Supports customer or 3rd party applications

Page 24: Infoworks Agile Data Engineering Platform

• End to end automation
• Portable across all data & compute platforms
• All components are API accessible

Any Source: Mainframe, Netezza, Teradata, Oracle, JSON, XML, CSV, Streaming, etc.

Autonomous Data Engine:
• Data Source Crawling
• Workload Migration
• Data Ingestion & Sync
• Data Transformation & Pipeline Design
• Data Models, Cubes & In-Memory Models
• Advanced Analytics Integration

Any Analytics: Data Science, AI & Machine Learning

Orchestration and Production

Data Ops Management

Any Big Data Platform

Page 25: Automation: Allows You to Focus on Business Results

Infoworks Autonomous Data Engine

DATA INGESTION & SYNC
• Automatic Ingestion / CDC
• Automatic Data Type Conversion
• Auto Crawling
• Automatic Schema Change
• Automatic Merge

DATA TRANSFORMATION
• Automated Incremental Pipelines
• Automated Data Validation
• Automated Dependency Management
• Suggest New Data Connections

HI-PERF MODELS
• Automatically optimize data models
• Auto create OLAP cubes
• Automatically maintain time axis

PRODUCTION OPERATIONS
• Automated metadata lineage to source
• Automated Fault tolerance
• Restartability
• Monitor/Debug

ENTERPRISE IT
• Configure New Data Sources & Authorize Access
• Provision & Manage Platform Infrastructure

BUSINESS ANALYSTS
• Define and Implement Analytics

• Eliminates the need for specialized talent and consultants
• Enables new use cases to be launched 10x faster with fewer resources

Page 26: Customer Case Studies

Data Lake Creation (Fortune 100 Technology Co.)

Implemented enterprise-wide Data Lake involving 1500 data sources

• Synchronized data (CDC/Merge) from 1500 sources
• Serving reference data for all enterprise analytics
• Implemented by 2 engineers in < 2 months including a data shopping cart

Without Infoworks: ~2 years. With Infoworks: 60 days. 10x Improvement

Self Service BI & Cloud Portability

Built self-serve BI use case dashboards in 4 days and migrated from Azure to GCP in 1 day

• 7 data sources
• 8 pipelines
• 8 optimized models
• 3 cubes
• 13 reports & dashboards
• Sub-second query response

Without Infoworks: ~6 months. With Infoworks: 1 day. 180x Improvement

Advanced Analytics (Fortune 10 Healthcare Co.)

Implemented a complex, near-real-time machine learning business process in 19 days

• Synchronized with Teradata every 10 mins
• 15 min data-availability SLA
• Implemented by 2 engineers in 19 days from requirements to production

Without Infoworks: ~6 months. With Infoworks: 19 days. 9.5x Improvement

Page 27: Infoworks: The Agile Data Engineering Platform

• End to end functionality in an integrated platform
– Full data, metadata & business logic in one place

• End to end data engineering automation
– Schema evolution, incremental pipelines, type management, query routing, query optimization, model recommendation, dependency management, …

• Infrastructure independence
– Portable across different environments, on-premise or cloud

• Platform extensibility
– Third party apps, customer apps, API integration

• Ingestion

• Merge

• Data transformation

• Data models

• Data acceleration

• BI/AI/ML Integration

• Data governance & lineage

• Workload migration

Page 29: Q&A