GE’s Industrial Data Lake Platform

Date post: 08-Jan-2017
Upload: international-society-of-service-innovation-professionals
Optimized for the Industrial Internet: GE’s Industrial Data Lake Platform
Transcript
Page 1: GE’s Industrial Data Lake Platform

Optimized for the Industrial Internet:

GE’s Industrial Data Lake Platform

Page 2: GE’s Industrial Data Lake Platform

2 GESoftware.com | @GESoftware | #IndustrialInternet

Agenda

Opportunity Solution Result

GE Data Lake

Challenges

Page 3: GE’s Industrial Data Lake Platform

Big opportunities with Industrial Big Data

The power of 1%

Driving outcomes that matter

• Rail: increasing freight utilization
• Healthcare: predictive maintenance
• Power: predictive diagnostics

• $27B industry value by reducing system inefficiency
• $66B industry value with efficiency improvements in gas-fired power plant fleets
• $63B industry value by reducing process inefficiency

Note: Illustrative examples based on potential one percent savings applied across specific global industry sectors over 15 years. Source: GE estimates

Page 4: GE’s Industrial Data Lake Platform

Industrial Big Data – fast and vast

• 50B machines will be connected on the internet by 2020
• 2X industrial data growth within next 10 years
• 50X data growth in healthcare (2012 – 2020)
• 9MM data points per hour for each locomotive
• 500GB data per blade by gas turbines
• 35GB data per day from each smart meter
• 1TB data per flight

In practice only 3% of potentially useful data is tagged and even less is analyzed*

Data types: CRM, ERP, etc.; logs; social network data; geo-location data; sensor data; content (images, videos, manuals, etc.); historian data; machine data

*Sources: IDC, Ericsson, Wikibon, Fast Company, ComputerWeekly
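
To put the per-asset figures on this slide on a common footing, the short Python sketch below converts two of them into average per-second rates. It uses only numbers quoted on the slide; the decimal unit convention (1 GB = 10^9 bytes) and the function names are assumptions made for illustration, since the slide does not state them.

```python
# Back-of-the-envelope conversion of slide figures into per-second rates.
# Assumption: decimal units (1 GB = 10**9 bytes); the slide does not say.

SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400


def per_second_from_hourly(count_per_hour: float) -> float:
    """Average events per second given an hourly count."""
    return count_per_hour / SECONDS_PER_HOUR


def bytes_per_second_from_daily_gb(gb_per_day: float) -> float:
    """Average bytes per second given gigabytes per day (decimal GB)."""
    return gb_per_day * 10**9 / SECONDS_PER_DAY


# 9MM (9 million) data points per hour for each locomotive
locomotive_rate = per_second_from_hourly(9_000_000)

# 35GB of data per day from each smart meter
meter_rate = bytes_per_second_from_daily_gb(35)

print(f"Locomotive: {locomotive_rate:,.0f} data points per second")
print(f"Smart meter: {meter_rate / 1_000:,.0f} kB per second")
```

These are averages only; real sensor streams are bursty, so peak ingestion rates may be considerably higher.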

Page 5: GE’s Industrial Data Lake Platform

Case study – GE Aviation
Asset productivity, minimize disruptions, improved forecasting

• 25 airlines
• 3.4M flights
• 340TB data
• 10X cost reduction
• 7 days time-to-market for new analytic app
• 2000X performance improvement

✓ Isolate root causes
✓ Identify sub-optimal performance parts
✓ Minimize disruptions

Note: Illustrative Aviation example based on Predix solution currently in development. Estimates based on data exploration, simulation and asset utilization models.

Page 6: GE’s Industrial Data Lake Platform

GE Manufacturing – Brilliant Factories

Use case

Quality – Predict the quality of the manufactured part using predictive analytics

Machine health – Collect data from machine tools to enable predictive maintenance

Energy consumption – Save and optimize energy costs. Predict future costs. Use historical data to identify energy consumption trends and how external conditions affect energy consumption

Process optimization – Monitor process, effects, and environmental variables with analytics by operation/machine/area, increasing machine uptime

Business value
• Reduction in defects
• Eliminate unplanned downtime
• Reduce energy costs
• Increase throughput

Page 7: GE’s Industrial Data Lake Platform

GE Oil & Gas – Intelligent Pipeline Solution (IPS)

Use case

Analyze real-time and historical pipeline operational and integrity data:

1) Equipment/asset registry

2) Dashboard visualization of equipment and pipeline networks (geospatial), raw data types and key performance indicators (operational/investigational)

3) Analysis of variability in throughput (pressure). Requires support for multi-user capability, and future accessibility by mobile devices

Data sources: OSI PSI, GIS, Dig Tickets, Risk, Navigate, Maximo, NOAA.

Page 8: GE’s Industrial Data Lake Platform

Today’s approaches are not prepared for onslaught of Industrial Big Data

80% of an analytics project typically involves gathering and then preparing the data for analysis*

• Too slow
• Too rigid
• Too expensive

*Source: IDC

Page 9: GE’s Industrial Data Lake Platform

Yesterday’s data warehouse architecture

TRADITIONAL DATA WAREHOUSE – ONE STATIC DATA MODEL

• All over the place: data across multiple locations
• Snapshot: limited to narrow snapshots and time
• Limited data types: mostly structured and semi-structured data types

Data sources: logs; social network data; geo-location data; CRM, ERP, etc.

Users: data scientist, field operations, business analyst

Questions: What is it telling me? How does it look? How is it doing?

Page 10: GE’s Industrial Data Lake Platform

Industrial Data Lake architecture
Underpinned by data governance appropriate to Business and Location

INDUSTRIAL DATA LAKE – FLEXIBLE DATA MODELS

Rapid access to all data for analytics

• All data: access to real-time data and historical data, not limited to a snapshot of data
• Any data: handling of all data types including documents, images, machine data, sensor data
• One place: access to all data in one place to quickly respond to the speed of business change

Questions:
• How long will it last without failures or maintenance?
• Is my asset ready when there is market opportunity?
• Is my asset performing optimally?
• How to configure for best operational results?

Users: data scientist, field operations, business analyst

Data sources: sensor data; content (images, videos, manuals, etc.); machine data; historian data; CRM, ERP, etc.; logs, click streams; geo-location data; social network data
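
The slide contrasts flexible data models with the single static model of a warehouse but does not say how the flexibility is achieved. One common data-lake technique is schema-on-read: raw records are stored unchanged and each consumer applies its own projection when reading. The Python sketch below illustrates that idea only; it is not taken from the GE platform, and the sample records, field names, and view functions are all hypothetical.

```python
import json

# Raw records are stored exactly as they arrived (hypothetical sample data).
RAW_STORE = [
    json.dumps({"asset": "turbine-7", "ts": 1, "temp_c": 512.0, "rpm": 3000}),
    json.dumps({"asset": "turbine-7", "ts": 2, "temp_c": 530.5, "rpm": 3010}),
    json.dumps({"asset": "loco-12", "ts": 1, "speed_kmh": 88.0}),
]


def read(projection):
    """Apply a consumer-supplied projection to every raw record at read time."""
    rows = (projection(json.loads(line)) for line in RAW_STORE)
    return [row for row in rows if row is not None]


def thermal_view(record):
    """Data-scientist view: temperature readings only."""
    if "temp_c" in record:
        return {"asset": record["asset"], "ts": record["ts"], "temp_c": record["temp_c"]}
    return None


def inventory_view(record):
    """Business-analyst view: which assets reported at all."""
    return {"asset": record["asset"]}


thermal_rows = read(thermal_view)
assets = sorted({row["asset"] for row in read(inventory_view)})
```

Because the stored records are never reshaped, a new question only needs a new view function rather than a change to a shared schema.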

Page 11: GE’s Industrial Data Lake Platform

A day in the life – data management

[Figure: two panels comparing the "Current situation" (rigid) with the "New way" (agile, INDUSTRIAL DATA LAKE) across four stages: data collection, data ingestion, data governance, and analytics and operations. Axes: cost, time (time to analyze), agility. Step labels: data loading; real-time ingestion; replica of source data; add semantic metadata. Users: data scientist, field operations, business analyst. Data sources: CRM, ERP, etc.; logs, click streams; geo-location data; social network data; sensor data; content (images, videos, manuals, etc.); machine data; historian data.]
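
Two of the step labels on this slide, "replica of source data" and "add semantic metadata", suggest an ingestion step that keeps the source bytes untouched and records descriptive metadata next to them. The Python sketch below shows one way that could look. It is an illustration under stated assumptions, not the platform's implementation: the `LakeEntry` class, the `ingest` function, and the tag names are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LakeEntry:
    """One ingested object: an untouched replica plus descriptive metadata."""
    payload: bytes                      # byte-for-byte replica of the source data
    metadata: dict = field(default_factory=dict)


def ingest(payload: bytes, source: str, tags: dict) -> LakeEntry:
    """Store the payload unchanged and attach semantic metadata alongside it."""
    metadata = {
        "source": source,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(payload).hexdigest(),
        **tags,                         # hypothetical semantic tags
    }
    return LakeEntry(payload=payload, metadata=metadata)


raw = json.dumps({"sensor": "vib-03", "value": 0.42}).encode("utf-8")
entry = ingest(raw, source="historian", tags={"asset_type": "gas turbine", "unit": "mm/s"})
```

Keeping the payload as opaque bytes means no parsing decision is made at ingestion time; interpretation is deferred to whoever reads the entry later.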

Page 12: GE’s Industrial Data Lake Platform

Data Lake Consumption Patterns

Use case patterns

Integration & visualization
• Structured data
• Batch / CDC based replication
• Simple transformation w/ mastering
• Reporting tools

Advanced analytics
• Structured + unstructured
• Batch / CDC based replication
• Machine learning, predictive modeling
• Reporting tools & self discovery tools

Search + API
• Structured + unstructured
• Batch / CDC based replication
• Machine learning, predictive modeling
• Search & API access to visualization

Real-time analytics
• Structured + unstructured
• Real time ingestion
• Simple real time processing
• API based real time access

Page 13: GE’s Industrial Data Lake Platform

Industrial Data Lake
Optimized for industrial workloads

Optimized for mission-critical workloads for addressing key SLAs such as security, resiliency, etc. for Industrial Internet applications

Fast ingestion, storage and compute including machine data to support multiple schema and data types

High-performance analysis using massively parallel processing architecture supporting Apache Hadoop

Data governance and federation, with geographically-dispersed deployment options

Page 14: GE’s Industrial Data Lake Platform

Big Data without Governance

✓ Dumping data into a Big Data lake without repeatable processes and data governance will create a messy, uncontrollable data environment

✓ Insights harvested from an ungoverned data lake are not reliable or trustworthy

✓ If the insights cannot be fully trusted, it is difficult to make business decisions confidently

Solutions for Industrial Internet, deep domain expertise

Page 15: GE’s Industrial Data Lake Platform

GE as a Custodian of Customer Owned Data & Services

Custodian Roles

Enforcement & Measurement

Infrastructure

Protection

Privacy

Data Management

Custodian: a person who has responsibility for or looks after something

Synonyms: keeper, guardian, steward, protector ("the custodian of the relic")

Access Controls – Visibility – Metrics…

Customer Owned Data

Page 16: GE’s Industrial Data Lake Platform

Governance Disciplines

• Metadata: data dictionary, directory of all assets, classification and tagging
• Lifecycle: provenance, lineage, retention
• Quality: accuracy, completeness, consistency
• Auditing: monitoring, logging, log analysis
• Compliance: regulatory, corporate

Protect, Manage and Improve Information
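
The quality discipline on this slide names completeness and consistency without defining how they would be measured. As a minimal sketch, assuming records arrive as dictionaries, the Python below scores completeness as the share of records with all required fields populated and flags records whose unit differs from an expected one. The function names, field names, and sample readings are hypothetical.

```python
def completeness(records, required_fields):
    """Fraction of records in which every required field is present and not None."""
    if not records:
        return 0.0
    complete = sum(
        all(rec.get(name) is not None for name in required_fields)
        for rec in records
    )
    return complete / len(records)


def inconsistent_units(records, expected_unit):
    """Return records whose 'unit' differs from the expected one."""
    return [rec for rec in records if rec.get("unit") != expected_unit]


readings = [
    {"asset": "pump-1", "value": 3.2, "unit": "bar"},
    {"asset": "pump-1", "value": None, "unit": "bar"},
    {"asset": "pump-2", "value": 45.0, "unit": "psi"},
    {"asset": "pump-2", "value": 3.1, "unit": "bar"},
]

score = completeness(readings, ["asset", "value", "unit"])
outliers = inconsistent_units(readings, "bar")
```

Checks like these are cheap to run at ingestion and give data stewards a concrete number to track over time.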

Page 17: GE’s Industrial Data Lake Platform

[Figure: Data Lake onboarding with full governance enforcement. Process stages: Qualification, Fit Gap Analysis, Design & build, Quarantine, OnBoard. Other labels: Onboarding Tool; Deployment; API; Platform Engineering; Certified Patterns; DevOps; Data Vault. Bullet groups: Outcomes, Data, Parties; Compliance, Contractual, Security, Risk; Design, Testing, DevOps.]

Page 18: GE’s Industrial Data Lake Platform

Data Governance – Key Technology Capabilities

Capability: Description

• Metadata Catalog: A repository of metadata describing data elements, lineage and relationships of the data within the lake
• Business Glossary: Glossary of business terms, definitions and related properties built on top of the technical metadata catalog
• Policy Store: Centralized store of access & usage policies and rights by data object
• Policy Enforcement: Engine to provide security enforcement as a service – abstracted from consumption method
• Collaboration & Governance: Portal providing data stewards, providers, custodians and consumers with collaboration, issue resolution, enhancement request and change management capabilities
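
The policy store and policy enforcement capabilities described on this slide can be pictured as a central lookup plus a single decision point that every access path calls. The Python sketch below is one possible shape for that, under the assumption of a simple role-based model with deny-by-default; the class names, roles, and data object name are hypothetical and do not come from the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    data_object: str          # name of the governed data object
    role: str                 # role the policy applies to
    actions: frozenset        # actions that role may perform


class PolicyStore:
    """Centralized store of access policies, keyed by data object and role."""

    def __init__(self):
        self._policies = {}

    def add(self, policy: Policy) -> None:
        self._policies[(policy.data_object, policy.role)] = policy

    def allowed_actions(self, data_object: str, role: str) -> frozenset:
        policy = self._policies.get((data_object, role))
        return policy.actions if policy else frozenset()


class PolicyEnforcer:
    """Single decision point shared by every consumption method (SQL, API, search)."""

    def __init__(self, store: PolicyStore):
        self._store = store

    def check(self, role: str, action: str, data_object: str) -> bool:
        # Deny by default: anything without an explicit policy is refused.
        return action in self._store.allowed_actions(data_object, role)


store = PolicyStore()
store.add(Policy("engine_telemetry", "data_scientist", frozenset({"read"})))
enforcer = PolicyEnforcer(store)

can_read = enforcer.check("data_scientist", "read", "engine_telemetry")
can_delete = enforcer.check("data_scientist", "delete", "engine_telemetry")
can_analyst_read = enforcer.check("business_analyst", "read", "engine_telemetry")
```

Routing every consumption method through the same `check` call is what keeps enforcement independent of how the data is reached.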

Page 19: GE’s Industrial Data Lake Platform

Security Risk for Big Data

✓ More data implies higher risk of exposure

✓ New data types may give rise to new security breach scenarios

✓ Evolving and experimental analysis implies security policies are less likely to be in place

✓ Linkage to other data already under compliance may create scenarios where compliance could be violated

Page 20: GE’s Industrial Data Lake Platform

Top Opportunity Areas for Security

Perimeter: Infrastructure
• Communication protocols
• Key management

Protection: Encryption
• Access policy based encryption
• Searching / filtering encrypted data
• Secure outsourcing of computation

Access Control: Privacy
• Secure dissemination
• Secure data collection / aggregation
• Secure collaboration

Visibility: Data Management
• Data integrity/Provenance
• Proof of data storage
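
Of the areas listed on this slide, data integrity is the simplest to illustrate. A minimal sketch, assuming a SHA-256 digest is recorded when data is stored and recomputed when it is read, is shown below in Python using the standard `hashlib` and `hmac` modules. The function names and sample payload are hypothetical, and this covers integrity checking only, not the cryptographic proof-of-storage schemes the slide also mentions.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """SHA-256 digest recorded when the data is first stored."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    return hmac.compare_digest(fingerprint(data), recorded_digest)


original = b"pressure=71.3;station=PS-4"
digest = fingerprint(original)

unchanged_ok = is_intact(original, digest)
tampered_ok = is_intact(b"pressure=17.3;station=PS-4", digest)
```

A bare digest only detects changes if the digest itself is kept somewhere an attacker cannot rewrite; a keyed construction such as HMAC would be needed otherwise.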

Page 21: GE’s Industrial Data Lake Platform

Platform Validation via Information Security Policy Framework (ISPF)

Main Design Characteristics

• Full ISO 27001/2 Certification
• End-to-end Infrastructure Visibility
• Extend Compliance for GE and its Customers
• Unified Monitoring Framework
• Central Controls – Separate Monitors
• Complete Suite – Layered Approach

• ISMS (Information Security Mgmt System)
• Common Control Set
• ISMS Management
• Governance & Ops

One Framework – Multiple Certifications

Page 22: GE’s Industrial Data Lake Platform

Thank you

General Electric reserves the right to make changes in specifications and features, or discontinue the product or service described at any time, without notice or obligation. These materials do not constitute a representation, warranty or documentation regarding the product or service featured. Illustrations are provided for informational purposes, and your configuration may differ.

This information does not constitute legal, financial, coding, or regulatory advice in connection with your use of the product or service. Please consult your professional advisors for any such advice.

GE, the GE Monogram, Predix, Predictivity are trademarks of General Electric Company.

©2014 General Electric Company – All rights reserved.

