Date post: | 08-Jan-2017 |
Category: |
Data & Analytics |
Upload: | international-society-of-service-innovation-professionals |
Optimized for the Industrial Internet:
GE’s Industrial Data Lake Platform
2 GESoftware.com | @GESoftware | #IndustrialInternet
Agenda
Opportunity Solution Result
GE Data Lake
Challenges
Big opportunities with Industrial Big Data
The power of 1%
Increasing freight utilization (rail)
Predictive maintenance (healthcare)
Predictive diagnostics (power)
Driving outcomes that matter
$27B industry value by reducing system inefficiency
$66B industry value with efficiency improvements in gas-fired power plant fleets
$63B industry value by reducing process inefficiency
Note: Illustrative examples based on potential one percent savings applied across specific global industry sectors over 15 years. Source: GE estimates
Industrial Big Data – fast and vast
50B machines will be connected on the internet by 2020
2X industrial data growth within next 10 years
In practice only 3% of potentially useful data is tagged and even less is analyzed*
9MM data points per hour for each locomotive
500GB data per blade by gas turbines
35GB data per day from each Smart Meter
50X data growth in healthcare (2012 – 2020)
1TB data per flight
Data types: CRM, ERP, etc.; Logs; Social network data; Geo-location data; Sensor data; Content (images, videos, manuals, etc.); Historian data; Machine data
*Sources: IDC, Ericsson, Wikibon, Fast Company, ComputerWeekly
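To put these volumes on a common footing, the per-hour and per-day figures can be converted into sustained per-second rates. The following is a minimal sketch; the function names are illustrative, and reading GB as 10^9 bytes is an assumption the slide does not state.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def per_second_from_hourly(count_per_hour: float) -> float:
    """Convert an hourly count into an average per-second rate."""
    return count_per_hour / SECONDS_PER_HOUR


def bytes_per_second_from_daily_gb(gb_per_day: float) -> float:
    """Convert a daily volume in GB into average bytes per second.

    Assumption: 1 GB = 10**9 bytes (decimal), which the slide does not specify.
    """
    return gb_per_day * 10**9 / SECONDS_PER_DAY


# 9MM data points per hour for each locomotive
locomotive_points_per_second = per_second_from_hourly(9_000_000)  # 2500.0

# 35GB data per day from each smart meter (roughly 405 kB/s on average)
smart_meter_bytes_per_second = bytes_per_second_from_daily_gb(35)
```

Seen this way, a single locomotive averages 2,500 data points every second, which is the kind of sustained rate an ingestion layer has to absorb per asset.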
Case study – GE Aviation
Asset productivity, minimize disruptions, improved forecasting
25 Airlines
3.4M Flights
340TB Data
10X Cost reduction
7 days Time-to-market for new analytic app
2000X Performance improvement
✓ Isolate root causes
✓ Identify sub-optimal performance parts
✓ Minimize disruptions
Note: Illustrative Aviation example based on Predix solution currently in development. Estimates based on data exploration, simulation and asset utilization models.
GE Manufacturing – Brilliant Factories
Use case
Quality – Predict the quality of the manufactured part using predictive analytics
Machine health – Collect data from machine tools to enable predictive maintenance
Energy Consumption – Save and optimize energy costs. Predict future costs. Use historical data to identify energy consumption trends and how external conditions affect energy consumption
Process Optimization – Track process, effects, and environmental variables for monitoring analytics by operation, machine, or area, increasing machine uptime
Business Value
• Reduction in defects
• Eliminate unplanned downtime
• Reduce energy costs
• Increase throughput
GE Oil & Gas – Intelligent Pipeline Solution (IPS)
Use case
Analyze real-time and historical pipeline operational and integrity data:
1) Equipment/asset registry
2) Dashboard visualization of equipment and pipeline networks (geospatial), raw data types and key performance indicators (operational/investigational)
3) Analysis of variability in throughput (pressure)
Requires support for multi-user capability, and future accessibility by mobile devices
Data sources: OSI PSI, GIS, Dig Tickets, Risk, Navigate, Maximo, NOAA.
80% of an analytics project typically involves gathering and then preparing the data for analysis*
Today’s approaches are not prepared for the onslaught of Industrial Big Data
*Source: IDC
Too slow
Too rigid
Too expensive
Yesterday’s data warehouse architecture
TRADITIONAL DATA WAREHOUSE: ONE STATIC DATA MODEL
All over the place: Data across multiple locations
Snapshot: Limited to narrow snapshots and time
Limited data types: Mostly structured and semi-structured data types
Data sources: Logs; Social network data; Geo-location data; CRM, ERP, etc.
Users: Data scientist, Field operations, Business analyst
Questions asked: What is it telling me? How does it look? How is it doing?
Industrial Data Lake architecture
Underpinned by data governance appropriate to Business and Location
INDUSTRIAL DATA LAKE: FLEXIBLE DATA MODELS
Rapid access to all data for analytics
All data: Access to real-time data and historical data, not limited to a snapshot of data
Any data: Handling of all data types, including documents, images, machine data, sensor data
One place: Access to all data in one place to quickly respond to the speed of business change
Data sources: Sensor data; Content (images, videos, manuals, etc.); Machine data; Historian data; CRM, ERP, etc.; Logs, click streams; Geo-location data; Social network data
Users: Data scientist, Field operations, Business analyst
Questions asked: How long will it last without failures or maintenance? Is my asset ready when there is market opportunity? Is my asset performing optimally? How to configure for best operational results?
A day in the life – data management
[Figure: "Current situation" versus "New way" for the data scientist, field operations and business analyst. Pipeline stages: Data collection, Data ingestion, Data governance, Analytics and operations. Axes: Cost, Time (with "Time to analyze" marked). Agility: Rigid (current situation) versus Agile (INDUSTRIAL DATA LAKE).]
Industrial Data Lake ingestion: Data loading or real-time ingestion; Replica of source data; Add semantic metadata
Data sources: CRM, ERP, etc.; Logs, click streams; Geo-location data; Social network data; Sensor data; Content (images, videos, manuals, etc.); Machine data; Historian data
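The ingestion steps named on this slide (keep a replica of the source data, then add semantic metadata) can be illustrated with a short sketch. This is not the Predix implementation; `LakeEntry`, `ingest`, `find_by_tag` and all field names are hypothetical, chosen only to show the idea of storing the payload untouched and describing it alongside.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class LakeEntry:
    raw: bytes  # byte-for-byte replica of the source payload, never modified
    metadata: dict[str, Any] = field(default_factory=dict)


def ingest(payload: bytes, source: str, data_type: str, tags: list[str]) -> LakeEntry:
    """Store the payload as-is and attach descriptive (semantic) metadata."""
    metadata = {
        "source": source,          # e.g. a historian or an ERP system
        "data_type": data_type,    # e.g. "sensor", "log", "content"
        "tags": sorted(set(tags)), # de-duplicated tags for later discovery
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    return LakeEntry(raw=payload, metadata=metadata)


def find_by_tag(entries: list[LakeEntry], tag: str) -> list[LakeEntry]:
    """Look entries up by tag without parsing or altering the raw payload."""
    return [entry for entry in entries if tag in entry.metadata["tags"]]


# Hypothetical payloads; the sources and tags are made up for illustration.
lake = [
    ingest(b"ts=1;egt=612.4", "historian", "sensor", ["turbine", "temperature"]),
    ingest(b"order=42;status=open", "erp", "structured", ["maintenance"]),
]
turbine_entries = find_by_tag(lake, "turbine")
```

Because the raw bytes are preserved, a schema can be applied later at read time, which is what allows different users to work from flexible data models over the same replica.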
Data Lake Consumption Patterns
Use case patterns
1 Integration & visualization: Structured data; Batch / CDC based replication; Simple transformation w/ mastering; Reporting tools
2 Advanced Analytics: Structured + Unstructured; Batch / CDC based replication; Machine Learning, Predictive Modeling; Reporting tools & self discovery tools
3 Search + API: Structured + Unstructured; Batch / CDC based replication; Machine Learning, Predictive Modeling; Search & API access to visualization
4 Real-time analytics: Structured + Unstructured; Real time ingestion; Simple real time processing; API based real time access
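The four consumption patterns on this slide can be written down as a small lookup table, which makes it easier to see what distinguishes them. A minimal sketch follows; the class name, the field names (`data`, `ingestion`, `processing`, `access`) and the helper `patterns_for` are assumptions introduced here, while the values are taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsumptionPattern:
    name: str
    data: str        # kinds of data the pattern works on
    ingestion: str   # how data reaches the lake
    processing: str  # what is done to it
    access: str      # how consumers reach the result


PATTERNS = [
    ConsumptionPattern("Integration & visualization", "Structured",
                       "Batch / CDC based replication",
                       "Simple transformation w/ mastering", "Reporting tools"),
    ConsumptionPattern("Advanced Analytics", "Structured + Unstructured",
                       "Batch / CDC based replication",
                       "Machine Learning, Predictive Modeling",
                       "Reporting tools & self discovery tools"),
    ConsumptionPattern("Search + API", "Structured + Unstructured",
                       "Batch / CDC based replication",
                       "Machine Learning, Predictive Modeling",
                       "Search & API access to visualization"),
    ConsumptionPattern("Real-time analytics", "Structured + Unstructured",
                       "Real time ingestion", "Simple real time processing",
                       "API based real time access"),
]


def patterns_for(real_time: bool) -> list[str]:
    """Names of the patterns that do (or do not) rely on real-time ingestion."""
    return [p.name for p in PATTERNS
            if ("real time" in p.ingestion.lower()) == real_time]


batch_patterns = patterns_for(real_time=False)
streaming_patterns = patterns_for(real_time=True)  # ['Real-time analytics']
```

Laid out like this, only the fourth pattern changes the ingestion path; the first three differ mainly in processing and in how results are consumed.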
Industrial Data Lake
Optimized for industrial workloads
Optimized for mission-critical workloads, addressing key SLAs such as security, resiliency, etc. for Industrial Internet applications
Fast ingestion, storage and compute, including machine data, to support multiple schemas and data types
High-performance analysis using massively parallel processing architecture supporting Apache Hadoop
Data governance and federation, with geographically dispersed deployment options
Big Data without Governance
✓ Dumping data into a Big Data lake without repeatable processes and data governance creates a messy, uncontrollable data environment
✓ Insights harvested from an ungoverned data lake are not reliable or trustworthy
✓ If the insights cannot be fully trusted, it is difficult to make business decisions confidently
Solutions for Industrial Internet, deep domain expertise
GE as a Custodian of Customer Owned Data & Services
Custodian Roles
Enforcement & Measurement
Infrastructure
Protection
Privacy
Data Management
Custodian: a person who has responsibility for or looks after something
Synonyms: keeper, guardian, steward, protector
"the custodian of the relic"
Access Controls – Visibility –Metrics…
Customer Owned Data
Governance Disciplines
Metadata: Data Dictionary, Directory of all assets, Classification and Tagging
Lifecycle: Provenance, Lineage, Retention
Quality: Accuracy, Completeness, Consistency
Auditing: Monitoring, Logging, Log Analysis
Compliance: Regulatory, Corporate
Protect, Manage and Improve Information
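Two of the Quality disciplines listed above, completeness and consistency, lend themselves to simple automated checks. The sketch below shows one possible form of each; the function names, the record layout and the sample readings are hypothetical and not part of the GE platform.

```python
from typing import Any, Iterable

Record = dict[str, Any]


def completeness(records: list[Record], required: Iterable[str]) -> float:
    """Fraction of records in which every required field is present and not None."""
    required = list(required)
    if not records:
        return 1.0  # assumption: an empty batch counts as vacuously complete
    complete = sum(
        all(rec.get(name) is not None for name in required) for rec in records
    )
    return complete / len(records)


def consistent_type(records: list[Record], name: str) -> bool:
    """True when all non-missing values of a field share a single Python type."""
    types = {type(rec[name]) for rec in records if rec.get(name) is not None}
    return len(types) <= 1


# Hypothetical sensor readings; asset IDs and field names are made up.
readings = [
    {"asset_id": "T-1", "temp_c": 410.0},
    {"asset_id": "T-2", "temp_c": None},   # missing measurement
    {"asset_id": "T-3", "temp_c": 395.5},
    {"asset_id": "T-4", "temp_c": 402.1},
]

score = completeness(readings, ["asset_id", "temp_c"])  # 0.75
same_type = consistent_type(readings, "temp_c")         # True
```

Scores like these could be recorded per data set during onboarding, so that consumers can judge how far to trust an insight derived from it.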
Full Governance Enforcement
Data Lake Onboarding Tool
Deployment
API
Process: Qualification, Fit Gap Analysis, Design & build, Quarantine, OnBoard
• Outcomes
• Data
• Parties
• Compliance
• Contractual
• Security
• Risk
• Design
• Testing
• DevOps
Platform Engineering: Certified Patterns, DevOps
Data Vault
Data Governance – Key Technology Capabilities
Capability | Description
Metadata Catalog | A repository of metadata describing data elements, lineage and relationships of the data within the lake
Business Glossary | Glossary of business terms, definitions and related properties built on top of the technical metadata catalog
Policy Store | Centralized store of access & usage policies and rights by data object
Policy Enforcement | Engine to provide security enforcement as a service, abstracted from consumption method
Collaboration & Governance | Portal providing data stewards, providers, custodians and consumers with collaboration, issue resolution, enhancement request and change management capabilities
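The relationship between the Policy Store and Policy Enforcement capabilities can be sketched in a few lines. This is only an illustration of the idea of enforcement "abstracted from consumption method"; the classes `PolicyStore` and `PolicyEnforcer`, the role names and the data object name are all hypothetical.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class PolicyStore:
    """Central record of which roles may read which data objects."""

    def __init__(self) -> None:
        self._read_roles: dict[str, set[str]] = {}

    def grant_read(self, data_object: str, role: str) -> None:
        self._read_roles.setdefault(data_object, set()).add(role)

    def can_read(self, data_object: str, role: str) -> bool:
        return role in self._read_roles.get(data_object, set())


class PolicyEnforcer:
    """Checks the policy store before running any kind of data access."""

    def __init__(self, store: PolicyStore) -> None:
        self._store = store

    def enforce(self, role: str, data_object: str, fetch: Callable[[], T]) -> T:
        # `fetch` can wrap a SQL query, an API call or a search request:
        # the check is the same regardless of how the data is consumed.
        if not self._store.can_read(data_object, role):
            raise PermissionError(f"{role!r} may not read {data_object!r}")
        return fetch()


store = PolicyStore()
store.grant_read("turbine_sensor_data", "data_scientist")
enforcer = PolicyEnforcer(store)

rows = enforcer.enforce("data_scientist", "turbine_sensor_data",
                        lambda: ["reading-1", "reading-2"])
```

Keeping the decision in one place means a new consumption tool only has to call the enforcer, instead of re-implementing access rules of its own.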
Security Risk for Big Data
✓ More data implies higher risk of exposure
✓ New data types may give rise to new security breach scenarios
✓ Evolving and experimental analysis implies security policies are less likely to be in place
✓ Linkage to other data already under compliance may create scenarios where compliance could be violated
Top Opportunity Areas for Security
Perimeter: Infrastructure
• Communication protocols
• Key management
Protection: Encryption
• Access policy based encryption
• Searching / filtering encrypted data
• Secure outsourcing of computation
Access Control: Privacy
• Secure dissemination
• Secure data collection / aggregation
• Secure collaboration
Visibility: Data Management
• Data integrity/Provenance
• Proof of data storage
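As a simple illustration of the data integrity item, a fingerprint can be recorded when data is ingested and checked again whenever it is read. The sketch below uses SHA-256 from Python's standard library; it is a plain checksum comparison, not a cryptographic proof-of-storage protocol, and the function names and sample payload are hypothetical.

```python
import hashlib
import hmac


def fingerprint(payload: bytes) -> str:
    """SHA-256 digest of the payload as a 64-character hex string."""
    return hashlib.sha256(payload).hexdigest()


def verify(payload: bytes, expected: str) -> bool:
    """Recompute the digest and compare it with the one recorded earlier."""
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(fingerprint(payload), expected)


# Hypothetical payload recorded at ingestion time.
original = b"asset=T-1;egt=612.4"
recorded = fingerprint(original)

unchanged = verify(original, recorded)                # True
tampered = verify(b"asset=T-1;egt=999.9", recorded)   # False
```

Storing the recorded digest with the data's provenance metadata would let an auditor confirm later that the replica in the lake still matches what the source delivered.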
Platform Validation via Information Security Policy Framework (ISPF)
Main Design Characteristics
• Full ISO 27001/2 Certification
• End-to-end Infrastructure Visibility
• Extend Compliance for GE and its Customers
• Unified Monitoring Framework
• Central Controls – Separate Monitors
• Complete Suite – Layered Approach
• ISMS (Information Security Mgmt System)
• Common Control Set
• ISMS Management
• Governance & Ops
One Framework – Multiple Certifications
Thank you
General Electric reserves the right to make changes in specifications and features, or discontinue the product or service described at any time, without notice or obligation. These materials do not constitute a representation, warranty or documentation regarding the product or service featured. Illustrations are provided for informational purposes, and your configuration may differ.
This information does not constitute legal, financial, coding, or regulatory advice in connection with your use of the product or service. Please consult your professional advisors for any such advice.
GE, the GE Monogram, Predix, Predictivity are trademarks of General Electric Company.
©2014 General Electric Company – All rights reserved.