Booz Allen Innovation Center, Washington DC
Kirk Borne
Principal Data Scientist
@KirkDBorne
DATA ANALYTICS ROADMAP FOR THE ENERGY INDUSTRY
JUNE 14, 2017
AGENDA
FROM BIG DATA TO SMART DATA
TAMING BIG DATA WITH DATA SCIENCE & ANALYTICS
THE INTERNET OF THINGS (INTERNET OF CONTEXT)
RISK VERSUS REWARD
THE MATHEMATICAL CORPORATION
Booz Allen Hamilton Internal 1
EVER SINCE WE FIRST EXPLORED OUR WORLD…
http://www.livescience.com/27663-seven-seas.html
…WE HAVE ASKED QUESTIONS ABOUT THE WORLD…
https://atillakingthehun.wordpress.com/2014/08/07/atlantis-not-lost/
…AND WHERE HAS THAT LED US?
WE HAVE COLLECTED EVIDENCE (DATA) TO ANSWER OUR QUESTIONS, WHICH LEADS TO MORE QUESTIONS, WHICH LEADS TO MORE DATA COLLECTION, WHICH LEADS TO MORE QUESTIONS, WHICH LEADS TO BIG DATA!
y ~ 2 * x (linear growth)
y ~ 2 ^ x (exponential growth)
https://www.linkedin.com/pulse/exponential-growth-isnt-cool-combinatorial-tor-bair
y ~ x! ≈ x ^ x → Combinatorial Growth! (all possible interconnections, linkages, and interactions: high variety for discovery!)
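To make the three growth rates concrete, here is a minimal Python sketch that tabulates 2*x, 2^x, and x! for a few values of x. The function name growth_table and the sample values are illustrative assumptions, not part of the original slides. Note that x! soon outgrows 2^x, while staying below x^x for x > 1.

```python
import math

def growth_table(xs):
    """Return rows of (x, linear, exponential, factorial) for each x."""
    return [(x, 2 * x, 2 ** x, math.factorial(x)) for x in xs]

# Hypothetical sample points, chosen only to show how quickly the columns diverge.
for x, linear, exponential, factorial in growth_table([1, 5, 10, 20]):
    print(f"x={x:>2}  2*x={linear:>3}  2^x={exponential:>9,}  x!={factorial:,}")
```

By x = 20 the factorial column has 19 digits, while the linear column is still 40.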
3+1 V’s of Big Data:
Volume = most annoying V
Velocity = most challenging V
Variety = most rich V for discovery
Value = most important V
ALL THOSE COMBINATIONS = SMART DATA!
MAKING SENSE OF THE WORLD WITH SMART DATA:
Semantic, Meaning-filled Data:
• Ontologies (formal)
• Folksonomies (informal)
• Tagging / Annotation
– Automated (Machine Learning)
– Crowdsourced
– “Breadcrumbs” (user trails)
Broad, Enriched Data:
• Linked Data (RDF)
– All of those combinations!
• Graph Databases
• Machine Learning
• Cognitive Analytics
• The 360° view
• Context
The Human Connectome Project: mapping and linking the major pathways in the brain.
http://www.humanconnectomeproject.org/
“ALL THE WORLD IS A GRAPH” – SHAKESPEARE?
(Graphic by Cray, for Cray Graph Engine CGE)
http://www.cray.com/products/analytics/cray-graph-engine
“ALL THE WORLD IS A GRAPH”
EXAMPLES
Connecting the nodes in the graph and labeling the edges (with context and relationship information) can lead to deeper insights than simple transactional databases.
• Customer Journey modeling
• Safety Incident Causal Factor Analysis
• Medical Research Discoveries across disconnected journals, through linked semantic assertions
• Marketing Attribution Analysis
• Fraud networks, Illegal goods trafficking networks, Money-Laundering networks
The connection between black hat entities {1} and {3} never appears explicitly within a transactional database.
(Figure: graph with nodes labeled {1}, {2}, and {3}.)
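The hidden-connection idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the transaction list is hypothetical, and it assumes entity {2} is the intermediary that links {1} and {3}. Each record names only two parties, so the {1}-{3} link appears only once the records are assembled into a graph and searched.

```python
from collections import deque

# Hypothetical transaction records: each row links two entities directly.
transactions = [(1, 2), (2, 3), (4, 5)]

def build_graph(edges):
    """Build an undirected adjacency map from pairwise records."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

graph = build_graph(transactions)
print((1, 3) in transactions)      # False: no single record links 1 and 3
print(shortest_path(graph, 1, 3))  # [1, 2, 3]: the link emerges in the graph
```

A graph database performs this kind of traversal natively; the breadth-first search here only stands in for that capability.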
THE WORLD OF BIG DATA IS ON FIRE
Tame the flames with Data Science and Advanced Analytics, through Machine Learning (= mathematical algorithms that learn from experience)
(Is that why we are meeting at the Houston Heights Fire Station?)
1) Class Discovery: Find the categories of objects (population segments), events, and behaviors in your data. + Learn the rules that constrain the class boundaries (that uniquely distinguish them).
2) Correlation (Predictive and Prescriptive Power) Discovery: Find trends, patterns, and dependencies in data, which reveal new governing principles or behavioral patterns (the “DNA”).
3) Novelty (Surprise!) Discovery: Find new, rare, one-in-a-[million / billion / trillion] objects, events, and behaviors.
4) Association (or Link) Discovery: (Graph and Network Analytics) – Find the unusual (interesting) co-occurring associations / links / connections.
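As one illustration of Novelty Discovery (item 3 above), here is a minimal Python sketch that flags rare values using the modified z-score based on the median absolute deviation. The function name find_outliers and the sensor readings are hypothetical. The 0.6745 constant and the 3.5 cutoff follow a common convention for this score, not anything stated in the slides.

```python
import statistics

def find_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (median absolute deviation based) exceeds threshold."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread around the median, so nothing can be scored
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]

# Hypothetical sensor readings with one surprising value.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 42.0]
print(find_outliers(readings))  # [42.0]
```

The median-based score is used here because a single extreme value barely moves the median, whereas it would inflate a mean and standard deviation.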
THE 5 LEVELS OF ANALYTICS MATURITY
Data Science can be applied at any level, depending on the level of analytics maturity required.
1) Descriptive Analytics
– Hindsight (What happened?)
2) Diagnostic Analytics
– Oversight (real-time / What is happening? Why did it happen?)
3) Predictive Analytics
– Foresight (What will happen?)
4) Prescriptive Analytics
– Insight (How can we optimize what happens?)
(Follow the dots / connections in the graph!)
5) Cognitive Analytics
– Right Sight (the 360° view: what is the right question to ask for this set of data in this context? = Game of Jeopardy)
– Finds the right insight, the right action, the right decision,… right now! = Next Best Action!
– Moves beyond simply providing answers, to generating new questions and hypotheses.
PREDICTIVE VS PRESCRIPTIVE : WHAT’S THE DIFFERENCE?
PREDICTIVE Analytics
Find a function (i.e., the model) f(d,t) that predicts the value of some predicted variable y = f(d,t) at a future time t, given the set of conditions found in the training data {d}.
=> Given {d}, find y.
PRESCRIPTIVE Analytics
Find the conditions {d’} that will produce a prescribed (desired, optimum) value y at a future time t, using the previously learned conditional dependencies among the variables in the predictive function f(d,t).
=> Given y, find {d’}.
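The two directions can be sketched with a deliberately simple model. This minimal Python example assumes a single condition d, drops the time variable t, and assumes a straight-line relationship fitted by ordinary least squares. The function names and the training data are hypothetical. Predictive use evaluates the learned function forward, and prescriptive use inverts it to find the condition that yields a desired outcome.

```python
def fit_line(ds, ys):
    """Ordinary least squares fit of y = a + b*d; returns (a, b)."""
    n = len(ds)
    mean_d = sum(ds) / n
    mean_y = sum(ys) / n
    covariance = sum((d - mean_d) * (y - mean_y) for d, y in zip(ds, ys))
    variance = sum((d - mean_d) ** 2 for d in ds)
    b = covariance / variance
    a = mean_y - b * mean_d
    return a, b

def predict(model, d):
    """Predictive: given a condition d, find y."""
    a, b = model
    return a + b * d

def prescribe(model, y_target):
    """Prescriptive: given a desired y, find the condition d' that produces it."""
    a, b = model
    return (y_target - a) / b

# Hypothetical training data that follows y = 1 + 2d exactly.
model = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(predict(model, 5.0))     # 11.0
print(prescribe(model, 15.0))  # 7.0
```

With many interacting conditions the inversion is usually not closed-form and becomes an optimization or search problem over {d’}.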
Confucius says…
“Study your past to know
your future”
Baseball philosopher Yogi Berra says…
“The future ain’t what it
used to be.”
THE INTERNET OF THINGS
The (Industrial) Internet of Things (IIoT / IoT) consists of an interconnected (Graph!) universe of Dynamic Data-Driven Application Systems (dddas.org) => Combinatorial Explosive Growth of Smart Data!
IoT / IIoT / M2M demands that we deploy intelligence at the point of data collection (= Machine Learning at the edge of the network) => Edge Analytics!
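A minimal sketch of what intelligence at the point of data collection can look like: a small monitor that keeps running statistics with Welford's online algorithm and flags unusual readings without storing the stream. The class name EdgeMonitor, the thresholds, and the temperature stream are illustrative assumptions, not a description of any specific product.

```python
import math

class EdgeMonitor:
    """Tracks a running mean and variance (Welford's algorithm) in constant memory."""

    def __init__(self, z_threshold=3.0, warmup=5):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.z_threshold = z_threshold
        self.warmup = warmup

    def update(self, x):
        """Return True if x is anomalous versus earlier readings, then fold x into the statistics."""
        anomalous = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.z_threshold:
                anomalous = True
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return anomalous

# Hypothetical temperature stream with one spike.
monitor = EdgeMonitor()
stream = [20.0, 20.5, 19.8, 20.2, 20.1, 19.9, 35.0, 20.3]
flags = [monitor.update(x) for x in stream]
print(flags)
```

Only three numbers are kept per sensor, which is the point of the design: the device can decide locally and forward just the flagged events. One simplification to note is that flagged readings are still folded into the statistics, so a real deployment might exclude or down-weight them.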
https://www.nsf.gov/news/news_images.jsp?cntn_id=122028
https://www.linkedin.com/pulse/can-elephant-bigdata-dance-iot-internet-things-tune-lambba
Internet of Everything
SMART THINGS IN THE IOT (THE INTERNET OF CONTEXT):
INTELLIGENCE AT THE EDGE OF THE NETWORK (AT THE POINT OF DATA COLLECTION = EDGE ANALYTICS)
http://www.smarttechforyou.com/2015/03/smart-manufacturing-iot-fourth-industrial-revolution.html
• Smart business ops (context-aware)
• Smart maintenance
• Smart operations & controls
• Smart workforce deployment
• Smart manufacturing & inventory
• Smart supply chain & pricing
• Smart logistics (people, things)
• Smart customer engagement
Embedded IoT-driven Streaming Analytics = a top focus of Machine Learning projects: http://searchdatamanagement.techtarget.com/feature/Embedded-analytics-to-feel-widest-impact-of-machine-learning-projects
From IoT-enabled products to the IoT-enabled organization = multi-use data will dominate future IoT implementations: http://internetofthingsagenda.techtarget.com/feature/Multiuse-data-dominates-Future-of-IoT-implementation
CONTEXT IS KING!
“You can see a lot just by looking.” – Yogi Berra
• Context is “other data” about your data, i.e., Metadata!
• The 3 most important things in your data are: Metadata, Metadata, Metadata!
• Metadata are…
– Data that describes Data
– Other Data that describes Your Data
– Your Data that describes Other Data
• e.g., Connected “Smart” Cars = the car that is braking 3 vehicles ahead of you informs your vehicle to brake now!
• The Smart Enterprise = a predictive / prescriptive maintenance algorithm alerts the corresponding asset, the right skilled technician, the right part, & the right tool to be at the right place at the right time!
• IoT sensors + Open Data provide a lot of context data (metadata!): The Interconnected Graph of data, information, and knowledge!
• Contextual data empowers both Prescriptive and Cognitive Analytics.
HOT APPLICATION (THE KILLER APP) FOR IOT: THE DIGITAL TWIN
READ THIS: “DIGITALIZATION, DIGITAL TWINS AND CYBER-PHYSICAL SYSTEMS…”
http://blogs.dnvgl.com/software/2016/04/digital-twins-structural-engineering/
The digital twin is a virtual image of your asset, maintained throughout the lifecycle and easily accessible at any time, able to replay (simulate) any physical scenario.
“INDUSTRIES TURN TO DIGITAL TWINS AND IOT TO BOOST PROFITS…”
http://searchmanufacturingerp.techtarget.com/tip/Industries-turn-to-digital-twin-technology-IoT-to-boost-profit
Some key players (capitalizing on the IoT):
• NASA
• GE
• DNV-GL
• Siemens
• TWI Ltd
• ANSYS
DATA SCIENCE AND ANALYTICS IMPROVE YOUR ODDS IN THE FUNDAMENTAL BUSINESS GAMBIT:
RISK VERSUS REWARD
http://www.telegraph.co.uk/news/worldnews/europe/russia/10061780/Russian-convicts-beat-Americans-in-cyber-chess-battle.html
SMART DATA ANALYTICS -- DATA-TO-ACTION TRIAGE
GENERAL EXAMPLE -- IOT EVENT MINING IN BIG DATA COLLECTIONS FOR ACTIONABLE INTELLIGENCE:
Listen: Behavior modeling (anomaly & trend detection), ad hoc inquiry for Discovery!
Learn: Identifying, characterizing, & understanding events for data-driven Decisions!
Act: Deciding which events need immediate investigation and/or intervention = Action!
… producing rewarding outcomes for people, products, and processes!
MANY OTHER EXAMPLES:
Customer churn early warning (from 360-view customer data)
Predictive Maintenance alerts (from ubiquitous machine / engine sensors)
Prescriptive efficiency modeling and implementation (from process mining)
Supply chain monitoring (from manufacturing & shipping sensors)
Predictive personnel & asset logistics (trajectory modeling from tracking data)
Cybersecurity alerts (from network logs)
Preventive Fraud alerts (from financial applications)
Health alerts (from electronic health records and national health systems)
Tsunami alerts (from geo sensors everywhere)
Social event alerts or early warnings (from social media)
Risk Mitigation
INFUSING ANALYTICS CAPABILITY INTO YOUR ORGANIZATION
DRIVING COMPETITIVE ADVANTAGE THROUGH DATA-TO-ACTION:
• AI (AUGMENTED INTELLIGENCE = MACHINE-ASSISTED HUMAN, AND VICE VERSA)
• MACHINE INTELLIGENCE
• THE MATHEMATICAL CORPORATION!
Activities
• Enrich
• Integrate and Transform Data
Methods
• Descriptive Statistics
• Filtering
• Aggregation

Activities
• Reveal trends
• Identify Correlations
• Learn Patterns
Methods
• Unsupervised Learning
• Clustering
• Outlier Detection

Activities
• Classify Signals
• Predict Risks
• Forecast Resources
Methods
• Random Forest
• Neural Networks
• Bayesian Analysis
• Collaborative Filtering

Activities
• Optimize Resources
• Simulate Decision Outcomes
Methods
• Genetic Algorithms
• Integer Programming
• Non-Linear Programming
• Discrete Event Simulation
Acquisition, aggregation and enrichment of information from multiple entry points will help create a holistic view that can enhance operations, reduce risk, provide powerful insight, and create value.
… Enables Effective Operations and Decision-Making
• Allows for accurate analysis of trends across the organization against defined KPIs
• Supports strategic C-Suite decision making
• Reveals operational risks and potential bottlenecks in real-time
• Supports critical information infrastructure protection efforts by early detection of vulnerabilities
Products
Reports | Dashboards | Mitigations
360° Data Acquisition …
Business Operations and Performance Data
Logs: Systems, Customers,…
Reports, e-Docs, and Manuals
Open Data
The foundation is Data!
DATA ANALYTICS ROADMAP FOR THE ENERGY INDUSTRY
• Design Patterns for Streaming Data Analytics:
– Detecting POI (Person of Interest, or Pattern of Interest, or any Point of Interest)
– Detecting BOI (Behavior of Interest from any “dynamic actor”)
– Precomputed scenarios and their responses (to speed up “next best action”)
– Design Thinking : DX, UX, CX, EX (Digital / User / Customer / Employee eXperience)
• Edge Analytics (what else is happening now at the point of data collection?)
– Locality in Time
• Near-field Analytics (who / what is local to this person / place / thing?)
– Locality in Geospace
• Related-entity Analytics (what else is similar to this entity / event?)
– Locality in Feature Space
• Agile Analytics: DataOps, Fail-fast, Iterative, MVP, Learning Energy Systems
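Of the three localities listed above, Locality in Feature Space is the easiest to show in code. The following minimal Python sketch ranks entities by Euclidean distance to a target feature vector. The function name nearest_entities, the asset names, and the feature values are all hypothetical. The features are left unscaled for brevity, so the temperature column dominates the distance; a real application would normalize the features first.

```python
import math

def nearest_entities(target, entities, k=2):
    """Return the names of the k entities closest to target in feature space."""
    ranked = sorted(entities, key=lambda name: math.dist(entities[name], target))
    return ranked[:k]

# Hypothetical assets described by (load factor, temperature in C, vibration in mm/s).
assets = {
    "pump_a": (0.80, 60.0, 2.1),
    "pump_b": (0.82, 61.0, 2.0),
    "pump_c": (0.30, 45.0, 0.5),
    "pump_d": (0.79, 75.0, 4.8),
}

# Which known assets behave most like this newly observed one?
print(nearest_entities((0.81, 60.5, 2.2), assets))  # ['pump_a', 'pump_b']
```

Locality in Time and Locality in Geospace follow the same pattern, with the distance function swapped for a time difference or a geographic distance.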
Minimum Viable Product = your POV (Proof of Value)
http://www.agilebuddha.com/agile/demystifying-devops/
DevOps for Data Analytics = DataOps
THANK YOU!
@KirkDBorne
@BoozDataScience
LISTEN
READ, BUILD, and EXPLORE
www.boozallen.com/datascience
Tips for Building a Data Science Capability
The Mathematical Corporation
10 Signs of Data Science Maturity
The Field Guide to Data Science
The Data and Analytics Catalyst
Explore: sailfish.boozallen.com
Booz | Allen | Hamilton
PARTICIPATE
datasciencebowl.com
Visit the Booz Allen exhibit to learn how AI and Machine Intelligence empower The Mathematical Corporation …