SmartData Webinar: Applying Neocortical Research to Streaming Analytics

Post on 11-Jan-2017

APPLYING NEOCORTICAL ALGORITHMS TO STREAMING ANALYTICS

SmartData Webinar September 10, 2015

Subutai Ahmad sahmad@numenta.com

Revenue Forecasting Customer Story

10pm:

Team of 10 analysts

5 am: “Dear CEO, today’s revenue forecast is $63.4M.”

Objectives for next generation:
- Generate predictions every 15 minutes
- Track all product categories and geographies (hundreds of thousands)
- React rapidly to changes

Problems:
- Cumbersome data infrastructure
- Algorithm approach completely unclear

“How Machine Learning Is Done”

Data Prep

Craft Input Features

Training Methodology

Choose Algorithm

Test & Validate

Deploy

Streaming data

- Automated model creation
- Continuous learning
- Temporal inference

- Predictions
- Anomalies
- Actions

The Future of Data Analytics

Solution:

Streaming data infrastructure

New algorithm approach

Numenta History

2005 – 2009
§ Hierarchical Temporal Memory theory
§ First generation algorithms
§ Vision Toolkit

2009 – 2014
§ 2nd generation HTM algorithms
§ Sequence & continuous learning
§ Streaming data applications
§ HTM open source project
§ Grok 1.0 for anomaly detection

2014 – Today
§ Streaming applications
§ Grok for Stocks
§ Research on 3rd generation algorithms
§ Sensorimotor
§ Feedback

[Timeline markers: 2002, 2004, 2005]

Properties Of The Neocortex

[Figure: retina, cochlea, somatic; data stream; motor control]

1) Hierarchy of nearly identical regions - common algorithm
2) Sparse Distributed Representations - common data structure
3) Regions are mostly sequence memory - inference - motor
4) Every region is continually learning - fully automated

“Hierarchical Temporal Memory” (HTM)
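To make the "common data structure" point concrete, here is a minimal sketch of a sparse distributed representation (SDR) as a set of active bit indices. The sizes (2048 bits, 40 active) and the helper names are assumptions chosen for illustration; they are not taken from the slides.

```python
import random

def random_sdr(n_bits=2048, n_active=40, rng=None):
    """Return an SDR as a frozenset of active bit indices (sizes are assumed)."""
    rng = rng or random.Random()
    return frozenset(rng.sample(range(n_bits), n_active))

def overlap(a, b):
    """Number of active bits two SDRs share; used as a similarity measure."""
    return len(a & b)

def add_noise(sdr, n_bits, n_flip, rng):
    """Turn off n_flip active bits and turn on n_flip previously inactive bits."""
    dropped = set(rng.sample(sorted(sdr), n_flip))
    inactive = [i for i in range(n_bits) if i not in sdr]
    added = set(rng.sample(inactive, n_flip))
    return frozenset((sdr - dropped) | added)

rng = random.Random(42)
a = random_sdr(rng=rng)
b = random_sdr(rng=rng)
noisy_a = add_noise(a, 2048, 10, rng)

print("a vs noisy copy of a:", overlap(a, noisy_a))
print("a vs unrelated b:    ", overlap(a, b))
```

A noisy copy still shares most of its active bits with the original, while two unrelated random SDRs share almost none. That is why overlap works as a similarity test.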

HTM Learning Algorithm

Models a small slice of cortex
1) High capacity memory-based system
2) Models complex high-order temporal sequences
3) Makes predictions and detects anomalies
4) Continuously learning
5) No sensitive parameters
6) Runs in real time on a laptop

Basic building block of neocortex and Machine Intelligence
Whitepaper and full source code available: github.com/numenta
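The phrase "high-order temporal sequences" can be illustrated with a toy example. The sketch below is not the HTM algorithm; it is a simple context-table predictor, swapped in only to show why the next element of A-B-C-? and X-B-C-? can differ, and how a model can learn and flag surprises in the same pass. The class and method names are hypothetical.

```python
from collections import Counter, defaultdict, deque

class HighOrderPredictor:
    """Toy online sequence model keyed on the last `order` symbols (not HTM)."""

    def __init__(self, order=3):
        self.counts = defaultdict(Counter)   # context tuple -> next-symbol counts
        self.context = deque(maxlen=order)   # most recent symbols

    def reset(self):
        """Forget the current context (start of a new sequence)."""
        self.context.clear()

    def predict(self):
        """Most frequent next symbol for the current context, or None if unseen."""
        seen = self.counts.get(tuple(self.context))
        return seen.most_common(1)[0][0] if seen else None

    def observe(self, symbol):
        """Learn from one symbol; return True if it was not seen in this context before."""
        key = tuple(self.context)
        seen = self.counts.get(key)
        surprising = not seen or symbol not in seen
        self.counts[key][symbol] += 1
        self.context.append(symbol)
        return surprising

model = HighOrderPredictor(order=3)
for _ in range(3):
    for sequence in ("ABCD", "XBCY"):
        model.reset()
        for symbol in sequence:
            model.observe(symbol)

for prefix in ("ABC", "XBC"):
    model.reset()
    for symbol in prefix:
        model.observe(symbol)
    print(prefix, "->", model.predict())
```

Both prefixes end in B-C, so a model that only looked at the previous symbol could not tell them apart. Keeping a longer context is what makes the prediction high-order.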

HTM Engine For Streaming Analytics

[Figure: per-metric pipeline with components Metric(s), Encoder, SDR, HTM, Prediction, Point anomaly, Time average, Historical comparison, Anomaly score; one pipeline per metric through Metric N; output: System Anomaly Scores & Predictions]
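As a rough illustration of the pipeline stages named on this slide, the sketch below encodes a scalar metric as an SDR, computes a point anomaly from a predicted SDR, and smooths it with a time average. The HTM stage itself is not implemented: the predicted SDR is supplied by the caller. All function names, ranges, and bit counts are assumptions for illustration, not the engine's actual code.

```python
from collections import deque

def encode_scalar(value, min_val=0.0, max_val=100.0, n_bits=400, n_active=21):
    """Encode a scalar as a contiguous run of active bits, so nearby values overlap."""
    value = min(max(value, min_val), max_val)        # clip to the assumed range
    n_buckets = n_bits - n_active + 1
    start = int(round((value - min_val) / (max_val - min_val) * (n_buckets - 1)))
    return frozenset(range(start, start + n_active))

def point_anomaly(predicted, active):
    """Fraction of active bits that were not predicted (0.0 expected, 1.0 surprising)."""
    if not active:
        return 0.0
    return 1.0 - len(predicted & active) / len(active)

class TimeAverage:
    """Moving average over the most recent point anomaly scores."""

    def __init__(self, window=10):
        self.scores = deque(maxlen=window)

    def update(self, score):
        self.scores.append(score)
        return sum(self.scores) / len(self.scores)

# Stand-in for the HTM stage: pretend the model predicted a value of 50.
predicted = encode_scalar(50.0)
average = TimeAverage(window=3)
for value in (50.0, 51.0, 90.0):
    score = point_anomaly(predicted, encode_scalar(value))
    print(value, round(score, 3), round(average.update(score), 3))
```

A value near the prediction yields a low point anomaly, while a distant value yields 1.0. The time average damps single spikes, and a further step (not shown) would compare it against historical behavior.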

Applications Of The HTM Engine

- GROK server anomalies
- Rogue human behavior
- Geospatial tracking
- Stock & market anomalies
- Social media anomalies (Twitter)

Grok: Anomaly Detection For Amazon Web Services

§ Unique value of HTM algorithms
§ Automated model creation: configure hundreds of models in minutes
§ Continuously learning: automatically adapts to changes
§ Detects sophisticated temporal anomalies

[Figure panels: Continuous learning; Unpredictable data; Temporal anomalies]

HTM for Stocks: Detecting Unusual Market Behavior

Companies sorted by unusual behavior

Stock price Stock volume Twitter chatter

Tweets reveal cause

Anomaly Detection in Geospatial Tracking Data

[Figure: GPS + Velocity, Encoder, SDRs, HTM, Prediction, Anomaly Detection, Classification]

Trick: convert GPS coordinates into an SDR.
After input is encoded as an SDR, the learning algorithm is agnostic.
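One way such a conversion could work is sketched below: snap the position to a grid, take the surrounding cells, pick a deterministic subset of them by hash, and map each chosen cell to a bit. This is a simplified illustration under assumed parameters (grid scale, radius, bit counts). It ignores velocity and should not be read as the encoder used in the talk.

```python
import hashlib

def _hash(*parts):
    """Deterministic 64-bit integer hash of the given parts."""
    digest = hashlib.sha256(repr(parts).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def encode_position(lat, lon, scale=0.001, radius=3, n_bits=1024, n_active=25):
    """Encode a GPS position as a set of active bit indices.

    Nearby positions share most neighborhood cells and therefore most
    active bits. Hash collisions can leave slightly fewer than n_active bits.
    """
    cx = int(round(lon / scale))                     # grid cell of the position
    cy = int(round(lat / scale))
    cells = [(x, y)
             for x in range(cx - radius, cx + radius + 1)
             for y in range(cy - radius, cy + radius + 1)]
    # Keep the n_active cells with the highest hash-derived rank.
    winners = sorted(cells, key=lambda cell: _hash("order", cell))[-n_active:]
    return frozenset(_hash("bit", cell) % n_bits for cell in winners)

here = encode_position(37.4850, -122.2280)
near = encode_position(37.4860, -122.2280)   # one grid cell away
far = encode_position(38.4850, -122.2280)    # about a degree away

print("overlap with nearby point: ", len(here & near))
print("overlap with distant point:", len(here & far))
```

Once positions are expressed this way, the downstream learning code only sees SDRs and needs no knowledge that the input was geographic.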

Learning Normal Behavior

Geospatial Anomalies

- Deviation in path
- Change in direction
- Multiple paths are OK
- Unusual change in speed

These HTM Applications Use Exact Same Code Base

- HTM learning algorithms
- Identical learning parameters
- Wide applicability across sensor types

Applications:
- GROK server anomalies
- Rogue human behavior
- Geospatial tracking
- Stock & market anomalies
- Social media anomalies (Twitter)

Benchmarking Streaming Anomaly Detection

Traditional benchmarks don’t apply:
– Don’t incorporate time, e.g. favor early detection over later detections
– Usually batch format
– Very few benchmarks with real world data

Numenta Anomaly Benchmark (NAB)
– Scoring methodology favors early detection
– Incorporates continuous learning (learning a new normal baseline)
– Labeled real world data streams
– Different “application profiles”

HTM tested against 3 algorithms
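To show what "favors early detection" can mean in practice, here is a sketch of a time-weighted scoring function: a detection is scored by where it falls relative to a labeled anomaly window, with earlier detections earning more. The scaled sigmoid shape and its constants are assumptions for illustration and should not be taken as NAB's published definition.

```python
import math

def relative_position(t, window_start, window_end):
    """Map a detection time to -1.0 at the window start and 0.0 at the window end."""
    return (t - window_end) / float(window_end - window_start)

def detection_weight(position):
    """Scaled sigmoid: close to +1 early in the window, 0 at its end, negative after."""
    if position > 3.0:
        return -1.0                  # far past the window: full penalty
    return 2.0 / (1.0 + math.exp(5.0 * position)) - 1.0

window = (100, 200)                  # hypothetical labeled anomaly window
for t in (100, 180, 200, 260):
    weight = detection_weight(relative_position(t, *window))
    print(t, round(weight, 3))
```

Under this weighting, two detectors that both find the anomaly are not scored equally: the one that fires near the start of the window scores close to 1, while one that fires after the window is penalized.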

[Figure labels: HTM detects anomaly earlier; Other algorithms]

Numenta Community & Partnerships

- NuPIC
  - Open source community at numenta.org
  - > 3,000 GitHub followers, > 160 contributors
- Cortical.io
  - Natural Language Processing
- IBM
  - Core HTM research
  - Novel hardware architectures for HTMs
- Avik Partners
  - HTM Grok anomaly detection and analytics for IT
  - grokstream.com

Future of Data is Streaming Data

- High velocity sensory streams with rapidly changing statistics
- Massive number of models
- Problem: existing batch algorithms cannot scale

Cortical Algorithms Show The Way
- Proof that systems can:
  - Automatically create models
  - Continuously learn
  - Model sophisticated temporal streams
- HTM learning algorithms implement cortical principles
- Can demonstrate working applications today

Thank you!

Contact info: sahmad@numenta.com