Date posted: 23-Jan-2018
Category: Technology
Upload: dataworks-summithadoop-summit
Big Data Application Architectures - IoT
Nishant Thacker
Technical Product Manager – Big Data
Microsoft
@nishantthacker
“Information is the oil of the 21st century,
and analytics is the combustion engine.”
- Peter Sondergaard - Gartner
Today: More “Connected Things” Than Toothbrushes In The World…
Category            2013    2014    2015    2020
Automotive            96     190     372   3,511
Consumer           1,842   2,245   2,875  13,173
Generic Business     395     479     624   5,159
Vertical Business    699     837   1,009   3,164
Grand Total        3,032   3,750   4,881  25,007
Get To Know Your Things!
Device | Supplier | Processor | Memory | IOs | Network | OS | Price*
ESP8266 modules | Espressif | 1 x 160 MHz | 128 kB RAM, 1 MB flash | 12 GPIO, 1 ADC, I2C | WiFi 2.4 GHz | n/a | $2.5
Photon | Particle.io | 1 x 120 MHz | 128 kB RAM, 1 MB flash | 18 GPIO, 2 SPI, I2S, I2C, CAN, USB, 9 PWM, ADC, DAC | WiFi 2.4 GHz | n/a | $19
Electron | Particle.io | 1 x 120 MHz | 128 kB RAM, 1 MB flash | 28 GPIO | 3G UMTS | n/a | $39
WiLink 8 family | Texas Instruments | n/a | n/a | n/a | WiFi 2.4/5 GHz, Bluetooth 4.1 LE | n/a | $10-25 (industrial grade)
Arduino Leonardo | Arduino LLC, Arduino Sarl | 1 x 16 MHz | 2.5 kB RAM, 32 kB flash | 20 GPIO, 7 PWM, 10 ADC, USB | - | n/a | $10
Raspberry Pi Zero | Raspberry Pi Foundation | 1 x 1 GHz | 512 MB RAM, micro-SD | 10 GPIO, Mini HDMI, USB | - | Linux | $5
Raspberry Pi 2 | Raspberry Pi Foundation | 4 x 900 MHz, GPU | 1 GB RAM, micro-SD | 40 GPIO, 1 PWM, 1 ADC, HDMI, 4 USB | Ethernet | Windows 10, Linux, RiscOS | $35
Beaglebone Black | Beagleboard.org | 1 x 1 GHz | 512 MB RAM, 4 GB flash | 69 GPIO, 2 CAN, 10 ADC, 8 PWM, HDMI, USB | Ethernet | Linux | $55
Drive PX 2 (H2/CY16) | NVIDIA | 2 x CPU, 2 x GPU, 8 Tflops | tbd | 12 cameras, LIDAR, RADAR, Ultrasonic, … | tbd | tbd | $1000+ ?
IoT Reference Architecture
[Diagram: reference architecture in three tiers.
Tiers: Device Connectivity; Data Processing, Analytics and Management; Presentation & Business Connectivity.
Devices: low power devices, existing IoT devices, IP capable devices and personal mobile devices, with IoT Clients.
Components: Gateway, Provisioning API, Identity and Registry Stores, Device State Store, Stream Processors, Analytics & Machine Learning, Data Lake, App Backend, Solution UX, Business Integration Connectors and Gateway(s), Business systems.
Legend: Data Path; Optional solution component; IoT solution component.]
Reference Architecture & Azure Services
[Diagram: the IoT reference architecture with the same tiers, devices and components; the identity and registry stores are labeled Device Registry.
Legend: Data Path; Optional solution component; Azure IoT solution component.]
Device Connectivity Options
[Diagram: devices, some running an IoT Client, connect to the cloud over several paths. Labels shown:
- Field Gateway: CoAP, AllJoyn, OPC
- Custom Cloud Gateway (Cloud Service, VM): VPN/ExpressRoute; OPC, HTTP, CoAP; Custom Protocols
- IoT Hub: AMQP, MQTT, HTTPS
Legend: Data Path; Optional solution component; Azure IoT solution component.]
Device Stores
[Diagram: IoT reference architecture. Gateway options named: Kafka, IoT Hub, Event Hubs. Stores shown: Device Identity Store, Device Registry Store, Device State Store.]
Device Identity, Registry and State Stores

Identity Store
Authority for all registered devices
Stores identity information and authentication secrets

Registry Store
Index in addition to the identity store
Contains discovery and reference data related to devices
Can define a schema model or use a vertical industry standard schema for metadata
Can contain structured metadata and links to externally stored operational data

Device State Store
Contains operational data related to the devices:
- “Last known values” for each device
- Aggregated or computed values
- Stream of device data events
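
The split between the three stores can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions: the class names, field names and the `max_` aggregate are hypothetical and are not taken from the slides or from any Azure API.

```python
import hmac
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class IdentityRecord:
    """Identity store entry: who the device is and how it authenticates."""
    device_id: str
    secret: str          # a real store would keep a protected key, not plain text
    enabled: bool = True


@dataclass
class RegistryRecord:
    """Registry store entry: discovery and reference metadata."""
    device_id: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    operational_data_links: List[str] = field(default_factory=list)


@dataclass
class DeviceState:
    """State store entry: operational data for one device."""
    last_known: Dict[str, float] = field(default_factory=dict)
    aggregates: Dict[str, float] = field(default_factory=dict)
    events: List[Dict[str, float]] = field(default_factory=list)


class DeviceStores:
    """Hypothetical in-memory stand-in for the three stores."""

    def __init__(self) -> None:
        self.identity: Dict[str, IdentityRecord] = {}
        self.registry: Dict[str, RegistryRecord] = {}
        self.state: Dict[str, DeviceState] = {}

    def authenticate(self, device_id: str, secret: str) -> bool:
        # The identity store is the authority: unknown or disabled devices fail.
        record = self.identity.get(device_id)
        if record is None or not record.enabled:
            return False
        return hmac.compare_digest(record.secret.encode(), secret.encode())

    def record_event(self, device_id: str, reading: Dict[str, float]) -> None:
        if device_id not in self.identity:
            raise KeyError(f"unregistered device: {device_id}")
        state = self.state.setdefault(device_id, DeviceState())
        state.events.append(dict(reading))   # stream of device data events
        state.last_known.update(reading)     # "last known values"
        for name, value in reading.items():  # one computed value: running maximum
            key = f"max_{name}"
            state.aggregates[key] = max(value, state.aggregates.get(key, value))
```

Keeping secrets in the identity store and descriptive metadata in the registry means a metadata query never has to touch authentication material, which is one reading of why the slide lists them as separate stores.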
Device Provisioning
The Provisioning API is the common external interface for changes to the device identity and device registry stores.
Workflow for processing individual and bulk requests:
- Registering new devices
- Updating or removing existing devices
- Activation or access control
May also include interactions with external systems:
- Billing systems
- Business support systems
- Connectivity management systems
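
As a rough illustration of the workflow above, the sketch below funnels every change to the identity and registry stores through one class. All names here (`ProvisioningAPI`, `register_bulk`, the `on_change` hook) are hypothetical; the hook merely stands in for an external system such as billing, and nothing here reflects an actual Azure interface.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class ProvisioningAPI:
    """Single entry point for changes to the identity and registry stores."""

    def __init__(self, on_change: Callable[[str, str], None] = lambda action, device_id: None) -> None:
        self.identity: Dict[str, dict] = {}
        self.registry: Dict[str, dict] = {}
        self._on_change = on_change  # hypothetical hook for external systems

    def register(self, device_id: str, secret: str, metadata: dict) -> None:
        if device_id in self.identity:
            raise ValueError(f"device {device_id!r} is already registered")
        # New devices start inactive until access is explicitly granted.
        self.identity[device_id] = {"secret": secret, "active": False}
        self.registry[device_id] = dict(metadata)
        self._on_change("registered", device_id)

    def register_bulk(self, requests: Iterable[Tuple[str, str, dict]]) -> List[str]:
        """Process many registrations and return the ids that were rejected."""
        rejected = []
        for device_id, secret, metadata in requests:
            try:
                self.register(device_id, secret, metadata)
            except ValueError:
                rejected.append(device_id)
        return rejected

    def update(self, device_id: str, metadata: dict) -> None:
        self.registry[device_id].update(metadata)  # KeyError for unknown devices
        self._on_change("updated", device_id)

    def remove(self, device_id: str) -> None:
        del self.identity[device_id]               # KeyError for unknown devices
        del self.registry[device_id]
        self._on_change("removed", device_id)

    def set_active(self, device_id: str, active: bool) -> None:
        self.identity[device_id]["active"] = active
        self._on_change("activated" if active else "deactivated", device_id)
```

Passing `on_change=lambda action, device_id: audit_log.append((action, device_id))` (with `audit_log` a plain list) is enough to observe every change in order, which is the property an external billing or support system would rely on.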
Stream Processors
[Diagram: IoT reference architecture, shown again for this section, with the gateway labeled Cloud Gateway.]
Stream Processing: Data Flow
After ingress through the IoT Hub, the flow of data through the system is facilitated by data pumps and analytics tasks.
Data flow can be driven by:
• Apache Storm on Azure HDInsight
• Apache Spark on Azure HDInsight
• Azure Stream Analytics
• Custom Event Processors
Each can perform tasks in flight:
• Data aggregation
• Data enrichment
• Complex event processing
… and can output data to:
• Azure Data Lake
• Azure Blobs/Tables
• HDInsight / HBase
• Azure SQL DB
• Time Series Databases
• Event Hub
• Service Bus Queues
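
The three in-flight tasks can be illustrated without any particular engine. The following is a minimal plain-Python sketch, not Storm, Spark or Stream Analytics code: the event fields (`device_id`, `ts`, `value`), the function names and the "sustained high" rule are all assumptions made for the example, and `aggregate` works on a finite batch rather than an unbounded stream.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List

Event = Dict[str, object]


def enrich(events: Iterable[Event], reference: Dict[str, str]) -> Iterator[Event]:
    """Data enrichment: attach reference data (here, a site name) to each event."""
    for event in events:
        yield {**event, "site": reference.get(event["device_id"], "unknown")}


def aggregate(events: Iterable[Event], window_s: int) -> List[Event]:
    """Data aggregation: mean value per device per tumbling window of window_s seconds."""
    sums = defaultdict(lambda: [0.0, 0])
    for event in events:
        window_start = event["ts"] // window_s * window_s
        bucket = sums[(event["device_id"], window_start)]
        bucket[0] += event["value"]
        bucket[1] += 1
    return [
        {"device_id": device_id, "window_start": start, "mean": total / count}
        for (device_id, start), (total, count) in sorted(sums.items())
    ]


def detect_sustained_high(events: Iterable[Event], threshold: float, count: int) -> Iterator[Event]:
    """Complex event processing: alert once `count` consecutive readings
    from the same device exceed `threshold`."""
    streak = defaultdict(int)
    for event in events:
        device_id = event["device_id"]
        streak[device_id] = streak[device_id] + 1 if event["value"] > threshold else 0
        if streak[device_id] == count:
            yield {"device_id": device_id, "ts": event["ts"], "alert": "sustained_high"}
```

In use, the enriched events would be materialized once, for example `enriched = list(enrich(events, reference))`, and then handed to both `aggregate` and `detect_sustained_high`; their results correspond to what a real processor would write to a store or a queue from the output list above.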
Stream Processor Examples
[Diagram: example stream processors behind the Cloud Gateway.
Processors shown: Device Metadata Processor (beside the Device Registry Store), Device State Processor (beside the Device State Store), Raw Telemetry Processor, Notification Processor, Rules Processor, Stream Transformation Processor, Secondary Stream Processor.
Other elements: Queue, Event Hub, Data Lake, App Backend, and the device tier (low power devices, existing IoT devices, IP capable devices, IoT Clients, Gateway).]
High-Scale Compute Models
Scale-appropriate compute models
- Actor Frameworks / Service Fabric Reliable Actors: distributed compute fabric hosting device actors.
- Service Fabric Reliable Collections: highly available with replicated and local state management.
- Azure Batch: job scheduling and compute management for highly parallelizable compute workloads.
Simple programming logic in vastly scalable compute nodes
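
The "device actor" idea can be shown in miniature: one actor per device, each with private state and a mailbox that is drained one message at a time. The sketch below uses only Python's `asyncio` in a single process; it is not the Service Fabric Reliable Actors API, and the `DeviceActor` and `run_demo` names are hypothetical.

```python
import asyncio
from typing import Dict, Iterable, Optional, Tuple


class DeviceActor:
    """One actor per device: private state, messages handled sequentially."""

    def __init__(self, device_id: str) -> None:
        self.device_id = device_id
        self.count = 0
        self.total = 0.0
        self._mailbox: asyncio.Queue = asyncio.Queue()
        self._task: Optional[asyncio.Task] = None

    def start(self) -> None:
        # Must be called from inside a running event loop.
        self._task = asyncio.create_task(self._run())

    async def _run(self) -> None:
        while True:
            value = await self._mailbox.get()
            if value is None:        # sentinel: stop the actor
                break
            self.count += 1          # simple per-device logic, no locks needed
            self.total += value

    async def tell(self, value: float) -> None:
        await self._mailbox.put(value)

    async def stop(self) -> None:
        await self._mailbox.put(None)
        await self._task

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


async def run_demo(readings: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Route each reading to its device's actor, creating actors on demand."""
    actors: Dict[str, DeviceActor] = {}
    for device_id, value in readings:
        actor = actors.get(device_id)
        if actor is None:
            actor = actors[device_id] = DeviceActor(device_id)
            actor.start()
        await actor.tell(value)
    for actor in actors.values():
        await actor.stop()
    return {device_id: actor.mean for device_id, actor in actors.items()}
```

Calling `asyncio.run(run_demo([("a", 1.0), ("b", 10.0), ("a", 3.0)]))` returns the mean reading per device. Because each actor touches only its own state, the per-device logic stays simple, and a distributed actor framework can place the actors on different nodes.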
Data Analytics
[Diagram: IoT reference architecture, shown again for this section.]
Data Analytics
[Diagram: analytics pipeline. Labels shown:
- Ingestion Gateway
- Stream Processing (ASA, Storm or Spark); Interceptor (Rules); Fetching & Updating Reference Data
- NRT Events; Batch Events / Logs; Transactional Data
- Azure Data Lake Store; Azure Data Lake Analytics; Spark, Hive/Pig, U-SQL
- SQL DB; Azure SQL DW; Federated Query
- R, Azure ML and/or Spark; ML Models; Training and Scoring; Real Time Scoring
- Reports and Dashboards; Alerts]
Data Analytics
Real-Time Analysis: Aggregation/Reduction, Temporal Queries, State Correlation, Threshold Detection, Alerting
Data-At-Rest Analysis: Time-Series, Map/Reduce, Correlation
Machine Learning: Pattern Detection, Behavior Prediction, Plausibility Analysis, Anomaly and Fraud Detection
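
Anomaly detection, the last item above, can be as simple as a statistical plausibility check. The sketch below is one minimal approach under stated assumptions, not the method used in the talk: it flags a reading that lies more than `k` standard deviations from the mean of the previous `window` readings. The function name and default parameters are illustrative.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Iterable, List


def rolling_anomalies(values: Iterable[float], window: int = 5, k: float = 3.0) -> List[int]:
    """Return the indices of readings that deviate from the mean of the
    preceding `window` readings by more than `k` standard deviations."""
    history = deque(maxlen=window)
    flagged = []
    for index, value in enumerate(values):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            # A flat history (sigma == 0) gives no scale to judge against, so skip it.
            if sigma > 0 and abs(value - mu) > k * sigma:
                flagged.append(index)
        history.append(value)  # note: flagged readings still enter the history
    return flagged
```

For the series `[10, 11, 10, 9, 10, 10, 50, 10, 11]` with the defaults, only index 6 (the reading of 50) is flagged. A trained model would replace this rule for pattern detection or behavior prediction, but the shape stays the same: score each event against recent history and emit an alert.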
Azure Data Lake
[Diagram: Azure Data Lake. Labels shown:
- Analytics: Analytics Service, HDInsight (managed Hadoop clusters), U-SQL, YARN
- Store: WebHDFS
- Surrounding services: Power BI, HDInsight, Stream Analytics, Data Factory, Machine Learning]
Cortana Intelligence Suite
[Diagram: services grouped along a Data, Intelligence, Action flow.
Data Sources: Apps; Sensors and devices
Information Management: Event Hubs; Data Catalog; Data Factory
Big Data Stores: SQL Data Warehouse; Data Lake Store
Machine Learning and Analytics: HDInsight (Hadoop and Spark); Stream Analytics; Data Lake Analytics; Machine Learning
Intelligence: Cortana; Bot Framework; Cognitive Services
Dashboards & Visualizations: Power BI
Action: People; Automated Systems; Apps (Web, Mobile, Bots)]
Presentation and Business Connectivity
[Diagram: IoT reference architecture, shown again for this section.]
Reference architecture with component services
[Diagram: the IoT reference architecture with its three tiers (Device Connectivity; Data Processing, Analytics and Management; Presentation & Business Connectivity) and the same devices and components as before.
Legend: Data Path; Optional solution component; Azure IoT solution component.]
Reference Architecture Guiding Principles
Heterogeneity: Accommodates a vast variety of scenarios, environments, devices, and processing patterns
Security: Considers security and privacy measures across all areas
Hyper-scale: Supports millions of connected devices
Flexibility: Allows for composability and extensibility to enable the usage of various first-party or third-party technologies