TechArch Day 2018
Architecting Fast Data Applications
Gerard Maas
TechArch Day 2018 - Helsinki, October 3, 2018
Gerard Maas, Senior SW Engineer, Lightbend, Inc.
Gerard Maas, Señor SW Engineer
@maasg
https://github.com/maasg
https://www.linkedin.com/in/gerardmaas/
https://stackoverflow.com/users/764040/maasg
Why Streaming?
Fast Data Architectures
((( Quick Demo )))
The Future of Fast Data
Why Streaming?
Fast Data Use Cases
Real-time Personalization: Real-time marketing based on behavior, location, inventory levels, product promotions, etc.
Real-time Financial Processes: Drive better business outcomes through real-time risk, fraud detection, compliance, audit, governance, etc.
Predictive Analytics: Apply ML models to large volumes of device data to pre-empt failures / outages
IoT: Real-time consumer and industrial device and supply chain management at scale
https://www.lightbend.com/customers
Predictive Analytics
• ML models applied to device telemetry to detect anomalies
• Preemptive maintenance prevents potential failures that would impact users
Predictive Analytics - Core Idea
[Diagram: Telemetry Records -> Anomaly Detection Model -> Probable Anomalies -> Anomaly Handler -> Corrective Actions]
• Ingest telemetry from edge devices.
• Train models to look for anomalies… and score incoming telemetry.
• Handle anomaly: move activity off component, schedule maintenance window to replace it.
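The core idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the z-score model, the record fields (`device_id`, `value`), and the threshold are hypothetical stand-ins, not the model or schema used in the talk.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class TelemetryRecord:
    device_id: str   # hypothetical field names
    value: float


class ZScoreModel:
    """Toy anomaly model: flags readings far from the training mean."""

    def __init__(self, training_values, threshold=3.0):
        self.mean = mean(training_values)
        self.std = stdev(training_values)
        self.threshold = threshold

    def score(self, record):
        # Distance from the training mean, in standard deviations.
        return abs(record.value - self.mean) / self.std

    def is_probable_anomaly(self, record):
        return self.score(record) > self.threshold


def handle_anomaly(record):
    # Stand-in for the anomaly handler: returns the corrective actions.
    return [
        f"move activity off {record.device_id}",
        f"schedule maintenance window for {record.device_id}",
    ]


def run_pipeline(model, records):
    # Score each incoming record and collect actions for probable anomalies.
    actions = []
    for record in records:
        if model.is_probable_anomaly(record):
            actions.extend(handle_anomaly(record))
    return actions


model = ZScoreModel([20.0, 21.0, 19.5, 20.5, 20.2, 19.8])
incoming = [TelemetryRecord("dev-1", 20.3), TelemetryRecord("dev-2", 35.0)]
actions = run_pipeline(model, incoming)
```

Here only `dev-2` is far enough from the training data to produce corrective actions; `dev-1` passes through untouched.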
Example Architecture
[Diagram. Components: Device Session Microservices; Kafka; Storage; Low Latency Microservices (Akka Streams, Kafka Streams, ...); Spark (Streaming, Batch Processing); Data Center. Flow labels: Ingestion, Telemetry, Model Training, Model Storage, Model Serving, Scores, Corrective Action.]
Session management, RESTful microservices
Model Scoring
Model Training
Ingest Device Telemetry
Periodically Train Anomaly Detection Model
Updated model parameters are written back to Kafka
Updated model parameters are also written to secondary storage for resilience
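The training side of this flow might look roughly like the sketch below. It assumes in-memory stand-ins for both destinations: a Python list plays the Kafka topic and a dict plays secondary storage. The function names, the JSON payload layout, and the mean/std "model" are all hypothetical.

```python
import json
from statistics import mean, stdev

model_topic = []   # stand-in for a Kafka topic (append-only log)
model_store = {}   # stand-in for secondary storage, keyed by version


def train(telemetry_values):
    # Toy "training": summarize normal behaviour as mean and std deviation.
    return {"mean": mean(telemetry_values), "std": stdev(telemetry_values)}


def publish_model(version, params):
    payload = json.dumps({"version": version, "params": params})
    model_topic.append(payload)      # picked up by the low-latency scorers
    model_store[version] = payload   # second copy, kept for resilience


def latest_model_from_store():
    # Recovery path: read back the newest stored version.
    return json.loads(model_store[max(model_store)])


# Two periodic training runs, each publishing to both destinations.
publish_model(1, train([10.0, 12.0, 11.0]))
publish_model(2, train([10.0, 14.0, 12.0]))
recovered = latest_model_from_store()
```

Writing the same payload to both places means a scorer that missed the topic update can still load the latest parameters from storage.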
Ingest model parameters in the low-latency microservice for serving
Ingest telemetry data to score it, looking for anomalies.
Write detected anomalies back to Kafka in a new topic
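The serving steps (ingest model parameters, score telemetry, write anomalies to a new topic) can be sketched as a single loop over interleaved events. This is a minimal sketch, not the Akka Streams or Kafka Streams code from the talk: the topic names `model-params` and `telemetry`, the payload fields, and the threshold are assumptions made for illustration.

```python
def serve(events, threshold=3.0):
    """Consume interleaved model updates and telemetry; return anomalies.

    Each event is a (topic, payload) tuple; the topic names are hypothetical.
    """
    model = None
    anomalies = []   # stand-in for the new anomalies topic
    for topic, payload in events:
        if topic == "model-params":
            model = payload   # swap in the newest parameters
        elif topic == "telemetry" and model is not None:
            score = abs(payload["value"] - model["mean"]) / model["std"]
            if score > threshold:
                anomalies.append({"device": payload["device"], "score": score})
    return anomalies


events = [
    ("telemetry", {"device": "dev-1", "value": 50.0}),   # no model yet: skipped
    ("model-params", {"mean": 20.0, "std": 2.0}),
    ("telemetry", {"device": "dev-1", "value": 21.0}),   # score 0.5
    ("telemetry", {"device": "dev-2", "value": 30.0}),   # score 5.0
    ("model-params", {"mean": 30.0, "std": 2.0}),        # retrained model
    ("telemetry", {"device": "dev-2", "value": 30.0}),   # score 0.0
]
anomalies = serve(events)
```

The same reading from `dev-2` is an anomaly under the first model and normal under the retrained one, which is the point of updating the served model while the stream keeps flowing.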
Read anomaly information into microservices that manage the devices remotely
Take corrective action, e.g., download patches, disable, reset, ...
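The last step could be sketched as a small dispatch table. The anomaly kinds and their mapping to actions below are hypothetical; only the action names (download patches, disable, reset) come from the slide.

```python
# Hypothetical mapping from anomaly kind to the actions named on the slide.
ACTIONS = {
    "outdated-firmware": "download patches",
    "compromised": "disable",
    "unresponsive": "reset",
}


def corrective_action(anomaly):
    # Unknown kinds fall back to a human operator rather than a guess.
    return ACTIONS.get(anomaly["kind"], "escalate to operator")


def manage_devices(anomaly_topic):
    # Stand-in for the device-management microservice reading from Kafka.
    return [(a["device"], corrective_action(a)) for a in anomaly_topic]


commands = manage_devices([
    {"device": "dev-2", "kind": "unresponsive"},
    {"device": "dev-7", "kind": "overheating"},
])
```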
Fast Data Architectures
Requirements?
• Latency. How low?
• Throughput. How high?
• Which kind of data processing?
• How do you want to build, deploy, and manage those services?
[Diagram: layered architecture, built up over successive slides]
Substrate: Kubernetes, Mesos, YARN (Cloud | On-prem)
Messaging Backbone: Kafka
Processing Engine:
  Stream Proc: Spark, Flink
  Microservices: Akka Streams, Kafka Streams
  Batch: Spark
Microservices: Reactive Platform, Go, node.js
Storage: HDFS, S3, ...; SQL, NoSQL
Monitoring & Management
Files, Sockets, REST
((( Quick Demo )))
Call-Record-Aggregator
Components: datamodel, akka-cdr-ingestor, akka-java-aggr-out, spark-aggregation
[Diagram: CDRs -> HTTPIngress -> Aggregator -> ConsoleOut; label: Aggregator records]
The Future of Fast Data
[Diagram labels: Reactive Systems, Big Data, Microservices, Fast Data Applications]
The Evolution of Current Trends
A growing number of workloads are moving from “data at rest” to “data in motion”
Streaming data pipelines are being served by Microservices
Microservices, Stream Processing, Containerization, Orchestration
"This is what makes time travel possible." —Doc Brown
Img src: http://backtothefuture.wikia.com/wiki/Flux_capacitor
The Flux Capacitor
Microservices are a part of a streaming pipeline
A pipeline can now be exposed as Microservices
Packaged, deployed and managed as cloud-native applications with the right tools
Learn more: lightbend.com/fast-data-platform
Thank you