Architecting Fast Data Applications - TechArch Day

Post on 20-May-2020

TechArch Day 2018

Architecting Fast Data Applications

Gerard Maas

TechArch Day 2018 - Helsinki, October 3, 2018

Gerard Maas, Senior SW Engineer, Lightbend, Inc.

Gerard Maas, Señor SW Engineer

gerard.maas@lightbend.com

@maasg

https://github.com/maasg

https://www.linkedin.com/in/gerardmaas/

https://stackoverflow.com/users/764040/maasg

Why Streaming?

Fast Data Architectures

((( Quick Demo )))

The Future of Fast Data

Why Streaming?

Fast Data Use Cases

• Real-time Personalization: Real-time marketing based on behavior, location, inventory levels, product promotions, etc.

• Real-time Financial Processes: Drive better business outcomes through real-time risk, fraud detection, compliance, audit, governance, etc.

• Predictive Analytics: Apply ML models to large volumes of device data to pre-empt failures / outages

• IoT: Real-time consumer and industrial Device and Supply Chain management at scale

https://www.lightbend.com/customers

Predictive Analytics

• ML models applied to device telemetry to detect anomalies

• Preemptive maintenance prevents potential failures that would impact users

Predictive Analytics - Core Idea

Diagram: Telemetry Records → Anomaly Detection Model → Probable Anomalies → Anomaly Handler → Corrective Actions

• Ingest telemetry from edge devices.

• Train models to look for anomalies… and score incoming telemetry.

• Handle anomaly: move activity off component, schedule maintenance window to replace it.
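The core idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the talk does not say which model is used, so a simple z-score detector stands in for it, and the `temperature` field, the threshold of 3, and the action string are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class TelemetryRecord:
    device_id: str
    temperature: float  # hypothetical metric; the talk does not name one


@dataclass
class ZScoreModel:
    mu: float
    sigma: float
    threshold: float = 3.0  # assumed cut-off, in standard deviations

    def score(self, record):
        # Distance from the training mean, in standard deviations.
        return abs(record.temperature - self.mu) / self.sigma


def train(history):
    # "Training" here is just estimating the mean and standard deviation.
    return ZScoreModel(mu=mean(history), sigma=stdev(history))


def detect(model, records):
    # Keep only the records whose score exceeds the model threshold.
    return [r for r in records if model.score(r) > model.threshold]


def handle(anomaly):
    # Placeholder for "move activity off component, schedule maintenance".
    return f"schedule-maintenance:{anomaly.device_id}"


history = [20.0, 21.0, 19.0, 20.0, 20.0]
model = train(history)
incoming = [TelemetryRecord("dev-1", 20.5), TelemetryRecord("dev-2", 35.0)]
actions = [handle(a) for a in detect(model, incoming)]
print(actions)
```

Only `dev-2` is far enough from the training mean to be flagged, so it is the only device that gets a maintenance action.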

Example Architecture

Diagram components: Device Session Microservices; Kafka; Storage; Low Latency Microservices (Akka Streams, Kafka Streams, ...); Spark (Streaming, Batch Processing); Data Center

Diagram labels: Telemetry, Ingestion, Model Training, Model Storage, Model Serving, Scores, Corrective Action

Session management, RESTful microservices

Model Scoring

Model Training

Ingest Device Telemetry

Periodically Train Anomaly Detection Model

Updated model parameters are written back to Kafka

Updated model parameters are also written to secondary storage for resilience
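A minimal sketch of this dual write, assuming JSON-serialized parameters. The Python list stands in for a Kafka topic and a temporary directory stands in for secondary storage; the function names, the file naming scheme, and the `version` field are hypothetical and not taken from the talk.

```python
import json
import tempfile
from pathlib import Path


def publish_model(params, version, topic, store_dir):
    # Serialize once so both destinations hold the identical payload.
    payload = json.dumps({"version": version, "params": params}, sort_keys=True)
    topic.append(payload)  # stand-in for producing to a Kafka topic
    # Stand-in for the secondary storage write.
    (store_dir / f"model-v{version}.json").write_text(payload)
    return payload


def latest_from_store(store_dir):
    # Recovery path: pick the highest version found in secondary storage.
    models = [json.loads(p.read_text()) for p in store_dir.glob("model-v*.json")]
    return max(models, key=lambda m: m["version"])


model_topic = []
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    publish_model({"mu": 20.0, "sigma": 0.7}, 1, model_topic, store)
    publish_model({"mu": 20.4, "sigma": 0.9}, 2, model_topic, store)
    recovered = latest_from_store(store)

print(recovered["version"])
```

If the topic is lost or a serving instance starts cold, the newest parameters can still be restored from the secondary copy.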

Ingest model parameters in the low-latency microservice for serving
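One way to picture model serving in the low-latency service: model updates and telemetry arrive as two kinds of events, and the service swaps in newer parameters as they come. The sketch below is an assumption-laden illustration: the `ModelServer` class, the event tuples, and the version check are hypothetical, and a plain list replaces the Kafka consumers.

```python
from dataclasses import dataclass


@dataclass
class ModelServer:
    version: int = 0
    mu: float = 0.0
    sigma: float = 1.0

    def on_model_update(self, update):
        # Ignore stale or replayed updates; only newer versions are applied.
        if update["version"] <= self.version:
            return False
        self.version = update["version"]
        self.mu = update["mu"]
        self.sigma = update["sigma"]
        return True

    def score(self, value):
        # Score with whichever parameters are currently loaded.
        return abs(value - self.mu) / self.sigma


server = ModelServer()
events = [
    ("model", {"version": 1, "mu": 20.0, "sigma": 2.0}),
    ("telemetry", 26.0),
    ("model", {"version": 2, "mu": 25.0, "sigma": 2.0}),
    ("model", {"version": 1, "mu": 20.0, "sigma": 2.0}),  # replayed, ignored
    ("telemetry", 26.0),
]
scores = []
for kind, payload in events:
    if kind == "model":
        server.on_model_update(payload)
    else:
        scores.append((server.version, server.score(payload)))

print(scores)
```

The same telemetry value scores differently before and after the update, which is the point of serving parameters as a stream rather than redeploying the service.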

Ingest telemetry data to score it, looking for anomalies.

Write detected anomalies back to Kafka in a new topic

Read anomaly information into microservices that manage the devices remotely

Take corrective action, e.g., download patches, disable, reset, ...
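A possible shape for the device-management side, sketched under assumptions: the anomaly record fields (`device_id`, `kind`), the anomaly kinds, and the fallback action are hypothetical. The slide only names "download patches, disable, reset" as example actions.

```python
from collections import defaultdict

# Hypothetical mapping from anomaly kind to corrective action.
ACTIONS = {
    "outdated-firmware": "download-patch",
    "overheating": "disable",
    "unresponsive": "reset",
}


def plan_actions(anomalies):
    # Group by device and de-duplicate, so each device gets one command per action.
    plan = defaultdict(list)
    for a in anomalies:
        action = ACTIONS.get(a["kind"], "flag-for-review")
        if action not in plan[a["device_id"]]:
            plan[a["device_id"]].append(action)
    return dict(plan)


# Stand-in for records read from the anomalies topic.
anomaly_topic = [
    {"device_id": "dev-7", "kind": "overheating"},
    {"device_id": "dev-7", "kind": "overheating"},
    {"device_id": "dev-9", "kind": "outdated-firmware"},
    {"device_id": "dev-9", "kind": "vibration"},
]
plan = plan_actions(anomaly_topic)
print(plan)
```

Unknown anomaly kinds fall through to a review action rather than being dropped, which keeps the handler safe when the model starts emitting new categories.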

Fast Data Architectures

Requirements?

● Latency. How low?

● Throughput. How high?

● Which kind of data processing?

● How do you want to build, deploy, and manage those services?

Diagram, by layer:

● Substrate: Kubernetes, Mesos, YARN (Cloud | On-prem)

● Messaging Backbone: Kafka

● Processing Engine: Spark, Flink, Akka Streams, Kafka Streams (streams); Spark (batch)

● Microservices: Reactive Platform, Go, node.js

● Storage: HDFS, S3, ...; SQL, NoSQL

● Monitoring & Management

● Also shown: Files, Sockets, REST

((( Quick Demo )))

Call-Record-Aggregator

Modules: datamodel, akka-cdr-ingestor, akka-java-aggr-out, spark-aggregation

Pipeline: CDRs → HTTP Ingress → Aggregator (aggregator records) → Console Out
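The aggregation step of the demo can be approximated as follows. This is a rough sketch, not the demo's code: the call-record fields, the 60-second tumbling window, and the per-user sum are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CallRecord:
    user: str
    timestamp: int  # seconds since epoch (assumed)
    duration: int   # call length in seconds (assumed)


def aggregate(records, window_seconds=60):
    # Sum call duration per (window start, user) in tumbling windows.
    totals = defaultdict(int)
    for r in records:
        window_start = r.timestamp - r.timestamp % window_seconds
        totals[(window_start, r.user)] += r.duration
    return dict(totals)


cdrs = [
    CallRecord("alice", 5, 30),
    CallRecord("bob", 10, 12),
    CallRecord("alice", 59, 20),
    CallRecord("alice", 61, 45),
]
result = aggregate(cdrs)
# Console output stage: one line per window and user.
for (window_start, user), total in sorted(result.items()):
    print(window_start, user, total)
```

Records at 5 s and 59 s fall in the same window and are summed; the record at 61 s opens a new window.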

The Future of Fast Data

Diagram labels: Reactive Systems, Fast Data Applications, Big Data, Microservices

The Evolution of Current Trends

A growing number of workloads are moving from “data at rest” to “data in motion”

Streaming data pipelines are being served by Microservices

Microservices, Stream Processing

Containerization, Orchestration

"This is what makes time travel possible." —Doc Brown

Img src: http://backtothefuture.wikia.com/wiki/Flux_capacitor

The Flux Capacitor

Microservices are a part of a streaming pipeline

A pipeline can now be exposed as Microservices

Packaged, deployed and managed as cloud-native applications with the right tools

Learn more

lightbend.com/fast-data-platform

Thank you