Date post: | 14-Apr-2017 |
Category: |
Technology |
Upload: | sascha-moellering |
Basics
• Today: building an event-driven system
• Using:
  – Apache Kafka/Amazon Kinesis
  – Docker
  – Vert.x
  – Apache Camel/AWS Lambda
  – Google’s Protobuf
Basics
• Producers putting data into the messaging system
• Messages in Google’s Protobuf format
• Consumers pulling data from the messaging system
Infrastructure (Apache Kafka)
• Publish-subscribe messaging
• Implemented as a distributed commit log
• Fast: hundreds of MB of reads and writes per second from thousands of clients
• Scalable: elastically and transparently expanded without downtime
• Durable: messages persisted on disk
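To make the "commit log" idea concrete, here is a minimal single-process sketch in Java. It is not Kafka's API: the class and method names are hypothetical, and it ignores distribution, partitioning and disk persistence. It only shows that records are appended at increasing offsets and that reading never removes them, so each subscriber can track its own position.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: an in-memory stand-in for a single log partition.
// Names are hypothetical and do not mirror the Kafka client API.
class CommitLogSketch {
    private final List<String> records = new ArrayList<>();

    // Append a record to the end of the log and return its offset.
    synchronized long append(String record) {
        records.add(record);
        return records.size() - 1;
    }

    // Read up to maxRecords starting at the given offset. Reading does not
    // delete anything, so several subscribers can read the same records.
    synchronized List<String> read(long offset, int maxRecords) {
        if (offset < 0 || offset >= records.size() || maxRecords <= 0) {
            return Collections.emptyList();
        }
        int from = (int) offset;
        int to = Math.min(records.size(), from + maxRecords);
        return new ArrayList<>(records.subList(from, to));
    }

    public static void main(String[] args) {
        CommitLogSketch log = new CommitLogSketch();
        log.append("event-1");
        log.append("event-2");
        log.append("event-3");

        // Two independent subscribers, each keeping its own offset.
        long offsetA = 0;
        long offsetB = 2;
        System.out.println("A reads " + log.read(offsetA, 10));
        System.out.println("B reads " + log.read(offsetB, 10));
    }
}
```

Keeping the offset on the consumer side is what lets a publish-subscribe system serve many consumers from the same stored data.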
Infrastructure (Amazon Kinesis)
• Managed service for real-time data processing
• Decoupling services
• Data stored for 24 hours
• 1 MB maximum message size
Infrastructure
• Stream:
  – Ordered sequence of data records
  – Data records are distributed into shards
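How records end up in a particular shard can be sketched as follows. Kinesis hashes each record's partition key with MD5 into a 128-bit value and assigns the record to the shard whose hash key range contains that value. The sketch below assumes, for simplicity, that the shards split the 128-bit range evenly; real streams can have uneven ranges after resharding. The class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: maps a partition key to a shard index under the
// assumption that shards divide the 128-bit hash space into equal ranges.
class ShardMappingSketch {

    static int shardFor(String partitionKey, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            // Interpret the 16 digest bytes as an unsigned 128-bit integer.
            BigInteger hashKey = new BigInteger(1, digest);
            // index = floor(hashKey * shardCount / 2^128), always < shardCount.
            return hashKey.multiply(BigInteger.valueOf(shardCount))
                          .shiftRight(128)
                          .intValue();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        for (String key : new String[] {"user-1", "user-2", "user-3"}) {
            System.out.println(key + " -> shard " + shardFor(key, 4));
        }
    }
}
```

Records with the same partition key always land in the same shard, which is what preserves their relative order.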
Infrastructure
• Shard:
  – Group of data records in a stream
  – Up to 1 MB of writes per second
  – Up to 2 MB of reads per second
  – Up to 1000 puts per second
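The per-shard limits above translate directly into a sizing calculation: a stream needs enough shards to satisfy whichever of the three limits is hit first. The following sketch uses only the numbers from the slide; the class name, method name and example workload are made up for illustration.

```java
// Illustrative only: estimates the shard count from the per-shard limits
// listed on the slide (1 MB/s write, 2 MB/s read, 1000 puts/s).
class ShardCalculator {
    static final double WRITE_MB_PER_SHARD = 1.0;
    static final double READ_MB_PER_SHARD = 2.0;
    static final double PUTS_PER_SHARD = 1000.0;

    static int requiredShards(double writeMbPerSec, double readMbPerSec, double putsPerSec) {
        int forWrites = (int) Math.ceil(writeMbPerSec / WRITE_MB_PER_SHARD);
        int forReads = (int) Math.ceil(readMbPerSec / READ_MB_PER_SHARD);
        int forPuts = (int) Math.ceil(putsPerSec / PUTS_PER_SHARD);
        // The tightest limit decides; a stream always has at least one shard.
        return Math.max(1, Math.max(forWrites, Math.max(forReads, forPuts)));
    }

    public static void main(String[] args) {
        // Hypothetical workload: 3.5 MB/s in, 5 MB/s out, 2500 puts/s.
        System.out.println(requiredShards(3.5, 5.0, 2500)); // prints 4
    }
}
```

In the example the write throughput is the bottleneck: 3.5 MB/s needs 4 shards, while reads and puts would each need only 3.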
Infrastructure (Docker)
• Package application with dependencies
• Standardized unit for software development
• Layered filesystem, share common files
• Isolate applications from each other
Infrastructure
• Docker container: stripped-to-basics version of a Linux operating system
• Docker image: software you load into a container
Infrastructure
• Docker image built with a Dockerfile
• Docker images built using “inheritance”
• Custom image based on a “base image”
Software & Frameworks (Vert.x)
• Toolkit for reactive applications
• Based on the JVM
• Event driven and non-blocking
• Polyglot (Java, JS, Groovy, Ruby)
• Lightweight and modular
Software & Frameworks
• Producer application built using Vert.x
• Sends a message every 5 s
• Kafka or Kinesis depending on deployment target
Software & Frameworks
http://169.254.169.254/latest/meta-data
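The slide shows only the EC2 instance metadata URL. One plausible reading, and it is an assumption here rather than something the slides state, is that the producer probes this endpoint to find out whether it runs on EC2 and then picks Kinesis (on AWS) or Kafka (elsewhere). A minimal sketch of that idea, with hypothetical class and method names:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Illustrative only: guesses the deployment target by probing the EC2
// instance metadata endpoint. The selection rule is an assumption.
class DeploymentTargetSketch {
    enum Target { KAFKA, KINESIS }

    static final String METADATA_URL = "http://169.254.169.254/latest/meta-data";

    // Returns true if the endpoint answers with HTTP 200 within the timeout.
    // Instances that enforce token-based metadata access (IMDSv2) would
    // reject this plain GET and need an extra token request first.
    static boolean metadataReachable(String url, int timeoutMillis) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.setRequestMethod("GET");
            try {
                return conn.getResponseCode() == 200;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            // No route, timeout or refused connection: assume not on EC2.
            return false;
        }
    }

    // Assumed rule: Kinesis when running on EC2, Kafka otherwise.
    static Target selectTarget(boolean runningOnEc2) {
        return runningOnEc2 ? Target.KINESIS : Target.KAFKA;
    }

    public static void main(String[] args) {
        boolean onEc2 = metadataReachable(METADATA_URL, 500);
        System.out.println("Sending to " + selectTarget(onEc2));
    }
}
```

A short timeout matters here because the link-local address is simply unreachable outside EC2, and the probe should not delay local startup.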
Software & Frameworks (Apache Camel)
• Framework based on Enterprise Integration Patterns (EIP)
• Small library with minimal dependencies
• Define routing and mediation rules
Software & Frameworks (Google’s Protobuf)
• Language-neutral
• Platform-neutral
• Extensible mechanism for serializing structured data
• Support for Java, Python, and C++
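As a small illustration of what "serializing structured data" means at the byte level, Protobuf's wire format stores integers as base-128 varints: seven payload bits per byte, least significant group first, with the high bit set on every byte except the last. The sketch below implements only that building block. It is not the Protobuf library, and in a real project the classes generated by `protoc` do this for you; the class name is hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: base-128 varint encoding as used by the Protobuf
// wire format for integer fields. Values are treated as unsigned 64-bit.
class VarintSketch {

    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // While more than 7 bits remain, emit 7 bits with the continuation bit.
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value); // final byte, continuation bit clear
        return out.toByteArray();
    }

    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            if (shift >= 64) {
                throw new IllegalArgumentException("varint too long");
            }
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result; // last byte reached
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated varint");
    }

    public static void main(String[] args) {
        byte[] encoded = encode(300);
        // 300 is encoded as the two bytes 0xAC 0x02.
        System.out.printf("%02X %02X%n", encoded[0] & 0xFF, encoded[1] & 0xFF);
        System.out.println(decode(encoded)); // prints 300
    }
}
```

Small numbers take a single byte, which is one reason Protobuf messages tend to be compact.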
Software & Frameworks (AWS Lambda)
• Compute service
• Runs your code in response to events
• Manages underlying compute resources
Software & Frameworks
• Triggered by:
  – Modifications in S3 buckets
  – Notifications by SNS
  – Messages in Kinesis
  – Table updates in DynamoDB
Software & Frameworks
• Code run in Lambda: “Lambda function”
• Not to be confused with Java 8 lambda expressions
• Lambda functions support:
  – Java 8
  – JavaScript
Software & Frameworks
• Building the applications
  – ingestion-service
  – kafka-consumer-camel
  – kinesis-consumer-lambda
Software & Frameworks
• ingestion-service
  1. git clone https://github.com/SaschaMoellering/ingestion-service.git
  2. mvn -Dmaven.test.skip=true package
  3. docker build -t autoscaling/ingestion-service .
Software & Frameworks
• kafka-consumer-camel
  1. git clone https://github.com/SaschaMoellering/kafka-consumer-camel.git
  2. mvn -Dmaven.test.skip=true package
  3. docker build -t autoscaling/kafka-consumer .
Software & Frameworks
• kinesis-consumer-lambda
  1. git clone https://github.com/SaschaMoellering/kinesis-consumer-lambda.git
  2. mvn -Dmaven.test.skip=true package
Deployment
• Locally
  1. Start the Spotify Kafka Docker container
Deployment
• Why the Spotify Kafka Docker image?
  – Kafka depends on ZooKeeper
  – Spotify Kafka runs Kafka and ZooKeeper
  – No dependency on an external ZooKeeper
  – Runs out of the box
Deployment
• Locally
  2. Start the Apache Camel Kafka consumer
Deployment
• Kafka consumer Docker container
  – Based on phusion/baseimage
  – Installs Oracle Java 8
  – Adds the consumer fat JAR
  – Starts the fat JAR
Deployment
• Locally
  3. Start the Vert.x Kafka producer
Deployment
• Vert.x producer Docker container
  – Based on phusion/baseimage
  – Installs Oracle Java 8
  – Adds the producer fat JAR
  – Starts the fat JAR
Deployment
• Requirements for AWS:
  – VPC
  – User Role for Kinesis access from EC2
  – User Role for Kinesis access from Lambda
  – EC2 instance
  – Kinesis stream
  – Lambda package
Deployment
• In AWS
  – Create Lambda function
    • Upload JAR to S3 bucket
    • Specify function
    • Add event source (SUMMIT_STREAM)
Deployment
• In AWS
  – Start an EC2 instance
    • t2.small is sufficient
    • Install Docker and run the container using EC2 user data
    • Important: select the correct IAM role
Deployment
• EC2 User Data
  #!/bin/bash -ex
  yum -y update
  yum install docker -y
  service docker start
  docker run autoscaling/ingestion-service
Putting it all together
• Integration of Kinesis and Kafka
  – Kinesis consumer that processes records
  – Record processing -> sending to Kafka
  – AWS Lambda perfect choice
  – Problem: Lambda and VPN (VPC) not working*
Putting it all together
• Integration testing Kinesis and Kafka
  – AWS API:
    • Create Kinesis stream in @BeforeClass
    • Produce data and write into stream
    • Delete stream in @AfterClass
Putting it all together
• Integration testing Kinesis and Kafka
  – Spotify Docker Client
    • Run Spotify Kafka container in @BeforeClass
    • Produce data and write into stream
    • Stop Spotify Kafka container in @AfterClass
Putting it all together
• Integration testing Kinesis and Kafka
  – Put messages into Kinesis
  – Consume messages in application
  – Put messages in Kafka
  – Consume messages from Kafka
  – Compare messages
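The shape of that round-trip check can be sketched with plain Java queues standing in for the two messaging systems. This is only an outline of the five steps and of the final comparison. The tests described in the talk run against a real Kinesis stream and a real Kafka container, and the stand-in queues and all names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: in-memory queues replace Kinesis and Kafka so the
// sequence of steps and the final assertion are visible in one place.
class PipelineRoundTripSketch {

    // The "application" step: consume everything from the source and
    // forward it unchanged to the sink. Returns the number of messages moved.
    static int forwardAll(BlockingQueue<String> source, BlockingQueue<String> sink) {
        List<String> batch = new ArrayList<>();
        source.drainTo(batch);
        sink.addAll(batch);
        return batch.size();
    }

    static List<String> roundTrip(List<String> messages) {
        BlockingQueue<String> kinesisStandIn = new LinkedBlockingQueue<>();
        BlockingQueue<String> kafkaStandIn = new LinkedBlockingQueue<>();

        kinesisStandIn.addAll(messages);          // 1. put messages into "Kinesis"
        forwardAll(kinesisStandIn, kafkaStandIn); // 2.+3. consume, then put into "Kafka"

        List<String> received = new ArrayList<>();
        kafkaStandIn.drainTo(received);           // 4. consume from "Kafka"
        return received;
    }

    public static void main(String[] args) {
        List<String> sent = Arrays.asList("m1", "m2", "m3");
        List<String> received = roundTrip(sent);
        // 5. compare what was sent with what arrived.
        System.out.println(sent.equals(received) ? "messages match" : "mismatch");
    }
}
```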
Putting it all together
• Integration testing Kinesis and Kafka
  – After tests: clean up infrastructure
  – Very cost effective
  – Real world tests without mocking
  – Quite fast
Recap
• What have we achieved today?
  – We created a distributed, message-driven system
  – Based on JVM and Docker
  – Running locally and on AWS
Resources
• ingestion-service
  – https://github.com/SaschaMoellering/ingestion-service.git
  – https://hub.docker.com/r/autoscaling/ingestion-service/
Resources
• kafka-consumer
  – https://github.com/SaschaMoellering/kafka-consumer-camel
  – https://hub.docker.com/r/autoscaling/kafka-consumer/
Resources
• EC2 User Data
  – https://gist.github.com/SaschaMoellering/c6ee24ec999325c43e90
• EC2 User Role
  – https://gist.github.com/SaschaMoellering/a971fb73626f41ad80f4
Resources
• Lambda User Role
  – https://gist.github.com/SaschaMoellering/b14540b144263e5fea4b
Resources
• kinesis-consumer-lambda
  – https://github.com/SaschaMoellering/kinesis-consumer-lambda
Resources
• Spotify Kafka Docker Image
  – https://github.com/spotify/docker-kafka.git
  – https://hub.docker.com/r/spotify/kafka/
• Spotify Docker Client
  – https://github.com/spotify/docker-client.git