Date post: | 16-Feb-2017 |
Upload: | diego-pacheco |
Reactive Microservices Using the Netflix Stack on AWS
Diego Pacheco, Principal Software Architect at ilegra.com, @diego_pacheco
www.ilegra.com
NetflixOSS Stack
Why Netflix?
Billions of Requests Per Day
1/3 of US Internet Bandwidth
~10k EC2 Instances
Multi-Region
100s of Microservices
Innovation + Solid Service
SOA, Microservices and DevOps Benchmark
Social Product
Social Network
Video
Docs
Apps
Chat
Scalability
Distributed Teams
Could Reach Web Scale
Netflix My Problem
AWS
Cloud Native
Principles
Stateless Services
Ephemeral Instances
Everything Fails All the Time
Auto Scaling / Down Scaling
Multi AZ and Multi Region
No SPOF
Design for Failure (expected)
SOA
Microservices
No Central Database
NoSQL
Lightweight Serializable Objects
Latency Tolerant Protocols
DevOps Enabler
Immutable Infrastructure
Anti-Fragility
Right Set of Assumptions
Microservices
Reactive
Java Drivers vs. REST
Simple View of the Architecture
Zuul
UI
Microservice
Cassandra Cluster
Stack
OSS
Zuul
Zuul
Karyon: Microbiology - Nucleus
Reactive Extensions + Netty Server
Lower Latency under Heavy Load
Fewer Locks, Fewer Thread Migrations
Consumes Less CPU
Lower Object Allocation Rate
RxNetty
Karyon: CODE
Karyon: Reactive
Karyon: Reactive
Eureka and Service Discovery
http://microservices.io/patterns/server-side-discovery.html
Eureka
AWS Service Registry for Mid-tier Load Balancing and Failover
REST Based
Karyon and Ribbon Integration
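To make the registry idea concrete, here is a minimal in-memory sketch of register, heartbeat and lookup with lease expiry. It is an illustration only: `ServiceRegistrySketch`, its methods and the lease length are hypothetical names chosen for this example, not the Eureka client or server API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry; NOT the Eureka client or server API.
class ServiceRegistrySketch {
    // service name -> (instance address -> time of last heartbeat, in millis)
    private final Map<String, Map<String, Long>> leases = new ConcurrentHashMap<>();
    private final long leaseMillis;

    ServiceRegistrySketch(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // An instance registers itself, and renews its lease, by calling this.
    void heartbeat(String service, String address, long nowMillis) {
        leases.computeIfAbsent(service, s -> new ConcurrentHashMap<>()).put(address, nowMillis);
    }

    // Clients ask for the instances whose lease has not expired yet.
    List<String> lookup(String service, long nowMillis) {
        List<String> alive = new ArrayList<>();
        leases.getOrDefault(service, Map.of()).forEach((address, lastSeen) -> {
            if (nowMillis - lastSeen <= leaseMillis) {
                alive.add(address);
            }
        });
        alive.sort(null); // natural order, so the example is deterministic
        return alive;
    }
}
```

An instance that stops sending heartbeats simply drops out of `lookup` once its lease runs out, which is what lets clients fail over without a central load balancer.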
Eureka
Eureka and Service Discovery
Availability
Hystrix
IPC Library
Client Side Load Balancing
Multi-Protocol (HTTP, TCP, UDP)
Caching*
Batching
Reactive
Ribbon
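Client-side load balancing means the caller, not a proxy, picks the target instance. A minimal round-robin sketch of that idea follows; `RoundRobinSketch` is a hypothetical class for illustration and is not Ribbon's load balancer or rule API.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin chooser; NOT Ribbon's load balancer API.
class RoundRobinSketch {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no servers");
        }
        this.servers = List.copyOf(servers);
    }

    // Each call returns the next server in the list, wrapping around at the end.
    String choose() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```

In a real setup the server list would come from the service registry rather than being fixed at construction time.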
Ribbon: CODE
Reactive Extensions for the JVM
Async / Event-based Programming
Observer Pattern
Less than 1 MB
Heavily Used by the Netflix OSS Stack
RxJava
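The observer pattern with asynchronous delivery can be sketched with the JDK alone. The example below uses `java.util.concurrent.Flow` and `SubmissionPublisher` (Java 9+) as a stand-in; it is not RxJava's `Observable` API, and `ObserverSketch` is a hypothetical name.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

// Stand-in built on the JDK Flow API; NOT the RxJava API.
class ObserverSketch {
    // Transforms and collects every pushed item, then completes the future.
    static class CollectingSubscriber implements Flow.Subscriber<String> {
        final List<String> received = new CopyOnWriteArrayList<>();
        final CompletableFuture<List<String>> done = new CompletableFuture<>();

        @Override public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
        @Override public void onNext(String item) { received.add(item.toUpperCase()); }
        @Override public void onError(Throwable t) { done.completeExceptionally(t); }
        @Override public void onComplete() { done.complete(received); }
    }

    static List<String> run(List<String> events) {
        CollectingSubscriber subscriber = new CollectingSubscriber();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            events.forEach(publisher::submit); // items are delivered on another thread
        } // close() signals onComplete after the submitted items
        // Wait for completion, failing instead of hanging if nothing arrives.
        return subscriber.done.orTimeout(5, TimeUnit.SECONDS).join();
    }
}
```

The producer never calls the consumer directly; it only publishes events, and the subscriber reacts as they arrive.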
Archaius
Configuration Management Solution
Dynamic and Typed Properties
High Throughput and Thread Safety
Callbacks: Notifications of Config Changes
JMX Beans
Dynamic Config Sources: File, DB, DynamoDB, ZooKeeper
Based on Apache Commons Configuration
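The combination of typed reads and change callbacks can be illustrated with a small stdlib-only sketch. `DynamicConfigSketch` and its methods are hypothetical and only mimic the idea; they are not the Archaius API.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical class; NOT the Archaius API.
class DynamicConfigSketch {
    private final Map<String, String> values = new ConcurrentHashMap<>();
    private final Map<String, List<Runnable>> callbacks = new ConcurrentHashMap<>();

    // Typed read with a default, evaluated on every call so updates are visible.
    int getInt(String key, int defaultValue) {
        String raw = values.get(key);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return defaultValue; // malformed value: fall back to the default
        }
    }

    // Register a listener that runs whenever the given key changes.
    void onChange(String key, Runnable callback) {
        callbacks.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>()).add(callback);
    }

    // Called by whatever polls the config source (file, database, ...).
    void update(String key, String newValue) {
        String old = values.put(key, newValue);
        if (!newValue.equals(old)) {
            callbacks.getOrDefault(key, List.of()).forEach(Runnable::run);
        }
    }
}
```

Because readers call `getInt` each time instead of caching the value, a changed property takes effect without restarting the service.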
Archaius + Git
[Diagram: Central Internal Git with Property Files; three Microservices, each with a Slave Side Car and a File System]
Asgard
Asgard
Packer
JOB Create
Bake/Provision
Launch
Deploys
Dynomite
Implements the Amazon Dynamo paper
Similar to Cassandra, Riak and DynamoDB
Strong Consistency, Quorum-like, No Data Loss
Pluggable
Scalable
Redis / Memcached
Multi-Clients with Dyno
Can use most Redis commands
Integrated with Eureka via Prana
Dynomite: Internals
Oregon D1
Oregon D2
N California D3
Eureka Server
Eureka Server
Prana
Prana
Prana
Multi-Region Cluster
Dynomite: CODE
Dynomite Contributions
https://github.com/Netflix/dynomite
https://github.com/Netflix/dynomite/pull/207
https://github.com/Netflix/dynomite/pull/200
Chaos Engineering
Isolate Failure: Avoid Cascading
Redundancy: No SPOF
Auto-Scaling
Fault Tolerance and Isolation
Recovery
Fallbacks and Degraded Experience
Protect Customer from Failures: Don't Throw Failures ->
Failures VS Errors
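The "fallback and degraded experience" idea can be sketched with `CompletableFuture`: a slow or failing call is replaced by a fallback value instead of surfacing the failure to the customer. `FallbackSketch.callWithFallback` is a hypothetical helper for illustration, not the Hystrix API, and it omits circuit breaking and thread-pool isolation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper; NOT the Hystrix API (no circuit breaker, no isolation).
class FallbackSketch {
    // Runs the remote call asynchronously; on error or timeout, returns the fallback.
    static <T> T callWithFallback(Supplier<T> remoteCall, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(remoteCall)
                // Fails the future if no result arrives in time; the underlying
                // task is not cancelled and may keep running in the background.
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                // Any failure (exception or timeout) becomes the degraded value.
                .exceptionally(error -> fallback)
                .join();
    }
}
```

A caller would pass something cheap and safe as the fallback, for example a cached or default response.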
Chaos / Failure
Gatling
Stress Testing Tool
Scala DSL
Runs on top of Akka
Simple to use
Chaos Arch
[Diagram: ELB, two Zuul instances, Eureka, Microservice N1, Microservice N2, Cassandra Cluster]
Running…
Chaos Results and Learnings
Retry configuration and timeouts in Ribbon
Right class in Zuul 1.x (default retries only on SocketException)
RequestSpecificRetryHandler (HttpClient exceptions)
zuul.client.ribbon.MaxAutoRetries=1
zuul.client.ribbon.MaxAutoRetriesNextServer=1
zuul.client.ribbon.OkToRetryOnAllOperations=true
Eureka timeouts
It works
Everything needs to have redundancy
ASG is your friend :-)
Stateless services FTW
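The two retry settings above can be read as "retries on the same server" and "additional servers to try". The sketch below simulates that reading with plain Java; it is an assumption about the semantics for illustration, and `RetrySketch` is a hypothetical class, not Ribbon's or Zuul's retry handler.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical simulation of the retry settings; NOT Ribbon's retry handler.
class RetrySketch {
    static <T> T execute(List<String> servers,
                         int maxAutoRetries,            // extra attempts on the same server
                         int maxAutoRetriesNextServer,  // extra servers to move on to
                         Function<String, T> call) {
        RuntimeException last = null;
        int serversToTry = Math.min(servers.size(), 1 + maxAutoRetriesNextServer);
        for (int s = 0; s < serversToTry; s++) {
            for (int attempt = 0; attempt <= maxAutoRetries; attempt++) {
                try {
                    return call.apply(servers.get(s));
                } catch (RuntimeException e) {
                    last = e; // remember the failure and keep trying
                }
            }
        }
        throw new IllegalStateException("all attempts failed", last);
    }
}
```

With both values set to 1, a request gets at most two attempts on the first server and two on the next one before the failure reaches the caller.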
Microservice Producer
Kafka / Storm :: Event System
Chaos Results and Learnings
Before:
Data was not in Elasticsearch
Producers were losing data
After:
No data loss
It works
Changes:
No logging on the microservice :( (logging was added)
Code that publishes events placed in a try-catch
Retry config in the Kafka producer raised from 0 to 5
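A rough sketch of those changes, under stated assumptions: `retries` is a standard Kafka producer setting, but the wrapper below is hypothetical, and the `send` parameter merely stands in for the real producer call so the example runs without the Kafka client on the classpath.

```java
import java.util.Properties;
import java.util.function.Consumer;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical wrapper; `send` stands in for the real Kafka producer call.
class EventPublishSketch {
    private static final Logger LOG = Logger.getLogger(EventPublishSketch.class.getName());

    // Producer settings: the deck reports raising retries from 0 to 5.
    static Properties producerConfig() {
        Properties props = new Properties();
        props.setProperty("retries", "5");
        return props;
    }

    // Publish inside a try-catch and log failures instead of losing them silently.
    static boolean publish(Consumer<String> send, String event) {
        try {
            send.accept(event);
            return true;
        } catch (RuntimeException e) {
            LOG.log(Level.SEVERE, "Failed to publish event: " + event, e);
            return false;
        }
    }
}
```

Returning a boolean lets the caller decide whether to queue the event for a later attempt; that choice is part of the sketch, not something the deck specifies.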
Main Challenges
Hacker Mindset
Next Steps
IPC
Spinnaker
Containers
Client Side Aggregation
DevOps 2.0 -> Remediation / Skynet
PoCs
https://github.com/diegopacheco/netflixoss-pocs
http://diego-pacheco.blogspot.com.br/search/label/netflix?max-results=30
Thank you!