Date posted: 12-Apr-2017
Category: Technology
Uploaded by: emerson-macedo
Video Authorization: From chaos to 25ms response time
Emerson Macedo (@emerleite)
https://emerleite.com
Content authorization is always a challenge
Video Authorization process
Video authorization must be really fast
Authorization Provider
Rules
1 - User has a valid session
2 - User has the channel of this video in his Pay TV subscription
3 - User parental control matches video content rating
OR ...
1 - User has a valid session
2 - Video channel has a trial happening
3 - User is able to join the current trial
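The two rule sets above can be sketched as a pair of boolean checks. This is only an illustration: the `User` and `Video` fields are hypothetical names, and "parental control matches" is assumed here to mean that the video rating does not exceed the maximum rating the user allows.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    # All field names are hypothetical, chosen only to mirror the rules.
    has_valid_session: bool
    subscribed_channels: set = field(default_factory=set)
    max_allowed_rating: int = 18
    trial_eligible: bool = False


@dataclass
class Video:
    channel: str
    rating: int
    channel_has_trial: bool = False


def authorized_by_subscription(user, video):
    # Rules 1-3 of the first set: session, channel in subscription, rating.
    return (user.has_valid_session
            and video.channel in user.subscribed_channels
            and video.rating <= user.max_allowed_rating)


def authorized_by_trial(user, video):
    # Rules 1-3 of the second set: session, channel trial, user can join it.
    return (user.has_valid_session
            and video.channel_has_trial
            and user.trial_eligible)


def is_authorized(user, video):
    # Either rule set is enough to play the video.
    return authorized_by_subscription(user, video) or authorized_by_trial(user, video)
```

Each of these fields is backed by a different service in practice, which is why the response time of the whole check depends on the slowest one.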
Video Authorization process
Response time > 5 seconds: the circuit breaker opens
Nobody can play any subscriber-only video
We need to do something about this
1 - Split the circuit breaker
The circuit breaker was like this
The new circuit breaker is per provider
A circuit open for one provider should not affect any other provider
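A minimal sketch of the per-provider idea, assuming a simple failure-count breaker. The class names, thresholds and provider names are hypothetical, and the reset logic is simplified: after the timeout the breaker just closes again instead of going through a half-open state.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.reset_timeout:
            # Simplification: close the circuit again once the timeout passes.
            self.opened_at = None
            self.failures = 0
            return False
        return True

    def call(self, func, *args, **kwargs):
        if self.is_open():
            raise CircuitOpenError("provider temporarily disabled")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # a success resets the consecutive-failure count
        return result


class BreakerRegistry:
    """Keeps one independent breaker per provider name."""

    def __init__(self, **breaker_kwargs):
        self._breaker_kwargs = breaker_kwargs
        self._breakers = {}

    def for_provider(self, provider_name):
        if provider_name not in self._breakers:
            self._breakers[provider_name] = CircuitBreaker(**self._breaker_kwargs)
        return self._breakers[provider_name]
```

Because every provider gets its own failure counter, a slow provider can only open its own circuit; calls routed through `for_provider("other")` keep working.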
2 - Optimize Auth Provider
Globosat Play Authorization Provider
Usually: ~500ms response time
Sometimes: 5s to 10s response time
Video authorization must be really fast
New Relic Transaction Trace
2.1 - Snapshot some external services data
Globosat Play Authorization Provider
After every user sign in
Before User Info snapshot data
After User Info snapshot data
On user video authorization
Before Profile Config snapshot data
After Profile Config snapshot data
On any video authorization
Before Video Info snapshot data
After Video Info snapshot data
In the worst scenario: we have THREE external calls
In the best scenario: we have NO external calls
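The worst-case and best-case call counts can be illustrated with a small snapshot store. This is a sketch under assumptions: the deck does not say where the snapshots were kept or when they expired, so the in-memory dictionary, the `SnapshotStore` name and the fetcher functions are all hypothetical.

```python
class SnapshotStore:
    """In-memory stand-in for wherever the snapshots are kept (hypothetical)."""

    def __init__(self):
        self._snapshots = {}
        self.external_calls = 0  # how many times a remote service was hit

    def get(self, kind, key, fetch):
        snapshot_key = (kind, key)
        if snapshot_key not in self._snapshots:
            # No snapshot yet: this is the only path that calls the service.
            self.external_calls += 1
            self._snapshots[snapshot_key] = fetch(key)
        return self._snapshots[snapshot_key]


def load_authorization_data(store, user_id, video_id,
                            fetch_user_info, fetch_profile_config, fetch_video_info):
    # Three pieces of external data, each served from its snapshot if present.
    return {
        "user_info": store.get("user_info", user_id, fetch_user_info),
        "profile_config": store.get("profile_config", user_id, fetch_profile_config),
        "video_info": store.get("video_info", video_id, fetch_video_info),
    }
```

On a cold store the first authorization makes three external calls; repeating it for the same user and video makes none, and a new video for the same user costs only the video info call.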
2.2 - Be safe with all external service calls
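The deck does not spell out what "safe" meant in practice. One common reading is a strict timeout plus a fallback value, sketched below with the standard library; the helper name `call_with_timeout` and the fallback policy are assumptions, not the talk's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Shared worker pool for outbound calls (size chosen arbitrarily here).
_executor = ThreadPoolExecutor(max_workers=4)


def call_with_timeout(func, timeout_seconds, fallback):
    """Run func in a worker thread; return fallback if it is slow or fails."""
    future = _executor.submit(func)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        # The worker keeps running in the background; we just stop waiting.
        return fallback
    except Exception:
        # Any error from the external service also degrades to the fallback.
        return fallback
```

The caller decides what the fallback is, for example a previously stored snapshot or a denial, so a slow dependency costs at most `timeout_seconds` instead of several seconds.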
2.3 - Cache logged requests when it's safe
Logged request - Success
Logged request - Failure
Nginx Cache Config
With Nginx cache
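The Nginx configuration itself is not reproduced here. As a rough model of the idea in application code, the sketch below caches a response per logged-in session and video for a short time, and only when the response was a success. The cache key, the TTL and the "only cache 200" policy are assumptions made for illustration.

```python
import time


class ResponseCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # (session_id, video_id) -> (expires_at, response)

    def get_or_compute(self, session_id, video_id, compute):
        # The key includes the session, so users never share cached answers.
        key = (session_id, video_id)
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        status, body = compute()
        if status == 200:
            # Assumed policy: failures are never cached, so a denied user
            # is re-checked on the next request.
            self._entries[key] = (now + self.ttl_seconds, (status, body))
        return (status, body)
```

Keeping the session in the key is what makes caching a logged request safe: a cached authorization is only ever replayed to the session that earned it.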
We did a very intensive job and reduced the response time
Usually: ~25ms response time
BUT ...
Sometimes: 5s to 10s response time
3 - Take a look at our infrastructure
3.1 - MongoDB
MongoDB has a bad reputation in the software developer community
Everybody blames Mongo every time [2]
[2] http://rhaas.blogspot.com.br/2014/04/why-clock-is-ticking-for-mongodb.html
Everybody blames Mongo every time [3]
[3] http://cryto.net/~joepie91/blog/2015/07/19/why-you-should-never-ever-ever-use-mongodb/
A database change is an architectural decision
Architectural decisions are hard and very expensive to change
Sometimes: 5s to 10s response time
Strange query convulsions
Traces confirm the convulsions
Logs confirm the convulsions
We checked all indexes and found nothing hurting our system
3.2 - Servers
CPU wait confirms the convulsions
Load average confirms the convulsions
MongoDB docs confirm that something was wrong [4]
[4] https://docs.mongodb.org/manual/administration/production-notes/#remote-filesystems
MongoDB users group confirms that something was wrong [5]
[5] https://groups.google.com/forum/#!msg/mongodb-user/Kd85b2HHVn8/7SnwTyeQKsEJ
We removed NFS from our MongoDB server setup
The convulsions stopped
Usually: ~25ms response time
AND ...
Sometimes: 100ms to 500ms response time
Lessons Learned
1 - Go deep into your problem
2 - Don't panic
3 - Most of the time the solution is not to rewrite from scratch
Questions?