Date posted: 28-Jan-2015
Uploaded by: vivek-parihar
Centralized Logging System Using MongoDB
@vparihar
AVP Engineering, Webonise Lab
Vivek Parihar
Who Am I?
● A Weboniser and Rubyist
● Blogger (vparihar01.github.com)
● MongoDB user
● Geek
● DevOps
● Mainly write Ruby, but have a great passion for JavaScript and Cloud Platforms...
● What is Logging?
● Why do we need Logging?
● Logging Do’s and Don’ts
● Logs Are Streams, Not Files
● Problems managing Logs for a huge Infra
● What can a Centralized Logging System do for us?
● Centralized Logging System Architecture
● What and why Fluentd?
● Why MongoDB is a good fit
Agenda
What is Logging?
Logging is one of the most important parts of any application.
In general, logging refers to keeping track of what an application is doing.
Why do we need Logging?
Logging: Helps in finding and fixing bugs
Logging: Extensively used for debugging
Logging: Helps us diagnose and understand the behaviour of an application
Logging: Tells us exactly what happened, when, where, and why
Who did it? At what time? What did they steal?
Logging: Do’s and Don’ts
#1 It should be FAST
Logging: Do’s and Don’ts
#2 It should not affect the user
Prevent DISK BLOAT
It should not be like:
● "#########its working#########"
● "!!!!!coming here in to get secondary users!!!!!"
● "#########I am Here#########"
● "#########Task completed#######"
Logging: Do’s and Don’ts
#3 Log only useful info
Logging: Do’s and Don’ts
#4 Differentiate Log Levels
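To make the levels concrete, here is a minimal sketch using Ruby's standard Logger; the log device, threshold, and messages are illustrative:

```ruby
require "logger"
require "stringio"

# Illustrative only: log to an in-memory buffer so the output is easy to
# inspect; in a real app the device would be $stdout or a file.
buffer = StringIO.new
logger = Logger.new(buffer)

# Everything below the threshold is filtered out before formatting.
logger.level = Logger::WARN

logger.debug("verbose detail")        # dropped
logger.info("routine event")          # dropped
logger.warn("disk usage at 85%")      # kept
logger.error("payment API timed out") # kept

puts buffer.string
```

Raising the threshold in production keeps logging fast and the output small, which is exactly the point of #1 and #2 above.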
Logs Are Streams, Not Files
Logs are a stream, and it behooves everyone to treat them as such. Your programs should log to stdout and/or stderr and omit any attempt to handle log paths, log rotation, or sending logs over the syslog protocol.
Directing where the program’s log stream goes can be left up to the runtime container: a local terminal or IDE (in development environments), an Upstart / Systemd launch script (in traditional hosting environments), or a system like Logplex/Heroku (in a platform environment).
By: Adam Wiggins, Heroku co-founder.
Problems managing Logs for a huge Infra
What about infra like these?
Problems managing Logs for a huge Infra
How can we solve the huge Infra problem?
Solution: Centralized Logging System
What can a Centralized Logging System do for us?
All of the logs are in one place. This makes searching and analysis across multiple servers much easier than bouncing around between boxes, greatly simplifying log analysis and correlation tasks.
#1 Log Collections
#2 Aggregation
Scaled-out servers behind load balancers each produce their own log files, making it impossible to debug a single action flow that is distributed across servers unless the logs converge into a single place.
What can a Centralized Logging System do for us?
#3 High Availability
Suppose your system is down or overloaded and unable to tell you what happened; a separate, centralized log store still can.
What can a Centralized Logging System do for us?
Local logs on a server may be lost in the event of an intrusion or system failure. By keeping the logs elsewhere, you at least have a chance of finding something useful about what happened.
#4 Security
What can a Centralized Logging System do for us?
It reduces disk space usage and disk I/O on core servers that should be busy doing something else.
#5 Prevent Disk BLOAT
What can a Centralized Logging System do for us?
#6 Visual Indicators
Abnormal behaviors can be detected faster when we see them in a visual instrument such as a graph, where peak points are easily noticed.
What can a Centralized Logging System do for us?
Centralized Logging System Architecture
What and Why Fluentd?
What’s Fluentd?
It’s like syslogd, but uses JSON for log messages
What’s Fluentd?
What’s Fluentd?
A Fluentd event consists of three parts: a tag, a time, and a record.
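As an illustration, a single event might look like this (the tag, time, and record values here are made up):

```
tag:    app.access
time:   2015-01-28 10:15:00 +0000
record: {"method": "GET", "path": "/users/42", "status": 200, "elapsed_ms": 12}
```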
What’s Fluentd?
(diagram: input, buffer, and output plug-ins)
So Fluentd is a:
● Buffer
● Router
● Collector
● Converter
● Aggregator
● ...
What’s Fluentd?
It’s written in Ruby :)
Why Fluentd?
Extensibility - Plugin Architecture
Why Fluentd?
Unified log format - JSON format
Why Fluentd?
Reliable - HA configuration
Why Fluentd?
Easy to install - RPM/deb packages
> sudo fluentd --setup && fluentd
Very small footprint
> small engine (~3,000 lines) + plugins
Why Fluentd?
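To tie the pieces together, here is a minimal sketch of a Fluentd configuration that tails an application log and forwards it to MongoDB. It assumes the fluent-plugin-mongo output plugin is installed; all paths, tags, hostnames, and sizes are illustrative:

```
# Tail the application log and parse each line as JSON
<source>
  type tail
  path /var/log/app/production.log
  format json
  tag app.production
</source>

# Forward everything under app.* to a capped MongoDB collection
<match app.**>
  type mongo
  host localhost
  port 27017
  database logs
  collection app
  capped
  capped_size 100m
</match>
```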
Why is MongoDB a good fit?
1. It’s Schemaless
Document-oriented / JSON is a great format for log information. Very flexible and “schemaless” in the sense that we can throw in an extra field any time we want.
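For example, two log documents in the same collection can carry different fields without any schema migration (field names here are made up):

```
{ "level": "error", "msg": "timeout",     "service": "payments", "elapsed_ms": 3021 }
{ "level": "warn",  "msg": "disk at 85%", "host": "web-03",      "mount": "/var" }
```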
Why ?
2. Fire and Forget
MongoDB inserts can be done asynchronously: unacknowledged writes do not wait for a server reply, so logging does not block the application.
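With the Ruby driver, for instance, a fire-and-forget insert can be requested by setting the write concern to w: 0. This is a sketch only: it assumes a mongod on localhost and is not meant to run standalone; the database and field names are illustrative.

```ruby
require "mongo" # mongo gem; needs a running mongod, so this is a sketch only

client = Mongo::Client.new(["localhost:27017"], database: "logs")

# w: 0 means the driver does not wait for the server to acknowledge the
# write, so logging never blocks the application (at the cost of possible
# silent loss if the server is down).
logs = client[:app, write: { w: 0 }]
logs.insert_one(level: "info", msg: "user signed in", at: Time.now.utc)
```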
Why ?
3. Scalable and easy to replicate.
Built-in replica sets and sharding provide high availability.
Why ?
4. Centralized and easy remote access
Why ?
5. Capped Collections
● They "remember" the insertion order of their documents
● They store inserted documents in insertion order on disk
● They remove the oldest documents automatically as new documents are inserted

However, you give up some things with capped collections:
● They have a fixed maximum size
● You cannot shard a capped collection
● Updates to documents in a capped collection must not cause a document to grow (i.e. not all $set operations will work, and no $push or $pushAll will)
● You may not explicitly .remove() documents from a capped collection
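Creating a capped collection is a single command in the mongo shell; the collection name and size here are illustrative:

```
// Create a 100 MB capped collection for logs; the oldest entries are
// discarded automatically once the size limit is reached.
db.createCollection("app_logs", { capped: true, size: 100 * 1024 * 1024 })
```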
Why ?
6. Tailing Logs
● You’ll really miss the ability to tail logfiles
● Or... will you?
● MongoDB offers tailable cursors
Why ?
Tailable Cursors
What can we do with Tailable Cursors?
We can implement pub/sub using Node.js and MongoDB
https://github.com/scttnlsn/mubsub
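In Ruby, tailing a capped collection looks roughly like this. Sketch only: it assumes a running mongod and an existing capped collection (the names are illustrative), so it is not meant to run standalone:

```ruby
require "mongo" # mongo gem; sketch only, needs a live server

client = Mongo::Client.new(["localhost:27017"], database: "logs")

# A tailable "await" cursor stays open and blocks waiting for new documents,
# much like `tail -f` on a logfile.
client[:app_logs].find({}, cursor_type: :tailable_await).each do |doc|
  puts doc.inspect
end
```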
Why ?
Thanks
Would love to answer your queries...
Vivek Parihar@vparihar