Post on 11-Aug-2014
What to stream
You can stream the following from the cloud:
● Traces (Logs)
● Metrics
● Monitoring
● Status
Scenario: App/service runs in the cloud.
We need the log files of your App:
● Web server logs, container logs, app logs
We need the log files of your Service:
● Service logs
Choices for streaming
● Logstash : logstash.net/
● Fluentd : www.fluentd.org/
● Beaver : github.com/josegonzalez/beaver
● Logstash-Forwarder : github.com/elasticsearch/logstash-forwarder
● Woodchuck : github.com/danryan/woodchuck
● RSYSLOG : http://rsyslog.com
● Heka : http://hekad.readthedocs.org/en/latest/
| Name               | Language    | Collector | Shipper | Footprint | Ease of setting up    |
|--------------------|-------------|-----------|---------|-----------|-----------------------|
| Logstash           | JRuby (JVM) | Yes       | No      | High      | Easy                  |
| Fluentd            | Ruby        | Yes       | No      | High      | Easy                  |
| Beaver             | Python      | No        | Yes     | Low       | Easy                  |
| Logstash-Forwarder | Go          | No        | Yes     | Low       | Difficult (uses SSL)  |
| Woodchuck          | Ruby        | No        | Yes     | High      | Easy                  |
| RSYSLOG            | C           | Yes       | Yes     | Low       | Difficult             |
| Heka               | Go          | Yes       | Yes     | Low       | Easy                  |
Our requirements
2 sets of logs to collect:
● All the trace produced when the VM is spun up.
● All the trace inside the VM from the application or service.
Publish them to an in-memory store (queue) which can be accessed by a key.
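The requirement above can be sketched minimally like this. It is a toy model, not the real implementation: Python's stdlib `queue.Queue` stands in for the broker, and the key names are illustrative.

```python
import queue

# In-memory store: one queue per key (e.g. per VM / domain name).
# A real deployment would use a broker such as RabbitMQ instead.
store = {}

def publish(key, message):
    """Append a log message to the queue registered under `key`."""
    store.setdefault(key, queue.Queue()).put(message)

def consume(key):
    """Drain and return all messages currently queued under `key`."""
    q = store.get(key)
    items = []
    while q is not None and not q.empty():
        items.append(q.get())
    return items

publish("pogo.domain.com", "howdy")
publish("pogo.domain.com", "howdy_err")
```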
[Diagram] megamd writes per-VM logs (howdy.log, howdy_err.log) under /usr/share/megam/megamd/logs, one directory per domain (fir.domain.com, doe.domain.com, gir.domain.com, her.domain.com). A shipper agent publishes each directory's logs over AMQP to its own queue (Queue#1 … Queue#4).
How does it work?
Heka resides inside our Megam Engine (megamd). Its job is to collect the trace information when a VM is run:
1. Read the dynamically created VM execution log files.
2. Format the log contents as JSON for every VM execution.
3. Publish the log contents to a queue.
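The three steps above can be sketched roughly as follows. This is a toy version, not Heka itself; the field names mirror the JSON output shown later in this post, while the paths and the queue stub are purely illustrative.

```python
import glob
import json
import socket
from datetime import datetime, timezone

def collect(log_glob="/usr/share/megam/megamd/logs/*/*"):
    """Step 1: read the dynamically created VM execution log files."""
    for path in glob.glob(log_glob):
        with open(path) as fh:
            yield path, fh.read()

def to_json(path, payload):
    """Step 2: format the log contents as JSON for a VM execution."""
    return json.dumps({
        "Timestamp": datetime.now(timezone.utc).isoformat(),
        "Type": "logfile",
        "Logger": path,
        "Payload": payload,
        "Hostname": socket.gethostname(),
    })

def publish(channel, message):
    """Step 3: publish the log contents to a queue (stubbed as a list)."""
    channel.append(message)

queue_stub = []
publish(queue_stub, to_json("/tmp/howdy.log", "TEST\n"))
```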
Beaver resides in each of the VMs. It performs the following steps:
1. Read the log files inside the VM.
2. Format the log contents as JSON.
3. Publish the log contents to a queue.
Logstash
● A centralized logging framework that can transfer logs from multiple hosts to a central location.
● Written in JRuby, hence it needs a JVM, and the JVM consumes a lot of memory.
● Logstash is ideal as a centralized collector, not as a shipper.
Logstash Shipper Scenario
Let us ship logs from a VM, /usr/share/megam/megamd/logs/*/*, to Redis or AMQP.
e.g.:
../megamd/logs/pogo.domain.com/howdy.log → queue named “pogo.domain.com” in AMQP.
../megamd/logs/doe.domain.com/howdy.log → queue named “doe.domain.com” in AMQP.
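The mapping in those examples is simply "queue name = name of the directory containing the log file". A minimal sketch of that rule (the paths are the examples from above):

```python
from pathlib import Path

def queue_name(log_path):
    """Derive the AMQP queue name from the log path:
    .../megamd/logs/<domain>/<file>.log -> "<domain>"."""
    return Path(log_path).parent.name

name = queue_name("/usr/share/megam/megamd/logs/pogo.domain.com/howdy.log")
```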
Logstash Shipper - Sample conf
input {
  file {
    type => "access-log"
    path => [ "/usr/local/share/megam/megamd/logs/*/*" ]
  }
}
filter {
  grok {
    type => "access-log"
    match => [ "@source_path", "(//usr/local/share/megam/megamd/logs/)(?<source_key>.+)(//*)" ]
  }
}
output {
  stdout {
    debug => true
    debug_format => "json"
  }
  redis {
    key => '%{source_key}'
    type => "access-log"
    data_type => "channel"
    host => "my_redis_server.com"
  }
}
Logs inside the <source_key> directory are shipped to the Redis key named <source_key>.
/opt/logstash/agent/etc$ sudo cat shipper.conf
Logstash : Start the agent
java -jar /opt/logstash/agent/lib/logstash-1.4.2.jar agent -f /opt/logstash/agent/etc/shipper.conf
If you don’t have a JRE, then:
sudo apt-get install openjdk-7-jre-headless
Heka
● Mozilla uses it internally.
● Written in Go, so it runs as a native binary.
● Ideal as both a centralized collector and a shipper.
● We picked Heka.
● Our modified version: https://github.com/megamsys/heka
Installation
Download the deb from https://github.com/mozilla-services/heka/releases (or) build from source:
git clone https://github.com/megamsys/heka.git
cd heka
source build.sh
cd build
make deb
dpkg -i heka_0.6.0_amd64.deb
Heka configuration
nano /etc/hekad.toml
[TestWebserver]
type = "LogstreamerInput"
log_directory = "/usr/share/megam/heka/logs/"
file_match = '(?P<DomainName>[^/]+)/(?P<FileName>[^/]+)'
differentiator = ["DomainName", "_log"]
[AMQPOutput]
url = "amqp://guest:guest@localhost/"
exchange = "test_tom"
queue = true
exchangeType = "fanout"
message_matcher = 'TRUE'
encoder = "JsonEncoder"
[JsonEncoder]
fields = [ "Timestamp", "Type", "Logger", "Payload", "Hostname" ]
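The file_match pattern above uses named capture groups to split each stream's path into a domain and a file name; the differentiator then joins the DomainName group with "_log" to name the logger. Since Python's re module shares the (?P<name>...) syntax with Go's regexp, the pattern can be checked like this (the sample path is illustrative):

```python
import re

# Same pattern as file_match in the TOML config above.
file_match = re.compile(r'(?P<DomainName>[^/]+)/(?P<FileName>[^/]+)')

m = file_match.match("tom.com/howdy.log")
# differentiator = ["DomainName", "_log"] yields a logger name like "tom.com_log"
logger = m.group("DomainName") + "_log"
```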
Run heka
sudo hekad -config="/etc/hekad.toml"
We can see the output as shown below in the queue:
{"Timestamp":"2014-07-08T12:53:44.004Z","Type":"logfile","Logger":"tom.com_log","Payload":"TEST\u000a","Hostname":"alrin"}
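A consumer of that queue can decode the message and recover the domain by stripping the "_log" suffix the differentiator added. A small sketch, using the sample message above:

```python
import json

# Sample message as it appears in the queue ("\u000a" is a JSON-escaped newline).
raw = ('{"Timestamp":"2014-07-08T12:53:44.004Z","Type":"logfile",'
       '"Logger":"tom.com_log","Payload":"TEST\\u000a","Hostname":"alrin"}')

msg = json.loads(raw)
domain = msg["Logger"].removesuffix("_log")
```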
Beaver
● Beaver is a lightweight Python log file shipper used to send logs to an intermediate broker for further processing.
● Beaver is ideal when the VM does not have enough memory for a large JVM application to run as a shipper.
Our Beaver usage
[Diagram] A Beaver agent runs inside each VM (VM#1 … VM#n) and ships that VM's logs to RabbitMQ queues; Heka inside the Megam Engine (megamd) ships the engine's logs to the same RabbitMQ, from which a realtime streamer consumes.
Chef Recipe : Beaver
When a VM is run, the recipe (megam_logstash::beaver) is included.
node.set['logstash']['key'] = "#{node.name}"
node.set['logstash']['amqp'] = "#{node.name}_log"
node.set['logstash']['beaver']['inputs'] = [ "/var/log/upstart/nodejs.log", "/var/log/upstart/gulpd.log" ]
include_recipe "megam_logstash::beaver"
Attributes like the node name and log files are set dynamically.
RSYSLOG
RSYSLOG is a rocket-fast system for log processing. It offers high performance, great security features, and a modular design.
Megam uses RSYSLOG to ship logs from VMs to Elasticsearch.
Chef Recipe : Rsyslog
When a VM is run, the recipe (megam_logstash::rsyslog) is included.
node.set['rsyslog']['index'] = "#{node.name}"
node.set['rsyslog']['elastic_ip'] = "monitor.megam.co.in"
node.set['rsyslog']['input']['files'] = [ "/var/log/upstart/nodejs.log", "/var/log/upstart/gulpd.log" ]
include_recipe "megam_logstash::rsyslog"
Attributes like the node name and log files are set dynamically.
For more details
http://www.gomegam.com
email : gomegam@megam.co.in
twitter : @megamsystems