Post on 11-Aug-2014
What to stream
You can stream the following from the cloud:
● Traces (Logs)
● Metrics
● Monitoring
● Status
Scenario: App/service runs in the cloud.
We need the log files of your App:
● Web server logs, container logs, app logs
We need the log files of your Service:
● Service logs
Choices for streaming
● Logstash : logstash.net/
● Fluentd : www.fluentd.org/
● Beaver : github.com/josegonzalez/beaver
● Logstash-Forwarder : github.com/elasticsearch/logstash-forwarder
● Woodchuck : github.com/danryan/woodchuck
● RSYSLOG : http://rsyslog.com
● Heka : http://hekad.readthedocs.org/en/latest/
| Name               | Language    | Collector | Shipper | Footprint | Ease of setting up    |
|--------------------|-------------|-----------|---------|-----------|-----------------------|
| Logstash           | JRuby (JVM) | Yes       | No      | High      | Easy                  |
| Fluentd            | Ruby        | Yes       | No      | High      | Easy                  |
| Beaver             | Python      | No        | Yes     | Low       | Easy                  |
| Logstash-Forwarder | Go          | No        | Yes     | Low       | Difficult (uses SSL)  |
| Woodchuck          | Ruby        | No        | Yes     | High      | Easy                  |
| RSYSLOG            | C           | Yes       | Yes     | Low       | Difficult             |
| Heka               | Go          | Yes       | Yes     | Low       | Easy                  |
Our requirements
2 sets of logs to collect:
● All the trace produced when the VM is spun up.
● All the trace inside the VM from the application or service.
Publish them to an in-memory store (queue) which can be accessed by a key.
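The requirement above can be sketched minimally like this. It is a toy model, not the real implementation: Python's stdlib `queue.Queue` stands in for the broker, and the key names are illustrative.

```python
import queue

# In-memory store: one queue per key (e.g. per VM / domain name).
# A real deployment would use a broker such as RabbitMQ instead.
store = {}

def publish(key, message):
    """Append a log message to the queue registered under `key`."""
    store.setdefault(key, queue.Queue()).put(message)

def consume(key):
    """Drain and return all messages currently queued under `key`."""
    q = store.get(key)
    items = []
    while q is not None and not q.empty():
        items.append(q.get())
    return items

publish("pogo.domain.com", "howdy")
publish("pogo.domain.com", "howdy_err")
```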
[Diagram] megamd writes per-VM logs (howdy.log, howdy_err.log) under /usr/share/megam/megamd/logs, one directory per domain (fir.domain.com, doe.domain.com, gir.domain.com, her.domain.com). A shipper agent publishes each directory's logs over AMQP to its own queue (Queue#1 … Queue#4).
How does it work?
Heka resides inside our Megam Engine (megamd). Its job is to collect the trace information when a VM is run:
1. Read the dynamically created VM execution log files.
2. Format the log contents as JSON for every VM execution.
3. Publish the log contents to a queue.
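The three steps above can be sketched roughly as follows. This is a toy version, not Heka itself; the field names mirror the JSON output shown later in this post, while the paths and the queue stub are purely illustrative.

```python
import glob
import json
import socket
from datetime import datetime, timezone

def collect(log_glob="/usr/share/megam/megamd/logs/*/*"):
    """Step 1: read the dynamically created VM execution log files."""
    for path in glob.glob(log_glob):
        with open(path) as fh:
            yield path, fh.read()

def to_json(path, payload):
    """Step 2: format the log contents as JSON for a VM execution."""
    return json.dumps({
        "Timestamp": datetime.now(timezone.utc).isoformat(),
        "Type": "logfile",
        "Logger": path,
        "Payload": payload,
        "Hostname": socket.gethostname(),
    })

def publish(channel, message):
    """Step 3: publish the log contents to a queue (stubbed as a list)."""
    channel.append(message)

queue_stub = []
publish(queue_stub, to_json("/tmp/howdy.log", "TEST\n"))
```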
Beaver resides in each of the VMs. It performs the following steps:
1. Read the log files inside the VM.
2. Format the log contents as JSON.
3. Publish the log contents to a queue.
Logstash
● A centralized logging framework that can transfer logs from multiple hosts to a central location.
● Written in JRuby, hence it needs a JVM, and the JVM consumes a lot of memory.
● Logstash is ideal as a centralized collector, not as a shipper.
Logstash Shipper Scenario
Let us ship logs from a VM, /usr/share/megam/megamd/logs/*/*, to Redis or AMQP.
e.g.:
../megamd/logs/pogo.domain.com/howdy.log → queue named “pogo.domain.com” in AMQP.
../megamd/logs/doe.domain.com/howdy.log → queue named “doe.domain.com” in AMQP.
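The mapping in those examples is simply "queue name = name of the directory containing the log file". A minimal sketch of that rule (the paths are the examples from above):

```python
from pathlib import Path

def queue_name(log_path):
    """Derive the AMQP queue name from the log path:
    .../megamd/logs/<domain>/<file>.log -> "<domain>"."""
    return Path(log_path).parent.name

name = queue_name("/usr/share/megam/megamd/logs/pogo.domain.com/howdy.log")
```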
Logstash Shipper - Sample conf
input {
  file {
    type => "access-log"
    path => [ "/usr/local/share/megam/megamd/logs/*/*" ]
  }
}
filter {
  grok {
    type => "access-log"
    match => [ "@source_path", "(//usr/local/share/megam/megamd/logs/)(?<source_key>.+)(//*)" ]
  }
}
output {
  stdout {
    debug => true
    debug_format => "json"
  }
  redis {
    key => '%{source_key}'
    type => "access-log"
    data_type => "channel"
    host => "my_redis_server.com"
  }
}
Logs inside the <source_key> directory are shipped to the Redis key named <source_key>.
/opt/logstash/agent/etc$ sudo cat shipper.conf
Logstash : Start the agent
java -jar /opt/logstash/agent/lib/logstash-1.4.2.jar agent -f /opt/logstash/agent/etc/shipper.conf
If you don’t have a JRE, then:
sudo apt-get install openjdk-7-jre-headless
Heka
● Mozilla uses it internally.
● Written in Go, so it runs as a native binary.
● Ideal as both a centralized collector and a shipper.
● We picked Heka.
● Our modified version: https://github.com/megamsys/heka
Installation
Download the deb from https://github.com/mozilla-services/heka/releases (or) build from source:
git clone https://github.com/megamsys/heka.git
cd heka
source build.sh
cd build
make deb
dpkg -i heka_0.6.0_amd64.deb
Heka configuration
nano /etc/hekad.toml
[TestWebserver]
type = "LogstreamerInput"
log_directory = "/usr/share/megam/heka/logs/"
file_match = '(?P<DomainName>[^/]+)/(?P<FileName>[^/]+)'
differentiator = ["DomainName", "_log"]
[AMQPOutput]
url = "amqp://guest:guest@localhost/"
exchange = "test_tom"
queue = true
exchangeType = "fanout"
message_matcher = 'TRUE'
encoder = "JsonEncoder"
[JsonEncoder]
fields = [ "Timestamp", "Type", "Logger", "Payload", "Hostname" ]
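The file_match pattern above uses named capture groups to split each stream's path into a domain and a file name; the differentiator then joins the DomainName group with "_log" to name the logger. Since Python's re module shares the (?P<name>...) syntax with Go's regexp, the pattern can be checked like this (the sample path is illustrative):

```python
import re

# Same pattern as file_match in the TOML config above.
file_match = re.compile(r'(?P<DomainName>[^/]+)/(?P<FileName>[^/]+)')

m = file_match.match("tom.com/howdy.log")
# differentiator = ["DomainName", "_log"] yields a logger name like "tom.com_log"
logger = m.group("DomainName") + "_log"
```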
Run heka
sudo hekad -config="/etc/hekad.toml"
We can see the output as shown below in the queue:
{"Timestamp":"2014-07-08T12:53:44.004Z","Type":"logfile","Logger":"tom.com_log","Payload":"TEST\u000a","Hostname":"alrin"}
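A consumer of that queue can decode the message and recover the domain by stripping the "_log" suffix the differentiator added. A small sketch, using the sample message above:

```python
import json

# Sample message as it appears in the queue ("\u000a" is a JSON-escaped newline).
raw = ('{"Timestamp":"2014-07-08T12:53:44.004Z","Type":"logfile",'
       '"Logger":"tom.com_log","Payload":"TEST\\u000a","Hostname":"alrin"}')

msg = json.loads(raw)
domain = msg["Logger"].removesuffix("_log")
```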
Beaver
● Beaver is a lightweight Python log file shipper used to send logs to an intermediate broker for further processing.
● Beaver is ideal when the VM does not have enough memory for a large JVM application to run as a shipper.
Our Beaver usage
[Diagram] A Beaver agent runs inside each VM (VM#1 … VM#n) and ships that VM's logs to RabbitMQ queues; Heka inside the Megam Engine (megamd) ships the engine's logs to the same RabbitMQ, from which a realtime streamer consumes.
Chef Recipe : Beaver
When a VM is run, the recipe (megam_logstash::beaver) is included.
node.set['logstash']['key'] = "#{node.name}"
node.set['logstash']['amqp'] = "#{node.name}_log"
node.set['logstash']['beaver']['inputs'] = [ "/var/log/upstart/nodejs.log", "/var/log/upstart/gulpd.log" ]
include_recipe "megam_logstash::beaver"
Attributes like the node name and log files are set dynamically.
RSYSLOG
RSYSLOG is a rocket-fast system for log processing. It offers high performance, great security features, and a modular design.
Megam uses RSYSLOG to ship logs from VMs to Elasticsearch.
Chef Recipe : Rsyslog
When a VM is run, the recipe (megam_logstash::rsyslog) is included.
node.set['rsyslog']['index'] = "#{node.name}"
node.set['rsyslog']['elastic_ip'] = "monitor.megam.co.in"
node.set['rsyslog']['input']['files'] = [ "/var/log/upstart/nodejs.log", "/var/log/upstart/gulpd.log" ]
include_recipe "megam_logstash::rsyslog"
Attributes like the node name and log files are set dynamically.
For more details
http://www.gomegam.com
email : gomegam@megam.co.in
twitter : @megamsystems