Graylog Engineering - Design Your Architecture

GRAYLOG ENGINEERING - DESIGN YOUR ARCHITECTURE.

README.

This is not a guide for the squeamish.

This is a peek for those who like to go off the beaten path, sometimes alone.

For those who aren’t afraid of pulling open the hood and getting their hands dirty.

This is the culmination of five years of engineering design in our hope to bring you the fastest machine data processing engine on the planet.

Don’t call your sales rep, they won’t know the answers.

-GRAYLOG ENGINEERING

[Architecture diagram: components numbered 1 through 9, keyed to the legend that follows.]

LEGEND:

1 & 2. LOG MESSAGES & LOAD BALANCER.

3. TRANSPORT LAYER.

4. PROCESSING CHAIN.

4½. REST API.

5. MONGODB REPLICA SET.

6. ELASTICSEARCH CLUSTER.

7. ANATOMY OF A SINGLE INDEX.

8. INDEX MODEL.

9. DEFLECTOR QUEUE.

1 & 2, LOG MESSAGES & LOAD BALANCER.

tl;dr

We’re not going to spend any time here. Basically, send us any machine data (structured or not) and use whatever load balancer you like.

The number of messages, their peak rates, their average size and the extractions performed on them will all affect performance, but we’ll cover that later.

3, TRANSPORT LAYER.

These are the inputs and the journal that sit on top of the Graylog server. The inputs receive messages from the message cloud (our syslog stream, as well as other inputs). Each message is pre-processed into its parts; this step is not user-configurable.

While the journal is on disk (I/O), it is an *append only* journal where there is no seek time. (Internally we re-use Apache Kafka code to do this - thanks LinkedIn). The write “needle” is always close to the same point on the disk so it does not constantly scan. This makes it blazing fast. You can turn it off, but we do not recommend it.
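To make the append-only idea concrete, here is a minimal sketch in Python. It is not Graylog's journal (which reuses Kafka code); the class name, the length-prefixed record format and the file name are all assumptions made for illustration. The point it shows is that every write lands at the end of one file, and readers walk forward from an offset.

```python
import os
import struct
import tempfile


class AppendOnlyJournal:
    """Toy append-only journal (hypothetical): writes only go to the end of one file."""

    _HEADER = struct.Struct(">I")  # 4-byte big-endian length prefix (assumed format)

    def __init__(self, path):
        self.path = path
        # "ab" opens in append mode, so every write lands at the end of the file.
        self._writer = open(path, "ab")

    def append(self, payload: bytes) -> None:
        self._writer.write(self._HEADER.pack(len(payload)) + payload)
        self._writer.flush()

    def read_from(self, offset: int = 0):
        """Yield (next_offset, payload) pairs, reading forward from a byte offset."""
        with open(self.path, "rb") as reader:
            reader.seek(offset)
            while True:
                header = reader.read(self._HEADER.size)
                if len(header) < self._HEADER.size:
                    return  # reached the end of the journal
                (length,) = self._HEADER.unpack(header)
                payload = reader.read(length)
                yield reader.tell(), payload

    def close(self) -> None:
        self._writer.close()


def demo_journal():
    """Append two messages to a temporary journal and read them back in order."""
    with tempfile.TemporaryDirectory() as tmp:
        journal = AppendOnlyJournal(os.path.join(tmp, "journal.log"))
        journal.append(b"<34>Oct 11 22:14:15 host app: first")
        journal.append(b"<34>Oct 11 22:14:16 host app: second")
        records = [payload for _, payload in journal.read_from(0)]
        journal.close()
        return records
```

A consumer that remembers the last `next_offset` it saw can resume from that point after a restart, which is what lets a journal like this absorb spikes without dropping messages.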

Why we did this: Other systems do not have this, so they will lose messages coming in when message spikes happen because the network layer will start to reject them or your local memory will explode.

4, PROCESSING CHAIN.

Messages are read from the journal and written into the process buffer, which is a ring buffer. We use the Disruptor library from LMAX, a high-speed trading company that depends on high throughput and low latency.

Messages are then handled by the process buffer processor, where stream routing and field extraction happen. This part can get CPU intensive! The filtered message then goes into the output buffer (another ring buffer), then to the output buffer processor, and onwards to Elasticsearch (ES) or a user-defined output.
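The chain above can be sketched in a few lines of Python. This is a single-threaded toy, not the Disruptor and not Graylog's code: `RingBuffer`, `extract_fields` and `run_chain` are hypothetical names, and the "key=value" extractor is an assumption standing in for real extractors. It shows the two properties that matter: slots are preallocated and reused, and a full buffer pushes back on the producer instead of growing.

```python
class RingBuffer:
    """Fixed-size ring (illustrative): slots are preallocated and reused, never grown."""

    def __init__(self, size: int):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self._slots = [None] * size
        self._mask = size - 1
        self._write_seq = 0  # next sequence number to publish
        self._read_seq = 0   # next sequence number to consume

    def is_full(self) -> bool:
        return self._write_seq - self._read_seq == len(self._slots)

    def is_empty(self) -> bool:
        return self._write_seq == self._read_seq

    def publish(self, item) -> bool:
        if self.is_full():
            return False  # producer must back off; nothing is overwritten
        self._slots[self._write_seq & self._mask] = item
        self._write_seq += 1
        return True

    def consume(self):
        if self.is_empty():
            return None
        item = self._slots[self._read_seq & self._mask]
        self._read_seq += 1
        return item


def extract_fields(raw: str) -> dict:
    """Hypothetical extractor: turn 'key=value' tokens into message fields."""
    fields = dict(tok.split("=", 1) for tok in raw.split() if "=" in tok)
    return {"message": raw, **fields}


def run_chain(raw_messages, ring_size=4):
    """Move messages through process buffer -> extraction -> output buffer -> output."""
    process_buffer = RingBuffer(ring_size)
    output_buffer = RingBuffer(ring_size)
    written = []  # stands in for Elasticsearch or another output
    pending = list(raw_messages)
    while pending or not process_buffer.is_empty() or not output_buffer.is_empty():
        # Fill the process buffer until it reports full.
        while pending and process_buffer.publish(pending[0]):
            pending.pop(0)
        # Process buffer processor: extract fields, hand off to the output buffer.
        while not process_buffer.is_empty() and not output_buffer.is_full():
            output_buffer.publish(extract_fields(process_buffer.consume()))
        # Output buffer processor: drain to the output.
        while not output_buffer.is_empty():
            written.append(output_buffer.consume())
    return written
```

In a real deployment the processors run on separate threads, so a slow extraction stage shows up as a process buffer that stays full.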

ProTip: Tuning the number of processors per buffer is important, and the number should never exceed the number of CPU cores available to graylog-server. If throughput is too low, increase the number of processors, and focus on the process buffer processors first, because the output buffer usually does not need many. Full buffers are a symptom of too few processors.
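As a rough sanity check of that rule, the sketch below flags a planned split that does not fit the machine. The function name and parameters are hypothetical, and it assumes the core limit applies to the combined total of both buffers, which is one reading of the tip rather than a documented rule.

```python
import os


def check_processor_allocation(process_procs, output_procs, cpu_cores=None):
    """Return warnings for a planned processor split (hypothetical helper)."""
    # Fall back to the local core count when none is given.
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    warnings = []
    total = process_procs + output_procs
    # Assumption: the limit applies to the combined total of both buffers.
    if total > cores:
        warnings.append(f"{total} processors exceed {cores} available cores")
    # The output buffer usually needs fewer processors than the process buffer.
    if output_procs > process_procs:
        warnings.append("output buffer has more processors than the process buffer")
    return warnings
```

For example, `check_processor_allocation(5, 3, cpu_cores=8)` returns an empty list, while a 6 + 4 split on the same machine is flagged.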

https://lmax-exchange.github.io/disruptor/

https://github.com/Graylog2/graylog2-server/blob/master/graylog2-shared/src/main/java/org/graylog2/shared/buffers/processors/ProcessBufferProcessor.java

4½, REST API.

Why is this different from any other REST API?

This is the same API we use on our web front end, hence you can make any read/write call we do in your own UI. Yup, you can build your own front end.

Also, it has to be high quality, because this is the same API we use ourselves day to day. It is not like others where it is just an API that is provided for external users to integrate with, built once and patched with duct tape every release. Not that we don’t like duct tape….

5, MONGO.

Then there is Mongo, which stores only metadata: users, settings and configuration data for all items (streams, dashboards, extractors, etc.). Anything you configure. If Mongo goes down, Graylog will continue to run. So, it is your choice whether to include it in a high availability design.

For HA scenarios, Mongo recommends running three instances. If the primary goes down, the remaining members have to elect a new primary, and an election needs a majority of the members. With only two instances, the survivor cannot form a majority on its own. See the Mongo replica set documentation for instructions.

https://docs.mongodb.org/v3.0/core/replication-introduction/
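The arithmetic behind the three-instance recommendation is short enough to write down. This sketch assumes every member votes; the function names are ours.

```python
def majority(members: int) -> int:
    """Votes needed to elect a primary when every member votes."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while an election can still succeed."""
    return members - majority(members)
```

With two members, `majority(2)` is 2, so losing either one leaves no electable primary. With three, `majority(3)` is still 2, so the set survives the loss of one member.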

6, ELASTICSEARCH CLUSTER.

We connect to ES servers as an embedded ES node that does not store data. So, we look and act like an ES node, and know about configuration data (indexes, shards, etc) for each ES server.

When you write to ES without being a node, you have to encode each message as JSON, transmit it over the wire via HTTP, and then have it decoded on the other side. As a node, you can send it in the native format, and it is fast.

For HA, we recommend having at least one replica configured.

7, ANATOMY OF AN INDEX.

A single index (in this example, Graylog index #25) is broken into shards. This means the index is split up and the parts live on different ES nodes. This makes for faster searches because the query result can be computed on multiple ES nodes in parallel.

An index can also have replicas configured. This means that each shard is mirrored to other nodes, which is great for HA.
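The sharding idea can be sketched as a scatter-gather search. This is an illustration only: `zlib.crc32` is a stand-in for Elasticsearch's own routing hash, the function names are hypothetical, and Python threads here show the shape of the fan-out rather than real multi-node parallelism.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard by hashing the document id (crc32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def build_index(docs: dict, num_shards: int):
    """Distribute documents across shards; each document lands in exactly one."""
    shards = [[] for _ in range(num_shards)]
    for doc_id, text in docs.items():
        shards[shard_for(doc_id, num_shards)].append((doc_id, text))
    return shards


def search(shards, term: str):
    """Query every shard concurrently, then merge the partial results."""

    def search_shard(shard):
        return [doc_id for doc_id, text in shard if term in text]

    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial_results = list(pool.map(search_shard, shards))
    return sorted(doc_id for hits in partial_results for doc_id in hits)
```

Replicas would add a second copy of each shard on another node; a search can then be served by either copy, and the index survives the loss of a node.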

8, INDEX MODEL.

Indexes are numbered, starting with 0. In a time series database, all data is stored with a time stamp, and once stored it is never rewritten (hence an index is marked READ_ONLY or WRITE_ACTIVE, for performance). So, messages are never re-inserted into older indexes. This makes it fast. Because storage is time based, every query must also be bounded in time (e.g. the last hour).

Pro Tip: The size of these indexes matters when performance tuning. You don’t want to make the indexes too big, because then each search has to scan far more data than it needs, and you don’t want them too small, because then a search has to touch many indexes. Size the indexes based on the amount of data you have and how far back you normally search. Some people also use Graylog for longer, historical, strategic searches. It is important to know your usage and size the indexes accordingly.
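Here is a small sketch of why a time bound and index size interact. It assumes each index records the time range of the messages it holds; the tuple layout, the `graylog_N` names and the one-index-per-day rotation are illustrative assumptions, not a description of the internal format.

```python
from datetime import datetime, timedelta


def daily_ranges(count: int, first_day: datetime):
    """Build (name, begin, end) tuples for `count` indexes, one per day (assumed rotation)."""
    return [
        (f"graylog_{i}", first_day + timedelta(days=i), first_day + timedelta(days=i + 1))
        for i in range(count)
    ]


def indices_for_range(index_ranges, start: datetime, end: datetime):
    """Return the names of indexes whose time range overlaps [start, end]."""
    return [
        name
        for name, begin, finish in index_ranges
        if begin <= end and finish >= start
    ]
```

A one-hour search late on the third day only has to touch the newest index, while a search spanning midday to midday touches two. Smaller indexes narrow each search but multiply the number of indexes a long search has to visit.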

9, DEFLECTOR.

We write to an index alias called ‘deflector’ that can be atomically switched to a new index. This allows us not to worry about having to stop message processing when creating a new index because that is error-prone to manage (oh, index #25 is now closed, ahh wait, okay the next one is #26, go ahead and write).
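The alias switch can be sketched as follows. Elasticsearch's aliases API accepts a list of actions that are applied together, and `alias_switch_body` builds a request body of that shape. The `AliasTable` class is a hypothetical in-memory model, not Graylog or ES code, and the index names `graylog_25` and `graylog_26` are only examples.

```python
def alias_switch_body(alias: str, old_index: str, new_index: str) -> dict:
    """Build an aliases request body that moves `alias` in one step."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


class AliasTable:
    """Hypothetical in-memory model: all actions in a body apply, or none do."""

    def __init__(self):
        self._aliases = {}

    def apply(self, body: dict) -> None:
        updated = dict(self._aliases)  # work on a copy
        for action in body["actions"]:
            op, args = next(iter(action.items()))
            if op == "remove":
                if updated.get(args["alias"]) != args["index"]:
                    raise ValueError("alias does not point at that index")
                del updated[args["alias"]]
            elif op == "add":
                updated[args["alias"]] = args["index"]
            else:
                raise ValueError(f"unknown action: {op}")
        self._aliases = updated  # single swap: writers never see a half-applied state

    def resolve(self, alias: str) -> str:
        return self._aliases[alias]
```

Writers keep resolving the alias (here `deflector`) and never need to know that the index behind it changed from #25 to #26.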

Why are we telling you this? Because, well, it’s these kinds of things that make us different. We are proud of thinking about all the small things that give you great performance and stability, and we hope you have enjoyed reading this as much as we enjoyed writing it.

Date post:	09-Jan-2017
Category:	Technology
Upload:	graylog