Both ELK and Graylog use Elasticsearch for log storage and indexing.  Much of what makes Kibana and Graylog work so well for searching through logs is provided by Elasticsearch, with Graylog and Kibana acting as a GUI to interface with it.  Data is stored in indexes in Elasticsearch, and those indexes need to be maintained and managed.  How each product does that, though, is different.

For Elasticsearch to work efficiently, the size of its indexes needs to be kept down.  It is possible to put all your data inside a single index, but performance will suffer to the point that it is almost unusable.  Log retention also needs to be considered, since logs are only intended to be kept for some predetermined timeframe, after which they need to be purged.  It is possible to go through an index and delete old logs from it, but it is far easier and faster to create new indexes at a predefined interval and delete old indexes outright.  There are two major methods of Elasticsearch index management for logging.  One is alias/rollover and the other is the timestamp based index name.

Alias/rollover
This one is a combination of two separate concepts: an alias and the sequential rolling of an index.  An alias is just what it sounds like, a name that points to something else.  For this example let’s say you are writing to an alias called “logs”, or using Graylog terminology and naming scheme, “logs_deflector”.  This alias points to an index in Elasticsearch called “logs_1”.  After the index in Elasticsearch gets too big or too old, you create a new index called “logs_2” and point the alias to that new index.  The old index still exists; you are just writing to the new one now.  If you roll the index over once per day, it is pretty reasonable to assume that an index from 10 numbers ago contains logs that are 10 days old or older.  From this you can adopt a log retention policy that says you create a new index every day and keep 30 indexes (days), after which they are deleted.

There is one small issue that comes up with this method of indexing, and it is due to a loophole in the following phrase: “All logs written today are written to today’s index”.  What happens if you are receiving logs that are from a week ago?  From 30 days ago?  They get written to today’s index.  This means that the index a log lands in is a function of when the log was received, not when the log was generated.  This is by no means a deal killer, only something to keep in mind because it does have implications.
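
To make the behaviour concrete, here is a minimal in-memory sketch of an alias with rollover.  It does not talk to Elasticsearch at all; the `RolloverAlias` class and its method names are made up for illustration, and only the “logs_1” / “logs_2” naming comes from the example above.

```python
class RolloverAlias:
    """Toy model of an alias that always points at the newest numbered index."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.generation = 1
        self.indexes = {self._name(1): []}

    def _name(self, number):
        return f"{self.prefix}_{number}"

    @property
    def write_index(self):
        # The alias resolves to whichever index is the current generation.
        return self._name(self.generation)

    def write(self, log_timestamp, message):
        # The log's own timestamp plays no part in choosing the index.
        self.indexes[self.write_index].append((log_timestamp, message))

    def rollover(self):
        # Create the next index and re-point the alias; old indexes remain.
        self.generation += 1
        self.indexes[self.write_index] = []


alias = RolloverAlias("logs")
alias.write("2018-01-29", "written on day one")
alias.rollover()
alias.write("2018-01-30", "written on day two")
alias.write("2018-01-01", "late arrival, a month old")

for name, docs in alias.indexes.items():
    print(name, docs)
```

The month-old log ends up in “logs_2” next to the fresh one, which is exactly the loophole described above.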

Timestamp based name
This means that an index will be created for a time period with that time period in the name of the index itself, “logs-01.30.2018” for example.  Because each index is named for the time period it covers, a log can be written to the index that matches its own timestamp.  A log from 01.02.2018 will be written to the “logs-01.02.2018” index directly.
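
As a rough sketch, the routing rule is just a string format applied to the log's timestamp.  The `index_for` helper below is hypothetical and uses the “logs-MM.DD.YYYY” naming from the example above.

```python
from datetime import datetime


def index_for(log_time, prefix="logs"):
    """Return the index name for the day the log was generated."""
    return f"{prefix}-{log_time.strftime('%m.%d.%Y')}"


print(index_for(datetime(2018, 1, 2, 14, 30)))   # logs-01.02.2018
print(index_for(datetime(2018, 1, 30, 9, 0)))    # logs-01.30.2018
```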

Alias/rollover vs Timestamp based name
Graylog uses alias/rollover index management.  Logstash and Kibana will support either one, but the most commonly used method is the timestamp based name.

Timestamp based names allow for accurate and efficient deletion of logs for retention purposes, because logs are sorted into indexes by time period and the old indexes can simply be deleted.  Deleting a whole index to get rid of old logs is very quick and uses few resources, whereas going through every index and deleting the individual logs that have expired is very resource intensive.  With alias/rollover you can never be certain that old logs do not exist in new indexes, so there will likely always be some old logs that get retained for longer than intended.

Timestamp based names also make it much easier to manage the indexes from applications and scripts other than the application that created them.  Doing something to all indexes older than 30 days with a script or program is easier when the index has the date written into its name.  With the alias/rollover method you need to know the rotation schedule of the application that created the index, and which index is currently active, to work out how old each index is.  Alias/rollover indexes are best and most easily managed by the applications that created them.
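
Here is a small sketch of what such a script might look like.  The index names are assumed to follow the “logs-MM.DD.YYYY” pattern used earlier, and `indexes_older_than` is a hypothetical helper; a real script would get the list of names from Elasticsearch rather than a hard-coded list.

```python
from datetime import date, datetime, timedelta


def indexes_older_than(names, days, today, prefix="logs-", fmt="%m.%d.%Y"):
    """Return the index names whose embedded date is more than `days` old."""
    cutoff = today - timedelta(days=days)
    old = []
    for name in names:
        if not name.startswith(prefix):
            continue  # not one of our date-named indexes
        try:
            day = datetime.strptime(name[len(prefix):], fmt).date()
        except ValueError:
            continue  # suffix is not a date, so leave the index alone
        if day < cutoff:
            old.append(name)
    return old


names = ["logs-12.15.2017", "logs-01.02.2018", "logs-01.30.2018", "logs_7"]
print(indexes_older_than(names, 30, today=date(2018, 1, 31)))
```

Note that “logs_7” is skipped: nothing in a rollover-style name tells the script how old it is.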

 

Now a little about how each application deals with index management.

Graylog
Graylog uses the alias/rollover method as outlined above and handles index reads and index writes in distinct ways.  One important thing to remember about Graylog is that it stores logs in Elasticsearch, but everything else is kept in a MongoDB database.  The functions MongoDB provides are the secret sauce, so to speak.

For index writes Graylog uses the alias/rollover method for index management.  In Graylog terminology this is called an “Index Set”, and when you set one up you can specify a prefix for the name and the Elasticsearch index settings to use (shards, replicas, and segments).  Also configurable are the index rotation settings, which let you roll over an index after a certain number of messages, after it reaches an approximate size on disk, or after a specific amount of time.  Finally there are the retention settings, which define what to do with old indexes: delete, close, do nothing, or archive.

Let’s say we create an index set called “logs”.  Graylog will create an index in Elasticsearch called “logs_1” using the settings that were configured and then create an alias called “logs_deflector” that points at this new index.  When this index gets rotated or rolled over, Graylog will create an index called “logs_2” in Elasticsearch and then re-point the “logs_deflector” alias at this new index.  If you have a retention policy set to retain 20 indexes and then delete, once Graylog creates index “logs_21” it will go back and delete “logs_1”.
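
The rotate-then-retain cycle can be sketched in a few lines.  This is only a simulation of the behaviour described above, not Graylog code; the `rotate` function and the variable names are made up for illustration.

```python
def rotate(indexes, prefix="logs", max_indexes=20):
    """Create the next numbered index, then apply a delete retention policy.

    Returns the indexes that remain and the indexes that were deleted.
    """
    next_number = int(indexes[-1].rsplit("_", 1)[1]) + 1
    indexes = indexes + [f"{prefix}_{next_number}"]
    deleted = indexes[:-max_indexes] if len(indexes) > max_indexes else []
    return indexes[-max_indexes:], deleted


indexes = ["logs_1"]
deleted_total = []
for _ in range(20):            # twenty rotations: logs_2 through logs_21
    indexes, deleted = rotate(indexes)
    deleted_total += deleted

deflector_target = indexes[-1]  # the alias always points at the newest index
print(deflector_target, deleted_total)
```

After the twentieth rotation the alias points at “logs_21” and “logs_1” is the only index that has been deleted.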

Index reads are where Graylog starts to do interesting things.  In MongoDB, Graylog keeps track of which indexes hold which time ranges of logs.  Let’s say you want to see logs specifically for a 24 hour period that happened 7 days ago.  Graylog knows that those logs are in indexes “logs_4”, “logs_5”, and “logs_9”, so it will search only those indexes and not all of them.
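
The idea can be illustrated with a simple overlap check.  This is a sketch only: the `index_ranges` table, its dates, and the `indexes_for` helper are invented for the example and do not reflect Graylog's actual MongoDB schema or code.

```python
from datetime import datetime

# Hypothetical metadata: oldest and newest log timestamp held by each index.
index_ranges = {
    "logs_3": (datetime(2018, 1, 20, 0, 0), datetime(2018, 1, 22, 23, 59)),
    "logs_4": (datetime(2018, 1, 23, 0, 0), datetime(2018, 1, 24, 11, 0)),
    "logs_5": (datetime(2018, 1, 24, 11, 0), datetime(2018, 1, 25, 23, 0)),
    "logs_6": (datetime(2018, 1, 26, 0, 0), datetime(2018, 1, 27, 0, 0)),
    # logs_9 received some late-arriving logs, so its range starts early.
    "logs_9": (datetime(2018, 1, 24, 18, 0), datetime(2018, 1, 31, 0, 0)),
}


def indexes_for(query_start, query_end, ranges):
    """Return the indexes whose time range overlaps the query window."""
    return sorted(
        name
        for name, (begin, end) in ranges.items()
        if begin <= query_end and end >= query_start
    )


print(indexes_for(datetime(2018, 1, 24), datetime(2018, 1, 25), index_ranges))
```

Only the three overlapping indexes are returned, so the other indexes never need to be queried.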

ELK
The most common way Logstash writes indexes is the timestamp based name method, and it does this by way of variables.  In the Logstash config there will be a line in the Elasticsearch output plugin such as “index => ‘logstash-%{+YYYY.MM.dd}’”.  The part at the end is a variable based on the individual log’s timestamp.  If the log has a timestamp from 01.30.2018 then it will write the log to an index called “logstash-2018.01.30”, and if the log has a timestamp from 01.06.2018 it will write that log to an index called “logstash-2018.01.06”.

In Kibana you then create a wildcard index pattern of “logstash-*”, which will include all of those previously created indexes regardless of their date.
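
Both halves of this, the date-based name and the wildcard that gathers the indexes back up, can be mimicked in plain Python.  This is an approximation for illustration: `logstash_index` is a hypothetical helper that uses `strftime` in place of Logstash's `%{+YYYY.MM.dd}` date syntax, and `fnmatchcase` stands in for the wildcard matching.

```python
from datetime import datetime
from fnmatch import fnmatchcase


def logstash_index(log_time):
    """Approximate the name produced by 'logstash-%{+YYYY.MM.dd}'."""
    return f"logstash-{log_time.strftime('%Y.%m.%d')}"


existing = [
    logstash_index(datetime(2018, 1, 6)),
    logstash_index(datetime(2018, 1, 30)),
    "logs_1",     # a rollover-style index, not matched by the pattern
    ".kibana",
]

matched = [name for name in existing if fnmatchcase(name, "logstash-*")]
print(matched)
```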

This is, however, the extent of what Logstash and Kibana do.  Everything else is handled by two other things.  All index settings, the number of shards and replicas, and all of the field mapping types are configured in an index template mapping file that is uploaded to Elasticsearch.  All of the index management is handled by a tool from Elastic called Curator.

 

Graylog vs ELK : Initial setup
Graylog has all the index settings, the retention periods, and the retention actions configurable and able to be monitored in the GUI.  With ELK the same work is done through a variety of CLI and text based tools.

Graylog wins.  ELK cannot compete with the ease of use that Graylog has especially as it applies to beginners.

 

Graylog vs ELK : Reading from indexes
Graylog keeps metadata in MongoDB as to which index contains which timeframe of logs, so that when a search is made it only searches the Elasticsearch indexes that contain the data.  ELK does not.  It has built in circuit breakers that abort a search early in an index that does not contain logs from that timeframe, but it still attempts the search until the circuit breaker is tripped.

Graylog wins.  The MongoDB metadata is pretty ingenious.

 

Graylog vs ELK : Writing to indexes and index naming
Graylog only supports the alias/rollover method of index management.  ELK supports either.  Writing logs to indexes named after the time range that the log occurred in (not received in) allows for more efficient use of space by not retaining old logs longer than necessary, and the timestamp based name format also opens the door to easier integration with outside scripts and programs that manipulate and manage the indexes.  But if necessary, ELK can use the alias/rollover method of naming indexes.

ELK wins.  The lack of features and options in Graylog makes large and complex environments more complicated to manage.

 

Graylog vs ELK: Index rotation and management
Graylog will allow for index rotation to be configured based on number of messages, size of index, or a timeframe.  The actions that can be taken are to delete the index, close the index, or do nothing (not including the archiving feature, since it is a paid feature of the enterprise version).

ELK uses Curator to do this.  It does everything Graylog does, plus the following:

  • Alias management.  For example, create a “logs” alias for any indexes newer than 7 days and a “logs-all” alias for all indexes, and update the index list of each alias daily.
  • Allocation management.  For example, in your Elasticsearch cluster you have very fast data nodes with SSD disks and cheaper, slower data nodes with huge SATA disks.  All new indexes are written to SSD for fast indexing, and then allocation management can move all indexes that are older than 7 days to the SATA data nodes.  This greatly reduces the overall cost of storing many months’ worth of logs.
  • Snapshot management (archiving).  Create, delete, and restore Elasticsearch snapshots on a schedule.
  • Open indexes back up.  Both Curator and Graylog can close indexes; Curator can additionally open them back up.
  • Reindex a source index into a destination index.  Useful for consolidation of indexes after a certain time period.
  • Change replica settings.  One example would be to index everything at replica=0 for very fast indexing, but then increase to replica=1 after the first day.  Another example would be to keep 30 days of logs at replica=1 and then the 60 days of logs after that at replica=0.  Reduce the replica from 1 to 0 and you double the amount of available disk space to store logs.  Useful for ‘nice to have’ logs that have no requirement to be kept, but it’s nice to have as many of them as possible.
  • Index rollover for the alias/rollover method.
  • Shrink indexes.  Automate the process of reducing the shard count by shrinking an index, which in effect combines multiple shards into fewer shards.  You can index at 8 shards for indexing speed but then shrink to 1 shard for better search performance.
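
To show how a few of these actions combine into an age-based policy, here is a small sketch.  It is not Curator's API or configuration format; `plan_index` and `disk_footprint_gb` are hypothetical helpers, and the 7, 30, and 90 day thresholds are simply taken from the examples in the list above.

```python
def plan_index(age_days):
    """Decide what should happen to an index of the given age in days."""
    if age_days > 90:
        return {"action": "delete"}
    return {
        "action": "keep",
        # New indexes live on SSD nodes, older ones move to SATA nodes.
        "node_type": "ssd" if age_days <= 7 else "sata",
        # One replica for the first 30 days, none for the 60 days after.
        "replicas": 1 if age_days <= 30 else 0,
    }


def disk_footprint_gb(primary_gb, replicas):
    """Total disk used: the primary copy plus one full copy per replica."""
    return primary_gb * (1 + replicas)


for age in (3, 20, 45, 120):
    print(age, plan_index(age))

print(disk_footprint_gb(100, 1), disk_footprint_gb(100, 0))
```

The last line shows why dropping from replica=1 to replica=0 doubles the usable space: the same 100 GB of logs goes from occupying 200 GB to 100 GB.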

It is possible to use Curator with Graylog created indexes, but it is a lot more difficult, in part because of the way the indexes are named.  Additionally, any changes to Graylog indexes made by anything but Graylog will require a re-sync of the index metadata in Graylog.

ELK wins.  The whole ELK stack and its index naming schemes are meant to work hand in hand with Curator.  Graylog can be made to work with Curator, but it is not the preferred or common way to do it.

 

Graylog vs ELK : Ease of management
As long as you do it the way Graylog does it best, everything you need is in the GUI and there are no surprises.  With ELK it is easy to manage once you learn how to use Logstash and Curator, but until then it is confusing.

Graylog wins.  The GUI is hard to compete with.

 

Graylog vs ELK : Knowledge of Elasticsearch
Elasticsearch is a major part of both of these products and is a very powerful tool once learned.  Graylog hides much of Elasticsearch behind its own GUI, so there is less pressure on an administrator to learn Elasticsearch.  With ELK, though, Elasticsearch is in the forefront and forces an administrator to learn it.  Graylog makes Elasticsearch ‘easy’ enough that it is possible to administer Graylog without knowing anything about Elasticsearch... everything works well until it doesn’t.  But then again, that is the point of Graylog.

 

 

Summary
So which one is better?  Which one wins?  It depends.

If you want the easy button, something easy to manage and quick to set up, Graylog is hard to beat.  The index management is easy and it is configured by a very clean GUI.  It works well enough and you will not be disappointed.

If you need features that Graylog does not natively provide, then ELK becomes more and more of a better option.  Large deployments usually have complex requirements, and ELK is better able to deliver a fitting solution for those requirements.  Do not underestimate the power of Elastic Curator for index management in Elasticsearch.  With very large Graylog environments it’s not uncommon to see both Graylog and ELK running, or Logstash processing logs and sending them to Graylog, or Graylog processing logs with those indexes then also added to Kibana for viewing.  Index management is no different: Elastic Curator can manage Graylog indexes as well.

But if you are designing from scratch, take your final design goals into consideration.  End users and administrators can be taught either Graylog or ELK with the same amount of effort, and if your requirements are going to dictate one over the other, then work towards that final design.
