Logstash

Logstash is also an open-source tool. It provides an integrated framework for

  • log collection
  • analysis of a large variety of structured and unstructured data
  • parsing
  • centralization of data

Therefore, we can use Logstash to parse both multiline and single-line logs, including common formats like JSON and syslog, as well as custom log formats. Logstash is relatively easy to set up in large environments. It is designed to efficiently and flexibly process logs, events, and unstructured data sources for distribution into a variety of outputs, and it can easily be customized via plugins for input, output, and data filtering. It supports:

1. Centralized data processing

Logstash uses a data pipeline that centralizes data processing. Through its collection of input and output plugins, it can convert many different input sources into a single common format.
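
As a minimal sketch of this idea (the file path, port, and Elasticsearch host below are placeholders invented for the example, not values from this article), two unrelated sources can feed the same pipeline and land in the same store:

input {
  # Tail an application log file
  file {
    path => "/var/log/app/app.log"
  }
  # Accept events over a raw TCP socket as well
  tcp {
    port => 5000
  }
}
output {
  # Events from both inputs arrive here in the same common format
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}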

2. Custom log formats

Logs generated by different applications usually have formats specific to the application. Logstash helps to parse and process such custom formats on a large scale: it provides ready-to-use filters as well as support for writing custom filters for tokenization, as sketched below.
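
For instance (the log line layout and field names are invented for illustration), a grok filter can tokenize a custom application line such as "2016-03-01 10:15:00 WARN billing request timed out":

filter {
  grok {
    # Each %{PATTERN:field} pair captures one token into a named field
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{WORD:component} %{GREEDYDATA:detail}" }
  }
}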

3. Plugin development

It is possible to develop and publish custom plugins, and as a result a large variety of community-developed plugins is already available.

Logstash is written in Ruby and runs on JRuby. It is easy to deploy because it is a JVM-based utility: a standalone jar file contains both a graphical user interface and an embedded Elasticsearch engine, so it can be started directly on a JVM. It is most commonly used to index data in Elasticsearch.
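
For example, assuming a pipeline definition saved as logstash.conf, Logstash is typically started from its installation directory with:

bin/logstash -f logstash.conf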

Conceptualization of Logstash

Logstash's event processing pipeline is built from three main plugin types:

  1. Input
  2. Filter
  3. Output

All of the plugins mentioned above are used within Logstash's event processing pipeline. Two special kinds of filters used in Logstash are:

  • Grok
  • Mutate

Another notable plugin is Elasticsearch, a special type of output plugin. Plugins are specified within Logstash's configuration file. Logstash has two host classes, as below:

  1. Central server
  2. Event forwarder

Furthermore, Logstash consists of four ecosystem components:

  1. Web interface
  2. Broker and indexer
  3. Search and storage
  4. Shipper

All of the components mentioned above (except the Shipper) run on central servers and belong to the Central server host class, while the Shipper corresponds to the Event forwarder.

Key features of Logstash

The key features of Logstash are described below.

1. Event processing pipeline

The event processing pipeline has three phases: collection of events from various input sources such as syslog, filtering and parsing of those events, and forwarding of the parsed events to various outputs such as Elasticsearch or Cassandra (Foundation, 2016) in order to store them there.


2. Configuration file

The logstash.conf file contains the configuration for Logstash. The Logstash configuration file uses a custom JSON-like language in which the inputs and the outputs have to be specified, whereas the filter part is optional. Below is the most basic form of a configuration:

input { ... }
filter { ... }
output { ... }
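
As a concrete sketch (the file path is a placeholder chosen for this example), a complete minimal configuration that reads a file, parses each line as syslog, and indexes the result into Elasticsearch might look like:

input {
  file {
    path => "/var/log/syslog"
    # Read the existing file content, not only newly appended lines
    start_position => "beginning"
  }
}
filter {
  grok {
    # SYSLOGLINE is a ready-made pattern shipped with the grok filter
    match => { "message" => "%{SYSLOGLINE}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}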

3. Plugin

Logstash’s plugin architecture provides an easy way to extend functionality in each phase, using plugins that cover a large range of inputs and outputs. Hence it can be used to create flexible event processing services.

4. Input

Input plugins are the mechanism for passing log data to Logstash, and Logstash supports a wide variety of them. Input plugins collect the logs and forward the collected events to the filter phase. Logstash provides 49 official inputs developed by the Logstash team, and there are also many input plugins developed by the Logstash community and released for public use. The list below gives a few available input plugins and their descriptions.

  • file: The file input streams events from files, normally by tailing them and optionally reading them from the beginning.
  • jdbc: This input plugin was created as a way to ingest data from any database with a JDBC interface into Logstash.
  • lumberjack: The lumberjack input receives events using the lumberjack protocol. It is mainly used to receive events shipped over that protocol, primarily via the Logstash Forwarder.
  • kafka: This input reads events from a Kafka topic. It utilizes the high-level consumer API provided by Kafka to read messages from the broker.
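
As a sketch of how two of these inputs could be declared side by side (the path, topic name, and broker address are placeholders, and exact option names vary between plugin versions; these follow the newer Kafka input):

input {
  # Tail a web server access log
  file {
    path => "/var/log/nginx/access.log"
  }
  # Consume events from a Kafka topic
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["app-logs"]
  }
}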

5. Filter

Filter plugins process the events that come from input plugins. Logstash contains a large collection of filters that allow flexible recognition, filtering, parsing, and conversion of events before they are pushed to an output destination. The list below gives a few available filter plugins and their descriptions.

  • mutate: The mutate filter allows us to perform general mutations on fields, e.g. rename, remove, replace, and modify fields of events.
  • csv: This filter takes an event field containing CSV data, parses it, and stores it as individual fields. The csv filter can also parse data with any separator, not only commas.
  • grok: The grok filter parses arbitrary text and structures it. Grok is currently the best way in Logstash to parse unstructured log data into something structured and queryable.
  • date: The date filter is used for parsing dates from fields, and then using that date or timestamp as the Logstash timestamp for the event.
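
Filters compose into a chain. A sketch combining three of the filters above (the pattern and field names are invented for the example):

filter {
  # grok tokenizes the raw line into named fields
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:logdate} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
  # date promotes the parsed time to the event's @timestamp
  date {
    match => [ "logdate", "ISO8601" ]
  }
  # mutate renames one field and drops the now-redundant raw date
  mutate {
    rename => { "level" => "severity" }
    remove_field => [ "logdate" ]
  }
}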

6. Output

Finally, after passing through the filters, the event moves to the output phase. The output is the final phase of Logstash's event processing pipeline and defines how Logstash interacts with the storage subsystem; it indicates where the resulting data structure is sent. An event can pass through multiple outputs during processing. Logstash provides many output plugins. The list below gives a few available output plugins and their descriptions.

  • elasticsearch: The Elasticsearch plugin is the recommended method of storing logs in Elasticsearch.
  • kafka: The Kafka plugin writes events to a Kafka topic. It uses the Kafka Producer API to write messages to a topic on the broker.
  • stdout: This plugin is a simple output that prints to the STDOUT of the shell running Logstash.
  • email: This plugin sends an email when an event is received.
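
Since an event can pass through multiple outputs, a sketch that both indexes events and echoes them for debugging (the index pattern is illustrative):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # One index per day; %{+YYYY.MM.dd} is expanded per event
    index => "logs-%{+YYYY.MM.dd}"
  }
  # rubydebug pretty-prints each event to the console
  stdout {
    codec => rubydebug
  }
}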

References
https://www.elastic.co/guide/en/logstash/current/input-plugins.html
https://www.elastic.co/guide/en/logstash/current/filter-plugins.html
https://www.elastic.co/guide/en/logstash/current/output-plugins.html
https://www.elastic.co/guide/en/logstash/current/advanced-pipeline.html
https://www.elastic.co/blog/logstash-centralized-pipeline-management