ElasticSearch (ELK Stack) helping software quality

In the previous post I launched the challenge "How Data Science can improve Software Quality", in which I discussed needs, data, data science and the right questions.

In this post the goal is to address the tools that will help us find the answers we are looking for, i.e., tools that can extract important indicators to help ensure software quality.

What do we need?

We need something that provides rapid information, is easy to view, works online and free of bureaucracy, leads us easily to possible causes, allows comparisons, and so on.

Below is an example of a dashboard that gives us information in this regard:
kibana dashboard

Choosing the right tools

Wouldn't it be very simple if a tool delivered ready-made answers just by knowing the right log files? Yes. But we have not reached that level yet, so let us leave artificial intelligence and machine learning to keep evolving and go "hands on" instead.

To infer anything about software quality we need data from which to extract information. We need to collect the logs and leave them ready for reporting in the form of charts, reports and dashboards.

You have many options for doing this, from building your own solution in the language of your choice, to assembling a set of tools that do it for you, to looking for a specialized free and open-source solution.

Basically the options are:

  1. Import the files manually into spreadsheets (Excel, LibreOffice, Google Docs), clean the data, aggregate it and generate the results. Then do it all again every time new data arrives, i.e., every day;
  2. Use very good, paid tools, such as Splunk;
  3. Look for free, open-source, stable tools that give you full openness and flexibility for custom settings.

For a while I did it all manually, but generating reports in spreadsheets is very laborious, especially when the daily volume of data reaches 2 gigabytes compressed, per server. When you do this often, though, you have to extract the bright side of it all: you learn a lot about your log data and about what is really important in it. In other words, you learn to clean the data and throw away the garbage, keeping what really helps you find the answers.

After studying these possibilities I became very biased toward the third option. 🙂

Finally, knowing what the logs are, what they contain and what the needs were, I arrived at the ELK Stack: ElasticSearch, Logstash and Kibana.
General ELK Stack

Why ELK Stack?

The ELK Stack is "made for this", i.e., specialized. Our role becomes customizing it: collecting the data in the best possible way, automating on-demand (or batch) collection, and creating good reports to present the results or evidence that meet our needs.

Elastic.co has extensive and excellent content, as well as good examples. You can also find some Use Cases, and even better, High Tech Use Cases citing some important organizations that use the ELK Stack.

The ELK Stack offers a set of applications and utilities, each with its own purpose, which together provide a combined search and data analysis platform. Logstash captures the log data, ElasticSearch catalogs and stores the data for analysis in near real time, including the timeline, and Kibana transforms the data into powerful visualizations, reports and dashboards.

The platform is built on the Apache Lucene engine and is distributed under the Apache 2 open-source license.

Main features:

  • Powerful tool for generating quick analyses;
  • Data visualization with Kibana;
  • Document-oriented;
  • Schema-free, with automatic interpretation of data types;
  • Conflict management, with optimistic version control;
  • Uses market standards such as a RESTful API and JSON;
  • Uses well-known programming languages such as Ruby, Java and NodeJS;
  • Works on Linux and Windows;
  • Can be configured for scalability, if necessary;
  • Can be configured for high availability, if necessary.

Collecting logs: Logstash


Logstash's challenge is "simple": collect the logs. You can use it in the simplest way, i.e., collect the logs and send them "as is", without any processing. But you can also use it to filter logs, transform information, add data, and so on.

Logstash is a tool for collecting events and logs. Written in JRuby, it requires a JVM to run. Usually one Logstash client is installed per host, and it can receive data from multiple sources including log files, Windows events, Syslog events, etc. The downside of using a JVM is the memory allocation, which can be higher than you expect.
Its slogan is "Collect, Enrich & Transport Data": it offers its own syntax to interpret the log files and process the data, and then sends the result "anywhere", in our case to ElasticSearch.

The purpose of this post is not to be a full Logstash course, so I recommend you dig deep into the documentation page. But I do want to mention part of how to set it up, which I found very simple and easy.

Its configuration is based on a configuration file, typically one file for each "type" of log.
The contents of this file are textual and require at least the "input" and "output" sections, plus the optional "filter" section.
Each section supports plugins, which are the most powerful and comprehensive feature of Logstash: they allow you to import "anything" and in some cases even include snippets of Ruby code.

The sections of the Logstash configuration file:

  • input: basically describes what Logstash needs to know about the input: the files, their format, the reading frequency, the file type (CSV, plain text, XML, JSON, fixed-width columns, etc.) and the charset;
  • filter: defines the metadata, i.e., each field, its data type, format and transformations; which data will be used and which will be discarded; date and time conversion, etc.;
  • output: determines how the data is stored or sent: stored in a file in the desired format, or sent to any database, relational or not; in our case, ElasticSearch.

An example configuration file syntax with fictitious data – file.config:

input {
  file {
    path => "/etc/app/log/operations.log"
    type => "csv"
  }
}
filter {
  mutate {
    gsub => [
      # replace double quotes with single quotes so they do not break the CSV parsing
      "message", "\"", "'"
    ]
  }
  csv {
    separator => ";"
    columns => [ "CLIENT","CALL1","CALL2","CALL3","OPERATION","TYPE","DATABASE","POOL","CONNECTION","USERID","CLIENTIP","DATE","TIME","CLASSNAME","METHOD" ]
    add_field => { "CALL" => "%{CALL1}.%{CALL2}.%{CALL3}" }
  }
  mutate {
    add_field => { "DATETIME" => "%{DATE} %{TIME}" }
  }
  date {
    locale => "pt-BR"
    match => ["DATETIME", "dd/MM/yyyy HH:mm:ss.SSS"]
    timezone => "America/Sao_Paulo"
    target => "@timestamp"
    add_tag => "dateHoraOk"
    remove_field => "DATETIME"
  }
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
    port => 9200
    index => "myindex-2015.09"
  }
}
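
For illustration, a fictitious log line that this configuration would parse (the values are invented, matching the 15 columns declared above) could look like this:

ACME;SRV01;MOD02;OP03;saveOrder;WRITE;maindb;pool1;conn42;user01;10.0.0.15;21/09/2015;14:32:07.123;com.acme.OrderService;save

The csv filter maps each value to a field by position, the add_field options build CALL as "SRV01.MOD02.OP03" and DATETIME as "21/09/2015 14:32:07.123", and the date filter converts that into the event's @timestamp.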

To run Logstash, simply specify the configuration file with the "-f" flag:

bin/logstash -f file.config
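
Depending on your Logstash version there is also a flag to validate the configuration file before actually starting the collection; in the 1.x/2.x releases from the era of this post it looked roughly like this (check your version's documentation):

bin/logstash -f file.config --configtest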

Logstash or the ElasticSearch Bulk API?

My experience with Logstash: it is an excellent tool and solves a whole range of log interpretation and data extraction problems, and you can find on the internet plenty of ready-to-use configuration files, mainly for logs generated by market applications such as JBoss, Apache HTTP Server, WildFly, Linux, PostgreSQL, Tomcat, etc. In other words, it is ready for use when you just want to collect the logs and store them.

In some cases, though, the Logstash configuration file became very complex, long and hard to understand, especially when you start adding many plugins to aggregate data, include conditions, handle events, etc.

Running Logstash online, competing with the main application by collecting information from the files while the logs were still being generated, was also quite "heavy". While monitoring resource usage, I saw Logstash consume a lot of disk, CPU and memory.

These situations opened a precedent for research, and that is where I found an alternative: using the ElasticSearch RESTful API, the Bulk API, to send the logs.

The Bulk API allows you to send the data in JSON format. ElasticSearch is NoSQL and schema-free, i.e., it will accept and interpret the data as you have it, as long as you obey its API, all with JSON objects. Much easier to send data sets in key/value style, agreed?
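
To make the idea concrete, below is a minimal sketch in Python (using the requests library) of what shipping events through the Bulk API looks like. The index name, type and field values are fictitious, following the same example data used in the Logstash configuration above; this is not the agent I built, just an illustration of the API call:

import json
import requests  # assumption: the requests library is installed (pip install requests)

# Fictitious events already parsed into key/value dictionaries
events = [
    {"CLIENT": "ACME", "OPERATION": "saveOrder", "DURATION_MS": 1200,
     "@timestamp": "2015-09-21T14:32:07.123-03:00"},
    {"CLIENT": "ACME", "OPERATION": "searchOrder", "DURATION_MS": 300,
     "@timestamp": "2015-09-21T14:32:08.456-03:00"},
]

# The Bulk API expects newline-delimited JSON: one action line followed by one document line
lines = []
for event in events:
    # "_type" applies to the ElasticSearch versions of this era; recent versions no longer use types
    lines.append(json.dumps({"index": {"_index": "myindex-2015.09", "_type": "operations"}}))
    lines.append(json.dumps(event))
body = "\n".join(lines) + "\n"  # the request body must end with a newline

response = requests.post("http://localhost:9200/_bulk", data=body,
                         headers={"Content-Type": "application/x-ndjson"})
print(response.json()["errors"])  # False when every document was indexed

Each pair of lines tells ElasticSearch where to index the document and what the document is, so thousands of events can be shipped in a single HTTP request.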

I created a customized agent with classes designed to easily send the logs via the Bulk API, meeting specific needs such as generating statistics before storing. If you are curious, I can describe it in another post.

There are also some ready-made plugins in Perl and Python that use this API.

ElasticSearch


You can install ElasticSearch on several Linux distributions, on macOS and on Windows. Once installed, you can configure it to accept requests only from the host where it is installed or from any other host.

A major advantage of ElasticSearch is that it is very simple to install and "run", except in situations where you want to include cluster configuration, replication, high scalability, high availability, backup, data rotation and cleanup strategies, etc. For those cases I also suggest another post.

For our case we stick to the simple part of ElasticSearch: getting it ready to accept requests from Logstash or from the Bulk API (REST), and getting it ready to "connect" to Kibana for analysis and for generating panels and charts.
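
If you want to check quickly that ElasticSearch is up and already holding data, you can query its REST API directly. A minimal sketch in Python, assuming a default installation listening on localhost:9200:

import requests  # assumption: the requests library is installed (pip install requests)

# Basic "is it alive?" check: the root endpoint returns the version and cluster name
info = requests.get("http://localhost:9200/").json()
print(info["version"]["number"])

# Cluster health: green, yellow or red
health = requests.get("http://localhost:9200/_cluster/health").json()
print(health["status"])

# List the indexes created by Logstash or by the Bulk API
print(requests.get("http://localhost:9200/_cat/indices?v").text)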

Kibana is "The Man"


By installing Kibana you get a system connected directly to the indexes generated in ElasticSearch, plus a ready-made web interface for configuring the indexes and the types of data you want to analyze.

Kibana is the front end of the ELK Stack, where you can view the data sent by Logstash or by your own agent and stored in ElasticSearch, all in a single web interface.

Kibana needs to know which ElasticSearch indexes it will analyze; for that you only need to register them through a specific Kibana screen, but I guarantee it is super simple.

The main interactions that Kibana offers:

  • Discover: in this section you view the data of a particular index in the form of a histogram and search the data in real time using the Lucene syntax (a couple of fictitious query examples follow this list);
  • Visualize: specific section for creating new visualizations, which can be a table, a line chart, a pie chart, bars, text, a big number, etc. Basically you create visualizations and then compose the panels (Dashboards) with them;
  • Dashboard: very useful section for presenting analytical data in real time with automatic refresh. You can create panels composing the different visualizations created in the Visualize section.
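
A couple of fictitious examples of the Lucene syntax accepted in Discover, using field names from the example configuration above (the exact matching behavior depends on how the fields were mapped and analyzed):

CLIENT:ACME AND OPERATION:saveOrder
TYPE:WRITE AND NOT DATABASE:maindb
METHOD:save*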

"Google it": do a quick search on Google Images and you will find many dashboards generated by Kibana, essential for various types of data analysis.

Some Views and Dashboards

kibana dashboard

Dashboard

The dashboard on the left clearly presents an analysis of the "heaviest" operations on the server:

  • The table on the left shows the Top 10 operations ordered by runtime x number of executions (a sketch of the kind of aggregation behind such a table follows this list);
  • The pie chart shows the Top 10 operations, presenting the proportion of each operation relative to the others;
  • The daily stacked bar chart presents the projection of the Top 10 most executed operations along a timeline.
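
As a rough idea of what sits behind a table like the Top 10 operations by runtime, here is a sketch of an ElasticSearch terms aggregation ordered by average execution time, again in Python. OPERATION and DURATION_MS are hypothetical field names: they would have to exist in your index and be mapped so they can be aggregated:

import json
import requests  # assumption: the requests library is installed (pip install requests)

query = {
    "size": 0,  # we only want the aggregation, not the matching documents
    "aggs": {
        "top_operations": {
            "terms": {
                "field": "OPERATION",
                "size": 10,
                "order": {"avg_duration": "desc"}  # order buckets by the sub-aggregation below
            },
            "aggs": {
                "avg_duration": {"avg": {"field": "DURATION_MS"}}
            }
        }
    }
}

response = requests.post("http://localhost:9200/myindex-2015.09/_search",
                         data=json.dumps(query),
                         headers={"Content-Type": "application/json"})
for bucket in response.json()["aggregations"]["top_operations"]["buckets"]:
    print(bucket["key"], bucket["doc_count"], bucket["avg_duration"]["value"])

Kibana builds and runs queries like this one for you; the sketch only shows that every visualization ultimately maps to an ElasticSearch search or aggregation.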

Stacked Bar Chart
The view on the right shows the Top 10 slowest screens, based on their average execution time on the server. Note that the slowest screen reaches almost 2 minutes of average execution time, while the 10th slowest screen averages between 600 and 800 milliseconds.


Stacked bar chart with Timeline

The view on the left shows the Top 5 heaviest SQL statements per day. Note that the slowest SQL statement repeats every day.

And the answers?

In this post I described the tools, so I did not speak directly about the answers, but about the "engine" that will lead us to them.

About the answers we want, remember: in some cases we find the answers immediately, even literally. In other cases we only find the records of facts that help us get to the answers.

Tip: I strongly suggest that you search for the term "Data Driven" on Google. There are many articles showing how important it is to base the direction of our decisions and ideas on actual data, facts and evidence. But that's another topic… 😉

Conclusion

Now we know tools suited to generating indicators that help ensure software quality.

Details about the installation and configuration of the tools were not addressed here. If you consider it relevant, I can produce a post on installing and configuring Logstash, ElasticSearch or Kibana; just ask in the comments, it will be a pleasure to help you.

As promised in the previous post, here I fulfill my part of the "promise", which was:

  • Storage of evidence;
  • Generation of charts and dashboards;
  • Automation of collection operations.

I intend to continue and go into more detail on this subject in the coming posts.

I appreciate your interest and patience. I look forward to your comments.


Fabiano de Freitas Silva

Software Engineer at Softplan
15+ years of experience in Software Engineering as an Application Developer, Database Administrator and Project Manager in a wide variety of business applications. Programmer of multi-threaded and high-availability systems built to deliver better performance. Currently doing a specialization in Data Science at Johns Hopkins University via Coursera (MOOC).
