Alerting with the E(L)K Stack and Elastalert Revisited

By | June 16, 2017

In this article I will revisit the example from my previous article, Alerting with the ELK Stack and Elastalert. The scenario is the same as in the earlier article: a Mule ESB CE instance with a Mule application is monitored using JMX.
The new example uses Metricbeat instead of Logstash to poll JMX data from the Mule instance, and I will give a few words of advice on JMX monitoring with Metricbeat. In addition, I will use the most recent version of Elasticsearch and Kibana available, which at the time of writing is 5.4.0, together with the latest version of Elastalert, which is compatible with Elasticsearch 5.4.0.

The complete updated example is available in the master branch of my repository on GitHub. The previous version is still available in the same repository, in the original branch.

Updates

The list below is an attempt to list all the changes to the example compared to the earlier article. The update brings a number of advantages, such as more features in the newer versions of Kibana, Elasticsearch and Elastalert. I will make no attempt to list those here, but refer the curious to the change lists of the different products.

  • Updated to Mule ESB CE version 3.8.1 and using my own Docker image available here.
    This Docker image contains Mule ESB CE on Alpine Linux and Oracle Java 8.
    In addition this image has the Jolokia agent installed and activated in Mule ESB, which exposes JMX data over HTTP.
  • Uses Metricbeat to poll JMX data from the Mule ESB instance instead of Logstash.
    Beats 5.4.0 was released recently, and this release includes a Jolokia module for Metricbeat, which makes it possible to poll JMX data from a Java virtual machine using the HTTP API exposed by Jolokia.
  • Created a Docker image with Mule ESB CE and Metricbeat.
    This Docker image will be built when starting the example using Docker compose. While it should be possible to poll for JMX data from an instance of Metricbeat running in a separate Docker container, I have chosen to run Metricbeat in the same Docker container as Mule ESB.
  • Uses the official Elasticsearch and Kibana Docker images from Elastic.
    These images have X-Pack installed, which enables HTTP basic authentication when connecting to Elasticsearch and requires logging in to Kibana. Note that X-Pack will run for a limited time using a trial license, after which it will be disabled.
  • Updated my Elastalert image on DockerHub with support for HTTP basic authentication.
    This is needed for Elastalert to be able to connect to an Elasticsearch instance with X-Pack installed and enabled.

Jolokia Agent in Mule ESB CE

While it is not the focus of this article, I will briefly mention the modifications made to my Mule ESB CE Docker image in order to enable JMX monitoring over HTTP using the Jolokia Mule agent.

  • The Jolokia Mule agent JAR-file is copied to the lib/opt directory in the Mule installation.
  • A Mule application named “jolokia-enabler” is deployed in the Mule instance.
    This application contains one single Mule configuration file that activates and configures the Jolokia Mule agent.

Metricbeat and JMX Monitoring

As mentioned earlier, I chose to have Metricbeat running in the same Docker container as Mule ESB CE.

Mule ESB with Metricbeat Docker File

The Docker file used to create the Docker image with Mule ESB CE and Metricbeat is listed below.

Note that:

  • Metricbeat is installed into the /opt/metricbeat directory.
  • A slightly customized start-script is used to start Metricbeat and Mule.
    The start-script will be examined below.
  • Metricbeat configuration is copied into the Docker image.
    This configuration is the default for this example and not a general default configuration.
    More information about the Metricbeat configuration later.
  • Both Mule and Metricbeat will be run by the mule user in a Docker container created from this image.
  • In addition to the Mule-related directories, the Metricbeat conf, data and logs directories are also configured as Docker data volumes.
  • Two ports are exposed by the Docker image.
    The first port exposes the traditional JMX endpoint. The second port exposes the Jolokia HTTP endpoint over which JMX data can be read and written.

Mule and Metricbeat Docker Image Start-Script

The start-script is based on the start-script from my Mule ESB CE Docker image found here.

Note that:

  • The first part, in which the timezone is set, time is synchronized and the external address of the Mule container is set, is identical to the original Mule start-script.
    Please refer to my earlier article on time in Docker containers for more information.

  • Metricbeat is started in the background.
    This is the only addition made to the original Mule start-script.
    The current directory is set to the Metricbeat home directory prior to starting Metricbeat, in order for Metricbeat to be able to locate its files and directories. The Metricbeat configuration file metricbeat.yml in the conf directory is used to launch Metricbeat.
    The -v option turns on verbose logging in Metricbeat.
  • Mule ESB is started as the main process of the Docker container.

Metricbeat Configuration

The Metricbeat configuration file is entirely new for this version of the example and located in a file with the name metricbeat.yml. The Metricbeat configuration reference documentation is available here.

Note that:

  • Metricbeat configuration consists of a number of modules.
    A Metricbeat module gathers metrics from one specific type of source. Examples of such sources are PostgreSQL database servers, Apache HTTPD servers and the server on which Metricbeat itself runs.
  • The first module in the Metricbeat configuration file is the system module.
  • A module may have multiple sets of metrics which are made available by listing them under the metricsets tag.
    In this example I want to gather metrics in the CPU and memory metricsets.
  • The enabled tag determines whether the module is enabled or not.
  • The period tag sets the time between the occasions at which metrics are gathered and sent out.
    For certain modules, setting this time too short will result in failures when gathering metrics. Please consult the documentation of the module in question for any advice.
  • The next Metricbeat module is the Jolokia module.
    The Jolokia module collects JMX metrics using Jolokia (JMX-over-HTTP) with a period of once per second from localhost, port 8899.
  • After the general configuration of the Jolokia module follow a number of JMX mappings.
    One JMX mapping will query one or more attributes of one single JMX MBean for metrics.
  • The first JMX mapping queries the java.lang.Runtime MBean for the up-time of the JVM in which the Mule ESB instance is running.
  • The second JMX mapping queries the java.lang.OperatingSystem MBean for the CPU load caused by the JVM.
  • The third JMX mapping queries the java.lang.Memory MBean for two different attributes.
    These attributes are the heap and non-heap memory usage in the JVM.
  • The final JMX mapping queries an MBean exposing statistics for the flow with the name “eventReceivingFlow” in the Mule application with the name “mule-perpetuum-mobile”.
    The statistic queried is the total number of events received by the Mule flow in question.
    Please see the section below on the ordering of the different parts of the MBean name.
  • Next up is the output configuration which tells Metricbeat to send data to Elasticsearch.
    JMX data will be sent to the Elasticsearch instance at “elasticsearch”, port 9200. If multiple addresses are configured in the value of the hosts key, data will be distributed between the supplied Elasticsearch hosts. Since X-Pack is installed in Elasticsearch and basic authentication is enabled, a username and password are needed to connect to Elasticsearch.
  • The metricbeat.config.modules section configures Metricbeat to reload its configuration if it has been modified.
    The path specifies which configuration file(s) to check for modifications, and reload.period specifies how often to check them.
  • Finally, the logging section configures Metricbeat logging output.
    Logs are to be written to the mybeat.log file in the /opt/metricbeat/logs directory. When rotating logs, the ten last log-files are to be retained.
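
To make the JMX mappings described above a little more concrete, the following Python sketch builds the kind of read requests that the Jolokia protocol accepts for the three JVM MBeans mentioned. It is only an illustration and not a description of what Metricbeat does internally: the helper names are hypothetical, the MBean and attribute names are the standard JVM ones rather than values copied from the metricbeat.yml of the example, and the Mule flow MBean is deliberately left out.

```python
import json

# Standard JVM platform MBeans and attributes corresponding to the first three
# JMX mappings described above (assumed, not copied from metricbeat.yml).
JVM_MAPPINGS = [
    ("java.lang:type=Runtime", ["Uptime"]),
    ("java.lang:type=OperatingSystem", ["ProcessCpuLoad"]),
    ("java.lang:type=Memory", ["HeapMemoryUsage", "NonHeapMemoryUsage"]),
]


def build_read_request(mbean, attributes):
    """Build one Jolokia 'read' request for one MBean (hypothetical helper)."""
    request = {"type": "read", "mbean": mbean}
    # Jolokia accepts a single attribute name or a list of names.
    request["attribute"] = attributes[0] if len(attributes) == 1 else list(attributes)
    return request


def build_bulk_request(mappings):
    """Serialize all mappings as a JSON array, the form used for a bulk POST."""
    return json.dumps([build_read_request(m, a) for m, a in mappings])


if __name__ == "__main__":
    print(build_bulk_request(JVM_MAPPINGS))
```

Each tuple plays the role of one JMX mapping: one MBean and the attributes to read from it. The resulting JSON could be posted to the Jolokia HTTP endpoint of the Mule instance to check by hand which values are available.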

Metricbeat JMX and MBean Names

When I first had the example up and running with Metricbeat, I noticed that some JMX data was missing, namely the total number of events from the eventReceivingFlow in the Mule application mule-perpetuum-mobile. I examined the Metricbeat configuration over and over, and everything seemed fine. To verify that the data was indeed exposed, I queried the Jolokia agent in the Mule ESB instance for all JMX data, and then noticed that the name of the MBean in the Metricbeat configuration file did not exactly match the name reported by the Jolokia agent.

In order for the Metricbeat Jolokia module to be able to retrieve the data, the flow part must precede the type part. The proper format of the MBean name uses the ordering presented by Jolokia. JMX data from Jolokia can be examined in the following way:

  • List all the MBeans and their attributes exposed by Jolokia by using the following URL in a browser (if you are not running Docker in Linux, replace ”localhost” with the IP of the virtual machine in which Docker is running):

  • Locate the mule-perpetuum-mobile JSON object and the eventReceivingFlow flow in it, and note that the flow part does precede the type part.

Thus the MBean name to use should be:

Note that quotes are not escaped.
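
The pitfall can be illustrated with a small Python sketch. It assumes, based on the behaviour described above, that the configured name has to match the name reported by Jolokia part for part and in the same order, even though two names with the same key properties in a different order identify the same MBean in JMX. The MBean names and the helper functions are made up for the illustration and are not the names used in the example.

```python
def split_mbean_name(name):
    """Split 'domain:key=value,key=value' into (domain, [(key, value), ...]).

    Simplified: assumes no commas or colons inside quoted values.
    """
    domain, _, props = name.partition(":")
    pairs = [tuple(p.split("=", 1)) for p in props.split(",")]
    return domain, pairs


def same_mbean(a, b):
    """True if both names have the same domain and key properties, in any order."""
    domain_a, pairs_a = split_mbean_name(a)
    domain_b, pairs_b = split_mbean_name(b)
    return domain_a == domain_b and sorted(pairs_a) == sorted(pairs_b)


def same_ordering(a, b):
    """True only if the key properties also appear in the same order."""
    return split_mbean_name(a) == split_mbean_name(b)


# Hypothetical names, not the ones from the example.
configured = 'Example.app:type=Flow,name="someFlow"'
reported = 'Example.app:name="someFlow",type=Flow'

print(same_mbean(configured, reported))     # True
print(same_ordering(configured, reported))  # False
```

A check like same_ordering against the names listed by Jolokia is a quick way to spot this kind of mismatch before suspecting the rest of the configuration.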

Elastalert Docker Image and HTTP Basic Authentication

Adding support for HTTP basic authentication required modifications not only to the Elastalert configuration, but also to the start-script of the Elastalert Docker image.
In addition, the rule used in the example had to be modified.

Elastalert Configuration

The Elastalert configuration file had to be modified to include the username and password used when connecting to Elasticsearch. The complete Elastalert configuration file used in the updated example looks like this:

On rows 23 and 24, es_username and es_password have been added for HTTP basic authentication when connecting to Elasticsearch.

Elastalert Docker Image Start-Script

The complete Elastalert start-script below includes not only support for HTTP basic authentication, but also support for using HTTPS when connecting to Elasticsearch (thanks to JamesJJ for the latter addition!).

Modifications for HTTP basic authentication have been made at the following places:

  • Rows 33-38.
    Creates the part of the Elasticsearch URL containing the username and password.
    If basic authentication is not to be used, this part of the URL is set to the empty string.

  • Row 39.
    Waits for Elasticsearch to become available by attempting to query Elasticsearch for its status.

  • Row 47.
    Checks whether the Elastalert index already exists in Elasticsearch before creating it.
    If the Elastalert index has to be created in Elasticsearch, the Elastalert configuration file is used, including the username and password it contains.
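
The idea behind the first two modifications can be sketched in Python, although the actual start-script is a shell script. The function and parameter names below are hypothetical, and the availability check is passed in as a function so that the sketch does not assume any particular Elasticsearch endpoint. The host, port, user and password in the usage lines are the values mentioned elsewhere in this article.

```python
import time
from urllib.parse import quote


def auth_part(username=None, password=None):
    """Return 'user:password@' for use in a URL, or '' without basic authentication."""
    if not username:
        return ""
    # Percent-encode characters that would otherwise break the URL.
    return "{}:{}@".format(quote(username, safe=""), quote(password or "", safe=""))


def elasticsearch_url(host, port, username=None, password=None, use_tls=False):
    """Assemble the base URL used when talking to Elasticsearch."""
    scheme = "https" if use_tls else "http"
    return "{}://{}{}:{}".format(scheme, auth_part(username, password), host, port)


def wait_until_available(is_available, url, attempts=30, delay=1.0):
    """Call is_available(url) until it returns True or the attempts run out."""
    for _ in range(attempts):
        if is_available(url):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    print(elasticsearch_url("elasticsearch", 9200, "elastic", "changeme"))
    print(elasticsearch_url("elasticsearch", 9200))
```

Keeping the credentials in a separate, possibly empty, part of the URL means that the rest of the script can build its requests in the same way whether or not basic authentication is enabled.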

Elastalert Example Rule

The example rule has been modified to use the index in Elasticsearch created by Metricbeat and to query the metrics gathered by Metricbeat. The complete rule, located in the file cpu-spike.yml, now looks like this:

Modifications:

  • Row 5.
    Use the index name pattern ”metricbeat-*” in Elasticsearch.

  • Row 14.
    Use new name for the JVM process CPU load.

  • Rows 15 and 16.
    Change the filter range values, since Metricbeat will just pass on the values as queried from Jolokia, which are in the range 0.0 to 1.0 instead of the percentage values from 0.0 to 100.0 generated by Logstash.
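
As a small worked example of the last change, the following Python sketch converts a percentage threshold of the kind used with the Logstash-based setup to the fraction scale reported by Jolokia. The function names are hypothetical, and the threshold values are illustrative only and are not the ones used in cpu-spike.yml.

```python
def percent_to_fraction(percent):
    """Convert a CPU load in percent (0.0-100.0) to a fraction (0.0-1.0)."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    return percent / 100.0


def in_range(value, lower, upper):
    """Range check of the kind a range filter performs (bounds inclusive)."""
    return lower <= value <= upper


# Illustrative thresholds: 25 % to 100 % CPU load on the old scale.
lower = percent_to_fraction(25.0)
upper = percent_to_fraction(100.0)

print(lower, upper)                  # 0.25 1.0
print(in_range(0.4, lower, upper))   # True
print(in_range(40.0, lower, upper))  # False: a percentage no longer matches
```

The last line shows why the old range values had to change: a filter written for percentages would never match the fractional values that Metricbeat passes on.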

Docker Compose File

I have made some minor modifications to the Docker Compose file of the example, in order for the example to work better under various circumstances.

Changes are:

  • Build a custom Mule Docker image instead of using a ready-made image.

  • Map Metricbeat config and logs directories to host directories.

  • Set environment variables for the Mule service in order to set the timezone.
    The time in the Mule and Elastalert Docker containers must match, in order for Elastalert to be able to query for events in the proper time-range and get accurate results.

  • Use the Elasticsearch Docker image from Elastic.

  • Use the Kibana Docker image from Elastic.

  • Made the Elastalert Docker service depend on the Elasticsearch and Kibana Docker services in order to avoid possible failures when running on a system with limited memory and/or CPU resources.

  • Set environment variables for the Elastalert service in order to set the timezone.
    See reason for setting timezone for the Mule service.

  • Set environment variables for the Elastalert service in order to be able to connect to Elasticsearch.

  • Removed the Logstash service.
    Logstash has been replaced by Metricbeat, which is run in the same container as Mule.

Running the Example

Having examined all the changes in the example, we are now ready to run it. After having started the example, we will use Kibana to take a look at the metrics gathered and then trigger the Elastalert example rule by increasing the number of events in the Mule application.

  • Set your current timezone in the docker-compose.yml file.
    Modify all values of the environment variable CONTAINER_TIMEZONE.

  • Start Docker if needed.
    This applies to non-Linux operating systems.

  • Open a terminal window.

  • Go to the root directory containing the example files.
    This is the directory that contains the docker-compose.yml file along with the ElastalertShared, ElasticsearchShared and other shared directories.

  • Start the Docker containers.

  • Wait until log output from Elastalert similar to this appears in the terminal.

  • Open the Kibana web application in a browser.
    The URL should be http://localhost:5601 but may be http://192.168.99.100:5601 if you are using Docker Toolbox.

  • Log in to Kibana using user ”elastic” and password ”changeme” without quotes.

  • Configure an index pattern using ”metricbeat-*”, click refresh fields next to Time-field name and then select the time-field name ”@timestamp”.

    Configure Metricbeat Index Pattern in Kibana

  • Click the Create button in the lower left corner of the browser window.

  • Click the Discover option in the upper left corner of the browser window.
    A list of events should appear looking something like this:

    Viewing Events in Kibana

  • Open the file mule-config.xml in MuleShared/apps/mule-perpetuum-mobile.
    This is the Mule flow in that file that generates periodic events:

  • Modify the value of the repeatInterval attribute in the <quartz:inbound-endpoint> element so that it reads 2 instead of 50.
  • Save the file.
  • In the mule.log file located in MuleShared/logs notice how the Mule application was reloaded as we modified its configuration file:

  • Wait for a couple of minutes until an alert from Elastalert appears in the log:

We have (again) successfully caused the CPU Spike rule to trigger and send an alert!

Troubleshooting

In this section I will provide some solutions to problems I encountered while preparing the example.

Out of Memory Problems

If you are running the example on a non-Linux Docker host, you may need to allocate more memory to the virtual machine in which Docker runs in order to avoid out of memory conditions from the Elastalert and Mule containers.

Metricbeat Configuration File Problem

If you encounter an error from Metricbeat saying that the configuration file needs to be owned by root or the metricbeat user and you are not able to change the owner of the file, modify the start-mule.sh script for the Mule-Metricbeat Docker image and change the line that starts Metricbeat like this:

Note the addition of the option -strict.perms=false.

Final Words

This concludes the second installment of Alerting with the ELK Stack and Elastalert.

Happy coding!

One thought on “Alerting with the E(L)K Stack and Elastalert Revisited”

  1. Avi Haleva

    Hi,
    I looked into the elastalert docker you’ve created (very nice!) and wonder what was the reason you chose to use supervisord as the elastalert watch dog? Why not use the docker framework (--restart policy)?
    Thanks,
    Avi
