Based on my article on JMX Monitoring with the ELK Stack and the article on creating a Docker image with Elastalert, I will now combine these and add the missing part, alerting, to the monitoring and alerting stack I have worked my way towards.
Preparations
The configuration files used in this article's example are available on GitHub. However, there are a few adaptations you need to make before trying it out on your own computer.
- In the MuleShared/conf/wrapper.conf file modify the line that looks like this:
wrapper.java.additional.20=-Djava.rmi.server.hostname=192.168.99.100
Replace the IP address with the IP address obtained from the command:
docker-machine ip [machine name]
If you are using a Linux operating system that does not require a Docker VM, enter the IP address 127.0.0.1.
- Give read and write permissions to everyone on the root folder and all underlying files and folders containing the files from GitHub.
In a Unix/Linux environment, use the chmod command (the a prefix applies the change to all users regardless of the current umask):
chmod -R a+rw alerting-with-elk-and-elastalert
- Windows and Mac users only:
Make sure you place the root directory containing the configuration files in a location that is accessible by the Docker daemon. For OS X this is /Users/ and subdirectories, for Windows it is C:/Users and subdirectories. Docker documentation reference.
- Examine the file mule-config.xml located in the MuleShared/apps/mule-perpetuum-mobile directory.
On or around line 28, the repeatInterval attribute of the <quartz:inbound-endpoint> element should have the value 20. We will lower this value later to trigger the Mule flow more often, which will result in higher CPU usage and, hopefully, an alert going off.
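The wrapper.conf change from the first preparation step can also be scripted. The following is a minimal Python sketch, not part of the GitHub example: the function name set_rmi_hostname is hypothetical, and it assumes the file contains exactly one line of the form shown above.

```python
import re

# Matches the wrapper.conf line shown above. The property index (20 in the
# article) is allowed to vary; that flexibility is an assumption of this sketch.
_HOSTNAME_LINE = re.compile(
    r"^(wrapper\.java\.additional\.\d+=-Djava\.rmi\.server\.hostname=)[^\r\n]*",
    re.MULTILINE,
)


def set_rmi_hostname(conf_text: str, ip_address: str) -> str:
    """Return conf_text with the java.rmi.server.hostname value replaced."""
    new_text, count = _HOSTNAME_LINE.subn(
        lambda match: match.group(1) + ip_address, conf_text
    )
    if count != 1:
        # Refuse to guess when the line is missing or appears more than once.
        raise ValueError(f"expected exactly one hostname line, found {count}")
    return new_text


# Possible usage (path relative to the example's root directory):
#   from pathlib import Path
#   conf = Path("MuleShared/conf/wrapper.conf")
#   conf.write_text(set_rmi_hostname(conf.read_text(), "127.0.0.1"))
```

The function works on text rather than on a file so that it can be tried without touching the real configuration.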
Also make sure that you have Docker 1.9 or later installed, as the official Docker images I have used in this article prefer at least version 1.9 of Docker, and Docker Compose. If you are using Mac OS X or Windows, I suggest Docker Toolbox.
Overview
The following graphic shows the Docker images used in this article:
I have created Docker images that wrap the official Docker images for the ELK stack, that is Elasticsearch, Kibana and Logstash. The reasons differ slightly depending on the image:
- Elasticsearch
Install additional plug-ins.
- Logstash
At startup, wait for Elasticsearch to become available before starting Logstash.
Also install the JMX plug-in.
- Kibana
At startup, wait for Elasticsearch to become available before starting Kibana.
Install the Sense plug-in.
Logstash, Kibana and Elastalert all need to wait for Elasticsearch to become available before starting up since I use Docker Compose to start all the Docker containers and thus cannot determine the order in which the different containers will start.
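The wait-for-Elasticsearch idea can be sketched in a few lines of Python. This is not the startup script used in the images; it is only an illustration of the polling approach. The URL http://elasticsearch:9200, the number of attempts and the delay are assumptions, and both function names are hypothetical.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def elasticsearch_is_up(url: str = "http://elasticsearch:9200") -> bool:
    """Return True if an HTTP GET against url answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, unknown host, timeout and similar: not up yet.
        return False


def wait_until(
    check: Callable[[], bool],
    attempts: int = 60,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check repeatedly until it returns True or attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        sleep(delay_seconds)
    return False


# Possible usage in a container entry point:
#   if not wait_until(elasticsearch_is_up):
#       raise SystemExit("Elasticsearch did not become available")
```

Passing the check and the sleep function as parameters keeps the retry loop independent of the network, which makes it easy to try out in isolation.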
The Mule ESB CE instance is running a Mule application whose only purpose is to generate events. For details, please refer to my earlier article on JMX Monitoring with Docker and the ELK Stack.
Elastalert Rule
The Elastalert rule that will be used in this example is located in ElastalertShared/rules/cpu-spike.yaml and looks like this:
# Example Elastalert rule that will alert on spikes in the CPU load of
# the monitored Mule CE ESB.
name: CPU spike
type: spike
index: logstash-*
threshold: 1
timeframe:
  minutes: 1
spike_height: 2
spike_type: "up"
filter:
- range:
    cpuLoad:
      from: 3.0
      to: 100.0
alert:
- "debug"
This is an Elastalert rule of the spike type that will “send” an alert to the debug log when the number of cpuLoad events in the range 3.0 to 100.0 during the last minute is at least two times higher than the number of cpuLoad events in the same range during the minute before the last minute.
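The comparison described above can be illustrated with a small Python sketch. It is a simplification based on the description in this article, not Elastalert's actual implementation: it only compares the event counts of two consecutive windows and ignores thresholds and the time needed to establish a baseline. The function name is hypothetical.

```python
def is_spike_up(
    reference_count: int, current_count: int, spike_height: float = 2.0
) -> bool:
    """Return True if current_count is at least spike_height times reference_count.

    reference_count: matching events in the minute before the last minute.
    current_count:   matching events in the last minute.
    """
    if current_count == 0:
        # Assumption of this sketch: no events at all is never an upward spike.
        return False
    return current_count >= spike_height * reference_count
```

For example, one matching event in the reference minute followed by five in the current minute counts as a spike with spike_height set to 2, while three followed by four does not.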
For further explanations of Elastalert and Elastalert rules, please refer to their excellent documentation.
Running the Example
Starting the Docker containers used in this example is accomplished using Docker Compose.
- Non-Linux operating systems only: Start Docker.
The easiest way is to use the Docker Quickstart Terminal if you have it installed.
Otherwise use Docker Machine, in which case you also need to ssh into the Docker virtual machine. Commands to use with Docker Machine are:
docker-machine start [name of machine here, possibly 'default']
docker-machine ssh [name of machine here, possibly 'default']
- Linux: Open a terminal window.
- Go to the root directory containing the example files.
This is the directory that contains the docker-compose.yml file along with the ElastalertShared, ElasticsearchShared etc directories.
- Start the Docker containers.
docker-compose up
There will be a lot of console output, perhaps even some errors, and after some time output like this should appear, which tells us that Logstash is sending JMX data obtained from the Mule image to Elasticsearch.
logstash_1 | {
logstash_1 |   "@version" => "1",
logstash_1 |   "@timestamp" => "2015-12-06T14:29:32.013Z",
logstash_1 |   "host" => "muleserver",
logstash_1 |   "path" => "/config-dir/jmx",
logstash_1 |   "type" => "jmx",
logstash_1 |   "metric_path" => "mule_jvm.OperatingSystem.SystemCpuLoad",
logstash_1 |   "metric_value_number" => 0.004539722572509458,
logstash_1 |   "cpuLoad" => 0.45397225725094575
logstash_1 | }
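Not every event like the one above is counted by the rule: the filter in cpu-spike.yaml only selects events whose cpuLoad lies between 3.0 and 100.0. A minimal Python sketch of that check follows. The function name is hypothetical, and treating both bounds as inclusive is my understanding of the default behaviour of an Elasticsearch range filter, so take it as an assumption.

```python
def matches_cpu_filter(event: dict, low: float = 3.0, high: float = 100.0) -> bool:
    """Return True if the event has a numeric cpuLoad within [low, high]."""
    cpu_load = event.get("cpuLoad")
    # Events without a numeric cpuLoad field never match the range filter.
    return isinstance(cpu_load, (int, float)) and low <= cpu_load <= high
```

With this check, an event with a cpuLoad of about 0.45 is ignored by the rule, while an event with a cpuLoad above 3.0 is counted.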
About once every minute, log output similar to this should also appear:
elastalert_1 | 2015-12-06 14:32:26,079 DEBG 'elastalert' stderr output:
elastalert_1 | INFO:elastalert:Queried rule CPU spike from 12-6 14:30 UTC to 12-6 14:32 UTC: 0 hits
elastalert_1 |
elastalert_1 | 2015-12-06 14:32:26,079 DEBG 'elastalert' stdout output:
elastalert_1 | Queried rule CPU spike from 12-6 14:30 UTC to 12-6 14:32 UTC: 0 hits
elastalert_1 |
elastalert_1 | 2015-12-06 14:32:26,250 DEBG 'elastalert' stderr output:
elastalert_1 | INFO:elastalert:Ran CPU spike from 12-6 14:30 UTC to 12-6 14:32 UTC: 0 query hits, 0 matches, 0 alerts sent
elastalert_1 |
elastalert_1 | 2015-12-06 14:32:26,250 DEBG 'elastalert' stdout output:
elastalert_1 | Ran CPU spike from 12-6 14:30 UTC to 12-6 14:32 UTC: 0 query hits, 0 matches, 0 alerts sent
elastalert_1 |
elastalert_1 | 2015-12-06 14:32:26,252 DEBG 'elastalert' stderr output:
elastalert_1 | INFO:elastalert:Sleeping for 59 seconds
elastalert_1 |
elastalert_1 | 2015-12-06 14:32:26,252 DEBG 'elastalert' stdout output:
elastalert_1 | Sleeping for 59 seconds
elastalert_1 |
This is Elastalert querying Elasticsearch to determine if our rule ‘CPU Spike’ warrants an alert. Note that you will have to wait at least two minutes before a baseline rate has been established for our spike rule.
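If you would rather not read the console output by eye, the "Ran ..." summary lines can be picked out with a regular expression. The Python sketch below assumes the line format seen in the Elastalert output of this example. The names RunSummary and parse_run_summary are hypothetical, and other Elastalert versions may log differently.

```python
import re
from typing import NamedTuple, Optional


class RunSummary(NamedTuple):
    rule: str
    query_hits: int
    matches: int
    alerts_sent: int


# Pattern derived from the "Ran <rule> from ... to ...: N query hits, ..."
# lines in this example's console output.
_RAN_LINE = re.compile(
    r"Ran (?P<rule>.+?) from .+?: (?P<hits>\d+) query hits, "
    r"(?P<matches>\d+) matches, (?P<alerts>\d+) alerts sent"
)


def parse_run_summary(line: str) -> Optional[RunSummary]:
    """Return the summary contained in a console line, or None if there is none."""
    found = _RAN_LINE.search(line)
    if found is None:
        return None
    return RunSummary(
        rule=found.group("rule"),
        query_hits=int(found.group("hits")),
        matches=int(found.group("matches")),
        alerts_sent=int(found.group("alerts")),
    )
```

A line for which alerts_sent is greater than zero indicates that the rule has fired.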
Having waited at least two minutes, we are now ready to cause a CPU spike by increasing the frequency of event generation in the Mule flow:
- Open the file mule-config.xml in MuleShared/apps/mule-perpetuum-mobile.
This is the Mule flow in that file that generates periodic events:
<flow name="eventGeneratingFlow">
    <quartz:inbound-endpoint jobName="eventGeneratingJob" repeatInterval="20" repeatCount="-1" connector-ref="oneThreadQuartzConnector">
        <quartz:event-generator-job>
            <quartz:payload>go</quartz:payload>
        </quartz:event-generator-job>
    </quartz:inbound-endpoint>
    <logger level="ERROR" message="Generated an event!"/>
    <vm:outbound-endpoint path="eventReceiverEndpoint" exchange-pattern="one-way"/>
</flow>
- Modify the value of the repeatInterval attribute in the <quartz:inbound-endpoint> element so that it reads 1 instead of 20.
- Save the file.
- In the mule.log file located in MuleShared/logs notice how the Mule application was reloaded as we modified its configuration file:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ Redeploying artifact 'mule-perpetuum-mobile'             +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
- Wait for a couple of minutes until an alert from Elastalert appears in the log:
elastalert_1 | 2015-12-06 15:08:11,513 DEBG 'elastalert' stderr output:
elastalert_1 | INFO:elastalert:Queried rule CPU spike from 12-6 15:06 UTC to 12-6 15:08 UTC: 10 hits
elastalert_1 |
elastalert_1 | 2015-12-06 15:08:11,514 DEBG 'elastalert' stdout output:
elastalert_1 | Queried rule CPU spike from 12-6 15:06 UTC to 12-6 15:08 UTC: 10 hits
elastalert_1 |
elastalert_1 | 2015-12-06 15:08:11,533 DEBG 'elastalert' stderr output:
elastalert_1 | INFO:elastalert:Alert for CPU spike at 2015-12-06T15:07:17.152Z:
elastalert_1 |
elastalert_1 | 2015-12-06 15:08:11,534 DEBG 'elastalert' stdout output:
elastalert_1 | Alert for CPU spike at 2015-12-06T15:07:17.152Z:
elastalert_1 | CPU spike
elastalert_1 |
elastalert_1 | An abnormal number (5) of events occurred around 12-6 15:07 UTC.
elastalert_1 | Preceding that time, there were only 1 events within 0:01:00
elastalert_1 |
elastalert_1 | @timestamp: 2015-12-06T15:07:17.152Z
elastalert_1 | @version: 1
elastalert_1 | cpuLoad: 10.7221573966
elastalert_1 | host: muleserver
elastalert_1 | metric_path: mule_jvm.OperatingSystem.SystemCpuLoad
elastalert_1 | metric_value_number: 0.107221573966
elastalert_1 | path: /config-dir/jmx
elastalert_1 | reference_count: 1
elastalert_1 | spike_count: 5
elastalert_1 | type: jmx
We have successfully caused the CPU Spike rule to trigger and send an alert!
My monitoring and alerting stack is now completed and this also concludes this article. There is a lot to write about in this area so I suspect that I may return with future articles in which I use this monitoring and alerting stack. Stay tuned and happy coding!
Hi,
I have elasticsearch installed and running in environment.
I am installing Elastalert using the command
$python setup.py install >> install.log
It gives me error,
Processing blist-1.3.6.tar.gz
Running blist-1.3.6/setup.py -q bdist_egg --dist-dir /tmp/easy_install-SvKvZs/blist-1.3.6/egg-dist-tmp-R2pZmQ
warning: no files found matching 'blist.rst'
Hi!
Spare yourself the trouble and use the Docker image that I have created, which is available on Docker Hub.
If you really do not want to use Docker, then look at my Elastalert Docker image’s Docker file, which installs Elastalert on Ubuntu: https://github.com/krizsan/elastalert-docker/blob/master/Dockerfile
Best wishes!
Update: The Elastalert Docker image no longer uses Ubuntu, but Alpine Linux.
Run
sudo apt-get install python-blist and retry
Hi,
Thanks for reply.
I have referred docker file and few of the steps are done. However
When I execute “python setup.py install” it gives me error,
mock requires setuptools>=17.1. Aborting installation
error: Setup script exited with 1
what is this about?
br,
Sunil
Hi!
I would suggest that you talk to the developers since I regretfully lack deeper knowledge about Elastalert.
On the (Elastalert) project’s GitHub page there is a link to a Gitter chat where people can ask questions.
Best wishes!
sudo pip install "setuptools>=17.1"
But if pip is not installed then first install pip
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
sudo python get-pip.py
sudo pip --version