Category Archives: Docker

How to monitor Docker containers with Nagios and NRPE

Monitoring whether or not a Docker container is alive on a remote host should be fairly easy, right?

The standard approach in this is to include a suitable NRPE script on the remote host, and call that remotely from your Nagios server via the NRPE TCP daemon on the remote host. This script is a good example of same, and we’ll refer to it in the rest of the article.

This generally works fine when you’re doing innocuous things like checking free disk space or if a certain process is running. Checking a Docker container is a little bit harder, because the command:

docker inspect

can only be run as root, whereas the NRPE service on the remote host runs as a non-privileged user (usually called nagios).

As such, when you test your NRPE call from the Nagios server, like so:

/usr/lib64/nagios/plugins/check_nrpe -H dockerhost.yourdomain.com -c check_docker_container1

Your will see a response like:

NRPE: Unable to read output

or

UNKNOWN - container1 does not exist.

You get this response because the nagios user cannot execute the docker control command.

Your could get around this by running NRPE on the remote host as the root user, but that really isn’t a good idea, and you should never do this.

A better play (if you are confident that your Nagios set up is secure) is to extend controlled privileged to the nagios user via sudo. You can create the following file in /etc/sudoers.d/docker to achieve this:

nagios    ALL=(ALL:ALL)  NOPASSWD: /usr/bin/docker inspect *
nagios    ALL=(ALL:ALL)  NOPASSWD: /usr/lib64/nagios/plugins/check-docker-container.sh *

This allows the nagios user to run both the wrapper script around the docker inspect command and the docker control command itself, without requiring a password. Note, only inspect permission is granted. Obviously, we don’t want to give nagios permission to actually manipulate containers.

In addition to this, we must make provision for NRPE to run the command using sudo when called via the NRPE TCP daemon. So, in nrpe.cfg, instead of:

command[check_docker_container1]=/usr/lib64/nagios/plugins/check-docker-container.sh container1

we have:

command[check_docker_container1]=sudo /usr/lib64/nagios/plugins/check-docker-container.sh container1