How to monitor Docker containers with Nagios and NRPE

Monitoring whether or not a Docker container is alive on a remote host should be fairly easy, right?

The standard approach is to install a suitable NRPE script on the remote host and call it from your Nagios server via the NRPE TCP daemon on that host. This script is a good example, and we’ll refer to it in the rest of the article.

This generally works fine when you’re doing innocuous things like checking free disk space or whether a certain process is running. Checking a Docker container is a little harder, because the command:

docker inspect

can, by default, only be run as root (or by a member of the docker group), whereas the NRPE service on the remote host runs as a non-privileged user (usually called nagios).

As such, when you test your NRPE call from the Nagios server, like so:

/usr/lib64/nagios/plugins/check_nrpe -H dockerhost.yourdomain.com -c check_docker_container1

You will see a response like:

NRPE: Unable to read output

or

UNKNOWN - container1 does not exist.

You get this response because the nagios user cannot execute the docker control command.
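One way to confirm this diagnosis on the Docker host is to check whether the current user can read and write the socket the docker client uses to reach the daemon. The sketch below assumes the default socket path /var/run/docker.sock; the function name can_use_docker_socket is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the current user can read and write
# the Docker daemon socket. The default path is an assumption; adjust it
# if your daemon listens elsewhere.
can_use_docker_socket() {
    sock="${1:-/var/run/docker.sock}"
    [ -r "$sock" ] && [ -w "$sock" ]
}
```

Called from a shell running as the nagios user (for example, one started with sudo -u nagios sh), it should fail on a host where only root may use Docker.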

You could get around this by running NRPE on the remote host as the root user, but that is a bad idea and you should never do it.

A better approach (if you are confident that your Nagios setup is secure) is to extend controlled privileges to the nagios user via sudo. You can create the following file, /etc/sudoers.d/docker, to achieve this:

nagios    ALL=(ALL:ALL)  NOPASSWD: /usr/bin/docker inspect *
nagios    ALL=(ALL:ALL)  NOPASSWD: /usr/lib64/nagios/plugins/check-docker-container.sh *

This allows the nagios user to run both the wrapper script around docker inspect and the docker inspect command itself, without requiring a password. Note that only the inspect subcommand is permitted; we don’t want to give nagios permission to actually manipulate containers.
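For orientation, here is a minimal sketch of what a wrapper around docker inspect might look like. It is not the script referenced in this article: the function name check_container and the DOCKER_CMD override are assumptions made for illustration. The return codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Hypothetical sketch of a container check; NOT the script referenced in
# this article. DOCKER_CMD is an illustrative override so the function
# can be exercised without a real Docker daemon.
check_container() {
    name="$1"
    # docker inspect prints "true" or "false" for a known container and
    # exits non-zero when the container does not exist (or when the
    # caller is not allowed to talk to the daemon).
    if ! running=$("${DOCKER_CMD:-docker}" inspect --format '{{.State.Running}}' "$name" 2>/dev/null); then
        echo "UNKNOWN - $name does not exist."
        return 3
    fi
    if [ "$running" = "true" ]; then
        echo "OK - $name is running."
        return 0
    fi
    echo "CRITICAL - $name is not running."
    return 2
}
```

Because a permission failure also makes docker inspect exit non-zero, a check written this way would report a container as missing when the real problem is that the nagios user cannot reach Docker, which matches the misleading UNKNOWN response shown earlier.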

In addition, NRPE must run the command using sudo when it is called via the NRPE TCP daemon. So, in nrpe.cfg, instead of:

command[check_docker_container1]=/usr/lib64/nagios/plugins/check-docker-container.sh container1

we have:

command[check_docker_container1]=sudo /usr/lib64/nagios/plugins/check-docker-container.sh container1
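If you monitor several containers, writing one such line per container by hand gets tedious. The sketch below prints the definitions for a list of container names, following the naming pattern used above. The function name nrpe_docker_commands and the PLUGIN variable are made up for illustration; the plugin path is the one used in this article.

```shell
#!/bin/sh
# Hypothetical helper: prints one nrpe.cfg command definition per
# container name, following the naming pattern used in this article.
# PLUGIN defaults to the article's plugin path; override it if yours differs.
PLUGIN="${PLUGIN:-/usr/lib64/nagios/plugins/check-docker-container.sh}"

nrpe_docker_commands() {
    for name in "$@"; do
        printf 'command[check_docker_%s]=sudo %s %s\n' "$name" "$PLUGIN" "$name"
    done
}
```

For example, nrpe_docker_commands container1 container2 prints two lines that can be reviewed and then added to nrpe.cfg. The sudoers rule above ends in a wildcard, so it should already cover any container name passed to the script.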

2 thoughts on “How to monitor Docker containers with Nagios and NRPE”

  1. joffrey dupire

    Hi, i’ve followed your tutorial but when i add the following lines:

    nagios ALL=(ALL:ALL) NOPASSWD: /usr/bin/docker inspect *
    nagios ALL=(ALL:ALL) NOPASSWD: /usr/lib64/nagios/plugins/check-docker-container.sh *

    i’ve got this error :

    NRPE: Unable to read output

    i’m using Centreon and the same script as you have you got an idea ?
