Docker Compose is a great way to set up small test environments, locally or remotely. It allows you to define your infrastructure as code and does not require any prerequisite or post-deployment tasks.
The Docker installation is well documented at https://docs.docker.com/get-docker/ and is well supported amongst the most popular operating systems. The installation itself will not be covered in this article. If you want to get familiar with the details of Docker, start with the documentation at https://docs.docker.com/get-started/.
All code used in this article is available at: https://github.com/insani4c/docker-monitoring-stack.
In this article, we will see how to set up a monitoring solution based on:
- Prometheus
- Prometheus Node Exporter
- Prometheus Blackbox Exporter
- Prometheus SNMP Exporter
- Loki
- Promtail
- Grafana
To monitor the deployed containers, we will also deploy Google's cAdvisor container, to get some interesting statistics and details into our Prometheus/Grafana setup.
The Docker Compose file, called docker-compose.yml, contains all the information about the infrastructure, such as:
- network information
- volumes
- services (the containers)
- …
Let’s start from the top of the file.
```yaml
version: '3.8'

name: docmon

volumes:
  grafana-data: {}
  alertmanager-data: {}
  prometheus-data: {}
  loki-data: {}
```
First, at line 1, the Docker Compose version is specified, which defines which specification features are allowed. At line 3, a name for the container group or stack is set. And finally, starting at line 5, data volumes (think disks) are defined, which will be used by the containers. These are persistent data volumes, which will be reused unless the container has been completely removed.
Next, we will define the services in the docker-compose.yml:
```yaml
services:
  cadvisor:
    image: 'gcr.io/cadvisor/cadvisor:latest'
    container_name: cadvisor
    restart: always
    mem_limit: 512m
    mem_reservation: 32m
    # ports:
    #   - '8880:8080'
    volumes:
      - '/:/rootfs:ro'
      - '/var/run:/var/run:ro'
      - '/sys:/sys:ro'
      - '/var/lib/docker/:/var/lib/docker:ro'
      - '/dev/disk/:/dev/disk:ro'
    privileged: true
    devices:
      - '/dev/kmsg:/dev/kmsg'

  prometheus:
    image: 'prom/prometheus:latest'
    container_name: prometheus
    restart: always
    mem_limit: 2048m
    mem_reservation: 256m
    cpus: 2
    # ports:
    #   - '9090:9090'
    volumes:
      - '$PROMETHEUS_HOME/config:/etc/prometheus'
      - 'prometheus-data:/prometheus'
    extra_hosts:
      myrouter: 192.168.1.1
      myswitch: 192.168.1.10
    depends_on:
      - cadvisor
```
Containers are defined as `services`. Each service will require at least:
- a service name (for example, line 2 and line 20)
- an `image` definition

All other options are optional or required by specific images.
The first image or container defined in the above example is `cadvisor`. This service provides statistics about Docker and the deployed containers to Prometheus. To be able to provide this information, the container must have read access to certain file paths or sockets on the hypervisor (read: the server where the Docker containers will be running). These are provided in the `volumes` section of the container. Here, directory paths on the hypervisor are provided as mount points in the container, and they are mounted with the read-only (`:ro`) option so that the container can't make any changes to them.
Furthermore, it provides access to a device (`/dev/kmsg`, to read kernel messages), sets `memory` and `cpu` limits, and runs the container in `privileged` mode. The `ports` section has been commented out, as it isn't really required to expose ports or make them available outside the Docker ecosystem. In our example, only Prometheus must be able to connect to it, and since Prometheus will be deployed as a container as well, we don't need external access to the web service running on the container to read out the metrics or see the statistics.
The next container defined is called `prometheus`. For this container, `volumes` will be mounted to provide the Prometheus configuration files and to store the data in the volume called `prometheus-data`. It also defines `extra_hosts`. These are entries that are typically defined in an `/etc/hosts` file, which Docker does not read from the hypervisor. Instead of deploying or mounting the hypervisor's `hosts` file, extra host mappings can be handed to the container, which is done in the `extra_hosts` section as above.
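The Prometheus configuration itself lives in `$PROMETHEUS_HOME/config`. As an illustration only (the actual scrape configuration in the repository may differ), a minimal `prometheus.yml` could scrape cAdvisor and the node-exporter by their service names, since container names resolve via Docker's internal DNS on the Compose network:

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  # cAdvisor exposes container metrics on port 8080
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

  # node-exporter metrics for the hypervisor on port 9100
  - job_name: 'hypervisor'
    static_configs:
      - targets: ['hypervisor:9100']
```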
At the end of the `prometheus` container definition, a `depends_on` section is configured, which means that the `prometheus` container won't be deployed until the containers defined in that section are up and running.
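Note that the volume definitions reference variables such as `$PROMETHEUS_HOME`; Docker Compose substitutes these from the shell environment or from a `.env` file placed next to `docker-compose.yml`. A hedged sketch of such a file (the paths are assumptions, not taken from the repository):

```shell
# .env — read automatically by Docker Compose
PROMETHEUS_HOME=/opt/docmon/prometheus
PROMSNMP_HOME=/opt/docmon/snmp_exporter
ALERTMANAGER_HOME=/opt/docmon/alertmanager
LOKI_HOME=/opt/docmon/loki
BLACKBOXEXPORTER_HOME=/opt/docmon/blackbox_exporter
PROMTAIL_HOME=/opt/docmon/promtail
GRAFANA_HOME=/opt/docmon/grafana
```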
Next, we will define all the other containers:
```yaml
  hypervisor:
    image: 'prom/node-exporter:latest'
    container_name: hypervisor
    mem_limit: 128m
    mem_reservation: 32m
    restart: unless-stopped
    volumes:
      - '/:/host:ro,rslave'
      - '/proc:/host/proc:ro'
      - '/sys:/host/sys:ro'
    command:
      - '--path.rootfs=/host'
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
      - '--collector.systemd'
      - '--collector.cgroups'
    depends_on:
      - cadvisor

  prom_snmp:
    image: 'prom/snmp-exporter:latest'
    container_name: prom_snmp
    restart: always
    mem_limit: 128m
    mem_reservation: 32m
    # ports:
    #   - '9116:9116'
    volumes:
      - '$PROMSNMP_HOME/config:/etc/snmp_exporter'
    extra_hosts:
      myrouter: 192.168.1.1
      myswitch: 192.168.1.10
    depends_on:
      - cadvisor
      - prometheus

  alertmanager:
    image: 'prom/alertmanager:latest'
    container_name: alertmanager
    restart: always
    mem_limit: 256m
    mem_reservation: 32m
    # ports:
    #   - 9093:9093
    volumes:
      - '$ALERTMANAGER_HOME/config/alertmanager.yml:/etc/alertmanager/config.yml'
      - 'alertmanager-data:/alertmanager'
    command:
      - '--config.file=/etc/alertmanager/config.yml'
      - '--storage.path=/alertmanager'
    depends_on:
      - cadvisor
      - prometheus

  loki:
    image: 'grafana/loki:latest'
    container_name: loki
    restart: always
    mem_limit: 32768m
    mem_reservation: 8192m
    cpus: 6
    ports:
      - '3100:3100'
    volumes:
      - '$LOKI_HOME/config:/etc/loki'
      - 'loki-data:/loki'
    depends_on:
      - cadvisor
      - prometheus
      - alertmanager

  blackbox_exporter:
    image: 'prom/blackbox-exporter:latest'
    container_name: blackbox_exporter
    restart: always
    mem_limit: 128m
    mem_reservation: 32m
    dns:
      - 8.8.8.8
      - 8.8.4.4
    # ports:
    #   - 9115:9115
    volumes:
      - '$BLACKBOXEXPORTER_HOME/config:/etc/blackboxexporter/'
    command:
      - '--config.file=/etc/blackboxexporter/config.yml'
    depends_on:
      - cadvisor
      - prometheus

  promtail:
    image: grafana/promtail:latest
    container_name: promtail
    restart: always
    mem_limit: 256m
    mem_reservation: 64m
    volumes:
      - $PROMTAIL_HOME/config:/etc/promtail/
      # to read container labels and logs
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
      - '/var/lib/docker/containers:/var/lib/docker/containers:ro'
      - '/var/log/ulog:/var/log/ulog/:ro'
    depends_on:
      - cadvisor
      - loki

  grafana:
    image: 'grafana/grafana:latest'
    container_name: grafana
    restart: always
    mem_limit: 2048m
    mem_reservation: 256m
    ports:
      - '3000:3000'
    volumes:
      - '$GRAFANA_HOME/config:/etc/grafana'
      - 'grafana-data:/var/lib/grafana'
      - '$GRAFANA_HOME/dashboards:/var/lib/grafana/dashboards'
    depends_on:
      - cadvisor
      - prometheus
      - loki
      - alertmanager
```
The rest of the code will deploy:
- a container called `hypervisor`, which is actually the Prometheus node-exporter for the hypervisor
- a container called `prom_snmp`, which will retrieve SNMP statistics
- a container called `blackbox_exporter`, which mainly checks web servers and their SSL certificates
- a container called `promtail`, which collects logs and log statistics from the hypervisor
- a container called `loki`, which stores and indexes the logs sent to it by `promtail` (either the container or an agent running on some external server)
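The `promtail` container mounts the Docker socket, so it can discover running containers and read their logs. A minimal, hedged `promtail` configuration sketch pushing to the `loki` container (file layout and labels are assumptions; the repository's config may differ):

```yaml
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  # push logs to the loki container over the Compose network
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 15s
    relabel_configs:
      # label each log stream with its container name
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: container
```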
Finally, the last container deployed is the `grafana` container. Besides its normal configuration file `grafana.ini`, the Docker container will also automatically provision datasources and dashboards (via the `provisioning` sub-directory in the `config` directory), so that no manual post-deployment tasks are required once the containers are running.
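Assuming the layout used here (an assumption based on the mounts above, not an exact listing of the repository), `$GRAFANA_HOME` would look roughly like this:

```
$GRAFANA_HOME/
├── config/
│   ├── grafana.ini
│   └── provisioning/
│       ├── datasources/
│       │   └── default.yaml
│       └── dashboards/
│           └── default.yaml
└── dashboards/
    └── (dashboard JSON files)
```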
The datasources can be preconfigured in a YAML file called `default.yaml`, stored in the `provisioning/datasources/` sub-directory.
```yaml
apiVersion: 1

datasources:
  - name: Alertmanager
    type: alertmanager
    access: proxy
    orgId: 1
    url: http://alertmanager:9093
    version: 1
    editable: false
    isDefault: false
    uid: DS_ALERTMANAGER
    jsonData:
      implementation: prometheus

  - name: Prometheus
    type: prometheus
    access: proxy
    orgId: 1
    url: http://prometheus:9090
    version: 1
    editable: false
    isDefault: true
    uid: DS_PROMETHEUS
    jsonData:
      alertmanagerUid: DS_ALERTMANAGER
      manageAlerts: true
      prometheusType: Prometheus
      prometheusVersion: 2.39.1

  - name: Loki
    type: loki
    access: proxy
    orgId: 1
    url: http://loki:3100
    version: 1
    editable: false
    isDefault: false
    uid: DS_LOKI
    jsonData:
      alertmanagerUid: DS_ALERTMANAGER
      manageAlerts: true
```
The same goes for the dashboards we want to have deployed automatically:
```yaml
apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: 'Custom'
    folderUid: ''
    type: file
    options:
      path: /var/lib/grafana/dashboards
```
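Any dashboard JSON file dropped into `$GRAFANA_HOME/dashboards` on the hypervisor ends up in `/var/lib/grafana/dashboards` inside the container and is picked up by this provider. A minimal, hedged skeleton (title and uid are placeholders, not files from the repository):

```json
{
  "uid": "docmon-overview",
  "title": "Docmon Overview",
  "schemaVersion": 39,
  "version": 1,
  "panels": []
}
```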
Finally, if the Docker host has multiple network interfaces (for instance because it is a hosted server, or because it has both internal and external IP addresses), you might want to limit access to the containers to specific networks only.
Below is a `netfilter` example, which allows only traffic coming from 192.168.1.0/24 on the network interface enp35s0:
```shell
iptables -I DOCKER-USER -i enp35s0 ! -s 192.168.1.0/24 -m conntrack --ctdir ORIGINAL -j DROP
```
The chain `DOCKER-USER` is not flushed by Docker and can thus be created in a general firewall script or `netfilter` configuration, even at boot time:
```shell
-N DOCKER-USER
-I DOCKER-USER -i enp35s0 ! -s 192.168.1.0/24 -m conntrack --ctdir ORIGINAL -j DROP
```