Hi! I have a mixed set of containers (a few, not too many) and bare-metal services (quite a few) and I would like to monitor them.
I am using good old “monit”, which monitors my network interfaces, filesystem status and traditional services (via pid files). It’s not pretty, but it gets the job done. However, I cannot find a way to have it also monitor my containers. Note that I use podman and have a strict one-service, one-user policy (all containers are rootless).
I also run “netdata”, but I find it overwhelming: too much data, too many graphs, just too much for my needs.
I need something that:
- lets me monitor service status
- lets me monitor container status
- lets me restart services or containers (not mandatory, but preferred)
- has a nice web GUI
- has a web GUI that is also mobile friendly (not mandatory, but appreciated)
- can show some historical data (not mandatory, but interesting)
- can monitor CPU usage (mandatory)
- can monitor filesystem usage (mandatory)
I don’t need authentication features, since it will sit behind a reverse proxy that already provides HTTPS and proxy authentication.
I am not looking for a fancy and complex dashboard, but for something I can host on a secondary page that I open if/when I want to check stuff. Also, it would be useful if the tool could be scripted or accessed via an API, so I could write some extractors to print a summary in my own dashboard.
I think Prometheus is a good industry standard. It can do everything you listed except for restarting stuff. It’s got a decent built-in monitoring capability and you can extend it trivially to monitor anything. For example I wrote a 5-liner to monitor ZFS health and another for LVM. I even monitor my routers with it. OpenWrt has an installable node exporter for Prometheus.
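To give an idea of what such a small custom check can look like, here is a minimal sketch in Python. It is not the actual script mentioned above, just one plausible approach: it assumes `zpool list -H -o name,health` is available and that node exporter runs with its textfile collector enabled, which reads `*.prom` files from a directory you configure. The metric name `zpool_healthy` is made up for this example.

```python
import subprocess


def render_zpool_health(zpool_output: str) -> str:
    """Convert `zpool list -H -o name,health` output (tab-separated,
    one pool per line) into Prometheus text exposition format."""
    lines = [
        # zpool_healthy is a hypothetical metric name chosen for this sketch.
        "# HELP zpool_healthy 1 if the pool reports ONLINE, 0 otherwise.",
        "# TYPE zpool_healthy gauge",
    ]
    for row in zpool_output.splitlines():
        if not row.strip():
            continue  # skip blank lines
        name, health = row.split("\t")
        value = 1 if health.strip() == "ONLINE" else 0
        lines.append(f'zpool_healthy{{pool="{name}"}} {value}')
    return "\n".join(lines) + "\n"


def main() -> None:
    """Query the real pools and print the metrics to stdout."""
    out = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,health"],
        check=True, capture_output=True, text=True,
    ).stdout
    print(render_zpool_health(out), end="")
```

Calling `main()` from a cron job or systemd timer and redirecting the output to a `.prom` file in the textfile collector directory should be enough for Prometheus to pick the metric up on its next scrape. The same pattern would work for LVM or anything else that has a command-line status tool.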
Service restarting is a remote execution capability and generally falls outside of the monitoring domain. You’d be better off implementing that with another process/service manager. If you’re running systemd, that’s one of its primary purposes. You can use it to start/stop/restart containers just like normal processes.
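If you later want a restart button or script of your own, it can stay a thin wrapper around `systemctl`. The sketch below is only an illustration under a few assumptions: each container already has a systemd unit, the unit name `myapp.service` is a placeholder, and for rootless containers the unit lives in the owning user's systemd instance, so the script has to run as that user with `--user`.

```python
import subprocess

# Only these systemctl verbs are accepted, so the helper cannot be
# tricked into running arbitrary subcommands.
ALLOWED_ACTIONS = {"start", "stop", "restart", "is-active"}


def systemctl_argv(action: str, unit: str, user_scope: bool = False) -> list[str]:
    """Build the systemctl command line for one action on one unit."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    argv = ["systemctl"]
    if user_scope:
        # Talk to the calling user's service manager (rootless setups).
        argv.append("--user")
    argv += [action, unit]
    return argv


def run_systemctl(action: str, unit: str, user_scope: bool = False) -> int:
    """Run the command and return systemctl's exit code (0 means success)."""
    return subprocess.run(systemctl_argv(action, unit, user_scope)).returncode
```

For example, `run_systemctl("restart", "myapp.service", user_scope=True)` would restart that user's unit, and `run_systemctl("is-active", "myapp.service")` returns 0 only while the unit is active, which is handy for a quick status line in a summary page.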