Hi! I have a mixed set of containers (a few, not too many) and bare-metal services (quite a few), and I would like to monitor them.

I am using good old “monit” to monitor my network interfaces, filesystem status, and traditional services (via PID files). It’s not pretty, but it gets the job done. However, I cannot find a way to have it monitor my containers as well. Note that I use podman and have a strict one-service, one-user policy (all containers are rootless).

I also run “netdata”, but I find it overwhelming: too much data, too many graphs, just too much for my needs.

I need something that:

  • lets me monitor service status
  • lets me monitor container status
  • lets me restart services or containers (not mandatory, but preferred)
  • has a nice web GUI
  • has a web GUI that is also mobile friendly (not mandatory, but appreciated)
  • can show some historical data (not mandatory, but interesting)
  • can monitor CPU usage (mandatory)
  • can monitor filesystem usage (mandatory)

I don’t care about authentication features, since it will sit behind a reverse proxy that already provides HTTPS and proxy authentication.

I am not looking for a fancy and complex dashboard, but for something I can host on a secondary page that I open if/when I want to check stuff. Also, it would be useful if the tool could be scripted or accessed via an API, so I could write some extractors to show a summary on a page in my own dashboard.

  • Avid Amoeba@lemmy.ca
    6 months ago

    I think Prometheus is a good industry standard. It can do everything you listed except for restarting stuff. It’s got decent built-in monitoring capabilities, and you can trivially extend it to monitor anything. For example, I wrote a 5-liner to monitor ZFS health and another for LVM. I even monitor my routers with it. OpenWrt has an installable node exporter for Prometheus.
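
To give an idea of what such a small custom check might look like, here is a minimal sketch (not the script mentioned above) that uses node exporter's textfile collector, which reads `*.prom` files from the directory given by `--collector.textfile.directory`. The metric name `zfs_pools_healthy` and the output path are assumptions made for illustration; the check relies on `zpool status -x` printing "all pools are healthy" when nothing is wrong.

```python
import subprocess
from pathlib import Path

# Hypothetical metric name and output path; adjust both to your own setup.
METRIC = "zfs_pools_healthy"
PROM_FILE = Path("/var/lib/node_exporter/textfile/zfs.prom")


def render_metric(zpool_output: str) -> str:
    """Turn `zpool status -x` output into one sample in Prometheus text format."""
    healthy = 1 if "all pools are healthy" in zpool_output else 0
    return (
        f"# HELP {METRIC} 1 if zpool status -x reports all pools healthy.\n"
        f"# TYPE {METRIC} gauge\n"
        f"{METRIC} {healthy}\n"
    )


def write_metric() -> None:
    """Run the check and publish the result for the textfile collector."""
    result = subprocess.run(
        ["zpool", "status", "-x"], capture_output=True, text=True, check=False
    )
    # Write to a temporary name first, then rename, so that node exporter
    # never picks up a half-written file (it only reads *.prom files).
    tmp = PROM_FILE.with_name(PROM_FILE.name + ".tmp")
    tmp.write_text(render_metric(result.stdout))
    tmp.replace(PROM_FILE)
```

Calling `write_metric()` from a cron job or a systemd timer every few minutes would be enough for Prometheus to scrape the value along with the rest of the node exporter metrics.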

    Service restarting is a remote execution capability and generally falls outside of the monitoring domain. You’d be better off implementing that with another process/service manager. If you’re running systemd, that’s one of its primary purposes. You can use it to start/stop/restart containers just like normal processes.
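
As a rough sketch of the systemd approach, assuming each rootless container already runs as a systemd user unit, a small script can check and restart it through `systemctl --user`. The unit name `container-myapp.service` used below is hypothetical, and the script has to run as the user that owns the unit.

```python
import subprocess

ALLOWED_ACTIONS = {"start", "stop", "restart", "is-active"}


def systemctl_cmd(action: str, unit: str, user: bool = True) -> list[str]:
    """Build a systemctl command line; --user targets the caller's own service manager."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    cmd = ["systemctl"]
    if user:
        cmd.append("--user")
    return cmd + [action, unit]


def is_active(unit: str, user: bool = True) -> bool:
    """Return True when `systemctl is-active` exits with status 0."""
    result = subprocess.run(systemctl_cmd("is-active", unit, user), capture_output=True)
    return result.returncode == 0


def restart(unit: str, user: bool = True) -> None:
    """Restart the unit and raise CalledProcessError if systemctl fails."""
    subprocess.run(systemctl_cmd("restart", unit, user), check=True)
```

For example, `if not is_active("container-myapp.service"): restart("container-myapp.service")` would restart the container only when it is down; passing `user=False` covers ordinary system services managed by the system-wide systemd instance.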