What’s the best way to monitor and log which processes are responsible for high system load throughout the day? Tools like top and htop only provide immediate values, but I’m looking for a solution that offers historical data to identify the main culprits over time.
In my time we used sar. I feel old when reading about all your new tools I never heard of.
Cockpit will show you load spikes over time pretty much out of the box.
https://www.paessler.com/prtg/download We are using this. Loving it, but I think it only runs on Windows. It's free for the first 100 sensors, which should be enough at home.
atop should be available in your package manager and run as a daemon. It stores the history in /var/
I like to use atop as the first step during an investigation: https://www.atoptool.nl/
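If you want to see what this kind of periodic sampling boils down to, here is a rough Python sketch. It is not how atop or sar work internally, just the general idea: on Linux, read utime + stime from /proc/<pid>/stat, diff two snapshots, and append the biggest consumers to a log. The function names, the 60-second interval, and the load-culprits.log file name are my own placeholders.

```python
#!/usr/bin/env python3
"""Log the top CPU consumers at a fixed interval (Linux only, reads /proc)."""
import os
import time


def parse_stat_line(line):
    """Return (pid, command, cpu_ticks) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split around the first "(" and the last ")".
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    fields = line[close_paren + 1:].split()
    # fields[0] is the state (field 3 in proc(5)); utime and stime are
    # fields 14 and 15, which land at indexes 11 and 12 here.
    return pid, comm, int(fields[11]) + int(fields[12])


def snapshot():
    """Map pid -> (command, cpu_ticks) for every process readable right now."""
    procs = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                pid, comm, ticks = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited (or stat was unreadable) mid-scan
        procs[pid] = (comm, ticks)
    return procs


def top_consumers(before, after, n=5):
    """Return the n processes that used the most CPU ticks between snapshots."""
    deltas = []
    for pid, (comm, ticks) in after.items():
        if pid in before and before[pid][0] == comm:
            used = ticks - before[pid][1]
        else:
            used = ticks  # process appeared since the previous snapshot
        if used > 0:
            deltas.append((used, pid, comm))
    deltas.sort(reverse=True)
    return deltas[:n]


def main(interval=60, logfile="load-culprits.log"):
    """Sample forever; interval and logfile are placeholder defaults."""
    previous = snapshot()
    while True:
        time.sleep(interval)
        current = snapshot()
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        load1 = os.getloadavg()[0]
        with open(logfile, "a") as log:
            for used, pid, comm in top_consumers(previous, current):
                log.write(
                    f"{stamp} load1={load1:.2f} "
                    f"pid={pid} comm={comm} ticks={used}\n"
                )
        previous = current


if __name__ == "__main__":
    main()
```

Keep in mind that CPU ticks are only part of the picture: on Linux, processes stuck in uninterruptible I/O wait also push the load average up, and this sketch will not catch those. For anything beyond a quick look, a proper tool like atop or sar is the better choice.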
I like Zabbix. It can monitor whatever I like, using SNMP, IPMI, REST APIs, or its own agent.
I have a team member insisting on using Netdata, but outside of the nice dashboard it doesn’t provide anything. It is local only, and setting up alarms is a pain. And tbh it nags more than Canonical stuff.
Netdata is excellent, simple and I believe FOSS. Just install locally and it should start logging pretty much everything.
Clicked the link, started reading … closed the window when I read “Netdata also incorporates A.I. insights for all monitored data”.
This kind of limited-scope, ML-trained analysis is actually where “AI” excels, e.g. “computer vision” in specific medical scenarios.
If the training data is available, yes. In this case, no chance.