LIKWID and Monitoring
In cluster environments, a monitoring system is often in place to track activity on the system. Some combine resource management (like SLURM, PBS, ...) with a monitoring solution. SLURM is one of those, but there are others.
If SLURM tracks activity with hardware counters, it will probably skew the LIKWID measurements. It is unclear whether this can be disabled on a per-job basis.
HTCondor is a high-throughput computing software framework with resource management and monitoring. There is currently no way to disable HTCondor's usage of hardware counters, so at the moment, don't use LIKWID on such managed systems.
Performance Co-Pilot (PCP) is a system performance analysis toolkit that measures hardware performance counters in the background. To run your own measurements, you have to disable PCP for your environment.
Disable PCP's hardware performance counting for the whole shell:
perfalloc -d
Disable PCP's hardware performance counting just for the command:
perfalloc <command>
There are many node agents available, and some of them have LIKWID support, i.e. they read hardware performance counters through LIKWID and not through perf_event or PAPI. Lately, we got reports that the values gathered with LIKWID are sometimes wrong, off, or physically impossible. We did some investigation and still don't know the exact reason, but it is caused by high system call times when the system is under load. It might be caused by security mitigations for hardware flaws like Meltdown. The startCounters() function is not critical, as all counter accesses (and therefore system calls) are performed before the timer is started, but the stopCounters() function is problematic. The first operation of stopCounters() is to stop the timer; only then are the counters read out (system calls). If each system call takes longer, the last system call might be issued X seconds after the timer was stopped. During this time, the counter keeps incrementing because there is load on the system (memory accesses, FP operations, ...). So the counter value might be much higher than expected, and deriving time-based metrics like bandwidths fails.
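The effect can be illustrated with a small back-of-the-envelope model. This is only a sketch under the assumption that the load generates traffic at a constant rate; the function name and the numbers are made up for illustration and are not part of LIKWID.

```python
def apparent_bandwidth(true_bw, interval, readout_delay):
    """Bandwidth derived from a counter that kept counting for
    `readout_delay` seconds after the timer was stopped.

    Assumes (hypothetically) a constant traffic rate `true_bw` in bytes/s.
    """
    # The counter sees traffic for the timed interval plus the readout delay ...
    counted_bytes = true_bw * (interval + readout_delay)
    # ... but the metric divides by the timed interval only.
    return counted_bytes / interval


true_bw = 100e9  # 100 GB/s of real memory traffic (illustrative value)
print(apparent_bandwidth(true_bw, interval=1.0, readout_delay=0.25) / 1e9)  # 125.0
```

With these assumed numbers, a quarter of a second of slow system calls turns 100 GB/s into an apparent 125 GB/s, which may already exceed what the memory interface can physically deliver.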
There is a possible solution to fix this. The monitoring agent (with LIKWID) needs a different scheduling policy than the other applications. We got reports that the round-robin policy fixes it:
# chrt --rr <prio> ./my-monitoring-daemon
According to tests, the priority does not matter; any value between 1 and 99 works.
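If the agent's startup code is under your control, the same policy could also be requested from inside the process instead of via chrt. The following is a minimal sketch for a Python-based agent, assuming Linux and sufficient privileges (root or CAP_SYS_NICE); the helper name try_round_robin is hypothetical and not part of LIKWID or any agent.

```python
import os


def try_round_robin(priority=1):
    """Try to switch the calling process to the round-robin policy.

    Returns True if the process runs under SCHED_RR afterwards,
    False if the platform or the permissions do not allow it.
    """
    if not hasattr(os, "sched_setscheduler"):
        return False  # scheduler API not available on this platform
    try:
        # pid 0 means "the calling process"
        os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(priority))
    except OSError:
        return False  # typically missing privileges
    return os.sched_getscheduler(0) == os.SCHED_RR


if __name__ == "__main__":
    print("round-robin active:", try_round_robin(1))
```

Falling back silently, as done here, keeps the agent running without the policy change; whether that is acceptable depends on how much the skewed values matter for the site.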
Another solution would be to increase the measurement time to reduce the effect of the slow system calls.
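How long the measurement has to be depends on how much delay the slow system calls add. As a rough sketch, again assuming a constant event rate, the relative overestimation is the readout delay divided by the measurement interval, so the required interval follows from the tolerated error. The function name and the numbers are illustrative assumptions.

```python
def min_interval(readout_delay, max_rel_error):
    """Shortest measurement interval (seconds) that keeps the relative
    overestimation below `max_rel_error`.

    Model assumption: overestimation = readout_delay / interval.
    """
    if max_rel_error <= 0:
        raise ValueError("max_rel_error must be positive")
    return readout_delay / max_rel_error


# Example: 0.25 s of delayed counter readout, at most 1 % error tolerated
print(f"{min_interval(0.25, 0.01):.1f} s")
```

The readout delay itself is not known in advance and varies with system load, so the result is an order-of-magnitude guide rather than a guarantee.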
Just changing the niceness is usually not enough. We got reports of this issue from different centers using official RHEL as well as Rocky Linux on their HPC systems. Interestingly, NHR@FAU uses AlmaLinux, another RHEL-derived distribution, and does not see these issues ... yet.