AMD engineer Prateek Nayak recently introduced a CPU idle driver patch for Linux that “skips the dummy wait for processors based on the Zen microarchitecture.”
When ACPI support was added to the Linux kernel in 2002, it included a “dummy wait operation”. In essence, the system read data for no reason other than to delay execution of the next instruction until the CPU had been fully stopped by the STPCLK# signal. This allowed for some power savings and compatibility in the early days of ACPI, when some chipsets did not enter an idle state when they were expected to.
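The sequence described above can be sketched as a small simulation. This is a toy model, not kernel code: the `FakeIoBus` class, the port addresses, and the `skip_dummy_wait` flag are all hypothetical names introduced for illustration. It only shows the shape of the idea, in which one read requests the idle state and a second, otherwise useless read stalls the CPU until the stop actually takes effect.

```python
from dataclasses import dataclass, field

# Hypothetical port addresses, chosen only for this illustration.
CSTATE_PORT = 0x414  # reading this port requests the idle state
TIMER_PORT = 0x408   # reading this port is the "dummy wait"


@dataclass
class FakeIoBus:
    """Stands in for real port I/O and records every read."""
    log: list = field(default_factory=list)

    def read(self, port: int) -> int:
        self.log.append(port)
        return 0


def enter_idle(bus: FakeIoBus, skip_dummy_wait: bool) -> None:
    """Model of idle entry: request the C-state, then optionally stall."""
    bus.read(CSTATE_PORT)
    if not skip_dummy_wait:
        # The value is discarded; the read exists only to burn time.
        bus.read(TIMER_PORT)


legacy_bus = FakeIoBus()
enter_idle(legacy_bus, skip_dummy_wait=False)

zen_bus = FakeIoBus()
enter_idle(zen_bus, skip_dummy_wait=True)
```

In this model the legacy path performs two reads per idle entry and the patched path performs one, which is the saving the patch is after.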
But this workaround was relevant for 20-year-old Athlon processors. Modern AMD processors based on the Zen architecture do not need it. As Nayak writes, it actually hurts them, at least under certain Linux workloads. Profiling workloads with instruction-based sampling shows that “a significant amount of time is spent on a dummy operation that is erroneously counted as C-State residency.” Because that time is counted as idle, the kernel can then choose a deeper, slower C-State, which takes the CPU longer to wake from. The penalty is felt most in tasks that switch frequently between work and idle states.
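A simplified model can show why inflated residency figures lead to deeper idle states. The C-State table, the latency numbers, and the naive averaging predictor below are assumptions made for illustration and do not reflect the actual Linux idle governor.

```python
from typing import List, NamedTuple


class CState(NamedTuple):
    name: str
    exit_latency_us: int      # time needed to wake up
    target_residency_us: int  # minimum idle time that makes the state worthwhile


# Hypothetical values, ordered from shallowest to deepest.
STATES = [CState("C1", 1, 2), CState("C2", 400, 800)]


def predict_idle(recorded_residencies_us: List[float]) -> float:
    """Naive predictor: mean of the recently recorded idle residencies."""
    return sum(recorded_residencies_us) / len(recorded_residencies_us)


def pick_state(predicted_idle_us: float) -> CState:
    """Pick the deepest state whose target residency fits the prediction."""
    chosen = STATES[0]
    for state in STATES:
        if state.target_residency_us <= predicted_idle_us:
            chosen = state
    return chosen


true_idle_us = [300, 350, 320]  # actual idle periods (hypothetical)
dummy_wait_us = 600             # time spent in the dummy read (hypothetical)
inflated_us = [t + dummy_wait_us for t in true_idle_us]

honest_choice = pick_state(predict_idle(true_idle_us))
skewed_choice = pick_state(predict_idle(inflated_us))
print(honest_choice.name, skewed_choice.name)
```

With the honest figures the model stays in the shallow state; once the dummy wait is counted as idle time, it moves to the deeper state and pays the larger exit latency on every wake-up.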
Nayak ran tbench tests on a dual-socket system based on the Zen 3 architecture. The comparison covered a baseline Linux kernel, a kernel with the C2 state fully disabled, and a kernel patched to skip the dummy wait operation. With the patched kernel, minimum throughput in MB/s increased by 1390% and average throughput by 51% compared with the baseline kernel. These results are only slightly behind those of the kernel with C2 fully disabled.
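To put those percentages in perspective, a percentage increase can be converted into a multiple of the baseline. The helper below is a small illustrative sketch with a hypothetical function name. It uses only the two percentages quoted above, since the article does not give the underlying MB/s figures.

```python
def increase_to_factor(percent_increase: float) -> float:
    """Convert a percentage increase into a multiple of the baseline."""
    return 1.0 + percent_increase / 100.0


min_factor = increase_to_factor(1390)  # minimum throughput
avg_factor = increase_to_factor(51)    # average throughput

print(f"minimum throughput: about {min_factor:.1f}x baseline")
print(f"average throughput: about {avg_factor:.2f}x baseline")
```

In other words, a 1390% increase means the worst-case throughput was roughly 15 times the baseline, while the average was about one and a half times the baseline.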
If a patch to remove or limit the “dummy wait” is approved this week as an urgent fix, it will most likely be part of the Linux 6.0 kernel, which is expected next week.
Source: Ars Technica