linux.git/tools/testing/selftests/resctrl/resctrl.h, branch v7.2-rc1

selftests/resctrl: Simplify perf usage in CAT test

2026-05-05T00:40:02+00:00

The CAT test relies on the PERF_COUNT_HW_CACHE_MISSES event to determine if
modifying a cache portion size is successful. This event is configured to
report the data as part of an event group, but no other events are added to
the group.

Remove the unnecessary PERF_FORMAT_GROUP format setting. This eliminates
the need for struct perf_event_read and results in read() of the associated
file descriptor to return just one value associated with the
PERF_COUNT_HW_CACHE_MISSES event of interest.

Link: https://lore.kernel.org/r/fb69325eba5031b735fa79effaaacd797c9c6040.1775266384.git.reinette.chatre@intel.com
Signed-off-by: Reinette Chatre 
Tested-by: Chen Yu 
Reviewed-by: Ilpo Järvinen 
Signed-off-by: Shuah Khan

selftests/resctrl: Raise threshold at which MBM and PMU values are compared

2026-05-05T00:40:02+00:00

Commit 501cfdba0a40 ("selftests/resctrl: Do not compare performance
counters and resctrl at low bandwidth") introduced a threshold under which
memory bandwidth values from MBM and performance counters are not compared.
This is needed because MBM and the PMUs do not have an identical view of
memory bandwidth since PMUs can count all memory traffic while MBM does not
count "overhead" (for example RAS) traffic that cannot be attributed to an
RMID. As a ratio this difference in view of memory bandwidth is pronounced
at low memory bandwidths.

The 750MiB threshold was chosen arbitrarily after comparisons on different
platforms. Exposed to more platforms after introduction this threshold has
proven to be inadequate.

Having accurate comparison between performance counters and MBM requires
careful management of system load as well as control of features that
introduce extra memory traffic, for example, patrol scrub. This is not
appropriate for the resctrl selftests that are intended to run on a
variety of systems with various configurations.

Increase the memory bandwidth threshold under which no comparison is made
between performance counters and MBM. Add additional leniency by increasing
the percentage of difference that will be tolerated between these counts.

There is no impact to the validity of the resctrl selftests results as a
measure of resctrl subsystem health.

Link: https://lore.kernel.org/r/b374c33ddd324130d6255cbb91c3dd500e8277e7.1775266384.git.reinette.chatre@intel.com
Signed-off-by: Reinette Chatre
Tested-by: Chen Yu
Reviewed-by: Ilpo Järvinen
Signed-off-by: Shuah Khan

selftests/resctrl: Reduce interference from L2 occupancy during cache occupancy test

2026-05-05T00:40:02+00:00

The CMT test creates a new control group that is also capable of monitoring
and assigns the workload to it. The workload allocates a buffer that by
default fills a portion of the L3 and keeps reading from the buffer,
measuring the L3 occupancy at intervals. The test passes if the workload's
L3 occupancy is within 15% of the buffer size.

The CMT test does not take into account that some of the workload's data
may land in L2/L1. Matching L3 occupancy to the size of the buffer while
a portion of the buffer can be allocated into L2 is not accurate.

Take the L2 cache into account to improve test accuracy:
 - Reduce the workload's L2 cache allocation to the minimum on systems that
   support L2 cache allocation. Do so with a new utility in preparation for
   all L3 cache allocation tests needing the same capability.
 - Increase the buffer size to accommodate data that may be allocated into
   the L2 cache. Use a buffer size double the L3 portion to keep using the
   L3 portion size as goal for L3 occupancy while taking into account that
   some of the data may be in L2.

Running the CMT test on a sample system while introducing significant
cache misses using "stress-ng --matrix-3d 0 --matrix-3d-zyx" shows
significant improvement in L3 cache occupancy:

Before:

    # Starting CMT test ...
    # Mounting resctrl to "/sys/fs/resctrl"
    # Cache size :335544320
    # Writing benchmark parameters to resctrl FS
    # Write schema "L3:0=fffe0" to resctrl FS
    # Write schema "L3:0=1f" to resctrl FS
    # Benchmark PID: 7089
    # Checking for pass/fail
    # Pass: Check cache miss rate within 15%
    # Percent diff=12
    # Number of bits: 5
    # Average LLC val: 73269248
    # Cache span (bytes): 83886080
    ok 1 CMT: test

After:
    # Starting CMT test ...
    # Mounting resctrl to "/sys/fs/resctrl"
    # Cache size :335544320
    # Writing benchmark parameters to resctrl FS
    # Write schema "L3:0=fffe0" to resctrl FS
    # Write schema "L3:0=1f" to resctrl FS
    # Write schema "L2:1=0x1" to resctrl FS
    # Benchmark PID: 7171
    # Checking for pass/fail
    # Pass: Check cache miss rate within 15%
    # Percent diff=0
    # Number of bits: 5
    # Average LLC val: 83755008
    # Cache span (bytes): 83886080
    ok 1 CMT: test

Link: https://lore.kernel.org/r/00445fa64c251b86b86023f87220ee1ad8561460.1775266384.git.reinette.chatre@intel.com
Reported-by: Dave Martin 
Signed-off-by: Reinette Chatre 
Tested-by: Chen Yu 
Reviewed-by: Ilpo Järvinen 
Link: https://lore.kernel.org/lkml/aO+7MeSMV29VdbQs@e133380.arm.com/
Signed-off-by: Shuah Khan

selftests/resctrl: Improve accuracy of cache occupancy test

2026-05-05T00:40:02+00:00

Dave Martin reported inconsistent CMT test failures. In one experiment
the first run of the CMT test failed because of too large (24%) difference
between measured and achievable cache occupancy while the second run passed
with an acceptable 4% difference.

The CMT test is susceptible to interference from the rest of the system.
This can be demonstrated with a utility like stress-ng by running the CMT
test while introducing cache misses using:

   stress-ng --matrix-3d 0 --matrix-3d-zyx

Below shows an example of the CMT test failing because of a significant
difference between measured and achievable cache occupancy when run with
interference:
    # Starting CMT test ...
    # Mounting resctrl to "/sys/fs/resctrl"
    # Cache size :335544320
    # Writing benchmark parameters to resctrl FS
    # Benchmark PID: 7011
    # Checking for pass/fail
    # Fail: Check cache miss rate within 15%
    # Percent diff=99
    # Number of bits: 5
    # Average LLC val: 235929
    # Cache span (bytes): 83886080
    not ok 1 CMT: test

The CMT test creates a new control group that is also capable of monitoring
and assigns the workload to it. The workload allocates a buffer that by
default fills a portion of the L3 and keeps reading from the buffer,
measuring the L3 occupancy at intervals. The test passes if the workload's
L3 occupancy is within 15% of the buffer size.

By not adjusting any capacity bitmasks the workload shares the cache with
the rest of the system. Any other task that may be running could evict
the workload's data from the cache causing it to have low cache occupancy.

Reduce interference from the rest of the system by ensuring that the
workload's control group uses the capacity bitmask found in the user
parameters for L3 and that the rest of the system can only allocate into
the inverse of the workload's L3 cache portion. Other tasks can thus no
longer evict the workload's data from L3.

With the above adjustments the CMT test is more consistent. Repeating the
CMT test while generating interference with stress-ng on a sample
system after applying the fixes show significant improvement in test
accuracy:

    # Starting CMT test ...
    # Mounting resctrl to "/sys/fs/resctrl"
    # Cache size :335544320
    # Writing benchmark parameters to resctrl FS
    # Write schema "L3:0=fffe0" to resctrl FS
    # Write schema "L3:0=1f" to resctrl FS
    # Benchmark PID: 7089
    # Checking for pass/fail
    # Pass: Check cache miss rate within 15%
    # Percent diff=12
    # Number of bits: 5
    # Average LLC val: 73269248
    # Cache span (bytes): 83886080
    ok 1 CMT: test

Link: https://lore.kernel.org/r/b160592179f88069cdc679563e152007998a0d76.1775266384.git.reinette.chatre@intel.com
Reported-by: Dave Martin 
Signed-off-by: Reinette Chatre 
Tested-by: Chen Yu 
Reviewed-by: Ilpo Järvinen 
Link: https://lore.kernel.org/lkml/aO+7MeSMV29VdbQs@e133380.arm.com/
Signed-off-by: Shuah Khan

selftests/resctrl: Add CPU vendor detection for Hygon

2026-01-09T23:49:01+00:00

The resctrl selftest currently fails on Hygon CPUs that support Platform
QoS features, printing the error:

  "# Can not get vendor info..."

This occurs because vendor detection is missing for Hygon CPUs.

Fix this by extending the CPU vendor detection logic to include
Hygon's vendor ID.

Link: https://lore.kernel.org/r/20251217030456.3834956-4-shenxiaochen@open-hieco.net
Signed-off-by: Xiaochen Shen 
Reviewed-by: Reinette Chatre 
Signed-off-by: Shuah Khan

selftests/resctrl: Define CPU vendor IDs as bits to match usage

2026-01-09T23:49:01+00:00

The CPU vendor IDs are required to be unique bits because they're used
for vendor_specific bitmask in the struct resctrl_test.
Consider for example their usage in test_vendor_specific_check():
	return get_vendor() & test->vendor_specific

However, the definitions of CPU vendor IDs in file resctrl.h is quite
subtle as a bitmask value:
  #define ARCH_INTEL     1
  #define ARCH_AMD       2

A clearer and more maintainable approach is to define these CPU vendor
IDs using BIT(). This ensures each vendor corresponds to a distinct bit
and makes it obvious when adding new vendor IDs.

Accordingly, update the return types of detect_vendor() and get_vendor()
from 'int' to 'unsigned int' to align with their usage as bitmask values
and to prevent potentially risky type conversions.

Furthermore, introduce a bool flag 'initialized' to simplify the
get_vendor() -> detect_vendor() logic. This ensures the vendor ID is
detected only once and resolves the ambiguity of using the same variable
'vendor' both as a value and as a state.

Link: https://lore.kernel.org/r/20251217030456.3834956-3-shenxiaochen@open-hieco.net
Suggested-by: Reinette Chatre 
Suggested-by: Fenghua Yu 
Signed-off-by: Xiaochen Shen 
Reviewed-by: Reinette Chatre 
Signed-off-by: Shuah Khan

selftests: complete kselftest include centralization

2025-11-27T22:24:31+00:00

This follow-up patch completes centralization of kselftest.h and
ksefltest_harness.h includes in remaining seltests files, replacing all
relative paths with a non-relative paths using shared -I include path in
lib.mk

Tested with gcc-13.3 and clang-18.1, and cross-compiled successfully on
riscv, arm64, x86_64 and powerpc arch.

[reddybalavignesh9979@gmail.com: add selftests include path for kselftest.h]
  Link: https://lkml.kernel.org/r/20251017090201.317521-1-reddybalavignesh9979@gmail.com
Link: https://lkml.kernel.org/r/20251016104409.68985-1-reddybalavignesh9979@gmail.com
Signed-off-by: Bala-Vignesh-Reddy 
Suggested-by: Andrew Morton 
Link: https://lore.kernel.org/lkml/20250820143954.33d95635e504e94df01930d0@linux-foundation.org/
Reviewed-by: Wei Yang 
Cc: David Hildenbrand 
Cc: David S. Miller 
Cc: Eric Dumazet 
Cc: Günther Noack 
Cc: Jakub Kacinski 
Cc: Liam Howlett 
Cc: Lorenzo Stoakes 
Cc: Michal Hocko 
Cc: Mickael Salaun 
Cc: Ming Lei 
Cc: Paolo Abeni 
Cc: Shuah Khan 
Cc: Simon Horman 
Cc: Suren Baghdasaryan 
Cc: Vlastimil Babka 
Signed-off-by: Andrew Morton

selftests/resctrl: Discover SNC kernel support and adjust messages

2025-01-15T00:06:32+00:00

Resctrl selftest prints a message on test failure that Sub-Numa
Clustering (SNC) could be enabled and points the user to check their BIOS
settings. No actual check is performed before printing that message so
it is not very accurate in pinpointing a problem.

When there is SNC support for kernel's resctrl subsystem and SNC is
enabled then sub node files are created for each node in the resctrlfs.
The sub node files exist in each regular node's L3 monitoring directory.
The reliable path to check for existence of sub node files is
/sys/fs/resctrl/mon_data/mon_L3_00/mon_sub_L3_00.

Add helper that checks for mon_sub_L3_00 existence.

Correct old messages to account for kernel support of SNC in
resctrl.

Signed-off-by: Maciej Wieczor-Retman 
Reviewed-by: Reinette Chatre 
Signed-off-by: Shuah Khan

selftests/resctrl: Adjust effective L3 cache size with SNC enabled

2025-01-15T00:06:32+00:00

Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA
nodes. Systems may support splitting into either two, three, four or six
nodes. When SNC mode is enabled the effective amount of L3 cache
available for allocation is divided by the number of nodes per L3.

It's possible to detect which SNC mode is active by comparing the number
of CPUs that share a cache with CPU0, with the number of CPUs on node0.

Detect SNC mode once and let other tests inherit that information.

Update CFLAGS after including lib.mk in the Makefile so that fallthrough
macro can be used.

To check if SNC detection is reliable one can check the
/sys/devices/system/cpu/offline file. If it's empty, it means all cores
are operational and the ratio should be calculated correctly. If it has
any contents, it means the detected SNC mode can't be trusted and should
be disabled.

Check if detection was not reliable due to offline cpus. If it was skip
running tests since the results couldn't be trusted.

Co-developed-by: Tony Luck 
Signed-off-by: Tony Luck 
Signed-off-by: Maciej Wieczor-Retman 
Reviewed-by: Reinette Chatre 
Signed-off-by: Shuah Khan

selftests/resctrl: Do not compare performance counters and resctrl at low bandwidth

2024-11-05T00:02:03+00:00

The MBA test incrementally throttles memory bandwidth, each time
followed by a comparison between the memory bandwidth observed
by the performance counters and resctrl respectively.

While a comparison between performance counters and resctrl is
generally appropriate, they do not have an identical view of
memory bandwidth. For example RAS features or memory performance
features that generate memory traffic may drive accesses that are
counted differently by performance counters and MBM respectively,
for instance generating "overhead" traffic which is not counted
against any specific RMID. As a ratio, this different view of memory
bandwidth becomes more apparent at low memory bandwidths.

It is not practical to enable/disable the various features that
may generate memory bandwidth to give performance counters and
resctrl an identical view. Instead, do not compare performance
counters and resctrl view of memory bandwidth when the memory
bandwidth is low.

Bandwidth throttling behaves differently across platforms
so it is not appropriate to drop measurement data simply based
on the throttling level. Instead, use a threshold of 750MiB
that has been observed to support adequate comparison between
performance counters and resctrl.

Signed-off-by: Reinette Chatre 
Reviewed-by: Ilpo Järvinen 
Signed-off-by: Shuah Khan