summaryrefslogtreecommitdiff
path: root/lib/test_workqueue.c
AgeCommit message (Collapse)Author
2026-04-03workqueue: avoid unguarded 64-bit divisionArnd Bergmann
The printk() requires a division that is not allowed on 32-bit architectures: x86_64-linux-ld: lib/test_workqueue.o: in function `test_workqueue_init': test_workqueue.c:(.init.text+0x36f): undefined reference to `__udivdi3' Use div_u64() to print the resulting elapsed microseconds. Fixes: 24b2e73f9700 ("workqueue: add test_workqueue benchmark module") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-01workqueue: add test_workqueue benchmark moduleBreno Leitao
Add a kernel module that benchmarks queue_work() throughput on an unbound workqueue to measure pool->lock contention under different affinity scope configurations (cache vs cache_shard). The module spawns N kthreads (default: num_online_cpus()), each bound to a different CPU. All threads start simultaneously and queue work items, measuring the latency of each queue_work() call. Results are reported as p50/p90/p95 latencies for each affinity scope. The affinity scope is switched between runs via the workqueue's sysfs affinity_scope attribute (WQ_SYSFS), avoiding the need for any new exported symbols. The module runs as __init-only, returning -EAGAIN to auto-unload, and can be re-run via insmod. Example of the output: running 50 threads, 50000 items/thread cpu 6806017 items/sec p50=2574 p90=5068 p95=5818 ns smt 6821040 items/sec p50=2624 p90=5168 p95=5949 ns cache_shard 1633653 items/sec p50=5337 p90=9694 p95=11207 ns cache 286069 items/sec p50=72509 p90=82304 p95=85009 ns numa 319403 items/sec p50=63745 p90=73480 p95=76505 ns system 308461 items/sec p50=66561 p90=75714 p95=78048 ns Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Tejun Heo <tj@kernel.org>