summaryrefslogtreecommitdiff
path: root/include/linux/workqueue.h
diff options
context:
space:
mode:
authorBreno Leitao <leitao@debian.org>2026-04-01 06:03:53 -0700
committerTejun Heo <tj@kernel.org>2026-04-01 10:24:18 -1000
commit5920d046f7ae3bf9cf51b9d915c1fff13d299d84 (patch)
treeed761045ca9464f571d378e54fd47006f93739f4 /include/linux/workqueue.h
parent9dc42c9070282c81058a875fea5acae057610980 (diff)
workqueue: add WQ_AFFN_CACHE_SHARD affinity scope
On systems where many CPUs share one LLC, unbound workqueues using WQ_AFFN_CACHE collapse to a single worker pool, causing heavy spinlock contention on pool->lock. For example, Chuck Lever measured 39% of cycles lost to native_queued_spin_lock_slowpath on a 12-core shared-L3 NFS-over-RDMA system. The existing affinity hierarchy (cpu, smt, cache, numa, system) offers no intermediate option between per-LLC and per-SMT-core granularity. Add WQ_AFFN_CACHE_SHARD, which subdivides each LLC into groups of at most wq_cache_shard_size cores (default 8, tunable via boot parameter). Shards are always split on core (SMT group) boundaries so that Hyper-Threading siblings are never placed in different pods. Cores are distributed across shards as evenly as possible -- for example, 36 cores in a single LLC with max shard size 8 produces 5 shards of 8+7+7+7+7 cores. The implementation follows the same comparator pattern as other affinity scopes: precompute_cache_shard_ids() pre-fills the cpu_shard_id[] array from the already-initialized WQ_AFFN_CACHE and WQ_AFFN_SMT topology, and cpus_share_cache_shard() is passed to init_pod_type(). Benchmark on NVIDIA Grace (72 CPUs, single LLC, 50k items/thread), show cache_shard delivers ~5x the throughput and ~6.5x lower p50 latency compared to cache scope on this 72-core single-LLC system. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Tejun Heo <tj@kernel.org>
Diffstat (limited to 'include/linux/workqueue.h')
-rw-r--r--include/linux/workqueue.h1
1 files changed, 1 insertions, 0 deletions
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 75634a09576a..ab6cb70ca1a5 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -133,6 +133,7 @@ enum wq_affn_scope {
WQ_AFFN_CPU, /* one pod per CPU */
WQ_AFFN_SMT, /* one pod per SMT */
WQ_AFFN_CACHE, /* one pod per LLC */
+ WQ_AFFN_CACHE_SHARD, /* synthetic sub-LLC shards */
WQ_AFFN_NUMA, /* one pod per NUMA node */
WQ_AFFN_SYSTEM, /* one pod across the whole system */