summaryrefslogtreecommitdiff
path: root/rust/kernel/alloc/allocator
diff options
context:
space:
mode:
authorPeter Zijlstra <peterz@infradead.org>2026-06-10 23:20:09 +0200
committerPeter Zijlstra <peterz@infradead.org>2026-06-11 13:41:24 +0200
commita734d9fca84e1d4fa0cb442ef5f84c88f8212d32 (patch)
tree65809dbcfdb5bdbe6bc2512095577e99ffa8f940 /rust/kernel/alloc/allocator
parenta837dd95e841586c3a6bbe41c41843b392a1b725 (diff)
futex: Optimize futex hash bucket access patterns
Breno reported significant c2c HITM in a futex hash heavy workload. It turns out that the hash bucket to private hash table reverse pointer (futex_hash_bucket::priv) was to blame. Notably when the hash buckets are heavily contended, the: 'fph = bh->priv;' load in futex_hash() will typically miss and consequently become quite expensive. Since this load in particular is quite superfluous, removing it is fairly straight forward. However, removing it does not in fact achieve anything much. The pain moves to the next user, notably: futex_hash_put(). Therefore rework the whole private hash refcounting to avoid needing this back pointer (and removing it). Instead of passing around 'struct futex_hash_bucket *hb', pass around a new structure that contains it and the related 'struct futex_private_hash *fph' pointer in tandem. Funnily this turns out to remove more code than it adds and significantly improves futex hash performance (as measured by 'perf bench futex hash'): SKL dual socket 112 threads: Baseline Patched shared (16k) 1571857 1641435 + 4.4% autosize (512) 646390 903371 +39.7% -b 256 464395 587014 +26.4% -b 512 715687 995943 +39.2% -b 1024 995085 1396328 +40.3% -b 2048 1293114 1668395 +29.0% -b 4096 2124438 2240228 + 5.5% Zen3 dual socket 256 threads: Baseline Patched shared (16k) 1275840 1381279 + 8.2% autosize (512) 1252745 1482179 +18.3% -b 256 856274 955455 +11.5% -b 512 1267490 1544010 +21.8% -b 1024 1424013 1625424 +14.1% -b 2048 1505181 1669342 +10.9% -b 4096 1465993 1688932 +15.2% AMD EPYC 9D64 (Zen4, single socket) 176 threads: Baseline Patched Delta shared (16k) 1,230,599 1,368,655 +11.2% autosize (1024) 1,285,440 1,556,946 +21.1% -b 256 1,341,471 1,520,303 +13.3% -b 512 1,438,330 1,599,319 +11.2% -b 1024 1,443,772 1,622,493 +12.4% -b 2048 1,472,108 1,643,975 +11.7% -b 4096 1,333,098 1,570,897 +17.8% Reported-by: Breno Leitao <leitao@debian.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Breno Leitao <leitao@debian.org> Tested-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260610135510.GB1430057@noisy.programming.kicks-ass.net
Diffstat (limited to 'rust/kernel/alloc/allocator')
0 files changed, 0 insertions, 0 deletions