| author | Ben Maurer <bmaurer@meta.com> | 2026-05-29 14:23:46 -0700 |
|---|---|---|
| committer | Johannes Thumshirn <johannes.thumshirn@wdc.com> | 2026-06-09 18:22:46 +0200 |
| commit | 1ba72d847c7aa3c0887f749115af5232fd61b598 (patch) | |
| tree | 0a8ac268def7a46f2f33fbe362e25a80f6a5b0bb /include/linux/timerqueue_types.h | |
| parent | 79bdd8846317f3dea26c53d75700045f62265557 (diff) | |
btrfs: use lockless read in nr_cached_objects shrinker callback

Under heavy memcg-driven slab reclaim with many memcgs and CPUs,
shrink_slab_memcg() invokes the per-superblock count callback once per
(memcg, NUMA node) tuple. For btrfs that callback reaches
percpu_counter_sum_positive() on fs_info->evictable_extent_maps, which
takes the percpu_counter's raw spinlock with IRQs disabled and walks
every online CPU. With hundreds of memcgs driving reclaim on a host with
dozens of CPUs, this counter lock becomes a global serialization point:
profiles show CPUs pinned in the spin_lock_irqsave acquire under
__percpu_counter_sum, with cross-CPU IPIs hitting csd_lock_wait_toolong
while waiting for spinning vCPUs.

The shrinker count is advisory -- super_cache_count() already notes
"counts can change between super_cache_count and super_cache_scan, so we
really don't need locks here." Use percpu_counter_read_positive(), which
is lockless. Worst-case skew is bounded by batch * num_online_cpus (a
few thousand), negligible compared to the millions of extent maps a busy
filesystem accumulates and well within the noise that the shrinker
already tolerates.

Tested-by: Boris Burkov <boris@bur.io>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Ben Maurer <bmaurer@meta.com>
Signed-off-by: David Sterba <dsterba@suse.com>
