| author | Hao Li <hao.li@linux.dev> | 2026-05-29 11:50:52 +0800 |
|---|---|---|
| committer | Vlastimil Babka (SUSE) <vbabka@kernel.org> | 2026-06-01 10:42:12 +0200 |
| commit | 0fc52deec1068ea3cc8eaa6e045c96fbf73f20e2 (patch) | |
| tree | 68cfd32d4432ee8d42515d7cfe62357f7e6535cc /scripts/Makefile.thinlto | |
| parent | 7e230738746ce9a7a57c55cbde0c48668a891334 (diff) | |
mm/slub: detach and reattach partial slabs in batch

get_partial_node_bulk() moves each selected slab from the node's
partial list to the local pc->slabs list using a remove_partial() and
list_add() pair. In practice, the loop often detaches several adjacent
slabs. Moving them one at a time rewrites the same list pointers
repeatedly while n->list_lock is held, which is unnecessary churn.

To demonstrate this, the counts below show how often single vs. multiple
consecutive slabs are retrieved during a will-it-scale mmap stress test:

  consecutive_slabs_count    frequency
  = 1                        277345324
  = 2                        335238023
  = 3                        175717884
  >= 4                        88862337

The data shows that runs of two or more contiguous slabs are common:
they make up about 68% of the counted runs.

To optimize this, track contiguous runs of matching slabs and move each
run in a single operation using list_bulk_move_tail(). This reduces list
pointer churn inside the lock critical section.
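
The run-tracking idea can be sketched in userspace as follows. This is an
illustration only, not the code from this patch: struct toy_slab, the
wanted flag, slab_of() and move_wanted_in_runs() are hypothetical names,
the list helpers are local stand-ins modelled on the kernel's list.h
semantics, and no locking is shown (in the kernel the walk would happen
under n->list_lock).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Local copy modelled on list_bulk_move_tail(): unlink the sub-list
 * first..last (inclusive) and append it to the tail of head. The cost
 * is six pointer writes regardless of how long the run is.
 */
static void list_bulk_move_tail(struct list_head *head,
				struct list_head *first,
				struct list_head *last)
{
	first->prev->next = last->next;
	last->next->prev = first->prev;

	head->prev->next = first;
	first->prev = head->prev;

	last->next = head;
	head->prev = last;
}

/* Hypothetical slab stand-in; "wanted" marks slabs the caller selects. */
struct toy_slab { struct list_head link; int id; bool wanted; };

#define slab_of(p) \
	((struct toy_slab *)((char *)(p) - offsetof(struct toy_slab, link)))

/*
 * Move every wanted slab from src to the tail of dst, one splice per
 * contiguous run. Returns the number of splices performed.
 */
static int move_wanted_in_runs(struct list_head *src, struct list_head *dst)
{
	struct list_head *pos = src->next, *first = NULL, *last = NULL;
	int splices = 0;

	assert(src != dst);
	while (pos != src) {
		struct list_head *next = pos->next; /* saved before any splice */

		if (slab_of(pos)->wanted) {
			if (!first)
				first = pos;	/* a new run starts here */
			last = pos;		/* extend the current run */
		} else if (first) {
			/* The run ended just before pos: move it in one go. */
			list_bulk_move_tail(dst, first, last);
			splices++;
			first = NULL;
		}
		pos = next;
	}
	if (first) {				/* run reaching the list end */
		list_bulk_move_tail(dst, first, last);
		splices++;
	}
	return splices;
}
```

With six slabs on src of which slabs 0, 1 and 3 are wanted, the sketch
performs two splices instead of three individual remove-and-add pairs,
and dst keeps the original relative order.
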
Apply the same optimization to __refill_objects_node() when reattaching
leftover partial slabs back to the node's partial list.

The will-it-scale mmap benchmark shows a 2% to 5% performance
improvement with this patch applied.

Signed-off-by: Hao Li <hao.li@linux.dev>
Link: https://patch.msgid.link/20260529035120.81304-3-hao.li@linux.dev
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
