summaryrefslogtreecommitdiff
path: root/include/linux
diff options
context:
space:
mode:
authorTejun Heo <tj@kernel.org>2026-05-04 14:51:20 -1000
committerTejun Heo <tj@kernel.org>2026-05-15 07:24:29 -1000
commitcfc1da7e1127b4c8787f4dc25d59987c10c9107f (patch)
tree1910b61aed81a4821ece0b3fbdd626026cb1b9c4 /include/linux
parentc4799253a3ee74ebb27be72fb991c597a5902c01 (diff)
cgroup: Add per-subsys-css kill_css_finish deferral
93618edf7538 ("cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated") deferred kill_css_finish() at the cgroup level: rmdir waits for the entire cgroup's populated count to drop to zero, then fires kill_css_finish() on every subsystem css at once. Replace that with per-subsys-css deferral. Each subsystem css now tracks its own hierarchical populated count and independently defers its kill_css_finish() until its own subtree drains. The rmdir-race fix carries through unchanged in shape. The dying css's ->css_offline() still waits until no PF_EXITING task references it, and v2's cgroup-level machinery goes away. cgroup_apply_control_disable() has the same race shape (PF_EXITING tasks pinning a css whose ->css_offline() is about to run) and stays synchronous here. This patch lays the groundwork for fixing it - per-cgroup waiting can't gate one subsys css being killed while the rest of the cgroup stays live, but per-css can. Subtree-wide invariant preserved: a dying ancestor css stays populated through nr_populated_children until every dying descendant's task drains, so the walker fires the ancestor's kill_finish_work only after all descendants have drained. Add paired smp_mb()s in kill_css_sync() and css_update_populated() to fence the StoreLoad on (CSS_DYING, populated counter), guaranteeing that either the walker queues kill_finish_work or the caller fires synchronously. cgroup_destroy_locked() was implicitly fenced by an unrelated css_set_lock pair; cgroup_apply_control_disable() in the next patch is not. Signed-off-by: Tejun Heo <tj@kernel.org>
Diffstat (limited to 'include/linux')
-rw-r--r--include/linux/cgroup-defs.h6
1 files changed, 3 insertions, 3 deletions
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index c4929f7bbe5a..de2cd6238c2a 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -262,6 +262,9 @@ struct cgroup_subsys_state {
int nr_populated_csets;
int nr_populated_children;
+ /* deferred kill_css_finish() queued by css_update_populated() */
+ struct work_struct kill_finish_work;
+
/*
* A singly-linked list of css structures to be rstat flushed.
* This is a scratch field to be used exclusively by
@@ -615,9 +618,6 @@ struct cgroup {
/* used to wait for offlining of csses */
wait_queue_head_t offline_waitq;
- /* defers killing csses after removal until cgroup is depopulated */
- struct work_struct finish_destroy_work;
-
/* used to schedule release agent */
struct work_struct release_agent_work;