<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/cgroup/cgroup.c, branch v6.0-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'cgroup-for-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2022-08-03T16:45:08+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-03T16:45:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b6bb70f9ab80a11161252bf217993d2c40ea5eb2'/>
<id>b6bb70f9ab80a11161252bf217993d2c40ea5eb2</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:
 "Several core optimizations:

   - threadgroup_rwsem write locking is skipped when configuring
     controllers in empty subtrees.

     Combined with CLONE_INTO_CGROUP, this allows the common static
     usage pattern to not grab threadgroup_rwsem at all (glibc still
     doesn't seem ready for CLONE_INTO_CGROUP unfortunately).

   - threadgroup_rwsem used to be put into non-percpu mode by default
     due to latency concerns in specific use cases. There's no reason
     for everyone else to pay for it. Make the behavior optional.

   - psi no longer allocates memory when disabled.

  ... along with some code cleanups"

* tag 'cgroup-for-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Skip subtree root in cgroup_update_dfl_csses()
  cgroup: remove "no" prefixed mount options
  cgroup: Make !percpu threadgroup_rwsem operations optional
  cgroup: Add "no" prefixed mount options
  cgroup: Elide write-locking threadgroup_rwsem when updating csses on an empty subtree
  cgroup.c: remove redundant check for mixable cgroup in cgroup_migrate_vet_dst
  cgroup.c: add helper __cset_cgroup_from_root to cleanup duplicated codes
  psi: dont alloc memory for psi by default
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull cgroup updates from Tejun Heo:
 "Several core optimizations:

   - threadgroup_rwsem write locking is skipped when configuring
     controllers in empty subtrees.

     Combined with CLONE_INTO_CGROUP, this allows the common static
     usage pattern to not grab threadgroup_rwsem at all (glibc still
     doesn't seem ready for CLONE_INTO_CGROUP unfortunately).

   - threadgroup_rwsem used to be put into non-percpu mode by default
     due to latency concerns in specific use cases. There's no reason
     for everyone else to pay for it. Make the behavior optional.

   - psi no longer allocates memory when disabled.

  ... along with some code cleanups"

* tag 'cgroup-for-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Skip subtree root in cgroup_update_dfl_csses()
  cgroup: remove "no" prefixed mount options
  cgroup: Make !percpu threadgroup_rwsem operations optional
  cgroup: Add "no" prefixed mount options
  cgroup: Elide write-locking threadgroup_rwsem when updating csses on an empty subtree
  cgroup.c: remove redundant check for mixable cgroup in cgroup_migrate_vet_dst
  cgroup.c: add helper __cset_cgroup_from_root to cleanup duplicated codes
  psi: dont alloc memory for psi by default
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: Skip subtree root in cgroup_update_dfl_csses()</title>
<updated>2022-07-28T17:26:30+00:00</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2022-07-28T00:58:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=265792d0dede9259f0ca56bb3efcc23eceee7d01'/>
<id>265792d0dede9259f0ca56bb3efcc23eceee7d01</id>
<content type='text'>
The cgroup_update_dfl_csses() function updates css associations when a
cgroup's subtree_control file is modified. Any changes made to a cgroup's
subtree_control file, however, will only affect its descendants but not
the cgroup itself. So there is no point in migrating csses associated
with that cgroup. We can skip them instead.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The cgroup_update_dfl_csses() function updates css associations when a
cgroup's subtree_control file is modified. Any changes made to a cgroup's
subtree_control file, however, will only affect its descendants but not
the cgroup itself. So there is no point in migrating csses associated
with that cgroup. We can skip them instead.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: remove "no" prefixed mount options</title>
<updated>2022-07-27T17:54:55+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2022-07-27T17:54:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c808f4632349bda65b2ec41220f0a2035f780619'/>
<id>c808f4632349bda65b2ec41220f0a2035f780619</id>
<content type='text'>
30312730bd02 ("cgroup: Add "no" prefixed mount options") added "no" prefixed
mount options to allow turning them off and 6a010a49b63a ("cgroup: Make
!percpu threadgroup_rwsem operations optional") added one more "no" prefixed
mount option. However, Michal pointed out that the "no" prefixed options
aren't necessary in allowing mount options to be turned off:

  # grep group /proc/mounts
  cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,relatime,nsdelegate,memory_recursiveprot 0 0
  # mount -o remount,nsdelegate,memory_recursiveprot none /sys/fs/cgroup
  # grep cgroup /proc/mounts
  cgroup2 /sys/fs/cgroup cgroup2 rw,relatime,nsdelegate,memory_recursiveprot 0 0

Note that this is different from the remount behavior when the mount(1) is
invoked without the device argument - "none":

 # grep cgroup /proc/mounts
 cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0
 # mount -o remount,nsdelegate,memory_recursiveprot /sys/fs/cgroup
 # grep cgroup /proc/mounts
 cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0

While a bit confusing, given that there is a way to turn off the options,
there's no reason to have the explicit "no" prefixed options. Let's remove
them.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
30312730bd02 ("cgroup: Add "no" prefixed mount options") added "no" prefixed
mount options to allow turning them off and 6a010a49b63a ("cgroup: Make
!percpu threadgroup_rwsem operations optional") added one more "no" prefixed
mount option. However, Michal pointed out that the "no" prefixed options
aren't necessary in allowing mount options to be turned off:

  # grep group /proc/mounts
  cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,relatime,nsdelegate,memory_recursiveprot 0 0
  # mount -o remount,nsdelegate,memory_recursiveprot none /sys/fs/cgroup
  # grep cgroup /proc/mounts
  cgroup2 /sys/fs/cgroup cgroup2 rw,relatime,nsdelegate,memory_recursiveprot 0 0

Note that this is different from the remount behavior when the mount(1) is
invoked without the device argument - "none":

 # grep cgroup /proc/mounts
 cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0
 # mount -o remount,nsdelegate,memory_recursiveprot /sys/fs/cgroup
 # grep cgroup /proc/mounts
 cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0

While a bit confusing, given that there is a way to turn off the options,
there's no reason to have the explicit "no" prefixed options. Let's remove
them.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: Make !percpu threadgroup_rwsem operations optional</title>
<updated>2022-07-23T14:29:02+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2022-07-23T14:28:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6a010a49b63ac8465851a79185d8deff966f8e1a'/>
<id>6a010a49b63ac8465851a79185d8deff966f8e1a</id>
<content type='text'>
3942a9bd7b58 ("locking, rcu, cgroup: Avoid synchronize_sched() in
__cgroup_procs_write()") disabled percpu operations on threadgroup_rwsem
because the impiled synchronize_rcu() on write locking was pushing up the
latencies too much for android which constantly moves processes between
cgroups.

This makes the hotter paths - fork and exit - slower as they're always
forced into the slow path. There is no reason to force this on everyone
especially given that more common static usage pattern can now completely
avoid write-locking the rwsem. Write-locking is elided when turning on and
off controllers on empty sub-trees and CLONE_INTO_CGROUP enables seeding a
cgroup without grabbing the rwsem.

Restore the default percpu operations and introduce the mount option
"favordynmods" and config option CGROUP_FAVOR_DYNMODS for users who need
lower latencies for the dynamic operations.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Michal Koutnï¿½ &lt;mkoutny@suse.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Dmitry Shmidt &lt;dimitrysh@google.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
3942a9bd7b58 ("locking, rcu, cgroup: Avoid synchronize_sched() in
__cgroup_procs_write()") disabled percpu operations on threadgroup_rwsem
because the impiled synchronize_rcu() on write locking was pushing up the
latencies too much for android which constantly moves processes between
cgroups.

This makes the hotter paths - fork and exit - slower as they're always
forced into the slow path. There is no reason to force this on everyone
especially given that more common static usage pattern can now completely
avoid write-locking the rwsem. Write-locking is elided when turning on and
off controllers on empty sub-trees and CLONE_INTO_CGROUP enables seeding a
cgroup without grabbing the rwsem.

Restore the default percpu operations and introduce the mount option
"favordynmods" and config option CGROUP_FAVOR_DYNMODS for users who need
lower latencies for the dynamic operations.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Michal Koutnï¿½ &lt;mkoutny@suse.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Dmitry Shmidt &lt;dimitrysh@google.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: Add "no" prefixed mount options</title>
<updated>2022-07-23T05:12:52+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2022-07-15T04:38:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=30312730bd029f567045c38098d7e5a62e9aa658'/>
<id>30312730bd029f567045c38098d7e5a62e9aa658</id>
<content type='text'>
We allow modifying these mount options via remount. Let's add "no" prefixed
variants so that they can be turned off too.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We allow modifying these mount options via remount. Let's add "no" prefixed
variants so that they can be turned off too.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: Elide write-locking threadgroup_rwsem when updating csses on an empty subtree</title>
<updated>2022-07-23T05:12:37+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2022-07-15T04:38:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=671c11f0619e5ccb380bcf0f062f69ba95fc974a'/>
<id>671c11f0619e5ccb380bcf0f062f69ba95fc974a</id>
<content type='text'>
cgroup_update_dfl_csses() write-lock the threadgroup_rwsem as updating the
csses can trigger process migrations. However, if the subtree doesn't
contain any tasks, there aren't gonna be any cgroup migrations. This
condition can be trivially detected by testing whether
mgctx.preloaded_src_csets is empty. Elide write-locking threadgroup_rwsem if
the subtree is empty.

After this optimization, the usage pattern of creating a cgroup, enabling
the necessary controllers, and then seeding it with CLONE_INTO_CGROUP and
then removing the cgroup after it becomes empty doesn't need to write-lock
threadgroup_rwsem at all.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
cgroup_update_dfl_csses() write-lock the threadgroup_rwsem as updating the
csses can trigger process migrations. However, if the subtree doesn't
contain any tasks, there aren't gonna be any cgroup migrations. This
condition can be trivially detected by testing whether
mgctx.preloaded_src_csets is empty. Elide write-locking threadgroup_rwsem if
the subtree is empty.

After this optimization, the usage pattern of creating a cgroup, enabling
the necessary controllers, and then seeding it with CLONE_INTO_CGROUP and
then removing the cgroup after it becomes empty doesn't need to write-lock
threadgroup_rwsem at all.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Michal Koutný &lt;mkoutny@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup.c: remove redundant check for mixable cgroup in cgroup_migrate_vet_dst</title>
<updated>2022-06-27T09:09:21+00:00</updated>
<author>
<name>Lin Feng</name>
<email>linf@wangsu.com</email>
</author>
<published>2022-06-21T09:23:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d75cd55ae2dedeee5382bb48832c322673b9781c'/>
<id>d75cd55ae2dedeee5382bb48832c322673b9781c</id>
<content type='text'>
We have:
int cgroup_migrate_vet_dst(struct cgroup *dst_cgrp)
{
...
	/* mixables don't care */
	if (cgroup_is_mixable(dst_cgrp))
		return 0;

	/*
	 * If @dst_cgrp is already or can become a thread root or is
	 * threaded, it doesn't matter.
	 */
	if (cgroup_can_be_thread_root(dst_cgrp) || cgroup_is_threaded(dst_cgrp))
		return 0;
...
}

but in fact the entry of cgroup_can_be_thread_root() covers case that
checking cgroup_is_mixable() as following:
static bool cgroup_can_be_thread_root(struct cgroup *cgrp)
{
        /* mixables don't care */
        if (cgroup_is_mixable(cgrp))
                return true;
...
}

so explicitly checking in cgroup_migrate_vet_dst is unnecessary.

Signed-off-by: Lin Feng &lt;linf@wangsu.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We have:
int cgroup_migrate_vet_dst(struct cgroup *dst_cgrp)
{
...
	/* mixables don't care */
	if (cgroup_is_mixable(dst_cgrp))
		return 0;

	/*
	 * If @dst_cgrp is already or can become a thread root or is
	 * threaded, it doesn't matter.
	 */
	if (cgroup_can_be_thread_root(dst_cgrp) || cgroup_is_threaded(dst_cgrp))
		return 0;
...
}

but in fact the entry of cgroup_can_be_thread_root() covers case that
checking cgroup_is_mixable() as following:
static bool cgroup_can_be_thread_root(struct cgroup *cgrp)
{
        /* mixables don't care */
        if (cgroup_is_mixable(cgrp))
                return true;
...
}

so explicitly checking in cgroup_migrate_vet_dst is unnecessary.

Signed-off-by: Lin Feng &lt;linf@wangsu.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: Use separate src/dst nodes when preloading css_sets for migration</title>
<updated>2022-06-16T19:37:02+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2022-06-13T22:19:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=07fd5b6cdf3cc30bfde8fe0f644771688be04447'/>
<id>07fd5b6cdf3cc30bfde8fe0f644771688be04447</id>
<content type='text'>
Each cset (css_set) is pinned by its tasks. When we're moving tasks around
across csets for a migration, we need to hold the source and destination
csets to ensure that they don't go away while we're moving tasks about. This
is done by linking cset-&gt;mg_preload_node on either the
mgctx-&gt;preloaded_src_csets or mgctx-&gt;preloaded_dst_csets list. Using the
same cset-&gt;mg_preload_node for both the src and dst lists was deemed okay as
a cset can't be both the source and destination at the same time.

Unfortunately, this overloading becomes problematic when multiple tasks are
involved in a migration and some of them are identity noop migrations while
others are actually moving across cgroups. For example, this can happen with
the following sequence on cgroup1:

 #1&gt; mkdir -p /sys/fs/cgroup/misc/a/b
 #2&gt; echo $$ &gt; /sys/fs/cgroup/misc/a/cgroup.procs
 #3&gt; RUN_A_COMMAND_WHICH_CREATES_MULTIPLE_THREADS &amp;
 #4&gt; PID=$!
 #5&gt; echo $PID &gt; /sys/fs/cgroup/misc/a/b/tasks
 #6&gt; echo $PID &gt; /sys/fs/cgroup/misc/a/cgroup.procs

the process including the group leader back into a. In this final migration,
non-leader threads would be doing identity migration while the group leader
is doing an actual one.

After #3, let's say the whole process was in cset A, and that after #4, the
leader moves to cset B. Then, during #6, the following happens:

 1. cgroup_migrate_add_src() is called on B for the leader.

 2. cgroup_migrate_add_src() is called on A for the other threads.

 3. cgroup_migrate_prepare_dst() is called. It scans the src list.

 4. It notices that B wants to migrate to A, so it tries to A to the dst
    list but realizes that its -&gt;mg_preload_node is already busy.

 5. and then it notices A wants to migrate to A as it's an identity
    migration, it culls it by list_del_init()'ing its -&gt;mg_preload_node and
    putting references accordingly.

 6. The rest of migration takes place with B on the src list but nothing on
    the dst list.

This means that A isn't held while migration is in progress. If all tasks
leave A before the migration finishes and the incoming task pins it, the
cset will be destroyed leading to use-after-free.

This is caused by overloading cset-&gt;mg_preload_node for both src and dst
preload lists. We wanted to exclude the cset from the src list but ended up
inadvertently excluding it from the dst list too.

This patch fixes the issue by separating out cset-&gt;mg_preload_node into
-&gt;mg_src_preload_node and -&gt;mg_dst_preload_node, so that the src and dst
preloadings don't interfere with each other.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Mukesh Ojha &lt;quic_mojha@quicinc.com&gt;
Reported-by: shisiyuan &lt;shisiyuan19870131@gmail.com&gt;
Link: http://lkml.kernel.org/r/1654187688-27411-1-git-send-email-shisiyuan@xiaomi.com
Link: https://www.spinics.net/lists/cgroups/msg33313.html
Fixes: f817de98513d ("cgroup: prepare migration path for unified hierarchy")
Cc: stable@vger.kernel.org # v3.16+
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Each cset (css_set) is pinned by its tasks. When we're moving tasks around
across csets for a migration, we need to hold the source and destination
csets to ensure that they don't go away while we're moving tasks about. This
is done by linking cset-&gt;mg_preload_node on either the
mgctx-&gt;preloaded_src_csets or mgctx-&gt;preloaded_dst_csets list. Using the
same cset-&gt;mg_preload_node for both the src and dst lists was deemed okay as
a cset can't be both the source and destination at the same time.

Unfortunately, this overloading becomes problematic when multiple tasks are
involved in a migration and some of them are identity noop migrations while
others are actually moving across cgroups. For example, this can happen with
the following sequence on cgroup1:

 #1&gt; mkdir -p /sys/fs/cgroup/misc/a/b
 #2&gt; echo $$ &gt; /sys/fs/cgroup/misc/a/cgroup.procs
 #3&gt; RUN_A_COMMAND_WHICH_CREATES_MULTIPLE_THREADS &amp;
 #4&gt; PID=$!
 #5&gt; echo $PID &gt; /sys/fs/cgroup/misc/a/b/tasks
 #6&gt; echo $PID &gt; /sys/fs/cgroup/misc/a/cgroup.procs

the process including the group leader back into a. In this final migration,
non-leader threads would be doing identity migration while the group leader
is doing an actual one.

After #3, let's say the whole process was in cset A, and that after #4, the
leader moves to cset B. Then, during #6, the following happens:

 1. cgroup_migrate_add_src() is called on B for the leader.

 2. cgroup_migrate_add_src() is called on A for the other threads.

 3. cgroup_migrate_prepare_dst() is called. It scans the src list.

 4. It notices that B wants to migrate to A, so it tries to A to the dst
    list but realizes that its -&gt;mg_preload_node is already busy.

 5. and then it notices A wants to migrate to A as it's an identity
    migration, it culls it by list_del_init()'ing its -&gt;mg_preload_node and
    putting references accordingly.

 6. The rest of migration takes place with B on the src list but nothing on
    the dst list.

This means that A isn't held while migration is in progress. If all tasks
leave A before the migration finishes and the incoming task pins it, the
cset will be destroyed leading to use-after-free.

This is caused by overloading cset-&gt;mg_preload_node for both src and dst
preload lists. We wanted to exclude the cset from the src list but ended up
inadvertently excluding it from the dst list too.

This patch fixes the issue by separating out cset-&gt;mg_preload_node into
-&gt;mg_src_preload_node and -&gt;mg_dst_preload_node, so that the src and dst
preloadings don't interfere with each other.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Mukesh Ojha &lt;quic_mojha@quicinc.com&gt;
Reported-by: shisiyuan &lt;shisiyuan19870131@gmail.com&gt;
Link: http://lkml.kernel.org/r/1654187688-27411-1-git-send-email-shisiyuan@xiaomi.com
Link: https://www.spinics.net/lists/cgroups/msg33313.html
Fixes: f817de98513d ("cgroup: prepare migration path for unified hierarchy")
Cc: stable@vger.kernel.org # v3.16+
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup.c: add helper __cset_cgroup_from_root to cleanup duplicated codes</title>
<updated>2022-06-16T19:33:53+00:00</updated>
<author>
<name>Lin Feng</name>
<email>linf@wangsu.com</email>
</author>
<published>2022-06-16T10:38:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e210a89f5b07680fe21d21e846e6817346c5ba3b'/>
<id>e210a89f5b07680fe21d21e846e6817346c5ba3b</id>
<content type='text'>
No funtionality change, but save us some lines.

Signed-off-by: Lin Feng &lt;linf@wangsu.com&gt;
Acked-by: Mukesh Ojha &lt;quic_mojha@quicinc.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
No funtionality change, but save us some lines.

Signed-off-by: Lin Feng &lt;linf@wangsu.com&gt;
Acked-by: Mukesh Ojha &lt;quic_mojha@quicinc.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>psi: dont alloc memory for psi by default</title>
<updated>2022-06-07T17:11:47+00:00</updated>
<author>
<name>Chen Wandun</name>
<email>chenwandun@huawei.com</email>
</author>
<published>2022-05-26T12:26:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5f69a6577bc33d8f6d6bbe02bccdeb357b287f56'/>
<id>5f69a6577bc33d8f6d6bbe02bccdeb357b287f56</id>
<content type='text'>
Memory about struct psi_group is allocated by default for
each cgroup even if psi_disabled is true, in this case, these
allocated memory is waste, so alloc memory for struct psi_group
only when psi_disabled is false.

Signed-off-by: Chen Wandun &lt;chenwandun@huawei.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Memory about struct psi_group is allocated by default for
each cgroup even if psi_disabled is true, in this case, these
allocated memory is waste, so alloc memory for struct psi_group
only when psi_disabled is false.

Signed-off-by: Chen Wandun &lt;chenwandun@huawei.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
