<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/kernel/cpuset.c, branch master</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>cgroup: move cgroup files under kernel/cgroup/</title>
<updated>2016-12-27T19:49:05+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2016-12-27T19:49:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=201af4c0fab02876ef0311e7f7b4083aa138930c'/>
<id>201af4c0fab02876ef0311e7f7b4083aa138930c</id>
<content type='text'>
They're growing to be too many and planned to get split further.  Move
them under their own directory.

 kernel/cgroup.c		-&gt; kernel/cgroup/cgroup.c
 kernel/cgroup_freezer.c	-&gt; kernel/cgroup/freezer.c
 kernel/cgroup_pids.c		-&gt; kernel/cgroup/pids.c
 kernel/cpuset.c		-&gt; kernel/cgroup/cpuset.c

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Acked-by: Zefan Li &lt;lizefan@huawei.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
They're growing to be too many and planned to get split further.  Move
them under their own directory.

 kernel/cgroup.c		-&gt; kernel/cgroup/cgroup.c
 kernel/cgroup_freezer.c	-&gt; kernel/cgroup/freezer.c
 kernel/cgroup_pids.c		-&gt; kernel/cgroup/pids.c
 kernel/cpuset.c		-&gt; kernel/cgroup/cpuset.c

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Acked-by: Zefan Li &lt;lizefan@huawei.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Replace &lt;asm/uaccess.h&gt; with &lt;linux/uaccess.h&gt; globally</title>
<updated>2016-12-24T19:46:01+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-12-24T19:46:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7c0f6ba682b9c7632072ffbedf8d328c8f3c42ba'/>
<id>7c0f6ba682b9c7632072ffbedf8d328c8f3c42ba</id>
<content type='text'>
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*&lt;asm/uaccess.h&gt;'
  sed -i -e "s!$PATT!#include &lt;linux/uaccess.h&gt;!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*&lt;asm/uaccess.h&gt;'
  sed -i -e "s!$PATT!#include &lt;linux/uaccess.h&gt;!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2016-10-14T19:18:50+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-10-14T19:18:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f34d3606f76a8121b9d4940d2dd436bebeb2f9d7'/>
<id>f34d3606f76a8121b9d4940d2dd436bebeb2f9d7</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:

 - tracepoints for basic cgroup management operations added

 - kernfs and cgroup path formatting functions updated to behave in the
   style of strlcpy()

 - non-critical bug fixes

* 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  blkcg: Unlock blkcg_pol_mutex only once when cpd == NULL
  cgroup: fix error handling regressions in proc_cgroup_show() and cgroup_release_agent()
  cpuset: fix error handling regression in proc_cpuset_show()
  cgroup: add tracepoints for basic operations
  cgroup: make cgroup_path() and friends behave in the style of strlcpy()
  kernfs: remove kernfs_path_len()
  kernfs: make kernfs_path*() behave in the style of strlcpy()
  kernfs: add dummy implementation of kernfs_path_from_node()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull cgroup updates from Tejun Heo:

 - tracepoints for basic cgroup management operations added

 - kernfs and cgroup path formatting functions updated to behave in the
   style of strlcpy()

 - non-critical bug fixes

* 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  blkcg: Unlock blkcg_pol_mutex only once when cpd == NULL
  cgroup: fix error handling regressions in proc_cgroup_show() and cgroup_release_agent()
  cpuset: fix error handling regression in proc_cpuset_show()
  cgroup: add tracepoints for basic operations
  cgroup: make cgroup_path() and friends behave in the style of strlcpy()
  kernfs: remove kernfs_path_len()
  kernfs: make kernfs_path*() behave in the style of strlcpy()
  kernfs: add dummy implementation of kernfs_path_from_node()
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuset: fix error handling regression in proc_cpuset_show()</title>
<updated>2016-09-29T13:55:02+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2016-09-29T09:58:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=679a5e3f12830392d14f94b04bbe0f3cabdbd773'/>
<id>679a5e3f12830392d14f94b04bbe0f3cabdbd773</id>
<content type='text'>
4c737b41de7f ("cgroup: make cgroup_path() and friends behave in the
style of strlcpy()") botched the conversion of proc_cpuset_show() and
broke its error handling.  It made the function return 0 on failures
and fail to handle error returns from cgroup_path_ns().  Fix it.

Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
4c737b41de7f ("cgroup: make cgroup_path() and friends behave in the
style of strlcpy()") botched the conversion of proc_cpuset_show() and
broke its error handling.  It made the function return 0 on failures
and fail to handle error returns from cgroup_path_ns().  Fix it.

Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuset: fix non static symbol warning</title>
<updated>2016-09-16T15:31:17+00:00</updated>
<author>
<name>Wei Yongjun</name>
<email>weiyongjun1@huawei.com</email>
</author>
<published>2016-09-16T13:02:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8a15b81741879fa89601b03c0e50b0d780d65bc0'/>
<id>8a15b81741879fa89601b03c0e50b0d780d65bc0</id>
<content type='text'>
Fixes the following sparse warning:

kernel/cpuset.c:2088:6: warning:
 symbol 'cpuset_fork' was not declared. Should it be static?

Signed-off-by: Wei Yongjun &lt;weiyongjun1@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes the following sparse warning:

kernel/cpuset.c:2088:6: warning:
 symbol 'cpuset_fork' was not declared. Should it be static?

Signed-off-by: Wei Yongjun &lt;weiyongjun1@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuset: handle race between CPU hotplug and cpuset_hotplug_work</title>
<updated>2016-09-13T15:26:27+00:00</updated>
<author>
<name>Joonwoo Park</name>
<email>joonwoop@codeaurora.org</email>
</author>
<published>2016-09-12T04:14:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=28b89b9e6f7b6c8fef7b3af39828722bca20cfee'/>
<id>28b89b9e6f7b6c8fef7b3af39828722bca20cfee</id>
<content type='text'>
A discrepancy between cpu_online_mask and cpuset's effective_cpus
mask is inevitable during hotplug since cpuset defers updating of
effective_cpus mask using a workqueue, during which time nothing
prevents the system from more hotplug operations.  For that reason
guarantee_online_cpus() walks up the cpuset hierarchy until it finds
an intersection under the assumption that top cpuset's effective_cpus
mask intersects with cpu_online_mask even with such a race occurring.

However a sequence of CPU hotplugs can open a time window, during which
none of the effective CPUs in the top cpuset intersect with
cpu_online_mask.

For example when there are 4 possible CPUs 0-3 and only CPU0 is online:

  ========================  ===========================
   cpu_online_mask           top_cpuset.effective_cpus
  ========================  ===========================
   echo 1 &gt; cpu2/online.
   CPU hotplug notifier woke up hotplug work but not yet scheduled.
      [0,2]                     [0]

   echo 0 &gt; cpu0/online.
   The workqueue is still runnable.
      [2]                       [0]
  ========================  ===========================

  Now there is no intersection between cpu_online_mask and
  top_cpuset.effective_cpus.  Thus invoking sys_sched_setaffinity() at
  this moment can cause following:

   Unable to handle kernel NULL pointer dereference at virtual address 000000d0
   ------------[ cut here ]------------
   Kernel BUG at ffffffc0001389b0 [verbose debug info unavailable]
   Internal error: Oops - BUG: 96000005 [#1] PREEMPT SMP
   Modules linked in:
   CPU: 2 PID: 1420 Comm: taskset Tainted: G        W       4.4.8+ #98
   task: ffffffc06a5c4880 ti: ffffffc06e124000 task.ti: ffffffc06e124000
   PC is at guarantee_online_cpus+0x2c/0x58
   LR is at cpuset_cpus_allowed+0x4c/0x6c
   &lt;snip&gt;
   Process taskset (pid: 1420, stack limit = 0xffffffc06e124020)
   Call trace:
   [&lt;ffffffc0001389b0&gt;] guarantee_online_cpus+0x2c/0x58
   [&lt;ffffffc00013b208&gt;] cpuset_cpus_allowed+0x4c/0x6c
   [&lt;ffffffc0000d61f0&gt;] sched_setaffinity+0xc0/0x1ac
   [&lt;ffffffc0000d6374&gt;] SyS_sched_setaffinity+0x98/0xac
   [&lt;ffffffc000085cb0&gt;] el0_svc_naked+0x24/0x28

The top cpuset's effective_cpus are guaranteed to be identical to
cpu_online_mask eventually.  Hence fall back to cpu_online_mask when
there is no intersection between top cpuset's effective_cpus and
cpu_online_mask.

Signed-off-by: Joonwoo Park &lt;joonwoop@codeaurora.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: &lt;stable@vger.kernel.org&gt; # 3.17+
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A discrepancy between cpu_online_mask and cpuset's effective_cpus
mask is inevitable during hotplug since cpuset defers updating of
effective_cpus mask using a workqueue, during which time nothing
prevents the system from more hotplug operations.  For that reason
guarantee_online_cpus() walks up the cpuset hierarchy until it finds
an intersection under the assumption that top cpuset's effective_cpus
mask intersects with cpu_online_mask even with such a race occurring.

However a sequence of CPU hotplugs can open a time window, during which
none of the effective CPUs in the top cpuset intersect with
cpu_online_mask.

For example when there are 4 possible CPUs 0-3 and only CPU0 is online:

  ========================  ===========================
   cpu_online_mask           top_cpuset.effective_cpus
  ========================  ===========================
   echo 1 &gt; cpu2/online.
   CPU hotplug notifier woke up hotplug work but not yet scheduled.
      [0,2]                     [0]

   echo 0 &gt; cpu0/online.
   The workqueue is still runnable.
      [2]                       [0]
  ========================  ===========================

  Now there is no intersection between cpu_online_mask and
  top_cpuset.effective_cpus.  Thus invoking sys_sched_setaffinity() at
  this moment can cause following:

   Unable to handle kernel NULL pointer dereference at virtual address 000000d0
   ------------[ cut here ]------------
   Kernel BUG at ffffffc0001389b0 [verbose debug info unavailable]
   Internal error: Oops - BUG: 96000005 [#1] PREEMPT SMP
   Modules linked in:
   CPU: 2 PID: 1420 Comm: taskset Tainted: G        W       4.4.8+ #98
   task: ffffffc06a5c4880 ti: ffffffc06e124000 task.ti: ffffffc06e124000
   PC is at guarantee_online_cpus+0x2c/0x58
   LR is at cpuset_cpus_allowed+0x4c/0x6c
   &lt;snip&gt;
   Process taskset (pid: 1420, stack limit = 0xffffffc06e124020)
   Call trace:
   [&lt;ffffffc0001389b0&gt;] guarantee_online_cpus+0x2c/0x58
   [&lt;ffffffc00013b208&gt;] cpuset_cpus_allowed+0x4c/0x6c
   [&lt;ffffffc0000d61f0&gt;] sched_setaffinity+0xc0/0x1ac
   [&lt;ffffffc0000d6374&gt;] SyS_sched_setaffinity+0x98/0xac
   [&lt;ffffffc000085cb0&gt;] el0_svc_naked+0x24/0x28

The top cpuset's effective_cpus are guaranteed to be identical to
cpu_online_mask eventually.  Hence fall back to cpu_online_mask when
there is no intersection between top cpuset's effective_cpus and
cpu_online_mask.

Signed-off-by: Joonwoo Park &lt;joonwoop@codeaurora.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: &lt;stable@vger.kernel.org&gt; # 3.17+
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: make cgroup_path() and friends behave in the style of strlcpy()</title>
<updated>2016-08-10T15:23:44+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2016-08-10T15:23:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4c737b41de7f4eef2a593803bad1b918dd718b10'/>
<id>4c737b41de7f4eef2a593803bad1b918dd718b10</id>
<content type='text'>
cgroup_path() and friends used to format the path from the end and
thus the resulting path usually didn't start at the start of the
passed in buffer.  Also, when the buffer was too small, the partial
result was truncated from the head rather than tail and there was no
way to tell how long the full path would be.  These make the functions
less robust and more awkward to use.

With recent updates to kernfs_path(), cgroup_path() and friends can be
made to behave in strlcpy() style.

* cgroup_path(), cgroup_path_ns[_locked]() and task_cgroup_path() now
  always return the length of the full path.  If buffer is too small,
  it contains nul terminated truncated output.

* All users updated accordingly.

v2: cgroup_path() usage in kernel/sched/debug.c converted.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Serge Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
cgroup_path() and friends used to format the path from the end and
thus the resulting path usually didn't start at the start of the
passed in buffer.  Also, when the buffer was too small, the partial
result was truncated from the head rather than tail and there was no
way to tell how long the full path would be.  These make the functions
less robust and more awkward to use.

With recent updates to kernfs_path(), cgroup_path() and friends can be
made to behave in strlcpy() style.

* cgroup_path(), cgroup_path_ns[_locked]() and task_cgroup_path() now
  always return the length of the full path.  If buffer is too small,
  it contains nul terminated truncated output.

* All users updated accordingly.

v2: cgroup_path() usage in kernel/sched/debug.c converted.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Serge Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuset: make sure new tasks conform to the current config of the cpuset</title>
<updated>2016-08-10T03:58:01+00:00</updated>
<author>
<name>Zefan Li</name>
<email>lizefan@huawei.com</email>
</author>
<published>2016-08-09T03:25:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=06f4e94898918bcad00cdd4d349313a439d6911e'/>
<id>06f4e94898918bcad00cdd4d349313a439d6911e</id>
<content type='text'>
A new task inherits cpus_allowed and mems_allowed masks from its parent,
but if someone changes cpuset's config by writing to cpuset.cpus/cpuset.mems
before this new task is inserted into the cgroup's task list, the new task
won't be updated accordingly.

Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: stable@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A new task inherits cpus_allowed and mems_allowed masks from its parent,
but if someone changes cpuset's config by writing to cpuset.cpus/cpuset.mems
before this new task is inserted into the cgroup's task list, the new task
won't be updated accordingly.

Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: stable@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuset, mm: fix TIF_MEMDIE check in cpuset_change_task_nodemask</title>
<updated>2016-07-28T23:07:41+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2016-07-28T22:45:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fec1e5f987bfc41f9f08cbd206e7302e6ac2ab0c'/>
<id>fec1e5f987bfc41f9f08cbd206e7302e6ac2ab0c</id>
<content type='text'>
Commit c0ff7453bb5c ("cpuset,mm: fix no node to alloc memory when
changing cpuset's mems") has added TIF_MEMDIE and PF_EXITING check but
it is checking the flag on the current task rather than the given one.

This doesn't make much sense and it is actually wrong.  If the current
task which updates the nodemask of a cpuset got killed by the OOM killer
then a part of the cpuset cgroup processes would have incompatible
nodemask which is surprising to say the least.

The comment suggests the intention was to skip oom victim or an exiting
task so we should be checking the given task.  But even then it would be
layering violation because it is the memory allocator to interpret the
TIF_MEMDIE meaning.  Simply drop both checks.  All tasks in the cpuset
should simply follow the same mask.

Link: http://lkml.kernel.org/r/1467029719-17602-3-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Cc: Miao Xie &lt;miaoxie@huawei.com&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit c0ff7453bb5c ("cpuset,mm: fix no node to alloc memory when
changing cpuset's mems") has added TIF_MEMDIE and PF_EXITING check but
it is checking the flag on the current task rather than the given one.

This doesn't make much sense and it is actually wrong.  If the current
task which updates the nodemask of a cpuset got killed by the OOM killer
then a part of the cpuset cgroup processes would have incompatible
nodemask which is surprising to say the least.

The comment suggests the intention was to skip oom victim or an exiting
task so we should be checking the given task.  But even then it would be
layering violation because it is the memory allocator to interpret the
TIF_MEMDIE meaning.  Simply drop both checks.  All tasks in the cpuset
should simply follow the same mask.

Link: http://lkml.kernel.org/r/1467029719-17602-3-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Cc: Miao Xie &lt;miaoxie@huawei.com&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuset: use static key better and convert to new API</title>
<updated>2016-05-20T02:12:14+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2016-05-20T00:14:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=002f290627c27068087f6204baec7a334e5a3b48'/>
<id>002f290627c27068087f6204baec7a334e5a3b48</id>
<content type='text'>
An important function for cpusets is cpuset_node_allowed(), which
optimizes on the fact if there's a single root CPU set, it must be
trivially allowed.  But the check "nr_cpusets() &lt;= 1" doesn't use the
cpusets_enabled_key static key the right way where static keys eliminate
branching overhead with jump labels.

This patch converts it so that static key is used properly.  It's also
switched to the new static key API and the checking functions are
converted to return bool instead of int.  We also provide a new variant
__cpuset_zone_allowed() which expects that the static key check was
already done and they key was enabled.  This is needed for
get_page_from_freelist() where we want to also avoid the relatively
slower check when ALLOC_CPUSET is not set in alloc_flags.

The impact on the page allocator microbenchmark is less than expected
but the cleanup in itself is worthwhile.

                                             4.6.0-rc2                  4.6.0-rc2
                                       multcheck-v1r20               cpuset-v1r20
  Min      alloc-odr0-1               348.00 (  0.00%)           348.00 (  0.00%)
  Min      alloc-odr0-2               254.00 (  0.00%)           254.00 (  0.00%)
  Min      alloc-odr0-4               213.00 (  0.00%)           213.00 (  0.00%)
  Min      alloc-odr0-8               186.00 (  0.00%)           183.00 (  1.61%)
  Min      alloc-odr0-16              173.00 (  0.00%)           171.00 (  1.16%)
  Min      alloc-odr0-32              166.00 (  0.00%)           163.00 (  1.81%)
  Min      alloc-odr0-64              162.00 (  0.00%)           159.00 (  1.85%)
  Min      alloc-odr0-128             160.00 (  0.00%)           157.00 (  1.88%)
  Min      alloc-odr0-256             169.00 (  0.00%)           166.00 (  1.78%)
  Min      alloc-odr0-512             180.00 (  0.00%)           180.00 (  0.00%)
  Min      alloc-odr0-1024            188.00 (  0.00%)           187.00 (  0.53%)
  Min      alloc-odr0-2048            194.00 (  0.00%)           193.00 (  0.52%)
  Min      alloc-odr0-4096            199.00 (  0.00%)           198.00 (  0.50%)
  Min      alloc-odr0-8192            202.00 (  0.00%)           201.00 (  0.50%)
  Min      alloc-odr0-16384           203.00 (  0.00%)           202.00 (  0.49%)

Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Acked-by: Zefan Li &lt;lizefan@huawei.com&gt;
Cc: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
An important function for cpusets is cpuset_node_allowed(), which
optimizes on the fact if there's a single root CPU set, it must be
trivially allowed.  But the check "nr_cpusets() &lt;= 1" doesn't use the
cpusets_enabled_key static key the right way where static keys eliminate
branching overhead with jump labels.

This patch converts it so that static key is used properly.  It's also
switched to the new static key API and the checking functions are
converted to return bool instead of int.  We also provide a new variant
__cpuset_zone_allowed() which expects that the static key check was
already done and they key was enabled.  This is needed for
get_page_from_freelist() where we want to also avoid the relatively
slower check when ALLOC_CPUSET is not set in alloc_flags.

The impact on the page allocator microbenchmark is less than expected
but the cleanup in itself is worthwhile.

                                             4.6.0-rc2                  4.6.0-rc2
                                       multcheck-v1r20               cpuset-v1r20
  Min      alloc-odr0-1               348.00 (  0.00%)           348.00 (  0.00%)
  Min      alloc-odr0-2               254.00 (  0.00%)           254.00 (  0.00%)
  Min      alloc-odr0-4               213.00 (  0.00%)           213.00 (  0.00%)
  Min      alloc-odr0-8               186.00 (  0.00%)           183.00 (  1.61%)
  Min      alloc-odr0-16              173.00 (  0.00%)           171.00 (  1.16%)
  Min      alloc-odr0-32              166.00 (  0.00%)           163.00 (  1.81%)
  Min      alloc-odr0-64              162.00 (  0.00%)           159.00 (  1.85%)
  Min      alloc-odr0-128             160.00 (  0.00%)           157.00 (  1.88%)
  Min      alloc-odr0-256             169.00 (  0.00%)           166.00 (  1.78%)
  Min      alloc-odr0-512             180.00 (  0.00%)           180.00 (  0.00%)
  Min      alloc-odr0-1024            188.00 (  0.00%)           187.00 (  0.53%)
  Min      alloc-odr0-2048            194.00 (  0.00%)           193.00 (  0.52%)
  Min      alloc-odr0-4096            199.00 (  0.00%)           198.00 (  0.50%)
  Min      alloc-odr0-8192            202.00 (  0.00%)           201.00 (  0.50%)
  Min      alloc-odr0-16384           203.00 (  0.00%)           202.00 (  0.49%)

Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Acked-by: Zefan Li &lt;lizefan@huawei.com&gt;
Cc: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
