<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/kernel/sched/core.c, branch v6.11</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate()</title>
<updated>2024-07-29T10:22:33+00:00</updated>
<author>
<name>Yang Yingliang</name>
<email>yangyingliang@huawei.com</email>
</author>
<published>2024-07-03T03:16:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fe7a11c78d2a9bdb8b50afc278a31ac177000948'/>
<id>fe7a11c78d2a9bdb8b50afc278a31ac177000948</id>
<content type='text'>
If cpuset_cpu_inactive() fails, set_rq_online() need be called to rollback.

Fixes: 120455c514f7 ("sched: Fix hotplug vs CPU bandwidth control")
Cc: stable@kernel.org
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240703031610.587047-5-yangyingliang@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If cpuset_cpu_inactive() fails, set_rq_online() need be called to rollback.

Fixes: 120455c514f7 ("sched: Fix hotplug vs CPU bandwidth control")
Cc: stable@kernel.org
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240703031610.587047-5-yangyingliang@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/core: Introduce sched_set_rq_on/offline() helper</title>
<updated>2024-07-29T10:22:32+00:00</updated>
<author>
<name>Yang Yingliang</name>
<email>yangyingliang@huawei.com</email>
</author>
<published>2024-07-03T03:16:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2f027354122f58ee846468a6f6b48672fff92e9b'/>
<id>2f027354122f58ee846468a6f6b48672fff92e9b</id>
<content type='text'>
Introduce sched_set_rq_on/offline() helper, so it can be called
in normal or error path simply. No functional changed.

Cc: stable@kernel.org
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240703031610.587047-4-yangyingliang@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce sched_set_rq_on/offline() helper, so it can be called
in normal or error path simply. No functional changed.

Cc: stable@kernel.org
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240703031610.587047-4-yangyingliang@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/smt: Fix unbalance sched_smt_present dec/inc</title>
<updated>2024-07-29T10:22:32+00:00</updated>
<author>
<name>Yang Yingliang</name>
<email>yangyingliang@huawei.com</email>
</author>
<published>2024-07-03T03:16:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e22f910a26cc2a3ac9c66b8e935ef2a7dd881117'/>
<id>e22f910a26cc2a3ac9c66b8e935ef2a7dd881117</id>
<content type='text'>
I got the following warn report while doing stress test:

jump label: negative count!
WARNING: CPU: 3 PID: 38 at kernel/jump_label.c:263 static_key_slow_try_dec+0x9d/0xb0
Call Trace:
 &lt;TASK&gt;
 __static_key_slow_dec_cpuslocked+0x16/0x70
 sched_cpu_deactivate+0x26e/0x2a0
 cpuhp_invoke_callback+0x3ad/0x10d0
 cpuhp_thread_fun+0x3f5/0x680
 smpboot_thread_fn+0x56d/0x8d0
 kthread+0x309/0x400
 ret_from_fork+0x41/0x70
 ret_from_fork_asm+0x1b/0x30
 &lt;/TASK&gt;

Because when cpuset_cpu_inactive() fails in sched_cpu_deactivate(),
the cpu offline failed, but sched_smt_present is decremented before
calling sched_cpu_deactivate(), it leads to unbalanced dec/inc, so
fix it by incrementing sched_smt_present in the error path.

Fixes: c5511d03ec09 ("sched/smt: Make sched_smt_present track topology")
Cc: stable@kernel.org
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Chen Yu &lt;yu.c.chen@intel.com&gt;
Reviewed-by: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20240703031610.587047-3-yangyingliang@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I got the following warn report while doing stress test:

jump label: negative count!
WARNING: CPU: 3 PID: 38 at kernel/jump_label.c:263 static_key_slow_try_dec+0x9d/0xb0
Call Trace:
 &lt;TASK&gt;
 __static_key_slow_dec_cpuslocked+0x16/0x70
 sched_cpu_deactivate+0x26e/0x2a0
 cpuhp_invoke_callback+0x3ad/0x10d0
 cpuhp_thread_fun+0x3f5/0x680
 smpboot_thread_fn+0x56d/0x8d0
 kthread+0x309/0x400
 ret_from_fork+0x41/0x70
 ret_from_fork_asm+0x1b/0x30
 &lt;/TASK&gt;

Because when cpuset_cpu_inactive() fails in sched_cpu_deactivate(),
the cpu offline failed, but sched_smt_present is decremented before
calling sched_cpu_deactivate(), it leads to unbalanced dec/inc, so
fix it by incrementing sched_smt_present in the error path.

Fixes: c5511d03ec09 ("sched/smt: Make sched_smt_present track topology")
Cc: stable@kernel.org
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Chen Yu &lt;yu.c.chen@intel.com&gt;
Reviewed-by: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20240703031610.587047-3-yangyingliang@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/smt: Introduce sched_smt_present_inc/dec() helper</title>
<updated>2024-07-29T10:22:32+00:00</updated>
<author>
<name>Yang Yingliang</name>
<email>yangyingliang@huawei.com</email>
</author>
<published>2024-07-03T03:16:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=31b164e2e4af84d08d2498083676e7eeaa102493'/>
<id>31b164e2e4af84d08d2498083676e7eeaa102493</id>
<content type='text'>
Introduce sched_smt_present_inc/dec() helper, so it can be called
in normal or error path simply. No functional changed.

Cc: stable@kernel.org
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240703031610.587047-2-yangyingliang@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce sched_smt_present_inc/dec() helper, so it can be called
in normal or error path simply. No functional changed.

Cc: stable@kernel.org
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240703031610.587047-2-yangyingliang@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sysctl: treewide: constify the ctl_table argument of proc_handlers</title>
<updated>2024-07-24T18:59:29+00:00</updated>
<author>
<name>Joel Granados</name>
<email>j.granados@samsung.com</email>
</author>
<published>2024-07-24T18:59:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=78eb4ea25cd5fdbdae7eb9fdf87b99195ff67508'/>
<id>78eb4ea25cd5fdbdae7eb9fdf87b99195ff67508</id>
<content type='text'>
const qualify the struct ctl_table argument in the proc_handler function
signatures. This is a prerequisite to moving the static ctl_table
structs into .rodata data which will ensure that proc_handler function
pointers cannot be modified.

This patch has been generated by the following coccinelle script:

```
  virtual patch

  @r1@
  identifier ctl, write, buffer, lenp, ppos;
  identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)";
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

  @r2@
  identifier func, ctl, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos)
  { ... }

  @r3@
  identifier func;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int , void *, size_t *, loff_t *);

  @r4@
  identifier func, ctl;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int , void *, size_t *, loff_t *);

  @r5@
  identifier func, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

```

* Code formatting was adjusted in xfs_sysctl.c to comply with code
  conventions. The xfs_stats_clear_proc_handler,
  xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where
  adjusted.

* The ctl_table argument in proc_watchdog_common was const qualified.
  This is called from a proc_handler itself and is calling back into
  another proc_handler, making it necessary to change it as part of the
  proc_handler migration.

Co-developed-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Co-developed-by: Joel Granados &lt;j.granados@samsung.com&gt;
Signed-off-by: Joel Granados &lt;j.granados@samsung.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
const qualify the struct ctl_table argument in the proc_handler function
signatures. This is a prerequisite to moving the static ctl_table
structs into .rodata data which will ensure that proc_handler function
pointers cannot be modified.

This patch has been generated by the following coccinelle script:

```
  virtual patch

  @r1@
  identifier ctl, write, buffer, lenp, ppos;
  identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)";
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

  @r2@
  identifier func, ctl, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos)
  { ... }

  @r3@
  identifier func;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int , void *, size_t *, loff_t *);

  @r4@
  identifier func, ctl;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int , void *, size_t *, loff_t *);

  @r5@
  identifier func, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

```

* Code formatting was adjusted in xfs_sysctl.c to comply with code
  conventions. The xfs_stats_clear_proc_handler,
  xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where
  adjusted.

* The ctl_table argument in proc_watchdog_common was const qualified.
  This is called from a proc_handler itself and is calling back into
  another proc_handler, making it necessary to change it as part of the
  proc_handler migration.

Co-developed-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Co-developed-by: Joel Granados &lt;j.granados@samsung.com&gt;
Signed-off-by: Joel Granados &lt;j.granados@samsung.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'sched-core-2024-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2024-07-17T00:00:50+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-07-17T00:00:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4a996d90b9e046c6d59845acf00a54d464c34ff3'/>
<id>4a996d90b9e046c6d59845acf00a54d464c34ff3</id>
<content type='text'>
Pull scheduler updates from Ingo Molnar:

 - Update Daniel Bristot de Oliveira's entry in MAINTAINERS,
   and credit him in CREDITS

 - Harmonize the lock-yielding behavior on dynamically selected
   preemption models with static ones

 - Reorganize the code a bit: split out sched/syscalls.c to reduce
   the size of sched/core.c

 - Micro-optimize psi_group_change()

 - Fix set_load_weight() for SCHED_IDLE tasks

 - Misc cleanups &amp; fixes

* tag 'sched-core-2024-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Update MAINTAINERS and CREDITS
  sched/fair: set_load_weight() must also call reweight_task() for SCHED_IDLE tasks
  sched/psi: Optimise psi_group_change a bit
  sched/core: Drop spinlocks on contention iff kernel is preemptible
  sched/core: Move preempt_model_*() helpers from sched.h to preempt.h
  sched/balance: Skip unnecessary updates to idle load balancer's flags
  idle: Remove stale RCU comment
  sched/headers: Move struct pre-declarations to the beginning of the header
  sched/core: Clean up kernel/sched/sched.h a bit
  sched/core: Simplify prefetch_curr_exec_start()
  sched: Fix spelling in comments
  sched/syscalls: Split out kernel/sched/syscalls.c from kernel/sched/core.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull scheduler updates from Ingo Molnar:

 - Update Daniel Bristot de Oliveira's entry in MAINTAINERS,
   and credit him in CREDITS

 - Harmonize the lock-yielding behavior on dynamically selected
   preemption models with static ones

 - Reorganize the code a bit: split out sched/syscalls.c to reduce
   the size of sched/core.c

 - Micro-optimize psi_group_change()

 - Fix set_load_weight() for SCHED_IDLE tasks

 - Misc cleanups &amp; fixes

* tag 'sched-core-2024-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Update MAINTAINERS and CREDITS
  sched/fair: set_load_weight() must also call reweight_task() for SCHED_IDLE tasks
  sched/psi: Optimise psi_group_change a bit
  sched/core: Drop spinlocks on contention iff kernel is preemptible
  sched/core: Move preempt_model_*() helpers from sched.h to preempt.h
  sched/balance: Skip unnecessary updates to idle load balancer's flags
  idle: Remove stale RCU comment
  sched/headers: Move struct pre-declarations to the beginning of the header
  sched/core: Clean up kernel/sched/sched.h a bit
  sched/core: Simplify prefetch_curr_exec_start()
  sched: Fix spelling in comments
  sched/syscalls: Split out kernel/sched/syscalls.c from kernel/sched/core.c
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'rcu.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu</title>
<updated>2024-07-15T22:25:27+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-07-15T22:25:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9855e873285f1395b9a04613c56101f04a5ec9fb'/>
<id>9855e873285f1395b9a04613c56101f04a5ec9fb</id>
<content type='text'>
Pull RCU updates from Paul McKenney:

 - Update Tasks RCU and Tasks Rude RCU description in Requirements.rst
   and clarify rcu_assign_pointer() and rcu_dereference() ordering
   properties

 - Add lockdep assertions for RCU readers, limit inline wakeups for
   callback-bypass synchronize_rcu(), add an
   rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter, add
   Uladzislau Rezki as RCU maintainer, and fix a subtle
   callback-migration memory-ordering issue

 - Remove a number of redundant memory barriers

 - Remove unnecessary bypass-list lock-contention mitigation, use
   parking API instead of open-coded ad-hoc equivalent, and upgrade
   obsolete comments

 - Revert avoidance of a deadlock that can no longer occur and properly
   synchronize Tasks Trace RCU checking of runqueues

 - Add tests for handling of double-call_rcu() bug, add missing
   MODULE_DESCRIPTION, and add a script that histograms the number of
   calls to RCU updaters

 - Fill out SRCU polled-grace-period API

* tag 'rcu.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits)
  rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU invocation
  rcu: Eliminate lockless accesses to rcu_sync-&gt;gp_count
  MAINTAINERS: Add Uladzislau Rezki as RCU maintainer
  rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter
  rcu/exp: Remove redundant full memory barrier at the end of GP
  rcu: Remove full memory barrier on RCU stall printout
  rcu: Remove full memory barrier on boot time eqs sanity check
  rcu/exp: Remove superfluous full memory barrier upon first EQS snapshot
  rcu: Remove superfluous full memory barrier upon first EQS snapshot
  rcu: Remove full ordering on second EQS snapshot
  srcu: Fill out polled grace-period APIs
  srcu: Update cleanup_srcu_struct() comment
  srcu: Add NUM_ACTIVE_SRCU_POLL_OLDSTATE
  srcu: Disable interrupts directly in srcu_gp_end()
  rcu: Disable interrupts directly in rcu_gp_init()
  rcu/tree: Reduce wake up for synchronize_rcu() common case
  rcu/tasks: Fix stale task snaphot for Tasks Trace
  tools/rcu: Add rcu-updaters.sh script
  rcutorture: Add missing MODULE_DESCRIPTION() macros
  rcutorture: Fix rcu_torture_fwd_cb_cr() data race
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull RCU updates from Paul McKenney:

 - Update Tasks RCU and Tasks Rude RCU description in Requirements.rst
   and clarify rcu_assign_pointer() and rcu_dereference() ordering
   properties

 - Add lockdep assertions for RCU readers, limit inline wakeups for
   callback-bypass synchronize_rcu(), add an
   rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter, add
   Uladzislau Rezki as RCU maintainer, and fix a subtle
   callback-migration memory-ordering issue

 - Remove a number of redundant memory barriers

 - Remove unnecessary bypass-list lock-contention mitigation, use
   parking API instead of open-coded ad-hoc equivalent, and upgrade
   obsolete comments

 - Revert avoidance of a deadlock that can no longer occur and properly
   synchronize Tasks Trace RCU checking of runqueues

 - Add tests for handling of double-call_rcu() bug, add missing
   MODULE_DESCRIPTION, and add a script that histograms the number of
   calls to RCU updaters

 - Fill out SRCU polled-grace-period API

* tag 'rcu.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits)
  rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU invocation
  rcu: Eliminate lockless accesses to rcu_sync-&gt;gp_count
  MAINTAINERS: Add Uladzislau Rezki as RCU maintainer
  rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter
  rcu/exp: Remove redundant full memory barrier at the end of GP
  rcu: Remove full memory barrier on RCU stall printout
  rcu: Remove full memory barrier on boot time eqs sanity check
  rcu/exp: Remove superfluous full memory barrier upon first EQS snapshot
  rcu: Remove superfluous full memory barrier upon first EQS snapshot
  rcu: Remove full ordering on second EQS snapshot
  srcu: Fill out polled grace-period APIs
  srcu: Update cleanup_srcu_struct() comment
  srcu: Add NUM_ACTIVE_SRCU_POLL_OLDSTATE
  srcu: Disable interrupts directly in srcu_gp_end()
  rcu: Disable interrupts directly in rcu_gp_init()
  rcu/tree: Reduce wake up for synchronize_rcu() common case
  rcu/tasks: Fix stale task snaphot for Tasks Trace
  tools/rcu: Add rcu-updaters.sh script
  rcutorture: Add missing MODULE_DESCRIPTION() macros
  rcutorture: Fix rcu_torture_fwd_cb_cr() data race
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'sched/urgent' into sched/core, to pick up fixes and refresh the branch</title>
<updated>2024-07-11T08:42:33+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2024-07-11T08:42:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=011b1134b82c2750d83a299a1369c678845de45a'/>
<id>011b1134b82c2750d83a299a1369c678845de45a</id>
<content type='text'>
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/fair: set_load_weight() must also call reweight_task() for SCHED_IDLE tasks</title>
<updated>2024-07-04T13:59:52+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2024-06-26T01:29:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d329605287020c3d1c3b0dadc63d8208e7251382'/>
<id>d329605287020c3d1c3b0dadc63d8208e7251382</id>
<content type='text'>
When a task's weight is being changed, set_load_weight() is called with
@update_load set. As weight changes aren't trivial for the fair class,
set_load_weight() calls fair.c::reweight_task() for fair class tasks.

However, set_load_weight() first tests task_has_idle_policy() on entry and
skips calling reweight_task() for SCHED_IDLE tasks. This is buggy as
SCHED_IDLE tasks are just fair tasks with a very low weight and they would
incorrectly skip load, vlag and position updates.

Fix it by updating reweight_task() to take struct load_weight as idle weight
can't be expressed with prio and making set_load_weight() call
reweight_task() for SCHED_IDLE tasks too when @update_load is set.

Fixes: 9059393e4ec1 ("sched/fair: Use reweight_entity() for set_user_nice()")
Suggested-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: stable@vger.kernel.org # v4.15+
Link: http://lkml.kernel.org/r/20240624102331.GI31592@noisy.programming.kicks-ass.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a task's weight is being changed, set_load_weight() is called with
@update_load set. As weight changes aren't trivial for the fair class,
set_load_weight() calls fair.c::reweight_task() for fair class tasks.

However, set_load_weight() first tests task_has_idle_policy() on entry and
skips calling reweight_task() for SCHED_IDLE tasks. This is buggy as
SCHED_IDLE tasks are just fair tasks with a very low weight and they would
incorrectly skip load, vlag and position updates.

Fix it by updating reweight_task() to take struct load_weight as idle weight
can't be expressed with prio and making set_load_weight() call
reweight_task() for SCHED_IDLE tasks too when @update_load is set.

Fixes: 9059393e4ec1 ("sched/fair: Use reweight_entity() for set_user_nice()")
Suggested-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: stable@vger.kernel.org # v4.15+
Link: http://lkml.kernel.org/r/20240624102331.GI31592@noisy.programming.kicks-ass.net
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath</title>
<updated>2024-07-01T11:01:44+00:00</updated>
<author>
<name>John Stultz</name>
<email>jstultz@google.com</email>
</author>
<published>2024-06-18T21:58:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ddae0ca2a8fe12d0e24ab10ba759c3fbd755ada8'/>
<id>ddae0ca2a8fe12d0e24ab10ba759c3fbd755ada8</id>
<content type='text'>
It was reported that in moving to 6.1, a larger then 10%
regression was seen in the performance of
clock_gettime(CLOCK_THREAD_CPUTIME_ID,...).

Using a simple reproducer, I found:
5.10:
100000000 calls in 24345994193 ns =&gt; 243.460 ns per call
100000000 calls in 24288172050 ns =&gt; 242.882 ns per call
100000000 calls in 24289135225 ns =&gt; 242.891 ns per call

6.1:
100000000 calls in 28248646742 ns =&gt; 282.486 ns per call
100000000 calls in 28227055067 ns =&gt; 282.271 ns per call
100000000 calls in 28177471287 ns =&gt; 281.775 ns per call

The cause of this was finally narrowed down to the addition of
psi_account_irqtime() in update_rq_clock_task(), in commit
52b1364ba0b1 ("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ
pressure").

In my initial attempt to resolve this, I leaned towards moving
all accounting work out of the clock_gettime() call path, but it
wasn't very pretty, so it will have to wait for a later deeper
rework. Instead, Peter shared this approach:

Rework psi_account_irqtime() to use its own psi_irq_time base
for accounting, and move it out of the hotpath, calling it
instead from sched_tick() and __schedule().

In testing this, we found the importance of ensuring
psi_account_irqtime() is run under the rq_lock, which Johannes
Weiner helpfully explained, so also add some lockdep annotations
to make that requirement clear.

With this change the performance is back in-line with 5.10:
6.1+fix:
100000000 calls in 24297324597 ns =&gt; 242.973 ns per call
100000000 calls in 24318869234 ns =&gt; 243.189 ns per call
100000000 calls in 24291564588 ns =&gt; 242.916 ns per call

Reported-by: Jimmy Shiu &lt;jimmyshiu@google.com&gt;
Originally-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: John Stultz &lt;jstultz@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Reviewed-by: Qais Yousef &lt;qyousef@layalina.io&gt;
Link: https://lore.kernel.org/r/20240618215909.4099720-1-jstultz@google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It was reported that in moving to 6.1, a larger then 10%
regression was seen in the performance of
clock_gettime(CLOCK_THREAD_CPUTIME_ID,...).

Using a simple reproducer, I found:
5.10:
100000000 calls in 24345994193 ns =&gt; 243.460 ns per call
100000000 calls in 24288172050 ns =&gt; 242.882 ns per call
100000000 calls in 24289135225 ns =&gt; 242.891 ns per call

6.1:
100000000 calls in 28248646742 ns =&gt; 282.486 ns per call
100000000 calls in 28227055067 ns =&gt; 282.271 ns per call
100000000 calls in 28177471287 ns =&gt; 281.775 ns per call

The cause of this was finally narrowed down to the addition of
psi_account_irqtime() in update_rq_clock_task(), in commit
52b1364ba0b1 ("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ
pressure").

In my initial attempt to resolve this, I leaned towards moving
all accounting work out of the clock_gettime() call path, but it
wasn't very pretty, so it will have to wait for a later deeper
rework. Instead, Peter shared this approach:

Rework psi_account_irqtime() to use its own psi_irq_time base
for accounting, and move it out of the hotpath, calling it
instead from sched_tick() and __schedule().

In testing this, we found the importance of ensuring
psi_account_irqtime() is run under the rq_lock, which Johannes
Weiner helpfully explained, so also add some lockdep annotations
to make that requirement clear.

With this change the performance is back in-line with 5.10:
6.1+fix:
100000000 calls in 24297324597 ns =&gt; 242.973 ns per call
100000000 calls in 24318869234 ns =&gt; 243.189 ns per call
100000000 calls in 24291564588 ns =&gt; 242.916 ns per call

Reported-by: Jimmy Shiu &lt;jimmyshiu@google.com&gt;
Originally-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: John Stultz &lt;jstultz@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Reviewed-by: Qais Yousef &lt;qyousef@layalina.io&gt;
Link: https://lore.kernel.org/r/20240618215909.4099720-1-jstultz@google.com
</pre>
</div>
</content>
</entry>
</feed>
