<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/sched/stop_task.c, branch v7.2-rc1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>sched: Rework prev_balance() to avoid stale prev references</title>
<updated>2026-06-02T10:26:06+00:00</updated>
<author>
<name>John Stultz</name>
<email>jstultz@google.com</email>
</author>
<published>2026-05-12T02:56:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7a3a6bfbd62a2ba3e0ef1e92d6b71abb66890825'/>
<id>7a3a6bfbd62a2ba3e0ef1e92d6b71abb66890825</id>
<content type='text'>
Historically, the prev value from __schedule() was the rq-&gt;curr.
This prev value is passed down through numerous functions, and
used in the class scheduler implementations. The fact that
prev was on_cpu until the end of __schedule(), meant it was
stable across the rq lock drops that the class-&gt;balance()
implementations often do.

However, with proxy-exec, the prev passed to functions called
by __schedule() is rq-&gt;donor, which may not be the same as
rq-&gt;curr and may not be on_cpu, this makes the prev value
potentially unstable across rq lock drops.

A recently found issue with proxy-exec, is when we begin doing
return migration from try_to_wake_up(), its possible we may be
waking up the rq-&gt;donor.  When we do this, we proxy_resched_idle()
to put_prev_set_next() setting the rq-&gt;donor to rq-&gt;idle, allowing
the rq-&gt;donor to be return migrated and allowed to run.

This however runs into trouble, as on another cpu we might be in
the middle of calling __schedule(). Conceptually the rq lock is
held for the majority of the time, but in calling prev_balance()
its possible the class-&gt;balance() handler call may briefly drop the rq lock.
This opens a window for try_to_wake_up() to wake and return migrate the
rq-&gt;donor before the class logic reacquires the rq lock.

Unfortunately prev_balance() pass in a prev argument, to which we pass
rq-&gt;donor. However this prev value can now become stale and incorrect across a
rq lock drop.

So, to correct this, rework the prev_balance() call so that it does not take a
"prev" argument.

Signed-off-by: John Stultz &lt;jstultz@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://patch.msgid.link/20260512025635.2840817-2-jstultz@google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Historically, the prev value from __schedule() was the rq-&gt;curr.
This prev value is passed down through numerous functions, and
used in the class scheduler implementations. The fact that
prev was on_cpu until the end of __schedule(), meant it was
stable across the rq lock drops that the class-&gt;balance()
implementations often do.

However, with proxy-exec, the prev passed to functions called
by __schedule() is rq-&gt;donor, which may not be the same as
rq-&gt;curr and may not be on_cpu, this makes the prev value
potentially unstable across rq lock drops.

A recently found issue with proxy-exec, is when we begin doing
return migration from try_to_wake_up(), its possible we may be
waking up the rq-&gt;donor.  When we do this, we proxy_resched_idle()
to put_prev_set_next() setting the rq-&gt;donor to rq-&gt;idle, allowing
the rq-&gt;donor to be return migrated and allowed to run.

This however runs into trouble, as on another cpu we might be in
the middle of calling __schedule(). Conceptually the rq lock is
held for the majority of the time, but in calling prev_balance()
its possible the class-&gt;balance() handler call may briefly drop the rq lock.
This opens a window for try_to_wake_up() to wake and return migrate the
rq-&gt;donor before the class logic reacquires the rq lock.

Unfortunately prev_balance() pass in a prev argument, to which we pass
rq-&gt;donor. However this prev value can now become stale and incorrect across a
rq lock drop.

So, to correct this, rework the prev_balance() call so that it does not take a
"prev" argument.

Signed-off-by: John Stultz &lt;jstultz@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://patch.msgid.link/20260512025635.2840817-2-jstultz@google.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/core: Rework sched_class::wakeup_preempt() and rq_modified_*()</title>
<updated>2025-12-17T09:53:25+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2025-12-10T08:06:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=704069649b5bfb7bf1fe32c0281fe9036806a59a'/>
<id>704069649b5bfb7bf1fe32c0281fe9036806a59a</id>
<content type='text'>
Change sched_class::wakeup_preempt() to also get called for
cross-class wakeups, specifically those where the woken task
is of a higher class than the previous highest class.

In order to do this, track the current highest class of the runqueue
in rq::next_class and have wakeup_preempt() track this upwards for
each new wakeup. Additionally have schedule() re-set the value on
pick.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://patch.msgid.link/20251127154725.901391274@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change sched_class::wakeup_preempt() to also get called for
cross-class wakeups, specifically those where the woken task
is of a higher class than the previous highest class.

In order to do this, track the current highest class of the runqueue
in rq::next_class and have wakeup_preempt() track this upwards for
each new wakeup. Additionally have schedule() re-set the value on
pick.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://patch.msgid.link/20251127154725.901391274@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Add support to pick functions to take rf</title>
<updated>2025-10-16T09:13:55+00:00</updated>
<author>
<name>Joel Fernandes</name>
<email>joelagnelf@nvidia.com</email>
</author>
<published>2025-08-09T18:47:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=50653216e4ff7a74c95b2ee9ec439916875556ec'/>
<id>50653216e4ff7a74c95b2ee9ec439916875556ec</id>
<content type='text'>
Some pick functions like the internal pick_next_task_fair() already take
rf but some others dont. We need this for scx's server pick function.
Prepare for this by having pick functions accept it.

[peterz: - added RETRY_TASK handling
         - removed pick_next_task_fair indirection]
Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some pick functions like the internal pick_next_task_fair() already take
rf but some others dont. We need this for scx's server pick function.
Prepare for this by having pick functions accept it.

[peterz: - added RETRY_TASK handling
         - removed pick_next_task_fair indirection]
Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Detect per-class runqueue changes</title>
<updated>2025-10-16T09:13:55+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2025-10-01T13:50:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1e900f415c6082cd4bcdae4c92515d21fb389473'/>
<id>1e900f415c6082cd4bcdae4c92515d21fb389473</id>
<content type='text'>
Have enqueue/dequeue set a per-class bit in rq-&gt;queue_mask. This then
enables easy tracking of which runqueues are modified over a
lock-break.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Have enqueue/dequeue set a per-class bit in rq-&gt;queue_mask. This then
enables easy tracking of which runqueues are modified over a
lock-break.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Move sched_class::prio_changed() into the change pattern</title>
<updated>2025-10-16T09:13:52+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-11-01T13:16:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6455ad5346c9cf755fa9dda6e326c4028fb3c853'/>
<id>6455ad5346c9cf755fa9dda6e326c4028fb3c853</id>
<content type='text'>
Move sched_class::prio_changed() into the change pattern.

And while there, extend it with sched_class::get_prio() in order to
fix the deadline sitation.

Suggested-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move sched_class::prio_changed() into the change pattern.

And while there, extend it with sched_class::get_prio() in order to
fix the deadline sitation.

Suggested-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Fold sched_class::switch{ing,ed}_{to,from}() into the change pattern</title>
<updated>2025-10-16T09:13:51+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-10-30T14:08:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=637b0682821b144d5993211cf0a768b322138a69'/>
<id>637b0682821b144d5993211cf0a768b322138a69</id>
<content type='text'>
Add {DE,EN}QUEUE_CLASS and fold the sched_class::switch* methods into
the change pattern. This completes and makes the pattern more
symmetric.

This changes the order of callbacks slightly:

  OLD                              NEW
				|
				|  switching_from()
  dequeue_task();		|  dequeue_task()
  put_prev_task();		|  put_prev_task()
				|  switched_from()
				|
  ... change task ...		|  ... change task ...
				|
  switching_to();		|  switching_to()
  enqueue_task();		|  enqueue_task()
  set_next_task();		|  set_next_task()
  prev_class-&gt;switched_from()	|
  switched_to()			|  switched_to()
				|

Notably, it moves the switched_from() callback right after the
dequeue/put. Existing implementations don't appear to be affected by
this change in location -- specifically the task isn't enqueued on the
class in question in either location.

Make (CLASS)^(SAVE|MOVE), because there is nothing to save-restore
when changing scheduling classes.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add {DE,EN}QUEUE_CLASS and fold the sched_class::switch* methods into
the change pattern. This completes and makes the pattern more
symmetric.

This changes the order of callbacks slightly:

  OLD                              NEW
				|
				|  switching_from()
  dequeue_task();		|  dequeue_task()
  put_prev_task();		|  put_prev_task()
				|  switched_from()
				|
  ... change task ...		|  ... change task ...
				|
  switching_to();		|  switching_to()
  enqueue_task();		|  enqueue_task()
  set_next_task();		|  set_next_task()
  prev_class-&gt;switched_from()	|
  switched_to()			|  switched_to()
				|

Notably, it moves the switched_from() callback right after the
dequeue/put. Existing implementations don't appear to be affected by
this change in location -- specifically the task isn't enqueued on the
class in question in either location.

Make (CLASS)^(SAVE|MOVE), because there is nothing to save-restore
when changing scheduling classes.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/smp: Use the SMP version of the stop-CPU scheduling class</title>
<updated>2025-06-13T06:47:21+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2025-05-28T08:09:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=caf5bde9c542f0cbdf2c91db7ea9743e33941d91'/>
<id>caf5bde9c542f0cbdf2c91db7ea9743e33941d91</id>
<content type='text'>
Simplify the scheduler by making CONFIG_SMP=y code in the stop-CPU
scheduling class unconditional.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Cc: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Shrikanth Hegde &lt;sshegde@linux.ibm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Valentin Schneider &lt;vschneid@redhat.com&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Link: https://lore.kernel.org/r/20250528080924.2273858-36-mingo@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Simplify the scheduler by making CONFIG_SMP=y code in the stop-CPU
scheduling class unconditional.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Cc: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Shrikanth Hegde &lt;sshegde@linux.ibm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Valentin Schneider &lt;vschneid@redhat.com&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Link: https://lore.kernel.org/r/20250528080924.2273858-36-mingo@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Make clangd usable</title>
<updated>2025-06-11T09:20:53+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2025-05-23T16:26:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3d7e10188ae0b68dadd60f611ca81ecf9d991f77'/>
<id>3d7e10188ae0b68dadd60f611ca81ecf9d991f77</id>
<content type='text'>
Due to the weird Makefile setup of sched the various files do not
compile as stand alone units. The new generation of editors are trying
to do just this -- mostly to offer fancy things like completions but
also better syntax highlighting and code navigation.

Specifically, I've been playing around with neovim and clangd.

Setting up clangd on the kernel source is a giant pain in the arse
(this really should be improved), but once you do manage, you run into
dumb stuff like the above.

Fix up the scheduler files to at least pretend to work.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Tested-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Link: https://lkml.kernel.org/r/20250523164348.GN39944@noisy.programming.kicks-ass.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Due to the weird Makefile setup of sched the various files do not
compile as stand alone units. The new generation of editors are trying
to do just this -- mostly to offer fancy things like completions but
also better syntax highlighting and code navigation.

Specifically, I've been playing around with neovim and clangd.

Setting up clangd on the kernel source is a giant pain in the arse
(this really should be improved), but once you do manage, you run into
dumb stuff like the above.

Fix up the scheduler files to at least pretend to work.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Tested-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Link: https://lkml.kernel.org/r/20250523164348.GN39944@noisy.programming.kicks-ass.net
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Add put_prev_task(.next)</title>
<updated>2024-09-03T13:26:32+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-08-13T22:25:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b2d70222dbf2a2ff7a972a685d249a5d75afa87f'/>
<id>b2d70222dbf2a2ff7a972a685d249a5d75afa87f</id>
<content type='text'>
In order to tell the previous sched_class what the next task is, add
put_prev_task(.next).

Notable SCX will use this to:

 1) determine the next task will leave the SCX sched class and push
    the current task to another CPU if possible.
 2) statistics on how often and which other classes preempt it

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240813224016.367421076@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to tell the previous sched_class what the next task is, add
put_prev_task(.next).

Notable SCX will use this to:

 1) determine the next task will leave the SCX sched class and push
    the current task to another CPU if possible.
 2) statistics on how often and which other classes preempt it

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240813224016.367421076@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Rework pick_next_task()</title>
<updated>2024-09-03T13:26:31+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-08-13T22:25:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fd03c5b8585562d60f8b597b4332d28f48abfe7d'/>
<id>fd03c5b8585562d60f8b597b4332d28f48abfe7d</id>
<content type='text'>
The current rule is that:

  pick_next_task() := pick_task() + set_next_task(.first = true)

And many classes implement it directly as such. Change things around
to make pick_next_task() optional while also changing the definition to:

  pick_next_task(prev) := pick_task() + put_prev_task() + set_next_task(.first = true)

The reason is that sched_ext would like to have a 'final' call that
knows the next task. By placing put_prev_task() right next to
set_next_task() (as it already is for sched_core) this becomes
trivial.

As a bonus, this is a nice cleanup on its own.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240813224016.051225657@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current rule is that:

  pick_next_task() := pick_task() + set_next_task(.first = true)

And many classes implement it directly as such. Change things around
to make pick_next_task() optional while also changing the definition to:

  pick_next_task(prev) := pick_task() + put_prev_task() + set_next_task(.first = true)

The reason is that sched_ext would like to have a 'final' call that
knows the next task. By placing put_prev_task() right next to
set_next_task() (as it already is for sched_core) this becomes
trivial.

As a bonus, this is a nice cleanup on its own.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240813224016.051225657@infradead.org
</pre>
</div>
</content>
</entry>
</feed>
