<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/rcu/tasks.h, branch v6.6</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge branches 'doc.2023.07.14b', 'fixes.2023.08.16a', 'rcu-tasks.2023.07.24a', 'rcuscale.2023.07.14b', 'refscale.2023.07.14b', 'torture.2023.08.14a' and 'torturescripts.2023.07.20a' into HEAD</title>
<updated>2023-08-16T21:31:08+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-08-16T21:31:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fe24a0b63278808013e1756e235e0e17e8bae281'/>
<id>fe24a0b63278808013e1756e235e0e17e8bae281</id>
<content type='text'>
doc.2023.07.14b:  Documentation updates.
fixes.2023.08.16a:  Miscellaneous fixes.
rcu-tasks.2023.07.24a:  RCU Tasks updates.
rcuscale.2023.07.14b:  RCU (updater) scalability test updates.
refscale.2023.07.14b:  Reference (reader) scalability test updates.
torture.2023.08.14a:  Other torture-test updates.
torturescripts.2023.07.20a:  Other torture-test scripting updates.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
doc.2023.07.14b:  Documentation updates.
fixes.2023.08.16a:  Miscellaneous fixes.
rcu-tasks.2023.07.24a:  RCU Tasks updates.
rcuscale.2023.07.14b:  RCU (updater) scalability test updates.
refscale.2023.07.14b:  Reference (reader) scalability test updates.
torture.2023.08.14a:  Other torture-test updates.
torturescripts.2023.07.20a:  Other torture-test scripting updates.
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu-tasks: Fix boot-time RCU tasks debug-only deadlock</title>
<updated>2023-08-14T21:58:25+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-08-01T19:11:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9d0cce2bc3874dd03f7471ec00ae4acb5a77e43c'/>
<id>9d0cce2bc3874dd03f7471ec00ae4acb5a77e43c</id>
<content type='text'>
In kernels built with CONFIG_PROVE_RCU=y (for example, lockdep kernels),
the following sequence of events can occur:

o	rcu_init_tasks_generic() is invoked just before init is spawned.
	It invokes rcu_spawn_tasks_kthread() and friends.

o	rcu_spawn_tasks_kthread() invokes rcu_spawn_tasks_kthread_generic(),
	which uses kthread_run() to create the needed kthread.

o	Control returns to rcu_init_tasks_generic(), which, because this
	is a CONFIG_PROVE_RCU=y kernel, invokes the version of the
	rcu_tasks_initiate_self_tests() function that actually does
	something, including invoking synchronize_rcu_tasks(), which
	in turn invokes synchronize_rcu_tasks_generic().

o	synchronize_rcu_tasks_generic() sees that the -&gt;kthread_ptr is
	still NULL, because the newly spawned kthread has not yet
	started.

o	The new kthread starts, preempting synchronize_rcu_tasks_generic()
	just after its check.  This kthread invokes rcu_tasks_one_gp(),
	which acquires -&gt;tasks_gp_mutex, and, seeing no work, blocks
	in rcuwait_wait_event().  Note that this step requires either
	a preemptible kernel or a fault-injection-style sleep at the
	beginning of mutex_lock().

o	synchronize_rcu_tasks_generic() resumes and invokes rcu_tasks_one_gp().

o	rcu_tasks_one_gp() attempts to acquire -&gt;tasks_gp_mutex, which
	is still held by the newly spawned kthread's rcu_tasks_one_gp()
	function.  Deadlock.

Because the only reason for -&gt;tasks_gp_mutex is to handle pre-kthread
synchronous grace periods, this commit avoids this deadlock by having
rcu_tasks_one_gp() momentarily release -&gt;tasks_gp_mutex while invoking
rcuwait_wait_event().  This allows the call to rcu_tasks_one_gp() from
synchronize_rcu_tasks_generic() proceed.

Note that it is not necessary to release the mutex anywhere else in
rcu_tasks_one_gp() because rcuwait_wait_event() is the only function
that can block indefinitely.

Reported-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Reported-by: Roy Hopkins &lt;rhopkins@suse.de&gt;
Reported-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Tested-by: Roy Hopkins &lt;rhopkins@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In kernels built with CONFIG_PROVE_RCU=y (for example, lockdep kernels),
the following sequence of events can occur:

o	rcu_init_tasks_generic() is invoked just before init is spawned.
	It invokes rcu_spawn_tasks_kthread() and friends.

o	rcu_spawn_tasks_kthread() invokes rcu_spawn_tasks_kthread_generic(),
	which uses kthread_run() to create the needed kthread.

o	Control returns to rcu_init_tasks_generic(), which, because this
	is a CONFIG_PROVE_RCU=y kernel, invokes the version of the
	rcu_tasks_initiate_self_tests() function that actually does
	something, including invoking synchronize_rcu_tasks(), which
	in turn invokes synchronize_rcu_tasks_generic().

o	synchronize_rcu_tasks_generic() sees that the -&gt;kthread_ptr is
	still NULL, because the newly spawned kthread has not yet
	started.

o	The new kthread starts, preempting synchronize_rcu_tasks_generic()
	just after its check.  This kthread invokes rcu_tasks_one_gp(),
	which acquires -&gt;tasks_gp_mutex, and, seeing no work, blocks
	in rcuwait_wait_event().  Note that this step requires either
	a preemptible kernel or a fault-injection-style sleep at the
	beginning of mutex_lock().

o	synchronize_rcu_tasks_generic() resumes and invokes rcu_tasks_one_gp().

o	rcu_tasks_one_gp() attempts to acquire -&gt;tasks_gp_mutex, which
	is still held by the newly spawned kthread's rcu_tasks_one_gp()
	function.  Deadlock.

Because the only reason for -&gt;tasks_gp_mutex is to handle pre-kthread
synchronous grace periods, this commit avoids this deadlock by having
rcu_tasks_one_gp() momentarily release -&gt;tasks_gp_mutex while invoking
rcuwait_wait_event().  This allows the call to rcu_tasks_one_gp() from
synchronize_rcu_tasks_generic() proceed.

Note that it is not necessary to release the mutex anywhere else in
rcu_tasks_one_gp() because rcuwait_wait_event() is the only function
that can block indefinitely.

Reported-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Reported-by: Roy Hopkins &lt;rhopkins@suse.de&gt;
Reported-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Tested-by: Roy Hopkins &lt;rhopkins@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu-tasks: Permit use of debug-objects with RCU Tasks flavors</title>
<updated>2023-07-31T23:06:57+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-07-28T20:33:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cb88f7f51bc6f351a529ff61d0a706c6eae1417a'/>
<id>cb88f7f51bc6f351a529ff61d0a706c6eae1417a</id>
<content type='text'>
Currently, cblist_init_generic() holds a raw spinlock when invoking
INIT_WORK().  This fails in kernels built with CONFIG_DEBUG_OBJECTS=y
due to memory allocation being forbidden while holding a raw spinlock.
But the only reason for holding the raw spinlock is to synchronize
with early boot calls to call_rcu_tasks(), call_rcu_tasks_rude, and,
last but not least, call_rcu_tasks_trace().  These calls also invoke
cblist_init_generic() in order to support early boot queueing of
callbacks.

Except that there are no early boot calls to either of these three
functions, and the BPF guys confirm that they have no plans to add any
such calls.

This commit therefore removes the synchronization and adds a
WARN_ON_ONCE() to catch the case of now-prohibited early boot RCU Tasks
callback queueing.

If early boot queueing is needed, an "initialized" flag may be added to
the rcu_tasks structure.  Then queueing a callback before this flag is set
would initialize the callback list (if needed) and queue the callback.
The decision as to where to queue the callback given the possibility of
non-zero boot CPUs is left as an exercise for the reader.

Reported-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, cblist_init_generic() holds a raw spinlock when invoking
INIT_WORK().  This fails in kernels built with CONFIG_DEBUG_OBJECTS=y
due to memory allocation being forbidden while holding a raw spinlock.
But the only reason for holding the raw spinlock is to synchronize
with early boot calls to call_rcu_tasks(), call_rcu_tasks_rude, and,
last but not least, call_rcu_tasks_trace().  These calls also invoke
cblist_init_generic() in order to support early boot queueing of
callbacks.

Except that there are no early boot calls to either of these three
functions, and the BPF guys confirm that they have no plans to add any
such calls.

This commit therefore removes the synchronization and adds a
WARN_ON_ONCE() to catch the case of now-prohibited early boot RCU Tasks
callback queueing.

If early boot queueing is needed, an "initialized" flag may be added to
the rcu_tasks structure.  Then queueing a callback before this flag is set
would initialize the callback list (if needed) and queue the callback.
The decision as to where to queue the callback given the possibility of
non-zero boot CPUs is left as an exercise for the reader.

Reported-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcuscale: Add RCU Tasks Rude testing</title>
<updated>2023-07-14T22:01:49+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-06-03T16:07:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a15ec57cfcf83858aced058e00a7feb824322fa4'/>
<id>a15ec57cfcf83858aced058e00a7feb824322fa4</id>
<content type='text'>
Add a "tasks-rude" option to the rcuscale.scale_type module parameter.

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a "tasks-rude" option to the rcuscale.scale_type module parameter.

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcuscale: Measure RCU Tasks Trace grace-period kthread CPU time</title>
<updated>2023-07-14T22:01:49+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-06-02T21:38:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=271a8467a5f7ab7654f06541d2483b5340bb7192'/>
<id>271a8467a5f7ab7654f06541d2483b5340bb7192</id>
<content type='text'>
This commit causes RCU Tasks Trace to output the CPU time consumed by
its grace-period kthread.  The CPU time is whatever is in the designated
task's current-&gt;stime field, and thus is controlled by whatever CPU-time
accounting scheme is in effect.

This output appears in microseconds as follows on the console:

rcu_scale: Grace-period kthread CPU time: 42367.037

[ paulmck: Apply Willy Tarreau feedback. ]

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit causes RCU Tasks Trace to output the CPU time consumed by
its grace-period kthread.  The CPU time is whatever is in the designated
task's current-&gt;stime field, and thus is controlled by whatever CPU-time
accounting scheme is in effect.

This output appears in microseconds as follows on the console:

rcu_scale: Grace-period kthread CPU time: 42367.037

[ paulmck: Apply Willy Tarreau feedback. ]

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcuscale: Measure grace-period kthread CPU time</title>
<updated>2023-07-14T22:01:49+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-05-17T12:01:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5f8e3202696f5e94d3fc54cfed75d270ae843ee9'/>
<id>5f8e3202696f5e94d3fc54cfed75d270ae843ee9</id>
<content type='text'>
This commit adds the ability to output the CPU time consumed by the
grace-period kthread for the RCU variant under test.  The CPU time is
whatever is in the designated task's current-&gt;stime field, and thus is
controlled by whatever CPU-time accounting scheme is in effect.

This output appears in microseconds as follows on the console:

rcu_scale: Grace-period kthread CPU time: 42367.037

[ paulmck: Apply feedback from Stephen Rothwell and kernel test robot. ]

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Tested-by: Yujie Liu &lt;yujie.liu@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit adds the ability to output the CPU time consumed by the
grace-period kthread for the RCU variant under test.  The CPU time is
whatever is in the designated task's current-&gt;stime field, and thus is
controlled by whatever CPU-time accounting scheme is in effect.

This output appears in microseconds as follows on the console:

rcu_scale: Grace-period kthread CPU time: 42367.037

[ paulmck: Apply feedback from Stephen Rothwell and kernel test robot. ]

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Tested-by: Yujie Liu &lt;yujie.liu@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu-tasks: Cancel callback laziness if too many callbacks</title>
<updated>2023-07-14T22:00:12+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-06-26T20:45:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=db13710a0365ba66f8f232eb3bdfaa1a10d820f7'/>
<id>db13710a0365ba66f8f232eb3bdfaa1a10d820f7</id>
<content type='text'>
The various RCU Tasks flavors now do lazy grace periods when there are
only asynchronous grace period requests.  By default, the system will let
250 milliseconds elapse after the first call_rcu_tasks*() callbacki is
queued before starting a grace period.  In contrast, synchronous grace
period requests such as synchronize_rcu_tasks*() will start a grace
period immediately.

However, invoking one of the call_rcu_tasks*() functions in a too-tight
loop can result in a callback flood, which in turn can exhaust memory
if grace periods are delayed for too long.

This commit therefore sets a limit so that the grace-period kthread
will be awakened when any CPU's callback list expands to contain
rcupdate.rcu_task_lazy_lim callbacks elements (defaulting to 32, set to -1
to disable), the grace-period kthread will be awakened, thus cancelling
any ongoing laziness and getting out in front of the potential callback
flood.

Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The various RCU Tasks flavors now do lazy grace periods when there are
only asynchronous grace period requests.  By default, the system will let
250 milliseconds elapse after the first call_rcu_tasks*() callbacki is
queued before starting a grace period.  In contrast, synchronous grace
period requests such as synchronize_rcu_tasks*() will start a grace
period immediately.

However, invoking one of the call_rcu_tasks*() functions in a too-tight
loop can result in a callback flood, which in turn can exhaust memory
if grace periods are delayed for too long.

This commit therefore sets a limit so that the grace-period kthread
will be awakened when any CPU's callback list expands to contain
rcupdate.rcu_task_lazy_lim callbacks elements (defaulting to 32, set to -1
to disable), the grace-period kthread will be awakened, thus cancelling
any ongoing laziness and getting out in front of the potential callback
flood.

Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu-tasks: Add kernel boot parameters for callback laziness</title>
<updated>2023-07-14T22:00:12+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-05-17T13:40:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=450d461aa629024c0f01022122838b5243fe0f2a'/>
<id>450d461aa629024c0f01022122838b5243fe0f2a</id>
<content type='text'>
This commit adds kernel boot parameters for callback laziness, allowing
the RCU Tasks flavors to be individually adjusted.

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit adds kernel boot parameters for callback laziness, allowing
the RCU Tasks flavors to be individually adjusted.

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu-tasks: Remove redundant #ifdef CONFIG_TASKS_RCU</title>
<updated>2023-07-14T22:00:12+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-05-17T12:20:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5ae769c611e7d77c1d945cb132140deb93a32d49'/>
<id>5ae769c611e7d77c1d945cb132140deb93a32d49</id>
<content type='text'>
The kernel/rcu/tasks.h file has a #endif immediately followed by an

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The kernel/rcu/tasks.h file has a #endif immediately followed by an

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu-tasks: Treat only synchronous grace periods urgently</title>
<updated>2023-07-14T22:00:11+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-05-15T01:06:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d119357d0743548f90243315de905cf6a79e5221'/>
<id>d119357d0743548f90243315de905cf6a79e5221</id>
<content type='text'>
The performance requirements on RCU Tasks, and in particular on RCU
Tasks Trace, have evolved over time as the workloads have evolved.
The current implementation is designed to provide low grace-period
latencies, and also to accommodate short-duration floods of callbacks.

However, current workloads can also provide a constant background
callback-queuing rate of a few hundred call_rcu_tasks_trace() invocations
per second.  This results in continuous back-to-back RCU Tasks Trace
grace periods, which in turn can consume the better part of 10% of a CPU.
One could take the attitude that there are several tens of other CPUs on
the systems running such workloads, but energy efficiency is a thing.
On these systems, although asynchronous grace-period requests happen
every few milliseconds, synchronous grace-period requests are quite rare.

This commit therefore arrranges for grace periods to be initiated
immediately in response to calls to synchronize_rcu_tasks*() and
also to calls to synchronize_rcu_mult() that are passed one of the
call_rcu_tasks*() functions.  These are recognized by the tell-tale
wakeme_after_rcu callback function.

In other cases, callbacks are gathered up for up to about 250 milliseconds
before a grace period is initiated.  This results in more than an order of
magnitude reduction in RCU Tasks Trace grace periods, with corresponding
reduction in consumption of CPU time.

Reported-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Reported-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The performance requirements on RCU Tasks, and in particular on RCU
Tasks Trace, have evolved over time as the workloads have evolved.
The current implementation is designed to provide low grace-period
latencies, and also to accommodate short-duration floods of callbacks.

However, current workloads can also provide a constant background
callback-queuing rate of a few hundred call_rcu_tasks_trace() invocations
per second.  This results in continuous back-to-back RCU Tasks Trace
grace periods, which in turn can consume the better part of 10% of a CPU.
One could take the attitude that there are several tens of other CPUs on
the systems running such workloads, but energy efficiency is a thing.
On these systems, although asynchronous grace-period requests happen
every few milliseconds, synchronous grace-period requests are quite rare.

This commit therefore arrranges for grace periods to be initiated
immediately in response to calls to synchronize_rcu_tasks*() and
also to calls to synchronize_rcu_mult() that are passed one of the
call_rcu_tasks*() functions.  These are recognized by the tell-tale
wakeme_after_rcu callback function.

In other cases, callbacks are gathered up for up to about 250 milliseconds
before a grace period is initiated.  This results in more than an order of
magnitude reduction in RCU Tasks Trace grace periods, with corresponding
reduction in consumption of CPU time.

Reported-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Reported-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
