linux.git/kernel/rcu/tasks.h, branch v6.11

rcu/tasks: Fix stale task snaphot for Tasks Trace

2024-06-06T18:50:04+00:00

When RCU-TASKS-TRACE pre-gp takes a snapshot of the current task running
on all online CPUs, no explicit ordering synchronizes properly with a
context switch.  This lack of ordering can permit the new task to miss
pre-grace-period update-side accesses.  The following diagram, courtesy
of Paul, shows the possible bad scenario:

        CPU 0                                           CPU 1
        -----                                           -----

        // Pre-GP update side access
        WRITE_ONCE(*X, 1);
        smp_mb();
        r0 = rq->curr;
                                                        RCU_INIT_POINTER(rq->curr, TASK_B)
                                                        spin_unlock(rq)
                                                        rcu_read_lock_trace()
                                                        r1 = X;
        /* ignore TASK_B */

Either r0==TASK_B or r1==1 is needed but neither is guaranteed.

One possible solution to solve this is to wait for an RCU grace period
at the beginning of the RCU-tasks-trace grace period before taking the
current tasks snaphot. However this would introduce large additional
latencies to RCU-tasks-trace grace periods.

Another solution is to lock the target runqueue while taking the current
task snapshot. This ensures that the update side sees the latest context
switch and subsequent context switches will see the pre-grace-period
update side accesses.

This commit therefore adds runqueue locking to cpu_curr_snapshot().

Fixes: e386b6725798 ("rcu-tasks: Eliminate RCU Tasks Trace IPIs to online CPUs")
Signed-off-by: Frederic Weisbecker 
Signed-off-by: Paul E. McKenney

Revert "rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()"

2024-06-04T00:30:08+00:00

This reverts commit 28319d6dc5e2ffefa452c2377dd0f71621b5bff0. The race
it fixed was subject to conditions that don't exist anymore since:

	1612160b9127 ("rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasks")

This latter commit removes the use of SRCU that used to cover the
RCU-tasks blind spot on exit between the tasklist's removal and the
final preemption disabling. The task is now placed instead into a
temporary list inside which voluntary sleeps are accounted as RCU-tasks
quiescent states. This would disarm the deadlock initially reported
against PID namespace exit.

Signed-off-by: Frederic Weisbecker 
Reviewed-by: Oleg Nesterov 
Signed-off-by: Paul E. McKenney

Merge branches 'fixes.2024.04.15a', 'misc.2024.04.12a', 'rcu-sync-normal-improve.2024.04.15a', 'rcu-tasks.2024.04.15a' and 'rcutorture.2024.04.15a' into rcu-merge.2024.04.15a

2024-05-01T11:04:02+00:00

fixes.2024.04.15a: RCU fixes
misc.2024.04.12a: Miscellaneous fixes
rcu-sync-normal-improve.2024.04.15a: Improving synchronize_rcu() call
rcu-tasks.2024.04.15a: Tasks RCU updates
rcutorture.2024.04.15a: Torture-test updates

rcutorture: Make rcutorture support print rcu-tasks gp state

2024-04-16T09:16:35+00:00

This commit make rcu-tasks related rcutorture test support rcu-tasks
gp state printing when the writer stall occurs or the at the end of
rcutorture test, and generate rcu_ops->get_gp_data() operation to
simplify the acquisition of gp state for different types of rcutorture
tests.

Signed-off-by: Zqiang 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Uladzislau Rezki (Sony)

rcu-tasks: Fix show_rcu_tasks_trace_gp_kthread buffer overflow

2024-04-15T17:36:55+00:00

There is a possibility of buffer overflow in
show_rcu_tasks_trace_gp_kthread() if counters, passed
to sprintf() are huge. Counter numbers, needed for this
are unrealistically high, but buffer overflow is still
possible.

Use snprintf() with buffer size instead of sprintf().

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: edf3775f0ad6 ("rcu-tasks: Add count for idle tasks on offline CPUs")
Signed-off-by: Nikita Kiryushin 
Reviewed-by: Steven Rostedt (Google) 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Uladzislau Rezki (Sony)

rcu-tasks: Fix the comments for tasks_rcu_exit_srcu_stall_timer

2024-04-15T17:36:55+00:00

The synchronize_srcu() has been removed by commit("rcu-tasks: Eliminate
deadlocks involving do_exit() and RCU tasks") in rcu_tasks_postscan.
This commit therefore fixes the tasks_rcu_exit_srcu_stall_timer comment.

Signed-off-by: Zqiang 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Uladzislau Rezki (Sony)

rcu-tasks: Replace exit_tasks_rcu_start() initialization with WARN_ON_ONCE()

2024-04-15T17:36:41+00:00

Because the Tasks RCU ->rtp_exit_list is initialized at rcu_init()
time while there is only one CPU running with interrupts disabled, it
is not possible for an exiting task to encounter an uninitialized list.
This commit therefore replaces the conditional initialization with
a WARN_ON_ONCE().

Reported-by: Frederic Weisbecker 
Closes: https://lore.kernel.org/all/ZdiNXmO3wRvmzPsr@lothringen/
Signed-off-by: Paul E. McKenney 
Signed-off-by: Uladzislau Rezki (Sony)

rcu: Inform KCSAN of one-byte cmpxchg() in rcu_trc_cmpxchg_need_qs()

2024-04-15T16:12:18+00:00

Tasks Trace RCU needs a single-byte cmpxchg(), but no such thing exists.
Therefore, rcu_trc_cmpxchg_need_qs() emulates one using field substitution
and a four-byte cmpxchg(), such that the other three bytes are always
atomically updated to their old values.  This works, but results in
false-positive KCSAN failures because as far as KCSAN knows, this
cmpxchg() operation is updating all four bytes.

This commit therefore encloses the cmpxchg() in a data_race() and adds
a single-byte instrument_atomic_read_write(), thus telling KCSAN exactly
what is going on so as to avoid the false positives.

Signed-off-by: Paul E. McKenney 
Cc: Marco Elver 
Signed-off-by: Uladzislau Rezki (Sony)

rcu-tasks: Make Tasks RCU wait idly for grace-period delays

2024-04-09T13:11:49+00:00

Currently, all waits for grace periods sleep at TASK_UNINTERRUPTIBLE,
regardless of RCU flavor.  This has worked well, but there have been
cases where a longer-than-average Tasks RCU grace period has triggered
softlockup splats, many of them, before the Tasks RCU CPU stall warning
appears.  These softlockup splats unnecessarily consume console bandwidth
and complicate diagnosis of the underlying problem.  Plus a long but not
pathologically long Tasks RCU grace period might trigger a few softlockup
splats before completing normally, which generates noise for no good
reason.

This commit therefore causes Tasks RCU grace periods to sleep at TASK_IDLE
priority.  If there really is a persistent problem, the eventual Tasks
RCU CPU stall warning will flag it, and without the extra noise.

Reported-by: Breno Leitao 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Uladzislau Rezki (Sony)

rcu-tasks: Maintain real-time response in rcu_tasks_postscan()

2024-02-25T22:27:49+00:00

The current code will scan the entirety of each per-CPU list of exiting
tasks in ->rtp_exit_list with interrupts disabled.  This is normally just
fine, because each CPU typically won't have very many tasks in this state.
However, if a large number of tasks block late in do_exit(), these lists
could be arbitrarily long.  Low probability, perhaps, but it really
could happen.

This commit therefore occasionally re-enables interrupts while traversing
these lists, inserting a dummy element to hold the current place in the
list.  In kernels built with CONFIG_PREEMPT_RT=y, this re-enabling happens
after each list element is processed, otherwise every one-to-two jiffies.

[ paulmck: Apply Frederic Weisbecker feedback. ]

Link: https://lore.kernel.org/all/ZdeI_-RfdLR8jlsm@localhost.localdomain/

Signed-off-by: Paul E. McKenney 
Cc: Thomas Gleixner 
Cc: Sebastian Siewior 
Cc: Anna-Maria Behnsen 
Cc: Steven Rostedt 
Signed-off-by: Boqun Feng