linux.git/kernel/rcu/tree_stall.h, branch v6.9

rcu: Restrict access to RCU CPU stall notifiers

2023-12-11T21:01:22+00:00

The RCU CPU stall notifiers can be useful for dumping state when
tracking down delicate forward-progress bugs in which NUMA effects cause
cache lines to be delivered to a given CPU regularly, but always in a
state that prevents that CPU from making forward progress.  These bugs can
be detected by the RCU CPU stall-warning mechanism, but in some cases,
the stall-warning printk()s disrupt the forward-progress bug before
any useful state can be obtained.

Unfortunately, the notifier mechanism added by commit 5b404fdabacf ("rcu:
Add RCU CPU stall notifier") can make matters worse if used at all
carelessly. For example, if the stall warning was caused by a lock not
being released, then any attempt to acquire that lock in the notifier
will hang. This will not only prevent the notifier from producing any
useful output, but will also prevent the stall-warning message from
ever appearing.

This commit therefore hides this new RCU CPU stall notifier
mechanism under a new RCU_CPU_STALL_NOTIFIER Kconfig option that
depends on both DEBUG_KERNEL and RCU_EXPERT.  In addition, the
rcupdate.rcu_cpu_stall_notifiers=1 kernel boot parameter must also
be specified.  The RCU_CPU_STALL_NOTIFIER Kconfig option's help text
contains a warning and explains the dangers of careless use, recommending
lockless notifier code.  In addition, a WARN() is triggered each time
that an attempt is made to register a stall-warning notifier in kernels
built with CONFIG_RCU_CPU_STALL_NOTIFIER=y.

This combination of measures will keep use of this mechanism confined to
debug kernels and away from routine deployments.
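
The two-level gating described above can be pictured with a small
userspace model.  This is only a sketch: the structure, its fields, and
model_register() are hypothetical stand-ins for the Kconfig option, the
boot parameter, and the WARN(), not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the gates; none of these are kernel names. */
struct stall_notifier_gate {
	bool kconfig_enabled;    /* models CONFIG_RCU_CPU_STALL_NOTIFIER=y */
	bool boot_param_set;     /* models rcupdate.rcu_cpu_stall_notifiers=1 */
	unsigned int warnings;   /* models the WARN() on each registration attempt */
	unsigned int registered; /* notifiers actually accepted */
};

/* Returns true only when both gates are open. */
bool model_register(struct stall_notifier_gate *g)
{
	assert(g != NULL);
	if (!g->kconfig_enabled)
		return false;    /* mechanism is compiled out entirely */
	g->warnings++;           /* every attempt is noisy in such a build */
	if (!g->boot_param_set)
		return false;    /* built in, but not enabled at boot */
	g->registered++;
	return true;
}
```

In this model a registration attempt succeeds only in a build with the
option enabled that was also booted with the parameter set, and it is
never silent in such a build.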

[ paulmck: Apply Dan Carpenter feedback. ]

Fixes: 5b404fdabacf ("rcu: Add RCU CPU stall notifier")
Reported-by: Linus Torvalds 
Signed-off-by: Paul E. McKenney 
Reviewed-by: Joel Fernandes (Google) 
Signed-off-by: Neeraj Upadhyay (AMD)

rcu/tree: Defer setting of jiffies during stall reset

2023-09-11T20:36:40+00:00

There are instances where rcu_cpu_stall_reset() is called when jiffies
has not had a chance to update for a long time. Before jiffies is
updated, the CPU stall detector can go off, triggering false positives
in which a just-started grace period appears to be ages old. In the past,
stall detection was disabled in rcu_cpu_stall_reset(), but this was
changed [1]. The result is false positives in the KGDB use case [2].

Fix this by deferring the update of jiffies to the third run of the FQS
loop. This is more robust, as, even if rcu_cpu_stall_reset() is called
just before jiffies is read, we would end up pushing out the jiffies
read by 3 more FQS loops. Meanwhile the CPU stall detection will be
delayed and we will not get any false positives.
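
The deferral can be sketched as a userspace model.  The structure and
function names below are assumptions made for illustration, and the
far-future deadline of ULONG_MAX / 2 is this model's way of suppressing
the detector; the kernel's actual fields and values may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; not the kernel's rcu_state. */
struct stall_model {
	unsigned long jiffies;       /* models the tick counter */
	unsigned long jiffies_stall; /* time at which a stall would be reported */
	int nr_fqs_deferred;         /* FQS passes left before re-arming */
	unsigned long timeout;       /* stall timeout, in ticks */
};

/* Models rcu_cpu_stall_reset(): push the deadline far out, defer re-arming. */
void model_stall_reset(struct stall_model *m)
{
	assert(m != NULL);
	m->nr_fqs_deferred = 3;
	m->jiffies_stall = m->jiffies + ULONG_MAX / 2;
}

/* Models one pass of the FQS loop: re-arm from jiffies only on the third pass. */
void model_fqs_pass(struct stall_model *m)
{
	assert(m != NULL);
	if (m->nr_fqs_deferred > 0 && --m->nr_fqs_deferred == 0)
		m->jiffies_stall = m->jiffies + m->timeout;
}

/* Wraparound-safe check for "jiffies has reached jiffies_stall". */
bool model_stall_due(const struct stall_model *m)
{
	assert(m != NULL);
	return (long)(m->jiffies - m->jiffies_stall) >= 0;
}
```

Because the deadline is recomputed from a jiffies value read three
passes later, a large jump in jiffies right after the reset does not
make the model report a stall.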

[1] https://lore.kernel.org/all/20210521155624.174524-2-senozhatsky@chromium.org/
[2] https://lore.kernel.org/all/20230814020045.51950-2-chenhuacai@loongson.cn/

Tested with rcutorture.cpu_stall option as well to verify stall behavior
with/without patch.

Tested-by: Huacai Chen 
Reported-by: Binbin Zhou 
Closes: https://lore.kernel.org/all/20230814020045.51950-2-chenhuacai@loongson.cn/
Suggested-by: Paul McKenney 
Cc: Sergey Senozhatsky 
Cc: Thomas Gleixner 
Cc: stable@vger.kernel.org
Fixes: a80be428fbc1 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
Signed-off-by: Joel Fernandes (Google) 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Frederic Weisbecker

rcu: Add RCU CPU stall notifier

2023-09-11T20:10:47+00:00

It is sometimes helpful to have a way for the subsystem causing
the stall to dump its state when an RCU CPU stall occurs.  This
commit therefore bases rcu_stall_chain_notifier_register() and
rcu_stall_chain_notifier_unregister() on atomic notifiers in order to
provide this functionality.
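
The register/unregister/call pattern can be illustrated with a minimal
userspace notifier chain.  Everything prefixed with model_ is
hypothetical and only mimics the shape of a notifier chain; it is not
the kernel's struct notifier_block or its atomic notifier API, and
passing the stall duration through the data pointer is an assumption
of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace notifier; not the kernel's struct notifier_block. */
struct model_notifier {
	int (*call)(struct model_notifier *self, unsigned long event, void *data);
	struct model_notifier *next;
};

struct model_notifier *model_stall_chain;
unsigned long model_last_stall_duration;

void model_stall_notifier_register(struct model_notifier *n)
{
	assert(n != NULL && n->call != NULL);
	n->next = model_stall_chain;
	model_stall_chain = n;
}

/* Returns 0 if the notifier was found and removed, -1 otherwise. */
int model_stall_notifier_unregister(struct model_notifier *n)
{
	for (struct model_notifier **pp = &model_stall_chain; *pp; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Invoked by the model's stall detector; returns how many callbacks ran. */
int model_stall_notifier_call_chain(unsigned long event, void *data)
{
	int ran = 0;

	for (struct model_notifier *n = model_stall_chain; n; n = n->next) {
		n->call(n, event, data);
		ran++;
	}
	return ran;
}

/* Example subsystem callback: records state only and takes no locks. */
int model_dump_state(struct model_notifier *self, unsigned long event, void *data)
{
	(void)self;
	(void)event;
	model_last_stall_duration = (unsigned long)(uintptr_t)data;
	return 0;
}
```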

Signed-off-by: Paul E. McKenney 
Cc: Steven Rostedt 
Signed-off-by: Frederic Weisbecker

rcu: Eliminate check_cpu_stall() duplicate code

2023-09-11T19:48:36+00:00

The code and comments of self-detected and other-detected RCU CPU stall
warnings are identical except for the output function.  This commit
therefore refactors the code to consolidate the duplication.

Signed-off-by: Zhen Lei 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Frederic Weisbecker

rcu: Don't redump the stalled CPU where RCU GP kthread last ran

2023-09-11T19:46:54+00:00

The stacks of all stalled CPUs will be dumped in rcu_dump_cpu_stacks().
If the CPU on which the RCU GP kthread last ran is stalled, its stack
does not need to be dumped again. The corresponding backtrace can be
found by searching for the printed CPU ID.

For example:
[   87.328275] rcu: rcu_sched kthread starved for ... ->cpu=3  <--------|
... ...                                                                 |
[   89.385007] NMI backtrace for cpu 3                         <--------|
[   89.385179] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.10.0+ #22 <--|
[   89.385188] Hardware name: linux,dummy-virt (DT)
[   89.385196] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[   89.385204] pc : arch_cpu_idle+0x40/0xc0
[   89.385211] lr : arch_cpu_idle+0x2c/0xc0
... ...
[   89.385566] Call trace:
[   89.385574]  arch_cpu_idle+0x40/0xc0
[   89.385581]  default_idle_call+0x100/0x450
[   89.385589]  cpuidle_idle_call+0x2f8/0x460
[   89.385596]  do_idle+0x1dc/0x3d0
[   89.385604]  cpu_startup_entry+0x5c/0xb0
[   89.385613]  secondary_start_kernel+0x35c/0x520

Signed-off-by: Zhen Lei 
Reviewed-by: Joel Fernandes (Google) 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Frederic Weisbecker

rcu: Delete a redundant check in rcu_check_gp_kthread_starvation()

2023-09-11T19:45:29+00:00

The rcu_check_gp_kthread_starvation() function uses task_cpu() to sample
the last CPU that the grace-period kthread ran on, and task_cpu() samples
the thread_info structure's ->cpu field.  But this field will always
contain a number corresponding to a CPU that was online some time in
the past, thus never a negative number.  This invariant is checked by
a WARN_ON_ONCE() in set_task_cpu().

This means that if the grace-period kthread exists, that is, if the "gpk"
local variable is non-NULL, the "cpu" local variable will be non-negative.
This in turn means that the existing check for non-negative "cpu" is
redundant with the enclosing check for non-NULL "gpk".

This commit therefore removes the redundant check of "cpu".

Signed-off-by: Zhen Lei 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Frederic Weisbecker

tty: sysrq: switch sysrq handlers from int to u8

2023-07-25T17:21:03+00:00

The passed parameter to sysrq handlers is a key (a character). So change
the type from 'int' to 'u8'. Let it specifically be 'u8' for two
reasons:
* unsigned: unsigned values come from the upper layers (devices) and the
  tty layer assumes unsigned in most places, and
* 8-bit: as that is what all the layers built on top of tty are supposed
  to use one day. (Currently, we mostly use 'unsigned char' and in some
  places still only 'char', but that also translates to the former
  thanks to -funsigned-char.)
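
The reason for insisting on an unsigned 8-bit type can be seen in a
small userspace sketch.  The typedef stands in for the kernel's u8, and
the function names are hypothetical; the point is only what happens
when a key above 0x7f is widened to int.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8;  /* userspace stand-in for the kernel's u8 */

u8 model_last_key;

/* A signed 8-bit key is sign-extended when widened. */
int model_widen_signed(signed char key)
{
	return key;
}

/* A u8 key is zero-extended, so it stays in the range 0..255. */
int model_widen_u8(u8 key)
{
	return key;
}

/* Hypothetical handler in the post-change shape: the key arrives as u8. */
void model_sysrq_handler(u8 key)
{
	assert(model_widen_u8(key) >= 0);  /* always holds for u8 */
	model_last_key = key;
}
```

On the usual two's-complement targets, 0xe9 widens to -23 through the
signed path and to 233 through the u8 path, so only the latter is safe
to use as a table index or to compare against character constants.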

Signed-off-by: Jiri Slaby (SUSE) 
Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Cc: Huacai Chen 
Cc: WANG Xuerui 
Cc: Thomas Bogendoerfer 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Cc: "David S. Miller" 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Jason Wessel 
Cc: Daniel Thompson 
Cc: Douglas Anderson 
Cc: "Rafael J. Wysocki" 
Cc: Len Brown 
Cc: Pavel Machek 
Cc: "Paul E. McKenney" 
Cc: Frederic Weisbecker 
Cc: Neeraj Upadhyay 
Cc: Joel Fernandes 
Cc: Josh Triplett 
Cc: Boqun Feng 
Cc: Steven Rostedt 
Cc: Mathieu Desnoyers 
Cc: Lai Jiangshan 
Cc: Zqiang 
Acked-by: Thomas Zimmermann  # DRM
Acked-by: WANG Xuerui  # loongarch
Acked-by: Paul E. McKenney 
Acked-by: Daniel Thompson 
Link: https://lore.kernel.org/r/20230712081811.29004-3-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman

rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts

2023-01-09T20:09:52+00:00

The maximum value of RCU CPU stall-warning timeouts has historically been
five minutes (300 seconds).  However, the recently introduced expedited
RCU CPU stall-warning timeout is instead limited to 21 seconds.  This
causes problems for CI/fuzzing services such as syzkaller by obscuring
the issue in question with expedited RCU CPU stall-warning timeout splats.

This commit therefore sets the RCU_EXP_CPU_STALL_TIMEOUT Kconfig option's
upper bound to 300000 milliseconds, which is 300 seconds (AKA 5 minutes).
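
The unit arithmetic and the effect of the new bound can be sketched in
userspace C.  The macro and function names are hypothetical, and
treating negative requests as zero is an assumption of this sketch
rather than something stated above.

```c
#include <assert.h>

/* New upper bound from this commit: 300000 ms, that is, 300 s. */
#define MODEL_EXP_STALL_TIMEOUT_MAX_MS 300000L

/* Clamp a requested expedited stall timeout (ms) to the allowed range. */
long model_clamp_exp_timeout_ms(long requested_ms)
{
	if (requested_ms < 0)
		return 0;  /* assumption: negative requests are treated as zero */
	if (requested_ms > MODEL_EXP_STALL_TIMEOUT_MAX_MS)
		return MODEL_EXP_STALL_TIMEOUT_MAX_MS;
	return requested_ms;
}

/* Convert milliseconds to whole seconds. */
long model_ms_to_seconds(long ms)
{
	assert(ms >= 0);
	return ms / 1000;
}
```

Under this model the old 21-second limit (21000 ms) passes through
unchanged, while a request for ten minutes is cut back to five.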

[ paulmck: Apply feedback from Hillf Danton. ]
[ paulmck: Apply feedback from Geert Uytterhoeven. ]

Reported-by: Dave Chinner 
Reported-by: Dmitry Vyukov 
Tested-by: Dmitry Vyukov 
Signed-off-by: Paul E. McKenney

rcu: Align the output of RCU CPU stall warning messages

2023-01-05T20:21:11+00:00

Time stamps are added to the output in kernels built with
CONFIG_PRINTK_TIME=y, which causes misaligned output.  Therefore,
replace pr_cont() with pr_err(), which fixes alignment and gets
rid of a couple of despised pr_cont() calls.

Before:
[   37.567343] rcu: INFO: rcu_preempt self-detected stall on CPU
[   37.567839] rcu:     0-....: (1500 ticks this GP) idle=***
[   37.568270]  (t=1501 jiffies g=4717 q=28 ncpus=4)
[   37.568668] CPU: 0 PID: 313 Comm: test0 Not tainted 6.1.0-rc4 #8

After:
[   36.762074] rcu: INFO: rcu_preempt self-detected stall on CPU
[   36.762543] rcu:     0-....: (1499 ticks this GP) idle=***
[   36.763003] rcu:     (t=1500 jiffies g=5097 q=27 ncpus=4)
[   36.763522] CPU: 0 PID: 313 Comm: test0 Not tainted 6.1.0-rc4 #9

Signed-off-by: Zhen Lei 
Reviewed-by: Frederic Weisbecker 
Signed-off-by: Paul E. McKenney

rcu: Add RCU stall diagnosis information

2023-01-05T20:21:11+00:00

Because RCU CPU stall warnings are driven from the scheduling-clock
interrupt handler, a workload consisting of a very large number of
short-duration hardware interrupts can result in misleading stall-warning
messages.  On systems supporting only a single level of interrupts,
that is, where interrupt handlers cannot be interrupted, this can
produce misleading diagnostics.  The stack traces will show the
innocent-bystander interrupted task, not the interrupts that are
at the very least exacerbating the stall.

This situation can be improved by displaying the number of interrupts
and the CPU time that they have consumed.  Diagnosing other types
of stalls can be eased by also providing the count of softirqs and
the CPU time that they consumed as well as the number of context
switches and the task-level CPU time consumed.

Consider the following output given this change:

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu:     0-....: (1250 ticks this GP) 
rcu:          hardirqs   softirqs   csw/system
rcu:  number:      624         45            0
rcu: cputime:       69          1         2425   ==> 2500(ms)

This output shows that the number of hard and soft interrupts is small,
there are no context switches, and system time accounts for almost all
of the 2500 ms interval. This indicates that the current task is looping
with preemption disabled.
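
The reading of the "cputime:" row can be checked with a little
arithmetic.  The structure and helper below are hypothetical and exist
only to work through the numbers in the sample output.

```c
#include <assert.h>

/* Values taken from the "cputime:" row of the sample, in milliseconds. */
struct model_stall_cputime {
	unsigned int hardirq_ms;
	unsigned int softirq_ms;
	unsigned int system_ms;
	unsigned int elapsed_ms;  /* the "==> 2500(ms)" figure */
};

/* Integer percentage of the elapsed interval, rounded down. */
unsigned int model_share_pct(unsigned int part_ms, unsigned int elapsed_ms)
{
	assert(elapsed_ms != 0);
	return part_ms * 100u / elapsed_ms;
}
```

For the sample, system time is 2425 of 2500 ms, or 97%, while hardirq
and softirq time together come to 70 ms, under 3% of the interval.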

The impact on system performance is negligible because the snapshot is
recorded only once for all continuous RCU stalls.

This added debugging information is suppressed by default and can be
enabled by building the kernel with CONFIG_RCU_CPU_STALL_CPUTIME=y or
by booting with rcupdate.rcu_cpu_stall_cputime=1.

Signed-off-by: Zhen Lei 
Reviewed-by: Mukesh Ojha 
Reviewed-by: Frederic Weisbecker 
Signed-off-by: Paul E. McKenney