<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/irq/handle.c, branch v7.1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>genirq: Remove IRQ timing tracking infrastructure</title>
<updated>2025-12-15T21:20:50+00:00</updated>
<author>
<name>Marc Zyngier</name>
<email>maz@kernel.org</email>
</author>
<published>2025-12-10T08:22:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c119e6685311cef0e4a4e0b7752293bea056bac7'/>
<id>c119e6685311cef0e4a4e0b7752293bea056bac7</id>
<content type='text'>
The IRQ timing tracking infrastructure was merged in 2019, but was never
plumbed in, is not selectable, and is therefore never used.

As Daniel agrees that there is little hope for this infrastructure to be
completed in the near term, drop it altogether.

Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Jinjie Ruan &lt;ruanjinjie@huawei.com&gt;
Link: https://lore.kernel.org/r/87zf7vex6h.wl-maz@kernel.org
Link: https://patch.msgid.link/20251210082242.360936-2-maz@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The IRQ timing tracking infrastructure was merged in 2019, but was never
plumbed in, is not selectable, and is therefore never used.

As Daniel agrees that there is little hope for this infrastructure to be
completed in the near term, drop it altogether.

Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Jinjie Ruan &lt;ruanjinjie@huawei.com&gt;
Link: https://lore.kernel.org/r/87zf7vex6h.wl-maz@kernel.org
Link: https://patch.msgid.link/20251210082242.360936-2-maz@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>genirq: Prevent early spurious wake-ups of interrupt threads</title>
<updated>2025-11-22T08:26:18+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2025-11-21T14:34:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=68775ca79af3b8d4c147598983ece012d7007bac'/>
<id>68775ca79af3b8d4c147598983ece012d7007bac</id>
<content type='text'>
During initialization, the interrupt thread is created before the interrupt
is enabled. The interrupt enablement happens before the actual kthread wake
up point. Once the interrupt is enabled the hardware can raise an interrupt
and once setup_irq() drops the descriptor lock a interrupt wake-up can
happen.

Even when such an interrupt can be considered premature, this is not a
problem in general because at the point where the descriptor lock is
dropped and the wakeup can happen, the data which is used by the thread is
fully initialized.

Though from the perspective of least surprise, the initial wakeup really
should be performed by the setup code and not randomly by a premature
interrupt.

Prevent this by performing a wake-up only if the target is in state
TASK_INTERRUPTIBLE, which the thread uses in wait_for_interrupt().

If the thread is still in state TASK_UNINTERRUPTIBLE, the wake-up is not
lost because after the setup code completed the initial wake-up the thread
will observe the IRQTF_RUNTHREAD and proceed with the handling.

[ tglx: Simplified the changes and extended the changelog. ]

Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://patch.msgid.link/20251121143500.42111-2-frederic@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During initialization, the interrupt thread is created before the interrupt
is enabled. The interrupt enablement happens before the actual kthread wake
up point. Once the interrupt is enabled the hardware can raise an interrupt
and once setup_irq() drops the descriptor lock a interrupt wake-up can
happen.

Even when such an interrupt can be considered premature, this is not a
problem in general because at the point where the descriptor lock is
dropped and the wakeup can happen, the data which is used by the thread is
fully initialized.

Though from the perspective of least surprise, the initial wakeup really
should be performed by the setup code and not randomly by a premature
interrupt.

Prevent this by performing a wake-up only if the target is in state
TASK_INTERRUPTIBLE, which the thread uses in wait_for_interrupt().

If the thread is still in state TASK_UNINTERRUPTIBLE, the wake-up is not
lost because after the setup code completed the initial wake-up the thread
will observe the IRQTF_RUNTHREAD and proceed with the handling.

[ tglx: Simplified the changes and extended the changelog. ]

Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://patch.msgid.link/20251121143500.42111-2-frederic@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>genirq: Add support for warning on long-running interrupt handlers</title>
<updated>2025-09-03T14:10:40+00:00</updated>
<author>
<name>Wladislav Wiebe</name>
<email>wladislav.wiebe@nokia.com</email>
</author>
<published>2025-08-04T09:35:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=673f1244b3d47c9b41cda3473c062bec586387be'/>
<id>673f1244b3d47c9b41cda3473c062bec586387be</id>
<content type='text'>
Introduce a mechanism to detect and warn about prolonged interrupt handlers.
With a new command-line parameter (irqhandler.duration_warn_us=), users can
configure the duration threshold in microseconds when a warning in such
format should be emitted:

"[CPU14] long duration of IRQ[159:bad_irq_handler [long_irq]], took: 1330 us"

The implementation uses local_clock() to measure the execution duration of the
generic IRQ per-CPU event handler.

Signed-off-by: Wladislav Wiebe &lt;wladislav.wiebe@nokia.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Jiri Slaby &lt;jirislaby@kernel.org&gt;
Link: https://lore.kernel.org/all/20250804093525.851-1-wladislav.wiebe@nokia.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce a mechanism to detect and warn about prolonged interrupt handlers.
With a new command-line parameter (irqhandler.duration_warn_us=), users can
configure the duration threshold in microseconds when a warning in such
format should be emitted:

"[CPU14] long duration of IRQ[159:bad_irq_handler [long_irq]], took: 1330 us"

The implementation uses local_clock() to measure the execution duration of the
generic IRQ per-CPU event handler.

Signed-off-by: Wladislav Wiebe &lt;wladislav.wiebe@nokia.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Jiri Slaby &lt;jirislaby@kernel.org&gt;
Link: https://lore.kernel.org/all/20250804093525.851-1-wladislav.wiebe@nokia.com

</pre>
</div>
</content>
</entry>
<entry>
<title>irq: remove unused flags argument from __handle_irq_event_percpu()</title>
<updated>2022-01-06T23:25:25+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2021-12-07T12:17:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5320eb42dec7a7ef3ab7da3c5c0d7f889a5181e5'/>
<id>5320eb42dec7a7ef3ab7da3c5c0d7f889a5181e5</id>
<content type='text'>
The __IRQF_TIMER bit from the flags argument was used in
add_interrupt_randomness() to distinguish the timer interrupt from other
interrupts. This is no longer the case.

Remove the flags argument from __handle_irq_event_percpu().

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The __IRQF_TIMER bit from the flags argument was used in
add_interrupt_randomness() to distinguish the timer interrupt from other
interrupts. This is no longer the case.

Remove the flags argument from __handle_irq_event_percpu().

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>random: remove unused irq_flags argument from add_interrupt_randomness()</title>
<updated>2022-01-06T23:25:25+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2021-12-07T12:17:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=703f7066f40599c290babdb79dd61319264987e9'/>
<id>703f7066f40599c290babdb79dd61319264987e9</id>
<content type='text'>
Since commit
   ee3e00e9e7101 ("random: use registers from interrupted code for CPU's w/o a cycle counter")

the irq_flags argument is no longer used.

Remove unused irq_flags.

Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dexuan Cui &lt;decui@microsoft.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: K. Y. Srinivasan &lt;kys@microsoft.com&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Wei Liu &lt;wei.liu@kernel.org&gt;
Cc: linux-hyperv@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Acked-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since commit
   ee3e00e9e7101 ("random: use registers from interrupted code for CPU's w/o a cycle counter")

the irq_flags argument is no longer used.

Remove unused irq_flags.

Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dexuan Cui &lt;decui@microsoft.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: K. Y. Srinivasan &lt;kys@microsoft.com&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Wei Liu &lt;wei.liu@kernel.org&gt;
Cc: linux-hyperv@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Acked-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>irq: add generic_handle_arch_irq()</title>
<updated>2021-10-25T09:05:27+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2021-10-20T10:24:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a1b09501971435ef213251891753afb0d7f3d27a'/>
<id>a1b09501971435ef213251891753afb0d7f3d27a</id>
<content type='text'>
Several architectures select GENERIC_IRQ_MULTI_HANDLER and branch to
handle_arch_irq() without performing any entry accounting.

Add a generic wrapper to handle the common irqentry work when invoking
handle_arch_irq(). Where an architecture needs to perform some entry
accounting itself, it will need to invoke handle_arch_irq() itself.

In subsequent patches it will become the responsibilty of the entry code
to set the irq regs when entering an IRQ (rather than deferring this to
an irqchip handler), so generic_handle_arch_irq() is made to set the irq
regs now. This can be redundant in some cases, but is never harmful as
saving/restoring the old regs nests safely.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Reviewed-by: Marc Zyngier &lt;maz@kernel.org&gt;
Reviewed-by: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Several architectures select GENERIC_IRQ_MULTI_HANDLER and branch to
handle_arch_irq() without performing any entry accounting.

Add a generic wrapper to handle the common irqentry work when invoking
handle_arch_irq(). Where an architecture needs to perform some entry
accounting itself, it will need to invoke handle_arch_irq() itself.

In subsequent patches it will become the responsibilty of the entry code
to set the irq regs when entering an IRQ (rather than deferring this to
an irqchip handler), so generic_handle_arch_irq() is made to set the irq
regs now. This can be redundant in some cases, but is never harmful as
saving/restoring the old regs nests safely.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Reviewed-by: Marc Zyngier &lt;maz@kernel.org&gt;
Reviewed-by: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>genirq: Add a IRQF_NO_DEBUG flag</title>
<updated>2021-05-17T18:01:35+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-04-02T06:23:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c2b1063e8feb2115537addce10f36c0c82d11d9b'/>
<id>c2b1063e8feb2115537addce10f36c0c82d11d9b</id>
<content type='text'>
The whole call to note_interrupt() can be avoided or return early when
interrupts would be marked accordingly. For IPI handlers which always
return HANDLED the whole procedure is pretty pointless to begin with.

Add a IRQF_NO_DEBUG flag and mark the interrupt accordingly if supplied
when the interrupt is requested.

When noirqdebug is set on the kernel commandline, then the interrupt is
marked unconditionally so that there is only one condition in the hotpath
to evaluate.

 [ clg: Add changelog ]

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Cédric Le Goater &lt;clg@kaod.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/r/7a8ad02f-63a8-c1aa-fdd1-39d973593d02@kaod.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The whole call to note_interrupt() can be avoided or return early when
interrupts would be marked accordingly. For IPI handlers which always
return HANDLED the whole procedure is pretty pointless to begin with.

Add a IRQF_NO_DEBUG flag and mark the interrupt accordingly if supplied
when the interrupt is requested.

When noirqdebug is set on the kernel commandline, then the interrupt is
marked unconditionally so that there is only one condition in the hotpath
to evaluate.

 [ clg: Add changelog ]

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Cédric Le Goater &lt;clg@kaod.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/r/7a8ad02f-63a8-c1aa-fdd1-39d973593d02@kaod.org
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: Use fallthrough pseudo-keyword</title>
<updated>2020-08-23T22:36:59+00:00</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavoars@kernel.org</email>
</author>
<published>2020-08-23T22:36:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=df561f6688fef775baa341a0f5d960becd248b11'/>
<id>df561f6688fef775baa341a0f5d960becd248b11</id>
<content type='text'>
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2020-03-31T02:14:28+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-03-31T02:14:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d5f744f9a2ac9ca6d5baf72e97ce6dc4c2f19fe4'/>
<id>d5f744f9a2ac9ca6d5baf72e97ce6dc4c2f19fe4</id>
<content type='text'>
Pull x86 entry code updates from Thomas Gleixner:

 - Convert the 32bit syscalls to be pt_regs based which removes the
   requirement to push all 6 potential arguments onto the stack and
   consolidates the interface with the 64bit variant

 - The first small portion of the exception and syscall related entry
   code consolidation which aims to address the recently discovered
   issues vs. RCU, int3, NMI and some other exceptions which can
   interrupt any context. The bulk of the changes is still work in
   progress and aimed for 5.8.

 - A few lockdep namespace cleanups which have been applied into this
   branch to keep the prerequisites for the ongoing work confined.

* tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
  x86/entry: Fix build error x86 with !CONFIG_POSIX_TIMERS
  lockdep: Rename trace_{hard,soft}{irq_context,irqs_enabled}()
  lockdep: Rename trace_softirqs_{on,off}()
  lockdep: Rename trace_hardirq_{enter,exit}()
  x86/entry: Rename ___preempt_schedule
  x86: Remove unneeded includes
  x86/entry: Drop asmlinkage from syscalls
  x86/entry/32: Enable pt_regs based syscalls
  x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments
  x86/entry/32: Rename 32-bit specific syscalls
  x86/entry/32: Clean up syscall_32.tbl
  x86/entry: Remove ABI prefixes from functions in syscall tables
  x86/entry/64: Add __SYSCALL_COMMON()
  x86/entry: Remove syscall qualifier support
  x86/entry/64: Remove ptregs qualifier from syscall table
  x86/entry: Move max syscall number calculation to syscallhdr.sh
  x86/entry/64: Split X32 syscall table into its own file
  x86/entry/64: Move sys_ni_syscall stub to common.c
  x86/entry/64: Use syscall wrappers for x32_rt_sigreturn
  x86/entry: Refactor SYS_NI macros
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull x86 entry code updates from Thomas Gleixner:

 - Convert the 32bit syscalls to be pt_regs based which removes the
   requirement to push all 6 potential arguments onto the stack and
   consolidates the interface with the 64bit variant

 - The first small portion of the exception and syscall related entry
   code consolidation which aims to address the recently discovered
   issues vs. RCU, int3, NMI and some other exceptions which can
   interrupt any context. The bulk of the changes is still work in
   progress and aimed for 5.8.

 - A few lockdep namespace cleanups which have been applied into this
   branch to keep the prerequisites for the ongoing work confined.

* tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
  x86/entry: Fix build error x86 with !CONFIG_POSIX_TIMERS
  lockdep: Rename trace_{hard,soft}{irq_context,irqs_enabled}()
  lockdep: Rename trace_softirqs_{on,off}()
  lockdep: Rename trace_hardirq_{enter,exit}()
  x86/entry: Rename ___preempt_schedule
  x86: Remove unneeded includes
  x86/entry: Drop asmlinkage from syscalls
  x86/entry/32: Enable pt_regs based syscalls
  x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments
  x86/entry/32: Rename 32-bit specific syscalls
  x86/entry/32: Clean up syscall_32.tbl
  x86/entry: Remove ABI prefixes from functions in syscall tables
  x86/entry/64: Add __SYSCALL_COMMON()
  x86/entry: Remove syscall qualifier support
  x86/entry/64: Remove ptregs qualifier from syscall table
  x86/entry: Move max syscall number calculation to syscallhdr.sh
  x86/entry/64: Split X32 syscall table into its own file
  x86/entry/64: Move sys_ni_syscall stub to common.c
  x86/entry/64: Use syscall wrappers for x32_rt_sigreturn
  x86/entry: Refactor SYS_NI macros
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>lockdep: Introduce wait-type checks</title>
<updated>2020-03-21T15:00:24+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-03-21T11:26:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=de8f5e4f2dc1f032b46afda0a78cab5456974f89'/>
<id>de8f5e4f2dc1f032b46afda0a78cab5456974f89</id>
<content type='text'>
Extend lockdep to validate lock wait-type context.

The current wait-types are:

	LD_WAIT_FREE,		/* wait free, rcu etc.. */
	LD_WAIT_SPIN,		/* spin loops, raw_spinlock_t etc.. */
	LD_WAIT_CONFIG,		/* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
	LD_WAIT_SLEEP,		/* sleeping locks, mutex_t etc.. */

Where lockdep validates that the current lock (the one being acquired)
fits in the current wait-context (as generated by the held stack).

This ensures that there is no attempt to acquire mutexes while holding
spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In
other words, its a more fancy might_sleep().

Obviously RCU made the entire ordeal more complex than a simple single
value test because RCU can be acquired in (pretty much) any context and
while it presents a context to nested locks it is not the same as it
got acquired in.

Therefore its necessary to split the wait_type into two values, one
representing the acquire (outer) and one representing the nested context
(inner). For most 'normal' locks these two are the same.

[ To make static initialization easier we have the rule that:
  .outer == INV means .outer == .inner; because INV == 0. ]

It further means that its required to find the minimal .inner of the held
stack to compare against the outer of the new lock; because while 'normal'
RCU presents a CONFIG type to nested locks, if it is taken while already
holding a SPIN type it obviously doesn't relax the rules.

Below is an example output generated by the trivial test code:

  raw_spin_lock(&amp;foo);
  spin_lock(&amp;bar);
  spin_unlock(&amp;bar);
  raw_spin_unlock(&amp;foo);

 [ BUG: Invalid wait context ]
 -----------------------------
 swapper/0/1 is trying to lock:
 ffffc90000013f20 (&amp;bar){....}-{3:3}, at: kernel_init+0xdb/0x187
 other info that might help us debug this:
 1 lock held by swapper/0/1:
  #0: ffffc90000013ee0 (&amp;foo){+.+.}-{2:2}, at: kernel_init+0xd1/0x187

The way to read it is to look at the new -{n,m} part in the lock
description; -{3:3} for the attempted lock, and try and match that up to
the held locks, which in this case is the one: -{2,2}.

This tells that the acquiring lock requires a more relaxed environment than
presented by the lock stack.

Currently only the normal locks and RCU are converted, the rest of the
lockdep users defaults to .inner = INV which is ignored. More conversions
can be done when desired.

The check for spinlock_t nesting is not enabled by default. It's a separate
config option for now as there are known problems which are currently
addressed. The config option allows to identify these problems and to
verify that the solutions found are indeed solving them.

The config switch will be removed and the checks will permanently enabled
once the vast majority of issues has been addressed.

[ bigeasy: Move LD_WAIT_FREE,… out of CONFIG_LOCKDEP to avoid compile
	   failure with CONFIG_DEBUG_SPINLOCK + !CONFIG_LOCKDEP]
[ tglx: Add the config option ]

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20200321113242.427089655@linutronix.de
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Extend lockdep to validate lock wait-type context.

The current wait-types are:

	LD_WAIT_FREE,		/* wait free, rcu etc.. */
	LD_WAIT_SPIN,		/* spin loops, raw_spinlock_t etc.. */
	LD_WAIT_CONFIG,		/* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
	LD_WAIT_SLEEP,		/* sleeping locks, mutex_t etc.. */

Where lockdep validates that the current lock (the one being acquired)
fits in the current wait-context (as generated by the held stack).

This ensures that there is no attempt to acquire mutexes while holding
spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In
other words, its a more fancy might_sleep().

Obviously RCU made the entire ordeal more complex than a simple single
value test because RCU can be acquired in (pretty much) any context and
while it presents a context to nested locks it is not the same as it
got acquired in.

Therefore its necessary to split the wait_type into two values, one
representing the acquire (outer) and one representing the nested context
(inner). For most 'normal' locks these two are the same.

[ To make static initialization easier we have the rule that:
  .outer == INV means .outer == .inner; because INV == 0. ]

It further means that its required to find the minimal .inner of the held
stack to compare against the outer of the new lock; because while 'normal'
RCU presents a CONFIG type to nested locks, if it is taken while already
holding a SPIN type it obviously doesn't relax the rules.

Below is an example output generated by the trivial test code:

  raw_spin_lock(&amp;foo);
  spin_lock(&amp;bar);
  spin_unlock(&amp;bar);
  raw_spin_unlock(&amp;foo);

 [ BUG: Invalid wait context ]
 -----------------------------
 swapper/0/1 is trying to lock:
 ffffc90000013f20 (&amp;bar){....}-{3:3}, at: kernel_init+0xdb/0x187
 other info that might help us debug this:
 1 lock held by swapper/0/1:
  #0: ffffc90000013ee0 (&amp;foo){+.+.}-{2:2}, at: kernel_init+0xd1/0x187

The way to read it is to look at the new -{n,m} part in the lock
description; -{3:3} for the attempted lock, and try and match that up to
the held locks, which in this case is the one: -{2,2}.

This tells that the acquiring lock requires a more relaxed environment than
presented by the lock stack.

Currently only the normal locks and RCU are converted, the rest of the
lockdep users defaults to .inner = INV which is ignored. More conversions
can be done when desired.

The check for spinlock_t nesting is not enabled by default. It's a separate
config option for now as there are known problems which are currently
addressed. The config option allows to identify these problems and to
verify that the solutions found are indeed solving them.

The config switch will be removed and the checks will permanently enabled
once the vast majority of issues has been addressed.

[ bigeasy: Move LD_WAIT_FREE,… out of CONFIG_LOCKDEP to avoid compile
	   failure with CONFIG_DEBUG_SPINLOCK + !CONFIG_LOCKDEP]
[ tglx: Add the config option ]

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20200321113242.427089655@linutronix.de
</pre>
</div>
</content>
</entry>
</feed>
