linux.git/kernel/locking/rtmutex.c, branch v7.2-rc1

Merge tag 'locking-core-2026-06-14' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip

2026-06-15T08:51:14+00:00

Pull locking updates from Ingo Molnar:
 "Futex updates:

   - Optimize futex hash bucket access patterns (Peter Zijlstra)

   - Large series to address the robust futex unlock race for real, by
     Thomas Gleixner:

      "The robust futex unlock mechanism is racy in respect to the
       clearing of the robust_list_head::list_op_pending pointer because
       unlock and clearing the pointer are not atomic.

       The race window is between the unlock and clearing the pending op
       pointer. If the task is forced to exit in this window, exit will
       access a potentially invalid pending op pointer when cleaning up
       the robust list.

       That happens if another task manages to unmap the object
       containing the lock before the cleanup, which results in an UAF.

       In the worst case this UAF can lead to memory corruption when
       unrelated content has been mapped to the same address by the time
       the access happens.

       User space can't solve this problem without help from the kernel.
       This series provides the kernel side infrastructure to help it
       along:

        1) Combined unlock, pointer clearing, wake-up for the
           contended case

        2) VDSO based unlock and pointer clearing helpers with a
           fix-up function in the kernel when user space was interrupted
           within the critical section.

      ... with help by André Almeida:

        - Add a note about robust list race condition (André Almeida)
        - Add self-tests for robust release operations (André Almeida)

  Context analysis updates:

   - Implement context analysis for 'struct rt_mutex'. (Bart Van Assche)
   - Bump required Clang version to 23 (Marco Elver)

  Guard infrastructure updates:

   - Series to remove NULL check from unconditional guards (Dmitry
     Ilvokhin)

  Lockdep updates:

   - Restore self-test migrate_disable() and sched_rt_mutex state on
     PREEMPT_RT (Karl Mehltretter)

  Membarriers updates:

   - Use per-CPU mutexes for targeted commands (Aniket Gattani)
   - Modernize membarrier_global_expedited with cleanup guards (Aniket
     Gattani)
   - Add rseq stress test for CFS throttle interactions (Aniket Gattani)

  percpu-rwsems updates:

   - Extract __percpu_up_read() to optimize inlining overhead (Dmitry
     Ilvokhin)

  Seqlocks updates:

   - Allow UBSAN_ALIGNMENT to fail optimizing (Heiko Carstens)

  Lock tracing:

   - Add contended_release tracepoint to sleepable locks such as
     mutexes, percpu-rwsems, rtmutexes, rwsems and semaphores (Dmitry
     Ilvokhin)

  MAINTAINERS updates:

   - MAINTAINERS: Add RUST [SYNC] entry (Boqun Feng)

  Misc updates and fixes by Randy Dunlap, YE WEI-HONG, Fabricio Parra,
  Dmitry Ilvokhin and Peter Zijlstra"

* tag 'locking-core-2026-06-14' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: (36 commits)
  locking: Add contended_release tracepoint to sleepable locks
  locking/percpu-rwsem: Extract __percpu_up_read()
  tracing/lock: Remove unnecessary linux/sched.h include
  futex: Optimize futex hash bucket access patterns
  rust: sync: completion: Mark inline complete_all and wait_for_completion
  MAINTAINERS: Add RUST [SYNC] entry
  cleanup: Specify nonnull argument index
  selftests: futex: Add tests for robust release operations
  Documentation: futex: Add a note about robust list race condition
  x86/vdso: Implement __vdso_futex_robust_try_unlock()
  x86/vdso: Prepare for robust futex unlock support
  futex: Provide infrastructure to plug the non contended robust futex unlock race
  futex: Add robust futex unlock IP range
  futex: Add support for unlocking robust futexes
  futex: Cleanup UAPI defines
  x86: Select ARCH_MEMORY_ORDER_TSO
  uaccess: Provide unsafe_atomic_store_release_user()
  futex: Provide UABI defines for robust list entry modifiers
  futex: Move futex related mm_struct data into a struct
  futex: Make futex_mm_init() void
  ...

locking: Add contended_release tracepoint to sleepable locks

2026-06-11T11:41:25+00:00

Add the contended_release trace event. This tracepoint fires on the
holder side when a contended lock is released, complementing the
existing contention_begin/contention_end tracepoints which fire on the
waiter side.

This enables correlating lock hold time under contention with waiter
events by lock address.

Add trace_contended_release()/trace_call__contended_release() calls to
the slowpath unlock paths of sleepable locks: mutex, rtmutex, semaphore,
rwsem, percpu-rwsem, and RT-specific rwbase locks.

Where possible, trace_contended_release() fires before the lock is
released and before the waiter is woken. For some lock types, the
tracepoint fires after the release but before the wake. Making the
placement consistent across all lock types is not worth the added
complexity.

For reader/writer locks, the tracepoint fires for every reader releasing
while a writer is waiting, not only for the last reader.

Signed-off-by: Dmitry Ilvokhin 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Paul E. McKenney 
Acked-by: Usama Arif 
Link: https://patch.msgid.link/02f4f6c5ce6761e7f6587cf0ff2289d962ecddd4.1780506267.git.d@ilvokhin.com

locking/rtmutex: Skip remove_waiter() when waiter is not enqueued

2026-06-03T20:11:53+00:00

syzbot triggered the following splat in remove_waiter() via
FUTEX_CMP_REQUEUE_PI:

  KASAN: null-ptr-deref in range [0x0000000000000a88-0x0000000000000a8f]
   class_raw_spinlock_constructor
   remove_waiter+0x159/0x1200 kernel/locking/rtmutex.c:1561
   rt_mutex_start_proxy_lock+0x103/0x120
   futex_requeue+0x10e4/0x20d0
   __x64_sys_futex+0x34f/0x4d0

task_blocks_on_rt_mutex() does not arm the waiter upon deadlock detection,
leaving waiter->task nil, where 3bfdc63936dd ("rtmutex: Use waiter::task instead
of current in remove_waiter()") made this fatal.

Furthermore, rt_mutex_start_proxy_lock() should not be calling into remove_waiter()
upon a successfully grabbing the rtmutex. 1a1fb985f2e2 ("futex: Handle early deadlock
return correctly"), moved the remove_waiter() out of __rt_mutex_start_proxy_lock()
(where 'ret' was only ever 0 or < 0) into the wrapper. Tighten this check to
account for try_to_take_rt_mutex().

Fixes: 3bfdc63936dd ("rtmutex: Use waiter::task instead of current in remove_waiter()")
Reported-by: syzbot+78147abe6c524f183ee9@syzkaller.appspotmail.com
Signed-off-by: Davidlohr Bueso 
Signed-off-by: Thomas Gleixner 
Cc: stable@vger.kernel.org
Closes: https://lore.kernel.org/all/69f114ac.050a0220.ac8b.0003.GAE@google.com/
Link: https://patch.msgid.link/20260507112913.1019537-1-dave@stgolabs.net

locking/rtmutex: Annotate API and implementation

2026-05-19T11:49:01+00:00

Enable context analysis for struct rt_mutex and annotate all functions that
accept a struct rt_mutex pointer. In the __rt_mutex_lock_common() callers,
instead of adding the __no_context_analysis annotation, emit a runtime
warning if the __rt_mutex_lock_common() return value is not zero and add an
__acquire() statement.

Signed-off-by: Bart Van Assche 
Signed-off-by: Peter Zijlstra (Intel) 
Link: https://patch.msgid.link/20260508174520.1416285-1-bvanassche@acm.org

rtmutex: Use waiter::task instead of current in remove_waiter()

2026-04-20T22:22:31+00:00

remove_waiter() is used by the slowlock paths, but it is also used for
proxy-lock rollback in rt_mutex_start_proxy_lock() when invoked from
futex_requeue().

In the latter case waiter::task is not current, but remove_waiter()
operates on current for the dequeue operation. That results in several
problems:

  1) the rbtree dequeue happens without waiter::task::pi_lock being held

  2) the waiter task's pi_blocked_on state is not cleared, which leaves a
     dangling pointer primed for UAF around.

  3) rt_mutex_adjust_prio_chain() operates on the wrong top priority waiter
     task

Use waiter::task instead of current in all related operations in
remove_waiter() to cure those problems.

[ tglx: Fixup rt_mutex_adjust_prio_chain(), add a comment and amend the
  	changelog ]

Fixes: 8161239a8bcc ("rtmutex: Simplify PI algorithm and make highest prio task get lock")
Reported-by: Yuan Tan 
Reported-by: Yifan Wu 
Reported-by: Juefei Pu 
Reported-by: Xin Liu 
Signed-off-by: Keenan Dong 
Signed-off-by: Thomas Gleixner 
Cc: stable@vger.kernel.org

locking/rtmutex: Add context analysis

2026-03-08T10:06:53+00:00

Add compiler context analysis annotations.

Signed-off-by: Peter Zijlstra (Intel) 
Link: https://patch.msgid.link/20260121111213.851599178@infradead.org

locking/lock_events: Add locking events for rtmutex slow paths

2025-03-07T23:55:03+00:00

Add locking events for rtlock_slowlock() and rt_mutex_slowlock() for
profiling the slow path behavior of rt_spin_lock() and rt_mutex_lock().

Signed-off-by: Waiman Long 
Signed-off-by: Boqun Feng 
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/r/20250307232717.1759087-4-boqun.feng@gmail.com

sched/wake_q: Add helper to call wake_up_q after unlock with preemption disabled

2024-12-20T14:31:21+00:00

A common pattern seen when wake_qs are used to defer a wakeup
until after a lock is released is something like:
  preempt_disable();
  raw_spin_unlock(lock);
  wake_up_q(wake_q);
  preempt_enable();

So create some raw_spin_unlock*_wake() helper functions to clean
this up.

Applies on top of the fix I submitted here:
 https://lore.kernel.org/lkml/20241212222138.2400498-1-jstultz@google.com/

NOTE: I recognise the unlock()/unlock_irq()/unlock_irqrestore()
variants creates its own duplication, which we could use a macro
to generate the similar functions, but I often dislike how those
generation macros making finding the actual implementation
harder, so I left the three functions as is. If folks would
prefer otherwise, let me know and I'll switch it.

Suggested-by: Peter Zijlstra 
Signed-off-by: John Stultz 
Signed-off-by: Peter Zijlstra (Intel) 
Link: https://lkml.kernel.org/r/20241217040803.243420-1-jstultz@google.com

locking/rtmutex: Make sure we wake anything on the wake_q when we release the lock->wait_lock

2024-12-17T16:47:24+00:00

Bert reported seeing occasional boot hangs when running with
PREEPT_RT and bisected it down to commit 894d1b3db41c
("locking/mutex: Remove wakeups from under mutex::wait_lock").

It looks like I missed a few spots where we drop the wait_lock and
potentially call into schedule without waking up the tasks on the
wake_q structure. Since the tasks being woken are ww_mutex tasks
they need to be able to run to release the mutex and unblock the
task that currently is planning to wake them. Thus we can deadlock.

So make sure we wake the wake_q tasks when we unlock the wait_lock.

Closes: https://lore.kernel.org/lkml/20241211182502.2915-1-spasswolf@web.de
Fixes: 894d1b3db41c ("locking/mutex: Remove wakeups from under mutex::wait_lock")
Reported-by: Bert Karwatzki 
Signed-off-by: John Stultz 
Signed-off-by: Peter Zijlstra (Intel) 
Link: https://lkml.kernel.org/r/20241212222138.2400498-1-jstultz@google.com

locking: rtmutex: Fix wake_q logic in task_blocks_on_rt_mutex

2024-12-02T11:01:29+00:00

Anders had bisected a crash using PREEMPT_RT with linux-next and
isolated it down to commit 894d1b3db41c ("locking/mutex: Remove
wakeups from under mutex::wait_lock"), where it seemed the
wake_q structure was somehow getting corrupted causing a null
pointer traversal.

I was able to easily repoduce this with PREEMPT_RT and managed
to isolate down that through various call stacks we were
actually calling wake_up_q() twice on the same wake_q.

I found that in the problematic commit, I had added the
wake_up_q() call in task_blocks_on_rt_mutex() around
__ww_mutex_add_waiter(), following a similar pattern in
__mutex_lock_common().

However, its just wrong. We haven't dropped the lock->wait_lock,
so its contrary to the point of the original patch. And it
didn't match the __mutex_lock_common() logic of re-initializing
the wake_q after calling it midway in the stack.

Looking at it now, the wake_up_q() call is incorrect and should
just be removed. So drop the erronious logic I had added.

Fixes: 894d1b3db41c ("locking/mutex: Remove wakeups from under mutex::wait_lock")
Closes: https://lore.kernel.org/lkml/6afb936f-17c7-43fa-90e0-b9e780866097@app.fastmail.com/
Reported-by: Anders Roxell 
Reported-by: Arnd Bergmann 
Signed-off-by: John Stultz 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Juri Lelli 
Tested-by: Anders Roxell 
Tested-by: K Prateek Nayak 
Link: https://lore.kernel.org/r/20241114190051.552665-1-jstultz@google.com