linux-stable.git/include/linux, branch v7.0.13

cfi: Include uaccess.h for get_kernel_nofault()

2026-06-19T11:47:58+00:00

commit 979c294509f9248fe1e7c358d582fb37dd5ca12d upstream.

After commit 0652a3daa787 ("tracing: Fix CFI violation in probestub
being called by tprobes"), there are many build errors when building
ARCH=arm multi_v7_defconfig + CONFIG_CFI=y like:

  In file included from drivers/base/devres.c:17:
  In file included from drivers/base/trace.h:16:
  In file included from include/linux/tracepoint.h:23:
  include/linux/cfi.h:44:6: error: call to undeclared function 'get_kernel_nofault'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     44 |         if (get_kernel_nofault(hash, func - cfi_get_offset()))
        |             ^
  1 error generated.

get_kernel_nofault() is called in the generic version of
cfi_get_func_hash() but nothing ensures uaccess.h is always included for
a proper expansion and prototype.  Include uaccess.h in cfi.h to clear
up the errors.

Cc: stable@vger.kernel.org
Fixes: 0652a3daa787 ("tracing: Fix CFI violation in probestub being called by tprobes")
Signed-off-by: Nathan Chancellor 
Acked-by: Masami Hiramatsu (Google) 
Reviewed-by: Sami Tolvanen 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

tracing: Fix CFI violation in probestub being called by tprobes

2026-06-19T11:47:57+00:00

commit 0652a3daa78723f955b1ebeb621665ce72bec53e upstream.

The probestub is a function to allow tprobes to hook to a tracepoint to
gain access to its parameters. The function itself is only referenced by
the tracepoint structure which lives in the __tracepoint section. objtool
explicitly ignores that section and when processing functions in the
kernel, if it detects one that has no references it will seal it to have
its ENDBR stripped on boot up.

This means when a tprobe is attached to the sched_wakeup tracepoint, when it
is triggered it will call __probestub_sched_wakeup and due to the missing
ENDBR on a CFI-enabled machine it will take a #CP exception.

Fix this by adding CFI_NOSEAL annotation to probestub declaration.

Cc: stable@vger.kernel.org
Acked-by: Masami Hiramatsu (Google) 
Link: https://patch.msgid.link/20260603153147.573589-1-eva.kurchatova@virtuozzo.com
Fixes: d5173f753750 ("objtool: Exclude __tracepoints data from ENDBR checks")
Signed-off-by: Eva Kurchatova 
[ Updated change log ]
Signed-off-by: Steven Rostedt 
Signed-off-by: Greg Kroah-Hartman

mm/memory-failure: fix hugetlb_lock AA deadlock in get_huge_page_for_hwpoison

2026-06-19T11:47:56+00:00

commit 3c2d42b8ee345b17a4ba56b0f6492d1ff4c1178e upstream.

Two concurrent madvise(MADV_HWPOISON) calls on the same hugetlb page can
trigger a recursive spinlock self-deadlock (AA deadlock) on hugetlb_lock
when racing with a concurrent unmap:

  thread#0                              thread#1
  --------                              --------
  madvise(folio, MADV_HWPOISON)
    -> poisons the folio successfully
  madvise(folio, MADV_HWPOISON)         unmap(folio)
    try_memory_failure_hugetlb
      get_huge_page_for_hwpoison
        spin_lock_irq(&hugetlb_lock)    <- held
        __get_huge_page_for_hwpoison
          hugetlb_update_hwpoison()
            -> MF_HUGETLB_FOLIO_PRE_POISONED
          goto out:
            folio_put()
              refcount: 1 -> 0
              free_huge_folio()
                spin_lock_irqsave(&hugetlb_lock)
                  -> AA DEADLOCK!

The out: path in __get_huge_page_for_hwpoison() calls folio_put() to drop
the GUP reference while the hugetlb_lock is still held by the hugetlb.c
wrapper get_huge_page_for_hwpoison().  If concurrent unmap has released
the page table mapping reference, folio_put() drops the folio refcount to
zero, triggering free_huge_folio() which attempts to re-acquire the
non-recursive hugetlb_lock.

Fix this by moving hugetlb_lock acquisition from the hugetlb.c wrapper
into get_huge_page_for_hwpoison().  Place spin_unlock_irq() before the
folio_put() at the out: label so the folio is always released outside the
lock.

[akpm@linux-foundation.org: fix race, rename label per Miaohe]
  Link: https://sashiko.dev/#/patchset/20260522010305.4099834-1-mawupeng1@huawei.com
  Link: https://lore.kernel.org/f39f405e-4b4b-8f79-70fe-a2b5b62114eb@huawei.com
Link: https://lore.kernel.org/20260522010305.4099834-1-mawupeng1@huawei.com
Fixes: 405ce051236c ("mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()")
Signed-off-by: Wupeng Ma 
Acked-by: Oscar Salvador (SUSE) 
Acked-by: Muchun Song 
Reviewed-by: Kefeng Wang 
Acked-by: Miaohe Lin 
Cc: David Hildenbrand 
Cc: Liam Howlett 
Cc: Lorenzo Stoakes 
Cc: Michal Hocko 
Cc: Mike Rapoport 
Cc: Naoya Horiguchi 
Cc: Suren Baghdasaryan 
Cc: Vlastimil Babka 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Greg Kroah-Hartman

net/mlx5: Fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list

2026-06-19T11:47:51+00:00

[ Upstream commit 894e036a24a26a6dd7b17d8d3fb5c53ab48a6074 ]

mlx5_query_nic_vport_mac_list() sizes its firmware command buffer using
the PF's log_max_current_uc/mc_list capabilities. When querying a VF
vport with a larger configured max (via devlink), the firmware response
can overflow this buffer:

 BUG: KASAN: slab-out-of-bounds in mlx5_query_nic_vport_mac_list+0x453/0x4c0 [mlx5_core]
 Read of size 4 at addr ff1100013ffc8a12 by task kworker/u96:2/385

 CPU: 12 UID: 0 PID: 385 Comm: kworker/u96:2 Not tainted 7.0.0-rc6+ #1 PREEMPT
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)
 Workqueue: mlx5_esw_wq esw_vport_change_handler [mlx5_core]
 Call Trace:
  
  dump_stack_lvl+0x69/0xa0
  print_report+0x176/0x4e4
  kasan_report+0xc8/0x100
  mlx5_query_nic_vport_mac_list+0x453/0x4c0 [mlx5_core]
  esw_update_vport_addr_list+0x2e3/0xda0 [mlx5_core]
  esw_vport_change_handle_locked+0xa1f/0x1060 [mlx5_core]
  esw_vport_change_handler+0x6a/0x90 [mlx5_core]
  process_one_work+0x87f/0x15e0
  worker_thread+0x62b/0x1020
  kthread+0x375/0x490
  ret_from_fork+0x4dc/0x810
  ret_from_fork_asm+0x11/0x20
  

Fix by querying the vport's own HCA caps to size the buffer correctly.
Refactor the function to allocate and return the MAC list internally,
removing the caller's dependency on knowing the correct max.

Fixes: e16aea2744ab ("net/mlx5: Introduce access functions to modify/query vport mac lists")
Signed-off-by: Dragos Tatulea 
Reviewed-by: Carolina Jubran 
Signed-off-by: Tariq Toukan 
Link: https://patch.msgid.link/20260604135849.458060-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski 
Signed-off-by: Sasha Levin

rseq: Fix using an uninitialized stack variable in rseq_exit_user_update()

2026-06-19T11:47:48+00:00

[ Upstream commit 6d99479799c69c3cb588fcda19c81d8f61d64ecd ]

There is an bug in which an uninitialized stack variable is used in
rseq_exit_user_update() as reported by syzbot:

BUG: KMSAN: kernel-infoleak in rseq_set_ids_get_csaddr include/linux/rseq_entry.h:502 [inline]

The local variable:

	struct rseq_ids ids = {
		.cpu_id	 = task_cpu(t),
		.mm_cid	 = task_mm_cid(t),
		.node_id = cpu_to_node(ids.cpu_id),
	};

According to the C standard, the evaluation order of expressions in an
initializer list is indeterminately sequenced. The compiler (Clang, in
this KMSAN build) evaluates `cpu_to_node(ids.cpu_id)` *before*
`ids.cpu_id` is initialized with `task_cpu(t)`.

This is fixed by moving the assignment of ids.node_id outside the
structure initialization.

Fixes: 82f572449cfe ("rseq: Implement read only ABI enforcement for optimized RSEQ V2 mode")
Closes: https://syzkaller.appspot.com/bug?extid=185a631927096f9da2fc
Reported-by: syzbot+185a631927096f9da2fc@syzkaller.appspotmail.com
Signed-off-by: Qing Wang 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Mark Rutland 
Link: https://patch.msgid.link/20260602030854.574038-1-wangqing7171@gmail.com
Signed-off-by: Sasha Levin

Drivers: hv: vmbus: Provide option to skip VMBus unload on panic

2026-06-19T11:47:47+00:00

[ Upstream commit c5c3ef8d49e15d2fc1cec4ad7c91d81b99977440 ]

Currently, VMBus code initiates a VMBus unload in the panic path so
that if a kdump kernel is loaded, it can start fresh in setting up its
own VMBus connection. However, a driver for the VMBus virtual frame
buffer may need to flush dirty portions of the frame buffer back to
the Hyper-V host so that panic information is visible in the graphics
console. To support such flushing, provide exported functions for the
frame buffer driver to specify that the VMBus unload should not be
done by the VMBus driver, and to initiate the VMBus unload itself.
Together these allow a frame buffer driver to delay the VMBus unload
until after it has completed the flush.

Ideally, the VMBus driver could use its own panic-path callback to do
the unload after all frame buffer drivers have finished. But DRM frame
buffer drivers use the kmsg dump callback, and there are no callbacks
after that in the panic path. Hence this somewhat messy approach to
properly sequencing the frame buffer flush and the VMBus unload.

Fixes: 3671f3777758 ("drm/hyperv: Add support for drm_panic")
Signed-off-by: Michael Kelley 
Reviewed-by: Long Li 
Signed-off-by: Wei Liu 
Signed-off-by: Sasha Levin

fwctl/bnxt_en: Refactor aux bus functions to be more generic

2026-06-19T11:47:47+00:00

[ Upstream commit 2c7c85c8c7881d57c5fa1114f4b0dbd7fc53a36f ]

Up until now there was only one auxiliary device that bnxt
created and that was for RoCE driver. bnxt fwctl is also
going to use an aux bus device that bnxt should create.
This requires some nomenclature changes and refactoring of
the existing bnxt aux dev functions.

Convert 'aux_priv' and 'edev' members of struct bnxt into
arrays where each element contains supported auxbus device's
data. Move struct bnxt_aux_priv from bnxt.h to ulp.h because
that is where it belongs. Make aux bus init/uninit/add/del
functions more generic which will loop through all the aux
device types. Make bnxt_ulp_start/stop functions (the only
other common functions applicable to any aux device) loop
through the aux devices to update their config and states.
Make callers of bnxt_ulp_start() call it only when there
are no errors.

Also, as an improvement in code, bnxt_register_dev() can skip
unnecessary dereferencing of edev from bp, instead use the
edev pointer from the function parameter.

Future patches will reuse these functions to add an aux bus
device for fwctl.

Link: https://patch.msgid.link/r/20260314151605.932749-3-pavan.chebbi@broadcom.com
Reviewed-by: Andy Gospodarek 
Reviewed-by: Leon Romanovsky 
Signed-off-by: Pavan Chebbi 
Signed-off-by: Jason Gunthorpe 
Stable-dep-of: b6197b386677 ("Reapply "bnxt_en: bring back rtnl_lock() in the bnxt_open() path"")
Signed-off-by: Sasha Levin

fwctl/bnxt_en: Move common definitions to include/linux/bnxt/

2026-06-19T11:47:47+00:00

[ Upstream commit 7be18a1fa00eab5283b35c13e26c6b76fcaab9ce ]

We have common definitions that are now going to be used
by more than one component outside of bnxt (bnxt_re and
fwctl)

Move bnxt_ulp.h to include/linux/bnxt/ as ulp.h.

Link: https://patch.msgid.link/r/20260314151605.932749-2-pavan.chebbi@broadcom.com
Reviewed-by: Andy Gospodarek 
Reviewed-by: Leon Romanovsky 
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Pavan Chebbi 
Signed-off-by: Jason Gunthorpe 
Stable-dep-of: b6197b386677 ("Reapply "bnxt_en: bring back rtnl_lock() in the bnxt_open() path"")
Signed-off-by: Sasha Levin

mailbox: Fix NULL message support in mbox_send_message()

2026-06-09T10:32:50+00:00

commit c58e9456e30c7098cbcd9f04571992be8a2e4e63 upstream.

The active_req field serves double duty as both the "is a TX in
flight" flag (NULL means idle) and the storage for the in-flight
message pointer. When a client sends NULL via mbox_send_message(),
active_req is set to NULL, which the framework misinterprets as
"no active request". This breaks the TX state machine by:

 - tx_tick() short-circuits on (!mssg), skipping the tx_done
   callback and the tx_complete completion
 - txdone_hrtimer() skips the channel entirely since active_req
   is NULL, so poll-based TX-done detection never fires.

Fix this by introducing a MBOX_NO_MSG sentinel value that means
"no active request," freeing NULL to be valid message data. The
sentinel is defined in the subsystem-internal mailbox.h so that
controller drivers within drivers/mailbox/ can reference it, but
it is not exposed to clients outside the subsystem.

Fifteen in-tree callers send NULL (doorbell-style IPCs on Qualcomm,
Tegra, TI, Xilinx, i.MX, SCMI, and PCC platforms). All were
audited for regression:

 - Most already work around the bug via knows_txdone=true with a
   manual mbox_client_txdone() call, making the framework's
   tracking irrelevant. These are unaffected.

 - Poll-based callers (Xilinx zynqmp/r5) are strictly better off:
   the poll timer now correctly detects NULL-active channels
   instead of silently skipping them.

 - irq-qcom-mpm.c was a pre-existing bug -- the only Qualcomm
   caller that omitted the knows_txdone + mbox_client_txdone()
   pattern. Fixed in a companion commit ("irqchip/qcom-mpm: Fix
   missing mailbox TX done acknowledgment").

 - No caller sets both a tx_done callback and sends NULL, nor
   combines tx_block=true with NULL sends, so the newly reachable
   callback/completion paths are never exercised.

Also update tegra-hsp's flush callback, which directly inspects
active_req to wait for the channel to drain: the old "!= NULL"
check becomes "!= MBOX_NO_MSG", otherwise flush spins until
timeout since the sentinel is non-NULL.

The only tradeoff is that 'MBOX_NO_MSG' can not be used as a message
by clients.

Reported-by: Joonwon Kang 
Reviewed-by: Douglas Anderson 
Signed-off-by: Jassi Brar 
Signed-off-by: Joonwon Kang 
Signed-off-by: Greg Kroah-Hartman

platform/x86/intel/vsec: Make driver_data info const

2026-06-09T10:32:49+00:00

[ Upstream commit 9577c74c96f88d807d1ba005adbf5952e7127e55 ]

Treat PCI id->driver_data (intel_vsec_platform_info) as read-only by making
vsec_priv->info a const pointer and updating all function signatures to
accept const intel_vsec_platform_info *.

This improves const-correctness and clarifies that the platform info data
from the driver_data table is not meant to be modified at runtime.

No functional changes intended.

Signed-off-by: David E. Box 
Reviewed-by: Michael J. Ruhl 
Link: https://patch.msgid.link/20260313015202.3660072-3-david.e.box@linux.intel.com
Reviewed-by: Ilpo Järvinen 
Signed-off-by: Ilpo Järvinen 
Stable-dep-of: 348ccc754d89 ("platform/x86/intel/vsec: Fix enable_cnt imbalance on PCIe error recovery")
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman