linux.git/include/rdma, branch master

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

2026-06-18T15:16:21+00:00

Pull rdma updates from Jason Gunthorpe:
 "Many AI driven bug fixes, and several big driver API cleanups

   - Driver bug fixes and minor cleanups in mlx5, hns, rxe, efa, siw,
     rtrs, mana, irdma, mlx4. Commonly error path flows, integer
     arithmetic overflows on unsafe data, out of bounds access, and use
     after free issues under races.

   - Second half of the new udata API for drivers focusing on uAPI
     response

   - bnxt_re supports more options for QP creation that will allow a dv
     path in rdma-core

   - Untangle the module dependencies so drivers don't link to
     ib_uverbs.ko as was originally intended

   - Provide a new way to handle umems with a consistent simplified uAPI
     and update several drivers to use it. This brings dmabuf support to
     more places and more drivers

   - Support for mlx5 rate limit and packet pacing for UD and UC

   - A batch of fixes for the new shared FRMR pools infrastructure"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (148 commits)
  RDMA/irdma: Replace waitqueue and flag with completion
  RDMA/hns: Fix memory leak of bonding resources
  RDMA/rtrs-srv: Bound RDMA-Write length to chunk size in rdma_write_sg
  docs: infiniband: correct name of option to enable the ib_uverbs module
  RDMA/bnxt_re: Reject GET_TOGGLE_MEM when toggle page was not allocated
  RDMA/bnxt_re: Fail DBR related page allocation UAPIs if the feature is disabled
  RDMA/bnxt_re: Avoid repeated requests to allocate WC pages
  RDMA/bnxt_re: Proper rollback if the ioremap fails
  RDMA/bnxt_re: Add a max slot check for SQ
  RDMA/bnxt_re: Avoid displaying the kernel pointer
  RDMA/bnxt_re: Free CQ toggle page after firmware teardown
  RDMA/bnxt_re: Free SRQ toggle page after firmware teardown
  RDMA/bnxt_re: Initialize dpi variable to zero
  ABI: sysfs-class-infiniband: minor cleanup
  RDMA/mlx5: Release the HW‑provided UAR index rather than the SW one
  RDMA/mlx5: Fix undefined shift of user RQ WQE size
  RDMA/mlx5: Remove raw RSS QP restrack tracking
  RDMA/mlx5: Remove DCT restrack tracking
  RDMA/mlx5: Drop FRMR pool handle on UMR revoke failure
  RDMA/core: Add ib_frmr_pool_drop for unrecoverable handles
  ...

RDMA/core: Add ib_frmr_pool_drop for unrecoverable handles

2026-06-11T18:36:09+00:00

A driver that has popped a handle from an FRMR pool can hit failures
that leave the handle in a state where it can't safely be returned
for reuse. The driver destroys the handle itself, but the pool has
no way to learn about it, so the in_use counter drifts upward.

Add ib_frmr_pool_drop to balance the pool's accounting in this case.
Every pop is now balanced by exactly one push or drop.
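
The accounting rule above can be illustrated with a toy model. This is a
minimal userspace sketch, not the kernel implementation: struct toy_pool
and the toy_pool_*() helpers are hypothetical names invented here, and the
real pool tracks far more state than two counters.

```c
#include <assert.h>

/* Toy model only: names and fields are illustrative, not the kernel's. */
struct toy_pool {
	unsigned int in_use;	/* handles currently held by the driver */
	unsigned int queued;	/* handles available for reuse */
};

/* Take a handle out of the pool; returns 0 on success, -1 if empty. */
static int toy_pool_pop(struct toy_pool *p)
{
	if (p->queued == 0)
		return -1;
	p->queued--;
	p->in_use++;
	return 0;
}

/* Return a healthy handle to the pool for reuse. */
static void toy_pool_push(struct toy_pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
	p->queued++;
}

/* The driver destroyed the handle itself; only the accounting is fixed. */
static void toy_pool_drop(struct toy_pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
}
```

In this model a pop followed by either a push or a drop brings in_use back
to its previous value; a drop simply does not make the handle available
again.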

Fixes: 36680ef7bceb ("RDMA/mlx5: Switch from MR cache to FRMR pools")
Link: https://patch.msgid.link/r/20260610000145.820592-9-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik 
Signed-off-by: Jason Gunthorpe

RDMA/core: Fix FRMR handle leak on push failure

2026-06-11T18:36:09+00:00

A failure to push a handle to the pool, caused by ENOMEM on queue page
allocation, skips the in_use counter update and leaves the pool state
skewed indefinitely.

Fix that by moving destruction of the handle in this case into the FRMR
code, so that the handle is either pushed to the pool or destroyed inside
the same function.

Adjust mlx5_ib call site accordingly.
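
A rough sketch of the "push or destroy in one function" shape, under
stated assumptions: toy_queue, toy_handle, toy_push() and toy_destroy()
are hypothetical userspace stand-ins, a full fixed-size array stands in
for the ENOMEM case, and updating the counter before the fallible step is
a choice made for this sketch rather than something the commit states.

```c
#include <stdbool.h>
#include <stdlib.h>

#define TOY_QUEUE_SLOTS 4

struct toy_handle {
	int key;
};

struct toy_queue {
	struct toy_handle *slots[TOY_QUEUE_SLOTS];
	unsigned int count;	/* handles queued for reuse */
	unsigned int in_use;	/* handles held by the caller */
	unsigned int destroyed;	/* handles torn down on push failure */
};

static void toy_destroy(struct toy_queue *q, struct toy_handle *h)
{
	free(h);
	q->destroyed++;
}

/*
 * Consumes the handle on every path, so the caller never has to clean up.
 * Returns true if the handle was queued, false if it was destroyed.
 */
static bool toy_push(struct toy_queue *q, struct toy_handle *h)
{
	q->in_use--;	/* accounting happens before anything can fail */
	if (q->count == TOY_QUEUE_SLOTS) {	/* stands in for ENOMEM */
		toy_destroy(q, h);
		return false;
	}
	q->slots[q->count++] = h;
	return true;
}
```

Because the function owns the handle on both paths, a call site in this
model only needs to look at the return value for reporting, not for
cleanup.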

Fixes: ce5df0b891ed ("IB/core: Introduce FRMR pools")
Link: https://patch.msgid.link/r/20260610000145.820592-8-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik 
Signed-off-by: Jason Gunthorpe

RDMA/nldev: Fix locking when accessing mr->pd

2026-06-08T17:32:43+00:00

Sashiko points out that, due to rereg_mr, the PD is actually variable and
all the touches in nldev are racy.

Use mr->device instead of mr->pd->device.

Getting the PD restrack ID is more tricky. To avoid disturbing all the
happy paths, add an rdma_restrack_sync() operation which is sort of like
flush_workqueue() or synchronize_irq(): after it returns, all the old
nldev touches to the mr are gone and everything sees the new PD. This
makes it safe to reach into the PD pointer.

Fixes: da5c85078215 ("RDMA/nldev: add driver-specific resource tracking")
Link: https://patch.msgid.link/r/4-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe

RDMA: During rereg_mr ensure that REREG_ACCESS is compatible

2026-06-08T16:39:20+00:00

If IB_MR_REREG_ACCESS changes from RO to RW then the umem has to be
re-evaluated to ensure it is properly pinned as RW. Since the umem is
hidden inside each driver's mr struct, add an ib_umem_check_rereg()
function that each driver has to call before processing IB_MR_REREG_ACCESS.

mlx4 has to retain its duplicate ib_access_writable check because it
implements IB_MR_REREG_ACCESS | IB_MR_REREG_TRANS by changing both items
in place sequentially while the MR is live, so it will continue to not
support this combination.
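
The kind of check described above might look roughly like the following.
This is a sketch under assumptions, not the body of ib_umem_check_rereg():
the TOY_ACCESS_* flags, struct toy_umem, the helper names, and the choice
of -EPERM as the error are all invented here for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical access bits; the real IB_ACCESS_* set is larger. */
#define TOY_ACCESS_LOCAL_WRITE	(1u << 0)
#define TOY_ACCESS_REMOTE_WRITE	(1u << 1)
#define TOY_ACCESS_REMOTE_READ	(1u << 2)

struct toy_umem {
	bool writable;	/* true if the pages were pinned for writing */
};

static bool toy_access_writable(unsigned int access)
{
	return (access & (TOY_ACCESS_LOCAL_WRITE |
			  TOY_ACCESS_REMOTE_WRITE)) != 0;
}

/*
 * Refuse an access change that would let the device write to pages that
 * were only pinned read-only. Returns 0 if the new flags are compatible.
 */
static int toy_check_rereg(const struct toy_umem *umem,
			   unsigned int new_access)
{
	if (toy_access_writable(new_access) && !umem->writable)
		return -EPERM;	/* assumed errno, chosen for the sketch */
	return 0;
}
```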

Cc: stable@vger.kernel.org
Fixes: b40656aa7d55 ("RDMA/umem: remove FOLL_FORCE usage")
Link: https://patch.msgid.link/r/0-v1-06fb1a2d6cf5+107-rereg_access_jgg@nvidia.com
Reported-by: Philip Tsukerman 
Signed-off-by: Jason Gunthorpe

RDMA/hfi1: Open-code rvt_set_ibdev_name()

2026-06-05T15:38:42+00:00

clang warns about a function missing a printf attribute:

include/rdma/rdma_vt.h:457:47: error: diagnostic behavior may be improved by adding the 'format(printf, 2, 3)' attribute to the declaration of 'rvt_set_ibdev_name' [-Werror,-Wmissing-format-attribute]
  447 | static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi,
      | __attribute__((format(printf, 2, 3)))
  448 |                                       const char *fmt, const char *name,
  449 |                                       const int unit)

The helper was originally added as an abstraction for the hfi1 and
qib drivers needing the same thing, but now qib is gone, and hfi1
is the only remaining user of rdma_vt.

Avoid the warning and allow the compiler to check the format string by
open-coding the helper and directly assigning the device name.
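
As an illustration of why open-coding helps: once the format string is a
literal at the call site, the compiler's -Wformat checking can match it
against the arguments. The sketch below is userspace C with hypothetical
names (struct toy_dev, toy_init_name()); the "%s_%d" format is used here
only as an example.

```c
#include <stdio.h>

struct toy_dev {
	char name[64];
};

/*
 * The format string is a literal passed straight to snprintf(), so a
 * mismatch such as "%s_%s" with an int argument is diagnosed at build
 * time instead of being hidden behind a helper's "const char *fmt".
 */
static void toy_init_name(struct toy_dev *d, const char *driver, int unit)
{
	snprintf(d->name, sizeof(d->name), "%s_%d", driver, unit);
}
```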

Fixes: 5084c8ff21f2 ("IB/{rdmavt, hfi1, qib}: Self determine driver name")
Link: https://patch.msgid.link/r/20260602140453.3542427-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann 
Reviewed-by: Kees Cook 
Signed-off-by: Jason Gunthorpe

RDMA/umem: Make ib_umem_is_contiguous() safe on 32 bit

2026-06-05T15:36:33+00:00

Sashiko points out that roundup_pow_of_two() only handles unsigned long,
but dma_addr_t can be u64.

Simplify the algorithm: compute the page size, and if a page size is
found and it results in a single block, then the umem is contiguous.
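
The single-block test can be sketched in plain 64-bit arithmetic. This is
an illustration only: toy_num_blocks() and toy_is_contiguous() are
hypothetical helpers, pgsz is assumed to be either 0 (no page size found)
or a power of two, and the range is assumed not to wrap past the end of
the 64-bit address space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of pgsz-sized blocks touched by [addr, addr + length). */
static uint64_t toy_num_blocks(uint64_t addr, uint64_t length, uint64_t pgsz)
{
	uint64_t first = addr & ~(pgsz - 1);
	uint64_t last = (addr + length - 1) & ~(pgsz - 1);

	return (last - first) / pgsz + 1;
}

/* Contiguous means: a page size was found and it yields one block. */
static bool toy_is_contiguous(uint64_t addr, uint64_t length, uint64_t pgsz)
{
	if (pgsz == 0 || length == 0)
		return false;
	return toy_num_blocks(addr, length, pgsz) == 1;
}
```

Every value is uint64_t, so the result does not depend on the width of
unsigned long.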

Link: https://patch.msgid.link/r/3-v1-88303e9e509f+f7-ib_umem_types_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe

RDMA/umem: Be careful about boundary conditions in ib_umem_find_best_pgsz()

2026-06-05T15:36:33+00:00

Several corner cases, especially important on 32 bits:

- umem->iova is u64, the function argument should pass in u64 or
  iova will be truncated
- Check that the length is not too large for the iova
- Check that lengths > 4G don't overflow the GENMASK
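
The three corner cases can be demonstrated with small userspace helpers.
These are hypothetical stand-ins written for this note, not the checks
added to ib_umem_find_best_pgsz(): toy_truncate32() mimics passing a u64
through a 32-bit unsigned long, toy_range_ok() is one way to test that
iova + length does not wrap, and toy_mask_below() shows a low-bit mask
that stays defined for widths up to 64.

```c
#include <stdbool.h>
#include <stdint.h>

/* What a 32-bit "unsigned long" parameter would keep of a u64 iova. */
static uint32_t toy_truncate32(uint64_t iova)
{
	return (uint32_t)iova;
}

/* True if [iova, iova + length) fits without wrapping past 2^64. */
static bool toy_range_ok(uint64_t iova, uint64_t length)
{
	if (length == 0)
		return false;
	return length - 1 <= UINT64_MAX - iova;
}

/* Mask with the low 'bits' bits set; avoids an undefined 64-bit shift. */
static uint64_t toy_mask_below(unsigned int bits)
{
	if (bits >= 64)
		return UINT64_MAX;
	return (UINT64_C(1) << bits) - 1;
}
```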

Link: https://patch.msgid.link/r/2-v1-88303e9e509f+f7-ib_umem_types_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe

RDMA/uverbs: Expose CoCo DMA bounce requirement to userspace

2026-05-29T23:27:29+00:00

In CoCo guests, guest memory is encrypted and untrusted (T=0) devices
cannot DMA to it directly; such transfers must go through unencrypted
bounce buffers. RDMA registers user pages for direct device access,
bypassing the DMA layer and thus any bouncing, so registered memory does
not work in this configuration.

Until trusted (T=1) device detection is available, conservatively flag
every device attached to a CoCo guest. Expose the condition to userspace
as IB_UVERBS_DEVICE_CC_DMA_BOUNCE in device_cap_flags_ex so applications
can avoid memory registration and fall back to copying buffers through
send/recv.
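
From an application's point of view, the fallback decision might be
sketched as below. Everything here is an assumption for illustration:
TOY_DEVICE_CC_DMA_BOUNCE uses a placeholder bit position (the real value
of IB_UVERBS_DEVICE_CC_DMA_BOUNCE comes from the uapi headers), and the
enum and helper names are invented. In a real program the flags word
would be read from the device's extended attributes.

```c
#include <stdint.h>

/* Placeholder bit; NOT the real value of IB_UVERBS_DEVICE_CC_DMA_BOUNCE. */
#define TOY_DEVICE_CC_DMA_BOUNCE	(UINT64_C(1) << 63)

enum toy_xfer_mode {
	TOY_XFER_REGISTERED_MEMORY,	/* register user buffers directly */
	TOY_XFER_COPY_SEND_RECV,	/* copy through send/recv buffers */
};

/* Pick a transfer strategy from the extended device capability flags. */
static enum toy_xfer_mode toy_pick_mode(uint64_t device_cap_flags_ex)
{
	if (device_cap_flags_ex & TOY_DEVICE_CC_DMA_BOUNCE)
		return TOY_XFER_COPY_SEND_RECV;
	return TOY_XFER_REGISTERED_MEMORY;
}
```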

Link: https://patch.msgid.link/r/20260517141311.2409230-2-jiri@resnulli.us
Signed-off-by: Jiri Pirko 
Signed-off-by: Jason Gunthorpe

RDMA/umem: Add ib_umem_is_contiguous() stub for !CONFIG_INFINIBAND_USER_MEM

2026-05-29T23:19:59+00:00

ib_umem_is_contiguous() is defined under #ifdef
CONFIG_INFINIBAND_USER_MEM, but the #else branch lacks a stub.

Add the missing inline stub to fix a potential build breakage.
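
The general shape of such a stub, shown with hypothetical names
(TOY_CONFIG_USER_MEM, struct toy_umem, toy_umem_is_contiguous()). The
return value of the stub is an assumption made for this sketch; the
commit text does not say what the real stub returns.

```c
#include <stdbool.h>

struct toy_umem;

#ifdef TOY_CONFIG_USER_MEM
/* Real implementation is built elsewhere when the option is enabled. */
bool toy_umem_is_contiguous(struct toy_umem *umem);
#else
/* Stub keeps callers compiling when the option is disabled. */
static inline bool toy_umem_is_contiguous(struct toy_umem *umem)
{
	(void)umem;
	return false;
}
#endif
```

With the stub in place, callers do not need their own #ifdef around the
call.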

Fixes: c897c2c8b8e8 ("RDMA/core: Add umem "is_contiguous" and "start_dma_addr" helpers")
Link: https://patch.msgid.link/r/20260529134312.2836341-15-jiri@resnulli.us
Signed-off-by: Jiri Pirko 
Signed-off-by: Jason Gunthorpe