linux-stable.git/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c, branch v6.18

drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfaces

2025-11-12T03:49:19+00:00

Certain multi-GPU configurations (especially GFX12) may hit
data corruption when a DCC-compressed VRAM surface is shared across GPUs
using peer-to-peer (P2P) DMA transfers.

Such surfaces rely on device-local metadata and cannot be safely accessed
through a remote GPU’s page tables. Attempting to import a DCC-enabled
surface through P2P leads to incorrect rendering or GPU faults.

This change disables P2P for DCC-enabled VRAM buffers that are contiguous
and allocated on GFX12+ hardware.  In these cases, the importer falls back
to the standard system-memory path, avoiding invalid access to compressed
surfaces.

Future work could consider optional migration (VRAM→System→VRAM) if a
performance regression is observed when `attach->peer2peer = false`.

Tested on:
 - Dual RX 9700 XT (Navi4x) setup
 - GNOME and Wayland compositor scenarios
 - Confirmed no corruption after disabling P2P under these conditions
v2: Remove check TTM_PL_VRAM & TTM_PL_FLAG_CONTIGUOUS.
v3: simplify for upsteam and fix ip version check (Alex)

Suggested-by: Christian König 
Signed-off-by: Vitaly Prosyak 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
(cherry picked from commit 9dff2bb709e6fbd97e263fd12bf12802d2b5a0cf)
Cc: stable@vger.kernel.org

Merge tag 'v6.17-rc6' into drm-next

2025-09-15T07:51:07+00:00

This is a backmerge of Linux 6.17-rc6, needed for msm,
also requested by misc.

Signed-off-by: Dave Airlie

amdgpu: populate buffers before exporting them.

2025-09-11T00:04:31+00:00

Before exporting a buffer, make sure it has been populated with
pages at least once.

While discussing cgroups we noticed a problem where you could export
a BO to a dma-buf without having it ever being backed or accounted for.

This meant in low memory situations or eventually with cgroups, a
lower privledged process might cause the compositor to try and allocate
a lot of memory on it's behalf and this could fail. At least make
sure the exporter has managed to allocate the RAM at least once
before exporting the object.

This only applies currently to TTM_PL_SYSTEM objects, because
GTT objects get populated on first validate, and VRAM doesn't
use TT.

Reviewed-by: Christian Koenig 
Signed-off-by: Dave Airlie 
Link: https://lore.kernel.org/r/20250904021643.2050497-2-airlied@gmail.com

drm/amdgpu: Pin buffers while vmap'ing exported dma-buf objects

2025-08-22T12:05:31+00:00

Current dma-buf vmap semantics require that the mapped buffer remains
in place until the corresponding vunmap has completed.

For GEM-SHMEM, this used to be guaranteed by a pin operation while creating
an S/G table in import. GEM-SHMEN can now import dma-buf objects without
creating the S/G table, so the pin is missing. Leads to page-fault errors,
such as the one shown below.

[  102.101726] BUG: unable to handle page fault for address: ffffc90127000000
[...]
[  102.157102] RIP: 0010:udl_compress_hline16+0x219/0x940 [udl]
[...]
[  102.243250] Call Trace:
[  102.245695]  
[  102.2477V95]  ? validate_chain+0x24e/0x5e0
[  102.251805]  ? __lock_acquire+0x568/0xae0
[  102.255807]  udl_render_hline+0x165/0x341 [udl]
[  102.260338]  ? __pfx_udl_render_hline+0x10/0x10 [udl]
[  102.265379]  ? local_clock_noinstr+0xb/0x100
[  102.269642]  ? __lock_release.isra.0+0x16c/0x2e0
[  102.274246]  ? mark_held_locks+0x40/0x70
[  102.278177]  udl_primary_plane_helper_atomic_update+0x43e/0x680 [udl]
[  102.284606]  ? __pfx_udl_primary_plane_helper_atomic_update+0x10/0x10 [udl]
[  102.291551]  ? lockdep_hardirqs_on_prepare.part.0+0x92/0x170
[  102.297208]  ? lockdep_hardirqs_on+0x88/0x130
[  102.301554]  ? _raw_spin_unlock_irq+0x24/0x50
[  102.305901]  ? wait_for_completion_timeout+0x2bb/0x3a0
[  102.311028]  ? drm_atomic_helper_calc_timestamping_constants+0x141/0x200
[  102.317714]  ? drm_atomic_helper_commit_planes+0x3b6/0x1030
[  102.323279]  drm_atomic_helper_commit_planes+0x3b6/0x1030
[  102.328664]  drm_atomic_helper_commit_tail+0x41/0xb0
[  102.333622]  commit_tail+0x204/0x330
[...]
[  102.529946] ---[ end trace 0000000000000000 ]---
[  102.651980] RIP: 0010:udl_compress_hline16+0x219/0x940 [udl]

In this stack strace, udl (based on GEM-SHMEM) imported and vmap'ed a
dma-buf from amdgpu. Amdgpu relocated the buffer, thereby invalidating the
mapping.

Provide a custom dma-buf vmap method in amdgpu that pins the object before
mapping it's buffer's pages into kernel address space. Do the opposite in
vunmap.

Note that dma-buf vmap differs from GEM vmap in how it handles relocation.
While dma-buf vmap keeps the buffer in place, GEM vmap requires the caller
to keep the buffer in place. Hence, this fix is in amdgpu's dma-buf code
instead of its GEM code.

A discussion of various approaches to solving the problem is available
at [1].

v3:
- try (GTT | VRAM); drop CPU domain (Christian)
v2:
- only use mapable domains (Christian)
- try pinning to domains in preferred order

Signed-off-by: Thomas Zimmermann 
Fixes: 660cd44659a0 ("drm/shmem-helper: Import dmabuf without mapping its sg_table")
Reported-by: Thomas Zimmermann 
Closes: https://lore.kernel.org/dri-devel/ba1bdfb8-dbf7-4372-bdcb-df7e0511c702@suse.de/
Cc: Shixiong Ou 
Cc: Thomas Zimmermann 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: David Airlie 
Cc: Simona Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Link: https://lore.kernel.org/dri-devel/9792c6c3-a2b8-4b2b-b5ba-fba19b153e21@suse.de/ # [1]
Reviewed-by: Christian König 
Link: https://lore.kernel.org/r/20250821064031.39090-1-tzimmermann@suse.de

Revert "drm/amdgpu: Use dma_buf from GEM object instance"

2025-08-13T09:14:17+00:00

This reverts commit 515986100d176663d0a03219a3056e4252f729e6.

The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.

Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.

Hence, this revert to going back to using .import_attach->dmabuf.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Simona Vetter 
Acked-by: Alex Deucher 
Link: https://lore.kernel.org/r/20250715082635.34974-1-tzimmermann@suse.de

drm/amdgpu: Use dma_buf from GEM object instance

2025-06-30T15:55:16+00:00

Avoid dereferencing struct drm_gem_object.import_attach for the
imported dma-buf. The dma_buf field in the GEM object instance refers
to the same buffer. Prepares to make import_attach an implementation
detail of PRIME.

Reviewed-by: Christian König 
Signed-off-by: Thomas Zimmermann 
Signed-off-by: Alex Deucher

drm/amdgpu: Test for imported buffers with drm_gem_is_imported()

2025-06-30T15:54:59+00:00

Instead of testing import_attach for imported GEM buffers, invoke
drm_gem_is_imported() to do the test.

v2:
- keep amdgpu_bo_print_info() as-is (Christian)

Reviewed-by: Christian König 
Signed-off-by: Thomas Zimmermann 
Signed-off-by: Alex Deucher

drm/amdgpu: Fail DMABUF map of XGMI-accessible memory

2025-05-01T15:01:46+00:00

If peer memory is XGMI-accessible, we should never access it through PCIe
P2P DMA mappings. PCIe P2P is slower, has different coherence behaviour,
limited or no support for atomics, or may not work at all. Fail with a
warning if DMABUF mappings of such memory are attempted.

Signed-off-by: Felix Kuehling 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
(cherry picked from commit dbe4c63689bc6b5fd3ab72650ea4b6a667e96a68)

drm/amdgpu: Allow P2P access through XGMI

2025-04-22T20:47:12+00:00

If peer memory is accessible through XGMI, allow leaving it in VRAM
rather than forcing its migration to GTT on DMABuf attachment.

Signed-off-by: Felix Kuehling 
Tested-by: Hao (Claire) Zhou 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
(cherry picked from commit 372c8d72c3680fdea3fbb2d6b089f76b4a6d596a)

drm/amdgpu: Don't pin VRAM without DMABUF_MOVE_NOTIFY

2025-04-22T20:44:28+00:00

Pinning of VRAM is for peer devices that don't support dynamic attachment
and move notifiers. But it requires that all such peer devices are able to
access VRAM via PCIe P2P. Any device without P2P access requires migration
to GTT, which fails if the memory is already pinned for another peer
device.

Sharing between GPUs should not require pinning in VRAM. However, if
DMABUF_MOVE_NOTIFY is disabled in the kernel build, even DMABufs shared
between GPUs must be pinned, which can lead to failures and functional
regressions on systems where some peer GPUs are not P2P accessible.

Disable VRAM pinning if move notifiers are disabled in the kernel build
to fix regressions when sharing BOs between GPUs.

Signed-off-by: Felix Kuehling 
Tested-by: Hao (Claire) Zhou 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
(cherry picked from commit 05185812ae3695fe049c14847ce3cbeccff1bf2e)