linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c, branch v7.1-rc5

drm/amdgpu: avoid integer overflow in VA range check

2026-05-19T16:15:41+00:00

The original 64-bit unsigned addition in the VA range check can
overflow. Use check_add_overflow() to detect the overflow and safely
reject invalid inputs.

Signed-off-by: Ce Sun 
Reviewed-by: Tao Zhou 
Signed-off-by: Alex Deucher 
(cherry picked from commit cc768f4dd0bb9083c813683eeec44fc23921f771)
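
As a rough userspace illustration of the overflow described above, the sketch below contrasts a naive range check with an overflow-aware one. The macro is a stand-in built on the GCC/Clang builtin __builtin_add_overflow(); EXAMPLE_VA_LIMIT, both function names, and the -EINVAL return value are assumptions made for the example and are not taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for check_add_overflow(): stores a + b in *sum and
 * evaluates to true if the addition wrapped around. */
#define check_add_overflow(a, b, sum) __builtin_add_overflow((a), (b), (sum))

/* Hypothetical upper bound of the VA space, for illustration only. */
#define EXAMPLE_VA_LIMIT (UINT64_C(1) << 48)

/* Naive check: start + size can wrap and slip under the limit. */
bool example_va_range_ok_naive(uint64_t start, uint64_t size)
{
	return start + size <= EXAMPLE_VA_LIMIT;
}

/* Overflow-aware check: a wrapped sum is rejected before the comparison.
 * The -EINVAL error code is an assumption. */
int example_va_range_check(uint64_t start, uint64_t size)
{
	uint64_t end;

	if (check_add_overflow(start, size, &end))
		return -EINVAL;
	assert(end >= start);	/* holds once wrap-around is excluded */
	if (end > EXAMPLE_VA_LIMIT)
		return -EINVAL;
	return 0;
}
```

With start = UINT64_MAX - 0xfff and size = 0x2000, the naive sum wraps to 0x1000 and passes the comparison, while the overflow-aware variant rejects the input.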

drm/amdgpu: reject non-user addresses early in GEM_USERPTR ioctl

2026-05-19T16:08:47+00:00

amdgpu_gem_userptr_ioctl() currently accepts any value of args->addr
and only discovers an out-of-range pointer much later, inside
amdgpu_gem_object_create() and the HMM mirror registration path.
Userspace can drive that path with kernel-side virtual addresses;
the get_user_pages() layer rejects them, but only after the driver
has already allocated a GEM object and started wiring up notifier
state that then has to be torn down on failure.

Add an access_ok() guard at the top of the ioctl, right after the
existing page-alignment check and before flag validation, so any
address that does not lie within the calling task's user address
range is rejected with -EFAULT before any allocation occurs. No
legitimate ROCm/HSA userspace passes kernel-mode pointers through
this interface, so this is defense-in-depth rather than a behaviour
change for valid callers; -EFAULT matches the convention already
used by other uaccess-style rejections in the kernel.

Also add an explicit #include ; access_ok() is
otherwise only available transitively through other headers in
this translation unit.

Signed-off-by: Amir Shetaia 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
(cherry picked from commit 7a076df36397d780d7e4fb595287b4980451a7f5)
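
The ordering of the early checks described above can be modelled in userspace roughly as follows. This is a simplified sketch, not the driver code: example_access_ok() only imitates the idea behind access_ok(), and EXAMPLE_PAGE_SIZE, EXAMPLE_TASK_SIZE, the function names, and the -EINVAL code for misalignment are all assumptions chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants; the real values are architecture-specific. */
#define EXAMPLE_PAGE_SIZE UINT64_C(4096)
#define EXAMPLE_TASK_SIZE (UINT64_C(1) << 47)	/* end of the user range */

static_assert((EXAMPLE_PAGE_SIZE & (EXAMPLE_PAGE_SIZE - 1)) == 0,
	      "page size must be a power of two");

/* Simplified model of access_ok(): the whole range must lie in user
 * space. Written so that addr + size is never computed and cannot wrap. */
bool example_access_ok(uint64_t addr, uint64_t size)
{
	return size <= EXAMPLE_TASK_SIZE && addr <= EXAMPLE_TASK_SIZE - size;
}

/* Early checks in the order the commit message gives: page alignment
 * first, then the user-range guard. Flag validation and object creation
 * would only follow if this returns 0. */
int example_userptr_precheck(uint64_t addr, uint64_t size)
{
	if ((addr | size) & (EXAMPLE_PAGE_SIZE - 1))
		return -EINVAL;	/* assumed error code for misalignment */
	if (!example_access_ok(addr, size))
		return -EFAULT;	/* rejected before any allocation */
	return 0;
}
```

A kernel-style address such as 0xffff800000000000 is page-aligned, so it passes the first check and is then rejected with -EFAULT by the range guard.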

drm/amdgpu: fix syncobj leak for amdgpu_gem_va_ioctl()

2026-03-24T17:32:22+00:00

The syncobj and the chain allocation need to be freed.

Signed-off-by: Prike Liang 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher

drm/amdgpu: fix some more bug in amdgpu_gem_va_ioctl

2026-03-23T18:13:11+00:00

Some illegal combinations of input flags were not checked, and we need to
take the PDEs into account when returning the fence as well.

Signed-off-by: Christian König 
Acked-by: Sunil Khatri 
Signed-off-by: Alex Deucher

drm/amdgpu: fix adding eviction fence

2026-03-17T21:46:26+00:00

We can't add the eviction fence without validating the BO.

Signed-off-by: Christian König 
Reviewed-by: Sunil Khatri 
Signed-off-by: Alex Deucher

drm/amdgpu: completely rework eviction fence handling v2

2026-03-17T21:46:13+00:00

Well that was broken on multiple levels.

First of all, a lot of checks were placed at incorrect locations, especially
the ones deciding whether the resume worker should run or not.

Then a bunch of code was just mid-layering because of an incorrect assignment
of who should do what.

And finally comments explaining what happens instead of why.

Just re-write it from scratch, that should at least fix some of the hangs we
are seeing.

Use RCU for the eviction fence pointer in the manager, the spinlock usage was
mostly incorrect as well. Then finally remove all the nonsense checks and
actually add them in the correct locations.

v2: some typo fixes and cleanups suggested by Sunil

Signed-off-by: Christian König 
Reviewed-by: Sunil Khatri 
Signed-off-by: Alex Deucher

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-22T01:09:51+00:00

This was done entirely with mindless brute force, using

    git grep -l '\

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21T09:02:28+00:00

This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook
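
To illustrate the "allocate by type, get TYPE * back" idea from the conversion above, here is a hypothetical userspace analog. These are not the kernel macros: example_alloc_obj(), example_alloc_objs(), example_alloc_array() and struct example_bo are names invented for this sketch, which relies on the GCC/Clang __typeof__ extension so that the argument can be either a type or an expression such as *ptr.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Array allocation with an explicit multiplication-overflow check. */
void *example_alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return malloc(n * size);
}

/* Hypothetical analogs of the typed helpers: the result is "T *" rather
 * than "void *", so assigning it to an unrelated pointer type draws a
 * compiler diagnostic. T may be a type name or an expression like *ptr. */
#define example_alloc_obj(T) \
	((__typeof__(T) *)malloc(sizeof(T)))
#define example_alloc_objs(T, n) \
	((__typeof__(T) *)example_alloc_array((n), sizeof(T)))

/* Example structure, invented for this sketch. */
struct example_bo {
	uint64_t addr;
	uint32_t flags;
};
```

For instance, `struct example_bo *bo = example_alloc_obj(struct example_bo);` allocates a single object, and `example_alloc_objs(*bo, 4)` allocates an array of four using the *VAR form.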

drm/amdgpu: lock both VM and BO in amdgpu_gem_object_open

2026-02-12T20:24:59+00:00

The VM was not locked in the past since we initially only cleared the
linked list element and did not add it to any VM state.

But this changed quite some time ago; we just never noticed the
problem because the VM state lock was masking it.

Signed-off-by: Christian König 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher

drm/amdgpu: Refactor amdgpu_gem_va_ioctl for Handling Last Fence Update and Timeline Management v7

2026-01-10T19:21:52+00:00

When GPU memory mappings are updated, the driver returns a fence so
userspace knows when the update is finished.

The previous refactor could pick the wrong fence or rely on checks that
are not safe for GPU mappings that stay valid even when memory is
missing. In some cases this could return an invalid fence or cause fence
reference counting problems.

Fix this by (v5, v6, per Christian):
- Starting from the VM’s existing last update fence, so a valid and
  meaningful fence is always returned even when no new work is required.
- Selecting the VM-level fence only for always-valid / PRT mappings using
  the required combined bo_va + bo guard.
- Using the per-BO page table update fence for normal MAP and REPLACE
  operations.
- For UNMAP and CLEAR, returning the fence provided by
  amdgpu_vm_clear_freed(), which may remain unchanged when nothing needs
  clearing.
- Keeping fence reference counting balanced.

v7: Drop the extra bo_va/bo NULL guard since
    amdgpu_vm_is_bo_always_valid() handles NULL BOs correctly (including
    PRT). (Christian)

This makes VM timeline fences correct and prevents crashes caused by
incorrect fence handling.

Fixes: bd8150a1b337 ("drm/amdgpu: Refactor amdgpu_gem_va_ioctl for Handling Last Fence Update and Timeline Management v4")
Suggested-by: Christian König 
Signed-off-by: Srinivasan Shanmugam 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher
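
The fence selection rules listed above can be modelled with a toy refcounted fence, as in the userspace sketch below. Everything here is an assumption made for illustration: struct example_fence stands in for a real fence object, the struct example_va_state fields are invented, and a NULL clear_freed pointer models the case where clearing freed mappings leaves the fence unchanged. It is not the driver's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy refcounted fence, invented for this sketch. */
struct example_fence { int refcount; };

struct example_fence *example_fence_get(struct example_fence *f)
{
	if (f)
		f->refcount++;
	return f;
}

void example_fence_put(struct example_fence *f)
{
	if (f) {
		assert(f->refcount > 0);
		f->refcount--;
	}
}

enum example_va_op { EX_VA_MAP, EX_VA_REPLACE, EX_VA_UNMAP, EX_VA_CLEAR };

/* Hypothetical inputs gathered before a fence is chosen. */
struct example_va_state {
	struct example_fence *vm_last_update;	 /* VM-level fence */
	struct example_fence *bo_last_pt_update; /* per-BO page table fence */
	struct example_fence *clear_freed;	 /* NULL: nothing was cleared */
	bool always_valid;			 /* always-valid / PRT mapping */
};

/* Returns a new reference; the caller drops it with example_fence_put(). */
struct example_fence *example_pick_fence(const struct example_va_state *s,
					 enum example_va_op op)
{
	/* Start from the VM's last update fence so a valid fence is
	 * returned even when no new work was required. */
	struct example_fence *fence = example_fence_get(s->vm_last_update);
	struct example_fence *next = NULL;

	switch (op) {
	case EX_VA_MAP:
	case EX_VA_REPLACE:
		/* Always-valid / PRT mappings keep the VM-level fence. */
		if (!s->always_valid)
			next = s->bo_last_pt_update;
		break;
	case EX_VA_UNMAP:
	case EX_VA_CLEAR:
		next = s->clear_freed;
		break;
	}

	if (next) {
		/* Keep refcounts balanced: take the new reference first,
		 * then drop the one on the starting fence. */
		example_fence_get(next);
		example_fence_put(fence);
		fence = next;
	}
	return fence;
}
```

In this model every call hands back exactly one extra reference, so a caller that pairs each example_pick_fence() with one example_fence_put() leaves all counts where they started.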