<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c, branch v7.1-rc4</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>drm/amdgpu: fix root reservation in amdgpu_vm_handle_fault</title>
<updated>2026-04-24T15:10:12+00:00</updated>
<author>
<name>Pierre-Eric Pelloux-Prayer</name>
<email>pierre-eric.pelloux-prayer@amd.com</email>
</author>
<published>2026-04-20T08:23:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=686e5985d9f5ba29e2fd43d618548039727adee2'/>
<id>686e5985d9f5ba29e2fd43d618548039727adee2</id>
<content type='text'>
svm_range_restore_pages might reserve the root bo so it must
be called after unreserving it.

Fixes: 1b135c6da061 ("drm/amdgpu: extract amdgpu_vm_lock_by_pasid from amdgpu_vm_handle_fault")
Signed-off-by: Pierre-Eric Pelloux-Prayer &lt;pierre-eric.pelloux-prayer@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 5cdc219fe86a1720aa4b5b4f42f11913146e6a93)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
svm_range_restore_pages might reserve the root bo so it must
be called after unreserving it.

Fixes: 1b135c6da061 ("drm/amdgpu: extract amdgpu_vm_lock_by_pasid from amdgpu_vm_handle_fault")
Signed-off-by: Pierre-Eric Pelloux-Prayer &lt;pierre-eric.pelloux-prayer@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 5cdc219fe86a1720aa4b5b4f42f11913146e6a93)
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: extract amdgpu_vm_lock_by_pasid from amdgpu_vm_handle_fault</title>
<updated>2026-04-03T18:55:07+00:00</updated>
<author>
<name>Pierre-Eric Pelloux-Prayer</name>
<email>pierre-eric.pelloux-prayer@amd.com</email>
</author>
<published>2026-02-04T15:41:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1b135c6da061cdb3322dc0e92ca5a7c58825c77b'/>
<id>1b135c6da061cdb3322dc0e92ca5a7c58825c77b</id>
<content type='text'>
This is tricky to implement right and we're going to need
it from the devcoredump.

Signed-off-by: Pierre-Eric Pelloux-Prayer &lt;pierre-eric.pelloux-prayer@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is tricky to implement right and we're going to need
it from the devcoredump.

Signed-off-by: Pierre-Eric Pelloux-Prayer &lt;pierre-eric.pelloux-prayer@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu/userq: call dma_resv_wait_timeout without test for signalled</title>
<updated>2026-04-03T17:59:15+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2026-03-26T07:52:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=38476bde59948fe85e20bb1e7f3f66525d0c10cd'/>
<id>38476bde59948fe85e20bb1e7f3f66525d0c10cd</id>
<content type='text'>
In function amdgpu_userq_gem_va_unmap_validate call
dma_resv_wait_timeout directly. Also since we are waiting
forever we should not be having any return value and hence
no handling needed.

Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In function amdgpu_userq_gem_va_unmap_validate call
dma_resv_wait_timeout directly. Also since we are waiting
forever we should not be having any return value and hence
no handling needed.

Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Handle GPU page faults correctly on non-4K page systems</title>
<updated>2026-03-24T17:33:49+00:00</updated>
<author>
<name>Donet Tom</name>
<email>donettom@linux.ibm.com</email>
</author>
<published>2026-03-23T04:28:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=074fe395fb13247b057f60004c7ebcca9f38ef46'/>
<id>074fe395fb13247b057f60004c7ebcca9f38ef46</id>
<content type='text'>
During a GPU page fault, the driver restores the SVM range and then maps it
into the GPU page tables. The current implementation passes a GPU-page-size
(4K-based) PFN to svm_range_restore_pages() to restore the range.

SVM ranges are tracked using system-page-size PFNs. On systems where the
system page size is larger than 4K, using GPU-page-size PFNs to restore the
range causes two problems:

Range lookup fails:
Because the restore function receives PFNs in GPU (4K) units, the SVM
range lookup does not find the existing range. This will result in a
duplicate SVM range being created.

VMA lookup failure:
The restore function also tries to locate the VMA for the faulting address.
It converts the GPU-page-size PFN into an address using the system page
size, which results in an incorrect address on non-4K page-size systems.
As a result, the VMA lookup fails with the message: "address 0xxxx VMA is
removed".

This patch passes the system-page-size PFN to svm_range_restore_pages() so
that the SVM range is restored correctly on non-4K page systems.

Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Donet Tom &lt;donettom@linux.ibm.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During a GPU page fault, the driver restores the SVM range and then maps it
into the GPU page tables. The current implementation passes a GPU-page-size
(4K-based) PFN to svm_range_restore_pages() to restore the range.

SVM ranges are tracked using system-page-size PFNs. On systems where the
system page size is larger than 4K, using GPU-page-size PFNs to restore the
range causes two problems:

Range lookup fails:
Because the restore function receives PFNs in GPU (4K) units, the SVM
range lookup does not find the existing range. This will result in a
duplicate SVM range being created.

VMA lookup failure:
The restore function also tries to locate the VMA for the faulting address.
It converts the GPU-page-size PFN into an address using the system page
size, which results in an incorrect address on non-4K page-size systems.
As a result, the VMA lookup fails with the message: "address 0xxxx VMA is
removed".

This patch passes the system-page-size PFN to svm_range_restore_pages() so
that the SVM range is restored correctly on non-4K page systems.

Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Donet Tom &lt;donettom@linux.ibm.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: prevent immediate PASID reuse case</title>
<updated>2026-03-23T18:10:39+00:00</updated>
<author>
<name>Eric Huang</name>
<email>jinhuieric.huang@amd.com</email>
</author>
<published>2026-03-16T15:01:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8f1de51f49be692de137c8525106e0fce2d1912d'/>
<id>8f1de51f49be692de137c8525106e0fce2d1912d</id>
<content type='text'>
PASID resue could cause interrupt issue when process
immediately runs into hw state left by previous
process exited with the same PASID, it's possible that
page faults are still pending in the IH ring buffer when
the process exits and frees up its PASID. To prevent the
case, it uses idr cyclic allocator same as kernel pid's.

Signed-off-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
PASID resue could cause interrupt issue when process
immediately runs into hw state left by previous
process exited with the same PASID, it's possible that
page faults are still pending in the IH ring buffer when
the process exits and frees up its PASID. To prevent the
case, it uses idr cyclic allocator same as kernel pid's.

Signed-off-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Move amdgpu_vm_is_bo_always_valid() before first use</title>
<updated>2026-03-17T21:47:47+00:00</updated>
<author>
<name>Srinivasan Shanmugam</name>
<email>srinivasan.shanmugam@amd.com</email>
</author>
<published>2026-03-12T15:04:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a1c1c77d30cd497ad6dd6062f8a5c06b38a45132'/>
<id>a1c1c77d30cd497ad6dd6062f8a5c06b38a45132</id>
<content type='text'>
Smatch reports that 'bo' could be NULL in amdgpu_vm_bo_update(), even
though amdgpu_vm_is_bo_always_valid() already checks for a NULL BO.

Move amdgpu_vm_is_bo_always_valid() earlier in the file so the helper
definition appears before its first use. This allows static analysis
tools to see the NULL check performed by the helper and avoids the
warning.

Suggested-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Cc: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Srinivasan Shanmugam &lt;srinivasan.shanmugam@amd.com&gt;
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Smatch reports that 'bo' could be NULL in amdgpu_vm_bo_update(), even
though amdgpu_vm_is_bo_always_valid() already checks for a NULL BO.

Move amdgpu_vm_is_bo_always_valid() earlier in the file so the helper
definition appears before its first use. This allows static analysis
tools to see the NULL check performed by the helper and avoids the
warning.

Suggested-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Cc: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Srinivasan Shanmugam &lt;srinivasan.shanmugam@amd.com&gt;
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: rework how we handle TLB fences</title>
<updated>2026-03-17T21:43:17+00:00</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2026-03-16T15:04:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=69c5fbd2b93b5ced77c6e79afe83371bca84c788'/>
<id>69c5fbd2b93b5ced77c6e79afe83371bca84c788</id>
<content type='text'>
Add a new VM flag to indicate whether or not we need
a TLB fence.  Userqs (KFD or KGD) require a TLB fence.
A TLB fence is not strictly required for kernel queues,
but it shouldn't hurt.  That said, enabling this
unconditionally should be fine, but it seems to tickle
some issues in KIQ/MES.  Only enable them for KFD,
or when KGD userq queues are enabled (currently via module
parameter).

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4798
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4749
Fixes: f3854e04b708 ("drm/amdgpu: attach tlb fence to the PTs update")
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Prike Liang &lt;Prike.Liang@amd.com&gt;
Reviewed-by: Prike Liang &lt;Prike.Liang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new VM flag to indicate whether or not we need
a TLB fence.  Userqs (KFD or KGD) require a TLB fence.
A TLB fence is not strictly required for kernel queues,
but it shouldn't hurt.  That said, enabling this
unconditionally should be fine, but it seems to tickle
some issues in KIQ/MES.  Only enable them for KFD,
or when KGD userq queues are enabled (currently via module
parameter).

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4798
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4749
Fixes: f3854e04b708 ("drm/amdgpu: attach tlb fence to the PTs update")
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Prike Liang &lt;Prike.Liang@amd.com&gt;
Reviewed-by: Prike Liang &lt;Prike.Liang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "drm/amdgpu: revert to old status lock handling v4"</title>
<updated>2026-03-17T14:28:47+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2026-03-13T17:41:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3fd20c149e3ef75f472fdf15da35846dc0dabe54'/>
<id>3fd20c149e3ef75f472fdf15da35846dc0dabe54</id>
<content type='text'>
This reverts commit 7a9419ab42699fd3d4c857ef81ae097d8d8d5899.

Reverting due to some of the probable issues caused by this change
and CI is blocked.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 7a9419ab42699fd3d4c857ef81ae097d8d8d5899.

Reverting due to some of the probable issues caused by this change
and CI is blocked.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'amd-drm-next-7.1-2026-03-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next</title>
<updated>2026-03-16T06:50:53+00:00</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2026-03-16T06:50:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=02e778f12359fde41830c42c70a737a1c1b0ce58'/>
<id>02e778f12359fde41830c42c70a737a1c1b0ce58</id>
<content type='text'>
amd-drm-next-7.1-2026-03-12:

amdgpu:
- SMU13 fix
- SMU14 fix
- Fixes for bring up hw testing
- Kerneldoc fix
- GC12 idle power fix for compute workloads
- DCCG fixes
- UserQ fixes
- Move test for fbdev object to a generic helper
- GC 12.1 updates
- Use struct drm_edid in non-DC code
- Include IP discovery data in devcoredump
- SMU 13.x updates
- Misc cleanups
- DML 2.1 fixes
- Enable NV12/P010 support on primary planes
- Enable color encoding and color range on overlay planes
- DC underflow fixes
- HWSS fast path fixes
- Replay fixes
- DCN 4.2 updates
- Support newer IP discovery tables
- LSDMA 7.1 support
- IH 7.1 fixes
- SoC v1 updates
- GC12.1 updates
- PSP 15 updates
- XGMI fixes
- GPUVM locking fix

amdkfd:
- Fix missing BO unreserve in an error path

radeon:
- Move test for fbdev object to a generic helper

From: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patch.msgid.link/20260312184425.3875669-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
amd-drm-next-7.1-2026-03-12:

amdgpu:
- SMU13 fix
- SMU14 fix
- Fixes for bring up hw testing
- Kerneldoc fix
- GC12 idle power fix for compute workloads
- DCCG fixes
- UserQ fixes
- Move test for fbdev object to a generic helper
- GC 12.1 updates
- Use struct drm_edid in non-DC code
- Include IP discovery data in devcoredump
- SMU 13.x updates
- Misc cleanups
- DML 2.1 fixes
- Enable NV12/P010 support on primary planes
- Enable color encoding and color range on overlay planes
- DC underflow fixes
- HWSS fast path fixes
- Replay fixes
- DCN 4.2 updates
- Support newer IP discovery tables
- LSDMA 7.1 support
- IH 7.1 fixes
- SoC v1 updates
- GC12.1 updates
- PSP 15 updates
- XGMI fixes
- GPUVM locking fix

amdkfd:
- Fix missing BO unreserve in an error path

radeon:
- Move test for fbdev object to a generic helper

From: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patch.msgid.link/20260312184425.3875669-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: revert to old status lock handling v4</title>
<updated>2026-03-11T17:58:08+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2026-01-20T12:09:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7a9419ab42699fd3d4c857ef81ae097d8d8d5899'/>
<id>7a9419ab42699fd3d4c857ef81ae097d8d8d5899</id>
<content type='text'>
It turned out that protecting the status of each bo_va with a
spinlock was just hiding problems instead of solving them.

Revert the whole approach, add a separate stats_lock and lockdep
assertions that the correct reservation lock is held all over the place.

This not only allows for better checks if a state transition is properly
protected by a lock, but also switching back to using list macros to
iterate over the state of lists protected by the dma_resv lock of the
root PD.

v2: re-add missing check
v3: split into two patches
v4: re-apply by fixing holding the VM lock at the right places.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It turned out that protecting the status of each bo_va with a
spinlock was just hiding problems instead of solving them.

Revert the whole approach, add a separate stats_lock and lockdep
assertions that the correct reservation lock is held all over the place.

This not only allows for better checks if a state transition is properly
protected by a lock, but also switching back to using list macros to
iterate over the state of lists protected by the dma_resv lock of the
root PD.

v2: re-add missing check
v3: split into two patches
v4: re-apply by fixing holding the VM lock at the right places.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
