<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/gpu/drm/amd/amdgpu, branch v7.1-rc4</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>drm/amdgpu/gfx_v12_0: set gfx.rs64_enable from PFP header on GFX12</title>
<updated>2026-05-11T21:54:44+00:00</updated>
<author>
<name>Jesse Zhang</name>
<email>Jesse.Zhang@amd.com</email>
</author>
<published>2026-04-03T07:58:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5d08559c910cc37673b965a0d4e8d004444d0332'/>
<id>5d08559c910cc37673b965a0d4e8d004444d0332</id>
<content type='text'>
gfx_v12_0_init_microcode() always loads RS64 CP ucode but never set
adev-&gt;gfx.rs64_enable, so it stayed false and code that branches on it
(e.g. MEC pipe reset) used the legacy CP_MEC_CNTL path incorrectly.

Match GFX11: derive RS64 mode from the PFP firmware header (v2.0) via
amdgpu_ucode_hdr_version(). Log at debug when RS64 is enabled.

Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit b03d53598b0d2048e8fa7303b8d0784768ec4fa6)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
gfx_v12_0_init_microcode() always loads RS64 CP ucode but never set
adev-&gt;gfx.rs64_enable, so it stayed false and code that branches on it
(e.g. MEC pipe reset) used the legacy CP_MEC_CNTL path incorrectly.

Match GFX11: derive RS64 mode from the PFP firmware header (v2.0) via
amdgpu_ucode_hdr_version(). Log at debug when RS64 is enabled.

Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit b03d53598b0d2048e8fa7303b8d0784768ec4fa6)
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amd/ras: Fix CPER ring debugfs read overflow</title>
<updated>2026-05-11T21:54:28+00:00</updated>
<author>
<name>Xiang Liu</name>
<email>xiang.liu@amd.com</email>
</author>
<published>2026-05-07T12:56:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6bbede02dc62a1021aeeae87ab243bd7a93c61d2'/>
<id>6bbede02dc62a1021aeeae87ab243bd7a93c61d2</id>
<content type='text'>
The legacy CPER debugfs reader can reach the payload path without a
valid pointer snapshot. The remaining user byte count is also treated as
the ring occupancy in dwords, so reads past the header can copy more than
requested.

Take the CPER lock before sampling pointers. Resample rptr/wptr for
payload reads, bound the payload copy by available dwords and the
remaining user size, and advance the file position for each dword copied.

Signed-off-by: Xiang Liu &lt;xiang.liu@amd.com&gt;
Reviewed-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 1e40ef87ffdc291e05ccdade8b9170cc9c1c4249)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The legacy CPER debugfs reader can reach the payload path without a
valid pointer snapshot. The remaining user byte count is also treated as
the ring occupancy in dwords, so reads past the header can copy more than
requested.

Take the CPER lock before sampling pointers. Resample rptr/wptr for
payload reads, bound the payload copy by available dwords and the
remaining user size, and advance the file position for each dword copied.

Signed-off-by: Xiang Liu &lt;xiang.liu@amd.com&gt;
Reviewed-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 1e40ef87ffdc291e05ccdade8b9170cc9c1c4249)
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: fix userq hang detection and reset</title>
<updated>2026-05-11T21:47:11+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2026-04-20T14:08:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0071e01c61aa73f22edd5252f0a3e2c2eff744d6'/>
<id>0071e01c61aa73f22edd5252f0a3e2c2eff744d6</id>
<content type='text'>
Fix lock inversions pointed out by Prike and Sunil. The hang detection
timeout *CAN'T* grab locks under which we wait for fences, especially
not the userq_mutex lock.

Then instead of this completely broken handling with the
hang_detect_fence just cancel the work when fences are processed and
re-start if necessary.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 1b62077f045ac6ffde7c97005c6659569ac5c1ec)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix lock inversions pointed out by Prike and Sunil. The hang detection
timeout *CAN'T* grab locks under which we wait for fences, especially
not the userq_mutex lock.

Then instead of this completely broken handling with the
hang_detect_fence just cancel the work when fences are processed and
re-start if necessary.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 1b62077f045ac6ffde7c97005c6659569ac5c1ec)
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: remove almost all calls to amdgpu_userq_detect_and_reset_queues</title>
<updated>2026-05-11T21:47:04+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2026-04-20T13:13:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d0053441ad7eaf7920b71e2263097ece53c5af34'/>
<id>d0053441ad7eaf7920b71e2263097ece53c5af34</id>
<content type='text'>
Well the reset handling seems broken on multiple levels.

As first step of fixing this remove most calls to the hang detection.
That function should only be called after we run into a timeout! And *NOT*
as random check spread over the code in multiple places.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 71bea36b54ccfb14cbc90f94267af6369af4e702)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Well the reset handling seems broken on multiple levels.

As first step of fixing this remove most calls to the hang detection.
That function should only be called after we run into a timeout! And *NOT*
as random check spread over the code in multiple places.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 71bea36b54ccfb14cbc90f94267af6369af4e702)
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: rework amdgpu_userq_signal_ioctl v3</title>
<updated>2026-05-11T21:46:43+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2026-04-16T13:32:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=44e5bc73bdcbec115cfe1c4b4394d25eb3deaff2'/>
<id>44e5bc73bdcbec115cfe1c4b4394d25eb3deaff2</id>
<content type='text'>
This one was fortunately not looking so bad as the wait ioctl path, but
there were still a few things which could be fixed/improved:

1. Allocating with GFP_ATOMIC was quite unnecessary, we can do that
   before taking the userq_lock.
2. Use a new mutex as protection for the fence_drv_xa so that we can do
   memory allocations while holding it.
3. Starting the reset timer is unnecessary when the fence is already
   signaled when we create it.
4. Cleanup error handling, avoid trying to free the queue when we don't
   even got one.

v2: fix incorrect usage of xa_find, destroy the new mutex on error
v3: cleanup ref ordering

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 1609eb0f81a609d350169839128cecf298c84e7a)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This one was fortunately not looking so bad as the wait ioctl path, but
there were still a few things which could be fixed/improved:

1. Allocating with GFP_ATOMIC was quite unnecessary, we can do that
   before taking the userq_lock.
2. Use a new mutex as protection for the fence_drv_xa so that we can do
   memory allocations while holding it.
3. Starting the reset timer is unnecessary when the fence is already
   signaled when we create it.
4. Cleanup error handling, avoid trying to free the queue when we don't
   even got one.

v2: fix incorrect usage of xa_find, destroy the new mutex on error
v3: cleanup ref ordering

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 1609eb0f81a609d350169839128cecf298c84e7a)
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: remove deadlocks from amdgpu_userq_pre_reset</title>
<updated>2026-05-11T21:46:34+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2026-04-20T18:18:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d5971c5c34303a00bf841a902ca00a703602c500'/>
<id>d5971c5c34303a00bf841a902ca00a703602c500</id>
<content type='text'>
The purpose of a GPU reset is to make sure that fence can be signaled
again and the signal and resume workers can make progress again.

So waiting for the resume worker or any fence in the GPU reset path is
just utterly nonsense.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Prike Liang &lt;Prike.Liang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit fcd5f065eab46993af43442fd77ee8d9eb9c5bdf)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The purpose of a GPU reset is to make sure that fence can be signaled
again and the signal and resume workers can make progress again.

So waiting for the resume worker or any fence in the GPU reset path is
just utterly nonsense.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Prike Liang &lt;Prike.Liang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit fcd5f065eab46993af43442fd77ee8d9eb9c5bdf)
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: nuke amdgpu_userq_fence_slab v2</title>
<updated>2026-05-05T14:23:06+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2025-10-13T13:26:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4e02e0afa95f691dc7cc17538cdd648089a843f0'/>
<id>4e02e0afa95f691dc7cc17538cdd648089a843f0</id>
<content type='text'>
As preparation for independent fences remove the extra slab, kmalloc
should do just fine.

v2: use GFP_KERNEL instead of GFP_ATOMIC

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Prike Liang &lt;Prike.Liang@amd.com&gt;
Reviewed-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 0d831487b5be0ae59cac865a0aa87b0acc3dc717)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As preparation for independent fences remove the extra slab, kmalloc
should do just fine.

v2: use GFP_KERNEL instead of GFP_ATOMIC

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Prike Liang &lt;Prike.Liang@amd.com&gt;
Reviewed-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 0d831487b5be0ae59cac865a0aa87b0acc3dc717)
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu/userq: fix access to stale wptr mapping</title>
<updated>2026-05-05T14:22:13+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2026-05-04T12:51:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6da7b1242da4455b11c24ce667d1cab1a348c8ea'/>
<id>6da7b1242da4455b11c24ce667d1cab1a348c8ea</id>
<content type='text'>
Use drm_exec to take both locks i.e vm root bo and
wptr_obj bo to access the mapping data properly.

This fixes the security issue of unmap the wptr_obj while
a queue creation is in progress and passing other
bo at same address.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 1fc6c8ab45dbee096469c08c13f6099d57a52d6c)
Cc: stable@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use drm_exec to take both locks i.e vm root bo and
wptr_obj bo to access the mapping data properly.

This fixes the security issue of unmap the wptr_obj while
a queue creation is in progress and passing other
bo at same address.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 1fc6c8ab45dbee096469c08c13f6099d57a52d6c)
Cc: stable@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: zero-initialize GART table on allocation</title>
<updated>2026-05-05T14:17:22+00:00</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2026-04-27T13:30:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e6c2e6c2e1fa066968a16aca1cb66cd1bdde7741'/>
<id>e6c2e6c2e1fa066968a16aca1cb66cd1bdde7741</id>
<content type='text'>
GART TLB is flushed after unmapping but not after mapping. Since
amdgpu_bo_create_kernel() does not zero-initialize the buffer, when a
single PTE is written the TLB may speculatively load other uninitialized
entries from the same cacheline. Those garbage entries can appear valid,
and a subsequent write to another PTE in the same cacheline may cause the
GPU to use a stale garbage PTE from the TLB.

Fix this by calling memset_io() to zero-initialize the GART table with
gart_pte_flags immediately after allocation.

Using AMDGPU_GEM_CREATE_VRAM_CLEARED, SDMA-based clear will not work
since SDMA needs GART to be initialized to work.

Suggested-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit d9af8263b82b6eaa60c5718e0c6631c5037e4b24)
Cc: stable@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
GART TLB is flushed after unmapping but not after mapping. Since
amdgpu_bo_create_kernel() does not zero-initialize the buffer, when a
single PTE is written the TLB may speculatively load other uninitialized
entries from the same cacheline. Those garbage entries can appear valid,
and a subsequent write to another PTE in the same cacheline may cause the
GPU to use a stale garbage PTE from the TLB.

Fix this by calling memset_io() to zero-initialize the GART table with
gart_pte_flags immediately after allocation.

Using AMDGPU_GEM_CREATE_VRAM_CLEARED, SDMA-based clear will not work
since SDMA needs GART to be initialized to work.

Suggested-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit d9af8263b82b6eaa60c5718e0c6631c5037e4b24)
Cc: stable@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu/sdma4: replace BUG_ON with WARN_ON in fence emission</title>
<updated>2026-05-05T14:16:09+00:00</updated>
<author>
<name>John B. Moore</name>
<email>jbmoore61@gmail.com</email>
</author>
<published>2026-04-27T21:06:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=78d2e624fa073c14970aa097adcf3ea31c157a66'/>
<id>78d2e624fa073c14970aa097adcf3ea31c157a66</id>
<content type='text'>
sdma_v4_0_ring_emit_fence() contains two BUG_ON(addr &amp; 0x3) assertions
that verify fence writeback addresses are dword-aligned.  These
assertions can be reached from unprivileged userspace via crafted
DRM_IOCTL_AMDGPU_CS submissions, causing a fatal kernel panic in a
scheduler worker thread.

Replace both BUG_ON() calls with WARN_ON() to log the condition without
crashing the kernel.  A misaligned fence address at this point indicates
a driver bug, but crashing the kernel is never the correct response when
the assertion is reachable from userspace.

The CS IOCTL path is the correct place to filter invalid submissions;
the ring emission callback is too late to do anything about it.

Fixes: 2130f89ced2c ("drm/amdgpu: add SDMA v4.0 implementation (v2)")
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: John B. Moore &lt;jbmoore61@gmail.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit b90250bd933afd1ba94d86d6b13821997b22b18e)
Cc: stable@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sdma_v4_0_ring_emit_fence() contains two BUG_ON(addr &amp; 0x3) assertions
that verify fence writeback addresses are dword-aligned.  These
assertions can be reached from unprivileged userspace via crafted
DRM_IOCTL_AMDGPU_CS submissions, causing a fatal kernel panic in a
scheduler worker thread.

Replace both BUG_ON() calls with WARN_ON() to log the condition without
crashing the kernel.  A misaligned fence address at this point indicates
a driver bug, but crashing the kernel is never the correct response when
the assertion is reachable from userspace.

The CS IOCTL path is the correct place to filter invalid submissions;
the ring emission callback is too late to do anything about it.

Fixes: 2130f89ced2c ("drm/amdgpu: add SDMA v4.0 implementation (v2)")
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: John B. Moore &lt;jbmoore61@gmail.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit b90250bd933afd1ba94d86d6b13821997b22b18e)
Cc: stable@vger.kernel.org
</pre>
</div>
</content>
</entry>
</feed>
