<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/gpu/drm/amd/amdkfd/kfd_debug.c, branch linux-6.11.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>drm/amdkfd: fix the kdf debugger issue</title>
<updated>2024-06-05T15:25:00+00:00</updated>
<author>
<name>Jesse Zhang</name>
<email>jesse.zhang@amd.com</email>
</author>
<published>2024-05-31T01:56:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8178cfb0b48b122dd72ba6ffc2251926f62a0002'/>
<id>8178cfb0b48b122dd72ba6ffc2251926f62a0002</id>
<content type='text'>
The expression caps | HSA_CAP_TRAP_DEBUG_PRECISE_MEMORY_OPERATIONS_SUPPORTED
and  caps | HSA_CAP_TRAP_DEBUG_PRECISE_ALU_OPERATIONS_SUPPORTED
are always 1/true regardless of the values of its operand.

Fixes: 9243240bed38 ("drm/amdkfd: enable single alu ops for gfx12")
Signed-off-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Suggested-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The expression caps | HSA_CAP_TRAP_DEBUG_PRECISE_MEMORY_OPERATIONS_SUPPORTED
and  caps | HSA_CAP_TRAP_DEBUG_PRECISE_ALU_OPERATIONS_SUPPORTED
are always 1/true regardless of the values of its operand.

Fixes: 9243240bed38 ("drm/amdkfd: enable single alu ops for gfx12")
Signed-off-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Suggested-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Check debug trap enable before write dbg_ev_file</title>
<updated>2024-05-08T19:17:05+00:00</updated>
<author>
<name>Lin.Cao</name>
<email>lincao12@amd.com</email>
</author>
<published>2024-04-24T03:27:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=547033b593063eb85bfdf9b25a5f1b8fd1911be2'/>
<id>547033b593063eb85bfdf9b25a5f1b8fd1911be2</id>
<content type='text'>
In interrupt context, write dbg_ev_file will be run by work queue. It
will cause write dbg_ev_file execution after debug_trap_disable, which
will cause NULL pointer access.
v2: cancel work "debug_event_workarea" before set dbg_ev_file as NULL.

Signed-off-by: Lin.Cao &lt;lincao12@amd.com&gt;
Reviewed-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In interrupt context, write dbg_ev_file will be run by work queue. It
will cause write dbg_ev_file execution after debug_trap_disable, which
will cause NULL pointer access.
v2: cancel work "debug_event_workarea" before set dbg_ev_file as NULL.

Signed-off-by: Lin.Cao &lt;lincao12@amd.com&gt;
Reviewed-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: enable single alu ops for gfx12</title>
<updated>2024-05-02T20:18:13+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-08-21T15:47:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9243240bed3859fba2d15c016902a4c73a186249'/>
<id>9243240bed3859fba2d15c016902a4c73a186249</id>
<content type='text'>
GFX12 debugging requires setting up precise ALU operation for catching
ALU exceptions.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Tested-by: Lancelot Six &lt;lancelot.six@amd.com&gt;
Reviewed-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
GFX12 debugging requires setting up precise ALU operation for catching
ALU exceptions.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Tested-by: Lancelot Six &lt;lancelot.six@amd.com&gt;
Reviewed-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix variable dereferenced before NULL check in 'kfd_dbg_trap_device_snapshot()'</title>
<updated>2024-01-15T23:35:37+00:00</updated>
<author>
<name>Srinivasan Shanmugam</name>
<email>srinivasan.shanmugam@amd.com</email>
</author>
<published>2024-01-05T05:57:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=81d4b970684554dbd8faff90fbcd600b86847a68'/>
<id>81d4b970684554dbd8faff90fbcd600b86847a68</id>
<content type='text'>
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_debug.c:1024 kfd_dbg_trap_device_snapshot() warn: variable dereferenced before check 'entry_size' (see line 1021)

Cc: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Srinivasan Shanmugam &lt;srinivasan.shanmugam@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_debug.c:1024 kfd_dbg_trap_device_snapshot() warn: variable dereferenced before check 'entry_size' (see line 1021)

Cc: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Srinivasan Shanmugam &lt;srinivasan.shanmugam@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: fix and enable ttmp setup for gfx11</title>
<updated>2023-07-27T18:59:29+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-06-12T15:31:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fc7f1d9697bcd3f7a4c64fb00a0e9bb050eaaa2a'/>
<id>fc7f1d9697bcd3f7a4c64fb00a0e9bb050eaaa2a</id>
<content type='text'>
The MES cached process context must be cleared on adding any queue for
the first time.

For proper debug support, the MES will clear it's cached process context
on the first call to SET_SHADER_DEBUGGER.

This allows TTMPs to be pesistently enabled in a safe manner.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Eric Huang &lt;jinhuieric@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The MES cached process context must be cleared on adding any queue for
the first time.

For proper debug support, the MES will clear it's cached process context
on the first call to SET_SHADER_DEBUGGER.

This allows TTMPs to be pesistently enabled in a safe manner.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Eric Huang &lt;jinhuieric@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: enable cooperative groups for gfx11</title>
<updated>2023-07-25T17:35:43+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-07-12T20:58:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7a1c5c6753858cbbf0b073eaa9b53d8f56ee0927'/>
<id>7a1c5c6753858cbbf0b073eaa9b53d8f56ee0927</id>
<content type='text'>
MES can concurrently schedule queues on the device that require
exclusive device access if marked exclusively_scheduled without the
requirement of GWS.  Similar to the F32 HWS, MES will manage
quality of service for these queues.
Use this for cooperative groups since cooperative groups are device
occupancy limited.

Since some GFX11 devices can only be debugged with partial CUs, do not
allow the debugging of cooperative groups on these devices as the CU
occupancy limit will change on attach.

In addition, zero initialize the MES add queue submission vector for MES
initialization tests as we do not want these to be cooperative
dispatches.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
MES can concurrently schedule queues on the device that require
exclusive device access if marked exclusively_scheduled without the
requirement of GWS.  Similar to the F32 HWS, MES will manage
quality of service for these queues.
Use this for cooperative groups since cooperative groups are device
occupancy limited.

Since some GFX11 devices can only be debugged with partial CUs, do not
allow the debugging of cooperative groups on these devices as the CU
occupancy limit will change on attach.

In addition, zero initialize the MES add queue submission vector for MES
initialization tests as we do not want these to be cooperative
dispatches.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: fix trap handling work around for debugging</title>
<updated>2023-07-21T20:52:25+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-07-12T20:32:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cef600e1fd63c338bfe19ee3e1adeff8801ba14d'/>
<id>cef600e1fd63c338bfe19ee3e1adeff8801ba14d</id>
<content type='text'>
Update the list of devices that require the cwsr trap handling
workaround for debugging use cases.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Acked-by: Ruili Ji &lt;ruili.ji@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Update the list of devices that require the cwsr trap handling
workaround for debugging use cases.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Acked-by: Ruili Ji &lt;ruili.ji@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: enable watch points globally for gfx943</title>
<updated>2023-07-12T15:12:08+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-03-08T19:44:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7a93cc579c1e63d956d9ec124100a36d9798ffe8'/>
<id>7a93cc579c1e63d956d9ec124100a36d9798ffe8</id>
<content type='text'>
Set watch points for all xcc instances on GFX943.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Reviewed-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Set watch points for all xcc instances on GFX943.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Reviewed-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: add kfd2kgd debugger callbacks for GC v9.4.3</title>
<updated>2023-07-12T14:58:01+00:00</updated>
<author>
<name>Eric Huang</name>
<email>jinhuieric.huang@amd.com</email>
</author>
<published>2023-07-08T00:02:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=036e348fdccf74db83f18c466123553e12bd35b9'/>
<id>036e348fdccf74db83f18c466123553e12bd35b9</id>
<content type='text'>
Implement the similarities as GC v9.4.2, and the difference
for GC v9.4.3 HW spec, i.e. xcc instance.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Signed-off-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Reviewed-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement the similarities as GC v9.4.2, and the difference
for GC v9.4.3 HW spec, i.e. xcc instance.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Signed-off-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Reviewed-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: fix null queue check on debug setting exceptions</title>
<updated>2023-06-15T14:46:05+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-06-12T15:42:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8f7bd7010dd5bca920e9d3c0c040622b2e834b57'/>
<id>8f7bd7010dd5bca920e9d3c0c040622b2e834b57</id>
<content type='text'>
Null check should be done on queue struct itself and not on the
process queue list node.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Null check should be done on queue struct itself and not on the
process queue list node.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
