<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/gpu/drm/amd/amdkfd/kfd_debug.c, branch v6.5</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>drm/amdkfd: fix trap handling work around for debugging</title>
<updated>2023-07-25T20:13:58+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-07-12T20:32:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=602816c3ee3fdbf4a72cb4d2e5b8b756a5104b0f'/>
<id>602816c3ee3fdbf4a72cb4d2e5b8b756a5104b0f</id>
<content type='text'>
Update the list of devices that require the cwsr trap handling
workaround for debugging use cases.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Acked-by: Ruili Ji &lt;ruili.ji@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Update the list of devices that require the cwsr trap handling
workaround for debugging use cases.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Acked-by: Ruili Ji &lt;ruili.ji@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: fix null queue check on debug setting exceptions</title>
<updated>2023-06-15T14:46:05+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-06-12T15:42:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8f7bd7010dd5bca920e9d3c0c040622b2e834b57'/>
<id>8f7bd7010dd5bca920e9d3c0c040622b2e834b57</id>
<content type='text'>
Null check should be done on queue struct itself and not on the
process queue list node.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Null check should be done on queue struct itself and not on the
process queue list node.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: fix and enable debugging for gfx11</title>
<updated>2023-06-09T16:48:19+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-05-23T15:57:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=09d49e14ea6fd125a21f89b80f888c09be32a174'/>
<id>09d49e14ea6fd125a21f89b80f888c09be32a174</id>
<content type='text'>
There are a couple of fixes required to enable gfx11 debugging.

First, ADD_QUEUE.trap_en is an inappropriate place to toggle
a per-process register so move it to SET_SHADER_DEBUGGER.trap_en.
When ADD_QUEUE.skip_process_ctx_clear is set, MES will prioritize
the SET_SHADER_DEBUGGER.trap_en setting.

Second, to preserve correct save/restore priviledged wave states
in coordination with the trap enablement setting, resume suspended
waves early in the disable call.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are a couple of fixes required to enable gfx11 debugging.

First, ADD_QUEUE.trap_en is an inappropriate place to toggle
a per-process register so move it to SET_SHADER_DEBUGGER.trap_en.
When ADD_QUEUE.skip_process_ctx_clear is set, MES will prioritize
the SET_SHADER_DEBUGGER.trap_en setting.

Second, to preserve correct save/restore priviledged wave states
in coordination with the trap enablement setting, resume suspended
waves early in the disable call.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: add debug device snapshot operation</title>
<updated>2023-06-09T16:37:00+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2022-05-10T16:47:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=12976e6a5ab8fc3766c0304d72f7eec81a109b55'/>
<id>12976e6a5ab8fc3766c0304d72f7eec81a109b55</id>
<content type='text'>
Similar to queue snapshot, return an array of device information using
an entry_size check and return.
Unlike queue snapshots, the debugger needs to pass to correct number of
devices that exist.  If it fails to do so, the KFD will return the
number of actual devices so that the debugger can make a subsequent
successful call.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similar to queue snapshot, return an array of device information using
an entry_size check and return.
Unlike queue snapshots, the debugger needs to pass to correct number of
devices that exist.  If it fails to do so, the KFD will return the
number of actual devices so that the debugger can make a subsequent
successful call.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: add debug query exception info operation</title>
<updated>2023-06-09T16:36:53+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2022-05-09T17:37:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2b36de971d25daa2ad287114ae3ca11a8f8d49d7'/>
<id>2b36de971d25daa2ad287114ae3ca11a8f8d49d7</id>
<content type='text'>
Allow the debugger to query additional info based on an exception code.
For device exceptions, it's currently only memory violation information.
For process exceptions, it's currently only runtime information.
Queue exception only report the queue exception status.

The debugger has the option of clearing the target exception on query.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow the debugger to query additional info based on an exception code.
For device exceptions, it's currently only memory violation information.
For process exceptions, it's currently only runtime information.
Queue exception only report the queue exception status.

The debugger has the option of clearing the target exception on query.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: add debug query event operation</title>
<updated>2023-06-09T16:36:51+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2022-05-09T15:10:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5bc20c224bcb863571e8831cdbba23cd61b10ac3'/>
<id>5bc20c224bcb863571e8831cdbba23cd61b10ac3</id>
<content type='text'>
Allow the debugger to query a single queue, device and process
exception.
The KFD should also return the GPU or Queue id of the exception.
The debugger also has the option of clearing exceptions after
being queried.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow the debugger to query a single queue, device and process
exception.
The KFD should also return the GPU or Queue id of the exception.
The debugger also has the option of clearing exceptions after
being queried.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: add debug set flags operation</title>
<updated>2023-06-09T16:36:48+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2022-05-09T14:51:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=103d5f08ff42b666c61c350be2c3e724c1646918'/>
<id>103d5f08ff42b666c61c350be2c3e724c1646918</id>
<content type='text'>
Allow the debugger to set single memory and single ALU operations.

Some exceptions are imprecise (memory violations, address watch) in the
sense that a trap occurs only when the exception interrupt occurs and
not at the non-halting faulty instruction.  Trap temporaries 0 &amp; 1 save
the program counter address, which means that these values will not point
to the faulty instruction address but to whenever the interrupt was
raised.

Setting the Single Memory Operations flag will inject an automatic wait
on every memory operation instruction forcing imprecise memory exceptions
to become precise at the cost of performance.  This setting is not
permitted on debug devices that support only a global setting of this
option.

Return the previous set flags to the debugger as well.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow the debugger to set single memory and single ALU operations.

Some exceptions are imprecise (memory violations, address watch) in the
sense that a trap occurs only when the exception interrupt occurs and
not at the non-halting faulty instruction.  Trap temporaries 0 &amp; 1 save
the program counter address, which means that these values will not point
to the faulty instruction address but to whenever the interrupt was
raised.

Setting the Single Memory Operations flag will inject an automatic wait
on every memory operation instruction forcing imprecise memory exceptions
to become precise at the cost of performance.  This setting is not
permitted on debug devices that support only a global setting of this
option.

Return the previous set flags to the debugger as well.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: add debug set and clear address watch points operation</title>
<updated>2023-06-09T16:36:46+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2022-05-06T18:58:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e0f85f4690d089cc1a60337decafb1acf7eec45e'/>
<id>e0f85f4690d089cc1a60337decafb1acf7eec45e</id>
<content type='text'>
Shader read, write and atomic memory operations can be alerted to the
debugger as an address watch exception.

Allow the debugger to pass in a watch point to a particular memory
address per device.

Note that there exists only 4 watch points per devices to date, so have
the KFD keep track of what watch points are allocated or not.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Shader read, write and atomic memory operations can be alerted to the
debugger as an address watch exception.

Allow the debugger to pass in a watch point to a particular memory
address per device.

Note that there exists only 4 watch points per devices to date, so have
the KFD keep track of what watch points are allocated or not.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: add debug suspend and resume process queues operation</title>
<updated>2023-06-09T16:36:43+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2022-05-05T20:15:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a70a93fa568b4f05aba548dadb673703eccf5480'/>
<id>a70a93fa568b4f05aba548dadb673703eccf5480</id>
<content type='text'>
In order to inspect waves from the saved context at any point during a
debug session, the debugger must be able to preempt queues to trigger
context save by suspending them.

On queue suspend, the KFD will copy the context save header information
so that the debugger can correctly crawl the appropriate size of the saved
context. The debugger must then also be allowed to resume suspended queues.

A queue that is newly created cannot be suspended because queue ids are
recycled after destruction so the debugger needs to know that this has
occurred.  Query functions will be later added that will clear a given
queue of its new queue status.

A queue cannot be destroyed while it is suspended to preserve its saved
context during debugger inspection.  Have queue destruction block while
a queue is suspended and unblocked when it is resumed.  Likewise, if a
queue is about to be destroyed, it cannot be suspended.

Return the number of queues successfully suspended or resumed along with
a per queue status array where the upper bits per queue status show that
the request was invalid (new/destroyed queue suspend request, missing
queue) or an error occurred (HWS in a fatal state so it can't suspend or
resume queues).

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to inspect waves from the saved context at any point during a
debug session, the debugger must be able to preempt queues to trigger
context save by suspending them.

On queue suspend, the KFD will copy the context save header information
so that the debugger can correctly crawl the appropriate size of the saved
context. The debugger must then also be allowed to resume suspended queues.

A queue that is newly created cannot be suspended because queue ids are
recycled after destruction so the debugger needs to know that this has
occurred.  Query functions will be later added that will clear a given
queue of its new queue status.

A queue cannot be destroyed while it is suspended to preserve its saved
context during debugger inspection.  Have queue destruction block while
a queue is suspended and unblocked when it is resumed.  Likewise, if a
queue is about to be destroyed, it cannot be suspended.

Return the number of queues successfully suspended or resumed along with
a per queue status array where the upper bits per queue status show that
the request was invalid (new/destroyed queue suspend request, missing
queue) or an error occurred (HWS in a fatal state so it can't suspend or
resume queues).

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: add debug wave launch mode operation</title>
<updated>2023-06-09T16:36:39+00:00</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2022-05-02T15:45:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=aea1b4738bebd8092bd437ce0b03aa9587fc20a7'/>
<id>aea1b4738bebd8092bd437ce0b03aa9587fc20a7</id>
<content type='text'>
Allow the debugger to set wave behaviour on to either normally operate,
halt at launch, trap on every instruction, terminate immediately or
stall on allocation.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow the debugger to set wave behaviour on to either normally operate,
halt at launch, trap on every instruction, terminate immediately or
stall on allocation.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
