<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c, branch v6.16</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>drm/amdgpu: Fix SDMA engine reset with logical instance ID</title>
<updated>2025-06-18T17:10:44+00:00</updated>
<author>
<name>Jesse Zhang</name>
<email>jesse.zhang@amd.com</email>
</author>
<published>2025-06-11T07:02:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=09b585592fa481384597c81388733aed4a04dd05'/>
<id>09b585592fa481384597c81388733aed4a04dd05</id>
<content type='text'>
This commit makes the following improvements to SDMA engine reset handling:

1. Clarifies in the function documentation that instance_id refers to a logical ID
2. Adds conversion from logical to physical instance ID before performing reset
   using GET_INST(SDMA0, instance_id) macro
3. Improves error messaging to indicate when a logical instance reset fails
4. Adds better code organization with blank lines for readability

The change ensures proper SDMA engine reset by using the correct physical
instance ID while maintaining the logical ID interface for callers.

V2: Remove harvest_config check and convert directly to physical instance (Lijo)

Suggested-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 5efa6217c239ed1ceec0f0414f9b6f6927387dfc)
Cc: stable@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit makes the following improvements to SDMA engine reset handling:

1. Clarifies in the function documentation that instance_id refers to a logical ID
2. Adds conversion from logical to physical instance ID before performing reset
   using GET_INST(SDMA0, instance_id) macro
3. Improves error messaging to indicate when a logical instance reset fails
4. Adds better code organization with blank lines for readability

The change ensures proper SDMA engine reset by using the correct physical
instance ID while maintaining the logical ID interface for callers.

V2: Remove harvest_config check and convert directly to physical instance (Lijo)

Suggested-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;Jesse.Zhang@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 5efa6217c239ed1ceec0f0414f9b6f6927387dfc)
Cc: stable@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu:remove old sdma reset callback mechanism</title>
<updated>2025-04-21T14:57:22+00:00</updated>
<author>
<name>Jesse.zhang@amd.com</name>
<email>Jesse.zhang@amd.com</email>
</author>
<published>2025-04-11T07:30:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2200b41428ee8d2298ec9d3fd2a22ba2d006096a'/>
<id>2200b41428ee8d2298ec9d3fd2a22ba2d006096a</id>
<content type='text'>
This patch removes the deprecated SDMA reset callback mechanism, which was previously used to register pre-reset and post-reset callbacks for SDMA engine resets.
 The callback mechanism has been replaced with a more direct and efficient approach using `stop_queue` and `start_queue` functions in the ring's function table.

The SDMA reset callback mechanism allowed KFD and AMDGPU to register pre-reset and post-reset functions for handling SDMA engine resets.
However, this approach added unnecessary complexity and was no longer needed after the introduction of the `stop_queue` and `start_queue` functions in the ring's function table.

1. **Remove Callback Mechanism**:
   - Removed the `amdgpu_sdma_register_on_reset_callbacks` function and its associated data structures (`sdma_on_reset_funcs`).
   - Removed the callback registration logic from the SDMA v4.4.2 initialization code.

2. **Clean Up Related Code**:
   - Removed the `sdma_v4_4_2_set_engine_reset_funcs` function, which was used to register the callbacks.
   - Removed the `sdma_v4_4_2_engine_reset_funcs` structure, which contained the pre-reset and post-reset callback functions.

Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch removes the deprecated SDMA reset callback mechanism, which was previously used to register pre-reset and post-reset callbacks for SDMA engine resets.
 The callback mechanism has been replaced with a more direct and efficient approach using `stop_queue` and `start_queue` functions in the ring's function table.

The SDMA reset callback mechanism allowed KFD and AMDGPU to register pre-reset and post-reset functions for handling SDMA engine resets.
However, this approach added unnecessary complexity and was no longer needed after the introduction of the `stop_queue` and `start_queue` functions in the ring's function table.

1. **Remove Callback Mechanism**:
   - Removed the `amdgpu_sdma_register_on_reset_callbacks` function and its associated data structures (`sdma_on_reset_funcs`).
   - Removed the callback registration logic from the SDMA v4.4.2 initialization code.

2. **Clean Up Related Code**:
   - Removed the `sdma_v4_4_2_set_engine_reset_funcs` function, which was used to register the callbacks.
   - Removed the `sdma_v4_4_2_engine_reset_funcs` structure, which contained the pre-reset and post-reset callback functions.

Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Implement SDMA soft reset directly for v5.x</title>
<updated>2025-04-21T14:56:54+00:00</updated>
<author>
<name>Jesse.zhang@amd.com</name>
<email>Jesse.zhang@amd.com</email>
</author>
<published>2025-04-11T07:26:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5c3e7c49538e2ddad10296a318c225bbb3d37d20'/>
<id>5c3e7c49538e2ddad10296a318c225bbb3d37d20</id>
<content type='text'>
This patch introduces a new function `amdgpu_sdma_soft_reset` to handle SDMA soft resets directly,
rather than relying on the DPM interface.

1. **New `amdgpu_sdma_soft_reset` Function**:
   - Implements a soft reset for SDMA engines by directly writing to the hardware registers.
   - Handles SDMA versions 4.x and 5.x separately:
     - For SDMA 4.x, the existing `amdgpu_dpm_reset_sdma` function is used for backward compatibility.
     - For SDMA 5.x, the driver directly manipulates the `GRBM_SOFT_RESET` register to reset the specified SDMA instance.

2. **Integration into `amdgpu_sdma_reset_engine`**:
   - The `amdgpu_sdma_soft_reset` function is called during the SDMA reset process, replacing the previous call to `amdgpu_dpm_reset_sdma`.

v2: r should default to an error (Alex)

Suggested-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch introduces a new function `amdgpu_sdma_soft_reset` to handle SDMA soft resets directly,
rather than relying on the DPM interface.

1. **New `amdgpu_sdma_soft_reset` Function**:
   - Implements a soft reset for SDMA engines by directly writing to the hardware registers.
   - Handles SDMA versions 4.x and 5.x separately:
     - For SDMA 4.x, the existing `amdgpu_dpm_reset_sdma` function is used for backward compatibility.
     - For SDMA 5.x, the driver directly manipulates the `GRBM_SOFT_RESET` register to reset the specified SDMA instance.

2. **Integration into `amdgpu_sdma_reset_engine`**:
   - The `amdgpu_sdma_soft_reset` function is called during the SDMA reset process, replacing the previous call to `amdgpu_dpm_reset_sdma`.

v2: r should default to an error (Alex)

Suggested-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: switch amdgpu_sdma_reset_engine to use the new sdma function pointers</title>
<updated>2025-04-21T14:56:49+00:00</updated>
<author>
<name>Jesse.zhang@amd.com</name>
<email>Jesse.zhang@amd.com</email>
</author>
<published>2025-04-11T06:48:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b22659d5d3520c82295981797e75de525365a411'/>
<id>b22659d5d3520c82295981797e75de525365a411</id>
<content type='text'>
Replace old callback mechanism with direct calls to stop/start functions.

Suggested-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace old callback mechanism with direct calls to stop/start functions.

Suggested-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: remove is_mes_queue flag</title>
<updated>2025-04-08T20:48:21+00:00</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2025-03-12T17:47:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4220d2c7c41b8ea3fd154dc5678b05575653cba0'/>
<id>4220d2c7c41b8ea3fd154dc5678b05575653cba0</id>
<content type='text'>
This was leftover from MES bring up when we had MES
user queues in the kernel.  It's no longer used so
remove it.

Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was leftover from MES bring up when we had MES
user queues in the kernel.  It's no longer used so
remove it.

Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix the comment to avoid warning</title>
<updated>2025-04-07T19:18:59+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2025-04-03T07:27:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=89dab189a20e8ab653b37059536f2d5d73666a9e'/>
<id>89dab189a20e8ab653b37059536f2d5d73666a9e</id>
<content type='text'>
Fix the below comment warning
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:541:
warning: Function parameter or struct member 'adev'
not described in 'amdgpu_sdma_register_on_reset_callbacks'

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix the below comment warning
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:541:
warning: Function parameter or struct member 'adev'
not described in 'amdgpu_sdma_register_on_reset_callbacks'

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush</title>
<updated>2025-03-21T16:16:35+00:00</updated>
<author>
<name>Jesse.zhang@amd.com</name>
<email>Jesse.zhang@amd.com</email>
</author>
<published>2025-02-25T07:25:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b09cdeb4d38872b84c6d59878915eae2adbe9d2b'/>
<id>b09cdeb4d38872b84c6d59878915eae2adbe9d2b</id>
<content type='text'>
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
  SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead of
  allocating a separate engine. This change ensures efficient resource management and
  avoids the issue of insufficient VM invalidation engines.

- Add synchronization for GPU TLB flush operations in gmc_v9_0.c.
  Use spin_lock and spin_unlock to ensure thread safety and prevent race conditions
  during TLB flush operations. This improves the stability and reliability of the driver,
  especially in multi-threaded environments.

 v2: replace the sdma ring check with a function `amdgpu_sdma_is_page_queue`
 to check if a ring is an SDMA page queue.(Lijo)

 v3: Add GC version check, only enabled on GC9.4.3/9.4.4/9.5.0
 v4: Fix code style and add more detailed description (Christian)
 v5: Remove dependency on vm_inv_eng loop order, explicitly lookup shared inv_eng(Christian/Lijo)
 v6: Added search shared ring function amdgpu_sdma_get_shared_ring (Lijo)

Suggested-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
  SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead of
  allocating a separate engine. This change ensures efficient resource management and
  avoids the issue of insufficient VM invalidation engines.

- Add synchronization for GPU TLB flush operations in gmc_v9_0.c.
  Use spin_lock and spin_unlock to ensure thread safety and prevent race conditions
  during TLB flush operations. This improves the stability and reliability of the driver,
  especially in multi-threaded environments.

 v2: replace the sdma ring check with a function `amdgpu_sdma_is_page_queue`
 to check if a ring is an SDMA page queue.(Lijo)

 v3: Add GC version check, only enabled on GC9.4.3/9.4.4/9.5.0
 v4: Fix code style and add more detailed description (Christian)
 v5: Remove dependency on vm_inv_eng loop order, explicitly lookup shared inv_eng(Christian/Lijo)
 v6: Added search shared ring function amdgpu_sdma_get_shared_ring (Lijo)

Suggested-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu/sdma: fix engine reset handling</title>
<updated>2025-03-21T16:16:34+00:00</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2025-03-14T23:23:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e02fcf73081b7fc941aa6de007b0239e67688e15'/>
<id>e02fcf73081b7fc941aa6de007b0239e67688e15</id>
<content type='text'>
Move the kfd suspend/resume code into the caller.  That
is where the KFD is likely to detect a reset so on the KFD
side there is no need to call them.  Also add a mutex to
lock the actual reset sequence.

v2: make the locking per instance

Fixes: bac38ca8c475 ("drm/amdkfd: implement per queue sdma reset for gfx 9.4+")
Reviewed-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move the kfd suspend/resume code into the caller.  That
is where the KFD is likely to detect a reset so on the KFD
side there is no need to call them.  Also add a mutex to
lock the actual reset sequence.

v2: make the locking per instance

Fixes: bac38ca8c475 ("drm/amdkfd: implement per queue sdma reset for gfx 9.4+")
Reviewed-by: Jesse Zhang &lt;jesse.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix SDMA engine reset logic</title>
<updated>2025-03-18T18:03:47+00:00</updated>
<author>
<name>Jesse.zhang@amd.com</name>
<email>Jesse.zhang@amd.com</email>
</author>
<published>2025-03-17T01:14:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cc63bcfd14a664a7ea78fd3c9d0014116b7e4619'/>
<id>cc63bcfd14a664a7ea78fd3c9d0014116b7e4619</id>
<content type='text'>
The scheduler should restart only if the reset operation
succeeds  This ensures that new tasks are only submitted
to the queues after a successful reset.

Fixes: 4c02f7301657 ("drm/amdgpu: Introduce conditional user queue suspension for SDMA resets")
Suggested-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Jesse.Zhang &lt;Jesse.zhang@amd.com&gt;
Reviewed-by: Tim Huang &lt;tim.huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The scheduler should restart only if the reset operation
succeeds  This ensures that new tasks are only submitted
to the queues after a successful reset.

Fixes: 4c02f7301657 ("drm/amdgpu: Introduce conditional user queue suspension for SDMA resets")
Suggested-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Jesse.Zhang &lt;Jesse.zhang@amd.com&gt;
Reviewed-by: Tim Huang &lt;tim.huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: fix inconsistent indenting warning</title>
<updated>2025-03-07T17:53:06+00:00</updated>
<author>
<name>Charles Han</name>
<email>hanchunchao@inspur.com</email>
</author>
<published>2025-03-05T10:40:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=571d36837c84707ea36fa37ab1373a124e328ed4'/>
<id>571d36837c84707ea36fa37ab1373a124e328ed4</id>
<content type='text'>
Fix below inconsistent indenting smatch warning.
smatch warnings:
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:582 amdgpu_sdma_reset_engine() warn: inconsistent indenting

Signed-off-by: Charles Han &lt;hanchunchao@inspur.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix below inconsistent indenting smatch warning.
smatch warnings:
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:582 amdgpu_sdma_reset_engine() warn: inconsistent indenting

Signed-off-by: Charles Han &lt;hanchunchao@inspur.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
