<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c, branch v6.9-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>drm/amdgpu: Handle duplicate BOs during process restore</title>
<updated>2024-03-20T17:12:56+00:00</updated>
<author>
<name>Mukul Joshi</name>
<email>mukul.joshi@amd.com</email>
</author>
<published>2024-03-08T16:11:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=71b9d19220dae4b69f03acd900498b23eeeaf000'/>
<id>71b9d19220dae4b69f03acd900498b23eeeaf000</id>
<content type='text'>
In certain situations, some apps can import a BO multiple times
(through IPC for example). To restore such processes successfully,
we need to tell drm to ignore duplicate BOs.
While at it, also add additional logging to prevent silent failures
when process restore fails.

Signed-off-by: Mukul Joshi &lt;mukul.joshi@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In certain situations, some apps can import a BO multiple times
(through IPC for example). To restore such processes successfully,
we need to tell drm to ignore duplicate BOs.
While at it, also add additional logging to prevent silent failures
when process restore fails.

Signed-off-by: Mukul Joshi &lt;mukul.joshi@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>amd/amdkfd: remove unused parameter</title>
<updated>2024-02-28T22:10:53+00:00</updated>
<author>
<name>Eric Huang</name>
<email>jinhuieric.huang@amd.com</email>
</author>
<published>2024-02-28T14:59:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1761d9a688ba60a6428a648658bd9c72d493019e'/>
<id>1761d9a688ba60a6428a648658bd9c72d493019e</id>
<content type='text'>
The adev can be found from bo by amdgpu_ttm_adev(bo-&gt;tbo.bdev),
and adev is also not used in the function
amdgpu_amdkfd_map_gtt_bo_to_gart().

Signed-off-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Reviewed-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The adev can be found from bo by amdgpu_ttm_adev(bo-&gt;tbo.bdev),
and adev is also not used in the function
amdgpu_amdkfd_map_gtt_bo_to_gart().

Signed-off-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Reviewed-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: reserve the BO before validating it</title>
<updated>2024-01-31T19:05:19+00:00</updated>
<author>
<name>Lang Yu</name>
<email>Lang.Yu@amd.com</email>
</author>
<published>2024-01-11T04:27:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0c93bd49576677ae1a18817d5ec000ef031d5187'/>
<id>0c93bd49576677ae1a18817d5ec000ef031d5187</id>
<content type='text'>
Fix a warning.

v2: Avoid unmapping attachment repeatedly when ERESTARTSYS.

v3: Lock the BO before accessing ttm-&gt;sg to avoid race conditions.(Felix)

[   41.708711] WARNING: CPU: 0 PID: 1463 at drivers/gpu/drm/ttm/ttm_bo.c:846 ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.708989] Call Trace:
[   41.708992]  &lt;TASK&gt;
[   41.708996]  ? show_regs+0x6c/0x80
[   41.709000]  ? ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.709008]  ? __warn+0x93/0x190
[   41.709014]  ? ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.709024]  ? report_bug+0x1f9/0x210
[   41.709035]  ? handle_bug+0x46/0x80
[   41.709041]  ? exc_invalid_op+0x1d/0x80
[   41.709048]  ? asm_exc_invalid_op+0x1f/0x30
[   41.709057]  ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu]
[   41.709185]  ? ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.709197]  ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu]
[   41.709337]  ? srso_alias_return_thunk+0x5/0x7f
[   41.709346]  kfd_mem_dmaunmap_attachment+0x9e/0x1e0 [amdgpu]
[   41.709467]  amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x56/0x80 [amdgpu]
[   41.709586]  kfd_ioctl_unmap_memory_from_gpu+0x1b7/0x300 [amdgpu]
[   41.709710]  kfd_ioctl+0x1ec/0x650 [amdgpu]
[   41.709822]  ? __pfx_kfd_ioctl_unmap_memory_from_gpu+0x10/0x10 [amdgpu]
[   41.709945]  ? srso_alias_return_thunk+0x5/0x7f
[   41.709949]  ? tomoyo_file_ioctl+0x20/0x30
[   41.709959]  __x64_sys_ioctl+0x9c/0xd0
[   41.709967]  do_syscall_64+0x3f/0x90
[   41.709973]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Fixes: 101b8104307e ("drm/amdkfd: Move dma unmapping after TLB flush")
Signed-off-by: Lang Yu &lt;Lang.Yu@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix a warning.

v2: Avoid unmapping attachment repeatedly when ERESTARTSYS.

v3: Lock the BO before accessing ttm-&gt;sg to avoid race conditions.(Felix)

[   41.708711] WARNING: CPU: 0 PID: 1463 at drivers/gpu/drm/ttm/ttm_bo.c:846 ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.708989] Call Trace:
[   41.708992]  &lt;TASK&gt;
[   41.708996]  ? show_regs+0x6c/0x80
[   41.709000]  ? ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.709008]  ? __warn+0x93/0x190
[   41.709014]  ? ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.709024]  ? report_bug+0x1f9/0x210
[   41.709035]  ? handle_bug+0x46/0x80
[   41.709041]  ? exc_invalid_op+0x1d/0x80
[   41.709048]  ? asm_exc_invalid_op+0x1f/0x30
[   41.709057]  ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu]
[   41.709185]  ? ttm_bo_validate+0x146/0x1b0 [ttm]
[   41.709197]  ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu]
[   41.709337]  ? srso_alias_return_thunk+0x5/0x7f
[   41.709346]  kfd_mem_dmaunmap_attachment+0x9e/0x1e0 [amdgpu]
[   41.709467]  amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x56/0x80 [amdgpu]
[   41.709586]  kfd_ioctl_unmap_memory_from_gpu+0x1b7/0x300 [amdgpu]
[   41.709710]  kfd_ioctl+0x1ec/0x650 [amdgpu]
[   41.709822]  ? __pfx_kfd_ioctl_unmap_memory_from_gpu+0x10/0x10 [amdgpu]
[   41.709945]  ? srso_alias_return_thunk+0x5/0x7f
[   41.709949]  ? tomoyo_file_ioctl+0x20/0x30
[   41.709959]  __x64_sys_ioctl+0x9c/0xd0
[   41.709967]  do_syscall_64+0x3f/0x90
[   41.709973]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Fixes: 101b8104307e ("drm/amdkfd: Move dma unmapping after TLB flush")
Signed-off-by: Lang Yu &lt;Lang.Yu@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Auto-validate DMABuf imports in compute VMs</title>
<updated>2024-01-15T23:35:35+00:00</updated>
<author>
<name>Felix Kuehling</name>
<email>felix.kuehling@amd.com</email>
</author>
<published>2024-01-03T23:13:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=50661eb1a2c88c0e50cb234cb117a7fbfe03b3e0'/>
<id>50661eb1a2c88c0e50cb234cb117a7fbfe03b3e0</id>
<content type='text'>
DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the
process_info-&gt;kfd_bo_list. There is no explicit KFD API call to validate
them or add eviction fences to them.

This patch automatically validates and fences dymanic DMABuf imports when
they are added to a compute VM. Revalidation after evictions is handled
in the VM code.

v2:
* Renamed amdgpu_vm_validate_evicted_bos to amdgpu_vm_validate
* Eliminated evicted_user state, use evicted state for VM BOs and user BOs
* Fixed and simplified amdgpu_vm_fence_imports, depends on reserved BOs
* Moved dma_resv_reserve_fences for amdgpu_vm_fence_imports into
  amdgpu_vm_validate, outside the vm-&gt;status_lock
* Added dummy version of amdgpu_amdkfd_bo_validate_and_fence for builds
  without KFD

v4: Eliminate amdgpu_vm_fence_imports. It's not needed because the
reservation with its fences is shared with the export, as long as all
imports are from KFD, with the exports already reserved, validated and
fenced by the KFD restore worker.

v5: Reintroduced separate evicted_user state to simplify the state machine
and CS error handling when amdgpu_vm_validate is called without a ticket.

Signed-off-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the
process_info-&gt;kfd_bo_list. There is no explicit KFD API call to validate
them or add eviction fences to them.

This patch automatically validates and fences dymanic DMABuf imports when
they are added to a compute VM. Revalidation after evictions is handled
in the VM code.

v2:
* Renamed amdgpu_vm_validate_evicted_bos to amdgpu_vm_validate
* Eliminated evicted_user state, use evicted state for VM BOs and user BOs
* Fixed and simplified amdgpu_vm_fence_imports, depends on reserved BOs
* Moved dma_resv_reserve_fences for amdgpu_vm_fence_imports into
  amdgpu_vm_validate, outside the vm-&gt;status_lock
* Added dummy version of amdgpu_amdkfd_bo_validate_and_fence for builds
  without KFD

v4: Eliminate amdgpu_vm_fence_imports. It's not needed because the
reservation with its fences is shared with the export, as long as all
imports are from KFD, with the exports already reserved, validated and
fenced by the KFD restore worker.

v5: Reintroduced separate evicted_user state to simplify the state machine
and CS error handling when amdgpu_vm_validate is called without a ticket.

Signed-off-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix sparse __rcu annotation warnings</title>
<updated>2024-01-09T20:44:13+00:00</updated>
<author>
<name>Felix Kuehling</name>
<email>felix.kuehling@amd.com</email>
</author>
<published>2024-01-03T23:13:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c147ddc68e741aed78bba796effe049344d87ab8'/>
<id>c147ddc68e741aed78bba796effe049344d87ab8</id>
<content type='text'>
Properly mark kfd_process-&gt;ef as __rcu and consistently use the right
accessor functions.

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Closes: https://lore.kernel.org/oe-kbuild-all/202312052245.yFpBSgNH-lkp@intel.com/
Signed-off-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Properly mark kfd_process-&gt;ef as __rcu and consistently use the right
accessor functions.

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Closes: https://lore.kernel.org/oe-kbuild-all/202312052245.yFpBSgNH-lkp@intel.com/
Signed-off-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'drm-msm-next-2023-12-15' of https://gitlab.freedesktop.org/drm/msm into drm-next</title>
<updated>2023-12-19T21:54:03+00:00</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2023-12-19T21:43:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=22a2decedfbeb981df04dca880412b9520b2f8a1'/>
<id>22a2decedfbeb981df04dca880412b9520b2f8a1</id>
<content type='text'>
Updates for v6.8:

Core:
- Add support for SDM670, SM8650
- Handle the CFG interconnect to fix the obscure hangs / timeouts
  on register write
- Kconfig fix for QMP dependency
- DT schema fixes

DPU:
- Add support for SDM670, SM8650
- Enable SmartDMA on SM8350 and SM8450
- Correct UBWC settings for SC8280XP
- Fix catalog settings for SC8180X
- Actually make use of the version to switch between QSEED3/3LITE/4
  scalers
- Use devres-managed and drm-managed allocations where appropriate
- misc other fixes
- Enabled YUV writeback on SC7280, SM8250
- Enabled writeback on SM8350, SM8450
- CRC fix when encoder is selected as the input source
- other misc fixes

MDP4:
- Use devres-managed and drm-managed allocations where appropriate
- flush vblank event on CRTC disable

MDP5:
- Use devres-managed and drm-managed allocations where appropriate

DP:
- Add support for SM8650
- Enable PM runtime support
- Merge msm-specific debugfs dir with the generic one
- Described DisplayPort on SM8150 in DeviceTree bindings
- Moved dp_display_get_next_bridge() to probe()

DSI:
- Add support for SM8650
- Enable PM runtime support

GPU/GEM:
- demote userspace triggerable warnings to debug
- add GEM object metadata UAPI
- move GPU devcoredumps to GPU device
- fix hangcheck to skip retired submits
- expose UBWC config to userspace
- fix a680 chip-id
- drm_exec conversion
- drm/ci: remove rebase-merge directory (to unblock CI)

[airlied: fix drm_exec/amd interaction]
Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
From: Rob Clark &lt;robdclark@gmail.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs9auYqmo-7NSd9FsbNBCDf7aBevd=4xkcF3A5G_OGvMQ@mail.gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Updates for v6.8:

Core:
- Add support for SDM670, SM8650
- Handle the CFG interconnect to fix the obscure hangs / timeouts
  on register write
- Kconfig fix for QMP dependency
- DT schema fixes

DPU:
- Add support for SDM670, SM8650
- Enable SmartDMA on SM8350 and SM8450
- Correct UBWC settings for SC8280XP
- Fix catalog settings for SC8180X
- Actually make use of the version to switch between QSEED3/3LITE/4
  scalers
- Use devres-managed and drm-managed allocations where appropriate
- misc other fixes
- Enabled YUV writeback on SC7280, SM8250
- Enabled writeback on SM8350, SM8450
- CRC fix when encoder is selected as the input source
- other misc fixes

MDP4:
- Use devres-managed and drm-managed allocations where appropriate
- flush vblank event on CRTC disable

MDP5:
- Use devres-managed and drm-managed allocations where appropriate

DP:
- Add support for SM8650
- Enable PM runtime support
- Merge msm-specific debugfs dir with the generic one
- Described DisplayPort on SM8150 in DeviceTree bindings
- Moved dp_display_get_next_bridge() to probe()

DSI:
- Add support for SM8650
- Enable PM runtime support

GPU/GEM:
- demote userspace triggerable warnings to debug
- add GEM object metadata UAPI
- move GPU devcoredumps to GPU device
- fix hangcheck to skip retired submits
- expose UBWC config to userspace
- fix a680 chip-id
- drm_exec conversion
- drm/ci: remove rebase-merge directory (to unblock CI)

[airlied: fix drm_exec/amd interaction]
Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
From: Rob Clark &lt;robdclark@gmail.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs9auYqmo-7NSd9FsbNBCDf7aBevd=4xkcF3A5G_OGvMQ@mail.gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Import DMABufs for interop through DRM</title>
<updated>2023-12-13T20:09:55+00:00</updated>
<author>
<name>Felix Kuehling</name>
<email>Felix.Kuehling@amd.com</email>
</author>
<published>2023-09-19T18:57:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0188006d7c797a37c04471a2b4a34a7dfb21f363'/>
<id>0188006d7c797a37c04471a2b4a34a7dfb21f363</id>
<content type='text'>
Use drm_gem_prime_fd_to_handle to import DMABufs for interop. This
ensures that a GEM handle is created on import and that obj-&gt;dma_buf
will be set and remain set as long as the object is imported into KFD.

Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Ramesh Errabolu &lt;Ramesh.Errabolu@amd.com&gt;
Reviewed-by: Xiaogang.Chen &lt;Xiaogang.Chen@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use drm_gem_prime_fd_to_handle to import DMABufs for interop. This
ensures that a GEM handle is created on import and that obj-&gt;dma_buf
will be set and remain set as long as the object is imported into KFD.

Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Ramesh Errabolu &lt;Ramesh.Errabolu@amd.com&gt;
Reviewed-by: Xiaogang.Chen &lt;Xiaogang.Chen@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Export DMABufs from KFD using GEM handles</title>
<updated>2023-12-13T20:09:55+00:00</updated>
<author>
<name>Felix Kuehling</name>
<email>Felix.Kuehling@amd.com</email>
</author>
<published>2023-08-24T18:41:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1819200166ce511ac298dc96b9b17eb655a9edc4'/>
<id>1819200166ce511ac298dc96b9b17eb655a9edc4</id>
<content type='text'>
Create GEM handles for exporting DMABufs using GEM-Prime APIs. The GEM
handles are created in a drm_client_dev context to avoid exposing them
in user mode contexts through a DMABuf import.

Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Ramesh Errabolu &lt;Ramesh.Errabolu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Create GEM handles for exporting DMABufs using GEM-Prime APIs. The GEM
handles are created in a drm_client_dev context to avoid exposing them
in user mode contexts through a DMABuf import.

Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Ramesh Errabolu &lt;Ramesh.Errabolu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/exec: Pass in initial # of objects</title>
<updated>2023-12-10T18:38:47+00:00</updated>
<author>
<name>Rob Clark</name>
<email>robdclark@chromium.org</email>
</author>
<published>2023-11-21T00:38:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=05d249352f1ae909230c230767ca8f4e9fdf8e7b'/>
<id>05d249352f1ae909230c230767ca8f4e9fdf8e7b</id>
<content type='text'>
In cases where the # is known ahead of time, it is silly to do the table
resize dance.

Signed-off-by: Rob Clark &lt;robdclark@chromium.org&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Patchwork: https://patchwork.freedesktop.org/patch/568338/
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In cases where the # is known ahead of time, it is silly to do the table
resize dance.

Signed-off-by: Rob Clark &lt;robdclark@chromium.org&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Patchwork: https://patchwork.freedesktop.org/patch/568338/
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Run restore_workers on freezable WQs</title>
<updated>2023-11-29T21:49:23+00:00</updated>
<author>
<name>Felix Kuehling</name>
<email>Felix.Kuehling@amd.com</email>
</author>
<published>2023-10-27T22:21:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9a1c1339abf972477aeef4ea037e650f49c5892d'/>
<id>9a1c1339abf972477aeef4ea037e650f49c5892d</id>
<content type='text'>
Make restore workers freezable so we don't have to explicitly flush them
in suspend and GPU reset code paths, and we don't accidentally try to
restore BOs while the GPU is suspended. Not having to flush restore_work
also helps avoid lock/fence dependencies in the GPU reset case where we're
not allowed to wait for fences.

A side effect of this is, that we can now have multiple concurrent threads
trying to signal the same eviction fence. Rework eviction fence signaling
and replacement to account for that.

The GPU reset path can no longer rely on restore_process_worker to resume
queues because evict/restore workers can run independently of it. Instead
call a new restore_process_helper directly.

This is an RFC and request for testing.

v2:
- Reworked eviction fence signaling
- Introduced restore_process_helper

v3:
- Handle unsignaled eviction fences in restore_process_bos

Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Tested-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make restore workers freezable so we don't have to explicitly flush them
in suspend and GPU reset code paths, and we don't accidentally try to
restore BOs while the GPU is suspended. Not having to flush restore_work
also helps avoid lock/fence dependencies in the GPU reset case where we're
not allowed to wait for fences.

A side effect of this is, that we can now have multiple concurrent threads
trying to signal the same eviction fence. Rework eviction fence signaling
and replacement to account for that.

The GPU reset path can no longer rely on restore_process_worker to resume
queues because evict/restore workers can run independently of it. Instead
call a new restore_process_helper directly.

This is an RFC and request for testing.

v2:
- Reworked eviction fence signaling
- Introduced restore_process_helper

v3:
- Handle unsignaled eviction fences in restore_process_bos

Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Tested-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
