<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/gpu/drm/scheduler, branch linux-5.15.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>drm/sched: Fix race in drm_sched_entity_select_rq()</title>
<updated>2025-12-06T21:09:15+00:00</updated>
<author>
<name>Philipp Stanner</name>
<email>phasta@kernel.org</email>
</author>
<published>2025-11-03T15:44:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=afd6e9fe377f44817da9bbcf259c921f314a3f74'/>
<id>afd6e9fe377f44817da9bbcf259c921f314a3f74</id>
<content type='text'>
[ Upstream commit d25e3a610bae03bffc5c14b5d944a5d0cd844678 ]

In a past bug fix it was forgotten that entity access must be protected
by the entity lock. That's a data race and potentially UB.

Move the spin_unlock() to the appropriate position.

Cc: stable@vger.kernel.org # v5.13+
Fixes: ac4eb83ab255 ("drm/sched: select new rq even if there is only one v3")
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patch.msgid.link/20251022063402.87318-2-phasta@kernel.org
[ adapted lock field name from entity-&gt;lock to entity-&gt;rq_lock ]
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit d25e3a610bae03bffc5c14b5d944a5d0cd844678 ]

In a past bug fix it was forgotten that entity access must be protected
by the entity lock. That's a data race and potentially UB.

Move the spin_unlock() to the appropriate position.

Cc: stable@vger.kernel.org # v5.13+
Fixes: ac4eb83ab255 ("drm/sched: select new rq even if there is only one v3")
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patch.msgid.link/20251022063402.87318-2-phasta@kernel.org
[ adapted lock field name from entity-&gt;lock to entity-&gt;rq_lock ]
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/sched: Remove optimization that causes hang when killing dependent jobs</title>
<updated>2025-08-28T14:24:30+00:00</updated>
<author>
<name>Lin.Cao</name>
<email>lincao12@amd.com</email>
</author>
<published>2025-07-29T16:16:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6188d61ba73d5af9a1f5b7ee1fce52771329ab7b'/>
<id>6188d61ba73d5af9a1f5b7ee1fce52771329ab7b</id>
<content type='text'>
[ Upstream commit 15f77764e90a713ee3916ca424757688e4f565b9 ]

When application A submits jobs and application B submits a job with a
dependency on A's fence, the normal flow wakes up the scheduler after
processing each job. However, the optimization in
drm_sched_entity_add_dependency_cb() uses a callback that only clears
dependencies without waking up the scheduler.

When application A is killed before its jobs can run, the callback gets
triggered but only clears the dependency without waking up the scheduler,
causing the scheduler to enter sleep state and application B to hang.

Remove the optimization by deleting drm_sched_entity_clear_dep() and its
usage, ensuring the scheduler is always woken up when dependencies are
cleared.

Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Lin.Cao &lt;lincao12@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20250717084453.921097-1-lincao12@amd.com
[ Changed drm_sched_wakeup_if_can_queue() calls to drm_sched_wakeup() ]
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 15f77764e90a713ee3916ca424757688e4f565b9 ]

When application A submits jobs and application B submits a job with a
dependency on A's fence, the normal flow wakes up the scheduler after
processing each job. However, the optimization in
drm_sched_entity_add_dependency_cb() uses a callback that only clears
dependencies without waking up the scheduler.

When application A is killed before its jobs can run, the callback gets
triggered but only clears the dependency without waking up the scheduler,
causing the scheduler to enter sleep state and application B to hang.

Remove the optimization by deleting drm_sched_entity_clear_dep() and its
usage, ensuring the scheduler is always woken up when dependencies are
cleared.

Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Lin.Cao &lt;lincao12@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://lore.kernel.org/r/20250717084453.921097-1-lincao12@amd.com
[ Changed drm_sched_wakeup_if_can_queue() calls to drm_sched_wakeup() ]
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/sched: Fix preprocessor guard</title>
<updated>2025-03-13T11:51:05+00:00</updated>
<author>
<name>Philipp Stanner</name>
<email>phasta@kernel.org</email>
</author>
<published>2025-02-18T12:41:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3f9e7298053c8101abaf8b18a5263640f4848f06'/>
<id>3f9e7298053c8101abaf8b18a5263640f4848f06</id>
<content type='text'>
[ Upstream commit 23e0832d6d7be2d3c713f9390c060b6f1c48bf36 ]

When writing the header guard for gpu_scheduler_trace.h, a typo,
apparently, occurred.

Fix the typo and document the scope of the guard.

Fixes: 353da3c520b4 ("drm/amdgpu: add tracepoint for scheduler (v2)")
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250218124149.118002-2-phasta@kernel.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 23e0832d6d7be2d3c713f9390c060b6f1c48bf36 ]

When writing the header guard for gpu_scheduler_trace.h, a typo,
apparently, occurred.

Fix the typo and document the scope of the guard.

Fixes: 353da3c520b4 ("drm/amdgpu: add tracepoint for scheduler (v2)")
Reviewed-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Philipp Stanner &lt;phasta@kernel.org&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250218124149.118002-2-phasta@kernel.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/sched: Add locking to drm_sched_entity_modify_sched</title>
<updated>2024-10-17T13:11:42+00:00</updated>
<author>
<name>Tvrtko Ursulin</name>
<email>tvrtko.ursulin@igalia.com</email>
</author>
<published>2024-09-13T16:05:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=622bc6472350cd8bc3528dd67a26c3423d266a7b'/>
<id>622bc6472350cd8bc3528dd67a26c3423d266a7b</id>
<content type='text'>
commit 4286cc2c953983d44d248c9de1c81d3a9643345c upstream.

Without the locking amdgpu currently can race between
amdgpu_ctx_set_entity_priority() (via drm_sched_entity_modify_sched()) and
drm_sched_job_arm(), leading to the latter accesing potentially
inconsitent entity-&gt;sched_list and entity-&gt;num_sched_list pair.

v2:
 * Improve commit message. (Philipp)

Signed-off-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Luben Tuikov &lt;ltuikov89@gmail.com&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: dri-devel@lists.freedesktop.org
Cc: Philipp Stanner &lt;pstanner@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # v5.7+
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20240913160559.49054-2-tursulin@igalia.com
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4286cc2c953983d44d248c9de1c81d3a9643345c upstream.

Without the locking amdgpu currently can race between
amdgpu_ctx_set_entity_priority() (via drm_sched_entity_modify_sched()) and
drm_sched_job_arm(), leading to the latter accesing potentially
inconsitent entity-&gt;sched_list and entity-&gt;num_sched_list pair.

v2:
 * Improve commit message. (Philipp)

Signed-off-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Luben Tuikov &lt;ltuikov89@gmail.com&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: dri-devel@lists.freedesktop.org
Cc: Philipp Stanner &lt;pstanner@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # v5.7+
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20240913160559.49054-2-tursulin@igalia.com
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-buf: add dma_fence_timestamp helper</title>
<updated>2024-02-23T07:55:10+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2023-09-08T08:27:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4e82b9c11d3cd9a85fa8ecc2923f0a223dc37a7f'/>
<id>4e82b9c11d3cd9a85fa8ecc2923f0a223dc37a7f</id>
<content type='text'>
[ Upstream commit b83ce9cb4a465b8f9a3fa45561b721a9551f60e3 ]

When a fence signals there is a very small race window where the timestamp
isn't updated yet. sync_file solves this by busy waiting for the
timestamp to appear, but on other ocassions didn't handled this
correctly.

Provide a dma_fence_timestamp() helper function for this and use it in
all appropriate cases.

Another alternative would be to grab the spinlock when that happens.

v2 by teddy: add a wait parameter to wait for the timestamp to show up, in case
   the accurate timestamp is needed and/or the timestamp is not based on
   ktime (e.g. hw timestamp)
v3 chk: drop the parameter again for unified handling

Signed-off-by: Yunxiang Li &lt;Yunxiang.Li@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Fixes: 1774baa64f93 ("drm/scheduler: Change scheduled fence track v2")
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
CC: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230929104725.2358-1-christian.koenig@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit b83ce9cb4a465b8f9a3fa45561b721a9551f60e3 ]

When a fence signals there is a very small race window where the timestamp
isn't updated yet. sync_file solves this by busy waiting for the
timestamp to appear, but on other ocassions didn't handled this
correctly.

Provide a dma_fence_timestamp() helper function for this and use it in
all appropriate cases.

Another alternative would be to grab the spinlock when that happens.

v2 by teddy: add a wait parameter to wait for the timestamp to show up, in case
   the accurate timestamp is needed and/or the timestamp is not based on
   ktime (e.g. hw timestamp)
v3 chk: drop the parameter again for unified handling

Signed-off-by: Yunxiang Li &lt;Yunxiang.Li@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Fixes: 1774baa64f93 ("drm/scheduler: Change scheduled fence track v2")
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
CC: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230929104725.2358-1-christian.koenig@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr</title>
<updated>2021-07-01T06:53:25+00:00</updated>
<author>
<name>Boris Brezillon</name>
<email>boris.brezillon@collabora.com</email>
</author>
<published>2021-06-30T06:27:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=78efe21b6f8e6f4d39fceaf0cc5c534c11f9dd60'/>
<id>78efe21b6f8e6f4d39fceaf0cc5c534c11f9dd60</id>
<content type='text'>
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

v5:
* Add a new paragraph to the timedout_job() method

v3:
* New patch

v4:
* Actually use the timeout_wq to queue the timeout work

Suggested-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Signed-off-by: Boris Brezillon &lt;boris.brezillon@collabora.com&gt;
Reviewed-by: Steven Price &lt;steven.price@arm.com&gt;
Reviewed-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Acked-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Qiang Yu &lt;yuq825@gmail.com&gt;
Cc: Emma Anholt &lt;emma@anholt.net&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: "Christian König" &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

v5:
* Add a new paragraph to the timedout_job() method

v3:
* New patch

v4:
* Actually use the timeout_wq to queue the timeout work

Suggested-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Signed-off-by: Boris Brezillon &lt;boris.brezillon@collabora.com&gt;
Reviewed-by: Steven Price &lt;steven.price@arm.com&gt;
Reviewed-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Acked-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Qiang Yu &lt;yuq825@gmail.com&gt;
Cc: Emma Anholt &lt;emma@anholt.net&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: "Christian König" &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/sched: Declare entity idle only after HW submission</title>
<updated>2021-06-28T11:16:49+00:00</updated>
<author>
<name>Boris Brezillon</name>
<email>boris.brezillon@collabora.com</email>
</author>
<published>2021-06-24T14:08:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3b5ac97ad468f6cfd31346821a3a2b9f13d23015'/>
<id>3b5ac97ad468f6cfd31346821a3a2b9f13d23015</id>
<content type='text'>
The panfrost driver tries to kill in-flight jobs on FD close after
destroying the FD scheduler entities. For this to work properly, we
need to make sure the jobs popped from the scheduler entities have
been queued at the HW level before declaring the entity idle, otherwise
we might iterate over a list that doesn't contain those jobs.

Suggested-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Signed-off-by: Boris Brezillon &lt;boris.brezillon@collabora.com&gt;
Cc: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Reviewed-by: Steven Price &lt;steven.price@arm.com&gt;
Reviewed-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210624140850.2229697-1-boris.brezillon@collabora.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The panfrost driver tries to kill in-flight jobs on FD close after
destroying the FD scheduler entities. For this to work properly, we
need to make sure the jobs popped from the scheduler entities have
been queued at the HW level before declaring the entity idle, otherwise
we might iterate over a list that doesn't contain those jobs.

Suggested-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Signed-off-by: Boris Brezillon &lt;boris.brezillon@collabora.com&gt;
Cc: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Reviewed-by: Steven Price &lt;steven.price@arm.com&gt;
Reviewed-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210624140850.2229697-1-boris.brezillon@collabora.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/sched: Avoid data corruptions</title>
<updated>2021-05-20T03:50:28+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2021-05-19T14:14:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0b10ab80695d61422337ede6ff496552d8ace99d'/>
<id>0b10ab80695d61422337ede6ff496552d8ace99d</id>
<content type='text'>
Wait for all dependencies of a job  to complete before
killing it to avoid data corruptions.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210519141407.88444-1-andrey.grodzovsky@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Wait for all dependencies of a job  to complete before
killing it to avoid data corruptions.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210519141407.88444-1-andrey.grodzovsky@amd.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/scheduler: Fix hang when sched_entity released</title>
<updated>2021-05-20T03:50:28+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2021-05-12T14:26:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c61cdbdbffc169dc7f1e6fe94dfffaf574fe672a'/>
<id>c61cdbdbffc169dc7f1e6fe94dfffaf574fe672a</id>
<content type='text'>
Problem: If scheduler is already stopped by the time sched_entity
is released and entity's job_queue not empty I encountred
a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle
never becomes false.

Fix: In drm_sched_fini detach all sched_entities from the
scheduler's run queues. This will satisfy drm_sched_entity_is_idle.
Also wakeup all those processes stuck in sched_entity flushing
as the scheduler main thread which wakes them up is stopped by now.

v2:
Reverse order of drm_sched_rq_remove_entity and marking
s_entity as stopped to prevent reinserion back to rq due
to race.

v3:
Drop drm_sched_rq_remove_entity, only modify entity-&gt;stopped
and check for it in drm_sched_entity_is_idle

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210512142648.666476-14-andrey.grodzovsky@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: If scheduler is already stopped by the time sched_entity
is released and entity's job_queue not empty I encountred
a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle
never becomes false.

Fix: In drm_sched_fini detach all sched_entities from the
scheduler's run queues. This will satisfy drm_sched_entity_is_idle.
Also wakeup all those processes stuck in sched_entity flushing
as the scheduler main thread which wakes them up is stopped by now.

v2:
Reverse order of drm_sched_rq_remove_entity and marking
s_entity as stopped to prevent reinserion back to rq due
to race.

v3:
Drop drm_sched_rq_remove_entity, only modify entity-&gt;stopped
and check for it in drm_sched_entity_is_idle

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210512142648.666476-14-andrey.grodzovsky@amd.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/sched: Make timeout timer rearm conditional.</title>
<updated>2021-05-20T03:50:28+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2021-05-12T14:26:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=75973e5802afece679d3936328e23a1891a9badc'/>
<id>75973e5802afece679d3936328e23a1891a9badc</id>
<content type='text'>
We don't want to rearm the timer if driver hook reports
that the device is gone.

v5: Update drm_gpu_sched_stat values in code.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210512142648.666476-11-andrey.grodzovsky@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We don't want to rearm the timer if driver hook reports
that the device is gone.

v5: Update drm_gpu_sched_stat values in code.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210512142648.666476-11-andrey.grodzovsky@amd.com
</pre>
</div>
</content>
</entry>
</feed>
