linux-stable.git/include/drm/gpu_scheduler.h, branch linux-5.4.y

drm/scheduler: Add flag to hint the release of guilty job.

2019-05-02T20:50:55+00:00

Problem:
The sched thread's cleanup function races against the TO handler
and removes the guilty job from the mirror list, and we
have no way of telling whether the job was removed from within the
TO handler or from the sched thread's clean-up function.

Fix:
Add a flag to the scheduler to hint to the TO handler that the guilty job
needs to be explicitly released.
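
The hand-off described in this fix can be modelled in plain userspace C.
This is a minimal sketch, not the kernel code: the model_* types and
functions are hypothetical, and the flag name free_guilty is an assumption
that this log does not state. It only shows the idea that the stop path
leaves a hint and the TO handler acts on it exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; all model_* names are hypothetical. */
struct model_job {
    bool hw_fence_signaled; /* job already completed on the hardware */
    bool freed;
};

struct model_sched {
    bool free_guilty; /* hint: the TO handler must release the guilty job */
};

/*
 * Stop path as run from the TO handler. A job whose HW fence has
 * signaled is taken off the mirror list here; if that job is the
 * guilty one it must stay alive for recovery, so only a hint is left.
 */
void model_sched_stop(struct model_sched *sched, struct model_job *bad)
{
    if (bad != NULL && bad->hw_fence_signaled)
        sched->free_guilty = true;
}

void model_timeout_handler(struct model_sched *sched, struct model_job *bad)
{
    model_sched_stop(sched, bad);

    /* Release the guilty job only when the stop path asked for it. */
    if (sched->free_guilty && bad != NULL) {
        assert(!bad->freed); /* must never be released twice */
        bad->freed = true;
        sched->free_guilty = false;
    }
}
```

In this model a guilty job that never signaled is left alone, because it is
still tracked and will be released through the normal path.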

v2: whitespace fix

Reviewed-by: Christian König 
Signed-off-by: Andrey Grodzovsky 
Signed-off-by: Alex Deucher 
Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-5-git-send-email-andrey.grodzovsky@amd.com

drm/scheduler: rework job destruction

2019-05-02T20:45:48+00:00

We now destroy finished jobs from the worker thread to make sure that
we never destroy a job currently in timeout processing.
This way we avoid holding the lock around the ring mirror list in
drm_sched_stop, which should solve a deadlock reported by a user.

v2: Remove unused variable.
v4: Move guilty job free into sched code.
v5:
Move sched->hw_rq_count to drm_sched_start to account for the counter
decrement in drm_sched_stop even when we don't call resubmit jobs
if the guilty job did signal.
v6: remove unused variable

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692

Acked-by: Chunming Zhou 
Signed-off-by: Christian König 
Signed-off-by: Andrey Grodzovsky 
Signed-off-by: Alex Deucher 
Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-3-git-send-email-andrey.grodzovsky@amd.com

drm/sched: Rework HW fence processing.

2019-01-25T21:15:36+00:00

Move job deletion from the ring mirror list up to the HW fence signal
callback instead of doing it from finish_work. Together with waiting for
all such fences to signal in drm_sched_stop, this guarantees that an
already signaled job will not be processed twice.
Remove the sched finish fence callback and just submit finish_work
directly from the HW fence callback.

v2: Fix comments.
v3: Attach hw fence cb to sched_job
v5: Rebase

Suggested-by: Christian Koenig 
Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher

drm/sched: Refactor ring mirror list handling.

2019-01-25T21:15:36+00:00

Decouple sched threads stop and start and ring mirror
list handling from the policy of what to do about the
guilty jobs.
When stopping the sched thread and detaching sched fences
from non-signaled HW fences, wait for all signaled HW fences
to complete before rerunning the jobs.

v2: Fix resubmission of guilty job into HW after refactoring.

v4:
Full restart for all the jobs, not only from guilty ring.
Extract karma increase into standalone function.

v5:
Rework waiting for signaled jobs without relying on the job
struct itself as those might already be freed for non 'guilty'
job's schedulers.
Expose karma increase to drivers.

v6:
Use list_for_each_entry_safe_continue and drm_sched_process_job
in case fence already signaled.
Call drm_sched_increase_karma only once for amdgpu and add documentation.

v7:
Wait only for the latest job's fence.

Suggested-by: Christian Koenig 
Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher

drm/scheduler: Add drm_sched_suspend/resume_timeout()

2018-12-05T22:56:16+00:00

This patch adds two new functions to help client drivers suspend and
resume the scheduler job timeout. This can be useful in cases where the
hardware has preemption support enabled. Using this, it is possible to have
the timeout active only for the ring which is active on the ringbuffer.
This patch also makes the job_list_lock IRQ safe.
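
A minimal sketch of the suspend/resume idea, written as a userspace model
rather than kernel code. All model_* names are hypothetical and time is a
plain tick counter passed in by the caller; the real helpers operate on the
scheduler's delayed work and are not reproduced here. The assumption is that
suspend parks the timer and reports the ticks that were left, and resume
re-arms it with that remainder.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace model; ULONG_MAX marks a parked timer. */
struct model_sched {
    unsigned long timeout;  /* configured job timeout, in ticks */
    unsigned long deadline; /* absolute tick at which the timer fires */
};

void model_start_timeout(struct model_sched *s, unsigned long now)
{
    s->deadline = now + s->timeout;
}

/* Park the timer and report how many ticks were left on it. */
unsigned long model_suspend_timeout(struct model_sched *s, unsigned long now)
{
    unsigned long remaining = s->timeout; /* not armed or expired: full timeout */

    if (s->deadline != ULONG_MAX && s->deadline > now)
        remaining = s->deadline - now;
    s->deadline = ULONG_MAX;
    return remaining;
}

/* Re-arm the timer with the remainder returned by suspend. */
void model_resume_timeout(struct model_sched *s, unsigned long now,
                          unsigned long remaining)
{
    assert(s->deadline == ULONG_MAX); /* model assumption: resume follows suspend */
    s->deadline = now + remaining;
}

int model_timed_out(const struct model_sched *s, unsigned long now)
{
    return s->deadline != ULONG_MAX && now >= s->deadline;
}
```

With preemption, a driver could call the suspend helper when a ring is
switched out and the resume helper when it is switched back in, so that only
the time a ring actually spends on the hardware counts against its timeout.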

Suggested-by: Christian Koenig 
Signed-off-by: Sharat Masetty 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher

drm/scheduler: Add drm_sched_job_cleanup

2018-11-05T19:21:27+00:00

This patch adds a new API to clean up the scheduler job resources. This
is primarily needed in cases where the job was created but was not queued
to the scheduler queue. Additionally, with this change, the layer which
creates the scheduler job also gets to free up the job's resources, and
this entails moving the dma_fence_put(finished_fence) to the drivers'
ops free handler routines.

Signed-off-by: Sharat Masetty 
Reviewed-by: Christian König 
Acked-by: Andrey Grodzovsky 
Signed-off-by: Alex Deucher

drm/sched: Add boolean to mark if sched is ready to work v5

2018-11-05T19:21:22+00:00

Problem:
A particular scheduler may become unusable (underlying HW) after
some event (e.g. GPU reset). If it's later chosen by
the get free sched. policy, a command will fail to be
submitted.

Fix:
Add a driver specific callback to report the sched status so that an
rq with a bad sched can be avoided in favor of a working one, or of
none, in which case job init will fail.

v2: Switch from driver callback to flag in scheduler.

v3: rebase

v4: Remove ready parameter from drm_sched_init, set
unconditionally to true once init done.

v5: fix missed change in v3d in v4 (Alex)
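
The selection policy described above can be sketched as a userspace model.
This is not the scheduler's code: the model_* names, the least-loaded
criterion and the -1 error value are assumptions made for illustration. It
shows a per-scheduler ready flag being skipped during selection, and job
init failing when no usable scheduler is left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a scheduler with a ready flag. */
struct model_sched {
    bool ready;        /* false once the underlying HW is unusable */
    unsigned num_jobs; /* current load */
};

/* Pick the least loaded scheduler that is still ready, or NULL if none is. */
struct model_sched *model_pick_sched(struct model_sched *list, size_t n)
{
    struct model_sched *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!list[i].ready)
            continue; /* avoid schedulers whose HW went bad */
        if (best == NULL || list[i].num_jobs < best->num_jobs)
            best = &list[i];
    }
    return best;
}

/* Job init fails when no usable scheduler exists. Returns 0 or -1. */
int model_job_init(struct model_sched *list, size_t n)
{
    struct model_sched *s = model_pick_sched(list, n);

    if (s == NULL)
        return -1;
    assert(s->ready);
    s->num_jobs++;
    return 0;
}
```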

Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher

drm/sched: add drm_sched_fault

2018-11-05T19:21:02+00:00

Add a helper to immediately start timeout handling in case of a hardware
fault.

Signed-off-by: Christian König 
Reviewed-by: Andrey Grodzovsky 
Signed-off-by: Alex Deucher

drm/scheduler: remove timeout work_struct from drm_sched_job (v3)

2018-09-27T14:55:45+00:00

Having a delayed work item per job is redundant as we only need one
per scheduler to track the timeout of the currently executing job.

v2: the first element of the ring mirror list is the currently
executing job so we don't need an additional variable for it

v3: squash in fixes for v3d and etnaviv
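
A minimal userspace sketch of the design described above, under stated
assumptions: one timer per scheduler, jobs kept in submission order, and the
head of that list treated as the currently executing job. The model_* names,
the fixed-size ring and the tick-based clock are all hypothetical and only
illustrate why no per-job timer or extra "current job" variable is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_RING_SIZE 8

struct model_job { int id; };

/* Hypothetical model: one timer for the whole scheduler. */
struct model_sched {
    struct model_job *mirror[MODEL_RING_SIZE]; /* submission order */
    size_t head, count;
    unsigned long timeout;  /* ticks allowed per job */
    unsigned long deadline; /* valid only while armed */
    bool armed;
};

int model_submit(struct model_sched *s, struct model_job *job,
                 unsigned long now)
{
    if (s->count == MODEL_RING_SIZE)
        return -1;
    s->mirror[(s->head + s->count) % MODEL_RING_SIZE] = job;
    s->count++;
    if (!s->armed) { /* timer tracks the head job only */
        s->armed = true;
        s->deadline = now + s->timeout;
    }
    return 0;
}

/* The head job finished: drop it and restart the timer for the next one. */
void model_job_done(struct model_sched *s, unsigned long now)
{
    assert(s->count > 0);
    s->head = (s->head + 1) % MODEL_RING_SIZE;
    s->count--;
    s->armed = s->count > 0;
    if (s->armed)
        s->deadline = now + s->timeout;
}

/* On expiry the offending job is simply the first list element. */
struct model_job *model_timed_out_job(const struct model_sched *s,
                                      unsigned long now)
{
    if (!s->armed || now < s->deadline)
        return NULL;
    return s->mirror[s->head];
}
```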

Signed-off-by: Nayan Deshmukh 
Suggested-by: Christian König 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher

drm/scheduler: Add stopped flag to drm_sched_entity

2018-08-27T16:11:10+00:00

The flag will prevent another thread from the same process from
reinserting the entity queue into the scheduler's rq after it was already
removed from there by another thread during drm_sched_entity_flush.
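
The effect of the flag can be sketched in a single-threaded userspace model.
The model_* names are hypothetical, locking is left out on purpose (the real
check would have to run under the lock that protects the run queue), and
refusing the push outright is a simplification made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of an entity with a stopped flag. */
struct model_entity {
    bool stopped;    /* set once flush removed the entity from the rq */
    bool on_rq;      /* entity is currently in the scheduler's run queue */
    unsigned queued; /* jobs accepted so far */
};

/* Flush: take the entity off the run queue and mark it stopped. */
void model_entity_flush(struct model_entity *e)
{
    e->on_rq = false;
    e->stopped = true;
}

/* Returns 0 on success, -1 if the entity was already stopped. */
int model_entity_push_job(struct model_entity *e)
{
    if (e->stopped)
        return -1; /* do not reinsert a stopped entity into the rq */
    e->queued++;
    e->on_rq = true;
    assert(!e->stopped);
    return 0;
}
```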

Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Chunming Zhou 
Signed-off-by: Alex Deucher