<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h, branch v5.15</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>drm/amd/amdgpu embed hw_fence into amdgpu_job</title>
<updated>2021-08-16T19:16:58+00:00</updated>
<author>
<name>Jack Zhang</name>
<email>Jack.Zhang1@amd.com</email>
</author>
<published>2021-05-12T07:06:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c530b02f39850a639b72d01ebbf7e5d745c60831'/>
<id>c530b02f39850a639b72d01ebbf7e5d745c60831</id>
<content type='text'>
Why: Previously hw fence is alloced separately with job.
It caused historical lifetime issues and corner cases.
The ideal situation is to take fence to manage both job
and fence's lifetime, and simplify the design of gpu-scheduler.

How:
We propose to embed hw_fence into amdgpu_job.
1. We cover the normal job submission by this method.
2. For ib_test, and submit without a parent job keep the
legacy way to create a hw fence separately.
v2:
use AMDGPU_FENCE_FLAG_EMBED_IN_JOB_BIT to show that the fence is
embedded in a job.
v3:
remove redundant variable ring in amdgpu_job
v4:
add tdr sequence support for this feature. Add a job_run_counter to
indicate whether this job is a resubmit job.
v5
add missing handling in amdgpu_fence_enable_signaling

Signed-off-by: Jingwen Chen &lt;Jingwen.Chen2@amd.com&gt;
Signed-off-by: Jack Zhang &lt;Jack.Zhang7@hotmail.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed by: Monk Liu &lt;monk.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Why: Previously hw fence is alloced separately with job.
It caused historical lifetime issues and corner cases.
The ideal situation is to take fence to manage both job
and fence's lifetime, and simplify the design of gpu-scheduler.

How:
We propose to embed hw_fence into amdgpu_job.
1. We cover the normal job submission by this method.
2. For ib_test, and submit without a parent job keep the
legacy way to create a hw fence separately.
v2:
use AMDGPU_FENCE_FLAG_EMBED_IN_JOB_BIT to show that the fence is
embedded in a job.
v3:
remove redundant variable ring in amdgpu_job
v4:
add tdr sequence support for this feature. Add a job_run_counter to
indicate whether this job is a resubmit job.
v5
add missing handling in amdgpu_fence_enable_signaling

Signed-off-by: Jingwen Chen &lt;Jingwen.Chen2@amd.com&gt;
Signed-off-by: Jack Zhang &lt;Jack.Zhang7@hotmail.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed by: Monk Liu &lt;monk.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Move to a per-IB secure flag (TMZ)</title>
<updated>2020-04-28T20:20:29+00:00</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2020-04-22T21:56:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0bb5d5b03f78aeb5f87d47877eb15532875c64da'/>
<id>0bb5d5b03f78aeb5f87d47877eb15532875c64da</id>
<content type='text'>
Move from a per-CS secure flag (TMZ) to a per-IB
secure flag.

Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Huang Rui &lt;ray.huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move from a per-CS secure flag (TMZ) to a per-IB
secure flag.

Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Huang Rui &lt;ray.huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: job is secure iff CS is secure (v5)</title>
<updated>2020-04-28T20:20:28+00:00</updated>
<author>
<name>Huang Rui</name>
<email>ray.huang@amd.com</email>
</author>
<published>2019-08-08T12:05:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cb5fae143d79d255251921066dbf8eae16383639'/>
<id>cb5fae143d79d255251921066dbf8eae16383639</id>
<content type='text'>
Mark a job as secure, if and only if the command
submission flag has the secure flag set.

v2: fix the null job pointer while in vmid 0
submission.
v3: Context --&gt; Command submission.
v4: filling cs parser with cs-&gt;in.flags
v5: move the job secure flag setting out of amdgpu_cs_submit()

Signed-off-by: Huang Rui &lt;ray.huang@amd.com&gt;
Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mark a job as secure, if and only if the command
submission flag has the secure flag set.

v2: fix the null job pointer while in vmid 0
submission.
v3: Context --&gt; Command submission.
v4: filling cs parser with cs-&gt;in.flags
v5: move the job secure flag setting out of amdgpu_cs_submit()

Signed-off-by: Huang Rui &lt;ray.huang@amd.com&gt;
Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: implement more ib pools (v2)</title>
<updated>2020-04-01T18:44:44+00:00</updated>
<author>
<name>xinhui pan</name>
<email>xinhui.pan@amd.com</email>
</author>
<published>2020-03-26T00:38:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c8e42d57859d5055bfe3313cfd5dc025097b753e'/>
<id>c8e42d57859d5055bfe3313cfd5dc025097b753e</id>
<content type='text'>
We have three ib pools, they are normal, VM, direct pools.

Any jobs which schedule IBs without dependence on gpu scheduler should
use DIRECT pool.

Any jobs schedule direct VM update IBs should use VM pool.

Any other jobs use NORMAL pool.

v2: squash in coding style fix

Signed-off-by: xinhui pan &lt;xinhui.pan@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We have three ib pools, they are normal, VM, direct pools.

Any jobs which schedule IBs without dependence on gpu scheduler should
use DIRECT pool.

Any jobs schedule direct VM update IBs should use VM pool.

Any other jobs use NORMAL pool.

v2: squash in coding style fix

Signed-off-by: xinhui pan &lt;xinhui.pan@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: drop amdgpu_job.owner</title>
<updated>2020-01-16T18:35:10+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2019-12-16T13:54:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=971fe55545de2f67463def381df57d803dddf61d'/>
<id>971fe55545de2f67463def381df57d803dddf61d</id>
<content type='text'>
Entirely unused.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Entirely unused.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Avoid HW GPU reset for RAS.</title>
<updated>2019-09-13T22:41:05+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2019-09-13T22:40:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7c6e68c777f109484559a35b125a773439bbd319'/>
<id>7c6e68c777f109484559a35b125a773439bbd319</id>
<content type='text'>
Problem:
Under certain conditions, when some IP bocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.

Temporary fix until proper solution in PSP/SMU is ready:
When uncorrectable error happens the DF will unconditionally
broadcast error event packets to all its clients/slave upon
receiving fatal error event and freeze all its outbound queues,
err_event_athub interrupt  will be triggered.
In such case and we use this interrupt
to issue GPU reset. THe GPU reset code is modified for such case to avoid HW
reset, only stops schedulers, deatches all in progress and not yet scheduled
job's fences, set error code on them and signals.
Also reject any new incoming job submissions from user space.
All this is done to notify the applications of the problem.

v2:
Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev
Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c
Remove print param from amdgpu_ras_query_error_count

v3:
Update based on prevoius bug fixing patch to properly call amdgpu_amdkfd_pre_reset
for other XGMI hive memebers.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Acked-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Under certain conditions, when some IP bocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.

Temporary fix until proper solution in PSP/SMU is ready:
When uncorrectable error happens the DF will unconditionally
broadcast error event packets to all its clients/slave upon
receiving fatal error event and freeze all its outbound queues,
err_event_athub interrupt  will be triggered.
In such case and we use this interrupt
to issue GPU reset. THe GPU reset code is modified for such case to avoid HW
reset, only stops schedulers, deatches all in progress and not yet scheduled
job's fences, set error code on them and signals.
Also reject any new incoming job submissions from user space.
All this is done to notify the applications of the problem.

v2:
Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev
Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c
Remove print param from amdgpu_ras_query_error_count

v3:
Update based on prevoius bug fixing patch to properly call amdgpu_amdkfd_pre_reset
for other XGMI hive memebers.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Acked-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: add ib preemption status in amdgpu_job (v2)</title>
<updated>2019-06-21T23:57:40+00:00</updated>
<author>
<name>Jack Xiao</name>
<email>Jack.Xiao@amd.com</email>
</author>
<published>2019-01-17T07:47:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d8780dc71d56acf734a5756cac8c7787eb6bec11'/>
<id>d8780dc71d56acf734a5756cac8c7787eb6bec11</id>
<content type='text'>
Add ib preemption status in amdgpu_job, so that ring level function
can detect preemption and program for resuming it.

v2: squash in fix to restore job-&gt;preamble_status back to status value (Jack)

Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Jack Xiao &lt;Jack.Xiao@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add ib preemption status in amdgpu_job, so that ring level function
can detect preemption and program for resuming it.

v2: squash in fix to restore job-&gt;preamble_status back to status value (Jack)

Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Jack Xiao &lt;Jack.Xiao@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Modify the argument of emit_ib interface</title>
<updated>2018-11-05T19:21:50+00:00</updated>
<author>
<name>Rex Zhu</name>
<email>Rex.Zhu@amd.com</email>
</author>
<published>2018-10-24T05:37:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=34955e038a1b313b0f19eeacfb0e22aa6877e11d'/>
<id>34955e038a1b313b0f19eeacfb0e22aa6877e11d</id>
<content type='text'>
use the point of struct amdgpu_job as the function
argument instand of vmid, so the other members of
struct amdgpu_job can be visit in emit_ib function.

v2: add a wrapper for getting the VMID
    add the job before the ib on the parameter list.
v3: refine the wrapper name

Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Rex Zhu &lt;Rex.Zhu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
use the point of struct amdgpu_job as the function
argument instand of vmid, so the other members of
struct amdgpu_job can be visit in emit_ib function.

v2: add a wrapper for getting the VMID
    add the job before the ib on the parameter list.
v3: refine the wrapper name

Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Rex Zhu &lt;Rex.Zhu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: remove job-&gt;adev (v2)</title>
<updated>2018-07-17T19:18:28+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2018-07-13T15:15:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a1917b73d89e88a6ecdd076b9d6618682f1b0d08'/>
<id>a1917b73d89e88a6ecdd076b9d6618682f1b0d08</id>
<content type='text'>
We can get that from the ring.

v2: squash in "drm/amdgpu: always initialize job-&gt;base.sched" (Alex)

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Junwei Zhang &lt;Jerry.Zhang@amd.com&gt;
Acked-by: Chunming Zhou &lt;david1.zhou@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We can get that from the ring.

v2: squash in "drm/amdgpu: always initialize job-&gt;base.sched" (Alex)

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Junwei Zhang &lt;Jerry.Zhang@amd.com&gt;
Acked-by: Chunming Zhou &lt;david1.zhou@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: add amdgpu_job_submit_direct helper</title>
<updated>2018-07-16T21:11:53+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2018-07-13T14:29:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ee913fd9e166384aacc0aa70ffd4e93ca41d54b0'/>
<id>ee913fd9e166384aacc0aa70ffd4e93ca41d54b0</id>
<content type='text'>
Make sure that we properly initialize at least the sched member.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Junwei Zhang &lt;Jerry.Zhang@amd.com&gt;
Acked-by: Chunming Zhou &lt;david1.zhou@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make sure that we properly initialize at least the sched member.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Junwei Zhang &lt;Jerry.Zhang@amd.com&gt;
Acked-by: Chunming Zhou &lt;david1.zhou@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
