<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c, branch v5.7</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>drm/amdkfd: change SDMA MQD memory type</title>
<updated>2020-02-28T21:59:20+00:00</updated>
<author>
<name>Eric Huang</name>
<email>JinhuiEric.Huang@amd.com</email>
</author>
<published>2020-02-26T19:13:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f2cc50cefd0f209b4ee87206660d2554a7be7ae0'/>
<id>f2cc50cefd0f209b4ee87206660d2554a7be7ae0</id>
<content type='text'>
SDMA MQD memory type is NC that causes MQD data overwritten
accidentally by an old stable cache line. Changing it to UC
default for GART will fix the issue.

The mqd_gfx9 parameter is meant for control stacks that are
allocated together with user mode queue MQDs. Setting
mqd_gfx9 to true maps the control stack pages as NC.
Here it was accidentally applied to SDMA MQDs,
which are allocated together with the HIQ MQD. Setting
the mqd_gfx9 to false avoids that.

Signed-off-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Acked-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SDMA MQD memory type is NC that causes MQD data overwritten
accidentally by an old stable cache line. Changing it to UC
default for GART will fix the issue.

The mqd_gfx9 parameter is meant for control stacks that are
allocated together with user mode queue MQDs. Setting
mqd_gfx9 to true maps the control stack pages as NC.
Here it was accidentally applied to SDMA MQDs,
which are allocated together with the HIQ MQD. Setting
the mqd_gfx9 to false avoids that.

Signed-off-by: Eric Huang &lt;jinhuieric.huang@amd.com&gt;
Acked-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Delete unnecessary unmap queue package submissions</title>
<updated>2020-02-26T19:20:33+00:00</updated>
<author>
<name>Yong Zhao</name>
<email>Yong.Zhao@amd.com</email>
</author>
<published>2020-02-05T21:53:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c7637c95abebc811d2479a5b924859025def544a'/>
<id>c7637c95abebc811d2479a5b924859025def544a</id>
<content type='text'>
The previous way of using SDMA queue count to infer whether we should unmap
SDMA engines has bugs. The reason it did not cause issues is because MEC
firmware unmaps all queues (CP + SDMA) when a unmap package for compute
engine is received. Becasue of that, only one unmap queue package
is needed, instead of one unmap queue package for CP and each SDMA engine,
which results in much simpler driver code.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The previous way of using SDMA queue count to infer whether we should unmap
SDMA engines has bugs. The reason it did not cause issues is because MEC
firmware unmaps all queues (CP + SDMA) when a unmap package for compute
engine is received. Becasue of that, only one unmap queue package
is needed, instead of one unmap queue package for CP and each SDMA engine,
which results in much simpler driver code.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Delete excessive printings</title>
<updated>2020-02-26T19:20:26+00:00</updated>
<author>
<name>Yong Zhao</name>
<email>Yong.Zhao@amd.com</email>
</author>
<published>2020-02-05T23:22:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1e21647402f953a0f1098b4ecfff751734c6c9ee'/>
<id>1e21647402f953a0f1098b4ecfff751734c6c9ee</id>
<content type='text'>
Those printings are duplicated or useless.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Those printings are duplicated or useless.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Count active CP queues directly</title>
<updated>2020-02-26T19:20:13+00:00</updated>
<author>
<name>Yong Zhao</name>
<email>Yong.Zhao@amd.com</email>
</author>
<published>2020-02-05T19:48:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b42902f4af8fecd2781f71ad166e8a807edb1053'/>
<id>b42902f4af8fecd2781f71ad166e8a807edb1053</id>
<content type='text'>
The previous code of calculating active CP queues is problematic if
some SDMA queues are inactive. Fix that by counting CP queues directly.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The previous code of calculating active CP queues is problematic if
some SDMA queues are inactive. Fix that by counting CP queues directly.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Avoid ambiguity by indicating it's cp queue</title>
<updated>2020-02-26T19:20:05+00:00</updated>
<author>
<name>Yong Zhao</name>
<email>Yong.Zhao@amd.com</email>
</author>
<published>2020-01-30T23:35:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e6945304187deae0a28ebc65008ec11277f1c0f0'/>
<id>e6945304187deae0a28ebc65008ec11277f1c0f0</id>
<content type='text'>
The queues represented in queue_bitmap are only CP queues.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The queues represented in queue_bitmap are only CP queues.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Rename queue_count to active_queue_count</title>
<updated>2020-02-26T19:19:38+00:00</updated>
<author>
<name>Yong Zhao</name>
<email>Yong.Zhao@amd.com</email>
</author>
<published>2020-01-30T23:25:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=81b820b304a07fa4de9d46f095475555881af8fe'/>
<id>81b820b304a07fa4de9d46f095475555881af8fe</id>
<content type='text'>
The name is easier to understand the code.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The name is easier to understand the code.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix a bug in SDMA RLC queue counting under HWS mode</title>
<updated>2020-02-04T15:32:41+00:00</updated>
<author>
<name>Yong Zhao</name>
<email>Yong.Zhao@amd.com</email>
</author>
<published>2020-01-30T00:55:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f38abc15d157b7b31fa7f651dc8bf92858c963f8'/>
<id>f38abc15d157b7b31fa7f651dc8bf92858c963f8</id>
<content type='text'>
The sdma_queue_count increment should be done before
execute_queues_cpsch(), which calls pm_calc_rlib_size() where
sdma_queue_count is used to calculate whether over_subscription is
triggered.

With the previous code, when a SDMA queue is created,
compute_queue_count in pm_calc_rlib_size() is one more than the
actual compute queue number, because the queue_count has been
incremented while sdma_queue_count has not. This patch fixes that.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The sdma_queue_count increment should be done before
execute_queues_cpsch(), which calls pm_calc_rlib_size() where
sdma_queue_count is used to calculate whether over_subscription is
triggered.

With the previous code, when a SDMA queue is created,
compute_queue_count in pm_calc_rlib_size() is one more than the
actual compute queue number, because the queue_count has been
incremented while sdma_queue_count has not. This patch fixes that.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Add a message when SW scheduler is used</title>
<updated>2020-01-16T18:38:07+00:00</updated>
<author>
<name>Yong Zhao</name>
<email>Yong.Zhao@amd.com</email>
</author>
<published>2020-01-10T19:15:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=52055039297622f164ed2cead43954bb3e29e1b2'/>
<id>52055039297622f164ed2cead43954bb3e29e1b2</id>
<content type='text'>
SW scheduler is previously called non HW scheduler, or non HWS. This
message is useful when triaging issues from dmesg.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Huang Rui &lt;ray.huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SW scheduler is previously called non HW scheduler, or non HWS. This
message is useful when triaging issues from dmesg.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Acked-by: Huang Rui &lt;ray.huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Avoid hanging hardware in stop_cpsch</title>
<updated>2020-01-07T16:55:04+00:00</updated>
<author>
<name>Felix Kuehling</name>
<email>Felix.Kuehling@amd.com</email>
</author>
<published>2019-12-20T07:46:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c2a77fde10ec2cac33a23c8ac53d181bc2fe0cee'/>
<id>c2a77fde10ec2cac33a23c8ac53d181bc2fe0cee</id>
<content type='text'>
Don't use the HWS if it's known to be hanging. In a reset also
don't try to destroy the HIQ because that may hang on SRIOV if the
KIQ is unresponsive.

Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Tested-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Reviewed-by: shaoyunl  &lt;shaoyun.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Don't use the HWS if it's known to be hanging. In a reset also
don't try to destroy the HIQ because that may hang on SRIOV if the
KIQ is unresponsive.

Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Tested-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Reviewed-by: shaoyunl  &lt;shaoyun.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Improve HWS hang detection and handling</title>
<updated>2020-01-07T16:54:56+00:00</updated>
<author>
<name>Felix Kuehling</name>
<email>Felix.Kuehling@amd.com</email>
</author>
<published>2019-12-20T07:07:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=09c34e8d7a631cf541c82f46ee73433c28b6ce0e'/>
<id>09c34e8d7a631cf541c82f46ee73433c28b6ce0e</id>
<content type='text'>
Move HWS hang detection into unmap_queues_cpsch to catch hangs in all
cases. If this happens during a reset, don't schedule another reset
because the reset already in progress is expected to take care of it.

Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Tested-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Reviewed-by: shaoyunl  &lt;shaoyun.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move HWS hang detection into unmap_queues_cpsch to catch hangs in all
cases. If this happens during a reset, don't schedule another reset
because the reset already in progress is expected to take care of it.

Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Tested-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Reviewed-by: shaoyunl  &lt;shaoyun.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
