<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c, branch linux-5.4.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>drm/amdkfd: Walk through list with dqm lock hold</title>
<updated>2021-07-19T06:53:11+00:00</updated>
<author>
<name>xinhui pan</name>
<email>xinhui.pan@amd.com</email>
</author>
<published>2021-06-15T07:11:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=356bb9411a26955a93ac1a366f3e6885a2f78ca4'/>
<id>356bb9411a26955a93ac1a366f3e6885a2f78ca4</id>
<content type='text'>
[ Upstream commit 56f221b6389e7ab99c30bbf01c71998ae92fc584 ]

To avoid any list corruption.

Signed-off-by: xinhui pan &lt;xinhui.pan@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 56f221b6389e7ab99c30bbf01c71998ae92fc584 ]

To avoid any list corruption.

Signed-off-by: xinhui pan &lt;xinhui.pan@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix UBSAN shift-out-of-bounds warning</title>
<updated>2021-05-11T12:04:09+00:00</updated>
<author>
<name>Anson Jacob</name>
<email>Anson.Jacob@amd.com</email>
</author>
<published>2021-03-03T17:33:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0c0356ef2498c1a250fe3846f30293f828737309'/>
<id>0c0356ef2498c1a250fe3846f30293f828737309</id>
<content type='text'>
[ Upstream commit 50e2fc36e72d4ad672032ebf646cecb48656efe0 ]

If get_num_sdma_queues or get_num_xgmi_sdma_queues is 0, we end up
doing a shift operation where the number of bits shifted equals
number of bits in the operand. This behaviour is undefined.

Set num_sdma_queues or num_xgmi_sdma_queues to ULLONG_MAX, if the
count is &gt;= number of bits in the operand.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1472

Reported-by: Lyude Paul &lt;lyude@redhat.com&gt;
Signed-off-by: Anson Jacob &lt;Anson.Jacob@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Tested-by: Lyude Paul &lt;lyude@redhat.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 50e2fc36e72d4ad672032ebf646cecb48656efe0 ]

If get_num_sdma_queues or get_num_xgmi_sdma_queues is 0, we end up
doing a shift operation where the number of bits shifted equals
number of bits in the operand. This behaviour is undefined.

Set num_sdma_queues or num_xgmi_sdma_queues to ULLONG_MAX, if the
count is &gt;= number of bits in the operand.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1472

Reported-by: Lyude Paul &lt;lyude@redhat.com&gt;
Signed-off-by: Anson Jacob &lt;Anson.Jacob@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Tested-by: Lyude Paul &lt;lyude@redhat.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: fix a memory leak issue</title>
<updated>2020-10-01T11:18:15+00:00</updated>
<author>
<name>Dennis Li</name>
<email>Dennis.Li@amd.com</email>
</author>
<published>2020-09-02T09:11:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c48363d19fcbbcfbef23e3daff3fd33df70c4a29'/>
<id>c48363d19fcbbcfbef23e3daff3fd33df70c4a29</id>
<content type='text'>
[ Upstream commit 087d764159996ae378b08c0fdd557537adfd6899 ]

In the resume stage of GPU recovery, start_cpsch will call pm_init
which set pm-&gt;allocated as false, cause the next pm_release_ib has
no chance to release ib memory.

Add pm_release_ib in stop_cpsch which will be called in the suspend
stage of GPU recovery.

Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Dennis Li &lt;Dennis.Li@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 087d764159996ae378b08c0fdd557537adfd6899 ]

In the resume stage of GPU recovery, start_cpsch will call pm_init
which set pm-&gt;allocated as false, cause the next pm_release_ib has
no chance to release ib memory.

Add pm_release_ib in stop_cpsch which will be called in the suspend
stage of GPU recovery.

Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Dennis Li &lt;Dennis.Li@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix a bug in SDMA RLC queue counting under HWS mode</title>
<updated>2020-02-24T07:36:32+00:00</updated>
<author>
<name>Yong Zhao</name>
<email>Yong.Zhao@amd.com</email>
</author>
<published>2020-01-30T00:55:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=25c85d8574d8e4466a6334325a5ab5a3aab3c4c2'/>
<id>25c85d8574d8e4466a6334325a5ab5a3aab3c4c2</id>
<content type='text'>
[ Upstream commit f38abc15d157b7b31fa7f651dc8bf92858c963f8 ]

The sdma_queue_count increment should be done before
execute_queues_cpsch(), which calls pm_calc_rlib_size() where
sdma_queue_count is used to calculate whether over_subscription is
triggered.

With the previous code, when a SDMA queue is created,
compute_queue_count in pm_calc_rlib_size() is one more than the
actual compute queue number, because the queue_count has been
incremented while sdma_queue_count has not. This patch fixes that.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit f38abc15d157b7b31fa7f651dc8bf92858c963f8 ]

The sdma_queue_count increment should be done before
execute_queues_cpsch(), which calls pm_calc_rlib_size() where
sdma_queue_count is used to calculate whether over_subscription is
triggered.

With the previous code, when a SDMA queue is created,
compute_queue_count in pm_calc_rlib_size() is one more than the
actual compute queue number, because the queue_count has been
incremented while sdma_queue_count has not. This patch fixes that.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix MQD size calculation</title>
<updated>2019-12-31T15:43:47+00:00</updated>
<author>
<name>Oak Zeng</name>
<email>Oak.Zeng@amd.com</email>
</author>
<published>2019-10-04T14:28:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=be30e550a54c4b088a0b6f4e1ac4b27b72decd41'/>
<id>be30e550a54c4b088a0b6f4e1ac4b27b72decd41</id>
<content type='text'>
[ Upstream commit 40a9592a26608e16f7545a068ea4165e1869f629 ]

On device initialization, a chunk of GTT memory is pre-allocated for
HIQ and all SDMA queues mqd. The size of this allocation was wrong.
The correct sdma engine number should be PCIe-optimized SDMA engine
number plus xgmi SDMA engine number.

Reported-by: Jonathan Kim &lt;Jonathan.Kim@amd.com&gt;
Signed-off-by: Jonathan Kim &lt;Jonathan.Kim@amd.com&gt;
Signed-off-by: Oak Zeng &lt;Oak.Zeng@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 40a9592a26608e16f7545a068ea4165e1869f629 ]

On device initialization, a chunk of GTT memory is pre-allocated for
HIQ and all SDMA queues mqd. The size of this allocation was wrong.
The correct sdma engine number should be PCIe-optimized SDMA engine
number plus xgmi SDMA engine number.

Reported-by: Jonathan Kim &lt;Jonathan.Kim@amd.com&gt;
Signed-off-by: Jonathan Kim &lt;Jonathan.Kim@amd.com&gt;
Signed-off-by: Oak Zeng &lt;Oak.Zeng@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Make deallocate_hiq_sdma_mqd static</title>
<updated>2019-08-22T22:25:10+00:00</updated>
<author>
<name>YueHaibing</name>
<email>yuehaibing@huawei.com</email>
</author>
<published>2019-05-25T12:51:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7fd5a6fb9a75cc00479a7ad4305b5ab31ae2cd59'/>
<id>7fd5a6fb9a75cc00479a7ad4305b5ab31ae2cd59</id>
<content type='text'>
Fix sparse warning:

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:1846:6:
 warning: symbol 'deallocate_hiq_sdma_mqd' was not declared. Should it be static?

Reported-by: Hulk Robot &lt;hulkci@huawei.com&gt;
Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix sparse warning:

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:1846:6:
 warning: symbol 'deallocate_hiq_sdma_mqd' was not declared. Should it be static?

Reported-by: Hulk Robot &lt;hulkci@huawei.com&gt;
Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix sdma_bitmap overflow issue</title>
<updated>2019-07-18T19:18:04+00:00</updated>
<author>
<name>Oak Zeng</name>
<email>Oak.Zeng@amd.com</email>
</author>
<published>2019-07-09T14:40:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=35cdc81bfa94d10373ecae279f3c48ca858ac4fd'/>
<id>35cdc81bfa94d10373ecae279f3c48ca858ac4fd</id>
<content type='text'>
In the original formula, when sdma queue number is 64,
the left shift overflows. Use an equivalence that won't
overflow.

Signed-off-by: Oak Zeng &lt;Oak.Zeng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In the original formula, when sdma queue number is 64,
the left shift overflows. Use an equivalence that won't
overflow.

Signed-off-by: Oak Zeng &lt;Oak.Zeng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>amd/amdkfd: Add ASIC ARCTURUS to kfd</title>
<updated>2019-07-18T19:18:03+00:00</updated>
<author>
<name>Yong Zhao</name>
<email>Yong.Zhao@amd.com</email>
</author>
<published>2019-07-09T14:37:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=49adcf8a6f951450417c14afa6a404b7caea25ef'/>
<id>49adcf8a6f951450417c14afa6a404b7caea25ef</id>
<content type='text'>
Add initial support for ARCTURUS to kfd.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Signed-off-by: Oak Zeng &lt;Oak.Zeng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add initial support for ARCTURUS to kfd.

Signed-off-by: Yong Zhao &lt;Yong.Zhao@amd.com&gt;
Signed-off-by: Oak Zeng &lt;Oak.Zeng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: fix cp hang in eviction</title>
<updated>2019-07-11T19:37:24+00:00</updated>
<author>
<name>Eric Huang</name>
<email>JinhuiEric.Huang@amd.com</email>
</author>
<published>2019-07-09T19:33:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=70df8273ca0cebd2bba443a4444b4944a5929151'/>
<id>70df8273ca0cebd2bba443a4444b4944a5929151</id>
<content type='text'>
The cp hang occurs in OCL conformance test only on supermicro
platform which has 40 cores and the test generates 40 threads.
The root cause is race condition in non-protected flags.

The fix is to add flags of is_evicted and is_active(init_mqd())
into protected area.

Signed-off-by: Eric Huang &lt;JinhuiEric.Huang@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The cp hang occurs in OCL conformance test only on supermicro
platform which has 40 cores and the test generates 40 threads.
The root cause is race condition in non-protected flags.

The fix is to add flags of is_evicted and is_active(init_mqd())
into protected area.

Signed-off-by: Eric Huang &lt;JinhuiEric.Huang@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdkfd: Set gws_mask to 64 bit 1s</title>
<updated>2019-06-22T14:34:14+00:00</updated>
<author>
<name>Oak Zeng</name>
<email>Oak.Zeng@amd.com</email>
</author>
<published>2019-06-14T15:55:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d9848e149da1ac98bf1220a4fe5977c5bfeca7ef'/>
<id>d9848e149da1ac98bf1220a4fe5977c5bfeca7ef</id>
<content type='text'>
Previous kfd doesn't use gws so this mask was set to 0.
Set it to 64 bit 1s because now kfd can use all 64 gws
resources.

Signed-off-by: Oak Zeng &lt;Oak.Zeng@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previous kfd doesn't use gws so this mask was set to 0.
Set it to 64 bit 1s because now kfd can use all 64 gws
resources.

Signed-off-by: Oak Zeng &lt;Oak.Zeng@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
