linux-stable.git/drivers/gpu/drm/amd/amdgpu, branch v5.7.7

drm/amdgpu: add fw release for sdma v5_0

2020-06-30T19:36:28+00:00

commit edfaf6fa73f15568d4337f208b2333f647c35810 upstream.

sdma fw isn't released when module exit

Reviewed-by: Hawking Zhang 
Signed-off-by: Wenhui Sheng 
Signed-off-by: Alex Deucher 
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode

2020-06-22T07:32:47+00:00

[ Upstream commit 90ca78deb004abe75b5024968a199acb96bb70f9 ]

This fixes an intermittent bug where a root PD clear operation still in
progress could overwrite a PDE update done by the CPU, resulting in a
VM fault.

Fixes: 108b4d928c03 ("drm/amd/amdgpu: Update VM function pointer")
Reported-by: Jay Cornwall 
Tested-by: Jay Cornwall 
Signed-off-by: Felix Kuehling 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin

drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven

2020-06-22T07:32:47+00:00

[ Upstream commit cbd2d08c7463e78d625a69e9db27ad3004cbbd99 ]

[Problem description]
1. Boot up picasso platform, launches desktop, Don't do anything (APU enter into "gfxoff" state)
2. Remote login to platform using SSH, then type the command line:
sudo su -c "echo manual > /sys/class/drm/card0/device/power_dpm_force_performance_level"
sudo su -c "echo 2 > /sys/class/drm/card0/device/pp_dpm_sclk" (fix SCLK to 1400MHz)
3. Move the mouse around in Window
4. Phenomenon : The screen frozen

Tester will switch sclk level during glmark2 run time.
APU will enter "gfxoff" state intermittently during glmark2 run time.
The system got hanged if fix GFXCLK to 1400MHz when APU is in "gfxoff"
state.

[Debug]
1. Fix SCLK to X MHz
1400: screen frozen, screen black, then OS will reboot.
1300: screen frozen.
1200: screen frozen, screen black.
1100: screen frozen, screen black, then OS will reboot.
1000: screen frozen, screen black.
900: screen frozen, screen black, then OS will reboot.
800: Situation Nomal, issue disappear.
700: Situation Nomal, issue disappear.
2. SBIOS setting: AMD CBS --> SMU Debug Options -->SMU Debug --> "GFX DLDO Psm Margin Control":
50 : Situation Nomal, issue disappear.
45 : Situation Nomal, issue disappear.
40 : Situation Nomal, issue disappear.
35 : Situation Nomal, issue disappear.
30 : screen black.
25 : screen frozen, then blurred screen.
20 : screen frozen.
15 : screen black.
10 : screen frozen.
5 : screen frozen, then blurred screen.
3. Disable GFXOFF feature
Situation Nomal, issue disappear.

[Why]
Through a period of time debugging with Sys Eng team and SMU team, Sys
Eng team said this is voltage/frequency marginal issue not a F/W or H/W
bug. This experiment proves that default targetPsm [for f=1400MHz] is
not sufficient when GFXOFF is enabled on Picasso.

SMU team think it is an odd test conditions to force sclk="1400MHz" when
GPU is in "gfxoff" state，then wake up the GFX. SCLK should be in the
"lowest frequency" when gfxoff.

[How]
Disable gfxoff when setting manual mode.
Enable gfxoff when setting other mode(exiting manual mode) again.

By the way, from the user point of view, now that user switch to manual
mode and force SCLK Frequency, he don't want SCLK be controlled by
workload.It becomes meaningless to "switch to manual mode" if APU enter "gfxoff"
due to lack of workload at this point.

Tips: Same issue observed on Raven.

Signed-off-by: chen gong
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
Signed-off-by: Sasha Levin

drm/amdgpu: Init data to avoid oops while reading pp_num_states.

2020-06-22T07:32:18+00:00

[ Upstream commit 6f81b2d047c59eb77cd04795a44245d6a52cdaec ]

For chip like CHIP_OLAND with si enabled(amdgpu.si_support=1),
the amdgpu will expose pp_num_states to the /sys directory.
In this moment, read the pp_num_states file will excute the
amdgpu_get_pp_num_states func. In our case, the data hasn't
been initialized, so the kernel will access some ilegal
address, trigger the segmentfault and system will reboot soon:

    uos@uos-PC:~$ cat /sys/devices/pci0000\:00/0000\:00\:00.0/0000\:01\:00
    .0/pp_num_states

    Message from syslogd@uos-PC at Apr 22 09:26:20 ...
     kernel:[   82.154129] Internal error: Oops: 96000004 [#1] SMP

This patch aims to fix this problem, avoid that reading file
triggers the kernel sementfault.

Signed-off-by: limingyu 
Signed-off-by: zhoubinbin 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin

drm/amdgpu: fix and cleanup amdgpu_gem_object_close v4

2020-06-22T07:32:16+00:00

[ Upstream commit 82c416b13cb7d22b96ec0888b296a48dff8a09eb ]

The problem is that we can't add the clear fence to the BO
when there is an exclusive fence on it since we can't
guarantee the the clear fence will complete after the
exclusive one.

To fix this refactor the function and also add the exclusive
fence as shared to the resv object.

v2: fix warning
v3: add excl fence as shared instead
v4: squash in fix for fence handling in amdgpu_gem_object_close

Signed-off-by: Christian König 
Reviewed-by: xinhui pan 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin

drm/amd/amdgpu: add raven1 part to the gfxoff quirk list

2020-05-12T12:39:33+00:00

On my raven1 system (rev c6) with VBIOS 113-RAVEN-114 GFXOFF is
not stable (resulting in large block tiling noise in some applications).

Disabling GFXOFF via the quirk list fixes the problems for me.

Signed-off-by: Tom St Denis 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Cc: stable@vger.kernel.org

drm/amdgpu: implement soft_recovery for gfx10

2020-05-08T18:53:24+00:00

Same as gfx9.  This allows us to kill the waves for hung
shaders.

Acked-by: Evan Quan 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher

drm/amdgpu: enable hibernate support on Navi1X

2020-05-08T18:52:53+00:00

BACO is needed to support hibernate on Navi1X.

Signed-off-by: Evan Quan 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Cc: stable@vger.kernel.org

drm/amdgpu: Use GEM obj reference for KFD BOs

2020-05-08T18:44:12+00:00

Releasing the AMDGPU BO ref directly leads to problems when BOs were
exported as DMA bufs. Releasing the GEM reference makes sure that the
AMDGPU/TTM BO is not freed too early.

Also take a GEM reference when importing BOs from DMABufs to keep
references to imported BOs balances properly.

Signed-off-by: Felix Kuehling 
Tested-by: Alex Sierra 
Acked-by: Christian König 
Reviewed-by: Alex Sierra 
Signed-off-by: Alex Deucher

drm/amdgpu: force fbdev into vram

2020-05-08T18:44:11+00:00

We set the fb smem pointer to the offset into the BAR, so keep
the fbdev bo in vram.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=207581
Fixes: 6c8d74caa2fa33 ("drm/amdgpu: Enable scatter gather display support")
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Cc: stable@vger.kernel.org