path: root/drivers/gpu
Age | Commit message | Author
2025-11-12 | drm/sun4i: layer: move num of planes calc out of layer code | Jernej Skrabec
With DE33, number of planes no longer depends on mixer because layers are shared between all mixers. Get this value via parameter, so DE specific code can fill in proper value. Reviewed-by: Chen-Yu Tsai <wens@kernel.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-16-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: ui_layer: Change index meaning | Jernej Skrabec
In the pursuit of making UI/VI layer code independent of DE version, change the meaning of the UI index to the index of the plane within the mixer. DE33 can split the VI and UI planes between multiple mixers in whatever way it deems acceptable, so the simple calculation VI num + UI index won't be meaningful anymore. Reviewed-by: Chen-Yu Tsai <wens@kernel.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-15-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: de2/de3: Move plane type determination to mixer | Jernej Skrabec
Plane type determination logic inside layer init functions doesn't allow index register to be repurposed to plane sequence, which it almost is. So move out the logic to mixer, which allows further rework for DE33 support. Reviewed-by: Chen-Yu Tsai <wens@kernel.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-14-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: csc: Simplify arguments with taking plane state | Jernej Skrabec
Taking plane state directly reduces number of arguments, avoids copying values and allows making additional decisions. For example, when plane is disabled, CSC should be turned off. This is also cleanup for later patches which will move call to another place. Reviewed-by: Chen-Yu Tsai <wens@kernel.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-13-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: de2/de3: Simplify CSC config interface | Jernej Skrabec
Merging both functions into one lets it decide on its own whether CSC should be enabled. Currently the heuristic for that is pretty simple: enable it for YUV formats and disable it for RGB. DE3 and newer allow a YUV pipeline, which will be easier to implement this way. Reviewed-by: Chen-Yu Tsai <wens@kernel.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-12-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
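The heuristic in this commit message (CSC on for YUV, off for RGB), combined with the earlier note that CSC should be off when a plane is disabled, could look roughly like the sketch below. This is a standalone illustration and not the driver's code: `enum pixel_family` and `csc_should_enable()` are hypothetical names invented for the example.

```c
#include <stdbool.h>

/* Hypothetical classification of a plane's pixel format. */
enum pixel_family {
	PIXEL_FAMILY_RGB,
	PIXEL_FAMILY_YUV,
};

/*
 * Decide whether colour space conversion is needed for a plane.
 * A plane that is not visible never needs CSC; otherwise only
 * YUV input does.
 */
static inline bool csc_should_enable(bool plane_visible,
				     enum pixel_family family)
{
	if (!plane_visible)
		return false;

	return family == PIXEL_FAMILY_YUV;
}
```

Keeping the decision in one function means callers pass the plane state and do not need separate "configure" and "enable" calls.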
2025-11-12 | drm/sun4i: mixer: Move layer enabling to atomic_update | Jernej Skrabec
Enable or disable layer only in layer atomic update callback. Doing so will enable having separate layer driver later for DE33. There is no fear that enable bit would be set incorrectly, as all read-modify-write sequences for that register are now eliminated. Reviewed-by: Chen-Yu Tsai <wens@kernel.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-11-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: vi layer: Write attributes in one go | Jernej Skrabec
It turns out that none of the VI channel registers were meant to be read. Mostly it works fine but sometimes it returns incorrect values. Rework VI layer code to write all registers in one go to avoid reads. This rework will also allow proper code separation. Reviewed-by: Chen-Yu Tsai <wens@kernel.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-10-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: ui layer: Write attributes in one go | Jernej Skrabec
It turns out that none of the UI channel registers were meant to be read. Mostly it works fine but sometimes it returns incorrect values. Rework UI layer code to write all registers in one go to avoid reads. This rework will also allow proper code separation. Reviewed-by: Chen-Yu Tsai <wens@kernel.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-9-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: Move blender config from layers to mixer | Jernej Skrabec
With upcoming DE33 support, layer management must be decoupled from other operations like blender configuration. There are two reasons: - DE33 will have a separate driver for planes, so it will be harder to manage different register spaces - Architecturally, it's better to split access by modules. The blender is now exclusively managed by the mixer. Reviewed-by: Chen-Yu Tsai <wens@kernel.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-8-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: layers: Make atomic commit functions void | Jernej Skrabec
Functions called by atomic_commit callback should not fail. None of them actually returns error, so make them void. No functional change. Reviewed-by: Chen-Yu Tsai <wens@csie.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-7-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: vi_layer: Move check from update to check callback | Jernej Skrabec
DRM requires that all checks are done in atomic_check callback. Move one check from atomic_commit to atomic_check callback. Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Reviewed-by: Chen-Yu Tsai <wens@kernel.org> Link: https://patch.msgid.link/20251104180942.61538-6-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: ui_layer: Move check from update to check callback | Jernej Skrabec
DRM requires that all checks are done in atomic_check callback. Move one check from atomic_commit to atomic_check callback. Reviewed-by: Chen-Yu Tsai <wens@csie.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-5-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: de2: Initialize layer fields earlier | Jernej Skrabec
drm_universal_plane_init() can already call some callbacks, like format_mod_supported, during initialization. Because of that, fields should be initialized beforehand. Reviewed-by: Chen-Yu Tsai <wens@csie.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-4-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: mixer: Remove ccsc cfg for >= DE3 | Jernej Skrabec
Those engine versions don't need ccsc argument, since CSC units are located on different position and for each layer. Reviewed-by: Chen-Yu Tsai <wens@csie.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-3-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-12 | drm/sun4i: mixer: Fix up DE33 channel macros | Jernej Skrabec
Properly define macros. Till now raw numbers and inappropriate macro was used. Reviewed-by: Chen-Yu Tsai <wens@csie.org> Tested-by: Ryan Walklin <ryan@testtoast.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251104180942.61538-2-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
2025-11-11 | drm/amdkfd: Fix GPU mappings for APU after prefetch | Harish Kasiviswanathan
Fix the following corner case: Consider a 2M huge page SVM allocation, followed by a prefetch call for the first 4K page. The whole range is initially mapped with a single PTE. After the prefetch, this range gets split into the first page + the rest of the pages. Currently, the first page mapping is not updated on MI300A (APU) since the page hasn't migrated. However, after the range split the PTE mapping is not valid. Fix this by forcing a page table update for the whole range when prefetch is called. Calling prefetch on APU doesn't improve performance. If anything, it deteriorates. However, functionality has to be supported. v2: Use apu_prefer_gtt as this issue doesn't apply to APUs with carveout VRAM v3: Simplify by setting the flag for all ASICs as it doesn't affect dGPU v4: Remove v2 and v3 changes. Force update_mapping when range is split at a size that is not aligned to prange granularity Suggested-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 076470b9f6f8d9c7c8ca73a9f054942a686f9ba7)
2025-11-11 | drm/amdkfd: relax checks for over allocation of save area | Jonathan Kim
Over allocation of save area is not fatal, only under allocation is. ROCm has various components that independently claim authority over save area size. Unless KFD decides to claim single authority, relax size checks. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 15bd4958fe38e763bc17b607ba55155254a01f55) Cc: stable@vger.kernel.org
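The relaxed rule described here (reject only under-allocation, tolerate over-allocation) can be sketched as a one-sided size check. This is a minimal standalone illustration, not the KFD code: `check_save_area_size()` and its parameters are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical validation helper: accept any buffer at least as large
 * as the required size, reject only buffers that are too small.
 */
static inline int check_save_area_size(size_t provided, size_t required)
{
	if (provided < required)
		return -EINVAL;	/* under-allocation is fatal */

	return 0;		/* exact fit or over-allocation is fine */
}
```

A strict equality check would instead fail whenever two components disagree about the exact size, even though the larger buffer is still usable.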
2025-11-11 | drm/amdgpu/jpeg: Add parse_cs for JPEG5_0_1 | Sathishkumar S
enable parse_cs callback for JPEG5_0_1. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 547985579932c1de13f57f8bcf62cd9361b9d3d3) Cc: stable@vger.kernel.org
2025-11-11 | drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO | Sultan Alsawaf
When the BO pointer provided to amdgpu_bo_create_kernel() points to non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address rather than allocate a new BO. This functionality is never desired for allocating ISP buffers. A new BO should always be created when isp_kernel_buffer_alloc() is called, per the description for isp_kernel_buffer_alloc(). Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call. Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 73c8c29baac7f0c7e703d92eba009008cbb5228e)
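The pattern behind this fix, an "out" pointer that doubles as a reuse hint, can be shown with a small userspace analogue. Everything in the sketch is hypothetical (`struct buffer`, `buffer_create()`, `buffer_alloc_new()`); it only mimics the convention described above and does not reproduce the amdgpu functions.

```c
#include <stdlib.h>

struct buffer {
	size_t size;
};

/*
 * Hypothetical allocator mimicking the described convention:
 * if *out is already non-NULL the existing object is reused,
 * otherwise a new one is allocated.
 */
static inline int buffer_create(size_t size, struct buffer **out)
{
	if (!*out) {
		*out = calloc(1, sizeof(**out));
		if (!*out)
			return -1;
	}
	(*out)->size = size;
	return 0;
}

/* Wrapper that must always hand back a fresh object. */
static inline int buffer_alloc_new(size_t size, struct buffer **out)
{
	*out = NULL;	/* drop any stale pointer so nothing is reused */
	return buffer_create(size, out);
}
```

Clearing the pointer in the wrapper keeps the guarantee in one place, so callers do not need to remember to pass a NULL-initialized pointer. In this sketch the caller still owns whatever the stale pointer referred to.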
2025-11-11 | drm/amd/display: Allow VRR params change if unsynced with the stream | Ivan Lipski
[Why] When changing resolution (e.g., 4K → FHD) in mirror/clone mode with certain monitors, the monitor blanks and loses connection due to an early exit in vrr_settings_require_update(). The function only checks if VRR state, fixed refresh target, or min/max refresh rate range has changed. During mode changes, if the calculated min/max refresh values remain the same even though the stream's v_total changed, the function returns early without updating vrr_params.adjust.v_total_min/max, leaving the monitor's VRR timing parameters unsynced with the new mode, causing it to blank out. [How] Explicitly adjust VRR parameters to the stream's nominal v_total when VRR is supported, but inactive. Fixes: 6d31602a9f57 ("drm/amd/display: more liberal vmin/vmax update for freesync") Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 607df8248a011524211ee34850345305a1913f9e)
2025-11-11 | drm/amdgpu: fix lock warning in amdgpu_userq_fence_driver_process | Jesse.Zhang
Fix a potential deadlock caused by inconsistent spinlock usage between interrupt and process contexts in the userq fence driver. The issue occurs when amdgpu_userq_fence_driver_process() is called from both: - Interrupt context: gfx_v11_0_eop_irq() -> amdgpu_userq_fence_driver_process() - Process context: amdgpu_eviction_fence_suspend_worker() -> amdgpu_userq_fence_driver_force_completion() -> amdgpu_userq_fence_driver_process() In interrupt context, the spinlock was acquired without disabling interrupts, leaving it in {IN-HARDIRQ-W} state. When the same lock is acquired in process context, the kernel detects inconsistent locking since the process context acquisition would enable interrupts while holding a lock previously acquired in interrupt context. Kernel log shows: [ 4039.310790] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. [ 4039.310804] kworker/7:2/409 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 4039.310818] ffff9284e1bed000 (&fence_drv->fence_list_lock){?...}-{3:3}, [ 4039.310993] {IN-HARDIRQ-W} state was registered at: [ 4039.311004] lock_acquire+0xc6/0x300 [ 4039.311018] _raw_spin_lock+0x39/0x80 [ 4039.311031] amdgpu_userq_fence_driver_process.part.0+0x30/0x180 [amdgpu] [ 4039.311146] amdgpu_userq_fence_driver_process+0x17/0x30 [amdgpu] [ 4039.311257] gfx_v11_0_eop_irq+0x132/0x170 [amdgpu] Fix by using spin_lock_irqsave()/spin_unlock_irqrestore() to properly manage interrupt state regardless of calling context. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ded3ad780cf97a04927773c4600823b84f7f3cc2) Cc: stable@vger.kernel.org
2025-11-11 | drm/amdgpu: jump to the correct label on failure | Pierre-Eric Pelloux-Prayer
drm_sched_entity_init wasn't called yet, so the only thing to do is to release allocated memory. This doesn't fix any bug since entity is zero allocated and drm_sched_entity_fini does nothing in this case. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ec49374ccb8da86b465beaf09c367f3dfd648d8f)
2025-11-11 | drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfaces | Vitaly Prosyak
Certain multi-GPU configurations (especially GFX12) may hit data corruption when a DCC-compressed VRAM surface is shared across GPUs using peer-to-peer (P2P) DMA transfers. Such surfaces rely on device-local metadata and cannot be safely accessed through a remote GPU’s page tables. Attempting to import a DCC-enabled surface through P2P leads to incorrect rendering or GPU faults. This change disables P2P for DCC-enabled VRAM buffers that are contiguous and allocated on GFX12+ hardware. In these cases, the importer falls back to the standard system-memory path, avoiding invalid access to compressed surfaces. Future work could consider optional migration (VRAM→System→VRAM) if a performance regression is observed when `attach->peer2peer = false`. Tested on: - Dual RX 9700 XT (Navi4x) setup - GNOME and Wayland compositor scenarios - Confirmed no corruption after disabling P2P under these conditions v2: Remove check TTM_PL_VRAM & TTM_PL_FLAG_CONTIGUOUS. v3: simplify for upstream and fix ip version check (Alex) Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9dff2bb709e6fbd97e263fd12bf12802d2b5a0cf) Cc: stable@vger.kernel.org
2025-11-11 | drm/amdkfd: Fix GPU mappings for APU after prefetch | Harish Kasiviswanathan
Fix the following corner case: Consider a 2M huge page SVM allocation, followed by a prefetch call for the first 4K page. The whole range is initially mapped with a single PTE. After the prefetch, this range gets split into the first page + the rest of the pages. Currently, the first page mapping is not updated on MI300A (APU) since the page hasn't migrated. However, after the range split the PTE mapping is not valid. Fix this by forcing a page table update for the whole range when prefetch is called. Calling prefetch on APU doesn't improve performance. If anything, it deteriorates. However, functionality has to be supported. v2: Use apu_prefer_gtt as this issue doesn't apply to APUs with carveout VRAM v3: Simplify by setting the flag for all ASICs as it doesn't affect dGPU v4: Remove v2 and v3 changes. Force update_mapping when range is split at a size that is not aligned to prange granularity Suggested-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/vce1: Workaround PLL timeout on FirePro W9000 | Timur Kristóf
Sometimes the VCE PLL times out waiting for CTLACK/CTLACK2. When it happens, the VCE still works, but much slower. Observed on a Tahiti GPU, but not all: - FirePro W9000 has the issue - Radeon R9 280X not affected - Radeon HD 7990 not affected As a workaround, on the affected chip just don't put the VCE PLL in sleep mode. Leaving the VCE PLL in bypass mode or reset mode both work. Using bypass mode is simpler. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/vce1: Enable VCE1 on Tahiti, Pitcairn, Cape Verde GPUs | Timur Kristóf
Add the VCE1 IP block to the SI GPUs that have it. Advertise the encoder capabilities corresponding to VCE1, so the userspace applications can detect and use it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Co-developed-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amd/pm/si: Hook up VCE1 to SI DPM | Timur Kristóf
On SI GPUs, the SMC needs to be aware of whether or not the VCE1 is used. The VCE1 is enabled/disabled through the DPM code. Also print VCE clocks in amdgpu_pm_info. Users can inspect the current power state using: cat /sys/kernel/debug/dri/<card>/amdgpu_pm_info Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/vce1: Ensure VCPU BO is in lower 32-bit address space (v3) | Timur Kristóf
Based on research and ideas by Alexandre and Christian. VCE1 actually executes its code from the VCPU BO. Due to various hardware limitations, the VCE1 requires the VCPU BO to be in the low 32 bit address range. However, VRAM is typically mapped at the high address range, which means the VCPU can't access VRAM through the FB aperture. To solve this, we write a few page table entries to map the VCPU BO in the GART address range. And we make sure that the GART is located at the low address range. That way the VCE1 can access the VCPU BO. v2: - Adjust to v2 of the GART helper commit. - Add empty line to multi-line comment. v3: - Instead of relying on gmc_v6 to set the GART space before GTT, add a new function amdgpu_vce_required_gart_pages() which is called from amdgpu_gtt_mgr_init() directly. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Co-developed-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
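The constraint described here, that the whole buffer must be reachable with a 32-bit address, amounts to checking that a range ends at or below 4 GiB. The helper below is a standalone sketch with a hypothetical name (`range_fits_in_32bit()`); the driver satisfies the constraint by placing GART at the low address range and does not necessarily perform a check like this one.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the range [addr, addr + size) lies entirely below
 * 4 GiB, i.e. every byte is reachable with a 32-bit address.
 */
static inline bool range_fits_in_32bit(uint64_t addr, uint64_t size)
{
	const uint64_t limit = UINT64_C(1) << 32;

	if (size == 0 || addr >= limit)
		return false;

	/* Compare against the remaining space to avoid overflow. */
	return size <= limit - addr;
}
```

A buffer that starts just below the 4 GiB boundary but extends past it is rejected, which is the case a naive check on the start address alone would miss.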
2025-11-11 | drm/amdgpu: Check if AID is active before access | Lijo Lazar
Access XGMI registers only if AID is active. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/vce1: Implement VCE1 IP block (v2) | Timur Kristóf
Implement the necessary functionality to support the VCE1. This implementation is based on: - VCE2 code from amdgpu - VCE1 code from radeon (the old driver) - Some trial and error A subsequent commit will ensure correct mapping for the VCPU BO, which will make this actually work. v2: - Use memset_io more. - Use memcpy_toio more. - Remove __func__ from warnings. - Don't reserve and map the VCPU BO anymore. - Add empty line to multi-line comments Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Co-developed-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/vce1: Load VCE1 firmware | Timur Kristóf
Load VCE1 firmware using amdgpu_ucode_request, just like it is done for other VCE versions. All SI chips share the same VCE1 firmware file: vce_1_0_0.bin which will be sent to linux-firmware soon. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Co-developed-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/vce1: Clean up register definitions | Timur Kristóf
The sid.h header contained some VCE1 register definitions, but they were using byte offsets (probably copied from the old radeon driver). Move all of these to the proper VCE1 headers and ensure they are in dword offsets. Also add the register definitions that we need for the firmware validation mechanism in VCE1. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Co-developed-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
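The difference between the two conventions is a factor of four: a dword offset indexes 32-bit registers, so it is the byte offset divided by 4. A minimal sketch of the conversion follows; `byte_to_dword_offset()` is a hypothetical helper for illustration and is not taken from the headers.

```c
#include <stdint.h>

/* Convert a byte offset into a dword (32-bit register) index. */
static inline uint32_t byte_to_dword_offset(uint32_t byte_offset)
{
	return byte_offset >> 2;	/* 4 bytes per dword */
}
```

For example, an illustrative byte offset of 0x20000 corresponds to dword offset 0x8000. Mixing the two conventions in one header makes it easy to address the wrong register.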
2025-11-11 | drm/amdgpu/vce: Clear VCPU BO, don't unmap/unreserve (v4) | Timur Kristóf
The VCPU BO doesn't only contain the VCE firmware but also other ranges that the VCE uses for its stack and data. Let's initialize this to zero to avoid having garbage in the VCPU BO. Additionally, don't unmap/unreserve the VCPU BO. The VCPU BO needs to stay at the same location before and after sleep/resume because the FW code is not relocatable once it's started. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/vce: Move firmware load to amdgpu_vce_early_init | Timur Kristóf
Try to load the VCE firmware at early_init. When the correct firmware is not found, return -ENOENT. This way, the driver initialization will complete even without VCE, and the GPU will be functional, albeit without video encoding capabilities. This is necessary because we are planning to add support for the VCE1, and AMD hasn't yet published the correct firmware for this version. So we need to anticipate that users will try to boot amdgpu on SI GPUs without the correct VCE1 firmware present on their system. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/ttm: Use GART helper to map VRAM pages (v2) | Timur Kristóf
Use the GART helper function introduced in the previous commit to map the VRAM pages of the transfer window to GART. No functional changes, just code cleanup. Split this into a separate commit to make it easier to bisect, in case there are problems in the future. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdkfd: relax checks for over allocation of save area | Jonathan Kim
Over allocation of save area is not fatal, only under allocation is. ROCm has various components that independently claim authority over save area size. Unless KFD decides to claim single authority, relax size checks. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu: Use DC by default on SI dGPUs | Timur Kristóf
Now that DC supports analog connectors, it has reached feature parity with the legacy non-DC display driver on SI dGPUs. Use the DC display driver by default on SI dGPUs, unless it is explicitly disabled using the amdgpu.dc=0 module parameter. DC brings proper support for DP/HDMI audio, DP MST, 10-bit colors, some HDR features, atomic modesetting, etc. Also clarify the comment about what is missing to have full DC support for CIK APUs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/gart: Add helper to bind VRAM pages (v2) | Timur Kristóf
Bind pages that are located in VRAM to the GART page table. Useful when a kernel BO is located in VRAM but needs to be accessed from the GART address space, for example to give a kernel BO a 32-bit address when GART is placed in LOW address space. v2: - Refactor function to be more reusable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/gmc6: Place gart at low address range | Timur Kristóf
Instead of using a best-fit algorithm to determine which part of the VMID 0 address space to use for GART, always use the low address range. A subsequent commit will use this to map the VCPU BO in GART for the VCE1 IP block. Split this into a separate patch to make it easier to bisect, in case there are any errors in the future. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/userqueue: Remove duplicate amdgpu_reset.h header | Jiapeng Chong
./drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c: amdgpu_reset.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=26930 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu: resume MES scheduling after user queue hang detection and recovery | Jesse.Zhang
This patch ensures the Micro-Engine Scheduler (MES) is properly resumed after detecting and recovering from a user queue hang condition. Key changes: 1. Track when a hung user queue is detected using found_hung_queue flag 2. Call amdgpu_mes_resume() to restart MES scheduling after completing the hang recovery process 3. This complements the existing recovery steps (fence force completion and device wedging) by ensuring the scheduler can process new work Without this resume call, the MES scheduler may remain in a paused state even after the hung queue has been handled, preventing newly submitted work from being processed and leading to system stalls. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amdgpu/jpeg: Add parse_cs for JPEG5_0_1 | Sathishkumar S
enable parse_cs callback for JPEG5_0_1. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amd/pm: Remove power2_average node | Asad Kamal
SOC power consumption is reported by power1_average. power2_cap_default/min/max only represent second level limits and don't represent a different type of power or power consumption by a subsection of the SOC. Therefore power2_average does not serve any purpose and hence removing power2_average sysfs node Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amd/pm: Enable ppt1 caps for smu_v13_0_12 | Asad Kamal
Enable ppt1 caps to fetch and configure ppt1 for smu_v13_0_12 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amd/pm: Expose ppt1 limit for gc_v9_5_0 | Asad Kamal
Expose power2_cap hwmon node for retrieving and configuring ppt1 limit on supported boards for gc_v9_5_0 v2: Remove version check (Lijo) v3: Remove power2_average (Lijo) v4: Put back power2_average, will be removed separately (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amd/pm: Add ppt1 support for smu_v13_0_12 | Asad Kamal
Add support to configure and retrieve ppt1 limit for smu_v13_0_12 v2: Add update_caps function and update ppt1 cap based on max ppt1 value, optimize the return values (Lijo) v3: Add Null ptr check, return not supported in case of invalid level/type (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amd/pm: Update pmfw headers for smu_v13_0_12 | Asad Kamal
Update pmfw headers for smu_v13_0_12 to include ppt1 messages and static parameters Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amd/display: Add kdoc params/returns in dc/link detection helpers | Srinivasan Shanmugam
The link detection helpers in dc/link/link_detection.c were missing kdoc annotations for parameters and return values. Fixes the below with gcc W=1: ...link_detection.c:872 parameter 'edid_header' not described ...link_detection.c:890 parameter 'link' not described ...link_detection.c:914 parameter 'link' not described ...link_detection.c:1355 parameter 'link' not described ...link_detection.c:1355 parameter 'type' not described Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amd/display: Fix annotations for connector poll/detect parameters | Srinivasan Shanmugam
Adds the missing @aconnector, @connector, and @force descriptions: @aconnector – This is the DM (Display Manager) connector. It gives access to the DRM connector, the DC link, and hotplug/poll state. The code uses it to check the link, update the sink, and manage connector state changes. @connector – This is the main DRM connector given by the DRM core. Inside the detect function, it is converted to amdgpu_dm_connector so we can run DC link detection, either light or full. @force – This flag tells the function whether to run a full detect again. If false, we avoid heavy DAC load detect steps to prevent flicker. If true, we force a re-detect even when we normally skip it. Fixes the below with gcc W=1: function param 'aconnector' not described in 'amdgpu_dm_connector_poll' function param 'force' not described in 'amdgpu_dm_connector_poll' function param 'connector' not described in 'amdgpu_dm_connector_detect' function param 'force' not described in 'amdgpu_dm_connector_detect' Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-11 | drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO | Sultan Alsawaf
When the BO pointer provided to amdgpu_bo_create_kernel() points to non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address rather than allocate a new BO. This functionality is never desired for allocating ISP buffers. A new BO should always be created when isp_kernel_buffer_alloc() is called, per the description for isp_kernel_buffer_alloc(). Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call. Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>