summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2025-10-09drm/xe: Move GGTT lock init to allocMatthew Brost
The GGTT lock is needed very early during GT initialization for a VF; move the GGTT lock initialization to the allocation phase. v8: - Rework function structure (Michal) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-13-matthew.brost@intel.com
2025-10-09drm/xe/vf: Remove memory allocations from VF post migration recoveryMatthew Brost
VF post migration recovery is the path of dma-fence signaling / reclaim, avoid memory allocations in this path. v3: - s/lrc_wa_bb/scratch (Tomasz) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-12-matthew.brost@intel.com
2025-10-09drm/xe/vf: Abort H2G sends during VF post-migration recoveryMatthew Brost
While VF post-migration recovery is in progress, abort H2G sends with -ECANCEL. These messages are treated as lost, and TLB invalidation errors are suppressed. During this phase, the H2G channel is down, and VF recovery requires the CT lock to proceed. v3: - Use xe_gt_recovery_inprogress (Michal) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-11-matthew.brost@intel.com
2025-10-09drm/xe/vf: Make VF recovery run on per-GT workerMatthew Brost
VF recovery is a per-GT operation, so it makes sense to isolate it to a per-GT queue. Scheduling this operation on the same worker as the GT reset and TDR not only aligns with this design but also helps avoid race conditions, as those operations can also modify the queue state. v2: - Fix lockdep splat (Adam) - Use xe_sriov_vf_migration_supported helper v3: - Drop xe_gt_sriov_ prefix for private functions (Michal) - Drop message in xe_gt_sriov_vf_migration_init_early (Michal) - Logic rework in vf_post_migration_notify_resfix_done (Michal) - Rework init sequence layering (Michal) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-10-matthew.brost@intel.com
2025-10-09drm/xe/vf: Add xe_gt_recovery_pending helperMatthew Brost
Add xe_gt_recovery_pending helper. This helper serves as the singular point to determine whether a GT recovery is currently in progress. Expected callers include the GuC CT layer and the GuC submission layer. Atomically visable as soon as vCPU are unhalted until VF recovery completes. v3: - Add GT layer xe_gt_recovery_inprogress (Michal) - Don't blow up in memirq not enabled (CI) - Add __memirq_received with clear argument (Michal) - xe_memirq_sw_int_0_irq_pending rename (Michal) - Use offset in xe_memirq_sw_int_0_irq_pending (Michal) v4: - Refactor xe_gt_recovery_inprogress logic around memirq (Michal) v5: - s/inprogress/pending (Michal) v7: - Fix typos, adjust comment (Michal) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-9-matthew.brost@intel.com
2025-10-09drm/xe: Make LRC W/A scratch buffer usage consistentMatthew Brost
The LRC W/A currently checks for LRC being iomem in some places, while in others it checks if the scratch buffer is non-NULL. This inconsistency causes issues with the VF post-migration recovery code, which blindly passes in a scratch buffer. This patch standardizes the check by consistently verifying whether the LRC is iomem to determine if the scratch buffer should be used. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-8-matthew.brost@intel.com
2025-10-09drm/xe: Don't change LRC ring head on job resubmissionMatthew Brost
Now that we save the job's head during submission, it's no longer necessary to adjust the LRC ring head during resubmission. Instead, a software-based adjustment of the tail will overwrite the old jobs in place. For some odd reason, adjusting the LRC ring head didn't work on parallel queues, which was causing issues in our CI. v5: - Add comment in guc_exec_queue_start explaning why the function works (Auld) v7: - Only adjust first state on first unsignaled job (Auld) v8: - Break unsignaled job handling to separate patch (Auld) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-7-matthew.brost@intel.com
2025-10-09drm/xe: Return first unsignaled job first pending job helperMatthew Brost
In all cases where the first pending job helper is called, we only want to retrieve the first unsignaled pending job, as this helper is used exclusively in recovery flows. It is possible for signaled jobs to remain in the pending list as the scheduler is stopped, so those should be skipped. Also, add kernel documentation to clarify this behavior. v8: - Split out into own patch (Auld) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-6-matthew.brost@intel.com
2025-10-09drm/xe: Track LR jobs in DRM scheduler pending listMatthew Brost
VF migration requires jobs to remain pending so they can be replayed after the VF comes back. Previously, LR job fences were intentionally signaled immediately after submission to avoid the risk of exporting them, as these fences do not naturally signal in a timely manner and could break dma-fence contracts. A side effect of this approach was that LR jobs were never added to the DRM scheduler’s pending list, preventing them from being tracked for later resubmission. We now avoid signaling LR job fences and ensure they are never exported; Xe already guards against exporting these internal fences. With that guarantee in place, we can safely track LR jobs in the scheduler’s pending list so they are eligible for resubmission during VF post-migration recovery (and similar recovery paths). An added benefit is that LR queues now gain the DRM scheduler’s built-in flow control over ring usage rather than rejecting new jobs in the exec IOCTL if the ring is full. v2: - Ensure DRM scheduler TDR doesn't run for LR jobs - Stack variable for killed_or_banned_or_wedged v4: - Clarify commit message (Tomasz) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-5-matthew.brost@intel.com
2025-10-09drm/xe/guc: Track pending-enable source in submission stateMatthew Brost
Add explicit tracking in the GuC submission state to record the source of a pending enable (TDR vs. queue resume path vs. submission). Disambiguating the origin lets the GuC submission state machine apply the correct recovery/replay behavior. This helps VF restore: when the device comes back, the state machine knows whether the pending enable stems from timeout recovery, from a queue resume sequence, or submission and can gate sequencing and fixups accordingly. v4: - Clarify commit message (Tomasz) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-4-matthew.brost@intel.com
2025-10-09drm/xe: Save off position in ring in which a job was programmedMatthew Brost
VF post-migration recovery needs to modify the ring with updated GGTT addresses for pending jobs. Save off position in ring in which a job was programmed to facilitate. v4: - s/VF resume/VF post-migration recovery (Tomasz) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-3-matthew.brost@intel.com
2025-10-09drm/xe: Add NULL checks to scratch LRC allocationMatthew Brost
kmalloc can fail, the returned value must have a NULL check. This should be immediately after kmalloc for clarity. v5: - Assert state->buffer in setup_bo if buffer is iomem (Tomasz) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://lore.kernel.org/r/20251008214532.3442967-2-matthew.brost@intel.com
2025-10-09drm/i915/guc: Skip communication warning on reset in progressZhanjun Dong
GuC IRQ and tasklet handler receive just single G2H message, and let other messages to be received from next tasklet. During this chained tasklet process, if reset process started, communication will be disabled. Skip warning for this condition. Fixes: 65dd4ed0f4e1 ("drm/i915/guc: Don't receive all G2H messages in irq handler") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15018 Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://lore.kernel.org/r/20250929152904.269776-1-zhanjun.dong@intel.com
2025-10-09drm/i915/alpm: Remove parameters suffix from intel_dp->alpm_parametersJouni Högander
Now as intel_dp->alpm_parameters doesn't really contain any parameters it doesn't make sense to call it as alpm_parameters -> remove parameters suffix. Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://lore.kernel.org/r/20250929130003.28365-2-jouni.hogander@intel.com
2025-10-09drm/i915/alpm: Compute ALPM parameters into crtc_state->alpm_stateJouni Högander
Currently ALPM parameters are computed directly into intel_dp->alpm_parameters. This is a problem when compute config ends up to not using the computed state. Fix this by adding ALPM parameters into intel_crtc_state and compute into there. Copy needed parameters (io_wake_lines and fast_wake_lines used by PSR activate/exit) from crtc_state->alpm_state into intel_dp->alpm.alpm_parameters when they are configured into HW. v3: - enhance commit message v2: - store io/fast wake lines into intel_dp->dp instead of intel_dp->alpm_parameters and do it in intel_psr_enable_locked - rename crtc_state->alpm_parameters -> crtc_state->alpm_state - clarify commit message Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://lore.kernel.org/r/20250929130003.28365-1-jouni.hogander@intel.com
2025-10-09drm/virtgpu: Use vblank timerThomas Zimmermann
Use a vblank timer to simulate the vblank interrupt. The DRM vblank helpers provide an implementation on top of Linux' hrtimer. Virtgpu enables and disables the timer as part of the CRTC. The atomic_flush callback sets up the event. Like vblank interrupts, the vblank timer fires at the rate of the display refresh. Most userspace limits its page flip rate according to the DRM vblank event. Virtgpu's virtual hardware does not provide vblank interrupts, so DRM sends each event ASAP. With the fast access times of virtual display memory, the event rate is much higher than the display mode's refresh rate; creating the next page flip almost immediately. This leads to excessive CPU overhead from even small display updates, such as moving the mouse pointer. This problem affects virtgpu and all other virtual displays. See [1] for a discussion in the context of hypervdrm. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/dri-devel/SN6PR02MB415702B00D6D52B0EE962C98D46CA@SN6PR02MB4157.namprd02.prod.outlook.com/ # [1] Acked-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Link: https://lore.kernel.org/r/20251008130701.246988-1-tzimmermann@suse.de
2025-10-09drm/virtio: Handle drm_crtc_init_with_planes() errorsAlexandr Sapozhnikov
Return value of function drm_crtc_init_with_planes(), called by vgdev_output_init(), is not checked, but it is usually checked for this function. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Alexandr Sapozhnikov <alsp705@gmail.com> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> [dmitry.osipenko@collabora.com: coding style fix, edit commit message] Link: https://lore.kernel.org/r/20250922144418.41-1-alsp705@gmail.com
2025-10-08drm/xe: Move declarations under conditional branchTejas Upadhyay
The xe_device_shutdown() function was needing a few declarations that were only required under a specific condition. This change moves those declarations to be within that conditional branch to avoid unnecessary declarations. Reviewed-by: Nitin Gote <nitin.r.gote@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20251007100208.1407021-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2025-10-08drm: atmel-hlcdc: replace dev_* print functions with drm_* variantsEslam Khafagy
Update the Atmel HLCDC code to use DRM print macros drm_*() instead of dev_warn() and dev_err(). This change ensures consistency with DRM subsystem logging conventions [1]. [1] Link: https://docs.kernel.org/gpu/todo.html#convert-logging-to-drm-functions-with-drm-device-parameter Signed-off-by: Eslam Khafagy <eslam.medhat1993@gmail.com> Reviewed-by: Manikandan Muralidharan <manikandan.m@microchip.com> Link: https://lore.kernel.org/r/20250813224000.130292-1-eslam.medhat1993@gmail.com Signed-off-by: Dharma Balasubiramani <dharma.b@microchip.com>
2025-10-07drm/xe/pf: Add max_vfs configfs attribute to control PF modeMichal Wajdeczko
In addition to existing max_vfs modparam, add max_vfs configfs attribute to allow PF configuration on the per-device level. Default config value is still based on the modparam value. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251002232648.203370-1-michal.wajdeczko@intel.com
2025-10-07drm/nouveau: fix bad ret code in nouveau_bo_move_prepShuhao Fu
In `nouveau_bo_move_prep`, if `nouveau_mem_map` fails, an error code should be returned. Currently, it returns zero even if vmm addr is not correctly mapped. Cc: stable@vger.kernel.org Reviewed-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Fixes: 9ce523cc3bf2 ("drm/nouveau: separate buffer object backing memory from nvkm structures") Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-10-07drm/amd/display: Incorrect Mirror CositingJesse Agate
[WHY] hinit/vinit are incorrect in the case of mirroring. [HOW] Cositing sign must be flipped when image is mirrored in the vertical or horizontal direction. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Samson Tam <samson.tam@amd.com> Signed-off-by: Jesse Agate <jesse.agate@amd.com> Signed-off-by: Brendan Leder <breleder@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amd/display: Enable Dynamic DTBCLK SwitchFangzhi Zuo
[WHAT] Since dcn35, DTBCLK can be disabled when no DP2 sink connected for power saving purpose. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amdgpu: Report individual reset errorLijo Lazar
If reinitialization of one of the GPUs fails after reset, it logs failure on all subsequent GPUs eventhough they have resumed successfully. A sample log where only device at 0000:95:00.0 had a failure - amdgpu 0000:15:00.0: amdgpu: GPU reset(19) succeeded! amdgpu 0000:65:00.0: amdgpu: GPU reset(19) succeeded! amdgpu 0000:75:00.0: amdgpu: GPU reset(19) succeeded! amdgpu 0000:85:00.0: amdgpu: GPU reset(19) succeeded! amdgpu 0000:95:00.0: amdgpu: GPU reset(19) failed amdgpu 0000:e5:00.0: amdgpu: GPU reset(19) failed amdgpu 0000:f5:00.0: amdgpu: GPU reset(19) failed amdgpu 0000:05:00.0: amdgpu: GPU reset(19) failed amdgpu 0000:15:00.0: amdgpu: GPU reset end with ret = -5 To avoid confusion, report the error for each device separately and return the first error as the overall result. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amdgpu: partially revert "revert to old status lock handling v3"Christian König
The CI systems are pointing out list corruptions, so we still need to fix something here. Keep the asserts, but revert the lock changes for now. Fixes: 59e4405e9ee2 ("drm/amdgpu: revert to old status lock handling v3") Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amd/display: Fix unsafe uses of kernel mode FPUArd Biesheuvel
The point of isolating code that uses kernel mode FPU in separate compilation units is to ensure that even implicit uses of, e.g., SIMD registers for spilling occur only in a context where this is permitted, i.e., from inside a kernel_fpu_begin/end block. This is important on arm64, which uses -mgeneral-regs-only to build all kernel code, with the exception of such compilation units where FP or SIMD registers are expected to be used. Given that the compiler may invent uses of FP/SIMD anywhere in such a unit, none of its code may be accessible from outside a kernel_fpu_begin/end block. This means that all callers into such compilation units must use the DC_FP start/end macros, which must not occur there themselves. For robustness, all functions with external linkage that reside there should call dc_assert_fp_enabled() to assert that the FPU context was set up correctly. Fix this for the DCN35, DCN351 and DCN36 implementations. Cc: Austin Zheng <austin.zheng@amd.com> Cc: Jun Lei <jun.lei@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <siqueira@igalia.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amd/pm: Disable VCN queue reset on SMU v13.0.6 due to regressionJesse.Zhang
Disable VCN reset capability for the program 4 as it's causing regressions. Fixes: 9d20f37a106f ("drm/amd/pm: Add VCN reset support for SMU v13.0.6") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amdgpu: Fix general protection fault in amdgpu_vm_bo_reset_state_machineJesse.Zhang
After GPU reset with VRAM loss, a general protection fault occurs during user queue restoration when accessing vm_bo->vm after spinlock release in amdgpu_vm_bo_reset_state_machine. The root cause is that vm_bo points to the last entry from the list_for_each_entry loop, but this becomes invalid after the spinlock is released. Accessing vm_bo->vm at this point leads to memory corruption. Crash log shows: [ 326.981811] Oops: general protection fault, probably for non-canonical address 0x4156415741e58ac8: 0000 [#1] SMP NOPTI [ 326.981820] CPU: 13 UID: 0 PID: 1035 Comm: kworker/13:3 Tainted: G E 6.16.0+ #25 PREEMPT(voluntary) [ 326.981826] Tainted: [E]=UNSIGNED_MODULE [ 326.981827] Hardware name: Gigabyte Technology Co., Ltd. X870E AORUS PRO ICE/X870E AORUS PRO ICE, BIOS F3i 12/19/2024 [ 326.981831] Workqueue: events amdgpu_userq_restore_worker [amdgpu] [ 326.981999] RIP: 0010:amdgpu_vm_assert_locked+0x16/0x70 [amdgpu] [ 326.982094] Code: 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 85 ff 74 45 48 8b 87 80 03 00 00 48 85 c0 74 40 <48> 8b b8 80 01 00 00 48 85 ff 74 3b 8b 05 0c b7 0e f0 85 c0 75 05 [ 326.982098] RSP: 0018:ffffaa91c2a6bc20 EFLAGS: 00010206 [ 326.982100] RAX: 4156415741e58948 RBX: ffff9e8f013e8330 RCX: 0000000000000000 [ 326.982102] RDX: 0000000000000005 RSI: 000000001d254e88 RDI: ffffffffc144814a [ 326.982104] RBP: ffffaa91c2a6bc68 R08: 0000004c21a25674 R09: 0000000000000001 [ 326.982106] R10: 0000000000000001 R11: dccaf3f2f82863fc R12: ffff9e8f013e8000 [ 326.982108] R13: ffff9e8f013e8000 R14: 0000000000000000 R15: ffff9e8f09980000 [ 326.982110] FS: 0000000000000000(0000) GS:ffff9e9e79995000(0000) knlGS:0000000000000000 [ 326.982112] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 326.982114] CR2: 000055ed6c9caa80 CR3: 0000000797060000 CR4: 0000000000750ef0 [ 326.982116] PKRU: 55555554 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amdgpu: Check swus/ds for switch state saveLijo Lazar
For saving switch state, check if the GPU is having SWUS/DS architecture. Otherwise, skip saving. Reported-by: Roman Elshin <roman.elshin@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4602 Fixes: 1dd2fa0e00f1 ("drm/amdgpu: Save and restore switch state") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amd/pm: Avoid interface mismatch messagingLijo Lazar
PMFW interface version is not used by some IP implementations like SMU v13.0.6/12, instead rely on PMFW version checks. Avoid the log if interface version is not used. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amdgpu: Merge amdgpu_vm_set_pasid into amdgpu_vm_initJesse.Zhang
As KFD no longer uses a separate PASID, the global amdgpu_vm_set_pasid()function is no longer necessary. Merge its functionality directly intoamdgpu_vm_init() to simplify code flow and eliminate redundant locking. v2: remove superflous check adjust amdgpu_vm_fin and remove amdgpu_vm_set_pasid (Chritian) v3: drop amdgpu_vm_assert_locked (Chritian) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4614 Fixes: 59e4405e9ee2 ("drm/amdgpu: revert to old status lock handling v3") Reviewed-by: Christian König <christian.koenig@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amd/amdgpu: Fix the mes version that support inv_tlbsShaoyun Liu
MES version 0x83 is not stable to use the inv_tlbs API. Defer it to 0x84 vertsion. Fixes: 85442bac8466 ("drm/amd/amdgpu: Fix the mes version that support inv_tlbs") Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amd: Check whether secure display TA loaded successfullyMario Limonciello
[Why] Not all renoir hardware supports secure display. If the TA is present but the feature isn't supported it will fail to load or send commands. This shows ERR messages to the user that make it seems like there is a problem. [How] Check the resp_status of the context to see if there was an error before trying to send any secure display commands. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1415 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amdkfd: Fix mmap write lock not releasePhilip Yang
If mmap write lock is taken while draining retry fault, mmap write lock is not released because svm_range_restore_pages calls mmap_read_unlock then returns. This causes deadlock and system hangs later because mmap read or write lock cannot be taken. Downgrade mmap write lock to read lock if draining retry fault fix this bug. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amdkfd: Fix kfd process ref leaking when userptr unmappingPhilip Yang
kfd_lookup_process_by_pid hold the kfd process reference to ensure it doesn't get destroyed while sending the segfault event to user space. Calling kfd_lookup_process_by_pid as function parameter leaks the kfd process refcount and miss the NULL pointer check if app process is already destroyed. Fixes: 2d274bf7099b ("amd/amdkfd: Trigger segfault for early userptr unmmapping") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amdgpu: Fix for GPU reset being blocked by KIQ I/O.Heng Zhou
There is some probability that reset workqueue is blocked by KIQ I/O for 10+ seconds after gpu hangs. So we need to add a in_reset check during each KIQ register poll. Signed-off-by: Heng Zhou <Heng.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amd/display: Disable scaling on DCE6 for nowTimur Kristóf
Scaling doesn't work on DCE6 at the moment, the current register programming produces incorrect output when using fractional scaling (between 100-200%) on resolutions higher than 1080p. Disable it until we figure out how to program it properly. Fixes: 7c15fd86aaec ("drm/amd/display: dc/dce: add initial DCE6 support (v10)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amd/display: Properly disable scaling on DCE6Timur Kristóf
SCL_SCALER_ENABLE can be used to enable/disable the scaler on DCE6. Program it to 0 when scaling isn't used, 1 when used. Additionally, clear some other registers when scaling is disabled and program the SCL_UPDATE register as recommended. This fixes visible glitches for users whose BIOS sets up a mode with scaling at boot, which DC was unable to clean up. Fixes: b70aaf5586f2 ("drm/amd/display: dce_transform: add DCE6 specific macros,functions") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amd/display: Properly clear SCL_*_FILTER_CONTROL on DCE6Timur Kristóf
Previously, the code would set a bit field which didn't exist on DCE6 so it would be effectively a no-op. Fixes: b70aaf5586f2 ("drm/amd/display: dce_transform: add DCE6 specific macros,functions") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amd/display: Add missing DCE6 SCL_HORZ_FILTER_INIT* SRIsTimur Kristóf
Without these, it's impossible to program these registers. Fixes: 102b2f587ac8 ("drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init (v2)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07drm/amdgpu: Add additional DCE6 SCL registersAlex Deucher
Fixes: 102b2f587ac8 ("drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init (v2)") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-07Merge tag 'hyperv-next-signed-20251006' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Unify guest entry code for KVM and MSHV (Sean Christopherson) - Switch Hyper-V MSI domain to use msi_create_parent_irq_domain() (Nam Cao) - Add CONFIG_HYPERV_VMBUS and limit the semantics of CONFIG_HYPERV (Mukesh Rathor) - Add kexec/kdump support on Azure CVMs (Vitaly Kuznetsov) - Deprecate hyperv_fb in favor of Hyper-V DRM driver (Prasanna Kumar T S M) - Miscellaneous enhancements, fixes and cleanups (Abhishek Tiwari, Alok Tiwari, Nuno Das Neves, Wei Liu, Roman Kisel, Michael Kelley) * tag 'hyperv-next-signed-20251006' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: hyperv: Remove the spurious null directive line MAINTAINERS: Mark hyperv_fb driver Obsolete fbdev/hyperv_fb: deprecate this in favor of Hyper-V DRM driver Drivers: hv: Make CONFIG_HYPERV bool Drivers: hv: Add CONFIG_HYPERV_VMBUS option Drivers: hv: vmbus: Fix typos in vmbus_drv.c Drivers: hv: vmbus: Fix sysfs output format for ring buffer index Drivers: hv: vmbus: Clean up sscanf format specifier in target_cpu_store() x86/hyperv: Switch to msi_create_parent_irq_domain() mshv: Use common "entry virt" APIs to do work in root before running guest entry: Rename "kvm" entry code assets to "virt" to genericize APIs entry/kvm: KVM: Move KVM details related to signal/-EINTR into KVM proper mshv: Handle NEED_RESCHED_LAZY before transferring to guest x86/hyperv: Add kexec/kdump support on Azure CVMs Drivers: hv: Simplify data structures for VMBus channel close message Drivers: hv: util: Cosmetic changes for hv_utils_transport.c mshv: Add support for a new parent partition configuration clocksource: hyper-v: Skip unnecessary checks for the root partition hyperv: Add missing field to hv_output_map_device_interrupt
2025-10-07drm/buddy: Add KUnit tests for allocator performance under fragmentationArunpravin Paneer Selvam
Add KUnit test cases that create severe memory fragmentation and measure allocation/free performance. The tests simulate two scenarios - 1. Allocation under severe fragmentation - Allocate the entire 4 GiB space as 8 KiB blocks with 64 KiB alignment, split them into two groups and free with mixed flags to block coalescing. - Repeatedly allocate and free 64 KiB blocks while timing the loop. - Freelist runtime: 76475 ms(76.5 seconds), soft-lockup triggered. RB-tree runtime: 186 ms. 2. Reverse free order under fragmentation - Create a similarly fragmented space, free half the blocks, reverse the order of the remainder, and release them with the cleared flag. - Freelist runtime: 85620 ms(85.6 seconds). RB-tree runtime: 114 ms. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20251006095124.1663-3-Arunpravin.PaneerSelvam@amd.com
2025-10-07drm/buddy: Separate clear and dirty free block treesArunpravin Paneer Selvam
Maintain two separate RB trees per order - one for clear (zeroed) blocks and another for dirty (uncleared) blocks. This separation improves code clarity and makes it more obvious which tree is being searched during allocation. It also improves scalability and efficiency when searching for a specific type of block, avoiding unnecessary checks and making the allocator more predictable under fragmentation. The changes have been validated using the existing drm_buddy_test KUnit test cases, along with selected graphics workloads, to ensure correctness and avoid regressions. v2: Missed adding the suggested-by tag. Added it in v2. v3(Matthew): - Remove the double underscores from the internal functions. - Rename the internal functions to have less generic names. - Fix the error handling code. - Pass tree argument for the tree macro. - Use the existing dirty/free bit instead of new tree field. - Make free_trees[] instead of clear_tree and dirty_tree for more cleaner approach. v4: - A bug was reported by Intel CI and it is fixed by Matthew Auld. - Replace the get_root function with &mm->free_trees[tree][order] (Matthew) - Remove the unnecessary rbtree_is_empty() check (Matthew) - Remove the unnecessary get_tree_for_flags() function. - Rename get_tree_for_block() name with get_block_tree() for more clarity. v5(Jani Nikula): - Don't use static inline in .c files. - enum free_tree and enumerator names are quite generic for a header and usage and the whole enum should be an implementation detail. v6: - Rewrite the __force_merge() function using the rb_last() and rb_prev(). v7(Matthew): - Replace the open-coded tree iteration for loops with the for_each_free_tree() macro throughout the code. - Fixed out_free_roots to prevent double decrement of i, addressing potential crash. - Replaced enum drm_buddy_free_tree with unsigned int in for_each_free_tree loops. Cc: stable@vger.kernel.org Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality") Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Suggested-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4260 Link: https://lore.kernel.org/r/20251006095124.1663-2-Arunpravin.PaneerSelvam@amd.com
2025-10-07drm/buddy: Optimize free block management with RB treeArunpravin Paneer Selvam
Replace the freelist (O(n)) used for free block management with a red-black tree, providing more efficient O(log n) search, insert, and delete operations. This improves scalability and performance when managing large numbers of free blocks per order (e.g., hundreds or thousands). In the VK-CTS memory stress subtest, the buddy manager merges fragmented memory and inserts freed blocks into the freelist. Since freelist insertion is O(n), this becomes a bottleneck as fragmentation increases. Benchmarking shows list_insert_sorted() consumes ~52.69% CPU with the freelist, compared to just 0.03% with the RB tree (rbtree_insert.isra.0), despite performing the same sorted insert. This also improves performance in heavily fragmented workloads, such as games or graphics tests that stress memory. As the buddy allocator evolves with new features such as clear-page tracking, the resulting fragmentation and complexity have grown. These RB-tree based design changes are introduced to address that growth and ensure the allocator continues to perform efficiently under fragmented conditions. The RB tree implementation with separate clear/dirty trees provides: - O(n log n) aggregate complexity for all operations instead of O(n^2) - Elimination of soft lockups and system instability - Improved code maintainability and clarity - Better scalability for large memory systems - Predictable performance under fragmentation v3(Matthew): - Remove RB_EMPTY_NODE check in force_merge function. - Rename rb for loop macros to have less generic names and move to .c file. - Make the rb node rb and link field as union. v4(Jani Nikula): - The kernel-doc comment should be "/**" - Move all the rbtree macros to rbtree.h and add parens to ensure correct precedence. v5: - Remove the inline in a .c file (Jani Nikula). v6(Peter Zijlstra): - Add rb_add() function replacing the existing rbtree_insert() code. v7: - A full walk iteration in rbtree is slower than the list (Peter Zijlstra). - The existing rbtree_postorder_for_each_entry_safe macro should be used in scenarios where traversal order is not a critical factor (Christian). v8(Matthew): - Remove the rbtree_is_empty() check in this patch as well. Cc: stable@vger.kernel.org Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality") Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20251006095124.1663-1-Arunpravin.PaneerSelvam@amd.com
2025-10-07drm/xe/pf: Improve reading VF config blob from debugfsMichal Wajdeczko
Due to the use of the file operation flows, we might encode the VF config blob multiple times. Avoid that by capturing it once during the open() operation instead of the read() operation. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20251004162036.1800-1-michal.wajdeczko@intel.com
2025-10-07Merge tag 'drm-xe-next-fixes-2025-10-03' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Cross-subsystem Changes: - Fix userptr to not allow device private pages with SVM (Thomas Hellström) Driver Changes: - Fix build with clang 16 (Michal Wajdeczko) - Fix handling of invalid configfs syntax usage and spell out the expected syntax in the documentation (Lucas De Marchi) - Do not try late bind firmware when running as VF since it shouldn't handle firmware loading (Michal Wajdeczko) - Fix idle assertion for local BOs (Thomas Hellström) - Fix uninitialized variable for late binding (Colin Ian King, Mallesh Koujalagi) - Do not require perfmon_capable to expose free memory at page granularity. Handle it like other drm drivers do (Matthew Auld) - Fix lock handling on suspend error path (Shuicheng Lin) - Fix I2C controller resume after S3 (Raag Jadav) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/q6yeyb7n2eqo5megxjqayooajirx5hhsntfo65m3y4myscz7oz@25qbabbbr4hj
2025-10-07Merge tag 'drm-misc-next-fixes-2025-10-02' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next Short summary of fixes pull: v3d: - Fix fence locking Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20251002122303.GA21323@linux.fritz.box
2025-10-06drm/msm: Fix GEM free for imported dma-bufsRob Clark
Imported dma-bufs also have obj->resv != &obj->_resv. So we should check both this condition in addition to flags for handling the _NO_SHARE case. Fixes this splat that was reported with IRIS video playback: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 2040 at drivers/gpu/drm/msm/msm_gem.c:1127 msm_gem_free_object+0x1f8/0x264 [msm] CPU: 3 UID: 1000 PID: 2040 Comm: .gnome-shell-wr Not tainted 6.17.0-rc7 #1 PREEMPT pstate: 81400005 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : msm_gem_free_object+0x1f8/0x264 [msm] lr : msm_gem_free_object+0x138/0x264 [msm] sp : ffff800092a1bb30 x29: ffff800092a1bb80 x28: ffff800092a1bce8 x27: ffffbc702dbdbe08 x26: 0000000000000008 x25: 0000000000000009 x24: 00000000000000a6 x23: ffff00083c72f850 x22: ffff00083c72f868 x21: ffff00087e69f200 x20: ffff00087e69f330 x19: ffff00084d157ae0 x18: 0000000000000000 x17: 0000000000000000 x16: ffffbc704bd46b80 x15: 0000ffffd0959540 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: ffffbc702e6cdb48 x10: 0000000000000000 x9 : 000000000000003f x8 : ffff800092a1ba90 x7 : 0000000000000000 x6 : 0000000000000020 x5 : ffffbc704bd46c40 x4 : fffffdffe102cf60 x3 : 0000000000400032 x2 : 0000000000020000 x1 : ffff00087e6978e8 x0 : ffff00087e6977e8 Call trace: msm_gem_free_object+0x1f8/0x264 [msm] (P) drm_gem_object_free+0x1c/0x30 [drm] drm_gem_object_handle_put_unlocked+0x138/0x150 [drm] drm_gem_object_release_handle+0x5c/0xcc [drm] drm_gem_handle_delete+0x68/0xbc [drm] drm_gem_close_ioctl+0x34/0x40 [drm] drm_ioctl_kernel+0xc0/0x130 [drm] drm_ioctl+0x360/0x4e0 [drm] __arm64_sys_ioctl+0xac/0x104 invoke_syscall+0x48/0x104 el0_svc_common.constprop.0+0x40/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x34/0xec el0t_64_sync_handler+0xa0/0xe4 el0t_64_sync+0x198/0x19c ---[ end trace 0000000000000000 ]--- ------------[ cut here ]------------ Reported-by: Stephan Gerhold <stephan.gerhold@linaro.org> Fixes: de651b6e040b ("drm/msm: Fix refcnt underflow in error path") Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Tested-by: Stephan Gerhold <stephan.gerhold@linaro.org> Tested-by: Luca Weiss <luca.weiss@fairphone.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # qrb5165-rb5 Patchwork: https://patchwork.freedesktop.org/patch/676273/ Message-ID: <20250923140441.746081-1-robin.clark@oss.qualcomm.com>
2025-10-06drm/xe/guc: Ratelimit diagnostic messages from the relayMichal Wajdeczko
There might be some malicious VFs that by sending an invalid VF2PF relay messages will flood PF's dmesg with our diagnostics messages. Rate limit all relay messages, unless running in DEBUG_SRIOV mode. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20251005173946.2784-1-michal.wajdeczko@intel.com