summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-08-04drm/amdgpu: use kmalloc_array() instead of kmalloc()Yunshui Jiang
Use kmalloc_array() instead of kmalloc() with multiplication. kmalloc_array() is a safer way because of its multiply overflow check. Signed-off-by: Yunshui Jiang <jiangyunshui@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amdgpu: Fix unintended error log in VCN5_0_0Sathishkumar S
The error log is supposed to be gaurded under if failure condition. Fixes: faab5ea08367 ("drm/amdgpu: Check vcn sram load return value") Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amd/display: Fix DCE 6.0 and 6.4 PLL programming.Timur Kristóf
Apparently, both DCE 6.0 and 6.4 have 3 PLLs, but PLL0 can only be used for DP. Make sure to initialize the correct amount of PLLs in DC for these DCE versions and use PLL0 only for DP. Also, on DCE 6.0 and 6.4, the PLL0 needs to be powered on at initialization as opposed to DCE 6.1 and 7.x which use a different clock source for DFS. The following functions were used as reference from the old radeon driver implementation of DCE 6.x: - radeon_atom_pick_pll - atombios_crtc_set_disp_eng_pll Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amd/display: Don't overwrite dce60_clk_mgrTimur Kristóf
dc_clk_mgr_create accidentally overwrites the dce60_clk_mgr with the dce_clk_mgr, causing incorrect behaviour on DCE6. Fix it by removing the extra dce_clk_mgr_construct. Fixes: 62eab49faae7 ("drm/amd/display: hide VGH asic specific structs") Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amdgpu: Effective health check before resetCe Sun
Move amdgpu_device_health_check into amdgpu_device_gpu_recover to ensure that if the device is present can be checked before reset The reason is: 1.During the dpc event, the device where the dpc event occurs is not present on the bus 2.When both dpc event and ATHUB event occur simultaneously,the dpc thread holds the reset domain lock when detecting error,and the gpu recover thread acquires the hive lock.The device is simultaneously in the states of amdgpu_ras_in_recovery and occurs_dpc,so gpu recover thread will not go to amdgpu_device_health_check.It waits for the reset domain lock held by the dpc thread, but dpc thread has not released the reset domain lock.In the dpc callback slot_reset,to obtain the hive lock, the hive lock is held by the gpu recover thread at this time.So a deadlock occurred Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amdgpu: Avoid rma causes GPU duplicate resetCe Sun
Try to ensure poison creation handle is completed in time to set device rma value. Signed-off-by: Ce Sun <cesun102@amd.com> Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amdgpu: Update IPID value for bad page threshold CPERXiang Liu
Update the IPID register value for bad page threshold CPER according to the latest definition. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amdgpu: Fix kdoc style in amdgpu_fence.cSrinivasan Shanmugam
The initial comment block before amdgpu_fence_driver_guilty_force_completion() incorrectly used '/**' but is not a kernel-doc comment, causing build warnings. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:742: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Kernel queue reset handling Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amdkfd: Fix checkpoint-restore on multi-xccDavid Yat Sin
GPUs with multi-xcc have multiple MQDs per queue. This patch saves and restores all the MQDs within the partition. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amd: Restore cached manual clock settings during resumeMario Limonciello
If the SCLK limits have been set before S3 they will not be restored. The limits are however cached in the driver and so they can be restored by running a commit sequence during resume. Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250725031222.3015095-3-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amd: Restore cached power limit during resumeMario Limonciello
The power limit will be cached in smu->current_power_limit but if the ASIC goes into S3 this value won't be restored. Restore the value during SMU resume. Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250725031222.3015095-2-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amdgpu: Fix build error when CONFIG_SUSPEND is disabledPerry Yuan
The variable `pm_suspend_target_state` is conditionally defined only when `CONFIG_SUSPEND` is enabled (see `include/linux/suspend.h`). Directly referencing it without guarding by `#ifdef CONFIG_SUSPEND` causes build failures when suspend functionality is disabled (e.g., `CONFIG_SUSPEND=n`). Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Perry Yuan <perry.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04drm/amdgpu: rework how PTE flags are generated v3Christian König
Previously we tried to keep the HW specific PTE flags in each mapping, but for CRIU that isn't sufficient any more since the original value is needed for the checkpoint procedure. So rework the whole handling, nuke the early mapping function, keep the UAPI flags in each mapping instead of the HW flags and translate them to the HW flags while filling in the PTEs. Only tested on Navi 23 for now, so probably needs quite a bit of more work. v2: fix KFD and SVN handling v3: one more SVN fix pointed out by Felix v4: squash in gfx12 fix from David Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-04Mark xe driver as BROKEN if kernel page size is not 4kBSimon Richter
This driver, for the time being, assumes that the kernel page size is 4kB, so it fails on loong64 and aarch64 with 16kB pages, and ppc64el with 64kB pages. Signed-off-by: Simon Richter <Simon.Richter@hogyros.de> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250802024152.3021-1-Simon.Richter@hogyros.de (cherry picked from commit 0521a868222ffe636bf202b6e9d29292c1e19c62) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-08-04drm/xe/pf: Make sure PF is ready to configure VFsMichal Wajdeczko
The PF driver might be resumed just to configure VFs, but since it is doing some asynchronous GuC reconfigurations after fresh reset, we should wait until all pending works are completed. This is especially important in case of LMEM provisioning, since we also need to update the LMTT and send invalidation requests to all GuCs, which are expected to be already in the VGT mode. Fixes: 68ae022278a1 ("drm/xe/pf: Force GuC virtualization mode") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250801142822.180530-3-michal.wajdeczko@intel.com (cherry picked from commit c6c86441c465ea440dfb5039f1c26e629a6fd64c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-08-04drm/xe/pf: Disable PF restart worker on device removalMichal Wajdeczko
We can't let restart worker run once device is removed, since other data that it might want to access could be already released. Explicitly disable worker as part of device cleanup action. Fixes: a4d1c5d0b99b ("drm/xe/pf: Move VFs reprovisioning to worker") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250801142822.180530-2-michal.wajdeczko@intel.com (cherry picked from commit a424353937c24554bb242a6582ed8f018b4a411c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-08-04drm/xe/devcoredump: Defer devcoredump initialization during probeBalasubramani Vivekanandan
Doing devcoredump initializing before GT though look harmless, it leads to problem during driver unbind. Because of this order, GT/Engine release functions will be called before xe devcoredump release function (xe_driver_devcoredump_fini) leading to the following kernel crash[1] because the devcoredump functions might still use GT/Engine datastructures after those are freed. The following crash is observed while running the IGT xe_wedged@wedged-at-any-timeout. The test forces a wedged state by submitting a workload which hangs. Then does a unbind/rebind of the driver to recover from the wedged state. The hanged workload leads to a devcoredump. The following crash is noticed when the devcoredump capture races with the driver unbind. During driver unbind, the release function hw_engine_fini() will be called which assigns NULL to hwe->gt. But the same data structure is accessed during the coredump capture in the function xe_engine_snapshot_print by reading snapshot->hwe->gt. With this patch, we make sure the devcoredump is stopped before deinitializing the core driver functions. [1]: BUG: kernel NULL pointer dereference, address: 0000000000000000 Workqueue: events_unbound xe_devcoredump_deferred_snap_work [xe] RIP: 0010:xe_engine_snapshot_print+0x47/0x420 [xe] Call Trace: <TASK> ? drm_printf+0x64/0x90 __xe_devcoredump_read+0x23f/0x2d0 [xe] ? __pfx___drm_printfn_coredump+0x10/0x10 ? __pfx___drm_puts_coredump+0x10/0x10 xe_devcoredump_deferred_snap_work+0x17a/0x190 [xe] process_one_work+0x22e/0x6f0 worker_thread+0x1e8/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0x11f/0x250 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x47/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 v2: Detailed commit description (Rodrigo) v3: FIXME added (Rodrigo, Stuart) Fixes: 4209d635a823 ("drm/xe: Remove devcoredump during driver release") Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250731061300.14320-1-balasubramani.vivekanandan@intel.com Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://lore.kernel.org/r/20250801052356.21885-1-balasubramani.vivekanandan@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 1fdc4c381ff765479d76ccf3134717c430c871b8) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-08-04Merge tag 'for-6.17/dm-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mikulas Patocka: - fix checking for request-based stackable devices (dm-table) - fix corrupt_bio_byte setup checks (dm-flakey) - add support for resync w/o metadata devices (dm raid) - small code simplification (dm, dm-mpath, vm-vdo, dm-raid) - remove support for asynchronous hashes (dm-verity) - close smatch warning (dm-zoned-target) - update the documentation and enable inline-crypto passthrough (dm-thin) * tag 'for-6.17/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: set DM_TARGET_PASSES_CRYPTO feature for dm-thin dm-thin: update the documentation dm-raid: do not include dm-core.h vdo: omit need_resched() before cond_resched() md: dm-zoned-target: Initialize return variable r to avoid uninitialized use dm-verity: remove support for asynchronous hashes dm-mpath: don't print the "loaded" message if registering fails dm-mpath: make dm_unregister_path_selector return void dm: ima: avoid extra calls to strlen() dm: Simplify dm_io_complete() dm: Remove unnecessary return in dm_zone_endio() dm raid: add support for resync w/o metadata devices dm-flakey: Fix corrupt_bio_byte setup checks dm-table: fix checking for rq stackable devices
2025-08-04drm/xe/pf: Enable SR-IOV PF mode by defaultMichal Wajdeczko
We already claim official support for SR-IOV PF/VF modes on PTL and BMG platforms, but by default we start the Xe driver on those platforms in non-virtualized mode (native) since we still have max_vfs modparam set to disable creation of the VFs. It's time to let the Xe driver support SR-IOV PF mode by default. We were already testing this on our CI, which was relying on the patch that was enabling it for CONFIG_DRM_XE_DEBUG used by our CI. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250722182618.30811-3-michal.wajdeczko@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit a2b461bd6f3b36bded0a74178dec0e58e4714d3d) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-08-04drm/xe: Extend wa_13012615864 to additional Xe2 and Xe3 platformsTangudu Tilak Tirumalesh
Extend WA 13012615864 to Graphics Versions 20.01,20.02,20.04 and 30.03. Signed-off-by: Tangudu Tilak Tirumalesh <tilak.tirumalesh.tangudu@intel.com> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20250731220143.72942-2-jonathan.cavitt@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-08-04drm/panel: sitronix-st7703: fix typo in commentsHugo Villeneuve
Fix typo in comments: souch -> such. Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250721152818.1891212-1-hugo@hugovil.com
2025-08-04drm/panel: himax-hx8279: Remove unneeded semicolonChen Ni
Remove unnecessary semicolons reported by Coccinelle/coccicheck and the semantic patch at scripts/coccinelle/misc/semicolon.cocci. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250729054214.2264377-1-nichen@iscas.ac.cn
2025-08-04drm/panel: novatek-nt35560: Fix invalid return valueBrigham Campbell
Fix bug in nt35560_set_brightness() which causes the function to erroneously report an error. mipi_dsi_dcs_write() returns either a negative value when an error occurred or a positive number of bytes written when no error occurred. The buggy code reports an error under either condition. Fixes: 8152c2bfd780 ("drm/panel: Add driver for Sony ACX424AKP panel") Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Brigham Campbell <me@brighamcampbell.com> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250731032343.1258366-2-me@brighamcampbell.com
2025-08-04drm: panel: Add support for Hydis HV101HD1 MIPI DSI panelSvyatoslav Ryhel
HV101HD1-1E1 is a color active matrix TFT LCD module using amorphous silicon TFT's (Thin Film Transistors) as an active switching devices. This module has a 10.1 inch diagonally measured active area with HD resolutions (1366 horizontal by 768 vertical pixel array). Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250717135752.55958-3-clamor95@gmail.com
2025-08-04drm: panel: orisetech: improve error handling during probeAkhilesh Patil
Use dev_err_probe() helper as directed by core driver model to handle driver probe error. Use standard helper defined at drivers/base/core.c to maintain code consistency. Inspired by, commit a787e5400a1ce ("driver core: add device probe log helper") Signed-off-by: Akhilesh Patil <akhilesh@ee.iitb.ac.in> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/aIJagJ/RnhSCtb2t@bhairav-test.ee.iitb.ac.in
2025-08-04drm/panel: Kconfig: Fix spelling mistake "pannel" -> "panel"Colin Ian King
There is a spelling mistake in the LEDS_BD2606MVV config. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250724112358.142351-1-colin.i.king@gmail.com
2025-08-04drm: panel: add support for Samsung AMS561RA01 panel with S6E8AA5X01 controllerKaustabh Chakraborty
Samsung AMS561RA01 is an AMOLED panel, using the Samsung S6E8AA5X01 MIPI DSI controller. Implement a basic panel driver for such panels. The driver also initializes a backlight device, which works by changing the panel's gamma values and aid brightness levels appropriately, with the help of look-up tables acquired from downstream kernel sources. Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250721-panel-samsung-s6e8aa5x01-v5-2-1a315aba530b@disroot.org
2025-08-04drm/panel: simple: Add Olimex LCD-OLinuXino-5CTS supportPaul Kocialkowski
Add support for the Olimex LCD-OLinuXino-5CTS, a 5-inch TFT LCD panel with a mode operating at 33.3 MHz. Signed-off-by: Paul Kocialkowski <paulk@sys-base.io> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250702082230.1291953-2-paulk@sys-base.io
2025-08-04drm/xe/vf: Rebase exec queue parallel commands during migration recoveryTomasz Lis
Parallel exec queues have an additional command streamer buffer which holds a GGTT reference to data within context status. The GGTT references have to be fixed after VF migration. v2: Properly handle nop entry, verify if parsing goes ok v3: Improve error/warn logging, add propagation of errors, give names to magic offsets Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250802031045.1127138-9-tomasz.lis@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-08-04drm/xe/vf: Refresh utilization buffer during migration recoveryTomasz Lis
The WA buffer we use to capture context utilization contains GGTT references. This means its instructions have to be either fixed or re-emitted during VF post-migration recovery. This patch adds re-emitting content of the utilization WA BB during the recovery. The way we write to vram requires scratch buffer to be used before the whole block is memcopied. We are re-using a scratch buffer introduced in earlier part of the recovery. This is not a performance optimization, but a necessity to avoid creating dependencies between locks. v2: Notable rebase after "Prepare WA BB setup for more users" patch v3: Added error propagation Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Reviewed-by: Michal Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250802031045.1127138-8-tomasz.lis@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-08-04drm/xe/vf: Post migration, repopulate ring area for pending requestTomasz Lis
The commands within ring area allocated for a request may contain references to GGTT. These references require update after VF migration, in order to continue any preempted LRCs, or jobs which were emitted to the ring but not sent to GuC yet. This change calls the emit function again for all such jobs, as part of post-migration recovery. v2: Moved few functions to better files v3: Take job_list_lock v4: Rephrased comments Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Reviewed-by: Michal Winiarski <michal.winiarski@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250802031045.1127138-7-tomasz.lis@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-08-04drm/xe/vf: Rebase MEMIRQ structures for all contexts after migrationTomasz Lis
All contexts require an update of state data, as the data includes GGTT references to memirq-related buffers. Default contexts need these references updated as well, because they are not refreshed when a new context is created from them. The way we write to vram requires scratch buffer to be used before the whole block is memcopied. Since using kalloc() within specific recovery functions would lead to unintended relations between locks, we are allocating the buffer earlier, before any locks are taken. The same buffer will be used for other steps of the recovery. v2: Update addresses by xe_lrc_write_ctx_reg() rather than set_memory_based_intr() v3: Renamed parameter, reordered parameters in some functs v4: Check if have MEMIRQ, move `xe_gt*` funct to proper file v5: Revert back to requiring scratch buffer, but allocate it earlier this time Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Acked-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Reviewed-by: Michal Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250802031045.1127138-6-tomasz.lis@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-08-04drm/xe/vf: Rebase HWSP of all contexts after migrationTomasz Lis
All contexts require an update due to GGTT range shift, as that affects their HWSP. The HW status page of a context contains GGTT references, which need to be shifted to a new range (or re-computed using the previously updated vma nodes). The references include ring start address and indirect state address. v2: move some functions to better matched files v3: Add missing kerneldocs v4: Style fix Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Acked-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Reviewed-by: Michal Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250802031045.1127138-5-tomasz.lis@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-08-04drm/xe: Block reset while recovering from VF migrationTomasz Lis
Resetting GuC during recovery could interfere with the recovery process. Such reset might be also triggered without justification, due to migration taking time, rather than due to the workload not progressing. Doing GuC reset during the recovery would cause exit of RESFIX state, and therefore continuation of GuC work while fixups are still being applied. To avoid that, reset needs to be blocked during the recovery. This patch blocks the reset during recovery. Reset request in that time range will be stalled, and unblocked only after GuC goes out of RESFIX state. In case a reset procedure already started while the recovery is triggered, there isn't much we can do - we cannot wait for it to finish as it involves waiting for hardware, and we can't be sure at which exact point of the reset procedure the GPU got switched. Therefore, the rare cases where migration happens while reset is in progress, are still dangerous. Resets are not a part of the standard flow, and cause unfinished workloads - that will happen during the reset interrupted by migration as well, so it doesn't diverge that much from what normally happens during such resets. v2: Introduce a new atomic for reset blocking, as we cannot reuse `stopped` atomic (that could lead to losing a workload). v3: Switched atomic functs to ones which include proper barriers Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Reviewed-by: Michal Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250802031045.1127138-4-tomasz.lis@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-08-04drm/xe/vf: Pause submissions during RESFIX fixupsTomasz Lis
While applying post-migration fixups to VF, GuC will not respond to any commands. This means submissions have no way of finishing. To avoid acquiring additional resources and then stalling on hardware access, pause the submission work. This will decrease the chance of depleting resources, and speed up the recovery. v2: Commented xe_irq_resume() call v3: Typo fix Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250802031045.1127138-3-tomasz.lis@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-08-04drm/xe/sa: Avoid caching GGTT address within the managerTomasz Lis
Non-virtualized resources require fixups after SRIOV VF migration. Caching GGTT references rather than re-computing them from the underlying Buffer Object is something we want to avoid, as such code would require additional fixup step and additional locking around all the places where the address is accessed. This change removes the cached GPU address from the Sub-Allocation Manager, and introduces a function which recomputes and returns the address instead. v2: renamed xe_sa_manager_gpu_addr(), added kerneldoc Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250802031045.1127138-2-tomasz.lis@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-08-04tee: fix memory leak in tee_dyn_shm_alloc_helperPei Xiao
When shm_register() fails in tee_dyn_shm_alloc_helper(), the pre-allocated pages array is not freed, resulting in a memory leak. Fixes: cf4441503e20 ("tee: optee: Move pool_op helper functions") Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2025-08-04drm/{i915,xe}/display: Block hpd during suspendDibin Moolakadan Subrahmanian
It has been observed that during `xe_display_pm_suspend()` execution, an HPD interrupt can still be triggered, resulting in `dig_port_work` being scheduled. The issue arises when this work executes after `xe_display_pm_suspend_late()`, by which time the display is fully suspended. This can lead to errors such as "DC state mismatch", as the dig_port work accesses display resources that are no longer available or powered. To address this, introduce 'intel_encoder_block_all_hpds' and 'intel_encoder_unblock_all_hpds' functions, which iterate over all encoders and block/unblock HPD respectively. These are used to: - Block HPD IRQs before calling 'intel_hpd_cancel_work' in suspend and shutdown - Unblock HPD IRQs after 'intel_hpd_init' in resume This will prevent 'dig_port_work' being scheduled during display suspend. Continuation of previous patch discussion: https://patchwork.freedesktop.org/patch/663964/ Changes in v2: - Add 'intel_encoder_block_all_hpds' to 'xe_display_pm_shutdown'.(Imre Deak) - Add 'intel_hpd_cancel_work' to 'xe_display_fini_early' to cancel any HPD pending work at late driver removal. (Imre Deak) Changes in v3: - Move 'intel_encoder_block_all_hpds' after intel_dp_mst_suspend in 'xe_display_pm_shutdown'.(Imre Deak) Signed-off-by: Dibin Moolakadan Subrahmanian <dibin.moolakadan.subrahmanian@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20250724083928.2298199-1-dibin.moolakadan.subrahmanian@intel.com
2025-08-04tee: fix NULL pointer dereference in tee_shm_putPei Xiao
tee_shm_put have NULL pointer dereference: __optee_disable_shm_cache --> shm = reg_pair_to_ptr(...);//shm maybe return NULL tee_shm_free(shm); --> tee_shm_put(shm);//crash Add check in tee_shm_put to fix it. panic log: Unable to handle kernel paging request at virtual address 0000000000100cca Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000002049d07000 [0000000000100cca] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [#1] SMP CPU: 2 PID: 14442 Comm: systemd-sleep Tainted: P OE ------- ---- 6.6.0-39-generic #38 Source Version: 938b255f6cb8817c95b0dd5c8c2944acfce94b07 Hardware name: greatwall GW-001Y1A-FTH, BIOS Great Wall BIOS V3.0 10/26/2022 pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : tee_shm_put+0x24/0x188 lr : tee_shm_free+0x14/0x28 sp : ffff001f98f9faf0 x29: ffff001f98f9faf0 x28: ffff0020df543cc0 x27: 0000000000000000 x26: ffff001f811344a0 x25: ffff8000818dac00 x24: ffff800082d8d048 x23: ffff001f850fcd18 x22: 0000000000000001 x21: ffff001f98f9fb88 x20: ffff001f83e76218 x19: ffff001f83e761e0 x18: 000000000000ffff x17: 303a30303a303030 x16: 0000000000000000 x15: 0000000000000003 x14: 0000000000000001 x13: 0000000000000000 x12: 0101010101010101 x11: 0000000000000001 x10: 0000000000000001 x9 : ffff800080e08d0c x8 : ffff001f98f9fb88 x7 : 0000000000000000 x6 : 0000000000000000 x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000 x2 : ffff001f83e761e0 x1 : 00000000ffff001f x0 : 0000000000100cca Call trace: tee_shm_put+0x24/0x188 tee_shm_free+0x14/0x28 __optee_disable_shm_cache+0xa8/0x108 optee_shutdown+0x28/0x38 platform_shutdown+0x28/0x40 device_shutdown+0x144/0x2b0 kernel_power_off+0x3c/0x80 hibernate+0x35c/0x388 state_store+0x64/0x80 kobj_attr_store+0x14/0x28 sysfs_kf_write+0x48/0x60 kernfs_fop_write_iter+0x128/0x1c0 vfs_write+0x270/0x370 ksys_write+0x6c/0x100 __arm64_sys_write+0x20/0x30 invoke_syscall+0x4c/0x120 el0_svc_common.constprop.0+0x44/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x24/0x88 el0t_64_sync_handler+0x134/0x150 el0t_64_sync+0x14c/0x15 Fixes: dfd0743f1d9e ("tee: handle lookup of shm with reference count 0") Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2025-08-04Mark xe driver as BROKEN if kernel page size is not 4kBSimon Richter
This driver, for the time being, assumes that the kernel page size is 4kB, so it fails on loong64 and aarch64 with 16kB pages, and ppc64el with 64kB pages. Signed-off-by: Simon Richter <Simon.Richter@hogyros.de> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250802024152.3021-1-Simon.Richter@hogyros.de
2025-08-04drivers: tee: improve sysfs interface by using sysfs_emit()Akhilesh Patil
Replace scnprintf() with sysfs_emit() while formatting buffer that is passed to userspace as per the recommendation in Documentation/filesystems/sysfs.rst. sysfs _show() callbacks should use sysfs_emit() or sysfs_emit_at() while returning values to the userspace. This change does not impact functionality, but aligns with sysfs interface usage guidelines for the tee driver. Signed-off-by: Akhilesh Patil <akhilesh@ee.iitb.ac.in> Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2025-08-04drm/xe: fix stale comment about unordered_wq usageJani Nikula
Display has switched to its own workqueue, no longer using xe->unordered_wq. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250731111214.1130130-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-08-04drm/xe/compat: stop including i915_utils.h from compat i915_drv.hJani Nikula
Expose the places that need i915_utils.h, and include it where needed. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/6338c8524e600e048b56c5484624cfb51ed49d1d.1753965351.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-08-04drm/xe/compat: remove unused platform macrosJani Nikula
After refactors, a lot of platform macros have become unused. Remove them before new users have a chance to pop up. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/4507b49ead12c997de4615fa6ec277e666e5226a.1753965351.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-08-04drm/bridge: Describe the newly introduced drm_connector parameter for ↵Andy Yan
drm_bridge_detect This fix the make htmldocs warnings: drivers/gpu/drm/drm_bridge.c:1242: warning: Function parameter or struct member 'connector' not described in 'drm_bridge_detect' Fixes: 5d156a9c3d5e ("drm/bridge: Pass down connector to drm bridge detect hook") Signed-off-by: Andy Yan <andyshrk@163.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250716125602.3166573-1-andyshrk@163.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-08-04Merge remote-tracking branch 'drm/drm-next' into drm-misc-next-fixesMaarten Lankhorst
Seems we missed one drm-misc-next pull-request in drm-misc-next-fixes. This is required to pull in 5d156a9c3d5e ("drm/bridge: Pass down connector to drm bridge detect hook") and update its docs. Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-08-04drm/i915/display: Use the recomended min_hblank valuesArun R Murthy
Use recommended values as per wa_14021694213 to compare with the calculated value and choose minimum of them. v2: corrected checkpatch warning and retain the restriction for min_hblank (Jani) v3: use calculated value to compare with recomended value and choose minimum of them (Imre) v4: As driver supported min bpc is 8, omit the condition check for bpc6 with ycbcr420. Added a note for the same (Imre) v5: Add a warn for the unexpected case of 6bpc + uhbr + ycbcr420 v6: Reworded the comments and check fo anything < compressed bpp 8(Imre) v7: Fix checkpatch warning. (Ankit) Bspec: 74379 Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://lore.kernel.org/r/20250730-min_hblank-v7-1-179360220ced@intel.com
2025-08-03Merge tag 'ib-mfd-gpio-input-pwm-v6.17' of ↵Dmitry Torokhov
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next Merge an immutable branch between MFD, GPIO, Input and PWM to resolve conflicts for the merge window pull request.
2025-08-03Input: max77693 - convert to atomic pwm operationUwe Kleine-König
The driver called pwm_config() and pwm_enable() separately. Today both are wrappers for pwm_apply_might_sleep() and it's more effective to call this function directly and only once. Also don't configure the duty_cycle and period if the next operation is to disable the PWM so configure the PWM in max77693_haptic_enable(). With the direct use of pwm_apply_might_sleep() the need to call pwm_apply_args() in .probe() is now gone, too, so drop this one. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/20250630103851.2069952-2-u.kleine-koenig@baylibre.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-08-03Merge tag 'rtc-6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "Support for a new RTC in an existing driver and all the drivers exposing clocks using the common clock framework have been converted to determine_rate(). Summary: Subsystem: - Convert drivers exposing a clock from round_rate() to determine_rate() Drivers: - ds1307: oscillator stop flag handling for ds1341 - pcf85063: add support for RV8063" * tag 'rtc-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (34 commits) rtc: ds1685: Update Joshua Kinard's email address. rtc: rv3032: convert from round_rate() to determine_rate() rtc: rv3028: convert from round_rate() to determine_rate() rtc: pcf8563: convert from round_rate() to determine_rate() rtc: pcf85063: convert from round_rate() to determine_rate() rtc: nct3018y: convert from round_rate() to determine_rate() rtc: max31335: convert from round_rate() to determine_rate() rtc: m41t80: convert from round_rate() to determine_rate() rtc: hym8563: convert from round_rate() to determine_rate() rtc: ds1307: convert from round_rate() to determine_rate() rtc: rv3028: fix incorrect maximum clock rate handling rtc: pcf8563: fix incorrect maximum clock rate handling rtc: pcf85063: fix incorrect maximum clock rate handling rtc: nct3018y: fix incorrect maximum clock rate handling rtc: hym8563: fix incorrect maximum clock rate handling rtc: ds1307: fix incorrect maximum clock rate handling rtc: pcf85063: scope pcf85063_config structures rtc: Optimize calculations in rtc_time64_to_tm() dt-bindings: rtc: amlogic,a4-rtc: Add compatible string for C3 rtc: ds1307: handle oscillator stop flag (OSF) for ds1341 ...