summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-10-28drm/amdgpu: Convert amdgpu userqueue management from IDR to XArrayJesse.Zhang
This commit refactors the AMDGPU userqueue management subsystem to replace IDR (ID Allocation) with XArray for improved performance, scalability, and maintainability. The changes address several issues with the previous IDR implementation and provide better locking semantics. Key changes: 1. **Global XArray Introduction**: - Added `userq_doorbell_xa` to `struct amdgpu_device` for global queue tracking - Uses doorbell_index as key for efficient global lookup - Replaces the previous `userq_mgr_list` linked list approach 2. **Per-process XArray Conversion**: - Replaced `userq_idr` with `userq_mgr_xa` in `struct amdgpu_userq_mgr` - Maintains per-process queue tracking with queue_id as key - Uses XA_FLAGS_ALLOC for automatic ID allocation 3. **Locking Improvements**: - Removed global `userq_mutex` from `struct amdgpu_device` - Replaced with fine-grained XArray locking using XArray's internal spinlocks 4. **Runtime Idle Check Optimization**: - Updated `amdgpu_runtime_idle_check_userq()` to use xa_empty 5. **Queue Management Functions**: - Converted all IDR operations to equivalent XArray functions: - `idr_alloc()` → `xa_alloc()` - `idr_find()` → `xa_load()` - `idr_remove()` → `xa_erase()` - `idr_for_each()` → `xa_for_each()` Benefits: - **Performance**: XArray provides better scalability for large numbers of queues - **Memory Efficiency**: Reduced memory overhead compared to IDR - **Thread Safety**: Improved locking semantics with XArray's internal spinlocks v2: rename userq_global_xa/userq_xa to userq_doorbell_xa/userq_mgr_xa Remove xa_lock and use its own lock. v3: Set queue->userq_mgr = uq_mgr in amdgpu_userq_create() v4: use xa_store_irq (Christian) hold the read side of the reset lock while creating/destroying queues and the manager data structure. (Chritian) Acked-by: Alex Deucher <alexander.deucher@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Promote DC to 3.2.356Taimur Hassan
This version brings along following update: - Fix incorrect return of vblank enable on unconfigured crtc - Add HDR workaround for a specific eDP - Make observers const-correct - Add lock descriptor to check_update - Update cursor offload assignments - Add dc interface to log pre os firmware information - Init dispclk from bootup clock for DCN315 - Remove dc param from check_update - Update link encoder assignment - Add more DC HW state info to underflow logging - Rename dml2 to dml2_0 folder - Fix notification of vtotal to DMU for cursor offload - Fix wrong index for DCN401 cursor offload - Add opp count validation to dml2.1 - Fix DMUB reset sequence for DCN32 - Bump minimum for frame_warn_limit Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: [FW Promotion] Release 0.1.33.0Taimur Hassan
[Why & How] - Extend reply debug flags, define a new bit as debug_log_enabled - Replace the padding to frame_skip_number in struct dmub_cmd_replay_set_coasting_vtotal_data Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Fix incorrect return of vblank enable on unconfigured crtcIvan Lipski
[Why&How] Return -EINVAL when userspace asks us to enable vblank on a crtc that is not yet enabled. Suggested-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1856 Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Add HDR workaround for a specific eDPAlex Hung
[WHY & HOW] Some eDP panels suffer from flicking when HDR is enabled in KDE or Gnome. This add another quirk to worksaround to skip VSC that is incompatible with an eDP panel. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4452 Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Make observers const-correctDominik Kaszewski
[Why] Observers which do not modify their pointer arguments should take them as const. This clearly signals their intent to the caller, making it clear that the function is safe to call multiple times, or remove the call if the result is no longer necessary. [How] Made const-correct all of the functions below: * full_update_required[_weak] * fast_updates_exist * fast_update_only * dc_can_clear_cursor_limit * dc_stream_get_status (added const named overload) Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Dominik Kaszewski <dominik.kaszewski@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Add lock descriptor to check_updateDominik Kaszewski
[Why] DM locks the global DC lock during all updates, even if multiple updates touch different resources and could be run in parallel. [How] Add extra enum specifying which kind of resources should be locked. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Dominik Kaszewski <dominik.kaszewski@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Update cursor offload assignmentsAlvin Lee
[Why & How] - Cursor lines per chunk must be assigned from hubp->att and not hubp->pos (the one in hubp->pos is unassigned) - In DCN401 DPP, cur0_enable in attribute struct must be assigned as this is the field passed to DMU - DCN401 should not program position in driver if offload is enabled Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Add dc interface to log pre os firmware informationMeenakshikumar Somasundaram
[Why] Pre os firmware information is useful to debug pre os to post os fw transition issues. [How] Add dc interface dc_log_preos_dmcub_info() to log pre os firmware information. Reviewed-by: Cruise Hung <cruise.hung@amd.com> Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: init dispclk from bootup clock for DCN315Zhongwei Zhang
[Why] Driver does not pick up and save vbios's clocks during init clocks, the dispclk in clk_mgr will keep 0. OS might change the timing (lower the pixel clock) after boot. Then driver will set the dispclk to lower when safe_to_lower is false, for in clk_mgr dispclk is zero, it's illegal and causes garbage. [How] Dump and save the vbios's clocks, and init the dispclk in dcn315_init_clocks. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Zhongwei Zhang <Zhongwei.Zhang@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Remove dc param from check_updateDominik Kaszewski
[Why] dc_check_update_surfaces_for_stream should not have access to entire DC, especially not a mutable one. Concurrent checks should be able to run independently of one another, without risk of changing state. [How] * Replace dc and stream_status structs with new dc_check_config. * Move required fields from dc_debug and dc_caps to dc_check_config. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Dominik Kaszewski <dominik.kaszewski@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: update link encoder assignmentMeenakshikumar Somasundaram
[Why] Map a link encoder instance matching stream encoder instance if possible. [How] Get the stream encoder instance and assign the same link encoder instance if available. Reviewed-by: PeiChen Huang <peichen.huang@amd.com> Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Add more DC HW state info to underflow loggingKaren Chen
[Why] Debugging underflow issues frequently requires knowing the HW state at the time of underflow. To enable capturing this HW state information, interface functions are needed for the various DC HW blocks. [How] This change adds the interface functions to read HW state for the following DC HW blocks: - HUBBUB - HUBP - DPP - MPC - OPP - DSC - OPTC - DCCG Reviewed-by: George Shen <george.shen@amd.com> Signed-off-by: Karen Chen <Karen.Chen@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Rename dml2 to dml2_0 folderAustin Zheng
[Why] dml2 folder contains all logic for all versions of DML2 This is currently DML2.0 and DML2.1. Rename dml2 to dml2_0 folder to reflect this better (dml2_0 for DML2.0). [How] Rename dml2 to dml2_0 folder and update dml2 references to use dml2_0 folder. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: waynelin <Wayne.Lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Fix notification of vtotal to DMU for cursor offloadNicholas Kazlauskas
[Why] It was placed after the early return and the notification is never sent. [How] Place it after .set_drr and before the return. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Fix wrong index for DCN401 cursor offloadNicholas Kazlauskas
[Why] Payloads are ignored because the wrong index is written as part of the pipe update implementation for DCN401. [How] Align it to the DCN35 implementation and ensure the + 1 is added. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Add opp count validation to dml2.1Dmytro Laktyushkin
Newer asics can have mismatching dpp and opp counts and dml needs to account for this. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Fix DMUB reset sequence for DCN32Dillon Varone
[WHY&HOW] Backport reset sequence fixes implemented on DCN401 to DCN32 to address stability issues when resetting the DMUB. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Bump minimum for frame_warn_limitMario Limonciello
[Why] The bigger of CONFIG_FRAME_WARN and frame_warn_limit is used to trigger warnings about large stack frames. The dml_core_mode_support() stack frame has grown to 2056. [How] Update frame_warn_limit to 2056 so that CONFIG_FRAME_WARN of 2048 doesn't cause a failure. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4609 Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Mario Limonciello <superm1@kernel.org> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd: Re-introduce property to control adaptive backlight modulationMario Limonciello
commit 0887054d14ae ("drm/amd: Drop abm_level property") dropped the abm level property in favor of sysfs control. Since then there have been discussions that compositors showed an interest in modifying a vendor specific property instead. So re-introduce the abm level property, but with different semantics. Rather than being an integer it's now an enum. One of the enum options is 'sysfs', and that is because there is still a sysfs file for use by userspace when the compositor doesn't support this property. If usespace has not modified this property, the default value will be for sysfs to control it. Once userspace has set the property stop allowing sysfs control. The property is only attached to non-OLED eDP panels. Cc: Xaver Hugl <xaver.hugl@kde.org> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu: Fix pointer casts when reading dynamic region sizesSrinivasan Shanmugam
The function amdgpu_virt_get_dynamic_data_info() writes a 64-bit size value. In two places (amdgpu_bios.c and amdgpu_discovery.c), the code passed the address of a smaller variable by casting it to u64 *, which is unsafe. This could make the function write more bytes than the smaller variable can hold, possibly overwriting nearby memory. Reported by static analysis tools. v2: Dynamic region size comes from the host (SR-IOV setup) and is always fixed to 5 MB. (Lijo/Ellen) 5 MB easily fits inside a 32-bit value, so using a 64-bit type is not needed. It also avoids extra type casts Fixes: b4a8fcc7826a ("drm/amdgpu: Add logic for VF ipd and VF bios to init from dynamic crit_region offsets") Reported by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Ellen Pan <yunru.pan@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu: null check for hmm_pfns ptr before freeing itSunil Khatri
Due to low memory or when num of pages is too big to be accomodated, allocation could fail for pfn's. Chekc hmm_pfns for NULL before calling the kvfree for the it. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Acked-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/pm: smu13: Enable VCN_RESET for pgm 7 with appropriate firmware versionJesse.Zhang
This patch extends the VCN_RESET capability check to include pgm 7 when the firmware version is 0x07551400 or newer. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu: Make SR-IOV critical region checks overflow-safeSrinivasan Shanmugam
The function amdgpu_virt_init_critical_region() contained an invalid check for a negative init_hdr_offset value: if (init_hdr_offset < 0) Since init_hdr_offset is an unsigned 32-bit integer, this condition can never be true and triggers a Smatch warning: warn: unsigned 'init_hdr_offset' is never less than zero In addition, the subsequent bounds check: if ((init_hdr_offset + init_hdr_size) > vram_size) was vulnerable to integer overflow when adding the two unsigned values. Thus, by promoting offset and size to 64-bit and using check_add_overflow() to safely validate the sum against VRAM size. Fixes: 07009df6494d ("drm/amdgpu: Introduce SRIOV critical regions v2 during VF init") Reported by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Ellen Pan <yunru.pan@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Ellen Pan <yunru.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu: fix SPDX header on cyan_skillfish_reg_init.cAlex Deucher
This should be MIT. The driver in general is MIT and the license text at the top of the file is MIT so fix it. Fixes: e8529dbc75ca ("drm/amdgpu: add ip offset support for cyan skillfish") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4654 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu: fix SPDX header on irqsrcs_vcn_5_0.hAlex Deucher
This should be MIT. The driver in general is MIT and the license text at the top of the file is MIT so fix it. Fixes: d1bb64651095 ("drm/amdgpu: add irq source ids for VCN5_0/JPEG5_0") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4654 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu: fix SPDX header on amd_cper.hAlex Deucher
This should be MIT. The driver in general is MIT and the license text at the top of the file is MIT so fix it. Fixes: 523b69c65445 ("drm/amd/include: Add amd cper header") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4654 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu: fix SPDX headers on amdgpu_cper.c/hAlex Deucher
These should be MIT. The driver in general is MIT and the license text at the top of the files is MIT so fix it. Fixes: 92d5d2a09de1 ("drm/amdgpu: Introduce funcs for populating CPER") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4654 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu/userqueue: Fix use after free in ↵Dan Carpenter
amdgpu_userq_buffer_vas_list_cleanup() The amdgpu_userq_buffer_va_list_del() function frees "va_cursor" but it is dereferenced on the next line when we print the debug message. Print the debug message first and then free it. Fixes: 2a28f9665dca ("drm/amdgpu: track the userq bo va for its obj management") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/pm/powerplay/smumgr: Fix PCIeBootLinkLevel value on IcelandJohn Smith
Previously this was initialized with zero which represented PCIe Gen 1.0 instead of using the maximum value from the speed table which is the behaviour of all other smumgr implementations. Fixes: 18aafc59b106 ("drm/amd/powerplay: implement fw related smu interface for iceland.") Signed-off-by: John Smith <itistotalbotnet@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/pm/powerplay/smumgr: Fix PCIeBootLinkLevel value on FijiJohn Smith
Previously this was initialized with zero which represented PCIe Gen 1.0 instead of using the maximum value from the speed table which is the behaviour of all other smumgr implementations. Fixes: 18edef19ea44 ("drm/amd/powerplay: implement fw image related smu interface for Fiji.") Signed-off-by: John Smith <itistotalbotnet@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/pm: fix smu table id bound check issue in smu_cmn_update_table()Yang Wang
'table_index' is a variable defined by the smu driver (kmd) 'table_id' is a variable defined by the hw smu (pmfw) This code should use table_index as a bounds check. Fixes: caad2613dc4bd ("drm/amd/powerplay: move table setting common code to smu_cmn.c") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: Don't program BLNDGAM_MEM_PWR_FORCE when CM low-power is ↵Matthew Schwartz
disabled on DCN30 Before commit 33056a97ae5e ("drm/amd/display: Remove double checks for `debug.enable_mem_low_power.bits.cm`"), dpp3_program_blnd_lut(NULL) checked the low-power debug flag before calling dpp3_power_on_blnd_lut(false). After commit 33056a97ae5e ("drm/amd/display: Remove double checks for `debug.enable_mem_low_power.bits.cm`"), dpp3_program_blnd_lut(NULL) unconditionally calls dpp3_power_on_blnd_lut(false). The BLNDGAM power helper writes BLNDGAM_MEM_PWR_FORCE when CM low-power is disabled, causing immediate SRAM power toggles instead of deferring at vupdate. This can disrupt atomic color/LUT sequencing during transitions between direct scanout and composition within gamescope's DRM backend on Steam Deck OLED. To fix this, leave the BLNDGAM power state unchanged when low-power is disabled, matching dpp3_power_on_hdr3dlut and dpp3_power_on_shaper. Fixes: 33056a97ae5e ("drm/amd/display: Remove double checks for `debug.enable_mem_low_power.bits.cm`") Signed-off-by: Matthew Schwartz <matthew.schwartz@linux.dev> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu: Add uniras version in sysfsJinzhou Su
Display uniras version in sysfs version interface when uniras enable. v2: display ras version detail info Signed-off-by: Jinzhou Su <jinzhou.su@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu: get rev_id from strap register or IP-discovery tablePerry Yuan
Query the sub-revision field in the IP Discovery table for the VFs to obtain their revision ID. Meanwhile, read the revision ID from the strap register for the PF. Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amdgpu: clear bad page info of ras moduleJinzhou Su
Clear bad page info of ras module. Signed-off-by: Jinzhou Su <jinzhou.su@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd/display: pause the workload setting in dmKenneth Feng
v1: Pause the workload setting in dm when doinn idle optimization v2: Rebase patch to latest kernel code base (kernel 6.16) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/radeon: Remove calls to drm_put_dev()Daniel Palmer
Since the allocation of the drivers main structure was changed to devm_drm_dev_alloc() drm_put_dev()'ing to trigger it to be free'd should be done by devres. However, drm_put_dev() is still in the probe error and device remove paths. When the driver fails to probe warnings like the following are shown because devres is trying to drm_put_dev() after the driver already did it. [ 5.642230] radeon 0000:01:05.0: probe with driver radeon failed with error -22 [ 5.649605] ------------[ cut here ]------------ [ 5.649607] refcount_t: underflow; use-after-free. [ 5.649620] WARNING: CPU: 0 PID: 357 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110 Fixes: a9ed2f052c5c ("drm/radeon: change drm_dev_alloc to devm_drm_dev_alloc") Signed-off-by: Daniel Palmer <daniel@0x0f.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/radeon: Do not kfree() devres managed rdevDaniel Palmer
Since the allocation of the drivers main structure was changed to devm_drm_dev_alloc() rdev is managed by devres and we shouldn't be calling kfree() on it. This fixes things exploding if the driver probe fails and devres cleans up the rdev after we already free'd it. Fixes: a9ed2f052c5c ("drm/radeon: change drm_dev_alloc to devm_drm_dev_alloc") Signed-off-by: Daniel Palmer <daniel@0x0f.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/radeon: Clean up pdev->dev instances in probeDaniel Palmer
Get a struct device pointer from the start and use it. Signed-off-by: Daniel Palmer <daniel@0x0f.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28drm/amd: Check that VPE has reached DPM0 in idle handlerMario Limonciello
[Why] Newer VPE microcode has functionality that will decrease DPM level only when a workload has run for 2 or more seconds. If VPE is turned off before this DPM decrease and the PMFW doesn't reset it when power gating VPE, the SOC can get stuck with a higher DPM level. This can happen from amdgpu's ring buffer test because it's a short quick workload for VPE and VPE is turned off after 1s. [How] In idle handler besides checking fences are drained check PMFW version to determine if it will reset DPM when power gating VPE. If PMFW will not do this, then check VPE DPM level. If it is not DPM0 reschedule delayed work again until it is. v2: squash in return fix (Alex) Cc: Peyton.Lee@amd.com Reported-by: Sultan Alsawaf <sultan@kerneltoast.com> Reviewed-by: Sultan Alsawaf <sultan@kerneltoast.com> Tested-by: Sultan Alsawaf <sultan@kerneltoast.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4615 Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-28Merge tag 'counter-fixes-for-6.18' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wbg/counter into char-misc-next William writes: Counter fixes for 6.18 A fix to permit multiple counter channels to share the same TCB IRQ line for microchip-tcb-cpature. * tag 'counter-fixes-for-6.18' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wbg/counter: counter: microchip-tcb-capture: Allow shared IRQ for multi-channel TCBs
2025-10-28drm/sched: avoid killing parent entity on child SIGKILLDavid Rosca
The DRM scheduler tracks who last uses an entity and when that process is killed blocks all further submissions to that entity. The problem is that we didn't track who initially created an entity, so when a process accidently leaked its file descriptor to a child and that child got killed, we killed the parent's entities. Avoid that and instead initialize the entities last user on entity creation. This also allows to drop the extra NULL check. Signed-off-by: David Rosca <david.rosca@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4568 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org Acked-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20251015140128.1470-1-christian.koenig@amd.com Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patch.msgid.link/20251015140128.1470-1-christian.koenig@amd.com
2025-10-28dibs: Use subsys_initcall()Alexandra Winter
In the case of built-in modules, the order of module_init() calls are derived from the Makefiles. Use subsys_initcall() for the dibs module, to make sure dibs_init() is executed before dibs clients like smc and dibs devices like ism are initialized. So future dibs client or dibs device modules can use module_init() without the risk of getting the order in the Makefiles wrong. Reported-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20251023150636.3995476-2-wintera@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-28dibs: Remove reset of static vars in dibs_init()Alexandra Winter
'clients' and 'max_client' are static variables and therefore don't need to be initialized. Reported-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Link: https://patch.msgid.link/20251023150636.3995476-1-wintera@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-28regulator: pca9450: add input supply linksMark Brown
Merge series from Oleksij Rempel <o.rempel@pengutronix.de>: This series adds input supply definitions for the NXP PCA9450 PMIC. Some systems detect power events such as undervoltage before the PMIC. To allow correct propagation of such events, each regulator must define its upstream input supply. The first patch updates the devicetree binding to document new *-supply properties, and the second patch adds matching .supply_name entries in the driver. Changes in this series: - Document INL1, INB13, INB26 and INB45 supply properties - Link all LDO and BUCK regulators to their corresponding input groups
2025-10-28net/mlx5: Add balance ID support for LAG multiplane groupsMark Bloch
Implement balance ID support for multiplane LAG configurations. This feature enables per-multiplane group load balancing by extending the software system image GUID with a balance ID component. Key implementations: - Enable lag_per_mp_group capability when supported by hardware. - Append load_balance_id to software system image GUID when conditions are met. - Increase MLX5_SW_IMAGE_GUID_MAX_BYTES from 8 to 9 to accommodate the extra byte. The balance ID is appended to the system image GUID only when both load_balance_id and lag_per_mp_group capabilities are available, ensuring backward compatibility while enabling enhanced LAG functionality. This enhancement allows for more granular load balancing control in complex multi-plane LAG deployments, improving network performance and flexibility. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1761211020-925651-6-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-28net/mlx5: Refactor HCA cap 2 settingMark Bloch
Refactor HCA capability 2 setting logic to be more structured and conditional. Move the sw_vhca_id_valid setting inside proper conditional checks and prepare the function for additional capability settings. The refactoring: - Always copy current capabilities to set_hca_cap buffer. - Apply sw_vhca_id_valid setting only when conditions are met. - Improve code readability and maintainability. This cleanup prepares the handle_hca_cap_2() function for the upcoming balance ID capability setting. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1761211020-925651-5-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-28net/mlx5: Refactor PTP clock devcom pairingMark Bloch
Refactor PTP clock device component pairing to use the clock identity buffer instead of casting it to a u64 key. This change leverages the new software system image GUID infrastructure. Changes include: - Pass identity buffer to mlx5_shared_clock_register(). - Use memcpy for identity buffer in devcom matching attributes. - Remove intermediate u64 key conversion. - Add BUILD_BUG_ON to ensure identity size fits in match key. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1761211020-925651-4-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-28net/mlx5: Add software system image GUID infrastructureMark Bloch
Replace direct hardware system image GUID usage with a new software system image GUID function that supports variable-length identifiers. Key changes: - Add mlx5_query_nic_sw_system_image_guid() function with length parameter. - Update all callsites to use the new function and buffer/length approach. - Modify mapping contexts to use byte arrays instead of u64 keys. - Update devcom matching to support variable-length keys. - Change mlx5_same_hw_devs() to use buffer comparison instead of u64. This refactoring prepares the infrastructure for balance ID support, which requires extending the system image GUID with additional data. The change maintains backward compatibility while enabling future enhancements. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1761211020-925651-3-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>