summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/pm
AgeCommit message (Collapse)Author
7 daysdrm/amd/pm/si: Disregard vblank time when no displays are connectedTimur Kristóf
When no displays are connected, there is no vblank happening so the power management code shouldn't worry about it. This fixes a regression that caused the memory clock to be stuck at maximum when there were no displays connected to a SI GPU. Fixes: 9003a0746864 ("drm/amd/pm: Treat zero vblank time as too short in si_dpm (v3)") Fixes: 9d73b107a61b ("drm/amd/pm: Use pm_display_cfg in legacy DPM (v2)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6d87e0199f7b83735b56e422d59f170a201897a8) Cc: stable@vger.kernel.org
2026-05-19drm/amd/pm: fix memleak of dpm_policies on smu v15Yang Wang
In smu_v15_0_fini_smc_tables, dpm_policies was not freed or NULLed, causing a memory leak. Add kfree() and NULL assignment to properly release memory and avoid dangling pointers. Fixes: 2beedc3a92b7 ("drm/amd/pm: Add initial support for smu v15_0_8"); Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 014f329074f688b9b49383e8b70e79e9ef99359e)
2026-05-05drm/amdgpu/pm: align Hawaii mclk workaround with radeonAlex Deucher
Align the hawaii mclk workaround with radeon and windows. Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/1816 Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9649528b637f668c5af9f2b83ca4ad8576ae2121) Cc: stable@vger.kernel.org
2026-05-05drm/amdgpu/pm: add missing revision check for CIAlex Deucher
The ci_populate_all_memory_levels() workaround only applies to revision 0 SKUs. Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/1816 Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1db15ba8f72f400bbad8ae0ce24fafc43429d4bd) Cc: stable@vger.kernel.org
2026-04-28drm/amd/pm: Add fine grained flag to SMU v13.0.6Lijo Lazar
Gfx clock is fine grained on SMU v13.0.6/12 SOCs. Add the flag to report clock frequencies correctly. Fixes: 7380228401c4 ("drm/amd/pm: Use generic dpm table for SMUv13 SOCs") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d4871d837bbf70173f63426a84fa80b39e408b9e)
2026-04-28drm/amd/pm: Update emit clock logicLijo Lazar
If only one level is enabled in clock table, there is no need to follow the fine grained clock logic which expects a minimum of two levels (min/max). Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7f19097af1496dd908a044ca95862f32d05f02df)
2026-04-24drm/amd/pm: fix missing fine-grained dpm table flag on aldebaranYang Wang
Add the missing SMU_DPM_TABLE_FINE_GRAINED flag to aldebaran DPM table. This fixes the pp_dpm_sclk node issue caused by missing flag configuration. Fixes: 7ea1c722fe1d ("drm/amd/pm: Use common helper for aldebaran dpm table") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3427dea3a48ebddb491a26093f3627384b3cb2c2)
2026-04-17drm/amd/pm: add od table upload error message parsing for smu v14.0.xYang Wang
parse and print detailed reasons for od table upload failures to help users understand error causes. example: $ echo "0 30 40" | sudo tee fan_curve $ echo "1 40 30" | sudo tee fan_curve $ echo "c" | sudo tee fan_curve kernel log: [ 75.040174] amdgpu 0000:0a:00.0: Failed to upload overdrive table, ret:-5 [ 75.040178] amdgpu 0000:0a:00.0: Invalid overdrive table content: OD_FAN_CURVE_PWM_ERROR (13) [ 75.040181] amdgpu 0000:0a:00.0: Failed to upload overdrive table! Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17drm/amd/pm: add read arg support to smu_cmn_update_tableYang Wang
Extend the smu_cmn_update_table function to support reading a 32-bit return argument from the SMU firmware during table transfer operations. - Rename the original function to smu_cmn_update_table_read_arg - Add a uint32_t *read_arg output parameter to capture firmware response - Pass the read_arg pointer to the SMU message command - Keep full backward compatibility using a macro wrapper for the old API This allows the driver to retrieve status codes, results, or configuration feedback from the SMU firmware after table data transfer. No functional changes for existing users of the original smu_cmn_update_table() API. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17drm/amd/pm: fix runtime PM imbalance issue in amdgpu_pm.cYang Wang
Fix runtime PM counter imbalance to prevent device from failing to enter low power state Fixes: a50d32c41fb2 ("drm/amd/pm: Deprecate print_clock_levels interface") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17drm/amd/pm: Fix mode2 reset ACK handling on aldebaran v2Srinivasan Shanmugam
aldebaran_mode2_reset() sends a mode2 reset message and waits for an acknowledgment from the SMU. The current ACK handling is incorrect. The wait loop runs only when ret is -ETIME. But after a successful async send, ret is 0. Because of this, the loop is skipped and the code does not wait for the reset acknowledgment. Also, the code checks for ret != 1 after calling smu_msg_wait_response(). However, smu_msg_wait_response() returns 0 on success and negative error codes on failure. So checking against 1 is wrong. Return -EOPNOTSUPP when the firmware does not support this reset message. Fix this by setting ret to -ETIME before entering the wait loop, checking for ret != 0 after getting the SMU response, and returning -EOPNOTSUPP when the firmware does not support the message. v2: - Update ACK check to use ret != 0 instead of ret != 1, since smu_msg_wait_response() returns 0 on success (Feifei) - Remove unnecessary handling for ret == 0 Fixes: e42569d02acb ("drm/amd/pm: Modify mode2 msg sequence on aldebaran") Reported-by: Dan Carpenter <error27@gmail.com> Cc: Feifei Xu <Feifei.Xu@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17drm/amd/pm: smu7: Remove stale error check in smu7_hwmgr_backend_initSrinivasan Shanmugam
smu7_hwmgr_backend_init() is responsible for initializing the SMU7 power management backend. It allocates and sets up the backend structure, initializes voltage tables, configures dependency tables, and prepares platform-specific power and clock parameters. The function follows a typical pattern where each initialization step returns a status in "result", and failures are handled via a common "goto fail" path that performs cleanup. Commit 2c21648bb814 ("drm/amd/pm/smu7: Remove non-functional SMU7 voltage dependency on DAL") removed a function call in this initialization sequence, but left behind the corresponding error check. As a result, "result" is checked twice without being updated in between: result = smu7_init_voltage_dependency_on_display_clock_table(hwmgr); if (result) goto fail; ... if (result) goto fail; The second check is redundant and unreachable for any new failure, since no operation modifies "result" between the two checks. This triggers a Smatch warning about a duplicate zero check and reduces code clarity. Remove the stale error check to keep the control flow correct and readable. Fixes: 9f49e3d4cb86 ("drm/amd/pm/smu7: Remove non-functional SMU7 voltage dependency on DAL") Reported-by: Dan Carpenter <error27@gmail.com> Cc: Timur Kristóf <timur.kristof@gmail.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-17drm/amd/pm: fix incorrect FeatureCtrlMask setting on smu v14.0.xYang Wang
OverDriveTable.FanMinimumPwm and FeatureCtrlMask.PP_OD_FEATURE_FAN_LEGACY_BIT have a hard dependency. Invalid handling of this dependency leads to disabled thermal monitoring and temperature boundary validation. v2: squash in typo fix (Yang) Fixes: 9710b84e2a6a ("drm/amd/pm: add overdrive support on smu v14.0.2/3") Cc: stable@vger.kernel.org Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amd/pm: fix memleak issue in smu_v15_0_8_get_gpu_metrics()Yang Wang
remove unsued code to avoid memleak issue. (NOTE: This bug occurs during internal branch switching) Fixes: 0a66ca3b351f ("drm/amd/pm: add get_gpu_metrics support for 15.0.8") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amd/pm: optimize logic and remove unnecessary checks in smu v15.0.8Yang Wang
the following two sets of logic are clearly mutually exclusive in smu_v15_0_8_set_soft_freq_limited_range. remove unnecessary code logic to keep the code logic clear. e.g: if (smu_dpm->dpm_level != AMD_DPM_FORCED_LEVEL_MANUAL) return -EINVAL; if (smu_dpm->dpm_level == AMD_DPM_FORCED_LEVEL_MANUAL) { ... } Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amd/pm: fix null pointer dereference issue in smu_v15_0_8_get_power_limit()Yang Wang
Fix null pointer issues caused by coding errors Fixes: e20e47bcb3f1 ("drm/amd/pm: add set{get}_power_limit support for smu 15.0.8") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amd/pm: correct mem_busy_percent display due to calculation errorsYang Wang
PMFW may return invalid values due to internal calculation errors. so, the kmd driver must validate and sanitize the returned values to prevent issues caused by firmware calculation errors. For example, values 0xfffe (-2) and 0xffff (-1) are treated as invalid and clamped to 0. this applies to devices with CAB (Cache As Buffer) functionality. Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4905 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amd/pm: Use smu vram copy in SMUv15Lijo Lazar
Use smu vram copy wrapper function for vram copy operations in SMUv15.0.8 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amd/pm: Use smu vram copy in SMUv13Lijo Lazar
Use smu vram copy wrapper function for vram copy operations in SMUv13.0.6 and SMUv13.0.12. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-03drm/amd/pm: Add smu vram copy functionLijo Lazar
Add a wrapper function for copying data/to from vram. This additionally checks for any RAS fatal error. Copy cannot be trusted if any RAS fatal error happened as VRAM becomes inaccessible. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/smu7: Add SCLK cap for quirky Hawaii boardTimur Kristóf
On a specific Radeon R9 390X board, the GPU can "randomly" hang while gaming. Initially I thought this was a RADV bug and tried to work around this in Mesa: commit 8ea08747b86b ("radv: Mitigate GPU hang on Hawaii in Dota 2 and RotTR") However, I got some feedback from other users who are reporting that the above mitigation causes a significant performance regression for them, and they didn't experience the hang on their GPU in the first place. After some further investigation, it turns out that the problem is that the highest SCLK DPM level on this board isn't stable. Lowering SCLK to 1040 MHz (from 1070 MHz) works around the issue, and has a negligible impact on performance compared to the Mesa patch. (Note that increasing the voltage can also work around it, but we felt that lowering the SCLK is the safer option.) To solve the above issue, add an "sclk_cap" field to smu7_hwmgr and set this field for the affected board. The capped SCLK value correctly appears on the sysfs interface and shows up in GUI tools such as LACT. Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/ci: Fill DW8 fields from SMCTimur Kristóf
In ci_populate_dw8() we currently just read a value from the SMU and then throw it away. Instead of throwing away the value, we should use it to fill other fields in DW8 (like radeon). Otherwise the value of the other fiels is just cleared when we copy this data to the SMU later. Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/ci: Clear EnabledForActivity field for memory levelsTimur Kristóf
Follow what radeon did and what amdgpu does for other GPUs with SMU7. Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/ci: Fix powertune defaults for Hawaii 0x67B0Timur Kristóf
There is no AMD GPU with the ID 0x66B0, this looks like a typo. It should be 0x67B0 which is actually part of the PCI ID list, and should use the Hawaii XT powertune defaults according to the old radeon driver. Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/smu7: Remove non-functional SMU7 voltage dependency on DALTimur Kristóf
It looks like this was written for an old version of DC (DAL) and was never adapted afterwards. This was non-functional because it relied on the "dal_power_level" field which was never assigned anywhere in the code base. Also, it was not implemented for CI ASICs. Now superseded by the newer voltage dependency on display clock table added by the previous commit, let's remove. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/smu7: Fix SMU7 voltage dependency on display clockTimur Kristóf
The DCE (display controller engine) requires a minimum voltage in order to function correctly, depending on which clock level it currently uses. Add a new table that contains display clock frequency levels and the corresponding required voltages. The clock frequency levels are taken from DC (and the old radeon driver's voltage dependency table for CI in cases where its values were lower). The voltage levels are taken from the following function: phm_initializa_dynamic_state_adjustment_rule_settings(). Furthermore, in case of CI, call smu7_patch_vddc() on the new table to account for leakage voltage (like in radeon). Use the display clock value from amd_pp_display_configuration to look up the voltage level needed by the DCE. Send the voltage to the SMU via the PPSMC_MSG_VddC_Request command. The previous implementation of this feature was non-functional because it relied on a "dal_power_level" field which was never assigned; and it was not at all implemented for CI ASICs. I verified this on a Radeon R9 M380 which previously booted to a black screen with DC enabled (default since Linux 6.19), but now works correctly. Fixes: 599a7e9fe1b6 ("drm/amd/powerplay: implement smu7 hwmgr to manager asics with smu ip version 7.") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/ci: Disable MCLK DPM on problematic CI ASICsTimur Kristóf
There are two known cases where MCLK DPM can causes issues: Radeon R9 M380 found in iMac computers from 2015. The SMU in this GPU just hangs as soon as we send it the PPSMC_MSG_MCLKDPM_Enable command, even when MCLK switching is disabled, and even when we only populate one MCLK DPM level. Apply workaround to all devices with the same subsystem ID. Radeon R7 260X due to old memory controller microcode. We only flash the MC ucode when it isn't set up by the VBIOS, therefore there is no way to make sure that it has the correct ucode version. I verified that this patch fixes the SMU hang on the R9 M380 which would previously fail to boot. This also fixes the UVD initialization error on that GPU which happened because the SMU couldn't ungate the UVD after it hung. Fixes: 86457c3b21cb ("drm/amd/powerplay: Add support for CI asics to hwmgr") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/ci: Use highest MCLK on CI when MCLK DPM is disabledTimur Kristóf
When MCLK DPM is disabled for any reason, populate the MCLK table with the highest MCLK DPM level, so that the ASIC can use the highest possible memory clock to get good performance even when MCLK DPM is disabled. Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm: Unify version check in SMUv14Lijo Lazar
Use common helper function for firmware version check and logging in SMUv14 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm: Unify version check in SMUv12Lijo Lazar
Use common helper function for firmware version check and logging in SMUv12. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm: Unify version check in SMUv11Lijo Lazar
Use common helper function for firmware version check and logging in SMUv11 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm: Use str_enabled_disabled in amdgpu_pm sysfsAsad Kamal
Coccinelle flags hand-rolled "enabled"/"disabled" strings; use the shared str_enabled_disabled() helper from string_choices.h for npm_status and thermal throttling logging sysfs text. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202603251434.zIN2QYWn-lkp@intel.com/ Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-24drm/amd/pm: Enable VCN reset for pgm=4 with appropriate FW versionJesse.Zhang
Extend the VCN reset capability to include pgm=4 variants when the firmware version meets the required threshold (>= 0x04557100). This follows the existing pattern for pgm=0 and pgm=7, ensuring that VCN reset is enabled only on configurations where it is supported by the firmware. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-24drm/amd/pm: add dedicated dram addr msg for smu v15Yang Wang
Add dedicated SMU Dram MSG mapping to avoid conflicts in SMU IP v15 common code for upcoming ASICs. add new smu msg: - SMU_MSG_SetDriverDramAddr - SMU_MSG_SetToolsDramAddr Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-24drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v14Yang Wang
Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid, otherwise PMFW will reject this configuration on smu v14.0.2/14.0.3. example: $ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve OD_FAN_CURVE: 0: 0C 0% 1: 0C 0% 2: 0C 0% 3: 0C 0% 4: 0C 0% OD_RANGE: FAN_CURVE(hotspot temp): 0C 0C FAN_CURVE(fan speed): 0% 0% $ echo "0 50 40" | sudo tee fan_curve kernel log: [ 969.761627] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! [ 1010.897800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13Yang Wang
Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid, otherwise PMFW will reject this configuration on smu v13.0.x example: $ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve OD_FAN_CURVE: 0: 0C 0% 1: 0C 0% 2: 0C 0% 3: 0C 0% 4: 0C 0% OD_RANGE: FAN_CURVE(hotspot temp): 0C 0C FAN_CURVE(fan speed): 0% 0% $ echo "0 50 40" | sudo tee fan_curve kernel log: [ 756.442527] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! [ 777.345800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! Closes: https://github.com/ROCm/amdgpu/issues/208 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: Enable user specified gfx clock rangesAsad Kamal
Enable user specified gfx clock ranges for smu_15_0_8 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amdgpu: Add smu v15_0_8 ip blockHawking Zhang
Add smu v15_0_8 ip block Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: Add NPM support for smu_v15_0_8Asad Kamal
Add node power management support for smu_v15_0_8 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: Add baseboard temperature metrics supportAsad Kamal
Add baseboard temperature metrics support via system metrics table for smu_v15_0_8 v4: Add separate function to fill baseboard temperature, use 16, remove casting v5: Optimize to use single switch case (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: Add gpuboard temperature metrics supportAsad Kamal
Add gpuboard temperature metrics support via system metrics table for smu_v15_0_8 v3: Use per sensor attr id (Lijo) v4: Use s16 for temp, remove cast, use separate function to fill gpuboard temperature metrics data (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: Add read sensor supportAsad Kamal
Add read sensor support for smu_v15_0_8 v2: Remove gfx voltage support (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: Add ppt1 supportAsad Kamal
Add ppt1 support for smu_v15_0_8 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: Add get_thermal_temperature_range supportAsad Kamal
Add get_thermal_temperature_range support smu_v15_0_8 v2: Remove sriov check (Lijo) v3: Restrict to 1VF mode(Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: Add od_edit_dpm_table supportAsad Kamal
Add od_edit_dpm_table support for smu_v15_0_8 v2: Skip Gl2clk/Fclk (Lijo) v3: sqaush in set_performance_support (Asad) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: add populate_umd_state_clk supportAsad Kamal
add populate_umd_state_clk support for smu 15.0.8 v2: remove gl2clk/socclk/fclk, restrict to only current min/max (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: Add emit clock supportAsad Kamal
Add emit clock support and fetching other metrics data like temperature, clock for smu_v15_0_8 v2: Use umc count for hbm stack temperature (Lijo) v3: Use correct logic for hbm stacks (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: add set{get}_power_limit support for smu 15.0.8Yang Wang
export .set_power_limit & .get_power_limit interface for smu 15.0.8 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: add get_unique_id support for smu 15.0.8Yang Wang
export .get_unique_id interface for smu 15.0.8 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-23drm/amd/pm: add get_gpu_metrics support for 15.0.8Yang Wang
export .get_gpu_metrics interface for 15.0.8 v2: Remove members already exposed by other interfaces, use mask, logical conversion (Lijo) v3: Use correct logic for hbm stacks loop (Lijo) Remove buffer allocation v4: Make out of bound check outside loop (Lijo) v5: fix locking in error case (Alex) Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>