summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-12-16btrfs: fix NULL dereference on root when tracing inode evictionMiquel Sabaté Solà
When evicting an inode the first thing we do is to setup tracing for it, which implies fetching the root's id. But in btrfs_evict_inode() the root might be NULL, as implied in the next check that we do in btrfs_evict_inode(). Hence, we either should set the ->root_objectid to 0 in case the root is NULL, or we move tracing setup after checking that the root is not NULL. Setting the rootid to 0 at least gives us the possibility to trace this call even in the case when the root is NULL, so that's the solution taken here. Fixes: 1abe9b8a138c ("Btrfs: add initial tracepoint support for btrfs") Reported-by: syzbot+d991fea1b4b23b1f6bf8@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d991fea1b4b23b1f6bf8 Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-12-16btrfs: qgroup: update all parent qgroups when doing quick inheritQu Wenruo
[BUG] There is a bug that if a subvolume has multi-level parent qgroups, and is able to do a quick inherit, only the direct parent qgroup got updated: mkfs.btrfs -f -O quota $dev mount $dev $mnt btrfs subv create $mnt/subv1 btrfs qgroup create 1/100 $mnt btrfs qgroup create 2/100 $mnt btrfs qgroup assign 1/100 2/100 $mnt btrfs qgroup assign 0/256 1/100 $mnt btrfs qgroup show -p --sync $mnt Qgroupid Referenced Exclusive Parent Path -------- ---------- --------- ------ ---- 0/5 16.00KiB 16.00KiB - <toplevel> 0/256 16.00KiB 16.00KiB 1/100 subv1 1/100 16.00KiB 16.00KiB 2/100 2/100<1 member qgroup> 2/100 16.00KiB 16.00KiB - <0 member qgroups> btrfs subv snap -i 1/100 $mnt/subv1 $mnt/snap1 btrfs qgroup show -p --sync $mnt Qgroupid Referenced Exclusive Parent Path -------- ---------- --------- ------ ---- 0/5 16.00KiB 16.00KiB - <toplevel> 0/256 16.00KiB 16.00KiB 1/100 subv1 0/257 16.00KiB 16.00KiB 1/100 snap1 1/100 32.00KiB 32.00KiB 2/100 2/100<1 member qgroup> 2/100 16.00KiB 16.00KiB - <0 member qgroups> # Note that 2/100 is not updated, and qgroup numbers are inconsistent umount $mnt [CAUSE] If the snapshot source subvolume belongs to a parent qgroup, and the new snapshot target is also added to the new same parent qgroup, we allow a quick update without marking qgroup inconsistent. But that quick update only update the parent qgroup, without checking if there is any more parent qgroups. [FIX] Iterate through all parent qgroups during the quick inherit. Reported-by: Boris Burkov <boris@bur.io> Fixes: b20fe56cd285 ("btrfs: qgroup: allow quick inherit if snapshot is created and added to the same parent") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-12-16remoteproc: imx_rproc: Use strstarts for "rsc-table" checkShenwei Wang
The resource name may include an address suffix, for example: rsc-table@1fff8000. To handle such cases, use strstarts() instead of strcmp() when checking for "rsc-table". Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com> Reviewed-by: Peng Fan <peng.fan@nxp.com> Fixes: 67a7bc7f0358 ("remoteproc: Use of_reserved_mem_region_* functions for "memory-region"") Link: https://lore.kernel.org/r/20251208233302.684139-1-shenwei.wang@nxp.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2025-12-16btrfs: fix qgroup_snapshot_quick_inherit() squota bugBoris Burkov
qgroup_snapshot_quick_inherit() detects conditions where the snapshot destination would land in the same parent qgroup as the snapshot source subvolume. In this case we can avoid costly qgroup calculations and just add the nodesize of the new snapshot to the parent. However, in the case of squotas this is actually a double count, and also an undercount for deeper qgroup nestings. The following annotated script shows the issue: btrfs quota enable --simple "$mnt" # Create 2-level qgroup hierarchy btrfs qgroup create 2/100 "$mnt" # Q2 (level 2) btrfs qgroup create 1/100 "$mnt" # Q1 (level 1) btrfs qgroup assign 1/100 2/100 "$mnt" # Create base subvolume btrfs subvolume create "$mnt/base" >/dev/null base_id=$(btrfs subvolume show "$mnt/base" | grep 'Subvolume ID:' | awk '{print $3}') # Create intermediate snapshot and add to Q1 btrfs subvolume snapshot "$mnt/base" "$mnt/intermediate" >/dev/null inter_id=$(btrfs subvolume show "$mnt/intermediate" | grep 'Subvolume ID:' | awk '{print $3}') btrfs qgroup assign "0/$inter_id" 1/100 "$mnt" # Create working snapshot with --inherit (auto-adds to Q1) # src=intermediate (in only Q1) # dst=snap (inheriting only into Q1) # This double counts the 16k nodesize of the snapshot in Q1, and # undercounts it in Q2. btrfs subvolume snapshot -i 1/100 "$mnt/intermediate" "$mnt/snap" >/dev/null snap_id=$(btrfs subvolume show "$mnt/snap" | grep 'Subvolume ID:' | awk '{print $3}') # Fully complete snapshot creation sync # Delete working snapshot # Q1 and Q2 will lose the full snap usage btrfs subvolume delete "$mnt/snap" >/dev/null # Delete intermediate and remove from Q1 # Q1 and Q2 will lose the full intermediate usage btrfs qgroup remove "0/$inter_id" 1/100 "$mnt" btrfs subvolume delete "$mnt/intermediate" >/dev/null # Q1 should be at 0, but still has 16k. Q2 is "correct" at 0 (for now...) # Trigger cleaner, wait for deletions mount -o remount,sync=1 "$mnt" btrfs subvolume sync "$mnt" "$snap_id" btrfs subvolume sync "$mnt" "$inter_id" # Remove Q1 from Q2 # Frees 16k more from Q2, underflowing it to 16EiB btrfs qgroup remove 1/100 2/100 "$mnt" # And show the bad state: btrfs qgroup show -pc "$mnt" Qgroupid Referenced Exclusive Parent Child Path -------- ---------- --------- ------ ----- ---- 0/5 16.00KiB 16.00KiB - - <toplevel> 0/256 16.00KiB 16.00KiB - - base 1/100 16.00KiB 16.00KiB - - <0 member qgroups> 2/100 16.00EiB 16.00EiB - - <0 member qgroups> Fix this by simply not doing this quick inheritance with squotas. I suspect that it is also wrong in normal qgroups to not recurse up the qgroup tree in the quick inherit case, though other consistency checks will likely fix it anyway. Fixes: b20fe56cd285 ("btrfs: qgroup: allow quick inherit if snapshot is created and added to the same parent") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
2025-12-16ASoC: Intel: common / SOF: Use function topologies forMark Brown
Merge series from Peter Ujfalusi <peter.ujfalusi@linux.intel.com>: support for NVL-S and the support using functional topology fragments for Soundwire configurations is introduced in 6.19-rc1 in parallel. The SOF projects plan is to not create individual topology files for NVL as with SDCA and the functional topology support can handle most if not all soundwire devices going forward. However one issue have been identified with the functional topology only support, which was masked by the presence of a single topology file: if the device contains a dai link for which we don't have topology fragment, then the probe will fail. This worked with a fallback to a monolithic topology file - which made the dai link to be ignored. The first patch in the series adds a flag to instruct the function discovery to make a best effort to form a card by ignoring functions without corresponding fragment (and print this out for developers) in case there is no fallback topology available. The second patch removes the match entry to refer to a topology file which will not be built by the SOF project.
2025-12-16ACPI: PNP: Drop acpi_nonpnp_device_ids[]Rafael J. Wysocki
Now that "system" devices are represented as platform devices, they are not claimed by the PNP ACPI scan handler any more and platform devices can be created for ACPI device objects listing "system" device IDs as their compatible device IDs. Accordingly, it should not be necessary any more to add device IDs to acpi_nonpnp_device_ids[], so drop it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Link: https://patch.msgid.link/3587570.QJadu78ljV@rafael.j.wysocki
2025-12-16platform/x86/intel/vbtn: Stop creating a platform deviceRafael J. Wysocki
Now that "system" devices are represented as platform devices, they are not claimed by the PNP ACPI scan handler any more and the Intel virtual button array platform devices should be created by the ACPI core, so the driver does not need to attempt to create a platform device by itself. Accordingly, make it stop doing so. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Link: https://patch.msgid.link/8661724.NyiUUSuA9g@rafael.j.wysocki
2025-12-16platform/x86/intel/hid: Stop creating a platform deviceRafael J. Wysocki
Now that "system" devices are represented as platform devices, they are not claimed by the PNP ACPI scan handler any more and the Intel HID platform devices should be created by the ACPI core, so the driver does not need to attempt to create a platform device by itself. Accordingly, make it stop doing so. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Link: https://patch.msgid.link/6115868.MhkbZ0Pkbq@rafael.j.wysocki
2025-12-16ACPI: PNP: Drop PNP0C01 and PNP0C02 from acpi_pnp_device_ids[]Rafael J. Wysocki
There is a long-standing problem with ACPI device enumeration that if the given device has a compatible ID which is one of the generic system resource device IDs (PNP0C01 and PNP0C02), it will be claimed by the PNP scan handler and it will not be represented as a platform device, so it cannot be handled by a platform driver. Drivers have been working around this issue by "manually" creating platform devices that they can bind to (see the Intel HID driver for one example) or adding their device IDs to acpi_nonpnp_device_ids[]. None of the above is particularly clean though and the only reason why the PNP0C01 and PNP0C02 device IDs are present in acpi_pnp_device_ids[] is to allow the legacy PNP system driver to bind to those devices and reserve their resources so they are not used going forward. Obviously, to address this problem PNP0C01 and PNP0C02 need to be dropped from acpi_pnp_device_ids[], but doing so without making any other changes would be problematic because the ACPI core would then create platform devices for the generic system resource device objects and that would not work on all systems for two reasons. First, the PNP system driver explicitly avoids reserving I/O resources below the "standard PC hardware" boundary, 0x100, to avoid conflicts in that range (one possible case when this may happen is when the CMOS RTC driver is involved), but the platform device creation code does not do that. Second, there may be resource conflicts between the "system" devices and the other devices in the system, possibly including conflicts with PCI BARs. Registering the PNP system driver via fs_initcall() helps to manage those conflicts, even though it does not make them go away. Resource conflicts during the registration of "motherboard resources" that occur after PCI has claimed BARs are harmless as a rule and do not need to be addressed in any specific way. To overcome the issues mentioned above, use the observation that it is not actually necessary to create any device objects in addition to struct acpi_device ones in order to reserve the "system" device resources because that can be done directly in the ACPI device enumeration code. Namely, modify acpi_default_enumeration() to add the given ACPI device object to a special "system devices" list if its _HID is either PNP0C01 or PNP0C02 without creating a platform device for it. Next, add a new special acpi_scan_claim_resources() function that will be run via fs_initcall() and will walk that list and reserve resources for each device in it along the lines of what the PNP system driver does. Having made the above changes, drop PNP0C01 and PNP0C02 from acpi_pnp_device_ids[] which will allow platform devices to be created for ACPI device objects whose _CID lists contain PNP0C01 or PNP0C02, but the _HID is not in acpi_pnp_device_ids[]. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> [ rjw: Drop a leftover comment and add a new one elsewhere ] Link: https://patch.msgid.link/9550709.CDJkKcVGEf@rafael.j.wysocki Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-12-16drm/amdkfd: Fix improper NULL termination of queue restore SMI event stringBrian Kocoloski
Pass character "0" rather than NULL terminator to properly format queue restoration SMI events. Currently, the NULL terminator precedes the newline character that is intended to delineate separate events in the SMI event buffer, which can break userspace parsers. Signed-off-by: Brian Kocoloski <brian.kocoloski@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6e7143e5e6e21f9d5572e0390f7089e6d53edf3c)
2025-12-16drm/amd/pm: restore SCLK settings after S0ix resumemythilam
User-configured SCLK(GPU core clock)frequencies were not persisting across S0ix suspend/resume cycles on smu v14 hardware. The issue occurred because of the code resetting clock frequency to zero during resume. This patch addresses the problem by: - Preserving user-configured values in driver and sets the clock frequency across resume - Preserved settings are sent to the hardware during resume Signed-off-by: mythilam <mythilam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 20ba98326f4c69e6bf8d1f42942ece485a675b27)
2025-12-16drm/amdgpu: fix a job->pasid access race in gpu recoveryAlex Deucher
Avoid a possible UAF in GPU recovery due to a race between the sched timeout callback and the tdr work queue. The gpu recovery function calls drm_sched_stop() and later drm_sched_start(). drm_sched_start() restarts the tdr queue which will eventually free the job. If the tdr queue frees the job before time out callback completes, the job will be freed and we'll get a UAF when accessing the pasid. Cache it early to avoid the UAF. Example KASAN trace: [ 493.058141] BUG: KASAN: slab-use-after-free in amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.067530] Read of size 4 at addr ffff88b0ce3f794c by task kworker/u128:1/323 [ 493.074892] [ 493.076485] CPU: 9 UID: 0 PID: 323 Comm: kworker/u128:1 Tainted: G E 6.16.0-1289896.2.zuul.bf4f11df81c1410bbe901c4373305a31 #1 PREEMPT(voluntary) [ 493.076493] Tainted: [E]=UNSIGNED_MODULE [ 493.076495] Hardware name: TYAN B8021G88V2HR-2T/S8021GM2NR-2T, BIOS V1.03.B10 04/01/2019 [ 493.076500] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched] [ 493.076512] Call Trace: [ 493.076515] <TASK> [ 493.076518] dump_stack_lvl+0x64/0x80 [ 493.076529] print_report+0xce/0x630 [ 493.076536] ? _raw_spin_lock_irqsave+0x86/0xd0 [ 493.076541] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 493.076545] ? amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.077253] kasan_report+0xb8/0xf0 [ 493.077258] ? amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.077965] amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.078672] ? __pfx_amdgpu_device_gpu_recover+0x10/0x10 [amdgpu] [ 493.079378] ? amdgpu_coredump+0x1fd/0x4c0 [amdgpu] [ 493.080111] amdgpu_job_timedout+0x642/0x1400 [amdgpu] [ 493.080903] ? pick_task_fair+0x24e/0x330 [ 493.080910] ? __pfx_amdgpu_job_timedout+0x10/0x10 [amdgpu] [ 493.081702] ? _raw_spin_lock+0x75/0xc0 [ 493.081708] ? __pfx__raw_spin_lock+0x10/0x10 [ 493.081712] drm_sched_job_timedout+0x1b0/0x4b0 [gpu_sched] [ 493.081721] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 493.081725] process_one_work+0x679/0xff0 [ 493.081732] worker_thread+0x6ce/0xfd0 [ 493.081736] ? __pfx_worker_thread+0x10/0x10 [ 493.081739] kthread+0x376/0x730 [ 493.081744] ? __pfx_kthread+0x10/0x10 [ 493.081748] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 493.081751] ? __pfx_kthread+0x10/0x10 [ 493.081755] ret_from_fork+0x247/0x330 [ 493.081761] ? __pfx_kthread+0x10/0x10 [ 493.081764] ret_from_fork_asm+0x1a/0x30 [ 493.081771] </TASK> Fixes: a72002cb181f ("drm/amdgpu: Make use of drm_wedge_task_info") Link: https://github.com/HansKristian-Work/vkd3d-proton/pull/2670 Cc: SRINIVASAN.SHANMUGAM@amd.com Cc: vitaly.prosyak@amd.com Cc: christian.koenig@amd.com Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 20880a3fd5dd7bca1a079534cf6596bda92e107d)
2025-12-16sched_ext: fix uninitialized ret on alloc_percpu() failureLiang Jie
Smatch reported: kernel/sched/ext.c:5332 scx_alloc_and_add_sched() warn: passing zero to 'ERR_PTR' In scx_alloc_and_add_sched(), the alloc_percpu() failure path jumps to err_free_gdsqs without initializing @ret. That can lead to returning ERR_PTR(0), which violates the ERR_PTR() convention and confuses callers. Set @ret to -ENOMEM before jumping to the error path when alloc_percpu() fails. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202512141601.yAXDAeA9-lkp@intel.com/ Reported-by: Dan Carpenter <error27@gmail.com> Fixes: c201ea1578d3 ("sched_ext: Move event_stats_cpu into scx_sched") Signed-off-by: Liang Jie <liangjie@lixiang.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-12-16drm/amd/display: Fix DP no audio issueCharlene Liu
[why] need to enable APG_CLOCK_ENABLE enable first also need to wake up az from D3 before access az block Reviewed-by: Swapnil Patel <swapnil.patel@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bf5e396957acafd46003318965500914d5f4edfa)
2025-12-16ASoC: SOF: Support for on-demand DSP bootMark Brown
Merge series from Peter Ujfalusi <peter.ujfalusi@linux.intel.com>: On system suspend / resume we always power up the DSP and boot the firmware, which is not strictly needed as right after the firmware booted up we power the DSP down again on suspend and we also power it down after resume after some inactivity. Similarly, on jack insert/removal we needlesly boot up the firmware to check the jack status, which needs no DSP/firmware communication. The on-demand DSP boot will make sure that we boot the DSP firmware up only when it is needed - for audio activity, in other cases the firmware will be not booted up, which saves time. Out of caution, add a new platform descriptor flag to enable on-demand DSP boot since this might not work without changes to platform code on certain platforms. With the on-demand dsp boot enabled we will not boot the DSP and firmware up on system or rpm resume, just enable audio subsystem since audio IPs, like HDA and SoundWire might be needed (codecs suspend/resume operation). Only boot up the DSP during the first hw_params() call when the DSP is really going to be needed. In this way we can handle the audio related use cases: normal audio use (rpm suspend/resume) system suspend/resume without active audio system suspend/resume with active audio system suspend/resume without active audio, and audio start before the rpm suspend timeout Add module option to force the on-demand DSP boot to allow it to be disabled or enabled without kernel change for testing. The on-demand boot has been tested in our CI for more than half a year and so far no issues have been seen on supported platforms since it's introduction to our development tree (sof-dev).
2025-12-16drm/amd/display: Fix scratch registers offsets for DCN351Ray Wu
[Why] Different platforms use different NBIO header files, causing display code to use differnt offset and read wrong accelerated status. [How] - Unified NBIO offset header file across platform. - Correct scratch registers offsets to proper locations. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 576e032e909c8a6bb3d907b4ef5f6abe0f644199) Cc: stable@vger.kernel.org
2025-12-16drm/amd/display: Fix scratch registers offsets for DCN35Ray Wu
[Why] Different platforms use differnet NBIO header files, causing display code to use differnt offset and read wrong accelerated status. [How] - Unified NBIO offset header file across platform. - Correct scratch registers offsets to proper locations. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 49a63bc8eda0304ba307f5ba68305f936174f72d) Cc: stable@vger.kernel.org
2025-12-16drm/amd: Resume the device in thaw() callback when console suspend is disabledMario Limonciello (AMD)
If console suspend has been disabled using `no_console_suspend` also wake up during thaw() so that some messages can be seen for debugging. Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4191 Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 63387cbbb714d9f0d179d9d4560de1408d0906de)
2025-12-16Merge branch 'libbpf-move-arena-variables-out-of-the-zero-page'Andrii Nakryiko
Emil Tsalapatis says: ==================== libbpf: move arena variables out of the zero page Modify libbpf to place arena globals at the end of the arena mapping instead of the very beginning. This allows programs to leave the "zero page" of the arena unmapped, so that NULL arena pointer dereferences trigger a page fault and associated backtrace in BPF streams. In contrast, the current policy of placing global data in the zero pages means that NULL dereferences silently corrupt global data, e.g, arena qspinlock state. This makes arena bugs more difficult to debug. The patchset adds code to libbpf to move global arena data to the end of the arena. At load time, libbpf adjusts each symbol's location within the arena to point to the right location in the arena. The patchset also adjusts the arena skeleton pointer to point to the arena globals, now that they are not in the beginning of the arena region. CHANGESET ========= v3->v4: (https://lore.kernel.org/bpf/20251215161313.10120-1-emil@etsalapatis.com/T/#t) - Added Acks by Eduard - Changed jumptable sym_off to unsigned int for consistency (AI) - Adjusted selftests to ensure arena globals are actually mapped in (Eduard) - (Patch 2) Adjusted selftests that were failing because they were expecting the now removed "direct map offset" error message v2->v3: (https://lore.kernel.org/bpf/20251203162625.13152-1-emil@etsalapatis.com/) - Remove unnecessary kernel bounds check in resolve_pseudo_ldimm64 (Andrii) - Added patch to turn sym_off unsigned to prevent overflow (AI) - Remove obsolete references to offsets from test patch description (Andrii) - Use size_t for arena_data_off (Andrii) - Remove extra mutable variable from offset calculations (Andrii) v1->v2: (https://lore.kernel.org/bpf/20251118030058.162967-1-emil@etsalapatis.com) - Moved globals to the end of the mapping: (Andrii) - Removed extra parameter for offset and parameter picking logic - Removed padding in the skeleton - Removed additional libbpf call - Added Reviewed-by from Eduard on patch 1 Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> ==================== Link: https://patch.msgid.link/20251216173325.98465-1-emil@etsalapatis.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2025-12-16selftests/bpf: Add tests for the arena offset of globalsEmil Tsalapatis
Add tests for the new libbpf globals arena offset logic. The tests cover the case of globals being as large as the arena itself, and being smaller than the arena. In that case, the data is placed at the end of the arena, and the beginning of the arena is free. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251216173325.98465-6-emil@etsalapatis.com
2025-12-16libbpf: Move arena globals to the end of the arenaEmil Tsalapatis
Arena globals are currently placed at the beginning of the arena by libbpf. This is convenient, but prevents users from reserving guard pages in the beginning of the arena to identify NULL pointer dereferences. Adjust the load logic to place the globals at the end of the arena instead. Also modify bpftool to set the arena pointer in the program's BPF skeleton to point to the globals. Users now call bpf_map__initial_value() to find the beginning of the arena mapping and use the arena pointer in the skeleton to determine which part of the mapping holds the arena globals and which part is free. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251216173325.98465-5-emil@etsalapatis.com
2025-12-16libbpf: Turn relo_core->sym_off unsignedEmil Tsalapatis
The symbols' relocation offsets in BPF are stored in an int field, but cannot actually be negative. When in the next patch libbpf relocates globals to the end of the arena, it is also possible to have valid offsets > 2GiB that are used to calculate the final relo offsets. Avoid accidentally interpreting large offsets as negative by turning the sym_off field unsigned. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251216173325.98465-4-emil@etsalapatis.com
2025-12-16bpf/verifier: Do not limit maximum direct offset into arena mapEmil Tsalapatis
The verifier currently limits direct offsets into a map to 512MiB to avoid overflow during pointer arithmetic. However, this prevents arena maps from using direct addressing instructions to access data at the end of > 512MiB arena maps. This is necessary when moving arena globals to the end of the arena instead of the front. Refactor the verifier code to remove the offset calculation during direct value access calculations. This is possible because the only two map types that implement .map_direct_value_addr() are arrays and arenas, and they both do their own internal checks to ensure the offset is within bounds. Adjust selftests that expect the old error. These tests still fail because the verifier identifies the access as out of bounds for the map, so change them to expect an "invalid access to map value pointer" error instead. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251216173325.98465-3-emil@etsalapatis.com
2025-12-16selftests/bpf: Explicitly account for globals in verifier_arena_largeEmil Tsalapatis
The big_alloc1 test in verifier_arena_large assumes that the arena base and the first page allocated by bpf_arena_alloc_pages are identical. This is not the case, because the first page in the arena is populated by global arena data. The test still passes because the code makes the tacit assumption that the first page is on offset PAGE_SIZE instead of 0. Make this distinction explicit in the code, and adjust the page offsets requested during the test to count from the beginning of the arena instead of using the address of the first allocated page. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251216173325.98465-2-emil@etsalapatis.com
2025-12-16drm/radeon: Convert legacy DRM logging in evergreen.c to drm_* helpersAbhishek Rajput
Replace DRM_DEBUG(), DRM_ERROR(), and DRM_INFO() calls with the corresponding drm_dbg(), drm_err(), and drm_info() helpers in the radeon driver. The drm_*() logging helpers take a struct drm_device * argument, allowing the DRM core to prefix log messages with the correct device name and instance. This is required to correctly distinguish log messages on systems with multiple GPUs. This change aligns radeon with the DRM TODO item: "Convert logging to drm_* functions with drm_device parameter". Signed-off-by: Abhishek Rajput <abhiraj21put@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Add gfx v12_1 interrupt source headerHawking Zhang
To acommandate specific interrupt source for gfx v12_1 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdkfd: Override KFD SVM mappings for GFX 12.1Mukul Joshi
Override the local MTYPE mappings in KFD SVM code with mtype_local modprobe param for GFX 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: correct rlc autoload for xcc harvestLikun Gao
If the number instances of firmware is RLC_NUM_INS_CODE0(Only 1 inst), need to copy it directly for rlcautolad. For the firmware which instances number bigger than 1, only copy for enabled XCC to save copy time. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: add gfx sysfs support for gfx_v12_1Likun Gao
Add gfx sysfs support for gfx_v12_1. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu/mes_v12_1: fix mes access xcd registerJack Xiao
Fix to use local register offset inside die for mes fw accessing local/remote xcd register. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: normalize reg addr as local xcc for gfx v12_1Likun Gao
Normalize registers address to local xcc address for gfx v12_1. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: support xcc harvest for ih translateLikun Gao
Support xcc harvest for ih translate to logic xcc. V2: Only check available instances Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Correct inst_id input from physical to logicLikun Gao
Correct inst_id input from physical to logic for sdma v7_1. V2: Show real instance number on logic xcc. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: use physical xcc id to get rrmtLikun Gao
Use physical xcc_id to get rrmt on misc_op for mes v12_1. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/radeon: Convert logging in radeon_display.c to drm_* helpersMukesh Ogare
Replace DRM_ERROR() and DRM_INFO() calls in drivers/gpu/drm/radeon/radeon_display.c with the corresponding drm_err() and drm_info() helpers. The drm_*() logging functions take a struct drm_device * argument, allowing the DRM core to prefix log messages with the correct device name and instance. This is required to correctly distinguish log messages on systems with multiple GPUs. This change aligns radeon with the DRM TODO item: "Convert logging to drm_* functions with drm_device parameter". Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mukesh Ogare <mukeshogare871@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdkfd: Fix improper NULL termination of queue restore SMI event stringBrian Kocoloski
Pass character "0" rather than NULL terminator to properly format queue restoration SMI events. Currently, the NULL terminator precedes the newline character that is intended to delineate separate events in the SMI event buffer, which can break userspace parsers. Signed-off-by: Brian Kocoloski <brian.kocoloski@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Correct xcc_id input to GET_INST from physical to logicLikun Gao
Correct xcc_id input to GET_INST from physical to logic for gfx_v12_1. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Fix CP_MEC_MDBASE in multi-xcc for gfx v12_1Michael Chen
Need to allocate memory for MEC FW data and program registers CP_MEC_MDBASE for each XCC respectively. Signed-off-by: Michael Chen <michael.chen@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Support 57bit fault address for GFX 12.1.0Philip Yang
The gmc fault virtual address is up to 57bit for 5 level page table, this also works with 48bit virtual address for 4 level page table. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Add pde3 table invalidation request for GFX 12.1.0Philip Yang
Set pde3 invalidation request bit during tlb flush for up to 5 level page table. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdkfd: Update LDS, Scratch base for 57bit addressPhilip Yang
For 5-level page tables, update compute vmid sh_mem_base LDS aperture and Scratch aperture base address to above 57-bit, use the same setting from gfx vmid, we can remove the duplicate macro. Update queue pdd lds_base and scratch_base to the same value as sh_mem_base setting. Then application get process apertures return the correct value to access LDS and Scratch memory for 57bit address 5-level page tables. This may pass to MES in future when mapping queue. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Enable 5-level page table for GFX 12.1.0Philip Yang
GFX 12.1.0 support 57bit virtual, 52bit physical address, set PDE max_level to 4, min_vm_size to 128PB to enable GPU vm 5-level page tables to support 57bit virtual address. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: init RS64_MEC_P2/P3_STACK for gfx12.1Feifei Xu
Add GFX12.1 MEC P2/P3 STACK firmware init. Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Fix CU info calculations for GFX 12.1Mukul Joshi
This patch fixes the CU info calculations for gfx 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdkfd: Update CWSR area calculations for GFX 12.1Mukul Joshi
Update the SGPR, VGPR, HWREG size and number of waves supported for GFX 12.1 CWSR memory limits. The CU calculation changed in topology, as a result, the values need to be updated. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Add soc v1_0 ih client id tableHawking Zhang
To acommandate the specific ih client for soc v1_0 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Flush TLB on all XCCs on GFX 12.1Mukul Joshi
Currently, the driver code is flushing TLB on XCC 0 only. Fix it by flushing on all XCCs within the partition. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amd/pm: restore SCLK settings after S0ix resumemythilam
User-configured SCLK(GPU core clock)frequencies were not persisting across S0ix suspend/resume cycles on smu v14 hardware. The issue occurred because of the code resetting clock frequency to zero during resume. This patch addresses the problem by: - Preserving user-configured values in driver and sets the clock frequency across resume - Preserved settings are sent to the hardware during resume Signed-off-by: mythilam <mythilam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: do not use amdgpu_bo_gpu_offset_no_check individuallySaleemkhan Jamadar
This should not be used indiviually, use amdgpu_bo_gpu_offset with bo reserved. v3 - unpin bo in queue destroy (Christian) v2 - pin bo so that offset returned won't change after unlock (Christian) Signed-off-by: Saleemkhan Jamadar <saleemkhan083@gmail.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-16drm/amdgpu: Change set ip clock/power gating paramLijo Lazar
It's not required to use generic void *, change to struct amdgpu_device *. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>