path: root/drivers
Age  Commit message  Author
2025-09-18  drm/xe/xe_late_bind_fw: Extract and print version info  Badal Nilawar
Extract and print version info of the late binding binary. v2: Some refinements (Daniele) Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250905154953.3974335-10-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-09-18  drm/xe/xe_late_bind_fw: Introduce debug fs node to disable late binding  Badal Nilawar
Introduce a debug filesystem node to disable late binding fw reload during system or runtime resume. This is intended for situations where the late binding fw needs to be loaded from user mode, particularly for validation purposes. Note that the xe kmd doesn't participate in the late binding flow from user space. A binary loaded from user space will be lost upon entering D3cold, hence the user space app needs to handle this situation. v2: - s/(uval == 1) ? true : false/!!uval/ (Daniele) v3: - Refine the commit message (Daniele) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250905154953.3974335-9-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-09-18  drm/xe/xe_late_bind_fw: Reload late binding fw during system resume  Badal Nilawar
Reload late binding fw during resume from system suspend v2: - Unconditionally reload late binding fw (Rodrigo) - Flush worker during system suspend Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250905154953.3974335-8-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-09-18  drm/xe/xe_late_bind_fw: Reload late binding fw in rpm resume  Badal Nilawar
Reload late binding fw during runtime resume. Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250905154953.3974335-7-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-09-18  drm/xe/xe_late_bind_fw: Load late binding firmware  Badal Nilawar
Load late binding firmware v2: - s/EAGAIN/EBUSY/ - Flush worker in suspend and driver unload (Daniele) v3: - Use retry interval of 6s, in steps of 200ms, to allow other OS components to release the MEI CL handle (Sasha) v4: - return -ENODEV if component not added (Daniele) - parse and print status returned by csc v5: - Use payload to check firmware valid (Daniele) - Obtain the RPM reference before scheduling the worker to ensure the device remains awake until the worker completes firmware loading (Rodrigo) v6: - In case of error do not re-attempt fw download (Daniele) v7 (Rodrigo): - Rename of mei structs and callback. Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250905154953.3974335-6-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
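The v3 note above mentions a 6 s retry budget in 200 ms steps. A minimal userspace sketch of that kind of bounded retry loop follows. It assumes that only a "busy" result (-EBUSY, as hinted by the v2 note) is retried and that any other error ends the attempt (as the v6 note suggests). The names push_with_retry, push_fn, sleep_fn, and the fake device are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define RETRY_TOTAL_MS 6000u  /* overall budget, from the changelog */
#define RETRY_STEP_MS   200u  /* polling step, from the changelog */

/* Hypothetical callbacks: one upload attempt and a sleep primitive. */
typedef int (*push_fn)(void *ctx);
typedef void (*sleep_fn)(void *ctx, unsigned int ms);

static int push_with_retry(push_fn push, sleep_fn sleep_ms, void *ctx)
{
	unsigned int waited = 0;

	assert(push != NULL && sleep_ms != NULL);
	for (;;) {
		int ret = push(ctx);

		/* Success or a hard error ends the loop: only "busy" is retried. */
		if (ret != -EBUSY)
			return ret;
		/* Give up once the whole budget has been spent. */
		if (waited >= RETRY_TOTAL_MS)
			return ret;
		sleep_ms(ctx, RETRY_STEP_MS);
		waited += RETRY_STEP_MS;
	}
}

/* Illustrative fake device: reports busy for the first busy_left attempts. */
struct fake_dev {
	int busy_left;
	int calls;
	unsigned int slept_ms;
};

static int fake_push(void *ctx)
{
	struct fake_dev *d = ctx;

	d->calls++;
	if (d->busy_left > 0) {
		d->busy_left--;
		return -EBUSY;
	}
	return 0;
}

static void fake_sleep(void *ctx, unsigned int ms)
{
	struct fake_dev *d = ctx;

	d->slept_ms += ms;  /* record the delay instead of really sleeping */
}
```

Passing the sleep primitive as a callback keeps the sketch testable without real delays; a kernel worker would presumably use its own delay helper instead.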
2025-09-18  drm/xe/xe_late_bind_fw: Initialize late binding firmware  Badal Nilawar
Search for late binding firmware binaries and populate the meta data of firmware structures. v2 (Daniele): - drm_err if firmware size is more than max pay load size - s/request_firmware/firmware_request_nowarn/ as firmware will not be available for all possible cards v3 (Daniele): - init firmware from within xe_late_bind_init, propagate error - switch late_bind_fw to array to handle multiple firmware types v4 (Daniele): - Alloc payload dynamically, fix nits v6 (Daniele) - %s/MAX_PAYLOAD_SIZE/XE_LB_MAX_PAYLOAD_SIZE/ Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250905154953.3974335-5-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-09-18  drm/xe/xe_late_bind_fw: Introduce xe_late_bind_fw  Badal Nilawar
Introduce xe_late_bind_fw to enable firmware loading for devices, such as the fan controller, during driver probe. Typically, firmware for such devices is part of the IFWI flash image but can be replaced at probe after OEM tuning. This patch binds the mei late binding component to enable firmware loading. v2: - Add devm_add_action_or_reset to remove the component (Daniele) - Add INTEL_MEI_GSC check in xe_late_bind_init() (Daniele) v3: - Fail driver probe if late bind initialization fails, add has_late_bind flag (Daniele) v4: - %s/I915_COMPONENT_LATE_BIND/INTEL_COMPONENT_LATE_BIND/ v6: - rebased v7: - rebased - In xe_late_bind_init, use drm_err when returning an error to stop the probe (Lucas) - Use imperative mode in commit message (Lucas) Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250905154953.3974335-4-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-09-18  mei: late_bind: add late binding component driver  Alexander Usyskin
Introduce a new MEI client driver to support Late Binding firmware upload/update for Intel discrete graphics platforms. Late Binding is a runtime firmware upload/update mechanism that allows payloads, such as fan control and voltage regulator, to be securely delivered and applied without requiring SPI flash updates or system reboots. This driver enables the Xe graphics driver and other user-space tools to push such firmware blobs to the authentication firmware via the MEI interface. The driver handles authentication, versioning, and communication with the authentication firmware, which in turn coordinates with the PUnit/PCODE to apply the payload. This is a foundational component for enabling dynamic, secure, and re-entrant configuration updates on platforms like Battlemage. Cc: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20250905154953.3974335-3-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-09-18  mei: bus: add mei_cldev_mtu interface  Alexander Usyskin
Add a new helper function that allows MEI client drivers to query the maximum transmission unit (MTU) for a connected MEI client. This is useful for clients that need to transmit large payloads, such as firmware blobs, allowing them to determine the maximum message size that can be safely sent before starting transmission and size of the buffer to allocate when receiving data. Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20250905154953.3974335-2-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-09-18  Merge patch series "mei: connect to card in D3cold"  Greg Kroah-Hartman
Alexander Usyskin <alexander.usyskin@intel.com> says: When a discrete graphics card enters D3cold, the CSC engine is powered down. On wakeup from D3cold, a full HECI link reset is required. The driver should detect that firmware requests a link reset and initiate the link reset flow. In the usual flow the connect IOCTL will trigger the wake from D3cold and the corresponding link reset. The MEI driver invalidates all open handles on link reset, including the one that triggered the wake, rendering this connection unusable. To break this loop, make connect detect that it was interrupted by a link reset and retry the connect attempt after the reset has completed. Link: https://lore.kernel.org/r/20250918130435.3327400-1-alexander.usyskin@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-18  mei: gsc: demote unexpected reset print  Alexander Usyskin
A discrete graphics card can go to D3cold. On exit from D3cold, a link reset is performed. The driver did not expect such a link reset and printed a warning. Print a debug message for an unexpected reset in the discrete graphics case and remove the infrastructure that printed the warning in some cases. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20250918130435.3327400-6-alexander.usyskin@intel.com
2025-09-18  mei: bus: demote error on connect  Alexander Usyskin
There are flows, like exit from D3cold, where connect via the bus can fail. Demote the error print to debug level to unclutter dmesg. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20250918130435.3327400-5-alexander.usyskin@intel.com
2025-09-18  mei: retry connect if interrupted by link reset  Alexander Usyskin
When the device is in D3cold, the connect message will wake the device and cause a link reset. The link reset flow cleans all queues and wakes all waiters. Retry the connect flow if connect has failed and a link reset is detected. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20250918130435.3327400-4-alexander.usyskin@intel.com
2025-09-18  mei: make a local copy of client uuid in connect  Alexander Usyskin
The connect ioctl uses the same memory for in and out parameters. Copy the in parameter (client uuid) to the local stack so that it is not overwritten when the out parameters are filled. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20250918130435.3327400-3-alexander.usyskin@intel.com
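The aliasing problem described above can be sketched in plain C. The union and field names below (connect_arg, in_view, out_view, max_msg_length, protocol_version) are hypothetical stand-ins chosen for illustration, not the MEI UAPI definitions. The sketch only shows why the input has to be copied before the output view is written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ioctl argument: input and output views share one storage. */
struct in_view  { uint8_t uuid[16]; };
struct out_view { uint32_t max_msg_length; uint8_t protocol_version; };
union connect_arg {
	struct in_view  in;
	struct out_view out;
};

/* Fills the output view and hands back the UUID that was passed in. */
static void connect_and_fill(union connect_arg *arg, uint8_t uuid_out[16])
{
	uint8_t uuid[16];

	assert(arg != NULL && uuid_out != NULL);
	/* Copy the input first: writing arg->out below clobbers arg->in. */
	memcpy(uuid, arg->in.uuid, sizeof(uuid));

	memset(&arg->out, 0, sizeof(arg->out));
	arg->out.max_msg_length = 4096;  /* arbitrary illustrative value */
	arg->out.protocol_version = 1;   /* arbitrary illustrative value */

	/* The local copy is still intact after the output has been filled. */
	memcpy(uuid_out, uuid, sizeof(uuid));
}
```

Without the local copy, any later use of arg->in.uuid (for example in a retry) would read bytes that now belong to the output fields.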
2025-09-18  mei: me: trigger link reset if hw ready is unexpected  Alexander Usyskin
Driver can receive HW not ready interrupt unexpectedly, e.g. for cards that go down to D3cold. Trigger link reset in this case to synchronize driver and firmware state. No need to do that sync if driver is going down or interrupt is received before driver started initial link reset sequence. Introduce UNINITIALIZED device state to allow interrupt handler to ignore interrupts before first init. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20250918130435.3327400-2-alexander.usyskin@intel.com
2025-09-18  mei: gsc: fix remove operations order  Alexander Usyskin
The mei disconnect should be the last operation in remove flow. Otherwise the device is used after destruction. Fix minor free flow that happens after device destruction too. The fault leads to the following oops in Intel Gfx CI: <4>[ 267.871331] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bcb: 0000 [#1] SMP NOPTI ... <4>[ 267.871410] RIP: 0010:mei_gsc_remove+0x44/0x90 [mei_gsc] ... <4>[ 267.871555] Call Trace: <4>[ 267.871562] <TASK> <4>[ 267.871570] auxiliary_bus_remove+0x1b/0x30 <4>[ 267.871589] device_remove+0x43/0x80 <4>[ 267.871604] device_release_driver_internal+0x215/0x280 <4>[ 267.871619] device_release_driver+0x12/0x20 <4>[ 267.871630] bus_remove_device+0xdc/0x150 <4>[ 267.871645] device_del+0x15f/0x3b0 <4>[ 267.871656] ? bus_unregister_notifier+0x37/0x50 <4>[ 267.871672] gsc_destroy_one.isra.0+0x44/0x210 [i915] <4>[ 267.872295] intel_gsc_fini+0x28/0x50 [i915] <4>[ 267.872860] intel_gt_driver_unregister+0x2c/0x80 [i915] <4>[ 267.873300] i915_driver_remove+0x6e/0x150 [i915] <4>[ 267.873694] i915_pci_remove+0x1e/0x40 [i915] <4>[ 267.874095] pci_device_remove+0x3e/0xb0 <4>[ 267.874111] device_remove+0x43/0x80 <4>[ 267.874126] device_release_driver_internal+0x215/0x280 <4>[ 267.874137] ? bus_find_device+0xa5/0xe0 <4>[ 267.874153] device_driver_detach+0x14/0x20 <4>[ 267.874164] unbind_store+0xac/0xc0 <4>[ 267.874178] drv_attr_store+0x21/0x50 <4>[ 267.874190] sysfs_kf_write+0x4a/0x80 <4>[ 267.874204] kernfs_fop_write_iter+0x188/0x240 <4>[ 267.874222] vfs_write+0x283/0x540 <4>[ 267.874241] ksys_write+0x6f/0xf0 <4>[ 267.874253] __x64_sys_write+0x19/0x30 <4>[ 267.874264] x64_sys_call+0x79/0x26a0 <4>[ 267.874277] do_syscall_64+0x93/0xd50 <4>[ 267.874291] ? do_syscall_64+0x1a2/0xd50 <4>[ 267.874301] ? do_syscall_64+0x1a2/0xd50 <4>[ 267.874313] ? do_syscall_64+0x1a2/0xd50 <4>[ 267.874324] ? clear_bhb_loop+0x30/0x80 <4>[ 267.874336] ? clear_bhb_loop+0x30/0x80 <4>[ 267.874349] entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: 7704e6be4ed2 ("mei: hook mei_device on class device") Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Link: https://lore.kernel.org/r/20250915124554.2263330-1-alexander.usyskin@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-18  Merge tag 'platform-drivers-x86-v6.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86  Linus Torvalds
Pull x86 platform driver fixes from Ilpo Järvinen: "Fixes and new HW support: - amd/pmc: Add MECHREVO Yilong15Pro to spurious_8042 list - amd/pmf: Support new ACPI ID AMDI0108 - asus-wmi: Re-add extra keys to ignore_key_wlan quirk - oxpec: Add support for AOKZOE A1X and OneXPlayer X1Pro EVA-02" * tag 'platform-drivers-x86-v6.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: asus-wmi: Re-add extra keys to ignore_key_wlan quirk platform/x86/amd/pmf: Support new ACPI ID AMDI0108 platform/x86: oxpec: Add support for AOKZOE A1X platform/x86: oxpec: Add support for OneXPlayer X1Pro EVA-02 platform/x86/amd/pmc: Add MECHREVO Yilong15Pro to spurious_8042 list
2025-09-18  drm/panthor: always set fence errors on CS_FAULT  Chia-I Wu
It is unclear why fence errors were set only for CS_INHERIT_FAULT. Downstream driver also does not treat CS_INHERIT_FAULT specially. Remove the check. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250828200419.3533393-1-olvaffe@gmail.com
2025-09-18  binder: fix double-free in dbitmap  Carlos Llamas
A process might fail to allocate a new bitmap when trying to expand its proc->dmap. In that case, dbitmap_grow() fails and frees the old bitmap via dbitmap_free(). However, the driver calls dbitmap_free() again when the same process terminates, leading to a double-free error: ================================================================== BUG: KASAN: double-free in binder_proc_dec_tmpref+0x2e0/0x55c Free of addr ffff00000b7c1420 by task kworker/9:1/209 CPU: 9 UID: 0 PID: 209 Comm: kworker/9:1 Not tainted 6.17.0-rc6-dirty #5 PREEMPT Hardware name: linux,dummy-virt (DT) Workqueue: events binder_deferred_func Call trace: kfree+0x164/0x31c binder_proc_dec_tmpref+0x2e0/0x55c binder_deferred_func+0xc24/0x1120 process_one_work+0x520/0xba4 [...] Allocated by task 448: __kmalloc_noprof+0x178/0x3c0 bitmap_zalloc+0x24/0x30 binder_open+0x14c/0xc10 [...] Freed by task 449: kfree+0x184/0x31c binder_inc_ref_for_node+0xb44/0xe44 binder_transaction+0x29b4/0x7fbc binder_thread_write+0x1708/0x442c binder_ioctl+0x1b50/0x2900 [...] ================================================================== Fix this issue by marking proc->map NULL in dbitmap_free(). Cc: stable@vger.kernel.org Fixes: 15d9da3f818c ("binder: use bitmap for faster descriptor lookup") Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Tiffany Yang <ynaffit@google.com> Link: https://lore.kernel.org/r/20250915221248.3470154-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
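The fix described above follows a common defensive pattern: clear the pointer in the free helper so that a second call becomes harmless. The userspace sketch below illustrates only that pattern. The type toy_bitmap and its helpers are hypothetical and simplified, and are not the binder dbitmap code.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical, simplified stand-in for a dynamically sized bitmap. */
struct toy_bitmap {
	unsigned long *map;
	unsigned int nbits;
};

/* Returns 0 on success, -1 if the allocation fails. */
static int toy_bitmap_init(struct toy_bitmap *b, unsigned int nbits)
{
	size_t nwords;

	assert(b != NULL && nbits > 0);
	nwords = (nbits + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD;
	b->map = calloc(nwords, sizeof(unsigned long));
	if (!b->map)
		return -1;
	b->nbits = nbits;
	return 0;
}

static void toy_bitmap_free(struct toy_bitmap *b)
{
	assert(b != NULL);
	free(b->map);
	/* Clearing the pointer turns any later call into free(NULL), a no-op. */
	b->map = NULL;
	b->nbits = 0;
}
```

With the pointer cleared, an error path and a later teardown path can both call the free helper without having to coordinate with each other.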
2025-09-18  octeontx2-pf: Fix use-after-free bugs in otx2_sync_tstamp()  Duoming Zhou
The original code relies on cancel_delayed_work() in otx2_ptp_destroy(), which does not ensure that the delayed work item synctstamp_work has fully completed if it was already running. This leads to use-after-free scenarios where otx2_ptp is deallocated by otx2_ptp_destroy(), while synctstamp_work remains active and attempts to dereference otx2_ptp in otx2_sync_tstamp(). Furthermore, because synctstamp_work is cyclic, the likelihood of triggering the bug is non-negligible. A typical race condition is illustrated below: CPU 0 (cleanup) | CPU 1 (delayed work callback) otx2_remove() | otx2_ptp_destroy() | otx2_sync_tstamp() cancel_delayed_work() | kfree(ptp) | | ptp = container_of(...); //UAF | ptp-> //UAF This is confirmed by a KASAN report: BUG: KASAN: slab-use-after-free in __run_timer_base.part.0+0x7d7/0x8c0 Write of size 8 at addr ffff88800aa09a18 by task bash/136 ... Call Trace: <IRQ> dump_stack_lvl+0x55/0x70 print_report+0xcf/0x610 ? __run_timer_base.part.0+0x7d7/0x8c0 kasan_report+0xb8/0xf0 ? __run_timer_base.part.0+0x7d7/0x8c0 __run_timer_base.part.0+0x7d7/0x8c0 ? __pfx___run_timer_base.part.0+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x60/0x140 ? lapic_next_event+0x11/0x20 ? clockevents_program_event+0x1d4/0x2a0 run_timer_softirq+0xd1/0x190 handle_softirqs+0x16a/0x550 irq_exit_rcu+0xaf/0xe0 sysvec_apic_timer_interrupt+0x70/0x80 </IRQ> ...
Allocated by task 1: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x7f/0x90 otx2_ptp_init+0xb1/0x860 otx2_probe+0x4eb/0xc30 local_pci_probe+0xdc/0x190 pci_device_probe+0x2fe/0x470 really_probe+0x1ca/0x5c0 __driver_probe_device+0x248/0x310 driver_probe_device+0x44/0x120 __driver_attach+0xd2/0x310 bus_for_each_dev+0xed/0x170 bus_add_driver+0x208/0x500 driver_register+0x132/0x460 do_one_initcall+0x89/0x300 kernel_init_freeable+0x40d/0x720 kernel_init+0x1a/0x150 ret_from_fork+0x10c/0x1a0 ret_from_fork_asm+0x1a/0x30 Freed by task 136: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3a/0x60 __kasan_slab_free+0x3f/0x50 kfree+0x137/0x370 otx2_ptp_destroy+0x38/0x80 otx2_remove+0x10d/0x4c0 pci_device_remove+0xa6/0x1d0 device_release_driver_internal+0xf8/0x210 pci_stop_bus_device+0x105/0x150 pci_stop_and_remove_bus_device_locked+0x15/0x30 remove_store+0xcc/0xe0 kernfs_fop_write_iter+0x2c3/0x440 vfs_write+0x871/0xd70 ksys_write+0xee/0x1c0 do_syscall_64+0xac/0x280 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure that the delayed work item is properly canceled before the otx2_ptp is deallocated. This bug was initially identified through static analysis. To reproduce and test it, I simulated the OcteonTX2 PCI device in QEMU and introduced artificial delays within the otx2_sync_tstamp() function to increase the likelihood of triggering the bug. Fixes: 2958d17a8984 ("octeontx2-pf: Add support for ptp 1-step mode on CN10K silicon") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18  cnic: Fix use-after-free bugs in cnic_delete_task  Duoming Zhou
The original code uses cancel_delayed_work() in cnic_cm_stop_bnx2x_hw(), which does not guarantee that the delayed work item 'delete_task' has fully completed if it was already running. Additionally, the delayed work item is cyclic, and the flush_workqueue() in cnic_cm_stop_bnx2x_hw() only blocks and waits for work items that were already queued to the workqueue prior to its invocation. Any work items submitted after flush_workqueue() is called are not included in the set of tasks that the flush operation awaits. This means that after the cyclic work items have finished executing, a delayed work item may still exist in the workqueue. This leads to use-after-free scenarios where the cnic_dev is deallocated by cnic_free_dev(), while delete_task remains active and attempts to dereference cnic_dev in cnic_delete_task(). A typical race condition is illustrated below: CPU 0 (cleanup) | CPU 1 (delayed work callback) cnic_netdev_event() | cnic_stop_hw() | cnic_delete_task() cnic_cm_stop_bnx2x_hw() | ... cancel_delayed_work() | /* the queue_delayed_work() flush_workqueue() | executes after flush_workqueue()*/ | queue_delayed_work() cnic_free_dev(dev)//free | cnic_delete_task() //new instance | dev = cp->dev; //use Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure that the cyclic delayed work item is properly canceled and that any ongoing execution of the work item completes before the cnic_dev is deallocated. Furthermore, since cancel_delayed_work_sync() uses __flush_work(work, true) to synchronously wait for any currently executing instance of the work item to finish, the flush_workqueue() becomes redundant and should be removed. This bug was identified through static analysis. To reproduce the issue and validate the fix, I simulated the cnic PCI device in QEMU and introduced intentional delays (such as inserting calls to ssleep() within the cnic_delete_task() function) to increase the likelihood of triggering the bug. 
Fixes: fdf24086f475 ("cnic: Defer iscsi connection cleanup") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18  net: liquidio: fix overflow in octeon_init_instr_queue()  Alexey Nepomnyashih
The expression `(conf->instr_type == 64) << iq_no` can overflow because `iq_no` may be as high as 64 (`CN23XX_MAX_RINGS_PER_PF`). Casting the operand to `u64` ensures correct 64-bit arithmetic. Fixes: f21fb3ed364b ("Add support of Cavium Liquidio ethernet adapters") Signed-off-by: Alexey Nepomnyashih <sdl@nppct.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
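In C, a comparison such as `instr_type == 64` has type int, so shifting it left by a count at or beyond the width of int (commonly 32 bits) is undefined. The sketch below shows the widening cast in isolation. The helper name ring_mode_bit is hypothetical, and the sketch assumes ring indices 0..63, since a shift by the full width of a 64-bit type is also undefined.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns a 64-bit mask with bit iq_no set when instr_type is 64,
 * and 0 otherwise. The cast makes the shift happen in 64-bit
 * arithmetic instead of int arithmetic.
 */
static uint64_t ring_mode_bit(int instr_type, unsigned int iq_no)
{
	assert(iq_no < 64);  /* assumption of this sketch: indices 0..63 */
	return (uint64_t)(instr_type == 64) << iq_no;
}
```

For example, `ring_mode_bit(64, 40)` yields a value with only bit 40 set, which the uncast int expression cannot represent on platforms with a 32-bit int.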
2025-09-18  Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set"  Tariq Toukan
This reverts commit d24341740fe48add8a227a753e68b6eedf4b385a. It causes errors when trying to configure QoS, as well as loss of L2 connectivity (on multi-host devices). Reported-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/20250910170011.70528106@kernel.org Fixes: d24341740fe4 ("net/mlx5e: Update and set Xon/Xoff upon port speed set") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18  perf: arm_spe: Prevent overflow in PERF_IDX2OFF()  Leo Yan
Cast nr_pages to unsigned long to avoid overflow when handling large AUX buffer sizes (>= 2 GiB). Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-18  coresight: trbe: Prevent overflow in PERF_IDX2OFF()  Leo Yan
Cast nr_pages to unsigned long to avoid overflow when handling large AUX buffer sizes (>= 2 GiB). Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver") Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
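Both PERF_IDX2OFF() fixes address the same arithmetic hazard: a page count held in an int, shifted left by the page shift, overflows int once the buffer reaches 2 GiB. The sketch below illustrates the widened computation. It assumes a 4 KiB page (TOY_PAGE_SHIFT of 12) and uses uint64_t rather than unsigned long so that it behaves the same on hosts where unsigned long is 32 bits wide. The function name idx_to_offset is hypothetical and is not the kernel macro.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Maps a free-running index onto an offset inside the buffer. */
static uint64_t idx_to_offset(uint64_t idx, int nr_pages)
{
	uint64_t size;

	assert(nr_pages > 0);
	/* Widen before shifting so the byte size is computed in 64 bits. */
	size = (uint64_t)nr_pages << TOY_PAGE_SHIFT;
	return idx % size;
}
```

With `nr_pages = 1 << 19` the buffer is exactly 2 GiB. The widened shift yields 2^31, whereas the same shift performed on a 32-bit int would overflow.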
2025-09-18  drivers/perf: riscv: Remove redundant ternary operators  Liao Yuanhong
For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20250828122510.30843-1-liaoyuanhong@vivo.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-18  net: virtio_net: add get_rxrings ethtool callback for RX ring queries  Breno Leitao
Replace the existing virtnet_get_rxnfc callback with a dedicated virtnet_get_rxrings implementation to provide the number of RX rings directly via the new ethtool_ops get_rx_ring_count pointer. This simplifies the RX ring count retrieval and aligns virtio_net with the new ethtool API for querying RX ring parameters. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-8-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18  drivers/perf: hisi: Add support for HiSilicon MN PMU driver  Junhao He
MN (Miscellaneous Node) is a hybrid node in ARM CHI. It broadcasts the following two types of requests: DVM operations and PCIe configuration. MN PMU devices exist on both SCCL and SICL, so the MN PMU driver is named after the SCL (Super cluster) ID. The MN PMU driver uses the HiSilicon uncore PMU framework, and only the event parameter is supported. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-18  drivers/perf: hisi: Add support for HiSilicon NoC PMU  Yicong Yang
Add support for the HiSilicon NoC (Network on Chip) PMU, which will be used to monitor events on the system bus. The PMU device will be named after the SCL ID (either Super CPU cluster or Super IO cluster) and the index ID, similar to other HiSilicon Uncore PMUs. The following PMU formats are provided besides the event: - ch: the transaction channel (data, request, response, etc) which can be used to filter the counting. - tt_en: tracetag filtering enable. Just as with other HiSilicon Uncore PMUs, the NoC PMU supports counting only the transactions with tracetag. The NoC PMU doesn't have an interrupt to indicate overflow. However, the counter is 64 bits wide, which is large enough that overflow is nearly impossible. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-18  drm/amdgpu: add missing comment for the new argument  Sunil Khatri
In function 'amdgpu_vm_lock_done_list' update the comment for the new argument 'vm'. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202509180211.UAqME0zj-lkp@intel.com/ Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amdgpu: suspend KFD and KGD user queues for S0ix  Alex Deucher
We need to make sure the user queues are preempted so GFX can enter gfxoff. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: David Perry <david.perry@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amdgpu/userq: Optimize S0ix handling  Alex Deucher
In S0i3, GFX state is retained, so it's preferable to preempt queues rather than unmapping them as the overhead is lower. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: David Perry <david.perry@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amdgpu: Fix PRT flag for gfx12  Joe.Wang
The AMDGPU_PTE_PRT_GFX12 flag was missed during the page table rework; add it back. Fixes: 6716a823d18d ("drm/amdgpu: rework how PTE flags are generated v3") Signed-off-by: Joe Wang <joe.wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amdgpu: Check VF critical region before RAS poison injection  Xiang Liu
Check VF critical region before RAS poison injection to ensure that the poison injection will not hit the VF critical region. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amdkfd: add proper handling for S0ix  Alex Deucher
When in S0i3, the GFX state is retained, so all we need to do is stop the runlist so GFX can enter gfxoff. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: David Perry <david.perry@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amdgpu: Introduce VF critical region check for RAS poison injection  Xiang Liu
The SRIOV guest sends a request to the host via mailbox to check whether the poison injection address is in the VF critical region. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amdgpu: remove non-DC DCE 11 code  Alex Deucher
DC has been the default for ~8 years now and supports many things that the non-DC code does not (audio, DP MST, etc.). No DCE 11.x IPs ever supported analog encoders so that is not an issue. Finally drop this code. Acked-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amd/pm: Enable npm metrics data  Asad Kamal
Enable npm metrics data for smu_v13_0_12 v3: Add node id check for setting NPM_CAPS (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amd/pm: Fetch npm data from system metrics table  Asad Kamal
Fetch npm data from system metrics table for smu_v13_0_12 v3: Remove intermittent type for npm data, remove node id check, move npm caps check to npm_get_data function (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amd/pm: Add sysfs node for node power  Asad Kamal
Add sysfs node to expose node power limit for smu_v13_0_12 v2: Remove support check from visible function (Kevin) v3: Update comments (Kevin) Remove sysfs remove file, change format specifier for sysfs_emit, use attribute_group.name (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  drm/amd/pm: Allow system metrics table in 1vf mode  Asad Kamal
Allow fetching system metrics table in 1VF mode Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-18  perf: arm_pmuv3: Factor out PMCCNTR_EL0 use conditions  Yicong Yang
PMCCNTR_EL0 is preferred for counting CPU_CYCLES under certain conditions. Factor out the condition check to a separate function for further extension. Add documents for better understanding. No functional changes intended. Reviewed-by: James Clark <james.clark@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-18  perf: arm_spe: Add support for FEAT_SPE_EFT extended filtering  James Clark
FEAT_SPE_EFT (optional from Armv9.4) adds mask bits for the existing load, store and branch filters. It also adds two new filter bits for SIMD and floating point with their own associated mask bits. The current filters only allow OR filtering on samples that are load OR store etc, and the new mask bits allow setting part of the filter to an AND, for example filtering samples that are store AND SIMD. With mask bits set to 0, the OR behavior is preserved, so unless any masks are explicitly set, old filters will behave the same. Add them all and make them behave the same way as existing format bits: hidden, and returning EOPNOTSUPP if set when the feature doesn't exist. Reviewed-by: Leo Yan <leo.yan@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-18perf: arm_spe: Expose event filterLeo Yan
Expose an "event_filter" entry in the caps folder to inform user space about which events can be filtered. Change the return type of arm_spe_pmu_cap_get() from u32 to u64 to accommodate the added event filter entry. Signed-off-by: Leo Yan <leo.yan@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-18perf: arm_spe: Support FEAT_SPEv1p4 filtersJames Clark
FEAT_SPEv1p4 (optional from Armv8.8) adds some new filter bits and also makes some previously available bits unavailable again e.g: E[30], bit [30] When FEAT_SPEv1p4 is _not_ implemented ... Continuing to hard code the valid filter bits for each version isn't scalable, and it also doesn't work for filter bits that aren't related to SPE version. For example most bits have a further condition: E[15], bit [15] When ... and filtering on event 15 is supported: Whether "filtering on event 15" is implemented or not is only discoverable from the TRM of that specific CPU or by probing PMSEVFR_EL1. Instead of hard coding them, write all 1s to the PMSEVFR_EL1 register and read it back to discover the RES0 bits. Unsupported bits are RAZ/WI so should read as 0s. For any hardware that doesn't strictly follow RAZ/WI for unsupported filters: Any bits that should have been supported in a specific SPE version but now incorrectly appear to be RES0 wouldn't have worked anyway, so it's better to fail to open events that request them rather than behaving unexpectedly. Bits that aren't implemented but also aren't RAZ/WI will be incorrectly reported as supported, but allowing them to be used is harmless. Testing on N1SDP shows the probed RES0 bits to be the same as the hard coded ones. The FVP with SPEv1p4 shows only additional new RES0 bits, i.e. no previously hard coded RES0 bits are missing. Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
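The probing approach described in this commit message (write all 1s, read back, treat bits that read as 0 as unsupported) can be sketched with a software stand-in for the register. This is a minimal illustration, not the driver's code: `FakeFilterRegister`, `probe_supported_bits`, `validate_filter` and the implemented-bit mask are all hypothetical, and restoring the previous register value after the probe is a choice made for this example.

```python
ALL_ONES_64 = (1 << 64) - 1


class FakeFilterRegister:
    """Stand-in for a 64-bit filter register whose unimplemented bits are RAZ/WI."""

    def __init__(self, implemented_mask):
        self._implemented = implemented_mask
        self._value = 0

    def write(self, value):
        # Writes to unimplemented bits are ignored (WI).
        self._value = value & self._implemented

    def read(self):
        # Unimplemented bits always read as zero (RAZ).
        return self._value


def probe_supported_bits(reg):
    """Write all 1s and read back; the bits that stick are the supported ones."""
    saved = reg.read()
    reg.write(ALL_ONES_64)
    supported = reg.read()
    reg.write(saved)  # restore the previous contents (assumption for this sketch)
    return supported


def validate_filter(requested, supported):
    """Return True only if the request uses no unsupported (RES0) bits."""
    return (requested & ~supported) == 0


# Made-up example: only events 1, 3 and 11 can be filtered on this "CPU".
implemented = (1 << 1) | (1 << 3) | (1 << 11)
reg = FakeFilterRegister(implemented)
supported = probe_supported_bits(reg)
res0 = ~supported & ALL_ONES_64
```

A request for event 3 would validate against `supported`, while a request for event 30 would be rejected because that bit reads back as 0 in this made-up configuration.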
2025-09-18drm/i915: Defeature DRRS on LNL+Ville Syrjälä
DRRS has been defeatured on LNL+. Adjust HAS_DOUBLE_BUFFERED_M_N() to match. Note that the M/N registers still appear to be double buffered under the hood but the double buffer update point is now documented to be just the last register write to the M/N registers, so it no longer happens synchronously with the vblank/MSA transmission. We should perhaps rename HAS_DOUBLE_BUFFERED_M_N() to more accurately reflect reality, but couldn't come up with a decent name right now... Bspec: 68917 HSD: 14016007525 Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250912135926.18910-1-ville.syrjala@linux.intel.com Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
2025-09-18drm/panfrost: Display list of device JM contexts over debugfsBoris Brezillon
For DebugFS builds, create a filesystem knob that, for every single open file of the Panfrost DRM device, shows its command name information and PID (when applicable), and all of its existing JM contexts. For every context, show the DRM scheduler priority value of all of its scheduling entities. Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250917191859.500279-5-adrian.larumbe@collabora.com
2025-09-18drm/panfrost: Expose JM context IOCTLs to UMBoris Brezillon
Minor revision of the driver must be bumped because this expands the uAPI. On top of that, let UM know about the available priorities so that they can create contexts with legal priority values. Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250917191859.500279-4-adrian.larumbe@collabora.com
2025-09-18drm/panfrost: Introduce JM contexts for managing job resourcesBoris Brezillon
A JM context describes user-requested priorities for the JM queues. Context creation leads to the initialization of scheduling entities of the same priority for all the device's job slots. Until context creation and destruction are exposed to UM, all issued jobs shall be bound to the default Panfrost file context, which has medium priority. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250917191859.500279-3-adrian.larumbe@collabora.com
2025-09-18perf/dwc_pcie: Support counting multiple lane events in parallelIlkka Koskinen
While the DesignWare PCIe PMU allows counting only one time-based event at a time, it allows counting all the lane events simultaneously. After this patch, one is able to count a group of lane events: $ perf stat -e '{dwc_rootport/tx_memory_write,lane=1/,dwc_rootport/rx_memory_read,lane=0/}' dd if=/dev/nvme0n1 of=/dev/null bs=1M count=1 Earlier, the events wouldn't have been counted successfully. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Will Deacon <will@kernel.org>