summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-10-14clk: renesas: cpg-mssr: Read back reset registers to assure values latchedMarek Vasut
On R-Car V4H, the PCIEC controller DBI read would generate an SError in case the controller reset is released by writing SRSTCLR register first, and immediately afterward reading some PCIEC controller DBI register. The issue triggers in rcar_gen4_pcie_additional_common_init() on dw_pcie_readl_dbi(dw, PCIE_PORT_LANE_SKEW), which on V4H is the first read after reset_control_deassert(dw->core_rsts[DW_PCIE_PWR_RST].rstc). The reset controller which contains the SRSTCLR register and the PCIEC controller which contains the DBI register share the same root access bus, but the bus then splits into separate segments before reaching each IP. Even if the SRSTCLR write access was posted on the bus before the DBI read access, it seems the DBI read access may reach the PCIEC controller before the SRSTCLR write completed, and trigger the SError. Mitigate the issue by adding a dummy SRSTCLR read, which assures the SRSTCLR write completes fully and is latched into the reset controller, before the PCIEC DBI read access can occur. Fixes: 0ab55cf18341 ("clk: renesas: cpg-mssr: Add support for R-Car V4H") Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250922162113.113223-1-marek.vasut+renesas@mailbox.org Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2025-10-14clk: renesas: cpg-mssr: Add missing 1ms delay into reset toggle callbackMarek Vasut
R-Car V4H Reference Manual R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025 page 583 Figure 9.3.1(a) Software Reset flow (A) as well as flow (B) / (C) indicate after reset has been asserted by writing a matching reset bit into register SRCR, it is mandatory to wait 1ms. This 1ms delay is documented on R-Car V4H and V4M, it is currently unclear whether S4 is affected as well. This patch does apply the extra delay on R-Car S4 as well. Fix the reset driver to respect the additional delay when toggling resets. Drivers which use separate reset_control_(de)assert() must assure matching delay in their driver code. Fixes: 0ab55cf18341 ("clk: renesas: cpg-mssr: Add support for R-Car V4H") Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250918030552.331389-1-marek.vasut+renesas@mailbox.org Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2025-10-14pinctrl: renesas: Remove unneeded semicolonsGeert Uytterhoeven
Semicolons after end of function braces are not needed, remove them. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/99db8c1bfb64980b54a4b5c4988c7935609133e1.1758718027.git.geert+renesas@glider.be
2025-10-14pinctrl: renesas: rzg2l: Remove extra semicolonsCosmin Tanislav
Semicolons after end of function braces are unnecessary, remove them. Signed-off-by: Cosmin Tanislav <cosmin-gabriel.tanislav.xa@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250923174951.1136259-1-cosmin-gabriel.tanislav.xa@renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2025-10-14pinctrl: renesas: rzg2l: Fix PMC restoreBiju Das
PMC restore needs unlocking the register using the PWPR register. Fixes: ede014cd1ea6422d ("pinctrl: renesas: rzg2l: Add function pointer for PMC register write") Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250921111557.103069-2-biju.das.jz@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2025-10-14pinctrl: renesas: Drop duplicate newlinesMarek Vasut
Remove duplicate newlines. No functional change. Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250918200409.37284-1-marek.vasut+renesas@mailbox.org Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2025-10-14pinctrl: renesas: rzg2l: Drop unnecessary pin configurationsBiju Das
There is no need to reconfigure a pin if the pin's configuration values are the same as the reset values. E.g. the PS0 pin configuration for the NMI function is PMC = 1 and PFC = 0, which is the same as the reset values. Currently the code is first setting it to GPIO HI-Z state and then again reconfiguring to the NMI function, leading to spurious IRQs. Fix this by dropping unnecessary pin configuration from the driver. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250909104247.3309-1-biju.das.jz@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2025-10-14pinctrl: renesas: rzg2l: Fix ISEL restore on resumeClaudiu Beznea
Commit 1d2da79708cb ("pinctrl: renesas: rzg2l: Avoid configuring ISEL in gpio_irq_{en,dis}able*()") dropped the configuration of ISEL from struct irq_chip::{irq_enable, irq_disable} APIs and moved it to struct gpio_chip::irq::{child_to_parent_hwirq, child_irq_domain_ops::free} APIs to fix spurious IRQs. After commit 1d2da79708cb ("pinctrl: renesas: rzg2l: Avoid configuring ISEL in gpio_irq_{en,dis}able*()"), ISEL was no longer configured properly on resume. This is because the pinctrl resume code used struct irq_chip::irq_enable (called from rzg2l_gpio_irq_restore()) to reconfigure the wakeup interrupts. Some drivers (e.g. Ethernet) may also reconfigure non-wakeup interrupts on resume through their own code, eventually calling struct irq_chip::irq_enable. Fix this by adding ISEL configuration back into the struct irq_chip::irq_enable API and on resume path for wakeup interrupts. As struct irq_chip::irq_enable needs now to lock to update the ISEL, convert the struct rzg2l_pinctrl::lock to a raw spinlock and replace the locking API calls with the raw variants. Otherwise the lockdep reports invalid wait context when probing the adv7511 module on RZ/G2L: [ BUG: Invalid wait context ] 6.17.0-rc5-next-20250911-00001-gfcfac22533c9 #18 Not tainted ----------------------------- (udev-worker)/165 is trying to lock: ffff00000e3664a8 (&pctrl->lock){....}-{3:3}, at: rzg2l_gpio_irq_enable+0x38/0x78 other info that might help us debug this: context-{5:5} 3 locks held by (udev-worker)/165: #0: ffff00000e890108 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x90/0x1ac #1: ffff000011c07240 (request_class){+.+.}-{4:4}, at: __setup_irq+0xb4/0x6dc #2: ffff000011c070c8 (lock_class){....}-{2:2}, at: __setup_irq+0xdc/0x6dc stack backtrace: CPU: 1 UID: 0 PID: 165 Comm: (udev-worker) Not tainted 6.17.0-rc5-next-20250911-00001-gfcfac22533c9 #18 PREEMPT Hardware name: Renesas SMARC EVK based on r9a07g044l2 (DT) Call trace: show_stack+0x18/0x24 (C) dump_stack_lvl+0x90/0xd0 dump_stack+0x18/0x24 __lock_acquire+0xa14/0x20b4 lock_acquire+0x1c8/0x354 _raw_spin_lock_irqsave+0x60/0x88 rzg2l_gpio_irq_enable+0x38/0x78 irq_enable+0x40/0x8c __irq_startup+0x78/0xa4 irq_startup+0x108/0x16c __setup_irq+0x3c0/0x6dc request_threaded_irq+0xec/0x1ac devm_request_threaded_irq+0x80/0x134 adv7511_probe+0x928/0x9a4 [adv7511] i2c_device_probe+0x22c/0x3dc really_probe+0xbc/0x2a0 __driver_probe_device+0x78/0x12c driver_probe_device+0x40/0x164 __driver_attach+0x9c/0x1ac bus_for_each_dev+0x74/0xd0 driver_attach+0x24/0x30 bus_add_driver+0xe4/0x208 driver_register+0x60/0x128 i2c_register_driver+0x48/0xd0 adv7511_init+0x5c/0x1000 [adv7511] do_one_initcall+0x64/0x30c do_init_module+0x58/0x23c load_module+0x1bcc/0x1d40 init_module_from_file+0x88/0xc4 idempotent_init_module+0x188/0x27c __arm64_sys_finit_module+0x68/0xac invoke_syscall+0x48/0x110 el0_svc_common.constprop.0+0xc0/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x4c/0x160 el0t_64_sync_handler+0xa0/0xe4 el0t_64_sync+0x198/0x19c Having ISEL configuration back into the struct irq_chip::irq_enable API should be safe with respect to spurious IRQs, as in the probe case IRQs are enabled anyway in struct gpio_chip::irq::child_to_parent_hwirq. No spurious IRQs were detected on suspend/resume, boot, ethernet link insert/remove tests (executed on RZ/G3S). Boot, ethernet link insert/remove tests were also executed successfully on RZ/G2L. Fixes: 1d2da79708cb ("pinctrl: renesas: rzg2l: Avoid configuring ISEL in gpio_irq_{en,dis}able*(") Cc: stable@vger.kernel.org Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250912095308.3603704-1-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2025-10-14drm/rockchip: vop2: use correct destination rectangle height checkAlok Tiwari
The vop2_plane_atomic_check() function incorrectly checks drm_rect_width(dest) twice instead of verifying both width and height. Fix the second condition to use drm_rect_height(dest) so that invalid destination rectangles with height < 4 are correctly rejected. Fixes: 604be85547ce ("drm/rockchip: Add VOP2 driver") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Andy Yan <andy.yan@rock-chips.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20251012142005.660727-1-alok.a.tiwari@oracle.com
2025-10-14amd-xgbe: Avoid spurious link down messages during interface toggleRaju Rangoju
During interface toggle operations (ifdown/ifup), the driver currently resets the local helper variable 'phy_link' to -1. This causes the link state machine to incorrectly interpret the state as a link change event, resulting in spurious "Link is down" messages being logged when the interface is brought back up. Preserve the phy_link state across interface toggles to avoid treating the -1 sentinel value as a legitimate link state transition. Fixes: 88131a812b16 ("amd-xgbe: Perform phy connect/disconnect at dev open/stop") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Link: https://patch.msgid.link/20251010065142.1189310-1-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-14xhci: dbc: enable back DbC in resume if it was enabled before suspendMathias Nyman
DbC is currently only enabled back if it's in configured state during suspend. If system is suspended after DbC is enabled, but before the device is properly enumerated by the host, then DbC would not be enabled back in resume. Always enable DbC back in resume if it's suspended in enabled, connected, or configured state Cc: stable <stable@kernel.org> Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver") Tested-by: Łukasz Bartosik <ukaszb@chromium.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-14xhci: dbc: fix bogus 1024 byte prefix if ttyDBC read races with stall eventMathias Nyman
DbC may add 1024 bogus bytes to the beginneing of the receiving endpoint if DbC hw triggers a STALL event before any Transfer Blocks (TRBs) for incoming data are queued, but driver handles the event after it queued the TRBs. This is possible as xHCI DbC hardware may trigger spurious STALL transfer events even if endpoint is empty. The STALL event contains a pointer to the stalled TRB, and "remaining" untransferred data length. As there are no TRBs queued yet the STALL event will just point to first TRB position of the empty ring, with '0' bytes remaining untransferred. DbC driver is polling for events, and may not handle the STALL event before /dev/ttyDBC0 is opened and incoming data TRBs are queued. The DbC event handler will now assume the first queued TRB (length 1024) has stalled with '0' bytes remaining untransferred, and copies the data This race situation can be practically mitigated by making sure the event handler handles all pending transfer events when DbC reaches configured state, and only then create dev/ttyDbC0, and start queueing transfers. The event handler can this way detect the STALL events on empty rings and discard them before any transfers are queued. This does in practice solve the issue, but still leaves a small possible gap for the race to trigger. We still need a way to distinguish spurious STALLs on empty rings with '0' bytes remaing, from actual STALL events with all bytes transmitted. Cc: stable <stable@kernel.org> Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver") Tested-by: Łukasz Bartosik <ukaszb@chromium.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-14usb: xhci-pci: Fix USB2-only root hub registrationMichal Pecio
A recent change to hide USB3 root hubs of USB2-only controllers broke registration of USB2 root hubs - allow_single_roothub is set too late, and by this time xhci_run() has already deferred root hub registration until after the shared HCD is added, which will never happen. This makes such controllers unusable, but testers didn't notice since they were only bothered by warnings about empty USB3 root hubs. The bug causes problems to other people who actually use such HCs and I was able to confirm it on an ordinary HC by patching to ignore USB3 ports. Setting allow_single_roothub during early setup fixes things. Reported-by: Arisa Snowbell <arisa.snowbell@gmail.com> Closes: https://lore.kernel.org/linux-usb/CABpa4MA9unucCoKtSdzJyOLjHNVy+Cwgz5AnAxPkKw6vuox1Nw@mail.gmail.com/ Reported-by: Michal Kubecek <mkubecek@suse.cz> Closes: https://lore.kernel.org/linux-usb/lnb5bum7dnzkn3fc7gq6hwigslebo7o4ccflcvsc3lvdgnu7el@fvqpobbdoapl/ Fixes: 719de070f764 ("usb: xhci-pci: add support for hosts with zero USB3 ports") Tested-by: Arisa Snowbell <arisa.snowbell@gmail.com> Tested-by: Michal Kubecek <mkubecek@suse.cz> Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-14drm/ttm: Add safety check for NULL man->bdev in ttm_resource_manager_usageJesse.Zhang
The `ttm_resource_manager_usage()` function currently assumes `man->bdev` is non-NULL when accessing `man->bdev->lru_lock`. However, in scenarios where the resource manager is not fully initialized (e.g., APU platforms that lack dedicated VRAM, or incomplete manager setup), `man->bdev` may remain NULL. This leads to a NULL pointer dereference when attempting to acquire the `lru_lock`, triggering kernel OOPS. Fix this by adding an explicit safety check for `man->bdev` before accessing its members: - Use `WARN_ON_ONCE(!man->bdev)` to emit a one-time warning (a soft assertion) when `man->bdev` is NULL. This helps catch invalid usage patterns during debugging without breaking production workflows. - Return 0 immediately if `man->bdev` is NULL, as a non-initialized manager cannot have valid resource usage to report. Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://lore.kernel.org/r/20251013085849.1735612-1-Jesse.Zhang@amd.com
2025-10-14Merge drm/drm-next into drm-intel-nextJani Nikula
Sync to v6.18-rc1. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-10-14drm/draw: fix color truncation in drm_draw_fill24Francesco Valla
The color parameter passed to drm_draw_fill24() was truncated to 16 bits, leading to an incorrect color drawn to the target iosys_map. Fix this behavior, widening the parameter to 32 bits. Fixes: 31fa2c1ca0b2 ("drm/panic: Move drawing functions to drm_draw") Signed-off-by: Francesco Valla <francesco@valla.it> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20251003-drm_draw_fill24_fix-v1-1-8fb7c1c2a893@valla.it Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
2025-10-13net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not presentMarek Vasut
The driver is currently checking for PHYCR2 register presence in rtl8211f_config_init(), but it does so after accessing PHYCR2 to disable EEE. This was introduced in commit bfc17c165835 ("net: phy: realtek: disable PHY-mode EEE"). Move the PHYCR2 presence test before the EEE disablement and simplify the code. Fixes: bfc17c165835 ("net: phy: realtek: disable PHY-mode EEE") Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20251011110309.12664-1-marek.vasut@mailbox.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13ixgbe: fix too early devlink_free() in ixgbe_remove()Koichiro Den
Since ixgbe_adapter is embedded in devlink, calling devlink_free() prematurely in the ixgbe_remove() path can lead to UAF. Move devlink_free() to the end. KASAN report: BUG: KASAN: use-after-free in ixgbe_reset_interrupt_capability+0x140/0x180 [ixgbe] Read of size 8 at addr ffff0000adf813e0 by task bash/2095 CPU: 1 UID: 0 PID: 2095 Comm: bash Tainted: G S 6.17.0-rc2-tnguy.net-queue+ #1 PREEMPT(full) [...] Call trace: show_stack+0x30/0x90 (C) dump_stack_lvl+0x9c/0xd0 print_address_description.constprop.0+0x90/0x310 print_report+0x104/0x1f0 kasan_report+0x88/0x180 __asan_report_load8_noabort+0x20/0x30 ixgbe_reset_interrupt_capability+0x140/0x180 [ixgbe] ixgbe_clear_interrupt_scheme+0xf8/0x130 [ixgbe] ixgbe_remove+0x2d0/0x8c0 [ixgbe] pci_device_remove+0xa0/0x220 device_remove+0xb8/0x170 device_release_driver_internal+0x318/0x490 device_driver_detach+0x40/0x68 unbind_store+0xec/0x118 drv_attr_store+0x64/0xb8 sysfs_kf_write+0xcc/0x138 kernfs_fop_write_iter+0x294/0x440 new_sync_write+0x1fc/0x588 vfs_write+0x480/0x6a0 ksys_write+0xf0/0x1e0 __arm64_sys_write+0x70/0xc0 invoke_syscall.constprop.0+0xcc/0x280 el0_svc_common.constprop.0+0xa8/0x248 do_el0_svc+0x44/0x68 el0_svc+0x54/0x160 el0t_64_sync_handler+0xa0/0xe8 el0t_64_sync+0x1b0/0x1b8 Fixes: a0285236ab93 ("ixgbe: add initial devlink support") Signed-off-by: Koichiro Den <den@valinux.co.jp> Tested-by: Rinitha S <sx.rinitha@intel.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-6-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13ixgbe: handle IXGBE_VF_FEATURES_NEGOTIATE mbox cmdJedrzej Jagielski
Send to VF information about features supported by the PF driver. Increase API version to 1.7. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-5-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13ixgbevf: fix mailbox API compatibility by negotiating supported featuresJedrzej Jagielski
There was backward compatibility in the terms of mailbox API. Various drivers from various OSes supporting 10G adapters from Intel portfolio could easily negotiate mailbox API. This convention has been broken since introducing API 1.4. Commit 0062e7cc955e ("ixgbevf: add VF IPsec offload code") added support for IPSec which is specific only for the kernel ixgbe driver. None of the rest of the Intel 10G PF/VF drivers supports it. And actually lack of support was not included in the IPSec implementation - there were no such code paths. No possibility to negotiate support for the feature was introduced along with introduction of the feature itself. Commit 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") increasing API version to 1.5 did the same - it introduced code supported specifically by the PF ESX driver. It altered API version for the VF driver in the same time not touching the version defined for the PF ixgbe driver. It led to additional discrepancies, as the code provided within API 1.6 cannot be supported for Linux ixgbe driver as it causes crashes. The issue was noticed some time ago and mitigated by Jake within the commit d0725312adf5 ("ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5"). As a result we have regression for IPsec support and after increasing API to version 1.6 ixgbevf driver stopped to support ESX MBX. To fix this mess add new mailbox op asking PF driver about supported features. Basing on a response determine whether to set support for IPSec and ESX-specific enhanced mailbox. New mailbox op, for compatibility purposes, must be added within new API revision, as API version of OOT PF & VF drivers is already increased to 1.6 and doesn't incorporate features negotiate op. Features negotiation mechanism gives possibility to be extended with new features when needed in the future. Reported-by: Jacob Keller <jacob.e.keller@intel.com> Closes: https://lore.kernel.org/intel-wired-lan/20241101-jk-ixgbevf-mailbox-v1-5-fixes-v1-0-f556dc9a66ed@intel.com/ Fixes: 0062e7cc955e ("ixgbevf: add VF IPsec offload code") Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-4-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13ixgbe: handle IXGBE_VF_GET_PF_LINK_STATE mailbox operationJedrzej Jagielski
Update supported API version and provide handler for IXGBE_VF_GET_PF_LINK_STATE cmd. Simply put stored values of link speed and link_up from adapter context. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Link: https://lore.kernel.org/stable/20250828095227.1857066-3-jedrzej.jagielski%40intel.com Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-3-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13ixgbevf: fix getting link speed data for E610 devicesJedrzej Jagielski
E610 adapters no longer use the VFLINKS register to read PF's link speed and linkup state. As a result VF driver cannot get actual link state and it incorrectly reports 10G which is the default option. It leads to a situation where even 1G adapters print 10G as actual link speed. The same happens when PF driver set speed different than 10G. Add new mailbox operation to let the VF driver request a PF driver to provide actual link data. Update the mailbox api to v1.6. Incorporate both ways of getting link status within the legacy ixgbe_check_mac_link_vf() function. Fixes: 4c44b450c69b ("ixgbevf: Add support for Intel(R) E610 device") Co-developed-by: Andrzej Wilczynski <andrzejx.wilczynski@intel.com> Signed-off-by: Andrzej Wilczynski <andrzejx.wilczynski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-2-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13idpf: cleanup remaining SKBs in PTP flowsMilena Olech
When the driver requests Tx timestamp value, one of the first steps is to clone SKB using skb_get. It increases the reference counter for that SKB to prevent unexpected freeing by another component. However, there may be a case where the index is requested, SKB is assigned and never consumed by PTP flows - for example due to reset during running PTP apps. Add a check in release timestamping function to verify if the SKB assigned to Tx timestamp latch was freed, and release remaining SKBs. Fixes: 4901e83a94ef ("idpf: add Tx timestamp capabilities negotiation") Signed-off-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-1-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13net: phy: bcm54811: Fix GMII/MII/MII-Lite selectionKamil Horák - 2N
The Broadcom bcm54811 is hardware-strapped to select among RGMII and GMII/MII/MII-Lite modes. However, the corresponding bit, RGMII Enable in Miscellaneous Control Register must be also set to select desired RGMII or MII(-lite)/GMII mode. Fixes: 3117a11fff5af9e7 ("net: phy: bcm54811: PHY initialization") Signed-off-by: Kamil Horák - 2N <kamilh@axis.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20251009130656.1308237-2-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111HLinmao Li
After resume from S4 (hibernate), RTL8168H/RTL8111H truncates incoming packets. Packet captures show messages like "IP truncated-ip - 146 bytes missing!". The issue is caused by RxConfig not being properly re-initialized after resume. Re-initializing the RxConfig register before the chip re-initialization sequence avoids the truncation and restores correct packet reception. This follows the same pattern as commit ef9da46ddef0 ("r8169: fix data corruption issue on RTL8402"). Fixes: 6e1d0b898818 ("r8169:add support for RTL8168H and RTL8107E") Signed-off-by: Linmao Li <lilinmao@kylinos.cn> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/20251009122549.3955845-1-lilinmao@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13dpll: zl3073x: Handle missing or corrupted flash configurationIvan Vecera
If the internal flash contains missing or corrupted configuration, basic communication over the bus still functions, but the device is not capable of normal operation (for example, using mailboxes). This condition is indicated in the info register by the ready bit. If this bit is cleared, the probe procedure times out while fetching the device state. Handle this case by checking the ready bit value in zl3073x_dev_start() and skipping DPLL device and pin registration if it is cleared. Do not report this condition as an error, allowing the devlink device to be registered and enabling the user to flash the correct configuration. Prior this patch: [ 31.112299] zl3073x-i2c 1-0070: Failed to fetch input state: -ETIMEDOUT [ 31.116332] zl3073x-i2c 1-0070: error -ETIMEDOUT: Failed to start device [ 31.136881] zl3073x-i2c 1-0070: probe with driver zl3073x-i2c failed with error -110 After this patch: [ 41.011438] zl3073x-i2c 1-0070: FW not fully ready - missing or corrupted config Fixes: 75a71ecc24125 ("dpll: zl3073x: Register DPLL devices and pins") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251008141445.841113-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13PCI: cadence: Search for MSI Capability with correct IDHans Zhang
907912c1daa7 ("PCI: cadence: Use cdns_pcie_find_*capability() to avoid hardcoding offsets") incorrectly searched for the MSI-X Capability ID instead of the MSI Capability ID in cdns_pcie_ep_get_msi(). Search for PCI_CAP_ID_MSI, not PCI_CAP_ID_MSIX, to fix this problem. Fixes: 907912c1daa7 ("PCI: cadence: Use cdns_pcie_find_*capability() to avoid hardcoding offsets") Reported-by: Sasha Levin <sashal@kernel.org> Closes: https://lore.kernel.org/r/aOfMk9BW8BH2P30V@laps/ Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251010144307.12979-1-18255117159@163.com
2025-10-13cxl/features: Add check for no entries in cxl_feature_infoDave Jiang
cxl EDAC calls cxl_feature_info() to get the feature information and if the hardware has no Features support, cxlfs may be passed in as NULL. [ 51.957498] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 51.965571] #PF: supervisor read access in kernel mode [ 51.971559] #PF: error_code(0x0000) - not-present page [ 51.977542] PGD 17e4f6067 P4D 0 [ 51.981384] Oops: Oops: 0000 [#1] SMP NOPTI [ 51.986300] CPU: 49 UID: 0 PID: 3782 Comm: systemd-udevd Not tainted 6.17.0dj test+ #64 PREEMPT(voluntary) [ 51.997355] Hardware name: <removed> [ 52.009790] RIP: 0010:cxl_feature_info+0xa/0x80 [cxl_core] Add a check for cxlfs before dereferencing it and return -EOPNOTSUPP if there is no cxlfs created due to no hardware support. Fixes: eb5dfcb9e36d ("cxl: Add support to handle user feature commands for set feature") Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-10-13drm/xe/guc: Check GuC running state before deregistering exec queueShuicheng Lin
In normal operation, a registered exec queue is disabled and deregistered through the GuC, and freed only after the GuC confirms completion. However, if the driver is forced to unbind while the exec queue is still running, the user may call exec_destroy() after the GuC has already been stopped and CT communication disabled. In this case, the driver cannot receive a response from the GuC, preventing proper cleanup of exec queue resources. Fix this by directly releasing the resources when GuC is not running. Here is the failure dmesg log: " [ 468.089581] ---[ end trace 0000000000000000 ]--- [ 468.089608] pci 0000:03:00.0: [drm] *ERROR* GT0: GUC ID manager unclean (1/65535) [ 468.090558] pci 0000:03:00.0: [drm] GT0: total 65535 [ 468.090562] pci 0000:03:00.0: [drm] GT0: used 1 [ 468.090564] pci 0000:03:00.0: [drm] GT0: range 1..1 (1) [ 468.092716] ------------[ cut here ]------------ [ 468.092719] WARNING: CPU: 14 PID: 4775 at drivers/gpu/drm/xe/xe_ttm_vram_mgr.c:298 ttm_vram_mgr_fini+0xf8/0x130 [xe] " v2: use xe_uc_fw_is_running() instead of xe_guc_ct_enabled(). As CT may go down and come back during VF migration. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20251010172529.2967639-2-shuicheng.lin@intel.com (cherry picked from commit 9b42321a02c50a12b2beb6ae9469606257fbecea) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13drm/xe: Enable media sampler power gatingVinay Belgaumkar
Where applicable, enable media sampler power gating. Also, add it to the powergate_info debugfs. v2: Remove the sampler powergate status since it is cleared quickly anyway. v3: Use vcs mask (Rodrigo) and fix the version check for media v4: Remove extra spaces v5: Media samplers are independent of vcs mask, use Media version 1255 (Matt Roper) Fixes: 38e8c4184ea0 ("drm/xe: Enable Coarse Power Gating") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20251010011047.2047584-1-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 4cbc08649a54c3d533df9832342d52d409dfbbf0) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13drm/xe: Handle mixed mappings and existing VRAM on atomic faultsMatthew Brost
Moving to VRAM will fail if mixed mappings are present or if the page is already located in VRAM. Atomic faults that require a move to VRAM currently retry without attempting to evict mixed mappings or locate existing VRAM mappings. This patch fixes the issue by attempting to evict mixed mappings or find existing VRAM pages when a move to VRAM fails during atomic fault handling. Fixes: a9ac0fa455b0 ("drm/xe: Strict migration policy for atomic SVM faults") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20251009130629.3531962-1-matthew.brost@intel.com (cherry picked from commit 75188605c56d10c1bd3b1cd94f4872f349c3a9c8) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13drm/xe/migrate: Fix an error pathThomas Hellström
The exhaustive eviction accidently changed an error path goto to a return. Fix this. Fixes: 59eabff2a352 ("drm/xe: Convert xe_bo_create_pin_map() for exhaustive eviction") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250910160939.103473-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 381f1ed15159c4b3f00dd37cc70924dedebeb111) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13drm/xe: Move rebar to be done earlierLucas De Marchi
There may be cases in which the BAR0 also needs to move to accommodate the bigger BAR2. However if it's not released, the BAR2 resize fails. During the vram probe it can't be released as it's already in use by xe_mmio for early register access. Add a new function in xe_vram and let xe_pci call it directly before even early device probe. This allows the BAR2 to resize in cases BAR0 also needs to move, assuming there aren't other reasons to hold that move: [] xe 0000:03:00.0: vgaarb: deactivate vga console [] xe 0000:03:00.0: [drm] Attempting to resize bar from 8192MiB -> 16384MiB [] xe 0000:03:00.0: BAR 0 [mem 0x83000000-0x83ffffff 64bit]: releasing [] xe 0000:03:00.0: BAR 2 [mem 0x4000000000-0x41ffffffff 64bit pref]: releasing [] pcieport 0000:02:01.0: bridge window [mem 0x4000000000-0x41ffffffff 64bit pref]: releasing [] pcieport 0000:01:00.0: bridge window [mem 0x4000000000-0x41ffffffff 64bit pref]: releasing [] pcieport 0000:01:00.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref]: assigned [] pcieport 0000:02:01.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref]: assigned [] xe 0000:03:00.0: BAR 2 [mem 0x4000000000-0x43ffffffff 64bit pref]: assigned [] xe 0000:03:00.0: BAR 0 [mem 0x83000000-0x83ffffff 64bit]: assigned [] pcieport 0000:00:01.0: PCI bridge to [bus 01-04] [] pcieport 0000:00:01.0: bridge window [mem 0x83000000-0x840fffff] [] pcieport 0000:00:01.0: bridge window [mem 0x4000000000-0x44007fffff 64bit pref] [] pcieport 0000:01:00.0: PCI bridge to [bus 02-04] [] pcieport 0000:01:00.0: bridge window [mem 0x83000000-0x840fffff] [] pcieport 0000:01:00.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref] [] pcieport 0000:02:01.0: PCI bridge to [bus 03] [] pcieport 0000:02:01.0: bridge window [mem 0x83000000-0x83ffffff] [] pcieport 0000:02:01.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref] [] xe 0000:03:00.0: [drm] BAR2 resized to 16384M [] xe 0000:03:00.0: [drm:xe_pci_probe [xe]] BATTLEMAGE e221:0000 dgfx:1 gfx:Xe2_HPG (20.02) ... For BMG there are additional fix needed in the PCI side, but this helps getting it to a working resize. All the rebar logic is more pci-specific than xe-specific and can be done very early in the probe sequence. In future it would be good to move it out of xe_vram.c, but this refactor is left for later. Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: stable@vger.kernel.org # 6.12+ Link: https://lore.kernel.org/intel-xe/fafda2a3-fc63-ce97-d22b-803f771a4d19@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20250918-xe-pci-rebar-2-v1-2-6c094702a074@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 45e33f220fd625492c11e15733d8e9b4f9db82a4) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13drm/xe: Don't allow evicting of BOs in same VM in array of VM bindsMatthew Brost
An array of VM binds can potentially evict other buffer objects (BOs) within the same VM under certain conditions, which may lead to NULL pointer dereferences later in the bind pipeline. To prevent this, clear the allow_res_evict flag in the xe_bo_validate call. v2: - Invert polarity of no_res_evict (Thomas) - Add comment in code explaining issue (Thomas) Cc: stable@vger.kernel.org Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6268 Fixes: 774b5fa509a9 ("drm/xe: Avoid evicting object of the same vm in none fault mode") Fixes: 77f2ef3f16f5 ("drm/xe: Lock all gpuva ops during VM bind IOCTL") Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20251009110618.3481870-1-matthew.brost@intel.com (cherry picked from commit 8b9ba8d6d95fe75fed6b0480bb03da4b321bea08) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13drm/xe: Increase global invalidation timeout to 1000usKenneth Graunke
The previous timeout of 500us seems to be too small; panning the map in the Roll20 VTT in Firefox on a KDE/Wayland desktop reliably triggered timeouts within a few seconds of usage, causing the monitor to freeze and the following to be printed to dmesg: [Jul30 13:44] xe 0000:03:00.0: [drm] *ERROR* GT0: Global invalidation timeout [Jul30 13:48] xe 0000:03:00.0: [drm] *ERROR* [CRTC:82:pipe A] flip_done timed out I haven't hit a single timeout since increasing it to 1000us even after several multi-hour testing sessions. Fixes: 0dd2dd0182bc ("drm/xe: Move DSB l2 flush to a more sensible place") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5710 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: stable@vger.kernel.org Cc: Maarten Lankhorst <dev@lankhorst.se> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250912223254.147940-1-kenneth@whitecape.org Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 146046907b56578263434107f5a7d5051847c459) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-10-13Merge branch '6.18/scsi-queue' into 6.18/scsi-fixesMartin K. Petersen
Pull in outstanding SCSI fixes for 6.18. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-13drm/xe: Enable 2M pages in xe_migrate_vramMatthew Brost
Using 2M pages in xe_migrate_vram has two benefits: we issue fewer instructions per 2M copy (1 vs. 512), and the cache hit rate should be higher. This results in increased copy engine bandwidth, as shown by benchmark IGTs. Enable 2M pages by reserving PDEs in the migrate VM and using 2M pages in xe_migrate_vram if the DMA address order matches 2M. v2: - Reuse build_pt_update_batch_sram (Stuart) - Fix build_pt_update_batch_sram for PAGE_SIZE > 4K v3: - More fixes for PAGE_SIZE > 4K, align chunk, decrement chunk as needed - Use stack incr var in xe_migrate_vram_use_pde (Stuart) v4: - Split PAGE_SIZE > 4K fix out in different patch (Stuart) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20251013034555.4121168-3-matthew.brost@intel.com
2025-10-13drm/xe: Fix build_pt_update_batch_sram for non-4K PAGE_SIZEMatthew Brost
The build_pt_update_batch_sram function in the Xe migrate layer assumes PAGE_SIZE == XE_PAGE_SIZE (4K), which is not a valid assumption on non-x86 platforms. This patch updates build_pt_update_batch_sram to correctly handle PAGE_SIZE > 4K by programming multiple 4K GPU pages per CPU page. v5: - Mask off non-address bits during compare Signed-off-by: Matthew Brost <matthew.brost@intel.com> Tested-by: Simon Richter <Simon.Richter@hogyros.de> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20251013034555.4121168-2-matthew.brost@intel.com
2025-10-13drm/xe: Fix comments in xe_gt structShuicheng Lin
Correct several spelling and grammar issues in xe_gt struct documentation to improve readability: - Fix "to not" -> "do not". - Fix "mmigrations" -> "migrations". - Fix "Multiple queues exists" -> "Multiple queues exist". - Fix "are be processed" -> "to be processed". - Fix "have being processed" -> "have been processed". These changes are purely cosmetic and do not affect functionality. v2: drop kernel-doc formatting change. (Jani) Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20251006151317.2553182-2-shuicheng.lin@intel.com
2025-10-13PM: sleep: Replace snprintf() with scnprintf() in show_trace_dev_match()Kaushlendra Kumar
Replace snprintf() with scnprintf() in show_trace_dev_match() to simplify buffer length handling. The scnprintf() function returns the number of characters actually written (excluding the null terminator), which eliminates the need for manual length checking and clamping. This change removes the redundant size check since scnprintf() guarantees that the return value will never exceed the buffer size, making the code cleaner and less error-prone. Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> Link: https://patch.msgid.link/20250922055231.3523680-1-kaushlendra.kumar@intel.com [ rjw: Subject adjustment ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-13drm/i915/gem: fix typo in comment (I915_EXEC_NO_RELOC)Marlon Henrique Sanches
The comment referenced the flag name incorrectly as 'I915_EXEC_NORELOC' (missing underscore). This patch corrects the spelling in the comment only; there is no functional change. Signed-off-by: Marlon Henrique Sanches <marlonsanches@estudante.ufscar.br> Link: https://lore.kernel.org/r/20251013183123.438573-1-marlonsanches@estudante.ufscar.br Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-10-13cxl/port: Avoid missing port component registers setupLi Ming
port->nr_dports is used to represent how many dports added to the cxl port, it will increase in add_dport() when a new dport is being added to the cxl port, but it will not be reduced when a dport is removed from the cxl port. Currently, when the first dport is added to a cxl port, it will trigger component registers setup on the cxl port, the implementation is using port->nr_dports to confirm if the dport is the first dport. A corner case here is that adding dport could fail after port->nr_dports updating and before checking port->nr_dports for component registers setup. If the failure happens during the first dport attaching, it will cause that CXL subsystem has not chance to execute component registers setup for the cxl port. the failure flow like below: port->nr_dports = 0 dport 1 adding to the port: add_dport() # port->nr_dports: 1 failed on devm_add_action_or_reset() or sysfs_create_link() return error # port->nr_dports: 1 dport 2 adding to the port: add_dport() # port->nr_dports: 2 no failure skip component registers setup because of port->nr_dports is 2 The solution here is that moving component registers setup closer to add_dport(), so if add_dport() is executed correctly for the first dport, component registers setup on the port will be executed immediately after that. Fixes: f6ee24913de2 ("cxl: Move port register setup to when first dport appear") Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-10-13drm/amd: Drop calls to restore power limit and clock from smu_resume()Mario Limonciello
User requested power limits and clock settings are already restored as part of smu_restore_dpm_user_profile(). It's unnecessary to call the same restore as part of smu_resume(). Revert the following commits to drop that extra restore: commit ed4efe426a49 ("drm/amd: Restore cached power limit during resume") commit 796ff8a7e01b ("drm/amd: Restore cached manual clock settings during resume") commit f9b80514a722 ("drm/amd: Only restore cached manual clock settings in restore if OD enabled") Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amdgpu: update remove after reset flag for MES remove queueJonathan Kim
Remove queue after reset flag is required to remove a queue that has been successfully reset to clean up the MES' internal state. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amdgpu: Add ras module files into amdgpuYiPeng Chai
Add ras module files into amdgpu. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amdgpu/userqueue: validate userptrs for userqueuesSunil Khatri
userptrs could be changed by the user at any time and hence while locking all the bos before GPU start processing validate all the userptr bos. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amdgpu: update the functions to use amdgpu version of hmmSunil Khatri
At times we need a bo reference for hmm and for that add a new struct amdgpu_hmm_range which will hold an optional bo member and hmm_range. Use amdgpu_hmm_range instead of hmm_range and let the bo as an optional argument for the caller if they want to the bo reference to be taken or they want to handle that explicitly. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amdgpu: Reserve discovery TMR only if neededLijo Lazar
For legacy SOCs, discovery binary is sideloaded. Instead of checking for binary blob, use a flag to determine if discovery region needs to be reserved. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/pm: export a function amdgpu_smu_ras_send_msg to allow send msg directlyYiPeng Chai
provide a interface that allows ras client send msg to smu/pmfw directly. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/pm: Grant interface access after full initLijo Lazar
Allow access to user interfaces like sysfs/hwmon only after full initialization of the device. When device is part of XGMI hive and a reset is required during initialization, the inteface files will be created as part of minimal device initialization. Full initialization of the device will be done only after all devices in XGMI hive are probed and a reset is done together on all. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>