summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-07-08net: phy: smsc: Fix link failure in forced mode with Auto-MDIXOleksij Rempel
Force a fixed MDI-X mode when auto-negotiation is disabled to prevent link instability. When forcing the link speed and duplex on a LAN9500 PHY (e.g., with `ethtool -s eth0 autoneg off ...`) while leaving MDI-X control in auto mode, the PHY fails to establish a stable link. This occurs because the PHY's Auto-MDIX algorithm is not designed to operate when auto-negotiation is disabled. In this state, the PHY continuously toggles the TX/RX signal pairs, which prevents the link partner from synchronizing. This patch resolves the issue by detecting when auto-negotiation is disabled. If the MDI-X control mode is set to 'auto', the driver now forces a specific, stable mode (ETH_TP_MDI) to prevent the pair toggling. This choice of a fixed MDI mode mirrors the behavior the hardware would exhibit if the AUTOMDIX_EN strap were configured for a fixed MDI connection. Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Cc: Andre Edich <andre.edich@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250703114941.3243890-4-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08net: phy: smsc: Force predictable MDI-X state on LAN87xxOleksij Rempel
Override the hardware strap configuration for MDI-X mode to ensure a predictable initial state for the driver. The initial mode of the LAN87xx PHY is determined by the AUTOMDIX_EN strap pin, but the driver has no documented way to read its latched status. This unpredictability means the driver cannot know if the PHY has initialized with Auto-MDIX enabled or disabled, preventing it from providing a reliable interface to the user. This patch introduces a `config_init` hook that forces the PHY into a known state by explicitly enabling Auto-MDIX. Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Cc: Andre Edich <andre.edich@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250703114941.3243890-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08net: phy: smsc: Fix Auto-MDIX configuration when disabled by strapOleksij Rempel
Correct the Auto-MDIX configuration to ensure userspace settings are respected when the feature is disabled by the AUTOMDIX_EN hardware strap. The LAN9500 PHY allows its default MDI-X mode to be configured via a hardware strap. If this strap sets the default to "MDI-X off", the driver was previously unable to enable Auto-MDIX from userspace. When handling the ETH_TP_MDI_AUTO case, the driver would set the SPECIAL_CTRL_STS_AMDIX_ENABLE_ bit but neglected to set the required SPECIAL_CTRL_STS_OVRRD_AMDIX_ bit. Without the override flag, the PHY falls back to its hardware strap default, ignoring the software request. This patch corrects the behavior by also setting the override bit when enabling Auto-MDIX. This ensures that the userspace configuration takes precedence over the hardware strap, allowing Auto-MDIX to be enabled correctly in all scenarios. Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Cc: Andre Edich <andre.edich@microchip.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250703114941.3243890-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08net: stmmac: Fix interrupt handling for level-triggered mode in DWC_XGMAC2EricChan
According to the Synopsys Controller IP XGMAC-10G Ethernet MAC Databook v3.30a (section 2.7.2), when the INTM bit in the DMA_Mode register is set to 2, the sbd_perch_tx_intr_o[] and sbd_perch_rx_intr_o[] signals operate in level-triggered mode. However, in this configuration, the DMA does not assert the XGMAC_NIS status bit for Rx or Tx interrupt events. This creates a functional regression where the condition if (likely(intr_status & XGMAC_NIS)) in dwxgmac2_dma_interrupt() will never evaluate to true, preventing proper interrupt handling for level-triggered mode. The hardware specification explicitly states that "The DMA does not assert the NIS status bit for the Rx or Tx interrupt events" (Synopsys DWC_XGMAC2 Databook v3.30a, sec. 2.7.2). The fix ensures correct handling of both edge and level-triggered interrupts while maintaining backward compatibility with existing configurations. It has been tested on the hardware device (not publicly available), and it can properly trigger the RX and TX interrupt handling in both the INTM=0 and INTM=2 configurations. Fixes: d6ddfacd95c7 ("net: stmmac: Add DMA related callbacks for XGMAC2") Tested-by: EricChan <chenchuangyu@xiaomi.com> Signed-off-by: EricChan <chenchuangyu@xiaomi.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250703020449.105730-1-chenchuangyu@xiaomi.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08eth: fbnic: Create fw_log file in DebugFSLee Trager
Allow reading the firmware log in DebugFS by accessing the fw_log file. Buffer is read while a spinlock is acquired. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-7-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08eth: fbnic: Enable firmware loggingLee Trager
The firmware log buffer is enabled during probe and freed during remove. Early versions of firmware do not support sending logs. Once the mailbox is up driver will enable logging when supported firmware versions are detected. Logging is disabled before the mailbox is freed. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-6-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08eth: fbnic: Add mailbox support for firmware logsLee Trager
By default firmware will not send logs to the host. This must be explicitly enabled by the driver. The mailbox has the concept of a flag which is a u32 used as a boolean. Lack of flag defaults to a value of false. When enabling logging historical logs may be optionally requested. These are log messages generated by the NIC before the driver was loaded. The driver also sends a log version to support changing the logging format in the future. [SEND_LOGS_REQ] = { [SEND_LOGS] /* flag to request log reporting */ [SEND_LOGS_HISTORY] /* flag to request historical logs */ [SEND_LOGS_VERSION] /* u32 indicating the log format version */ } Logs may be sent to the user either one at a time, or when historical logs are requested in bulk. Firmware may not send more than 14 messages in bulk to prevent flooding the mailbox. [LOG_MSG] = { [LOG_INDEX] /* entry 0 - u64 index of log */ [LOG_MSEC] /* entry 0 - u32 timestamp of log */ [LOG_MSG] /* entry 0 - char log message up to 256 */ [LOG_LENGTH] /* u32 of remaining log items in arrays */ [LOG_INDEX_ARRAY] = { [LOG_INDEX] /* entry 1 - u64 index of log */ [LOG_INDEX] /* entry 2 - u64 index of log */ ... } [LOG_MSEC_ARRAY] = { [LOG_MSEC] /* entry 1 - u32 timestamp of log */ [LOG_MSEC] /* entry 2 - u32 timestamp of log */ ... } [LOG_MSG_ARRAY] = { [LOG_MSG] /* entry 1 - char log message up to 256 */ [LOG_MSG] /* entry 2 - char log message up to 256 */ ... } } Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-5-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08eth: fbnic: Create ring buffer for firmware logsLee Trager
When enabled, firmware may send logs messages which are specific to the device and not the host. Create a ring buffer to store these messages which are read by a user through DebugFS. Buffer access is protected by a spinlock. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-4-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08eth: fbnic: Use FIELD_PREP to generate minimum firmware versionLee Trager
Create a new macro based on FIELD_PREP to generate easily readable minimum firmware version ints. This macro will prevent the mistake from the previous patch from happening again. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-3-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08eth: fbnic: Fix incorrect minimum firmware versionLee Trager
The full minimum version is 0.10.6-0. The six is now correctly defined as patch and shifted appropriately. 0.10.6-0 is a preproduction version of firmware which was released over a year and a half ago. All production devices meet this requirement. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-2-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08Merge branch 'mlx5-next' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-07-08 The following pull-request contains common mlx5 updates for your *net-next* tree. v2: https://lore.kernel.org/1751574385-24672-1-git-send-email-tariqt@nvidia.com * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Check device memory pointer before usage net/mlx5: fs, fix RDMA TRANSPORT init cleanup flow net/mlx5: Add IFC bits for PCIe Congestion Event object net/mlx5: Small refactor for general object capabilities net/mlx5: fs, add multiple prios to RDMA TRANSPORT steering domain ==================== Link: https://patch.msgid.link/1752002102-11316-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09gpu: nova-core: convert `/*` comments to `//`Alexandre Courbot
The second form is preferred, and there was no reason to use the first. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://lore.kernel.org/r/20250708-nova-docs-v4-4-9d188772c4c7@nvidia.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-07-09gpu: nova-core: Clarify falcon codeJoel Fernandes
Add documentation strings, comments and AES mode for completeness to the Falcon signatures. Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://lore.kernel.org/r/20250708-nova-docs-v4-3-9d188772c4c7@nvidia.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-07-09gpu: nova-core: Clarify sysmembar operationsJoel Fernandes
sysmembar is a critical operation that the GSP falcon needs to perform in the reset sequence. Add some code comments to clarify. [acourbot@nvdidia.com: move relevant documentation to SysmemFlush type] Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://lore.kernel.org/r/20250708-nova-docs-v4-2-9d188772c4c7@nvidia.com [ Minor grammar fix in the PFB register documentation. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-07-08drm/xe: Release runtime pm for error path of xe_devcoredump_read()Shuicheng Lin
xe_pm_runtime_put() is missed to be called for the error path in xe_devcoredump_read(). Add function description comments for xe_devcoredump_read() to help understand it. v2: more detail function comments and refine goto logic (Matt) Fixes: c4a2e5f865b7 ("drm/xe: Add devcoredump chunking") Cc: stable@vger.kernel.org Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250707004911.3502904-6-shuicheng.lin@intel.com
2025-07-08drm/xe: Remove unused code in devcoredump_snapshot()Shuicheng Lin
The deleted code is no longer needed because patch "drm/xe/guc: Plumb GuC-capture into dev coredump" has removed the related usage code. Remove the code to tidy up the function. v2: s/bacause/because Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250707004911.3502904-5-shuicheng.lin@intel.com
2025-07-09gpu: nova-core: Add code comments related to devinitJoel Fernandes
Add several code comments to reduce acronym soup and explain how devinit magic and bootflow works before driver loads. These are essential for debug and development of the nova driver. [acourbot@nvidia.com: reformat and reword a couple of sentences] Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://lore.kernel.org/r/20250708-nova-docs-v4-1-9d188772c4c7@nvidia.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-07-08drm/xe/uc: Disable GuC communication on hardware initialization errorZhanjun Dong
Disable GuC communication on Xe micro controller hardware initialization error. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4917 Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250707231108.3217573-1-zhanjun.dong@intel.com
2025-07-08irqchip/gic-v5: Populate struct gic_kvm_infoSascha Bischoff
Populate the gic_kvm_info struct based on support for FEAT_GCIE_LEGACY. The struct is used by KVM to probe for a compatible GIC. Co-authored-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Link: https://lore.kernel.org/r/20250627100847.1022515-3-sascha.bischoff@arm.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-08irqchip/gic-v5: Skip deactivate for forwarded PPI interruptsSascha Bischoff
If a PPI interrupt is forwarded to a guest, skip the deactivate and only EOI. Rely on the guest deactivating both the virtual and physical interrupts (due to ICH_LRx_EL2.HW being set) later on as part of handling the injected interrupt. This mimics the behaviour seen on native GICv3. This is part of adding support for the GICv3 compatibility mode on a GICv5 host. Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Co-authored-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Link: https://lore.kernel.org/r/20250627100847.1022515-2-sascha.bischoff@arm.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-08Merge tag 'pwm/for-6.16-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux Pull pwm fixes from Uwe Kleine-König: "Two fixes for v6.16-rc6 The first patch fixes an embarrassing bug in the pwm core. I really wonder this wasn't found earlier since it's introduction in v6.11-rc1 as it greatly disturbs driving a PWM via sysfs. The second and last patch fixes a clock balance issue in an error path of the Mediatek PWM driver" * tag 'pwm/for-6.16-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: pwm: mediatek: Ensure to disable clocks in error path pwm: Fix invalid state detection
2025-07-08drm/xe/pm: Restore display pm if there is error after display suspendShuicheng Lin
xe_bo_evict_all() is called after xe_display_pm_suspend(). So if there is error with xe_bo_evict_all(), display pm should be restored. Fixes: 51462211f4a9 ("drm/xe/pxp: add PXP PM support") Fixes: cb8f81c17531 ("drm/xe/display: Make display suspend/resume work on discrete") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://lore.kernel.org/r/20250708035424.3608190-2-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-07-08eth: mlx5: migrate to the *_rxfh_context opsJakub Kicinski
Convert mlx5 to dedicated RXFH ops. This is a fairly shallow conversion, TBH, most of the driver code stays as is, but we let the core allocate the context ID for the driver. mlx5e_rx_res_rss_get_rxfh() and friends are made void, since core only calls the driver for context 0. The second call is right after context creation so it must exist (tm). Tested with drivers/net/hw/rss_ctx.py on MCX6. Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250707184115.2285277-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08eth: ice: drop the dead code related to rss_contextsJakub Kicinski
ICE appears to have some odd form of rss_context use plumbed in for .get_rxfh. The .set_rxfh side does not support creating contexts, however, so this must be dead code. For at least a year now (since commit 7964e7884643 ("net: ethtool: use the tracking array for get_rxfh on custom RSS contexts")) we have not been calling .get_rxfh with a non-zero rss_context. We just get the info from the RSS XArray under dev->ethtool. Remove what must be dead code in the driver, clear the support flags. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707184115.2285277-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08eth: otx2: migrate to the *_rxfh_context opsJakub Kicinski
otx2 only supports additional indirection tables (no separate keys etc.) so the conversion to dedicated callbacks and core-allocated context is mostly removing the code which stores the extra tables in the driver. Core already stores the indirection tables for additional contexts, and doesn't call .get for them. One subtle change here is that we'll now start with the table covering all queues, not directing all traffic to queue 0. This is what core expects if the user doesn't pass the initial indir table explicitly (there's a WARN_ON() in the core trying to make sure driver authors don't forget to populate ctx to defaults). Drivers implementing .create_rxfh_context don't have to set cap_rss_ctx_supported, so remove it. Tested-by: Geetha Sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707184115.2285277-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08drbd: add missing kref_get in handle_write_conflictsSarah Newman
With `two-primaries` enabled, DRBD tries to detect "concurrent" writes and handle write conflicts, so that even if you write to the same sector simultaneously on both nodes, they end up with the identical data once the writes are completed. In handling "superseeded" writes, we forgot a kref_get, resulting in a premature drbd_destroy_device and use after free, and further to kernel crashes with symptoms. Relevance: No one should use DRBD as a random data generator, and apparently all users of "two-primaries" handle concurrent writes correctly on layer up. That is cluster file systems use some distributed lock manager, and live migration in virtualization environments stops writes on one node before starting writes on the other node. Which means that other than for "test cases", this code path is never taken in real life. FYI, in DRBD 9, things are handled differently nowadays. We still detect "write conflicts", but no longer try to be smart about them. We decided to disconnect hard instead: upper layers must not submit concurrent writes. If they do, that's their fault. Signed-off-by: Sarah Newman <srn@prgmr.com> Signed-off-by: Lars Ellenberg <lars@linbit.com> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/20250627095728.800688-1-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-07-08block: mtip32xx: Fix usage of dma_map_sg()Thomas Fourier
The dma_map_sg() can fail and, in case of failure, returns 0. If it fails, mtip_hw_submit_io() returns an error. The dma_unmap_sg() requires the nents parameter to be the same as the one passed to dma_map_sg(). This patch saves the nents in command->scatter_ents. Fixes: 88523a61558a ("block: Add driver for Micron RealSSD pcie flash cards") Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250627121123.203731-2-fourier.thomas@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-07-08irqchip/gic-v5: Add GICv5 IWB supportLorenzo Pieralisi
The GICv5 architecture implements the Interrupt Wire Bridge (IWB) in order to support wired interrupts that cannot be connected directly to an IRS and instead uses the ITS to translate a wire event into an IRQ signal. Add the wired-to-MSI IWB driver to manage IWB wired interrupts. An IWB is connected to an ITS and it has its own deviceID for all interrupt wires that it manages; the IWB input wire number must be exposed to the ITS as an eventID with a 1:1 mapping. This eventID is not programmable and therefore requires a new msi_alloc_info_t flag to make sure the ITS driver does not allocate an eventid for the wire but rather it uses the msi_alloc_info_t.hwirq number to gather the ITS eventID. Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Co-developed-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-29-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08irqchip/gic-v5: Add GICv5 ITS supportLorenzo Pieralisi
The GICv5 architecture implements Interrupt Translation Service (ITS) components in order to translate events coming from peripherals into interrupt events delivered to the connected IRSes. Events (ie MSI memory writes to ITS translate frame), are translated by the ITS using tables kept in memory. ITS translation tables for peripherals is kept in memory storage (device table [DT] and Interrupt Translation Table [ITT]) that is allocated by the driver on boot. Both tables can be 1- or 2-level; the structure is chosen by the driver after probing the ITS HW parameters and checking the allowed table splits and supported {device/event}_IDbits. DT table entries are allocated on demand (ie when a device is probed); the DT table is sized using the number of supported deviceID bits in that that's a system design decision (ie the number of deviceID bits implemented should reflect the number of devices expected in a system) therefore it makes sense to allocate a DT table that can cater for the maximum number of devices. DT and ITT tables are allocated using the kmalloc interface; the allocation size may be smaller than a page or larger, and must provide contiguous memory pages. LPIs INTIDs backing the device events are allocated one-by-one and only upon Linux IRQ allocation; this to avoid preallocating a large number of LPIs to cover the HW device MSI vector size whereas few MSI entries are actually enabled by a device. ITS cacheability/shareability attributes are programmed according to the provided firmware ITS description. The GICv5 partially reuses the GICv3 ITS MSI parent infrastructure and adds functions required to retrieve the ITS translate frame addresses out of msi-map and msi-parent properties to implement the GICv5 ITS MSI parent callbacks. Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Co-developed-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-28-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08irqchip/msi-lib: Add IRQ_DOMAIN_FLAG_FWNODE_PARENT handlingLorenzo Pieralisi
In some irqchip implementations the fwnode representing the IRQdomain and the MSI controller fwnode do not match; in particular the IRQdomain fwnode is the MSI controller fwnode parent. To support selecting such IRQ domains, add a flag in core IRQ domain code that explicitly tells the MSI lib to use the parent fwnode while carrying out IRQ domain selection. Update the msi-lib select callback with the resulting logic. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-27-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08irqchip/gic-v3: Rename GICv3 ITS MSI parentLorenzo Pieralisi
The GICv5 ITS will reuse some GICv3 ITS MSI parent functions therefore it makes sense to keep the code functionality in a compilation unit shared by the two drivers. Rename the GICv3 ITS MSI parent file and update the related Kconfig/Makefile entries to pave the way for code sharing. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-26-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08PCI/MSI: Add pci_msi_map_rid_ctlr_node() helper functionLorenzo Pieralisi
IRQchip drivers need a PCI/MSI function to map a RID to a MSI controller deviceID namespace and at the same time retrieve the struct device_node pointer of the MSI controller the RID is mapped to. Add pci_msi_map_rid_ctlr_node() to achieve this purpose. Cc Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-25-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08of/irq: Add of_msi_xlate() helper functionLorenzo Pieralisi
Add an of_msi_xlate() helper that maps a device ID and returns the device node of the MSI controller the device ID is mapped to. Required by core functions that need an MSI controller device node pointer at the same time as a mapped device ID, of_msi_map_id() is not sufficient for that purpose. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-24-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08irqchip/gic-v5: Enable GICv5 SMP bootingLorenzo Pieralisi
Set up IPIs by allocating IPI IRQs for all cpus and call into arm64 core code to initialise IPIs IRQ descriptors and request the related IRQ. Implement hotplug callback to enable interrupts on a cpu and register the cpu with an IRS. Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Co-developed-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-23-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08irqchip/gic-v5: Add GICv5 LPI/IPI supportLorenzo Pieralisi
An IRS supports Logical Peripheral Interrupts (LPIs) and implement Linux IPIs on top of it. LPIs are used for interrupt signals that are translated by a GICv5 ITS (Interrupt Translation Service) but also for software generated IRQs - namely interrupts that are not driven by a HW signal, ie IPIs. LPIs rely on memory storage for interrupt routing and state. LPIs state and routing information is kept in the Interrupt State Table (IST). IRSes provide support for 1- or 2-level IST tables configured to support a maximum number of interrupts that depend on the OS configuration and the HW capabilities. On systems that provide 2-level IST support, always allow the maximum number of LPIs; On systems with only 1-level support, limit the number of LPIs to 2^12 to prevent wasting memory (presumably a system that supports a 1-level only IST is not expecting a large number of interrupts). On a 2-level IST system, L2 entries are allocated on demand. The IST table memory is allocated using the kmalloc() interface; the allocation required may be smaller than a page and must be made up of contiguous physical pages if larger than a page. On systems where the IRS is not cache-coherent with the CPUs, cache mainteinance operations are executed to clean and invalidate the allocated memory to the point of coherency making it visible to the IRS components. On GICv5 systems, IPIs are implemented using LPIs. Add an LPI IRQ domain and implement an IPI-specific IRQ domain created as a child/subdomain of the LPI domain to allocate the required number of LPIs needed to implement the IPIs. IPIs are backed by LPIs, add LPIs allocation/de-allocation functions. The LPI INTID namespace is managed using an IDA to alloc/free LPI INTIDs. Associate an IPI irqchip with IPI IRQ descriptors to provide core code with the irqchip.ipi_send_single() method required to raise an IPI. Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Co-developed-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-22-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08irqchip/gic-v5: Add GICv5 IRS/SPI supportLorenzo Pieralisi
The GICv5 Interrupt Routing Service (IRS) component implements interrupt management and routing in the GICv5 architecture. A GICv5 system comprises one or more IRSes, that together handle the interrupt routing and state for the system. An IRS supports Shared Peripheral Interrupts (SPIs), that are interrupt sources directly connected to the IRS; they do not rely on memory for storage. The number of supported SPIs is fixed for a given implementation and can be probed through IRS IDR registers. SPI interrupt state and routing are managed through GICv5 instructions. Each core (PE in GICv5 terms) in a GICv5 system is identified with an Interrupt AFFinity ID (IAFFID). An IRS manages a set of cores that are connected to it. Firmware provides a topology description that the driver uses to detect to which IRS a CPU (ie an IAFFID) is associated with. Use probeable information and firmware description to initialize the IRSes and implement GICv5 IRS SPIs support through an SPI-specific IRQ domain. The GICv5 IRS driver: - Probes IRSes in the system to detect SPI ranges - Associates an IRS with a set of cores connected to it - Adds an IRQchip structure for SPI handling SPIs priority is set to a value corresponding to the lowest permissible priority in the system (taking into account the implemented priority bits of the IRS and CPU interface). Since all IRQs are set to the same priority value, the value itself does not matter as long as it is a valid one. Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Co-developed-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-21-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08irqchip/gic-v5: Add GICv5 PPI supportLorenzo Pieralisi
The GICv5 CPU interface implements support for PE-Private Peripheral Interrupts (PPI), that are handled (enabled/prioritized/delivered) entirely within the CPU interface hardware. To enable PPI interrupts, implement the baseline GICv5 host kernel driver infrastructure required to handle interrupts on a GICv5 system. Add the exception handling code path and definitions for GICv5 instructions. Add GICv5 PPI handling code as a specific IRQ domain to: - Set-up PPI priority - Manage PPI configuration and state - Manage IRQ flow handler - IRQs allocation/free - Hook-up a PPI specific IRQchip to provide the relevant methods PPI IRQ priority is chosen as the minimum allowed priority by the system design (after probing the number of priority bits implemented by the CPU interface). Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Co-developed-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-20-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08arm64: cpucaps: Rename GICv3 CPU interface capabilityLorenzo Pieralisi
In preparation for adding a GICv5 CPU interface capability, rework the existing GICv3 CPUIF capability - change its name and description so that the subsequent GICv5 CPUIF capability can be added with a more consistent naming on top. Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-16-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-07-08char: ipmi: remove redundant variable 'type' and checkColin Ian King
The variable 'type' is assigned the value SI_INVALID which is zero and later checks of 'type' is non-zero (which is always false). The variable is not referenced anywhere else, so it is redundant and so is the check, so remove these. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Message-ID: <20250708151805.1893858-1-colin.i.king@gmail.com> Signed-off-by: Corey Minyard <corey@minyard.net>
2025-07-08perf: arm_spe: Relax period restrictionLeo Yan
The minimum interval specified the PMSIDR_EL1.Interval field is a hardware recommendation. However, this value is set by hardware designer before the production. It is not actual hardware limitation but tools currently have no way to test shorter periods. This change relaxes the limitation by allowing any non-zero periods, with simplifying code with clamp_t(). The downside is that small periods may increase the risk of AUX ring buffer overruns. When an overrun occurs, the perf core layer will trigger an irq work to disable the event and wake up the tool in user space to read the trace data. After the tool finishes reading, it will re-enable the AUX event. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250627163028.3503122-1-leo.yan@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-08perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)Rob Herring (Arm)
The ARMv9.2 architecture introduces the optional Branch Record Buffer Extension (BRBE), which records information about branches as they are executed into set of branch record registers. BRBE is similar to x86's Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer (BHRB). BRBE supports filtering by exception level and can filter just the source or target address if excluded to avoid leaking privileged addresses. The h/w filter would be sufficient except when there are multiple events with disjoint filtering requirements. In this case, BRBE is configured with a union of all the events' desired branches, and then the recorded branches are filtered based on each event's filter. For example, with one event capturing kernel events and another event capturing user events, BRBE will be configured to capture both kernel and user branches. When handling event overflow, the branch records have to be filtered by software to only include kernel or user branch addresses for that event. In contrast, x86 simply configures LBR using the last installed event which seems broken. It is possible on x86 to configure branch filter such that no branches are ever recorded (e.g. -j save_type). For BRBE, events with a configuration that will result in no samples are rejected. Recording branches in KVM guests is not supported like x86. However, perf on x86 allows requesting branch recording in guests. The guest events are recorded, but the resulting branches are all from the host. For BRBE, events with branch recording and "exclude_host" set are rejected. Requiring "exclude_guest" to be set did not work. The default for the perf tool does set "exclude_guest" if no exception level options are specified. However, specifying kernel or user events defaults to including both host and guest. In this case, only host branches are recorded. BRBE can support some additional exception branch types compared to x86. On x86, all exceptions other than syscalls are recorded as IRQ. With BRBE, it is possible to better categorize these exceptions. One limitation relative to x86 is we cannot distinguish a syscall return from other exception returns. So all exception returns are recorded as ERET type. The FIQ branch type is omitted as the only FIQ user is Apple platforms which don't support BRBE. The debug branch types are omitted as there is no clear need for them. BRBE records are invalidated whenever events are reconfigured, a new task is scheduled in, or after recording is paused (and the records have been recorded for the event). The architecture allows branch records to be invalidated by the PE under implementation defined conditions. It is expected that these conditions are rare. Cc: Catalin Marinas <catalin.marinas@arm.com> Co-developed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Co-developed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> tested-by: Adam Young <admiyo@os.amperecomputing.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-4-e7775563036e@kernel.org [will: Fix sparse warnings about mixed declarations and code. Fix C99 comment syntax.] Signed-off-by: Will Deacon <will@kernel.org>
2025-07-08gpu/trace: make TRACE_GPU_MEM configurableJuston Li
Move the source to a better place in Device Drivers -> Graphics support now that its configurable. v4: - Move source location (Tvrtko) v3: - Patch introduced to replace per-driver config (Lucas) Signed-off-by: Juston Li <justonli@chromium.org> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250611225145.1739201-1-justonli@chromium.org Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-08vhost/net: enable gso over UDP tunnel support.Paolo Abeni
Vhost net need to know the exact virtio net hdr size to be able to copy such header correctly. Teach it about the newly defined UDP tunnel-related option and update the hdr size computation accordingly. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08tun: enable gso over UDP tunnel support.Paolo Abeni
Add new tun features to represent the newly introduced virtio GSO over UDP tunnel offload. Allows detection and selection of such features via the existing TUNSETOFFLOAD ioctl and compute the expected virtio header size and tunnel header offset using the current netdev features, so that we can plug almost seamless the newly introduced virtio helpers to serialize the extended virtio header. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- v6 -> v7: - rebased v4 -> v5: - encapsulate the guest feature guessing in a tun helper - dropped irrelevant check on xdp buff headroom - do not remove unrelated black line - avoid line len > 80 char v3 -> v4: - virtio tnl-related fields are at fixed offset, cleanup the code accordingly. - use netdev features instead of flags bit to check for the configured offload - drop packet in case of enabled features/configured hdr size mismatch v2 -> v3: - cleaned-up uAPI comments - use explicit struct layout instead of raw buf.
2025-07-08virtio_net: enable gso over UDP tunnel support.Paolo Abeni
If the related virtio feature is set, enable transmission and reception of gso over UDP tunnel packets. Most of the work is done by the previously introduced helper, just need to determine the UDP tunnel features inside the virtio_net_hdr and update accordingly the virtio net hdr size. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08virtio_net: add supports for extended offloadsPaolo Abeni
The virtio_net driver needs it to implement GSO over UDP tunnel offload. The only missing piece is mapping them to/from the extended features. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08vhost-net: allow configuring extended featuresPaolo Abeni
Use the extended feature type for 'acked_features' and implement two new ioctls operation allowing the user-space to set/query an unbounded amount of features. The actual number of processed features is limited by VIRTIO_FEATURES_MAX and attempts to set features above such limit fail with EOPNOTSUPP. Note that: the legacy ioctls implicitly truncate the negotiated features to the lower 64 bits range and the 'acked_backend_features' field don't need conversion, as the only negotiated feature there is in the low 64 bit range. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08virtio_pci_modern: allow configuring extended featuresPaolo Abeni
The virtio specifications allows for up to 128 bits for the device features. Soon we are going to use some of the 'extended' bits features (above 64) for the virtio_net driver. Extend the virtio pci modern driver to support configuring the full virtio features range, replacing the unrolled loops reading and writing the features space with explicit one bounded to the actual features space size in word and implementing the get_extended_features callback. Note that in vp_finalize_features() we only need to cache the lower 64 features bits, to process the transport features. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08virtio: introduce extended featuresPaolo Abeni
The virtio specifications allows for up to 128 bits for the device features. Soon we are going to use some of the 'extended' bits features (above 64) for the virtio_net driver. Introduce extended features as a fixed size array of u64. To minimize the diffstat allows legacy driver to access the low 64 bits via a transparent union. Introduce an extended get_extended_features configuration callback that devices supporting the extended features range must implement in place of the traditional one. Note that legacy and transport features don't need any change, as they are always in the low 64 bit range. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08net: airoha: Fix an error handling path in airoha_probe()Christophe JAILLET
If an error occurs after a successful airoha_hw_init() call, airoha_ppe_deinit() needs to be called as already done in the remove function. Fixes: 00a7678310fe ("net: airoha: Introduce flowtable offload support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/1c940851b4fa3c3ed2a142910c821493a136f121.1746715755.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>