path: root/drivers
Age | Commit message | Author
2025-11-05 | iommupt: Add the basic structure of the iommu implementation | Jason Gunthorpe
The existing IOMMU page table implementations duplicate all of the working algorithms for each format. By using the generic page table API a single C version of the IOMMU algorithms can be created and re-used for all of the different formats used in the drivers. The implementation will provide a single C version of the iommu domain operations: iova_to_phys, map, unmap, and read_and_clear_dirty.

Further, adding new algorithms and techniques becomes easy to do across the entire fleet of drivers and formats.

The C functions are drop in compatible with the existing iommu_domain_ops using the IOMMU_PT_DOMAIN_OPS() macro. Each per-format implementation compilation unit will produce exported symbols following the pattern pt_iommu_FMT_map_pages() which the macro directly maps to the iommu_domain_ops members. This avoids the additional function pointer indirection like io-pgtable has.

The top level struct used by the drivers is pt_iommu_table_FMT. It contains the other structs to allow container_of() to move between the driver, iommu page table, generic page table, and generic format layers.

   struct pt_iommu_table_amdv1 {
       struct pt_iommu {
           struct iommu_domain domain;
       } iommu;
       struct pt_amdv1 {
           struct pt_common common;
       } amdpt;
   };

The driver is expected to union the pt_iommu_table_FMT with its own existing domain struct:

   struct driver_domain {
       union {
           struct iommu_domain domain;
           struct pt_iommu_table_amdv1 amdv1;
       };
   };
   PT_IOMMU_CHECK_DOMAIN(struct driver_domain, amdv1, domain);

This creates an alias, which avoids renaming 'domain' in a lot of driver code.

This allows all the layers to access all the necessary functions to implement their different roles with no change to any of the existing iommu core code.

Implement the basic starting point: pt_iommu_init(), get_info() and deinit().
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
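The layering described in the iommupt message can be illustrated with a small userspace sketch. It is not the kernel code: the struct bodies are placeholders, container_of() is a simplified stand-in for the kernel macro, and driver_private, to_driver_domain() and domain_to_common() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Placeholder bodies; the real kernel structs carry many more members. */
struct iommu_domain { int placeholder_domain; };
struct pt_common { int placeholder_common; };
struct pt_iommu { struct iommu_domain domain; };
struct pt_amdv1 { struct pt_common common; };

struct pt_iommu_table_amdv1 {
	struct pt_iommu iommu;
	struct pt_amdv1 amdpt;
};

struct driver_domain {
	union {
		struct iommu_domain domain;
		struct pt_iommu_table_amdv1 amdv1;
	};
	int driver_private;	/* hypothetical driver-specific state */
};

/* The alias only works if 'domain' is the first thing inside the table. */
static_assert(offsetof(struct pt_iommu_table_amdv1, iommu.domain) == 0,
	      "iommu_domain must sit at offset 0 of the table struct");

/* Core code hands the driver an iommu_domain; step out to the driver. */
static inline struct driver_domain *to_driver_domain(struct iommu_domain *domain)
{
	return container_of(domain, struct driver_domain, domain);
}

/* The same pointer also reaches the generic page table layer. */
static inline struct pt_common *domain_to_common(struct iommu_domain *domain)
{
	struct pt_iommu_table_amdv1 *table =
		container_of(domain, struct pt_iommu_table_amdv1, iommu.domain);

	return &table->amdpt.common;
}
```

Because both union members start with the same iommu_domain, a single pointer is enough for every layer to find its own state, which is presumably the property PT_IOMMU_CHECK_DOMAIN() verifies at build time.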
2025-11-05 | genpt: Generic Page Table base API | Jason Gunthorpe
The generic API is intended to be separated from the implementation of page table algorithms. It contains only accessors for walking and manipulating the table and helpers that are useful for building an implementation. Memory management is not in the generic API, but part of the implementation.

Using a multi-compilation approach the implementation module would include headers in this order:

   common.h
   defs_FMT.h
   pt_defs.h
   FMT.h
   pt_common.h
   IMPLEMENTATION.h

Where each compilation unit would have a combination of FMT and IMPLEMENTATION to produce a per-format per-implementation module.

The API is designed so that the format headers have minimal logic, and default implementations are provided if the format doesn't include one.

Generally formats provide their code via an inline function using the pattern:

   static inline FMTpt_XX(..) {}
   #define pt_XX FMTpt_XX

The common code then enforces a function signature so that there is no drift in function arguments, or accidental polymorphic functions (as has been slightly troublesome in mm). Use of function-like #defines is avoided in the format even though many of the functions are small enough.

Provide kdocs for the API surface.

This is enough to implement the 8 initial format variations with all of their features:
 * Entries comprised of contiguous blocks of IO PTEs for larger page sizes (AMDv1, ARMv8)
 * Multi-level tables, up to 6 levels. Runtime selected top level
 * The size of the top table level can be selected at runtime (ARM's concatenated tables)
 * The number of levels in the table can optionally increase dynamically during map (AMDv1)
 * Optional leaf entries at any level
 * 32 bit/64 bit virtual and output addresses, using every bit
 * Sign extended addressing (x86)
 * Dirty tracking

A basic simple format takes about 200 lines to declare the required inline functions.
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
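The "inline function plus #define" pattern from the genpt message can be sketched in plain C as follows. The format name "demo", struct pt_state_demo, pt_get_oa and pt_is_dirty are hypothetical names for illustration only, and the exact way the kernel headers pin signatures may differ from this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder walker state; not the kernel's definition. */
struct pt_state_demo { uint64_t entry; };

/* ---- "format" header: the format supplies only what it needs ---- */
static inline uint64_t demopt_get_oa(const struct pt_state_demo *pts)
{
	/* Hypothetical layout: the low 12 bits of an entry hold flags. */
	return pts->entry & ~UINT64_C(0xfff);
}
#define pt_get_oa demopt_get_oa

/* This format does not define pt_is_dirty, so the default is used. */

/* ---- "common" header: pin signatures, then fill in defaults ---- */

/*
 * The macro expands this to a redeclaration of demopt_get_oa(). If the
 * format had used different arguments, the build would fail here.
 */
static inline uint64_t pt_get_oa(const struct pt_state_demo *pts);

#ifndef pt_is_dirty
static inline bool pt_is_dirty(const struct pt_state_demo *pts)
{
	(void)pts;
	return false;	/* default for formats without dirty tracking */
}
#endif
```

Callers in the common and implementation layers only ever spell pt_get_oa() and pt_is_dirty(), so the same source can be compiled once per format without function pointers.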
2025-11-05 | accel/ivpu: Improve debug and warning messages | Karol Wachowski
Add IOCTL debug bit for logging user provided parameter validation errors. Refactor several warning and error messages to better reflect fault reason. User generated faults should not flood kernel messages with warnings or errors, so change those to ivpu_dbg(). Add additional debug logs for parameter validation in IOCTLs. Check the size provided by the user in metric streamer start and return -EINVAL together with a debug message. Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://patch.msgid.link/20251104132418.970784-1-karol.wachowski@linux.intel.com
2025-11-04 | net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error | Nishanth Menon
Make knav_dma_open_channel consistently return NULL on error instead of ERR_PTR. Currently the header include/linux/soc/ti/knav_dma.h returns NULL when the driver is disabled, but the driver implementation does not even return NULL or ERR_PTR on failure, causing inconsistency in the users. This results in a crash in netcp_free_navigator_resources as follows (trimmed):

Unhandled fault: alignment exception (0x221) at 0xfffffff2
[fffffff2] *pgd=80000800207003, *pmd=82ffda003, *pte=00000000
Internal error: : 221 [#1] SMP ARM
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc7 #1 NONE
Hardware name: Keystone
PC is at knav_dma_close_channel+0x30/0x19c
LR is at netcp_free_navigator_resources+0x2c/0x28c
[... TRIM...]
Call trace:
 knav_dma_close_channel from netcp_free_navigator_resources+0x2c/0x28c
 netcp_free_navigator_resources from netcp_ndo_open+0x430/0x46c
 netcp_ndo_open from __dev_open+0x114/0x29c
 __dev_open from __dev_change_flags+0x190/0x208
 __dev_change_flags from netif_change_flags+0x1c/0x58
 netif_change_flags from dev_change_flags+0x38/0xa0
 dev_change_flags from ip_auto_config+0x2c4/0x11f0
 ip_auto_config from do_one_initcall+0x58/0x200
 do_one_initcall from kernel_init_freeable+0x1cc/0x238
 kernel_init_freeable from kernel_init+0x1c/0x12c
 kernel_init from ret_from_fork+0x14/0x38
[... TRIM...]

Standardize the error handling by making the function return NULL on all error conditions. The API is used in just the netcp_core.c so the impact is limited.

Note, this change, in effect reverts commit 5b6cb43b4d62 ("net: ethernet: ti: netcp_core: return error while dma channel open issue"), but provides a less error prone implementation.

Suggested-by: Simon Horman <horms@kernel.org> Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Nishanth Menon <nm@ti.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251103162811.3730055-1-nm@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
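Why mixing the two error conventions is fragile can be shown with a userspace sketch. ERR_PTR(), IS_ERR() and IS_ERR_OR_NULL() below are simplified re-implementations of the kernel helpers; struct chan and the open/close functions are hypothetical and do not reproduce the knav_dma code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

struct chan { int id; };	/* hypothetical channel handle */

/* One convention: failure is encoded as an error pointer. */
static inline struct chan *open_channel_errptr(struct chan *c, bool fail)
{
	return fail ? ERR_PTR(-EINVAL) : c;
}

/* The other convention: failure is always NULL. */
static inline struct chan *open_channel_null(struct chan *c, bool fail)
{
	return fail ? NULL : c;
}

/*
 * A cleanup path that only tests for NULL. Returns true when it would go
 * on to dereference the handle.
 */
static inline bool close_would_dereference(const struct chan *c)
{
	return c != NULL;
}
```

With the NULL-only convention a plain `if (!chan)` test is sufficient everywhere, whereas an error pointer passes that test and is later dereferenced at an address near the top of the address space.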
2025-11-04 | net: Convert proto_ops connect() callbacks to use sockaddr_unsized | Kees Cook
Update all struct proto_ops connect() callback function prototypes from "struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the compiler about object sizes. Calls into struct proto handlers gain casts that will be removed in the struct proto conversion patch. No binary changes expected. Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20251104002617.2752303-3-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: Convert proto_ops bind() callbacks to use sockaddr_unsized | Kees Cook
Update all struct proto_ops bind() callback function prototypes from "struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the compiler about object sizes. Calls into struct proto handlers gain casts that will be removed in the struct proto conversion patch. No binary changes expected. Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20251104002617.2752303-2-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | virtio-net: fix received length check in big packets | Bui Quang Minh
Since commit 4959aebba8c0 ("virtio-net: use mtu size as buffer length for big packets"), when guest gso is off, the allocated size for big packets is not MAX_SKB_FRAGS * PAGE_SIZE anymore but depends on negotiated MTU. The number of allocated frags for big packets is stored in vi->big_packets_num_skbfrags. Because the host announced buffer length can be malicious (e.g. the host vhost_net driver's get_rx_bufs is modified to announce incorrect length), we need a check in virtio_net receive path. Currently, the check is not adapted to the new change which can lead to NULL page pointer dereference in the below while loop when receiving length that is larger than the allocated one. This commit fixes the received length check corresponding to the new change. Fixes: 4959aebba8c0 ("virtio-net: use mtu size as buffer length for big packets") Cc: stable@vger.kernel.org Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Tested-by: Lei Yang <leiyang@redhat.com> Link: https://patch.msgid.link/20251030144438.7582-1-minhquangbui99@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
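The kind of bound the virtio-net message describes can be sketched as below. This is a userspace illustration, not the driver's code: big_packet_len_ok(), allocated_pages and PAGE_SIZE_ASSUMED are hypothetical, and how the driver derives the page count from vi->big_packets_num_skbfrags is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size for the example; the kernel uses PAGE_SIZE. */
#define PAGE_SIZE_ASSUMED 4096u

/*
 * Returns true when a length announced by the (untrusted) host fits in
 * the pages that were actually allocated for this receive buffer.
 */
static inline bool big_packet_len_ok(size_t len, unsigned int allocated_pages)
{
	return len <= (size_t)allocated_pages * PAGE_SIZE_ASSUMED;
}
```

A receive path would reject the buffer when the check fails, before walking the page chain, so that it never follows a page pointer past the last allocated page.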
2025-11-04 | net: phy: fixed_phy: remove fixed_phy_add | Heiner Kallweit
fixed_phy_add() has a number of problems/disadvantages:
- It uses phy address 0 w/o checking whether a fixed phy with this address exists already.
- A subsequent call to fixed_phy_register() would also use phy address 0, because fixed_phy_add() doesn't mark it as used.
- fixed_phy_add() is used from platform code, therefore requires that fixed_phy code is built-in.
Now that for the only two users (coldfire/5272 and bcm47xx) fixed_phy creation has been moved to the respective ethernet driver (fec, b44), we can remove fixed_phy_add(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/bee046a1-1e77-4057-8b04-fdb2a1bbbd08@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: b44: register a fixed phy using fixed_phy_register_100fd if needed | Heiner Kallweit
In case of bcm47xx a fixed phy is used, which so far is created by platform code, using fixed_phy_add(). This function has a number of problems, therefore create a potentially needed fixed phy here, using fixed_phy_register_100fd. Due to lack of hardware, this is compile-tested only. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/53e4e74d-a49e-4f37-b970-5543a35041db@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: fec: register a fixed phy using fixed_phy_register_100fd if needed | Heiner Kallweit
In case of coldfire/5272 a fixed phy is used, which so far is created by platform code, using fixed_phy_add(). This function has a number of problems, therefore create a potentially needed fixed phy here, using fixed_phy_register_100fd. Note 1: This includes a small functional change, as coldfire/5272 created a fixed phy in half-duplex mode. Likely this was by mistake, because the fec MAC is 100FD-capable, and connection is to a switch. Note 2: Usage of phy_find_next() makes use of the fact that dev_id can only be 0 or 1. Due to lack of hardware, this is compile-tested only. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/adf4dc5c-5fa3-4ae6-a75c-a73954dede73@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: phy: fixed_phy: add helper fixed_phy_register_100fd | Heiner Kallweit
In a few places a 100FD fixed PHY is used. Create a helper so that users don't have to define the struct fixed_phy_status. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/bf564b19-e9bc-4896-aeae-9f721cc4fecd@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: altera-tse: Init PCS and phylink before registering netdev | Maxime Chevallier
register_netdev() must be done only once all resources are ready, as they may be used in .ndo_open() immediately upon registration. Move the lynx PCS and phylink initialisation before registering the netdevice. We also remove the call to netif_carrier_off(), as phylink takes care of that. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20251103104928.58461-5-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: altera-tse: Don't use netdev name for the PCS mdio bus | Maxime Chevallier
The PCS mdio bus must be created before registering the net_device. To do that, we mustn't depend on the netdev name to create the mdio bus name. Let's use the device's name instead. Note that this changes the bus name in /sys/bus/mdiobus. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20251103104928.58461-4-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: altera-tse: Warn on bad revision at probe time | Maxime Chevallier
Instead of reading the core revision at probe time, and printing a warning for an unexpected version at .ndo_open() time, let's print that warning directly in .probe(). This allows getting rid of the "revision" private field, and also prevents a potential race between reading the revision in .probe() after netdev registration, and accessing that revision in .ndo_open(). By printing the warning after register_netdev(), we are sure that we have a netdev name, and that we try to print the revision after having read it from the internal registers. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251103104928.58461-3-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: altera-tse: Set platform drvdata before registering netdev | Maxime Chevallier
We don't have to wait until netdev is registered before setting it as the pdev's drvdata. Move it at netdev alloc time. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20251103104928.58461-2-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: phy: make phy_device members pause and asym_pause bitfield bits | Heiner Kallweit
We can reduce the size of struct phy_device a little by switching the type of members pause and asym_pause from int to a single bit. As C99 is supported now, we can use type bool for the bitfield members, which provides us with the benefit of the usual implicit bool conversions. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/764e9a31-b40b-4dc9-b808-118192a16d87@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
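The benefit of bool bitfields mentioned above can be seen in a small standalone sketch. The three structs are hypothetical reductions that keep only the two flags; they are not struct phy_device.

```c
#include <assert.h>
#include <stdbool.h>

/* Before: each flag occupies a full int. */
struct flags_int {
	int pause;
	int asym_pause;
};

/* After: one bit per flag, typed as bool. */
struct flags_bool_bits {
	bool pause:1;
	bool asym_pause:1;
};

/* For comparison: one bit per flag, typed as unsigned int. */
struct flags_uint_bits {
	unsigned int pause:1;
	unsigned int asym_pause:1;
};

/* Storing into a bool bit first converts the value to 0 or 1. */
static inline bool store_bool_bit(int v)
{
	struct flags_bool_bits f = { 0 };

	f.pause = v;
	return f.pause;
}

/* Storing into an unsigned bit keeps only the lowest bit of the value. */
static inline unsigned int store_uint_bit(int v)
{
	struct flags_uint_bits f = { 0 };

	f.pause = v;
	return f.pause;
}
```

An assignment such as `pause = value & SOME_MASK`, where the masked bit is not bit 0, therefore still yields true with the bool bitfield, while the unsigned variant would silently store 0.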
2025-11-04 | net: rnpgbe: Add register_netdev | Dong Yibo
Complete the network device (netdev) registration flow for Mucse Gbe Ethernet chips, including:
1. Hardware state initialization:
   - Send powerup notification to firmware (via echo_fw_status)
   - Sync with firmware
   - Reset hardware
2. MAC address handling:
   - Retrieve permanent MAC from firmware (via mucse_mbx_get_macaddr)
   - Fall back to a random valid MAC (eth_random_addr) if the firmware does not provide a valid MAC
Signed-off-by: Dong Yibo <dong100@mucse.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251101013849.120565-6-dong100@mucse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
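The MAC address fallback in step 2 can be sketched in userspace as follows. mac_is_valid() and mac_random() mirror what the kernel's is_valid_ether_addr() and eth_random_addr() do in spirit, but they are simplified re-implementations; mac_select() is a hypothetical helper, and rand() stands in for the kernel's random source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* Valid means: not all-zero and not multicast (which covers broadcast). */
static inline bool mac_is_valid(const uint8_t *addr)
{
	static const uint8_t zero[ETH_ALEN];

	return !(addr[0] & 0x01) && memcmp(addr, zero, ETH_ALEN) != 0;
}

/* Random unicast, locally administered address. Not cryptographic. */
static inline void mac_random(uint8_t *addr)
{
	for (int i = 0; i < ETH_ALEN; i++)
		addr[i] = (uint8_t)rand();
	addr[0] &= 0xfe;	/* clear the multicast bit */
	addr[0] |= 0x02;	/* set the locally administered bit */
}

/*
 * Prefer the firmware-provided address; fall back to a random one.
 * Returns true when the firmware address was used.
 */
static inline bool mac_select(uint8_t *out, const uint8_t *fw_addr)
{
	if (fw_addr && mac_is_valid(fw_addr)) {
		memcpy(out, fw_addr, ETH_ALEN);
		return true;
	}
	mac_random(out);
	return false;
}
```

Setting the locally administered bit marks the fallback address as not vendor-assigned, and it also guarantees the result is never all-zero.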
2025-11-04 | net: rnpgbe: Add basic mbx_fw support | Dong Yibo
Add fundamental firmware (FW) communication operations via PF-FW mailbox, including:
- FW sync (via HW info query with retries)
- HW reset (post FW command to reset hardware)
- MAC address retrieval (request FW for port-specific MAC)
- Power management (powerup/powerdown notification to FW)
Signed-off-by: Dong Yibo <dong100@mucse.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251101013849.120565-5-dong100@mucse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: rnpgbe: Add basic mbx ops support | Dong Yibo
Add fundamental mailbox (MBX) communication operations between PF (Physical Function) and firmware for n500/n210 chips Signed-off-by: Dong Yibo <dong100@mucse.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251101013849.120565-4-dong100@mucse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: rnpgbe: Add n500/n210 chip support with BAR2 mapping | Dong Yibo
Add hardware initialization foundation for MUCSE 1Gbe controller, including:
1. Map PCI BAR2 as hardware register base;
2. Bind PCI device to driver private data (struct mucse) and initialize hardware context (struct mucse_hw);
3. Reserve board-specific init framework via rnpgbe_init_hw.
Signed-off-by: Dong Yibo <dong100@mucse.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: MD Danish Anwar <danishanwar@ti.com> Link: https://patch.msgid.link/20251101013849.120565-3-dong100@mucse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: rnpgbe: Add build support for rnpgbe | Dong Yibo
Add build options and doc for mucse. Initialize pci device access for MUCSE devices. Signed-off-by: Dong Yibo <dong100@mucse.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: MD Danish Anwar <danishanwar@ti.com> Link: https://patch.msgid.link/20251101013849.120565-2-dong100@mucse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | ti: netcp: convert to ndo_hwtstamp callbacks | Vadim Fedorenko
Convert TI NetCP driver to use ndo_hwtstamp_get()/ndo_hwtstamp_set() callbacks. The logic is slightly changed, because I believe the original logic was not really correct. Config reading part is using the very first module to get the configuration instead of iterating over all of them and keep the last one as the configuration is supposed to be identical for all modules. HW timestamp config set path is now trying to configure all modules, but in case of error from one module it adds extack message. This way the configuration will be as synchronized as possible. There are only 2 modules using netcp core infrastructure, and both use the very same function to configure HW timestamping, so no actual difference in behavior is expected. Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20251103172902.3538392-1-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: pch_gbe: convert to use ndo_hwtstamp callbacks | Vadim Fedorenko
The driver implemented SIOCSHWTSTAMP ioctl command only, but it stores configuration in the private data, so it is possible to report it back to users. Implement both ndo_hwtstamp_set and ndo_hwtstamp_get callbacks. To properly report RX filter type, store it in hwts_rx_en instead of using this field as a simple flag. The logic didn't change because receive path used this field as boolean flag. Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20251103150952.3538205-7-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: thunderx: convert to use ndo_hwtstamp callbacks | Vadim Fedorenko
The driver implemented SIOCSHWTSTAMP ioctl command only, but it also stores configuration in private data, so it's possible to report it back to users. Implement both ndo_hwtstamp_set and ndo_hwtstamp_get callbacks. Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251103150952.3538205-6-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: octeon: mgmt: convert to use ndo_hwtstamp callbacks | Vadim Fedorenko
The driver implemented SIOCSHWTSTAMP ioctl command only. But it stores timestamping configuration, so it is possible to report it to users. Implement both ndo_hwtstamp_set and ndo_hwtstamp_get callbacks. After this the ndo_eth_ioctl effectively becomes phy_do_ioctl - adjust callback accordingly. Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251103150952.3538205-5-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: liquidio_vf: convert to use ndo_hwtstamp callbacks | Vadim Fedorenko
The driver implemented SIOCSHWTSTAMP ioctl command only, but there is a way to get configuration back. Implement both ndo_hwtstamp_set and ndo_hwtstamp_get callbacks. Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251103150952.3538205-4-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net: liquidio: convert to use ndo_hwtstamp callbacks | Vadim Fedorenko
The driver implemented SIOCSHWTSTAMP ioctl command only, but there is a way to get configured status. Implement both ndo_hwtstamp_set and ndo_hwtstamp_get callbacks. Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251103150952.3538205-3-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | bnxt_en: Fix warning in bnxt_dl_reload_down() | Shantiprasad Shettar
The existing code calls bnxt_cancel_reservations() after bnxt_hwrm_func_drv_unrgtr() in bnxt_dl_reload_down(). bnxt_cancel_reservations() calls the FW and it will always fail since the driver has already unregistered, triggering this warning: bnxt_en 0000:0a:00.0 ens2np0: resc_qcaps failed Fix it by calling bnxt_clear_reservations() which will skip the unnecessary FW call since we have unregistered. Fixes: 228ea8c187d8 ("bnxt_en: implement devlink dev reload driver_reinit") Reviewed-by: Mohammad Shuab Siddique <mohammad-shuab.siddique@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Shantiprasad Shettar <shantiprasad.shettar@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251104005700.542174-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | bnxt_en: Always provide max entry and entry size in coredump segments | Kashyap Desai
While populating firmware host logging segments for the coredump, it is possible for the FW command that flushes the segment to fail. When that happens, the existing code will not update the max entry and entry size in the segment header and this causes software that decodes the coredump to skip the segment. The segment most likely has already collected some DMA data, so always update these 2 segment fields in the header to allow the decoder to decode any data in the segment. Fixes: 3c2179e66355 ("bnxt_en: Add FW trace coredump segments to the coredump") Reviewed-by: Shruti Parab <shruti.parab@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251104005700.542174-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | bnxt_en: Fix null pointer dereference in bnxt_bs_trace_check_wrap() | Gautam R A
With older FW, we may get the ASYNC_EVENT_CMPL_EVENT_ID_DBG_BUF_PRODUCER for FW trace data type that has not been initialized. This will result in a crash in bnxt_bs_trace_check_wrap(). Add a guard to check for a valid magic_byte pointer before proceeding. Fixes: 84fcd9449fd7 ("bnxt_en: Manage the FW trace context memory") Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Shruti Parab <shruti.parab@broadcom.com> Signed-off-by: Gautam R A <gautam-r.a@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251104005700.542174-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | bnxt_en: Fix a possible memory leak in bnxt_ptp_init | Kalesh AP
In bnxt_ptp_init(), when ptp_clock_register() fails, the driver is not freeing the memory allocated for ptp_info->pin_config. Fix it to unconditionally free ptp_info->pin_config in bnxt_ptp_free(). Fixes: caf3eedbcd8d ("bnxt_en: 1PPS support for 5750X family chips") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251104005700.542174-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | bnxt_en: Shutdown FW DMA in bnxt_shutdown() | Michael Chan
The netif_close() call in bnxt_shutdown() only stops packet DMA. There may be FW DMA for trace logging (recently added) that will continue. If we kexec to a new kernel, the DMA will corrupt memory in the new kernel. Add bnxt_hwrm_func_drv_unrgtr() to unregister the driver from the FW. This will stop the FW DMA. In case the call fails, call pcie_flr() to reset the function and stop the DMA. Fixes: 24d694aec139 ("bnxt_en: Allocate backing store memory for FW trace logs") Reported-by: Jakub Kicinski <kicinski@meta.com> Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251104005700.542174-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | virtio_net: fix alignment for virtio_net_hdr_v1_hash | Michael S. Tsirkin
Changing alignment of header would mean it's no longer safe to cast a 2 byte aligned pointer between formats. Use two 16 bit fields to make it 2 byte aligned as previously. This fixes the performance regression since commit ("virtio_net: enable gso over UDP tunnel support.") as it uses virtio_net_hdr_v1_hash_tunnel which embeds virtio_net_hdr_v1_hash. Pktgen in guest + XDP_DROP on TAP + vhost_net shows the TX PPS is recovered from 2.4Mpps to 4.45Mpps. Fixes: 56a06bd40fab ("virtio_net: enable gso over UDP tunnel support.") Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Link: https://patch.msgid.link/20251031060551.126-1-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
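The alignment effect described in the virtio_net message can be demonstrated with reduced structs. Both layouts below are hypothetical simplifications (six 16-bit words stand in for the base header, host byte order is used instead of little-endian types), and the field names hash_value_lo and hash_value_hi are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* A 32-bit member raises the struct's alignment on most ABIs. */
struct hash_hdr_u32 {
	uint16_t base[6];	/* stand-in for the 2-byte-aligned base header */
	uint32_t hash_value;
	uint16_t hash_report;
	uint16_t padding;
};

/* Two 16-bit halves keep the whole struct 2-byte aligned. */
struct hash_hdr_split {
	uint16_t base[6];
	uint16_t hash_value_lo;
	uint16_t hash_value_hi;
	uint16_t hash_report;
	uint16_t padding;
};

static_assert(alignof(struct hash_hdr_split) == alignof(uint16_t),
	      "split layout must not need more than 16-bit alignment");

/* Reassemble the 32-bit hash from its halves (host byte order here). */
static inline uint32_t hash_value_get(const struct hash_hdr_split *h)
{
	return (uint32_t)h->hash_value_lo | ((uint32_t)h->hash_value_hi << 16);
}
```

Both layouts describe the same bytes in memory; only the alignment the compiler is allowed to assume changes, which is what makes casting a 2-byte-aligned pointer between header formats safe again.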
2025-11-04 | net/mlx5e: Defer channels closure to reduce interface down time | Tariq Toukan
Cap bit tis_tir_td_order=1 indicates that an old firmware requirement / limitation no longer exists. When unset, the latency of several firmware commands significantly increases with the presence of high number of co-existing channels (both old and new sets). Hence, we used to close unneeded old channels before invoking those firmware commands. Today, on capable devices, this is no longer the case. Minimize the interface down time by deferring the old channels closure, after the activation of the new ones. Perf numbers: Measured the number of dropped packets in a simple ping flood test, during a configuration change operation, that switches the number of channels from 247 to 248. Before: 71 packets lost After: 15 packets lost, ~80% saving. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1761831159-1013140-8-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net/mlx5e: Pass old channels as argument to mlx5e_switch_priv_channels | Tariq Toukan
Let the caller function mlx5e_safe_switch_params() maintain a copy of the old channels, and pass it to mlx5e_switch_priv_channels(). This is in preparation for the next patch. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1761831159-1013140-7-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net/mlx5e: Do not re-apply TIR loopback configuration if not necessary | Tariq Toukan
On old firmware (tis_tir_td_order=0), TIR of a transport domain should either be created after all SQs of the same domain, or TIR.self_lb_en should be reapplied using MODIFY_TIR, for self loopback filtering to function correctly. This is not necessary anymore on new FW (tis_tir_td_order=1), thus there's no need for calling modify_tir operations after creating a new set of SQs to keep self loopback prevention functional. Skip these operations. This saves O(max_num_channels) MODIFY_TIR firmware commands in operations like interface up or channels configuration change. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1761831159-1013140-6-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net/mlx5: IPoIB, set self loopback prevention in TIR init | Tariq Toukan
In IPoIB, the self loopback prevention configuration apply in activation stage has two roles: fulfill a firmware requirement for old firmware (tis_tir_td_order=0), and update the proper configuration as it was not set in init. Here we set the proper configuration in init, to allow skipping the modify_tirs commands on new firmware in a downstream patch. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1761831159-1013140-5-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net/mlx5e: Allow setting self loopback prevention bits on TIR init | Tariq Toukan
Until now, IPoIB was creating TIRs without setting self loopback prevention, then modifying them in activation stage. This is a preparation patch, that will be used by IPoIB to init TIRs properly without the need for following calls of modify_tir. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1761831159-1013140-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net/mlx5e: Use TIR API in mlx5e_modify_tirs_lb() | Tariq Toukan
Extend the TIR API and use it in mlx5e_modify_tirs_lb() instead of the explicit modify_tir code. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1761831159-1013140-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 | net/mlx5e: Enhance function structures for self loopback prevention application | Tariq Toukan
The re-application of self loopback prevention attributes in TIRs is necessary in old firmwares (where tis_tir_td_order cap is cleared) after recreation of SQs. However, this is not needed in new firmware with tis_tir_td_order=1. As a preparation patch, enhance the function structures to differentiate between an explicit loopback prevention configuration apply, and the re-apply operation required by old firmware. Loopback selftests should now call mlx5e_modify_tirs_lb() directly, as their use case is not related to the firmware limitation. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1761831159-1013140-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04xen/netfront: Comment Correction: Fix Spelling Error and Description of Queue Quantity RulesChu Guangqing
The original comments contained spelling errors and incomplete logical descriptions, which could easily lead to misunderstandings of the code logic. The specific modifications are as follows: Correct the spelling error by changing "inut max" to "but not exceed the maximum limit"; Add the note "If the user has not specified a value, the default maximum limit is 8" to clarify the default value logic; Improve the coherence of the statement to make the queue quantity rules clearer. After the modification, the comments can accurately reflect the code behavior of "taking the smaller value between the number of CPUs and the default maximum limit of 8 for the number of queues", enhancing code maintainability. Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://patch.msgid.link/20251103032212.2462-1-chuguangqing@inspur.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
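The queue-count rule described in that comment can be sketched in plain C. This is only an illustration of "the smaller of the CPU count and the maximum limit"; the names pick_num_queues and DEFAULT_MAX_QUEUES, and the convention that 0 means "not specified by the user", are assumptions made for this example and are not taken from the driver.

```c
#include <assert.h>

/* Hypothetical name for the "default maximum limit is 8" from the comment. */
#define DEFAULT_MAX_QUEUES 8u

/*
 * Sketch of the rule: use the number of CPUs as the queue count, but do
 * not exceed the maximum limit. Here user_max == 0 stands for "the user
 * has not specified a value" (an assumption of this example), in which
 * case the default limit applies.
 */
static unsigned int pick_num_queues(unsigned int num_cpus, unsigned int user_max)
{
	unsigned int cap = user_max ? user_max : DEFAULT_MAX_QUEUES;

	return num_cpus < cap ? num_cpus : cap;
}
```

Under these assumptions, a 4-CPU guest with no user setting gets 4 queues, a 16-CPU guest gets 8, and a 16-CPU guest with a user limit of 2 gets 2.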
2025-11-04net: sungem_phy: Fix a typo error in sungem_phyChu Guangqing
Fix a spelling mistake for "regularly". Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251103054443.2878-1-chuguangqing@inspur.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04veth: Fix a typo error in vethChu Guangqing
Fix a spelling error for "resources". Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251103055351.3150-1-chuguangqing@inspur.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04gtp: Fix a typo error for sizeChu Guangqing
Fix the spelling error of "size". Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251103060504.3524-1-chuguangqing@inspur.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04virtio_net: Fix a typo error in virtio_netChu Guangqing
Fix the spelling error of "separate". Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20251103074305.4727-1-chuguangqing@inspur.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04net: stmmac: imx: use ->set_phy_intf_sel()Russell King (Oracle)
Rather than placing the phy_intf_sel() setup in the ->init() method, move it to the new ->set_phy_intf_sel() method. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vFt5C-0000000ChpR-2kAB@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04net: stmmac: imx: cleanup arguments for set_intf_mode() methodRussell King (Oracle)
Pass the imx_priv_data instead of the plat_stmmacenet_data into the set_intf_mode() SoC specific methods. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vFt57-0000000ChpL-25kS@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04net: stmmac: imx: simplify set_intf_mode() implementationsRussell King (Oracle)
Simplify the set_intf_mode() implementations, testing the phy_intf_sel value rather than the PHY interface mode. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vFt52-0000000ChpG-1bsd@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04net: stmmac: imx: use stmmac_get_phy_intf_sel()Russell King (Oracle)
i.MX implementations other than IMX8DXL involve setting the dwmac core phy_intf_sel input. Use stmmac_get_phy_intf_sel() to decode the PHY interface mode to the phy_intf_sel value, validating the result, and passing it into the implementation specific .set_intf_mode() method rather than each .set_intf_mode() method doing this. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vFt4x-0000000ChpA-1Edr@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
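The flow described above (decode the PHY interface mode once, validate it, then hand the result to the SoC-specific method) might look roughly like the following userspace sketch. Everything prefixed sketch_ or SKETCH_ is hypothetical, including the selector encodings and the callback signature; this is not the stmmac code, and the real helper's return convention may differ.

```c
#include <assert.h>

/* Hypothetical stand-ins for PHY interface modes. */
enum sketch_phy_mode {
	SKETCH_MODE_MII,
	SKETCH_MODE_GMII,
	SKETCH_MODE_RGMII,
	SKETCH_MODE_RMII,
	SKETCH_MODE_OTHER,
};

/* Assumed selector encodings, for illustration only. */
#define SKETCH_SEL_GMII_MII 0
#define SKETCH_SEL_RGMII    1
#define SKETCH_SEL_RMII     4

/* Decode a mode to a selector value; a negative result means "unsupported". */
static int sketch_get_phy_intf_sel(enum sketch_phy_mode mode)
{
	switch (mode) {
	case SKETCH_MODE_MII:
	case SKETCH_MODE_GMII:
		return SKETCH_SEL_GMII_MII;
	case SKETCH_MODE_RGMII:
		return SKETCH_SEL_RGMII;
	case SKETCH_MODE_RMII:
		return SKETCH_SEL_RMII;
	default:
		return -1;
	}
}

/* Hypothetical shape of a SoC-specific hook that receives the selector. */
typedef int (*sketch_set_intf_mode_fn)(unsigned int phy_intf_sel);

/* Common path: decode and validate once, then call the SoC-specific hook. */
static int sketch_setup(enum sketch_phy_mode mode, sketch_set_intf_mode_fn hook)
{
	int sel = sketch_get_phy_intf_sel(mode);

	if (sel < 0)
		return -1; /* reject before any SoC-specific code runs */

	return hook((unsigned int)sel);
}

/* Example hook that only records the selector it was given. */
static unsigned int sketch_last_sel;

static int sketch_record_sel(unsigned int phy_intf_sel)
{
	sketch_last_sel = phy_intf_sel;
	return 0;
}
```

The point of the arrangement is that the hooks no longer each repeat the mode-to-selector switch; they only consume an already validated value.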
2025-11-04net: stmmac: imx: use FIELD_PREP()/FIELD_GET() for PHY_INTF_SEL_xRussell King (Oracle)
Use FIELD_PREP()/FIELD_GET() in the functions to construct the PHY interface selection bitfield or to extract its value. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vFt4s-0000000Chp4-0kwf@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
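For readers unfamiliar with these helpers, the idea can be sketched in userspace C. The macros below are simplified stand-ins for the kernel's FIELD_PREP()/FIELD_GET() from <linux/bitfield.h> (without their compile-time checks), they rely on the GCC/Clang builtin __builtin_ctz, and the mask position is a made-up example rather than the register layout used by the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel bitfield helpers. */
#define SKETCH_FIELD_SHIFT(mask)     (__builtin_ctz(mask))
#define SKETCH_FIELD_PREP(mask, val) \
	(((uint32_t)(val) << SKETCH_FIELD_SHIFT(mask)) & (mask))
#define SKETCH_FIELD_GET(mask, reg) \
	(((uint32_t)(reg) & (mask)) >> SKETCH_FIELD_SHIFT(mask))

/* Hypothetical 3-bit selection field at bits 18:16 of a 32-bit register. */
#define SKETCH_INTF_SEL_MASK 0x00070000u

/* Replace the field in reg with sel, leaving all other bits untouched. */
static uint32_t sketch_set_intf_sel(uint32_t reg, uint32_t sel)
{
	reg &= ~SKETCH_INTF_SEL_MASK;
	reg |= SKETCH_FIELD_PREP(SKETCH_INTF_SEL_MASK, sel);
	return reg;
}

/* Extract the field value, shifted down to bit 0. */
static uint32_t sketch_get_intf_sel(uint32_t reg)
{
	return SKETCH_FIELD_GET(SKETCH_INTF_SEL_MASK, reg);
}
```

The benefit is that only the mask is written out; the shift is derived from it, so the mask and shift cannot drift apart the way separately defined constants can.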