summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)Author
13 daysice: dpll: fix misplaced header macrosIvan Vecera
The CGU register definitions (ICE_CGU_R10, ICE_CGU_R11 and related field masks) were placed after the #endif of the _ICE_DPLL_H_ include guard, leaving them unprotected. Move them inside the guard. Fixes: ad1df4f2d591 ("ice: dpll: Support E825-C SyncE and dynamic pin discovery") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-8-a5ea4dc837a9@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysice: dpll: fix rclk pin state get for E810Ivan Vecera
The refactoring of ice_dpll_rclk_state_on_pin_get() to use ice_dpll_pin_get_parent_idx() omitted the base_rclk_idx adjustment that was correctly added in the ice_dpll_rclk_state_on_pin_set() path. This breaks E810 devices where base_rclk_idx is non-zero, causing the wrong hardware index to be used for pin state lookup and incorrect recovered clock state to be reported via the DPLL subsystem. E825C is unaffected as its base_rclk_idx is 0. While at it, add bounds check against ICE_DPLL_RCLK_NUM_MAX on hw_idx after the base_rclk_idx subtraction in both ice_dpll_rclk_state_on_pin_{get,set}() to prevent out-of-bounds access on the pin state array. Fixes: ad1df4f2d591 ("ice: dpll: Support E825-C SyncE and dynamic pin discovery") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-7-a5ea4dc837a9@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysice: fix locking in ice_dcb_rebuild()Bart Van Assche
Move the mutex_lock() call up to prevent that DCB settings change after the first ice_query_port_ets() call. The second ice_query_port_ets() call in ice_dcb_rebuild() is already protected by pf->tc_mutex. This also fixes a bug in an error path, as before taking the first "goto dcb_error" in the function jumped over mutex_lock() to mutex_unlock(). This bug has been detected by the clang thread-safety analyzer. Cc: intel-wired-lan@lists.osuosl.org Fixes: 242b5e068b25 ("ice: Fix DCB rebuild after reset") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-6-a5ea4dc837a9@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysice: fix setting RSS VSI hash for E830Marcin Szycik
ice_set_rss_hfunc() performs a VSI update, in which it sets hashing function, leaving other VSI options unchanged. However, ::q_opt_flags is mistakenly set to the value of another field, instead of its original value, probably due to a typo. What happens next is hardware-dependent: On E810, only the first bit is meaningful (see ICE_AQ_VSI_Q_OPT_PE_FLTR_EN) and can potentially end up in a different state than before VSI update. On E830, some of the remaining bits are not reserved. Setting them to some unrelated values can cause the firmware to reject the update because of invalid settings, or worse - succeed. Reproducer: sudo ethtool -X $PF1 equal 8 Output in dmesg: Failed to configure RSS hash for VSI 6, error -5 Fixes: 352e9bf23813 ("ice: enable symmetric-xor RSS for Toeplitz hash function") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-5-a5ea4dc837a9@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysidpf: fix double free and use-after-free in aux device error pathsGreg Kroah-Hartman
When auxiliary_device_add() fails in idpf_plug_vport_aux_dev() or idpf_plug_core_aux_dev(), the err_aux_dev_add label calls auxiliary_device_uninit() and falls through to err_aux_dev_init. The uninit call will trigger put_device(), which invokes the release callback (idpf_vport_adev_release / idpf_core_adev_release) that frees iadev. The fall-through then reads adev->id from the freed iadev for ida_free() and double-frees iadev with kfree(). Free the IDA slot and clear the back-pointer before uninit, while adev is still valid, then return immediately. Commit 65637c3a1811 ("idpf: fix UAF in RDMA core aux dev deinitialization") fixed the same use-after-free in the matching unplug path in this file but missed both probe error paths. Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: stable@kernel.org Fixes: be91128c579c ("idpf: implement RDMA vport auxiliary dev create, init, and destroy") Fixes: f4312e6bfa2a ("idpf: implement core RDMA auxiliary dev create, init, and destroy") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-4-a5ea4dc837a9@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysidpf: fix read_dev_clk_lock spinlock init in idpf_ptp_init()Emil Tantilov
In idpf_ptp_init(), read_dev_clk_lock is initialized after ptp_schedule_worker() had already been called (and after idpf_ptp_settime64() could reach the lock). The PTP aux worker fires immediately upon scheduling and can call into idpf_ptp_read_src_clk_reg_direct(), which takes spin_lock(&ptp->read_dev_clk_lock) on an uninitialized lock, triggering the lockdep "non-static key" warning: [12973.796587] idpf 0000:83:00.0: Device HW Reset initiated [12974.094507] INFO: trying to register non-static key. ... [12974.097208] Call Trace: [12974.097213] <TASK> [12974.097218] dump_stack_lvl+0x93/0xe0 [12974.097234] register_lock_class+0x4c4/0x4e0 [12974.097249] ? __lock_acquire+0x427/0x2290 [12974.097259] __lock_acquire+0x98/0x2290 [12974.097272] lock_acquire+0xc6/0x310 [12974.097281] ? idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf] [12974.097311] ? lockdep_hardirqs_on_prepare+0xde/0x190 [12974.097318] ? finish_task_switch.isra.0+0xd2/0x350 [12974.097330] ? __pfx_ptp_aux_kworker+0x10/0x10 [ptp] [12974.097343] _raw_spin_lock+0x30/0x40 [12974.097353] ? idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf] [12974.097373] idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf] [12974.097391] ? kthread_worker_fn+0x88/0x3d0 [12974.097404] ? kthread_worker_fn+0x4e/0x3d0 [12974.097411] idpf_ptp_update_cached_phctime+0x26/0x120 [idpf] [12974.097428] ? _raw_spin_unlock_irq+0x28/0x50 [12974.097436] idpf_ptp_do_aux_work+0x15/0x20 [idpf] [12974.097454] ptp_aux_kworker+0x20/0x40 [ptp] [12974.097464] kthread_worker_fn+0xd5/0x3d0 [12974.097474] ? __pfx_kthread_worker_fn+0x10/0x10 [12974.097482] kthread+0xf4/0x130 [12974.097489] ? __pfx_kthread+0x10/0x10 [12974.097498] ret_from_fork+0x32c/0x410 [12974.097512] ? __pfx_kthread+0x10/0x10 [12974.097519] ret_from_fork_asm+0x1a/0x30 [12974.097540] </TASK> Move the call to spin_lock_init() up a bit to make sure read_dev_clk_lock is not touched before it's been initialized. Fixes: 5cb8805d2366 ("idpf: negotiate PTP capabilities and get PTP clock") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-3-a5ea4dc837a9@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysi40e: Cleanup PTP pins on probe failureMatt Vollrath
PTP pin structs are allocated early in probe, but never cleaned up. Fix this by calling i40e_ptp_free_pins in the error path. To support this, i40e_ptp_free_pins is added to the header and pin_config is correctly nullified after being freed. This has been an issue since i40e_ptp_alloc_pins was introduced. Fixes: 1050713026a08 ("i40e: add support for PTP external synchronization clock") Reported-by: Kohei Enju <kohei@enjuk.jp> Cc: stable@vger.kernel.org Signed-off-by: Matt Vollrath <tactii@gmail.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Kohei Enju <kohei@enjuk.jp> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-2-a5ea4dc837a9@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysi40e: Cleanup PTP registration on probe failureMatt Vollrath
Fix two conditions which would leak PTP registration on probe failure: 1. i40e_setup_pf_switch can encounter an error in i40e_setup_pf_filter_control, call i40e_ptp_init, then return non-zero, sending i40e_probe to err_vsis. 2. i40e_setup_misc_vector can return non-zero, sending i40e_probe to err_vsis. Both of these conditions have been present since PTP was introduced in this driver. Found with coccinelle. Fixes: beb0dff1251db ("i40e: enable PTP") Signed-off-by: Matt Vollrath <tactii@gmail.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-1-a5ea4dc837a9@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-30ice: add dpll peer notification for paired SMA and U.FL pinsPetr Oros
SMA and U.FL pins share physical signal paths in pairs (SMA1/U.FL1 and SMA2/U.FL2). When one pin's state changes via a PCA9575 GPIO write, the paired pin's state also changes, but no notification is sent for the peer pin. Userspace consumers monitoring the peer via dpll netlink subscribe never learn about the update. Add ice_dpll_sw_pin_notify_peer() which sends a change notification for the paired SW pin. Call it from ice_dpll_pin_sma_direction_set(), ice_dpll_sma_pin_state_set(), and ice_dpll_ufl_pin_state_set() after pf->dplls.lock is released. Use __dpll_pin_change_ntf() because dpll_lock is still held by the dpll netlink layer (dpll_pin_pre_doit). Fixes: 2dd5d03c77e2 ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-11-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-30ice: fix missing dpll notifications for SW pinsPetr Oros
The SMA/U.FL pin redesign (commit 2dd5d03c77e2 ("ice: redesign dpll sma/u.fl pins control")) introduced software-controlled pins that wrap backing CGU input/output pins, but never updated the notification and data paths to propagate pin events to these SW wrappers. The periodic work sends dpll_pin_change_ntf() only for direct CGU input pins. SW pins that wrap these inputs never receive change or phase offset notifications, so userspace consumers such as synce4l monitoring SMA pins via dpll netlink never learn about state transitions or phase offset updates. Similarly, ice_dpll_phase_offset_get() reads the SW pin's own phase_offset field which is never updated; the PPS monitor writes to the backing CGU input's field instead. Fix by introducing ice_dpll_pin_ntf(), a wrapper around dpll_pin_change_ntf() that also notifies any registered SMA/U.FL pin whose backing CGU input matches. Replace all direct dpll_pin_change_ntf() calls in the periodic notification paths with this wrapper. Fix ice_dpll_phase_offset_get() to return the backing CGU input's phase_offset for input-direction SW pins. Fixes: 2dd5d03c77e2 ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-10-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-30ice: fix SMA and U.FL pin state changes affecting paired pinPetr Oros
SMA and U.FL pins share physical signal paths in pairs (SMA1/U.FL1 and SMA2/U.FL2) controlled by the PCA9575 GPIO expander. Each pair can only have one active pin at a time: SMA1 output and U.FL1 output share the same CGU output, SMA2 input and U.FL2 input share the same CGU input. The PCA9575 register bits determine which connector in each pair owns the signal path. The driver does not account for this pairing in two places: ice_dpll_ufl_pin_state_set() modifies PCA9575 bits and disables the backing CGU pin without checking whether the U.FL pin is currently active. Disconnecting an already inactive U.FL pin flips bits that the paired SMA pin relies on, breaking its connection. ice_dpll_sma_direction_set() does not propagate direction changes to the paired U.FL pin. For SMA2/U.FL2 the ICE_SMA2_UFL2_RX_DIS bit is never managed, so U.FL2 stays disconnected after SMA2 switches to output. For both pairs the backing CGU pin of the U.FL side is never enabled when a direction change activates it, so userspace sees the pin as disconnected even though the routing is correct. Fix by guarding the U.FL disconnect path against inactive pins and by updating the paired U.FL pin fully on SMA direction changes: manage ICE_SMA2_UFL2_RX_DIS for the SMA2/U.FL2 pair and enable the backing CGU pin whenever the peer becomes active. Fixes: 2dd5d03c77e2 ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-8-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-30ice: fix missing SMA pin initialization in DPLL subsystemPetr Oros
The DPLL SMA/U.FL pin redesign introduced ice_dpll_sw_pin_frequency_get() which gates frequency reporting on the pin's active flag. This flag is determined by ice_dpll_sw_pins_update() from the PCA9575 GPIO expander state. Before the redesign, SMA pins were exposed as direct HW input/output pins and ice_dpll_frequency_get() returned the CGU frequency unconditionally — the PCA9575 state was never consulted. The PCA9575 powers on with all outputs high, setting ICE_SMA1_DIR_EN, ICE_SMA1_TX_EN, ICE_SMA2_DIR_EN and ICE_SMA2_TX_EN. Nothing in the driver writes the register during initialization, so ice_dpll_sw_pins_update() sees all pins as inactive and ice_dpll_sw_pin_frequency_get() permanently returns 0 Hz for every SW pin. Fix this by writing a default SMA configuration in ice_dpll_init_info_sw_pins(): clear all SMA bits, then set SMA1 and SMA2 as active inputs (DIR_EN=0) with U.FL1 output and U.FL2 input disabled. Each SMA/U.FL pair shares a physical signal path so only one pin per pair can be active at a time. U.FL pins still report frequency 0 after this fix: U.FL1 (output-only) is disabled by ICE_SMA1_TX_EN which keeps the TX output buffer off, and U.FL2 (input-only) is disabled by ICE_SMA2_UFL2_RX_DIS. They can be activated by changing the corresponding SMA pin direction via dpll netlink. Fixes: 2dd5d03c77e2 ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-7-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-30ice: fix infinite recursion in ice_cfg_tx_topo via ice_init_dev_hwPetr Oros
On certain E810 configurations where firmware supports Tx scheduler topology switching (tx_sched_topo_comp_mode_en), ice_cfg_tx_topo() may need to apply a new 5-layer or 9-layer topology from the DDP package. If the AQ command to set the topology fails (e.g. due to invalid DDP data or firmware limitations), the global configuration lock must still be cleared via a CORER reset. Commit 86aae43f21cf ("ice: don't leave device non-functional if Tx scheduler config fails") correctly fixed this by refactoring ice_cfg_tx_topo() to always trigger CORER after acquiring the global lock and re-initialize hardware via ice_init_hw() afterwards. However, commit 8a37f9e2ff40 ("ice: move ice_deinit_dev() to the end of deinit paths") later moved ice_init_dev_hw() into ice_init_hw(), breaking the reinit path introduced by 86aae43f21cf. This creates an infinite recursive call chain: ice_init_hw() ice_init_dev_hw() ice_cfg_tx_topo() # topology change needed ice_deinit_hw() ice_init_hw() # reinit after CORER ice_init_dev_hw() # recurse ice_cfg_tx_topo() ... # stack overflow Fix by moving ice_init_dev_hw() back out of ice_init_hw() and calling it explicitly from ice_probe() and ice_devlink_reinit_up(). The third caller, ice_cfg_tx_topo(), intentionally does not need ice_init_dev_hw() during its reinit, it only needs the core HW reinitialization. This breaks the recursion cleanly without adding flags or guards. The deinit ordering changes from commit 8a37f9e2ff40 ("ice: move ice_deinit_dev() to the end of deinit paths") which fixed slow rmmod are preserved, only the init-side placement of ice_init_dev_hw() is reverted. Fixes: 8a37f9e2ff40 ("ice: move ice_deinit_dev() to the end of deinit paths") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-6-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-30ice: fix NULL pointer dereference in ice_reset_all_vfs()Petr Oros
ice_reset_all_vfs() ignores the return value of ice_vf_rebuild_vsi(). When the VSI rebuild fails (e.g. during NVM firmware update via nvmupdate64e), ice_vsi_rebuild() tears down the VSI on its error path, leaving txq_map and rxq_map as NULL. The subsequent unconditional call to ice_vf_post_vsi_rebuild() leads to a NULL pointer dereference in ice_ena_vf_q_mappings() when it accesses vsi->txq_map[0]. The single-VF reset path in ice_reset_vf() already handles this correctly by checking the return value of ice_vf_reconfig_vsi() and skipping ice_vf_post_vsi_rebuild() on failure. Apply the same pattern to ice_reset_all_vfs(): check the return value of ice_vf_rebuild_vsi() and skip ice_vf_post_vsi_rebuild() and ice_eswitch_attach_vf() on failure. The VF is left safely disabled (ICE_VF_STATE_INIT not set, VFGEN_RSTAT not set to VFACTIVE) and can be recovered via a VFLR triggered by a PCI reset of the VF (sysfs reset or driver rebind). Note that this patch does not prevent the VF VSI rebuild from failing during NVM update — the underlying cause is firmware being in a transitional state while the EMP reset is processed, which can cause Admin Queue commands (ice_add_vsi, ice_cfg_vsi_lan) to fail. This patch only prevents the subsequent NULL pointer dereference that crashes the kernel when the rebuild does fail. crash> bt PID: 50795 TASK: ff34c9ee708dc680 CPU: 1 COMMAND: "kworker/u512:5" #0 [ff72159bcfe5bb50] machine_kexec at ffffffffaa8850ee #1 [ff72159bcfe5bba8] __crash_kexec at ffffffffaaa15fba #2 [ff72159bcfe5bc68] crash_kexec at ffffffffaaa16540 #3 [ff72159bcfe5bc70] oops_end at ffffffffaa837eda #4 [ff72159bcfe5bc90] page_fault_oops at ffffffffaa893997 #5 [ff72159bcfe5bce8] exc_page_fault at ffffffffab528595 #6 [ff72159bcfe5bd10] asm_exc_page_fault at ffffffffab600bb2 [exception RIP: ice_ena_vf_q_mappings+0x79] RIP: ffffffffc0a85b29 RSP: ff72159bcfe5bdc8 RFLAGS: 00010206 RAX: 00000000000f0000 RBX: ff34c9efc9c00000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000010 RDI: ff34c9efc9c00000 RBP: ff34c9efc27d4828 R8: 0000000000000093 R9: 0000000000000040 R10: ff34c9efc27d4828 R11: 0000000000000040 R12: 0000000000100000 R13: 0000000000000010 R14: R15: ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ff72159bcfe5bdf8] ice_sriov_post_vsi_rebuild at ffffffffc0a85e2e [ice] #8 [ff72159bcfe5be08] ice_reset_all_vfs at ffffffffc0a920b4 [ice] #9 [ff72159bcfe5be48] ice_service_task at ffffffffc0a31519 [ice] #10 [ff72159bcfe5be88] process_one_work at ffffffffaa93dca4 #11 [ff72159bcfe5bec8] worker_thread at ffffffffaa93e9de #12 [ff72159bcfe5bf18] kthread at ffffffffaa946663 #13 [ff72159bcfe5bf50] ret_from_fork at ffffffffaa8086b9 The panic occurs attempting to dereference the NULL pointer in RDX at ice_sriov.c:294, which loads vsi->txq_map (offset 0x4b8 in ice_vsi). The faulting VSI is an allocated slab object but not fully initialized after a failed ice_vsi_rebuild(): crash> struct ice_vsi 0xff34c9efc27d4828 netdev = 0x0, rx_rings = 0x0, tx_rings = 0x0, q_vectors = 0x0, txq_map = 0x0, rxq_map = 0x0, alloc_txq = 0x10, num_txq = 0x10, alloc_rxq = 0x10, num_rxq = 0x10, The nvmupdate64e process was performing NVM firmware update: crash> bt 0xff34c9edd1a30000 PID: 49858 TASK: ff34c9edd1a30000 CPU: 1 COMMAND: "nvmupdate64e" #0 [ff72159bcd617618] __schedule at ffffffffab5333f8 #4 [ff72159bcd617750] ice_sq_send_cmd at ffffffffc0a35347 [ice] #5 [ff72159bcd6177a8] ice_sq_send_cmd_retry at ffffffffc0a35b47 [ice] #6 [ff72159bcd617810] ice_aq_send_cmd at ffffffffc0a38018 [ice] #7 [ff72159bcd617848] ice_aq_read_nvm at ffffffffc0a40254 [ice] #8 [ff72159bcd6178b8] ice_read_flat_nvm at ffffffffc0a4034c [ice] #9 [ff72159bcd617918] ice_devlink_nvm_snapshot at ffffffffc0a6ffa5 [ice] dmesg: ice 0000:13:00.0: firmware recommends not updating fw.mgmt, as it may result in a downgrade. continuing anyways ice 0000:13:00.1: ice_init_nvm failed -5 ice 0000:13:00.1: Rebuild failed, unload and reload driver Fixes: 12bb018c538c ("ice: Refactor VF reset") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-5-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-30iavf: add VIRTCHNL_OP_ADD_VLAN to success completion handlerPetr Oros
The V1 ADD_VLAN opcode had no success handler; filters sent via V1 stayed in ADDING state permanently. Add a fallthrough case so V1 filters also transition ADDING -> ACTIVE on PF confirmation. Critically, add an `if (v_retval) break` guard: the error switch in iavf_virtchnl_completion() does NOT return after handling errors, it falls through to the success switch. Without this guard, a PF-rejected ADD would incorrectly mark ADDING filters as ACTIVE, creating a driver/HW mismatch where the driver believes the filter is installed but the PF never accepted it. For V2, this is harmless: iavf_vlan_add_reject() in the error block already kfree'd all ADDING filters, so the success handler finds nothing to transition. Fixes: 968996c070ef ("iavf: Fix VLAN_V2 addition/rejection") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-4-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-30iavf: wait for PF confirmation before removing VLAN filtersPetr Oros
The VLAN filter DELETE path was asymmetric with the ADD path: ADD waits for PF confirmation (ADD -> ADDING -> ACTIVE), but DELETE immediately frees the filter struct after sending the DEL message without waiting for the PF response. This is problematic because: - If the PF rejects the DEL, the filter remains in HW but the driver has already freed the tracking structure, losing sync. - Race conditions between DEL pending and other operations (add, reset) cannot be properly resolved if the filter struct is already gone. Add IAVF_VLAN_REMOVING state to make the DELETE path symmetric: REMOVE -> REMOVING (send DEL) -> PF confirms -> kfree -> PF rejects -> ACTIVE In iavf_del_vlans(), transition filters from REMOVE to REMOVING instead of immediately freeing them. The new DEL completion handler in iavf_virtchnl_completion() frees filters on success or reverts them to ACTIVE on error. Update iavf_add_vlan() to handle the REMOVING state: if a DEL is pending and the user re-adds the same VLAN, queue it for ADD so it gets re-programmed after the PF processes the DEL. The !VLAN_FILTERING_ALLOWED early-exit path still frees filters directly since no PF message is sent in that case. Also update iavf_del_vlan() to skip filters already in REMOVING state: DEL has been sent to PF and the completion handler will free the filter when PF confirms. Without this guard, the sequence DEL(pending) -> user-del -> second DEL could cause the PF to return an error for the second DEL (filter already gone), causing the completion handler to incorrectly revert a deleted filter back to ACTIVE. Fixes: 968996c070ef ("iavf: Fix VLAN_V2 addition/rejection") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-3-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-30iavf: stop removing VLAN filters from PF on interface downPetr Oros
When a VF goes down, the driver currently sends DEL_VLAN to the PF for every VLAN filter (ACTIVE -> DISABLE -> send DEL -> INACTIVE), then re-adds them all on UP (INACTIVE -> ADD -> send ADD -> ADDING -> ACTIVE). This round-trip is unnecessary because: 1. The PF disables the VF's queues via VIRTCHNL_OP_DISABLE_QUEUES, which already prevents all RX/TX traffic regardless of VLAN filter state. 2. The VLAN filters remaining in PF HW while the VF is down is harmless - packets matching those filters have nowhere to go with queues disabled. 3. The DEL+ADD cycle during down/up creates race windows where the VLAN filter list is incomplete. With spoofcheck enabled, the PF enables TX VLAN filtering on the first non-zero VLAN add, blocking traffic for any VLANs not yet re-added. Remove the entire DISABLE/INACTIVE state machinery: - Remove IAVF_VLAN_DISABLE and IAVF_VLAN_INACTIVE enum values - Remove iavf_restore_filters() and its call from iavf_open() - Remove VLAN filter handling from iavf_clear_mac_vlan_filters(), rename it to iavf_clear_mac_filters() - Remove DEL_VLAN_FILTER scheduling from iavf_down() - Remove all DISABLE/INACTIVE handling from iavf_del_vlans() VLAN filters now stay ACTIVE across down/up cycles. Only explicit user removal (ndo_vlan_rx_kill_vid) or PF/VF reset triggers VLAN filter deletion/re-addition. Fixes: ed1f5b58ea01 ("i40evf: remove VLAN filters on close") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-2-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-30iavf: rename IAVF_VLAN_IS_NEW to IAVF_VLAN_ADDINGPetr Oros
Rename the IAVF_VLAN_IS_NEW state to IAVF_VLAN_ADDING to better describe what the state represents: an ADD request has been sent to the PF and is waiting for a response. This is a pure rename with no behavioral change, preparing for a cleanup of the VLAN filter state machine. Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-1-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-23Merge tag 'net-7.1-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Netfilter. Steady stream of fixes. Last two weeks feel comparable to the two weeks before the merge window. Lots of AI-aided bug discovery. A newer big source is Sashiko/Gemini (Roman Gushchin's system), which points out issues in existing code during patch review (maybe 25% of fixes here likely originating from Sashiko). Nice thing is these are often fixed by the respective maintainers, not drive-bys. Current release - new code bugs: - kconfig: MDIO_PIC64HPSC should depend on ARCH_MICROCHIP Previous releases - regressions: - add async ndo_set_rx_mode and switch drivers which we promised to be called under the per-netdev mutex to it - dsa: remove duplicate netdev_lock_ops() for conduit ethtool ops - hv_sock: report EOF instead of -EIO for FIN - vsock/virtio: fix MSG_PEEK calculation on bytes to copy Previous releases - always broken: - ipv6: fix possible UAF in icmpv6_rcv() - icmp: validate reply type before using icmp_pointers - af_unix: drop all SCM attributes for SOCKMAP - netfilter: fix a number of bugs in the osf (OS fingerprinting) - eth: intel: fix timestamp interrupt configuration for E825C Misc: - bunch of data-race annotations" * tag 'net-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (148 commits) rxrpc: Fix error handling in rxgk_extract_token() rxrpc: Fix re-decryption of RESPONSE packets rxrpc: Fix rxrpc_input_call_event() to only unshare DATA packets rxrpc: Fix missing validation of ticket length in non-XDR key preparsing rxgk: Fix potential integer overflow in length check rxrpc: Fix conn-level packet handling to unshare RESPONSE packets rxrpc: Fix potential UAF after skb_unshare() failure rxrpc: Fix rxkad crypto unalignment handling rxrpc: Fix memory leaks in rxkad_verify_response() net: rds: fix MR cleanup on copy error m68k: mvme147: Make me the maintainer net: txgbe: fix firmware version check selftests/bpf: check epoll readiness during reuseport migration tcp: call sk_data_ready() after listener migration vhost_net: fix sleeping with preempt-disabled in vhost_net_busy_poll() ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim tipc: fix double-free in tipc_buf_append() llc: Return -EINPROGRESS from llc_ui_connect() ipv4: icmp: validate reply type before using icmp_pointers selftests/net: packetdrill: cover RFC 5961 5.2 challenge ACK on both edges ...
2026-04-22ice: fix ice_ptp_read_tx_hwtstamp_status_eth56gJacob Keller
The ice_ptp_read_tx_hwtstamp_status_eth56g function calls ice_read_phy_eth56g with a PHY index. However the function actually expects a port index. This causes the function to read the wrong PHY_PTP_INT_STATUS registers, and effectively makes the status wrong for the second set of ports from 4 to 7. The ice_read_phy_eth56g function uses the provided port index to determine which PHY device to read. We could refactor the entire chain to take a PHY index, but this would impact many code sites. Instead, multiply the PHY index by the number of ports, so that we read from the first port of each PHY. Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Petr Oros <poros@redhat.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260420-jk-iwl-net-2026-04-20-ptp-e825c-phy-interrupt-fixes-v1-4-bc2240f42251@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-22ice: fix ready bitmap check for non-E822 devicesJacob Keller
The E800 hardware (apart from E810) has a ready bitmap for the PHY indicating which timestamp slots currently have an outstanding timestamp waiting to be read by software. This bitmap is checked in multiple places using the ice_get_phy_tx_tstamp_ready(): * ice_ptp_process_tx_tstamp() calls it to determine which timestamps to attempt reading from the PHY * ice_ptp_tx_tstamps_pending() calls it in a loop at the end of the miscellaneous IRQ to check if new timestamps came in while the interrupt handler was executing. * ice_ptp_maybe_trigger_tx_interrupt() calls it in the auxiliary work task to trigger a software interrupt in the event that the hardware logic gets stuck. For E82X devices, multiple PHYs share the same block, and the parameter passed to the ready bitmap is a block number associated with the given port. For E825-C devices, the PHYs have their own independent blocks and do not share, so the parameter passed needs to be the port number. For E810 devices, the ice_get_phy_tx_tstamp_ready() always returns all 1s regardless of what port, since this hardware does not have a ready bitmap. Finally, for E830 devices, each PF has its own ready bitmap accessible via register, and the block parameter is unused. The first call correctly uses the Tx timestamp tracker block parameter to check the appropriate timestamp block. This works because the tracker is setup correctly for each timestamp device type. The second two callers behave incorrectly for all device types other than the older E822 devices. They both iterate in a loop using ICE_GET_QUAD_NUM() which is a macro only used by E822 devices. This logic is incorrect for devices other than the E822 devices. For E810 the calls would always return true, causing E810 devices to always attempt to trigger a software interrupt even when they have no reason to. For E830, this results in duplicate work as the ready bitmap is checked once per number of quads. Finally, for E825-C, this results in the pending checks failing to detect timestamps on ports other than the first two. Fix this by introducing a new hardware API function to ice_ptp_hw.c, ice_check_phy_tx_tstamp_ready(). This function will check if any timestamps are available and returns a positive value if any timestamps are pending. For E810, the function always returns false, so that the re-trigger checks never happen. For E830, check the ready bitmap just once. For E82x hardware, check each quad. Finally, for E825-C, check every port. The interface function returns an integer to enable reporting of error code if the driver is unable read the ready bitmap. This enables callers to handle this case properly. The previous implementation assumed that timestamps are available if they failed to read the bitmap. This is problematic as it could lead to continuous software IRQ triggering if the PHY timestamp registers somehow become inaccessible. This change is especially important for E825-C devices, as the missing checks could leave a window open where a new timestamp could arrive while the existing timestamps aren't completed. As a result, the hardware threshold logic would not trigger a new interrupt. Without the check, the timestamp is left unhandled, and new timestamps will not cause an interrupt again until the timestamp is handled. Since both the interrupt check and the backup check in the auxiliary task do not function properly, the device may have Tx timestamps permanently stuck failing on a given port. The faulty checks originate from commit d938a8cca88a ("ice: Auxbus devices & driver for E822 TS") and commit 712e876371f8 ("ice: periodically kick Tx timestamp interrupt"), however at the time of the original coding, both functions only operated on E822 hardware. This is no longer the case, and hasn't been since the introduction of the ETH56G PHY model in commit 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Petr Oros <poros@redhat.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260420-jk-iwl-net-2026-04-20-ptp-e825c-phy-interrupt-fixes-v1-3-bc2240f42251@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-22ice: perform PHY soft reset for E825C ports at initializationGrzegorz Nitka
In some cases the PHY timestamp block of the E825C can become stuck. This is known to occur if the software writes 0 to the Tx timestamp threshold, and with older versions of the ice driver the threshold configuration is buggy and can race in such that hardware briefly operates with a zero threshold enabled. There are no other known ways to trigger this behavior, but once it occurs, the hardware is not recovered by normal reset, a driver reload, or even a warm power cycle of the system. A cold power cycle is sufficient to recover hardware, but this is extremely invasive and can result in significant downtime on customer deployments. The PHY for each port has a timestamping block which has its own reset functionality accessible by programming the PHY_REG_GLOBAL register. Writing to the PHY_REG_GLOBAL_SOFT_RESET_BIT triggers the hardware to perform a complete reset of the timestamping block of the PHY. This includes clearing the timestamp status for the port, clearing all outstanding timestamps in the memory bank, and resetting the PHY timer. The new ice_ptp_phy_soft_reset_eth56g() function toggles the PHY_REG_GLOBAL soft reset bit with the required delays, ensuring the PHY is properly reinitialized without requiring a full device reset. The sequence clears the reset bit, asserts it, then clears it again, with short waits between transitions to allow hardware stabilization. Call this function in the new ice_ptp_init_phc_e825c(), implementing the E825C device specific variant of the ice_ptp_init_phc(). Note that if ice_ptp_init_phc() fails, PTP functionality may be disabled, but the driver will still load to allow basic functionality to continue. This causes the clock owning PF driver to perform a PHY soft reset for every port during initialization. This ensures the driver begins life in a known functional state regardless of how it was previously programmed. This ensures that we properly reconfigure the hardware after a device reset or when loading the driver, even if it was previously misconfigured with an out-of-date or modified driver. Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Signed-off-by: Timothy Miskell <timothy.miskell@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Petr Oros <poros@redhat.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260420-jk-iwl-net-2026-04-20-ptp-e825c-phy-interrupt-fixes-v1-2-bc2240f42251@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-22ice: fix timestamp interrupt configuration for E825CGrzegorz Nitka
The E825C ice_phy_cfg_intr_eth56g() function is responsible for programming the PHY interrupt for a given port. This function writes to the PHY_REG_TS_INT_CONFIG register of the port. The register is responsible for configuring whether the port interrupt logic is enabled, as well as programming the threshold of waiting timestamps that will trigger an interrupt from this port. This threshold value must not be programmed to zero while the interrupt is enabled. Doing so puts the port in a misconfigured state where the PHY timestamp interrupt for the quad of connected ports will become stuck. This occurs, because a threshold of zero results in the timestamp interrupt status for the port becoming stuck high. The four ports in the connected quad have their timestamp status indicators muxed together. A new interrupt cannot be generated until the timestamp status indicators return low for all four ports. Normally, the timestamp status for a port will clear once there are fewer timestamps in that ports timestamp memory bank than the threshold. A threshold of zero makes this impossible, so the timestamp status for the port does not clear. The ice driver never intentionally programs the threshold to zero, indeed the driver always programs it to a value of 1, intending to get an interrupt immediately as soon as even a single packet is waiting for a timestamp. However, there is a subtle flaw in the programming logic in the ice_phy_cfg_intr_eth56g() function. Due to the way that the hardware handles enabling the PHY interrupt. If the threshold value is modified at the same time as the interrupt is enabled, the HW PHY state machine might enable the interrupt before the new threshold value is actually updated. This leaves a potential race condition caused by the hardware logic where a PHY timestamp interrupt might be triggered before the non-zero threshold is written, resulting in the PHY timestamp logic becoming stuck. Once the PHY timestamp status is stuck high, it will remain stuck even after attempting to reprogram the PHY block by changing its threshold or disabling the interrupt. Even a typical PF or CORE reset will not reset the particular block of the PHY that becomes stuck. Even a warm power cycle is not guaranteed to cause the PHY block to reset, and a cold power cycle is required. Prevent this by always writing the PHY_REG_TS_INT_CONFIG in two stages. First write the threshold value with the interrupt disabled, and only write the enable bit after the threshold has been programmed. When disabling the interrupt, leave the threshold unchanged. Additionally, re-read the register after writing it to guarantee that the write to the PHY has been flushed upon exit of the function. While we're modifying this function implementation, explicitly reject programming a threshold of 0 when enabling the interrupt. No caller does this today, but the consequences of doing so are significant. An explicit rejection in the code makes this clear. Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Petr Oros <poros@redhat.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260420-jk-iwl-net-2026-04-20-ptp-e825c-phy-interrupt-fixes-v1-1-bc2240f42251@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-21iavf: convert to ndo_set_rx_mode_asyncStanislav Fomichev
Convert iavf from ndo_set_rx_mode to ndo_set_rx_mode_async. iavf_set_rx_mode now takes explicit uc/mc list parameters and uses __hw_addr_sync_dev on the snapshots instead of __dev_uc_sync and __dev_mc_sync. The iavf_configure internal caller passes the real lists directly. Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-10-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-18e1000e: Unroll PTP in probe error handlingMatt Vollrath
If probe fails after registering the PTP clock and its delayed work, these resources must be released. This was not an issue until a 2016 fix moved the e1000e_ptp_init() call before the jump to err_register. Fixes: aa524b66c5ef ("e1000e: don't modify SYSTIM registers during SIOCSHWTSTAMP ioctl") Signed-off-by: Matt Vollrath <tactii@gmail.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-12-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-18iavf: fix wrong VLAN mask for legacy Rx descriptors L2TAG2Petr Oros
The IAVF_RXD_LEGACY_L2TAG2_M mask was incorrectly defined as GENMASK_ULL(63, 32), extracting 32 bits from qw2 instead of the 16-bit VLAN tag. In the legacy Rx descriptor layout, the 2nd L2TAG2 (VLAN tag) occupies bits 63:48 of qw2, not 63:32. The oversized mask causes FIELD_GET to return a 32-bit value where the actual VLAN tag sits in bits 31:16. When this value is passed to iavf_receive_skb() as a u16 parameter, it gets truncated to the lower 16 bits (which contain the 1st L2TAG2, typically zero). As a result, __vlan_hwaccel_put_tag() is never called and software VLAN interfaces on VFs receive no traffic. This affects VFs behind ice PF (VIRTCHNL VLAN v2) when the PF advertises VLAN stripping into L2TAG2_2 and legacy descriptors are used. The flex descriptor path already uses the correct mask (IAVF_RXD_FLEX_L2TAG2_2_M = GENMASK_ULL(63, 48)). Reproducer: 1. Create 2 VFs on ice PF (echo 2 > sriov_numvfs) 2. Disable spoofchk on both VFs 3. Move each VF into a separate network namespace 4. On each VF: create VLAN interface (e.g. vlan 198), assign IP, bring up 5. Set rx-vlan-offload OFF on both VFs 6. Ping between VLAN interfaces -> expect PASS (VLAN tag stays in packet data, kernel matches in-band) 7. Set rx-vlan-offload ON on both VFs 8. Ping between VLAN interfaces -> expect FAIL if bug present (HW strips VLAN tag into descriptor L2TAG2 field, wrong mask extracts bits 47:32 instead of 63:48, truncated to u16 -> zero, __vlan_hwaccel_put_tag() never called, packet delivered to parent interface, not VLAN interface) The reproducer requires legacy Rx descriptors. On modern ice + iavf with full PTP support, flex descriptors are always negotiated and the buggy legacy path is never reached. Flex descriptors require all of: - CONFIG_PTP_1588_CLOCK enabled - VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC granted by PF - PTP capabilities negotiated (VIRTCHNL_VF_CAP_PTP) - VIRTCHNL_1588_PTP_CAP_RX_TSTAMP supported - VIRTCHNL_RXDID_2_FLEX_SQ_NIC present in DDP profile If any condition is not met, iavf_select_rx_desc_format() falls back to legacy descriptors (RXDID=1) and the wrong L2TAG2 mask is hit. Fixes: 2dc8e7c36d80 ("iavf: refactor iavf_clean_rx_irq to support legacy and flex descriptors") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-10-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-18i40e: don't advertise IFF_SUPP_NOFCSKohei Enju
i40e advertises IFF_SUPP_NOFCS, allowing users to use the SO_NOFCS socket option. However, this option is silently ignored, as the driver does not check skb->no_fcs, and always enables FCS insertion offload. Fix this by removing the advertisement of IFF_SUPP_NOFCS. This behavior can be reproduced with a simple AF_PACKET socket: import socket s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW) s.setsockopt(socket.SOL_SOCKET, 43, 1) # SO_NOFCS s.bind(("eth0", 0)) s.send(b'\xff' * 64) Previously, send() succeeds but the driver ignores SO_NOFCS. With this change, send() fails with -EPROTONOSUPPORT, as expected. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Kohei Enju <kohei@enjuk.jp> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-9-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-18ice: fix potential NULL pointer deref in error path of ice_set_ringparam()Kohei Enju
ice_set_ringparam nullifies tstamp_ring of temporary tx_rings, without clearing ICE_TX_RING_FLAGS_TXTIME bit. When ICE_TX_RING_FLAGS_TXTIME is set and the subsequent ice_setup_tx_ring() call fails, a NULL pointer dereference could happen in the unwinding sequence: ice_clean_tx_ring() -> ice_is_txtime_cfg() == true (ICE_TX_RING_FLAGS_TXTIME is set) -> ice_free_tx_tstamp_ring() -> ice_free_tstamp_ring() -> tstamp_ring->desc (NULL deref) Clear ICE_TX_RING_FLAGS_TXTIME bit to avoid the potential issue. Note that this potential issue is found by manual code review. Compile test only since unfortunately I don't have E830 devices. Fixes: ccde82e90946 ("ice: add E830 Earliest TxTime First Offload support") Signed-off-by: Kohei Enju <kohei@enjuk.jp> Reviewed-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-8-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-18ice: fix race condition in TX timestamp ring cleanupKeita Morisaki
Fix a race condition between ice_free_tx_tstamp_ring() and ice_tx_map() that can cause a NULL pointer dereference. ice_free_tx_tstamp_ring currently clears the ICE_TX_FLAGS_TXTIME flag after NULLing the tstamp_ring. This could allow a concurrent ice_tx_map call on another CPU to dereference the tstamp_ring, which could lead to a NULL pointer dereference. CPU A:ice_free_tx_tstamp_ring() | CPU B:ice_tx_map() --------------------------------|--------------------------------- tx_ring->tstamp_ring = NULL | | ice_is_txtime_cfg() -> true | tstamp_ring = tx_ring->tstamp_ring | tstamp_ring->count // NULL deref! flags &= ~ICE_TX_FLAGS_TXTIME | Fix by: 1. Reordering ice_free_tx_tstamp_ring() to clear the flag before NULLing the pointer, with smp_wmb() to ensure proper ordering. 2. Adding smp_rmb() in ice_tx_map() after the flag check to order the flag read before the pointer read, using READ_ONCE() for the pointer, and adding a NULL check as a safety net. 3. Converting tx_ring->flags from u8 to DECLARE_BITMAP() and using atomic bitops (set_bit(), clear_bit(), test_bit()) for all flag operations throughout the driver: - ICE_TX_RING_FLAGS_XDP - ICE_TX_RING_FLAGS_VLAN_L2TAG1 - ICE_TX_RING_FLAGS_VLAN_L2TAG2 - ICE_TX_RING_FLAGS_TXTIME Fixes: ccde82e909467 ("ice: add E830 Earliest TxTime First Offload support") Signed-off-by: Keita Morisaki <kmta1236@gmail.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-7-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-18ice: fix ICE_AQ_LINK_SPEED_M for 200GPaul Greenwalt
When setting PHY configuration during driver initialization, 200G link speed is not being advertised even when the PHY is capable. This is because the get PHY capabilities link speed response is being masked by ICE_AQ_LINK_SPEED_M, which does not include the 200G link speed bit. ICE_AQ_LINK_SPEED_200GB is defined as BIT(11), but the mask 0x7FF only covers bits 0-10. Fix ICE_AQ_LINK_SPEED_M to use GENMASK(11, 0) so that it covers all defined link speed bits including 200G. Fixes: 24407a01e57c ("ice: Add 200G speed/phy type use") Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-6-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-18ice: fix PHY config on media change with link-down-on-closePaul Greenwalt
Commit 1a3571b5938c ("ice: restore PHY settings on media insertion") introduced separate flows for setting PHY configuration on media present: ice_configure_phy() when link-down-on-close is disabled, and ice_force_phys_link_state() when enabled. The latter incorrectly uses the previous configuration even after module change, causing link issues such as wrong speed or no link. Unify PHY configuration into a single ice_phy_cfg() function with a link_en parameter, ensuring PHY capabilities are always fetched fresh from hardware. Fixes: 1a3571b5938c ("ice: restore PHY settings on media insertion") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-5-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-18ice: fix double-free of tx_buf skbMichal Schmidt
If ice_tso() or ice_tx_csum() fail, the error path in ice_xmit_frame_ring() frees the skb, but the 'first' tx_buf still points to it and is marked as valid (ICE_TX_BUF_SKB). 'next_to_use' remains unchanged, so the potential problem will likely fix itself when the next packet is transmitted and the tx_buf gets overwritten. But if there is no next packet and the interface is brought down instead, ice_clean_tx_ring() -> ice_unmap_and_free_tx_buf() will find the tx_buf and free the skb for the second time. The fix is to reset the tx_buf type to ICE_TX_BUF_EMPTY in the error path, so that ice_unmap_and_free_tx_buf(). Move the initialization of 'first' up, to ensure it's already valid in case we hit the linearization error path. The bug was spotted by AI while I had it looking for something else. It also proposed an initial version of the patch. I reproduced the bug and tested the fix by adding code to inject failures, on a build with KASAN. I looked for similar bugs in related Intel drivers and did not find any. Fixes: d76a60ba7afb ("ice: Add support for VLANs and offloads") Assisted-by: Claude:claude-4.6-opus-high Cursor Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-4-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-18ice: fix double free in ice_sf_eth_activate() error pathGuangshuo Li
When auxiliary_device_add() fails, ice_sf_eth_activate() jumps to aux_dev_uninit and calls auxiliary_device_uninit(&sf_dev->adev). The device release callback ice_sf_dev_release() frees sf_dev, but the current error path falls through to sf_dev_free and calls kfree(sf_dev) again, causing a double free. Keep kfree(sf_dev) for the auxiliary_device_init() failure path, but avoid falling through to sf_dev_free after auxiliary_device_uninit(). Fixes: 13acc5c4cdbe ("ice: subfunction activation and base devlink ops") Cc: stable@vger.kernel.org Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-3-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-18ice: update PCS latency settings for E825 10G/25Gb modesGrzegorz Nitka
Update MAC Rx/Tx offset registers settings (PHY_MAC_[RX|TX]_OFFSET registers) with the data obtained with the latest research. It applies to PCS latency settings for the following speeds/modes: * 10Gb NO-FEC - TX latency changed from 71.25 ns to 73 ns - RX latency changed from -25.6 ns to -28 ns * 25Gb NO-FEC - TX latency changed from 28.17 ns to 33 ns - RX latency changed from -12.45 ns to -12 ns * 25Gb RS-FEC - TX latency changed from 64.5 ns to 69 ns - RX latency changed from -3.6 ns to -3 ns The original data came from simulation and pre-production hardware. The new data measures the actual delays and as such is more accurate. Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Co-developed-by: Zoltan Fodor <zoltan.fodor@intel.com> Signed-off-by: Zoltan Fodor <zoltan.fodor@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-2-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-18ice: fix 'adjust' timer programming for E830 devicesGrzegorz Nitka
Fix incorrect 'adjust the timer' programming sequence for E830 devices series. Only shadow registers GLTSYN_SHADJ were programmed in the current implementation. According to the specification [1], write to command GLTSYN_CMD register is also required with CMD field set to "Adjust the Time" value, for the timer adjustment to take the effect. The flow was broken for the adjustment less than S32_MAX/MIN range (around +/- 2 seconds). For bigger adjustment, non-atomic programming flow is used, involving set timer programming. Non-atomic flow is implemented correctly. Testing hints: Run command: phc_ctl /dev/ptpX get adj 2 get Expected result: Returned timestamps differ at least by 2 seconds [1] Intel® Ethernet Controller E830 Datasheet rev 1.3, chapter 9.7.5.4 https://cdrdv2.intel.com/v1/dl/getContent/787353?explicitVersion=true Fixes: f00307522786 ("ice: Implement PTP support for E830 devices") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-1-686c33c9828d@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-15Merge tag 'pci-v7.1-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Allow TLP Processing Hints to be enabled for RCiEPs (George Abraham P) - Enable AtomicOps only if we know the Root Port supports them (Gerd Bayer) - Don't enable AtomicOps for RCiEPs since none of them need Atomic Ops and we can't tell whether the Root Complex would support them (Gerd Bayer) - Leave Precision Time Measurement disabled until a driver enables it to avoid PCIe errors (Mika Westerberg) - Make pci_set_vga_state() fail if bridge doesn't support VGA routing, i.e., PCI_BRIDGE_CTL_VGA is not writable, and return errors to vga_get() callers including userspace via /dev/vga_arbiter (Simon Richter) - Validate max-link-speed from DT in j721e, brcmstb, mediatek-gen3, rzg3s drivers (where the actual controller constraints are known), and remove validation from the generic OF DT accessor (Hans Zhang) - Remove pc110pad driver (no longer useful after 486 CPU support removed) and no_pci_devices() (pc110pad was the last user) (Dmitry Torokhov, Heiner Kallweit) Resource management: - Prevent assigning space to unimplemented bridge windows; previously we mistakenly assumed prefetchable window existed and assigned space and put a BAR there (Ahmed Naseef) - Avoid shrinking bridge windows to fit in the initial Root Port window; fixes one problem with devices with large BARs connected via switches, e.g., Thunderbolt (Ilpo Järvinen) - Pass full extent of empty space, not just the aligned space, to resource_alignf callback so free space before the requested alignment can be used (Ilpo Järvinen) - Place small resources before larger ones for better utilization of address space (Ilpo Järvinen) - Fix alignment calculation for resource size larger than align, e.g., bridge windows larger than the 1MB required alignment (Ilpo Järvinen) Reset: - Update slot handling so all ARI functions are treated as being in the same slot. They're all reset by Secondary Bus Reset, but previously drivers of ARI functions that appeared to be on a non-zero device weren't notified and fatal hardware errors could result (Keith Busch) - Make sysfs reset_subordinate hotplug safe to avoid spurious hotplug events (Keith Busch) - Hide Secondary Bus Reset ('bus') from sysfs reset_methods if masked by CXL because it has no effect (Vidya Sagar) - Avoid FLR for AMD NPU device, where it causes the device to hang (Lizhi Hou) Error handling: - Clear only error bits in PCIe Device Status to avoid accidentally clearing Emergency Power Reduction Detected (Shuai Xue) - Check for AER errors even in devices without drivers (Lukas Wunner) - Initialize ratelimit info so DPC and EDR paths log AER error information (Kuppuswamy Sathyanarayanan) Power control: - Add UPD720201/UPD720202 USB 3.0 xHCI Host Controller .compatible so generic pwrctrl driver can control it (Neil Armstrong) Hotplug: - Set LED_HW_PLUGGABLE for NPEM hotplug-capable ports so LED core doesn't complain when setting brightness fails because the endpoint is gone (Richard Cheng) Peer-to-peer DMA: - Allow wildcards in list of host bridges that support peer-to-peer DMA between hierarchy domains and add all Google SoCs (Jacob Moroni) Endpoint framework: - Advertise dynamic inbound mapping support in pci-epf-test and update host pci_endpoint_test to skip doorbell testing if not advertised by endpoint (Koichiro Den) - Return 0, not remaining timeout, when MHI eDMA ops complete so mhi_ep_ring_add_element() doesn't interpret non-zero as failure (Daniel Hodges) - Remove vntb and ntb duplicate resource teardown that leads to oops when .allow_link() fails or .drop_link() is called (Koichiro Den) - Disable vntb delayed work before clearing BAR mappings and doorbells to avoid oops caused by doing the work after resources have been torn down (Koichiro Den) - Add a way to describe reserved subregions within BARs, e.g., platform-owned fixed register windows, and use it for the RK3588 BAR4 DMA ctrl window (Koichiro Den) - Add BAR_DISABLED for BARs that will never be available to an EPF driver, and change some BAR_RESERVED annotations to BAR_DISABLED (Niklas Cassel) - Add NTB .get_dma_dev() callback for cases where DMA API requires a different device, e.g., vNTB devices (Koichiro Den) - Add reserved region types for MSI-X Table and PBA so Endpoint controllers can them as describe hardware-owned regions in a BAR_RESERVED BAR (Manikanta Maddireddy) - Make Tegra194/234 BAR0 programmable and remove 1MB size limit (Manikanta Maddireddy) - Expose Tegra BAR2 (MSI-X) and BAR4 (DMA) as 64-bit BAR_RESERVED (Manikanta Maddireddy) - Add Tegra194 and Tegra234 device table entries to pci_endpoint_test (Manikanta Maddireddy) - Skip the BAR subrange selftest if there are not enough inbound window resources to run the test (Christian Bruel) New native PCIe controller drivers: - Add DT binding and driver for Andes QiLai SoC PCIe host controller (Randolph Lin) - Add DT binding and driver for ESWIN PCIe Root Complex (Senchuan Zhang) Baikal T-1 PCIe controller driver: - Remove driver since it never quite became usable (Andy Shevchenko) Cadence PCIe controller driver: - Implement byte/word config reads with dword (32-bit) reads because some Cadence controllers don't support sub-dword accesses (Aksh Garg) CIX Sky1 PCIe controller driver: - Add 'power-domains' to DT binding for SCMI power domain (Gary Yang) Freescale i.MX6 PCIe controller driver: - Add i.MX94 and i.MX943 to fsl,imx6q-pcie-ep DT binding (Richard Zhu) - Delay instead of polling for L2/L3 Ready after PME_Turn_off when suspending i.MX6SX because LTSSM registers are inaccessible (Richard Zhu) - Separate PERST# assertion (for resetting endpoints) from core reset (for resetting the RC itself) to prepare for new DTs with PERST# GPIO in per-Root Port nodes (Sherry Sun) - Retain Root Port MSI capability on i.MX7D, i.MX8MM, and i.MX8MQ so MSI from downstream devices will work (Richard Zhu) - Fix i.MX95 reference clock source selection when internal refclk is used (Franz Schnyder) Freescale Layerscape PCIe controller driver: - Allow building as a removable module (Sascha Hauer) MediaTek PCIe Gen3 controller driver: - Use dev_err_probe() to simplify error paths and make deferred probe messages visible in /sys/kernel/debug/devices_deferred (Chen-Yu Tsai) - Power off device if setup fails (Chen-Yu Tsai) - Integrate new pwrctrl API to enable power control for WiFi/BT adapters on mainboard or in PCIe or M.2 slots (Chen-Yu Tsai) NVIDIA Tegra194 PCIe controller driver: - Poll less aggressively and non-atomically for PME_TO_Ack during transition to L2 (Vidya Sagar) - Disable LTSSM after transition to Detect on surprise link down to stop toggling between Polling and Detect (Manikanta Maddireddy) - Don't force the device into the D0 state before L2 when suspending or shutting down the controller (Vidya Sagar) - Disable PERST# IRQ only in Endpoint mode because it's not registered in Root Port mode (Manikanta Maddireddy) - Handle 'nvidia,refclk-select' as optional (Vidya Sagar) - Disable direct speed change in Endpoint mode so link speed change is controlled by the host (Vidya Sagar) - Set LTR values before link up to avoid bogus LTR messages with 0 latency (Vidya Sagar) - Allow system suspend when the Endpoint link is down (Vidya Sagar) - Use DWC IP core version, not Tegra custom values, to avoid DWC core version check warnings (Manikanta Maddireddy) - Apply ECRC workaround to devices based on DesignWare 5.00a as well as 4.90a (Manikanta Maddireddy) - Disable PM Substate L1.2 in Endpoint mode to work around Tegra234 erratum (Vidya Sagar) - Delay post-PERST# cleanup until core is powered on to avoid CBB timeout (Manikanta Maddireddy) - Assert CLKREQ# so switches that forward it to their downstream side can bring up those links successfully (Vidya Sagar) - Calibrate pipe to UPHY for Endpoint mode to reset stale PLL state from any previous bad link state (Vidya Sagar) - Remove IRQF_ONESHOT flag from Endpoint interrupt registration so DMA driver and Endpoint controller driver can share the interrupt line (Vidya Sagar) - Enable DMA interrupt to support DMA in both Root Port and Endpoint modes (Vidya Sagar) - Enable hardware link retraining after link goes down in Endpoint mode (Vidya Sagar) - Add DT binding and driver support for core clock monitoring (Vidya Sagar) Qualcomm PCIe controller driver: - Advertise 'Hot-Plug Capable' and set 'No Command Completed Support' since Qcom Root Ports support hotplug events like DL_Up/Down and can accept writes to Slot Control without delays between writes (Krishna Chaitanya Chundru) Renesas R-Car PCIe controller driver: - Mark Endpoint BAR0 and BAR2 as Resizable (Koichiro Den) - Reduce EPC BAR alignment requirement to 4K (Koichiro Den) Renesas RZ/G3S PCIe controller driver: - Add RZ/G3E to DT binding and to driver (John Madieu) - Assert (not deassert) resets in probe error path (John Madieu) - Assert resets in suspend path in reverse order they were deasserted during probe (John Madieu) - Rework inbound window algorithm to prevent mapping more than intended region and enforce alignment on size, to prepare for RZ/G3E support (John Madieu) Rockchip DesignWare PCIe controller driver: - Add tracepoints for PCIe controller LTSSM transitions and link rate changes (Shawn Lin) - Trace LTSSM events collected by the dw-rockchip debug FIFO (Shawn Lin) SOPHGO PCIe controller driver: - Disable ASPM L0s and L1 on Sophgo 2042 PCIe Root Ports that advertise support for them (Yao Zi) Synopsys DesignWare PCIe controller driver: - Continue with system suspend even if an Endpoint doesn't respond with PME_TO_Ack message (Manivannan Sadhasivam) - Set Endpoint MSI-X Table Size in the correct function of a multi-function device when configuring MSI-X, not in Function 0 (Aksh Garg) - Set Max Link Width and Max Link Speed for all functions of a multi-function device, not just Function 0 (Aksh Garg) - Expose PCIe event counters in groups 5-7 in debugfs (Hans Zhang) Miscellaneous: - Warn only once about invalid ACS kernel parameter format (Richard Cheng) - Suppress FW_BUG warning when writing sysfs 'numa_node' with the current value (Li RongQing) - Drop redundant 'depends on PCI' from Kconfig (Julian Braha)" * tag 'pci-v7.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (165 commits) PCI/P2PDMA: Add Google SoCs to the P2P DMA host bridge list PCI/P2PDMA: Allow wildcard Device IDs in host bridge list PCI: sg2042: Avoid L0s and L1 on Sophgo 2042 PCIe Root Ports PCI: cadence: Add flags for disabling ASPM capability for broken Root Ports PCI: tegra194: Add core monitor clock support dt-bindings: PCI: tegra194: Add monitor clock support PCI: tegra194: Enable hardware hot reset mode in Endpoint mode PCI: tegra194: Enable DMA interrupt PCI: tegra194: Remove IRQF_ONESHOT flag during Endpoint interrupt registration PCI: tegra194: Calibrate pipe to UPHY for Endpoint mode PCI: tegra194: Assert CLKREQ# explicitly by default PCI: tegra194: Fix CBB timeout caused by DBI access before core power-on PCI: tegra194: Disable L1.2 capability of Tegra234 EP PCI: dwc: Apply ECRC workaround to DesignWare 5.00a as well PCI: tegra194: Use DWC IP core version PCI: tegra194: Free up Endpoint resources during remove() PCI: tegra194: Allow system suspend when the Endpoint link is not up PCI: tegra194: Set LTR message request before PCIe link up in Endpoint mode PCI: tegra194: Disable direct speed change for Endpoint mode PCI: tegra194: Use devm_gpiod_get_optional() to parse "nvidia,refclk-select" ...
2026-04-14Merge tag 'net-next-7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Support HW queue leasing, allowing containers to be granted access to HW queues for zero-copy operations and AF_XDP - Number of code moves to help the compiler with inlining. Avoid output arguments for returning drop reason where possible - Rework drop handling within qdiscs to include more metadata about the reason and dropping qdisc in the tracepoints - Remove the rtnl_lock use from IP Multicast Routing - Pack size information into the Rx Flow Steering table pointer itself. This allows making the table itself a flat array of u32s, thus making the table allocation size a power of two - Report TCP delayed ack timer information via socket diag - Add ip_local_port_step_width sysctl to allow distributing the randomly selected ports more evenly throughout the allowed space - Add support for per-route tunsrc in IPv6 segment routing - Start work of switching sockopt handling to iov_iter - Improve dynamic recvbuf sizing in MPTCP, limit burstiness and avoid buffer size drifting up - Support MSG_EOR in MPTCP - Add stp_mode attribute to the bridge driver for STP mode selection. This addresses concerns about call_usermodehelper() usage - Remove UDP-Lite support (as announced in 2023) - Remove support for building IPv6 as a module. Remove the now unnecessary function calling indirection Cross-tree stuff: - Move Michael MIC code from generic crypto into wireless, it's considered insecure but some WiFi networks still need it Netfilter: - Switch nft_fib_ipv6 module to no longer need temporary dst_entry object allocations by using fib6_lookup() + RCU. Florian W reports this gets us ~13% higher packet rate - Convert IPVS's global __ip_vs_mutex to per-net service_mutex and switch the service tables to be per-net. Convert some code that walks the service lists to use RCU instead of the service_mutex - Add more opinionated input validation to lower security exposure - Make IPVS hash tables to be per-netns and resizable Wireless: - Finished assoc frame encryption/EPPKE/802.1X-over-auth - Radar detection improvements - Add 6 GHz incumbent signal detection APIs - Multi-link support for FILS, probe response templates and client probing - New APIs and mac80211 support for NAN (Neighbor Aware Networking, aka Wi-Fi Aware) so less work must be in firmware Driver API: - Add numerical ID for devlink instances (to avoid having to create fake bus/device pairs just to have an ID). Support shared devlink instances which span multiple PFs - Add standard counters for reporting pause storm events (implement in mlx5 and fbnic) - Add configuration API for completion writeback buffering (implement in mana) - Support driver-initiated change of RSS context sizes - Support DPLL monitoring input frequency (implement in zl3073x) - Support per-port resources in devlink (implement in mlx5) Misc: - Expand the YAML spec for Netfilter Drivers - Software: - macvlan: support multicast rx for bridge ports with shared source MAC address - team: decouple receive and transmit enablement for IEEE 802.3ad LACP "independent control" - Ethernet high-speed NICs: - nVidia/Mellanox: - support high order pages in zero-copy mode (for payload coalescing) - support multiple packets in a page (for systems with 64kB pages) - Broadcom 25-400GE (bnxt): - implement XDP RSS hash metadata extraction - add software fallback for UDP GSO, lowering the IOMMU cost - Broadcom 800GE (bnge): - add link status and configuration handling - add various HW and SW statistics - Marvell/Cavium: - NPC HW block support for cn20k - Huawei (hinic3): - add mailbox / control queue - add rx VLAN offload - add driver info and link management - Ethernet NICs: - Marvell/Aquantia: - support reading SFP module info on some AQC100 cards - Realtek PCI (r8169): - add support for RTL8125cp - Realtek USB (r8152): - support for the RTL8157 5Gbit chip - add 2500baseT EEE status/configuration support - Ethernet NICs embedded and off-the-shelf IP: - Synopsys (stmmac): - cleanup and reorganize SerDes handling and PCS support - cleanup descriptor handling and per-platform data - cleanup and consolidate MDIO defines and handling - shrink driver memory use for internal structures - improve Tx IRQ coalescing - improve TCP segmentation handling - add support for Spacemit K3 - Cadence (macb): - support PHYs that have inband autoneg disabled with GEM - support IEEE 802.3az EEE - rework usrio capabilities and handling - AMD (xgbe): - improve power management for S0i3 - improve TX resilience for link-down handling - Virtual: - Google cloud vNIC: - support larger ring sizes in DQO-QPL mode - improve HW-GRO handling - support UDP GSO for DQO format - PCIe NTB: - support queue count configuration - Ethernet PHYs: - automatically disable PHY autonomous EEE if MAC is in charge - Broadcom: - add BCM84891/BCM84892 support - Micrel: - support for LAN9645X internal PHY - Realtek: - add RTL8224 pair order support - support PHY LEDs on RTL8211F-VD - support spread spectrum clocking (SSC) - Maxlinear: - add PHY-level statistics via ethtool - Ethernet switches: - Maxlinear (mxl862xx): - support for bridge offloading - support for VLANs - support driver statistics - Bluetooth: - large number of fixes and new device IDs - Mediatek: - support MT6639 (MT7927) - support MT7902 SDIO - WiFi: - Intel (iwlwifi): - UNII-9 and continuing UHR work - MediaTek (mt76): - mt7996/mt7925 MLO fixes/improvements - mt7996 NPU support (HW eth/wifi traffic offload) - Qualcomm (ath12k): - monitor mode support on IPQ5332 - basic hwmon temperature reporting - support IPQ5424 - Realtek: - add USB RX aggregation to improve performance - add USB TX flow control by tracking in-flight URBs - Cellular: - IPA v5.2 support" * tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1561 commits) net: pse-pd: fix kernel-doc function name for pse_control_find_by_id() wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit wireguard: allowedips: remove redundant space tools: ynl: add sample for wireguard wireguard: allowedips: Use kfree_rcu() instead of call_rcu() MAINTAINERS: Add netkit selftest files selftests/net: Add additional test coverage in nk_qlease selftests/net: Split netdevsim tests from HW tests in nk_qlease tools/ynl: Make YnlFamily closeable as a context manager net: airoha: Add missing PPE configurations in airoha_ppe_hw_init() net: airoha: Fix VIP configuration for AN7583 SoC net: caif: clear client service pointer on teardown net: strparser: fix skb_head leak in strp_abort_strp() net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete() selftests/bpf: add test for xdp_master_redirect with bond not up net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master net: airoha: Remove PCE_MC_EN_MASK bit in REG_FE_PCE_CFG configuration sctp: disable BH before calling udp_tunnel_xmit_skb() sctp: fix missing encap_port propagation for GSO fragments net: airoha: Rely on net_device pointer in ETS callbacks ...
2026-04-14Merge tag 'bitmap-for-v7.1' of https://github.com/norov/linuxLinus Torvalds
Pull bitmap updates from Yury Norov: - new API: bitmap_weight_from() and bitmap_weighted_xor() (Yury) - drop unused __find_nth_andnot_bit() (Yury) - new tests and test improvements (Andy, Akinobu, Yury) - fixes for count_zeroes API (Yury) - cleanup bitmap_print_to_pagebuf() mess (Yury) - documentation updates (Andy, Kai, Kit). * tag 'bitmap-for-v7.1' of https://github.com/norov/linux: (24 commits) bitops: Update kernel-doc for sign_extendXX() powerpc/xive: simplify xive_spapr_debug_show() thermal: intel: switch cpumask_get() to using cpumask_print_to_pagebuf() coresight: don't use bitmap_print_to_pagebuf() lib/prime_numbers: drop temporary buffer in dump_primes() drm/xe: switch xe_pagefault_queue_init() to using bitmap_weighted_or() ice: use bitmap_empty() in ice_vf_has_no_qs_ena ice: use bitmap_weighted_xor() in ice_find_free_recp_res_idx() bitmap: introduce bitmap_weighted_xor() bitmap: add test_zero_nbits() bitmap: exclude nbits == 0 cases from bitmap test bitmap: test bitmap_weight() for more asm-generic/bitops: Fix a comment typo in instrumented-atomic.h bitops: fix kernel-doc parameter name for parity8() lib: count_zeros: unify count_{leading,trailing}_zeros() lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() lib: crypto: fix comments for count_leading_zeros() x86/topology: use bitmap_weight_from() bitmap: add bitmap_weight_from() lib/find_bit_benchmark: avoid clearing randomly filled bitmap in test_find_first_bit() ...
2026-04-10iavf: fix kernel-doc comment style in iavf_ethtool.cAleksandr Loktionov
iavf_ethtool.c contains 31 kernel-doc comment blocks using the legacy `**/` terminator instead of the correct single `*/`. Two function headers also use a colon separator (`iavf_get_channels:`, `iavf_set_channels:`) instead of the ` - ` dash required by kernel-doc. Additionally several comments embed their return-value descriptions in the body paragraph, producing `scripts/kernel-doc -Wreturn` warnings. Void functions that incorrectly say "Returns ..." are also rephrased. Fix all issues across the full file: - Replace every `**/` terminator with `*/`. - Change `function_name:` doc headers to `function_name -`. - Move inline "Returns ..." sentences into dedicated `Return:` sections for non-void functions (iavf_get_msglevel, iavf_get_rxnfc, iavf_set_channels, iavf_get_rxfh_key_size, iavf_get_rxfh_indir_size, iavf_get_rxfh, iavf_set_rxfh). - Rephrase body descriptions in void functions that incorrectly said "Returns ..." (iavf_get_drvinfo, iavf_get_ringparam, iavf_get_coalesce). - Remove boilerplate body text for iavf_get_rxfh_key_size and iavf_get_rxfh_indir_size; the `Return:` line now conveys the same information without the vague "Returns the table size." sentence. Suggested-by: Anthony L. Nguyen <anthony.l.nguyen@intel.com> Suggested-by: Leszek Pepiak <leszek.pepiak@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Breno Leitao <leitao@debian.org> Reviewed-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20260409093020.3808687-1-aleksandr.loktionov@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-7.0-rc8). Conflicts: net/ipv6/seg6_iptunnel.c c3812651b522f ("seg6: separate dst_cache for input and output paths in seg6 lwtunnel") 78723a62b969a ("seg6: add per-route tunnel source address") https://lore.kernel.org/adZhwtOYfo-0ImSa@sirena.org.uk net/ipv4/icmp.c fde29fd934932 ("ipv4: icmp: fix null-ptr-deref in icmp_build_probe()") d98adfbdd5c01 ("ipv4: drop ipv6_stub usage and use direct function calls") https://lore.kernel.org/adO3dccqnr6j-BL9@sirena.org.uk Adjacent changes: drivers/net/ethernet/stmicro/stmmac/chain_mode.c 51f4e090b9f8 ("net: stmmac: fix integer underflow in chain mode") 6b4286e05508 ("net: stmmac: rename STMMAC_GET_ENTRY() -> STMMAC_NEXT_ENTRY()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09ice: use bitmap_empty() in ice_vf_has_no_qs_enaYury Norov
bitmap_empty() is more verbose and efficient, as it stops traversing {r,t}xq_ena as soon as the 1st set bit found. Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Yury Norov <ynorov@nvidia.com>
2026-04-09ice: use bitmap_weighted_xor() in ice_find_free_recp_res_idx()Yury Norov
Use the right helper and save one bitmaps traverse. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Yury Norov <ynorov@nvidia.com>
2026-04-06e1000: check return value of e1000_read_eepromAgalakov Daniil
[Why] e1000_set_eeprom() performs a read-modify-write operation when the write range is not word-aligned. This requires reading the first and last words of the range from the EEPROM to preserve the unmodified bytes. However, the code does not check the return value of e1000_read_eeprom(). If the read fails, the operation continues using uninitialized data from eeprom_buff. This results in corrupted data being written back to the EEPROM for the boundary words. Add the missing error checks and abort the operation if reading fails. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Co-developed-by: Iskhakov Daniil <dish@amicon.ru> Signed-off-by: Iskhakov Daniil <dish@amicon.ru> Signed-off-by: Agalakov Daniil <ade@amicon.ru> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-04-06igb: remove napi_synchronize() in igb_down()Alex Dvoretsky
When an AF_XDP zero-copy application terminates abruptly (e.g., kill -9), the XSK buffer pool is destroyed but NAPI polling continues. igb_clean_rx_irq_zc() repeatedly returns the full budget, preventing napi_complete_done() from clearing NAPI_STATE_SCHED. igb_down() calls napi_synchronize() before napi_disable() for each queue vector. napi_synchronize() spins waiting for NAPI_STATE_SCHED to clear, which never happens. igb_down() blocks indefinitely, the TX watchdog fires, and the TX queue remains permanently stalled. napi_disable() already handles this correctly: it sets NAPI_STATE_DISABLE. After a full-budget poll, __napi_poll() checks napi_disable_pending(). If set, it forces completion and clears NAPI_STATE_SCHED, breaking the loop that napi_synchronize() cannot. napi_synchronize() was added in commit 41f149a285da ("igb: Fix possible panic caused by Rx traffic arrival while interface is down"). napi_disable() provides stronger guarantees: it prevents further scheduling and waits for any active poll to exit. Other Intel drivers (ixgbe, ice, i40e) use napi_disable() without a preceding napi_synchronize() in their down paths. Remove redundant napi_synchronize() call and reorder napi_disable() before igb_set_queue_napi() so the queue-to-NAPI mapping is only cleared after polling has fully stopped. Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support") Cc: stable@vger.kernel.org Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Patryk Holda <patryk.holda@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-04-06ixgbevf: add missing negotiate_features op to Hyper-V ops tableMichal Schmidt
Commit a7075f501bd3 ("ixgbevf: fix mailbox API compatibility by negotiating supported features") added the .negotiate_features callback to ixgbe_mac_operations and populated it in ixgbevf_mac_ops, but forgot to add it to ixgbevf_hv_mac_ops. This leaves the function pointer NULL on Hyper-V VMs. During probe, ixgbevf_negotiate_api() calls ixgbevf_set_features(), which unconditionally dereferences hw->mac.ops.negotiate_features(). On Hyper-V this results in a NULL pointer dereference: BUG: kernel NULL pointer dereference, address: 0000000000000000 [...] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine [...] Workqueue: events work_for_cpu_fn RIP: 0010:0x0 [...] Call Trace: ixgbevf_negotiate_api+0x66/0x160 [ixgbevf] ixgbevf_sw_init+0xe4/0x1f0 [ixgbevf] ixgbevf_probe+0x20f/0x4a0 [ixgbevf] local_pci_probe+0x50/0xa0 work_for_cpu_fn+0x1a/0x30 [...] Add ixgbevf_hv_negotiate_features_vf() that returns -EOPNOTSUPP and wire it into ixgbevf_hv_mac_ops. The caller already handles -EOPNOTSUPP gracefully. Fixes: a7075f501bd3 ("ixgbevf: fix mailbox API compatibility by negotiating supported features") Reported-by: Xiaoqiang Xiong <xxiong@redhat.com> Closes: https://issues.redhat.com/browse/RHEL-155455 Assisted-by: Claude:claude-4.6-opus-high Cursor Tested-by: Xiaoqiang Xiong <xxiong@redhat.com> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-04-06ixgbe: stop re-reading flash on every get_drvinfo for e610Aleksandr Loktionov
ixgbe_get_drvinfo() calls ixgbe_refresh_fw_version() on every ethtool query for e610 adapters. That ends up in ixgbe_discover_flash_size(), which bisects the full 16 MB NVM space issuing one ACI command per step (~20 ms each, ~24 steps total = ~500 ms). Profiling on an idle E610-XAT2 system with telegraf scraping ethtool stats every 10 seconds: kretprobe:ixgbe_get_drvinfo took 527603 us kretprobe:ixgbe_get_drvinfo took 523978 us kretprobe:ixgbe_get_drvinfo took 552975 us kretprobe:ice_get_drvinfo took 3 us kretprobe:igb_get_drvinfo took 2 us kretprobe:i40e_get_drvinfo took 5 us The half-second stall happens under the RTNL lock, causing visible latency on ip-link and friends. The FW version can only change after an EMPR reset. All flash data is already populated at probe time and the cached adapter->eeprom_id is what get_drvinfo should be returning. The only place that needs to trigger a re-read is ixgbe_devlink_reload_empr_finish(), right after the EMPR completes and new firmware is running. Additionally, refresh the FW version in ixgbe_reinit_locked() so that any PF that undergoes a reinit after an EMPR (e.g. triggered by another PF's devlink reload) also picks up the new version in adapter->eeprom_id. ixgbe_devlink_info_get() keeps its refresh call for explicit "devlink dev info" queries, which is fine given those are user-initiated. Fixes: c9e563cae19e ("ixgbe: add support for devlink reload") Co-developed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-04-06ice: fix PTP timestamping broken by SyncE code on E825CPetr Oros
The E825C SyncE support added in commit ad1df4f2d591 ("ice: dpll: Support E825-C SyncE and dynamic pin discovery") introduced a SyncE reconfiguration block in ice_ptp_link_change() that prevents ice_ptp_port_phy_restart() from being called in several error paths. Without the PHY restart, PTP timestamps stop working after any link change event. There are three ways the PHY restart gets blocked: 1. When DPLL initialization fails (e.g. missing ACPI firmware node properties), ICE_FLAG_DPLL is not set and the function returns early before reaching the PHY restart. 2. When ice_tspll_bypass_mux_active_e825c() fails to read the CGU register, WARN_ON_ONCE fires and the function returns early. 3. When ice_tspll_cfg_synce_ethdiv_e825c() fails to configure the clock divider for an active pin, same early return. SyncE and PTP are independent features. SyncE reconfiguration failures must not prevent the PTP PHY restart that is essential for timestamp recovery after link changes. Fix by making the entire SyncE block conditional on ICE_FLAG_DPLL without an early return, and replacing the WARN_ON_ONCE + return error handling inside the loop with dev_err_once + break. The function always proceeds to ice_ptp_port_phy_restart() regardless of SyncE errors. Fixes: ad1df4f2d591 ("ice: dpll: Support E825-C SyncE and dynamic pin discovery") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-04-06ice: ptp: don't WARN when controlling PF is unavailableKohei Enju
In VFIO passthrough setups, it is possible to pass through only a PF which doesn't own the source timer. In that case the PTP controlling PF (adapter->ctrl_pf) is never initialized in the VM, so ice_get_ctrl_ptp() returns NULL and triggers WARN_ON() in ice_ptp_setup_pf(). Since this is an expected behavior in that configuration, replace WARN_ON() with an informational message and return -EOPNOTSUPP. Fixes: e800654e85b5 ("ice: Use ice_adapter for PTP shared data instead of auxdev") Signed-off-by: Kohei Enju <kohei@enjuk.jp> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-04-06idpf: set the payload size before calling the async handlerEmil Tantilov
Set the payload size before forwarding the reply to the async handler. Without this, xn->reply_sz will be 0 and idpf_mac_filter_async_handler() will never get past the size check. Fixes: 34c21fa894a1 ("idpf: implement virtchnl transaction manager") Cc: stable@vger.kernel.org Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Li Li <boolli@google.com> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-04-06idpf: improve locking around idpf_vc_xn_push_free()Emil Tantilov
Protect the set_bit() operation for the free_xn bitmask in idpf_vc_xn_push_free(), to make the locking consistent with rest of the code and avoid potential races in that logic. Fixes: 34c21fa894a1 ("idpf: implement virtchnl transaction manager") Cc: stable@vger.kernel.org Reported-by: Ray Zhang <sgzhang@google.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>