Age | Commit message | Author
2026-01-22 | iommu/vt-d: Flush piotlb for SVM and Nested domain | Yi Liu
Besides the paging domains that use FS, SVM and Nested domains need to use piotlb invalidation descriptor as well. Fixes: b33125296b50 ("iommu/vt-d: Create unique domain ops for each stage") Cc: stable@vger.kernel.org Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20251223065824.6164-1-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-22 | iommu/vt-d: Flush cache for PASID table before using it | Dmytro Maluka
When writing the address of a freshly allocated zero-initialized PASID table to a PASID directory entry, do that after the CPU cache flush for this PASID table, not before it, to avoid the time window when this PASID table may be already used by non-coherent IOMMU hardware while its contents in RAM are still some random old data, not zero-initialized. Fixes: 194b3348bdbb ("iommu/vt-d: Fix PASID directory pointer coherency") Signed-off-by: Dmytro Maluka <dmaluka@chromium.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20251221123508.37495-1-dmaluka@chromium.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-22 | iommu/vt-d: Flush dev-IOTLB only when PCIe device is accessible in scalable mode | Jinhui Guo
Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected") relies on pci_dev_is_disconnected() to skip ATS invalidation for safely-removed devices, but it does not cover link-down caused by faults, which can still hard-lock the system. For example, if a VM fails to connect to the PCIe device, "virsh destroy" is executed to release resources and isolate the fault, but a hard-lockup occurs while releasing the group fd. Call Trace: qi_submit_sync qi_flush_dev_iotlb intel_pasid_tear_down_entry device_block_translation blocking_domain_attach_dev __iommu_attach_device __iommu_device_set_domain __iommu_group_set_domain_internal iommu_detach_group vfio_iommu_type1_detach_group vfio_group_detach_container vfio_group_fops_release __fput Although pci_device_is_present() is slower than pci_dev_is_disconnected(), it still takes only ~70 µs on a ConnectX-5 (8 GT/s, x2) and becomes even faster as PCIe speed and width increase. Besides, devtlb_invalidation_with_pasid() is called only in the paths below, which are far less frequent than memory map/unmap. 1. mm-struct release 2. {attach,release}_dev 3. set/remove PASID 4. dirty-tracking setup The gain in system stability far outweighs the negligible cost of using pci_device_is_present() instead of pci_dev_is_disconnected() to decide when to skip ATS invalidation, especially under GDR high-load conditions. Fixes: 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected") Cc: stable@vger.kernel.org Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com> Link: https://lore.kernel.org/r/20251211035946.2071-3-guojinhui.liam@bytedance.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-22 | iommu/vt-d: Skip dev-iotlb flush for inaccessible PCIe device without scalable mode | Jinhui Guo
PCIe endpoints with ATS enabled and passed through to userspace (e.g., QEMU, DPDK) can hard-lock the host when their link drops, either by surprise removal or by a link fault. Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected") adds pci_dev_is_disconnected() to devtlb_invalidation_with_pasid() so ATS invalidation is skipped only when the device is being safely removed, but it applies only when Intel IOMMU scalable mode is enabled. With scalable mode disabled or unsupported, a system hard-lock occurs when a PCIe endpoint's link drops because the Intel IOMMU waits indefinitely for an ATS invalidation that cannot complete. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 domain_context_clear_one_cb pci_for_each_dma_alias device_block_translation blocking_domain_attach_dev iommu_deinit_device __iommu_group_remove_device iommu_release_device iommu_bus_notifier blocking_notifier_call_chain bus_notify device_del pci_remove_bus_device pci_stop_and_remove_bus_device pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist Commit 81e921fd3216 ("iommu/vt-d: Fix NULL domain on device release") adds intel_pasid_teardown_sm_context() to intel_iommu_release_device(), which calls qi_flush_dev_iotlb() and can also hard-lock the system when a PCIe endpoint's link drops. 
Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 intel_context_flush_no_pasid device_pasid_table_teardown pci_pasid_table_teardown pci_for_each_dma_alias intel_pasid_teardown_sm_context intel_iommu_release_device iommu_deinit_device __iommu_group_remove_device iommu_release_device iommu_bus_notifier blocking_notifier_call_chain bus_notify device_del pci_remove_bus_device pci_stop_and_remove_bus_device pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist Sometimes the endpoint loses connection without a link-down event (e.g., due to a link fault); killing the process (virsh destroy) then hard-locks the host. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 domain_context_clear_one_cb pci_for_each_dma_alias device_block_translation blocking_domain_attach_dev __iommu_attach_device __iommu_device_set_domain __iommu_group_set_domain_internal iommu_detach_group vfio_iommu_type1_detach_group vfio_group_detach_container vfio_group_fops_release __fput pci_dev_is_disconnected() only covers safe-removal paths; pci_device_is_present() tests accessibility by reading vendor/device IDs and internally calls pci_dev_is_disconnected(). On a ConnectX-5 (8 GT/s, x2) this costs ~70 µs. Since __context_flush_dev_iotlb() is only called on {attach,release}_dev paths (not hot), add pci_device_is_present() there to skip inaccessible devices and avoid the hard-lock. Fixes: 37764b952e1b ("iommu/vt-d: Global devTLB flush when present context entry changed") Fixes: 81e921fd3216 ("iommu/vt-d: Fix NULL domain on device release") Cc: stable@vger.kernel.org Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com> Link: https://lore.kernel.org/r/20251211035946.2071-2-guojinhui.liam@bytedance.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
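The difference between the two checks discussed in the two commits above can be pictured with a small userspace mock. This is only a sketch under stated assumptions: `struct mock_dev` and every `mock_*` function are hypothetical stand-ins, not kernel APIs, and the only hardware behaviour assumed is that a config-space read from an unreachable PCIe function completes with all ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PCIe endpoint; not a kernel structure. */
struct mock_dev {
	bool link_up;         /* false models a dropped or faulted link */
	bool marked_removed;  /* models the flag set on the safe-removal path */
	uint32_t vendor_dev;  /* vendor/device ID dword (arbitrary value) */
	unsigned int flushes; /* invalidations that were actually issued */
};

/* Assumption: reads from an unreachable function return all ones. */
uint32_t mock_read_vendor_id(const struct mock_dev *d)
{
	return d->link_up ? d->vendor_dev : 0xffffffffu;
}

/* Cheap check: only sees devices already flagged as removed. */
bool mock_dev_is_disconnected(const struct mock_dev *d)
{
	return d->marked_removed;
}

/* Slower check: includes the cheap one, then probes the device itself. */
bool mock_device_is_present(const struct mock_dev *d)
{
	if (mock_dev_is_disconnected(d))
		return false;
	return mock_read_vendor_id(d) != 0xffffffffu;
}

/* Issue the invalidation only if the device can still answer it. */
bool mock_flush_dev_iotlb(struct mock_dev *d)
{
	if (!mock_device_is_present(d))
		return false; /* skip: waiting for completion would never end */
	d->flushes++;
	return true;
}
```

In this mock, a device whose link has dropped without being flagged as removed passes the cheap check but fails the probing one, which is the case the commit messages describe.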
2026-01-22 | rust: iommu: fix `srctree` link warning | Miguel Ojeda
The Rust kernel code should be kept `rustdoc`-clean [1]. Our custom `srctree` link checker in the `rustdoc` target reports: warning: srctree/ link to include/io-pgtable.h does not exist Thus fix it. Link: https://rust-for-linux.com/contributing#submit-checklist-addendum [1] Fixes: 2e2f6b0ef855 ("rust: iommu: add io_pgtable abstraction") Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-22 | rust: iommu: fix Rust formatting | Miguel Ojeda
The Rust kernel code should be kept `rustfmt`-clean [1]. Thus run the `rustfmt` target to fix the formatting issue. Link: https://rust-for-linux.com/contributing#submit-checklist-addendum [1] Fixes: 2e2f6b0ef855 ("rust: iommu: add io_pgtable abstraction") Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-22 | media: uvcvideo: Pass allocation size directly to uvc_alloc_urb_buffer | Ricardo Ribalda
The uvc_alloc_urb_buffer() function implicitly depended on the stream->urb_size field, which was set by its caller, uvc_alloc_urb_buffers(). This implicit data flow makes the code harder to follow. More importantly, stream->urb_size was updated within the allocation loop before the allocation was confirmed to be successful. If the allocation failed, the stream object would be left with a urb_size that doesn't correspond to valid, allocated URB buffers. Refactor uvc_alloc_urb_buffer() to accept the buffer size as an explicit argument. This makes the function's dependencies clear and improves the robustness of the error handling path. The stream->urb_size is now set only after a complete and successful allocation. This is a pure refactoring and introduces no functional changes. Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Tested-by: Itay Chamiel <itay.chamiel@q.ai> Link: https://patch.msgid.link/20260114-uvc-alloc-urb-v1-2-cedf3fb66711@chromium.org Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-22 | media: uvcvideo: Fix allocation for small frame sizes | Ricardo Ribalda
If a frame's size is less than or equal to one packet size, uvc_alloc_urb_buffers() is unable to allocate memory for it due to an off-by-one error. Fix the off-by-one error and, while we are at it, make sure that stream->urb_size always has a valid value when we return from the function, even when an error happens. Fixes: efdc8a9585ce ("V4L/DVB (10295): uvcvideo: Retry URB buffers allocation when the system is low on memory.") Reported-by: Itay Chamiel <itay.chamiel@q.ai> Closes: https://lore.kernel.org/linux-media/CANiDSCsSoZf2LsCCoWAUbCg6tJT-ypXR1B85aa6rAdMVYr2iBQ@mail.gmail.com/T/#t Co-developed-by: Itay Chamiel <itay.chamiel@q.ai> Signed-off-by: Itay Chamiel <itay.chamiel@q.ai> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Tested-by: Itay Chamiel <itay.chamiel@q.ai> Link: https://patch.msgid.link/20260114-uvc-alloc-urb-v1-1-cedf3fb66711@chromium.org Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
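The kind of off-by-one described above can be illustrated with a standalone sketch. It assumes a retry loop that halves the packet count until an allocation succeeds; the function names, the `stop` parameter and the allocator callback are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Packets needed to hold `size` bytes, rounded up. */
unsigned int packets_for(unsigned int size, unsigned int psize)
{
	assert(psize > 0);
	return (size + psize - 1) / psize;
}

/*
 * Halve the request until try_alloc() succeeds. `stop` is the largest
 * packet count that is NOT attempted: stop == 1 models a loop written
 * as "npackets > 1", stop == 0 models "npackets > 0".
 * Returns the packet count that was allocated, or 0 on failure.
 */
unsigned int alloc_with_retry(unsigned int npackets, unsigned int psize,
			      unsigned int stop, int (*try_alloc)(size_t))
{
	for (; npackets > stop; npackets /= 2)
		if (try_alloc((size_t)npackets * psize))
			return npackets;
	return 0;
}

/* Hypothetical allocators used to exercise the loop. */
int always_ok(size_t bytes)
{
	(void)bytes;
	return 1;
}

int ok_up_to_16k(size_t bytes)
{
	return bytes <= 16384;
}
```

With `stop == 1`, a frame that fits in a single packet is never attempted and the function reports failure even though memory is available; with `stop == 0` the single-packet case is tried like any other.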
2026-01-22 | media: uvcvideo: Return queued buffers on start_streaming() failure | Michal Pecio
Return buffers if streaming fails to start due to uvc_pm_get() error. This bug may be responsible for a warning I got running while :; do yavta -c3 /dev/video0; done on an xHCI controller which failed under this workload. I had no luck reproducing this warning again to confirm. xhci_hcd 0000:09:00.0: HC died; cleaning up usb 13-2: USB disconnect, device number 2 WARNING: CPU: 2 PID: 29386 at drivers/media/common/videobuf2/videobuf2-core.c:1803 vb2_start_streaming+0xac/0x120 Fixes: 7dd56c47784a ("media: uvcvideo: Remove stream->is_streaming field") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Reviewed-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://patch.msgid.link/20251015133642.3dede646.michal.pecio@gmail.com Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-22 | media: uvcvideo: Create an ID namespace for streaming output terminals | Ricardo Ribalda
Some devices, such as the Grandstream GUV3100 and the LSK Meeting Eye for Business & Home, exhibit entity ID collisions between units and streaming output terminals. The UVC specification requires unit and terminal IDs to be unique, and uses the ID to reference entities:
- In control requests, to identify the target entity
- In the UVC units and terminals descriptors' bSourceID field, to identify source entities
- In the UVC input header descriptor's bTerminalLink, to identify the terminal associated with a streaming interface
Entity ID collisions break accessing controls and make the graph description in the UVC descriptors ambiguous. However, collisions where one of the entities is a streaming output terminal and the other entity is not a streaming terminal are less severe. Streaming output terminals have no controls, and, as they are the final entity in pipelines, they are never referenced in descriptors as source entities. They are referenced by ID only from input header descriptors, which by definition only reference streaming terminals. For these reasons, we can work around the collision by giving streaming output terminals their own ID namespace. Do so by setting bit UVC_TERM_OUTPUT (15) in the uvc_entity.id field, which is normally never set as the ID is an 8-bit value. This ID change doesn't affect the entity name in the media controller graph as the name isn't constructed from the ID, so there should not be any impact on the uAPI. Although this change handles some ID collisions automagically, keep printing an error in uvc_alloc_new_entity() when a camera has invalid descriptors. Hopefully this message will help vendors fix their invalid descriptors. This new method of handling ID collisions includes a revert of commit 758dbc756aad ("media: uvcvideo: Use heuristic to find stream entity") that attempted to fix the problem urgently due to regression reports. 
Suggested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Tested-by: Lili Orosz <lily@floofy.city> Co-developed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://patch.msgid.link/20251113210400.28618-1-laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
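The namespacing idea from the commit above can be sketched in a few lines. This is an illustration only: `TERM_OUTPUT_BIT` and `entity_key()` are hypothetical names, and the only detail taken from the commit message is that bit 15 is set on top of an 8-bit descriptor ID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 15, as named in the commit message; the macro name is made up. */
#define TERM_OUTPUT_BIT (1u << 15)

/*
 * Map an 8-bit descriptor ID to the key used for lookups. Streaming
 * output terminals get bit 15 set, so they can never collide with a
 * unit or another terminal that reuses the same 8-bit ID.
 */
uint16_t entity_key(uint8_t desc_id, bool is_streaming_output)
{
	if (is_streaming_output)
		return (uint16_t)(desc_id | TERM_OUTPUT_BIT);
	return desc_id;
}
```

Because descriptor IDs occupy only the low 8 bits, the original ID can still be recovered by masking with 0xff.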
2026-01-21 | Merge tag 'hyperv-fixes-signed-20260121' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux | Linus Torvalds
Pull hyperv fixes from Wei Liu: - Fix ARM64 port of the MSHV driver (Anirudh Rayabharam) - Fix huge page handling in the MSHV driver (Stanislav Kinsburskii) - Minor fixes to driver code (Julia Lawall, Michael Kelley) * tag 'hyperv-fixes-signed-20260121' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: mshv: handle gpa intercepts for arm64 mshv: add definitions for arm64 gpa intercepts mshv: Add __user attribute to argument passed to access_ok() mshv: Store the result of vfs_poll in a variable of type __poll_t mshv: Align huge page stride with guest mapping Drivers: hv: Always do Hyper-V panic notification in hv_kmsg_dump() Drivers: hv: vmbus: fix typo in function name reference
2026-01-21 | Merge tag 'perf-tools-fixes-for-v6.19-2026-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools | Linus Torvalds
Pull perf-tools fix from Namhyung Kim: "A minor fix for error handling in the event parser" * tag 'perf-tools-fixes-for-v6.19-2026-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf parse-events: Fix evsel allocation failure
2026-01-21 | Merge tag 'nf-next-26-01-20' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next | Jakub Kicinski
Florian Westphal says: ==================== Subject: netfilter: updates for net-next
1) Speed up nftables transactions after earlier transaction failed. Due to a (harmless) bug we remained in slow paranoia mode until a successful transaction completes.
2) Allow generic tracker to resolve clashes, this avoids very rare packet drops. From Yuto Hamaguchi.
3) Increase the cleanup budget to 64 entries in nf_conncount to reap more entries in one go, from Fernando Fernandez Mancera.
4) Allow icmp trackers to resolve clashes, this avoids very rare initial packet drop with test cases that have high-frequency pings. After this all trackers except tcp and sctp allow clash resolution.
5) Disentangle netfilter headers, don't include nftables/xtables headers in subsystems that are unrelated.
6) Don't rely on implicit includes coming from nf_conntrack_proto_gre.h.
7) Allow nfnetlink_queue nfq instance struct to get accounted via memcg, from Scott Mitchell.
8) Reject bogus xt target/match data upfront via netlink policy in nft_compat interface rather than relying on x_tables API to do it.
9) Fix nf_conncount breakage when trying to limit loopback flows via prerouting rule, from Fernando Fernandez Mancera. This is a recent breakage but not seen as urgent enough to rush this via net tree at this late stage in development cycle.
10) Fix a possible off-by-one when parsing tcp option in xtables tcpmss match. Also handled via -next due to late stage in development cycle. 
* tag 'nf-next-26-01-20' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: xt_tcpmss: check remaining length before reading optlen netfilter: nf_conncount: fix tracking of connections from localhost netfilter: nft_compat: add more restrictions on netlink attributes netfilter: nfnetlink_queue: nfqnl_instance GFP_ATOMIC -> GFP_KERNEL_ACCOUNT allocation netfilter: nf_conntrack: don't rely on implicit includes netfilter: don't include xt and nftables.h in unrelated subsystems netfilter: nf_conntrack: enable icmp clash support netfilter: nf_conncount: increase the connection clean up limit to 64 netfilter: nf_conntrack: Add allow_clash to generic protocol handler netfilter: nf_tables: reset table validation state on abort ==================== Link: https://patch.msgid.link/20260120191803.22208-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
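Item 10 above concerns checking the remaining length before reading a TCP option's length byte. The following is a generic sketch of that check in a standalone option walker; it is not the xt_tcpmss code, and the `OPT_*` macros and `find_mss()` are names chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_EOL     0 /* end of option list */
#define OPT_NOP     1 /* one-byte padding option */
#define OPT_MSS     2 /* maximum segment size */
#define OPT_MSS_LEN 4 /* kind + length + 16-bit value */

/*
 * Walk a TCP option block of `len` bytes and return the MSS value,
 * or -1 if the option is absent or the block is malformed.
 */
int find_mss(const uint8_t *opt, size_t len)
{
	size_t i = 0;

	while (i < len) {
		uint8_t optlen;

		if (opt[i] == OPT_EOL)
			break;
		if (opt[i] == OPT_NOP) {
			i++;
			continue;
		}
		/* The length byte lives at i + 1: make sure it exists. */
		if (len - i < 2)
			return -1;
		optlen = opt[i + 1];
		/* A length below 2 or past the end of the block is bogus. */
		if (optlen < 2 || optlen > len - i)
			return -1;
		if (opt[i] == OPT_MSS && optlen == OPT_MSS_LEN)
			return (opt[i + 2] << 8) | opt[i + 3];
		i += optlen;
	}
	return -1;
}
```

Without the `len - i < 2` test, an option kind sitting in the very last byte of the block would cause a read one byte past the end.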
2026-01-21 | octeontx2-af: Fix error handling | Ratheesh Kannoth
This commit adds error handling and rollback logic to rvu_mbox_handler_attach_resources() to properly clean up partially attached resources when rvu_attach_block() fails. Fixes: 746ea74241fa0 ("octeontx2-af: Add RVU block LF provisioning support") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260121033934.1900761-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
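The rollback pattern described above, undoing the resources that were already attached when a later attach fails, can be sketched generically. Everything here is a hypothetical mock (`struct mock_rvu`, `mock_attach()`, `mock_detach()`, `attach_all()`); it does not reproduce the driver's data structures or block types.

```c
#include <stdbool.h>

#define NBLOCKS 4 /* arbitrary number of resource blocks for the mock */

/* Hypothetical device state; fail_at < 0 means no attach ever fails. */
struct mock_rvu {
	bool attached[NBLOCKS];
	int fail_at;
};

/* Returns 0 on success, a negative value on failure. */
int mock_attach(struct mock_rvu *r, int blk)
{
	if (blk == r->fail_at)
		return -1;
	r->attached[blk] = true;
	return 0;
}

void mock_detach(struct mock_rvu *r, int blk)
{
	r->attached[blk] = false;
}

/* Attach every block, or leave nothing attached if any step fails. */
int attach_all(struct mock_rvu *r)
{
	int blk, err;

	for (blk = 0; blk < NBLOCKS; blk++) {
		err = mock_attach(r, blk);
		if (err)
			goto rollback;
	}
	return 0;

rollback:
	/* Undo only what succeeded, in reverse order. */
	while (--blk >= 0)
		mock_detach(r, blk);
	return err;
}
```

The caller then sees either a fully attached set of blocks or the state it started with, never a partial mix.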
2026-01-21 | net: pcs: pcs-mtk-lynxi: report in-band capability for 2500Base-X | Daniel Golle
It turns out that 2500Base-X actually works fine with in-band status on MediaTek's LynxI PCS -- I wrongly concluded it didn't because it is broken in all the copper SFP modules and GPON sticks I used for testing. Hence report LINK_INBAND_ENABLE also for 2500Base-X mode. This reverts most of commit a003c38d9bbb ("net: pcs: pcs-mtk-lynxi: correctly report in-band status capabilities"). The removal of the QSGMII interface mode was correct and is left untouched. Link: https://github.com/openwrt/openwrt/issues/21436 Fixes: a003c38d9bbb ("net: pcs: pcs-mtk-lynxi: correctly report in-band status capabilities") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/b1cf26157b63fee838be09ae810497fb22fd8104.1768961746.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 | rxrpc: Fix data-race warning and potential load/store tearing | David Howells
Fix the following: BUG: KCSAN: data-race in rxrpc_peer_keepalive_worker / rxrpc_send_data_packet which is reporting an issue with the reads and writes to ->last_tx_at in: conn->peer->last_tx_at = ktime_get_seconds(); and: keepalive_at = peer->last_tx_at + RXRPC_KEEPALIVE_TIME; The lockless accesses to these two values aren't actually a problem as the read only needs an approximate time of last transmission for the purposes of deciding whether or not the transmission of a keepalive packet is warranted yet. Also, as ->last_tx_at is a 64-bit value, tearing can occur on a 32-bit arch. Fix both of these by switching to an unsigned int for ->last_tx_at and only storing the LSW of the time64_t. It can then be reconstructed at need provided no more than 68 years have elapsed since the last transmission. Fixes: ace45bec6d77 ("rxrpc: Fix firewall route keepalive") Reported-by: syzbot+6182afad5045e6703b3d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/695e7cfb.050a0220.1c677c.036b.GAE@google.com/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/1107124.1768903985@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
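One way to picture "storing the LSW and reconstructing at need" is the sketch below. It is an assumption about how such a reconstruction can be done, not necessarily what the rxrpc code does; `time_lsw()` and `time_from_lsw()` are hypothetical helper names. The 68-year limit corresponds to 2^31 seconds.

```c
#include <stdint.h>

/* Keep only the least significant 32 bits of a 64-bit seconds count. */
uint32_t time_lsw(int64_t t)
{
	return (uint32_t)t;
}

/*
 * Rebuild the full 64-bit value from its low word, assuming the
 * original time lies within 2^31 seconds (about 68 years) of `now`.
 */
int64_t time_from_lsw(int64_t now, uint32_t lsw)
{
	/*
	 * (now - t) modulo 2^32, reinterpreted as signed. The conversion
	 * to int32_t relies on two's complement wrap-around, which is
	 * what mainstream compilers implement.
	 */
	int32_t delta = (int32_t)((uint32_t)now - lsw);

	return now - delta;
}
```

A 32-bit value can be loaded and stored in one access on 32-bit architectures, which is what removes the tearing concern mentioned in the commit message.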
2026-01-21 | Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue | Jakub Kicinski
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2026-01-20 (ice, idpf) For ice: Cody Haas breaks dependency of needing both RSS key and LUT for ice_get_rxfh() as ethtool ioctls do not always supply both. Paul fixes issues related to devlink reload; adding missing deinit HW call and moving hwmon exit function to the proper call chain. For idpf: Mina Almasry moves a register read call into the time sandwich to ensure the register is properly flushed. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: idpf: read lower clock bits inside the time sandwich ice: fix devlink reload call trace ice: add missing ice_deinit_hw() in devlink reinit path ice: Fix persistent failure in ice_get_rxfh ==================== Link: https://patch.msgid.link/20260120224430.410377-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 | net: dsa: fix off-by-one in maximum bridge ID determination | Vladimir Oltean
Prior to the blamed commit, the bridge_num range was from 0 to ds->max_num_bridges - 1. After the commit, it is from 1 to ds->max_num_bridges. So this check: if (bridge_num >= max) return 0; must be updated to: if (bridge_num > max) return 0; in order to allow the last bridge_num value (==max) to be used. This is most easily visible when a driver sets ds->max_num_bridges=1. The observed behaviour is that even the first created bridge triggers the netlink extack "Range of offloadable bridges exceeded" warning, and is handled in software rather than being offloaded. Fixes: 3f9bb0301d50 ("net: dsa: make dp->bridge_num one-based") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260120211039.3228999-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
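The effect of the two comparisons quoted above can be checked with a small standalone model. The bitmask allocator, `first_free()` and `bridge_num_find()` are hypothetical; only the one-based numbering and the `>=` versus `>` comparison come from the commit message.

```c
#include <stdbool.h>

/* Lowest one-based number whose bit is clear in `used_mask`. */
unsigned int first_free(unsigned long used_mask)
{
	unsigned int n;

	for (n = 1; n < 8 * sizeof(used_mask); n++)
		if (!(used_mask & (1UL << n)))
			return n;
	return n; /* nothing free: larger than any sensible max */
}

/*
 * Pick a bridge number in the range 1..max, or return 0 when the
 * range is exhausted. `old_check` selects the pre-fix comparison.
 */
unsigned int bridge_num_find(unsigned long used_mask, unsigned int max,
			     bool old_check)
{
	unsigned int bridge_num = first_free(used_mask);

	if (old_check ? bridge_num >= max : bridge_num > max)
		return 0;
	return bridge_num;
}
```

With `max == 1` and nothing in use, the old comparison rejects bridge number 1 while the corrected one accepts it, which matches the symptom described for drivers that set ds->max_num_bridges=1.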
2026-01-21 | Merge branch 'phylink-link-callback-replay-helpers-for-sja1105-and-xpcs' | Jakub Kicinski
Vladimir Oltean says: ==================== Phylink link callback replay helpers for SJA1105 and XPCS The sja1105 is reducing its direct interaction with the XPCS. The changes presented here are an older simplification idea, broken out of a previous patch set to allow for more thorough review. ==================== Link: https://patch.msgid.link/20260119121954.1624535-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 | net: dsa: sja1105: re-merge sja1105_set_port_speed() and sja1105_set_port_config() | Vladimir Oltean
Commit a18891b55703 ("net: dsa: sja1105: simplify static configuration reload") split sja1105_mac_link_up() -> sja1105_adjust_port_config() into two separate: - sja1105_set_port_speed() - sja1105_set_port_config() in order to pick up the second sja1105_set_port_config() and reuse it for the sja1105_static_config_reload() procedure which involves saving and restoring MAC and PCS settings. Now that these settings are restored by phylink itself, the driver no longer needs to call its own sja1105_set_port_config(), and the splitting is unnatural. Merge the functions back, which is to say that the only supported internal code path is to submit the MAC Configuration Table entry to hardware after phylink has dictated what we should set it to. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260119121954.1624535-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 | net: dsa: sja1105: let phylink help with the replay of link callbacks | Vladimir Oltean
sja1105_static_config_reload() changes major settings in the switch and it requires a reset. A use case is to change things like Qdiscs (but see sja1105_reset_reasons[] for full list) while PTP synchronization is running, and the servo loop must not exit the locked state (s2). Therefore, stopping and restarting the phylink instances of all ports is not desirable, because that also stops the phylib state machine, and retriggers a seconds-long auto-negotiation process that breaks PTP. Thus, saving and restoring the link management settings is handled privately by the driver. The method got progressively more complex as SGMII support got added, because this is handled through the xpcs phylink_pcs component, to which we don't have unfettered access. Nonetheless, the switch reset line is hardwired to also reset the XPCS, creating a situation where it loses state and needs to be reprogrammed at a moment in time outside phylink's control. Although commits 907476c66d73 ("net: dsa: sja1105: call PCS config/link_up via pcs_ops structure") and 41bf58314b17 ("net: dsa: sja1105: use phylink_pcs internally") made the sja1105 <-> xpcs interaction slightly prettier, we still depend heavily on the PCS being "XPCS-like", because to back up its settings, we read the MII_BMCR register, through a mdiobus_c45_read() operation, breaking all layering separation. With the existence of phylink link callback replay helpers, we can do away with all this custom code and become even more PCS-agnostic, even though the reset domain is tightly coupled. This creates the unique opportunity to simplify away even more code than just the xpcs handling from sja1105_static_config_reload(). The sja1105_set_port_config() method is also invoked from sja1105_mac_link_up(). And since that is now called directly by phylink - we can just remove it from sja1105_static_config_reload(). This makes it possible to re-merge sja1105_set_port_speed() and sja1105_set_port_config() in a later change. 
Note that my only setups with sja1105 where the xpcs is used is with the xpcs on the CPU-facing port (fixed-link). Thus, I cannot test xpcs + PHY. But the replay procedure walks through all ports, and I did test a regular RGMII user port + a PHY. ptp4l[54.552]: master offset 5 s2 freq -931 path delay 764 ptp4l[55.551]: master offset 22 s2 freq -913 path delay 764 ptp4l[56.551]: master offset 13 s2 freq -915 path delay 765 ptp4l[57.552]: master offset 5 s2 freq -919 path delay 765 ptp4l[58.553]: master offset 13 s2 freq -910 path delay 765 ptp4l[59.553]: master offset 13 s2 freq -906 path delay 765 ptp4l[60.553]: master offset 6 s2 freq -909 path delay 765 ptp4l[61.553]: master offset 6 s2 freq -907 path delay 765 ptp4l[62.553]: master offset 6 s2 freq -906 path delay 765 ptp4l[63.553]: master offset 14 s2 freq -896 path delay 765 $ ip link set br0 type bridge vlan_filtering 1 [ 63.983283] sja1105 spi2.0 sw0p0: Link is Down [ 63.991913] sja1105 spi2.0: Link is Down [ 64.009784] sja1105 spi2.0: Reset switch and programmed static config. 
Reason: VLAN filtering [ 64.020217] sja1105 spi2.0 sw0p0: Link is Up - 1Gbps/Full - flow control off [ 64.030683] sja1105 spi2.0: Link is Up - 1Gbps/Full - flow control off ptp4l[64.554]: master offset 7397 s2 freq +6491 path delay 765 ptp4l[65.554]: master offset 38 s2 freq +1352 path delay 765 ptp4l[66.554]: master offset -2225 s2 freq -900 path delay 764 ptp4l[67.555]: master offset -2226 s2 freq -1569 path delay 765 ptp4l[68.555]: master offset -1553 s2 freq -1563 path delay 765 ptp4l[69.555]: master offset -865 s2 freq -1341 path delay 765 ptp4l[70.555]: master offset -401 s2 freq -1137 path delay 765 ptp4l[71.556]: master offset -145 s2 freq -1001 path delay 765 ptp4l[72.558]: master offset -26 s2 freq -926 path delay 765 ptp4l[73.557]: master offset 30 s2 freq -877 path delay 765 ptp4l[74.557]: master offset 47 s2 freq -851 path delay 765 ptp4l[75.557]: master offset 29 s2 freq -855 path delay 765 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260119121954.1624535-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 | net: phylink: introduce helpers for replaying link callbacks | Vladimir Oltean
Some drivers of MAC + tightly integrated PCS (example: SJA1105 + XPCS covered by same reset domain) need to perform resets at runtime. The reset is triggered by the MAC driver, and it needs to restore its and the PCS' registers, all invisible to phylink. However, there is a desire to simplify the API through which the MAC and the PCS interact, so this becomes challenging. Phylink holds all the necessary state to help with this operation, and can offer two helpers which walk the MAC and PCS drivers again through the callbacks required during a destructive reset operation. The procedure is as follows: Before reset, MAC driver calls phylink_replay_link_begin(): - Triggers phylink mac_link_down() and pcs_link_down() methods After reset, MAC driver calls phylink_replay_link_end(): - Triggers phylink mac_config() -> pcs_config() -> mac_link_up() -> pcs_link_up() methods. MAC and PCS registers are restored with no other custom driver code. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20260119121954.1624535-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
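The begin/reset/end procedure described above can be modelled in userspace to show the callback order. This is a mock only: the real helpers are named phylink_replay_link_begin() and phylink_replay_link_end() in the commit message, but their signatures are not shown there, so `struct mock_phylink` and all `mock_*` functions below are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace mock; none of these types exist in the kernel. */
struct mock_phylink {
	int speed;      /* link state that "phylink" already tracks */
	int mac_speed;  /* stand-in for a MAC register */
	int pcs_speed;  /* stand-in for a PCS register */
	bool link_up;
	char log[160];  /* order in which the callbacks ran */
};

/* Append a callback name and a trailing space to the log. */
static void record(struct mock_phylink *pl, const char *name)
{
	strncat(pl->log, name, sizeof(pl->log) - strlen(pl->log) - 2);
	strcat(pl->log, " ");
}

/* Before the reset: take the link down through the usual callbacks. */
void mock_replay_link_begin(struct mock_phylink *pl)
{
	record(pl, "mac_link_down");
	record(pl, "pcs_link_down");
	pl->link_up = false;
}

/* The destructive reset wipes both register sets. */
void mock_hw_reset(struct mock_phylink *pl)
{
	record(pl, "reset");
	pl->mac_speed = 0;
	pl->pcs_speed = 0;
}

/* After the reset: replay config and link-up from the tracked state. */
void mock_replay_link_end(struct mock_phylink *pl)
{
	record(pl, "mac_config");
	record(pl, "pcs_config");
	pl->mac_speed = pl->speed;
	record(pl, "mac_link_up");
	pl->pcs_speed = pl->speed;
	record(pl, "pcs_link_up");
	pl->link_up = true;
}
```

In the mock, the driver side never saves or restores registers itself; everything is rebuilt from the state the link-management layer already holds, which is the point the commit message makes.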
2026-01-21 | net: phylink: simplify phylink_resolve() -> phylink_major_config() path | Vladimir Oltean
This is a trivial change with no functional effect which replaces the pattern: if (a) { if (b) { do_stuff(); } } with: if (a && b) { do_stuff(); }; The purpose is to reduce the delta of a subsequent functional change. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20260119121954.1624535-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 | Merge branch 'phy-polarity-inversion-via-generic-device-tree-properties' | Jakub Kicinski
Vladimir Oltean says: ==================== PHY polarity inversion via generic device tree properties Using the "rx-polarity" and "tx-polarity" device tree properties introduced in linux-phy and merged into net-next in commit 96a2d53f2478 ("Merge tag 'phy_common_properties' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy") we convert here two existing networking use cases - the EN8811H Ethernet PHY and the Mediatek LynxI PCS. Original cover letter: Polarity inversion (described in patch 4/10) is a feature with at least 4 potential new users waiting for a generic description: - Horatiu Vultur with the lan966x SerDes - Daniel Golle with the MaxLinear GSW1xx switches - Bjørn Mork with the AN8811HB Ethernet PHY - Me with a custom SJA1105 board, switch which uses the DesignWare XPCS I became interested in exploring the problem space because I was averse to the idea of adding vendor-specific device tree properties to describe a common need. This set contains an implementation of a generic feature that should cater to all known needs that were identified during my documentation phase. Apart from what is converted here, we also have the following, which I did not touch: - "st,px_rx_pol_inv" - its binding is a .txt file and I don't have time for such a large detour to convert it to dtschema. - "st,pcie-tx-pol-inv" and "st,sata-tx-pol-inv" - these are defined in a .txt schema but are not implemented in any driver. My verdict would be "delete the properties" but again, I would prefer not introducing such dependency to this series. ==================== Link: https://patch.msgid.link/20260119091220.1493761-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 | net: pcs: pcs-mtk-lynxi: deprecate "mediatek,pnswap" | Vladimir Oltean
Prefer the new "rx-polarity" and "tx-polarity" properties, which in this case have the advantage that polarity inversion can be specified per direction (and per protocol, although this isn't useful here). We use the vendor specific ones as fallback if the standard description doesn't exist. Daniel, referring to the Mediatek SDK, clarifies that the combined SGMII_PN_SWAP_TX_RX register field should be split like this: bit 0 is TX and bit 1 is RX: https://lore.kernel.org/linux-phy/aSW--slbJWpXK0nv@makrotopia.org/ Suggested-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260119091220.1493761-6-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
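The bit split and the fallback order described above can be sketched as follows. The macro and function names are hypothetical, the values are relative to the field rather than to the full register, and treating the legacy property as "invert both directions" is an assumption made for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field-relative bits, per the split quoted in the commit message. */
#define PN_SWAP_TX (1u << 0)
#define PN_SWAP_RX (1u << 1)

/* Build the field value from per-direction inversion requests. */
uint32_t pn_swap_field(bool invert_tx, bool invert_rx)
{
	uint32_t val = 0;

	if (invert_tx)
		val |= PN_SWAP_TX;
	if (invert_rx)
		val |= PN_SWAP_RX;
	return val;
}

/*
 * Prefer the standard per-direction description when it exists;
 * otherwise fall back to the legacy combined flag (assumed here to
 * invert both directions).
 */
uint32_t pn_swap_resolve(bool has_standard, bool std_tx, bool std_rx,
			 bool legacy_pnswap)
{
	if (has_standard)
		return pn_swap_field(std_tx, std_rx);
	return pn_swap_field(legacy_pnswap, legacy_pnswap);
}
```

The per-direction form can express TX-only or RX-only inversion, which a single combined flag cannot.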
2026-01-21net: pcs: pcs-mtk-lynxi: pass SGMIISYS OF node to PCSVladimir Oltean
The Mediatek LynxI PCS is used from the MT7530 DSA driver (where it does not have an OF presence) and from mtk_eth_soc, where it does (Documentation/devicetree/bindings/net/pcs/mediatek,sgmiisys.yaml informs of a combined clock provider + SGMII PCS "SGMIISYS" syscon block). Currently, mtk_eth_soc parses the SGMIISYS OF node for the "mediatek,pnswap" property and sets a bit in the "flags" argument of mtk_pcs_lynxi_create() if set. I'd like to deprecate "mediatek,pnswap" in favour of a property which takes the current phy-mode into consideration. But this is only known at mtk_pcs_lynxi_config() time, and not known at mtk_pcs_lynxi_create(), when the SGMIISYS OF node is parsed. To achieve that, we must pass the OF node of the PCS, if it exists, to mtk_pcs_lynxi_create(), and let the PCS take a reference on it and handle property parsing whenever it wants. Use the fwnode API which is more general than OF (in case we ever need to describe the PCS using some other format). This API should be NULL tolerant, so add no particular tests for the mt7530 case. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260119091220.1493761-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21dt-bindings: net: pcs: mediatek,sgmiisys: deprecate "mediatek,pnswap"Vladimir Oltean
Reference the common PHY properties, and update the example to use them. Note that a PCS subnode exists, and it seems a better container of the polarity description than the SGMIISYS node that hosts "mediatek,pnswap". So use that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260119091220.1493761-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21net: phy: air_en8811h: deprecate "airoha,pnswap-rx" and "airoha,pnswap-tx"Vladimir Oltean
Prefer the new "rx-polarity" and "tx-polarity" properties, and use the vendor specific ones as fallback if the standard description doesn't exist. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260119091220.1493761-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21dt-bindings: net: airoha,en8811h: deprecate "airoha,pnswap-rx" and "airoha,pnswap-tx"Vladimir Oltean
Reference the common PHY properties, and update the example to use them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260119091220.1493761-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21net: bcmasp: Fix network filter wake for asp-3.0Justin Chen
We need to apply the tx_chan_offset to the netfilter cfg channel or the output channel will be incorrect for asp-3.0 and newer. Fixes: e9f31435ee7d ("net: bcmasp: Add support for asp-v3.0") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20260120192339.2031648-1-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21Merge branch 'gro-inline-tcp6_gro_-receive-complete'Jakub Kicinski
Eric Dumazet says:

====================
gro: inline tcp6_gro_{receive,complete}

On some platforms, GRO stack is too deep and causes cpu stalls.

Decreasing call depths by one shows a 1.5 % gain on Zen2 cpus. (32 RX queues, 100Gbit NIC, RFS enabled, tcp_rr with 128 threads and 10,000 flows)

We can go further by inlining ipv6_gro_{receive,complete} and take care of IPv4 if there is interest.

Note: two temporary __always_inline will be replaced with inline_for_performance when/if available.

Cumulative size increase for this series (of 3):

$ scripts/bloat-o-meter -t vmlinux.0 vmlinux.3
add/remove: 2/2 grow/shrink: 5/1 up/down: 1572/-471 (1101)
Function                                    old     new   delta
ipv6_gro_receive                           1069    1846    +777
ipv6_gro_complete                           433     733    +300
tcp6_check_fraglist_gro                       -     272    +272
tcp6_gro_complete                           227     306     +79
tcp4_gro_complete                           325     397     +72
ipv6_offload_init                           218     274     +56
__pfx_tcp6_check_fraglist_gro                 -      16     +16
__pfx___skb_incr_checksum_unnecessary        32       -     -32
__skb_incr_checksum_unnecessary             186       -    -186
tcp6_gro_receive                            959     706    -253
Total: Before=22592724, After=22593825, chg +0.00%
====================

Link: https://patch.msgid.link/20260120164903.1912995-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21gro: inline tcp6_gro_complete()Eric Dumazet
Remove one function call from GRO stack for native IPv6 + TCP packets.

$ scripts/bloat-o-meter -t vmlinux.2 vmlinux.3
add/remove: 0/0 grow/shrink: 1/1 up/down: 298/-5 (293)
Function                                    old     new   delta
ipv6_gro_complete                           435     733    +298
tcp6_gro_complete                           311     306      -5
Total: Before=22593532, After=22593825, chg +0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260120164903.1912995-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21gro: inline tcp6_gro_receive()Eric Dumazet
FDO/LTO are unable to inline tcp6_gro_receive() from ipv6_gro_receive().

Make sure tcp6_check_fraglist_gro() is called only when needed, so that the compiler can leave it out-of-line.

$ scripts/bloat-o-meter -t vmlinux.1 vmlinux.2
add/remove: 2/0 grow/shrink: 3/1 up/down: 1123/-253 (870)
Function                                    old     new   delta
ipv6_gro_receive                           1069    1846    +777
tcp6_check_fraglist_gro                       -     272    +272
ipv6_offload_init                           218     274     +56
__pfx_tcp6_check_fraglist_gro                 -      16     +16
ipv6_gro_complete                           433     435      +2
tcp6_gro_receive                            959     706    -253
Total: Before=22592662, After=22593532, chg +0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260120164903.1912995-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21net: always inline __skb_incr_checksum_unnecessary()Eric Dumazet
clang does not inline this helper in GRO fast path.

We can save space and cpu cycles.

$ scripts/bloat-o-meter -t vmlinux.0 vmlinux.1
add/remove: 0/2 grow/shrink: 2/0 up/down: 156/-218 (-62)
Function                                    old     new   delta
tcp6_gro_complete                           227     311     +84
tcp4_gro_complete                           325     397     +72
__pfx___skb_incr_checksum_unnecessary        32       -     -32
__skb_incr_checksum_unnecessary             186       -    -186
Total: Before=22592724, After=22592662, chg -0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260120164903.1912995-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21bonding: provide a net pointer to __skb_flow_dissect()Eric Dumazet
After 3cbf4ffba5ee ("net: plumb network namespace into __skb_flow_dissect") we have to provide a net pointer to __skb_flow_dissect(), either via skb->dev, skb->sk, or a user provided pointer.

In the following case, syzbot was able to cook a bare skb.

WARNING: net/core/flow_dissector.c:1131 at __skb_flow_dissect+0xb57/0x68b0 net/core/flow_dissector.c:1131, CPU#1: syz.2.1418/11053
Call Trace:
 <TASK>
  bond_flow_dissect drivers/net/bonding/bond_main.c:4093 [inline]
  __bond_xmit_hash+0x2d7/0xba0 drivers/net/bonding/bond_main.c:4157
  bond_xmit_hash_xdp drivers/net/bonding/bond_main.c:4208 [inline]
  bond_xdp_xmit_3ad_xor_slave_get drivers/net/bonding/bond_main.c:5139 [inline]
  bond_xdp_get_xmit_slave+0x1fd/0x710 drivers/net/bonding/bond_main.c:5515
  xdp_master_redirect+0x13f/0x2c0 net/core/filter.c:4388
  bpf_prog_run_xdp include/net/xdp.h:700 [inline]
  bpf_test_run+0x6b2/0x7d0 net/bpf/test_run.c:421
  bpf_prog_test_run_xdp+0x795/0x10e0 net/bpf/test_run.c:1390
  bpf_prog_test_run+0x2c7/0x340 kernel/bpf/syscall.c:4703
  __sys_bpf+0x562/0x860 kernel/bpf/syscall.c:6182
  __do_sys_bpf kernel/bpf/syscall.c:6274 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:6272 [inline]
  __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:6272
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94

Fixes: 58deb77cc52d ("bonding: balance ICMP echoes in layer3+4 mode")
Reported-by: syzbot+c46409299c70a221415e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/696faa23.050a0220.4cb9c.001f.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Matteo Croce <mcroce@redhat.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260120161744.1893263-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21selftests: net: amt: wait longer for connection before sending packetsTaehee Yoo
Both send_mcast4() and send_mcast6() use sleep 2 to wait for the tunnel connection between the gateway and the relay, and for the listener socket to be created in the LISTENER namespace. However, tests sometimes fail because packets are sent before the connection is fully established. Increase the waiting time to make the tests more reliable, and use wait_local_port_listen() to explicitly wait for the listener socket. Fixes: c08e8baea78e ("selftests: add amt interface selftest script") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20260120133930.863845-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21tcp: preserve const qualifier in tcp_rsk() and inet_rsk()Eric Dumazet
We can change tcp_rsk() and inet_rsk() to propagate their argument const qualifier thanks to container_of_const(). Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260120125353.1470456-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
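How a container_of_const()-style macro propagates constness can be shown with a small userspace re-creation. This is a sketch, not the kernel headers: the macros below are simplified stand-ins, `struct sock_req`, `struct tcp_req`, and the `tcp_req_of()` accessor are hypothetical, and `__typeof__` assumes a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel macros. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Pick a const or non-const result type based on the pointer passed in. */
#define container_of_const(ptr, type, member)                            \
	_Generic((ptr),                                                  \
		const __typeof__(*(ptr)) *:                              \
			((const type *)container_of(ptr, type, member)), \
		default: ((type *)container_of(ptr, type, member)))

/* Hypothetical types: a base object embedded in a larger one. */
struct sock_req {
	int id;
};

struct tcp_req {
	int extra;
	struct sock_req req;
};

/* Hypothetical accessor in the spirit of tcp_rsk()/inet_rsk(). */
#define tcp_req_of(p) container_of_const(p, struct tcp_req, req)

/* A const argument yields a const result, so this can only read. */
static int read_extra(const struct sock_req *req)
{
	assert(req != NULL);
	const struct tcp_req *treq = tcp_req_of(req);

	return treq->extra;
}

/* A non-const argument yields a writable result. */
static void bump_extra(struct sock_req *req)
{
	tcp_req_of(req)->extra++;
}
```

Writing through the result in `read_extra()` would now be rejected at compile time, which a plain cast-based accessor would silently allow.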
2026-01-21be2net: Fix NULL pointer dereference in be_cmd_get_mac_from_listAndrey Vatoropin
When the pmac_id_valid argument of be_cmd_get_mac_from_list() is set to false, the driver may request the PMAC_ID from the firmware of the network card, and this function will store that PMAC_ID at the provided address pmac_id. This is the contract of this function. However, there is a location within the driver where both pmac_id_valid == false and pmac_id == NULL are being passed. This could result in dereferencing a NULL pointer. To resolve this issue, it is necessary to pass the address of a stub variable to the function. Fixes: 95046b927a54 ("be2net: refactor MAC-addr setup code") Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru> Link: https://patch.msgid.link/20260120113734.20193-1-a.vatoropin@crpt.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
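The contract and the stub-variable fix can be illustrated with a reduced userspace model. Everything here is hypothetical: `get_mac_from_list()` is a two-argument stand-in, not the real be_cmd_get_mac_from_list() signature, and the value 7 merely simulates a firmware reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the contract: when pmac_id_valid is false,
 * the callee stores an ID through pmac_id, so pmac_id must not be NULL.
 */
static int get_mac_from_list(bool pmac_id_valid, uint32_t *pmac_id)
{
	if (!pmac_id_valid) {
		assert(pmac_id != NULL); /* the contract being violated */
		*pmac_id = 7;            /* simulated firmware answer */
	}
	return 0;
}

/* A caller that does not need the ID still passes a stub variable. */
static int query_mac_ignoring_id(void)
{
	uint32_t unused_pmac_id = 0; /* stub: gives the callee storage */

	return get_mac_from_list(false, &unused_pmac_id);
}
```

Passing a stub keeps the callee's contract unchanged, at the cost of one local variable in the caller that ignores the result.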
2026-01-21Merge branch 'airoha-add-the-capability-to-read-firmware-binary-names-from-dts-for-airoha-npu-driver'Jakub Kicinski
Lorenzo Bianconi says:

====================
airoha: Add the capability to read firmware binary names from dts for Airoha NPU driver

This patch is needed because NPU firmware binaries are board specific since they depend on the MediaTek WiFi chip used on the board (e.g. MT7996 or MT7992). This is a preliminary patch to enable MT76 NPU offloading if the Airoha SoC is equipped with MT7996 (Eagle) WiFi chipset.
====================

Link: https://patch.msgid.link/20260120-airoha-npu-firmware-name-v4-0-88999628b4c1@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21net: airoha: npu: Add the capability to read firmware names from dtsLorenzo Bianconi
Introduce the capability to read the firmware binary names from device-tree using the firmware-name property if available. This patch is needed because NPU firmware binaries are board specific since they depend on the MediaTek WiFi chip used on the board (e.g. MT7996 or MT7992) and the WiFi chip version info is not available in the NPU driver. This is a preliminary patch to enable MT76 NPU offloading if the Airoha SoC is equipped with MT7996 (Eagle) WiFi chipset. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260120-airoha-npu-firmware-name-v4-2-88999628b4c1@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21dt-bindings: net: airoha: npu: Add firmware-name propertyLorenzo Bianconi
Add firmware-name property in order to introduce the capability to specify the firmware names used for 'RiscV core' and 'Data section' binaries. This patch is needed because NPU firmware binaries are board specific since they depend on the MediaTek WiFi chip used on the board (e.g. MT7996 or MT7992) and the WiFi chip version info is not available in the NPU driver. This is a preliminary patch to enable MT76 NPU offloading if the Airoha SoC is equipped with MT7996 (Eagle) WiFi chipset. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260120-airoha-npu-firmware-name-v4-1-88999628b4c1@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-22drm/i915/gvt_mmio_table: Use the gvt versions of the display macrosAnkit Nautiyal
Include gvt/display_helpers.h so that the display register macros in intel_gvt_mmio_table.c expand through the exported display functions. This lets us keep the existing macro calls while avoiding direct access to display internals, helping the display modularization work. Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patch.msgid.link/20260114025456.1639171-1-ankit.k.nautiyal@intel.com
2026-01-21Merge branch 'netconsole-support-automatic-target-recovery'Jakub Kicinski
Andre Carvalho says:

====================
netconsole: support automatic target recovery

This patchset introduces target resume capability to netconsole, allowing it to recover targets when the underlying low-level interface comes back online.

The patchset starts by refactoring netconsole state representation in order to allow representing deactivated targets (targets that are disabled because their interface was unregistered).

It then modifies netconsole to handle NETDEV_REGISTER events for such targets, set up netpoll, and force the device UP. Targets are matched with incoming interfaces depending on how they were bound in netconsole (by mac or interface name). For these reasons, we also attempt resuming on NETDEV_CHANGENAME.

The patchset includes a selftest that validates netconsole target state transitions and that the target is functional after being resumed.
====================

Link: https://patch.msgid.link/20260118-netcons-retrigger-v11-0-4de36aebcf48@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21selftests: netconsole: validate target resumeAndre Carvalho
Introduce a new netconsole selftest to validate that netconsole is able to resume a deactivated target when the low level interface comes back. The test sets up the network using netdevsim, creates a netconsole target and then removes and re-adds netdevsim in order to bring the same interfaces back. Afterwards, the test validates that the target works as expected. Targets are created via cmdline parameters to the module to ensure that we are able to resume targets that were bound by mac and interface name. Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Andre Carvalho <asantostc@gmail.com> Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260118-netcons-retrigger-v11-7-4de36aebcf48@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21netconsole: resume previously deactivated targetAndre Carvalho
Attempt to resume a previously deactivated target when the associated interface comes back (NETDEV_REGISTER) or when it changes name (NETDEV_CHANGENAME) by calling netpoll_setup on the device. Depending on how the target was set up (by mac or interface name), the corresponding field is compared with the device being brought up. Targets that match the incoming device are scheduled for resume on a workqueue. Resuming happens on a workqueue as we can't execute netpoll_setup in the context of the netdev event. A standalone workqueue (as opposed to the global one) is used to allow for proper cleanup during netconsole module cleanup, as we need to be able to flush all pending work before traversing the target list given that targets are temporarily removed from the list during resume_target. A target transitions to STATE_DISABLED if resuming it fails, to avoid retrying the same target indefinitely. Signed-off-by: Andre Carvalho <asantostc@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260118-netcons-retrigger-v11-6-4de36aebcf48@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
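The matching and failure handling described above can be modelled in userspace C. This is a sketch under stated assumptions, not the netconsole code: the `struct target` and `struct netdev` layouts and both helpers are hypothetical, the workqueue deferral is left out, and `setup_ok` stands in for the result of the real netpoll setup. Only the state names come from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* State names from the series; their order here is an assumption. */
enum target_state { STATE_DISABLED, STATE_ENABLED, STATE_DEACTIVATED };

/* Hypothetical, reduced views of a target and a network device. */
struct target {
	enum target_state state;
	bool bound_by_mac;
	unsigned char mac[6];
	char dev_name[16];
};

struct netdev {
	unsigned char mac[6];
	char name[16];
};

/* Compare the field the target was bound by: mac or interface name. */
static bool target_matches(const struct target *t, const struct netdev *dev)
{
	if (t->bound_by_mac)
		return memcmp(t->mac, dev->mac, sizeof(t->mac)) == 0;
	return strcmp(t->dev_name, dev->name) == 0;
}

/* Resume only deactivated, matching targets; never retry a failure. */
static void try_resume(struct target *t, const struct netdev *dev,
		       bool setup_ok)
{
	assert(t != NULL && dev != NULL);
	if (t->state != STATE_DEACTIVATED || !target_matches(t, dev))
		return;
	t->state = setup_ok ? STATE_ENABLED : STATE_DISABLED;
}
```

Because a failed resume lands in STATE_DISABLED rather than back in STATE_DEACTIVATED, a later register or rename event for the same device does not trigger another attempt.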
2026-01-21netconsole: introduce helpers for dynamic_netconsole_mutex lock/unlockAndre Carvalho
This commit introduces two helper functions to perform lock/unlock on dynamic_netconsole_mutex providing no-op stub versions when compiled without CONFIG_NETCONSOLE_DYNAMIC and refactors existing call sites to use the new helpers. This is done following kernel coding style guidelines, in preparation for an upcoming change. It avoids the need for preprocessor conditionals in the call site and keeps the logic easier to follow. Signed-off-by: Andre Carvalho <asantostc@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260118-netcons-retrigger-v11-5-4de36aebcf48@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
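The helper-with-stub pattern described above looks roughly like the following userspace sketch. The helper names, the `lock_depth` counter standing in for a real mutex, and `set_value()` are all hypothetical; only the CONFIG_NETCONSOLE_DYNAMIC symbol comes from the commit message.

```c
#include <assert.h>

static int lock_depth;   /* stand-in for a real mutex */
static int shared_value; /* state the lock is meant to protect */

#ifdef CONFIG_NETCONSOLE_DYNAMIC
/* Real versions: take and release the (simulated) lock. */
static inline void dynamic_lock(void)
{
	lock_depth++;
}

static inline void dynamic_unlock(void)
{
	assert(lock_depth > 0);
	lock_depth--;
}
#else
/* No-op stubs: call sites compile unchanged without the option. */
static inline void dynamic_lock(void) { }
static inline void dynamic_unlock(void) { }
#endif

/* The call site needs no preprocessor conditionals of its own. */
static int set_value(int v)
{
	dynamic_lock();
	shared_value = v;
	dynamic_unlock();
	return shared_value;
}
```

The conditional compilation lives in one place, so every caller reads the same whether or not the option is enabled.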
2026-01-21netconsole: clear dev_name for devices bound by macAndre Carvalho
This patch makes sure netconsole clears dev_name for devices bound by mac in order to allow calling setup_netpoll on targets that have previously been cleaned up (in order to support resuming deactivated targets). This is required as netpoll_setup populates dev_name even when devices are matched via mac address. The cleanup is done inside netconsole as bound by mac is a netconsole concept. Signed-off-by: Andre Carvalho <asantostc@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260118-netcons-retrigger-v11-4-4de36aebcf48@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21netconsole: add STATE_DEACTIVATED to track targets disabled by low levelBreno Leitao
When the low level interface brings a netconsole target down, record this using a new STATE_DEACTIVATED state. This allows netconsole to distinguish between targets explicitly disabled by users and those deactivated due to interface state changes. It also enables automatic recovery and re-enabling of targets if the underlying low-level interfaces come back online. From a code perspective, anything that is not STATE_ENABLED is disabled. Devices (de)enslaving are marked STATE_DISABLED to prevent automatically resuming as enslaved interfaces cannot have netconsole enabled. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Andre Carvalho <asantostc@gmail.com> Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260118-netcons-retrigger-v11-3-4de36aebcf48@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
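The distinction between the two non-enabled states can be sketched as a small state machine in userspace C. The state names come from the commit message; the enum ordering, the `iface_event` type, and the helper functions are hypothetical, and treating enslavement as disabling a target from any state is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

enum target_state {
	STATE_DISABLED,    /* disabled explicitly, never auto-resumed */
	STATE_ENABLED,
	STATE_DEACTIVATED, /* brought down by the low-level interface */
};

/* Hypothetical events, reduced to the two cases in the commit text. */
enum iface_event { IFACE_UNREGISTER, IFACE_ENSLAVE_CHANGE };

static enum target_state on_iface_event(enum target_state cur,
					enum iface_event ev)
{
	switch (ev) {
	case IFACE_UNREGISTER:
		/* Only a live target can be deactivated. */
		return cur == STATE_ENABLED ? STATE_DEACTIVATED : cur;
	case IFACE_ENSLAVE_CHANGE:
		/* Assumption: always disable, so no auto-resume happens. */
		return STATE_DISABLED;
	}
	assert(!"unhandled interface event");
	return cur;
}

/* Anything that is not STATE_ENABLED counts as disabled. */
static bool target_is_disabled(enum target_state s)
{
	return s != STATE_ENABLED;
}

/* Only deactivated targets are candidates for automatic recovery. */
static bool target_may_auto_resume(enum target_state s)
{
	return s == STATE_DEACTIVATED;
}
```

Both non-enabled states stop output, but only STATE_DEACTIVATED records that recovery should be attempted later.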
2026-01-21netconsole: convert 'enabled' flag to enum for clearer state managementAndre Carvalho
This patch refactors the netconsole driver's target enabled state from a simple boolean to an explicit enum (`target_state`). This allows the set of states to be extended with a new state in the upcoming change. Co-developed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Andre Carvalho <asantostc@gmail.com> Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260118-netcons-retrigger-v11-2-4de36aebcf48@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21netconsole: add target_state enumBreno Leitao
Introduce an enum to track netconsole target state, which is going to replace the enabled boolean. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Andre Carvalho <asantostc@gmail.com> Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260118-netcons-retrigger-v11-1-4de36aebcf48@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>