summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-12-01net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive()Florian Fuchs
Use the napi functions napi_alloc_skb() and napi_gro_receive() instead of netdev_alloc_skb() and netif_receive_skb() for more efficient packet receiving. The switch to napi aware functions increases the RX throughput, reduces the occurrence of retransmissions and improves the resilience against SKB allocation failures. Signed-off-by: Florian Fuchs <fuchsfl@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251130194155.1950980-1-fuchsfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01Merge branch 'dsa-simple-hsr-offload'Jakub Kicinski
Vladimir Oltean says: ==================== DSA simple HSR offload Provide a "simple" form of HSR offload for 8 DSA drivers (just the NETIF_F_HW_HSR_DUP feature) based on the fact that their taggers use the dsa_xmit_port_mask() function. This is in patches 6-13/15. The helpers per se are introduced in patch 5/15, and documented in patch 15/15. Patch 14/15 is another small (and related) documentation update. For HSR interlink ports the offloading rules are not quite so clear, and for now we completely reject the offload. We can revise that once we see a full offload implementation and understand what is needed. To reject the offload, we need to know the port type, and patch 2/15 helps with that. xrs700x is another driver which should have rejected offload based on port type (patch 4/15). This is a bug fix submitted through net-next due to the extra API required to fix it. If necessary, it could also be picked up separately for backporting. There is also patch 3/15, which makes the HSR offload like the others supported by DSA: if we fall back to the software implementation, don't call port_hsr_leave(), because by definition there won't be anything to do. A slightly unrelated change is patch 1/15, but I noticed this along the way, and if I were to submit it separately, it would conflict with this work (it would appear in patch 12/15's context). Most of the driver additions are trivial. By far the most complex was ocelot (which I could test). Microchip ksz (which I cannot test, and did not patch) would also have some complexity. Essentially, ksz_hsr_join() could fall back to a partial offload through the simple helpers, if the full offload is not possible. But keeping track of which offload kind was used is necessary later in ksz_hsr_leave(). This is left as homework for interested developers. With this patch set, one can observe a 50% reduction in transmitted traffic over HSR interfaces. ==================== Link: https://patch.msgid.link/20251130131657.65080-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01Documentation: net: dsa: mention simple HSR offload helpersVladimir Oltean
Keep the documentation up to date. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-16-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01Documentation: net: dsa: mention availability of RedBoxVladimir Oltean
Since commit 5055cccfc2d1 ("net: hsr: Provide RedBox support (HSR-SAN)"), RedBox is available (including for offload in DSA). Update the DSA documentation that states it isn't. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-15-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: a5psw: use simple HSR offload helpersVladimir Oltean
The "a5psw" tagging protocol uses dsa_xmit_port_mask(), which means we can offload HSR packet duplication on transmit. Enable that feature. Cc: "Clément Léger" <clement.leger@bootlin.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-14-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: mt7530: use simple HSR offload helpersVladimir Oltean
The "mtk" tagging protocol uses dsa_xmit_port_mask(), which means we can offload HSR packet duplication on transmit. Enable that feature. Cc: Daniel Golle <daniel@makrotopia.org> Cc: DENG Qingfang <dqfext@gmail.com> Cc: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Chester A. Unal <chester.a.unal@arinc9.com> Link: https://patch.msgid.link/20251130131657.65080-13-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: hellcreek: use simple HSR offload helpersVladimir Oltean
The "hellcreek" tagging protocol uses dsa_xmit_port_mask(), which means we can offload HSR packet duplication on transmit. Enable that feature. Cc: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-12-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: mv88e6060: use simple HSR offload helpersVladimir Oltean
The "trailer" tagging protocol uses dsa_xmit_port_mask(), which means we can offload HSR packet duplication on transmit. Enable that feature. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-11-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: lantiq_gswip: use simple HSR offload helpersVladimir Oltean
Both the "gswip" and "gsw1xx" protocols use dsa_xmit_port_mask(), so they are compatible with accelerating TX packet duplication for HSR rings. Enable that feature by setting the port_hsr_join() and port_hsr_leave() operations to the simple helpers provided by DSA. Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-10-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: realtek: use simple HSR offload helpersVladimir Oltean
All known Realtek protocols: "rtl4a", "rtl8_4" and "rtl8_4t" use dsa_xmit_port_mask(), so they are compatible with accelerating TX packet duplication for HSR rings. Enable that feature by setting the port_hsr_join() and port_hsr_leave() operations to the simple helpers provided by DSA. Cc: "Alvin Šipraga" <alsi@bang-olufsen.dk> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Linus Walleij <linusw@kernel.org> Link: https://patch.msgid.link/20251130131657.65080-9-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: ocelot: use simple HSR offload helpersVladimir Oltean
Accelerate TX packet duplication with HSR rings. This is only possible with the NPI-based "ocelot" tagging protocol, not with "ocelot-8021q", because the latter does not use dsa_xmit_port_mask(). This has 2 implications: - Depending on tagging protocol, we should set (or not set) the offload feature flags. Switching tagging protocols is done with ports down, by design. Additional calls to dsa_port_simple_hsr_join() can be put in the ds->ops->change_tag_protocol() path, as I had originally tried, but this would not work: dsa_user_setup_tagger() would later clear the feature flag that we just set. So the additional call to dsa_port_simple_hsr_join() should sit in the ds->ops->port_enable() call. - When joining a HSR ring and we are currently using "ocelot-8021q", there are cases when we should return -EOPNOTSUPP (pessimistic) and cases when we shouldn't (optimistic). In the pessimistic case, it is a configuration that the port won't support even with the right tagging protocol. Distinguishing between these 2 cases matters because if we just return -EOPNOTSUPP regardless, we lose the dp->hsr_dev pointer and can no longer replay the offload later for the optimistic case, from felix_port_enable(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-8-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: yt921x: use simple HSR offloading helpersVladimir Oltean
Accelerate TX packet duplication with HSR rings. Cc: David Yang <mmyangfl@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-7-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: add simple HSR offload helpersVladimir Oltean
It turns out that HSR offloads are so fine-grained that many DSA switches can do a small part even though they weren't specifically designed for the protocols supported by that driver (HSR and PRP). Specifically NETIF_F_HW_HSR_DUP - it is simple packet duplication on transmit, towards all (aka 2) ports members of the HSR device. For many DSA switches, we know how to duplicate a packet, even though we never typically use that feature. The transmit port mask from the tagging protocol can have multiple bits set, and the switch should send the packet once to every port with a bit set from that mask. Nonetheless, not all tagging protocols are like this, and sometimes the port is a single numeric value rather than a bit mask. For that reason, and also because switches can sometimes change tagging protocols for different ones, we need to make HSR offload helpers opt-in. For devices that can do nothing else HSR-specific, we introduce dsa_port_simple_hsr_join() and dsa_port_simple_hsr_leave(). These functions monitor when two user ports of the same switch are part of the same HSR device, and when that condition is true, they toggle the NETIF_F_HW_HSR_DUP feature flag of both net devices. Normally only dsa_port_simple_hsr_join() and dsa_port_simple_hsr_leave() are needed. The dsa_port_simple_hsr_validate() helper is just to see what kind of configuration could be offloadable using the generic helpers. This is used by switch drivers which are not currently using the right tagging protocol to offload this HSR ring, but could in principle offload it after changing the tagger. Suggested-by: David Yang <mmyangfl@gmail.com> Cc: "Alvin Šipraga" <alsi@bang-olufsen.dk> Cc: Chester A. Unal" <chester.a.unal@arinc9.com> Cc: "Clément Léger" <clement.leger@bootlin.com> Cc: Daniel Golle <daniel@makrotopia.org> Cc: DENG Qingfang <dqfext@gmail.com> Cc: Florian Fainelli <florian.fainelli@broadcom.com> Cc: George McCollister <george.mccollister@gmail.com> Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: Jonas Gorski <jonas.gorski@gmail.com> Cc: Kurt Kanzenbach <kurt@linutronix.de> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Sean Wang <sean.wang@mediatek.com> Cc: UNGLinuxDriver@microchip.com Cc: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-6-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: xrs700x: reject unsupported HSR configurationsVladimir Oltean
As discussed here: https://lore.kernel.org/netdev/20240620090210.drop6jwh7e5qw556@skbuf/ the fact is that the xrs700x.c driver only supports offloading HSR_PT_SLAVE_A and HSR_PT_SLAVE_B (which were the only port types at the time the offload was written, _for this driver_). Up until now, the API did not explicitly tell offloading drivers what port has what role. So xrs700x can get confused and think that it can support a configuration which it actually can't. There was a table in the attached link which gave an example: $ ip link add name hsr0 type hsr slave1 swp0 slave2 swp1 \ interlink swp2 supervision 45 version 1 HSR_PT_SLAVE_A HSR_PT_SLAVE_B HSR_PT_INTERLINK ---------------------------------------------------------------- user space 0 1 2 requests ---------------------------------------------------------------- XRS700X driver 1 2 - understands The switch would act as if the ring ports were swp1 and swp2. Now that we have explicit hsr_get_port_type() API, let's use that to work around the unintended semantical changes of the offloading API brought by the introduction of interlink ports in HSR. Fixes: 5055cccfc2d1 ("net: hsr: Provide RedBox support (HSR-SAN)") Cc: Lukasz Majewski <lukma@denx.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: George McCollister <george.mccollister@gmail.com> Link: https://patch.msgid.link/20251130131657.65080-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: avoid calling ds->ops->port_hsr_leave() when unoffloadedVladimir Oltean
This mirrors what we do in dsa_port_lag_leave() and dsa_port_bridge_leave(): when ds->ops->port_hsr_join() returns -EOPNOTSUPP, we fall back to a software implementation where dp->hsr_dev is NULL, and the unoffloaded port is no longer bothered with calls from the HSR layer. This helps, for example, with interlink ports which current DSA drivers don't know how to offload. We have to check only in port_hsr_join() for the port type, then in port_hsr_leave() we are sure we're dealing only with known port types. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: hsr: create an API to get hsr port typeXiaoliang Yang
Since the introduction of HSR_PT_INTERLINK in commit 5055cccfc2d1 ("net: hsr: Provide RedBox support (HSR-SAN)"), we see that different port types require different settings for hardware offload, which was not the case before when we only had HSR_PT_SLAVE_A and HSR_PT_SLAVE_B. But there is currently no way to know which port is which type, so create the hsr_get_port_type() API function and export it. When hsr_get_port_type() is called from the device driver, the port can must be found in the HSR port list. An important use case is for this function to work from offloading drivers' NETDEV_CHANGEUPPER handler, which is triggered by hsr_portdev_setup() -> netdev_master_upper_dev_link(). Therefore, we need to move the addition of the hsr_port to the HSR port list prior to calling hsr_portdev_setup(). This makes the error restoration path also more similar to hsr_del_port(), where kfree_rcu(port) is already used. Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Lukasz Majewski <lukma@denx.de> Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Łukasz Majewski <lukma@nabladev.com> Link: https://patch.msgid.link/20251130131657.65080-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: mt7530: unexport mt7530_switch_opsVladimir Oltean
Commit cb675afcddbb ("net: dsa: mt7530: introduce separate MDIO driver") exported mt7530_switch_ops for use from mt7530-mmio.c. Later in the patch set, mt7530-mmio.c used mt7530_probe_common() to access the mt7530_switch_ops still from mt7530.c - see commit 110c18bfed41 ("net: dsa: mt7530: introduce driver for MT7988 built-in switch"). This proves that exporting mt7530_switch_ops was unnecessary, so unexport it back. Cc: DENG Qingfang <dqfext@gmail.com> Cc: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Acked-by: Daniel Golle <daniel@makrotopia.org> Acked-by: Chester A. Unal <chester.a.unal@arinc9.com> Link: https://patch.msgid.link/20251130131657.65080-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01Merge tag 'vfs-6.19-rc1.autofs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull autofs update from Christian Brauner: "Prevent futile mount triggers in private mount namespaces. Fix a problematic loop in autofs when a mount namespace contains autofs mounts that are propagation private and there is no namespace-specific automount daemon to handle possible automounting. Previously, attempted path resolution would loop until MAXSYMLINKS was reached before failing, causing significant noise in the log. The fix adds a check in autofs ->d_automount() so that the VFS can immediately return EPERM in this case. Since the mount is propagation private, EPERM is the most appropriate error code" * tag 'vfs-6.19-rc1.autofs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: autofs: dont trigger mount if it cant succeed
2025-12-01Merge tag 'vfs-6.19-rc1.ovl' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull overlayfs cred guard conversion from Christian Brauner: "This converts all of overlayfs to use credential guards, eliminating manual credential management throughout the filesystem. Credential guard conversion: - Convert all of overlayfs to use credential guards, replacing the manual ovl_override_creds()/ovl_revert_creds() pattern with scoped guards. This makes credential handling visually explicit and eliminates a class of potential bugs from mismatched override/revert calls. (1) Basic credential guard (with_ovl_creds) (2) Creator credential guard (ovl_override_creator_creds): Introduced a specialized guard for file creation operations that handles the two-phase credential override (mounter credentials, then fs{g,u}id override). The new pattern is much clearer: with_ovl_creds(dentry->d_sb) { scoped_class(prepare_creds_ovl, cred, dentry, inode, mode) { if (IS_ERR(cred)) return PTR_ERR(cred); /* creation operations */ } } (3) Copy-up credential guard (ovl_cu_creds): Introduced a specialized guard for copy-up operations, simplifying the previous struct ovl_cu_creds helper and associated functions. Ported ovl_copy_up_workdir() and ovl_copy_up_tmpfile() to this pattern. Cleanups: - Remove ovl_revert_creds() after all callers converted to guards - Remove struct ovl_cu_creds and associated functions - Drop ovl_setup_cred_for_create() after conversion - Refactor ovl_fill_super(), ovl_lookup(), ovl_iterate(), ovl_rename() for cleaner credential guard scope - Introduce struct ovl_renamedata to simplify rename handling - Don't override credentials for ovl_check_whiteouts() (unnecessary) - Remove unneeded semicolon" * tag 'vfs-6.19-rc1.ovl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (54 commits) ovl: remove unneeded semicolon ovl: remove struct ovl_cu_creds and associated functions ovl: port ovl_copy_up_tmpfile() to cred guard ovl: mark *_cu_creds() as unused temporarily ovl: port ovl_copy_up_workdir() to cred guard ovl: add copy up credential guard ovl: drop ovl_setup_cred_for_create() ovl: port ovl_create_or_link() to new ovl_override_creator_creds cleanup guard ovl: mark ovl_setup_cred_for_create() as unused temporarily ovl: reflow ovl_create_or_link() ovl: port ovl_create_tmpfile() to new ovl_override_creator_creds cleanup guard ovl: add ovl_override_creator_creds cred guard ovl: remove ovl_revert_creds() ovl: port ovl_fill_super() to cred guard ovl: refactor ovl_fill_super() ovl: port ovl_lower_positive() to cred guard ovl: port ovl_lookup() to cred guard ovl: refactor ovl_lookup() ovl: port ovl_copyfile() to cred guard ovl: port ovl_rename() to cred guard ...
2025-12-01Merge tag 'vfs-6.19-rc1.directory.locking' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull directory locking updates from Christian Brauner: "This contains the work to add centralized APIs for directory locking operations. This series is part of a larger effort to change directory operation locking to allow multiple concurrent operations in a directory. The ultimate goal is to lock the target dentry(s) rather than the whole parent directory. To help with changing the locking protocol, this series centralizes locking and lookup in new helper functions. The helpers establish a pattern where it is the dentry that is being locked and unlocked (currently the lock is held on dentry->d_parent->d_inode, but that can change in the future). This also changes vfs_mkdir() to unlock the parent on failure, as well as dput()ing the dentry. This allows end_creating() to only require the target dentry (which may be IS_ERR() after vfs_mkdir()), not the parent" * tag 'vfs-6.19-rc1.directory.locking' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: nfsd: fix end_creating() conversion VFS: introduce end_creating_keep() VFS: change vfs_mkdir() to unlock on failure. ecryptfs: use new start_creating/start_removing APIs Add start_renaming_two_dentries() VFS/ovl/smb: introduce start_renaming_dentry() VFS/nfsd/ovl: introduce start_renaming() and end_renaming() VFS: add start_creating_killable() and start_removing_killable() VFS: introduce start_removing_dentry() smb/server: use end_removing_noperm for for target of smb2_create_link() VFS: introduce start_creating_noperm() and start_removing_noperm() VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing() VFS/nfsd/cachefiles/ovl: add start_creating() and end_creating() VFS: tidy up do_unlinkat() VFS: introduce start_dirop() and end_dirop() debugfs: rename end_creating() to debugfs_end_creating()
2025-12-01Merge tag 'vfs-6.19-rc1.directory.delegations' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull directory delegations update from Christian Brauner: "This contains the work for recall-only directory delegations for knfsd. Add support for simple, recallable-only directory delegations. This was decided at the fall NFS Bakeathon where the NFS client and server maintainers discussed how to merge directory delegation support. The approach starts with recallable-only delegations for several reasons: 1. RFC8881 has gaps that are being addressed in RFC8881bis. In particular, it requires directory position information for CB_NOTIFY callbacks, which is difficult to implement properly under Linux. The spec is being extended to allow that information to be omitted. 2. Client-side support for CB_NOTIFY still lags. The client side involves heuristics about when to request a delegation. 3. Early indication shows simple, recallable-only delegations can help performance. Anna Schumaker mentioned seeing a multi-minute speedup in xfstests runs with them enabled. With these changes, userspace can also request a read lease on a directory that will be recalled on conflicting accesses. This may be useful for applications like Samba. Users can disable leases altogether via the fs.leases-enable sysctl if needed. VFS changes: - Dedicated Type for Delegations Introduce struct delegated_inode to track inodes that may have delegations that need to be broken. This replaces the previous approach of passing raw inode pointers through the delegation breaking code paths, providing better type safety and clearer semantics for the delegation machinery. - Break parent directory delegations in open(..., O_CREAT) codepath - Allow mkdir to wait for delegation break on parent - Allow rmdir to wait for delegation break on parent - Add try_break_deleg calls for parents to vfs_link(), vfs_rename(), and vfs_unlink() - Make vfs_create(), vfs_mknod(), and vfs_symlink() break delegations on parent directory - Clean up argument list for vfs_create() - Expose delegation support to userland Filelock changes: - Make lease_alloc() take a flags argument - Rework the __break_lease API to use flags - Add struct delegated_inode - Push the S_ISREG check down to ->setlease handlers - Lift the ban on directory leases in generic_setlease NFSD changes: - Allow filecache to hold S_IFDIR files - Allow DELEGRETURN on directories - Wire up GET_DIR_DELEGATION handling Fixes: - Fix kernel-doc warnings in __fcntl_getlease - Add needed headers for new struct delegation definition" * tag 'vfs-6.19-rc1.directory.delegations' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: vfs: add needed headers for new struct delegation definition filelock: __fcntl_getlease: fix kernel-doc warnings vfs: expose delegation support to userland nfsd: wire up GET_DIR_DELEGATION handling nfsd: allow DELEGRETURN on directories nfsd: allow filecache to hold S_IFDIR files filelock: lift the ban on directory leases in generic_setlease vfs: make vfs_symlink break delegations on parent dir vfs: make vfs_mknod break delegations on parent directory vfs: make vfs_create break delegations on parent directory vfs: clean up argument list for vfs_create() vfs: break parent dir delegations in open(..., O_CREAT) codepath vfs: allow rmdir to wait for delegation break on parent vfs: allow mkdir to wait for delegation break on parent vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink} filelock: push the S_ISREG check down to ->setlease handlers filelock: add struct delegated_inode filelock: rework the __break_lease API to use flags filelock: make lease_alloc() take a flags argument
2025-12-01Merge tag 'vfs-6.19-rc1.minix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull minix fixes from Christian Brauner: "Fix two syzbot corruption bugs in the minix filesystem. Syzbot fuzzes filesystems by trying to mount and manipulate deliberately corrupted images. This should not lead to BUG_ONs and WARN_ONs for easy to detect corruptions. - Add error handling to minix filesystem for inode corruption detection, enabling the filesystem to report such corruptions cleanly. - Fix a drop_nlink warning in minix_rmdir() triggered by corrupted directory link counts. - Fix a drop_nlink warning in minix_rename() triggered by corrupted inode link counts" * tag 'vfs-6.19-rc1.minix' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: Fix a drop_nlink warning in minix_rename Fix a drop_nlink warning in minix_rmdir Add error handling to minix filesystem for inode corruption detection
2025-12-01Merge branch 'net-dsa-yt921x-add-stp-mst-support'Jakub Kicinski
David Yang says: ==================== net: dsa: yt921x: Add STP/MST support Support for these features was deferred from the initial submission of the driver. v3: https://lore.kernel.org/20251126093240.2853294-1-mmyangfl@gmail.com v2: https://lore.kernel.org/20251025170606.1937327-1-mmyangfl@gmail.com v1: https://lore.kernel.org/20251024033237.1336249-1-mmyangfl@gmail.com ==================== Link: https://patch.msgid.link/20251201094232.3155105-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: yt921x: Add STP/MST supportDavid Yang
Support for STP/MST was deferred from the initial submission of the driver. Signed-off-by: David Yang <mmyangfl@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251201094232.3155105-3-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: yt921x: Use *_ULL bitfield macros for VLAN_CTRLDavid Yang
VLAN_CTRL should be treated as a 64-bit register. GENMASK and BIT macros use unsigned long as the underlying type, which will result in a build error on architectures where sizeof(long) == 4. Replace them with unsigned long long variants. Signed-off-by: David Yang <mmyangfl@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251201094232.3155105-2-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01Merge branch ↵Jakub Kicinski
'add-sqi-and-sqi-support-for-oatc14-10base-t1s-phys-and-microchip-t1s-driver' Parthiban Veerasooran says: ==================== Add SQI and SQI+ support for OATC14 10Base-T1S PHYs and Microchip T1S driver This patch series adds Signal Quality Indicator (SQI) and enhanced SQI+ support for OATC14 10Base-T1S PHYs, along with integration into the Microchip T1S PHY driver. This enables ethtool to report the SQI value for OATC14 10Base-T1S PHYs. ==================== Link: https://patch.msgid.link/20251201032346.6699-1-parthiban.veerasooran@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: phy: microchip_t1s: add SQI support for LAN867x Rev.D0 PHYsParthiban Veerasooran
Add support for Signal Quality Index (SQI) reporting in the Microchip T1S PHY driver for LAN867x Rev.D0 (OATC14-compliant) PHYs. This patch registers the following callbacks in the microchip_t1s driver structure: - .get_sqi - returns the current SQI value - .get_sqi_max - returns the maximum SQI value This enables ethtool to report the SQI value for LAN867x Rev.D0 PHYs. Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251201032346.6699-3-parthiban.veerasooran@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: phy: phy-c45: add SQI and SQI+ support for OATC14 10Base-T1S PHYsParthiban Veerasooran
Add support for reading Signal Quality Indicator (SQI) and enhanced SQI+ from OATC14 10Base-T1S PHYs. - Introduce MDIO register definitions for DCQ_SQI and DCQ_SQIPLUS. - Add `genphy_c45_oatc14_get_sqi_max()` to return the maximum supported SQI/SQI+ level. - Add `genphy_c45_oatc14_get_sqi()` to return the current SQI or SQI+ value. - Update `include/linux/phy.h` to expose the new APIs. SQI+ capability is read from the Advanced Diagnostic Features Capability register (ADFCAP). If SQI+ is supported, the driver calculates the value from the MSBs of the DCQ_SQIPLUS register; otherwise, it falls back to basic SQI (0-7 levels). This enables ethtool to report the SQI value for OATC14 10Base-T1S PHYs. Open Alliance TC14 10BASE-T1S Advanced Diagnostic PHY Features Specification ref: https://opensig.org/wp-content/uploads/2025/06/OPEN_Alliance_10BASE-T1S_Advanced_PHY_features_for-automotive_Ethernet_V2.1b.pdf Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251201032346.6699-2-parthiban.veerasooran@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01Merge branch 'net-mlx5e-enhance-dcbnl-get-set-maxrate-code'Jakub Kicinski
Tariq Toukan says: ==================== net/mlx5e: Enhance DCBNL get/set maxrate code This series by Gal introduces multiple small code quality improvements for the DCBNL operations mlx5e_dcbnl_ieee_[gs]etmaxrate(). No functional change. ==================== Link: https://patch.msgid.link/1764498334-1327918-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net/mlx5e: Use standard unit definitions for bandwidth conversionGal Pressman
MLX5E_100MB and MLX5E_1GB defines are confusing, MLX5E_100MB is not equal to 100 * MEGA, and MLX5E_1GB is not equal to one GIGA, as they hide the Kbps rate conversion required for ieee_maxrate. Replace hardcoded bandwidth conversion values with standard unit definitions from linux/units.h. Rename MLX5E_100MB/MLX5E_1GB to MLX5E_100MB_TO_KB/MLX5E_1GB_TO_KB to clarify these are conversion factors to Kbps, not absolute bandwidth values. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1764498334-1327918-5-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net/mlx5e: Use U8_MAX instead of hard coded magic numberGal Pressman
Replace hard coded 255 magic number with U8_MAX (the register field is 8 bits). Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1764498334-1327918-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net/mlx5e: Rename upper_limit_mbps to upper_limit_100mbpsGal Pressman
Clarify that the limit represents the threshold for using 100 Mbps units rather than a general Mbps limit. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1764498334-1327918-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net/mlx5e: Use u64 instead of __u64 in ieee_setmaxrateGal Pressman
Change upper_limit_mbps/gbps from __u64 to u64 to follow kernel coding conventions. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1764498334-1327918-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01Revert "r8169: add DASH support for RTL8127AP"Jakub Kicinski
This reverts commit 17e9f841dd227a4dc976b22d000d5f669bc14493. Nathan reports error messages appearing in dmesg since commit under Fixes: [ 3.844125] r8169 0000:01:00.0 (unnamed net_device) (uninitialized): rtl_eriar_cond == 0 (loop: 100, delay: 100). [ 3.864844] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100). [ 3.878825] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100). [ 3.892632] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100). [ 5.002551] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100). [ 5.016286] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100). [ 5.030027] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100). Let's drop the bad change and revisit in the next release cycle. Repoted-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/20251201224238.GA604467@ax162 Fixes: 17e9f841dd22 ("r8169: add DASH support for RTL8127AP") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01Merge branch 'net-dsa-b53-fix-arl-accesses-for-bcm5325-65-and-allow-vid-0'Jakub Kicinski
Jonas Gorski says: ==================== net: dsa: b53: fix ARL accesses for BCM5325/65 and allow VID 0 ARL entries on BCM5325 and BCM5365 were broken significantly in two ways: - Entries for the CPU port were using the wrong port id, pointing to a non existing port. - Setting the VLAN ID for entries was not done, adding them all to VLAN 0 instead. While the former technically broke any communication to the CPU port, with the latter they were added to the currently unused VID 0, so they never became effective. Presumably the default PVID was set to 1 because of these issues 0 was broken (and the root cause not found). So fix writing and reading entries on BCM5325/65 by first fixing the CPU port entries, then fixing setting the VLAN ID for entries. Finally, re-allow VID 0 for BCM5325/65 to allow the whole 1-15 VLAN ID range to be available to users, and align VLAN handling with all other switch chips. ==================== Link: https://patch.msgid.link/20251128080625.27181-1-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: b53: allow VID 0 for BCM5325/65Jonas Gorski
Now that writing ARL entries works properly, we can actually use VID 0 as the default untagged VLAN for BCM5325 and BCM5365 as well. So use 0 as default PVID for all chips and do not reject VLAN 0 anymore, which we ignored since commit 45e9d59d3950 ("net: dsa: b53: do not allow to configure VLAN 0") anyway. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20251128080625.27181-8-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: b53: fix BCM5325/65 ARL entry VIDsJonas Gorski
BCM5325/65's ARL entry registers do not contain the VID, only the search result register does. ARL entries have a separate VID entry register for the index into the VLAN table. So make ARL entry accessors use the VID entry registers instead, and move the VLAN ID field definition to the search register definition. Fixes: c45655386e53 ("net: dsa: b53: add support for FDB operations on 5325/5365") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20251128080625.27181-7-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: b53: fix BCM5325/65 ARL entry multicast port masksJonas Gorski
We currently use the mask 0xf for writing and reading b53_entry::port, but this is only correct for unicast ARL entries. Multicast ARL entries use a bitmask, and 0xf is not enough space for ports > 3, which includes the CPU port. So extend the mask accordingly to also fit port 4 (bit 4) and MII (bit 5). According to the datasheet the multicast port mask is [60:48], making it 12 bit wide, but bits 60-55 are reserved anyway, and collide with the priority field at [60:59], so I am not sure if this is valid. Therefore leave it at the actual used range, [53:48]. The ARL search result register differs a bit, and there the mask is only [52:48], so only spanning the user ports. The MII port bit is contained in the Search Result Extension register. So create a separate search result parse function that properly handles this. Fixes: c45655386e53 ("net: dsa: b53: add support for FDB operations on 5325/5365") Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Link: https://patch.msgid.link/20251128080625.27181-6-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: b53: fix CPU port unicast ARL entries for BCM5325/65Jonas Gorski
On BCM5325 and BCM5365, unicast ARL entries use 8 as the value for the CPU port, so we need to translate it to/from 5 as used for the CPU port at most other places. Fixes: c45655386e53 ("net: dsa: b53: add support for FDB operations on 5325/5365") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20251128080625.27181-5-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: b53: use same ARL search result offset for BCM5325/65Jonas Gorski
BCM5365's search result is at the same offset as BCM5325's search result, and they (mostly) share the same format, so switch BCM5365 to BCM5325's arl ops. Fixes: c45655386e53 ("net: dsa: b53: add support for FDB operations on 5325/5365") Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Álvaro Fernández Rojas <noltari@gmail.com> Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Link: https://patch.msgid.link/20251128080625.27181-4-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: b53: fix extracting VID from entry for BCM5325/65Jonas Gorski
BCM5325/65's Entry register uses the highest three bits for VALID/STATIC/AGE, so shifting by 53 only will add these to b53_arl_entry::vid. So make sure to mask the vid value as well, to not get invalid VIDs. Fixes: c45655386e53 ("net: dsa: b53: add support for FDB operations on 5325/5365") Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Álvaro Fernández Rojas <noltari@gmail.com> Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Link: https://patch.msgid.link/20251128080625.27181-3-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: dsa: b53: fix VLAN_ID_IDX write size for BCM5325/65Jonas Gorski
Since BCM5325 and BCM5365 only support up to 256 VLANs, the VLAN_ID_IDX register is only 8 bit wide, not 16 bit, so use an appropriate accessor. Fixes: c45655386e53 ("net: dsa: b53: add support for FDB operations on 5325/5365") Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Álvaro Fernández Rojas <noltari@gmail.com> Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Link: https://patch.msgid.link/20251128080625.27181-2-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01Merge tag 'vfs-6.19-rc1.guards' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull superblock lock guard updates from Christian Brauner: "This starts the work of introducing guards for superblock related locks. Introduce super_write_guard for scoped superblock write protection. This provides a guard-based alternative to the manual sb_start_write() and sb_end_write() pattern, allowing the compiler to automatically handle the cleanup" * tag 'vfs-6.19-rc1.guards' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: xfs: use super write guard in xfs_file_ioctl() open: use super write guard in do_ftruncate() btrfs: use super write guard in relocating_repair_kthread() ext4: use super write guard in write_mmp_block() btrfs: use super write guard in sb_start_write() btrfs: use super write guard btrfs_run_defrag_inode() btrfs: use super write guard in btrfs_reclaim_bgs_work() fs: add super_write_guard
2025-12-01Merge branch 'amd-xgbe-schedule-napi-on-rbu-event'Jakub Kicinski
Raju Rangoju says: ==================== amd-xgbe: schedule NAPI on RBU event During the RX overload the Rx buffers may not be refilled, trying to schedule the NAPI when an Rx Buffer Unavailable is signaled may help in improving the such situation, in case we missed an IRQ. ==================== Link: https://patch.msgid.link/20251129175016.3034185-1-Raju.Rangoju@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01amd-xgbe: schedule NAPI on Rx Buffer Unavailable (RBU)Raju Rangoju
Under heavy load, Rx Buffer Unavailable (RBU) can occur if Rx processing is slower than network. When an RBU is signaled, try to schedule NAPI to help recover from such situation (including cases where an IRQ may be missed or such) Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Link: https://patch.msgid.link/20251129175016.3034185-3-Raju.Rangoju@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01amd-xgbe: refactor the dma IRQ handling code pathRaju Rangoju
Refactor the DMA interrupt bottom-half handling to improve the readability, maintainability, without changing the intended behavior. Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Link: https://patch.msgid.link/20251129175016.3034185-2-Raju.Rangoju@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01team: Add matching error label for failed actionNikola Z. Ivanov
Follow the "action" - "err_action" pairing of labels found across the source code of team net device. Currently in team_port_add the err_set_slave_promisc label is reused for exiting on error when setting allmulti level of the new slave. Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20251128072544.223645-1-zlatistiv@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01Merge tag 'vfs-6.19-rc1.fs_header' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull fs header updates from Christian Brauner: "This contains initial work to start splitting up fs.h. Begin the long-overdue work of splitting up the monolithic fs.h header. The header has grown to over 3000 lines and includes types and functions for many different subsystems, making it difficult to navigate and causing excessive compilation dependencies. This series introduces new focused headers for superblock-related code: - Rename fs_types.h to fs_dirent.h to better reflect its actual content (directory entry types) - Add fs/super_types.h containing superblock type definitions - Add fs/super.h containing superblock function declarations This is the first step in a longer effort to modularize the VFS headers. Cleanups: - Inode Field Layout Optimization (Mateusz Guzik) Move inode fields used during fast path lookup closer together to improve cache locality during path resolution. - current_umask() Optimization (Mateusz Guzik) Inline current_umask() and move it to fs_struct.h. This improves performance by avoiding function call overhead for this frequently-used function, and places it in a more appropriate header since it operates on fs_struct" * tag 'vfs-6.19-rc1.fs_header' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: move inode fields used during fast path lookup closer together fs: inline current_umask() and move it to fs_struct.h fs: add fs/super.h header fs: add fs/super_types.h header fs: rename fs_types.h to fs_dirent.h
2025-12-01net: mana: Handle hardware recovery events when probing the deviceLong Li
When MANA is being probed, it's possible that hardware is in recovery mode and the device may get GDMA_EQE_HWC_RESET_REQUEST over HWC in the middle of the probe. Detect such condition and go through the recovery service procedure. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://patch.msgid.link/1764193552-9712-1-git-send-email-longli@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01net: mctp: test: move TX packetqueue from dst to devJeremy Kerr
To capture TX packets during a test, we are currently intercepting the dst->output with an implementation that adds the transmitted packet to a skb queue attached to the test-specific mock dst. The netdev itself is not involved in the test TX path. Instead, we can just use our test device to stash TXed packets for later inspection by the test. This means we can include the actual mctp_dst_output() implementation in the test (by setting dst.output in the test case), and don't need to be creating fake dst objects, or their corresponding skb queues. We need to ensure that the netdev is up to allow delivery to ndo_start_xmit, but the tests assume active devices at present anyway. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20251126-dev-mctp-test-tx-queue-v2-1-4e5bbd1d6c57@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>