summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2025-09-25net: gro: remove unnecessary df checksRichard Gobert
Currently, packets with fixed IDs will be merged only if their don't-fragment bit is set. This restriction is unnecessary since packets without the don't-fragment bit will be forwarded as-is even if they were merged together. The merged packets will be segmented into their original forms before being forwarded, either by GSO or by TSO. The IDs will also remain identical unless NETIF_F_TSO_MANGLEID is set, in which case the IDs can become incrementing, which is also fine. Clean up the code by removing the unnecessary don't-fragment checks. Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250923085908.4687-5-richardbgobert@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-25net: gso: restore ids of outer ip headers correctlyRichard Gobert
Currently, NETIF_F_TSO_MANGLEID indicates that the inner-most ID can be mangled. Outer IDs can always be mangled. Make GSO preserve outer IDs by default, with NETIF_F_TSO_MANGLEID allowing both inner and outer IDs to be mangled. This commit also modifies a few drivers that use SKB_GSO_FIXEDID directly. Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> # for sfc Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250923085908.4687-4-richardbgobert@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-25net: gro: only merge packets with incrementing or fixed outer idsRichard Gobert
Only merge encapsulated packets if their outer IDs are either incrementing or fixed, just like for inner IDs and IDs of non-encapsulated packets. Add another ip_fixedid bit for a total of two bits: one for outer IDs (and for unencapsulated packets) and one for inner IDs. This commit preserves the current behavior of GSO where only the IDs of the inner-most headers are restored correctly. Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250923085908.4687-3-richardbgobert@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-25net: gro: remove is_ipv6 from napi_gro_cbRichard Gobert
Remove is_ipv6 from napi_gro_cb and use sk->sk_family instead. This frees up space for another ip_fixedid bit that will be added in the next commit. udp_sock_create always creates either a AF_INET or a AF_INET6 socket, so using sk->sk_family is reliable. In IPv6-FOU, cfg->ipv6_v6only is always enabled. Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250923085908.4687-2-richardbgobert@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-25sched: Fix some typos in include/linux/preempt.hMenglong Dong
There are some typos in the comments of migrate in include/linux/preempt.h: elegible -> eligible it's -> its migirate_disable -> migrate_disable abritrary -> arbitrary Just fix them. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2025-09-25sched: Make migrate_{en,dis}able() inlineMenglong Dong
For now, migrate_enable and migrate_disable are global, which makes them become hotspots in some case. Take BPF for example, the function calling to migrate_enable and migrate_disable in BPF trampoline can introduce significant overhead, and following is the 'perf top' of FENTRY's benchmark (./tools/testing/selftests/bpf/bench trig-fentry): 54.63% bpf_prog_2dcccf652aac1793_bench_trigger_fentry [k] bpf_prog_2dcccf652aac1793_bench_trigger_fentry 10.43% [kernel] [k] migrate_enable 10.07% bpf_trampoline_6442517037 [k] bpf_trampoline_6442517037 8.06% [kernel] [k] __bpf_prog_exit_recur 4.11% libc.so.6 [.] syscall 2.15% [kernel] [k] entry_SYSCALL_64 1.48% [kernel] [k] memchr_inv 1.32% [kernel] [k] fput 1.16% [kernel] [k] _copy_to_user 0.73% [kernel] [k] bpf_prog_test_run_raw_tp So in this commit, we make migrate_enable/migrate_disable inline to obtain better performance. The struct rq is defined internally in kernel/sched/sched.h, and the field "nr_pinned" is accessed in migrate_enable/migrate_disable, which makes it hard to make them inline. Alexei Starovoitov suggests to generate the offset of "nr_pinned" in [1], so we can define the migrate_enable/migrate_disable in include/linux/sched.h and access "this_rq()->nr_pinned" with "(void *)this_rq() + RQ_nr_pinned". The offset of "nr_pinned" is generated in include/generated/rq-offsets.h by kernel/sched/rq-offsets.c. Generally speaking, we move the definition of migrate_enable and migrate_disable to include/linux/sched.h from kernel/sched/core.c. The calling to __set_cpus_allowed_ptr() is leaved in ___migrate_enable(). The "struct rq" is not available in include/linux/sched.h, so we can't access the "runqueues" with this_cpu_ptr(), as the compilation will fail in this_cpu_ptr() -> raw_cpu_ptr() -> __verify_pcpu_ptr(): typeof((ptr) + 0) So we introduce the this_rq_raw() and access the runqueues with arch_raw_cpu_ptr/PERCPU_PTR directly. The variable "runqueues" is not visible in the kernel modules, and export it is not a good idea. As Peter Zijlstra advised in [2], we define and export migrate_enable/migrate_disable in kernel/sched/core.c too, and use them for the modules. Before this patch, the performance of BPF FENTRY is: fentry : 113.030 ± 0.149M/s fentry : 112.501 ± 0.187M/s fentry : 112.828 ± 0.267M/s fentry : 115.287 ± 0.241M/s After this patch, the performance of BPF FENTRY increases to: fentry : 143.644 ± 0.670M/s fentry : 149.764 ± 0.362M/s fentry : 149.642 ± 0.156M/s fentry : 145.263 ± 0.221M/s Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/bpf/CAADnVQ+5sEDKHdsJY5ZsfGDO_1SEhhQWHrt2SMBG5SYyQ+jt7w@mail.gmail.com/ [1] Link: https://lore.kernel.org/all/20250819123214.GH4067720@noisy.programming.kicks-ass.net/ [2]
2025-09-25rcu: Replace preempt.h with sched.h in include/linux/rcupdate.hMenglong Dong
In the next commit, we will move the definition of migrate_enable() and migrate_disable() to linux/sched.h. However, migrate_enable/migrate_disable will be used in commit 1b93c03fb319 ("rcu: add rcu_read_lock_dont_migrate()") in bpf-next tree. In order to fix potential compiling error, replace linux/preempt.h with linux/sched.h in include/linux/rcupdate.h. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2025-09-25sched/deadline: Fix dl_server behaviourPeter Zijlstra
John reported undesirable behaviour with the dl_server since commit: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling"). When starving fair tasks on purpose (starting spinning FIFO tasks), his fair workload, which often goes (briefly) idle, would delay fair invocations for a second, running one invocation per second was both unexpected and terribly slow. The reason this happens is that when dl_se->server_pick_task() returns NULL, indicating no runnable tasks, it would yield, pushing any later jobs out a whole period (1 second). Instead simply stop the server. This should restore behaviour in that a later wakeup (which restarts the server) will be able to continue running (subject to the CBS wakeup rules). Notably, this does not re-introduce the behaviour cccb45d7c4295 set out to solve, any start/stop cycle is naturally throttled by the timer period (no active cancel). Fixes: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling") Reported-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: John Stultz <jstultz@google.com>
2025-09-25sched/deadline: Fix dl_server getting stuckPeter Zijlstra
John found it was easy to hit lockup warnings when running locktorture on a 2 CPU VM, which he bisected down to: commit cccb45d7c429 ("sched/deadline: Less agressive dl_server handling"). While debugging it seems there is a chance where we end up with the dl_server dequeued, with dl_se->dl_server_active. This causes dl_server_start() to return without enqueueing the dl_server, thus it fails to run when RT tasks starve the cpu. When this happens, dl_server_timer() catches the '!dl_se->server_has_tasks(dl_se)' case, which then calls replenish_dl_entity() and dl_server_stopped() and finally return HRTIMER_NO_RESTART. This ends in no new timer and also no enqueue, leaving the dl_server 'dead', allowing starvation. What should have happened is for the bandwidth timer to start the zero-laxity timer, which in turn would enqueue the dl_server and cause dl_se->server_pick_task() to be called -- which will stop the dl_server if no fair tasks are observed for a whole period. IOW, it is totally irrelevant if there are fair tasks at the moment of bandwidth refresh. This removes all dl_se->server_has_tasks() users, so remove the whole thing. Fixes: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling") Reported-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: John Stultz <jstultz@google.com>
2025-09-25ns: move ns type into struct ns_commonChristian Brauner
It's misplaced in struct proc_ns_operations and ns->ops might be NULL if the namespace is compiled out but we still want to know the type of the namespace for the initial namespace struct. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-25nstree: make struct ns_tree privateChristian Brauner
Don't expose it directly. There's no need to do that. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-25afs: Add support for RENAME_NOREPLACE and RENAME_EXCHANGEDavid Howells
Add support for RENAME_NOREPLACE and RENAME_EXCHANGE, if the server supports them. The default is translated to YFS.Rename_Replace, falling back to YFS.Rename; RENAME_NOREPLACE is translated to YFS.Rename_NoReplace and RENAME_EXCHANGE to YFS.Rename_Exchange, both of which fall back to reporting EINVAL. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/740476.1758718189@warthog.procyon.org.uk cc: Marc Dionne <marc.dionne@auristor.com> cc: Dan Carpenter <dan.carpenter@linaro.org> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-25accel/habanalabs: add HL_GET_P_STATE passthrough typeAriel Aviad
Add a new passthrough type HL_GET_P_STATE to the cpucp generic ioctl to allow userspace to read the device performance state via firmware. Signed-off-by: Ariel Aviad <ariel.aviad@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25accel/habanalabs: fix typo in trace output (cms -> cmd)Tomer Tayar
Fix a typo in TP_printk format string of habanalabs tracepoint: replace "cms" with "cmd". Signed-off-by: Tomer Tayar <tomer.tayar@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25accel/habanalabs: add generic message type to get error countersVitaly Margolin
Add a new CPUCP generic message type to retrieve HBM, SRAM and critical error counters from the device. Signed-off-by: Vitaly Margolin <vitaly.margolin@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25once: fix race by moving DO_ONCE to separate sectionQi Xi
The commit c2c60ea37e5b ("once: use __section(".data.once")") moved DO_ONCE's ___done variable to .data.once section, which conflicts with DO_ONCE_LITE() that also uses the same section. This creates a race condition when clear_warn_once is used: Thread 1 (DO_ONCE) Thread 2 (DO_ONCE) __do_once_start read ___done (false) acquire once_lock execute func __do_once_done write ___done (true) __do_once_start release once_lock // Thread 3 clear_warn_once reset ___done read ___done (false) acquire once_lock execute func schedule once_work __do_once_done once_deferred: OK write ___done (true) static_branch_disable release once_lock schedule once_work once_deferred: BUG_ON(!static_key_enabled) DO_ONCE_LITE() in once_lite.h is used by WARN_ON_ONCE() and other warning macros. Keep its ___done flag in the .data..once section and allow resetting by clear_warn_once, as originally intended. In contrast, DO_ONCE() is used for functions like get_random_once() and relies on its ___done flag for internal synchronization. We should not reset DO_ONCE() by clear_warn_once. Fix it by isolating DO_ONCE's ___done into a separate .data..do_once section, shielding it from clear_warn_once. Fixes: c2c60ea37e5b ("once: use __section(".data.once")") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-24scsi: ufs: core: Fix data race in CPU latency PM QoS request handlingZhongqiu Han
The cpu_latency_qos_add/remove/update_request interfaces lack internal synchronization by design, requiring the caller to ensure thread safety. The current implementation relies on the 'pm_qos_enabled' flag, which is insufficient to prevent concurrent access and cannot serve as a proper synchronization mechanism. This has led to data races and list corruption issues. A typical race condition call trace is: [Thread A] ufshcd_pm_qos_exit() --> cpu_latency_qos_remove_request() --> cpu_latency_qos_apply(); --> pm_qos_update_target() --> plist_del <--(1) delete plist node --> memset(req, 0, sizeof(*req)); --> hba->pm_qos_enabled = false; [Thread B] ufshcd_devfreq_target --> ufshcd_devfreq_scale --> ufshcd_scale_clks --> ufshcd_pm_qos_update <--(2) pm_qos_enabled is true --> cpu_latency_qos_update_request --> pm_qos_update_target --> plist_del <--(3) plist node use-after-free Introduces a dedicated mutex to serialize PM QoS operations, preventing data races and ensuring safe access to PM QoS resources, including sysfs interface reads. Fixes: 2777e73fc154 ("scsi: ufs: core: Add CPU latency QoS support for UFS driver") Signed-off-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Huan Tang <tanghuan@vivo.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-24scsi: ufs: core: Change MCQ interrupt enable flowPeter Wang
Move the MCQ interrupt enable process to ufshcd_mcq_make_queues_operational() to ensure that interrupts are set correctly when making queues operational, similar to ufshcd_make_hba_operational(). This change addresses the issue where ufshcd_mcq_make_queues_operational() was not fully operational due to missing interrupt enablement. This change only affects host drivers that call ufshcd_mcq_make_queues_operational(), i.e. ufs-mediatek. Signed-off-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-24byteorder: Add memcpy_to_le32() and memcpy_from_le32()Anup Patel
Add common memcpy APIs for copying u32 array to/from __le32 array. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Jassi Brar <jassisinghbrar@gmail.com> Link: https://lore.kernel.org/r/20250818040920.272664-7-apatel@ventanamicro.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-24mailbox: Allow controller specific mapping using fwnodeAnup Patel
Introduce optional fw_node() callback which allows a mailbox controller driver to provide controller specific mapping using fwnode. The Linux OF framework already implements fwnode operations for the Linux DD framework so the fw_xlate() callback works fine with device tree as well. Acked-by: Jassi Brar <jassisinghbrar@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Link: https://lore.kernel.org/r/20250818040920.272664-6-apatel@ventanamicro.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-24hfs/hfsplus: rework debug output subsystemViacheslav Dubeyko
Currently, HFS/HFS+ has very obsolete and inconvenient debug output subsystem. Also, the code is duplicated in HFS and HFS+ driver. This patch introduces linux/hfs_common.h for gathering common declarations, inline functions, and common short methods. Currently, this file contains only hfs_dbg() function that employs pr_debug() with the goal to print a debug-level messages conditionally. So, now, it is possible to enable the debug output by means of: echo 'file extent.c +p' > /proc/dynamic_debug/control echo 'func hfsplus_evict_inode +p' > /proc/dynamic_debug/control And debug output looks like this: hfs: pid 5831:fs/hfs/catalog.c:228 hfs_cat_delete(): delete_cat: 00,48 hfs: pid 5831:fs/hfs/extent.c:484 hfs_file_truncate(): truncate: 48, 409600 -> 0 hfs: pid 5831:fs/hfs/extent.c:212 hfs_dump_extent(): hfs: pid 5831:fs/hfs/extent.c:214 hfs_dump_extent(): 78:4 hfs: pid 5831:fs/hfs/extent.c:214 hfs_dump_extent(): 0:0 hfs: pid 5831:fs/hfs/extent.c:214 hfs_dump_extent(): 0:0 v4 Debug messages have been reworked and information about new HFS/HFS+ shared declarations file has been added to MAINTAINERS file. v5 Yangtao Li suggested to clean up debug output and fix several typos. Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> cc: Yangtao Li <frank.li@vivo.com> cc: linux-fsdevel@vger.kernel.org cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-09-24mailbox: Add common header for RPMI messages sent via mailboxAnup Patel
The RPMI based mailbox controller drivers and mailbox clients need to share defines related to RPMI messages over mailbox interface so add a common header for this purpose. Acked-by: Jassi Brar <jassisinghbrar@gmail.com> Co-developed-by: Rahul Pathak <rpathak@ventanamicro.com> Signed-off-by: Rahul Pathak <rpathak@ventanamicro.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Link: https://lore.kernel.org/r/20250818040920.272664-5-apatel@ventanamicro.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-24crypto: af_alg - Fix incorrect boolean values in af_alg_ctxEric Biggers
Commit 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg") changed some fields from bool to 1-bit bitfields of type u32. However, some assignments to these fields, specifically 'more' and 'merge', assign values greater than 1. These relied on C's implicit conversion to bool, such that zero becomes false and nonzero becomes true. With a 1-bit bitfields of type u32 instead, mod 2 of the value is taken instead, resulting in 0 being assigned in some cases when 1 was intended. Fix this by restoring the bool type. Fixes: 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-09-24Merge tag 'soc-fixes-6.17-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "There are a few minor code fixes for tegra firmware, i.MX firmware and the eyeq reset controller, and a MAINTAINERS update as Alyssa Rosenzweig moves on to non-kernel projects. The other changes are all for devicetree files: - Multiple Marvell Armada SoCs need changes to fix PCIe, audio and SATA - A socfpga board fails to probe the ethernet phy - The two temperature sensors on i.MX8MP are swapped - Allwinner devicetree files cause build-time warnings - Two Rockchip based boards need corrections for headphone detection and SPI flash" * tag 'soc-fixes-6.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: MAINTAINERS: remove Alyssa Rosenzweig firmware: tegra: Do not warn on missing memory-region property arm64: dts: marvell: cn9132-clearfog: fix multi-lane pci x2 and x4 ports arm64: dts: marvell: cn9132-clearfog: disable eMMC high-speed modes arm64: dts: marvell: cn913x-solidrun: fix sata ports status ARM: dts: kirkwood: Fix sound DAI cells for OpenRD clients arm64: dts: imx8mp: Correct thermal sensor index ARM: imx: Kconfig: Adjust select after renamed config option firmware: imx: Add stub functions for SCMI CPU API firmware: imx: Add stub functions for SCMI LMM API firmware: imx: Add stub functions for SCMI MISC API riscv: dts: allwinner: rename devterm i2c-gpio node to comply with binding arm64: dts: rockchip: Fix the headphone detection on the orangepi 5 arm64: dts: rockchip: Add vcc supply for SPI Flash on NanoPC-T6 ARM: dts: socfpga: sodia: Fix mdio bus probe and PHY address reset: eyeq: fix OF node leak ARM64: dts: mcbin: fix SATA ports on Macchiatobin ARM: dts: armada-370-db: Fix stereo audio input routing on Armada 370 ARM: dts: allwinner: Minor whitespace cleanup
2025-09-24kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFIKees Cook
The kernel's CFI implementation uses the KCFI ABI specifically, and is not strictly tied to a particular compiler. In preparation for GCC supporting KCFI, rename CONFIG_CFI_CLANG to CONFIG_CFI (along with associated options). Use new "transitional" Kconfig option for old CONFIG_CFI_CLANG that will enable CONFIG_CFI during olddefconfig. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250923213422.1105654-3-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-09-24Merge back earlier cpufreq material for 6.18Rafael J. Wysocki
2025-09-24Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-09-23 We've added 9 non-merge commits during the last 33 day(s) which contain a total of 10 files changed, 480 insertions(+), 53 deletions(-). The main changes are: 1) A new bpf_xdp_pull_data kfunc that supports pulling data from a frag into the linear area of a xdp_buff, from Amery Hung. This includes changes in the xdp_native.bpf.c selftest, which Nimrod's future work depends on. It is a merge from a stable branch 'xdp_pull_data' which has also been merged to bpf-next. There is a conflict with recent changes in 'include/net/xdp.h' in the net-next tree that will need to be resolved. 2) A compiler warning fix when CONFIG_NET=n in the recent dynptr skb_meta support, from Jakub Sitnicki. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests: drv-net: Pull data before parsing headers selftests/bpf: Test bpf_xdp_pull_data bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN bpf: Make variables in bpf_prog_test_run_xdp less confusing bpf: Clear packet pointers after changing packet data in kfuncs bpf: Support pulling non-linear xdp data bpf: Allow bpf_xdp_shrink_data to shrink a frag from head and tail bpf: Clear pfmemalloc flag when freeing all fragments bpf: Return an error pointer for skb metadata when CONFIG_NET=n ==================== Link: https://patch.msgid.link/20250924050303.2466356-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-24Merge patch series "Add generated modalias to modules.builtin.modinfo"Nathan Chancellor
Alexey Gladkov says: The modules.builtin.modinfo file is used by userspace (kmod to be specific) to get information about builtin modules. Among other information about the module, information about module aliases is stored. This is very important to determine that a particular modalias will be handled by a module that is inside the kernel. There are several mechanisms for creating modalias for modules: The first is to explicitly specify the MODULE_ALIAS of the macro. In this case, the aliases go into the '.modinfo' section of the module if it is compiled separately or into vmlinux.o if it is builtin into the kernel. The second is the use of MODULE_DEVICE_TABLE followed by the use of the modpost utility. In this case, vmlinux.o no longer has this information and does not get it into modules.builtin.modinfo. For example: $ modinfo pci:v00008086d0000A36Dsv00001043sd00008694bc0Csc03i30 modinfo: ERROR: Module pci:v00008086d0000A36Dsv00001043sd00008694bc0Csc03i30 not found. $ modinfo xhci_pci name: xhci_pci filename: (builtin) license: GPL file: drivers/usb/host/xhci-pci description: xHCI PCI Host Controller Driver The builtin module is missing alias "pci:v*d*sv*sd*bc0Csc03i30*" which will be generated by modpost if the module is built separately. To fix this it is necessary to add the generated by modpost modalias to modules.builtin.modinfo. Fortunately modpost already generates .vmlinux.export.c for exported symbols. It is possible to add `.modinfo` for builtin modules and modify the build system so that `.modinfo` section is extracted from the intermediate vmlinux after modpost is executed. Link: https://patch.msgid.link/cover.1758182101.git.legion@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-09-24modpost: Create modalias for builtin modulesAlexey Gladkov
For some modules, modalias is generated using the modpost utility and the section is added to the module file. When a module is added inside vmlinux, modpost does not generate modalias for such modules and the information is lost. As a result kmod (which uses modules.builtin.modinfo in userspace) cannot determine that modalias is handled by a builtin kernel module. $ cat /sys/devices/pci0000:00/0000:00:14.0/modalias pci:v00008086d0000A36Dsv00001043sd00008694bc0Csc03i30 $ modinfo xhci_pci name: xhci_pci filename: (builtin) license: GPL file: drivers/usb/host/xhci-pci description: xHCI PCI Host Controller Driver Missing modalias "pci:v*d*sv*sd*bc0Csc03i30*" which will be generated by modpost if the module is built separately. To fix this it is necessary to generate the same modalias for vmlinux as for the individual modules. Fortunately '.vmlinux.export.o' is already generated from which '.modinfo' can be extracted in the same way as for vmlinux.o. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Alexey Gladkov <legion@kernel.org> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Nicolas Schier <nsc@kernel.org> Link: https://patch.msgid.link/28d4da3b0e3fc8474142746bcf469e03752c3208.1758182101.git.legion@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-09-24modpost: Add modname to mod_device_table aliasAlexey Gladkov
At this point, if a symbol is compiled as part of the kernel, information about which module the symbol belongs to is lost. To save this it is possible to add the module name to the alias name. It's not very pretty, but it's possible for now. Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: rust-for-linux@vger.kernel.org Signed-off-by: Alexey Gladkov <legion@kernel.org> Acked-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Nicolas Schier <nsc@kernel.org> Link: https://patch.msgid.link/1a0d0bd87a4981d465b9ed21e14f4e78eaa03ded.1758182101.git.legion@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-09-24kbuild: keep .modinfo section in vmlinux.unstrippedMasahiro Yamada
Keep the .modinfo section during linking, but strip it from the final vmlinux. Adjust scripts/mksysmap to exclude modinfo symbols from kallsyms. This change will allow the next commit to extract the .modinfo section from the vmlinux.unstripped intermediate. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Alexey Gladkov <legion@kernel.org> Reviewed-by: Nicolas Schier <nsc@kernel.org> Link: https://patch.msgid.link/aaf67c07447215463300fccaa758904bac42f992.1758182101.git.legion@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-09-24can: dev: add can_get_ctrlmode_str()Vincent Mailhol
In an effort to give more human readable messages when errors occur because of conflicting options, it can be useful to convert the CAN control mode flags into text. Add a function which converts the first set CAN control mode into a human readable string. The reason to only convert the first one is to simplify edge cases: imagine that there are several invalid control modes, we would just return the first invalid one to the user, thus not having to handle complex string concatenation. The user can then solve the first problem, call the netlink interface again and see the next issue. People who wish to enumerate all the control modes can still do so by, for example, using this new function in a for_each_set_bit() loop. Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20250923-canxl-netlink-prep-v4-19-e720d28f66fe@kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-24can: calc_bittiming: make can_calc_tdco() FD agnosticVincent Mailhol
can_calc_tdco() uses the CAN_CTRLMODE_FD_TDC_MASK and CAN_CTRLMODE_TDC_AUTO macros making it specific to CAN FD. Add the tdc mask to the function parameter list. The value of the tdc auto flag can then be derived from that mask and stored in a local variable. This way, the function becomes CAN FD agnostic and can be reused later on for the CAN XL TDC. Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20250923-canxl-netlink-prep-v4-18-e720d28f66fe@kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-24can: netlink: add can_validate_tdc()Vincent Mailhol
Factorise the TDC validation out of can_validate() and move it in the new can_validate_tdc() function. This is a preparation patch for the introduction of CAN XL because this TDC validation will be reused later on. Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20250923-canxl-netlink-prep-v4-5-e720d28f66fe@kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-24can: netlink: document which symbols are FD specificVincent Mailhol
The CAN XL netlink interface will also have data bitrate and TDC parameters. The current FD parameters do not have a prefix in their names to differentiate them. Because the netlink interface is part of the UAPI, it is unfortunately not feasible to rename the existing symbols to add an FD_ prefix. The best alternative is to add a comment for each of the symbols to notify the reader of which parts are CAN FD specific. While at it, fix a typo: transiver -> transceiver. Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20250923-canxl-netlink-prep-v4-3-e720d28f66fe@kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-24can: dev: make can_get_relative_tdco() FD agnostic and move it to bittiming.hVincent Mailhol
can_get_relative_tdco() needs to access can_priv->fd making it specific to CAN FD. Change the function parameter from struct can_priv to struct data_bittiming_params. This way, the function becomes CAN FD agnostic and can be reused later on for the CAN XL TDC. Now that we dropped the dependency on struct can_priv, also move can_get_relative_tdco() back to bittiming.h where it was meant to belong to. Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20250923-canxl-netlink-prep-v4-2-e720d28f66fe@kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-24can: dev: move struct data_bittiming_params to linux/can/bittiming.hVincent Mailhol
In commit b803c4a4f788 ("can: dev: add struct data_bittiming_params to group FD parameters"), struct data_bittiming_params was put into linux/can/dev.h. This structure being a collection of bittiming parameters, on second thought, bittiming.h is actually a better location. This way, users of struct data_bittiming_params will not have to forcefully include linux/can/dev.h thus removing some complexity and reducing the risk of circular dependencies in headers. Move struct data_bittiming_params from linux/can/dev.h to linux/can/bittiming.h. Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20250923-canxl-netlink-prep-v4-1-e720d28f66fe@kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-24can: populate the minimum and maximum MTU valuesVincent Mailhol
By populating: net_device->min_mtu and net_device->max_mtu the net core infrastructure will automatically: 1. validate that the user's inputs are in range. 2. report those min and max MTU values through the netlink interface. Add can_set_default_mtu() which sets the default mtu value as well as the minimum and maximum values. The logic for the default mtu value remains unchanged: - CANFD_MTU if the device has a static CAN_CTRLMODE_FD. - CAN_MTU otherwise. Call can_set_default_mtu() each time the CAN_CTRLMODE_FD is modified. This will guarantee that the MTU value is always consistent with the control mode flags. With this, the checks done in can_change_mtu() become fully redundant and will be removed in an upcoming change and it is now possible to confirm the minimum and maximum MTU values on a physical CAN interface by doing: $ ip --details link show can0 The virtual interfaces (vcan and vxcan) are not impacted by this change. Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20250923-can-fix-mtu-v3-3-581bde113f52@kernel.org [mkl: squashed https://patch.msgid.link/20250924143644.17622-2-mailhol@kernel.org] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-24hwmon: (dell-smm) Add support for automatic fan modeArmin Wolf
Many machines treat fan state 3 as some sort of automatic mode, which is superior to the separate SMM calls for switching to automatic fan mode for two reasons: - the fan control mode can be controlled for each fan separately - the current fan control mode can be retrieved from the BIOS On some machines however, this special fan state does not exist. Fan state 3 acts like a regular fan state on such machines or does not exist at all. Such machines usually use separate SMM calls for enabling/disabling automatic fan control. Add support for it. If the machine supports separate SMM calls for changing the fan control mode, then the other interface is ignored. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20250917181036.10972-4-W_Armin@gmx.de Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-09-24asm-generic/io.h: Skip trace helpers if rwmmio events are disabledVarad Gautam
With `CONFIG_TRACE_MMIO_ACCESS=y`, the `{read,write}{b,w,l,q}{_relaxed}()` mmio accessors unconditionally call `log_{post_}{read,write}_mmio()` helpers, which in turn call the ftrace ops for `rwmmio` trace events This adds a performance penalty per mmio accessor call, even when `rwmmio` events are disabled at runtime (~80% overhead on local measurement). Guard these with `tracepoint_enabled()`. Signed-off-by: Varad Gautam <varadgautam@google.com> Fixes: 210031971cdd ("asm-generic/io: Add logging support for MMIO accessors") Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-24gpio: generic: move GPIO_GENERIC_ flags to the correct headerBartosz Golaszewski
These flags are specific to gpio-mmio and belong in linux/gpio/generic.h so move them there. Link: https://lore.kernel.org/r/20250917-gpio-generic-flags-v1-2-69f51fee8c89@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-09-24gpio: generic: rename BGPIOF_ flags to GPIO_GENERIC_Bartosz Golaszewski
Make the flags passed to gpio_generic_chip_init() use the same prefix as the rest of the modernized generic GPIO chip API. Link: https://lore.kernel.org/r/20250917-gpio-generic-flags-v1-1-69f51fee8c89@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-09-24Merge tag 'coresight-next-v6.18-v2' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/coresight/linux into char-misc-next Suzuki writes: coresight: Updates for Linux v6.18, take 2 This is an updated drop for v6.18, fixing the invalid commit reference in the original tag. CoreSight selfhosted tracing subsystem updates targeting Linux v6.18, includes: - Clean up and consolidate clocks handling - Support for exposing labels via sysfs for better device identification - Add Qualcomm Trace Network On Chip driver support - Miscellaneous fixes Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> * tag 'coresight-next-v6.18-v2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/coresight/linux: (27 commits) coresight: Add label sysfs node support dt-bindings: arm: Add label in the coresight components coresight: tnoc: add new AMBA ID to support Trace Noc V2 coresight: Fix incorrect handling for return value of devm_kzalloc coresight: tpda: fix the logic to setup the element size coresight: trbe: Return NULL pointer for allocation failures coresight: Refactor runtime PM coresight: Make clock sequence consistent coresight: Refactor driver data allocation coresight: Consolidate clock enabling coresight: Avoid enable programming clock duplicately coresight: Appropriately disable trace bus clocks coresight: Appropriately disable programming clocks coresight: etm4x: Support atclk coresight: catu: Support atclk coresight: tmc: Support atclk coresight-etm4x: Conditionally access register TRCEXTINSELR coresight: fix indentation error in cscfg_remove_owned_csdev_configs() coresight: tnoc: Fix a NULL vs IS_ERR() bug in probe coresight: add coresight Trace Network On Chip driver ...
2025-09-24bpf: Mark kfuncs as __nocloneAndrea Righi
Some distributions (e.g., CachyOS) support building the kernel with -O3, but doing so may break kfuncs, resulting in their symbols not being properly exported. In fact, with gcc -O3, some kfuncs may be optimized away despite being annotated as noinline. This happens because gcc can still clone the function during IPA optimizations, e.g., by duplicating or inlining it into callers, and then dropping the standalone symbol. This breaks BTF ID resolution since resolve_btfids relies on the presence of a global symbol for each kfunc. Currently, this is not an issue for upstream, because we don't allow building the kernel with -O3, but it may be safer to address it anyway, to prevent potential issues in the future if compilers become more aggressive with optimizations. Therefore, add __noclone to __bpf_kfunc to ensure kfuncs are never cloned and remain distinct, globally visible symbols, regardless of the optimization level. Fixes: 57e7c169cd6af ("bpf: Add __bpf_kfunc tag for marking kernel functions as kfuncs") Acked-by: David Vernet <void@manifault.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrea Righi <arighi@nvidia.com> Link: https://lore.kernel.org/r/20250924081426.156934-1-arighi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-24bpf: Allow uprobe program to change context registersJiri Olsa
Currently uprobe (BPF_PROG_TYPE_KPROBE) program can't write to the context registers data. While this makes sense for kprobe attachments, for uprobe attachment it might make sense to be able to change user space registers to alter application execution. Since uprobe and kprobe programs share the same type (BPF_PROG_TYPE_KPROBE), we can't deny write access to context during the program load. We need to check on it during program attachment to see if it's going to be kprobe or uprobe. Storing the program's write attempt to context and checking on it during the attachment. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20250916215301.664963-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-23net/mlx5: fs, fix UAF in flow counter releaseMoshe Shemesh
Fix a kernel trace [1] caused by releasing an HWS action of a local flow counter in mlx5_cmd_hws_delete_fte(), where the HWS action refcount and mutex were not initialized and the counter struct could already be freed when deleting the rule. Fix it by adding the missing initializations and adding refcount for the local flow counter struct. [1] Kernel log: Call Trace: <TASK> dump_stack_lvl+0x34/0x48 mlx5_fs_put_hws_action.part.0.cold+0x21/0x94 [mlx5_core] mlx5_fc_put_hws_action+0x96/0xad [mlx5_core] mlx5_fs_destroy_fs_actions+0x8b/0x152 [mlx5_core] mlx5_cmd_hws_delete_fte+0x5a/0xa0 [mlx5_core] del_hw_fte+0x1ce/0x260 [mlx5_core] mlx5_del_flow_rules+0x12d/0x240 [mlx5_core] ? ttwu_queue_wakelist+0xf4/0x110 mlx5_ib_destroy_flow+0x103/0x1b0 [mlx5_ib] uverbs_free_flow+0x20/0x50 [ib_uverbs] destroy_hw_idr_uobject+0x1b/0x50 [ib_uverbs] uverbs_destroy_uobject+0x34/0x1a0 [ib_uverbs] uobj_destroy+0x3c/0x80 [ib_uverbs] ib_uverbs_run_method+0x23e/0x360 [ib_uverbs] ? uverbs_finalize_object+0x60/0x60 [ib_uverbs] ib_uverbs_cmd_verbs+0x14f/0x2c0 [ib_uverbs] ? do_tty_write+0x1a9/0x270 ? file_tty_write.constprop.0+0x98/0xc0 ? new_sync_write+0xfc/0x190 ib_uverbs_ioctl+0xd7/0x160 [ib_uverbs] __x64_sys_ioctl+0x87/0xc0 do_syscall_64+0x59/0x90 Fixes: b581f4266928 ("net/mlx5: fs, manage flow counters HWS action sharing by refcount") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1758525094-816583-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-23net: phy: stop exporting phy_driver_registerHeiner Kallweit
phy_driver_register() isn't used outside phy_device.c any longer, so we can stop exporting it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/dff44b83-4a85-4fff-bf6b-f12efd97b56e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-23udp: remove busylock and add per NUMA queuesEric Dumazet
busylock was protecting UDP sockets against packet floods, but unfortunately was not protecting the host itself. Under stress, many cpus could spin while acquiring the busylock, and NIC had to drop packets. Or packets would be dropped in cpu backlog if RPS/RFS were in place. This patch replaces the busylock by intermediate lockless queues. (One queue per NUMA node). This means that fewer number of cpus have to acquire the UDP receive queue lock. Most of the cpus can either: - immediately drop the packet. - or queue it in their NUMA aware lockless queue. Then one of the cpu is chosen to process this lockless queue in a batch. The batch only contains packets that were cooked on the same NUMA node, thus with very limited latency impact. Tested: DDOS targeting a victim UDP socket, on a platform with 6 NUMA nodes (Intel(R) Xeon(R) 6985P-C) Before: nstat -n ; sleep 1 ; nstat | grep Udp Udp6InDatagrams 1004179 0.0 Udp6InErrors 3117 0.0 Udp6RcvbufErrors 3117 0.0 After: nstat -n ; sleep 1 ; nstat | grep Udp Udp6InDatagrams 1116633 0.0 Udp6InErrors 14197275 0.0 Udp6RcvbufErrors 14197275 0.0 We can see this host can now proces 14.2 M more packets per second while under attack, and the victim socket can receive 11 % more packets. I used a small bpftrace program measuring time (in us) spent in __udp_enqueue_schedule_skb(). Before: @udp_enqueue_us[398]: [0] 24901 |@@@ | [1] 63512 |@@@@@@@@@ | [2, 4) 344827 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [4, 8) 244673 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [8, 16) 54022 |@@@@@@@@ | [16, 32) 222134 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [32, 64) 232042 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [64, 128) 4219 | | [128, 256) 188 | | After: @udp_enqueue_us[398]: [0] 5608855 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 1111277 |@@@@@@@@@@ | [2, 4) 501439 |@@@@ | [4, 8) 102921 | | [8, 16) 29895 | | [16, 32) 43500 | | [32, 64) 31552 | | [64, 128) 979 | | [128, 256) 13 | | Note that the remaining bottleneck for this platform is in udp_drops_inc() because we limited struct numa_drop_counters to only two nodes so far. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250922104240.2182559-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-23Merge branch 'bpf-next/xdp_pull_data' into 'bpf-next/master'Martin KaFai Lau
Merge the xdp_pull_data stable branch into the master branch. No conflict. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-09-23Merge branch 'bpf-next/xdp_pull_data' into 'bpf-next/net'Martin KaFai Lau
Merge the xdp_pull_data stable branch into the net branch. No conflict. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>