summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2026-03-14tcp: implement RFC 7323 window retraction receiver requirementsSimon Baatz
By default, the Linux TCP implementation does not shrink the advertised window (RFC 7323 calls this "window retraction") with the following exceptions: - When an incoming segment cannot be added due to the receive buffer running out of memory. Since commit 8c670bdfa58e ("tcp: correct handling of extreme memory squeeze") a zero window will be advertised in this case. It turns out that reaching the required memory pressure is easy when window scaling is in use. In the simplest case, sending a sufficient number of segments smaller than the scale factor to a receiver that does not read data is enough. - Commit b650d953cd39 ("tcp: enforce receive buffer memory limits by allowing the tcp window to shrink") addressed the "eating memory" problem by introducing a sysctl knob that allows shrinking the window before running out of memory. However, RFC 7323 does not only state that shrinking the window is necessary in some cases, it also formulates requirements for TCP implementations when doing so (Section 2.4). This commit addresses the receiver-side requirements: After retracting the window, the peer may have a snd_nxt that lies within a previously advertised window but is now beyond the retracted window. This means that all incoming segments (including pure ACKs) will be rejected until the application happens to read enough data to let the peer's snd_nxt be in window again (which may be never). To comply with RFC 7323, the receiver MUST honor any segment that would have been in window for any ACK sent by the receiver and, when window scaling is in effect, SHOULD track the maximum window sequence number it has advertised. This patch tracks that maximum window sequence number rcv_mwnd_seq throughout the connection and uses it in tcp_sequence() when deciding whether a segment is acceptable. rcv_mwnd_seq is updated together with rcv_wup and rcv_wnd in tcp_select_window(). If we count tcp_sequence() as fast path, it is read in the fast path. Therefore, rcv_mwnd_seq is put into rcv_wnd's cacheline group. The logic for handling received data in tcp_data_queue() is already sufficient and does not need to be updated. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-1-4c7f96b1ec69@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14Merge branch 'io_uring-7.0' into for-7.1/io_uringJens Axboe
Merge upstream io_uring fixes to avoid conflicts in later patches. * io_uring-7.0: io_uring/kbuf: check if target buffer list is still legacy on recycle io_uring: fix physical SQE bounds check for SQE_MIXED 128-byte ops io_uring/eventfd: use ctx->rings_rcu for flags checking io_uring: ensure ctx->rings is stable for task work flags manipulation io_uring/bpf_filter: use bpf_prog_run_pin_on_cpu() to prevent migration io_uring/register: fix comment about task_no_new_privs
2026-03-14blk-integrity: support arbitrary buffer alignmentKeith Busch
A bio segment may have partial interval block data with the rest continuing into the next segments because direct-io data payloads only need to align in memory to the device's DMA limits. At the same time, the protection information may also be split in multiple segments. The most likely way that may happen is if two requests merge, or if we're directly using the io_uring user metadata. The generate/verify, however, only ever accessed the first bip_vec. Further, it may be possible to unalign the protection fields from the user space buffer, or if there are odd additional opaque bytes in front or in back of the protection information metadata region. Change up the iteration to allow spanning multiple segments. This patch is mostly a re-write of the protection information handling to allow any arbitrary alignments, so it's probably easier to review the end result rather than the diff. Many controllers are not able to handle interval data composed of multiple segments when PI is used, so this patch introduces a new integrity limit that a low level driver can set to notify that it is capable, default to false. The nvme driver is the first one to enable it in this patch. Everyone else will force DMA alignment to the logical block size as before to ensure interval data is always aligned within a single segment. Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://patch.msgid.link/20260313144701.1221652-2-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-03-14Merge tag 'device_lock_cond_guard-7.1-rc1' into driver-core-testingDanilo Krummrich
DEFINE_GUARD_COND() for device_lock_interruptible() Introduce conditional guard version of device_lock() for scenarios that require conditional device lock holding. This is a stable tag for other trees to merge. Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-13udp: Remove UDPLITE_SEND_CSCOV and UDPLITE_RECV_CSCOV.Kuniyuki Iwashima
UDP-Lite supports variable-length checksum and has two socket options, UDPLITE_SEND_CSCOV and UDPLITE_RECV_CSCOV, to control the checksum coverage. Let's remove the support. setsockopt(UDPLITE_SEND_CSCOV / UDPLITE_RECV_CSCOV) was only available for UDP-Lite and returned -ENOPROTOOPT for UDP. Now, the options are handled in ip_setsockopt() and ipv6_setsockopt(), which still return the same error. getsockopt(UDPLITE_SEND_CSCOV / UDPLITE_RECV_CSCOV) was available for UDP and always returned 0, meaning full checksum, but now -ENOPROTOOPT is returned. Given that getsockopt() is meaningless for UDP and even the options are not defined under include/uapi/, this should not be a problem. $ man 7 udplite ... BUGS Where glibc support is missing, the following definitions are needed: #define IPPROTO_UDPLITE 136 #define UDPLITE_SEND_CSCOV 10 #define UDPLITE_RECV_CSCOV 11 Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260311052020.1213705-10-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-13dma-mapping: Separate DMA sync issuing and completion waitingBarry Song
Currently, arch_sync_dma_for_cpu and arch_sync_dma_for_device always wait for the completion of each DMA buffer. That is, issuing the DMA sync and waiting for completion is done in a single API call. For scatter-gather lists with multiple entries, this means issuing and waiting is repeated for each entry, which can hurt performance. Architectures like ARM64 may be able to issue all DMA sync operations for all entries first and then wait for completion together. To address this, arch_sync_dma_for_* now batches DMA operations and performs a flush afterward. On ARM64, the flush is implemented with a dsb instruction in arch_sync_dma_flush(). On other architectures, arch_sync_dma_flush() is currently a nop. Cc: Leon Romanovsky <leon@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Ada Couprie Diaz <ada.coupriediaz@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Cc: Tangquan Zheng <zhengtangquan@oppo.com> Reviewed-by: Juergen Gross <jgross@suse.com> # drivers/xen/swiotlb-xen.c Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com> Signed-off-by: Barry Song <baohua@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260228221316.59934-1-21cnbao@gmail.com
2026-03-13of: Add of_machine_get_match() helperGeert Uytterhoeven
Currently, there are two helpers to match the root compatible value against an of_device_id array: - of_machine_device_match() returns true if a match is found, - of_machine_get_match_data() returns the match data if a match is found. However, there is no helper that returns the actual of_device_id structure corresponding to the match, leading to code duplication in various drivers. Fix this by reworking of_machine_device_match() to return the actual match structure, and renaming it to of_machine_get_match(). Retain the old of_machine_device_match() functionality using a cheap static inline wrapper around the new of_machine_get_match() helper. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/14e1c03d443b1a5f210609ec3a1ebbaeab8fb3d9.1772468323.git.geert+renesas@glider.be Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-03-13sched_ext: Implement SCX_ENQ_IMMEDTejun Heo
Add SCX_ENQ_IMMED enqueue flag for local DSQ insertions. Once a task is dispatched with IMMED, it either gets on the CPU immediately and stays on it, or gets reenqueued back to the BPF scheduler. It will never linger on a local DSQ behind other tasks or on a CPU taken by a higher-priority class. rq_is_open() uses rq->next_class to determine whether the rq is available, and wakeup_preempt_scx() triggers reenqueue when a higher-priority class task arrives. These capture all higher class preemptions. Combined with reenqueue points in the dispatch path, all cases where an IMMED task would not execute immediately are covered. SCX_TASK_IMMED persists in p->scx.flags until the next fresh enqueue, so the guarantee survives SAVE/RESTORE cycles. If preempted while running, put_prev_task_scx() reenqueues through ops.enqueue() with SCX_TASK_REENQ_PREEMPTED instead of silently placing the task back on the local DSQ. This enables tighter scheduling latency control by preventing tasks from piling up on local DSQs. It also enables opportunistic CPU sharing across sub-schedulers - without this, a sub-scheduler can stuff the local DSQ of a shared CPU, making it difficult for others to use. v2: - Rewrite is_curr_done() as rq_is_open() using rq->next_class and implement wakeup_preempt_scx() to achieve complete coverage of all cases where IMMED tasks could get stranded. - Track IMMED persistently in p->scx.flags and reenqueue preempted-while-running tasks through ops.enqueue(). - Bound deferred reenq cycles (SCX_REENQ_LOCAL_MAX_REPEAT). - Misc renames, documentation. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
2026-03-13Merge tag 'block-7.0-20260312' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Fix nvme-pci IRQ race and slab-out-of-bounds access - Fix recursive workqueue locking for target async events - Various cleanups - Fix a potential NULL pointer dereference in ublk on size setting - ublk automatic partition scanning fix - Two s390 dasd fixes * tag 'block-7.0-20260312' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: nvme: Annotate struct nvme_dhchap_key with __counted_by nvme-core: do not pass empty queue_limits to blk_mq_alloc_queue() nvme-pci: Fix race bug in nvme_poll_irqdisable() nvmet: move async event work off nvmet-wq nvme-pci: Fix slab-out-of-bounds in nvme_dbbuf_set s390/dasd: Copy detected format information to secondary device s390/dasd: Move quiesce state with pprc swap ublk: don't clear GD_SUPPRESS_PART_SCAN for unprivileged daemons ublk: fix NULL pointer dereference in ublk_ctrl_set_size()
2026-03-13Merge tag 'io_uring-7.0-20260312' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring fixes from Jens Axboe: - Fix an inverted true/false comment on task_no_new_privs, from the BPF filtering changes merged in this release - Use the migration disabling way of running the BPF filters, as the io_uring side doesn't do that already - Fix an issue with ->rings stability under resize, both for local task_work additions and for eventfd signaling - Fix an issue with SQE mixed mode, where a bounds check wasn't correct for having a 128b SQE - Fix an issue where a legacy provided buffer group is changed to to ring mapped one while legacy buffers from that group are in flight * tag 'io_uring-7.0-20260312' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring/kbuf: check if target buffer list is still legacy on recycle io_uring: fix physical SQE bounds check for SQE_MIXED 128-byte ops io_uring/eventfd: use ctx->rings_rcu for flags checking io_uring: ensure ctx->rings is stable for task work flags manipulation io_uring/bpf_filter: use bpf_prog_run_pin_on_cpu() to prevent migration io_uring/register: fix comment about task_no_new_privs
2026-03-13driver core: auxiliary bus: Introduce dev_is_auxiliary()Rafael J. Wysocki
Introduce dev_is_auxiliary() in analogy with dev_is_platform() to facilitate subsequent changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/5079467.GXAFRqVoOG@rafael.j.wysocki
2026-03-13integrity: Eliminate weak definition of arch_get_secureboot()Nathan Chancellor
security/integrity/secure_boot.c contains a single __weak function, which breaks recordmcount when building with clang: $ make -skj"$(nproc)" ARCH=powerpc LLVM=1 ppc64_defconfig security/integrity/secure_boot.o Cannot find symbol for section 2: .text. security/integrity/secure_boot.o: failed Introduce a Kconfig symbol, CONFIG_HAVE_ARCH_GET_SECUREBOOT, to indicate that an architecture provides a definition of arch_get_secureboot(). Provide a static inline stub when this symbol is not defined to achieve the same effect as the __weak function, allowing secure_boot.c to be removed altogether. Move the s390 definition of arch_get_secureboot() out of the CONFIG_KEXEC_FILE block to ensure it is always available, as it does not actually depend on KEXEC_FILE. Reported-by: Arnd Bergmann <arnd@arndb.de> Fixes: 31a6a07eefeb ("integrity: Make arch_ima_get_secureboot integrity-wide") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2026-03-13wifi: ieee80211: fix definition of EHT-MCS 15 in MRUShayne Chen
According to the definition in IEEE Std 802.11be-2024, Table 9-417r, each bit indicates support for the transmission and reception of EHT-MCS 15 in: - B0: 52+26-tone and 106+26-tone MRUs. - B1: a 484+242-tone MRU if 80 MHz is supported. - B2: a 996+484-tone MRU and a 996+484+242-tone MRU if 160 MHz is supported. - B3: a 3×996-tone MRU if 320 MHz is supported. Fixes: 6239da18d2f9 ("wifi: mac80211: adjust EHT capa when lowering bandwidth") Signed-off-by: Shayne Chen <shayne.chen@mediatek.com> Link: https://patch.msgid.link/20260313062150.3165433-1-shayne.chen@mediatek.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-13driver core: make struct class groups members constant arraysHeiner Kallweit
Constify the groups arrays, allowing to assign constant arrays. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/7ff56b07-09ca-4948-b98f-5bd37ceef21e@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-13driver: core: constify groups array argument in device_add_groups and ↵Heiner Kallweit
device_remove_groups Now that sysfs_create_groups() and sysfs_remove_groups() allow to pass constant groups arrays, we can constify the groups array argument also here. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/8ea2d6d1-0adb-4d7f-92bc-751e93ce08d6@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-13sysfs: constify group arrays in function argumentsHeiner Kallweit
Constify the groups array argument where applicable. This allows to pass constant arrays as arguments. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/17035265-8882-4101-b7a7-16b3eb94f8b5@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-13vt: save/restore unicode screen buffer for alternate screenNicolas Pitre
The alternate screen support added by commit 23743ba64709 ("vt: add support for smput/rmput escape codes") only saves and restores the regular screen buffer (vc_origin), but completely ignores the corresponding unicode screen buffer (vc_uni_lines) creating a messed-up display. Add vc_saved_uni_lines to save the unicode screen buffer when entering the alternate screen, and restore it when leaving. Also ensure proper cleanup in reset_terminal() and vc_deallocate(). Fixes: 23743ba64709 ("vt: add support for smput/rmput escape codes") Cc: stable <stable@kernel.org> Signed-off-by: Nicolas Pitre <npitre@baylibre.com> Link: https://patch.msgid.link/5o2p6qp3-91pq-0p17-or02-1oors4417ns7@onlyvoer.pbz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-13tee: clean up tee_core.h kernel-docRandy Dunlap
Use the correct struct member name and function parameter name in kernel-doc comments. Move a macro that was between a struct's documentation and its declaration. These eliminate the following kernel-doc warnings: Warning: include/linux/tee_core.h:73 struct member 'c_no_users' not described in 'tee_device' Warning: include/linux/tee_core.h:132 #define TEE_DESC_PRIVILEGED 0x1; error: Cannot parse struct or union! Warning: include/linux/tee_core.h:257 function parameter 'connection_data' not described in 'tee_session_calc_client_uuid' Warning: include/linux/tee_core.h:320 function parameter 'teedev' not described in 'tee_get_drvdata' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2026-03-12ip_tunnel: adapt iptunnel_xmit_stats() to NETDEV_PCPU_STAT_DSTATSEric Dumazet
Blamed commits forgot that vxlan/geneve use udp_tunnel[6]_xmit_skb() which call iptunnel_xmit_stats(). iptunnel_xmit_stats() was assuming tunnels were only using NETDEV_PCPU_STAT_TSTATS. @syncp offset in pcpu_sw_netstats and pcpu_dstats is different. 32bit kernels would either have corruptions or freezes if the syncp sequence was overwritten. This patch also moves pcpu_stat_type closer to dev->{t,d}stats to avoid a potential cache line miss since iptunnel_xmit_stats() needs to read it. Fixes: 6fa6de302246 ("geneve: Handle stats using NETDEV_PCPU_STAT_DSTATS.") Fixes: be226352e8dc ("vxlan: Handle stats using NETDEV_PCPU_STAT_DSTATS.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20260311123110.1471930-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-12driver core: Add conditional guard support for device_lock()Li Ming
Introduce conditional guard version of device_lock() for scenarios that require conditional device lock holding. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20260310-fix_access_endpoint_without_drv_check-v1-1-94fe919a0b87@zohomail.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-7.0-rc4). drivers/net/ethernet/mellanox/mlx5/core/en_rx.c db25c42c2e1f9 ("net/mlx5e: RX, Fix XDP multi-buf frag counting for striding RQ") dff1c3164a692 ("net/mlx5e: SHAMPO, Always calculate page size") https://lore.kernel.org/aa7ORohmf67EKihj@sirena.org.uk drivers/net/ethernet/ti/am65-cpsw-nuss.c 840c9d13cb1ca ("net: ethernet: ti: am65-cpsw-nuss: Fix rx_filter value for PTP support") a23c657e332f2 ("net: ethernet: ti: am65-cpsw: Use also port number to identify timestamps") https://lore.kernel.org/abK3EkIXuVgMyGI7@sirena.org.uk No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-12Merge tag 'net-7.0-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from CAN and netfilter. Current release - regressions: - eth: mana: Null service_wq on setup error to prevent double destroy Previous releases - regressions: - nexthop: fix percpu use-after-free in remove_nh_grp_entry - sched: teql: fix NULL pointer dereference in iptunnel_xmit on TEQL slave xmit - bpf: fix nd_tbl NULL dereference when IPv6 is disabled - neighbour: restore protocol != 0 check in pneigh update - tipc: fix divide-by-zero in tipc_sk_filter_connect() - eth: - mlx5: - fix crash when moving to switchdev mode - fix DMA FIFO desync on error CQE SQ recovery - iavf: fix PTP use-after-free during reset - bonding: fix type confusion in bond_setup_by_slave() - lan78xx: fix WARN in __netif_napi_del_locked on disconnect Previous releases - always broken: - core: add xmit recursion limit to tunnel xmit functions - net-shapers: don't free reply skb after genlmsg_reply() - netfilter: - fix stack out-of-bounds read in pipapo_drop() - fix OOB read in nfnl_cthelper_dump_table() - mctp: - fix device leak on probe failure - i2c: fix skb memory leak in receive path - can: keep the max bitrate error at 5% - eth: - bonding: fix nd_tbl NULL dereference when IPv6 is disabled - bnxt_en: fix RSS table size check when changing ethtool channels - amd-xgbe: prevent CRC errors during RX adaptation with AN disabled - octeontx2-af: devlink: fix NIX RAS reporter recovery condition" * tag 'net-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits) net: prevent NULL deref in ip[6]tunnel_xmit() octeontx2-af: devlink: fix NIX RAS reporter to use RAS interrupt status octeontx2-af: devlink: fix NIX RAS reporter recovery condition net: ethernet: ti: am65-cpsw-nuss: Fix rx_filter value for PTP support net/mana: Null service_wq on setup error to prevent double destroy selftests: rtnetlink: add neighbour update test neighbour: restore protocol != 0 check in pneigh update net: dsa: realtek: Fix LED group port bit for non-zero LED group tipc: fix divide-by-zero in tipc_sk_filter_connect() net: dsa: microchip: Fix error path in PTP IRQ setup bpf: bpf_out_neigh_v6: Fix nd_tbl NULL dereference when IPv6 is disabled bpf: bpf_out_neigh_v4: Fix nd_tbl NULL dereference when IPv6 is disabled net: bonding: Fix nd_tbl NULL dereference when IPv6 is disabled ipv6: move the disable_ipv6_mod knob to core code net: bcmgenet: fix broken EEE by converting to phylib-managed state net-shapers: don't free reply skb after genlmsg_reply() net: dsa: mxl862xx: don't set user_mii_bus net: ethernet: arc: emac: quiesce interrupts before requesting IRQ page_pool: store detach_time as ktime_t to avoid false-negatives net: macb: Shuffle the tx ring before enabling tx ...
2026-03-12Merge tag 'v7.0-rc3' into nextDmitry Torokhov
Sync up with the mainline to brig up the latest changes, specifically changes to ALPS driver.
2026-03-12base: soc: rename and export soc_device_get_machine()Bartosz Golaszewski
Some SoC drivers reimplement the functionality of soc_device_get_machine(). Make this function accessible through the sys_soc.h header and rename it to a more descriptive name. Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260223-soc-of-root-v2-4-b45da45903c8@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-12of: provide of_machine_read_model()Bartosz Golaszewski
Provide a helper function allowing users to read the model string of the machine, hiding the access to the root node. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260223-soc-of-root-v2-2-b45da45903c8@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-12of: provide of_machine_read_compatible()Bartosz Golaszewski
Provide a helper function allowing users to read the compatible string of the machine, hiding the access to the root node. Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260223-soc-of-root-v2-1-b45da45903c8@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-12serial: 8250: Add serial8250_handle_irq_locked()Ilpo Järvinen
8250_port exports serial8250_handle_irq() to HW specific 8250 drivers. It takes port's lock within but a HW specific 8250 driver may want to take port's lock itself, do something, and then call the generic handler in 8250_port but to do that, the caller has to release port's lock for no good reason. Introduce serial8250_handle_irq_locked() which a HW specific driver can call while already holding port's lock. As this is new export, put it straight into a namespace (where all 8250 exports should eventually be moved). Tested-by: Bandal, Shankar <shankar.bandal@intel.com> Tested-by: Murthy, Shanth <shanth.murthy@intel.com> Cc: stable <stable@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20260203171049.4353-4-ilpo.jarvinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-12tty: tty_port: add workqueue to flip TTY bufferXin Zhao
On the embedded platform, certain critical data, such as IMU data, is transmitted through UART. The tty_flip_buffer_push() interface in the TTY layer uses system_dfl_wq to handle the flipping of the TTY buffer. Although the unbound workqueue can create new threads on demand and wake up the kworker thread on an idle CPU, it may be preempted by real-time tasks or other high-prio tasks. flush_to_ldisc() needs to wake up the relevant data handle thread. When executing __wake_up_common_lock(), it calls spin_lock_irqsave(), which does not disable preemption but disables migration in RT-Linux. This prevents the kworker thread from being migrated to other cores by CPU's balancing logic, resulting in long delays. The call trace is as follows: __wake_up_common_lock __wake_up ep_poll_callback __wake_up_common __wake_up_common_lock __wake_up n_tty_receive_buf_common n_tty_receive_buf2 tty_ldisc_receive_buf tty_port_default_receive_buf flush_to_ldisc In our system, the processing interval for each frame of IMU data transmitted via UART can experience significant jitter due to this issue. Instead of the expected 10 to 15 ms frame processing interval, we see spikes up to 30 to 35 ms. Moreover, in just one or two hours, there can be 2 to 3 occurrences of such high jitter, which is quite frequent. This jitter exceeds the software's tolerable limit of 20 ms. Introduce flip_wq in tty_port which can be set by tty_port_link_wq() or as default linked to default workqueue allocated when tty_register_driver(). The default workqueue is allocated with flag WQ_SYSFS, so that cpumask and nice can be set dynamically. The execution timing of tty_port_link_wq() is not clearly restricted. The newly added function tty_port_link_driver_wq() checks whether the flip_wq of the tty_port has already been assigned when linking the default tty_driver's workqueue to the port. After the user has set a custom workqueue for a certain tty_port using tty_port_link_wq(), the system will only use this custom workqueue, even if tty_driver does not have %TTY_DRIVER_NO_WORKQUEUE flag. When tty_port register device, flip_wq link operation is done by tty_port_link_driver_wq(), but for in-memory devices the link operation cannot cover all the cases. Although tty_port_install() is dedicated for in-memory devices lik PTY to link port allocated on demand, the logic of tty_port_install() is so simple that people may not call it, vc_cons[0].d->port is one such case. We check the buf.flip_wq when flip TTY buffer, if buf.flip_wq of TTY port is NULL, use system_dfl_wq as a backup. To avoid naming conflict of the default tty_driver's workqueue, using '"%s-%s", driver->name, driver->driver_name' as the workqueue name. In cases where driver_name is not specified and therefore is NULL, the workqueue is not created. Drivers that do not define driver_name are potentially in-memory devices like vty, which generally do not require special workqueue settings. Even with the combination of name and driver_name, the workqueue names can still be duplicated, as many tty serial drivers use "ttyS" as dev_name and "serial" as driver_name. I modified the conflicting driver_name of these drivers by appending a suffix of _xx based on the corresponding .c file. If this modification is not made, it could not only lead to duplicate workqueue names but also result in duplicate entries for the /proc/tty/driver/<driver_name> nodes. Introduce %TTY_DRIVER_NO_WORKQUEUE flag meaning not to create the default single tty_driver workqueue. Two reasons why need to introduce the %TTY_DRIVER_NO_WORKQUEUE flag: 1. If the WQ_SYSFS parameter is enabled, workqueue_sysfs_register() will fail when trying to create a workqueue with the same name. The pty is an example of this; if both CONFIG_LEGACY_PTYS and CONFIG_UNIX98_PTYS are enabled, the call to tty_register_driver() in unix98_pty_init() will fail. 2. Different TTY ports may be used for different tasks, which may require separate core binding control via workqueues. In this case, the workqueue created by default in the TTY driver is unnecessary. Enabling this flag prevents the creation of this redundant workqueue. After applying this patch, we can set the related UART TTY flip buffer workqueue by sysfs. We set the cpumask to CPU cores associated with the IMU tasks, and set the nice to -20. Testing has shown significant improvement in the previously described issue, with almost no stuttering occurring anymore. Tested-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Xin Zhao <jackzxcui1989@163.com> Link: https://patch.msgid.link/20260213085039.3274704-1-jackzxcui1989@163.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-12Merge drm/drm-next into drm-xe-nextMatthew Brost
Backmerging to bring in 7.00-rc3. Important ahead GPU SVM merging THP support. Signed-off-by: Matthew Brost <matthew.brost@intel.com>
2026-03-12serdev: serdev.h: clean up kernel-doc commentsRandy Dunlap
Correct kernel-doc comment format and add a missing to avoid kernel-doc warnings: Warning: include/linux/serdev.h:49 struct member 'write_comp' not described in 'serdev_device' Warning: include/linux/serdev.h:49 struct member 'write_lock' not described in 'serdev_device' Warning: include/linux/serdev.h:68 struct member 'shutdown' not described in 'serdev_device_driver' Warning: include/linux/serdev.h:134 function parameter 'serdev' not described in 'serdev_device_put' Warning: include/linux/serdev.h:162 function parameter 'ctrl' not described in 'serdev_controller_put' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20260311052347.305612-1-rdunlap@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-12tty: constify tty_ldisc_opsQingfang Deng
tty_ldisc_ops is not modified once registered, so make it const. Signed-off-by: Qingfang Deng <dqfext@gmail.com> Link: https://patch.msgid.link/20260206062004.1273890-1-dqfext@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-12tcp: add tcp_release_cb_cond() helperEric Dumazet
Majority of tcp_release_cb() calls do nothing at all. Provide tcp_release_cb_cond() helper so that release_sock() can avoid these calls. Also hint the compiler that __release_sock() and wake_up() are rarely called. $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-77 (-77) Function old new delta release_sock 258 181 -77 Total: Before=25235790, After=25235713, chg -0.00% Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260310124451.2280968-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-12hrtimer: Remove trailing comma after HRTIMER_MAX_CLOCK_BASESThomas Weißschuh (Schneider Electric)
HRTIMER_MAX_CLOCK_BASES is required to stay the last value of the enum. Drop the trailing comma so no new members are added after it by mistake. Signed-off-by: Thomas Weißschuh (Schneider Electric) <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260311-hrtimer-cleanups-v1-11-095357392669@linutronix.de
2026-03-12hrtimer: Mark index and clockid of clock base as constThomas Weißschuh (Schneider Electric)
These fields are initialized once and are never supposed to change. Mark them as const to make this explicit. Signed-off-by: Thomas Weißschuh (Schneider Electric) <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260311-hrtimer-cleanups-v1-10-095357392669@linutronix.de
2026-03-12hrtimer: Drop spurious space in 'enum hrtimer_base_type'Thomas Weißschuh (Schneider Electric)
This spurious space makes grepping for the enum definition annoying. Remove it. Signed-off-by: Thomas Weißschuh (Schneider Electric) <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260311-hrtimer-cleanups-v1-8-095357392669@linutronix.de
2026-03-12hrtimer: Remove hrtimer_get_expires_ns()Thomas Weißschuh (Schneider Electric)
There are no users left. Signed-off-by: Thomas Weißschuh (Schneider Electric) <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260311-hrtimer-cleanups-v1-6-095357392669@linutronix.de
2026-03-12tracing: Use explicit array size instead of sentinel elements in symbol printingThomas Weißschuh (Schneider Electric)
The sentinel value added by the wrapper macros __print_symbolic() et al prevents the callers from adding their own trailing comma. This makes constructing symbol list dynamically based on kconfig values tedious. Drop the sentinel elements, so callers can either specify the trailing comma or not, just like in regular array initializers. Signed-off-by: Thomas Weißschuh (Schneider Electric) <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260311-hrtimer-cleanups-v1-2-095357392669@linutronix.de
2026-03-12include/linux/local_lock_internal.h: Make this header file again compatible ↵Bart Van Assche
with sparse There are two versions of the __this_cpu_local_lock() definitions in include/linux/local_lock_internal.h: one version that relies on the Clang overloading functionality and another version that does not. Select the latter version when using sparse. This patch fixes the following errors reported by sparse: include/linux/local_lock_internal.h:331:40: sparse: sparse: multiple definitions for function '__this_cpu_local_lock' include/linux/local_lock_internal.h:325:37: sparse: the previous one is here Closes: https://lore.kernel.org/oe-kbuild-all/202603062334.wgI5htP0-lkp@intel.com/ Fixes: d3febf16dee2 ("locking/local_lock: Support Clang's context analysis") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Marco Elver <elver@google.com> Link: https://patch.msgid.link/20260311231455.1961413-1-bvanassche@acm.org
2026-03-12Merge drm/drm-next into drm-misc-nextMaxime Ripard
Biju Das needs a patch for rz-du merged in 7.0-rc3 Signed-off-by: Maxime Ripard <mripard@kernel.org>
2026-03-12sched/wait: correct kernel-doc descriptionsRandy Dunlap
Use the correct function name and function parameter name to avoid these kernel-doc warnings: Warning: include/linux/wait_bit.h:424 expecting prototype for wait_var_event_killable(). Prototype was for wait_var_event_interruptible() instead Warning: include/linux/wait_bit.h:508 function parameter 'lock' not described in 'wait_var_event_mutex' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260302005237.3473095-1-rdunlap@infradead.org
2026-03-11ipv6: move the disable_ipv6_mod knob to core codeJakub Kicinski
From: Jakub Kicinski <kuba@kernel.org> Make sure disable_ipv6_mod itself is not part of the IPv6 module, in case core code wants to refer to it. We will remove support for IPv6=m soon, this change helps make fixes we commit before that less messy. Link: https://patch.msgid.link/20260307-net-nd_tbl_fixes-v4-1-e2677e85628c@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-11cgroup: replace global cgroup_file_kn_lock with per-cgroup_file lockShakeel Butt
Replace the global cgroup_file_kn_lock with a per-cgroup_file spinlock to eliminate cross-cgroup contention as it is not really protecting data shared between different cgroups. The lock is initialized in cgroup_add_file() alongside timer_setup(). No lock acquisition is needed during initialization since the cgroup directory is being populated under cgroup_mutex and no concurrent accessors exist at that point. Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-11clone: add CLONE_AUTOREAPChristian Brauner
Add a new clone3() flag CLONE_AUTOREAP that makes a child process auto-reap on exit without ever becoming a zombie. This is a per-process property in contrast to the existing auto-reap mechanism via SA_NOCLDWAIT or SIG_IGN for SIGCHLD which applies to all children of a given parent. Currently the only way to automatically reap children is to set SA_NOCLDWAIT or SIG_IGN on SIGCHLD. This is a parent-scoped property affecting all children which makes it unsuitable for libraries or applications that need selective auto-reaping of specific children while still being able to wait() on others. CLONE_AUTOREAP stores an autoreap flag in the child's signal_struct. When the child exits do_notify_parent() checks this flag and causes exit_notify() to transition the task directly to EXIT_DEAD. Since the flag lives on the child it survives reparenting: if the original parent exits and the child is reparented to a subreaper or init the child still auto-reaps when it eventually exits. CLONE_AUTOREAP can be combined with CLONE_PIDFD to allow the parent to monitor the child's exit via poll() and retrieve exit status via PIDFD_GET_INFO. Without CLONE_PIDFD it provides a fire-and-forget pattern where the parent simply doesn't care about the child's exit status. No exit signal is delivered so exit_signal must be zero. CLONE_AUTOREAP is rejected in combination with CLONE_PARENT. If a CLONE_AUTOREAP child were to clone(CLONE_PARENT) the new grandchild would inherit exit_signal == 0 from the autoreap parent's group leader but without signal->autoreap. This grandchild would become a zombie that never sends a signal and is never autoreaped - confusing and arguably broken behavior. The flag is not inherited by the autoreap process's own children. Each child that should be autoreaped must be explicitly created with CLONE_AUTOREAP. Link: https://github.com/uapi-group/kernel-features/issues/45 Link: https://patch.msgid.link/20260226-work-pidfs-autoreap-v5-1-d148b984a989@kernel.org Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-11io_uring: ensure ctx->rings is stable for task work flags manipulationJens Axboe
If DEFER_TASKRUN | SETUP_TASKRUN is used and task work is added while the ring is being resized, it's possible for the OR'ing of IORING_SQ_TASKRUN to happen in the small window of swapping into the new rings and the old rings being freed. Prevent this by adding a 2nd ->rings pointer, ->rings_rcu, which is protected by RCU. The task work flags manipulation is inside RCU already, and if the resize ring freeing is done post an RCU synchronize, then there's no need to add locking to the fast path of task work additions. Note: this is only done for DEFER_TASKRUN, as that's the only setup mode that supports ring resizing. If this ever changes, then they too need to use the io_ctx_mark_taskrun() helper. Link: https://lore.kernel.org/io-uring/20260309062759.482210-1-naup96721@gmail.com/ Cc: stable@vger.kernel.org Fixes: 79cfe9e59c2a ("io_uring/register: add IORING_REGISTER_RESIZE_RINGS") Reported-by: Hao-Yu Yang <naup96721@gmail.com> Suggested-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-03-11Merge branch 'sched/hrtick' into timers/coreThomas Gleixner
Pick up the hrtick related hrtimer changes so other unrelated changes can be queued on top.
2026-03-11efi: Drop unused efi_range_is_wc() functionArd Biesheuvel
efi_range_is_wc() has no callers, so remove it. Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2026-03-11Merge tag 'kvm-x86-generic-7.0-rc3' of https://github.com/kvm-x86/linux into ↵Paolo Bonzini
HEAD KVM generic changes for 7.0 - Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from being unnecessary and confusing, triggered compiler warnings due to -Wflex-array-member-not-at-end. - Document that vcpu->mutex is take outside of kvm->slots_lock and kvm->slots_arch_lock, which is intentional and desirable despite being rather unintuitive.
2026-03-11mtd: concat: replace alloc + calloc with 1 allocRosen Penev
A flex array can be used to reduce the allocation to 1. And actually mtdconcat was using the pointer + 1 trick to point to the overallocated area. Better alternatives exist. Signed-off-by: Rosen Penev <rosenp@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2026-03-11usb: core: new quirk to handle devices with zero configurationsJie Deng
Some USB devices incorrectly report bNumConfigurations as 0 in their device descriptor, which causes the USB core to reject them during enumeration. logs: usb 1-2: device descriptor read/64, error -71 usb 1-2: no configurations usb 1-2: can't read configurations, error -22 However, these devices actually work correctly when treated as having a single configuration. Add a new quirk USB_QUIRK_FORCE_ONE_CONFIG to handle such devices. When this quirk is set, assume the device has 1 configuration instead of failing with -EINVAL. This quirk is applied to the device with VID:PID 5131:2007 which exhibits this behavior. Signed-off-by: Jie Deng <dengjie03@kylinos.cn> Link: https://patch.msgid.link/20260227084931.1527461-1-dengjie03@kylinos.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-11USB: core: Limit the length of unkillable synchronous timeoutsAlan Stern
The usb_control_msg(), usb_bulk_msg(), and usb_interrupt_msg() APIs in usbcore allow unlimited timeout durations. And since they use uninterruptible waits, this leaves open the possibility of hanging a task for an indefinitely long time, with no way to kill it short of unplugging the target device. To prevent this sort of problem, enforce a maximum limit on the length of these unkillable timeouts. The limit chosen here, somewhat arbitrarily, is 60 seconds. On many systems (although not all) this is short enough to avoid triggering the kernel's hung-task detector. In addition, clear up the ambiguity of negative timeout values by treating them the same as 0, i.e., using the maximum allowed timeout. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/linux-usb/3acfe838-6334-4f6d-be7c-4bb01704b33d@rowland.harvard.edu/ Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") CC: stable@vger.kernel.org Link: https://patch.msgid.link/15fc9773-a007-47b0-a703-df89a8cf83dd@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>