summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2026-04-08codel: annotate data-races in codel_dump_stats()Eric Dumazet
codel_dump_stats() only runs with RTNL held, reading fields that can be changed in qdisc fast path. Add READ_ONCE()/WRITE_ONCE() annotations. Alternative would be to acquire the qdisc spinlock, but our long-term goal is to make qdisc dump operations lockless as much as we can. tc_codel_xstats fields don't need to be latched atomically, otherwise this bug would have been caught earlier. No change in kernel size: $ scripts/bloat-o-meter -t vmlinux.0 vmlinux add/remove: 0/0 grow/shrink: 1/1 up/down: 3/-1 (2) Function old new delta codel_qdisc_dequeue 2462 2465 +3 codel_dump_stats 250 249 -1 Total: Before=29739919, After=29739921, chg +0.00% Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260407143053.1570620-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08bonding: remove unused bond_is_first_slave and bond_is_last_slave macrosXiang Mei
Since commit 2884bf72fb8f ("net: bonding: fix use-after-free in bond_xmit_broadcast()"), bond_is_last_slave() was only used in bond_xmit_broadcast(). After the recent fix replaced that usage with a simple index comparison, bond_is_last_slave() has no remaining callers. bond_is_first_slave() likewise has no callers. Remove both unused macros. Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260404220412.444753-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08dt-bindings: clock: qcom: Add Nord Global Clock ControllerTaniya Das
Add device tree bindings for the global clock controller on Qualcomm Nord platform. The global clock controller on Nord SoC is divided into multiple clock controllers (GCC,SE_GCC,NE_GCC and NW_GCC). Add each of the bindings to define the clock controllers. Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260403-nord-clks-v1-3-018af14979fd@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-04-08dt-bindings: clock: qcom: Document the Nord SoC TCSR Clock ControllerTaniya Das
The Nord SoC TCSR block provides CLKREF clocks for DP, PCIe, UFS, SGMII and USB. Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com> [Shawn: Use compatible qcom,nord-tcsrcc rather than qcom,nord-tcsr] Signed-off-by: Shawn Guo <shengchao.guo@oss.qualcomm.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260403-nord-clks-v1-1-018af14979fd@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-04-08scsi: libsas: Delete unused to_dom_device() and to_dev_attr()Thomas Weißschuh
These macros are unused and to_dev_attr() will conflict with an upcoming centralization of general attribute macros. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20260408-libsas-cleanup-v1-1-826325bbc0ba@weissschuh.net Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2026-04-08Merge tag 'nf-26-04-08' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== netfilter updates for net I only included crash fixes, as we're closer to a release, rest will be handled via -next. 1) Fix a NULL pointer dereference in ip_vs_add_service error path, from Weiming Shi, bug added in 6.2 development cycle. 2) Don't leak kernel data bytes from allocator to userspace: nfnetlink_log needs to init the trailing NLMSG_DONE terminator. From Xiang Mei. 3) xt_multiport match lacks range validation, bogus userspace request will cause out-of-bounds read. From Ren Wei. 4) ip6t_eui64 match must reject packets with invalid mac header before calling eth_hdr. Make existing check unconditional. From Zhengchuan Liang. 5) nft_ct timeout policies are free'd via kfree() while they may still be reachable by other cpus that process a conntrack object that uses such a timeout policy. Existing reaping of entries is not sufficient because it doesn't wait for a grace period. Use kfree_rcu(). From Tuan Do. 6/7) Make nfnetlink_queue hash table per queue. As-is we can hit a page fault in case underlying page of removed element was free'd. Per-queue hash prevents parallel lookups. This comes with a test case that demonstrates the bug, from Fernando Fernandez Mancera. * tag 'nf-26-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: selftests: nft_queue.sh: add a parallel stress test netfilter: nfnetlink_queue: make hash table per queue netfilter: nft_ct: fix use-after-free in timeout object destroy netfilter: ip6t_eui64: reject invalid MAC header for all packets netfilter: xt_multiport: validate range encoding in checkentry netfilter: nfnetlink_log: initialize nfgenmsg in NLMSG_DONE terminator ipvs: fix NULL deref in ip_vs_add_service error path ==================== Link: https://patch.msgid.link/20260408163512.30537-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08rxrpc: Fix to request an ack if window is limitedMarc Dionne
Peers may only send immediate acks for every 2 UDP packets received. When sending a jumbogram, it is important to check that there is sufficient window space to send another same sized jumbogram following the current one, and request an ack if there isn't. Failure to do so may cause the call to stall waiting for an ack until the resend timer fires. Where jumbograms are in use this causes a very significant drop in performance. Fixes: fe24a5494390 ("rxrpc: Send jumbo DATA packets") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-10-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08rxrpc: Fix use of wrong skb when comparing queued RESP challenge serialAlok Tiwari
In rxrpc_post_response(), the code should be comparing the challenge serial number from the cached response before deciding to switch to a newer response, but looks at the newer packet private data instead, rendering the comparison always false. Fix this by switching to look at the older packet. Fix further[1] to substitute the new packet in place of the old one if newer and also to release whichever we don't use. Fixes: 5800b1cf3fd8 ("rxrpc: Allow CHALLENGEs to the passed to the app for a RESPONSE") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://sashiko.dev/#/patchset/20260319150150.4189381-1-dhowells%40redhat.com [1] Link: https://patch.msgid.link/20260408121252.2249051-7-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08rxrpc: Fix call removal to use RCU safe deletionDavid Howells
Fix rxrpc call removal from the rxnet->calls list to use list_del_rcu() rather than list_del_init() to prevent stuffing up reading /proc/net/rxrpc/calls from potentially getting into an infinite loop. This, however, means that list_empty() no longer works on an entry that's been deleted from the list, making it harder to detect prior deletion. Fix this by: Firstly, make rxrpc_destroy_all_calls() only dump the first ten calls that are unexpectedly still on the list. Limiting the number of steps means there's no need to call cond_resched() or to remove calls from the list here, thereby eliminating the need for rxrpc_put_call() to check for that. rxrpc_put_call() can then be fixed to unconditionally delete the call from the list as it is the only place that the deletion occurs. Fixes: 2baec2c3f854 ("rxrpc: Support network namespacing") Closes: https://sashiko.dev/#/patchset/20260319150150.4189381-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-5-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08bpf: Make find_linfo widely availableKumar Kartikeya Dwivedi
Move find_linfo() as bpf_find_linfo() into core.c to allow for its use in the verifier in subsequent patches. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20260408021359.3786905-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-08bpf: Extract bpf_get_linfo_file_lineKumar Kartikeya Dwivedi
Extract bpf_get_linfo_file_line as its own function so that the logic to obtain the file, line, and line number for a given program can be shared in subsequent patches. Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260408021359.3786905-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-08hfsplus: rework logic of map nodes creation in xattr b-treeViacheslav Dubeyko
In hfsplus_init_header_node() when node_count > 63488 (header bitmap capacity), the code calculates map_nodes, subtracts them from free_nodes, and marks their positions used in the bitmap. However, it doesn't write the actual map node structure (type, record offsets, bitmap) for those physical positions, only node 0 is written. This patch reworks hfsplus_create_attributes_file() logic by introducing a specialized method of hfsplus_init_map_node() and writing the allocated map b-tree's nodes by means of hfsplus_write_attributes_file_node() method. cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> cc: Yangtao Li <frank.li@vivo.com> cc: linux-fsdevel@vger.kernel.org Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Link: https://lore.kernel.org/r/20260403230556.614171-5-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2026-04-08Merge tag 'hid-for-linus-2026040801' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - handling of new keycodes for contextual AI usages (Akshai Murari) - fix for UAF in hid-roccat (Benoît Sevens) - deduplication of error logging in amd_sfh (Maximilian Pezzullo) - various device-specific quirks and device ID additions (Even Xu, Lode Willems, Leo Vriska) * tag 'hid-for-linus-2026040801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: Input: add keycodes for contextual AI usages (HUTRR119) HID: Kysona: Add support for VXE Dragonfly R1 Pro HID: amd_sfh: don't log error when device discovery fails with -EOPNOTSUPP HID: quirks: add HID_QUIRK_ALWAYS_POLL for 8BitDo Pro 3 HID: roccat: fix use-after-free in roccat_report_event HID: Intel-thc-hid: Intel-quickspi: Add NVL Device IDs HID: Intel-thc-hid: Intel-quicki2c: Add NVL Device IDs
2026-04-08tracing: Report ipi_raise target CPUs as cpumaskCaoRuichuang
Bugzilla 217447 points out that ftrace bitmask fields still use the legacy dynamic-array format, which makes trace consumers treat them as unsigned long arrays instead of bitmaps. This is visible in the ipi events today: ipi_send_cpumask already reports its CPU mask as '__data_loc cpumask_t', but ipi_raise still exposes target_cpus as '__data_loc unsigned long[]'. Switch ipi_raise to __cpumask() and the matching helpers so its tracefs format matches the existing cpumask representation used by the other ipi event. The underlying storage size stays the same, but trace data consumers can now recognize the field as a cpumask directly. Link: https://patch.msgid.link/20260406162434.40767-1-create0818@163.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=217447 Signed-off-by: CaoRuichuang <create0818@163.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-04-08x86: shadow stacks: proper error handling for mmap lockLinus Torvalds
김영민 reports that shstk_pop_sigframe() doesn't check for errors from mmap_read_lock_killable(), which is a silly oversight, and also shows that we haven't marked those functions with "__must_check", which would have immediately caught it. So let's fix both issues. Reported-by: 김영민 <osori@hspace.io> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-04-08ASoC: SDCA: Unregister IRQ handlers on module removeRichard Fitzgerald
Ensure that all interrupt handlers are unregistered before the parent regmap_irq is unregistered. sdca_irq_cleanup() was only called from the component_remove(). If the module was loaded and removed without ever being component probed the FDL interrupts would not be unregistered and this would hit a WARN when devm called regmap_del_irq_chip() during the removal of the parent IRQ. Fixes: 4e53116437e9 ("ASoC: SDCA: Fix errors in IRQ cleanup") Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20260408093835.2881486-5-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-04-08Add Renesas RZ/G3L RSPI supportMark Brown
Biju <biju.das.au@gmail.com> says: This patch series adds binding and driver support for RSPI IP found on the RZ/G3L SoC. The RSPI is compatible with RZ/V2H RSPI, but has 2 clocks compared to 3 on RZ/V2H. Link: https://patch.msgid.link/20260408085418.18770-1-biju.das.jz@bp.renesas.com
2026-04-08dma-fence: correct kernel-doc function parameter @flagsRandy Dunlap
'make htmldocs' complains that dma_fence_unlock_irqrestore() is missing a description of its @flags parameter. The description is there but it is missing a ':' sign. Add that and correct the possessive form of "its". WARNING: ../include/linux/dma-fence.h:414 function parameter 'flags' not described in 'dma_fence_unlock_irqrestore' Fixes: 3e5067931b5d ("dma-buf: abstract fence locking v2") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20260407043649.2015894-1-rdunlap@infradead.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2026-04-08netfilter: nfnetlink_queue: make hash table per queueFlorian Westphal
Sharing a global hash table among all queues is tempting, but it can cause crash: BUG: KASAN: slab-use-after-free in nfqnl_recv_verdict+0x11ac/0x15e0 [nfnetlink_queue] [..] nfqnl_recv_verdict+0x11ac/0x15e0 [nfnetlink_queue] nfnetlink_rcv_msg+0x46a/0x930 kmem_cache_alloc_node_noprof+0x11e/0x450 struct nf_queue_entry is freed via kfree, but parallel cpu can still encounter such an nf_queue_entry when walking the list. Alternative fix is to free the nf_queue_entry via kfree_rcu() instead, but as we have to alloc/free for each skb this will cause more mem pressure. Cc: Scott Mitchell <scott.k.mitch1@gmail.com> Fixes: e19079adcd26 ("netfilter: nfnetlink_queue: optimize verdict lookup with hash table") Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08netfilter: nft_ct: fix use-after-free in timeout object destroyTuan Do
nft_ct_timeout_obj_destroy() frees the timeout object with kfree() immediately after nf_ct_untimeout(), without waiting for an RCU grace period. Concurrent packet processing on other CPUs may still hold RCU-protected references to the timeout object obtained via rcu_dereference() in nf_ct_timeout_data(). Add an rcu_head to struct nf_ct_timeout and use kfree_rcu() to defer freeing until after an RCU grace period, matching the approach already used in nfnetlink_cttimeout.c. KASAN report: BUG: KASAN: slab-use-after-free in nf_conntrack_tcp_packet+0x1381/0x29d0 Read of size 4 at addr ffff8881035fe19c by task exploit/80 Call Trace: nf_conntrack_tcp_packet+0x1381/0x29d0 nf_conntrack_in+0x612/0x8b0 nf_hook_slow+0x70/0x100 __ip_local_out+0x1b2/0x210 tcp_sendmsg_locked+0x722/0x1580 __sys_sendto+0x2d8/0x320 Allocated by task 75: nft_ct_timeout_obj_init+0xf6/0x290 nft_obj_init+0x107/0x1b0 nf_tables_newobj+0x680/0x9c0 nfnetlink_rcv_batch+0xc29/0xe00 Freed by task 26: nft_obj_destroy+0x3f/0xa0 nf_tables_trans_destroy_work+0x51c/0x5c0 process_one_work+0x2c4/0x5a0 Fixes: 7e0b2b57f01d ("netfilter: nft_ct: add ct timeout support") Cc: stable@vger.kernel.org Signed-off-by: Tuan Do <tuan@calif.io> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08Merge branch kvm-arm64/vgic-fixes-7.1 into kvmarm-master/nextMarc Zyngier
* kvm-arm64/vgic-fixes-7.1: : . : FIrst pass at fixing a number of vgic-v5 bugs that were found : after the merge of the initial series. : . KVM: arm64: Advertise ID_AA64PFR2_EL1.GCIE KVM: arm64: vgic-v5: Fold PPI state for all exposed PPIs KVM: arm64: set_id_regs: Allow GICv3 support to be set at runtime KVM: arm64: Don't advertises GICv3 in ID_PFR1_EL1 if AArch32 isn't supported KVM: arm64: Correctly plumb ID_AA64PFR2_EL1 into pkvm idreg handling KVM: arm64: Move GICv5 timer PPI validation into timer_irqs_are_valid() KVM: arm64: Remove evaluation of timer state in kvm_cpu_has_pending_timer() KVM: arm64: Kill arch_timer_context::direct field KVM: arm64: vgic-v5: Correctly set dist->ready once initialised KVM: arm64: vgic-v5: Make the effective priority mask a strict limit KVM: arm64: vgic-v5: Cast vgic_apr to u32 to avoid undefined behaviours KVM: arm64: vgic-v5: Transfer edge pending state to ICH_PPI_PENDRx_EL2 KVM: arm64: vgic-v5: Hold config_lock while finalizing GICv5 PPIs KVM: arm64: Account for RESx bits in __compute_fgt() KVM: arm64: Fix writeable mask for ID_AA64PFR2_EL1 arm64: Fix field references for ICH_PPI_DVIR[01]_EL2 KVM: arm64: Don't skip per-vcpu NV initialisation KVM: arm64: vgic: Don't reset cpuif/redist addresses at finalize time Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/pkvm-protected-guest into kvmarm-master/nextMarc Zyngier
* kvm-arm64/pkvm-protected-guest: (41 commits) : . : pKVM support for protected guests, implementing the very long : awaited support for anonymous memory, as the elusive guestmem : has failed to deliver on its promises despite a multi-year : effort. Patches courtesy of Will Deacon. From the initial cover : letter: : : "[...] this patch series implements support for protected guest : memory with pKVM, where pages are unmapped from the host as they are : faulted into the guest and can be shared back from the guest using pKVM : hypercalls. Protected guests are created using a new machine type : identifier and can be booted to a shell using the kvmtool patches : available at [2], which finally means that we are able to test the pVM : logic in pKVM. Since this is an incremental step towards full isolation : from the host (for example, the CPU register state and DMA accesses are : not yet isolated), creating a pVM requires a developer Kconfig option to : be enabled in addition to booting with 'kvm-arm.mode=protected' and : results in a kernel taint." : . KVM: arm64: Don't hold 'vm_table_lock' across guest page reclaim KVM: arm64: Allow get_pkvm_hyp_vm() to take a reference to a dying VM KVM: arm64: Prevent teardown finalisation of referenced 'hyp_vm' drivers/virt: pkvm: Add Kconfig dependency on DMA_RESTRICTED_POOL KVM: arm64: Rename PKVM_PAGE_STATE_MASK KVM: arm64: Extend pKVM page ownership selftests to cover guest hvcs KVM: arm64: Extend pKVM page ownership selftests to cover forced reclaim KVM: arm64: Register 'selftest_vm' in the VM table KVM: arm64: Extend pKVM page ownership selftests to cover guest donation KVM: arm64: Add some initial documentation for pKVM KVM: arm64: Allow userspace to create protected VMs when pKVM is enabled KVM: arm64: Implement the MEM_UNSHARE hypercall for protected VMs KVM: arm64: Implement the MEM_SHARE hypercall for protected VMs KVM: arm64: Add hvc handler at EL2 for hypercalls from protected VMs KVM: arm64: Return -EFAULT from VCPU_RUN on access to a poisoned pte KVM: arm64: Reclaim faulting page from pKVM in spurious fault handler KVM: arm64: Introduce hypercall to force reclaim of a protected page KVM: arm64: Annotate guest donations with handle and gfn in host stage-2 KVM: arm64: Change 'pkvm_handle_t' to u16 KVM: arm64: Introduce host_stage2_set_owner_metadata_locked() ... Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/vgic-v5-ppi into kvmarm-master/nextMarc Zyngier
* kvm-arm64/vgic-v5-ppi: (40 commits) : . : Add initial GICv5 support for KVM guests, only adding PPI support : for the time being. Patches courtesy of Sascha Bischoff. : : From the cover letter: : : "This is v7 of the patch series to add the virtual GICv5 [1] device : (vgic_v5). Only PPIs are supported by this initial series, and the : vgic_v5 implementation is restricted to the CPU interface, : only. Further patch series are to follow in due course, and will add : support for SPIs, LPIs, the GICv5 IRS, and the GICv5 ITS." : . KVM: arm64: selftests: Add no-vgic-v5 selftest KVM: arm64: selftests: Introduce a minimal GICv5 PPI selftest KVM: arm64: gic-v5: Communicate userspace-driveable PPIs via a UAPI Documentation: KVM: Introduce documentation for VGICv5 KVM: arm64: gic-v5: Probe for GICv5 device KVM: arm64: gic-v5: Set ICH_VCTLR_EL2.En on boot KVM: arm64: gic-v5: Introduce kvm_arm_vgic_v5_ops and register them KVM: arm64: gic-v5: Hide FEAT_GCIE from NV GICv5 guests KVM: arm64: gic: Hide GICv5 for protected guests KVM: arm64: gic-v5: Mandate architected PPI for PMU emulation on GICv5 KVM: arm64: gic-v5: Enlighten arch timer for GICv5 irqchip/gic-v5: Introduce minimal irq_set_type() for PPIs KVM: arm64: gic-v5: Initialise ID and priority bits when resetting vcpu KVM: arm64: gic-v5: Create and initialise vgic_v5 KVM: arm64: gic-v5: Support GICv5 interrupts with KVM_IRQ_LINE KVM: arm64: gic-v5: Implement direct injection of PPIs KVM: arm64: Introduce set_direct_injection irq_op KVM: arm64: gic-v5: Trap and mask guest ICC_PPI_ENABLERx_EL1 writes KVM: arm64: gic-v5: Check for pending PPIs KVM: arm64: gic-v5: Clear TWI if single task running ... Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08Merge branch kvm-arm64/hyp-tracing into kvmarm-master/nextMarc Zyngier
* kvm-arm64/hyp-tracing: (40 commits) : . : EL2 tracing support, adding both 'remote' ring-buffer : infrastructure and the tracing itself, courtesy of : Vincent Donnefort. From the cover letter: : : "The growing set of features supported by the hypervisor in protected : mode necessitates debugging and profiling tools. Tracefs is the : ideal candidate for this task: : : * It is simple to use and to script. : : * It is supported by various tools, from the trace-cmd CLI to the : Android web-based perfetto. : : * The ring-buffer, where are stored trace events consists of linked : pages, making it an ideal structure for sharing between kernel and : hypervisor. : : This series first introduces a new generic way of creating remote events and : remote buffers. Then it adds support to the pKVM hypervisor." : . tracing: selftests: Extend hotplug testing for trace remotes tracing: Non-consuming read for trace remotes with an offline CPU tracing: Adjust cmd_check_undefined to show unexpected undefined symbols tracing: Restore accidentally removed SPDX tag KVM: arm64: avoid unused-variable warning tracing: Generate undef symbols allowlist for simple_ring_buffer KVM: arm64: tracing: add ftrace dependency tracing: add more symbols to whitelist tracing: Update undefined symbols allow list for simple_ring_buffer KVM: arm64: Fix out-of-tree build for nVHE/pKVM tracing tracing: selftests: Add hypervisor trace remote tests KVM: arm64: Add selftest event support to nVHE/pKVM hyp KVM: arm64: Add hyp_enter/hyp_exit events to nVHE/pKVM hyp KVM: arm64: Add event support to the nVHE/pKVM hyp and trace remote KVM: arm64: Add trace reset to the nVHE/pKVM hyp KVM: arm64: Sync boot clock with the nVHE/pKVM hyp KVM: arm64: Add trace remote for the nVHE/pKVM hyp KVM: arm64: Add tracing capability for the nVHE/pKVM hyp KVM: arm64: Support unaligned fixmap in the pKVM hyp KVM: arm64: Initialise hyp_nr_cpus for nVHE hyp ... Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08thermal: core: Suspend thermal zones later and resume them earlierRafael J. Wysocki
To avoid some undesirable interactions between thermal zone suspend and resume with user space that is running when those operations are carried out, move them closer to the suspend and resume of devices, respectively, by updating dpm_prepare() to carry out thermal zone suspend and dpm_complete() to start thermal zone resume (that will continue asynchronously). This also makes the code easier to follow by removing one, arguably redundant, level of indirection represented by the thermal PM notifier. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Armin Wolf <W_Armin@gmx.de> Link: https://patch.msgid.link/2036875.PYKUYFuaPT@rafael.j.wysocki
2026-04-08pmdomain: Merge branch dt into nextUlf Hansson
Merge the immutable branch dt into next, to allow the updated DT bindings to be tested together with the pmdomain changes that are targeted for the next release. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2026-04-08dt-bindings: power: qcom,rpmhpd: Add RPMh power domain for Hawi SoCFenglin Wu
Document the RPMh power domain for Hawi SoC, and add definitions for the new power domains which present in Hawi SoC: - RPMHPD_DCX (Display Core X): supplies VDD_DISP for the display subsystem - RPMHPD_GBX (Graphics Box): supplies VDD_GFX_BX for the GPU/graphics subsystem Also, add constants for new power domain levels that supported in Hawi SoC, including: LOW_SVS_D3_0, LOW_SVS_D1_0, LOW_SVS_D0_0, SVS_L2_0, TURBO_L1_0/1/2, TURBO_L1_0/1/2. Signed-off-by: Fenglin Wu <fenglin.wu@oss.qualcomm.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2026-04-08entry: Split preemption from irqentry_exit_to_kernel_mode()Mark Rutland
Some architecture-specific work needs to be performed between the state management for exception entry/exit and the "real" work to handle the exception. For example, arm64 needs to manipulate a number of exception masking bits, with different exceptions requiring different masking. Generally this can all be hidden in the architecture code, but for arm64 the current structure of irqentry_exit_to_kernel_mode() makes this particularly difficult to handle in a way that is correct, maintainable, and efficient. The gory details are described in the thread surrounding: https://lore.kernel.org/lkml/acPAzdtjK5w-rNqC@J2N7QTR9R3/ The summary is: * Currently, irqentry_exit_to_kernel_mode() handles both involuntary preemption AND state management necessary for exception return. * When scheduling (including involuntary preemption), arm64 needs to have all arm64-specific exceptions unmasked, though regular interrupts must be masked. * Prior to the state management for exception return, arm64 needs to mask a number of arm64-specific exceptions, and perform some work with these exceptions masked (with RCU watching, etc). While in theory it is possible to handle this with a new arch_*() hook called somewhere under irqentry_exit_to_kernel_mode(), this is fragile and complicated, and doesn't match the flow used for exception return to user mode, which has a separate 'prepare' step (where preemption can occur) prior to the state management. To solve this, refactor irqentry_exit_to_kernel_mode() to match the style of {irqentry,syscall}_exit_to_user_mode(), moving preemption logic into a new irqentry_exit_to_kernel_mode_preempt() function, and moving state management in a new irqentry_exit_to_kernel_mode_after_preempt() function. The existing irqentry_exit_to_kernel_mode() is left as a caller of both of these, avoiding the need to modify existing callers. There should be no functional change as a result of this change. [ tglx: Updated kernel doc ] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260407131650.3813777-6-mark.rutland@arm.com
2026-04-08entry: Split kernel mode logic from irqentry_{enter,exit}()Mark Rutland
The generic irqentry code has entry/exit functions specifically for exceptions taken from user mode, but doesn't have entry/exit functions specifically for exceptions taken from kernel mode. It would be helpful to have separate entry/exit functions specifically for exceptions taken from kernel mode. This would make the structure of the entry code more consistent, and would make it easier for architectures to manage logic specific to exceptions taken from kernel mode. Move the logic specific to kernel mode out of irqentry_enter() and irqentry_exit() into new irqentry_enter_from_kernel_mode() and irqentry_exit_to_kernel_mode() functions. These are marked __always_inline and placed in irq-entry-common.h, as with irqentry_enter_from_user_mode() and irqentry_exit_to_user_mode(), so that they can be inlined into architecture-specific wrappers. The existing out-of-line irqentry_enter() and irqentry_exit() functions retained as callers of the new functions. The lockdep assertion from irqentry_exit() is moved into irqentry_exit_to_user_mode() and irqentry_exit_to_kernel_mode(). This was previously missing from irqentry_exit_to_user_mode() when called directly, and any new lockdep assertion failure relating from this change is a latent bug. Aside from the lockdep change noted above, there should be no functional change as a result of this change. [ tglx: Updated kernel doc ] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260407131650.3813777-5-mark.rutland@arm.com
2026-04-08entry: Move irqentry_enter() prototype laterMark Rutland
Subsequent patches will rework the irqentry_*() functions. The end result (and the intermediate diffs) will be much clearer if the prototype for the irqentry_enter() function is moved later, immediately before the prototype of the irqentry_exit() function. Move the prototype later. This is purely a move; there should be no functional change as a result of this change. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260407131650.3813777-4-mark.rutland@arm.com
2026-04-08entry: Remove local_irq_{enable,disable}_exit_to_user()Mark Rutland
local_irq_enable_exit_to_user() and local_irq_disable_exit_to_user() are never overridden by architecture code, and are always equivalent to local_irq_enable() and local_irq_disable(). These functions were added on the assumption that arm64 would override them to manage 'DAIF' exception masking, as described by Thomas Gleixner in these threads: https://lore.kernel.org/all/20190919150809.340471236@linutronix.de/ https://lore.kernel.org/all/alpine.DEB.2.21.1910240119090.1852@nanos.tec.linutronix.de/ In practice arm64 did not need to override either. Prior to moving to the generic irqentry code, arm64's management of DAIF was reworked in commit: 97d935faacde ("arm64: Unmask Debug + SError in do_notify_resume()") Since that commit, arm64 only masks interrupts during the 'prepare' step when returning to user mode, and masks other DAIF exceptions later. Within arm64_exit_to_user_mode(), the arm64 entry code is as follows: local_irq_disable(); exit_to_user_mode_prepare_legacy(regs); local_daif_mask(); mte_check_tfsr_exit(); exit_to_user_mode(); Remove the unnecessary local_irq_enable_exit_to_user() and local_irq_disable_exit_to_user() functions. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260407131650.3813777-3-mark.rutland@arm.com
2026-04-08entry: Fix stale comment for irqentry_enter()Mark Rutland
The kerneldoc comment for irqentry_enter() refers to idtentry_exit(), which is an accidental holdover from the x86 entry code that the generic irqentry code was based on. Correct this to refer to irqentry_exit(). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260407131650.3813777-2-mark.rutland@arm.com
2026-04-08ALSA: tea6330t: add mixer state restore helperCássio Gabriel
The InterWave STB variant uses a TEA6330T mixer on its private I2C bus. The mixer state is cached in software, but there is no helper to push that register image back to hardware after system resume. Add a small restore helper that reapplies the cached TEA6330T register image to the device so board drivers can restore the external mixer state as part of their PM resume path. Take snd_i2c_lock() around the full device lookup and restore sequence so the bus device list traversal is also protected. Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20260407-alsa-interwave-pm-v2-2-8dd96c6129e9@gmail.com
2026-04-08wifi: mac80211, cfg80211: Export michael_mic() and move it to cfg80211Eric Biggers
Export michael_mic() so that the ath11k and ath12k drivers can call it. In addition, move it from mac80211 to cfg80211 so that the ipw2x00 drivers, which depend on cfg80211 but not mac80211, can also call it. Currently these drivers have their own local implementations of michael_mic() based on crypto_shash, which is redundant and inefficient. By consolidating all the Michael MIC code into cfg80211, we'll be able to remove the duplicate Michael MIC code in the crypto/ directory. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://patch.msgid.link/20260408030651.80336-3-ebiggers@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-08netfilter: nf_tables_offload: add nft_flow_action_entry_next() and use itPablo Neira Ayuso
Add a new helper function to retrieve the next action entry in flow rule, check if the maximum number of actions is reached, bail out in such case. Replace existing opencoded iteration on the action array by this helper function. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08netfilter: nft_meta: add double-tagged vlan and pppoe supportPablo Neira Ayuso
Currently: add rule netdev x y ip saddr 1.1.1.1 does not work with neither double-tagged vlan nor pppoe packets. This is because the network and transport header offset are not pointing to the IP and transport protocol headers in the stack. This patch expands NFT_META_PROTOCOL and NFT_META_L4PROTO to parse double-tagged vlan and pppoe packets so matching network and transport header fields becomes possible with the existing userspace generated bytecode. Note that this parser only supports double-tagged vlan which is composed of vlan offload + vlan header in the skb payload area for simplicity. NFT_META_PROTOCOL is used by bridge and netdev family as an implicit dependency in the bytecode to match on network header fields. Similarly, there is also NFT_META_L4PROTO, which is also used as an implicit dependency when matching on the transport protocol header fields. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08netfilter: nf_tables: add netlink policy based cap on registersFlorian Westphal
Should have no effect in practice; all of these use the nft_parse_register_load/store apis which is mandatory anyway due to the need to further validate the register load/store, e.g. that the size argument doesn't result in out-of-bounds load/store. OTOH this is a simple method to reject obviously wrong input at earlier stage. Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08netfilter: nf_tables: Fix typo in enum descriptionJelle van der Waa
Fix the spelling of "options". Signed-off-by: Jelle van der Waa <jelle@vdwaa.nl> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08netfilter: use function typedefs for __rcu NAT helper hook pointersSun Jian
After commit 07919126ecfc ("netfilter: annotate NAT helper hook pointers with __rcu"), sparse can warn about type/address-space mismatches when RCU-dereferencing NAT helper hook function pointers. The hooks are __rcu-annotated and accessed via rcu_dereference(), but the combination of complex function pointer declarators and the WRITE_ONCE() machinery used by RCU_INIT_POINTER()/rcu_assign_pointer() can confuse sparse and trigger false positives. Introduce typedefs for the NAT helper function types, so __rcu applies to a simple "fn_t __rcu *" pointer form. Also replace local typeof(hook) variables with "fn_t *" to avoid propagating __rcu address space into temporaries. No functional change intended. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202603022359.3dGE9fwI-lkp@intel.com/ Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-07net: pull headers in qdisc_pkt_len_segs_init()Eric Dumazet
Most ndo_start_xmit() methods expects headers of gso packets to be already in skb->head. net/core/tso.c users are particularly at risk, because tso_build_hdr() does a memcpy(hdr, skb->data, hdr_len); qdisc_pkt_len_segs_init() already does a dissection of gso packets. Use pskb_may_pull() instead of skb_header_pointer() to make sure drivers do not have to reimplement this. Some malicious packets could be fed, detect them so that we can drop them sooner with a new SKB_DROP_REASON_SKB_BAD_GSO drop_reason. Fixes: e876f208af18 ("net: Add a software TSO helper API") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20260403221540.3297753-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08ttm/pool: port to list_lru. (v2)Dave Airlie
This is an initial port of the TTM pools for write combined and uncached pages to use the list_lru. This makes the pool's more NUMA aware and avoids needing separate NUMA pools (later commit enables this). Cc: Christian Koenig <christian.koenig@amd.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Dave Chinner <david@fromorbit.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2026-04-08mm: add gpu active/reclaim per-node stat counters (v2)Dave Airlie
While discussing memcg intergration with gpu memory allocations, it was pointed out that there was no numa/system counters for GPU memory allocations. With more integrated memory GPU server systems turning up, and more requirements for memory tracking it seems we should start closing the gap. Add two counters to track GPU per-node system memory allocations. The first is currently allocated to GPU objects, and the second is for memory that is stored in GPU page pools that can be reclaimed, by the shrinker. Cc: Christian Koenig <christian.koenig@amd.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: linux-mm@kvack.org Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2026-04-07PCI: Remove no_pci_devices()Heiner Kallweit
After having removed the last usage of no_pci_devices(), this function can be removed. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/b0ce592d-c34c-4e0b-b389-4e346b3a0c44@gmail.com
2026-04-07bpf: Retire rcu_trace_implies_rcu_gp()Kumar Kartikeya Dwivedi
RCU Tasks Trace grace period implies RCU grace period, and this guarantee is expected to remain in the future. Only BPF is the user of this predicate, hence retire the API and clean up all in-tree users. RCU Tasks Trace is now implemented on SRCU-fast and its grace period mechanism always has at least one call to synchronize_rcu() as it is required for SRCU-fast's correctness (it replaces the smp_mb() that SRCU-fast readers skip). So, RCU-tt GP will always imply RCU GP. Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260407162234.785270-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-07Merge tag 'hyperv-fixes-signed-20260406' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V fixes from Wei Liu: - Two fixes for Hyper-V PCI driver (Long Li, Sahil Chandna) - Fix an infinite loop issue in MSHV driver (Stanislav Kinsburskii) * tag 'hyperv-fixes-signed-20260406' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: mshv: Fix infinite fault loop on permission-denied GPA intercepts PCI: hv: Fix double ida_free in hv_pci_probe error path PCI: hv: Set default NUMA node to 0 for devices without affinity info
2026-04-07btrfs: tree-checker: introduce checks for FREE_SPACE_INFOQu Wenruo
Introduce checks for FREE_SPACE_INFO item, which include: - Key alignment check The objectid is the logical bytenr of the chunk/bg, and offset is the length of the chunk/bg, thus they should all be aligned to the fs block size. - Item size check The FREE_SPACE_INFO should a fix size. - Flags check The flags member should have no other flags than BTRFS_FREE_SPACE_USING_BITMAPS. For future expansion, introduce a new macro BTRFS_FREE_SPACE_FLAGS_MASK for such checks. And since we're here, the BTRFS_FREE_SPACE_USING_BITMAPS should not use unsigned long long, as the flags is only 32 bits wide. So fix that to use unsigned long. - Extent count check That member shows how many free space bitmap/extent items there are inside the chunk/bg. We know the chunk size (from key->offset), thus there should be at most (key->offset >> sectorsize_bits) blocks inside the chunk. Use that value as the upper limit and if that counter is larger than that, there is a high chance it's a bitflip in high bits. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-04-07btrfs: add tracepoint for search slot restart trackingLeo Martins
Add a btrfs_search_slot_restart tracepoint that fires at each restart site in btrfs_search_slot(), recording the root, tree level, and reason for the restart. This enables tracking search slot restarts which contribute to COW amplification under memory pressure. The four restart reasons are: - write_lock: insufficient write lock level, need to restart with higher lock - setup_nodes: node setup returned -EAGAIN - slot_zero: insertion at slot 0 requires higher write lock level - read_block: read_block_for_search returned -EAGAIN (block not cached or lock contention) COW counts are already tracked by the existing trace_btrfs_cow_block() tracepoint. The per-restart-site tracepoint avoids counter overhead in the critical path when tracepoints are disabled, and provides richer per-event information that bpftrace scripts can aggregate into counts, histograms, and per-root breakdowns. Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Leo Martins <loemra.dev@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-04-07landlock: Control pathname UNIX domain socket resolution by pathGünther Noack
* Add a new access right LANDLOCK_ACCESS_FS_RESOLVE_UNIX, which controls the lookup operations for named UNIX domain sockets. The resolution happens during connect() and sendmsg() (depending on socket type). * Change access_mask_t from u16 to u32 (see below) * Hook into the path lookup in unix_find_bsd() in af_unix.c, using a LSM hook. Make policy decisions based on the new access rights * Increment the Landlock ABI version. * Minor test adaptations to keep the tests working. * Document the design rationale for scoped access rights, and cross-reference it from the header documentation. With this access right, access is granted if either of the following conditions is met: * The target socket's filesystem path was allow-listed using a LANDLOCK_RULE_PATH_BENEATH rule, *or*: * The target socket was created in the same Landlock domain in which LANDLOCK_ACCESS_FS_RESOLVE_UNIX was restricted. In case of a denial, connect() and sendmsg() return EACCES, which is the same error as it is returned if the user does not have the write bit in the traditional UNIX file system permissions of that file. The access_mask_t type grows from u16 to u32 to make space for the new access right. This also doubles the size of struct layer_access_masks from 32 byte to 64 byte. To avoid memory layout inconsistencies between architectures (especially m68k), pack and align struct access_masks [2]. Document the (possible future) interaction between scoped flags and other access rights in struct landlock_ruleset_attr, and summarize the rationale, as discussed in code review leading up to [3]. This feature was created with substantial discussion and input from Justin Suess, Tingmao Wang and Mickaël Salaün. Cc: Tingmao Wang <m@maowtm.org> Cc: Justin Suess <utilityemal77@gmail.com> Cc: Kuniyuki Iwashima <kuniyu@google.com> Suggested-by: Jann Horn <jannh@google.com> Link[1]: https://github.com/landlock-lsm/linux/issues/36 Link[2]: https://lore.kernel.org/all/20260401.Re1Eesu1Yaij@digikod.net/ Link[3]: https://lore.kernel.org/all/20260205.8531e4005118@gnoack.org/ Signed-off-by: Günther Noack <gnoack3000@gmail.com> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20260327164838.38231-5-gnoack3000@gmail.com [mic: Fix kernel-doc formatting, pack and align access_masks] Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-04-07lsm: Add LSM hook security_unix_findJustin Suess
Add an LSM hook security_unix_find. This hook is called to check the path of a named UNIX socket before a connection is initiated. The peer socket may be inspected as well. Why existing hooks are unsuitable: Existing socket hooks, security_unix_stream_connect(), security_unix_may_send(), and security_socket_connect() don't provide TOCTOU-free / namespace independent access to the paths of sockets. (1) We cannot resolve the path from the struct sockaddr in existing hooks. This requires another path lookup. A change in the path between the two lookups will cause a TOCTOU bug. (2) We cannot use the struct path from the listening socket, because it may be bound to a path in a different namespace than the caller, resulting in a path that cannot be referenced at policy creation time. Consumers of the hook wishing to reference @other are responsible for acquiring the unix_state_lock and checking for the SOCK_DEAD flag therein, ensuring the socket hasn't died since lookup. Cc: Günther Noack <gnoack3000@gmail.com> Cc: Tingmao Wang <m@maowtm.org> Cc: Mickaël Salaün <mic@digikod.net> Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Justin Suess <utilityemal77@gmail.com> Signed-off-by: Günther Noack <gnoack3000@gmail.com> Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20260327164838.38231-2-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-04-07landlock: Allow TSYNC with LOG_SUBDOMAINS_OFF and fd=-1Mickaël Salaün
LANDLOCK_RESTRICT_SELF_TSYNC does not allow LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF with ruleset_fd=-1, preventing a multithreaded process from atomically propagating subdomain log muting to all threads without creating a domain layer. Relax the fd=-1 condition to accept TSYNC alongside LOG_SUBDOMAINS_OFF, and update the documentation accordingly. Add flag validation tests for all TSYNC combinations with ruleset_fd=-1, and audit tests verifying both transition directions: muting via TSYNC (logged to not logged) and override via TSYNC (not logged to logged). Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Fixes: 42fc7e6543f6 ("landlock: Multithreading support for landlock_restrict_self()") Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260407164107.2012589-2-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>