path: root/include
Age | Commit message | Author
2026-06-04 | tracing: Fix CFI violation in probestub being called by tprobes | Eva Kurchatova
The probestub is a function to allow tprobes to hook to a tracepoint to gain access to its parameters. The function itself is only referenced by the tracepoint structure which lives in the __tracepoint section. objtool explicitly ignores that section and when processing functions in the kernel, if it detects one that has no references it will seal it to have its ENDBR stripped on boot up. This means when a tprobe is attached to the sched_wakeup tracepoint, when it is triggered it will call __probestub_sched_wakeup and due to the missing ENDBR on a CFI-enabled machine it will take a #CP exception. Fix this by adding CFI_NOSEAL annotation to probestub declaration. Cc: stable@vger.kernel.org Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://patch.msgid.link/20260603153147.573589-1-eva.kurchatova@virtuozzo.com Fixes: d5173f753750 ("objtool: Exclude __tracepoints data from ENDBR checks") Signed-off-by: Eva Kurchatova <eva.kurchatova@virtuozzo.com> [ Updated change log ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-06-04 | cpu: Add lockdep_is_cpus_held()/lockdep_is_cpus_write_held() stubs for !CONFIG_HOTPLUG_CPU | Reinette Chatre
lockdep_is_cpus_held() and lockdep_is_cpus_write_held() are undefined when !CONFIG_HOTPLUG_CPU. This is ok because their few callers protect the calls with a "if (IS_ENABLED(CONFIG_HOTPLUG_CPU) ..." check. It is error prone to require callers to protect lockdep_is_cpus_held() and lockdep_is_cpus_write_held() with an IS_ENABLED(CONFIG_HOTPLUG_CPU) check while the custom for equivalent functions, for example the more prevalent lockdep_is_held(), is to not require similar protection. It is also inconsistent with the CPU hotplug lockdep code itself, since the related call lockdep_assert_cpus_held() does not require protection. Create stubs for lockdep_is_cpus_held() and lockdep_is_cpus_write_held() that return 1 (LOCK_STATE_UNKNOWN/LOCK_STATE_HELD) when !CONFIG_HOTPLUG_CPU. This makes the CPU hotplug lockdep checks consistent while following existing lockdep custom. Drop the "extern" from the function declaration as part of the move to match kernel coding style. Keep the IS_ENABLED(CONFIG_HOTPLUG_CPU) checks in existing users since removing them would change the logic of these expressions. Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/7484f0b58fd86153d445819cc4e172adba16cff9.1780543665.git.reinette.chatre@intel.com Closes: https://sashiko.dev/#/patchset/cover.1780456704.git.reinette.chatre%40intel.com?part=1
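The stub pattern described above can be sketched in user space. This is only an illustration under stated assumptions: `DEMO_HOTPLUG_CPU` stands in for CONFIG_HOTPLUG_CPU, and every `demo_*` name is hypothetical rather than taken from the kernel header.

```c
#include <assert.h>
#include <stdbool.h>

/* HYPOTHETICAL stand-in for CONFIG_HOTPLUG_CPU; uncomment to build the "real" variant. */
/* #define DEMO_HOTPLUG_CPU 1 */

#ifdef DEMO_HOTPLUG_CPU
/* "Real" variant: track whether the (simulated) hotplug lock is held. */
static int demo_cpus_lock_depth;

static inline void demo_cpus_read_lock(void)   { demo_cpus_lock_depth++; }
static inline void demo_cpus_read_unlock(void) { assert(demo_cpus_lock_depth > 0); demo_cpus_lock_depth--; }
static inline int demo_lockdep_is_cpus_held(void) { return demo_cpus_lock_depth > 0; }
#else
/* Stub variant: without hotplug there is nothing to lock, so report "held". */
static inline void demo_cpus_read_lock(void)   { }
static inline void demo_cpus_read_unlock(void) { }
static inline int demo_lockdep_is_cpus_held(void) { return 1; }
#endif

/* A caller no longer needs its own configuration check around the query. */
static inline bool demo_caller_may_touch_cpu_state(void)
{
	return demo_lockdep_is_cpus_held() != 0;
}
```

With the stub in place, a caller compiles and behaves sensibly in both configurations, which is the consistency the commit message asks for.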
2026-06-04 | buffer: Remove end_buffer_write_sync() | Matthew Wilcox (Oracle)
It has no callers left, so delete it. Inline __end_buffer_write_sync() into bh_end_write(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://patch.msgid.link/20260528173150.1093780-35-willy@infradead.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-06-04 | buffer: Remove b_end_io | Matthew Wilcox (Oracle)
This shrinks buffer_head by 8 bytes, letting us pack more buffer heads per slab. With a Debian config, it shrinks from 104 bytes to 96 bytes which is 42 objects per 4KiB page rather than 39, a 7% reduction in the amount of memory used. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://patch.msgid.link/20260528173150.1093780-33-willy@infradead.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
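The packing arithmetic above can be checked with a few lines of C. This is a rough sketch that assumes objects are packed back to back in a 4096-byte page with no per-slab metadata; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Objects that fit in one page, assuming no per-slab metadata or padding. */
static size_t demo_objs_per_page(size_t page_size, size_t obj_size)
{
	assert(obj_size > 0);
	return page_size / obj_size;
}

/* Per-object size reduction, in percent of the old size. */
static double demo_size_reduction_percent(size_t old_size, size_t new_size)
{
	assert(old_size > 0 && new_size <= old_size);
	return 100.0 * (double)(old_size - new_size) / (double)old_size;
}
```

Under these assumptions, 4096 / 104 gives 39 objects and 4096 / 96 gives 42, and the per-object saving is 8 / 104, a little under 8%, which is in line with the roughly 7% quoted in the message.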
2026-06-04 | buffer: Remove submit_bh() | Matthew Wilcox (Oracle)
No users are left; remove this API. Also remove/fix comments mentioning it, and end_bio_bh_io_sync() as it's now unused. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://patch.msgid.link/20260528173150.1093780-32-willy@infradead.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-06-04 | buffer: Remove mark_buffer_async_write() | Matthew Wilcox (Oracle)
There are no more callers of this function, so delete it. end_buffer_async_write() then has only one caller left, so inline it into bh_end_async_write(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://patch.msgid.link/20260528173150.1093780-27-willy@infradead.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-06-04 | buffer: Add bh_end_read(), bh_end_write() and bh_end_async_write() | Matthew Wilcox (Oracle)
These are the bio_end_io_t versions of end_buffer_read_sync(), end_buffer_write_sync() and end_buffer_async_write(). They do not contain a put_bh() call as it is no longer necessary. Also add the helper function bio_endio_bh(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://patch.msgid.link/20260528173150.1093780-5-willy@infradead.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-06-04 | buffer: Add bh_submit() | Matthew Wilcox (Oracle)
bh_submit() takes a bio_end_io allowing users to avoid the indirect function call through bh->b_end_io, and eventually allowing us to remove bh->b_end_io. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://patch.msgid.link/20260528173150.1093780-3-willy@infradead.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-06-04 | ALSA: core: Add scoped cleanup helper for card references | Cássio Gabriel
Several ALSA paths acquire temporary card references with snd_card_ref() and release them manually with snd_card_unref(). control_led.c already defines a local cleanup helper for this pattern, while other core paths still open-code the release. Move the helper to the common ALSA core header and use it in control-layer card-reference paths. This makes the ownership rule explicit and avoids future missing-unref mistakes when adding early exits. No functional change is intended. Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20260604-alsa-scoped-cleanups-v1-2-10c43152a728@gmail.com
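The scoped-cleanup idea can be illustrated in user space with the GCC/Clang `cleanup` variable attribute. This is a sketch, not the helper the patch adds: `struct demo_card`, `demo_card_ref()`, `demo_card_unref()` and the `DEMO_CARD_SCOPED` macro are hypothetical stand-ins for the ALSA card reference API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted object standing in for a sound card. */
struct demo_card {
	int refcount;
};

static struct demo_card *demo_card_ref(struct demo_card *card)
{
	if (card)
		card->refcount++;
	return card;
}

static void demo_card_unref(struct demo_card *card)
{
	assert(card->refcount > 0);
	card->refcount--;
}

/* Cleanup handler: receives a pointer to the annotated variable. */
static void demo_card_unref_cleanup(struct demo_card **cardp)
{
	if (*cardp)
		demo_card_unref(*cardp);
}

/* Variables marked with this drop their reference when they go out of scope. */
#define DEMO_CARD_SCOPED __attribute__((cleanup(demo_card_unref_cleanup)))

/* Every return path, including the early exits, releases the reference. */
static int demo_use_card(struct demo_card *shared, int fail_early)
{
	struct demo_card *card DEMO_CARD_SCOPED = demo_card_ref(shared);

	if (!card)
		return -1;
	if (fail_early)
		return -2;	/* no manual unref needed here */
	return card->refcount;	/* read before the cleanup handler runs */
}
```

The point of the pattern is that adding another early `return` to `demo_use_card()` cannot leak a reference, because the release is tied to the variable's scope rather than to each exit path.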
2026-06-04 | mm: kick writeback flusher for IOCB_DONTCACHE with targeted dirty tracking | Jeff Layton
The IOCB_DONTCACHE writeback path in generic_write_sync() calls filemap_flush_range() on every write, submitting writeback inline in the writer's context. Perf lock contention profiling shows the performance problem is not lock contention but the writeback submission work itself — walking the page tree and submitting I/O blocks the writer for milliseconds, inflating p99.9 latency from 23ms (buffered) to 93ms (dontcache). Replace the inline filemap_flush_range() call with a flusher kick that drains dirty pages in the background. This moves writeback submission completely off the writer's hot path. To avoid flushing unrelated buffered dirty data, add a dedicated WB_start_dontcache bit and wb_check_start_dontcache() handler that uses the per-wb WB_DONTCACHE_DIRTY counter to determine how many pages to write back. The flusher writes back that many pages from the oldest dirty inodes (not restricted to dontcache-specific inodes). This helps preserve I/O batching while limiting the scope of expedited writeback. Like WB_start_all, the WB_start_dontcache bit coalesces multiple DONTCACHE writes into a single flusher wakeup without per-write allocations. Use test_and_clear_bit to atomically consume the kick request before reading the dirty counter and starting writeback, so that concurrent DONTCACHE writes during writeback can re-set the bit and schedule a follow-up flusher run. Read the dirty counter with wb_stat_sum() (aggregating per-CPU batches) rather than wb_stat() (which reads only the global counter) to ensure small writes below the percpu batch threshold are visible to the flusher. 
In filemap_dontcache_kick_writeback(), set the WB_start_dontcache bit inside the unlocked_inode_to_wb_begin/end section for correct cgroup writeback domain targeting, but defer the wb_wakeup() call until after the section ends, since wb_wakeup() uses spin_unlock_irq() which would unconditionally re-enable interrupts while the i_pages xa_lock may still be held under irqsave during a cgroup writeback switch. Pin the wb with wb_get() inside the RCU critical section before calling wb_wakeup() outside it, since cgroup bdi_writeback structures are RCU-freed and the wb pointer could become invalid after unlocked_inode_to_wb_end() drops the RCU read lock.
Also add WB_REASON_DONTCACHE as a new writeback reason for tracing visibility.
dontcache-bench results (same host, T6F_SKL_1920GBF, 251 GiB RAM, xfs on NVMe, fio io_uring):
Buffered and direct I/O paths are unaffected by this patchset. All improvements are confined to the dontcache path:
Single-stream throughput (MB/s):
                              Before     After    Change
  seq-write/dontcache            298       897     +201%
  rand-write/dontcache           131       236      +80%
Tail latency improvements (seq-write/dontcache):
  p99:   135,266 us -> 23,986 us (-82%)
  p99.9: 8,925,479 us -> 28,443 us (-99.7%)
Multi-writer (4 jobs, sequential write):
                              Before     After    Change
  dontcache aggregate (MB/s)   2,529     4,532      +79%
  dontcache p99 (us)           8,553     1,002      -88%
  dontcache p99.9 (us)       109,314     1,057      -99%
Dontcache multi-writer throughput now matches buffered (4,532 vs 4,616 MB/s).
32-file write (Axboe test):
                              Before     After    Change
  dontcache aggregate (MB/s)   1,548     3,499     +126%
  dontcache p99 (us)          10,170       602      -94%
  Peak dirty pages (MB)        1,837       213      -88%
Dontcache now reaches 81% of buffered throughput (was 35%).
Competing writers (dontcache vs buffered, separate files):
                              Before     After
  buffered writer                868       433 MB/s
  dontcache writer               415       433 MB/s
  Aggregate                    1,284       866 MB/s
Previously the buffered writer starved the dontcache writer 2:1. With per-bdi_writeback tracking, both writers now receive equal bandwidth. 
The aggregate matches the buffered-vs-buffered baseline (863 MB/s), indicating fair sharing regardless of I/O mode. The dontcache writer's p99.9 latency collapsed from 119 ms to 33 ms (-73%), eliminating the severe periodic stalls seen in the baseline. Both writers now share identical latency profiles, matching the buffered-vs-buffered pattern. The per-bdi_writeback dirty tracking dramatically reduces peak dirty pages in dontcache workloads, with the 32-file test dropping from 1.8 GB to 213 MB. Dontcache sequential write throughput triples and multi-writer throughput reaches parity with buffered I/O, with tail latencies collapsing by 1-2 orders of magnitude. Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260511-dontcache-v7-3-2848ddce8090@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
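The coalescing behaviour of the kick bit can be sketched with C11 atomics. This is a user-space illustration under loud assumptions: the `demo_*` names are hypothetical, a plain atomic counter replaces the per-CPU wb stat, and the flusher "writes back" by simply zeroing the counter. It does not reproduce the kernel's wb_check_start_dontcache().

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_START_DONTCACHE (1u << 0)	/* hypothetical stand-in for WB_start_dontcache */

struct demo_wb {
	atomic_uint state;		/* request bits */
	atomic_long dontcache_dirty;	/* stand-in for the WB_DONTCACHE_DIRTY counter */
	unsigned int wakeups;		/* how often the flusher was woken */
};

static void demo_wb_init(struct demo_wb *wb)
{
	atomic_init(&wb->state, 0u);
	atomic_init(&wb->dontcache_dirty, 0L);
	wb->wakeups = 0;
}

/* Writer side: account the dirty pages, then request a flusher run.
 * Returns true only when this call actually had to wake the flusher. */
static bool demo_kick(struct demo_wb *wb, long pages)
{
	atomic_fetch_add(&wb->dontcache_dirty, pages);

	unsigned int old = atomic_fetch_or(&wb->state, DEMO_START_DONTCACHE);
	if (old & DEMO_START_DONTCACHE)
		return false;	/* a run is already pending: coalesced */
	wb->wakeups++;
	return true;
}

/* Flusher side: consume the request bit first, then read the counter, so a
 * writer racing with writeback can set the bit again for a follow-up run.
 * Returns the number of pages to write back (0 if nothing was requested). */
static long demo_check_start(struct demo_wb *wb)
{
	unsigned int old = atomic_fetch_and(&wb->state, ~DEMO_START_DONTCACHE);
	if (!(old & DEMO_START_DONTCACHE))
		return 0;
	/* Simplification: treat all counted pages as written back at once. */
	return atomic_exchange(&wb->dontcache_dirty, 0L);
}
```

In this sketch two back-to-back kicks produce a single wakeup, and a kick that arrives after the flusher has consumed the bit schedules a fresh run, which mirrors the ordering argument in the commit message.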
2026-06-04 | mm: track DONTCACHE dirty pages per bdi_writeback | Jeff Layton
Add a per-wb WB_DONTCACHE_DIRTY counter that tracks the number of dirty pages with the dropbehind flag set (i.e., pages dirtied via RWF_DONTCACHE writes). Increment the counter alongside WB_RECLAIMABLE in folio_account_dirtied() when the folio has the dropbehind flag set, and decrement it in folio_clear_dirty_for_io() and folio_account_cleaned(). Also decrement it when a non-DONTCACHE lookup atomically clears the dropbehind flag on a dirty folio in __filemap_get_folio_mpol(), using folio_test_clear_dropbehind() to prevent concurrent lookups from double-decrementing the counter, and guarding the decrement with mapping_can_writeback() to match the increment path. Transfer the counter alongside WB_RECLAIMABLE in inode_do_switch_wbs() so that the stat is properly migrated when an inode switches cgroup writeback domains. The counter will be used by the writeback flusher to determine how many pages to write back when expediting writeback for IOCB_DONTCACHE writes, without flushing the entire BDI's dirty pages. Suggested-by: Jan Kara <jack@suse.cz> Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260511-dontcache-v7-2-2848ddce8090@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-06-04 | Merge tag 'zynqmp-soc-for-7.2' of https://github.com/Xilinx/linux-xlnx into soc/drivers | Linus Walleij
arm64: Xilinx SOC changes for 7.2
firmware:
  - Add CSU register discovery with sysfs interface
zynqmp_power:
  - Fix race condition in event registration
  - Fix shutdown and free rx mailbox channel
* tag 'zynqmp-soc-for-7.2' of https://github.com/Xilinx/linux-xlnx:
  firmware: zynqmp: Add dynamic CSU register discovery and sysfs interface
  Documentation: ABI: add sysfs interface for ZynqMP CSU registers
  soc: xilinx: Shutdown and free rx mailbox channel
  soc: xilinx: Fix race condition in event registration
Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-06-03 | Merge tag 'for-net-2026-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth | Jakub Kicinski
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
  - hci_core: fix memory leak in error path of hci_alloc_dev()
  - hci_sync: reject oversized Broadcast Announcement prepend
  - MGMT: Fix backward compatibility with userspace
  - MGMT: validate advertising TLV before type checks
  - L2CAP: reject BR/EDR signaling packets over MTUsig
  - RFCOMM: validate skb length in MCC handlers
  - RFCOMM: hold listener socket in rfcomm_connect_ind()
  - ISO: Fix not releasing hdev reference on iso_conn_big_sync
  - ISO: Fix a use-after-free of the hci_conn pointer
  - ISO: Fix data-race on iso_pi fields in hci_get_route calls
  - SCO: Fix data-race on sco_pi fields in sco_connect
  - BNEP: reject short frames before parsing
* tag 'for-net-2026-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: MGMT: Fix backward compatibility with userspace
  Bluetooth: SCO: Fix data-race on sco_pi fields in sco_connect
  Bluetooth: ISO: Fix data-race on iso_pi fields in hci_get_route calls
  Bluetooth: ISO: Fix a use-after-free of the hci_conn pointer
  Bluetooth: ISO: Fix not releasing hdev reference on iso_conn_big_sync
  Bluetooth: fix memory leak in error path of hci_alloc_dev()
  Bluetooth: bnep: reject short frames before parsing
  Bluetooth: hci_sync: reject oversized Broadcast Announcement prepend
  Bluetooth: L2CAP: reject BR/EDR signaling packets over MTUsig
  Bluetooth: RFCOMM: validate skb length in MCC handlers
  Bluetooth: MGMT: validate advertising TLV before type checks
  Bluetooth: RFCOMM: hold listener socket in rfcomm_connect_ind()
====================
Link: https://patch.msgid.link/20260603162714.342496-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-03 | mptcp: change mptcp_established_options() to return opt_size | Eric Dumazet
Instead of passing opt_size address to mptcp_established_options(), change this function to return it by value. This removes the need for an expensive stack canary in tcp_established_options() when CONFIG_STACKPROTECTOR_STRONG=y.
$ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-92 (-92)
Function                          old     new   delta
tcp_options_write.isra           1423    1407     -16
mptcp_established_options        2746    2720     -26
tcp_established_options           553     503     -50
Total: Before=22110750, After=22110658, chg -0.00%
Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260602125138.2317015-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-03 | mptcp: fix uninit-value in mptcp_established_options | Paolo Abeni
syzbot reported the following uninit splat:
BUG: KMSAN: uninit-value in mptcp_write_data_fin net/mptcp/options.c:542 [inline]
BUG: KMSAN: uninit-value in mptcp_established_options_dss net/mptcp/options.c:590 [inline]
BUG: KMSAN: uninit-value in mptcp_established_options+0x112f/0x3530 net/mptcp/options.c:874
  mptcp_write_data_fin net/mptcp/options.c:542 [inline]
  mptcp_established_options_dss net/mptcp/options.c:590 [inline]
  mptcp_established_options+0x112f/0x3530 net/mptcp/options.c:874
  tcp_established_options+0x312/0xcc0 net/ipv4/tcp_output.c:1192
  __tcp_transmit_skb+0x5dc/0x5fe0 net/ipv4/tcp_output.c:1575
  __tcp_send_ack+0x967/0xad0 net/ipv4/tcp_output.c:4499
  tcp_send_ack+0x3d/0x60 net/ipv4/tcp_output.c:4505
  mptcp_subflow_shutdown+0x164/0x690 net/mptcp/protocol.c:3137
  mptcp_check_send_data_fin+0x31b/0x3d0 net/mptcp/protocol.c:3218
  __mptcp_wr_shutdown net/mptcp/protocol.c:3234 [inline]
  __mptcp_close+0x860/0x1360 net/mptcp/protocol.c:3313
  mptcp_close+0x42/0x260 net/mptcp/protocol.c:3367
  inet_release+0x1ee/0x2a0 net/ipv4/af_inet.c:442
  __sock_release net/socket.c:722 [inline]
  sock_close+0xd6/0x2f0 net/socket.c:1514
  __fput+0x60e/0x1010 fs/file_table.c:510
  ____fput+0x25/0x30 fs/file_table.c:538
  task_work_run+0x208/0x2b0 kernel/task_work.c:233
  resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
  __exit_to_user_mode_loop kernel/entry/common.c:67 [inline]
  exit_to_user_mode_loop+0x306/0x1b60 kernel/entry/common.c:98
  __exit_to_user_mode_prepare include/linux/irq-entry-common.h:207 [inline]
  syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:238 [inline]
  syscall_exit_to_user_mode include/linux/entry-common.h:318 [inline]
  __do_fast_syscall_32+0x2c7/0x460 arch/x86/entry/syscall_32.c:310
  do_fast_syscall_32+0x37/0x80 arch/x86/entry/syscall_32.c:332
  do_SYSENTER_32+0x1f/0x30 arch/x86/entry/syscall_32.c:370
  entry_SYSENTER_compat_after_hwframe+0x84/0x8e
Local variable opts created at:
  __tcp_transmit_skb+0x4d/0x5fe0 net/ipv4/tcp_output.c:1536
  __tcp_send_ack+0x967/0xad0 net/ipv4/tcp_output.c:4499
The output path currently omits initializing the mptcp extension `use_map` flag in a few corner cases. Address the issue by always zeroing all the extension flags before eventually initializing the individual bits. To that end, introduce and use a struct_group to avoid multiple bitwise operations. Fixes: cfcceb7a39fc ("tcp: shrink per-packet memset in __tcp_transmit_skb()") Cc: stable@vger.kernel.org Reported-by: syzbot+ff020673c5e3d94d9478@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ff020673c5e3d94d9478 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260602-net-mptcp-misc-fixes-7-1-rc7-v2-10-856831229976@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
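The grouping trick can be sketched in user space as follows. It mimics the shape of the kernel's struct_group() helper with a hypothetical `DEMO_STRUCT_GROUP` macro; only `use_map` is named in the commit message, and the other fields and the `struct demo_ext` layout are illustrative assumptions.

```c
#include <string.h>

/* Hypothetical user-space imitation of struct_group(): the same members are
 * reachable both directly and through one named sub-struct. */
#define DEMO_STRUCT_GROUP(NAME, ...) \
	union { struct { __VA_ARGS__ }; struct { __VA_ARGS__ } NAME; }

struct demo_ext {
	unsigned long long data_seq;	/* illustrative payload, not a flag */
	DEMO_STRUCT_GROUP(flags,
		unsigned int use_map:1;
		unsigned int dsn64:1;
		unsigned int data_fin:1;
		unsigned int use_ack:1;
	);
};

/* Clear every flag with one store-sized operation instead of one
 * read-modify-write per bit; fields outside the group are untouched. */
static void demo_ext_clear_flags(struct demo_ext *ext)
{
	memset(&ext->flags, 0, sizeof(ext->flags));
}
```

Zeroing the whole group up front means a later code path that forgets to set one bit leaves it at 0 rather than at whatever the stack held, which is the class of bug the KMSAN report points at.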
2026-06-03 | geneve: Introduce IFLA_GENEVE_LOCAL and IFLA_GENEVE_LOCAL6. | Kuniyuki Iwashima
By default, a GENEVE device bind()s its underlying UDP socket(s) to the IPv4 or IPv6 wildcard address because there is no way to specify a specific local IP address to bind() to. This prevents deploying multiple GENEVE devices on a multi-homed host where each device should be isolated and bound to a different local IP address on the same UDP port. Let's introduce new options, IFLA_GENEVE_LOCAL and IFLA_GENEVE_LOCAL6, to allow specifying a local IPv4/IPv6 address for the backend UDP socket. By default, when collect metadata mode (IFLA_GENEVE_COLLECT_METADATA) is enabled, both IPv4 and IPv6 sockets are created. However, if a source address is specified via the new attributes, only a single socket corresponding to that specific address family is created. Accordingly, geneve_find_sock() and geneve_find_dev() are updated to take the source address into account, ensuring that multiple devices and sockets configured with different source addresses can coexist without conflict. In addition, the source address is validated in geneve_xmit_skb() and geneve6_xmit_skb(), so the BPF prog must set it in bpf_tunnel_key. With this change, multiple GENEVE devices can be successfully created and bound to their respective local IP addresses:
(*) "local" is the keyword for IFLA_GENEVE_LOCAL / IFLA_GENEVE_LOCAL6
  # for i in $(seq 1 2); do
      ip link add geneve4_${i} type geneve local 192.168.0.${i} external
      ip addr add 192.168.0.${i}/24 dev geneve4_${i}
      ip link set geneve4_${i} up
      ip link add geneve6_${i} type geneve local 2001:9292::${i} external
      ip addr add 2001:9292::${i}/64 dev geneve6_${i} nodad
      ip link set geneve6_${i} up
    done
  # ip -d l | grep geneve
  9: geneve4_1: <BROADCAST,MULTICAST,UP,LOWER_UP> ...
      geneve external id 0 local 192.168.0.1 ...
  10: geneve6_1: <BROADCAST,MULTICAST,UP,LOWER_UP> ...
      geneve external id 0 local 2001:9292::1 ...
  11: geneve4_2: <BROADCAST,MULTICAST,UP,LOWER_UP> ...
      geneve external id 0 local 192.168.0.2 ...
  12: geneve6_2: <BROADCAST,MULTICAST,UP,LOWER_UP> ...
      geneve external id 0 local 2001:9292::2 ...
  # ss -ua | grep geneve
  UNCONN 0 0 192.168.0.2:geneve 0.0.0.0:*
  UNCONN 0 0 192.168.0.1:geneve 0.0.0.0:*
  UNCONN 0 0 [2001:9292::2]:geneve *:*
  UNCONN 0 0 [2001:9292::1]:geneve *:*
Note that even if the local address is explicitly configured with the wildcard address, kernel does not dump it except for devices with IFLA_GENEVE_COLLECT_METADATA. This is consistent with the behaviour of is_tnl_info_zero(), which treats the wildcard remote address as not configured.
## ynl example.
  # ./tools/net/ynl/pyynl/cli.py \
      --spec ./Documentation/netlink/specs/rt-link.yaml \
      --do newlink --create \
      --json '{"ifname": "geneve0", "linkinfo": {"kind":"geneve", "data": {"local": "0.0.0.0", "collect-metadata": true}}}'
  # ./tools/net/ynl/pyynl/cli.py \
      --spec ./Documentation/netlink/specs/rt-link.yaml \
      --do getlink \
      --json '{"ifname": "geneve0"}' --output-json | \
      jq .linkinfo.data.local
  "0.0.0.0"
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260602190436.139591-6-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-03 | power: supply: Remove unused jz4740-battery.h | Costa Shulyupin
The last user was removed in commit aea12071d6fc ("power/supply: Drop obsolete JZ4740 driver") and replaced by a self-contained IIO-based driver. No file includes this header. Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Link: https://patch.msgid.link/20260515185043.1523363-1-costa.shul@redhat.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2026-06-04 | Merge tag 'drm-msm-next-2026-05-30' of https://gitlab.freedesktop.org/drm/msm into drm-next | Dave Airlie
Changes for v7.2
Core:
  - Fixed documentation for msm_gem_shrinker functions
  - IFPC related enablement/fixes for gen8
  - PERFCNTR_CONFIG ioctl support
GPU:
  - Reworked handling of UBWC configuration
  - a810 support
MDSS:
  - Added Milos platform support
  - Reworked handling of UBWC configuration
DisplayPort:
  - Reworked HPD handling, preparing for the MST support
DPU:
  - Added Milos platform support
  - Reworked handling of UBWC configuration
DSI:
  - Added Milos platform support
Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <rob.clark@oss.qualcomm.com> Link: https://patch.msgid.link/CACSVV00DXZcvFH2-C3fouve5DGs0DGa-vvsJPuaRmUZZVNKOfg@mail.gmail.com
2026-06-03 | power: supply: max17042_battery: use ModelCfg refresh on max17055 | Sebastian Krzyszkowiak
Unlike other models, max17055 doesn't require cell characterization data and operates on a smaller set of input variables (`DesignCap`, `VEmpty`, `IChgTerm`, and `ModelCfg`). Those values can be filled in through `max17042_override_por_values()`, but the refresh bit has to be set afterward in order to make them apply. Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm> Signed-off-by: Vincent Cloutier <vincent@cloutier.co> Link: https://patch.msgid.link/20260406205759.493288-8-vincent.cloutier@icloud.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2026-06-03 | power: supply: max17042_battery: Remove unused platform-data plumbing | Vincent Cloutier
No in-tree user still provides `max17042_platform_data` or `max17042_reg_data`. Move the simple runtime fields into `struct max17042_chip`, populate them directly from DT or the default hardware state, and drop the unused public platform-data interface. While here, write the MAX17047/MAX17050 default `FullSOCThr` value directly in probe instead of carrying it through an `init_data` table. Signed-off-by: Vincent Cloutier <vincent@cloutier.co> Link: https://patch.msgid.link/20260406205759.493288-7-vincent.cloutier@icloud.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2026-06-03 | liveupdate: Remove limit on the number of files per session | Pasha Tatashin
To remove the fixed limit on the number of preserved files per session, transition the file metadata serialization from a single contiguous memory block to a chain of linked blocks. Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Link: https://patch.msgid.link/20260603154402.468928-11-pasha.tatashin@soleen.com Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2026-06-03 | liveupdate: Remove limit on the number of sessions | Pasha Tatashin
Currently, the number of LUO sessions is limited by a fixed number of pre-allocated pages for serialization (16 pages, allowing for ~819 sessions). This limitation is problematic if LUO is used to support things such as systemd file descriptor store, and would be used not just as VM memory but to save other states on the machine. Remove this limit by transitioning to a linked-block approach for session metadata serialization. Instead of a single contiguous block, session metadata is now stored in a chain of 16-page blocks. Each block starts with a header containing the physical address of the next block and the number of session entries in the current block. Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Link: https://patch.msgid.link/20260603154402.468928-10-pasha.tatashin@soleen.com Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
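The linked-block layout can be sketched in user space. Everything here is an assumption made for illustration: an 80-byte `struct demo_entry` (chosen because it makes one 16-page block hold 819 entries, close to the figure quoted above), ordinary heap addresses in place of physical addresses, and hypothetical `demo_*` names. It is not the LUO or KHO on-disk format.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_BLOCK_SIZE (16u * 4096u)	/* one block is 16 pages of 4 KiB */

/* Hypothetical 80-byte serialized session entry. */
struct demo_entry {
	char name[64];
	uint64_t token;
	uint64_t data;
};

/* Each block starts with the address of the next block and its entry count. */
struct demo_block_hdr {
	uint64_t next;	/* 0 terminates the chain */
	uint64_t count;
};

#define DEMO_ENTRIES_PER_BLOCK \
	((DEMO_BLOCK_SIZE - sizeof(struct demo_block_hdr)) / sizeof(struct demo_entry))

struct demo_block {
	struct demo_block_hdr hdr;
	struct demo_entry entries[DEMO_ENTRIES_PER_BLOCK];
};

struct demo_chain {
	struct demo_block *head;
	struct demo_block *tail;
};

/* Append one entry, growing the chain by a block when the tail is full. */
static int demo_chain_append(struct demo_chain *c, const struct demo_entry *e)
{
	struct demo_block *b = c->tail;

	if (!b || b->hdr.count == DEMO_ENTRIES_PER_BLOCK) {
		b = calloc(1, sizeof(*b));
		if (!b)
			return -1;
		if (c->tail)
			c->tail->hdr.next = (uint64_t)(uintptr_t)b;
		else
			c->head = b;
		c->tail = b;
	}
	b->entries[b->hdr.count++] = *e;
	return 0;
}

/* Walk the chain through the headers and sum the per-block counts. */
static uint64_t demo_chain_count(const struct demo_chain *c)
{
	uint64_t total = 0;
	const struct demo_block *b = c->head;

	while (b) {
		total += b->hdr.count;
		b = (const struct demo_block *)(uintptr_t)b->hdr.next;
	}
	return total;
}

/* Free every block; returns how many blocks the chain contained. */
static unsigned int demo_chain_free(struct demo_chain *c)
{
	unsigned int freed = 0;
	struct demo_block *b = c->head;

	while (b) {
		struct demo_block *next = (struct demo_block *)(uintptr_t)b->hdr.next;
		free(b);
		freed++;
		b = next;
	}
	c->head = c->tail = NULL;
	return freed;
}
```

With these sizes a single block tops out at 819 entries, so storing 1000 entries simply allocates a second block instead of failing, which is the limit the commit removes.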
2026-06-03 | kho: add support for linked-block serialization | Pasha Tatashin
Introduce a linked-block serialization mechanism for state handover. Previously, LUO used contiguous memory blocks for serializing sessions and files, which imposed limits on the total number of items that could be preserved across a live update. This commit adds the infrastructure for a more flexible, block-based approach where serialized data is stored in a chain of linked blocks. This is a generic KHO serialization block infrastructure that can be used by multiple subsystems. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Link: https://patch.msgid.link/20260603154402.468928-8-pasha.tatashin@soleen.com Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2026-06-03 | liveupdate: register luo_ser as KHO subtree | Pasha Tatashin
Entirely remove the LUO FDT wrapper since the FDT only carries the compatible string and the pointer to the centralized struct luo_ser. Instead, register the struct luo_ser via the KHO raw subtree API, placing the compatibility string inside the structure itself. Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Link: https://patch.msgid.link/20260603154402.468928-5-pasha.tatashin@soleen.com Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2026-06-03 | liveupdate: centralize state management into struct luo_ser | Pasha Tatashin
Transition the LUO to ABI v2, which centralizes state management into a single struct luo_ser header. Previously, LUO state was spread across multiple FDT properties and subnodes. ABI v2 simplifies this by placing all core state, including the liveupdate number and physical addresses for sessions and FLB headers into a centralized struct luo_ser. Note that this change introduces a semantic difference: the sessions and FLB serialization formats are no longer completely independent of the core LUO. Their metadata (such as physical addresses for sessions and FLB headers) is now coupled to and managed via the centralized struct luo_ser. Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Link: https://patch.msgid.link/20260603154402.468928-4-pasha.tatashin@soleen.com Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2026-06-03 | mm: simplify the mempool_alloc_bulk API | Christoph Hellwig
The mempool_alloc_bulk API was modelled after the alloc_pages_bulk API, including some misunderstanding of it. Remove checking for NULL slots in the array, as alloc_pages_bulk and kmem_cache_alloc_bulk always fill the array from the beginning and thus we know the offset of the first failing allocation. This removes support for working well with alloc_pages_bulk used to refill page arrays that might have an entry removed from the middle, but that is only used by sunrpc and is hopefully on its way out. Also remove the allocated parameter, as it is redundant: the caller can simply specify an offset into the entries array. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20260602160038.3976341-1-hch@lst.de Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
2026-06-03 | mm/slab: improve kmem_cache_alloc_bulk | Christoph Hellwig
The kmem_cache_alloc_bulk return value is weird. It returns the number of allocated objects, which must always be 0 or the requested number based on the implementations and the handling in the callers, but that assumption is not actually documented anywhere, which confuses automated review tools. Fix this by returning a bool that indicates whether the allocation succeeded and adding a kerneldoc comment explaining the API. [rob.clark@oss.qualcomm.com: fixups in msm_iommu_pagetable_prealloc_allocate() ] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> # skbuff Link: https://patch.msgid.link/20260528093437.2519248-2-hch@lst.de Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
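The all-or-nothing contract can be sketched in user space with malloc(). The names `demo_alloc_bulk()`, `demo_free_bulk()`, and the `demo_fail_at` failure-injection hook are hypothetical and exist only to make the behaviour observable; this is not the slab implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static size_t demo_fail_at = SIZE_MAX;	/* test hook: index of the allocation to fail */
static size_t demo_live;		/* number of objects currently allocated */

static void *demo_obj_alloc(size_t size, size_t index)
{
	if (index == demo_fail_at)
		return NULL;
	void *p = malloc(size);
	if (p)
		demo_live++;
	return p;
}

static void demo_obj_free(void *p)
{
	free(p);
	demo_live--;
}

/* Fill objs[0..nr-1] or nothing at all: on failure every object allocated so
 * far is released and its slot reset, so the caller only checks the bool. */
static bool demo_alloc_bulk(size_t obj_size, size_t nr, void **objs)
{
	for (size_t i = 0; i < nr; i++) {
		objs[i] = demo_obj_alloc(obj_size, i);
		if (!objs[i]) {
			while (i--) {
				demo_obj_free(objs[i]);
				objs[i] = NULL;
			}
			return false;
		}
	}
	return true;
}

static void demo_free_bulk(size_t nr, void **objs)
{
	for (size_t i = 0; i < nr; i++) {
		demo_obj_free(objs[i]);
		objs[i] = NULL;
	}
}
```

A boolean result makes the "0 or everything" assumption part of the signature, so a caller cannot be left wondering how to handle a partial count.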
2026-06-03 | Bluetooth: L2CAP: reject BR/EDR signaling packets over MTUsig | Michael Bommarito
net/bluetooth/l2cap_core.c:l2cap_sig_channel() accepts BR/EDR signaling packets up to the channel MTU and dispatches each command without enforcing the signaling MTU (MTUsig). A Bluetooth BR/EDR peer within radio range can send a fixed-channel CID 0x0001 packet that is larger than MTUsig and contains many L2CAP_ECHO_REQ commands before pairing. In a real-radio stock-kernel run, one 681-byte signaling packet containing 168 zero-length ECHO_REQ commands made the target transmit 168 ECHO_RSP frames over about 220 ms. Impact: a Bluetooth BR/EDR peer within radio range, before pairing, can force 168 ECHO_RSP frames from one 681-byte fixed-channel signaling packet containing packed ECHO_REQ commands. Define Linux's BR/EDR signaling MTU as the spec minimum of 48 bytes and reject any larger signaling packet with one L2CAP_COMMAND_REJECT_RSP carrying L2CAP_REJ_MTU_EXCEEDED before any command is dispatched. The Bluetooth Core spec wording for MTUExceeded says the reject identifier shall match the first request command in the packet, and that packets containing only responses shall be silently discarded. Linux intentionally deviates from that prescription: silently discarding desynchronizes the peer because the remote stack never learns its responses were dropped, and locating the first request command requires walking command headers past MTUsig, i.e. processing bytes from a packet we have already decided is too large to process. We therefore always emit one reject and use the identifier from the first command header, a single fixed-offset byte read. The unrestricted BR/EDR signaling parser and ECHO_REQ response path both trace to the initial git import; no later introducing commit is available for a Fixes tag. 
Cc: stable@vger.kernel.org Suggested-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Link: https://lore.kernel.org/r/20260518002800.1361430-1-michael.bommarito@gmail.com Link: https://lore.kernel.org/r/20260520135034.1060859-1-michael.bommarito@gmail.com Link: https://lore.kernel.org/r/20260521000555.3712030-1-michael.bommarito@gmail.com Assisted-by: Claude:claude-opus-4-7 Assisted-by: Codex:gpt-5-5-xhigh Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
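The size check and fixed-offset identifier read described in this commit can be sketched as below. This is a user-space illustration, not the kernel code: `SIG_MTU`, `struct sig_verdict` and `check_sig_packet()` are hypothetical names. The only facts assumed are the 48-byte limit from the commit message and the standard L2CAP command header layout (code, identifier, 16-bit length), which puts the identifier at byte offset 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIG_MTU 48	/* spec-minimum signaling MTU, per the commit message */

/* Hypothetical result type: whether to reject, and with which identifier. */
struct sig_verdict {
	bool reject;
	uint8_t ident;
};

/*
 * Decide whether a signaling packet must be rejected as a whole.
 * No command is parsed: for an oversized packet the only byte read is
 * the identifier of the first command header (offset 1).
 */
static struct sig_verdict check_sig_packet(const uint8_t *data, size_t len)
{
	struct sig_verdict v = { .reject = false, .ident = 0 };

	if (len <= SIG_MTU)
		return v;	/* within MTUsig: dispatch commands as usual */

	v.reject = true;
	v.ident = data[1];	/* len > 48 guarantees this byte exists */
	return v;
}
```

A caller would send a single command reject carrying `v.ident` and the MTU-exceeded reason when `v.reject` is set, and drop the rest of the packet unparsed.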
2026-06-03ext4: fast commit: add lock_updates tracepointLi Chen
Commit-time fast commit snapshots run under jbd2_journal_lock_updates(), so it is useful to quantify the time spent with updates locked and to understand why snapshotting can fail. Add a new tracepoint, ext4_fc_lock_updates, reporting the time spent in the updates-locked window along with the number of snapshotted inodes and ranges. Record the first snapshot failure reason in a stable snap_err field for tooling. Signed-off-by: Li Chen <chenl311@chinatelecom.cn> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://patch.msgid.link/20260515091829.194810-7-me@linux.beauty Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2026-06-03Merge tag 'tegra-for-7.2-arm-dt' of ↵Linus Walleij
git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into soc/dt

ARM: tegra: Device tree changes for v7.2-rc1

The bulk of this is various improvements for some of the older ASUS and LG devices, but there's also support for interconnects on Tegra114 to help improve memory frequency scaling.

* tag 'tegra-for-7.2-arm-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  ARM: tegra: tf600t: Invert accelerometer calibration matrix
  ARM: tegra: tf600t: Drop backlight regulator
  ARM: tegra: tf600t: Configure panel
  ARM: tegra: transformers: Add connector node for common trees
  ARM: tegra: transformer: Add support for front camera
  ARM: tegra: grouper: Add support for front camera
  ARM: tegra: p880: Lower CPU thermal limit
  ARM: tegra: lg-x3: Set PMIC's RTC address
  ARM: tegra: lg-x3: Complete video device graph
  ARM: tegra: Configure Tegra114 power domains
  ARM: tegra: Add DC interconnections for Tegra114
  ARM: tegra: Add EMC OPP and ICC properties to Tegra114 EMC and ACTMON device-tree nodes
  ARM: tegra: Add #{address,size}-cells to Chromium-based /firmware
  dt-bindings: memory: Document Tegra114 External Memory Controller
  dt-bindings: memory: Document Tegra114 Memory Controller

Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-06-03wifi: mac80211: AP: handle DBE for clientsJohannes Berg
In AP mode, track the BSS non-DBE bandwidth and apply that to all non-DBE clients, then track OMP updates from the clients and enable/disable DBE accordingly. For now don't send a response, clients need to have a timer anyway (it's up to the driver to set the right timeout in UHR capabilities.) Link: https://patch.msgid.link/20260529102644.be84f2b055cc.I4d2c067dfe54c47621d5a872ca07a0e754d6c20f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: mac80211: parse and apply UHR DBE channelJohannes Berg
When a UHR AP has DBE enabled, parse the channel and apply it to the chandef. Apply for TX only after the OMP response (or timeout) so that the AP doesn't receive frames with DBE width before the station completed transition to DBE. Link: https://patch.msgid.link/20260529102644.cb810f212128.Ife37c2673251346e84e4250b242b31f0895520ab@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: Update UHR MAC capabilities to D1.4Johannes Berg
There are now 8 more reserved bits in D1.4, update the code accordingly. Link: https://patch.msgid.link/20260529102644.6e27c54cfceb.Id395c07ffde286011494fc75190dc6060117436e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: Update UHR PHY capabilities to D1.4Johannes Berg
There are new capabilities in D1.4, and some reserved bits. Update the code accordingly. Link: https://patch.msgid.link/20260529102644.f146932b21e2.I12bad84157bf809fbe285b79420143b3c456d9d2@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: ieee80211: define some UHR link reconfiguration frame typesJohannes Berg
Define some values needed for UHR link reconfiguration frames, in particular to prepare for UHR mode change request/handling. Link: https://patch.msgid.link/20260529102644.03029bae6447.If22b0c1e10d9db712dca408a420469b3d385b4ea@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: mac80211: basic S1G rx rate reporting supportLachlan Hodges
Introduce basic rate encoding/decoding for S1G STAs so that the user-mode RX rate reporting is meaningful; it currently uses VHT calculations, which are obviously wildly different from S1G.

Sample iw output (with the associated iw patches applied):

Connected to 0c:bf:74:00:21:c4 (on wlan0)
    SSID: wifi_halow
    freq: 923.500
    RX: 7325230 bytes (4756 packets)
    TX: 190044 bytes (2238 packets)
    signal: -38 dBm
    rx bitrate: 43.3 MBit/s S1G-MCS 9 8MHz short GI S1G-NSS 1
    tx bitrate: 43.3 MBit/s S1G-MCS 9 8MHz short GI S1G-NSS 1
    bss flags:
    dtim period: 1
    beacon int: 100

Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20260602062224.1792985-1-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: mac80211: Fix PERR frame processingMasashi Honma
There are no issues with the PERR processing itself; however, to maintain consistency with the previous PREQ/PREP code modifications, I will create a new mesh_path_parse_error_frame() function to separately implement the frame format validation and the "not supported" check. Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Link: https://patch.msgid.link/20260529230952.124754-6-masashi.honma@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: mac80211: Fix overread in PREP frame processingMasashi Honma
When the AF flag is enabled, hwmp_prep_frame_process() overreads orig_addr by 2 bytes. Since this occurs within the socket buffer, it does not read across memory boundaries and therefore poses no security risk; however, we will fix it as a precaution. In this fix, a new function mesh_path_parse_reply_frame() is established to separate the implementation of frame format validation and the check for unsupported features. This is intended to facilitate future work when implementing the currently unsupported parts. Assisted-by: Claude:Sonnet 4.6 Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Link: https://patch.msgid.link/20260529230952.124754-5-masashi.honma@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: mac80211: Fix overread in PREQ frame processingMasashi Honma
When the AF flag is enabled, hwmp_preq_frame_process() overreads target_addr by 2 bytes. Since this occurs within the socket buffer, it does not read across memory boundaries and therefore poses no security risk; however, we will fix it as a precaution. In this fix, a new function mesh_path_parse_request_frame() is established to separate the implementation of frame format validation and the check for unsupported features. This is intended to facilitate future work when implementing the currently unsupported parts. Assisted-by: Claude:Sonnet 4.6 Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Link: https://patch.msgid.link/20260529230952.124754-4-masashi.honma@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: mac80211: Use struct instead of macro for PERR frameMasashi Honma
The existing PERR_IE_* macros access HWMP PERR frame fields via hardcoded byte offsets. Each PERR destination entry contains an optional 6-byte AE (Address Extension) address followed by a reason code, making offset-based access error-prone. Introduce typed packed C structs to represent the PERR frame layout: - ieee80211_mesh_hwmp_perr: top-level frame containing TTL and destination count - ieee80211_mesh_hwmp_perr_dst: per-destination entry with optional AE address and variable-position reason code Add ieee80211_mesh_hwmp_perr_get_rcode() to locate the reason code in each destination entry depending on whether the AE flag is set. This refactoring makes the PERR processing code consistent with the struct-based approach adopted for PREQ and PREP in preceding patches. Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Link: https://patch.msgid.link/20260529230952.124754-3-masashi.honma@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: mac80211: Use struct instead of macro for PREP frameMasashi Honma
The existing PREP_IE_* macros access HWMP PREP frame fields via hardcoded byte offsets. When the AE (Address Extension) flag is set, an additional 6 bytes appear mid-frame, making the offset arithmetic error-prone. Introduce typed packed C structs to represent the PREP frame layout: - ieee80211_mesh_hwmp_prep_top: fixed fields before the optional AE address - ieee80211_mesh_hwmp_prep_bottom: fields after the optional AE address Add ieee80211_mesh_hwmp_prep_get_bottom() to locate the bottom struct correctly based on whether the AE flag is set. This preparatory refactoring is needed to fix a 2-byte overread of orig_addr in hwmp_prep_frame_process() when AE is enabled, which is addressed in a subsequent patch. Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Link: https://patch.msgid.link/20260529230952.124754-2-masashi.honma@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03wifi: mac80211: Use struct instead of macro for PREQ frameMasashi Honma
The existing PREQ_IE_* macros access HWMP PREQ frame fields via hardcoded byte offsets. When the AE (Address Extension) flag is set, an additional 6 bytes appear mid-frame, and the macros handle this with conditional arithmetic (e.g., AE_F_SET(x) ? x + N+6 : x + N). This approach obscures the frame layout and is prone to miscalculation. Introduce typed packed C structs to represent the PREQ frame layout: - ieee80211_mesh_hwmp_preq_top: fixed fields before the optional AE address - ieee80211_mesh_hwmp_preq_bottom: fields after the optional AE address - ieee80211_mesh_hwmp_preq_target: per-target fields Add ieee80211_mesh_hwmp_preq_get_bottom() to locate the bottom struct correctly based on whether the AE flag is set. This preparatory refactoring is needed to fix a 2-byte overread of target_addr in hwmp_preq_frame_process() when AE is enabled, which is addressed in a subsequent patch. Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Link: https://patch.msgid.link/20260529230952.124754-1-masashi.honma@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
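The top/bottom split around the optional AE address can be sketched as follows. This is a hedged illustration, not the kernel's definitions: the struct and field names, `PREQ_AE_FLAG` and the length-checked signature of `preq_get_bottom()` are assumptions made for the example. The field order follows the HWMP PREQ element layout, with the AE flag taken to be bit 6 of the flags byte and multi-byte fields left in wire (little-endian) order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN	6
#define PREQ_AE_FLAG	(1u << 6)	/* assumed: AE bit in the flags byte */

/* Fixed fields before the optional AE address (17 bytes on the wire). */
struct preq_top {
	uint8_t flags;
	uint8_t hop_count;
	uint8_t ttl;
	uint32_t preq_id;		/* little-endian on the wire */
	uint8_t orig_addr[ETH_ALEN];
	uint32_t orig_sn;		/* little-endian on the wire */
} __attribute__((packed));

/* Fields after the optional AE address (9 bytes on the wire). */
struct preq_bottom {
	uint32_t lifetime;		/* little-endian on the wire */
	uint32_t metric;		/* little-endian on the wire */
	uint8_t target_count;
} __attribute__((packed));

/*
 * Locate the bottom struct, skipping the 6-byte AE address when the AE
 * flag is set. Returns NULL if the buffer is too short, so the caller
 * never reads past the end of the element.
 */
static const struct preq_bottom *preq_get_bottom(const uint8_t *ie, size_t len)
{
	const struct preq_top *top = (const struct preq_top *)ie;
	size_t off = sizeof(*top);

	if (len < sizeof(*top))
		return NULL;
	if (top->flags & PREQ_AE_FLAG)
		off += ETH_ALEN;
	if (len < off + sizeof(struct preq_bottom))
		return NULL;
	return (const struct preq_bottom *)(ie + off);
}
```

Compared with offset macros of the form `AE_F_SET(x) ? x + N+6 : x + N`, the conditional offset is computed once, in one place, and every later field access goes through a named member.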
2026-06-03wifi: cfg80211: remove 5/10 MHz channel supportJohannes Berg
Remove WIPHY_FLAG_SUPPORTS_5_10_MHZ and 5/10 MHz channel width support. We contemplated this back in early 2023 and didn't do it yet, but nobody stepped up to maintain it. It's already _mostly_ dead code since it can really only be used for AP and maybe IBSS and monitor, but not on a client since there's no way to scan (and hasn't been in a very long time, if ever), so the only thing that ever could really happen with it was run syzbot and trip over assumptions in the code. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20260529084502.080c5885f0b7.I77cc94485b523c3c006005b9233db13cd4e077b3@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-06-03rv: Prevent task migration while handling per-CPU eventsGabriele Monaco
Tracepoint handlers are fully preemptible after a46023d5616 ("tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast"). When a per-CPU monitor handles an event, it retrieves the monitor state using a per-CPU pointer. If the event itself doesn't disable preemption, the task can migrate to a different CPU and we risk updating the wrong monitor. Mitigate this by explicitly disabling task migration before acquiring the monitor pointer. This cannot guarantee the monitor runs on the correct CPU but reduces the race condition window and prevents warnings. Reviewed-by: Wen Yang <wen.yang@linux.dev> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260601153840.124372-10-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-06-03rv: Ensure synchronous cleanup for HA monitorsGabriele Monaco
HA monitors may start timers, all cleanup functions currently stop the timers asynchronously to avoid sleeping in the wrong context. Nothing makes sure running callbacks terminate on cleanup. Run the entire HA timer callback in an RCU read-side critical section, this way we can simply synchronize_rcu() with any pending timer and are sure any cleanup using kfree_rcu() runs after callbacks terminated. Additionally make sure any unlikely callback running late won't run any code if the monitor is marked as disabled or if destruction started. Use memory barriers to serialise with racing resets. Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type") Fixes: 4a24127bd6cb ("rv: Add support for per-object monitors in DA/HA") Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260601153840.124372-9-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-06-03rv: Add automatic cleanup handlers for per-task HA monitorsGabriele Monaco
Hybrid automata monitors may start timers, depending on the model, these may remain active on an exiting task and cause false positives or even access freed memory. Add an enable/disable hook in the HA code, currently only populated by the per-task handler for registration and deregistration. This hooks to the sched_process_exit event and ensures the timer is stopped for every exiting task. The handler is enabled automatically but may be disabled, for instance if the monitor uses the event for another purpose (but should still manually ensure timers are stopped). Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type") Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260601153840.124372-8-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-06-03rv: Do not rely on clean monitor when initialising HAGabriele Monaco
Hybrid Automata monitors hook into the DA implementation when doing da_monitor_reset(). This function is called both on initialisation and teardown; HA monitors try to cancel a timer only when it's initialised, relying on the da_mon->monitoring flag. This flag could however be corrupted during initialisation. This happens for instance on per-task monitors that share the same storage with different types of monitors like LTL, or in case of races during a previous teardown. Stop relying on the monitoring flag during initialisation: assume it can have any value and use a separate da_reset_state() that skips timer cancellation. New monitors (e.g. new tasks) are always zero-initialised so it is safe to rely on the monitoring flag for those. Reported-by: Wen Yang <wen.yang@linux.dev> Closes: https://lore.kernel.org/lkml/d02c656aada7d071f083460a5c9a454363669b61.1778522945.git.wen.yang@linux.dev Suggested-by: Nam Cao <namcao@linutronix.de> Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type") Reviewed-by: Wen Yang <wen.yang@linux.dev> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260601153840.124372-7-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-06-03rv: Fix monitor start ordering and memory ordering for monitoring flagWen Yang
da_monitor_start() set monitoring=1 before calling da_monitor_init_hook(), which may race with the sched_switch handler:

    da_monitor_start()                 sched_switch handler
    -------------------------          ---------------------------------
    da_mon->monitoring = 1;
                                       if (da_monitoring(da_mon)) /* true */
                                         ha_start_timer_ns(...);
                                         /* hrtimer->base == NULL, crash */
    da_monitor_init_hook(da_mon);
    /* hrtimer_setup() sets base */

Fix the ordering and pair with release/acquire semantics:

    da_monitor_init_hook(da_mon);
    smp_store_release(&da_mon->monitoring, 1);      /* da_monitor_start() */

    return smp_load_acquire(&da_mon->monitoring);   /* da_monitoring() */

On ARM64 a plain STR + LDR does not form a release-acquire pair, so the load can observe monitoring=1 while hrtimer->base is still NULL. The plain accesses are also data races under KCSAN. Use WRITE_ONCE for the monitoring=0 store in da_monitor_reset() to cover the reset path.

Fixes: 792575348ff7 ("rv/include: Add deterministic automata monitor definition via C macros") Signed-off-by: Wen Yang <wen.yang@linux.dev> Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260601153840.124372-6-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
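The init-then-publish pattern from this fix can be sketched in portable C11. This is a user-space analogue under stated assumptions: `struct monitor`, `monitor_start()`, `monitor_active()` and `handle_event()` are hypothetical names, and C11 `memory_order_release`/`memory_order_acquire` stand in for the kernel's smp_store_release()/smp_load_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical monitor: timer_base stands in for hrtimer->base. */
struct monitor {
	void *timer_base;
	atomic_int monitoring;
};

static int dummy_base;	/* placeholder object for the timer base to point at */

/* Initialise everything first, then publish the flag with release. */
static void monitor_start(struct monitor *m)
{
	m->timer_base = &dummy_base;	/* analogue of the init hook */
	atomic_store_explicit(&m->monitoring, 1, memory_order_release);
}

/* Acquire load: if it returns true, the initialisation above is visible. */
static bool monitor_active(struct monitor *m)
{
	return atomic_load_explicit(&m->monitoring, memory_order_acquire) != 0;
}

/*
 * Event handler analogue: only touches timer_base after observing the
 * flag. Returns true if the event was handled with a valid base.
 */
static bool handle_event(struct monitor *m)
{
	if (!monitor_active(m))
		return false;
	return m->timer_base != NULL;
}
```

The release store orders the `timer_base` write before the flag becomes visible, and the acquire load orders the flag check before the `timer_base` read, so a handler that sees the flag set cannot see a NULL base.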
2026-06-03rv: Ensure all pending probes terminate on per-obj monitor destroyGabriele Monaco
The monitor disable/destroy sequence detaches all probes and resets the monitor's data, however it doesn't wait for pending probes. This is an issue with per-object monitors, which free the monitor storage. Call tracepoint_synchronize_unregister() to make sure to wait for all pending probes before destroying the monitor storage. Fixes: 4a24127bd6cb ("rv: Add support for per-object monitors in DA/HA") Reviewed-by: Wen Yang <wen.yang@linux.dev> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260601153840.124372-5-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-06-03rv: Prevent in-flight per-task handlers from using invalid slotsGabriele Monaco
Per-task monitors use a slot in the task_struct->rv[] array and store that locally (e.g. task_mon_slot). This slot is returned during the destruction process, but currently handlers can be running while that slot is being returned, and this race may lead to accessing an invalid slot. Synchronise with all in-flight tracepoint handlers using tracepoint_synchronize_unregister() before returning the slot. Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type") Fixes: a9769a5b9878 ("rv: Add support for LTL monitors") Suggested-by: Wen Yang <wen.yang@linux.dev> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260601153840.124372-4-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>