Age | Commit message | Author
8 days | net: mana: Skip redundant detach on already-detached port | Dipayaan Roy
When mana_per_port_queue_reset_work_handler() runs after a previous detach succeeded but attach failed, the port is left in a detached state with apc->tx_qp and apc->rxqs already freed. Calling mana_detach() again unconditionally leads to NULL pointer dereferences during queue teardown. Add an early exit in mana_detach() when the port is already in detached state (!netif_device_present) for non-close callers, making it safe to call idempotently. This allows the queue reset handler and other recovery paths to simply retry mana_attach() without redundant teardown. Fixes: 3b194343c250 ("net: mana: Implement ndo_tx_timeout and serialize queue resets per port.") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com> Link: https://patch.msgid.link/20260525081129.1230035-3-dipayanroy@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
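The early-exit idea can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct port_ctx`, `port_detach()` and the `teardown_runs` counter are hypothetical stand-ins, not the driver's real types or functions, and `present` merely models the result of `netif_device_present()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-port context; not the real driver struct. */
struct port_ctx {
    bool present;      /* models netif_device_present() */
    int *tx_qp;        /* models apc->tx_qp */
    int *rxqs;         /* models apc->rxqs */
    int teardown_runs; /* counts real teardowns, for illustration only */
};

/* Detach the port. Non-close callers may call this repeatedly. */
int port_detach(struct port_ctx *pc, bool from_close)
{
    assert(pc != NULL);

    /* Early exit: a previous detach already freed the queues. */
    if (!from_close && !pc->present)
        return 0;

    pc->present = false;
    free(pc->tx_qp);   /* free(NULL) is a no-op, so this stays safe */
    pc->tx_qp = NULL;
    free(pc->rxqs);
    pc->rxqs = NULL;
    pc->teardown_runs++;
    return 0;
}
```

With this shape, a recovery path in the model can call `port_detach()` and then retry the attach step without tracking whether an earlier detach already ran.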
8 days | net: mana: Add NULL guards in teardown path to prevent panic on attach failure | Dipayaan Roy
When queue allocation fails partway through, the error cleanup frees and NULLs apc->tx_qp and apc->rxqs. Multiple teardown paths such as mana_remove(), mana_change_mtu() recovery, and internal error handling in mana_alloc_queues() can subsequently call into functions that dereference these pointers without NULL checks:
- mana_chn_setxdp() dereferences apc->rxqs[0], causing a NULL pointer dereference panic (CR2: 0000000000000000 at mana_chn_setxdp+0x26).
- mana_destroy_vport() iterates apc->rxqs without a NULL check.
- mana_fence_rqs() iterates apc->rxqs without a NULL check.
- mana_dealloc_queues() iterates apc->tx_qp without a NULL check.
Add NULL guards for apc->rxqs in mana_fence_rqs(), mana_destroy_vport(), and before the mana_chn_setxdp() call. Add a NULL guard for apc->tx_qp in mana_dealloc_queues() to skip TX queue draining when TX queues were never allocated or already freed.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Link: https://patch.msgid.link/20260525081129.1230035-2-dipayanroy@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 days | Merge tag 'imx-soc-fixes-for-v7.1' of ↵ | Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux into arm/fixes i.MX SoC fixes for v7.1 Fix CAAM driver probe failures caused by missing SoC information by retrieving the match data directly through of_machine_get_match_data(), which provides the correct SoC-specific data. * tag 'imx-soc-fixes-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux: soc: imx8m: Fix match data lookup for soc device Signed-off-by: Arnd Bergmann <arnd@arndb.de>
8 days | ARM: dts: gemini: Fix partition offsets | Linus Walleij
These FIS partition offsets were never right: the comment clearly states the FIS index is at 0xfe0000 and 0x7f * 0x20000 is 0xfe0000. Tested on the iTian SQ201. Fixes: d88b11ef91b1 ("ARM: dts: Fix up SQ201 flash access") Fixes: b5a923f8c739 ("ARM: dts: gemini: Switch to redboot partition parsing") Signed-off-by: Linus Walleij <linusw@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
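The offset arithmetic is easy to check by hand: block index 0x7f with a block size of 0x20000 bytes lands at 0xfe0000, whereas a block size of 0x200000 would give 0xfe00000. A minimal sketch follows; the 128 KiB block size and the helper name `block_offset()` are assumptions made for illustration and are not taken from the device tree.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed erase block size of 128 KiB (0x20000); an illustrative value only. */
#define ERASE_BLOCK_SIZE 0x20000u

/* Byte offset of the erase block with the given index. */
uint32_t block_offset(uint32_t block_index)
{
    return block_index * ERASE_BLOCK_SIZE;
}
```

Calling `block_offset(0x7f)` yields the FIS index address quoted in the comment, 0xfe0000.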
8 days | Merge tag 'qcom-arm64-defconfig-fixes-for-7.1' of ↵ | Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes Qualcomm Arm64 defconfig fixes for v7.1 A number of targets now depend on the M.2 PCIe power sequencing driver; enable it to keep these devices functional with a defconfig build. * tag 'qcom-arm64-defconfig-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: defconfig: Enable PCI M.2 power sequencing driver Signed-off-by: Arnd Bergmann <arnd@arndb.de>
8 days | Merge tag 'qcom-arm64-fixes-for-7.1' of ↵ | Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes Qualcomm Arm64 DeviceTree fixes for v7.1 Add missing power-domain and iface clocks for the ICE node of Eliza and Milos to avoid the validation errors that resulted from late binding changes. Also drop the reference clock for the USB QMP PHYs, for the same reason. Avoid touching the 20'th I2C bus on the Hamoa-based (X Elite) Dell laptops, as this conflicts with the battery management firmware. * tag 'qcom-arm64-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: dts: qcom: eliza: Add power-domain and iface clk for ice node arm64: dts: qcom: milos: Add power-domain and iface clk for ice node arm64: dts: qcom: x1-dell-thena: remove i2c20 (battery SMBus) and reserve its pins arm64: dts: qcom: glymur: Drop RPMh CXO clocks from QMP PHYs Signed-off-by: Arnd Bergmann <arnd@arndb.de>
8 days | Merge tag 'qcom-drivers-fixes-for-7.1' of ↵ | Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes Qualcomm driver fixes for v7.1 The Qualcomm ICE driver suffers from race conditions between probe() and get() and will in certain cases return the wrong error code, which results in storage drivers failing to probe. Fix these issues. Also correct the DeviceTree binding, to ensure that relevant clocks are described and voted for, to prevent the driver from accessing unclocked hardware during boot. * tag 'qcom-drivers-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: soc: qcom: ice: Fix the error code when 'qcom,ice' property is not found scsi: ufs: ufs-qcom: Remove NULL check from devm_of_qcom_ice_get() mmc: sdhci-msm: Remove NULL check from devm_of_qcom_ice_get() soc: qcom: ice: Return proper error codes from devm_of_qcom_ice_get() instead of NULL soc: qcom: ice: Return -ENODEV if the ICE platform device is not found soc: qcom: ice: Fix race between qcom_ice_probe() and of_qcom_ice_get() soc: qcom: ice: Allow explicit votes on 'iface' clock for ICE dt-bindings: crypto: qcom,ice: Fix missing power-domain and iface clk Signed-off-by: Arnd Bergmann <arnd@arndb.de>
8 days | Merge tag 'acpi-7.1-rc6' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI support fixes from Rafael Wysocki: "Fix three issues in the ACPI button driver: a possible crash due to a button press after unloading the driver (introduced during the 6.15 development cycle), function keys breakage on Toshiba Tecra X40 due to missing ACPI events (introduced during the 7.0 development cycle), and a probe rollback path item that was omitted by mistake during a recent update" * tag 'acpi-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: button: Add missing device class clearing on probe failures ACPI: button: Enable wakeup GPEs for ACPI buttons at probe time ACPI: button: Fix ACPI GPE handler leak during removal
8 days | Merge tag 'pm-7.1-rc6' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Fix a possible amd-pstate-ut cpufreq driver crash introduced by a recent update (K Prateek Nayak)" * tag 'pm-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq/amd-pstate-ut: Disable dynamic_epp after the mode switch
8 days | Merge tag 'net-7.1-rc6' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"This is again significantly bigger than the same point into the previous cycle, but at least smaller than last week. I'm not aware of any pending regression for the current cycle.
Including fixes from netfilter.
Current release - regressions:
- netfilter: walk fib6_siblings under RCU
Previous releases - regressions:
- netlink: fix sending unassigned nsid after assigned one
- bridge: fix sleep in atomic context in netlink path
- sched: fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop
- ipv4: fix net->ipv4.sysctl_local_reserved_ports UaF
- eth: tun: free page on short-frame rejection in tun_xdp_one()
Previous releases - always broken:
- skbuff: fix missing zerocopy reference in pskb_carve helpers
- handshake: drain pending requests at net namespace exit
- ethtool:
  - rss: avoid modifying the RSS context response
  - module: avoid leaking a netdev ref on module flash errors
  - coalesce: cap profile updates at NET_DIM_PARAMS_NUM_PROFILES
- netfilter: fix dst corruption in same register operation
- nfc: hci: fix out-of-bounds read in HCP header parsing
- ipv6: exthdrs: refresh nh pointer after ipv6_hop_jumbo()
- eth:
  - vti: use ip6_tnl.net in vti6_changelink().
  - vxlan: do not reuse cached ip_hdr() value after skb_tunnel_check_pmtu()"
* tag 'net-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
  dpll: zl3073x: make frequency monitor a per-device attribute
  dpll: zl3073x: use __dpll_device_change_ntf() and remove change_work
  dpll: export __dpll_device_change_ntf() for use under dpll_lock
  net/handshake: Drain pending requests at net namespace exit
  net/handshake: Verify file-reference balance in submit paths
  net/handshake: Close the submit-side sock_hold race
  net/handshake: hand off the pinned file reference to accept_doit
  net/handshake: Take a long-lived file reference at submit
  net/handshake: Pass negative errno through handshake_complete()
  nvme-tcp: store negative errno in queue->tls_err
  net/handshake: Use spin_lock_bh for hn_lock
  net: skbuff: fix missing zerocopy reference in pskb_carve helpers
  net: hibmcge: move dma_rmb() after dma_sync_single_for_cpu() in RX path
  net: hibmcge: disable Relaxed Ordering to fix RX packet corruption
  selftests/tc-testing: Add netem test case exercising loops
  selftests/tc-testing: Add mirred test cases exercising loops
  net/sched: act_mirred: Fix return code in early mirred redirect error paths
  net/sched: act_mirred: Fix blockcast recursion bypass leading to stack overflow
  net/sched: Fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop
  net/sched: fix packet loop on netem when duplicate is on
  ...
8 days | Merge tag 'gpio-fixes-for-v7.1-rc6' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix interrupt handling in gpio-mxc
- fix scoped_guard() usage in gpio-adnp
- don't accept partial writes in gpio-virtuser debugfs interface as they can't really work correctly
- fix resource leaks in gpio-rockchip
- fix locking issues in remove path in shared GPIO management
- undo the vote of a GPIO shared proxy virtual device on GPIO release
* tag 'gpio-fixes-for-v7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: rockchip: teardown bugs and resource leaks
  gpio: rockchip: convert bank->clk to devm_clk_get_enabled()
  gpio: virtuser: Fix uninitialized data bug in gpio_virtuser_direction_do_write()
  gpio: shared: fix lockdep false positive by removing unneeded lock
  gpio: shared: fix deadlock on shared proxy's parent removal
  gpio: adnp: fix flow control regression caused by scoped_guard()
  gpio: shared: undo the vote of the proxy on GPIO free
  gpio: mxc: fix irq_high handling
8 days | security/keys: fix missed RCU read section on lookup | Linus Torvalds
Nicholas Carlini reports that the keyring code calls assoc_array_find() in find_key_to_update() without holding the RCU read lock, while the assoc_array_gc() code really is designed around removing the node from the tree and then freeing it after an RCU grace-period. The regular key handling doesn't see this because holding the keyring semaphore hides any lifetime issues, but the persistent key handling uses a different model. Instead of extending the keyring locking, just do the simple RCU locking that the assoc_array was designed for. Reported-by: Nicholas Carlini <npc@anthropic.com> Cc: David Howells <dhowells@redhat.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Paul Moore <paul@paul-moore.com> Cc: James Morris <jmorris@namei.org> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 days | HID: wacom: Fix OOB write in wacom_hid_set_device_mode() | Lee Jones
wacom_hid_set_device_mode() currently assumes that the HID_DG_INPUTMODE usage is always located in the first field (field[0]) of the feature report. However, a device can specify HID_DG_INPUTMODE in a different field. If HID_DG_INPUTMODE is in a field other than the first one and the first field has a report_count smaller than the usage_index of HID_DG_INPUTMODE, this leads to an out-of-bounds write to r->field[0]->value. Fix this by storing the field index of HID_DG_INPUTMODE in 'struct hid_data' during feature mapping. In wacom_hid_set_device_mode(), use this stored field index to access the correct field and add bounds checks to ensure both the field index and the value index are within valid ranges before writing. Cc: stable@vger.kernel.org Fixes: 5ae6e89f7409 ("HID: wacom: implement the finger part of the HID generic handling") Tested-by: Ping Cheng <ping.cheng@wacom.com> Reviewed-by: Ping Cheng <ping.cheng@wacom.com> Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
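The two bounds checks described above can be sketched in plain C. The structures below are hypothetical, simplified stand-ins for a HID report and its fields, and `set_input_mode()` is an illustrative name rather than the driver's function; the sketch only shows the order of the checks before the write.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for HID report structures. */
struct fake_field {
    unsigned int report_count; /* number of entries in value[] */
    int *value;
};

struct fake_report {
    unsigned int maxfield;     /* number of entries in field[] */
    struct fake_field **field;
};

/* Write mode into field[field_idx]->value[value_idx] after validating both. */
int set_input_mode(struct fake_report *r, unsigned int field_idx,
                   unsigned int value_idx, int mode)
{
    assert(r != NULL);

    /* Check the stored field index first. */
    if (field_idx >= r->maxfield)
        return -EINVAL;

    struct fake_field *f = r->field[field_idx];

    /* Then check the value index against that field's report_count. */
    if (f == NULL || value_idx >= f->report_count)
        return -EINVAL;

    f->value[value_idx] = mode;
    return 0;
}
```

A write aimed at index 2 of a one-entry field is rejected here instead of landing past the end of the array.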
8 days | dma-buf: fix UAF in dma_buf_fd() tracepoint | David Carlier
Once FD_ADD() returns, the fd is live in the file descriptor table and a thread sharing that table can close() it before DMA_BUF_TRACE() runs. The close drops the last reference, __fput() frees the dma_buf, and the tracepoint then dereferences dmabuf to take dmabuf->name_lock -- slab-use-after-free. Split FD_ADD() back into get_unused_fd_flags() + fd_install() and emit the tracepoint between them. While the fdtable slot is reserved with a NULL file pointer, a racing close() returns -EBADF without entering __fput(), so the dma_buf stays alive across the trace. Same approach as commit 2d76319c4cbb ("dma-buf: fix UAF in dma_buf_put() tracepoint"). This undoes the FD_ADD() conversion done in commit 34dfce523c90 ("dma: convert dma_buf_fd() to FD_ADD()"); FD_ADD() has no place to hook the tracepoint safely. Reported-by: syzbot+7f4987d0afb97dd090cb@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7f4987d0afb97dd090cb Fixes: 281a22631423 ("dma-buf: add some tracepoints to debug.") Cc: stable@vger.kernel.org # 7.0.x Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patch.msgid.link/20260523181446.69525-1-devnexen@gmail.com
8 days | regmap: reject volatile update_bits() in cache-only mode | bui duc phuc
Prevent _regmap_update_bits() from accessing hardware when the register map is in cache-only mode. Unlike regmap_raw_read() and _regmap_read(), the volatile _regmap_update_bits() fast path bypasses the cache_only check. This can result in unexpected hardware accesses while the device is suspended. Return -EBUSY to ensure behavior is consistent with other cache-only access paths. Signed-off-by: bui duc phuc <phucduc.bui@gmail.com> Link: https://patch.msgid.link/20260528053204.46783-1-phucduc.bui@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
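A minimal userspace model of the guard, assuming a hypothetical `struct fake_map` and `fake_update_bits_volatile()` (neither is part of the regmap API): the read-modify-write path refuses to touch the modelled hardware register while the map is cache-only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a register map with a single volatile register. */
struct fake_map {
    bool cache_only;   /* set while the device is suspended */
    uint32_t hw_reg;   /* models the hardware register */
    int hw_accesses;   /* counts modelled hardware accesses */
};

/* Read-modify-write of a volatile register; rejected in cache-only mode. */
int fake_update_bits_volatile(struct fake_map *map, uint32_t mask, uint32_t val)
{
    assert(map != NULL);

    /* Guard: no hardware access while the map is cache-only. */
    if (map->cache_only)
        return -EBUSY;

    map->hw_reg = (map->hw_reg & ~mask) | (val & mask);
    map->hw_accesses++;
    return 0;
}
```

Callers in this model get -EBUSY during suspend and can retry once cache-only mode is cleared.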
8 days | Disable -Wattribute-alias for clang-23 and newer | Nathan Chancellor
Clang recently added support for -Wattribute-alias [1], which results in the same warnings that necessitated commit bee20031772a ("disable -Wattribute-alias warning for SYSCALL_DEFINEx()") for GCC.
  kernel/time/itimer.c:325:1: error: alias and aliasee have different types 'long (unsigned int)' and 'long (typeof (__builtin_choose_expr((__builtin_types_compatible_p(typeof ((unsigned int)0), typeof (0LL)) || __builtin_types_compatible_p(typeof ((unsigned int)0), typeof (0ULL))), 0LL, 0L)))' (aka 'long (long)') [-Werror,-Wattribute-alias]
    325 | SYSCALL_DEFINE1(alarm, unsigned int, seconds)
        | ^
  include/linux/syscalls.h:225:36: note: expanded from macro 'SYSCALL_DEFINE1'
    225 | #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
        | ^
  include/linux/syscalls.h:236:2: note: expanded from macro 'SYSCALL_DEFINEx'
    236 | __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
        | ^
  include/linux/syscalls.h:251:18: note: expanded from macro '__SYSCALL_DEFINEx'
    251 | __attribute__((alias(__stringify(__se_sys##name)))); \
        | ^
  kernel/time/itimer.c:325:1: note: aliasee is declared here
  include/linux/syscalls.h:225:36: note: expanded from macro 'SYSCALL_DEFINE1'
    225 | #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
        | ^
  include/linux/syscalls.h:236:2: note: expanded from macro 'SYSCALL_DEFINEx'
    236 | __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
        | ^
  include/linux/syscalls.h:255:18: note: expanded from macro '__SYSCALL_DEFINEx'
    255 | asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
        | ^
  <scratch space>:16:1: note: expanded from here
     16 | __se_sys_alarm
        | ^
Disable the warnings in the same way for clang-23 and newer. Disable the warning about unknown warning options to avoid breaking the build for versions of clang-23 that do not have -Wattribute-alias, such as ones deployed by vendors like Android or CI systems or when bisecting LLVM between llvmorg-23-init and release/23.x.
Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2163
Link: https://github.com/llvm/llvm-project/commit/40da6920a0d71d49dfa2392b09153600b0759f5e [1]
Link: https://patch.msgid.link/20260515-syscall-disable-attribute-alias-for-clang-v1-1-9a9d95d41df6@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
8 days | Merge tag 'qcomtee-fix-for-v7.1' of ↵ | Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee into arm/fixes QCOMTEE fix for v7.1 Adding a missing va_end in early return qcomtee_object_user_init() * tag 'qcomtee-fix-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee: tee: qcomtee: add missing va_end in early return qcomtee_object_user_init() Signed-off-by: Arnd Bergmann <arnd@arndb.de>
8 days | Merge tag 'optee-fix-for-v7.1' of ↵ | Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee into arm/fixes OP-TEE fix for v7.1 Prevent possible use after free in supplicant communication. * tag 'optee-fix-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee: tee: optee: prevent use-after-free when the client exits before the supplicant Signed-off-by: Arnd Bergmann <arnd@arndb.de>
8 days | gpio: rockchip: teardown bugs and resource leaks | Marco Scardovi
Address several teardown issues and resource leaks in the driver's remove path and error handling:
1. Debounce clock reference leak: The debounce clock (bank->db_clk) is obtained using of_clk_get() which increments the clock's reference count, but clk_put() is never called. Register a devm action to cleanly release it on unbind. Note that of_clk_get(..., 1) remains necessary over devm_clk_get() because the DT binding does not define clock-names, precluding name-based lookup.
2. Unregistered chained IRQ handler: The chained IRQ handler is not disconnected in remove(). If a stray interrupt fires after the driver is removed, the kernel attempts to execute a stale handler, leading to a panic. Fix this by clearing the handler in remove().
3. IRQ domain leak: The linear IRQ domain and its generic chips are allocated manually during probe but never removed. Remove the IRQ domain during driver teardown to free the associated generic chips and mappings.
Fixes: 936ee2675eee ("gpio/rockchip: add driver for rockchip gpio")
Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Marco Scardovi <scardracs@disroot.org>
Link: https://patch.msgid.link/20260526171050.12785-3-scardracs@disroot.org
[Bartosz: don't emit an error message on devres allocation failure]
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
8 days | gpio: rockchip: convert bank->clk to devm_clk_get_enabled() | Marco Scardovi
The bank->clk was previously obtained via of_clk_get() and manually prepared/enabled. However, it was missing a corresponding clk_put() in both the error paths and the remove function, leading to a reference leak. Convert the allocation to devm_clk_get_enabled(), which also properly propagates failures from clk_prepare_enable() that were previously ignored. The GPIO bank device uses the same OF node as the previous of_clk_get() call, so devm_clk_get_enabled(dev, NULL) correctly resolves the same clock provider entry. Fix the reference leak and simplify the code by removing the manual clk_disable_unprepare() calls in the probe error paths and in the remove function. Fixes: 936ee2675eee ("gpio/rockchip: add driver for rockchip gpio") Assisted-by: Antigravity:gemini-3.5-flash Signed-off-by: Marco Scardovi <scardracs@disroot.org> Link: https://patch.msgid.link/20260526171050.12785-2-scardracs@disroot.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
8 days | gpio: virtuser: Fix uninitialized data bug in gpio_virtuser_direction_do_write() | Dan Carpenter
If *ppos is non-zero (user-space write split over multiple calls to write()) then simple_write_to_buffer() won't initialize the start of the buffer. Really, non-zero values for *ppos aren't going to work at all. Check for that and return -EINVAL at the start of the function. Fixes: 91581c4b3f29 ("gpio: virtuser: new virtual testing driver for the GPIO API") Signed-off-by: Dan Carpenter <error27@gmail.com> Link: https://patch.msgid.link/ahP3BJWWy-m_qI0X@stanley.mountain Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
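The check can be sketched as a small userspace write handler. `direction_write()` and its parameters are hypothetical; only the rule "reject a non-zero *ppos with -EINVAL before copying" comes from the commit text, and the size check is an added assumption so the model can NUL-terminate its buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy count bytes from src into buf as one complete command.
 * Returns the number of bytes consumed, or a negative errno.
 */
long direction_write(char *buf, size_t bufsz, const char *src,
                     size_t count, long *ppos)
{
    assert(buf != NULL && src != NULL && ppos != NULL);

    /* A continued (partial) write cannot carry a complete command. */
    if (*ppos != 0)
        return -EINVAL;

    /* Assumption for this model: leave room for the terminating NUL. */
    if (count >= bufsz)
        return -EINVAL;

    memcpy(buf, src, count);
    buf[count] = '\0';
    *ppos += (long)count;
    return (long)count;
}
```

A second call with the advanced position is rejected, so the handler never parses a buffer whose start was left uninitialized.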
8 days | gpio: shared: fix lockdep false positive by removing unneeded lock | Bartosz Golaszewski
By the time gpio_device_teardown_shared() is called, the parent device is gone from the global list of GPIO devices and all outstanding SRCU read-side critical sections have completed. That means that no concurrent gpio_find_and_request() can call gpio_shared_add_proxy_lookup() for this device at this time. There's also no risk of the parent device being re-bound to the driver before the unbinding completes (including the child devices). Lockdep produces a false-positive report about a possible circular dependency as it doesn't know the ordering guarantee. Not taking the ref->lock in gpio_device_teardown_shared() silences it and is safe to do. Cc: stable@vger.kernel.org Fixes: ea513dd3c066 ("gpio: shared: make locking more fine-grained") Reviewed-by: Linus Walleij <linusw@kernel.org> Link: https://patch.msgid.link/20260522-gpio-shared-deadlock-v1-2-76bca088f8c0@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
8 days | gpio: shared: fix deadlock on shared proxy's parent removal | Bartosz Golaszewski
Commit 710abda58055 ("gpio: shared: call gpio_chip::of_xlate() if set") used the mutex embedded in struct gpio_shared_entry to protect the offset field which now can be modified after assignment. The critical section however is too wide and introduced a potential deadlock on the removal of the shared GPIO proxy's parent. Make the critical section shorter - only protect the offset when it's being read. While at it: mention the fact that the entry lock is now also used to protect against concurrent access to the offset field in the structure's documentation. Cc: stable@vger.kernel.org Fixes: 710abda58055 ("gpio: shared: call gpio_chip::of_xlate() if set") Reviewed-by: Linus Walleij <linusw@kernel.org> Link: https://patch.msgid.link/20260522-gpio-shared-deadlock-v1-1-76bca088f8c0@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
8 days | gpio: adnp: fix flow control regression caused by scoped_guard() | Bartosz Golaszewski
scoped_guard() is implemented as a for loop. Using it to protect code using the continue statement changes the flow as we now only break out of the hidden loop inside scoped_guard(), not the original for loop. Use a regular code block instead. Fixes: c7fe19ed3973 ("gpio: adnp: use lock guards for the I2C lock") Reported-by: David Lechner <dlechner@baylibre.com> Closes: https://lore.kernel.org/all/cde2abb2-4cc8-4fc9-b34a-0c5d2b95779f@baylibre.com/ Reviewed-by: Linus Walleij <linusw@kernel.org> Link: https://patch.msgid.link/20260522073527.9812-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
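The control-flow change is reproducible in userspace with a toy macro. `FAKE_SCOPED_GUARD` below is a simplified, hypothetical stand-in that only mimics the "hidden for loop that runs once" shape described above; it is not the kernel's macro, and the integer "lock" is just a counter.

```c
#include <assert.h>

/* Toy guard: a for loop whose body runs exactly once, "locking" around it. */
#define FAKE_SCOPED_GUARD(lockp) \
    for (int once_ = (++*(lockp), 1); once_; once_ = 0, --*(lockp))

/* Buggy shape: continue only leaves the hidden loop inside the guard. */
int count_processed_guarded(const int *vals, int n)
{
    int lock = 0, processed = 0;

    for (int i = 0; i < n; i++) {
        FAKE_SCOPED_GUARD(&lock) {
            if (vals[i] < 0)
                continue; /* exits the guard, not this iteration */
        }
        processed++;      /* still reached for negative values */
    }
    return processed;
}

/* Fixed shape: a regular block, so continue skips the outer iteration. */
int count_processed_block(const int *vals, int n)
{
    int lock = 0, processed = 0;

    for (int i = 0; i < n; i++) {
        lock++;
        if (vals[i] < 0) {
            lock--;
            continue;
        }
        lock--;
        processed++;
    }
    return processed;
}
```

For the input {1, -1, 2}, the guarded version counts all three elements while the plain-block version skips the negative one.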
8 days | gpio: shared: undo the vote of the proxy on GPIO free | Bartosz Golaszewski
When the user of a shared GPIO managed by gpio-shared-proxy calls gpiod_put() to release it, we never undo the potential "vote" for driving the shared line "high". In the free() callback, check if this proxy voted for "high" and - if so - decrease the number of votes and potentially revert the value to low if this is the last user. Cc: stable@vger.kernel.org Fixes: e992d54c6f97 ("gpio: shared-proxy: implement the shared GPIO proxy driver") Closes: https://sashiko.dev/#/patchset/20260513-gpio-shared-dynamic-voting-v1-1-8e1c49961b7d%40oss.qualcomm.com Reviewed-by: Linus Walleij <linusw@kernel.org> Link: https://patch.msgid.link/20260522-gpio-shared-free-vote-v3-1-8a4fddc6bedb@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
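The voting scheme can be modelled with a counter per shared line. All names here (`struct shared_line`, `struct proxy`, `proxy_set()`, `proxy_free()`) are hypothetical and the model ignores locking; it only shows a release path that withdraws an outstanding "high" vote.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one physical line shared by several proxies. */
struct shared_line {
    int high_votes; /* number of proxies currently voting "high" */
    int value;      /* resulting line value: 1 if any vote is held */
};

struct proxy {
    struct shared_line *line;
    bool voted_high;
};

/* Record or withdraw this proxy's vote and update the line. */
void proxy_set(struct proxy *p, bool high)
{
    assert(p != NULL && p->line != NULL);
    if (high && !p->voted_high) {
        p->voted_high = true;
        p->line->high_votes++;
    } else if (!high && p->voted_high) {
        p->voted_high = false;
        p->line->high_votes--;
    }
    p->line->value = p->line->high_votes > 0;
}

/* Release path: undo an outstanding "high" vote before going away. */
void proxy_free(struct proxy *p)
{
    assert(p != NULL && p->line != NULL);
    if (p->voted_high) {
        p->voted_high = false;
        if (--p->line->high_votes == 0)
            p->line->value = 0; /* last voter gone: revert to low */
    }
}
```

Without the work done in `proxy_free()`, the counter in this model would stay above zero forever after a voter is released.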
8 days | Merge tag 'tee-fixes-for-v7.1' of ↵ | Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee into arm/fixes
TEE fixes for v7.1
Fixing:
- params_from_user() cleanup in error path in tee_ioctl_supp_recv()
- possible tee_shm leak in error path in register_shm_helper()
- padding in struct tee_ioctl_object_invoke_arg
* tag 'tee-fixes-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee:
  tee: fix params_from_user() error path in tee_ioctl_supp_recv
  tee: shm: fix shm leak in register_shm_helper()
  tee: fix tee_ioctl_object_invoke_arg padding
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
8 days | Bluetooth: hci_sync: Reset device counters in hci_dev_close_sync() | Heitor Alves de Siqueira
Before resetting or closing the device, protocol counters should also be zeroed. Fixes: d0b137062b2d ("Bluetooth: hci_sync: Rework init stages") Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
8 days | Bluetooth: hci_sync: Set HCI_CMD_DRAIN_WORKQUEUE during device close | Heitor Alves de Siqueira
Since hci_dev_close_sync() can now be called during the reset path, we should also set HCI_CMD_DRAIN_WORKQUEUE. This avoids queuing timeouts while the hdev workqueue is being drained. Fixes: 877afadad2dc ("Bluetooth: When HCI work queue is drained, only queue chained work") Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
8 days | Bluetooth: hci_core: Rework hci_dev_do_reset() to use hci_sync functions | Heitor Alves de Siqueira
The current HCI reset function in hci_core.c duplicates most of the work done by hci_dev_close_sync(), and doesn't handle LE, advertising or discovery. Instead of porting these to hci_dev_do_reset(), directly call the close/open functions from hci_sync to reset the hdev. MGMT now notifies when a user performs a reset. Suggested-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
8 days | Bluetooth: ISO: serialize iso_sock_clear_timer with socket lock | Muhammad Bilal
iso_sock_close() calls iso_sock_clear_timer() before acquiring lock_sock(sk). iso_sock_clear_timer() reads iso_pi(sk)->conn twice without the socket lock held:
    if (!iso_pi(sk)->conn)
        return;
    cancel_delayed_work(&iso_pi(sk)->conn->timeout_work);
Concurrently, iso_conn_del() executes under lock_sock(sk) and calls iso_chan_del(), which sets iso_pi(sk)->conn to NULL and may result in the final reference to the connection being dropped:
    CPU0                            CPU1
    ----                            ----
    iso_sock_clear_timer()
      if (conn != NULL)
                                    ...
                                    lock_sock(sk)
                                    iso_chan_del()
                                      iso_pi(sk)->conn = NULL
      cancel_delayed_work(conn)
      /* NULL deref or UAF */
iso_pi(sk)->conn is not stable across the unlock window, causing a NULL pointer dereference or use-after-free. Serialize iso_sock_clear_timer() with the socket lock by moving it inside lock_sock()/release_sock(), matching the pattern used in iso_conn_del() and all other call sites.
Fixes: ccf74f2390d60a2f9a75ef496d2564abb478f46a ("Bluetooth: Add BTPROTO_ISO socket type")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
8 days | Bluetooth: ISO: fix UAF in iso_recv_frame | Muhammad Bilal
iso_recv_frame reads conn->sk under iso_conn_lock but releases the lock before using sk, with no reference held. A concurrent iso_sock_kill() can free sk in that window, causing use-after-free on sk->sk_state and sock_queue_rcv_skb(). Fix by replacing the bare pointer read with iso_sock_hold(conn), which calls sock_hold() while the spinlock is held, atomically elevating the refcount before the lock drops. Add a drop_put label so sock_put() is called on all exit paths where the hold succeeded. Fixes: ccf74f2390d60a2f9a75ef496d2564abb478f46a ("Bluetooth: Add BTPROTO_ISO socket type") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
8 days | Bluetooth: L2CAP: Fix possible crash on l2cap_ecred_conn_rsp | Luiz Augusto von Dentz
If a dcid is received for an already-assigned destination CID, the spec requires both channels to be discarded. However, calling l2cap_chan_del may invalidate the tmp cursor created by list_for_each_entry_safe, and it is in fact the wrong procedure: since chan->dcid may have been assigned previously, the channel really needs to be disconnected. Calling l2cap_chan_close directly may still lead to l2cap_chan_del, so instead schedule l2cap_chan_timeout with delay 0 to close the channel asynchronously. Fixes: 15f02b910562 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
8 days | Bluetooth: l2cap: clear chan->ident on ECRED reconfiguration success | Zhenghang Xiao
l2cap_ecred_reconf_rsp() returns early on success without clearing chan->ident. Every other L2CAP response handler (l2cap_ecred_conn_rsp, l2cap_le_connect_rsp, l2cap_config_rsp) clears chan->ident after a successful transaction to prevent the channel from matching subsequent responses with the recycled ident value. A remote attacker that completed a reconfiguration as the peer can replay a failure response with the stale ident, causing the kernel to match and destroy the already-established channel via l2cap_chan_del(chan, ECONNRESET). Clear chan->ident for all matching channels on success, and harden the failure path by using l2cap_chan_hold_unless_zero() consistent with other L2CAP handlers (l2cap_le_command_rej, __l2cap_get_chan_by_ident). Fixes: 15f02b910562 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode") Signed-off-by: Zhenghang Xiao <kipreyyy@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
8 days | spi: spi-mem: avoid mutating op template in spi_mem_supports_op() | Santhosh Kumar K
spi_mem_supports_op() accepts a const struct spi_mem_op pointer but casts away const internally to call spi_mem_adjust_op_freq(). This mutates the caller's op template, which causes stale max_freq values when callers reuse persistent templates - subsequent calls won't re-apply the device frequency cap since spi_mem_adjust_op_freq() skips non-zero values. Fix by operating on a stack-local copy instead. Fixes: a4f8e70d75dd ("spi: spi-mem: add spi_mem_adjust_op_freq() in spi_mem_supports_op()") Cc: Tianyu Xu <xtydtc@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Santhosh Kumar K <s-k6@ti.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://patch.msgid.link/20260527173736.2243004-1-s-k6@ti.com Signed-off-by: Mark Brown <broonie@kernel.org>
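The stack-local-copy pattern looks roughly like this in isolation. `struct op`, `adjust_op_freq()` and `supports_op()` are hypothetical, heavily simplified stand-ins: the adjust helper only fills in a zero `max_freq`, matching the behaviour described above, and the "supported" test is an invented comparison against a controller limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified operation template. */
struct op {
    unsigned int max_freq; /* 0 means "not set, use the device cap" */
};

/* Fill in the device cap only when no frequency was set (skips non-zero). */
static void adjust_op_freq(struct op *op, unsigned int dev_max)
{
    if (op->max_freq == 0)
        op->max_freq = dev_max;
}

/* Check support without touching the caller's const template. */
bool supports_op(const struct op *tmpl, unsigned int dev_max,
                 unsigned int ctrl_max)
{
    assert(tmpl != NULL);

    struct op local = *tmpl;  /* work on a stack-local copy */

    adjust_op_freq(&local, dev_max);
    return local.max_freq <= ctrl_max;
}
```

Because the template keeps `max_freq == 0`, a later call with a different device cap re-applies that cap instead of reusing a stale value.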
8 days | Merge branch 'dpll-zl3073x-various-fixes' | Paolo Abeni
Ivan Vecera says: ==================== dpll: zl3073x: various fixes Three fixes for the zl3073x DPLL driver. Patch 1 exports __dpll_device_change_ntf() for use by drivers that need to send device change notifications from within callbacks already running under dpll_lock. Patch 2 replaces the change_work workqueue mechanism with direct calls to __dpll_device_change_ntf(), eliminating a race condition where the work handler could dereference a freed dpll_dev pointer during device teardown. Patch 3 moves the freq_monitor flag from per-DPLL to per-device scope to match the hardware behavior where frequency measurement registers are shared across all DPLL channels. ==================== Link: https://patch.msgid.link/20260526074525.1451008-1-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 daysdpll: zl3073x: make frequency monitor a per-device attributeIvan Vecera
The frequency monitoring feature uses shared hardware registers that measure input reference frequencies independently of individual DPLL channels. However, the freq_monitor flag was incorrectly placed in the per-DPLL structure, causing each channel to track its own enable/disable state independently. Since the DPLL core calls measured_freq_get() only for the first pin registration, the measured_freq_check() in the periodic worker was gated by the per-DPLL freq_monitor flag of whichever channel happens to be checked. If the first DPLL channel had frequency monitoring disabled while another had it enabled, measurements were never reported. Move freq_monitor from struct zl3073x_dpll to struct zl3073x_dev so all DPLL channels share a single flag, matching the hardware behavior. Update freq_monitor_set() to notify other DPLL devices about the change (like phase_offset_avg_factor_set() already does) and remove the mode-dependent guard in zl3073x_dpll_changes_check() since all input pin monitoring (pin state, phase offset, FFO, and measured frequency) works correctly in all DPLL modes. Fixes: bfc923b642874 ("dpll: zl3073x: implement frequency monitoring") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20260526074525.1451008-4-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 daysdpll: zl3073x: use __dpll_device_change_ntf() and remove change_workIvan Vecera
The change_work was introduced to send device change notifications from DPLL device callbacks without deadlocking on dpll_lock, since the callbacks are already invoked under that lock. Now that __dpll_device_change_ntf() is exported for callers that already hold dpll_lock, use it directly and remove the change_work infrastructure entirely. This eliminates a race condition where change_work could be re-scheduled after cancel_work_sync() during device teardown, potentially causing the handler to dereference a freed or NULL dpll_dev pointer. Fixes: 9363b4837659 ("dpll: zl3073x: Allow to configure phase offset averaging factor") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20260526074525.1451008-3-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 daysdpll: export __dpll_device_change_ntf() for use under dpll_lockIvan Vecera
Export __dpll_device_change_ntf() so that drivers can send device change notifications from within device callbacks, which are already called under dpll_lock. Using dpll_device_change_ntf() in that context would deadlock. Add lockdep_assert_held() to catch misuse without the lock held. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20260526074525.1451008-2-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
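The split between a locking wrapper and a "caller already holds the lock" variant can be sketched in userspace C. This is a hedged model, not the dpll code: `struct demo_lock`, `demo_change_ntf()`, `demo_change_ntf_locked()` and the other `demo_*` names are hypothetical, the lock is a plain flag rather than a real mutex, and `assert()` stands in for lockdep_assert_held().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock, modeled as a flag for illustration. */
struct demo_lock {
	bool held;
};

static struct demo_lock demo_dpll_lock;
static int demo_ntf_count;

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);	/* acquiring twice would deadlock a real lock */
	l->held = true;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Locked variant: the caller must already hold demo_dpll_lock. */
static void demo_change_ntf_locked(void)
{
	assert(demo_dpll_lock.held);	/* stand-in for lockdep_assert_held() */
	demo_ntf_count++;
}

/* Unlocked variant: takes the lock itself, so it must not be used in callbacks. */
static void demo_change_ntf(void)
{
	demo_lock_acquire(&demo_dpll_lock);
	demo_change_ntf_locked();
	demo_lock_release(&demo_dpll_lock);
}

/* A driver callback, invoked by the core with the lock already held. */
static void demo_driver_callback(void)
{
	demo_change_ntf_locked();
}

static void demo_core_invoke_callback(void)
{
	demo_lock_acquire(&demo_dpll_lock);
	demo_driver_callback();
	demo_lock_release(&demo_dpll_lock);
}
```

If `demo_driver_callback()` called `demo_change_ntf()` instead, the assertion in `demo_lock_acquire()` would fire, which is the model's equivalent of the deadlock the commit message describes.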
8 daysMAINTAINERS: Add my employer to my entriesJoerg Roedel
AMD pays for my IOMMU maintainer work, so mention that in the MAINTAINERS file as well. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
8 daysMAINTAINERS: Add Vasant Hegde to reviewers of AMD IOMMUJoerg Roedel
Vasant has a long history of providing valuable feedback and testing results for the AMD IOMMU code. Still, too often he is not Cc'ed on code changes, so make his reviewer status official. Acked-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
8 daysMerge tag 'asoc-fix-v7.1-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linusTakashi Iwai
ASoC: Fixes for v7.1 This round of fixes is mostly Sirini's Qualcomm cleanups that have been in review for a while, we also have a couple of small fixes from Cássio.
8 daysMerge branch 'net-handshake-anchor-request-lifetime-to-a-pinned-file-reference'Paolo Abeni
Chuck Lever says: ==================== net/handshake: anchor request lifetime to a pinned file reference handshake_nl_accept_doit() has accumulated four follow-on fixes since 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests"): 7ea9c1ec66bc, 7798b59409c3, fe67b063f687, and dabac51b8102. Each was a local refcount or NULL-check correction; none moved where the file reference is owned, and the same code keeps producing the same class of bug. Reworking the ownership is what breaks the pattern. For the duration of a request, sock->file has no single owner. Submit publishes the request without taking a file reference; accept_doit acquires one inside the handler, after the request has already left the pending list. The consumer can drop its own reference at any time, including the moment between handshake_req_next() popping the request and accept_doit reaching get_file(). The submit-side sock_hold() pins only struct sock; struct socket and sock->file remain under the consumer's control via the file descriptor. This series places the file reference under unambiguous ownership. handshake_req_submit() pins it on the request and completion or cancel drops it (patches 4-5); the submit-side sock_hold() then becomes redundant, and dropping it also closes a publish-before-pin race the late sock_hold itself opened (patch 6). The handshake_complete() API and its consumers move to a uniform negative-errno sign convention (patch 3), with the matching sign correction in nvme-tcp (patch 2). Patch 1 hardens hn_lock for BH context, the netns-exit drain fix builds on the new file-pin infrastructure (patch 8), and new KUnit file-count assertions verify the refcount contract (patch 7). Three things in this restructuring want a careful look. 
In handshake_complete(), the fput() of the request's file reference has to come after hp_done() -- fput() can transitively run handshake_sk_destruct() and free the request, so the patch stashes hr_file in a local first. handshake_sk_destruct() itself is kept on purpose: it owns rhashtable removal and kfree, and remains the backstop if a consumer path bypasses handshake_complete() entirely. Third, handshake_req_next() now returns its request with an extra get_file() held under hn_lock; accept_doit must consume that reference (FD_PREPARE on success, explicit fput on the fdf.err path), and any future caller has to honor the same contract. v2: https://patch.msgid.link/20260521-handshake-file-pin-v2-0-b9dadc472840@oracle.com v1: https://patch.msgid.link/20260518-handshake-file-pin-v1-0-4bbcb7e62fda@oracle.com ==================== Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-0-66c616906ead@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 daysnet/handshake: Drain pending requests at net namespace exitChuck Lever
The arguments to list_splice_init() in handshake_net_exit() are reversed. The call moves the local empty "requests" list onto hn->hn_requests, leaving the local list empty, so the subsequent drain loop runs zero iterations. Pending handshake requests that had not yet been accepted are not torn down when the net namespace is destroyed; each one keeps a reference on a socket file and on the handshake_req allocation. Pass the source and destination in the documented order (list_splice_init(list, head) moves list onto head) so the pending list is transferred to the local scratch list and drained through handshake_complete(). Fixing the splice direction exposes a list-corruption race. After the splice each req->hr_list still has non-empty link pointers, threading the stack-local scratch list rather than hn_requests. A concurrent handshake_req_cancel() -- for example, from sunrpc's TLS timeout on a kernel socket whose netns reference was not taken -- finds the request through the rhashtable, calls remove_pending(), and sees !list_empty(&req->hr_list). __remove_pending_locked() then list_del_init()s an entry off the scratch list while the drain iterates, corrupting it. The same call arriving after the drain loop has run list_del() on an entry hits LIST_POISON instead. Have remove_pending() check HANDSHAKE_F_NET_DRAINING under hn_lock and report not-found when drain is in progress. The drain has already taken ownership; handshake_complete()'s existing test_and_set on HANDSHAKE_F_REQ_COMPLETED still arbitrates between drain and cancel for who calls the consumer's hp_done. Use list_del_init() rather than list_del() in the drain so req->hr_list does not carry LIST_POISON after drain releases the entry. The DRAINING guard in remove_pending() makes cancel return false, but cancel still falls through to test_and_set_bit on HANDSHAKE_F_REQ_COMPLETED and drops the request's hr_file reference. 
Without another pin, if that is the last reference, sk_destruct frees the request while it is still linked on the drain loop's local list. Pin each request's hr_file under hn_lock before releasing the list, and drop that drain pin after the loop finishes with the request. Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-8-66c616906ead@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
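The splice-direction bug can be reproduced with a few lines of userspace C. The list helpers below are minimal re-implementations written for this illustration (prefixed `demo_` to make that clear). They follow the argument order the commit message cites, where the first argument is the source list and the second is the destination head. `demo_drained()` is a hypothetical helper that reports how many entries end up on the scratch list.

```c
#include <stddef.h>

struct demo_list_head {
	struct demo_list_head *next, *prev;
};

static void demo_init_list_head(struct demo_list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int demo_list_empty(const struct demo_list_head *h)
{
	return h->next == h;
}

static void demo_list_add_tail(struct demo_list_head *n, struct demo_list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Move every entry of @list onto @head, then reinitialize @list. */
static void demo_list_splice_init(struct demo_list_head *list,
				  struct demo_list_head *head)
{
	if (!demo_list_empty(list)) {
		struct demo_list_head *first = list->next;
		struct demo_list_head *last = list->prev;
		struct demo_list_head *at = head->next;

		first->prev = head;
		head->next = first;
		last->next = at;
		at->prev = last;
		demo_init_list_head(list);
	}
}

static int demo_list_count(const struct demo_list_head *h)
{
	int n = 0;

	for (const struct demo_list_head *p = h->next; p != h; p = p->next)
		n++;
	return n;
}

/* Returns how many of three pending entries reach the local scratch list. */
static int demo_drained(int reversed)
{
	struct demo_list_head pending, scratch, reqs[3];

	demo_init_list_head(&pending);
	demo_init_list_head(&scratch);
	for (size_t i = 0; i < 3; i++)
		demo_list_add_tail(&reqs[i], &pending);

	if (reversed)
		demo_list_splice_init(&scratch, &pending);	/* buggy order */
	else
		demo_list_splice_init(&pending, &scratch);	/* source, then destination */

	return demo_list_count(&scratch);
}
```

With the reversed arguments the scratch list stays empty, so a drain loop over it runs zero iterations. With the corrected order all three entries move to the scratch list.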
8 daysnet/handshake: Verify file-reference balance in submit pathsChuck Lever
The new file-reference contract on struct handshake_req is silently breakable: a missing get_file() at submit or a missing fput() on an error path leaves the file leaked but does not crash the test, so the existing absence-of-crash checks pass either way. Snapshot file_count(filp) before each handshake_req_submit() in the submit-success, EAGAIN, EBUSY, and cancel tests, and assert the expected balance after submit and again after cancel. The already-completed cancel test also asserts the post-complete balance, which pins down that handshake_complete() drops the reference and that the subsequent cancel does not double-fput. The destroy test gets the same treatment before __fput_sync(), which double-checks that cancel's fput() ran and the only remaining reference is the one sock_alloc_file() established. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-7-66c616906ead@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 daysnet/handshake: Close the submit-side sock_hold raceChuck Lever
handshake_req_submit() publishes the request via handshake_req_hash_add() and __add_pending_locked(), drops hn_lock, and calls handshake_genl_notify() (which can sleep) before taking sock_hold() on req->hr_sk. A fast tlshd ACCEPT followed by DONE can drive handshake_complete()'s sock_put() into the window between the spin_unlock and the late sock_hold(); on a system where the consumer's fd held the only sk reference, the late sock_hold() then operates on an sk whose refcount has reached zero. The preceding two patches install an explicit file reference on struct handshake_req. That file pins sock->file, which pins the embedded struct socket, which defers inet_release()'s sock_put(). As long as hr_file is held, sk cannot reach refcount zero from the consumer side, and the submit-side sock_hold() with its matching sock_put() calls in handshake_complete() and handshake_req_cancel() is now redundant. Drop all three. The file reference already keeps each request's socket alive, and the lifetime story is contained in a single get_file()/fput() pair. Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-6-66c616906ead@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 daysnet/handshake: hand off the pinned file reference to accept_doitChuck Lever
handshake_req_next() removes the request from the per-net pending list and drops hn_lock before handshake_nl_accept_doit() reads req->hr_sk->sk_socket and dereferences sock->file (once in FD_PREPARE() and again in get_file()). In that window a consumer running tls_handshake_cancel() followed by sockfd_put() (svc_sock_free) or __fput_sync() (xs_reset_transport) releases sock->file. sock_release() then runs sock_orphan(), zeroing sk_socket, and frees the struct socket. The accept-side code either reads NULL through sk_socket or chases freed memory. The submit-side sock_hold() does not prevent this. sk_refcnt protects struct sock, but struct socket and sock->file are independently refcounted via the file descriptor the consumer owns. Pinning sk leaves sock and sock->file unprotected. Retarget the accept-side dereferences at req->hr_file, which was pinned at submit time, instead of req->hr_sk->sk_socket->file. Pinning on its own is not sufficient: a consumer that cancels between handshake_req_next() returning and accept_doit reaching FD_PREPARE() takes the !remove_pending() branch in handshake_req_cancel() and drops hr_file before the accept side takes its own reference. Hand off an additional file reference inside handshake_req_next(), under hn_lock, so the accept side operates on a reference that no concurrent handshake_req_cancel() can revoke. FD_PREPARE() consumes that handed-off reference, either by transferring it to the new fd in fd_publish() or by dropping it in the cleanup destructor on error; the explicit get_file() that previously balanced FD_PREPARE() is therefore redundant and goes away. Update handshake_req_cancel_test2 and _test3 to simulate the FD_PREPARE() consumption with an fput() so the kunit file-count assertions stay balanced. 
Reported-by: Chris Mason <clm@meta.com> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-5-66c616906ead@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 daysnet/handshake: Take a long-lived file reference at submitChuck Lever
handshake_nl_accept_doit() needs the file pointer backing req->hr_sk->sk_socket to survive the window between handshake_req_next() and the subsequent FD_PREPARE() and get_file(). The submit-side sock_hold() does not provide that. sk_refcnt keeps struct sock alive, but struct socket is owned by sock->file: when the consumer fputs the last file reference, sock_release() tears the socket down regardless of any sock_hold. Add an hr_file pointer to struct handshake_req and acquire an explicit reference on sock->file during handshake_req_submit(). handshake_complete() and handshake_req_cancel() release the reference on the completion-bit-winning path. The submit error path must also release the file reference, but after rhashtable insertion a concurrent handshake_req_cancel() can discover the request and race the error path. Gate the error-path cleanup -- sk_destruct restoration, fput, and request destruction -- with test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED), the same serialization handshake_complete() and handshake_req_cancel() already use. When cancel has already claimed ownership, the submit error path returns without touching the request; socket teardown handles final destruction. The accept-side dereferences are not yet retargeted; that change comes in the next patch. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-4-66c616906ead@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
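The completion-bit arbitration described above can be sketched with C11 atomics. This is a single-threaded illustration under stated assumptions: `struct demo_req`, `demo_claim()` and the three path helpers are hypothetical names, `atomic_flag` stands in for the HANDSHAKE_F_REQ_COMPLETED bit, and a plain counter stands in for the file reference. The point is only that whichever path wins the test-and-set owns the single release.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request: one completion flag, one modeled file reference. */
struct demo_req {
	atomic_flag completed;	/* stand-in for HANDSHAKE_F_REQ_COMPLETED */
	int file_refs;		/* stand-in for the pinned file reference */
};

static void demo_req_init(struct demo_req *req)
{
	atomic_flag_clear(&req->completed);
	req->file_refs = 1;	/* reference taken at submit time */
}

/* Returns true only for the first caller; everyone else lost the race. */
static bool demo_claim(struct demo_req *req)
{
	return !atomic_flag_test_and_set(&req->completed);
}

/* Each path drops the reference only if it won the completion bit. */
static bool demo_complete(struct demo_req *req)
{
	if (!demo_claim(req))
		return false;
	req->file_refs--;
	return true;
}

static bool demo_cancel(struct demo_req *req)
{
	if (!demo_claim(req))
		return false;
	req->file_refs--;
	return true;
}

static bool demo_submit_error(struct demo_req *req)
{
	if (!demo_claim(req))
		return false;	/* another path owns cleanup; do not touch req */
	req->file_refs--;
	return true;
}
```

In this model, if cancel claims the request first, the later complete and submit-error calls return without dropping anything, so the modeled reference count reaches zero exactly once.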
8 daysnet/handshake: Pass negative errno through handshake_complete()Chuck Lever
handshake_complete() declares status as unsigned int and tls_handshake_done() negates that value (-status) before handing it to the TLS consumer. Consumers match on negative errno constants -- xs_tls_handshake_done() has switch (status) { case 0: case -EACCES: case -ETIMEDOUT: lower_transport->xprt_err = status; break; default: lower_transport->xprt_err = -EACCES; } so the API as designed expects callers to pass positive errno values that the tlshd shim then negates. Three internal callers in handshake_nl_accept_doit(), the net-exit drain, and a kunit test follow kernel convention and pass negative errnos -- -EIO, -ETIMEDOUT, -ETIMEDOUT. The implicit conversion to unsigned int turns -ETIMEDOUT into 0xFFFFFF92; the subsequent -status in tls_handshake_done() wraps back to 110, the consumer's switch falls through, and the xprt reports -EACCES on what should be -ETIMEDOUT or -EIO. Fix the API rather than the call sites. The natural kernel convention is negative errno in, negative errno out. Change handshake_complete() and hp_done to take int status, drop the negation in tls_handshake_done(), and negate once in handshake_nl_done_doit() where status arrives from the wire as an unsigned netlink attribute. The three internal callers were already correct under that convention and need no change. At the same wire boundary, declare MAX_ERRNO as the netlink policy upper bound for HANDSHAKE_A_DONE_STATUS. Attribute validation rejects out-of-range values before handshake_nl_done_doit() runs, and negating a bounded u32 there stays within int range -- closing the UBSAN-visible signed-integer overflow that an unconstrained u32 would invoke. Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-3-66c616906ead@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
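The sign round-trip can be reproduced in userspace C. The sketch below is a model, not the handshake code: every `demo_*` function is a hypothetical stand-in, `demo_consumer_map()` mirrors the switch quoted in the commit message, and `DEMO_MAX_ERRNO` is a local constant assumed to match the kernel's MAX_ERRNO bound of 4095.

```c
#include <errno.h>

#define DEMO_MAX_ERRNO 4095u	/* assumed to mirror the kernel's MAX_ERRNO */

/* Old shape: status travels as unsigned int and is negated on delivery. */
static int demo_old_delivered(unsigned int status)
{
	return (int)(0u - status);	/* unsigned negation wraps modulo 2^32 */
}

/* New shape: negative errno in, negative errno out. */
static int demo_new_delivered(int status)
{
	return status;
}

/* Consumer model: accept the known codes, map anything else to -EACCES. */
static int demo_consumer_map(int status)
{
	switch (status) {
	case 0:
	case -EACCES:
	case -ETIMEDOUT:
		return status;
	default:
		return -EACCES;
	}
}

/* Wire boundary: bound the unsigned value first, then negate exactly once. */
static int demo_wire_to_status(unsigned int wire, int *status)
{
	if (wire > DEMO_MAX_ERRNO)
		return -1;	/* the policy would reject this before the handler runs */
	*status = -(int)wire;
	return 0;
}
```

Passing `-ETIMEDOUT` through `demo_old_delivered()` yields a positive ETIMEDOUT, which the consumer model maps to `-EACCES`. The same value passed through `demo_new_delivered()` arrives unchanged.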
8 daysnvme-tcp: store negative errno in queue->tls_errChuck Lever
nvme_tcp_tls_done() assigns queue->tls_err in three branches. The ENOKEY lookup failure and the EOPNOTSUPP initializer both store negative errnos. The third branch, reached when the handshake layer reports a non-zero status, stores -status. The handshake layer delivers status to the consumer callback as a negative errno; the other in-tree consumers -- xs_tls_handshake_done() and the nvmet target callback -- treat their status argument that way. The extra negation in nvme_tcp_tls_done() flips the sign, leaving tls_err as a positive value (for instance, +EIO), which nvme_tcp_start_tls() then returns to its caller. Drop the extra negation so queue->tls_err uniformly carries a negative errno on failure. Fixes: be8e82caa685 ("nvme-tcp: enable TLS handshake upcall") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-2-66c616906ead@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 daysnet/handshake: Use spin_lock_bh for hn_lockChuck Lever
nvmet_tcp_state_change(), a socket callback that runs in BH context, can reach handshake_req_cancel() via nvmet_tcp_schedule_release_queue() and tls_handshake_cancel(). handshake_req_cancel() acquires hn->hn_lock with plain spin_lock(). If a process-context thread on the same CPU holds hn->hn_lock when a softirq invokes the cancel path, the lock attempt deadlocks. This is the only caller that invokes tls_handshake_cancel() from BH context; every other consumer calls it from process context. Deferring the cancel to process context in the NVMe target is not straightforward: nvmet_tcp_schedule_release_queue() must call tls_handshake_cancel() atomically with its state transition to DISCONNECTING. If the cancel were deferred, the handshake completion callback could fire in the window before the cancel runs, observe the unexpected state, and return without dropping its kref on the queue. Reworking that interlock is considerably more invasive than hardening the handshake lock. Convert all hn->hn_lock acquisitions from spin_lock/spin_unlock to spin_lock_bh/spin_unlock_bh so the lock is never taken with softirqs enabled. Fixes: 675b453e0241 ("nvmet-tcp: enable TLS handshake upcall") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-1-66c616906ead@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>