summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
5 daysMerge tag 'fs_for_v7.1-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull isofs and udf fixes from Jan Kara: "Several isofs and udf fixes" * tag 'fs_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: docs: isofs: replace dead ECMA-119 FTP link udf: reject descriptors with oversized CRC length isofs: use QSTR_LEN() in isofs_cmp isofs: validate block number from NFS file handle in isofs_export_iget isofs: validate Rock Ridge CE continuation extent against volume size
5 daysMerge tag 'fsnotify_for_v7.1-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify fixes from Jan Kara: "Three fixes for fsnotify / fanotify" * tag 'fsnotify_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: fix inode reference leak in fsnotify_recalc_mask() fanotify: Fix spelling mistake "enforecement" -> "enforcement" fanotify: fix false positive on permission events
5 daysMerge tag 'for-7.1-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - space reservation fixes: - correctly undo 'may_use' accounting for remap tree - avoid double decrement of 'may_use' when submitting async io - actually enable the shutdown ioctl callback (not just the superblock ops) - raid stripe tree fixes when deleting extents - add missing error handling - fix various incorrect values set - fix transaction state when removing a directory, possibly leading to EIO during log replay - additional b-tree node key checks during metadata readahead - error handling and transaction abort updates * tag 'for-7.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix double-decrement of bytes_may_use in submit_one_async_extent() btrfs: check return value of btrfs_partially_delete_raid_extent() btrfs: handle -EAGAIN from btrfs_duplicate_item and refresh stale leaf pointer btrfs: replace ASSERT with proper error handling in stripe lookup fallback btrfs: fix wrong min_objectid in btrfs_previous_item() call btrfs: fix raid stripe search missing entries at leaf boundaries btrfs: copy devid in btrfs_partially_delete_raid_extent() btrfs: handle unexpected free-space-tree key types btrfs: fix missing last_unlink_trans update when removing a directory btrfs: don't clobber errors in add_remap_tree_entries() btrfs: enable shutdown ioctl for non-experimental builds btrfs: apply first key check for readahead when possible btrfs: abort transaction in do_remap_reloc_trans() on failure btrfs: fix bytes_may_use leak in do_remap_reloc_trans() btrfs: fix bytes_may_use leak in move_existing_remap()
7 daysMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linuxLinus Torvalds
Pull ARM updates from Russell King: - fix a race condition handling PG_dcache_clean - further cleanups for the fault handling, allowing RT to be enabled - fixing nzones validation in adfs filesystem driver - fix for module unwinding * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9463/1: Allow to enable RT ARM: 9472/1: fix race condition on PG_dcache_clean in __sync_icache_dcache() ARM: 9471/1: module: fix unwind section relocation out of range error fs/adfs: validate nzones in adfs_validate_bblk() ARM: provide individual is_translation_fault() and is_permission_fault() ARM: move FSR fault status definitions before fsr_fs() ARM: use BIT() and GENMASK() for fault status register fields ARM: move is_permission_fault() and is_translation_fault() to fault.h ARM: move vmalloc() lazy-page table population ARM: ensure interrupts are enabled in __do_user_fault()
8 daysMerge tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Bugfixes: - Fix handling of ENOSPC so that if we have to resend writes, they are written synchronously - SUNRPC RDMA transport fixes from Chuck - Several fixes for delegated timestamps in NFSv4.2 - Failure to obtain a directory delegation should not cause stat() to fail with NFSv4 - Rename was failing to update timestamps when a directory delegation is held on NFSv4 - Ensure we check rsize/wsize after crossing a NFSv4 filesystem boundary - NFSv4/pnfs: - If the server is down, retry the layout returns on reboot - Fallback to MDS could result in a short write being incorrectly logged Cleanups: - Use memcpy_and_pad in decode_fh" * tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (21 commits) NFS: Fix RCU dereference of cl_xprt in nfs_compare_super_address NFS: remove redundant __private attribute from nfs_page_class NFSv4.2: fix CLONE/COPY attrs in presence of delegated attributes NFS: fix writeback in presence of errors nfs: use memcpy_and_pad in decode_fh NFSv4.1: Apply session size limits on clone path NFSv4: retry GETATTR if GET_DIR_DELEGATION failed NFS: fix RENAME attr in presence of directory delegations pnfs/flexfiles: validate ds_versions_cnt is non-zero NFS/blocklayout: print each device used for SCSI layouts xprtrdma: Post receive buffers after RPC completion xprtrdma: Scale receive batch size with credit window xprtrdma: Replace rpcrdma_mr_seg with xdr_buf cursor xprtrdma: Decouple frwr_wp_create from frwr_map xprtrdma: Close lost-wakeup race in xprt_rdma_alloc_slot xprtrdma: Avoid 250 ms delay on backlog wakeup xprtrdma: Close sendctx get/put race that can block a transport nfs: update inode ctime after removexattr operation nfs: fix utimensat() for atime with delegated timestamps NFS: improve "Server wrote zero bytes" error ...
8 daysMerge tag 'ceph-for-7.1-rc1' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: "We have a series from Alex which extends CephFS client metrics with support for per-subvolume data I/O performance and latency tracking (metadata operations aren't included) and a good variety of fixes and cleanups across RBD and CephFS" * tag 'ceph-for-7.1-rc1' of https://github.com/ceph/ceph-client: ceph: add subvolume metrics collection and reporting ceph: parse subvolume_id from InodeStat v9 and store in inode ceph: handle InodeStat v8 versioned field in reply parsing libceph: Fix slab-out-of-bounds access in auth message processing rbd: fix null-ptr-deref when device_add_disk() fails crush: cleanup in crush_do_rule() method ceph: clear s_cap_reconnect when ceph_pagelist_encode_32() fails ceph: only d_add() negative dentries when they are unhashed libceph: update outdated comment in ceph_sock_write_space() libceph: Remove obsolete session key alignment logic ceph: fix num_ops off-by-one when crypto allocation fails libceph: Prevent potential null-ptr-deref in ceph_handle_auth_reply()
8 daysMerge tag 'ntfs-for-7.1-rc1-part2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs Pull ntfs updates from Namjae Jeon: - Fix potential data leakage by zeroing the portion of the straddle block beyond initialized_size when reading non-resident attributes - Remove unnecessary zeroing in ntfs_punch_hole() for ranges beyond initialized_size, as they are already returned as zeros on read - Fix writable check in ntfs_file_mmap_prepare() to correctly handle shared mappings using VMA_SHARED_BIT | VMA_MAYWRITE_BIT - Use page allocation instead of kmemdup() for IOMAP_INLINE data to ensure page-aligned address and avoid BUG trap in iomap_inline_data_valid() caused by the page boundary check - Add a size check before memory allocation in ntfs_attr_readall() and reject overly large attributes - Remove unneeded noop_direct_IO from ntfs_aops as it is no longer required following the FMODE_CAN_ODIRECT flag - Fix seven static analysis warnings reported by Smatch * tag 'ntfs-for-7.1-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs: ntfs: use page allocation for resident attribute inline data ntfs: fix mmap_prepare writable check for shared mappings ntfs: fix potential 32-bit truncation in ntfs_write_cb() ntfs: fix uninitialized variable in ntfs_map_runlist_nolock ntfs: delete dead code ntfs: add missing error code in ntfs_mft_record_alloc() ntfs: fix uninitialized variables in ntfs_ea_set_wsl_inode() ntfs: fix uninitialized pointer in ntfs_write_mft_block ntfs: fix uninitialized variable in ntfs_write_simple_iomap_begin_non_resident ntfs: remove noop_direct_IO from address_space_operations ntfs: limit memory allocation in ntfs_attr_readall ntfs: not zero out range beyond init in punch_hole ntfs: zero out stale data in straddle block beyond initialized_size
8 daysMerge tag '9p-for-7.1-rc1' of https://github.com/martinetd/linuxLinus Torvalds
Pull 9p updates from Dominique Martinet: - 9p access flag fix (cannot change access flag since new mount API implem) - some minor cleanup * tag '9p-for-7.1-rc1' of https://github.com/martinetd/linux: 9p/trans_xen: replace simple_strto* with kstrtouint 9p/trans_xen: make cleanup idempotent after dataring alloc errors 9p: document missing enum values in kernel-doc comments 9p: fix access mode flags being ORed instead of replaced 9p: fix memory leak in v9fs_init_fs_context error path
9 daysMerge tag 'vfs-7.1-rc1.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - eventpoll: fix ep_remove() UAF and follow-up cleanup - fs: aio: set VMA_DONTCOPY_BIT in mmap to fix NULL-pointer-dereference error - writeback: Fix use after free in inode_switch_wbs_work_fn() - fuse: reject oversized dirents in page cache - fs: aio: reject partial mremap to avoid Null-pointer-dereference error - nstree: fix func. parameter kernel-doc warnings - fs: Handle multiply claimed blocks more gracefully with mmb * tag 'vfs-7.1-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: eventpoll: drop vestigial epi->dying flag eventpoll: drop dead bool return from ep_remove_epi() eventpoll: refresh eventpoll_release() fast-path comment eventpoll: move f_lock acquisition into ep_remove_file() eventpoll: fix ep_remove struct eventpoll / struct file UAF eventpoll: move epi_fget() up eventpoll: rename ep_remove_safe() back to ep_remove() eventpoll: drop vestigial __ prefix from ep_remove_{file,epi}() eventpoll: kill __ep_remove() eventpoll: split __ep_remove() eventpoll: use hlist_is_singular_node() in __ep_remove() fs: Handle multiply claimed blocks more gracefully with mmb nstree: fix func. parameter kernel-doc warnings fs: aio: reject partial mremap to avoid Null-pointer-dereference error fuse: reject oversized dirents in page cache writeback: Fix use after free in inode_switch_wbs_work_fn() fs: aio: set VMA_DONTCOPY_BIT in mmap to fix NULL-pointer-dereference error
9 daysMerge tag 'v7.1-rc-part2-ksmbd-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull more smb server updates from Steve French: - move fs/smb/common/smbdirect to fs/smb/smbdirect - change signature calc to use AES-CMAC library, simpler and faster - invalid signature fix - multichannel fix - open create options fix - fix durable handle leak - cap maximum lock count to avoid potential denial of service - four connection fixes: connection free and session destroy IDA fixes, refcount fix, connection leak fix, max_connections off by one fix - IPC validation fix - fix out of bounds write in getting xattrs - fix use after free in durable handle reconnect - three ACL fixes: fix potential ACL overflow, harden num_aces check, and fix minimum ACE size check * tag 'v7.1-rc-part2-ksmbd-fixes' of git://git.samba.org/ksmbd: smb: smbdirect: move fs/smb/common/smbdirect/ to fs/smb/smbdirect/ smb: server: stop sending fake security descriptors ksmbd: scope conn->binding slowpath to bound sessions only ksmbd: fix CreateOptions sanitization clobbering the whole field ksmbd: fix durable fd leak on ClientGUID mismatch in durable v2 open ksmbd: fix O(N^2) DoS in smb2_lock via unbounded LockCount ksmbd: destroy async_ida in ksmbd_conn_free() ksmbd: destroy tree_conn_ida in ksmbd_session_destroy() ksmbd: Use AES-CMAC library for SMB3 signature calculation ksmbd: reset rcount per connection in ksmbd_conn_wait_idle_sess_id() ksmbd: fix out-of-bounds write in smb2_get_ea() EA alignment ksmbd: use check_add_overflow() to prevent u16 DACL size overflow ksmbd: fix use-after-free in smb2_open during durable reconnect ksmbd: validate num_aces and harden ACE walk in smb_inherit_dacl() smb: server: fix max_connections off-by-one in tcp accept path ksmbd: require minimum ACE size in smb_check_perm_dacl() ksmbd: validate response sizes in ipc_validate_msg() smb: server: fix active_num_conn leak on transport allocation failure
9 daysMerge tag 'v7.1-rc1-part3-smb3-client-fixes' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - Four bug fixes: OOB read in ioctl query info, 3 ACL fixes - SMB1 Unix extensions mount fix - Four crypto improvements: move to AES-CMAC library, simpler and faster - Remove drop_dir_cache to avoid potential crash, and move to /procfs - Seven SMB3.1.1 compression fixes * tag 'v7.1-rc1-part3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: Drop 'allocate_crypto' arg from smb*_calc_signature() smb: client: Make generate_key() return void smb: client: Remove obsolete cmac(aes) allocation smb: client: Use AES-CMAC library for SMB3 signature calculation smb: common: add SMB3_COMPRESS_MAX_ALGS smb: client: compress: add code docs to lz77.c smb: client: compress: LZ77 optimizations smb: client: compress: increase LZ77_MATCH_MAX_DIST smb: client: compress: fix counting in LZ77 match finding smb: client: compress: fix buffer overrun in lz77_compress() smb: client: scope end_of_dacl to CIFS_DEBUG2 use in parse_dacl smb: client: fix (remove) drop_dir_cache module parameter smb: client: require a full NFS mode SID before reading mode bits smb: client: validate the whole DACL before rewriting it in cifsacl smb: client: fix OOB read in smb2_ioctl_query_info QUERY_INFO path cifs: update internal module version number smb: client: compress: fix bad encoding on last LZ77 flag smb: client: fix dir separator in SMB1 UNIX mounts
9 dayseventpoll: drop vestigial epi->dying flagChristian Brauner
With ep_remove() now pinning @file via epi_fget() across the f_ep clear and hlist_del_rcu(), the dying flag no longer orchestrates anything: it was set in eventpoll_release_file() (which only runs from __fput(), i.e. after @file's refcount has reached zero) and read in __ep_remove() / ep_remove() as a cheap bail before attempting the same synchronization epi_fget() now provides unconditionally. The implication is simple: epi->dying == true always coincides with file_ref_get(&file->f_ref) == false, because __fput() is reachable only once the refcount hits zero and the refcount is monotone in that state. The READ_ONCE(epi->dying) in ep_remove() therefore selects exactly the same callers that epi_fget() would reject, just one atomic cheaper. That's not worth a struct field, a second coordination mechanism, and the comments on both. Refresh the eventpoll_release_file() comment to describe what actually makes the path race-free now (the pin in ep_remove()). No functional change: the correctness argument is unchanged, only the mechanism is now a single one instead of two. Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-10-2470f9eec0f5@kernel.org Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
9 dayseventpoll: drop dead bool return from ep_remove_epi()Christian Brauner
ep_remove_epi() always returns true -- the "can be disposed" answer was meaningful back when the dying-check lived inside the pre-split __ep_remove(), but after that check moved to ep_remove() the return value is just noise. Both callers gate on it unconditionally: if (ep_remove_epi(ep, epi)) WARN_ON_ONCE(ep_refcount_dec_and_test(ep)); dispose = ep_remove_epi(ep, epi); ... if (dispose && ep_refcount_dec_and_test(ep)) ep_free(ep); Make ep_remove_epi() return void, drop the dispose local in eventpoll_release_file(), and the useless conditionals at both callers. No functional change. Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-9-2470f9eec0f5@kernel.org Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
9 dayseventpoll: move f_lock acquisition into ep_remove_file()Christian Brauner
Let the helper own its critical section end-to-end: take &file->f_lock at the top, read file->f_ep inside the lock, release on exit. Callers (ep_remove() and eventpoll_release_file()) no longer need to wrap the call, and the function-comment lock-handoff contract is gone. Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-7-2470f9eec0f5@kernel.org Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
9 dayseventpoll: fix ep_remove struct eventpoll / struct file UAFChristian Brauner
ep_remove() (via ep_remove_file()) cleared file->f_ep under file->f_lock but then kept using @file inside the critical section (is_file_epoll(), hlist_del_rcu() through the head, spin_unlock). A concurrent __fput() taking the eventpoll_release() fastpath in that window observed the transient NULL, skipped eventpoll_release_file() and ran to f_op->release / file_free(). For the epoll-watches-epoll case, f_op->release is ep_eventpoll_release() -> ep_clear_and_put() -> ep_free(), which kfree()s the watched struct eventpoll. Its embedded ->refs hlist_head is exactly where epi->fllink.pprev points, so the subsequent hlist_del_rcu()'s "*pprev = next" scribbles into freed kmalloc-192 memory. In addition, struct file is SLAB_TYPESAFE_BY_RCU, so the slot backing @file could be recycled by alloc_empty_file() -- reinitializing f_lock and f_ep -- while ep_remove() is still nominally inside that lock. The upshot is an attacker-controllable kmem_cache_free() against the wrong slab cache. Pin @file via epi_fget() at the top of ep_remove() and gate the critical section on the pin succeeding. With the pin held @file cannot reach refcount zero, which holds __fput() off and transitively keeps the watched struct eventpoll alive across the hlist_del_rcu() and the f_lock use, closing both UAFs. If the pin fails @file has already reached refcount zero and its __fput() is in flight. Because we bailed before clearing f_ep, that path takes the eventpoll_release() slow path into eventpoll_release_file() and blocks on ep->mtx until the waiter side's ep_clear_and_put() drops it. The bailed epi's share of ep->refcount stays intact, so the trailing ep_refcount_dec_and_test() in ep_clear_and_put() cannot free the eventpoll out from under eventpoll_release_file(); the orphaned epi is then cleaned up there. A successful pin also proves we are not racing eventpoll_release_file() on this epi, so drop the now-redundant re-check of epi->dying under f_lock. The cheap lockless READ_ONCE(epi->dying) fast-path bailout stays. Fixes: 58c9b016e128 ("epoll: use refcount to reduce ep_mutex contention") Reported-by: Jaeyoung Chung <jjy600901@snu.ac.kr> Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-6-2470f9eec0f5@kernel.org Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
9 dayseventpoll: move epi_fget() upChristian Brauner
We'll need it when removing files so move it up. No functional change. Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-5-2470f9eec0f5@kernel.org Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
9 dayseventpoll: rename ep_remove_safe() back to ep_remove()Christian Brauner
The current name is just confusing and doesn't clarify anything. Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-4-2470f9eec0f5@kernel.org Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
9 dayseventpoll: drop vestigial __ prefix from ep_remove_{file,epi}()Christian Brauner
With __ep_remove() gone, the double-underscore on __ep_remove_file() and __ep_remove_epi() no longer contrasts with a __-less parent and just reads as noise. Rename both to ep_remove_file() and ep_remove_epi(). No functional change. Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
9 dayseventpoll: kill __ep_remove()Christian Brauner
Remove the boolean conditional in __ep_remove() and restructure the code so the check for racing with eventpoll_release_file() are only done in the ep_remove_safe() path where they belong. Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-3-2470f9eec0f5@kernel.org Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
9 dayseventpoll: split __ep_remove()Christian Brauner
Split __ep_remove() to delineate file removal from epoll item removal. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-2-2470f9eec0f5@kernel.org Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
9 dayseventpoll: use hlist_is_singular_node() in __ep_remove()Christian Brauner
Replace the open-coded "epi is the only entry in file->f_ep" check with hlist_is_singular_node(). Same semantics, and the helper avoids the head-cacheline access in the common false case. Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-1-2470f9eec0f5@kernel.org Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
9 daysfs: Handle multiply claimed blocks more gracefully with mmbJan Kara
When a metadata block is referenced by multiple inodes and tracked by metadata bh infrastructure (which is forbidden and generally indicates filesystem corruption), it can happen that mmb_mark_buffer_dirty() is called for two different mmb structures in parallel. This can lead to a corruption of mmb linked list. Handle that situation gracefully (at least from mmb POV) by serializing on setting bh->b_mmb. Reported-by: Ruikai Peng <ruikai@pwno.io> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260423090311.10955-2-jack@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>
9 daysfs: aio: reject partial mremap to avoid Null-pointer-dereference errorZizhi Wo
[BUG] Recently, our internal syzkaller testing uncovered a null pointer dereference issue: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... [ 51.111664] filemap_read_folio+0x25/0xe0 [ 51.112410] filemap_fault+0xad7/0x1250 [ 51.113112] __do_fault+0x4b/0x460 [ 51.113699] do_pte_missing+0x5bc/0x1db0 [ 51.114250] ? __pte_offset_map+0x23/0x170 [ 51.114822] __handle_mm_fault+0x9f8/0x1680 ... Crash analysis showed the file involved was an AIO ring file. The phenomenon triggered is the same as the issue described in [1]. [CAUSE] Consider the following scenario: userspace sets up an AIO context via io_setup(), which creates a VMA covering the entire ring buffer. Then userspace calls mremap() with the AIO ring address as the source, a smaller old_len (less than the full ring size), MREMAP_MAYMOVE set, and without MREMAP_DONTUNMAP. The kernel will relocate the requested portion to a new destination address. During this move, __split_vma() splits the original AIO ring VMA. The requested portion is unmapped from the source and re-established at the destination, while the remainder stays at the original source address as an orphan VMA. The aio_ring_mremap() callback fires on the new destination VMA, updating ctx->mmap_base to the destination address. But the callback is unaware that only a partial region was moved and that an orphan VMA still exists at the source: source(AIO): +-------------------+---------------------+ | moved to dest | orphan VMA (AIO) | +-------------------+---------------------+ A A+partial_len A+ctx->mmap_size dest: +-------------------+ | moved VMA (AIO) | +-------------------+ B B+partial_len Later, io_destroy() calls vm_munmap(ctx->mmap_base, ctx->mmap_size), which unmaps the destination. This not only fails to unmap the orphan VMA at the source, but also overshoots the destination VMA and may unmap unrelated mappings adjacent to it! After put_aio_ring_file() calls truncate_setsize() to remove all pages from the pagecache, any subsequent access to the orphan VMA triggers filemap_fault(), which calls a_ops->read_folio(). Since aio does not implement read_folio, this results in a NULL pointer dereference. [FIX] Note that expanding mremap (new_len > old_len) is already rejected because AIO ring VMAs are created with VM_DONTEXPAND. The only problematic case is a partial move where "old_len == new_len" but both are smaller than the full ring size. Fix this by checking in aio_ring_mremap() that the new VMA covers the entire ring. This ensures the AIO ring is always moved as a whole, preventing orphan VMAs and the subsequent crash. [1]: https://lore.kernel.org/all/20260413010814.548568-1-wozizhi@huawei.com/ Signed-off-by: Zizhi Wo <wozizhi@huaweicloud.com> Link: https://patch.msgid.link/20260418060634.3713620-1-wozizhi@huaweicloud.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
9 daysfuse: reject oversized dirents in page cacheSamuel Page
fuse_add_dirent_to_cache() computes a serialized dirent size from the server-controlled namelen field and copies the dirent into a single page-cache page. The existing logic only checks whether the dirent fits in the remaining space of the current page and advances to a fresh page if not. It never checks whether the dirent itself exceeds PAGE_SIZE. As a result, a malicious FUSE server can return a dirent with namelen=4095, producing a serialized record size of 4120 bytes. On 4 KiB page systems this causes memcpy() to overflow the cache page by 24 bytes into the following kernel page. Reject dirents that cannot fit in a single page before copying them into the readdir cache. Fixes: 69e34551152a ("fuse: allow caching readdir") Cc: stable@vger.kernel.org # v6.16+ Assisted-by: Bynario AI Signed-off-by: Samuel Page <sam@bynar.io> Reported-by: Qi Tang <tpluszz77@gmail.com> Reported-by: Zijun Hu <nightu@northwestern.edu> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://patch.msgid.link/20260420090139.662772-1-mszeredi@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
9 dayswriteback: Fix use after free in inode_switch_wbs_work_fn()Jan Kara
inode_switch_wbs_work_fn() has a loop like: wb_get(new_wb); while (1) { list = llist_del_all(&new_wb->switch_wbs_ctxs); /* Nothing to do? */ if (!list) break; ... process the items ... } Now adding of items to the list looks like: wb_queue_isw() if (llist_add(&isw->list, &wb->switch_wbs_ctxs)) queue_work(isw_wq, &wb->switch_work); Because inode_switch_wbs_work_fn() loops when processing isw items, it can happen that wb->switch_work is pending while wb->switch_wbs_ctxs is empty. This is a problem because in that case wb can get freed (no isw items -> no wb reference) while the work is still pending causing use-after-free issues. We cannot just fix this by cancelling work when freeing wb because that could still trigger problematic 0 -> 1 transitions on wb refcount due to wb_get() in inode_switch_wbs_work_fn(). It could be all handled with more careful code but that seems unnecessarily complex so let's avoid that until it is proven that the looping actually brings practical benefit. Just remove the loop from inode_switch_wbs_work_fn() instead. That way when wb_queue_isw() queues work, we are guaranteed we have added the first item to wb->switch_wbs_ctxs and nobody is going to remove it (and drop the wb reference it holds) until the queued work runs. Fixes: e1b849cfa6b6 ("writeback: Avoid contention on wb->list_lock when switching inodes") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260413093618.17244-2-jack@suse.cz Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
9 daysfs: aio: set VMA_DONTCOPY_BIT in mmap to fix NULL-pointer-dereference errorZizhi Wo
[BUG] Recently, our internal syzkaller testing uncovered a null pointer dereference issue: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... [ 51.111664] filemap_read_folio+0x25/0xe0 [ 51.112410] filemap_fault+0xad7/0x1250 [ 51.113112] __do_fault+0x4b/0x460 [ 51.113699] do_pte_missing+0x5bc/0x1db0 [ 51.114250] ? __pte_offset_map+0x23/0x170 [ 51.114822] __handle_mm_fault+0x9f8/0x1680 [ 51.115408] handle_mm_fault+0x24c/0x570 [ 51.115958] do_user_addr_fault+0x226/0xa50 ... Crash analysis showed the file involved was an AIO ring file. [CAUSE] PARENT process CHILD process t=0 io_setup(1, &ctx) [access ctx addr] fork() io_destroy vm_munmap // not affect child vma percpu_ref_put ... put_aio_ring_file t=1 [access ctx addr] // pagefault ... __do_fault filemap_fault max_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE) t=2 truncate_setsize truncate_pagecache t=3 filemap_get_folio // no folio, create folio __filemap_get_folio(..., FGP_CREAT, ...) // page_not_uptodate filemap_read_folio(file, mapping->a_ops->read_folio, folio) // oops! At t=0, the parent process calls io_setup and then fork. The child process gets its own VMA but without any PTEs. The parent then calls io_destroy. Before i_size is truncated to 0, at t=1 the child process accesses this AIO ctx address and triggers a pagefault. After the max_idx check passes, at t=2 the parent calls truncate_setsize and truncate_pagecache. At t=3 the child fails to obtain the folio, falls into the "page_not_uptodate" path, and hits this problem because AIO does not implement "read_folio". [Fix] Fix this by marking the AIO ring buffer VMA with VM_DONTCOPY so that fork()'s dup_mmap() skips it entirely. This is the correct semantic because: 1) The child's ioctx_table is already reset to NULL by mm_init_aio() during fork(), so the child has no AIO context and no way to perform any AIO operations on this mapping. 2) The AIO ring VMA is only meaningful in conjunction with its associated kioctx, which is never inherited across fork(). So child process with no AIO context has no legitimate reason to access the ring buffer. Delivering SIGSEGV on such an erroneous access is preferable to a kernel crash. Signed-off-by: Zizhi Wo <wozizhi@huaweicloud.com> Link: https://patch.msgid.link/20260413010814.548568-1-wozizhi@huawei.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
10 dayssmb: smbdirect: move fs/smb/common/smbdirect/ to fs/smb/smbdirect/Stefan Metzmacher
This also removes the smbdirect_ prefix from the files. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/linux-cifs/CAHk-=whmue3PVi88K0UZLZO0at22QhQZ-yu+qO2TOKyZpGqecw@mail.gmail.com/ Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
10 daysMerge tag 'tracefs-v7.1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracefs fixes from Steven Rostedt: - Use list_add_tail_rcu() for walking eventfs children The linked list of children is protected by SRCU and list walkers can walk the list with only using SRCU. Using just list_add_tail() on weakly ordered architectures can cause issues. Instead use list_add_tail_rcu(). - Hold eventfs_mutex and SRCU for remount walk events The trace_apply_options() walks the tracefs_inodes where some are eventfs inodes and eventfs_remount() is called which in turn calls eventfs_set_attr(). This walk only holds normal RCU read locks, but the eventfs_mutex and SRCU should be held. Add a eventfs_remount_(un)lock() helpers to take the necessary locks before iterating the list. * tag 'tracefs-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: eventfs: Hold eventfs_mutex and SRCU when remount walks events eventfs: Use list_add_tail_rcu() for SRCU-protected children list
10 daysudf: reject descriptors with oversized CRC lengthMichael Bommarito
udf_read_tagged() skips CRC verification when descCRCLength + sizeof(struct tag) exceeds the block size. A crafted UDF image can set descCRCLength to an oversized value to bypass CRC validation entirely; the descriptor is then accepted based solely on the 8-bit tag checksum, which is trivially recomputable. Reject such descriptors instead of silently accepting them. A legitimate single-block descriptor should never have a CRC length that exceeds the block. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://patch.msgid.link/20260413211240.853662-1-michael.bommarito@gmail.com Signed-off-by: Jan Kara <jack@suse.cz>
10 dayssmb: client: Drop 'allocate_crypto' arg from smb*_calc_signature()Eric Biggers
Since the crypto library API is now being used instead of crypto_shash, all structs for MAC computation are now just fixed-size structs allocated on the stack; no dynamic allocations are ever required. Besides being much more efficient, this also means that the 'allocate_crypto' argument to smb2_calc_signature() and smb3_calc_signature() is no longer used. Remove this unused argument. Acked-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: Make generate_key() return voidEric Biggers
Since the crypto library API is now being used instead of crypto_shash, generate_key() can no longer fail. Make it return void and simplify the callers accordingly. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: Remove obsolete cmac(aes) allocationEric Biggers
Since the crypto library API is now being used instead of crypto_shash, the "cmac(aes)" crypto_shash that is being allocated and stored in 'struct cifs_secmech' is no longer used. Remove it. That makes the kconfig selection of CRYPTO_CMAC and the module softdep on "cmac" unnecessary. So remove those too. Finally, since this removes the last use of crypto_shash from the smb client, also remove the remaining crypto_shash-related helper functions. Note: cifs_unicode.c was relying on <linux/unaligned.h> being included transitively via <crypto/internal/hash.h>. Since the latter include is removed, make cifs_unicode.c include <linux/unaligned.h> explicitly. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: Use AES-CMAC library for SMB3 signature calculationEric Biggers
Convert smb3_calc_signature() to use the AES-CMAC library instead of a "cmac(aes)" crypto_shash. The result is simpler and faster code. With the library there's no need to allocate memory, no need to handle errors except for key preparation, and the AES-CMAC code is accessed directly without inefficient indirect calls and other unnecessary API overhead. For now a "cmac(aes)" crypto_shash is still being allocated in 'struct cifs_secmech'. Later commits will remove that, simplifying the code even further. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: common: add SMB3_COMPRESS_MAX_ALGSEnzo Matsumiya
Set it to number of currently defined algorithms (6 as of now). Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: compress: add code docs to lz77.cEnzo Matsumiya
Document parts of the code, especially the apparently non-sense parts. Other: - change pointer increment constants to sizeof() values Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: compress: LZ77 optimizationsEnzo Matsumiya
This patch implements several micro-optimizations on lz77_compress() with the goal of reducing the number of instructions per [input] byte (a.k.a. IPB). Changes: - change hashtable to be u32 (instead of u64) -- change the hash function to reflect that (adds lz77_hash() and lz77_read32() helpers) - batch-write literals instead of 1 by 1 -- now that we have a well defined hot path (match finding) and a cold path (encode literals + match), batch writing makes a significant difference - implement adaptive skipping of input bytes -- skip input bytes more aggressively if too few matches are being found - name some constants for more meaningful context Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: compress: increase LZ77_MATCH_MAX_DISTEnzo Matsumiya
Increase max distance (i.e. window size) from 1k to 8k. This allows better compression and is just as fast. Other: - drop LZ77_MATCH_MIN_DIST as it's nused -- main loop already checks if dist > 0 Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: compress: fix counting in LZ77 match findingEnzo Matsumiya
- lz77_match_len() increments @cur before checking for equality, leading to off-by-one match len in some cases. Fix by moving pointers increment to inside the loop. Also rename @wnd arg to @match (more accurate name). - both lz77_match_len() and lz77_compress() checked for "buf + step < end" when the correct is "<=" for such cases. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: compress: fix buffer overrun in lz77_compress()Enzo Matsumiya
@dst buffer is allocated with same size as @src, which, for good compression cases, works fine. However, when compression goes bad (e.g. random bytes payloads), the compressed size can increase significantly, and even by stopping the main loop at 7/8 of @slen, writing leftover literals could write past the end of @dst because of LZ77 metadata. To fix this, add lz77_compressed_alloc_size() helper to compute the correct allocation size for @dst, accounting for metadata and worst cast scenario (all literals). While this is overprovisioning memory, it's not only correct, but also allows lz77_compress() main loop to run without ever checking @dst limits (i.e. a perf improvement). Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: scope end_of_dacl to CIFS_DEBUG2 use in parse_daclMichael Bommarito
After validate_dacl() was factored out in commit 149822e5541c, the local end_of_dacl in parse_dacl() is only read by the dump_ace() call under #ifdef CONFIG_CIFS_DEBUG2. With CIFS_DEBUG2 off the variable is assigned but never used, which gcc -W=1 flags as -Wunused-but-set-variable. Remove the local and compute the end-of-dacl pointer inline at the single call site inside the existing CIFS_DEBUG2 guard. No functional change: when CIFS_DEBUG2 is enabled the argument value is identical to what the removed local carried; when CIFS_DEBUG2 is disabled the code was already dead. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202604220046.tGkRxVtS-lkp@intel.com/ Fixes: 149822e5541c ("smb: client: validate the whole DACL before rewriting it in cifsacl") Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: fix (remove) drop_dir_cache module parameterEnzo Matsumiya
Being a module parameter, it's possible to do: # modprobe cifs drop_dir_cache=1 Which will lead to a crash, because cifs_tcp_ses_list hasn't been initialized yet: [ 168.242624] BUG: kernel NULL pointer dereference, address: 0000000000000010 [ 168.242952] #PF: supervisor read access in kernel mode [ 168.243175] #PF: error_code(0x0000) - not-present page [ 168.243394] PGD 0 P4D 0 [ 168.243524] Oops: Oops: 0000 [#1] SMP NOPTI [ 168.243703] CPU: 2 UID: 0 PID: 1105 Comm: modprobe Not tainted 7.0.0-lku #5 PREEMPT(lazy) [ 168.244054] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-2-g4f253b9b-prebuilt.qemu.org 04/01/2014 [ 168.244557] RIP: 0010:cifs_param_set_drop_dir_cache+0x7c/0x100 [cifs] ... [ 168.248785] Call Trace: [ 168.248915] <TASK> [ 168.249023] parse_args+0x285/0x3a0 [ 168.249204] ? __pfx_unknown_module_param_cb+0x10/0x10 [ 168.249448] load_module+0x192b/0x1bb0 [ 168.249637] ? __pfx_unknown_module_param_cb+0x10/0x10 [ 168.249882] ? kernel_read_file+0x27d/0x2b0 [ 168.250088] init_module_from_file+0xce/0xf0 [ 168.250291] idempotent_init_module+0xfb/0x2f0 [ 168.250496] __x64_sys_finit_module+0x5a/0xa0 [ 168.250694] do_syscall_64+0xe0/0x5a0 [ 168.250863] ? exc_page_fault+0x65/0x160 [ 168.251050] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 168.251284] RIP: 0033:0x7fcaa12b774d Instead of fixing this with some kind of "is module initialized" approach, this patch instead moves that functionality to procfs, setting a write op for the existing open_dirs entry, where writing a 0 to it will drop the cached directory entries. Also make it available only when CONFIG_CIFS_DEBUG=y. A small change needed now is to not call flush_delayed_work() on invalidate_all_cached_dirs() when called from procfs (can't sleep in that context). So add a @sync arg to invalidate_all_cached_dirs() to control when to flush the delayed works. Fixes: dde6667fa3c8 ("smb: client: add drop_dir_cache module parameter to invalidate cached dirents") Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: require a full NFS mode SID before reading mode bitsMichael Bommarito
parse_dacl() treats an ACE SID matching sid_unix_NFS_mode as an NFS mode SID and reads sid.sub_auth[2] to recover the mode bits. That assumes the ACE carries three subauthorities, but compare_sids() only compares min(a, b) subauthorities. A malicious server can return an ACE with num_subauth = 2 and sub_auth[] = {88, 3}, which still matches sid_unix_NFS_mode and then drives the sub_auth[2] read four bytes past the end of the ACE. Require num_subauth >= 3 before treating the ACE as an NFS mode SID. This keeps the fix local to the special-SID mode path without changing compare_sids() semantics for the rest of cifsacl. Fixes: e2f8fbfb8d09 ("cifs: get mode bits from special sid on stat") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: validate the whole DACL before rewriting it in cifsaclMichael Bommarito
build_sec_desc() and id_mode_to_cifs_acl() derive a DACL pointer from a server-supplied dacloffset and then use the incoming ACL to rebuild the chmod/chown security descriptor. The original fix only checked that the struct smb_acl header fits before reading dacl_ptr->size or dacl_ptr->num_aces. That avoids the immediate header-field OOB read, but the rewrite helpers still walk ACEs based on pdacl->num_aces with no structural validation of the incoming DACL body. A malicious server can return a truncated DACL that still contains a header, claims one or more ACEs, and then drive replace_sids_and_copy_aces() or set_chmod_dacl() past the validated extent while they compare or copy attacker-controlled ACEs. Factor the DACL structural checks into validate_dacl(), extend them to validate each ACE against the DACL bounds, and use the shared validator before the chmod/chown rebuild paths. parse_dacl() reuses the same validator so the read-side parser and write-side rewrite paths agree on what constitutes a well-formed incoming DACL. Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
10 dayssmb: client: fix OOB read in smb2_ioctl_query_info QUERY_INFO pathMichael Bommarito
smb2_ioctl_query_info() has two response-copy branches: PASSTHRU_FSCTL and the default QUERY_INFO path. The QUERY_INFO branch clamps qi.input_buffer_length to the server-reported OutputBufferLength and then copies qi.input_buffer_length bytes from qi_rsp->Buffer to userspace, but it never verifies that the flexible-array payload actually fits within rsp_iov[1].iov_len. A malicious server can return OutputBufferLength larger than the actual QUERY_INFO response, causing copy_to_user() to walk past the response buffer and expose adjacent kernel heap to userspace. Guard the QUERY_INFO copy with a bounds check on the actual Buffer payload. Use struct_size(qi_rsp, Buffer, qi.input_buffer_length) rather than an open-coded addition so the guard cannot overflow on 32-bit builds. Fixes: f5778c398713 ("SMB3: Allow SMB3 FSCTL queries to be sent to server from tools") Cc: stable@vger.kernel.org Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Steve French <stfrench@microsoft.com>
11 dayssmb: server: stop sending fake security descriptorsMarios Makassikis
in smb2_get_info_sec, a dummy security descriptor (SD) is returned if the requested information is not supported. the code is currently wrong, as DACL_PROTECTED is set in the type field, but there is no DACL is present. instead of faking a security, report a STATUS_NOT_SUPPORTED error. this seems to fix a "Error 0x80090006: Invalid Signature" on file transfers with Windows 11 clients (25H2, build 26200.8246). capturing traffic shows that the client is sending a GET_INFO/SEC_INFO request, with the additional_info field set to 0x20 (ATTRIBUTE_SECURITY_INFORMATION). Returning an empty SD (with only SELF_RELATIVE set) does not fix the error. Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
11 daysksmbd: scope conn->binding slowpath to bound sessions onlyHyunwoo Kim
When the binding SESSION_SETUP sets conn->binding = true, the flag stays set after the call so that the global session lookup in ksmbd_session_lookup_all() can find the session, which was not added to conn->sessions. Because the flag is connection-wide, the global lookup path will also resolve any other session by id if asked. Tighten the global lookup so that the returned session must have this connection registered in its channel xarray (sess->ksmbd_chann_list). The channel entry is installed by the existing binding_session path in ntlm_authenticate()/krb5_authenticate() when a SESSION_SETUP completes successfully, so this condition is a strict equivalent of "this connection has been accepted as a channel of this session". Connections that have not bound to a given session cannot reach it via the global table. The existing conn->binding gate for entering the slowpath is preserved so that non-binding connections keep the fast-path-only behavior, and the session->state check is unchanged. Fixes: f5a544e3bab7 ("ksmbd: add support for SMB3 multichannel") Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
11 daysksmbd: fix CreateOptions sanitization clobbering the whole fieldDaeMyung Kang
smb2_open() attempts to clear conflicting CreateOptions bits (FILE_SEQUENTIAL_ONLY_LE together with FILE_RANDOM_ACCESS_LE, and FILE_NO_COMPRESSION_LE on a directory open), but uses a plain assignment of the bitwise negation of the target flag: req->CreateOptions = ~(FILE_SEQUENTIAL_ONLY_LE); req->CreateOptions = ~(FILE_NO_COMPRESSION_LE); This replaces the entire field with 0xFFFFFFFB / 0xFFFFFFEF rather than clearing a single bit. With the SEQUENTIAL/RANDOM case, the next check for FILE_OPEN_BY_FILE_ID_LE | CREATE_TREE_CONNECTION | FILE_RESERVE_OPFILTER_LE then trivially matches and a legitimate request is rejected with -EOPNOTSUPP. With the NO_COMPRESSION case, every downstream test (FILE_DELETE_ON_CLOSE, etc.) operates on a corrupted CreateOptions value. Use &= ~FLAG to clear only the intended bit in both places. Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
11 daysksmbd: fix durable fd leak on ClientGUID mismatch in durable v2 openDaeMyung Kang
ksmbd_lookup_fd_cguid() returns a ksmbd_file with its refcount incremented via ksmbd_fp_get(). parse_durable_handle_context() in the DURABLE_REQ_V2 case properly releases this reference on every path inside the ClientGUID-match branch, either by calling ksmbd_put_durable_fd() or by transferring ownership to dh_info->fp for a successful reconnect. However, when an entry exists in the global file table with the same CreateGuid but a different ClientGUID, the code simply falls through to the new-open path without dropping the reference obtained from ksmbd_lookup_fd_cguid(). Per MS-SMB2 section 3.3.5.9.10 ("Handling the SMB2_CREATE_DURABLE_HANDLE_REQUEST_V2 Create Context"), the server MUST locate an Open whose Open.CreateGuid matches the request's CreateGuid AND whose Open.ClientGuid matches the ClientGuid of the connection that received the request. If no such Open is found, the server MUST continue with the normal open execution phase. A CreateGuid hit with a ClientGUID mismatch is therefore the "Open not found" case: proceeding with a new open is correct, but the reference obtained purely as a side effect of the lookup must not be leaked. Repeated requests that hit this mismatch pin global_ft entries, prevent __ksmbd_close_fd() from ever running for the corresponding files, and defeat the durable scavenger, leading to long-lived resource leaks. Release the reference in the mismatch path and clear dh_info->fp so subsequent logic does not mistake a non-matching lookup result for a reconnect target. Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
11 daysksmbd: fix O(N^2) DoS in smb2_lock via unbounded LockCountAkif Sait
smb2_lock() performs O(N^2) conflict detection with no cap on LockCount. Cap lock_count at 64 to prevent CPU exhaustion from a single request. Signed-off-by: Akif Sait <akif.sait111@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
11 daysksmbd: destroy async_ida in ksmbd_conn_free()DaeMyung Kang
When per-connection async_ida was converted from a dynamically allocated ksmbd_ida to an embedded struct ida, ksmbd_ida_free() was removed from the connection teardown path but no matching ida_destroy() was added. The connection is therefore freed with the IDA's backing xarray still intact. The kernel IDA API expects ida_init() and ida_destroy() to be paired over an object's lifetime, so add the missing cleanup before the connection is freed. No leak has been observed in testing; this is a pairing fix to match the IDA lifetime rules, not a response to a reproduced regression. Fixes: d40012a83f87 ("cifsd: declare ida statically") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>