| Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull isofs and udf fixes from Jan Kara:
"Several isofs and udf fixes"
* tag 'fs_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
docs: isofs: replace dead ECMA-119 FTP link
udf: reject descriptors with oversized CRC length
isofs: use QSTR_LEN() in isofs_cmp
isofs: validate block number from NFS file handle in isofs_export_iget
isofs: validate Rock Ridge CE continuation extent against volume size
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify fixes from Jan Kara:
"Three fixes for fsnotify / fanotify"
* tag 'fsnotify_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: fix inode reference leak in fsnotify_recalc_mask()
fanotify: Fix spelling mistake "enforecement" -> "enforcement"
fanotify: fix false positive on permission events
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- space reservation fixes:
- correctly undo 'may_use' accounting for remap tree
- avoid double decrement of 'may_use' when submitting async io
- actually enable the shutdown ioctl callback (not just the superblock
ops)
- raid stripe tree fixes when deleting extents
- add missing error handling
- fix various incorrect values set
- fix transaction state when removing a directory, possibly leading to
EIO during log replay
- additional b-tree node key checks during metadata readahead
- error handling and transaction abort updates
* tag 'for-7.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix double-decrement of bytes_may_use in submit_one_async_extent()
btrfs: check return value of btrfs_partially_delete_raid_extent()
btrfs: handle -EAGAIN from btrfs_duplicate_item and refresh stale leaf pointer
btrfs: replace ASSERT with proper error handling in stripe lookup fallback
btrfs: fix wrong min_objectid in btrfs_previous_item() call
btrfs: fix raid stripe search missing entries at leaf boundaries
btrfs: copy devid in btrfs_partially_delete_raid_extent()
btrfs: handle unexpected free-space-tree key types
btrfs: fix missing last_unlink_trans update when removing a directory
btrfs: don't clobber errors in add_remap_tree_entries()
btrfs: enable shutdown ioctl for non-experimental builds
btrfs: apply first key check for readahead when possible
btrfs: abort transaction in do_remap_reloc_trans() on failure
btrfs: fix bytes_may_use leak in do_remap_reloc_trans()
btrfs: fix bytes_may_use leak in move_existing_remap()
|
|
Pull ARM updates from Russell King:
- fix a race condition handling PG_dcache_clean
- further cleanups for the fault handling, allowing RT to be enabled
- fixing nzones validation in adfs filesystem driver
- fix for module unwinding
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9463/1: Allow to enable RT
ARM: 9472/1: fix race condition on PG_dcache_clean in __sync_icache_dcache()
ARM: 9471/1: module: fix unwind section relocation out of range error
fs/adfs: validate nzones in adfs_validate_bblk()
ARM: provide individual is_translation_fault() and is_permission_fault()
ARM: move FSR fault status definitions before fsr_fs()
ARM: use BIT() and GENMASK() for fault status register fields
ARM: move is_permission_fault() and is_translation_fault() to fault.h
ARM: move vmalloc() lazy-page table population
ARM: ensure interrupts are enabled in __do_user_fault()
|
|
Pull NFS client updates from Trond Myklebust:
"Bugfixes:
- Fix handling of ENOSPC so that if we have to resend writes, they
are written synchronously
- SUNRPC RDMA transport fixes from Chuck
- Several fixes for delegated timestamps in NFSv4.2
- Failure to obtain a directory delegation should not cause stat() to
fail with NFSv4
- Rename was failing to update timestamps when a directory delegation
is held on NFSv4
- Ensure we check rsize/wsize after crossing a NFSv4 filesystem
boundary
- NFSv4/pnfs:
- If the server is down, retry the layout returns on reboot
- Fallback to MDS could result in a short write being incorrectly
logged
Cleanups:
- Use memcpy_and_pad in decode_fh"
* tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (21 commits)
NFS: Fix RCU dereference of cl_xprt in nfs_compare_super_address
NFS: remove redundant __private attribute from nfs_page_class
NFSv4.2: fix CLONE/COPY attrs in presence of delegated attributes
NFS: fix writeback in presence of errors
nfs: use memcpy_and_pad in decode_fh
NFSv4.1: Apply session size limits on clone path
NFSv4: retry GETATTR if GET_DIR_DELEGATION failed
NFS: fix RENAME attr in presence of directory delegations
pnfs/flexfiles: validate ds_versions_cnt is non-zero
NFS/blocklayout: print each device used for SCSI layouts
xprtrdma: Post receive buffers after RPC completion
xprtrdma: Scale receive batch size with credit window
xprtrdma: Replace rpcrdma_mr_seg with xdr_buf cursor
xprtrdma: Decouple frwr_wp_create from frwr_map
xprtrdma: Close lost-wakeup race in xprt_rdma_alloc_slot
xprtrdma: Avoid 250 ms delay on backlog wakeup
xprtrdma: Close sendctx get/put race that can block a transport
nfs: update inode ctime after removexattr operation
nfs: fix utimensat() for atime with delegated timestamps
NFS: improve "Server wrote zero bytes" error
...
|
|
Pull ceph updates from Ilya Dryomov:
"We have a series from Alex which extends CephFS client metrics with
support for per-subvolume data I/O performance and latency tracking
(metadata operations aren't included) and a good variety of fixes and
cleanups across RBD and CephFS"
* tag 'ceph-for-7.1-rc1' of https://github.com/ceph/ceph-client:
ceph: add subvolume metrics collection and reporting
ceph: parse subvolume_id from InodeStat v9 and store in inode
ceph: handle InodeStat v8 versioned field in reply parsing
libceph: Fix slab-out-of-bounds access in auth message processing
rbd: fix null-ptr-deref when device_add_disk() fails
crush: cleanup in crush_do_rule() method
ceph: clear s_cap_reconnect when ceph_pagelist_encode_32() fails
ceph: only d_add() negative dentries when they are unhashed
libceph: update outdated comment in ceph_sock_write_space()
libceph: Remove obsolete session key alignment logic
ceph: fix num_ops off-by-one when crypto allocation fails
libceph: Prevent potential null-ptr-deref in ceph_handle_auth_reply()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs
Pull ntfs updates from Namjae Jeon:
- Fix potential data leakage by zeroing the portion of the straddle
block beyond initialized_size when reading non-resident attributes
- Remove unnecessary zeroing in ntfs_punch_hole() for ranges beyond
initialized_size, as they are already returned as zeros on read
- Fix writable check in ntfs_file_mmap_prepare() to correctly handle
shared mappings using VMA_SHARED_BIT | VMA_MAYWRITE_BIT
- Use page allocation instead of kmemdup() for IOMAP_INLINE data to
ensure page-aligned address and avoid BUG trap in
iomap_inline_data_valid() caused by the page boundary check
- Add a size check before memory allocation in ntfs_attr_readall() and
reject overly large attributes
- Remove unneeded noop_direct_IO from ntfs_aops as it is no longer
required following the FMODE_CAN_ODIRECT flag
- Fix seven static analysis warnings reported by Smatch
* tag 'ntfs-for-7.1-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs:
ntfs: use page allocation for resident attribute inline data
ntfs: fix mmap_prepare writable check for shared mappings
ntfs: fix potential 32-bit truncation in ntfs_write_cb()
ntfs: fix uninitialized variable in ntfs_map_runlist_nolock
ntfs: delete dead code
ntfs: add missing error code in ntfs_mft_record_alloc()
ntfs: fix uninitialized variables in ntfs_ea_set_wsl_inode()
ntfs: fix uninitialized pointer in ntfs_write_mft_block
ntfs: fix uninitialized variable in ntfs_write_simple_iomap_begin_non_resident
ntfs: remove noop_direct_IO from address_space_operations
ntfs: limit memory allocation in ntfs_attr_readall
ntfs: not zero out range beyond init in punch_hole
ntfs: zero out stale data in straddle block beyond initialized_size
|
|
Pull 9p updates from Dominique Martinet:
- 9p access flag fix (cannot change access flag since new mount API implem)
- some minor cleanup
* tag '9p-for-7.1-rc1' of https://github.com/martinetd/linux:
9p/trans_xen: replace simple_strto* with kstrtouint
9p/trans_xen: make cleanup idempotent after dataring alloc errors
9p: document missing enum values in kernel-doc comments
9p: fix access mode flags being ORed instead of replaced
9p: fix memory leak in v9fs_init_fs_context error path
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- eventpoll: fix ep_remove() UAF and follow-up cleanup
- fs: aio: set VMA_DONTCOPY_BIT in mmap to fix NULL-pointer-dereference
error
- writeback: Fix use after free in inode_switch_wbs_work_fn()
- fuse: reject oversized dirents in page cache
- fs: aio: reject partial mremap to avoid Null-pointer-dereference
error
- nstree: fix func. parameter kernel-doc warnings
- fs: Handle multiply claimed blocks more gracefully with mmb
* tag 'vfs-7.1-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
eventpoll: drop vestigial epi->dying flag
eventpoll: drop dead bool return from ep_remove_epi()
eventpoll: refresh eventpoll_release() fast-path comment
eventpoll: move f_lock acquisition into ep_remove_file()
eventpoll: fix ep_remove struct eventpoll / struct file UAF
eventpoll: move epi_fget() up
eventpoll: rename ep_remove_safe() back to ep_remove()
eventpoll: drop vestigial __ prefix from ep_remove_{file,epi}()
eventpoll: kill __ep_remove()
eventpoll: split __ep_remove()
eventpoll: use hlist_is_singular_node() in __ep_remove()
fs: Handle multiply claimed blocks more gracefully with mmb
nstree: fix func. parameter kernel-doc warnings
fs: aio: reject partial mremap to avoid Null-pointer-dereference error
fuse: reject oversized dirents in page cache
writeback: Fix use after free in inode_switch_wbs_work_fn()
fs: aio: set VMA_DONTCOPY_BIT in mmap to fix NULL-pointer-dereference error
|
|
Pull more smb server updates from Steve French:
- move fs/smb/common/smbdirect to fs/smb/smbdirect
- change signature calc to use AES-CMAC library, simpler and faster
- invalid signature fix
- multichannel fix
- open create options fix
- fix durable handle leak
- cap maximum lock count to avoid potential denial of service
- four connection fixes: connection free and session destroy IDA fixes,
refcount fix, connection leak fix, max_connections off by one fix
- IPC validation fix
- fix out of bounds write in getting xattrs
- fix use after free in durable handle reconnect
- three ACL fixes: fix potential ACL overflow, harden num_aces check,
and fix minimum ACE size check
* tag 'v7.1-rc-part2-ksmbd-fixes' of git://git.samba.org/ksmbd:
smb: smbdirect: move fs/smb/common/smbdirect/ to fs/smb/smbdirect/
smb: server: stop sending fake security descriptors
ksmbd: scope conn->binding slowpath to bound sessions only
ksmbd: fix CreateOptions sanitization clobbering the whole field
ksmbd: fix durable fd leak on ClientGUID mismatch in durable v2 open
ksmbd: fix O(N^2) DoS in smb2_lock via unbounded LockCount
ksmbd: destroy async_ida in ksmbd_conn_free()
ksmbd: destroy tree_conn_ida in ksmbd_session_destroy()
ksmbd: Use AES-CMAC library for SMB3 signature calculation
ksmbd: reset rcount per connection in ksmbd_conn_wait_idle_sess_id()
ksmbd: fix out-of-bounds write in smb2_get_ea() EA alignment
ksmbd: use check_add_overflow() to prevent u16 DACL size overflow
ksmbd: fix use-after-free in smb2_open during durable reconnect
ksmbd: validate num_aces and harden ACE walk in smb_inherit_dacl()
smb: server: fix max_connections off-by-one in tcp accept path
ksmbd: require minimum ACE size in smb_check_perm_dacl()
ksmbd: validate response sizes in ipc_validate_msg()
smb: server: fix active_num_conn leak on transport allocation failure
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Four bug fixes: OOB read in ioctl query info, 3 ACL fixes
- SMB1 Unix extensions mount fix
- Four crypto improvements: move to AES-CMAC library, simpler and faster
- Remove drop_dir_cache to avoid potential crash, and move to /procfs
- Seven SMB3.1.1 compression fixes
* tag 'v7.1-rc1-part3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: Drop 'allocate_crypto' arg from smb*_calc_signature()
smb: client: Make generate_key() return void
smb: client: Remove obsolete cmac(aes) allocation
smb: client: Use AES-CMAC library for SMB3 signature calculation
smb: common: add SMB3_COMPRESS_MAX_ALGS
smb: client: compress: add code docs to lz77.c
smb: client: compress: LZ77 optimizations
smb: client: compress: increase LZ77_MATCH_MAX_DIST
smb: client: compress: fix counting in LZ77 match finding
smb: client: compress: fix buffer overrun in lz77_compress()
smb: client: scope end_of_dacl to CIFS_DEBUG2 use in parse_dacl
smb: client: fix (remove) drop_dir_cache module parameter
smb: client: require a full NFS mode SID before reading mode bits
smb: client: validate the whole DACL before rewriting it in cifsacl
smb: client: fix OOB read in smb2_ioctl_query_info QUERY_INFO path
cifs: update internal module version number
smb: client: compress: fix bad encoding on last LZ77 flag
smb: client: fix dir separator in SMB1 UNIX mounts
|
|
With ep_remove() now pinning @file via epi_fget() across the
f_ep clear and hlist_del_rcu(), the dying flag no longer
orchestrates anything: it was set in eventpoll_release_file()
(which only runs from __fput(), i.e. after @file's refcount has
reached zero) and read in __ep_remove() / ep_remove() as a cheap
bail before attempting the same synchronization epi_fget() now
provides unconditionally.
The implication is simple: epi->dying == true always coincides
with file_ref_get(&file->f_ref) == false, because __fput() is
reachable only once the refcount hits zero and the refcount is
monotone in that state. The READ_ONCE(epi->dying) in ep_remove()
therefore selects exactly the same callers that epi_fget() would
reject, just one atomic cheaper. That's not worth a struct
field, a second coordination mechanism, and the comments on
both.
Refresh the eventpoll_release_file() comment to describe what
actually makes the path race-free now (the pin in ep_remove()).
No functional change: the correctness argument is unchanged,
only the mechanism is now a single one instead of two.
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-10-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
ep_remove_epi() always returns true -- the "can be disposed"
answer was meaningful back when the dying-check lived inside the
pre-split __ep_remove(), but after that check moved to ep_remove()
the return value is just noise. Both callers gate on it
unconditionally:
if (ep_remove_epi(ep, epi))
WARN_ON_ONCE(ep_refcount_dec_and_test(ep));
dispose = ep_remove_epi(ep, epi);
...
if (dispose && ep_refcount_dec_and_test(ep))
ep_free(ep);
Make ep_remove_epi() return void, drop the dispose local in
eventpoll_release_file(), and the useless conditionals at both
callers. No functional change.
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-9-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
Let the helper own its critical section end-to-end: take &file->f_lock
at the top, read file->f_ep inside the lock, release on exit. Callers
(ep_remove() and eventpoll_release_file()) no longer need to wrap the
call, and the function-comment lock-handoff contract is gone.
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-7-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
ep_remove() (via ep_remove_file()) cleared file->f_ep under
file->f_lock but then kept using @file inside the critical section
(is_file_epoll(), hlist_del_rcu() through the head, spin_unlock).
A concurrent __fput() taking the eventpoll_release() fastpath in
that window observed the transient NULL, skipped
eventpoll_release_file() and ran to f_op->release / file_free().
For the epoll-watches-epoll case, f_op->release is
ep_eventpoll_release() -> ep_clear_and_put() -> ep_free(), which
kfree()s the watched struct eventpoll. Its embedded ->refs
hlist_head is exactly where epi->fllink.pprev points, so the
subsequent hlist_del_rcu()'s "*pprev = next" scribbles into freed
kmalloc-192 memory.
In addition, struct file is SLAB_TYPESAFE_BY_RCU, so the slot
backing @file could be recycled by alloc_empty_file() --
reinitializing f_lock and f_ep -- while ep_remove() is still
nominally inside that lock. The upshot is an attacker-controllable
kmem_cache_free() against the wrong slab cache.
Pin @file via epi_fget() at the top of ep_remove() and gate the
critical section on the pin succeeding. With the pin held @file
cannot reach refcount zero, which holds __fput() off and
transitively keeps the watched struct eventpoll alive across the
hlist_del_rcu() and the f_lock use, closing both UAFs.
If the pin fails @file has already reached refcount zero and its
__fput() is in flight. Because we bailed before clearing f_ep,
that path takes the eventpoll_release() slow path into
eventpoll_release_file() and blocks on ep->mtx until the waiter
side's ep_clear_and_put() drops it. The bailed epi's share of
ep->refcount stays intact, so the trailing ep_refcount_dec_and_test()
in ep_clear_and_put() cannot free the eventpoll out from under
eventpoll_release_file(); the orphaned epi is then cleaned up
there.
A successful pin also proves we are not racing
eventpoll_release_file() on this epi, so drop the now-redundant
re-check of epi->dying under f_lock. The cheap lockless
READ_ONCE(epi->dying) fast-path bailout stays.
Fixes: 58c9b016e128 ("epoll: use refcount to reduce ep_mutex contention")
Reported-by: Jaeyoung Chung <jjy600901@snu.ac.kr>
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-6-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
We'll need it when removing files so move it up. No functional change.
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-5-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
The current name is just confusing and doesn't clarify anything.
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-4-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
With __ep_remove() gone, the double-underscore on __ep_remove_file()
and __ep_remove_epi() no longer contrasts with a __-less parent and
just reads as noise. Rename both to ep_remove_file() and
ep_remove_epi(). No functional change.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
Remove the boolean conditional in __ep_remove() and restructure the code
so the check for racing with eventpoll_release_file() are only done in
the ep_remove_safe() path where they belong.
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-3-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
Split __ep_remove() to delineate file removal from epoll item removal.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-2-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
Replace the open-coded "epi is the only entry in file->f_ep" check
with hlist_is_singular_node(). Same semantics, and the helper avoids
the head-cacheline access in the common false case.
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-1-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
When a metadata block is referenced by multiple inodes and tracked by
metadata bh infrastructure (which is forbidden and generally indicates
filesystem corruption), it can happen that mmb_mark_buffer_dirty() is
called for two different mmb structures in parallel. This can lead to a
corruption of mmb linked list. Handle that situation gracefully (at
least from mmb POV) by serializing on setting bh->b_mmb.
Reported-by: Ruikai Peng <ruikai@pwno.io>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260423090311.10955-2-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
[BUG]
Recently, our internal syzkaller testing uncovered a null pointer
dereference issue:
BUG: kernel NULL pointer dereference, address: 0000000000000000
...
[ 51.111664] filemap_read_folio+0x25/0xe0
[ 51.112410] filemap_fault+0xad7/0x1250
[ 51.113112] __do_fault+0x4b/0x460
[ 51.113699] do_pte_missing+0x5bc/0x1db0
[ 51.114250] ? __pte_offset_map+0x23/0x170
[ 51.114822] __handle_mm_fault+0x9f8/0x1680
...
Crash analysis showed the file involved was an AIO ring file. The
phenomenon triggered is the same as the issue described in [1].
[CAUSE]
Consider the following scenario: userspace sets up an AIO context via
io_setup(), which creates a VMA covering the entire ring buffer. Then
userspace calls mremap() with the AIO ring address as the source, a smaller
old_len (less than the full ring size), MREMAP_MAYMOVE set, and without
MREMAP_DONTUNMAP. The kernel will relocate the requested portion to a new
destination address.
During this move, __split_vma() splits the original AIO ring VMA. The
requested portion is unmapped from the source and re-established at the
destination, while the remainder stays at the original source address as
an orphan VMA. The aio_ring_mremap() callback fires on the new destination
VMA, updating ctx->mmap_base to the destination address. But the callback
is unaware that only a partial region was moved and that an orphan VMA
still exists at the source:
source(AIO):
+-------------------+---------------------+
| moved to dest | orphan VMA (AIO) |
+-------------------+---------------------+
A A+partial_len A+ctx->mmap_size
dest:
+-------------------+
| moved VMA (AIO) |
+-------------------+
B B+partial_len
Later, io_destroy() calls vm_munmap(ctx->mmap_base, ctx->mmap_size), which
unmaps the destination. This not only fails to unmap the orphan VMA at the
source, but also overshoots the destination VMA and may unmap unrelated
mappings adjacent to it! After put_aio_ring_file() calls truncate_setsize()
to remove all pages from the pagecache, any subsequent access to the orphan
VMA triggers filemap_fault(), which calls a_ops->read_folio(). Since aio
does not implement read_folio, this results in a NULL pointer dereference.
[FIX]
Note that expanding mremap (new_len > old_len) is already rejected because
AIO ring VMAs are created with VM_DONTEXPAND. The only problematic case is
a partial move where "old_len == new_len" but both are smaller than the
full ring size.
Fix this by checking in aio_ring_mremap() that the new VMA covers the
entire ring. This ensures the AIO ring is always moved as a whole,
preventing orphan VMAs and the subsequent crash.
[1]: https://lore.kernel.org/all/20260413010814.548568-1-wozizhi@huawei.com/
Signed-off-by: Zizhi Wo <wozizhi@huaweicloud.com>
Link: https://patch.msgid.link/20260418060634.3713620-1-wozizhi@huaweicloud.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
fuse_add_dirent_to_cache() computes a serialized dirent size from the
server-controlled namelen field and copies the dirent into a single
page-cache page. The existing logic only checks whether the dirent fits
in the remaining space of the current page and advances to a fresh page
if not. It never checks whether the dirent itself exceeds PAGE_SIZE.
As a result, a malicious FUSE server can return a dirent with
namelen=4095, producing a serialized record size of 4120 bytes. On 4 KiB
page systems this causes memcpy() to overflow the cache page by 24 bytes
into the following kernel page.
Reject dirents that cannot fit in a single page before copying them into
the readdir cache.
Fixes: 69e34551152a ("fuse: allow caching readdir")
Cc: stable@vger.kernel.org # v6.16+
Assisted-by: Bynario AI
Signed-off-by: Samuel Page <sam@bynar.io>
Reported-by: Qi Tang <tpluszz77@gmail.com>
Reported-by: Zijun Hu <nightu@northwestern.edu>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://patch.msgid.link/20260420090139.662772-1-mszeredi@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
inode_switch_wbs_work_fn() has a loop like:
wb_get(new_wb);
while (1) {
list = llist_del_all(&new_wb->switch_wbs_ctxs);
/* Nothing to do? */
if (!list)
break;
... process the items ...
}
Now adding of items to the list looks like:
wb_queue_isw()
if (llist_add(&isw->list, &wb->switch_wbs_ctxs))
queue_work(isw_wq, &wb->switch_work);
Because inode_switch_wbs_work_fn() loops when processing isw items, it
can happen that wb->switch_work is pending while wb->switch_wbs_ctxs is
empty. This is a problem because in that case wb can get freed (no isw
items -> no wb reference) while the work is still pending causing
use-after-free issues.
We cannot just fix this by cancelling work when freeing wb because that
could still trigger problematic 0 -> 1 transitions on wb refcount due to
wb_get() in inode_switch_wbs_work_fn(). It could be all handled with
more careful code but that seems unnecessarily complex so let's avoid
that until it is proven that the looping actually brings practical
benefit. Just remove the loop from inode_switch_wbs_work_fn() instead.
That way when wb_queue_isw() queues work, we are guaranteed we have
added the first item to wb->switch_wbs_ctxs and nobody is going to
remove it (and drop the wb reference it holds) until the queued work
runs.
Fixes: e1b849cfa6b6 ("writeback: Avoid contention on wb->list_lock when switching inodes")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260413093618.17244-2-jack@suse.cz
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
[BUG]
Recently, our internal syzkaller testing uncovered a null pointer
dereference issue:
BUG: kernel NULL pointer dereference, address: 0000000000000000
...
[ 51.111664] filemap_read_folio+0x25/0xe0
[ 51.112410] filemap_fault+0xad7/0x1250
[ 51.113112] __do_fault+0x4b/0x460
[ 51.113699] do_pte_missing+0x5bc/0x1db0
[ 51.114250] ? __pte_offset_map+0x23/0x170
[ 51.114822] __handle_mm_fault+0x9f8/0x1680
[ 51.115408] handle_mm_fault+0x24c/0x570
[ 51.115958] do_user_addr_fault+0x226/0xa50
...
Crash analysis showed the file involved was an AIO ring file.
[CAUSE]
PARENT process CHILD process
t=0 io_setup(1, &ctx)
[access ctx addr]
fork()
io_destroy
vm_munmap // not affect child vma
percpu_ref_put
...
put_aio_ring_file
t=1 [access ctx addr] // pagefault
...
__do_fault
filemap_fault
max_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE)
t=2 truncate_setsize
truncate_pagecache
t=3 filemap_get_folio // no folio, create folio
__filemap_get_folio(..., FGP_CREAT, ...) // page_not_uptodate
filemap_read_folio(file, mapping->a_ops->read_folio, folio) // oops!
At t=0, the parent process calls io_setup and then fork. The child process
gets its own VMA but without any PTEs. The parent then calls io_destroy.
Before i_size is truncated to 0, at t=1 the child process accesses this AIO
ctx address and triggers a pagefault. After the max_idx check passes, at
t=2 the parent calls truncate_setsize and truncate_pagecache. At t=3 the
child fails to obtain the folio, falls into the "page_not_uptodate" path,
and hits this problem because AIO does not implement "read_folio".
[Fix]
Fix this by marking the AIO ring buffer VMA with VM_DONTCOPY so
that fork()'s dup_mmap() skips it entirely. This is the correct
semantic because:
1) The child's ioctx_table is already reset to NULL by mm_init_aio() during
fork(), so the child has no AIO context and no way to perform any AIO
operations on this mapping.
2) The AIO ring VMA is only meaningful in conjunction with its associated
kioctx, which is never inherited across fork(). So child process with no
AIO context has no legitimate reason to access the ring buffer. Delivering
SIGSEGV on such an erroneous access is preferable to a kernel crash.
Signed-off-by: Zizhi Wo <wozizhi@huaweicloud.com>
Link: https://patch.msgid.link/20260413010814.548568-1-wozizhi@huawei.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This also removes the smbdirect_ prefix from the files.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/linux-cifs/CAHk-=whmue3PVi88K0UZLZO0at22QhQZ-yu+qO2TOKyZpGqecw@mail.gmail.com/
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracefs fixes from Steven Rostedt:
- Use list_add_tail_rcu() for walking eventfs children
The linked list of children is protected by SRCU and list walkers can
walk the list with only using SRCU. Using just list_add_tail() on
weakly ordered architectures can cause issues. Instead use
list_add_tail_rcu().
- Hold eventfs_mutex and SRCU for remount walk events
The trace_apply_options() walks the tracefs_inodes where some are
eventfs inodes and eventfs_remount() is called which in turn calls
eventfs_set_attr(). This walk only holds normal RCU read locks, but
the eventfs_mutex and SRCU should be held.
Add a eventfs_remount_(un)lock() helpers to take the necessary locks
before iterating the list.
* tag 'tracefs-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
eventfs: Hold eventfs_mutex and SRCU when remount walks events
eventfs: Use list_add_tail_rcu() for SRCU-protected children list
|
|
udf_read_tagged() skips CRC verification when descCRCLength +
sizeof(struct tag) exceeds the block size. A crafted UDF image can
set descCRCLength to an oversized value to bypass CRC validation
entirely; the descriptor is then accepted based solely on the 8-bit
tag checksum, which is trivially recomputable.
Reject such descriptors instead of silently accepting them. A
legitimate single-block descriptor should never have a CRC length that
exceeds the block.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260413211240.853662-1-michael.bommarito@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Since the crypto library API is now being used instead of crypto_shash,
all structs for MAC computation are now just fixed-size structs
allocated on the stack; no dynamic allocations are ever required.
Besides being much more efficient, this also means that the
'allocate_crypto' argument to smb2_calc_signature() and
smb3_calc_signature() is no longer used. Remove this unused argument.
Acked-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Since the crypto library API is now being used instead of crypto_shash,
generate_key() can no longer fail. Make it return void and simplify the
callers accordingly.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Since the crypto library API is now being used instead of crypto_shash,
the "cmac(aes)" crypto_shash that is being allocated and stored in
'struct cifs_secmech' is no longer used. Remove it.
That makes the kconfig selection of CRYPTO_CMAC and the module softdep
on "cmac" unnecessary. So remove those too.
Finally, since this removes the last use of crypto_shash from the smb
client, also remove the remaining crypto_shash-related helper functions.
Note: cifs_unicode.c was relying on <linux/unaligned.h> being included
transitively via <crypto/internal/hash.h>. Since the latter include is
removed, make cifs_unicode.c include <linux/unaligned.h> explicitly.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Convert smb3_calc_signature() to use the AES-CMAC library instead of a
"cmac(aes)" crypto_shash.
The result is simpler and faster code. With the library there's no need
to allocate memory, no need to handle errors except for key preparation,
and the AES-CMAC code is accessed directly without inefficient indirect
calls and other unnecessary API overhead.
For now a "cmac(aes)" crypto_shash is still being allocated in
'struct cifs_secmech'. Later commits will remove that, simplifying the
code even further.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Set it to number of currently defined algorithms (6 as of now).
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Document parts of the code, especially the apparently
non-sense parts.
Other:
- change pointer increment constants to sizeof() values
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This patch implements several micro-optimizations on lz77_compress()
with the goal of reducing the number of instructions per [input]
byte (a.k.a. IPB).
Changes:
- change hashtable to be u32 (instead of u64) -- change the hash
function to reflect that (adds lz77_hash() and lz77_read32() helpers)
- batch-write literals instead of 1 by 1 -- now that we have a well
defined hot path (match finding) and a cold path (encode literals +
match), batch writing makes a significant difference
- implement adaptive skipping of input bytes -- skip input bytes more
aggressively if too few matches are being found
- name some constants for more meaningful context
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Increase max distance (i.e. window size) from 1k to 8k.
This allows better compression and is just as fast.
Other:
- drop LZ77_MATCH_MIN_DIST as it's nused -- main loop
already checks if dist > 0
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
- lz77_match_len() increments @cur before checking for equality,
leading to off-by-one match len in some cases.
Fix by moving pointers increment to inside the loop.
Also rename @wnd arg to @match (more accurate name).
- both lz77_match_len() and lz77_compress() checked for
"buf + step < end" when the correct is "<=" for such cases.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
@dst buffer is allocated with same size as @src, which, for good
compression cases, works fine.
However, when compression goes bad (e.g. random bytes payloads), the
compressed size can increase significantly, and even by stopping the
main loop at 7/8 of @slen, writing leftover literals could write past
the end of @dst because of LZ77 metadata.
To fix this, add lz77_compressed_alloc_size() helper to compute the
correct allocation size for @dst, accounting for metadata and worst
cast scenario (all literals).
While this is overprovisioning memory, it's not only correct, but also
allows lz77_compress() main loop to run without ever checking @dst
limits (i.e. a perf improvement).
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
After validate_dacl() was factored out in commit 149822e5541c, the
local end_of_dacl in parse_dacl() is only read by the dump_ace()
call under #ifdef CONFIG_CIFS_DEBUG2. With CIFS_DEBUG2 off the
variable is assigned but never used, which gcc -W=1 flags as
-Wunused-but-set-variable.
Remove the local and compute the end-of-dacl pointer inline at the
single call site inside the existing CIFS_DEBUG2 guard. No
functional change: when CIFS_DEBUG2 is enabled the argument value
is identical to what the removed local carried; when CIFS_DEBUG2
is disabled the code was already dead.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604220046.tGkRxVtS-lkp@intel.com/
Fixes: 149822e5541c ("smb: client: validate the whole DACL before rewriting it in cifsacl")
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Being a module parameter, it's possible to do:
# modprobe cifs drop_dir_cache=1
Which will lead to a crash, because cifs_tcp_ses_list hasn't been
initialized yet:
[ 168.242624] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 168.242952] #PF: supervisor read access in kernel mode
[ 168.243175] #PF: error_code(0x0000) - not-present page
[ 168.243394] PGD 0 P4D 0
[ 168.243524] Oops: Oops: 0000 [#1] SMP NOPTI
[ 168.243703] CPU: 2 UID: 0 PID: 1105 Comm: modprobe Not tainted 7.0.0-lku #5 PREEMPT(lazy)
[ 168.244054] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-2-g4f253b9b-prebuilt.qemu.org 04/01/2014
[ 168.244557] RIP: 0010:cifs_param_set_drop_dir_cache+0x7c/0x100 [cifs]
...
[ 168.248785] Call Trace:
[ 168.248915] <TASK>
[ 168.249023] parse_args+0x285/0x3a0
[ 168.249204] ? __pfx_unknown_module_param_cb+0x10/0x10
[ 168.249448] load_module+0x192b/0x1bb0
[ 168.249637] ? __pfx_unknown_module_param_cb+0x10/0x10
[ 168.249882] ? kernel_read_file+0x27d/0x2b0
[ 168.250088] init_module_from_file+0xce/0xf0
[ 168.250291] idempotent_init_module+0xfb/0x2f0
[ 168.250496] __x64_sys_finit_module+0x5a/0xa0
[ 168.250694] do_syscall_64+0xe0/0x5a0
[ 168.250863] ? exc_page_fault+0x65/0x160
[ 168.251050] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 168.251284] RIP: 0033:0x7fcaa12b774d
Instead of fixing this with some kind of "is module initialized"
approach, this patch instead moves that functionality to procfs,
setting a write op for the existing open_dirs entry, where
writing a 0 to it will drop the cached directory entries.
Also make it available only when CONFIG_CIFS_DEBUG=y.
A small change needed now is to not call flush_delayed_work()
on invalidate_all_cached_dirs() when called from procfs (can't sleep in
that context).
So add a @sync arg to invalidate_all_cached_dirs() to control when to
flush the delayed works.
Fixes: dde6667fa3c8 ("smb: client: add drop_dir_cache module parameter to invalidate cached dirents")
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
parse_dacl() treats an ACE SID matching sid_unix_NFS_mode as an NFS
mode SID and reads sid.sub_auth[2] to recover the mode bits.
That assumes the ACE carries three subauthorities, but compare_sids()
only compares min(a, b) subauthorities. A malicious server can return
an ACE with num_subauth = 2 and sub_auth[] = {88, 3}, which still
matches sid_unix_NFS_mode and then drives the sub_auth[2] read four
bytes past the end of the ACE.
Require num_subauth >= 3 before treating the ACE as an NFS mode SID.
This keeps the fix local to the special-SID mode path without changing
compare_sids() semantics for the rest of cifsacl.
Fixes: e2f8fbfb8d09 ("cifs: get mode bits from special sid on stat")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
build_sec_desc() and id_mode_to_cifs_acl() derive a DACL pointer from a
server-supplied dacloffset and then use the incoming ACL to rebuild the
chmod/chown security descriptor.
The original fix only checked that the struct smb_acl header fits before
reading dacl_ptr->size or dacl_ptr->num_aces. That avoids the immediate
header-field OOB read, but the rewrite helpers still walk ACEs based on
pdacl->num_aces with no structural validation of the incoming DACL body.
A malicious server can return a truncated DACL that still contains a
header, claims one or more ACEs, and then drive
replace_sids_and_copy_aces() or set_chmod_dacl() past the validated
extent while they compare or copy attacker-controlled ACEs.
Factor the DACL structural checks into validate_dacl(), extend them to
validate each ACE against the DACL bounds, and use the shared validator
before the chmod/chown rebuild paths. parse_dacl() reuses the same
validator so the read-side parser and write-side rewrite paths agree on
what constitutes a well-formed incoming DACL.
Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2_ioctl_query_info() has two response-copy branches: PASSTHRU_FSCTL
and the default QUERY_INFO path. The QUERY_INFO branch clamps
qi.input_buffer_length to the server-reported OutputBufferLength and then
copies qi.input_buffer_length bytes from qi_rsp->Buffer to userspace, but
it never verifies that the flexible-array payload actually fits within
rsp_iov[1].iov_len.
A malicious server can return OutputBufferLength larger than the actual
QUERY_INFO response, causing copy_to_user() to walk past the response
buffer and expose adjacent kernel heap to userspace.
Guard the QUERY_INFO copy with a bounds check on the actual Buffer
payload. Use struct_size(qi_rsp, Buffer, qi.input_buffer_length)
rather than an open-coded addition so the guard cannot overflow on
32-bit builds.
Fixes: f5778c398713 ("SMB3: Allow SMB3 FSCTL queries to be sent to server from tools")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
in smb2_get_info_sec, a dummy security descriptor (SD) is returned if
the requested information is not supported.
the code is currently wrong, as DACL_PROTECTED is set in the type field,
but there is no DACL is present.
instead of faking a security, report a STATUS_NOT_SUPPORTED error.
this seems to fix a "Error 0x80090006: Invalid Signature" on file
transfers with Windows 11 clients (25H2, build 26200.8246).
capturing traffic shows that the client is sending a GET_INFO/SEC_INFO
request, with the additional_info field set to 0x20
(ATTRIBUTE_SECURITY_INFORMATION). Returning an empty SD
(with only SELF_RELATIVE set) does not fix the error.
Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When the binding SESSION_SETUP sets conn->binding = true, the flag stays
set after the call so that the global session lookup in
ksmbd_session_lookup_all() can find the session, which was not added to
conn->sessions. Because the flag is connection-wide, the global lookup
path will also resolve any other session by id if asked.
Tighten the global lookup so that the returned session must have this
connection registered in its channel xarray (sess->ksmbd_chann_list).
The channel entry is installed by the existing binding_session path in
ntlm_authenticate()/krb5_authenticate() when a SESSION_SETUP completes
successfully, so this condition is a strict equivalent of "this
connection has been accepted as a channel of this session". Connections
that have not bound to a given session cannot reach it via the global
table.
The existing conn->binding gate for entering the slowpath is preserved
so that non-binding connections keep the fast-path-only behavior, and
the session->state check is unchanged.
Fixes: f5a544e3bab7 ("ksmbd: add support for SMB3 multichannel")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2_open() attempts to clear conflicting CreateOptions bits
(FILE_SEQUENTIAL_ONLY_LE together with FILE_RANDOM_ACCESS_LE, and
FILE_NO_COMPRESSION_LE on a directory open), but uses a plain
assignment of the bitwise negation of the target flag:
req->CreateOptions = ~(FILE_SEQUENTIAL_ONLY_LE);
req->CreateOptions = ~(FILE_NO_COMPRESSION_LE);
This replaces the entire field with 0xFFFFFFFB / 0xFFFFFFEF rather
than clearing a single bit. With the SEQUENTIAL/RANDOM case, the
next check for FILE_OPEN_BY_FILE_ID_LE | CREATE_TREE_CONNECTION |
FILE_RESERVE_OPFILTER_LE then trivially matches and a legitimate
request is rejected with -EOPNOTSUPP. With the NO_COMPRESSION case,
every downstream test (FILE_DELETE_ON_CLOSE, etc.) operates on a
corrupted CreateOptions value.
Use &= ~FLAG to clear only the intended bit in both places.
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ksmbd_lookup_fd_cguid() returns a ksmbd_file with its refcount
incremented via ksmbd_fp_get(). parse_durable_handle_context() in
the DURABLE_REQ_V2 case properly releases this reference on every
path inside the ClientGUID-match branch, either by calling
ksmbd_put_durable_fd() or by transferring ownership to dh_info->fp
for a successful reconnect. However, when an entry exists in the
global file table with the same CreateGuid but a different
ClientGUID, the code simply falls through to the new-open path
without dropping the reference obtained from ksmbd_lookup_fd_cguid().
Per MS-SMB2 section 3.3.5.9.10 ("Handling the
SMB2_CREATE_DURABLE_HANDLE_REQUEST_V2 Create Context"), the server
MUST locate an Open whose Open.CreateGuid matches the request's
CreateGuid AND whose Open.ClientGuid matches the ClientGuid of the
connection that received the request. If no such Open is found, the
server MUST continue with the normal open execution phase. A
CreateGuid hit with a ClientGUID mismatch is therefore the
"Open not found" case: proceeding with a new open is correct, but
the reference obtained purely as a side effect of the lookup must
not be leaked.
Repeated requests that hit this mismatch pin global_ft entries,
prevent __ksmbd_close_fd() from ever running for the corresponding
files, and defeat the durable scavenger, leading to long-lived
resource leaks.
Release the reference in the mismatch path and clear dh_info->fp so
subsequent logic does not mistake a non-matching lookup result for
a reconnect target.
Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2_lock() performs O(N^2) conflict detection with no cap on LockCount.
Cap lock_count at 64 to prevent CPU exhaustion from a single request.
Signed-off-by: Akif Sait <akif.sait111@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When per-connection async_ida was converted from a dynamically
allocated ksmbd_ida to an embedded struct ida, ksmbd_ida_free() was
removed from the connection teardown path but no matching
ida_destroy() was added. The connection is therefore freed with the
IDA's backing xarray still intact.
The kernel IDA API expects ida_init() and ida_destroy() to be paired
over an object's lifetime, so add the missing cleanup before the
connection is freed.
No leak has been observed in testing; this is a pairing fix to match
the IDA lifetime rules, not a response to a reproduced regression.
Fixes: d40012a83f87 ("cifsd: declare ida statically")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|