summaryrefslogtreecommitdiff
path: root/fs/nfsd
AgeCommit message (Collapse)Author
12 daysMerge tag 'nfsd-7.1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Regressions: - Tighten bounds checking for sunrpc cache hash tables - Don't report key material in the ftrace log Stable fix: - Fix lockd's implementation of the NLM TEST procedure" * tag 'nfsd-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: lockd: fix TEST handling when not all permissions are available. NFSD: Report whether fh_key was actually updated sunrpc: prevent out-of-bounds read in __cache_seq_start()
2026-05-21NFSD: Report whether fh_key was actually updatedChuck Lever
The nfsd_ctl_fh_key_set tracepoint was introduced to capture operator activity on the filehandle signing key. Earlier revisions logged the key bytes verbatim; the version that landed hashes the 16 key bytes through crc32_le and stores the result. CRC32 is a linear projection of its input rather than a one-way function, and truncating any hash of fixed-size secret material leaves the key recoverable under offline brute force when the threat model includes an attacker with access to the trace ring. The operational question the fingerprint was meant to answer is whether a NFSD_CMD_THREADS_SET call that carries an NFSD_A_SERVER_FH_KEY attribute actually replaced the active key or re-installed the value already in place. Answer that question directly: compare the incoming key bytes against the current nn->fh_key inside nfsd_nl_fh_key_set() and surface a single bit to the tracepoint. The event now prints "updated" when the stored key changed and "unmodified" otherwise. A first set that fails kmalloc reports "unmodified" because no key was installed. Reported-by: jaeyeong <fin@spl.team> Fixes: 62346217fd72 ("NFSD: Add a key for signing filehandles") Cc: Benjamin Coddington <bcodding@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-05-15Merge tag 'nfsd-7.1-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Fixes for this release: - Correctness fix for the new sunrpc cache netlink protocol Marked for stable: - Correctness fixes for delegated attributes - Prevent an infinite loop when revoking layouts" * tag 'nfsd-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Fix infinite loop in layout state revocation sunrpc: start cache request seqno at 1 to fix netlink GET_REQS nfsd: update mtime/ctime on COPY in presence of delegated attributes nfsd: update mtime/ctime on CLONE in presense of delegated attributes nfsd: fix file change detection in CB_GETATTR nfsd: fix GET_DIR_DELEGATION when VFS leases are disabled
2026-05-10NFSD: Fix infinite loop in layout state revocationChuck Lever
find_one_sb_stid() skips stids whose sc_status is non-zero, but the SC_TYPE_LAYOUT case in nfsd4_revoke_states() never sets sc_status before calling nfsd4_close_layout(). The retry loop therefore finds the same layout stid on every iteration, hanging the revoker indefinitely. Fixes: 1e33e1414bec ("nfsd: allow layout state to be admin-revoked.") Reported-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-05-10nfsd: update mtime/ctime on COPY in presence of delegated attributesOlga Kornievskaia
When delegated attributes are given on open, the file is opened with NOCMTIME and modifying operations do not update mtime/ctime as to not get out-of-sync with the client's delegated view. However, for COPY operation, the server should update its view of mtime/ctime and reflect that in any GETATTR queries. Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-05-10nfsd: update mtime/ctime on CLONE in presense of delegated attributesOlga Kornievskaia
When delegated attributes are given on open, the file is opened with NOCMTIME and modifying operations do not update mtime/ctime as to not get out-of-sync with the client's delegated view. However, for CLONE operation, the server should update its view of mtime/ctime and reflect that in any GETATTR queries. Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-05-10nfsd: fix file change detection in CB_GETATTRScott Mayhew
RFC 8881, section 10.4.3 doesn't say anything about caching the file size in the delegation record, nor does it say anything about comparing a cached file size with the size reported by the client in the CB_GETATTR reply for the purpose of determining if the client holds modified data for the file. What section 10.4.3 of RFC 8881 does say is that the server should compare the *current* file size with the size reported by the client holding the delegation in the CB_GETATTR reply, and if they differ to treat it as a modification regardless of the change attribute retrieved via the CB_GETATTR. Doing otherwise would cause the server to believe the client holding the delegation has a modified version of the file, even if the client flushed the modifications to the server prior to the CB_GETATTR. This would have the added side effect of subsequent CB_GETATTRs causing updates to the mtime, ctime, and change attribute even if the client holding the delegation makes no further updates to the file. Modify nfsd4_deleg_getattr_conflict() to obtain the current file size via i_size_read(). Retain the ncf_cur_fsize field, since it's a convenient way to return the file size back to nfsd4_encode_fattr4(), but don't use it for the purpose of detecting file changes. Remove the unnecessary initialization of ncf_cur_fsize in nfs4_open_delegation(). Also, if we recall the delegation (because the client didn't respond to the CB_GETATTR), then skip the logic that checks the nfs4_cb_fattr fields. Fixes: c5967721e106 ("NFSD: handle GETATTR conflict with write delegation") Cc: stable@vger.kernel.org Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-04-26nfsd: fix GET_DIR_DELEGATION when VFS leases are disabledOlga Kornievskaia
When leases are disabled on the server, running xfstest generic/309 leads to an error because GET_DIR_DELEGATION returns EINVAL. nfsd_get_dir_deleg() can fail in several ways: like memory allocation and unable to get a lease because either leases are disable or it's already held. Currently only the condition "already held" is translated to returning directory-delegation-is-unavailable error. However, other failure conditions are likely temporary and thus should result in the same kind of error. Fixes: 8b99f6a8c116 ("nfsd: wire up GET_DIR_DELEGATION handling") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-04-20Merge tag 'nfsd-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds
Pull nfsd updates from Chuck Lever: - filehandle signing to defend against filehandle-guessing attacks (Benjamin Coddington) The server now appends a SipHash-2-4 MAC to each filehandle when the new "sign_fh" export option is enabled. NFSD then verifies filehandles received from clients against the expected MAC; mismatches return NFS error STALE - convert the entire NLMv4 server-side XDR layer from hand-written C to xdrgen-generated code, spanning roughly thirty patches (Chuck Lever) XDR functions are generally boilerplate code and are easy to get wrong. The goals of this conversion are improved memory safety, lower maintenance burden, and groundwork for eventual Rust code generation for these functions. - improve pNFS block/SCSI layout robustness with two related changes (Dai Ngo) SCSI persistent reservation fencing is now tracked per client and per device via an xarray, to avoid both redundant preempt operations on devices already fenced and a potential NFSD deadlock when all nfsd threads are waiting for a layout return. - scalability and infrastructure improvements Sincere thanks to all contributors, reviewers, testers, and bug reporters who participated in the v7.1 NFSD development cycle. * tag 'nfsd-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (83 commits) NFSD: Docs: clean up pnfs server timeout docs nfsd: fix comment typo in nfsxdr nfsd: fix comment typo in nfs3xdr NFSD: convert callback RPC program to per-net namespace NFSD: use per-operation statidx for callback procedures svcrdma: Use contiguous pages for RDMA Read sink buffers SUNRPC: Add svc_rqst_page_release() helper SUNRPC: xdr.h: fix all kernel-doc warnings svcrdma: Factor out WR chain linking into helper svcrdma: Add Write chunk WRs to the RPC's Send WR chain svcrdma: Clean up use of rdma->sc_pd->device svcrdma: Clean up use of rdma->sc_pd->device in Receive paths svcrdma: Add fair queuing for Send Queue access SUNRPC: Optimize rq_respages allocation in svc_alloc_arg SUNRPC: Track consumed rq_pages entries svcrdma: preserve rq_next_page in svc_rdma_save_io_pages SUNRPC: Handle NULL entries in svc_rqst_release_pages SUNRPC: Allocate a separate Reply page array SUNRPC: Tighten bounds checking in svc_rqst_replace_page NFSD: Sign filehandles ...
2026-04-13Merge tag 'vfs-7.1-rc1.kino' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs i_ino updates from Christian Brauner: "For historical reasons, the inode->i_ino field is an unsigned long, which means that it's 32 bits on 32 bit architectures. This has caused a number of filesystems to implement hacks to hash a 64-bit identifier into a 32-bit field, and deprives us of a universal identifier field for an inode. This changes the inode->i_ino field from an unsigned long to a u64. This shouldn't make any material difference on 64-bit hosts, but 32-bit hosts will see struct inode grow by at least 4 bytes. This could have effects on slabcache sizes and field alignment. The bulk of the changes are to format strings and tracepoints, since the kernel itself doesn't care that much about the i_ino field. The first patch changes some vfs function arguments, so check that one out carefully. With this change, we may be able to shrink some inode structures. For instance, struct nfs_inode has a fileid field that holds the 64-bit inode number. With this set of changes, that field could be eliminated. I'd rather leave that sort of cleanups for later just to keep this simple" * tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: nilfs2: fix 64-bit division operations in nilfs_bmap_find_target_in_group() EVM: add comment describing why ino field is still unsigned long vfs: remove externs from fs.h on functions modified by i_ino widening treewide: fix missed i_ino format specifier conversions ext4: fix signed format specifier in ext4_load_inode trace event treewide: change inode->i_ino from unsigned long to u64 nilfs2: widen trace event i_ino fields to u64 f2fs: widen trace event i_ino fields to u64 ext4: widen trace event i_ino fields to u64 zonefs: widen trace event i_ino fields to u64 hugetlbfs: widen trace event i_ino fields to u64 ext2: widen trace event i_ino fields to u64 cachefiles: widen trace event i_ino fields to u64 vfs: widen trace event i_ino fields to u64 net: change sock.sk_ino and sock_i_ino() to u64 audit: widen ino fields to u64 vfs: widen inode hash/lookup functions to u64
2026-04-13Merge tag 'vfs-7.1-rc1.directory' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs directory updates from Christian Brauner: "Recently 'start_creating', 'start_removing', 'start_renaming' and related interfaces were added which combine the locking and the lookup. At that time many callers were changed to use the new interfaces. However there are still an assortment of places out side of the core vfs where the directory is locked explictly, whether with inode_lock() or lock_rename() or similar. These were missed in the first pass for an assortment of uninteresting reasons. This addresses the remaining places where explicit locking is used, and changes them to use the new interfaces, or otherwise removes the explicit locking. The biggest changes are in overlayfs. The other changes are quite simple, though maybe the cachefiles changes is the least simple of those" * tag 'vfs-7.1-rc1.directory' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: VFS: unexport lock_rename(), lock_rename_child(), unlock_rename() ovl: remove ovl_lock_rename_workdir() ovl: use is_subdir() for testing if one thing is a subdir of another ovl: change ovl_create_real() to get a new lock when re-opening created file. ovl: pass name buffer to ovl_start_creating_temp() cachefiles: change cachefiles_bury_object to use start_renaming_dentry() ovl: Simplify ovl_lookup_real_one() VFS: make lookup_one_qstr_excl() static. nfsd: switch purge_old() to use start_removing_noperm() selinux: Use simple_start_creating() / simple_done_creating() Apparmor: Use simple_start_creating() / simple_done_creating() libfs: change simple_done_creating() to use end_creating() VFS: move the start_dirop() kerndoc comment to before start_dirop() fs/proc: Don't lock root inode when creating "self" and "thread-self" VFS: note error returns in documentation for various lookup functions
2026-04-03nfsd: fix comment typo in nfsxdrJoseph Salisbury
The file contains a spelling error in a source comment (occured). Typos in comments reduce readability and make text searches less reliable for developers and maintainers. Replace 'occured' with 'occurred' in the affected comment. This is a comment-only cleanup and does not change behavior. Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-04-03nfsd: fix comment typo in nfs3xdrJoseph Salisbury
The file contains a spelling error in a source comment (occured). Typos in comments reduce readability and make text searches less reliable for developers and maintainers. Replace 'occured' with 'occurred' in the affected comment. This is a comment-only cleanup and does not change behavior. Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-04-03NFSD: convert callback RPC program to per-net namespaceDai Ngo
The callback channel's rpc_program, rpc_version, rpc_stat, and per-procedure counts are declared as file-scope statics in nfs4callback.c, shared across all network namespaces. Forechannel RPC statistics are already maintained per-netns (via nfsd_svcstats in struct nfsd_net); the backchannel has no such separation. When backchannel statistics are eventually surfaced to userspace, the global counters would expose cross-namespace data. Allocate per-netns copies of these structures through a new opaque struct nfsd_net_cb, managed by nfsd_net_cb_init() and nfsd_net_cb_shutdown(). The struct definition is private to nfs4callback.c; struct nfsd_net holds only a pointer. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-04-03NFSD: use per-operation statidx for callback proceduresChuck Lever
The callback RPC procedure table uses NFSPROC4_CB_##call for p_statidx, which maps CB_NULL to index 0 and every compound-based callback (CB_RECALL, CB_LAYOUT, CB_OFFLOAD, etc.) to index 1. All compound callback operations therefore share a single statistics counter, making per-operation accounting impossible. Assign p_statidx from the NFSPROC4_CLNT_##proc enum instead, giving each callback operation its own counter slot. The counts array is already sized by ARRAY_SIZE(nfs4_cb_procedures), so no allocation change is needed. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29NFSD: Sign filehandlesBenjamin Coddington
NFS clients may bypass restrictive directory permissions by using open_by_handle() (or other available OS system call) to guess the filehandles for files below that directory. In order to harden knfsd servers against this attack, create a method to sign and verify filehandles using SipHash-2-4 as a MAC (Message Authentication Code). According to https://cr.yp.to/siphash/siphash-20120918.pdf, SipHash can be used as a MAC, and our use of SipHash-2-4 provides a low 1 in 2^64 chance of forgery. Filehandles that have been signed cannot be tampered with, nor can clients reasonably guess correct filehandles and hashes that may exist in parts of the filesystem they cannot access due to directory permissions. Append the 8 byte SipHash to encoded filehandles for exports that have set the "sign_fh" export option. Filehandles received from clients are verified by comparing the appended hash to the expected hash. If the MAC does not match the server responds with NFS error _STALE. If unsigned filehandles are received for an export with "sign_fh" they are rejected with NFS error _STALE. Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29NFSD/export: Add sign_fh export optionBenjamin Coddington
In order to signal that filehandles on this export should be signed, add a "sign_fh" export option. Filehandle signing can help the server defend against certain filehandle guessing attacks. Setting the "sign_fh" export option sets NFSEXP_SIGN_FH. In a future patch NFSD uses this signal to append a MAC onto filehandles for that export. While we're in here, tidy a few stray expflags to more closely align to the export flag order. Link: https://lore.kernel.org/linux-nfs/cover.1772022373.git.bcodding@hammerspace.com Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29NFSD: Add a key for signing filehandlesBenjamin Coddington
A future patch will enable NFSD to sign filehandles by appending a Message Authentication Code(MAC). To do this, NFSD requires a secret 128-bit key that can persist across reboots. A persisted key allows the server to accept filehandles after a restart. Enable NFSD to be configured with this key via the netlink interface. Link: https://lore.kernel.org/linux-nfs/cover.1772022373.git.bcodding@hammerspace.com Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29nfsd: use dynamic allocation for oversized NFSv4.0 replay cacheChuck Lever
Commit 1e8e9913672a ("nfsd: fix heap overflow in NFSv4.0 LOCK replay cache") capped the replay cache copy at NFSD4_REPLAY_ISIZE to prevent a heap overflow, but set rp_buflen to zero when the encoded response exceeded the inline buffer. A retransmitted LOCK reaching the replay path then produced only a status code with no operation body, resulting in a malformed XDR response. When the encoded response exceeds the 112-byte inline rp_ibuf, a buffer is kmalloc'd to hold it. If the allocation fails, rp_buflen remains zero, preserving the behavior from the capped-copy fix. The buffer is freed when the stateowner is released or when a subsequent operation's response fits in the inline buffer. Fixes: 1e8e9913672a ("nfsd: fix heap overflow in NFSv4.0 LOCK replay cache") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29nfsd: convert global state_lock to per-net deleg_lockJeff Layton
Replace the global state_lock spinlock with a per-nfsd_net deleg_lock. The state_lock was only used to protect delegation lifecycle operations (the del_recall_lru list and delegation hash/unhash), all of which are scoped to a single network namespace. Making the lock per-net removes a source of unnecessary contention between containers. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29NFSD: Enforce timeout on layout recall and integrate lease manager fencingDai Ngo
When a layout conflict triggers a recall, enforcing a timeout is necessary to prevent excessive nfsd threads from being blocked in __break_lease ensuring the server continues servicing incoming requests efficiently. This patch introduces a new function to lease_manager_operations: lm_breaker_timedout: Invoked when a lease recall times out and is about to be disposed of. This function enables the lease manager to inform the caller whether the file_lease should remain on the flc_list or be disposed of. For the NFSD lease manager, this function now handles layout recall timeouts. If the layout type supports fencing and the client has not been fenced, a fence operation is triggered to prevent the client from accessing the block device. While the fencing operation is in progress, the conflicting file_lease remains on the flc_list until fencing is complete. This guarantees that no other clients can access the file, and the client with exclusive access is properly blocked before disposal. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29NFSD: fix nfs4_file access extra count in nfsd4_add_rdaccess_to_wrdelegDai Ngo
In nfsd4_add_rdaccess_to_wrdeleg, if fp->fi_fds[O_RDONLY] is already set by another thread, __nfs4_file_get_access should not be called to increment the nfs4_file access count since that was already done by the thread that added READ access to the file. The extra fi_access count in nfs4_file can prevent the corresponding nfsd_file from being freed. When stopping nfs-server service, these extra access counts trigger a BUG in kmem_cache_destroy() that shows nfsd_file object remaining on __kmem_cache_shutdown. This problem can be reproduced by running the Git project's test suite over NFS. Fixes: 8072e34e1387 ("nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg()") Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29sunrpc: Kill RPC_IFDEBUG()Andy Shevchenko
RPC_IFDEBUG() is used in only two places. In one the user of the definition is guarded by ifdeffery, in the second one it's implied due to dprintk() usage. Kill the macro and move the ifdeffery to the regular condition with the variable defined inside, while in the second case add the same conditional and move the respective code there. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Relocate nlmsvc_unlock API declarationsChuck Lever
The nlmsvc_unlock_all_by_sb() and nlmsvc_unlock_all_by_ip() functions are part of lockd's external API, consumed by other kernel subsystems. Their declarations currently reside in linux/lockd/lockd.h alongside internal implementation details, which blurs the boundary between lockd's public interface and its private internals. Moving these declarations to linux/lockd/bind.h groups them with other external API functions and makes the separation explicit. This clarifies which functions are intended for external use and reduces the risk of internal implementation details leaking into the public API surface. Build-tested with allyesconfig; no functional changes. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Have nlm_fopen() return errno valuesChuck Lever
The nlm_fopen() function is part of the API between nfsd and lockd. Currently its return value is an on-the-wire NLM status code. But that forces NFSD to include NLM wire protocol definitions despite having no other dependency on the NLM wire protocol. In addition, a CONFIG_LOCKD_V4 Kconfig symbol appears in the middle of NFSD source code. Refactor: Let's not use on-the-wire values as part of a high-level API between two Linux kernel modules. That's what we have errno for, right? And, instead of simply moving the CONFIG_LOCKD_V4 check, we can get rid of it entirely and let the decision of what actual NLM status code goes on the wire to be left up to NLM version-specific code. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Relocate and rename nlm_drop_replyChuck Lever
The nlm_drop_reply status code is internal to the kernel's lockd implementation and must never appear on the wire. Its previous location in xdr.h grouped it with legitimate NLM protocol status codes, obscuring this critical distinction. Relocate the definition to lockd.h with a comment block for internal status codes, and rename to nlm__int__drop_reply to make its internal-only nature explicit. This prepares for adding additional internal status codes in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29nfsd: remove NFSD_V4_DELEG_TIMESTAMPS Kconfig optionJeff Layton
Now that there is a runtime debugfs switch, eliminate the compile-time switch and always build in support for delegated timestamps. Administrators who previously disabled this feature at compile time can disable it at runtime via: echo 0 > /sys/kernel/debug/nfsd/delegated_timestamps Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29nfsd: add a runtime switch for disabling delegated timestampsJeff Layton
The delegated timestamp code seems to be working well enough now that we want to make it always be built in. In the event that there are problems though, we still want to be able to disable them for debugging purposes. Add a switch to debugfs to enable them at runtime. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29NFSD: Track SCSI Persistent Registration Fencing per Client with xarrayDai Ngo
When a client holding pNFS SCSI layouts becomes unresponsive, the server revokes access by preempting the client's SCSI persistent reservation key. A layout recall is issued for each layout the client holds; if the client fails to respond, each recall triggers a fence operation. The first preempt for a given device succeeds and removes the client's key registration. Subsequent preempts for the same device fail because the key is no longer registered. Update the NFS server to handle SCSI persistent registration fencing on a per-client and per-device basis by utilizing an xarray associated with the nfs4_client structure. Each xarray entry is indexed by the dev_t of a block device registered by the client. The entry maintains a flag indicating whether this device has already been fenced for the corresponding client. When the server issues a persistent registration key to a client, it creates a new xarray entry at the dev_t index with the fenced flag initialized to 0. Before performing a fence via nfsd4_scsi_fence_client, the server checks the corresponding entry using the device's dev_t. If the fenced flag is already set, the fence operation is skipped; otherwise, the flag is set to 1 and fencing proceeds. The xarray is destroyed when the nfs4_client is released in __destroy_client. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29nfsd/sunrpc: move rq_cachetype into struct nfsd_thread_local_infoJeff Layton
The svc_rqst->rq_cachetype field is only accessed by nfsd. Move it into the nfsd_thread_local_info instead. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29nfsd/sunrpc: add svc_rqst->rq_private pointer and remove rq_lease_breakerJeff Layton
rq_lease_breaker has always been a NFSv4 specific layering violation in svc_rqst. The reason it's there though is that we need a place that is thread-local, and accessible from the svc_rqst pointer. Add a new rq_private pointer to struct svc_rqst. This is intended for use by the threads that are handling the service. sunrpc code doesn't touch it. In nfsd, define a new struct nfsd_thread_local_info. nfsd declares one of these on the stack and puts a pointer to it in rq_private. Add a new ntli_lease_breaker field to the new struct and convert all of the places that access rq_lease_breaker to use the new field instead. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-18Merge tag 'nfsd-7.0-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix cache_request leak in cache_release() - Fix heap overflow in the NFSv4.0 LOCK replay cache - Hold net reference for the lifetime of /proc/fs/nfs/exports fd - Defer sub-object cleanup in export "put" callbacks * tag 'nfsd-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: fix heap overflow in NFSv4.0 LOCK replay cache sunrpc: fix cache_request leak in cache_release NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd NFSD: Defer sub-object cleanup in export put callbacks
2026-03-16nfsd: fix heap overflow in NFSv4.0 LOCK replay cacheJeff Layton
The NFSv4.0 replay cache uses a fixed 112-byte inline buffer (rp_ibuf[NFSD4_REPLAY_ISIZE]) to store encoded operation responses. This size was calculated based on OPEN responses and does not account for LOCK denied responses, which include the conflicting lock owner as a variable-length field up to 1024 bytes (NFS4_OPAQUE_LIMIT). When a LOCK operation is denied due to a conflict with an existing lock that has a large owner, nfsd4_encode_operation() copies the full encoded response into the undersized replay buffer via read_bytes_from_xdr_buf() with no bounds check. This results in a slab-out-of-bounds write of up to 944 bytes past the end of the buffer, corrupting adjacent heap memory. This can be triggered remotely by an unauthenticated attacker with two cooperating NFSv4.0 clients: one sets a lock with a large owner string, then the other requests a conflicting lock to provoke the denial. We could fix this by increasing NFSD4_REPLAY_ISIZE to allow for a full opaque, but that would increase the size of every stateowner, when most lockowners are not that large. Instead, fix this by checking the encoded response length against NFSD4_REPLAY_ISIZE before copying into the replay buffer. If the response is too large, set rp_buflen to 0 to skip caching the replay payload. The status is still cached, and the client already received the correct response on the original request. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@kernel.org Reported-by: Nicholas Carlini <npc@anthropic.com> Tested-by: Nicholas Carlini <npc@anthropic.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-14NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fdChuck Lever
The /proc/fs/nfs/exports proc entry is created at module init and persists for the module's lifetime. exports_proc_open() captures the caller's current network namespace and stores its svc_export_cache in seq->private, but takes no reference on the namespace. If the namespace is subsequently torn down (e.g. container destruction after the opener does setns() to a different namespace), nfsd_net_exit() calls nfsd_export_shutdown() which frees the cache. Subsequent reads on the still-open fd dereference the freed cache_detail, walking a freed hash table. Hold a reference on the struct net for the lifetime of the open file descriptor. This prevents nfsd_net_exit() from running -- and thus prevents nfsd_export_shutdown() from freeing the cache -- while any exports fd is open. cache_detail already stores its net pointer (cd->net, set by cache_create_net()), so exports_release() can retrieve it without additional per-file storage. Reported-by: Misbah Anjum N <misanjum@linux.ibm.com> Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/ Fixes: 96d851c4d28d ("nfsd: use proper net while reading "exports" file") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Tested-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-14NFSD: Defer sub-object cleanup in export put callbacksChuck Lever
svc_export_put() calls path_put() and auth_domain_put() immediately when the last reference drops, before the RCU grace period. RCU readers in e_show() and c_show() access both ex_path (via seq_path/d_path) and ex_client->name (via seq_escape) without holding a reference. If cache_clean removes the entry and drops the last reference concurrently, the sub-objects are freed while still in use, producing a NULL pointer dereference in d_path. Commit 2530766492ec ("nfsd: fix UAF when access ex_uuid or ex_stats") moved kfree of ex_uuid and ex_stats into the call_rcu callback, but left path_put() and auth_domain_put() running before the grace period because both may sleep and call_rcu callbacks execute in softirq context. Replace call_rcu/kfree_rcu with queue_rcu_work(), which defers the callback until after the RCU grace period and executes it in process context where sleeping is permitted. This allows path_put() and auth_domain_put() to be moved into the deferred callback alongside the other resource releases. Apply the same fix to expkey_put(), which has the identical pattern with ek_path and ek_client. A dedicated workqueue scopes the shutdown drain to only NFSD export release work items; flushing the shared system_unbound_wq would stall on unrelated work from other subsystems. nfsd_export_shutdown() uses rcu_barrier() followed by flush_workqueue() to ensure all deferred release callbacks complete before the export caches are destroyed. Reported-by: Misbah Anjum N <misanjum@linux.ibm.com> Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/ Fixes: c224edca7af0 ("nfsd: no need get cache ref when protected by rcu") Fixes: 1b10f0b603c0 ("SUNRPC: no need get cache ref when protected by rcu") Cc: stable@vger.kernel.org Reviwed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Tested-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-06treewide: change inode->i_ino from unsigned long to u64Jeff Layton
On 32-bit architectures, unsigned long is only 32 bits wide, which causes 64-bit inode numbers to be silently truncated. Several filesystems (NFS, XFS, BTRFS, etc.) can generate inode numbers that exceed 32 bits, and this truncation can lead to inode number collisions and other subtle bugs on 32-bit systems. Change the type of inode->i_ino from unsigned long to u64 to ensure that inode numbers are always represented as 64-bit values regardless of architecture. Update all format specifiers treewide from %lu/%lx to %llu/%llx to match the new type, along with corresponding local variable types. This is the bulk treewide conversion. Earlier patches in this series handled trace events separately to allow trace field reordering for better struct packing on 32-bit. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260304-iino-u64-v3-12-2257ad83d372@kernel.org Acked-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-06nfsd: switch purge_old() to use start_removing_noperm()NeilBrown
Rather than explicit locking, use the start_removing_noperm() and end_removing() wrappers. This was not done with other start_removing changes due to conflicting in-flight patches. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neil@brown.name> Link: https://patch.msgid.link/20260224222542.3458677-8-neilb@ownmail.net Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-02Merge tag 'nfsd-7.0-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Restore previous nfsd thread count reporting behavior - Fix credential reference leaks in the NFSD netlink admin protocol * tag 'nfsd-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: report the requested maximum number of threads instead of number running nfsd: Fix cred ref leak in nfsd_nl_listener_set_doit(). nfsd: Fix cred ref leak in nfsd_nl_threads_set_doit().
2026-02-24nfsd: report the requested maximum number of threads instead of number runningJeff Layton
The current netlink and /proc interfaces deviate from their traditional values when dynamic threading is enabled, and there is currently no way to know what the current setting is. This patch brings the reporting back in line with traditional behavior. Make these interfaces report the requested maximum number of threads instead of the number currently running. Also, update documentation and comments to reflect that this value represents a maximum and not the number currently running. Fixes: d8316b837c2c ("nfsd: add controls to set the minimum number of threads per pool") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-02-22Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL usesKees Cook
Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert more 'alloc_obj' cases to default GFP_KERNEL argumentsLinus Torvalds
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_flex' family to use the new default GFP_KERNEL argumentLinus Torvalds
This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the *alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs*'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-16Merge tag 'vfs-7.0-rc1.misc.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull more misc vfs updates from Christian Brauner: "Features: - Optimize close_range() from O(range size) to O(active FDs) by using find_next_bit() on the open_fds bitmap instead of linearly scanning the entire requested range. This is a significant improvement for large-range close operations on sparse file descriptor tables. - Add FS_XFLAG_VERITY file attribute for fs-verity files, retrievable via FS_IOC_FSGETXATTR and file_getattr(). The flag is read-only. Add tracepoints for fs-verity enable and verify operations, replacing the previously removed debug printk's. - Prevent nfsd from exporting special kernel filesystems like pidfs and nsfs. These filesystems have custom ->open() and ->permission() export methods that are designed for open_by_handle_at(2) only and are incompatible with nfsd. Update the exportfs documentation accordingly. Fixes: - Fix KMSAN uninit-value in ovl_fill_real() where strcmp() was used on a non-null-terminated decrypted directory entry name from fscrypt. This triggered on encrypted lower layers when the decrypted name buffer contained uninitialized tail data. The fix also adds VFS-level name_is_dot(), name_is_dotdot(), and name_is_dot_dotdot() helpers, replacing various open-coded "." and ".." checks across the tree. - Fix read-only fsflags not being reset together with xflags in vfs_fileattr_set(). Currently harmless since no read-only xflags overlap with flags, but this would cause inconsistencies for any future shared read-only flag - Return -EREMOTE instead of -ESRCH from PIDFD_GET_INFO when the target process is in a different pid namespace. This lets userspace distinguish "process exited" from "process in another namespace", matching glibc's pidfd_getpid() behavior Cleanups: - Use C-string literals in the Rust seq_file bindings, replacing the kernel::c_str!() macro (available since Rust 1.77) - Fix typo in d_walk_ret enum comment, add porting notes for the readlink_copy() calling convention change" * tag 'vfs-7.0-rc1.misc.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: add porting notes about readlink_copy() pidfs: return -EREMOTE when PIDFD_GET_INFO is called on another ns nfsd: do not allow exporting of special kernel filesystems exportfs: clarify the documentation of open()/permission() expotrfs ops fsverity: add tracepoints fs: add FS_XFLAG_VERITY for fs-verity files rust: seq_file: replace `kernel::c_str!` with C-Strings fs: dcache: fix typo in enum d_walk_ret comment ovl: use name_is_dot* helpers in readdir code fs: add helpers name_is_dot{,dot,_dotdot} ovl: Fix uninit-value in ovl_fill_real fs: reset read-only fsflags together with xflags fs/file: optimize close_range() complexity from O(N) to O(Sparse)
2026-02-14nfsd: Fix cred ref leak in nfsd_nl_listener_set_doit().Kuniyuki Iwashima
nfsd_nl_listener_set_doit() uses get_current_cred() without put_cred(). As we can see from other callers, svc_xprt_create_from_sa() does not require the extra refcount. nfsd_nl_listener_set_doit() is always in the process context, sendmsg(), and current->cred does not go away. Let's use current_cred() in nfsd_nl_listener_set_doit(). Fixes: 16a471177496 ("NFSD: add listener-{set,get} netlink command") Cc: stable@vger.kernel.org Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-02-14nfsd: Fix cred ref leak in nfsd_nl_threads_set_doit().Kuniyuki Iwashima
syzbot reported memory leak of struct cred. [0] nfsd_nl_threads_set_doit() passes get_current_cred() to nfsd_svc(), but put_cred() is not called after that. The cred is finally passed down to _svc_xprt_create(), which calls get_cred() with the cred for struct svc_xprt. The ownership of the refcount by get_current_cred() is not transferred to anywhere and is just leaked. nfsd_svc() is also called from write_threads(), but it does not bump file->f_cred there. nfsd_nl_threads_set_doit() is called from sendmsg() and current->cred does not go away. Let's use current_cred() in nfsd_nl_threads_set_doit(). [0]: BUG: memory leak unreferenced object 0xffff888108b89480 (size 184): comm "syz-executor", pid 5994, jiffies 4294943386 hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 369454a7): kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline] slab_post_alloc_hook mm/slub.c:4958 [inline] slab_alloc_node mm/slub.c:5263 [inline] kmem_cache_alloc_noprof+0x412/0x580 mm/slub.c:5270 prepare_creds+0x22/0x600 kernel/cred.c:185 copy_creds+0x44/0x290 kernel/cred.c:286 copy_process+0x7a7/0x2870 kernel/fork.c:2086 kernel_clone+0xac/0x6e0 kernel/fork.c:2651 __do_sys_clone+0x7f/0xb0 kernel/fork.c:2792 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xa4/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 924f4fb003ba ("NFSD: convert write_threads to netlink command") Cc: stable@vger.kernel.org Reported-by: syzbot+dd3b43aa0204089217ee@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69744674.a00a0220.33ccc7.0000.GAE@google.com/ Tested-by: syzbot+dd3b43aa0204089217ee@syzkaller.appspotmail.com Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-02-12Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ...
2026-02-12Merge tag 'nfsd-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds
Pull nfsd updates from Chuck Lever: "Neil Brown and Jeff Layton contributed a dynamic thread pool sizing mechanism for NFSD. The sunrpc layer now tracks minimum and maximum thread counts per pool, and NFSD adjusts running thread counts based on workload: idle threads exit after a timeout when the pool exceeds its minimum, and new threads spawn automatically when all threads are busy. Administrators control this behavior via the nfsdctl netlink interface. Rick Macklem, FreeBSD NFS maintainer, generously contributed server- side support for the POSIX ACL extension to NFSv4, as specified in draft-ietf-nfsv4-posix-acls. This extension allows NFSv4 clients to get and set POSIX access and default ACLs using native NFSv4 operations, eliminating the need for sideband protocols. The feature is gated by a Kconfig option since the IETF draft has not yet been ratified. Chuck Lever delivered numerous improvements to the xdrgen tool. Error reporting now covers parsing, AST transformation, and invalid declarations. Generated enum decoders validate incoming values against valid enumerator lists. New features include pass-through line support for embedding C directives in XDR specifications, 16-bit integer types, and program number definitions. Several code generation issues were also addressed. When an administrator revokes NFSv4 state for a filesystem via the unlock_fs interface, ongoing async COPY operations referencing that filesystem are now cancelled, with CB_OFFLOAD callbacks notifying affected clients. The remaining patches in this pull request are clean-ups and minor optimizations. Sincere thanks to all contributors, reviewers, testers, and bug reporters who participated in the v7.0 NFSD development cycle" * tag 'nfsd-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (45 commits) NFSD: Add POSIX ACL file attributes to SUPPATTR bitmasks NFSD: Add POSIX draft ACL support to the NFSv4 SETATTR operation NFSD: Add support for POSIX draft ACLs for file creation NFSD: Add support for XDR decoding POSIX draft ACLs NFSD: Refactor nfsd_setattr()'s ACL error reporting NFSD: Do not allow NFSv4 (N)VERIFY to check POSIX ACL attributes NFSD: Add nfsd4_encode_fattr4_posix_access_acl NFSD: Add nfsd4_encode_fattr4_posix_default_acl NFSD: Add nfsd4_encode_fattr4_acl_trueform_scope NFSD: Add nfsd4_encode_fattr4_acl_trueform Add RPC language definition of NFSv4 POSIX ACL extension NFSD: Add a Kconfig setting to enable support for NFSv4 POSIX ACLs xdrgen: Implement pass-through lines in specifications nfsd: cancel async COPY operations when admin revokes filesystem state nfsd: add controls to set the minimum number of threads per pool nfsd: adjust number of running nfsd threads based on activity sunrpc: allow svc_recv() to return -ETIMEDOUT and -EBUSY sunrpc: split new thread creation into a separate function sunrpc: introduce the concept of a minimum number of threads per pool sunrpc: track the max number of requested threads in a pool ...
2026-02-09Merge tag 'vfs-7.0-rc1.atomic_open' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs atomic_open updates from Christian Brauner: "Allow knfsd to use atomic_open() While knfsd offers combined exclusive create and open results to clients, on some filesystems those results are not atomic. The separate vfs_create() + vfs_open() sequence in dentry_create() can produce races and unexpected errors. For example, open O_CREAT with mode 0 will succeed in creating the file but return -EACCES from vfs_open(). Additionally, network filesystems benefit from reducing remote round-trip operations by using a single atomic_open() call. Teach dentry_create() -- whose sole caller is knfsd -- to use atomic_open() for filesystems that support it" * tag 'vfs-7.0-rc1.atomic_open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs/namei: fix kernel-doc markup for dentry_create VFS/knfsd: Teach dentry_create() to use atomic_open() VFS: Prepare atomic_open() for dentry_create() VFS: move dentry_create() from fs/open.c to fs/namei.c