summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-12-05smb: rename to STATUS_SMB_NO_PREAUTH_INTEGRITY_HASH_OVERLAPChenXiaoSong
See MS-SMB2 3.3.5.4. To keep the name consistent with the documentation. Additionally, move STATUS_INVALID_LOCK_RANGE to correct position in order. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb/client: remove unused elements from smb2_error_map_table arrayChenXiaoSong
STATUS_SUCCESS and STATUS_WAIT_0 are both zero, and since zero indicates success, they are not needed. Since smb2_print_status() has been removed, the last element in the array is no longer needed. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb/client: reduce loop count in map_smb2_to_linux_error() by halfChenXiaoSong
The smb2_error_map_table array currently has 1743 elements. When searching for the last element and calling smb2_print_status(), 3486 comparisons are needed. The loop in smb2_print_status() is unnecessary, smb2_print_status() can be removed, and only iterate over the array once, printing the message when the target status code is found. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb: client: Add tracepoint for krb5 authPaulo Alcantara
Add tracepoint to help debugging krb5 auth failures. Example: $ trace-cmd record -e smb3_kerberos_auth $ mount.cifs ... $ trace-cmd report mount.cifs-1667 [003] ..... 5810.668549: smb3_kerberos_auth: vers=2 host=w22-dc1.zelda.test ip=192.168.124.30:445 sec=krb5 uid=0 cruid=0 user=root pid=1667 upcall_target=app err=-126 Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Cc: Pierguido Lambri <plambri@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb: client: improve error message when creating SMB sessionPaulo Alcantara
When failing to create a new SMB session with 'sec=krb5' for example, the following error message isn't very useful CIFS: VFS: \\srv Send error in SessSetup = -126 Improve it by printing the following instead on dmesg CIFS: VFS: \\srv failed to create a new SMB session with Kerberos: -126 Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Cc: Pierguido Lambri <plambri@redhat.com> Reviewed-by: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb: client: relax session and tcon reconnect attemptsPaulo Alcantara
When the client re-establishes connection to the server, it will queue a worker thread that will attempt to reconnect sessions and tcons on every two seconds, which is kinda overkill as it is a very common scenario when having expired passwords or KRB5 TGT tickets, or deleted shares. Use an exponential backoff strategy to handle session/tcon reconnect attempts in the worker thread to prevent the client from overloading the system when it is very unlikely to re-establish any session/tcon soon while client is idle. Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Cc: Pierguido Lambri <plambri@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05Merge tag 'fuse-update-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: - Add mechanism for cleaning out unused, stale dentries; controlled via a module option (Luis Henriques) - Fix various bugs - Cleanups * tag 'fuse-update-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: Uninitialized variable in fuse_epoch_work() fuse: fix io-uring list corruption for terminated non-committed requests fuse: signal that a fuse inode should exhibit local fs behaviors fuse: Always flush the page cache before FOPEN_DIRECT_IO write fuse: Invalidate the page cache after FOPEN_DIRECT_IO write fuse: rename 'namelen' to 'namesize' fuse: use strscpy instead of strcpy fuse: refactor fuse_conn_put() to remove negative logic. fuse: new work queue to invalidate dentries from old epochs fuse: new work queue to periodically invalidate expired dentries dcache: export shrink_dentry_list() and add new helper d_dispose_if_unused() fuse: add WARN_ON and comment for RCU revalidate fuse: Fix whitespace for fuse_uring_args_to_ring() comment fuse: missing copy_finish in fuse-over-io-uring argument copies fuse: fix readahead reclaim deadlock
2025-12-05mshv: Cleanly shutdown root partition with MSHVPraveen K Paladugu
Root partitions running on MSHV currently attempt ACPI power-off, which MSHV intercepts and triggers a Machine Check Exception (MCE), leading to a kernel panic. Root partitions panic with a trace similar to: [ 81.306348] reboot: Power down [ 81.314709] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 0: b2000000c0060001 [ 81.314711] mce: [Hardware Error]: TSC 3b8cb60a66 PPIN 11d98332458e4ea9 [ 81.314713] mce: [Hardware Error]: PROCESSOR 0:606a6 TIME 1759339405 SOCKET 0 APIC 0 microcode ffffffff [ 81.314715] mce: [Hardware Error]: Run the above through 'mcelog --ascii' [ 81.314716] mce: [Hardware Error]: Machine check: Processor context corrupt [ 81.314717] Kernel panic - not syncing: Fatal machine check To avoid this, configure the sleep state in the hypervisor and invoke the HVCALL_ENTER_SLEEP_STATE hypercall as the final step in the shutdown sequence. This ensures a clean and safe shutdown of the root partition. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-developed-by: Anatol Belski <anbelski@linux.microsoft.com> Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@linux.microsoft.com> Acked-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Use reboot notifier to configure sleep statePraveen K Paladugu
Configure sleep state information from ACPI in MSHV hypervisor using a reboot notifier. This data allows the hypervisor to correctly power off the host during shutdown. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-developed-by: Anatol Belski <anbelski@linux.microsoft.com> Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@linux.microsoft.com> Reviewed-by: Stansialv Kinsburskii <skinsburskii@linux.miscrosoft.com> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Add definitions for MSHV sleep state configurationPraveen K Paladugu
Add the definitions required to configure sleep states in mshv hypervsior. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-developed-by: Anatol Belski <anbelski@linux.microsoft.com> Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Acked-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Add support for movable memory regionsStanislav Kinsburskii
Introduce support for movable memory regions in the Hyper-V root partition driver to improve memory management flexibility and enable advanced use cases such as dynamic memory remapping. Mirror the address space between the Linux root partition and guest VMs using HMM. The root partition owns the memory, while guest VMs act as devices with page tables managed via hypercalls. MSHV handles VP intercepts by invoking hmm_range_fault() and updating SLAT entries. When memory is reclaimed, HMM invalidates the relevant regions, prompting MSHV to clear SLAT entries; guest VMs will fault again on access. Integrate mmu_interval_notifier for movable regions, implement handlers for HMM faults and memory invalidation, and update memory region mapping logic to support movable regions. While MMU notifiers are commonly used in virtualization drivers, this implementation leverages HMM (Heterogeneous Memory Management) for its specialized functionality. HMM provides a framework for mirroring, invalidation, and fault handling, reducing boilerplate and improving maintainability compared to generic MMU notifiers. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Add refcount and locking to mem regionsStanislav Kinsburskii
Introduce kref-based reference counting and spinlock protection for memory regions in Hyper-V partition management. This change improves memory region lifecycle management and ensures thread-safe access to the region list. Previously, the regions list was protected by the partition mutex. However, this approach is too heavy for frequent fault and invalidation operations. Finer grained locking is now used to improve efficiency and concurrency. This is a precursor to supporting movable memory regions. Fault and invalidation handling for movable regions will require safe traversal of the region list and holding a region reference while performing invalidation or fault operations. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Fix huge page handling in memory region traversalStanislav Kinsburskii
The previous code assumed that if a region's first page was huge, the entire region consisted of huge pages and stored this in a large_pages flag. This premise is incorrect not only for movable regions (where pages can be split and merged on invalidate callbacks or page faults), but even for pinned regions: THPs can be split and merged during allocation, so a large, pinned region may contain a mix of huge and regular pages. This change removes the large_pages flag and replaces region-wide assumptions with per-chunk inspection of the actual page size when mapping, unmapping, sharing, and unsharing. This makes huge page handling correct for mixed-page regions and avoids relying on stale metadata that can easily become invalid as memory is remapped. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Move region management to mshv_regions.cStanislav Kinsburskii
Refactor memory region management functions from mshv_root_main.c into mshv_regions.c for better modularity and code organization. Adjust function calls and headers to use the new implementation. Improve maintainability and separation of concerns in the mshv_root module. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Centralize guest memory region destructionStanislav Kinsburskii
Centralize guest memory region destruction to prevent resource leaks and inconsistent cleanup across unmap and partition destruction paths. Unify region removal, encrypted partition access recovery, and region invalidation to improve maintainability and reliability. Reduce code duplication and make future updates less error-prone by encapsulating cleanup logic in a single helper. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Refactor and rename memory region handling functionsStanislav Kinsburskii
Simplify and unify memory region management to improve code clarity and reliability. Consolidate pinning and invalidation logic, adopt consistent naming, and remove redundant checks to reduce complexity. Enhance documentation and update call sites for maintainability. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: adjust interrupt control structure for ARM64Jinank Jain
Interrupt control structure (union hv_interupt_control) has different fields when it comes to x86 vs ARM64. Bring in the correct structure from HyperV header files and adjust the existing interrupt routing code accordingly. Signed-off-by: Jinank Jain <jinankjain@microsoft.com> Signed-off-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05Drivers: hv: use kmalloc_array() instead of kmalloc()Gongwei Li
Replace kmalloc() with kmalloc_array() to prevent potential overflow, as recommended in Documentation/process/deprecated.rst. Signed-off-by: Gongwei Li <ligongwei@kylinos.cn> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Add ioctl for self targeted passthrough hvcallsAnirudh Rayabharam (Microsoft)
Allow MSHV_ROOT_HVCALL IOCTL on the /dev/mshv fd. This IOCTL would execute a passthrough hypercall targeting the root/parent partition i.e. HV_PARTITION_ID_SELF. This will be useful for the VMM to query things like supported synthetic processor features, supported VMM capabiliites etc. Since hypercalls targeting the host partition could potentially perform privileged operations, allow only a limited set of hypercalls. To begin with, allow only: HVCALL_GET_PARTITION_PROPERTY HVCALL_GET_PARTITION_PROPERTY_EX Signed-off-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05Drivers: hv: Introduce mshv_vtl driverNaman Jain
Provide an interface for Virtual Machine Monitor like OpenVMM and its use as OpenHCL paravisor to control VTL0 (Virtual trust Level). Expose devices and support IOCTLs for features like VTL creation, VTL0 memory management, context switch, making hypercalls, mapping VTL0 address space to VTL2 userspace, getting new VMBus messages and channel events in VTL2 etc. Co-developed-by: Roman Kisel <romank@linux.microsoft.com> Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Co-developed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Naman Jain <namjain@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05cifs: Fix handling of a beyond-EOF DIO/unbuffered read over SMB2David Howells
If a DIO read or an unbuffered read request extends beyond the EOF, the server will return a short read and a status code indicating that EOF was hit, which gets translated to -ENODATA. Note that the client does not cap the request at i_size, but asks for the amount requested in case there's a race on the server with a third party. Now, on the client side, the request will get split into multiple subrequests if rsize is smaller than the full request size. A subrequest that starts before or at the EOF and returns short data up to the EOF will be correctly handled, with the NETFS_SREQ_HIT_EOF flag being set, indicating to netfslib that we can't read more. If a subrequest, however, starts after the EOF and not at it, HIT_EOF will not be flagged, its error will be set to -ENODATA and it will be abandoned. This will cause the request as a whole to fail with -ENODATA. Fix this by setting NETFS_SREQ_HIT_EOF on any subrequest that lies beyond the EOF marker. Fixes: 1da29f2c39b6 ("netfs, cifs: Fix handling of short DIO read") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: client: allow changing multichannel mount options on remountRajasi Mandal
Previously, the client did not update a session's channel state when multichannel or max_channels mount options were changed via remount. This led to inconsistent behavior and prevented enabling or disabling multichannel support without a full unmount/remount cycle. Enable dynamic reconfiguration of multichannel and max_channels during remount by: - Introducing smb3_sync_ses_chan_max(), a centralized function for channel updates which synchronizes the session's channels with the updated configuration. - Replacing cifs_disable_secondary_channels() with cifs_decrease_secondary_channels(), which accepts a disable_mchan flag to support multichannel disable when the server stops supporting multichannel. - Updating remount logic to detect changes in multichannel or max_channels and trigger appropriate session/channel updates. Current limitation: - The query_interfaces worker runs even when max_channels=1 so that multichannel can be enabled later via remount without requiring an unmount. This is a temporary approach and may be refined in the future. Users can safely modify multichannel and max_channels on an existing mount. The client will correctly adjust the session's channel state to match the new configuration, preserving durability where possible and avoiding unnecessary disconnects. Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Rajasi Mandal <rajasimandal@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: Do some preparation prior to organising the function declarationsDavid Howells
Make some preparatory cleanups prior to running a script to organise the function declarations within the fs/smb/client/ headers. These include: (1) Remove "inline" from the dummy cifs_proc_init/clean() functions as they are in a .c file. (2) Move should_compress()'s kdoc comment to the .c file and remove kdoc markers from the comments. (3) Rename CIFS_ALLOW_INSECURE_LEGACY in #endif comments to have CONFIG_ on the front to allow the script to recognise it. (4) Don't let comments have bare words at the left margin as that confused the simplistic function detection code in the script. (5) Adjust some argument lists so that when and if the cleanup script is run they don't end up over 100 chars. (6) Fix a few comments to have missing '*' added or the "*/" moved to their own lines so that checkpatch doesn't moan over the cleanup script patch. (7) Move struct cifs_calc_sig_ctx to cifsglob.h. (8) Remove some __KERNEL__ conditionals. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: Add a tracepoint to log EIO errorsDavid Howells
Add a tracepoint to log EIO errors and give it the capacity to convey up to two integers of information. This is then wrapped with three functions: int smb_EIO(enum smb_eio_trace trace) int smb_EIO1(enum smb_eio_trace trace, unsigned long info) int smb_EIO2(enum smb_eio_trace trace, unsigned long info, unsigned long info2) depending on how many bits of info are desired to be logged with any particular trace. The functions all return -EIO and can be used in place of -EIO. The trace argument is an enum value that gets translated to a string when the trace is printed. This makes is easier to log EIO instances when the client is under high load than turning on a printk wrapper such as cifs_dbg(). Granted, EIO could have its own separate EIO printing since EIO shouldn't happen. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: Don't need state locking in smb2_get_mid_entry()David Howells
There's no need to get ->srv_lock or ->ses_lock in smb2_get_mid_entry() as all that happens of relevance (to the lock) inside the locked sections is the reading of one status value in each. Replace the locking with READ_ONCE() and use a switch instead of a chain of if-statements. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: Remove the server pointer from smb_messageDavid Howells
Remove the server pointer from smb_message and instead pass it down to all the things that access it. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> (RDMA, smbdirect) cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: Fix specification of function pointersDavid Howells
Change the mid_receive_t, mid_callback_t and mid_handle_t function pointers to have the pointer marker in the typedef. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: Replace SendReceiveBlockingLock() with SendReceive() plus flagsDavid Howells
Replace the smb1 transport's SendReceiveBlockingLock() with SendReceive() plus a couple of flags. This will then allow that to pick up the transport changes there. The first flag, CIFS_INTERRUPTIBLE_WAIT, is added to indicate that the wait should be interruptible and the second, CIFS_WINDOWS_LOCK, indicates that we need to send a Lock command with unlock type rather than a Cancel. send_lock_cancel() is then called from cifs_lock_cancel() which is called from the main transport loop in compound_send_recv(). [!] I *think* the error code handling is probably right. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: Clean up some places where an extra kvec[] was required for rfc1002David Howells
Clean up some places where previously an extra element in the kvec array was being used to hold an rfc1002 header for SMB1 (a previous patch removed this and generated it on the fly as for SMB2/3). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: Make smb1's SendReceive() wrap cifs_send_recv()David Howells
Make the smb1 transport's SendReceive() simply wrap cifs_send_recv() as does SendReceive2(). This will then allow that to pick up the transport changes there. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: Remove the RFC1002 header from smb_hdrDavid Howells
Remove the RFC1002 header from struct smb_hdr as used for SMB-1.0. This simplifies the SMB-1.0 code by simplifying a lot of places that have to add or subtract 4 to work around the fact that the RFC1002 header isn't really part of the message and the base for various offsets within the message is from the base of the smb_hdr, not the RFC1002 header. Further, clean up a bunch of places that require an extra kvec struct specifically pointing to the RFC1002 header, such that kvec[0].iov_base must be exactly 4 bytes before kvec[1].iov_base. This allows the header preamble size stuff to be removed too. The size of the request and response message are then handed around either directly or by summing the size of all the iov_len members in the kvec array for which we have a count. Also, this simplifies and cleans up the common transmission and receive paths for SMB1 and SMB2/3 as there no longer needs to be special handling casing for SMB1 messages as the RFC1002 header is now generated on the fly for SMB1 as it is for SMB2/3. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Tom Talpey <tom@talpey.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05cifs: Fix handling of a beyond-EOF DIO/unbuffered read over SMB1David Howells
If a DIO read or an unbuffered read request extends beyond the EOF, the server will return a short read and a status code indicating that EOF was hit, which gets translated to -ENODATA. Note that the client does not cap the request at i_size, but asks for the amount requested in case there's a race on the server with a third party. Now, on the client side, the request will get split into multiple subrequests if rsize is smaller than the full request size. A subrequest that starts before or at the EOF and returns short data up to the EOF will be correctly handled, with the NETFS_SREQ_HIT_EOF flag being set, indicating to netfslib that we can't read more. If a subrequest, however, starts after the EOF and not at it, HIT_EOF will not be flagged, its error will be set to -ENODATA and it will be abandoned. This will cause the request as a whole to fail with -ENODATA. Fix this by setting NETFS_SREQ_HIT_EOF on any subrequest that lies beyond the EOF marker. This can be reproduced by mounting with "cache=none,sign,vers=1.0" and doing a read of a file that's significantly bigger than the size of the file (e.g. attempting to read 64KiB from a 16KiB file). Fixes: a68c74865f51 ("cifs: Fix SMB1 readv/writev callback in the same way as SMB2/3") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05Merge tag 'pull-persistency' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull persistent dentry infrastructure and conversion from Al Viro: "Some filesystems use a kinda-sorta controlled dentry refcount leak to pin dentries of created objects in dcache (and undo it when removing those). A reference is grabbed and not released, but it's not actually _stored_ anywhere. That works, but it's hard to follow and verify; among other things, we have no way to tell _which_ of the increments is intended to be an unpaired one. Worse, on removal we need to decide whether the reference had already been dropped, which can be non-trivial if that removal is on umount and we need to figure out if this dentry is pinned due to e.g. unlink() not done. Usually that is handled by using kill_litter_super() as ->kill_sb(), but there are open-coded special cases of the same (consider e.g. /proc/self). Things get simpler if we introduce a new dentry flag (DCACHE_PERSISTENT) marking those "leaked" dentries. Having it set claims responsibility for +1 in refcount. The end result this series is aiming for: - get these unbalanced dget() and dput() replaced with new primitives that would, in addition to adjusting refcount, set and clear persistency flag. - instead of having kill_litter_super() mess with removing the remaining "leaked" references (e.g. for all tmpfs files that hadn't been removed prior to umount), have the regular shrink_dcache_for_umount() strip DCACHE_PERSISTENT of all dentries, dropping the corresponding reference if it had been set. After that kill_litter_super() becomes an equivalent of kill_anon_super(). Doing that in a single step is not feasible - it would affect too many places in too many filesystems. It has to be split into a series. This work has really started early in 2024; quite a few preliminary pieces have already gone into mainline. This chunk is finally getting to the meat of that stuff - infrastructure and most of the conversions to it. Some pieces are still sitting in the local branches, but the bulk of that stuff is here" * tag 'pull-persistency' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits) d_make_discardable(): warn if given a non-persistent dentry kill securityfs_recursive_remove() convert securityfs get rid of kill_litter_super() convert rust_binderfs convert nfsctl convert rpc_pipefs convert hypfs hypfs: swich hypfs_create_u64() to returning int hypfs: switch hypfs_create_str() to returning int hypfs: don't pin dentries twice convert gadgetfs gadgetfs: switch to simple_remove_by_name() convert functionfs functionfs: switch to simple_remove_by_name() functionfs: fix the open/removal races functionfs: need to cancel ->reset_work in ->kill_sb() functionfs: don't bother with ffs->ref in ffs_data_{opened,closed}() functionfs: don't abuse ffs_data_closed() on fs shutdown convert selinuxfs ...
2025-12-05Merge tag 'mm-stable-2025-12-03-21-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki) Rework the vmalloc() code to support non-blocking allocations (GFP_ATOIC, GFP_NOWAIT) "ksm: fix exec/fork inheritance" (xu xin) Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not inherited across fork/exec "mm/zswap: misc cleanup of code and documentations" (SeongJae Park) Some light maintenance work on the zswap code "mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira) Enhance the /sys/kernel/debug/page_owner debug feature by adding unique identifiers to differentiate the various stack traces so that userspace monitoring tools can better match stack traces over time "mm/page_alloc: pcp->batch cleanups" (Joshua Hahn) Minor alterations to the page allocator's per-cpu-pages feature "Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra) Address a scalability issue in userfaultfd's UFFDIO_MOVE operation "kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov) "drivers/base/node: fold node register and unregister functions" (Donet Tom) Clean up the NUMA node handling code a little "mm: some optimizations for prot numa" (Kefeng Wang) Cleanups and small optimizations to the NUMA allocation hinting code "mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn) Address long lock hold times at boot on large machines. These were causing (harmless) softlockup warnings "optimize the logic for handling dirty file folios during reclaim" (Baolin Wang) Remove some now-unnecessary work from page reclaim "mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park) Enhance the DAMOS auto-tuning feature "mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan) Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace configuration "expand mmap_prepare functionality, port more users" (Lorenzo Stoakes) Enhance the new(ish) file_operations.mmap_prepare() method and port additional callsites from the old ->mmap() over to ->mmap_prepare() "Fix stale IOTLB entries for kernel address space" (Lu Baolu) Fix a bug (and possible security issue on non-x86) in the IOMMU code. In some situations the IOMMU could be left hanging onto a stale kernel pagetable entry "mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang) Clean up and optimize the folio splitting code "mm, swap: misc cleanup and bugfix" (Kairui Song) Some cleanups and a minor fix in the swap discard code "mm/damon: misc documentation fixups" (SeongJae Park) "mm/damon: support pin-point targets removal" (SeongJae Park) Permit userspace to remove a specific monitoring target in the middle of the current targets list "mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo) A couple of cleanups related to mm header file inclusion "mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He) improve the selection of swap devices for NUMA machines "mm: Convert memory block states (MEM_*) macros to enums" (Israel Batista) Change the memory block labels from macros to enums so they will appear in kernel debug info "ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes) Address an inefficiency when KSM unmerges an address range "mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park) Fix leaks and unhandled malloc() failures in DAMON userspace unit tests "some cleanups for pageout()" (Baolin Wang) Clean up a couple of minor things in the page scanner's writeback-for-eviction code "mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu) Move hugetlb's sysfs/sysctl handling code into a new file "introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes) Make the VMA guard regions available in /proc/pid/smaps and improves the mergeability of guarded VMAs "mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes) Reduce mmap lock contention for callers performing VMA guard region operations "vma_start_write_killable" (Matthew Wilcox) Start work on permitting applications to be killed when they are waiting on a read_lock on the VMA lock "mm/damon/tests: add more tests for online parameters commit" (SeongJae Park) Add additional userspace testing of DAMON's "commit" feature "mm/damon: misc cleanups" (SeongJae Park) "make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes) Address the possible loss of a VMA's VM_SOFTDIRTY flag when that VMA is merged with another "mm: support device-private THP" (Balbir Singh) Introduce support for Transparent Huge Page (THP) migration in zone device-private memory "Optimize folio split in memory failure" (Zi Yan) "mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang) Some more cleanups in the folio splitting code "mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes) Clean up our handling of pagetable leaf entries by introducing the concept of 'software leaf entries', of type softleaf_t "reparent the THP split queue" (Muchun Song) Reparent the THP split queue to its parent memcg. This is in preparation for addressing the long-standing "dying memcg" problem, wherein dead memcg's linger for too long, consuming memory resources "unify PMD scan results and remove redundant cleanup" (Wei Yang) A little cleanup in the hugepage collapse code "zram: introduce writeback bio batching" (Sergey Senozhatsky) Improve zram writeback efficiency by introducing batched bio writeback support "memcg: cleanup the memcg stats interfaces" (Shakeel Butt) Clean up our handling of the interrupt safety of some memcg stats "make vmalloc gfp flags usage more apparent" (Vishal Moola) Clean up vmalloc's handling of incoming GFP flags "mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang) Teach soft dirty and userfaultfd write protect tracking to use RISC-V's Svrsw60t59b extension "mm: swap: small fixes and comment cleanups" (Youngjun Park) Fix a small bug and clean up some of the swap code "initial work on making VMA flags a bitmap" (Lorenzo Stoakes) Start work on converting the vma struct's flags to a bitmap, so we stop running out of them, especially on 32-bit "mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park) Address a possible bug in the swap discard code and clean things up a little [ This merge also reverts commit ebb9aeb980e5 ("vfio/nvgrace-gpu: register device memory for poison handling") because it looks broken to me, I've asked for clarification - Linus ] * tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits) mm: fix vma_start_write_killable() signal handling mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate mm/swapfile: fix list iteration when next node is removed during discard fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling mm/kfence: add reboot notifier to disable KFENCE on shutdown memcg: remove inc/dec_lruvec_kmem_state helpers selftests/mm/uffd: initialize char variable to Null mm: fix DEBUG_RODATA_TEST indentation in Kconfig mm: introduce VMA flags bitmap type tools/testing/vma: eliminate dependency on vma->__vm_flags mm: simplify and rename mm flags function for clarity mm: declare VMA flags by bit zram: fix a spelling mistake mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity mm/vmscan: skip increasing kswapd_failures when reclaim was boosted pagemap: update BUDDY flag documentation mm: swap: remove scan_swap_map_slots() references from comments mm: swap: change swap_alloc_slow() to void mm, swap: remove redundant comment for read_swap_cache_async mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational ...
2025-12-05tracing: Fix typo in trace_seq.cMaurice Hieronymus
Fix typo "wont" to "won't". Link: https://patch.msgid.link/20251121221835.28032-15-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in trace_probe.cMaurice Hieronymus
Fix typo "separater" to "separator". Link: https://patch.msgid.link/20251121221835.28032-14-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix multiple typos in trace_osnoise.cMaurice Hieronymus
Fix multiple typos in comments: "Anotate" -> "Annotate" "infor" -> "info" "timestemp" -> "timestamp" "tread" -> "thread" "varaibles" -> "variables" "wast" -> "waste" Link: https://patch.msgid.link/20251121221835.28032-13-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix multiple typos in trace_events_user.cMaurice Hieronymus
Fix multiple typos in comments: "ambigious" -> "ambiguous" "explictly" -> "explicitly" "Uknown" -> "Unknown" Link: https://patch.msgid.link/20251121221835.28032-12-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in trace_events_trigger.cMaurice Hieronymus
Fix typo "componenents" to "components". Link: https://patch.msgid.link/20251121221835.28032-11-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in trace_events_hist.cMaurice Hieronymus
Fix typo "tigger" to "trigger". Link: https://patch.msgid.link/20251121221835.28032-10-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in trace_events_filter.cMaurice Hieronymus
Fix typo "singe" to "single". Link: https://patch.msgid.link/20251121221835.28032-9-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix multiple typos in trace_events.cMaurice Hieronymus
Fix multiple typos in comments: "appened" -> "appended" "paranthesis" -> "parenthesis" "parethesis" -> "parenthesis" "wont" -> "won't" Link: https://patch.msgid.link/20251121221835.28032-8-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix multiple typos in trace.cMaurice Hieronymus
Fix multiple typos in comments: "alse" -> "also" "enabed" -> "enabled" "instane" -> "instance" "outputing" -> "outputting" "seperated" -> "separated" Link: https://patch.msgid.link/20251121221835.28032-7-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in ring_buffer_benchmark.cMaurice Hieronymus
Fix typo "overwite" to "overwrite". Link: https://patch.msgid.link/20251121221835.28032-6-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix multiple typos in ring_buffer.cMaurice Hieronymus
Fix multiple typos in comments: "ording" -> "ordering" "scatch" -> "scratch" "wont" -> "won't" Link: https://patch.msgid.link/20251121221835.28032-5-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in fprobe.cMaurice Hieronymus
Fix typo "funciton" to "function". Link: https://patch.msgid.link/20251121221835.28032-4-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix typo in fpgraph.cMaurice Hieronymus
Fix typo "reservered" to "reserved". Link: https://patch.msgid.link/20251121221835.28032-3-mhi@mailbox.org Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix fixed array of synthetic eventSteven Rostedt
The commit 4d38328eb442d ("tracing: Fix synth event printk format for str fields") replaced "%.*s" with "%s" but missed removing the number size of the dynamic and static strings. The commit e1a453a57bc7 ("tracing: Do not add length to print format in synthetic events") fixed the dynamic part but did not fix the static part. That is, with the commands: # echo 's:wake_lat char[] wakee; u64 delta;' >> /sys/kernel/tracing/dynamic_events # echo 'hist:keys=pid:ts=common_timestamp.usecs if !(common_flags & 0x18)' > /sys/kernel/tracing/events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:delta=common_timestamp.usecs-$ts:onmatch(sched.sched_waking).trace(wake_lat,next_comm,$delta)' > /sys/kernel/tracing/events/sched/sched_switch/trigger That caused the output of: <idle>-0 [001] d..5. 193.428167: wake_lat: wakee=(efault)sshd-sessiondelta=155 sshd-session-879 [001] d..5. 193.811080: wake_lat: wakee=(efault)kworker/u34:5delta=58 <idle>-0 [002] d..5. 193.811198: wake_lat: wakee=(efault)bashdelta=91 The commit e1a453a57bc7 fixed the part where the synthetic event had "char[] wakee". But if one were to replace that with a static size string: # echo 's:wake_lat char[16] wakee; u64 delta;' >> /sys/kernel/tracing/dynamic_events Where "wakee" is defined as "char[16]" and not "char[]" making it a static size, the code triggered the "(efaul)" again. Remove the added STR_VAR_LEN_MAX size as the string is still going to be nul terminated. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Douglas Raillard <douglas.raillard@arm.com> Link: https://patch.msgid.link/20251204151935.5fa30355@gandalf.local.home Fixes: e1a453a57bc7 ("tracing: Do not add length to print format in synthetic events") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05tracing: Fix enabling of tracing on file releaseSteven Rostedt
The trace file will pause tracing if the tracing instance has the "pause-on-trace" option is set. This happens when the file is opened, and it is unpaused when the file is closed. When this was first added, there was only one user that paused tracing. On open, the check to pause was: if (!iter->snapshot && (tr->trace_flags & TRACE_ITER(PAUSE_ON_TRACE))) Where if it is not the snapshot tracer and the "pause-on-trace" option is set, then it increments a "stop_count" of the trace instance. On close, the check is: if (!iter->snapshot && tr->stop_count) That is, if it is not the snapshot buffer and it was stopped, it will re-enable tracing. Now there's more places that stop tracing. This means, if something else stops tracing the tr->stop_count will be non-zero, and that means if the trace file is closed, it will decrement the stop_count even though it never incremented it. This causes a warning because when the user that stopped tracing enables it again, the stop_count goes below zero. Instead of relying on the stop_count being set to know if the close of the trace file should enable tracing again, add a new flag to the trace iterator. The trace iterator is unique per open of the trace file, and if the open stops tracing set the trace iterator PAUSE flag. On close, if the PAUSE flag is set, then re-enable it again. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20251202161751.24abaaf1@gandalf.local.home Fixes: 06e0a548bad0f ("tracing: Do not disable tracing when reading the trace file") Reported-by: syzbot+ccdec3bfe0beec58a38d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/692f44a5.a70a0220.2ea503.00c8.GAE@google.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-12-05drm: nouveau: Replace sprintf() with sysfs_emit()Madhur Kumar
Replace sprintf() calls with sysfs_emit() to follow current kernel coding standards. sysfs_emit() is the preferred method for formatting sysfs output as it provides better bounds checking and is more secure. Signed-off-by: Madhur Kumar <madhurkumar004@gmail.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patch.msgid.link/20251205091804.317801-1-madhurkumar004@gmail.com Fixes: 11b7d895216f ("drm/nouveau/pm: manual pwm fanspeed management for nv40+ boards") Cc: <stable@vger.kernel.org> # v3.3+