summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-12-06LoongArch: Add new PCI ID for pci_fixup_vgadev()Huacai Chen
Loongson-2K3000 has a new PCI ID (0x7a46) for its display controller, Add it for pci_fixup_vgadev() since we prefer a discrete graphics card as default boot device if present. Cc: stable@vger.kernel.org Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-06LoongArch: Add and use some macros for AVECSong Gao
Add and use some macros for AVEC interrupt controller, instead of using magic numbers. Signed-off-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-06LoongArch: Correct the calculation logic of thread_countQiang Ma
For thread_count, the current calculation method has a maximum of 255, which may not be sufficient in the future. Therefore, we are correcting it now. Reference: SMBIOS Specification, 7.5 Processor Information (Type 4)[1] [1]: https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.9.0.pdf Cc: stable@vger.kernel.org Signed-off-by: Qiang Ma <maqianga@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-06LoongArch: Use unsigned long for _end and _textTiezhu Yang
It is better to use unsigned long rather than long for _end and _text to calculate the kernel length. Cc: stable@vger.kernel.org # v6.3+ Fixes: e5f02b51fa0c ("LoongArch: Add support for kernel address space layout randomization (KASLR)") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-06LoongArch: Use __pmd()/__pte() for swap entry conversionsYuli Wang
The __pmd() and __pte() helper macros provide the correct initialization syntax and abstraction for the pmd_t and pte_t types. Use __pmd() to fix follow warning about __swp_entry_to_pmd() with gcc-15 under specific configs [1] : In file included from ./include/linux/pgtable.h:6, from ./include/linux/mm.h:31, from ./include/linux/pagemap.h:8, from arch/loongarch/mm/init.c:14: ./include/linux/swapops.h: In function ‘swp_entry_to_pmd’: ./arch/loongarch/include/asm/pgtable.h:302:34: error: missing braces around initializer [-Werror=missing-braces] 302 | #define __swp_entry_to_pmd(x) ((pmd_t) { (x).val | _PAGE_HUGE }) | ^ ./include/linux/swapops.h:559:16: note: in expansion of macro ‘__swp_entry_to_pmd’ 559 | return __swp_entry_to_pmd(arch_entry); | ^~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Also update __swp_entry_to_pte() to use __pte() for consistency. [1]. https://download.01.org/0day-ci/archive/20251119/202511190316.luI90kAo-lkp@intel.com/config Cc: stable@vger.kernel.org Signed-off-by: Yuli Wang <wangyl5933@chinaunicom.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-06LoongArch: Fix arch_dup_task_struct() for CONFIG_RANDSTRUCTHuacai Chen
Now the optimized version of arch_dup_task_struct() for LoongArch assumes 'thread' is the last member of 'task_struct'. But this is not true if CONFIG_RANDSTRUCT is enabled after Linux-6.16. So fix the arch_dup_task_struct() function for CONFIG_RANDSTRUCT by copying the whole 'task_struct'. Cc: stable@vger.kernel.org # 6.16+ Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-06LoongArch: Fix build errors for CONFIG_RANDSTRUCTHuacai Chen
When CONFIG_RANDSTRUCT enabled, members of task_struct are randomized. There is a chance that TASK_STACK_CANARY be out of 12bit immediate's range and causes build errors. TASK_STACK_CANARY is naturally aligned, so fix it by replacing ld.d/st.d with ldptr.d/stptr.d which have 14bit immediates. Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511240656.0NaPcJs1-lkp@intel.com/ Suggested-by: Rui Wang <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-06LoongArch: Simplify __arch_bitrev32() implementationXi Ruoyao
LoongArch has the bitrev.w instruction to reverse bits in a 32-bit integer, thus there's no need to reverse the bytes and use bitrev.4b. Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-06LoongArch: Select HAVE_ARCH_BITREVERSE in KconfigXi Ruoyao
Without HAVE_ARCH_BITREVERSE, the architecture-optimized bit reverse implementations in arch/loongarch/include/asm/bitrev.h are not used at all. Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-05nfs/localio: fix regression due to out-of-order __put_credMike Snitzer
Commit f2060bdc21d7 ("nfs/localio: add refcounting for each iocb IO associated with NFS pgio header") inadvertantly reintroduced the same potential for __put_cred() triggering BUG_ON(cred == current->cred) that commit 992203a1fba5 ("nfs/localio: restore creds before releasing pageio data") fixed. Fix this by saving and restoring the cred around each {read,write}_iter call within the respective for loop of nfs_local_call_{read,write} using scoped_with_creds(). NOTE: this fix started by first reverting the following commits: 94afb627dfc2 ("nfs: use credential guards in nfs_local_call_read()") bff3c841f7bd ("nfs: use credential guards in nfs_local_call_write()") 1d18101a644e ("Merge tag 'kernel-6.19-rc1.cred' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs") followed by narrowly fixing the cred lifetime issue by using scoped_with_creds(). In doing so, this commit's changes appear more extensive than they really are (as evidenced by comparing to v6.18's fs/nfs/localio.c). Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Acked-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/linux-next/20251205111942.4150b06f@canb.auug.org.au/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-12-05Merge tag 'soc-drivers-6.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull more SoC driver updates from Arnd Bergmann: "These updates came a little late, or were based on a later 6.18-rc tag than the others: - A new driver for cache management on cxl devices with memory shared in a coherent cluster. This is part of the drivers/cache/ tree, but unlike the other drivers that back the dma-mapping interfaces, this one is needed only during CPU hotplug. - A shared branch for reset controllers using swnode infrastructure - Added support for new SoC variants in the Amlogic soc_device identification - Minor updates in Freescale, Microchip, Samsung, and Apple SoC drivers" * tag 'soc-drivers-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits) soc: samsung: exynos-pmu: fix device leak on regmap lookup soc: samsung: exynos-pmu: Fix structure initialization soc: fsl: qbman: use kmalloc_array() instead of kmalloc() soc: fsl: qbman: add WQ_PERCPU to alloc_workqueue users MAINTAINERS: Update email address for Christophe Leroy MAINTAINERS: refer to intended file in STANDALONE CACHE CONTROLLER DRIVERS cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent cache: Make top level Kconfig menu a boolean dependent on RISCV MAINTAINERS: Add Jonathan Cameron to drivers/cache and add lib/cache_maint.c + header arm64: Select GENERIC_CPU_CACHE_MAINTENANCE lib: Support ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION soc: amlogic: meson-gx-socinfo: add new SoCs id dt-bindings: arm: amlogic: meson-gx-ao-secure: support more SoCs memregion: Support fine grained invalidate by cpu_cache_invalidate_memregion() memregion: Drop unused IORES_DESC_* parameter from cpu_cache_invalidate_memregion() dt-bindings: cache: sifive,ccache0: add a pic64gx compatible MAINTAINERS: rename Microchip RISC-V entry MAINTAINERS: add new soc drivers to Microchip RISC-V entry soc: microchip: add mfd drivers for two syscon regions on PolarFire SoC dt-bindings: soc: microchip: document the simple-mfd syscon on PolarFire SoC ...
2025-12-05Merge tag 'soc-drivers-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC driver updates from Arnd Bergmann: "This is the first half of the driver changes: - A treewide interface change to the "syscore" operations for power management, as a preparation for future Tegra specific changes - Reset controller updates with added drivers for LAN969x, eic770 and RZ/G3S SoCs - Protection of system controller registers on Renesas and Google SoCs, to prevent trivially triggering a system crash from e.g. debugfs access - soc_device identification updates on Nvidia, Exynos and Mediatek - debugfs support in the ST STM32 firewall driver - Minor updates for SoC drivers on AMD/Xilinx, Renesas, Allwinner, TI - Cleanups for memory controller support on Nvidia and Renesas" * tag 'soc-drivers-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (114 commits) memory: tegra186-emc: Fix missing put_bpmp Documentation: reset: Remove reset_controller_add_lookup() reset: fix BIT macro reference reset: rzg2l-usbphy-ctrl: Fix a NULL vs IS_ERR() bug in probe reset: th1520: Support reset controllers in more subsystems reset: th1520: Prepare for supporting multiple controllers dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys dt-bindings: reset: thead,th1520-reset: Remove non-VO-subsystem resets reset: remove legacy reset lookup code clk: davinci: psc: drop unused reset lookup reset: rzg2l-usbphy-ctrl: Add support for RZ/G3S SoC reset: rzg2l-usbphy-ctrl: Add support for USB PWRRDY dt-bindings: reset: renesas,rzg2l-usbphy-ctrl: Document RZ/G3S support reset: eswin: Add eic7700 reset driver dt-bindings: reset: eswin: Documentation for eic7700 SoC reset: sparx5: add LAN969x support dt-bindings: reset: microchip: Add LAN969x support soc: rockchip: grf: Add select correct PWM implementation on RK3368 soc/tegra: pmc: Add USB wake events for Tegra234 amba: tegra-ahb: Fix device leak on SMMU enable ...
2025-12-05Merge tag 'soc-newsoc-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull new SoC families update from Arnd Bergmann: "These three new families of SoC are split out into a separate branch because they touch multiple parts of the source tree and are better left separate for the initial merge. - Black Sesame Technologies C1200 is an automotive SoC using Cortex-A78 CPU cores - Anlogic dr1v90 (not to be confused with Amlogic) is an FPGA platform using a single nuclei ux900 RISC-V core - Tenstorrent Blackhole is a Neural Processing Unit using custom "Tensix" cores for computation offload managed by Linux running on SiFive X280 RISC-V cores. Support for all three is rather rudimentary at the moment and will get improved as device drivers are merged through other tree" * tag 'soc-newsoc-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits) MAINTAINERS: add Black Sesame Technologies (BST) ARM SoC support arm64: defconfig: enable BST platform support arm64: dts: bst: add support for Black Sesame Technologies C1200 CDCU1.0 board arm64: Kconfig: add ARCH_BST for Black Sesame Technologies SoCs dt-bindings: arm: add Black Sesame Technologies (bst) SoC dt-bindings: vendor-prefixes: Add Black Sesame Technologies Co., Ltd. MAINTAINERS: Setup support for Anlogic tree riscv: defconfig: Enable Anlogic SoC riscv: dts: anlogic: Add Milianke MLKPAI FS01 board riscv: dts: Add initial Anlogic DR1V90 SoC device tree riscv: Add Anlogic SoC famly Kconfig support dt-bindings: serial: snps-dw-apb-uart: Add Anlogic DR1V90 uart dt-bindings: timer: Add Anlogic DR1V90 ACLINT MTIMER dt-bindings: riscv: Add Anlogic DR1V90 dt-bindings: riscv: Add Nuclei UX900 compatibles dt-bindings: vendor-prefixes: Add Anlogic, Milianke and Nuclei riscv: defconfig: Enable Tenstorrent SoCs riscv: Kconfig.socs: Add ARCH_TENSTORRENT for Tenstorrent SoCs riscv: dts: Add Tenstorrent Blackhole SoC PCIe cards dt-bindings: interrupt-controller: Add Tenstorrent Blackhole compatible ...
2025-12-05Merge tag 'soc-dt-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds
Pull SoC devicetree updates from Arnd Bergmann: "Three new SoCs got added in existing arm64 chip families: - Renesas R-Car X5H (R8A78000) is a new generation of automotive SoCs, based on 16 Cortex-A720 (Armv9.2) cores, which makes the the currently highest-perforance embedded SoC. - TI AM62L is a new variant of the AM62 family of industrial SoCs, this one comes without a GPU. - Qualcomm MSM8937 (Snapdragon 430) is an older mobile phone chip based on Cortex-A53, and closely related to MSM8917 (Snapdragn 425), which we already support. In addition, there are a good number of newly supported machines across SoC families: - Two Aspeed AST2600 (Cortex-A7) based BMC setups for large servers - Mobile Phones and tables based on Mediatek MT6582, Nvidia Tegra124, Qualcomm MSM8937 and Qualcomm MSM8939, - Two Laptops based on Qualcomm SoCs: one using the older sdm850, the other using x1p42100. - One Router based on Rockchips RK3568 - 24 variants of the Enclustra Mercury system-on-module, all based on 32-bit Intel/Altera SocFPGA chips, plus two boards using 64-bit SocFPGA Agilex chips.. - 30 industrial/embedded boards and single-board computers, using various chips from NXP, Rockchips, Mediatek, TI, Amlogic, Qualcomm, Spacemit, and Starfive. In total there are 783 commits here, the majority of these improving hardware support and cleaning up devicetree files across the tree, with the majority of the changes going into the Qualcomm, NXP, Renesas and Rockchips platforms" * tag 'soc-dt-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (782 commits) arm64: dts: mediatek: mt8195: Fix address range for JPEG decoder core 1 ARM: dts: samsung: exynos4412-midas: turn off SDIO WLAN chip during system suspend ARM: dts: samsung: exynos4210-trats: turn off SDIO WLAN chip during system suspend ARM: dts: samsung: exynos4210-i9100: turn off SDIO WLAN chip during system suspend ARM: dts: samsung: universal_c210: turn off SDIO WLAN chip during system suspend arm64: dts: amlogic: meson-g12b: Fix L2 cache reference for S922X CPUs arm64: dts: Add gpio_intc node for Amlogic S7D SoCs arm64: dts: Add gpio_intc node for Amlogic S7 SoCs arm64: dts: Add gpio_intc node for Amlogic S6 SoCs arm64: dts: amlogic: s7d: add ao secure node arm64: dts: amlogic: s7: add ao secure node arm64: dts: amlogic: s6: add ao secure node arm64: dts: amlogic: Fix the register name of the 'DBI' region dts: arm64: amlogic: add a5 pinctrl node arm64: dts: amlogic: s7d: add power domain controller node arm64: dts: amlogic: s7: add power domain controller node arm64: dts: amlogic: s6: add power domain controller node dts: arm64: amlogic: Add ISP related nodes for C3 arm64: dts: meson: add initial device-tree for Tanix TX9 Pro dt-bindings: arm: amlogic: add support for Tanix TX9 Pro ...
2025-12-05Merge tag 'soc-arm-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC ARM code updates from Arnd Bergmann: "These are very minimal changes for 32-bit Arm platform code, enabling SMP bringup for one more SoC variant (mt6582) among spelling changes and a build warning fix" * tag 'soc-arm-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: omap1: avoid symbol clashes in fiq handler ARM: gemini: fix typos in comments ARM: versatile: Fix typo in versatile.c ARM: OMAP2+: Fix falg->flag typo in omap_smc2() ARM: mediatek: add MT6582 smp bring up code ARM: mediatek: add board_dt_compat entry for the MT6582 SoC
2025-12-05Merge tag 'soc-defconfig-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC defconfig updates from Arnd Bergmann: "As usual, a number of newly added drivers get enabled in the arm64 defconfig, in addition to minor housekeeping work on defconfig files for arm32, arm64 and riscv" * tag 'soc-defconfig-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits) arm64: defconfig: enable Exynos ACPM clocks arm64: defconfig: Remove the redundant SCHED_MC/SCHED_SMT ARM: multi_v7_defconfig: Enable TI PRU Ethernet driver arm64: defconfig: enable i.MX AIPSTZ driver ARM: mxs_defconfig: enable sound drivers for imx28-amarula-rmm arm64: defconfig: Enable i.MX95 drivers for pinctrl, Ethernet and PCIe arm64: defconfig: enable rockchip camera interface ARM: tegra: Enable EXT4 for Tegra arm64: defconfig: Enable NVIDIA VRS PSEQ RTC arm64: defconfig: Enable SX150x GPIO expander driver riscv: defconfig: enable SPI_FSL_QUADSPI as a module ARM: at91: at91_dt_defconfig: set MMC_SPI to module arm64: defconfig: Build NSS clock controller driver for IPQ5424 arm64: defconfig: Enable SCSI UFS Crypto and Block Inline encryption drivers arm64: defconfig: Add M31 eUSB2 PHY config arm64: defconfig: Enable configs for Fairphone 3, 4, 5 smartphones arm64: defconfig: Enable two Novatek display panels for MTP8750 and Tianma arm64: defconfig: Enable RZ/T2H / RZ/N2H ADC driver ARM: shmobile: defconfig: Refresh for v6.18-rc1 arm64: defconfig: Enable DW HDMI QP CEC support ...
2025-12-05Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - Support for userspace handling of synchronous external aborts (SEAs), allowing the VMM to potentially handle the abort in a non-fatal manner - Large rework of the VGIC's list register handling with the goal of supporting more active/pending IRQs than available list registers in hardware. In addition, the VGIC now supports EOImode==1 style deactivations for IRQs which may occur on a separate vCPU than the one that acked the IRQ - Support for FEAT_XNX (user / privileged execute permissions) and FEAT_HAF (hardware update to the Access Flag) in the software page table walkers and shadow MMU - Allow page table destruction to reschedule, fixing long need_resched latencies observed when destroying a large VM - Minor fixes to KVM and selftests Loongarch: - Get VM PMU capability from HW GCFG register - Add AVEC basic support - Use 64-bit register definition for EIOINTC - Add KVM timer test cases for tools/selftests RISC/V: - SBI message passing (MPXY) support for KVM guest - Give a new, more specific error subcode for the case when in-kernel AIA virtualization fails to allocate IMSIC VS-file - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually in small chunks - Fix guest page fault within HLV* instructions - Flush VS-stage TLB after VCPU migration for Andes cores s390: - Always allocate ESCA (Extended System Control Area), instead of starting with the basic SCA and converting to ESCA with the addition of the 65th vCPU. The price is increased number of exits (and worse performance) on z10 and earlier processor; ESCA was introduced by z114/z196 in 2010 - VIRT_XFER_TO_GUEST_WORK support - Operation exception forwarding support - Cleanups x86: - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO SPTE caching is disabled, as there can't be any relevant SPTEs to zap - Relocate a misplaced export - Fix an async #PF bug where KVM would clear the completion queue when the guest transitioned in and out of paging mode, e.g. when handling an SMI and then returning to paged mode via RSM - Leave KVM's user-return notifier registered even when disabling virtualization, as long as kvm.ko is loaded. On reboot/shutdown, keeping the notifier registered is ok; the kernel does not use the MSRs and the callback will run cleanly and restore host MSRs if the CPU manages to return to userspace before the system goes down - Use the checked version of {get,put}_user() - Fix a long-lurking bug where KVM's lack of catch-up logic for periodic APIC timers can result in a hard lockup in the host - Revert the periodic kvmclock sync logic now that KVM doesn't use a clocksource that's subject to NTP corrections - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the latter behind CONFIG_CPU_MITIGATIONS - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast path; the only reason they were handled in the fast path was to paper of a bug in the core #MC code, and that has long since been fixed - Add emulator support for AVX MOV instructions, to play nice with emulated devices whose guest drivers like to access PCI BARs with large multi-byte instructions x86 (AMD): - Fix a few missing "VMCB dirty" bugs - Fix the worst of KVM's lack of EFER.LMSLE emulation - Add AVIC support for addressing 4k vCPUs in x2AVIC mode - Fix incorrect handling of selective CR0 writes when checking intercepts during emulation of L2 instructions - Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32] on VMRUN and #VMEXIT - Fix a bug where KVM corrupt the guest code stream when re-injecting a soft interrupt if the guest patched the underlying code after the VM-Exit, e.g. when Linux patches code with a temporary INT3 - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits to userspace, and extend KVM "support" to all policy bits that don't require any actual support from KVM x86 (Intel): - Use the root role from kvm_mmu_page to construct EPTPs instead of the current vCPU state, partly as worthwhile cleanup, but mostly to pave the way for tracking per-root TLB flushes, and elide EPT flushes on pCPU migration if the root is clean from a previous flush - Add a few missing nested consistency checks - Rip out support for doing "early" consistency checks via hardware as the functionality hasn't been used in years and is no longer useful in general; replace it with an off-by-default module param to WARN if hardware fails a check that KVM does not perform - Fix a currently-benign bug where KVM would drop the guest's SPEC_CTRL[63:32] on VM-Enter - Misc cleanups - Overhaul the TDX code to address systemic races where KVM (acting on behalf of userspace) could inadvertantly trigger lock contention in the TDX-Module; KVM was either working around these in weird, ugly ways, or was simply oblivious to them (though even Yan's devilish selftests could only break individual VMs, not the host kernel) - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a TDX vCPU, if creating said vCPU failed partway through - Fix a few sparse warnings (bad annotation, 0 != NULL) - Use struct_size() to simplify copying TDX capabilities to userspace - Fix a bug where TDX would effectively corrupt user-return MSR values if the TDX Module rejects VP.ENTER and thus doesn't clobber host MSRs as expected Selftests: - Fix a math goof in mmu_stress_test when running on a single-CPU system/VM - Forcefully override ARCH from x86_64 to x86 to play nice with specifying ARCH=x86_64 on the command line - Extend a bunch of nested VMX to validate nested SVM as well - Add support for LA57 in the core VM_MODE_xxx macro, and add a test to verify KVM can save/restore nested VMX state when L1 is using 5-level paging, but L2 is not - Clean up the guest paging code in anticipation of sharing the core logic for nested EPT and nested NPT guest_memfd: - Add NUMA mempolicy support for guest_memfd, and clean up a variety of rough edges in guest_memfd along the way - Define a CLASS to automatically handle get+put when grabbing a guest_memfd from a memslot to make it harder to leak references - Enhance KVM selftests to make it easer to develop and debug selftests like those added for guest_memfd NUMA support, e.g. where test and/or KVM bugs often result in hard-to-debug SIGBUS errors - Misc cleanups Generic: - Use the recently-added WQ_PERCPU when creating the per-CPU workqueue for irqfd cleanup - Fix a goof in the dirty ring documentation - Fix choice of target for directed yield across different calls to kvm_vcpu_on_spin(); the function was always starting from the first vCPU instead of continuing the round-robin search" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits) KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2 KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX} KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected" KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot() KVM: arm64: Add endian casting to kvm_swap_s[12]_desc() KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n KVM: arm64: selftests: Add test for AT emulation KVM: arm64: nv: Expose hardware access flag management to NV guests KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW KVM: arm64: Implement HW access flag management in stage-1 SW PTW KVM: arm64: Propagate PTW errors up to AT emulation KVM: arm64: Add helper for swapping guest descriptor KVM: arm64: nv: Use pgtable definitions in stage-2 walk KVM: arm64: Handle endianness in read helper for emulated PTW KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW KVM: arm64: Call helper for reading descriptors directly KVM: arm64: nv: Advertise support for FEAT_XNX KVM: arm64: Teach ptdump about FEAT_XNX permissions KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions ...
2025-12-05NFSv4: Handle NFS4ERR_NOTSUPP errors for directory delegationsTrond Myklebust
The error NFS4ERR_NOTSUPP will be returned for operations that are legal, but not supported by the server. Fixes: 156b09482933 ("NFS: Request a directory delegation on ACCESS, CREATE, and UNLINK") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-12-05nfs/localio: remove 61 byte hole from needless ____cacheline_alignedMike Snitzer
struct nfs_local_kiocb used ____cacheline_aligned on its iters[] array and as the structure evolved it caused a 61 byte hole to form. Fix this by removing ____cacheline_aligned and reordering iters[] before iter_is_dio_aligned[]. Fixes: 6a218b9c3183 ("nfs/localio: do not issue misaligned DIO out-of-order") Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-12-05nfs/localio: remove alignment size checking in nfs_is_local_dio_possibleMike Snitzer
This check to ensure dio_offset_align isn't larger than PAGE_SIZE is no longer relevant (older iterations of NFS Direct was allocating misaligned head and tail pages but no longer does, so this check isn't needed). Fixes: c817248fc831 ("nfs/localio: add proper O_DIRECT support for READ and WRITE") Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-12-05Merge tag 'uml-for-linux-6.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Johannes Berg: "Apart from the usual small churn, we have - initial SMP support (only kernel) - major vDSO cleanups (and fixes for 32-bit)" * tag 'uml-for-linux-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (33 commits) um: Disable KASAN_INLINE when STATIC_LINK is selected um: Don't rename vmap to kernel_vmap um: drivers: virtio: use string choices helper um: Always set up AT_HWCAP and AT_PLATFORM x86/um: Remove FIXADDR_USER_START and FIXADDR_USE_END um: Remove __access_ok_vsyscall() um: Remove redundant range check from __access_ok_vsyscall() um: Remove fixaddr_user_init() x86/um: Drop gate area handling x86/um: Do not inherit vDSO from host um: Split out default elf_aux_hwcap x86/um: Move ELF_PLATFORM fallback to x86-specific code um: Split out default elf_aux_platform um: Avoid circular dependency on asm-offsets in pgtable.h um: Enable SMP support on x86 asm-generic: percpu: Add assembly guard um: vdso: Remove getcpu support on x86 um: Add initial SMP support um: Define timers on a per-CPU basis um: Determine sleep based on need_resched() ...
2025-12-05Merge tag 'riscv-for-linus-6.19-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: - Enable parallel hotplug for RISC-V - Optimize vector regset allocation for ptrace() - Add a kernel selftest for the vector ptrace interface - Enable the userspace RAID6 test to build and run using RISC-V vectors - Add initial support for the Zalasr RISC-V ratified ISA extension - For the Zicbop RISC-V ratified ISA extension to userspace, expose hardware and kernel support to userspace and add a kselftest for Zicbop - Convert open-coded instances of 'asm goto's that are controlled by runtime ALTERNATIVEs to use riscv_has_extension_{un,}likely(), following arm64's alternative_has_cap_{un,}likely() - Remove an unnecessary mask in the GFP flags used in some calls to pagetable_alloc() * tag 'riscv-for-linus-6.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: selftests/riscv: Add Zicbop prefetch test riscv: hwprobe: Expose Zicbop extension and its block size riscv: Introduce Zalasr instructions riscv: hwprobe: Export Zalasr extension dt-bindings: riscv: Add Zalasr ISA extension description riscv: Add ISA extension parsing for Zalasr selftests: riscv: Add test for the Vector ptrace interface riscv: ptrace: Optimize the allocation of vector regset raid6: test: Add support for RISC-V raid6: riscv: Allow code to be compiled in userspace raid6: riscv: Prevent compiler from breaking inline vector assembly code riscv: cmpxchg: Use riscv_has_extension_likely riscv: bitops: Use riscv_has_extension_likely riscv: hweight: Use riscv_has_extension_likely riscv: checksum: Use riscv_has_extension_likely riscv: pgtable: Use riscv_has_extension_unlikely riscv: Remove __GFP_HIGHMEM masking RISC-V: Enable HOTPLUG_PARALLEL for secondary CPUs
2025-12-05Merge tag 'powerpc-6.19-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Restore clearing of MSR[RI] at interrupt/syscall exit on 32-bit - Fix unpaired stwcx on interrupt exit on 32-bit - Fix race condition leading to double list-add in mac_hid_toggle_emumouse() - Fix mprotect on book3s 32-bit - Fix SLB multihit issue during SLB preload with 64-bit hash MMU - Add support for crashkernel CMA reservation - Add die_id and die_cpumask for Power10 & later to expose chip hemispheres - A series of minor fixes and improvements to the hash SLB code Thanks to Antonio Alvarez Feijoo, Ben Collins, Bhaskar Chowdhury, Christophe Leroy, Daniel Thompson, Dave Vasilevsky, Donet Tom, J. Neuschäfer, Kunwu Chan, Long Li, Naresh Kamboju, Nathan Chancellor, Ritesh Harjani (IBM), Shirisha G, Shrikanth Hegde, Sourabh Jain, Srikar Dronamraju, Stephen Rothwell, Thomas Zimmermann, Venkat Rao Bagalkote, and Vishal Chourasia. * tag 'powerpc-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (32 commits) macintosh/via-pmu-backlight: Include <linux/fb.h> and <linux/of.h> powerpc/powermac: backlight: Include <linux/of.h> powerpc/64s/slb: Add no_slb_preload early cmdline param powerpc/64s/slb: Make preload_add return type as void powerpc/ptdump: Dump PXX level info for kernel_page_tables powerpc/64s/pgtable: Enable directMap counters in meminfo for Hash powerpc/64s/hash: Update directMap page counters for Hash powerpc/64s/hash: Hash hpt_order should be only available with Hash MMU powerpc/64s/hash: Improve hash mmu printk messages powerpc/64s/hash: Fix phys_addr_t printf format in htab_initialize() powerpc/64s/ptdump: Fix kernel_hash_pagetable dump for ISA v3.00 HPTE format powerpc/64s/hash: Restrict stress_hpt_struct memblock region to within RMA limit powerpc/64s/slb: Fix SLB multihit issue during SLB preload powerpc, mm: Fix mprotect on book3s 32-bit powerpc/smp: Expose die_id and die_cpumask powerpc/83xx: Add a null pointer check to mcu_gpiochip_add arch:powerpc:tools This file was missing shebang line, so added it kexec: Include kernel-end even without crashkernel powerpc: p2020: Rename wdt@ nodes to watchdog@ powerpc: 86xx: Rename wdt@ nodes to watchdog@ ...
2025-12-05Merge branch 'support-associating-bpf-programs-with-struct_ops'Andrii Nakryiko
Amery Hung says: ==================== Support associating BPF programs with struct_ops Hi, This patchset adds a new BPF command BPF_PROG_ASSOC_STRUCT_OPS to the bpf() syscall to allow associating a BPF program with a struct_ops. The command is introduced to address a emerging need from struct_ops users. As the number of subsystems adopting struct_ops grows, more users are building their struct_ops-based solution with some help from other BPF programs. For example, scx_layer uses a syscall program as a user space trigger to refresh layers [0]. It also uses tracing program to infer whether a task is using GPU and needs to be prioritized [1]. In these use cases, when there are multiple struct_ops instances, the struct_ops kfuncs called from different BPF programs, whether struct_ops or not needs to be able to refer to a specific one, which currently is not possible. The new BPF command will allow users to explicitly associate a BPF program with a struct_ops map. The libbpf wrapper can be called after loading programs and before attaching programs and struct_ops. Internally, it will set prog->aux->st_ops_assoc to the struct_ops map. struct_ops kfuncs can then get the associated struct_ops struct by calling bpf_prog_get_assoc_struct_ops() with prog->aux, which can be acquired from a "__prog" argument. The value of the special argument will be fixed up by the verifier during verification. The command conceptually associates the implementation of BPF programs with struct_ops map, not the attachment. A program associated with the map will take a refcount of it so that st_ops_assoc always points to a valid struct_ops struct. struct_ops implementers can use the helper, bpf_prog_get_assoc_struct_ops to get the pointer. The returned struct_ops if not NULL is guaranteed to be valid and initialized. However, it is not guaranteed that the struct_ops is attached. The struct_ops implementer still need to take steps to track and check the state of the struct_ops in kdata, if the use case demand the struct_ops to be attached. We can also consider support associating struct_ops link with BPF programs, which on one hand make struct_ops implementer's job easier, but might complicate libbpf workflow and does not apply to legacy struct_ops attachment. [0] https://github.com/sched-ext/scx/blob/main/scheds/rust/scx_layered/src/bpf/main.bpf.c#L557 [1] https://github.com/sched-ext/scx/blob/main/scheds/rust/scx_layered/src/bpf/main.bpf.c#L754 --- v7 -> v8 - Fix libbpf return (Andrii) - Follow kfunc _impl suffic naming convention in selftest (Alexei) Link: https://lore.kernel.org/bpf/20251121231352.4032020-1-ameryhung@gmail.com/ v6 -> v7 - Drop the guarantee that bpf_prog_get_assoc_struct_ops() will always return an initialized struct_ops (Martin) - Minor misc. changes in selftests Link: https://lore.kernel.org/bpf/20251114221741.317631-1-ameryhung@gmail.com/ v5 -> v6 - Drop refcnt bumping for async callbacks and add RCU annotation (Martin) - Fix libbpf bug and update comments (Andrii) - Fix refcount bug in bpf_prog_assoc_struct_ops() (AI) Link: https://lore.kernel.org/bpf/20251104172652.1746988-1-ameryhung@gmail.com/ v4 -> v5 - Simplify the API for getting associated struct_ops and dont't expose struct_ops map lifecycle management (Andrii, Alexei) Link: https://lore.kernel.org/bpf/20251024212914.1474337-1-ameryhung@gmail.com/ v3 -> v4 - Fix potential dangling pointer in timer callback. Protect st_ops_assoc with RCU. The get helper now needs to be paired with bpf_struct_ops_put() - The command should only increase refcount once for a program (Andrii) - Test a struct_ops program reused in two struct_ops maps - Test getting associated struct_ops in timer callback Link: https://lore.kernel.org/bpf/20251017215627.722338-1-ameryhung@gmail.com/ v2 -> v3 - Change the type of st_ops_assoc from void* (i.e., kdata) to bpf_map (Andrii) - Fix a bug that clears BPF_PTR_POISON when a struct_ops map is freed (Andrii) - Return NULL if the map is not fully initialized (Martin) - Move struct_ops map refcount inc/dec into internal helpers (Martin) - Add libbpf API, bpf_program__assoc_struct_ops (Andrii) Link: https://lore.kernel.org/bpf/20251016204503.3203690-1-ameryhung@gmail.com/ v1 -> v2 - Poison st_ops_assoc when reusing the program in more than one struct_ops maps and add a helper to access the pointer (Andrii) - Minor style and naming changes (Andrii) Link: https://lore.kernel.org/bpf/20251010174953.2884682-1-ameryhung@gmail.com/ --- ==================== Link: https://patch.msgid.link/20251203233748.668365-1-ameryhung@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2025-12-05selftests/bpf: Test getting associated struct_ops in timer callbackAmery Hung
Make sure 1) a timer callback can also reference the associated struct_ops, and then make sure 2) the timer callback cannot get a dangled pointer to the struct_ops when the map is freed. The test schedules a timer callback from a struct_ops program since struct_ops programs do not pin the map. It is possible for the timer callback to run after the map is freed. The timer callback calls a kfunc that runs .test_1() of the associated struct_ops, which should return MAP_MAGIC when the map is still alive or -1 when the map is gone. The first subtest added in this patch schedules the timer callback to run immediately, while the map is still alive. The second subtest added schedules the callback to run 500ms after syscall_prog runs and then frees the map right after syscall_prog runs. Both subtests then wait until the callback runs to check the return of the kfunc. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-7-ameryhung@gmail.com
2025-12-05selftests/bpf: Test ambiguous associated struct_opsAmery Hung
Add a test to make sure implicit struct_ops association does not break backward compatibility nor return incorrect struct_ops. struct_ops programs should still be allowed to be reused in different struct_ops map. The associated struct_ops map set implicitly however will be poisoned. Trying to read it through the helper bpf_prog_get_assoc_struct_ops() should result in a NULL pointer. While recursion of test_1() cannot happen due to the associated struct_ops being ambiguois, explicitly check for it to prevent stack overflow if the test regresses. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-6-ameryhung@gmail.com
2025-12-05selftests/bpf: Test BPF_PROG_ASSOC_STRUCT_OPS commandAmery Hung
Test BPF_PROG_ASSOC_STRUCT_OPS command that associates a BPF program with a struct_ops. The test follows the same logic in commit ba7000f1c360 ("selftests/bpf: Test multi_st_ops and calling kfuncs from different programs"), but instead of using map id to identify a specific struct_ops, this test uses the new BPF command to associate a struct_ops with a program. The test consists of two sets of almost identical struct_ops maps and BPF programs associated with the map. Their only difference is the unique value returned by bpf_testmod_multi_st_ops::test_1(). The test first loads the programs and associates them with struct_ops maps. Then, it exercises the BPF programs. They will in turn call kfunc bpf_kfunc_multi_st_ops_test_1_prog_arg() to trigger test_1() of the associated struct_ops map, and then check if the right unique value is returned. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-5-ameryhung@gmail.com
2025-12-05libbpf: Add support for associating BPF program with struct_opsAmery Hung
Add low-level wrapper and libbpf API for BPF_PROG_ASSOC_STRUCT_OPS command in the bpf() syscall. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-4-ameryhung@gmail.com
2025-12-05bpf: Support associating BPF program with struct_opsAmery Hung
Add a new BPF command BPF_PROG_ASSOC_STRUCT_OPS to allow associating a BPF program with a struct_ops map. This command takes a file descriptor of a struct_ops map and a BPF program and set prog->aux->st_ops_assoc to the kdata of the struct_ops map. The command does not accept a struct_ops program nor a non-struct_ops map. Programs of a struct_ops map is automatically associated with the map during map update. If a program is shared between two struct_ops maps, prog->aux->st_ops_assoc will be poisoned to indicate that the associated struct_ops is ambiguous. The pointer, once poisoned, cannot be reset since we have lost track of associated struct_ops. For other program types, the associated struct_ops map, once set, cannot be changed later. This restriction may be lifted in the future if there is a use case. A kernel helper bpf_prog_get_assoc_struct_ops() can be used to retrieve the associated struct_ops pointer. The returned pointer, if not NULL, is guaranteed to be valid and point to a fully updated struct_ops struct. For struct_ops program reused in multiple struct_ops map, the return will be NULL. prog->aux->st_ops_assoc is protected by bumping the refcount for non-struct_ops programs and RCU for struct_ops programs. Since it would be inefficient to track programs associated with a struct_ops map, every non-struct_ops program will bump the refcount of the map to make sure st_ops_assoc stays valid. For a struct_ops program, it is protected by RCU as map_free will wait for an RCU grace period before disassociating the program with the map. The helper must be called in BPF program context or RCU read-side critical section. struct_ops implementers should note that the struct_ops returned may not be initialized nor attached yet. The struct_ops implementer will be responsible for tracking and checking the state of the associated struct_ops map if the use case expects an initialized or attached struct_ops. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-3-ameryhung@gmail.com
2025-12-05bpf: Allow verifier to fixup kernel module kfuncsAmery Hung
Allow verifier to fixup kfuncs in kernel module to support kfuncs with __prog arguments. Currently, special kfuncs and kfuncs with __prog arguments are kernel kfuncs. Allowing kernel module kfuncs should not affect existing kfunc fixup as kernel module kfuncs have BTF IDs greater than kernel kfuncs' BTF IDs. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-2-ameryhung@gmail.com
2025-12-05ovl: pass original credentials, not mounter credentials during createChristian Brauner
When creating new files the security layer expects the original credentials to be passed. When cleaning up the code this was accidently changed to pass the mounter's credentials by relying on current->cred which is already overriden at this point. Pass the original credentials directly. Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Reported-by: Paul Moore <paul@paul-moore.com> Fixes: e566bff96322 ("ovl: port ovl_create_or_link() to new ovl_override_creator_creds") Link: https://lore.kernel.org/CAFqZXNvL1ciLXMhHrnoyBmQu1PAApH41LkSWEhrcvzAAbFij8Q@mail.gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org> Tested-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-12-05libbpf: Add debug messaging in dedup equivalence/identity matchingAlan Maguire
We have seen a number of issues like [1]; failures to deduplicate key kernel data structures like task_struct. These are often hard to debug from pahole even with verbose output, especially when identity/equivalence checks fail deep in a nested struct comparison. Here we add debug messages of the form libbpf: STRUCT 'task_struct' size=2560 vlen=194 cand_id[54222] canon_id[102820] shallow-equal but not equiv for field#23 'sched_class': 0 These will be emitted during dedup from pahole when --verbose/-V is specified. This greatly helps identify exactly where dedup failures are experienced. [1] https://lore.kernel.org/bpf/b8e8b560-bce5-414b-846d-0da6d22a9983@oracle.com/ Changes since v1: - updated debug messages to refer to shallow-equal, added ids (Andrii) Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203191507.55565-1-alan.maguire@oracle.com
2025-12-05cifs: Remove dead function prototypesDavid Howells
Remove a bunch of dead function prototypes. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05Merge tag 'vfs-6.19-rc1.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix a type conversion bug in the ipc subsystem - Fix per-dentry timeout warning in autofs - Drop the fd conversion from sockets - Move assert from iput_not_last() to iput() - Fix reversed check in filesystems_freeze_callback() - Use proper uapi types for new struct delegation definitions * tag 'vfs-6.19-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: vfs: use UAPI types for new struct delegation definition mqueue: correct the type of ro to int Revert "net/socket: convert sock_map_fd() to FD_ADD()" autofs: fix per-dentry timeout warning fs: assert on I_FREEING not being set in iput() and iput_not_last() fs: PM: Fix reverse check in filesystems_freeze_callback()
2025-12-05Merge tag 'exfat-for-6.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - Fix a remount failure caused by differing process masks by inheriting the original mount options during the remount process - Fix a potential divide-by-zero error and system crash in exfat_allocate_bitmap that occurred when the readahead count was zero - Add validation for directory cluster bitmap bits to prevent directory and root cluster from being incorrectly zeroed out on corrupted images - Clear the post-EOF page cache when extending a file to prevent stale mmap data from becoming visible, addressing an generic/363 failure - Fix a reference count leak in exfat_find by properly releasing the dentry set in specific error paths * tag 'exfat-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: fix remount failure in different process environments exfat: fix divide-by-zero in exfat_allocate_bitmap exfat: validate the cluster bitmap bits of directory exfat: zero out post-EOF page cache on file extension exfat: fix refcount leak in exfat_find
2025-12-05smb/client: add two elements to smb2_error_map_table arrayChenXiaoSong
Both status codes are mapped to -EIO. Now all status codes from common/smb2status.h are included in the smb2_error_map_table array(except for the first two zero definitions). Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb: rename to STATUS_SMB_NO_PREAUTH_INTEGRITY_HASH_OVERLAPChenXiaoSong
See MS-SMB2 3.3.5.4. To keep the name consistent with the documentation. Additionally, move STATUS_INVALID_LOCK_RANGE to correct position in order. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb/client: remove unused elements from smb2_error_map_table arrayChenXiaoSong
STATUS_SUCCESS and STATUS_WAIT_0 are both zero, and since zero indicates success, they are not needed. Since smb2_print_status() has been removed, the last element in the array is no longer needed. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb/client: reduce loop count in map_smb2_to_linux_error() by halfChenXiaoSong
The smb2_error_map_table array currently has 1743 elements. When searching for the last element and calling smb2_print_status(), 3486 comparisons are needed. The loop in smb2_print_status() is unnecessary, smb2_print_status() can be removed, and only iterate over the array once, printing the message when the target status code is found. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb: client: Add tracepoint for krb5 authPaulo Alcantara
Add tracepoint to help debugging krb5 auth failures. Example: $ trace-cmd record -e smb3_kerberos_auth $ mount.cifs ... $ trace-cmd report mount.cifs-1667 [003] ..... 5810.668549: smb3_kerberos_auth: vers=2 host=w22-dc1.zelda.test ip=192.168.124.30:445 sec=krb5 uid=0 cruid=0 user=root pid=1667 upcall_target=app err=-126 Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Cc: Pierguido Lambri <plambri@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb: client: improve error message when creating SMB sessionPaulo Alcantara
When failing to create a new SMB session with 'sec=krb5' for example, the following error message isn't very useful CIFS: VFS: \\srv Send error in SessSetup = -126 Improve it by printing the following instead on dmesg CIFS: VFS: \\srv failed to create a new SMB session with Kerberos: -126 Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Cc: Pierguido Lambri <plambri@redhat.com> Reviewed-by: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05smb: client: relax session and tcon reconnect attemptsPaulo Alcantara
When the client re-establishes connection to the server, it will queue a worker thread that will attempt to reconnect sessions and tcons on every two seconds, which is kinda overkill as it is a very common scenario when having expired passwords or KRB5 TGT tickets, or deleted shares. Use an exponential backoff strategy to handle session/tcon reconnect attempts in the worker thread to prevent the client from overloading the system when it is very unlikely to re-establish any session/tcon soon while client is idle. Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Cc: Pierguido Lambri <plambri@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-05Merge tag 'fuse-update-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: - Add mechanism for cleaning out unused, stale dentries; controlled via a module option (Luis Henriques) - Fix various bugs - Cleanups * tag 'fuse-update-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: Uninitialized variable in fuse_epoch_work() fuse: fix io-uring list corruption for terminated non-committed requests fuse: signal that a fuse inode should exhibit local fs behaviors fuse: Always flush the page cache before FOPEN_DIRECT_IO write fuse: Invalidate the page cache after FOPEN_DIRECT_IO write fuse: rename 'namelen' to 'namesize' fuse: use strscpy instead of strcpy fuse: refactor fuse_conn_put() to remove negative logic. fuse: new work queue to invalidate dentries from old epochs fuse: new work queue to periodically invalidate expired dentries dcache: export shrink_dentry_list() and add new helper d_dispose_if_unused() fuse: add WARN_ON and comment for RCU revalidate fuse: Fix whitespace for fuse_uring_args_to_ring() comment fuse: missing copy_finish in fuse-over-io-uring argument copies fuse: fix readahead reclaim deadlock
2025-12-05mshv: Cleanly shutdown root partition with MSHVPraveen K Paladugu
Root partitions running on MSHV currently attempt ACPI power-off, which MSHV intercepts and triggers a Machine Check Exception (MCE), leading to a kernel panic. Root partitions panic with a trace similar to: [ 81.306348] reboot: Power down [ 81.314709] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 0: b2000000c0060001 [ 81.314711] mce: [Hardware Error]: TSC 3b8cb60a66 PPIN 11d98332458e4ea9 [ 81.314713] mce: [Hardware Error]: PROCESSOR 0:606a6 TIME 1759339405 SOCKET 0 APIC 0 microcode ffffffff [ 81.314715] mce: [Hardware Error]: Run the above through 'mcelog --ascii' [ 81.314716] mce: [Hardware Error]: Machine check: Processor context corrupt [ 81.314717] Kernel panic - not syncing: Fatal machine check To avoid this, configure the sleep state in the hypervisor and invoke the HVCALL_ENTER_SLEEP_STATE hypercall as the final step in the shutdown sequence. This ensures a clean and safe shutdown of the root partition. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-developed-by: Anatol Belski <anbelski@linux.microsoft.com> Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@linux.microsoft.com> Acked-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Use reboot notifier to configure sleep statePraveen K Paladugu
Configure sleep state information from ACPI in MSHV hypervisor using a reboot notifier. This data allows the hypervisor to correctly power off the host during shutdown. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-developed-by: Anatol Belski <anbelski@linux.microsoft.com> Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@linux.microsoft.com> Reviewed-by: Stansialv Kinsburskii <skinsburskii@linux.miscrosoft.com> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Add definitions for MSHV sleep state configurationPraveen K Paladugu
Add the definitions required to configure sleep states in mshv hypervsior. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-developed-by: Anatol Belski <anbelski@linux.microsoft.com> Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Acked-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Add support for movable memory regionsStanislav Kinsburskii
Introduce support for movable memory regions in the Hyper-V root partition driver to improve memory management flexibility and enable advanced use cases such as dynamic memory remapping. Mirror the address space between the Linux root partition and guest VMs using HMM. The root partition owns the memory, while guest VMs act as devices with page tables managed via hypercalls. MSHV handles VP intercepts by invoking hmm_range_fault() and updating SLAT entries. When memory is reclaimed, HMM invalidates the relevant regions, prompting MSHV to clear SLAT entries; guest VMs will fault again on access. Integrate mmu_interval_notifier for movable regions, implement handlers for HMM faults and memory invalidation, and update memory region mapping logic to support movable regions. While MMU notifiers are commonly used in virtualization drivers, this implementation leverages HMM (Heterogeneous Memory Management) for its specialized functionality. HMM provides a framework for mirroring, invalidation, and fault handling, reducing boilerplate and improving maintainability compared to generic MMU notifiers. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Add refcount and locking to mem regionsStanislav Kinsburskii
Introduce kref-based reference counting and spinlock protection for memory regions in Hyper-V partition management. This change improves memory region lifecycle management and ensures thread-safe access to the region list. Previously, the regions list was protected by the partition mutex. However, this approach is too heavy for frequent fault and invalidation operations. Finer grained locking is now used to improve efficiency and concurrency. This is a precursor to supporting movable memory regions. Fault and invalidation handling for movable regions will require safe traversal of the region list and holding a region reference while performing invalidation or fault operations. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Fix huge page handling in memory region traversalStanislav Kinsburskii
The previous code assumed that if a region's first page was huge, the entire region consisted of huge pages and stored this in a large_pages flag. This premise is incorrect not only for movable regions (where pages can be split and merged on invalidate callbacks or page faults), but even for pinned regions: THPs can be split and merged during allocation, so a large, pinned region may contain a mix of huge and regular pages. This change removes the large_pages flag and replaces region-wide assumptions with per-chunk inspection of the actual page size when mapping, unmapping, sharing, and unsharing. This makes huge page handling correct for mixed-page regions and avoids relying on stale metadata that can easily become invalid as memory is remapped. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05mshv: Move region management to mshv_regions.cStanislav Kinsburskii
Refactor memory region management functions from mshv_root_main.c into mshv_regions.c for better modularity and code organization. Adjust function calls and headers to use the new implementation. Improve maintainability and separation of concerns in the mshv_root module. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>