path: root/arch
Age | Commit message | Author
2026-03-04 | KVM: nSVM: Drop the non-architectural consistency check for NP_ENABLE | Yosry Ahmed
KVM currently fails a nested VMRUN and injects VMEXIT_INVALID (aka SVM_EXIT_ERR) if L1 sets NP_ENABLE and the host does not support NPTs. At first glance, it seems like the check should actually be for guest_cpu_cap_has(X86_FEATURE_NPT) instead, as it is possible for the host to support NPTs but the guest CPUID to not advertise it. However, the consistency check is not architectural to begin with. The APM does not mention VMEXIT_INVALID if NP_ENABLE is set on a processor that does not have X86_FEATURE_NPT. Hence, NP_ENABLE should be ignored if X86_FEATURE_NPT is not available for L1, so sanitize it when copying from the VMCB12 to KVM's cache. Apart from the consistency check, NP_ENABLE in VMCB12 is currently ignored because the bit is actually copied from VMCB01 to VMCB02, not from VMCB12. Fixes: 4b16184c1cca ("KVM: SVM: Initialize Nested Nested MMU context on VMRUN") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-15-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
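The sanitization described in this commit message can be modeled in a few lines of standalone C. This is only an illustrative sketch, not the kernel change itself: sanitize_nested_ctl() and its guest_has_npt parameter are hypothetical names, and the one architectural detail assumed is that NP_ENABLE is bit 0 of the VMCB nested paging control field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* NP_ENABLE is bit 0 of the VMCB nested paging control field. */
#define NESTED_CTL_NP_ENABLE (1ULL << 0)

/*
 * Hypothetical helper: compute the nested_ctl value to cache from vmcb12.
 * If nested paging is not exposed to L1, NP_ENABLE is dropped silently
 * instead of failing VMRUN with VMEXIT_INVALID.
 */
static uint64_t sanitize_nested_ctl(uint64_t vmcb12_nested_ctl, bool guest_has_npt)
{
	if (!guest_has_npt)
		vmcb12_nested_ctl &= ~NESTED_CTL_NP_ENABLE;
	return vmcb12_nested_ctl;
}

/* Usage: the bit survives only when NPT is exposed to L1. */
static void sanitize_nested_ctl_example(void)
{
	assert(sanitize_nested_ctl(NESTED_CTL_NP_ENABLE, true) == NESTED_CTL_NP_ENABLE);
	assert(sanitize_nested_ctl(NESTED_CTL_NP_ENABLE, false) == 0);
}
```

Masking at copy time means later code only ever sees a value that is valid for the feature set advertised to L1.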
2026-03-04 | KVM: nSVM: Drop nested_vmcb_check_{save/control}() wrappers | Yosry Ahmed
The wrappers provide little value and make it harder to see what KVM is checking in the normal flow. Drop them. Opportunistically fixup comments referring to the functions, adding '()' to make it clear it's a reference to a function. No functional change intended. Co-developed-by: Sean Christopherson <seanjc@google.com> Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-14-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | KVM: nSVM: Clear tracking of L1->L2 NMI and soft IRQ on nested #VMEXIT | Yosry Ahmed
KVM clears tracking of L1->L2 injected NMIs (i.e. nmi_l1_to_l2) and soft IRQs (i.e. soft_int_injected) on a synthesized #VMEXIT(INVALID) due to failed VMRUN. However, they are not explicitly cleared in other synthesized #VMEXITs. soft_int_injected is always cleared after the first VMRUN of L2 when completing interrupts, as any re-injection is then tracked by KVM (instead of purely in vmcb02). nmi_l1_to_l2 is not cleared after the first VMRUN if NMI injection failed, as KVM still needs to keep track that the NMI originated from L1 to avoid blocking NMIs for L1. It is only cleared when the NMI injection succeeds. KVM could synthesize a #VMEXIT to L1 before successfully injecting the NMI into L2 (e.g. due to a #NPF on L2's NMI handler in L1's NPTs). In this case, nmi_l1_to_l2 will remain true, and KVM may not correctly mask NMIs and intercept IRET when injecting an NMI into L1. Clear both nmi_l1_to_l2 and soft_int_injected in nested_svm_vmexit(), i.e. for all #VMEXITs except those that occur due to failed consistency checks, as those happen before nmi_l1_to_l2 or soft_int_injected are set. Fixes: 159fc6fa3b7d ("KVM: nSVM: Transparently handle L1 -> L2 NMI re-injection") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-13-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | KVM: nSVM: Clear EVENTINJ fields in vmcb12 on nested #VMEXIT | Yosry Ahmed
According to the APM, from the reference of the VMRUN instruction:

  Upon #VMEXIT, the processor performs the following actions in order to return to the host execution context:
  ...
  clear EVENTINJ field in VMCB

KVM already syncs EVENTINJ fields from vmcb02 to cached vmcb12 on every L2->L0 #VMEXIT. Since these fields are zeroed by the CPU on #VMEXIT, they will mostly be zeroed in vmcb12 on nested #VMEXIT by nested_svm_vmexit(). However, this is not the case when:

  1. Consistency checks fail, as nested_svm_vmexit() is not called.
  2. Entering guest mode fails before L2 runs (e.g. due to failed load of CR3).

(2) was broken by commit 2d8a42be0e2b ("KVM: nSVM: synchronize VMCB controls updated by the processor on every vmexit"), as prior to that nested_svm_vmexit() always zeroed EVENTINJ fields.

Explicitly clear the fields in all nested #VMEXIT code paths.

Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
Fixes: 2d8a42be0e2b ("KVM: nSVM: synchronize VMCB controls updated by the processor on every vmexit")
Cc: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260303003421.2185681-12-yosry@kernel.org
[sean: massage changelog formatting]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | KVM: nSVM: Clear GIF on nested #VMEXIT(INVALID) | Yosry Ahmed
According to the APM, GIF is set to 0 on any #VMEXIT, including an #VMEXIT(INVALID) due to failed consistency checks. Clear GIF on consistency check failures. Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-11-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | KVM: nSVM: Triple fault if restore host CR3 fails on nested #VMEXIT | Yosry Ahmed
If loading L1's CR3 fails on a nested #VMEXIT, nested_svm_vmexit() returns an error code that is ignored by most callers, and continues to run L1 with corrupted state. A sane recovery is not possible in this case, and HW behavior is to cause a shutdown. Inject a triple fault instead, and do not return early from nested_svm_vmexit(). Continue cleaning up the vCPU state (e.g. clear pending exceptions), to handle the failure as gracefully as possible. From the APM: Upon #VMEXIT, the processor performs the following actions in order to return to the host execution context: ... if (illegal host state loaded, or exception while loading host state) shutdown else execute first host instruction following the VMRUN Remove the return value of nested_svm_vmexit(), which is mostly unchecked anyway. Fixes: d82aaef9c88a ("KVM: nSVM: use nested_svm_load_cr3() on guest->host switch") CC: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-10-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
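The control-flow change in this commit (request a shutdown, but keep cleaning up, and return nothing) can be pictured with a small standalone model. It is a sketch under loud assumptions: struct vcpu_model, its fields and nested_vmexit_model() are hypothetical stand-ins and do not correspond to the real KVM structures or to the body of nested_svm_vmexit().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for vCPU state. */
struct vcpu_model {
	bool in_guest_mode;
	bool exception_pending;
	bool triple_fault_requested;
};

/*
 * Model of the exit path: a failed load of L1's CR3 requests a triple
 * fault (shutdown), but does not cut the cleanup short. The function
 * returns void, mirroring the removal of the mostly unchecked return value.
 */
static void nested_vmexit_model(struct vcpu_model *vcpu, bool load_l1_cr3_ok)
{
	vcpu->in_guest_mode = false;

	if (!load_l1_cr3_ok)
		vcpu->triple_fault_requested = true;

	/* Cleanup continues regardless of the CR3 result. */
	vcpu->exception_pending = false;
}

/* Usage: even on failure, the vCPU leaves guest mode and is cleaned up. */
static void nested_vmexit_model_example(void)
{
	struct vcpu_model vcpu = { .in_guest_mode = true, .exception_pending = true };

	nested_vmexit_model(&vcpu, false);
	assert(vcpu.triple_fault_requested);
	assert(!vcpu.in_guest_mode && !vcpu.exception_pending);
}
```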
2026-03-04 | KVM: nSVM: Triple fault if mapping VMCB12 fails on nested #VMEXIT | Yosry Ahmed
KVM currently injects a #GP and hopes for the best if mapping VMCB12 fails on nested #VMEXIT, and only if the failure mode is -EINVAL. Mapping the VMCB12 could also fail if creating host mappings fails. After the #GP is injected, nested_svm_vmexit() bails early, without cleaning up (e.g. KVM_REQ_GET_NESTED_STATE_PAGES is set, is_guest_mode() is true, etc). Instead of optionally injecting a #GP, triple fault the guest if mapping VMCB12 fails since KVM cannot make a sane recovery. The APM states that a #VMEXIT will triple fault if host state is illegal or an exception occurs while loading host state, so the behavior is not entirely made up. Do not return early from nested_svm_vmexit(), continue cleaning up the vCPU state (e.g. switch back to vmcb01), to handle the failure as gracefully as possible. Fixes: cf74a78b229d ("KVM: SVM: Add VMEXIT handler and intercepts") CC: stable@vger.kernel.org Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-9-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | KVM: nSVM: Refactor writing vmcb12 on nested #VMEXIT as a helper | Yosry Ahmed
Move mapping vmcb12 and updating it out of nested_svm_vmexit() into a helper, no functional change intended. CC: stable@vger.kernel.org Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-8-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | KVM: nSVM: Refactor checking LBRV enablement in vmcb12 into a helper | Yosry Ahmed
Refactor the vCPU cap and vmcb12 flag checks into a helper. The unlikely() annotation is dropped, it's unlikely (huh) to make a difference and the CPU will probably predict it better on its own. CC: stable@vger.kernel.org Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-7-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | KVM: nSVM: Always inject a #GP if mapping VMCB12 fails on nested VMRUN | Yosry Ahmed
nested_svm_vmrun() currently only injects a #GP if kvm_vcpu_map() fails with -EINVAL. But it could also fail with -EFAULT if creating a host mapping failed. Inject a #GP in all cases, no reason to treat failure modes differently. Fixes: 8c5fbf1a7231 ("KVM/nSVM: Use the new mapping API for mapping guest memory") CC: stable@vger.kernel.org Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-6-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
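The policy this commit describes, treating every mapping failure the same way, can be sketched as a tiny decision function. The names below (enum vmrun_action, vmcb12_map_action()) are hypothetical and exist only for illustration; the errno values stand in for whatever kvm_vcpu_map() returned.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome of trying to map vmcb12 during nested VMRUN. */
enum vmrun_action { VMRUN_CONTINUE, VMRUN_INJECT_GP };

/*
 * Any non-zero result from the mapping step leads to a #GP, whether the
 * failure was -EINVAL or -EFAULT. Only success lets VMRUN proceed.
 */
static enum vmrun_action vmcb12_map_action(int map_ret)
{
	return map_ret ? VMRUN_INJECT_GP : VMRUN_CONTINUE;
}

/* Usage: both failure modes are handled identically. */
static void vmcb12_map_action_example(void)
{
	assert(vmcb12_map_action(-EINVAL) == VMRUN_INJECT_GP);
	assert(vmcb12_map_action(-EFAULT) == VMRUN_INJECT_GP);
	assert(vmcb12_map_action(0) == VMRUN_CONTINUE);
}
```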
2026-03-04 | KVM: SVM: Add missing save/restore handling of LBR MSRs | Yosry Ahmed
MSR_IA32_DEBUGCTLMSR and LBR MSRs are currently not enumerated by KVM_GET_MSR_INDEX_LIST, and LBR MSRs cannot be set with KVM_SET_MSRS. So save/restore is completely broken. Fix it by adding the MSRs to msrs_to_save_base, and allowing writes to LBR MSRs from userspace only (as they are read-only MSRs) if LBR virtualization is enabled. Additionally, to correctly restore L1's LBRs while L2 is running, make sure the LBRs are copied from the captured VMCB01 save area in svm_copy_vmrun_state(). Note, for VMX, this also fixes a flaw where MSR_IA32_DEBUGCTLMSR isn't reported as an MSR to save/restore. Note #2, over-reporting MSR_IA32_LASTxxx on Intel is ok, as KVM already handles unsupported reads and writes thanks to commit b5e2fec0ebc3 ("KVM: Ignore DEBUGCTL MSRs with no effect") (kvm_do_msr_access() will morph the unsupported userspace write into a nop). Fixes: 24e09cbf480a ("KVM: SVM: enable LBR virtualization") Cc: stable@vger.kernel.org Reported-by: Jim Mattson <jmattson@google.com> Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-4-yosry@kernel.org [sean: guard with lbrv checks, massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | KVM: SVM: Switch svm_copy_lbrs() to a macro | Yosry Ahmed
In preparation for using svm_copy_lbrs() with 'struct vmcb_save_area' without a containing 'struct vmcb', and later even 'struct vmcb_save_area_cached', make it a macro. Macros are generally not preferred compared to functions, mainly due to type-safety. However, in this case it seems like having a simple macro copying a few fields is better than copy-pasting the same 5 lines of code in different places. Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-3-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
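To illustrate why a macro fits here, the sketch below copies five LBR-related fields between two unrelated struct types, which a single typed C function could not do. It is not the kernel's svm_copy_lbrs(): COPY_LBRS(), both struct types and the field names are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical full save area: LBR fields plus unrelated state. */
struct save_area_model {
	uint64_t rip;
	uint64_t dbgctl, br_from, br_to, last_excp_from, last_excp_to;
};

/* Hypothetical cached subset holding only the LBR fields. */
struct lbr_cache_model {
	uint64_t dbgctl, br_from, br_to, last_excp_from, last_excp_to;
};

/* Works for any pair of pointers whose targets have these five members. */
#define COPY_LBRS(to, from)						\
	do {								\
		(to)->dbgctl         = (from)->dbgctl;			\
		(to)->br_from        = (from)->br_from;			\
		(to)->br_to          = (from)->br_to;			\
		(to)->last_excp_from = (from)->last_excp_from;		\
		(to)->last_excp_to   = (from)->last_excp_to;		\
	} while (0)

/* Usage: copy from the full save area into the cached subset. */
static void copy_lbrs_example(void)
{
	struct save_area_model src = { .rip = 0x1000, .dbgctl = 1, .br_from = 2,
				       .br_to = 3, .last_excp_from = 4, .last_excp_to = 5 };
	struct lbr_cache_model dst = { 0 };

	COPY_LBRS(&dst, &src);
	assert(dst.dbgctl == 1 && dst.br_from == 2 && dst.br_to == 3);
	assert(dst.last_excp_from == 4 && dst.last_excp_to == 5);
}
```

The trade-off is the one the commit message names: the macro gives up type checking of its arguments in exchange for not duplicating the same five assignments per struct type.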
2026-03-04 | KVM: nSVM: Avoid clearing VMCB_LBR in vmcb12 | Yosry Ahmed
svm_copy_lbrs() always marks VMCB_LBR dirty in the destination VMCB. However, nested_svm_vmexit() uses it to copy LBRs to vmcb12, and clearing clean bits in vmcb12 is not architecturally defined. Move vmcb_mark_dirty() to callers and drop it for vmcb12. This also facilitates incoming refactoring that does not pass the entire VMCB to svm_copy_lbrs(). Fixes: d20c796ca370 ("KVM: x86: nSVM: implement nested LBR virtualization") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-2-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | KVM: SVM: Inject #UD for INVLPGA if EFER.SVME=0 | Kevin Cheng
INVLPGA should cause a #UD when EFER.SVME is not set. Add a check to properly inject #UD when EFER.SVME=0. Fixes: ff092385e828 ("KVM: SVM: Implement INVLPGA") Cc: stable@vger.kernel.org Signed-off-by: Kevin Cheng <chengkev@google.com> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20260228033328.2285047-3-chengkev@google.com [sean: tag for stable@] Signed-off-by: Sean Christopherson <seanjc@google.com>
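The check added by this commit amounts to testing one EFER bit before emulating the instruction. The sketch below is a standalone model with hypothetical names (enum invlpga_action, invlpga_check()); the only hardware fact it relies on is that SVME is bit 12 of EFER.

```c
#include <assert.h>
#include <stdint.h>

/* EFER.SVME (Secure Virtual Machine Enable) is bit 12 of the EFER MSR. */
#define EFER_SVME (1ULL << 12)

/* Hypothetical outcome of intercepting INVLPGA. */
enum invlpga_action { INVLPGA_EMULATE, INVLPGA_INJECT_UD };

/* With SVME clear, SVM instructions such as INVLPGA are undefined: #UD. */
static enum invlpga_action invlpga_check(uint64_t efer)
{
	return (efer & EFER_SVME) ? INVLPGA_EMULATE : INVLPGA_INJECT_UD;
}

/* Usage: only a guest with EFER.SVME=1 gets the instruction emulated. */
static void invlpga_check_example(void)
{
	assert(invlpga_check(EFER_SVME) == INVLPGA_EMULATE);
	assert(invlpga_check(0) == INVLPGA_INJECT_UD);
}
```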
2026-03-04 | KVM: nSVM: Delay setting soft IRQ RIP tracking fields until vCPU run | Sean Christopherson
In the save+restore path, when restoring nested state, the values of RIP and CS base passed into nested_vmcb02_prepare_control() are mostly incorrect. They are both pulled from the vmcb02. For CS base, the value is only correct if system regs are restored before nested state. The value of RIP is whatever the vCPU had in vmcb02 before restoring nested state (zero on a freshly created vCPU). Instead, take a similar approach to NextRIP, and delay initializing the RIP tracking fields until shortly before the vCPU is run, to make sure the most up-to-date values of RIP and CS base are used regardless of KVM_SET_SREGS, KVM_SET_REGS, and KVM_SET_NESTED_STATE's relative ordering. Fixes: cc440cdad5b7 ("KVM: nSVM: implement KVM_GET_NESTED_STATE and KVM_SET_NESTED_STATE") CC: stable@vger.kernel.org Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260225005950.3739782-8-yosry@kernel.org [sean: deal with the svm_cancel_injection() madness] Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | KVM: nSVM: Delay stuffing L2's current RIP into NextRIP until vCPU run | Yosry Ahmed
For guests with NRIPS disabled, L1 does not provide NextRIP when running an L2 with an injected soft interrupt, instead it advances L2's RIP before running it. KVM uses L2's current RIP as the NextRIP in vmcb02 to emulate a CPU without NRIPS. However, in svm_set_nested_state(), the value used for L2's current RIP comes from vmcb02, which is just whatever the vCPU had in vmcb02 before restoring nested state (zero on a freshly created vCPU). Passing the cached RIP value instead (i.e. kvm_rip_read()) would only fix the issue if registers are restored before nested state. Instead, split the logic of setting NextRIP in vmcb02. Handle the 'normal' case of initializing vmcb02's NextRIP using NextRIP from vmcb12 (or KVM_GET_NESTED_STATE's payload) in nested_vmcb02_prepare_control(). Delay the special case of stuffing L2's current RIP into vmcb02's NextRIP until shortly before the vCPU is run, to make sure the most up-to-date value of RIP is used regardless of KVM_SET_REGS and KVM_SET_NESTED_STATE's relative ordering. Fixes: cc440cdad5b7 ("KVM: nSVM: implement KVM_GET_NESTED_STATE and KVM_SET_NESTED_STATE") CC: stable@vger.kernel.org Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260225005950.3739782-7-yosry@kernel.org [sean: use new helper, svm_fixup_nested_rips()] Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04 | Merge tag 'riscv-soc-fixes-for-v7.0-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes | Arnd Bergmann
RISC-V soc fixes for v7.0-rc1

drivers: Fix leaks in probe/init function teardown code in three drivers.

microchip: Fix a warning introduced by a recent binding change, that made resets required on Polarfire SoC's CAN IP.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-soc-fixes-for-v7.0-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  cache: ax45mp: Fix device node reference leak in ax45mp_cache_init()
  cache: starfive: fix device node leak in starlink_cache_init()
  riscv: dts: microchip: add can resets to mpfs
  soc: microchip: mpfs: Fix memory leak in mpfs_sys_controller_probe()

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2026-03-04 | arm64: dts: qcom: monaco: Fix UART10 pinconf | Loic Poulain
UART10 RTS and TX pins were incorrectly mapped to gpio84 and gpio85. Correct them to gpio85 (RTS) and gpio86 (TX) to match the hardware I/O mapping. Fixes: 467284a3097f ("arm64: dts: qcom: qcs8300: Add QUPv3 configuration") Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260202155611.1568-1-loic.poulain@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: monaco: Add EL2 overlay | Mukesh Ojha
All Monaco IOT variant boards use the Gunyah hypervisor, which means that, so far, a Linux-based OS could only boot in EL1 on those devices. However, it is possible to boot Linux at EL2 on these devices [1]. When running under Gunyah, the remote processor firmware IOMMU streams are controlled by Gunyah. However, without Gunyah, the IOMMU is managed by the consumer of this DeviceTree. Therefore, describe the firmware streams for each remote processor. Add an EL2-specific DT overlay and apply it to Monaco IOT variant devices to create -el2.dtb for each of them alongside the "normal" dtb. [1] https://docs.qualcomm.com/bundle/publicresource/topics/80-70020-4/boot-developer-touchpoints.html#uefi Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260127-talos-el2-overlay-v2-2-b6a2266532c4@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: lemans: disable zap-shader for EL2 configuration | Mukesh Ojha
The zap shader is not needed in EL2, as Linux can zap the GPU on its own. Disable zap-shader for the Lemans EL2 configuration. Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260127-talos-el2-overlay-v2-1-b6a2266532c4@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: hamoa: Add EL2 overlay for hamoa-evk | Xin Liu
Add support for building an EL2 combined DTB for the hamoa-evk in the Qualcomm DTS Makefile. The new hamoa-iot-evk-el2.dtb is generated by combining the base hamoa-iot-evk.dtb with the x1-el2.dtbo overlay, enabling EL2-specific configurations required by the platform. Signed-off-by: Xin Liu <xin.liu@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260127062425.1084673-1-xin.liu@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: talos: Add missing clock-names to GCC | Konrad Dybcio
The binding for this clock controller requires that clock-names are present. They're not really used by the kernel driver, but they're marked as required, so someone might have assumed it's done on purpose (where in reality we try to stay away from that since index-based references are faster, take up less space and are already widely used) and referenced it in drivers for another OS. Hence, do the least painful thing and add the missing entries. Fixes: 8e266654a2fe ("arm64: dts: qcom: add QCS615 platform") Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Taniya Das <taniya.das@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260126-topic-talos_dt_warn-v1-1-c452afc647ad@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: ipq9574: remove MP5496 regulator references from SoC dtsi | Gabor Juhos
The 'cpu-supply' properties in the IPQ9574 SoC dtsi reference a regulator provided by an MP5496 PMIC via the RPM firmware, whose node is defined externally in the common RDP dtsi file. Since the PMIC is not part of the SoC, it should not be referenced from the SoC-specific dtsi, so remove the properties from there and define them in the common RDP dtsi instead. While at it, also change the prefix of the label from 'ipq9574' to 'mp5496' to keep it consistent with the labels of the l{2,5} regulators provided by the same PMIC. No functional changes. According to dtx_diff there are no differences between the ipq9574*.dtb files built with and without the change. Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260123-ipq9574-mp5496-cleanup-v1-1-9fa86f72b873@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: kodiak: Fix PCIe1 PHY ref clock voting | Krishna Chaitanya Chundru
GCC_PCIE_CLKREF_EN controls a repeater that provides the reference clock only to the PCIe0 PHY. PCIe1 PHY receives its refclk directly from the CXO source. If the PCIe1 driver in HLOS votes for or against GCC_PCIE_CLKREF_EN, it will inadvertently modify the refclk to PCIe0 as well. Since PCIe0 is managed by WPSS while PCIe1 is managed in HLOS, there is no mechanism to coordinate these votes. As a result, HLOS may disable this repeater during suspend and cut off the PCIe0 PHY refclk while PCIe0 is still active. Replace the unused GCC_PCIE_CLKREF_EN clock entry with RPMH_CXO_CLK to reflect the actual hardware wiring and prevent unintended changes to PCIe0 clocking. Fixes: 92e0ee9f83b3 ("arm64: dts: qcom: sc7280: Add PCIe and PHY related nodes") Cc: stable@vger.kernel.org Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260123-fix_pcie1_phy_clk-v1-1-38f82ea01792@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: Add support for ECS LIVA QC710 | Val Packett
Add a device tree for the ECS LIVA QC710 (Snapdragon 7c) mini PC/devkit.

Working:
- Wi-Fi (wcn3990 hw1.0)
- Bluetooth
- USB Type-A (USB3 and USB2)
- Ethernet (over USB2)
- HDMI Display
- eMMC
- SDHC (microSD slot)

Not included:
- HDMI Audio
- EC (IT8987)

Signed-off-by: Val Packett <val@packett.cool>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120234029.419825-10-val@packett.cool
[bjorn: Reordered apps_rsc and tlmm nodes]
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: sdm630: add SPI7 interface | Gianluca Boiano
Add spi7 interface to SDM630 device tree. Signed-off-by: Gianluca Boiano <morf3089@gmail.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260120193634.1089688-1-morf3089@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: Add base PURWA-IOT-EVK board | Yijie Yang
The PURWA-IOT-EVK is an evaluation platform for IoT products, composed of the Purwa IoT SoM and a carrier board. Together, they form a complete embedded system capable of booting to UART.

PURWA-IOT-EVK uses the PS8833 as a retimer for USB0, unlike HAMOA-IOT-EVK. Meanwhile, USB0 bypasses the SBU selector FSUSB42.

Enable the following peripherals on the carrier board:
- UART
- On-board regulators
- USB Type-C mux
- Pinctrl
- Embedded USB (EUSB) repeaters
- NVMe
- pmic-glink
- USB DisplayPorts
- Bluetooth
- WLAN
- Audio
- PCIe ports for PCIe3 through PCIe6a
- TPM

Signed-off-by: Yijie Yang <yijie.yang@oss.qualcomm.com>
Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260202073555.1345260-4-yijie.yang@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: Add PURWA-IOT-SOM platform | Yijie Yang
The PURWA-IOT-SOM is a compact computing module that integrates a System on Chip (SoC), specifically the x1p42100, along with essential components optimized for IoT applications. It is designed to be mounted on carrier boards, enabling the development of complete embedded systems.

Purwa uses a slightly different Iris HW revision (8.1.2 on Hamoa, 8.1.11 on Purwa). Support will be added later.

Enable the following peripherals on the SOM:
- Regulators on the SOM
- Reserved memory regions
- PCIe3, PCIe4, PCIe5, PCIe6a
- USB0 through USB6 and their PHYs
- ADSP, CDSP
- Graphic

Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Yijie Yang <yijie.yang@oss.qualcomm.com>
Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260202073555.1345260-3-yijie.yang@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: qcs6490-rubikpi3: Use lt9611 DSI Port B | Hongyang Zhao
The LT9611 HDMI bridge on RubikPi3 has DSI physically connected to Port B. Update the devicetree to use port@1 which corresponds to Port B input on the LT9611. Fixes: f055a39f6874 ("arm64: dts: qcom: Add qcs6490-rubikpi3 board dts") Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Hongyang Zhao <hongyang.zhao@thundersoft.com> Reviewed-by: Roger Shimizu <rosh@debian.org> Tested-by: Roger Shimizu <rosh@debian.org> Link: https://lore.kernel.org/r/20260207-rubikpi-next-20260116-v3-3-23b9aa189a3a@thundersoft.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: talos: Mark usb controllers are wakeup capable devices | Krishna Kurapati
USB controllers on talos are wakeup capable. Hence add wakeup-source property to both controller nodes. Signed-off-by: Krishna Kurapati <krishna.kurapati@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260128062720.437712-3-krishna.kurapati@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: talos: Flatten usb controller nodes | Krishna Kurapati
Flatten usb controller nodes and update to using latest bindings and flattened driver approach. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Krishna Kurapati <krishna.kurapati@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260128062720.437712-2-krishna.kurapati@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: Add Redmi Note 8T | Barnabás Czémán
Redmi Note 8T (willow) is very similar to Redmi Note 8 (ginkgo); the only difference is that willow has NFC. Make a common base from the ginkgo devicetree for both devices. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260126-xiaomi-willow-v3-7-aad7b106c311@mainlining.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: sm6125-xiaomi-ginkgo: Fix reserved gpio ranges | Barnabás Czémán
The device was crashing on boot because the reserved gpio ranges were wrongly defined. Correct the ranges to avoid the pinctrl crash. Fixes: 9b1a6c925c88 ("arm64: dts: qcom: sm6125: Initial support for xiaomi-ginkgo") Tested-by: Biswapriyo Nath <nathbappai@gmail.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org> Link: https://lore.kernel.org/r/20260126-xiaomi-willow-v3-5-aad7b106c311@mainlining.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: sm6125-xiaomi-ginkgo: Remove extcon | Barnabás Czémán
GPIO pin 102 is related to DisplayPort, which is not supported by this device and is also disabled downstream. Remove the unnecessary extcon-usb node. Fixes: 9b1a6c925c88 ("arm64: dts: qcom: sm6125: Initial support for xiaomi-ginkgo") Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org> Link: https://lore.kernel.org/r/20260126-xiaomi-willow-v3-4-aad7b106c311@mainlining.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: sm6125-xiaomi-ginkgo: Set memory-region for framebuffer | Barnabás Czémán
Use memory-region property for framebuffer instead of reg. Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260126-xiaomi-willow-v3-3-aad7b106c311@mainlining.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: sm6125-xiaomi-ginkgo: Correct reserved memory ranges | Barnabás Czémán
The device was crashing under high memory load because the reserved memory ranges were wrongly defined. Correct the ranges to avoid the crashes. Change the ramoops memory range to match the values from the recovery, to be able to get the results from the device. Fixes: 9b1a6c925c88 ("arm64: dts: qcom: sm6125: Initial support for xiaomi-ginkgo") Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org> Link: https://lore.kernel.org/r/20260126-xiaomi-willow-v3-2-aad7b106c311@mainlining.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | arm64: dts: qcom: sm6125-xiaomi-ginkgo: Remove board-id | Barnabás Czémán
Remove board-id, as it is not necessary for the bootloader. Fixes: 9b1a6c925c88 ("arm64: dts: qcom: sm6125: Initial support for xiaomi-ginkgo") Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org> Link: https://lore.kernel.org/r/20260126-xiaomi-willow-v3-1-aad7b106c311@mainlining.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | ARM: dts: qcom: Drop unused .dtsi | Rob Herring (Arm)
These .dtsi files are not included anywhere in the tree and can't be tested. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20251212203226.458694-2-robh@kernel.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-04 | KVM: nSVM: Always use NextRIP as vmcb02's NextRIP after first L2 VMRUN | Yosry Ahmed
For guests with NRIPS disabled, L1 does not provide NextRIP when running an L2 with an injected soft interrupt, instead it advances the current RIP before running it. KVM uses the current RIP as the NextRIP in vmcb02 to emulate a CPU without NRIPS. However, after L2 runs the first time, NextRIP will be updated by the CPU and/or KVM, and the current RIP is no longer the correct value to use in vmcb02. Hence, after save/restore, use the current RIP if and only if a nested run is pending, otherwise use NextRIP. Give soft_int_next_rip the same treatment, as it's the same logic, just for a narrower use case. Fixes: cc440cdad5b7 ("KVM: nSVM: implement KVM_GET_NESTED_STATE and KVM_SET_NESTED_STATE") CC: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260225005950.3739782-6-yosry@kernel.org [sean: give soft_int_next_rip the same treatment] Signed-off-by: Sean Christopherson <seanjc@google.com>
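The selection rule in this commit message ("use the current RIP if and only if a nested run is pending, otherwise use NextRIP") can be written as a small pure function. This is one reading of the message, not the kernel code: pick_vmcb02_next_rip() and all four parameters are hypothetical, and the guest_has_nrips branch is an assumption drawn from the phrase "for guests with NRIPS disabled".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose the NextRIP value to place in vmcb02.
 *
 * Without NRIPS, L1 advances L2's RIP itself, so the current RIP stands in
 * for NextRIP, but only until L2 has run once. After that, the saved
 * NextRIP (updated by the CPU and/or KVM) is the value to use.
 */
static uint64_t pick_vmcb02_next_rip(bool guest_has_nrips, bool nested_run_pending,
				     uint64_t saved_next_rip, uint64_t l2_current_rip)
{
	if (!guest_has_nrips && nested_run_pending)
		return l2_current_rip;
	return saved_next_rip;
}

/* Usage: current RIP is used only for a pending first run without NRIPS. */
static void pick_vmcb02_next_rip_example(void)
{
	assert(pick_vmcb02_next_rip(false, true, 0x2000, 0x1000) == 0x1000);
	assert(pick_vmcb02_next_rip(false, false, 0x2000, 0x1000) == 0x2000);
}
```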
2026-03-04 | x86/mm/pat: Convert split_large_page() to use ptdescs | Vishal Moola (Oracle)
Use the ptdesc APIs for all page table allocation and free sites to allow their separate allocation from struct page in the future. Update split_large_page() to allocate a ptdesc instead of allocating a page for use as a page table. Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Link: https://patch.msgid.link/20260303194828.1406905-5-vishal.moola@gmail.com
2026-03-04 | x86/mm/pat: Convert populate_pgd() to use page table apis | Vishal Moola (Oracle)
Use the ptdesc APIs for all page table allocation and free sites to allow their separate allocation from struct page in the future. Convert the remaining get_zeroed_page() calls to the generic page table APIs, as they already use ptdescs. Pass through init_mm since these are kernel page tables, as both functions require it to identify kernel page tables. Because the generic implementations do not use the second argument, pass a placeholder to avoid reimplementing them or risking breakage on other architectures. It is not obvious whether these pages are freed. Regardless, convert the remaining free paths as needed, noting that the only other possible free paths have already been converted and that a frozen page table test kernel has not reported any issues. Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Link: https://patch.msgid.link/20260303194828.1406905-4-vishal.moola@gmail.com
2026-03-04x86/mm/pat: Convert pmd code to use page table apisVishal Moola (Oracle)
Use the ptdesc APIs for all page table allocation and free sites to allow their separate allocation from struct page in the future. Convert the PMD allocation and free sites to use the generic page table APIs, as they already use ptdescs. Pass through init_mm since these are kernel page tables, as pmd_alloc_one() requires it to identify kernel page tables. Because the generic implementation does not use the second argument, pass a placeholder to avoid reimplementing it or risking breakage on other architectures. Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Link: https://patch.msgid.link/20260303194828.1406905-3-vishal.moola@gmail.com
2026-03-04x86/mm/pat: Convert pte code to use page table apisVishal Moola (Oracle)
Use the ptdesc APIs for all page table allocation and free sites to allow their separate allocation from struct page in the future. Convert the PTE allocation and free sites to use the generic page table APIs, as they already use ptdescs. Pass through init_mm since these are kernel page tables; otherwise, pte_alloc_one_kernel() becomes a no-op. Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Link: https://patch.msgid.link/20260303194828.1406905-2-vishal.moola@gmail.com
2026-03-04KVM: TDX: Fold tdx_bringup() into tdx_hardware_setup()Sean Christopherson
Now that TDX doesn't need to manually enable virtualization through _KVM_ APIs during setup, fold tdx_bringup() into tdx_hardware_setup() where the code belongs, e.g. so that KVM doesn't leave the S-EPT kvm_x86_ops wired up when TDX is disabled. The weird ordering (and naming) was necessary to allow KVM TDX to use kvm_enable_virtualization(), which in turn had a hard dependency on kvm_x86_ops.enable_virtualization_cpu and thus kvm_x86_vendor_init(). Tested-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Sagi Shahar <sagis@google.com> Link: https://patch.msgid.link/20260214012702.2368778-17-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04x86/virt/tdx: Use ida_is_empty() to detect if any TDs may be runningSean Christopherson
Drop nr_configured_hkid and instead use ida_is_empty() to detect if any HKIDs have been allocated/configured. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Tested-by: Chao Gao <chao.gao@intel.com> Tested-by: Sagi Shahar <sagis@google.com> Link: https://patch.msgid.link/20260214012702.2368778-15-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
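The idea of asking the allocator itself whether anything is allocated, instead of maintaining a separate counter, can be illustrated with a toy bitmap pool. This is a userspace sketch, not the kernel's IDA implementation; id_pool and its helpers are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy ID pool holding up to 64 IDs, one bit per ID. */
struct id_pool {
	uint64_t used;
};

/* Allocate the lowest free ID, or return -1 if the pool is exhausted. */
static int id_pool_alloc(struct id_pool *p)
{
	for (int i = 0; i < 64; i++) {
		if (!(p->used & (1ULL << i))) {
			p->used |= 1ULL << i;
			return i;
		}
	}
	return -1;
}

/* Release a previously allocated ID. */
static void id_pool_free(struct id_pool *p, int id)
{
	assert(id >= 0 && id < 64);
	p->used &= ~(1ULL << id);
}

/* The pool is the single source of truth; no separate counter is kept. */
static bool id_pool_is_empty(const struct id_pool *p)
{
	return p->used == 0;
}
```

Because emptiness is derived from the allocation state, there is no second variable that could drift out of sync with it.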
2026-03-04x86/virt/tdx: KVM: Consolidate TDX CPU hotplug handlingChao Gao
The core kernel registers a CPU hotplug callback to do VMX and TDX init and deinit while KVM registers a separate CPU offline callback to block offlining the last online CPU in a socket. Splitting TDX-related CPU hotplug handling across two components is odd and adds unnecessary complexity. Consolidate TDX-related CPU hotplug handling by integrating KVM's tdx_offline_cpu() to the one in the core kernel. Also move nr_configured_hkid to the core kernel because tdx_offline_cpu() references it. Since HKID allocation and free are handled in the core kernel, it's more natural to track used HKIDs there. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Chao Gao <chao.gao@intel.com> Tested-by: Chao Gao <chao.gao@intel.com> Tested-by: Sagi Shahar <sagis@google.com> Link: https://patch.msgid.link/20260214012702.2368778-14-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04x86/virt/tdx: Tag a pile of functions as __init, and globals as __ro_after_initSean Christopherson
Now that TDX-Module initialization is done during subsys init, tag all related functions as __init, and relevant data as __ro_after_init. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Tested-by: Chao Gao <chao.gao@intel.com> Tested-by: Sagi Shahar <sagis@google.com> Link: https://patch.msgid.link/20260214012702.2368778-13-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04KVM: x86/tdx: Do VMXON and TDX-Module initialization during subsys initSean Christopherson
Now that VMXON can be done without bouncing through KVM, do TDX-Module initialization during subsys init (specifically before module_init() so that it runs before KVM when both are built-in). Aside from the obvious benefits of separating core TDX code from KVM, this will allow tagging a pile of TDX functions and globals as being __init and __ro_after_init. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Chao Gao <chao.gao@intel.com> Tested-by: Sagi Shahar <sagis@google.com> Link: https://patch.msgid.link/20260214012702.2368778-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04x86/virt/tdx: Drop the outdated requirement that TDX be enabled in IRQ contextSean Christopherson
Remove TDX's outdated requirement that per-CPU enabling be done via IPI function call, which was a stale artifact leftover from early versions of the TDX enablement series. The requirement that IRQs be disabled should have been dropped as part of the revamped series that relied on the KVM rework to enable VMX at module load. In other words, the kernel's "requirement" was never a requirement at all, but instead a reflection of how KVM enabled VMX (via IPI callback) when the TDX subsystem code was merged. Note, accessing per-CPU information is safe even without disabling IRQs, as tdx_online_cpu() is invoked via a cpuhp callback, i.e. from a per-CPU thread. Link: https://lore.kernel.org/all/ZyJOiPQnBz31qLZ7@google.com Tested-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Sagi Shahar <sagis@google.com> Link: https://patch.msgid.link/20260214012702.2368778-11-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04x86/virt: Add refcounting of VMX/SVM usage to support multiple in-kernel usersSean Christopherson
Implement a per-CPU refcounting scheme so that "users" of hardware virtualization, e.g. KVM and the future TDX code, can co-exist without pulling the rug out from under each other. E.g. if KVM were to disable VMX on module unload or when the last KVM VM was destroyed, SEAMCALLs from the TDX subsystem would #UD and panic the kernel. Disable preemption in the get/put APIs to ensure virtualization is fully enabled/disabled before returning to the caller. E.g. if the task were preempted after a 0=>1 transition, the new task would see a 1=>2 and thus return without enabling virtualization. Explicitly disable preemption instead of requiring the caller to do so, because the need to disable preemption is an artifact of the implementation. E.g. from KVM's perspective there is no _need_ to disable preemption as KVM guarantees the pCPU on which it is running is stable (but preemption is enabled). Opportunistically abstract away SVM vs. VMX in the public APIs by using X86_FEATURE_{SVM,VMX} to communicate what technology the caller wants to enable and use. Cc: Xu Yilun <yilun.xu@linux.intel.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Tested-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Sagi Shahar <sagis@google.com> Link: https://patch.msgid.link/20260214012702.2368778-10-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
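The get/put behavior described above can be sketched in userspace C, where only the 0=>1 and 1=>0 transitions touch the (simulated) hardware state. This is a simplified model under stated assumptions: the arrays, NR_SIM_CPUS, and the function names are hypothetical, and the real code additionally disables preemption around each transition, which the sketch only notes in comments.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SIM_CPUS 4

/* Hypothetical stand-ins for per-CPU state; not the kernel's real data. */
static unsigned int virt_refcount[NR_SIM_CPUS];
static bool hw_virt_enabled[NR_SIM_CPUS];

/* Take a reference on this CPU; the first user enables virtualization. */
static void virt_get_cpu(int cpu)
{
	/*
	 * In the kernel, preemption would be disabled here so that another
	 * task cannot observe a nonzero count before the hardware is enabled.
	 */
	if (virt_refcount[cpu]++ == 0)
		hw_virt_enabled[cpu] = true;
}

/* Drop a reference on this CPU; the last user disables virtualization. */
static void virt_put_cpu(int cpu)
{
	assert(virt_refcount[cpu] > 0);
	if (--virt_refcount[cpu] == 0)
		hw_virt_enabled[cpu] = false;
}
```

With two users on the same CPU, the first put leaves virtualization enabled, so one user going away cannot pull the rug out from under the other.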