summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-12-08drm/amdgpu: update VRAM typesHawking Zhang
Update VRAM types. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Move XCP_INST_MASK amdgpu_xcp.hHawking Zhang
Move the common macro for xcp manger to amdgpu_xcp.h Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Verify dpm setting for enabling smu with direct fw loadingHawking Zhang
Ensure that amdgpu_dpm kernel module parameter is set to 1 when enabling smu with direct firmware loading Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdkfd: refactor rlc/gfx spmJames Zhu
for adding multiple xcc support. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Bing Ma <Bing.Ma@amd.com> Reviewed-by: Gang Ba <gaba@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Generalize HQD and VMID mask calculation for MESMukul Joshi
Generalize the calculation for determining the HQD mask and VMID mask passed to MES during initialization. v2: rebase (Alex) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu/mes: add multi-xcc supportJack Xiao
a. extend mes pipe instances to num_xcc * max_mes_pipe b. initialize mes schq/kiq pipes per xcc c. submit mes packet to mes ring according to xcc_id v2: rebase (Alex) Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: add PCIe atomics bit in PTEMukul Joshi
To enable atomic access to memory, setup the new PCIe atomics bit in PTE. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: update soc15 IH client idsMukul Joshi
Add client id for UTCL2. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add hwid for AIGCHawking Zhang
Add hwid for a new ip block named AIGC Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add hwid for ATUHawking Zhang
Add hwid for Address Translation Unit (ATU) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Increase the maximum number of IP instancesHawking Zhang
SOC v1_0 supports a greater number of IP instances. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdkfd: Rework reserved SDMA queue handlingMukul Joshi
We would need to reserve SDMA queues per KFD node. As a result, rework the SDMA reserved queue handling to make it per KFD node. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: fix NULL pointer issue for supports_bacoLikun Gao
Return 0 if the realted ASIC do not have supports_baco function to fix the NULL pointer issue. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: fix NULL pointer issue buffer funcsLikun Gao
If SDMA block not enabled, buffer_funcs will not initialize, fix the null pointer issue if buffer_funcs not initialized. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08selftests: cgroup: Replace sleep with cg_read_key_long_poll() for waiting on ↵Guopeng Zhang
nr_dying_descendants Replace the manual sleep-and-retry logic in test_kmem_dead_cgroups() with the new helper `cg_read_key_long_poll()`. This change improves the robustness of the test by polling the "nr_dying_descendants" counter in `cgroup.stat` until it reaches 0 or the timeout is exceeded. Additionally, increase the retry timeout to 8 seconds (from 5 seconds) based on testing results: - With 5-second timeout: 4/20 runs passed. - With 8-second timeout: 20/20 runs passed. The 8 second timeout is based on stress testing of test_kmem_dead_cgroups() under load: 5 seconds was occasionally not enough for reclaim of dying descendants to complete, whereas 8 seconds consistently covered the observed latencies. This value is intended as a generous upper bound for the asynchronous reclaim and is not tied to any specific kernel constant, so it can be adjusted in the future if reclaim behavior changes. Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-12-08selftests: cgroup: make test_memcg_sock robust against delayed sock statsGuopeng Zhang
test_memcg_sock() currently requires that memory.stat's "sock " counter is exactly zero immediately after the TCP server exits. On a busy system this assumption is too strict: - Socket memory may be freed with a small delay (e.g. RCU callbacks). - memcg statistics are updated asynchronously via the rstat flushing worker, so the "sock " value in memory.stat can stay non-zero for a short period of time even after all socket memory has been uncharged. As a result, test_memcg_sock() can intermittently fail even though socket memory accounting is working correctly. Make the test more robust by polling memory.stat for the "sock " counter and allowing it some time to drop to zero instead of checking it only once. The timeout is set to 3 seconds to cover the periodic rstat flush interval (FLUSH_TIME = 2*HZ by default) plus some scheduling slack. If the counter does not become zero within the timeout, the test still fails as before. On my test system, running test_memcontrol 50 times produced: - Before this patch: 6/50 runs passed. - After this patch: 50/50 runs passed. Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn> Suggested-by: Lance Yang <lance.yang@linux.dev> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-12-08selftests: cgroup: Add cg_read_key_long_poll() to poll a cgroup key with retriesGuopeng Zhang
Introduce a new helper function `cg_read_key_long_poll()` in cgroup_util.h. This function polls the specified key in a cgroup file until it matches the expected value or the retry limit is reached, with configurable wait intervals between retries. This helper is particularly useful for handling asynchronously updated cgroup statistics (e.g., memory.stat), where immediate reads may observe stale values, especially on busy systems. It allows tests and other utilities to handle such cases more flexibly. Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn> Suggested-by: Michal Koutný <mkoutny@suse.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-12-08cgroup: rstat: use LOCK CMPXCHG in css_rstat_updatedShakeel Butt
On x86-64, this_cpu_cmpxchg() uses CMPXCHG without LOCK prefix which means it is only safe for the local CPU and not for multiple CPUs. Recently the commit 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe") make css_rstat_updated lockless and uses lockless list to allow reentrancy. Since css_rstat_updated can invoked from process context, IRQ and NMI, it uses this_cpu_cmpxchg() to select the winner which will inset the lockless lnode into the global per-cpu lockless list. However the commit missed one case where lockless node of a cgroup can be accessed and modified by another CPU doing the flushing. Basically llist_del_first_init() in css_process_update_tree(). On a cursory look, it can be questioned how css_process_update_tree() can see a lockless node in global lockless list where the updater is at this_cpu_cmpxchg() and before llist_add() call in css_rstat_updated(). This can indeed happen in the presence of IRQs/NMI. Consider this scenario: Updater for cgroup stat C on CPU A in process context is after llist_on_list() check and before this_cpu_cmpxchg() in css_rstat_updated() where it get interrupted by IRQ/NMI. In the IRQ/NMI context, a new updater calls css_rstat_updated() for same cgroup C and successfully inserts rstatc_pcpu->lnode. Now concurrently CPU B is running the flusher and it calls llist_del_first_init() for CPU A and got rstatc_pcpu->lnode of cgroup C which was added by the IRQ/NMI updater. Now imagine CPU B calling init_llist_node() on cgroup C's rstatc_pcpu->lnode of CPU A and on CPU A, the process context updater calling this_cpu_cmpxchg(rstatc_pcpu->lnode) concurrently. The CMPXCNG without LOCK on CPU A is not safe and thus we need LOCK prefix. In Meta's fleet running the kernel with the commit 36df6e3dbd7e, we are observing on some machines the memcg stats are getting skewed by more than the actual memory on the system. On close inspection, we noticed that lockless node for a workload for specific CPU was in the bad state and thus all the updates on that CPU for that cgroup was being lost. To confirm if this skew was indeed due to this CMPXCHG without LOCK in css_rstat_updated(), we created a repro (using AI) at [1] which shows that CMPXCHG without LOCK creates almost the same lnode corruption as seem in Meta's fleet and with LOCK CMPXCHG the issue does not reproduces. Link: http://lore.kernel.org/efiagdwmzfwpdzps74fvcwq3n4cs36q33ij7eebcpssactv3zu@se4hqiwxcfxq [1] Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: stable@vger.kernel.org # v6.17+ Fixes: 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe") Signed-off-by: Tejun Heo <tj@kernel.org>
2025-12-08sched/ext: Avoid null ptr traversal when ->put_prev_task() is called with ↵John Stultz
NULL next Early when trying to get sched_ext and proxy-exe working together, I kept tripping over NULL ptr in put_prev_task_scx() on the line: if (sched_class_above(&ext_sched_class, next->sched_class)) { Which was due to put_prev_task() passes a NULL next, calling: prev->sched_class->put_prev_task(rq, prev, NULL); put_prev_task_scx() already guards for a NULL next in the switch_class case, but doesn't seem to have a guard for sched_class_above() check. I can't say I understand why this doesn't trip usually without proxy-exec. And in newer kernels there are way fewer put_prev_task(), and I can't easily reproduce the issue now even with proxy-exec. But we still have one put_prev_task() call left in core.c that seems like it could trip this, so I wanted to send this out for consideration. tj: put_prev_task() can be called with NULL @next; however, when @p is queued, that doesn't happen, so this condition shouldn't currently be triggerable. The connection isn't straightforward or necessarily reliable, so add the NULL check even if it can't currently be triggered. Link: http://lkml.kernel.org/r/20251206022218.1541878-1-jstultz@google.com Signed-off-by: John Stultz <jstultz@google.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-12-08sched_ext: Fix the memleak for sch->helper objectsZqiang
This commit use kthread_destroy_worker() to release sch->helper objects to fix the following kmemleak: unreferenced object 0xffff888121ec7b00 (size 128): comm "scx_simple", pid 1197, jiffies 4295884415 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 ad 4e ad de .............N.. ff ff ff ff 00 00 00 00 ff ff ff ff ff ff ff ff ................ backtrace (crc 587b3352): kmemleak_alloc+0x62/0xa0 __kmalloc_cache_noprof+0x28d/0x3e0 kthread_create_worker_on_node+0xd5/0x1f0 scx_enable.isra.210+0x6c2/0x25b0 bpf_scx_reg+0x12/0x20 bpf_struct_ops_link_create+0x2c3/0x3b0 __sys_bpf+0x3102/0x4b00 __x64_sys_bpf+0x79/0xc0 x64_sys_call+0x15d9/0x1dd0 do_syscall_64+0xf0/0x470 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: bff3b5aec1b7 ("sched_ext: Move disable machinery into scx_sched") Cc: stable@vger.kernel.org # v6.16+ Signed-off-by: Zqiang <qiang.zhang@linux.dev> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-12-08accel/amdxdna: Fix tail-pointer polling in mailbox_get_msg()Lizhi Hou
In mailbox_get_msg(), mailbox_reg_read_non_zero() is called to poll for a non-zero tail pointer. This assumed that a zero value indicates an error. However, certain corner cases legitimately produce a zero tail pointer. To handle these cases, remove mailbox_reg_read_non_zero(). The zero tail pointer will be treated as a valid rewind event. Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251204181603.793824-1-lizhi.hou@amd.com
2025-12-08nfsd: Mark variable __maybe_unused to avoid W=1 build breakAndy Shevchenko
Clang is not happy about set but (in some cases) unused variable: fs/nfsd/export.c:1027:17: error: variable 'inode' set but not used [-Werror,-Wunused-but-set-variable] since it's used as a parameter to dprintk() which might be configured a no-op. To avoid uglifying code with the specific ifdeffery just mark the variable __maybe_unused. The commit [1], which introduced this behaviour, is quite old and hence the Fixes tag points to the first of the Git era. Link: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=0431923fb7a1 [1] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-08svcrdma: bound check rq_pages index in inline pathJoshua Rogers
svc_rdma_copy_inline_range indexed rqstp->rq_pages[rc_curpage] without verifying rc_curpage stays within the allocated page array. Add guards before the first use and after advancing to a new page. Fixes: d7cc73972661 ("svcrdma: support multiple Read chunks per RPC") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers <linux@joshua.hu> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-08svcrdma: return 0 on success from svc_rdma_copy_inline_rangeJoshua Rogers
The function comment specifies 0 on success and -EINVAL on invalid parameters. Make the tail return 0 after a successful copy loop. Fixes: d7cc73972661 ("svcrdma: support multiple Read chunks per RPC") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers <linux@joshua.hu> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-08svcrdma: use rc_pageoff for memcpy byte offsetJoshua Rogers
svc_rdma_copy_inline_range added rc_curpage (page index) to the page base instead of the byte offset rc_pageoff. Use rc_pageoff so copies land within the current page. Found by ZeroPath (https://zeropath.com) Fixes: 8e122582680c ("svcrdma: Move svc_rdma_read_info::ri_pageno to struct svc_rdma_recv_ctxt") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers <linux@joshua.hu> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-08SUNRPC: svcauth_gss: avoid NULL deref on zero length gss_token in ↵Joshua Rogers
gss_read_proxy_verf A zero length gss_token results in pages == 0 and in_token->pages[0] is NULL. The code unconditionally evaluates page_address(in_token->pages[0]) for the initial memcpy, which can dereference NULL even when the copy length is 0. Guard the first memcpy so it only runs when length > 0. Fixes: 5866efa8cbfb ("SUNRPC: Fix svcauth_gss_proxy_init()") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers <linux@joshua.hu> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-08dma-buf: enable DMABUF_DEBUG by default on DEBUG kernelsChristian König
The overhead of enforcing the DMA-buf rules for importers is now so low that it safe to enable it by default on DEBUG kernels. This will hopefully result in fixing more issues in importers. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://lore.kernel.org/r/20251205130604.1582-2-christian.koenig@amd.com
2025-12-08dma-buf: improve sg_table debugging hack v4Christian König
This debugging hack is important to enforce the rule that importers should *never* touch the underlying struct page of the exporter. Instead of just mangling the page link create a copy of the sg_table but only copy over the DMA addresses and not the pages. This will cause a NULL pointer de-reference if the importer tries to touch the struct page. Still quite a hack but this at least allows the exporter to properly keeps it's sg_table intact while allowing the DMA-buf maintainer to find and fix misbehaving importers and finally switch over to using a different data structure in the future. v2: improve the hack further by using a wrapper structure and explaining the background a bit more in the commit message. v3: fix some whitespace issues, use sg_assign_page(). v4: give the functions a better name Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://lore.kernel.org/r/20251205130604.1582-1-christian.koenig@amd.com
2025-12-08drm/atomic: Add dev pointer to drm_private_objMaxime Ripard
All the objects that need to implement some callbacks in KMS have a pointer in there structure to the main drm_device. However, it's not the case for drm_private_objs, which makes it harder than it needs to be to implement some of its callbacks. Let's add that pointer. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://patch.msgid.link/20251014-drm-private-obj-reset-v2-1-6dd60e985e9d@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-12-08KVM: nVMX: Immediately refresh APICv controls as needed on nested VM-ExitDongli Zhang
If an APICv status updated was pended while L2 was active, immediately refresh vmcs01's controls instead of pending KVM_REQ_APICV_UPDATE as kvm_vcpu_update_apicv() only calls into vendor code if a change is necessary. E.g. if APICv is inhibited, and then activated while L2 is running: kvm_vcpu_update_apicv() | -> __kvm_vcpu_update_apicv() | -> apic->apicv_active = true | -> vmx_refresh_apicv_exec_ctrl() | -> vmx->nested.update_vmcs01_apicv_status = true | -> return Then L2 exits to L1: __nested_vmx_vmexit() | -> kvm_make_request(KVM_REQ_APICV_UPDATE) vcpu_enter_guest(): KVM_REQ_APICV_UPDATE -> kvm_vcpu_update_apicv() | -> __kvm_vcpu_update_apicv() | -> return // because if (apic->apicv_active == activate) Reported-by: Chao Gao <chao.gao@intel.com> Closes: https://lore.kernel.org/all/aQ2jmnN8wUYVEawF@intel.com Fixes: 7c69661e225c ("KVM: nVMX: Defer APICv updates while L2 is active until L1 is active") Cc: stable@vger.kernel.org Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> [sean: write changelog] Link: https://patch.msgid.link/20251205231913.441872-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-12-08KVM: VMX: Update SVI during runtime APICv activationDongli Zhang
The APICv (apic->apicv_active) can be activated or deactivated at runtime, for instance, because of APICv inhibit reasons. Intel VMX employs different mechanisms to virtualize LAPIC based on whether APICv is active. When APICv is activated at runtime, GUEST_INTR_STATUS is used to configure and report the current pending IRR and ISR states. Unless a specific vector is explicitly included in EOI_EXIT_BITMAP, its EOI will not be trapped to KVM. Intel VMX automatically clears the corresponding ISR bit based on the GUEST_INTR_STATUS.SVI field. When APICv is deactivated at runtime, the VM_ENTRY_INTR_INFO_FIELD is used to specify the next interrupt vector to invoke upon VM-entry. The VMX IDT_VECTORING_INFO_FIELD is used to report un-invoked vectors on VM-exit. EOIs are always trapped to KVM, so the software can manually clear pending ISR bits. There are scenarios where, with APICv activated at runtime, a guest-issued EOI may not be able to clear the pending ISR bit. Taking vector 236 as an example, here is one scenario. 1. Suppose APICv is inactive. Vector 236 is pending in the IRR. 2. To handle KVM_REQ_EVENT, KVM moves vector 236 from the IRR to the ISR, and configures the VM_ENTRY_INTR_INFO_FIELD via vmx_inject_irq(). 3. After VM-entry, vector 236 is invoked through the guest IDT. At this point, the data in VM_ENTRY_INTR_INFO_FIELD is no longer valid. The guest interrupt handler for vector 236 is invoked. 4. Suppose a VM exit occurs very early in the guest interrupt handler, before the EOI is issued. 5. Nothing is reported through the IDT_VECTORING_INFO_FIELD because vector 236 has already been invoked in the guest. 6. Now, suppose APICv is activated. Before the next VM-entry, KVM calls kvm_vcpu_update_apicv() to activate APICv. 7. Unfortunately, GUEST_INTR_STATUS.SVI is not configured, although vector 236 is still pending in the ISR. 8. After VM-entry, the guest finally issues the EOI for vector 236. However, because SVI is not configured, vector 236 is not cleared. 9. ISR is stalled forever on vector 236. Here is another scenario. 1. Suppose APICv is inactive. Vector 236 is pending in the IRR. 2. To handle KVM_REQ_EVENT, KVM moves vector 236 from the IRR to the ISR, and configures the VM_ENTRY_INTR_INFO_FIELD via vmx_inject_irq(). 3. VM-exit occurs immediately after the next VM-entry. The vector 236 is not invoked through the guest IDT. Instead, it is saved to the IDT_VECTORING_INFO_FIELD during the VM-exit. 4. KVM calls kvm_queue_interrupt() to re-queue the un-invoked vector 236 into vcpu->arch.interrupt. A KVM_REQ_EVENT is requested. 5. Now, suppose APICv is activated. Before the next VM-entry, KVM calls kvm_vcpu_update_apicv() to activate APICv. 6. Although APICv is now active, KVM still uses the legacy VM_ENTRY_INTR_INFO_FIELD to re-inject vector 236. GUEST_INTR_STATUS.SVI is not configured. 7. After the next VM-entry, vector 236 is invoked through the guest IDT. Finally, an EOI occurs. However, due to the lack of GUEST_INTR_STATUS.SVI configuration, vector 236 is not cleared from the ISR. 8. ISR is stalled forever on vector 236. Using QEMU as an example, vector 236 is stuck in ISR forever. (qemu) info lapic 1 dumping local APIC state for CPU 1 LVT0 0x00010700 active-hi edge masked ExtINT (vec 0) LVT1 0x00010400 active-hi edge masked NMI LVTPC 0x00000400 active-hi edge NMI LVTERR 0x000000fe active-hi edge Fixed (vec 254) LVTTHMR 0x00010000 active-hi edge masked Fixed (vec 0) LVTT 0x000400ec active-hi edge tsc-deadline Fixed (vec 236) Timer DCR=0x0 (divide by 2) initial_count = 0 current_count = 0 SPIV 0x000001ff APIC enabled, focus=off, spurious vec 255 ICR 0x000000fd physical edge de-assert no-shorthand ICR2 0x00000000 cpu 0 (X2APIC ID) ESR 0x00000000 ISR 236 IRR 37(level) 236 The issue isn't applicable to AMD SVM as KVM simply writes vmcb01 directly irrespective of whether L1 (vmcs01) or L2 (vmcb02) is active (unlike VMX, there is no need/cost to switch between VMCBs). In addition, APICV_INHIBIT_REASON_IRQWIN ensures AMD SVM AVIC is not activated until the last interrupt is EOI'd. Fix the bug by configuring Intel VMX GUEST_INTR_STATUS.SVI if APICv is activated at runtime. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Link: https://patch.msgid.link/20251110063212.34902-1-dongli.zhang@oracle.com [sean: call out that SVM writes vmcb01 directly, tweak comment] Link: https://patch.msgid.link/20251205231913.441872-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-12-08s390/bug: Add missing alignmentHeiko Carstens
All objects are supposed to have a minimal alignment of two, since a couple of instructions only work with even addresses. Add the missing align statement for the file string. Fixes: 6584ff203aec ("bugs/s390: Use 'cond_str' in __EMIT_BUG()") Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-12-08s390/bug: Add missing CONFIG_BUG ifdef againHeiko Carstens
Fallback to generic BUG implementation in case CONFIG_BUG is disabled. This restores the old behaviour before 'cond_str' support was added. It probably doesn't matter, since nobody should disable CONFIG_BUG, but at least this is consistent to before. Fixes: 6584ff203aec ("bugs/s390: Use 'cond_str' in __EMIT_BUG()") Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-12-08ALSA: uapi: Fix typo in asound.h commentAndres J Rosa
Fix 'level-shit' to 'level-shift' in struct snd_cea_861_aud_if comment. Fixes: 7ba1c40b536e ("ALSA: Add definitions for CEA-861 Audio InfoFrames") Signed-off-by: Andres J Rosa <andyrosa@gmail.com> Link: https://patch.msgid.link/20251203162509.1822-1-andyrosa@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-12-08drm/xe/throttle: Skip reason prefix while emitting arrayRaag Jadav
The newly introduced "reasons" attribute already signifies possible reasons for throttling and makes the prefix in individual attribute names redundant while emitting them as an array. Skip the prefix. Fixes: 83ccde67a3f7 ("drm/xe/gt_throttle: Avoid TOCTOU when monitoring reasons") Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Sk Anirban <sk.anirban@intel.com> Link: https://patch.msgid.link/20251203123355.571606-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-12-08KVM: s390: Fix gmap_helper_zap_one_page() againClaudio Imbrenda
A few checks were missing in gmap_helper_zap_one_page(), which can lead to memory corruption in the guest under specific circumstances. Add the missing checks. Fixes: 5deafa27d9ae ("KVM: s390: Fix to clear PTE when discarding a swapped page") Cc: stable@vger.kernel.org Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-12-08i2c: qcom-cci: Add msm8953 compatibleLuca Weiss
Add a config for the v1.2.5 CCI found on msm8953 which has different values in .params compared to others already supported in the driver. Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Luca Weiss <luca@lucaweiss.eu> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2025-12-08Merge tag 'i2c-host-6.19-v2' of ↵Wolfram Sang
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-mergewindow i2c-host for v6.19 - general cleanups in bcm2835, designware, pcf8584, and stm32 - amd-mp2: fix device refcount - designware: avoid interrupt storms caused by bad firmware - i801: fix supported features - spacemit: fix device detection failures New device support: - Intel Diamond Rapids - Rockchip RK3506 - Qualcomm Kaanapali, MSM8953
2025-12-08LoongArch: Adjust default config files for 32BIT/64BITHuacai Chen
Add loongson32_defconfig (for 32BIT) and rename loongson3_defconfig to loongson64_defconfig (for 64BIT). Also adjust graphics drivers, such as FB_EFI is replaced with EFIDRM. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-08LoongArch: Adjust VDSO/VSYSCALL for 32BIT/64BITHuacai Chen
Adjust VDSO/VSYSCALL because read_cpu_id() for 32BIT/64BIT are different, and LoongArch32 doesn't support GENERIC_GETTIMEOFDAY now (will be supported in future). Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-08LoongArch: Adjust misc routines for 32BIT/64BITHuacai Chen
Adjust misc routines for both 32BIT and 64BIT, including: bitops, bswap, checksum, string, jump label, unaligned access emulator, suspend/wakeup routines, etc. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-08LoongArch: Adjust user accessors for 32BIT/64BITHuacai Chen
Adjust user accessors for both 32BIT and 64BIT, including: get_user(), put_user(), copy_user(), clear_user(), etc. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-08LoongArch: Adjust system call for 32BIT/64BITHuacai Chen
Adjust system call for both 32BIT and 64BIT, including: add the uapi unistd_{32,64}.h and syscall_table_{32,64}.h inclusion, add sys_mmap2() definition, change the system call entry routines, etc. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-08LoongArch: Adjust module loader for 32BIT/64BITHuacai Chen
Adjust module loader for both 32BIT and 64BIT, including: change the s64 type to long, change the u64 type to unsigned long, change the plt entry definition and handling, etc. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-08LoongArch: Adjust time routines for 32BIT/64BITHuacai Chen
Adjust time routines for both 32BIT and 64BIT, including: rdtime_h() / rdtime_l() definitions for 32BIT and rdtime_d() definition for 64BIT, get_cycles() and get_cycles64() definitions for 32BIT/64BIT, show time frequency info ("CPU MHz" and "BogoMIPS") in /proc/cpuinfo, etc. Use do_div() for division which works on both 32BIT and 64BIT platforms. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-08LoongArch: Adjust process management for 32BIT/64BITHuacai Chen
Adjust process management for both 32BIT and 64BIT, including: CPU context switching, FPU loading/restoring, process dumping and process tracing routines. Q: Why modify switch.S? A: LoongArch32 has no ldptr.d/stptr.d instructions, and asm offsets of thead_struct members are too large to be filled in the 12b immediate field of ld.w/st.w. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-08LoongArch: Adjust memory management for 32BIT/64BITHuacai Chen
Adjust memory management for both 32BIT and 64BIT, including: address space definition, DMW CSR definition, page table bits definition, boot time detection of VA/PA bits, page table init, tlb exception handling, copy_page/clear_page/dump_tlb libraries, etc. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Yawei Li <liyawei@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-08LoongArch: Adjust boot & setup for 32BIT/64BITHuacai Chen
Adjust boot & setup for both 32BIT and 64BIT, including: efi header definition, MAX_IO_PICS definition, kernel entry and environment setup routines, etc. Add a fallback path in fdt_cpu_clk_init() to avoid 0MHz in /proc/cpuinfo if there is no valid clock freq from firmware. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-08Documentation/gpu/drm-mm: Add THP paragraph to GEM mapping sectionLoïc Molinari
Add a paragraph to the GEM Objects Creation section about the drm_gem_huge_mnt_create() helper and to the GEM objects mapping section explaining how transparent huge pages are handled by GEM. v4: - fix wording after huge_pages handler removal v6: - fix wording after map_pages handler removal v11: - mention drm_gem_huge_mnt_create() helper - add Boris and Maíra R-bs Signed-off-by: Loïc Molinari <loic.molinari@collabora.com> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Maíra Canal <mcanal@igalia.com> Link: https://patch.msgid.link/20251205182231.194072-11-loic.molinari@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-12-08drm/panfrost: Introduce huge tmpfs mountpoint optionLoïc Molinari
Introduce the 'panfrost.transparent_hugepage' boolean module parameter (false by default). When the parameter is set to true, a new tmpfs mountpoint is created and mounted using the 'huge=within_size' option. It's then used at GEM object creation instead of the default 'shm_mnt' mountpoint in order to enable Transparent Hugepage (THP) for the object (without having to rely on a system wide parameter). v3: - use huge tmpfs mountpoint in drm_device v4: - fix builds with CONFIG_TRANSPARENT_HUGEPAGE=n - clean up mountpoint creation error handling - print negative error value v5: - use drm_gem_has_huge_tmp() helper - get rid of CONFIG_TRANSPARENT_HUGEPAGE ifdefs v9: - replace drm_gem_has_huge_tmp() by drm_gem_get_huge_tmp() v11: - enable 'panfrost.transparent_hugepage' by default Signed-off-by: Loïc Molinari <loic.molinari@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patch.msgid.link/20251205182231.194072-10-loic.molinari@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>