| Age | Commit message | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Handle probes on hinted conditional branch instructions.
BC.cond instructions can be simulated in the same way as B.cond
instructions, so extend the decode mask for B.cond to cover BC.cond
- Flush the walk cache when unsharing PMD tables. Recent changes to
huge_pmd_unshare() introduced mmu_gather::unshared_tables but the
arm64 code was still treating the TLB flushing as only targeting leaf
entries (TLBI VALE1IS).
Fix it by using non-leaf-only instructions (TLBI VAE1IS) when
tlb->unshared_tables is set
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: tlb: Flush walk cache when unsharing PMD tables
arm64: probes: Handle probes on hinted conditional branch instructions
|
|
When huge_pmd_unshare() is called to unshare a PMD table, the
tlb_unshare_pmd_ptdesc() function sets tlb->unshared_tables=true
but the aarch64 tlb_flush() only checked tlb->freed_tables to
determine whether to use TLBF_NONE (vae1is, invalidates walk
cache) or TLBF_NOWALKCACHE (vale1is, leaf-only).
This caused the stale PMD page table entry to remain in the walk cache
after unshare, potentially leading to incorrect page table walks.
Fix by including unshared_tables in the check, so that when
unsharing tables, TLBF_NONE is used and the walk cache is properly
invalidated.
Here is the detailed distinction between vae1is and vale1is:
| Instruction Combination | Actual Invalidation Scope |
| ------------------------ | --------------------------------------------------|
| `VAE1IS` + TTL=`0` | All entries at all levels (full invalidation) |
| `VAE1IS` + TTL=`2` (L2) | Non-leaf at Level 0/1 + leaf at Level 2 |
| `VALE1IS` + TTL=`0` | Leaf entries at all levels (non-leaf not cleared) |
| `VALE1IS` + TTL=`2` (L2) | Leaf entry at Level 2 only |
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Fixes: 8ce720d5bd91 ("mm/hugetlb: fix excessive IPI broadcasts when unsharing PMD tables using mmu_gather")
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ring-buffer fixes from Steven Rostedt:
- Fix reporting MISSED EVENTS in trace iterator
When the "trace" file is read with tracing enabled, if the writer
were to pass the iterator reader, it resets, sets a "missed_events"
flag and continues. The tracing output checks for missed events and
if there are some, it prints out "[LOST EVENTS]" to let the user know
events were dropped.
But the clearing of the missed_events happened when the tracing
system queried the ring buffer iterator about missed events. This was
premature as the ring buffer is per CPU, and the tracing code reads
all the CPU buffers and checks for missed events when it is read. If
the CPU iterator that had missed events isn't printed next, the
output for the LOST EVENTS is lost.
Clear the missed_events flag when the iterator moves to the next
event and not when the missed_events flag is queried. Also clear it
on reset.
- Flush and stop the persistent ring buffer on panic
On panic the persistent ring buffer is used to debug what caused the
panic. But on some architectures, it requires flushing the memory
from cache, otherwise, the ring buffer persistent memory may not have
the last events and this could also cause the ring buffer to be
corrupted on the next boot.
- Fix nr_subbufs initialization in simple_ring_buffer_init_mm
The remote simple ring buffer meta data nr_subbufs is initialized too
early and gets cleared later on, making it zero and not reflect the
actual number of sub-buffers.
- Fix unload_page for simple_ring_buffer init rollback
On error, the pages loaded need to be unloaded. To unload a page it
is expected that: page = load_page(va); -> unload_page(page). But the
code was doing: unload_page(va) and not unload_page(page).
- Create output file from cmd_check_undefined
The check for undefined symbols checks if the file *.o.checked exists
and if so it skips doing the work. But the *.o.checked file never was
created making every build do the work even when it was already done
previously.
* tag 'trace-ringbuffer-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Create output file from cmd_check_undefined
tracing: Fix unload_page for simple_ring_buffer init rollback
tracing: Fix nr_subbufs initialization in simple_ring_buffer_init_mm()
ring-buffer: Flush and stop persistent ring buffer on panic
ring-buffer: Fix reporting of missed events in iterator
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
- The ff-a firmware driver gets 11 individual bugfixes for a number of
issues with robustness to buggy firmware or client implementations.
Another firmware fix addresses suspend to RAM via PSCI firmware.
- The final code change is for the old Arm Integrator reference
platform that recently started exposing an old NULL pointer
dereference bug.
- The MAINTAINERS file gets two updates, notably James Tai and Yu-Chun
Lin are stepping up as co-maintainers for the Realtek platform.
- The remaining patches are all for devicetree files. Two of these are
for riscv boards, the rest are all for Renesas Arm platforms,
addressing build time checking issues as well as minor configuration
problems.
* tag 'soc-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits)
firmware: psci: Set pm_set_resume/suspend_via_firmware() for SYSTEM_SUSPEND
ARM: realtek: MAINTAINERS: Include pin controller drivers
MAINTAINERS: Add maintainers for ARM/REALTEK ARCHITECTURE
ARM: integrator: Fix early initialization
firmware: arm_ffa: Fix sched-recv callback partition lookup
firmware: arm_ffa: Snapshot notifier callbacks under lock
firmware: arm_ffa: Align RxTx buffer size before mapping
firmware: arm_ffa: Validate framework notification message layout
firmware: arm_ffa: Keep framework RX release under lock
firmware: arm_ffa: Bound PARTITION_INFO_GET_REGS copies
firmware: arm_ffa: Unregister bus notifier on teardown for FF-A v1.0
firmware: arm_ffa: Fix per-vcpu self notifications handling in workqueue
firmware: arm_ffa: Avoid collapsing NPI work from different CPUs
firmware: arm_ffa: Skip free_pages on RX buffer alloc failure
firmware: arm_ffa: Check for NULL FF-A ID table while driver registration
riscv: dts: microchip: fix icicle i2c pinctrl configuration
riscv: dts: starfive: jh7110: Drop CAMSS node
arm64: dts: renesas: r9a09g056: Add #mux-state-cells to usb20phyrst
arm64: dts: renesas: r9a09g057: Add #mux-state-cells to usb2{0,1}phyrst
ARM: dts: renesas: rskrza1: Drop superfluous cells
...
|
|
On real hardware, panic and machine reboot may not flush hardware cache
to memory. This means the persistent ring buffer, which relies on a
coherent state of memory, may not have its events written to the buffer
and they may be lost. Moreover, there may be inconsistency with the
counters which are used for validation of the integrity of the
persistent ring buffer which may cause all data to be discarded.
To avoid this issue, stop recording of the ring buffer on panic and
flush the cache of the ring buffer's memory.
Fixes: e645535a954a ("tracing: Add option to use memmapped memory for trace boot instance")
Cc: stable@vger.kernel.org
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ian Rogers <irogers@google.com>
Link: https://patch.msgid.link/177751969602.2136606.12031934362587643488.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"14 hotfixes. 9 are for MM. 10 are cc:stable and the remainder are for
post-7.1 issues or aren't deemed suitable for backporting.
There's a two-patch MAINTAINERS series from Mike Rapoport which
updates us for the new KEXEC/KDUMP/crash/LUO/etc arrangements. And
another two-patch series from Muchun Song to fix a couple of
memory-hotplug issues. Otherwise singletons, please see the changelogs
for details"
* tag 'mm-hotfixes-stable-2026-05-18-21-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/memory: fix spurious warning when unmapping device-private/exclusive pages
mm: fix __vm_normal_page() to handle missing support for pmd_special()/pud_special()
drivers/base/memory: fix memory block reference leak in poison accounting
mm/memory_hotplug: fix memory block reference leak on remove
lib: kunit_iov_iter: fix test fail on powerpc
mm/page_alloc: fix initialization of tags of the huge zero folio with init_on_free
MAINTAINERS: add kexec@ list to LIVE UPDATE ENTRY
MAINTAINERS: add tree for KDUMP and KEXEC
selftests/mm: run_vmtests.sh: fix destructive tests invocation
scripts/gdb: slab: update field names of struct kmem_cache
scripts/gdb: mm: cast untyped symbols in x86_page_ops
mm/damon: fix damos_stat tracepoint format for sz_applied
mm/damon/sysfs-schemes: call missing mem_cgroup_iter_break()
mm/migrate_device: fix spinlock leak in migrate_vma_insert_huge_pmd_page
|
|
BC.cond instructions introduced by FEAT_HBC cannot be executed
out-of-line, like other branch instructions. However, they can be
simulated in the same way as B.cond instructions.
Extend the B.cond decoder mask to match BC.cond instructions as well,
and handle them using the existing B.cond simulation path.
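As a sketch of the decode change: B.cond and BC.cond share bits [31:24] (0x54) and differ only in bit 4, so dropping bit 4 from the mask lets one matcher cover both. The macro and function names below are hypothetical, and the mask constants are illustrative rather than a copy of the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits [31:24] are 0x54 for both; bit 4 is 0 for B.cond, 1 for BC.cond. */
#define BCOND_VALUE       0x54000000u
#define BCOND_MASK_STRICT 0xFF000010u /* matches B.cond only */
#define BCOND_MASK_WIDE   0xFF000000u /* ignores bit 4, so BC.cond matches too */

static_assert((BCOND_VALUE & ~BCOND_MASK_STRICT) == 0u,
	      "match value must lie inside the mask");

bool is_bcond_strict(uint32_t insn)
{
	return (insn & BCOND_MASK_STRICT) == BCOND_VALUE;
}

bool is_bcond_or_bccond(uint32_t insn)
{
	return (insn & BCOND_MASK_WIDE) == BCOND_VALUE;
}

/* Branch offset in bytes: sign-extended imm19 (bits [23:5]) times 4. */
int32_t bcond_offset(uint32_t insn)
{
	int32_t imm19 = (int32_t)((insn >> 5) & 0x7FFFFu);

	if (imm19 & 0x40000) /* sign bit of the 19-bit field */
		imm19 -= 0x80000;
	return imm19 * 4;
}

/* Condition code in bits [3:0], shared by both encodings. */
unsigned int bcond_cond(uint32_t insn)
{
	return insn & 0xFu;
}
```

Because the offset and condition fields sit at the same positions in both encodings, the same simulation helpers apply once the wider mask has matched.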
Fixes: 7f86d128e437 ("arm64: add HWCAP for FEAT_HBC (hinted conditional branches)")
Cc: <stable@vger.kernel.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
- Fix ARM64-specific rseq regressions (Mark Rutland)
* tag 'sched-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/entry: Fix arm64-specific rseq brokenness
|
|
init_on_free
__GFP_ZEROTAGS semantics are currently a bit weird, but effectively this
flag is only ever set alongside __GFP_ZERO and __GFP_SKIP_KASAN.
If we run with init_on_free, we will zero out pages during
__free_pages_prepare(), to skip zeroing on the allocation path.
However, when allocating with __GFP_ZEROTAGS set, post_alloc_hook() will
consequently not only skip clearing page content, but also skip clearing
tag memory.
Not clearing tags through __GFP_ZEROTAGS is irrelevant for most pages that
will get mapped to user space through set_pte_at() later: set_pte_at() and
friends will detect that the tags have not been initialized yet
(PG_mte_tagged not set), and initialize them.
However, for the huge zero folio, which will be mapped through a PMD
marked as special, this initialization will not be performed, ending up
exposing whatever tags were still set for the pages.
The docs (Documentation/arch/arm64/memory-tagging-extension.rst) state
that allocation tags are set to 0 when a page is first mapped to user
space. That no longer holds with the huge zero folio when init_on_free is
enabled.
Fix it by decoupling __GFP_ZEROTAGS from __GFP_ZERO, passing to
tag_clear_highpages() whether we want to also clear page content.
Invert the meaning of the tag_clear_highpages() return value to have
clearer semantics.
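The decoupling can be pictured with a reduced decision model. This is a sketch under assumptions: `struct init_plan`, `plan_old()` and `plan_new()` are hypothetical and do not mirror post_alloc_hook(); they only capture the coupling described above, where skipping content zeroing also skipped tag clearing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the allocation path does to a page. */
struct init_plan {
	bool clear_content; /* zero the page data */
	bool clear_tags;    /* zero the MTE allocation tags */
};

/*
 * Old coupling: tag clearing rode along with content clearing, so skipping
 * the content (already zeroed at free time) also skipped the tags.
 */
struct init_plan plan_old(bool want_zero, bool want_zerotags, bool zeroed_on_free)
{
	bool init = want_zero && !zeroed_on_free;
	struct init_plan p = { .clear_content = init,
			       .clear_tags = init && want_zerotags };

	/* The text notes the tag flag is only set alongside the zero flag. */
	assert(want_zero || !want_zerotags);
	return p;
}

/* Decoupled: tags are cleared whenever requested; content only if needed. */
struct init_plan plan_new(bool want_zero, bool want_zerotags, bool zeroed_on_free)
{
	struct init_plan p = { .clear_content = want_zero && !zeroed_on_free,
			       .clear_tags = want_zerotags };

	assert(want_zero || !want_zerotags);
	return p;
}
```

With init_on_free in effect (`zeroed_on_free` true), the old plan clears nothing while the new plan still clears the tags.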
Reproduced with the huge zero folio by modifying the check_buffer_fill
arm64/mte selftest to use a 2 MiB area, after making sure that pages have
a non-0 tag set when freeing (note that, during boot, we will not actually
initialize tags, but only set KASAN_TAG_KERNEL in the page flags).
$ ./check_buffer_fill
1..20
...
not ok 17 Check initial tags with private mapping, sync error mode and mmap memory
not ok 18 Check initial tags with private mapping, sync error mode and mmap/mprotect memory
...
This code needs more cleanups; we'll tackle that next, like
decoupling __GFP_ZEROTAGS from __GFP_SKIP_KASAN.
[akpm@linux-foundation.org: s/__GPF_ZERO/__GFP_ZERO/, per David]
Link: https://lore.kernel.org/20260421-zerotags-v2-1-05cb1035482e@kernel.org
Fixes: adfb6609c680 ("mm/huge_memory: initialise the tags of the huge zero folio")
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Lance Yang <lance.yang@linux.dev>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Pull kvm fixes from Paolo Bonzini:
"arm64:
- Add the pKVM side of the workaround for ARM's erratum 4193714,
provided that the EL3 firmware does its part of the job. KVM will
refuse to initialise otherwise
- Correctly handle 52bit VAs for guest EL2 stage-1 translations when
running under NV with E2H==0
- Correctly deal with permission faults in guest_memfd memslots
- Fix the steal-time selftest after the infrastructure was reworked
- Make sure the host cannot pass a nonsensical clock update to the
EL2 tracing infrastructure
- Appoint Steffen Eiden as a reviewer in anticipation of the KVM/s390
ability to run arm64 guests, which will inevitably lead to arm64
code being directly used on s390
- Make sure that EL2 is configured with both exception entry and exit
being Context Synchronization Events
- Handle the current vcpu being NULL on EL2 panic
- Fix the selftest_vcpu memcache being empty at the point of donation
or sharing
- Check that the memcache has enough capacity before engaging on the
share/donate path
- Fix __deactivate_fgt() to use its parameter rather than a variable
in the macro context
s390:
- Fix array overrun with large numbers of PCI devices
x86:
- Never use L0's PAUSE loop exiting while L2 is running, since it's
unlikely that a nested guest will help solve the hypervisor's
spinlock contention
- Fix emulation of MOVNTDQA
- Fix typo in Xen hypercall tracepoint
- Add back an optimization that was left behind when recently fixing
a bug
- Add module parameter to disable CET, whose implementation seems to
have issues. For now it remains enabled by default
Generic:
- Reject offset causing an unsigned overflow in kvm_reset_dirty_gfn()
Documentation:
- Update stale links
Selftests:
- Fix guest_memfd_test with host page size > guest page size"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
KVM: VMX: introduce module parameter to disable CET
KVM: x86: Swap the dst and src operand for MOVNTDQA
KVM: x86: use again the flush argument of __link_shadow_page()
KVM: selftests: Ensure gmem file sizes are multiple of host page size
Documentation: kvm: update links in the references section of AMD Memory Encryption
KVM: nSVM: Never use L0's PAUSE loop exiting while L2 is running
KVM: x86: Fix Xen hypercall tracepoint argument assignment
KVM: Reject wrapped offset in kvm_reset_dirty_gfn()
KVM: arm64: Pre-check vcpu memcache for host->guest donate
KVM: arm64: Pre-check vcpu memcache for host->guest share
KVM: arm64: Seed pkvm_ownership_selftest vcpu memcache
KVM: arm64: Fix __deactivate_fgt macro parameter typo
KVM: arm64: Guard against NULL vcpu on VHE hyp panic path
KVM: arm64: Make EL2 exception entry and exit context-synchronization events
MAINTAINERS: Add Steffen as reviewer for KVM/arm64
KVM: arm64: Remove potential UB on nvhe tracing clock update
KVM: selftests: arm64: Fix steal_time test after UAPI refactoring
KVM: arm64: Handle permission faults with guest_memfd
KVM: arm64: nv: Consider the DS bit when translating TCR_EL2
KVM: arm64: Work around C1-Pro erratum 4193714 for protected guests
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
- ptrace(PTRACE_SETREGSET) fix to zero the target's fpsimd_state rather
than the tracer's
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/fpsimd: ptrace: zero target's fpsimd_state, not the tracer's
|
|
Mathias Stearn reports that since v6.19, there are two big issues
affecting rseq:
(1) On arm64 specifically, rseq critical sections aren't aborted when
they should be.
(2) The 'cpu_id_start' field is no longer written by the kernel in all
cases it used to be, including some cases where TCMalloc depends on
the kernel clobbering the field.
This patch fixes issue #1. This patch DOES NOT fix issue #2, which will
need to be addressed by other patches.
The arm64-specific brokenness is a result of commits:
2fc0e4b4126c ("rseq: Record interrupt from user space")
39a167560a61 ("rseq: Optimize event setting")
The first commit failed to add a call to rseq_note_user_irq_entry() on
arm64. Thus arm64 never sets rseq_event::user_irq to record that it may
be necessary to abort an active rseq critical section upon return to
userspace. On its own, this commit had no functional impact as the value
of rseq_event::user_irq was not consumed.
The second commit relied upon rseq_event::user_irq to determine whether
or not to bother to perform rseq work when returning to userspace. As
rseq_event::user_irq wasn't set on arm64, this work would be skipped,
and consequently an active rseq critical section would not be aborted.
Fix this by giving arm64 syscall-specific entry/exit paths, and
performing the relevant logic in syscall and non-syscall paths,
including calling rseq_note_user_irq_entry() for non-syscall entry.
Currently arm64 cannot use syscall_enter_from_user_mode(),
syscall_exit_to_user_mode(), and irqentry_exit_to_user_mode(), due to
ordering constraints with exception masking, and risk of ABI breakage
for syscall tracing/audit/etc. For the moment the entry/exit logic is
left as arm64-specific, directly using enter_from_user_mode() and
exit_to_user_mode(), but mirroring the generic code.
I intend to follow up with refactoring/cleanup, as we did for kernel
mode entry paths in commit:
041aa7a85390 ("entry: Split preemption from irqentry_exit_to_kernel_mode()")
... which will allow arm64 to use the GENERIC_IRQ_ENTRY functions directly.
Fixes: 39a167560a61 ("rseq: Optimize event setting")
Reported-by: Mathias Stearn <mathias@mongodb.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/regressions/CAHnCjA25b+nO2n5CeifknSKHssJpPrjnf+dtr7UgzRw4Zgu=oA@mail.gmail.com/
Link: https://patch.msgid.link/20260508142023.3268622-1-mark.rutland@arm.com
|
|
__pkvm_host_donate_guest() flips the host stage-2 PTE for the
donated page to a non-valid annotation via
host_stage2_set_owner_metadata_locked() and then calls
kvm_pgtable_stage2_map() to install the matching guest stage-2
mapping. The map's return value is wrapped in WARN_ON() and
otherwise discarded, asserting that the call cannot fail.
WARN_ON() at nVHE EL2 panics, so this assertion is only correct
if the call genuinely cannot fail. kvm_pgtable_stage2_map() can
fail with -ENOMEM even at PAGE_SIZE granularity: the donate path
verifies PKVM_NOPAGE for the guest IPA before the map, so the
walker must allocate fresh page-table pages from the vcpu
memcache, and the host controls the vcpu memcache via the topup
interface. An under-provisioned donation request would otherwise
turn a recoverable -ENOMEM into a fatal hyp panic.
Bound the worst-case walker allocation alongside the existing
__host_check_page_state_range() / __guest_check_page_state_range()
pre-checks, using the helper introduced for host->guest share. If
the vcpu memcache holds fewer pages than kvm_mmu_cache_min_pages(),
return -ENOMEM before any state mutation.
Fixes: 1e579adca177 ("KVM: arm64: Introduce __pkvm_host_donate_guest()")
Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-7-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
__pkvm_host_share_guest() ends with kvm_pgtable_stage2_map() to
install the guest stage-2 mapping, after a forward pass that mutates
the host vmemmap (sets PKVM_PAGE_SHARED_OWNED and increments
host_share_guest_count) for every page in the range. The map's
return value is wrapped in WARN_ON() and otherwise discarded,
asserting that the call cannot fail.
WARN_ON() at nVHE EL2 panics, so this assertion is only correct if
the call genuinely cannot fail. kvm_pgtable_stage2_map() can fail
with -ENOMEM when the stage-2 walker exhausts the caller's
memcache, and the host controls the vcpu memcache via the topup
interface, so an under-provisioned share request would otherwise
turn a recoverable -ENOMEM into a fatal hyp panic.
Bound the worst-case walker allocation in the existing pre-check
pass so that kvm_pgtable_stage2_map() cannot fail at the call
site, using kvm_mmu_cache_min_pages() -- the same bound host EL1
uses for its own stage-2 maps. If the vcpu memcache holds fewer
pages, return -ENOMEM before any state mutation.
Fixes: d0bd3e6570ae ("KVM: arm64: Introduce __pkvm_host_share_guest()")
Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The hypercall handlers call pkvm_refill_memcache() to top up the
hyp_vcpu memcache before invoking __pkvm_host_{share,donate}_guest().
pkvm_ownership_selftest invokes those functions directly with a
static selftest_vcpu that has an empty memcache.
Seed selftest_vcpu's memcache from the prepopulated selftest
pages, leaving the remainder for selftest_vm.pool. Required by
the memcache-sufficiency pre-check added in the following
patches.
Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
__deactivate_fgt() declares its first parameter as "htcxt" but the body
references "hctxt". The parameter is unused; the macro silently captures
"hctxt" from the enclosing scope. Both existing callers
(__deactivate_traps_hfgxtr() and __deactivate_traps_ich_hfgxtr()) happen
to define a local "struct kvm_cpu_context *hctxt", so the macro works
by coincidence.
A future caller without an "hctxt" local in scope, or naming it
differently, would compile but bind to the wrong context. Align the
parameter name with the sibling __activate_fgt() macro.
The "vcpu" parameter remains unused in the body, kept for API symmetry
with __activate_fgt() (which uses it).
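The capture problem is easy to reproduce outside the kernel. In this sketch the macros and `struct ctx_model` are hypothetical stand-ins; only the misspelled parameter name mirrors the bug described above.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for struct kvm_cpu_context. */
struct ctx_model {
	int value;
};

/*
 * Buggy shape: the parameter is spelled "htcxt" but the body says "hctxt",
 * so the body binds to whatever "hctxt" exists at the expansion site.
 */
#define READ_CTX_BUGGY(htcxt) ((hctxt)->value)

/* Fixed shape: parameter and body agree, so the argument is used. */
#define READ_CTX_FIXED(hctxt) ((hctxt)->value)

int read_buggy(struct ctx_model *hctxt, struct ctx_model *other)
{
	assert(hctxt != NULL && other != NULL);
	return READ_CTX_BUGGY(other); /* silently reads hctxt, not other */
}

int read_fixed(struct ctx_model *hctxt, struct ctx_model *other)
{
	assert(hctxt != NULL && other != NULL);
	return READ_CTX_FIXED(other); /* reads other, as the caller asked */
}
```

Both functions compile without complaint, which is why the mismatch went unnoticed while every caller happened to have a local named "hctxt".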
Fixes: f5a5a406b4b8 ("KVM: arm64: Propagate and handle Fine-Grained UNDEF bits")
Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
On VHE, __hyp_call_panic() unconditionally calls __deactivate_traps(vcpu)
on the vcpu pointer read from host_ctxt->__hyp_running_vcpu. That pointer
is cleared after every guest exit (and is never set when no guest is
running), so an unexpected EL2 exception landing in _guest_exit_panic,
e.g. via the el2t*_invalid / el2h_irq_invalid vectors - reaches this
function with vcpu == NULL. __deactivate_traps() then dereferences vcpu
via ___deactivate_traps() -> vserror_state_is_nested() -> vcpu_has_nv()
-> vcpu->arch.features, faulting inside the panic handler and obscuring
the original failure.
The nVHE counterpart (hyp_panic() in arch/arm64/kvm/hyp/nvhe/switch.c)
already guards its vcpu-using cleanup with "if (vcpu)"; mirror that
here. sysreg_restore_host_state_vhe() does not depend on vcpu and
continues to run unconditionally, preserving panic forensics. The
trailing panic("...VCPU:%p", vcpu) prints "(null)" safely via printk's
%p handling.
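The guarded shape can be sketched as below. All names are hypothetical; `deactivate_traps_model()` stands in for the vcpu-dependent cleanup and `host_state_restored` for the step that does not need a vcpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced vcpu: only the field the cleanup dereferences. */
struct vcpu_model {
	unsigned long features;
};

/* Records what the panic path managed to do, for illustration only. */
struct panic_trace {
	bool traps_deactivated;
	bool host_state_restored;
};

/* Stand-in for the vcpu-dependent cleanup: must never see a NULL vcpu. */
static void deactivate_traps_model(struct vcpu_model *vcpu, struct panic_trace *t)
{
	assert(vcpu != NULL);
	(void)vcpu->features; /* dereference, as the real helper chain does */
	t->traps_deactivated = true;
}

/* Guarded panic path: vcpu-dependent cleanup only when a vcpu exists. */
struct panic_trace hyp_panic_model(struct vcpu_model *vcpu)
{
	struct panic_trace t = { false, false };

	if (vcpu)
		deactivate_traps_model(vcpu, &t);
	t.host_state_restored = true; /* does not depend on vcpu */
	return t;
}
```

With a NULL vcpu the model skips the dereference but still performs the host-state step, which is the property the fix preserves.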
Fixes: 6a0259ed29bb ("KVM: arm64: Remove hyp_panic arguments")
Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
SCTLR_EL2.EIS and SCTLR_EL2.EOS control whether exception entry and
exit at EL2 are Context Synchronisation Events (CSEs). Per ARM DDI
0487 M.b D24.2.175 (p. D24-9754):
- !FEAT_ExS: the bit is RES1, so the entry/exit is unconditionally
a CSE.
- FEAT_ExS: the reset value is architecturally UNKNOWN; software
must set the bit to make the entry/exit a CSE.
INIT_SCTLR_EL2_MMU_ON in arch/arm64/include/asm/sysreg.h sets neither
bit. KVM/arm64 hot paths rely on ERET from EL2 being a CSE, and on
synchronous EL1->EL2 entry being a CSE, to elide explicit ISBs after
MSRs to context-switching system registers (HCR_EL2, ZCR_EL2,
ptrauth keys, etc.). On FEAT_ExS hardware those reliances are not
architecturally backed unless EOS=1 (and, for entry, EIS=1).
Until commit 0a35bd285f43 ("arm64: Convert SCTLR_EL2 to sysreg
infrastructure"), SCTLR_EL2_RES1 was a hand-rolled mask that
included BIT(11) (EOS) and BIT(22) (EIS), so INIT_SCTLR_EL2_MMU_ON
was setting both unconditionally. The conversion made
SCTLR_EL2_RES1 auto-generated; because the sysreg tooling only
models unconditionally-RES1 fields and EIS/EOS are RES1 only when
FEAT_ExS is absent, the auto-generated mask is UL(0). The seven
other bits dropped from the old mask (positions 4, 5, 16, 18, 23,
28, 29) are unconditionally RES1 in the E2H=0 SCTLR_EL2 layout per
DDI 0487 M.b D24.2.175, so dropping them is harmless. EIS and EOS
are the only bits whose semantics changed for FEAT_ExS hardware
and where the kernel relies on the value being 1.
Make the guarantee explicit: include SCTLR_ELx_EIS | SCTLR_ELx_EOS in
INIT_SCTLR_EL2_MMU_ON so that EL2 exception entry and exit are
unconditionally CSEs regardless of whether FEAT_ExS is implemented.
This matches the pairing in arch/arm64/kvm/config.c which treats EIS
and EOS together as RES1 under !FEAT_ExS.
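The bit arithmetic behind the message can be checked with a short sketch. The macro and function names are hypothetical; the bit positions are the ones listed above (EOS at bit 11, EIS at bit 22, plus the seven other bits of the old hand-rolled mask).

```c
#include <assert.h>
#include <stdint.h>

#define BIT64(n) (UINT64_C(1) << (n))

#define SCTLR_EOS BIT64(11) /* exception exit is a CSE */
#define SCTLR_EIS BIT64(22) /* exception entry is a CSE */

/* Bit positions of the old hand-rolled RES1 mask, as listed in the text. */
static const unsigned int old_res1_bits[] = { 4, 5, 11, 16, 18, 22, 23, 28, 29 };

uint64_t old_res1_mask(void)
{
	uint64_t mask = 0;
	unsigned int i;

	for (i = 0; i < sizeof(old_res1_bits) / sizeof(old_res1_bits[0]); i++) {
		assert(old_res1_bits[i] < 64);
		mask |= BIT64(old_res1_bits[i]);
	}
	return mask;
}

/* OR the two bits into an init value so entry and exit are always CSEs. */
uint64_t with_eis_eos(uint64_t init_value)
{
	return init_value | SCTLR_EIS | SCTLR_EOS;
}
```

Starting from an auto-generated RES1 mask of zero, `with_eis_eos(0)` restores exactly the two bits whose loss mattered on FEAT_ExS hardware.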
Fixes: 0a35bd285f43 ("arm64: Convert SCTLR_EL2 to sysreg infrastructure")
Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com>
Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/fixes
Renesas fixes for v7.1
- Fix SCIF (serial port) clocks on R-Car X5H,
- Fix various dtc and dtbs_check warnings.
* tag 'renesas-fixes-for-v7.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
arm64: dts: renesas: r9a09g056: Add #mux-state-cells to usb20phyrst
arm64: dts: renesas: r9a09g057: Add #mux-state-cells to usb2{0,1}phyrst
ARM: dts: renesas: rskrza1: Drop superfluous cells
ARM: dts: renesas: genmai: Drop superfluous cells
ARM: dts: renesas: r7s72100: Add missing unit address to bus node
ARM: dts: renesas: r8a7792: Add missing unit address to bus node
ARM: dts: renesas: r8a7779: Add missing unit address to bus node
ARM: dts: renesas: r8a7778: Add missing unit address to bus node
arm64: dts: renesas: rz-smarc-du-adv7513-smarc: Fix missing cells and reg in DU subnode
arm64: dts: renesas: rz-smarc-cru-csi-ov5645: Fix missing cells and reg in CSI2 subnode
arm64: dts: renesas: salvator-panel: Fix missing cells and reg in DTO
arm64: dts: renesas: draak/ebisu-panel: Fix missing cells and reg in DTO
arm64: dts: renesas: r8a78000: Fix SCIF brg_int clocks
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Sashiko (locally) reports the possibility of division by zero and an
out-of-bounds bitwise shift in trace_clock_update().
Although the clock update is untrusted, we should at least have some
basic checks to avoid undefined behaviours.
Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Link: https://patch.msgid.link/20260430103724.2151625-1-smostafa@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
gmem_abort() calls kvm_pgtable_stage2_map() to make changes to stage 2. It
does this for both relaxing permissions on an existing mapping and to
install a missing mapping.
kvm_pgtable_stage2_map() doesn't make changes to stage 2 if there is an
existing, valid entry and the new entry modifies only the permissions.
This is checked in:
kvm_pgtable_stage2_map()
stage2_map_walk_leaf()
stage2_map_walker_try_leaf()
stage2_pte_needs_update()
and if only the permissions differ, kvm_pgtable_stage2_map() returns
-EAGAIN and KVM returns to the guest to replay the instruction. The
assumption is that a concurrent fault on a different VCPU already mapped
the faulting IPA, and replaying the instruction will either succeed, or
cause a permission fault, which should be handled with
kvm_pgtable_stage2_relax_perms().
gmem_abort(), on a read or write fault on a system without DIC (instruction
cache invalidation required for data to instruction coherence), installs a
valid entry with read and write permissions, but without executable
permissions. On an execution fault on the same page, gmem_abort() attempts
to relax the permissions to allow execution, but calls
kvm_pgtable_stage2_map() to change the existing, valid, entry.
kvm_pgtable_stage2_map() returns -EAGAIN and KVM resumes execution from the
faulting instruction, which leads to an infinite loop of permission faults
on the same instruction.
Allow the guest to make progress by using kvm_pgtable_stage2_relax_perms()
to relax permissions.
Fixes: a7b57e099592 ("KVM: arm64: Handle guest_memfd-backed guest page faults")
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260505094913.75317-1-alexandru.elisei@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When running an nVHE L1, TCR_EL2 is mapped to TCR_EL1. Writes to the
register are trapped and written to TCR_EL1 after a translation.
Booting an nVHE L1 with 52-bit VA isn't working because the translation
was ignoring the DS bit set by the guest, hence causing repeating level
0 faults. Add it in the translation function.
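A reduced sketch of the missing step follows. It models only T0SZ and DS and ignores every other field; the function name is hypothetical, and the DS bit positions (bit 32 in the E2H=0 TCR_EL2 layout, bit 59 in TCR_EL1) are stated from my reading of the architecture and should be treated as assumptions to verify.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions; verify against the architecture manual. */
#define TCR_EL2_DS_MODEL (UINT64_C(1) << 32) /* DS, E2H=0 TCR_EL2 layout */
#define TCR_EL1_DS_MODEL (UINT64_C(1) << 59) /* DS, TCR_EL1 layout */

/* T0SZ sits in bits [5:0] of both layouts. */
#define TCR_T0SZ_MASK UINT64_C(0x3f)

static_assert(TCR_EL2_DS_MODEL != TCR_EL1_DS_MODEL,
	      "DS sits at different bits in the two layouts");

uint64_t translate_tcr_model(uint64_t tcr_el2)
{
	uint64_t tcr_el1 = tcr_el2 & TCR_T0SZ_MASK;

	/* The step that was missing: carry the guest's DS bit across. */
	if (tcr_el2 & TCR_EL2_DS_MODEL)
		tcr_el1 |= TCR_EL1_DS_MODEL;
	return tcr_el1;
}
```

Because the bit lives at a different position in each layout, a translation that copies fields one by one drops DS unless it is handled explicitly.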
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
Link: https://patch.msgid.link/20260505144735.1496530-1-weilin.chang@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
C1-Pro cores with SME have an erratum where TLBI+DSB does not complete
all outstanding SME accesses. Instead a DSB needs to be executed on the
affected CPUs. The implication is that pages cannot be unmapped from the
host Stage 2 and then provided to a protected guest or to the
hypervisor. Host SME accesses may still complete after this point.
This erratum breaks pKVM's guarantees, and the workaround is hard to
implement as EL2 and EL1 share a security state meaning EL1 can mask
IPIs sent by EL2, leading to interrupt blackouts.
Instead, do this in EL3. This has the advantage of a separate security
state, meaning lower EL cannot mask the IPI. It is also simpler for EL3
to know about CPUs that are off or in PSCI's CPU_SUSPEND.
Add the needed hook to host_stage2_set_owner_metadata_locked(). This
covers the cases where the host loses access to a page:
__pkvm_host_donate_guest()
__pkvm_guest_unshare_host()
host_stage2_set_owner_locked() when owner_id == PKVM_ID_HYP
Since pKVM relies on the firmware call for correctness, check for the
firmware counterpart during protected KVM initialisation and fail the
pKVM initialisation if it is missing.
Signed-off-by: James Morse <james.morse@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: Sudeep Holla <sudeep.holla@kernel.org>
Link: https://patch.msgid.link/20260505165205.2690919-1-catalin.marinas@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
sve_set_common() is the backend for PTRACE_SETREGSET(NT_ARM_SVE) and
PTRACE_SETREGSET(NT_ARM_SSVE). Every write in the function operates on
the tracee (target) - except a single memset that uses current instead,
zeroing the tracer's saved V0-V31 / FPSR / FPCR shadow on every ptrace
SETREGSET call.
The memset is meant to give the tracee a defined zero register image
before the user-supplied payload is copied in (for partial writes,
header-only writes, and FPSIMD<->SVE format switches). Aiming it at
current both denies the tracee that clean slate and silently corrupts
the tracer.
The corruption of the tracer's saved FPSIMD state is not always
observable. Where the tracer's state is live on a CPU, this may be
reused without loading the corrupted state from memory, and will
eventually be written back over the corrupted state. Where the tracer's
state is saved in SVE_PT_REGS_SVE format, only the FPSR and FPCR are
clobbered, and the effective copy of the vectors is in the task's
sve_state.
Reproducible on an arm64 kernel with SVE: a single-threaded tracer that
loads a known pattern into V0-V31, issues PTRACE_SETREGSET(NT_ARM_SVE)
on a child, and reads V0-V31 back observes them all zeroed within tens
of thousands of iterations when a sibling thread keeps stealing the
FPSIMD CPU binding.
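The bug pattern can be shown with a minimal user-space sketch. The names
below (demo_task, demo_current, demo_set_regs) are hypothetical stand-ins
and not the kernel's own; the only point is that the clearing step has to
act on the target passed in rather than on the caller's own state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a task's saved FPSIMD register image. */
struct demo_task {
	unsigned char vregs[32][16];	/* V0-V31, 128 bits each */
};

/* Hypothetical stand-in for "current": the task issuing the ptrace call. */
static struct demo_task *demo_current;

/* Zero the target's image, then copy in the (possibly partial) payload. */
static void demo_set_regs(struct demo_task *target,
			  const void *payload, size_t len)
{
	assert(target && len <= sizeof(target->vregs));

	/* The buggy form would be memset(demo_current->vregs, ...). */
	memset(target->vregs, 0, sizeof(target->vregs));
	if (len)
		memcpy(target->vregs, payload, len);
}
```

With the clear aimed at the target, a partial write leaves the remaining
target registers zeroed and the caller's image untouched.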
Fixes: 316283f276eb ("arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE")
Cc: <stable@vger.kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Avoid writing an uninitialised stack variable to POR_EL0 on sigreturn
if the poe_context record is absent
- Reserve one more page for the early 4K-page kernel mapping to cover
the extra [_text, _stext) split introduced by the non-executable
read-only mapping
- Force the arch_local_irq_*() wrappers to be __always_inline so that
noinstr entry and idle paths cannot call out-of-line, instrumentable
copies
- Fix potential sign extension in the arm64 SCS unwinder's DWARF
advance_loc4 decoding
- Tolerate arm64 ACPI platforms with only WFI and no deeper PSCI idle
states, restoring cpuidle registration on such systems
- Include the UAPI <asm/ptrace.h> header in the arm64 GCS libc test
rather than carrying a duplicate struct user_gcs definition (the
original #ifdef NT_ARM_GCS was wrong to cover the structure
definition as it would be masked out if the toolchain defined it)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: signal: Preserve POR_EL0 if poe_context is missing
arm64: Reserve an extra page for early kernel mapping
kselftest/arm64: Include <asm/ptrace.h> for user_gcs definition
ACPI: arm64: cpuidle: Tolerate platforms with no deep PSCI idle states
arm64/irqflags: __always_inline the arch_local_irq_*() helpers
arm64/scs: Fix potential sign extension issue of advance_loc4
|
|
Commit 2e8a1acea859 ("arm64: signal: Improve POR_EL0 handling to
avoid uaccess failures") delayed the write to POR_EL0 in
rt_sigreturn to avoid spurious uaccess failures. This change however
relies on the poe_context frame record being present: on a system
supporting POE, calling sigreturn without a poe_context record now
results in writing arbitrary data from the kernel stack into POR_EL0.
Fix this by adding a __valid_fields member to struct
user_access_state, and zeroing the struct on allocation.
restore_poe_context() then indicates that the por_el0 field is valid
by setting the corresponding bit in __valid_fields, and
restore_user_access_state() only touches POR_EL0 if there is a valid
value to set it to. This is in line with how POR_EL0 was originally
handled; all frame records are currently optional, except
fpsimd_context.
To ensure that __valid_fields is kept in sync, fields (currently
just por_el0) are now accessed via accessors and prefixed with __ to
discourage direct access.
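A minimal sketch of the valid-bit scheme described above, in plain
user-space C. All names (demo_access_state, demo_set_por_el0,
demo_restore) are hypothetical and only model the idea; the real
struct user_access_state and its accessors may differ.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_VALID_POR_EL0	(1u << 0)

/* Hypothetical analogue of struct user_access_state. */
struct demo_access_state {
	uint32_t valid_fields;	/* bitmap of fields that were explicitly set */
	uint64_t por_el0;	/* only meaningful if DEMO_VALID_POR_EL0 is set */
};

/* Setter keeps the value and its validity bit in sync. */
static void demo_set_por_el0(struct demo_access_state *st, uint64_t val)
{
	assert(st);
	st->por_el0 = val;
	st->valid_fields |= DEMO_VALID_POR_EL0;
}

/* Write the "register" only if a frame record supplied a value. */
static void demo_restore(const struct demo_access_state *st, uint64_t *reg)
{
	assert(st && reg);
	if (st->valid_fields & DEMO_VALID_POR_EL0)
		*reg = st->por_el0;
}
```

A zero-initialised state therefore leaves the register alone, which is
the behaviour wanted when no poe_context record is present.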
Fixes: 2e8a1acea859 ("arm64: signal: Improve POR_EL0 handling to avoid uaccess failures")
Cc: <stable@vger.kernel.org>
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The final part of [data, end) segment may overflow into the next page of
init_pg_end[1] which is the gap page before early_init_stack[2]:
[1]
crash_arm64_v9.0.1> vtop ffffffed00601000
VIRTUAL PHYSICAL
ffffffed00601000 83401000
PAGE DIRECTORY: ffffffecffd62000
PGD: ffffffecffd62da0 => 10000000833fb003
PMD: ffffff80033fb018 => 10000000833fe003
PTE: ffffff80033fe008 => 68000083401f03
PAGE: 83401000
PTE PHYSICAL FLAGS
68000083401f03 83401000 (VALID|SHARED|AF|NG|PXN|UXN)
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffffec00d0040 83401000 0 0 1 4000 reserved
[2]
ffffffed002c8000 (r) __pi__data
ffffffed0054e000 (d) __pi___bss_start
ffffffed005f5000 (b) __pi_init_pg_dir
ffffffed005fe000 (b) __pi_init_pg_end
ffffffed005ff000 (B) early_init_stack
ffffffed00608000 (b) __pi__end
For 4K pages, the early kernel mapping may use 2MB block entries but the
kernel segments are only 64KB aligned. Segment boundaries that fall
within a 2MB block therefore require a PTE table so that different
attributes can be applied on either side of the boundary.
KERNEL_SEGMENT_COUNT still correctly counts the five permanent kernel
VMAs registered by declare_kernel_vmas(). However, since commit
5973a62efa34 ("arm64: map [_text, _stext) virtual address range
non-executable+read-only"), the early mapper also maps [_text, _stext)
separately from [_stext, _etext). This adds one more early-only split
and can require one more page-table page than the existing
EARLY_SEGMENT_EXTRA_PAGES allowance reserves.
Increase the 4K-page early mapping allowance by one page to cover that
additional split.
Fixes: 5973a62efa34 ("arm64: map [_text, _stext) virtual address range non-executable+read-only")
Assisted-by: TRAE:GLM-5.1
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
[catalin.marinas@arm.com: rewrote part of the commit log]
[catalin.marinas@arm.com: expanded the code comment]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The arch_local_irq_*() wrappers in <asm/irqflags.h> dispatch between two
underlying primitives: the __daif_* path on most systems, and the
__pmr_* path on builds that use GIC PMR-based masking (Pseudo-NMI). The
leaf primitives are already __always_inline, but the wrappers themselves
are plain "static inline".
That is unsafe for noinstr callers: nothing prevents the compiler from
emitting an out-of-line copy of e.g. arch_local_irq_disable(), and an
out-of-line copy can be instrumented (ftrace, kcov, sanitizers), which
breaks the noinstr contract on the entry/idle paths that rely on these
helpers.
x86 hit and fixed exactly this class of bug in commit 7a745be1cc90
("x86/entry: __always_inline irqflags for noinstr").
Force-inline all of the arch_local_irq_*() wrappers so they cannot be
emitted out-of-line:
- arch_local_irq_enable()
- arch_local_irq_disable()
- arch_local_save_flags()
- arch_irqs_disabled_flags()
- arch_irqs_disabled()
- arch_local_irq_save()
- arch_local_irq_restore()
The primary motivation is noinstr safety. There is a useful side effect
for fleet-wide profiling: when the wrapper is emitted out-of-line,
samples taken inside it during the post-WFI IRQ unmask in
default_idle_call() are attributed to arch_local_irq_enable rather than
default_idle_call(), and the FP-unwinder loses default_idle_call() from
the chain.
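As an illustration, the wrapper/leaf arrangement can be sketched in
user-space C. This assumes the GCC/Clang always_inline attribute, which
is what the kernel's __always_inline is built on; the demo_* names and
the flag variable are hypothetical.

```c
#include <assert.h>

/* Assumption: GCC/Clang attribute syntax, mirroring __always_inline. */
#define demo_always_inline inline __attribute__((__always_inline__))

static unsigned long demo_irq_flags;	/* hypothetical mask state */

/* Leaf primitive: already force-inlined. */
static demo_always_inline void demo_daif_disable(void)
{
	demo_irq_flags = 1;
}

/*
 * Wrapper: with plain "static inline" the compiler may still emit an
 * out-of-line copy; the attribute forces inlining into every caller.
 */
static demo_always_inline void demo_local_irq_disable(void)
{
	demo_daif_disable();
}

/* Stand-in for a noinstr caller on the entry or idle path. */
static unsigned long demo_entry_path(void)
{
	demo_local_irq_disable();
	assert(demo_irq_flags == 1);	/* sanity check for this sketch only */
	return demo_irq_flags;
}
```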
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Leonardo Bras <leo.bras@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The expressions (*opcode++ << 24) and exp * code_alignment_factor
may overflow a signed int and become negative.
Fix this by casting each byte to u64 before shifting. Also fix
the misaligned break statement while we are here.
Example of the result can be seen here:
Link: https://godbolt.org/z/zhY8d3595
It may not be a real problem today, but it could become an issue in the future.
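The effect can be reproduced with a small user-space sketch. It assumes
a little-endian 4-byte operand and uses hypothetical demo_* helpers. The
"signed" variant models the old behaviour through an explicit int32_t
conversion (two's complement, as on GCC and Clang) rather than relying
on an overflowing shift.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Models the buggy decode: the value is formed as a 32-bit signed
 * quantity, so a top byte of 0x80..0xff yields a negative number that
 * is sign-extended when widened to 64 bits.
 */
static uint64_t demo_decode_signed(const uint8_t *p)
{
	assert(p);
	int32_t v = (int32_t)((uint32_t)p[0] | (uint32_t)p[1] << 8 |
			      (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
	return (uint64_t)(int64_t)v;
}

/* Fixed decode: widen each byte to 64 bits before shifting. */
static uint64_t demo_decode_u64(const uint8_t *p)
{
	assert(p);
	return (uint64_t)p[0] | (uint64_t)p[1] << 8 |
	       (uint64_t)p[2] << 16 | (uint64_t)p[3] << 24;
}
```

For the bytes 00 00 00 80, the widened decode gives 0x80000000 while the
signed model gives 0xffffffff80000000.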
Fixes: d499e9627d70 ("arm64/scs: Fix handling of advance_loc4")
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The renesas,rzv2h-usb2phy-reset binding schema defines #mux-state-cells
as a required property. Add it to the usb20phyrst node to fix the
following warnings:
arch/arm64/boot/dts/renesas/r9a09g056n48-rzv2n-evk.dtb: usb20phy-reset@15830000 (renesas,r9a09g056-usb2phy-reset): '#mux-state-cells' is a required property
arch/arm64/boot/dts/renesas/r9a09g056n48-rzv2n-evk-cn15-emmc.dtb: usb20phy-reset@15830000 (renesas,r9a09g056-usb2phy-reset): '#mux-state-cells' is a required property
arch/arm64/boot/dts/renesas/r9a09g056n48-rzv2n-evk-cn15-sd.dtb: usb20phy-reset@15830000 (renesas,r9a09g056-usb2phy-reset): '#mux-state-cells' is a required property
Fixes: 6a1b6f7e56dc ("dt-bindings: reset: renesas,rzv2h-usb2phy: Add '#mux-state-cells' property")
Signed-off-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/31210e05f7189b466b30eedbdda3d11726dac279.1775575276.git.tommaso.merciai.xr@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
The renesas,rzv2h-usb2phy-reset binding schema defines #mux-state-cells
as a required property. Add it to the usb20phyrst and usb21phyrst nodes
to fix the following warnings:
arch/arm64/boot/dts/renesas/r9a09g057h44-rzv2h-evk.dtb: usb20phy-reset@15830000 (renesas,r9a09g057-usb2phy-reset): '#mux-state-cells' is a required property
arch/arm64/boot/dts/renesas/r9a09g057h44-rzv2h-evk.dtb: usb21phy-reset@15840000 (renesas,r9a09g057-usb2phy-reset): '#mux-state-cells' is a required property
arch/arm64/boot/dts/renesas/r9a09g057h44-rzv2h-evk-cn15-emmc.dtb: usb20phy-reset@15830000 (renesas,r9a09g057-usb2phy-reset): '#mux-state-cells' is a required property
arch/arm64/boot/dts/renesas/r9a09g057h44-rzv2h-evk-cn15-emmc.dtb: usb21phy-reset@15840000 (renesas,r9a09g057-usb2phy-reset): '#mux-state-cells' is a required property
arch/arm64/boot/dts/renesas/r9a09g057h44-rzv2h-evk-cn15-sd.dtb: usb20phy-reset@15830000 (renesas,r9a09g057-usb2phy-reset): '#mux-state-cells' is a required property
arch/arm64/boot/dts/renesas/r9a09g057h44-rzv2h-evk-cn15-sd.dtb: usb21phy-reset@15840000 (renesas,r9a09g057-usb2phy-reset): '#mux-state-cells' is a required property
Fixes: 6a1b6f7e56dc ("dt-bindings: reset: renesas,rzv2h-usb2phy: Add '#mux-state-cells' property")
Signed-off-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/22fb9a500cdbc3272dc23cd5e36bca5fbbec75fc.1775575276.git.tommaso.merciai.xr@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
DU subnode
Add missing cells and reg DT property in the DU subnode to fix the
following DTC W=1 warning:
arch/arm64/boot/dts/renesas/rz-smarc-du-adv7513.dtsi:29.10-33.5: Warning (unit_address_vs_reg): /fragment@1/__overlay__/ports/port@0: node has a unit name, but no reg or ranges property
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260326042411.215241-5-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
CSI2 subnode
Add missing cells and reg DT property in the CSI2 subnode to fix the
following DTC W=1 warning:
arch/arm64/boot/dts/renesas/rz-smarc-cru-csi-ov5645.dtsi:49.10-55.5: Warning (unit_address_vs_reg): /fragment@2/__overlay__/ports/port@0: node has a unit name, but no reg or ranges property
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://patch.msgid.link/20260326042411.215241-4-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add missing cells and reg DT property in the Salvator-X panel DTO to fix
the following DTC W=1 warning:
arch/arm64/boot/dts/renesas/salvator-panel-aa104xd12.dtso:30.10-34.5: Warning (unit_address_vs_reg): /fragment@2/__overlay__/ports/port@1: node has a unit name, but no reg or ranges property
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260326042411.215241-3-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add missing cells and reg DT property in the Draak/Ebisu panel DTO to
fix the following DTC W=1 warning:
arch/arm64/boot/dts/renesas/draak-ebisu-panel-aa104xd12.dtso:30.10-34.5: Warning (unit_address_vs_reg): /fragment@2/__overlay__/ports/port@1: node has a unit name, but no reg or ranges property
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260326042411.215241-2-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
According to the documentation, the internal clock input for the BRG is
SGASYNCD4_PERW_BUSφ.
Fixes: c13a643e2c491f5b ("arm64: dts: renesas: Add R8A78000 SoC support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/459d360a8332f92b3766b30814e7e1c76169aaf7.1767719254.git.geert+renesas@glider.be
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 7.1, take #1
- Allow tracing for non-pKVM, which was accidentally disabled when
the series was merged
- Rationalise the way the pKVM hypercall ranges are defined by using
the same mechanism as already used for the vcpu_sysreg enum
- Enforce that SMCCC function numbers relayed by the pKVM proxy are
actually compliant with the specification
- Fix a couple of feature to idreg mappings which resulted in the
wrong sanitisation being applied
- Fix the GICD_IIDR revision number field that could never be
written correctly by userspace
- Make kvm_vcpu_initialized() correctly use its parameter instead
of relying on the surrounding context
- Enforce correct ordering in __pkvm_init_vcpu(), plugging a
potential pin leak at the same time
- Move __pkvm_init_finalise() to a less dangerous spot, avoiding
future problems
- Restore functional userspace irqchip support after a four year
breakage (last functional kernel was 5.18...). This is obviously
ripe for garbage collection.
- ... and the usual lot of spelling fixes
|
|
It appears that there is nothing in the wake-up path that
evaluates whether the in-kernel interrupts are pending unless
we have a vgic.
This means that the userspace irqchip support has been broken for
about four years, and nobody noticed. It was also broken before
as we wouldn't wake up on a PMU interrupt, but hey, who cares...
It is probably time to remove the feature altogether, because it
was a terrible idea 10 years ago, and it still is.
Fixes: b57de4ffd7c6d ("KVM: arm64: Simplify kvm_cpu_has_pending_timer()")
Link: https://patch.msgid.link/20260423163607.486345-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
|
|
fix_host_ownership() walks the hypervisor's stage-1 page-table to
adjust the host's stage-2 accordingly. Any such adjustment that
requires cache maintenance operations depends on the per-CPU hyp
fixmap being present. However, fix_host_ownership() is currently
called before fix_hyp_pgtable_refcnt() and hyp_create_fixmap(), so
the fixmap does not yet exist when it runs.
This is benign today because the host stage-2 starts empty and no
CMOs are needed, but it becomes a latent crash as soon as
fix_host_ownership() is extended to operate on a non-empty
page-table.
Reorder the calls so that fix_hyp_pgtable_refcnt() and
hyp_create_fixmap() complete before fix_host_ownership() is invoked.
Fixes: 0d16d12eb26e ("KVM: arm64: Fix-up hyp stage-1 refcounts for all pages mapped at EL2")
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260424084908.370776-7-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
|
|
Two bugs exist in the vCPU initialisation path:
1. If a check fails after hyp_pin_shared_mem() succeeds, the cleanup
path jumps to 'unlock' without calling unpin_host_vcpu() or
unpin_host_sve_state(), permanently leaking pin references on the
host vCPU and SVE state pages.
Extract a register_hyp_vcpu() helper that performs the checks and
the store. When register_hyp_vcpu() returns an error, call
unpin_host_vcpu() and unpin_host_sve_state() inline before falling
through to the existing 'unlock' label.
2. register_hyp_vcpu() publishes the new vCPU pointer into
'hyp_vm->vcpus[]' with a bare store, allowing a concurrent caller
of pkvm_load_hyp_vcpu() to observe a partially initialised vCPU
object.
Ensure the store uses smp_store_release() and the load uses
smp_load_acquire(). While 'vm_table_lock' currently serialises the
store and the load, these barriers ensure the reader sees the fully
initialised 'hyp_vcpu' object even if there were a lockless path or
if the lock's own ordering guarantees were insufficient for nested
object initialization.
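The publish/consume ordering can be sketched with C11 atomics, which
offer release/acquire semantics comparable to smp_store_release() and
smp_load_acquire(). The table and helper names below are hypothetical
and do not mirror the pKVM data structures.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_vcpu {
	int idx;
	int ready;
};

#define DEMO_MAX_VCPUS 4

/* Hypothetical analogue of the per-VM vCPU pointer table. */
static _Atomic(struct demo_vcpu *) demo_vcpus[DEMO_MAX_VCPUS];

/* Initialise the object fully, then publish it with release semantics. */
static void demo_register_vcpu(struct demo_vcpu *v, int idx)
{
	assert(v && idx >= 0 && idx < DEMO_MAX_VCPUS);
	v->idx = idx;
	v->ready = 1;
	atomic_store_explicit(&demo_vcpus[idx], v, memory_order_release);
}

/* Acquire load: a non-NULL result implies the initialisation is visible. */
static struct demo_vcpu *demo_load_vcpu(int idx)
{
	assert(idx >= 0 && idx < DEMO_MAX_VCPUS);
	return atomic_load_explicit(&demo_vcpus[idx], memory_order_acquire);
}
```

The key property is that every write to the object is ordered before
the pointer store, so a reader that sees the pointer also sees the
initialised fields.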
Fixes: 49af6ddb8e5c ("KVM: arm64: Add infrastructure to create and track pKVM instances at EL2")
Reported-by: Ben Simner <ben.simner@cl.cam.ac.uk>
Co-developed-by: Will Deacon <willdeacon@google.com>
Signed-off-by: Will Deacon <willdeacon@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260424084908.370776-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
|
|
The macro is defined with parameter 'v' but the body references the
literal token 'vcpu' instead, causing it to silently operate on whatever
'vcpu' resolves to in the caller's scope rather than the value passed by
the caller. All current call sites happen to use a variable named 'vcpu',
so the bug is latent.
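A reduced illustration of this class of bug, with hypothetical names
(the real macro and its body are not reproduced here):

```c
#include <assert.h>

struct demo_vcpu {
	unsigned long flags;
};

#define DEMO_INITIALIZED	1UL

/* Buggy: ignores its parameter and captures 'vcpu' from the caller's scope. */
#define demo_initialized_buggy(v)	(((vcpu)->flags & DEMO_INITIALIZED) != 0)

/* Fixed: uses the parameter it was given. */
#define demo_initialized(v)		(((v)->flags & DEMO_INITIALIZED) != 0)

/* Both checks are asked about 'other', with an unrelated 'vcpu' in scope. */
static int demo_check_buggy(struct demo_vcpu *vcpu, struct demo_vcpu *other)
{
	assert(vcpu && other);
	return demo_initialized_buggy(other);
}

static int demo_check_fixed(struct demo_vcpu *vcpu, struct demo_vcpu *other)
{
	assert(vcpu && other);
	return demo_initialized(other);
}
```

When 'vcpu' is initialised and 'other' is not, the buggy form reports
'other' as initialised, while the fixed form reports it correctly.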
Fixes: e016333745c7 ("KVM: arm64: Only reset vCPU-scoped feature ID regs once")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260424084908.370776-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
|
|
FEAT_SPE_FnE is architecturally detected via PMSIDR_EL1.FnE [6], not
ID_AA64DFR0_EL1.PMSVer. The FEAT_X macro form (register, field, value)
cannot encode a PMSIDR_EL1-based feature, so FEAT_SPE_FnE was defined
identically to FEAT_SPEv1p2 (ID_AA64DFR0_EL1, PMSVer, V1P2), producing
a duplicate that used PMSVer >= V1P2 as a proxy.
Replace the macro with feat_spe_fne(), following the same pattern as
the sibling feat_spe_fds(): guard on FEAT_SPEv1p2 and read
PMSIDR_EL1.FnE [6] directly. Wire the two NEEDS_FEAT consumers to use
the new function.
Remove the now-unused FEAT_SPE_FnE macro.
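A rough sketch of the resulting check, written as standalone C. The
helper name and its parameters are hypothetical, and the PMSVer encoding
for SPEv1p2 (0b0011) is an assumption on my part; only the FnE bit
position [6] comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PMSVER_V1P2	0x3u			/* assumed encoding */
#define DEMO_PMSIDR_FNE		(UINT64_C(1) << 6)	/* PMSIDR_EL1.FnE */

/* Hypothetical helper: FnE only counts when SPEv1p2 is implemented. */
static bool demo_feat_spe_fne(unsigned int pmsver, uint64_t pmsidr)
{
	assert(pmsver <= 0xf);	/* PMSVer is a 4-bit field */
	if (pmsver < DEMO_PMSVER_V1P2)
		return false;
	return (pmsidr & DEMO_PMSIDR_FNE) != 0;
}
```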
Fixes: 63d423a7635b ("KVM: arm64: Switch to table-driven FGU configuration")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260424084908.370776-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
|
|
Revists -> Revisit. The following patch will add another similar line.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260424084908.370776-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
FEAT_Debugv8p9 is incorrectly defined against ID_AA64DFR0_EL1.PMUVer
instead of ID_AA64DFR0_EL1.DebugVer. All three consumers of the macro
gate features that are architecturally tied to FEAT_Debugv8p9
(DebugVer = 0b1011, DDI0487 M.b A2.2.10):
- HDFGRTR2_EL2.nMDSELR_EL1, HDFGWTR2_EL2.nMDSELR_EL1: MDSELR_EL1
is present only when FEAT_Debugv8p9 is implemented (D24.3.21).
- MDCR_EL2.EBWE: the Extended Breakpoint and Watchpoint Enable bit
is RES0 unless FEAT_Debugv8p9 is implemented (D24.3.17).
Neither register has any dependency on PMUVer.
FEAT_Debugv8p9 and FEAT_PMUv3p9 are independent. Per DDI0487 M.b
A2.2.10, FEAT_Debugv8p9 is unconditionally mandatory from Armv8.9,
whereas FEAT_PMUv3p9 is mandatory only when FEAT_PMUv3 is implemented.
An Armv8.9 CPU without a PMU has DebugVer = 0b1011 but PMUVer = 0b0000,
so the wrong field check would cause KVM to incorrectly treat EBWE and
MDSELR_EL1 as RES0 on such hardware.
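To make the field mix-up concrete, here is a small sketch that extracts
both fields from an ID_AA64DFR0_EL1 value. It assumes DebugVer sits in
bits [3:0] and PMUVer in bits [11:8]; the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_DEBUGVER_SHIFT	0	/* assumed: DebugVer, bits [3:0] */
#define DEMO_PMUVER_SHIFT	8	/* assumed: PMUVer, bits [11:8] */
#define DEMO_DEBUGVER_V8P9	0xbu	/* 0b1011, per the text above */

/* Extract a 4-bit ID register field starting at 'shift'. */
static unsigned int demo_field4(uint64_t reg, unsigned int shift)
{
	assert(shift <= 60);
	return (unsigned int)((reg >> shift) & 0xf);
}

/* FEAT_Debugv8p9 is decided by DebugVer alone, independent of PMUVer. */
static bool demo_has_debugv8p9(uint64_t dfr0)
{
	return demo_field4(dfr0, DEMO_DEBUGVER_SHIFT) >= DEMO_DEBUGVER_V8P9;
}
```

A value with DebugVer = 0b1011 and PMUVer = 0b0000 models the PMU-less
Armv8.9 case: the DebugVer check succeeds even though PMUVer is zero.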
Fixes: 4bc0fe089840 ("KVM: arm64: Add sanitisation for FEAT_FGT2 registers")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260424084908.370776-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
|
|
Prevent the propagation of a function-id that has the top bits set since
this is not compliant with the SMCCC spec and can overlap with the
already known function-id decoders. (eg. if we invoke an smc with
0xffffffffc4000012 it will be decoded as a PSCI reset call). Instead,
make it clear that we don't support it and return an error.
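The check amounts to rejecting any value with bits [63:32] set before
it is narrowed to a 32-bit function-id. A hedged sketch follows; the
helper name is hypothetical, and returning -1 (the SMCCC NOT_SUPPORTED
value) is an assumption, since the text above only says an error is
returned.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SMCCC_NOT_SUPPORTED	(-1L)	/* assumed error value */

/*
 * Hypothetical proxy step: refuse function-ids with upper bits set
 * instead of silently truncating them to 32 bits.
 */
static long demo_forward_smc(uint64_t x0, uint32_t *func_id)
{
	assert(func_id);
	if (x0 >> 32)
		return DEMO_SMCCC_NOT_SUPPORTED;
	*func_id = (uint32_t)x0;
	return 0;
}
```

Without the check, 0xffffffffc4000012 would be narrowed to 0xc4000012
and handled as if that had been the requested function-id.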
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Link: https://patch.msgid.link/20260408114118.422604-1-sebastianene@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The uaccess write handlers for GICD_IIDR in both GICv2 and GICv3
extract the revision field from 'reg' (the current IIDR value read back
from the emulated distributor) instead of 'val' (the value userspace is
trying to write). This means userspace can never actually change the
implementation revision — the extracted value is always the current one.
Fix the FIELD_GET to use 'val' so that userspace can select a different
revision for migration compatibility.
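The mistake reduces to extracting a field from the wrong variable. A
minimal sketch, assuming the revision occupies GICD_IIDR bits [15:12];
the handler and macro names are hypothetical and the real handlers do
more validation than shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed GICD_IIDR layout: Revision in bits [15:12]. */
#define DEMO_IIDR_REV_SHIFT	12
#define DEMO_IIDR_REV_MASK	(0xfu << DEMO_IIDR_REV_SHIFT)

static unsigned int demo_iidr_rev(uint32_t iidr)
{
	return (iidr & DEMO_IIDR_REV_MASK) >> DEMO_IIDR_REV_SHIFT;
}

/*
 * Hypothetical write handler: 'reg' is the current emulated value,
 * 'val' is what userspace wrote. The revision must come from 'val'.
 */
static void demo_write_iidr(unsigned int *impl_rev, uint32_t reg, uint32_t val)
{
	assert(impl_rev);
	(void)reg;	/* the buggy form extracted the revision from 'reg' */
	*impl_rev = demo_iidr_rev(val);
}
```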
Fixes: 49a1a2c70a7f ("KVM: arm64: vgic-v3: Advertise GICR_CTLR.{IR, CES} as a new GICD_IIDR revision")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://patch.msgid.link/20260407210949.2076251-2-dwmw2@infradead.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull more SoC updates from Arnd Bergmann:
"These are the contents that arrived during the Easter vacation and
didn't make it into the last 7.0 bugfixes or the first set of branches
for the merge window. Aside from a reset controller bugfix and an
update to the MAINTAINERS entry, this is all devicetree changes.
The Marvell devicetree updates contain the usual minor updates and
bugfixes, along with two larger but trivial patches to drop unused
dtsi files. The single Broadcom fix addresses a build-time warning
introduced during the merge window.
The freescale, amlogic, and apple changes missed the last fixes branch
for 7.0"
* tag 'soc-late-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (38 commits)
arm64: dts: meson-gxl-p230: fix ethernet PHY interrupt number
arm64: dts: amlogic: meson-axg: Add missing cache information to cpu0
arm64: dts: amlogic: t7: khadas-vim4: fix board model name
arm64: dts: amlogic: Fix GIC register ranges for Amlogic T7
arm64: dts: amlogic: t7: khadas-vim4: fix memory layout for 8GB RAM
arm64: dts: amlogic: s6: Drop CPU masks from GICv3 PPI interrupts
Documentation/process: maintainer-soc: Document purpose of defconfigs
Documentation/process: maintainer-soc: Trim from trivial ask-DT
ARM: dts: bcm4709: fix bus range assignment
arm64: dts: apple: Fix spelling error
dt-bindings: Update Sasha Finkelstein's email address
mailmap: Update Sasha Finkelstein's email address
arm64: dts: marvell: armada-37xx: swap PHYs' order in USB3 controller node
arm64: dts: marvell: armada-37xx: use 'usb2-phy' in USB3 controller's phy-names
arm64: dts: imx8mm-tqma8mqml: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mn-tqma8mqnl: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mm-emtop-som: Correct PAD settings for PMIC_nINT
reset: amlogic: t7: Fix null reset ops
arm64: dts: imx8mp-data-modul-edm-sbc: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-dhcom-som: Correct PAD settings for PMIC_nINT
...
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux into soc/late2
Amlogic DT Fixes for v7.1:
- Fix ethernet PHY interrupt number for P230 reference board
- Add missing cache information to cpu0 for Amlogic AXG
- Fix Khadas VIM4 board model name
- Fix GIC register ranges for Amlogic T7
- Fix Khadas VIM4 memory layout for 8GB RAM
- Drop CPU masks from GICv3 PPI interrupts for Amlogic S6
* tag 'amlogic-fixes-v7.1-rc' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux:
arm64: dts: meson-gxl-p230: fix ethernet PHY interrupt number
arm64: dts: amlogic: meson-axg: Add missing cache information to cpu0
arm64: dts: amlogic: t7: khadas-vim4: fix board model name
arm64: dts: amlogic: Fix GIC register ranges for Amlogic T7
arm64: dts: amlogic: t7: khadas-vim4: fix memory layout for 8GB RAM
arm64: dts: amlogic: s6: Drop CPU masks from GICv3 PPI interrupts
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Add support for CONFIG_PAGE_TABLE_CHECK and enable it in
debug_defconfig. s390 can only tell user from kernel PTEs via the mm,
so mm_struct is now passed into pxx_user_accessible_page() callbacks
- Expose the PCI function UID as an arch-specific slot attribute in
sysfs so a function can be identified by its user-defined id while
still in standby. Introduces a generic ARCH_PCI_SLOT_GROUPS hook in
drivers/pci/slot.c
- Refresh s390 PCI documentation to reflect current behavior and cover
previously undocumented sysfs attributes
- zcrypt device driver cleanup series: consistent field types, clearer
variable naming, a kernel-doc warning fix, and a comment explaining
the intentional synchronize_rcu() in pkey_handler_register()
- Provide an s390 arch_raw_cpu_ptr() that avoids the detour via
get_lowcore() using alternatives, shrinking defconfig by ~27 kB
- Guard identity-base randomization with kaslr_enabled() so nokaslr
keeps the identity mapping at 0 even with RANDOMIZE_IDENTITY_BASE=y
- Build S390_MODULES_SANITY_TEST as a module only by requiring KUNIT &&
m, since built-in would not exercise module loading
- Remove the permanently commented-out HMCDRV_DEV_CLASS create_class()
code in the hmcdrv driver
- Drop stale ident_map_size extern conflicting with asm/page.h
* tag 's390-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: Fix warning about wrong kernel doc comment
PCI: s390: Expose the UID as an arch specific PCI slot attribute
docs: s390/pci: Improve and update PCI documentation
s390/pkey: Add comment about synchronize_rcu() to pkey base
s390/hmcdrv: Remove commented out code
s390/zcrypt: Slight rework on the agent_id field
s390/zcrypt: Explicitly use a card variable in _zcrypt_send_cprb
s390/zcrypt: Rework MKVP fields and handling
s390/zcrypt: Make apfs a real unsigned int field
s390/zcrypt: Rework domain processing within zcrypt device driver
s390/zcrypt: Move inline function rng_type6cprb_msgx from header to code
s390/percpu: Provide arch_raw_cpu_ptr()
s390: Enable page table check for debug_defconfig
s390/pgtable: Add s390 support for page table check
s390/pgtable: Use set_pmd_bit() to invalidate PMD entry
mm/page_table_check: Pass mm_struct to pxx_user_accessible_page()
s390/boot: Respect kaslr_enabled() for identity randomization
s390/Kconfig: Make modules sanity test a module-only option
s390/setup: Drop stale ident_map_size declaration
|
|
Correct the interrupt number assigned to the Realtek PHY in the p230,
following the same logic as commit 3106507e1004 ("ARM64: dts: meson-gxm:
fix q200 interrupt number"), as reported in [PATCH 0/2] Ethernet PHY
interrupt improvements [1].
[1] https://lore.kernel.org/all/20171202214037.17017-1-martin.blumenstingl@googlemail.com/
Fixes: b94d22d94ad2 ("ARM64: dts: meson-gx: add external PHY interrupt on some platforms")
Signed-off-by: Jun Yan <jerrysteve1101@gmail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://patch.msgid.link/20260330145111.115318-1-jerrysteve1101@gmail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
|