linux.git/arch/arm64/kernel/pi, branch v6.15

Merge tag 'kvmarm-fixes-6.15-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

2025-04-24T17:28:53+00:00

KVM/arm64 fixes for 6.15, round #2

 - Single fix for broken usage of 'multi-MIDR' infrastructure in PI
   code, adding an open-coded erratum check for everyone's favorite pile
   of sand: Cavium ThunderX

arm64: Rework checks for broken Cavium HW in the PI code

2025-04-18T20:51:07+00:00

Calling into the MIDR checking framework from the PI code has recently
become much harder, due to the new fancy "multi-MIDR" support that
relies on tables being populated at boot time, but not that early that
they are available to the PI code. There are additional issues with
this framework, as the code really isn't position independend *at all*.

This leads to some ugly breakages, as reported by Ada.

It so appears that the only reason for the PI code to call into the
MIDR checking code is to cope with The Most Broken ARM64 System Ever,
aka Cavium ThunderX, which cannot deal with nG attributes that result
of the combination of KASLR and KPTI as a consequence of Erratum 27456.

Duplicate the check for the erratum in the PI code, removing the
dependency on the bulk of the MIDR checking framework. This allows
dropping that same check from kaslr_requires_kpti(), as the KPTI code
already relies on the ARM64_WORKAROUND_CAVIUM_27456 cap.

Fixes: c8c2647e69bed ("arm64: Make  _midr_in_range_list() an exported function")
Reported-by: Ada Couprie Diaz 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/3d97e45a-23cf-419b-9b6f-140b4d88de7b@arm.com
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Shameer Kolothum 
Cc: Oliver Upton 
Reviewed-by: Catalin Marinas 
Link: https://lore.kernel.org/r/20250418093129.1755739-1-maz@kernel.org
Signed-off-by: Oliver Upton

arm64/mm: Define PTDESC_ORDER

2025-03-14T17:01:23+00:00

Address bytes shifted with a single 64 bit page table entry (any page table
level) has been always hard coded as 3 (aka 2^3 = 8). Although intuitive it
is not very readable or easy to reason about. Besides it is going to change
with D128, where each 128 bit page table entry will shift address bytes by
4 (aka 2^4 = 16) instead.

Let's just formalise this address bytes shift value into a new macro called
PTDESC_ORDER establishing a logical abstraction, thus improving readability
as well. While here re-organize EARLY_LEVEL macro along with its dependents
for better clarity. This does not cause any functional change. Also replace
all (PAGE_SHIFT - PTDESC_ORDER) instances with PTDESC_TABLE_SHIFT.

Cc: Will Deacon 
Cc: Mark Rutland 
Cc: Andrey Ryabinin 
Cc: Alexander Potapenko 
Cc: Andrey Konovalov 
Cc: Dmitry Vyukov 
Cc: Ard Biesheuvel 
Cc: Ryan Roberts 
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: kasan-dev@googlegroups.com
Acked-by: Ard Biesheuvel 
Reviewed-by: Ryan Roberts 
Signed-off-by: Anshuman Khandual 
Link: https://lore.kernel.org/r/20250311045710.550625-1-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas

arm64/kernel: Always use level 2 or higher for early mappings

2025-03-13T18:05:55+00:00

The page table population code in map_range() uses a recursive algorithm
to create the early mappings of the kernel, the DTB and the ID mapped
text and data pages, and this fails to take into account that the way
these page tables may be constructed is not precisely the same at each
level. In particular, block mappings are not permitted at each level,
and the code as it exists today might inadvertently create such a
forbidden block mapping if it were used to map a region of the
appropriate size and alignment.

This never happens in practice, given the limited size of the assets
being mapped by the early boot code. Nonetheless, it would be better if
this code would behave correctly in all circumstances.

So only permit block mappings at level 2, and page mappings at level 3,
for any page size, and use table mappings exclusively at all other
levels. This change should have no impact in practice, but it makes the
code more robust.

Cc: Anshuman Khandual 
Reported-by: Ryan Roberts 
Signed-off-by: Ard Biesheuvel 
Reviewed-by: Anshuman Khandual 
Reviewed-by: Ryan Roberts 
Link: https://lore.kernel.org/r/20250311073043.96795-2-ardb+git@google.com
Signed-off-by: Catalin Marinas

arm64/mm: Override PARange for !LPA2 and use it consistently

2024-12-19T17:23:52+00:00

When FEAT_LPA{,2} are not implemented, the ID_AA64MMFR0_EL1.PARange and
TCR.IPS values corresponding with 52-bit physical addressing are
reserved.

Setting the TCR.IPS field to 0b110 (52-bit physical addressing) has side
effects, such as how the TTBRn_ELx.BADDR fields are interpreted, and so
it is important that disabling FEAT_LPA2 (by overriding the
ID_AA64MMFR0.TGran fields) also presents a PARange field consistent with
that.

So limit the field to 48 bits unless LPA2 is enabled, and update
existing references to use the override consistently.

Fixes: 352b0395b505 ("arm64: Enable 52-bit virtual addressing for 4k and 16k granule configs")
Cc: stable@vger.kernel.org
Signed-off-by: Ard Biesheuvel 
Acked-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20241212081841.2168124-10-ardb+git@google.com
Signed-off-by: Will Deacon

Merge branches 'for-next/gcs', 'for-next/probes', 'for-next/asm-offsets', 'for-next/tlb', 'for-next/misc', 'for-next/mte', 'for-next/sysreg', 'for-next/stacktrace', 'for-next/hwcap3', 'for-next/kselftest', 'for-next/crc32', 'for-next/guest-cca', 'for-next/haft' and 'for-next/scs', remote-tracking branch 'arm64/for-next/perf' into for-next/core

2024-11-14T12:07:16+00:00

* arm64/for-next/perf:
  perf: Switch back to struct platform_driver::remove()
  perf: arm_pmuv3: Add support for Samsung Mongoose PMU
  dt-bindings: arm: pmu: Add Samsung Mongoose core compatible
  perf/dwc_pcie: Fix typos in event names
  perf/dwc_pcie: Add support for Ampere SoCs
  ARM: pmuv3: Add missing write_pmuacr()
  perf/marvell: Marvell PEM performance monitor support
  perf/arm_pmuv3: Add PMUv3.9 per counter EL0 access control
  perf/dwc_pcie: Convert the events with mixed case to lowercase
  perf/cxlpmu: Support missing events in 3.1 spec
  perf: imx_perf: add support for i.MX91 platform
  dt-bindings: perf: fsl-imx-ddr: Add i.MX91 compatible
  drivers perf: remove unused field pmu_node

* for-next/gcs: (42 commits)
  : arm64 Guarded Control Stack user-space support
  kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c
  arm64/gcs: Fix outdated ptrace documentation
  kselftest/arm64: Ensure stable names for GCS stress test results
  kselftest/arm64: Validate that GCS push and write permissions work
  kselftest/arm64: Enable GCS for the FP stress tests
  kselftest/arm64: Add a GCS stress test
  kselftest/arm64: Add GCS signal tests
  kselftest/arm64: Add test coverage for GCS mode locking
  kselftest/arm64: Add a GCS test program built with the system libc
  kselftest/arm64: Add very basic GCS test program
  kselftest/arm64: Always run signals tests with GCS enabled
  kselftest/arm64: Allow signals tests to specify an expected si_code
  kselftest/arm64: Add framework support for GCS to signal handling tests
  kselftest/arm64: Add GCS as a detected feature in the signal tests
  kselftest/arm64: Verify the GCS hwcap
  arm64: Add Kconfig for Guarded Control Stack (GCS)
  arm64/ptrace: Expose GCS via ptrace and core files
  arm64/signal: Expose GCS state in signal frames
  arm64/signal: Set up and restore the GCS context for signal handlers
  arm64/mm: Implement map_shadow_stack()
  ...

* for-next/probes:
  : Various arm64 uprobes/kprobes cleanups
  arm64: insn: Simulate nop instruction for better uprobe performance
  arm64: probes: Remove probe_opcode_t
  arm64: probes: Cleanup kprobes endianness conversions
  arm64: probes: Move kprobes-specific fields
  arm64: probes: Fix uprobes for big-endian kernels
  arm64: probes: Fix simulate_ldr*_literal()
  arm64: probes: Remove broken LDR (literal) uprobe support

* for-next/asm-offsets:
  : arm64 asm-offsets.c cleanup (remove unused offsets)
  arm64: asm-offsets: remove PREEMPT_DISABLE_OFFSET
  arm64: asm-offsets: remove DMA_{TO,FROM}_DEVICE
  arm64: asm-offsets: remove VM_EXEC and PAGE_SZ
  arm64: asm-offsets: remove MM_CONTEXT_ID
  arm64: asm-offsets: remove COMPAT_{RT_,SIGFRAME_REGS_OFFSET
  arm64: asm-offsets: remove VMA_VM_*
  arm64: asm-offsets: remove TSK_ACTIVE_MM

* for-next/tlb:
  : TLB flushing optimisations
  arm64: optimize flush tlb kernel range
  arm64: tlbflush: add __flush_tlb_range_limit_excess()

* for-next/misc:
  : Miscellaneous patches
  arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled
  arm64/ptrace: Clarify documentation of VL configuration via ptrace
  acpi/arm64: remove unnecessary cast
  arm64/mm: Change protval as 'pteval_t' in map_range()
  arm64: uprobes: Optimize cache flushes for xol slot
  acpi/arm64: Adjust error handling procedure in gtdt_parse_timer_block()
  arm64: fix .data.rel.ro size assertion when CONFIG_LTO_CLANG
  arm64/ptdump: Test both PTE_TABLE_BIT and PTE_VALID for block mappings
  arm64/mm: Sanity check PTE address before runtime P4D/PUD folding
  arm64/mm: Drop setting PTE_TYPE_PAGE in pte_mkcont()
  ACPI: GTDT: Tighten the check for the array of platform timer structures
  arm64/fpsimd: Fix a typo
  arm64: Expose ID_AA64ISAR1_EL1.XS to sanitised feature consumers
  arm64: Return early when break handler is found on linked-list
  arm64/mm: Re-organize arch_make_huge_pte()
  arm64/mm: Drop _PROT_SECT_DEFAULT
  arm64: Add command-line override for ID_AA64MMFR0_EL1.ECV
  arm64: head: Drop SWAPPER_TABLE_SHIFT
  arm64: cpufeature: add POE to cpucap_is_possible()
  arm64/mm: Change pgattr_change_is_safe() arguments as pteval_t

* for-next/mte:
  : Various MTE improvements
  selftests: arm64: add hugetlb mte tests
  hugetlb: arm64: add mte support

* for-next/sysreg:
  : arm64 sysreg updates
  arm64/sysreg: Update ID_AA64MMFR1_EL1 to DDI0601 2024-09

* for-next/stacktrace:
  : arm64 stacktrace improvements
  arm64: preserve pt_regs::stackframe during exec*()
  arm64: stacktrace: unwind exception boundaries
  arm64: stacktrace: split unwind_consume_stack()
  arm64: stacktrace: report recovered PCs
  arm64: stacktrace: report source of unwind data
  arm64: stacktrace: move dump_backtrace() to kunwind_stack_walk()
  arm64: use a common struct frame_record
  arm64: pt_regs: swap 'unused' and 'pmr' fields
  arm64: pt_regs: rename "pmr_save" -> "pmr"
  arm64: pt_regs: remove stale big-endian layout
  arm64: pt_regs: assert pt_regs is a multiple of 16 bytes

* for-next/hwcap3:
  : Add AT_HWCAP3 support for arm64 (also wire up AT_HWCAP4)
  arm64: Support AT_HWCAP3
  binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4

* for-next/kselftest: (30 commits)
  : arm64 kselftest fixes/cleanups
  kselftest/arm64: Try harder to generate different keys during PAC tests
  kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all()
  kselftest/arm64: Corrupt P0 in the irritator when testing SSVE
  kselftest/arm64: Add FPMR coverage to fp-ptrace
  kselftest/arm64: Expand the set of ZA writes fp-ptrace does
  kselftets/arm64: Use flag bits for features in fp-ptrace assembler code
  kselftest/arm64: Enable build of PAC tests with LLVM=1
  kselftest/arm64: Check that SVCR is 0 in signal handlers
  kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests
  kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test
  kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests
  kselftest/arm64: Fix build with stricter assemblers
  kselftest/arm64: Test signal handler state modification in fp-stress
  kselftest/arm64: Provide a SIGUSR1 handler in the kernel mode FP stress test
  kselftest/arm64: Implement irritators for ZA and ZT
  kselftest/arm64: Remove unused ADRs from irritator handlers
  kselftest/arm64: Correct misleading comments on fp-stress irritators
  kselftest/arm64: Poll less often while waiting for fp-stress children
  kselftest/arm64: Increase frequency of signal delivery in fp-stress
  kselftest/arm64: Fix encoding for SVE B16B16 test
  ...

* for-next/crc32:
  : Optimise CRC32 using PMULL instructions
  arm64/crc32: Implement 4-way interleave using PMULL
  arm64/crc32: Reorganize bit/byte ordering macros
  arm64/lib: Handle CRC-32 alternative in C code

* for-next/guest-cca:
  : Support for running Linux as a guest in Arm CCA
  arm64: Document Arm Confidential Compute
  virt: arm-cca-guest: TSM_REPORT support for realms
  arm64: Enable memory encrypt for Realms
  arm64: mm: Avoid TLBI when marking pages as valid
  arm64: Enforce bounce buffers for realm DMA
  efi: arm64: Map Device with Prot Shared
  arm64: rsi: Map unprotected MMIO as decrypted
  arm64: rsi: Add support for checking whether an MMIO is protected
  arm64: realm: Query IPA size from the RMM
  arm64: Detect if in a realm and set RIPAS RAM
  arm64: rsi: Add RSI definitions

* for-next/haft:
  : Support for arm64 FEAT_HAFT
  arm64: pgtable: Warn unexpected pmdp_test_and_clear_young()
  arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG
  arm64: Add support for FEAT_HAFT
  arm64: setup: name 'tcr2' register
  arm64/sysreg: Update ID_AA64MMFR1_EL1 register

* for-next/scs:
  : Dynamic shadow call stack fixes
  arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux()
  arm64/scs: Deal with 64-bit relative offsets in FDE frames
  arm64/scs: Fix handling of DWARF augmentation data in CIE/FDE frames

arm64/mm: Change protval as 'pteval_t' in map_range()

2024-11-12T14:46:17+00:00

pgprot_t has been defined as an encapsulated structure with pteval_t as its
element. Hence it is prudent to use pteval_t as the type instead of via the
size based u64. Besides pteval_t type might be different size later on with
FEAT_D128.

Cc: Will Deacon 
Cc: Ard Biesheuvel 
Cc: Ryan Roberts 
Cc: Mark Rutland 
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual 
Reviewed-by: Gavin Shan 
Link: https://lore.kernel.org/r/20241111075249.609493-1-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas

arm64/scs: Deal with 64-bit relative offsets in FDE frames

2024-11-08T16:37:55+00:00

In some cases, the compiler may decide to emit DWARF FDE frames with
64-bit signed fields for the code offset and range fields. This may
happen when using the large code model, for instance, which permits
an executable to be spread out over more than 4 GiB of address space.

Whether this is the case can be inferred from the augmentation data in
the CIE frame, so decode this data before processing the FDE frames.

Signed-off-by: Ard Biesheuvel 
Reviewed-by: Sami Tolvanen 
Tested-by: Sami Tolvanen 
Link: https://lore.kernel.org/r/20241106185513.3096442-7-ardb+git@google.com
Signed-off-by: Catalin Marinas

arm64/scs: Fix handling of DWARF augmentation data in CIE/FDE frames

2024-11-08T16:37:55+00:00

The dynamic SCS patching code pretends to parse the DWARF augmentation
data in the CIE (header) frame, and handle accordingly when processing
the individual FDE frames based on this CIE frame. However, the boolean
variable is defined inside the loop, and so the parsed value is ignored.

The same applies to the code alignment field, which is also read from
the header but then discarded.

This was never spotted before because Clang is the only compiler that
supports dynamic SCS patching (which is essentially an Android feature),
and the unwind tables it produces are highly uniform, and match the
de facto defaults.

So instead of testing for the 'z' flag in the augmentation data field,
require a fixed augmentation data string of 'zR', and simplify the rest
of the code accordingly.

Also introduce some error codes to specify why the patching failed, and
log it to the kernel console on failure when this happens when loading a
module. (Doing so for vmlinux is infeasible, as the patching is done
extremely early in the boot.)

Signed-off-by: Ard Biesheuvel 
Reviewed-by: Sami Tolvanen 
Tested-by: Sami Tolvanen 
Link: https://lore.kernel.org/r/20241106185513.3096442-6-ardb+git@google.com
Signed-off-by: Catalin Marinas

arm64: Add command-line override for ID_AA64MMFR0_EL1.ECV

2024-10-23T10:57:46+00:00

It appears that relatively popular hardware out there implements
the CNTPOFF_EL2 variant of FEAT_ECV, advertises it via ID_AA64MMFR0_EL1,
but cannot be bothered to set SCR_EL3.ECVEn to 1.

You would probably think that "this is fine, EL3 will take the
trap on access to CNTPOFF_EL2 and flip the ECVEn bit", as that's
what a semi-decent firmware implementation would do.

But no. None of that. This particular implementation takes the trap,
considers its purpose in life, decides that it has none, and *RESETS*
the system.

Yes, x1e001de, I'm talking about you.

In order to allow this machine to be promoted slightly above the
level of a glorified door-stop, add a new "id_aa64mmfr0.ecv" override.
allowing the kernel to pretend this option was never there.

Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20241021181434.1052974-1-maz@kernel.org
Signed-off-by: Catalin Marinas