linux-stable.git/arch/arm64/include, branch linux-5.17.y

irqchip/gic-v3: Refactor ISB + EOIR at ack time

2022-06-09T08:25:53+00:00

[ Upstream commit 6efb50923771f392122f5ce69dfc43b08f16e449 ]

There are cases where a context synchronization event is necessary
between an IRQ being raised and being handled, and there are races such
that we cannot rely upon the exception entry being subsequent to the
interrupt being raised. To fix this, we place an ISB between a read of
IAR and the subsequent invocation of an IRQ handler.

When EOI mode 1 is in use, we need to EOI an interrupt prior to invoking
its handler, and we have a write to EOIR for this. As this write to EOIR
requires an ISB, and this is provided by the gic_write_eoir() helper, we
omit the usual ISB in this case, with the logic being:

|	if (static_branch_likely(&supports_deactivate_key))
|		gic_write_eoir(irqnr);
|	else
|		isb();

This is somewhat opaque, and it would be a little clearer if there were
an unconditional ISB, with only the write to EOIR being conditional,
e.g.

|	if (static_branch_likely(&supports_deactivate_key))
|		write_gicreg(irqnr, ICC_EOIR1_EL1);
|
|	isb();

This patch rewrites the code that way, with this logic factored into a
new helper function with comments explaining what the ISB is for, as
were originally laid out in commit:

  39a06b67c2c1256b ("irqchip/gic: Ensure we have an ISB between ack and ->handle_irq")

Note that since then, we removed the IAR polling in commit:

  342677d70ab92142 ("irqchip/gic-v3: Remove acknowledge loop")

... which removed one of the two race conditions.

For consistency, other portions of the driver are made to manipulate
EOIR using write_gicreg() and explcit ISBs, and the gic_write_eoir()
helper function is removed.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland 
Cc: Marc Zyngier 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20220513133038.226182-3-mark.rutland@arm.com
Signed-off-by: Sasha Levin

arm64: stackleak: fix current_top_of_stack()

2022-06-09T08:25:48+00:00

[ Upstream commit e85094c31ddb794ac41c299a5a7a68243148f829 ]

Due to some historical confusion, arm64's current_top_of_stack() isn't
what the stackleak code expects. This could in theory result in a number
of problems, and practically results in an unnecessary performance hit.
We can avoid this by aligning the arm64 implementation with the x86
implementation.

The arm64 implementation of current_top_of_stack() was added
specifically for stackleak in commit:

  0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")

This was intended to be equivalent to the x86 implementation, but the
implementation, semantics, and performance characteristics differ
wildly:

* On x86, current_top_of_stack() returns the top of the current task's
  task stack, regardless of which stack is in active use.

  The implementation accesses a percpu variable which the x86 entry code
  maintains, and returns the location immediately above the pt_regs on
  the task stack (above which x86 has some padding).

* On arm64 current_top_of_stack() returns the top of the stack in active
  use (i.e. the one which is currently being used).

  The implementation checks the SP against a number of
  potentially-accessible stacks, and will BUG() if no stack is found.

The core stackleak_erase() code determines the upper bound of stack to
erase with:

| if (on_thread_stack())
|         boundary = current_stack_pointer;
| else
|         boundary = current_top_of_stack();

On arm64 stackleak_erase() is always called on a task stack, and
on_thread_stack() should always be true. On x86, stackleak_erase() is
mostly called on a trampoline stack, and is sometimes called on a task
stack.

Currently, this results in a lot of unnecessary code being generated for
arm64 for the impossible !on_thread_stack() case. Some of this is
inlined, bloating stackleak_erase(), while portions of this are left
out-of-line and permitted to be instrumented (which would be a
functional problem if that code were reachable).

As a first step towards improving this, this patch aligns arm64's
implementation of current_top_of_stack() with x86's, always returning
the top of the current task's stack. With GCC 11.1.0 this results in the
bulk of the unnecessary code being removed, including all of the
out-of-line instrumentable code.

While I don't believe there's a functional problem in practice I've
marked this as a fix since the semantic was clearly wrong, the fix
itself is simple, and other code might rely upon this in future.

Fixes: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
Signed-off-by: Mark Rutland 
Cc: Alexander Popov 
Cc: Andrew Morton 
Cc: Andy Lutomirski 
Cc: Catalin Marinas 
Cc: Kees Cook 
Cc: Will Deacon 
Acked-by: Catalin Marinas 
Signed-off-by: Kees Cook 
Link: https://lore.kernel.org/r/20220427173128.2603085-2-mark.rutland@arm.com
Signed-off-by: Sasha Levin

arm[64]/memremap: don't abuse pfn_valid() to ensure presence of linear map

2022-05-18T08:28:22+00:00

commit 260364d112bc822005224667c0c9b1b17a53eafd upstream.

The semantics of pfn_valid() is to check presence of the memory map for a
PFN and not whether a PFN is covered by the linear map.  The memory map
may be present for NOMAP memory regions, but they won't be mapped in the
linear mapping.  Accessing such regions via __va() when they are
memremap()'ed will cause a crash.

On v5.4.y the crash happens on qemu-arm with UEFI [1]:

<1>[    0.084476] 8<--- cut here ---
<1>[    0.084595] Unable to handle kernel paging request at virtual address dfb76000
<1>[    0.084938] pgd = (ptrval)
<1>[    0.085038] [dfb76000] *pgd=5f7fe801, *pte=00000000, *ppte=00000000

...

<4>[    0.093923] [] (memcpy) from [] (dmi_setup+0x60/0x418)
<4>[    0.094204] [] (dmi_setup) from [] (arm_dmi_init+0x8/0x10)
<4>[    0.094408] [] (arm_dmi_init) from [] (do_one_initcall+0x50/0x228)
<4>[    0.094619] [] (do_one_initcall) from [] (kernel_init_freeable+0x15c/0x1f8)
<4>[    0.094841] [] (kernel_init_freeable) from [] (kernel_init+0x8/0x10c)
<4>[    0.095057] [] (kernel_init) from [] (ret_from_fork+0x14/0x2c)

On kernels v5.10.y and newer the same crash won't reproduce on ARM because
commit b10d6bca8720 ("arch, drivers: replace for_each_membock() with
for_each_mem_range()") changed the way memory regions are registered in
the resource tree, but that merely covers up the problem.

On ARM64 memory resources registered in yet another way and there the
issue of wrong usage of pfn_valid() to ensure availability of the linear
map is also covered.

Implement arch_memremap_can_ram_remap() on ARM and ARM64 to prevent access
to NOMAP regions via the linear mapping in memremap().

Link: https://lore.kernel.org/all/Yl65zxGgFzF1Okac@sirena.org.uk
Link: https://lkml.kernel.org/r/20220426060107.7618-1-rppt@kernel.org
Signed-off-by: Mike Rapoport 
Reported-by: "kernelci.org bot" 
Tested-by: Mark Brown 
Reviewed-by: Ard Biesheuvel 
Acked-by: Catalin Marinas 
Cc: Greg Kroah-Hartman 
Cc: Mark Brown 
Cc: Mark-PK Tsai 
Cc: Russell King 
Cc: Tony Lindgren 
Cc: Will Deacon 
Cc: 	[5.4+]
Signed-off-by: Andrew Morton 
Signed-off-by: Greg Kroah-Hartman

arm64: mm: fix p?d_leaf()

2022-04-27T12:41:05+00:00

[ Upstream commit 23bc8f69f0eceecbb87c3801d2e48827d2dca92b ]

The pmd_leaf() is used to test a leaf mapped PMD, however, it misses
the PROT_NONE mapped PMD on arm64.  Fix it.  A real world issue [1]
caused by this was reported by Qian Cai. Also fix pud_leaf().

Link: https://patchwork.kernel.org/comment/24798260/ [1]
Fixes: 8aa82df3c123 ("arm64: mm: add p?d_leaf() definitions")
Reported-by: Qian Cai 
Signed-off-by: Muchun Song 
Link: https://lore.kernel.org/r/20220422060033.48711-1-songmuchun@bytedance.com
Signed-off-by: Will Deacon 
Signed-off-by: Sasha Levin

KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs

2022-04-20T07:36:12+00:00

[ Upstream commit 26bf74bd9f6ff0f1545b4f0c92a37c232d076014 ]

KVM allows userspace to configure either all EL1 32bit or 64bit vCPUs
for a guest.  At vCPU reset, vcpu_allowed_register_width() checks
if the vcpu's register width is consistent with all other vCPUs'.
Since the checking is done even against vCPUs that are not initialized
(KVM_ARM_VCPU_INIT has not been done) yet, the uninitialized vCPUs
are erroneously treated as 64bit vCPU, which causes the function to
incorrectly detect a mixed-width VM.

Introduce KVM_ARCH_FLAG_EL1_32BIT and KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED
bits for kvm->arch.flags.  A value of the EL1_32BIT bit indicates that
the guest needs to be configured with all 32bit or 64bit vCPUs, and
a value of the REG_WIDTH_CONFIGURED bit indicates if a value of the
EL1_32BIT bit is valid (already set up). Values in those bits are set at
the first KVM_ARM_VCPU_INIT for the guest based on KVM_ARM_VCPU_EL1_32BIT
configuration for the vCPU.

Check vcpu's register width against those new bits at the vcpu's
KVM_ARM_VCPU_INIT (instead of against other vCPUs' register width).

Fixes: 66e94d5cafd4 ("KVM: arm64: Prevent mixed-width VM creation")
Signed-off-by: Reiji Watanabe 
Reviewed-by: Oliver Upton 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20220329031924.619453-2-reijiw@google.com
Signed-off-by: Sasha Levin

KVM: arm64: Generalise VM features into a set of flags

2022-04-20T07:36:12+00:00

[ Upstream commit 06394531b425794dc56f3d525b7994d25b8072f7 ]

We currently deal with a set of booleans for VM features,
while they could be better represented as set of flags
contained in an unsigned long, similarily to what we are
doing on the CPU side.

Signed-off-by: Marc Zyngier 
[Oliver: Flag-ify the 'ran_once' boolean]
Signed-off-by: Oliver Upton 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20220311174001.605719-2-oupton@google.com
Signed-off-by: Sasha Levin

arm64: Add part number for Arm Cortex-A78AE

2022-04-13T17:27:34+00:00

commit 83bea32ac7ed37bbda58733de61fc9369513f9f9 upstream.

Add the MIDR part number info for the Arm Cortex-A78AE[1] and add it to
spectre-BHB affected list[2].

[1]: https://developer.arm.com/Processors/Cortex-A78AE
[2]: https://developer.arm.com/Arm%20Security%20Center/Spectre-BHB

Cc: Catalin Marinas 
Cc: Mark Rutland 
Cc: Will Deacon 
Cc: James Morse 
Signed-off-by: Chanho Park 
Link: https://lore.kernel.org/r/20220407091128.8700-1-chanho61.park@samsung.com
Signed-off-by: Will Deacon 
Signed-off-by: Greg Kroah-Hartman

KVM: arm64: Do not change the PMU event filter after a VCPU has run

2022-04-13T17:27:13+00:00

[ Upstream commit 5177fe91e4cf78a659aada2c9cf712db4d788481 ]

Userspace can specify which events a guest is allowed to use with the
KVM_ARM_VCPU_PMU_V3_FILTER attribute. The list of allowed events can be
identified by a guest from reading the PMCEID{0,1}_EL0 registers.

Changing the PMU event filter after a VCPU has run can cause reads of the
registers performed before the filter is changed to return different values
than reads performed with the new event filter in place. The architecture
defines the two registers as read-only, and this behaviour contradicts
that.

Keep track when the first VCPU has run and deny changes to the PMU event
filter to prevent this from happening.

Signed-off-by: Marc Zyngier 
[ Alexandru E: Added commit message, updated ioctl documentation ]
Signed-off-by: Alexandru Elisei 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20220127161759.53553-2-alexandru.elisei@arm.com
Signed-off-by: Sasha Levin

arm64: module: remove (NOLOAD) from linker script

2022-04-08T11:58:37+00:00

[ Upstream commit 4013e26670c590944abdab56c4fa797527b74325 ]

On ELF, (NOLOAD) sets the section type to SHT_NOBITS[1]. It is conceptually
inappropriate for .plt and .text.* sections which are always
SHT_PROGBITS.

In GNU ld, if PLT entries are needed, .plt will be SHT_PROGBITS anyway
and (NOLOAD) will be essentially ignored. In ld.lld, since
https://reviews.llvm.org/D118840 ("[ELF] Support (TYPE=) to
customize the output section type"), ld.lld will report a `section type
mismatch` error. Just remove (NOLOAD) to fix the error.

[1] https://lld.llvm.org/ELF/linker_script.html As of today, "The
section should be marked as not loadable" on
https://sourceware.org/binutils/docs/ld/Output-Section-Type.html is
outdated for ELF.

Tested-by: Nathan Chancellor 
Reported-by: Nathan Chancellor 
Signed-off-by: Fangrui Song 
Acked-by: Ard Biesheuvel 
Link: https://lore.kernel.org/r/20220218081209.354383-1-maskray@google.com
Signed-off-by: Will Deacon 
Signed-off-by: Sasha Levin

arm64: prevent instrumentation of bp hardening callbacks

2022-04-08T11:57:37+00:00

[ Upstream commit 614c0b9fee711dd89b1dd65c88ba83612a373fdc ]

We may call arm64_apply_bp_hardening() early during entry (e.g. in
el0_ia()) before it is safe to run instrumented code. Unfortunately this
may result in running instrumented code in two cases:

* The hardening callbacks called by arm64_apply_bp_hardening() are not
  marked as `noinstr`, and have been observed to be instrumented when
  compiled with either GCC or LLVM.

* Since arm64_apply_bp_hardening() itself is only marked as `inline`
  rather than `__always_inline`, it is possible that the compiler
  decides to place it out-of-line, whereupon it may be instrumented.

For example, with defconfig built with clang 13.0.0,
call_hvc_arch_workaround_1() is compiled as:

| :
|        d503233f        paciasp
|        f81f0ffe        str     x30, [sp, #-16]!
|        320183e0        mov     w0, #0x80008000
|        d503201f        nop
|        d4000002        hvc     #0x0
|        f84107fe        ldr     x30, [sp], #16
|        d50323bf        autiasp
|        d65f03c0        ret

... but when CONFIG_FTRACE=y and CONFIG_KCOV=y this is compiled as:

| :
|        d503245f        bti     c
|        d503201f        nop
|        d503201f        nop
|        d503233f        paciasp
|        a9bf7bfd        stp     x29, x30, [sp, #-16]!
|        910003fd        mov     x29, sp
|        94000000        bl      0 <__sanitizer_cov_trace_pc>
|        320183e0        mov     w0, #0x80008000
|        d503201f        nop
|        d4000002        hvc     #0x0
|        a8c17bfd        ldp     x29, x30, [sp], #16
|        d50323bf        autiasp
|        d65f03c0        ret

... with a patchable function entry registered with ftrace, and a direct
call to __sanitizer_cov_trace_pc(). Neither of these are safe early
during entry sequences.

This patch avoids the unsafe instrumentation by marking
arm64_apply_bp_hardening() as `__always_inline` and by marking the
hardening functions as `noinstr`. This avoids the potential for
instrumentation, and causes clang to consistently generate the function
as with the defconfig sample.

Note: in the defconfig compilation, when CONFIG_SVE=y, x30 is spilled to
the stack without being placed in a frame record, which will result in a
missing entry if call_hvc_arch_workaround_1() is backtraced. Similar is
true of qcom_link_stack_sanitisation(), where inline asm spills the LR
to a GPR prior to corrupting it. This is not a significant issue
presently as we will only backtrace here if an exception is taken, and
in such cases we may omit entries for other reasons today.

The relevant hardening functions were introduced in commits:

  ec82b567a74fbdff ("arm64: Implement branch predictor hardening for Falkor")
  b092201e00206141 ("arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support")

... and these were subsequently moved in commit:

  d4647f0a2ad71110 ("arm64: Rewrite Spectre-v2 mitigation code")

The arm64_apply_bp_hardening() function was introduced in commit:

  0f15adbb2861ce6f ("arm64: Add skeleton to harden the branch predictor against aliasing attacks")

... and was subsequently moved and reworked in commit:

  6279017e807708a0 ("KVM: arm64: Move BP hardening helpers into spectre.h")

Fixes: ec82b567a74fbdff ("arm64: Implement branch predictor hardening for Falkor")
Fixes: b092201e00206141 ("arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support")
Fixes: d4647f0a2ad71110 ("arm64: Rewrite Spectre-v2 mitigation code")
Fixes: 0f15adbb2861ce6f ("arm64: Add skeleton to harden the branch predictor against aliasing attacks")
Fixes: 6279017e807708a0 ("KVM: arm64: Move BP hardening helpers into spectre.h")
Signed-off-by: Mark Rutland 
Cc: Ard Biesheuvel 
Cc: Catalin Marinas 
Cc: James Morse 
Cc: Marc Zyngier 
Cc: Mark Brown 
Cc: Will Deacon 
Acked-by: Marc Zyngier 
Reviewed-by: Mark Brown 
Link: https://lore.kernel.org/r/20220224181028.512873-1-mark.rutland@arm.com
Signed-off-by: Will Deacon 
Signed-off-by: Sasha Levin