linux-stable.git/arch/arm64/kernel/process.c, branch linux-5.4.y

arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled

2024-12-14T18:44:38+00:00

commit 67ab51cbdfee02ef07fb9d7d14cc0bf6cb5a5e5c upstream.

Commit 18011eac28c7 ("arm64: tls: Avoid unconditional zeroing of
tpidrro_el0 for native tasks") tried to optimise the context switching
of tpidrro_el0 by eliding the clearing of the register when switching
to a native task with kpti enabled, on the erroneous assumption that
the kpti trampoline entry code would already have taken care of the
write.

Although the kpti trampoline does zero the register on entry from a
native task, the check in tls_thread_switch() is on the *next* task and
so we can end up leaving a stale, non-zero value in the register if the
previous task was 32-bit.

Drop the broken optimisation and zero tpidrro_el0 unconditionally when
switching to a native 64-bit task.
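
The stale-value scenario can be reproduced with a small userspace model.
This is only a sketch: the register is simulated by a plain variable, and
model_task, switch_tls_broken() and switch_tls_fixed() are hypothetical
names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct model_task {
	bool is_compat;    /* true for a 32-bit (compat) task */
	uint64_t tp_value; /* compat TLS value held in tpidrro_el0 */
};

/* Simulated tpidrro_el0 register. */
static uint64_t model_tpidrro_el0;

/*
 * Broken optimisation: when kpti is enabled, skip the zeroing for a
 * native next task and trust the trampoline to have done it.
 */
static void switch_tls_broken(const struct model_task *next, bool kpti_enabled)
{
	assert(next != NULL);
	if (next->is_compat)
		model_tpidrro_el0 = next->tp_value;
	else if (!kpti_enabled)
		model_tpidrro_el0 = 0;
}

/* Fixed behaviour: always zero the register for a native next task. */
static void switch_tls_fixed(const struct model_task *next)
{
	assert(next != NULL);
	model_tpidrro_el0 = next->is_compat ? next->tp_value : 0;
}
```

Switching from a compat task to a native task with kpti enabled leaves
the compat value in the simulated register under the broken variant,
while the fixed variant always leaves zero.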

Cc: Mark Rutland 
Cc: stable@vger.kernel.org
Fixes: 18011eac28c7 ("arm64: tls: Avoid unconditional zeroing of tpidrro_el0 for native tasks")
Signed-off-by: Will Deacon 
Acked-by: Mark Rutland 
Link: https://lore.kernel.org/r/20241114095332.23391-1-will@kernel.org
Signed-off-by: Catalin Marinas 
Signed-off-by: Greg Kroah-Hartman

arm64: errata: Fix exec handling in erratum 1418040 workaround

2022-02-01T16:24:34+00:00

commit 38e0257e0e6f4fef2aa2966b089b56a8b1cfb75c upstream.

The erratum 1418040 workaround enables CNTVCT_EL1 access trapping in EL0
when executing compat threads. The workaround is applied when switching
between tasks, but the need for the workaround could also change at an
exec(), when a non-compat task execs a compat binary or vice versa. Apply
the workaround in arch_setup_new_exec().

This leaves a small window of time between SET_PERSONALITY and
arch_setup_new_exec where preemption could occur and confuse the old
workaround logic that compares TIF_32BIT between prev and next. Instead, we
can just read cntkctl to make sure it's in the state that the next task
needs. I measured cntkctl read time to be about the same as a mov from a
general-purpose register on N1. Update the workaround logic to examine the
current value of cntkctl instead of the previous task's compat state.
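
A minimal userspace sketch of the "examine the current register value"
approach follows. It assumes a simulated cntkctl variable and an
illustrative bit position for the EL0 virtual counter access enable; all
model_* names are hypothetical and the real kernel helpers differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit for EL0 access to the virtual counter (assumption). */
#define MODEL_USR_VCT_ACCESS_EN (UINT64_C(1) << 1)

/* Simulated cntkctl register, starting with EL0 access enabled. */
static uint64_t model_cntkctl = MODEL_USR_VCT_ACCESS_EN;
/* Number of simulated register writes, to show that no-ops are skipped. */
static unsigned int model_cntkctl_writes;

/* Read-modify-write that only writes back when the value changes. */
static void model_clear_set(uint64_t clear, uint64_t set)
{
	assert((clear & set) == 0);
	uint64_t old = model_cntkctl;
	uint64_t updated = (old & ~clear) | set;

	if (updated != old) {
		model_cntkctl = updated;
		model_cntkctl_writes++;
	}
}

/* Decide from the next task alone; the previous task is never consulted. */
static void model_1418040_switch(bool next_is_compat)
{
	if (next_is_compat)
		model_clear_set(MODEL_USR_VCT_ACCESS_EN, 0); /* trap EL0 reads */
	else
		model_clear_set(0, MODEL_USR_VCT_ACCESS_EN); /* allow EL0 reads */
}
```

Because the decision depends only on the register contents and the next
task, a preemption between SET_PERSONALITY and arch_setup_new_exec cannot
leave the model in the wrong state.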

Fixes: d49f7d7376d0 ("arm64: Move handling of erratum 1418040 into C code")
Cc:  # 5.9.x
Signed-off-by: D Scott Phillips 
Reviewed-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20211220234114.3926-1-scott@os.amperecomputing.com
Signed-off-by: Catalin Marinas 
Signed-off-by: Greg Kroah-Hartman

arm64: Mark __stack_chk_guard as __ro_after_init

2021-09-30T08:09:25+00:00

[ Upstream commit 9fcb2e93f41c07a400885325e7dbdfceba6efaec ]

__stack_chk_guard is set up once during the init stage and never
changed after that.

Although modifying this variable at runtime will usually crash the
kernel (and the attacker along with it), it should still be marked as
__ro_after_init. Placing it in the ro_after_init section should not
affect performance.

Signed-off-by: Dan Li 
Acked-by: Mark Rutland 
Link: https://lore.kernel.org/r/1631612642-102881-1-git-send-email-ashimida@linux.alibaba.com
Signed-off-by: Catalin Marinas 
Signed-off-by: Sasha Levin

arm64: errata: Fix handling of 1418040 with late CPU onlining

2020-11-24T12:29:00+00:00

[ Upstream commit f969f03888b9438fdb227b6460d99ede5737326d ]

In a surprising turn of events, it transpires that CPU capabilities
configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
result of late-onlining. Therefore our handling of erratum 1418040 does
not get activated if it is not required by any of the boot CPUs, even
though we allow late-onlining of an affected CPU.

In order to get things working again, replace the cpus_have_const_cap()
invocation with an explicit check for the current CPU using
this_cpu_has_cap().
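
The difference between the two checks can be sketched in userspace as
follows. This is a simplified model under stated assumptions: the
system-wide capability is computed from boot CPUs only, and model_cpu,
needs_workaround_old() and needs_workaround_new() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU description. */
struct model_cpu {
	bool affected;     /* CPU suffers from the erratum */
	bool booted_early; /* CPU was online when capabilities were finalised */
};

/* System-wide capability, derived from the boot CPUs only. */
static bool model_system_cap;

static void model_finalise_caps(const struct model_cpu *cpus, int n)
{
	assert(cpus != NULL && n >= 0);
	for (int i = 0; i < n; i++)
		if (cpus[i].booted_early && cpus[i].affected)
			model_system_cap = true;
}

/* Old check: relies on the system-wide capability. */
static bool needs_workaround_old(void)
{
	return model_system_cap;
}

/* New check: asks the CPU that is actually running the code. */
static bool needs_workaround_new(const struct model_cpu *this_cpu)
{
	assert(this_cpu != NULL);
	return this_cpu->affected;
}
```

With an unaffected boot CPU and an affected late-onlined CPU, the old
check never enables the workaround, while the new check enables it on the
affected CPU only.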

Cc: Sai Prakash Ranjan 
Cc: Stephen Boyd 
Cc: Catalin Marinas 
Cc: Mark Rutland 
Reviewed-by: Suzuki K Poulose 
Acked-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20201106114952.10032-1-will@kernel.org
Signed-off-by: Will Deacon 
Signed-off-by: Sasha Levin

arm64: Move handling of erratum 1418040 into C code

2020-09-03T09:27:01+00:00

[ Upstream commit d49f7d7376d0c0daf8680984a37bd07581ac7d38 ]

Instead of dealing with erratum 1418040 on each entry and exit,
let's move the handling to __switch_to() instead, which has
several advantages:

- It can be applied when it matters (switching between 32 and 64
  bit tasks).
- It is written in C (yay!)
- It can rely on static keys rather than alternatives

Signed-off-by: Marc Zyngier 
Tested-by: Sai Prakash Ranjan 
Reviewed-by: Stephen Boyd 
Acked-by: Will Deacon 
Link: https://lore.kernel.org/r/20200731173824.107480-2-maz@kernel.org
Signed-off-by: Catalin Marinas 
Signed-off-by: Sasha Levin

arm64: ssbs: Fix context-switch when SSBS is present on all CPUs

2020-02-19T18:53:02+00:00

commit fca3d33d8ad61eb53eca3ee4cac476d1e31b9008 upstream.

When all CPUs in the system implement the SSBS extension, the SSBS field
in PSTATE is the definitive indication of the mitigation state. Further,
when the CPUs implement the SSBS manipulation instructions (advertised
to userspace via an HWCAP), EL0 can toggle the SSBS field directly and
so we cannot rely on any shadow state such as TIF_SSBD at all.

Avoid forcing the SSBS field in context-switch on such a system, and
simply rely on the PSTATE register instead.
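
A rough userspace model of the change is shown below. It is a sketch
only: the saved PSTATE is a plain integer, the bit position is
illustrative, and model_task and model_ssbs_switch() are hypothetical
names that simplify the real context-switch logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative position of the SSBS field in the saved PSTATE. */
#define MODEL_PSTATE_SSBS (UINT64_C(1) << 12)

/* Hypothetical stand-in for a task's saved state. */
struct model_task {
	bool tif_ssbd;   /* shadow flag: mitigation requested */
	uint64_t pstate; /* saved PSTATE, possibly changed directly by EL0 */
};

static void model_ssbs_switch(struct model_task *next, bool all_cpus_have_ssbs)
{
	assert(next != NULL);

	/* With SSBS on every CPU, the saved PSTATE is authoritative. */
	if (all_cpus_have_ssbs)
		return;

	/* Mitigation requested: leave SSBS clear. */
	if (next->tif_ssbd)
		return;

	/* Otherwise force SSBS on from the shadow state. */
	next->pstate |= MODEL_PSTATE_SSBS;
}
```

If a task has cleared SSBS itself without the shadow flag being set, the
forcing path would silently set the bit again; with SSBS present on all
CPUs the model leaves the task's own choice untouched.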

Cc: 
Cc: Catalin Marinas 
Cc: Srinivas Ramana 
Fixes: cbdf8a189a66 ("arm64: Force SSBS on context switch")
Reviewed-by: Marc Zyngier 
Signed-off-by: Will Deacon 
Signed-off-by: Greg Kroah-Hartman

arm64: Implement copy_thread_tls

2020-01-14T19:08:34+00:00

commit a4376f2fbcc8084832f2f114577c8d68234c7903 upstream.

This is required for clone3 which passes the TLS value through a
struct rather than a register.

Signed-off-by: Amanieu d'Antras 
Cc: linux-arm-kernel@lists.infradead.org
Cc:  # 5.3.x
Acked-by: Will Deacon 
Link: https://lore.kernel.org/r/20200102172413.654385-3-amanieu@gmail.com
Signed-off-by: Christian Brauner 
Signed-off-by: Greg Kroah-Hartman

arm64: entry.S: Do not preempt from IRQ before all cpufeatures are enabled

2019-10-16T16:51:43+00:00

Preempting from IRQ-return means that the task has its PSTATE saved
on the stack, which will get restored when the task is resumed and does
the actual IRQ return.

However, enabling some CPU features requires modifying the PSTATE. This
means that, if a task was scheduled out during an IRQ-return before all
CPU features are enabled, the task might restore a PSTATE that does not
include the feature enablement changes once scheduled back in.

* Task 1:

PAN == 0 ---|                          |---------------
            |                          |<- return from IRQ, PSTATE.PAN = 0
            | <- IRQ                   |
            +--------+ <- preempt()  +--
                                     ^
                                     |
                                     reschedule Task 1, PSTATE.PAN == 1
* Init:
        --------------------+------------------------
                            ^
                            |
                            enable_cpu_features
                            set PSTATE.PAN on all CPUs

Worse than this, since PSTATE is untouched when task switching is done,
a task missing the new bits in PSTATE might affect another task, if both
do direct calls to schedule() (outside of IRQ/exception contexts).

Fix this by preventing preemption on IRQ-return until features are
enabled on all CPUs.

This way the only PSTATE values that are saved on the stack are from
synchronous exceptions. These are expected to be fatal this early; the
one exception is BRK for WARN_ON(), but as this uses
do_debug_exception(), which keeps IRQs masked, it shouldn't call
schedule().
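
The gating idea can be sketched in userspace as below. A plain boolean
stands in for the static key, and model_enable_cpu_features() and
model_preempt_on_irq_return() are hypothetical names used only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the static key that flips once features are enabled. */
static bool model_caps_ready;
/* Counts simulated preemptions taken on IRQ return. */
static unsigned int model_preemptions;

static void model_enable_cpu_features(void)
{
	/* Features are finalised exactly once during boot. */
	assert(!model_caps_ready);
	model_caps_ready = true;
}

/* Called on IRQ return; returns true if the task was preempted. */
static bool model_preempt_on_irq_return(bool need_resched)
{
	if (!need_resched || !model_caps_ready)
		return false;

	model_preemptions++;
	return true;
}
```

Before the features are enabled, a pending reschedule on IRQ return is
ignored, so no task can be scheduled out with a PSTATE that predates the
feature enablement.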

Signed-off-by: Julien Thierry 
[james: Replaced a really cool hack, with an even simpler static key in C.
 expanded commit message with Julien's cover-letter ascii art]
Signed-off-by: James Morse 
Signed-off-by: Will Deacon

arm64/sve: Fix wrong free for task->thread.sve_state

2019-10-01T12:30:52+00:00

A system with the SVE feature crashed because the memory pointed to by
task->thread.sve_state had been destroyed by someone.

That is because sve_state is freed while forking the child process.
The child process holds the same sve_state pointer as the parent,
because the child's task_struct is copied from the parent's. If
copy_process() fails somewhere, for example in copy_creds(), then
sve_state is freed even though the parent is still alive.
The flow is as follows.

copy_process
        p = dup_task_struct
            => arch_dup_task_struct
                *dst = *src;  // copy the entire region.
:
        retval = copy_creds
        if (retval < 0)
                goto bad_fork_free;
:
bad_fork_free:
...
        delayed_free_task(p);
          => free_task
             => arch_release_task_struct
                => fpsimd_release_task
                   => __sve_free
                      => kfree(task->thread.sve_state);
                         // free the parent's sve_state

Move setting the child's sve_state to NULL and clearing its TIF_SVE
flag into arch_dup_task_struct() so that the child doesn't free the
parent's sve_state.
There is no need to wait until copy_process() to clear TIF_SVE for
dst, because the thread flags for dst are already initialized by
copying the src task_struct.
This change simplifies the code, so get rid of comments that are no
longer needed.
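
The ownership problem and the fix can be sketched in userspace as
follows. This is only a model: malloc()/free() stand in for the kernel
allocator, and model_task, model_dup_task() and model_release_task() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the relevant parts of a task. */
struct model_task {
	void *sve_state; /* owned buffer, or NULL */
	bool tif_sve;    /* models the TIF_SVE thread flag */
};

/* Models the duplication step with the fix applied. */
static void model_dup_task(struct model_task *dst, const struct model_task *src)
{
	assert(dst != src);

	*dst = *src; /* copies the whole struct, including the parent's pointer */

	/* The child must not inherit ownership of the parent's buffer. */
	dst->sve_state = NULL;
	dst->tif_sve = false;
}

/* Models the release path that runs when the fork fails. */
static void model_release_task(struct model_task *t)
{
	free(t->sve_state); /* free(NULL) is a no-op */
	t->sve_state = NULL;
}
```

Releasing a freshly duplicated child is then harmless: its pointer is
NULL, so the parent's buffer stays allocated until the parent itself is
released.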

As a note, arm64 used to have thread_info on the stack, so it was not
possible to clear TIF_SVE until the stack was initialized.
Since commit c02433dd6de3 ("arm64: split thread_info from task stack"),
the thread_info is part of the task, so it should be valid to modify
the flag from arch_dup_task_struct().

Cc: stable@vger.kernel.org # 4.15.x-
Fixes: bc0ee4760364 ("arm64/sve: Core task context handling")
Signed-off-by: Masayoshi Mizuma 
Reported-by: Hidetoshi Seto 
Suggested-by: Dave Martin 
Reviewed-by: Dave Martin 
Tested-by: Julien Grall 
Signed-off-by: Will Deacon

arm64, mm: make randomization selected by generic topdown mmap layout

2019-09-24T22:54:11+00:00

This commit selects ARCH_HAS_ELF_RANDOMIZE when an arch uses the generic
topdown mmap layout functions so that this security feature is on by
default.

Note that this commit also removes the possibility for arm64 to have elf
randomization and no MMU: without MMU, the security added by randomization
is worth nothing.

Link: http://lkml.kernel.org/r/20190730055113.23635-6-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti 
Acked-by: Catalin Marinas 
Acked-by: Kees Cook 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Luis Chamberlain 
Cc: Albert Ou 
Cc: Alexander Viro 
Cc: Christoph Hellwig 
Cc: James Hogan 
Cc: Palmer Dabbelt 
Cc: Paul Burton 
Cc: Ralf Baechle 
Cc: Russell King 
Cc: Will Deacon 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds